emacs-orgmode@gnu.org archives
* How to get parsed output of org-eww-copy-for-org-mode ?
@ 2019-12-24  9:59 stardiviner
  2019-12-24 10:26 ` Marco Wahl
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: stardiviner @ 2019-12-24  9:59 UTC (permalink / raw)
  To: Org Mode


I am trying to convert HTML content into Org format for use in an Org
capture template.

#+begin_src emacs-lisp
(require 'org-eww)
(with-temp-buffer
  (insert html)
  (org-eww-copy-for-org-mode)
  ;; FIXME: yanks the original HTML instead of the converted content.
  (current-kill 0)
  (org-yank))
#+end_src

But in the snippet above, neither ~current-kill~ nor ~org-yank~ (or ~yank~)
returns the converted output. I tried *advice-add*, but I don't know which
advice combinator would let me capture the parsed output of
~org-eww-copy-for-org-mode~ and save it somewhere, such as a variable or a
register, so that I can yank it into the capture buffer later.

-- 
[ stardiviner ]
       I try to make every word tell the meaning what I want to express.

       Blog: https://stardiviner.github.io/
       IRC(freenode): stardiviner, Matrix: stardiviner
       GPG: F09F650D7D674819892591401B5DF1C95AE89AC3
      


* Re: How to get parsed output of org-eww-copy-for-org-mode ?
  2019-12-24  9:59 How to get parsed output of org-eww-copy-for-org-mode ? stardiviner
@ 2019-12-24 10:26 ` Marco Wahl
  2019-12-25  8:08 ` Adam Porter
  2019-12-25 21:54 ` Bob Newell
  2 siblings, 0 replies; 7+ messages in thread
From: Marco Wahl @ 2019-12-24 10:26 UTC (permalink / raw)
  To: emacs-orgmode

stardiviner <numbchild@gmail.com> writes:

> I am trying to convert HTML content into Org format for use in an Org
> capture template.
>
> #+begin_src emacs-lisp
> (require 'org-eww)
> (with-temp-buffer
>   (insert html)
>   (org-eww-copy-for-org-mode)
>   ;; FIXME: yanks the original HTML instead of the converted content.
>   (current-kill 0)
>   (org-yank))
> #+end_src
>
> But in the snippet above, neither ~current-kill~ nor ~org-yank~ (or
> ~yank~) returns the converted output. I tried *advice-add*, but I don't
> know which advice combinator would let me capture the parsed output of
> ~org-eww-copy-for-org-mode~ and save it somewhere, such as a variable or
> a register, so that I can yank it into the capture buffer later.

org-eww-copy-for-org-mode only works sensibly in a buffer that has been
rendered by the shr library.  The typical example of such a buffer is
the output of eww.

If you have plain HTML and want to use org-eww-copy-for-org-mode, you
could prepare a suitable buffer along the lines of shr-render-buffer, I
think.
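
A minimal sketch of that idea, assuming an Emacs built with libxml2 and
Org 9.3 or later (where the library is named ol-eww; older versions call
it org-eww). The function name my/html-to-org-via-shr is hypothetical.
It renders the HTML with shr in a temporary buffer, runs
org-eww-copy-for-org-mode there, and reads the result from a locally
bound kill ring instead of yanking it:

```emacs-lisp
(require 'shr)
(require 'ol-eww)   ; named `org-eww' before Org 9.3

(defun my/html-to-org-via-shr (html)
  "Return HTML (a non-empty string) as text with Org-style links.
Hypothetical helper: render HTML with shr, then let
`org-eww-copy-for-org-mode' rewrite the links."
  (let ((dom (with-temp-buffer
               (insert html)
               ;; Requires an Emacs built with libxml2.
               (libxml-parse-html-region (point-min) (point-max))))
        ;; Bind the kill ring locally so the caller's kill ring and
        ;; the system clipboard stay untouched.
        (kill-ring nil)
        (kill-ring-yank-pointer nil)
        (interprogram-cut-function nil))
    (with-temp-buffer
      ;; shr adds the `shr-url' text properties that
      ;; `org-eww-copy-for-org-mode' looks for.
      (shr-insert-document dom)
      (org-eww-copy-for-org-mode)
      (substring-no-properties (car kill-ring)))))
```

As far as I can tell, only links (and images) are turned into Org
syntax this way; headings and emphasis come out as the plain text that
shr rendered.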


HTH


* Re: How to get parsed output of org-eww-copy-for-org-mode ?
  2019-12-24  9:59 How to get parsed output of org-eww-copy-for-org-mode ? stardiviner
  2019-12-24 10:26 ` Marco Wahl
@ 2019-12-25  8:08 ` Adam Porter
  2019-12-25 11:00   ` stardiviner
  2019-12-25 11:33   ` [SOLVED] " stardiviner
  2019-12-25 21:54 ` Bob Newell
  2 siblings, 2 replies; 7+ messages in thread
From: Adam Porter @ 2019-12-25  8:08 UTC (permalink / raw)
  To: emacs-orgmode

You may find the package org-web-tools useful.


* Re: How to get parsed output of org-eww-copy-for-org-mode ?
  2019-12-25  8:08 ` Adam Porter
@ 2019-12-25 11:00   ` stardiviner
  2019-12-25 11:33   ` [SOLVED] " stardiviner
  1 sibling, 0 replies; 7+ messages in thread
From: stardiviner @ 2019-12-25 11:00 UTC (permalink / raw)
  To: emacs-orgmode


Adam Porter <adam@alphapapa.net> writes:

> You may find the package org-web-tools useful.

Interesting, =org-web-tools= looks useful. I will dig into its source code to
find what I can use for my purpose.


* [SOLVED] Re: How to get parsed output of org-eww-copy-for-org-mode ?
  2019-12-25  8:08 ` Adam Porter
  2019-12-25 11:00   ` stardiviner
@ 2019-12-25 11:33   ` stardiviner
  1 sibling, 0 replies; 7+ messages in thread
From: stardiviner @ 2019-12-25 11:33 UTC (permalink / raw)
  To: emacs-orgmode


Adam Porter <adam@alphapapa.net> writes:

> You may find the package org-web-tools useful.

Thanks, Adam. I found the function ~org-web-tools--html-to-org-with-pandoc~,
which works for my case.
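
For readers without org-web-tools, here is a minimal sketch of the same
idea that calls pandoc directly. It is not the org-web-tools
implementation; the function name my/html-to-org-with-pandoc is
hypothetical, and it assumes a pandoc executable is on the PATH:

```emacs-lisp
(defun my/html-to-org-with-pandoc (html)
  "Return HTML (a string) converted to Org syntax by pandoc.
Hypothetical helper; signals an error if pandoc is missing or fails."
  (unless (executable-find "pandoc")
    (error "Cannot find the pandoc executable"))
  (with-temp-buffer
    (insert html)
    ;; Send the buffer to pandoc and replace it with pandoc's output
    ;; (DELETE and BUFFER are both t).
    (let ((status (call-process-region (point-min) (point-max) "pandoc"
                                       t t nil
                                       "--from=html" "--to=org"
                                       "--wrap=none")))
      (unless (eq status 0)
        (error "pandoc failed: %s" (buffer-string))))
    (buffer-string)))
```

Because the result is returned as a string, it can be stored in a
variable or inserted into a capture buffer without going through the
kill ring.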


* Re: How to get parsed output of org-eww-copy-for-org-mode ?
  2019-12-24  9:59 How to get parsed output of org-eww-copy-for-org-mode ? stardiviner
  2019-12-24 10:26 ` Marco Wahl
  2019-12-25  8:08 ` Adam Porter
@ 2019-12-25 21:54 ` Bob Newell
  2019-12-27  1:35   ` stardiviner
  2 siblings, 1 reply; 7+ messages in thread
From: Bob Newell @ 2019-12-25 21:54 UTC (permalink / raw)
  To: numbchild; +Cc: Org Mode

I don't seem to have any trouble with org-eww-copy-for-org-mode. I
capture with a capture template. The code below may be longer or more
than you want, but it works for me.

My capture template is this:

     ("w" "Website" plain
      (function org-website-clipper)
      "* %a\n%T\n" :immediate-finish t)

And it depends on the following code.

  (require 'ol-eww)
  (require 'ol-w3m)

;;; Change this to suit:
(defvar org-website-page-archive-file "~/organize/website/websites.org")
(defun org-website-clipper ()
  "When capturing a website page, go to the right place in capture file,
   but do sneaky things. Because it's a w3m or eww page, we go
   ahead and insert the fixed-up page content, as I don't see a
   good way to do that from an org-capture template alone. Requires
   Emacs 25+ and the 2017-02-12 or later patched version of org-eww.el."
 (interactive)

;;; Address the plague of trailing whitespace in some web buffers.

  (let ((buffer-read-only nil))
    (delete-trailing-whitespace))

;;; Check for an acceptable major mode (w3m or eww) and copy the page
;;; content, with Org-style links, to the kill ring.
;;; Signal an error in any other mode.

  (cond
   ((eq major-mode 'w3m-mode)
    (org-w3m-copy-for-org-mode))
   ((eq major-mode 'eww-mode)
    (org-eww-copy-for-org-mode))
   (t
    (error "Not valid -- must be in w3m or eww mode")))

;;; If the archive file does not exist yet, create its directory,
;;; including any missing parent directories.

  (unless (file-exists-p org-website-page-archive-file)
    (let ((dir (file-name-directory org-website-page-archive-file)))
      (unless (file-exists-p dir)
        ;; The second argument creates missing parent directories too.
        (make-directory dir t))))

  ;; Open the archive file and yank in the content.
  ;; Headers are fixed up later by org-capture.

  (find-file org-website-page-archive-file)
  (goto-char (point-max))
  ;; Leave a blank line for org-capture to fill in
  ;; with a timestamp, URL, etc.
  (insert "\n\n")
  ;; Insert the web content but keep our place.
  (save-excursion (yank))
  ;; Don't keep the page info on the kill ring.
  ;; Also fix the yank pointer.
  (setq kill-ring (cdr kill-ring))
  (setq kill-ring-yank-pointer kill-ring)
  ;; Final repositioning.
  (forward-line -1)
)
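
A minimal sketch of how the template above might be registered, assuming
the standard org-capture-templates variable and that
org-website-clipper has already been defined as shown:

```emacs-lisp
(require 'org-capture)

;; Register the "w" template.  `org-website-clipper' must be loaded
;; before the template is used.
(add-to-list 'org-capture-templates
             '("w" "Website" plain
               (function org-website-clipper)
               "* %a\n%T\n" :immediate-finish t))
```

With this in place, running M-x org-capture from an eww or w3m buffer
and pressing w should store the page in the archive file.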


* Re: How to get parsed output of org-eww-copy-for-org-mode ?
  2019-12-25 21:54 ` Bob Newell
@ 2019-12-27  1:35   ` stardiviner
  0 siblings, 0 replies; 7+ messages in thread
From: stardiviner @ 2019-12-27  1:35 UTC (permalink / raw)
  To: Bob Newell; +Cc: Org Mode


This is very interesting, thanks.

I will use your code as a reference.



