From: Bob Newell
Subject: Re: How to get parsed output of org-eww-copy-for-org-mode ?
Date: Wed, 25 Dec 2019 11:54:26 -1000
In-Reply-To: <87lfr259jq.fsf@gmail.com>
To: numbchild@gmail.com
Cc: Org Mode

I don't seem to have any trouble with org-eww-copy-for-org-mode. I
capture with a capture template. The code below may be longer than you
want, or do more than you need, but it works for me.

My capture template is this:

("w" "Website" plain
 (function org-website-clipper)
 "* %a\n%T\n" :immediate-finish t)

And it depends on the following code.

(require 'ol-eww)
(require 'ol-w3m)

;;; Change this to suit:
(defvar org-website-page-archive-file "~/organize/website/websites.org")

(defun org-website-clipper ()
  "When capturing a website page, go to the right place in the capture file,
but do sneaky things.  Because it's a w3m or eww page, we go ahead and
insert the fixed-up page content, as I don't see a good way to do that
from an org-capture template alone.  Requires Emacs 25+ and the
2017-02-12 or later patched version of org-eww.el."
  (interactive)

  ;; Address the plague of trailing whitespace in some web buffers.
  ;; Browser buffers are read-only, so lift that for the cleanup.
  (let ((inhibit-read-only t))
    (delete-trailing-whitespace))

  ;; Check for an acceptable major mode (w3m or eww) and copy the page
  ;; to the kill ring in Org format.  Error if the mode is unknown.
  (cond
   ((eq major-mode 'w3m-mode)
    (org-w3m-copy-for-org-mode))
   ((eq major-mode 'eww-mode)
    (org-eww-copy-for-org-mode))
   (t
    (error "Not valid -- must be in w3m or eww mode")))

  ;; If the archive file does not exist yet, create any missing
  ;; directories on its path (the t argument creates parents too).
  (unless (file-exists-p org-website-page-archive-file)
    (let ((dir (file-name-directory org-website-page-archive-file)))
      (unless (file-exists-p dir)
        (make-directory dir t))))

  ;; Open the archive file and yank in the content.
  ;; Headers are fixed up later by org-capture.
  (find-file org-website-page-archive-file)
  (goto-char (point-max))
  ;; Leave a blank line for org-capture to fill in
  ;; with a timestamp, URL, etc.
  (insert "\n\n")
  ;; Insert the web content but keep our place.
  (save-excursion (yank))
  ;; Don't keep the page info on the kill ring.
  ;; Also fix the yank pointer.
  (setq kill-ring (cdr kill-ring))
  (setq kill-ring-yank-pointer kill-ring)
  ;; Final repositioning.
  (forward-line -1))
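
The message gives the template entry on its own and does not show how it is installed. As a minimal sketch, assuming the entry goes in an init file alongside any other capture templates, it could be registered like this. The use of add-to-list is an assumption made for illustration, not something the message specifies.

```elisp
;; Sketch only: one possible way to register the "w" template shown above.
;; Assumes `org-website-clipper' is defined as in the message.
(require 'org-capture)

;; `add-to-list' compares with `equal', so evaluating this form again
;; does not add a duplicate entry.
(add-to-list 'org-capture-templates
             '("w" "Website" plain
               (function org-website-clipper)
               "* %a\n%T\n" :immediate-finish t))
```

With that in place, running M-x org-capture from an eww or w3m buffer and pressing w should call org-website-clipper. In the template, %a expands to a link to the page and %T to an active timestamp with date and time.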