From: Adam Porter
Subject: ANN: org-web-tools
Date: Fri, 21 Jul 2017 12:53:32 -0500
Message-ID: <87o9sdad7n.fsf@alphapapa.net>
To: emacs-orgmode@gnu.org

Hi friends,

I've just uploaded a package containing some code that I've been using
in my personal Emacs config for a while. It has commands and functions
useful for retrieving web page content and processing it into Org-mode
content.

For example, you can copy a URL to the clipboard or kill-ring, then run
a command that downloads the page, isolates the "readable" content with
eww-readable, converts it to Org-mode content with Pandoc, and displays
it in an Org-mode buffer. Another command does all of that but inserts
it as an Org entry instead of displaying it in a new buffer.

So you can quickly and easily read a web page in an Org buffer, or
insert a page's content as an entry into an Org buffer.
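
To make the download, convert, and insert steps described above
concrete, here is a rough sketch of the same pipeline in Python. It is
not the package's Emacs Lisp code: every function name below is
hypothetical, the eww-readable step is skipped entirely (the whole page
is converted), the entry timestamp and heading demotion are left out,
and it assumes a pandoc executable is available on PATH.

```python
import subprocess
import urllib.request
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def html_title(html):
    """Return the page title with whitespace collapsed, or "" if none."""
    parser = _TitleParser()
    parser.feed(html)
    return " ".join("".join(parser.parts).split())


def fetch(url):
    """Download URL and return its body as text (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def html_to_org(html):
    """Convert an HTML string to Org text by piping it through pandoc."""
    result = subprocess.run(
        ["pandoc", "--from=html", "--to=org"],
        input=html, capture_output=True, text=True, check=True,
    )
    return result.stdout


def org_link(url, title):
    """Format an Org bracket link; fall back to the URL if title is empty."""
    return "[[{}][{}]]".format(url, title or url)


def url_as_org_entry(url, fetcher=fetch, converter=html_to_org):
    """Build a top-level Org entry with the page under an "Article" heading.

    Assumption: this layout only approximates the entry described in the
    announcement; no timestamp is added and headings are not demoted.
    """
    html = fetcher(url)
    heading = org_link(url, html_title(html))
    return "* {}\n\n** Article\n\n{}".format(heading, converter(html))
```

Calling url_as_org_entry("https://example.com/") would download the
page and shell out to pandoc; the fetcher and converter parameters exist
only so each step can be swapped out or tried without network access.
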
You may also find the support functions useful in building your own
commands.

I haven't submitted it to MELPA yet; I'd like to get some feedback and
testing before doing that, so if any of these look useful to you,
please give it a test drive!

Here's a list of the commands and functions:

Commands

+ org-web-tools-insert-link-for-url: Insert an Org-mode link to the URL
  in the clipboard or kill-ring. Downloads the page to get the HTML
  title.
+ org-web-tools-insert-web-page-as-entry: Insert the web page for the
  URL in the clipboard or kill-ring as an Org-mode entry, as a sibling
  heading of the current entry.
+ org-web-tools-read-url-as-org: Display the web page for the URL in
  the clipboard or kill-ring as Org-mode text in a new buffer,
  processed with eww-readable.
+ org-web-tools-convert-url-list-to-page-entries: With point on a list
  of URLs in an Org-mode buffer, replace the list of URLs with a list
  of Org headings, each containing the web page content of that URL,
  converted to Org-mode text and processed with eww-readable.

Functions

+ org-web-tools--eww-readable: Return "readable" part of HTML with
  title.
+ org-web-tools--get-url: Return content for URL as string.
+ org-web-tools--html-title: Return title of HTML page.
+ org-web-tools--html-to-org-with-pandoc: Return string of HTML
  converted to Org with Pandoc.
+ org-web-tools--url-as-readable-org: Return string containing Org
  entry of URL's web page content. Content is processed with
  eww-readable and Pandoc. Entry will be a top-level heading, with
  article contents below a second-level "Article" heading, and a
  timestamp in the first-level entry for writing comments.
+ org-web-tools--demote-headings-below: Demote all headings in buffer
  so the highest level is below LEVEL.
+ org-web-tools--get-first-url: Return URL in clipboard, or first URL
  in the kill-ring, or nil if none.
+ org-web-tools--read-org-bracket-link: Return (TARGET . DESCRIPTION)
  for Org bracket LINK or next link on current line.
+ org-web-tools--remove-dos-crlf: Remove all DOS CRLF (^M) in buffer.

Thanks,
Adam