emacs-orgmode@gnu.org archives
From: Feng Shu <tumashu@gmail.com>
To: emacs-orgmode@gnu.org
Subject: Re: html to org-mode
Date: Sat, 04 Jan 2014 12:56:32 +0800
Message-ID: <87ha9kuyyn.fsf@news.tumashu-localhost.org>
In-Reply-To: <CAJ51ETrsyuAwpYOvJ2yqYsVirJdX4qcmkmaRVROj-mm4C3LF_g@mail.gmail.com> (John Kitchin's message of "Fri, 3 Jan 2014 21:40:14 -0500")

John Kitchin <jkitchin@andrew.cmu.edu> writes:

> Hi everyone,
>
> I was playing around with org-rss today, and it is pretty cool. I
> would like to customize the way the subheading bodies look though,
> primarily to unescape some html things like &lt;, to get rid of all
> the html tags, convert <a ..> to org-mode links, to download <img ...>
> so they can be displayed, etc... 
>
> for example a body of an rss entry looks like: 
>
> <title>Philip Herron: Cython Book</title>
> <guid>http://redbrain.co.uk/?p=147</guid>
> <link>http://redbrain.co.uk/cython-book/</link> <description><p>Hey
> all i thought i should really share that i actually wrote a book on
> Cython. The book has detailed examples and even shows you how you can
> extend native C/C++ applications in python by doing it for Tmux. <a
> href="http://bit.ly/195ahQs">http://bit.ly/195ahQs</a></p> <p><a
> href="http://redbrain.co.uk/wp-content/uploads/2013/12/photo.jpg"><img
> class="aligncenter size-full wp-image-148" alt="photo"
> src="http://redbrain.co.uk/wp-content/uploads/2013/12/photo.jpg"
> width="640" height="480" /></a>The code can be found: <a
> href="https://github.com/redbrain/cython-book">https://github.com/redbrain/cython-book</a></p></description>
> <pubDate>Tue, 10 Dec 2013 14:45:08 +0000</pubDate>
>
> I would like this simplified to something like:
> Philip Herron: Cython Book
>
> http://redbrain.co.uk/?p=147
>
> http://redbrain.co.uk/cython-book/
> Hey all i thought i should really share that i actually wrote a book
> on Cython. The book has detailed examples and even shows you how you
> can extend native C/C++ applications in python by doing it for Tmux.
> http://bit.ly/195ahQs
>
> [[feed-images/photo.jpg]]
>
> The code can be found: https://github.com/redbrain/cython-book
>
> basically, get the html code as close to org as reasonable. i found a
> way to get an html parse tree (libxml-parse-html-region start end),
> but I can't figure out how to convert that to the text I want. 
>
> Has anyone done anything like this?
>
> John

Maybe eww can help you...
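
To make that concrete, here is a rough sketch and not a tested solution.
eww renders pages through shr.el, and shr-insert-document accepts the
same parse tree that libxml-parse-html-region returns, so the first
function simply lets shr render the region as readable text. shr does
not write org syntax, so the second part walks the tree by hand and
emits org links. All my/... names are made up for illustration, and the
code assumes Emacs 24.4 or later built with libxml2, where a node looks
like (TAG ((ATTR . "value") ...) CHILD...) and text nodes are plain
strings with entities such as &lt; already decoded.

```elisp
(require 'shr)

(defun my/html-region-render (start end)
  "Replace the HTML between START and END with shr's rendering of it.
Hypothetical helper: the output is readable text, not org markup."
  (interactive "r")
  (let ((dom (libxml-parse-html-region start end)))
    (delete-region start end)
    (goto-char start)
    (shr-insert-document dom)))

(defun my/dom-to-org (node)
  "Return NODE, a libxml parse tree, as a string with org-style links.
Hypothetical helper: handles only a few tags and drops all other markup."
  (cond
   ((stringp node) node)                ; text node
   ((not (consp node)) "")              ; nil or anything unexpected
   (t
    (let ((tag (car node))
          (attrs (cadr node))
          (body (mapconcat #'my/dom-to-org (cddr node) "")))
      (pcase tag
        ('a (let ((href (cdr (assq 'href attrs))))
              (cond
               ((null href) body)
               ;; A link that wraps an image: keep just the image link.
               ((string-prefix-p "[[" body) body)
               ;; No description, or description equal to the target.
               ((member body (list "" href)) (format "[[%s]]" href))
               (t (format "[[%s][%s]]" href body)))))
        ('img (let ((src (cdr (assq 'src attrs))))
                (if src (format "[[%s]]" src) "")))
        ('p (concat body "\n\n"))
        ('br "\n")
        ((or 'script 'style 'comment) "")
        (_ body))))))

(defun my/html-region-to-org (start end)
  "Replace the HTML between START and END with my/dom-to-org's output."
  (interactive "r")
  (let ((text (my/dom-to-org (libxml-parse-html-region start end))))
    (delete-region start end)
    (goto-char start)
    (insert text)))
```

Select the entry body and call M-x my/html-region-to-org. Image links
still point at the remote URL; downloading each image into something
like feed-images/ and rewriting the link would be a separate step that
this sketch does not attempt.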

>
> -----------------------------------
> John Kitchin
> Associate Professor
> Doherty Hall A207F
> Department of Chemical Engineering
> Carnegie Mellon University
> Pittsburgh, PA 15213
> 412-268-7803
> http://kitchingroup.cheme.cmu.edu

-- 
