emacs-orgmode@gnu.org archives
* importing html
@ 2008-02-08 15:30 Brian Gough
  2008-02-08 16:15 ` Bastien Guerry
  2008-02-11 11:05 ` Rick Moynihan
  0 siblings, 2 replies; 3+ messages in thread
From: Brian Gough @ 2008-02-08 15:30 UTC
  To: emacs-orgmode

Is there an html->org mode converter?  I have some web pages I want to
import into org. Thanks.

-- 
Brian Gough


* Re: importing html
  2008-02-08 15:30 importing html Brian Gough
@ 2008-02-08 16:15 ` Bastien Guerry
  2008-02-11 11:05 ` Rick Moynihan
  1 sibling, 0 replies; 3+ messages in thread
From: Bastien Guerry @ 2008-02-08 16:15 UTC
  To: emacs-orgmode

Brian Gough <bjg@gnu.org> writes:

> Is there an html->org mode converter?  I have some web pages I want to
> import into org. Thanks.

AFAIK there is no such tool.  And designing a generic tool for this
might be really tricky.  I guess it's easier to code something ad hoc,
just suitable for the specific structure of the HTML files you want to
import.
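
To illustrate what such an ad hoc converter might look like, here is a
minimal sketch in Python using only the standard library's html.parser.
It is not an existing tool: the names OrgConverter and html_to_org are
hypothetical, and it assumes the input uses only headings, paragraphs,
list items, links and bold/italic markup. Anything else (tables, nested
lists, script or style content) would need handling specific to the
pages being imported.

```python
from html.parser import HTMLParser


class OrgConverter(HTMLParser):
    """Hypothetical ad hoc converter for a small subset of HTML."""

    HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}
    MARKUP = {"b": "*", "strong": "*", "i": "/", "em": "/"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # finished Org lines
        self.buf = []        # text fragments of the block being built
        self.prefix = ""     # "* ", "** ", "- " or "" for a paragraph
        self.in_link = False
        self.href = None
        self.link_text = []

    def _flush(self):
        # Collapse whitespace and emit the current block, if any.
        text = " ".join("".join(self.buf).split())
        if text:
            self.out.append(self.prefix + text)
            if not self.prefix:
                self.out.append("")  # a blank line ends a paragraph
        self.buf = []
        self.prefix = ""

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._flush()
            self.prefix = "*" * self.HEADINGS[tag] + " "
        elif tag == "p":
            self._flush()
        elif tag == "li":
            self._flush()
            self.prefix = "- "
        elif tag == "a":
            self.in_link = True
            self.href = dict(attrs).get("href")
            self.link_text = []
        elif tag in self.MARKUP:
            self.buf.append(self.MARKUP[tag])

    def handle_endtag(self, tag):
        if tag in self.HEADINGS or tag in ("p", "li"):
            self._flush()
        elif tag == "a" and self.in_link:
            label = "".join(self.link_text)
            if self.href:
                self.buf.append("[[%s][%s]]" % (self.href, label))
            else:
                self.buf.append(label)
            self.in_link = False
            self.href = None
        elif tag in self.MARKUP:
            self.buf.append(self.MARKUP[tag])

    def handle_data(self, data):
        if self.in_link:
            self.link_text.append(data)
        else:
            self.buf.append(data)


def html_to_org(html):
    """Convert an HTML string to Org text (sketch, limited tag set)."""
    conv = OrgConverter()
    conv.feed(html)
    conv.close()
    conv._flush()  # emit any trailing text outside a closed block
    return "\n".join(conv.out).rstrip() + "\n"
```

Calling html_to_org on the contents of a saved page returns a string
that can be written to a .org file. Tags outside the handled set are
ignored and their text is kept, which is why a real import would likely
need rules tailored to the pages in question.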

Maybe you can send one of your HTML files?

-- 
Bastien


* Re: importing html
  2008-02-08 15:30 importing html Brian Gough
  2008-02-08 16:15 ` Bastien Guerry
@ 2008-02-11 11:05 ` Rick Moynihan
  1 sibling, 0 replies; 3+ messages in thread
From: Rick Moynihan @ 2008-02-11 11:05 UTC
  To: Brian Gough; +Cc: emacs-orgmode

Brian Gough wrote:
> Is there an html->org mode converter?  I have some web pages I want to
> import into org. Thanks.
> 

Whenever I need to parse HTML, I turn to hpricot, a fantastic Ruby-based
parser which does a great job of handling the worst of the web's HTML.

http://code.whytheluckystiff.net/hpricot/

It makes simple screen scraping and file-format conversion a matter of
minutes.

R.

