From: Rick Moynihan
Subject: Re: importing html
Date: Mon, 11 Feb 2008 11:05:28 +0000
Message-ID: <47B02BF8.1000501@calicojack.co.uk>
In-Reply-To: <87wspftqh9.wl%bjg@network-theory.co.uk>
To: Brian Gough
Cc: emacs-orgmode@gnu.org

Brian Gough wrote:
> Is there an html->org mode converter? I have some web pages I want to
> import into org. Thanks.
>

Whenever I need to parse HTML, I turn to hpricot, a fantastic Ruby-based
parser that does a great job of handling the worst of the web's HTML.

http://code.whytheluckystiff.net/hpricot/

It makes simple screen scraping and file-format conversion the job of
minutes.

R.
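
To give a rough idea of what such an HTML-to-Org conversion might look like, here is a minimal sketch. It is not hpricot and not an existing converter: it uses Python's standard html.parser module instead, and the names OrgSketch and html_to_org are made up for illustration. It assumes you only care about headings, paragraphs, list items and links, and it silently drops everything else.

```python
from html.parser import HTMLParser


class OrgSketch(HTMLParser):
    """Hypothetical helper: map a few HTML tags to Org-mode text."""

    HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}
    BLOCKS = ("p", "li")

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []      # finished Org lines
        self.buf = []        # text pieces of the block being collected
        self.prefix = None   # Org prefix of the open block; None = no block open
        self.href = None     # target of the link being collected, if any
        self.link_text = []

    def _flush(self):
        # Emit the open block (if any) as one whitespace-normalised line.
        if self.prefix is not None:
            text = " ".join("".join(self.buf).split())
            if text:
                self.lines.append(self.prefix + text)
        self.prefix = None
        self.buf = []

    def _open(self, prefix):
        # Opening a block closes the previous one, so unclosed <p>/<li> still work.
        self._flush()
        self.prefix = prefix

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._open("*" * self.HEADINGS[tag] + " ")
        elif tag == "p":
            self._open("")
        elif tag == "li":
            self._open("- ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.link_text = []

    def handle_data(self, data):
        if self.href is not None:
            self.link_text.append(data)
        elif self.prefix is not None:
            self.buf.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            label = " ".join("".join(self.link_text).split())
            if label:
                link = "[[%s][%s]]" % (self.href, label)
            else:
                link = "[[%s]]" % self.href
            if self.prefix is not None:
                self.buf.append(link)
            self.href = None
        elif tag in self.HEADINGS or tag in self.BLOCKS:
            self._flush()


def html_to_org(html):
    """Return Org-mode text for the subset of HTML handled by OrgSketch."""
    parser = OrgSketch()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit a block left open at end of input
    return "\n".join(parser.lines)


if __name__ == "__main__":
    print(html_to_org("<h2>Links</h2><p>See <a href='http://orgmode.org/'>Org</a>.</p>"))
```

Nested lists, tables, inline emphasis and text outside any block are all ignored here, so treat it as a starting point to adapt per page rather than a general converter.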