From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matt Price
Subject: Re: HTML --> Org-mode?
Date: Mon, 26 Jan 2015 23:42:06 -0500
References: <87siex473l.fsf@gmail.com> <87a9154670.fsf@gmail.com>
In-Reply-To: <87a9154670.fsf@gmail.com>
To: Org Mode
List-Id: "General discussions about Org-mode."

I think the answer may be something like:

(shell-command-to-string (concat "pandoc -f html -t org <<< '" :html "'"))

Though I'm not quite sure how to go about it just yet.

On Mon, Jan 26, 2015 at 3:50 PM, Tory S. Anderson
<torys.anderson@gmail.com> wrote:

> man pandoc will be your friend. It guided me to the following simple
> (interactive) use:
>
> pandoc -f html -t org
> <b> how are you? </b>
> <i> I am good </i>
> *how are you?* /I am good/
>
> I won't be able to help you much farther than that, though.
> - Tory
>
> Matt Price <moptop99@gmail.com> writes:
>
> > That should be enough. I would need to feed a string from Emacs to
> > pandoc, then capture the output as a new string that can be output in
> > the export filter. Do you know how to do that part?
> > Thanks,
> > Matt
> >
> > On Mon, Jan 26, 2015 at 3:31 PM, Tory S. Anderson
> > <torys.anderson@gmail.com> wrote:
> >
> >     Using the magic wizard program Pandoc, I just had success with a
> >     simple little example:
> >
> >     pandoc -o test.org test.html
> >
> >     Input test.html:
> >     <html>
> >     <body>
> >     <strong>TEST strong!</strong>
> >     <div class='table'>
> >     <div class='cell'>Cell 1</div>
> >     <div class='cell'>Cell 2</div>
> >     <div class='cell'>Cell 3</div>
> >     <div class='cell'>Cell 4</div>
> >     </div>
> >     </body>
> >     </html>
> >
> >     Output test.org:
> >     *TEST strong!*
> >     Cell 1
> >     Cell 2
> >     Cell 3
> >     Cell 4
> >
> >     I'm not sure how sophisticated the strings you are dealing with
> >     are, but pandoc might do the trick for you.
> >     - Tory
> >
> >     Matt Price <moptop99@gmail.com> writes:
> >
> >     > Hmm,
> >     >
> >     > Looks like I asked this about a year ago and didn't follow up
> >     > on it. Does anyone know a way to generate org-mode syntax from
> >     > an HTML string? I would like to extend zotxt slightly (see my
> >     > last post), and at present zotxt can pull citations &
> >     > bibliography entries from Zotero only in plain-text and HTML
> >     > form. The plain-text form loses information, so I would like
> >     > to translate the HTML into org-mode syntax.
> >     >
> >     > Since this would have to happen in the context of an
> >     >
> >     > (org-add-link-type )
> >     >
> >     > invocation, it would be best if this could be done directly in
> >     > Emacs somehow...
> >     >
> >     > Thanks as always,
> >     >
> >     > Matt
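
A minimal sketch of the "feed a string from Emacs to pandoc and capture the output" step discussed in this thread. It is only one possible approach, not something taken from zotxt or Org. The function name my/html-to-org is hypothetical, and the code assumes a pandoc executable can be found on exec-path. Instead of building a shell command with a here-string, it pipes the string to pandoc on standard input with call-process-region. That avoids two likely problems with the shell-command-to-string idea: "<<<" is a bash feature that a plain /bin/sh may not support, and wrapping the HTML in single quotes breaks as soon as the HTML itself contains a single quote.

```elisp
;; Hypothetical helper for illustration; not part of zotxt or Org.
;; Assumes the pandoc executable is available on `exec-path'.
(defun my/html-to-org (html)
  "Convert the string HTML to Org syntax by piping it through pandoc.
Return the converted text as a string, or signal an error if
pandoc exits with a non-zero status."
  (with-temp-buffer
    (insert html)
    ;; Send the whole temp buffer to pandoc on stdin.  DELETE = t
    ;; removes the input text and BUFFER = t inserts pandoc's output
    ;; in its place, so the buffer ends up holding only the result.
    (let ((status (call-process-region (point-min) (point-max)
                                       "pandoc" t t nil
                                       "-f" "html" "-t" "org")))
      (unless (eq status 0)
        (error "pandoc failed: %s" (buffer-string)))
      (buffer-string))))

;; Example call with the snippet from the interactive pandoc session
;; quoted in this thread:
;; (my/html-to-org "<b> how are you? </b>")
```

Because no shell is involved, the HTML string needs no quoting or escaping, and the return value can be used directly wherever a string is expected, for instance inside an export function.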