emacs-orgmode@gnu.org archives
* HTML2Org ?
@ 2013-07-30 21:35 Fabrice Popineau
  2013-07-30 22:06 ` Neil Smithline
  0 siblings, 1 reply; 4+ messages in thread
From: Fabrice Popineau @ 2013-07-30 21:35 UTC (permalink / raw)
  To: emacs-orgmode@gnu.org

Anybody tried to write an HTML to Org parser (even a crude one) ?

Best regards,

Fabrice

* Re: HTML2Org ?
  2013-07-30 21:35 HTML2Org ? Fabrice Popineau
@ 2013-07-30 22:06 ` Neil Smithline
  2013-07-30 22:15   ` Fabrice Popineau
  0 siblings, 1 reply; 4+ messages in thread
From: Neil Smithline @ 2013-07-30 22:06 UTC (permalink / raw)
  To: Fabrice Popineau; +Cc: Org Mode

How would you get the document structure out of the HTML unless it only
used heading tags?

Even something as simple as bold could be hidden within some monstrous CSS.

From my mobile. Please excuse abbrvs, tpyos, and auto ward correction.

* Re: HTML2Org ?
  2013-07-30 22:06 ` Neil Smithline
@ 2013-07-30 22:15   ` Fabrice Popineau
  2013-07-30 23:59     ` Thorsten Jolitz
  0 siblings, 1 reply; 4+ messages in thread
From: Fabrice Popineau @ 2013-07-30 22:15 UTC (permalink / raw)
  To: Neil Smithline; +Cc: Org Mode

I was wondering about something that does the reverse of the exporter: get a
fragment of Org text from an exported HTML fragment.
However, that won't be much easier: links, macros, babel ... I guess only the
basic markup could be reversed.
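
To illustrate what reversing "only the basic markup" might look like, here is a minimal sketch using Python's standard `html.parser`. It is not taken from Org or from any existing tool; the names `CrudeHtmlToOrg` and `html_to_org` are hypothetical, and it only handles headings, paragraphs, list items, bold, italic, inline code and links. Macros, babel blocks and anything styled purely through CSS are ignored.

```python
import re
from html.parser import HTMLParser


class CrudeHtmlToOrg(HTMLParser):
    """Hypothetical, deliberately crude HTML -> Org converter (basic markup only)."""

    # Inline HTML tags mapped to the Org emphasis marker that wraps them.
    INLINE = {"b": "*", "strong": "*", "i": "/", "em": "/", "code": "~"}
    # Heading tags mapped to the number of leading stars.
    HEADINGS = {f"h{n}": n for n in range(1, 7)}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []          # pieces of Org text produced so far
        self.href = None       # target of the <a> currently open, if any
        self.link_text = []    # text collected inside the open <a>

    def _emit(self, text):
        # Text inside a link is buffered until the closing </a> is seen.
        (self.link_text if self.href is not None else self.out).append(text)

    def handle_starttag(self, tag, attrs):
        if tag in self.INLINE:
            self._emit(self.INLINE[tag])
        elif tag in self.HEADINGS:
            self.out.append("\n" + "*" * self.HEADINGS[tag] + " ")
        elif tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.link_text = []
        elif tag == "li":
            self.out.append("\n- ")
        elif tag == "p":
            self.out.append("\n\n")

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self._emit(self.INLINE[tag])
        elif tag in self.HEADINGS:
            self.out.append("\n")
        elif tag == "a" and self.href is not None:
            text = "".join(self.link_text)
            href, self.href = self.href, None
            self.out.append(f"[[{href}][{text}]]" if text else f"[[{href}]]")

    def handle_data(self, data):
        # Collapse runs of whitespace, as a browser would when rendering.
        self._emit(re.sub(r"\s+", " ", data))


def html_to_org(html):
    """Convert an HTML fragment to Org text, covering basic markup only."""
    parser = CrudeHtmlToOrg()
    parser.feed(html)
    parser.close()
    lines = [line.strip() for line in "".join(parser.out).split("\n")]
    text = re.sub(r"\n{3,}", "\n\n", "\n".join(lines))
    return text.strip() + "\n"


if __name__ == "__main__":
    print(html_to_org('<h1>Title</h1><p>Some <b>bold</b> text.</p>'))
```

The sketch relies on semantic tags being present, so it does nothing for the case raised earlier in the thread where structure or emphasis is expressed only through CSS.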

Fabrice


* Re: HTML2Org ?
  2013-07-30 22:15   ` Fabrice Popineau
@ 2013-07-30 23:59     ` Thorsten Jolitz
  0 siblings, 0 replies; 4+ messages in thread
From: Thorsten Jolitz @ 2013-07-30 23:59 UTC (permalink / raw)
  To: emacs-orgmode

Fabrice Popineau <fabrice.popineau@gmail.com> writes:

> I was wondering about something doing the reverse of the exporter: get some
> fragment of Org text
> from an exported HTML fragment.
> However, it won't be much easier: links, macros, babel ... I guess only the
> basic markup could be reversed.

Do you know pandoc?

,-------------------------------------------------------------------------
| About pandoc
| 
| If you need to convert files from one markup format into another, pandoc
| is your swiss-army knife. Pandoc can convert documents in markdown,
| reStructuredText, textile, HTML, DocBook, LaTeX, or MediaWiki markup to
| 
|     - HTML formats: XHTML, HTML5, and HTML slide shows using Slidy,
|       Slideous, S5, or DZSlides
|     - Word processor formats: Microsoft Word docx,
|       OpenOffice/LibreOffice ODT, OpenDocument XML
|     - Ebooks: EPUB version 2 or 3, FictionBook2
|     - Documentation formats: DocBook, GNU TexInfo, Groff man pages
|     - TeX formats: LaTeX, ConTeXt, LaTeX Beamer slides
|     - PDF via LaTeX
|     - Lightweight markup formats: Markdown, reStructuredText, AsciiDoc,
|       MediaWiki markup, Emacs Org-Mode, Textile
`-------------------------------------------------------------------------

http://johnmacfarlane.net/pandoc/
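
For the HTML to Org case discussed in this thread, a pandoc call could be wrapped roughly as follows. This is only a sketch: it assumes pandoc is installed and on the PATH, the helper names `pandoc_command` and `convert` are hypothetical, and the file names are placeholders. The `-f`, `-t` and `-o` options select the input format, the output format and the output file.

```python
import shutil
import subprocess


def pandoc_command(html_path, org_path):
    """Build the pandoc argument list for an HTML -> Org conversion."""
    return ["pandoc", "-f", "html", "-t", "org", "-o", org_path, html_path]


def convert(html_path, org_path):
    """Run pandoc if it is available; return False when it is not installed."""
    if shutil.which("pandoc") is None:
        return False  # assumption: pandoc must be on the PATH
    # check=True raises CalledProcessError if pandoc exits with an error.
    subprocess.run(pandoc_command(html_path, org_path), check=True)
    return True


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    if not convert("page.html", "page.org"):
        print("pandoc not found on PATH")
```

How well the result matches the original Org source will depend on the HTML; Org-specific constructs such as macros or babel blocks are not expected to survive the round trip.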

-- 
cheers,
Thorsten
