From: Carsten Dominik <dominik@science.uva.nl>
To: Alex Ott <alexott@gmail.com>
Cc: emacs-orgmode@gnu.org
Subject: Re: DocBook exporter code (version 1.0)
Date: Sun, 8 Mar 2009 14:46:46 +0100
Message-ID: <F8142DCB-518B-41F7-9355-67A613DC07F2@uva.nl>
In-Reply-To: <m2ab7w8i46.fsf@flash.lan>


On Mar 8, 2009, at 10:43 AM, Alex Ott wrote:

> Hello
>
>>>>>> "CD" == Carsten Dominik writes:
> ....
> CD> One of the really weak features in Org's design is that exporting is
> CD> not implemented in a generic way.  All exporters share a preprocessing
> CD> step that turns Org format into something a little more sane and
> CD> consistent.  Then each exporter goes its own way.  This setup makes
> CD> maintenance sort of a nightmare, because each change to Org syntax
> CD> needs to be implemented in all exporters separately.  Maybe you have
> CD> read my swearing when I was trying to fix the LaTeX exporter which I
> CD> did not understand completely at first - it was written by Bastien.
>
> CD> I had really hoped that the next step in exporting Org would be to
> CD> rewrite the exporter from scratch, in a generic way, that will then
> CD> make supporting different formatters more stable and easy.  Adding a
> CD> new exporter does not get us closer to that idea.
>
> I think that instead of parsing text directly, we need to write a generic
> exporter that will export all data as a tree, consisting of a header +
> a list of the entries, and inside these entries provide all needed
> information about the text (markup, URL information, etc.).  And for a new
> export format, the author will define only a small piece of code -- mostly
> header generation, and replacement tables for formatting tags, URL
> decorations, etc.

Yes, exactly.  This is exactly the right idea.  A complete
parser that captures the *entire* structure including all meta data,
without being specific to a backend.
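
To make the idea concrete, here is a minimal sketch of such a split, written
in Python rather than Emacs Lisp purely for brevity.  It is not code from Org:
the names Span, Entry, HTML, LATEX, render and export are all hypothetical,
and the tree covers only headings and inline markup.  The point is that the
tree knows nothing about any output format, and each backend is just a header,
a footer and a few replacement tables.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """A run of body text; markup is 'bold', 'italic', 'code' or None."""
    text: str
    markup: Optional[str] = None


@dataclass
class Entry:
    """One outline entry: a title, its body spans, and its child entries."""
    title: str
    body: List[Span] = field(default_factory=list)
    children: List["Entry"] = field(default_factory=list)


# Hypothetical backend tables.  These are the only backend-specific parts.
# No escaping of special characters is done; a real backend would need it.
HTML = {
    "header": "<html><body>\n",
    "footer": "</body></html>\n",
    "heading": lambda level, title: "<h%d>%s</h%d>\n" % (level, title, level),
    "paragraph": "<p>{}</p>\n",
    "markup": {"bold": "<b>{}</b>", "italic": "<i>{}</i>",
               "code": "<code>{}</code>"},
}

LATEX_SECTIONS = ["section", "subsection", "subsubsection"]
LATEX = {
    "header": "\\documentclass{article}\n\\begin{document}\n",
    "footer": "\\end{document}\n",
    # Levels deeper than 3 are clamped to \subsubsection in this sketch.
    "heading": lambda level, title: "\\%s{%s}\n" % (
        LATEX_SECTIONS[min(level, 3) - 1], title),
    "paragraph": "{}\n\n",
    "markup": {"bold": "\\textbf{{{}}}", "italic": "\\emph{{{}}}",
               "code": "\\texttt{{{}}}"},
}


def render(entries, backend, level=1):
    """Walk the tree once; all formatting decisions come from the tables."""
    out = []
    for entry in entries:
        out.append(backend["heading"](level, entry.title))
        if entry.body:
            text = "".join(
                backend["markup"].get(span.markup, "{}").format(span.text)
                for span in entry.body)
            out.append(backend["paragraph"].format(text))
        out.append(render(entry.children, backend, level + 1))
    return "".join(out)


def export(entries, backend):
    """Wrap the rendered tree in the backend's header and footer."""
    return backend["header"] + render(entries, backend) + backend["footer"]


doc = [Entry("Intro", [Span("Hello "), Span("world", "bold")],
             [Entry("Details")])]
print(export(doc, HTML))
print(export(doc, LATEX))
```

Adding a third format in this sketch means writing one more table, not one
more tree walker, which is the maintenance property the paragraph above is
after.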

There is a start for such a system in the git repo,
in EXPERIMENTAL/org-export.el, written by Bastien.  It does a
pass to get the headline structure of the document, and the properties
as meta data in a property list.   I believe it might not
capture TODO state and/or priority, but I am not certain.
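
For illustration, a headline pass of this kind could look roughly like the
sketch below.  It is Python, not Emacs Lisp, and it is not a transcription of
org-export.el: parse_outline and both regular expressions are hypothetical.
It assumes only the keywords TODO and DONE and priorities A to C, whereas in
Org both are configurable, and it ignores everything except headlines and
property drawers.

```python
import re

# Assumed headline shape: stars, optional TODO keyword, optional priority
# cookie such as [#A], then the title.
HEADLINE = re.compile(
    r"^(?P<stars>\*+)\s+"
    r"(?:(?P<todo>TODO|DONE)\s+)?"
    r"(?:\[#(?P<priority>[A-C])\]\s+)?"
    r"(?P<title>.*?)\s*$"
)
# A line inside a property drawer, e.g. ":EXPORT_FILE: out.xml".
PROPERTY = re.compile(r"^\s*:(?P<key>[^:\s]+):\s*(?P<value>.*?)\s*$")


def parse_outline(text):
    """Return the top-level headlines as nested dicts.

    Each headline carries its level, title, TODO state, priority, a dict
    of properties, and a list of child headlines.  Body text is skipped.
    """
    root = {"level": 0, "children": []}
    stack = [root]          # path from the root to the current headline
    in_drawer = False
    for line in text.splitlines():
        m = HEADLINE.match(line)
        if m:
            node = {
                "level": len(m["stars"]),
                "title": m["title"],
                "todo": m["todo"],
                "priority": m["priority"],
                "properties": {},
                "children": [],
            }
            # Pop back to the nearest headline with a smaller level.
            while stack[-1]["level"] >= node["level"]:
                stack.pop()
            stack[-1]["children"].append(node)
            stack.append(node)
            in_drawer = False
            continue
        stripped = line.strip()
        if stripped == ":PROPERTIES:":
            # Drawers before the first headline are ignored in this sketch.
            in_drawer = len(stack) > 1
        elif stripped == ":END:":
            in_drawer = False
        elif in_drawer:
            p = PROPERTY.match(line)
            if p:
                stack[-1]["properties"][p["key"]] = p["value"]
    return root["children"]


sample = """* TODO [#A] Write parser
:PROPERTIES:
:EXPORT_FILE: out.xml
:END:
Some body text.
** Notes
* DONE Ship it
"""
for headline in parse_outline(sample):
    print(headline["level"], headline["todo"], headline["priority"],
          headline["title"], headline["properties"])
```

Keeping TODO state and priority as separate fields, instead of leaving them
inside the title string, lets a backend decide on its own how, or whether,
to show them.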

The LaTeX exporter is based on this parser, but it is a half-done job
as it does the parsing only for the outline structure, not really
for the content.

There is also a parser for plain lists, in org-list.el, which
is also used in the LaTeX exporter (yes, Bastien had many of the
right ideas).

There is lots of other meta info like targets, links, and formatting
information that would have to be encoded in some way.  It might be useful
to start by running the export preprocessor on the entire file and working
from its output.  One of the hard things will be to identify stuff that is
LaTeX, but this code, too, is in principle present.

This would be great to achieve.  Be warned that it will be hard
to get right, and that you and others would largely have to drive
it.  I will help, but cannot do the main thrust - maintaining Org
as it is and adding some features takes most of the energy I can
currently contribute.

- Carsten
