emacs-orgmode@gnu.org archives
* [ANN] Org Export in contrib
From: Nicolas Goaziou @ 2011-11-25 17:32 UTC
  To: Org Mode List

Hello,

I've pushed org-export.el to contrib. It's a general export engine,
built on top of org-elements, that aims to simplify the life of both
developers and maintainers (and, therefore, of end-users).

It is meant to supersede org-exp.el and org-special-blocks.el. Keep in
mind, though, that, as advanced as it is, it isn't yet a drop-in
replacement for them. It still lacks an interface (à la `org-export'),
back-ends, and tons of testing and improvements. That being said, it's
usable anyway and one can already write back-ends for it. I'll show
a silly example later in this mail.

Now, let's have a peek into the guts of that beast.

Besides the parser, the generic exporter is made of three distinct
parts:

- The communication channel consists of a property list, which is
  created and updated during the process.  Its use is to offer every
  piece of information, be it export options or contextual data, all in
  a single place.  The exhaustive list of properties is given in "The
  Communication Channel" section of the file.

- The transcoder walks the parse tree, ignores or treats as plain text
  elements and objects according to export options, and eventually calls
  back-end specific functions to do the real transcoding, concatenating
  their return values along the way.

- The filter system is activated at the very beginning and the very end
  of the export process, and each time an element or an object has been
  converted.  It is the entry point to fine-tune standard output from
  back-end transcoders.

The core function is `org-export-as'.  It returns the transcoded buffer
as a string.

In order to derive an exporter out of this generic implementation, one
can define a transcode function for each element or object.  Such
a function should return a string for the corresponding element, without
any trailing space, or nil.  It must accept three arguments:

  1. the element or object itself,
  2. its contents, or nil when it isn't recursive,
  3. the property list used as a communication channel.

If no such function is found, that element or object type will simply be
ignored, along with any separating blank line.  The same will happen if
the function returns the nil value.  If that function returns the empty
string, the type will be ignored, but the blank lines will be kept.
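
To make these three cases concrete, here is a minimal sketch.
`demo-assemble' is a hypothetical helper, not something found in
org-export.el; it only models the rules above, assuming each regular
output is followed by a newline and then by the element's blank lines.

#+begin_src emacs-lisp
;; Hypothetical helper, for illustration only.
(defun demo-assemble (results)
  "Concatenate RESULTS, a list of (OUTPUT . BLANK-LINES) pairs.
OUTPUT is what a transcoder returned.  BLANK-LINES is the number
of blank lines following the element in the source buffer."
  (mapconcat
   (lambda (pair)
     (let ((output (car pair))
           (blank (make-string (cdr pair) ?\n)))
       (cond ((null output) "")                 ; ignored, blank lines too
             ((string= output "") blank)        ; ignored, blank lines kept
             (t (concat output "\n" blank)))))  ; regular output
   results ""))

;; Second element returns nil, third returns the empty string.
(demo-assemble '(("First" . 1) (nil . 2) ("" . 1) ("Last" . 0)))
#+end_src

The nil result drops its two blank lines, whereas the empty string
still contributes its single blank line to the output.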

Contents, when not nil, are stripped of any global indentation
(although the relative one is preserved).  They also always end with
a single newline character.

These functions must follow a strict naming convention:
`org-BACKEND-TYPE' where, obviously, BACKEND is the name of the export
back-end and TYPE the type of the element or object handled.
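
A naming convention like this one lends itself to a simple lookup by
name.  The sketch below is only my assumption of how such a lookup could
be written; `demo-transcoder' and the `demo' back-end are made up, and
the real code in org-export.el may well differ.

#+begin_src emacs-lisp
;; Hypothetical lookup helper, for illustration only.
(defun demo-transcoder (backend type)
  "Return the function named org-BACKEND-TYPE, or nil if undefined."
  (let ((fun (intern-soft (format "org-%s-%s" backend type))))
    (and fun (fboundp fun) fun)))

;; A stand-in transcoder for the made-up `demo' back-end.
(defun org-demo-bold (_bold contents _info)
  "Wrap CONTENTS in asterisks."
  (format "*%s*" contents))

(demo-transcoder 'demo 'bold)   ; the symbol `org-demo-bold'
(demo-transcoder 'demo 'table)  ; nil: tables would simply be ignored
(funcall (demo-transcoder 'demo 'bold) nil "text" nil)
#+end_src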

Moreover, two additional functions can be defined.  On the one hand,
`org-BACKEND-template' returns the final transcoded string, and can be
used to add a preamble and a postamble to the document's body.  It must
accept two arguments: the transcoded string and the property list
containing export options.  On the other hand, `org-BACKEND-plain-text',
when defined, is called on every piece of text not recognized as an
element or an object.  It must accept two arguments: the text string and
the information channel.
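
As a rough sketch of these two signatures, here is a pair of stand-ins
for a made-up `demo' back-end.  The property list is built by hand and
`:demo-title' is a hypothetical key, not one provided by the real
communication channel.

#+begin_src emacs-lisp
;; Stand-ins for a made-up `demo' back-end, for illustration only.
(defun org-demo-plain-text (text _info)
  "Return TEXT with ampersands escaped."
  (replace-regexp-in-string "&" "&amp;" text t t))

(defun org-demo-template (contents info)
  "Wrap CONTENTS between a title line and a footer, reading INFO."
  (concat "TITLE: " (plist-get info :demo-title) "\n\n"
          contents
          "\n-- end --\n"))

;; Here the calls are chained by hand; during a real export the
;; generic exporter would be the one calling them.
(org-demo-template (org-demo-plain-text "Tom & Jerry\n" nil)
                   '(:demo-title "Cartoons"))
#+end_src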

Any back-end can define its own variables.  Among them, the
customizable ones should belong to the `org-export-BACKEND' group.
Also, a special variable, `org-BACKEND-option-alist', allows defining
buffer keywords and "#+options:" items specific to that back-end.  See
`org-export-option-alist' for supported defaults and syntax.

Tools for common tasks across back-ends are implemented in the last
part of the file.


* For Maintainers

  To word it differently, this exporter doesn't rely on any
  text-property during the process. Thus, it makes
  `org-if-unprotected-at' and the like obsolete in the whole code base.
  Org core doesn't have to bother anymore about its exporter's
  weaknesses.

  Also, the buffer's pre-processing is reduced to its strict minimum:
  Babel code expansion. No footnote normalization, no list markers to
  add and remove...

  Being only a beefed-up parse tree reader, the exporter makes any
  element or object added to Elements available with no further
  modification. Back-ends just have to create the appropriate new
  transcoders, unless that element or object should be ignored anyway.


* For Developers

  All data needed is available in two places: the properties associated
  to the element being transcoded, through the use of
  `org-element-get-property', and the communication channel, with the
  help of `plist-get'. Period.

  Also, the exporter takes care of all the filtering required by
  options, and enforces the same number of blank lines in the Org buffer
  and in the source code (though this can be overcome with the use of
  filters). It's easier this way to concentrate on the shape of the
  output.

  Tools for common tasks (like building table of contents or listings,
  or numbering headlines) are provided in the library.


* For Users

  Hooks are gone. Sorry. Most of them happened during a pre-process part
  that doesn't exist anymore.

  Though, there are three ways to configure the output, in increasing
  order of power:

  1. Variables (customizable or not) are still there, provided by either
     the exporter itself or its back-ends.

  2. Filter sets are provided to fine-tune output of any
     back-end. A filter set is a list of functions, applied in FIFO
     order. Each function takes two arguments: the string returned by
     the previous function (or the back-end output for the first one)
     and the back-end as a symbol. The return value of the last function
     replaces the back-end's output. If one of the return values is nil,
     the element or object on which the filters are applied is ignored
     in the final output.

     Also, three special filter sets apply on the parse tree, on plain
     text, and on the final output.

     For example, the LaTeX back-end has the bad habit of "boldifying"
     deadline, scheduled and closed strings close to time-stamps in the
     buffer. I'd rather have them emphasized. Obviously, I don't want to
     annoy other back-ends with this. The following will do the trick.

     #+begin_src emacs-lisp
     (add-to-list 'org-export-filter-time-stamp-functions
                  (lambda (ts backend)
                    (if (not (eq backend 'latex))
                        ts
                      (replace-regexp-in-string "textbf" "emph" ts))))
     #+end_src

  3. Whole parts of any back-end can be redefined (or advised).  For
     example, if I don't like the way the LaTeX back-end transcodes
     verbatim text, I can always create an `org-latex-verbatim' function
     of my own.
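
  The way a filter set is run, as described in item 2, can be pictured
  with the following sketch. `demo-apply-filters' is a hypothetical
  re-implementation written for this mail, not the exporter's actual
  code. It assumes that the chain stops as soon as one filter returns
  nil.

  #+begin_src emacs-lisp
  ;; Hypothetical filter loop, for illustration only.
  (defun demo-apply-filters (filters value backend)
    "Call each function in FILTERS on VALUE, in order, for BACKEND.
  Each filter receives the return value of the previous one.  Stop
  and return nil as soon as one filter returns nil."
    (let ((result value))
      (while (and filters result)
        (setq result (funcall (car filters) result backend)
              filters (cdr filters)))
      result))

  ;; First filter swaps the LaTeX macro, second one appends a "%".
  (demo-apply-filters
   (list (lambda (s backend)
           (if (eq backend 'latex)
               (replace-regexp-in-string "textbf" "emph" s t t)
             s))
         (lambda (s _backend) (concat s "%")))
   "\\textbf{DEADLINE:}" 'latex)
  #+end_src

  With any other back-end symbol, the first filter hands its input over
  unchanged and only the second one alters the string.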


* A (silly) Back-End: `summary'

  I want a back-end, `summary', which only exports headlines of the
  current buffer, in a markdown format. I would like to have the
  opportunity to add a few lines of text before the first headline.  It
  should also delimit beginning and end of output by ASCII scissors. Oh,
  and scissors string should be customizable!

  As I only want headlines, I only need to implement an
  `org-summary-headline' function. Though, text before the first
  headline in my buffer will be ignored (it isn't a headline).

  So this back-end will have to define its own buffer keyword:
  "#+PREAMBLE:". I need to be able to encounter this keyword more than
  once in the buffer, as my preamble will probably span more than one
  line. The following snippet will do this, and provide the text as the
  value of the `:preamble' property in the communication channel. It
  also creates a customizable `org-summary-scissors' variable, which is
  rightfully added to the `org-export-summary' group.

  #+begin_src emacs-lisp
  (defcustom org-summary-scissors "--%<--------------------------------\n"
    "String used as a delimiter for summary output.
  It should end with a newline character."
    :group 'org-export-summary
    :type 'string)
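  ;; Give the variable an initial value: `add-to-list' fails on
  ;; a variable that is still void.
  (defvar org-summary-option-alist nil
    "Buffer keywords and options specific to the summary back-end.")
  (add-to-list 'org-summary-option-alist
               '(:preamble "PREAMBLE" nil nil newline))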
  #+end_src
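
  To picture what I expect from that `newline' behavior, here is
  a small stand-alone sketch. `demo-collect-keyword' is a hypothetical
  function that works on a plain string. It assumes that repeated
  keyword values are simply joined with a newline character, which may
  not match the exporter's own code in every detail.

  #+begin_src emacs-lisp
  ;; Hypothetical helper, for illustration only.
  (defun demo-collect-keyword (keyword text)
    "Return values of every \"#+KEYWORD:\" line in TEXT, joined by newlines."
    (let ((case-fold-search t)          ; "#+Preamble:" matches "PREAMBLE"
          (regexp (format "^[ \t]*#\\+%s:[ \t]*\\(.*\\)$"
                          (regexp-quote keyword)))
          (start 0)
          values)
      (while (string-match regexp text start)
        (push (match-string 1 text) values)
        (setq start (match-end 0)))
      (mapconcat #'identity (nreverse values) "\n")))

  (demo-collect-keyword
   "PREAMBLE"
   "#+Title: My Summary Test\n#+Preamble: I hope\n#+Preamble: that it is working\n")
  #+end_src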

  Now onto the headline transcoder. A quick look at the
  `org-element-headline-parser' function tells me that the `:raw-value'
  property should be enough, as I need no fancy transformation. I might
  want to also use `:level' to get the number of "equal" signs before
  the text, but a longer look at the list of properties offered in the
  communication channel tells me that `org-export-get-relative-level'
  may be more adequate. So be it.

  #+begin_src emacs-lisp
  (defun org-summary-headline (headline contents info)
    "Return HEADLINE in a Markdown syntax.
  CONTENTS is the contents of the headline.  INFO is the property
  list used as a communication channel."
    (let ((title (org-element-get-property :raw-value headline))
          (pre-blank (org-element-get-property :pre-blank headline))
          (level (org-export-get-relative-level headline info))
          ;; Depth of 6 is a hard limit in HTML (and therefore Markdown)
          ;; syntax.
          (limit (min (plist-get info :headline-levels) 6)))
      (when (<= level limit)
        (concat (make-string level ?=) " " title
                (make-string (1+ pre-blank) ?\n)
                contents))))
  #+end_src

  This should be sufficient to take care of document's body. Now, I only
  need to add the scissors, the preamble, and the title in the final
  output. This all happens with the help of the `org-summary-template'
  function.

  I remember that "#+TITLE:" belongs to `org-element-parsed-keywords',
  which means that its value isn't a string but a secondary string. As
  I don't want to transcode it (my back-end only knows about headlines),
  I'll get the raw value back with the `org-element-interpret-secondary'
  function (if I had wanted to transcode it, I would have used
  `org-export-secondary-string' instead).

  #+begin_src emacs-lisp
  (defun org-summary-template (contents info)
    "Return complete document transcoded with summary back-end.
  CONTENTS is the body of the document.  INFO is the plist used as
  a communication channel."
    (let ((title (org-element-interpret-secondary (plist-get info :title)))
          (preamble (plist-get info :preamble)))
      (concat org-summary-scissors
              (upcase title) "\n\n"
              preamble "\n\n"
              contents
              org-summary-scissors)))
  #+end_src

  Now, I can test all of this with M-: (org-export-as 'summary) on
  a test buffer[1]. So far, so good. But I know better and define an
  interactive function for that important action. While I'm at it, it
  will display my summary in a buffer named "*Summary*".

  #+begin_src emacs-lisp
  (defun org-export-as-summary ()
    "Create the summary of the current Org buffer.
  Summary is displayed in a buffer called \"*Summary*\"."
    (interactive)
    (when (eq major-mode 'org-mode)
      (switch-to-buffer (org-export-to-buffer 'summary "*Summary*"))))
  #+end_src

  That's all, folks.


I'll try to package its first back-end, org-latex.el, into experimental/
before Monday.

Feedback, as always, is welcome.


Regards,

[1] For example, this one:
--8<---------------cut here---------------start------------->8---
#+Title: My Summary Test
#+Preamble: I hope
#+Preamble: that it is working
#+Options: H:2

Some text that will probably be ignored.

* Head 1

  some text

** Head 1.1

   some text too

*** Head 1.1.1

    Some text again

** Head 1.2

   some text

* Head 2                                                                :noexport:

  some text
--8<---------------cut here---------------end--------------->8---

-- 
Nicolas Goaziou
