From: Carsten Dominik
Subject: Re: [ANN] Org Export in contrib
Date: Sun, 27 Nov 2011 12:06:58 +0100
In-Reply-To: <87ipm8w1jz.fsf@gmail.com>
To: Nicolas Goaziou
Cc: Org Mode List

Hi everyone,

is there anyone who is planning to implement a texinfo exporter based on
org-elements?  If not, I would write this exporter...

- Carsten

On 25.11.2011, at 18:32, Nicolas Goaziou wrote:

> Hello,
>
> I've pushed org-export.el to contrib. It's a general export engine,
> built on top of org-elements, aimed at simplifying the life of both
> developers and maintainers (and, therefore, of end-users).
>
> It is meant to supersede org-exp.el and org-special-blocks.el. Keep in
> mind, though, that, as advanced as it is, it isn't yet a drop-in
> replacement for them. It still lacks an interface (à la `org-export'),
> back-ends, and tons of testing and improvements.
> That being said, it's usable anyway
> and one can already write back-ends for it. I'll show a silly example
> later in this mail.
>
> Now, let's have a peek into the guts of that beast.
>
> Besides the parser, the generic exporter is made of three distinct
> parts:
>
> - The communication channel consists of a property list, which is
>   created and updated during the process. Its purpose is to offer every
>   piece of information, be it export options or contextual data,
>   all in a single place. The exhaustive list of properties is given in
>   "The Communication Channel" section of the file.
>
> - The transcoder walks the parse tree, ignores some elements and
>   objects or treats them as plain text according to export options,
>   and eventually calls back-end specific functions to do the real
>   transcoding, concatenating their return values along the way.
>
> - The filter system is activated at the very beginning and the very end
>   of the export process, and each time an element or an object has been
>   converted. It is the entry point to fine-tune standard output from
>   back-end transcoders.
>
> The core function is `org-export-as'. It returns the transcoded buffer
> as a string.
>
> In order to derive an exporter out of this generic implementation, one
> can define a transcode function for each element or object. Such
> a function should return a string for the corresponding element, without
> any trailing space, or nil. It must accept three arguments:
>
> 1. the element or object itself,
> 2. its contents, or nil when it isn't recursive,
> 3. the property list used as a communication channel.
>
> If no such function is found, that element or object type will simply be
> ignored, along with any separating blank line. The same will happen if
> the function returns nil. If that function returns the empty
> string, the type will be ignored, but the blank lines will be kept.
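A minimal sketch of the contract just described, for illustration only:
it is not part of org-export.el, and the names `my-demo-paragraph' and
`:my-demo-skip-paragraphs' are made up for this example. It only shows
the three kinds of return value (nil, the empty string, a string
without trailing whitespace) and can be evaluated on its own:

```emacs-lisp
;; Hypothetical transcoder for a made-up `demo' back-end.  Nothing here
;; calls into org-export.el; only the three-argument contract and the
;; meaning of the return values are taken from the description above.
(defun my-demo-paragraph (paragraph contents info)
  "Transcode PARAGRAPH for the hypothetical `demo' back-end.
CONTENTS is the already transcoded contents, or nil.  INFO is the
property list used as a communication channel."
  (let ((trimmed (and contents
                      ;; Strip trailing whitespace, newline included.
                      (replace-regexp-in-string "[ \t\n]+\\'" "" contents))))
    (cond
     ;; nil: the paragraph is dropped, along with its blank lines.
     ;; `:my-demo-skip-paragraphs' is an invented property.
     ((plist-get info :my-demo-skip-paragraphs) nil)
     ;; Empty string: the paragraph is dropped, blank lines are kept.
     ((or (null trimmed) (string= trimmed "")) "")
     ;; Otherwise: a string without any trailing space.
     (t trimmed))))

;; Called by hand here; normally the transcoder would call it.
(my-demo-paragraph nil "Hello world\n" nil)
```

In a real back-end the exporter, not the user, would call such a
function while walking the parse tree.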
>
> Contents, when not nil, are stripped of any global indentation
> (although the relative one is preserved). They also always end with
> a single newline character.
>
> These functions must follow a strict naming convention:
> `org-BACKEND-TYPE' where, obviously, BACKEND is the name of the export
> back-end and TYPE the type of the element or object handled.
>
> Moreover, two additional functions can be defined. On the one hand,
> `org-BACKEND-template' returns the final transcoded string, and can be
> used to add a preamble and a postamble to the document's body. It must
> accept two arguments: the transcoded string and the property list
> containing export options. On the other hand, `org-BACKEND-plain-text',
> when defined, is called on every piece of text not recognized as an
> element or an object. It must accept two arguments: the text string and
> the information channel.
>
> Any back-end can define its own variables. Among them, the
> customizable ones should belong to the `org-export-BACKEND' group. Also,
> a special variable, `org-BACKEND-option-alist', allows defining buffer
> keywords and "#+options:" items specific to that back-end. See
> `org-export-option-alist' for supported defaults and syntax.
>
> Tools for common tasks across back-ends are implemented in the last
> part of the file.
>
>
> * For Maintainers
>
> To word it differently, this exporter doesn't rely on any
> text-property during the process. Thus, it makes
> `org-if-unprotected-at' and the like obsolete in the whole code base. Org
> core doesn't have to bother anymore about its exporter's weaknesses.
>
> Also, the buffer's pre-processing is reduced to its strict minimum: Babel
> code expansion. No footnote normalization, no list markers to add and
> remove...
>
> As the exporter is only a beefed-up parse tree reader, any element or
> object added to Elements is available through it with no further
> modification.
> Back-ends just have to create the appropriate new
> transcoders, unless that element or object should be ignored anyway.
>
>
> * For Developers
>
> All data needed is available in two places: the properties associated
> with the element being transcoded, through the use of
> `org-element-get-property', and the communication channel, with the
> help of `plist-get'. Period.
>
> Also, the exporter takes care of all the filtering required by
> options, and enforces the same number of blank lines in the Org buffer
> and in the source code (though this can be overcome with the use of
> filters). It's easier this way to concentrate on the shape of the
> output.
>
> Tools for common tasks (like building table of contents or listings,
> or numbering headlines) are provided in the library.
>
>
> * For Users
>
> Hooks are gone. Sorry. Most of them happened during a pre-process part
> that doesn't exist anymore.
>
> Though, there are three ways to configure the output, in increasing
> power:
>
> 1. Variables (customizable or not) are still there, provided by either
>    the exporter itself or its back-ends.
>
> 2. Filter sets are provided to fine-tune the output of any
>    back-end. A filter set is a list of functions, applied in a FIFO
>    order, whose arguments are the resulting string of the previous
>    function (or the back-end output for the first one) and the
>    back-end as a symbol. The return value of the last function
>    replaces the back-end's output. If one of the return values is nil,
>    the element or object on which the filters are applied is ignored in
>    the final output.
>
>    Also, three special filter sets apply on the parse tree, on plain
>    text, and on the final output.
>
>    For example, the LaTeX back-end has the bad habit of "boldifying"
>    deadline, scheduled and closed strings close to time-stamps in the
>    buffer. I'd rather have them emphasized. Obviously, I don't want to
>    annoy other back-ends with this.
>    The following will do the trick.
>
>    #+begin_src emacs-lisp
>    (add-to-list 'org-export-filter-time-stamp-functions
>                 (lambda (ts backend)
>                   (if (not (eq backend 'latex))
>                       ts
>                     (replace-regexp-in-string "textbf" "emph" ts))))
>    #+end_src
>
> 3. Whole parts of any back-end can be redefined (or advised). For
>    example, if I don't like the way the LaTeX back-end transcodes
>    verbatim text, I can always create an `org-latex-verbatim' function
>    of my own.
>
>
> * A (silly) Back-End: `summary'
>
> I want a back-end, `summary', which only exports headlines of the
> current buffer, in a markdown format. I would like to have the
> opportunity to add a few lines of text before the first headline. It
> should also delimit beginning and end of output by ASCII scissors. Oh,
> and the scissors string should be customizable!
>
> As I only want headlines, I only need to implement an
> `org-summary-headline' function. Though, text before the first
> headline in my buffer will be ignored (it isn't a headline).
>
> So this back-end will have to define its own buffer keyword:
> "#+PREAMBLE:". I need to be able to encounter this keyword more than
> once in the buffer as my preamble will probably span more than one
> line. The following snippet will do this, and provide the text as the
> value of the `:preamble' property in the communication channel. It
> also creates a customizable `org-summary-scissors' variable, which is
> rightfully added to the `org-export-summary' group.
>
> #+begin_src emacs-lisp
> (defcustom org-summary-scissors "--%<--------------------------------\n"
>   "String used as a delimiter for summary output.
> It should end with a newline character."
>   :group 'org-export-summary
>   :type 'string)
> ;; Initialize to nil: `add-to-list' signals an error on a void variable.
> (defvar org-summary-option-alist nil)
> (add-to-list 'org-summary-option-alist
>              '(:preamble "PREAMBLE" nil nil newline))
> #+end_src
>
> Now onto the headline transcoder.
> A quick look at the
> `org-element-headline-parser' function tells me that the `:raw-value'
> property should be enough, as I need no fancy transformation. I might
> want to also use `:level' to get the number of "#" signs before
> the text, but a longer look at the list of properties offered in the
> communication channel tells me that `org-export-get-relative-level'
> may be more adequate. So be it.
>
> #+begin_src emacs-lisp
> (defun org-summary-headline (headline contents info)
>   "Return HEADLINE in a Markdown syntax.
> CONTENTS is the contents of the headline. INFO is the property
> list used as a communication channel."
>   (let ((title (org-element-get-property :raw-value headline))
>         (pre-blank (org-element-get-property :pre-blank headline))
>         (level (org-export-get-relative-level headline info))
>         ;; Depth of 6 is a hard limit in HTML (and therefore Markdown)
>         ;; syntax.
>         (limit (min (plist-get info :headline-levels) 6)))
>     (when (<= level limit)
>       ;; Markdown (atx-style) headlines start with "#" characters.
>       (concat (make-string level ?#) " " title
>               (make-string (1+ pre-blank) ?\n)
>               contents))))
> #+end_src
>
> This should be sufficient to take care of the document's body. Now, I
> only need to add the scissors, the preamble, and the title in the final
> output. This all happens with the help of the `org-summary-template'
> function.
>
> I remember that "#+TITLE:" belongs to `org-element-parsed-keywords',
> which means that its value isn't a string but a secondary string. As
> I don't want to transcode it (my back-end only knows about headlines),
> I'll get the raw value back with the `org-element-interpret-secondary'
> function (if I had wanted to transcode it, I would have used
> `org-export-secondary-string' instead).
>
> #+begin_src emacs-lisp
> (defun org-summary-template (contents info)
>   "Return complete document transcoded with summary back-end.
> CONTENTS is the body of the document. INFO is the plist used as
> a communication channel."
>   (let ((title (org-element-interpret-secondary (plist-get info :title)))
>         (preamble (plist-get info :preamble)))
>     (concat org-summary-scissors
>             (upcase title) "\n\n"
>             preamble "\n\n"
>             contents
>             org-summary-scissors)))
> #+end_src
>
> Now, I can test all of this with M-: (org-export-as 'summary) on
> a test buffer[1]. So far, so good. But I know better and define an
> interactive function for that important action. While I'm at it, it
> will display my summary in a buffer named "*Summary*".
>
> #+begin_src emacs-lisp
> (defun org-export-as-summary ()
>   "Create the summary of the current Org buffer.
> Summary is displayed in a buffer called \"*Summary*\"."
>   (interactive)
>   (when (eq major-mode 'org-mode)
>     (switch-to-buffer (org-export-to-buffer 'summary "*Summary*"))))
> #+end_src
>
> That's all, folks.
>
>
> I'll try to package its first back-end, org-latex.el, into experimental/
> before Monday.
>
> Feedback, as always, is welcome.
>
>
> Regards,
>
> [1] For example, this one:
> --8<---------------cut here---------------start------------->8---
> #+Title: My Summary Test
> #+Preamble: I hope
> #+Preamble: that it is working
> #+Options: H:2
>
> Some text that will probably be ignored.
>
> * Head 1
>
> some text
>
> ** Head 1.1
>
> some text too
>
> *** Head 1.1.1
>
> Some text again
>
> ** Head 1.2
>
> some text
>
> * Head 2                                                      :noexport:
>
> some text
> --8<---------------cut here---------------end--------------->8---
>
> --
> Nicolas Goaziou
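Regarding the filter sets described under "For Users" above, the way
a set is applied can be pictured with the following sketch. It only
assumes what the text says (functions called in list order with the
previous result and the back-end symbol, nil meaning "ignore").
`my-apply-filters' and `my-emphasize-time-stamp' are hypothetical names
written for this illustration; this is not the code org-export.el
actually uses.

```emacs-lisp
;; Hypothetical helper, for illustration only.
(defun my-apply-filters (filters value backend)
  "Apply each function in FILTERS to VALUE, in list order.
Each filter is called with the previous result and BACKEND, a
symbol.  Return the last result, or nil as soon as a filter
returns nil, meaning the element or object is to be ignored."
  (catch 'ignored
    (dolist (filter filters value)
      (setq value (funcall filter value backend))
      (unless value (throw 'ignored nil)))))

;; The time-stamp filter from the message, as a named function.
(defun my-emphasize-time-stamp (ts backend)
  "Replace \"textbf\" with \"emph\" in TS when BACKEND is `latex'."
  (if (not (eq backend 'latex))
      ts
    (replace-regexp-in-string "textbf" "emph" ts)))

;; Usage: the string is rewritten for `latex' and left alone otherwise.
(my-apply-filters (list #'my-emphasize-time-stamp)
                  "\\textbf{DEADLINE:}" 'latex)
(my-apply-filters (list #'my-emphasize-time-stamp)
                  "\\textbf{DEADLINE:}" 'ascii)
```

Stopping at the first nil is a choice made for this sketch; the text
only states that a nil return value causes the element or object to be
ignored.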