From: Nicolas Goaziou
Subject: [ANN] Org Export in contrib
Date: Fri, 25 Nov 2011 18:32:16 +0100
Message-ID: <87ipm8w1jz.fsf@gmail.com>
To: Org Mode List

Hello,

I've pushed org-export.el to contrib. It's a general export engine, built on top of org-elements, aiming at simplifying the life of both developers and maintainers (and, therefore, of end-users). It is meant to supersede org-exp.el and org-special-blocks.el.

Keep in mind, though, that, as advanced as it is, it isn't yet a drop-in replacement for them. It still lacks an interface (à la `org-export'), back-ends, and tons of testing and improvements.

That being said, it's usable anyway and one can already write back-ends for it. I'll show a silly example later in this mail.

Now, let's have a peek into the guts of that beast.

Besides the parser, the generic exporter is made of three distinct parts:

- The communication channel consists of a property list, which is created and updated during the process. Its purpose is to offer every piece of information, be it export options or contextual data, all in a single place. The exhaustive list of properties is given in "The Communication Channel" section of the file.

- The transcoder walks the parse tree, ignores some elements and objects or treats them as plain text according to export options, and eventually calls back-end specific functions to do the real transcoding, concatenating their return values along the way.

- The filter system is activated at the very beginning and the very end of the export process, and each time an element or an object has been converted. It is the entry point to fine-tune standard output from back-end transcoders.

The core function is `org-export-as'. It returns the transcoded buffer as a string.

In order to derive an exporter out of this generic implementation, one can define a transcode function for each element or object. Such a function should return a string for the corresponding element, without any trailing space, or nil. It must accept three arguments:

1. the element or object itself,
2. its contents, or nil when it isn't recursive,
3. the property list used as a communication channel.

If no such function is found, that element or object type will simply be ignored, along with any separating blank line. The same will happen if the function returns nil. If that function returns the empty string, the type will be ignored, but the blank lines will be kept.

Contents, when not nil, are stripped of any global indentation (although the relative one is preserved). They also always end with a single newline character.

These functions must follow a strict naming convention: `org-BACKEND-TYPE' where, obviously, BACKEND is the name of the export back-end and TYPE the type of the element or object handled.

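To make the naming convention concrete, here is a minimal sketch of how a transcoder could be looked up from a back-end and a type. Both `my-export-transcoder' and `my-export-transcode' are hypothetical helpers written for this mail, not functions from org-export.el; they only rely on standard Emacs Lisp (`intern-soft', `fboundp', `funcall').

#+begin_src emacs-lisp
;; Hypothetical helpers: they only illustrate the `org-BACKEND-TYPE'
;; naming convention and are not part of org-export.el.
(defun my-export-transcoder (backend type)
  "Return the function named org-BACKEND-TYPE, or nil if undefined.
BACKEND and TYPE are symbols."
  (let ((fun (intern-soft (format "org-%s-%s" backend type))))
    (and fun (fboundp fun) fun)))

(defun my-export-transcode (backend type element contents info)
  "Call the org-BACKEND-TYPE transcoder on ELEMENT, CONTENTS and INFO.
Return nil when no such function exists, which means that TYPE is
ignored for BACKEND."
  (let ((fun (my-export-transcoder backend type)))
    (when fun (funcall fun element contents info))))
#+end_src

With this sketch, a type without a matching function yields nil, which matches the "ignored" case described above.
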
Moreover, two additional functions can be defined. On the one hand, `org-BACKEND-template' returns the final transcoded string, and can be used to add a preamble and a postamble to the document's body. It must accept two arguments: the transcoded string and the property list containing export options. On the other hand, `org-BACKEND-plain-text', when defined, is called on every piece of text not recognized as an element or an object. It must accept two arguments: the text string and the information channel.

Any back-end can define its own variables. Among them, the customizable ones should belong to the `org-export-BACKEND' group. Also, a special variable, `org-BACKEND-option-alist', allows defining buffer keywords and "#+options:" items specific to that back-end. See `org-export-option-alist' for supported defaults and syntax.

Tools for common tasks across back-ends are implemented in the last part of the file.

* For Maintainers

To word it differently, this exporter doesn't rely on any text property during the process. Thus, it makes `org-if-unprotected-at' and the like obsolete in the whole code base. Org core doesn't have to bother anymore about its exporter's weaknesses.

Also, the buffer's pre-processing is reduced to its strict minimum: Babel code expansion. No footnote normalization, no list markers to add and remove...

Being only a beefed-up parse tree reader, the exporter makes any element or object added to Elements available with no further modification. Back-ends just have to create the appropriate new transcoders, unless that element or object should be ignored anyway.

* For Developers

All data needed is available in two places: the properties associated with the element being transcoded, through the use of `org-element-get-property', and the communication channel, with the help of `plist-get'. Period.

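As a small illustration of the channel side, here is a sketch of reading and updating a property list by hand. The function `my-channel-demo' and the property values are made up for this example; the real channel is built by the exporter. Only `plist-get' and `plist-put' are standard Emacs Lisp.

#+begin_src emacs-lisp
;; Hypothetical demo: the channel below is hand-made, with invented
;; values, only to show how properties are read and updated.
(defun my-channel-demo ()
  "Return values read from a hand-made communication channel."
  (let ((info (list :headline-levels 3 :preamble "I hope")))
    ;; `plist-put' returns the updated list, so keep its result.
    (setq info (plist-put info :title "My Summary Test"))
    (list (plist-get info :headline-levels)
          (plist-get info :title)
          ;; A property that was never set simply yields nil.
          (plist-get info :author))))
#+end_src

A transcoder would use the same `plist-get' calls on the INFO argument it receives.
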
Also, the exporter takes care of all the filtering required by options, and enforces the same number of blank lines in the Org buffer and in the generated source code (though this can be overcome with the use of filters). It's easier this way to concentrate on the shape of the output.

Tools for common tasks (like building table of contents or listings, or numbering headlines) are provided in the library.

* For Users

Hooks are gone. Sorry. Most of them ran during a pre-processing step that doesn't exist anymore. There are, however, three ways to configure the output, in increasing order of power:

1. Variables (customizable or not) are still there, provided by either the exporter itself or its back-ends.

2. Filter sets are provided to fine-tune the output of any back-end. A filter set is a list of functions, applied in FIFO order. Each function takes two arguments: the resulting string of the previous function (or the back-end output for the first one) and the back-end as a symbol. The return value of the last function replaces the back-end's output. If one of the return values is nil, the element or object on which the filters are applied is ignored in the final output.

   Also, three special filter sets apply on the parse tree, on plain text, and on the final output.

   For example, the LaTeX back-end has the bad habit of "boldifying" deadline, scheduled and closed strings close to time-stamps in the buffer. I'd rather have them emphasized. Obviously, I don't want to annoy other back-ends with this. The following will do the trick.

   #+begin_src emacs-lisp
   (add-to-list 'org-export-filter-time-stamp-functions
                (lambda (ts backend)
                  ;; Leave the output of other back-ends untouched.
                  (if (not (eq backend 'latex)) ts
                    (replace-regexp-in-string "textbf" "emph" ts))))
   #+end_src

3. Whole parts of any back-end can be redefined (or advised). For example, if I don't like the way the LaTeX back-end transcodes verbatim text, I can always create an `org-latex-verbatim' function of my own.

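The way a filter set is applied (point 2 above) can be sketched as follows. `my-apply-filters' is a hypothetical helper, not the function used by org-export.el, and stopping at the first nil is my assumption about how "ignored" would be implemented.

#+begin_src emacs-lisp
;; Hypothetical helper, not the one in org-export.el.
(defun my-apply-filters (filters value backend)
  "Apply each function in FILTERS to VALUE, first to last.
Each function receives the previous result and BACKEND, a symbol.
Stop and return nil as soon as one filter returns nil (assumption)."
  (let ((result value))
    (while (and filters result)
      (setq result (funcall (car filters) result backend)
            filters (cdr filters)))
    result))

;; Usage: the first filter only touches LaTeX output, the second one
;; applies to every back-end.
(my-apply-filters
 (list (lambda (s backend)
         (if (eq backend 'latex)
             (replace-regexp-in-string "textbf" "emph" s)
           s))
       (lambda (s _backend) (concat s "\n")))
 "\\textbf{DEADLINE:}" 'latex)
#+end_src

Each filter sees the output of the previous one, so the order of the list matters.
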
* A (silly) Back-End: `summary'

I want a back-end, `summary', which only exports headlines of the current buffer, in a markdown format. I would like to have the opportunity to add a few lines of text before the first headline. It should also delimit beginning and end of output by ASCII scissors. Oh, and the scissors string should be customizable!

As I only want headlines, I only need to implement an `org-summary-headline' function.

However, text before the first headline in my buffer will be ignored (it isn't a headline). So this back-end will have to define its own buffer keyword: "#+PREAMBLE:". I need to be able to encounter this keyword more than once in the buffer, as my preamble will probably span more than one line.

The following snippet will do this, and provide the text as the value of the `:preamble' property in the communication channel. It also creates a customizable `org-summary-scissors' variable, which is rightfully added to the `org-export-summary' group.

#+begin_src emacs-lisp
(defcustom org-summary-scissors "--%<--------------------------------\n"
  "String used as a delimiter for summary output.
It should end with a newline character."
  :group 'org-export-summary
  :type 'string)

;; Give the variable an initial value: `add-to-list' signals an error
;; on a void variable.
(defvar org-summary-option-alist nil)
(add-to-list 'org-summary-option-alist
             '(:preamble "PREAMBLE" nil nil newline))
#+end_src

Now onto the headline transcoder. A quick look at the `org-element-headline-parser' function tells me that the `:raw-value' property should be enough, as I need no fancy transformation. I might also want to use `:level' to get the number of "equal" signs before the text, but a longer look at the list of properties offered in the communication channel tells me that `org-export-get-relative-level' may be more adequate. So be it.

#+begin_src emacs-lisp
(defun org-summary-headline (headline contents info)
  "Return HEADLINE in a Markdown syntax.
CONTENTS is the contents of the headline.  INFO is the property
list used as a communication channel."
  (let ((title (org-element-get-property :raw-value headline))
        (pre-blank (org-element-get-property :pre-blank headline))
        (level (org-export-get-relative-level headline info))
        ;; Depth of 6 is a hard limit in HTML (and therefore Markdown)
        ;; syntax.
        (limit (min (plist-get info :headline-levels) 6)))
    (when (<= level limit)
      (concat (make-string level ?=) " " title
              (make-string (1+ pre-blank) ?\n)
              contents))))
#+end_src

This should be sufficient to take care of the document's body. Now, I only need to add the scissors, the preamble, and the title in the final output. This all happens with the help of the `org-summary-template' function.

I remember that "#+TITLE:" belongs to `org-element-parsed-keywords', which means that its value isn't a string but a secondary string. As I don't want to transcode it (my back-end only knows about headlines), I'll get the raw value back with the `org-element-interpret-secondary' function (if I had wanted to transcode it, I would have used `org-export-secondary-string' instead).

#+begin_src emacs-lisp
(defun org-summary-template (contents info)
  "Return complete document transcoded with summary back-end.
CONTENTS is the body of the document.  INFO is the plist used as
a communication channel."
  (let ((title (org-element-interpret-secondary (plist-get info :title)))
        (preamble (plist-get info :preamble)))
    (concat org-summary-scissors
            (upcase title) "\n\n"
            preamble "\n\n"
            contents
            org-summary-scissors)))
#+end_src

Now, I can test all of this with M-: (org-export-as 'summary) on a test buffer[1].

So far, so good. But I know better and define an interactive function for that important action. While I'm at it, it will display my summary in a buffer named "*Summary*".

#+begin_src emacs-lisp
(defun org-export-as-summary ()
  "Create the summary of the current Org buffer.
Summary is displayed in a buffer called \"*Summary*\"."
  (interactive)
  (when (eq major-mode 'org-mode)
    (switch-to-buffer (org-export-to-buffer 'summary "*Summary*"))))
#+end_src

That's all, folks. I'll try to package its first back-end, org-latex.el, into experimental/ before Monday.

Feedback, as always, is welcome.

Regards,

[1] For example, this one:

--8<---------------cut here---------------start------------->8---
#+Title: My Summary Test
#+Preamble: I hope
#+Preamble: that it is working
#+Options: H:2

Some text that will probably be ignored.

* Head 1
some text

** Head 1.1
some text too

*** Head 1.1.1
Some text again

** Head 1.2
some text

* Head 2 :noexport:
some text
--8<---------------cut here---------------end--------------->8---

-- 
Nicolas Goaziou