From: tsd@tsdye.com (Thomas S. Dye)
Subject: Re: [ANN] Org Elements in contrib
Date: Tue, 22 Nov 2011 06:00:49 -1000
In-Reply-To: <87ty5xxqbu.fsf@gmail.com> (Nicolas Goaziou's message of "Mon, 21 Nov 2011 19:50:29 +0100")
To: Nicolas Goaziou
Cc: Org Mode List

Aloha Nicolas,

This looks brilliant. The interactive functions seem to know their way
around my various org-mode files.

Looking forward to the generic exporter and the LaTeX back-end.

All the best,
Tom

Nicolas Goaziou writes:

> Hello,
>
>
> I've added org-element.el to the contrib directory. It is a complete
> parser and interpreter for Org syntax.
>
> While it was written to be extensible, it is also an attempt to
> normalize current syntax and provide guidance for its evolution.
>
> Org syntax can be divided into three categories: "Greater elements",
> "Elements" and "Objects".
>
> An object can be defined anywhere on a line. It may span more than
> one line but never contains a blank one.
> Objects belong to the following
> types: `emphasis', `entity', `export-snippet', `footnote-reference',
> `inline-babel-call', `inline-src-block', `latex-fragment', `line-break',
> `link', `macro', `radio-target', `statistics-cookie', `subscript',
> `superscript', `target', `time-stamp' and `verbatim'.
>
> An element always starts and ends at the beginning of a line. The only
> element type containing objects is called a `paragraph'. Other types
> are: `comment', `comment-block', `example-block', `export-block',
> `fixed-width', `horizontal-rule', `keyword', `latex-environment',
> `babel-call', `property-drawer', `quote-section', `src-block', `table'
> and `verse-block'.
>
> Elements containing paragraphs are called greater elements. The types
> concerned are: `center-block', `drawer', `dynamic-block',
> `footnote-definition', `headline', `inlinetask', `item', `plain-list',
> `quote-block' and `special-block'.
>
> Greater elements (except `headline' and `item' types) and elements
> (except `keyword', `babel-call', and `property-drawer' types) can have
> a fixed set of keywords as attributes. Those are called "affiliated
> keywords", to distinguish them from other keywords, which are
> full-fledged elements. In particular, the "name" affiliated keyword
> makes it possible to label almost any element in an Org buffer.
>
> Notwithstanding affiliated keywords, each greater element, element and
> object has a fixed set of properties attached to it. Among them, three
> are shared by all types: `:begin' and `:end', which refer to the
> beginning and ending buffer positions of the considered element or
> object, and `:post-blank', which holds the number of blank lines, or
> white spaces, at its end.
>
> Some elements also have special properties whose value can hold objects
> themselves (e.g. an item tag, a headline name, a table cell). Such
> values are called "secondary strings".
>
> Lisp-wise, an element or an object can be represented as a list.
> It follows the pattern (TYPE PROPERTIES CONTENTS), where:
>
>   TYPE is a symbol describing the Org element or object.
>
>   PROPERTIES is the property list attached to it. See the docstring
>   of the appropriate parsing function for an exhaustive list.
>
>   CONTENTS is a list of elements, objects or raw strings contained
>   in the current element or object, when applicable.
>
> An Org buffer is a nested list of such elements and objects, whose type
> is `org-data' and whose properties are nil.
>
> The first part of this file implements a parser and an interpreter for
> each type of Org syntax.
>
> The next two parts introduce two accessors and a function retrieving the
> smallest element containing point (respectively
> `org-element-get-property', `org-element-get-contents' and
> `org-element-at-point').
>
> The following part creates a fully recursive buffer parser. It also
> provides a tool to map a function to elements or objects matching some
> criteria in the parse tree. Functions of interest are
> `org-element-parse-buffer', `org-element-map' and, to a lesser extent,
> `org-element-parse-secondary-string'.
>
> The penultimate part is the cradle of an interpreter for the obtained
> parse tree: `org-element-interpret-data' (and its relative,
> `org-element-interpret-secondary').
>
> The library ends by furnishing a set of interactive tools for element
> navigation and manipulation.
>
> More specifically, that last part includes some tools like
> `org-element-forward', `org-element-backward',
> `org-element-drag-forward', `org-element-drag-backward',
> `org-element-mark-element', `org-element-up',
> `org-element-unindent-buffer'...
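
To make the (TYPE PROPERTIES CONTENTS) pattern concrete, here is a
minimal sketch in plain Emacs Lisp that does not need org-element.el at
all. The tree is written by hand and is not real parser output: the
buffer positions and the `:raw-value' and `:marker' properties are
assumptions made for illustration, and every `sketch-' name is
hypothetical. The library's own accessors are the
`org-element-get-property' and `org-element-get-contents' mentioned
above.

```elisp
;; Hand-written tree following the (TYPE PROPERTIES CONTENTS) pattern.
;; NOT real parser output: positions, :raw-value and :marker are made up.
(defvar sketch-tree
  '(org-data nil
    (headline (:begin 1 :end 34 :post-blank 0 :raw-value "Intro")
     (paragraph (:begin 9 :end 34 :post-blank 0)
      "Some "
      (emphasis (:begin 14 :end 21 :post-blank 1 :marker "*") "bold")
      "text.\n"))))

;; Hypothetical accessors mirroring the three slots of the pattern.
(defun sketch-type (node)
  "Return the TYPE symbol of NODE."
  (car node))

(defun sketch-property (property node)
  "Return the value of PROPERTY in the property list of NODE."
  (plist-get (nth 1 node) property))

(defun sketch-contents (node)
  "Return the CONTENTS of NODE: a list of nodes and raw strings."
  (nthcdr 2 node))

(defun sketch-collect-types (node)
  "Return the types found in NODE, depth first, skipping raw strings."
  (if (stringp node)
      nil
    (cons (sketch-type node)
          (apply #'append
                 (mapcar #'sketch-collect-types (sketch-contents node))))))

(sketch-collect-types sketch-tree)
;; => (org-data headline paragraph emphasis)

(sketch-property :raw-value (car (sketch-contents sketch-tree)))
;; => "Intro"
```

The walk treats strings as leaves and everything else as a node, which
is all the pattern requires. A mapping tool such as `org-element-map'
presumably does something similar with a type filter on top, but that
is a guess from the description, not a reading of its code.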
>
> For the impatient (well, not quite as you're still reading this), you
> can evaluate the following examples in an Org buffer:
>
>   (org-element-parse-buffer)
>   (org-element-parse-buffer 'headline)
>   (org-element-parse-buffer 'headline 'visible-only)
>
> Also, the following code will parse the buffer, interpret the parsed
> tree, and create a canonical copy of it (no indentation, lowercased
> blocks, standard keywords):
>
> #+begin_src emacs-lisp
> ;; Parse the current Org buffer and turn the tree back into text.
> (let ((out (org-element-interpret-data (org-element-parse-buffer))))
>   ;; Show the canonical copy in a scratch buffer.
>   (switch-to-buffer (get-buffer-create "*Bijectivep*"))
>   (erase-buffer)
>   (insert out)
>   (goto-char (point-min))
>   (org-mode))
> #+end_src
>
> Besides allowing keywords like "#+name:", "#+caption:" or
> "#+attr_latex:" to be added to almost any Org element, it also
> introduces two less noticeable changes:
>
>   1. "#+label:" keywords are deprecated in favor of "#+name:". For
>      now, though, "label" is still considered a synonym of "name".
>
>   2. Protected HTML snippets (like @<b>) are no longer supported, as
>      they were too specific.
>
>      Instead, a general mechanism to inline back-end specific commands
>      is created. Thus, the HTML back-end will see "<b>some text</b>"
>      while the LaTeX one will only see "some text" if the buffer
>      contains:
>
>        @html{<b>}some text@html{</b>}
>
>      The syntax is heavier, but a configurable variable allows
>      shortcuts to be defined, reducing it to, for example, @h{<b>}.
>      No shortcut is provided by default.
>
>      Also, the syntax is experimental, and may change if proven to be
>      inadequate.
>
>
> I will commit a generic exporter built on top of Elements, along with
> a LaTeX back-end, in a couple of days.
>
> Feedback is welcome.
>
>
> Regards,

-- 
Thomas S. Dye
http://www.tsdye.com