From: Nicolas Goaziou <n.goaziou@gmail.com>
To: Org Mode List <emacs-orgmode@gnu.org>
Subject: [ANN] Org Elements in contrib
Date: Mon, 21 Nov 2011 19:50:29 +0100
Message-ID: <87ty5xxqbu.fsf@gmail.com>

Hello,


I've added org-element.el to the contrib directory. It is a complete
parser and interpreter for Org syntax.

While it was written to be extensible, it is also an attempt to
normalize current syntax and provide guidance for its evolution.

Org syntax can be divided into three categories: "Greater elements",
"Elements" and "Objects".

An object can be defined anywhere on a line. It may span more than one
line but never contains a blank one. Objects belong to the following
types: `emphasis', `entity', `export-snippet', `footnote-reference',
`inline-babel-call', `inline-src-block', `latex-fragment', `line-break',
`link', `macro', `radio-target', `statistics-cookie', `subscript',
`superscript', `target', `time-stamp' and `verbatim'.

An element always starts and ends at the beginning of a line. The only
element type that contains objects is `paragraph'. Other types
are: `comment', `comment-block', `example-block', `export-block',
`fixed-width', `horizontal-rule', `keyword', `latex-environment',
`babel-call', `property-drawer', `quote-section', `src-block', `table'
and `verse-block'.

Elements containing paragraphs are called greater elements. Concerned
types are: `center-block', `drawer', `dynamic-block',
`footnote-definition', `headline', `inlinetask', `item', `plain-list',
`quote-block' and `special-block'.

Greater elements (except the `headline' and `item' types) and elements
(except the `keyword', `babel-call' and `property-drawer' types) can
have a fixed set of keywords as attributes. Those are called "affiliated
keywords", to distinguish them from other keywords, which are
full-fledged elements. In particular, the "name" affiliated keyword
makes it possible to label almost any element in an Org buffer.

Notwithstanding affiliated keywords, each greater element, element and
object has a fixed set of properties attached to it. Among them, three
are shared by all types: `:begin' and `:end', which refer to the
beginning and ending buffer positions of the considered element or
object, and `:post-blank', which holds the number of blank lines, or
white spaces, at its end.

Some elements also have special properties whose values can themselves
hold objects (e.g. an item tag, a headline name, a table cell). Such
values are called "secondary strings".

Lisp-wise, an element or an object can be represented as a list. It
follows the pattern (TYPE PROPERTIES CONTENTS), where: TYPE is a symbol
describing the Org element or object. PROPERTIES is the property list
attached to it. See the docstring of the appropriate parsing function
for an exhaustive list. CONTENTS is a list of elements, objects or raw
strings contained in the current element or object, when applicable.

An Org buffer is a nested list of such elements and objects, whose type
is `org-data' and whose properties are nil.
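
As a rough illustration of the (TYPE PROPERTIES CONTENTS) pattern, here
is a minimal sketch in plain Emacs Lisp. The tree is written by hand and
its buffer positions are invented. The `my-element-*' helpers are
hypothetical names used only for this example, not functions from
org-element.el, and the sketch assumes that CONTENTS occupies the rest of
the list after the property list.

#+begin_src emacs-lisp
;; A hand-written tree for the text "Some *bold* text".
;; Positions are illustrative only.
(defvar my-sample-tree
  '(org-data nil
    (paragraph (:begin 1 :end 18 :post-blank 0)
     "Some "
     (emphasis (:begin 6 :end 13 :post-blank 1) "bold")
     "text\n")))

(defun my-element-type (element)
  "Return the TYPE symbol of ELEMENT."
  (car element))

(defun my-element-property (property element)
  "Return the value of PROPERTY in ELEMENT's property list."
  (plist-get (nth 1 element) property))

(defun my-element-contents (element)
  "Return the CONTENTS of ELEMENT as a list (assumed layout)."
  (nthcdr 2 element))

;; The root has type `org-data' and no properties.
(my-element-type my-sample-tree)
;; The first child of the root is the paragraph.
(my-element-property :end (car (my-element-contents my-sample-tree)))
#+end_src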

The first part of this file implements a parser and an interpreter for
each type of Org syntax.

The next two parts introduce two accessors and a function retrieving the
smallest element containing point (respectively
`org-element-get-property', `org-element-get-contents' and
`org-element-at-point').

The following part creates a fully recursive buffer parser. It also
provides a tool to map a function to elements or objects matching some
criteria in the parse tree. Functions of interest are
`org-element-parse-buffer', `org-element-map' and, to a lesser extent,
`org-element-parse-secondary-string'.
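
For instance, the first two functions can be combined to count the
headlines in a piece of Org text. This is only a sketch: it assumes the
library can be loaded with (require 'org-element) and that
`org-element-map' takes the parse tree, a list of types and a function,
in that order, which may not hold for every version of the library. The
`my-count-headlines' name and the sample text are made up for the
example.

#+begin_src emacs-lisp
(require 'org-element)

(defun my-count-headlines (text)
  "Return the number of headlines found in the Org-formatted string TEXT."
  (with-temp-buffer
    (insert text)
    (org-mode)
    ;; Parse down to headline granularity only, then collect every
    ;; headline found in the resulting tree.
    (length (org-element-map (org-element-parse-buffer 'headline)
                '(headline)
              #'identity))))

(my-count-headlines "* One\n** Two\n* Three\n")
#+end_src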

The penultimate part is the cradle of an interpreter for the obtained
parse tree: `org-element-interpret-data' (and its relative,
`org-element-interpret-secondary').

The library ends by providing a set of interactive tools for navigating
and manipulating elements.

More specifically, that last part includes some tools like
`org-element-forward', `org-element-backward',
`org-element-drag-forward', `org-element-drag-backward',
`org-element-mark-element', `org-element-up',
`org-element-unindent-buffer'... 

For the impatient (well, not quite, as you're still reading this), you
can evaluate the following examples in an Org buffer:

                       (org-element-parse-buffer)
                  (org-element-parse-buffer 'headline)
           (org-element-parse-buffer 'headline 'visible-only)

Also, the following code will parse the buffer, interpret the parsed
tree, and create a canonical copy of it (no indentation, lowercased
blocks, standard keywords):

#+begin_src emacs-lisp
(let ((out (org-element-interpret-data (org-element-parse-buffer))))
  (switch-to-buffer (get-buffer-create "*Bijectivep*"))
  (erase-buffer)
  (insert out)
  (goto-char (point-min))
  (org-mode))
#+end_src

Besides allowing keywords like "#+name:", "#+caption:" or
"#+attr_latex:" to be added to almost any Org element, it also
introduces two less noticeable changes:

  1. "#+label:" keywords are deprecated in favor of "#+name:". For now,
     though, "label" is still treated as a synonym of "name".

  2. Protected HTML snippets (like @<b>) are no longer supported, as
     they were too specific.

     Instead, a general mechanism to inline back-end specific commands
     is created. Thus, the HTML back-end will see "<b>some text</b>"
     while the LaTeX one will only see "some text" if the buffer
     contains:

                     @html{<b>}some text@html{</b>}

     The syntax is heavier, but a configurable variable makes it
     possible to define shortcuts that reduce it to, for example,
     @h{<b>}. No shortcut is provided by default.

     Also, the syntax is experimental, and may change if it proves
     inadequate.
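
To make the intended behavior concrete, here is a toy sketch in Emacs
Lisp of what each back-end would see. It is not how the parser or the
upcoming exporter handles snippets: `my-snippet-view' and its regular
expression are hypothetical and only cover the simple @BACKEND{...} form
shown above, without nested braces.

#+begin_src emacs-lisp
(defun my-snippet-view (text backend)
  "Return TEXT as BACKEND would see it.
Snippets written as @NAME{VALUE} are replaced by VALUE when NAME
equals BACKEND, and dropped otherwise."
  (replace-regexp-in-string
   "@\\([a-z]+\\){\\([^}]*\\)}"
   (lambda (snippet)
     ;; Match data refer to SNIPPET, the text of the current match.
     (if (string= (match-string 1 snippet) backend)
         (match-string 2 snippet)
       ""))
   ;; FIXEDCASE and LITERAL are both t: insert the value untouched.
   text t t))

(my-snippet-view "@html{<b>}some text@html{</b>}" "html")
;; => "<b>some text</b>"
(my-snippet-view "@html{<b>}some text@html{</b>}" "latex")
;; => "some text"
#+end_src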


I will commit a generic exporter built on top of Elements, along with
a LaTeX back-end, in a couple of days.

Feedback is welcome.


Regards,

-- 
Nicolas Goaziou
