emacs-orgmode@gnu.org archives
* [RFC] Org syntax (draft)
@ 2013-03-07 20:37 Nicolas Goaziou
  2013-03-07 20:47 ` Carsten Dominik
                   ` (6 more replies)
  0 siblings, 7 replies; 37+ messages in thread
From: Nicolas Goaziou @ 2013-03-07 20:37 UTC (permalink / raw)
  To: Org Mode List

Hello,

As discussed a few days ago, here is a document describing the complete
Org syntax as read by the parser. I also added some comments. I am going
to put the Org file on Worg, so anyone can update it and fix mistakes.

                          ━━━━━━━━━━━━━━━━━━━━
                           ORG SYNTAX (DRAFT)
                          ━━━━━━━━━━━━━━━━━━━━


Table of Contents
─────────────────

1 Headlines and Sections
2 Affiliated Keywords
3 Greater Elements
.. 3.1 Greater Blocks
.. 3.2 Drawers and Property Drawers
.. 3.3 Dynamic Blocks
.. 3.4 Footnote Definitions
.. 3.5 Inlinetasks
.. 3.6 Plain Lists and Items
.. 3.7 Tables
4 Elements
.. 4.1 Babel Call
.. 4.2 Blocks
.. 4.3 Clock, Diary Sexp and Planning
.. 4.4 Comments
.. 4.5 Fixed Width Areas
.. 4.6 Horizontal Rules
.. 4.7 Keywords
.. 4.8 LaTeX Environments
.. 4.9 Node Properties
.. 4.10 Paragraphs
.. 4.11 Table Rows
5 Objects
.. 5.1 Entities and LaTeX Fragments
.. 5.2 Export Snippets
.. 5.3 Footnote References
.. 5.4 Inline Babel Calls and Source Blocks
.. 5.5 Line Breaks
.. 5.6 Links
.. 5.7 Macros
.. 5.8 Targets and Radio Targets
.. 5.9 Statistics Cookies
.. 5.10 Subscript and Superscript
.. 5.11 Table Cells
.. 5.12 Timestamps
.. 5.13 Text Markup


This document describes Org syntax as it is currently read by its
parser (Org Elements) and, therefore, by the export framework.  It also
includes a few comments on that syntax.

A core concept in this syntax is that only headlines and sections are
context-free[1][2].  Every other syntactical part only exists within
specific environments.

Three categories are used to classify these environments: “Greater
elements”, “elements”, and “objects”, from the broadest scope to the
narrowest.

The paragraph is the unit of measurement.  An element defines
syntactical parts that are at the same level as a paragraph, i.e. which
cannot contain or be included in a paragraph.  An object is a part that
could be included in an element.  Greater elements are all parts that
can contain an element.

Empty lines belong to the largest element ending before them.  For
example, in a list, empty lines between items are part of the item
before them, but empty lines at the end of a list belong to the plain
list element.

Unless specified otherwise, case is not significant.


1 Headlines and Sections
════════════════════════

  A headline is defined as:

  ╭────
  │ STARS KEYWORD PRIORITY TITLE TAGS
  ╰────

  STARS is a string starting at column 0 and containing at least one
  asterisk (and up to `org-inlinetask-min-level' if `org-inlinetask'
  library is loaded).  It’s the sole compulsory part of a headline.

  KEYWORD is a TODO keyword, which has to belong to the list defined in
  `org-todo-keywords'.  Case is significant.

  PRIORITY is a priority cookie, i.e. a single letter preceded by a hash
  sign # and enclosed within square brackets.  Case is significant.

  TITLE can be made of any character but a new line.  However, it is
  only matched after every other part has been matched.

  TAGS is made of words containing any alpha-numeric character,
  underscore, at sign, hash sign or percent sign, and separated with
  colons.

  Examples of valid headlines include:

  ╭────
  │ *
  │ 
  │ ** DONE
  │ 
  │ *** Some e-mail
  │ 
  │ **** TODO [#A] COMMENT Title :tag:a2%:
  ╰────
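
  As a rough illustration of the STARS KEYWORD PRIORITY TITLE TAGS
  pattern, here is a minimal Python sketch.  It is not the parser's
  implementation: `TODO_KEYWORDS' is a hypothetical stand-in for
  `org-todo-keywords', `parse_headline' is an invented helper, and
  inlinetask levels are ignored.

```python
import re

# Assumption: a fixed keyword list standing in for `org-todo-keywords'.
TODO_KEYWORDS = ("TODO", "DONE")

HEADLINE_RE = re.compile(
    r"^(?P<stars>\*+)"                                   # STARS, at column 0
    r"(?: +(?P<keyword>" + "|".join(map(re.escape, TODO_KEYWORDS)) + r")(?= |$))?"
    r"(?: +\[#(?P<priority>[A-Za-z])\])?"                # PRIORITY cookie
    r"(?: +(?P<title>.*?))?"                             # TITLE, matched last
    r"(?:[ \t]+(?P<tags>:[\w@#%:]+:))?"                  # TAGS
    r"[ \t]*$"
)

def parse_headline(line):
    """Return the parts of a headline as a dict, or None if LINE is not one."""
    m = HEADLINE_RE.match(line)
    if m is None:
        return None
    tags = m.group("tags")
    return {
        "level": len(m.group("stars")),
        "keyword": m.group("keyword"),
        "priority": m.group("priority"),
        "title": m.group("title") or "",
        "tags": tags.strip(":").split(":") if tags else [],
    }
```

  The title group is lazy so that the optional tags at the end of the
  line are split off first, which mirrors the remark that TITLE matches
  after every other part.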

  If the first word appearing in the title is `org-comment-keyword', the
  headline will be considered as “commented”.  If that first word is
  `org-quote-string', it will be considered as “quoted”.  In both
  situations, case is significant.

  If its title is `org-footnote-section', it will be considered as
  a “footnote section”.  Case is significant.

  If `org-archive-tag' is one of its tags, it will be considered as
  “archived”.  Case is significant.

  A headline contains directly at most one section, followed by any
  number of headlines.  Only a section can contain another section.

  A section contains directly any greater element or element.  Only
  a headline can contain a section.  As an exception, text before the
  first headline in the document also belongs to a section.

  If a quoted headline contains a section, the latter will be considered
  as a “quote section”.

  As an example, consider the following document:

  ╭────
  │ An introduction.
  │ 
  │ * A Headline 
  │ 
  │   Some text.
  │ 
  │ ** Sub-Topic 1
  │ 
  │ ** Sub-Topic 2
  │ 
  │ *** Additional entry 
  │ 
  │ ** QUOTE Another Sub-Topic
  │ 
  │    Some other text.
  ╰────

  Its internal structure could be summarized as:

  ╭────
  │ (document
  │  (section)
  │  (headline
  │   (section)
  │   (headline)
  │   (headline
  │    (headline))
  │   (headline
  │    (quote-section))))
  ╰────
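
  The nesting above can be reproduced with a small stack-based pass
  over the lines.  The following Python sketch is only an illustration
  under simplifying assumptions: `outline' is an invented helper,
  "QUOTE" is assumed to be the value of `org-quote-string', TODO
  keywords and inlinetasks are ignored, and sections are recorded
  without their contents.

```python
import re

HEADLINE_RE = re.compile(r"(\*+)(?: +(.*))?$")

def outline(text, quote_string="QUOTE"):
    """Return nested lists mirroring the headline/section structure."""
    root = ["document"]
    # Stack of (level, node, section name), from the document down to
    # the most recent headline.
    stack = [(0, root, "section")]
    for line in text.splitlines():
        m = HEADLINE_RE.match(line)
        if m:
            level = len(m.group(1))
            # Close every headline that is not a parent of the new one.
            while stack[-1][0] >= level:
                stack.pop()
            words = (m.group(2) or "").split()
            quoted = bool(words) and words[0] == quote_string
            node = ["headline"]
            stack[-1][1].append(node)
            stack.append((level, node, "quote-section" if quoted else "section"))
        elif line.strip():
            _, node, section_name = stack[-1]
            # At most one section per headline, before any sub-headline.
            if len(node) == 1:
                node.append([section_name])
    return root
```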


2 Affiliated Keywords
═════════════════════

  With the exception of [inlinetasks], [items], [planning], [clocks],
  [node properties] and [table rows], every other element type can be
  assigned attributes.

  This is done by adding specific keywords, named “affiliated keywords”,
  just above the element considered, no blank line allowed.

  Affiliated keywords are built upon one of the following patterns:
  “#+KEY: VALUE”, “#+KEY[OPTIONAL]: VALUE” or “#+ATTR_BACKEND: VALUE”.

  KEY is either “CAPTION”, “HEADER”, “NAME”, “PLOT” or “RESULTS” string.

  BACKEND is a string constituted of alpha-numeric characters, hyphens
  or underscores.

  OPTIONAL and VALUE can contain any character but a new line.  Only
  keywords in `org-element-dual-keywords' can have an optional value.

  An affiliated keyword can appear on multiple lines if KEY belongs to
  `org-element-multiple-keywords' or if its pattern is “#+ATTR_BACKEND:
  VALUE”.

  Affiliated keywords whose KEY belong to `org-element-parsed-keywords'
  can contain objects in their value and their optional value, if
  applicable.
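
  The three patterns can be sketched as a single regular expression.
  This Python snippet is an illustration only: `parse_affiliated' is an
  invented helper, leading indentation is accepted as an assumption the
  text does not state, OPTIONAL is simplified to exclude `]', and the
  `org-element-dual-keywords' restriction is not checked.

```python
import re

AFFILIATED_RE = re.compile(
    r"^[ \t]*#\+(?:"
    r"(?P<key>CAPTION|HEADER|NAME|PLOT|RESULTS)"
    r"(?:\[(?P<optional>[^\]\n]*)\])?"        # simplified OPTIONAL
    r"|ATTR_(?P<backend>[A-Za-z0-9_-]+)"
    r"):[ \t]*(?P<value>.*)$",
    re.IGNORECASE,                             # case is not significant
)

def parse_affiliated(line):
    """Return key, optional value and value, or None if LINE does not match."""
    m = AFFILIATED_RE.match(line)
    if m is None:
        return None
    if m.group("backend") is not None:
        key = "ATTR_" + m.group("backend").upper()
    else:
        key = m.group("key").upper()
    return {"key": key, "optional": m.group("optional"), "value": m.group("value")}
```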


  [inlinetasks] See section 3.5

  [items] See section 3.6

  [planning] See section 4.3

  [clocks] See section 4.3

  [node properties] See section 4.9

  [table rows] See section 4.11


3 Greater Elements
══════════════════

  Unless specified otherwise, greater elements can contain directly any
  other element or greater element except:

  • elements of their own type,
  • [node properties], which can only be found in [property drawers],
  • [items], which can only be found in [plain lists].


  [node properties] See section 4.9

  [property drawers] See section 3.2

  [items] See section 3.6

  [plain lists] See section 3.6


3.1 Greater Blocks
──────────────────

  Greater blocks consist of the following pattern:

  ╭────
  │ #+BEGIN_NAME PARAMETERS
  │ CONTENTS
  │ #+END_NAME
  ╰────

  NAME can contain any non-whitespace character.

  PARAMETERS can contain any character, and can be omitted.

  If NAME is “CENTER”, it will be a “center block”.  If it is “QUOTE”,
  it will be a “quote block”.

  If the block is neither a center block, a quote block nor a [block
  element], it will be a “special block”.

  CONTENTS can contain any element except another greater block of the
  same type.


  [block element] See section 4.2


3.2 Drawers and Property Drawers
────────────────────────────────

  Pattern for drawers is:

  ╭────
  │ :NAME:
  │ CONTENTS
  │ :END:
  ╰────

  NAME has to either be “PROPERTIES” or belong to `org-drawers' list.

  If NAME is “PROPERTIES”, the drawer will become a “property drawer”.

  In a property drawer, CONTENTS can only contain [node property]
  elements.  Otherwise it can contain any element but another drawer or
  property drawer.

                                  ―――――

  It would be nice if users didn’t have to register drawer names in
  `org-drawers' (or through the `#+DRAWERS:' keyword) before using them.
  Anything starting with `^[ \t]*:\w+:[ \t]*$' and ending with
  `^[ \t]*:END:[ \t]*$' could be considered as a drawer.  — ngz


  [node property] See section 4.9


3.3 Dynamic Blocks
──────────────────

  Pattern for dynamic blocks is:

  ╭────
  │ #+BEGIN: NAME PARAMETERS
  │ CONTENTS
  │ #+END:
  ╰────

  NAME cannot contain any whitespace character.

  PARAMETERS can contain any character and can be omitted.


3.4 Footnote Definitions
────────────────────────

  Pattern for footnote definitions is:

  ╭────
  │ [LABEL] CONTENTS
  ╰────

  It must start at column 0.

  LABEL is either a number or follows the pattern “fn:WORD”, where WORD
  can contain any word-constituent character, hyphens and underscore
  characters.

  CONTENTS can contain any element except another footnote definition.
  It ends at the next footnote definition, the next headline, two
  consecutive empty lines or the end of buffer.


3.5 Inlinetasks
───────────────

  Inlinetasks are defined by `org-inlinetask-min-level' contiguous
  asterisk characters starting at column 0, followed by a whitespace
  character.

  Optionally, inlinetasks can be ended with a string constituted of
  `org-inlinetask-min-level' contiguous asterisk characters starting at
  column 0, followed by a space and the “END” string.

  Inlinetasks are recognized only after `org-inlinetask' library is
  loaded.


3.6 Plain Lists and Items
─────────────────────────

  Items are defined by a line starting with the following pattern:
  “BULLET COUNTER-SET CHECK-BOX TAG”, in which only BULLET is mandatory.

  BULLET is either an asterisk, a hyphen, a plus sign character or
  follows either the pattern “COUNTER.” or “COUNTER)”.  In any case,
  BULLET is followed by a whitespace character or line ending.

  COUNTER can be a number or a single letter.

  COUNTER-SET follows the pattern [@COUNTER].

  CHECK-BOX is either a single whitespace character, a “X” character or
  a hyphen, enclosed within square brackets.

  TAG follows “TAG-TEXT ::” pattern, where TAG-TEXT can contain any
  character but a new line.

  An item ends before the next item, the first line indented less than
  or equally to its starting line, or two consecutive empty lines.
  Indentation of lines within other greater elements does not count,
  nor do inlinetask boundaries.

  A plain list is a set of consecutive items of the same indentation.
  It can only directly contain items.

  If the first item in a plain list has a counter in its bullet, the
  plain list will be an “ordered plain-list”.  If it contains a tag, it
  will be a “descriptive list”.  Otherwise, it will be an “unordered list”.
  List types are mutually exclusive.

  For example, consider the following excerpt of an Org document:

  ╭────
  │ 1. item 1
  │ 2. [X] item 2
  │    - some tag :: item 2.1
  ╰────

  Its internal structure is as follows:

  ╭────
  │ (ordered-plain-list
  │  (item)
  │  (item
  │   (descriptive-plain-list
  │    (item))))
  ╰────
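
  The item pattern and the rule deciding the list type can be sketched
  in Python as follows.  This is an illustration rather than the
  parser's code: `parse_item' and `list_type' are invented helpers, and
  rejecting an asterisk bullet at column 0 is an inference from
  section 1, where such a line is a headline.

```python
import re

ITEM_RE = re.compile(
    r"(?P<indent>[ \t]*)"
    r"(?P<bullet>[-+*]|(?:\d+|[A-Za-z])[.)])"       # BULLET
    r"(?:[ \t]+|$)"                                  # whitespace or line ending
    r"(?:\[@(?P<counter_set>\d+|[A-Za-z])\][ \t]*)?" # COUNTER-SET
    r"(?:\[(?P<checkbox>[ X-])\][ \t]*)?"            # CHECK-BOX
    r"(?:(?P<tag>.*?)[ \t]+::(?:[ \t]+|$))?"         # TAG
)

def parse_item(line):
    """Return the item parts as a dict, or None if LINE does not start an item."""
    m = ITEM_RE.match(line)
    if m is None:
        return None
    # Inference from section 1: stars at column 0 define a headline.
    if m.group("bullet") == "*" and not m.group("indent"):
        return None
    return m.groupdict()

def list_type(first_item):
    """Classify a plain list from its first item."""
    if first_item["bullet"][-1] in ".)":
        return "ordered"
    if first_item["tag"] is not None:
        return "descriptive"
    return "unordered"
```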


3.7 Tables
──────────

  Tables start at lines beginning with either a vertical bar or the “+-”
  string followed by plus or minus signs only, assuming they are not
  preceded with lines of the same type.  These lines can be indented.

  A table starting with a vertical bar has “org” type.  Otherwise it has
  “table.el” type.

  Org tables end at the first line not starting with a vertical bar.
  Table.el tables end at the first line not starting with either
  a vertical line or a plus sign.  Such lines can be indented.

  An org table can only contain table rows.  A table.el table does not
  contain anything.


4 Elements
══════════

  Elements cannot contain any other element.

  Only [keywords] whose name belongs to
  `org-element-document-properties', [verse blocks], [paragraphs] and
  [table rows] can contain objects.


  [keywords] See section 4.7

  [verse blocks] See section 4.2

  [paragraphs] See section 4.10

  [table rows] See section 4.11


4.1 Babel Call
──────────────

  Pattern for babel calls is:

  ╭────
  │ #+CALL: VALUE
  ╰────

  VALUE is optional.  It can contain any character but a new line.


4.2 Blocks
──────────

  Like [greater blocks], pattern for blocks is:

  ╭────
  │ #+BEGIN_NAME DATA
  │ CONTENTS
  │ #+END_NAME
  ╰────

  NAME cannot contain any whitespace character.

  If NAME is “COMMENT”, it will be a “comment block”.  If it is
  “EXAMPLE”, it will be an “example block”.  If it is “SRC”, it will be
  a “source block”.  If it is “VERSE”, it will be a “verse block”.

  If NAME is a string matching the name of any export back-end loaded,
  the block will be an “export block”.

  DATA can contain any character but a new line.  It can be omitted,
  unless the block is a “source block”.  In this case, it must follow
  the pattern “LANGUAGE SWITCHES ARGUMENTS”, where SWITCHES and
  ARGUMENTS are optional.

  LANGUAGE cannot contain any whitespace character.

  SWITCHES is made of any number of “SWITCH” patterns, separated by
  blank characters.

  A SWITCH pattern is either “-l "FORMAT"”, where FORMAT can contain any
  character but a double quote and a new line, “-S” or “+S”, where
  S stands for a single letter.

  ARGUMENTS can contain any character but a new line.

  CONTENTS can contain any character, including new lines.  However, it
  will only contain Org objects if the block is a verse block.
  Otherwise, contents will not be parsed.


  [greater blocks] See section 3.1


4.3 Clock, Diary Sexp and Planning
──────────────────────────────────

  A clock follows the pattern:

  ╭────
  │ CLOCK: TIMESTAMP DURATION
  ╰────

  Both TIMESTAMP and DURATION are optional.

  TIMESTAMP is a [timestamp] object.

  DURATION follows the pattern:

  ╭────
  │ => HH:MM
  ╰────

  HH is a number containing any number of digits.  MM is a two-digit
  number.
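
  For instance, the DURATION part could be checked and converted with
  the Python sketch below.  `parse_duration' is an invented helper, and
  accepting more than one blank after “=>” is an assumption.

```python
import re

DURATION_RE = re.compile(r"=>[ \t]+(?P<hh>\d+):(?P<mm>\d{2})")

def parse_duration(text):
    """Return the duration in minutes, or None if TEXT is not a DURATION."""
    m = DURATION_RE.fullmatch(text.strip())
    if m is None:
        return None
    return int(m.group("hh")) * 60 + int(m.group("mm"))
```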

  A diary sexp is a line starting at column 0 with the “%%(” string.  It
  can then contain any character besides a new line.

  A planning is a line filled with at most three INFO parts, where each
  INFO part follows the pattern:

  ╭────
  │ KEYWORD: TIMESTAMP
  ╰────

  KEYWORD is a string among `org-deadline-string',
  `org-scheduled-string' and `org-closed-string'.  TIMESTAMP is
  a [timestamp] object.

  Even though a planning element can exist anywhere in a section or
  a greater element, it will only affect the headline containing the
  section if it is put on the line following that headline.


  [timestamp] See section 5.12


4.4 Comments
────────────

  A “comment line” starts with a hash sign and a whitespace character
  or an end of line.

  Comments can contain any number of consecutive comment lines.


4.5 Fixed Width Areas
─────────────────────

  A “fixed-width line” starts with a colon character and a whitespace or
  an end of line.

  Fixed width areas can contain any number of consecutive fixed-width
  lines.


4.6 Horizontal Rules
────────────────────

  A horizontal rule is a line made of at least 5 consecutive hyphens.
  It can be indented.
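
  The three line-based definitions from sections 4.4 to 4.6 can be
  sketched together in Python.  `classify_line' is an invented helper;
  allowing indentation for comment and fixed-width lines, and trailing
  blanks after a rule, are assumptions that the text does not state.

```python
import re

def classify_line(line):
    """Return "horizontal-rule", "comment", "fixed-width" or None."""
    # At least 5 consecutive hyphens and nothing else on the line.
    if re.fullmatch(r"[ \t]*-{5,}[ \t]*", line):
        return "horizontal-rule"
    # Hash sign followed by a whitespace character or the end of line.
    if re.match(r"[ \t]*#(?:[ \t]|$)", line):
        return "comment"
    # Colon followed by a whitespace character or the end of line.
    if re.match(r"[ \t]*:(?:[ \t]|$)", line):
        return "fixed-width"
    return None
```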


4.7 Keywords
────────────

  Keywords follow the syntax:

  ╭────
  │ #+KEY: VALUE
  ╰────

  KEY can contain any non-whitespace character, but it cannot be equal
  to “CALL” or any affiliated keyword.

  VALUE can contain any character except a new line.

  If KEY belongs to `org-element-document-properties', VALUE can contain
  objects.


4.8 LaTeX Environments
──────────────────────

  Pattern for LaTeX environments is:

  ╭────
  │ \begin{NAME}
  │ CONTENTS
  │ \end{NAME}
  ╰────

  NAME is constituted of alpha-numeric characters and may end with an
  asterisk.

  CONTENTS can contain anything but the “\end{NAME}” string.


4.9 Node Properties
───────────────────

  Pattern for node properties is:

  ╭────
  │ :PROPERTY: VALUE
  ╰────

  PROPERTY can contain any non-whitespace character.  VALUE can contain
  any character but a new line.

  Node properties can only exist in [property drawers].


  [property drawers] See section 3.2


4.10 Paragraphs
───────────────

  Paragraphs are the default element, which means that any unrecognized
  context is a paragraph.

  Empty lines and other elements end paragraphs.

  Paragraphs can contain every type of object.


4.11 Table Rows
───────────────

  A table row is either constituted of a vertical bar and any number of
  [table cells] or a vertical bar followed by a hyphen.

  In the first case the table row has the “standard” type.  In the
  second case, it has the “rule” type.

  Table rows can only exist in [tables].


  [table cells] See section 5.11

  [tables] See section 3.7


5 Objects
═════════

  Objects can only be found in the following locations:

  • [affiliated keywords] defined in `org-element-parsed-keywords',
  • [document properties],
  • [headline] titles,
  • [inlinetask] titles,
  • [item] tags,
  • [paragraphs],
  • [table cells],
  • [table rows], which can only contain table cell objects,
  • [verse blocks].

  Most objects cannot contain objects.  Those which can will be
  specified.


  [affiliated keywords] See section 2

  [document properties] See section 4.7

  [headline] See section 1

  [inlinetask] See section 3.5

  [item] See section 3.6

  [paragraphs] See section 4.10

  [table cells] See section 5.11

  [table rows] See section 4.11

  [verse blocks] See section 4.2


5.1 Entities and LaTeX Fragments
────────────────────────────────

  An entity follows the pattern:

  ╭────
  │ \NAME POST
  ╰────

  where NAME has a valid association in either `org-entities' or
  `org-entities-user'.

  POST is the end of line, "{}" string, or a non-alphabetical character.
  It isn’t separated from NAME by a whitespace character.

  A LaTeX fragment can follow multiple patterns:

  ╭────
  │ \NAME POST
  │ \(CONTENTS\)
  │ \[CONTENTS\]
  │ $$CONTENTS$$
  │ PRE$CHAR$POST
  │ PRE$BORDER1 BODY BORDER2$
  ╰────

  NAME contains alphabetical characters only and must not have an
  association in either `org-entities' or `org-entities-user'.

  POST is the same as for entities.

  CONTENTS can contain any character but cannot contain “\)” in the
  second template or “\]” in the third one.

  PRE is either the beginning of line or a character different from `$'.

  CHAR is a non-whitespace character different from `.', `,', `?', `;',
  a single quote or a double quote.

  POST is any of `-', `.', `,', `?', `;', `:', a single quote, a double
  quote, a whitespace character and the end of line.

  BORDER1 is a non-whitespace character different from `.', `,', `;'
  and `$'.

  BODY can contain any character except `$', and may not span over
  more than 3 lines.

  BORDER2 is any non-whitespace character different from `,', `.' and
  `$'.

                                  ―――――

        It would introduce incompatibilities with previous Org
        versions, but support for “$…$” (and for symmetry,
        `$$...$$') constructs ought to be removed.

        They are slow to parse, fragile, redundant, imply false
        positives and do not look good in LaTeX output anyway.
        Even the LaTeX community suggests using `\(...\)' over
        `$...$'.  — ngz


5.2 Export Snippets
───────────────────

  Pattern for export snippets is:

  ╭────
  │ @@NAME:VALUE@@
  ╰────

  NAME can contain any alpha-numeric character and hyphens.

  VALUE can contain anything but “@@” string.
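
  A minimal Python sketch of this pattern, with an invented helper
  named `find_export_snippets':

```python
import re

SNIPPET_RE = re.compile(
    r"@@(?P<name>[A-Za-z0-9-]+):"      # NAME: alpha-numeric characters, hyphens
    r"(?P<value>(?:(?!@@).)*)@@",      # VALUE: anything but "@@"
    re.DOTALL,
)

def find_export_snippets(text):
    """Return a list of (NAME, VALUE) pairs found in TEXT."""
    return [(m.group("name"), m.group("value")) for m in SNIPPET_RE.finditer(text)]
```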


5.3 Footnote References
───────────────────────

  There are four patterns for footnote references:

  ╭────
  │ [MARK]
  │ [fn:LABEL]
  │ [fn:LABEL:DEFINITION]
  │ [fn::DEFINITION]
  ╰────

  MARK is a number.

  LABEL can contain any word constituent character, hyphens and
  underscores.

  DEFINITION can contain any character, although opening and closing
  square brackets must be balanced in it.  It can contain any object
  encountered in a paragraph, even other footnote references.

  If the reference follows the third pattern, it is called an “inline
  footnote”.  If it follows the fourth one, i.e. if LABEL is omitted, it
  is an “anonymous footnote”.
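
  Since DEFINITION requires balanced square brackets, a regular
  expression alone is not enough to find its end.  The Python sketch
  below scans for the matching bracket; `parse_footnote_reference' is
  an invented helper, and the “standard” label given to the first two
  patterns is a name chosen here for illustration.

```python
import re

# Either "[MARK]" or the "[fn:LABEL" prefix followed by ":" or "]".
HEAD_RE = re.compile(r"\[(?:(?P<mark>\d+)\]|fn:(?P<label>[\w-]*)(?P<sep>[:\]]))")

def parse_footnote_reference(text, pos=0):
    """Parse a footnote reference starting at POS, or return None."""
    m = HEAD_RE.match(text, pos)
    if m is None:
        return None
    if m.group("mark") is not None:
        return {"type": "standard", "label": m.group("mark"),
                "definition": None, "end": m.end()}
    label = m.group("label")
    if m.group("sep") == "]":
        if not label:
            return None                       # "[fn:]" matches no pattern
        return {"type": "standard", "label": label,
                "definition": None, "end": m.end()}
    # Third and fourth patterns: look for the bracket closing the reference.
    depth = 1
    for i in range(m.end(), len(text)):
        if text[i] == "[":
            depth += 1
        elif text[i] == "]":
            depth -= 1
            if depth == 0:
                return {"type": "inline" if label else "anonymous",
                        "label": label or None,
                        "definition": text[m.end():i], "end": i + 1}
    return None                               # unbalanced brackets
```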


5.4 Inline Babel Calls and Source Blocks
────────────────────────────────────────

  Inline Babel calls follow any of the following patterns:

  ╭────
  │ call_NAME(ARGUMENTS)
  │ call_NAME[HEADER](ARGUMENTS)[HEADER]
  ╰────

  NAME can contain any character besides `(', `)' and “\n”.

  HEADER can contain any character besides `]' and “\n”.

  ARGUMENTS can contain any character besides `)' and “\n”.

  Inline source blocks follow any of the following patterns:

  ╭────
  │ src_LANG{BODY}
  │ src_LANG[OPTIONS]{BODY}
  ╰────

  LANG can contain any non-whitespace character.

  OPTIONS and BODY can contain any character but “\n”.


5.5 Line Breaks
───────────────

  A line break consists of the “\\SPACE” pattern at the end of an
  otherwise non-empty line.

  SPACE can contain any number of tabs and spaces, including 0.


5.6 Links
─────────

  There are 4 major types of links:

  ╭────
  │ RADIO                     ("radio" link)
  │ <PROTOCOL:PATH>           ("angle" link)
  │ PRE PROTOCOL:PATH2 POST   ("plain" link)
  │ [[PATH3]DESCRIPTION]      ("regular" link)
  ╰────

  RADIO is a string matched by some [radio target].  It can contain
  [entities], [latex fragments], [subscript] and [superscript] only.

  PROTOCOL is a string among `org-link-types'.

  PATH can contain any character but `]', `<', `>' and `\n'.

  PRE and POST are non-word-constituent characters.  They can be,
  respectively, the beginning or the end of a line.

  PATH2 can contain any non-whitespace character except `(', `)', `<'
  and `>'.  It must end with a word-constituent character, or any
  non-whitespace non-punctuation character followed by `/'.

  DESCRIPTION must be enclosed within square brackets.  It can contain
  any character but square brackets.  Object-wise, it can contain any
  object found in a paragraph except a [footnote reference], a [radio
  target] and a [line break].  It cannot contain another link either,
  unless it is a plain link.

  DESCRIPTION is optional.

  PATH3 is built according to the following patterns:

  ╭────
  │ FILENAME           ("file" type)
  │ PROTOCOL:PATH4     ("PROTOCOL" type)
  │ id:ID              ("id" type)
  │ #CUSTOM-ID         ("custom-id" type)
  │ (CODEREF)          ("coderef" type)
  │ FUZZY              ("fuzzy" type)
  ╰────

  FILENAME is a file name, either absolute or relative.

  PATH4 can contain any character besides square brackets.

  ID is constituted of hexadecimal numbers separated with hyphens.

  CUSTOM-ID, CODEREF and FUZZY can contain any character besides square
  brackets.

                                  ―――――

        I suggest removing angle links.  If one needs spaces in
        PATH, she can use standard link syntax instead.

        I also suggest removing the `org-link-types' dependency in
        PROTOCOL and matching `[a-zA-Z]' instead, for portability.  —
        ngz


  [radio target] See section 5.8

  [entities] See section 5.1

  [latex fragments] See section 5.1

  [subscript] See section 5.10

  [superscript] See section 5.10

  [footnote reference] See section 5.3

  [line break] See section 5.5


5.7 Macros
──────────

  Macros follow the pattern:

  ╭────
  │ {{{NAME(ARGUMENTS)}}}
  ╰────

  NAME must start with a letter and can be followed by any number of
  alpha-numeric characters, hyphens and underscores.

  ARGUMENTS can contain anything but "}}}" string.  Values within
  ARGUMENTS are separated by commas.  Non-separating commas have to be
  escaped with a backslash character.
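
  The comma rule can be illustrated with the Python sketch below.
  `parse_macro' is an invented helper; the parenthesized part is
  required here because the pattern above shows it, and whitespace
  around values is kept untouched since the text says nothing about
  trimming.

```python
import re

MACRO_RE = re.compile(
    r"\{\{\{(?P<name>[A-Za-z][A-Za-z0-9_-]*)"
    r"\((?P<args>(?:(?!\}\}\}).)*)\)\}\}\}",   # ARGUMENTS: anything but "}}}"
    re.DOTALL,
)

def parse_macro(text):
    """Return (NAME, list of values), or None if TEXT is not a macro."""
    m = MACRO_RE.fullmatch(text)
    if m is None:
        return None
    args = m.group("args")
    if args == "":
        return (m.group("name"), [])
    # Split on commas that are not escaped by a backslash, then unescape.
    values = [v.replace("\\,", ",") for v in re.split(r"(?<!\\),", args)]
    return (m.group("name"), values)
```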


5.8 Targets and Radio Targets
─────────────────────────────

  Radio targets follow the pattern:

  ╭────
  │ <<<CONTENTS>>>
  ╰────

  CONTENTS can contain any character besides `<', `>' and “\n”.  As far as
  objects go, it can contain [entities], [latex fragments], [subscript]
  and [superscript] only.

  Targets follow the pattern:

  ╭────
  │ <<TARGET>>
  ╰────

  TARGET can contain any character besides `<', `>' and “\n”.  It cannot
  contain any object.


  [entities] See section 5.1

  [latex fragments] See section 5.1

  [subscript] See section 5.10

  [superscript] See section 5.10


5.9 Statistics Cookies
──────────────────────

  Statistics cookies follow either pattern:

  ╭────
  │ [PERCENT%]
  │ [NUM1/NUM2]
  ╰────

  PERCENT, NUM1 and NUM2 are numbers or the empty string.
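
  Both cookie patterns fit in one regular expression.  In this Python
  sketch, `parse_cookie' is an invented helper and empty numbers are
  reported as None.

```python
import re

COOKIE_RE = re.compile(r"\[(?:(?P<percent>\d*)%|(?P<num1>\d*)/(?P<num2>\d*))\]")

def parse_cookie(text):
    """Return ("percent", N) or ("fraction", N1, N2), or None."""
    m = COOKIE_RE.fullmatch(text)
    if m is None:
        return None

    def to_int(digits):
        return int(digits) if digits else None

    if m.group("percent") is not None:
        return ("percent", to_int(m.group("percent")))
    return ("fraction", to_int(m.group("num1")), to_int(m.group("num2")))
```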


5.10 Subscript and Superscript
──────────────────────────────

  Pattern for subscript is:

  ╭────
  │ CHAR_SCRIPT
  ╰────

  Pattern for superscript is:

  ╭────
  │ CHAR^SCRIPT
  ╰────

  CHAR is any non-whitespace character.

  SCRIPT can be `*', a string made of word-constituent characters
  possibly preceded by a plus or a minus sign, or an expression enclosed
  in parentheses (resp. curly brackets) containing balanced parentheses
  (resp. curly brackets).


5.11 Table Cells
────────────────

  Table cells follow the pattern:

  ╭────
  │ CONTENTS|
  ╰────

  CONTENTS can contain any character except a vertical bar.


5.12 Timestamps
───────────────

  There are seven possible patterns for timestamps:

  ╭────
  │ <%%(SEXP)>                                     (diary)
  │ <DATE TIME REPEATER>                         (active)
  │ [DATE TIME REPEATER]                         (inactive)
  │ <DATE TIME REPEATER>--<DATE TIME REPEATER>   (active range)
  │ <DATE TIME-TIME REPEATER>                    (active range)
  │ [DATE TIME REPEATER]--[DATE TIME REPEATER]   (inactive range)
  │ [DATE TIME-TIME REPEATER]                    (inactive range)
  ╰────

  SEXP can contain any character except `>' and `\n'.

  DATE follows the pattern:

  ╭────
  │ YYYY-MM-DD DAYNAME
  ╰────

  Y, M and D are digits.  DAYNAME can contain any non-whitespace
  character besides `+', `-', `]', `>', a digit or `\n'.

  TIME follows the pattern “H:MM”.  H can be one or two digits long and
  can start with 0.

  REPEATER follows the pattern:

  ╭────
  │ MARK VALUE UNIT
  ╰────

  MARK is `+' (cumulate type), `++' (catch-up type) or `.+' (restart
  type).

  VALUE is a number.

  UNIT is a character among `h' (hour), `d' (day), `w' (week), `m'
  (month), `y' (year).

  MARK, VALUE and UNIT are not separated by whitespace characters.
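
  As an illustration, the Python sketch below parses a single
  timestamp made of DATE, TIME and REPEATER parts.  It is not the
  parser's code: `parse_timestamp' is an invented helper, DAYNAME, TIME
  and REPEATER are all treated as optional (an assumption), and the
  diary and “--” range patterns are left out.

```python
import re

TIMESTAMP_RE = re.compile(
    r"(?P<open>[<\[])"
    r"(?P<date>\d{4}-\d{2}-\d{2})"                          # YYYY-MM-DD
    r"(?: +(?P<dayname>[^\s+\-\]>\d]+))?"                   # DAYNAME
    r"(?: +(?P<start>\d{1,2}:\d{2})(?:-(?P<end>\d{1,2}:\d{2}))?)?"  # TIME or TIME-TIME
    r"(?: +(?P<mark>\+\+|\.\+|\+)(?P<value>\d+)(?P<unit>[hdwmy]))?" # REPEATER
    r"(?P<close>[>\]])"
)

REPEATER_TYPES = {"+": "cumulate", "++": "catch-up", ".+": "restart"}

def parse_timestamp(text):
    """Return the timestamp parts as a dict, or None if TEXT does not match."""
    m = TIMESTAMP_RE.fullmatch(text)
    if m is None:
        return None
    pair = m.group("open") + m.group("close")
    if pair not in ("<>", "[]"):
        return None                       # mismatched delimiters
    kind = "active" if pair == "<>" else "inactive"
    if m.group("end"):
        kind += " range"                  # TIME-TIME form
    repeater = None
    if m.group("mark"):
        repeater = (REPEATER_TYPES[m.group("mark")],
                    int(m.group("value")), m.group("unit"))
    return {"kind": kind, "date": m.group("date"), "dayname": m.group("dayname"),
            "start": m.group("start"), "end": m.group("end"), "repeater": repeater}
```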


5.13 Text Markup
────────────────

  Text markup follows the pattern:

  ╭────
  │ PRE MARKER CONTENTS MARKER POST
  ╰────

  PRE is a whitespace character, `(', `{', a single quote or a double
  quote.  It can also be a beginning of line.

  MARKER is a character among `*' (bold), `=' (verbatim), `/' (italic),
  `+' (strike-through), `_' (underline), `~' (code).

  CONTENTS is a string following the pattern:

  ╭────
  │ BORDER BODY BORDER
  ╰────

  BORDER can be any non-whitespace character except `,', a single quote
  or a double quote.

  BODY can contain any character but may not span over more than
  3 lines.

  BORDER and BODY are not separated by whitespace characters.

  CONTENTS can contain any object encountered in a paragraph when markup
  is “bold”, “italic”, “strike-through” or “underline”.

  POST is a whitespace character, `-', `.', `,', `:', `!', `?', a single
  quote, `)', `}' or a double quote.  It can also be an end of line.

  PRE, MARKER, CONTENTS, MARKER and POST are not separated by whitespace
  characters.

                                  ―――――

        All of this is wrong if `org-emphasis-regexp-components'
        or `org-emphasis-alist' are modified.

        This should really be simplified and made persistent
        (i.e. no defcustom allowed).  Otherwise, portability and
        parsing are jokes.

        Also, CONTENTS should be anything within code and verbatim
        emphasis, by definition.  — ngz



Footnotes
─────────

[1] In particular, the parser requires stars at column 0 to be quoted
by a comma when they do not define a headline.

[2] It also means that only headlines and sections can be recognized
just by looking at the beginning of the line.

As a consequence, using `org-element-at-point' or
`org-element-context' will move up to the parent headline, and parse
top-down from there until context around is found.



Regards,

-- 
Nicolas Goaziou


* Re: [RFC] Org syntax (draft)
  2013-03-07 20:37 Nicolas Goaziou
@ 2013-03-07 20:47 ` Carsten Dominik
  2013-03-07 22:07 ` Achim Gratz
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 37+ messages in thread
From: Carsten Dominik @ 2013-03-07 20:47 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Org Mode List

woooooow, this is awesome Nicolas, thank you!

- Carsten

On 7.3.2013, at 21:37, Nicolas Goaziou <n.goaziou@gmail.com> wrote:

> Hello,
> 
> As discussed a few days ago, here is a document describing the complete
> Org syntax as read by the parser. I also added some comments. I am going
> to put the Org file on Worg, so anyone can update it and fix mistakes.
> 
>                          ━━━━━━━━━━━━━━━━━━━━
>                           ORG SYNTAX (DRAFT)
>                          ━━━━━━━━━━━━━━━━━━━━
> 
> 
> Table of Contents
> ─────────────────
> 
> 1 Headlines and Sections
> 2 Affiliated Keywords
> 3 Greater Elements
> .. 3.1 Greater Blocks
> .. 3.2 Drawers and Property Drawers
> .. 3.3 Dynamic Blocks
> .. 3.4 Footnote Definitions
> .. 3.5 Inlinetasks
> .. 3.6 Plain Lists and Items
> .. 3.7 Tables
> 4 Elements
> .. 4.1 Babel Call
> .. 4.2 Blocks
> .. 4.3 Clock, Diary Sexp and Planning
> .. 4.4 Comments
> .. 4.5 Fixed Width Areas
> .. 4.6 Horizontal Rules
> .. 4.7 Keywords
> .. 4.8 LaTeX Environments
> .. 4.9 Node Properties
> .. 4.10 Paragraphs
> .. 4.11 Table Rows
> 5 Objects
> .. 5.1 Entities and LaTeX Fragments
> .. 5.2 Export Snippets
> .. 5.3 Footnote References
> .. 5.4 Inline Babel Calls and Source Blocks
> .. 5.5 Line Breaks
> .. 5.6 Links
> .. 5.7 Macros
> .. 5.8 Targets and Radio Targets
> .. 5.9 Statistics Cookies
> .. 5.10 Subscript and Superscript
> .. 5.11 Table Cells
> .. 5.12 Timestamps
> .. 5.13 Text Markup
> 
> 
> This document describes and comments Org syntax as it is currently read
> by its parser (Org Elements) and, therefore, by the export framework.
> It also includes a few comments on that syntax.
> 
> A core concept in this syntax is that only headlines and sections are
> context-free[1][2].  Every other syntactical part only exists within
> specific environments.
> 
> Three categories are used to classify these environments: “Greater
> elements”, “elements”, and “objects”, from the broadest scope to the
> narrowest.
> 
> The paragraph is the unit of measurement.  An element defines
> syntactical parts that are at the same level as a paragraph, i.e. which
> cannot contain or be included in a paragraph.  An object is a part that
> could be included in an element.  Greater elements are all parts that
> can contain an element.
> 
> Empty lines belong to the largest element ending before them.  For
> example, in a list, empty lines between items belong are part of the
> item before them, but empty lines at the end of a list belong to the
> plain list element.
> 
> Unless specified otherwise, case is not significant.
> 
> 
> 1 Headlines and Sections
> ════════════════════════
> 
>  A headline is defined as:
> 
>  ╭────
>  │ STARS KEYWORD PRIORITY TITLE TAGS
>  ╰────
> 
>  STARS is a string starting at column 0 and containing at least one
>  asterisk (and up to `org-inlinetask-min-level' if `org-inlinetask'
>  library is loaded).  It’s the sole compulsory part of a headline.
> 
>  KEYWORD is a TODO keyword, which have to belong to the list defined in
>  `org-todo-keywords'.  Case is significant.
> 
>  PRIORITY is a priority cookie, i.e. a single letter preceded by a hash
>  sign # and enclosed within square brackets.  Case is significant.
> 
>  TITLE can be made of any character but a new line.  Though, it will
>  match after every other part have been matched.
> 
>  TAGS is made of words containing any alpha-numeric character,
>  underscore, at sign, hash sign or percent sign, and separated with
>  colons.
> 
>  Examples of valid headlines include:
> 
>  ╭────
>  │ *
>  │ 
>  │ ** DONE
>  │ 
>  │ *** Some e-mail
>  │ 
>  │ **** TODO [#A] COMMENT Title :tag:a2%:
>  ╰────
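
A rough Python sketch of the headline pattern above, peeling off STARS,
KEYWORD, PRIORITY and TAGS in turn and leaving the remainder as TITLE.
TODO_KEYWORDS is only a stand-in for `org-todo-keywords', and
parse_headline is a made-up name; this is not how the parser itself is
written.  Like the examples in the draft, it accepts a bare "*".

```python
import re

# Assumed stand-in for `org-todo-keywords'; real values are user-configurable.
TODO_KEYWORDS = ("TODO", "DONE")


def parse_headline(line):
    """Split a headline into its parts, or return None if it is not one."""
    m = re.match(r"(\*+)(?: +(.*))?$", line)
    if m is None:
        return None
    level = len(m.group(1))
    rest = (m.group(2) or "").strip()
    keyword = priority = None
    tags = []

    # KEYWORD: first word of the line, compared case-sensitively.
    first, _, tail = rest.partition(" ")
    if first in TODO_KEYWORDS:
        keyword, rest = first, tail.lstrip()

    # PRIORITY: a single letter preceded by "#", within square brackets.
    pm = re.match(r"\[#([A-Za-z])\](?: +|$)", rest)
    if pm:
        priority, rest = pm.group(1), rest[pm.end():]

    # TAGS: colon-separated words at the very end of the line.
    tm = re.search(r"(?:^|[ \t]+)(:(?:[\w@#%]+:)+)$", rest)
    if tm:
        tags = tm.group(1).strip(":").split(":")
        rest = rest[:tm.start()]

    # Whatever is left over is TITLE.
    return {"level": level, "keyword": keyword, "priority": priority,
            "title": rest.strip(), "tags": tags}
```

With these assumptions, the last example headline yields level 4,
keyword "TODO", priority "A", title "COMMENT Title" and tags
["tag", "a2%"].
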
> 
>  If the first word appearing in the title is `org-comment-string', the
>  headline will be considered as “commented”.  If that first word is
>  `org-quote-string', it will be considered as “quoted”.  In both
>  situations, case is significant.
> 
>  If its title is `org-footnote-section', it will be considered as
>  a “footnote section”.  Case is significant.
> 
>  If `org-archive-tag' is one of its tags, it will be considered as
>  “archived”.  Case is significant.
> 
>  A headline contains directly at most one section, followed by any
>  number of headlines.  Only a section can contain another section.
> 
>  A section contains directly any greater element or element.  Only
>  a headline can contain a section.  As an exception, text before the
>  first headline in the document also belongs to a section.
> 
>  If a quoted headline contains a section, the latter will be considered
>  as a “quote section”.
> 
>  As an example, consider the following document:
> 
>  ╭────
>  │ An introduction.
>  │ 
>  │ * A Headline 
>  │ 
>  │   Some text.
>  │ 
>  │ ** Sub-Topic 1
>  │ 
>  │ ** Sub-Topic 2
>  │ 
>  │ *** Additional entry 
>  │ 
>  │ ** QUOTE Another Sub-Topic
>  │ 
>  │    Some other text.
>  ╰────
> 
>  Its internal structure could be summarized as:
> 
>  ╭────
>  │ (document
>  │  (section)
>  │  (headline
>  │   (section)
>  │   (headline)
>  │   (headline
>  │    (headline))
>  │   (headline
>  │    (quote-section))))
>  ╰────
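
The nesting shown above can be reproduced with a small stack-based
walk.  This is only a Python sketch under simplifying assumptions:
nodes are plain lists, sections carry no contents, and any non-blank
line that is not a headline counts as section text.  The name outline
is made up.

```python
import re


def outline(text):
    """Return a nested-list summary of headlines and sections."""
    root = ["document"]
    # Each stack entry is (level, node, quoted); the root has level 0.
    stack = [(0, root, False)]
    for line in text.splitlines():
        m = re.match(r"(\*+) (.*)", line)
        if m:
            level = len(m.group(1))
            # Close every headline at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            node = ["headline"]
            stack[-1][1].append(node)
            quoted = m.group(2).split(" ")[0] == "QUOTE"
            stack.append((level, node, quoted))
        elif line.strip():
            _, node, quoted = stack[-1]
            kind = "quote-section" if quoted else "section"
            # A headline holds at most one section: add it only once.
            if node[-1] != [kind]:
                node.append([kind])
    return root
```

Fed the example document, it returns the same shape as the summary
above, with lists in place of the parenthesized forms.
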
> 
> 
> 2 Affiliated Keywords
> ═════════════════════
> 
>  With the exception of [inlinetasks], [items], [planning], [clocks],
>  [node properties] and [table rows], every other element type can be
>  assigned attributes.
> 
>  This is done by adding specific keywords, named “affiliated keywords”,
>  just above the element considered, no blank line allowed.
> 
>  Affiliated keywords are built upon one of the following patterns:
>  “#+KEY: VALUE”, “#+KEY[OPTIONAL]: VALUE” or “#+ATTR_BACKEND: VALUE”.
> 
>  KEY is either “CAPTION”, “HEADER”, “NAME”, “PLOT” or “RESULTS” string.
> 
>  BACKEND is a string constituted of alpha-numeric characters, hyphens
>  or underscores.
> 
>  OPTIONAL and VALUE can contain any character but a new line.  Only
>  keywords in `org-element-dual-keywords' can have an optional value.
> 
>  An affiliated keyword can appear on multiple lines if KEY belongs to
>  `org-element-multiple-keywords' or if its pattern is “#+ATTR_BACKEND:
>  VALUE”.
> 
>  Affiliated keywords whose KEY belong to `org-element-parsed-keywords'
>  can contain objects in their value and their optional value, if
>  applicable.
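
The three patterns can be folded into one regular expression.  A
Python sketch, assuming the keyword starts at the beginning of the line
and treating the match as case-insensitive; parse_affiliated is a
made-up name, and the check against `org-element-dual-keywords' for
OPTIONAL is left out.

```python
import re

AFFILIATED_RE = re.compile(
    r"^#\+"
    r"(?:(?P<key>CAPTION|HEADER|NAME|PLOT|RESULTS)"
    r"(?:\[(?P<optional>[^\n]*)\])?"             # "#+KEY[OPTIONAL]: VALUE"
    r"|ATTR_(?P<backend>[-_A-Za-z0-9]+))"        # "#+ATTR_BACKEND: VALUE"
    r":[ \t]*(?P<value>[^\n]*)$",
    re.IGNORECASE,
)


def parse_affiliated(line):
    """Return the named parts of an affiliated keyword line, or None."""
    m = AFFILIATED_RE.match(line)
    return None if m is None else m.groupdict()
```

For instance, "#+CAPTION[Short]: Long caption" yields key "CAPTION",
optional "Short" and value "Long caption", while "#+TITLE: x" is
rejected because TITLE is an ordinary keyword.
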
> 
> 
>  [inlinetasks] See section 3.5
> 
>  [items] See section 3.6
> 
>  [planning] See section 4.3
> 
>  [clocks] See section 4.3
> 
>  [node properties] See section 4.9
> 
>  [table rows] See section 4.11
> 
> 
> 3 Greater Elements
> ══════════════════
> 
>  Unless specified otherwise, greater elements can contain directly any
>  other element or greater element except:
> 
>  • elements of their own type,
>  • [node properties], which can only be found in [property drawers],
>  • [items], which can only be found in [plain lists].
> 
> 
>  [node properties] See section 4.9
> 
>  [property drawers] See section 3.2
> 
>  [items] See section 3.6
> 
>  [plain lists] See section 3.6
> 
> 
> 3.1 Greater Blocks
> ──────────────────
> 
>  Greater blocks consist of the following pattern:
> 
>  ╭────
>  │ #+BEGIN_NAME PARAMETERS
>  │ CONTENTS
>  │ #+END_NAME
>  ╰────
> 
>  NAME can contain any non-whitespace character.
> 
>  PARAMETERS can contain any character, and can be omitted.
> 
>  If NAME is “CENTER”, it will be a “center block”.  If it is “QUOTE”,
>  it will be a “quote block”.
> 
>  If the block is neither a center block, a quote block nor a [block
>  element], it will be a “special block”.
> 
>  CONTENTS can contain any element except another greater block of the
>  same type.
> 
> 
>  [block element] See section 4.2
> 
> 
> 3.2 Drawers and Property Drawers
> ────────────────────────────────
> 
>  Pattern for drawers is:
> 
>  ╭────
>  │ :NAME:
>  │ CONTENTS
>  │ :END:
>  ╰────
> 
>  NAME has to either be “PROPERTIES” or belong to `org-drawers' list.
> 
>  If NAME is “PROPERTIES”, the drawer will become a “property drawer”.
> 
>  In a property drawer, CONTENTS can only contain [node property]
>  elements.  Otherwise it can contain any element but another drawer or
>  property drawer.
> 
>                                  ―――――
> 
>  It would be nice if users didn’t have to register drawer names in
>  `org-drawers' (or through the `#+DRAWERS:' keyword) before using them.
>  Anything starting with `^[ \t]*:\w+:[ \t]*$' and ending with
>  `^[ \t]*:END:[ \t]*$' could be considered as a drawer.  — ngz
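
To see what that suggestion would accept, here is a Python sketch of a
registration-free drawer scanner.  It assumes the trailing whitespace in
both regexps is meant to be optional and repeatable (`[ \t]*'), and it
silently drops a drawer that is never closed; find_drawers is a made-up
name.

```python
import re

DRAWER_START = re.compile(r"^[ \t]*:(\w+):[ \t]*$")
DRAWER_END = re.compile(r"^[ \t]*:END:[ \t]*$", re.IGNORECASE)


def find_drawers(text):
    """Return (NAME, content-lines) pairs for every closed drawer."""
    drawers, name, body = [], None, []
    for line in text.splitlines():
        if name is None:
            m = DRAWER_START.match(line)
            # A stray ":END:" outside a drawer does not open one.
            if m and m.group(1).upper() != "END":
                name, body = m.group(1), []
        elif DRAWER_END.match(line):
            drawers.append((name, body))
            name = None
        else:
            body.append(line)
    return drawers
```
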
> 
> 
>  [node property] See section 4.9
> 
> 
> 3.3 Dynamic Blocks
> ──────────────────
> 
>  Pattern for dynamic blocks is:
> 
>  ╭────
>  │ #+BEGIN: NAME PARAMETERS
>  │ CONTENTS
>  │ #+END:
>  ╰────
> 
>  NAME cannot contain any whitespace character.
> 
>  PARAMETERS can contain any character and can be omitted.
> 
> 
> 3.4 Footnote Definitions
> ────────────────────────
> 
>  Pattern for footnote definitions is:
> 
>  ╭────
>  │ [LABEL] CONTENTS
>  ╰────
> 
>  It must start at column 0.
> 
>  LABEL is either a number or follows the pattern “fn:WORD”, where WORD
>  can contain any word-constituent character, hyphens and underscore
>  characters.
> 
>  CONTENTS can contain any element except another footnote definition.
>  It ends at the next footnote definition, the next headline, two
>  consecutive empty lines or the end of buffer.
> 
> 
> 3.5 Inlinetasks
> ───────────────
> 
>  Inlinetasks are defined by `org-inlinetask-min-level' contiguous
>  asterisk characters starting at column 0, followed by a whitespace
>  character.
> 
>  Optionally, inlinetasks can be ended with a string constituted of
>  `org-inlinetask-min-level' contiguous asterisk characters starting at column 0,
>  followed by a space and the “END” string.
> 
>  Inlinetasks are recognized only after `org-inlinetask' library is
>  loaded.
> 
> 
> 3.6 Plain Lists and Items
> ─────────────────────────
> 
>  Items are defined by a line starting with the following pattern:
>  “BULLET COUNTER-SET CHECK-BOX TAG”, in which only BULLET is mandatory.
> 
>  BULLET is either an asterisk, a hyphen, a plus sign character or
>  follows either the pattern “COUNTER.” or “COUNTER)”.  In any case,
>  BULLET is followed by a whitespace character or line ending.
> 
>  COUNTER can be a number or a single letter.
> 
>  COUNTER-SET follows the pattern [@COUNTER].
> 
>  CHECK-BOX is either a single whitespace character, a “X” character or
>  a hyphen, enclosed within square brackets.
> 
>  TAG follows “TAG-TEXT ::” pattern, where TAG-TEXT can contain any
>  character but a new line.
> 
>  An item ends before the next item, the first line indented no more
>  than its starting line, or two consecutive empty lines.
>  Indentation of lines within other greater elements does not count,
>  nor do inlinetask boundaries.
> 
>  A plain list is a set of consecutive items of the same indentation.
>  It can only directly contain items.
> 
>  If the first item in a plain list has a counter in its bullet, the plain
>  list will be an “ordered plain-list”.  If it contains a tag, it will
>  be a “descriptive list”.  Otherwise, it will be an “unordered list”.
>  List types are mutually exclusive.
> 
>  For example, consider the following excerpt of an Org document:
> 
>  ╭────
>  │ 1. item 1
>  │ 2. [X] item 2
>  │    - some tag :: item 2.1
>  ╰────
> 
>  Its internal structure is as follows:
> 
>  ╭────
>  │ (ordered-plain-list
>  │  (item)
>  │  (item
>  │   (descriptive-plain-list
>  │    (item))))
>  ╰────
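
A Python sketch of the item pattern and of the list-type rule, to make
the example concrete.  It looks at single lines only, so item
boundaries and nesting are not handled, and it adds one assumption the
text does not spell out here: an unindented asterisk is read as a
headline, not as a bullet.  parse_item and list_type are made-up names.

```python
import re

ITEM_RE = re.compile(
    r"^(?P<indent>[ \t]*)"
    r"(?P<bullet>[-+*]|(?:\d+|[A-Za-z])[.)])"        # BULLET
    r"(?:[ \t]+|$)"
    r"(?:\[@(?P<counter_set>\d+|[A-Za-z])\][ \t]*)?"  # COUNTER-SET
    r"(?:\[(?P<checkbox>[ X-])\][ \t]*)?"             # CHECK-BOX
    r"(?:(?P<tag>.*?)[ \t]+::(?:[ \t]+|$))?"          # TAG
    r"(?P<body>.*)$"
)


def parse_item(line):
    """Return the named parts of an item line, or None."""
    m = ITEM_RE.match(line)
    if m is None or (m.group("bullet") == "*" and not m.group("indent")):
        return None  # not an item (stars at column 0 start a headline)
    return m.groupdict()


def list_type(first_item_line):
    """Classify a plain list from its first item, following the rule above."""
    item = parse_item(first_item_line)
    if item is None:
        return None
    if item["bullet"][-1] in ".)":
        return "ordered"
    if item["tag"] is not None:
        return "descriptive"
    return "unordered"
```

On the excerpt above, the first line gives "ordered" and the nested
"- some tag :: item 2.1" line gives "descriptive".
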
> 
> 
> 3.7 Tables
> ──────────
> 
>  Tables start at lines beginning with either a vertical bar or the “+-”
>  string followed by plus or minus signs only, assuming they are not
>  preceded with lines of the same type.  These lines can be indented.
> 
>  A table starting with a vertical bar has “org” type.  Otherwise it has
>  “table.el” type.
> 
>  Org tables end at the first line not starting with a vertical bar.
>  Table.el tables end at the first line not starting with either
>  a vertical line or a plus sign.  Such lines can be indented.
> 
>  An org table can only contain table rows.  A table.el table does not
>  contain anything.
> 
> 
> 4 Elements
> ══════════
> 
>  Elements cannot contain any other element.
> 
>  Only [keywords] whose name belongs to
>  `org-element-document-properties', [verse blocks], [paragraphs] and
>  [table rows] can contain objects.
> 
> 
>  [keywords] See section 4.7
> 
>  [verse blocks] See section 4.2
> 
>  [paragraphs] See section 4.10
> 
>  [table rows] See section 4.11
> 
> 
> 4.1 Babel Call
> ──────────────
> 
>  Pattern for babel calls is:
> 
>  ╭────
>  │ #+CALL: VALUE
>  ╰────
> 
>  VALUE is optional.  It can contain any character but a new line.
> 
> 
> 4.2 Blocks
> ──────────
> 
>  Like [greater blocks], pattern for blocks is:
> 
>  ╭────
>  │ #+BEGIN_NAME DATA
>  │ CONTENTS
>  │ #+END_NAME
>  ╰────
> 
>  NAME cannot contain any whitespace character.
> 
>  If NAME is “COMMENT”, it will be a “comment block”.  If it is
>  “EXAMPLE”, it will be an “example block”.  If it is “SRC”, it will be
>  a “source block”.  If it is “VERSE”, it will be a “verse block”.
> 
>  If NAME is a string matching the name of any export back-end loaded,
>  the block will be an “export block”.
> 
>  DATA can contain any character but a new line.  It can be omitted,
>  unless the block is a “source block”.  In this case, it must follow
>  the pattern “LANGUAGE SWITCHES ARGUMENTS”, where SWITCHES and
>  ARGUMENTS are optional.
> 
>  LANGUAGE cannot contain any whitespace character.
> 
>  SWITCHES is made of any number of “SWITCH” patterns, separated by
>  blank characters.
> 
>  A SWITCH pattern is either “-l "FORMAT"”, where FORMAT can contain any
>  character but a double quote and a new line, “-S” or “+S”, where
>  S stands for a single letter.
> 
>  ARGUMENTS can contain any character but a new line.
> 
>  CONTENTS can contain any character, including new lines.  However, it
>  will contain Org objects only if the block is a verse block.
>  Otherwise, contents will not be parsed.
> 
> 
>  [greater blocks] See section 3.1
> 
> 
> 4.3 Clock, Diary Sexp and Planning
> ──────────────────────────────────
> 
>  A clock follows the pattern:
> 
>  ╭────
>  │ CLOCK: TIMESTAMP DURATION
>  ╰────
> 
>  Both TIMESTAMP and DURATION are optional.
> 
>  TIMESTAMP is a [timestamp] object.
> 
>  DURATION follows the pattern:
> 
>  ╭────
>  │ => HH:MM
>  ╰────
> 
>  HH is a number containing any number of digits.  MM is a two-digit
>  number.
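
As a quick illustration of the DURATION part, a Python sketch that
converts it to minutes.  It assumes DURATION sits at the end of the
clock line and does not validate TIMESTAMP; clock_minutes is a made-up
name.

```python
import re

# "=> HH:MM": HH has any number of digits, MM exactly two.
DURATION_RE = re.compile(r"=>[ \t]*(?P<hh>\d+):(?P<mm>\d{2})[ \t]*$")


def clock_minutes(line):
    """Return the clocked duration in minutes, or None if there is none."""
    m = DURATION_RE.search(line)
    if m is None:
        return None
    return int(m["hh"]) * 60 + int(m["mm"])
```
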
> 
>  A diary sexp is a line starting at column 0 with “%%(” string.  It can
>  then contain any character besides a new line.
> 
>  A planning is a line filled with at most three INFO parts, where
>  each INFO part follows the pattern:
> 
>  ╭────
>  │ KEYWORD: TIMESTAMP
>  ╰────
> 
>  KEYWORD is a string among `org-deadline-string',
>  `org-scheduled-string' and `org-closed-string'.  TIMESTAMP is
>  a [timestamp] object.
> 
>  Even though a planning element can exist anywhere in a section or
>  a greater element, it will only affect the headline containing the
>  section if it is put on the line following that headline.
> 
> 
>  [timestamp] See section 5.12
> 
> 
> 4.4 Comments
> ────────────
> 
>  A “comment line” starts with a hash sign and a whitespace character
>  or an end of line.
> 
>  Comments can contain any number of consecutive comment lines.
> 
> 
> 4.5 Fixed Width Areas
> ─────────────────────
> 
>  A “fixed-width line” starts with a colon character and a whitespace or
>  an end of line.
> 
>  Fixed width areas can contain any number of consecutive fixed-width
>  lines.
> 
> 
> 4.6 Horizontal Rules
> ────────────────────
> 
>  A horizontal rule is a line made of at least 5 consecutive hyphens.
>  It can be indented.
> 
> 
> 4.7 Keywords
> ────────────
> 
>  Keywords follow the syntax:
> 
>  ╭────
>  │ #+KEY: VALUE
>  ╰────
> 
>  KEY can contain any non-whitespace character, but it cannot be equal
>  to “CALL” or any affiliated keyword.
> 
>  VALUE can contain any character except a new line.
> 
>  If KEY belongs to `org-element-document-properties', VALUE can contain
>  objects.
> 
> 
> 4.8 LaTeX Environments
> ──────────────────────
> 
>  Pattern for LaTeX environments is:
> 
>  ╭────
>  │ \begin{NAME}
>  │ CONTENTS
>  │ \end{NAME}
>  ╰────
> 
>  NAME is constituted of alpha-numeric characters and may end with an
>  asterisk.
> 
>  CONTENTS can contain anything but the “\end{NAME}” string.
> 
> 
> 4.9 Node Properties
> ───────────────────
> 
>  Pattern for node properties is:
> 
>  ╭────
>  │ :PROPERTY: VALUE
>  ╰────
> 
>  PROPERTY can contain any non-whitespace character.  VALUE can contain
>  any character but a new line.
> 
>  Node properties can only exist in [property drawers].
> 
> 
>  [property drawers] See section 3.2
> 
> 
> 4.10 Paragraphs
> ───────────────
> 
>  Paragraphs are the default element, which means that any unrecognized
>  context is a paragraph.
> 
>  Empty lines and other elements end paragraphs.
> 
>  Paragraphs can contain every type of object.
> 
> 
> 4.11 Table Rows
> ───────────────
> 
>  A table row consists of either a vertical bar and any number of
>  [table cells] or a vertical bar followed by a hyphen.
> 
>  In the first case the table row has the “standard” type.  In the
>  second case, it has the “rule” type.
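
A small Python sketch of that distinction, also splitting a standard
row into its cells.  It assumes rows may be indented, trims whitespace
around cell contents, and ignores an empty remainder after the last
vertical bar; parse_table_row is a made-up name.

```python
def parse_table_row(line):
    """Return (type, cells) for an Org table row, or None."""
    s = line.strip()
    if not s.startswith("|"):
        return None
    if s.startswith("|-"):
        return ("rule", [])
    cells = s[1:].split("|")
    # Each cell is "CONTENTS|", so nothing meaningful follows the last bar.
    if cells and cells[-1].strip() == "":
        cells.pop()
    return ("standard", [c.strip() for c in cells])
```
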
> 
>  Table rows can only exist in [tables].
> 
> 
>  [table cells] See section 5.11
> 
>  [tables] See section 3.7
> 
> 
> 5 Objects
> ═════════
> 
>  Objects can only be found in the following locations:
> 
>  • [affiliated keywords] defined in `org-element-parsed-keywords',
>  • [document properties],
>  • [headline] titles,
>  • [inlinetask] titles,
>  • [item] tags,
>  • [paragraphs],
>  • [table cells],
>  • [table rows], which can only contain table cell objects,
>  • [verse blocks].
> 
>  Most objects cannot contain objects.  Those which can will be
>  specified.
> 
> 
>  [affiliated keywords] See section 2
> 
>  [document properties] See section 4.7
> 
>  [headline] See section 1
> 
>  [inlinetask] See section 3.5
> 
>  [item] See section 3.6
> 
>  [paragraphs] See section 4.10
> 
>  [table cells] See section 5.11
> 
>  [table rows] See section 4.11
> 
>  [verse blocks] See section 4.2
> 
> 
> 5.1 Entities and LaTeX Fragments
> ────────────────────────────────
> 
>  An entity follows the pattern:
> 
>  ╭────
>  │ \NAME POST
>  ╰────
> 
>  where NAME has a valid association in either `org-entities' or
>  `org-entities-user'.
> 
>  POST is the end of line, "{}" string, or a non-alphabetical character.
>  It isn’t separated from NAME by a whitespace character.
> 
>  A LaTeX fragment can follow multiple patterns:
> 
>  ╭────
>  │ \NAME POST
>  │ \(CONTENTS\)
>  │ \[CONTENTS\]
>  │ $$CONTENTS$$
>  │ PRE$CHAR$POST
>  │ PRE$BORDER1 BODY BORDER2$
>  ╰────
> 
>  NAME contains alphabetical characters only and must not have an
>  association in either `org-entities' or `org-entities-user'.
> 
>  POST is the same as for entities.
> 
>  CONTENTS can contain any character but cannot contain “\)” in the
>  second template or “\]” in the third one.
> 
>  PRE is either the beginning of line or a character different from `$'.
> 
>  CHAR is a non-whitespace character different from `.', `,', `?', `;',
>  `'' or a double quote.
> 
>  POST is any of `-', `.', `,', `?', `;', `:', `'', a double quote,
>  a whitespace character and the end of line.
> 
>  BORDER1 is a non-whitespace character different from `.', `;', `,'
>  and `$'.
> 
>  BODY can contain any character except `$', and may not span over
>  more than 3 lines.
> 
>  BORDER2 is any non-whitespace character different from `,', `.' and
>  `$'.
> 
>                                  ―――――
> 
>        It would introduce incompatibilities with previous Org
>        versions, but support for “$…$” (and for symmetry,
>        `$$...$$') constructs ought to be removed.
> 
>        They are slow to parse, fragile, redundant, imply false
>        positives and do not look good in LaTeX output anyway.
>        Even the LaTeX community suggests using `\(...\)' over
>        `$...$'.  — ngz
> 
> 
> 5.2 Export Snippets
> ───────────────────
> 
>  Pattern for export snippets is:
> 
>  ╭────
>  │ @@NAME:VALUE@@
>  ╰────
> 
>  NAME can contain any alpha-numeric character and hyphens.
> 
>  VALUE can contain anything but “@@” string.
> 
> 
> 5.3 Footnote References
> ───────────────────────
> 
>  There are four patterns for footnote references:
> 
>  ╭────
>  │ [MARK]
>  │ [fn:LABEL]
>  │ [fn:LABEL:DEFINITION]
>  │ [fn::DEFINITION]
>  ╰────
> 
>  MARK is a number.
> 
>  LABEL can contain any word constituent character, hyphens and
>  underscores.
> 
>  DEFINITION can contain any character, though opening and closing
>  square brackets must be balanced in it.  It can contain any object
>  encountered in a paragraph, even other footnote references.
> 
>  If the reference follows the third pattern, it is called an “inline
>  footnote”.  If it follows the fourth one, i.e. if LABEL is omitted, it
>  is an “anonymous footnote”.
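
The four patterns and the bracket-balancing rule can be checked with
a few lines of Python.  This is a sketch that classifies a string
already known to be one complete reference; it does not find references
inside running text.  The label "standard" for the first two patterns
and both function names are my own.

```python
import re


def balanced(s):
    """True if square brackets in S are properly nested."""
    depth = 0
    for c in s:
        depth += (c == "[") - (c == "]")
        if depth < 0:
            return False
    return depth == 0


def classify_footnote(ref):
    """Return "standard", "inline", "anonymous", or None."""
    if re.fullmatch(r"\[\d+\]", ref):            # [MARK]
        return "standard"
    m = re.fullmatch(r"\[fn:([-\w]*)(?::(.*))?\]", ref, re.DOTALL)
    if m is None:
        return None
    label, definition = m.group(1), m.group(2)
    if definition is None:                       # [fn:LABEL]
        return "standard" if label else None
    if not balanced(definition):
        return None
    # [fn:LABEL:DEFINITION] versus [fn::DEFINITION]
    return "inline" if label else "anonymous"
```
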
> 
> 
> 5.4 Inline Babel Calls and Source Blocks
> ────────────────────────────────────────
> 
>  Inline Babel calls follow any of the following patterns:
> 
>  ╭────
>  │ call_NAME(ARGUMENTS)
>  │ call_NAME[HEADER](ARGUMENTS)[HEADER]
>  ╰────
> 
>  NAME can contain any character besides `(', `)' and “\n”.
> 
>  HEADER can contain any character besides `]' and “\n”.
> 
>  ARGUMENTS can contain any character besides `)' and “\n”.
> 
>  Inline source blocks follow any of the following patterns:
> 
>  ╭────
>  │ src_LANG{BODY}
>  │ src_LANG[OPTIONS]{BODY}
>  ╰────
> 
>  LANG can contain any non-whitespace character.
> 
>  OPTIONS and BODY can contain any character but “\n”.
> 
> 
> 5.5 Line Breaks
> ───────────────
> 
>  A line break consists of a “\\SPACE” pattern at the end of an otherwise
>  non-empty line.
> 
>  SPACE can contain any number of tabs and spaces, including 0.
> 
> 
> 5.6 Links
> ─────────
> 
>  There are 4 major types of links:
> 
>  ╭────
>  │ RADIO                     ("radio" link)
>  │ <PROTOCOL:PATH>           ("angle" link)
>  │ PRE PROTOCOL:PATH2 POST   ("plain" link)
>  │ [[PATH3]DESCRIPTION]      ("regular" link)
>  ╰────
> 
>  RADIO is a string matched by some [radio target].  It can contain
>  [entities], [latex fragments], [subscript] and [superscript] only.
> 
>  PROTOCOL is a string among `org-link-types'.
> 
>  PATH can contain any character but `]', `<', `>' and `\n'.
> 
>  PRE and POST are non word constituent.  They can be, respectively, the
>  beginning or the end of a line.
> 
>  PATH2 can contain any non-whitespace character except `(', `)', `<'
>  and `>'.  It must end with a word-constituent character, or any
>  non-whitespace non-punctuation character followed by `/'.
> 
>  DESCRIPTION must be enclosed within square brackets.  It can contain
>  any character but square brackets.  Object-wise, it can contain any
>  object found in a paragraph except a [footnote reference], a [radio
>  target] and a [line break].  It cannot contain another link either,
>  unless it is a plain link.
> 
>  DESCRIPTION is optional.
> 
>  PATH3 is built according to the following patterns:
> 
>  ╭────
>  │ FILENAME           ("file" type)
>  │ PROTOCOL:PATH4     ("PROTOCOL" type)
>  │ id:ID              ("id" type)
>  │ #CUSTOM-ID         ("custom-id" type)
>  │ (CODEREF)          ("coderef" type)
>  │ FUZZY              ("fuzzy" type)
>  ╰────
> 
>  FILENAME is a file name, either absolute or relative.
> 
>  PATH4 can contain any character besides square brackets.
> 
>  ID is constituted of hexadecimal numbers separated with hyphens.
> 
>  PATH4, CUSTOM-ID, CODEREF and FUZZY can contain any character besides
>  square brackets.
> 
>                                  ―――――
> 
>        I suggest removing angle links.  If one needs spaces in
>        PATH, she can use standard link syntax instead.
> 
>        I also suggest removing the `org-link-types' dependency in
>        PROTOCOL and matching `[a-zA-Z]' instead, for portability.  —
>        ngz
> 
> 
>  [radio target] See section 5.8
> 
>  [entities] See section 5.1
> 
>  [latex fragments] See section 5.1
> 
>  [subscript] See section 5.10
> 
>  [superscript] See section 5.10
> 
>  [footnote reference] See section 5.3
> 
>  [line break] See section 5.5
> 
> 
> 5.7 Macros
> ──────────
> 
>  Macros follow the pattern:
> 
>  ╭────
>  │ {{{NAME(ARGUMENTS)}}}
>  ╰────
> 
>  NAME must start with a letter and can be followed by any number of
>  alpha-numeric characters, hyphens and underscores.
> 
>  ARGUMENTS can contain anything but "}}}" string.  Values within
>  ARGUMENTS are separated by commas.  Non-separating commas have to be
>  escaped with a backslash character.
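
The comma-escaping rule is easy to get wrong, so here is a Python
sketch of it.  It follows the pattern literally (parentheses are
required) and trims whitespace around each value, which the text does
not specify; parse_macro is a made-up name.

```python
import re

MACRO_RE = re.compile(r"\{\{\{([A-Za-z][-\w]*)\((.*?)\)\}\}\}", re.DOTALL)


def parse_macro(text):
    """Return (NAME, [values]) for the first macro in TEXT, or None."""
    m = MACRO_RE.search(text)
    if m is None:
        return None
    # Split on commas not preceded by a backslash, then unescape "\,".
    values = [v.strip().replace("\\,", ",")
              for v in re.split(r"(?<!\\),", m.group(2))]
    return m.group(1), values
```

For example, {{{poem(roses, red\, not blue)}}} yields the name "poem"
and the two values "roses" and "red, not blue".
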
> 
> 
> 5.8 Targets and Radio Targets
> ─────────────────────────────
> 
>  Radio targets follow the pattern:
> 
>  ╭────
>  │ <<<CONTENTS>>>
>  ╰────
> 
>  CONTENTS can be any character besides `<', `>' and “\n”.  As far as
>  objects go, it can contain [entities], [latex fragments], [subscript]
>  and [superscript] only.
> 
>  Targets follow the pattern:
> 
>  ╭────
>  │ <<TARGET>>
>  ╰────
> 
>  TARGET can contain any character besides `<', `>' and “\n”.  It cannot
>  contain any object.
> 
> 
>  [entities] See section 5.1
> 
>  [latex fragments] See section 5.1
> 
>  [subscript] See section 5.10
> 
>  [superscript] See section 5.10
> 
> 
> 5.9 Statistics Cookies
> ──────────────────────
> 
>  Statistics cookies follow either pattern:
> 
>  ╭────
>  │ [PERCENT%]
>  │ [NUM1/NUM2]
>  ╰────
> 
>  PERCENT, NUM1 and NUM2 are numbers or the empty string.
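
A Python sketch that reads either cookie form and turns it into a
fraction.  Returning None for empty or zero-denominator cookies is my
own choice, and cookie_progress is a made-up name.

```python
import re

COOKIE_RE = re.compile(
    r"\[(?:(?P<percent>\d*)%|(?P<num1>\d*)/(?P<num2>\d*))\]"
)


def cookie_progress(text):
    """Return completion as a float in the first cookie found, or None."""
    m = COOKIE_RE.search(text)
    if m is None:
        return None
    if m["percent"] is not None:                 # [PERCENT%]
        return int(m["percent"]) / 100 if m["percent"] else None
    if m["num1"] and m["num2"] and int(m["num2"]) > 0:   # [NUM1/NUM2]
        return int(m["num1"]) / int(m["num2"])
    return None
```
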
> 
> 
> 5.10 Subscript and Superscript
> ──────────────────────────────
> 
>  Pattern for subscript is:
> 
>  ╭────
>  │ CHAR_SCRIPT
>  ╰────
> 
>  Pattern for superscript is:
> 
>  ╭────
>  │ CHAR^SCRIPT
>  ╰────
> 
>  CHAR is any non-whitespace character.
> 
>  SCRIPT can be `*', a string made of word-constituent characters possibly
>  preceded by a plus or a minus sign, or an expression enclosed in
>  parentheses (resp. curly brackets) containing balanced parentheses
>  (resp. curly brackets).
> 
> 
> 5.11 Table Cells
> ────────────────
> 
>  Table cells follow the pattern:
> 
>  ╭────
>  │ CONTENTS|
>  ╰────
> 
>  CONTENTS can contain any character except a vertical bar.
> 
> 
> 5.12 Timestamps
> ───────────────
> 
>  There are seven possible patterns for timestamps:
> 
>  ╭────
>  │ <%%(SEXP)>                                     (diary)
>  │ <DATE TIME REPEATER>                         (active)
>  │ [DATE TIME REPEATER]                         (inactive)
>  │ <DATE TIME REPEATER>--<DATE TIME REPEATER>   (active range)
>  │ <DATE TIME-TIME REPEATER>                    (active range)
>  │ [DATE TIME REPEATER]--[DATE TIME REPEATER]   (inactive range)
>  │ [DATE TIME-TIME REPEATER]                    (inactive range)
>  ╰────
> 
>  SEXP can contain any character except `>' and `\n'.
> 
>  DATE follows the pattern:
> 
>  ╭────
>  │ YYYY-MM-DD DAYNAME
>  ╰────
> 
>  Y, M and D are digits.  DAYNAME can contain any non-whitespace
>  character besides `+', `-', `]', `>', a digit or `\n'.
> 
>  TIME follows the pattern “H:MM”.  H can be one or two digits long and
>  can start with 0.
> 
>  REPEATER follows the pattern:
> 
>  ╭────
>  │ MARK VALUE UNIT
>  ╰────
> 
>  MARK is `+' (cumulate type), `++' (catch-up type) or `.+' (restart
>  type).
> 
>  VALUE is a number.
> 
>  UNIT is a character among `h' (hour), `d' (day), `w' (week), `m'
>  (month), `y' (year).
> 
>  MARK, VALUE and UNIT are not separated by whitespace characters.
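
A Python sketch covering the single-bracket forms (active, inactive,
and the TIME-TIME range).  It assumes DAYNAME, TIME and REPEATER are
each optional, which the text does not state outright, and it leaves
out diary timestamps and the "--" ranges; parse_timestamp is a made-up
name.

```python
import re

DATE = r"(?P<date>\d{4}-\d{2}-\d{2})(?: +(?P<dayname>[^\s+\-\]>\d]+))?"
TIME = r"(?: +(?P<time>\d{1,2}:\d{2})(?:-(?P<time_end>\d{1,2}:\d{2}))?)?"
REPEATER = r"(?: +(?P<mark>\+\+|\.\+|\+)(?P<value>\d+)(?P<unit>[hdwmy]))?"
TIMESTAMP_RE = re.compile(
    r"(?P<open>[<\[])" + DATE + TIME + REPEATER + r"[>\]]"
)


def parse_timestamp(text):
    """Return the parts of a timestamp as a dict, or None."""
    m = TIMESTAMP_RE.fullmatch(text)
    # Reject mismatched delimiters such as "<...]".
    if m is None or (m["open"] == "<") != text.endswith(">"):
        return None
    info = m.groupdict()
    info["active"] = info.pop("open") == "<"
    return info
```
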
> 
> 
> 5.13 Text Markup
> ────────────────
> 
>  Text markup follows the pattern:
> 
>  ╭────
>  │ PRE MARKER CONTENTS MARKER POST
>  ╰────
> 
>  PRE is a whitespace character, `(', `{', `'' or a double quote.  It can
>  also be a beginning of line.
> 
>  MARKER is a character among `*' (bold), `=' (verbatim), `/' (italic),
>  `+' (strike-through), `_' (underline), `~' (code).
> 
>  CONTENTS is a string following the pattern:
> 
>  ╭────
>  │ BORDER BODY BORDER
>  ╰────
> 
>  BORDER can be any non-whitespace character except `,', `'' or
>  a double quote.
> 
>  BODY can contain any character but may not span over more than
>  3 lines.
> 
>  BORDER and BODY are not separated by whitespaces.
> 
>  CONTENTS can contain any object encountered in a paragraph when markup
>  is “bold”, “italic”, “strike-through” or “underline”.
> 
>  POST is a whitespace character, `-', `.', `,', `:', `!', `?', `'',
>  `)', `}' or a double quote.  It can also be an end of line.
> 
>  PRE, MARKER, CONTENTS, MARKER and POST are not separated by whitespace
>  characters.
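
A Python sketch of the PRE MARKER CONTENTS MARKER POST pattern with the
default character sets listed above.  It is simplified in two ways:
BODY is kept on a single line and the objects inside CONTENTS are not
parsed.  It also ignores any customization of
`org-emphasis-regexp-components'; find_markup is a made-up name.

```python
import re

MARKERS = {"*": "bold", "=": "verbatim", "/": "italic",
           "+": "strike-through", "_": "underline", "~": "code"}

BORDER = r"[^\s,'\"]"
MARKUP_RE = re.compile(
    r"(?<![^\s({'\"])"                 # PRE, or beginning of line
    r"(?P<marker>[*=/+_~])"
    r"(?P<contents>" + BORDER + r"[^\n]*?" + BORDER + r")"
    r"(?P=marker)"                     # same MARKER closes the markup
    r"(?=[\s\-.,:!?'\")}]|$)"          # POST, or end of line
)


def find_markup(text):
    """Return (type, contents) pairs for each markup span in TEXT."""
    return [(MARKERS[m["marker"]], m["contents"])
            for m in MARKUP_RE.finditer(text)]
```

Because PRE must precede the opening marker, the asterisks in "a*b*c"
are left alone.
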
> 
>                                  ―――――
> 
>        All of this is wrong if `org-emphasis-regexp-components'
>        or `org-emphasis-alist' are modified.
> 
>        This should really be simplified and made persistent
>        (i.e. no defcustom allowed).  Otherwise, portability and
>        parsing are jokes.
> 
>        Also, CONTENTS should be anything within code and verbatim
>        emphasis, by definition.  — ngz
> 
> 
> 
> Footnotes
> ─────────
> 
> [1] In particular, the parser requires stars at column 0 to be quoted
> by a comma when they do not define a headline.
> 
> [2] It also means that only headlines and sections can be recognized
> just by looking at the beginning of the line.
> 
> As a consequence, using `org-element-at-point' or
> `org-element-context' will move up to the parent headline, and parse
> top-down from there until context around is found.
> 
> 
> 
> Regards,
> 
> -- 
> Nicolas Goaziou
> 


* Re: [RFC] Org syntax (draft)
From: Achim Gratz @ 2013-03-07 22:07 UTC (permalink / raw)
  To: emacs-orgmode

Nicolas Goaziou writes:
> As discussed a few days ago, here is a document describing the complete
> Org syntax as read by the parser. I also added some comments. I am going
> to put the Org file on Worg, so anyone can update it and fix mistakes.

Wonderful.  This will be really useful!


Regards,
Achim.


* Re: [RFC] Org syntax (draft)
From: Bastien @ 2013-03-08 10:04 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Org Mode List

Nicolas Goaziou <n.goaziou@gmail.com> writes:

> As discussed a few days ago, here is a document describing the complete
> Org syntax as read by the parser. I also added some comments. I am going
> to put the Org file on Worg, so anyone can update it and fix mistakes.

Thanks Nicolas -- yep, that's really *great*!

-- 
 Bastien


* Re: [RFC] Org syntax (draft)
From: François Pinard @ 2013-03-08 13:25 UTC (permalink / raw)
  To: emacs-orgmode

Nicolas Goaziou <n.goaziou@gmail.com> writes:

> As discussed a few days ago, here is a document describing the complete
> Org syntax as read by the parser.

Fantastique! :-)  I'm preciously saving this!

François


* Re: [RFC] Org syntax (draft)
From: Nicolas Richard @ 2013-03-08 15:23 UTC (permalink / raw)
  To: emacs-orgmode

Nicolas Goaziou <n.goaziou@gmail.com> writes:
> As discussed a few days ago, here is a document describing the complete
> Org syntax as read by the parser. I also added some comments. I am going
> to put the Org file on Worg, so anyone can update it and fix mistakes.

[for the record, the org file mentioned by Nicolas is currently at
<http://orgmode.org/worg/dev/org-syntax.org>]

This looks truly awesome. I give some (naïve) comments below, from my
non-expert point of view.

> The paragraph is the unit of measurement.  An element defines
> syntactical parts that are at the same level as a paragraph, i.e. which
> cannot contain or be included in a paragraph.  An object is a part that
> could be included in an element.  Greater elements are all parts that
> can contain an element.

This is very clear but I'm slightly worried about confusion that might come
from "Greater element" not being an "element", and the word "element"
being a common word :

> Empty lines belong to the largest element ending before them.  For
> example, in a list, empty lines between items belong are part of the
> item before them, but empty lines at the end of a list belong to the
> plain list element.

Is the word "element" (in /largest element ending.../) to be understood
as an "element" from the above definition ? I guess not (this would
require both list items and plain lists to be on the level 'element',
from your example)

> 1 Headlines and Sections
> ════════════════════════
>
>   A headline is defined as:
>
>   ╭────
>   │ STARS KEYWORD PRIORITY TITLE TAGS
>   ╰────
>
>   STARS is a string starting at column 0 and containing at least one
>   asterisk (and up to `org-inlinetask-min-level' if `org-inlinetask'
>   library is loaded).  It’s the sole compulsory part of a headline.

Perhaps it should be mentioned that STARS has to end by a space (see
below). I suggest adding : The number of stars defines the level of the
headline.

>   KEYWORD is a TODO keyword, which have to belong to the list defined in
>   `org-todo-keywords'.  Case is significant.

The option #+TODO: is used also.

>   PRIORITY is a priority cookie, i.e. a single letter preceded by a hash
>   sign # and enclosed within square brackets.  Case is significant.

I suggest dropping "Case is significant" (or maybe give the whole story :
IIRC, it is the ascii code of the given letter that is used as priority)

>   ╭────
>   │ *

I don't see a space character after that one in your email and it
doesn't seem to be recognized as a headline by the exporter (hence my
above suggestion)

>   If the first word appearing in the title is `org-comment-keyword',
>   the

That should be `org-comment-string' I guess.

>   A headline contains directly at most one section, followed by any
>   number of headlines.  Only a section can contain another section.

From what I understand, "A section is delimited by two headlines (and
buffer limits)." [I initially thought it was "by two headlines of the
same level", which it is not from the structure example you give later.]

>   A section contains directly any greater element or element.  Only
>   a headline can contain a section.  As an exception, text before the
>   first headline in the document also belongs to a section.


>   In a quoted headline contains a section, the latter will be considered
>   as a “quote section”.

s/In/If/
unsure: s/quote section/quoted section/ ?

>   As an example, consider the following document:

<snip, useful example>

>   BACKEND is a string constituted of alpha-numeric characters, hyphens
>   or underscores.

I suggest: BACKEND is a string which is an element of (mapcar 'car
org-export-registered-backends).

>   OPTIONAL and VALUE can contain any character but a new line.  Only
>   keywords in `org-element-dual-keywords' can have an optional value.

I guess OPTIONAL cannot contain a closing square bracket ]

>   An affiliated keyword can appear on multiple lines if KEY belongs to
>   `org-element-multiple-keywords' or if its pattern is “#+ATTR_BACKEND:
>   VALUE”.

I suggest s/on multiple lines/more than once/

>   PARAMETERS can contain any character, and can be omitted.

any other than new line, I guess.

>   CONTENTS can contain any element, but another greater block of the
>   same type.

What is the type of a greater block ? the /name/ ?

I did have a quick look at the rest of your mail, and it is very nice to
have all of it written down explicitly, so again a big thanks for all of
this (and the rest of your) work. Unfortunately I don't have much time
right now to read it thoroughly, so just one single comment:

>        Even the LaTeX community suggests to use `\(...\)' over
>        `$...$'.  — ngz

AFAIK that's not for technical reasons and also I would be curious to
know who does that in real documents: '$' is so much more convenient.
But one might think of rebinding $ to a command which would insert \( and
\) appropriately within org-mode (see below). (OTOH, there are technical
reasons for avoiding $$...$$.)

Here is some elisp for the above behaviour:
(defun yf/org-electric-dollar ()
  "When called once, insert \\(\\) and leave point in between.
When called twice, replace the previously inserted \\(\\) by one $."
  (interactive)
  ;; Point sits between a freshly inserted \( and \): collapse them to "$".
  (if (and (looking-at "\\\\)")
           (looking-back "\\\\(" (max (point-min) (- (point) 2))))
      (progn (delete-char 2)
             (delete-char -2)
             (insert "$"))
    ;; Otherwise insert the pair and move point back between the delimiters.
    (insert "\\(\\)")
    (backward-char 2)))
(define-key org-mode-map (kbd "$") #'yf/org-electric-dollar)

-- 
N.


* Re: [RFC] Org syntax (draft)
  2013-03-08 15:23 ` Nicolas Richard
@ 2013-03-08 22:06   ` Nicolas Goaziou
  2013-03-09 10:52     ` Waldemar Quevedo
  2013-03-13 14:07     ` Nicolas Richard
  0 siblings, 2 replies; 37+ messages in thread
From: Nicolas Goaziou @ 2013-03-08 22:06 UTC (permalink / raw)
  To: Nicolas Richard; +Cc: emacs-orgmode

Hello,

"Nicolas Richard" <theonewiththeevillook@yahoo.fr> writes:

> Nicolas Goaziou <n.goaziou@gmail.com> writes:
>> As discussed a few days ago, here is a document describing the complete
>> Org syntax as read by the parser. I also added some comments. I am going
>> to put the Org file on Worg, so anyone can update it and fix mistakes.
>
> [for the record, the org file mentionned by Nicolas is currently at
> <http://orgmode.org/worg/dev/org-syntax.org>]
>
> This looks truly awesome. I give some (naïve) comments below, from my
> non-expert point of view.

Thank you for your comments.

>> The paragraph is the unit of measurement.  An element defines
>> syntactical parts that are at the same level as a paragraph, i.e. which
>> cannot contain or be included in a paragraph.  An object is a part that
>> could be included in an element.  Greater elements are all parts that
>> can contain an element.
>
> This is very clear but I'm slightly worried about confusion that might come
> from "Greater element" not being an "element", and the word "element"
> being a common word :

Here "element" means "Element + Greater Element". It is to be understood
as the opposite of object. I think there shouldn't be much ambiguity
according to context.

>> Empty lines belong to the largest element ending before them.  For
>> example, in a list, empty lines between items belong are part of the
>> item before them, but empty lines at the end of a list belong to the
>> plain list element.
>
> Is the word "element" (in /largest element ending.../) to be understood
> as an "element" from the above definition ? I guess not (this would
> require both list items and plain lists to be on the level 'element',
> from your example)

Again, it's a shortcut for "in the largest element or greater element
ending before them".

>> 1 Headlines and Sections
>> ════════════════════════
>>
>>   A headline is defined as:
>>
>>   ╭────
>>   │ STARS KEYWORD PRIORITY TITLE TAGS
>>   ╰────
>>
>>   STARS is a string starting at column 0 and containing at least one
>>   asterisk (and up to `org-inlinetask-min-level' if `org-inlinetask'
>>   library is loaded).  It’s the sole compulsory part of a headline.
>
> Perhaps it should be mentionned that STARS has to end by a space (see
> below).

I agree.

> I suggest adding : The number of stars defines the level of the
> headline.

Does it belong to the syntax definition? Level is how Org uses syntax
internally. Also the sentence, although right, is misleading, because
level definition also depends on `org-odd-levels-only'.

>>   KEYWORD is a TODO keyword, which have to belong to the list defined in
>>   `org-todo-keywords'.  Case is significant.
>
> The option #+TODO: is used also.

Then it should be ~org-todo-keywords-1~, which is where all TODO
keywords are added eventually.

>>   PRIORITY is a priority cookie, i.e. a single letter preceded by a hash
>>   sign # and enclosed within square brackets.  Case is significant.
>
> I suggest dropping "Case is significant" (or maybe give the whole story :
> IIRC, it is the ascii code of the given letter that is used as
> priority)

I'm not sure that the purpose of this document should be to explain how
syntax will be used.
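
For readers writing an external parser, the headline pattern as discussed
so far (STARS followed by a space, then optional KEYWORD, PRIORITY, TITLE
and TAGS) could be sketched in Python roughly as follows. This is only an
illustration under stated assumptions, not how Org Elements implements
it: TODO_KEYWORDS is a hypothetical stand-in for the contents of
org-todo-keywords-1, the tag character set follows the draft's
description of TAGS, and the level is simply the number of stars, which
ignores `org-odd-levels-only'.

```python
import re

# Hypothetical keyword set; in Org it would come from org-todo-keywords-1.
TODO_KEYWORDS = ("TODO", "DONE")

_KEYWORDS = "|".join(re.escape(k) for k in TODO_KEYWORDS)

HEADLINE_RE = re.compile(
    r"^(?P<stars>\*+) "                          # STARS at column 0, then a space
    r"(?:(?P<keyword>" + _KEYWORDS + r")(?: |$))?"  # optional KEYWORD, case-sensitive
    r"(?:\[#(?P<priority>[A-Za-z])\](?: |$))?"   # optional PRIORITY cookie
    r"(?P<title>.*?)"                            # TITLE (may be empty)
    r"(?:[ \t]+(?P<tags>:(?:[\w@#%]+:)+))?"      # optional TAGS, e.g. :a:b:
    r"[ \t]*$"
)


def parse_headline(line):
    """Return a dict describing the headline, or None if LINE is not one."""
    m = HEADLINE_RE.match(line)
    if m is None:
        return None
    parts = m.groupdict()
    # Assumption: level is the raw star count (no org-odd-levels-only).
    parts["level"] = len(parts.pop("stars"))
    parts["tags"] = parts["tags"].strip(":").split(":") if parts["tags"] else []
    return parts
```

With this sketch, a lone "*" without a trailing space is rejected, and
"* todo item" has no keyword because case is significant.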

>>   ╭────
>>   │ *
>
> I don't see a space character after that one in your email and it
> doesn't seem to be recognized as a headline by the exporter (hence my
> above suggestion)
>
>>   If the first word appearing in the title is `org-comment-keyword',
>>   the
>
> That should be `org-comment-string' I guess.

Indeed. Btw, I think this variable should be a defconst, not
a defcustom. It just makes things harder for little benefit.

>>   A headline contains directly at most one section, followed by any
>>   number of headlines.  Only a section can contain another section.
>
> From what I understand, "A section is delimited by two headlines (and
> buffer limits)." [I initially thought it was "by two headlines of the
> same level", which it is not from the structure example you give
> later.]

"Only a section can contain another section" is wrong. It should be
removed.

>>   A section contains directly any greater element or element.  Only
>>   a headline can contain a section.  As an exception, text before the
>>   first headline in the document also belongs to a section.
>
>
>>   In a quoted headline contains a section, the latter will be considered
>>   as a “quote section”.
>
> s/In/If/

Yes.

> unsure: s/quote section/quoted section/ ?

No, it is "quote section".

>>   BACKEND is a string constituted of alpha-numeric characters, hyphens
>>   or underscores.
>
> I suggest: BACKEND is a string which is an element of (mapcar 'car
> org-export-registered-backends).

Not really. Parser can understand #+attr_foo even if foo is not
registered as a valid back-end.

>>   OPTIONAL and VALUE can contain any character but a new line.  Only
>>   keywords in `org-element-dual-keywords' can have an optional value.
>
> I guess OPTIONAL cannot contain a closing square bracket ]

It can.
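
A rough Python sketch of the "#+KEY[OPTIONAL]: VALUE" and
"#+ATTR_BACKEND: VALUE" patterns may help here. It is an illustration
only: the names AFFILIATED_RE, ATTR_RE and parse_affiliated are made up
for this example, the greedy match for OPTIONAL (so that it may contain
a closing bracket) is an assumption, and the sketch checks neither that
KEY is really an affiliated keyword nor that it belongs to
`org-element-dual-keywords'.

```python
import re

AFFILIATED_RE = re.compile(
    r"^[ \t]*#\+(?P<key>[^\s\[:]+)"   # KEY
    r"(?:\[(?P<optional>.*)\])?"      # [OPTIONAL], greedy so it may contain "]"
    r":[ \t]*(?P<value>.*)$"          # VALUE, anything but a new line
)

# BACKEND: alpha-numeric characters, hyphens or underscores.
ATTR_RE = re.compile(r"^ATTR_(?P<backend>[A-Za-z0-9_-]+)$", re.IGNORECASE)


def parse_affiliated(line):
    """Split LINE into key, backend, optional and value, or return None."""
    m = AFFILIATED_RE.match(line)
    if m is None:
        return None
    key = m.group("key").upper()
    attr = ATTR_RE.match(key)
    return {
        "key": key,
        # The back-end name is accepted whether or not it is registered.
        "backend": attr.group("backend").lower() if attr else None,
        "optional": m.group("optional"),
        "value": m.group("value"),
    }
```

Note that "#+attr_foo: ..." is accepted without consulting any list of
registered back-ends, which matches the point made above.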

>>   An affiliated keyword can appear on multiple lines if KEY belongs to
>>   `org-element-multiple-keywords' or if its pattern is “#+ATTR_BACKEND:
>>   VALUE”.
>
> I suggest s/on multiple lines/more than once/

Ok.

>>   PARAMETERS can contain any character, and can be omitted.
>
> any other than new line, I guess.

Correct.

>>   CONTENTS can contain any element, but another greater block of the
>>   same type.
>
> What is the type of a greater block ? the /name/ ?

Yes. I think it would be better to say something like: CONTENTS cannot
contain the string "#+END_NAME" on a line on its own.
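
That rule can be sketched in Python as follows: the block ends at the
first line consisting only of "#+END_NAME", so blocks with a different
name may nest freely inside CONTENTS. This is a minimal illustration,
not the Org Elements implementation; read_greater_block is a
hypothetical helper, lines are assumed to carry no trailing newline, and
case-insensitive matching of the markers is an assumption.

```python
import re

BEGIN_RE = re.compile(
    r"^[ \t]*#\+BEGIN_(?P<name>\S+)(?:[ \t]+(?P<parameters>.*))?$",
    re.IGNORECASE,
)


def read_greater_block(lines, start=0):
    """Return (name, parameters, contents, next_index), or None.

    LINES is a list of strings without trailing newlines; START is the
    index of the candidate "#+BEGIN_NAME" line.
    """
    m = BEGIN_RE.match(lines[start])
    if m is None:
        return None
    name = m.group("name")
    end_re = re.compile(
        r"^[ \t]*#\+END_" + re.escape(name) + r"[ \t]*$", re.IGNORECASE
    )
    for i in range(start + 1, len(lines)):
        # The first "#+END_NAME" alone on its line closes the block.
        if end_re.match(lines[i]):
            return name, m.group("parameters") or "", lines[start + 1:i], i + 1
    return None  # no closing line: not a block
```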

> I did have a quick look at the rest of your mail, and it is very nice to
> have all of it written down explicitly, so again a big thanks for all of
> this (and the rest of your) work. Unfortunately I don't have much time
> right now to read it thoroughtfully, so just one single comment :
>
>>        Even the LaTeX community suggests to use `\(...\)' over
>>        `$...$'.  — ngz
>
> AFAIK that's not for technical reasons and also I would be curious to
> know who does that in real documents : '$' is so much more convenient.

Yes, I mixed $$...$$ and $...$. This sentence could be removed. Though
I still maintain my POV about $...$. It may be convenient in a latex
file, but in a free-form text format like Org, it's error prone.

I also forgot to write about optional #+tblfm: line below Org tables.

Would you (or someone) mind updating the org-syntax.org file on Worg?

Thank you again.


Regards,

-- 
Nicolas Goaziou


* Re: [RFC] Org syntax (draft)
  2013-03-08 22:06   ` Nicolas Goaziou
@ 2013-03-09 10:52     ` Waldemar Quevedo
  2013-03-09 14:23       ` Carsten Dominik
                         ` (3 more replies)
  2013-03-13 14:07     ` Nicolas Richard
  1 sibling, 4 replies; 37+ messages in thread
From: Waldemar Quevedo @ 2013-03-09 10:52 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Nicolas Richard, emacs-orgmode

Hey Nicolas, this looks very detailed and I think it could be useful
for people trying to write other parser implementations for org-mode.
Thanks for sharing!

By the way, is there a set of examples somewhere of Emacs
org-mode -> html conversion for all org-mode features?
(How are changes to the org-mode -> html conversion in Emacs tested
during development?)

I am maintaining the org-ruby gem, which is used to render org-mode texts to html,
and currently there is no "roadmap" of features to implement for it.
As a result, features and tweaks are added to the library
whenever someone submits a ticket requesting the feature on GitHub.
(Here is a list of the export features supported in case someone wants
to take a look:
https://github.com/bdewey/org-ruby/tree/master/spec/html_examples )
Having a set of example features from org-mode would be very useful
to see how much coverage other implementations of org-mode exporting
features have.

Cheers everyone, keep org-mode being an awesome tool :)

- Waldemar



* Re: [RFC] Org syntax (draft)
  2013-03-09 10:52     ` Waldemar Quevedo
@ 2013-03-09 14:23       ` Carsten Dominik
  2013-03-09 14:42         ` Nicolas Goaziou
  2013-03-15 20:22       ` Nicolas Goaziou
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 37+ messages in thread
From: Carsten Dominik @ 2013-03-09 14:23 UTC (permalink / raw)
  To: Waldemar Quevedo; +Cc: Nicolas Richard, emacs-orgmode, Nicolas Goaziou


On 9.3.2013, at 11:52, Waldemar Quevedo <waldemar.quevedo@gmail.com> wrote:

> Hey Nicolas, this looks very detailed and I think it could be useful
> for people trying to write other parsers implementations for org-mode.
> Thanks for sharing!

Maybe someone knowledgeable can turn Nicolas's description into a formal parser description that can then be used by something like yacc to produce code for arbitrary languages?  I am not sure if I am making sense though.

- Carsten



* Re: [RFC] Org syntax (draft)
  2013-03-09 14:23       ` Carsten Dominik
@ 2013-03-09 14:42         ` Nicolas Goaziou
  2013-03-09 15:05           ` Carsten Dominik
  0 siblings, 1 reply; 37+ messages in thread
From: Nicolas Goaziou @ 2013-03-09 14:42 UTC (permalink / raw)
  To: Carsten Dominik; +Cc: Nicolas Richard, Waldemar Quevedo, emacs-orgmode

Hello,

Carsten Dominik <carsten.dominik@gmail.com> writes:

> On 9.3.2013, at 11:52, Waldemar Quevedo <waldemar.quevedo@gmail.com> wrote:
>
>> Hey Nicolas, this looks very detailed and I think it could be useful
>> for people trying to write other parsers implementations for org-mode.
>> Thanks for sharing!
>
> Maybe someone knowledgeable can turn Nicola's description into
> a formal parser description that can then be used by something like
> yacc to produce code for arbitrary languages? I am not sure if I am
> making sense though.

*cough* you mean GNU Bison or, perhaps better, Wisent (from Semantic).
I don't know how well they handle context-sensitive grammars, though.


Regards,

-- 
Nicolas Goaziou


* Re: [RFC] Org syntax (draft)
  2013-03-09 14:42         ` Nicolas Goaziou
@ 2013-03-09 15:05           ` Carsten Dominik
  0 siblings, 0 replies; 37+ messages in thread
From: Carsten Dominik @ 2013-03-09 15:05 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Nicolas Richard, Waldemar Quevedo, emacs-orgmode


On 9.3.2013, at 15:42, Nicolas Goaziou <n.goaziou@gmail.com> wrote:

> Hello,
> 
> Carsten Dominik <carsten.dominik@gmail.com> writes:
> 
>> On 9.3.2013, at 11:52, Waldemar Quevedo <waldemar.quevedo@gmail.com> wrote:
>> 
>>> Hey Nicolas, this looks very detailed and I think it could be useful
>>> for people trying to write other parsers implementations for org-mode.
>>> Thanks for sharing!
>> 
>> Maybe someone knowledgeable can turn Nicola's description into
>> a formal parser description that can then be used by something like
>> yacc to produce code for arbitrary languages? I am not sure if I am
>> making sense though.
> 
> *cough* you mean GNU Bison

Told you I am not sure if I am making sense.

Anyway, a general parser would be useful for extensions like org-ruby...

- Carsten


* Re: [RFC] Org syntax (draft)
  2013-03-07 20:37 Nicolas Goaziou
                   ` (4 preceding siblings ...)
  2013-03-08 15:23 ` Nicolas Richard
@ 2013-03-09 23:16 ` Achim Gratz
  2013-03-09 23:49   ` Nicolas Goaziou
  2013-03-17  7:18 ` Achim Gratz
  6 siblings, 1 reply; 37+ messages in thread
From: Achim Gratz @ 2013-03-09 23:16 UTC (permalink / raw)
  To: emacs-orgmode

Hi Nicolas,

here are my first comments. I'm still trying to wrap my head around some
things, so if I'm off the map on something, please be patient.

Do you mind if I fix some obvious typos directly on Worg or would you
rather have patches?


Nicolas Goaziou writes:
> A core concept in this syntax is that only headlines and sections are
> context-free[1][2].  Every other syntactical part only exists within
> specific environments.

Blank lines or empty lines are also context-free syntactical elements,
I'd think.

> Three categories are used to classify these environments: “Greater
> elements”, “elements”, and “objects”, from the broadest scope to the
> narrowest.

It might be easier to talk about those things if "Greater Element" was
called "Collection" to perhaps keep with the thingies theme of naming
the syntax.

> The paragraph is the unit of measurement.  An element defines
> syntactical parts that are at the same level as a paragraph, i.e. which
> cannot contain or be included in a paragraph.  An object is a part that
> could be included in an element.  Greater elements are all parts that
> can contain an element.

Here's my main contention with that model: I think there should be a
greater element, maybe named "paragraph block", that translates into a
paragraph at the backend level.  Most backends will have a paragraph
model that is much less limited than what the current definition of an
Org paragraph is.  This could optionally be an implicit greater
block that is defined by the presence or absence of blank lines between
elements, I'd think.

> 3.1 Greater Blocks
> ──────────────────

The same naming confusion as with the various "elements"; for now I'd
like to think of these as "Box".

>   Greater blocks consist in the following pattern:
>
>   ╭────
>   │ #+BEGIN_NAME PARAMETERS
>   │ CONTENTS
>   │ #+END_NAME
>   ╰────

I'm beginning to wonder if these should have the same syntax as blocks.
Maybe that's too fine a distinction visually, but adding a colon would
disambiguate the greater blocks from the normal ones.  In other words

#+BEGIN_CENTER: humdum
…
#+END_CENTER:

would be a center block, while

#+BEGIN_CENTER humdum
…
#+END_CENTER

would be an export block for the center backend.

> 4.2 Blocks
> ──────────
>
>   Like [greater blocks], pattern for blocks is:
>
>   ╭────
>   │ #+BEGIN_NAME DATA
>   │ CONTENTS
>   │ #+END_NAME
>   ╰────
[…]
>   DATA can contain any character but a new line.

I'd keep with PARAMETERS here.

>   If NAME is a string matching the name of any export back-end loaded,
>   the block will be an “export block”.

Conversely, blocks that do not have a recognizable name will simply
insert their content as if the block markers were not there, i.e. it
seems to treat these as parsed blocks.  I don't think this should
happen; instead Org should parse this as an unknown export backend and
drop the content with a warning, not unlike a comment.

This will be a major sticking point with external parsers: they'd
otherwise need to know about the Org export backends to know when to use
the content of the block and when not.  A portable Org document should be
able to specify which export backends it expects to be available (and
maybe what standard backend it is derived from) to elicit the correct
behaviour.

>   CONTENTS can contain any character, including new lines.  Though it
>   will only contain Org objects if the block is a verse block.
>   Otherwise, contents will not be parsed.

Would it make sense to make a general distinction between parsed and
non-parsed blocks based on some configuration, even though this would
produce the same issue as with export backends?

>         I suggest to remove angle links.  If one needs spaces in
>         PATH, she can use standard link syntax instead.

They are ubiquitous on certain platforms, so copy&paste would become
frustrating there if you needed to re-format them each time around.



Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf Blofeld V1.15B11:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-09 23:16 ` Achim Gratz
@ 2013-03-09 23:49   ` Nicolas Goaziou
  2013-03-10  4:35     ` Jambunathan K
  0 siblings, 1 reply; 37+ messages in thread
From: Nicolas Goaziou @ 2013-03-09 23:49 UTC (permalink / raw)
  To: Achim Gratz; +Cc: emacs-orgmode

Hello,

Achim Gratz <Stromeko@nexgo.de> writes:

> Do you mind if I fix some obvious typos directly on Worg, or would you
> rather have patches?

Please go ahead. This is on Worg so anyone can improve it.

> Nicolas Goaziou writes:
>> A core concept in this syntax is that only headlines and sections are
>> context-free[1][2].  Every other syntactical part only exists within
>> specific environments.
>
> Blank lines or empty lines are also context-free syntactical elements,
> I'd think.

No, they aren't, as they belong to the broadest element ending before
them. So you need to know what this element is.

>> Three categories are used to classify these environments: “Greater
>> elements”, “elements”, and “objects”, from the broadest scope to the
>> narrowest.
>
> It might be easier to talk about those things if "Greater Element" was
> called "Collection" to perhaps keep with the thingies theme of naming
> the syntax.

Collection could also be ambiguous as a paragraph may be seen as
a collection of objects.

>> The paragraph is the unit of measurement.  An element defines
>> syntactical parts that are at the same level as a paragraph, i.e. which
>> cannot contain or be included in a paragraph.  An object is a part that
>> could be included in an element.  Greater elements are all parts that
>> can contain an element.
>
> Here's my main contention with that model: I think there should be a
> greater element, maybe named "paragraph block", that translates into a
> paragraph at the backend level.  Most backends will have a paragraph
> model that is much less limited than what the current definition of an
> Org paragraph is.  This could optionally be an implicit greater
> block that is defined by the presence or absence of blank lines between
> elements, I'd think.

I don't get it. What would be the exact definition of a "paragraph
block"? What limitations are you talking about?

>> 3.1 Greater Blocks
>> ──────────────────
>
> The same naming confusion as with the various "elements"; for now I'd
> like to think of these as "Box".

This naming was for the org-syntax file only. "Greater blocks" means
nothing for org-element.el, but "center block", "quote block", "special
block" do.

>>   Greater blocks consist in the following pattern:
>>
>>   ╭────
>>   │ #+BEGIN_NAME PARAMETERS
>>   │ CONTENTS
>>   │ #+END_NAME
>>   ╰────
>
> I'm beginning to wonder if these should have the same syntax as blocks.
> Maybe that's too fine a distinction visually, but adding a colon would
> disambiguate the greater blocks from the normal ones.  In other words
>
> #+BEGIN_CENTER: humdum
> …
> #+END_CENTER:
>
> would be a center block, while
>
> #+BEGIN_CENTER humdum
> …
> #+END_CENTER
>
> would be an export block for the center backend.

I agree. More on that below.

>> 4.2 Blocks
>> ──────────
>>
>>   Like [greater blocks], pattern for blocks is:
>>
>>   ╭────
>>   │ #+BEGIN_NAME DATA
>>   │ CONTENTS
>>   │ #+END_NAME
>>   ╰────
> […]
>>   DATA can contain any character but a new line.
>
> I'd keep with PARAMETERS here.

Ok. Just fix it.

>>   If NAME is a string matching the name of any export back-end loaded,
>>   the block will be an “export block”.
>
> Conversely, blocks that do not have a recognizable name will simply
> insert their content as if the block markers were not there, i.e. Org
> seems to treat these as parsed blocks.

You are talking about "special blocks", right? They have a special
purpose. In the latex back-end,

  #+begin_special
  ...
  #+end_special

becomes

  \begin{special}
  ...
  \end{special}

IOW this is an Org feature.
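
As a rough illustration of that mapping, a Python sketch follows.  It is
not the exporter's code: the helper name `special_block_to_latex` is
hypothetical, it ignores block parameters and nesting checks, and it
copies the contents through unchanged, whereas the real back-end exports
them.

```python
import re

# Markers are matched case-insensitively, one per line.
BEGIN_RE = re.compile(r"#\+begin_(\S+)", re.IGNORECASE)
END_RE = re.compile(r"#\+end_(\S+)", re.IGNORECASE)


def special_block_to_latex(org_text):
    """Turn '#+begin_NAME' / '#+end_NAME' lines into a LaTeX environment.

    Simplified, hypothetical helper: every other line is kept verbatim.
    """
    out = []
    for line in org_text.splitlines():
        stripped = line.strip()
        begin = BEGIN_RE.fullmatch(stripped)
        end = END_RE.fullmatch(stripped)
        if begin:
            out.append("\\begin{%s}" % begin.group(1))
        elif end:
            out.append("\\end{%s}" % end.group(1))
        else:
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    print(special_block_to_latex("#+begin_special\n...\n#+end_special"))
```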

> I don't think this should happen, instead Org should parse this as an
> unknown export backend and drop the content with a warning, not unlike
> a comment.

This would remove special blocks.

> This will be a major sticking point with external parsers: they'd
> otherwise need to know about the Org export backends to know when to use
> the content of the block and when not.  A portable Org document should be
> able to specify which export backends it expects to be available (and
> maybe what standard backend it is derived from) to elicit the correct
> behaviour.

I agree, as noted above. If we want to separate Org /format/ from
Emacs, we need to separate special blocks from export blocks. The former
cannot be the fallback type when the latter isn't recognized.

In that case, a different syntax for export blocks would be needed.
Maybe the colons you suggested above. I think that something more
visible would be better, though.

>>   CONTENTS can contain any character, including new lines.  Though it
>>   will only contain Org objects if the block is a verse block.
>>   Otherwise, contents will not be parsed.
>
> Would it make sense to make a general distinction between parsed and
> non-parsed blocks based on some configuration, even though this would
> produce the same issue as with export backends?

This is inherent to the block type. This mustn't be configurable.
There is no point in parsing a src-block, for example. On the other
hand, if you don't parse (partially) contents of a verse-block, you get
an example-block, and one of them becomes useless.

Then, there are special blocks. It was suggested, a few days ago, that
a parameter could be set in order to tell the parser what to do with
their contents. That's an interesting idea. But it only makes sense if
there is also a way to specify a transformation function on these
contents (otherwise, an export block would be sufficient). Also the same
could be achieved with Babel, the non-parsed data being an example
block, and the transformation function a src-block.


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-09 23:49   ` Nicolas Goaziou
@ 2013-03-10  4:35     ` Jambunathan K
  2013-03-10  7:08       ` Nicolas Goaziou
  0 siblings, 1 reply; 37+ messages in thread
From: Jambunathan K @ 2013-03-10  4:35 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Achim Gratz, emacs-orgmode


Nicolas

>> Do you mind if I fix some obvious typos directly on Worg, or would you
>> rather have patches?
>
> Please go ahead. This is on Worg so anyone can improve it.

Please consider adding the Org spec (and also the exporter reference)
document to the Org manual.

This will be a good excuse for exercising the Texinfo exporter and seeing
where it leads.

Committing to Org or Worg has the same load cycle.  I feel there is more
value if it is right within the Org manual (i.e., part of Emacs).

The only reason you may want to have it on Worg is possibly because it is
likely to be read by a wider audience.  People seem to like browsers.

(You can consider building a standalone PDF/HTML document out of .texi
sources and have people read it)

-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-10  4:35     ` Jambunathan K
@ 2013-03-10  7:08       ` Nicolas Goaziou
  2013-03-10 10:14         ` Bastien
  0 siblings, 1 reply; 37+ messages in thread
From: Nicolas Goaziou @ 2013-03-10  7:08 UTC (permalink / raw)
  To: Jambunathan K; +Cc: Achim Gratz, emacs-orgmode

Hello,

Jambunathan K <kjambunathan@gmail.com> writes:

> Please consider adding the Org spec (and also the exporter reference)
> document to the Org manual.
>
> This will be a good excuse for exercising the Texinfo exporter and seeing
> where it leads.
>
> Committing to Org or Worg has the same load cycle.  I feel there is more
> value if it is right within the Org manual (i.e., part of Emacs).
>
> The only reason you may want to have it on Worg is possibly because it is
> likely to be read by a wider audience.  People seem to like browsers.

It is not ready to go into the manual in its current state. As specified
in its title, it's nothing more than a draft. Some parts have to be
rewritten, some information is missing, my notes have to be removed,
etc. Once it becomes clear enough, Bastien may consider adding it to the
manual.

The same holds for the exporter document.


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-10  7:08       ` Nicolas Goaziou
@ 2013-03-10 10:14         ` Bastien
  2013-03-10 10:16           ` Bastien
                             ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Bastien @ 2013-03-10 10:14 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Achim Gratz, emacs-orgmode, Jambunathan K

Hi Nicolas,

the manual would enjoy a subsection in "Hacking" on how to create
a new exporter, either from scratch or as a derived exporter.
(Such a subsection can be short enough, thanks to derived backends.)

From this section, we can throw links to the exporter reference
document and the Org syntax document published on Worg.

We may also add footnotes referring to the Org syntax relevant
sections, when needed.

But both reference documents don't fit into the manual IMO.  They
are great resources for developers, not for users.  The footnotes
are enough for advanced users who want to go beyond the manual.

Thanks!

-- 
 Bastien

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-10 10:14         ` Bastien
@ 2013-03-10 10:16           ` Bastien
  2013-03-10 13:07             ` Achim Gratz
  2013-03-10 15:44           ` Jambunathan K
  2013-04-09 16:37           ` Bastien
  2 siblings, 1 reply; 37+ messages in thread
From: Bastien @ 2013-03-10 10:16 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Achim Gratz, emacs-orgmode, Jambunathan K

Bastien <bzg@altern.org> writes:

> But both reference documents don't fit into the manual IMO.  They
> are great resources for developers, not for users.  The footnotes
> are enough for advanced users who want to go beyond the manual.

That said, we can also bundle both documents into Org's distribution,
as .org files in the doc/ directory.  And have a make rule to convert
them to .pdf and info docs.

-- 
 Bastien

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-10 10:16           ` Bastien
@ 2013-03-10 13:07             ` Achim Gratz
  2013-03-10 14:11               ` Bastien
  0 siblings, 1 reply; 37+ messages in thread
From: Achim Gratz @ 2013-03-10 13:07 UTC (permalink / raw)
  To: emacs-orgmode

Bastien writes:
> That said, we can also bundle both documents into Org's distribution,
> as .org files in the doc/ directory.  And have a make rule to convert
> them to .pdf and info docs.

I don't want to be the party pooper, but if these documents should go
into the distribution, then we must insist that all modifications be FSF
copyrighted, otherwise we'd have to remove and/or rewrite the ones that
aren't when they are incorporated into the distribution.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptations for Waldorf Q V3.00R3 and Q+ V3.54R2:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-10 13:07             ` Achim Gratz
@ 2013-03-10 14:11               ` Bastien
  2013-03-10 16:02                 ` Achim Gratz
  0 siblings, 1 reply; 37+ messages in thread
From: Bastien @ 2013-03-10 14:11 UTC (permalink / raw)
  To: Achim Gratz; +Cc: emacs-orgmode

Achim Gratz <Stromeko@nexgo.de> writes:

> Bastien writes:
>> That said, we can also bundle both documents into Org's distribution,
>> as .org files in the doc/ directory.  And have a make rule to convert
>> them to .pdf and info docs.
>
> I don't want to be the party pooper, but if these documents should go
> into the distribution, then we must insist that all modifications be FSF
> copyrighted, otherwise we'd have to remove and/or rewrite the ones that
> aren't when they are incorporated into the distribution.

No, the documents can go into the distribution with contributions from
anyone, because they won't be in Emacs.  FSF assignment is needed only
for things that go into Emacs.

Best,

-- 
 Bastien

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-10 10:14         ` Bastien
  2013-03-10 10:16           ` Bastien
@ 2013-03-10 15:44           ` Jambunathan K
  2013-03-14 16:58             ` Eric S Fraga
  2013-04-09 16:37           ` Bastien
  2 siblings, 1 reply; 37+ messages in thread
From: Jambunathan K @ 2013-03-10 15:44 UTC (permalink / raw)
  To: Bastien; +Cc: emacs-orgmode


Bastien

> But both reference documents don't fit into the manual IMO.  

You are a jerk, a BIG JERK. 

Jambunathan K.
-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-10 14:11               ` Bastien
@ 2013-03-10 16:02                 ` Achim Gratz
  2013-03-10 16:09                   ` Jambunathan K
  0 siblings, 1 reply; 37+ messages in thread
From: Achim Gratz @ 2013-03-10 16:02 UTC (permalink / raw)
  To: emacs-orgmode

Bastien writes:
> No, the documents can go into the distribution with contributions from
> anyone, because they won't be in Emacs.  FSF assignment is needed only
> for things that go into Emacs.

I understood that these or substantial parts of it will end up in the
Org manual, which is in Emacs.  If that's not the case, then disregard
my comment.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf rackAttack V1.04R1:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-10 16:02                 ` Achim Gratz
@ 2013-03-10 16:09                   ` Jambunathan K
  2013-03-10 17:12                     ` Achim Gratz
  0 siblings, 1 reply; 37+ messages in thread
From: Jambunathan K @ 2013-03-10 16:09 UTC (permalink / raw)
  To: Achim Gratz; +Cc: emacs-orgmode

Achim Gratz <Stromeko@nexgo.de> writes:

> Bastien writes:
>> No, the documents can go into the distribution with contributions from
>> anyone, because they won't be in Emacs.  FSF assignment is needed only
>> for things that go into Emacs.
>
> I understood that these or substantial parts of it will end up in the
> Org manual, which is in Emacs.  If that's not the case, then disregard
> my comment.

Emacs lisp has a manual of its own.  I don't see how Org export
reference *cannot* end in Emacs.  

Bastien is doing what(ever) suits his whims and you are approving of it.
I disapprove of what you are doing, Achim.  Export syntax deserves to be
part of Org/Emacs.  Let the maintainer go to hell.  He is talking
irreverently/hand-wavingly about some work which has stretched to good
part of around 3 years.  The Orgmode maintainership is in the hands of
the wrong person, he calls shots based on his own whims and I regret it.

Jambunathan K.

>
>
> Regards,
> Achim.

-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-10 16:09                   ` Jambunathan K
@ 2013-03-10 17:12                     ` Achim Gratz
  2013-03-10 21:44                       ` Jonathan Leech-Pepin
  0 siblings, 1 reply; 37+ messages in thread
From: Achim Gratz @ 2013-03-10 17:12 UTC (permalink / raw)
  To: emacs-orgmode

Jambunathan K writes:
> Emacs lisp has a manual of its own.  I don't see how Org export
> reference *cannot* end in Emacs.  

I said that I'm expecting these references to become part of the
manual(s).  I still expect that and will try to help it along, but it
doesn't necessarily need to take the exact sequence of events that I
envisioned.

> Bastien is doing what(ever) suits his whims and you are approving of
> it.

I haven't approved or disapproved anything.  I have only stated the
plain fact that if my understanding of the future course of events is
incorrect, then my comment does not apply (and conversely, if it does,
then the issue I've stated needs to be dealt with).

> I disapprove of what you are doing, Achim.

You're welcome.  (Sun Tzu, III/2)


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptations for Waldorf Q V3.00R3 and Q+ V3.54R2:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-10 17:12                     ` Achim Gratz
@ 2013-03-10 21:44                       ` Jonathan Leech-Pepin
  0 siblings, 0 replies; 37+ messages in thread
From: Jonathan Leech-Pepin @ 2013-03-10 21:44 UTC (permalink / raw)
  To: Achim Gratz; +Cc: emacs-orgmode

Hello

On 10 March 2013 13:12, Achim Gratz <Stromeko@nexgo.de> wrote:

> Jambunathan K writes:
> > Emacs lisp has a manual of its own.  I don't see how Org export
> > reference *cannot* end in Emacs.
>
> I said that I'm expecting these references to become part of the
> manual(s).  I still expect that and will try to help it along, but it
> doesn't necessarily need to take the exact sequence of events that I
> envisioned.
>

I have to agree with Bastien that they do not really fit into the main
Org manual.

Providing them with Emacs (so that they are immediately available) is
a good thing in my mind, however I would put them as a separate
document similarly to how the Elisp manual is separate.

Regards,

Jon


> > Bastien is doing what(ever) suits his whims and you are approving of
> > it.
>
> I haven't approved or disapproved anything.  I have only stated the
> plain fact that if my understanding of the future course of events is
> incorrect, then my comment does not apply (and conversely, if it does,
> then the issue I've stated needs to be dealt with).
>
> > I disapprove of what you are doing, Achim.
>
> You're welcome.  (Sun Tzu, III/2)
>
>
> Regards,
> Achim.
> --
> +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
>
> SD adaptations for Waldorf Q V3.00R3 and Q+ V3.54R2:
> http://Synth.Stromeko.net/Downloads.html#WaldorfSDada
>
>
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
@ 2013-03-12 10:19 orgmode
  2013-03-13 15:33 ` Nicolas Goaziou
  0 siblings, 1 reply; 37+ messages in thread
From: orgmode @ 2013-03-12 10:19 UTC (permalink / raw)
  To: emacs-orgmode


Hi Nicolas,

great work! It's fantastic that orgmode now gets a specification.

What may help is to document the syntax in a machine-readable and somewhat
more formal way.  This ensures that there are fewer differences in
interpretation and that the specification can be used to generate an
orgmode parser directly.  An example could be how the IETF specifies
things; have a look at https://en.wikipedia.org/wiki/ABNF or EBNF.
It's not much different from what you have done, but it's less
ambiguous.
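
As a rough idea of what a machine-readable rule could look like, here is
a Python sketch that encodes the headline pattern "STARS KEYWORD
PRIORITY TITLE TAGS" from the draft as a regular expression.  Several
points are assumptions made for the example: the TODO keywords are
hard-coded to TODO and DONE (in Org they come from configuration), the
tag character class is one reading of the draft, and the names
`HEADLINE_RE` and `parse_headline` are invented here.

```python
import re

# Sketch of the headline rule.  Case is significant for the keyword,
# so no IGNORECASE flag is used.
HEADLINE_RE = re.compile(
    r"^(?P<stars>\*+) "                       # STARS, ended by a space
    r"(?:(?P<keyword>TODO|DONE) )?"           # KEYWORD (assumed set)
    r"(?:\[#(?P<priority>[A-Za-z])\] )?"      # PRIORITY cookie, e.g. [#A]
    r"(?P<title>.*?)"                         # TITLE, matched last
    r"(?:[ \t]+(?P<tags>:(?:[A-Za-z0-9_@#%]+:)+))?"  # TAGS, e.g. :a:b:
    r"[ \t]*$"
)


def parse_headline(line):
    """Return a dict of headline parts, or None if LINE is no headline."""
    match = HEADLINE_RE.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_headline("**** TODO [#A] COMMENT Title :tag:a2%:"))
```

A grammar in ABNF would state the same constraints declaratively; the
regular expression is only a stand-in to show how little room for
interpretation such a form leaves.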

Thanks for the great work!

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-08 22:06   ` Nicolas Goaziou
  2013-03-09 10:52     ` Waldemar Quevedo
@ 2013-03-13 14:07     ` Nicolas Richard
  2013-03-15 20:39       ` Nicolas Goaziou
  1 sibling, 1 reply; 37+ messages in thread
From: Nicolas Richard @ 2013-03-13 14:07 UTC (permalink / raw)
  To: emacs-orgmode

Hi,

I obviously did not send and actually lost a message I prepared two days
ago. I'll try again.

>> I suggest adding : The number of stars defines the level of the
>> headline.
>
> Does it belong to the syntax definition? Level is how Org uses syntax
> internally. Also the sentence, although right, is misleading, because
> level definition also depends on `org-odd-levels-only'.

I think it's partly in the syntax, since it defines "parentness" for
headlines (the numeric level is of no importance, but the relative level
is used).
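
A small Python sketch of that point, assuming headlines are simply lines
made of asterisks followed by a space (inlinetasks and
`org-odd-levels-only' are ignored, and the function name
`headline_parents` is invented for the example).  Only the comparison
between star counts matters, never the absolute number:

```python
import re

HEADLINE_RE = re.compile(r"^(\*+) (.*)$")


def headline_parents(text):
    """Return a list of (title, parent_title) pairs in document order.

    The parent of a headline is the closest preceding headline with
    strictly fewer stars; top-level headlines get None.
    """
    pairs = []
    stack = []  # (star_count, title) of the currently open ancestors
    for line in text.splitlines():
        match = HEADLINE_RE.match(line)
        if match is None:
            continue
        stars, title = len(match.group(1)), match.group(2)
        # Close every open headline that is not shallower than this one.
        while stack and stack[-1][0] >= stars:
            stack.pop()
        pairs.append((title, stack[-1][1] if stack else None))
        stack.append((stars, title))
    return pairs


if __name__ == "__main__":
    doc = "* A\n*** B\n** C\n* D\nsome text\n*** E"
    for title, parent in headline_parents(doc):
        print(title, "<-", parent)
```

Here "B" has three stars yet is a direct child of "A", which is why the
relative level, not the numeric one, belongs to the syntax.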

>
>> I suggest dropping "Case is significant" (or maybe give the whole story :
>> IIRC, it is the ascii code of the given letter that is used as
>> priority)
>
> I'm not sure that the purpose of this document should be to explain how
> syntax will be used.

That is why I suggested dropping the mention : case is not significant
for the syntax. Very minor though, obviously.

>> That should be `org-comment-string' I guess.
>
> Indeed. Btw, I think this variable should be a defconst, not
> a defcustom. It just makes things harder for little benefit.

As you know, "Comment" is also a French word meaning "how", and that
could very well appear uppercased as the first word of a title. (I'd
personally recommend against uppercasing titles, but I'd understand if
someone wanted to customize the word for such reasons)

> Would you (or Someone) mind updating the org-syntax.org file on Worg?

Please review the attached patch and apply parts as you wish (even if I
wanted to do it myself, I don't have worg access.)

Last word about #+TBLFM: I'm not sure if that should go into the "affiliated
keywords" section (thus rewriting parts of it, because that one goes
below the table, unlike other affiliated keywords) or a special section
on its own. Thus I'm not changing anything wrt that.

From f97c00bfbd8a14d0b2953ee0e8b6817a2b9f0306 Mon Sep 17 00:00:00 2001
From: Nicolas Richard <theonewiththeevillook@yahoo.fr>
Date: Mon, 11 Mar 2013 16:25:21 +0100
Subject: [PATCH] dev/org-syntax.org: minor

---
 dev/org-syntax.org | 34 +++++++++++++++++++---------------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/dev/org-syntax.org b/dev/org-syntax.org
index 9b2a843..a918a75 100644
--- a/dev/org-syntax.org
+++ b/dev/org-syntax.org
@@ -15,7 +15,8 @@ within specific environments.
 
 Three categories are used to classify these environments: "Greater
 elements", "elements", and "objects", from the broadest scope to the
-narrowest.
+narrowest.  The word "element" is used for both Greater and non-Greater
+elements, the context should make that clear.
 
 The paragraph is the unit of measurement.  An element defines
 syntactical parts that are at the same level as a paragraph,
@@ -41,16 +42,17 @@ Unless specified otherwise, case is not significant.
   STARS KEYWORD PRIORITY TITLE TAGS
   #+END_EXAMPLE
 
-  STARS is a string starting at column 0 and containing at least one
+  STARS is a string starting at column 0, containing at least one
   asterisk (and up to ~org-inlinetask-min-level~ if =org-inlinetask=
-  library is loaded).  It's the sole compulsory part of a headline.
+  library is loaded) and ended by a space character.  The number of
+  asterisks is used to define the level of the headline.  It's the
+  sole compulsory part of a headline.
 
   KEYWORD is a TODO keyword, which has to belong to the list defined
-  in ~org-todo-keywords~.  Case is significant.
+  in ~org-todo-keywords-1~.  Case is significant.
 
   PRIORITY is a priority cookie, i.e. a single letter preceded by
-  a hash sign # and enclosed within square brackets.  Case is
-  significant.
+  a hash sign # and enclosed within square brackets.
 
   TITLE can be made of any character but a new line.  Though, it will
   match after every other part have been matched.
@@ -71,7 +73,7 @@ Unless specified otherwise, case is not significant.
   ,**** TODO [#A] COMMENT Title :tag:a2%:
   #+END_EXAMPLE
     
-  If the first word appearing in the title is ~org-comment-keyword~,
+  If the first word appearing in the title is ~org-comment-string~,
   the headline will be considered as "commented".  If that first word
   is ~org-quote-string~, it will be considered as "quoted".  In both
   situations, case is significant.
@@ -82,14 +84,14 @@ Unless specified otherwise, case is not significant.
   If ~org-archive-tag~ is one of its tags, it will be considered as
   "archived".  Case is significant.
 
-  A headline contains directly at most one section, followed by any
-  number of headlines.  Only a section can contain another section.
+  A headline contains directly one section, followed by any
+  number of deeper level headlines.
 
   A section contains directly any greater element or element.  Only
   a headline can contain a section.  As an exception, text before the
   first headline in the document also belongs to a section.
 
-  In a quoted headline contains a section, the latter will be
+  If a quoted headline contains a section, the latter will be
   considered as a "quote section".
 
   As an example, consider the following document:
@@ -136,7 +138,8 @@ Unless specified otherwise, case is not significant.
   attributes.
 
   This is done by adding specific keywords, named "affiliated
-  keywords", just above the element considered, no blank line allowed.
+  keywords", just above the element considered, no blank line
+  allowed.
 
   Affiliated keywords are built upon one of the following patterns:
   "#+KEY: VALUE", "#+KEY[OPTIONAL]: VALUE" or "#+ATTR_BACKEND: VALUE".
@@ -150,7 +153,7 @@ Unless specified otherwise, case is not significant.
   OPTIONAL and VALUE can contain any character but a new line.  Only
   "CAPTION" and "RESULTS" keywords can have an optional value.
 
-  An affiliated keyword can appear on multiple lines if KEY is either
+  An affiliated keyword can appear more than once if KEY is either
   "CAPTION" or "HEADER" or if its pattern is "#+ATTR_BACKEND: VALUE".
 
   "CAPTION", "AUTHOR", "DATE" and "TITLE" keywords can contain objects
@@ -183,7 +186,8 @@ Unless specified otherwise, case is not significant.
 
    NAME can contain any non-whitespace character.
 
-   PARAMETERS can contain any character, and can be omitted.
+   PARAMETERS can contain any character other than new line, and can
+   be omitted.
 
    If NAME is "CENTER", it will be a "center block".  If it is
    "QUOTE", it will be a "quote block".
@@ -191,8 +195,8 @@ Unless specified otherwise, case is not significant.
    If the block is neither a center block, a quote block or a [[#Blocks][block
    element]], it will be a "special block".
 
-   CONTENTS can contain any element, but another greater block of the
-   same type.
+   CONTENTS can contain any element, except : a line =#+END_NAME= on
+   its own and lines beginning with STARS must be quoted by a comma.
 
 ** Drawers and Property Drawers
    :PROPERTIES:
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-12 10:19 [RFC] Org syntax (draft) orgmode
@ 2013-03-13 15:33 ` Nicolas Goaziou
  0 siblings, 0 replies; 37+ messages in thread
From: Nicolas Goaziou @ 2013-03-13 15:33 UTC (permalink / raw)
  To: orgmode; +Cc: emacs-orgmode

Hello,

orgmode@h-rd.org writes:

> What may help is to document the syntax in a machine-readable and somewhat
> more formal way.

I think it's a bit too early for that. The document describes the
current syntax, but also uncovers some ambiguous parts of that syntax,
which may need to be fixed (or at least discussed).

I agree that both tasks can be done in parallel, but I wouldn't like the
one you propose to shadow the one that I describe.

Anyway, improvements are welcome. Feel free to provide a patch for the
document.

> This ensures that there are fewer differences in interpretation and
> that the specification can be used to generate an orgmode parser
> directly. An example could be how the IETF specifies things; have
> a look at https://en.wikipedia.org/wiki/ABNF or EBNF. It's not much
> different from what you have done, but it's less ambiguous.


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-10 15:44           ` Jambunathan K
@ 2013-03-14 16:58             ` Eric S Fraga
  2013-03-14 18:26               ` Jambunathan K
  0 siblings, 1 reply; 37+ messages in thread
From: Eric S Fraga @ 2013-03-14 16:58 UTC (permalink / raw)
  To: Jambunathan K; +Cc: emacs-orgmode

Jambunathan K <kjambunathan@gmail.com> writes:

> You are a jerk, a BIG JERK. 

This is completely uncalled for.  What satisfaction do you gain from this?
This is a brilliant, informative and polite mailing list except when it
comes to your contributions.

Don't bother answering because I've added you to my spam database.  I've
not had to do this for an individual for some years now.  I'll never see
any of your emails again.  Bye.

-- 
: Eric S Fraga, GnuPG: 0xC89193D8FFFCF67D
: in Emacs 24.3.50.1 and Org release_7.9.3f-1199-g3a0e55

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-14 16:58             ` Eric S Fraga
@ 2013-03-14 18:26               ` Jambunathan K
  2013-03-14 18:51                 ` David Engster
  0 siblings, 1 reply; 37+ messages in thread
From: Jambunathan K @ 2013-03-14 18:26 UTC (permalink / raw)
  To: emacs-orgmode


Eric

Eric S Fraga <e.fraga@ucl.ac.uk> writes:

> Jambunathan K <kjambunathan@gmail.com> writes:
>
>> You are a jerk, a BIG JERK. 
>
> This is completely uncalled for.  What satisfaction do you gain from this?
> This is a brilliant, informative and polite mailing list except when it
> comes to your contributions.
>
> Don't bother answering because I've added you to my spam database.  I've
> not had to do this for an individual for some years now.  I'll never see
> any of your emails again.  Bye.

Still you haven't answered my "Fudging the mail reply headers" question
to my satisfaction.  I just let you off the hook (at the last moment)
before charging ahead.

Jambunathan K.
-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-14 18:26               ` Jambunathan K
@ 2013-03-14 18:51                 ` David Engster
  0 siblings, 0 replies; 37+ messages in thread
From: David Engster @ 2013-03-14 18:51 UTC (permalink / raw)
  To: Jambunathan K; +Cc: emacs-orgmode

Jambunathan K. writes:
> Still you haven't answered my "Fudging the mail reply headers" question
> to my satisfaction.

http://www.gnu.org/software/emacs/manual/html_node/message/Mailing-Lists.html

"A mailing list poster can use MFT to express that responses should be
sent to just the list, and not the poster as well. This will happen if
the poster is already subscribed to the list."

-David

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC] Org syntax (draft)
  2013-03-09 10:52     ` Waldemar Quevedo
  2013-03-09 14:23       ` Carsten Dominik
@ 2013-03-15 20:22       ` Nicolas Goaziou
  2013-03-17 18:48       ` Samuel Wales
  2013-04-05 17:01       ` Bastien
  3 siblings, 0 replies; 37+ messages in thread
From: Nicolas Goaziou @ 2013-03-15 20:22 UTC (permalink / raw)
  To: Waldemar Quevedo; +Cc: Nicolas Richard, emacs-orgmode

Hello,

Waldemar Quevedo <waldemar.quevedo@gmail.com> writes:

> By the way, is there a set of examples somewhere of Emacs
> org-mode -> html conversion for all org-mode features?
> (How are changes to the org-mode -> html conversion in Emacs tested
> during development?)

I don't think anything like this exists. However, the html back-end
supports all the Org syntax described in the document.


Regards,

-- 
Nicolas Goaziou


* Re: [RFC] Org syntax (draft)
  2013-03-13 14:07     ` Nicolas Richard
@ 2013-03-15 20:39       ` Nicolas Goaziou
  0 siblings, 0 replies; 37+ messages in thread
From: Nicolas Goaziou @ 2013-03-15 20:39 UTC (permalink / raw)
  To: Nicolas Richard; +Cc: emacs-orgmode

Hello,

"Nicolas Richard" <theonewiththeevillook@yahoo.fr> writes:

> As you know, "Comment" is also a French word meaning "how", and that
> could very well appear uppercased as the first word of a title. (I'd
> personally recommend against uppercasing titles, but I'd understand if
> someone wanted to customize the word for such reasons)

"Comment" is not "COMMENT", as you say. Also, it is a minor annoyance.

On the other hand, crippling the portability of the Org format because
it depends on an external variable is a major problem.

Therefore, I stand my ground: I suggest turning `org-comment-string'
et al. into defconsts.

>> Would you (or Someone) mind updating the org-syntax.org file on Worg?
>
> Please review the attached patch and apply parts as you wish (even if I
> wanted to do it myself, I don't have worg access.)

I applied the patch. Thank you.

> Last word about #+TBLFM: I'm not sure if that should go into the "affiliated
> keywords" section (thus rewriting parts of it, because that one goes
> below the table, unlike other affiliated keywords) or a special section
> on its own. Thus I'm not changing anything wrt that.

TBLFM is not an affiliated keyword. It's a keyword specific to Org
tables. It can exist anywhere, but it only makes sense right after
a table.
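
As a small illustration (the table contents here are invented for the
example), a TBLFM line normally sits directly below the table it
applies to:

```org
| Item  | Qty | Price | Total |
|-------+-----+-------+-------|
| Apple |   2 |     3 |     6 |
#+TBLFM: $4=$2*$3
```

Pressing C-c C-c on the #+TBLFM line recomputes the fourth column as
the product of the second and third.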


Regards,

-- 
Nicolas Goaziou


* Re: [RFC] Org syntax (draft)
  2013-03-07 20:37 Nicolas Goaziou
                   ` (5 preceding siblings ...)
  2013-03-09 23:16 ` Achim Gratz
@ 2013-03-17  7:18 ` Achim Gratz
  2013-03-17  9:36   ` Sebastien Vauban
  6 siblings, 1 reply; 37+ messages in thread
From: Achim Gratz @ 2013-03-17  7:18 UTC (permalink / raw)
  To: emacs-orgmode

Nicolas Goaziou writes:
> As discussed a few days ago, here is a document describing the complete
> Org syntax as read by the parser. I also added some comments. I am going
> to put the Org file on Worg, so anyone can update it and fix mistakes.

After some playing with the Org manual in Org that Tom has been working
on, I am starting to think that there should be a way to define the same
macro differently for different export backends.  That would be mainly
so that you could have a macro expansion use export snippets tailored to
that backend where (after stripping the export snippets) the expansion
makes little or no sense in other backends.  What do you think?


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

DIY Stuff:
http://Synth.Stromeko.net/DIY.html


* Re: [RFC] Org syntax (draft)
  2013-03-17  7:18 ` Achim Gratz
@ 2013-03-17  9:36   ` Sebastien Vauban
  0 siblings, 0 replies; 37+ messages in thread
From: Sebastien Vauban @ 2013-03-17  9:36 UTC (permalink / raw)
  To: emacs-orgmode

Hi Achim,

Achim Gratz wrote:
> Nicolas Goaziou writes:
>> As discussed a few days ago, here is a document describing the complete
>> Org syntax as read by the parser. I also added some comments. I am going
>> to put the Org file on Worg, so anyone can update it and fix mistakes.
>
> After some playing with the Org manual in Org that Tom has been working
> on, I am starting to think that there should be a way to define the same
> macro differently for different export backends.  That would be mainly
> so that you could have a macro expansion use export snippets tailored to
> that backend where (after stripping the export snippets) the expansion
> makes little or no sense in other backends.  What do you think?

This is already possible, as once explained by Nicolas on this ML:

  ╭────
  │ You can also have macros generating raw code geared towards LaTeX or HTML
  │ back-ends (through export-snippets). For example:
  │ 
  │ #+MACRO: my-mod @@e-latex:\something{$1}@@@@e-html:<div class="something">$1</div>@@
  │ 
  │ This is an example: {{{my-mod(text)}}}.
  ╰────
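
On Org 8.0 or later, where the export back-ends are named latex and
html rather than e-latex and e-html, the same idea might look like the
sketch below. The macro name my-mod comes from the quoted example; the
LaTeX command \something and the CSS class are placeholders, not real
definitions.

```org
#+MACRO: my-mod @@latex:\something{$1}@@@@html:<div class="something">$1</div>@@

This is an example: {{{my-mod(text)}}}.
```

Each back-end keeps only the export snippets addressed to it, so the
LaTeX exporter would emit \something{text} and the HTML exporter the
div, while other back-ends would drop both.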

Best regards,
  Seb

-- 
Sebastien Vauban


* Re: [RFC] Org syntax (draft)
  2013-03-09 10:52     ` Waldemar Quevedo
  2013-03-09 14:23       ` Carsten Dominik
  2013-03-15 20:22       ` Nicolas Goaziou
@ 2013-03-17 18:48       ` Samuel Wales
  2013-04-05 17:01       ` Bastien
  3 siblings, 0 replies; 37+ messages in thread
From: Samuel Wales @ 2013-03-17 18:48 UTC (permalink / raw)
  To: Waldemar Quevedo; +Cc: Nicolas Richard, emacs-orgmode, Nicolas Goaziou

On 3/9/13, Waldemar Quevedo <waldemar.quevedo@gmail.com> wrote:
> By the way, is there a set of examples somewhere of Emacs
> org-mode -> html conversion for all org-mode features?
> (How are changes to the org-mode -> html conversion in Emacs tested
> during development?)

+1

That would be great.  I'd definitely donate any of my blog posts, but
perhaps others have more comprehensive tests.

Samuel

-- 
The Kafka Pandemic: http://thekafkapandemic.blogspot.com

The disease DOES progress.  MANY people have died from it.  Just like
AIDS, it attacks MANY body systems.  ANYBODY can get it.  There is NO
hope without activist action.  This means YOU.


* Re: [RFC] Org syntax (draft)
  2013-03-09 10:52     ` Waldemar Quevedo
                         ` (2 preceding siblings ...)
  2013-03-17 18:48       ` Samuel Wales
@ 2013-04-05 17:01       ` Bastien
  3 siblings, 0 replies; 37+ messages in thread
From: Bastien @ 2013-04-05 17:01 UTC (permalink / raw)
  To: Waldemar Quevedo; +Cc: Nicolas Richard, emacs-orgmode, Nicolas Goaziou

Hi Waldemar,

Waldemar Quevedo <waldemar.quevedo@gmail.com> writes:

> By the way, is there a set of examples somewhere of Emacs
> org-mode -> html conversion for all org-mode features?

Not really -- and it would be nice to have one, especially for
developers like you who are in charge of an external exporter!

> I am maintaining the org-ruby gem, which is used to render org-mode texts to html,
> and currently there is no "roadmap" of features to implement for it.
> As a result, features and tweaks are added to the library
> only when someone submits a ticket requesting the feature on GitHub.
> (Here is a list of the export features supported in case someone wants
> to take a look:
> https://github.com/bdewey/org-ruby/tree/master/spec/html_examples )
> Having a set of example features from org-mode would be very useful
> to see how much coverage other implementations of org-mode exporting
> features have.
>
> Cheers everyone, keep org-mode being an awesome tool :)

Thank *you* for maintaining the org-ruby gem -- truly a gem to the
github community!  Hopefully you'll be able to update the gem wrt
the latest syntactic changes.  There are not too many of them, and
not everyone will use Org 8.0 so soon, but still.

Best,

-- 
 Bastien


* Re: [RFC] Org syntax (draft)
  2013-03-10 10:14         ` Bastien
  2013-03-10 10:16           ` Bastien
  2013-03-10 15:44           ` Jambunathan K
@ 2013-04-09 16:37           ` Bastien
  2 siblings, 0 replies; 37+ messages in thread
From: Bastien @ 2013-04-09 16:37 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Achim Gratz, emacs-orgmode, Jambunathan K

Hi all,

Bastien <bzg@altern.org> writes:

> the manual would enjoy a subsection in "Hacking" on how to create
> a new exporter, either from scratch or as a derived exporter.
> (Such a subsection can be short enough, thanks to derived backend.)

FWIW, I started a rudimentary one.  

This is "Adding export back-ends" in the current manual from master.

-- 
 Bastien


Thread overview: 37+ messages
2013-03-12 10:19 [RFC] Org syntax (draft) orgmode
2013-03-13 15:33 ` Nicolas Goaziou
  -- strict thread matches above, loose matches on Subject: below --
2013-03-07 20:37 Nicolas Goaziou
2013-03-07 20:47 ` Carsten Dominik
2013-03-07 22:07 ` Achim Gratz
2013-03-08 10:04 ` Bastien
2013-03-08 13:25 ` François Pinard
2013-03-08 15:23 ` Nicolas Richard
2013-03-08 22:06   ` Nicolas Goaziou
2013-03-09 10:52     ` Waldemar Quevedo
2013-03-09 14:23       ` Carsten Dominik
2013-03-09 14:42         ` Nicolas Goaziou
2013-03-09 15:05           ` Carsten Dominik
2013-03-15 20:22       ` Nicolas Goaziou
2013-03-17 18:48       ` Samuel Wales
2013-04-05 17:01       ` Bastien
2013-03-13 14:07     ` Nicolas Richard
2013-03-15 20:39       ` Nicolas Goaziou
2013-03-09 23:16 ` Achim Gratz
2013-03-09 23:49   ` Nicolas Goaziou
2013-03-10  4:35     ` Jambunathan K
2013-03-10  7:08       ` Nicolas Goaziou
2013-03-10 10:14         ` Bastien
2013-03-10 10:16           ` Bastien
2013-03-10 13:07             ` Achim Gratz
2013-03-10 14:11               ` Bastien
2013-03-10 16:02                 ` Achim Gratz
2013-03-10 16:09                   ` Jambunathan K
2013-03-10 17:12                     ` Achim Gratz
2013-03-10 21:44                       ` Jonathan Leech-Pepin
2013-03-10 15:44           ` Jambunathan K
2013-03-14 16:58             ` Eric S Fraga
2013-03-14 18:26               ` Jambunathan K
2013-03-14 18:51                 ` David Engster
2013-04-09 16:37           ` Bastien
2013-03-17  7:18 ` Achim Gratz
2013-03-17  9:36   ` Sebastien Vauban
