emacs-orgmode@gnu.org archives
From: Tom Gillespie <tgbugs@gmail.com>
To: Asa Zeren <asaizeren@gmail.com>
Cc: emacs-orgmode <emacs-orgmode@gnu.org>
Subject: Re: Thoughts on the standardization of Org
Date: Sun, 1 Nov 2020 01:20:19 -0400
Message-ID: <CA+G3_PNzdNxZ4x0s6T6V2gnYnoZ+C5EqLcmZjjWbsatVh9UtwA@mail.gmail.com>
In-Reply-To: <CANKzsSBja5GNHWu0VEvH9dSu_vWFhiLyoebTT3GdT_Jc4gxKaw@mail.gmail.com>

Hi Asa,
    My general take is that any active work toward standardization
would be premature. At the very least a full implementation outside
of Emacs would need to exist. In the absence of that there is little
point to standardization. There is ample existing documentation to
build a compliant parser (pandoc exists as well ...) and any effort
toward standardization right now would be better spent improving
the existing implementation or fixing broken ones (e.g. org-ruby).

From your comments, I would suggest reading through
https://orgmode.org/worg/dev/org-syntax.html if you have not
done so already. Much of what you mention is already there.

If something like standardization is still desired, I would suggest that
the proper framing for any such activities would be as improvement and
clarification in the documentation, and potentially as formalization of some
of the existing behaviors of the system. Org is a fairly stable system,
and as others have said, explicitly leaving things open and unspecified
would be vital.

There are also parts of org (e.g. babel) where the behavior needs to be
regularized and made consistent. At the moment those areas need
contributors, not standardization.

A few more thoughts in line. Best!
Tom

On Sat, Oct 31, 2020 at 8:22 PM Asa Zeren <asaizeren@gmail.com> wrote:

> this is impossible. If org catches on before it is standardized, we
> end up in the situation of Markdown, with many competing standards and
> non-standards. Hence, standardization is essential.

The situation for Org is not comparable to markdown. There is a single
reference implementation for org at the moment. The codebase is massive.
There are many existing parsers for org files. Many are obviously broken
since they do not match the reference implementation's behavior. The
obviousness is a sign that there is not a need for standardization at
this time. Further, there is little risk that another implementation
will be created without interoperating with the elisp implementation.
For example, consider Mauro's use case: being able to get colleagues who
do not use Emacs to use Org. I suspect most of the people who would be
working on other implementations would be starting from Emacs and would
be unlikely to leave. Also, unlike markdown, html export is just one
tiny part of Org, whereas markdown was implemented repeatedly to allow
text input on web pages, where people needed to implement parts of html
that had not already been specified in markdown.

> Standardizing org is much harder than standardizing something like
> Markdown, but I think by breaking it down as follows will maximize the
> portability of org while not compromising on development of org.

See some of my other recent emails. In the short term this is impossible
due to the deep dependence on Emacs Lisp. Any outside implementation
that is created today would have to implement elisp. Few have been able
to do this in over 30 years. Moving beyond elisp requires additional machinery
to be added to org to be able to specify other top level languages. This is
not something that is remotely ready for standardization because no one
even has a single working implementation yet!

> I see three areas of standardization, which I think should be
> standardized separately:
>  - Org DOM
No. This is an implementation detail (see below for more).

>  - Org Syntax
This is pretty much done, there are some outstanding points for discussion,
but they are about implementation details, not about the contents of the
syntax. Also extension of the syntax needs to be open and defined entirely
by the elisp implementation, as mentioned by others.

>  - Org Standard Environments
Read https://orgmode.org/worg/dev/org-syntax.html. It will get you up to speed
with the existing terminology that is used in the community.

>
> Org DOM:
> The first thing to specify is the org DOM. (Maybe a different name
> should be used to avoid confusion with the HTML DOM) This is the
> structure of an org-mode document, without the textual
> representation. Many org-related tools operate on org documents
> without needing to use the textual representation. Specifying the DOM
> separately would (a) create a separation of concerns and (b) allow for
> better libraries built around org mode.

Depending on exactly what you mean by DOM this does not need to be
standardized. There are a couple of points that need to be clarified
regarding how to treeify the flat list of elements that come out of a
parse in order to tie things like associated keywords to the correct
elements, but these are quite minimal. The potential rat's nest that is
trying to standardize a DOM when it is an implementation detail means
that I would strongly discourage even thinking about Org in that way. I
would even discourage putting too much emphasis on the org-element api
which, while extremely useful inside Emacs, is not something that should
be standardized because it is a detail peculiar to the elisp
implementation.
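
To make the "tie keywords to the correct elements" point concrete, here
is a minimal sketch in Python. It is not taken from any Org
implementation: the Element record, the AFFILIATED set, and the rule that
a blank line breaks the association are all simplifying assumptions made
for illustration, and the authoritative rules are the ones in the worg
syntax document.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified element record; this is NOT the org-element API.
@dataclass
class Element:
    kind: str                 # e.g. "keyword", "blank", "table", "paragraph"
    value: str                # raw text, or the keyword's value
    key: str = ""             # keyword name, only used when kind == "keyword"
    affiliated: dict = field(default_factory=dict)

# Assumption: a small illustrative subset of keywords that attach to the
# element that follows them.
AFFILIATED = {"NAME", "CAPTION", "HEADER", "RESULTS"}

def attach_affiliated(flat):
    """Fold attachable keywords into the element directly below them.

    Assumption: a blank line (or end of input) between the keywords and
    the next element breaks the association, so the keywords stay in the
    output as ordinary keyword elements.
    """
    out, pending = [], []
    for el in flat:
        if el.kind == "keyword" and el.key.upper() in AFFILIATED:
            pending.append(el)          # hold until we see what follows
            continue
        if el.kind == "blank":
            out.extend(pending)         # nothing to attach to: keep as-is
            pending = []
            out.append(el)
            continue
        for kw in pending:              # attach held keywords to this element
            el.affiliated[kw.key.upper()] = kw.value
        pending = []
        out.append(el)
    out.extend(pending)                 # trailing keywords at end of input
    return out
```

The whole step is a single pass over the flat list with one small buffer,
which is the sense in which the outstanding questions are minimal: they
concern which keywords attach and what breaks the association, not
whether a separate object model is needed.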

There are cases where certain behaviors, such as how to parse and format
footnotes, could be specified, but such behaviors don't require a DOM in
order to be specified, and adding a DOM to the picture does nothing but
complicate the format. Org is a text format. The semantics for
interaction with the text format are defined entirely by the text
representation (In Emacs there.is.only.buffer). Other semantics, such as
export to html and latex, are not something that you would want to try
to standardize; you would likely lose friends, enemies, and whatever
sanity you had left at the end (see discussion on Mauro's thread about
the fact that it is probably just easier to use Emacs directly if you
need to export to a certain format in a specific way. It is free
software after all.)

To the extent that an element tree could be useful, I think it would be
as a concept in an implementation guide, not as something formally
specified.

> Org Syntax:
> This would be specifying the mapping between the DOM and the textual
> representation, specified in terms of an environment.

There is no DOM. Modification to an org document must be made on the
text representation otherwise it is meaningless. This isn't html where there
is no canonical representation outside the DOM. The text representation of
an org document IS the canonical representation (modulo a normalization
pass).

> Org Standard Environments:
> This is how I would specify elements such as #+begin_src..#+end_src
> would be specified, as standardized elements of the environment. This
> would be structured as a number of individual standard environments,
> such as "Source Blocks" or "Standard Header Properties" (specifying
> #+title, #+author, etc.)

These are well specified already in the worg syntax draft. There are a
couple of special cases such as src and example blocks that could be
included explicitly in the syntax to facilitate interoperability with
parsers for org babel languages. Beyond that, the community already has
vocabulary that covers what you describe here, as mentioned above.
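
As an illustration of why src blocks matter for interoperability, here
is a rough Python sketch of pulling them out of the text so that their
bodies can be handed to a parser for the block's language. It is a
simplification and not a compliant parser: the function name src_blocks
and both regular expressions are made up for this example, header
arguments are ignored, comma-escaped lines are not unescaped, and an
unterminated block is silently dropped.

```python
import re

# Assumption: block delimiters are matched case-insensitively and may be
# indented; the first word after "#+begin_src" is taken as the language.
BEGIN = re.compile(r"^[ \t]*#\+begin_src(?:[ \t]+(\S+).*)?[ \t]*$", re.IGNORECASE)
END = re.compile(r"^[ \t]*#\+end_src[ \t]*$", re.IGNORECASE)

def src_blocks(text):
    """Return a list of (language, body) pairs for each closed src block."""
    blocks = []
    lang = ""
    body = None                      # None means "not inside a block"
    for line in text.splitlines():
        if body is None:
            m = BEGIN.match(line)
            if m:
                lang = m.group(1) or ""
                body = []            # start collecting block contents
        elif END.match(line):
            blocks.append((lang, "\n".join(body)))
            body = None              # block closed
        else:
            body.append(line)        # contents are kept verbatim
    return blocks
```

For example, src_blocks(open("notes.org").read()) on a hypothetical file
notes.org would give one (language, body) pair per closed block, and
everything between the delimiters is passed through untouched, which is
what a downstream language parser needs.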

