From: Asa Zeren
Date: Sun, 1 Nov 2020 11:03:03 -0500
Subject: Re: Thoughts on the standardization of Org
To: "Dr. Arne Babenhauserheide"
References: <87wnz5e5lg.fsf@web.de>
In-Reply-To: <87wnz5e5lg.fsf@web.de>
Cc: Tom Gillespie, emacs-orgmode@gnu.org

Thanks for the comments.

Both of you have raised some very good points, but I think that there has
been some confusion as to a number of my arguments. I hope to clarify some
things below.

On Sun Nov. 1, 2020, at 1:20AM Tom Gillespie wrote:
> My general take is that any active work toward standardization
> would be premature. At the very least a full implementation outside
> of Emacs would need to exist. In the absence of that there is little
> point to standardization. There is ample existing documentation to
> build a compliant parser (pandoc exists as well ...) and any effort
> toward standardization right now would be better spent improving
> the existing implementation or fixing broken ones (e.g. org-ruby).

This could very well be the case. When to create a formal standard is a very
hard question, and there are lots of reasons for it to be too early. One
point I do think needs to be clarified is the extent of a "full
implementation".
I don't think that a full editing environment like the one that exists in
Emacs today needs to exist, only a fully functional export framework. This
would require it to understand the full org syntax and semantics.

Also, part of the reason I wrote my original thoughts is that I observed
some motivation towards standardization, as part of the MIME type effort.

> From your comments, I would suggest reading through
> https://orgmode.org/worg/dev/org-syntax.html if you have not done so
> already. Much of what you mention is already there.

I did give it a read, and I have just given it another read. While I confess
I made some terminology mistakes, most of my points still stand after the
second read-through.

> If something like standardization is still desired, I would suggest that the
> proper framing for any such activities would be as improvement and
> clarification in the documentation, and potentially as formalization of some
> of the existing behaviors of the system. Org is a fairly stable system, and as
> others have said, explicitly leaving things open an unspecified would be
> vital. There are also parts of org (e.g. babel) where the behavior needs to
> be regularized and made consistent. At the moment those areas need
> contributors, not standardization.

I do agree that this is the right method of creating the standard. Org-mode
is a very large beast to standardize, and it can only be done incrementally,
or it is doomed to fail.

> On Sat, Oct 31, 2020 at 8:22 PM Asa Zeren wrote:
> > this is impossible. If org catches on before it is standardized, we end up
> > in the situation of Markdown, with many competing standards and
> > non-standards. Hence, standardization is essential.
> The situation for Org is not comparable to markdown. There is a single
> reference implementation for org at the moment. The codebase is massive. There
> are many existing parsers for org files.
> Many are obviously broken since they
> do not match the reference implementation's behavior. The obviousness is a
> sign that there is not a need for standardization at this time. Further, there
> is little risk that another impl will be created without interoperating with
> the elisp implementation. For example, consider Mauro's use case: being able
> to get colleagues who do not use Emacs to use Org. I suspect most of the
> people who would be working on other implementations would be starting from
> Emacs and would be unlikely to leave. Also unlike markdown, html export is
> just one tiny part of Org, whereas markdown was implemented repeatedly to
> allow text input on web pages where people needed to implement parts of html
> that had not already been specified in markdown.

I agree that this is the current situation. However, there is a real danger
here. People are continually trying to create org implementations (myself
included), and if one of these becomes successful before an org standard is
created, and it differs from the original elisp implementation in ways that
are non-trivial to fix, then we have an issue. Perhaps this will not come to
pass, and other implementations should strive for parity, but it is still a
danger.

> Standardizing org is much harder than standardizing something like Markdown,
> but I think by breaking it down as follows will maximize the portability of
> org while not compromising on development of org. See some of my other
> recent emails. In the short term this is impossible due to the deep
> dependence on Emacs Lisp. Any outside implementation that is created today
> would have to implement elisp. Few have been able to do this in over 30
> years. Moving beyond elisp requires additional machinery to be added to
> org to be able to specify other top level languages. This is not something
> that is remotely ready for standardization because no one even has a single
> working implementation yet!
I definitely agree that a deep dependence on Emacs Lisp should not be
standardized, and thus there are certain parts of the current org-mode
implementation that cannot currently be specified. However, there still are
areas of org that /can/ be specified without elisp, and we should not halt
standardization of everything because some parts are not ready.

> > I see three areas of standardization, which I think should be standardized
> > separately:
> > - Org DOM
> No. This is an implementation detail (see below for more).
> ...
> Depending on exactly what you mean by DOM this does not need to be
> standardized. There are a couple of points that need to be clarified
> regarding how to treeify the flat list of elements that come out of a parse in
> order to tie things like associated keywords to the correct elements, but
> these are quite minimal. The potential rats nest that is trying to standardize
> a DOM when it is an implementation detail means that I would strongly
> discourage even thinking about Org in that way. I would even discourage
> putting too much emphasis on the org-element api which, while extremely useful
> inside Emacs, is not something that should be standardized because it is a
> detail peculiar to the elisp implementation.

I think that my use of the word DOM has been very confusing here. I
definitely agree that we should not standardize the org-element API, nor the
particular way syntax nodes are represented in elisp. However, what I do
think we should standardize is the abstract tree representation of an org
document. For example, elements vs objects, the idea of nested headlines,
etc. would be specified in the DOM, separate from how to write them. To
clarify, here is what the analogous specification would look like for HTML
(though org would necessarily be a bit more complicated):

  A document is a node. Nodes are either text nodes, which contain only
  text, or they are normal nodes (what's the real name for them?).
  If they are normal nodes, they have a tag, which is text, and a number of
  attributes that have a text key and optionally a text value. Each normal
  node contains an ordered list of child nodes.

Another note: the worg syntax document does begin to specify this. My point
is to bring it out into a separate document.

> To the extent that an element tree could be useful, I think it would be as a
> concept in an implementation guide, not as something formally specified.

You may be right about this. Perhaps a formal standard is unnecessary for
this.

> > - Org Standard Environments
> Read https://orgmode.org/worg/dev/org-syntax.html. It will get you up to speed with
> the existing terminology that is used in the community.
> ...
> > Org Standard Environments:
> > This is how I would specify elements such as #+begin_src..#+end_src would be
> > specified, as standardized elements of the environment. This would be
> > structured as a number of individual standard environments, such as
> > "Source Blocks" or "Standard Header Properties" (specifying #+title, #+author, etc.)
> These are well specified already in the
> worg syntax draft. There are a couple of special cases such as src and example
> blocks that could be included explicitly in the syntax to facilitate
> interoperability with parsers for org babel languages. Beyond that, the
> community already has vocabulary that covers what you describe here, as
> mentioned above.

I think I was unclear. I am discussing /how/ they are specified, not their
specification, which is, as you say, currently given in the worg document.
Perhaps the best way to illustrate my idea is with an example.

Worg said:

> Affiliated Keywords
>
> With the exception of comment, clocks, headlines, inlinetasks, items, node
> properties, planning, property drawers, sections, and table rows, every other
> element type can be assigned attributes.
>
> This is done by adding specific keywords, named “affiliated keywords”, just
> above the element considered, no blank line allowed.
>
> Affiliated keywords are built upon one of the following patterns:
>
> #+KEY: VALUE
> #+KEY[OPTIONAL]: VALUE
> #+ATTR_BACKEND: VALUE
>
> KEY is either “CAPTION”, “HEADER”, “NAME”, “PLOT” or “RESULTS” string.
>
> BACKEND is a string constituted of alpha-numeric characters, hyphens or
> underscores.
>
> OPTIONAL and VALUE can contain any character but a new line. Only “CAPTION”
> and “RESULTS” keywords can have an optional value.
>
> An affiliated keyword can appear more than once if KEY is either “CAPTION” or
> “HEADER” or if its pattern is “#+ATTR_BACKEND: VALUE”.
>
> “CAPTION”, “AUTHOR”, “DATE” and “TITLE” keywords can contain objects in their
> value and their optional value, if applicable.

The way I envision this standardized is the following:

> Affiliated Keywords
>
> With the exception of comment, clocks, headlines, inlinetasks, items, node
> properties, planning, property drawers, sections, and table rows, every other
> element type can be assigned attributes.
>
> This is done by adding specific keywords, named “affiliated keywords”, just
> above the element considered, no blank line allowed.
>
> Affiliated keywords are built upon one of the following patterns:
>
> #+KEY: VALUE
> #+KEY[OPTIONAL]: VALUE
>
> OPTIONAL and VALUE can contain any character but a new line.
>
> An environment specifies a number of legal KEYs, and for each one must
> specify the following:
> - the structure of VALUE and OPTIONAL
> - whether OPTIONAL is permitted
> - whether the keyword can be repeated multiple times on a single element
>
> ...
>
> Org Standard Environment #42: Backend Attributes
>
> Affiliated keywords where key begins with =ATTR_=, followed by a string
> BACKEND, which must consist of alphanumeric characters, hyphens, or
> underscores, are defined. OPTIONAL is not permitted. Multiple occurrences of
> the keyword are permitted. The structure of VALUE is determined by the export
> backend specified by BACKEND.
>
> These should be used to give additional information to an export backend
> identified by BACKEND.
>
> Org Standard Environment #314: Captioning
>
> The affiliated keywords "CAPTION," "AUTHOR," "DATE," and "TITLE," are defined.
> OPTIONAL is permitted in CAPTION. CAPTION may appear multiple times on a
> single element.
> OPTIONAL is not permitted in AUTHOR, DATE, or TITLE. These may not appear
> multiple times on a single element.
>
> For CAPTION, AUTHOR, DATE, and TITLE, objects may appear in VALUE and
> OPTIONAL (if applicable).

I hope that this example explains what I mean better.

Dr. Arne Babenhauserheide said:
> I would like to add, that this is pretty easy to do, and also to make
> independent of the users emacs environment. Here is an example that
> uses the whole orgmode-babel-latex-html machinery to create derived
> documents from source-of-truth org-mode files which get exported to a
> book:

Yes. Emacs can definitely be used in this way. However, I do not believe
that it should be the only tool that can be used in this way, even if no
other tool exists at present.

I hope I have clarified some of the confusions surrounding my argument.

Thanks,
Asa
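
As a rough illustration of the environment idea in the example above, here is
a minimal Python sketch of how a tool might check affiliated keywords against
a table of environments. It is only a sketch under stated assumptions: the
names KeySpec, CAPTIONING, lookup and check are hypothetical, the regular
expression is one possible reading of the two generic patterns, and treating
KEY as case-insensitive is an assumption of this sketch. None of it is taken
from the Org code base or the worg draft.

```python
import re
from dataclasses import dataclass
from typing import Optional

# One reading of the generic patterns "#+KEY: VALUE" and "#+KEY[OPTIONAL]: VALUE".
# Restricting KEY to letters, digits, hyphens and underscores is an assumption.
KEYWORD_RE = re.compile(r"^#\+([A-Za-z0-9_-]+)(?:\[(.*?)\])?:[ \t]*(.*)$")


@dataclass(frozen=True)
class KeySpec:
    """What an environment states about one KEY (hypothetical structure)."""
    optional_permitted: bool
    repeatable: bool


# Hypothetical table mirroring the "Captioning" environment in the example.
CAPTIONING = {
    "CAPTION": KeySpec(optional_permitted=True, repeatable=True),
    "AUTHOR": KeySpec(optional_permitted=False, repeatable=False),
    "DATE": KeySpec(optional_permitted=False, repeatable=False),
    "TITLE": KeySpec(optional_permitted=False, repeatable=False),
}


def lookup(key: str) -> Optional[KeySpec]:
    """Find the spec for KEY, or None if no environment defines it."""
    # "Backend Attributes" environment: ATTR_ followed by a backend name,
    # no OPTIONAL, repetition allowed.
    if key.startswith("ATTR_") and len(key) > len("ATTR_"):
        return KeySpec(optional_permitted=False, repeatable=True)
    return CAPTIONING.get(key)


def check(lines: list[str]) -> list[str]:
    """Return error messages for the keyword lines attached to one element."""
    errors = []
    seen = set()
    for line in lines:
        match = KEYWORD_RE.match(line)
        if match is None:
            errors.append(f"not a keyword line: {line!r}")
            continue
        key, optional, _value = match.groups()
        key = key.upper()  # assumption: keys compare case-insensitively
        spec = lookup(key)
        if spec is None:
            errors.append(f"{key}: not defined by any environment")
            continue
        if optional is not None and not spec.optional_permitted:
            errors.append(f"{key}: OPTIONAL is not permitted")
        if key in seen and not spec.repeatable:
            errors.append(f"{key}: may not be repeated")
        seen.add(key)
    return errors
```

With this table, check(["#+CAPTION[short]: A long caption", "#+ATTR_LATEX:
:width 5cm"]) reports no errors, while a repeated #+AUTHOR line or a
#+TITLE[x] line is flagged. The point of the design is that the generic
syntax stays fixed and only the table grows as environments are added.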