From: Timothy
To: Tom Gillespie
Cc: emacs-orgmode
Subject: Re: Some commentary on the Org Syntax document
Date: Fri, 03 Dec 2021 03:16:40 +0800
Message-ID: <871r2upy8s.fsf@gmail.com>
References: <87mtljpd1w.fsf@gmail.com>
User-agent: mu4e 1.6.9; emacs 28.0.50

Hi Tom,

Thanks for your comments, they've been most helpful. I have some
comments on your comments, and have also started drafting some tweaks to
the document in light of your initial comments, put as a diff excerpt at
the end of this email.

For starters, I have some more general comments. However, this has
turned out a bit longer than I intended. Unfortunately I am moments away
from heading to bed, so to quote Pascal, "I have only made this letter
longer because I have not had the time to make it shorter".

I think a big problem is the mix of implicit and explicit information.
Some components are rigorously specified in terms of the characters they
may contain, the elements and objects that are recognised inside them,
and even the order in which different parts of the pattern are parsed.
Others leave much to inference: as mentioned originally, the current
Dynamic Blocks description doesn't even mention the CONTENTS part of the
pattern, and relies on the reader inferring that it operates similarly
to the CONTENTS part of Drawers. Forcing the reader to start making
inferences like this is a treacherous path, and I think it is to blame
for some of the other issues I've experienced.

Take for instance the "surely X can't contain a newline?" comments I've
made. In the Node Properties and Entities descriptions you have
statements along the lines of "X can contain any character [...] except
a newline". In my mind this then sets up the reader to interpret a
similar statement without the "except a newline" clause to mean that
newlines are permitted.

I'm also thinking that the term "element" is overworked in the document.
It's basically pulling triple duty: you have Elements, Greater Elements,
and elements which are Elements and/or Greater Elements 😓. The naming
here is quite understandable, and I think we all know that naming things
well isn't easy, but I think it would behove us to try to give each term
a single unique meaning across the document, or at least try to come as
close to that as reasonably possible.

I think we may be able to improve this by tweaking the hierarchy of
terms and then applying it rigorously throughout the document.

At the highest level, I think we want to encapsulate Headlines,
Sections, Greater Elements, Elements, and Objects. I suppose we might
call these the *components* of an Org document. Then we have the group
of Elements and Greater Elements, which are useful to clump together.
Each component is usually given in terms of a number of forms or
patterns, which usually contain terms which are elucidated in the
description of that component.

So, the hierarchy appears to be something like:
1. (Headline / Section / Greater Element / Element / Object)
2. Headline
3. Section
4. Greater Element
5. (Greater Element / Element)
6. Element
7. Object
8. Pattern / Form
9. Term

We could, say, call (1) Components, (7) Units, (6) Objects, (5) Element
or Object (why not spell it out, to avoid telling people to remember
something). I could have put more thought into this, but it should do
for illustrating my line of thinking. Let me know if you have any good
ideas.

A separate improvement could be using more formatting to distinguish
when terms are used in a particular way.

Now for a few specific comments.

Tom Gillespie writes:

>> As a general comment, in many places the Org Syntax document states what
>> characters a component can contain, but not what objects/elements. This feels
>> like a bit of a hole in the current specifications.
>
> This is indeed confusing because there are some implicit constraints
> that are not listed because they never come up.
I've sort of covered this before, but I think the document would benefit
from being more explicit in general.

> For example, you cannot have two newlines
> inside an inline footnote because the two newlines break the paragraph and the
> thing that appears to be an inline footnote is just plain text that is
> never terminated.

Specifically regarding newlines, perhaps we could add something like
this to the start of the Objects section?

"Furthermore, while many objects may contain newlines, an empty line
(i.e. a double newline) often terminates the element that the object is
a part of, such as a paragraph."

> Ensuring that font locking is in sync org-element and org-export is
> critical to ensure that users know what will actually happen.

On this, I'm cautiously optimistic about the discussion about using
org-element for fontification.

>> Heading
>> ───────
>>
>> ⁃ Ok, so `TITLE' can have any character but a newline, but what Org components can it contain?
>>   I’m going to assume any object?
>
> Via org-element-object-restrictions it is standard-set-no-line-break which is
> all elements except citation-reference, table-cell, and line-break.

I must thank you and Ihor for pointing me to
org-element-object-restrictions! I wasn't aware of that till now, and
it's most helpful. Should all the information given by it be included in
the Syntax document? I lean towards saying yes.
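
To illustrate what "spelling it out" could amount to, here is a minimal
sketch of a restriction table, written in Python purely for
illustration. The object names are only an illustrative subset, the
"paragraph" entry is an assumption on my part, and every name in the
code (ALL_OBJECTS, STANDARD_SET, RESTRICTIONS, may_contain) is made up
for this sketch; the authoritative data is whatever
org-element-object-restrictions actually holds.

```python
# Illustrative subset of Org object types; NOT the full list from org-element.
ALL_OBJECTS = frozenset({
    "bold", "italic", "code", "verbatim", "link", "timestamp",
    "footnote-reference", "line-break", "table-cell", "citation-reference",
})

# As described in the thread: the standard set leaves out
# citation-reference and table-cell ...
STANDARD_SET = ALL_OBJECTS - {"citation-reference", "table-cell"}
# ... and the no-line-break variant additionally leaves out line-break.
STANDARD_SET_NO_LINE_BREAK = STANDARD_SET - {"line-break"}

# Hypothetical table: container -> objects allowed directly inside it.
RESTRICTIONS = {
    "headline": STANDARD_SET_NO_LINE_BREAK,  # i.e. what TITLE may contain
    "paragraph": STANDARD_SET,               # assumed entry, for contrast
}


def may_contain(container: str, obj: str) -> bool:
    """Return True if `obj` may appear directly inside `container`.

    Containers missing from the table are treated as allowing nothing.
    """
    return obj in RESTRICTIONS.get(container, frozenset())
```

With a table like this in the document, "what can TITLE contain?"
becomes a lookup rather than an inference: may_contain("headline",
"line-break") is false, while may_contain("paragraph", "line-break") is
true.
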
>>
>> Drawers and Property Drawers
>> ────────────────────────────
>>
>> ⁃ “Contents can contain any element but another drawer”
>>   • Does “any element” mean “any Element or Greater Element”
>
> Any element that does not have greater precedence, so that would
> be only a heading.

I'm not sure this element = Element / Greater Element "shorthand" is
doing us any favours, but I've discussed that already...

>>
>> Dynamic Blocks
>> ──────────────
>>
>> ⁃ It is not specified what `CONTENTS' may be
>
> Implicitly follows the same rules as drawers, no headings
> and no nesting of dynamic blocks. Text should be added
> that states this explicitly.

I'm drafting some changes, and this change has been added.

>> ⁃ Surely `PARAMETERS' cannot contain a newline?
>
> Termination by newline is implicit in the example, but the text is confusing.

Made explicit in my draft.

>> Plain Lists and Items
>> ─────────────────────
>>
>> ⁃ It is not completely clear what content an item may have.
>>   I assume any Object?
>
> By my reading it may contain anything, objects and elements,
> except for a heading, but that is already implied by the de-indent.
>
> To quote from the docs:
>
> An item ends before the next item, the first line less or equally
> indented than its starting line, or two consecutive empty lines.
> Indentation of lines within other greater elements do not count,
> neither do inlinetasks boundaries.
>
> This makes plain lists one of the most complex elements to parse.

Is it? Perhaps I'm not doing it right but it didn't seem bad to me when
implementing my parser (though I need to add the element support).

All right, that's all I have time for for now. Hopefully some of this is
of use/interest.

--
Timothy