From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp12.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms5.migadu.com with LMTPS id 6CjvJzJRnWN8QwAAbAwnHQ (envelope-from ) for ; Sat, 17 Dec 2022 06:18:42 +0100 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp12.migadu.com with LMTPS id qFHjJzJRnWMlYwAAauVa8A (envelope-from ) for ; Sat, 17 Dec 2022 06:18:42 +0100 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id D8D692D3E7 for ; Sat, 17 Dec 2022 06:18:41 +0100 (CET) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1p6Pa2-0001Xf-UE; Sat, 17 Dec 2022 00:17:51 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1p6Pa0-0001XI-EW for emacs-orgmode@gnu.org; Sat, 17 Dec 2022 00:17:48 -0500 Received: from mail-yw1-x112e.google.com ([2607:f8b0:4864:20::112e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1p6PZy-0004e8-8Z for emacs-orgmode@gnu.org; Sat, 17 Dec 2022 00:17:47 -0500 Received: by mail-yw1-x112e.google.com with SMTP id 00721157ae682-3b48b139b46so60254977b3.12 for ; Fri, 16 Dec 2022 21:17:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=xoZ7w2nDWzZlvtHFW0+c9rboJTUag+JUX/Bm5nrH0Gs=; b=WXOG98BqopfWqy5SG3ctzB/7txhVrKLZ7bj/zT7hoatNVDd8YgklCLCuB/cUd0GJUe NdFE2DOO1M7ezMWB0oJ9mEfJfDIbGESqZQASo7HVQE084HWnwNgqvigSWE0OEbHlInzS 
0Bsk9wNzURWYSu/x8if/VPK0dSFZWkun8XqdCcqOhxCFCx9ExqwaLsk7FUNDwTpuQhoA y+GM5T0H+mnYfuZFuxyztnow63qlhEV/Idovxq6QQWJDD5tkOOEn6XSh1gEEdAjBFE4R BlRrKltFapOO+akAQ7yAE8VcKXpWjQZr8yhBlAJFNEJsmUbzUbjK4j3+nC7uaDlPR1mz +H4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=xoZ7w2nDWzZlvtHFW0+c9rboJTUag+JUX/Bm5nrH0Gs=; b=aze/sdzeBDIoIbVfu8MRNJiE2OY9+giyNGBjxaC8IuVeAdSAPEs7Tk2ZxOL9Z1c5fF ZsSKZu3qrUZoVR8Vml8fLd+28eeox5BEyuriyPF2QQbgxlo3QaAywhe2NrDxKpcDfKa/ WlldDSKg9ayRwG4T/+x7ICRMd3F/kIleoDy7peDXrw5kGIEMBaswAukLFGER3KSU3HNx ZwhGOWlt/K0MjETj6wVQfsNzkWgqtZbfrH2EciZXhukl7H3JxN5HKj36P1SfXrxAAK/i kiZ3RbtgGc2mFVnaFSv0GbgwMJxbJ2VZF/Hn2xcSMHZXT2Kyb3K2jPKkPk5NpgZgSOVb GeOw== X-Gm-Message-State: AFqh2kpc6wvu4JM26U+nu3So0w1Biu6quc6cq5TFaRckC/oAGBo1HCOe 29tCzOnQX+L7301m+7yxE9s6CSoE04rA2K2/3DU= X-Google-Smtp-Source: AMrXdXs1UE2l7nOmC/nBnj0Jz6eL3R24/fxiO6mb+Kk5TNc6G+Wt7g1olfOXSulH5q5q1IVdkOKYx5nVthpHurEJ2go= X-Received: by 2002:a05:690c:31d:b0:3cd:1f8:de66 with SMTP id bg29-20020a05690c031d00b003cd01f8de66mr2284531ywb.201.1671254264836; Fri, 16 Dec 2022 21:17:44 -0800 (PST) MIME-Version: 1.0 References: <116c3126-32cc-44d0-9e95-e802161e1e84@app.fastmail.com> <87fsdhui6z.fsf@localhost> <86pmckvisp.fsf@gmail.com> <878rj8u2na.fsf@localhost> In-Reply-To: <878rj8u2na.fsf@localhost> From: Tom Gillespie Date: Fri, 16 Dec 2022 21:17:33 -0800 Message-ID: Subject: Re: [Syntax discussion] Should we treat src blocks without LANG as paragraphs? 
(was: [BUG] ox-html does not export captions of source blocks without language) To: Ihor Radchenko Cc: Tim Cross , emacs-orgmode@gnu.org Content-Type: multipart/alternative; boundary="000000000000c8f4f505efff3476" Received-SPF: pass client-ip=2607:f8b0:4864:20::112e; envelope-from=tgbugs@gmail.com; helo=mail-yw1-x112e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-orgmode@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "General discussions about Org-mode." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-orgmode-bounces+larch=yhetil.org@gnu.org Sender: emacs-orgmode-bounces+larch=yhetil.org@gnu.org X-Migadu-Country: US X-Migadu-Flow: FLOW_IN ARC-Seal: i=1; s=key1; d=yhetil.org; t=1671254322; a=rsa-sha256; cv=none; b=lHq96PcbjQA4cx6+1LmtskRBYvRa64Yaz38s0o4iTnjOlwOflZI4uXfbMl41ikyJ1ztAPi uT2NHl70d6q1sUd7nBMA6vf/u/Ti4KTi+nQ+L+kdv1+vl/QANs5aYDjOkAzNHGOC+XTXdf D1hnMsTK9D4nZ2zdP3gLj4+KowAeDvYrtJBSzptwOnQ0gcQp8TWWYoufQu8MTQkVxv6jei F47Sd9DjLcn9ezRdjfH3tmer3ik43As82+lPJaKSyUj57osjgThD09ifH5QnnvTtVKyk+k Yt3qimi+ACnmS94DZEngQqXCSnq2tZESuwvmN0giW1CI5OGzNEQQ8uMz3J3ZGw== ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=WXOG98Bq; spf=pass (aspmx1.migadu.com: domain of "emacs-orgmode-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="emacs-orgmode-bounces+larch=yhetil.org@gnu.org"; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1671254322; h=from:from:sender:sender:reply-to:subject:subject:date:date: 
message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:in-reply-to:in-reply-to: references:references:list-id:list-help:list-unsubscribe: list-subscribe:list-post:dkim-signature; bh=xoZ7w2nDWzZlvtHFW0+c9rboJTUag+JUX/Bm5nrH0Gs=; b=IqDdcbvVMtwvlD9JwyaMx11em3pcW8jUyL5I4XtOH03Vr4jsJVyW7Tbelyz2/rfwN+SnRT 3Pfpy5EvrCIKP4hADekKr9PvKAUrTPmCdhUICkMYGWefmFjY63muczM2VinZFDh3X08RkL zq+Uen9/sB77KrV7PCRE2VulbFPuRFa02akRHyG7hn3PqDy8XwVuQAYIPMk20HGppsTTFi pIsJaqrKIUvOP5D2d31DaZX8TQSzEy1G2REsctCOV5tlQKARUDrsvTDSty7JCmc+rishbZ bS4Sn0rB2ilwdJBpi3UKwDCNEXwCDQIyrJYahBiXtoJIdS59E3VWoEpWWLEGKg== X-Migadu-Spam-Score: -4.48 X-Spam-Score: -4.48 X-Migadu-Queue-Id: D8D692D3E7 Authentication-Results: aspmx1.migadu.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=WXOG98Bq; spf=pass (aspmx1.migadu.com: domain of "emacs-orgmode-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="emacs-orgmode-bounces+larch=yhetil.org@gnu.org"; dmarc=pass (policy=none) header.from=gmail.com X-Migadu-Scanner: scn1.migadu.com X-TUID: ANXMWbvji1tK --000000000000c8f4f505efff3476 Content-Type: text/plain; charset="UTF-8" Hi Ihor, Chiming in here with a slight variant on what others have said. Best! Tom I don't think this should be handled at the syntactic layer at all. The empty string block language should be syntactically valid with any special behavior needed being handled later. Linters could treat it as a warning/error though, but the parsing is made significantly easier if the empty string is present allowing the grammar to be fully closed and regular. Thus, I don't think we need to make this a syntactic error or pun a src block without a lang to another type. I think we can add an implementation for when the block language is the empty string. This keeps the grammar regular by removing a special case. 
I assume that internally the empty string block lang would mostly call the example block codepaths, except that it should probably issue a warning or fail if someone tries to org-src-edit the block so that we can alert them that they are missing the lang.

Treating src blocks missing a lang as paragraphs is incorrect because according to the syntax spec they are syntactically still blocks (greater or lesser depending on your inclinations).

I think the general principle we want to follow here is that a block (or any entity in general) should not lose its type because some part of its syntax is malformed (I have made similar arguments about property drawers).

That is, if something starts with #+begin_NAME stuff and there is a corresponding #+end_NAME, then it is a block.

The choice of how a src block without a lang should behave is a bit more complex as there are multiple consumers of src blocks that make different assumptions.

As mentioned above, I think that if a block is missing the lang we could think of it instead as having the null language. If the language comes out as :var because someone has other contents on the line, they still have a well-formed src block, but will get a different error because there is currently no known language ":var".

--000000000000c8f4f505efff3476
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi Ihor,
Chiming in here with a slight variant on what
others have said. Best!
Tom

I don't think this should be handled at the syntactic
layer at all. The empty string block language should
be syntactically valid with any special behavior
needed being handled later.

Linters could treat it as a warning/error though,
but the parsing is made significantly easier if
the empty string is present allowing the grammar
to be fully closed and regular.

Thus, I don't think we need to make this a syntactic
error or pun a src block without a lang to another type.

I think we can add an implementation for when the
block language is the empty string. This keeps
the grammar regular by removing a special case.

I assume that internally the empty string block lang
would mostly call the example block codepaths,
except that it should probably issue a warning or fail
if someone tries to org-src-edit the block so that we
can alert them that they are missing the lang.

Treating src blocks missing a lang as paragraphs is
incorrect because according to the syntax spec they
are syntactically still blocks (greater or lesser depending
on your inclinations).

I think the general principle we want to follow here is
that a block (or any entity in general) should not lose
its type because some part of its syntax is malformed
(I have made similar arguments about property drawers).

That is, if something starts with #+begin_NAME stuff
and there is a corresponding #+end_NAME, then it
is a block.
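
To make that rule concrete, here is a minimal sketch in Python (not
Org's actual parser, which is Emacs Lisp). The names BEGIN_RE and
find_blocks are hypothetical and only illustrate the idea that the
block type comes from NAME alone, whatever else is on the begin line.

```python
import re

# Hypothetical pattern: "#+begin_NAME" followed by anything (possibly nothing).
BEGIN_RE = re.compile(r"^[ \t]*#\+begin_(\S+)(.*)$", re.IGNORECASE)


def find_blocks(text):
    """Return (name, rest_of_begin_line, body_lines) for every closed block.

    A begin line counts as a block only if a matching #+end_NAME follows.
    The rest of the begin line never changes the block's type.
    """
    lines = text.splitlines()
    blocks = []
    i = 0
    while i < len(lines):
        m = BEGIN_RE.match(lines[i])
        if m:
            name = m.group(1)
            end_re = re.compile(
                r"^[ \t]*#\+end_" + re.escape(name) + r"[ \t]*$", re.IGNORECASE
            )
            for j in range(i + 1, len(lines)):
                if end_re.match(lines[j]):
                    blocks.append(
                        (name.lower(), m.group(2).strip(), lines[i + 1 : j])
                    )
                    i = j  # resume scanning after the end line
                    break
        i += 1
    return blocks
```

With this sketch a src block with no lang still comes back as a "src"
block with an empty rest-of-line, while an unclosed #+begin_src is not
reported as a block at all.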

The choice of how a src block without a lang should
behave is a bit more complex as there are multiple
consumers of src blocks that make different assumptions.

As mentioned above, I think that if a block is missing the lang
we could think of it instead as having the null language. If the
language comes out as :var because someone has other contents on the
line, they still have a well-formed src block, but will get a different
error because there is currently no known language ":var".
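
A rough sketch of those two cases in Python, purely as an illustration.
The names split_lang, check_lang and KNOWN_LANGUAGES are made up for
this example and are not part of Org; the messages are placeholders.

```python
# Hypothetical registry of languages that have an implementation.
KNOWN_LANGUAGES = {"python", "emacs-lisp"}


def split_lang(data):
    """Split the rest of a #+begin_src line into (lang, args).

    The lang is simply the first whitespace-delimited token, and it is
    the empty string when the line has nothing after #+begin_src.
    """
    parts = data.strip().split(None, 1)
    lang = parts[0] if parts else ""
    args = parts[1] if len(parts) > 1 else ""
    return lang, args


def check_lang(lang):
    """Decide what to report after parsing; the block stays a src block."""
    if lang == "":
        # The "null language": syntactically valid, flagged later.
        return "warning: src block has no language"
    if lang not in KNOWN_LANGUAGES:
        return f"error: unknown language {lang!r}"
    return "ok"
```

Here split_lang("") gives the empty lang, and split_lang(":var x=1")
gives ":var" as the lang, which check_lang then reports as an unknown
language rather than as a syntax problem.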
--000000000000c8f4f505efff3476--