From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ihor Radchenko
To: Max Nikulin
Cc: emacs-orgmode@gnu.org
Subject: Re: Trailing whitespace after export snippets without a transcoder
References: <5210ac1c-ed73-4b82-a296-41cf90b9f0a7@gmail.com>
 <87jzvmwnmw.fsf@localhost> <87ilauagvy.fsf@localhost>
 <87y1hqqgiu.fsf@localhost> <87ttk26crq.fsf@localhost>
 <87h6fwmgkm.fsf@localhost> <87wmoqq3ad.fsf@localhost>
Date: Mon, 22 Apr 2024 19:01:07 +0000
Message-ID: <87wmoprzm4.fsf@localhost>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain; charset=utf-8

Max Nikulin writes:

>> I do not think that we need to handle this Org mode-wide (it will be
>> difficult and will likely cause breaking changes).
>
> I have not figured out why it may become a breaking change and what
> backends need blank lines inside paragraph. I would make stripping empty
> lines default behavior with some option to disable this feature.

For example, consider an HTML exporter that aligns tags nicely and keeps
blank lines between markup blocks for readability. If we remove such
blank lines unconditionally, it will be problematic.

>> See the attached tentative fix.
>
> Since zero width spaces are part of Org syntax, they need special treatment.

They are not a part of Org syntax, and we currently do not handle them
specially. They still work as an escape character simply because Org
syntax defines markup boundaries using a closed set of whitespace
characters - (rx (any " \t")). So, any whitespace other than tab or
space is equivalent to a zero-width space for all practical purposes.

> ---- 8< ----
> #+macro: empty (eval "")
>
> Some *bold*​@@comment: *@@ text.
> @@comment: line@@
> More /italic/​{{{empty}}} text.
> {{{empty}}}
> Last line.
> ---- >8 ----
>
> LaTeX export:
> ---- 8< ----
> Some \textbf{bold}​text.
> More \emph{italic}​ text.
>
> Last line.
> ---- >8 ----
>
> Notice visible space character disappeared after "bold".

I guess that I can change the condition to not include trailing space
from (rx whitespace eos) to (rx (any " \t") eos).
See the attached updated version of the patch set.

> ... I am leaving up
> to you to decide if empty line appeared due to a macro is a bug or a
> feature. If I remember it correctly, your opinion is that a macro
> expanding to multiple paragraphs is a valid one.

Yes. I do believe that we should keep macros as dumb as possible, so
that people can use them in the most flexible ways, including breaking
paragraphs, if so desired.

A more annoying one is

First line @@comment:foo@@
last line.

vs.

First line @@comment:foo
@@last line.

where we encounter the peculiarity of Org syntax with trailing tabs and
spaces included as part of the object, but not newlines. But I do not
see any good way to address this problem without rewriting half of Org
mode.

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=v2-0001-org-export-data-Handle-trailing-spaces-when-trans.patch

>From 229a563dc38e1fdfd63be2dfebb1a9e9023e44b2 Mon Sep 17 00:00:00 2001
Message-ID: <229a563dc38e1fdfd63be2dfebb1a9e9023e44b2.1713812419.git.yantar92@posteo.net>
From: Ihor Radchenko
Date: Sun, 21 Apr 2024 15:37:18 +0300
Subject: [PATCH v2 1/2] org-export-data: Handle trailing spaces when transcoder returns nil

* lisp/ox.el (org-export--keep-spaces): New helper function containing
logic about keeping spaces in place of removed object from
`org-export--prune-tree'.  The logic is modified to keep spaces in the
case when previous plain-string object ends with a whitespace, but not
" " or "\t".  This can happen, for example, when there is a trailing
zero-width space.  We do want to keep spaces in such scenario.
(org-export-data): When transcoder returns nil, handle trailing
spaces after an object the same way `org-export--prune-tree' does.
Remove special handling of export snippets that unconditionally keep
their trailing spaces.
(org-export--prune-tree): Use the helper function.

Link: https://orgmode.org/list/87h6fwmgkm.fsf@localhost
---
 lisp/ox.el | 67 ++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 42 insertions(+), 25 deletions(-)

diff --git a/lisp/ox.el b/lisp/ox.el
index fc746950d..6f6689188 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -1880,6 +1880,38 @@ (defun org-export-transcoder (blob info)
         (let ((transcoder (cdr (assq type (plist-get info :translate-alist)))))
           (and (functionp transcoder) transcoder)))))
 
+(defun org-export--keep-spaces (data info)
+  "Non-nil, when post-blank spaces after removing DATA should be preserved.
+INFO is the info channel.
+
+This function returns nil, when previous exported element already has
+trailing spaces or when DATA does not have non-zero non-nil
+`:post-blank' property.
+
+When the return value is non-nil, it is a string containing the trailing
+spaces."
+  ;; When DATA is an object, interpret this as if DATA should be
+  ;; ignored (see `org-export--prune-tree').  Keep spaces in place of
+  ;; removed element, if necessary.  Example: "Foo.[10%] Bar" would
+  ;; become "Foo.Bar" if we do not keep spaces.  Another example: "A
+  ;; space@@ascii:*@@ character."  should become "A space character"
+  ;; in non-ASCII export.
+  (let ((post-blank (org-element-post-blank data)))
+    (unless (or (not post-blank)
+                (zerop post-blank)
+                (eq 'element (org-element-class data)))
+      (let ((previous (org-export-get-previous-element data info)))
+        (unless (or (not previous)
+                    (pcase (org-element-type previous)
+                      (`plain-text
+                       (string-match-p
+                        (rx (any " \t") eos) previous))
+                      (_ (org-element-post-blank previous))))
+          ;; When previous element does not have
+          ;; trailing spaces, keep the trailing
+          ;; spaces from DATA.
+          (make-string post-blank ?\s))))))
+
 ;;;###autoload
 (defun org-export-data (data info)
   "Convert DATA into current backend format.
@@ -1930,15 +1962,11 @@ (defun org-export-data (data info)
                           (eq (plist-get info :with-archived-trees) 'headline)
                           (org-element-property :archivedp data)))
                      (let ((transcoder (org-export-transcoder data info)))
-                       (or (and (functionp transcoder)
-                                (if (eq type 'link)
-                                    (broken-link-handler
-                                     (funcall transcoder data nil info))
-                                  (funcall transcoder data nil info)))
-                           ;; Export snippets never return a nil value so
-                           ;; that white spaces following them are never
-                           ;; ignored.
-                           (and (eq type 'export-snippet) ""))))
+                       (and (functionp transcoder)
+                            (if (eq type 'link)
+                                (broken-link-handler
+                                 (funcall transcoder data nil info))
+                              (funcall transcoder data nil info)))))
                     ;; Element/Object with contents.
                    (t
                     (let ((transcoder (org-export-transcoder data info)))
@@ -1979,8 +2007,8 @@ (defun org-export-data (data info)
               (puthash
                data
                (cond
-                ((not results) "")
-                ((memq type '(nil org-data plain-text raw)) results)
+                ((not results) (or (org-export--keep-spaces data info) ""))
+                ((memq type '(nil org-data plain-text raw)) results)
                 ;; Append the same white space between elements or objects
                 ;; as in the original buffer, and call appropriate filters.
                 (t
@@ -2641,24 +2669,13 @@ (defun org-export--prune-tree (data info)
           (let ((type (org-element-type data)))
             (if (org-export--skip-p data info selected excluded)
                 (if (memq type '(table-cell table-row)) (push data ignore)
-                  (let ((post-blank (org-element-post-blank data)))
-                    (if (or (not post-blank) (zerop post-blank)
-                            (eq 'element (org-element-class data)))
-                        (org-element-extract data)
+                  (if-let ((keep-spaces (org-export--keep-spaces data info)))
                       ;; Keep spaces in place of removed
                       ;; element, if necessary.
                       ;; Example: "Foo.[10%] Bar" would become
                       ;; "Foo.Bar" if we do not keep spaces.
-                      (let ((previous (org-export-get-previous-element data info)))
-                        (if (or (not previous)
-                                (pcase (org-element-type previous)
-                                  (`plain-text
-                                   (string-match-p
-                                    (rx whitespace eos) previous))
-                                  (_ (org-element-post-blank previous))))
-                            ;; Previous object ends with whitespace already.
-                            (org-element-extract data)
-                          (org-element-set data (make-string post-blank ?\s)))))))
+                      (org-element-set data keep-spaces)
+                    (org-element-extract data)))
               (if (and (eq type 'headline)
                        (eq (plist-get info :with-archived-trees)
                            'headline)
-- 
2.44.0


--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=v2-0002-ox-md-ox-ascii-ox-texinfo-Strip-blank-lines-from-.patch

>From 3fa3ed068fcfc58470430a5c4bae3a5ffd1ca3ed Mon Sep 17 00:00:00 2001
Message-ID: <3fa3ed068fcfc58470430a5c4bae3a5ffd1ca3ed.1713812419.git.yantar92@posteo.net>
In-Reply-To: <229a563dc38e1fdfd63be2dfebb1a9e9023e44b2.1713812419.git.yantar92@posteo.net>
References: <229a563dc38e1fdfd63be2dfebb1a9e9023e44b2.1713812419.git.yantar92@posteo.net>
From: Ihor Radchenko
Date: Sun, 21 Apr 2024 15:54:48 +0300
Subject: [PATCH v2 2/2] ox-md, ox-ascii, ox-texinfo: Strip blank lines from paragraphs

* lisp/org-macs.el (org-remove-blank-lines): New helper function to
strip blank lines from string.
* lisp/ox-ascii.el (org-ascii-paragraph):
* lisp/ox-latex.el (org-latex-paragraph):
* lisp/ox-md.el (org-md-paragraph):
* lisp/ox-texinfo.el (org-texinfo-paragraph): Strip blank lines from
paragraphs - these exporters are using blank lines as paragraph
separators.

Reported-by: Max Nikulin
Link: https://orgmode.org/list/v00le7$frp$1@ciao.gmane.io
---
 lisp/org-macs.el   | 4 ++++
 lisp/ox-ascii.el   | 6 ++++++
 lisp/ox-latex.el   | 4 +---
 lisp/ox-md.el      | 6 ++++++
 lisp/ox-texinfo.el | 7 ++++++-
 5 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/lisp/org-macs.el b/lisp/org-macs.el
index 1254ddb54..93803bfe9 100644
--- a/lisp/org-macs.el
+++ b/lisp/org-macs.el
@@ -1244,6 +1244,10 @@ (defun org-remove-tabs (s &optional width)
                               t t s)))
   s)
 
+(defun org-remove-blank-lines (s)
+  "Remove blank lines in S."
+  (replace-regexp-in-string (rx "\n" (1+ (0+ space) "\n")) "\n" s))
+
 (defun org-wrap (string &optional width lines)
   "Wrap string to either a number of lines, or a width in characters.
 If WIDTH is non-nil, the string is wrapped to that width, however many lines
diff --git a/lisp/ox-ascii.el b/lisp/ox-ascii.el
index db4356ec6..e767f66cf 100644
--- a/lisp/ox-ascii.el
+++ b/lisp/ox-ascii.el
@@ -1651,6 +1651,12 @@ (defun org-ascii-paragraph (paragraph contents info)
   "Transcode a PARAGRAPH element from Org to ASCII.
 CONTENTS is the contents of the paragraph, as a string.  INFO is
 the plist used as a communication channel."
+  ;; Ensure that we do not create multiple paragraphs, when a single
+  ;; paragraph is expected.
+  ;; Multiple newlines may appear in CONTENTS, for example, when
+  ;; certain objects are stripped from export, leaving single newlines
+  ;; before and after.
+  (setq contents (org-remove-blank-lines contents))
   (org-ascii--justify-element
    (let ((indented-line-width (plist-get info :ascii-indented-line-width)))
      (if (not (wholenump indented-line-width)) contents
diff --git a/lisp/ox-latex.el b/lisp/ox-latex.el
index 5c19e1fe7..2267a604e 100644
--- a/lisp/ox-latex.el
+++ b/lisp/ox-latex.el
@@ -3039,9 +3039,7 @@ (defun org-latex-paragraph (_paragraph contents _info)
   ;; Multiple newlines may appear in CONTENTS, for example, when
   ;; certain objects are stripped from export, leaving single newlines
   ;; before and after.
-  (replace-regexp-in-string
-   (rx "\n" (1+ (0+ space) "\n")) "\n"
-   contents))
+  (org-remove-blank-lines contents))
 
 
 ;;;; Plain List
diff --git a/lisp/ox-md.el b/lisp/ox-md.el
index fa2beeb95..28f0a4cf6 100644
--- a/lisp/ox-md.el
+++ b/lisp/ox-md.el
@@ -628,6 +628,12 @@ (defun org-md-paragraph (paragraph contents _info)
   "Transcode PARAGRAPH element into Markdown format.
 CONTENTS is the paragraph contents.  INFO is a plist used as
 a communication channel."
+  ;; Ensure that we do not create multiple paragraphs, when a single
+  ;; paragraph is expected.
+  ;; Multiple newlines may appear in CONTENTS, for example, when
+  ;; certain objects are stripped from export, leaving single newlines
+  ;; before and after.
+  (setq contents (org-remove-blank-lines contents))
   (let ((first-object (car (org-element-contents paragraph))))
     ;; If paragraph starts with a #, protect it.
     (if (and (stringp first-object) (string-prefix-p "#" first-object))
diff --git a/lisp/ox-texinfo.el b/lisp/ox-texinfo.el
index 4aef9c41c..fc9ec9209 100644
--- a/lisp/ox-texinfo.el
+++ b/lisp/ox-texinfo.el
@@ -1517,7 +1517,12 @@ (defun org-texinfo-paragraph (_paragraph contents _info)
   "Transcode a PARAGRAPH element from Org to Texinfo.
 CONTENTS is the contents of the paragraph, as a string.  INFO is
 the plist used as a communication channel."
-  contents)
+  ;; Ensure that we do not create multiple paragraphs, when a single
+  ;; paragraph is expected.
+  ;; Multiple newlines may appear in CONTENTS, for example, when
+  ;; certain objects are stripped from export, leaving single newlines
+  ;; before and after.
+  (org-remove-blank-lines contents))
 
 
 ;;;; Plain List
-- 
2.44.0


--=-=-=
Content-Type: text/plain


-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at .
Support Org development at ,
or support my work at
--=-=-=--
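
The two checks discussed in this message can be tried outside Org with a
small sketch. It is illustrative only: the `demo-' function names are
hypothetical and not part of Org. The regexps are copied from the
patches above, so the sketch shows the blank-line stripping from patch
2/2 and the stricter "ends with space or tab" test from patch 1/2.

```elisp
;; Sketch only; the `demo-' names are hypothetical, not part of Org.
(require 'rx)

(defun demo-remove-blank-lines (s)
  "Collapse blank lines in S, using the same regexp as patch 2/2."
  (replace-regexp-in-string (rx "\n" (1+ (0+ space) "\n")) "\n" s))

(defun demo-ends-with-space-or-tab-p (s)
  "Return t when S ends with a literal space or tab, as in patch 1/2."
  (and (string-match-p (rx (any " \t") eos) s) t))

;; A paragraph where a removed object left an empty line behind:
;; the blank line is dropped, the line break itself is kept.
(demo-remove-blank-lines "More text.\n\nLast line.")

;; A trailing zero-width space (U+200B) is neither a space nor a tab,
;; so this returns nil and the post-blank after a removed object would
;; be preserved.
(demo-ends-with-space-or-tab-p "bold\u200b")

;; A real trailing space returns t, so no extra space would be added.
(demo-ends-with-space-or-tab-p "bold ")
```

Evaluating the forms in a scratch buffer is enough to compare the cases;
nothing here depends on Org being loaded.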