emacs-orgmode@gnu.org archives
From: Ihor Radchenko <yantar92@posteo.net>
To: Max Nikulin <manikulin@gmail.com>
Cc: emacs-orgmode@gnu.org
Subject: Re: Trailing whitespace after export snippets without a transcoder
Date: Sun, 21 Apr 2024 13:00:10 +0000
Message-ID: <87wmoqq3ad.fsf@localhost>
In-Reply-To: <v00le7$frp$1@ciao.gmane.io>

Max Nikulin <manikulin@gmail.com> writes:

>> I have no clue about the rationale of this special behaviour - it dates
>> back to the days when Org export was merged. It is also not documented
>> anywhere, AFAIK.
>
> I would not expect that the space after the following export snippet is 
> ignored in the case of ox-html or ox-latex backend:
>
>      A space@@ascii:*@@ character.
>
> The space may be put inside the export snippet if the intention is to 
> omit it for output formats other than plain text. So current behavior is 
> perfectly reasonable and flexible enough.

Hmm. We actually have a similar scenario in `org-export--prune-tree',
with slightly different logic: trailing spaces are kept only when the
previous object does not already end with whitespace.

I am attaching a tentative patch that duplicates this logic for export
snippets, and for any other object whose transcoder returns nil.
WDYT?
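
To make the rule concrete, here is a minimal sketch that models it on
plain strings rather than on the parse tree. The function name
`my/blank-for-dropped-object' is hypothetical and is not part of Org or
of the attached patch; it only mirrors the condition described above,
under the assumption that the preceding object is available as a string.

```elisp
;; Hypothetical helper, not part of Org: a string-level model of the
;; "keep trailing spaces only if the previous object has none" rule.
(defun my/blank-for-dropped-object (previous-text post-blank)
  "Return the string to emit in place of a dropped object.
PREVIOUS-TEXT is the text right before the object, or nil when the
object comes first.  POST-BLANK is the number of spaces that followed
the object in the source."
  (if (and previous-text
           (> post-blank 0)
           ;; The previous text must not already end with whitespace.
           (not (string-match-p (rx whitespace eos) previous-text)))
      (make-string post-blank ?\s)
    ""))

;; "A space@@ascii:*@@ character." with the snippet dropped:
(message "%S"
         (concat "A space"
                 (my/blank-for-dropped-object "A space" 1)
                 "character."))
```

With "Foo. " as the previous text the helper returns an empty string,
because the space that is already there is enough.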


[-- Attachment #2: 0001-org-export-data-Handle-trailing-spaces-when-transcod.patch --]
[-- Type: text/x-patch, Size: 3472 bytes --]

From 54939c4044fb407b068c0666c258ccd01e59c2af Mon Sep 17 00:00:00 2001
Message-ID: <54939c4044fb407b068c0666c258ccd01e59c2af.1713703523.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Sun, 21 Apr 2024 15:37:18 +0300
Subject: [PATCH 1/2] org-export-data: Handle trailing spaces when transcoder
 returns nil

* lisp/ox.el (org-export-data): When transcoder returns nil, handle
trailing spaces after an object the same way `org-export--prune-tree'
does.  Remove special handling of export snippets that unconditionally
keep their trailing spaces.

Link: https://orgmode.org/list/87h6fwmgkm.fsf@localhost
---
 lisp/ox.el | 43 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 32 insertions(+), 11 deletions(-)

diff --git a/lisp/ox.el b/lisp/ox.el
index fc746950d..ccc4c94ce 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -1930,15 +1930,11 @@ (defun org-export-data (data info)
 			   (eq (plist-get info :with-archived-trees) 'headline)
 			   (org-element-property :archivedp data)))
 		  (let ((transcoder (org-export-transcoder data info)))
-		    (or (and (functionp transcoder)
-                             (if (eq type 'link)
-			         (broken-link-handler
-			          (funcall transcoder data nil info))
-                               (funcall transcoder data nil info)))
-			;; Export snippets never return a nil value so
-			;; that white spaces following them are never
-			;; ignored.
-			(and (eq type 'export-snippet) ""))))
+		    (and (functionp transcoder)
+                         (if (eq type 'link)
+			     (broken-link-handler
+			      (funcall transcoder data nil info))
+                           (funcall transcoder data nil info)))))
 		 ;; Element/Object with contents.
 		 (t
 		  (let ((transcoder (org-export-transcoder data info)))
@@ -1979,8 +1975,33 @@ (defun org-export-data (data info)
 	  (puthash
 	   data
 	   (cond
-	    ((not results) "")
-	    ((memq type '(nil org-data plain-text raw)) results)
+	    ((not results)
+             ;; TRANSCODER returned nil.  When DATA is an object,
+             ;; interpret this as if DATA should be ignored (see
+             ;; `org-export--prune-tree').  Keep spaces in place of
+             ;; removed element, if necessary.
+	     ;; Example: "Foo.[10%] Bar" would become
+	     ;; "Foo.Bar" if we do not keep spaces.
+             ;; Another example: "A space@@ascii:*@@ character."
+             ;; should become "A space character" in non-ASCII export.
+             (let ((post-blank (org-element-post-blank data)))
+               (or
+                (unless (or (not post-blank)
+                            (zerop post-blank)
+                            (eq 'element (org-element-class data)))
+                  (let ((previous (org-export-get-previous-element data info)))
+		    (unless (or (not previous)
+			        (pcase (org-element-type previous)
+				  (`plain-text
+				   (string-match-p
+				    (rx  whitespace eos) previous))
+				  (_ (org-element-post-blank previous))))
+                      ;; When previous element does not have
+                      ;; trailing spaces, keep the trailing
+                      ;; spaces from DATA.
+		      (make-string post-blank ?\s))))
+                "")))
+            ((memq type '(nil org-data plain-text raw)) results)
 	    ;; Append the same white space between elements or objects
 	    ;; as in the original buffer, and call appropriate filters.
 	    (t
-- 
2.44.0


[-- Attachment #3: Type: text/plain, Size: 763 bytes --]


> The issue is empty lines that serve as paragraph separators and that may 
> appear due to expansion of a macro or due to skipped export snippets. 
> Perhaps transcoders of other elements, e.g. links, may return empty 
> strings as well.

Right. This is a special case in MD, where blank lines separate
paragraphs. It is just like ox-latex, where we fixed exactly the same
thing, reported in https://orgmode.org/list/tufdb6$11h2$1@ciao.gmane.io

It is also a side effect of the fact that newlines are not considered
part of Org markup objects.

I do not think that we need to handle this Org mode-wide (it would be
difficult and would likely cause breaking changes). We can adjust only
the export backends that are sensitive to blank lines.
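
As an illustration of the kind of adjustment meant here, the sketch
below applies the regexp that ox-latex already uses to a paragraph
string. The name `my/remove-blank-lines' is hypothetical, chosen so the
snippet can be evaluated on its own. The sample string assumes that an
export snippet sitting on its own line was dropped, leaving the
newlines before and after it.

```elisp
;; Hypothetical stand-alone copy of the blank-line stripping logic.
(defun my/remove-blank-lines (s)
  "Collapse each run of blank lines in S into a single newline."
  (replace-regexp-in-string (rx "\n" (1+ (0+ space) "\n")) "\n" s))

;; A paragraph whose middle line became empty after export.
(my/remove-blank-lines "first line\n\n  \nsecond line\n")
;; => "first line\nsecond line\n"
```

Without this step, a backend that treats blank lines as paragraph
separators would turn the single Org paragraph into two.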

See the attached tentative fix.


[-- Attachment #4: 0001-ox-md-ox-ascii-ox-texinfo-Strip-blank-lines-from-par.patch --]
[-- Type: text/x-patch, Size: 4474 bytes --]

From 08c584b90ca6950e4186093bf5742d7443448254 Mon Sep 17 00:00:00 2001
Message-ID: <08c584b90ca6950e4186093bf5742d7443448254.1713704215.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Sun, 21 Apr 2024 15:54:48 +0300
Subject: [PATCH] ox-md, ox-ascii, ox-texinfo: Strip blank lines from
 paragraphs

* lisp/org-macs.el (org-remove-blank-lines): New helper function to
strip blank lines from string.
* lisp/ox-ascii.el (org-ascii-paragraph):
* lisp/ox-latex.el (org-latex-paragraph):
* lisp/ox-md.el (org-md-paragraph):
* lisp/ox-texinfo.el (org-texinfo-paragraph): Strip blank lines from
paragraphs - these exporters are using blank lines as paragraph
separators.

Reported-by: Max Nikulin <manikulin@gmail.com>
Link: https://orgmode.org/list/v00le7$frp$1@ciao.gmane.io
---
 lisp/org-macs.el   | 4 ++++
 lisp/ox-ascii.el   | 6 ++++++
 lisp/ox-latex.el   | 4 +---
 lisp/ox-md.el      | 6 ++++++
 lisp/ox-texinfo.el | 7 ++++++-
 5 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/lisp/org-macs.el b/lisp/org-macs.el
index 0046f3493..85bf6e4fa 100644
--- a/lisp/org-macs.el
+++ b/lisp/org-macs.el
@@ -1554,6 +1554,10 @@ (defun org-remove-tabs (s &optional width)
 	     t t s)))
   s)
 
+(defun org-remove-blank-lines (s)
+  "Remove blank lines in S."
+  (replace-regexp-in-string (rx "\n" (1+ (0+ space) "\n")) "\n" s))
+
 (defun org-wrap (string &optional width lines)
   "Wrap string to either a number of lines, or a width in characters.
 If WIDTH is non-nil, the string is wrapped to that width, however many lines
diff --git a/lisp/ox-ascii.el b/lisp/ox-ascii.el
index db4356ec6..e767f66cf 100644
--- a/lisp/ox-ascii.el
+++ b/lisp/ox-ascii.el
@@ -1651,6 +1651,12 @@ (defun org-ascii-paragraph (paragraph contents info)
   "Transcode a PARAGRAPH element from Org to ASCII.
 CONTENTS is the contents of the paragraph, as a string.  INFO is
 the plist used as a communication channel."
+  ;; Ensure that we do not create multiple paragraphs, when a single
+  ;; paragraph is expected.
+  ;; Multiple newlines may appear in CONTENTS, for example, when
+  ;; certain objects are stripped from export, leaving single newlines
+  ;; before and after.
+  (setq contents (org-remove-blank-lines contents))
   (org-ascii--justify-element
    (let ((indented-line-width (plist-get info :ascii-indented-line-width)))
      (if (not (wholenump indented-line-width)) contents
diff --git a/lisp/ox-latex.el b/lisp/ox-latex.el
index 8a10f9390..cae7bb3b2 100644
--- a/lisp/ox-latex.el
+++ b/lisp/ox-latex.el
@@ -3040,9 +3040,7 @@ (defun org-latex-paragraph (_paragraph contents _info)
   ;; Multiple newlines may appear in CONTENTS, for example, when
   ;; certain objects are stripped from export, leaving single newlines
   ;; before and after.
-  (replace-regexp-in-string
-   (rx "\n" (1+ (0+ space) "\n")) "\n"
-   contents))
+  (org-remove-blank-lines contents))
 
 
 ;;;; Plain List
diff --git a/lisp/ox-md.el b/lisp/ox-md.el
index fa2beeb95..28f0a4cf6 100644
--- a/lisp/ox-md.el
+++ b/lisp/ox-md.el
@@ -628,6 +628,12 @@ (defun org-md-paragraph (paragraph contents _info)
   "Transcode PARAGRAPH element into Markdown format.
 CONTENTS is the paragraph contents.  INFO is a plist used as
 a communication channel."
+  ;; Ensure that we do not create multiple paragraphs, when a single
+  ;; paragraph is expected.
+  ;; Multiple newlines may appear in CONTENTS, for example, when
+  ;; certain objects are stripped from export, leaving single newlines
+  ;; before and after.
+  (setq contents (org-remove-blank-lines contents))
   (let ((first-object (car (org-element-contents paragraph))))
     ;; If paragraph starts with a #, protect it.
     (if (and (stringp first-object) (string-prefix-p "#" first-object))
diff --git a/lisp/ox-texinfo.el b/lisp/ox-texinfo.el
index 4aef9c41c..fc9ec9209 100644
--- a/lisp/ox-texinfo.el
+++ b/lisp/ox-texinfo.el
@@ -1517,7 +1517,12 @@ (defun org-texinfo-paragraph (_paragraph contents _info)
   "Transcode a PARAGRAPH element from Org to Texinfo.
 CONTENTS is the contents of the paragraph, as a string.  INFO is
 the plist used as a communication channel."
-  contents)
+  ;; Ensure that we do not create multiple paragraphs, when a single
+  ;; paragraph is expected.
+  ;; Multiple newlines may appear in CONTENTS, for example, when
+  ;; certain objects are stripped from export, leaving single newlines
+  ;; before and after.
+  (org-remove-blank-lines contents))
 
 ;;;; Plain List
 
-- 
2.44.0


[-- Attachment #5: Type: text/plain, Size: 224 bytes --]


-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
