From: Ihor Radchenko <yantar92@posteo.net>
To: Max Nikulin <manikulin@gmail.com>
Cc: emacs-orgmode@gnu.org
Subject: Re: Trailing whitespace after export snippets without a transcoder
Date: Sun, 21 Apr 2024 13:00:10 +0000
Message-ID: <87wmoqq3ad.fsf@localhost>
In-Reply-To: <v00le7$frp$1@ciao.gmane.io>
[-- Attachment #1: Type: text/plain, Size: 891 bytes --]
Max Nikulin <manikulin@gmail.com> writes:
>> I have no clue about the rationale of this special behaviour - it dates
>> back to the days when Org export was merged. It is also not documented
>> anywhere, AFAIK.
>
> I would not expect that the space after the following export snippet is
> ignored in the case of ox-html or ox-latex backend:
>
> A space@@ascii:*@@ character.
>
> The space may be put inside the export snippet if the intention is to
> omit it for output formats other than plain text. So current behavior is
> perfectly reasonable and flexible enough.
Hmm. We actually have a similar scenario in `org-export--prune-tree',
with slightly different logic: keep the trailing spaces only when the
previous object does not have any of its own.

I am attaching a tentative patch that duplicates that logic for export
snippets as well, and for any other object whose transcoder returns nil.

WDYT?
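
To make the rule concrete, here is a minimal sketch that works on plain
strings rather than on the parse tree. `my/pruned-object-replacement' is
a hypothetical helper written only for this illustration; it is not part
of ox.el, and the actual patch inspects Org elements and their
post-blank value instead of strings.

```elisp
;; Hypothetical illustration, not code from ox.el.
(defun my/pruned-object-replacement (previous-text post-blank)
  "Return the string to emit in place of an object dropped from export.
PREVIOUS-TEXT is the text right before the object, or nil when there
is none.  POST-BLANK is the number of spaces that followed the object."
  (if (and (integerp post-blank)
           (> post-blank 0)
           previous-text
           ;; Does the previous text already end with whitespace?
           (not (string-match-p "[ \t\n]\\'" previous-text)))
      ;; No trailing whitespace before the object: keep its spaces.
      (make-string post-blank ?\s)
    ""))

;; "Foo.[10%] Bar" with the cookie dropped:
(concat "Foo." (my/pruned-object-replacement "Foo." 1) "Bar")
;; => "Foo. Bar"

;; "Foo @@ascii:*@@ bar" with the snippet dropped:
(concat "Foo " (my/pruned-object-replacement "Foo " 1) "bar")
;; => "Foo bar"
```

In the second call the previous text already ends with a space, so the
spaces after the dropped object are discarded and the output does not
get a doubled space.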
[-- Attachment #2: 0001-org-export-data-Handle-trailing-spaces-when-transcod.patch --]
[-- Type: text/x-patch, Size: 3472 bytes --]
From 54939c4044fb407b068c0666c258ccd01e59c2af Mon Sep 17 00:00:00 2001
Message-ID: <54939c4044fb407b068c0666c258ccd01e59c2af.1713703523.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Sun, 21 Apr 2024 15:37:18 +0300
Subject: [PATCH 1/2] org-export-data: Handle trailing spaces when transcoder
returns nil
* lisp/ox.el (org-export-data): When transcoder returns nil, handle
trailing spaces after an object the same way `org-export--prune-tree'
does. Remove special handling of export snippets that unconditionally
keep their trailing spaces.
Link: https://orgmode.org/list/87h6fwmgkm.fsf@localhost
---
lisp/ox.el | 43 ++++++++++++++++++++++++++++++++-----------
1 file changed, 32 insertions(+), 11 deletions(-)
diff --git a/lisp/ox.el b/lisp/ox.el
index fc746950d..ccc4c94ce 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -1930,15 +1930,11 @@ (defun org-export-data (data info)
(eq (plist-get info :with-archived-trees) 'headline)
(org-element-property :archivedp data)))
(let ((transcoder (org-export-transcoder data info)))
- (or (and (functionp transcoder)
- (if (eq type 'link)
- (broken-link-handler
- (funcall transcoder data nil info))
- (funcall transcoder data nil info)))
- ;; Export snippets never return a nil value so
- ;; that white spaces following them are never
- ;; ignored.
- (and (eq type 'export-snippet) ""))))
+ (and (functionp transcoder)
+ (if (eq type 'link)
+ (broken-link-handler
+ (funcall transcoder data nil info))
+ (funcall transcoder data nil info)))))
;; Element/Object with contents.
(t
(let ((transcoder (org-export-transcoder data info)))
@@ -1979,8 +1975,33 @@ (defun org-export-data (data info)
(puthash
data
(cond
- ((not results) "")
- ((memq type '(nil org-data plain-text raw)) results)
+ ((not results)
+ ;; TRANSCODER returned nil. When DATA is an object,
+ ;; interpret this as if DATA should be ignored (see
+ ;; `org-export--prune-tree'). Keep spaces in place of
+ ;; removed element, if necessary.
+ ;; Example: "Foo.[10%] Bar" would become
+ ;; "Foo.Bar" if we do not keep spaces.
+ ;; Another example: "A space@@ascii:*@@ character."
+ ;; should become "A space character" in non-ASCII export.
+ (let ((post-blank (org-element-post-blank data)))
+ (or
+ (unless (or (not post-blank)
+ (zerop post-blank)
+ (eq 'element (org-element-class data)))
+ (let ((previous (org-export-get-previous-element data info)))
+ (unless (or (not previous)
+ (pcase (org-element-type previous)
+ (`plain-text
+ (string-match-p
+ (rx whitespace eos) previous))
+ (_ (org-element-post-blank previous))))
+ ;; When previous element does not have
+ ;; trailing spaces, keep the trailing
+ ;; spaces from DATA.
+ (make-string post-blank ?\s))))
+ "")))
+ ((memq type '(nil org-data plain-text raw)) results)
;; Append the same white space between elements or objects
;; as in the original buffer, and call appropriate filters.
(t
--
2.44.0
[-- Attachment #3: Type: text/plain, Size: 763 bytes --]
> The issue is empty lines that serve as paragraph separators and that may
> appear due to expansion of a macro or due to skipped export snippets.
> Perhaps transcoders of other elements, e.g. links, may return empty
> strings as well.
Right. This is a special case in MD, where blank lines separate
paragraphs. It is just like ox-latex, where we fixed exactly the same
thing, reported in https://orgmode.org/list/tufdb6$11h2$1@ciao.gmane.io

It is also a side effect of the fact that newlines are not considered
part of Org markup objects.

I do not think that we need to handle this Org mode-wide (it would be
difficult and would likely cause breaking changes). We can adjust only
the export backends that are sensitive to blank lines.

See the attached tentative fix.
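
As a quick illustration of what the fix does to paragraph contents, the
sketch below applies the same regexp the patch uses to a string in which
a skipped object has left an empty line behind. `my/remove-blank-lines'
is a hypothetical stand-alone copy used only here; the patch itself
names the helper `org-remove-blank-lines'.

```elisp
(require 'rx)

;; Hypothetical stand-alone copy for illustration only.
(defun my/remove-blank-lines (s)
  "Collapse each run of empty or whitespace-only lines in S."
  (replace-regexp-in-string (rx "\n" (1+ (0+ space) "\n")) "\n" s))

;; An object that exported to nothing sat alone on the middle line:
(my/remove-blank-lines "Text before\n\ntext after.")
;; => "Text before\ntext after."
```

Without the stripping, a backend such as ox-md would emit the empty line
as is, and the single Org paragraph would be read as two paragraphs in
the output.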
[-- Attachment #4: 0001-ox-md-ox-ascii-ox-texinfo-Strip-blank-lines-from-par.patch --]
[-- Type: text/x-patch, Size: 4474 bytes --]
From 08c584b90ca6950e4186093bf5742d7443448254 Mon Sep 17 00:00:00 2001
Message-ID: <08c584b90ca6950e4186093bf5742d7443448254.1713704215.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Sun, 21 Apr 2024 15:54:48 +0300
Subject: [PATCH] ox-md, ox-ascii, ox-texinfo: Strip blank lines from
paragraphs
* lisp/org-macs.el (org-remove-blank-lines): New helper function to
strip blank lines from string.
* lisp/ox-ascii.el (org-ascii-paragraph):
* lisp/ox-latex.el (org-latex-paragraph):
* lisp/ox-md.el (org-md-paragraph):
* lisp/ox-texinfo.el (org-texinfo-paragraph): Strip blank lines from
paragraphs - these exporters are using blank lines as paragraph
separators.
Reported-by: Max Nikulin <manikulin@gmail.com>
Link: https://orgmode.org/list/v00le7$frp$1@ciao.gmane.io
---
lisp/org-macs.el | 4 ++++
lisp/ox-ascii.el | 6 ++++++
lisp/ox-latex.el | 4 +---
lisp/ox-md.el | 6 ++++++
lisp/ox-texinfo.el | 7 ++++++-
5 files changed, 23 insertions(+), 4 deletions(-)
diff --git a/lisp/org-macs.el b/lisp/org-macs.el
index 0046f3493..85bf6e4fa 100644
--- a/lisp/org-macs.el
+++ b/lisp/org-macs.el
@@ -1554,6 +1554,10 @@ (defun org-remove-tabs (s &optional width)
t t s)))
s)
+(defun org-remove-blank-lines (s)
+ "Remove blank lines in S."
+ (replace-regexp-in-string (rx "\n" (1+ (0+ space) "\n")) "\n" s))
+
(defun org-wrap (string &optional width lines)
"Wrap string to either a number of lines, or a width in characters.
If WIDTH is non-nil, the string is wrapped to that width, however many lines
diff --git a/lisp/ox-ascii.el b/lisp/ox-ascii.el
index db4356ec6..e767f66cf 100644
--- a/lisp/ox-ascii.el
+++ b/lisp/ox-ascii.el
@@ -1651,6 +1651,12 @@ (defun org-ascii-paragraph (paragraph contents info)
"Transcode a PARAGRAPH element from Org to ASCII.
CONTENTS is the contents of the paragraph, as a string. INFO is
the plist used as a communication channel."
+ ;; Ensure that we do not create multiple paragraphs, when a single
+ ;; paragraph is expected.
+ ;; Multiple newlines may appear in CONTENTS, for example, when
+ ;; certain objects are stripped from export, leaving single newlines
+ ;; before and after.
+ (setq contents (org-remove-blank-lines contents))
(org-ascii--justify-element
(let ((indented-line-width (plist-get info :ascii-indented-line-width)))
(if (not (wholenump indented-line-width)) contents
diff --git a/lisp/ox-latex.el b/lisp/ox-latex.el
index 8a10f9390..cae7bb3b2 100644
--- a/lisp/ox-latex.el
+++ b/lisp/ox-latex.el
@@ -3040,9 +3040,7 @@ (defun org-latex-paragraph (_paragraph contents _info)
;; Multiple newlines may appear in CONTENTS, for example, when
;; certain objects are stripped from export, leaving single newlines
;; before and after.
- (replace-regexp-in-string
- (rx "\n" (1+ (0+ space) "\n")) "\n"
- contents))
+ (org-remove-blank-lines contents))
;;;; Plain List
diff --git a/lisp/ox-md.el b/lisp/ox-md.el
index fa2beeb95..28f0a4cf6 100644
--- a/lisp/ox-md.el
+++ b/lisp/ox-md.el
@@ -628,6 +628,12 @@ (defun org-md-paragraph (paragraph contents _info)
"Transcode PARAGRAPH element into Markdown format.
CONTENTS is the paragraph contents. INFO is a plist used as
a communication channel."
+ ;; Ensure that we do not create multiple paragraphs, when a single
+ ;; paragraph is expected.
+ ;; Multiple newlines may appear in CONTENTS, for example, when
+ ;; certain objects are stripped from export, leaving single newlines
+ ;; before and after.
+ (setq contents (org-remove-blank-lines contents))
(let ((first-object (car (org-element-contents paragraph))))
;; If paragraph starts with a #, protect it.
(if (and (stringp first-object) (string-prefix-p "#" first-object))
diff --git a/lisp/ox-texinfo.el b/lisp/ox-texinfo.el
index 4aef9c41c..fc9ec9209 100644
--- a/lisp/ox-texinfo.el
+++ b/lisp/ox-texinfo.el
@@ -1517,7 +1517,12 @@ (defun org-texinfo-paragraph (_paragraph contents _info)
"Transcode a PARAGRAPH element from Org to Texinfo.
CONTENTS is the contents of the paragraph, as a string. INFO is
the plist used as a communication channel."
- contents)
+ ;; Ensure that we do not create multiple paragraphs, when a single
+ ;; paragraph is expected.
+ ;; Multiple newlines may appear in CONTENTS, for example, when
+ ;; certain objects are stripped from export, leaving single newlines
+ ;; before and after.
+ (org-remove-blank-lines contents))
;;;; Plain List
--
2.44.0
[-- Attachment #5: Type: text/plain, Size: 224 bytes --]
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>