From: Ihor Radchenko <yantar92@gmail.com>
To: K K <k_foreign@outlook.com>
Cc: Max Nikulin <manikulin@gmail.com>,
"emacs-orgmode@gnu.org" <emacs-orgmode@gnu.org>
Subject: [PATCH] Add new entity \-- serving as markup separator/escape symbol
Date: Thu, 28 Jul 2022 21:17:32 +0800
Message-ID: <87mtct9y1f.fsf@localhost>
In-Reply-To: <87v8rkav2x.fsf@localhost>
[-- Attachment #1: Type: text/plain, Size: 1948 bytes --]
Ihor Radchenko <yantar92@gmail.com> writes:
> I am attaching a tentative patch that will make Org export remove
> zero-width spaces when those spaces actually separate the object
> boundaries.
>
> Any objections?
Given the objections raised, the zero-width space does not appear to be a
useful escape symbol, because it has valid uses as a standalone space
symbol.
The objections could be addressed with some kind of intricate
heuristics, but I do not think that is a good direction to go. The
code would be too complex and fragile.
Therefore, I am proposing a different approach for shielding
fontification: introducing a special entity.
The new entity is \--, which is a valid boundary between emphasis
markup. It will be removed during export (replaced by "").
The choice of "\--" specifically is somewhat arbitrary. The actual requirements
for the entity name are: (1) no clash with LaTeX (which is why the simpler
\- would not cut it); (2) being a valid markup boundary: the entity must end
with one of (any space ?- ?\( ?' ?\" ?\{).
I am attaching a tentative patch introducing the new entity. Note that
some minor tweaks to the parser were needed. I do not see it as a big
deal - the current entity regexp has much more cumbersome exceptions.
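
To see what the parser tweak changes, the entity regexp before and after
the patch can be compared directly. This is a minimal sketch: the two
regexps are copied from the org-element-entity-parser hunk of the
attached patch, while the names my/entity-re-old, my/entity-re-new and
my/entity-name are made up for illustration and are not part of Org.

```emacs-lisp
;; Entity regexps copied from `org-element-entity-parser' in the patch.
;; The `my/' names are hypothetical, for illustration only.
(defvar my/entity-re-old
  "\\\\\\(?:\\(?1:_ +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)"
  "Entity regexp before the patch.")

(defvar my/entity-re-new
  "\\\\\\(?:\\(?1:_ +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z-]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)"
  "Entity regexp after the patch: \"-\" is allowed in entity names.")

(defun my/entity-name (regexp string)
  "Return the entity name REGEXP finds at the start of STRING, or nil."
  (when (string-match (concat "\\`" regexp) string)
    (match-string 1 string)))

;; Evaluates to (nil "--" "--bar").
(list (my/entity-name my/entity-re-old "\\--*bold*")
      (my/entity-name my/entity-re-new "\\--*bold*")
      (my/entity-name my/entity-re-new "\\--bar baz"))
```

The third result points at a case that may be worth checking: when the
separator is directly followed by letters, the greedy [a-zA-Z-]+ takes
them as part of the name, so the lookup would presumably be for "--bar"
rather than "--".
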
Also, the patch will not work correctly on org → org export, as was
pointed out in one of the replies to the previous, abandoned approach. I do
not want to address that here because a much more appropriate solution for
this issue is changing org-element-interpret-data.
Consider (org-element-interpret-data '("asd" (bold () "bold") "bsd")).
This will return "asd*bold*bsd", which is not correct (the "*" markers
no longer sit at valid emphasis boundaries) even though the
given Org datum is not wrong by itself - such things can easily appear
when user filters are applied to the parse tree during org→org export.
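
The round-trip problem can be checked with a short snippet. This is only
a sketch under assumptions: my/count-bold is a hypothetical helper, not
an Org function, and it assumes an Org version where
org-element-parse-buffer is run in an org-mode buffer.

```emacs-lisp
(require 'org)
(require 'org-element)

(defun my/count-bold (text)
  "Hypothetical helper: count bold objects in TEXT parsed as Org."
  (with-temp-buffer
    (org-mode)
    (insert text)
    (length (org-element-map (org-element-parse-buffer) 'bold #'identity))))

;; Interpret the datum back to Org syntax, then parse the result again.
(let ((interpreted
       (org-element-interpret-data '("asd" (bold () "bold") "bsd"))))
  (list interpreted (my/count-bold interpreted)))
```

Per the description above, the first element should be "asd*bold*bsd".
If the second element is 0, the bold object did not survive the round
trip.
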
Otherwise, the patch should be good enough to play around with and to
kick-start the discussion.
WDYT?
Best,
Ihor
[-- Attachment #2: 0001-Add-new-entity-serving-as-markup-separator-escape-sy.patch --]
[-- Type: text/x-patch, Size: 2994 bytes --]
From 521a4b06578cf37f22e9f33d2f45b967419ad3a3 Mon Sep 17 00:00:00 2001
Message-Id: <521a4b06578cf37f22e9f33d2f45b967419ad3a3.1659013441.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Thu, 28 Jul 2022 21:02:26 +0800
Subject: [PATCH] Add new entity \-- serving as markup separator/escape symbol
* lisp/org-entities.el (org-entities): Add \-- entity. This entity is
exported as an empty string and simply serves as markup separator if
the user needs any.
* lisp/org.el (org-fontify-entities):
* lisp/org-element.el (org-element-entity-parser):
(org-element--set-regexps): Update entity regexp to match "-".
---
lisp/org-element.el | 4 ++--
lisp/org-entities.el | 4 ++++
lisp/org.el | 2 +-
3 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/lisp/org-element.el b/lisp/org-element.el
index 9e9b7c5ec..6405b4db8 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -258,7 +258,7 @@ (defun org-element--set-regexps ()
"\\$"
;; Objects starting with "\": line break,
;; entity, latex fragment.
- "\\\\\\(?:[a-zA-Z[(]\\|\\\\[ \t]*$\\|_ +\\)"
+ "\\\\\\(?:[-a-zA-Z[(]\\|\\\\[ \t]*$\\|_ +\\)"
;; Objects starting with raw text: inline Babel
;; source block, inline Babel call.
"\\(?:call\\|src\\)_"))
@@ -3158,7 +3158,7 @@ (defun org-element-entity-parser ()
Assume point is at the beginning of the entity."
(catch 'no-object
- (when (looking-at "\\\\\\(?:\\(?1:_ +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)")
+ (when (looking-at "\\\\\\(?:\\(?1:_ +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z-]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)")
(save-excursion
(let* ((value (or (org-entity-get (match-string 1))
(throw 'no-object nil)))
diff --git a/lisp/org-entities.el b/lisp/org-entities.el
index d35e3fa8a..9d79d23fc 100644
--- a/lisp/org-entities.el
+++ b/lisp/org-entities.el
@@ -264,6 +264,10 @@ (defconst org-entities
("rsaquo" "\\guilsinglright{}" nil "›" ">" ">" "›")
"* Other"
+
+ "** Escaping Org markup"
+ ("--" "" nil "" "" "" "")
+
"** Misc. (often used)"
("circ" "\\^{}" nil "ˆ" "^" "^" "∘")
("vert" "\\vert{}" t "|" "|" "|" "|")
diff --git a/lisp/org.el b/lisp/org.el
index 937892ef3..29ccff83b 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5828,7 +5828,7 @@ (defun org-fontify-entities (limit)
;; i.e., "\_ ", could be fontified anyway, and it would be
;; confusing when adding a second white space character.
(while (re-search-forward
- "\\\\\\(there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)"
+ "\\\\\\(there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z-]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)"
limit t)
(when (and (not (org-at-comment-p))
(setq ee (org-entity-get (match-string 1)))
--
2.35.1