From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kaushal Modi
Subject: [PATCH] Fix for double-escaping # and ![ in ox-md
Date: Wed, 20 Dec 2017 18:02:33 +0000
Message-ID:
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="001a114075daead6cd0560c9634b"
Return-path:
Received: from eggs.gnu.org ([2001:4830:134:3::10]:45972) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eRihg-00032a-1F for emacs-orgmode@gnu.org; Wed, 20 Dec 2017 13:02:53 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eRihZ-0001TJ-Sp for emacs-orgmode@gnu.org; Wed, 20 Dec 2017 13:02:52 -0500
Received: from mail-yw0-x22c.google.com ([2607:f8b0:4002:c05::22c]:42497) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eRihZ-0001St-Mw for emacs-orgmode@gnu.org; Wed, 20 Dec 2017 13:02:45 -0500
Received: by mail-yw0-x22c.google.com with SMTP id t189so2027741ywg.9 for ; Wed, 20 Dec 2017 10:02:45 -0800 (PST)
List-Id: "General discussions about Org-mode."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: "Emacs-orgmode"
To: emacs-org list

--001a114075daead6cd0560c9634b
Content-Type: text/plain; charset="UTF-8"

Hello,

I have this test case; export it by doing C-c C-e C-s m M (assuming ox-md is required.. I think it is required by default).

=====
* Escaping hashes and exclamations correctly in body
:PROPERTIES:
:EXPORT_FILE_NAME: escaping-hashes-and-exclamations-in-body
:END:
I intend to show these # characters verbatim; they should not render
as Markdown headings. They also shouldn't show up with a leading =\=
in the final rendered output.

# This is an Org comment

#This is not an Org comment. It has a hash char at beginning of a
paragraph which must be escaped just once i.e. show up as =\#= in
Markdown.
blah # This isn't an Org comment either

This * will be escaped just once i.e. show up as =\*= in Markdown.

This _ will be escaped just once i.e. show up as =\_= in Markdown.

This \ will be escaped just once i.e. show up as =\\= in Markdown.

Hash char at beginning of a continued line
#like this must be escaped just once i.e. show up as =\#= in Markdown.

![this exclamation must be escaped just once i.e. show up as =\!= in
Markdown]

This ! does not need to be escaped as there is no ambiguity.
=====

Here's the relevant excerpt of the export that is erroneous:

=====
Hash char at beginning of a continued line
\\#like this must be escaped just once i.e. show up as `\#` in Markdown.

\\![this exclamation must be escaped just once i.e. show up as `\!` in
Markdown]
=====

Note that the # and ![ are double-escaped. So any markdown->HTML renderer will print that "\" as it is.
Digging through ox-md.el, I found that the order of replace-regexp-in-string was incorrect in org-md-plain-text.

Here's the diff:

=====
diff --git a/lisp/ox-md.el b/lisp/ox-md.el
index 12188387355..927a73b780c 100644
--- a/lisp/ox-md.el
+++ b/lisp/ox-md.el
@@ -500,14 +500,15 @@ TEXT is the string to transcode.  INFO is a plist holding
 contextual information."
   (when (plist-get info :with-smart-quotes)
     (setq text (org-export-activate-smart-quotes text :html info)))
+  ;; The below series of replacements in `text' is order sensitive.
+  ;; Protect `, *, _, and \
+  (setq text (replace-regexp-in-string "[`*_\\]" "\\\\\\&" text))
   ;; Protect ambiguous #.  This will protect # at the beginning of
   ;; a line, but not at the beginning of a paragraph.  See
   ;; `org-md-paragraph'.
   (setq text (replace-regexp-in-string "\n#" "\n\\\\#" text))
   ;; Protect ambiguous !
   (setq text (replace-regexp-in-string "\\(!\\)\\[" "\\\\!" text nil nil 1))
-  ;; Protect `, *, _ and \
-  (setq text (replace-regexp-in-string "[`*_\\]" "\\\\\\&" text))
   ;; Handle special strings, if required.
   (when (plist-get info :with-special-strings)
     (setq text (org-html-convert-special-strings text)))
=====

After applying this patch, the same portion exported doesn't have double-escaping before # and ![.

If this looks good, I can commit this to maint.

--

Kaushal Modi

--001a114075daead6cd0560c9634b
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--001a114075daead6cd0560c9634b--
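
The order sensitivity described in the message can be reproduced outside the exporter. The following is a minimal sketch, assuming only the three replace-regexp-in-string calls shown in the diff; the function names my/md-escape-old-order and my/md-escape-new-order are hypothetical and are not part of ox-md.el.

```elisp
;; Hypothetical helpers for illustration only; not part of ox-md.el.
;; Both apply the same three replacements from the diff, in different orders.

(defun my/md-escape-old-order (text)
  "Pre-patch order: the generic escaping pass runs last."
  ;; Protect # at the beginning of a continued line.
  (setq text (replace-regexp-in-string "\n#" "\n\\\\#" text))
  ;; Protect ! when it is directly followed by [.
  (setq text (replace-regexp-in-string "\\(!\\)\\[" "\\\\!" text nil nil 1))
  ;; Generic pass: this also escapes the backslashes added just above.
  (replace-regexp-in-string "[`*_\\]" "\\\\\\&" text))

(defun my/md-escape-new-order (text)
  "Patched order: the generic escaping pass runs first."
  ;; Generic pass sees only backslashes that were in the original text.
  (setq text (replace-regexp-in-string "[`*_\\]" "\\\\\\&" text))
  ;; Protect # at the beginning of a continued line.
  (setq text (replace-regexp-in-string "\n#" "\n\\\\#" text))
  ;; Protect ! when it is directly followed by [.
  (replace-regexp-in-string "\\(!\\)\\[" "\\\\!" text nil nil 1))
```

Evaluating (my/md-escape-old-order "line\n#like") yields two backslashes before the #, while (my/md-escape-new-order "line\n#like") yields one. The same holds for "![x]". Text that contains only `, *, _ or \ comes out identical under both orders, which is consistent with the double-escaping being limited to # and ![.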