* org-html-use-unicode-chars breaks source code blocks
@ 2015-08-04 13:40 Vladimir Alexiev
2015-08-04 17:35 ` Rasmus
0 siblings, 1 reply; 14+ messages in thread
From: Vladimir Alexiev @ 2015-08-04 13:40 UTC (permalink / raw)
To: emacs-orgmode
Hi!
I've set org-html-use-unicode-chars since I want ox-html to leave IRIs as IRIs.
But this has another undesired effect: it breaks <URL> references in code,
since it doesn't escape the brackets.
For example, this:
#+BEGIN_SRC Turtle
@prefix aat: <http://vocab.getty.edu/aat/>.
#+END_SRC
results in the URL being invisible in the exported HTML.
The fault is here:
(defun org-html-final-function (contents backend info)
  ...
    (when org-html-use-unicode-chars
      (require 'mm-url)
      (mm-url-decode-entities))
Earlier code carefully escapes the characters listed in
org-html-protect-char-alist, only for mm-url-decode-entities to unescape
them again on the whole output.
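The round trip can be sketched as follows. This is a minimal illustration, not the actual ox-html code path: `my/html-escape' is a hypothetical stand-in for the escaping driven by org-html-protect-char-alist, and `mm-url-decode-entities-string' is the string variant of the function named above, from mm-url.el.

```emacs-lisp
;; Sketch only: `my/html-escape' is a hypothetical helper, not part of Org.
(require 'mm-url)

(defun my/html-escape (text)
  "Return TEXT with &, < and > replaced by HTML entities."
  ;; Replace & first so the entities produced below are not escaped twice.
  (dolist (pair '(("&" . "&amp;") ("<" . "&lt;") (">" . "&gt;")) text)
    (setq text (replace-regexp-in-string (car pair) (cdr pair) text t t))))

(let* ((source "@prefix aat: <http://vocab.getty.edu/aat/>.")
       ;; Step 1: what the exporter does to protect the code block.
       (escaped (my/html-escape source))
       ;; Step 2: what decoding the whole output does afterwards.
       (decoded (mm-url-decode-entities-string escaped)))
  ;; If DECODED equals SOURCE, the raw brackets are back in the HTML and a
  ;; browser will treat <http://...> as an unknown tag, hiding the URL.
  (list escaped decoded (string= decoded source)))
```

If the last element of the result is non-nil, the escaping was undone, which matches the invisible URL described above.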
http://article.gmane.org/gmane.emacs.orgmode/94742 is somewhat related.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-04 13:40 org-html-use-unicode-chars breaks source code blocks Vladimir Alexiev
@ 2015-08-04 17:35 ` Rasmus
2015-08-04 18:37 ` Nicolas Goaziou
0 siblings, 1 reply; 14+ messages in thread
From: Rasmus @ 2015-08-04 17:35 UTC (permalink / raw)
To: emacs-orgmode
Hi,
Vladimir Alexiev <vladimir.alexiev@ontotext.com> writes:
> I've set org-html-use-unicode-chars since I want ox-html to leave IRIs as IRIs.
> But this has another undesired effect: it breaks <URL> references in code,
> since it doesn't escape the brackets.
I think this should only apply to entities. Any reason to do it on the
whole output? Nicolas?
This patch makes that change.
Rasmus
--
This message is brought to you by the department of redundant departments
[-- Attachment #2: 0001-ox-html-Only-translate-entities-to-UTF-8.patch --]
[-- Type: text/x-diff, Size: 1940 bytes --]
From 535366ec1e1819c73bb038712a19f5e1be0a51b7 Mon Sep 17 00:00:00 2001
From: Rasmus <rasmus@gmx.us>
Date: Tue, 4 Aug 2015 19:12:00 +0200
Subject: [PATCH 1/4] ox-html: Only translate entities to UTF-8
* ox-html.el (org-html-final-function): Do not check
:html-use-unicode-chars.
(org-html-entity): Check :html-use-unicode-chars
(org-html-use-unicode-chars): Update docstring.
Reported-by: Vladimir Alexiev <vladimir.alexiev@ontotext.com>
<http://permalink.gmane.org/gmane.emacs.orgmode/99451>
---
lisp/ox-html.el | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/lisp/ox-html.el b/lisp/ox-html.el
index 2c13bf6..c329b72 100644
--- a/lisp/ox-html.el
+++ b/lisp/ox-html.el
@@ -609,10 +609,10 @@ Warning: non-nil may break indentation of source code blocks."
   :type 'boolean)
 (defcustom org-html-use-unicode-chars nil
-  "Non-nil means to use unicode characters instead of HTML entities."
+  "Non-nil means to use unicode characters for org-entities instead of HTML codes."
   :group 'org-export-html
-  :version "24.4"
-  :package-version '(Org . "8.0")
+  :version "25.1"
+  :package-version '(Org . "8.3")
   :type 'boolean)
 ;;;; Drawers
@@ -2359,7 +2359,9 @@ holding contextual information. See `org-export-data'."
   "Transcode an ENTITY object from Org to HTML.
 CONTENTS are the definition itself. INFO is a plist holding
 contextual information."
-  (org-element-property :html entity))
+  (if (plist-get info :html-use-unicode-chars)
+      (org-element-property :utf-8 entity)
+    (org-element-property :html entity)))
 ;;;; Example Block
@@ -3500,9 +3502,6 @@ contextual information."
     (set-auto-mode t)
     (if (plist-get info :html-indent)
         (indent-region (point-min) (point-max)))
-    (when (plist-get info :html-use-unicode-chars)
-      (require 'mm-url)
-      (mm-url-decode-entities))
     (buffer-substring-no-properties (point-min) (point-max))))
\f
--
2.5.0
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-04 17:35 ` Rasmus
@ 2015-08-04 18:37 ` Nicolas Goaziou
2015-08-07 9:56 ` Rasmus
0 siblings, 1 reply; 14+ messages in thread
From: Nicolas Goaziou @ 2015-08-04 18:37 UTC (permalink / raw)
To: Rasmus; +Cc: emacs-orgmode
Hello,
Rasmus <rasmus@gmx.us> writes:
> Vladimir Alexiev <vladimir.alexiev@ontotext.com> writes:
>
>> I've set org-html-use-unicode-chars since I want ox-html to leave IRIs as IRIs.
>> But this has another undesired effect: it breaks <URL> references in code,
>> since it doesn't escape the brackets.
>
> I think this should only apply to entities. Any reason to do it on the
> whole output? Nicolas?
It was introduced in e8742b78e0a982a7fca0bf25b4f3551be58660ef. I'm not
sure about the intent of this variable but I tend to think it is about
beautification of the output. As a consequence, it isn't meant to apply
to Org entities specifically.
However, as you noticed, it is not subtle enough to apply
`mm-url-decode-entities' on the full output. It needs to be applied
piece-wise wherever that makes sense. `org-html-entity' is one case.
Maybe `org-html-plain-text' for another one.
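A piece-wise approach along these lines might look like the sketch below. `my/maybe-decode-entities' is a hypothetical helper, not an existing Org function; it assumes the export plist carries :html-use-unicode-chars, as in the patch above, and it would be called only from transcoders where decoding is wanted.

```emacs-lisp
;; Sketch only: `my/maybe-decode-entities' is a hypothetical helper.
(require 'mm-url)

(defun my/maybe-decode-entities (text info)
  "Return TEXT with HTML entities decoded if INFO asks for Unicode.
INFO is an export plist; only :html-use-unicode-chars is consulted."
  (if (plist-get info :html-use-unicode-chars)
      (mm-url-decode-entities-string text)
    text))

;; A prose fragment would go through the helper...
(my/maybe-decode-entities "caf&eacute;" '(:html-use-unicode-chars t))
;; ...and is returned untouched when the option is off.
(my/maybe-decode-entities "caf&eacute;" nil)
```

Source block contents would never be passed to such a helper, so their escaped brackets would survive. One caveat: if it runs after the plain-text escaping, it undoes that escaping for ordinary text too, so the order of the two steps matters.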
Regards,
--
Nicolas Goaziou
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-04 18:37 ` Nicolas Goaziou
@ 2015-08-07 9:56 ` Rasmus
2015-08-07 10:37 ` Nicolas Goaziou
0 siblings, 1 reply; 14+ messages in thread
From: Rasmus @ 2015-08-07 9:56 UTC (permalink / raw)
To: emacs-orgmode
Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
> Hello,
>
> Rasmus <rasmus@gmx.us> writes:
>
>> Vladimir Alexiev <vladimir.alexiev@ontotext.com> writes:
>>
>>> I've set org-html-use-unicode-chars since I want ox-html to leave IRIs as IRIs.
>>> But this has another undesired effect: it breaks <URL> references in code,
>>> since it doesn't escape the brackets.
>>
>> I think this should only apply to entities. Any reason to do it on the
>> whole output? Nicolas?
>
> It was introduced in e8742b78e0a982a7fca0bf25b4f3551be58660ef. I'm not
> sure about the intent of this variable but I tend to think it is about
> beautification of the output. As a consequence, it isn't meant to apply
> to Org entities specifically.
>
> However, as you noticed, it is not subtle enough to apply
> `mm-url-decode-entities' on the full output. It needs to be applied
> piece-wise wherever that makes sense. `org-html-entity' is one case.
> Maybe `org-html-plain-text' for another one.
OK. I added it to plain-text as well. What is an example of a plain-text
that would need beautification?
Should we apply it to snippets as well? In the spirit of "beautification"
it would make sense, but it could also seem like a bad choice.
Rasmus
--
Slowly unravels in a ball of yarn and the devil collects it
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-07 9:56 ` Rasmus
@ 2015-08-07 10:37 ` Nicolas Goaziou
2015-08-07 10:57 ` Rasmus
0 siblings, 1 reply; 14+ messages in thread
From: Nicolas Goaziou @ 2015-08-07 10:37 UTC (permalink / raw)
To: Rasmus; +Cc: emacs-orgmode
> OK. I added it to plain-text as well. What is an example of a plain-text
> that would need beautification?
To tell the truth, I don't know why we need beautification in the first
place. Bastien introduced it, so he may be able to answer.
> Should we apply it to snippets as well? In the spirit of "beautification"
> it would make sense, but it could also seem like a bad choice.
I think it makes sense to apply it to snippets, indeed, but see above.
Regards,
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-07 10:37 ` Nicolas Goaziou
@ 2015-08-07 10:57 ` Rasmus
2015-08-08 21:09 ` Andreas Leha
2015-08-16 14:03 ` Bastien Guerry
0 siblings, 2 replies; 14+ messages in thread
From: Rasmus @ 2015-08-07 10:57 UTC (permalink / raw)
To: emacs-orgmode
Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
>> OK. I added it to plain-text as well. What is an example of a plain-text
>> that would need beautification?
>
> To tell the truth, I don't know why we need beautification in the first
> place. Bastien introduced it, so he may be able to answer.
To that extent, me neither. But Vladimir uses it to "leave IRIs as IRIs"
(I don't know what this means).
>> Should we apply it to snippets as well? In the spirit of "beautification"
>> it would make sense, but it could also seem like a bad choice.
>
> I think it makes sense to apply it to snippets, indeed, but see above.
My initial reaction was to kill it as well. But I might feel like this a
bit too often (I feel the same way about headline keywords like COMMENT).
Rasmus
--
9000!
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-07 10:57 ` Rasmus
@ 2015-08-08 21:09 ` Andreas Leha
2015-08-09 19:32 ` Sebastien Vauban
2015-08-16 13:48 ` Bastien Guerry
2015-08-16 14:03 ` Bastien Guerry
1 sibling, 2 replies; 14+ messages in thread
From: Andreas Leha @ 2015-08-08 21:09 UTC (permalink / raw)
To: emacs-orgmode
Hi,
[ deleted: discussion on beautification ]
>
> My initial reaction was to kill it as well. But I might feel like this a
> bit too often (I feel the same way about headline keywords like COMMENT).
There has been repeated 'bashing' of the COMMENT keyword lately on this
list. Let me just raise a voice in defence. I do not mind the syntax
too much, but being able to comment out a whole subtree without
losing the outline functionality is really handy, especially as
distinct from the equally handy :noexport: tag.
So, even if there is probably not a high risk for the COMMENT keyword to
be dropped I just wanted to express my support for it.
Regards,
Andreas
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-08 21:09 ` Andreas Leha
@ 2015-08-09 19:32 ` Sebastien Vauban
2015-08-16 13:48 ` Bastien Guerry
1 sibling, 0 replies; 14+ messages in thread
From: Sebastien Vauban @ 2015-08-09 19:32 UTC (permalink / raw)
To: emacs-orgmode
Andreas Leha <andreas.leha@med.uni-goettingen.de> writes:
> [ deleted: discussion on beautification ]
>
>> My initial reaction was to kill it as well. But I might feel like this a
>> bit too often (I feel the same way about headline keywords like COMMENT).
>
> There has been repeated 'bashing' of the COMMENT keyword lately on this
> list. Let me just raise a voice in defence. I do not mind the syntax
> too much, but the functionality of commenting a whole subtree without
> losing the outline functionality is really handy. Especially also in
> distinction to the equally handy :noexport: tag.
>
> So, even if there is probably not a high risk for the COMMENT keyword to
> be dropped I just wanted to express my support for it.
+1
Both COMMENT and :noexport: are necessary, for achieving different tasks.
Best regards,
Seb
--
Sebastien Vauban
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-08 21:09 ` Andreas Leha
2015-08-09 19:32 ` Sebastien Vauban
@ 2015-08-16 13:48 ` Bastien Guerry
2015-08-16 18:47 ` Brady Trainor
1 sibling, 1 reply; 14+ messages in thread
From: Bastien Guerry @ 2015-08-16 13:48 UTC (permalink / raw)
To: Andreas Leha; +Cc: emacs-orgmode
Andreas Leha <andreas.leha@med.uni-goettingen.de> writes:
> So, even if there is probably not a high risk for the COMMENT keyword to
> be dropped I just wanted to express my support for it.
COMMENT will stay, for sure.
--
Bastien
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-07 10:57 ` Rasmus
2015-08-08 21:09 ` Andreas Leha
@ 2015-08-16 14:03 ` Bastien Guerry
1 sibling, 0 replies; 14+ messages in thread
From: Bastien Guerry @ 2015-08-16 14:03 UTC (permalink / raw)
To: Rasmus; +Cc: emacs-orgmode
Hi,
I removed `org-html-use-unicode-chars'.
Thanks,
--
Bastien
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-16 13:48 ` Bastien Guerry
@ 2015-08-16 18:47 ` Brady Trainor
2015-08-17 8:01 ` Nicolas Goaziou
0 siblings, 1 reply; 14+ messages in thread
From: Brady Trainor @ 2015-08-16 18:47 UTC (permalink / raw)
To: emacs-orgmode
Speaking of COMMENT, I had noticed some strange behavior if I have a state like COMMENTED_OUT.
If I have a header like =#+TODO: TODO COMMENTED_OUT | DONE=, and cycle through states with S-<right arrow>, it gets pretty wonky. For now, I simply use COMMENT when I have a section I want to consider as commented out, but ideally this behavior could be different?
Always true, but I haven't been on the mailing list for a spell, so, big thanks to all who contribute to this software.
Bastien Guerry <bzg@gnu.org> writes:
> Andreas Leha <andreas.leha@med.uni-goettingen.de> writes:
>
>> So, even if there is probably not a high risk for the COMMENT keyword to
>> be dropped I just wanted to express my support for it.
>
> COMMENT will stay, for sure.
--
Brady
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-16 18:47 ` Brady Trainor
@ 2015-08-17 8:01 ` Nicolas Goaziou
2015-08-17 8:41 ` Brady Trainor
2015-08-17 16:44 ` Rasmus
0 siblings, 2 replies; 14+ messages in thread
From: Nicolas Goaziou @ 2015-08-17 8:01 UTC (permalink / raw)
To: Brady Trainor; +Cc: emacs-orgmode
Hello,
Brady Trainor <algebrat@uw.edu> writes:
> Speaking of COMMENT, I had noticed some strange behavior if I have a state like COMMENTED_OUT.
>
> If I have a header like =#+TODO: TODO COMMENTED_OUT | DONE=, and cycle
> through state with S-<right arrow>, it gets pretty wonky.
Could you elaborate a bit? I cannot reproduce anything suspicious except
a minor fontification glitch.
Regards,
--
Nicolas Goaziou
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-17 8:01 ` Nicolas Goaziou
@ 2015-08-17 8:41 ` Brady Trainor
2015-08-17 16:44 ` Rasmus
1 sibling, 0 replies; 14+ messages in thread
From: Brady Trainor @ 2015-08-17 8:41 UTC (permalink / raw)
To: emacs-orgmode
Hello,
Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
> Hello,
>
> Brady Trainor <algebrat@uw.edu> writes:
>
>> Speaking of COMMENT, I had noticed some strange behavior if I have a
>> state like COMMENTED_OUT.
>>
>> If I have a header like =#+TODO: TODO COMMENTED_OUT | DONE=, and cycle
>> through state with S-<right arrow>, it gets pretty wonky.
>
> Could you elaborate a bit? I cannot reproduce anything suspicious except
> a minor fontification glitch.
>
>
> Regards,
Ah, I should have checked 8.3... I saw this on 8.2.10. There, as I cycled through the states, several different all-caps states could appear in one headline at once, though only one would be highlighted.
I guess on 8.3, it is just that COMMENT of COMMENTED_OUT would be highlighted, leaving ED_OUT unhighlighted. I am guessing that is what you are seeing.
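One plausible cause of that partial highlighting is a pattern that matches COMMENT as a prefix of the longer keyword. The sketch below only illustrates the idea: both regexps and the helper `my/comment-match' are hypothetical, not the patterns Org actually uses for fontification.

```emacs-lisp
;; Sketch only: these regexps are illustrative assumptions, not Org's own.
(defconst my/loose-comment-re "^\\*+ +\\(COMMENT\\)"
  "Matches COMMENT after the stars, even as a prefix of a longer word.")

(defconst my/strict-comment-re "^\\*+ +\\(COMMENT\\)\\(?: \\|$\\)"
  "Matches COMMENT only when followed by a space or the end of the line.")

(defun my/comment-match (regexp headline)
  "Return the text REGEXP captures in HEADLINE, or nil if it does not match."
  (let ((case-fold-search nil))
    (and (string-match regexp headline)
         (match-string 1 headline))))

(let ((headline "* COMMENTED_OUT Some heading"))
  (list (my/comment-match my/loose-comment-re headline)    ; "COMMENT"
        (my/comment-match my/strict-comment-re headline))) ; nil
```

With the loose pattern only the first seven characters of COMMENTED_OUT are captured, which would leave ED_OUT unhighlighted; requiring a delimiter after the keyword avoids the prefix match.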
--
Brady
* Re: org-html-use-unicode-chars breaks source code blocks
2015-08-17 8:01 ` Nicolas Goaziou
2015-08-17 8:41 ` Brady Trainor
@ 2015-08-17 16:44 ` Rasmus
1 sibling, 0 replies; 14+ messages in thread
From: Rasmus @ 2015-08-17 16:44 UTC (permalink / raw)
To: emacs-orgmode
Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
> Hello,
>
> Brady Trainor <algebrat@uw.edu> writes:
>
>> Speaking of COMMENT, I had noticed some strange behavior if I have a state like COMMENTED_OUT.
>>
>> If I have a header like =#+TODO: TODO COMMENTED_OUT | DONE=, and cycle
>> through state with S-<right arrow>, it gets pretty wonky.
>
> Could you elaborate a bit? I cannot reproduce anything suspicious except
> a minor fontification glitch.
I only saw the fontification error as well, which can be fixed in
org-set-font-lock-defaults. AFAICT, you did not fix this, right?
Rasmus
--
The right to be left alone is a human right