* Bug: [regression] superscript not available after non-alphanumeric [8.2.7b (8.2.7b-dist @ /home/benda/gnto/usr/share/emacs/site-lisp/org-mode/)]
From: heroxbd @ 2014-06-27 10:42 UTC (permalink / raw)
To: emacs-orgmode
Superscript after a non-alphanumeric character, primarily used for
isotopes, is broken again [1, 2].
#+begin_org
\ce{^{238}U}, ^2H
#+end_org
is exported as
#+begin_latex
\ce\{$^{\text{238}}$U\}, \^{}2H
#+end_latex
on org-mode 8.2.7b.
I've also tried 8.0.7 and the bug persists, so I suppose the regression
was introduced by the exporter refactoring in 8.0.
How about adding a set of unit tests for the exporter to guard against
regressions like this?
Cheers,
Benda
1. http://lists.gnu.org/archive/html/emacs-orgmode/2009-09/msg00887.html
2. http://orgmode.org/cgit.cgi/org-mode.git/commit/?id=7d5408a717374641b2d2cddcfef27ec9c137a9a7
current state:
==============
(setq
org-ctrl-c-ctrl-c-hook '(org-babel-hash-at-point org-babel-execute-safely-maybe)
org-latex-format-headline-function 'org-latex-format-headline-default-function
org-src-fontify-natively t
org-html-format-inlinetask-function 'ignore
org-export-with-drawers nil
org-export-copy-to-kill-ring t
org-export-with-tags 'not-in-toc
org-export-preprocess-before-selecting-backend-code-hook '(org-beamer-select-beamer-code)
org-tab-first-hook '(org-hide-block-toggle-maybe org-src-native-tab-command-maybe
org-babel-hide-result-toggle-maybe org-babel-header-arg-expand)
org-modules '(org-bbdb org-bibtex org-docview org-gnus org-info org-jsinfo org-irc org-mew org-mhe org-rmail
org-special-blocks org-vm org-wl org-w3m)
org-cycle-hook '(org-cycle-hide-archived-subtrees org-cycle-hide-drawers org-cycle-hide-inline-tasks
org-cycle-show-empty-lines org-optimize-window-after-visibility-change)
org-agenda-before-write-hook '(org-agenda-add-entry-text)
org-speed-command-hook '(org-speed-command-default-hook org-babel-speed-command-hook)
org-ascii-format-inlinetask-function 'org-ascii-format-inlinetask-default
org-babel-pre-tangle-hook '(save-buffer)
org-occur-hook '(org-first-headline-recenter)
org-export-latex-after-blockquotes-hook '(org-special-blocks-convert-latex-special-cookies)
org-latex-default-packages-alist '(("T1" "fontenc" nil) ("" "graphicx" t) ("" "longtable" nil)
("" "float" nil) ("" "wrapfig" nil) ("" "soul" t) ("" "textcomp" t)
("" "marvosym" t) ("" "latexsym" t) ("" "amssymb" t) ("" "hyperref" nil)
("" "fontspec" nil) ("CJKchecksingle" "xeCJK" nil) "\\tolerance=1000"
"\\setCJKmainfont[BoldFont={WenQuanYi Zen Hei},ItalicFont={AR PL UKai CN}, FallBack={AR PL UMing CN}]{Kochi Mincho}" "\\setCJKsansfont{AR PL UKai CN}")
org-html-format-headline-function 'ignore
org-metaup-hook '(org-babel-load-in-session-maybe)
org-confirm-elisp-link-function 'yes-or-no-p
org-export-latex-format-toc-function 'org-export-latex-format-toc-default
org-latex-classes '(("article" "\\documentclass[11pt]{article}" ("\\section{%s}" . "\\section*{%s}")
("\\subsection{%s}" . "\\subsection*{%s}")
("\\subsubsection{%s}" . "\\subsubsection*{%s}")
("\\paragraph{%s}" . "\\paragraph*{%s}") ("\\subparagraph{%s}" . "\\subparagraph*{%s}"))
("thesis" "\\documentclass[12pt,final]{tohoku-thesis}"
("\\chapter{%s}" . "\\chapter*{%s}") ("\\section{%s}" . "\\section*{%s}")
("\\subsection{%s}" . "\\subsection*{%s}")
("\\subsubsection{%s}" . "\\subsubsection*{%s}"))
("book" "\\documentclass[11pt]{book}" ("\\part{%s}" . "\\part*{%s}")
("\\chapter{%s}" . "\\chapter*{%s}") ("\\section{%s}" . "\\section*{%s}")
("\\subsection{%s}" . "\\subsection*{%s}")
("\\subsubsection{%s}" . "\\subsubsection*{%s}"))
("beamer" "\\documentclass{beamer}" org-beamer-sectioning)
("revtex" "\\RequirePackage{fixltx2e}\n\\documentclass[11pt, reprint]{revtex4-1}"
("\\section{%s}" . "\\section*{%s}") ("\\subsection{%s}" . "\\subsection*{%s}")
("\\subsubsection{%s}" . "\\subsubsection*{%s}")
("\\paragraph{%s}" . "\\paragraph*{%s}") ("\\subparagraph{%s}" . "\\subparagraph*{%s}"))
)
org-latex-format-drawer-function '(lambda (name contents) contents)
org-export-preprocess-after-blockquote-hook '(org-special-blocks-make-special-cookies)
org-format-latex-options '(:foreground default :background default :scale 1.7 :html-foreground "Black"
:html-background "Transparent" :html-scale 1.0 :matchers
("begin" "$1" "$" "$$" "\\(" "\\["))
org-latex-to-pdf-process '("latexmk -f -pdf %f")
org-babel-tangle-body-hook '((lambda nil (org-preprocess-apply-macros)))
org-clock-out-hook '(org-clock-remove-empty-clock-drawer)
org-export-latex-classes '(("article" "\\documentclass[11pt]{article}" ("\\section{%s}" . "\\section*{%s}")
("\\subsection{%s}" . "\\subsection*{%s}")
("\\subsubsection{%s}" . "\\subsubsection*{%s}")
("\\paragraph{%s}" . "\\paragraph*{%s}")
("\\subparagraph{%s}" . "\\subparagraph*{%s}"))
("report" "\\documentclass[11pt]{report}" ("\\part{%s}" . "\\part*{%s}")
("\\chapter{%s}" . "\\chapter*{%s}") ("\\section{%s}" . "\\section*{%s}")
("\\subsection{%s}" . "\\subsection*{%s}")
("\\subsubsection{%s}" . "\\subsubsection*{%s}"))
("book" "\\documentclass[11pt]{book}" ("\\part{%s}" . "\\part*{%s}")
("\\chapter{%s}" . "\\chapter*{%s}") ("\\section{%s}" . "\\section*{%s}")
("\\subsection{%s}" . "\\subsection*{%s}")
("\\subsubsection{%s}" . "\\subsubsection*{%s}"))
("beamer" "\\documentclass{beamer}" org-beamer-sectioning)
("revtex" "\\RequirePackage{fixltx2e}\n\\documentclass[11pt, reprint]{revtex4-1}"
("\\section{%s}" . "\\section*{%s}") ("\\subsection{%s}" . "\\subsection*{%s}")
("\\subsubsection{%s}" . "\\subsubsection*{%s}")
("\\paragraph{%s}" . "\\paragraph*{%s}")
("\\subparagraph{%s}" . "\\subparagraph*{%s}"))
)
org-export-first-hook '((lambda nil (org-babel-tangle)) org-beamer-initialize-open-trackers)
org-mode-hook '(org-mode-reftex-setup
#[nil "\300\301\302\303\304$\207"
[org-add-hook change-major-mode-hook org-show-block-all append local] 5]
#[nil "\300\301\302\303\304$\207"
[org-add-hook change-major-mode-hook org-babel-show-result-all append local] 5]
org-babel-result-hide-spec org-babel-hide-all-hashes)
org-ascii-format-drawer-function '(lambda (name contents width) contents)
org-footnote-define-inline t
org-from-is-user-regexp nil
org-export-allow-bind-keywords t
org-html-format-drawer-function '(lambda (name contents) contents)
org-export-latex-final-hook '(org-beamer-amend-header org-beamer-fix-toc org-beamer-auto-fragile-frames
org-beamer-place-default-actions-for-lists)
org-latex-format-toc-function 'org-latex-no-toc
org-export-latex-after-initial-vars-hook '(org-beamer-after-initial-vars)
org-metadown-hook '(org-babel-pop-to-session-maybe)
org-agenda-files '("~/art/refBook.org")
org-src-mode-hook '(org-src-babel-configure-edit-buffer org-src-mode-configure-edit-buffer)
org-file-apps '((auto-mode . emacs) ("\\.mm\\'" . default) ("\\.x?html?\\'" . default)
("\\.pdf\\'" . default) ("\\.pdf::\\([0-9]+\\)\\'" . "okular \"%s\" -p %1"))
org-export-html-after-blockquotes-hook '(org-special-blocks-convert-html-special-cookies)
org-footnote-auto-label nil
org-after-todo-state-change-hook '(org-clock-out-if-current)
org-babel-tangle-lang-exts '(("python" . "py") ("emacs-lisp" . "el"))
org-babel-load-languages '((python . t) (R . t) (emacs-lisp . t) (sh . t))
org-latex-format-inlinetask-function 'ignore
org-confirm-shell-link-function 'yes-or-no-p
)
* Re: Bug: [regression] superscript not available after non-alphanumeric [8.2.7b (8.2.7b-dist @ /home/benda/gnto/usr/share/emacs/site-lisp/org-mode/)]
From: Nicolas Goaziou @ 2014-06-27 11:55 UTC (permalink / raw)
To: heroxbd; +Cc: emacs-orgmode
Hello,
heroxbd@gentoo.org writes:
> #+begin_org
> \ce{^{238}U}, ^2H
> #+end_org
>
> is exported as
>
> #+begin_latex
> \ce\{$^{\text{238}}$U\}, \^{}2H
> #+end_latex
>
> on org-mode 8.2.7b
If you want to insert raw LaTeX in an Org buffer, then \ce{^{238}U} is
invalid because you cannot nest braces. You can write instead:
@@latex:\ce{^{238}U}@@
or you can define a macro, e.g.:
#+MACRO: ce @@latex:\ce{$1}@@
and then use
{{{ce(^{238}U)}}}
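To see why the macro route sidesteps the brace limitation, here is a minimal sketch of the substitution step. It assumes a simplified expansion that only replaces $1 in the template; `my/expand-template' is a hypothetical helper for illustration, not Org's actual macro code.

```emacs-lisp
;; Hypothetical helper, for illustration only; Org's real macro
;; expansion handles several arguments, escaping and more.
(defun my/expand-template (template arg)
  "Return TEMPLATE with every literal $1 replaced by ARG."
  ;; FIXEDCASE and LITERAL are both t so ARG is inserted verbatim.
  (replace-regexp-in-string "\\$1" arg template t t))

;; The nested braces end up inside an export snippet, which the
;; LaTeX back-end passes through as raw LaTeX.
(my/expand-template "@@latex:\\ce{$1}@@" "^{238}U")
;; => "@@latex:\\ce{^{238}U}@@"
```

The result is the same export snippet as the hand-written form above, so the parser never has to recognise \ce{...} as a LaTeX fragment.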
Also, ^2H is not recognized as superscript _on purpose_. Per Org syntax,
you have to add a non-blank character before the caret. Otherwise, there
would be ambiguity between underline (e.g., _under_) and subscript
(_under). And superscript syntax follows subscript's.
In this case, you can probably use a math snippet, e.g.,
\(^2\)H
Regards,
--
Nicolas Goaziou
* syntax specification (was Re: Bug: [regression] superscript not available after non-alphanumeric)
From: heroxbd @ 2014-06-28 1:39 UTC (permalink / raw)
To: emacs-orgmode
Hi Nicolas,
Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
> If you want to insert raw LaTeX in an Org buffer, then \ce{^{238}U} is
> invalid because you cannot nest braces. You can write instead:
>
> @@latex:\ce{^{238}U}@@
>
> or you can define a macro, e.g.,:
>
> #+MACRO: ce @@latex:\ce{$1}@@
>
> and then use
>
> {{{ce(^{238}U)}}}
>
> Also, ^2H is not recognized as superscript _on purpose_. Per Org syntax,
> you have to add a non-blank character before the caret. Otherwise, there
> would be ambiguity between underline (e.g., _under_) and subscript
> (_under). And superscript syntax follows subscript's.
>
> In this case, you can probably use a math snippet, e.g.,
>
> \(^2\)H
Thank you for the explanation. I now understand what went wrong.
I am wondering where the claims "you cannot nest braces" and "Per Org
syntax, you have to add a non-blank character before the caret" come
from. Is there a general guiding principle for the Org syntax, or is it
only a matter of the maintainer's taste?
Is it true that when the exporter maintainer changes, the syntax changes
to suit his preference, even where that is incompatible? In [1], Carsten
regarded "you have to add a non-blank character before the caret" as
a bug and fixed it, while you regard it as a rule. I am curious what the
compelling motivation for this shift was.
Interpreting \ce{^{238}U} directly complicates the exporter's parser
logic, but gives LaTeX authors some syntactic sugar. The inconvenience
of "\(^2\)H" is similar to that of "\_leading_under_line". Neither
syntax is superior to the other, so the guiding principle in this case
should be to keep the syntax stable.
Don't get me wrong. I appreciate and respect your new-school exporting
framework, and the sexy features it makes possible. I only want to
express my concern about the long-term specification (and consequently
the usability) of the Org syntax.
Cheers,
Benda
1. http://lists.gnu.org/archive/html/emacs-orgmode/2009-09/msg00887.html
* [PATCH] curly nested latex fragments (was: superscript not available after non-alphanumeric)
From: heroxbd @ 2014-06-29 11:47 UTC (permalink / raw)
To: emacs-orgmode
[-- Attachment #1: Type: text/plain, Size: 1202 bytes --]
Hello Nicolas,
Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
> If you want to insert raw LaTeX in an Org buffer, then \ce{^{238}U} is
> invalid because you cannot nest braces. You can write instead:
>
> @@latex:\ce{^{238}U}@@
>
> or you can define a macro, e.g.,:
>
> #+MACRO: ce @@latex:\ce{$1}@@
>
> and then use
>
> {{{ce(^{238}U)}}}
Nesting braces is already implemented in the classic org-latex.el[1],
and is forward ported into org-element.el. Would you like to take a
look at the attached patch? Thanks.
> Also, ^2H is not recognized as superscript _on purpose_. Per Org syntax,
> you have to add a non-blank character before the caret. Otherwise, there
> would be ambiguity between underline (e.g., _under_) and subscript
> (_under). And superscript syntax follows subscript's.
>
> In this case, you can probably use a math snippet, e.g.,
>
> \(^2\)H
If \ce{^2H} works as above, it is not a problem for me. That said,
making it configurable would be more user-friendly; "^:{}" is already
there after all, so adding another style feels natural.
Thanks,
Benda
1. http://orgmode.org/w/org-mode.git?p=org-mode.git;a=blob;f=lisp/org-latex.el;hb=107f921d121f5a9bb5a9324f19339e4435633d2d#l2597
[-- Attachment #2: org-8.2.7b_element-latex-nested-curly.patch --]
[-- Type: text/x-patch, Size: 1618 bytes --]
support nested curly bracket pairs in latex fragments.
http://lists.gnu.org/archive/html/emacs-orgmode/2014-06/msg01022.html
Index: org-8.2.7b/lisp/org-element.el
===================================================================
--- org-8.2.7b.orig/lisp/org-element.el
+++ org-8.2.7b/lisp/org-element.el
@@ -3026,7 +3026,12 @@ Assume point is at the beginning of the
(looking-at latex-regexp))))
(throw 'exit (nth 2 e)))))
;; None found: it's a macro.
- (looking-at "\\\\[a-zA-Z]+\\*?\\(\\(\\[[^][\n{}]*\\]\\)\\|\\({[^{}\n]*}\\)\\)*")
+ (looking-at (concat
+ "\\\\\\([a-zA-Z]+\\*?\\)"
+ "\\(?:<[^<>\n]*>\\)*"
+ "\\(?:\\[[^][\n]*?\\]\\)*"
+ "\\(?:<[^<>\n]*>\\)*"
+ "\\(" (org-create-multibrace-regexp "{" "}" 3) "\\)\\{1,3\\}"))
0))
(value (org-match-string-no-properties substring-match))
(post-blank (progn (goto-char (match-end substring-match))
Index: org-8.2.7b/doc/org.texi
===================================================================
--- org-8.2.7b.orig/doc/org.texi
+++ org-8.2.7b/doc/org.texi
@@ -10168,6 +10168,9 @@ any @LaTeX{} environment will be handled
@code{\begin} and @code{\end} statements appear on a new line, at the
beginning of the line or after whitespaces only.
@item
+Commands like \command[...]{...} or \command{...}; the curly brackets can be
+nested up to 3 levels.
+@item
Text within the usual @LaTeX{} math delimiters. To avoid conflicts with
currency specifications, single @samp{$} characters are only recognized as
math delimiters if the enclosed text contains at most two line breaks, is
* Re: [PATCH] curly nested latex fragments
From: Nicolas Goaziou @ 2014-06-29 13:53 UTC (permalink / raw)
To: heroxbd; +Cc: emacs-orgmode
Hello,
heroxbd@gentoo.org writes:
> Nesting braces is already implemented in the classic org-latex.el[1],
> and is forward ported into org-element.el.
Thanks for your patch.
I think you are misunderstanding something. I didn't port this
limitation in Org 8. AFAIK it has been there for a long time. See
`org-inside-latex-macro-p' for example.
The main problem with Org < 8 is that every exporter implemented its own
parser for the Org buffer. As you can see, "org-latex.el" was in
contradiction with "org.el".
> Would you like to take a look at the attached patch? Thanks.
I do not mind extending syntax for LaTeX macros a bit if it helps users,
but first, I would like a clear definition of what subset of macros
should be supported in Org.
See, for example,
http://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
Also, I do not want to add constructs like
"\\(?:<[^<>\n]*>\\)*"
in this definition, as this isn't supported even in
`TeX-find-macro-end-helper' (from auctex), which I consider as
a reference for macro syntax (i.e. we shouldn't support more than what
it supports).
Finally, please note that this implies changing not only
"org-element.el", but also "org.el" and possibly other parts where the
limitation is encoded. But first, we need to agree on what exactly
a valid LaTeX macro is in Org.
> If \ce{^2H} works as above, it is not a problem for me. Although make
> it configurable is more user-friendly; "^:{}" is already there afterall,
> adding another style feels natural.
It's not about adding another style. "^:{}" allows less (without
changing syntax, because the limitation is done at the export level),
whereas you want to allow more, which implies changing the syntax. I
don't want the latter to be configurable.
I explained in this thread why it wasn't possible, for the time being,
to allow a blank character before sub or superscript. This was discussed
on this ML, you may want to search archives.
Regards,
--
Nicolas Goaziou
* Re: [PATCH] curly nested latex fragments
From: heroxbd @ 2014-06-30 0:38 UTC (permalink / raw)
To: emacs-orgmode
Hi Nicolas,
Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
> Hello,
>
> heroxbd@gentoo.org writes:
>
>> Nesting braces is already implemented in the classic org-latex.el[1],
>> and is forward ported into org-element.el.
>
> Thanks for your patch.
>
> I think you are misunderstanding something. I didn't port this
> limitation in Org 8. AFAIK it has been there for a long time. See
> `org-inside-latex-macro-p' for example.
> The main problem with Org < 8 is that every exporter implemented its own
> parser for the Org buffer. As you can see, "org-latex.el" was in
> contradiction with "org.el".
I see: the regexps used for LaTeX protection (in org-latex.el) and for
footnote guarding (in org-footnotes.el and org.el) are different.
>> Would you like to take a look at the attached patch? Thanks.
>
> I do not mind extending syntax for LaTeX macros a bit if it helps users,
> but first, I would like a clear definition of what subset of macros
> should be supported in Org.
>
> See, for example,
>
> http://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
\ce{^{238}U} falls into \NAME POST, doesn't it?
> Also, I do not want to add constructs like
>
> "\\(?:<[^<>\n]*>\\)*"
>
> in this definition, as this isn't supported even in
> `TeX-find-macro-end-helper' (from auctex), which I consider as
> a reference for macro syntax (i.e. we shouldn't support more than what
> is supports).
Ha, I wasn't even aware of the <...> syntax as part of a LaTeX macro; I
just copied the regex from org-latex.el. So let's strip it out, and
advise users to use an explicit LaTeX block for <...> constructs.
+ (looking-at (concat
+ "\\\\\\([a-zA-Z]+\\*?\\)"
+ "\\(?:\\[[^][\n]*?\\]\\)*"
+ "\\(" (org-create-multibrace-regexp "{" "}" 3) "\\)\\{1,3\\}"))
> Eventually, please note that this imply to change not only
> "org-element.el", but also "org.el" and possibly other parts where the
> limitation is encoded. But first, we need to agree on what exactly
> a valid a LaTeX macro is in Org.
`org-inside-latex-macro-p' for example? Yeah, definitely.
>> If \ce{^2H} works as above, it is not a problem for me. Although make
>> it configurable is more user-friendly; "^:{}" is already there afterall,
>> adding another style feels natural.
>
> It's not about adding another style. "^:{}" allows less (without
> changing syntax, because the limitation is done at the export level),
> you want to allow more, which implies to change syntax. I don't want the
> latter to be configurable.
>
> I explained in this thread why it wasn't possible, for the time being,
> to allow a blank character before sub or superscript. This was discussed
> on this ML, you may want to search archives.
Do you mean threads [2] and [3]? I've read them through, and have
a rough understanding of the difficulty coming from the ambiguity of
the syntax. And as discussed above, the difficulty manifests in the
definition of LaTeX fragments, too. It is frustrating to deal with
these corner cases, which make a well-designed parser framework
unnecessarily complex.
At the same time, this syntactic sugar is great. It is the reason why
we prefer composing LaTeX in org-mode over pristine LaTeX. There is
a real need to compromise the cleanness of the implementation for the
sake of an ambiguous-but-human-intuitive syntax.
To resolve this dilemma, we need a formal (mathematically rigorous) Org
syntax specification, like the rules drafted in
http://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
together with a test suite to demonstrate the spec. It would be a lot
of work, but we could start from embedded LaTeX fragments and
super(sub)scripts/underline.
It might be mentally overwhelming for a single person to do the spec
and the implementation at the same time, because they require different
mindsets. The spec is long-term and should be stable, while the
implementation is always being optimized. After all, it is considered
good practice to keep the two processes independent of each other.
What do you think?
Yours,
Benda
1. http://orgmode.org/w/?p=org-mode.git;a=commit;h=88cf58802cc35dee2bc8ff8633b5c842fa7a23b3
2. http://thread.gmane.org/gmane.emacs.orgmode/79735
3. http://thread.gmane.org/gmane.emacs.orgmode/85902
* Re: [PATCH] curly nested latex fragments
From: Nicolas Goaziou @ 2014-06-30 12:31 UTC (permalink / raw)
To: heroxbd; +Cc: emacs-orgmode
Hello,
heroxbd@gentoo.org writes:
> Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
>
>> I do not mind extending syntax for LaTeX macros a bit if it helps users,
>> but first, I would like a clear definition of what subset of macros
>> should be supported in Org.
>>
>> See, for example,
>>
>> http://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
>
> \ce{^{238}U} falls into \NAME POST, doesn't it?
Sorry I wasn't clear. I suggested not using a regexp to describe the
syntax, as regular expressions may not be sufficient to describe the
object. Try to use something like the link above.
Also, bear in mind that a complicated regexp slows down parsing.
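Nested braces are the textbook case: a fixed regexp only covers a bounded depth, whereas a small scanner that counts depth covers any. A minimal sketch follows, assuming a hypothetical helper `my/scan-brace-group' (not part of Org or AUCTeX) and ignoring backslash-escaped braces and comments.

```emacs-lisp
;; Hypothetical helper for illustration; this is not how
;; org-element.el parses LaTeX fragments.
(defun my/scan-brace-group (string start)
  "Return the index just past the balanced brace group at START in STRING.
Return nil when there is no opening brace at START or when the
group is never closed.  Backslash-escaped braces are not handled."
  (when (and (< start (length string))
             (eq (aref string start) ?{))
    (let ((depth 0)
          (pos start)
          (end nil))
      (while (and (not end) (< pos (length string)))
        (let ((char (aref string pos)))
          (cond ((eq char ?{) (setq depth (1+ depth)))
                ((eq char ?}) (setq depth (1- depth))
                 ;; Depth back to zero: the group opened at START is closed.
                 (when (zerop depth)
                   (setq end (1+ pos))))))
        (setq pos (1+ pos)))
      end)))

;; Four levels deep, one more than a depth-3 regexp would accept.
(let ((text "{a{b{c{d}}}} rest"))
  (substring text 0 (my/scan-brace-group text 0)))
;; => "{a{b{c{d}}}}"
```

The scanner makes a single pass over the text, so its cost stays linear in the length of the group regardless of nesting depth.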
> Ha, I don't even aware of <...> syntex as a part of the LaTeX macro; I
> just copied the regex from org-latex.el. So let's strip it out, and
> advise the users to use explicit LaTeX block for <...> constructs.
>
> + (looking-at (concat
> + "\\\\\\([a-zA-Z]+\\*?\\)"
> + "\\(?:\\[[^][\n]*?\\]\\)*"
> + "\\(" (org-create-multibrace-regexp "{" "}" 3) "\\)\\{1,3\\}"))
Unfortunately, this is ambiguous with Org macro syntax. For example, it
would match:
\alpha{{{macro(arg)}}}
which is an entity followed by a macro.
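The overlap can be reproduced outside Org with a simplified version of the proposed regexp. This is a sketch under assumptions: `my/nested-braces-regexp' is a hypothetical stand-in for `org-create-multibrace-regexp', written out so the example is self-contained, and the real function may build its regexp differently.

```emacs-lisp
;; Hypothetical stand-in for `org-create-multibrace-regexp'.
(defun my/nested-braces-regexp (depth)
  "Return a regexp matching a brace group with up to DEPTH nested levels inside."
  (let ((re "{[^{}\n]*}"))
    (dotimes (_ depth)
      ;; Each pass allows one more level of nesting inside the group.
      (setq re (concat "{\\(?:[^{}\n]\\|" re "\\)*}")))
    re))

(let ((macro-re (concat "\\\\[a-zA-Z]+\\*?"
                        "\\(?:\\[[^][\n]*?\\]\\)*"
                        "\\(?:" (my/nested-braces-regexp 2) "\\)\\{1,3\\}")))
  (list
   ;; Intended case: a LaTeX command with nested braces.
   (and (string-match macro-re "\\ce{^{238}U}")
        (match-string 0 "\\ce{^{238}U}"))
   ;; Problem case: the entity and the Org macro are swallowed as one match.
   (and (string-match macro-re "\\alpha{{{macro(arg)}}}")
        (match-string 0 "\\alpha{{{macro(arg)}}}"))))
;; => ("\\ce{^{238}U}" "\\alpha{{{macro(arg)}}}")
```

Both strings match in full, so the regexp alone cannot tell a LaTeX command with nested arguments from an entity followed by an Org macro.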
> Do you mean this[2] and this[3] threads? I've read them through, and
> remotely understood the difficulty coming from the ambiguity of the
> syntax. And as discussed above, the difficulty manifests in the
> definition of LaTeX fragments, too.
There is no ambiguity in LaTeX fragments, as Org is not required to
support full raw LaTeX syntax (and never did anyway), as long as we
provide markup to insert LaTeX in the buffer anyway.
If we can support a bit more without introducing corner cases, that's
fine. But, as you say, that's just syntactic sugar, so pure Org syntax
goes first.
> At the same time, these syntax sugar is great. And that's the reason
> why we prefer org-mode in composing LaTeX to pristine LaTeX. There is a
> sincere need to compromise the cleanness of the implementation for the
> sake of an ambiguous-but-human-intuitive syntax.
@@l:\ce{^{238}U}@@ is not so bad, nor is {{{ce(^{238}U)}}} with
a properly defined macro template.
Anyway, let me stress it again: a change to macro syntax is fine if it
introduces no ambiguity. Obviously, the same holds for sub/superscript.
> To resolve this dilemma, we need a formal (mathematically rigorous) org
> syntex specification, like the rules drafted in
>
> http://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
>
> together with a set of test suites to demonstrate the spec. There would
> be a lot of work, but we could start from embedded LaTeX fragments and
> super(sub)scripts/underline.
>
> It might be mentally overwhelming for one single guy to do the spec and
> the implementation at the same time, because they require different
> mindsets. The spec is long term and should be stable while the
> implementation is always being optimized. After all, it is considered
> good practice to make the two processes independent to each other.
I'm not sure what you mean. "org-syntax.html" describes, well, the
syntax (although it could be better, with, e.g., EBNF, help is welcome),
"org-element.el" implements it, with optimizations, and
"test-org-element.el" tests the implementation.
Anyway, let's concentrate on LaTeX macros.
Regards,
--
Nicolas Goaziou
* Re: [PATCH] curly nested latex fragments
From: heroxbd @ 2014-06-30 21:50 UTC (permalink / raw)
To: emacs-orgmode
Hi Nicolas,
Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
> heroxbd@gentoo.org writes:
>
>> Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
>>
>>> I do not mind extending syntax for LaTeX macros a bit if it helps users,
>>> but first, I would like a clear definition of what subset of macros
>>> should be supported in Org.
>>>
>>> See, for example,
>>>
>>> http://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
>>
>> \ce{^{238}U} falls into \NAME POST, doesn't it?
>
> Sorry I wasn't clear. I suggested to not use a regexp to describe the
> syntax, as regular expressions may not be sufficient to describe the
> object. Try to use something like the link above.
>
> Also, bear in mind that a complicated regexp slows down parsing.
Wow, that's exactly what I was wondering when reading
org-element--parse-{elements,objects}. It is essentially a tokenizer, as
in lexical analysis, for which great tools have existed for decades.
>> Ha, I don't even aware of <...> syntex as a part of the LaTeX macro; I
>> just copied the regex from org-latex.el. So let's strip it out, and
>> advise the users to use explicit LaTeX block for <...> constructs.
>>
>> + (looking-at (concat
>> + "\\\\\\([a-zA-Z]+\\*?\\)"
>> + "\\(?:\\[[^][\n]*?\\]\\)*"
>> + "\\(" (org-create-multibrace-regexp "{" "}" 3) "\\)\\{1,3\\}"))
>
> Unfortunately, this is ambiguous with Org macro syntax. For example, it
> would match:
>
> \alpha{{{macro(arg)}}}
>
> which is an entity followed by a macro.
Err, insert a white space?
\alpha {{{macro(arg)}}}
Or expand the macro before latex-or-entity matching.
>> Do you mean this[2] and this[3] threads? I've read them through, and
>> remotely understood the difficulty coming from the ambiguity of the
>> syntax. And as discussed above, the difficulty manifests in the
>> definition of LaTeX fragments, too.
>
> There is no ambiguity in LaTeX fragments, as Org is not required to
> support full raw LaTeX syntax (and never did anyway), as long as we
> provide markup to insert LaTeX in the buffer anyway.
>
> If we can support a bit more without introducing corner cases, that's
> fine. But, as you say, that's just syntactic sugar, so pure Org syntax
> goes first.
I agree with you on this.
>> At the same time, these syntax sugar is great. And that's the reason
>> why we prefer org-mode in composing LaTeX to pristine LaTeX. There is a
>> sincere need to compromise the cleanness of the implementation for the
>> sake of an ambiguous-but-human-intuitive syntax.
>
> @@l:\ce{^{238}U}@@ is not so bad, nor is {{{ce(^{238)U)}}} with
> a properly defined macro template.
>
> Anyway, let me stress it again: a change to macro syntax is fine if it
> introduces no ambiguity. Obviously, the same holds for
> sub/superscript.
Hmmm, after reflection, my preference for \ce{^{238}U} comes from the
syntax of org-mode 7.9.
>> To resolve this dilemma, we need a formal (mathematically rigorous) org
>> syntex specification, like the rules drafted in
>>
>> http://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
>>
>> together with a set of test suites to demonstrate the spec. There would
>> be a lot of work, but we could start from embedded LaTeX fragments and
>> super(sub)scripts/underline.
>>
>> It might be mentally overwhelming for one single guy to do the spec and
>> the implementation at the same time, because they require different
>> mindsets. The spec is long term and should be stable while the
>> implementation is always being optimized. After all, it is considered
>> good practice to make the two processes independent to each other.
>
> I'm not sure what do you mean. "org-syntax.html" describes, well, the
> syntax (although it could be better, with, e.g., EBNF, help is welcome),
> "org-element.el" implements it, with optimizations, and
> "test-org-element.el" tests the implementation.
Sorry, that was my ignorance. I hadn't noticed the tests/ dir. It's
great that the testing framework is already there.
> Anyway, let's concentrate on LaTeX macros.
Okay.
Cheers,
Benda
* Re: [PATCH] curly nested latex fragments
From: Nicolas Goaziou @ 2014-07-06 20:11 UTC (permalink / raw)
To: heroxbd; +Cc: emacs-orgmode
Hello,
heroxbd@gentoo.org writes:
> Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
>> Unfortunately, this is ambiguous with Org macro syntax. For example, it
>> would match:
>>
>> \alpha{{{macro(arg)}}}
>>
>> which is an entity followed by a macro.
>
> Err, insert a white space?
>
> \alpha {{{macro(arg)}}}
Well, it may not be equivalent, depending on the macro. Also, this is
not the point. \alpha{{{macro(arg)}}} is valid, so we have to parse it
as something. In this case, there are two possible interpretations.
I want to avoid it.
> Or expand the macro before latex-or-entity matching.
Macro expansion only happens at the beginning of the export process. The
problem you want to solve isn't necessarily tied to the export
mechanism.
Also, as you mention "latex-or-entity", which doesn't exist anymore, you
should look at the parsing code in master instead of maint, in
particular at `org-element-latex-fragment-parser'. Maybe the mechanism
used to find a macro can be improved to match more of them without
matching anything else.
Regards,
--
Nicolas Goaziou
Thread overview: 9 messages
2014-06-27 10:42 Bug: [regression] superscript not available after non-alphanumeric [8.2.7b (8.2.7b-dist @ /home/benda/gnto/usr/share/emacs/site-lisp/org-mode/)] heroxbd
2014-06-27 11:55 ` Nicolas Goaziou
2014-06-28 1:39 ` syntax specification (was Re: Bug: [regression] superscript not available after non-alphanumeric) heroxbd
2014-06-29 11:47 ` [PATCH] curly nested latex fragments (was: " heroxbd
2014-06-29 13:53 ` [PATCH] curly nested latex fragments Nicolas Goaziou
2014-06-30 0:38 ` heroxbd
2014-06-30 12:31 ` Nicolas Goaziou
2014-06-30 21:50 ` heroxbd
2014-07-06 20:11 ` Nicolas Goaziou