From mboxrd@z Thu Jan  1 00:00:00 1970
From: heroxbd@gentoo.org
Subject: Re: [PATCH] curly nested latex fragments
Date: Tue, 01 Jul 2014 06:50:19 +0900
Message-ID: <86r426gj8k.fsf@moguhome00.in.awa.tohoku.ac.jp>
References: <86simqocpz.fsf@moguhome00.in.awa.tohoku.ac.jp>
 <878uoiy3bd.fsf@nicolasgoaziou.fr>
 <86pphshr82.fsf_-_@moguhome00.in.awa.tohoku.ac.jp>
 <87simng6tw.fsf@nicolasgoaziou.fr>
 <86k37zi63g.fsf@moguhome00.in.awa.tohoku.ac.jp>
 <87pphqmvdr.fsf@nicolasgoaziou.fr>
In-Reply-To: <87pphqmvdr.fsf@nicolasgoaziou.fr> (Nicolas Goaziou's message of "Mon, 30 Jun 2014 14:31:28 +0200")
List-Id: "General discussions about Org-mode."
To: emacs-orgmode@gnu.org

Hi Nicolas,

Nicolas Goaziou writes:

> heroxbd@gentoo.org writes:
>
>> Nicolas Goaziou writes:
>>
>>> I do not mind extending syntax for LaTeX macros a bit if it helps users,
>>> but first, I would like a clear definition of what subset of macros
>>> should be supported in Org.
>>>
>>> See, for example,
>>>
>>>   http://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
>>
>> \ce{^{238}U} falls into \NAME POST, doesn't it?
>
> Sorry I wasn't clear. I suggested to not use a regexp to describe the
> syntax, as regular expressions may not be sufficient to describe the
> object. Try to use something like the link above.
>
> Also, bear in mind that a complicated regexp slows down parsing.

Wow, that's exactly what I was wondering about when reading
org-element--parse-{elements,objects}. In lexical-analysis terms it is
a tokenizer, and great tools for that have existed for decades.

>> Ha, I wasn't even aware of the <...> syntax as a part of the LaTeX macro; I
>> just copied the regex from org-latex.el. So let's strip it out, and
>> advise the users to use an explicit LaTeX block for <...> constructs.
>>
>> +	  (looking-at (concat
>> +		       "\\\\\\([a-zA-Z]+\\*?\\)"
>> +		       "\\(?:\\[[^][\n]*?\\]\\)*"
>> +		       "\\(" (org-create-multibrace-regexp "{" "}" 3) "\\)\\{1,3\\}"))
>
> Unfortunately, this is ambiguous with Org macro syntax. For example, it
> would match:
>
>   \alpha{{{macro(arg)}}}
>
> which is an entity followed by a macro.

Err, insert a space?

  \alpha {{{macro(arg)}}}

Or expand the macro before latex-or-entity matching.

>> Do you mean this[2] and this[3] threads? I've read them through, and
>> remotely understood the difficulty coming from the ambiguity of the
>> syntax.
>> And as discussed above, the difficulty manifests in the
>> definition of LaTeX fragments, too.
>
> There is no ambiguity in LaTeX fragments, as Org is not required to
> support full raw LaTeX syntax (and never did anyway), as long as we
> provide markup to insert LaTeX in the buffer anyway.
>
> If we can support a bit more without introducing corner cases, that's
> fine. But, as you say, that's just syntactic sugar, so pure Org syntax
> goes first.

I agree with you on this.

>> At the same time, this syntactic sugar is great. And that's the reason
>> why we prefer org-mode in composing LaTeX to pristine LaTeX. There is a
>> sincere need to compromise the cleanness of the implementation for the
>> sake of an ambiguous-but-human-intuitive syntax.
>
> @@l:\ce{^{238}U}@@ is not so bad, nor is {{{ce(^{238}U)}}} with
> a properly defined macro template.
>
> Anyway, let me stress it again: a change to macro syntax is fine if it
> introduces no ambiguity. Obviously, the same holds for
> sub/superscript.

Hmmm, after reflection, my preference for \ce{^{238}U} comes from the
syntax of org-mode 7.9.

>> To resolve this dilemma, we need a formal (mathematically rigorous) org
>> syntax specification, like the rules drafted in
>>
>>   http://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
>>
>> together with a set of test suites to demonstrate the spec. There would
>> be a lot of work, but we could start from embedded LaTeX fragments and
>> super(sub)scripts/underline.
>>
>> It might be mentally overwhelming for one single guy to do the spec and
>> the implementation at the same time, because they require different
>> mindsets. The spec is long term and should be stable while the
>> implementation is always being optimized. After all, it is considered
>> good practice to make the two processes independent of each other.
>
> I'm not sure what you mean. "org-syntax.html" describes, well, the
> syntax (although it could be better, with, e.g., EBNF, help is welcome),
> "org-element.el" implements it, with optimizations, and
> "test-org-element.el" tests the implementation.

Sorry, that was my ignorance: I hadn't noticed the tests/ dir. It's
great that the testing framework is already there.

> Anyway, let's concentrate on LaTeX macros.

Okay.

Cheers,
Benda