emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
* [bug?, org-element] latex-environment delimiters must be at BOL
@ 2014-07-10 11:54 Rasmus
  2014-07-16 13:04 ` Nicolas Goaziou
  0 siblings, 1 reply; 10+ messages in thread
From: Rasmus @ 2014-07-10 11:54 UTC (permalink / raw)
  To: emacs-orgmode

Hi,

I couldn't find previous discussions on this.

Looking at org-element-latex-environment-parser LaTeX environments are
recognized as "[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}" (for start)
and "^[ \t]*\\\\end{%s}[ \t]*$" (for the end).

However, for e.g. small equations one might want to write

(*)         \begin{equation} PLACEHOLDER \end{equation},

i.e. in one line.  This fails to be recognized as the latter regexp is
not satisfied.  For code generated by humans this is not an issue.
However, e.g. SymPy generates code in the (*) format.  Prefixing (*)
with "#+LATEX:" will prevent ox-html from recognizing it.

Should org-element try to catch one-line environments as the one
above?  Or is it "a can of worms"?

Cheers,
Rasmus

-- 
Don't panic!!!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug?, org-element] latex-environment delimiters must be at BOL
  2014-07-10 11:54 [bug?, org-element] latex-environment delimiters must be at BOL Rasmus
@ 2014-07-16 13:04 ` Nicolas Goaziou
  2014-07-18 15:49   ` Rasmus
  0 siblings, 1 reply; 10+ messages in thread
From: Nicolas Goaziou @ 2014-07-16 13:04 UTC (permalink / raw)
  To: Rasmus; +Cc: emacs-orgmode

Hello,

Rasmus <rasmus@gmx.us> writes:

> Looking at org-element-latex-environment-parser LaTeX environments are
> recognized as "[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}" (for start)
> and "^[ \t]*\\\\end{%s}[ \t]*$" (for the end).
>
> However, for e.g. small equations one might want to write
>
> (*)         \begin{equation} PLACEHOLDER \end{equation},
>
> i.e. in one line.  This fails to be recognized as the latter regexp is
> not satisfied.  For code generated by humans this is not an issue.
> However, e.g. SymPy generates code in the (*) format.  Prefixing (*)
> with "#+LATEX:" will prevent ox-html from recognizing it.
>
> Should org-element try to catch one-line environments as the one
> above?  Or is it "a can of worms"?

I think allowing them is possible, as long as \end{equation} ends the
last line (with trailing whitespaces tolerated).

I assume that, for simplicity, PLACEHOLDER could be longer than one
line, too.

If Bastien agrees and if you want to provide a patch, please be sure to
include tests in test-org-element.el, and appropriate changes to "Org
Syntax" document. The manual may need to be updated, too.


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug?, org-element] latex-environment delimiters must be at BOL
  2014-07-16 13:04 ` Nicolas Goaziou
@ 2014-07-18 15:49   ` Rasmus
  2014-07-20 17:49     ` Nicolas Goaziou
  0 siblings, 1 reply; 10+ messages in thread
From: Rasmus @ 2014-07-18 15:49 UTC (permalink / raw)
  To: emacs-orgmode

Hi Nicolas,

Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:

> Rasmus <rasmus@gmx.us> writes:
>
>> Looking at org-element-latex-environment-parser LaTeX environments are
>> recognized as "[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}" (for start)
>> and "^[ \t]*\\\\end{%s}[ \t]*$" (for the end).
>>
>> However, for e.g. small equations one might want to write
>>
>> (*)         \begin{equation} PLACEHOLDER \end{equation},
>>
>> i.e. in one line.  This fails to be recognized as the latter regexp is
>> not satisfied.  For code generated by humans this is not an issue.
>> However, e.g. SymPy generates code in the (*) format.  Prefixing (*)
>> with "#+LATEX:" will prevent ox-html from recognizing it.
>>
>> Should org-element try to catch one-line environments as the one
>> above?  Or is it "a can of worms"?
>
> I think allowing them is possible, as long as \end{equation} ends the
> last line (with trailing whitespaces tolerated).
>
> I assume that, for simplicity, PLACEHOLDER could be longer than one
> line, too.
>
> If Bastien agrees and if you want to provide a patch, please be sure to
> include tests in test-org-element.el, and appropriate changes to "Org
> Syntax" document. The manual may need to be updated, too.

Thanks for your comments.

I changed a couple of things, and org-element now recognizes what I
wanted it to recognize.  Next step: don't break anything.

To test for immediate breakages I do:
   1. Install latest version of org (beta_8.3-15-g224105) on my /usr
      folder and start emacs -q.  Run tests as specified in
      testing/readme.  I get 12 failed (1 unexpected).
   2. Load my modified org-element.el file and rerun tests.
      The only new fail is this one, that is to be expected to fail.

          (should-not
           (eq 'latex-environment
               (org-test-with-temp-text "\\begin{env}{arg} something\nvalue\n\\end{env}"
                 (org-element-type (org-element-at-point)))))

Hence, my question is: Can I now assume that my modification are now
somewhat safe wrt. the current Org syntax?

I have a case that I worry a bit about, namely broken LaTeX code with
mismatching no. of begin/end tags:

Consider:

    #+begin_src org 
      \begin{equation} x^2_3
      \begin{equation} x^2_3 \end{equation}
    #+end_src

With my patch, this will produce (*):

    #+begin_src latex
    \begin{equation} x^2_3
    \begin{equation} x^2_3 \end{equation}
    #+end_src latex

Before it would produce

    #+begin_src latex
    \begin{equation} x\(^{\text{2}}_{\text{3}}\)
    \begin{equation} x\(^{\text{2}}_{\text{3}}\) \end{equation}
    #+end_src latex

Both will break when compiled.  With some more newlines in the
Org-code, I can get this without my patches

   #+begin_src latex
   \begin{equation} 
    x^2_3
    \begin{equation} 
    x^2_3
    \end{equation}
    #+end_src latex
 
which is identical to (*) to a LaTeX compiler.

Should I worry about such changes to behavior?


I will write some tests and review the two documents you mention and
submit a patch ASAP.

Cheers,
Rasmus

-- 
I hear there's rumors on the, uh, Internets. . .

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug?, org-element] latex-environment delimiters must be at BOL
  2014-07-18 15:49   ` Rasmus
@ 2014-07-20 17:49     ` Nicolas Goaziou
  2014-07-21  9:29       ` Rasmus
  0 siblings, 1 reply; 10+ messages in thread
From: Nicolas Goaziou @ 2014-07-20 17:49 UTC (permalink / raw)
  To: Rasmus; +Cc: emacs-orgmode

Hello,

Rasmus <rasmus@gmx.us> writes:

> I changed a couple of things,

A couple of things? AFAIU, it only requires to change the regexp in
`org-element-latex-environment-parser', doesn't it?

[...]

>    2. Load my modified org-element.el file and rerun tests.
>       The only new fail is this one, that is to be expected to fail.
>
>           (should-not
>            (eq 'latex-environment
>                (org-test-with-temp-text "\\begin{env}{arg} something\nvalue\n\\end{env}"
>                  (org-element-type (org-element-at-point)))))

OK.

> Hence, my question is: Can I now assume that my modification are now
> somewhat safe wrt. the current Org syntax?

You should also add tests specifics to the new behaviour.

> I have a case that I worry a bit about, namely broken LaTeX code with
> mismatching no. of begin/end tags:
>
> Consider:
>
>     #+begin_src org 
>       \begin{equation} x^2_3
>       \begin{equation} x^2_3 \end{equation}
>     #+end_src
>
> With my patch, this will produce (*):
>
>     #+begin_src latex
>     \begin{equation} x^2_3
>     \begin{equation} x^2_3 \end{equation}
>     #+end_src latex

This looks "correct".  It is a valid latex-environment element even
though it is not valid LaTeX syntax.

> I will write some tests and review the two documents you mention and
> submit a patch ASAP.

OK. Thank you for that.


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug?, org-element] latex-environment delimiters must be at BOL
  2014-07-20 17:49     ` Nicolas Goaziou
@ 2014-07-21  9:29       ` Rasmus
  2014-07-22  7:54         ` Nicolas Goaziou
  0 siblings, 1 reply; 10+ messages in thread
From: Rasmus @ 2014-07-21  9:29 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1215 bytes --]

Hi,

Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:

> Rasmus <rasmus@gmx.us> writes:
>
>> I changed a couple of things,
>
> A couple of things? AFAIU, it only requires to change the regexp in
> `org-element-latex-environment-parser', doesn't it?

Turns out that I also had to modify the regexp in
org-element--current-element as this kicks off
org-element-latex-environment-parser (correct me if I am wrong).

In the patch I define new regexps and use them everywhere.
Org-element.el seems to rather use spelled-out regexps.  If desirable,
I can use this practice rather than defining new regexps.  
[BTW: Is there a way to test performance systematically?]

The change in org-element-paragraph-parser, namely using the mentioned
regexp, may be redundant.

I have added tests on new newline behavior as well as a test on
different begin and end tags.  The latter could be moved to a separate
commit or entirely dropped (but it is in the org-syntax document).

This patch does not change the manual as I did not find any direct
discussion of newlines and LaTeX environments.  I do not change the
syntax document, as it seems to only live on Worg (different rope).

Cheers,
Rasmus

-- 
May contains speling mistake

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-org-element.el-allow-one-line-LaTeX-environments.patch --]
[-- Type: text/x-diff, Size: 4212 bytes --]

From f21ae57759ca5a1d4f19424c4eb0be8db4dcbfb5 Mon Sep 17 00:00:00 2001
From: Rasmus <w530@pank.eu>
Date: Fri, 18 Jul 2014 17:01:53 +0200
Subject: [PATCH] org-element.el: allow one-line LaTeX environments

* org-element.el (org-elment--latex-begin-environment,
org-element--latex-end-environment): New regexps identifying begining
and ending of LaTeX environment.
(org-element-latex-environment-parser, org-element-paragraph-parser,
org-element--current-element): use org-elment--latex-begin-environment
and org-element--latex-end-environment

* test-org-element.el (test-org-element/latex-environment-parser):
test for one-line LaTeX environments.
(test-org-element/latex-environment-parser): test different start tag
and end tag.
---
 lisp/org-element.el              | 19 ++++++++++++-------
 testing/lisp/test-org-element.el | 13 ++++++++++++-
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index 91b4934..5311e60 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -2083,6 +2083,13 @@ CONTENTS is nil."
 
 
 ;;;; Latex Environment
+(defconst org-elment--latex-begin-environment
+  "[ \t]*\\\\begin{\\([A-Za-z0-9*]+\\)}\\(\\[.*?\\]\\|{.*?}\\)*[ \t]*?"
+  "Holds regexp matching beginning of LaTeX environments.")
+
+(defconst org-element--latex-end-environment
+  "[ \t]*\\\\end{%s}[ \t]*"
+  "Holds regexp matching ending of LaTeX environments.")
 
 (defun org-element-latex-environment-parser (limit affiliated)
   "Parse a LaTeX environment.
@@ -2100,8 +2107,8 @@ Assume point is at the beginning of the latex environment."
   (save-excursion
     (let ((case-fold-search t)
 	  (code-begin (point)))
-      (looking-at "[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}")
-      (if (not (re-search-forward (format "^[ \t]*\\\\end{%s}[ \t]*$"
+      (looking-at org-elment--latex-begin-environment)
+      (if (not (re-search-forward (format org-element--latex-end-environment
 					  (regexp-quote (match-string 1)))
 				  limit t))
 	  ;; Incomplete latex environment: parse it as a paragraph.
@@ -2219,11 +2226,10 @@ Assume point is at the beginning of the paragraph."
 					  (org-match-string-no-properties 1)))
 				 limit t)))
 			 ;; Stop at valid latex environments.
-			 (and (looking-at
-			       "[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}")
+			 (and (looking-at org-elment--latex-begin-environment)
 			      (save-excursion
 				(re-search-forward
-				 (format "^[ \t]*\\\\end{%s}[ \t]*$"
+				 (format org-element--latex-end-environment
 					 (regexp-quote
 					  (org-match-string-no-properties 1)))
 				 limit t)))
@@ -3707,8 +3713,7 @@ element it has to parse."
 	      (goto-char (car affiliated))
 	      (org-element-keyword-parser limit nil))
 	     ;; LaTeX Environment.
-	     ((looking-at
-	       "[ \t]*\\\\begin{[A-Za-z0-9*]+}\\(\\[.*?\\]\\|{.*?}\\)*[ \t]*$")
+	     ((looking-at org-elment--latex-begin-environment)
 	      (org-element-latex-environment-parser limit affiliated))
 	     ;; Drawer and Property Drawer.
 	     ((looking-at org-drawer-regexp)
diff --git a/testing/lisp/test-org-element.el b/testing/lisp/test-org-element.el
index 04ef1ce..0c042e1 100644
--- a/testing/lisp/test-org-element.el
+++ b/testing/lisp/test-org-element.el
@@ -1310,9 +1310,20 @@ e^{i\\pi}+1=0
    (eq 'latex-environment
        (org-test-with-temp-text "\\begin{env}{arg}\nvalue\n\\end{env}"
 	 (org-element-type (org-element-at-point)))))
+  ;; Allow environments without newline after \begin{.}.
+  (should
+   (eq 'latex-environment
+       (org-test-with-temp-text "\\begin{env}{arg}something\nvalue\n\\end{env}"
+	 (org-element-type (org-element-at-point)))))
+    ;; Allow one-line environments.
+  (should
+   (eq 'latex-environment
+       (org-test-with-temp-text "\\begin{env}{arg}something\\end{env}"
+	 (org-element-type (org-element-at-point)))))
+  ;; Should not allow different tags
   (should-not
    (eq 'latex-environment
-       (org-test-with-temp-text "\\begin{env}{arg} something\nvalue\n\\end{env}"
+       (org-test-with-temp-text "\\begin{env*}{arg}something\\end{env}"
 	 (org-element-type (org-element-at-point)))))
   ;; Handle non-empty blank line at the end of buffer.
   (should
-- 
2.0.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [bug?, org-element] latex-environment delimiters must be at BOL
  2014-07-21  9:29       ` Rasmus
@ 2014-07-22  7:54         ` Nicolas Goaziou
  2014-07-22  9:08           ` Rasmus
  0 siblings, 1 reply; 10+ messages in thread
From: Nicolas Goaziou @ 2014-07-22  7:54 UTC (permalink / raw)
  To: Rasmus; +Cc: emacs-orgmode

Hello,

Rasmus <rasmus@gmx.us> writes:

Thanks for your patch.

> Turns out that I also had to modify the regexp in
> org-element--current-element as this kicks off
> org-element-latex-environment-parser (correct me if I am wrong).

Correct.

> In the patch I define new regexps and use them everywhere.
> Org-element.el seems to rather use spelled-out regexps.  If desirable,
> I can use this practice rather than defining new regexps.

IMO, the way to go would be to define regexps as constants everywhere in
"org-element.el", but that would introduce duplicates with some
constants in "org.el". IOW, there's no clear design about it. Situation
may change when the parser gets more integrated into core.

Meanwhile, it can't hurt to use constants in "org-element.el".

> [BTW: Is there a way to test performance systematically?]

I don't think so.

> The change in org-element-paragraph-parser, namely using the mentioned
> regexp, may be redundant.

That's fine by me.

> This patch does not change the manual as I did not find any direct
> discussion of newlines and LaTeX environments.

OK.

> From f21ae57759ca5a1d4f19424c4eb0be8db4dcbfb5 Mon Sep 17 00:00:00 2001
> From: Rasmus <w530@pank.eu>
> Date: Fri, 18 Jul 2014 17:01:53 +0200
> Subject: [PATCH] org-element.el: allow one-line LaTeX environments

"Allow"

> * org-element.el (org-elment--latex-begin-environment,
> org-element--latex-end-environment): New regexps identifying begining
> and ending of LaTeX environment.

"beginning"

> (org-element-latex-environment-parser, org-element-paragraph-parser,
> org-element--current-element): use org-elment--latex-begin-environment
> and org-element--latex-end-environment

"Use" and a full stop. Also, "elment" -> "element"

> * test-org-element.el (test-org-element/latex-environment-parser):
> test for one-line LaTeX environments.
> (test-org-element/latex-environment-parser): test different start tag
> and end tag.

"Add tests." is sufficient.

>  ;;;; Latex Environment
> +(defconst org-elment--latex-begin-environment
> +  "[ \t]*\\\\begin{\\([A-Za-z0-9*]+\\)}\\(\\[.*?\\]\\|{.*?}\\)*[ \t]*?"
> +  "Holds regexp matching beginning of LaTeX environments.")

There should be a blank line below "Latex Environment". Also, the
docstring should be "Regexp matching...".

The regexp is wrong. It must start at the beginning of line. Also, you
can skip the optional part since you don't intend to analyze it.

  "^[ \t]*\\\\begin{\\([A-Za-z0-9*]+\\)}"

Also, "elment" -> "element".

> +
> +(defconst org-element--latex-end-environment
> +  "[ \t]*\\\\end{%s}[ \t]*"
> +  "Holds regexp matching ending of LaTeX environments.")

Likewise, the end of line is mandatory here, and the optional
whitespaces at the beginning of line not necessary.

  "\\\\end{%s}[ \t]*$"

Note that it isn't a regexp per se, but a format string used to build
a regexp. Maybe the docstring should be explicit about this.

>      (let ((case-fold-search t)
>  	  (code-begin (point)))
> -      (looking-at "[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}")
> -      (if (not (re-search-forward (format "^[ \t]*\\\\end{%s}[ \t]*$"
> +      (looking-at org-elment--latex-begin-environment)

"elment" -> "element"

>  			 ;; Stop at valid latex environments.
> -			 (and (looking-at
> -			       "[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}")
> +			 (and (looking-at org-elment--latex-begin-environment)

"elment" -> "element"

> -	     ((looking-at
> -	       "[ \t]*\\\\begin{[A-Za-z0-9*]+}\\(\\[.*?\\]\\|{.*?}\\)*[ \t]*$")
> +	     ((looking-at org-elment--latex-begin-environment)

"elment" -> "element"


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug?, org-element] latex-environment delimiters must be at BOL
  2014-07-22  7:54         ` Nicolas Goaziou
@ 2014-07-22  9:08           ` Rasmus
  2014-07-22 10:03             ` Nicolas Goaziou
  0 siblings, 1 reply; 10+ messages in thread
From: Rasmus @ 2014-07-22  9:08 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 672 bytes --]

Hi,

Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:

> IMO, the way to go would be to define regexps as constants everywhere in
> "org-element.el", but that would introduce duplicates with some
> constants in "org.el". IOW, there's no clear design about it. Situation
> may change when the parser gets more integrated into core.

Sounds like an daunting, but worthwhile task.

> Meanwhile, it can't hurt to use constants in "org-element.el".

Good to know.

The attached patch hopefully addresses all of the issues you pointed
out.  Sorry about those typos before!

Thanks,
Rasmus

-- 
Sådan en god dansk lagereddike kan man slet ikke bruge mere

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-org-element.el-Allow-one-line-LaTeX-environments.patch --]
[-- Type: text/x-diff, Size: 4837 bytes --]

From b34f2199a648f6eafd6917662ec0d6c56e12bc8b Mon Sep 17 00:00:00 2001
From: Rasmus <rasmus@gmx.us>
Date: Fri, 18 Jul 2014 17:01:53 +0200
Subject: [PATCH] org-element.el: Allow one-line LaTeX environments

* org-element.el (org-element--latex-begin-environment,
org-element--latex-end-environment): New format strings to identify
beginning and ending of LaTeX environments.
(org-element-latex-environment-parser, org-element-paragraph-parser,
org-element--current-element): Use `org-element--latex-begin-environment'
and `org-element--latex-end-environment'.

* test-org-element.el (test-org-element/latex-environment-parser):
Add tests.
---
 lisp/org-element.el              | 29 ++++++++++++++++++++++-------
 testing/lisp/test-org-element.el | 22 +++++++++++++++++++++-
 2 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index 850ed94..147432e 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -2084,6 +2084,23 @@ CONTENTS is nil."
 
 ;;;; Latex Environment
 
+(defconst org-element--latex-begin-environment
+  "^[ \t]*\\\\begin{\\([A-Za-z0-9*]+\\)}"
+  ;; The following format string also matches optional arguments:
+  ;; "^[ \t]*\\\\begin{\\([A-Za-z0-9*]+\\)}\\(\\[.*?\\]\\|{.*?}\\)*[ \t]*?"
+  "Format string matching the beginning of a LaTeX environment.
+
+Usage example:
+  (format org-element--latex-begin-environment ENV)
+where ENV is a LaTeX environment.
+
+See also `org-element--latex-end-environment'.")
+
+(defconst org-element--latex-end-environment
+  "\\\\end{%s}[ \t]*$"
+  "Format string matching the ending of a LaTeX environment.
+See also `org-element--latex-begin-environment'.")
+
 (defun org-element-latex-environment-parser (limit affiliated)
   "Parse a LaTeX environment.
 
@@ -2100,8 +2117,8 @@ Assume point is at the beginning of the latex environment."
   (save-excursion
     (let ((case-fold-search t)
 	  (code-begin (point)))
-      (looking-at "[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}")
-      (if (not (re-search-forward (format "^[ \t]*\\\\end{%s}[ \t]*$"
+      (looking-at org-element--latex-begin-environment)
+      (if (not (re-search-forward (format org-element--latex-end-environment
 					  (regexp-quote (match-string 1)))
 				  limit t))
 	  ;; Incomplete latex environment: parse it as a paragraph.
@@ -2219,11 +2236,10 @@ Assume point is at the beginning of the paragraph."
 					  (org-match-string-no-properties 1)))
 				 limit t)))
 			 ;; Stop at valid latex environments.
-			 (and (looking-at
-			       "[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}")
+			 (and (looking-at org-element--latex-begin-environment)
 			      (save-excursion
 				(re-search-forward
-				 (format "^[ \t]*\\\\end{%s}[ \t]*$"
+				 (format org-element--latex-end-environment
 					 (regexp-quote
 					  (org-match-string-no-properties 1)))
 				 limit t)))
@@ -3707,8 +3723,7 @@ element it has to parse."
 	      (goto-char (car affiliated))
 	      (org-element-keyword-parser limit nil))
 	     ;; LaTeX Environment.
-	     ((looking-at
-	       "[ \t]*\\\\begin{[A-Za-z0-9*]+}\\(\\[.*?\\]\\|{.*?}\\)*[ \t]*$")
+	     ((looking-at org-element--latex-begin-environment)
 	      (org-element-latex-environment-parser limit affiliated))
 	     ;; Drawer and Property Drawer.
 	     ((looking-at org-drawer-regexp)
diff --git a/testing/lisp/test-org-element.el b/testing/lisp/test-org-element.el
index 04ef1ce..0ca0f8a 100644
--- a/testing/lisp/test-org-element.el
+++ b/testing/lisp/test-org-element.el
@@ -1310,9 +1310,29 @@ e^{i\\pi}+1=0
    (eq 'latex-environment
        (org-test-with-temp-text "\\begin{env}{arg}\nvalue\n\\end{env}"
 	 (org-element-type (org-element-at-point)))))
+  ;; Allow environments without newline after \begin{.}.
+  (should
+   (eq 'latex-environment
+       (org-test-with-temp-text "\\begin{env}{arg}something\nvalue\n\\end{env}"
+	 (org-element-type (org-element-at-point)))))
+  ;; Allow one-line environments.
+  (should
+   (eq 'latex-environment
+       (org-test-with-temp-text "\\begin{env}{arg}something\\end{env}"
+	 (org-element-type (org-element-at-point)))))
+  ;; Should not allow different tags.
+  (should-not
+   (eq 'latex-environment
+       (org-test-with-temp-text "\\begin{env*}{arg}something\\end{env}"
+				(org-element-type (org-element-at-point)))))
+  ;; LaTeX environments must be on separate lines.
+  (should-not
+   (eq 'latex-environment
+       (org-test-with-temp-text "\\begin{env} x \\end{env} y"
+	 (org-element-type (org-element-at-point)))))
   (should-not
    (eq 'latex-environment
-       (org-test-with-temp-text "\\begin{env}{arg} something\nvalue\n\\end{env}"
+       (org-test-with-temp-text "y \\begin{env} x<point> \\end{env}"
 	 (org-element-type (org-element-at-point)))))
   ;; Handle non-empty blank line at the end of buffer.
   (should
-- 
2.0.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [bug?, org-element] latex-environment delimiters must be at BOL
  2014-07-22  9:08           ` Rasmus
@ 2014-07-22 10:03             ` Nicolas Goaziou
  2014-07-22 13:28               ` Rasmus
  0 siblings, 1 reply; 10+ messages in thread
From: Nicolas Goaziou @ 2014-07-22 10:03 UTC (permalink / raw)
  To: Rasmus; +Cc: emacs-orgmode

Rasmus <rasmus@gmx.us> writes:

> The attached patch hopefully addresses all of the issues you pointed
> out.  Sorry about those typos before!

Thanks for the update. Some more comments below.

> +(defconst org-element--latex-begin-environment
> +  "^[ \t]*\\\\begin{\\([A-Za-z0-9*]+\\)}"

I overlooked this in the previous patch. This regexp is not correct, as
it matches, e.g.

  \begin{ab*cd}

A more accurate regexp is

  "^[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}"

I realize that `org-element--current-element' is wrong here. I fixed it
in maint, which probably means that you will need to deal with a merge
conflict.

> +  ;; The following format string also matches optional arguments:
> +  ;; "^[ \t]*\\\\begin{\\([A-Za-z0-9*]+\\)}\\(\\[.*?\\]\\|{.*?}\\)*[ \t]*?"

I think this comment is not necessary.

> +  "Format string matching the beginning of a LaTeX environment.
> +
> +Usage example:
> +  (format org-element--latex-begin-environment ENV)
> +where ENV is a LaTeX environment.

`org-element--latex-begin-environment' is not a format string.
`org-element--latex-end-environment' is.

You can remove this part. Add that the environment is put in group 1,
though.

Otherwise, the patch looks good. Feel free to apply it with suggested
changes.


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug?, org-element] latex-environment delimiters must be at BOL
  2014-07-22 10:03             ` Nicolas Goaziou
@ 2014-07-22 13:28               ` Rasmus
  2014-07-23  8:29                 ` Nicolas Goaziou
  0 siblings, 1 reply; 10+ messages in thread
From: Rasmus @ 2014-07-22 13:28 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1980 bytes --]

Hi,

Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:

>> +(defconst org-element--latex-begin-environment
>> +  "^[ \t]*\\\\begin{\\([A-Za-z0-9*]+\\)}"
>
> I overlooked this in the previous patch. This regexp is not correct, as
> it matches, e.g.
>
>   \begin{ab*cd}
>
> A more accurate regexp is
>
>   "^[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}"

Note that the maint/master version of org-element--current-element
does not including a trailing "^".  I

> I realize that `org-element--current-element' is wrong here. I fixed it
> in maint, which probably means that you will need to deal with a merge
> conflict.

I'm not sure the location of stars matter much in a technical sense,
though I have never come across any environments using stars other
than for the last character.

This document compiles fine, although it is an exercise is obscurity:

     \documentclass{minimal}
     \newenvironment{ab*cd}[1]{\bfseries{#1}}{}
     \newenvironment{*abcd}[1]{\bfseries{#1}}{}
     \begin{document}
     \begin{ab*cd} x \end{ab*cd}
     \begin{*abcd} x \end{*abcd}
     \end{document}

I'm happy to accept only stars in the end, though.

>> +  ;; The following format string also matches optional arguments:
>> +  ;; "^[ \t]*\\\\begin{\\([A-Za-z0-9*]+\\)}\\(\\[.*?\\]\\|{.*?}\\)*[ \t]*?"
>
> I think this comment is not necessary.

It's a nice regexp. . .

>> +  "Format string matching the beginning of a LaTeX environment.
>> +
>> +Usage example:
>> +  (format org-element--latex-begin-environment ENV)
>> +where ENV is a LaTeX environment.
>
> `org-element--latex-begin-environment' is not a format string.
> `org-element--latex-end-environment' is.

True.  

> You can remove this part. Add that the environment is put in group 1,
> though.
>
> Otherwise, the patch looks good. Feel free to apply it with suggested
> changes.

Sorry, I can't push to Org-git.  But a patch is attached.

Thanks again,
Rasmus

-- 
This is the kind of tedious nonsense up with which I will not put

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-org-element.el-Allow-one-line-LaTeX-environments.patch --]
[-- Type: text/x-diff, Size: 4595 bytes --]

From fce1bff7789f90a9fa16cc318233166faf961a29 Mon Sep 17 00:00:00 2001
From: Rasmus <rasmus@gmx.us>
Date: Tue, 22 Jul 2014 15:09:03 +0200
Subject: [PATCH] org-element.el: Allow one-line LaTeX environments

* org-element.el (org-element--latex-begin-environment,
org-element--latex-end-environment): New format strings to identify
beginning and ending of LaTeX environments.
(org-element-latex-environment-parser, org-element-paragraph-parser,
org-element--current-element): Use `org-element--latex-begin-environment'
and `org-element--latex-end-environment'.

* test-org-element.el (test-org-element/latex-environment-parser):
Add tests.
---
 lisp/org-element.el              | 23 +++++++++++++++++------
 testing/lisp/test-org-element.el | 22 +++++++++++++++++++++-
 2 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index bfbba5e..75fccc5 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -2084,6 +2084,18 @@ CONTENTS is nil."
 
 ;;;; Latex Environment
 
+(defconst org-element--latex-begin-environment
+  "^[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}"
+  "Regexp matching the beginning of a LaTeX environment.
+The environment is captured by the first group.
+
+See also `org-element--latex-end-environment'.")
+
+(defconst org-element--latex-end-environment
+  "\\\\end{%s}[ \t]*$"
+  "Format string matching the ending of a LaTeX environment.
+See also `org-element--latex-begin-environment'.")
+
 (defun org-element-latex-environment-parser (limit affiliated)
   "Parse a LaTeX environment.
 
@@ -2100,8 +2112,8 @@ Assume point is at the beginning of the latex environment."
   (save-excursion
     (let ((case-fold-search t)
 	  (code-begin (point)))
-      (looking-at "[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}")
-      (if (not (re-search-forward (format "^[ \t]*\\\\end{%s}[ \t]*$"
+      (looking-at org-element--latex-begin-environment)
+      (if (not (re-search-forward (format org-element--latex-end-environment
 					  (regexp-quote (match-string 1)))
 				  limit t))
 	  ;; Incomplete latex environment: parse it as a paragraph.
@@ -2219,11 +2231,10 @@ Assume point is at the beginning of the paragraph."
 					  (org-match-string-no-properties 1)))
 				 limit t)))
 			 ;; Stop at valid latex environments.
-			 (and (looking-at
-			       "[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}")
+			 (and (looking-at org-element--latex-begin-environment)
 			      (save-excursion
 				(re-search-forward
-				 (format "^[ \t]*\\\\end{%s}[ \t]*$"
+				 (format org-element--latex-end-environment
 					 (regexp-quote
 					  (org-match-string-no-properties 1)))
 				 limit t)))
@@ -3707,7 +3718,7 @@ element it has to parse."
 	      (goto-char (car affiliated))
 	      (org-element-keyword-parser limit nil))
 	     ;; LaTeX Environment.
-	     ((looking-at "[ \t]*\\\\begin{\\([A-Za-z0-9]+\\*?\\)}")
+	     ((looking-at org-element--latex-begin-environment)
 	      (org-element-latex-environment-parser limit affiliated))
 	     ;; Drawer and Property Drawer.
 	     ((looking-at org-drawer-regexp)
diff --git a/testing/lisp/test-org-element.el b/testing/lisp/test-org-element.el
index 04ef1ce..0ca0f8a 100644
--- a/testing/lisp/test-org-element.el
+++ b/testing/lisp/test-org-element.el
@@ -1310,9 +1310,29 @@ e^{i\\pi}+1=0
    (eq 'latex-environment
        (org-test-with-temp-text "\\begin{env}{arg}\nvalue\n\\end{env}"
 	 (org-element-type (org-element-at-point)))))
+  ;; Allow environments without newline after \begin{.}.
+  (should
+   (eq 'latex-environment
+       (org-test-with-temp-text "\\begin{env}{arg}something\nvalue\n\\end{env}"
+	 (org-element-type (org-element-at-point)))))
+  ;; Allow one-line environments.
+  (should
+   (eq 'latex-environment
+       (org-test-with-temp-text "\\begin{env}{arg}something\\end{env}"
+	 (org-element-type (org-element-at-point)))))
+  ;; Should not allow different tags.
+  (should-not
+   (eq 'latex-environment
+       (org-test-with-temp-text "\\begin{env*}{arg}something\\end{env}"
+				(org-element-type (org-element-at-point)))))
+  ;; LaTeX environments must be on separate lines.
+  (should-not
+   (eq 'latex-environment
+       (org-test-with-temp-text "\\begin{env} x \\end{env} y"
+	 (org-element-type (org-element-at-point)))))
   (should-not
    (eq 'latex-environment
-       (org-test-with-temp-text "\\begin{env}{arg} something\nvalue\n\\end{env}"
+       (org-test-with-temp-text "y \\begin{env} x<point> \\end{env}"
 	 (org-element-type (org-element-at-point)))))
   ;; Handle non-empty blank line at the end of buffer.
   (should
-- 
2.0.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [bug?, org-element] latex-environment delimiters must be at BOL
  2014-07-22 13:28               ` Rasmus
@ 2014-07-23  8:29                 ` Nicolas Goaziou
  0 siblings, 0 replies; 10+ messages in thread
From: Nicolas Goaziou @ 2014-07-23  8:29 UTC (permalink / raw)
  To: Rasmus; +Cc: emacs-orgmode

Hello,

Rasmus <rasmus@gmx.us> writes:

> Note that the maint/master version of org-element--current-element
> does not including a trailing "^".

That's becasue `org-element--current-element' always does the check at
the beginning of line.

> I'm not sure the location of stars matter much in a technical sense,
> though I have never come across any environments using stars other
> than for the last character.
>
> This document compiles fine, although it is an exercise is obscurity:
>
>      \documentclass{minimal}
>      \newenvironment{ab*cd}[1]{\bfseries{#1}}{}
>      \newenvironment{*abcd}[1]{\bfseries{#1}}{}
>      \begin{document}
>      \begin{ab*cd} x \end{ab*cd}
>      \begin{*abcd} x \end{*abcd}
>      \end{document}
>
> I'm happy to accept only stars in the end, though.

I somehow was convinced that star characters were only allowed at the
end of the environment name. You're right, enforcing stars at the end is
unnecessary, and \\([A-Za-z0-9*]+\\) is better.

> Sorry, I can't push to Org-git.  But a patch is attached.

I applied your patch with aforementioned change. Thank you.


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-07-23  8:29 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-10 11:54 [bug?, org-element] latex-environment delimiters must be at BOL Rasmus
2014-07-16 13:04 ` Nicolas Goaziou
2014-07-18 15:49   ` Rasmus
2014-07-20 17:49     ` Nicolas Goaziou
2014-07-21  9:29       ` Rasmus
2014-07-22  7:54         ` Nicolas Goaziou
2014-07-22  9:08           ` Rasmus
2014-07-22 10:03             ` Nicolas Goaziou
2014-07-22 13:28               ` Rasmus
2014-07-23  8:29                 ` Nicolas Goaziou

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).