emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
* Question about Org syntax
@ 2021-05-16  3:38 Ihor Radchenko
  2021-05-16  7:11 ` Nicolas Goaziou
  0 siblings, 1 reply; 5+ messages in thread
From: Ihor Radchenko @ 2021-05-16  3:38 UTC (permalink / raw)
  To: emacs-orgmode

Hi,

I am wondering about the element structure of the following Org buffer:

* test
:drawer:
Paragraph
* test
:end:

Should the ":end:" line belong to drawer or should it be a separate
paragraph? Running org-element-at-point at the beginning of ":end:" line
yields (drawer ...).

Best,
Ihor


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Question about Org syntax
  2021-05-16  3:38 Question about Org syntax Ihor Radchenko
@ 2021-05-16  7:11 ` Nicolas Goaziou
  2021-05-16  8:01   ` Ihor Radchenko
  0 siblings, 1 reply; 5+ messages in thread
From: Nicolas Goaziou @ 2021-05-16  7:11 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

Hello,

Ihor Radchenko <yantar92@gmail.com> writes:

> I am wondering about the element structure of the following Org buffer:
>
> * test
> :drawer:
> Paragraph
> * test
> :end:
>
> Should the ":end:" line belong to drawer or should it be a separate
> paragraph? Running org-element-at-point at the beginning of ":end:" line
> yields (drawer ...).

It should be a paragraph. I'll fix it soon.

Note the problem can be reproduced with only

  * test
  :end:

Regards,
-- 
Nicolas Goaziou


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Question about Org syntax
  2021-05-16  7:11 ` Nicolas Goaziou
@ 2021-05-16  8:01   ` Ihor Radchenko
  2021-05-16  8:38     ` Nicolas Goaziou
  0 siblings, 1 reply; 5+ messages in thread
From: Ihor Radchenko @ 2021-05-16  8:01 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
> It should be a paragraph. I'll fix it soon.
>
> Note the problem can be reproduced with only
>
>   * test
>   :end:

Thanks!

Also, I have few more questions (or maybe bug reports) about
syntax/parsing:

1. Does org-element--current element suppose to return (paragraph ...)
   on empty buffer?

2. Some of the element parsers honour LIMIT argument partially. Part of
   the parsing is typically done using looking-at (ignoring the LIMIT)
   and part is honouring it. This can backfire when LIMIT is before
   first characteristic line of the element. For example take headline
   parser:

   <point>* Example<limit> headline

   :contents-begin of the parsed headline will be _after_ :end

   Or even
   <point><limit>* example headline

   :contents-begin is  equal to :begin, sometimes leading to infinite
   loops in org-element--parse-to called by org-element-cache (hence,
   known bug with Emacs hangs when org-element-use-cache is non-nil)

   Some of the parsers potentially causing similar issues are:

   In particular, org-element-footnote-definition-parser,
   org-element-headline-parser, org-element-inlinetask-parser,
   org-element-plain-list-parser, org-element-property-drawer-parser,
   org-element-babel-call-parser, org-element-clock-parser,
   org-element-comment-parser, org-element-diary-sexp-parser,
   org-element-fixed-width-parser, org-element-horizontal-rule-parser,
   org-element-keyword-parser, org-element-node-property-parser,
   org-element-paragraph-parser, ...


 3. Some of the element parsers ignore LIMIT altogether:
    org-element-item-parser, org-element-section-parser...

    Is there any reason behind this? I though that parsing narrowed
    buffer is supposed to honour narrowing. Also, ignoring LIMIT might
    cause issue when trying to parse only visible elements.

Best,
Ihor
    


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Question about Org syntax
  2021-05-16  8:01   ` Ihor Radchenko
@ 2021-05-16  8:38     ` Nicolas Goaziou
  2021-05-23  3:50       ` Ihor Radchenko
  0 siblings, 1 reply; 5+ messages in thread
From: Nicolas Goaziou @ 2021-05-16  8:38 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

Ihor Radchenko <yantar92@gmail.com> writes:

> Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
>> It should be a paragraph. I'll fix it soon.
>>
>> Note the problem can be reproduced with only
>>
>>   * test
>>   :end:
>
> Thanks!

Fixed. Thank you.

> Also, I have few more questions (or maybe bug reports) about
> syntax/parsing:
>
> 1. Does org-element--current element suppose to return (paragraph ...)
>    on empty buffer?

It is undefined. `org-element-current-element' is an internal function
being called at the beginning of "something". 

However, `org-element-at-point' is expected to return nil in an empty
buffer.

> 2. Some of the element parsers honour LIMIT argument partially. Part of
>    the parsing is typically done using looking-at (ignoring the LIMIT)
>    and part is honouring it. This can backfire when LIMIT is before
>    first characteristic line of the element. For example take headline
>    parser:
>
>    <point>* Example<limit> headline
>
>    :contents-begin of the parsed headline will be _after_ :end
>
>    Or even
>    <point><limit>* example headline
>
>    :contents-begin is  equal to :begin, sometimes leading to infinite
>    loops in org-element--parse-to called by org-element-cache (hence,
>    known bug with Emacs hangs when org-element-use-cache is non-nil)
>
>    Some of the parsers potentially causing similar issues are:
>
>    In particular, org-element-footnote-definition-parser,
>    org-element-headline-parser, org-element-inlinetask-parser,
>    org-element-plain-list-parser, org-element-property-drawer-parser,
>    org-element-babel-call-parser, org-element-clock-parser,
>    org-element-comment-parser, org-element-diary-sexp-parser,
>    org-element-fixed-width-parser, org-element-horizontal-rule-parser,
>    org-element-keyword-parser, org-element-node-property-parser,
>    org-element-paragraph-parser, ...

LIMIT is not a random position in the buffer. It is supposed to be the
end of the parent element, or (point-max).

It is a bug (in the parser or in the cache) if it ends up being anywhere
else.

>  3. Some of the element parsers ignore LIMIT altogether:
>     org-element-item-parser, org-element-section-parser...

`org-element-section-parser' actually recomputes LIMIT since it calls
`outline-next-heading'. This is sub-optimal and could probably be
removed.

`org-element-item-parser' is called in `item' mode, i.e., right after
`org-element-plain-list-parser', which already takes care of LIMIT. No
need to handle it twice.

>     Is there any reason behind this? I though that parsing narrowed
>     buffer is supposed to honour narrowing. Also, ignoring LIMIT might
>     cause issue when trying to parse only visible elements.

No, parsing ignores any narrowing, hence the copious use of
`org-with-wide-buffer' or `org-with-point-at'.

Narrowing is here to help the user focus on a part of the document, not
to cheat on the surrounding syntax. As an example

  Here is an example: what do you think about it?

Narrowing the buffer to ": what do you think about it?" for reasons
should not trick the parser into thinking you're in a fixed width area.

Regards,
-- 
Nicolas Goaziou


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Question about Org syntax
  2021-05-16  8:38     ` Nicolas Goaziou
@ 2021-05-23  3:50       ` Ihor Radchenko
  0 siblings, 0 replies; 5+ messages in thread
From: Ihor Radchenko @ 2021-05-23  3:50 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 4998 bytes --]

Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:

Thanks for your detailed explanations!

>> 2. Some of the element parsers honour LIMIT argument partially. Part of
>>    the parsing is typically done using looking-at (ignoring the LIMIT)
>>    and part is honouring it. This can backfire when LIMIT is before
>>    first characteristic line of the element. For example take headline
>>    parser:
>> ...
>
> LIMIT is not a random position in the buffer. It is supposed to be the
> end of the parent element, or (point-max).
>
> It is a bug (in the parser or in the cache) if it ends up being anywhere
> else.

Makes sense, though it is not mentioned in the docstrings and even in
the docstring of the `org-element--current-element'. Moreover, section
comment about parsers states that most parsers accept no arguments:

> ;; A parser returns the element or object as the list described above.
> ;; Most of them accepts no argument. 

And the comment about adding things to `org-element--current-element'
does not help either.

> ;; Beside implementing a parser and an interpreter, adding a new
> ;; greater element requires tweaking `org-element--current-element'.
> ;; Moreover, the newly defined type must be added to both
> ;; `org-element-all-elements' and `org-element-greater-elements'.

Probably, the docstring of `org-element--current-element' could mention
about the expected values of LIMIT argument?

------

Also, I have some more questions as I am trying to understand the
org-element-cache code.

I tried to add some additional comments to the existing code to clarify
parts I had difficulties understanding. See the attached patch. Let me
know if any of the comments are incorrect so that I can update my
understanding.

For now I am stuck understanding some places in the code. They are
either bugs or I misunderstood some things:

1. In org-element--cache-process-request:

>        (while t
> ...
>	  (let ((beg (aref request 0))
>		(end (aref request 2))
>		(node (org-element--cache-root))
>		data data-key last-container)
> 
> ...
>			(and last-container
> ...
>		      (progn (when (and (not last-container)
>					(> (org-element-property :end data)
>					   end))
>			       (setq last-container data))

I do not understand the use of last-container here. The code is ran in a
while loop containing let-bound last-container variable (the let binding
sets it to nil).
(setq last-container data) will not affect further iterations of the
while loop as last-container will always be nil inside the let-binding.

2. In org-element--cache-submit-request

>      (let ((first (org-element--cache-for-removal beg end offset)))
> ...
>	    (push (let ((beg (org-element-property :begin first))
>			(key (org-element--cache-key first)))
> ...
>		     ((let ((first-end (org-element-property :end first)))
>			(and (>= first-end end)
>			     (vector key beg first-end offset first 0))))
> ...
>		  org-element--cache-sync-requests)

According to the docstring of org-element--cache-sync-requests,
(aref 4 request) during "0" phase should be "PARENT, when non-nil, is
the parent of the first element to be removed. Yet, KEY is the key of
the FIRST and FIRST itself is passed as PARENT.

>	    (push (let ((beg (org-element-property :begin first))
>			(key (org-element--cache-key first)))
> ...
>		     ;; Otherwise, we find the first non robust
>		     ;; element containing END.  All elements between
>		     ;; FIRST and this one are to be removed.
> ...
>		     (t
>		      (let* ((element (org-element--cache-find end))
>			     (end (org-element-property :end element))
>			     (up element))
>			(while (and (setq up (org-element-property :parent up))
>				    (>= (org-element-property :begin up) beg))
>			  (setq end (org-element-property :end up)
>				element up))
>			(vector key beg end offset element 0)))))

Despite what the comment states, the following code simply searches for
the parent of FIRST and extents the region of phase 0 request all the
way to the end of that parent. (note that BEG is re-bound to the
beginning of the FIRST).

Now, consider the following branch of org-element--cache-submit-request.
The (aref next 4) may contain the parent of FIRST from the above code.
That parent is robust (or it would be returned by
org-element--cache-for-removal) and its :end/:contents-end have already
been shifted in the earlier call to org-element--cache-for-removal
(that was where we have found the FIRST value):

>	  ;; If last change happened within area to be removed, extend
>	  ;; boundaries of robust parents, if any.  Otherwise, find
>	  ;; first element to remove and update request accordingly.
>	  (if (> beg delete-from)
> ...
>	      (let ((up (aref next 4)))
>		(while up
>		  (org-element--cache-shift-positions
>		   up offset '(:contents-end :end))
>		  (setq up (org-element-property :parent up))))

The first iteration of the while loop will potentially extend the
:end/:contents-end of the parent of FIRST second time in a row.

Best,
Ihor


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-comments-to-org-element-cache-code.patch --]
[-- Type: text/x-diff, Size: 5163 bytes --]

From 354275e12ca86b3f226a08dd57f0794e680afc49 Mon Sep 17 00:00:00 2001
Message-Id: <354275e12ca86b3f226a08dd57f0794e680afc49.1621740204.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Sun, 23 May 2021 11:03:40 +0800
Subject: [PATCH] Add comments to org-element-cache code

---
 lisp/org-element.el | 36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index ba4f0ead6..b9bb87b66 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -5322,6 +5322,9 @@ (defun org-element--cache-process-request
 		  (if (if (or (not next)
 			      (org-element--cache-key-less-p data-key next))
 			  (<= pos end)
+                        ;; FIXME: Isn't LAST-CONTAINER always nil here?
+                        ;; We are inside let-binding defining
+                        ;; last-container to be nil.
 			(and last-container
 			     (let ((up data))
 			       (while (and up (not (eq up last-container)))
@@ -5675,7 +5678,10 @@ (defun org-element--cache-for-removal (beg end offset)
 any position between BEG and END.  As an exception, greater
 elements around the changes that are robust to contents
 modifications are preserved and updated according to the
-changes."
+changes.  In the latter case, the returned element is the outermost
+non-robust element affected by changed.  Note that the returned
+element may end before END position in which case some cached element
+starting after the returned may still be affected by the changes."
   (let* ((elements (org-element--cache-find (1- beg) 'both))
 	 (before (car elements))
 	 (after (cdr elements)))
@@ -5722,22 +5728,39 @@ (defun org-element--cache-submit-request (beg end offset)
   (let ((next (car org-element--cache-sync-requests))
 	delete-to delete-from)
     (if (and next
+             ;; First existing sync request is in phase 0.
 	     (zerop (aref next 5))
+             ;; Current changes intersect with the first sync request.
 	     (> (setq delete-to (+ (aref next 2) (aref next 3))) end)
 	     (<= (setq delete-from (aref next 1)) end))
 	;; Current changes can be merged with first sync request: we
 	;; can save a partial cache synchronization.
 	(progn
+          ;; Update OFFSET of the existing request.
 	  (cl-incf (aref next 3) offset)
 	  ;; If last change happened within area to be removed, extend
 	  ;; boundaries of robust parents, if any.  Otherwise, find
 	  ;; first element to remove and update request accordingly.
 	  (if (> beg delete-from)
+              ;; The current modification is completely inside NEXT.
+              ;; (aref next 4) is supposed to be the outermost parent
+              ;; to be removed.  Everything in cache inside that
+              ;; element will be removed later and we do not care
+              ;; about extending boundaries of robust elements in
+              ;; there.  Also, all the upper elements are guaranteed
+              ;; to be robust because otherwise `org-element--cache-for-removal'
+              ;; would return more outer element (to be stored in
+              ;; (aref next 4).
 	      (let ((up (aref next 4)))
 		(while up
 		  (org-element--cache-shift-positions
 		   up offset '(:contents-end :end))
 		  (setq up (org-element-property :parent up))))
+            ;; The current and NEXT modifications are intersecting
+            ;; with current modification starting before NEXT and NEXT
+            ;; ending after current.  We need to update the common
+            ;; non-robust parent for the new extended modification
+            ;; region.
 	    (let ((first (org-element--cache-for-removal beg delete-to offset)))
 	      (when first
 		(aset next 0 (org-element--cache-key first))
@@ -5762,8 +5785,15 @@ (defun org-element--cache-submit-request (beg end offset)
 		     ;; Otherwise, we find the first non robust
 		     ;; element containing END.  All elements between
 		     ;; FIRST and this one are to be removed.
+                     ;;
+                     ;; The current modification is completely inside
+                     ;; FIRST.  Clear and update cached elements in
+                     ;; region containing FIRST.
 		     ((let ((first-end (org-element-property :end first)))
-			(and (> first-end end)
+			(and (>= first-end end)
+                             ;; FIXME: According to `org-element--cache-sync-requests'
+                             ;; docstring, (aref 4 request) must be the
+                             ;; parent of the first element to be removed.
 			     (vector key beg first-end offset first 0))))
 		     (t
 		      (let* ((element (org-element--cache-find end))
@@ -5773,6 +5803,8 @@ (defun org-element--cache-submit-request (beg end offset)
 				    (>= (org-element-property :begin up) beg))
 			  (setq end (org-element-property :end up)
 				element up))
+                        ;; Extend region to remove elements to parent
+                        ;; of the FIRST.
 			(vector key beg end offset element 0)))))
 		  org-element--cache-sync-requests)
 	  ;; No element to remove.  No need to re-parent either.
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-05-23  3:45 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-05-16  3:38 Question about Org syntax Ihor Radchenko
2021-05-16  7:11 ` Nicolas Goaziou
2021-05-16  8:01   ` Ihor Radchenko
2021-05-16  8:38     ` Nicolas Goaziou
2021-05-23  3:50       ` Ihor Radchenko

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).