emacs-orgmode@gnu.org archives
From: Nicolas Goaziou <n.goaziou@gmail.com>
To: Achim Gratz <Stromeko@nexgo.de>
Cc: emacs-orgmode@gnu.org
Subject: Re: Exporting large documents
Date: Mon, 06 May 2013 21:17:50 +0200
Message-ID: <87haifex41.fsf@gmail.com>
In-Reply-To: <87vc6wuf0t.fsf@Rainer.invalid> (Achim Gratz's message of "Mon, 06 May 2013 20:41:54 +0200")

Hello,

Achim Gratz <Stromeko@nexgo.de> writes:

> Lawrence Mitchell writes:
>> org-element--current-element takes (on my machine) 0.0003 seconds per
>> call.  However, when exporting 128x the orgmanual introduction, it's
>> called around 250000 times giving ~ 80 seconds total time (out of ~200
>> total).
>
> I've traced this a bit and the question does warrant further
> investigation.  Exporting the introduction without any duplications
> already shows some interesting things: the property drawer for the
> introduction is scanned a whopping 137 times, followed by 134 times the
> cindex entry following it, followed by 125 times the "Summary" headline.
> The header options feature prominently with around 100 scans each as
> well.
>
> The rest of the calls have mostly just a single invocation, but there
> are some instances where parts of the tree are traversed multiple times
> in succession to apparently adjust the :end property to the leaf element
> in small increments or decrements.  If elements are mutable during
> parsing then caching is more difficult as well, obviously.
>
>> So it sort of feels like actually what is needed is microoptimisations
>> of the bits of the export engine that are called the most.
>
> Looking at the traces I'd think if we could eliminate the repeated
> backtracking to adjust the leafs or at least skip over those elements in
> a backtrack that are already fully parsed instead of parsing them again,
> that would be a good start.

Actually, the situation is a bit different: parsing doesn't backtrack.
Profile `org-element-parse-buffer' with elp and you will see that each
element is parsed only once.
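
For anyone who wants to repeat that measurement, here is a minimal
sketch. It assumes a reasonably recent Emacs with Org installed;
`my/profile-org-parse' is a hypothetical helper name, not something Org
provides, and the file you pass in is up to you.

```elisp
(require 'elp)
(require 'org)
(require 'org-element)

(defun my/profile-org-parse (file)
  "Parse FILE with `org-element-parse-buffer' under ELP and show the results.
Hypothetical helper for illustration only; not part of Org."
  ;; Instrument every profilable function whose name starts with "org-element".
  (elp-instrument-package "org-element")
  (unwind-protect
      (progn
        (with-temp-buffer
          (insert-file-contents file)
          (org-mode)
          (org-element-parse-buffer))
        ;; Display call counts and timings collected so far.
        (elp-results))
    ;; Always remove the instrumentation again, even on error.
    (elp-restore-all)))
```

The call-count column of the ELP report is the one to look at when
checking how often each parser function runs.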

The problem comes from `org-element-at-point'. To work reliably, it has
to move back to the current headline and start parsing the buffer again
from there. That means the first element after the headline (often
a property drawer) is parsed again every time we need information about
something within the section.
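
The effect can be shown with a toy model. This is not Org's code: the
names below are hypothetical, and a section is reduced to a sorted list
of element start positions. Each query rescans from the first element,
so earlier elements accumulate more parses.

```elisp
(require 'cl-lib)

(defvar my/toy-parse-counts (make-hash-table :test #'eql)
  "Toy model: maps an element's start position to its parse count.")

(defun my/toy-element-at-point (starts pos)
  "Return the start of the element containing POS.
STARTS is a sorted list of element start positions in one section.
The scan always begins at the first element, and every element
visited on the way is counted as parsed once more."
  (let (current)
    (cl-loop for start in starts
             while (<= start pos)
             do (cl-incf (gethash start my/toy-parse-counts 0))
                (setq current start))
    current))

;; Query one position inside each of four elements.
(let ((starts '(1 50 120 200)))
  (dolist (pos '(10 60 130 210))
    (my/toy-element-at-point starts pos)))
```

After those four queries the element at position 1 has been parsed four
times, the one at 50 three times, the one at 120 twice and the one at
200 once.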

A very good improvement for the exporter and, more importantly, for the
parser, would be to cache results from `org-element--current-element'.
However, this cache would need to be refreshed after each buffer
modification, which is the tricky part.
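
As a rough sketch of the lookup side only, assuming a cache keyed on
the position where an element starts: the names are hypothetical, and
PARSE-FN stands in for `org-element--current-element', whose real
arguments are not modeled here. Nothing in this sketch invalidates
entries, so it is only safe while the buffer is not modified.

```elisp
(defvar-local my/element-cache nil
  "Hash table mapping a buffer position to the element parsed there.
Hypothetical variable for illustration; created lazily per buffer.")

(defun my/cached-element (pos parse-fn)
  "Return the element starting at POS in the current buffer.
PARSE-FN is called with POS only when the cache has no entry yet.
A nil result from PARSE-FN is not cached."
  (unless my/element-cache
    (setq my/element-cache (make-hash-table :test #'eql)))
  (or (gethash pos my/element-cache)
      ;; `puthash' returns the stored value, so a miss returns the fresh parse.
      (puthash pos (funcall parse-fn pos) my/element-cache)))
```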

One solution would be to use `after-change-functions' and
`before-change-functions' to record the intervals of the buffer that
were modified. Then, during idle time, a `maphash' over the cache could
update the boundaries of cached values, or remove them completely,
according to those intervals.
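
A minimal sketch of that idea, with simplifying assumptions: all names
are hypothetical, cached values are reduced to (BEG . END) pairs keyed
on BEG, and only `after-change-functions' is used, since its OLD-LEN
argument is enough to recover the pre-change interval in this toy
version.

```elisp
(defvar-local my/toy-cache nil
  "Hash table mapping element start to a (BEG . END) pair.")
(defvar-local my/toy-pending-changes nil
  "Recorded changes as (BEG OLD-END DELTA) lists, newest first.")

(defun my/toy-record-change (beg end old-len)
  "Record one modification; meant for `after-change-functions'."
  (push (list beg (+ beg old-len) (- (- end beg) old-len))
        my/toy-pending-changes))

(defun my/toy-sync-cache ()
  "Apply recorded changes to the cache, oldest first."
  (dolist (change (reverse my/toy-pending-changes))
    (let ((cbeg (nth 0 change))
          (cend (nth 1 change))
          (delta (nth 2 change))
          (new (make-hash-table :test #'eql)))
      (when my/toy-cache
        (maphash
         (lambda (_key bounds)
           (let ((beg (car bounds)) (end (cdr bounds)))
             (cond
              ;; Entirely before the change: keep unchanged.
              ((<= end cbeg) (puthash beg bounds new))
              ;; Entirely after the change: shift both boundaries.
              ((>= beg cend)
               (puthash (+ beg delta)
                        (cons (+ beg delta) (+ end delta)) new)))
             ;; Anything overlapping the change is dropped.
             ))
         my/toy-cache))
      (setq my/toy-cache new)))
  (setq my/toy-pending-changes nil))

(defun my/toy-sync-cache-in (buffer)
  "Run `my/toy-sync-cache' in BUFFER if it is still live."
  (when (buffer-live-p buffer)
    (with-current-buffer buffer (my/toy-sync-cache))))

(defun my/toy-enable-tracking ()
  "Track changes in the current buffer and sync the cache when idle."
  (add-hook 'after-change-functions #'my/toy-record-change nil t)
  (run-with-idle-timer 1 t #'my/toy-sync-cache-in (current-buffer)))
```

Entries are copied into a fresh table instead of being re-keyed in
place, because adding keys to a hash table while `maphash' is walking
it is not safe.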


Regards,

-- 
Nicolas Goaziou
