emacs-orgmode@gnu.org archives
* Please document the caching and its user options
@ 2024-06-12  9:38 Eli Zaretskii
  2024-06-14 13:12 ` Ihor Radchenko
  0 siblings, 1 reply; 35+ messages in thread
From: Eli Zaretskii @ 2024-06-12  9:38 UTC (permalink / raw)
  To: emacs-orgmode

I needed to visit org.org, the Org manual, today, and to my surprise
saw Emacs writing some data files into the ~/.cache/org-persist/
directory.  What's more, Emacs popped a buffer out of the blue telling
me that it could not safely encode the data written to (I presume)
some of those files, and asked me to select a safe coding-system.

By randomly poking here and there, I've managed to figure out that
this is due to org-element's caching of data from parsing Org files.
It seems this caching is turned on by default, but is not documented
in the Org manual, and in particular there's nothing in the manual
about turning off the caching.

Please document the caching features of Org in the manual, including
how to turn that off.  (I also question the wisdom of turning this on
by default without as much as a single request for confirmation from
the user.)

Please also make sure that the code which actually writes the data to
the cache files makes a point of binding coding-system-for-write to a
proper value (probably utf-8-unix), or forces
buffer-file-coding-system of the buffer from which it writes to have
such a safe value, to avoid annoying and unexpected prompting of the
user to select a proper encoding.  Lisp programs that write files in
the background cannot fail to set a proper encoding, because the call
to select-safe-coding-system is not supposed to be triggered by Lisp
programs unless they run as a direct result of a user-invoked command.
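
For illustration, a minimal sketch of the kind of binding described
above.  `my/write-cache-file' is a hypothetical helper, not Org's actual
code, and utf-8-unix is just the value suggested in this message:

```elisp
;; Hypothetical helper for illustration; not taken from Org's sources.
(defun my/write-cache-file (data file)
  "Serialize DATA to FILE without prompting for a coding system."
  (let ((coding-system-for-write 'utf-8-unix) ; explicit encoding, LF line endings
        (print-length nil)                    ; do not truncate long lists
        (print-level nil))                    ; do not truncate deep nesting
    (with-temp-buffer
      (prin1 data (current-buffer))
      ;; A VISIT argument that is neither t, nil, nor a string
      ;; suppresses the "Wrote file" message.
      (write-region (point-min) (point-max) file nil 'silent))))
```

Because `coding-system-for-write' is bound explicitly, the write does not
consult the locale or the buffer's coding system, so background writes
should not trigger the safe-coding-system prompt.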

I've seen those problems in Emacs 29.3.  If these issues are already
solved in what will become Emacs 30, then my apologies, and kudos to
whoever solved them.  (However, the latest Org manual still keeps
completely silent about these features and their control by users.)

Thanks.



* Re: Please document the caching and its user options
  2024-06-12  9:38 Please document the caching and its user options Eli Zaretskii
@ 2024-06-14 13:12 ` Ihor Radchenko
  2024-06-14 13:41   ` Eli Zaretskii
                     ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-14 13:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

> I needed to visit org.org, the Org manual, today, and to my surprise
> saw Emacs writing some data files into the ~/.cache/org-persist/
> directory.  What's more, Emacs popped a buffer out of the blue telling
> me that it could not safely encode the data written to (I presume)
> some of those files, and asked me to select a safe coding-system.
>
> By randomly poking here and there, I've managed to figure out that
> this is due to org-element's caching of data from parsing Org files.
> It seems this caching is turned on by default, but is not documented
> in the Org manual, and in particular there's nothing in the manual
> about turning off the caching.
>
> Please document the caching features of Org in the manual, including
> how to turn that off.  (I also question the wisdom of turning this on
> by default without as much as a single request for confirmation from
> the user.)

Hmm. What aspect of caching do you want us to document?
FYI, Org mode has been doing various forms of caching since
forever. Recently, we just adopted a somewhat more uniform API and
introduced one more kind of caching, the parser cache, in addition to
the previously existing image cache, publishing cache, ID cache, clock
cache, etc.

> Please also make sure that the code which actually writes the data to
> the cache files makes a point of binding coding-system-for-write to a
> proper value (probably utf-8-unix), or forces
> buffer-file-coding-system of the buffer from which it writes to have
> such a safe value, to avoid annoying and unexpected prompting of the
> user to select a proper encoding.  Lisp programs that write files in
> the background cannot fail to set a proper encoding, because the call
> to select-safe-coding-system is not supposed to be triggered by Lisp
> programs unless they run as a direct result of a user-invoked command.

I believe that this particular problem has been solved in
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=c8f88589c
It is a part of Org 9.7.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Please document the caching and its user options
  2024-06-14 13:12 ` Ihor Radchenko
@ 2024-06-14 13:41   ` Eli Zaretskii
  2024-06-14 15:31     ` Ihor Radchenko
  2024-06-14 13:56   ` Jens Lechtenboerger
  2024-06-16  5:40   ` Please document the caching and its user options Daniel Clemente
  2 siblings, 1 reply; 35+ messages in thread
From: Eli Zaretskii @ 2024-06-14 13:41 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: emacs-orgmode@gnu.org
> Date: Fri, 14 Jun 2024 13:12:42 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > I needed to visit org.org, the Org manual, today, and to my surprise
> > saw Emacs writing some data files into the ~/.cache/org-persist/
> > directory.  What's more, Emacs popped a buffer out of the blue telling
> > me that it could not safely encode the data written to (I presume)
> > some of those files, and asked me to select a safe coding-system.
> >
> > By randomly poking here and there, I've managed to figure out that
> > this is due to org-element's caching of data from parsing Org files.
> > It seems this caching is turned on by default, but is not documented
> > in the Org manual, and in particular there's nothing in the manual
> > about turning off the caching.
> >
> > Please document the caching features of Org in the manual, including
> > how to turn that off.  (I also question the wisdom of turning this on
> > by default without as much as a single request for confirmation from
> > the user.)
> 
> Hmm. What aspect of caching do you want us to document?

First and foremost, that it exists, and is turned on by default.  The
manual is currently completely silent about it.

Next, please document the user options that control this caching, and
especially those options which can be used to turn this caching off or
direct it to a different place.
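
As a rough sketch of what such documentation might show, assuming the
option names `org-element-cache-persistent', `org-element-use-cache' and
`org-persist-directory' as found in Org 9.6 and later (please verify them
with C-h v in your version; the directory below is only an example):

```elisp
;; Assumed option names from Org 9.6+; verify with C-h v before relying on them.

;; Keep the parser cache in memory only; do not write it to disk.
(setq org-element-cache-persistent nil)

;; Alternative: disable the parser cache altogether.
;; (setq org-element-use-cache nil)

;; Alternative: keep on-disk caching but direct it elsewhere.
;; The path is an example; set it before Org is loaded.
;; (setq org-persist-directory
;;       (expand-file-name "org-persist/" user-emacs-directory))
```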

> FYI, Org mode has been doing various forms of caching since
> forever. Recently, we just adopted a somewhat more uniform API and
> introduced one more kind of caching, the parser cache, in addition to
> the previously existing image cache, publishing cache, ID cache, clock
> cache, etc.

I'm not a heavy user of Org, but I do have several Org files that I
visit from time to time.  This was the first time I got prompted about
anything related to this caching.  Moreover, I think this was the
first time the Org file I visited was parsed by Org and the results
cached: I have a feature on my system that prominently indicates when
the machine is heavily loaded, and I was surprised to see it in action
when I visited org.org.  I never had this activated before just by
visiting an Org file.  I presumed the high load was due to the
parsing.  So either this is very new, or maybe my Org files are much
simpler than doc/misc/org.org, and so the parsing I triggered before
was much less expensive.

I hope you now understand why I wrote this report now and not before,
and why I was surprised: this caching was never so explicitly and
prominently into my face, so I could have completely missed its
existence.

> > Please also make sure that the code which actually writes the data to
> > the cache files makes a point of binding coding-system-for-write to a
> > proper value (probably utf-8-unix), or forces
> > buffer-file-coding-system of the buffer from which it writes to have
> > such a safe value, to avoid annoying and unexpected prompting of the
> > user to select a proper encoding.  Lisp programs that write files in
> > the background cannot fail to set a proper encoding, because the call
> > to select-safe-coding-system is not supposed to be triggered by Lisp
> > programs unless they run as a direct result of a user-invoked command.
> 
> I believe that this particular problem has been solved in
> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=c8f88589c
> It is a part of Org 9.7.

Maybe.  Try visiting org.org on a system whose locale is set to, say,
Latin-1, and see if you get the warnings about a safe coding-system.

But why do you use utf-8 there and not utf-8-unix?  Come to think
about it, why not emacs-internal?  Those files are used internally by
Org, so they should be able to encode any characters supported by
Emacs, not just those which have UTF-8 encoding.  And using native EOL
convention is not needed, and will get in the way if the user shares
these files between systems.



* Re: Please document the caching and its user options
  2024-06-14 13:12 ` Ihor Radchenko
  2024-06-14 13:41   ` Eli Zaretskii
@ 2024-06-14 13:56   ` Jens Lechtenboerger
  2024-06-14 14:31     ` Publishing cache (was: Please document the caching and its user options) Ihor Radchenko
  2024-06-16  5:40   ` Please document the caching and its user options Daniel Clemente
  2 siblings, 1 reply; 35+ messages in thread
From: Jens Lechtenboerger @ 2024-06-14 13:56 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode


On 2024-06-14, Ihor Radchenko wrote:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>> Please document the caching features of Org in the manual, including
>> how to turn that off.  (I also question the wisdom of turning this on
>> by default without as much as a single request for confirmation from
>> the user.)
>
> Hmm. What aspect of caching do you want us to document?
> FYI, Org mode has been doing various forms of caching since
> forever. Recently, we just adopted a somewhat more uniform API and
> introduced one more kind of caching, the parser cache, in addition to
> the previously existing image cache, publishing cache, ID cache, clock
> cache, etc.

Jumping in here, I do not understand the publishing cache.  Some of
my org documents are re-published every time, while others are only
re-published after changes.  What is checked where?

Best wishes,
Jens



* Publishing cache (was: Please document the caching and its user options)
  2024-06-14 13:56   ` Jens Lechtenboerger
@ 2024-06-14 14:31     ` Ihor Radchenko
  0 siblings, 0 replies; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-14 14:31 UTC (permalink / raw)
  To: Jens Lechtenboerger; +Cc: Eli Zaretskii, emacs-orgmode

Jens Lechtenboerger <lechten@wi.uni-muenster.de> writes:

> Jumping in here, I do not understand the publishing cache.  Some of
> my org documents are re-published every time, while others are only
> re-published after changes.  What is checked where?

See "14.4 Triggering Publication" section of Org mode manual:

       Org uses timestamps to track when a file has changed.  The above
    functions normally only publish changed files.  You can override this
    and force publishing of all files by giving a prefix argument to any of
    the commands above, or by customizing the variable
    ‘org-publish-use-timestamps-flag’.  This may be necessary in particular
    if files include other files via ‘SETUPFILE’ or ‘INCLUDE’ keywords.
    
Apart from caching "timestamps" (a combination of modification time and
file hash), ox-publish stores information about generated link anchors,
so that they remain stable upon repeated publications (by default Org
mode export generates random anchors, unless they are specified in Org
mode source).
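
A small configuration sketch of the knobs involved.  The variable
`org-publish-use-timestamps-flag' is the one named in the manual excerpt
above; `org-publish-timestamp-directory' and the FORCE argument of
`org-publish' are quoted from memory, and "my-site" is a hypothetical
project name, so treat this as an illustration rather than a recipe:

```elisp
;; Republish every file regardless of the recorded timestamps.
(setq org-publish-use-timestamps-flag nil)

;; Where ox-publish keeps its cache files; the path is an example.
(setq org-publish-timestamp-directory
      (expand-file-name "org-timestamps/" user-emacs-directory))

;; Force a one-off full republish of a single project.  "my-site" is a
;; hypothetical entry in `org-publish-project-alist'.
;; (org-publish "my-site" t)
```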

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Please document the caching and its user options
  2024-06-14 13:41   ` Eli Zaretskii
@ 2024-06-14 15:31     ` Ihor Radchenko
  2024-06-14 15:56       ` Eli Zaretskii
  0 siblings, 1 reply; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-14 15:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

>> Hmm. What aspect of caching do you want us to document?
>
> First and foremost, that it exists, and is turned on by default.  The
> manual is currently completely silent about it.
>
> Next, please document the user options that control this caching, and
> especially those options which can be used to turn this caching off or
> direct it to a different place.

I am not convinced that we have to do it.

Firstly, it is not clear if you are asking to document caching parser
state specifically or all kinds of caching Org mode does.

Secondly, I am not sure if we have to document the details of caching at
all in the manual. We do not document all the custom options in the
manual; just the most important/useful.

Emacs user manual does not document `multisession-directory' - something
very close to how we implement Org caches.  So, apparently, customizing
`multisession-directory' and even the very multisession feature
existence is not deemed necessary inside Emacs manual. Why would it be
different for Org mode manual?

> I'm not a heavy user of Org, but I do have several Org files that I
> visit from time to time.  This was the first time I got prompted about
> anything related to this caching.

The prompt you saw is indeed a bug.

> ...  Moreover, I think this was the
> first time the Org file I visited was parsed by Org and the results
> cached: I have a feature on my system that prominently indicates when
> the machine is heavily loaded, and I was surprised to see it in action
> when I visited org.org.  I never had this activated before just by
> visiting an Org file.  I presumed the high load was due to the
> parsing.  So either this is very new, or maybe my Org files are much
> simpler than doc/misc/org.org, and so the parsing I triggered before
> was much less expensive.

Org mode has used the parser for a long time. Previously, the parser was
invoked without any caching, not even in-memory. Since Org 9.6, we have
had in-memory and on-disk caches for the parser. This allowed us
to use the parser more frequently, without relying on
half-accurate regexp matches. Overall, it decreased CPU load, but
scenarios differ; sometimes the CPU load is momentarily higher.

>> I believe that this particular problem has been solved in
>> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=c8f88589c
>> It is a part of Org 9.7.
>
> Maybe.  Try visiting org.org on a system whose locale is set to, say,
> Latin-1, and see if you get the warnings about a safe coding-system.
>
> But why do you use utf-8 there and not utf-8-unix?  Come to think
> about it, why not emacs-internal?  Those files are used internally by
> Org, so they should be able to encode any characters supported by
> Emacs, not just those which have UTF-8 encoding.  And using native EOL
> convention is not needed, and will get in the way if the user shares
> these files between systems.

Mostly because we chose whatever looked reasonable. I am not 100% sure
what the practical difference between `utf-8' and `utf-8-unix' is, or
why the latter should be considered better.

As for `emacs-internal', we try to make files readable if at all
possible. In particular, index.eld file is even pretty-printed for user
convenience. The idea is to keep things in plain text and not in binary
formats, following the overall spirit of how Emacs usually stores data. (I
think you may recall people raising their voice about plain text
vs. binary during the discussion of multisession feature and the use of
sqlite database).

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Please document the caching and its user options
  2024-06-14 15:31     ` Ihor Radchenko
@ 2024-06-14 15:56       ` Eli Zaretskii
  2024-06-15 12:47         ` Ihor Radchenko
  0 siblings, 1 reply; 35+ messages in thread
From: Eli Zaretskii @ 2024-06-14 15:56 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: emacs-orgmode@gnu.org
> Date: Fri, 14 Jun 2024 15:31:28 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Hmm. What aspect of caching do you want us to document?
> >
> > First and foremost, that it exists, and is turned on by default.  The
> > manual is currently completely silent about it.
> >
> > Next, please document the user options that control this caching, and
> > especially those options which can be used to turn this caching off or
> > direct it to a different place.
> 
> I am not convinced that we have to do it.

That's too bad.  When a user finds out about this caching, how do you
propose that he/she looks for the information about it?  I wanted to
know what is being cached, why, and in what file/directory.  It took
me quite some time to find the answers, since Org is a very large
package, and there's no org-cache.el file or similar to serve as the
immediate suspect.  Surely, such a basic functionality should be at
least hinted at in the documentation, so that users know which options to
look at and where?

> Firstly, it is not clear if you are asking to document caching parser
> state specifically or all kinds of caching Org mode does.

All of them.

> Secondly, I am not sure if we have to document the details of caching at
> all in the manual. We do not document all the custom options in the
> manual; just the most important/useful.

I submit that at least the options which control where the cache is
and how to disable it are important enough to be in the manual.  Given
their names, users can use apropos or customize-group to find other
relevant options.

> Emacs user manual does not document `multisession-directory' - something
> very close to how we implement Org caches.  So, apparently, customizing
> `multisession-directory' and even the very multisession feature
> existence is not deemed necessary inside Emacs manual. Why would it be
> different for Org mode manual?

multisession is an optional package, it is neither preloaded nor
turned on by default in Emacs.  And even if Emacs makes the mistake of
not documenting something, that is not a valid argument for making the
same mistake elsewhere.

> > But why do you use utf-8 there and not utf-8-unix?  Come to think
> > about it, why not emacs-internal?  Those files are used internally by
> > Org, so they should be able to encode any characters supported by
> > Emacs, not just those which have UTF-8 encoding.  And using native EOL
> > convention is not needed, and will get in the way if the user shares
> > these files between systems.
> 
> Mostly because we chose whatever looked reasonable. I am not 100% sure
> what the practical difference between `utf-8' and `utf-8-unix' is, or
> why the latter should be considered better.
> 
> As for `emacs-internal', we try to make files readable if at all
> possible. In particular, index.eld file is even pretty-printed for user
> convenience. The idea is to keep things in plain text and not in binary
> formats, following the overall spirit of how Emacs usually stores data. (I
> think you may recall people raising their voice about plain text
> vs. binary during the discussion of multisession feature and the use of
> sqlite database).

The emacs-internal encoding is not binary.  In almost all the cases it
is indistinguishable from utf-8-unix.  It differs where a buffer
includes characters outside of the Unicode codespace.  The usual
practice in Emacs is that files holding internal data use
emacs-internal to make sure all the characters are saved correctly and
can be later restored correctly.
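
A minimal sketch of that practice, for illustration only.  The two
helpers below are hypothetical names, not functions from Emacs or Org;
they only show `emacs-internal' being bound on both the write and the
read side:

```elisp
;; Hypothetical helpers; not part of Emacs or Org.
(defun my/save-internal-data (data file)
  "Write the printed representation of DATA to FILE as emacs-internal."
  (let ((coding-system-for-write 'emacs-internal)
        (print-length nil)
        (print-level nil))
    (with-temp-buffer
      (prin1 data (current-buffer))
      (write-region (point-min) (point-max) file nil 'silent))))

(defun my/load-internal-data (file)
  "Read back one Lisp object previously saved to FILE."
  (let ((coding-system-for-read 'emacs-internal))
    (with-temp-buffer
      (insert-file-contents file)
      (goto-char (point-min))
      (read (current-buffer)))))
```

The intent is that a string containing a character outside the Unicode
codespace, for example (string #x110000), survives a save and load cycle
unchanged, which plain utf-8 does not guarantee.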



* Re: Please document the caching and its user options
  2024-06-14 15:56       ` Eli Zaretskii
@ 2024-06-15 12:47         ` Ihor Radchenko
  2024-06-15 13:01           ` Eli Zaretskii
  2024-06-15 13:47           ` Ihor Radchenko
  0 siblings, 2 replies; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-15 12:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

>> I am not convinced that we have to do it.
>
> That's too bad.  When a user finds out about this caching, how do you
> propose that he/she looks for the information about it?  I wanted to
> know what is being cached, why, and in what file/directory.  It took
> me quite some time to find the answers, since Org is a very large
> package, and there's no org-cache.el file or similar to serve as the
> immediate suspect.  Surely, such a basic functionality should be at
> least hinted at in the documentation, so that users know which options to
> look at and where?

Maybe. Although it is not clear where to document such things.
Ideally, it would be nice if caches were managed by Emacs itself, with
all the cache storage locations customizable across various packages.
Then, documenting cache locations in the Emacs manual would suffice.

Would it be possible for Emacs to define a framework for cache/var/data
locations? Such a framework would not only be useful in the context of
this discussion, but also to tackle the issue with packages sprinkling
things randomly into .emacs.d or ~/ (see
https://github.com/emacscollective/no-littering/)

>> Emacs user manual does not document `multisession-directory' - something
>> very close to how we implement Org caches.  So, apparently, customizing
>> `multisession-directory' and even the very multisession feature
>> existence is not deemed necessary inside Emacs manual. Why would it be
>> different for Org mode manual?
>
> multisession is an optional package, it is neither preloaded nor
> turned on by default in Emacs.

It is used by default in emoji.el (C-x 8 e r)

> ... And even if Emacs makes the mistake of
> not documenting something, that is not a valid argument for making the
> same mistake elsewhere.

I 100% agree. But my default assumption is that things added to Emacs
are usually documented in the manual, if necessary. I assumed that the
judgment was that documenting multisession was not necessary, and worked
from that assumption.

Of course, if you say that multisession and similar things should be
documented, I will follow. Let's discuss the details.

(Also, should we open some kind of bug report to track documenting
multisession in the manual?)

> The emacs-internal encoding is not binary.  In almost all the cases it
> is indistinguishable from utf-8-unix.  It differs where a buffer
> includes characters outside of the Unicode codespace.  The usual
> practice in Emacs is that files holding internal data use
> emacs-internal to make sure all the characters are saved correctly and
> can be later restored correctly.

Then, I agree that using emacs-internal for cached data makes sense.

Note, however, that I see no indication about such convention in the
manual. The only relevant bit is

       The coding system ‘utf-8-emacs’ specifies that the data is
    represented in the internal Emacs encoding (*note Text
    Representations::).  This is like ‘raw-text’ in that no code conversion
    happens, but different in that the result is multibyte data.  The name
    ‘emacs-internal’ is an alias for ‘utf-8-emacs-unix’ (so it forces no
    conversion of end-of-line, unlike ‘utf-8-emacs’, which can decode all 3
    kinds of end-of-line conventions).

However, I cannot reach the conclusion you described from reading that
paragraph.

Would it make sense to add the tip about storing Elisp data somewhere in
the Elisp manual?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Please document the caching and its user options
  2024-06-15 12:47         ` Ihor Radchenko
@ 2024-06-15 13:01           ` Eli Zaretskii
  2024-06-15 14:13             ` Ihor Radchenko
  2024-06-15 13:47           ` Ihor Radchenko
  1 sibling, 1 reply; 35+ messages in thread
From: Eli Zaretskii @ 2024-06-15 13:01 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: emacs-orgmode@gnu.org
> Date: Sat, 15 Jun 2024 12:47:29 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> I am not convinced that we have to do it.
> >
> > That's too bad.  When a user finds out about this caching, how do you
> > propose that he/she looks for the information about it?  I wanted to
> > know what is being cached, why, and in what file/directory.  It took
> > me quite some time to find the answers, since Org is a very large
> > package, and there's no org-cache.el file or similar to serve as the
> > immediate suspect.  Surely, such a basic functionality should be at
> > least hinted at in the documentation, so that users know which options to
> > look at and where?
> 
> Maybe. Although it is not clear where to document such things.
> Ideally, it would be nice if caches were managed by Emacs itself, with
> all the cache storage locations customizable across various packages.
> Then, documenting cache locations in the Emacs manual would suffice.
> 
> Would it be possible for Emacs to define a framework for cache/var/data
> locations? Such a framework would not only be useful in the context of
> this discussion, but also to tackle the issue with packages sprinkling
> things randomly into .emacs.d or ~/ (see
> https://github.com/emacscollective/no-littering/)

I think Emacs already provides all the framework for caching that is
needed.  Caching simply means you write some data to file, and all the
building blocks of that already exist, for quite some time, actually.
The only thing that is application dependent is the data to be cached
and how to serialize that, but that cannot be usefully generalized.

> >> Emacs user manual does not document `multisession-directory' - something
> >> very close to how we implement Org caches.  So, apparently, customizing
> >> `multisession-directory' and even the very multisession feature
> >> existence is not deemed necessary inside Emacs manual. Why would it be
> >> different for Org mode manual?
> >
> > multisession is an optional package, it is neither preloaded nor
> > turned on by default in Emacs.
> 
> It is used by default in emoji.el (C-x 8 e r)

Which is also optional.  And a minor feature at that.

> Of course, if you say that multisession and similar things should be
> documented, I will follow. Let's discuss the details.

I think at least emoji.el should say somewhere in its doc strings that
it caches the previously used emoji sequences, yes.

> (Also, should we open some kind of bug report to track documenting
> multisession in the manual?)

I don't mind, but it sounds like an exaggeration to me.

> > The emacs-internal encoding is not binary.  In almost all the cases it
> > is indistinguishable from utf-8-unix.  It differs where a buffer
> > includes characters outside of the Unicode codespace.  The usual
> > practice in Emacs is that files holding internal data use
> > emacs-internal to make sure all the characters are saved correctly and
> > can be later restored correctly.
> 
> Then, I agree that using emacs-internal for cached data makes sense.
> 
> Note, however, that I see no indication about such convention in the
> manual.

The opportunities for using it are rare enough.  But I added that now,
in the hope that someone will actually read all those recommendations.



* Re: Please document the caching and its user options
  2024-06-15 12:47         ` Ihor Radchenko
  2024-06-15 13:01           ` Eli Zaretskii
@ 2024-06-15 13:47           ` Ihor Radchenko
  1 sibling, 0 replies; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-15 13:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode

Ihor Radchenko <yantar92@posteo.net> writes:

>> The emacs-internal encoding is not binary.  In almost all the cases it
>> is indistinguishable from utf-8-unix.  It differs where a buffer
>> includes characters outside of the Unicode codespace.  The usual
>> practice in Emacs is that files holding internal data use
>> emacs-internal to make sure all the characters are saved correctly and
>> can be later restored correctly.
>
> Then, I agree that using emacs-internal for cached data makes sense.

Done in
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?h=bugfix&id=be39e61c4efa5027536809c89b90bfe66b76b712 (bugfix)

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Please document the caching and its user options
  2024-06-15 13:01           ` Eli Zaretskii
@ 2024-06-15 14:13             ` Ihor Radchenko
  2024-06-15 14:37               ` Eli Zaretskii
  0 siblings, 1 reply; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-15 14:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode, emacs-devel, Michael Albinus

CCing emacs-devel as I'd like to upgrade this discussion to Emacs-wide
context.

Eli Zaretskii <eliz@gnu.org> writes:

>> ... I wanted to know what is being cached, why, and in what file/directory.
>> 
> >  ...
>> Would it be possible for Emacs to define a framework for cache/var/data
>> locations? Such a framework would not only be useful in the context of
>> this discussion, but also to tackle the issue with packages sprinkling
>> things randomly into .emacs.d or ~/ (see
>> https://github.com/emacscollective/no-littering/)
>
> I think Emacs already provides all the framework for caching that is
> needed.  Caching simply means you write some data to file, and all the
> building blocks of that already exist, for quite some time, actually.
> The only thing that is application dependent is the data to be cached
> and how to serialize that, but that cannot be usefully generalized.

I was referring to some kind of global option that defines the cache
directory, data directory, etc. Something akin to XDG.

Then, Org can place its cache inside that directory rather than trying to
cook up something independently.

Also, caching is not that simple, because caches may contain sensitive
data. (see
https://list.orgmode.org/orgmode/CAM9ALR8fuSu0YWS1SehRw7sYxprJFX-r2juXd_DgvCYVKQc95Q@mail.gmail.com/)
Some users may want to move caches to a read-restricted location,
or even to a location that depends on where the cached data originates
(separate caches depending on whether default-directory is on an
encrypted volume, a remote mount, etc.)

Finally, we got several requests to have caches cleared upon exiting
Emacs, which is also something that would be better managed centrally,
by Emacs, for all possible kinds of cache/history data.

>> > multisession is an optional package, it is neither preloaded nor
>> > turned on by default in Emacs.
>> 
>> It is used by default in emoji.el (C-x 8 e r)
>
> Which is also optional.  And a minor feature at that.

That is just for now.
TRAMP (by no means a minor feature) has the following TODO item in
tramp-cache.el:

;;; TODO:
;;
;; * Use multisession.el, starting with Emacs 29.1.

>> (Also, should we open some kind of bug report to track documenting
>> multisession in the manual?)
>
> I don't mind, but it sounds like an exaggeration to me.

I kind of agree, if we talk about the current state of affairs. But, I'd
like to discuss this in the context I elaborated on above - more
centralized cache management.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Please document the caching and its user options
  2024-06-15 14:13             ` Ihor Radchenko
@ 2024-06-15 14:37               ` Eli Zaretskii
  2024-06-16  9:05                 ` Ihor Radchenko
  0 siblings, 1 reply; 35+ messages in thread
From: Eli Zaretskii @ 2024-06-15 14:37 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode, emacs-devel, michael.albinus

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: emacs-orgmode@gnu.org, emacs-devel@gnu.org, Michael Albinus
>  <michael.albinus@gmx.de>
> Date: Sat, 15 Jun 2024 14:13:03 +0000
> 
> CCing emacs-devel as I'd like to upgrade this discussion to Emacs-wide
> context.
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> ... I wanted to know what is being cached, why, and in what file/directory.
> >> 
> > >  ...
> >> Would it be possible for Emacs to define a framework for cache/var/data
> >> locations? Such framework would not only be useful in the context of
> >> this discussion, but also to tackle the issue with packages sprinkling
> >> things randomly into .emacs.d or ~/ (see
> >> https://github.com/emacscollective/no-littering/)
> >
> > I think Emacs already provides all the framework for caching that is
> > needed.  Caching simply means you write some data to file, and all the
> > building blocks of that already exist, for quite some time, actually.
> > The only thing that is application dependent is the data to be cached
> > and how to serialize that, but that cannot be usefully generalized.
> 
> I was referring to some kind of global option that defines cache
> directory, data directory, etc. Something akin XDG.

We already have xdg-cache-home (and a few others in xdg.el).  Is that
what you meant?

> Also, caching is not as simple, because caches may contain sensitive
> data. (see
> https://list.orgmode.org/orgmode/CAM9ALR8fuSu0YWS1SehRw7sYxprJFX-r2juXd_DgvCYVKQc95Q@mail.gmail.com/)
> Some users may want to move caches to read-restricted location
> or even to location dependent on where the cache is originating from
> (separate caches depending on whether default-directory is from
> encrypted volume, remote mount, etc)

AFAIK, Emacs has APIs for at least some of that, but whether to use
them is up to the application, I think.

> Finally, we got several requests to have caches cleared up upon exiting
> Emacs, which is also something that should be better managed centrally,
> by Emacs, for all possible kinds of cache/history data.

Deleting files in a directory, recursively if needed, is already
available.  Is that what you meant?

> >> > multisession is an optional package, it is neither preloaded nor
> >> > turned on by default in Emacs.
> >> 
> >> It is used by default in emoji.el (C-x 8 e r)
> >
> > Which is also optional.  And a minor feature at that.
> 
> It is just for now.
> TRAMP (by no means a minor feature), has the following TODO item in
> tramp-cache.el:
> 
> ;;; TODO:
> ;;
> ;; * Use multisession.el, starting with Emacs 29.1.

How far are you prepared to go just to make a point?

> >> (Also, should we open some kind of bug report to track documenting
> >> multisession in the manual?)
> >
> > I don't mind, but it sounds like an exaggeration to me.
> 
> I kind of agree, if we talk about the current state of affairs. But, I'd
> like to discuss this in the context I elaborated on above - more
> centralized cache management.

Can we first fix the problems for which I started this thread?  The
more general issues should be subjects of separate discussions, IMO.



* Re: Please document the caching and its user options
  2024-06-14 13:12 ` Ihor Radchenko
  2024-06-14 13:41   ` Eli Zaretskii
  2024-06-14 13:56   ` Jens Lechtenboerger
@ 2024-06-16  5:40   ` Daniel Clemente
  2024-06-16 12:36     ` Ihor Radchenko
  2 siblings, 1 reply; 35+ messages in thread
From: Daniel Clemente @ 2024-06-16  5:40 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

> > Please document the caching features of Org in the manual, including
> > how to turn that off.  (I also question the wisdom of turning this on
> > by default without as much as a single request for confirmation from
> > the user.)
> Hmm. What aspect of caching do you want us to document?
> FYI, Org mode has been doing various forms of caching since
> forever. Recently, we just employed a bit more regular API and
> introduced one more kind of caching - parser cache. In addition to the
> previously existing image cache, publishing cache, ID cache, clock
> cache, etc.

One of the discussion points is specifically org-persist, which is
what creates files on disk.
There have been reports, like
https://lists.gnu.org/archive/html/emacs-devel/2024-06/msg00203.html,
or Eli's message here, mentioning that ~/.cache/org-persist is created
when the user doesn't want it or expect it.

In particular, when setting (setq org-element-cache-persistent nil)
org-mode *should not* create an org-persist directory anywhere. And I
think it shouldn't activate org-persist timers (it does now) or hooks.
The user's preference should be respected.

That's a code change.
If you just want to update documentation, a starting point can be
org-element-cache-persistent's documentation, which is just "Non-nil
when cache should persist between Emacs sessions.", and doesn't
mention that some files will always be created even if it's nil. It
also doesn't explicitly mention that it will create files (better be
explicit about this), or where (or how to control where), or which
content (i.e. just statistics, or parts of possibly private Org
files).

I suggest making an explicit difference between "caching in memory"
and "caching by storing files on disk".
For instance:
(defvar org-element-use-cache t
  "Non-nil when Org parser should cache its results.")
From that description, it's not clear to a new user whether they're
creating files on disk (as caches often do) or not.



* Re: Please document the caching and its user options
  2024-06-15 14:37               ` Eli Zaretskii
@ 2024-06-16  9:05                 ` Ihor Radchenko
  2024-06-16 10:41                   ` Eli Zaretskii
  0 siblings, 1 reply; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-16  9:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode, emacs-devel, michael.albinus

Eli Zaretskii <eliz@gnu.org> writes:

>> I was referring to some kind of global option that defines cache
>> directory, data directory, etc. Something akin XDG.
>
> We already have xdg-cache-home (and a few others in xdg.el).  Is that
> what you meant?

Yes, except that `xdg-cache-home' is limited:

1. It cannot be customized by users
2. It may sometimes return nil
3. It is limited to XDG - not all the Emacs platforms

What I had in mind is a new custom option for cache dir (defaulting to
OS-specific cache like XDG on Linux or something equivalent on Windows)
+ a new API function like `system-cache-home' that will be guaranteed to
return some kind of meaningful dir.

>> Also, caching is not as simple, because caches may contain sensitive
>> data. (see
>> https://list.orgmode.org/orgmode/CAM9ALR8fuSu0YWS1SehRw7sYxprJFX-r2juXd_DgvCYVKQc95Q@mail.gmail.com/)
>> Some users may want to move caches to read-restricted location
>> or even to location dependent on where the cache is originating from
>> (separate caches depending on whether default-directory is from
>> encrypted volume, remote mount, etc)
>
> AFAIK, Emacs has APIs for at least some of that, but whether to use
> them is up to the application, I think.

What are those APIs?

>> Finally, we got several requests to have caches cleared up upon exiting
>> Emacs, which is also something that should be better managed centrally,
>> by Emacs, for all possible kinds of cache/history data.
>
> Deleting files in a directory, recursively if needed, is already
> available.  is that what you meant?

No. I mean a new user option like `clear-caches-on-exit' that will work
across all the packages. Then, concerned users may set it to non-nil to
delete *all* the caches upon exiting Emacs.

Having to set this for each specific package (with some packages not
documenting that they use cache, or users not expecting that cache may
be used and not reading _all_ the docs carefully enough) is not ideal,
IMHO.

> Can we first fix the problems for which I started this thread?  The
> more general issues should be subjects of separate discussions, IMO.

If there is a global, Emacs-wide customization for how to handle caches,
there will be no need to document it in the Org mode manual. So, I would
like to see whether introducing such a global customization is feasible
before making non-trivial changes to the Org manual. (I am not even sure
where to document these things in the manual yet; they seem way too
generic with respect to Org mode's scope.)




* Re: Please document the caching and its user options
  2024-06-16  9:05                 ` Ihor Radchenko
@ 2024-06-16 10:41                   ` Eli Zaretskii
  2024-06-23  9:12                     ` Björn Bidar
  0 siblings, 1 reply; 35+ messages in thread
From: Eli Zaretskii @ 2024-06-16 10:41 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode, emacs-devel, michael.albinus

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: emacs-orgmode@gnu.org, emacs-devel@gnu.org, michael.albinus@gmx.de
> Date: Sun, 16 Jun 2024 09:05:02 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> I was referring to some kind of global option that defines cache
> >> directory, data directory, etc. Something akin XDG.
> >
> > We already have xdg-cache-home (and a few others in xdg.el).  Is that
> > what you meant?
> 
> Yes, except that `xdg-cache-home' is limited:
> 
> 1. It cannot be customized by users

Of course it can: just make the default value of a defcustom be
derived from xdg-cache-home, and users can then customize the option to
a different value if they want.
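
The approach described here could look roughly like the following.  This
is only a sketch: `my-package-cache-directory' and the "my-package/"
subdirectory are invented names for illustration, not options defined by
Emacs or Org; `xdg-cache-home' comes from xdg.el.

```elisp
;; Hypothetical sketch: `my-package-cache-directory' is an invented name.
(require 'xdg)

(defcustom my-package-cache-directory
  ;; Default to a subdirectory of the XDG cache location.
  (expand-file-name "my-package/" (xdg-cache-home))
  "Directory where my-package keeps its cache files.
The default is derived from `xdg-cache-home'; customize this option
to store the cache elsewhere."
  :type 'directory
  :group 'files)
```

Users who want the cache elsewhere would then customize or `setq' the
option without the package needing any further mechanism.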

> 2. It may sometimes return nil

The fallback is well-known.

> 3. It is limited to XDG - not all the Emacs platforms

No, it's supported on all platforms, even if XDG isn't.

> What I had in mind is a new custom option for cache dir (defaulting to
> OS-specific cache like XDG on Linux or something equivalent on Windows)
> + a new API function like `system-cache-home' that will be guaranteed to
> return some kind of meaningful dir.

Using xdg-cache-home and its fallbacks is a de facto standard way of
solving this in Emacs, and it supports all the platforms.  Even
startup.el uses it (albeit via customized code, to avoid interfering
with user customizations) when looking for init files and suchlike.

So I think you raise a problem that is already solved in Emacs.

> >> Also, caching is not as simple, because caches may contain sensitive
> >> data. (see
> >> https://list.orgmode.org/orgmode/CAM9ALR8fuSu0YWS1SehRw7sYxprJFX-r2juXd_DgvCYVKQc95Q@mail.gmail.com/)
> >> Some users may want to move caches to read-restricted location
> >> or even to location dependent on where the cache is originating from
> >> (separate caches depending on whether default-directory is from
> >> encrypted volume, remote mount, etc)
> >
> > AFAIK, Emacs has APIs for at least some of that, but whether to use
> > them is up to the application, I think.
> 
> What are those APIs?

Making files and directories readable only by the owner, for example:
set-file-modes and with-file-modes.  All the other Lisp programs in
Emacs use that, so why would Org need something special?
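
As a rough illustration of the APIs named above, a cache writer might
create its directory and file under restrictive default modes.  The
function name and its arguments are hypothetical; only `with-file-modes',
`make-directory', `write-region' and `coding-system-for-write' are
existing Emacs facilities.  Note that the modes only apply to files and
directories that are newly created, and that their effect is limited on
platforms without POSIX permissions.

```elisp
;; Hypothetical helper; the name and arguments are invented for illustration.
(defun my-package-write-private-cache (file text)
  "Write TEXT to FILE so that a newly created FILE is owner-readable only."
  (let ((dir (file-name-directory (expand-file-name file)))
        ;; Bind an explicit encoding so that no coding-system prompt appears.
        (coding-system-for-write 'utf-8-unix))
    ;; A directory created here gets mode 700 (owner access only).
    (with-file-modes #o700
      (make-directory dir t))
    ;; A file created here gets mode 600 (owner read/write only).
    (with-file-modes #o600
      (write-region text nil file nil 'silent))))
```

Binding `coding-system-for-write' in the same place also addresses the
encoding prompt reported at the start of this thread.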

> >> Finally, we got several requests to have caches cleared up upon exiting
> >> Emacs, which is also something that should be better managed centrally,
> >> by Emacs, for all possible kinds of cache/history data.
> >
> > Deleting files in a directory, recursively if needed, is already
> > available.  is that what you meant?
> 
> No. I mean a new user option like `clear-caches-on-exit' that will work
> across all the packages.

Having a single option for all the caches makes little sense to me.
This must be a per-cache setting.

However, users on XDG platforms can have that via XDG system-wide
settings.

> Having to set this for each specific package (with some packages not
> documenting that they use cache, or users not expecting that cache may
> be used and not reading _all_ the docs carefully enough) is not ideal,
> IMHO.

I cannot disagree more.  Each cache has its own logic for when it is a
good time to empty the cache.

> > Can we first fix the problems for which I started this thread?  The
> > more general issues should be subjects of separate discussions, IMO.
> 
> If there is a global Emacs-wide customization how to handle caches,
> there will be no need to document it in Org mode manual.

I respectfully ask the Org developers to solve this particular issue
first, without waiting for some hypothetical general Emacs feature,
which may or may not materialize.

> like to see if introducing such global customization is feasible before
> making non-trivial changes to Org manual. (I am not even sure where to
> document these things in the manual yet; they seem way too generic wrt
> Org mode's scope)

A new chapter should be fine, if no existing chapter is relevant.

TIA



* Re: Please document the caching and its user options
  2024-06-16  5:40   ` Please document the caching and its user options Daniel Clemente
@ 2024-06-16 12:36     ` Ihor Radchenko
  2024-06-17 12:41       ` Daniel Clemente
  0 siblings, 1 reply; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-16 12:36 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

> In particular, when setting (setq org-element-cache-persistent nil)
> org-mode *should not* create an org-persist directory anywhere. And I
> think it shouldn't activate org-persist timers (it does now) or hooks.
> The user's preference should be respected.

Nope. "org-persist" directory is not only used by org-element. If some
other parts of Org need to cache something, they can also store cache
there.

> That's a code change.
> If you just want to update documentation, a starting point can be
> org-element-cache-persistent's documentation, which is just "Non-nil
> when cache should persist between Emacs sessions.", and doesn't
> mention that some files will always be created even if it's nil. It
> also doesn't explicitly mention that it will create files (better be
> explicit about this), or where (or how to control where), or which
> content (i.e. just statistics, or parts of possible private org
> files).

Could you suggest an alternative docstring?

> I suggest making an explicit difference between "caching in memory"
> and "caching by storing files on disk".
> For instance:
> (defvar org-element-use-cache t
>   "Non-nil when Org parser should cache its results.")
> From that description, it's not clear to a new user whether they're
> creating files on disk (as caches often do) or not.

Do you mean something like

"Non-nil when Org parser should cache its results.

The cache is stored in-memory and may also be stored on disk if
`org-element-cache-persistent' is non-nil (the default)."

?




* Re: Please document the caching and its user options
  2024-06-16 12:36     ` Ihor Radchenko
@ 2024-06-17 12:41       ` Daniel Clemente
  2024-06-18 15:53         ` Ihor Radchenko
  0 siblings, 1 reply; 35+ messages in thread
From: Daniel Clemente @ 2024-06-17 12:41 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

> > In particular, when setting (setq org-element-cache-persistent nil)
> > org-mode *should not* create an org-persist directory anywhere. And I
> > think it shouldn't activate org-persist timers (it does now) or hooks.
> > The user's preference should be respected.
>
> Nope. "org-persist" directory is not only used by org-element. If some
> other parts of Org need to cache something, they can also store cache
> there.
>

What's the setting, then, to disable org-persist? I.e., to disable
creation of files like ~/.cache/org-persist/gc-lock.eld?
Many people seem to want to disable all creation of Org-mode-related files.


> > That's a code change.
> > If you just want to update documentation, a starting point can be
> > org-element-cache-persistent's documentation, which is just "Non-nil
> > when cache should persist between Emacs sessions.", and doesn't
> > mention that some files will always be created even if it's nil. It
> > also doesn't explicitly mention that it will create files (better be
> > explicit about this), or where (or how to control where), or which
> > content (i.e. just statistics, or parts of possible private org
> > files).
>
> May you suggest an alternative docstring?
>

I don't know org-persist or org-element-cache-persistent so this needs
your input. I can start with a template, and you can fine-tune it,
expand it or rewrite it:

(defvar org-element-cache-persistent t
  "Non-nil when Org element cache should persist between Emacs sessions.
Cache files are written to disk at `org-persist-directory'.
The cache will be updated regularly (as controlled by
`org-element-cache-sync-idle-time') and when Emacs is closed.

Persisting the cache to disk can speed up ................(startup?
file opening time?, agendas? ...)...... especially if you open
.......(large files? mostly unmodified files? multiple emacs
instances?).
It is not recommended if ........(you edit the same files from
different emacs instances? if the Org files include sensitive
data?).... If you use `org-crypt', note that the persisted cache may
temporarily store unencrypted data after decrypting a header.

Use `org-element-use-cache' instead to use a memory-only cache.")




When I said I don't know org-element-cache-persistent, I meant that as a user.
It's explained in developer terms („make the cache persistent“).
But as a user I don't know: Is it good? Will things be faster? Are
there risks involved? Can it corrupt my files? Will it leave traces of
my files in other places? Who should enable it? What's the downside?
Etc.
My own experience, which is very subjective and may be an edge case, is
that enabling org-element-cache-persistent didn't make loading my Org
files faster; on the contrary, it made some things slower (including
closing Emacs).


> > I suggest making an explicit difference between "caching in memory"
> > and "caching by storing files on disk".
> > For instance:
> > (defvar org-element-use-cache t
> >   "Non-nil when Org parser should cache its results.")
> > From that description, it's not clear to a new user whether they're
> > creating files on disk (as caches often do) or not.
>
> Do you mean something like
>
> "Non-nil when Org parser should cache its results.
>
> The cache is stored in-memory and may also be stored on disk if
> `org-element-cache-persistent' is non-nil (the default)."
>
> ?

This seems better.



* Re: Please document the caching and its user options
  2024-06-17 12:41       ` Daniel Clemente
@ 2024-06-18 15:53         ` Ihor Radchenko
  2024-06-18 16:15           ` Eli Zaretskii
  2024-06-23 11:45           ` Daniel Clemente
  0 siblings, 2 replies; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-18 15:53 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode


Daniel Clemente <n142857@gmail.com> writes:

>> Nope. "org-persist" directory is not only used by org-element. If some
>> other parts of Org need to cache something, they can also store cache
>> there.
>>
> What's the setting then to disable org-persist? I.e. to disable
> creating of files like ~/.cache/org-persist/gc-lock.eld
> Many people seem to want to disable all creation of org-mode related files.

It is impossible. We need to store files like latex previews
somewhere. This somewhere is org-persist-directory now.

That said, gc-lock.eld should not be created when nothing else is
actually stored in the cache. It will be fixed.
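
For reference, the two variables named in this exchange could be set in
an init file along these lines.  This is a sketch, not a way to turn
org-persist off (which, as said above, is not possible); the directory
value is a made-up example, and whether it must be set before Org is
loaded is an assumption not confirmed in this thread.

```elisp
;; Keep the parser cache in memory only; do not persist it between sessions.
(setq org-element-cache-persistent nil)

;; Choose where org-persist stores what it still writes (for example
;; LaTeX previews).  The path below is a made-up example.
(setq org-persist-directory
      (expand-file-name "org-persist/" "~/.cache/private/"))
```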

>> May you suggest an alternative docstring?
>>
>
> I don't know org-persist or org-element-cache-persistent so this needs
> your input. I can start with a template, and you can fine-tune it,
> expand it or rewrite it:...

Thanks!
I am attaching a tentative patch that improves the documentation. I hope
that it clarifies things for you.


[-- Attachment #2: 0001-org-element-cache-Improve-docstrings.patch --]
[-- Type: text/x-patch, Size: 1759 bytes --]

From 8a64e83303566bad608c386fbdafe34aa9065a2b Mon Sep 17 00:00:00 2001
Message-ID: <8a64e83303566bad608c386fbdafe34aa9065a2b.1718725818.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Tue, 18 Jun 2024 17:49:43 +0200
Subject: [PATCH] org-element-cache: Improve docstrings

* lisp/org-element.el (org-element-use-cache):
(org-element-cache-persistent): Add more details to the docstrings.

Reported-by: Daniel Clemente <n142857@gmail.com>
Link: https://orgmode.org/list/CAJKAhPBUAS2bDT5k+xB2E-vu+d==yoNAfKjdKu2HC4qmB_XUnw@mail.gmail.com
---
 lisp/org-element.el | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index 191bb5698..631cdf20c 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -5744,10 +5744,19 @@ ;;; Cache
 
 ;;;###autoload
 (defvar org-element-use-cache t
-  "Non-nil when Org parser should cache its results.")
+  "Non-nil when Org parser should cache its results.
+The results are cached in memory and may also be cached between Emacs
+sessions if `org-element-cache-persistent' is set to non-nil.")
 
 (defvar org-element-cache-persistent t
-  "Non-nil when cache should persist between Emacs sessions.")
+  "Non-nil when Org element cache should persist between Emacs sessions.
+Cache files are written to disk at `org-persist-directory'.
+The cache will be updated when Emacs is closed or when an Org buffer
+is closed.
+
+Persisting the cache to disk can speed up opening Org files
+\\(especially large Org files).  It is not recommended if the Org files
+include sensitive data, unless the data is encrypted via `org-crypt'.")
 
 (defconst org-element-cache-version "2.3"
   "Version number for Org AST structure.
-- 
2.45.1




> My own experience, very subjective and it may be an edge case, is that
> enabling org-element-cache-persistent didn't make loading my org files
> faster; on the contrary, it made some things slower (including closing
> Emacs).

What happens if you set `org-persist--report-time' to t in your config
and examine the *Messages* buffer after opening/closing some Org files?
Look for "org-persist:..." messages.
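
Concretely, that would be a single line in the init file.  The variable
name is the one given above; the double dash suggests it is internal, so
it may change between Org versions.

```elisp
;; Ask org-persist to report how long its reads and writes take;
;; the reports then show up in the *Messages* buffer.
(setq org-persist--report-time t)
```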



* Re: Please document the caching and its user options
  2024-06-18 15:53         ` Ihor Radchenko
@ 2024-06-18 16:15           ` Eli Zaretskii
  2024-06-18 16:25             ` Ihor Radchenko
  2024-06-23 11:45           ` Daniel Clemente
  1 sibling, 1 reply; 35+ messages in thread
From: Eli Zaretskii @ 2024-06-18 16:15 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: n142857, emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-orgmode@gnu.org
> Date: Tue, 18 Jun 2024 15:53:18 +0000
> 
> Daniel Clemente <n142857@gmail.com> writes:
> 
> > What's the setting then to disable org-persist? I.e. to disable
> > creating of files like ~/.cache/org-persist/gc-lock.eld
> > Many people seem to want to disable all creation of org-mode related files.
> 
> It is impossible. We need to store files like latex previews
> somewhere. This somewhere is org-persist-directory now.

Sorry, I don't understand: why do you need to store them as files?
Why not keep the previews in buffer(s)?



* Re: Please document the caching and its user options
  2024-06-18 16:15           ` Eli Zaretskii
@ 2024-06-18 16:25             ` Ihor Radchenko
  2024-06-18 16:33               ` Eli Zaretskii
  2024-06-18 22:06               ` Rudolf Adamkovič
  0 siblings, 2 replies; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-18 16:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: n142857, emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

>> It is impossible. We need to store files like latex previews
>> somewhere. This somewhere is org-persist-directory now.
>
> Sorry, I don't understand: why do you need to store them as files?
> Why not keep the previews in buffer(s)?

In Org mode, in order to create LaTeX previews, we
(1) run LaTeX to generate the preview image;
(2) store that image in some directory;
(3) display that image over the corresponding LaTeX fragment in an
    overlay;
(4) retain the image on disk, so that we do not need to run LaTeX
    again each time the user toggles displaying the previews (this is
    very important, because running LaTeX is costly).

Can we instead store them in memory? Yes, but (1) it will make Emacs RAM
consumption grow constantly as more and more previews are generated;
(2) it will require significant changes in the Org mode codebase.




* Re: Please document the caching and its user options
  2024-06-18 16:25             ` Ihor Radchenko
@ 2024-06-18 16:33               ` Eli Zaretskii
  2024-06-18 16:55                 ` Ihor Radchenko
  2024-06-18 22:06               ` Rudolf Adamkovič
  1 sibling, 1 reply; 35+ messages in thread
From: Eli Zaretskii @ 2024-06-18 16:33 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: n142857, emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: n142857@gmail.com, emacs-orgmode@gnu.org
> Date: Tue, 18 Jun 2024 16:25:10 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> It is impossible. We need to store files like latex previews
> >> somewhere. This somewhere is org-persist-directory now.
> >
> > Sorry, I don't understand: why do you need to store them as files?
> > Why not keep the previews in buffer(s)?
> 
> In Org mode, in order to create latex previews, we
> (1) run latex to generate the preview image
> (2) that image is stored in some directory
> (3) we display that image over the corresponding latex fragment in an
>     overlay
> (4) we retain the image on disk, so that we do not need to run latex
>     many times if the users toggles displaying the previews (this is
>     very important, because running latex is costly)
> 
> Can we instead store them in memory? Yes, but (1) it will make Emacs RAM
> consumption grow constantly and more and more previews are generated;
> (2) it will require significant changes in the Org mode codebase.

I understand all that, but if the user wants it, and insists on not
caching any data, let them have what they want.  My surprise was
caused by your "it is impossible"; I now understand that you meant
"not reasonable" or perhaps "users will not like that" instead.



* Re: Please document the caching and its user options
  2024-06-18 16:33               ` Eli Zaretskii
@ 2024-06-18 16:55                 ` Ihor Radchenko
  2024-06-19  9:27                   ` Colin Baxter
  0 siblings, 1 reply; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-18 16:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: n142857, emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

>> Can we instead store them in memory? Yes, but (1) it will make Emacs RAM
>> consumption grow constantly and more and more previews are generated;
>> (2) it will require significant changes in the Org mode codebase.
>
> I understand all that, but if the user wants it, and insist on not
> caching any data, let them have what they want.

It is not about letting or not letting them. I would have to implement
it. (I am ok with it, but I am not going to prioritize my time for
nice-to-haves; though I would not mind patches submitted by interested
users).

> ... My surprise was
> caused by your "it is impossible"; I now understand that you meant
> "not reasonable" or perhaps "users will not like that" instead.

I meant:

1. not reasonable in the sense that it has downsides compared to what we
   do now (saving LaTeX previews on disk)
2. impossible in the sense that we do not have an existing toggle to store
   cached previews in memory. Such functionality would have to be added,
   and it is not necessarily trivial to add.




* Re: Please document the caching and its user options
  2024-06-18 16:25             ` Ihor Radchenko
  2024-06-18 16:33               ` Eli Zaretskii
@ 2024-06-18 22:06               ` Rudolf Adamkovič
  2024-06-19  4:29                 ` tomas
  1 sibling, 1 reply; 35+ messages in thread
From: Rudolf Adamkovič @ 2024-06-18 22:06 UTC (permalink / raw)
  To: Ihor Radchenko, Eli Zaretskii; +Cc: n142857, emacs-orgmode

Ihor Radchenko <yantar92@posteo.net> writes:

> Can we instead store them in memory? Yes, but (1) it will make Emacs RAM
> consumption grow constantly and more and more previews are generated;
> (2) it will require significant changes in the Org mode codebase.

And, (3) all previews would be lost every time one shuts down their
computer, say for the night, or even restarts Emacs, which would be
a terrible experience.

Rudy
-- 
"It is no paradox to say that in our most theoretical moods we may be
nearest to our most practical applications."
--- Alfred North Whitehead, 1861-1947

Rudolf Adamkovič <rudolf@adamkovic.org> [he/him]
http://adamkovic.org



* Re: Please document the caching and its user options
  2024-06-18 22:06               ` Rudolf Adamkovič
@ 2024-06-19  4:29                 ` tomas
  0 siblings, 0 replies; 35+ messages in thread
From: tomas @ 2024-06-19  4:29 UTC (permalink / raw)
  To: Rudolf Adamkovič
  Cc: Ihor Radchenko, Eli Zaretskii, n142857, emacs-orgmode


On Wed, Jun 19, 2024 at 12:06:42AM +0200, Rudolf Adamkovič wrote:
> Ihor Radchenko <yantar92@posteo.net> writes:
> 
> > Can we instead store them in memory? Yes, but (1) it will make Emacs RAM
> > consumption grow constantly and more and more previews are generated;
> > (2) it will require significant changes in the Org mode codebase.
> 
> And, (3) all previews would be lost every time one shuts down their
> computer, say for the night, or even restarts Emacs, which would be
> terrible experience.

I was one of those clamouring for a "master switch". I'm aware of all
of that. I can live with that (not everyone will, that's why it should
be optional).

I arrived at the impression that the discussion was becoming unproductive,
that's why I gave up and went with the non-existing directory trick.

Cheers
-- 
t



* Re: Please document the caching and its user options
  2024-06-18 16:55                 ` Ihor Radchenko
@ 2024-06-19  9:27                   ` Colin Baxter
  2024-06-19 10:35                     ` Ihor Radchenko
  0 siblings, 1 reply; 35+ messages in thread
From: Colin Baxter @ 2024-06-19  9:27 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, n142857, emacs-orgmode

>>>>> Ihor Radchenko <yantar92@posteo.net> writes:

    > Eli Zaretskii <eliz@gnu.org> writes:
    >>> Can we instead store them in memory? Yes, but (1) it will make
    >>> Emacs RAM consumption grow constantly and more and more previews
    >>> are generated; (2) it will require significant changes in the
    >>> Org mode codebase.
    >> 
    >> I understand all that, but if the user wants it, and insist on
    >> not caching any data, let them have what they want.

    > It is not about letting or not letting them. I would have to
    > implement it. (I am ok with it, but I am not going to prioritize
    > my time for nice-to-haves; though I would not mind patches
    > submitted by interested users).

    >> ... My surprise was caused by your "it is impossible"; I now
    >> understand that you meant "not reasonable" or perhaps "users will
    >> not like that" instead.

    > I meant:

    > 1. not reasonable in a sense that it has downsides compared to
    >    what we do now - save latex previews on disk
    > 2. impossible in a sense that we do not have an existing toggle
    >    to store cached previews in memory. Such functionality would
    >    have to be added; and it is not necessarily trivial to add it.

I too was one of those complainers who wanted to be able to disable
org-persist completely. The argument about latex preview is really a
non-starter in my opinion. I never use latex-preview and I'm sure I'm
not alone in this. I also would not class the disabling of org-persist
as a 'nice-to-have'.

Best wishes,

Colin Baxter.



* Re: Please document the caching and its user options
  2024-06-19  9:27                   ` Colin Baxter
@ 2024-06-19 10:35                     ` Ihor Radchenko
  2024-06-19 13:04                       ` Eli Zaretskii
  0 siblings, 1 reply; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-19 10:35 UTC (permalink / raw)
  To: m43cap; +Cc: Eli Zaretskii, n142857, emacs-orgmode

Colin Baxter <m43cap@yandex.com> writes:

>     > 1. not reasonable in a sense that it has downsides compared to
>     >    what we do now - save latex previews on disk
>     > 2. impossible in a sense that we do not have an existing toggle
>     >    to store cached previews in memory. Such functionality would
>     >    have to be added; and it is not necessarily trivial to add it.
>
> I too was one of those complainers who wanted to be able to disable
> org-persist completely. The argument about latex preview is really a
> non-starter in my opinion. I never use latex-preview and I'm sure I'm
> not alone in this. I also would not class the disabling of org-persist
> to be a 'nice-to-have'.

Let me clarify.

If you do not use latex-preview or other features that cache their
results, org-persist should not create any files or directories.
(It currently does create gc-lock.eld, but I will fix this)

However, if you do use it, Org mode has no option to disable creating
the cache.  In fact, Org mode never had such an option. For example,
`org-preview-latex-image-directory' has been a part of Org mode since
at least Org 9.0, and there was never an option to disable it.
org-persist did not introduce anything drastically new in this regard.

So, this discussion and people insisting on completely disabling the
cache is a bit strange to me. I suspect that the problem may be not the
cache itself, but either (1) that it is created when cache features are
not really used; (2) that it is created in .emacs.d for some users.
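
For case (2), the cache location can be moved out of .emacs.d. A minimal
sketch, assuming an Org version that ships org-persist with the
`org-persist-directory' user option, and assuming the form is evaluated
in the init file before Org is loaded; the target path is only an
example:

```elisp
;; Assumption: this runs early in the init file, before Org is loaded.
(require 'xdg)

;; Store org-persist data under the XDG cache directory
;; (normally ~/.cache/ unless XDG_CACHE_HOME says otherwise).
(setq org-persist-directory
      (expand-file-name "org-persist/" (xdg-cache-home)))
```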




* Re: Please document the caching and its user options
  2024-06-19 10:35                     ` Ihor Radchenko
@ 2024-06-19 13:04                       ` Eli Zaretskii
  2024-06-19 13:30                         ` Ihor Radchenko
  0 siblings, 1 reply; 35+ messages in thread
From: Eli Zaretskii @ 2024-06-19 13:04 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: m43cap, n142857, emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, n142857@gmail.com, emacs-orgmode@gnu.org
> Date: Wed, 19 Jun 2024 10:35:39 +0000
> 
> If you do not use latex-preview or other features that cache their
> results, org-persist should not create any files or directories.
> (It currently does create gc-lock.eld, but I will fix this)
> 
> However, if you do use it, Org mode has no option to disable creating
> cache.  In fact, Org mode never had such an option. For example,
> `org-preview-latex-image-directory' is a part of Org mode since at least
> Org 9.0, and it was never an option to disable it. org-persist did not
> introduce anything drastically new in this regard.
> 
> So, this discussion and people insisting on completely disabling the
> cache is a bit strange to me. I suspect that the problem may be not the
> cache itself, but either (1) that it is created when cache features are
> not really used; (2) that it is created in .emacs.d for some users.

Let me clarify.  In the scenario in which I found out about Org
caching, I didn't use latex-preview, not at all.  All I did was visit
the org.org file that we have now on the master branch, and look
around for a while (specifically, I looked for the constructs that
produce the Texinfo @dircategory and @direntry directives).  Perhaps
the caching I saw was a different kind of caching, I don't know (hence
the request to document that, and if there's more than one kind of
caching, I hope they will all be documented), but evidently the
caching by Org happens (by default!) even if the user doesn't come
anywhere near latex-preview.



* Re: Please document the caching and its user options
  2024-06-19 13:04                       ` Eli Zaretskii
@ 2024-06-19 13:30                         ` Ihor Radchenko
  2024-06-19 16:07                           ` Colin Baxter
  0 siblings, 1 reply; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-19 13:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: m43cap, n142857, emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

> Let me clarify.  In the scenario in which I found out about Org
> caching, I didn't use latex-preview, not at all....

Sure. Org uses multiple caches.
You encountered the one created by the parser. The parser cache in
particular can be disabled. But not the latex preview cache.

When replying to Colin, I was clarifying why some parts of the
cache cannot be disabled. That's all.
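
Disabling the parser cache can be sketched as below. This assumes Org
9.6 or later, where the `org-element-cache-persistent' and
`org-element-use-cache' user options exist; which of the two to set
depends on whether only the on-disk part is unwanted:

```elisp
;; Keep the in-memory parser cache, but do not write it to disk.
(setq org-element-cache-persistent nil)

;; Alternatively, switch the parser cache off altogether.
;; Assumption: parsing large Org files may then become slower.
;; (setq org-element-use-cache nil)
```

Neither setting affects the latex preview cache discussed above.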




* Re: Please document the caching and its user options
  2024-06-19 13:30                         ` Ihor Radchenko
@ 2024-06-19 16:07                           ` Colin Baxter
  2024-06-19 16:15                             ` Ihor Radchenko
  0 siblings, 1 reply; 35+ messages in thread
From: Colin Baxter @ 2024-06-19 16:07 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, n142857, emacs-orgmode

>>>>> Ihor Radchenko <yantar92@posteo.net> writes:

    > Eli Zaretskii <eliz@gnu.org> writes:
    >> Let me clarify.  In the scenario in which I found out about Org
    >> caching, I didn't use latex-preview, not at all....

    > Sure. Org uses multiple caches.  You encountered the one created
    > by parser. The parser cache in particular can be disabled. But not
    > the latex preview cache.

This is what I cannot understand. If the user never uses latex preview,
why can the latex preview cache not be disabled? I don't want to go on
and on and become a bore - I've said my piece and I will be silent from
now on.

Best wishes,

Colin Baxter.




* Re: Please document the caching and its user options
  2024-06-19 16:07                           ` Colin Baxter
@ 2024-06-19 16:15                             ` Ihor Radchenko
  0 siblings, 0 replies; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-19 16:15 UTC (permalink / raw)
  To: m43cap; +Cc: Eli Zaretskii, n142857, emacs-orgmode

Colin Baxter <m43cap@yandex.com> writes:

> This what I cannot understand. If the user never uses latex preview why
> cannot the latex preview cache be disabled? I don't want to go on and on
> and become a bore - I've said my piece and I will be silent from now
> on.

I believe that we have some kind of misunderstanding.
Disabling cache only makes sense when it is used.
When it is unused, no cache files will be created.

So, your request to allow disabling the preview cache means that you
want latex previews to work without creating cache files, which is
currently not an option.

If you do not use latex previews, no cache files will be created due to
latex previews. Other cache files may be created though. One of them is
parser cache, but you can disable this one.




* Re: Please document the caching and its user options
  2024-06-16 10:41                   ` Eli Zaretskii
@ 2024-06-23  9:12                     ` Björn Bidar
  0 siblings, 0 replies; 35+ messages in thread
From: Björn Bidar @ 2024-06-23  9:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ihor Radchenko, emacs-orgmode, emacs-devel, michael.albinus

Eli Zaretskii <eliz@gnu.org> writes:

>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> >> I was referring to some kind of global option that defines cache
>> >> directory, data directory, etc. Something akin XDG.
>> >
>> > We already have xdg-cache-home (and a few others in xdg.el).  Is that
>> > what you meant?
>> 
>> Yes, except that `xdg-cache-home' is limited:
>> 
>> 1. It cannot be customized by users
>
> Of course it can: just make the default value of a defcustom be
> derived by xdg-cache-home, and users can then customize the option to
> a different value if they want.

That and it can be overridden just like any XDG Directory variable using
environment variables.
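
Both points can be sketched together. The option name
`my-pkg-cache-directory' and the "my-pkg/" subdirectory are hypothetical
and used only for illustration; `xdg-cache-home' is the existing
function from xdg.el:

```elisp
(require 'xdg)

;; Hypothetical user option: its default is derived from
;; `xdg-cache-home', yet users can still customize it.
(defcustom my-pkg-cache-directory
  (expand-file-name "my-pkg/" (xdg-cache-home))
  "Directory where the hypothetical my-pkg library stores cache files."
  :type 'directory
  :group 'files)
```

Since `xdg-cache-home' consults the XDG_CACHE_HOME environment variable,
starting Emacs with that variable set to another directory changes the
default without any customization.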



* Re: Please document the caching and its user options
  2024-06-18 15:53         ` Ihor Radchenko
  2024-06-18 16:15           ` Eli Zaretskii
@ 2024-06-23 11:45           ` Daniel Clemente
  2024-06-24 10:36             ` Ihor Radchenko
  1 sibling, 1 reply; 35+ messages in thread
From: Daniel Clemente @ 2024-06-23 11:45 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

>
> Thanks!
> I am attaching tentative patch that improve the documentation. I hope
> that it clarifies things for you.
>
>


Thanks. I'm not sure about the "unless" part here:

> Persisting the cache to disk […]
> It is not recommended if the Org files
> include sensitive data, unless the data is encrypted via `org-crypt'.")

I first mentioned org-crypt because users of org-crypt may be
surprised if they see encrypted data stored unencrypted on disk, due
to this cache.
A user has somefile.org which contains some headers marked with the
"crypt" tag. Only those headers are encrypted. The org-element cache
may now cache the whole file, including the encrypted headers (this is
ok). Now the user temporarily decrypts the encrypted header, works on
it some time (including closing the file and opening it again) then
encrypts the section again. During the time that the header was
unencrypted, the org-element cache was storing information about
unencrypted data in ~/.cache/org-persist, which could even be a remote
server (NFS, SMB etc), not as private as the org file itself.

Apparently the data stored in the cache doesn't contain the actual
paragraphs of text but it still contains plain text (like: names of
tags, properties, files, macros, scheduling information), which I
would call private if I'm using org-crypt.


I saw some code related to the org-element cache to avoid putting
encrypted files in the cache, but if I remember correctly that would
be just for whole encrypted files.
The part about how org-crypt works with caching could also be
documented in org-crypt instead, or in the manual.


The rest of the documentation change seems good, it improves things.
I would just mention the shortcomings or disclaimers, if there are.
For instance I worry about what may happen when different Emacs
processes load the same Org files at the same time (e.g. I run several
automated batch export jobs). And I guess that having a disk cache
creates new problems, like when in a web browser a simple F5 won't
refresh and you need S-F5.
But if there are no shortcomings (i.e. all operations will always use
up to date information and everything will keep working as usual when
you enable on-disk cache), it's ok like it is. It's also good if it's
explicitly mentioned. It could also be mentioned somewhere else, like
in a cache section in the manual, if it gets one.


> > My own experience, very subjective and it may be an edge case, is that
> > enabling org-element-cache-persistent didn't make loading my org files
> > faster; on the contrary, it made some things slower (including closing
> > Emacs).
>
> What happens if you set `org-persist--report-time' to t in your config
> and examine *Messages* buffer after opening/closing some Org files?
> Look for "org-persist:..." messages.

Thanks, I'll try that for some time and learn about org-persist and if
there are problems I'll continue in another thread.
For now I can say I see that each operation takes 0.00 or 0.01
seconds, but I have ~150 files so it amounts to a short delay (shorter
than the last time I tried it).
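
For reference, the setting suggested in the quoted message amounts to
the line below. Note that `org-persist--report-time' is an internal
variable (double dash in the name), so its meaning may differ between
Org versions; this is a debugging aid, not a supported option:

```elisp
;; Ask org-persist to report how long its reads and writes take.
;; The reports appear in *Messages*, prefixed with "org-persist:".
(setq org-persist--report-time t)
```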



* Re: Please document the caching and its user options
  2024-06-23 11:45           ` Daniel Clemente
@ 2024-06-24 10:36             ` Ihor Radchenko
  2024-06-26 12:59               ` Daniel Clemente
  0 siblings, 1 reply; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-24 10:36 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

> Thanks. I'm not sure about the "unless" part here:
>
>> Persisting the cache to disk […]
>> It is not recommended if the Org files
>> include sensitive data, unless the data is encrypted via `org-crypt'.")
>
> I first mentioned org-crypt because users of org-crypt may be
> surprised if they see encrypted data stored unencrypted in disk, due
> to this cache.

No unencrypted data should be stored in the cache _on fs_.
If it does get stored, it is a bug that should be reported.

> A user has somefile.org which contains some headers marked with the
> "crypt" tag. Only those headers are encrypted. The org-element cache
> may now cache the whole file, including the encrypted headers (this is
> ok). Now the user temporarily decrypts the encrypted header, works on
> it some time (including closing the file and opening it again) then
> encrypts the section again. During the time that the header was
> unencrypted, the org-element cache was storing information about
> unencrypted data in ~/.cache/org-persist, which could even be a remote
> server (NFS, SMB etc), not as private as the org file itself.

Nope. Storing to disk only happens when you kill the buffer and before
exiting Emacs. At that point, org-crypt must take care of
re-encrypting everything.

> The rest of the documentation change seems good, it improves things.
> I would just mention the shortcomings or disclaimers, if there are.
> For instance I worry about what may happen when different Emacs
> processes load the same Org files at the same time (e.g. I run several
> automated batch export jobs). And I guess that having a disk cache
> creates new problems, like when in a web browser a simple F5 won't
> refresh and you need S-F5.
> But if there are no shortcomings (i.e. all operations will always use
> up to date information and everything will keep working as usual when
> you enable on-disk cache), it's ok like it is. It's also good if it's
> explicitly mentioned. It could also be mentioned somewhere else, like
> in a cache section in the manual, if it gets one.

Multiple Emacs instances are handled correctly. I do not see much
point documenting that things are working as expected.




* Re: Please document the caching and its user options
  2024-06-24 10:36             ` Ihor Radchenko
@ 2024-06-26 12:59               ` Daniel Clemente
  2024-06-26 13:21                 ` org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options) Ihor Radchenko
  0 siblings, 1 reply; 35+ messages in thread
From: Daniel Clemente @ 2024-06-26 12:59 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

> > A user has somefile.org which contains some headers marked with the
> > "crypt" tag. Only those headers are encrypted. The org-element cache
> > may now cache the whole file, including the encrypted headers (this is
> > ok). Now the user temporarily decrypts the encrypted header, works on
> > it some time (including closing the file and opening it again) then
> > encrypts the section again. During the time that the header was
> > unencrypted, the org-element cache was storing information about
> > unencrypted data in ~/.cache/org-persist, which could even be a remote
> > server (NFS, SMB etc), not as private as the org file itself.
> Nope. Storing to disk only happens when you kill the buffer and before
> exiting Emacs. At that point, org-crypt must take care about
> re-encrypting everything.

Sometimes org-crypt fails to reencrypt the data. E.g. if Emacs
crashes, or if you fail to type the same password twice, or of course
if you don't use (org-crypt-use-before-save-magic), etc.
At the end of the day when I do "git diff" + "git commit" sometimes I
realize there's unencrypted data and then I have to reencrypt it. In
the meantime I might have killed and reopened the buffer, thus
updating the file cache.
That may be a problem with org-crypt and something to document in
org-crypt itself. The point is that users of org-crypt should take
extra precautions when enabling org-element-cache-persistent. Like:
not closing buffers while the sections are unencrypted.

> Multiple Emacs instances are handled correctly. I do not see much
> point documenting that things are working as expected.

Ok, thanks, it's good to read this guarantee here. I'm used to
org-element cache inconsistency errors, so I didn't know the state of
things.
I agree it doesn't need to be in the docstring.
If there's some chapter about caches in the manual (which is one of
the topics in the original post of this thread) it can describe these
minor things. But the major ones, like what it does and how to turn it
on/off, are more interesting.



* org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options)
  2024-06-26 12:59               ` Daniel Clemente
@ 2024-06-26 13:21                 ` Ihor Radchenko
  0 siblings, 0 replies; 35+ messages in thread
From: Ihor Radchenko @ 2024-06-26 13:21 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

> Sometimes org-crypt fails to reencrypt the data. E.g. if Emacs
> crashes, or if you fail to type the same password twice, or of course
> if you don't use (org-crypt-use-before-save-magic), etc.

I do not think that there is anything left on disk if Emacs crashes.

As for not typing the same password twice and not using
org-crypt-use-before-save-magic, we should somehow fix this.
(I am starting a new thread branch.)

One simple idea is to disable backups if encryption fails.
Or use `write-contents-functions' instead of `before-save-hook' - that
way, Emacs will not ignore errors thrown by org-crypt and will not
actually save anything if encryption fails.
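
The second idea can be sketched as follows. This is only an illustration
of the approach, not an actual change to org-crypt:
`my/org-crypt-encrypt-on-write' is a hypothetical helper, while
`org-encrypt-entries' and `write-contents-functions' are existing names
from org-crypt and Emacs. It assumes that `org-encrypt-entries' signals
an error when encryption fails:

```elisp
(require 'org-crypt)

(defun my/org-crypt-encrypt-on-write ()
  "Hypothetical helper: encrypt all crypt-tagged entries before writing.
An error signaled here propagates out of the save command."
  (org-encrypt-entries)
  ;; Return nil so that Emacs goes on to write the file itself.
  nil)

(add-hook 'org-mode-hook
          (lambda ()
            ;; Buffer-local, so only Org buffers are affected.
            (add-hook 'write-contents-functions
                      #'my/org-crypt-encrypt-on-write nil t)))
```

With such a setup, `org-crypt-use-before-save-magic' would presumably
not be called in addition, to avoid encrypting twice on every save.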

> At the end of the day when I do "git diff" + "git commit" sometimes I
> realize there's unencrypted data and then I have to reencrypt it. In
> the meantime I might have killed and reopened the buffer, thus
> updating the file cache.
> That may be a problem by org-encrypt and something to document in
> org-crypt itself. The point is that users of org-encrypt should take
> extra precautions when enabling org-element-cache-persistent. Like:
> not closing buffers while the sections are unencrypted.

These things should be considered bugs. And we should fix them. Cache and
other libraries should not be responsible for special treatment of
the optional org-crypt library.




Thread overview: 35+ messages
2024-06-12  9:38 Please document the caching and its user options Eli Zaretskii
2024-06-14 13:12 ` Ihor Radchenko
2024-06-14 13:41   ` Eli Zaretskii
2024-06-14 15:31     ` Ihor Radchenko
2024-06-14 15:56       ` Eli Zaretskii
2024-06-15 12:47         ` Ihor Radchenko
2024-06-15 13:01           ` Eli Zaretskii
2024-06-15 14:13             ` Ihor Radchenko
2024-06-15 14:37               ` Eli Zaretskii
2024-06-16  9:05                 ` Ihor Radchenko
2024-06-16 10:41                   ` Eli Zaretskii
2024-06-23  9:12                     ` Björn Bidar
2024-06-15 13:47           ` Ihor Radchenko
2024-06-14 13:56   ` Jens Lechtenboerger
2024-06-14 14:31     ` Publishing cache (was: Please document the caching and its user options) Ihor Radchenko
2024-06-16  5:40   ` Please document the caching and its user options Daniel Clemente
2024-06-16 12:36     ` Ihor Radchenko
2024-06-17 12:41       ` Daniel Clemente
2024-06-18 15:53         ` Ihor Radchenko
2024-06-18 16:15           ` Eli Zaretskii
2024-06-18 16:25             ` Ihor Radchenko
2024-06-18 16:33               ` Eli Zaretskii
2024-06-18 16:55                 ` Ihor Radchenko
2024-06-19  9:27                   ` Colin Baxter
2024-06-19 10:35                     ` Ihor Radchenko
2024-06-19 13:04                       ` Eli Zaretskii
2024-06-19 13:30                         ` Ihor Radchenko
2024-06-19 16:07                           ` Colin Baxter
2024-06-19 16:15                             ` Ihor Radchenko
2024-06-18 22:06               ` Rudolf Adamkovič
2024-06-19  4:29                 ` tomas
2024-06-23 11:45           ` Daniel Clemente
2024-06-24 10:36             ` Ihor Radchenko
2024-06-26 12:59               ` Daniel Clemente
2024-06-26 13:21                 ` org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options) Ihor Radchenko
