Matt Lundin writes:

> I've been doing some testing of org-publish functions and have found a
> few problems with org-publish-cache-file-needs-publishing. They arise
> from the fact that it attempts to take included files into account.

OK, I've worked up a patch that solves several of these issues. The
basic idea is to check, when publishing an org file, whether it
includes other org files, and then to store that data in the cache.
That way, org-publish-cache-file-needs-publishing does not need to open
each buffer; it can instead compare the stored timestamp data against
the actual modification times of the included files.

> Org-publish does not check the cache of included files at all. It
> simply compares the last modified time of an included file with the
> last modified time of the master/including file. The result is that a
> master file will perpetually be republished if an included file
> happened to be changed afterwards (even if both files were changed
> years ago and the project has been published 100s of times since
> then).

The patch fixes this by caching timestamps for included files, which
allows org-publish to track changes in them.

> 3. It is slow!!! The function visits every file in a project to check
> for #+INCLUDE declarations, thus offsetting much of the benefit of
> caching timestamps. To test this, I created a dummy project with over
> 1000 pages (not typical usage, of course, but possible for someone
> writing a blog over several years or creating a large interlinked
> wiki).

The patch should make things much faster, since we only need to scan
for included files during publishing (when the buffer is already
active). Org-publish no longer has to visit each file individually just
to check for #+INCLUDE declarations (which takes a lot of time);
rather, it can just use the cache.

Matt
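
P.S. For anyone who wants the caching idea in concrete form, here is a
rough sketch. It is written in Python rather than Emacs Lisp and is
purely illustrative: it is not the patch, and the names find_includes,
record_publish and needs_publishing, as well as the cache layout, are
made up for this example. It also assumes include paths contain no
spaces.

```python
import os
import re

# Matches lines such as:  #+INCLUDE: "chapter.org"
# Simplification: file names containing spaces are not handled.
INCLUDE_RE = re.compile(
    r'^[ \t]*#\+INCLUDE:[ \t]*"?([^"\s]+)"?',
    re.IGNORECASE | re.MULTILINE,
)


def find_includes(text, base_dir):
    """Return normalized paths of files named in #+INCLUDE lines."""
    return [
        os.path.normpath(os.path.join(base_dir, name))
        for name in INCLUDE_RE.findall(text)
    ]


def record_publish(cache, path):
    """Store timestamps at publish time, when the content is loaded anyway."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    includes = find_includes(text, os.path.dirname(path))
    cache[path] = {
        "mtime": os.path.getmtime(path),
        "includes": {
            inc: os.path.getmtime(inc)
            for inc in includes
            if os.path.exists(inc)
        },
    }


def needs_publishing(cache, path):
    """Decide from cached timestamps only; the file itself is never opened."""
    entry = cache.get(path)
    if entry is None:
        return True  # never published
    if os.path.getmtime(path) != entry["mtime"]:
        return True  # the file itself changed
    for inc, stamp in entry["includes"].items():
        if not os.path.exists(inc) or os.path.getmtime(inc) != stamp:
            return True  # an included file changed or disappeared
    return False
```

The point of the design is that the expensive step (scanning for
#+INCLUDE lines) happens only in record_publish, while the check that
runs for every file in the project only reads modification times.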