From mboxrd@z Thu Jan  1 00:00:00 1970
From: Carsten Dominik
Subject: Re: [PATCH] sha1 hash of latex fragments to avoid regeneration
Date: Tue, 17 Nov 2009 14:14:15 +0100
Message-ID: <85C13EF4-5FE0-4E4F-94D1-6AB87565E2F4@gmail.com>
List-Id: "General discussions about Org-mode."
To: Eric Schulte
Cc: Org Mode

Hi Eric,

looks great now, I have made a few minor changes and applied it.

- Carsten

On Nov 17, 2009, at 1:11 AM, Eric Schulte wrote:

> From: "Eric Schulte"
> To: Carsten Dominik
> Cc: Org Mode
> Subject: Re: [Orgmode] [PATCH] sha1 hash of latex fragments to avoid regeneration
> Date: Mon, 16 Nov 2009 17:11:03 -0700
> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (darwin)
> MIME-Version: 1.0
> Content-Type: multipart/mixed; boundary="=-=-="
>
> --=-=-=
>
> Hi Carsten,
>
> Thanks for the feedback; I have comments inline below.
>
> Carsten Dominik writes:
>
> > Hi Eric,
> >
> > this is fantastic, thank you for implementing it.  I have wanted
> > some speedup for this for a long time.
> >
> > I think your implementation still suffers from one issue:
> >
> > The produced image also depends on the variables
> > org-format-latex-options, org-format-latex-header,
> > org-export-latex-package-alist, and on the `forbuffer' flag
> > (because images made for display in the buffer and for HTML
> > export generally need different resolutions).
> >
> > One way to deal with this would be to make a list containing the
> > values of these four variables and using prin1-to-string to
> > convert this list into a string, and then to prepend this string
> > to TXT when creating the hash.
>
> That sounds like the best solution.  I have made this change in the
> newly attached patch.
>
> > Now, I am sure that you are already planning to do the same
> > for ditaa images etc?
>
> of course :)
>
> > That would be a treat, because ditaa can be terribly slow for
> > complex figures, and this would speed up the cycle when writing
> > a document by quite a bit.
>
> Dan and I have been working on a general caching solution for
> org-babel.  Once we get that sorted out it should provide for the
> caching of all org-babel results, which would include ditaa, dot,
> gnuplot, etc...
>
> I am currently more interested in making these changes in org-babel
> than in org-exp-blocks, but in this case it may be worth implementing
> caching in both cases.
>
> > There is one further issue:  Cleaning up images that are no
> > longer used.
> >
> > With the LaTeX fragments it is not a big problem, because they
> > live in a special directory.  This would be a bigger concern for
> > ditaa images etc which tend to live in the same directory as the
> > source.  Maybe that could be solved by
> >
> > 1. Making sure that each image still has a name like "blue", so
> >    that the name now would be "blue_loooooonghashvalue.png" or so.
> > 2. Maybe creating a command that will look for orphaned images
> >    and remove them, by looking for the hash in the name and
> >    checking access times.  I am not sure if this is needed,
> >    and not sure what would be the best way to implement it.
>
> Yes, this will not be an issue in the org-babel implementation as the
> hash key is stored separately from the file name, but I can see how
> this would need to be considered for any org-exp-blocks hash-based
> image caching.  The first option you propose above sounds very doable,
> as long as we are comfortable removing any files that match regular
> expressions like
>
>   blue_[[:alnum:]]+\.png
>
> which seems safe enough...
>
> > After looking at these things, I would be *very* happy to accept
> > this patch.
>
> I'll give this some more thought, but perhaps the latex image fragment
> patch is now viable.
>
> Best -- Eric
>
>
> --=-=-=
> Content-Type: text/x-patch
> Content-Disposition: inline; filename=0001-latex-fragment-images-cached-using-sha1-hash-keys.patch
>
> From 0e9a359c1d5e8f67c20066533171fb1edc11ba61 Mon Sep 17 00:00:00 2001
> From: Eric Schulte
> Date: Mon, 16 Nov 2009 16:53:34 -0700
> Subject: [PATCH] latex fragment images cached using sha1 hash keys
>
> Latex fragment images are now saved in files named by the sha1 hash
> of the latex text used to create the image.  By checking if files
> exist before images generation the regeneration of identical latex
> images is avoided.
> ---
>  lisp/ChangeLog |    6 ++++++
>  lisp/org.el    |   22 +++++++++++-----------
>  2 files changed, 17 insertions(+), 11 deletions(-)
>
> diff --git a/lisp/ChangeLog b/lisp/ChangeLog
> index 5f83aaa..b581931 100755
> --- a/lisp/ChangeLog
> +++ b/lisp/ChangeLog
> @@ -1,3 +1,9 @@
> +2009-11-17  Eric Schulte
> +
> +    * org.el (org-format-latex): Latex images are now saved to files
> +    named by the sha1 hash of the latex source text avoiding
> +    regeneration of identical images.
> +
>  2009-11-16  Carsten Dominik
>
>      * org-html.el (org-export-html-home/up-format): Add an ID to the
> diff --git a/lisp/org.el b/lisp/org.el
> index bf6573b..15a8f9e 100644
> --- a/lisp/org.el
> +++ b/lisp/org.el
> @@ -14550,15 +14550,9 @@ Some of the options can be changed using the variable
>           (opt org-format-latex-options)
>           (matchers (plist-get opt :matchers))
>           (re-list org-latex-regexps)
> -         (cnt 0) txt link beg end re e checkdir
> +         (cnt 0) txt hash link beg end re e checkdir
>           executables-checked
>           m n block linkfile movefile ov)
> -    ;; Check if there are old images files with this prefix, and remove them
> -    (when (file-directory-p todir)
> -      (mapc 'delete-file
> -            (directory-files
> -             todir 'full
> -             (concat (regexp-quote prefixnodir) "_[0-9]+\\.png$"))))
>      ;; Check the different regular expressions
>      (while (setq e (pop re-list))
>        (setq m (car e) re (nth 1 e) n (nth 2 e)
> @@ -14576,9 +14570,14 @@ Some of the options can be changed using the variable
>              (setq txt (match-string n)
>                    beg (match-beginning n) end (match-end n)
>                    cnt (1+ cnt)
> -                  linkfile (format "%s_%04d.png" prefix cnt)
> -                  movefile (format "%s_%04d.png" absprefix cnt)
>                    link (concat block "[[file:" linkfile "]]" block))
> +            (setq hash (sha1 (prin1-to-string
> +                              (list org-format-latex-header
> +                                    (if (boundp 'org-export-latex-package-alist)
> +                                        org-export-latex-package-alist)
> +                                    forbuffer txt)))
> +                  linkfile (format "%s_%s.png" prefix hash)
> +                  movefile (format "%s_%s.png" absprefix hash))
>              (if msg (message msg cnt))
>              (goto-char beg)
>              (unless checkdir ; make sure the directory exists
> @@ -14592,8 +14591,9 @@ Some of the options can be changed using the variable
>                 "dvipng" "needed to convert LaTeX fragments to images")
>                (setq executables-checked t))
>
> -            (org-create-formula-image
> -             txt movefile opt forbuffer)
> +            (unless (file-exists-p movefile)
> +              (org-create-formula-image
> +               txt movefile opt forbuffer))
>              (if overlays
>                  (progn
>                    (mapc (lambda (o)
> --
> 1.6.4.73.gc144
>
>
> --=-=-=
>
> >
> > - Carsten
> >
> > On Nov 16, 2009, at 1:07 AM, Eric Schulte wrote:
> >
> >> Hi,
> >>
> >> The attached patch changes the latex fragment image generation so
> >> that it saves images into files named by the sha1 hash of the latex
> >> source code.  By checking for the existence of image files before
> >> image generation the regeneration of identical images is avoided.
> >>
> >> In practice I find that this greatly speeds up export to html and
> >> the `org-preview-latex-fragment' command.
> >>
> >> Cheers -- Eric
> >>
> >> From 13e1c48fa6cac43b0c87ca0fbc8e349f7a9fa864 Mon Sep 17 00:00:00 2001
> >> From: Eric Schulte
> >> Date: Sun, 15 Nov 2009 17:00:09 -0700
> >> Subject: [PATCH] latex fragment images cached using sha1 hash keys
> >>
> >> Latex fragment images are now saved in files named by the sha1 hash
> >> of the latex text used to create the image.  By checking if files
> >> exist before images generation the regeneration of identical latex
> >> images is avoided.
> >> ---
> >>  lisp/ChangeLog |    6 ++++++
> >>  lisp/org.el    |   18 +++++++-----------
> >>  2 files changed, 13 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/lisp/ChangeLog b/lisp/ChangeLog
> >> index 339f248..f18755c 100755
> >> --- a/lisp/ChangeLog
> >> +++ b/lisp/ChangeLog
> >> @@ -1,3 +1,9 @@
> >> +2009-11-16  Eric Schulte
> >> +
> >> +    * org.el (org-format-latex): Latex images are now saved to files
> >> +    named by the sha1 hash of the latex source text avoiding
> >> +    regeneration of identical images.
> >> +
> >>  2009-11-15  Carsten Dominik
> >>
> >>      * org-wl.el (org-wl-store-link): Handle the case that
> >> diff --git a/lisp/org.el b/lisp/org.el
> >> index bf6573b..46348fc 100644
> >> --- a/lisp/org.el
> >> +++ b/lisp/org.el
> >> @@ -14550,15 +14550,9 @@ Some of the options can be changed using the variable
> >>           (opt org-format-latex-options)
> >>           (matchers (plist-get opt :matchers))
> >>           (re-list org-latex-regexps)
> >> -         (cnt 0) txt link beg end re e checkdir
> >> +         (cnt 0) txt hash link beg end re e checkdir
> >>           executables-checked
> >>           m n block linkfile movefile ov)
> >> -    ;; Check if there are old images files with this prefix, and remove them
> >> -    (when (file-directory-p todir)
> >> -      (mapc 'delete-file
> >> -            (directory-files
> >> -             todir 'full
> >> -             (concat (regexp-quote prefixnodir) "_[0-9]+\\.png$"))))
> >>      ;; Check the different regular expressions
> >>      (while (setq e (pop re-list))
> >>        (setq m (car e) re (nth 1 e) n (nth 2 e)
> >> @@ -14576,9 +14570,10 @@ Some of the options can be changed using the variable
> >>              (setq txt (match-string n)
> >>                    beg (match-beginning n) end (match-end n)
> >>                    cnt (1+ cnt)
> >> -                  linkfile (format "%s_%04d.png" prefix cnt)
> >> -                  movefile (format "%s_%04d.png" absprefix cnt)
> >>                    link (concat block "[[file:" linkfile "]]" block))
> >> +            (setq hash (sha1 txt)
> >> +                  linkfile (format "%s_%s.png" prefix hash)
> >> +                  movefile (format "%s_%s.png" absprefix hash))
> >>              (if msg (message msg cnt))
> >>              (goto-char beg)
> >>              (unless checkdir ; make sure the directory exists
> >> @@ -14592,8 +14587,9 @@ Some of the options can be changed using the variable
> >>                 "dvipng" "needed to convert LaTeX fragments to images")
> >>                (setq executables-checked t))
> >>
> >> -            (org-create-formula-image
> >> -             txt movefile opt forbuffer)
> >> +            (unless (file-exists-p movefile)
> >> +              (org-create-formula-image
> >> +               txt movefile opt forbuffer))
> >>              (if overlays
> >>                  (progn
> >>                    (mapc (lambda (o)
> >> --
> >> 1.6.4.73.gc144
> >
> > - Carsten
>
> --=-=-=--

- Carsten
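
For readers following the thread, the caching scheme discussed above can be
sketched in a few lines of Emacs Lisp.  This is only an illustration under
stated assumptions, not the code that was applied: the my/... helper names
are hypothetical, `secure-hash' (Emacs 24 and later) stands in for the
`sha1' call used in the patch, and the 40-hex-digit file name pattern is a
stricter assumption than the blue_[[:alnum:]]+\.png expression quoted above.

```elisp
;;; Sketch only: every my/... name here is hypothetical, not part of org.el.

(defun my/latex-fragment-cache-key (txt header packages forbuffer)
  "Return a SHA-1 hex digest identifying one rendered fragment.
TXT is the LaTeX source.  HEADER, PACKAGES and FORBUFFER stand in for
the settings that also change the rendered image."
  ;; `prin1-to-string' turns the whole list into one string, so a change
  ;; to any element changes the digest.
  (secure-hash 'sha1 (prin1-to-string (list header packages forbuffer txt))))

(defun my/latex-fragment-image-file (prefix key)
  "Return the image file name built from PREFIX and the hash KEY."
  (format "%s_%s.png" prefix key))

(defun my/ensure-image (file render-fn)
  "Call RENDER-FN with FILE unless FILE already exists.
Return t when RENDER-FN ran and nil when the existing file was reused."
  (unless (file-exists-p file)
    (funcall render-fn file)
    t))

(defun my/stale-hash-images (dir name live-keys)
  "List images in DIR named NAME_<40 hex digits>.png not in LIVE-KEYS.
The files are only listed; deleting them is left to the caller."
  (let ((re (concat "\\`" (regexp-quote name)
                    "_\\([[:xdigit:]]\\{40\\}\\)\\.png\\'"))
        stale)
    (dolist (file (directory-files dir t re) (nreverse stale))
      (let ((base (file-name-nondirectory file)))
        (when (and (string-match re base)
                   (not (member (match-string 1 base) live-keys)))
          (push file stale))))))
```

Because the key covers the settings as well as the fragment text, the same
formula rendered for the buffer and for HTML export gets two different file
names, which is the issue raised in the review.  Listing stale files rather
than deleting them keeps the cleanup step, which nobody in the thread was
sure about, under the caller's control.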