emacs-orgmode@gnu.org archives
From: Carsten Dominik <carsten.dominik@gmail.com>
To: Eric Schulte <schulte.eric@gmail.com>
Cc: Org Mode <emacs-orgmode@gnu.org>
Subject: Re: [PATCH] sha1 hash of latex fragments to avoid regeneration
Date: Tue, 17 Nov 2009 14:14:15 +0100
Message-ID: <85C13EF4-5FE0-4E4F-94D1-6AB87565E2F4@gmail.com>
In-Reply-To: <m21vjy80tq.fsf@gmail.com>

Hi Eric,

looks great now, I have made a few minor changes and applied it.

- Carsten

On Nov 17, 2009, at 1:11 AM, Eric Schulte wrote:

> From: "Eric Schulte" <schulte.eric@gmail.com>
> To: Carsten Dominik <carsten.dominik@gmail.com>
> Cc: Org Mode <emacs-orgmode@gnu.org>
> Subject: Re: [Orgmode] [PATCH] sha1 hash of latex fragments to avoid regeneration
> Date: Mon, 16 Nov 2009 17:11:03 -0700
> References: <m2my2n8h8i.fsf@gmail.com>
> 	<FD51078D-BA1E-40F5-ABC7-4D226A336F53@gmail.com>
> Message-ID: <m21vjy80tq.fsf@gmail.com>
> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (darwin)
> MIME-Version: 1.0
> Content-Type: multipart/mixed; boundary="=-=-="
>
> --=-=-=
>
> Hi Carsten,
>
> Thanks for the feedback, I have comments inline below
>
> Carsten Dominik <carsten.dominik@gmail.com> writes:
>
> > Hi Eric,
> >
> > this is fantastic, thank you for implementing it.  I have wanted some
> > speedup for this for a long time.
> >
> > I think your implementation still suffers from one issue:
> >
> > The produced image also depends on the variables
> > org-format-latex-options, org-format-latex-header,
> > org-export-latex-package-alist, and on the `forbuffer' flag (because
> > images made for display in the buffer and for HTML export generally
> > need different resolutions).
> >
> > One way to deal with this would be to make a list containing the
> > values of these four variables, use prin1-to-string to convert this
> > list into a string, and then prepend this string to TXT when
> > creating the hash.
> >
>
> That sounds like the best solution.  I have made this change in the
> newly attached patch.
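
For readers following along, the key construction described above can be
sketched roughly as below.  This is only an illustration, not the code
that was applied: `my-latex-fragment-cache-key' is a made-up name, and
which settings get passed in is left to the caller.  It relies only on
the `sha1' and `prin1-to-string' functions already used in the patch.

```elisp
;; Sketch only: `my-latex-fragment-cache-key' is a hypothetical helper,
;; not part of Org and not part of the attached patch.
(defun my-latex-fragment-cache-key (txt forbuffer &rest settings)
  "Return a SHA-1 hex digest identifying the image rendered for TXT.
FORBUFFER and SETTINGS stand for everything else that influences
the rendered image (header, package list, options, ...)."
  ;; Printing the whole list gives one string in which every
  ;; ingredient contributes to the digest.
  (sha1 (prin1-to-string (list settings forbuffer txt))))

;; The same fragment gets a different key for in-buffer preview and
;; for export, so the two images do not overwrite each other.
(equal (my-latex-fragment-cache-key "$x^2$" t)
       (my-latex-fragment-cache-key "$x^2$" nil))
```

Any setting that is left out of the list will not invalidate the cached
image when it changes, so it is probably better to include too much
than too little.
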
>
> >
> > Now, I am sure that you are already planning to do the same
> > for ditaa images etc?
>
> of course :)
>
> > That would be a treat, because ditaa can be terribly slow for complex
> > figures, and this would speed up the cycle when writing a document by
> > quite a bit.
> >
>
> Dan and I have been working on a general caching solution for org-babel.
> Once we get that sorted out it should provide for the caching of all
> org-babel results which would include ditaa, dot, gnuplot, etc...
>
> I am currently more interested in making these changes in org-babel
> than in org-exp-blocks, but in this case it may be worth implementing
> caching in both.
>
> >
> > There is one further issue:  Cleaning up images that are no
> > longer used.
> >
> > With the LaTeX fragments it is not a big problem, because they
> > live in a special directory.  This would be a bigger concern for
> > ditaa images etc which tend to live in the same directory as the
> > source.  Maybe that could be solved by
> >
> > 1. Making sure that each image still has a name like "blue", so
> >    that the name now would be "blue_loooooonghashvalue.png" or so.
> > 2. Maybe creating a command that will look for orphaned images
> >    and remove them, by looking for the hash in the name and
> >    checking access times.  I am not sure if this is needed,
> >    and not sure what would be the best way to implement it.
> >
>
> Yes, this will not be an issue in the org-babel implementation as the
> hash key is stored separate from the file name, but I can see how this
> would need to be considered for any org-exp-blocks hash-based image
> caching.  The first option you propose above sounds very doable, as
> long as we are comfortable removing any files that match regular
> expressions like
>
>   blue_[[:alnum:]]+\.png
>
> which seems safe enough...
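
A rough sketch of such a cleanup is below.  It is an illustration under
my own assumptions, not something from the patch:
`my-orphaned-hash-images' is a hypothetical helper, and it narrows the
pattern above to exactly 40 lowercase hex digits (the length of a SHA-1
digest) so that a hand-made file such as "blue_sky.png" is not picked
up.  It only returns the candidates and leaves deletion to the caller.

```elisp
;; Sketch only: `my-orphaned-hash-images' is a hypothetical helper.
(defun my-orphaned-hash-images (dir name live-hashes)
  "Return files in DIR named NAME_<sha1>.png that are no longer in use.
LIVE-HASHES is a list of hash strings that are still referenced."
  (let ((re (concat "\\`" (regexp-quote name)
                    "_\\([0-9a-f]\\{40\\}\\)\\.png\\'"))
        orphans)
    ;; `directory-files' matches RE against the name without directory.
    (dolist (file (directory-files dir t re))
      (let ((base (file-name-nondirectory file)))
        (when (and (string-match re base)
                   (not (member (match-string 1 base) live-hashes)))
          (push file orphans))))
    (nreverse orphans)))
```

A caller that is sure about the result could then run
(mapc #'delete-file (my-orphaned-hash-images dir "blue" live-hashes)),
where `dir' and `live-hashes' are whatever the export code has
collected.
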
>
> >
> > After looking at these things, I would be *very* happy to accept
> > this patch.
> >
>
> I'll give this some more thought, but perhaps the latex image fragment
> patch is now viable.
>
> Best -- Eric
>
>
> --=-=-=
> Content-Type: text/x-patch
> Content-Disposition: inline;
>  filename=0001-latex-fragment-images-cached-using-sha1-hash-keys.patch
>
> From 0e9a359c1d5e8f67c20066533171fb1edc11ba61 Mon Sep 17 00:00:00 2001
> From: Eric Schulte <schulte.eric@gmail.com>
> Date: Mon, 16 Nov 2009 16:53:34 -0700
> Subject: [PATCH] latex fragment images cached using sha1 hash keys
>
>   Latex fragment images are now saved in files named by the sha1 hash
>   of the latex text used to create the image.  By checking if files
>   exist before images generation the regeneration of identical latex
>   images is avoided.
> ---
>  lisp/ChangeLog |    6 ++++++
>  lisp/org.el    |   22 +++++++++++-----------
>  2 files changed, 17 insertions(+), 11 deletions(-)
>
> diff --git a/lisp/ChangeLog b/lisp/ChangeLog
> index 5f83aaa..b581931 100755
> --- a/lisp/ChangeLog
> +++ b/lisp/ChangeLog
> @@ -1,3 +1,9 @@
> +2009-11-17  Eric Schulte  <schulte.eric@gmail.com>
> +
> +	* org.el (org-format-latex): Latex images are now saved to files
> +	named by the sha1 hash of the latex source text avoiding
> +	regeneration of identical images.
> +
>  2009-11-16  Carsten Dominik  <carsten.dominik@gmail.com>
>
>  	* org-html.el (org-export-html-home/up-format): Add an ID to the
> diff --git a/lisp/org.el b/lisp/org.el
> index bf6573b..15a8f9e 100644
> --- a/lisp/org.el
> +++ b/lisp/org.el
> @@ -14550,15 +14550,9 @@ Some of the options can be changed using the variable
>  	 (opt org-format-latex-options)
>  	 (matchers (plist-get opt :matchers))
>  	 (re-list org-latex-regexps)
> -	 (cnt 0) txt link beg end re e checkdir
> +	 (cnt 0) txt hash link beg end re e checkdir
>  	 executables-checked
>  	 m n block linkfile movefile ov)
> -    ;; Check if there are old images files with this prefix, and remove them
> -    (when (file-directory-p todir)
> -      (mapc 'delete-file
> -	    (directory-files
> -	     todir 'full
> -	     (concat (regexp-quote prefixnodir) "_[0-9]+\\.png$"))))
>      ;; Check the different regular expressions
>      (while (setq e (pop re-list))
>        (setq m (car e) re (nth 1 e) n (nth 2 e)
> @@ -14576,9 +14570,14 @@ Some of the options can be changed using the variable
>  	    (setq txt (match-string n)
>  		  beg (match-beginning n) end (match-end n)
>  		  cnt (1+ cnt)
> -		  linkfile (format "%s_%04d.png" prefix cnt)
> -		  movefile (format "%s_%04d.png" absprefix cnt)
>  		  link (concat block "[[file:" linkfile "]]" block))
> +            (setq hash (sha1 (prin1-to-string
> +                              (list org-format-latex-header
> +                                    (if (boundp 'org-export-latex-package-alist)
> +                                        org-export-latex-package-alist)
> +                                    forbuffer txt)))
> +		  linkfile (format "%s_%s.png" prefix hash)
> +		  movefile (format "%s_%s.png" absprefix hash))
>  	    (if msg (message msg cnt))
>  	    (goto-char beg)
>  	    (unless checkdir ; make sure the directory exists
> @@ -14592,8 +14591,9 @@ Some of the options can be changed using the variable
>  	       "dvipng" "needed to convert LaTeX fragments to images")
>  	      (setq executables-checked t))
>
> -	    (org-create-formula-image
> -	     txt movefile opt forbuffer)
> +            (unless (file-exists-p movefile)
> +              (org-create-formula-image
> +               txt movefile opt forbuffer))
>  	    (if overlays
>  		(progn
>  		  (mapc (lambda (o)
> -- 
> 1.6.4.73.gc144
>
>
> --=-=-=
>
>
> >
> > - Carsten
> >
> > On Nov 16, 2009, at 1:07 AM, Eric Schulte wrote:
> >
> >> Hi,
> >>
> >> The attached patch changes the latex fragment image generation so that
> >> it saves images into files named by the sha1 hash of the latex source
> >> code.  By checking for the existence of image files before image
> >> generation the regeneration of identical images is avoided.
> >>
> >> In practice I find that this greatly speeds up export to html and the
> >> `org-preview-latex-fragment' command.
> >>
> >> Cheers -- Eric
> >>
> >> From 13e1c48fa6cac43b0c87ca0fbc8e349f7a9fa864 Mon Sep 17 00:00:00 2001
> >> From: Eric Schulte <schulte.eric@gmail.com>
> >> Date: Sun, 15 Nov 2009 17:00:09 -0700
> >> Subject: [PATCH] latex fragment images cached using sha1 hash keys
> >>
> >>  Latex fragment images are now saved in files named by the sha1 hash
> >>  of the latex text used to create the image.  By checking if files
> >>  exist before images generation the regeneration of identical latex
> >>  images is avoided.
> >> ---
> >> lisp/ChangeLog |    6 ++++++
> >> lisp/org.el    |   18 +++++++-----------
> >> 2 files changed, 13 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/lisp/ChangeLog b/lisp/ChangeLog
> >> index 339f248..f18755c 100755
> >> --- a/lisp/ChangeLog
> >> +++ b/lisp/ChangeLog
> >> @@ -1,3 +1,9 @@
> >> +2009-11-16  Eric Schulte  <schulte.eric@gmail.com>
> >> +
> >> +	* org.el (org-format-latex): Latex images are now saved to files
> >> +	named by the sha1 hash of the latex source text avoiding
> >> +	regeneration of identical images.
> >> +
> >> 2009-11-15  Carsten Dominik  <carsten.dominik@gmail.com>
> >>
> >> 	* org-wl.el (org-wl-store-link): Handle the case that
> >> diff --git a/lisp/org.el b/lisp/org.el
> >> index bf6573b..46348fc 100644
> >> --- a/lisp/org.el
> >> +++ b/lisp/org.el
> >> @@ -14550,15 +14550,9 @@ Some of the options can be changed using the variable
> >> 	 (opt org-format-latex-options)
> >> 	 (matchers (plist-get opt :matchers))
> >> 	 (re-list org-latex-regexps)
> >> -	 (cnt 0) txt link beg end re e checkdir
> >> +	 (cnt 0) txt hash link beg end re e checkdir
> >> 	 executables-checked
> >> 	 m n block linkfile movefile ov)
> >> -    ;; Check if there are old images files with this prefix, and remove them
> >> -    (when (file-directory-p todir)
> >> -      (mapc 'delete-file
> >> -	    (directory-files
> >> -	     todir 'full
> >> -	     (concat (regexp-quote prefixnodir) "_[0-9]+\\.png$"))))
> >>     ;; Check the different regular expressions
> >>     (while (setq e (pop re-list))
> >>       (setq m (car e) re (nth 1 e) n (nth 2 e)
> >> @@ -14576,9 +14570,10 @@ Some of the options can be changed using the variable
> >> 	    (setq txt (match-string n)
> >> 		  beg (match-beginning n) end (match-end n)
> >> 		  cnt (1+ cnt)
> >> -		  linkfile (format "%s_%04d.png" prefix cnt)
> >> -		  movefile (format "%s_%04d.png" absprefix cnt)
> >> 		  link (concat block "[[file:" linkfile "]]" block))
> >> +            (setq hash (sha1 txt)
> >> +		  linkfile (format "%s_%s.png" prefix hash)
> >> +		  movefile (format "%s_%s.png" absprefix hash))
> >> 	    (if msg (message msg cnt))
> >> 	    (goto-char beg)
> >> 	    (unless checkdir ; make sure the directory exists
> >> @@ -14592,8 +14587,9 @@ Some of the options can be changed using the variable
> >> 	       "dvipng" "needed to convert LaTeX fragments to images")
> >> 	      (setq executables-checked t))
> >>
> >> -	    (org-create-formula-image
> >> -	     txt movefile opt forbuffer)
> >> +            (unless (file-exists-p movefile)
> >> +              (org-create-formula-image
> >> +               txt movefile opt forbuffer))
> >> 	    (if overlays
> >> 		(progn
> >> 		  (mapc (lambda (o)
> >> --
> >> 1.6.4.73.gc144
> >>
> >> _______________________________________________
> >> Emacs-orgmode mailing list
> >> Remember: use `Reply All' to send replies to the list.
> >> Emacs-orgmode@gnu.org
> >> http://lists.gnu.org/mailman/listinfo/emacs-orgmode
> >
> > - Carsten
>
> --=-=-=--

- Carsten

Thread overview: 13+ messages
2009-11-16  0:07 [PATCH] sha1 hash of latex fragments to avoid regeneration Eric Schulte
2009-11-16  6:57 ` Carsten Dominik
2009-11-17  0:11   ` Eric Schulte
2009-11-17  2:42     ` Eric Schulte
2009-11-17 13:21       ` Carsten Dominik
2009-11-17 15:24         ` Eric Schulte
2009-11-17 15:36           ` Carsten Dominik
2009-11-17 17:02             ` Eric Schulte
     [not found]               ` <m23a4dozic.fsf-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2009-12-02 10:35                 ` Francesco Pizzolante
2009-12-05 16:35                   ` Eric Schulte
     [not found]                     ` <yn43a3pqsyw.fsf-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2009-12-07 12:48                       ` Francesco Pizzolante
2009-12-23 15:17                         ` Eric Schulte
2009-11-17 13:14     ` Carsten Dominik [this message]
