emacs-orgmode@gnu.org archives
* Four issues with org-babel-tangle
@ 2011-09-14 19:51 Christopher Genovese
  2011-09-15 15:49 ` Eric Schulte
  0 siblings, 1 reply; 6+ messages in thread
From: Christopher Genovese @ 2011-09-14 19:51 UTC (permalink / raw)
  To: emacs-orgmode


[-- Attachment #1.1: Type: text/plain, Size: 25518 bytes --]

/Semi-verbose Preamble/.
Having recently begun intensive use of org-mode for tangling source
files, I encountered four issues related to comment extraction (two
bugs, one undesirable behavior, and one ... unfulfilled need), which I
describe in detail below. I started by creating an org file that would
reproduce the problems, and soon started /describing/ the problems in
the org file as well as putting my fixes in the source blocks. At the
risk of it being too meta or annoying, I've included that org file at
the end of this message as the problem description. All the details are
there as well as two fixes. Tangling that file in the various ways it
describes demonstrates the problems, and you can export to PDF for nicer reading.
(I've attached the PDF to this mail for convenience. It looks good;
kudos, org-mode!) I've also attached a tarball with files that make it
easy to try my changes and to restore the original behavior, as well as
tests and results from the org file for easy comparison. See the
included README.

I've been using the revised code now for a few days. It fixes the
problems I describe, and I think it provides a flexible framework for
comment extraction with minimal change to the base code. If the reaction
to this is positive, I will happily submit a patch, sign paperwork, or
whatever is needed, after fixing any problems that you all see. In any
case, I very much look forward to any feedback you can offer. Thanks.

  -- Christopher

P.S. In case the attachments get dropped, I've put the PDF and the
     tarball at

     http://www.stat.cmu.edu/~genovese/depot/tangle-error.pdf
     http://www.stat.cmu.edu/~genovese/depot/tangle-bundle.tgz

/Problem Description/
################ Cut Here ####################
# -*- org-return-follows-link: t; comment-empty-lines: t; -*-
#+TITLE: Tangle this file: four issues with org-babel-tangle
#+AUTHOR: Christopher Genovese
#+DATE: 14 Sep 2011\vspace*{-0.5cm}
#+EMAIL: genovese@cmu.edu
#+OPTIONS: toc:1 H:1
#+BABEL: :tangle yes :comments org :results silent :exports code
#+BIND: org-export-latex-hyperref-format "\\hyperlink{%s}{%s}"
#+STARTUP: showall
#+LATEX_HEADER: \usepackage[labelsep=period,labelfont=bf]{caption}
#+LaTeX: \vspace*{-1cm}\vspace*{-2pt}


* Four Related Issues with org-babel-tangle

  Running org-mode version 7.7 on both Gnu Emacs 23.2.1 and 24.0.50.1
  with Mac OS X 10.5.8 (with and without a -Q option), I encountered the
  following issues/problems/bugs when tangling files with code blocks
  for which the :comments header argument is org:

  1. The subtree associated with the very first code block has
     its headline tangled without leading stars, but all subsequent
     sub-trees associated with code blocks have the leading stars
     included in the comments.

  2. If the first code block comes before the first headline, the start
     of the comment text will be determined by pre-existing match data
     and thus will likely be incorrect.

  3. Org structural elements such as headline stars, =#+= control lines,
     and drawers are included in the comments.

  4. There is no easy way to delimit comment text or transform it that
     does not also change the structure of the org file (e.g., by adding
     headlines or source blocks).

  Issues 1 and 2 seem to be genuine bugs. Issues 3 and 4 are more
  subjective, I admit, but seem undesirable. Stars, drawers, and
  control lines are org structure rather than content, and they
  are often inappropriate in comments.

  To reproduce the behaviors for issues 1 and 3, look at the result of
  tangling this file. To reproduce issue 2 as well, remove the first two
  stars from this file and tangle again. Alternatively, within emacs,
  evaluate the =(buffer-substring ...)= sexp from the original-code block
  below at the beginning character of a source block. (You
  can also export this file to PDF for more pleasant reading.)

  Below, I give details on these issues and code for two fixes:
  a [[simple-fix][simple fix]] that handles the first two issues and the
  stars in the third,
  and a [[preferred-fix][better fix]] that handles all four issues in a more
  modular, customizable framework. I'd be interested in hearing
  feedback on all of this. If the reaction is positive, I will gladly
  submit a patch. Thanks for your consideration.



* The Original Code and Details on the Problem

  The relevant section of the original code from org-mode version 7.7
  is shown below, comprising lines 344 through 357
  of ob-tangle.el in function =org-babel-tangle-collect-blocks=. With point
  at a =#+begin_src=, it scans back either
  for a heading line or for the end of the previous code block,
  whichever comes later. The resulting region becomes the
  comment text.

  #+latex: \begin{figure}[h]
  #+latex: \hypertarget{original-code}{} % <<original-code>>
  #+source: original-code
  #+begin_src emacs-lisp
    (comment
     (when (or (string= "both" (cdr (assoc :comments params)))
               (string= "org" (cdr (assoc :comments params))))
       ;; from the previous heading or code-block end
       (buffer-substring
        (max (condition-case nil
                 (save-excursion
                   (org-back-to-heading t) (point))
               (error 0))
             (save-excursion
               (re-search-backward
                org-babel-src-block-regexp nil t)
               (match-end 0)))
        (point))))
  #+end_src
  #+latex: \caption{Original Code, lines 344--357 in {\tt ob-tangle.el}.}
  #+latex: \label{fig::original-code}
  #+latex: \end{figure}

  /Issue 1/. When in the first code block in the file, the
  second search fails (there is no previous code block),
  so the =(match-end 0)= call uses the match data from the (implicit)
  match during =org-back-to-heading=, which skips the stars. (Not a
  particularly transparent reference, incidentally.) For subsequent
  blocks, the =(match-end 0)= gives the end of the previous code block,
  which in these examples is earlier than the previous headline location.

  /Issue 2/. When the first code block lies before the first headline
  (say with some text before it), the searches fail in /both/ clauses of
  the max. So, the =match-end= will return an essentially arbitrary
  result, which is a bug.

  /Issue 3/. =org-back-to-heading= leaves point at the beginning of the
  stars, so a headline included in the text will have stars, except for
  the first one.

  /Issue 4/. Control lines between the end of the previous code block and
  point are not filtered out and so are included in the comments.
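
  As a minimal sketch of the mechanism behind issues 1 and 2 (not part
  of the proposed patch), the following shows that a failed search
  called with a non-nil NOERROR argument simply returns nil, after
  which =(match-end 0)= still answers from some earlier match. The
  buffer text and the name =tangle-demo-stale-match-end= are made up
  for illustration. The Emacs Lisp manual says a failing search may or
  may not alter the match data, so the stale value is typical rather
  than guaranteed.

  #+begin_src emacs-lisp :tangle no
    (defun tangle-demo-stale-match-end ()
      "Return (SEARCH-RESULT . MATCH-END) after a failed backward search."
      (with-temp-buffer
        (insert "* Heading\nSome text, no source block above.\n")
        (goto-char (point-min))
        ;; A successful match, standing in for the implicit match
        ;; performed inside `org-back-to-heading'.
        (looking-at "\\*+ ")
        (goto-char (point-max))
        ;; The search fails and returns nil, but `match-end' still
        ;; reports a position from whatever match data is left over.
        (cons (re-search-backward "^#\\+end_src" nil t)
              (match-end 0))))
  #+end_src

  Checking the return value of the search before consulting the match
  data avoids relying on leftover state.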


* A Simple Fix for the First Three Issues

  A small change addresses issues 1, 2, and the stars for issue 3: in
  both cases, simply use the =match-end= and replace 0 values with
  =(point-min)=. The latter gives a sensible result even if both
  computed positions are trivial (as when the first code block comes
  before the first headline) and respects narrowing.

  #+latex: \begin{figure}[h]
  #+latex: \hypertarget{simple-fix}{} % <<simple-fix>>
  #+begin_src emacs-lisp
    (comment
     (when (or (string= "both" (cdr (assoc :comments params)))
               (string= "org" (cdr (assoc :comments params))))
       ;; from the previous heading or code-block end
       (buffer-substring
        (max (condition-case nil
                 (save-excursion
                   (org-back-to-heading t)  ; sets match data
                   (match-end 0))
               (error (point-min)))
             (save-excursion
               (if (re-search-backward
                    org-babel-src-block-regexp nil t)
                   (match-end 0)
                 (point-min))))
        (point))))
  #+end_src
  #+latex: \caption{Simple Fix, replacement for lines 344--357 in {\tt ob-tangle.el}.}
  #+latex: \label{fig::simple-fix}
  #+latex: \end{figure}

* A Fix for All Four Issues

  A better fix that handles issues 1--4 starts with the region computed
  as in the [[simple-fix][simple fix]] and then processes that text
  through a user-configurable sequence of functions to derive the final
  form of the comment text.

  The following changes are required.

** Extract Initial Comment Text and State from Org Buffer

   The initial comment text ranges from either the most recent headline
   at the point after the stars, the beginning of the line after the
   =#+end_src= of the most recent code block, or the beginning of the
   buffer, whichever is later, through the line before the source
   block.[fn:1]

   The [[preferred-fix][code]] to extract this is given below.
   #+latex: (See Figure \ref{fig::preferred-fix}.)
   It replaces lines 344 through 357 of
   =ob-tangle.el= from org-mode version 7.7 in the function
   =org-babel-tangle-collect-blocks=.

   #+latex: \begin{figure}[h]
   #+latex: \hypertarget{preferred-fix}{} % <<preferred-fix>>
   #+begin_src emacs-lisp
     (comment
      (when (or (string= "both" (cdr (assoc :comments params)))
                (string= "org" (cdr (assoc :comments params))))
        (let* ((prev-heading
                (condition-case nil
                    (save-excursion
                      (org-back-to-heading t) ; sets match data
                      (match-end 0))
                  (error (point-min))))
               (end-of-prev-src-block
                (save-excursion
                  (if (null (re-search-backward
                             org-babel-src-block-regexp nil t))
                      (point-min)
                    (goto-char (match-end 0))
                    (forward-line 1)
                    (point))))
               (comment-start
                (max prev-heading end-of-prev-src-block))
               (comment-end
                (save-excursion
                  (forward-line 0)
                  (point)))
               (state
                (list (cons 'org-drawers
                            org-drawers)
                      (cons 'after-heading
                            (= comment-start prev-heading))
                      (cons 'first-line
                            (= comment-start (point-min))))))
          (org-babel-process-comment-text
           (buffer-substring comment-start comment-end) state))))
   #+end_src
   #+latex: \caption{Better Fix, replacement for lines 344--357 in {\tt ob-tangle.el}.}
   #+latex: \label{fig::preferred-fix}
   #+latex: \end{figure}

** Adjust =org-babel-spec-to-string=

   The comment block collected by the [[original-code][original code]]
   #+latex: (Figure \ref{fig::original-code})
   in =org-babel-tangle-collect-blocks= is further processed in \newline
   =org-babel-spec-to-string= to trim leading and trailing whitespace
   from the string. This was needed because spaces after a source block were
   included in the comment. In the revised code, however, this space
   trimming is handled during text transformation, except for removing
   trailing newlines. (Note: trailing /spaces/ are not removed to allow
   more flexibility in comment processing.) Hence,
   =org-babel-spec-to-string= needs to be slightly adjusted.
   #+latex: See Figure \ref{fig::spec-string-diff}.
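
   To make the difference concrete, here is a small sketch contrasting
   a full trim with a chomp that removes only trailing non-space
   whitespace. It deliberately avoids calling =org-babel-trim= and
   =org-babel-chomp= so that it runs on its own; the two =tangle-demo-=
   functions are hypothetical stand-ins that only approximate their
   behavior.

   #+begin_src emacs-lisp :tangle no
     (defun tangle-demo-full-trim (text)
       "Strip whitespace from both ends of TEXT (stand-in for a full trim)."
       (replace-regexp-in-string
        "\\`[ \t\n\r\f\v]+\\|[ \t\n\r\f\v]+\\'" "" text))

     (defun tangle-demo-chomp-newlines (text)
       "Strip trailing newlines, tabs, and the like from TEXT, keeping spaces."
       (replace-regexp-in-string "[\f\t\n\r\v]+\\'" "" text))

     ;; The full trim discards the indentation and trailing spaces;
     ;; the chomp leaves both for later transforms to handle.
     (list (tangle-demo-full-trim "  a b  \n\n")
           (tangle-demo-chomp-newlines "  a b  \n\n"))
   #+end_src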

   #+latex: \begin{figure}[h]
   #+begin_example
     --- ob-tangle.el           2011-09-14 11:48:26.000000000 -0400
     +++ new-ob-tangle.el       2011-09-14 11:55:56.000000000 -0400
     @@ -398,3 +398,3 @@
          (flet ((insert-comment (text)
     -            (let ((text (org-babel-trim text)))
     +            (let ((text (org-babel-chomp text "[\f\t\n\r\v]")))
              (when (and comments (not (string= comments "no"))
   #+end_example
   #+latex: \caption{Changes to {\tt org-babel-spec-to-string} in {\tt ob-tangle.el}, unified diff, one line of context}
   #+latex: \label{fig::spec-string-diff}
   #+latex: \end{figure}

** Process Comment Text Through Sequence of Transforms

   At the end of the revised [[preferred-fix][comment collection code]],
   the comment text is passed to
   =org-babel-process-comment-text= which
   applies a sequence of transformation functions.
   #+latex: (See Figure \ref{fig::comment-transformation}.)
   The list of transformation functions is stored in a customizable
   variable described [[Define Customization Variable for Transforms][below]].
   Several predefined transformations are
   given [[Define A Collection of Transform Functions][below]] as well.

   #+latex: \begin{figure}[h]
   #+begin_src emacs-lisp
     (defun org-babel-process-comment-text (text &optional state)
       "Apply list of transforms to comment TEXT assuming bindings in alist STATE.
     Returns the modified text string, which may have text properties.
     See `org-babel-comment-processing-functions' for the transforms to be
     applied and details on the allowed keys in the STATE alist."
       (let ((funcs org-babel-comment-processing-functions))
         (with-temp-buffer
           (insert text)
           (let ((org-drawers
                  (or (cdr (assoc 'org-drawers state))
                      org-drawers))
                 (after-heading
                  (cdr (assoc 'after-heading state)))
                 (first-line
                  (cdr (assoc 'first-line state))))
             (while funcs
               (goto-char (point-min))
               (funcall (car funcs))
               (setq funcs (cdr funcs))))
           (buffer-substring (point-min) (point-max)))))
   #+end_src
   #+latex: \caption{Better Fix, comment transformation driver.}
   #+latex: \label{fig::comment-transformation}
   #+latex: \end{figure}

** Define Customization Variable for Transforms

   The transforms are stored in a customizable list of nullary
   functions that are applied in order to the comment text. The text
   is inserted in a temporary buffer, so these functions can use
   the entire Emacs library for operating on buffer text.
   #+latex: See Figure \ref{fig::comment-transformation-function-list}.

   #+latex: \begin{figure}[h]
   #+begin_src emacs-lisp
     (defcustom org-babel-comment-processing-functions
       '(org-babel-comment-delete-file-variables-line
         org-babel-comment-delete-org-control-lines
         org-babel-comment-delete-drawers
         org-babel-comment-trim-blank-lines
         org-babel-comment-trim-indent-prefix)
       "List of functions to transform source-block comment text before insertion.
     Each function will be called with no arguments with point at the
     beginning of a buffer containing only the comment text. Each
     function can modify the text at will and leave point anywhere,
     but it should *not* modify the narrowing state of the buffer.
     Several dynamic state variables are set prior to execution that
     each function can reference. These currently include:

        + org-drawers:   names of drawers in the original org buffer.
        + after-heading: t if comment starts at an org heading line,
                         nil otherwise.
        + first-line:    t if initial comment starts on first line
                         of the original org buffer, nil otherwise.

     If a function changes the value of these state variables, the new
     value will be seen by all following functions in the list, but
     this is not generally recommended.

     The functions in this list are called *in order*, and this order
     can influence the form of the resulting comment text."
       :group 'org-babel
       :type '(repeat function))
   #+end_src
   #+latex: \caption{Better Fix, customizable transform list.}
   #+latex: \label{fig::comment-transformation-function-list}
   #+latex: \end{figure}

** Define A Collection of Transform Functions

   An advantage of this design is that transformation of the
   comments is modular and customizable. We can include in
   =ob-tangle.el= a collection of pre-defined transforms.
   The default processing stream in =org-babel-comment-processing-functions=
   is as follows:

   1. Delete the file-variables line if it is on the first line of the buffer.
   2. Delete all org control lines from the comment text.
   3. Delete all drawers and their contents.
   4. Trim blank lines from the beginning and end.
   5. Reindent the text by removing the longest common leading
      string of spaces.

   #+ TANGLE: end-comment
   These and several other useful transforms are given below
   (e.g., deleting drawer delimiters but not contents).
   #+latex: See Figures \ref{fig::transformA}--\ref{fig::transformZ}.
   It is easy to define new transforms; any function that
   operates on text in the current buffer beginning at point-min
   will work.
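
   As an illustration of how little a new transform needs, here is a
   sketch of one that strips trailing whitespace from each line. Both
   function names are hypothetical and not part of the proposed
   patch; =tangle-demo-apply= is a stripped-down stand-in for the
   driver that lets the transform be tried in isolation.

   #+begin_src emacs-lisp :tangle no
     (defun tangle-demo-delete-trailing-whitespace ()
       "Hypothetical transform: strip trailing spaces and tabs from each line."
       (goto-char (point-min))
       (while (re-search-forward "[ \t]+$" nil t)
         (replace-match "")))

     (defun tangle-demo-apply (text transform)
       "Insert TEXT in a temporary buffer, call TRANSFORM, return the result."
       (with-temp-buffer
         (insert text)
         (goto-char (point-min))
         (funcall transform)
         (buffer-string)))

     ;; Trailing blanks are removed; line breaks are kept.
     (tangle-demo-apply "first line  \nsecond\t\n"
                        'tangle-demo-delete-trailing-whitespace)
   #+end_src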

   #+latex: \begin{figure}[h]
   #+begin_src emacs-lisp
     (defun org-babel-comment-delete-file-variables-line ()
       "Delete file variables comment line if at beginning of buffer.
     This only checks the first line of the buffer, and so should be
     placed first (or at least early enough) in the list
     `org-babel-comment-processing-functions' to ensure that no
     other text has been inserted earlier."
       (when (and first-line
                  (looking-at ; file-variables line
                   "^#[ \t]*-\\*-.*:.*;[ \t]*-\\*-[ \t]*$"))
         ;; Delete the whole line without touching the kill ring.
         (delete-region (point) (progn (forward-line 1) (point)))))
   #+end_src
   #+latex: \caption{Comment Transform.}
   #+latex: \label{fig::transformA}
   #+latex: \end{figure}

   #+latex: \begin{figure}[h]
   #+begin_src emacs-lisp
     (defun org-babel-comment-delete-org-control-lines ()
       "Remove all org #+ control lines from comment."
       (let ((control-regexp "^[ \t]*#\\+.*$"))
         (delete-matching-lines control-regexp)))
   #+end_src
   #+latex: \caption{Comment Transform.}
   #+latex: \end{figure}

   #+latex: \begin{figure}[h]
   #+begin_src emacs-lisp
     (defun org-babel-comment-delete-org-in-buffer-settings ()
       "Remove all org #+ in-buffer setting lines, leaving other control lines.
     In-buffer setting lines begin with #+ and have all caps keyword
     names."
       (let ((setting-regexp "^#\\+[ \t]*[A-Z_]+:.*$"))
         (delete-matching-lines setting-regexp)))
   #+end_src
   #+latex: \caption{Comment Transform.}
   #+latex: \end{figure}

   #+latex: \begin{figure}[h]
   #+begin_src emacs-lisp
     (defun org-babel-comment-delete-drawers ()
       "Delete drawer delimiters and contents from comment.
     Drawer names are restricted to those in the `org-drawers' state."
       (let ((drawer-start-regexp
              (format "^[ \t]*:\\(?:%s\\):[ \t]*$"
                      (mapconcat 'identity
                                 org-drawers
                                 "\\|")))
             (drawer-end-regexp "^[ \t]*:END:[ \t]*$"))
         (while (re-search-forward drawer-start-regexp nil t)
           (let ((beg (save-excursion
                        (forward-line 0)
                        (point)))
                 (end (save-excursion
                        (re-search-forward drawer-end-regexp nil t)
                        (forward-line 1)
                        (point))))
             (goto-char end)
             (delete-region beg end)))))
   #+end_src
   #+latex: \caption{Comment Transform.}
   #+latex: \end{figure}
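
   For reference, the sketch below shows the regexp that the
   =format= / =mapconcat= combination produces for a given drawer
   list. The function name and the sample drawer names are chosen
   only for illustration.

   #+begin_src emacs-lisp :tangle no
     (defun tangle-demo-drawer-start-regexp (drawers)
       "Return a regexp matching the opening line of any drawer in DRAWERS."
       (format "^[ \t]*:\\(?:%s\\):[ \t]*$"
               (mapconcat 'identity drawers "\\|")))

     (let ((re (tangle-demo-drawer-start-regexp '("PROPERTIES" "LOGBOOK"))))
       (list (string-match re "  :LOGBOOK:")   ; an opening line matches
             (string-match re ":END:")))       ; the closing line does not
   #+end_src

   Drawer names are inserted into the regexp as they are; if a name
   could contain regexp special characters, passing each one through
   =regexp-quote= first would be a safer variant.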

   #+latex: \begin{figure}[h]
   #+begin_src emacs-lisp
     (defun org-babel-comment-delete-drawer-delimiters ()
       "Delete drawer delimiters from comment leaving content.
     Drawer names are restricted to those given by the `org-drawers'
     state."
       (let ((drawer-delim-regexp
              (format "^[ \t]*:\\(?:%s\\):[ \t]*$"
                      (mapconcat 'identity
                                 (cons "END" org-drawers)
                                 "\\|"))))
         (delete-matching-lines drawer-delim-regexp)))
   #+end_src
   #+latex: \caption{Comment Transform.}
   #+latex: \end{figure}

   #+latex: \begin{figure}[h]
   #+begin_src emacs-lisp
     (defun org-babel-comment-trim-blank-lines ()
       "Trim whitespace-only lines from beginning and end of text."
       (while (and (looking-at "^[ \t\f]*$")
                   (< (point) (point-max)))
         (forward-line 1))
       (delete-region (point-min) (point))
       (when (< (point) (point-max))
         (goto-char (point-max))
         (let ((last-point (point)))
           (forward-line 0)
           (while (and (looking-at "^[ \t\f]*$")
                       (> (point) (point-min)))
             (setq last-point (point))
             (forward-line -1))
           (delete-region last-point (point-max)))))
   #+end_src
   #+latex: \caption{Comment Transform.}
   #+latex: \end{figure}

   #+latex: \begin{figure}[h]
   #+begin_src emacs-lisp
     (defun org-babel-comment-trim-indent-prefix ()
       "Remove longest common leading prefix of spaces from each line of TEXT.
     Prefix is computed from the initial whitespace on each line with
     tabs converted to spaces, preserving indentation."
       (let* ((common-indent nil)
              (common-length (1+ (- (point-max) (point-min))))
              (current-indent "")                   ; enter first loop
              (current-length common-length))       ; skip first assignment
         (goto-char (point-min))
         (while current-indent
           (when (< current-length common-length)
             (setq common-indent current-indent
                   common-length current-length))
           (setq current-indent
                 (let* ((found (re-search-forward "^\\([ \t]*\\)\\S-" nil t))
                        (bol (match-beginning 0))
                        (eos (match-end 1))
                        (space-str (match-string 1))
                        (indent-tabs-mode nil))
                   (cond
                    ((not found)
                     nil)
                    ((not (string-match "\t" space-str))
                     space-str)
                    (t                       ; detabify indent string
                     (goto-char eos)
                     (let ((col (current-column)))
                       (delete-region bol eos)
                       (indent-to col))
                     (buffer-substring-no-properties bol (point))))))
           (setq current-length (length current-indent)))
         (when (and common-indent (> common-length 0))
           (let ((indent-re (concat "^" common-indent)))
             (goto-char (point-min))
             (while (re-search-forward indent-re nil t)
               (replace-match "" nil nil))))))
   #+end_src
   #+latex: \caption{Comment Transform.}
   #+latex: \label{fig::transformZ}
   #+latex: \end{figure}
   #+latex: \end{itemize}

   #+latex: \noindent
   This kind of customization offers some nice possibilities,
   including controlling indentation, eliminating or
   transforming org markup, eliminating trailing whitespace, and
   automating specialized comment formatting (e.g., javadoc). As
   an additional illustration, consider the transform
   =org-babel-comment-restrict-comment-range=
   #+latex: in Figure \ref{fig::transform-illustration}
   below. The idea is that it is sometimes useful to select
   from the text under a headline a /part/ of the text for
   the comment. We want some org markup that will not affect
   either the export or the structure of the org file itself.
   To do this, we use the fact that =#+=\textvisiblespace
   lines are not exported.[fn:2] So, we can /de facto/ use the
   =#+ TANGLE:= construct to control various aspects of tangling.
   Here, we use the =#+ TANGLE: start-comment= and
   =#+ TANGLE: end-comment= to delimit the comment text.
   (This function needs to come earlier in the function list than the
   functions that eliminate org control lines. It is sufficient to
   prepend it to that list.) This is used in the current file,
   for example.
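
   Prepending could be done along the following lines. This is only a
   sketch: it assumes the proposed variable and transform have been
   added to =ob-tangle.el=, which is not the case in org-mode 7.7.

   #+begin_src emacs-lisp :tangle no
     ;; Assumes `org-babel-comment-processing-functions' is defined in
     ;; ob-tangle.el, as proposed above; `add-to-list' prepends by default.
     (eval-after-load 'ob-tangle
       '(add-to-list 'org-babel-comment-processing-functions
                     'org-babel-comment-restrict-comment-range))
   #+end_src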

   #+latex: \begin{figure}[h]
   #+begin_src emacs-lisp
     (defun org-babel-comment-restrict-comment-range ()
       "Remove all comment text outside start-comment and end-comment delimiters.
     Comment delimiters are #+TANGLE lines with respective keywords
     start-comment and end-comment. The #+TANGLE lines are also
     deleted. To be effective, this function should be positioned in
     the list `org-babel-comment-processing-functions' before any
     functions that remove org control lines or process other
     co-occurring attributes of #+TANGLE lines."
       (when (re-search-forward "^[ \t]*#\\+[ \t]*TANGLE:.*start-comment.*$" nil t)
         (forward-line 1)
         (delete-region (point-min) (point)))
       (when (re-search-forward "^[ \t]*#\\+[ \t]*TANGLE:.*end-comment.*$" nil t)
         (forward-line 0)
         (delete-region (point) (point-max))))
   #+end_src
   #+latex: \caption{Transform to illustrate some customization possibilities.}
   #+latex: \label{fig::transform-illustration}
   #+latex: \end{figure}
   #+latex: \begin{itemize}

[fn:1] In the original code and in the simple fix above, the comment
starts /immediately/ after the =#+end_src= rather than at the start of
the next line. Starting at the next line seems more natural to me
because the comment being constructed relates to the /following/ code
block. But the original behavior is easily restored if people disagree.

[fn:2] A feature request: I would propose that the =#+tangle:= construct
be recognized as non-exported even with spaces preceding the =#= and no
spaces after the =+=. This would enable a variety of interesting
customization for tangled comments. Alternatively, a generic construct
such as =#+noop:= or =#+generic:= could be valuable for user-defined
tags in an org file that serve a similar purpose: allowing customized
processing without being exported directly.

[-- Attachment #1.2: Type: text/html, Size: 28429 bytes --]

[-- Attachment #2: tangle-error.pdf --]
[-- Type: application/pdf, Size: 254903 bytes --]

[-- Attachment #3: tangle-bundle.tgz --]
[-- Type: application/x-gzip, Size: 278048 bytes --]


* Re: Four issues with org-babel-tangle
  2011-09-14 19:51 Four issues with org-babel-tangle Christopher Genovese
@ 2011-09-15 15:49 ` Eric Schulte
  2011-09-15 21:36   ` Christopher Genovese
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Schulte @ 2011-09-15 15:49 UTC (permalink / raw)
  To: Christopher Genovese; +Cc: emacs-orgmode

Hi Christopher,

Thank you for the thorough examples and for suggesting fixes.  I would
like to apply your "simple fix" immediately, and the resulting patch
should be small enough (less than 10 lines of changes) that it can be
applied without FSF assignment -- although I would encourage you to
begin the FSF assignment process if you anticipate potentially
contributing more fixes in the future.  Could you please send a git
format-patch version of the simple fix to the list so that I might apply
it?

I like the idea of introducing a customizable function for comment text
transformation; however, I'm not sure that the temporary-buffer mechanics
need to be included by default. Perhaps we should just leave the
default value of this function as simple as possible and allow users to
customize it to be as simple or complex as they wish. Perhaps a change
like the following, where the call to `org-babel-trim' in
`org-babel-spec-to-string' is removed.

#+begin_src emacs-lisp
  (buffer-substring
   (max (condition-case nil
            (save-excursion
              (org-back-to-heading t) (point))
          (error 0))
        (save-excursion
          (re-search-backward
           org-babel-src-block-regexp nil t)
          (match-end 0)))
   (point))
  
  ;; | becomes
  ;; v
  
  (funcall org-babel-process-comment-text
   (buffer-substring
    (max (condition-case nil
             (save-excursion
               (org-back-to-heading t) (point))
           (error 0))
         (save-excursion
           (re-search-backward
            org-babel-src-block-regexp nil t)
           (match-end 0)))
    (point)))
  
  ;; where
  
  (defcustom org-babel-process-comment-text #'org-babel-trim
    "Customizable function for processing comment text."
    :group 'org-babel
    :type 'function)
#+end_src

This change may end up being more than 10 lines long, but a patch would
still be welcome. Otherwise, if the solution I sketched out above sounds
reasonable, I could compose a patch and then share it with you for double
checking before it is applied.
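
As a rough sketch of how a user might then opt into heavier processing
under this single-function design: the function name below is
hypothetical, and the example assumes the customizable variable
sketched above exists and is called with the comment text as its only
argument.

#+begin_src emacs-lisp
  (defun tangle-demo-strip-control-lines (text)
    "Return TEXT with org #+ control lines removed."
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (delete-matching-lines "^[ \t]*#\\+")
      (buffer-string)))

  ;; Assumed variable from the sketch above; not present in org-mode 7.7.
  (setq org-babel-process-comment-text #'tangle-demo-strip-control-lines)
#+end_src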

Finally I'm not sure I fully understand what you mean by

>
> [fn:2] A feature request: I would propose that the =#+tangle:= construct
> be recognized as non-exported even with spaces preceding the =#= and no
> spaces after the =+=. This would enable a variety of interesting
> customization for tangled comments. Alternatively, a generic construct
> such as =#+noop:= or =#+generic:= could be a valuable for user-based
> tags in an org file that serves a similar purpose -- allow customized
> processing without directly being exported.
>

Please do let me know if I missed anything, it was a long email.

Thanks for contributing! -- Eric

Christopher Genovese <genovese.cr@gmail.com> writes:

> /Semi-verbose Preamble/.
> Having recently begun intensive use of org-mode for tangling source
> files, I encountered four issues related to comment extraction (two
> bugs, one undesirable behavior, and one ... unfulfilled need), which I
> describe in detail below. I started by creating an org file that would
> reproduce the problems, and soon started /describing/ the problems in
> the org file as well as putting my fixes in the source blocks. At the
> risk of it being too meta or annoying, I've included that org file at
> the end of this message as the problem description. All the details are
> there as well as two fixes. Tangling that file in various ways described
> demonstrates the problems, and you can export to PDF for nicer reading.
> (I've attached the PDF to this mail for convenience. It looks good;
> kudos, org-mode!) I've also attached a tarball with files that make it
> easy to try my changes and to restore the original behavior, as well as
> tests and results from the org file for easy comparison. See the
> included README.
>
> I've been using the revised code now for a few days. It fixes the
> problems I describe, and I think it provides a flexible framework for
> comment extraction with minimal change to the base code. If the reaction
> to this is positive, I will happily submit a patch, sign paperwork, or
> whatever is needed, after fixing any problems that you all see. In any
> case, I very much look forward to any feedback you can offer. Thanks.
>
>   -- Christopher
>
> P.S. In case the attachments get dropped, I've put the PDF and the
>      tarball at
>
>      http://www.stat.cmu.edu/~genovese/depot/tangle-error.pdf
>      http://www.stat.cmu.edu/~genovese/depot/tangle-bundle.tgz
>
> /Problem Description/
> ################ Cut Here ####################
> # -*- org-return-follows-link: t; comment-empty-lines: t; -*-
> #+TITLE: Tangle this file: four issues with org-babel-tangle
> #+AUTHOR: Christopher Genovese
> #+DATE: 14 Sep 2011\vspace*{-0.5cm}
> #+EMAIL: genovese@cmu.edu
> #+OPTIONS: toc:1 H:1
> #+BABEL: :tangle yes :comments org :results silent :exports code
> #+BIND: org-export-latex-hyperref-format "\\hyperlink{%s}{%s}"
> #+STARTUP: showall
> #+LATEX_HEADER: \usepackage[labelsep=period,labelfont=bf]{caption}
> #+LaTeX: \vspace*{-1cm}\vspace*{-2pt}
>
>
> * Four Related Issues with org-babel-tangle
>
>   Running org-mode version 7.7 on both Gnu Emacs 23.2.1 and 24.0.50.1
>   with Mac OS X 10.5.8 (with and without a -Q option), I encountered the
>   following issues/problems/bugs when tangling files with code blocks
>   for which the :comment header argument is org:
>
>   1. The subtree associated with the very first code block has
>      its headline tangled without leading stars, but all subsequent
>      sub-trees associated with code blocks have the leading stars
>      included in the comments.
>
>   2. If the first code block comes before the first headline, the start
>      of the comment text will be determined by pre-existing match data
>      and thus will likely be incorrect.
>
>   3. Org structural elements such as headline stars, =#+= control lines,
>      and drawers are included in the comments.
>
>   4. There is no easy way to delimit comment text or transform it that
>      does not also change the structure of the org file (e.g., by adding
>      headlines or source blocks).
>
>   Issues 1 and 2 seem to be genuine bugs. Issues 3 and 4 are more
>   subjective, I admit, but seem undesirable. Stars, drawers, and
>   control lines are org structure rather than content, and they
>   are often inappropriate in comments.
>
>   To reproduce the behaviors for issues 1 and 3, look at the result of
>   tangling this file. To reproduce issue 2 as well, remove the first two
>   stars from this file and tangle again. Alternatively, within emacs,
>   evaluate the =(buffer-substring ...)= sexp from the original-code block
>   below at the beginning character of a source block. (You
>   can also export this file to PDF for more pleasant reading.)
>
>   Below, I give details on these issues and code for two fixes:
>   a [[simple-fix][simple fix]] that handles the first two issues and the
>   stars in the third, and a [[preferred-fix][better fix]] that handles
>   all four issues in a more modular, customizable framework. I'd be
>   interested in hearing feedback on all of this. If the reaction is
>   positive, I will gladly submit a patch. Thanks for your consideration.
>
>
>
> * The Original Code and Details on the Problem
>
>   The relevant section of the original code from org-mode version 7.7
>   is shown below, comprising lines 344 through 357
>   of ob-tangle.el in function =org-babel-tangle-collect-blocks=. With point
>   at a =#+begin_src=, it scans back either
>   for a heading line or for the end of the previous code block,
>   whichever comes later. The resulting region becomes the
>   comment text.
>
>   #+latex: \begin{figure}[h]
>   #+latex: \hypertarget{original-code}{} % <<original-code>>
>   #+source: original-code
>   #+begin_src emacs-lisp
>     (comment
>      (when (or (string= "both" (cdr (assoc :comments params)))
>                (string= "org" (cdr (assoc :comments params))))
>        ;; from the previous heading or code-block end
>        (buffer-substring
>         (max (condition-case nil
>                  (save-excursion
>                    (org-back-to-heading t) (point))
>                (error 0))
>              (save-excursion
>                (re-search-backward
>                 org-babel-src-block-regexp nil t)
>                (match-end 0)))
>         (point))))
>   #+end_src
>   #+latex: \caption{Original Code, lines 344--357 in {\tt ob-tangle.el}.}
>   #+latex: \label{fig::original-code}
>   #+latex: \end{figure}
>
>   /Issue 1/. When in the first code block in the file, the second
>   search fails (there is no previous code block), so the
>   =(match-end 0)= call uses the match data from the (implicit) match
>   during =org-back-to-heading=, which skips the stars. (Not a
>   particularly transparent reference, incidentally.) For subsequent
>   blocks, the =(match-end 0)= gives the end of the previous code block,
>   which in these examples is earlier than the previous headline location.
>
>   /Issue 2/. When the first code block lies before the first headline
>   (say with some text before it), the searches fail in /both/ clauses of
>   the max. So, the =match-end= will return an essentially arbitrary
>   result, which is a bug.
>
>   /Issue 3/. =org-back-to-heading= leaves point at the beginning of the
>   stars, so a headline included in the text will have stars, except for
>   the first one.
>
>   /Issue 4/. Control lines between the end of the previous code block
>   and point are not filtered out and so are included in the comments.
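
The stale match data behind issues 1 and 2 can be avoided by reading
`match-end' only after a search that actually succeeded. A minimal
sketch, assuming a hypothetical helper name `my/match-end-or-point-min'
that is not part of org-mode:

```emacs-lisp
;; Hypothetical helper for illustration; not part of ob-tangle.el.
(defun my/match-end-or-point-min (regexp)
  "Return the end of the nearest match for REGEXP before point.
If nothing matches, return `point-min' instead of consulting match
data left behind by some earlier, unrelated search."
  (save-excursion
    (if (re-search-backward regexp nil t)
        (match-end 0)
      (point-min))))
```

Both arguments of the `max' in the simple fix follow this pattern, so
neither one depends on what an earlier search happened to match.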
>
>
> * A Simple Fix for the First Three Issues
>
>   A small change addresses issues 1, 2, and the stars for issue 3: in
>   both cases, simply use the =match-end= and replace 0 values with
>   =(point-min)=. The latter gives a sensible result even if both
>   computed positions are trivial (as when the first code block comes
>   before the first headline) and respects narrowing.
>
>   #+latex: \begin{figure}[h]
>   #+latex: \hypertarget{simple-fix}{} % <<simple-fix>>
>   #+begin_src emacs-lisp
>     (comment
>      (when (or (string= "both" (cdr (assoc :comments params)))
>                (string= "org" (cdr (assoc :comments params))))
>        ;; from the previous heading or code-block end
>        (buffer-substring
>         (max (condition-case nil
>                  (save-excursion
>                    (org-back-to-heading t)  ; sets match data
>                    (match-end 0))
>                (error (point-min)))
>              (save-excursion
>                (if (re-search-backward
>                     org-babel-src-block-regexp nil t)
>                    (match-end 0)
>                  (point-min))))
>         (point))))
>   #+end_src
>   #+latex: \caption{Simple Fix, replacement for lines 344--357 in {\tt ob-tangle.el}.}
>   #+latex: \label{fig::simple-fix}
>   #+latex: \end{figure}
>
> * A Fix for All Four Issues
>
>   A better fix that handles issues 1--4 starts with the region computed
>   as in the [[simple-fix][simple fix]] and then processes that text
>   through a user-configurable sequence of functions to derive the final
>   form of the comment text.
>
>   The following changes are required.
>
> ** Extract Initial Comment Text and State from Org Buffer
>
>    The initial comment text ranges from either the most recent headline
>    at the point after the stars, the beginning of the line after the
>    =#+end_src= of the most recent code block, or the beginning of the
>    buffer, whichever is later, through the line before the source
>    block.[fn:1]
>
>    The [[preferred-fix][code]] to extract this is given below.
>    #+latex: (See Figure \ref{fig::preferred-fix}.)
>    It replaces lines 344 through 357 of
>    =ob-tangle.el= from org-mode version 7.7 in the function
>    =org-babel-tangle-collect-blocks=.
>
>    #+latex: \begin{figure}[h]
>    #+latex: \hypertarget{preferred-fix}{} % <<preferred-fix>>
>    #+begin_src emacs-lisp
>      (comment
>       (when (or (string= "both" (cdr (assoc :comments params)))
>                 (string= "org" (cdr (assoc :comments params))))
>         (let* ((prev-heading
>                 (condition-case nil
>                     (save-excursion
>                       (org-back-to-heading t) ; sets match data
>                       (match-end 0))
>                   (error (point-min))))
>                (end-of-prev-src-block
>                 (save-excursion
>                   (if (null (re-search-backward
>                              org-babel-src-block-regexp nil t))
>                       (point-min)
>                     (goto-char (match-end 0))
>                     (forward-line 1)
>                     (point))))
>                (comment-start
>                 (max prev-heading end-of-prev-src-block))
>                (comment-end
>                 (save-excursion
>                   (forward-line 0)
>                   (point)))
>                (state
>                 (list (cons 'org-drawers
>                             org-drawers)
>                       (cons 'after-heading
>                             (= comment-start prev-heading))
>                       (cons 'first-line
>                             (= comment-start (point-min))))))
>           (org-babel-process-comment-text
>            (buffer-substring comment-start comment-end) state))))
>    #+end_src
>    #+latex: \caption{Better Fix, replacement for lines 344--357 in {\tt ob-tangle.el}.}
>    #+latex: \label{fig::preferred-fix}
>    #+latex: \end{figure}
>
> ** Adjust =org-babel-spec-to-string=
>
>    The comment block collected by the [[original-code][original code]]
>    #+latex: (Figure \ref{fig::original-code})
>    in =org-babel-tangle-collect-blocks= is further processed in \newline
>    =org-babel-spec-to-string= to trim leading and trailing whitespace
>    from the string. This was needed because spaces after a source block were
>    included in the comment. In the revised code, however, this space
>    trimming is handled during text transformation, except for removing
>    trailing newlines. (Note: trailing /spaces/ are not removed to allow
>    more flexibility in comment processing.) Hence,
>    =org-babel-spec-to-string= needs to be slightly adjusted.
>    #+latex: See Figure \ref{fig::spec-string-diff}.
>
>    #+latex: \begin{figure}[h]
>    #+begin_example
>      --- ob-tangle.el           2011-09-14 11:48:26.000000000 -0400
>      +++ new-ob-tangle.el       2011-09-14 11:55:56.000000000 -0400
>      @@ -398,3 +398,3 @@
>           (flet ((insert-comment (text)
>      -            (let ((text (org-babel-trim text)))
>      +            (let ((text (org-babel-chomp text "[\f\t\n\r\v]")))
>               (when (and comments (not (string= comments "no"))
>    #+end_example
>    #+latex: \caption{Changes to {\tt org-babel-spec-to-string} in {\tt ob-tangle.el}, unified diff, one line of context}
>    #+latex: \label{fig::spec-string-diff}
>    #+latex: \end{figure}
>
> ** Process Comment Text Through Sequence of Transforms
>
>    At the end of the revised [[preferred-fix][comment collection code]],
>    the comment text is passed to
>    =org-babel-process-comment-text= which
>    applies a sequence of transformation functions.
>    #+latex: (See Figure \ref{fig::comment-transformation}.)
>    The list of transformation functions is stored in a customizable
>    variable described
>    [[Define Customization Variable for Transforms][below]]. Several
>    predefined transformations are given
>    [[Define A Collection of Transform Functions][below]] as well.
>
>    #+latex: \begin{figure}[h]
>    #+begin_src emacs-lisp
>      (defun org-babel-process-comment-text (text &optional state)
>        "Apply list of transforms to comment TEXT assuming bindings in alist STATE.
>      Returns the modified text string, which may have text properties.
>      See `org-babel-comment-processing-functions' for the transforms to be
>      applied and details on the allowed keys in the STATE alist."
>        (let ((funcs org-babel-comment-processing-functions))
>          (with-temp-buffer
>            (insert text)
>            (let ((org-drawers
>                   (or (cdr (assoc 'org-drawers state))
>                       org-drawers))
>                  (after-heading
>                   (cdr (assoc 'after-heading state)))
>                  (first-line
>                   (cdr (assoc 'first-line state))))
>              (while funcs
>                (goto-char (point-min))
>                (funcall (car funcs))
>                (setq funcs (cdr funcs))))
>            (buffer-substring (point-min) (point-max)))))
>    #+end_src
>    #+latex: \caption{Better Fix, comment transformation driver.}
>    #+latex: \label{fig::comment-transformation}
>    #+latex: \end{figure}
>
> ** Define Customization Variable for Transforms
>
>    A list of nullary functions applied in order to
>    the comment text. The text is inserted
>    in a temporary buffer, so these functions can use
>    the entire Emacs library for operating on buffer text.
>    #+latex: See Figure \ref{fig::comment-transformation-function-list}.
>
>    #+latex: \begin{figure}[h]
>    #+begin_src emacs-lisp
>      (defcustom org-babel-comment-processing-functions
>        '(org-babel-comment-delete-file-variables-line
>          org-babel-comment-delete-org-control-lines
>          org-babel-comment-delete-drawers
>          org-babel-comment-trim-blank-lines
>          org-babel-comment-trim-indent-prefix)
>        "List of functions to transform source-block comment text before insertion.
>      Each function will be called with no arguments with point at the
>      beginning of a buffer containing only the comment text. Each
>      function can modify the text at will and leave point anywhere,
>      but it should *not* modify the narrowing state of the buffer.
>      Several dynamic state variables are set prior to execution that
>      each function can reference. These currently include:
>
>         + org-drawers:   names of drawers in the original org buffer.
>         + after-heading: t if comment starts at an org heading line,
>                          nil otherwise.
>         + first-line:    t if initial comment starts on first line
>                          of the original org buffer, nil otherwise.
>
>      If a function changes the value of these state variables, the new
>      value will be seen by all following functions in the list, but
>      this is not generally recommended.
>
>      The functions in this list are called *in order*, and this order
>      can influence the form of the resulting comment text."
>        :group 'org-babel
>        :type '(repeat function))
>    #+end_src
>    #+latex: \caption{Better Fix, customizable transform list.}
>    #+latex: \label{fig::comment-transformation-function-list}
>    #+latex: \end{figure}
>
> ** Define A Collection of Transform Functions
>
>    An advantage of this design is that transformation of the
>    comments is modular and customizable. We can include in
>    =ob-tangle.el= a collection of pre-defined transforms.
>    The default processing stream in =org-babel-comment-processing-functions=
>    is as follows:
>
>    1. Delete a file-variables line if on the first line of the buffer.
>    2. Delete all org control lines from the comment text.
>    3. Delete all drawers and their contents.
>    4. Trim blank lines from the beginning and end.
>    5. Reindent the text by removing the longest common leading
>       string of spaces.
>
>    #+ TANGLE: end-comment
>    These and several other useful transforms are given below
>    (e.g., deleting drawer delimiters but not contents).
>    #+latex: See Figures \ref{fig::transformA}--\ref{fig::transformZ}.
>    It is easy to define new transforms; any function that
>    operates on text in the current buffer beginning at point-min
>    will work.
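
As a concrete case of "any function that operates on text in the
current buffer", here is a minimal sketch of a user-written transform.
The name `my/comment-delete-trailing-whitespace' is hypothetical; the
only assumption taken from the proposal is the calling convention (no
arguments, comment text in the current buffer).

```emacs-lisp
;; Hypothetical user transform; not one of the predefined functions.
(defun my/comment-delete-trailing-whitespace ()
  "Strip trailing spaces and tabs from every line of the comment buffer.
Takes no arguments and edits the current buffer in place."
  (goto-char (point-min))
  (while (re-search-forward "[ \t]+$" nil t)
    (replace-match "" nil nil)))
```

Assuming the proposed variable is defined, it could be appended with
(add-to-list 'org-babel-comment-processing-functions
'my/comment-delete-trailing-whitespace t) so that it runs after the
default transforms.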
>
>    #+latex: \begin{figure}[h]
>    #+begin_src emacs-lisp
>      (defun org-babel-comment-delete-file-variables-line ()
>        "Delete file variables comment line if at beginning of buffer.
>      This only checks the first line of the buffer, and so should be
>      placed first (or at least early enough) in the list
>      `org-babel-comment-processing-functions' to ensure that no
>      other text has been inserted earlier."
>        (when (and first-line
>                   (looking-at ; file-variables line
>                    "^#[ \t]*-\\*-.*:.*;[ \t]*-\\*-[ \t]*$"))
>          ;; Delete (rather than kill) the line so the kill ring is untouched.
>          (delete-region (point) (progn (forward-line 1) (point)))))
>    #+end_src
>    #+latex: \caption{Comment Transform.}
>    #+latex: \label{fig::transformA}
>    #+latex: \end{figure}
>
>    #+latex: \begin{figure}[h]
>    #+begin_src emacs-lisp
>      (defun org-babel-comment-delete-org-control-lines ()
>        "Remove all org #+ control lines from comment."
>        (let ((control-regexp "^[ \t]*#\\+.*$"))
>          (delete-matching-lines control-regexp)))
>    #+end_src
>    #+latex: \caption{Comment Transform.}
>    #+latex: \end{figure}
>    #+latex: \begin{figure}[h]
>    #+begin_src emacs-lisp
>      (defun org-babel-comment-delete-org-in-buffer-settings ()
>        "Remove all org #+ in-buffer setting lines, leaving other control lines.
>      In-buffer setting lines begin with #+ and have all caps keyword
>      names."
>        (let ((setting-regexp "^#\\+[ \t]*[A-Z_]+:.*$"))
>          (delete-matching-lines setting-regexp)))
>    #+end_src
>    #+latex: \caption{Comment Transform.}
>    #+latex: \end{figure}
>    #+latex: \begin{figure}[h]
>    #+begin_src emacs-lisp
>      (defun org-babel-comment-delete-drawers ()
>        "Delete drawer delimiters and contents from comment.
>      Drawer names are restricted to those in the `org-drawers' state."
>        (let ((drawer-start-regexp
>               (format "^[ \t]*:\\(?:%s\\):[ \t]*$"
>                       (mapconcat 'identity
>                                  org-drawers
>                                  "\\|")))
>              (drawer-end-regexp "^[ \t]*:END:[ \t]*$"))
>          (while (re-search-forward drawer-start-regexp nil t)
>            (let ((beg (save-excursion
>                         (forward-line 0)
>                         (point)))
>                  (end (save-excursion
>                         (re-search-forward drawer-end-regexp nil t)
>                         (forward-line 1)
>                         (point))))
>              (goto-char end)
>              (delete-region beg end)))))
>    #+end_src
>    #+latex: \caption{Comment Transform.}
>    #+latex: \end{figure}
>    #+latex: \begin{figure}[h]
>    #+begin_src emacs-lisp
>      (defun org-babel-comment-delete-drawer-delimiters ()
>        "Delete drawer delimiters from comment leaving content.
>      Drawer names are restricted to those given by the `org-drawers'
>      state."
>        (let ((drawer-delim-regexp
>               (format "^[ \t]*:\\(?:%s\\):[ \t]*$"
>                       (mapconcat 'identity
>                                  (cons "END" org-drawers)
>                                  "\\|"))))
>          (delete-matching-lines drawer-delim-regexp)))
>    #+end_src
>    #+latex: \caption{Comment Transform.}
>    #+latex: \end{figure}
>    #+latex: \begin{figure}[h]
>    #+begin_src emacs-lisp
>      (defun org-babel-comment-trim-blank-lines ()
>        "Trim whitespace-only lines from beginning and end of text."
>        (while (and (looking-at "^[ \t\f]*$")
>                    (< (point) (point-max)))
>          (forward-line 1))
>        (delete-region (point-min) (point))
>        (when (< (point) (point-max))
>          (goto-char (point-max))
>          (let ((last-point (point)))
>            (forward-line 0)
>            (while (and (looking-at "^[ \t\f]*$")
>                        (> (point) (point-min)))
>              (setq last-point (point))
>              (forward-line -1))
>            (delete-region last-point (point-max)))))
>    #+end_src
>    #+latex: \caption{Comment Transform.}
>    #+latex: \end{figure}
>    #+latex: \begin{figure}[h]
>    #+begin_src emacs-lisp
>      (defun org-babel-comment-trim-indent-prefix ()
>        "Remove longest common leading prefix of spaces from each line of TEXT.
>      Prefix is computed from the initial whitespace on each line with
>      tabs converted to spaces, preserving indentation."
>        (let* ((common-indent nil)
>               (common-length (1+ (- (point-max) (point-min))))
>               (current-indent "")                   ; enter first loop
>               (current-length common-length))       ; skip first assignment
>          (goto-char (point-min))
>          (while current-indent
>            (when (< current-length common-length)
>              (setq common-indent current-indent
>                    common-length current-length))
>            (setq current-indent
>                  (let* ((found (re-search-forward "^\\([ \t]*\\)\\S-" nil t))
>                         (bol (match-beginning 0))
>                         (eos (match-end 1))
>                         (space-str (match-string 1))
>                         (indent-tabs-mode nil))
>                    (cond
>                     ((not found)
>                      nil)
>                     ((not (string-match "\t" space-str))
>                      space-str)
>                     (t                       ; detabify indent string
>                      (goto-char eos)
>                      (let ((col (current-column)))
>                        (delete-region bol eos)
>                        (indent-to col))
>                      (buffer-substring-no-properties bol (point))))))
>            (setq current-length (length current-indent)))
>          (when (and common-indent (> common-length 0))
>            (let ((indent-re (concat "^" common-indent)))
>              (goto-char (point-min))
>              (while (re-search-forward indent-re nil t)
>                (replace-match "" nil nil))))))
>    #+end_src
>    #+latex: \caption{Comment Transform.}
>    #+latex: \label{fig::transformZ}
>    #+latex: \end{figure}
>    #+latex: \end{itemize}
>    #+latex: \noindent
>    This kind of customization offers some nice possibilities,
>    including controlling indentation, eliminating or
>    transforming org markup, eliminating trailing whitespace, and
>    automating specialized comment formatting (e.g., javadoc). As
>    an additional illustration, consider the transform
>    =org-babel-comment-restrict-comment-range=
>    #+latex: in Figure \ref{fig::transform-illustration}
>    below. The idea is that it is sometimes useful to select
>    from the text under a headline a /part/ of the text for
>    the comment. We want some org markup that will not affect
>    either the export or the structure of the org file itself.
>    To do this, we use the fact that =#+=\textvisiblespace
>    lines are not exported.[fn:2] So, we can /de facto/ use the
>    =#+ TANGLE:= construct to control various aspects of tangling.
>    Here, we use the =#+ TANGLE: start-comment= and
>    =#+ TANGLE: end-comment= to delimit the comment text.
>    (This function needs to come earlier in the function list than the
>    functions that eliminate org control lines. It is sufficient to
>    prepend it to that list.) This is used in the current file,
>    for example.
>
>    #+latex: \begin{figure}[h]
>    #+begin_src emacs-lisp
>      (defun org-babel-comment-restrict-comment-range ()
>        "Remove all comment text outside start-comment and end-comment delimiters.
>      Comment delimiters are #+TANGLE lines with respective keywords
>      start-comment and end-comment. The #+TANGLE lines are also
>      deleted. To be effective, this function should be positioned in
>      the list `org-babel-comment-processing-functions' before any
>      functions that remove org control lines or process other
>      co-occurring attributes of #+TANGLE lines."
>        (when (re-search-forward
>               "^[ \t]*#\\+[ \t]*TANGLE:.*start-comment.*$" nil t)
>          (forward-line 1)
>          (delete-region (point-min) (point)))
>        (when (re-search-forward
>               "^[ \t]*#\\+[ \t]*TANGLE:.*end-comment.*$" nil t)
>          (forward-line 0)
>          (delete-region (point) (point-max))))
>    #+end_src
>    #+latex: \caption{Transform to illustrate some customization possibilities.}
>    #+latex: \label{fig::transform-illustration}
>    #+latex: \end{figure}
>    #+latex: \begin{itemize}
>
> [fn:1] In the original code and in the simple fix above, the comment
> starts /immediately/ after the =#+end_src= rather than at the start of
> the next line. Starting at the next line seems more natural to me
> because the comment being constructed relates to the /following/ code
> block. But the original behavior is easily restored if people disagree.
>
> [fn:2] A feature request: I would propose that the =#+tangle:= construct
> be recognized as non-exported even with spaces preceding the =#= and no
> spaces after the =+=. This would enable a variety of interesting
> customization for tangled comments. Alternatively, a generic construct
> such as =#+noop:= or =#+generic:= could be valuable for user-defined
> tags in an org file that serve a similar purpose -- allowing customized
> processing without being exported directly.
>
>

-- 
Eric Schulte
http://cs.unm.edu/~eschulte/


* Re: Four issues with org-babel-tangle
  2011-09-15 15:49 ` Eric Schulte
@ 2011-09-15 21:36   ` Christopher Genovese
  2011-09-15 22:02     ` Eric Schulte
  0 siblings, 1 reply; 6+ messages in thread
From: Christopher Genovese @ 2011-09-15 21:36 UTC (permalink / raw)
  To: Eric Schulte; +Cc: emacs-orgmode

Hi Eric,

   Thanks for your note.

> I would encourage you to begin the FSF assignment process if
> you anticipate potentially contributing more fixes in the
> future. Could you please send a git format-patch version of
> the simple fix to the list so that I might apply it?

   I will begin the FSF assignment process, and I will send a git
format-patch version of the simple fix. (I'll send that tonight.)

> I like the idea of introducing a customizable function for
> comment text transformation, however ... rather perhaps we
> should just leave the default value of this function as
> simple as possible and allow users to customize it ....

   That makes sense, and I like the way you did it. In particular,
I absolutely agree that the org-babel-trim should be removed
from org-babel-spec-to-string (to allow flexibility in the customization).
Making it the default processor works well, I think.
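
For illustration, here is a minimal sketch of the kind of function a
user might plug in under that scheme. The name `my/org-comment-cleanup'
is hypothetical, and the sketch assumes the option ends up as you
outlined it: a variable holding a function of one string argument that
returns the processed string.

```emacs-lisp
;; Hypothetical user function; only the string-in, string-out
;; contract is taken from the proposal in this thread.
(defun my/org-comment-cleanup (text)
  "Return TEXT without org `#+' control lines and outer whitespace."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; Drop control lines such as "#+latex: ..." with their newline.
    (while (re-search-forward "^[ \t]*#\\+.*\n?" nil t)
      (replace-match ""))
    ;; Trim leading and trailing whitespace from what is left.
    (let ((str (buffer-string)))
      (when (string-match "\\`[ \t\n\r]+" str)
        (setq str (substring str (match-end 0))))
      (when (string-match "[ \t\n\r]+\\'" str)
        (setq str (substring str 0 (match-beginning 0))))
      str)))
```

With the option in place, (setq org-babel-process-comment-text
#'my/org-comment-cleanup) would select it, and the default would stay
as simple as a plain trim.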

   Would you like me to submit a separate patch based on this change
or should I include that as part of the patch with the simple fix?

> Finally I'm not sure I fully understand what you mean by ...

Sorry, I wasn't clear. It's a small thing. If you put
'#+tangle' in column 0, the line is not exported because it
begins with #; if you put '#+ tangle' on a line (spaces
after the + and possibly before the #), the line is not exported
because it begins with #+ and a space; but if you put '#+tangle'
on an indented line (no spaces after the + but spaces before
the #), the line is exported. I think it would be useful if
indented '#+tangle' lines (with no spaces between the # and the +
or after the +) were *not* exported because such lines can support
useful customizations. Having to put the spaces after the +
is a bit bothersome and looks uglier to me.
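
To make the three cases concrete, here is a small sketch that encodes
only the rules as I stated them above for org-mode 7.7. It does not
call the exporter, and the name `my/line-exported-p' is hypothetical.

```emacs-lisp
;; Hypothetical model of the behavior described in this message;
;; it is not how the org exporter is implemented.
(defun my/line-exported-p (line)
  "Return non-nil if LINE would be exported under the rules above."
  (not (or
        ;; A `#' in column 0 makes the line a comment.
        (string-match "\\`#" line)
        ;; `#+' followed by a space is dropped even when indented.
        (string-match "\\`[ \t]*#\\+[ \t]" line))))
```

Under this model "#+tangle: x" and "  #+ tangle: x" are dropped, while
"  #+tangle: x" is exported, which is the case I would like to see
changed.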

> ..., it was a long email.

Yeah, sorry. :) Thanks for slogging through.

  Best,

    Christopher


On Thu, Sep 15, 2011 at 11:49, Eric Schulte <schulte.eric@gmail.com> wrote:

> Hi Christopher,
>
> Thank you for the thorough examples and for suggesting fixes.  I would
> like to apply your "simple fix" immediately, and the resulting patch
> should be small enough (less than 10 lines of changes) that it can be
> applied without FSF assignment -- although I would encourage you to
> begin the FSF assignment process if you anticipate potentially
> contributing more fixes in the future.  Could you please send a git
> format-patch version of the simple fix to the list so that I might apply
> it?
>
> I like the idea of introducing a customizable function for comment text
> transformation, however I'm not sure that the temporary-buffer mechanics
> need to be included by default, rather perhaps we should just leave the
> default value of this function as simple as possible and allow users to
> customize it to be as simple or complex as they wish.  Perhaps a change
> like the following, where the call to `org-babel-trim' in
> `org-babel-spec-to-string' is removed.
>
> #+begin_src emacs-lisp
>   (buffer-substring
>    (max (condition-case nil
>             (save-excursion
>               (org-back-to-heading t) (point))
>           (error 0))
>         (save-excursion
>           (re-search-backward
>            org-babel-src-block-regexp nil t)
>           (match-end 0)))
>    (point))
>
>   ;; | becomes
>   ;; v
>
>   ;; `funcall' is needed because the option below is a variable
>   ;; holding a function, not a function definition.
>   (funcall org-babel-process-comment-text
>            (buffer-substring
>             (max (condition-case nil
>                      (save-excursion
>                        (org-back-to-heading t) (point))
>                    (error 0))
>                  (save-excursion
>                    (re-search-backward
>                     org-babel-src-block-regexp nil t)
>                    (match-end 0)))
>             (point)))
>
>   ;; where
>
>   (defcustom org-babel-process-comment-text #'org-babel-trim
>     "Customizable function for processing comment text."
>     :group 'org-babel
>     :type 'function)
> #+end_src
>
> This change may end up being more than 10 lines long, but a patch would
> still be welcome, otherwise if the solution I sketched out above sounds
> reasonable I could compose a patch and then share it with you for double
> checking before it is applied.
>
> Finally I'm not sure I fully understand what you mean by
>
> >
> > [fn:2] A feature request: I would propose that the =#+tangle:= construct
> > be recognized as non-exported even with spaces preceding the =#= and no
> > spaces after the =+=. This would enable a variety of interesting
> > customization for tangled comments. Alternatively, a generic construct
> > such as =#+noop:= or =#+generic:= could be valuable for user-defined
> > tags in an org file that serve a similar purpose -- allowing customized
> > processing without being exported directly.
> >
>
> Please do let me know if I missed anything, it was a long email.
>
> Thanks for contributing! -- Eric
>
> Christopher Genovese <genovese.cr@gmail.com> writes:
>
> > /Semi-verbose Preamble/.
> > Having recently begun intensive use of org-mode for tangling source
> > files, I encountered four issues related to comment extraction (two
> > bugs, one undesirable behavior, and one ... unfulfilled need), which I
> > describe in detail below. I started by creating an org file that would
> > reproduce the problems, and soon started /describing/ the problems in
> > the org file as well as putting my fixes in the source blocks. At the
> > risk of it being too meta or annoying, I've included that org file at
> > the end of this message as the problem description. All the details are
> > there as well as two fixes. Tangling that file in various ways described
> > demonstrates the problems, and you can export to PDF for nicer reading.
> > (I've attached the PDF to this mail for convenience. It looks good;
> > kudos, org-mode!) I've also attached a tarball with files that make it
> > easy to try my changes and to restore the original behavior, as well as
> > tests and results from the org file for easy comparison. See the
> > included README.
> >
> > I've been using the revised code now for a few days. It fixes the
> > problems I describe, and I think it provides a flexible framework for
> > comment extraction with minimal change to the base code. If the reaction
> > to this is positive, I will happily submit a patch, sign paperwork, or
> > whatever is needed, after fixing any problems that you all see. In any
> > case, I very much look forward to any feedback you can offer. Thanks.
> >
> >   -- Christopher
> >
> > P.S. In case the attachments get dropped, I've put the PDF and the
> >      tarball at
> >
> >      http://www.stat.cmu.edu/~genovese/depot/tangle-error.pdf
> >      http://www.stat.cmu.edu/~genovese/depot/tangle-bundle.tgz
> >
> > /Problem Description/
> > ################ Cut Here ####################
> > # -*- org-return-follows-link: t; comment-empty-lines: t; -*-
> > #+TITLE: Tangle this file: four issues with org-babel-tangle
> > #+AUTHOR: Christopher Genovese
> > #+DATE: 14 Sep 2011\vspace*{-0.5cm}
> > #+EMAIL: genovese@cmu.edu
> > #+OPTIONS: toc:1 H:1
> > #+BABEL: :tangle yes :comments org :results silent :exports code
> > #+BIND: org-export-latex-hyperref-format "\\hyperlink{%s}{%s}"
> > #+STARTUP: showall
> > #+LATEX_HEADER: \usepackage[labelsep=period,labelfont=bf]{caption}
> > #+LaTeX: \vspace*{-1cm}\vspace*{-2pt}
> >
> >
> > * Four Related Issues with org-babel-tangle
> >
> >   Running org-mode version 7.7 on both GNU Emacs 23.2.1 and 24.0.50.1
> >   with Mac OS X 10.5.8 (with and without a -Q option), I encountered the
> >   following issues/problems/bugs when tangling files with code blocks
> >   for which the :comments header argument is org:
> >
> >   1. The subtree associated with the very first code block has
> >      its headline tangled without leading stars, but all subsequent
> >      sub-trees associated with code blocks have the leading stars
> >      included in the comments.
> >
> >   2. If the first code block comes before the first headline, the start
> >      of the comment text will be determined by pre-existing match data
> >      and thus will likely be incorrect.
> >
> >   3. Org structural elements such as headline stars, =#+= control lines,
> >      and drawers are included in the comments.
> >
> >   4. There is no easy way to delimit comment text or transform it that
> >      does not also change the structure of the org file (e.g., by adding
> >      headlines or source blocks).
> >
> >   Issues 1 and 2 seem to be genuine bugs. Issues 3 and 4 are more
> >   subjective, I admit, but seem undesirable. Stars, drawers, and
> >   control lines are org structure rather than content, and they
> >   are often inappropriate in comments.
> >
> >   To reproduce the behaviors for issues 1 and 3, look at the result of
> >   tangling this file. To reproduce issue 2 as well, remove the first two
> >   stars from this file and tangle again. Alternatively, within emacs,
> >   evaluate the =(buffer-substring ...)= sexp from the original-code code
> >   below at the beginning character of a source block. (You
> >   can also export this file to PDF for more pleasant reading.)
> >
> >   Below, I give details on these issues and code for two fixes:
> >   a [[simple-fix][simple fix]] that handles the first two issues and the
> >   stars in the third, and a [[preferred-fix][better fix]] that handles
> >   all four issues in a more modular, customizable framework. I'd be
> >   interested in hearing feedback on all of this. If the reaction is
> >   positive, I will gladly submit a patch. Thanks for your consideration.
> >
> >
> >
> > * The Original Code and Details on the Problem
> >
> >   The relevant section of the original code from org-mode version 7.7
> >   is shown below, comprising lines 344 through 357
> >   of ob-tangle.el in function =org-babel-tangle-collect-blocks=. With
> >   point at a =#+begin_src=, it scans back either
> >   for a heading line or for the end of the previous code block,
> >   whichever comes later. The resulting region becomes the
> >   comment text.
> >
> >   #+latex: \begin{figure}[h]
> >   #+latex: \hypertarget{original-code}{} % <<original-code>>
> >   #+source: original-code
> >   #+begin_src emacs-lisp
> >     (comment
> >      (when (or (string= "both" (cdr (assoc :comments params)))
> >                (string= "org" (cdr (assoc :comments params))))
> >        ;; from the previous heading or code-block end
> >        (buffer-substring
> >         (max (condition-case nil
> >                  (save-excursion
> >                    (org-back-to-heading t) (point))
> >                (error 0))
> >              (save-excursion
> >                (re-search-backward
> >                 org-babel-src-block-regexp nil t)
> >                (match-end 0)))
> >         (point))))
> >   #+end_src
> >   #+latex: \caption{Original Code, lines 344--357 in {\tt ob-tangle.el}.}
> >   #+latex: \label{fig::original-code}
> >   #+latex: \end{figure}
> >
> >   /Issue 1/. When in the first code block in the file, the
> >   second search fails (there is no previous code block),
> >   so the =(match-end 0)= call uses the match data from the (implicit)
> >   match during =org-back-to-heading=, which skips the stars. (Not a
> >   particularly transparent reference, incidentally.) For subsequent
> >   blocks, the =(match-end 0)= gives the end of the previous code block,
> >   which in these examples is earlier than the previous headline location.
> >
> >   /Issue 2/. When the first code block lies before the first headline
> >   (say with some text before it), the searches fail in /both/ clauses of
> >   the max. So, the =match-end= will return an essentially arbitrary
> >   result, which is a bug.
> >
> >   /Issue 3/. =org-back-to-heading= leaves point at the beginning of the
> >   stars, so a headline included in the text will have stars, except for
> >   the first one.
> >
> >   /Issue 4/. Control lines at the end of the previous code block and
> >   before point are not filtered out and so are included in the comments.
> >
> >
> > * A Simple Fix for the First Three Issues
> >
> >   A small change addresses issues 1, 2, and the stars for issue 3: in
> >   both cases, simply use the =match-end= and replace 0 values with
> >   =(point-min)=. The latter gives a sensible result even if both
> >   computed positions are trivial (as when the first code block comes
> >   before the first headline) and respects narrowing.
> >
> >   #+latex: \begin{figure}[h]
> >   #+latex: \hypertarget{simple-fix}{} % <<simple-fix>>
> >   #+begin_src emacs-lisp
> >     (comment
> >      (when (or (string= "both" (cdr (assoc :comments params)))
> >                (string= "org" (cdr (assoc :comments params))))
> >        ;; from the previous heading or code-block end
> >        (buffer-substring
> >         (max (condition-case nil
> >                  (save-excursion
> >                    (org-back-to-heading t)  ; sets match data
> >                    (match-end 0))
> >                (error (point-min)))
> >              (save-excursion
> >                (if (re-search-backward
> >                     org-babel-src-block-regexp nil t)
> >                    (match-end 0)
> >                  (point-min))))
> >         (point))))
> >   #+end_src
> >   #+latex: \caption{Simple Fix, replacement for lines 344--357 in {\tt ob-tangle.el}.}
> >   #+latex: \label{fig::simple-fix}
> >   #+latex: \end{figure}
> >
> > * A Fix for All Four Issues
> >
> >   A better fix that handles issues 1--4 starts with the region computed
> >   as in the [[simple-fix][simple fix]] and then processes that text
> >   through a user-configurable sequence of functions to derive the final
> >   form of the comment text.
> >
> >   The following changes are required.
> >
> > ** Extract Initial Comment Text and State from Org Buffer
> >
> >    The initial comment text ranges from either the most recent headline
> >    at the point after the stars, the beginning of the line after the
> >    =#+end_src= of the most recent code block, or the beginning of the
> >    buffer, whichever is later, through the line before the source
> >    block.[fn:1]
> >
> >    The [[preferred-fix][code]] to extract this is given below.
> >    #+latex: (See Figure \ref{fig::preferred-fix}.)
> >    It replaces lines 344 through 357 of
> >    =ob-tangle.el= from org-mode version 7.7 in the function
> >    =org-babel-tangle-collect-blocks=.
> >
> >    #+latex: \begin{figure}[h]
> >    #+latex: \hypertarget{preferred-fix}{} % <<preferred-fix>>
> >    #+begin_src emacs-lisp
> >      (comment
> >       (when (or (string= "both" (cdr (assoc :comments params)))
> >                 (string= "org" (cdr (assoc :comments params))))
> >         (let* ((prev-heading
> >                 (condition-case nil
> >                     (save-excursion
> >                       (org-back-to-heading t) ; sets match data
> >                       (match-end 0))
> >                   (error (point-min))))
> >                (end-of-prev-src-block
> >                 (save-excursion
> >                   (if (null (re-search-backward
> >                              org-babel-src-block-regexp nil t))
> >                       (point-min)
> >                     (goto-char (match-end 0))
> >                     (forward-line 1)
> >                     (point))))
> >                (comment-start
> >                 (max prev-heading end-of-prev-src-block))
> >                (comment-end
> >                 (save-excursion
> >                   (forward-line 0)
> >                   (point)))
> >                (state
> >                 (list (cons 'org-drawers
> >                             org-drawers)
> >                       (cons 'after-heading
> >                             (= comment-start prev-heading))
> >                       (cons 'first-line
> >                             (= comment-start (point-min))))))
> >           (org-babel-process-comment-text
> >            (buffer-substring comment-start comment-end) state))))
> >    #+end_src
> >    #+latex: \caption{Better Fix, replacement for lines 344--357 in {\tt ob-tangle.el}.}
> >    #+latex: \label{fig::preferred-fix}
> >    #+latex: \end{figure}
> >
> > ** Adjust =org-babel-spec-to-string=
> >
> >    The comment block collected by the [[original-code][original code]]
> >    #+latex: (Figure \ref{fig::original-code})
> >    in =org-babel-tangle-collect-blocks= is further processed in \newline
> >    =org-babel-spec-to-string= to trim leading and trailing whitespace
> >    from the string. This was needed because spaces after a source block were
> >    included in the comment. In the revised code, however, this space
> >    trimming is handled during text transformation, except for removing
> >    trailing newlines. (Note: trailing /spaces/ are not removed to allow
> >    more flexibility in comment processing.) Hence,
> >    =org-babel-spec-to-string= needs to be slightly adjusted.
> >    #+latex: See Figure \ref{fig::spec-string-diff}.
> >
> >    #+latex: \begin{figure}[h]
> >    #+begin_example
> >      --- ob-tangle.el           2011-09-14 11:48:26.000000000 -0400
> >      +++ new-ob-tangle.el       2011-09-14 11:55:56.000000000 -0400
> >      @@ -398,3 +398,3 @@
> >           (flet ((insert-comment (text)
> >      -            (let ((text (org-babel-trim text)))
> >      +            (let ((text (org-babel-chomp text "[\f\t\n\r\v]")))
> >               (when (and comments (not (string= comments "no"))
> >    #+end_example
> >    #+latex: \caption{Changes to {\tt org-babel-spec-to-string} in {\tt ob-tangle.el}, unified diff, one line of context}
> >    #+latex: \label{fig::spec-string-diff}
> >    #+latex: \end{figure}
> >
> > ** Process Comment Text Through Sequence of Transforms
> >
> >    At the end of the revised [[preferred-fix][comment collection code]],
> >    the comment text is passed to
> >    =org-babel-process-comment-text= which
> >    applies a sequence of transformation functions.
> >    #+latex: (See Figure \ref{fig::comment-transformation}.)
> >    The list of transformation functions is stored in a customizable
> >    variable described [[Define Customization Variable for Transforms][below]].
> >    Several predefined transformations are
> >    given [[Define A Collection of Transform Functions][below]] as well.
> >
> >    #+latex: \begin{figure}[h]
> >    #+begin_src emacs-lisp
> >      (defun org-babel-process-comment-text (text &optional state)
> >        "Apply list of transforms to comment TEXT assuming bindings in alist STATE.
> >      Returns the modified text string, which may have text properties.
> >      See `org-babel-comment-processing-functions' for the transforms to be
> >      applied and details on the allowed keys in the STATE alist."
> >        (let ((funcs org-babel-comment-processing-functions))
> >          (with-temp-buffer
> >            (insert text)
> >            (let ((org-drawers
> >                   (or (cdr (assoc 'org-drawers state))
> >                       org-drawers))
> >                  (after-heading
> >                   (cdr (assoc 'after-heading state)))
> >                  (first-line
> >                   (cdr (assoc 'first-line state))))
> >              (while funcs
> >                (goto-char (point-min))
> >                (funcall (car funcs))
> >                (setq funcs (cdr funcs))))
> >            (buffer-substring (point-min) (point-max)))))
> >    #+end_src
> >    #+latex: \caption{Better Fix, comment transformation driver.}
> >    #+latex: \label{fig::comment-transformation}
> >    #+latex: \end{figure}
> >
> > ** Define Customization Variable for Transforms
> >
> >    A list of nullary functions applied in order to
> >    the comment text. The text is inserted
> >    in a temporary buffer, so these functions can use
> >    the entire Emacs library for operating on buffer text.
> >    #+latex: See Figure \ref{fig::comment-transformation-function-list}.
> >
> >    #+latex: \begin{figure}[h]
> >    #+begin_src emacs-lisp
> >      (defcustom org-babel-comment-processing-functions
> >        '(org-babel-comment-delete-file-variables-line
> >          org-babel-comment-delete-org-control-lines
> >          org-babel-comment-delete-drawers
> >          org-babel-comment-trim-blank-lines
> >          org-babel-comment-trim-indent-prefix)
> >        "List of functions to transform source-block comment text before insertion.
> >      Each function will be called with no arguments with point at the
> >      beginning of a buffer containing only the comment text. Each
> >      function can modify the text at will and leave point anywhere,
> >      but it should *not* modify the narrowing state of the buffer.
> >      Several dynamic state variables are set prior to execution that
> >      each function can reference. These currently include:
> >
> >         + org-drawers:   names of drawers in the original org buffer.
> >         + after-heading: t if comment starts at an org heading line,
> >                          nil otherwise.
> >         + first-line:    t if initial comment starts on first line
> >                          of the original org buffer, nil otherwise.
> >
> >      If a function changes the value of these state variables, the new
> >      value will be seen by all following functions in the list, but
> >      this is not generally recommended.
> >
> >      The functions in this list are called *in order*, and this order
> >      can influence the form of the resulting comment text."
> >        :group 'org-babel
> >        :type 'list)
> >    #+end_src
> >    #+latex: \caption{Better Fix, customizable transform list.}
> >    #+latex: \label{fig::comment-transformation-function-list}
> >    #+latex: \end{figure}
> >
> > ** Define A Collection of Transform Functions
> >
> >    An advantage of this design is that transformation of the
> >    comments is modular and customizable. We can include in
> >    =ob-tangle.el= a collection of pre-defined transforms.
> >    The default processing stream in
> >    =org-babel-comment-processing-functions= is as follows:
> >
> >    1. Delete the file-variables line if it is on the first line of the buffer.
> >    2. Delete all org control lines from the comment text.
> >    3. Delete all drawers and their contents.
> >    4. Trim blank lines from the beginning and end.
> >    5. Reindent the text by removing the longest common leading
> >       string of spaces.
> >
> >    #+ TANGLE: end-comment
> >    These and several other useful transforms are given below
> >    (e.g., deleting drawer delimiters but not contents).
> >    #+latex: See Figures \ref{fig::transformA}--\ref{fig::transformZ}.
> >    It is easy to define new transforms; any function that
> >    operates on text in the current buffer beginning at point-min
> >    will work.
> >
> >    #+latex: \begin{figure}[h]
> >    #+begin_src emacs-lisp
> >      (defun org-babel-comment-delete-file-variables-line ()
> >        "Delete file variables comment line if at beginning of buffer.
> >      This only checks the first line of the buffer, and so should be
> >      placed first (or at least early enough) in the list
> >      `org-babel-comment-processing-functions' to ensure that no
> >      other text has been inserted earlier."
> >        (when (and first-line
> >                   (looking-at ; file-variables line
> >                    "^#[ \t]*-\\*-.*:.*;[ \t]*-\\*-[ \t]*$"))
> >          (let ((kill-whole-line t))
> >            (kill-line))))
> >    #+end_src
> >    #+latex: \caption{Comment Transform.}
> >    #+latex: \label{fig::transformA}
> >    #+latex: \end{figure}
> >
> >    #+latex: \begin{figure}[h]
> >    #+begin_src emacs-lisp
> >      (defun org-babel-comment-delete-org-control-lines ()
> >        "Remove all org #+ control lines from comment."
> >        (let ((control-regexp "^[ \t]*#\\+.*$"))
> >          (delete-matching-lines control-regexp)))
> >    #+end_src
> >    #+latex: \caption{Comment Transform.}
> >    #+latex: \end{figure}
> >    #+latex: \begin{figure}[h]
> >    #+begin_src emacs-lisp
> >      (defun org-babel-comment-delete-org-in-buffer-settings ()
> >        "Remove all org #+ in-buffer setting lines, leaving other control lines.
> >      In-buffer setting lines begin with #+ and have all caps keyword
> >      names."
> >        (let ((setting-regexp "^#\\+[ \t]*[A-Z_]+:.*$"))
> >          (delete-matching-lines setting-regexp)))
> >    #+end_src
> >    #+latex: \caption{Comment Transform.}
> >    #+latex: \end{figure}
> >    #+latex: \begin{figure}[h]
> >    #+begin_src emacs-lisp
> >      (defun org-babel-comment-delete-drawers ()
> >        "Delete drawer delimiters and contents from comment.
> >      Drawer names are restricted to those in the `org-drawers' state."
> >        (let ((drawer-start-regexp
> >               (format "^[ \t]*:\\(?:%s\\):[ \t]*$"
> >                       (mapconcat 'identity
> >                                  org-drawers
> >                                  "\\|")))
> >              (drawer-end-regexp "^[ \t]*:END:[ \t]*$"))
> >          (while (re-search-forward drawer-start-regexp nil t)
> >            (let ((beg (save-excursion
> >                         (forward-line 0)
> >                         (point)))
> >                  (end (save-excursion
> >                         (re-search-forward drawer-end-regexp nil t)
> >                         (forward-line 1)
> >                         (point))))
> >              (goto-char end)
> >              (delete-region beg end)))))
> >    #+end_src
> >    #+latex: \caption{Comment Transform.}
> >    #+latex: \end{figure}
> >    #+latex: \begin{figure}[h]
> >    #+begin_src emacs-lisp
> >      (defun org-babel-comment-delete-drawer-delimiters ()
> >        "Delete drawer delimiters from comment leaving content.
> >      Drawer names are restricted to those given by the `org-drawers'
> >      state."
> >        (let ((drawer-delim-regexp
> >               (format "^[ \t]*:\\(?:%s\\)"
> >                       (mapconcat 'identity
> >                                  (cons "END" org-drawers)
> >                                  "\\|"))))
> >          (delete-matching-lines drawer-delim-regexp)))
> >    #+end_src
> >    #+latex: \caption{Comment Transform.}
> >    #+latex: \end{figure}
> >    #+latex: \begin{figure}[h]
> >    #+begin_src emacs-lisp
> >      (defun org-babel-comment-trim-blank-lines ()
> >        "Trim whitespace-only lines from beginning and end of text."
> >        (while (and (looking-at "^[ \t\f]*$")
> >                    (< (point) (point-max)))
> >          (forward-line 1))
> >        (delete-region (point-min) (point))
> >        (when (< (point) (point-max))
> >          (goto-char (point-max))
> >          (let ((last-point (point)))
> >            (forward-line 0)
> >            (while (and (looking-at "^[ \t\f]*$")
> >                        (> (point) (point-min)))
> >              (setq last-point (point))
> >              (forward-line -1))
> >            (delete-region last-point (point-max)))))
> >    #+end_src
> >    #+latex: \caption{Comment Transform.}
> >    #+latex: \end{figure}
> >    #+latex: \begin{figure}[h]
> >    #+begin_src emacs-lisp
> >      (defun org-babel-comment-trim-indent-prefix ()
> >        "Remove longest common leading prefix of spaces from each line of TEXT.
> >      Prefix is computed from the initial whitespace on each line with
> >      tabs converted to spaces, preserving indentation."
> >        (let* ((common-indent nil)
> >               (common-length (1+ (- (point-max) (point-min))))
> >               (current-indent "")                   ; enter first loop
> >               (current-length common-length))       ; skip first assignment
> >          (goto-char (point-min))
> >          (while current-indent
> >            (when (< current-length common-length)
> >              (setq common-indent current-indent
> >                    common-length current-length))
> >            (setq current-indent
> >                  (let* ((found (re-search-forward "^\\([ \t]*\\)\\S-" nil t))
> >                         (bol (match-beginning 0))
> >                         (eos (match-end 1))
> >                         (space-str (match-string 1))
> >                         (indent-tabs-mode nil))
> >                    (cond
> >                     ((not found)
> >                      nil)
> >                     ((not (string-match "\t" space-str))
> >                      space-str)
> >                     (t                       ; detabify indent string
> >                      (goto-char eos)
> >                      (let ((col (current-column)))
> >                        (delete-region bol eos)
> >                        (indent-to col))
> >                      (buffer-substring-no-properties bol (point))))))
> >            (setq current-length (length current-indent)))
> >          (when (and common-indent (> common-length 0))
> >            (let ((indent-re (concat "^" common-indent)))
> >              (goto-char (point-min))
> >              (while (re-search-forward indent-re nil t)
> >                (replace-match "" nil nil))))))
> >    #+end_src
> >    #+latex: \caption{Comment Transform.}
> >    #+latex: \label{fig::transformZ}
> >    #+latex: \end{figure}
> >    #+latex: \end{itemize}
> >    #+latex: \noindent
> >    This kind of customization offers some nice possibilities,
> >    including controlling indentation, eliminating or
> >    transforming org markup, eliminating trailing whitespace, and
> >    automating specialized comment formatting (e.g., javadoc). As
> >    an additional illustration, consider the transform
> >    =org-babel-comment-restrict-comment-range=
> >    #+latex: in Figure \ref{fig::transform-illustration}
> >    below. The idea is that it is sometimes useful to select
> >    from the text under a headline a /part/ of the text for
> >    the comment. We want some org markup that will not affect
> >    either the export or the structure of the org file itself.
> >    To do this, we use the fact that =#+=\textvisiblespace
> >    lines are not exported.[fn:2] So, we can /de facto/ use the
> >    =#+ TANGLE:= construct to control various aspects of tangling.
> >    Here, we use the =#+ TANGLE: start-comment= and
> >    =#+ TANGLE: end-comment= to delimit the comment text.
> >    (This function needs to come earlier in the function list than the
> >    functions that eliminate org control lines. It is sufficient to
> >    prepend it to that list.) This is used in the current file,
> >    for example.
> >
> >    #+latex: \begin{figure}[h]
> >    #+begin_src emacs-lisp
> >      (defun org-babel-comment-restrict-comment-range ()
> >        "Remove all comment text outside start-comment and end-comment delimiters.
> >      Comment delimiters are #+TANGLE lines with respective keywords
> >      start-comment and end-comment. The #+TANGLE lines are also
> >      deleted. To be effective, this function should be positioned in
> >      the list `org-babel-comment-processing-functions' before any
> >      functions that remove org control lines or process other
> >      co-occurring attributes of #+TANGLE lines."
> >        (when (re-search-forward
> >               "^[ \t]*#\\+[ \t]*TANGLE:.*start-comment.*$" nil t)
> >          (forward-line 1)
> >          (delete-region (point-min) (point)))
> >        (when (re-search-forward
> >               "^[ \t]*#\\+[ \t]*TANGLE:.*end-comment.*$" nil t)
> >          (forward-line 0)
> >          (delete-region (point) (point-max))))
> >          (forward-line 0)
> >          (delete-region (point) (point-max))))
> >    #+end_src
> >    #+latex: \caption{Transform to illustrate some customization possibilities.}
> >    #+latex: \label{fig::transform-illustration}
> >    #+latex: \end{figure}
> >    #+latex: \begin{itemize}
> >
> > [fn:1] In the original code and in the simple fix above, the comment
> > starts /immediately/ after the =#+end_src= rather than at the start of
> > the next line. Starting at the next line seems more natural to me
> > because the comment being constructed relates to the /following/ code
> > block. But the original behavior is easily restored if people disagree.
> >
> > [fn:2] A feature request: I would propose that the =#+tangle:= construct
> > be recognized as non-exported even with spaces preceding the =#= and no
> > spaces after the =+=. This would enable a variety of interesting
> > customization for tangled comments. Alternatively, a generic construct
> > such as =#+noop:= or =#+generic:= could be valuable for user-based
> > tags in an org file that serve a similar purpose: allowing customized
> > processing without being directly exported.
> >
> >
>
> --
> Eric Schulte
> http://cs.unm.edu/~eschulte/
>


* Re: Four issues with org-babel-tangle
  2011-09-15 21:36   ` Christopher Genovese
@ 2011-09-15 22:02     ` Eric Schulte
  2011-09-16  4:31       ` Christopher Genovese
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Schulte @ 2011-09-15 22:02 UTC (permalink / raw)
  To: Christopher Genovese; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1512 bytes --]

Christopher Genovese <genovese.cr@gmail.com> writes:

> Hi Eric,
>
>    Thanks for your note.
>
>> I would encourage you to begin the FSF assignment process if
>> you anticipate potentially contributing more fixes in the
>> future. Could you please send a git format-patch version of
>> the simple fix to the list so that I might apply it?
>
>    I will begin the FSF assignment process, and I will send a git
> format-patch based on the simple fix. (I'll send that tonight.)
>

Fantastic.

>
>> I like the idea of introducing a customizable function for
>> comment text transformation, however ... rather perhaps we
>> should just leave the default value of this function as
>> simple as possible and allow users to customize it ....
>
>    That makes sense, and I like the way you did it. In particular,
> I absolutely agree that the org-babel-trim should be removed
> from org-babel-spec-to-string (to allow flexibility in the customization).
> Making it the default processor works well, I think.
>
>    Would you like me to submit a separate patch based on this change
> or should I include that as part of the patch with the simple fix?
>

I'll write up this change as it may end up being longer than 10 lines,
and if I write it we don't have to wait for your FSF assignment to clear
(which can sometimes take months) before applying the patch.

In fact... if this attached patch looks good to you (i.e., allows the
behavior you originally intended) then please let me know and I'll apply
it immediately.
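
For illustration only, here is a rough sketch of the kind of user
customization the attached patch is meant to allow, assuming it is
applied as is.  The function name `my-org-comment-strip-control-lines'
is hypothetical and not part of Org; the regexp for #+ control lines is
borrowed from Christopher's message, and `org-babel-trim' is the default
value named in the patch.

#+begin_src emacs-lisp
  ;; Hypothetical helper, not part of Org.  It takes the raw comment
  ;; string and returns the string to be inserted as a comment.
  (defun my-org-comment-strip-control-lines (text)
    "Return TEXT with Org #+ control lines removed and whitespace trimmed."
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      ;; Regexp taken from Christopher's
      ;; `org-babel-comment-delete-org-control-lines'.
      (flush-lines "^[ \t]*#\\+.*$")
      ;; Keep the default trimming behavior on what remains.
      (org-babel-trim (buffer-string))))

  ;; Assumes the `org-babel-process-comment-text' variable introduced
  ;; by the attached patch.
  (setq org-babel-process-comment-text
        #'my-org-comment-strip-control-lines)
#+end_src

Because the variable holds a single function of one string argument,
users who want the temporary-buffer approach can opt into it this way
while the default stays as simple as `org-babel-trim'.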


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-customizable-processing-of-Org-mode-text-used-as-com.patch --]
[-- Type: text/x-diff, Size: 3101 bytes --]

From cebe0bec72df8c07dab367e2df500d2fd1a8aae3 Mon Sep 17 00:00:00 2001
From: Eric Schulte <schulte.eric@gmail.com>
Date: Thu, 15 Sep 2011 16:00:10 -0600
Subject: [PATCH] customizable processing of Org-mode text used as comments in tangled source-code files

* lisp/ob-tangle.el (org-babel-process-comment-text): Customizable
  function to process comment text.
  (org-babel-tangle-collect-blocks): Make use of new customizable
  processing function.
  (org-babel-spec-to-string): Call customizable function rather than
  `org-babel-trim'.
---
 lisp/ob-tangle.el |   41 +++++++++++++++++++++++++----------------
 1 files changed, 25 insertions(+), 16 deletions(-)

diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index d1e26c0..10fc120 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -95,6 +95,14 @@ controlled by the :comments header argument."
   :group 'org-babel
   :type 'string)
 
+(defcustom org-babel-process-comment-text #'org-babel-trim
+  "Function called to process raw Org-mode text collected to be
+inserted as comments in tangled source-code files.  The function
+should take a single string argument and return a string
+result.  The default value is `org-babel-trim'."
+  :group 'org-babel
+  :type 'function)
+
 (defun org-babel-find-file-noselect-refresh (file)
   "Find file ensuring that the latest changes on disk are
 represented in the file."
@@ -345,16 +353,18 @@ code blocks by language."
 		    (when (or (string= "both" (cdr (assoc :comments params)))
 			      (string= "org" (cdr (assoc :comments params))))
 		      ;; from the previous heading or code-block end
-		      (buffer-substring
-		       (max (condition-case nil
-				(save-excursion
-				  (org-back-to-heading t) (point))
-			      (error 0))
-			    (save-excursion
-			      (re-search-backward
-			       org-babel-src-block-regexp nil t)
-			      (match-end 0)))
-		       (point))))
+		      (funcall
+		       org-babel-process-comment-text
+		       (buffer-substring
+			(max (condition-case nil
+				 (save-excursion
+				   (org-back-to-heading t) (point))
+			       (error 0))
+			     (save-excursion
+			       (re-search-backward
+				org-babel-src-block-regexp nil t)
+			       (match-end 0)))
+			(point)))))
 		   by-lang)
 	      ;; add the spec for this block to blocks under it's language
 	      (setq by-lang (cdr (assoc src-lang blocks)))
@@ -396,12 +406,11 @@ form
 				     (eval el))))
 			    '(start-line file link source-name))))
     (flet ((insert-comment (text)
-            (let ((text (org-babel-trim text)))
-	      (when (and comments (not (string= comments "no"))
-			 (> (length text) 0))
-		(when padline (insert "\n"))
-		(comment-region (point) (progn (insert text) (point)))
-		(end-of-line nil) (insert "\n")))))
+            (when (and comments (not (string= comments "no"))
+		       (> (length text) 0))
+	      (when padline (insert "\n"))
+	      (comment-region (point) (progn (insert text) (point)))
+	      (end-of-line nil) (insert "\n"))))
       (when comment (insert-comment comment))
       (when link-p
 	(insert-comment
-- 
1.7.4.1


[-- Attachment #3: Type: text/plain, Size: 32721 bytes --]


>
>> Finally I'm not sure I fully understand what you mean by ...
>
> Sorry, I wasn't clear. It's a small thing. If you put
> '#+tangle' in column 0, the line is not exported because it
> begins with #; if you put #+ tangle on a line (spaces
> after + and possibly before #), the line is not exported
> because it begins with #+; but if you put #+tangle (no
> spaces after the + but spaces before the #), the line is
> exported. I think it would be useful if something like
>  #+tangle's (with no spaces between the # and +) were
> *not* exported because such lines can support
> useful customizations. Having to put the spaces after the +
> is a bit bothersome and looks uglier to me.
>

Hmm, but #+tangle is not an official Org-mode directive in the same way
that #+source:, #+headers:, and #+call: are.  Unless I'm forgetting
something, #+tangle: lines would have no functional effect, in which case
why not just use a normal org-mode comment (e.g., a line starting with
"# ")?

>
>> ..., it was a long email.
>
> Yeah, sorry. :) Thanks for slogging through.
>

no problem at all, didn't mean this as a complaint :)

Cheers -- Eric

>
>   Best,
>
>     Christopher
>
>
> On Thu, Sep 15, 2011 at 11:49, Eric Schulte <schulte.eric@gmail.com> wrote:
>
>> Hi Christopher,
>>
>> Thank you for the thorough examples and for suggesting fixes.  I would
>> like to apply your "simple fix" immediately, and the resulting patch
>> should be small enough (less than 10 lines of changes) that it can be
>> applied without FSF assignment -- although I would encourage you to
>> begin the FSF assignment process if you anticipate potentially
>> contributing more fixes in the future.  Could you please send a git
>> format-patch version of the simple fix to the list so that I might apply
>> it?
>>
>> I like the idea of introducing a customizable function for comment text
>> transformation, however I'm not sure that the temporary-buffer mechanics
>> need to be included by default, rather perhaps we should just leave the
>> default value of this function as simple as possible and allow users to
>> customize it to be as simple or complex as they wish.  Perhaps a change
>> like the following, where the call to `org-babel-trim' in
>> `org-babel-spec-to-string' is removed.
>>
>> #+begin_src emacs-lisp
>>   (buffer-substring
>>    (max (condition-case nil
>>             (save-excursion
>>               (org-back-to-heading t) (point))
>>           (error 0))
>>         (save-excursion
>>           (re-search-backward
>>            org-babel-src-block-regexp nil t)
>>           (match-end 0)))
>>    (point))
>>
>>   ;; | becomes
>>   ;; v
>>
>>   (funcall
>>    org-babel-process-comment-text
>>    (buffer-substring
>>     (max (condition-case nil
>>              (save-excursion
>>                (org-back-to-heading t) (point))
>>            (error 0))
>>          (save-excursion
>>            (re-search-backward
>>             org-babel-src-block-regexp nil t)
>>            (match-end 0)))
>>     (point)))
>>
>>   ;; where
>>
>>   (defcustom org-babel-process-comment-text #'org-babel-trim
>>     "Customizable function for processing comment text."
>>     :group 'org-babel
>>     :type 'function)
>> #+end_src
>>
>> This change may end up being more than 10 lines long, but a patch would
>> still be welcome, otherwise if the solution I sketched out above sounds
>> reasonable I could compose a patch and then share it with you for double
>> checking before it is applied.
>>
>> Finally I'm not sure I fully understand what you mean by
>>
>> >
>> > [fn:2] A feature request: I would propose that the =#+tangle:= construct
>> > be recognized as non-exported even with spaces preceding the =#= and no
>> > spaces after the =+=. This would enable a variety of interesting
>> > customization for tangled comments. Alternatively, a generic construct
>> > such as =#+noop:= or =#+generic:= could be valuable for user-based
>> > tags in an org file that serve a similar purpose: allowing customized
>> > processing without being exported directly.
>> >
>>
>> Please do let me know if I missed anything, it was a long email.
>>
>> Thanks for contributing! -- Eric
>>
>> Christopher Genovese <genovese.cr@gmail.com> writes:
>>
>> > /Semi-verbose Preamble/.
>> > Having recently begun intensive use of org-mode for tangling source
>> > files, I encountered four issues related to comment extraction (two
>> > bugs, one undesirable behavior, and one ... unfulfilled need), which I
>> > describe in detail below. I started by creating an org file that would
>> > reproduce the problems, and soon started /describing/ the problems in
>> > the org file as well as putting my fixes in the source blocks. At the
>> > risk of it being too meta or annoying, I've included that org file at
>> > the end of this message as the problem description. All the details are
>> > there as well as two fixes. Tangling that file in various ways described
>> > demonstrates the problems, and you can export to PDF for nicer reading.
>> > (I've attached the PDF to this mail for convenience. It looks good;
>> > kudos, org-mode!) I've also attached a tarball with files that make it
>> > easy to try my changes and to restore the original behavior, as well as
>> > tests and results from the org file for easy comparison. See the
>> > included README.
>> >
>> > I've been using the revised code now for a few days. It fixes the
>> > problems I describe, and I think it provides a flexible framework for
>> > comment extraction with minimal change to the base code. If the reaction
>> > to this is positive, I will happily submit a patch, sign paperwork, or
>> > whatever is needed, after fixing any problems that you all see. In any
>> > case, I very much look forward to any feedback you can offer. Thanks.
>> >
>> >   -- Christopher
>> >
>> > P.S. In case the attachments get dropped, I've put the PDF and the
>> >      tarball at
>> >
>> >      http://www.stat.cmu.edu/~genovese/depot/tangle-error.pdf
>> >      http://www.stat.cmu.edu/~genovese/depot/tangle-bundle.tgz
>> >
>> > /Problem Description/
>> > ################ Cut Here ####################
>> > # -*- org-return-follows-link: t; comment-empty-lines: t; -*-
>> > #+TITLE: Tangle this file: four issues with org-babel-tangle
>> > #+AUTHOR: Christopher Genovese
>> > #+DATE: 14 Sep 2011\vspace*{-0.5cm}
>> > #+EMAIL: genovese@cmu.edu
>> > #+OPTIONS: toc:1 H:1
>> > #+BABEL: :tangle yes :comments org :results silent :exports code
>> > #+BIND: org-export-latex-hyperref-format "\\hyperlink{%s}{%s}"
>> > #+STARTUP: showall
>> > #+LATEX_HEADER: \usepackage[labelsep=period,labelfont=bf]{caption}
>> > #+LaTeX: \vspace*{-1cm}\vspace*{-2pt}
>> >
>> >
>> > * Four Related Issues with org-babel-tangle
>> >
>> >   Running org-mode version 7.7 on both GNU Emacs 23.2.1 and 24.0.50.1
>> >   with Mac OS X 10.5.8 (with and without a -Q option), I encountered the
>> >   following issues/problems/bugs when tangling files with code blocks
>> >   for which the :comments header argument is org:
>> >
>> >   1. The subtree associated with the very first code block has
>> >      its headline tangled without leading stars, but all subsequent
>> >      sub-trees associated with code blocks have the leading stars
>> >      included in the comments.
>> >
>> >   2. If the first code block comes before the first headline, the start
>> >      of the comment text will be determined by pre-existing match data
>> >      and thus will likely be incorrect.
>> >
>> >   3. Org structural elements such as headline stars, =#+= control lines,
>> >      and drawers are included in the comments.
>> >
>> >   4. There is no easy way to delimit comment text or transform it that
>> >      does not also change the structure of the org file (e.g., by adding
>> >      headlines or source blocks).
>> >
>> >   Issues 1 and 2 seem to be genuine bugs. Issues 3 and 4 are more
>> >   subjective, I admit, but seem undesirable. Stars, drawers, and
>> >   control lines are org structure rather than content, and they
>> >   are often inappropriate in comments.
>> >
>> >   To reproduce the behaviors for issues 1 and 3, look at the result of
>> >   tangling this file. To reproduce issue 2 as well, remove the first two
>> >   stars from this file and tangle again. Alternatively, within emacs,
>> >   evaluate the =(buffer-substring ...)= sexp from the original-code code
>> >   below at the beginning character of a source block. (You
>> >   can also export this file to PDF for more pleasant reading.)
>> >
>> >   Below, I give details on these issues and code for two fixes:
>> >   a [[simple-fix][simple fix]] that handles the first two issues and
>> >   the stars in the third, and a [[preferred-fix][better fix]] that
>> >   handles all four issues in a more modular, customizable framework.
>> >   I'd be interested in hearing feedback on all of this. If the
>> >   reaction is positive, I will gladly submit a patch. Thanks for your
>> >   consideration.
>> >
>> >
>> >
>> > * The Original Code and Details on the Problem
>> >
>> >   The relevant section of the original code from org-mode version 7.7
>> >   is shown below, comprising lines 344 through 357 of ob-tangle.el in
>> >   function =org-babel-tangle-collect-blocks=. With point at a
>> >   =#+begin_src=, it scans back either for a heading line or for the
>> >   end of the previous code block, whichever comes later. The
>> >   resulting region becomes the comment text.
>> >
>> >   #+latex: \begin{figure}[h]
>> >   #+latex: \hypertarget{original-code}{} % <<original-code>>
>> >   #+source: original-code
>> >   #+begin_src emacs-lisp
>> >     (comment
>> >      (when (or (string= "both" (cdr (assoc :comments params)))
>> >                (string= "org" (cdr (assoc :comments params))))
>> >        ;; from the previous heading or code-block end
>> >        (buffer-substring
>> >         (max (condition-case nil
>> >                  (save-excursion
>> >                    (org-back-to-heading t) (point))
>> >                (error 0))
>> >              (save-excursion
>> >                (re-search-backward
>> >                 org-babel-src-block-regexp nil t)
>> >                (match-end 0)))
>> >         (point))))
>> >   #+end_src
>> >   #+latex: \caption{Original Code, lines 344--357 in {\tt ob-tangle.el}.}
>> >   #+latex: \label{fig::original-code}
>> >   #+latex: \end{figure}
>> >
>> >   /Issue 1/. When in the first code block in the file, the
>> >   second search fails (there is no previous code block),
>> >   so the =(match-end 0)= call uses the match data from the (implicit)
>> >   match during =org-back-to-heading=, which skips the stars. (Not a
>> >   particularly transparent reference, incidentally.) For subsequent
>> >   blocks, the =(match-end 0)= gives the end of the previous code
>> >   block, which in these examples is earlier than the previous
>> >   headline location.
>> >
>> >   /Issue 2/. When the first code block lies before the first headline
>> >   (say with some text before it), the searches fail in /both/ clauses of
>> >   the max. So, the =match-end= will return an essentially arbitrary
>> >   result, which is a bug.
>> >
>> >   /Issue 3/. =org-back-to-heading= leaves point at the beginning of the
>> >   stars, so a headline included in the text will have stars, except for
>> >   the first one.
>> >
>> >   /Issue 4/. Control lines at the end of the previous code block and
>> >   before point are not filtered out and so are included in the comments.
>> >
>> >
>> > * A Simple Fix for the First Three Issues
>> >
>> >   A small change addresses issues 1, 2, and the stars for issue 3: in
>> >   both cases, simply use the =match-end= and replace 0 values with
>> >   =(point-min)=. The latter gives a sensible result even if both
>> >   computed positions are trivial (as when the first code block comes
>> >   before the first headline) and respects narrowing.
>> >
>> >   #+latex: \begin{figure}[h]
>> >   #+latex: \hypertarget{simple-fix}{} % <<simple-fix>>
>> >   #+begin_src emacs-lisp
>> >     (comment
>> >      (when (or (string= "both" (cdr (assoc :comments params)))
>> >                (string= "org" (cdr (assoc :comments params))))
>> >        ;; from the previous heading or code-block end
>> >        (buffer-substring
>> >         (max (condition-case nil
>> >                  (save-excursion
>> >                    (org-back-to-heading t)  ; sets match data
>> >                    (match-end 0))
>> >                (error (point-min)))
>> >              (save-excursion
>> >                (if (re-search-backward
>> >                     org-babel-src-block-regexp nil t)
>> >                    (match-end 0)
>> >                  (point-min))))
>> >         (point))))
>> >   #+end_src
>> >   #+latex: \caption{Simple Fix, replacement for lines 344--357 in {\tt ob-tangle.el}.}
>> >   #+latex: \label{fig::simple-fix}
>> >   #+latex: \end{figure}
>> >
>> > * A Fix for All Four Issues
>> >
>> >   A better fix that handles issues 1--4 starts with the region
>> >   computed as in the [[simple-fix][simple fix]] and then processes
>> >   that text through a user-configurable sequence of functions to
>> >   derive the final form of the comment text.
>> >
>> >   The following changes are required.
>> >
>> > ** Extract Initial Comment Text and State from Org Buffer
>> >
>> >    The initial comment text ranges from either the most recent headline
>> >    at the point after the stars, the beginning of the line after the
>> >    =#+end_src= of the most recent code block, or the beginning of the
>> >    buffer, whichever is later, through the line before the source
>> >    block.[fn:1]
>> >
>> >    The [[preferred-fix][code]] to extract this is given below.
>> >    #+latex: (See Figure \ref{fig::preferred-fix}.)
>> >    It replaces lines 344 through 357 of
>> >    =ob-tangle.el= from org-mode version 7.7 in the function
>> >    =org-babel-tangle-collect-blocks=.
>> >
>> >    #+latex: \begin{figure}[h]
>> >    #+latex: \hypertarget{preferred-fix}{} % <<preferred-fix>>
>> >    #+begin_src emacs-lisp
>> >      (comment
>> >       (when (or (string= "both" (cdr (assoc :comments params)))
>> >                 (string= "org" (cdr (assoc :comments params))))
>> >         (let* ((prev-heading
>> >                 (condition-case nil
>> >                     (save-excursion
>> >                       (org-back-to-heading t) ; sets match data
>> >                       (match-end 0))
>> >                   (error (point-min))))
>> >                (end-of-prev-src-block
>> >                 (save-excursion
>> >                   (if (null (re-search-backward
>> >                              org-babel-src-block-regexp nil t))
>> >                       (point-min)
>> >                     (goto-char (match-end 0))
>> >                     (forward-line 1)
>> >                     (point))))
>> >                (comment-start
>> >                 (max prev-heading end-of-prev-src-block))
>> >                (comment-end
>> >                 (save-excursion
>> >                   (forward-line 0)
>> >                   (point)))
>> >                (state
>> >                 (list (cons 'org-drawers
>> >                             org-drawers)
>> >                       (cons 'after-heading
>> >                             (= comment-start prev-heading))
>> >                       (cons 'first-line
>> >                             (= comment-start (point-min))))))
>> >           (org-babel-process-comment-text
>> >            (buffer-substring comment-start comment-end) state))))
>> >    #+end_src
>> >    #+latex: \caption{Better Fix, replacement for lines 344--357 in {\tt ob-tangle.el}.}
>> >    #+latex: \label{fig::preferred-fix}
>> >    #+latex: \end{figure}
>> >
>> > ** Adjust =org-babel-spec-to-string=
>> >
>> >    The comment block collected by the [[original-code][original code]]
>> >    #+latex: (Figure \ref{fig::original-code})
>> >    in =org-babel-tangle-collect-blocks= is further processed in \newline
>> >    =org-babel-spec-to-string= to trim leading and trailing whitespace
>> >    from the string. This was needed because spaces after a source block were
>> >    included in the comment. In the revised code, however, this space
>> >    trimming is handled during text transformation, except for removing
>> >    trailing newlines. (Note: trailing /spaces/ are not removed to allow
>> >    more flexibility in comment processing.) Hence,
>> >    =org-babel-spec-to-string= needs to be slightly adjusted.
>> >    #+latex: See Figure \ref{fig::spec-string-diff}.
>> >
>> >    #+latex: \begin{figure}[h]
>> >    #+begin_example
>> >      --- ob-tangle.el           2011-09-14 11:48:26.000000000 -0400
>> >      +++ new-ob-tangle.el       2011-09-14 11:55:56.000000000 -0400
>> >      @@ -398,3 +398,3 @@
>> >           (flet ((insert-comment (text)
>> >      -            (let ((text (org-babel-trim text)))
>> >      +            (let ((text (org-babel-chomp text "[\f\t\n\r\v]")))
>> >               (when (and comments (not (string= comments "no"))
>> >    #+end_example
>> >    #+latex: \caption{Changes to {\tt org-babel-spec-to-string} in {\tt ob-tangle.el}, unified diff, one line of context.}
>> >    #+latex: \label{fig::spec-string-diff}
>> >    #+latex: \end{figure}
>> >
>> > ** Process Comment Text Through Sequence of Transforms
>> >
>> >    At the end of the revised [[preferred-fix][comment collection code]],
>> >    the comment text is passed to
>> >    =org-babel-process-comment-text= which
>> >    applies a sequence of transformation functions.
>> >    #+latex: (See Figure \ref{fig::comment-transformation}.)
>> >    The list of transformation functions is stored in a customizable
>> >    variable described
>> >    [[Define Customization Variable for Transforms][below]]. Several
>> >    predefined transformations are given
>> >    [[Define A Collection of Transform Functions][below]] as well.
>> >
>> >    #+latex: \begin{figure}[h]
>> >    #+begin_src emacs-lisp
>> >      (defun org-babel-process-comment-text (text &optional state)
>> >        "Apply list of transforms to comment TEXT.
>> >      Bindings are taken from the alist STATE.
>> >      Returns the modified text string, which may have text properties.
>> >      See `org-babel-comment-processing-functions' for the transforms
>> >      to be applied and details on the allowed keys in the STATE alist."
>> >        (let ((funcs org-babel-comment-processing-functions))
>> >          (with-temp-buffer
>> >            (insert text)
>> >            (let ((org-drawers
>> >                   (or (cdr (assoc 'org-drawers state))
>> >                       org-drawers))
>> >                  (after-heading
>> >                   (cdr (assoc 'after-heading state)))
>> >                  (first-line
>> >                   (cdr (assoc 'first-line state))))
>> >              (while funcs
>> >                (goto-char (point-min))
>> >                (funcall (car funcs))
>> >                (setq funcs (cdr funcs))))
>> >            (buffer-substring (point-min) (point-max)))))
>> >    #+end_src
>> >    #+latex: \caption{Better Fix, comment transformation driver.}
>> >    #+latex: \label{fig::comment-transformation}
>> >    #+latex: \end{figure}
>> >
>> > ** Define Customization Variable for Transforms
>> >
>> >    A list of nullary functions applied in order to
>> >    the comment text. The text is inserted
>> >    in a temporary buffer, so these functions can use
>> >    the entire Emacs library for operating on buffer text.
>> >    #+latex: See Figure \ref{fig::comment-transformation-function-list}.
>> >
>> >    #+latex: \begin{figure}[h]
>> >    #+begin_src emacs-lisp
>> >      (defcustom org-babel-comment-processing-functions
>> >        '(org-babel-comment-delete-file-variables-line
>> >          org-babel-comment-delete-org-control-lines
>> >          org-babel-comment-delete-drawers
>> >          org-babel-comment-trim-blank-lines
>> >          org-babel-comment-trim-indent-prefix)
>> >        "List of functions to transform source-block comment text.
>> >      The transforms are applied before the comment is inserted.
>> >      Each function will be called with no arguments with point at the
>> >      beginning of a buffer containing only the comment text. Each
>> >      function can modify the text at will and leave point anywhere,
>> >      but it should *not* modify the narrowing state of the buffer.
>> >      Several dynamic state variables are set prior to execution that
>> >      each function can reference. These currently include:
>> >
>> >         + org-drawers:   names of drawers in the original org buffer.
>> >         + after-heading: t if comment starts at an org heading line,
>> >                          nil otherwise.
>> >         + first-line:    t if initial comment starts on first line
>> >                          of the original org buffer, nil otherwise.
>> >
>> >      If a function changes the value of these state variables, the new
>> >      value will be seen by all following functions in the list, but
>> >      this is not generally recommended.
>> >
>> >      The functions in this list are called *in order*, and this order
>> >      can influence the form of the resulting comment text."
>> >        :group 'org-babel
>> >        :type '(repeat function))
>> >    #+end_src
>> >    #+latex: \caption{Better Fix, customizable transform list.}
>> >    #+latex: \label{fig::comment-transformation-function-list}
>> >    #+latex: \end{figure}
>> >
>> > ** Define A Collection of Transform Functions
>> >
>> >    An advantage of this design is that transformation of the
>> >    comments is modular and customizable. We can include in
>> >    =ob-tangle.el= a collection of pre-defined transforms.
>> >    The default processing stream in
>> >    =org-babel-comment-processing-functions= is as follows:
>> >
>> >    1. Delete a file-variables line if on the first line of the buffer.
>> >    2. Delete all org control lines from the comment text.
>> >    3. Delete all drawers and their contents.
>> >    4. Trim blank lines from the beginning and end.
>> >    5. Reindent the text by removing the longest common leading
>> >       string of spaces.
>> >
>> >    #+ TANGLE: end-comment
>> >    These and several other useful transforms are given below
>> >    (e.g., deleting drawer delimiters but not contents).
>> >    #+latex: See Figures \ref{fig::transformA}--\ref{fig::transformZ}.
>> >    It is easy to define new transforms; any function that
>> >    operates on text in the current buffer beginning at point-min
>> >    will work.
>> >
>> >    #+latex: \begin{figure}[h]
>> >    #+begin_src emacs-lisp
>> >      (defun org-babel-comment-delete-file-variables-line ()
>> >        "Delete file variables comment line if at beginning of buffer.
>> >      This only checks the first line of the buffer, and so should be
>> >      placed first (or at least early enough) in the list
>> >      `org-babel-comment-processing-functions' to ensure that no
>> >      other text has been inserted earlier."
>> >        (when (and first-line
>> >                   (looking-at ; file-variables line
>> >                    "^#[ \t]*-\\*-.*:.*;[ \t]*-\\*-[ \t]*$"))
>> >          (let ((kill-whole-line t))
>> >            (kill-line))))
>> >    #+end_src
>> >    #+latex: \caption{Comment Transform.}
>> >    #+latex: \label{fig::transformA}
>> >    #+latex: \end{figure}
>> >
>> >    #+latex: \begin{figure}[h]
>> >    #+begin_src emacs-lisp
>> >      (defun org-babel-comment-delete-org-control-lines ()
>> >        "Remove all org #+ control lines from comment."
>> >        (let ((control-regexp "^[ \t]*#\\+.*$"))
>> >          (delete-matching-lines control-regexp)))
>> >    #+end_src
>> >    #+latex: \caption{Comment Transform.}
>> >    #+latex: \end{figure}
>> >    #+latex: \begin{figure}[h]
>> >    #+begin_src emacs-lisp
>> >      (defun org-babel-comment-delete-org-in-buffer-settings ()
>> >        "Remove org #+ in-buffer setting lines, leaving other control lines.
>> >      In-buffer setting lines begin with #+ and have all caps keyword
>> >      names."
>> >        (let ((setting-regexp "^#\\+[ \t]*[A-Z_]+:.*$"))
>> >          (delete-matching-lines setting-regexp)))
>> >    #+end_src
>> >    #+latex: \caption{Comment Transform.}
>> >    #+latex: \end{figure}
>> >    #+latex: \begin{figure}[h]
>> >    #+begin_src emacs-lisp
>> >      (defun org-babel-comment-delete-drawers ()
>> >        "Delete drawer delimiters and contents from comment.
>> >      Drawer names are restricted to those in the `org-drawers' state."
>> >        (let ((drawer-start-regexp
>> >               (format "^[ \t]*:\\(?:%s\\):[ \t]*$"
>> >                       (mapconcat 'identity
>> >                                  org-drawers
>> >                                  "\\|")))
>> >              (drawer-end-regexp "^[ \t]*:END:[ \t]*$"))
>> >          (while (re-search-forward drawer-start-regexp nil t)
>> >            (let ((beg (save-excursion
>> >                         (forward-line 0)
>> >                         (point)))
>> >                  (end (save-excursion
>> >                         (re-search-forward drawer-end-regexp nil t)
>> >                         (forward-line 1)
>> >                         (point))))
>> >              (goto-char end)
>> >              (delete-region beg end)))))
>> >    #+end_src
>> >    #+latex: \caption{Comment Transform.}
>> >    #+latex: \end{figure}
>> >    #+latex: \begin{figure}[h]
>> >    #+begin_src emacs-lisp
>> >      (defun org-babel-comment-delete-drawer-delimiters ()
>> >        "Delete drawer delimiters from comment leaving content.
>> >      Drawer names are restricted to those given by the `org-drawers'
>> >      state."
>> >        (let ((drawer-delim-regexp
>> >               (format "^[ \t]*:\\(?:%s\\)"
>> >                       (mapconcat 'identity
>> >                                  (cons "END" org-drawers)
>> >                                  "\\|"))))
>> >          (delete-matching-lines drawer-delim-regexp)))
>> >    #+end_src
>> >    #+latex: \caption{Comment Transform.}
>> >    #+latex: \end{figure}
>> >    #+latex: \begin{figure}[h]
>> >    #+begin_src emacs-lisp
>> >      (defun org-babel-comment-trim-blank-lines ()
>> >        "Trim whitespace-only lines from beginning and end of text."
>> >        (while (and (looking-at "^[ \t\f]*$")
>> >                    (< (point) (point-max)))
>> >          (forward-line 1))
>> >        (delete-region (point-min) (point))
>> >        (when (< (point) (point-max))
>> >          (goto-char (point-max))
>> >          (let ((last-point (point)))
>> >            (forward-line 0)
>> >            (while (and (looking-at "^[ \t\f]*$")
>> >                        (> (point) (point-min)))
>> >              (setq last-point (point))
>> >              (forward-line -1))
>> >            (delete-region last-point (point-max)))))
>> >    #+end_src
>> >    #+latex: \caption{Comment Transform.}
>> >    #+latex: \end{figure}
>> >    #+latex: \begin{figure}[h]
>> >    #+begin_src emacs-lisp
>> >      (defun org-babel-comment-trim-indent-prefix ()
>> >        "Remove longest common leading prefix of spaces from each line.
>> >      Prefix is computed from the initial whitespace on each line with
>> >      tabs converted to spaces, preserving indentation."
>> >        (let* ((common-indent nil)
>> >               (common-length (1+ (- (point-max) (point-min))))
>> >               (current-indent "")              ; enter first loop
>> >               (current-length common-length))  ; skip first assignment
>> >          (goto-char (point-min))
>> >          (while current-indent
>> >            (when (< current-length common-length)
>> >              (setq common-indent current-indent
>> >                    common-length current-length))
>> >            (setq current-indent
>> >                  (let* ((found (re-search-forward
>> >                                 "^\\([ \t]*\\)\\S-" nil t))
>> >                         (bol (match-beginning 0))
>> >                         (eos (match-end 1))
>> >                         (space-str (match-string 1))
>> >                         (indent-tabs-mode nil))
>> >                    (cond
>> >                     ((not found)
>> >                      nil)
>> >                     ((not (string-match "\t" space-str))
>> >                      space-str)
>> >                     (t                       ; detabify indent string
>> >                      (goto-char eos)
>> >                      (let ((col (current-column)))
>> >                        (delete-region bol eos)
>> >                        (indent-to col))
>> >                      (buffer-substring-no-properties bol (point))))))
>> >            (setq current-length (length current-indent)))
>> >          (when (and common-indent (> common-length 0))
>> >            (let ((indent-re (concat "^" common-indent)))
>> >              (goto-char (point-min))
>> >              (while (re-search-forward indent-re nil t)
>> >                (replace-match "" nil nil))))))
>> >    #+end_src
>> >    #+latex: \caption{Comment Transform.}
>> >    #+latex: \label{fig::transformZ}
>> >    #+latex: \end{figure}
>> >    #+latex: \end{itemize}
>> >    #+latex: \noindent
>> >    This kind of customization offers some nice possibilities,
>> >    including controlling indentation, eliminating or
>> >    transforming org markup, eliminating trailing whitespace, and
>> >    automating specialized comment formatting (e.g., javadoc). As
>> >    an additional illustration, consider the transform
>> >    =org-babel-comment-restrict-comment-range=
>> >    #+latex: in Figure \ref{fig::transform-illustration}
>> >    below. The idea is that it is sometimes useful to select
>> >    from the text under a headline a /part/ of the text for
>> >    the comment. We want some org markup that will not affect
>> >    either the export or the structure of the org file itself.
>> >    To do this, we use the fact that =#+=\textvisiblespace
>> >    lines are not exported.[fn:2] So, we can /de facto/ use the
>> >    =#+ TANGLE:= construct to control various aspects of tangling.
>> >    Here, we use the =#+ TANGLE: start-comment= and
>> >    =#+ TANGLE: end-comment= to delimit the comment text.
>> >    (This function needs to come earlier in the function list than the
>> >    functions that eliminate org control lines. It is sufficient to
>> >    prepend it to that list.) This is used in the current file,
>> >    for example.
>> >
>> >    #+latex: \begin{figure}[h]
>> >    #+begin_src emacs-lisp
>> >      (defun org-babel-comment-restrict-comment-range ()
>> >        "Remove comment text outside start-comment and end-comment delimiters.
>> >      Comment delimiters are #+TANGLE lines with respective keywords
>> >      start-comment and end-comment. The #+TANGLE lines are also
>> >      deleted. To be effective, this function should be positioned in
>> >      the list `org-babel-comment-processing-functions' before any
>> >      functions that remove org control lines or process other
>> >      co-occurring attributes of #+TANGLE lines."
>> >        (when (re-search-forward
>> >               "^[ \t]*#\\+[ \t]*TANGLE:.*start-comment.*$" nil t)
>> >          (forward-line 1)
>> >          (delete-region (point-min) (point)))
>> >        (when (re-search-forward
>> >               "^[ \t]*#\\+[ \t]*TANGLE:.*end-comment.*$" nil t)
>> >          (forward-line 0)
>> >          (delete-region (point) (point-max))))
>> >    #+end_src
>> >    #+latex: \caption{Transform to illustrate some customization possibilities.}
>> >    #+latex: \label{fig::transform-illustration}
>> >    #+latex: \end{figure}
>> >    #+latex: \begin{itemize}
>> >
>> > [fn:1] In the original code and in the simple fix above, the comment
>> > starts /immediately/ after the =#+end_src= rather than at the start of
>> > the next line. Starting at the next line seems more natural to me
>> > because the comment being constructed relates to the /following/ code
>> > block. But the original behavior is easily restored if people disagree.
>> >
>> > [fn:2] A feature request: I would propose that the =#+tangle:= construct
>> > be recognized as non-exported even with spaces preceding the =#= and no
>> > spaces after the =+=. This would enable a variety of interesting
>> > customization for tangled comments. Alternatively, a generic construct
>> > such as =#+noop:= or =#+generic:= could be valuable for user-based
>> > tags in an org file that serve a similar purpose: allowing customized
>> > processing without being exported directly.
>> >
>> >
>>
>> --
>> Eric Schulte
>> http://cs.unm.edu/~eschulte/
>>

-- 
Eric Schulte
http://cs.unm.edu/~eschulte/

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: Four issues with org-babel-tangle
  2011-09-15 22:02     ` Eric Schulte
@ 2011-09-16  4:31       ` Christopher Genovese
  2011-09-16 15:51         ` Eric Schulte
  0 siblings, 1 reply; 6+ messages in thread
From: Christopher Genovese @ 2011-09-16  4:31 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 5109 bytes --]

> I'll write up this change as it may end up being longer than 10 lines,
> and if I write it we don't have to wait for your FSF assignment to clear
> (which can sometimes take months) before applying the patch.

That sounds good, thanks.

> In fact... if this attached patch looks good to you (i.e., allows the
> behavior you originally intended) then please let me know and I'll apply
> it immediately.

Ideally, I'd like to combine the customizable processing with the
simple fix code (which eliminates the two related bugs and the
extra *s).  Something like the following in place of the corresponding
section in the patch you sent.  The extra (match-end 0) and (point-min)'s
prevent those problems. Otherwise, it all looks great.

+       (funcall
+        org-babel-process-comment-text
+        (buffer-substring
+         (max (condition-case nil
+                  (save-excursion
+                    (org-back-to-heading t)    ; sets match data
+                    (match-end 0))
+                (error (point-min)))
+              (save-excursion
+                (if (re-search-backward
+                     org-babel-src-block-regexp nil t)
+                    (match-end 0)
+                  (point-min))))
+         (point)))))
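
The boundary logic in the snippet above can be tried outside Org with
the rough sketch below. It is a stand-in, not the patch itself: the
function name my/comment-text-before-point is hypothetical, and two
simplified regexps replace org-back-to-heading and
org-babel-src-block-regexp. The idea is the same: take the later of
"end of the previous heading line" and "end of the previous source
block", and fall back to (point-min) when either is missing.

```elisp
;; Sketch only: simplified regexps stand in for Org's own heading and
;; source-block matching; the function name is hypothetical.
(defun my/comment-text-before-point ()
  "Return the text between the nearest preceding boundary and point.
The boundary is the later of the end of the previous heading line
and the end of the previous \"#+end_src\" line."
  (buffer-substring
   (max (save-excursion
          ;; End of the previous heading line, so the stars and the
          ;; heading text are excluded; buffer start if none exists.
          (if (re-search-backward "^\\*+ .*$" nil t)
              (match-end 0)
            (point-min)))
        (save-excursion
          ;; End of the previous source block; buffer start if none.
          (if (re-search-backward "^#\\+end_src[ \t]*$" nil t)
              (match-end 0)
            (point-min))))
   (point)))

;; Usage: point is at the end of the buffer after the insert.
(with-temp-buffer
  (insert "* Heading\nSome prose.\n#+begin_src c\nint x;\n#+end_src\n"
          "Comment for next block.\n")
  (my/comment-text-before-point))
```

Using (match-end 0) rather than the match start keeps the heading stars
and the #+end_src line out of the extracted text, and the (point-min)
fallbacks avoid an error when a file has no earlier heading or block.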

I'm happy to take a look at the patch again anytime.

> Hmm, but #+tangle is not an official Org-mode directive in the same way
> that #+source:, #+headers:, and #+call: are.  Unless I'm forgetting
> something #+tangle: lines would have no functional effect, in which case
> why not just use a normal org-mode comment (e.g., a line starting with
> "# ").

You're right, I agree. I'm just being particular about indentation.
I don't like to have a line starting with # when everything else is
indented. And I don't like having to put a space after the #+ to
prevent export, so I just wanted #+tangle (or #+noop or #+comment or
whatever) to count as a non-exported comment too, just like #+ tangle
would. But I can see that it's not worth the effort or the confusion
with a functional directive that it would cause. I'll just suck it up
and use the extra space.

Thanks again, Eric.

   Best,

     Christopher


On Thu, Sep 15, 2011 at 18:02, Eric Schulte <schulte.eric@gmail.com> wrote:

>  Christopher Genovese <genovese.cr@gmail.com> writes:
>
> > Hi Eric,
> >
> >    Thanks for your note.
> >
> >> I would encourage you to begin the FSF assignment process if
> >> you anticipate potentially contributing more fixes in the
> >> future. Could you please send a git format-patch version of
> >> the simple fix to the list so that I might apply it?
> >
> >    I will begin the FSF assignment process, and I will send a git-format
> > patch based on the simple fix. (I'll send that tonight.)
> >
>
> Fantastic.
>
> >
> >> I like the idea of introducing a customizable function for
> >> comment text transformation, however ... rather perhaps we
> >> should just leave the default value of this function as
> >> simple as possible and allow users to customize it ....
> >
> >    That makes sense, and I like the way you did it. In particular,
> > I absolutely agree that the org-babel-trim should be removed
> > from org-babel-spec-to-string (to allow flexibility in the
> > customization).
> > Making it the default processor works well, I think.
> >
> >    Would you like me to submit a separate patch based on this change
> > or should I include that as part of the patch with the simple fix?
> >
>
> I'll write up this change as it may end up being longer than 10 lines,
> and if I write it we don't have to wait for your FSF assignment to clear
> (which can sometimes take months) before applying the patch.
>
> In fact... if this attached patch looks good to you (i.e., allows the
> behavior you originally intended) then please let me know and I'll apply
> it immediately.
>
>
>
> >
> >> Finally I'm not sure I fully understand what you mean by ...
> >
> > Sorry, I wasn't clear. It's a small thing. If you put
> > '#+tangle' in column 0, the line is not exported because it
> > begins with #; if you put #+ tangle on a line (spaces
> > after + and possibly before #), the line is not exported
> > because it begins with #+; but if you put #+tangle (no
> > spaces after the + but spaces before the #), the line is
> > exported. I think it would be useful if something like
> >  #+tangle's (with no spaces between the # and +) were
> > *not* exported because such lines can support
> > useful customizations. Having to put the spaces after the +
> > is a bit bothersome and looks uglier to me.
> >
>
> Hmm, but #+tangle is not an official Org-mode directive in the same way
> that #+source:, #+headers:, and #+call: are.  Unless I'm forgetting
> something #+tangle: lines would have no functional effect, in which case
> why not just use a normal org-mode comment (e.g., a line starting with
> "# ").
>
>
> >
> >> ..., it was a long email.
> >
> > Yeah, sorry. :) Thanks for slogging through.
> >
>
> no problem at all, didn't mean this as a complaint :)
>
> Cheers -- Eric
>
>
>
> --
> Eric Schulte
> http://cs.unm.edu/~eschulte/
>
>

[-- Attachment #2: Type: text/html, Size: 6611 bytes --]


* Re: Four issues with org-babel-tangle
  2011-09-16  4:31       ` Christopher Genovese
@ 2011-09-16 15:51         ` Eric Schulte
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Schulte @ 2011-09-16 15:51 UTC (permalink / raw)
  To: Christopher Genovese; +Cc: emacs-orgmode

Christopher Genovese <genovese.cr@gmail.com> writes:

>> I'll write up this change as it may end up being longer than 10 lines,
>> and if I write it we don't have to wait for your FSF assignment to clear
>> (which can sometimes take months) before applying the patch.
>
> That sounds good, thanks.
>
>> In fact... if this attached patch looks good to you (i.e., allows the
>> behavior you originally intended) then please let me know and I'll apply
>> it immediately.
>
> Ideally, I'd like to combine the customizable processing with the
> simple fix code (which eliminates the two related bugs and the
> extra *s).  Something like the following in place of the corresponding
> section in the patch you sent.  The extra (match-end 0) and (point-min)'s
> prevent those problems. Otherwise, it all looks great.
>
> +       (funcall
> +        org-babel-process-comment-text
> +        (buffer-substring
> +         (max (condition-case nil
> +                  (save-excursion
> +                    (org-back-to-heading t)    ; sets match data
> +                    (match-end 0))
> +                (error (point-min)))
> +              (save-excursion
> +                (if (re-search-backward
> +                     org-babel-src-block-regexp nil t)
> +                    (match-end 0)
> +                  (point-min))))
> +         (point)))))
>
> I'm happy to take a look at the patch again anytime.
>

OK, I've just applied this simple change on top of the previously
discussed change directly to the Org-mode git repository.  Please let me
know if anything looks amiss.
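
For anyone wanting to try the customization, a setup along these lines
might work. This is only a sketch: it assumes that
org-babel-process-comment-text holds a function which receives the
extracted comment text and returns the string to emit. The function
name my/tangle-comment-text and the convention of dropping
"#+ tangle" lines are hypothetical, not part of Org.

```elisp
;; Hypothetical comment processor; the name and the line-dropping
;; convention are illustrative only.
(defun my/tangle-comment-text (text)
  "Return TEXT without \"#+ tangle\" lines and without outer whitespace."
  (let* ((no-markers
          ;; Drop whole lines such as "  #+ tangle: ..." (note the
          ;; space after the +), including their trailing newline.
          (replace-regexp-in-string
           "^[ \t]*#\\+ +tangle.*\n?" "" text))
         ;; Strip whitespace at the very start of the string.
         (no-lead
          (replace-regexp-in-string "\\`[ \t\n\r]+" "" no-markers)))
    ;; Strip whitespace at the very end of the string.
    (replace-regexp-in-string "[ \t\n\r]+\\'" "" no-lead)))

;; Assumes the tangle code calls this variable's value on the comment
;; text, as in the patch discussed in this thread.
(setq org-babel-process-comment-text #'my/tangle-comment-text)
```

The trimming is done inside the function because, with the trim removed
from org-babel-spec-to-string, a custom processor is responsible for
any whitespace cleanup it wants.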

>
>> Hmm, but #+tangle is not an official Org-mode directive in the same way
>> that #+source:, #+headers:, and #+call: are.  Unless I'm forgetting
>> something #+tangle: lines would have no functional effect, in which case
>> why not just use a normal org-mode comment (e.g., a line starting with
>> "# ").
>
> You're right, I agree. I'm just being particular about indentation.
> I don't like to have a line starting with # when everything else is
> indented. And I don't like having to put a space after the #+ to
> prevent export, so I just wanted #+tangle (or #+noop or #+comment or
> whatever) to count as a non-exported comment too, just like #+ tangle
> would. But I can see that it's not worth the effort or the confusion
> with a functional directive that it would cause. I'll just suck it up
> and use the extra space.
>

Probably the best way forward.  Sorry I can't be more help here.

Cheers -- Eric

>
> Thanks again, Eric.
>
>    Best,
>
>      Christopher
>
>
> On Thu, Sep 15, 2011 at 18:02, Eric Schulte <schulte.eric@gmail.com> wrote:
>
>>  Christopher Genovese <genovese.cr@gmail.com> writes:
>>
>> > Hi Eric,
>> >
>> >    Thanks for your note.
>> >
>> >> I would encourage you to begin the FSF assignment process if
>> >> you anticipate potentially contributing more fixes in the
>> >> future. Could you please send a git format-patch version of
>> >> the simple fix to the list so that I might apply it?
>> >
>> >    I will begin the FSF assignment process, and I will send a git-format
>> > patch based on the simple fix. (I'll send that tonight.)
>> >
>>
>> Fantastic.
>>
>> >
>> >> I like the idea of introducing a customizable function for
>> >> comment text transformation, however ... rather perhaps we
>> >> should just leave the default value of this function as
>> >> simple as possible and allow users to customize it ....
>> >
>> >    That makes sense, and I like the way you did it. In particular,
>> > I absolutely agree that the org-babel-trim should be removed
>> > from org-babel-spec-to-string (to allow flexibility in the
>> > customization).
>> > Making it the default processor works well, I think.
>> >
>> >    Would you like me to submit a separate patch based on this change
>> > or should I include that as part of the patch with the simple fix?
>> >
>>
>> I'll write up this change as it may end up being longer than 10 lines,
>> and if I write it we don't have to wait for your FSF assignment to clear
>> (which can sometimes take months) before applying the patch.
>>
>> In fact... if this attached patch looks good to you (i.e., allows the
>> behavior you originally intended) then please let me know and I'll apply
>> it immediately.
>>
>>
>>
>> >
>> >> Finally I'm not sure I fully understand what you mean by ...
>> >
>> > Sorry, I wasn't clear. It's a small thing. If you put
>> > '#+tangle' in column 0, the line is not exported because it
>> > begins with #; if you put #+ tangle on a line (spaces
>> > after + and possibly before #), the line is not exported
>> > because it begins with #+; but if you put #+tangle (no
>> > spaces after the + but spaces before the #), the line is
>> > exported. I think it would be useful if something like
>> >  #+tangle's (with no spaces between the # and +) were
>> > *not* exported because such lines can support
>> > useful customizations. Having to put the spaces after the +
>> > is a bit bothersome and looks uglier to me.
>> >
>>
>> Hmm, but #+tangle is not an official Org-mode directive in the same way
>> that #+source:, #+headers:, and #+call: are.  Unless I'm forgetting
>> something #+tangle: lines would have no functional effect, in which case
>> why not just use a normal org-mode comment (e.g., a line starting with
>> "# ").
>>
>>
>> >
>> >> ..., it was a long email.
>> >
>> > Yeah, sorry. :) Thanks for slogging through.
>> >
>>
>> no problem at all, didn't mean this as a complaint :)
>>
>> Cheers -- Eric
>>
>>
>>
>> --
>> Eric Schulte
>> http://cs.unm.edu/~eschulte/
>>
>>

-- 
Eric Schulte
http://cs.unm.edu/~eschulte/


end of thread, other threads:[~2011-09-16 15:57 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
2011-09-14 19:51 Four issues with org-babel-tangle Christopher Genovese
2011-09-15 15:49 ` Eric Schulte
2011-09-15 21:36   ` Christopher Genovese
2011-09-15 22:02     ` Eric Schulte
2011-09-16  4:31       ` Christopher Genovese
2011-09-16 15:51         ` Eric Schulte

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git
