emacs-orgmode@gnu.org archives
* [PATCH] Markup on same line as text
@ 2011-01-06 22:12 Roland Kaufmann
  2011-02-16 20:04 ` Hrvoje Niksic
  2011-02-17 21:58 ` Hrvoje Niksic
  0 siblings, 2 replies; 8+ messages in thread
From: Roland Kaufmann @ 2011-01-06 22:12 UTC (permalink / raw)
  To: hniksic; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1991 bytes --]

I just discovered a problem with colorization and references in code 
snippets due to the way org-mode and htmlize interact. Consider the 
org-mode fragment:

#+BEGIN_SRC emacs-lisp
(let ((x 42)) ; meaning of l.u.e.
   (print x))  ; (ref:2)
#+END_SRC

Without the reference on line 2, doing an org-export-as-html would 
generate markup like this:

(let ((x 42)) <span style="comment">; meaning of l.u.e.
</span>  (print x))

Note that htmlize keeps the newline character at the end of the first
line together with the text of the comment, and puts both inside the
span. The closing tag of the span that colorizes the comment thus ends
up on the next line.

When a reference is put on the next line, org-mode will subsequently add 
markup to highlight each line, so the markup ends up like this:

(let ((x 42)) <span style="comment">; meaning of l.u.e.
<span id="ref-2"></span>  (print x))</span>
                  ^^^^^^^
The first closing tag is really the end of the comment, which has
spilled onto the next line, but it erroneously closes the id span. The
color of the comment then continues to the end of the second line, where
the id span was meant to close.
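
To see the collision in isolation, here is a minimal Emacs Lisp sketch.
It is not org-mode's actual code: the names sketch-wrap-line-with-ref
and sketch-htmlized are made up for illustration, and the exporter is
only assumed to post-process the htmlize output line by line in a
similar way.

```emacs-lisp
;; Hypothetical helper, not part of org-mode or htmlize: wrap one line
;; of already htmlized text in an id span, as a line-oriented
;; post-processor might do.
(defun sketch-wrap-line-with-ref (html line-number ref-id)
  "Wrap line LINE-NUMBER (1-based) of HTML in a span with id REF-ID."
  (let ((lines (split-string html "\n"))
        (n 0))
    (mapconcat
     (lambda (line)
       (setq n (1+ n))
       (if (= n line-number)
           (concat "<span id=\"" ref-id "\">" line "</span>")
         line))
     lines "\n")))

;; Text shaped like the htmlize output quoted above: the comment span
;; closes only after the newline.
(defvar sketch-htmlized
  "(let ((x 42)) <span style=\"comment\">; meaning of l.u.e.\n</span>  (print x))")

;; Wrapping line 2 puts the id span's opening tag in front of the
;; comment's closing tag, so the two spans no longer nest properly.
(sketch-wrap-line-with-ref sketch-htmlized 2 "ref-2")
```

Evaluating the last form should reproduce the two lines of markup shown
above, with the comment's closing tag directly after the opening id tag.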

To remedy this, I wrote a patch which postpones writing the newline to
the html buffer until after the closing tag has been emitted. The patch
is attached and should be applicable to the current Git repository.

It should be applicable to version 1.37 of htmlize.el as well, with the 
command `patch -p3 < 0001-Markup-on-same-line-as-text.patch`.

I refactored the insert-text functions so that they return the markup
that should be applied instead of doing the insertion themselves. The
markup then goes through a new function, add-markup, which puts the tags
around the text and moves any trailing newline in the text to the very
end, before the main htmlize-buffer-1 does the actual insertion in the
buffer.
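
As a rough, self-contained sketch of that idea (the sketch- names are
placeholders chosen for this example; the attached patch defines its own
versions inside htmlize.el):

```emacs-lisp
;; Placeholder names for illustration only.
(defun sketch-split-trailing-newline (text)
  "Return (BODY . NEWLINE), where NEWLINE is a newline string or empty."
  (if (and (> (length text) 0)
           (eq (aref text (1- (length text))) ?\n))
      (cons (substring text 0 -1) "\n")
    (cons text "")))

(defun sketch-add-markup (markup text)
  "Wrap TEXT in MARKUP, a cons (OPEN . CLOSE).
A single trailing newline in TEXT is kept outside the closing tag."
  (let ((parts (sketch-split-trailing-newline text)))
    (concat (car markup) (car parts) (cdr markup) (cdr parts))))

;; The closing tag now comes before the newline:
(sketch-add-markup '("<span class=\"comment\">" . "</span>")
                   "; meaning of l.u.e.\n")
;; => "<span class=\"comment\">; meaning of l.u.e.</span>\n"
```

Only one trailing newline is moved; text without a trailing newline is
wrapped unchanged.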

I have tested this with all three kinds of htmlize-output-type, and it 
seems to give the expected result.

-- 
  Roland.

[-- Attachment #2: 0001-Markup-on-same-line-as-text.patch --]
[-- Type: text/plain, Size: 7232 bytes --]

From 86f1508f58dd304471d768481944d34e220e24f1 Mon Sep 17 00:00:00 2001
From: Roland Kaufmann <rlndkfmn+orgmode@gmail.com>
Date: Thu, 6 Jan 2011 11:22:49 +0100
Subject: [PATCH] Markup on same line as text
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------1.7.3.1.msysgit.0"

This is a multi-part message in MIME format.
--------------1.7.3.1.msysgit.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


* contrib/lisp/htmlize.el: Markup on same line as text

Newline was considered to be a part of the text to be marked up; thus
the closing tag was put on the next line. Since org-mode does line
processing on the result, the line number span was closed prematurely
by this tag and the formatting erroneously extended through the next
line.

This fix replaces the insert-text functions with new get-markup variants
which return a pair (start-tag end-tag) for the appropriate formatting.
The newline is then removed from the text with `split-trailing-newline`
and appended again after the close tag in `add-markup`.

The text is sent to the buffer after this processing instead of in each
behavioral insert-text function. The names of the functions are changed
to reflect that the signatures are different.
---
 contrib/lisp/htmlize.el |   67 ++++++++++++++++++++++++++---------------------
 1 files changed, 37 insertions(+), 30 deletions(-)


--------------1.7.3.1.msysgit.0
Content-Type: text/x-patch; name="0001-Markup-on-same-line-as-text.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="0001-Markup-on-same-line-as-text.patch"

diff --git a/contrib/lisp/htmlize.el b/contrib/lisp/htmlize.el
index 5f4cb5b..f952b80 100644
--- a/contrib/lisp/htmlize.el
+++ b/contrib/lisp/htmlize.el
@@ -1209,7 +1209,7 @@ property and by buffer overlays that specify `face'."
 ;; `htmlize-buffer-1' calls a number of "methods", which indirect to
 ;; the functions that depend on `htmlize-output-type'.  The currently
 ;; used methods are `doctype', `insert-head', `body-tag', and
-;; `insert-text'.  Not all output types define all methods.
+;; `get-markup'.  Not all output types define all methods.
 ;;
 ;; Methods are called either with (htmlize-method METHOD ARGS...) 
 ;; special form, or by accessing the function with
@@ -1347,18 +1347,18 @@ it's called with the same value of KEY.  All other times, the cached
   (insert htmlize-hyperlink-style
 	  "    -->\n    </style>\n"))
 
-(defun htmlize-css-insert-text (text fstruct-list buffer)
-  ;; Insert TEXT colored with FACES into BUFFER.  In CSS mode, this is
-  ;; easy: just nest the text in one <span class=...> tag for each
-  ;; face in FSTRUCT-LIST.
-  (dolist (fstruct fstruct-list)
-    (princ "<span class=\"" buffer)
-    (princ (htmlize-fstruct-css-name fstruct) buffer)
-    (princ "\">" buffer))
-  (princ text buffer)
-  (dolist (fstruct fstruct-list)
-    (ignore fstruct)			; shut up the byte-compiler
-    (princ "</span>" buffer)))
+(defun htmlize-css-get-markup (fstruct-list)
+  ;; Get markup for FACES. In CSS mode, this is easy; just create one
+  ;; <span class=...> tag for each face in FSTRUCT-LIST.
+  (cons
+   (mapconcat
+	(lambda (fs) (concat "<span class=\""
+						 (htmlize-fstruct-css-name fs)
+						 "\">"))
+	fstruct-list "")
+   (mapconcat
+	(lambda (fs) (ignore fs) "</span>")
+	fstruct-list "")))
 \f
 ;; `inline-css' output support.
 
@@ -1367,20 +1367,16 @@ it's called with the same value of KEY.  All other times, the cached
 	  (mapconcat #'identity (htmlize-css-specs (gethash 'default face-map))
 		     " ")))
 
-(defun htmlize-inline-css-insert-text (text fstruct-list buffer)
+(defun htmlize-inline-css-get-markup (fstruct-list)
   (let* ((merged (htmlize-merge-faces fstruct-list))
 	 (style (htmlize-memoize
 		 merged
 		 (let ((specs (htmlize-css-specs merged)))
 		   (and specs
 			(mapconcat #'identity (htmlize-css-specs merged) " "))))))
-    (when style
-      (princ "<span style=\"" buffer)
-      (princ style buffer)
-      (princ "\">" buffer))
-    (princ text buffer)
-    (when style
-      (princ "</span>" buffer))))
+	(if style
+		(cons (concat "<span style=\"" style "\">") "</span>")
+	  (cons "" ""))))
 \f
 ;;; `font' tag based output support.
 
@@ -1390,12 +1386,12 @@ it's called with the same value of KEY.  All other times, the cached
 	    (htmlize-fstruct-foreground fstruct)
 	    (htmlize-fstruct-background fstruct))))
        
-(defun htmlize-font-insert-text (text fstruct-list buffer)
+(defun htmlize-font-get-markup (fstruct-list)
   ;; In `font' mode, we use the traditional HTML means of altering
   ;; presentation: <font> tag for colors, <b> for bold, <u> for
   ;; underline, and <strike> for strike-through.
-  (let* ((merged (htmlize-merge-faces fstruct-list))
-	 (markup (htmlize-memoize
+  (let ((merged (htmlize-merge-faces fstruct-list)))
+	 (htmlize-memoize
 		  merged
 		  (cons (concat
 			 (and (htmlize-fstruct-foreground merged)
@@ -1410,10 +1406,19 @@ it's called with the same value of KEY.  All other times, the cached
 			 (and (htmlize-fstruct-italicp merged)    "</i>")
 			 (and (htmlize-fstruct-boldp merged)      "</b>")
 			 (and (htmlize-fstruct-foreground merged) "</font>"))))))
-    (princ (car markup) buffer)
-    (princ text buffer)
-    (princ (cdr markup) buffer)))
 \f
+(defun split-trailing-newline (text)
+  "Split a trailing newline from the text"
+  (let ((length-minus-one (- (length text) 1)))
+	(if (and (>= length-minus-one 0) (char-equal (aref text length-minus-one) ?\n))
+		(cons (substring text 0 length-minus-one) "\n")
+	  (cons text ""))))
+
+(defun add-markup (markup text)
+  "Interpose head and tail of markup on each side of text"
+  (let ((text-and-newline (split-trailing-newline text)))
+	(concat (car markup) (car text-and-newline) (cdr markup) (cdr text-and-newline))))
+
 (defun htmlize-buffer-1 ()
   ;; Internal function; don't call it from outside this file.  Htmlize
   ;; current buffer, writing the resulting HTML to a new buffer, and
@@ -1468,11 +1473,11 @@ it's called with the same value of KEY.  All other times, the cached
 		"\n    ")
 	(plist-put places 'content-start (point-marker))
 	(insert "<pre>\n"))
-      (let ((insert-text-method
+      (let ((get-markup-method
 	     ;; Get the inserter method, so we can funcall it inside
 	     ;; the loop.  Not calling `htmlize-method' in the loop
 	     ;; body yields a measurable speed increase.
-	     (htmlize-method-function 'insert-text))
+	     (htmlize-method-function 'get-markup))
 	    ;; Declare variables used in loop body outside the loop
 	    ;; because it's faster to establish `let' bindings only
 	    ;; once.
@@ -1510,7 +1515,9 @@ it's called with the same value of KEY.  All other times, the cached
 	  (when (> (length text) 0)
 	    ;; Insert the text, along with the necessary markup to
 	    ;; represent faces in FSTRUCT-LIST.
-	    (funcall insert-text-method text fstruct-list htmlbuf))
+		(let* ((markup (funcall get-markup-method fstruct-list))
+			   (markedup-text (add-markup markup text)))
+		  (princ markedup-text htmlbuf)))
 	  (goto-char next-change)))
 
       ;; Insert the epilog and post-process the buffer.

--------------1.7.3.1.msysgit.0--



[-- Attachment #3: Type: text/plain, Size: 201 bytes --]

_______________________________________________
Emacs-orgmode mailing list
Please use `Reply All' to send replies to the list.
Emacs-orgmode@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-orgmode


* Re: [PATCH] Markup on same line as text
  2011-01-06 22:12 [PATCH] Markup on same line as text Roland Kaufmann
@ 2011-02-16 20:04 ` Hrvoje Niksic
  2011-02-16 21:53   ` Roland Kaufmann
  2011-02-17 21:58 ` Hrvoje Niksic
  1 sibling, 1 reply; 8+ messages in thread
From: Hrvoje Niksic @ 2011-02-16 20:04 UTC (permalink / raw)
  To: Roland Kaufmann; +Cc: emacs-orgmode

Sorry for taking a very long time to respond.

Roland Kaufmann <roland.kaufmann@gmail.com> writes:
> (let ((x 42)) <span style="comment">; meaning of l.u.e.
> <span id="ref-2"></span>  (print x))</span>
>                  ^^^^^^^
> The first closing tag is really the end of the comment which is
> spilled to the next line, but it erroneously closes the id span.

If so, that would be a bug in htmlize.  While Emacs supports arbitrarily
overlapping properties/overlays/extents, HTML doesn't, and htmlize is
normally careful to describe each unchanged run of text on its own.

I am not familiar with org-mode, so I will need a description of exactly
how to reproduce this bug.  Specifically I don't know how to put a
reference on the next line.

Your patch may work in this particular case, but the idea behind htmlize
is to describe the state of the buffer.  If a property ends after the
newline, it is intended that the generated HTML reflect this.

Hrvoje


* Re: [PATCH] Markup on same line as text
  2011-02-16 20:04 ` Hrvoje Niksic
@ 2011-02-16 21:53   ` Roland Kaufmann
  0 siblings, 0 replies; 8+ messages in thread
From: Roland Kaufmann @ 2011-02-16 21:53 UTC (permalink / raw)
  To: Hrvoje Niksic; +Cc: emacs-orgmode

> Your patch may work in this particular case, but the idea behind
> htmlize is to describe the state of the buffer.  If a property ends
> after the newline, it is intended that the generated HTML reflect this.

The philosophical question is then: Is the newline character part of the 
syntax construct that is being fontified, or rather a "formatting code" 
that should be kept separate?

Being whitespace it (mostly) doesn't matter visually, which makes it an 
easy choice to include in tokens to preserve formatting continuity 
between lines.

However, any further line processing by other modules is complicated 
significantly if the terminator is put inside the markup.

> I am not familiar with org-mode, so I will need a description of
> exactly how to reproduce this bug.  Specifically I don't know how to
> put a reference on the next line.

This Elisp will create (or overwrite) a file called foo.org in the
temporary directory, fill it with the problematic code, and export it to
foo.html:

(let ((filename (expand-file-name "foo.org" temporary-file-directory)))
   (switch-to-buffer (find-file-noselect filename))
   (erase-buffer)
   (insert "*
#+BEGIN_SRC emacs-lisp
(let ((x 42)) ; meaning of l.u.e.
   (print x))  ; (ref:2)
#+END_SRC")
   (save-buffer)
   (org-mode)
   (org-export-as-html nil))

-- 
    Roland.


* Re: [PATCH] Markup on same line as text
  2011-01-06 22:12 [PATCH] Markup on same line as text Roland Kaufmann
  2011-02-16 20:04 ` Hrvoje Niksic
@ 2011-02-17 21:58 ` Hrvoje Niksic
  2011-02-19 14:26   ` Roland Kaufmann
  1 sibling, 1 reply; 8+ messages in thread
From: Hrvoje Niksic @ 2011-02-17 21:58 UTC (permalink / raw)
  To: Roland Kaufmann; +Cc: hniksic, emacs-orgmode

[ Please include me in the replies, as I'm not subscribed to emacs-orgmode. ]

> > Your patch may work in this particular case, but the idea behind
> > htmlize is to describe the state of the buffer.  If a property ends
> > after the newline, it is intended that the generated HTML reflect this.
> 
> The philosophical question is then: Is the newline character part of the 
> syntax construct that is being fontified, or rather a "formatting code" 
> that should be kept separate?

htmlize doesn't operate on the level of syntax-based fontification, it
examines the display-related properties attached to buffer text (not
necessarily by font-lock) and renders them into the corresponding HTML.

If a display property includes the newline character, that will be
reflected in the HTML.  This works fine for displaying in a browser, but
confuses org-mode's post-processing of HTML, which (if my understanding
is correct) assumes that spans will be closed before the newline.  This
assumption is wrong in the case you present.
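
One way to check whether the newline really carries a face is sketched
below. The helper name is hypothetical, and the sketch assumes an Emacs
that provides font-lock-ensure (25.1 or later); on older versions
font-lock-fontify-buffer would play the same role.

```emacs-lisp
;; Hypothetical helper for inspection; not part of htmlize or org-mode.
(defun sketch-face-at-first-newline (code mode)
  "Fontify CODE using major MODE; return the `face' of its first newline."
  (with-temp-buffer
    (insert code)
    (funcall mode)
    (font-lock-ensure)                 ; assumes Emacs 25.1 or later
    (goto-char (point-min))
    (end-of-line)                      ; point is now on the newline
    (get-text-property (point) 'face)))

(sketch-face-at-first-newline
 "(let ((x 42)) ; meaning of l.u.e.\n   (print x))\n"
 'emacs-lisp-mode)
```

If this returns a comment face rather than nil, the face extends over
the newline, and htmlize will close the span after it.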

However, using htmlize-before-hook, it is trivial to make the assumption
correct by resetting the property.  Here is your example, modified to do
so:

(let ((filename (expand-file-name "foo.org" temporary-file-directory)))
   (switch-to-buffer (find-file-noselect filename))
   (erase-buffer)
   (insert "*
#+BEGIN_SRC emacs-lisp
(let ((x 42)) ; meaning of l.u.e.
   (print x))  ; (ref:2)
#+END_SRC")
   (save-buffer)
   (org-mode)
   (let ((htmlize-before-hook htmlize-before-hook))
     (add-hook 'htmlize-before-hook
               (lambda ()
                 (goto-char (point-min))
                 (while (progn (end-of-line) (not (eobp)))
                   (put-text-property (point) (1+ (point)) 'face nil)
                   (forward-char 1))))
     (org-export-as-html nil)))


* Re: [PATCH] Markup on same line as text
  2011-02-17 21:58 ` Hrvoje Niksic
@ 2011-02-19 14:26   ` Roland Kaufmann
  2011-02-19 23:49     ` Hrvoje Niksic
  2011-02-26  9:39     ` Bastien
  0 siblings, 2 replies; 8+ messages in thread
From: Roland Kaufmann @ 2011-02-19 14:26 UTC (permalink / raw)
  To: Hrvoje Niksic; +Cc: emacs-orgmode

> htmlize doesn't operate on the level of syntax-based fontification, it
> examines the display-related properties attached to buffer text (not
> necessarily by font-lock) and renders them into the corresponding HTML.

Good point.

And, as you point out, it is probably better to deal with the problem by 
removing the formatting on the newlines (probably right after 
font-lock-fontify-buffer in org-export-format-source-code-or-example) in 
the temporary buffer that is htmlize'd, reducing the chance of any 
unintended consequences.
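
In isolation, the idea could look like the sketch below. The function
names are placeholders for this example, the loop is the one from the
hook shown earlier in the thread, and the face is applied by hand so
that the result does not depend on font-lock.

```emacs-lisp
;; Placeholder name; same loop as in the htmlize-before-hook example.
(defun sketch-clear-face-on-newlines (beg end)
  "Set the `face' property to nil on each newline between BEG and END."
  (save-excursion
    (goto-char beg)
    (while (progn (end-of-line) (< (point) end))
      (put-text-property (point) (1+ (point)) 'face nil)
      (forward-char 1))))

(defun sketch-newline-faces-after-clearing ()
  "Return the faces of the first character and of the first newline."
  (with-temp-buffer
    (insert "; comment\n(print x)\n")
    ;; Pretend the whole buffer was fontified as a comment.
    (put-text-property (point-min) (point-max)
                       'face 'font-lock-comment-face)
    (sketch-clear-face-on-newlines (point-min) (point-max))
    (list (get-text-property 1 'face)     ; the leading ";"
          (get-text-property 10 'face)))) ; the first newline

(sketch-newline-faces-after-clearing)
;; => (font-lock-comment-face nil)
```

The text keeps its face while the newline loses it, so a span generated
from the face would end before the line break.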

I'll try this approach and see how it turns out (i.e. proposal for 
change in htmlize dropped). Thank you for the feedback!

-- 
    Roland.


* Re: [PATCH] Markup on same line as text
  2011-02-19 14:26   ` Roland Kaufmann
@ 2011-02-19 23:49     ` Hrvoje Niksic
  2011-02-26  9:39     ` Bastien
  1 sibling, 0 replies; 8+ messages in thread
From: Hrvoje Niksic @ 2011-02-19 23:49 UTC (permalink / raw)
  To: Roland Kaufmann; +Cc: emacs-orgmode

Roland Kaufmann <roland.kaufmann@gmail.com> writes:

>> htmlize doesn't operate on the level of syntax-based fontification, it
>> examines the display-related properties attached to buffer text (not
>> necessarily by font-lock) and renders them into the corresponding HTML.
>
> Good point.
>
> And, as you point out, it is probably better to deal with the problem
> by removing the formatting on the newlines (probably right after
> font-lock-fontify-buffer in org-export-format-source-code-or-example)
> in the temporary buffer that is htmlize'd, reducing the chance of any
> unintended consequences.

Yes, tweaking the properties seems preferable to tweaking htmlize itself
for this particular case.

In the future I'd like htmlize to be more extensible about converting
buffer contents to HTML.  Currently it only examines the `face' and
`invisible' properties that Emacs itself uses for display.  But
additional properties could be defined, which htmlize would use to
discern hyperlinks, or to add line highlight markup, to override choice
of style sheet, etc.

> I'll try this approach and see how it turns out (i.e. proposal for
> change in htmlize dropped). Thank you for the feedback!

Thanks for sticking with this.  Please let me know how it works for you.


* Re: Re: [PATCH] Markup on same line as text
  2011-02-19 14:26   ` Roland Kaufmann
  2011-02-19 23:49     ` Hrvoje Niksic
@ 2011-02-26  9:39     ` Bastien
  2011-02-27 21:09       ` Roland Kaufmann
  1 sibling, 1 reply; 8+ messages in thread
From: Bastien @ 2011-02-26  9:39 UTC (permalink / raw)
  To: Roland Kaufmann; +Cc: emacs-orgmode, Hrvoje Niksic

Hi Roland,

Roland Kaufmann <roland.kaufmann@gmail.com> writes:

> And, as you point out, it is probably better to deal with the problem by
> removing the formatting on the newlines (probably right after
> font-lock-fontify-buffer in org-export-format-source-code-or-example) in
> the temporary buffer that is htmlize'd, reducing the chance of any
> unintended consequences.
>
> I'll try this approach and see how it turns out (i.e. proposal for change
> in htmlize dropped). Thank you for the feedback!

Please let us know if there is a useful way to generalize the workaround
presented earlier in the thread.

Thanks,

-- 
 Bastien


* Re: [PATCH] Markup on same line as text
  2011-02-26  9:39     ` Bastien
@ 2011-02-27 21:09       ` Roland Kaufmann
  0 siblings, 0 replies; 8+ messages in thread
From: Roland Kaufmann @ 2011-02-27 21:09 UTC (permalink / raw)
  To: emacs-orgmode; +Cc: Hrvoje Niksic

[-- Attachment #1: Type: text/plain, Size: 470 bytes --]

Sorry for taking so long to come back to this; I had some unrelated
problems with my system.

> Please let us know if there is a useful way to generalize the workaround
> presented earlier in the thread.

Attached is a patch against the latest git tree for org-mode which I
believe does the trick. (This supersedes the previous patch against
htmlize.el.)

Thanks to Hrvoje for suggesting this approach and providing the code
to reset the face attribute.

-- 
   Roland.

[-- Attachment #2: 0001-Markup-on-same-line-as-text.patch --]
[-- Type: application/octet-stream, Size: 1681 bytes --]

From 98e2a586eb0e911ec6b5bedeec4af5f00ee2bf6c Mon Sep 17 00:00:00 2001
From: Roland Kaufmann <rlndkfmn+orgmode@gmail.com>
Date: Sun, 27 Feb 2011 20:52:31 +0100
Subject: [PATCH] Markup on same line as text

* org-exp.el (org-export-format-source-code-or-example): fontify one
  line at a time to avoid partial overlap between fontification and
  reference markup.
---
 lisp/org-exp.el |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/lisp/org-exp.el b/lisp/org-exp.el
index 9a35b00..dbcf105 100644
--- a/lisp/org-exp.el
+++ b/lisp/org-exp.el
@@ -2375,6 +2375,15 @@ in the list) and remove property and value from the list in LISTVAR."
 (defvar org-export-latex-listings-options) ;; defined in org-latex.el
 (defvar org-export-latex-minted-options) ;; defined in org-latex.el
 
+(defun org-remove-formatting-on-newlines-in-region (beg end)
+  "Remove formatting on newline characters"
+  (interactive "r")
+  (save-excursion
+    (goto-char beg)
+    (while (progn (end-of-line) (< (point) end))
+      (put-text-property (point) (1+ (point)) 'face nil)
+      (forward-char 1))))
+
 (defun org-export-format-source-code-or-example
   (backend lang code &optional opts indent caption)
   "Format CODE from language LANG and return it formatted for export.
@@ -2461,6 +2470,8 @@ INDENT was the original indentation of the block."
 				(funcall mode)
 			      (fundamental-mode))
 			    (font-lock-fontify-buffer)
+			    ;; markup each line separately
+			    (org-remove-formatting-on-newlines-in-region (point-min) (point-max))
 			    (org-src-mode)
 			    (set-buffer-modified-p nil)
 			    (org-export-htmlize-region-for-paste
-- 
1.7.1



end of thread, other threads:[~2011-02-27 21:10 UTC | newest]
