emacs-orgmode@gnu.org archives
* Bug: HTML Export doesn't handle internal link with spaces [8.0.7 (8.0.7-6-g13cb28-elpa @ /home/jbalint/.emacs.d/elpa/org-20130812/)]
@ 2013-12-16 20:48 Jess Balint
  2013-12-20 21:47 ` Nicolas Goaziou
  0 siblings, 1 reply; 9+ messages in thread
From: Jess Balint @ 2013-12-16 20:48 UTC (permalink / raw)
  To: emacs-orgmode


I've generated a link to a headline with `org-store-link' and
`org-insert-link'. It's rendered into the Org file like so:

  [[*Headline%20with%20Spaces][Headline with Spaces]]

The HTML export renders this as an italic string (the fallback in the
last `cond' clause of the `org-html-link' function).

The HTML exporter gets a "fuzzy" link which is handled in this block:

     ;; Links pointing to a headline: Find destination and build
     ;; appropriate referencing command.
     ((member type '("custom-id" "fuzzy" "id"))
      (let ((destination (if (string= type "fuzzy")
			     (org-export-resolve-fuzzy-link link info)
			   (org-export-resolve-id-link link info))))

The problem is in `org-export-resolve-fuzzy-link', which gets the path
directly from the link:

  (let* ((raw-path (org-element-property :path link))

But at this point it has "%20" in it which causes a problem when
splitting it:

     ;; Split PATH at white spaces so matches are space
     ;; insensitive.
     (path (org-split-string
        (if match-title-p (substring raw-path 1) raw-path)))

This does nothing because there are no spaces in the string (they are
%20 here). The search for a headline matching this fails. I can solve
this here by doing:

     ;; Split PATH at white spaces so matches are space
     ;; insensitive.
     (path (org-split-string
            (replace-regexp-in-string
             "%20" " "
             (if match-title-p (substring raw-path 1) raw-path))))
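
A slightly more general variant of the same idea is sketched below. It
decodes every percent-escape with `url-unhex-string' (from url-util)
instead of handling only "%20", then splits at whitespace. The helper
name `my/split-fuzzy-path' is hypothetical and not part of Org, and the
built-in `split-string' stands in for `org-split-string'.

```elisp
(require 'url-util)  ; provides `url-unhex-string'

(defun my/split-fuzzy-path (raw-path)
  "Hypothetical helper: decode RAW-PATH, then split it at whitespace.
A leading \"*\" (headline search) is dropped before decoding."
  (let ((path (if (string-prefix-p "*" raw-path)
                  (substring raw-path 1)
                raw-path)))
    ;; Decode %XX sequences first so that escaped spaces become real
    ;; spaces, then split and drop empty strings.
    (split-string (url-unhex-string path) "[ \t\n]+" t)))
```

With this, an escaped path and its unescaped form split into the same
list of words.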

I'm not sure it's the proper solution, though. Thanks.
Jess

Emacs  : GNU Emacs 24.3.1 (x86_64-unknown-linux-gnu, GTK+ Version 3.8.2)
 of 2013-08-06 on -mnt-storage-buildroots-staging-x86_64-eric
Package: Org-mode version 8.0.7 (8.0.7-6-g13cb28-elpa @ /home/jbalint/.emacs.d/elpa/org-20130812/)

current state:
==============
(setq
 org-tab-first-hook '(org-hide-block-toggle-maybe
					  org-src-native-tab-command-maybe
					  org-babel-hide-result-toggle-maybe
					  org-babel-header-arg-expand)
 org-speed-command-hook '(org-speed-command-default-hook
						  org-babel-speed-command-hook)
 org-occur-hook '(org-first-headline-recenter)
 org-metaup-hook '(org-babel-load-in-session-maybe)
 org-log-done 'time
 org-confirm-shell-link-function 'yes-or-no-p
 org-latex-format-headline-function 'org-latex-format-headline-default-function
 org-default-notes-file "~/Dropbox/important/org/notes.org"
 org-after-todo-state-change-hook '(org-clock-out-if-current)
 org-src-mode-hook '(org-src-babel-configure-edit-buffer
					 org-src-mode-configure-edit-buffer)
 org-agenda-before-write-hook '(org-agenda-add-entry-text)
 org-babel-pre-tangle-hook '(save-buffer)
 org-mode-hook '(#[nil "\300\301\302\303\304$\207"
				   [org-add-hook change-major-mode-hook
					org-show-block-all append local]
				   5]
				 #[nil "\300\301\302\303\304$\207"
				   [org-add-hook change-major-mode-hook
					org-babel-show-result-all append local]
				   5]
				 org-babel-result-hide-spec org-babel-hide-all-hashes
				 (lambda nil (local-set-key "\x03a" (quote org-agenda))
				  (local-unset-key "\x03\x0f")
				  (local-set-key "\x03\x0f"
				   (function
					(lambda nil (interactive)
					 (if (oracle-elem-open) nil (org-open-at-point)))
					)
				   )
				  (org-babel-do-load-languages
				   (quote org-babel-load-languages) (quote ((ditaa . t))))
				  )
				 )
 org-ctrl-c-ctrl-c-hook '(org-babel-hash-at-point
						  org-babel-execute-safely-maybe)
 org-directory "~/Dropbox/important/org"
 org-cycle-hook '(org-cycle-hide-archived-subtrees org-cycle-hide-drawers
				  org-cycle-hide-inline-tasks org-cycle-show-empty-lines
				  org-optimize-window-after-visibility-change)
 org-todo-keywords '((sequence "TODO" "|" "DONE")
					 (sequence "|" "WAITING") (sequence "|" "CANCELED"))
 org-confirm-elisp-link-function 'yes-or-no-p
 org-metadown-hook '(org-babel-pop-to-session-maybe)
 org-babel-load-languages '((ditaa . t))
 org-agenda-files '("~/Dropbox/important/org"
					"~/Dropbox/important/org/oracle_work_log"
					"~/Dropbox/important/org/notes")
 org-clock-out-hook '(org-clock-remove-empty-clock-drawer)
 org-confirm-babel-evaluate 'my-org-confirm-babel-evaluate
 org-src-fontify-natively t
 )


* Re: Bug: HTML Export doesn't handle internal link with spaces [8.0.7 (8.0.7-6-g13cb28-elpa @ /home/jbalint/.emacs.d/elpa/org-20130812/)]
  2013-12-16 20:48 Bug: HTML Export doesn't handle internal link with spaces [8.0.7 (8.0.7-6-g13cb28-elpa @ /home/jbalint/.emacs.d/elpa/org-20130812/)] Jess Balint
@ 2013-12-20 21:47 ` Nicolas Goaziou
  2014-01-03 14:48   ` Bastien
  0 siblings, 1 reply; 9+ messages in thread
From: Nicolas Goaziou @ 2013-12-20 21:47 UTC (permalink / raw)
  To: Jess Balint; +Cc: emacs-orgmode

Hello,

Jess Balint <jbalint@gmail.com> writes:

> I've generated a link to a headline with `org-store-link' and
> `org-insert-link'. It's rendered into the Org file like so:
>
>   [[*Headline%20with%20Spaces][Headline with Spaces]]
>

[...]

> The problem is in `org-export-resolve-fuzzy-link', which gets the path
> directly from the link:
>
>   (let* ((raw-path (org-element-property :path link))
>
> But at this point it has "%20" in it which causes a problem when
> splitting it:
>
>      ;; Split PATH at white spaces so matches are space
>      ;; insensitive.
>      (path (org-split-string
>         (if match-title-p (substring raw-path 1) raw-path)))
>

Thank you for the report.

This bug exists because `org-insert-link' blindly url-hexifies links,
but nothing will unhexify it before it reaches an export back-end.

This is difficult to solve, because if you unhexify it, the very same
bug will occur on the other side (i.e. links you paste without using
`org-insert-link', which you don't want to unhexify).
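
The ambiguity can be seen in a small sketch using `url-hexify-string'
and `url-unhex-string' from url-util. The variable names and the second
headline title are made up for illustration; `url-hexify-string' only
stands in for whatever escaping `org-insert-link' applies.

```elisp
(require 'url-util)  ; `url-hexify-string' and `url-unhex-string'

;; Round trip for a title that was escaped when the link was inserted.
(defvar my/escaped (url-hexify-string "Headline with Spaces")
  "Hypothetical example: a title as it might look after escaping.")
(defvar my/decoded (url-unhex-string my/escaped)
  "Decoding restores the original title.")

;; A path typed by hand that contains a literal percent sequence is
;; altered by the same decoding step, so it no longer names the
;; intended target.
(defvar my/typed-by-hand "Escaping%20explained"
  "Hypothetical headline title containing a literal percent sequence.")
(defvar my/wrongly-decoded (url-unhex-string my/typed-by-hand)
  "The literal sequence has been turned into a space.")
```

Nothing in the link itself says which of the two cases applies.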

IMO, `org-insert-link' shouldn't hexify links in all situations (if at
all).

Anyway, I can't think of any satisfactory solution at the moment.


Regards,

-- 
Nicolas Goaziou


* Re: Bug: HTML Export doesn't handle internal link with spaces [8.0.7 (8.0.7-6-g13cb28-elpa @ /home/jbalint/.emacs.d/elpa/org-20130812/)]
  2013-12-20 21:47 ` Nicolas Goaziou
@ 2014-01-03 14:48   ` Bastien
  2014-01-03 15:05     ` Nicolas Goaziou
  0 siblings, 1 reply; 9+ messages in thread
From: Bastien @ 2014-01-03 14:48 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode, Jess Balint

[-- Attachment #1: Type: text/plain, Size: 485 bytes --]

Nicolas Goaziou <n.goaziou@gmail.com> writes:

> IMO, `org-insert-link' shouldn't hexify links in all situations (if at
> all).

Agreed.

We can first narrow the set of url-hexified links to those matching
`org-link-types-re' (http://, ftp://, etc.).

> Anyway, I can't think of any satisfactory solution at the moment.

See patch.

Of course, there will still be some false positives, because
`org-link-types-re' comprises custom link types, but this is
a step in the right direction IMO.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: org-don't-url-hexify-all-links.patch --]
[-- Type: text/x-diff, Size: 720 bytes --]

diff --git a/lisp/org.el b/lisp/org.el
index f7a038e..20e6e33 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -9793,11 +9793,10 @@ according to FMT (default from `org-email-link-description-format')."
 	     (not (equal link (org-link-escape link))))
     (setq description (org-extract-attributes link)))
   (setq link
-	(cond ((string-match (org-image-file-name-regexp) link) link)
-	      ((string-match org-link-types-re link)
+	(cond ((string-match org-link-types-re link)
 	       (concat (match-string 1 link)
 		       (org-link-escape (substring link (match-end 1)))))
-	      (t (org-link-escape link))))
+	      (t link)))
   (concat "[[" link "]"
 	  (if description (concat "[" description "]") "")
 	  "]"))

[-- Attachment #3: Type: text/plain, Size: 14 bytes --]


-- 
 Bastien


* Re: Bug: HTML Export doesn't handle internal link with spaces [8.0.7 (8.0.7-6-g13cb28-elpa @ /home/jbalint/.emacs.d/elpa/org-20130812/)]
  2014-01-03 14:48   ` Bastien
@ 2014-01-03 15:05     ` Nicolas Goaziou
  2014-01-04 14:35       ` Bastien
  0 siblings, 1 reply; 9+ messages in thread
From: Nicolas Goaziou @ 2014-01-03 15:05 UTC (permalink / raw)
  To: Bastien; +Cc: Jess Balint, emacs-orgmode

Hello,

Bastien <bzg@gnu.org> writes:

> See patch.
>
> Of course, there will still be some false positives, because
> `org-link-types-re' comprises custom link types, but this is
> a step in the right direction IMO.

Also, I think there will be a problem if an internal link contains
brackets. E.g., how would one create an internal link to the following
headline?

 * Some [headline]

Anyway, I agree this is a step in the right direction.


Regards,

-- 
Nicolas Goaziou


* Re: Bug: HTML Export doesn't handle internal link with spaces [8.0.7 (8.0.7-6-g13cb28-elpa @ /home/jbalint/.emacs.d/elpa/org-20130812/)]
  2014-01-03 15:05     ` Nicolas Goaziou
@ 2014-01-04 14:35       ` Bastien
  2014-01-05  0:03         ` Nicolas Goaziou
  0 siblings, 1 reply; 9+ messages in thread
From: Bastien @ 2014-01-04 14:35 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode, Jess Balint

Hi Nicolas,

Nicolas Goaziou <n.goaziou@gmail.com> writes:

> Also, I think there will be a problem if an internal link contains
> brackets. E.g, how to create an internal link to the following headline?
>
>  * Some [headline]

Maybe we can just escape square brackets for such internal links.
At least this would solve the OP problem.

Let me know if you can think of other side effects this change would
have.

-- 
 Bastien


* Re: Bug: HTML Export doesn't handle internal link with spaces [8.0.7 (8.0.7-6-g13cb28-elpa @ /home/jbalint/.emacs.d/elpa/org-20130812/)]
  2014-01-04 14:35       ` Bastien
@ 2014-01-05  0:03         ` Nicolas Goaziou
  2014-01-05  6:45           ` Bastien
  0 siblings, 1 reply; 9+ messages in thread
From: Nicolas Goaziou @ 2014-01-05  0:03 UTC (permalink / raw)
  To: Bastien; +Cc: Jess Balint, emacs-orgmode

Hello,

Bastien <bzg@gnu.org> writes:

> Nicolas Goaziou <n.goaziou@gmail.com> writes:
>
>> Also, I think there will be a problem if an internal link contains
>> brackets. E.g, how to create an internal link to the following headline?
>>
>>  * Some [headline]
>
> Maybe we can just escape square brackets for such internal links.
> At least this would solve the OP problem.

I'm not sure I understand. IIUC, one reason for url-hexification is to
avoid forbidden characters in Org links, i.e., Org hexifies links to
escape characters like brackets.
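
To illustrate why brackets are a problem, here is a sketch with a
deliberately simplified link regexp. `my/naive-link-re' is hypothetical
and much cruder than the regexps Org actually uses; it only shows that
a path may not contain a bare closing bracket.

```elisp
(defvar my/naive-link-re "\\[\\[\\([^][]+\\)\\]"
  "Hypothetical, simplified bracket-link regexp; not Org's actual one.")

(defun my/link-path (text)
  "Return the path of the first bracket link in TEXT, or nil."
  (when (string-match my/naive-link-re text)
    (match-string 1 text)))

;; Unescaped brackets in the path: the regexp finds no link at all.
(my/link-path "[[*Some [headline]]]")

;; Hexified brackets: the path is matched in full.
(my/link-path "[[*Some%20%5Bheadline%5D][Some headline]]")
```

The first call returns nil, the second returns the escaped path.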

Do you mean using another escape mechanism? If so, I think the problem
would be the same as before (impossible to know if link at point was
escaped or not in the first place).

Another idea would be to get rid of automatic hexification in all cases,
and have more tolerant regexps for links.


Regards,

-- 
Nicolas Goaziou


* Re: Bug: HTML Export doesn't handle internal link with spaces [8.0.7 (8.0.7-6-g13cb28-elpa @ /home/jbalint/.emacs.d/elpa/org-20130812/)]
  2014-01-05  0:03         ` Nicolas Goaziou
@ 2014-01-05  6:45           ` Bastien
  2014-01-06 11:32             ` Nicolas Goaziou
  0 siblings, 1 reply; 9+ messages in thread
From: Bastien @ 2014-01-05  6:45 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode, Jess Balint

Nicolas Goaziou <n.goaziou@gmail.com> writes:

> I'm not sure I understand. IIUC, one reason for url-hexification is to
> avoid forbidden characters in Org links, i.e., Org hexifies links to
> escape characters like brackets.
>
> Do you mean using another escape mechanism?

No, I mean to url-hexify URLs that correspond to a link type (through
org-link-types) and for internal links, to escape only [ and ], which
are the only two problematic characters (aren't they?)
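
For the internal-link half of that idea, a minimal sketch could look
like this. The name `my/escape-brackets' is hypothetical; it escapes
only [ and ] and leaves spaces and everything else alone.

```elisp
(defun my/escape-brackets (path)
  "Hypothetical helper: percent-encode only square brackets in PATH."
  (replace-regexp-in-string
   "[][]"
   ;; Turn the matched character into its %XX form.
   (lambda (m) (format "%%%02X" (aref m 0)))
   path t t))

;; Spaces survive, so a fuzzy search can still split the path on them.
(my/escape-brackets "Some [headline]")
```

A title without brackets, such as "Headline with Spaces", comes back
unchanged.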

> If so, I think the problem
> would be the same as before (impossible to know if link at point was
> escaped or not in the first place).

Yes...

> Another idea would be to get rid of automatic hexification in all cases,
> and have more tolerant regexps for links.

This seems uncertain.

Or maybe we can just live with the current problem.

-- 
 Bastien


* Re: Bug: HTML Export doesn't handle internal link with spaces [8.0.7 (8.0.7-6-g13cb28-elpa @ /home/jbalint/.emacs.d/elpa/org-20130812/)]
  2014-01-05  6:45           ` Bastien
@ 2014-01-06 11:32             ` Nicolas Goaziou
  2014-01-06 16:36               ` Bastien
  0 siblings, 1 reply; 9+ messages in thread
From: Nicolas Goaziou @ 2014-01-06 11:32 UTC (permalink / raw)
  To: Bastien; +Cc: emacs-orgmode, Jess Balint

Bastien <bzg@gnu.org> writes:

>> Another idea would be to get rid of automatic hexification in all cases,
>> and have more tolerant regexps for links.
>
> This seems uncertain.

How so? Accepting balanced pairs of square brackets is better than no
closing square bracket at all.

> Or maybe we can just live with the current problem.

Probably, yes. But I have the feeling we will encounter the problem again.


Regards,

-- 
Nicolas Goaziou


* Re: Bug: HTML Export doesn't handle internal link with spaces [8.0.7 (8.0.7-6-g13cb28-elpa @ /home/jbalint/.emacs.d/elpa/org-20130812/)]
  2014-01-06 11:32             ` Nicolas Goaziou
@ 2014-01-06 16:36               ` Bastien
  0 siblings, 0 replies; 9+ messages in thread
From: Bastien @ 2014-01-06 16:36 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Jess Balint, emacs-orgmode

Nicolas Goaziou <n.goaziou@gmail.com> writes:

> Bastien <bzg@gnu.org> writes:
>
>>> Another idea would be to get rid of automatic hexification in all cases,
>>> and have more tolerant regexps for links.
>>
>> This seems uncertain.
>
> How so? Accepting balanced pairs of square brackets is better than no
> closing square bracket at all.

Well, I'll let you try this, but my guess is that the cost in terms of
regexp complexity (especially for the "balanced" aspects) will be a
bit too high.
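
For comparison, balanced brackets can also be recognized without a
regexp, by scanning and counting depth. The sketch below is only an
illustration of that alternative, not something proposed in this
thread; `my/balanced-link-path' is a hypothetical name and the function
only looks at a link that starts at the beginning of the string.

```elisp
(defun my/balanced-link-path (s)
  "Return the path of the bracket link starting S, or nil.
Balanced pairs of square brackets are allowed inside the path."
  (when (string-prefix-p "[[" s)
    (let ((i 2) (depth 0) (len (length s)) end)
      (while (and (< i len) (not end))
        (let ((c (aref s i)))
          (cond ((eq c ?\[) (setq depth (1+ depth)))
                ((eq c ?\])
                 ;; A closing bracket at depth 0 ends the path.
                 (if (> depth 0)
                     (setq depth (1- depth))
                   (setq end i)))))
        (setq i (1+ i)))
      (and end (substring s 2 end)))))
```

Applied to "[[*Some [headline]]]", this returns the unescaped path
"*Some [headline]".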

-- 
 Bastien

