* org-protocol and encoding
From: Ulf Stegemann @ 2009-04-06 10:37 UTC
To: emacs-orgmode
org-protocol is really a great extension for org-mode! However, I
experience an annoyance regarding non-ASCII character encoding.
When using org-protocol with remember and Firefox, all non-ASCII
characters get b0rked in the remember buffer (on Linux, with Emacs 23,
org-mode as of today, and the latest Firefox 3.0). It doesn't matter
whether the source page uses HTML entities or literal non-ASCII
characters. Does anyone share this experience and have a suggestion on
where to look for the cause?
Ulf
* Re: org-protocol and encoding
From: Sebastian Rose @ 2009-04-06 11:53 UTC
To: Ulf Stegemann; +Cc: emacs-orgmode
Ulf Stegemann <ulf-news@zeitform.de> writes:
> org-protocol is really a great extension for org-mode! However, I
> experience an annoyance regarding non-ascii character encoding.
>
> When using org-protocol with remember and firefox, all non-ascii
> characters get b0rked in the remember buffer (on linux, with emacs 23
> and org-mode as of today and latest ff 3.0). It doesn't matter if the
> source page uses html entities or literal non-ascii-characters. Does
> anyone share this experience and has a suggestion on where to look for
> the cause?
Yes. Same here.
This seems to be an Emacs/remember problem, though.
If I open a file `xy.txt' and select this text:
lkäüüäüpüpjüpjsf
and then 'C-x r' to remember it, I get this in my remember buffer:
[[file:~/xy.txt::lk%20p%20pj%20pjsf][file:~/xy.txt::lk p pj pjsf]]
Not sure how to work around this yet. It seems to be encoding-related...
Maybe I'll find some time to look into this later today.
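To make the symptom concrete, here is a small sketch, written in Python purely for illustration (the code discussed in this thread is Emacs Lisp). It compares the link text reported above with what percent-encoding the UTF-8 bytes of the same string would give. The regular expression only imitates the observed output; it is an assumption, not a diagnosis of what remember actually does.

```python
import re
from urllib.parse import quote

selected = "lkäüüäüpüpjüpjsf"

# Imitate the reported breakage: every run of non-ASCII characters
# collapses into a single space before the text is percent-encoded.
# (Assumption: this merely reproduces the output shown above.)
observed = quote(re.sub(r"[^\x00-\x7f]+", " ", selected))

# A UTF-8-aware encoder instead emits one %XX escape per UTF-8 byte,
# so each umlaut becomes two escapes (e.g. "ä" -> "%C3%A4").
expected = quote(selected)

print(observed)
print(expected)
```

The first line printed matches the link text seen in the remember buffer; the second shows the escapes that would preserve the umlauts.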
Sebastian
* Re: org-protocol and encoding
From: Sebastian Rose @ 2009-04-07 11:20 UTC
To: Ulf Stegemann; +Cc: emacs-orgmode
Sebastian Rose <sebastian_rose@gmx.de> writes:
> This seems to be an Emacs/remember problem, though.
>
>
> If I open a file `xy.txt' and select this text:
>
> lkäüüäüpüpjüpjsf
>
> and then 'C-x r' to remember it, I get this in my remember buffer:
>
> [[file:~/xy.txt::lk%20p%20pj%20pjsf][file:~/xy.txt::lk p pj pjsf]]
>
While this is true, incoming text looks corrupted in org-protocol.el too.
* Re: org-protocol and encoding
From: Sebastian Rose @ 2009-04-15 22:22 UTC
To: Ulf Stegemann; +Cc: emacs-orgmode
Hi Ulf and Carsten,
Appended is a patch that does two things.
1. Decode hex-encoded unicode
=============================
The new function `org-protocol-unhex-string' correctly decodes hex-encoded
Unicode text, as produced by the JavaScript function `encodeURIComponent'.
I tested with several Unicode and German websites.
This is text fetched via org-protocol.el after patching:
From the mew homepage (http://www.mew.org/index.html.ja):
=> --->8----------------------------->8----------------------------->8---
Quelle: [2009-04-16 Do], [[http://www.mew.org/index.html.ja][Mew のオフィシャルページ]]
Mewに関する質問はMew-distメーリングリストへ送ってください。
作者個人宛に送っても,返事は戻ってこないかもしれません。
このページへのリンク、書籍・雑誌等での紹介は、
公序良俗に反しない範囲で自由にどうぞ。
<= ---8<-----------------------------8<-----------------------------8<---
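The decoding step described in point 1 can be sketched roughly as follows. This is an illustration in Python, not the code from the attached patch (which is Emacs Lisp and not reproduced in this thread); the name `unhex_utf8` is hypothetical. The idea is to collect each run of %XX escapes into raw bytes and decode that run as UTF-8, instead of turning every escape into a separate character.

```python
import re

def unhex_utf8(text):
    """Decode %XX escapes, treating each run of escapes as UTF-8 bytes."""
    def decode_run(match):
        # "%E3%81%AE" -> b"\xe3\x81\xae" -> "の"
        raw = bytes.fromhex(match.group(0).replace("%", ""))
        # Invalid byte sequences become U+FFFD rather than raising.
        return raw.decode("utf-8", errors="replace")

    # Only well-formed escapes (% followed by two hex digits) are touched.
    return re.sub(r"(?:%[0-9A-Fa-f]{2})+", decode_run, text)

print(unhex_utf8("Mew%20%E3%81%AE"))  # Mew の
```

Decoding run by run matters because a single non-ASCII character is spread over two to four escapes; handling them one at a time is what yields garbled text.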
2. Allow a function as second argument to org-protocol-split-data
=================================================================
The default decoding function is now `org-protocol-unhex-string', if the
second parameter to `org-protocol-split-data' is non-nil. If that
parameter is a function, that function is used to decode the split
parts.
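The behaviour of the second argument described in point 2 might look like this in outline. Again this is a Python sketch under stated assumptions, not the patched Emacs Lisp: the name `split_data`, the "/" separator and the use of `urllib.parse.unquote` as the default decoder are all stand-ins chosen for illustration.

```python
from urllib.parse import unquote

def split_data(data, unhexify=None, separator="/"):
    """Split DATA into parts and optionally decode each part.

    unhexify is None/False -> parts are returned as received
    unhexify is a function -> that function decodes each part
    unhexify is otherwise truthy -> the default decoder is used
    """
    parts = data.split(separator)
    if not unhexify:
        return parts
    decode = unhexify if callable(unhexify) else unquote
    return [decode(part) for part in parts]

raw = "http%3A%2F%2Fwww.mew.org%2F/Mew%20%E3%81%AE"
print(split_data(raw))             # parts left encoded
print(split_data(raw, True))       # default decoder
print(split_data(raw, str.lower))  # caller-supplied function
```

Splitting before decoding keeps an encoded separator inside a part (here `%2F`) from being mistaken for a field boundary.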
The patch still contains some lines of debugging code that
may be uncommented to see what's going on.
[-- Attachment #2: patch-org-protocol.el --]
[-- Type: application/emacs-lisp, Size: 3467 bytes --]
Best
Sebastian
* Re: org-protocol and encoding
From: Sebastian Rose @ 2009-04-15 22:33 UTC
To: Ulf Stegemann; +Cc: emacs-orgmode
Sorry for replying to my own mail.
Reading `man utf-8' might help to understand what the patch does.
And, utf-8 is what is decoded here.
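For readers without the man page at hand, the part of UTF-8 that matters here is that the first byte of a sequence announces how many bytes belong to one character. A rough Python sketch of that rule follows (the function names are hypothetical, and continuation bytes are not validated, so this is not a complete decoder):

```python
def utf8_sequence_length(lead_byte):
    """Return how many bytes a UTF-8 sequence starting with lead_byte has."""
    if lead_byte < 0x80:            # 0xxxxxxx: plain ASCII
        return 1
    if 0xC0 <= lead_byte < 0xE0:    # 110xxxxx
        return 2
    if 0xE0 <= lead_byte < 0xF0:    # 1110xxxx
        return 3
    if 0xF0 <= lead_byte < 0xF8:    # 11110xxx
        return 4
    raise ValueError("not a valid UTF-8 lead byte: %#x" % lead_byte)

def decode_code_point(seq):
    """Combine the payload bits of one UTF-8 sequence into a code point."""
    n = utf8_sequence_length(seq[0])
    if n == 1:
        return seq[0]
    # Keep only the payload bits of the lead byte (5, 4 or 3 bits).
    code_point = seq[0] & (0xFF >> (n + 1))
    for byte in seq[1:n]:
        # Each continuation byte (10xxxxxx) contributes six more bits.
        code_point = (code_point << 6) | (byte & 0x3F)
    return code_point

print(hex(decode_code_point(b"\xc3\xa4")))      # 0xe4, i.e. "ä"
print(hex(decode_code_point(b"\xe3\x81\xae")))  # 0x306e, i.e. "の"
```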
Sebastian
* Re: org-protocol and encoding
From: Carsten Dominik @ 2009-04-16 6:39 UTC
To: Sebastian Rose; +Cc: emacs-orgmode, Ulf Stegemann
Hi Sebastian,
this looks like a good solution!
Send me a final patch when you are convinced yourself.
Ulf, can you do some testing, please?
Thanks.
- Carsten
* Re: org-protocol and encoding
From: Ulf Stegemann @ 2009-04-16 8:48 UTC
To: emacs-orgmode
Sebastian, Carsten,
Carsten Dominik <carsten.dominik@gmail.com> wrote:
> Ulf, can you do some testing, please?
Done, using Emacs 23 (CVS today), Org (git today + patch), and Firefox
3.0.8 on Linux. Everything seems to work fine; I haven't found a single
page/text that hasn't been encoded correctly. So I assume that the patch
is working ... at least in the environment mentioned. Thanks for your
effort and the good work :)
Ulf
* Re: org-protocol and encoding
From: Carsten Dominik @ 2009-04-09 13:29 UTC
To: Sebastian Rose; +Cc: emacs-orgmode, Ulf Stegemann
On Apr 6, 2009, at 1:53 PM, Sebastian Rose wrote:
> Ulf Stegemann <ulf-news@zeitform.de> writes:
>> org-protocol is really a great extension for org-mode! However, I
>> experience an annoyance regarding non-ascii character encoding.
>>
>> When using org-protocol with remember and firefox, all non-ascii
>> characters get b0rked in the remember buffer (on linux, with emacs 23
>> and org-mode as of today and latest ff 3.0). It doesn't matter if the
>> source page uses html entities or literal non-ascii-characters. Does
>> anyone share this experience and has a suggestion on where to look
>> for
>> the cause?
>
>
> Yes. Same here.
>
> This seems to be an Emacs/remember problem, though.
>
>
> If I open a file `xy.txt' and select this text:
>
> lkäüüäüpüpjüpjsf
>
> and then 'C-x r' to remember it, I get this in my remember buffer:
>
> [[file:~/xy.txt::lk%20p%20pj%20pjsf][file:~/xy.txt::lk p pj pjsf]]
>
>
This problem might be partially resolved by pulling from git and then
setting
(setq org-url-encoding-use-url-hexify t)
This is for testing only right now.
- Carsten