emacs-orgmode@gnu.org archives
* org-protocol and encoding
@ 2009-04-06 10:37 Ulf Stegemann
  2009-04-06 11:53 ` Sebastian Rose
  0 siblings, 1 reply; 8+ messages in thread
From: Ulf Stegemann @ 2009-04-06 10:37 UTC
  To: emacs-orgmode

org-protocol is really a great extension for org-mode! However, I
experience an annoyance regarding non-ASCII character encoding.

When using org-protocol with remember and Firefox, all non-ASCII
characters get b0rked in the remember buffer (on Linux, with Emacs 23,
org-mode as of today, and the latest Firefox 3.0). It doesn't matter
whether the source page uses HTML entities or literal non-ASCII
characters. Does anyone share this experience and have a suggestion on
where to look for the cause?

Ulf


* Re: org-protocol and encoding
  2009-04-06 10:37 org-protocol and encoding Ulf Stegemann
@ 2009-04-06 11:53 ` Sebastian Rose
  2009-04-07 11:20   ` Sebastian Rose
  2009-04-09 13:29   ` Carsten Dominik
  0 siblings, 2 replies; 8+ messages in thread
From: Sebastian Rose @ 2009-04-06 11:53 UTC
  To: Ulf Stegemann; +Cc: emacs-orgmode

Ulf Stegemann <ulf-news@zeitform.de> writes:
> org-protocol is really a great extension for org-mode! However, I
> experience an annoyance regarding non-ascii character encoding.
>
> When using org-protocol with remember and firefox, all non-ascii
> characters get b0rked in the remember buffer (on linux, with emacs 23
> and org-mode as of today and latest ff 3.0). It doesn't matter if the
> source page uses html entities or literal non-ascii-characters. Does
> anyone share this experience and has a suggestion on where to look for
> the cause?


Yes. Same here.

This seems to be an Emacs/remember problem, though.


If I open a file `xy.txt' and select this text:

  lkäüüäüpüpjüpjsf

and then 'C-x r' to remember it, I get this in my remember buffer:

  [[file:~/xy.txt::lk%20p%20pj%20pjsf][file:~/xy.txt::lk p pj pjsf]]



Not sure how to work around this yet. Seems to be encoding-related...

Maybe I will find some time to look into this later today.



   Sebastian


* Re: org-protocol and encoding
  2009-04-06 11:53 ` Sebastian Rose
@ 2009-04-07 11:20   ` Sebastian Rose
  2009-04-15 22:22     ` Sebastian Rose
  2009-04-09 13:29   ` Carsten Dominik
  1 sibling, 1 reply; 8+ messages in thread
From: Sebastian Rose @ 2009-04-07 11:20 UTC
  To: Ulf Stegemann; +Cc: emacs-orgmode

Sebastian Rose <sebastian_rose@gmx.de> writes:
> This seems to be a emacs/remember problem though.
>
>
> If I open a file `xy.txt' and select this text:
>
>   lkäüüäüpüpjüpjsf
>
> and then 'C-x r' to remember it, I get this in my remember buffer:
>
>   [[file:~/xy.txt::lk%20p%20pj%20pjsf][file:~/xy.txt::lk p pj pjsf]]
>

While this is true, incoming text looks corrupted in org-protocol.el too.


* Re: org-protocol and encoding
  2009-04-06 11:53 ` Sebastian Rose
  2009-04-07 11:20   ` Sebastian Rose
@ 2009-04-09 13:29   ` Carsten Dominik
  1 sibling, 0 replies; 8+ messages in thread
From: Carsten Dominik @ 2009-04-09 13:29 UTC
  To: Sebastian Rose; +Cc: emacs-orgmode, Ulf Stegemann


On Apr 6, 2009, at 1:53 PM, Sebastian Rose wrote:

> Ulf Stegemann <ulf-news@zeitform.de> writes:
>> org-protocol is really a great extension for org-mode! However, I
>> experience an annoyance regarding non-ascii character encoding.
>>
>> When using org-protocol with remember and firefox, all non-ascii
>> characters get b0rked in the remember buffer (on linux, with emacs 23
>> and org-mode as of today and latest ff 3.0). It doesn't matter if the
>> source page uses html entities or literal non-ascii-characters. Does
>> anyone share this experience and has a suggestion on where to look for
>> the cause?
>
>
> Yes. Same here.
>
> This seems to be a emacs/remember problem though.
>
>
> If I open a file `xy.txt' and select this text:
>
>  lkäüüäüpüpjüpjsf
>
> and then 'C-x r' to remember it, I get this in my remember buffer:
>
>  [[file:~/xy.txt::lk%20p%20pj%20pjsf][file:~/xy.txt::lk p pj pjsf]]
>
>

This problem might be partially resolved by pulling from git and then
setting

(setq org-url-encoding-use-url-hexify t)

This is for testing only right now.

- Carsten


* Re: org-protocol and encoding
  2009-04-07 11:20   ` Sebastian Rose
@ 2009-04-15 22:22     ` Sebastian Rose
  2009-04-15 22:33       ` Sebastian Rose
  2009-04-16  6:39       ` Carsten Dominik
  0 siblings, 2 replies; 8+ messages in thread
From: Sebastian Rose @ 2009-04-15 22:22 UTC
  To: Ulf Stegemann; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1522 bytes --]

Hi Ulf and Carsten,


Appended is a patch that does two things.


1. Decode hex-encoded unicode
=============================
  
The new function `org-protocol-unhex-string' correctly decodes
hex-encoded Unicode, i.e. text that was percent-encoded the way the
JavaScript function `encodeURIComponent' encodes it.

I tested with several Unicode and German websites.
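
For readers following along, here is a minimal sketch of the decoding
idea as it could be written in a current Emacs. It is not the code from
the attached patch: `my/unhex-utf8' is a hypothetical name, and relying
on `url-unhex-string' to return raw bytes is an assumption about the
Emacs version in use.

```emacs-lisp
(require 'url-util)

;; Hypothetical helper for illustration only; NOT the patch's
;; `org-protocol-unhex-string'.
(defun my/unhex-utf8 (str)
  "Decode the percent-escapes in STR and interpret the bytes as UTF-8.
STR is assumed to be pure ASCII, as `encodeURIComponent' output is."
  ;; `url-unhex-string' replaces every %XX escape with one raw byte;
  ;; `decode-coding-string' then joins multi-byte sequences such as
  ;; C3 A4 into single characters.
  (decode-coding-string (url-unhex-string str) 'utf-8))

;; Each umlaut arrives as two escapes because UTF-8 needs two bytes.
(my/unhex-utf8 "lk%C3%A4%C3%BC")   ; => "lkäü"
```

Decoding the escapes one byte at a time and treating each byte as a
character of its own is what garbles the umlauts; the bytes have to be
collected first and decoded as UTF-8 afterwards.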

This is text fetched via org-protocol.el after patching:
 
 From the mew homepage (http://www.mew.org/index.html.ja):

 => --->8----------------------------->8----------------------------->8---
 Quelle: [2009-04-16 Do], [[http://www.mew.org/index.html.ja][Mew のオフィシャルページ]]

 Mewに関する質問はMew-distメーリングリストへ送ってください。
 作者個人宛に送っても,返事は戻ってこないかもしれません。
 このページへのリンク、書籍・雑誌等での紹介は、
 公序良俗に反しない範囲で自由にどうぞ。 

 <= ---8<-----------------------------8<-----------------------------8<---
  


  
2. Allow a function as second argument to org-protocol-split-data
=================================================================

The default decoding function is now `org-protocol-unhex-string', if the
second parameter to `org-protocol-split-data' is non-nil. If that
parameter is a function, that function is used to decode the split
parts. 
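
The dispatch described above could look roughly like the following
sketch. The function name `my/split-data', its argument list, and the
"/" default separator are assumptions made for illustration; the real
`org-protocol-split-data' in the patch may differ.

```emacs-lisp
(require 'url-util)

;; Hypothetical stand-in for illustration; NOT the patched
;; `org-protocol-split-data'.
(defun my/split-data (data &optional unhexify separator)
  "Split DATA at SEPARATOR (default \"/\") and optionally decode the parts.
If UNHEXIFY is a function, call it on each part.  If it is any other
non-nil value, percent-decode each part as UTF-8.  If it is nil, return
the parts unchanged."
  (let ((parts (split-string data (or separator "/")))
        (decode (cond ((functionp unhexify) unhexify)
                      (unhexify
                       (lambda (s)
                         (decode-coding-string (url-unhex-string s) 'utf-8)))
                      (t #'identity))))
    (mapcar decode parts)))

(my/split-data "a%20b/%C3%BC" t)   ; => ("a b" "ü")
(my/split-data "a%20b/c")          ; => ("a%20b" "c")
(my/split-data "ab/cd" #'upcase)   ; => ("AB" "CD")
```

Accepting either a flag or a function keeps existing callers that pass
`t' working while letting others plug in their own decoder.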
  




The patch still contains some lines of debugging code that
may be uncommented to see what's going on.



[-- Attachment #2: patch-org-protocol.el --]
[-- Type: application/emacs-lisp, Size: 3467 bytes --]

[-- Attachment #3: Type: text/plain, Size: 26 bytes --]




Best

    Sebastian






* Re: org-protocol and encoding
  2009-04-15 22:22     ` Sebastian Rose
@ 2009-04-15 22:33       ` Sebastian Rose
  2009-04-16  6:39       ` Carsten Dominik
  1 sibling, 0 replies; 8+ messages in thread
From: Sebastian Rose @ 2009-04-15 22:33 UTC
  To: Ulf Stegemann; +Cc: emacs-orgmode

Sorry for replying to my own mail.

Reading `man utf-8' might help to understand what the patch does.

And UTF-8 is what is decoded here.



  Sebastian


* Re: org-protocol and encoding
  2009-04-15 22:22     ` Sebastian Rose
  2009-04-15 22:33       ` Sebastian Rose
@ 2009-04-16  6:39       ` Carsten Dominik
  2009-04-16  8:48         ` Ulf Stegemann
  1 sibling, 1 reply; 8+ messages in thread
From: Carsten Dominik @ 2009-04-16  6:39 UTC
  To: Sebastian Rose; +Cc: emacs-orgmode, Ulf Stegemann

Hi Sebastian,

this looks like a good solution!

Send me a final patch when you are convinced yourself.

Ulf, can you do some testing, please?

Thanks.

- Carsten


* Re: org-protocol and encoding
  2009-04-16  6:39       ` Carsten Dominik
@ 2009-04-16  8:48         ` Ulf Stegemann
  0 siblings, 0 replies; 8+ messages in thread
From: Ulf Stegemann @ 2009-04-16  8:48 UTC
  To: emacs-orgmode

Sebastian, Carsten,

Carsten Dominik <carsten.dominik@gmail.com> wrote:

> Ulf, can you do some testing, please?

Done, using Emacs 23 (CVS today), Org (git today + patch), and Firefox
3.0.8 on Linux. Everything seems to work fine; I haven't found a single
page/text that hasn't been encoded correctly. So I assume that the patch
is working ... at least in the environment mentioned. Thanks for your
effort and the good work :)


Ulf



Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git
