From: Łukasz Stelmach
Subject: Re: [PATCH] unicode nbsp in org-emphasis-regexp-components
Date: Sun, 24 Oct 2010 23:05:07 +0200
Message-ID: <87wrp7714s.fsf@kotik.lan>
References: <87lj5p7sy4.fsf@kotik.lan> <2FD034DB-D8AF-4EA3-8510-F279DA93739E@gmail.com>
To: emacs-orgmode@gnu.org

Carsten Dominik writes:

> On Oct 23, 2010, at 12:39 AM, Łukasz Stelmach wrote:
>
>> Unicode contains a NO-BREAK SPACE character at position 0xA0.
>> IMHO org-mode's emphasis code should by default treat this (any
>> other?) character the same as normal space.
>> When I write:
>>
>>     It was a /big bang/.
>>
>> I'd like the "big bang" to be put in italic especially when exported
>> to HTML. (I don't know if it goes properly through all the mailing
>> systems but I put the "\u00A0" between "a" and "/" above.)
>> [...]
>>
>
> I am afraid that this will break flavors of Emacs which do not
> support unicode characters, like Emacs 22. Org-mode still supports
> Emacs 22. And I do not know how to write this in a way that it
> will remain compatible. Do you?

How about simply checking the Emacs version?

(defcustom org-emphasis-regexp-components
  (if (<= 23 (string-to-number (car (split-string emacs-version "\\."))))
      '(" \t('\"{\u00A0" "- \t.,:!?;'\")}\\" " \t\r\n,\"'" "." 1)
    '(" \t('\"{" "- \t.,:!?;'\")}\\" " \t\r\n,\"'" "." 1))
[...]

The problem with earlier versions is that although most, if not all, ISO
Latin pages put `NO-BREAK SPACE' at 0xA0, some may use different
codepages. But they can do this also in newer Emacsen if they haven't
converted their files yet, can't they?

If you think putting `A0' in that regexp may break things, then I'd
suggest putting a note about it somewhere for people who'd like to
customise it for themselves.

-- 
Have a nice day,
Łukasz Stelmach
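
For people who would rather customise this locally than change the
default, a note of the kind suggested above might contain something like
the following. This is only a sketch, not part of the patch: it assumes
the setting is evaluated in an init file before Org is loaded, and it
uses the built-in `emacs-major-version' variable instead of parsing
`emacs-version'. The five list elements are the ones from the defcustom
quoted in this message, with U+00A0 appended to the first.

```elisp
;; Sketch only: allow NO-BREAK SPACE (U+00A0) before an emphasis marker
;; on Emacs 23 or later.
;; Assumption: this is evaluated before Org is loaded, so that Org builds
;; its emphasis regexp from the modified value.
(when (>= emacs-major-version 23)
  (setq org-emphasis-regexp-components
        (list (concat " \t('\"{" (string #xA0)) ; chars allowed before a marker
              "- \t.,:!?;'\")}\\"               ; chars allowed after a marker
              " \t\r\n,\"'"                     ; chars forbidden at the borders
              "."                               ; regexp matching one body char
              1)))                              ; max newlines inside emphasis
```

Building the character with `(string #xA0)' keeps the init file itself
pure ASCII, so the file reads the same regardless of its coding system.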