From: Tobias Getzner
Subject: Re: [BUG] Mark-up handling chokes on unicode whitespace
Date: Tue, 23 Sep 2014 17:44:19 +0000 (UTC)
References: <87a95qp8vp.fsf@gmail.com>
To: emacs-orgmode@gnu.org

Hello Aaron!

On Tue, 23 Sep 2014 13:03:06 -0400, Aaron Ecay wrote:
> On 23 September 2014, Tobias Getzner wrote:
>>
>> When mark-up such as =monospace=, /italic/, etc. is preceded by a
>> non-8bit whitespace, e.g., «narrow no-break space» (U+202F) or
>> «no-break space» (U+00A0), org-mode will not recognize the mark-up
>> content correctly
>
> You will need to change the variable org-emphasis-regexp-components; see
> the documentation thereof.

Thank you very much! This seems to do it. Might I suggest adding Unicode
whitespace to the default? That variable seems a bit opaque, and I would
probably never have discovered it on my own. It also appears that one has
to set it before org-mode is «required», and that one cannot easily extend
the default without also setting the rest. For typesetting purposes, at
least the class of non-breaking whitespace is very useful.

At first I thought it might be easy to solve such problems cleanly by
using the whitespace character class throughout, but to my chagrin it
seems that at least «search-forward-regexp» will only match 8-bit
whitespace this way, so I suppose Emacs regex isn’t aware of non-ASCII
whitespace? :'|

Best,
Tobias
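
A minimal sketch of the kind of change discussed above, not a tested
recipe. It assumes that the first two elements of
org-emphasis-regexp-components are the character sets allowed before and
after a marker, and that org-set-emph-re (called here with the variable
name and the new value) rebuilds the derived regexps so the setting need
not precede loading Org. It also assumes with-eval-after-load is
available (Emacs 24.4 or later); on older versions eval-after-load would
be the substitute. The name `extra' is purely illustrative.

```elisp
;; Sketch only: extend the current components instead of retyping them.
;; The two characters appended are the ones named in the report:
;; U+00A0 (no-break space) and U+202F (narrow no-break space).
(with-eval-after-load 'org
  (let* ((extra "\u00A0\u202F")
         (old org-emphasis-regexp-components)
         ;; Build a fresh list rather than mutating the default in place.
         (new (append (list (concat (nth 0 old) extra)   ; allowed before a marker
                            (concat (nth 1 old) extra))  ; allowed after a marker
                      (nthcdr 2 old))))                  ; remaining elements unchanged
    ;; Assumption: `org-set-emph-re' sets the variable and recomputes the
    ;; emphasis regexps derived from it.
    (org-set-emph-re 'org-emphasis-regexp-components new)))
```

Appending to the existing strings keeps whatever defaults the installed
Org version ships, which addresses the complaint that the default cannot
be extended without restating the rest. Org buffers that are already open
may need to be reverted or have org-mode re-enabled before the new
patterns take effect.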