From: Aaron Ecay
Subject: Re: [BUG] Mark-up handling chokes on unicode whitespace
Date: Tue, 23 Sep 2014 14:15:53 -0400
Message-ID: <87ppemnqxy.fsf@gmail.com>
References: <87a95qp8vp.fsf@gmail.com>
To: Tobias Getzner, emacs-orgmode@gnu.org

Hi Tobias,

On 23 September 2014, Tobias Getzner wrote:
>
> Hello Aaron!
>
> On Tue, 23 Sep 2014 13:03:06 -0400, Aaron Ecay wrote:
>
>> You will need to change the variable org-emphasis-regexp-components; see
>> the documentation thereof.
>
> Thank you very much! This seems to do it.
>
> Might I suggest amending unicode whitespace to the default?
> That variable seems a bit opaque and I might probably never have
> discovered it on my own; it also appears as if one has to ensure that
> this is set before org-mode is «required», and one cannot easily just
> extend the default without also setting the rest. For type-setting
> purposes, at least the class of non-breaking whitespace is very useful.

org-emphasis-regexp-components is known to be a wart. You can search for
posts on the mailing list. Some people are trying to figure out how to
get rid of it. (You can search in particular for Nicolas Goaziou’s
posts...) Here’s one thread where you can see the lay of the land: .

All that to say, the longer-term solution is to figure out some radically
different approach. In the meantime though, if you can provide a list of
characters (by Unicode name and/or code point) that you think should be
added to that variable, someone might be able to add them. (I probably
would not make such a change on my own, but would wait for feedback from
Nicolas, Bastien, or one of the other maintainer-esque figures on the
list.) On the other hand, they might say “making such a change in org’s
core is just restacking the deck chairs on the Titanic,” which would also
be a reasonable position for them to take IMO.

>
> At first I thought it might be easy to cleanly solve such problems by
> using the whitespace character class throughout, but to my chagrin it
> seems that at least «search-forward-regexp» will only match 8-bit
> whitespace this way, so I suppose Emacs regex isn’t aware of non-ASCII
> whitespace? :'|

I don’t really know anything about this... it’s unfortunate if true
though.

-- 
Aaron Ecay
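
For readers landing on this thread, here is a minimal sketch of the workaround discussed above. It is not taken from the thread itself. It assumes the variable is a five-element list (PRE POST BORDER BODY NEWLINES), as described in its docstring for Org versions of that era, and it uses U+00A0 NO-BREAK SPACE purely as an illustrative character. The remaining characters only approximate the defaults, so check `C-h v org-emphasis-regexp-components` in your own Org version before copying anything.

```emacs-lisp
;; Sketch only, under the assumptions stated above.
;; The whole list has to be given; as noted in the thread, one cannot
;; easily extend the default without also setting the rest.
;; Evaluate this BEFORE Org is loaded, otherwise it has no effect on the
;; emphasis regexps Org builds at load time.
(setq org-emphasis-regexp-components
      '(" \t('\"{\u00a0"           ; PRE: chars allowed before the opening marker
        "- \t.,:!?;'\")}\\\u00a0"  ; POST: chars allowed after the closing marker
        " \t\r\n,\"'"              ; BORDER: chars forbidden just inside the markers
        "."                        ; BODY: regexp matching one body character
        1))                        ; NEWLINES: max newlines allowed inside emphasis

(require 'org)
```

Adding the character to both PRE and POST is a choice made here so that emphasis may be preceded or followed by a no-break space; whether BORDER should change as well depends on the behaviour you want.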