From: Rasmus
Subject: Re: Orgmode → ODT: Certain chars break export
Date: Fri, 13 Feb 2015 17:07:14 +0100
Message-ID: <87mw4hn6bx.fsf@gmx.us>
References: <87r3tu5bus.fsf@gmail.com> <87386avzrl.fsf@gmx.us> <87egpt6drz.fsf@gmail.com>
To: emacs-orgmode@gnu.org

torys.anderson@gmail.com (Tory S. Anderson) writes:

> From a user perspective just stripping the characters seems best to
> me, but finding out what the characters seems obnoxious.

But maybe there is a valid way to represent such characters in XML?
At the very least, entities must be replaced before stripping these...

> Neither a quick search nor skimming the ODT doc specification[1][2] seem
> to give any insight into a set of illegal characters. Does elisp have
> anything similar to Java's "isWhitespace"[3] that could be used to check
> character features?

It's an XML thing. When I tried to open content.xml with Firefox, it
also reported broken XML. But I also don't know which characters XML
does not support.

—Rasmus

-- 
This space is left intentionally blank
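
On the open question of which characters XML rejects: as far as I can tell, XML 1.0 defines the allowed set in its "Char" production (tab, LF, CR, U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF). Anything outside it, such as most C0 control characters, is not allowed in an XML 1.0 document, not even as a numeric character reference. The following is only a rough Python sketch of the "strip them" approach, not what the ODT exporter does; the names `is_xml_char` and `strip_invalid_xml_chars` are made up for illustration.

```python
import re

# Characters permitted by the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The negated class below matches everything outside that set.
_INVALID_XML_CHARS = re.compile(
    "[^\t\n\r\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def is_xml_char(ch: str) -> bool:
    """Return True if the single character `ch` is allowed in XML 1.0."""
    return _INVALID_XML_CHARS.match(ch) is None


def strip_invalid_xml_chars(text: str) -> str:
    """Return `text` with every character that XML 1.0 forbids removed."""
    return _INVALID_XML_CHARS.sub("", text)


# A form feed (U+000C) is a typical offender in pasted text.
print(repr(strip_invalid_xml_chars("page\x0cbreak")))  # 'pagebreak'
```

Stripping is lossy by design. Since XML 1.0 offers no escape for these characters, the alternatives would be replacing them with something visible or refusing to export.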