From mboxrd@z Thu Jan 1 00:00:00 1970
From: Rick Frankel
Subject: Re: [patch][ox-html] Stylistic changes
Date: Wed, 19 Mar 2014 10:00:49 -0400
Message-ID: <488c8d8754d4539f920ce81d9698a7fb@mail.rickster.com>
References: <874n2z3ruf.fsf@gmx.us> <87d2hmsbuc.fsf@gmail.com>
 <87eh21k1qx.fsf@bzg.ath.cx> <20140317170102.GA75979@eyeBook>
 <87k3bs31u8.fsf@gmx.us> <20140318003542.GB92601@eyeBook>
 <874n2w2n62.fsf@gmx.us>
 <0d5c03b1f1e36a4250cbd11d467d3efe@mail.rickster.com>
 <8761nbs31n.fsf@gmx.us>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Received: from eggs.gnu.org ([2001:4830:134:3::10]:53189)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1WQH38-0005AD-Qt for emacs-orgmode@gnu.org;
 Wed, 19 Mar 2014 10:00:59 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1WQH33-0004x9-QK for emacs-orgmode@gnu.org;
 Wed, 19 Mar 2014 10:00:54 -0400
Received: from mail.rickster.com ([204.62.15.78]:38425)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1WQH33-0004we-Mk for emacs-orgmode@gnu.org;
 Wed, 19 Mar 2014 10:00:49 -0400
In-Reply-To: <8761nbs31n.fsf@gmx.us>
List-Id: "General discussions about Org-mode."
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
To: Rasmus
Cc: emacs-orgmode@gnu.org

On 2014-03-18 15:46, Rasmus wrote:
> Rick Frankel writes:
>
> On 2014-03-17 23:36, Rasmus wrote:
> When you refer above to "utf-8 entities", do you mean the named html
> entities (e.g., &lt;) or the actual utf-8 encoded characters?
>
> The latter. Do M-x describe-char on such a character. Emacs will
> tell you the code points. My conjecture is therefore that one could
> write a script that would translate html values to these weird hex
> string or codepoints.
> It would create more ugly source output, but
> perhaps better for XHTML. Personally, I don't care about XHTML as I
> have little intuition as to when to use. . .

Do you close the empty tags in your html (e.g., , )? Then you're using
xhtml.

> I believe the named entities are encoding independent, while including
> encoded characters in html output is fine -- although making sure the
> page is served with the correct character encoding is another issue
> entirely.
>
> Not what I meant. I'm only addressing your concern about
> &HUMAN-READABLE-NAME; vs %HEX-VALUE;.
>
> As to using a more extensive set of named entities, as I said above,
> the problem is that the xhtml flavors don't support them, and I don't
> see any advantage in making the exporter handle character encoding
> differently based on output doctype.
>
> Definitely not. Why I ask if there's a point in changing nice
> entities to ugly entities for the sake of not getting them in
> XHTML-encoded documents.

Yes, we should. You can't properly post-process the html if it's
invalid xml. And the definitions of "pretty" and "ugly" are subjective.
The question is, do we want to generate valid (x)html or not? My vote
is yes. In our case, html is an output format and not a source format.
In fact, we should probably compress out unnecessary whitespace, etc.,
the way other web generators do, to make the smallest/most efficient
output for web serving.

> As Nicolas would point out, you can always use a filter to map all the
> entities in the output.
>
> With ox-latex.el we for instance don't include entities that are not
> supported by the default package alist. A similar concern could be at
> play here.

Agreed.
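
To make the "script that would translate html values" idea concrete,
here is a rough sketch of such a mapping. It is not part of ox-html and
not an Org export filter; Python is used only for brevity, and the
function name `named_to_numeric` and the `XML_PREDEFINED` set are made
up for this illustration. It assumes that only the five entities
predefined by XML are safe to leave as names in XML-parsed output, and
rewrites every other known HTML named entity as a hexadecimal character
reference.

```python
import re
from html.entities import name2codepoint

# Assumption for this sketch: only the five entities predefined by XML
# itself are kept as names; everything else is rewritten.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches references of the form &name; (letters and digits only).
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(markup: str) -> str:
    """Replace HTML named entities with hexadecimal character references."""

    def replace(match):
        name = match.group(1)
        # Leave XML's own entities and unknown names untouched.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#x{:X};".format(name2codepoint[name])

    return _ENTITY_RE.sub(replace, markup)


print(named_to_numeric("a &lt; b &mdash; caf&eacute;"))
# prints: a &lt; b &#x2014; caf&#xE9;
```

The output is uglier to read, but it no longer depends on a DTD that
declares the entity names, which is the point made above about being
able to post-process the result as XML.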