* HTML export: how to delimit escaped HTML entity?
@ 2008-08-27 22:37 mtheo
  2008-08-28  0:00 ` Sebastian Rose
  2008-08-28 14:46 ` William Henney
  0 siblings, 2 replies; 3+ messages in thread
From: mtheo @ 2008-08-27 22:37 UTC (permalink / raw)
  To: org-mode list

This is probably bonehead simple, but so far that bone seems to be broken here.

The entities in org-html-entities work fine for me as long as followed by a 
space (or another \-escaped entity), but I can't seem to discover how 
they're delimited within a word.

E.g.
	"Ren\eacute " produces  "Ren&eacute; "

which is just fine; but I've failed to find a way to get the correct

	"&Aacute;stor"

out of any combination of "\Aacute" and "stor".

	"\Aacutestor" -> "\Aacutestor"
	"\Aacute stor" (naturally) -> "&Aacute; stor"

and nothing I've tried by way of analogy with shells (e.g. "\{Aacute}stor") 
or other syntax ("\Aacute\stor"? "\Aacute$stor"? etc.) has produced anything 
but the same string in the generated code.

I'm sure my ignorance of {insert bonehead lacuna here} is showing, but does 
anyone know how to enter poor old Sr Piazzolla's given name into an org file 
so that it will export?

Thanks -- and repeated & continual thanks to Carsten and all other generous 
contributors.

Mark


GNU Emacs 23.0.60.1 (i386-mingw-nt5.1.2600)
org-version 6.06b


-- 
m. theo
producer / classics without walls
the anti-warhorse zone / www.amural.com
kusf 90.3fm / san francisco


* Re: HTML export: how to delimit escaped HTML entity?
@ 2008-08-28  0:00 ` Sebastian Rose
From: Sebastian Rose @ 2008-08-28  0:00 UTC (permalink / raw)
  To: org-mode list


What's in your XHTML-Head section?

How does your Content-Type line look? That is,

<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>

Change the 'utf-8' to whatever encoding your exported HTML files have.
You can detect the real encoding of the files by visiting such an
exported HTML file with Emacs.

In the modeline on the left, you can see some indicators, the second
of which shows if the encoding is multibyte ('U') or singlebyte ('1').
Click it to see a full description of the file's encoding.

|
|_________________________________________
|-U--
   ^
   |
Click

Some more on this:

The world is big. Consider changing your system to use UTF-8
everywhere.

This is what I do here. Every file (and every filename) is UTF-8
on my system and thus things like

ö ß á Á € ø ·

folder
   │
   ├> file
   └> another_file


are always displayed correctly. So is the XHTML export of my Org-files.
The characters are not changed to their html entities at all.

Switch to UTF-8 and get rid of all your local problems :-)

The easiest way to do this is just to install a modern GNU/Linux
distribution (use Debian). Just 1 hour and you're done, once and for all.
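
Short of changing the whole system, Emacs itself can be asked to prefer
UTF-8 for new files. The following is a minimal init-file sketch, under
the assumption that you want UTF-8 as the default everywhere in Emacs;
it does not convert files that already exist in another encoding.

```elisp
;; Assumption: placed in your init file (e.g. ~/.emacs).
;; Select the UTF-8 language environment first, because it resets the
;; coding-system priorities ...
(set-language-environment "UTF-8")
;; ... then put utf-8 at the front of the priority list used when
;; reading and writing files.
(prefer-coding-system 'utf-8)
```

An existing buffer can be switched with "C-x RET f utf-8 RET" and then
saved.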


Regards,

  - Sebastian




* Re: HTML export: how to delimit escaped HTML entity?
@ 2008-08-28 14:46 ` William Henney
From: William Henney @ 2008-08-28 14:46 UTC (permalink / raw)
  To: mtheo; +Cc: org-mode list

Hi Mark

On Wed, Aug 27, 2008 at 5:37 PM, mtheo <mtheo@mtheo.net> wrote:
> The entities in org-html-entities work fine for me as long as followed by a
> space (or another \-escaped entity), but I can't seem to discover how
> they're delimited within a word.

Maybe someone will correct me, but it looks to me like there is no
provision in the code for delimiting these entities.

In the function org-html-do-expand in org-exp.el, if you change
line 3788 (in org 6.06b):

(while (setq start (string-match "\\\\\\([a-zA-Z]+\\)" s start))

to

(while (setq start (string-match "\\\\\\([a-zA-Z]+\\)\\(\\{\\}\\)?" s start))

then you can use an empty pair of braces "{}" to end the entities (as
in LaTeX macros). With that change, "\Aacute{}stor" will produce
"&Aacute;stor". I haven't tested this much, so I don't know if it has
unwanted side effects elsewhere.
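
To try the effect of the optional "{}" outside of Org, here is a minimal
standalone sketch of the same kind of matching loop. It is not the code
from org-exp.el: `my-entity-table' and `my-expand-entities' are
hypothetical names, and the table is a two-entry stand-in for
org-html-entities. The braces are written unescaped here, since plain
"{" and "}" match literally in Emacs regexps.

```elisp
;; Sketch only: these names are hypothetical and not part of Org.
(defvar my-entity-table
  '(("Aacute" . "&Aacute;")
    ("eacute" . "&eacute;"))
  "Tiny stand-in for the real entity table.")

(defun my-expand-entities (s)
  "Return S with known backslash entities replaced by HTML entities.
An entity name may be followed by an empty pair of braces, which is
consumed.  Unknown names are left untouched."
  (let ((start 0))
    ;; Group 1 is the entity name; group 2 is the optional "{}" delimiter.
    (while (setq start (string-match "\\\\\\([a-zA-Z]+\\)\\({}\\)?" s start))
      (let* ((name (match-string 1 s))
             (entity (cdr (assoc name my-entity-table))))
        (if entity
            (progn
              ;; Replace the whole match, including any trailing "{}".
              (setq s (replace-match entity t t s))
              (setq start (+ start (length entity))))
          ;; Not a known entity: step past the backslash and keep scanning.
          (setq start (1+ start)))))
    s))
```

With this sketch, (my-expand-entities "\\Aacute{}stor") evaluates to
"&Aacute;stor", while "\\Aacutestor" comes back unchanged because
"Aacutestor" is not a name in the table.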

I would also echo Sebastian's suggestion that you may be better off
just directly entering the non-ASCII character in the .org file. This
usually works fine for export to HTML and means that your org files
are much more readable. You can either use your operating system's way
of entering special characters (which in most cases is pretty clunky)
or Emacs' own input methods (which are very nice). You can turn one of
these on with "C-u C-\" - hit TAB to get a list to choose from. There
are lots of language-specific ones, and also general ones like TeX and
sgml. For instance, with the sgml input method, you type "&Aacute;" to
get "Á", or with the spanish-prefix input method you would type "'A".
Cheers

Will




-- 

 Dr William Henney, Centro de Radioastronomía y Astrofísica,
 Universidad Nacional Autónoma de México, Campus Morelia
