emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
From: "Juan Manuel Macías" <maciaschain@posteo.net>
To: Ihor Radchenko <yantar92@posteo.net>
Cc: orgmode <emacs-orgmode@gnu.org>,  Timothy <orgmode@tec.tecosaur.net>
Subject: Re: Fallback fonts in LaTeX export for non latin scripts
Date: Sun, 03 Sep 2023 11:05:27 +0000	[thread overview]
Message-ID: <87il8ra554.fsf@posteo.net> (raw)
In-Reply-To: <87bkejoh4l.fsf@localhost> (Ihor Radchenko's message of "Sun, 03 Sep 2023 07:22:50 +0000")

Thanks for your comments!

Ihor Radchenko writes:

> Juan Manuel Macías <maciaschain@posteo.net> writes:
>
>> Finally I can upload some usable code here, in this case to be able to
>> load and manage fonts for languages with non-Latin scripts, through
>> babel and fontspec (in LuaLaTeX). It is an attempt to simplify from Org
>> the multiform syntax of babel + fontspec. Of course, it is more limited,
>> but for regular use I think it may be enough.
>
> I can see that you did not add defaults for Chinese, which is one of the
> problematic scripts for LaTeX. Can you add it?

In that first proof of concept I only put a few scripts, less
problematic, simply to show the functionality. In CJK languages things
are a little more complicated, but it can be done too. The idea is to
cover all scripts. In the next code I submit, when I redo the current
one, I will try to introduce the case of CJK scripts.

>> ;; #+LaTeX_Header: % !enable-fonts-for ancientgreek:Linux Libertine O(Scale=MatchLowercase)
>> ;; #+LaTeX_Header: % !enable-fonts-for russian:FreeSerif(Numbers=Lowercase,Color=blue) :: arabic
>
> I do not like this approach.

I'm not a big fan of doing it like that either. I chose this option
because I didn't have to define a new keyword and to be less "intrusive"
with the actual code. But on the other hand it adds a new syntax. Well,
I discard it, to the detriment of an idea that you mention below.

> Would be more consistent to allow multiple languages in #+language +
> #+LATEX_FONT keyword to optionally specify per-language font:

> #+language: ancientgreek russian arabic

Of course, this syntax would be the most appropriate and consistent
within Org. The problem is LaTeX, specifically babel, and that certain
inconsistencies would be created with the rest of the backends. At first
some pitfalls come to mind:

- The keyword #+language accepts for now only language codes (es, en,
  el, ar, ru, etc.). Consistency with other backends should
  be maintained in this regard: ancientgreek is not a valid language
  code, but a name that only babel understands. If we put something
  like (a valid language code):

  #+language: el-polyton

  this could be translated in babel as polutonikogreek (in the classic
  syntax, that is, the languages that are loaded in the options of
  \usepackage[options]{babel}), or, in the new syntax, ancientgreek and
  polytonicgreek, which are actually two different languages: the first
  is ancient polytonic Greek and the second modern polytonic Greek. To
  add more confusion to the matter, in classical babel syntax
  greek.ancient and greek.polytonic are also supported. But neither of
  these things can be deduced by simply putting el-polyton, unless
  breaking the consistency with the other backends.

- Added to this is that Babel has two ways to load languages: the
  classic syntax and the \babelprovide command, which is the one we are
  interested in here for languages with non-Latin scripts, because the
  onchar=ids fonts property must be added here. And what happens if the
  user has already defined several languages with babel, using the
  current procedure: \usepackage[french, english, AUTO]{babel}?

Therefore, the least complicated thing, in my opinion, is to leave the
syntax of the keyword #+language as it is. It is not necessary for the
user to explicitly define secondary non-latin languages. The idea is
that Org is responsible for generating the necessary babel code by
simply giving a command like enable font for X language. What we are
talking about here is ensuring readability using a series of fonts that
LaTeX does not load by default, not even LuaLaTeX. And, after all, Org
is monolingual: it does not have multilingual support at the moment;
that is, there is nothing in Org to switch languages in the middle of
the document. What happens is that here we take advantage of the
functionality that Babel has to automatically apply a font for a
non-Latin language/script, also loading its properties (hyphen rules,
captions, etc.).

A new keyword #+latex_language could be created, which would understand
the babel names, but I think it is unnecessary and would add more
complexity. As I said before, defining the necessary fonts would be
enough, since my idea in this is a basic practicality to ensure the
readability of the documents. And anyone looking for more advanced
functions would have to enter LaTeX code explicitly.

> #+latex_font[ancientgreek]: "Linux Libertine O" Scale=MatchLowercase
>
> #+latex_font[russian]: "FreeSerif" Numbers=Lowercase,Color=blue

I like this idea, but with the exception that in the two examples you
give the user is declaring two fonts for both languages. In my example
there was also Arabic, where the default font for the Arabic script is
used. Note that each script would have default fonts, which the user can
change or not change in their document. A user could simply put
something like "enable the default fonts for ancientgreek, russian,
malayalam, georgian, chinese". And nothing more. Or choose some other
font with or without options for a specific lang.

Could be:

#+latex_font: ancientgreek, russian, malayalam, sanskrit-devanagari

beside:

#+latex_font[arabic]: "FreeSerif" Numbers=Lowercase,Color=blue

This last syntax would also be valid to modify the main default fonts:

#+latex_font[main]: "FreeSerif" Numbers=Lowercase
#+latex_font[sans]: "some font"
#+latex_font[mono]: "some font"
#+latex_font[math]: "some font"

A practical use case. Suppose a user has a document in Spanish, which
includes passages in Greek and Russian. It would be enough to use the
Old Standard font (included in TeX live) for the entire document,
ensuring consistency:

#+latex_header: \usepackage[AUTO]{babel}
#+language:es
#+latex_font[main,greek,russian]: Old Standard

> Also, I think that it may still make sense to have some kind of fallback
> font if the specified fonts are not sufficient. For example, when using
> emoji symbols, which do not correspond to any language.

Yes I agree. That could also be included in the generated preamble.

--
Juan Manuel Macías

https://juanmanuelmacias.com

https://lunotipia.juanmanuelmacias.com

https://gnutas.juanmanuelmacias.com


  reply	other threads:[~2023-09-03 11:06 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-30  8:25 Fallback fonts in LaTeX export for non latin scripts Juan Manuel Macías
2023-08-31  8:17 ` Ihor Radchenko
2023-08-31 11:42   ` Juan Manuel Macías
2023-09-01  9:18     ` Ihor Radchenko
2023-09-02 21:39       ` Juan Manuel Macías
2023-09-03  7:22         ` Ihor Radchenko
2023-09-03 11:05           ` Juan Manuel Macías [this message]
2023-09-04  8:09             ` Ihor Radchenko
2023-09-04 22:22               ` Juan Manuel Macías
2023-09-05 10:44                 ` Ihor Radchenko
2023-09-20 14:03                   ` Juan Manuel Macías
2023-09-21  9:00                     ` Ihor Radchenko
2023-09-24 18:24                       ` Juan Manuel Macías
2023-09-26 10:37                         ` Ihor Radchenko
2023-09-05 16:42                 ` Max Nikulin
2023-09-05 18:33                   ` Juan Manuel Macías
2023-09-06  9:29                     ` Ihor Radchenko
2023-09-06 14:58                       ` Juan Manuel Macías
2023-09-07 10:22                         ` Ihor Radchenko
2023-09-07 12:04                           ` Juan Manuel Macías
2023-09-08  7:42                             ` Ihor Radchenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.orgmode.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87il8ra554.fsf@posteo.net \
    --to=maciaschain@posteo.net \
    --cc=emacs-orgmode@gnu.org \
    --cc=orgmode@tec.tecosaur.net \
    --cc=yantar92@posteo.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).