emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
From: Ihor Radchenko <yantar92@posteo.net>
To: "Juan Manuel Macías" <maciaschain@posteo.net>
Cc: orgmode <emacs-orgmode@gnu.org>, Timothy <orgmode@tec.tecosaur.net>
Subject: Re: Fallback fonts in LaTeX export for non latin scripts
Date: Mon, 04 Sep 2023 08:09:13 +0000	[thread overview]
Message-ID: <87jzt6weae.fsf@localhost> (raw)
In-Reply-To: <87il8ra554.fsf@posteo.net>

Juan Manuel Macías <maciaschain@posteo.net> writes:

>> #+language: ancientgreek russian arabic
>
> Of course, this syntax would be the most appropriate and consistent
> within Org. The problem is LaTeX, specifically babel, and that certain
> inconsistencies would be created with the rest of the backends. At first
> some pitfalls come to mind:
>
> - The keyword #+language accepts for now only language codes (es, en,
>   el, ar, ru, etc.). Consistency with other backends should
>   be maintained in this regard: ancientgreek is not a valid language
>   code, but a name that only babel understands. If we put something
>   like (a valid language code):
>
>   #+language: el-polyton
>
>   this could be translated in babel as polutonikogreek (in the classic
>   syntax, that is, the languages that are loaded in the options of
>   \usepackage[options]{babel}), or, in the new syntax, ancientgreek and
>   polytonicgreek, which are actually two different languages: the first
>   is ancient polytonic Greek and the second modern polytonic Greek. To
>   add more confusion to the matter, in classical babel syntax
>   greek.ancient and greek.polytonic are also supported. But neither of
>   these things can be deduced by simply putting el-polyton, unless
>   breaking the consistency with the other backends.

I am now working on unifying Org translation system as discussed in
https://orgmode.org/list/87o7iw8yem.fsf@bzg.fr
As a part of the effort, I plan to introduce a new constant that will
unify language abbreviations across Org and also associate them with
more human-readable names.

(defconst org-language-abbrevs
  '(("am".  "Amharic")
    ("ar" . "Arabic")
    ("ast" . "Asturian")
    ("bg" . "Bulgarian")
    ("bn" . "Bengali")
    ...))

The idea is to allow
#+language: Austrian German, Greek
as a valid specifier, in addition to
#+language: de-at, el

Then, across Org, we will make use of the standardized language
abbreviations.

> - Added to this is that Babel has two ways to load languages: the
>   classic syntax and the \babelprovide command, which is the one we are
>   interested in here for languages with non-Latin scripts, because the
>   onchar=ids fonts property must be added here. And what happens if the
>   user has already defined several languages with babel, using the
>   current procedure: \usepackage[french, english, AUTO]{babel}?

For LaTeX specifically, `org-latex-language-alist', will be re-used to
map whatever is allowed in #+language keyword to its name in
babel/polyglossia.

Does it make sense?

> Therefore, the least complicated thing, in my opinion, is to leave the
> syntax of the keyword #+language as it is. It is not necessary for the
> user to explicitly define secondary non-latin languages. The idea is
> that Org is responsible for generating the necessary babel code by
> simply giving a command like enable font for X language. What we are
> talking about here is ensuring readability using a series of fonts that
> LaTeX does not load by default, not even LuaLaTeX. And, after all, Org
> is monolingual: it does not have multilingual support at the moment;
> that is, there is nothing in Org to switch languages in the middle of
> the document. What happens is that here we take advantage of the
> functionality that Babel has to automatically apply a font for a
> non-Latin language/script, also loading its properties (hyphen rules,
> captions, etc.).
>
> A new keyword #+latex_language could be created, which would understand
> the babel names, but I think it is unnecessary and would add more
> complexity. As I said before, defining the necessary fonts would be
> enough, since my idea in this is a basic practicality to ensure the
> readability of the documents. And anyone looking for more advanced
> functions would have to enter LaTeX code explicitly.

I think that we should move towards multi-language support.
Such support would practically simplify WORG and orgmode.org translation
process, and may also be used as a basis to allow translating the
Org manual.

My rough idea is to allow specifying language as affiliated
keyword and, in future, allow selective export to certain target
language.

Multi-language documents are another potential target to support.

>> #+latex_font[ancientgreek]: "Linux Libertine O" Scale=MatchLowercase
>>
>> #+latex_font[russian]: "FreeSerif" Numbers=Lowercase,Color=blue
>
> I like this idea, but with the exception that in the two examples you
> give the user is declaring two fonts for both languages. In my example
> there was also Arabic, where the default font for the Arabic script is
> used.

My idea was that

#+language: ancientgreek russian arabic

implies "use default font for arabic", unless #+latex_font is specified.

> #+latex_font[arabic]: "FreeSerif" Numbers=Lowercase,Color=blue
>
> This last syntax would also be valid to modify the main default fonts:
>
> #+latex_font[main]: "FreeSerif" Numbers=Lowercase
> #+latex_font[sans]: "some font"
> #+latex_font[mono]: "some font"
> #+latex_font[math]: "some font"
>
> A practical use case. Suppose a user has a document in Spanish, which
> includes passages in Greek and Russian. It would be enough to use the
> Old Standard font (included in TeX live) for the entire document,
> ensuring consistency:
>
> #+latex_header: \usepackage[AUTO]{babel}
> #+language:es
> #+latex_font[main,greek,russian]: Old Standard

Looks reasonable.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


  reply	other threads:[~2023-09-04  8:10 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-30  8:25 Fallback fonts in LaTeX export for non latin scripts Juan Manuel Macías
2023-08-31  8:17 ` Ihor Radchenko
2023-08-31 11:42   ` Juan Manuel Macías
2023-09-01  9:18     ` Ihor Radchenko
2023-09-02 21:39       ` Juan Manuel Macías
2023-09-03  7:22         ` Ihor Radchenko
2023-09-03 11:05           ` Juan Manuel Macías
2023-09-04  8:09             ` Ihor Radchenko [this message]
2023-09-04 22:22               ` Juan Manuel Macías
2023-09-05 10:44                 ` Ihor Radchenko
2023-09-20 14:03                   ` Juan Manuel Macías
2023-09-21  9:00                     ` Ihor Radchenko
2023-09-24 18:24                       ` Juan Manuel Macías
2023-09-26 10:37                         ` Ihor Radchenko
2023-09-05 16:42                 ` Max Nikulin
2023-09-05 18:33                   ` Juan Manuel Macías
2023-09-06  9:29                     ` Ihor Radchenko
2023-09-06 14:58                       ` Juan Manuel Macías
2023-09-07 10:22                         ` Ihor Radchenko
2023-09-07 12:04                           ` Juan Manuel Macías
2023-09-08  7:42                             ` Ihor Radchenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.orgmode.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87jzt6weae.fsf@localhost \
    --to=yantar92@posteo.net \
    --cc=emacs-orgmode@gnu.org \
    --cc=maciaschain@posteo.net \
    --cc=orgmode@tec.tecosaur.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).