From: Juan Manuel Macías
To: Ihor Radchenko
Cc: orgmode, Timothy
Subject: Re: Fallback fonts in LaTeX export for non latin scripts
In-Reply-To: <87jzt6weae.fsf@localhost> (Ihor Radchenko's message of "Mon, 04 Sep 2023 08:09:13 +0000")
Date: Mon, 04 Sep 2023 22:22:25 +0000
Message-ID: <87bkehshni.fsf@posteo.net>

Ihor Radchenko writes:

> Juan Manuel Macías writes:
>
>>> #+language: ancientgreek russian arabic
>>
>> Of course, this syntax would be the most appropriate and consistent
>> within Org. The problem is LaTeX, specifically babel, and that certain
>> inconsistencies would be created with the rest of the backends. At first
>> some pitfalls come to mind:
>>
>> - The keyword #+language accepts for now only language codes (es, en,
>>   el, ar, ru, etc.). Consistency with other backends should
>>   be maintained in this regard: ancientgreek is not a valid language
>>   code, but a name that only babel understands. If we put something
>>   like (a valid language code):
>>
>>   #+language: el-polyton
>>
>>   this could be translated in babel as polutonikogreek (in the classic
>>   syntax, that is, the languages that are loaded in the options of
>>   \usepackage[options]{babel}), or, in the new syntax, ancientgreek and
>>   polytonicgreek, which are actually two different languages: the first
>>   is ancient polytonic Greek and the second modern polytonic Greek. To
>>   add more confusion to the matter, in classical babel syntax
>>   greek.ancient and greek.polytonic are also supported. But neither of
>>   these things can be deduced by simply putting el-polyton, unless
>>   breaking the consistency with the other backends.
>
> I am now working on unifying Org translation system as discussed in
> https://orgmode.org/list/87o7iw8yem.fsf@bzg.fr
> As a part of the effort, I plan to introduce a new constant that will
> unify language abbreviations across Org and also associate them with
> more human-readable names.
>
> (defconst org-language-abbrevs
>   '(("am" . "Amharic")
>     ("ar" . "Arabic")
>     ("ast" . "Asturian")
>     ("bg" . "Bulgarian")
>     ("bn" . "Bengali")
>     ...))
>
> The idea is to allow
>
> #+language: Austrian German, Greek
> as a valid specifier, in addition to
>
> #+language: de-at, el
>
> Then, across Org, we will make use of the standardized language
> abbreviations.

Great! I think it's great news.

Yes, I agree with what you say below. I think Org should move towards
multilingual support that is 100% native to Org. That is, Org would have
its own "selectlanguage" mechanism, to be able to delimit text segments
in other languages and have control over them, both within Org and when
exporting to the different backends. That scenario seems very desirable
to me, and I would like to contribute my help to the best of my ability
(and time).

In LaTeX, as I mentioned, things are complicated. There is babel and
polyglossia, and there is LuaTeX and XeTeX. In addition, there is also
pdfTeX, which is still the default engine and (to be honest) is the
engine used by a high percentage of LaTeX users. Although perhaps things
will change soon to the detriment of LuaTeX.

Both babel and polyglossia could be supported, but that means more work,
more code, and more complications. And we are not sure that polyglossia
is no longer maintained. After all, babel is the official LaTeX package
for language support, and polyglossia appeared at a time when babel had
no support for the new Unicode engines. Now babel supports all of that
and is much more powerful, but its interface has also grown in
complexity. There is the problem of the double syntax for loading
languages: the old one, which loads traditional ldf files, and the
modern one (\babelprovide), which loads languages using ini files. The
modern one is more powerful, with more options, but it has added more
verbosity to babel.
I have taken advantage of \babelprovide, specifically its
"onchar=ids fonts" property, to automatically apply fonts to non-Latin
scripts.

>> I like this idea, but with the exception that in the two examples you
>> give the user is declaring two fonts for both languages. In my example
>> there was also Arabic, where the default font for the Arabic script is
>> used.
>
> My idea was that
>
> #+language: ancientgreek russian arabic
>
> implies "use default font for arabic", unless #+latex_font is specified.

This seems the most consistent to me for Org, but, as I mentioned in the
other email, I have some concerns. Currently, what we are talking about
is simply font support for non-Latin languages. If, in the current state
of things, #+language is allowed to accept a list of language names, we
may give the user a wrong impression: that of multilingual support that
does not exist as such. It is more like font support for non-Latin
languages, and only in LaTeX, and specifically in LuaLaTeX.

Furthermore, the user could mix languages that in babel are loaded
through ldf files with others loaded through ini files. For example,
something like this:

  #+language: spanish, english, french, russian

in babel would be:

  \usepackage[english,french,spanish]{babel}

and here we need \babelprovide for the font (and to load Russian via its
ini file):

  \babelprovide[onchar=ids fonts, import]{russian}
  \babelfont[russian]{rm}[options]{somefont}

Org would have to discern which name refers to a non-Latin language
(which wouldn't be complicated with the functionality you're working on)
and then apply the default font by adding a line with \babelprovide.

Of course, English, French and Spanish can also be loaded via ini files:

  \babelprovide[main,import]{spanish}
  \babelprovide[import]{french}
  \babelprovide[import]{english}

babel even supports:

  \usepackage[english,french,spanish,provide*=*]{babel}

but in that line we cannot put Russian with onchar, etc.
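
To make the mixed case above concrete, here is a minimal sketch of a
complete document, assuming LuaLaTeX and a reasonably recent babel. The
two font names are only placeholders I picked for illustration (any font
covering Cyrillic would do for Russian); they are not something Org or
babel chooses by default:

```latex
% Minimal sketch; compile with lualatex.
% Assumption: the fonts named below are installed. They are placeholders.
\documentclass{article}

% Classic syntax: these three are loaded from ldf files.
% The last language in the list (spanish) becomes the main one.
\usepackage[english,french,spanish]{babel}

% Modern syntax: Russian is loaded from its ini file. With
% "onchar=ids fonts" babel switches language and font automatically
% whenever it meets a character of the Cyrillic script.
\babelprovide[import, onchar=ids fonts]{russian}

% Default roman font, and the roman font used only for Russian.
\babelfont{rm}{Latin Modern Roman}
\babelfont[russian]{rm}{FreeSerif}

\begin{document}
Un texto en español con una palabra rusa, русский, sin marcado explícito.
\end{document}
```

The Russian word carries no explicit markup; the font switch comes only
from the onchar setting, which is the behaviour an Org keyword would have
to generate.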

And then there is pdfTeX, where only the classic babel syntax is allowed,
without any "*provide".

In short, I find everything very confusing. I am not opposed to doing it
as you propose (in fact, it is the option I like the most, especially
once Org becomes polyglot in the future), but I also want to warn of
possible complications. Therefore, since we are, for now, dealing only
with fonts for non-Latin languages, I think it should be made clear that
the keyword is about fonts (and about LuaLaTeX). Maybe through two
keywords:

  #+lualatex_fonts_for: language(s)
  #+lualatex_fonts[language(s)]: "font" options

?

I think it's ugly, but I can't think of anything else.

By the way, and as a side note, is it currently possible in Org to define
a keyword of the style #+foo[anything] within :options-alist, or would
something like org-collect-keywords have to be modified?

-- 
Juan Manuel Macías

https://juanmanuelmacias.com
https://lunotipia.juanmanuelmacias.com
https://gnutas.juanmanuelmacias.com