From: Ihor Radchenko
To: Juan Manuel Macías
Cc: orgmode, Timothy
Subject: Re: Fallback fonts in LaTeX export for non latin scripts
Date: Mon, 04 Sep 2023 08:09:13 +0000
Message-ID: <87jzt6weae.fsf@localhost>
In-Reply-To: <87il8ra554.fsf@posteo.net>

Juan Manuel Macías writes:

>> #+language: ancientgreek russian arabic
>
> Of course, this syntax would be the most appropriate and consistent
> within Org.
> The problem is LaTeX, specifically babel, and that certain
> inconsistencies would be created with the rest of the backends. At first
> some pitfalls come to mind:
>
> - The keyword #+language accepts for now only language codes (es, en,
>   el, ar, ru, etc.). Consistency with other backends should
>   be maintained in this regard: ancientgreek is not a valid language
>   code, but a name that only babel understands. If we put something
>   like (a valid language code):
>
>   #+language: el-polyton
>
>   this could be translated in babel as polutonikogreek (in the classic
>   syntax, that is, the languages that are loaded in the options of
>   \usepackage[options]{babel}), or, in the new syntax, ancientgreek and
>   polytonicgreek, which are actually two different languages: the first
>   is ancient polytonic Greek and the second modern polytonic Greek. To
>   add more confusion to the matter, in classical babel syntax
>   greek.ancient and greek.polytonic are also supported. But neither of
>   these things can be deduced by simply putting el-polyton, unless
>   breaking the consistency with the other backends.

I am now working on unifying the Org translation system, as discussed in
https://orgmode.org/list/87o7iw8yem.fsf@bzg.fr

As a part of the effort, I plan to introduce a new constant that will
unify language abbreviations across Org and also associate them with
more human-readable names.

(defconst org-language-abbrevs
  '(("am" . "Amharic")
    ("ar" . "Arabic")
    ("ast" . "Asturian")
    ("bg" . "Bulgarian")
    ("bn" . "Bengali")
    ...))

The idea is to allow

#+language: Austrian German, Greek

as a valid specifier, in addition to

#+language: de-at, el

Then, across Org, we will make use of the standardized language
abbreviations.

> - Added to this is that Babel has two ways to load languages: the
>   classic syntax and the \babelprovide command, which is the one we are
>   interested in here for languages with non-Latin scripts, because the
>   onchar=ids fonts property must be added here.
>   And what happens if the user has already defined several languages
>   with babel, using the current procedure:
>   \usepackage[french, english, AUTO]{babel}?

For LaTeX specifically, `org-latex-language-alist' will be re-used to map
whatever is allowed in the #+language keyword to its name in
babel/polyglossia. Does it make sense?

> Therefore, the least complicated thing, in my opinion, is to leave the
> syntax of the keyword #+language as it is. It is not necessary for the
> user to explicitly define secondary non-latin languages. The idea is
> that Org is responsible for generating the necessary babel code by
> simply giving a command like enable font for X language. What we are
> talking about here is ensuring readability using a series of fonts that
> LaTeX does not load by default, not even LuaLaTeX. And, after all, Org
> is monolingual: it does not have multilingual support at the moment;
> that is, there is nothing in Org to switch languages in the middle of
> the document. What happens is that here we take advantage of the
> functionality that Babel has to automatically apply a font for a
> non-Latin language/script, also loading its properties (hyphen rules,
> captions, etc.).
>
> A new keyword #+latex_language could be created, which would understand
> the babel names, but I think it is unnecessary and would add more
> complexity. As I said before, defining the necessary fonts would be
> enough, since my idea in this is a basic practicality to ensure the
> readability of the documents. And anyone looking for more advanced
> functions would have to enter LaTeX code explicitly.

I think that we should move towards multi-language support. Such support
would practically simplify the WORG and orgmode.org translation process,
and may also be used as a basis to allow translating the Org manual.

My rough idea is to allow specifying the language as an affiliated
keyword and, in the future, to allow selective export to a certain
target language.
Multi-language documents are another potential target to support.

>> #+latex_font[ancientgreek]: "Linux Libertine O" Scale=MatchLowercase
>>
>> #+latex_font[russian]: "FreeSerif" Numbers=Lowercase,Color=blue
>
> I like this idea, but with the exception that in the two examples you
> give the user is declaring two fonts for both languages. In my example
> there was also Arabic, where the default font for the Arabic script is
> used.

My idea was that #+language: ancientgreek russian arabic implies "use
default font for arabic", unless #+latex_font is specified.

> #+latex_font[arabic]: "FreeSerif" Numbers=Lowercase,Color=blue
>
> This last syntax would also be valid to modify the main default fonts:
>
> #+latex_font[main]: "FreeSerif" Numbers=Lowercase
> #+latex_font[sans]: "some font"
> #+latex_font[mono]: "some font"
> #+latex_font[math]: "some font"
>
> A practical use case. Suppose a user has a document in Spanish, which
> includes passages in Greek and Russian. It would be enough to use the
> Old Standard font (included in TeX live) for the entire document,
> ensuring consistency:
>
> #+latex_header: \usepackage[AUTO]{babel}
> #+language:es
> #+latex_font[main,greek,russian]: Old Standard

Looks reasonable.

--
Ihor Radchenko // yantar92,
Org mode contributor