emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
From: "Juan Manuel Macías" <maciaschain@posteo.net>
To: orgmode <emacs-orgmode@gnu.org>
Cc: Ihor Radchenko <yantar92@posteo.net>, Timothy <orgmode@tec.tecosaur.net>
Subject: Re: Fallback fonts in LaTeX export for non latin scripts
Date: Sat, 02 Sep 2023 21:39:16 +0000	[thread overview]
Message-ID: <878r9ocl17.fsf@posteo.net> (raw)
In-Reply-To: <877cpatfol.fsf@localhost> (Ihor Radchenko's message of "Fri, 01 Sep 2023 09:18:18 +0000")

[-- Attachment #1: Type: text/plain, Size: 1551 bytes --]

Finally I can upload some usable code here, in this case to be able to
load and manage fonts for languages with non-Latin scripts, through
babel and fontspec (in LuaLaTeX). It is an attempt to simplify from Org
the multiform syntax of babel + fontspec. Of course, it is more limited,
but for regular use I think it may be enough.

Since this code is mostly a proof of concept and the names of many
things (and the things themselves) are still tentative, I thought it
would be more useful to attach it in an *.el file, rather than a regular
patch. Loading that file everything should work fine. I also attach an
org document with some examples of use. In any case, there are more
explanations inside the .el file.

One of the big problems I have encountered when trying to create a
"(LaTeX) Babel interface in Org" is the *horrible* multiplicity that
Babel has for language names. That is the reason for the :babel-alt
property in 'org-latex-language-alist', which collects the names that
babel supports for \babelprovide, which are not always the same as the
'classic' babel syntax.

Finally, I find this way more useful (that is, loading fonts with
language support), instead of a fallback font system based only on the
Unicode scripts. It is less 'automatic', but more precise, and it also
does not require much 'specialized' intervention on the part of the
user.

Best regards,

-- 
Juan Manuel Macías

https://juanmanuelmacias.com

https://lunotipia.juanmanuelmacias.com

https://gnutas.juanmanuelmacias.com


[-- Attachment #2: test-lang.org --]
[-- Type: application/vnd.lotus-organizer, Size: 2866 bytes --]

[-- Attachment #3: unicode-font-support.el --]
[-- Type: text/plain, Size: 8721 bytes --]

;; -*- lexical-binding: t; -*-

;; A proof of concept for Unicode font support in LaTeX export, using
;; babel and fontspec, with luatex as the default compiler.

;; Use example:

;; It is not necessary to load languages with non-Latin alphabet in babel options:
;; #+LaTeX_Header: \usepackage[AUTO]{babel}

;; Languages and fonts (there may be multiple lines):

;; #+LaTeX_Header: % !enable-fonts-for ancientgreek:Linux Libertine O(Scale=MatchLowercase)
;; #+LaTeX_Header: % !enable-fonts-for russian:FreeSerif(Numbers=Lowercase,Color=blue) :: arabic

;; Explanation:

;; - lang = enable default font for lang
;; - lang:font = enable font for lang in current document
;; - lanf:font(options) = enable font for lang in this document with options
;; - :: = separator


;; code

;;  This is supposed to be a defcustom.

(setq org-latex-uc-fonts-support t)

;; A mini version of `org-latex-language-alist', for this proof of
;; concept. Babel uses various names for languages. The ones that
;; interest us here are those collected in `:babel-alt', which is
;; always a list. The names sometimes match the `classic' babel name
;; and other times they don't. And in the case of "el-polyton" there
;; are two possible names. For a list of these names see:
;; [[https://CTAN/macros/latex/required/babel/base/babel.pdf]],
;; p. 22.

(defconst org-latex-language-alist
  '(("en"  :babel "american" :babel-alt ("english-unitedstates") :polyglossia "english" :polyglossia-variant "usmax" :lang-name "English" :script "latin" :code "latn")
    ("ar" :babel "arabic" :babel-alt ("arabic") :polyglossia "arabic" :lang-name "Arabic" :script "arabic" :code "arab")
    ("el"  :babel "greek" :babel-alt ("greek") :polyglossia "greek" :lang-name "Greek" :script "greek" :code "grk")
    ("el-polyton" :babel "polutonikogreek" :babel-alt ("ancientgreek" "polytonicgreek") :polyglossia "greek" :polyglossia-variant "polytonic" :lang-name "Polytonic Greek" :script "greek" :code "grk")
    ("ru"  :babel "russian" :babel-alt ("russian") :polyglossia "russian" :lang-name "Russian" :script "cyrillic" :code "cyrl"))
  "TODO")

;; This is supposed to be a defcustom for the main fonts. `'default'
;; means 'use the main default fonts'. Otherwise, the value must be
;; a plist. Valid props. are:

;; - :main = roman font
;; - :sans = sans font
;; - :mono = mono font
;; - :math = math font
;; - :...-options = font options

;; For the font options and the fontspec package syntax, see
;; [[https://CTAN/macros/unicodetex/latex/fontspec/fontspec.pdf]]

(setq org-latex-uc-fonts-support-default-main-fonts
      '(:main "FreeSerif" :mono "inconsolatan" :mono-options "Scale=0.95"))

;; This is supposed to be a defcustom. Each element has the structure:
;; script - font - (optional) font options

(setq org-latex-uc-fonts-support-default-scripts-fonts
      '(("greek" "Linux Libertine")
	("cyrillic" "Old Standard")
	("arabic" "FreeSerif")))

;; Get main fonts (declared in
;; `org-latex-uc-fonts-support-default-main-fonts')

(defun org-latex-uc-fonts-support-get-main-fonts (plist prop)
  (let ((format))
    (if (not
	 (plist-member plist prop))
	(ignore)
      (let* ((value (plist-get plist prop))
	     (prop-name
	      (replace-regexp-in-string ":" "" (symbol-name prop)))
	     (options (plist-get
		       plist
		       (intern
			(format
			 ":%s-options"
			 prop-name)))))
	(setq format
	      (format
	       "\\\\set%sfont{%s}[%s]"
	       prop-name value
	       (if options options "")
	       ))))
    format))

;; get non latin fonts explicitly added

(defun org-latex-uc-fonts-support-get-fonts-other-languages (header)
  (interactive)
  (let ((format-str)
	(lines))
    (with-temp-buffer
      (insert header)
      (save-excursion
	(goto-char (point-min))
	(while (re-search-forward "%\s+!enable-fonts-for\s+\\(.+\\)" nil t)
	  (add-to-list 'lines (match-string 1)))))
    (let* ((lines-list
	    (mapcar
	     (lambda (x)
	       (split-string x "::"))
	     lines))
	   (flat (flatten-list lines-list))
	   (format-list (mapcar
			 (lambda (x)
			   (org-latex-uc-fonts-support-format-font-for-language (string-trim x)))
			 flat)))
      (setq format-str (mapconcat #'identity format-list "\n\n")))
    format-str))

;; format each lang/font

(defun org-latex-uc-fonts-support-format-font-for-language (lang)
  (let* ((regexp "\\([^:]+\\):*\\([^()]*\\)(*\\([^()]*\\))*")
	 (lang-name (when (string-match regexp lang)
		      (match-string 1 lang)))
	 (lang-explicit-font (when (string-match regexp lang)
			       (match-string 2 lang)))
	 (lang-explicit-font-opts (when (string-match regexp lang)
				    (match-string 3 lang)))
	 (lang-alias (let ((candidato))
		       (mapc (lambda (x)
			       (when (member :babel-alt x)
				 (let* ((plist (cdr x))
					(babel-alt (plist-get plist :babel-alt)))
				   (when (member lang-name babel-alt)
				     (setq candidato (car x))))))
			     org-latex-language-alist)
		       candidato))
	 (plist (cdr (assoc lang-alias org-latex-language-alist)))
	 (script (plist-get plist :script))
	 (default-script-font (assoc script org-latex-uc-fonts-support-default-scripts-fonts))
	 (default-font (nth 1 default-script-font))
	 (default-font-options (nth 2 default-script-font))
	 (default-font-options? (if default-font-options
				    default-font-options
				  "")))
    (format
     "\\\\babelprovide[onchar=ids fonts]{%s}\n
    \\\\babelfont[%s]{rm}[%s]{%s}\n"
     lang-name
     lang-name
     (if (not (equal lang-explicit-font-opts "")) lang-explicit-font-opts default-font-options?)
     (if (not (equal lang-explicit-font "")) lang-explicit-font default-font))))

;; make preamble definitions. This is supposed to be part of
;; `org-latex-guess-babel-language', as in the modified version below

(defun org-latex-uc-fonts-support-make-preamble (header)
  (let* ((main-fonts (unless (eq 'org-latex-uc-fonts-support-default-main-fonts 'default)
		       (mapconcat #'identity
				  (cl-remove-if-not #'identity
						    (mapcar
						     (lambda (elt)
						       (let ((str (org-latex-uc-fonts-support-get-main-fonts
								   org-latex-uc-fonts-support-default-main-fonts
								   elt)))
							 (when str str)))
						     (list :main :sans :mono :math)))
				  "\n")))
	 (other-fonts-per-language
	  (org-latex-uc-fonts-support-get-fonts-other-languages header))
	 (preamble (with-temp-buffer
		     (insert "\n\n")
		     (when main-fonts
		       (insert main-fonts))
		     (insert "\n\n")
		     (when other-fonts-per-language
		       (insert other-fonts-per-language))
		     (buffer-string))))
    preamble))

(defun org-latex-guess-babel-language (header info)
  "Modified version for this proof of concept"
  (let* ((language-code (plist-get info :language))
	 (plist (cdr
		 (assoc language-code org-latex-language-alist)))
	 (language (plist-get plist :babel))
	 (language-ini-only (plist-get plist :babel-ini-only))
	 ;; If no language is set, or Babel package is not loaded, or
	 ;; LANGUAGE keyword value is a language served by Babel
	 ;; exclusively through ini files, return HEADER as-is.
	 (header (if (or language-ini-only
			 (not (stringp language-code))
			 (not (string-match "\\\\usepackage\\[\\(.*\\)\\]{babel}" header)))
		     header
		   (let ((options (save-match-data
				    (org-split-string (match-string 1 header) ",[ \t]*"))))
		     ;; If LANGUAGE is already loaded, return header
		     ;; without AUTO.  Otherwise, replace AUTO with language or
		     ;; append language if AUTO is not present.  Languages that are
		     ;; served in Babel exclusively through ini files are not added
		     ;; to the babel argument, and must be loaded using
		     ;; `\babelprovide'.
		     (replace-match
		      (mapconcat (lambda (option) (if (equal "AUTO" option) language option))
				 (cond ((member language options) (delete "AUTO" options))
				       ((member "AUTO" options) options)
				       (t (append options (list language))))
				 ", ")
		      t nil header 1)))))
    ;;; adition:
    (when org-latex-uc-fonts-support
      (setq header (let ((form (org-latex-uc-fonts-support-make-preamble header)))
		     (replace-regexp-in-string
		      "\\(\\\\usepackage\\[?.*\\]?{babel}\\)"
		      (format "\n\\\\usepackage{fontspec}\n\n\\1\n%s" form)
		      header))))
    ;;;
    ;; If `\babelprovide[args]{AUTO}' is present, AUTO is
    ;; replaced by LANGUAGE.
    (if (not (string-match "\\\\babelprovide\\[.*\\]{\\(.+\\)}" header))
	header
      (let ((prov (match-string 1 header)))
	(if (equal "AUTO" prov)
	    (replace-regexp-in-string (format
				       "\\(\\\\babelprovide\\[.*\\]\\)\\({\\)%s}" prov)
				      (format "\\1\\2%s}"
					      (or language language-ini-only))
				      header t)
	  header)))))

  reply	other threads:[~2023-09-02 21:40 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-30  8:25 Fallback fonts in LaTeX export for non latin scripts Juan Manuel Macías
2023-08-31  8:17 ` Ihor Radchenko
2023-08-31 11:42   ` Juan Manuel Macías
2023-09-01  9:18     ` Ihor Radchenko
2023-09-02 21:39       ` Juan Manuel Macías [this message]
2023-09-03  7:22         ` Ihor Radchenko
2023-09-03 11:05           ` Juan Manuel Macías
2023-09-04  8:09             ` Ihor Radchenko
2023-09-04 22:22               ` Juan Manuel Macías
2023-09-05 10:44                 ` Ihor Radchenko
2023-09-20 14:03                   ` Juan Manuel Macías
2023-09-21  9:00                     ` Ihor Radchenko
2023-09-24 18:24                       ` Juan Manuel Macías
2023-09-26 10:37                         ` Ihor Radchenko
2023-09-05 16:42                 ` Max Nikulin
2023-09-05 18:33                   ` Juan Manuel Macías
2023-09-06  9:29                     ` Ihor Radchenko
2023-09-06 14:58                       ` Juan Manuel Macías
2023-09-07 10:22                         ` Ihor Radchenko
2023-09-07 12:04                           ` Juan Manuel Macías
2023-09-08  7:42                             ` Ihor Radchenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.orgmode.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878r9ocl17.fsf@posteo.net \
    --to=maciaschain@posteo.net \
    --cc=emacs-orgmode@gnu.org \
    --cc=orgmode@tec.tecosaur.net \
    --cc=yantar92@posteo.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).