From: András Simonyi
Date: Wed, 2 Nov 2022 18:58:16 +0100
Subject: Re: [PATCH][oc-csl] Improve reference parsing
To: Ihor Radchenko
Cc: emacs-orgmode list
References: <87r0ytoqi6.fsf@localhost> <87k04dlvie.fsf@localhost>
In-Reply-To: <87k04dlvie.fsf@localhost>

Dear All,

On Wed, 2 Nov 2022 at 07:28, Ihor Radchenko wrote:

> I do not think that CSL limitations are really limiting us.
>
> - Allowing macros will be handled by ox.el itself automatically
> - Export snippets can also be processed without much issue (consider
>   direct LaTeX code)
> - inline-babel-call and inline src blocks may be useful with :exports
>   results when some auto-generation of text is needed. They will also be
>   handled automatically by ob-exp.
> - latex-fragments are either equivalent to direct LaTeX or to inserting
>   an image
> - timestamps could be exported as text, although I do not see any
>   obvious utility of timestamps inside references.

I'm not really familiar with the internals of the Org exporter but,
looking at the ox.el code, macros and babel calls are processed and
resolved before processing citations, so they seemingly have no
bearing on the org-cite-csl--parse-reference function my patch is
concerned with.

> However, oc-csl should not ignore the export processor to support all
> the above. I am not sure why you need a dedicated export processor
> instead of passing the string to current processor (or derivative)
> instead.
> If you really need to mark certain constructs specially for CSL, you can
> create a derived export backend for the current backend and replace the
> transcoders for the object types that must be treated specially.

As for the remaining object types (timestamps, LaTeX fragments, etc.),
the problem is that citeproc-el expects and needs the affixes and the
locator to be passed in the very limited HTML-like markup supported by
CSL (see https://www.zotero.org/support/kb/rich_text_bibliography for a
rudimentary description) and, crucially, assumes that everything else
is plain text, which, if necessary, will be escaped according to the
target format; e.g., '$' signs are escaped by citeproc-el's own LaTeX
formatter. The reason for this limitation is that the affixes and
especially the locator have to be parsed into citeproc-el's internal
rich-text representation for further processing according to the CSL
style in use. (Affixes are only concatenated to other elements, but
locators can be subject to any type of formatting.)

As a consequence, I think the only real alternatives are using a custom
backend, as I do in the current patch, or a backend derived from the
plain text Org exporter. I don't have a strong preference as to which
solution we choose; I just went with the seemingly more minimalist
option. (The proper way of dealing with LaTeX fragments in this
context, in particular with LaTeX math fragments, would be to support
them in citeproc-el's internal representation and markup, which is
planned but not implemented yet.)

> > +(defconst org-cite-csl--export-backend
> > +  (org-export-create-backend
> > +   :transcoders
> > +   '((bold . (lambda (_bold contents _info) (format "%s" contents)))
> > +     (code . org-cite-csl--element-value)
> > +     (entity . (lambda (entity _contents _info)
> > +                 (format "\\%s" (org-element-property :name entity))))
>
> Why :name, but not :html?

Good point. Thinking about it a bit more, :utf-8 would probably be a
slightly better solution (in keeping with citeproc-el's 'plain text'
requirement). I will change this once we have sorted out the other
details.

best wishes,
András
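
For illustration only, here is a minimal sketch of what an entity
transcoder returning the :utf-8 property could look like in a small
custom backend. The names my-csl-sketch-backend and
my-csl-sketch-export are hypothetical and are not part of the patch;
passing nil as INFO is a simplification, and the real backend would
need transcoders for the other supported object types as well.

```emacs-lisp
(require 'ox)

;; Hypothetical name, for illustration only; this is not the backend
;; defined in the patch.
(defconst my-csl-sketch-backend
  (org-export-create-backend
   :transcoders
   '((entity . (lambda (entity _contents _info)
                 ;; Return the entity's UTF-8 representation instead of
                 ;; re-emitting its Org name with a leading backslash.
                 (org-element-property :utf-8 entity))))))

(defun my-csl-sketch-export (string)
  "Export STRING, parsed as Org objects, with `my-csl-sketch-backend'.
Hypothetical helper; only entities are recognized when parsing, and
INFO is passed as nil for brevity."
  (org-export-data-with-backend
   (org-element-parse-secondary-string string '(entity))
   my-csl-sketch-backend
   nil))
```

Under these assumptions, evaluating (my-csl-sketch-export "\\alpha")
is intended to produce the Unicode character for the entity, while
plain text passes through unchanged, so citeproc-el would receive
ordinary text that it can escape for the target format itself.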