emacs-orgmode@gnu.org archives
From: "András Simonyi" <andras.simonyi@gmail.com>
To: Ihor Radchenko <yantar92@posteo.net>
Cc: emacs-orgmode list <emacs-orgmode@gnu.org>
Subject: Re: [PATCH][oc-csl] Improve reference parsing
Date: Tue, 1 Nov 2022 16:02:57 +0100
Message-ID: <CAOWRwxCWPeNPjMEyH0f5SwR4M5psoj0NMZF8KHH_bHpVatBt5w@mail.gmail.com>
In-Reply-To: <87r0ytoqi6.fsf@localhost>

[-- Attachment #1: Type: text/plain, Size: 1604 bytes --]

Dear All,

On Thu, 27 Oct 2022 at 06:10, Ihor Radchenko <yantar92@posteo.net> wrote:
> This will render e.g. strike-through empty.
> Note that citation references may contain the following Org markup objects:
> '(bold code entity italic
>   latex-fragment strike-through subscript
>   superscript underline verbatim)

Thanks for pointing out the problem! I've attached a new version of
the patch, in which the custom export backend has a translator (in
many cases a trivial one) for every currently allowed object.
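
For readers who want to try the idea in isolation, here is a minimal
sketch (not the code from the patch). It builds a small backend with
`org-export-create-backend' and runs a parsed affix through
`org-export-data-with-backend'. The names `my/demo-affix-backend' and
`my/demo-export-affix' are made up for illustration, and the
hand-built INFO plist is an assumption; in oc-csl the real export
state is passed instead.

```elisp
(require 'ox)
(require 'oc)

;; Hypothetical demo backend covering only a few of the allowed objects.
(defconst my/demo-affix-backend
  (org-export-create-backend
   :transcoders
   '((bold . (lambda (_bold contents _info) (format "<b>%s</b>" contents)))
     (italic . (lambda (_italic contents _info) (format "<i>%s</i>" contents)))
     ;; No markup is emitted for strike-through: keep the text only.
     (strike-through . (lambda (_strike contents _info) contents))))
  "Demo backend, for illustration only.")

(defun my/demo-export-affix (string)
  "Parse STRING as a citation affix and export it with the demo backend."
  (org-export-data-with-backend
   ;; A non-nil second argument restricts parsing to affix objects.
   (org-cite-parse-objects string t)
   my/demo-affix-backend
   ;; Assumption: a minimal INFO plist that only enables emphasis markup.
   (list :with-emphasize t)))
```

Evaluating (my/demo-export-affix "see *important* +old+ note") should
show the strike-through text kept rather than rendered empty.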

> And we may add more, as discussed in
> https://orgmode.org/list/87k04xhhw3.fsf@localhost

I don't think that it would make much sense to add a lot more, with
the possible exception of links, since citations are at most
sentence-sized textual units, not to mention the possible
complications arising for the existing export processors. (What type
of objects could the various LaTeX-based exporters support without
complex changes?)  Since CSL has only a few types of formatting
attributes (font-style, font-variant, font-weight, text-decoration and
vertical-align), if the set of allowed objects is radically expanded,
then it will probably be more reasonable to define a derived backend,
maybe based on the ascii exporter, but I feel that the current set
doesn't require this solution.
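
To make the correspondence concrete, here is a rough sketch of how the
emphasis-like objects line up with those CSL attributes. The table and
the names `my/demo-csl-formatting' and `my/demo-csl-attribute' are
hypothetical and only reflect my reading of the CSL attribute values;
they are not part of the patch or of citeproc.

```elisp
;; Hypothetical table: which CSL formatting attribute, if any, could
;; carry each Org emphasis-like object.
(defconst my/demo-csl-formatting
  '((bold . (font-weight . "bold"))
    (italic . (font-style . "italic"))
    (underline . (text-decoration . "underline"))
    (subscript . (vertical-align . "sub"))
    (superscript . (vertical-align . "sup"))
    ;; CSL's text-decoration only knows "none" and "underline",
    ;; so strike-through has no counterpart.
    (strike-through . nil))
  "Demo mapping from Org object types to CSL formatting attributes.")

(defun my/demo-csl-attribute (type)
  "Return the (ATTRIBUTE . VALUE) pair for Org object TYPE, or nil."
  (alist-get type my/demo-csl-formatting))
```

This is also why a translator for strike-through can do little more
than pass its contents through.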

thanks & best wishes,

> --
> Ihor Radchenko // yantar92,
> Org mode contributor,
> Learn more about Org mode at <https://orgmode.org/>.
> Support Org development at <https://liberapay.com/org-mode>,
> or support my work at <https://liberapay.com/yantar92>

[-- Attachment #2: 0001-oc-csl.el-Improve-reference-parsing.patch --]
[-- Type: text/x-patch, Size: 6019 bytes --]

From 5dfbb8ef9291f906014800013cdb9a9d5569b728 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Andr=C3=A1s=20Simonyi?= <andras.simonyi@gmail.com>
Date: Wed, 26 Oct 2022 12:15:42 +0200
Subject: [PATCH] oc-csl.el: Improve reference parsing

* lisp/oc-csl.el (org-cite-csl--export-backend): New constant to
provide a trivial export back-end for exporting reference affixes and
locators with the simple html-based markup expected by citeproc.
(org-cite-csl--parse-reference): Do not construct the reference
locator and include it in the result, since citeproc does not make use
of it.  Start the suffix immediately after the locator's ending,
skipping the ending comma if necessary.  Use
`org-cite-csl--export-backend' to export reference affixes and
 lisp/oc-csl.el | 57 +++++++++++++++++++++++++++++++++++---------------
 1 file changed, 40 insertions(+), 17 deletions(-)

diff --git a/lisp/oc-csl.el b/lisp/oc-csl.el
index 1ccb74e92..1f40a9e8a 100644
--- a/lisp/oc-csl.el
+++ b/lisp/oc-csl.el
@@ -140,9 +140,10 @@
 (declare-function org-element-property "org-element" (property element))
 (declare-function org-element-put-property "org-element" (element property value))
-(declare-function org-export-data "org-export" (data info))
+(declare-function org-export-data-with-backend "org-export" (data backend info))
 (declare-function org-export-derived-backend-p "org-export" (backend &rest backends))
 (declare-function org-export-get-footnote-number "org-export" (footnote info &optional data body-first))
+(declare-function org-export-create-backend "org-export" (&key transcoders))
 ;;; Customization
@@ -310,8 +311,30 @@ If nil then the Chicago author-date style is used as a fallback.")
   "Regexp matching a label in a citation reference suffix.
 Label is in match group 1.")
+(defconst org-cite-csl--export-backend
+  (org-export-create-backend 
+   :transcoders
+   '((bold . (lambda (_bold contents _info) (format "<b>%s</b>" contents)))
+     (code . org-cite-csl--element-value)
+     (entity . (lambda (entity _contents _info)
+                 (format "\\%s" (org-element-property :name entity))))
+     (italic . (lambda (_italic contents _info) (format "<i>%s</i>" contents)))
+     (latex-fragment . org-cite-csl--element-value)
+     (plain-text . (lambda (contents _info) contents))
+     (strike-through . (lambda (_strike-through contents _info) contents))
+     (subscript . (lambda (_subscript contents _info) (format "<sub>%s</sub>" contents)))
+     (superscript . (lambda (_superscript contents _info) (format "<sup>%s</sup>" contents)))
+     (underline . (lambda (_underline contents _info)
+                    (format "<span class=\"underline\">%s</span>" contents)))
+     (verbatim . org-cite-csl--element-value)))
+  "Custom backend for exporting citation affixes and locators.")
 ;;; Internal functions
+(defun org-cite-csl--element-value (element _contents _info)
+  "Return the `:value' property of ELEMENT."
+  (org-element-property :value element))
 (defun org-cite-csl--barf-without-citeproc ()
   "Raise an error if Citeproc library is not loaded."
   (unless (featurep 'citeproc)
@@ -476,11 +499,10 @@ property in INFO."
 INFO is the export state, as a property list.
 The result is a association list.  Keys are: `id', `prefix',`suffix',
-`location', `locator' and `label'."
-  (let (label location-start locator-start location locator prefix suffix)
+`locator' and `label'."
+  (let (label location-start locator-start locator prefix suffix)
     ;; Parse suffix.  Insert it in a temporary buffer to find
-    ;; different parts: pre-label, label, locator, location (label +
-    ;; locator), and suffix.
+    ;; different parts: pre-label, label, locator, and suffix.
         (insert (org-element-interpret-data
@@ -506,12 +528,15 @@ The result is a association list.  Keys are: `id', `prefix',`suffix',
         (let ((re (rx (or "," (group digit)))))
           (when (re-search-backward re location-start t)
             (goto-char (or (match-end 1) (match-beginning 0)))
-            (setq location (buffer-substring location-start (point)))
-            (setq locator (org-trim (buffer-substring locator-start (point))))
+            (setq locator
+                  (org-cite-parse-objects
+                   (buffer-substring locator-start (point))
+                   t))
             ;; Skip comma in suffix.
+            (when (= (following-char) ?,) (forward-char))
             (setq suffix
-                   (buffer-substring (match-end 0) (point-max))
+                   (buffer-substring (point) (point-max))
       (setq prefix
@@ -525,18 +550,16 @@ The result is a association list.  Keys are: `id', `prefix',`suffix',
            (lambda (data)
-               ;; When Citeproc exports to Org syntax, avoid mix and
-               ;; matching output formats by also generating Org
-               ;; syntax for prefix and suffix.
-               (if (eq 'org (org-cite-csl--output-format info))
-                   (org-element-interpret-data data)
-                 (org-export-data data info)))))))
+               ;; Export the parsed prefix, suffix, and locator  
+               ;; with a custom backend that produces the simple
+               ;; html markup expected by citeproc.
+               (org-export-data-with-backend
+                data org-cite-csl--export-backend info))))))
       `((id . ,(org-element-property :key reference))
         (prefix . ,(funcall export prefix))
         (suffix . ,(funcall export suffix))
-        (locator . ,locator)
-        (label . ,label)
-        (location . ,location)))))
+        (locator . ,(funcall export locator))
+        (label . ,label)))))
 (defun org-cite-csl--create-structure (citation info)
   "Create Citeproc structure for CITATION object.
