Subject: Re: [wip-cite-new] Adjust punctuation around citations
From: Denis Maier
To: Bruce D'Arcus
Cc: Org Mode List, Nicolas Goaziou
Date: Mon, 14 Jun 2021 13:45:23 +0200
Message-ID: <48377c85-5891-3658-d35b-cba1358878a8@mailbox.org>

Below a few examples of what I mean.

WDYT? Am I missing something?

Denis
===========================================================
#+cite_export: csl
#+cite_export: csl "C:/Users/denis/Zotero/styles/chicago-note-bibliography.csl"
#+bibliography: test.bib

* Original source

"A quotation ending with a period."

"A quotation ending without punctuation"

* Author-date style input (= semantically non-strict input)

"A quotation ending with a period" [cite: @hoel-71-whole].

"A quotation ending without punctuation" [cite: @hoel-71-whole].

** author-date output with language: en-us
Expected: "A quotation ending with a period" (Hoel 1971).
Actual:   "A quotation ending with a period" (Hoel 1971).

Expected: "A quotation ending without punctuation" (Hoel 1971).
Actual:   "A quotation ending without punctuation" (Hoel 1971).

=> ok

** author-date output with language: de
Expected: "A quotation ending with a period" (Hoel 1971).
Actual:   "A quotation ending with a period" (Hoel 1971).

Expected: "A quotation ending without punctuation" (Hoel 1971).
Actual:   "A quotation ending without punctuation" (Hoel 1971).

=> ok

** note style output with language: en-us
Expected: "A quotation ending with a period."[1]
Actual:   "A quotation ending with a period."[1]

Expected: "A quotation ending without punctuation."[1]
Actual:   "A quotation ending without punctuation."[1]

=> ok

** note style output with language: en-gb or de
Expected: "A quotation ending with a period."[1]
Actual:   "A quotation ending with a period".[1]

Expected: "A quotation ending without punctuation".[1]
Actual:   "A quotation ending without punctuation".[1]

=> Here, we cannot distinguish between the two cases as we don't know whether punctuation appears in the original source.
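
To make the ambiguity concrete, here is a minimal Python sketch. It is not Org code, and the helper name `to_non_strict` is hypothetical. It rewrites a quotation the way the author-date style input above is written, and shows that two different sources end up as the same string.

```python
def to_non_strict(quotation: str, cite: str) -> str:
    """Write a quotation in author-date style: citation first, period last.

    Hypothetical helper for illustration only; it is not part of Org.
    """
    # Drop a period that sits just inside the closing quotation mark.
    if quotation.endswith('."'):
        quotation = quotation[:-2] + '"'
    # The sentence-final period always goes after the citation.
    return f"{quotation} {cite}."


cite = "[cite: @hoel-71-whole]"

# Same words, but only the first source has a period of its own.
sources = ['"Some words."', '"Some words"']
results = {to_non_strict(s, cite) for s in sources}

# Two distinct sources, one non-strict form: the original period is lost.
assert len(results) == 1
```

Once both sources have collapsed into one input string, no export rule can recover which one had the period.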

* Note style input (=semantically strict input)

"A quotation ending with a period." [cite: @hoel-71-whole]

"A quotation ending without punctuation". [cite: @hoel-71-whole]

As the input preserves the location of punctuation in the original material, I'd say it should be much easier to deal with this. We don't have to add information which isn't in the input, but rather we'll just have to move any punctuation to after the citation object. Maybe I'm missing something, but to me this looks like a much simpler operation than going in the opposite direction.

Maybe we should stop talking about author-date vs. note style input and talk about strict vs. non-strict input instead. I think that's the whole issue: going from strict to non-strict is easy, while the other way is more complicated; at the least, it would require more effort to support the last case (going from non-strict input to note style output with a language that requires strict output).
=========================================================================
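
A rough Python sketch of the strict-input direction, under stated assumptions: the regular expression, the function `render`, the three style names, and the fixed `(Hoel 1971)` and `[1]` renderings are all made up for illustration and only cover the two example lines above. This is not how Org implements it. The point is that every output can be produced by moving or dropping punctuation that is already in the input.

```python
import re

# Strict input: a quotation, a period either inside or outside the closing
# quotation mark (or none), then the citation object.
STRICT = re.compile(
    r'^"(?P<text>.*?)(?P<inner>\.?)"(?P<outer>\.?) \[cite: @(?P<key>[^\]]+)\]$'
)


def render(line: str, style: str, rendered: str = "(Hoel 1971)") -> str:
    """Render one strict-input line; all style names are hypothetical."""
    m = STRICT.match(line)
    if m is None:
        raise ValueError(f"not strict input: {line!r}")
    text, inner, outer = m["text"], m["inner"], m["outer"]
    if style == "author-date":
        # Move the period after the citation; the inner/outer distinction is dropped.
        return f'"{text}" {rendered}.'
    if style == "note-strict":
        # E.g. en-gb or de: keep the period exactly where the source had it.
        return f'"{text}{inner}"{outer}[1]'
    if style == "note-inside":
        # E.g. en-us: the period always goes inside the quotation marks.
        return f'"{text}."[1]'
    raise ValueError(f"unknown style: {style!r}")


with_period = '"A quotation ending with a period." [cite: @hoel-71-whole]'
without_period = '"A quotation ending without punctuation". [cite: @hoel-71-whole]'

for style in ("author-date", "note-strict", "note-inside"):
    print(style, "|", render(with_period, style), "|", render(without_period, style))
```

With these two inputs, the `note-strict` branch gives the "Expected" lines of the en-gb/de case, which the non-strict input cannot produce.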

On 14.06.2021 at 00:47, Bruce D'Arcus wrote:
I'll let you two sort it out; I don't have a position.

On Sun, Jun 13, 2021, 3:23 PM Denis Maier <denismaier@mailbox.org> wrote:

Bruce D'Arcus <bdarcus@gmail.com> wrote on 14.06.2021 00:04:

> Nicolas explained the reverse is out of scope,

IIRC, it was out of scope ATM.

> and gave a reasonable explanation why (because much harder to reconstruct missing information IIRC).

That's where I disagree. I think the opposite is true.

On Sun, Jun 13, 2021, 2:54 PM Denis Maier <denismaier@mailbox.org> wrote:
On 12.06.2021 at 11:39, Nicolas Goaziou wrote:
> Hello,
>
> Denis Maier <denismaier@mailbox.org> writes:
>
>> Yes, good this is coming.
>
> As a step forward, I rebased wip-cite-new branch with more support for
> note numbers handling.
>
> I added three customizable variables:
>
> - org-cite-adjust-note-numbers, which simply allows the user to toggle
>    punctuation and note number moving (on by default).
>
> - org-cite-note-rules, which defines what rules to apply according to
>    locale, expressed as a language tag, as in RFC 4646.
>
> - org-cite-punctuation-marks, which lists strings recognized as
>    punctuation in the process.
>
> `csl' and `basic' processors now both make use of this.
>
> I'd appreciate some feedback, in particular about the docstrings of the
> variables above. I focused on the "note numbers" topic instead of
> "punctuation" since I found the latter too generic.
>
> Also, there are some points that may need to be discussed:
>
> - I'm not sure about the `org-cite-punctuation-marks' variable being
>    global, i.e., not locale-specific.
>
> - There is no support for this in LaTeX-derived back-ends, because
>    I don't know when a citation is going to become a footnote. As
>    a reminder, there is no "\footcite" command in `biblatex' processor.
>    OTOH, users might prefer using a more advanced mechanism, e.g.,
>    csquotes.
>
> - It doesn't do anything special in quote blocks, because I'm still not
>    sure there is something to do. AFAIU, special casing there only
>    applies to author-date location, which is out of the scope of this code.
>
> WDYT?

Ok, I've managed to test this a bit, and I think this looks pretty good
so far.

The only question I'd still have is if this could somehow also cover the
reverse situation (going from a note style to author-date). I've noticed
that simply adding a new language rule doesn't work anymore---as opposed
to my initial tests with earlier iterations of that mechanism. Seems
like this mechanism is now only triggered when using a note-based style.

Best,
Denis

>
> Regards,
>

