emacs-orgmode@gnu.org archives
* [PATCH] oc-csl: New custom option `org-cite-csl-sentence-case-bibtex-titles'
@ 2024-05-11 15:33 András Simonyi
  2024-05-11 17:26 ` Ihor Radchenko
  2024-05-15 11:48 ` Max Nikulin
  0 siblings, 2 replies; 6+ messages in thread
From: András Simonyi @ 2024-05-11 15:33 UTC (permalink / raw)
  To: emacs-orgmode list

[-- Attachment #1: Type: text/plain, Size: 880 bytes --]

Dear All,

since bibtex and biblatex require title fields to be in title case
but CSL assumes that they are in sentence-case, citeproc-el converts
title fields in bib(la)tex bibliography databases into sentence-case
before processing them except for entries with an explicit non-English
langid value. Although this seems to be reasonable behaviour, there
were several requests in the last couple of years to make it possible
to turn this conversion off (see, e.g., citeproc issues  #119 and
#142). The attached patch introduces a new custom option to configure
when to perform the conversion.

I'm a bit unsure about naming the option:
Perhaps `org-cite-csl-sentence-case-bibtex-titles-without-langid'
would be more precise but I found it absurdly long and technical, as
most users are probably unaware of the existence of langid fields.

best wishes,
András

[-- Attachment #2: 0001-oc-csl-New-custom-option-org-cite-csl-sentence-case-.patch --]
[-- Type: text/x-patch, Size: 2771 bytes --]

From 031428611c18bb4d97bbdbd11a7549ca2b96ccec Mon Sep 17 00:00:00 2001
From: Andras Simonyi <andras.simonyi@gmail.com>
Date: Sat, 11 May 2024 11:20:41 +0200
Subject: [PATCH] oc-csl: New custom option
 `org-cite-csl-sentence-case-bibtex-titles'

* lisp/oc-csl.el (org-cite-csl-sentence-case-bibtex-titles): New
variable.
(org-cite-csl--processor): Create the itemgetter using the new option.
* etc/ORG-NEWS (New option
~org-cite-csl-sentence-case-bibtex-titles~): Announce the change.
---
 etc/ORG-NEWS   |  9 +++++++++
 lisp/oc-csl.el | 18 +++++++++++++++++-
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/etc/ORG-NEWS b/etc/ORG-NEWS
index 36eeddda1..9ca71d0a2 100644
--- a/etc/ORG-NEWS
+++ b/etc/ORG-NEWS
@@ -1128,6 +1128,15 @@ blocks that do not specify any ~:formatter~ parameter. Its default
 value (the new function ~org-columns-dblock-write-default~) yields the
 previous (fixed) formatting behaviour.
 
+*** New option ~org-cite-csl-sentence-case-bibtex-titles~
+
+When this option is non-nil then title fields in bibtex bibliography
+entries are converted to sentence-case before being formatted
+according to a CSL style, except for entries with a =langid= field
+specifying a non-English language.  When nil, this conversion is
+limited to entries having a =langid= field specifying a variant of
+English.  The default value is ~t~.
+
 ** New features
 *** =ob-lua=: Support all types and multiple values in results
 
diff --git a/lisp/oc-csl.el b/lisp/oc-csl.el
index 9bbe5e29d..7234174d0 100644
--- a/lisp/oc-csl.el
+++ b/lisp/oc-csl.el
@@ -321,6 +321,21 @@ in the bibliography measured in characters."
   :type 'string
   :package-version '(Org . "9.7"))
 
+(defcustom org-cite-csl-sentence-case-bibtex-titles t
+  "Convert bibtex title fields to sentence-case by default.
+
+When non-nil, title fields in bibtex bibliography entries are
+converted to sentence-case before being formatted according to a
+CSL style, except for entries with a `langid' field specifying a
+non-English language.
+
+When nil, this conversion is limited to entries having a `langid'
+field specifying a variant of English."
+  :group 'org-cite
+  :package-version '(Org . "9.7")
+  :type 'boolean
+  :safe #'booleanp)
+
 \f
 ;;; Internal variables
 (defconst org-cite-csl--etc-dir
@@ -584,7 +599,8 @@ property in INFO."
              (processor
               (citeproc-create
                (org-cite-csl--style-file info)
-               (citeproc-hash-itemgetter-from-any bibliography)
+               (citeproc-hash-itemgetter-from-any
+                bibliography (not org-cite-csl-sentence-case-bibtex-titles))
                (org-cite-csl--locale-getter)
                locale)))
         (plist-put info :cite-citeproc-processor processor)
-- 
2.34.1
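
For readers who want to try the option, a minimal configuration sketch follows. It assumes the patch above is applied exactly as posted; the option name was still being discussed in this thread, so the name in a released Org may differ.

```elisp
;; Assumption: Org is built with the patch above, so the option exists
;; under this exact name.  Setting it before oc-csl is loaded is fine,
;; because `defcustom' does not overwrite an existing value.
;;
;; nil  -> only entries whose `langid' names a variant of English are
;;         converted to sentence case.
;; t    -> (default in the patch) every entry is converted unless its
;;         `langid' names a non-English language.
(setq org-cite-csl-sentence-case-bibtex-titles nil)
```

Per the diff, the value is negated before it is passed as the second argument to `citeproc-hash-itemgetter-from-any', so nil here means that citeproc-el is asked not to convert by default.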



* Re: [PATCH] oc-csl: New custom option `org-cite-csl-sentence-case-bibtex-titles'
  2024-05-11 15:33 [PATCH] oc-csl: New custom option `org-cite-csl-sentence-case-bibtex-titles' András Simonyi
@ 2024-05-11 17:26 ` Ihor Radchenko
  2024-05-14  9:58   ` András Simonyi
  2024-05-15 11:48 ` Max Nikulin
  1 sibling, 1 reply; 6+ messages in thread
From: Ihor Radchenko @ 2024-05-11 17:26 UTC (permalink / raw)
  To: András Simonyi; +Cc: emacs-orgmode list

András Simonyi <andras.simonyi@gmail.com> writes:

> since bibtex and biblatex require title fields to be in title case
> ...

Are you sure? AFAIK, bibtex and biblatex (depending on the bibstyle)
do not care about capitalization and instead apply their own, unless
the title explicitly protects the capitalization/case with {Curly
BracketS}.

See https://texfaq.org/FAQ-capbibtex and
https://tex.stackexchange.com/questions/20335/proper-casing-in-citation-bibliography-titles-using-biblatex-biber

> ... but CSL assumes that they are in sentence-case, citeproc-el converts
> title fields in bib(la)tex bibliography databases into sentence-case
> before processing them except for entries with an explicit non-English
> langid value...

AFAIU, the general recommendation is to use sentence case in the bib
files, both for Bib(La)TeX (because it converts to title case by itself
if necessary) and for CSL, according to
https://citationstyles.org/authors/

Of course, not every real-life bibliography follows this suggestion,
but, as stated in https://citationstyles.org/authors/, converting from
title case to sentence case is error-prone:

>> For this reason, we recommend that you store all titles in your
>> reference database in sentence case. Our repository CSL styles that
>> need sentence case will generally just print titles as is, whereas
>> styles that need title case will use an automatic title-case
>> conversion.

Looking at https://github.com/andras-simonyi/citeproc-el/issues/119 and
https://github.com/andras-simonyi/citeproc-el/issues/142, it appears to
me that citeproc.el does not stick to the above guideline from the CSL
website: unless title case is requested by the bibliography, converting
to sentence case should not be done by default. Also, such conversion
should only happen in titles, AFAIU, not in other fields.

> Subject: [PATCH] oc-csl: New custom option
> `org-cite-csl-sentence-case-bibtex-titles'

I see no problem with the new feature, but I'd consider flipping the
default to nil.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: [PATCH] oc-csl: New custom option `org-cite-csl-sentence-case-bibtex-titles'
  2024-05-11 17:26 ` Ihor Radchenko
@ 2024-05-14  9:58   ` András Simonyi
  2024-05-17 13:34     ` Ihor Radchenko
  0 siblings, 1 reply; 6+ messages in thread
From: András Simonyi @ 2024-05-14  9:58 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode list

Dear All,

On Sat, 11 May 2024 at 19:24, Ihor Radchenko <yantar92@posteo.net> wrote:

>> > since bibtex and biblatex require title fields to be in title case
> Are you sure? AFAIK, bibtex and biblatex (depending on the bibstyle)
> do not care about capitalization and instead apply their own, unless
> the title explicitly protects the capitalization/case with {Curly
> BracketS}.

Yes, I'm pretty sure that the expected casing in .bib bibliography
databases for the title fields is by default title case, plus
protective braces around texts whose case shouldn't be touched during
formatting (of course, formatting itself can produce both sentence and
title case from this input depending on the used style). This
requirement was already clearly stated in Lamport's original LaTeX
book. To quote the relevant part of "The Bibliography Database"
chapter  (2nd edition, p. 158):

> The bibliography style determines whether or not a title is capitalized; the
> titles of books usually are, the titles of articles usually are not. You type
> a title the way it should appear if it is capitalized. You should capitalize
> the first word of the title, the first word after a colon, and all other words
> except articles and unstressed conjunctions and prepositions. BiBTeX will
> change uppercase letters to lowercase if appropriate. Uppercase letters that
> should not be changed are enclosed in braces.

These requirements haven't changed since then and also hold for
biblatex, see, e.g.,
https://tex.stackexchange.com/questions/439440/what-is-the-proper-casing-to-use-when-storing-titles-in-the-bibliography-database.
If you are interested, you can also look at the discussion concerning
the citeproc-el conversion implementation at
https://github.com/andras-simonyi/citeproc-el/issues/71, see also the
"Capitalization in titles" section in
Pandoc's User’s Guide.

The proposed default makes it possible to use a .bib bibliography
database that conforms to the standard title field format requirements
(title case etc.) and get the intended output corresponding to the
used citation style both with bib(la)tex and CSL styles.  Moreover, it
matches Pandoc's behaviour, see
https://github.com/jgm/pandoc-citeproc/issues/269.

best wishes,
András
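
To make the convention quoted from Lamport concrete, here is a deliberately simplified sketch of a title-case to sentence-case conversion that respects protective braces. It is an illustration only, not citeproc-el's actual algorithm; the function name `my/bibtex-title-to-sentence-case' is hypothetical, and keeping the first letter after a colon as typed is an assumption of this sketch.

```elisp
;;; -*- lexical-binding: t; -*-
;; Simplified illustration; NOT the conversion implemented in citeproc-el.
(defun my/bibtex-title-to-sentence-case (title)
  "Return TITLE in sentence case, leaving {braced} spans untouched.
The first letter of TITLE and the first letter after a colon keep
the case they were typed in; all other unprotected letters are
lowercased."
  (let ((depth 0)       ; current brace nesting level
        (keep-next t)   ; non-nil: the next unprotected letter keeps its case
        (chars nil))    ; output characters, collected in reverse
    (dolist (ch (string-to-list title))
      (cond
       ;; An opening brace starts a protected span; it also "uses up"
       ;; the keep-case slot so the word after the span is lowercased.
       ((eq ch ?{) (setq depth (1+ depth) keep-next nil) (push ch chars))
       ((eq ch ?}) (setq depth (max 0 (1- depth))) (push ch chars))
       ;; Inside braces: copy verbatim.
       ((> depth 0) (push ch chars))
       ;; After a colon the next letter keeps its case.
       ((eq ch ?:) (setq keep-next t) (push ch chars))
       ((eq ch ?\s) (push ch chars))
       (keep-next (setq keep-next nil) (push ch chars))
       (t (push (downcase ch) chars))))
    (concat (nreverse chars))))

(my/bibtex-title-to-sentence-case
 "The {LaTeX} Companion: A Guide to {B}ib{T}e{X}")
;; => "The {LaTeX} companion: A guide to {B}ib{T}e{X}"
```

The braces are kept in the output on purpose, so the result can still be fed to a later formatting step that strips them. A real implementation also has to deal with TeX commands, quotation marks, and language-specific rules, which this sketch ignores.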



* Re: [PATCH] oc-csl: New custom option `org-cite-csl-sentence-case-bibtex-titles'
  2024-05-11 15:33 [PATCH] oc-csl: New custom option `org-cite-csl-sentence-case-bibtex-titles' András Simonyi
  2024-05-11 17:26 ` Ihor Radchenko
@ 2024-05-15 11:48 ` Max Nikulin
  1 sibling, 0 replies; 6+ messages in thread
From: Max Nikulin @ 2024-05-15 11:48 UTC (permalink / raw)
  To: emacs-orgmode; +Cc: András Simonyi

On 11/05/2024 22:33, András Simonyi wrote:
> since bibtex and biblatex require title fields to be in title case
> but CSL assumes that they are in sentence-case, citeproc-el converts
> title fields in bib(la)tex bibliography databases into sentence-case
> before processing them except for entries with an explicit non-English
> langid value.

I am not a user of citeproc-el, so feel free to disregard my comments.

In the past I had to adjust BibTeX styles, but yesterday I was surprised 
that there are options for upper case, lower case, and sentence-style 
capitalization, but not for title-style capitalization. It seems that 
both approaches with title case and with sentence case have some 
shortcomings. Likely title case like in BibTeX requires more explicit 
hints and perhaps there are cases when available hints are not enough to 
get specific formatting. I still expect that CSL needs hints as well to 
avoid improper formatting.

Is it possible to keep the title formatting from .bib files until it
becomes known that a specific style requires sentence case for a
particular entry type? I hope that it might alleviate the issue and make
things work out of the box for more users.

> I'm a bit unsure about naming the option:
> Perhaps `org-cite-csl-sentence-case-bibtex-titles-without-langid'

A variant: org-cite-csl-bibtex-title-to-sentence-case

> @@ -584,7 +599,8 @@ property in INFO."
>               (processor
>                (citeproc-create
>                 (org-cite-csl--style-file info)
> -               (citeproc-hash-itemgetter-from-any bibliography)
> +               (citeproc-hash-itemgetter-from-any
> +                bibliography (not org-cite-csl-sentence-case-bibtex-titles))
>                 (org-cite-csl--locale-getter)
>                 locale)))
>          (plist-put info :cite-citeproc-processor processor)

I am not in the context, so I may be completely wrong.

Does it mean that you added one more argument to `citeproc-create' and
that consistent Org and citeproc-el versions must be used? If so,
wouldn't it be better to pass a property list to allow newer Org to work
with older citeproc-el or vice versa? It may be tricky to preserve
backward-forward compatibility on this step, but it should make further
changes easier. It may be reasonable to explicitly add a version of the
"protocol" to the property list, so that citeproc-el may decide if an
error should be signaled in the case of a serious version difference.

It is not clear to me why `org-cite-csl-sentence-case-bibtex-titles' is
a part of Org, not of citeproc-el. The only thing that Org can do is to 
pass it to citeproc-el. It is not configurable per .org file and likely 
it should not be. From my point of view it might be more suitable per 
.bib file. Anyway it is almost unrelated to Org.
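
The property-list idea above could look roughly like the following sketch. Everything in it is hypothetical: neither `my/make-itemgetter' nor the `:protocol' and `:no-sentence-case' keys exist in Org or citeproc-el, and the return value is a plain description rather than a real item getter.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical sketch; none of these names exist in Org or citeproc-el.
(defconst my/itemgetter-protocol-version 1
  "Highest option-protocol version this hypothetical callee understands.")

(defun my/make-itemgetter (bibliography &rest options)
  "Return a description of an item getter for BIBLIOGRAPHY.
OPTIONS is a property list.  Unknown keys are ignored, so a newer
caller can pass options that an older callee does not know about.
A caller announcing a newer `:protocol' than this callee
understands gets an error instead of silently wrong output."
  (let ((protocol (or (plist-get options :protocol) 1))
        (no-sentence-case (plist-get options :no-sentence-case)))
    (when (> protocol my/itemgetter-protocol-version)
      (error "Unsupported option protocol %d" protocol))
    (list :bibliography bibliography
          :no-sentence-case (and no-sentence-case t))))

;; The unknown key :future-flag is simply ignored.
(my/make-itemgetter '("refs.bib")
                    :protocol 1 :no-sentence-case t :future-flag 42)
;; => (:bibliography ("refs.bib") :no-sentence-case t)
```

Compared with an extra positional argument, the keyword form lets either side add options without breaking the other, at the cost of a slightly more verbose call.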



* Re: [PATCH] oc-csl: New custom option `org-cite-csl-sentence-case-bibtex-titles'
  2024-05-14  9:58   ` András Simonyi
@ 2024-05-17 13:34     ` Ihor Radchenko
  2024-06-17 11:39       ` Ihor Radchenko
  0 siblings, 1 reply; 6+ messages in thread
From: Ihor Radchenko @ 2024-05-17 13:34 UTC (permalink / raw)
  To: András Simonyi; +Cc: emacs-orgmode list

András Simonyi <andras.simonyi@gmail.com> writes:

> On Sat, 11 May 2024 at 19:24, Ihor Radchenko <yantar92@posteo.net> wrote:
>
>>> > since bibtex and biblatex require title fields to be in title case
>> Are you sure? AFAIK, bibtex and biblatex (depending on the bibstyle)
>> do not care about capitalization and instead apply their own, unless
>> the title explicitly protects the capitalization/case with {Curly
>> BracketS}.
>
> Yes, I'm pretty sure that the expected casing in .bib bibliography
> databases for the title fields is by default title case, plus
> protective braces around texts whose case shouldn't be touched during
> formatting (of course, formatting itself can produce both sentence and
> title case from this input depending on the used style). This
> requirement was already clearly stated in Lamport's original LaTeX
> book. To quote the relevant part of "The Bibliography Database"
> chapter  (2nd edition, p. 158):
> ...
> These requirements haven't changed since then and also hold for
> biblatex, see, e.g.,
> ...

Thanks for the explanation.
I also cross-checked with the examples given in
https://ctan.org/pkg/bibtex - all consistent.

> I'm a bit unsure about naming the option:
> Perhaps `org-cite-csl-sentence-case-bibtex-titles-without-langid'
> would be more precise but I found it absurdly long and technical, as
> most users are probably unaware of the existence of langid fields.

Maybe org-cite-csl-sentence-case-english-titles?

Also, it would be nice to point out that CSL and BibTeX have different
conventions for the title field.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: [PATCH] oc-csl: New custom option `org-cite-csl-sentence-case-bibtex-titles'
  2024-05-17 13:34     ` Ihor Radchenko
@ 2024-06-17 11:39       ` Ihor Radchenko
  0 siblings, 0 replies; 6+ messages in thread
From: Ihor Radchenko @ 2024-06-17 11:39 UTC (permalink / raw)
  To: András Simonyi; +Cc: emacs-orgmode list

Ihor Radchenko <yantar92@posteo.net> writes:

>> I'm a bit unsure about naming the option:
>> Perhaps `org-cite-csl-sentence-case-bibtex-titles-without-langid'
>> would be more precise but I found it absurdly long and technical, as
>> most users are probably unaware of the existence of langid fields.
>
> Maybe org-cite-csl-sentence-case-english-titles?
>
> Also, it would be nice to point out that CSL and BibTeX have different
> conventions for the title field.

Andras, it has been a month since the last message in this thread.
May I know about the status of the patch?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


