emacs-orgmode@gnu.org archives
* Org-cite (oc-csl) tip: Filtering bibliography for language
@ 2022-12-19 14:48 Christian Moe
  2022-12-19 22:05 ` András Simonyi
  0 siblings, 1 reply; 8+ messages in thread
From: Christian Moe @ 2022-12-19 14:48 UTC (permalink / raw)
  To: emacs-orgmode@gnu.org

Hi,

A tip some of you might find useful:

I wanted to separate sub-bibliographies by language, which is not one of
the out-of-the-box available properties of the PRINT_BIBLIOGRAPHY
keyword.[fn:1] Specifically, I wanted to filter out Norwegian items into
one subbibliography and non-Norwegian ones into another in the same
document.[fn:2]

I found out how to do it with the `PRINT_BIBLIOGRAPHY: :filter
<predicate>' property, which turns out to be used in the function
citeproc-sb--match-p, where it is applied to a var-value list.

I defined a predicate for Norwegian, bibitem-norwegian-p, that matches a
regexp for various labels for Norwegian[fn:3] against the language
value.

#+begin_src elisp
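  (defun bibitem-norwegian-p (vv)
    "Return non-nil if the bibliography item VV is in Norwegian.
  VV is the item's var-value alist.  For use in an org-cite
  PRINT_BIBLIOGRAPHY filter."
    (let ((itemlang (alist-get 'language vv))
          (case-fold-search t))
      ;; Anchor at the start so that e.g. "Unknown" does not match, and
      ;; keep the hyphen last in the bracket so it is taken literally.
      (and itemlang (string-match-p "\\`n[obn][rbo-]?" itemlang))))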
#+end_src

Then I could successfully use it as follows:

#+PRINT_BIBLIOGRAPHY: :filter bibitem-norwegian-p

For the list of non-Norwegian items I just needed to define a complementary
function:

#+begin_src elisp
  (defun bibitem-not-norwegian-p (vv)
    (not (bibitem-norwegian-p vv)))
#+end_src

#+PRINT_BIBLIOGRAPHY: :filter bibitem-not-norwegian-p

Adapt as needed for other languages and use cases.
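
A quick way to sanity-check such a predicate is to call it on hand-built
items. This is only a sketch: it assumes the filter receives an alist
keyed by CSL variable symbols with string values, and the names
my-sample-items and my-item-norwegian-p are made up for illustration.

#+begin_src elisp
  ;; Hypothetical sample items in the assumed var-value alist shape.
  (defvar my-sample-items
    '(((title . "Sult") (language . "nb-NO"))
      ((title . "Walden") (language . "en-US"))
      ((title . "Untitled")))           ; no language field at all
    "Hand-built items for trying out bibliography filters.")

  (defun my-item-norwegian-p (vv)
    "Return non-nil if item VV carries a Norwegian language label."
    (let ((lang (alist-get 'language vv))
          (case-fold-search t))
      (and lang (string-match-p "\\`n[obn]" lang))))

  ;; Normalize each result to t or nil for easy reading.
  (mapcar (lambda (vv) (and (my-item-norwegian-p vv) t))
          my-sample-items)
  ;; => (t nil nil)
#+end_src

Items without any language value end up in the "not Norwegian" list,
which may or may not be what you want.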

Refinements welcome. I'm especially wondering what would be an elegant
way to generalize this for more languages without defining a predicate
for each language (given that we cannot pass the language as an
additional argument in the print_bibliography line).
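
One stopgap, sketched here under the same assumption about the item
format, is a small macro that stamps out a named predicate per
language. It does not remove the need for one predicate per language,
it only reduces each to a single line. The macro name and the example
regexps are hypothetical.

#+begin_src elisp
  (defmacro my-define-language-filter (name regexp)
    "Define NAME as a filter predicate for items whose language matches REGEXP.
  REGEXP should be a literal string."
    `(defun ,name (vv)
       ,(format "Return non-nil if item VV has a language matching %S." regexp)
       (let ((lang (alist-get 'language vv))
             (case-fold-search t))
         (and lang (string-match-p ,regexp lang) t))))

  ;; One line per language; the regexps are only examples.
  (my-define-language-filter my-bibitem-german-p "\\`\\(de\\|ger\\)")
  (my-define-language-filter my-bibitem-english-p "\\`en")
#+end_src

The generated symbols can then be used like any other predicate, e.g.
`#+PRINT_BIBLIOGRAPHY: :filter my-bibitem-german-p'.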

* Footnotes

[fn:1] [[info:org#Bibliography options in the ``biblatex'' and ``csl''
export processors]]

[fn:2] For this to work at all, of course, your CSL JSON or BibTeX has
to contain language information on the items; it should, or English
formatting such as title-casing might be applied inappropriately to
non-English items by some styles.

[fn:3] The regexp is complicated because Norwegian is complicated, my
labeling is inconsistent and I want to match at least no, nb, nn, no-NO,
nb-NO, nn-NO, nor, nob, nno, norsk and Norwegian ...



Yours,
Christian



* Re: Org-cite (oc-csl) tip: Filtering bibliography for language
  2022-12-19 14:48 Org-cite (oc-csl) tip: Filtering bibliography for language Christian Moe
@ 2022-12-19 22:05 ` András Simonyi
  2022-12-19 22:20   ` András Simonyi
  0 siblings, 1 reply; 8+ messages in thread
From: András Simonyi @ 2022-12-19 22:05 UTC (permalink / raw)
  To: Christian Moe; +Cc: emacs-orgmode@gnu.org

Dear All,

On Mon, 19 Dec 2022 at 15:49, Christian Moe <mail@christianmoe.com> wrote:

> Refinements welcome. I'm especially wondering what would be an elegant
> way to generalize this for more languages without defining a predicate
> for each language (given that we cannot pass the language as an
> additional argument in the print_bibliography line).

Thanks for describing this usage! As for the problem of generalizing
to more languages, one relatively simple solution would be to allow
arbitrary sexps as filters. Then one could write something like

#+print_bibliography: :filter (lambda (item) (bibitem-has-language item "en"))

Would this type of extension be helpful? One (not necessarily
important) consequence would be that filters of this type would
obviously be unusable with the biblatex exporter.
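
For concreteness, the helper in that example could look roughly like
this. It is a sketch only: bibitem-has-language does not exist
anywhere, sexp filters are not supported as of this message, and the
code assumes the item is an alist with a string under the `language'
key.

#+begin_src elisp
  (defun bibitem-has-language (item lang)
    "Return non-nil if ITEM's language is LANG or a regional variant of it.
  LANG is a primary language code such as \"en\"."
    (let ((item-lang (alist-get 'language item))
          (case-fold-search t))
      (and item-lang
           ;; Match "en" and "en-US", but not "eng".
           (string-match-p (concat "\\`" (regexp-quote lang) "\\(-\\|\\'\\)")
                           item-lang)
           t)))

  (bibitem-has-language '((language . "en-US")) "en") ; => t
  (bibitem-has-language '((language . "de")) "en")    ; => nil
#+end_src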

best wishes,
András



* Re: Org-cite (oc-csl) tip: Filtering bibliography for language
  2022-12-19 22:05 ` András Simonyi
@ 2022-12-19 22:20   ` András Simonyi
  2022-12-20  8:23     ` Denis Maier
  0 siblings, 1 reply; 8+ messages in thread
From: András Simonyi @ 2022-12-19 22:20 UTC (permalink / raw)
  To: Christian Moe; +Cc: emacs-orgmode@gnu.org

... I forgot to add that another (probably more user-friendly)
option would be to design and implement some kind of filtering DSL.

András

On Mon, 19 Dec 2022 at 23:05, András Simonyi <andras.simonyi@gmail.com> wrote:
>
> Dear All,
>
> On Mon, 19 Dec 2022 at 15:49, Christian Moe <mail@christianmoe.com> wrote:
>
> > Refinements welcome. I'm especially wondering what would be an elegant
> > way to generalize this for more languages without defining a predicate
> > for each language (given that we cannot pass the language as an
> > additional argument in the print_bibliography line).
>
> Thanks for describing this usage! As for the problem of generalizing
> to more languages, one relatively simple solution would be to allow
> arbitrary sexps as filters. Then one could write something like
>
> #+print_bibliography: :filter (lambda (item) (bibitem-has-language item "en"))
>
> Would this type of extension be helpful? One (not necessarily
> important)  consequence would be that filters of this type would be
> obviously unusable with the biblatex exporter.
>
> best wishes,
> András



* Re: Org-cite (oc-csl) tip: Filtering bibliography for language
  2022-12-19 22:20   ` András Simonyi
@ 2022-12-20  8:23     ` Denis Maier
  2022-12-20  9:47       ` András Simonyi
  2022-12-20 10:46       ` Christian Moe
  0 siblings, 2 replies; 8+ messages in thread
From: Denis Maier @ 2022-12-20  8:23 UTC (permalink / raw)
  To: András Simonyi, Christian Moe; +Cc: emacs-orgmode@gnu.org

On 19.12.2022 at 23:20, András Simonyi wrote:
> ... I've forgotten to add that another (probably more user friendly)
> option would be to design and implement some kind of  filtering DSL.
> 
> András
> 
> On Mon, 19 Dec 2022 at 23:05, András Simonyi <andras.simonyi@gmail.com> wrote:
>>
>> Dear All,
>>
>> On Mon, 19 Dec 2022 at 15:49, Christian Moe <mail@christianmoe.com> wrote:
>>
>>> Refinements welcome. I'm especially wondering what would be an elegant
>>> way to generalize this for more languages without defining a predicate
>>> for each language (given that we cannot pass the language as an
>>> additional argument in the print_bibliography line).
>>
>> Thanks for describing this usage! As for the problem of generalizing
>> to more languages, one relatively simple solution would be to allow
>> arbitrary sexps as filters. Then one could write something like
>>
>> #+print_bibliography: :filter (lambda (item) (bibitem-has-language item "en"))
>>
>> Would this type of extension be helpful? One (not necessarily
>> important)  consequence would be that filters of this type would be
>> obviously unusable with the biblatex exporter.
>>
>> best wishes,
>> András

I'd say both options are certainly useful. A filtering DSL is surely the 
more user friendly option, but allowing lambda expressions would 
probably be quicker to implement, and it would also allow for predicates 
not anticipated by DSL designers.

Best,
Denis




* Re: Org-cite (oc-csl) tip: Filtering bibliography for language
  2022-12-20  8:23     ` Denis Maier
@ 2022-12-20  9:47       ` András Simonyi
  2022-12-20 10:22         ` Timothy
  2022-12-20 10:46       ` Christian Moe
  1 sibling, 1 reply; 8+ messages in thread
From: András Simonyi @ 2022-12-20  9:47 UTC (permalink / raw)
  To: Denis Maier; +Cc: Christian Moe, emacs-orgmode@gnu.org

On Tue, 20 Dec 2022 at 09:22, Denis Maier <maier.de@gmail.com> wrote:

> allowing lambda expressions would
> probably be quicker to implement, and it would also allow for predicates
> not anticipated by DSL designers.

Yes, on the other hand, we will have to be very careful with regard to
security if we choose this route, treating filters basically as elisp
source code blocks.

András



* Re: Org-cite (oc-csl) tip: Filtering bibliography for language
  2022-12-20  9:47       ` András Simonyi
@ 2022-12-20 10:22         ` Timothy
  0 siblings, 0 replies; 8+ messages in thread
From: Timothy @ 2022-12-20 10:22 UTC (permalink / raw)
  To: András Simonyi; +Cc: Denis Maier, Christian Moe, emacs-orgmode


Hi András,

> Yes, on the other hand, we will have to be very careful with regard to
> security if we choose this route, treating filters basically as elisp
> source code blocks.

Mmm, next year I’d like to try to track down and manage all of the “surprise
elisp execution” that happens with Org files, so having something basic for
simple use cases could be nice.

All the best,
Timothy

-- 
Timothy (‘tecosaur’/‘TEC’), Org mode contributor.
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/tec>.


* Re: Org-cite (oc-csl) tip: Filtering bibliography for language
  2022-12-20  8:23     ` Denis Maier
  2022-12-20  9:47       ` András Simonyi
@ 2022-12-20 10:46       ` Christian Moe
  2023-01-18 19:39         ` András Simonyi
  1 sibling, 1 reply; 8+ messages in thread
From: Christian Moe @ 2022-12-20 10:46 UTC (permalink / raw)
  To: Denis Maier; +Cc: András Simonyi, emacs-orgmode@gnu.org


Denis Maier writes:

> On 19.12.2022 at 23:20, András Simonyi wrote:
>> ... I've forgotten to add that another (probably more user friendly)
>> option would be to design and implement some kind of  filtering DSL.
>> András
>> On Mon, 19 Dec 2022 at 23:05, András Simonyi
>> <andras.simonyi@gmail.com> wrote:
>>>
>>> Dear All,
>>>
>>> On Mon, 19 Dec 2022 at 15:49, Christian Moe <mail@christianmoe.com> wrote:
>>>
>>>> Refinements welcome. I'm especially wondering what would be an elegant
>>>> way to generalize this for more languages without defining a predicate
>>>> for each language (given that we cannot pass the language as an
>>>> additional argument in the print_bibliography line).
>>>
>>> Thanks for describing this usage! As for the problem of generalizing
>>> to more languages, one relatively simple solution would be to allow
>>> arbitrary sexps as filters. Then one could write something like
>>>
>>> #+print_bibliography: :filter (lambda (item) (bibitem-has-language item "en"))
>>>
>>> Would this type of extension be helpful? One (not necessarily
>>> important)  consequence would be that filters of this type would be
>>> obviously unusable with the biblatex exporter.
>>>
>>> best wishes,
>>> András
>
> I'd say both options are certainly useful. A filtering DSL is surely
> the more user friendly option, but allowing lambda expressions would
> probably be quicker to implement, and it would also allow for
> predicates not anticipated by DSL designers.
>
> Best,
> Denis

Arbitrary sexps would give us more flexibility. Alternately, one could
achieve more or less the same by letting :filter collect any additional
arguments and pass them as &rest to the user's predicate function,
something like:

  #+PRINT_BIBLIOGRAPHY: :filter bibitem-lang-p nb nn no :type article

(This perhaps makes for cleaner solutions. And it is perhaps slightly
better from a security viewpoint: I hope for a bright future of
collaborative authoring in Org, so I'm wary of proliferating ways to
execute arbitrary elisp that a user might not notice. But we do have
such ways already, and it's possible to abuse the above solution as
well, so I don't know.)
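
On the user's side, a predicate for such a line might be sketched as
follows. Everything about the calling convention is an assumption,
since the feature is only a proposal: the sketch supposes that the
item comes first, that the extra words arrive as symbols or strings,
and that the item is an alist with a string under `language'.

#+begin_src elisp
  (defun bibitem-lang-p (vv &rest langs)
    "Return non-nil if the primary language code of item VV is among LANGS.
  LANGS may be symbols or strings, e.g. nb, nn, no."
    (let ((item-lang (alist-get 'language vv)))
      (and item-lang
           ;; Compare only the primary subtag: "nb-NO" counts as "nb".
           (member (car (split-string (downcase item-lang) "-"))
                   (mapcar (lambda (l) (downcase (format "%s" l))) langs))
           t)))

  (bibitem-lang-p '((language . "nb-NO")) 'nb 'nn 'no) ; => t
  (bibitem-lang-p '((language . "en-US")) 'nb 'nn 'no) ; => nil
#+end_src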

Alternatively, I think there is a case for adding a user-friendly
:language property to the print_bibliography keyword. On my bookshelf it
vies with primary/secondary sources as the most common criterion for
separate bibliographies.

I was going to say that this is the only extension I can think of that
is needed beside :(not)(csl)type and :(not)keyword, but of course people
are sooner or later going to want easy-to-use properties to filter by
author, publication date ranges, and probably other criteria I cannot
think of right now, so it's a strategic decision for the maintainer(s)
if you want to go that way. :-)

Yours,
Christian



* Re: Org-cite (oc-csl) tip: Filtering bibliography for language
  2022-12-20 10:46       ` Christian Moe
@ 2023-01-18 19:39         ` András Simonyi
  0 siblings, 0 replies; 8+ messages in thread
From: András Simonyi @ 2023-01-18 19:39 UTC (permalink / raw)
  To: Christian Moe; +Cc: Denis Maier, emacs-orgmode@gnu.org

Dear All,

First of all, sorry for replying so late.

On Tue, 20 Dec 2022 at 11:46, Christian Moe <mail@christianmoe.com> wrote:
> Arbitrary sexps would give us more flexibility. Alternately, one could
> achieve more or less the same by letting :filter collect any additional
> arguments and pass them as &rest to the user's predicate function,
> something like:
>
>   #+PRINT_BIBLIOGRAPHY: :filter bibitem-lang-p nb nn no :type article

I like this proposal a lot -- it seems to strike a good balance with
regard to safety and flexibility.
I'll try to make the required changes on the citeproc-el side and then
propose a patch here.

> Alternatively, I think there is a case for adding a user-friendly
> :language property to the print_bibliography keyword. On my bookshelf it
> vies with primary/secondary sources as the most common criterion for
> separate bibliographies.

> I was going to say that this is the only extension I can think of that
> is needed beside :(not)(csl)type and :(not)keyword, but of course people
> are sooner or later going to want easy-to-use properties to filter by
> author, publication date ranges, and probably other criteria I cannot
> think of right now, so it's a strategic decision for the maintainer(s)
> if you want to go that way. :-)

This is also a useful suggestion, although with the added difficulty
of having to support both bib(la)tex and csl-json, which use, I think,
different sets of language codes.
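
To illustrate the kind of normalization this would need, here is a
rough sketch that maps a few babel-style language names, as often
found in bib(la)tex data, onto primary language codes. The table is
deliberately tiny and incomplete, and both names are hypothetical.

#+begin_src elisp
  (defvar my-langid-to-code
    '(("english" . "en") ("american" . "en") ("british" . "en")
      ("german" . "de") ("ngerman" . "de")
      ("norsk" . "nb") ("nynorsk" . "nn")
      ("french" . "fr"))
    "Illustrative, incomplete map from babel-style names to language codes.")

  (defun my-normalize-language (lang)
    "Return a primary language code for LANG.
  Known babel-style names are looked up in `my-langid-to-code';
  anything else is lowercased and cut at the first hyphen."
    (let ((key (downcase lang)))
      (or (cdr (assoc key my-langid-to-code))
          (car (split-string key "-")))))

  (mapcar #'my-normalize-language '("norsk" "nb-NO" "American" "en-US"))
  ;; => ("nb" "nb" "en" "en")
#+end_src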

best wishes,
András

> Yours,
> Christian


