From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matt Price
Subject: Re: Citation processing via Zotero + zotxt
Date: Tue, 1 Dec 2015 14:37:29 -0500
References: <87wpt1yj5k.fsf@berkeley.edu> <87egf62q31.fsf@gmx.us>
In-Reply-To: <87egf62q31.fsf@gmx.us>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=94eb2c07f940c6aada0525db492b
List-Id: "General discussions about Org-mode."
To: Rasmus
Cc: Org Mode

--94eb2c07f940c6aada0525db492b
Content-Type: text/plain; charset=UTF-8

On Tue, Dec 1, 2015 at 9:36 AM, Rasmus wrote:

> Hi Richard,
>
> First, thank you for looking into this.  I learned something new from this
>
> > Pretty much all the other options we have talked about seem like they
> > will require multi-step, non-trivial installation procedures ("First
> > install {Node.js/Haskell/JVM ...}, then install
> > {citeproc-node/pandoc-citeproc/citeproc-java...}, then install our
> > wrapper script...").  Updating could require other manual operations of
> > similar complexity.  Avoiding that kind of procedure will make citations
> > a lot more usable from Org for everyone.
>
> I think this is *very* important.
>

I totally agree.

> > 2) It is quite complete.
> >
> > Previously, I thought that it would be a similar amount of work to
> > communicate with Zotero from Emacs as any of the other CSL
> > implementations out there.  However, after looking at zotxt a bit more
> > closely, I discovered that it has an (undocumented) API endpoint [3]
>
> This sounds amazing, but also dangerous.  Do you know whether stabilizing
> the API has been discussed upstream?
>

I think the API Richard is referring to is *zotxt's* API, not Zotero's. So
"upstream" is a very short distance to an underground spring under our house
(Erik Hetzner). The somewhat more widely used Better Bibtex plugin also
provides an API to the Zotero database
(https://zotplus.github.io/better-bibtex/cayw.html). In either case, it
would probably be relatively easy to provide patches to the maintainer if we
run into trouble.

> > that pretty much does exactly what we need: it accepts a list of
> > citation objects, and returns a list of formatted citations and a
> > formatted bibliography, which can be inserted into the exported
> > document.
>
> Could you give an example of the sort of input you give?  I.e. is it based
> on keys?  From my bibtex-centric world view I imagine something like:
>
>    I send key or pointer @K to a DB entry as well as a CSL-file pointer C,
>    and maybe a desired output format F.  I get a string back that is the
>    formatting of the data behind @K formatted according to the rules in C,
>    adapted to F.
>
> Is that correct?  If so, does it support html, text and odt?
>

Right now, IIUC, zotxt accepts only a *style name*, not a CSL file -- it
will locate the CSL file in the Zotero style list. It supports html and
text output formats, as well as the QuickKey syntax used by the ODF-scan
Zotero plugin (https://github.com/Zotero-ODF-Scan/zotero-odf-scan).
My understanding is that providing fully-formed odt syntax is difficult
because of the structure of the odt file, which I guess wants a bunch of
metadata that isn't trivial to provide. The recommended path right now is to
run ODF-scan on the odt from LibreOffice -- it's an annoying extra step that
I was hoping to be able to avoid, but am not competent to program:

https://forums.zotero.org/discussion/29308/7/rtfodf-scan-for-zotero/#Comment_226799

> > This endpoint still needs a little bit of work, to generalize it and
> > make it easier to get the data in the format we need.  (That is probably
> > why it is undocumented in the README.) But it requires much less work
> > than I thought it would, and much less work than it would be to get a
> > full-featured setup with something like citeproc-node.
>
> This is a very strong argument.
>
> At some point Matt talked about adding support for org citation syntax in
> citeproc-js.  Would this be advantageous if going this route?  I guess not.
>

Depends on whether you want to be able to request org-mode syntax from
Zotero. Zotero uses citeproc-js internally; changes we make to citeproc-js
will eventually percolate up to Zotero, and it's not impossible to replace
Zotero's citeproc with one's own copy (even I can do it).

> >
>
> IMO we can leverage zotero as a tool, but we cannot enforce it as a
> bibliography manager.
>

yes

> > I still think Zotero + zotxt is the best option for non-LaTeX
> > citation processing, even for these folks.  The ease of installation
> > (and removal) of the required programs alone makes it worth it, even if
> > you never actually populate a Zotero database.  So given what I know at
> > the moment, I think our efforts would best be directed at making the
> > in-progress org-cite library communicate with Zotero via zotxt.  What do
> > you think?
>
> +1, though re zotxt we should make sure Erik would want to move it to
> GELPA.
>

Basically I'm enthusiastic and glad you are taking up the challenge, since
matt's programming:snail's pace :: snail's pace:leopard run

--94eb2c07f940c6aada0525db492b
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable


--94eb2c07f940c6aada0525db492b--