From mboxrd@z Thu Jan  1 00:00:00 1970
From: Matt Lundin <mdl@imapmail.org>
Subject: Re: Citation processing via Zotero + zotxt
Date: Thu, 03 Dec 2015 14:45:39 -0600
Message-ID: <87k2ovw9ak.fsf@fastmail.fm>
References: <87wpt1yj5k.fsf@berkeley.edu> <m2poyrzz08.fsf@gmail.com>
	<m2k2ozqa92.fsf@andrew.cmu.edu> <87d1uqyiva.fsf@berkeley.edu>
	<8737vkidgl.fsf@fastmail.fm> <87mvtsw3sp.fsf@berkeley.edu>
	<87lh9bh7se.fsf@fastmail.fm> <87k2ovwh4w.fsf@berkeley.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <87k2ovwh4w.fsf@berkeley.edu> (Richard Lawrence's message of
	"Thu, 03 Dec 2015 09:56:15 -0800")
List-Id: "General discussions about Org-mode." <emacs-orgmode.gnu.org>
To: Richard Lawrence <richard.lawrence@berkeley.edu>
Cc: emacs-orgmode@gnu.org

Hi Richard,

Thanks so much for this very helpful explanation!

Richard Lawrence <richard.lawrence@berkeley.edu> writes:

> Hi Matt and all,
>
> Matt Lundin <mdl@imapmail.org> writes:
>
>> But for bibtex users, wouldn't there presumably have to be another
>> zotero plugin that would allow for live, automated importing of bibtex
>> into zotero? (If anyone knows whether such a plugin exists, please do
>> let me know.)
>
> Well, my hope is that this could be added to zotxt without much effort,
> so we could still just depend on Zotero and zotxt.  The translation
> capability already exists in Zotero; it's just a matter of exposing it
> as an API, and I imagine that Erik would happily accept a patch to zotxt
> that does so.

That sounds like a great plan.

> Yes, you're basically describing the approach that I eventually realized
> org-citeproc should take: use the full capabilities of Pandoc to render
> citations and bibliography in Org format, then re-parse these on the Org
> side.  I did start to work on this, though I didn't finish and I'm not
> sure if I pushed it to the public repo.
>
> If we want to use pandoc-citeproc directly, instead of wrapping it
> with something like org-citeproc, what we'd need to do is be able to
> translate an Org document (or at least the citations within it) both
> to and from pandoc-compatible JSON, since pandoc-citeproc reads and
> writes in that format.

I'm probably missing something, but would we necessarily need to convert
to pandoc's JSON format? A quick and dirty approach might be to use an
org export filter function to grab citations and insert some temporary
unique ids in the export string as placeholders for each citation. Then
we could create a temporary buffer that looks like this:

--8<---------------cut here---------------start------------->8---
unique_id1 [@some_citation, pp. 1-10]

unique_id2 [@another_citation, p. 23]
--8<---------------cut here---------------end--------------->8---

We could then run a shell command on the buffer (i.e., "pandoc
--filter=pandoc-citeproc --csl=/path/to/csl
--bibliography=/path/to/bibdata -t org"), resulting in formatted
citations for each id. With some simple mapping, we could use a filter
function to insert the citations in the export string/buffer.
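
To make the idea a bit more concrete, here is a rough sketch in Python
(purely for illustration; the real thing would presumably be an Elisp
export filter). Everything in it is an assumption on my part: the
function names are made up, and I use ids like "orgcite0001" rather
than "unique_id1" so the ids contain nothing that could be read as
markup. The pandoc flags are the ones from the command above.

```python
import re
import subprocess


def build_placeholder_document(citations):
    """Return (text, ids): one paragraph per citation, each led by a unique id."""
    # Zero-padded ids so that no id is a prefix of another (up to 9999 citations).
    ids = [f"orgcite{n:04d}" for n in range(1, len(citations) + 1)]
    paragraphs = [f"{uid} {cite}" for uid, cite in zip(ids, citations)]
    return "\n\n".join(paragraphs) + "\n", ids


def run_pandoc(text, csl, bibliography):
    """Pipe the placeholder document through pandoc and return its Org output.

    Assumes pandoc and pandoc-citeproc are installed and on the PATH.
    """
    cmd = [
        "pandoc",
        "--filter=pandoc-citeproc",
        f"--csl={csl}",
        f"--bibliography={bibliography}",
        "-t", "org",
    ]
    result = subprocess.run(cmd, input=text, capture_output=True,
                            text=True, check=True)
    return result.stdout


def parse_formatted(output, ids):
    """Map each id to the formatted citation that follows it in the output."""
    mapping = {}
    for paragraph in re.split(r"\n\s*\n", output.strip()):
        words = paragraph.split(None, 1)
        # Paragraphs that do not start with one of our ids (for example the
        # generated bibliography) are ignored here.
        if len(words) == 2 and words[0] in ids:
            mapping[words[0]] = " ".join(words[1].split())
    return mapping


def fill_placeholders(export_string, mapping):
    """Replace every placeholder id in the export string with its citation."""
    for uid, formatted in mapping.items():
        export_string = export_string.replace(uid, formatted)
    return export_string
```

The flow would be: build_placeholder_document, then run_pandoc, then
parse_formatted, then fill_placeholders on the export string. Note that
this sketch throws away the bibliography pandoc appends, which we would
of course want to keep and insert somewhere.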

Obviously, JSON would be way more elegant. But we would still need to
run the results through pandoc to get strings of formatted Org output.

> I am not opposed to this idea -- indeed, I kind of like it, which is why
> I started work on org-citeproc in the first place.  Still, it would be a
> non-trivial amount of work to develop this solution even to the point
> that it can do what Zotero and zotxt can do right now.

Thanks for this explanation. I vote for you going full speed ahead with
the zotero/zotxt plans. I'd be happy to build on the work you've already
done to try to make pandoc-citeproc work.

>> Javascript interpreters/engines are widely available for all platforms
>> if we create a wrapper script around citeproc-js. Node itself is also
>> easily available for most platforms. But we wouldn't need to set it up
>> as a node server à la citeproc-node.
>
> My concern here is with the wrapper script.  Yes, it's pretty easy to
> install a javascript interpreter; but getting from there to the point
> where you have a fully-working toolchain for processing citations from
> Org mode is the problem.  What I think we should avoid is a process that
> looks like:
>
> 1) Install node (or whatever interpreter)
> 2) Install citeproc-js and the wrapper script
> 3) Make sure the wrapper script is available as an executable that can
> be called from Emacs
> 4) ...

If we chose node, we could try to package the wrapper script so it can
be installed via npm. Then the installation process would be:

a) install node
b) npm install citeproc-js-wrapper [or whatever]
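
If the package declared its script in the "bin" field of package.json
and were installed globally, npm would put the executable on the PATH,
so your step 3 would reduce to a simple lookup. A minimal sketch of
such a check, in Python only for illustration (on the Emacs side this
would be something like executable-find); "citeproc-js-wrapper" is
just the placeholder name from above, not a real package:

```python
import shutil

# Hypothetical tool names: "citeproc-js-wrapper" does not exist as a package.
REQUIRED_TOOLS = ["node", "citeproc-js-wrapper"]


def missing_tools(tools=REQUIRED_TOOLS, path=None):
    """Return the tools that cannot be found as executables on the search path.

    With path=None the PATH environment variable is searched.
    """
    return [tool for tool in tools if shutil.which(tool, path=path) is None]
```

An empty result from missing_tools() would mean the toolchain is ready;
otherwise we could show the user which of the two steps is still
missing.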

> It's a question of where to focus the limited resources we've got. My
> impression is that going with the combination of Zotero and zotxt will
> represent the least amount of effort to get citations working on
> non-LaTeX backends, for both Org developers and users.... I fully
> support that. But until more people have time to work on this, it
> seems to me that Zotero and zotxt represent the most practical path
> forward.

That makes a lot of sense. Thanks for all the work you've already put
into this. I'm happy to help out wherever I can.

Matt