From mboxrd@z Thu Jan  1 00:00:00 1970
From: Richard Lawrence <richard.lawrence@berkeley.edu>
Subject: Re: Citation syntax: a revised proposal
Date: Wed, 11 Mar 2015 10:33:52 -0700
Message-ID: <87sidb8mm7.fsf@berkeley.edu>
References: <87k2zjnc0e.fsf@berkeley.edu> <87bnkvm8la.fsf@berkeley.edu>
	<m21tlrot9o.fsf@tsdye.com> <87zj8co3se.fsf@berkeley.edu>
	<m2iof0gdnc.fsf@tsdye.com> <87ioezooi2.fsf@berkeley.edu>
	<87mw4bpaiu.fsf@nicolasgoaziou.fr> <8761aznpiq.fsf@berkeley.edu>
	<87twyjnh0r.fsf@nicolasgoaziou.fr> <87oaopx24e.fsf@berkeley.edu>
	<87k2zd4f3w.fsf@nicolasgoaziou.fr>
	<87egpkv8g9.fsf@berkeley.edu> <877fv6xfaq.fsf@gmail.com>
	<87twya2ak0.fsf@berkeley.edu> <87zj81aa97.fsf@nicolasgoaziou.fr>
	<87ioep2r6p.fsf@berkeley.edu>
	<87y4ngbgm7.fsf@nicolasgoaziou.fr> <87bnkbi61v.fsf@gmail.com>
	<87a8zlmujp.fsf@gmail.com> <87ioe8yftp.fsf@berkeley.edu>
	<871tkwmh87.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Return-path: <emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org>
Received: from eggs.gnu.org ([2001:4830:134:3::10]:35376)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <richard.lawrence@berkeley.edu>) id 1YVkWl-0003nD-FS
	for emacs-orgmode@gnu.org; Wed, 11 Mar 2015 13:34:41 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <richard.lawrence@berkeley.edu>) id 1YVkWc-0001Uc-VF
	for emacs-orgmode@gnu.org; Wed, 11 Mar 2015 13:34:39 -0400
Received: from mail-pd0-f172.google.com ([209.85.192.172]:33246)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <richard.lawrence@berkeley.edu>) id 1YVkWc-0001Tm-NQ
	for emacs-orgmode@gnu.org; Wed, 11 Mar 2015 13:34:30 -0400
Received: by pdev10 with SMTP id v10so12884317pde.0
	for <emacs-orgmode@gnu.org>; Wed, 11 Mar 2015 10:34:28 -0700 (PDT)
In-Reply-To: <871tkwmh87.fsf@gmail.com>
List-Id: "General discussions about Org-mode." <emacs-orgmode.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-orgmode>,
	<mailto:emacs-orgmode-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-orgmode>
List-Post: <mailto:emacs-orgmode@gnu.org>
List-Help: <mailto:emacs-orgmode-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-orgmode>,
	<mailto:emacs-orgmode-request@gnu.org?subject=subscribe>
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
To: Aaron Ecay <aaronecay@gmail.com>
Cc: emacs-orgmode@gnu.org

Hi Aaron and all,

I cleaned up my efforts a bit and posted them here:

https://github.com/wyleyr/org-citeproc

(This program is just a modified version of John MacFarlane's citeproc
program:

https://github.com/jgm/citeproc/

which reads JSON in a slightly different format, and produces JSON
instead of fully-rendered output.)

I wrote a few functions to produce JSON from (the current implementation
of) citations in Org; these live in json.el.  Note that I haven't been
able to test the new parser because I am on Emacs 23, so these have only
been tested with my hand-constructed Lisp objects, not with parser
output...they might need a little adjusting.  I also haven't done
anything to hook them up to the export process.

Aaron Ecay <aaronecay@gmail.com> writes:

> ... hmm.  Do you mean you’ve written Haskell code?

Yes, but not very much.  As long as I only have to write pure functions,
it seems to go alright. :)

I went with Haskell purely for practical reasons:
  1) I at least know a little Haskell, and I've been wanting to learn more;
     I don't know Java, Ruby, or JS
  2) I already have the Haskell platform installed, but I don't have
     Java, Ruby, or a JS engine (other than a browser) installed

> On 10 March 2015, Richard Lawrence wrote:
>> I have actually been working on the same problem, using citeproc-hs as
>> the CSL processor instead of citeproc-java.
>
> This is an interesting approach.  What version of citeproc-hs are you
> using?  The version under that name is no longer maintained, and I had
> some trouble getting it to build.

I am in fact using the version under that name (I have not had trouble
installing/building it via cabal).  I went this way because it seemed to
have everything we'd need, and I wanted to avoid a dependency on all of
pandoc.  Maybe this was shortsighted of me, though...

> The pandoc fork (under the name pandoc-citeproc) seemed to me to lack
> command-line functionality.  (But I only looked briefly and could have
> missed something.)

Well, pandoc-citeproc does read and produce JSON at the command line,
but it takes (and returns) a complete pandoc document.

I think what I've got would be pretty easy to port to pandoc-citeproc,
if maintenance of citeproc-hs is an issue.  If a full dependency on
pandoc is OK, we could also just use Pandoc's rendering functions for
the various backends we wish to support.

>> My approach to the problems you mention has been the following:
>>   1) Generate JSON from citation objects on the Org side.
>>   2) Pass that JSON to the processor via stdin.
>>   3) Pass the output format, the CSL file, and the bibliography database
>>      to the processor via command line arguments.
>>   4) Return, to stdout, the formatted citations and the bibliography.
>>      These are formatted such that there is one citation or entry per line,
>>      and a recognizable separator separates the citations from the
>>      bibliography.
>>
>> This allows passing formatting options for individual citations via the
>> JSON object for that citation, so it allows citeproc-hs to do more of
>> the work of formatting citations.
>>
>> (See http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#citation-data-object
>> for documentation of the citation data JSON format.)
>
> Very interesting.  But it looks like the CSL standard does not
> differentiate parenthesized/not and capitalized/not citations, whereas
> biblatex (taken as the best representative of the latex family of
> citation processors) does.

Yikes, really?  Surely the CSL standard has something to say about this.
It seems like it would be pretty useless otherwise, and there are all
those options for representing names...

Anyway, whether it's standard or not, citeproc-hs has an `authorInText'
option for citations, which I am using to capture the
parenthetical-vs-in-text distinction.  I haven't thought about
capitalization, though.
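
To make the idea concrete, here is a rough sketch of what one citation
might look like once serialized.  It is written in Python purely for
illustration (the real code is Emacs Lisp in json.el).  The
`citationItems' and `properties' keys follow the citation data format
linked above; where exactly the `authorInText' flag belongs in the JSON
is an assumption here, and `citation_to_json' is a made-up name.

```python
import json


def citation_to_json(keys, author_in_text=False, note_index=0):
    """Serialize one citation as a citation-data-style JSON string.

    `keys` are the bibliography keys cited together in one citation.
    """
    citation = {
        "citationItems": [{"id": key} for key in keys],
        "properties": {"noteIndex": note_index},
    }
    if author_in_text:
        # Assumption: the flag is attached to each cited item.  The
        # processor may expect it somewhere else entirely.
        for item in citation["citationItems"]:
            item["authorInText"] = True
    return json.dumps(citation, sort_keys=True)


if __name__ == "__main__":
    print(citation_to_json(["doe2015"], author_in_text=True))
```

A capitalization flag, if we end up needing one, could presumably travel
the same way.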

>> I don't know whether this will ultimately be a good design, but the way
>> I am picturing it right now is that exporting citations will work sort
>> of like footnotes: the exporter will gather them all together as they
>> are encountered, then generate the JSON and run a single call to the CSL
>> processor at the end of the export process.  It can then replace the
>> citations in the document with the result from the CSL processor, and
>> insert the bibliography at the end of the document.
>
> My code does something similar.  It processes all citations at the
> beginning of export and stashes the data in the info plist, so that
> it’s available to transcoders during the “main” export process.

Ah, that sounds like a much better idea than what I was thinking of.
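
Either way, the single processor call returns everything at once, so the
output has to be split back into citations and bibliography entries
before it can be stashed anywhere.  A minimal sketch of that step, again
in Python only for illustration; the separator string and the function
name are hypothetical, not what the program actually prints:

```python
def split_processor_output(output, separator="=====bibliography====="):
    """Split processor output into (citations, bibliography).

    Assumes one formatted citation or bibliography entry per line, with
    a single separator line between the two groups.
    """
    lines = output.splitlines()
    if separator not in lines:
        raise ValueError("separator line not found in processor output")
    cut = lines.index(separator)
    citations = lines[:cut]          # in document order
    bibliography = lines[cut + 1:]   # everything after the separator
    return citations, bibliography


if __name__ == "__main__":
    sample = "(Doe 2015)\nSmith (2014)\n=====bibliography=====\nDoe, J. 2015. A Title.\n"
    cites, bib = split_processor_output(sample)
    print(cites)
    print(bib)
```

Since the citations come back in document order, the Nth line can be
matched to the Nth citation object gathered during export.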

Best,
Richard