From: Aaron Ecay
Subject: Re: Citation syntax: a revised proposal
Date: Tue, 10 Mar 2015 23:21:33 -0300
Message-ID: <871tkwmh87.fsf@gmail.com>
In-Reply-To: <87ioe8yftp.fsf@berkeley.edu>
List-Id: "General discussions about Org-mode."
To: Richard Lawrence
Cc: emacs-orgmode@gnu.org

Hi Richard,

Thanks for your comments, and for your work on an implementation.

On 10 March 2015, Richard Lawrence wrote:

> I have actually been working on the same problem, using citeproc-hs as
> the CSL processor instead of citeproc-java.

This is an interesting approach. What version of citeproc-hs are you
using? The version under that name is no longer maintained, and I had
some trouble getting it to build. The pandoc fork (under the name
pandoc-citeproc) seemed to me to lack command-line functionality. (But
I only looked briefly and could have missed something.)

> I took a (very) brief look at your code; it seems like you are only
> communicating with citeproc-java via command line arguments and stdout.
> Is that right?

Yes.

> My approach to the problems you mention has been the following:
> 1) Generate JSON from citation objects on the Org side.
> 2) Pass that JSON to the processor via stdin.
> 3) Pass the output format, the CSL file, and the bibliography database
>    to the processor via command line arguments.
> 4) Return, to stdout, the formatted citations and the bibliography.
>    These are formatted such that there is one citation or entry per line,
>    and a recognizable separator separates the citations from the
>    bibliography.
>
> This allows passing formatting options for individual citations via the
> JSON object for that citation, so it allows citeproc-hs to do more of
> the work of formatting citations.
>
> (See http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#citation-data-object
> for documentation of the citation data JSON format.)

Very interesting. But it looks like the CSL standard does not
distinguish parenthesized from unparenthesized citations, or capitalized
from uncapitalized ones, whereas biblatex (taken as the best
representative of the LaTeX family of citation processors) does. I
think we have decided we need to support these. So we will always need
to do some post-processing of the CSL output. Then the question arises
(for example) whether it’s better to let CSL/citeproc handle the prefix
and suffix, or to do it ourselves.
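
(For illustration only: a minimal Python sketch of the stdin/stdout
scheme quoted above, not code from either implementation. It assumes the
citation data object layout described in the citeproc-js documentation
linked above. The `=====` separator, the function names, and the choice
to send all citations as a single JSON array are hypothetical; the quoted
description only says that the separator is "recognizable".)

```python
import json

# Hypothetical separator line; the real one is not specified in this thread.
SEPARATOR = "====="


def make_citation(items, note_index=0):
    """Build one citation data object in the citeproc-js JSON layout.

    Each element of `items` is a dict with at least an "id" key and,
    optionally, per-cite options such as "prefix", "suffix" or
    "suppress-author".
    """
    return {
        "citationItems": list(items),
        "properties": {"noteIndex": note_index},
    }


def encode_citations(citations):
    """Serialize the citations as one JSON array (assumed stdin payload)."""
    return json.dumps(citations)


def split_output(text, separator=SEPARATOR):
    """Split processor output into (citations, bibliography entries).

    Expects one citation or entry per line, with a separator line between
    the two groups. Raises ValueError if the separator line is missing.
    """
    lines = text.splitlines()
    idx = lines.index(separator)
    citations = lines[:idx]
    bibliography = [line for line in lines[idx + 1:] if line]
    return citations, bibliography
```

With this layout, a prefix or suffix can either travel inside the JSON
(so the processor formats it) or be attached afterwards to the strings
returned by `split_output`, which is exactly the trade-off in question.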
I don’t think we can decide this without looking at a working “sketch”
of both implementations. It would be very good to see your draft code.

> I don't know whether this will ultimately be a good design, but the way
> I am picturing it right now is that exporting citations will work sort
> of like footnotes: the exporter will gather them all together as they
> are encountered, then generate the JSON and run a single call to the CSL
> processor at the end of the export process. It can then replace the
> citations in the document with the result from the CSL processor, and
> insert the bibliography at the end of the document.

My code does something similar. It processes all citations at the
beginning of export and stashes the data in the info plist, so that
it’s available to transcoders during the “main” export process. IDK if
footnotes are handled in the same way, or rather processed in a late
step after the transcoders. But it’s six of one, half a dozen of the
other I think.

> (The code is not very pretty yet, but it does generate citations and
> bibliographies in both plain text and HTML, and it would be
> straightforward to extend it to other output formats. I can post it
> somewhere if anyone is interested in taking a look.)

Please do!

>> Some people have talked about supporting other CSL processors. I don’t
>> see much utility in that, since CSL is a standard that all processors
>> should implement faithfully.
>
> Indeed! Though as you have observed, `should' and `already does' come
> apart. I doubt there are any implementations that are perfectly
> complete. So it may be worth thinking about how Org can talk to CSL
> processors in a processor-independent way. That way, different users
> can use different CSL processors if one works better for their
> particular document or environment.

I take your point, but any differences between the implementations just
make it potentially harder to be processor-independent. I think we
should tightly integrate with one processor, working around whatever
warts it may have.

> I think the generate-and-pass-JSON approach is promising for that
> reason. That is what citeproc-js accepts as input (so maybe that is
> what citeproc-java is doing internally?), and my code aims to allow
> citeproc-hs to interpret the same JSON format as citeproc-js.

... hmm. Do you mean you’ve written Haskell code?

> I don't know Ruby, but I think it would be easy to make citeproc-ruby
> accept the same JSON format. Do you have a sense of how easy it would
> be to coax citeproc-java into accepting JSON on stdin?

My understanding is that citeproc-java in its current form can read
JSON database files (in addition to BibTeX). However, it does not
accept JSON to control the output of citations; it merely allows passing
a key, for which a “default” citation will be generated (no
prefix/suffix/author suppression/...). It would not be easy for me to
extend it, because I’m not fluent in Java.

Thanks,

--
Aaron Ecay