From mboxrd@z Thu Jan 1 00:00:00 1970
From: Richard Lawrence
Subject: Re: Citation syntax: a revised proposal
Date: Wed, 25 Feb 2015 08:57:51 -0800
Message-ID: <87twya2ak0.fsf@berkeley.edu>
References: <87k2zjnc0e.fsf@berkeley.edu> <87bnkvm8la.fsf@berkeley.edu>
 <87zj8co3se.fsf@berkeley.edu> <87ioezooi2.fsf@berkeley.edu>
 <87mw4bpaiu.fsf@nicolasgoaziou.fr> <8761aznpiq.fsf@berkeley.edu>
 <87twyjnh0r.fsf@nicolasgoaziou.fr> <87oaopx24e.fsf@berkeley.edu>
 <87k2zd4f3w.fsf@nicolasgoaziou.fr> <87egpkv8g9.fsf@berkeley.edu>
 <877fv6xfaq.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
List-Id: "General discussions about Org-mode."
To: emacs-orgmode@gnu.org

Hi Aaron,

Aaron Ecay writes:

> Speaking for myself, I think the discussion so far has revealed a
> number of “advanced” uses of citations, such as possessive citations,
> citations as footnotes, the insertion of only author/year/etc., ...
> At least for academic publishing, citations are pretty demanding and
> there isn’t much room for “close enough;” a paper’s citations either
> conform to a particular style guide or they don’t.
>
> I think these various applications of citations, and others not yet
> mentioned or thought of, are best represented as binary switches. Many
> of these distinctions will factor well into independent implementations.
> For example, a citation that is :footnote t can (probably) be generated
> by taking the citation, whatever it is, and wrapping it in
> \footnote{...}. (For the latex case; other backends will have different
> specifics but the idea is the same.) If this is implemented in terms of
> subtypes, it will lead to an explosion of 2^n subtypes being necessary.

Yes, I share this worry.

> Of course, not all 2^n combinations will be realized (I don’t think it
> makes sense for a citation to be both possessive and a footnote, for
> example). Ultimately, it’s an empirical question how well different
> types of citation factor, and how many of the combinations make sense or
> are ever realized.

Indeed. It is hard to tell in advance how much of a practical concern
this is, but I can see subtypes becoming unpleasant to use unless one
thinks ahead carefully about all the kinds of subtypes one needs, and
how to relate them.

It would be good to hear from people who know already that they will
need something like subtypes or arbitrary key-value properties.
How would you folks prefer to trade off between these options?

1) subtypes which are specified by a label and easy to write handlers
   for, but potentially introduce a lot of redundancy in handling
   different formatting properties

2) arbitrary key-value pairs (one of which could be `:type t'), which
   are harder to write handlers for, but introduce less redundancy and
   make it easier to handle formatting properties individually

> Nicolas has given reasons why the inline attr syntax is needed
> independently. I think no-subtype citations + inline attr is a superset
> of with-subtype citations. I’d rather see the superset be implemented.
> Subtypes would constrain the expressivity of citations and lead to
> more fragile implementations. Since we’re designing the syntax from
> scratch, I would like to avoid that.

> However, the most important thing is to implement something. The
> semipermanent beta status of master allows a period of experimentation
> with a citation syntax before something is made official in a release.

Agreed. I'd like to see an implementation of a parser for the
[cite:...] part of the syntax as a first step. If we can get that far,
I'd guess that extending the parser to include either a subtype label or
{:key val ...} syntax will not be too difficult to do.

I am OK with Elisp, but I should probably not lead the charge here,
since I am not very familiar with how org-element works internally. If
I hack this up myself, it will probably take longer and result in a
less-acceptable patch than if someone more experienced does it. But if
no one volunteers, I will start to work on it, with what skills I've
got.

Erik Hetzner started work on an Elisp parser for Pandoc syntax, which is
here:

https://gitlab.com/egh/org-pdcite/blob/master/org-pdcite.el

Erik seems to have taken a break from the discussion since we moved away
from the Pandoc syntax, but this might be a good starting point, either
for him or for someone else.
> PS A note on implementation: I envision a sort of pattern matching on
> key-value combinations. Something like:
>
> (((:possessive t :footnote t) (error "wtf"))
>  ;; the generated citation command will be inserted at the %s
>  ((:footnote t) (wrap "\footnote{%s}"))
>  ;; slightly artificial example to illustrate pattern matching with binding
>  ((:color _c) (wrap (format "\color{%s}{%%s}" _c)))
>  ((:possessive t) (cite "\citeposs{%s}" ...))
>  ;; cite provides a list of four format strings for the
>  ;; (non-)capitalized (non-)parenthesized
>  ;; variants encoded in the citation type
>  (default (cite "\cite{%s}" "\parencite{%s}" "\Cite{%s}" "\Parencite{%s}")))

I like this idea. Especially if there is fall-through in the `wrap'
clauses, it would make it pretty easy to write your own handlers for
arbitrary key-value pairs atop the default handlers, though one would
still have to be careful because the order of the clauses would be
significant.

> Where the list of attributes is pattern-matched, and the first matching
> cite command is composed with all matching wrap commands. I’ve just
> shown one-place format strings for the cite key, but a full
> implementation would have to handle pre- and post-note. It would
> probably also need to handle multicites as a fifth type (or set of 4
> types). Though it’s worth considering whether the latex \multicite
> family of commands provides anything above and beyond a series of
> sequential \cite’s. It might be possible to handle multicites by just
> using elisp to concatenate individual citation commands, and not letting
> them vary by backend.
>
> The specifics of whether cite and wrap are sufficient primitives needs
> to be decided on. Probably we need to allow functions not just format
> strings, for the benefit of non-latex backends where the citation needs
> to be formatted by emacs. Then cite and wrap would just be predefined
> shortcuts, with the ability to drop into full elisp for more complicated
> cases.
>
> A small version of the 2^n problem is already visible: the 4 types of
> citation necessitate providing 4 strings/functions for the default case,
> and also for the possessive case (though I think this is unavoidable
> under any implementation).
>
> This is a very rough sketch, but I hope it helps stimulate thinking.
> There’s already a pattern matching library in emacs (pcase.el), though
> it would need to be extended for plist pattern matching.

Yes, that is all food for further thought. Thanks for illustrating the
idea!

Best,
Richard
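
A minimal sketch of the matching scheme quoted above, to make "the first
matching cite command is composed with all matching wrap commands"
concrete. Everything named here is hypothetical: the `my-cite-' prefix,
the flat (PATTERN KIND FORMAT-STRING) clause shape, and the choice of
exact-value matching are assumptions for illustration, not part of the
proposal. The sketch leaves out the error clause, binding patterns such
as `_c', pre- and post-notes, and the four capitalized/parenthesized
variants.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)

;; Hypothetical clause table: each entry is (PATTERN KIND FORMAT-STRING).
;; PATTERN is a plist of required attributes, or the symbol `default'.
;; KIND is `cite' (produces the command) or `wrap' (wraps the result).
(defvar my-cite-clauses
  '(((:possessive t) cite "\\citeposs{%s}")
    ((:footnote t)   wrap "\\footnote{%s}")
    (default         cite "\\cite{%s}"))
  "Assumed clause table; clause order is significant.")

(defun my-cite--match-p (pattern attrs)
  "Return non-nil if every key/value pair of PATTERN occurs in ATTRS.
The symbol `default' matches any ATTRS."
  (or (eq pattern 'default)
      (cl-loop for (key val) on pattern by #'cddr
               always (equal (plist-get attrs key) val))))

(defun my-cite-format (key attrs &optional clauses)
  "Format citation KEY according to the plist ATTRS.
The first matching `cite' clause in CLAUSES produces the command;
every matching `wrap' clause is then applied around it, in list
order, so earlier wrap clauses end up innermost."
  (let* ((clauses (or clauses my-cite-clauses))
         (cite (cl-find-if (lambda (clause)
                             (and (eq (nth 1 clause) 'cite)
                                  (my-cite--match-p (car clause) attrs)))
                           clauses)))
    (unless cite
      (error "No cite clause matches %S" attrs))
    (let ((result (format (nth 2 cite) key)))
      (dolist (clause clauses result)
        (when (and (eq (nth 1 clause) 'wrap)
                   (my-cite--match-p (car clause) attrs))
          (setq result (format (nth 2 clause) result)))))))

;; Usage:
;; (my-cite-format "doe99" '(:footnote t))
```

Because the wrap clauses fall through, a user could add a clause for a
new key without touching the default handlers, at the cost of having to
think about where in the list it goes. Allowing functions in place of
format strings, as suggested for non-latex backends, would mean calling
the third element instead of passing it to `format'.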