From mboxrd@z Thu Jan  1 00:00:00 1970
From: Nicolas Goaziou
Subject: Re: Smart Quotes Exporting
Date: Fri, 01 Jun 2012 19:11:29 +0200
Message-ID: <878vg72bzy.fsf@gmail.com>
References: <4FBB08CA.5060705@kli.org> <87d35u8rvk.fsf@gmail.com>
 <4FBDA56E.5030901@kli.org> <87zk8w6v4q.fsf@gmail.com>
 <4FC00CE0.6060308@kli.org> <87r4u75tg9.fsf@gmail.com>
 <4FC426AC.2030109@kli.org> <87ehq227ky.fsf@gmail.com>
 <4FC56F1B.5040201@kli.org> <87r4u031ye.fsf@gmail.com>
 <4FC7FE2C.6040702@kli.org>
Mime-Version: 1.0
Content-Type: text/plain
In-Reply-To: <4FC7FE2C.6040702@kli.org> (Mark E. Shoulson's message of
 "Thu, 31 May 2012 19:26:36 -0400")
List-Id: "General discussions about Org-mode."
To: "Mark E. Shoulson"
Cc: emacs-orgmode@gnu.org

Hello,

"Mark E. Shoulson" writes:

> Oh, certainly; they're all a disaster.  I think I said that in the
> writeup at the top.  This is just proof of concept, nothing is in the
> right place, nothing is properly documented.  They have to be
> defcustoms, there needs to be a good :type in the defcustom as well as
> a proper docstring.
> You'll get no argument from me about the lack (or
> inaccuracy) of docstrings and such.  I hadn't gotten that far yet.
> I said the patch was only if you wanted to tinker with the development
> as this progresses.

No worries, I was just making some comments before forgetting about
them.

>> +(defun org-e-latex--quotation-marks (text info)
>> +  (org-export-quotation-marks text info org-e-latex-quote-replacements))
>> +  ;; (mapc (lambda(l)
>> +  ;;         (let ((start 0))
>> +  ;;           (while (setq start (string-match (car l) text start))
>> +  ;;             (let ((new-quote (concat (match-string 1 text) (cdr l))))
>> +  ;;               (setq text (replace-match new-quote t t text))))))
>> +  ;;       (cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
>> +  ;;                ;; Falls back on English.
>> +  ;;                (assoc "en" org-e-latex-quotes))))
>> +  ;; text)
>
>> Use directly `org-e-latex-quote-replacements' in code then.
>
> Not sure I understand this comment.

Since `org-e-latex--quotation-marks' just calls
`org-export-quotation-marks', you can completely remove the former from
"org-export.el" and use the latter instead.

> So... there's the filter-parse-tree-functions hook gets applied within
> the parse tree... so a back-end can add a function to that list which
> looks over the parse-tree and watches for these border cases (and also
> the ones within ordinary strings).  Looks like it's going to be tough
> to work in any flexibility to define further per-language or
> per-backend cleverness to handle anything beyond the "canonical set"
> of open-double, close-double, open-single, close-single, and mid-word.
>
> To be sure, anything we do will most assuredly fail even on some
> fairly reasonable input, in which case the users are pretty much on
> their own and will have to do things the hard way.
> And I could use
> that as the answer here, that, "well, it'll work only within
> plain-text strings" (and I might possibly still have to use that
> answer), but I would rather include the situations you bring up in the
> supported set and not throw up my hands at it.  So, yes, will look at
> that.

Actually it isn't very hard to handle this problem.  But it will be
different than the fontification used in an Org buffer.

You might want to look at `org-element-normalize-contents', which solves
a similar problem: removing maximum common indentation at the parsed
paragraph level.

As a first approximation, I can imagine a function accepting an element,
an object or a secondary string and returning an equivalent element,
object or secondary string, with its quotes "smartified".

The algorithm could go like this:

Walk element/object/secondary-string's contents.

1. When a string is encountered:

   1. If it has a quote at its first or last position, check for
      objects before or after the string to guess its status.  An
      object never starts with a white space, but you may have to check
      :post-blank property in order to know if previous object had
      white spaces at its end.

   2. For each quote everywhere else in the string, your regexp can
      handle it fine.

2. When an object belonging to `org-element-recursive-objects' is
   encountered, apply the function to this object.

3. Accumulate returned strings or objects.

Use accumulated data as the contents of the new object to return
(i.e. just add the type and the same properties at the beginning of this
list if it was an object or an element, return it as-is if that was
a secondary string).

On the elements side, only paragraphs, verse-blocks and table-rows can
directly contain quotes.  Also, headline, inlinetask, item and
footnote-reference have secondary strings containing quotes.

I'm not sure yet where and how to install such a function, but I will
think about it when it is implemented.


Regards,

-- 
Nicolas Goaziou
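
A rough sketch of the walk outlined above, for illustration only.  It
makes several assumptions that are not in the message: only straight
double quotes are handled, the LaTeX-style marks are hard-coded, and a
quote is classified by looking only at what precedes it (the message
also suggests looking at what follows).  All `sketch-' names are
hypothetical; `org-element-type', `org-element-contents',
`org-element-property' and `org-element-recursive-objects' come from
org-element.el.

```elisp
(require 'org-element)

(defun sketch-open-context-p (prev)
  "Return non-nil if a quote right after PREV should be an opening quote.
PREV is the previous sibling: nil, a string, or an object."
  (cond ((null prev) t)
        ((stringp prev)
         (or (string= prev "")
             (memq (aref prev (1- (length prev))) '(?\s ?\t ?\n))))
        ;; An object never starts with white space, but it may end with
        ;; some, which is recorded in its :post-blank property.
        (t (> (or (org-element-property :post-blank prev) 0) 0))))

(defun sketch-smartify-string (str open-first-p)
  "Return STR with straight double quotes replaced by LaTeX-style ones.
OPEN-FIRST-P tells whether a quote at position 0 is an opening quote.
Elsewhere, a quote opens after white space or an opening bracket and
closes otherwise (a simplifying assumption)."
  (let ((i 0) (parts nil))
    (while (< i (length str))
      (let ((ch (aref str i)))
        (push (cond ((/= ch ?\") (string ch))
                    ((if (= i 0) open-first-p
                       (memq (aref str (1- i)) '(?\s ?\t ?\n ?\( ?\[)))
                     "``")
                    (t "''"))
              parts))
      (setq i (1+ i)))
    (apply #'concat (nreverse parts))))

(defun sketch-smartify-contents (contents)
  "Return a copy of CONTENTS, a list of strings and objects, smartified."
  (let ((prev nil) (result nil))
    (dolist (datum contents (nreverse result))
      (push (cond
             ;; Step 1: strings; the first position depends on PREV.
             ((stringp datum)
              (sketch-smartify-string datum (sketch-open-context-p prev)))
             ;; Step 2: recurse into recursive objects.
             ((memq (org-element-type datum) org-element-recursive-objects)
              (sketch-smartify datum))
             ;; Any other object is kept untouched.
             (t datum))
            ;; Step 3: accumulate.
            result)
      (setq prev datum))))

(defun sketch-smartify (data)
  "Return a copy of DATA with its double quotes smartified.
DATA is a string, an element or object, or a secondary string."
  (cond
   ((stringp data) (sketch-smartify-string data t))
   ;; Element or object, i.e. (TYPE PROPERTIES CONTENTS...): keep the
   ;; type and properties, rebuild the contents.
   ((and (consp data) (symbolp (car data)))
    (append (list (car data) (nth 1 data))
            (sketch-smartify-contents (org-element-contents data))))
   ;; Secondary string: a bare list of strings and objects.
   (t (sketch-smartify-contents data))))
```

For instance, with a hand-written secondary string,

  (sketch-smartify
   '("He said \"hi\" and " (bold (:post-blank 0) "\"bye\"") "."))

should return

  ("He said ``hi'' and " (bold (:post-blank 0) "``bye''") ".")

A real implementation would take the replacement strings from the
back-end and language instead of hard-coding them.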