From: Nicolas Goaziou
Subject: Re: Smart Quotes Exporting
Date: Tue, 19 Jun 2012 11:26:34 +0200
Message-ID: <87ipenps8l.fsf@gmail.com>
In-Reply-To: (Mark Shoulson's message of "Fri, 15 Jun 2012 16:20:43 +0000 (UTC)")
To: Mark Shoulson
Cc: emacs-orgmode@gnu.org

Hello,

Mark Shoulson writes:

> Well, wait; regexps can make some pretty darn good guesses at the beginnings
> or ends of strings.

I know that. They do a good job, I just want a better one.

> This isn't quite it; beginning-of-string followed by quote, then punctuation
> and then spaces is also a close-quote, etc... There is a lot of fine-tuning.
> But even what I currently have was able to handle your
>
> Caesar said, "/Alea Jacta est./"
>
> example.

No, it doesn't handle that, actually; it's just sheer luck. Indeed, the
quoting function is applied to "\"". There's absolutely no space,
punctuation, etc. to save the day. So it makes a wild guess with
a probability of 0.5 of success. Since the guess is always the same,
"/a/" will always fail.

> The case where a quote both sits at the edge of a string (i.e. at the border
> of some element, formatting, etc) *and* does not have whitespace next to it,
> with possible punctuation, does not seem to be a normal occurrence to me. If
> I'm wrong, how common *is* it?

Even if it rarely happens, it can be _very_ annoying to have to cope
with bad guesses. If it can be avoided, I see no reason not to do so.

Now, here is the infrastructure I propose. Internally, the two following
functions are required.

#+begin_src emacs-lisp
(defun org-export--smart-quotes-in-element (element backend)
  "Replace plain quotes with smart quotes in ELEMENT.

ELEMENT is an Org element or a secondary string.  BACKEND is the
back-end to check for rules, as a symbol.

This is a destructive operation.  Return new element."
  (let* ((type (org-element-type element))
         (properties (and type (nth 1 element))))
    ;; Destructively apply changes to secondary string, if any.
    (let ((secondary (and type
                          (assq type org-element-secondary-value-alist))))
      (when secondary
        (let* ((sec-symbol (cdr secondary))
               (sec-value (plist-get properties sec-symbol)))
          (when sec-value
            (setq properties
                  (plist-put properties
                             sec-symbol
                             (org-export--smart-quotes-in-element
                              sec-value backend)))))))
    ;; Destructively change `:caption' if present.  Since it's a dual
    ;; keyword, apply smart quotes to both CAR and CDR, if required.
    ;; Note that `plist-get' takes the plist first, then the property.
    (let ((caption (plist-get properties :caption)))
      (when caption
        (setq properties
              (plist-put properties
                         :caption
                         (cons (org-export--smart-quotes-in-element
                                (car caption) backend)
                               (and (cdr caption)
                                    (org-export--smart-quotes-in-element
                                     (cdr caption) backend)))))))
    ;; Recursively apply changes to contents.  Rebuild ELEMENT along
    ;; the way, with updated strings.
    (let ((contents (if type (org-element-contents element) element))
          previous current next acc)
      (while contents
        (setq current (pop contents)
              next (car contents))
        (push
         (cond
          ((stringp current)
           ;; CURRENT is a string: Call
           ;; `org-export-quotation-marks' with appropriate
           ;; information.
           (org-export-quotation-marks
            current
            (and previous
                 (if (stringp previous)
                     (length (and (string-match " +\\'" previous)
                                  (match-string 0 previous)))
                   (org-element-property :post-blank previous)))
            (and next
                 (if (not (stringp next)) 0
                   (length (and (string-match "\\` +" next)
                                (match-string 0 next)))))
            backend))
          ;; CURRENT is recursive: Move into it.  The test is made on
          ;; CURRENT, not on the properties of ELEMENT.
          ((org-element-property :contents-begin current)
           (org-export--smart-quotes-in-element current backend))
          ;; Otherwise, just accumulate CURRENT.
          (t current))
         acc)
        ;; Only now does CURRENT become PREVIOUS, so that the next
        ;; iteration sees its real preceding sibling.
        (setq previous current))
      ;; Re-build transformed element.
      (if (or (not type) (eq type 'plain-text)) (nreverse acc)
        (nconc (list type properties) (nreverse acc))))))

(defun org-export-set-smart-quotes (tree backend info)
  "Replace plain quotes with smart quotes in TREE.

BACKEND is the back-end, as a symbol, used for transcoding.  INFO
is a plist used as a communication channel.

This is a destructive operation.  This function is meant to be
used as a parse tree filter for back-ends activating smart
quotes."
  ;; Destructively apply smart quotes to parsed keywords in info.
  (let ((value (plist-get info :title)))
    (when value
      (setq info
            (plist-put info
                       :title
                       (org-export--smart-quotes-in-element value backend)))))
  ;; Apply smart quotes in elements containing plain text or
  ;; secondary strings across the parse tree.
  (org-element-map
   tree '(paragraph verse-block table-cell headline inlinetask item)
   (lambda (el)
     (org-export-set-element
      el (org-export--smart-quotes-in-element el backend))))
  ;; Return parse tree.
  tree)
#+end_src

Then, all that is left to do is write the function replacing quotes in
a string, with additional information:

#+begin_src emacs-lisp
(defun org-export-quotation-marks (s &optional prev next backend)
  "Replace plain quotes with smart quotes in string S.

Optional argument PREV (resp. NEXT) is the number of white space
characters before (resp. after) the string, or nil if S starts
\(resp. ends) a paragraph.  Optional argument BACKEND is a symbol
representing the back-end to use for substitutions.

The function returns the new string."
  ...)
#+end_src

Once this function is written, add `org-export-set-smart-quotes' as
a parse tree filter in `org-BACKEND-filters-alist'. For example, one can
add the following in org-e-latex.el to activate smart quotes in LaTeX
export:

#+begin_src emacs-lisp
(defconst org-e-latex-filters-alist
  '((:filter-parse-tree . org-export-set-smart-quotes))
  "Alist between filters keywords and back-end specific filters.
See `org-export-filters-alist' for more information.")
#+end_src

Could you please try to modify your original
`org-export-quotation-marks' accordingly and test it?

>> Yes. You may want to use `org-element-at-point' and `org-element-type'
>> to tell if you're somewhere smart quotes are allowed (in table,
>> table-row, paragraph, verse-block elements).
>
> Probably. I think I saw some other package make these decisions by peeking at
> the formatting and seeing if it is set in comment-face or something, but
> checking the element at point is presumably more sensible.

Thinking about it, looking at the face used will definitely be faster,
though. That's your call.


Regards,

-- 
Nicolas Goaziou
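
The body of `org-export-quotation-marks' is left open above. As
a minimal sketch only, and not the function from this thread, the
following shows one way PREV and NEXT could settle quotes sitting at the
edge of a string. The name `my/smart-quotes-sketch', the LaTeX-style
output (`` and '') and the sets of characters treated as blank are
assumptions made for illustration; BACKEND is left out entirely.

```emacs-lisp
;; Hypothetical helper, for illustration only.
(defun my/smart-quotes-sketch (s &optional prev next)
  "Return a copy of S with double quotes turned into LaTeX quotes.
PREV and NEXT follow the convention described for
`org-export-quotation-marks': a number of white space characters
before or after S, or nil when S starts or ends a paragraph."
  (let ((len (length s))
        (pos 0)
        (parts '()))
    (while (< pos len)
      (let ((ch (aref s pos)))
        (push
         (if (/= ch ?\")
             (string ch)
           (let ((blank-before
                  (if (> pos 0)
                      ;; Inside S: look at the preceding character.
                      (memq (aref s (1- pos)) '(?\s ?\t ?\n ?\( ?\[))
                    ;; At the start of S: rely on PREV.
                    (or (null prev) (> prev 0))))
                 (blank-after
                  (if (< (1+ pos) len)
                      ;; Inside S: look at the following character.
                      (memq (aref s (1+ pos)) '(?\s ?\t ?\n))
                    ;; At the end of S: rely on NEXT.
                    (or (null next) (> next 0)))))
             ;; Assumed rule: a quote opens when something blank
             ;; precedes it and something non-blank follows it.
             (if (and blank-before (not blank-after)) "``" "''")))
         parts))
      (setq pos (1+ pos)))
    (apply #'concat (nreverse parts))))
```

With the Caesar example, the paragraph is split around the emphasis
object, so the sketch would be called as
(my/smart-quotes-sketch "Caesar said, \"" nil 0) for the first string and
(my/smart-quotes-sketch "\"" 0 nil) for the lone closing quote. A PREV of
0 tells it that the quote directly follows an object, which is the piece
of information a regexp applied to "\"" alone cannot have.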