From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Mark E. Shoulson"
Subject: Re: Smart Quotes Exporting
Date: Sat, 02 Jun 2012 23:16:00 -0400
Message-ID: <4FCAD6F0.5020209@kli.org>
References: <4FBB08CA.5060705@kli.org> <87d35u8rvk.fsf@gmail.com>
 <4FBDA56E.5030901@kli.org> <87zk8w6v4q.fsf@gmail.com>
 <4FC00CE0.6060308@kli.org> <87r4u75tg9.fsf@gmail.com>
 <4FC426AC.2030109@kli.org> <87ehq227ky.fsf@gmail.com>
 <4FC56F1B.5040201@kli.org> <87r4u031ye.fsf@gmail.com>
 <4FC7FE2C.6040702@kli.org> <878vg72bzy.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------000108080805060407050200"
Return-path:
Received: from eggs.gnu.org ([208.118.235.92]:34466) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Sb1P2-0005Qd-9o for emacs-orgmode@gnu.org; Sat, 02 Jun 2012 23:22:54 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Sb1Oz-0001A8-Dq for emacs-orgmode@gnu.org; Sat, 02 Jun 2012 23:22:51 -0400
Received: from pi.meson.org ([96.56.207.26]:56268) by eggs.gnu.org with smtp (Exim 4.71) (envelope-from ) id 1Sb1Oz-00019x-85 for emacs-orgmode@gnu.org; Sat, 02 Jun 2012 23:22:49 -0400
In-Reply-To: <878vg72bzy.fsf@gmail.com>
List-Id: "General discussions about Org-mode."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
To: Nicolas Goaziou
Cc: emacs-orgmode@gnu.org

This is a multi-part message in MIME format.
--------------000108080805060407050200
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit

All right, preliminary patch is attached, *maybe* good enough for more
serious consideration now, but might need some fixes.  Still only uses
ordinary regexps and plain-text strings, but can now handle the example
with formatting-breaks next to quotes.
Things have been moved into more appropriate locations, made into
customs, docstrings and types fixed, etc.

It supports onscreen display of "smart" quotes (when enabled); I have
the quotes displayed in org-document-info face so they are slightly
distinct, to make it clearer that they are "altered" from what they are
in the plain text.  This may or may not be a popular (or good) idea.

I have also built it into the new export engine in org-e-latex and
org-e-html as proofs of concept.  I'm not positive the latex one will
work properly for German, though; there might need to be something
enabled in LaTeX for it to format ,, into „.

It should probably be set not to smartify quotes onscreen in comments; I
haven't done that yet.

Comments welcome; I hope I didn't complicate matters in the export
engines too much.

~mark

--------------000108080805060407050200
Content-Type: text/x-patch;
 name="0001-Add-smart-quotes-for-onscreen-display-and-for-latex-.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment;
 filename*0="0001-Add-smart-quotes-for-onscreen-display-and-for-latex-.pa";
 filename*1="tch"

>From 1bc507cf69c94d5645436abc6e28e7d96999083e Mon Sep 17 00:00:00 2001
From: Mark Shoulson
Date: Tue, 29 May 2012 23:01:12 -0400
Subject: [PATCH] Add `smart' quotes for onscreen display and for latex and html export

* lisp/org.el: Add `smart' quotes: custom variables to define regexps
to recognize quotes, to define how and whether to display them, and
org-fontify-quotes to display `smart-quote' characters when activated.

* contrib/lisp/org-export.el: Add function org-export-quotation-marks
as a utility function usable by individual exporters to apply `smart'
quotes.

* contrib/lisp/org-e-latex.el: Replace org-e-latex-quotes custom with
org-e-latex-quote-replacements and make org-e-latex--quotation-marks
use the org-export-quotation-marks function in org-export.el.

* contrib/lisp/org-e-html.el: Replace org-e-html-quotes custom with
org-e-html-smart-quote-replacements and enable
org-e-html--quotation-marks, using org-export-quotation-marks function
in org-export.el.
---
 contrib/lisp/org-e-html.el  |   57 ++++++++----------------
 contrib/lisp/org-e-latex.el |   67 ++++++++++-------------------
 contrib/lisp/org-export.el  |   26 +++++++++++
 lisp/org.el                 |  101 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 168 insertions(+), 83 deletions(-)

diff --git a/contrib/lisp/org-e-html.el b/contrib/lisp/org-e-html.el
index 53547a0..d4a505e 100644
--- a/contrib/lisp/org-e-html.el
+++ b/contrib/lisp/org-e-html.el
@@ -1077,37 +1077,24 @@ in order to mimic default behaviour:
 
 ;;;; Plain text
 
-(defcustom org-e-html-quotes
-  '(("fr"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
-     ("\\(\\S-\\)\"" . "~»")
-     ("\\(\\s-\\|(\\|^\\)'" . "'"))
-    ("en"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "``")
-     ("\\(\\S-\\)\"" . "''")
-     ("\\(\\s-\\|(\\|^\\)'" . "`")))
-  "Alist for quotes to use when converting english double-quotes.
-
-The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
-
-For each item in a CONS, the first string is a regexp
-for allowed characters before/after the quote, the second
-string defines the replacement string for this quote."
+(defcustom org-e-html-smart-quote-replacements
+  '(("fr" "« " " »" "‘" "’" "’")
+    ("en" "“" "”" "‘" "’" "’")
+    ("de" "„" "“" "‚" "‘" "’"))
+  "What to export for `smart-quotes'.
+A list of five strings:
+  1. Open double-quotes
+  2. Close double-quotes
+  3. Open single-quote
+  4. Close single-quote
+  5. Mid-word apostrophe"
   :group 'org-export-e-html
   :type '(list
-          (cons :tag "Opening quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote "))
-          (cons :tag "Closing quote"
-                (string :tag "Regexp for char after ")
-                (string :tag "Replacement quote "))
-          (cons :tag "Single quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote "))))
+          (string :tag "Open double-quotes") ; "“"
+          (string :tag "Close double-quotes") ; "”"
+          (string :tag "Open single-quote") ; "‘"
+          (string :tag "Close single-quote") ; "’"
+          (string :tag "Mid-word apostrophe"))) ; "’"
 
 
 ;;;; Compilation
@@ -1497,15 +1484,7 @@ This is used to choose a separator for constructs like \\verb."
   "Export quotation marks depending on language conventions.
 TEXT is a string containing quotation marks to be replaced.  INFO
 is a plist used as a communication channel."
-  (mapc (lambda(l)
-          (let ((start 0))
-            (while (setq start (string-match (car l) text start))
-              (let ((new-quote (concat (match-string 1 text) (cdr l))))
-                (setq text (replace-match new-quote t t text))))))
-        (cdr (or (assoc (plist-get info :language) org-e-html-quotes)
-                 ;; Falls back on English.
-                 (assoc "en" org-e-html-quotes))))
-  text)
+  (org-export-quotation-marks text info org-e-html-smart-quote-replacements))
 
 (defun org-e-html--wrap-label (element output)
   "Wrap label associated to ELEMENT around OUTPUT, if appropriate.
@@ -2729,7 +2708,7 @@ contextual information."
     ;; (format "\\%s{}" (match-string 1 text)) nil t text)
     ;; start (match-end 0))))
     ;; Handle quotation marks
-    ;; (setq text (org-e-html--quotation-marks text info))
+    (setq text (org-e-html--quotation-marks text info))
     ;; Convert special strings.
     ;; (when (plist-get info :with-special-strings)
     ;;   (while (string-match (regexp-quote "...") text)
diff --git a/contrib/lisp/org-e-latex.el b/contrib/lisp/org-e-latex.el
index 67e9197..2543c29 100644
--- a/contrib/lisp/org-e-latex.el
+++ b/contrib/lisp/org-e-latex.el
@@ -687,38 +687,28 @@ during latex export it will output
 
 ;;;; Plain text
 
-(defcustom org-e-latex-quotes
-  '(("fr"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
-     ("\\(\\S-\\)\"" . "~»")
-     ("\\(\\s-\\|(\\|^\\)'" . "'"))
-    ("en"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "``")
-     ("\\(\\S-\\)\"" . "''")
-     ("\\(\\s-\\|(\\|^\\)'" . "`")))
-  "Alist for quotes to use when converting english double-quotes.
-
-The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
-
-For each item in a CONS, the first string is a regexp
-for allowed characters before/after the quote, the second
-string defines the replacement string for this quote."
+(defcustom org-e-latex-quote-replacements
+  '(("en" "``" "''" "`" "'" "'")
+    ("fr" "«~" "~»" "‹~" "~›" "'")
+    ("de" ",," "``" "," "`" "'"))
+  "What to output for quotes.  Each element is a list of six strings.
+The first string specifies the language these quotes apply to (\"en\",
+\"fr\", \"de\", etc.; see the LANGUAGE keyword), and the other five
+define the strings to use for, in order:
+  1. Open double-quotes
+  2. Close double-quotes
+  3. Open single-quote
+  4. Close single-quote
+  5. Mid-word apostrophe"
   :group 'org-export-e-latex
-  :type '(list
-          (cons :tag "Opening quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote "))
-          (cons :tag "Closing quote"
-                (string :tag "Regexp for char after ")
-                (string :tag "Replacement quote "))
-          (cons :tag "Single quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote "))))
-
+  :type '(repeat
+          (list
+           (string :tag "Language code")
+           (string :tag "Open double-quotes")
+           (string :tag "Close double-quotes")
+           (string :tag "Open single-quote")
+           (string :tag "Close single-quote")
+           (string :tag "Mid-word apostrophe"))))
 
 
 ;;;; Compilation
@@ -852,19 +842,8 @@ nil."
                options
                ","))
 
-(defun org-e-latex--quotation-marks (text info)
-  "Export quotation marks depending on language conventions.
-TEXT is a string containing quotation marks to be replaced.  INFO
-is a plist used as a communication channel."
-  (mapc (lambda(l)
-          (let ((start 0))
-            (while (setq start (string-match (car l) text start))
-              (let ((new-quote (concat (match-string 1 text) (cdr l))))
-                (setq text (replace-match new-quote t t text))))))
-        (cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
-                 ;; Falls back on English.
-                 (assoc "en" org-e-latex-quotes))))
-  text)
+(defun org-e-latex--quotation-marks (text info)
+  (org-export-quotation-marks text info org-e-latex-quote-replacements))
 
 (defun org-e-latex--wrap-label (element output)
   "Wrap label associated to ELEMENT around OUTPUT, if appropriate.
diff --git a/contrib/lisp/org-export.el b/contrib/lisp/org-export.el
index b9294e5..87f5c84 100644
--- a/contrib/lisp/org-export.el
+++ b/contrib/lisp/org-export.el
@@ -284,6 +284,32 @@ rules.")
   :tag "Org Export General"
   :group 'org-export)
 
+;; Generic function, usable by exporters, but they can define their own
+;; instead.
+(defun org-export-quotation-marks (text info replacements)
+  "Export quotation marks depending on language conventions.
+TEXT is a string containing quotation marks to be replaced.  INFO
+is a plist used as a communication channel."
+  ;; (message text)
+  (let* ((regexps
+          (cdr
+           (or
+            (assoc (plist-get info :language)
+                   org-smart-quotes-regexps)
+            (assq 'DEFAULT org-smart-quotes-regexps))))
+         (subs (cdr (or (assoc (plist-get info :language)
+                               replacements)
+                        (assoc "en" replacements))))
+         (quotes (pairlis regexps subs)))
+    (mapc (lambda (p)
+            (let ((re (car p))
+                  (su (cdr p)))
+              (setq text (replace-regexp-in-string re su text t t 9))))
+          quotes))
+  text)
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
 (defcustom org-export-with-archived-trees 'headline
   "Whether sub-trees with the ARCHIVE tag should be exported.
 
diff --git a/lisp/org.el b/lisp/org.el
index 0157e36..70d7266 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -3625,6 +3625,69 @@ When nil, the \\name form remains in the buffer."
   :version "24.1"
   :type 'boolean)
 
+(defcustom org-smart-quotes nil
+  "Non-nil means display `smart' quotes on-screen in place
+of \" and ' characters."
+  :group 'org-appearance
+  :type 'boolean)
+
+(defcustom org-smart-quotes-replacements
+  '("“" "”" "‘" "’" "’")
+  "What to display on-screen when `org-smart-quotes' is non-nil.
+A list of five strings:
+  1. Open double-quotes
+  2. Close double-quotes
+  3. Open single-quote
+  4. Close single-quote
+  5. Mid-word apostrophe"
+  :group 'org-appearance
+  :type '(list
+          (string :tag "Open double-quotes" "«") ; "“"
+          (string :tag "Close double-quotes" "»") ; "”"
+          (string :tag "Open single-quote" "‹") ; "‘"
+          (string :tag "Close single-quote" "›") ; "’"
+          (string :tag "Mid-word apostrophe" "’"))) ; "’"
+
+(defcustom org-smart-quotes-regexps
+  '((DEFAULT
+     "\\(?:\\s-\\|\\s(\\|^\\)\\(?9:\"\\)\\(?:\\w\\|\\s.\\|\\s_\\)\\|\\s-\\(?9:\"\\)$"
+     "\\(?:\\S-\\)\\(?9:\"\\)\\(?:\\s-\\|$\\|\\s)\\|\\s.\\)\\|^\\(?9:\"\\)\\s-"
+     "\\(?:\\s-\\|(\\|^\\)\\(?9:'\\)\\w\\|\\s-\\(?9:'\\)$"
+     "\\w\\(?9:'\\)\\(?:\\s-\\|\\s.\\|$\\)\\|^\\(?9:'\\)\\s-"
+     "\\w\\(?9:'\\)\\w"))
+  "Regexps for quotes to be made `smart' quotes upon export or onscreen.
+Each element is a list of six strings.  The car is a string
+representing the language to which this definition applies (e.g. \"en\",
+\"fr\", \"de\", etc.); the cdr (the other five elements) are five REs
+matching, in order:
+  1. Opening double-quotes
+  2. Closing double-quotes
+  3. Opening single-quotes
+  4. Closing single-quotes
+  5. Mid-word apostrophes
+
+Each regexp should surround the actual quote in a capturing group, which
+must be specified as number 9 (so as not to conflict with other processing.)
+
+One element should have as its car the atom DEFAULT, to be used when no
+other element fits.  It is also the one used for on-screen display of
+`smart' quotes (see the variable `org-smart-quotes').
+
+As what makes an opening or closing quote is somewhat consistent across
+languages (as opposed to how they are represented in typography), the
+DEFAULT element is likely sufficient for most purposes."
+  :group 'org-export-general
+  :group 'org-appearance
+  :type '(repeat
+          (list
+           (choice (const DEFAULT)
+                   (string :tag "Language"))
+           (regexp :tag "Open double-quotes")
+           (regexp :tag "Close double-quotes")
+           (regexp :tag "Open single-quote")
+           (regexp :tag "Close single-quote")
+           (regexp :tag "Mid-word apostrophe"))))
+
 (defvar org-emph-re nil
   "Regular expression for matching emphasis.
 After a match, the match groups contain these elements:
@@ -5927,6 +5990,7 @@ needs to be inserted at a specific position in the font-lock sequence.")
            ;; Specials
            '(org-do-latex-and-special-faces)
            '(org-fontify-entities)
+           '(org-fontify-quotes)
            '(org-raise-scripts)
            ;; Code
            '(org-activate-code (1 'org-code t))
@@ -5948,6 +6012,43 @@ needs to be inserted at a specific position in the font-lock sequence.")
          '(org-font-lock-keywords t nil nil backward-paragraph))
     (kill-local-variable 'font-lock-keywords) nil))
 
+(defun org-fontify-quotes (limit)
+  (require 'org-export)
+  (when org-smart-quotes
+    (let* ((start (point))
+           k su
+           (splice-string (lambda (lst join)
+                            (if (null (cdr lst)) (car lst)
+                              (concat (car lst) join
+                                      (splice-string (cdr lst) join)))))
+           (regexps
+            (cdr
+             (assq 'DEFAULT org-smart-quotes-regexps)))
+           (i 1)
+           (allreg
+            (mapconcat (lambda (n) (prog1 (format "\\(?%d:%s\\)" i n)
+                                     (setq i (1+ i))))
+                       regexps "\\|"))
+           (quotes (pairlis regexps org-smart-quotes-replacements)))
+      (catch 'match
+        (while (re-search-forward allreg limit t)
+          (cond ((match-string 1)
+                 (setq su (nth 0 org-smart-quotes-replacements)))
+                ((match-string 2)
+                 (setq su (nth 1 org-smart-quotes-replacements)))
+                ((match-string 3)
+                 (setq su (nth 2 org-smart-quotes-replacements)))
+                ((match-string 4)
+                 (setq su (nth 3 org-smart-quotes-replacements)))
+                ((match-string 5)
+                 (setq su (nth 4 org-smart-quotes-replacements))))
+          (add-text-properties (match-beginning 9) (match-end 9)
+                               (list 'font-lock-fontified t
+                                     'face 'org-document-info))
+          (compose-region (match-beginning 9) (match-end 9) su nil)
+          (backward-char 1)
+          (throw 'match t))))))
+
 (defun org-toggle-pretty-entities ()
   "Toggle the composition display of entities as UTF8 characters."
   (interactive)
-- 
1.7.7.6


--------------000108080805060407050200--
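
The central trick in the patch, replacing only an explicitly numbered
group 9 so that the context characters around the quote are left alone,
can be tried in isolation.  What follows is a minimal sketch and not code
from the patch: the names demo-smart-quote-rules and demo-smarten-quotes
and the two simplified regexps are made up for illustration, and only
double quotes are handled.

```emacs-lisp
;; Hypothetical stand-alone demo; not part of the patch above.
;; Each rule is (REGEXP . REPLACEMENT).  The quote itself sits in the
;; explicitly numbered group 9; the rest of the regexp is context only.
(defvar demo-smart-quote-rules
  '(;; Opening quote: after whitespace or at line start, before a word char.
    ("\\(?:\\s-\\|^\\)\\(?9:\"\\)\\w" . "“")
    ;; Closing quote: after a word char, before whitespace,
    ;; punctuation, or line end.
    ("\\w\\(?9:\"\\)\\(?:\\s-\\|\\s.\\|$\\)" . "”"))
  "Simplified, illustrative quote rules (assumed, not the patch's regexps).")

(defun demo-smarten-quotes (text)
  "Return TEXT with straight double quotes replaced per the demo rules."
  (dolist (rule demo-smart-quote-rules text)
    ;; The final argument 9 restricts the replacement to group 9, so the
    ;; matched context characters are copied through unchanged.
    (setq text (replace-regexp-in-string
                (car rule) (cdr rule) text t t 9))))

;; Example call:
;; (demo-smarten-quotes "She said \"hi\" twice")
```

Because only group 9 is replaced, a rule never has to re-insert the
characters it matched as context, which is what the old exporter code did
by hand with (concat (match-string 1 text) (cdr l)).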