From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Mark E. Shoulson"
Subject: Re: Smart Quotes Exporting
Date: Tue, 05 Jun 2012 22:14:13 -0400
Message-ID: <4FCEBCF5.1070209@kli.org>
References: <4FBB08CA.5060705@kli.org> <87d35u8rvk.fsf@gmail.com>
 <4FBDA56E.5030901@kli.org> <87zk8w6v4q.fsf@gmail.com>
 <4FC00CE0.6060308@kli.org> <87r4u75tg9.fsf@gmail.com>
 <4FC426AC.2030109@kli.org> <87ehq227ky.fsf@gmail.com>
 <4FC56F1B.5040201@kli.org> <87r4u031ye.fsf@gmail.com>
 <4FC7FE2C.6040702@kli.org> <878vg72bzy.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: multipart/mixed;
 boundary="------------020806030700050008060006"
Return-path:
Received: from eggs.gnu.org ([208.118.235.92]:36648)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1Sc5ow-0005LB-LC for emacs-orgmode@gnu.org;
 Tue, 05 Jun 2012 22:19:08 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1Sc5lQ-0002nf-Cm for emacs-orgmode@gnu.org;
 Tue, 05 Jun 2012 22:18:02 -0400
Received: from pi.meson.org ([96.56.207.26]:59945)
 by eggs.gnu.org with smtp (Exim 4.71) (envelope-from )
 id 1Sc5lQ-0002lT-6L for emacs-orgmode@gnu.org;
 Tue, 05 Jun 2012 22:14:24 -0400
In-Reply-To: <878vg72bzy.fsf@gmail.com>
List-Id: "General discussions about Org-mode."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
To: Nicolas Goaziou
Cc: emacs-orgmode@gnu.org

This is a multi-part message in MIME format.
--------------020806030700050008060006
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Update on the smart-quotes patch.  Supports the odt exporter now too,
which I think covers all the current major "new" exporters for which it
is relevant (adding smart quotes to ASCII export is a contradiction in
terms; should it be in the "publish" exporter?  It didn't look like it
to me).

Added an options keyword, '"' (that is, the double-quote mark) to select
smart quotes on/off, and a defcustom for customizing your default.  Set
the default default [sic] to nil, though actually it might be reasonable
to set it to t.

Slight touch-up to the regexps since last time, but they will definitely
be subject to a lot of fine-tuning as more special cases are found that
break them and ways to fix them are found (the close-quote still breaks
on one of "/a/." or "/a./").  It's pretty good on the whole, though, and
usually guesses right.

I know there's some work being done on the odt exporter; hope this fits
in well with it.  How does it look to you?

~mark

--------------020806030700050008060006
Content-Type: text/x-patch;
 name="0001-Add-smart-quotes-for-onscreen-display-and-for-latex-.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment;
 filename*0="0001-Add-smart-quotes-for-onscreen-display-and-for-latex-.pa";
 filename*1="tch"

>From e6df2efd1a9ce36964a20fc06aa2a688acd87efb Mon Sep 17 00:00:00 2001
From: Mark Shoulson
Date: Tue, 29 May 2012 23:01:12 -0400
Subject: [PATCH] Add `smart' quotes for onscreen display and for latex and html
 export

* lisp/org.el: Add `smart' quotes: custom variables to define regexps
to recognize quotes, to define how and whether to display them, and
org-fontify-quotes to display `smart-quote' characters when activated.
* contrib/lisp/org-export.el: Add function org-export-quotation-marks
as a utility function usable by individual exporters to apply `smart'
quotes.  Also add keyword '"' for customizing smart quotes, and custom
default for it.
* contrib/lisp/org-e-latex.el: Replace org-e-latex-quotes custom with
org-e-latex-quote-replacements and make org-e-latex--quotation-marks
use the org-export-quotation-marks function in org-export.el.
* contrib/lisp/org-e-html.el: Replace org-e-html-quotes custom with
org-e-html-smart-quote-replacements and enable
org-e-html--quotation-marks, using org-export-quotation-marks function
in org-export.el.
* contrib/lisp/org-e-odt.el: Replace org-e-odt-quotes custom with
org-e-odt-quote-replacements and make org-e-odt--quotation-marks use
org-export-quotation-marks function in org-export.el.
---
 contrib/lisp/org-e-html.el  |   57 ++++++++----------------
 contrib/lisp/org-e-latex.el |   67 ++++++++++-------------------
 contrib/lisp/org-e-odt.el   |   68 ++++++++++-------------------
 contrib/lisp/org-export.el  |   38 ++++++++++++++++
 lisp/org.el                 |  101 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 203 insertions(+), 128 deletions(-)

diff --git a/contrib/lisp/org-e-html.el b/contrib/lisp/org-e-html.el
index 4287a59..c49608d 100644
--- a/contrib/lisp/org-e-html.el
+++ b/contrib/lisp/org-e-html.el
@@ -1043,37 +1043,24 @@ in order to mimic default behaviour:
 
 ;;;; Plain text
 
-(defcustom org-e-html-quotes
-  '(("fr"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
-     ("\\(\\S-\\)\"" . "~»")
-     ("\\(\\s-\\|(\\|^\\)'" . "'"))
-    ("en"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "``")
-     ("\\(\\S-\\)\"" . "''")
-     ("\\(\\s-\\|(\\|^\\)'" . "`")))
-  "Alist for quotes to use when converting english double-quotes.
-
-The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
-
-For each item in a CONS, the first string is a regexp
-for allowed characters before/after the quote, the second
-string defines the replacement string for this quote."
+(defcustom org-e-html-smart-quote-replacements
+  '(("fr" "« " " »" "‘" "’" "’")
+    ("en" "“" "”" "‘" "’" "’")
+    ("de" "„" "“" "‚" "‘" "’"))
+  "What to export for `smart-quotes'.
+A list of five strings:
+  1. Open double-quotes
+  2. Close double-quotes
+  3. Open single-quote
+  4. Close single-quote
+  5. Mid-word apostrophe"
   :group 'org-export-e-html
   :type '(list
-          (cons :tag "Opening quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote     "))
-          (cons :tag "Closing quote"
-                (string :tag "Regexp for char after ")
-                (string :tag "Replacement quote     "))
-          (cons :tag "Single quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote     "))))
+          (string :tag "Open double-quotes")    ; "“"
+          (string :tag "Close double-quotes")   ; "”"
+          (string :tag "Open single-quote")     ; "‘"
+          (string :tag "Close single-quote")    ; "’"
+          (string :tag "Mid-word apostrophe"))) ; "’"
 
 ;;;; Compilation
 
@@ -1459,15 +1446,7 @@ This is used to choose a separator for constructs like \\verb."
   "Export quotation marks depending on language conventions.
 TEXT is a string containing quotation marks to be replaced.  INFO
 is a plist used as a communication channel."
-  (mapc (lambda(l)
-          (let ((start 0))
-            (while (setq start (string-match (car l) text start))
-              (let ((new-quote (concat (match-string 1 text) (cdr l))))
-                (setq text (replace-match new-quote t t text))))))
-        (cdr (or (assoc (plist-get info :language) org-e-html-quotes)
-                 ;; Falls back on English.
-                 (assoc "en" org-e-html-quotes))))
-  text)
+  (org-export-quotation-marks text info org-e-html-smart-quote-replacements))
 
 (defun org-e-html--wrap-label (element output)
   "Wrap label associated to ELEMENT around OUTPUT, if appropriate.
@@ -2691,7 +2670,7 @@ contextual information."
     ;; (format "\\%s{}" (match-string 1 text)) nil t text)
     ;; start (match-end 0))))
     ;; Handle quotation marks
-    ;; (setq text (org-e-html--quotation-marks text info))
+    (setq text (org-e-html--quotation-marks text info))
     ;; Convert special strings.
     ;; (when (plist-get info :with-special-strings)
     ;; (while (string-match (regexp-quote "...") text)
diff --git a/contrib/lisp/org-e-latex.el b/contrib/lisp/org-e-latex.el
index 67e9197..2543c29 100644
--- a/contrib/lisp/org-e-latex.el
+++ b/contrib/lisp/org-e-latex.el
@@ -687,38 +687,28 @@ during latex export it will output
 
 ;;;; Plain text
 
-(defcustom org-e-latex-quotes
-  '(("fr"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
-     ("\\(\\S-\\)\"" . "~»")
-     ("\\(\\s-\\|(\\|^\\)'" . "'"))
-    ("en"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "``")
-     ("\\(\\S-\\)\"" . "''")
-     ("\\(\\s-\\|(\\|^\\)'" . "`")))
-  "Alist for quotes to use when converting english double-quotes.
-
-The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
-
-For each item in a CONS, the first string is a regexp
-for allowed characters before/after the quote, the second
-string defines the replacement string for this quote."
+(defcustom org-e-latex-quote-replacements
+  '(("en" "``" "''" "`" "'" "'")
+    ("fr" "«~" "~»" "‹~" "~›" "'")
+    ("de" ",," "``" "," "`" "'"))
+  "What to output for quotes.  Each element is a list of six strings.
+The first string specifies the language these quotes apply to (\"en\",
+\"fr\", \"de\", etc.; see the LANGUAGE keyword), and the other five
+define the strings to use for, in order:
+  1. Open double-quotes
+  2. Close double-quotes
+  3. Open single-quote
+  4. Close single-quote
+  5. Mid-word apostrophe"
   :group 'org-export-e-latex
-  :type '(list
-          (cons :tag "Opening quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote     "))
-          (cons :tag "Closing quote"
-                (string :tag "Regexp for char after ")
-                (string :tag "Replacement quote     "))
-          (cons :tag "Single quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote     "))))
-
+  :type '(repeat
+          (list
+           (string :tag "Language code")
+           (string :tag "Open double-quotes")
+           (string :tag "Close double-quotes")
+           (string :tag "Open single-quote")
+           (string :tag "Close single-quote")
+           (string :tag "Mid-word apostrophe"))))
 
 ;;;; Compilation
 
@@ -852,19 +842,8 @@ nil."
      options
      ","))
 
-(defun org-e-latex--quotation-marks (text info)
-  "Export quotation marks depending on language conventions.
-TEXT is a string containing quotation marks to be replaced.  INFO
-is a plist used as a communication channel."
-  (mapc (lambda(l)
-          (let ((start 0))
-            (while (setq start (string-match (car l) text start))
-              (let ((new-quote (concat (match-string 1 text) (cdr l))))
-                (setq text (replace-match new-quote t t text))))))
-        (cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
-                 ;; Falls back on English.
-                 (assoc "en" org-e-latex-quotes))))
-  text)
+(defun org-e-latex--quotation-marks (text info)
+  (org-export-quotation-marks text info org-e-latex-quote-replacements))
 
 (defun org-e-latex--wrap-label (element output)
   "Wrap label associated to ELEMENT around OUTPUT, if appropriate.
diff --git a/contrib/lisp/org-e-odt.el b/contrib/lisp/org-e-odt.el
index cab4c66..7eb92b6 100644
--- a/contrib/lisp/org-e-odt.el
+++ b/contrib/lisp/org-e-odt.el
@@ -2318,39 +2318,28 @@ in order to mimic default behaviour:
 
 ;;;; Plain text
 
-(defcustom org-e-odt-quotes
-  '(("fr"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "« ")
-     ("\\(\\S-\\)\"" . "» ")
-     ("\\(\\s-\\|(\\|^\\)'" . "'"))
-    ("en"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "“")
-     ("\\(\\S-\\)\"" . "”")
-     ("\\(\\s-\\|(\\|^\\)'" . "‘")
-     ("\\(\\S-\\)'" . "’")))
-  "Alist for quotes to use when converting english double-quotes.
-
-The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
-
-For each item in a CONS, the first string is a regexp
-for allowed characters before/after the quote, the second
-string defines the replacement string for this quote."
+(defcustom org-e-odt-quote-replacements
+  '(("en" "“" "”" "‘" "’" "’")
+    ("fr" "« " " »" "‹ " " ›" "’")
+    ("de" "„" "“" "‚" "‘" "’"))
+  "What to output for quotes.  Each element is a list of six strings.
+The first string specifies the language these quotes apply to (\"en\",
+\"fr\", \"de\", etc.; see the LANGUAGE keyword), and the other five
+define the strings to use for, in order:
+  1. Open double-quotes
+  2. Close double-quotes
+  3. Open single-quote
+  4. Close single-quote
+  5. Mid-word apostrophe"
   :group 'org-export-e-odt
-  :type '(list
-          (cons :tag "Opening quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote     "))
-          (cons :tag "Closing quote"
-                (string :tag "Regexp for char after ")
-                (string :tag "Replacement quote     "))
-          (cons :tag "Single quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote     "))))
-
+  :type '(repeat
+          (list
+           (string :tag "Language code")
+           (string :tag "Open double-quotes")
+           (string :tag "Close double-quotes")
+           (string :tag "Open single-quote")
+           (string :tag "Close single-quote")
+           (string :tag "Mid-word apostrophe"))))
 
 ;;;; Compilation
 
@@ -2485,19 +2474,8 @@ This is used to choose a separator for constructs like \\verb."
         when (not (string-match (regexp-quote (char-to-string c)) s))
         return (char-to-string c))))
 
-(defun org-e-odt--quotation-marks (text info)
-  "Export quotation marks depending on language conventions.
-TEXT is a string containing quotation marks to be replaced.  INFO
-is a plist used as a communication channel."
-  (mapc (lambda(l)
-          (let ((start 0))
-            (while (setq start (string-match (car l) text start))
-              (let ((new-quote (concat (match-string 1 text) (cdr l))))
-                (setq text (replace-match new-quote t t text))))))
-        (cdr (or (assoc (plist-get info :language) org-e-odt-quotes)
-                 ;; Falls back on English.
-                 (assoc "en" org-e-odt-quotes))))
-  text)
+(defun org-e-odt--quotation-marks (text info)
+  (org-export-quotation-marks text info org-e-odt-quote-replacements))
 
 (defun org-e-odt--wrap-label (element output)
   "Wrap label associated to ELEMENT around OUTPUT, if appropriate.
diff --git a/contrib/lisp/org-export.el b/contrib/lisp/org-export.el
index b9294e5..4e5f738 100644
--- a/contrib/lisp/org-export.el
+++ b/contrib/lisp/org-export.el
@@ -143,6 +143,7 @@
     (:with-priority nil "pri" org-export-with-priority)
     (:with-special-strings nil "-" org-export-with-special-strings)
     (:with-sub-superscript nil "^" org-export-with-sub-superscripts)
+    (:with-smart-quotes nil "\"" org-export-with-smart-quotes)
     (:with-toc nil "toc" org-export-with-toc)
     (:with-tables nil "|" org-export-with-tables)
     (:with-tags nil "tags" org-export-with-tags)
@@ -284,6 +285,33 @@ rules.")
   :tag "Org Export General"
   :group 'org-export)
 
+;; Generic function, usable by exporters, but they can define their own
+;; instead.
+(defun org-export-quotation-marks (text info replacements)
+  "Export quotation marks depending on language conventions.
+TEXT is a string containing quotation marks to be replaced.  INFO
+is a plist used as a communication channel."
+  ;; (message text)
+  (when (plist-get info :with-smart-quotes)
+    (let* ((regexps
+            (cdr
+             (or
+              (assoc (plist-get info :language)
+                     org-smart-quotes-regexps)
+              (assq 'DEFAULT org-smart-quotes-regexps))))
+           (subs (cdr (or (assoc (plist-get info :language)
+                                 replacements)
+                          (assoc "en" replacements))))
+           (quotes (pairlis regexps subs)))
+      (mapc (lambda (p)
+              (let ((re (car p))
+                    (su (cdr p)))
+                (setq text (replace-regexp-in-string re su text t t 9))))
+            quotes)))
+  text)
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
 (defcustom org-export-with-archived-trees 'headline
   "Whether sub-trees with the ARCHIVE tag should be exported.
 
@@ -445,6 +473,16 @@ e.g. \"e:nil\"."
   :group 'org-export-general
   :type 'boolean)
 
+(defcustom org-export-with-smart-quotes t
+  "Non-nil means try to make quotes \"smart\" when exporting.
+
+For example, HTML export would convert \"Hello\" to “Hello”.
+
+The exact style of quotes depends on the language; see the LANGUAGE
+keyword and also the smart-quote custom settings for each exporter."
+  :group 'org-export-general
+  :type 'boolean)
+
 (defcustom org-export-with-planning nil
   "Non-nil means include planning info in export.
 This option can also be set with the #+OPTIONS: line,
diff --git a/lisp/org.el b/lisp/org.el
index b89889d..8a446ec 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -3629,6 +3629,69 @@ When nil, the \\name form remains in the buffer."
   :version "24.1"
   :type 'boolean)
 
+(defcustom org-smart-quotes nil
+  "Non-nil means display `smart' quotes on-screen in place
+of \" and ' characters."
+  :group 'org-appearance
+  :type 'boolean)
+
+(defcustom org-smart-quotes-replacements
+  '("“" "”" "‘" "’" "’")
+  "What to display on-screen when `org-smart-quotes' is non-nil.
+A list of five strings:
+  1. Open double-quotes
+  2. Close double-quotes
+  3. Open single-quote
+  4. Close single-quote
+  5. Mid-word apostrophe"
+  :group 'org-appearance
+  :type '(list
+          (string :tag "Open double-quotes" "“")
+          (string :tag "Close double-quotes" "”")
+          (string :tag "Open single-quote" "‘")
+          (string :tag "Close single-quote" "’")
+          (string :tag "Mid-word apostrophe" "’")))
+
+(defcustom org-smart-quotes-regexps
+  '((DEFAULT
+     "\\(?:\\s-\\|\\s(\\|^\\)\\(?9:\"\\)\\(?:\\w\\|\\s.\\|\\s_\\)\\|\\s-\\(?9:\"\\)$"
+     "\\(?:\\S-\\)\\(?9:\"\\)\\(?:\\s-\\|$\\|\\s)\\|\\s.\\)\\|^\\(?9:\"\\)\\s-"
+     "\\(?:\\s-\\|(\\|^\\)\\(?9:'\\)\\w\\|\\s-\\(?9:'\\)$"
+     "\\w\\s.*\\(?9:'\\)\\(?:\\s-\\|\\s.\\|$\\)\\|^\\(?9:'\\)\\s-"
+     "\\w\\(?9:'\\)\\w"))
+  "Regexps for quotes to be made `smart' quotes upon export or onscreen.
+Each element is a list of six strings.  The car is a string
+representing the language to which this definition applies (e.g. \"en\",
+\"fr\", \"de\", etc.); the cdr (the other five elements) are five REs
+matching, in order:
+  1. Opening double-quotes
+  2. Closing double-quotes
+  3. Opening single-quotes
+  4. Closing single-quotes
+  5. Mid-word apostrophes
+
+Each regexp should surround the actual quote in a capturing group, which
+must be specified as number 9 (so as not to conflict with other processing.)
+
+One element should have as its car the atom DEFAULT, to be used when no
+other element fits.  It is also the one used for on-screen display of
+`smart' quotes (see the variable `org-smart-quotes').
+
+As what makes an opening or closing quote is somewhat consistent across
+languages (as opposed to how they are represented in typography), the
+DEFAULT element is likely sufficient for most purposes."
+  :group 'org-export-general
+  :group 'org-appearance
+  :type '(repeat
+          (list
+           (choice (const DEFAULT)
+                   (string :tag "Language"))
+           (regexp :tag "Open double-quotes")
+           (regexp :tag "Close double-quotes")
+           (regexp :tag "Open single-quote")
+           (regexp :tag "Close single-quote")
+           (regexp :tag "Mid-word apostrophe"))))
+
 (defvar org-emph-re nil
   "Regular expression for matching emphasis.
 After a match, the match groups contain these elements:
@@ -5931,6 +5994,7 @@ needs to be inserted at a specific position in the font-lock sequence.")
            ;; Specials
            '(org-do-latex-and-special-faces)
            '(org-fontify-entities)
+           '(org-fontify-quotes)
            '(org-raise-scripts)
            ;; Code
            '(org-activate-code (1 'org-code t))
@@ -5952,6 +6016,43 @@ needs to be inserted at a specific position in the font-lock sequence.")
          '(org-font-lock-keywords t nil nil backward-paragraph))
     (kill-local-variable 'font-lock-keywords) nil))
 
+(defun org-fontify-quotes (limit)
+  (require 'org-export)
+  (when org-smart-quotes
+    (let* ((start (point))
+           k su
+           (splice-string (lambda (lst join)
+                            (if (null (cdr lst)) (car lst)
+                              (concat (car lst) join
+                                      (splice-string (cdr lst) join)))))
+           (regexps
+            (cdr
+             (assq 'DEFAULT org-smart-quotes-regexps)))
+           (i 1)
+           (allreg
+            (mapconcat (lambda (n) (prog1 (format "\\(?%d:%s\\)" i n)
+                                     (setq i (1+ i))))
+                       regexps "\\|"))
+           (quotes (pairlis regexps org-smart-quotes-replacements)))
+      (catch 'match
+        (while (re-search-forward allreg limit t)
+          (cond ((match-string 1)
+                 (setq su (nth 0 org-smart-quotes-replacements)))
+                ((match-string 2)
+                 (setq su (nth 1 org-smart-quotes-replacements)))
+                ((match-string 3)
+                 (setq su (nth 2 org-smart-quotes-replacements)))
+                ((match-string 4)
+                 (setq su (nth 3 org-smart-quotes-replacements)))
+                ((match-string 5)
+                 (setq su (nth 4 org-smart-quotes-replacements))))
+          (add-text-properties (match-beginning 9) (match-end 9)
+                               (list 'font-lock-fontified t
+                                     'face 'org-document-info))
+          (compose-region (match-beginning 9) (match-end 9) su nil)
+          (backward-char 1)
+          (throw 'match t))))))
+
 (defun org-toggle-pretty-entities ()
   "Toggle the composition display of entities as UTF8 characters."
   (interactive)
-- 
1.7.7.6

--------------020806030700050008060006--
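
For anyone trying the patch, here is a rough sketch of how the pieces described in the message might be exercised.  It assumes the patch above has been applied; the two variables are the ones the patch defines, the #+OPTIONS spelling is inferred from the "\"" key the patch adds to the export options list, and `my-demo-apostrophe' is a hypothetical helper written only to show the group-9 trick that org-export-quotation-marks relies on.

```elisp
;; Sketch only.  The two variables below exist only once the patch
;; above is applied; setting them beforehand is harmless.
(setq org-smart-quotes t)               ; typographic quotes on screen
(setq org-export-with-smart-quotes t)   ; convert quotes on export

;; Per-file switch in an Org document (spelling assumed from the
;; "\"" key the patch adds to the export options list):
;;   #+OPTIONS: ":nil

;; How the export helper swaps only the quote character: each regexp
;; wraps the quote in explicitly numbered group 9, and
;; `replace-regexp-in-string' is told to replace just that group, so
;; the surrounding context characters are left alone.
;; `my-demo-apostrophe' is a hypothetical name used for illustration.
(defun my-demo-apostrophe (text)
  "Replace mid-word apostrophes in TEXT with a typographic one."
  (replace-regexp-in-string "\\w\\(?9:'\\)\\w" "’" text t t 9))

(my-demo-apostrophe "don't")            ; => "don’t"
```

The regexp here is the fifth (mid-word apostrophe) entry of the DEFAULT list in org-smart-quotes-regexps; the other four entries work the same way, each pairing with the matching position in an exporter's replacement list.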