From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mark Shoulson
Subject: Re: (no subject)
Date: Thu, 31 May 2012 01:50:39 +0000 (UTC)
References: <4FBB08CA.5060705@kli.org> <87d35u8rvk.fsf@gmail.com>
 <4FBDA56E.5030901@kli.org> <87zk8w6v4q.fsf@gmail.com>
 <4FC00CE0.6060308@kli.org> <87r4u75tg9.fsf@gmail.com>
 <4FC426AC.2030109@kli.org> <87ehq227ky.fsf@gmail.com>
 <4FC56F1B.5040201@kli.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
List-Id: "General discussions about Org-mode."
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
To: emacs-orgmode@gnu.org

Mark E. Shoulson writes:

>
> All right, bottom line, this is sort of what I'm seeing.
> I'm not 100%
> sure which files should house these things, but something like this:
>
> 1) a variable containing for each language regexp for each of: open
> double-quote, close double-quote, open single-quote, close single-quote,
> and maybe mid-word apostrophe. Odds are these regexps are going to be
> the same for just about all languages (the regexps detecting them, mind
> you), so probably should have some sort of default that the alist can
> just reference. A language should also be allowed to define other quote
> regexps in its list too. We need these to be ordered, with a standard
> set, so that we can have...
>
> 2) for each *exporter* (including on-screen display), a variable that
> defines, for each language, what the *substitution* will be for
> open-double-quote, close-double-quote, etc. Other extras can be defined
> too. That way we can have an exporter-independent way to detect quotes
> to be smartified, but each exporter has its own way to smartify them.
>
> 3) Since most exporters are probably going to be handling doing the
> process approximately the same (match the regexp, stick in the
> associated substitution), org-export.el should have a generic function
> that does this which each exporter *may* call in (or as) its
> quote-smartifier in its text translator, unless it needs something more
> specific which it can provide itself.
>
> In terms of what is handled, the idea in my head is that we would expect
> the writer to be using " or ' to surround their quotes, regardless of
> what their native custom is (if they're doing it using their
> language-specific quote-marks, we don't need to bother with all this
> anyway). Goal is to handle either "quotes" or 'quotes' in either
> nesting (or no nesting, if someone does "quote' for some reason), and
> with any luck not get too confused with other uses of apostrophe.
>
> It makes sense to me, but I bet I explained it badly and people are
> going to have all kinds of issues with it.
> :)
>
> No telling when (if?) I'll be able to produce something along these
> lines, but it's something to start thinking about anyway.
>
> ~mark
>
>

Regarding the "this is what I'm seeing", I paste at the bottom a
preliminary patch. It is totally *not* worth actually applying it
unless you want to develop this; it's a snapshot mid-development. But
it does seem to actually work. The same set of regexps is used, and the
same function, though that is defeasible and an exporter can define its
own. The hardest part was getting the onscreen versions showing right
(also the most recently and probably best tested, so the actual
exporters might be more bumpy). Actual substitutions being used are not
necessarily typographically sensible; chosen more so it's easier to see
the action of the process. Nothing is in the right place, things that
should be customizables aren't... it's proof-of-concept.

Am I going in the right direction, as far as export-engine is concerned?

==========

>From 420048063e3fd2af1b019c48864d58d82cef62ef Mon Sep 17 00:00:00 2001
From: Mark Shoulson
Date: Tue, 29 May 2012 23:01:12 -0400
Subject: [PATCH] Just barely works, nothing in the right places. For
 entertainment purposes only.

---
 contrib/lisp/org-e-html.el  |    5 ++++
 contrib/lisp/org-e-latex.el |   53 +++++++++++++++++++++----------------------
 contrib/lisp/org-export.el  |   52 ++++++++++++++++++++++++++++++++++++++++++
 lisp/org.el                 |   50 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 133 insertions(+), 27 deletions(-)

diff --git a/contrib/lisp/org-e-html.el b/contrib/lisp/org-e-html.el
index de98493..b851713 100644
--- a/contrib/lisp/org-e-html.el
+++ b/contrib/lisp/org-e-html.el
@@ -1077,6 +1077,11 @@ in order to mimic default behaviour:
 
 ;;;; Plain text
 
+(defvar org-e-html-quote-replacements
+  '(("fr" "« " " »" "‘" "’" "’")
+    ("en" "“" "”" "‘" "’" "’")
+    ("de" "„" "“" "‚" "‘" "’")))
+
 (defcustom org-e-html-quotes
   '(("fr"
      ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
diff --git a/contrib/lisp/org-e-latex.el b/contrib/lisp/org-e-latex.el
index 67e9197..540ebe1 100644
--- a/contrib/lisp/org-e-latex.el
+++ b/contrib/lisp/org-e-latex.el
@@ -687,6 +687,10 @@ during latex export it will output
 
 ;;;; Plain text
 
+(defvar org-e-latex-quote-replacements
+  '(("fr" "«~" "~»" "‹~" "~›" "/!")
+    ("en" "((" "))" ".(" ")." "/")))
+
 (defcustom org-e-latex-quotes
   '(("fr"
      ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
@@ -699,25 +703,22 @@ during latex export it will output
   "Alist for quotes to use when converting english double-quotes.
 
 The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
+The CDR of each item in this alist is a list of CONS:
+- the first CONS should define the opening quote;
+- the second CONS should define the closing quote;
+- subsequent CONS should define any other quotes, e.g. single, etc.
 
 For each item in a CONS, the first string is a regexp
 for allowed characters before/after the quote, the second
 string defines the replacement string for this quote."
   :group 'org-export-e-latex
-  :type '(list
-          (cons :tag "Opening quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote "))
-          (cons :tag "Closing quote"
-                (string :tag "Regexp for char after ")
-                (string :tag "Replacement quote "))
-          (cons :tag "Single quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote "))))
+  :type '(repeat
+          (cons
+           (string :tag "language code")
+           (repeat
+            (cons :tag "Quote"
+                  (string :tag "Regexp ")
+                  (string :tag "Replacement quote "))))))
 
 ;;;; Compilation
 
@@ -852,19 +853,17 @@ nil."
                options
                ","))
 
-(defun org-e-latex--quotation-marks (text info)
-  "Export quotation marks depending on language conventions.
-TEXT is a string containing quotation marks to be replaced. INFO
-is a plist used as a communication channel."
-  (mapc (lambda(l)
-          (let ((start 0))
-            (while (setq start (string-match (car l) text start))
-              (let ((new-quote (concat (match-string 1 text) (cdr l))))
-                (setq text (replace-match new-quote t t text))))))
-        (cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
-                 ;; Falls back on English.
-                 (assoc "en" org-e-latex-quotes))))
-  text)
+(defun org-e-latex--quotation-marks (text info)
+  (org-export-quotation-marks text info org-e-latex-quote-replacements))
+  ;; (mapc (lambda(l)
+  ;;      (let ((start 0))
+  ;;        (while (setq start (string-match (car l) text start))
+  ;;          (let ((new-quote (concat (match-string 1 text) (cdr l))))
+  ;;            (setq text (replace-match new-quote t t text))))))
+  ;;    (cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
+  ;;             ;; Falls back on English.
+  ;;             (assoc "en" org-e-latex-quotes))))
+  ;; text)
 
 (defun org-e-latex--wrap-label (element output)
   "Wrap label associated to ELEMENT around OUTPUT, if appropriate.
diff --git a/contrib/lisp/org-export.el b/contrib/lisp/org-export.el
index b9294e5..aacb448 100644
--- a/contrib/lisp/org-export.el
+++ b/contrib/lisp/org-export.el
@@ -284,6 +284,58 @@ rules.")
   :tag "Org Export General"
   :group 'org-export)
 
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Probably a defcustom eventually.
+
+;; Each element of this consists of: car=language code, cdr=list of
+;; double-quote-open-regexp, double-quote-close-regexp,
+;; single-quote-open-regexp, single-quote-close-regexp, &optional
+;; single-apostrophe regexp?
+;; Just about all will be the same anyway, so mostly language DEFAULT.
+
+;; For testing purposes, poorly-designed at first.
+(defvar org-export-quotes-regexps
+  '((DEFAULT
+     "\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w"
+     "\\(?:\\S-\\)\\(\"\\)\\s-"
+     "\\(?:\\s-\\|(\\|^\\)\\('\\)\\w"
+     "\\w\\('\\)\\(?:\\s-\\|\\s.\\|$\\)"
+     "\\w\\('\\)\\w")))
+
+;; Generic function, usable by exporters, but they can define their own
+;; instead.
+(defun org-export-quotation-marks (text info replacements)
+  "Export quotation marks depending on language conventions.
+TEXT is a string containing quotation marks to be replaced. INFO
+is a plist used as a communication channel."
+  (let* ((start 0)
+         (regexps
+          (cdr
+           (or
+            (assoc (plist-get info :language)
+                   org-export-quotes-regexps)
+            (assoc 'DEFAULT org-export-quotes-regexps))))
+         (subs (cdr (or (assoc (plist-get info :language)
+                               replacements)
+                        (assoc "en" replacements))))
+         (quotes (pairlis regexps subs)))
+    (mapc (lambda (p)
+            (let ((re (car p))
+                  (su (cdr p)))
+              (while (setq start (string-match re text start))
+                (setq text (replace-match su t t text 1)))))
+          quotes))
+  text)
+
+(defvar org-screen-smart-quotes
+  '(("en" "“" "”" "‘" "’" "’")
+    ("fr" "«" "»" "‹" "›" "’")
+    ("de" "„" "“" "‚" "’" "’")))
+
+
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
 (defcustom org-export-with-archived-trees 'headline
   "Whether sub-trees with the ARCHIVE tag should be exported.
 
diff --git a/lisp/org.el b/lisp/org.el
index 8a141cf..72bf4b0 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5927,6 +5927,7 @@ needs to be inserted at a specific position in the font-lock sequence.")
            ;; Specials
            '(org-do-latex-and-special-faces)
            '(org-fontify-entities)
+           '(org-fontify-quotes)
            '(org-raise-scripts)
            ;; Code
            '(org-activate-code (1 'org-code t))
@@ -5948,6 +5949,55 @@ needs to be inserted at a specific position in the font-lock sequence.")
          '(org-font-lock-keywords t nil nil backward-paragraph))
     (kill-local-variable 'font-lock-keywords) nil))
 
+(defvar org-smart-quotes nil)
+(defvar org-smart-quotes-replacements
+  '("«" "»" "‹" "›" "’"))
+;;  '("“" "”" "‘" "’" "’"))
+
+;; Nother idea, try this: like in original smart-quotes attempt.
+;; String all the regexps into one big regexp with \\| between them.
+;; Possibly have to parenthesize them but that's okay, since if
+;; each elt is in its own group, then those will be the odd-numbered groups
+;; and the inner group (of the actual quote) will be groups 2,4,6, etc.
+
+(defun splice-string (lst join)
+  (if (null (cdr lst)) (car lst)
+    (concat (car lst) join (splice-string (cdr lst) join))))
+
+(defun org-fontify-quotes (limit)
+  (require 'org-export)
+  (when org-smart-quotes
+    (let* ((start (point))
+           k su
+           (regexps
+            (cdr
+             (assoc 'DEFAULT org-export-quotes-regexps)))
+           (allreg (splice-string regexps "\\|"))
+           (quotes (pairlis regexps org-smart-quotes-replacements)))
+      ;; (message "%s" allreg)
+      (catch 'match
+        (while (re-search-forward allreg limit t)
+          (cond ((match-string 1)
+                 (setq k 1 su (nth 0 org-smart-quotes-replacements)))
+                ((match-string 2)
+                 (setq k 2 su (nth 1 org-smart-quotes-replacements)))
+                ((match-string 3)
+                 (setq k 3 su (nth 2 org-smart-quotes-replacements)))
+                ((match-string 4)
+                 (setq k 4 su (nth 3 org-smart-quotes-replacements)))
+                ((match-string 5)
+                 (setq k 5 su (nth 4 org-smart-quotes-replacements)))
+                ;;(t
+                ;; (message "????")))
+                )
+          ;; (message "%s %s" k (match-data))
+          (add-text-properties (match-beginning k) (match-end k)
+                               (list 'font-lock-fontified t
+                                     'face 'org-warning))
+          (compose-region (match-beginning k) (match-end k) su nil)
+          (backward-char 1)
+          (throw 'match t))))))
+
 (defun org-toggle-pretty-entities ()
   "Toggle the composition display of entities as UTF8 characters."
   (interactive)
-- 
1.7.7.6
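
For anyone who wants to poke at the idea without applying the patch, here is a minimal standalone sketch of the scheme described above: one ordered list of detection regexps, a per-language list of substitutions in the same order, and a generic function that pairs them up. The names `sketch-quote-regexps`, `sketch-quote-replacements` and `sketch-smartify-quotes` are hypothetical and exist only for this illustration; the regexps are copied from the DEFAULT entry in the patch, and the function takes a language code directly instead of the INFO plist. It is a sketch under those assumptions, not the code the exporters would end up using.

```elisp
;;; Sketch only.  `sketch-quote-regexps', `sketch-quote-replacements'
;;; and `sketch-smartify-quotes' are hypothetical names, not part of Org.
(require 'cl-lib)

(defvar sketch-quote-regexps
  '("\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w"   ; open double quote
    "\\(?:\\S-\\)\\(\"\\)\\s-"             ; close double quote
    "\\(?:\\s-\\|(\\|^\\)\\('\\)\\w"       ; open single quote
    "\\w\\('\\)\\(?:\\s-\\|\\s.\\|$\\)"    ; close single quote
    "\\w\\('\\)\\w")                       ; mid-word apostrophe
  "Ordered detection regexps; group 1 is the quote character itself.")

(defvar sketch-quote-replacements
  '(("en" "“" "”" "‘" "’" "’")
    ("fr" "«" "»" "‹" "›" "’"))
  "Per-language substitutions, in the same order as the regexps.")

(defun sketch-smartify-quotes (text language)
  "Return TEXT with straight quotes replaced using LANGUAGE's set.
Falls back on the \"en\" entry when LANGUAGE is unknown."
  (let ((subs (cdr (or (assoc language sketch-quote-replacements)
                       (assoc "en" sketch-quote-replacements)))))
    ;; Walk regexps and substitutions in parallel, one pass per regexp.
    (cl-loop for re in sketch-quote-regexps
             for su in subs
             do (let ((start 0))
                  (while (string-match re text start)
                    ;; Resume just past the character being substituted.
                    (setq start (+ (match-beginning 1) (length su)))
                    ;; Replace only group 1, leaving the context intact.
                    (setq text (replace-match su t t text 1)))))
    text))

;; Example call:
;; (sketch-smartify-quotes "It's \"fine\" now" "fr")
```

With the lists above, the example call should come back as `It’s «fine» now`. One thing the sketch makes easy to see: the close-double-quote regexp as written requires whitespace after the quote, so a closing " at the very end of the string or directly before punctuation is left alone.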