From: "Mark E. Shoulson" <mark@kli.org>
To: Nicolas Goaziou <n.goaziou@gmail.com>
Cc: emacs-orgmode@gnu.org
Subject: Smart Quotes Exporting (Was: Re: (no subject))
Date: Thu, 31 May 2012 19:26:36 -0400
Message-ID: <4FC7FE2C.6040702@kli.org>
In-Reply-To: <87r4u031ye.fsf@gmail.com>
Sorry for messing up the thread subject header; I think I misused
gmane's posting.
On 05/31/2012 09:38 AM, Nicolas Goaziou wrote:
> Hello,
>
> Mark Shoulson <mark@kli.org> writes:
>
>> +(defvar org-e-html-quote-replacements
>> + '(("fr" "« " " »" "‘" "’" "’")
>> + ("en" "“" "”" "‘" "’" "’")
>> + ("de" "„" "“" "‚" "‘" "’"))
> A docstring will be required for this variable. It should be
> a defcustom.
Oh, certainly; they're all a disaster. I think I said that in the
writeup at the top. This is just proof of concept, nothing is in the
right place, nothing is properly documented. They have to be
defcustoms, and each needs a good :type as well as a
proper docstring. You'll get no argument from me about the lack (or
inaccuracy) of docstrings and such. I hadn't gotten that far yet. I
said the patch was only if you wanted to tinker with the development as
this progresses.
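
For illustration, a rough sketch of what such a defcustom might look like, with a docstring and a :type that describes the five slots. The group and variable names (`my-smart-quotes`, `my-smart-quote-replacements`) are made up for this example and are not what the patch uses; the values are copied from the quoted defvar.

```elisp
;; Hypothetical group and variable names, for illustration only.
(defgroup my-smart-quotes nil
  "Smart quote replacements used during export (illustrative group)."
  :group 'convenience)

(defcustom my-smart-quote-replacements
  '(("fr" "« " " »" "‘" "’" "’")
    ("en" "“" "”" "‘" "’" "’")
    ("de" "„" "“" "‚" "‘" "’"))
  "Alist of quote replacements, keyed by language code.
Each element has the form
  (LANGUAGE OPEN-DOUBLE CLOSE-DOUBLE OPEN-SINGLE CLOSE-SINGLE APOSTROPHE)
where LANGUAGE is a string such as \"en\" and the remaining
elements are the replacement strings."
  :group 'my-smart-quotes
  :type '(alist :key-type (string :tag "Language code")
                :value-type (list (string :tag "Opening double quote")
                                  (string :tag "Closing double quote")
                                  (string :tag "Opening single quote")
                                  (string :tag "Closing single quote")
                                  (string :tag "Apostrophe"))))
```

With a :type like this, the Customize interface can present each language as a row of five labelled string fields.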
>> +(defun org-e-latex--quotation-marks (text info)
>> + (org-export-quotation-marks text info org-e-latex-quote-replacements))
>> + ;; (mapc (lambda(l)
>> + ;; (let ((start 0))
>> + ;; (while (setq start (string-match (car l) text start))
>> + ;; (let ((new-quote (concat (match-string 1 text) (cdr l))))
>> + ;; (setq text (replace-match new-quote t t text))))))
>> + ;; (cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
>> + ;; ;; Falls back on English.
>> + ;; (assoc "en" org-e-latex-quotes))))
>> + ;; text)
> Use directly `org-e-latex-quote-replacements' in code then.
Not sure I understand this comment.
>> +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>> +;; Probably a defcustom eventually.
>> +
>> +;; Each element of this consists of: car=language code, cdr=list of
>> +;; double-quote-open-regexp, double-quote-close-regexp,
>> +;; single-quote-open-regexp, single-quote-close-regexp, &optional
>> +;; single-apostrophe regexp?
>> +;; Just about all will be the same anyway, so mostly language DEFAULT.
>> +
>> +;; For testing purposes, poorly-designed at first.
>> +(defvar org-export-quotes-regexps
>> + '((DEFAULT
>> + "\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w"
>> + "\\(?:\\S-\\)\\(\"\\)\\s-"
>> + "\\(?:\\s-\\|(\\|^\\)\\('\\)\\w"
>> + "\\w\\('\\)\\(?:\\s-\\|\\s.\\|$\\)"
>> + "\\w\\('\\)\\w")))
> I'm not sure this variable can be used for both the buffer and the
> export engine. Export back-ends will only see chunks of the paragraph.
>
> For example, in the following text,
>
> He crossed the Rubicon and said: "/Alea jacta est./"
>
> Plain text translators will see three strings:
>
> 1. "He crossed the Rubicon and said: \""
> 2. "Alea jacta est."
> 3. "\""
>
> In case 1, you have an opening quote with nothing after it. In case 3,
> you have a closing quote with nothing before or after it. Plain regexps
> can't help here.
>
> The only solution I can think of is to do quote substitutions in
> paragraphs within the parse tree before they reach the translators (i.e.
> with `org-export-filter-parse-tree-functions').
>
> That's the only way to know if "\"" is an opening or a closing quote,
> for example. The current approach won't work.
Hm. OK, this may indeed be (a) a problem and (b) an indication that I
don't understand the process as well as I thought I did... Ah.
So when the "plain" text is being exported, the exporter passes along
the text in chunks as divided up by the formatting. So string #2 is
broken out from the others due to its being in italics. That is indeed
an issue. Moreover, I never even properly considered the effects of
formatting characters (as opposed to punctuation) right next to the
quote-marks, even if this weren't a problem.
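
The chunking problem can be reproduced with plain `string-match`. This is only a small sketch: the regexp is the opening-double-quote entry from the quoted DEFAULT list, the three chunks are the ones from the Rubicon example, and the `my-` names are invented for the demonstration.

```elisp
;; Opening-double-quote regexp copied from the DEFAULT entry quoted above.
(defvar my-open-double-re "\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w")

(defun my-open-quote-found-p (string)
  "Return t if `my-open-double-re' matches somewhere in STRING."
  ;; A temp buffer gives the standard syntax table for the
  ;; whitespace and word-constituent classes in the regexp.
  (with-temp-buffer
    (and (string-match my-open-double-re string) t)))

;; The three strings a plain-text translator would see.
(defvar my-chunks
  '("He crossed the Rubicon and said: \"" "Alea jacta est." "\""))

;; Tried chunk by chunk, the regexp never matches, because the
;; word character it expects after the quote lives in the next chunk.
(mapcar #'my-open-quote-found-p my-chunks)

;; On the rejoined sentence, the same regexp does match.
(my-open-quote-found-p (apply #'concat my-chunks))
```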
So... the filter-parse-tree-functions hook gets applied to the parse
tree, so a back-end can add a function to that list which walks
the parse tree and watches for these border cases (and also
the ones within ordinary strings). It looks like it's going to be tough
to work in any flexibility for further per-language or per-backend
cleverness to handle anything beyond the "canonical set" of open-double,
close-double, open-single, close-single, and mid-word.
To be sure, anything we do will most assuredly fail even on some fairly
reasonable input, in which case the users are pretty much on their own
and will have to do things the hard way. And I could use that as the
answer here, that, "well, it'll work only within plain-text strings"
(and I might possibly still have to use that answer), but I would rather
include the situations you bring up in the supported set and not throw
up my hands at it. So, yes, I will look at that.
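
To make the idea concrete, here is a deliberately simplified sketch of carrying quote state across chunks. It does not use the real parse tree or `org-export-filter-parse-tree-functions`; it assumes the paragraph has already been flattened into a list of strings, and it uses a naive open/close toggle rather than the regexps under discussion. The function name is hypothetical.

```elisp
(defun my-smart-quotes-in-chunks (chunks open close)
  "Replace straight double quotes in CHUNKS with OPEN and CLOSE.
CHUNKS is a list of strings taken, in order, from one paragraph.
Quotes simply alternate between OPEN and CLOSE, and the state is
kept across chunk boundaries, so a chunk consisting of a lone
quote can still be resolved."
  (let ((inside nil))
    (mapcar
     (lambda (chunk)
       (replace-regexp-in-string
        "\""
        (lambda (_match)
          ;; Flip the state on every quote, whichever chunk it is in.
          (setq inside (not inside))
          (if inside open close))
        chunk t t))
     chunks)))

(my-smart-quotes-in-chunks
 '("He crossed the Rubicon and said: \"" "Alea jacta est." "\"")
 "“" "”")
```

A real filter would have to walk the parse tree, cope with single quotes and apostrophes, and pick the replacements by language; the only point here is that the decision for the lone `"\""` chunk depends on state from the earlier chunks.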
>> + (let* ((start 0)
>> + (regexps
>> + (cdr
>> + (or
>> + (assoc (plist-get info :language)
>> + org-export-quotes-regexps)
>> + (assoc 'DEFAULT org-export-quotes-regexps))))
> Use `assq' instead of `assoc' in the second case.
Good call.
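
A small sketch of the distinction, using a made-up table with placeholder strings instead of the real regexps: `assq` compares keys with `eq`, which is right for the symbol `DEFAULT`, while the string language codes still need `assoc`, which compares with `equal`.

```elisp
;; Hypothetical table; the values are placeholders, not real regexps.
(defvar my-quotes-regexps
  '((DEFAULT "default-open" "default-close")
    ("fr" "fr-open" "fr-close")))

(defun my-regexps-for (language)
  "Return the entry for LANGUAGE, falling back on the DEFAULT entry."
  (cdr (or (assoc language my-quotes-regexps)     ; string key: compare with `equal'
           (assq 'DEFAULT my-quotes-regexps))))   ; symbol key: `eq' is enough

(my-regexps-for "fr")   ; the French entry
(my-regexps-for "de")   ; no German entry, so the DEFAULT entry

;; `assq' is not suitable for the string keys: a fresh string with
;; the same contents is `equal' to the key but not `eq' to it.
(assq (copy-sequence "fr") my-quotes-regexps)
```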
>> + (subs (cdr (or (assoc (plist-get info :language)
>> + replacements)
>> + (assoc "en" replacements))))
>> + (quotes (pairlis regexps subs)))
>> + (mapc (lambda (p)
>> + (let ((re (car p))
>> + (su (cdr p)))
>> + (while (setq start (string-match re text start))
>> + (setq text (replace-match su t t text 1)))))
> Use `replace-regexp-in-string' instead.
>
> (replace-regexp-in-string (car p) (cdr p) text t t 1)
I'd been looking at other functions that didn't have that available;
thanks for pointing me at it.
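
As a quick check of that call, a sketch using the mid-word apostrophe regexp from the DEFAULT list; the wrapper function name is invented. The final argument, 1, restricts the replacement to the first parenthesized group, so the letters on either side of the apostrophe are left alone.

```elisp
(defun my-curl-apostrophes (text)
  "Return TEXT with mid-word straight apostrophes replaced by ’."
  ;; A temp buffer gives the standard syntax table for \\w.
  (with-temp-buffer
    (replace-regexp-in-string "\\w\\('\\)\\w" "’" text t t 1)))

(my-curl-apostrophes "I don't know")
```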
~mark