From: "Mark E. Shoulson" <mark@kli.org>
To: Nicolas Goaziou <n.goaziou@gmail.com>
Cc: emacs-orgmode@gnu.org
Subject: Re: "Smart" quotes
Date: Tue, 29 May 2012 20:51:39 -0400
Message-ID: <4FC56F1B.5040201@kli.org>
In-Reply-To: <87ehq227ky.fsf@gmail.com>
On 05/29/2012 01:57 PM, Nicolas Goaziou wrote:
> Hello,
>
> "Mark E. Shoulson"<mark@kli.org> writes:
>
>
>> I guess it doesn't actually matter, but it starts to get weird if you
>> find yourself looking arbitrarily far back, and then you start
>> building in exceptions for crossing paragraph boundaries...
> True. I had the exporter in mind, where you always start at the
> beginning of the paragraph. It would be more difficult with search
> starting in the middle of the paragraph.
Maybe the on-screen case is no harder; we'll just have to see.
>> And then there's the fact that multi-paragraph quotes usually have an
>> open-quote for each paragraph but only one close-quote at the end...
> Some French typographers suggest using a close-quote at the beginning
> of the paragraph to avoid that confusion, or to simply drop them (since
> they are a pain to maintain anyway). I don't know about other languages
> but, if that's the same, is it a good idea to bother implementing it?
I've never heard of it. But I think we may be overthinking this; we can
drive ourselves crazy trying to compress a dozen different typographical
traditions (and informal customs) into a few Elisp rules. On the other
hand, I don't think we need to throw up our hands and give up either! :)
>> Actually keeping count of what level you're at, accurately, is
>> a classic example of a non-regular language; you need a push-down
>> automaton to keep count, and regular expressions don't cut it.
> This is limited to 2 levels.
True.
>> I'm rambling. In sum, I'm going to start off /not/ trying to solve
>> that problem, and assume the writer is going to use alternating " and
>> ' as typography requires and not try to second-guess what level we're
>> at.
> You are right, the problem will be easier to solve with both " and '.
>
> Though, "as typography requires" is not true. In France, the /Imprimerie
> Nationale/ suggests using guillemets at both levels. Remember that
> typography is localized, which is the main difficulty of the
> implementation.
Also a good point.
All right, bottom line, this is sort of what I'm seeing. I'm not 100%
sure which files should house these things, but something like this:
1) a variable containing, for each language, a regexp for each of: open
double-quote, close double-quote, open single-quote, close single-quote,
and maybe mid-word apostrophe. Odds are these regexps (the ones detecting
the quotes, mind you) are going to be the same for just about all
languages, so there should probably be some sort of default that the
alist can just reference. A language should also be allowed to define
other quote regexps in its list. We need these to be ordered, with a
standard set, so that we can have...
2) for each *exporter* (including on-screen display), a variable that
defines, for each language, what the *substitution* will be for
open-double-quote, close-double-quote, etc. Other extras can be defined
too. That way we can have an exporter-independent way to detect quotes
to be smartified, but each exporter has its own way to smartify them.
3) Since most exporters are probably going to handle the process in
approximately the same way (match the regexp, stick in the associated
substitution), org-export.el should have a generic function that does
this, which each exporter *may* call in (or as) its quote-smartifier in
its text translator, unless it needs something more specific, which it
can provide itself.
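A rough sketch of how those three pieces might fit together, written in
Python rather than Elisp only to keep it short and runnable. Every name
here (DEFAULT_QUOTE_REGEXPS, QUOTE_REGEXPS, SUBSTITUTIONS,
smartify_quotes) is made up for illustration, and the regexps and HTML
entities are placeholder guesses, not a proposal for the real tables:

```python
import re

# (1) Ordered detection rules shared by most languages. Earlier rules win,
# so the mid-word apostrophe is tried before the single-quote rules.
# These patterns are illustrative guesses only.
DEFAULT_QUOTE_REGEXPS = [
    ("apostrophe",   r"(?<=\w)'(?=\w)"),
    ("open-double",  r'''(?<![^\s(\['"])"(?=\S)'''),
    ("open-single",  r'''(?<![^\s(\['"])'(?=\S)'''),
    ("close-double", r'"'),
    ("close-single", r"'"),
]

# Per-language table; here both languages just reference the default.
QUOTE_REGEXPS = {"en": DEFAULT_QUOTE_REGEXPS, "fr": DEFAULT_QUOTE_REGEXPS}

# (2) Per-exporter, per-language substitutions (assumed values). The "fr"
# entry uses guillemets at both levels, as mentioned earlier in the thread.
SUBSTITUTIONS = {
    "html": {
        "en": {"apostrophe": "&rsquo;",
               "open-double": "&ldquo;", "close-double": "&rdquo;",
               "open-single": "&lsquo;", "close-single": "&rsquo;"},
        "fr": {"apostrophe": "&rsquo;",
               "open-double": "&laquo;&nbsp;", "close-double": "&nbsp;&raquo;",
               "open-single": "&laquo;&nbsp;", "close-single": "&nbsp;&raquo;"},
    },
}


def smartify_quotes(text, exporter, language):
    """(3) Generic helper: match each rule, stick in the substitution."""
    rules = QUOTE_REGEXPS.get(language, DEFAULT_QUOTE_REGEXPS)
    table = SUBSTITUTIONS[exporter][language]
    # One alternation, so every quote is classified against the original
    # text in a single pass; group r0, r1, ... maps back to a rule index.
    combined = re.compile(
        "|".join(f"(?P<r{i}>{rx})" for i, (_, rx) in enumerate(rules)))

    def replace(match):
        name = rules[int(match.lastgroup[1:])][0]
        # Leave the character alone if the exporter defines no substitution.
        return table.get(name, match.group(0))

    return combined.sub(replace, text)


print(smartify_quotes('"Don\'t," she said, "say \'never\' again."', "html", "en"))
```

Note that nothing here tracks nesting level: each quote is classified
only by its immediate neighbours, so "quotes" and 'quotes' work in
either nesting, and a mismatched "quote' still gets an opening and a
closing mark.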
In terms of what is handled, the idea in my head is that we would expect
the writer to be using " or ' to surround their quotes, regardless of
what their native custom is (if they're doing it using their
language-specific quote-marks, we don't need to bother with all this
anyway). The goal is to handle either "quotes" or 'quotes' in either
nesting (or no nesting, if someone does "quote' for some reason), and
with any luck not get too confused with other uses of apostrophe.
It makes sense to me, but I bet I explained it badly and people are
going to have all kinds of issues with it. :)
No telling when (if?) I'll be able to produce something along these
lines, but it's something to start thinking about anyway.
~mark