From: "Mark E. Shoulson" <mark@kli.org>
To: Nicolas Goaziou <n.goaziou@gmail.com>
Cc: emacs-orgmode@gnu.org
Subject: Re: "Smart" quotes
Date: Tue, 29 May 2012 20:51:39 -0400
Message-ID: <4FC56F1B.5040201@kli.org>
In-Reply-To: <87ehq227ky.fsf@gmail.com>

On 05/29/2012 01:57 PM, Nicolas Goaziou wrote:
> Hello,
>
> "Mark E. Shoulson"<mark@kli.org>  writes:
>
>
>> I guess it doesn't actually matter, but it starts to get weird if you
>> find yourself looking arbitrarily far back, and then you start
>> building in exceptions for crossing paragraph boundaries...
> True. I had the exporter in mind, where you always start at the
> beginning of the paragraph. It would be more difficult with search
> starting in the middle of the paragraph.

Maybe the on-screen stuff is no harder; we'll just have to see.

>> And then there's the fact that multi-paragraph quotes usually have an
>> open-quote for each paragraph but only one close-quote at the end...
> Some French typographers suggest using a close-quote at the beginning
> of the paragraph to avoid that confusion, or simply dropping them (since
> they are a pain to maintain anyway). I don't know about other languages
> but, if that's the same, is it a good idea to bother implementing it?

I've never heard of it.  But I think we may be overthinking this; we can 
drive ourselves crazy trying to compress a dozen different typographical 
traditions (and informal customs) into a few Elisp rules.  On the other 
hand, I don't think we need to throw up our hands and give up either! :)

>> Actually keeping count of what level you're at, accurately, is
>> a classic example of a non-regular language; you need a push-down
>> automaton to keep count, and regular expressions don't cut it.
> This is limited to 2 levels.

True.

>> I'm rambling.  In sum, I'm going to start off /not/ trying to solve
>> that problem, and assume the writer is going to use alternating " and
>> ' as typography requires and not try to second-guess what level we're
>> at.
> You are right, the problem will be easier to solve with both " and '.
>
> Though, "as typography requires" is not true. In France, the /Imprimerie
> Nationale/ suggests using guillemets at both levels. Remember that
> typography is localized, which is the main difficulty of the
> implementation.

Also a good point.

All right, bottom line, this is sort of what I'm seeing.  I'm not 100% 
sure which files should house these things, but something like this:

1) a variable containing, for each language, a regexp for each of: open
double-quote, close double-quote, open single-quote, close single-quote,
and maybe mid-word apostrophe.  Odds are these regexps (the ones
detecting the quotes, mind you) are going to be the same for just about
all languages, so there should probably be some sort of default that the
alist can just reference.  A language should also be allowed to define
other quote regexps in its list.  These need to be ordered, with a
standard set, so that we can have...

2) for each *exporter* (including on-screen display), a variable that 
defines, for each language, what the *substitution* will be for 
open-double-quote, close-double-quote, etc.  Other extras can be defined 
too.  That way we can have an exporter-independent way to detect quotes 
to be smartified, but each exporter has its own way to smartify them.

3) Since most exporters are probably going to handle the process in
approximately the same way (match the regexp, stick in the associated
substitution), org-export.el should have a generic function that does
this, which each exporter *may* call in (or as) its quote-smartifier in
its text translator, unless it needs something more specific, which it
can provide itself.
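
To make points 1-3 a bit more concrete, here is a minimal sketch, written
in Python rather than Elisp purely for illustration.  Every name in it
(DEFAULT_QUOTE_REGEXPS, QUOTE_REGEXPS, SUBSTITUTIONS, smartify) is
hypothetical, the regexps are rough guesses, and the "utf8" and "html"
tables are sample data; none of this is existing org-export.el code.

```python
import re

# (1) Ordered detection regexps.  Earlier entries win when several could
# match at the same position.  These are assumed defaults, not real Org ones.
DEFAULT_QUOTE_REGEXPS = [
    ("apostrophe",   r"(?<=\w)'(?=\w)"),          # mid-word, as in don't
    ("open_double",  r'(?<![\w.,;:!?])"(?=\S)'),  # " before a word
    ("close_double", r'(?<=\S)"'),                # " after a word or punctuation
    ("open_single",  r"(?<![\w.,;:!?])'(?=\S)"),
    ("close_single", r"(?<=\S)'"),
]

# Per-language overrides; languages not listed fall back to the default.
QUOTE_REGEXPS = {}

# (2) Per-exporter, per-language substitutions (sample data only).
SUBSTITUTIONS = {
    "utf8": {
        "en": {"open_double": "\u201c", "close_double": "\u201d",
               "open_single": "\u2018", "close_single": "\u2019",
               "apostrophe": "\u2019"},
        # Guillemets at both levels, with no-break spaces inside.
        "fr": {"open_double": "\u00ab\u00a0", "close_double": "\u00a0\u00bb",
               "open_single": "\u00ab\u00a0", "close_single": "\u00a0\u00bb",
               "apostrophe": "\u2019"},
    },
    "html": {
        "en": {"open_double": "&ldquo;", "close_double": "&rdquo;",
               "open_single": "&lsquo;", "close_single": "&rsquo;",
               "apostrophe": "&rsquo;"},
    },
}


def smartify(text, exporter, language):
    """(3) Generic helper: match each regexp, stick in the substitution."""
    regexps = QUOTE_REGEXPS.get(language, DEFAULT_QUOTE_REGEXPS)
    table = SUBSTITUTIONS[exporter][language]
    # One alternation of named groups, tried in list order at each position.
    combined = re.compile(
        "|".join(f"(?P<{name}>{rx})" for name, rx in regexps))
    # Quote kinds the exporter has no substitution for are left untouched.
    return combined.sub(lambda m: table.get(m.lastgroup, m.group(0)), text)


print(smartify('"He said \'don\'t\' to me," she wrote.', "html", "en"))
# prints: &ldquo;He said &lsquo;don&rsquo;t&rsquo; to me,&rdquo; she wrote.
```

Note that this keeps no count of nesting level: each mark is classified
only by its immediate neighbours, so "quotes", 'quotes' and even a
mismatched "quote' each get an opening and a closing mark of whichever
kind was typed.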

In terms of what is handled, the idea in my head is that we would expect
the writer to be using " or ' to surround their quotes, regardless of
what their native custom is (if they're already using their
language-specific quote marks, we don't need to bother with all this
anyway).  The goal is to handle either "quotes" or 'quotes' in either
nesting (or no nesting, if someone does "quote' for some reason), and
with any luck not get too confused by other uses of the apostrophe.

It makes sense to me, but I bet I explained it badly and people are 
going to have all kinds of issues with it. :)

No telling when (if?) I'll be able to produce something along these 
lines, but it's something to start thinking about anyway.

~mark
