emacs-orgmode@gnu.org archives
* "Smart" quotes
@ 2012-05-22  3:32 Mark E. Shoulson
  2012-05-23 22:17 ` Nicolas Goaziou
  0 siblings, 1 reply; 23+ messages in thread
From: Mark E. Shoulson @ 2012-05-22  3:32 UTC (permalink / raw)
  To: org-mode mailing list

[-- Attachment #1: Type: text/html, Size: 1675 bytes --]

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: smartquotes2.patch --]
[-- Type: text/x-patch; name="smartquotes2.patch", Size: 2277 bytes --]

diff --git a/lisp/org-entities.el b/lisp/org-entities.el
index 8b5b3f3..ee54abc 100644
--- a/lisp/org-entities.el
+++ b/lisp/org-entities.el
@@ -47,6 +47,14 @@ in backends where the corresponding character is not available."
   :version "24.1"
   :type 'boolean)
 
+(defcustom org-smart-quotes nil
+  "Non-nil means display ' and \" characters as Unicode \"smart\" quotes.
+Org-mode will try to figure out if a quote character is opening or closing.
+
+Note: this does not affect export, only on-screen appearance."
+  :group 'org-entities
+  :type 'boolean)
+
 (defcustom org-entities-user nil
   "User-defined entities used in Org-mode to produce special characters.
 Each entry in this list is a list of strings.  It associates the name
diff --git a/lisp/org.el b/lisp/org.el
index 05f5375..213490e 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5926,6 +5926,7 @@ needs to be inserted at a specific position in the font-lock sequence.")
 		 '(1 'org-archived prepend))
 	   ;; Specials
 	   '(org-do-latex-and-special-faces)
+	   '(org-smartify-quotes)
 	   '(org-fontify-entities)
 	   '(org-raise-scripts)
 	   ;; Code
@@ -5948,6 +5949,33 @@ needs to be inserted at a specific position in the font-lock sequence.")
 		   '(org-font-lock-keywords t nil nil backward-paragraph))
     (kill-local-variable 'font-lock-keywords) nil))
 
+(defconst org-smart-quotes-regex
+  ;; ' is a word character, " is punctuation.
+  "\\(\"\\<\\)\\|\\>\\s.*\\(\"\\)\\|\\(?:\\W\\|^\\)\\('\\)\\|\\w\\s.*\\('\\)")
+
+
+(defun org-smartify-quotes (limit)
+  "Make 'smart quotes' out of straight quotes."
+  (let* (start end subst k)
+    (when org-smart-quotes
+      (catch 'match
+	(while (re-search-forward org-smart-quotes-regex
+		limit t)
+	  (cond ((match-string 1)
+		 (setq k 1 subst "“"))
+		((match-string 2)
+		 (setq k 2 subst "”"))
+		((match-string 3)
+		 (setq k 3 subst "‘"))
+		((match-string 4)
+		 (setq k 4 subst "’")))
+	  (add-text-properties (match-beginning k) (match-end k)
+			       (list 'font-lock-fontified t))
+	  (compose-region (match-beginning k) (match-end k) subst nil)
+	  (backward-char 1)
+	  (throw 'match t))
+	nil))))
+
 (defun org-toggle-pretty-entities ()
   "Toggle the composition display of entities as UTF8 characters."
   (interactive)

[-- Attachment #3: ChangeLog --]
[-- Type: text/plain, Size: 159 bytes --]

2012-05-21  Mark Shoulson  <mark@kli.org>

	* lisp/org.el, lisp/org-entities.el: added org-smart-quotes
	for displaying ' and " characters as "smart quotes."
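
The patch above leaves the buffer text untouched and only changes what is
displayed, by calling compose-region on each matched quote. A minimal
sketch of that display-only behaviour follows. The function name
my/compose-quotes-demo is hypothetical and is not part of the patch; it
only shows that the underlying characters stay as straight quotes.

```elisp
;; -*- lexical-binding: t; -*-
(defun my/compose-quotes-demo ()
  "Return (TEXT COMPOSED-P) after composing quotes in a scratch buffer.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert "\"hi\"")
    ;; Display the first and last characters as curly quotes.
    (compose-region 1 2 "“")
    (compose-region 4 5 "”")
    (list
     ;; The buffer text itself still holds the straight quotes.
     (buffer-substring-no-properties (point-min) (point-max))
     ;; The composition is recorded as a text property on the region.
     (and (get-text-property 1 'composition) t))))
```

Because only a text property is added, anything that reads the buffer
contents (saving, searching, exporting) still sees ' and ".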


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: "Smart" quotes
  2012-05-22  3:32 "Smart" quotes Mark E. Shoulson
@ 2012-05-23 22:17 ` Nicolas Goaziou
  2012-05-24  3:05   ` Mark E. Shoulson
  0 siblings, 1 reply; 23+ messages in thread
From: Nicolas Goaziou @ 2012-05-23 22:17 UTC (permalink / raw)
  To: Mark E. Shoulson; +Cc: org-mode mailing list

Hello,


"Mark E. Shoulson" <mark@kli.org> writes:

> "Smart" quotes can be annoying when they aren't smart enough. But when
> they work you can miss them. I'm attaching a patch that defines a
> custom variable org-smart-quotes (nil by default), which when non-nil
> causes the " and ' characters to display as “smart” quotes, hopefully
> the right ones. They're still ' and " in the underlying text, just
> overlaid with “”.

This is not related to entities, so code shouldn't be in org-entities.el.

Also, quotes are dependent on locale[fn:1]. English/US only quotes look
like a niche to me. Would it be possible to modify the patch and have
this feature handle LANGUAGE keyword, or at least have a support for it?


Regards,

[fn:1] https://en.wikipedia.org/wiki/Non-English_usage_of_quotation_marks

-- 
Nicolas Goaziou


* Re: "Smart" quotes
  2012-05-23 22:17 ` Nicolas Goaziou
@ 2012-05-24  3:05   ` Mark E. Shoulson
  2012-05-25 17:14     ` Nicolas Goaziou
  0 siblings, 1 reply; 23+ messages in thread
From: Mark E. Shoulson @ 2012-05-24  3:05 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: org-mode mailing list

On 05/23/2012 06:17 PM, Nicolas Goaziou wrote:
> Hello,
>
>
> "Mark E. Shoulson"<mark@kli.org>  writes:
>
>> "Smart" quotes can be annoying when they aren't smart enough. But when
>> they work you can miss them. I'm attaching a patch that defines a
>> custom variable org-smart-quotes (nil by default), which when non-nil
>> causes the " and ' characters to display as “smart” quotes, hopefully
>> the right ones. They're still ' and " in the underlying text, just
>> overlaid with “”.
> This is not related to entities, so code shouldn't be in org-entities.el.
Agreed.

>
> Also, quotes are dependent on locale[fn:1]. English/US only quotes look
> like a niche to me. Would it be possible to modify the patch and have
> this feature handle LANGUAGE keyword, or at least have a support for it?
Hm.  I like the idea, but it raises some questions for me.  It would be 
particularly good if this could share code/custom variables with the 
pieces of the (new) exporter that make smart quotes on export.  That way 
we could be sure that what it looks like onscreen would also be what it 
looked like when exported.  Looking at contrib/lisp/org-e-latex.el as an 
upcoming exporter for such things, I see a variable org-e-latex-quotes, 
which has nice language-aware parts... but misses an important point.  
Each language gets to define one regexp for opening quotes, one for 
closing quotes, and one for single quotes.  But don't we want to talk 
about (at least) two levels of quotes, see your own reference[fn:1]?  
Single quotes would be for inner, second-level quotes (if we're using 
double straight quotes according to (American) English usage, I would 
guess we'd be using single straight quotes the same way).  That works 
okay for English, where a single apostrophe not part of a grouping 
construct is going to be interpreted as a "close" single quote and look 
right for an apostrophe.  It might not work so well in French where 
apostrophes are also used, but also single guillemets for inner-level 
quotes.  Does the setup there need to be smarter, or at least more 
extensible, to allow for more than exactly three entries?  Clever enough 
regexps could distinguish inner quotes from apostrophes, etc.  
Should/can we consider extending this for the new exporters?

(I'm looking forward to HTML and ODT exporters that can do smart quotes; 
the straight quotes are really the main jarring things about using Org 
as a lightweight markup and exporting into something fancier)

~mark


* Re: "Smart" quotes
  2012-05-24  3:05   ` Mark E. Shoulson
@ 2012-05-25 17:14     ` Nicolas Goaziou
  2012-05-25 17:51       ` Jambunathan K
  2012-05-25 22:51       ` Mark E. Shoulson
  0 siblings, 2 replies; 23+ messages in thread
From: Nicolas Goaziou @ 2012-05-25 17:14 UTC (permalink / raw)
  To: Mark E. Shoulson; +Cc: org-mode mailing list

Hello,

"Mark E. Shoulson" <mark@kli.org> writes:

> Hm.  I like the idea, but it raises some questions for me.  It would
> be particularly good if this could share code/custom variables with
> the pieces of the (new) exporter that make smart quotes on export.
> That way we could be sure that what it looks like onscreen would also
> be what it looked like when exported.

It could be interesting, but keep in mind that no matter how "smart" your
quotes are, they will fail in some situations. So, it will have to be
optional for export, independently of their in-buffer status.

The OPTIONS keyword may be used, with q:t and q:nil items.

> Looking at contrib/lisp/org-e-latex.el as an upcoming exporter for
> such things, I see a variable org-e-latex-quotes, which has nice
> language-aware parts... but misses an important point.  Each language
> gets to define one regexp for opening quotes, one for closing quotes,
> and one for single quotes.  But don't we want to talk about (at least)
> two levels of quotes, see your own reference[fn:1]?

Probably. But that's going to be somewhat harder.

> Single quotes would be for inner, second-level quotes (if we're using
> double straight quotes according to (American) English usage, I would
> guess we'd be using single straight quotes the same way).  That works
> okay for English, where a single apostrophe not part of a grouping
> construct is going to be interpreted as a "close" single quote and
> look right for an apostrophe.

The regexp may be able to tell level 1 from level 2 quotes.

> It might not work so well in French where apostrophes are also used,

There are no spaces around apostrophes, so they shouldn't be caught by
the regexp.

> but also single guillemets for inner-level quotes.

What are single guillemets? I don't think there is such a thing in French.

> Should/can we consider extending this for the new exporters?

I think it would be a good addition to the export mechanism, if you want
to give it a try.

> (I'm looking forward to HTML and ODT exporters that can do smart
> quotes; the straight quotes are really the main jarring things about
> using Org as a lightweight markup and exporting into something
> fancier)

A function, provided in org-export, could help changing dumb quotes into
smart quotes in plain text. Then, it would be easier for back-ends to
provide the feature, if they wanted to.


Regards,

-- 
Nicolas Goaziou


* Re: "Smart" quotes
  2012-05-25 17:14     ` Nicolas Goaziou
@ 2012-05-25 17:51       ` Jambunathan K
  2012-05-25 22:51       ` Mark E. Shoulson
  1 sibling, 0 replies; 23+ messages in thread
From: Jambunathan K @ 2012-05-25 17:51 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Mark E. Shoulson, org-mode mailing list


> It could be interesting, but keep in mind that no matter how "smart" your
> quotes are, they will fail in some situations. So, it will have to be
> optional for export, independently of their in-buffer status.
>
> The OPTIONS keyword may be used, with q:t and q:nil items.

I don't see an entry for this in `org-export-options-alist'.  So I
believe you are soliciting opinion on a fresh addition.

>> (I'm looking forward to HTML and ODT exporters that can do smart
>> quotes; the straight quotes are really the main jarring things about
>> using Org as a lightweight markup and exporting into something
>> fancier)
>
> A function, provided in org-export, could help changing dumb quotes into
> smart quotes in plain text. Then, it would be easier for back-ends to
> provide the feature, if they wanted to.

I can use it, if made available.  I think it will help if we force
all exporters to produce UTF-8 files.
-- 


* Re: "Smart" quotes
  2012-05-25 17:14     ` Nicolas Goaziou
  2012-05-25 17:51       ` Jambunathan K
@ 2012-05-25 22:51       ` Mark E. Shoulson
  2012-05-26  6:48         ` Nicolas Goaziou
  1 sibling, 1 reply; 23+ messages in thread
From: Mark E. Shoulson @ 2012-05-25 22:51 UTC (permalink / raw)
  To: emacs-orgmode

On 05/25/2012 01:14 PM, Nicolas Goaziou wrote:
> Hello,
>
> "Mark E. Shoulson"<mark@kli.org>  writes:
>
>> Hm.  I like the idea, but it raises some questions for me.  It would
>> be particularly good if this could share code/custom variables with
>> the pieces of the (new) exporter that make smart quotes on export.
>> That way we could be sure that what it looks like onscreen would also
>> be what it looked like when exported.
> It could be interesting, but keep in mind that no matter how "smart" your
> quotes are, they will fail in some situations. So, it will have to be
> optional for export, independently of their in-buffer status.
>
> The OPTIONS keyword may be used, with q:t and q:nil items.

"Smart" quotes absolutely have to be optional, and probably disabled by 
default.  They're going to fail sometimes, so they should only be there 
when you ask for them.  Smart-quotes-for-export and 
smart-quotes-onscreen need to be settable independently, yes.  
Smart-quotes-for-export needs to be settable per-file/per-buffer, with 
OPTIONS or something.  Smart-quotes-onscreen doesn't have to be 
buffer-local, though it might be a good idea.  Using q:t or maybe ":t in 
options seems perfectly good for setting exporting smart quotes.  It 
still would be good if onscreen and export could share code.

>> Looking at contrib/lisp/org-e-latex.el as an upcoming exporter for
>> such things, I see a variable org-e-latex-quotes, which has nice
>> language-aware parts... but misses an important point.  Each language
>> gets to define one regexp for opening quotes, one for closing quotes,
>> and one for single quotes.  But don't we want to talk about (at least)
>> two levels of quotes, see your own reference[fn:1]?
> Probably. But that's going to be somewhat harder.
>
>> Single quotes would be for inner, second-level quotes (if we're using
>> double straight quotes according to (American) English usage, I would
>> guess we'd be using single straight quotes the same way).  That works
>> okay for English, where a single apostrophe not part of a grouping
>> construct is going to be interpreted as a "close" single quote and
>> look right for an apostrophe.
> The regexp may be able to tell level 1 from level 2 quotes.

Do you mean that the author would use the same characters for both first 
and second level quotes, and the regexp would be smart enough to 
distinguish which level each was at?  I don't think that's possible, and 
you probably don't either.  What I meant, and you probably did as well, 
was that if we use apostrophes for second-level quotes, a regexp can be 
smart enough to tell the difference between a second-level quote and a 
non-quote apostrophe....

>> It might not work so well in French where apostrophes are also used,
> There are no spaces around apostrophes, so they shouldn't be caught by
> the regexp.

which is what you say here.  They *should* be caught by a regexp, but 
not the same one; they need to be smartified also, just not necessarily 
treated the same as second-level quotes.

>> but also single guillemets for inner-level quotes.
> What are single guillemets? I don't think there is such a thing in French.

You're right; the Wikipedia page says that French uses quote-marks or 
the same double-chevrons for inner quotes.  I thought it used \lsaquo 
and \rsaquo, « like ‹ this › ».  Looks like it does in Swiss typography 
for various languages, according to the page.  Danish also uses the 
single-chevrons (pointing the other direction), and Azerbaijani and 
Basque, etc... Whatever.  What I meant was, if people are going to be 
writing using straight ascii quotes and expect them to be changed into 
language-appropriate quotes, they're going to want something like

"this is a 'quote', and that's all you need to know."

becoming, for instance

«this is a ‹quote›, and that’s all you need to know.»

that is, it should be possible to use the single quotes for inner 
quotes, which would mean more than just opening/closing/single in the 
org-e-latex-quotes (and analogous variables in other exporters).  Being 
able to determine when you need ‹› and when ’ might be a little 
uncertain, but it isn't hard to make a regexp that can make a decent 
guess at it.

>> Should/can we consider extending this for the new exporters?
> I think it would be a good addition to the export mechanism, if you want
> to give it a try.

I'd love to get org more export-friendly.  I'll see what I can 
understand of the (new) export code.

>> (I'm looking forward to HTML and ODT exporters that can do smart
>> quotes; the straight quotes are really the main jarring things about
>> using Org as a lightweight markup and exporting into something
>> fancier)
> A function, provided in org-export, could help changing dumb quotes into
> smart quotes in plain text. Then, it would be easier for back-ends to
> provide the feature, if they wanted to.
That sounds like a possibility, might make for good generic handling, 
only one bit of code to treat everything consistently... yeah, I didn't 
like the idea at first, I'm starting to like it more.  I'll think on it too.

~mark


* Re: "Smart" quotes
  2012-05-25 22:51       ` Mark E. Shoulson
@ 2012-05-26  6:48         ` Nicolas Goaziou
  2012-05-29  1:30           ` Mark E. Shoulson
  0 siblings, 1 reply; 23+ messages in thread
From: Nicolas Goaziou @ 2012-05-26  6:48 UTC (permalink / raw)
  To: Mark E. Shoulson; +Cc: emacs-orgmode

Hello,

"Mark E. Shoulson" <mark@kli.org> writes:

>> The regexp may be able to tell level 1 from level 2 quotes.
>
> Do you mean that the author would use the same characters for both
> first and second level quotes, and the regexp would be smart enough to
> distinguish which level each was at?  I don't think that's possible,
> and you probably don't either.

Actually, I do. Since you can tell an opening quote from a closing one
by the position of the white space (or parenthesis, beginning/end of
line) near it, I think you can deduce the quote level. I may be wrong,
though.

> "this is a 'quote', and that's all you need to know."
>
> becoming, for instance
>
> «this is a ‹quote›, and that’s all you need to know.»

"this is a "quote", and that's all you need to know" is as parsable to
me.
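
The whitespace heuristic described above can be sketched as follows. This
is only an illustration under stated assumptions, not code from the
thread: the name my/smartify-double-quotes is hypothetical, a quote counts
as "opening" when it starts the text or follows whitespace or an opening
bracket, only two levels are distinguished, and the « » / ‹ › glyphs are
borrowed from the example quoted earlier.

```elisp
;; -*- lexical-binding: t; -*-
(defun my/smartify-double-quotes (text)
  "Return TEXT with each \" replaced by a level-aware quote mark.
Hypothetical helper: outer quotes become « », nested ones ‹ ›."
  (let ((depth 0)
        (out '()))
    (dotimes (i (length text))
      (let ((ch (aref text i)))
        (if (/= ch ?\")
            (push (string ch) out)
          (let* ((prev (and (> i 0) (aref text (1- i))))
                 ;; Opening quote: start of text, or preceded by
                 ;; whitespace or an opening bracket.
                 (opening (or (null prev)
                              (memq prev '(?\s ?\t ?\n ?\( ?\[)))))
            (if opening
                (progn
                  (setq depth (1+ depth))
                  (push (if (= depth 1) "«" "‹") out))
              ;; Closing quote: choose the glyph for the current
              ;; level, then step back out one level.
              (push (if (<= depth 1) "»" "›") out)
              (setq depth (max 0 (1- depth))))))))
    (apply #'concat (nreverse out))))
```

Called on the sentence above, written with double quotes at both levels,
this walks the quotes in order (open, open, close, close) and leaves the
apostrophe in "that's" alone, since only " is examined.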

As a side note, at least in French, many typographers would recommend
"this is a /quote/, and that's all you need to know" here. Oh, and
I know that was just an example.

> I'd love to get org more export-friendly.  I'll see what I can
> understand of the (new) export code.

Do not hesitate to ask questions about it.


Regards,

-- 
Nicolas Goaziou


* Re: "Smart" quotes
  2012-05-26  6:48         ` Nicolas Goaziou
@ 2012-05-29  1:30           ` Mark E. Shoulson
  2012-05-29 17:57             ` Nicolas Goaziou
  0 siblings, 1 reply; 23+ messages in thread
From: Mark E. Shoulson @ 2012-05-29  1:30 UTC (permalink / raw)
  To: emacs-orgmode

On 05/26/2012 02:48 AM, Nicolas Goaziou wrote:
> Hello,
>
> "Mark E. Shoulson"<mark@kli.org>  writes:
>
>>> The regexp may be able to tell level 1 from level 2 quotes.
>> Do you mean that the author would use the same characters for both
>> first and second level quotes, and the regexp would be smart enough to
>> distinguish which level each was at?  I don't think that's possible,
>> and you probably don't either.
> Actually, I do. Since you can tell an opening quote from a closing one
> by the position of the white space (or parenthesis, beginning/end of
> line) near it, I think you can deduce the quote level. I may be wrong,
> though.

Maybe, if it's all on one line.  But if the quote is several lines long, 
can you sensibly count the levels?  I guess it doesn't actually matter, 
but it starts to get weird if you find yourself looking arbitrarily far 
back, and then you start building in exceptions for crossing paragraph 
boundaries... And then there's the fact that multi-paragraph quotes 
usually have an open-quote for each paragraph but only one close-quote 
at the end... Actually keeping count of what level you're at, 
accurately, is a classic example of a non-regular language; you need a 
push-down automaton to keep count, and regular expressions don't cut 
it.  Then again, Emacs regexps are more powerful than simple regular 
expressions, and we only would want to keep track of even vs odd level 
anyway.

I'm rambling.  In sum, I'm going to start off /not/ trying to solve that 
problem, and assume the writer is going to use alternating " and ' as 
typography requires and not try to second-guess what level we're at.  As 
that progresses, maybe I'll come to understand better what can and can't 
(and should and shouldn't) be deduced by the regexps.

>> "this is a 'quote', and that's all you need to know."
>>
>> becoming, for instance
>>
>> «this is a ‹quote›, and that’s all you need to know.»
> "this is a "quote", and that's all you need to know" is as parsable to
> me.
>
> As a side note, at least in French, many typographers would recommend
> "this is a /quote/, and that's all you need to know" here. Oh, and
> I know that was just an example.

I see; because I can tell that the second " must be an open-quote and 
not closing the first, due to its position relative to the spaces.  It 
does seem possible, but I think I'm going to try not solving that 
problem first.

(And French typography raises other problems, since French puts lots of 
space around the quote-marks, to the extent that French typists typing 
plain-text will often put a space on both sides of a quote-mark, making 
it hard to see whether it opens or closes... another issue, not 
necessarily solvable, to watch for.)

~mark


* Re: "Smart" quotes
  2012-05-29  1:30           ` Mark E. Shoulson
@ 2012-05-29 17:57             ` Nicolas Goaziou
  2012-05-30  0:51               ` Mark E. Shoulson
  0 siblings, 1 reply; 23+ messages in thread
From: Nicolas Goaziou @ 2012-05-29 17:57 UTC (permalink / raw)
  To: Mark E. Shoulson; +Cc: emacs-orgmode

Hello,

"Mark E. Shoulson" <mark@kli.org> writes:

> Maybe, if it's all on one line.  But if the quote is several lines
> long, can you sensibly count the levels?

Well, yes.

> I guess it doesn't actually matter, but it starts to get weird if you
> find yourself looking arbitrarily far back, and then you start
> building in exceptions for crossing paragraph boundaries...

True. I had the exporter in mind, where you always start at the
beginning of the paragraph. It would be more difficult with search
starting in the middle of the paragraph.

> And then there's the fact that multi-paragraph quotes usually have an
> open-quote for each paragraph but only one close-quote at the end...

Some French typographers suggest using a close-quote at the beginning
of the paragraph to avoid that confusion, or simply dropping them (since
they are a pain to maintain anyway). I don't know about other languages
but, if that's the same, is it a good idea to bother implementing it?

> Actually keeping count of what level you're at, accurately, is
> a classic example of a non-regular language; you need a push-down
> automaton to keep count, and regular expressions don't cut it.

This is limited to 2 levels.

> I'm rambling.  In sum, I'm going to start off /not/ trying to solve
> that problem, and assume the writer is going to use alternating " and
> ' as typography requires and not try to second-guess what level we're
> at.

You are right, the problem will be easier to solve with both " and '.

Though, "as typography requires" is not true. In France, the /Imprimerie
Nationale/ suggests using guillemets at both levels. Remember that
typography is localized, which is the main difficulty of the
implementation.


Regards,

-- 
Nicolas Goaziou


* Re: "Smart" quotes
  2012-05-29 17:57             ` Nicolas Goaziou
@ 2012-05-30  0:51               ` Mark E. Shoulson
  2012-05-31  1:50                 ` (no subject) Mark Shoulson
  0 siblings, 1 reply; 23+ messages in thread
From: Mark E. Shoulson @ 2012-05-30  0:51 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

On 05/29/2012 01:57 PM, Nicolas Goaziou wrote:
> Hello,
>
> "Mark E. Shoulson"<mark@kli.org>  writes:
>
>
>> I guess it doesn't actually matter, but it starts to get weird if you
>> find yourself looking arbitrarily far back, and then you start
>> building in exceptions for crossing paragraph boundaries...
> True. I had the exporter in mind, where you always start at the
> beginning of the paragraph. It would be more difficult with search
> starting in the middle of the paragraph.

Maybe the on-screen stuff is no harder; will just have to see.

>> And then there's the fact that multi-paragraph quotes usually have an
>> open-quote for each paragraph but only one close-quote at the end...
> Some French typographers suggest using a close-quote at the beginning
> of the paragraph to avoid that confusion, or simply dropping them (since
> they are a pain to maintain anyway). I don't know about other languages
> but, if that's the same, is it a good idea to bother implementing it?

I've never heard of it.  But I think we may be overthinking this; we can 
drive ourselves crazy trying to compress a dozen different typographical 
traditions (and informal customs) into a few Elisp rules.  On the other 
hand, I don't think we need to throw up our hands and give up either! :)

>> Actually keeping count of what level you're at, accurately, is
>> a classic example of a non-regular language; you need a push-down
>> automaton to keep count, and regular expressions don't cut it.
> This is limited to 2 levels.
True.
>> I'm rambling.  In sum, I'm going to start off /not/ trying to solve
>> that problem, and assume the writer is going to use alternating " and
>> ' as typography requires and not try to second-guess what level we're
>> at.
> You are right, the problem will be easier to solve with both " and '.
>
> Though, "as typography requires" is not true. In France, the /Imprimerie
> Nationale/ suggests using guillemets at both levels. Remember that
> typography is localized, which is the main difficulty of the
> implementation.

Also a good point.

All right, bottom line, this is sort of what I'm seeing.  I'm not 100% 
sure which files should house these things, but something like this:

1) a variable containing, for each language, a regexp for each of: open 
double-quote, close double-quote, open single-quote, close single-quote, 
and maybe mid-word apostrophe.  Odds are these regexps are going to be 
the same for just about all languages (the regexps detecting them, mind 
you), so probably should have some sort of default that the alist can 
just reference.  A language should also be allowed to define other quote 
regexps in its list too.  We need these to be ordered, with a standard 
set, so that we can have...

2) for each *exporter* (including on-screen display), a variable that 
defines, for each language, what the *substitution* will be for 
open-double-quote, close-double-quote, etc.  Other extras can be defined 
too.  That way we can have an exporter-independent way to detect quotes 
to be smartified, but each exporter has its own way to smartify them.

3) Since most exporters are probably going to handle the 
process in approximately the same way (match the regexp, stick in the 
associated substitution), org-export.el should have a generic function 
that does this which each exporter *may* call in (or as) its 
quote-smartifier in its text translator, unless it needs something more 
specific which it can provide itself.
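
The three pieces above could fit together roughly as in the sketch below.
It is not the patch that follows later in the thread. Every name here
(my/quote-regexps, my/html-quote-replacements, my/smartify-quotes) is
hypothetical, one shared set of detection regexps stands in for the
per-language table, and the rule order (apostrophe before single quotes)
is an assumption made so that the catch-all rules come last.

```elisp
;; -*- lexical-binding: t; -*-

;; (1) Detection rules, tried in this order.  Group 1 is always the
;; straight quote character to be replaced.  Hypothetical names.
(defconst my/quote-regexps
  '((open-double  . "\\(?:^\\|[ \t\n([]\\)\\(\"\\)")
    (close-double . "\\(\"\\)")
    (apostrophe   . "[[:alpha:]]\\('\\)[[:alpha:]]")
    (open-single  . "\\(?:^\\|[ \t\n([]\\)\\('\\)")
    (close-single . "\\('\\)")))

;; (2) One back-end's substitutions per language, listed in the same
;; order as `my/quote-regexps'.
(defconst my/html-quote-replacements
  '(("fr" "« " " »" "’" "‘" "’")
    ("en" "“" "”" "’" "‘" "’")))

;; (3) Generic function a back-end may call from its text translator.
(defun my/smartify-quotes (text language replacements)
  "Return TEXT with straight quotes replaced for LANGUAGE.
REPLACEMENTS is an alist like `my/html-quote-replacements'.
TEXT is returned unchanged when LANGUAGE has no entry."
  (let ((subs (cdr (assoc language replacements)))
        (rules my/quote-regexps))
    (if (null subs)
        text
      (with-temp-buffer
        (insert text)
        (while (and rules subs)
          (goto-char (point-min))
          (while (re-search-forward (cdr (car rules)) nil t)
            ;; Replace only group 1, literally, keeping the context.
            (replace-match (car subs) t t nil 1))
          (setq rules (cdr rules)
                subs (cdr subs)))
        (buffer-string)))))
```

A back-end with different needs would pass its own replacement table, or
skip the generic function entirely and provide its own.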

In terms of what is handled, the idea in my head is that we would expect 
the writer to be using " or ' to surround their quotes, regardless of 
what their native custom is (if they're doing it using their 
language-specific quote-marks, we don't need to bother with all this 
anyway).  Goal is to handle either "quotes" or 'quotes' in either 
nesting (or no nesting, if someone does "quote' for some reason), and 
with any luck not get too confused with other uses of apostrophe.

It makes sense to me, but I bet I explained it badly and people are 
going to have all kinds of issues with it. :)

No telling when (if?) I'll be able to produce something along these 
lines, but it's something to start thinking about anyway.

~mark


* Re: (no subject)
  2012-05-30  0:51               ` Mark E. Shoulson
@ 2012-05-31  1:50                 ` Mark Shoulson
  2012-05-31 13:38                   ` Nicolas Goaziou
  0 siblings, 1 reply; 23+ messages in thread
From: Mark Shoulson @ 2012-05-31  1:50 UTC (permalink / raw)
  To: emacs-orgmode

Mark E. Shoulson <mark <at> kli.org> writes:


> 
> All right, bottom line, this is sort of what I'm seeing.  I'm not 100% 
> sure which files should house these things, but something like this:
> 
> 1) a variable containing, for each language, a regexp for each of: open 
> double-quote, close double-quote, open single-quote, close single-quote, 
> and maybe mid-word apostrophe.  Odds are these regexps are going to be 
> the same for just about all languages (the regexps detecting them, mind 
> you), so probably should have some sort of default that the alist can 
> just reference.  A language should also be allowed to define other quote 
> regexps in its list too.  We need these to be ordered, with a standard 
> set, so that we can have...
> 
> 2) for each *exporter* (including on-screen display), a variable that 
> defines, for each language, what the *substitution* will be for 
> open-double-quote, close-double-quote, etc.  Other extras can be defined 
> too.  That way we can have an exporter-independent way to detect quotes 
> to be smartified, but each exporter has its own way to smartify them.
> 
> 3) Since most exporters are probably going to handle the 
> process in approximately the same way (match the regexp, stick in the 
> associated substitution), org-export.el should have a generic function 
> that does this which each exporter *may* call in (or as) its 
> quote-smartifier in its text translator, unless it needs something more 
> specific which it can provide itself.
> 
> In terms of what is handled, the idea in my head is that we would expect 
> the writer to be using " or ' to surround their quotes, regardless of 
> what their native custom is (if they're doing it using their 
> language-specific quote-marks, we don't need to bother with all this 
> anyway).  Goal is to handle either "quotes" or 'quotes' in either 
> nesting (or no nesting, if someone does "quote' for some reason), and 
> with any luck not get too confused with other uses of apostrophe.
> 
> It makes sense to me, but I bet I explained it badly and people are 
> going to have all kinds of issues with it. :)
> 
> No telling when (if?) I'll be able to produce something along these 
> lines, but it's something to start thinking about anyway.
> 
> ~mark
> 
> 


Regarding the "this is what I'm seeing", I paste at the bottom a
preliminary patch.  It is totally *not* worth actually applying it unless you
want to develop this; it's a snapshot mid-development.  But it does seem to
actually work.  The same set of regexps is used, and the same function, though
that is defeasible and an exporter can define its own.  The hardest part was
getting the onscreen versions showing right (also the most recently and probably
best tested, so the actual exporters might be more bumpy).  Actual substitutions
being used are not necessarily typographically sensible; chosen more so it's
easier to see the action of the process.  Nothing is in the right place, things
that should be customizables aren't... it's proof-of-concept. Am I going in the
right direction, as far as export-engine is concerned?

==========

From 420048063e3fd2af1b019c48864d58d82cef62ef Mon Sep 17 00:00:00 2001
From: Mark Shoulson <mark@kli.org>
Date: Tue, 29 May 2012 23:01:12 -0400
Subject: [PATCH] Just barely works, nothing in the right places.  For
 entertainment purposes only.

---
 contrib/lisp/org-e-html.el  |    5 ++++
 contrib/lisp/org-e-latex.el |   53 +++++++++++++++++++++----------------------
 contrib/lisp/org-export.el  |   52 ++++++++++++++++++++++++++++++++++++++++++
 lisp/org.el                 |   50 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 133 insertions(+), 27 deletions(-)

diff --git a/contrib/lisp/org-e-html.el b/contrib/lisp/org-e-html.el
index de98493..b851713 100644
--- a/contrib/lisp/org-e-html.el
+++ b/contrib/lisp/org-e-html.el
@@ -1077,6 +1077,11 @@ in order to mimic default behaviour:
 
 ;;;; Plain text
 
+(defvar org-e-html-quote-replacements
+  '(("fr" "« " " »" "‘" "’" "’")
+    ("en" "“" "”" "‘" "’" "’")
+    ("de" "„" "“" "‚" "‘" "’")))
+
 (defcustom org-e-html-quotes
   '(("fr"
      ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
diff --git a/contrib/lisp/org-e-latex.el b/contrib/lisp/org-e-latex.el
index 67e9197..540ebe1 100644
--- a/contrib/lisp/org-e-latex.el
+++ b/contrib/lisp/org-e-latex.el
@@ -687,6 +687,10 @@ during latex export it will output
 
 ;;;; Plain text
 
+(defvar org-e-latex-quote-replacements
+  '(("fr" "«~" "~»" "‹~" "~›" "/!")
+    ("en" "((" "))" ".(" ")." "/")))
+
 (defcustom org-e-latex-quotes
   '(("fr"
      ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
@@ -699,25 +703,22 @@ during latex export it will output
   "Alist for quotes to use when converting english double-quotes.
 
 The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
+The CDR of each item in this alist is a list of CONS:
+- the first CONS should define the opening quote;
+- the second CONS should define the closing quote;
+- subsequent CONS should define any other quotes, e.g. single, etc.
 
 For each item in a CONS, the first string is a regexp
 for allowed characters before/after the quote, the second
 string defines the replacement string for this quote."
   :group 'org-export-e-latex
-  :type '(list
-	  (cons :tag "Opening quote"
-		(string :tag "Regexp for char before")
-		(string :tag "Replacement quote     "))
-	  (cons :tag "Closing quote"
-		(string :tag "Regexp for char after ")
-		(string :tag "Replacement quote     "))
-	  (cons :tag "Single quote"
-		(string :tag "Regexp for char before")
-		(string :tag "Replacement quote     "))))
+  :type '(repeat
+	  (cons
+	   (string :tag "language code")
+	   (repeat
+	    (cons :tag "Quote"
+		  (string :tag "Regexp ")
+		  (string :tag "Replacement quote     "))))))
 
 
 ;;;; Compilation
@@ -852,19 +853,17 @@ nil."
 	     options
 	     ","))
 
-(defun org-e-latex--quotation-marks (text info)
-  "Export quotation marks depending on language conventions.
-TEXT is a string containing quotation marks to be replaced.  INFO
-is a plist used as a communication channel."
-  (mapc (lambda(l)
-	  (let ((start 0))
-	    (while (setq start (string-match (car l) text start))
-	      (let ((new-quote (concat (match-string 1 text) (cdr l))))
-		(setq text (replace-match new-quote  t t text))))))
-	(cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
-		 ;; Falls back on English.
-		 (assoc "en" org-e-latex-quotes))))
-  text)
+(defun org-e-latex--quotation-marks (text info) 
+  (org-export-quotation-marks text info org-e-latex-quote-replacements))
+  ;; (mapc (lambda(l)
+  ;; 	  (let ((start 0))
+  ;; 	    (while (setq start (string-match (car l) text start))
+  ;; 	      (let ((new-quote (concat (match-string 1 text) (cdr l))))
+  ;; 		(setq text (replace-match new-quote  t t text))))))
+  ;; 	(cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
+  ;; 		 ;; Falls back on English.
+  ;; 		 (assoc "en" org-e-latex-quotes))))
+  ;; text)
 
 (defun org-e-latex--wrap-label (element output)
   "Wrap label associated to ELEMENT around OUTPUT, if appropriate.
diff --git a/contrib/lisp/org-export.el b/contrib/lisp/org-export.el
index b9294e5..aacb448 100644
--- a/contrib/lisp/org-export.el
+++ b/contrib/lisp/org-export.el
@@ -284,6 +284,58 @@ rules.")
   :tag "Org Export General"
   :group 'org-export)
 
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Probably a defcustom eventually.
+
+;; Each element of this consists of: car=language code, cdr=list of
+;; double-quote-open-regexp, double-quote-close-regexp,
+;; single-quote-open-regexp, single-quote-close-regexp, &optional
+;; single-apostrophe regexp?
+;; Just about all will be the same anyway, so mostly language DEFAULT.
+
+;; For testing purposes, poorly-designed at first.
+(defvar org-export-quotes-regexps
+  '((DEFAULT 
+      "\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w" 
+      "\\(?:\\S-\\)\\(\"\\)\\s-" 
+      "\\(?:\\s-\\|(\\|^\\)\\('\\)\\w"
+      "\\w\\('\\)\\(?:\\s-\\|\\s.\\|$\\)"
+      "\\w\\('\\)\\w")))
+
+;; Generic function, usable by exporters, but they can define their own
+;; instead.
+(defun org-export-quotation-marks (text info replacements)
+  "Export quotation marks depending on language conventions.
+TEXT is a string containing quotation marks to be replaced.  INFO
+is a plist used as a communication channel."
+  (let* ((start 0)
+	 (regexps 
+	  (cdr 
+	   (or 
+	    (assoc (plist-get info :language)
+		   org-export-quotes-regexps)
+	    (assoc 'DEFAULT org-export-quotes-regexps))))
+	 (subs (cdr (or (assoc (plist-get info :language)
+			       replacements)
+			(assoc "en" replacements))))
+	 (quotes (pairlis regexps subs)))
+    (mapc (lambda (p)
+	    (let ((re (car p))
+		  (su (cdr p)))
+	      (while (setq start (string-match re text start))
+		(setq text (replace-match su t t text 1)))))
+	  quotes))
+  text)
+
+(defvar org-screen-smart-quotes
+  '(("en" "“" "”" "‘" "’" "’")
+    ("fr" "«" "»" "‹" "›" "’")
+    ("de" "„" "“" "‚" "’" "’")))
+
+
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
 (defcustom org-export-with-archived-trees 'headline
   "Whether sub-trees with the ARCHIVE tag should be exported.
 
diff --git a/lisp/org.el b/lisp/org.el
index 8a141cf..72bf4b0 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5927,6 +5927,7 @@ needs to be inserted at a specific position in the font-lock sequence.")
 	   ;; Specials
 	   '(org-do-latex-and-special-faces)
 	   '(org-fontify-entities)
+	   '(org-fontify-quotes)
 	   '(org-raise-scripts)
 	   ;; Code
 	   '(org-activate-code (1 'org-code t))
@@ -5948,6 +5949,55 @@ needs to be inserted at a specific position in the font-lock sequence.")
 		   '(org-font-lock-keywords t nil nil backward-paragraph))
     (kill-local-variable 'font-lock-keywords) nil))
 
+(defvar org-smart-quotes nil)
+(defvar org-smart-quotes-replacements
+  '("«" "»" "‹" "›" "’"))
+;;  '("“" "”" "‘" "’" "’"))
+
+;; Nother idea, try this: like in original smart-quotes attempt.
+;; String all the regexps into one big regexp with \\| between them.
+;; Possibly have to parenthesize them but that's okay, since if
+;; each elt is in its own group, then those will be the odd-numbered groups
+;; and the inner group (of the actual quote) will be groups 2,4,6, etc.
+
+(defun splice-string (lst join)
+  (if (null (cdr lst)) (car lst)
+    (concat (car lst) join (splice-string (cdr lst) join))))
+
+(defun org-fontify-quotes (limit)
+  (require 'org-export)
+  (when org-smart-quotes
+    (let* ((start (point))
+	   k su
+	   (regexps
+	    (cdr 
+	     (assoc 'DEFAULT org-export-quotes-regexps)))
+	   (allreg (splice-string regexps "\\|"))
+	   (quotes (pairlis regexps org-smart-quotes-replacements)))
+      ;; (message "%s" allreg)
+      (catch 'match
+	(while (re-search-forward allreg limit t)
+	  (cond ((match-string 1)
+		 (setq k 1 su (nth 0 org-smart-quotes-replacements)))
+		((match-string 2)
+		 (setq k 2 su (nth 1 org-smart-quotes-replacements)))
+		((match-string 3)
+		 (setq k 3 su (nth 2 org-smart-quotes-replacements)))
+		((match-string 4)
+		 (setq k 4 su (nth 3 org-smart-quotes-replacements)))
+		((match-string 5)
+		 (setq k 5 su (nth 4 org-smart-quotes-replacements)))
+		;;(t
+		;; (message "????")))
+		)
+	  ;; (message "%s %s" k (match-data))
+	  (add-text-properties (match-beginning k) (match-end k)
+			       (list 'font-lock-fontified t
+				     'face 'org-warning))
+	  (compose-region (match-beginning k) (match-end k) su nil)
+	  (backward-char 1)
+	  (throw 'match t))))))
+
 (defun org-toggle-pretty-entities ()
   "Toggle the composition display of entities as UTF8 characters."
   (interactive)
-- 
1.7.7.6


* Re: (no subject)
  2012-05-31  1:50                 ` (no subject) Mark Shoulson
@ 2012-05-31 13:38                   ` Nicolas Goaziou
  2012-05-31 23:26                     ` Smart Quotes Exporting (Was: Re: (no subject)) Mark E. Shoulson
  0 siblings, 1 reply; 23+ messages in thread
From: Nicolas Goaziou @ 2012-05-31 13:38 UTC (permalink / raw)
  To: Mark Shoulson; +Cc: emacs-orgmode

Hello,

Mark Shoulson <mark@kli.org> writes:

> +(defvar org-e-html-quote-replacements
> +  '(("fr" "« " " »" "‘" "’" "’")
> +    ("en" "“" "”" "‘" "’" "’")
> +    ("de" "„" "“" "‚" "‘" "’")))

A docstring will be required for this variable. It should be
a defcustom.

> +(defvar org-e-latex-quote-replacements
> +  '(("fr" "«~" "~»" "‹~" "~›" "/!")
> +    ("en" "((" "))" ".(" ")." "/")))

Ditto.

>  (defcustom org-e-latex-quotes
>    '(("fr"
>       ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
> @@ -699,25 +703,22 @@ during latex export it will output
>    "Alist for quotes to use when converting english double-quotes.
>  
>  The CAR of each item in this alist is the language code.
> -The CDR of each item in this alist is a list of three CONS:
> -- the first CONS defines the opening quote;
> -- the second CONS defines the closing quote;
> -- the last CONS defines single quotes.
> +The CDR of each item in this alist is a list of CONS:
> +- the first CONS should define the opening quote;
> +- the second CONS should define the closing quote;
> +- subsequent CONS should define any other quotes, e.g. single, etc.
>  
>  For each item in a CONS, the first string is a regexp
>  for allowed characters before/after the quote, the second
>  string defines the replacement string for this quote."
>    :group 'org-export-e-latex
> -  :type '(list
> -	  (cons :tag "Opening quote"
> -		(string :tag "Regexp for char before")
> -		(string :tag "Replacement quote     "))
> -	  (cons :tag "Closing quote"
> -		(string :tag "Regexp for char after ")
> -		(string :tag "Replacement quote     "))
> -	  (cons :tag "Single quote"
> -		(string :tag "Regexp for char before")
> -		(string :tag "Replacement quote     "))))
> +  :type '(repeat
> +	  (cons
> +	   (string :tag "language code")
> +	   (repeat
> +	    (cons :tag "Quote"
> +		  (string :tag "Regexp ")
> +		  (string :tag "Replacement quote     "))))))

The docstring is not valid. It's now an alist whose key is the
language code and whose value is a list of strings, not cons cells.

> +(defun org-e-latex--quotation-marks (text info) 
> +  (org-export-quotation-marks text info org-e-latex-quote-replacements))
> +  ;; (mapc (lambda(l)
> +  ;; 	  (let ((start 0))
> +  ;; 	    (while (setq start (string-match (car l) text start))
> +  ;; 	      (let ((new-quote (concat (match-string 1 text) (cdr l))))
> +  ;; 		(setq text (replace-match new-quote  t t text))))))
> +  ;; 	(cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
> +  ;; 		 ;; Falls back on English.
> +  ;; 		 (assoc "en" org-e-latex-quotes))))
> +  ;; text)

Use directly `org-e-latex-quote-replacements' in code then.

> +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> +;; Probably a defcustom eventually.
> +
> +;; Each element of this consists of: car=language code, cdr=list of
> +;; double-quote-open-regexp, double-quote-close-regexp,
> +;; single-quote-open-regexp, single-quote-close-regexp, &optional
> +;; single-apostrophe regexp?
> +;; Just about all will be the same anyway, so mostly language DEFAULT.
> +
> +;; For testing purposes, poorly-designed at first.
> +(defvar org-export-quotes-regexps
> +  '((DEFAULT 
> +      "\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w" 
> +      "\\(?:\\S-\\)\\(\"\\)\\s-" 
> +      "\\(?:\\s-\\|(\\|^\\)\\('\\)\\w"
> +      "\\w\\('\\)\\(?:\\s-\\|\\s.\\|$\\)"
> +      "\\w\\('\\)\\w")))

I'm not sure this variable can be used for both the buffer and the
export engine. Export back-ends will only see chunks of the paragraph.

For example, in the following text,

  He crossed the Rubicon and said: "/Alea jacta est./"

Plain text translators will see three strings:

  1. "He crossed the Rubicon and said: \""
  2. "Alea jacta est."
  3. "\""

In case 1, you have an opening quote with nothing after it. In case 3,
you have a closing quote with nothing before or after it. Plain regexps
can't help here.

The only solution I can think of is to do quote substitutions in
paragraphs within the parse tree before they reach the translators (i.e.
with `org-export-filter-parse-tree-functions').

That's the only way to know if "\"" is an opening or a closing quote,
for example. The current approach won't work.
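
To illustrate the point, here is a minimal sketch.  The `demo-' names
are made up for this example; the two regexps are the double-quote
entries copied from the DEFAULT list in the patch.  It binds the
standard syntax table so that \w and \s- behave predictably outside an
Org buffer, which is an assumption of the sketch and not something the
patch does.

```elisp
;; Double-quote regexps copied from `org-export-quotes-regexps' (DEFAULT).
(defconst demo-open-double-re "\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w")
(defconst demo-close-double-re "\\(?:\\S-\\)\\(\"\\)\\s-")

(defun demo-classify-quote (chunk)
  "Return `open', `close' or nil for the first double quote matched in CHUNK.
nil means that neither regexp could classify any quote in CHUNK."
  (with-syntax-table (standard-syntax-table)
    (cond ((string-match demo-open-double-re chunk) 'open)
          ((string-match demo-close-double-re chunk) 'close))))

;; With surrounding context, the opening quote is recognized:
(demo-classify-quote "said: \"Alea jacta est.\" Then he left.") ; => open

;; The chunks a plain text translator receives carry no such context:
(demo-classify-quote "He crossed the Rubicon and said: \"")     ; => nil
(demo-classify-quote "\"")                                      ; => nil
```

Both regexps need a character on each side of the quote, so a quote
sitting at the edge of a chunk is never matched.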

> +;; Generic function, usable by exporters, but they can define their own
> +;; instead.
> +(defun org-export-quotation-marks (text info replacements)
> +  "Export quotation marks depending on language conventions.
> +TEXT is a string containing quotation marks to be replaced.  INFO
> +is a plist used as a communication channel."

Please document each argument in the docstring.

> +  (let* ((start 0)
> +	 (regexps 
> +	  (cdr 
> +	   (or 
> +	    (assoc (plist-get info :language)
> +		   org-export-quotes-regexps)
> +	    (assoc 'DEFAULT org-export-quotes-regexps))))

Use `assq' instead of `assoc' in the second case.

> +	 (subs (cdr (or (assoc (plist-get info :language)
> +			       replacements)
> +			(assoc "en" replacements))))
> +	 (quotes (pairlis regexps subs)))
> +    (mapc (lambda (p)
> +	    (let ((re (car p))
> +		  (su (cdr p)))
> +	      (while (setq start (string-match re text start))
> +		(setq text (replace-match su t t text 1)))))

Use `replace-regexp-in-string' instead.

  (replace-regexp-in-string (car p) (cdr p) text t t 1)
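
As a rough sketch of what the loop could then look like (the `demo-'
names are hypothetical, the regexps are the DEFAULT ones from the patch,
and binding the standard syntax table is my assumption so that the
example behaves the same in any buffer):

```elisp
(require 'cl-lib)

;; Regexps copied from the DEFAULT entry of `org-export-quotes-regexps'.
;; In each of them, group 1 is the quote character itself.
(defconst demo-quote-regexps
  '("\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w"    ; opening double quote
    "\\(?:\\S-\\)\\(\"\\)\\s-"              ; closing double quote
    "\\(?:\\s-\\|(\\|^\\)\\('\\)\\w"        ; opening single quote
    "\\w\\('\\)\\(?:\\s-\\|\\s.\\|$\\)"     ; closing single quote
    "\\w\\('\\)\\w"))                       ; mid-word apostrophe

(defun demo-quotation-marks (text replacements)
  "Return TEXT with its quotes replaced.
REPLACEMENTS is a list of five strings, in the same order as
`demo-quote-regexps'."
  (with-syntax-table (standard-syntax-table)
    (cl-loop for re in demo-quote-regexps
             for su in replacements
             ;; FIXEDCASE and LITERAL are t; SUBEXP 1 means that only
             ;; the quote is replaced, not the context around it.
             do (setq text (replace-regexp-in-string re su text t t 1))))
  text)

(demo-quotation-marks "It's \"fine\" now" '("“" "”" "‘" "’" "’"))
;; => "It’s “fine” now"
```

The SUBEXP argument is what makes the explicit `while' loop and the
`start' bookkeeping unnecessary.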

> +	  quotes))
> +  text)


Regards,

-- 
Nicolas Goaziou


* Smart Quotes Exporting (Was: Re: (no subject))
  2012-05-31 13:38                   ` Nicolas Goaziou
@ 2012-05-31 23:26                     ` Mark E. Shoulson
  2012-06-01 17:11                       ` Smart Quotes Exporting Nicolas Goaziou
  0 siblings, 1 reply; 23+ messages in thread
From: Mark E. Shoulson @ 2012-05-31 23:26 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

Sorry for messing up the thread subject header; I think I misused 
gmane's posting.

On 05/31/2012 09:38 AM, Nicolas Goaziou wrote:
> Hello,
>
> Mark Shoulson<mark@kli.org>  writes:
>
>> +(defvar org-e-html-quote-replacements
>> +  '(("fr" "« " " »" "‘" "’" "’")
>> +    ("en" "“" "”" "‘" "’" "’")
>> +    ("de" "„" "“" "‚" "‘" "’")))
> A docstring will be required for this variable. It should be
> a defcustom.

Oh, certainly; they're all a disaster.  I think I said that in the 
writeup at the top.  This is just proof of concept, nothing is in the 
right place, nothing is properly documented.  They have to be 
defcustoms, there needs to be a good :type in the defcustom as well as a 
proper docstring.  You'll get no argument from me about the lack (or 
inaccuracy) of docstrings and such.  I hadn't gotten that far yet.  I 
said the patch was only if you wanted to tinker with the development as 
this progresses.

> +(defun org-e-latex--quotation-marks (text info)
> +  (org-export-quotation-marks text info org-e-latex-quote-replacements))
> +  ;; (mapc (lambda(l)
> +  ;; 	  (let ((start 0))
> +  ;; 	    (while (setq start (string-match (car l) text start))
> +  ;; 	      (let ((new-quote (concat (match-string 1 text) (cdr l))))
> +  ;; 		(setq text (replace-match new-quote  t t text))))))
> +  ;; 	(cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
> +  ;; 		 ;; Falls back on English.
> +  ;; 		 (assoc "en" org-e-latex-quotes))))
> +  ;; text)
> Use directly `org-e-latex-quote-replacements' in code then.

Not sure I understand this comment.

>> +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>> +;; Probably a defcustom eventually.
>> +
>> +;; Each element of this consists of: car=language code, cdr=list of
>> +;; double-quote-open-regexp, double-quote-close-regexp,
>> +;; single-quote-open-regexp, single-quote-close-regexp,&optional
>> +;; single-apostrophe regexp?
>> +;; Just about all will be the same anyway, so mostly language DEFAULT.
>> +
>> +;; For testing purposes, poorly-designed at first.
>> +(defvar org-export-quotes-regexps
>> +  '((DEFAULT
>> +      "\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w"
>> +      "\\(?:\\S-\\)\\(\"\\)\\s-"
>> +      "\\(?:\\s-\\|(\\|^\\)\\('\\)\\w"
>> +      "\\w\\('\\)\\(?:\\s-\\|\\s.\\|$\\)"
>> +      "\\w\\('\\)\\w")))
> I'm not sure this variable can be used for both the buffer and the
> export engine. Export back-ends will only see chunks of the paragraph.
>
> For example, in the following text,
>
>    He crossed the Rubicon and said: "/Alea jacta est./"
>
> Plain text translators will see three strings:
>
>    1. "He crossed the Rubicon and said: \""
>    2. "Alea jacta est."
>    3. "\""
>
> In case 1, you have an opening quote with nothing after it. In case 3,
> you have a closing quote with nothing before or after it. Plain regexps
> can't help here.
>
> The only solution I can think of is to do quote substitutions in
> paragraphs within the parse tree before they reach the translators (i.e.
> with `org-export-filter-parse-tree-functions').
>
> That's the only way to know if "\"" is an opening or a closing quote,
> for example. The current approach won't work.

Hm.  OK, this may indeed be (a) a problem and (b) an indication that I 
really don't understand the process as I thought I did... ... ...  Ah.  
So when the "plain" text is being exported, the exporter passes along 
the text in chunks as divided up by the formatting.  So string #2 is 
broken out from the others due to its being in italics.  That is indeed 
an issue.  Moreover, I never even properly considered the effects of 
formatting characters (as opposed to punctuation) right next to the 
quote-marks, even if this weren't a problem.

So... there's the filter-parse-tree-functions hook, which gets applied
to the parse tree... so a back-end can add a function to that list which
looks over the parse tree and watches for these border cases (and also
the ones within ordinary strings).  Looks like it's going to be tough to
work in any flexibility to define further per-language or per-backend
cleverness to handle anything beyond the "canonical set" of open-double,
close-double, open-single, close-single, and mid-word.

To be sure, anything we do will most assuredly fail even on some fairly 
reasonable input, in which case the users are pretty much on their own 
and will have to do things the hard way.  And I could use that as the 
answer here, that, "well, it'll work only within plain-text strings" 
(and I might possibly still have to use that answer), but I would rather 
include the situations you bring up in the supported set and not throw 
up my hands at it.  So, yes, will look at that.

>> +  (let* ((start 0)
>> +	 (regexps
>> +	  (cdr
>> +	   (or
>> +	    (assoc (plist-get info :language)
>> +		   org-export-quotes-regexps)
>> +	    (assoc 'DEFAULT org-export-quotes-regexps))))
> Use `assq' instead of `assoc' in the second case.

Good call.

>> +	 (subs (cdr (or (assoc (plist-get info :language)
>> +			       replacements)
>> +			(assoc "en" replacements))))
>> +	 (quotes (pairlis regexps subs)))
>> +    (mapc (lambda (p)
>> +	    (let ((re (car p))
>> +		  (su (cdr p)))
>> +	      (while (setq start (string-match re text start))
>> +		(setq text (replace-match su t t text 1)))))
> Use `replace-regexp-in-string' instead.
>
>    (replace-regexp-in-string (car p) (cdr p) text t t 1)

I'd been looking at other functions that didn't have that available; 
thanks for pointing me at it.

~mark


* Re: Smart Quotes Exporting
  2012-05-31 23:26                     ` Smart Quotes Exporting (Was: Re: (no subject)) Mark E. Shoulson
@ 2012-06-01 17:11                       ` Nicolas Goaziou
  2012-06-01 22:41                         ` Mark E. Shoulson
                                           ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Nicolas Goaziou @ 2012-06-01 17:11 UTC (permalink / raw)
  To: Mark E. Shoulson; +Cc: emacs-orgmode

Hello,

"Mark E. Shoulson" <mark@kli.org> writes:

> Oh, certainly; they're all a disaster.  I think I said that in the
> writeup at the top.  This is just proof of concept, nothing is in the
> right place, nothing is properly documented.  They have to be
> defcustoms, there needs to be a good :type in the defcustom as well as
> a proper docstring.  You'll get no argument from me about the lack (or
> inaccuracy) of docstrings and such.  I hadn't gotten that far yet.
> I said the patch was only if you wanted to tinker with the development
> as this progresses.

No worries, I was just making some comments before forgetting about
them.

>> +(defun org-e-latex--quotation-marks (text info)
>> +  (org-export-quotation-marks text info org-e-latex-quote-replacements))
>> +  ;; (mapc (lambda(l)
>> +  ;; 	  (let ((start 0))
>> +  ;; 	    (while (setq start (string-match (car l) text start))
>> +  ;; 	      (let ((new-quote (concat (match-string 1 text) (cdr l))))
>> +  ;; 		(setq text (replace-match new-quote  t t text))))))
>> +  ;; 	(cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
>> +  ;; 		 ;; Falls back on English.
>> +  ;; 		 (assoc "en" org-e-latex-quotes))))
>> +  ;; text)
>> Use directly `org-e-latex-quote-replacements' in code then.
>
> Not sure I understand this comment.

Since `org-e-latex--quotation-marks' just calls
`org-export-quotation-marks', you can completely remove the former from
"org-e-latex.el" and use the latter instead.

> So... there's the filter-parse-tree-functions hook, which gets applied
> to the parse tree... so a back-end can add a function to that list which
> looks over the parse-tree and watches for these border cases (and also
> the ones within ordinary strings).  Looks like it's going to be tough
> to work in any flexibility to define further per-language or
> per-backend cleverness to handle anything beyond the "canonical set"
> of open-double, close-double, open-single, close-single, and mid-word.
>
> To be sure, anything we do will most assuredly fail even on some
> fairly reasonable input, in which case the users are pretty much on
> their own and will have to do things the hard way.  And I could use
> that as the answer here, that, "well, it'll work only within
> plain-text strings" (and I might possibly still have to use that
> answer), but I would rather include the situations you bring up in the
> supported set and not throw up my hands at it.  So, yes, will look at
> that.

Actually it isn't very hard to handle this problem. But it will be
different than the fontification used in an Org buffer.

You might want to look at `org-element-normalize-contents', which solves
a similar problem: removing maximum common indentation at the parsed
paragraph level.

As a first approximation, I can imagine a function accepting an element,
an object or a secondary string and returning an equivalent element,
object or secondary string, with its quotes "smartified". The algorithm
could go like this:

Walk element/object/secondary-string's contents.

  1. When a string is encountered:

     1. If it has a quote as its first or last position, check for
        objects before or after the string to guess its status. An
        object never starts with a white space, but you may have to
        check :post-blank property in order to know if previous object
        had white spaces at its end.

     2. For each quote everywhere else in the string, your regexp can
        handle it fine.

  2. When an object belonging to `org-element-recursive-objects' is
     encountered, apply the function to this object.

  3. Accumulate returned strings or objects.

Use accumulated data as the contents of the new object to return (i.e.
just add the type and the same properties at the beginning of this list
if it was an object or an element, return it as-is if that was
a secondary string).
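
To make the steps above more concrete, here is a rough, self-contained
sketch.  It does not use the real org-element API: objects are modelled
as plain lists of the form (TYPE PROPS . CONTENTS), all `demo-' names
are hypothetical, and only double quotes are handled.  The trick it
uses (my assumption, not part of the patch) is to pad each string with
a stand-in character for its neighbours, so that the existing regexps
can also decide about quotes at the edges.

```elisp
;; Double-quote regexps copied from `org-export-quotes-regexps' (DEFAULT).
(defconst demo-open-re "\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w")
(defconst demo-close-re "\\(?:\\S-\\)\\(\"\\)\\s-")

(defun demo-pad-before (prev)
  "Return a one-character stand-in for PREV, the sibling before a string.
A missing sibling, or an object followed by blanks, counts as white
space; an object directly adjacent to the string counts as a word."
  (cond ((or (null prev) (stringp prev)) " ") ; adjacent strings are not expected
        ((> (or (plist-get (cadr prev) :post-blank) 0) 0) " ")
        (t "x")))

(defun demo-smartify-string (string prev next)
  "Return STRING with double quotes replaced, given siblings PREV and NEXT."
  ;; An object never starts with white space, so a following object is
  ;; modelled as a word character; the end of the contents as a space.
  (let ((padded (concat (demo-pad-before prev) string (if next "x" " "))))
    (with-syntax-table (standard-syntax-table)
      (setq padded (replace-regexp-in-string demo-open-re "“" padded t t 1))
      (setq padded (replace-regexp-in-string demo-close-re "”" padded t t 1)))
    ;; Drop the two padding characters again.
    (substring padded 1 -1)))

(defun demo-smartify-contents (contents)
  "Return a copy of CONTENTS with double quotes replaced.
CONTENTS is a list of strings and objects (TYPE PROPS . CONTENTS)."
  (let ((prev nil) (result nil))
    (while contents
      (let ((item (car contents))
            (next (cadr contents)))
        (push (if (stringp item)
                  (demo-smartify-string item prev next)
                ;; Keep type and properties, recurse into the contents.
                (append (list (car item) (cadr item))
                        (demo-smartify-contents (cddr item))))
              result)
        (setq prev item
              contents (cdr contents))))
    (nreverse result)))

(demo-smartify-contents
 '("He crossed the Rubicon and said: \""
   (italic (:post-blank 0) "Alea jacta est.")
   "\""))
;; => ("He crossed the Rubicon and said: “"
;;     (italic (:post-blank 0) "Alea jacta est.")
;;     "”")
```

Note that the recursion treats the contents of an object as if they
were surrounded by white space, which is a simplification; a real
implementation would probably pass the outer context down.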

On the elements side, only paragraphs, verse-blocks and table-rows can
directly contain quotes. Also, headline, inlinetask, item and
footnote-reference have secondary strings containing quotes.

I'm not sure yet where and how to install such a function, but I will
think about it when it is implemented.


Regards,

-- 
Nicolas Goaziou


* Re: Smart Quotes Exporting
  2012-06-01 17:11                       ` Smart Quotes Exporting Nicolas Goaziou
@ 2012-06-01 22:41                         ` Mark E. Shoulson
  2012-06-03  3:16                         ` Mark E. Shoulson
  2012-06-06  2:14                         ` Mark E. Shoulson
  2 siblings, 0 replies; 23+ messages in thread
From: Mark E. Shoulson @ 2012-06-01 22:41 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

On 06/01/2012 01:11 PM, Nicolas Goaziou wrote:
> Hello,
>
> "Mark E. Shoulson"<mark@kli.org>  writes:
>
>> Oh, certainly; they're all a disaster.  I think I said that in the
>> writeup at the top.  This is just proof of concept, nothing is in the
>> right place, nothing is properly documented.  They have to be
>> defcustoms, there needs to be a good :type in the defcustom as well as
>> a proper docstring.  You'll get no argument from me about the lack (or
>> inaccuracy) of docstrings and such.  I hadn't gotten that far yet.
>> I said the patch was only if you wanted to tinker with the development
>> as this progresses.
> No worries, I was just making some comments before forgetting about
> them.

Ah, ok.  Good!  Thanks.

>>> +(defun org-e-latex--quotation-marks (text info)
>>> +  (org-export-quotation-marks text info org-e-latex-quote-replacements))
>>> +  ;; (mapc (lambda(l)
>>> +  ;; 	  (let ((start 0))
>>> +  ;; 	    (while (setq start (string-match (car l) text start))
>>> +  ;; 	      (let ((new-quote (concat (match-string 1 text) (cdr l))))
>>> +  ;; 		(setq text (replace-match new-quote  t t text))))))
>>> +  ;; 	(cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
>>> +  ;; 		 ;; Falls back on English.
>>> +  ;; 		 (assoc "en" org-e-latex-quotes))))
>>> +  ;; text)
>>> Use directly `org-e-latex-quote-replacements' in code then.
>> Not sure I understand this comment.
> Since `org-e-latex--quotation-marks' just calls
> `org-export-quotation-marks', you can completely remove the former from
> "org-e-latex.el" and use the latter instead.

Well, that was done on purpose, and maybe the reason will make sense.  
As I see it, each exporter should be able to have its own smartifier 
function, and the export engine should make no assumptions about that: 
just call the individual exporter's function.  On the other hand, many 
(but perhaps not all!) of the exporters may find themselves using 
essentially the same code just with different replacement strings.  So I
thought that a "general-purpose" function should be in org-export.el,
just for the convenience of exporters should they choose to make use of
it.  So, many of the exporters' smartifier functions will really just be
calls to the more general-purpose function.

Does that make sense?

>> So... there's the filter-parse-tree-functions hook, which gets applied
>> to the parse tree... so a back-end can add a function to that list which
>> looks over the parse-tree and watches for these border cases (and also
>> the ones within ordinary strings).  Looks like it's going to be tough
>> to work in any flexibility to define further per-language or
>> per-backend cleverness to handle anything beyond the "canonical set"
>> of open-double, close-double, open-single, close-single, and mid-word.
>>
>> To be sure, anything we do will most assuredly fail even on some
>> fairly reasonable input, in which case the users are pretty much on
>> their own and will have to do things the hard way.  And I could use
>> that as the answer here, that, "well, it'll work only within
>> plain-text strings" (and I might possibly still have to use that
>> answer), but I would rather include the situations you bring up in the
>> supported set and not throw up my hands at it.  So, yes, will look at
>> that.
> Actually it isn't very hard to handle this problem. But it will be
> different than the fontification used in an Org buffer.

Yes, the fontification on-screen is different, and uses a rather
different function--but if I can help it, the same regexps!  So things
work the same everywhere.

I also started thinking a little about what you write below, how we can
inspect the characters just after or before quotes at the very beginning
or end of each chunk.  It would be nice if it could all be encapsulated
neatly in the regexp(s).

> As a first approximation, I can imagine a function accepting an element,
> an object or a secondary string and returning an equivalent element,
> object or secondary string, with its quotes "smartified". The algorithm
> could go like this:
>
> Walk element/object/secondary-string's contents.

Need it be element/object/secondary-string?  At the bottom level it's 
always about strings; the higher levels don't affect the processing of 
each string in isolation.  Do we need to intercept it at the element 
level or just wait to grab things in the plain-text filter, since we 
have access at that point too?

(Might also be that my understanding of the process and the nature of 
elements is faulty or limited.  Will have to see what works.)

>
>    1. When a string is encountered:
>
>       1. If it has a quote as its first or last position, check for
>          objects before or after the string to guess its status. An
>          object never starts with a white space, but you may have to
>          check :post-blank property in order to know if previous object
>          had white spaces at its end.

Hmm, this may in fact answer my question above: you need to be able to 
get at the object level to test the post-blank.  I'll experiment.

>       2. For each quote everywhere else in the string, your regexp can
>          handle it fine.
>
>    2. When an object belonging to `org-element-recursive-objects' is
>       encountered, apply the function to this object.
>
>    3. Accumulate returned strings or objects.
>
> Use accumulated data as the contents of the new object to return (i.e.
> just add the type and the same properties at the beginning of this list
> if it was an object or an element, return it as-is if that was
> a secondary string).
>
> On the elements side, only paragraphs, verse-blocks and table-rows can
> directly contain quotes. Also, headline, inlinetask, item and
> footnote-reference have secondary strings containing quotes.

I also haven't yet worked in smarts (especially in the on-screen 
fontifier) for things like not fontifying inside comments or verbatim 
strings, etc.  That'll come in time.

> I'm not sure yet where and how to install such a function, but I will
> think about it when it is implemented.

Uuum... Maybe org-export-filter-parse-tree-functions?

~mark


* Re: Smart Quotes Exporting
  2012-06-01 17:11                       ` Smart Quotes Exporting Nicolas Goaziou
  2012-06-01 22:41                         ` Mark E. Shoulson
@ 2012-06-03  3:16                         ` Mark E. Shoulson
  2012-06-06  2:14                         ` Mark E. Shoulson
  2 siblings, 0 replies; 23+ messages in thread
From: Mark E. Shoulson @ 2012-06-03  3:16 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1090 bytes --]

All right, a preliminary patch is attached, *maybe* good enough for more
serious consideration now, but it might need some fixes. It still only
uses ordinary regexps and plain-text strings, but can now handle the
example with formatting breaks next to quotes. Things have been moved
into more appropriate locations and made into defcustoms, docstrings and
types have been fixed, etc., etc.

It supports onscreen display of "smart" quotes (when enabled); I have 
the quotes displayed in org-document-info face so they are slightly 
distinct, to make it clearer that they are "altered" from what they are 
in the plain text. This may or may not be a popular (or good) idea. I 
have also built it into the new export engine in org-e-latex and 
org-e-html as proofs of concept. I'm not positive the latex one will 
work properly for German, though; there might need to be something 
enabled in LaTeX for it to format ,, into „.
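
As a rough illustration of the export side: the idea is to look up the
replacement strings for the document language, fall back to English, and
substitute them into numbered group 9 of each regexp.  The sketch below
is self-contained and simplified; the sketch-* names, the two toy
regexps, and the two-entry table are assumptions, not the patch's
actual values.

```elisp
(require 'cl-lib)

(defvar sketch-quote-regexps
  '("\\(?:^\\|\\s-\\)\\(?9:\"\\)\\w"            ; opening double quote
    "\\w\\(?9:\"\\)\\(?:\\s-\\|\\s.\\|$\\)")    ; closing double quote
  "Simplified stand-ins for the quote-recognizing regexps.
Each one wraps the quote itself in explicitly numbered group 9.")

(defvar sketch-quote-replacements
  '(("en" "“" "”")
    ("de" "„" "“"))
  "Toy table: language code, then opening and closing replacements.")

(defun sketch-quotation-marks (text language)
  "Return TEXT with double quotes replaced following LANGUAGE.
Unknown languages fall back to the \"en\" entry."
  (let ((subs (cdr (or (assoc language sketch-quote-replacements)
                       (assoc "en" sketch-quote-replacements)))))
    ;; Walk regexps and replacements in parallel; only group 9 (the
    ;; quote character) is replaced, the context around it is kept.
    (cl-loop for re in sketch-quote-regexps
             for su in subs
             do (setq text (replace-regexp-in-string re su text t t 9)))
    text))
```

With this, (sketch-quotation-marks "Say \"hello\" now" "de") uses the
German pair, while an unlisted code such as "xx" gets the English one.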

It should probably be set not to smartify quotes onscreen in comments; I 
haven't done that yet.
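
One possible way to skip comments on-screen, as a sketch only: test
whether the quote sits on a comment line before composing it.  The
function name and the comment-line regexp are assumptions, and the
open/close decision here is a naive toggle rather than the regexps from
the patch.

```elisp
(defun sketch-compose-quotes ()
  "Display ASCII double quotes in the current buffer as curly ones.
Quotes on lines starting with \"# \" are skipped.  Only the display
changes; the buffer text keeps its plain \" characters."
  (save-excursion
    (goto-char (point-min))
    (let ((open t))
      (while (search-forward "\"" nil t)
        (let ((beg (1- (point)))
              (end (point)))
          (unless (save-excursion
                    (beginning-of-line)
                    ;; Assumed shape of a comment line: optional
                    ;; indentation, then "#" followed by a space or
                    ;; the end of the line.
                    (looking-at "[ \t]*#\\(?: \\|$\\)"))
            (compose-region beg end (if open "“" "”"))
            (setq open (not open))))))))
```

Because compose-region only adds a text property, turning the feature
off again amounts to removing that property from the buffer.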

Comments welcome; I hope I didn't complicate matters in the export 
engines too much.

~mark

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-smart-quotes-for-onscreen-display-and-for-latex-.patch --]
[-- Type: text/x-patch; name="0001-Add-smart-quotes-for-onscreen-display-and-for-latex-.patch", Size: 13760 bytes --]

From 1bc507cf69c94d5645436abc6e28e7d96999083e Mon Sep 17 00:00:00 2001
From: Mark Shoulson <mark@kli.org>
Date: Tue, 29 May 2012 23:01:12 -0400
Subject: [PATCH] Add `smart' quotes for onscreen display and for latex and
 html export

* lisp/org.el: Add `smart' quotes: custom variables to define
  regexps to recognize quotes, to define how and whether to
  display them, and org-fontify-quotes to display `smart-quote'
  characters when activated.

* contrib/lisp/org-export.el: Add function org-export-quotation-marks
  as a utility function usable by individual exporters to apply
  `smart' quotes.

* contrib/lisp/org-e-latex.el: Replace org-e-latex-quotes custom with
  org-e-latex-quote-replacements and make org-e-latex--quotation-marks
  use the org-export-quotation-marks function in org-export.el.

* contrib/lisp/org-e-html.el: Replace org-e-html-quotes custom with
  org-e-html-smart-quote-replacements and enable
  org-e-html--quotation-marks, using org-export-quotation-marks function
  in org-export.el.
---
 contrib/lisp/org-e-html.el  |   57 ++++++++----------------
 contrib/lisp/org-e-latex.el |   67 ++++++++++-------------------
 contrib/lisp/org-export.el  |   26 +++++++++++
 lisp/org.el                 |  101 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 168 insertions(+), 83 deletions(-)

diff --git a/contrib/lisp/org-e-html.el b/contrib/lisp/org-e-html.el
index 53547a0..d4a505e 100644
--- a/contrib/lisp/org-e-html.el
+++ b/contrib/lisp/org-e-html.el
@@ -1077,37 +1077,24 @@ in order to mimic default behaviour:
 
 ;;;; Plain text
 
-(defcustom org-e-html-quotes
-  '(("fr"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
-     ("\\(\\S-\\)\"" . "~»")
-     ("\\(\\s-\\|(\\|^\\)'" . "'"))
-    ("en"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "``")
-     ("\\(\\S-\\)\"" . "''")
-     ("\\(\\s-\\|(\\|^\\)'" . "`")))
-  "Alist for quotes to use when converting english double-quotes.
-
-The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
-
-For each item in a CONS, the first string is a regexp
-for allowed characters before/after the quote, the second
-string defines the replacement string for this quote."
+(defcustom org-e-html-smart-quote-replacements
+  '(("fr" "&laquo;&nbsp;" "&nbsp;&raquo;" "&lsquo;" "&rsquo;" "&rsquo;")
+    ("en" "&ldquo;" "&rdquo;" "&lsquo;" "&rsquo;" "&rsquo;")
+    ("de" "&bdquo;" "&ldquo;" "&sbquo;" "&lsquo;" "&rsquo;"))
+  "What to export for `smart-quotes'.
+Each element is a language code followed by five strings:
+ 1. Open double-quotes
+ 2. Close double-quotes
+ 3. Open single-quote
+ 4. Close single-quote
+ 5. Mid-word apostrophe"
   :group 'org-export-e-html
   :type '(list
-	  (cons :tag "Opening quote"
-		(string :tag "Regexp for char before")
-		(string :tag "Replacement quote     "))
-	  (cons :tag "Closing quote"
-		(string :tag "Regexp for char after ")
-		(string :tag "Replacement quote     "))
-	  (cons :tag "Single quote"
-		(string :tag "Regexp for char before")
-		(string :tag "Replacement quote     "))))
+	  (string :tag "Open double-quotes")    ; "“"
+	  (string :tag "Close double-quotes")   ; "”"
+	  (string :tag "Open single-quote")     ; "‘"
+	  (string :tag "Close single-quote")    ; "’"
+	  (string :tag "Mid-word apostrophe"))) ; "’"
 
 ;;;; Compilation
 
@@ -1497,15 +1484,7 @@ This is used to choose a separator for constructs like \\verb."
   "Export quotation marks depending on language conventions.
 TEXT is a string containing quotation marks to be replaced.  INFO
 is a plist used as a communication channel."
-  (mapc (lambda(l)
-	  (let ((start 0))
-	    (while (setq start (string-match (car l) text start))
-	      (let ((new-quote (concat (match-string 1 text) (cdr l))))
-		(setq text (replace-match new-quote  t t text))))))
-	(cdr (or (assoc (plist-get info :language) org-e-html-quotes)
-		 ;; Falls back on English.
-		 (assoc "en" org-e-html-quotes))))
-  text)
+  (org-export-quotation-marks text info org-e-html-smart-quote-replacements))
 
 (defun org-e-html--wrap-label (element output)
   "Wrap label associated to ELEMENT around OUTPUT, if appropriate.
@@ -2729,7 +2708,7 @@ contextual information."
   ;; 		  (format "\\%s{}" (match-string 1 text)) nil t text)
   ;; 	    start (match-end 0))))
   ;; Handle quotation marks
-  ;; (setq text (org-e-html--quotation-marks text info))
+  (setq text (org-e-html--quotation-marks text info))
   ;; Convert special strings.
   ;; (when (plist-get info :with-special-strings)
   ;;   (while (string-match (regexp-quote "...") text)
diff --git a/contrib/lisp/org-e-latex.el b/contrib/lisp/org-e-latex.el
index 67e9197..2543c29 100644
--- a/contrib/lisp/org-e-latex.el
+++ b/contrib/lisp/org-e-latex.el
@@ -687,38 +687,28 @@ during latex export it will output
 
 ;;;; Plain text
 
-(defcustom org-e-latex-quotes
-  '(("fr"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
-     ("\\(\\S-\\)\"" . "~»")
-     ("\\(\\s-\\|(\\|^\\)'" . "'"))
-    ("en"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "``")
-     ("\\(\\S-\\)\"" . "''")
-     ("\\(\\s-\\|(\\|^\\)'" . "`")))
-  "Alist for quotes to use when converting english double-quotes.
-
-The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
-
-For each item in a CONS, the first string is a regexp
-for allowed characters before/after the quote, the second
-string defines the replacement string for this quote."
+(defcustom org-e-latex-quote-replacements
+  '(("en" "``" "''" "`" "'" "'")
+    ("fr" "«~" "~»" "‹~" "~›" "'")
+    ("de" ",," "``" "," "`" "'"))
+  "What to output for quotes.  Each element is a list of six strings.
+The first string specifies the language these quotes apply to (\"en\",
+\"fr\", \"de\", etc.; see the LANGUAGE keyword), and the other five
+define the strings to use for, in order:
+ 1. Open double-quotes
+ 2. Close double-quotes
+ 3. Open single-quote
+ 4. Close single-quote
+ 5. Mid-word apostrophe"
   :group 'org-export-e-latex
-  :type '(list
-	  (cons :tag "Opening quote"
-		(string :tag "Regexp for char before")
-		(string :tag "Replacement quote     "))
-	  (cons :tag "Closing quote"
-		(string :tag "Regexp for char after ")
-		(string :tag "Replacement quote     "))
-	  (cons :tag "Single quote"
-		(string :tag "Regexp for char before")
-		(string :tag "Replacement quote     "))))
-
+  :type '(repeat
+	  (list
+	   (string :tag "Language code")
+	   (string :tag "Open double-quotes")
+	   (string :tag "Close double-quotes")
+	   (string :tag "Open single-quote")
+	   (string :tag "Close single-quote")
+	   (string :tag "Mid-word apostrophe"))))
 
 ;;;; Compilation
 
@@ -852,19 +842,8 @@ nil."
 	     options
 	     ","))
 
-(defun org-e-latex--quotation-marks (text info)
-  "Export quotation marks depending on language conventions.
-TEXT is a string containing quotation marks to be replaced.  INFO
-is a plist used as a communication channel."
-  (mapc (lambda(l)
-	  (let ((start 0))
-	    (while (setq start (string-match (car l) text start))
-	      (let ((new-quote (concat (match-string 1 text) (cdr l))))
-		(setq text (replace-match new-quote  t t text))))))
-	(cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
-		 ;; Falls back on English.
-		 (assoc "en" org-e-latex-quotes))))
-  text)
+(defun org-e-latex--quotation-marks (text info) 
+  (org-export-quotation-marks text info org-e-latex-quote-replacements))
 
 (defun org-e-latex--wrap-label (element output)
   "Wrap label associated to ELEMENT around OUTPUT, if appropriate.
diff --git a/contrib/lisp/org-export.el b/contrib/lisp/org-export.el
index b9294e5..87f5c84 100644
--- a/contrib/lisp/org-export.el
+++ b/contrib/lisp/org-export.el
@@ -284,6 +284,32 @@ rules.")
   :tag "Org Export General"
   :group 'org-export)
 
+;; Generic function, usable by exporters, but they can define their own
+;; instead.
+(defun org-export-quotation-marks (text info replacements)
+  "Export quotation marks depending on language conventions.
+TEXT is a string containing quotation marks to be replaced.  INFO
+is a plist used as a communication channel.  REPLACEMENTS is an alist
+mapping a language code to five replacement strings."
+  (let* ((regexps 
+	  (cdr 
+	   (or 
+	    (assoc (plist-get info :language)
+		   org-smart-quotes-regexps)
+	    (assq 'DEFAULT org-smart-quotes-regexps))))
+	 (subs (cdr (or (assoc (plist-get info :language)
+			       replacements)
+			(assoc "en" replacements))))
+	 (quotes (pairlis regexps subs)))
+    (mapc (lambda (p)
+	    (let ((re (car p))
+		  (su (cdr p)))
+	      (setq text (replace-regexp-in-string re su text t t 9))))
+	  quotes))
+  text)
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
 (defcustom org-export-with-archived-trees 'headline
   "Whether sub-trees with the ARCHIVE tag should be exported.
 
diff --git a/lisp/org.el b/lisp/org.el
index 0157e36..70d7266 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -3625,6 +3625,69 @@ When nil, the \\name form remains in the buffer."
   :version "24.1"
   :type 'boolean)
 
+(defcustom org-smart-quotes nil
+  "Non-nil means display `smart' quotes on-screen in place
+of \" and ' characters."
+  :group 'org-appearance
+  :type 'boolean)
+
+(defcustom org-smart-quotes-replacements
+  '("“" "”" "‘" "’" "’")
+  "What to display on-screen when `org-smart-quotes' is non-nil.
+A list of five strings:
+ 1. Open double-quotes
+ 2. Close double-quotes
+ 3. Open single-quote
+ 4. Close single-quote
+ 5. Mid-word apostrophe"
+  :group 'org-appearance
+  :type '(list
+	  (string :tag "Open double-quotes" "«")    ; "“"
+	  (string :tag "Close double-quotes" "»")   ; "”"
+	  (string :tag "Open single-quote" "‹")     ; "‘"
+	  (string :tag "Close single-quote" "›")    ; "’"
+	  (string :tag "Mid-word apostrophe" "’"))) ; "’"
+
+(defcustom org-smart-quotes-regexps
+  '((DEFAULT
+      "\\(?:\\s-\\|\\s(\\|^\\)\\(?9:\"\\)\\(?:\\w\\|\\s.\\|\\s_\\)\\|\\s-\\(?9:\"\\)$" 
+      "\\(?:\\S-\\)\\(?9:\"\\)\\(?:\\s-\\|$\\|\\s)\\|\\s.\\)\\|^\\(?9:\"\\)\\s-" 
+      "\\(?:\\s-\\|(\\|^\\)\\(?9:'\\)\\w\\|\\s-\\(?9:'\\)$"
+      "\\w\\(?9:'\\)\\(?:\\s-\\|\\s.\\|$\\)\\|^\\(?9:'\\)\\s-"
+      "\\w\\(?9:'\\)\\w"))
+  "Regexps for quotes to be made `smart' quotes upon export or onscreen.
+Each element is a list of six strings.  The car is a string
+representing the language to which this definition applies (e.g. \"en\",
+\"fr\", \"de\", etc.); the cdr (the other five elements) are five REs 
+matching, in order:
+ 1. Opening double-quotes
+ 2. Closing double-quotes
+ 3. Opening single-quotes
+ 4. Closing single-quotes
+ 5. Mid-word apostrophes
+
+Each regexp should surround the actual quote in a capturing group, which
+must be specified as number 9 (so as not to conflict with other processing.)
+
+One element should have as its car the atom DEFAULT, to be used when no
+other element fits.  It is also the one used for on-screen display of
+`smart' quotes (see the variable `org-smart-quotes').
+
+As what makes an opening or closing quote is somewhat consistent across
+languages (as opposed to how they are represented in typography), the
+DEFAULT element is likely sufficient for most purposes."
+  :group 'org-export-general
+  :group 'org-appearance
+  :type '(repeat
+	  (list
+	   (choice (const DEFAULT)
+		   (string :tag "Language"))
+	   (regexp :tag "Open double-quotes")
+	   (regexp :tag "Close double-quotes")
+	   (regexp :tag "Open single-quote")
+	   (regexp :tag "Close single-quote")
+	   (regexp :tag "Mid-word apostrophe"))))
+
 (defvar org-emph-re nil
   "Regular expression for matching emphasis.
 After a match, the match groups contain these elements:
@@ -5927,6 +5990,7 @@ needs to be inserted at a specific position in the font-lock sequence.")
 	   ;; Specials
 	   '(org-do-latex-and-special-faces)
 	   '(org-fontify-entities)
+	   '(org-fontify-quotes)
 	   '(org-raise-scripts)
 	   ;; Code
 	   '(org-activate-code (1 'org-code t))
@@ -5948,6 +6012,43 @@ needs to be inserted at a specific position in the font-lock sequence.")
 		   '(org-font-lock-keywords t nil nil backward-paragraph))
     (kill-local-variable 'font-lock-keywords) nil))
 
+(defun org-fontify-quotes (limit)
+  (require 'org-export)
+  (when org-smart-quotes
+    (let* ((start (point))
+	   k su
+	   (splice-string (lambda (lst join)
+			    (if (null (cdr lst)) (car lst)
+			      (concat (car lst) join
+				      (splice-string (cdr lst) join)))))
+	   (regexps
+	    (cdr
+	     (assq 'DEFAULT org-smart-quotes-regexps)))
+	   (i 1)
+	   (allreg
+	    (mapconcat (lambda (n) (prog1 (format "\\(?%d:%s\\)" i n)
+				     (setq i (1+ i))))
+		       regexps "\\|"))
+	   (quotes (pairlis regexps org-smart-quotes-replacements)))
+      (catch 'match
+	(while (re-search-forward allreg limit t)
+	  (cond ((match-string 1)
+		 (setq su (nth 0 org-smart-quotes-replacements)))
+		((match-string 2)
+		 (setq su (nth 1 org-smart-quotes-replacements)))
+		((match-string 3)
+		 (setq su (nth 2 org-smart-quotes-replacements)))
+		((match-string 4)
+		 (setq su (nth 3 org-smart-quotes-replacements)))
+		((match-string 5)
+		 (setq su (nth 4 org-smart-quotes-replacements))))
+	  (add-text-properties (match-beginning 9) (match-end 9)
+			       (list 'font-lock-fontified t
+				     'face 'org-document-info))
+	  (compose-region (match-beginning 9) (match-end 9) su nil)
+	  (backward-char 1)
+	  (throw 'match t))))))
+
 (defun org-toggle-pretty-entities ()
   "Toggle the composition display of entities as UTF8 characters."
   (interactive)
-- 
1.7.7.6



* Re: Smart Quotes Exporting
  2012-06-01 17:11                       ` Smart Quotes Exporting Nicolas Goaziou
  2012-06-01 22:41                         ` Mark E. Shoulson
  2012-06-03  3:16                         ` Mark E. Shoulson
@ 2012-06-06  2:14                         ` Mark E. Shoulson
  2012-06-07 19:21                           ` Nicolas Goaziou
  2 siblings, 1 reply; 23+ messages in thread
From: Mark E. Shoulson @ 2012-06-06  2:14 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 970 bytes --]

Update on the smart-quotes patch.  Supports the odt exporter now too, 
which I think covers all the current major "new" exporters for which it 
is relevant (adding smart quotes to ASCII export is a contradiction in 
terms; should it be in the "publish" exporter?  It didn't look like it 
to me).

Added an options keyword, '"' (that is, the double-quote mark) to select 
smart quotes on/off, and a defcustom for customizing your default.  Set 
the default default [sic] to nil, though actually it might be reasonable 
to set it to t.  Slight touch-up to the regexps since last time, but 
they will definitely be subject to a lot of fine-tuning as more special 
cases are found that break them and ways to fix it are found (the 
close-quote still breaks on one of "/a/." or "/a./")
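
To make the on/off switch concrete, here is a minimal sketch of gating
the replacement on the export property list.  The function name and the
one-line pairing regexp are assumptions for illustration; only the
:with-smart-quotes key comes from the patch.  In a document this would
presumably be toggled with an OPTIONS item of the form ":t or ":nil.

```elisp
(defun sketch-maybe-smartify (text info)
  "Return TEXT with paired double quotes curled if INFO enables it.
INFO is a plist; only :with-smart-quotes is consulted."
  (if (plist-get info :with-smart-quotes)
      ;; LITERAL is nil here, so \\1 in the replacement stands for the
      ;; text found between the two quotes.
      (replace-regexp-in-string "\"\\([^\"\n]*\\)\"" "“\\1”" text t)
    text))
```

Returning TEXT unchanged when the option is off keeps the caller simple:
it can always use the function's return value.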

It's pretty good on the whole, though, and usually guesses right.  I know 
there's some work being done on the odt exporter; hope this fits in well 
with it.

How does it look to you?

~mark


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-smart-quotes-for-onscreen-display-and-for-latex-.patch --]
[-- Type: text/x-patch; name="0001-Add-smart-quotes-for-onscreen-display-and-for-latex-.patch", Size: 18446 bytes --]

From e6df2efd1a9ce36964a20fc06aa2a688acd87efb Mon Sep 17 00:00:00 2001
From: Mark Shoulson <mark@kli.org>
Date: Tue, 29 May 2012 23:01:12 -0400
Subject: [PATCH] Add `smart' quotes for onscreen display and for latex and
 html export

* lisp/org.el: Add `smart' quotes: custom variables to define
  regexps to recognize quotes, to define how and whether to
  display them, and org-fontify-quotes to display `smart-quote'
  characters when activated.

* contrib/lisp/org-export.el: Add function org-export-quotation-marks
  as a utility function usable by individual exporters to apply
  `smart' quotes.  Also add keyword '"' for customizing smart quotes,
  and custom default for it.

* contrib/lisp/org-e-latex.el: Replace org-e-latex-quotes custom with
  org-e-latex-quote-replacements and make org-e-latex--quotation-marks
  use the org-export-quotation-marks function in org-export.el.

* contrib/lisp/org-e-html.el: Replace org-e-html-quotes custom with
  org-e-html-smart-quote-replacements and enable
  org-e-html--quotation-marks, using org-export-quotation-marks function
  in org-export.el.

* contrib/lisp/org-e-odt.el: Replace org-e-odt-quotes custom with
  org-e-odt-quote-replacements and make org-e-odt--quotation-marks
  use org-export-quotation-marks function in org-export.el.
---
 contrib/lisp/org-e-html.el  |   57 ++++++++----------------
 contrib/lisp/org-e-latex.el |   67 ++++++++++-------------------
 contrib/lisp/org-e-odt.el   |   68 ++++++++++-------------------
 contrib/lisp/org-export.el  |   38 ++++++++++++++++
 lisp/org.el                 |  101 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 203 insertions(+), 128 deletions(-)

diff --git a/contrib/lisp/org-e-html.el b/contrib/lisp/org-e-html.el
index 4287a59..c49608d 100644
--- a/contrib/lisp/org-e-html.el
+++ b/contrib/lisp/org-e-html.el
@@ -1043,37 +1043,24 @@ in order to mimic default behaviour:
 
 ;;;; Plain text
 
-(defcustom org-e-html-quotes
-  '(("fr"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
-     ("\\(\\S-\\)\"" . "~»")
-     ("\\(\\s-\\|(\\|^\\)'" . "'"))
-    ("en"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "``")
-     ("\\(\\S-\\)\"" . "''")
-     ("\\(\\s-\\|(\\|^\\)'" . "`")))
-  "Alist for quotes to use when converting english double-quotes.
-
-The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
-
-For each item in a CONS, the first string is a regexp
-for allowed characters before/after the quote, the second
-string defines the replacement string for this quote."
+(defcustom org-e-html-smart-quote-replacements
+  '(("fr" "&laquo;&nbsp;" "&nbsp;&raquo;" "&lsquo;" "&rsquo;" "&rsquo;")
+    ("en" "&ldquo;" "&rdquo;" "&lsquo;" "&rsquo;" "&rsquo;")
+    ("de" "&bdquo;" "&ldquo;" "&sbquo;" "&lsquo;" "&rsquo;"))
+  "What to export for `smart-quotes'.
+Each element is a language code followed by five strings:
+ 1. Open double-quotes
+ 2. Close double-quotes
+ 3. Open single-quote
+ 4. Close single-quote
+ 5. Mid-word apostrophe"
   :group 'org-export-e-html
   :type '(list
-	  (cons :tag "Opening quote"
-		(string :tag "Regexp for char before")
-		(string :tag "Replacement quote     "))
-	  (cons :tag "Closing quote"
-		(string :tag "Regexp for char after ")
-		(string :tag "Replacement quote     "))
-	  (cons :tag "Single quote"
-		(string :tag "Regexp for char before")
-		(string :tag "Replacement quote     "))))
+	  (string :tag "Open double-quotes")    ; "“"
+	  (string :tag "Close double-quotes")   ; "”"
+	  (string :tag "Open single-quote")     ; "‘"
+	  (string :tag "Close single-quote")    ; "’"
+	  (string :tag "Mid-word apostrophe"))) ; "’"
 
 ;;;; Compilation
 
@@ -1459,15 +1446,7 @@ This is used to choose a separator for constructs like \\verb."
   "Export quotation marks depending on language conventions.
 TEXT is a string containing quotation marks to be replaced.  INFO
 is a plist used as a communication channel."
-  (mapc (lambda(l)
-	  (let ((start 0))
-	    (while (setq start (string-match (car l) text start))
-	      (let ((new-quote (concat (match-string 1 text) (cdr l))))
-		(setq text (replace-match new-quote  t t text))))))
-	(cdr (or (assoc (plist-get info :language) org-e-html-quotes)
-		 ;; Falls back on English.
-		 (assoc "en" org-e-html-quotes))))
-  text)
+  (org-export-quotation-marks text info org-e-html-smart-quote-replacements))
 
 (defun org-e-html--wrap-label (element output)
   "Wrap label associated to ELEMENT around OUTPUT, if appropriate.
@@ -2691,7 +2670,7 @@ contextual information."
   ;; 		  (format "\\%s{}" (match-string 1 text)) nil t text)
   ;; 	    start (match-end 0))))
   ;; Handle quotation marks
-  ;; (setq text (org-e-html--quotation-marks text info))
+  (setq text (org-e-html--quotation-marks text info))
   ;; Convert special strings.
   ;; (when (plist-get info :with-special-strings)
   ;;   (while (string-match (regexp-quote "...") text)
diff --git a/contrib/lisp/org-e-latex.el b/contrib/lisp/org-e-latex.el
index 67e9197..2543c29 100644
--- a/contrib/lisp/org-e-latex.el
+++ b/contrib/lisp/org-e-latex.el
@@ -687,38 +687,28 @@ during latex export it will output
 
 ;;;; Plain text
 
-(defcustom org-e-latex-quotes
-  '(("fr"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
-     ("\\(\\S-\\)\"" . "~»")
-     ("\\(\\s-\\|(\\|^\\)'" . "'"))
-    ("en"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "``")
-     ("\\(\\S-\\)\"" . "''")
-     ("\\(\\s-\\|(\\|^\\)'" . "`")))
-  "Alist for quotes to use when converting english double-quotes.
-
-The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
-
-For each item in a CONS, the first string is a regexp
-for allowed characters before/after the quote, the second
-string defines the replacement string for this quote."
+(defcustom org-e-latex-quote-replacements
+  '(("en" "``" "''" "`" "'" "'")
+    ("fr" "«~" "~»" "‹~" "~›" "'")
+    ("de" ",," "``" "," "`" "'"))
+  "What to output for quotes.  Each element is a list of six strings.
+The first string specifies the language these quotes apply to (\"en\",
+\"fr\", \"de\", etc.; see the LANGUAGE keyword), and the other five
+define the strings to use for, in order:
+ 1. Open double-quotes
+ 2. Close double-quotes
+ 3. Open single-quote
+ 4. Close single-quote
+ 5. Mid-word apostrophe"
   :group 'org-export-e-latex
-  :type '(list
-	  (cons :tag "Opening quote"
-		(string :tag "Regexp for char before")
-		(string :tag "Replacement quote     "))
-	  (cons :tag "Closing quote"
-		(string :tag "Regexp for char after ")
-		(string :tag "Replacement quote     "))
-	  (cons :tag "Single quote"
-		(string :tag "Regexp for char before")
-		(string :tag "Replacement quote     "))))
-
+  :type '(repeat
+	  (list
+	   (string :tag "Language code")
+	   (string :tag "Open double-quotes")
+	   (string :tag "Close double-quotes")
+	   (string :tag "Open single-quote")
+	   (string :tag "Close single-quote")
+	   (string :tag "Mid-word apostrophe"))))
 
 ;;;; Compilation
 
@@ -852,19 +842,8 @@ nil."
 	     options
 	     ","))
 
-(defun org-e-latex--quotation-marks (text info)
-  "Export quotation marks depending on language conventions.
-TEXT is a string containing quotation marks to be replaced.  INFO
-is a plist used as a communication channel."
-  (mapc (lambda(l)
-	  (let ((start 0))
-	    (while (setq start (string-match (car l) text start))
-	      (let ((new-quote (concat (match-string 1 text) (cdr l))))
-		(setq text (replace-match new-quote  t t text))))))
-	(cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
-		 ;; Falls back on English.
-		 (assoc "en" org-e-latex-quotes))))
-  text)
+(defun org-e-latex--quotation-marks (text info) 
+  (org-export-quotation-marks text info org-e-latex-quote-replacements))
 
 (defun org-e-latex--wrap-label (element output)
   "Wrap label associated to ELEMENT around OUTPUT, if appropriate.
diff --git a/contrib/lisp/org-e-odt.el b/contrib/lisp/org-e-odt.el
index cab4c66..7eb92b6 100644
--- a/contrib/lisp/org-e-odt.el
+++ b/contrib/lisp/org-e-odt.el
@@ -2318,39 +2318,28 @@ in order to mimic default behaviour:
 
 ;;;; Plain text
 
-(defcustom org-e-odt-quotes
-  '(("fr"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "« ")
-     ("\\(\\S-\\)\"" . "» ")
-     ("\\(\\s-\\|(\\|^\\)'" . "'"))
-    ("en"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "“")
-     ("\\(\\S-\\)\"" . "”")
-     ("\\(\\s-\\|(\\|^\\)'" . "‘")
-     ("\\(\\S-\\)'" . "’")))
-  "Alist for quotes to use when converting english double-quotes.
-
-The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
-
-For each item in a CONS, the first string is a regexp
-for allowed characters before/after the quote, the second
-string defines the replacement string for this quote."
+(defcustom org-e-odt-quote-replacements
+  '(("en" "“" "”" "‘" "’" "’")
+    ("fr" "« " " »" "‹ " " ›" "’")
+    ("de" "„" "“" "‚" "‘" "’"))
+  "What to output for quotes.  Each element is a list of six strings.
+The first string specifies the language these quotes apply to (\"en\",
+\"fr\", \"de\", etc.; see the LANGUAGE keyword), and the other five
+define the strings to use for, in order:
+ 1. Open double-quotes
+ 2. Close double-quotes
+ 3. Open single-quote
+ 4. Close single-quote
+ 5. Mid-word apostrophe"
   :group 'org-export-e-odt
-  :type '(list
-	  (cons :tag "Opening quote"
-		(string :tag "Regexp for char before")
-		(string :tag "Replacement quote     "))
-	  (cons :tag "Closing quote"
-		(string :tag "Regexp for char after ")
-		(string :tag "Replacement quote     "))
-	  (cons :tag "Single quote"
-		(string :tag "Regexp for char before")
-		(string :tag "Replacement quote     "))))
-
+  :type '(repeat
+	  (list
+	   (string :tag "Language code")
+	   (string :tag "Open double-quotes")
+	   (string :tag "Close double-quotes")
+	   (string :tag "Open single-quote")
+	   (string :tag "Close single-quote")
+	   (string :tag "Mid-word apostrophe"))))
 
 ;;;; Compilation
 
@@ -2485,19 +2474,8 @@ This is used to choose a separator for constructs like \\verb."
 	  when (not (string-match (regexp-quote (char-to-string c)) s))
 	  return (char-to-string c))))
 
-(defun org-e-odt--quotation-marks (text info)
-  "Export quotation marks depending on language conventions.
-TEXT is a string containing quotation marks to be replaced.  INFO
-is a plist used as a communication channel."
-  (mapc (lambda(l)
-	  (let ((start 0))
-	    (while (setq start (string-match (car l) text start))
-	      (let ((new-quote (concat (match-string 1 text) (cdr l))))
-		(setq text (replace-match new-quote  t t text))))))
-	(cdr (or (assoc (plist-get info :language) org-e-odt-quotes)
-		 ;; Falls back on English.
-		 (assoc "en" org-e-odt-quotes))))
-  text)
+(defun org-e-odt--quotation-marks (text info) 
+  (org-export-quotation-marks text info org-e-odt-quote-replacements))
 
 (defun org-e-odt--wrap-label (element output)
   "Wrap label associated to ELEMENT around OUTPUT, if appropriate.
diff --git a/contrib/lisp/org-export.el b/contrib/lisp/org-export.el
index b9294e5..4e5f738 100644
--- a/contrib/lisp/org-export.el
+++ b/contrib/lisp/org-export.el
@@ -143,6 +143,7 @@
     (:with-priority nil "pri" org-export-with-priority)
     (:with-special-strings nil "-" org-export-with-special-strings)
     (:with-sub-superscript nil "^" org-export-with-sub-superscripts)
+    (:with-smart-quotes nil "\"" org-export-with-smart-quotes)
     (:with-toc nil "toc" org-export-with-toc)
     (:with-tables nil "|" org-export-with-tables)
     (:with-tags nil "tags" org-export-with-tags)
@@ -284,6 +285,33 @@ rules.")
   :tag "Org Export General"
   :group 'org-export)
 
+;; Generic function, usable by exporters, but they can define their own
+;; instead.
+(defun org-export-quotation-marks (text info replacements)
+  "Export quotation marks depending on language conventions.
+TEXT is a string containing quotation marks to be replaced.  INFO
+is a plist used as a communication channel.  REPLACEMENTS is an alist
+mapping a language code to five replacement strings."
+  (when (plist-get info :with-smart-quotes)
+    (let* ((regexps 
+	    (cdr 
+	     (or 
+	      (assoc (plist-get info :language)
+		     org-smart-quotes-regexps)
+	      (assq 'DEFAULT org-smart-quotes-regexps))))
+	   (subs (cdr (or (assoc (plist-get info :language)
+				 replacements)
+			  (assoc "en" replacements))))
+	   (quotes (pairlis regexps subs)))
+      (mapc (lambda (p)
+	      (let ((re (car p))
+		    (su (cdr p)))
+		(setq text (replace-regexp-in-string re su text t t 9))))
+	  quotes)))
+  text)
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
 (defcustom org-export-with-archived-trees 'headline
   "Whether sub-trees with the ARCHIVE tag should be exported.
 
@@ -445,6 +473,16 @@ e.g. \"e:nil\"."
   :group 'org-export-general
   :type 'boolean)
 
+(defcustom org-export-with-smart-quotes t
+  "Non-nil means try to make quotes \"smart\" when exporting.
+
+For example, HTML export would convert \"Hello\" to &ldquo;Hello&rdquo;.
+
+The exact style of quotes depends on the language; see the LANGUAGE
+keyword and also the smart-quote custom settings for each exporter."
+  :group 'org-export-general
+  :type 'boolean)
+
 (defcustom org-export-with-planning nil
   "Non-nil means include planning info in export.
 This option can also be set with the #+OPTIONS: line,
diff --git a/lisp/org.el b/lisp/org.el
index b89889d..8a446ec 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -3629,6 +3629,69 @@ When nil, the \\name form remains in the buffer."
   :version "24.1"
   :type 'boolean)
 
+(defcustom org-smart-quotes nil
+  "Non-nil means display `smart' quotes on-screen in place
+of \" and ' characters."
+  :group 'org-appearance
+  :type 'boolean)
+
+(defcustom org-smart-quotes-replacements
+  '("“" "”" "‘" "’" "’")
+  "What to display on-screen when `org-smart-quotes' is non-nil.
+A list of five strings:
+ 1. Open double-quotes
+ 2. Close double-quotes
+ 3. Open single-quote
+ 4. Close single-quote
+ 5. Mid-word apostrophe"
+  :group 'org-appearance
+  :type '(list
+	  (string :tag "Open double-quotes" "“")
+	  (string :tag "Close double-quotes" "”")
+	  (string :tag "Open single-quote" "‘")
+	  (string :tag "Close single-quote" "’")
+	  (string :tag "Mid-word apostrophe" "’")))
+
+(defcustom org-smart-quotes-regexps
+  '((DEFAULT
+      "\\(?:\\s-\\|\\s(\\|^\\)\\(?9:\"\\)\\(?:\\w\\|\\s.\\|\\s_\\)\\|\\s-\\(?9:\"\\)$" 
+      "\\(?:\\S-\\)\\(?9:\"\\)\\(?:\\s-\\|$\\|\\s)\\|\\s.\\)\\|^\\(?9:\"\\)\\s-" 
+      "\\(?:\\s-\\|(\\|^\\)\\(?9:'\\)\\w\\|\\s-\\(?9:'\\)$"
+      "\\w\\s.*\\(?9:'\\)\\(?:\\s-\\|\\s.\\|$\\)\\|^\\(?9:'\\)\\s-"
+      "\\w\\(?9:'\\)\\w"))
+  "Regexps for quotes to be made `smart' quotes upon export or onscreen.
+Each element is a list of six strings.  The car is a string
+representing the language to which this definition applies (e.g. \"en\",
+\"fr\", \"de\", etc.); the cdr (the other five elements) are five REs 
+matching, in order:
+ 1. Opening double-quotes
+ 2. Closing double-quotes
+ 3. Opening single-quotes
+ 4. Closing single-quotes
+ 5. Mid-word apostrophes
+
+Each regexp should surround the actual quote in a capturing group, which
+must be specified as number 9 (so as not to conflict with other processing.)
+
+One element should have as its car the atom DEFAULT, to be used when no
+other element fits.  It is also the one used for on-screen display of
+`smart' quotes (see the variable `org-smart-quotes').
+
+As what makes an opening or closing quote is somewhat consistent across
+languages (as opposed to how they are represented in typography), the
+DEFAULT element is likely sufficient for most purposes."
+  :group 'org-export-general
+  :group 'org-appearance
+  :type '(repeat
+	  (list
+	   (choice (const DEFAULT)
+		   (string :tag "Language"))
+	   (regexp :tag "Open double-quotes")
+	   (regexp :tag "Close double-quotes")
+	   (regexp :tag "Open single-quote")
+	   (regexp :tag "Close single-quote")
+	   (regexp :tag "Mid-word apostrophe"))))
+
 (defvar org-emph-re nil
   "Regular expression for matching emphasis.
 After a match, the match groups contain these elements:
@@ -5931,6 +5994,7 @@ needs to be inserted at a specific position in the font-lock sequence.")
 	   ;; Specials
 	   '(org-do-latex-and-special-faces)
 	   '(org-fontify-entities)
+	   '(org-fontify-quotes)
 	   '(org-raise-scripts)
 	   ;; Code
 	   '(org-activate-code (1 'org-code t))
@@ -5952,6 +6016,43 @@ needs to be inserted at a specific position in the font-lock sequence.")
 		   '(org-font-lock-keywords t nil nil backward-paragraph))
     (kill-local-variable 'font-lock-keywords) nil))
 
+(defun org-fontify-quotes (limit)
+  (require 'org-export)
+  (when org-smart-quotes
+    (let* ((start (point))
+	   k su
+	   (splice-string (lambda (lst join)
+			    (if (null (cdr lst)) (car lst)
+			      (concat (car lst) join
+				      (splice-string (cdr lst) join)))))
+	   (regexps
+	    (cdr
+	     (assq 'DEFAULT org-smart-quotes-regexps)))
+	   (i 1)
+	   (allreg
+	    (mapconcat (lambda (n) (prog1 (format "\\(?%d:%s\\)" i n)
+				     (setq i (1+ i))))
+		       regexps "\\|"))
+	   (quotes (pairlis regexps org-smart-quotes-replacements)))
+      (catch 'match
+	(while (re-search-forward allreg limit t)
+	  (cond ((match-string 1)
+		 (setq su (nth 0 org-smart-quotes-replacements)))
+		((match-string 2)
+		 (setq su (nth 1 org-smart-quotes-replacements)))
+		((match-string 3)
+		 (setq su (nth 2 org-smart-quotes-replacements)))
+		((match-string 4)
+		 (setq su (nth 3 org-smart-quotes-replacements)))
+		((match-string 5)
+		 (setq su (nth 4 org-smart-quotes-replacements))))
+	  (add-text-properties (match-beginning 9) (match-end 9)
+			       (list 'font-lock-fontified t
+				     'face 'org-document-info))
+	  (compose-region (match-beginning 9) (match-end 9) su nil)
+	  (backward-char 1)
+	  (throw 'match t))))))
+
 (defun org-toggle-pretty-entities ()
   "Toggle the composition display of entities as UTF8 characters."
   (interactive)
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: Smart Quotes Exporting
  2012-06-06  2:14                         ` Mark E. Shoulson
@ 2012-06-07 19:21                           ` Nicolas Goaziou
  2012-06-11  1:28                             ` Mark Shoulson
  0 siblings, 1 reply; 23+ messages in thread
From: Nicolas Goaziou @ 2012-06-07 19:21 UTC (permalink / raw)
  To: Mark E. Shoulson; +Cc: emacs-orgmode

Hello,

"Mark E. Shoulson" <mark@kli.org> writes:

> Update on the smart-quotes patch.  Supports the odt exporter now too,
> which I think covers all the current major "new" exporters for which
> it is relevant (adding smart quotes to ASCII export is a contradiction
> in terms;

The ASCII exporter also handles UTF-8, so it's good to have it there too.

> should it be in the "publish" exporter?  It didn't look like it to
> me).

No.

> Added an options keyword, '"' (that is, the double-quote mark) to
> select smart quotes on/off, and a defcustom for customizing your
> default.  Set the default default [sic] to nil, though actually it
> might be reasonable to set it to t.  Slight touch-up to the regexps
> since last time, but they will definitely be subject to a lot of
> fine-tuning as more special cases are found that break them and ways
> to fix it are found (the close-quote still breaks on one of "/a/." or
> "/a./")

Again, using regexps on plain text objects is the wrong approach, as you
need a better understanding of the whole paragraph structure to handle
quotes properly. I already suggested a possible solution; is there
anything wrong with it?


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Smart Quotes Exporting
  2012-06-07 19:21                           ` Nicolas Goaziou
@ 2012-06-11  1:28                             ` Mark Shoulson
  2012-06-12 13:21                               ` Nicolas Goaziou
  0 siblings, 1 reply; 23+ messages in thread
From: Mark Shoulson @ 2012-06-11  1:28 UTC (permalink / raw)
  To: emacs-orgmode

Nicolas Goaziou <n.goaziou <at> gmail.com> writes:

> 
> Hello,
> 
> "Mark E. Shoulson" <mark <at> kli.org> writes:
> 
> > Update on the smart-quotes patch.  Supports the odt exporter now too,
> > which I think covers all the current major "new" exporters for which
> > it is relevant (adding smart quotes to ASCII export is a contradiction
> > in terms;
> 
> The ASCII exporter also handles UTF-8, so it's good to have it there too.

Really?  I would have thought ASCII meant ASCII, as in 7-bit clean text.  More
of a "plain text" exporter then.  Fair enough.  I'll work it in.

> > should it be in the "publish" exporter?  It didn't look like it to
> > me).
> 
> No.

OK, good.

> 
> > Added an options keyword, '"' (that is, the double-quote mark) to
> > select smart quotes on/off, and a defcustom for customizing your
> > default.  Set the default default [sic] to nil, though actually it
> > might be reasonable to set it to t.  Slight touch-up to the regexps
> > since last time, but they will definitely be subject to a lot of
> > fine-tuning as more special cases are found that break them and ways
> > to fix it are found (the close-quote still breaks on one of "/a/." or
> > "/a./")
> 
> Again, using regexps on plain text objects is the wrong approach, as you
> need a better understanding of the whole paragraph structure to handle
> quotes properly. I already suggested a possible solution; is there
> anything wrong with it?

It looked to me like your solution would essentially boil down to "do string
handling when there's a string, otherwise recur down and find the strings,"
which essentially means apply it to all the strings... and there were already
functions out there applying things to strings, so this can just ride along with
them.  Here, let's look at your suggestion and see if we can find what I missed:

] Walk element/object/secondary-string's contents.
] 
]   1. When a string is encountered:
]
]      1. If it has a quote as its first or last position, check for
]         objects before or after the string to guess its status. An
]         object never starts with a white space, but you may have to
]         check :post-blank property in order to know if previous object
]         had white spaces at its end.
]
]      2. For each quote everywhere else in the string, your regexp can
]         handle it fine.
]
]   2. When an object belonging to `org-element-recursive-objects' is
]      encountered, apply the function to this object.
]
]   3. Accumulate returned strings or objects.

So, if it's a string, use the regexps (if they can be smart enough to look at
beginning and end of the string, which they can--though I haven't been using the
:post-blank property so presumably something is amiss), and if it isn't a
string, recur down until you get to a string... Ah, but only if it's in
org-element-recursive-objects.  So the issue with the current state is that it
would wind up applying to too much? (it would hit code and verbatim elements,
for example, and that would be wrong.)  And detecting such things at the string
level would be the wrong place... So it remains to find the right place in the
processing to put a function like the one you describe.  I'm trying to get a
proper understanding of the code structure to see what you mean.  Looks like it
should be something like a transcoder, only called on everything... wait, called
on the top-level parsed tree object, recursively doing its thing before(?) the
transcoders of the individual objects get to it.  So almost something replacing
the (lambda (blob contents info) contents) stub in org-export-transcoder; does
that make sense to you? Otherwise, called somehow in org-export-data.  In either
case made a hook of some kind so that it is backend-specific.

Does it sound like I am understanding this right, to you?

The on-screen one would still use the plain-string computation, as you said,
since the full parse isn't available.  And that seems to work okay (the export
works okay too, for simple cases.)  It would also need to be tweaked not to act
on verbatim/comment text, etc.

Thanks,

~mark

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Smart Quotes Exporting
  2012-06-11  1:28                             ` Mark Shoulson
@ 2012-06-12 13:21                               ` Nicolas Goaziou
  2012-06-15 16:20                                 ` Mark Shoulson
  0 siblings, 1 reply; 23+ messages in thread
From: Nicolas Goaziou @ 2012-06-12 13:21 UTC (permalink / raw)
  To: Mark Shoulson; +Cc: emacs-orgmode

Hello,

Mark Shoulson <mark@kli.org> writes:

>> The ASCII exporter also handles UTF-8, so it's good to have it there too.
>
> Really?  I would have thought ASCII meant ASCII, as in 7-bit clean
> text.

org-e-ascii.el (as old org-ascii.el) handles ASCII, Latin1 and UTF-8
encodings.

> It looked to me like your solution would essentially boil down to "do
> string handling when there's a string, otherwise recur down and find
> the strings," which essentially means apply it to all the
> strings... and there were already functions out there applying things
> to strings, so this can just ride along with them.  Here, let's look
> at your suggestion and see if we can find what I missed:
>
> ] Walk element/object/secondary-string's contents.
> ] 
> ]   1. When a string is encountered:
> ]
> ]      1. If it has a quote as its first or last position, check for
> ]         objects before or after the string to guess its status. An
> ]         object never starts with a white space, but you may have to
> ]         check :post-blank property in order to know if previous object
> ]         had white spaces at its end.
> ]
> ]      2. For each quote everywhere else in the string, your regexp can
> ]         handle it fine.
> ]
> ]   2. When an object belonging to `org-element-recursive-objects' is
> ]      encountered, apply the function to this object.
> ]
> ]   3. Accumulate returned strings or objects.
>
> So, if it's a string, use the regexps (if they can be smart enough to look at
> beginning and end of the string, which they can--though I haven't been using the
> :post-blank property so presumably something is amiss), and if it isn't a
> string, recur down until you get to a string... Ah, but only if it's in
> org-element-recursive-objects.

You're missing an important part: the regexps cannot be smart enough for
quotes at the beginning or the end of the string. There, you must look
outside the string. Hence:

> ]      1. If it has a quote as its first or last position, check for
> ]         objects before or after the string to guess its status. An
> ]         object never starts with a white space, but you may have to
> ]         check :post-blank property in order to know if previous object
> ]         had white spaces at its end.

But you can only do that from the element containing the string, not
from the string itself.

> So the issue with the current state is that it
> would wind up applying to too much? (it would hit code and verbatim elements,
> for example, and that would be wrong.)

No, you are not applying it too much (verbatim elements don't contain
plain-text objects) but your function hasn't got access to enough
information to be useful.

> So it remains to find the right place in the processing to put
> a function like the one you describe.  I'm trying to get a proper
> understanding of the code structure to see what you mean.  Looks like
> it should be something like a transcoder, only called on
> everything... 

Transcoders are type specific, so that's not an option.

> wait, called on the top-level parsed tree object, recursively doing
> its thing before(?) the transcoders of the individual objects get to
> it.

That's called a parse tree filter. That should be a possibility
indeed. The function would be applied on the parse tree and would
replace strings within elements containing plain text (that is
paragraph, verse-block and table-row types). parse tree filters are
applied very early in the export process.

Another option would be to integrate it into
`org-element-normalize-contents', but I think the previous way is
better.

> The on-screen one would still use the plain-string computation, as you said,
> since the full parse isn't available.

Yes.

> It would also need to be tweaked not to act on verbatim/comment text,
> etc.

Yes. You may want to use `org-element-at-point' and `org-element-type'
to tell if you're somewhere smart quotes are allowed (in table,
table-row, paragraph, verse-block elements).
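
A minimal sketch of such a check, for illustration only: the function
name `my/smart-quotes-allowed-p' is hypothetical (it is not part of any
patch in this thread), and the list of types is simply the one named
above.

#+begin_src emacs-lisp
(require 'org)
(require 'org-element)

;; Hypothetical helper, not part of Org: decide from the element at
;; point whether on-screen smart quotes should be applied there.
(defun my/smart-quotes-allowed-p ()
  "Return non-nil when the element at point may show smart quotes.
The accepted types are table, table-row, paragraph and verse-block."
  (memq (org-element-type (org-element-at-point))
        '(table table-row paragraph verse-block)))
#+end_src

A fontification function could call this predicate before composing a
quote character and skip the match when it returns nil.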


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Smart Quotes Exporting
  2012-06-12 13:21                               ` Nicolas Goaziou
@ 2012-06-15 16:20                                 ` Mark Shoulson
  2012-06-19  9:26                                   ` Nicolas Goaziou
  0 siblings, 1 reply; 23+ messages in thread
From: Mark Shoulson @ 2012-06-15 16:20 UTC (permalink / raw)
  To: emacs-orgmode

Nicolas Goaziou <n.goaziou <at> gmail.com> writes:

> 
> Hello,
> 
> Mark Shoulson <mark <at> kli.org> writes:
> 
> >> The ASCII exporter also handles UTF-8, so it's good to have it there too.
> >
> > Really?  I would have thought ASCII meant ASCII, as in 7-bit clean
> > text.
> 
> org-e-ascii.el (as old org-ascii.el) handles ASCII, Latin1 and UTF-8
> encodings.

I noticed that after writing my response.  The name just threw me a little.  
Yes, that exporter needs to handle it too.

> > It looked to me like your solution would essentially boil down to "do
> > string handling when there's a string, otherwise recur down and find
> > the strings," which essentially means apply it to all the
> > strings... and there were already functions out there applying things
> > to strings, so this can just ride along with them.  Here, let's look
> > at your suggestion and see if we can find what I missed:
> >
....
> > So, if it's a string, use the regexps (if they can be smart enough to look at
> > beginning and end of the string, which they can--though I haven't been using the
> > :post-blank property so presumably something is amiss), and if it isn't a
> > string, recur down until you get to a string... Ah, but only if it's in
> > org-element-recursive-objects.
> 
> You're missing an important part: the regexps cannot be smart enough for
> quotes at the beginning or the end of the string. There, you must look
> outside the string. Hence:

Well, wait; regexps can make some pretty darn good guesses at the beginnings 
or ends of strings.  Quotations don't normally end in spaces (in the 
conventions used with ""; French typography is different, but if you're using 
spaces around your quotes you have worse problems (line-breaks) to worry 
about).  So if a string ends in space(s) followed by a quote, it's very likely 
that quote is an open-quote for some stuff that comes after.  Conversely, if a 
string starts with a quote followed by some spaces, it's very likely a close-
quote to what went on before.
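
To make the heuristic concrete, here is a rough sketch of the two edge
rules just described, for double quotes only.  The function name
`my/edge-quote-guess' and its regexps are made up for this illustration
and are not the ones in the patch.

#+begin_src emacs-lisp
;; Hypothetical helper illustrating the edge-of-string heuristic.
(defun my/edge-quote-guess (s)
  "Guess the role of a double quote at either edge of string S.
Return `open', `close', or nil when S alone gives no clue."
  (cond
   ;; Trailing quote preceded by white space: it probably opens
   ;; whatever comes after the string.
   ((string-match-p "[ \t]\"\\'" s) 'open)
   ;; Leading quote, optional punctuation, then white space: it
   ;; probably closes whatever came before the string.
   ((string-match-p "\\`\"[.,;:!?]*[ \t]" s) 'close)
   ;; No white space next to the quote: the string alone cannot tell.
   (t nil)))
#+end_src

With this sketch, "Caesar said, \"" is classified as `open' and
"\" he replied" as `close', while a string consisting of a lone quote
yields nil, which is the case under discussion.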

This isn't quite it; beginning-of-string followed by quote, then punctuation 
and then spaces is also a close-quote, etc... There is a lot of fine-tuning.  
But even what I currently have was able to handle your 

Caesar said, "/Alea Jacta est./"

example.  Yes, there are edge-cases which this won't catch, and it remains to 
be seen how pervasive and annoying those are.  It may be that repeated 
tweaking of regexps will handle enough of the ordinary cases.  It may be that 
after a few rounds of regexp-hacking someone will finally decide that regexp-
hacking just won't handle enough of the important cases.  But I think even as 
it stands now we'd probably handle 80-90% of the normal situations, which 
really is as much as we reasonably can hope for.

Could I trouble someone to try applying my patch and trying it out for 
yourself and seeing just how bad/good the performance is?  It seems to work 
okay for the cases I've been trying, but maybe my dataset isn't robust 
enough.  Let's give it a test and see how many actual cases in common usage 
it gets wrong.  Maybe see how much can be fixed by tuning regexps.

> 
> > ]      1. If it has a quote as its first or last position, check for
> > ]         objects before or after the string to guess its status. An
> > ]         object never starts with a white space, but you may have to
> > ]         check :post-blank property in order to know if previous object
> > ]         had white spaces at its end.
> 
> But you can only do that from the element containing the string, not
> from the string itself.

The case where a quote both sits at the edge of a string (i.e. at the border 
of some element, formatting, etc) *and* does not have whitespace next to it, 
with possible punctuation, does not seem to be a normal occurrence to me.  If 
I'm wrong, how common *is* it?

> 
> > So the issue with the current state is that it
> > would wind up applying to too much? (it would hit code and verbatim elements,
> > for example, and that would be wrong.)
> 
> No, you are not applying it too much (verbatim elements don't contain
> plain-text objects) but your function hasn't got access to enough
> information to be useful.

The on-screen version, of course, will have to be smarter and check for 
the "face" formatting to make sure it doesn't happen in comments or verbatims; 
I am pretty sure it does not do that yet.
 
> > wait, called on the top-level parsed tree object, recursively doing
> > its thing before(?) the transcoders of the individual objects get to
> > it.
> 
> That's called a parse tree filter. That should be a possibility
> indeed. The function would be applied on the parse tree and would
> replace strings within elements containing plain text (that is
> paragraph, verse-block and table-row types). parse tree filters are
> applied very early in the export process.
> 
> Another option would be to integrate it into
> `org-element-normalize-contents', but I think the previous way is
> better.

Maybe.  I know it sounds like I'm fixated on the plain-text solution, but I'm 
not convinced the envisioned problems are more than theoretical, or that they 
will cause an unacceptable amount of error (keeping in mind that some error 
*is* acceptable and unavoidable).

> > The on-screen one would still use the plain-string computation, as you said,
> > since the full parse isn't available.
> 
> Yes.
> 
> > It would also need to be tweaked not to act on verbatim/comment text,
> > etc.
> 
> Yes. You may want to use `org-element-at-point' and `org-element-type'
> to tell if you're somewhere smart quotes are allowed (in table,
> table-row, paragraph, verse-block elements).

Probably.  I think I saw some other package make these decisions by peeking at 
the formatting and seeing if it is set in comment-face or something, but 
checking the element at point is presumably more sensible.

~mark

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Smart Quotes Exporting
  2012-06-15 16:20                                 ` Mark Shoulson
@ 2012-06-19  9:26                                   ` Nicolas Goaziou
  2012-08-07 23:18                                     ` Bastien
  0 siblings, 1 reply; 23+ messages in thread
From: Nicolas Goaziou @ 2012-06-19  9:26 UTC (permalink / raw)
  To: Mark Shoulson; +Cc: emacs-orgmode

Hello,

Mark Shoulson <mark@kli.org> writes:

> Well, wait; regexps can make some pretty darn good guesses at the beginnings 
> or ends of strings.

I know that. They do a good job; I just want a better one.

> This isn't quite it; beginning-of-string followed by quote, then punctuation 
> and then spaces is also a close-quote, etc... There is a lot of fine-tuning.  
> But even what I currently have was able to handle your 
>
> Caesar said, "/Alea Jacta est./"
>
> example.

No, it doesn't actually handle that; it's just sheer luck.  Indeed, the
quoting function is applied to "\"".  There's absolutely no space,
punctuation, etc. to save the day.  So it makes a wild guess with
a probability of 0.5 of success.  Since the guess is always the same,
"/a/" will always fail.

> The case where a quote both sits at the edge of a string (i.e. at the border 
> of some element, formatting, etc) *and* does not have whitespace next to it, 
> with possible punctuation, does not seem to be a normal occurrence to me.  If 
> I'm wrong, how common *is* it?

Even if it rarely happens, it can be _very_ annoying to have to cope
with bad guesses. If it can be avoided, I see no reason not to do so.

Now, here is the infrastructure I propose.

Internally, the two following functions are required.

#+begin_src emacs-lisp
(defun org-export--smart-quotes-in-element (element backend)
  "Replace plain quotes with smart quotes in ELEMENT.

ELEMENT is an Org element or a secondary string.  BACKEND is the
back-end to check for rules, as a symbol.

This is a destructive operation.  Return new element."
  (let* ((type (org-element-type element))
         (properties (and type (nth 1 element))))
    ;; Destructively apply changes to secondary string, if any.
    (let ((secondary (and type (assq type org-element-secondary-value-alist))))
      (when secondary
        (let* ((sec-symbol (cdr secondary))
               (sec-value (plist-get properties sec-symbol)))
          (when sec-value
            (setq properties
                  (plist-put properties
                             sec-symbol
                             (org-export--smart-quotes-in-element
                              sec-value backend)))))))
    ;; Destructively change `:caption' if present.  Since it's a dual
    ;; keyword, apply smart quotes to both CAR and CDR, if required.
    (let ((caption (plist-get properties :caption)))
      (when caption
        (setq properties
              (plist-put properties
                         :caption
                         (cons
                          (org-export--smart-quotes-in-element
                           (car caption) backend)
                          (and (cdr caption)
                               (org-export--smart-quotes-in-element
                                (cdr caption) backend)))))))
    ;; Recursively apply changes to contents.  Rebuild ELEMENT along
    ;; the way, with updated strings.
    (let ((contents (if type (org-element-contents element) element))
          previous current next acc)
      (while contents
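        ;; Remember the item handled last before moving on, so that
        ;; PREVIOUS really is the sibling preceding CURRENT.
        (setq previous current
              current (pop contents)
              next (car contents))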
        (push
         (cond ((stringp current)
                ;; CURRENT is a string: Call
                ;; `org-export-quotation-marks' with appropriate
                ;; information.
                (org-export-quotation-marks
                 current
                 (and previous
                      (if (stringp previous)
                          (length (and (string-match " +\\'" previous)
                                       (match-string 0 previous)))
                        (org-element-property :post-blank previous)))
                 (and next
                      (if (not (stringp next)) 0
                        (length (and (string-match "\\` +" next)
                                     (match-string 0 next)))))
                 backend))
               ;; CURRENT is recursive: Move into it.
               ((memq (org-element-type current)
                      org-element-recursive-objects)
                (org-export--smart-quotes-in-element current backend))
               ;; Otherwise, just accumulate CURRENT.
               (t current))
         acc))
      ;; Re-build transformed element.
      (if (or (not type) (eq type 'plain-text)) (nreverse acc)
        (nconc (list type properties) (nreverse acc))))))

(defun org-export-set-smart-quotes (tree backend info)
  "Replace plain quotes with smart quotes in TREE.

BACKEND is the back-end, as a symbol, used for transcoding.  INFO
is a plist used as a communication channel.

This is a destructive operation.  This function is meant to be
used as a parse tree filter for back-ends activating smart
quotes."
  ;; Destructively apply smart quotes to parsed keywords in info.
  (let ((value (plist-get info :title)))
    (when value
      (setq info
            (plist-put info
                       :title
                       (org-export--smart-quotes-in-element value backend)))))
  ;; Replace smart quotes in elements containing plain text or
  ;; secondary strings across the parse tree.
  (org-element-map
   tree '(paragraph verse-block table-cell headline inlinetask item)
   (lambda (el)
     (org-export-set-element el
                             (org-export--smart-quotes-in-element el backend))))
  ;; Return parse tree.
  tree)
#+end_src

Then, all that is left to do is write the function replacing quotes in
a string, with additional information:

#+begin_src emacs-lisp
(defun org-export-quotation-marks (s &optional prev next backend)
  "Replace plain quotes with smart quotes in string S.

Optional argument PREV (resp. NEXT) is the number of white space
characters before (resp. after) the string, or nil if
S starts (resp. ends) a paragraph.

Optional argument BACKEND is a symbol representing the back-end
to use for substitutions.

The function returns the new string."
  ...)
#+end_src
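
For illustration only, here is one conceivable way to use PREV and NEXT
in such a function.  It is a sketch under simplifying assumptions:
double quotes only, UTF-8 output only, no BACKEND argument, and a
hypothetical name (`my/quotation-marks-sketch') so that it is not
mistaken for the function left open above.

#+begin_src emacs-lisp
;; Hypothetical sketch: a quote opens when it has blank space (or a
;; paragraph boundary) before it and something non-blank after it;
;; otherwise it closes.  PREV and NEXT supply the missing context at
;; the edges of the string.
(defun my/quotation-marks-sketch (s &optional prev next)
  "Replace each \" in S with an opening or closing curly quote.
PREV (resp. NEXT) is the number of white space characters
before (resp. after) S, or nil at a paragraph boundary."
  (let ((len (length s))
        (start 0)
        (parts '()))
    (while (string-match "\"" s start)
      (let* ((pos (match-beginning 0))
             ;; Inside S, look at the neighbouring character; at an
             ;; edge, fall back on PREV or NEXT.
             (blank-before
              (if (> pos 0)
                  (memq (aref s (1- pos)) '(?\s ?\t ?\n ?\())
                (or (null prev) (> prev 0))))
             (blank-after
              (if (< (1+ pos) len)
                  (memq (aref s (1+ pos)) '(?\s ?\t ?\n))
                (or (null next) (> next 0)))))
        (push (substring s start pos) parts)
        (push (if (and blank-before (not blank-after)) "“" "”") parts)
        (setq start (1+ pos))))
    (push (substring s start) parts)
    (apply #'concat (nreverse parts))))
#+end_src

With this sketch, a lone "\"" at the start of a paragraph and directly
followed by an object (PREV nil, NEXT 0) becomes an opening quote, and
the same string directly preceded by an object at the end of
a paragraph (PREV 0, NEXT nil) becomes a closing one.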

Once this function is written, add `org-export-set-smart-quotes' as
a parse tree filter in `org-BACKEND-filters-alist'.

For example, one can add the following in org-e-latex.el to activate
smart quotes in latex export:

#+begin_src emacs-lisp
(defconst org-e-latex-filters-alist
  '((:filter-parse-tree . org-export-set-smart-quotes))
  "Alist between filters keywords and back-end specific filters.
See `org-export-filters-alist' for more information.")
#+end_src

Could you please try to modify your original
`org-export-quotation-marks' accordingly and test it?

>> Yes. You may want to use `org-element-at-point' and `org-element-type'
>> to tell if you're somewhere smart quotes are allowed (in table,
>> table-row, paragraph, verse-block elements).
>
> Probably.  I think I saw some other package make these decisions by peeking at 
> the formatting and seeing if it is set in comment-face or something, but 
> checking the element at point is presumably more sensible.

Thinking about it, looking at face used will definitely be faster,
though. That's your call.


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Smart Quotes Exporting
  2012-06-19  9:26                                   ` Nicolas Goaziou
@ 2012-08-07 23:18                                     ` Bastien
  0 siblings, 0 replies; 23+ messages in thread
From: Bastien @ 2012-08-07 23:18 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Mark Shoulson, emacs-orgmode

Hi Mark and Nicolas,

in the patchwork¹, I've marked patches² related to this discussion as
"Not Applicable".

If there is progress made on this front, please send updated patches.
If there is a patch below that I should apply, please let me know.

Thanks!

¹ http://patchwork.newartisans.com/project/org-mode/list/
² Here are the patches:

http://patchwork.newartisans.com/patch/1330/
http://patchwork.newartisans.com/patch/1344/
http://patchwork.newartisans.com/patch/1346/
http://patchwork.newartisans.com/patch/1348/

-- 
 Bastien

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2012-08-07 23:18 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-22  3:32 "Smart" quotes Mark E. Shoulson
2012-05-23 22:17 ` Nicolas Goaziou
2012-05-24  3:05   ` Mark E. Shoulson
2012-05-25 17:14     ` Nicolas Goaziou
2012-05-25 17:51       ` Jambunathan K
2012-05-25 22:51       ` Mark E. Shoulson
2012-05-26  6:48         ` Nicolas Goaziou
2012-05-29  1:30           ` Mark E. Shoulson
2012-05-29 17:57             ` Nicolas Goaziou
2012-05-30  0:51               ` Mark E. Shoulson
2012-05-31  1:50                 ` (no subject) Mark Shoulson
2012-05-31 13:38                   ` Nicolas Goaziou
2012-05-31 23:26                     ` Smart Quotes Exporting (Was: Re: (no subject)) Mark E. Shoulson
2012-06-01 17:11                       ` Smart Quotes Exporting Nicolas Goaziou
2012-06-01 22:41                         ` Mark E. Shoulson
2012-06-03  3:16                         ` Mark E. Shoulson
2012-06-06  2:14                         ` Mark E. Shoulson
2012-06-07 19:21                           ` Nicolas Goaziou
2012-06-11  1:28                             ` Mark Shoulson
2012-06-12 13:21                               ` Nicolas Goaziou
2012-06-15 16:20                                 ` Mark Shoulson
2012-06-19  9:26                                   ` Nicolas Goaziou
2012-08-07 23:18                                     ` Bastien
