From: Nicolas Goaziou <n.goaziou@gmail.com>
To: Mark Shoulson <mark@kli.org>
Cc: emacs-orgmode@gnu.org
Subject: Re: Smart Quotes Exporting
Date: Tue, 12 Jun 2012 15:21:05 +0200
Message-ID: <874nqgeke6.fsf@gmail.com>
In-Reply-To: <loom.20120611T024716-455@post.gmane.org> (Mark Shoulson's message of "Mon, 11 Jun 2012 01:28:12 +0000 (UTC)")
Hello,
Mark Shoulson <mark@kli.org> writes:
>> The ASCII exporter also handles UTF-8, so it's good to have it there too.
>
> Really? I would have thought ASCII meant ASCII, as in 7-bit clean
> text.
org-e-ascii.el (like the old org-ascii.el) handles ASCII, Latin1 and UTF-8
encodings.
> It looked to me like your solution would essentially boil down to "do
> string handling when there's a string, otherwise recur down and find
> the strings," which essentially means apply it to all the
> strings... and there were already functions out there applying things
> to strings, so this can just ride along with them. Here, let's look
> at your suggestion and see if we can find what I missed:
>
> ] Walk element/object/secondary-string's contents.
> ]
> ] 1. When a string is encountered:
> ]
> ] 1. If it has a quote as its first or last position, check for
> ] objects before or after the string to guess its status. An
> ] object never starts with a white space, but you may have to
> ] check :post-blank property in order to know if previous object
> ] had white spaces at its end.
> ]
> ] 2. For each quote everywhere else in the string, your regexp can
> ] handle it fine.
> ]
> ] 2. When an object belonging to `org-element-recursive-objects' is
> ] encountered, apply the function to this object.
> ]
> ] 3. Accumulate returned strings or objects.
>
> So, if it's a string, use the regexps (if they can be smart enough to look at
> beginning and end of the string, which they can--though I haven't been using the
> :post-blank property so presumably something is amiss), and if it isn't a
> string, recur down until you get to a string... Ah, but only if it's in
> org-element-recursive-objects.
You're missing an important part: the regexps cannot be smart enough for
quotes at the beginning or the end of the string. There, you must look
outside the string. Hence:
> ] 1. If it has a quote as its first or last position, check for
> ] objects before or after the string to guess its status. An
> ] object never starts with a white space, but you may have to
> ] check :post-blank property in order to know if previous object
> ] had white spaces at its end.
But you can only do that from the element containing the string, not
from the string itself.
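
A minimal sketch of that neighbor check, written in Python over a toy data model rather than the real Org parse tree. The assumptions here are illustrative only: contents are a list whose items are either plain strings or dicts standing in for objects, `post_blank` mirrors the :post-blank property, and the names `blank_before`, `blank_after`, `classify` and `edge_quote_status` are hypothetical.

```python
def blank_before(contents, i):
    """True if the string at contents[i] is preceded by white space or by nothing."""
    if i == 0:
        return True
    prev = contents[i - 1]
    if isinstance(prev, str):
        return prev[-1:].isspace()
    # Previous sibling is an object: its trailing blanks live in post_blank.
    return prev.get("post_blank", 0) > 0


def blank_after(contents, i):
    """True if the string at contents[i] is followed by white space or by nothing."""
    if i == len(contents) - 1:
        return True
    nxt = contents[i + 1]
    if isinstance(nxt, str):
        return nxt[:1].isspace()
    # An object never starts with white space.
    return False


def classify(before_blank, after_blank):
    """Guess a quote's status from the blanks around it."""
    if before_blank and not after_blank:
        return "open"
    if after_blank and not before_blank:
        return "close"
    return "ambiguous"


def edge_quote_status(contents, i):
    """Classify a double quote sitting at the first or last position of contents[i]."""
    s = contents[i]
    result = {}
    if s[:1] == '"':
        after = s[1:2].isspace() if len(s) > 1 else blank_after(contents, i)
        result["first"] = classify(blank_before(contents, i), after)
    if len(s) > 1 and s[-1] == '"':
        result["last"] = classify(s[-2].isspace(), blank_after(contents, i))
    return result


contents = ['She wrote "', {"type": "italic", "post_blank": 0}, '" twice.']
print(edge_quote_status(contents, 0))  # {'last': 'open'}
print(edge_quote_status(contents, 2))  # {'first': 'close'}
```

Note that both helpers take the whole contents list plus an index: the string alone does not carry the information needed for the guess.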
> So the issue with the current state is that it
> would wind up applying to too much? (it would hit code and verbatim elements,
> for example, and that would be wrong.)
No, you are not applying it too much (verbatim elements don't contain
plain-text objects) but your function hasn't got access to enough
information to be useful.
> So it remains to find the right place in the processing to put
> a function like the one you describe. I'm trying to get a proper
> understanding of the code structure to see what you mean. Looks like
> it should be something like a transcoder, only called on
> everything...
Transcoders are type specific, so that's not an option.
> wait, called on the top-level parsed tree object, recursively doing
> its thing before(?) the transcoders of the individual objects get to
> it.
That's called a parse tree filter. That should be a possibility
indeed. The function would be applied on the parse tree and would
replace strings within elements containing plain text (that is,
paragraph, verse-block and table-row types). Parse tree filters are
applied very early in the export process.
Another option would be to integrate it into
`org-element-normalize-contents', but I think the previous way is
better.
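
As a rough illustration of the parse tree filter idea, here is a Python sketch over a toy tree, not the actual Org export machinery. The assumptions are illustrative only: nodes are dicts with a `type` and an optional `contents` list, strings appear directly in `contents`, and `filter_tree` and `TEXT_CONTAINERS` are hypothetical names. The transform receives the sibling list and an index, so it can look outside the string it is rewriting.

```python
# Element types that contain plain text, per the message above.
TEXT_CONTAINERS = {"paragraph", "verse-block", "table-row"}


def filter_tree(node, transform, inside_text=False):
    """Return a copy of node with transform applied to strings in text containers.

    transform(contents, i) gets the sibling list and the string's index.
    """
    if "contents" not in node:
        # Leaf such as verbatim or code: it keeps its value untouched.
        return dict(node)
    inside = inside_text or node["type"] in TEXT_CONTAINERS
    contents = node["contents"]
    new_contents = []
    for i, child in enumerate(contents):
        if isinstance(child, str):
            new_contents.append(transform(contents, i) if inside else child)
        else:
            new_contents.append(filter_tree(child, transform, inside))
    return {**node, "contents": new_contents}


tree = {
    "type": "section",
    "contents": [
        {
            "type": "paragraph",
            "contents": ['say "hi" ', {"type": "verbatim", "value": '"raw"'}, " now"],
        },
        {"type": "src-block", "value": 'print("x")'},
    ],
}

# Placeholder transform: a real one would choose opening or closing quotes.
result = filter_tree(tree, lambda contents, i: contents[i].replace('"', "'"))
print(result["contents"][0]["contents"][0])
```

Because verbatim and source blocks carry their text in a value rather than in contents, the walk never reaches it, which matches the point that the function is not applied too widely.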
> The on-screen one would still use the plain-string computation, as you said,
> since the full parse isn't available.
Yes.
> It would also need to be tweaked not to act on verbatim/comment text,
> etc.
Yes. You may want to use `org-element-at-point' and `org-element-type'
to tell if you're somewhere smart quotes are allowed (in table,
table-row, paragraph, verse-block elements).
Regards,
--
Nicolas Goaziou