emacs-orgmode@gnu.org archives
From: Kaushal Modi <kaushal.modi@gmail.com>
To: emacs-org list <emacs-orgmode@gnu.org>
Subject: Canonical way to strip off all markup from an element in Org exporter backend?
Date: Wed, 20 Dec 2017 18:30:20 +0000
Message-ID: <CAFyQvY2Aqz79+hcn+cMZn1GCN5BTEhEVmKOqYb7W3EWG_945hQ@mail.gmail.com>

Hello,

What's the canonical way to strip off all markup from an element in an Org
exporter backend?

I do it in this roundabout way in ox-hugo. It works but feels convoluted.
The trick is to remove all markup chars from an element while retaining the
*, /, `, etc. chars *not* used for any markup.

I export Org subtrees to individual posts, where the subtree headline will
become the post title. So I need to sanitize that headline of any markup.

Step 1: Get the HTMLized version of the title.

(org-export-data-with-backend (plist-get info :title) 'html info)

By getting the HTMLized version of the title, it becomes easy to strip
off the HTML tags, which are inserted basically for formatting (bold,
italics, etc.).

Step 2: Strip off the HTML tags.

(while (string-match "<\\(?1:[a-z]+\\)[^>]*>\\(?2:[^<]+\\)</\\1>" title)
  (setq title (replace-match "\\2" nil nil title)))
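
To try Step 2 in isolation, here is a minimal sketch that wraps the same
loop in a function. The name my/strip-simple-html-tags is hypothetical and
not part of ox-hugo. It also passes t as the FIXEDCASE argument of
replace-match so that the replacement text is never case-converted, which
is a small deviation from the snippet above.

```elisp
;; Hypothetical helper (not from ox-hugo): wraps the Step 2 loop so it
;; can be evaluated on any string without running an Org export.
(defun my/strip-simple-html-tags (html)
  "Return HTML with each simple <tag ...>text</tag> pair replaced by text."
  (let ((title html))
    ;; Each iteration unwraps one innermost element; nested elements
    ;; such as <b><i>x</i></b> are unwrapped from the inside out.
    (while (string-match "<\\(?1:[a-z]+\\)[^>]*>\\(?2:[^<]+\\)</\\1>" title)
      ;; FIXEDCASE is t so the text of group 2 is inserted unchanged.
      (setq title (replace-match "\\2" t nil title)))
    title))

(my/strip-simple-html-tags
 "<b>Bold</b> and <i>italic</i> with a literal * star")
;; => "Bold and italic with a literal * star"
```

Note that the regexp only matches paired tags whose names are purely
lowercase letters and whose contents are non-empty, so an unpaired <br>
or a tag such as <h2> would be left in place.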

If I use any other exporter, like md, I lose the ability to distinguish
a literal * in the title from a * meant for bold/italics markup in
Markdown. Even ascii is not good, because then I'd need to do some intensive
parsing to figure out whether a ` is a literal ` or part of `code'.

So the question: Is this the best way, or is there a canonical way to
export an element without any markup chars?

Full actual code[1].

[1]:
https://github.com/kaushalmodi/ox-hugo/blob/dffb7e970f33959a0b97fb8df267a54d01a98a2a/ox-hugo.el#L1769-L1802
-- 

Kaushal Modi


Thread overview: 7+ messages
2017-12-20 18:30 Kaushal Modi [this message]
2017-12-20 22:04 ` Canonical way to strip off all markup from an element in Org exporter backend? Nicolas Goaziou
2017-12-20 22:11   ` Kaushal Modi
2017-12-20 22:27     ` Nicolas Goaziou
2017-12-20 22:41       ` Kaushal Modi
2017-12-21 14:22         ` Nicolas Goaziou
2017-12-22 20:31           ` Kaushal Modi
