From mboxrd@z Thu Jan  1 00:00:00 1970
From: Nicolas Goaziou
Subject: Re: Bug: unconverted dashes in HTML export
Date: Sat, 08 Feb 2014 10:29:54 +0100
Message-ID: <87ha8aymst.fsf@gmail.com>
References: <87d2iykpwo.fsf@azha.ziiuu.com>
In-Reply-To: <87d2iykpwo.fsf@azha.ziiuu.com> (Thomas Morgan's message of "Sat, 01 Feb 2014 17:29:17 -0500")
List-Id: "General discussions about Org-mode."
To: Thomas Morgan
Cc: emacs-orgmode@gnu.org

Hello,

Thomas Morgan writes:

> I started Emacs with `emacs -Q -l setup.el test-case.org', then typed
> `C-c C-e h o' to export to HTML and open the result.  The setup file
> (`setup.el'), test case (`test-case.org'), HTML output (`lose.html'),
> and a PDF printed by the web browser (`lose.pdf'), are attached.
>
> The test case contains a one-cell table with three hyphens (`---').
> I expected this to be converted to an em-dash in the HTML output,
> but it remained three hyphens.

Indeed.

> A patch fixing the problem is attached, along with the HTML and PDF
> produced after the patch was applied (`win.html', `win.pdf').
>
> I started preparing this report last May (sorry for the delay)
> but just confirmed the bug again with Org-mode version 8.2.5g
> (`release_8.2.5g-663-g24a213' @ `/src/org-mode/lisp/') and GNU Emacs
> 24.3.1 (`x86_64-unknown-linux-gnu', X toolkit, Xaw3d scroll bars)
> of 2013-09-24.

Thank you for the patch.  A few remarks below.

>>From bd14cdce80a610a5eadbf563ac12472fbed542a5 Mon Sep 17 00:00:00 2001
> From: Thomas Morgan
> Date: Mon, 13 May 2013 11:06:52 +0200
> Subject: [PATCH] Convert dashes in HTML export even when at end of string.
>
> * lisp/ox-html.el (org-html-special-string-regexps): Convert dashes
> even when at end of string.

You need to add TINYCHANGE at the end of the commit message.

> -    ("---\\([^-]\\)" . "—\\1")	; mdash
> -    ("--\\([^-]\\)" . "–\\1")	; ndash
> +    ("---\\([^-]?\\)" . "—\\1")	; mdash
> +    ("--\\([^-]?\\)" . "–\\1")	; ndash

The new regexps still don't look right, as they can match an additional
dash:

  (string-match "---\\([^-]?\\)" "----") => 0

I'm not sure about the intent of this regexp, that is whether
consecutive mdashes or ndashes are allowed or not.  A correct version
could be either:

  ("---" . "—")

or

  ("\\([^-]\\|^\\)---\\([^-]\\|$\\)" . "\\1—\\2")

I think the former is on par with LaTeX behaviour.  What do you think?


Regards,

-- 
Nicolas Goaziou
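
For reference, the matching behaviour of the four mdash regexps discussed
above can be checked directly in a scratch buffer.  This is only a sketch:
`my/dash-match' is a hypothetical helper, not part of ox-html.el, and the
sample strings "a---" and "----" are made up for illustration.

```elisp
;; Hypothetical helper (not from ox-html.el): return the text that REGEXP
;; matches in STRING, or nil when there is no match.
(defun my/dash-match (regexp string)
  (when (string-match regexp string)
    (match-string 0 string)))

;; Original rule: requires a non-dash character after the dashes,
;; so three dashes at the end of a string are not matched.
(my/dash-match "---\\([^-]\\)" "a---")                     ; => nil

;; Patched rule: the trailing character is optional, so the end of
;; string case now matches...
(my/dash-match "---\\([^-]?\\)" "a---")                    ; => "---"
;; ...but the first three of four consecutive dashes match as well.
(my/dash-match "---\\([^-]?\\)" "----")                    ; => "---"

;; First suggested alternative: plain three dashes, wherever they occur.
(my/dash-match "---" "----")                               ; => "---"

;; Second suggested alternative: exactly three dashes, delimited on both
;; sides by a non-dash character or a string/line boundary.
(my/dash-match "\\([^-]\\|^\\)---\\([^-]\\|$\\)" "----")   ; => nil
(my/dash-match "\\([^-]\\|^\\)---\\([^-]\\|$\\)" "a---")   ; => "a---"
```

Note that the second alternative also consumes the delimiting characters,
which is why its replacement has to put them back with \\1 and \\2.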