From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yasushi SHOJI
Subject: [RFC] ox-ascii.el: fixing variable width character handling
Date: Sun, 10 Nov 2013 19:40:21 +0900
Message-ID: <87zjpcsfoq.wl@dns1.atmark-techno.com>
Mime-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=US-ASCII
List-Id: "General discussions about Org-mode."
To: emacs-orgmode@gnu.org

Hi,

I've been trying to fix the ASCII export back-end for variable width chars.
It is basically a matter of replacing `length' with `string-width', but
the behavior of those two functions differs when you give nil as an
argument; `length' returns 0, while `string-width' signals a type error:

  "eval: Wrong type argument: stringp, nil"

While I came up with the following experimental patch, I have a few
questions:

 - What is the Lisp idiom for handling a type error?  In the following
   patch, I've created a new function `org-ascii--string-width'; it's a
   thin wrapper with an if-then-else.  I thought placing an
   if-then-else clause at each call site would clutter the code.

 - If a wrapped string-width is a good idea, would it also be a good
   idea to merge it with `org-string-width' in org.el?  It is not
   ox-ascii specific, and the current `org-string-width' does not
   handle nil either.

 - The width of the underline for headlines should be computed with
   string-width.  As defined in Unicode Standard Annex #41[1],
   character width is environment dependent, so calculating character
   width at export time might not be sufficient.  We could check
   whether the exported document contains more CJK chars than some
   threshold, but I haven't gone that far.

 - Does anyone using ox-ascii.el depend on the clipped marker `=>' for
   fixed width columns?  I thought `...' would be much more readable.

 - BTW, while looking at table handling, I noticed that fixed column
   width doesn't work with the code at the current git HEAD.  That's
   because the width calculation mixes `length' and `string-width',
   and so passes out-of-range arguments to `add-text-properties'.

[1]: http://www.unicode.org/reports/tr41/

diff --git a/lisp/ox-ascii.el b/lisp/ox-ascii.el
index 8e75007..35d58fc 100644
--- a/lisp/ox-ascii.el
+++ b/lisp/ox-ascii.el
@@ -630,7 +630,8 @@ possible. It doesn't apply to `inlinetask' elements."
                                org-ascii-underline)))))
               (and under-char
                    (concat "\n"
-                           (make-string (length first-part) under-char))))))))
+                           (make-string (/ (string-width first-part) (char-width under-char))
+                                        under-char))))))))
 
 (defun org-ascii--has-caption-p (element info)
   "Non-nil when ELEMENT has a caption affiliated keyword.
@@ -1704,7 +1705,7 @@ are ignored."
               (org-element-map table 'table-row
                 (lambda (row)
                   (setq max-width
-                        (max (length
+                        (max (string-width
                               (org-export-data
                                (org-element-contents
                                 (elt (org-element-contents row) col))
@@ -1714,6 +1715,11 @@ are ignored."
                              max-width))
               cache))))
 
+(defun org-ascii--string-width (str)
+  (if str
+      (string-width str)
+    0))
+
 (defun org-ascii-table-cell (table-cell contents info)
   "Transcode a TABLE-CELL object from Org to ASCII.
 CONTENTS is the cell contents. INFO is a plist used as
@@ -1724,16 +1730,18 @@ a communication channel."
   ;; each cell in the column.
   (let ((width (org-ascii--table-cell-width table-cell info)))
     ;; When contents are too large, truncate them.
-    (unless (or org-ascii-table-widen-columns (<= (length contents) width))
-      (setq contents (concat (substring contents 0 (- width 2)) "=>")))
+    (unless (or org-ascii-table-widen-columns
+                (<= (org-ascii--string-width contents) width))
+      (setq contents (truncate-string-to-width contents width nil ?. t)))
     ;; Align contents correctly within the cell.
     (let* ((indent-tabs-mode nil)
            (data
             (when contents
               (org-ascii--justify-string
                contents width
-               (org-export-table-cell-alignment table-cell info)))))
-      (setq contents (concat data (make-string (- width (length data)) ? ))))
+               (org-export-table-cell-alignment table-cell info))))
+           (trailing-space (make-string (- width (org-ascii--string-width data)) ? )))
+      (setq contents (concat data trailing-space)))
     ;; Return cell.
     (concat (format " %s " contents)
             (when (memq 'right (org-export-table-cell-borders table-cell info))
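
For experimenting outside of ox-ascii.el, the three ideas in the patch
(a nil-safe width, an underline sized by display columns, and
truncation by display columns) can be tried in isolation.  The
following is only a minimal sketch: the `demo-' function names are
hypothetical and exist neither in the patch nor in Org, and it passes
an explicit "..." string as the ellipsis, which is an assumption made
here so the result does not depend on the user's default ellipsis (the
patch itself passes t).

```elisp
;; Hypothetical helpers for illustration only; not part of ox-ascii.el.

(defun demo-string-width (str)
  "Return the display width of STR, or 0 when STR is not a string.
This mirrors what `length' does for nil, which `string-width' does not."
  (if (stringp str)
      (string-width str)
    0))

(defun demo-underline (title under-char)
  "Return a string of UNDER-CHAR as wide on screen as TITLE.
Dividing by `char-width' keeps the underline from overshooting when
UNDER-CHAR itself occupies more than one column."
  (make-string (/ (demo-string-width title) (char-width under-char))
               under-char))

(defun demo-truncate-cell (contents width)
  "Return CONTENTS clipped to WIDTH display columns.
Contents that already fit (including nil) are returned unchanged.
The explicit \"...\" ellipsis is an assumption of this sketch."
  (if (<= (demo-string-width contents) width)
      contents
    (truncate-string-to-width contents width nil nil "...")))
```

With ASCII input these behave like the `length'-based code; for
example, (demo-truncate-cell "abcdefghij" 6) gives "abc...".  The
difference only shows up with wide characters, where `string-width'
usually reports more columns than `length' reports characters.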