From: "François Pinard" <pinard@iro.umontreal.ca>
To: emacs-orgmode@gnu.org
Subject: Python code for producing Org tables
Date: Tue, 18 Dec 2012 22:43:08 -0500
Message-ID: <8638z2yaer.fsf@mercure.progiciels-bpi.ca>

Hi, Org people.

I recently needed to produce Org tables from within Python, splitting
them as needed to fit within a preset width.  I append the code after my
signature, in case it is useful to others (or in case you have ideas
for improving it).

One thing I only realized after writing this, however, is that I wrongly
thought Org mode aligned floating-point numbers on the decimal point,
while it merely right-aligns them regardless of the position of the
point.  So, I turn my mistake into a suggestion, as I think it would be
more convenient if Org mode aligned floating-point numbers on the
decimal point.
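
To make the difference concrete, here is a minimal Python 3 sketch,
separate from the code appended after the signature.  The helper names
right_align and decimal_align are made up for this illustration; the
first pads cells the way a plain right alignment does, the second lines
up the decimal points.

```python
def right_align(cells):
    """Pad every cell on the left to the widest cell (plain right alignment)."""
    width = max(len(cell) for cell in cells)
    return [cell.rjust(width) for cell in cells]


def decimal_align(cells):
    """Pad cells so that decimal points (or the ends of integers) line up."""
    # Split each number into its integral part and its fraction (dot included).
    parts = []
    for cell in cells:
        integral, dot, fraction = cell.partition('.')
        parts.append((integral, dot + fraction))
    # Widest integral part and widest fraction part over the whole column.
    left = max(len(integral) for integral, _ in parts)
    right = max(len(fraction) for _, fraction in parts)
    return [integral.rjust(left) + fraction.ljust(right)
            for integral, fraction in parts]


numbers = ['3.14159', '42', '0.5']
for line in right_align(numbers):
    print('| %s |' % line)
print()
for line in decimal_align(numbers):
    print('| %s |' % line)
```

With plain right alignment the decimal points land in different
columns, while the second form keeps them stacked, which is easier to
scan when magnitudes differ.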

I even thought of trying to contribute some Emacs Lisp code to do so, but
seeing that I'm short on free time these days (as happens too often), I
find it more fruitful to simply share the idea (and the Python code) now.

François



# -*- coding: utf-8 -*-
# Note: this code targets Python 2 (it relies on unicode and unichr).
import re


def to_org(titles, rows, write, hide_empty=False, margin=0, easy=4, span=1,
           fillto=None, limit=None):
    """\
Given a list of column TITLES, and a list of ROWS, each containing a list
of columns, use WRITE to produce an Org formatted table with the text of
columns. If HIDE_EMPTY is not False, then omit columns containing nothing but
empty strings.  The formatted table is shifted right by MARGIN columns.

To accommodate titles, the width of a column will easily extend to EASY,
or to whatever is needed so the title is not split over more than SPAN lines.
FILLTO may be used to force last column to extend until that position.  LIMIT
may be used to impose a limit to the number of characters in produced lines.

If TITLES is None, the titles are not produced.

Columns containing only numbers (integer or floating) align them properly.
"""

    # Exit if nothing to display.
    if not rows:
        return

    # Compute widths from data.
    rows = [[safe_unicode(column, 30).replace('\\', '\\\\')
             .replace('|', '\\vert{}') for column in row]
             for row in rows]
    # Each WIDTH is the width of the column when cells are taken as
    # strings.  Each LEFT is either False when there is a non-number in
    # the column, or the maximum width of the integral part of all
    # numbers.  When LEFT is not False, each RIGHT is either the maximum
    # width of the fraction part, including the decimal point, of all
    # numbers, or 0 if all are pure integers.
    widths = [0] * len(rows[0])
    lefts = [0] * len(rows[0])
    rights = [0] * len(rows[0])
    for row in rows:
        for counter, cell in enumerate(row):
            widths[counter] = max(widths[counter], len(cell))
            if lefts[counter] is not False:
                # The fraction part is optional, so pure integers match too.
                match = re.match('([0-9]*)(\\.[0-9]*)?$', cell)
                if match is None:
                    lefts[counter] = False
                else:
                    lefts[counter] = max(lefts[counter],
                                         len(match.group(1)))
                    if match.group(2):
                        rights[counter] = max(rights[counter],
                                              len(match.group(2)))
    for counter, (left, right) in enumerate(zip(lefts, rights)):
        if left == 0 and right == 0:
            lefts[counter] = False
        elif left is not False:
            widths[counter] = left + right

    # Extend widths as needed to make room for titles.
    if titles is not None:
        for counter, (width, title) in enumerate(zip(widths, titles)):
            if (not hide_empty or width) and len(title) > width:
                if len(title) <= easy:
                    widths[counter] = len(title)
                else:
                    for nlines in range(2, span):
                        if len(title) <= easy * nlines:
                            widths[counter] = max(
                                width, (len(title) + nlines - 1) // nlines)
                            break
                    else:
                        widths[counter] = max(
                            width, (len(title) + span - 1) // span)
    if fillto:
        extend = fillto - margin - sum(widths) - 3 * len(widths) - 1
        if extend > 0:
            widths[-1] += extend

    # Horizontally split the display so each part fits within LIMIT columns.
    end = 0
    while end < len(widths):
        start = end
        if limit is None:
            end = len(widths)
        else:
            remaining = limit - margin - widths[start] - 4
            end = start + 1
            while end < len(widths) and remaining >= widths[end] + 3:
                remaining -= widths[end] + 3
                end += 1
        # Now ready to output columns from START to END (excluded).

        # Skip this part if nothing to display.
        if hide_empty:
            for width in widths[start:end]:
                if width:
                    break
            else:
                continue
        if start > 0:
            write('\n')

        if titles is not None:
            # Write title lines, splitting titles as needed.
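            # Keep the pairs in a list, as they get iterated once per
            # title line (a bare zip would be exhausted under Python 3).
            pairs = list(zip(widths[start:end], titles[start:end]))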
            for counter in range(span):
                fragments = []
                inked = False
                for width, title in pairs:
                    if not hide_empty or width:
                        fragment = title[counter * width:(counter + 1) * width]
                        if fragment:
                            inked = True
                        fragments.append(
                            fragment.replace('|', ' ').lstrip().ljust(width))
                if not inked:
                    break
                write('%s| %s |\n' % (' ' * margin,
                                      ' | '.join(fragments)))

            # Write separator line.
            fragments = []
            for width in widths[start:end]:
                if not hide_empty or width:
                    fragments.append('-' * width)
            write('%s|-%s-|\n' % (' ' * margin, '-+-'.join(fragments)))

        # Write body lines.
        for row in rows:
            fragments = []
            for width, left, cell in zip(
                    widths[start:end], lefts[start:end], row[start:end]):
                if not hide_empty or width:
                    if left is False:
                        text = cell.replace('|', ' ').lstrip()
                    else:
                        position = cell.find('.')
                        if position < 0:
                            position = len(cell)
                        text = ' ' * (left - position) + cell
                    fragments.append(text.ljust(width))
            write('%s| %s |\n' % (' ' * margin,
                                  ' | '.join(fragments)))


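# Encoding assumed for byte strings.  The original post does not define
# ENCODING, so UTF-8 here is only a guess, to be adjusted as needed.
encoding = 'utf-8'

unprintable_regexp = re.compile(
    '[%s]' % re.escape(''.join(map(unichr, range(0, 32) + range(127, 160)))))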


def safe_unicode(value, limit=None):
    if value is None:
        return ''
    if isinstance(value, str):
        try:
            value = unicode(value, encoding)
        except UnicodeDecodeError:
            # FIXME: Too fishy!
            value = unicode(value, 'iso-8859-1')
    elif not isinstance(value, unicode):
        # FIXME: It seems that Rpy2 output does not tolerate ", encoding"?
        value = unicode(value)
    if re.search(unprintable_regexp, value):
        value = repr(value)
        if value.startswith('u'):
            value = value[1:]
    if limit is not None and len(value) > limit:
        left_cut = limit * 2 // 3
        right_cut = limit - left_cut
        return value[:left_cut - 1] + u'…' + value[-right_cut:]
    return value
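
The horizontal splitting step of to_org can also be looked at in
isolation.  The following is a minimal Python 3 sketch of the same
greedy arithmetic; the function name split_columns and its return value
(a list of index pairs) are made up for this illustration and are not
part of the code above.  It assumes, as to_org does, that the first
column of a part costs its width plus 4 characters of borders, and each
further column its width plus 3.

```python
def split_columns(widths, limit, margin=0):
    """Return (start, end) index pairs, END excluded, one pair per part."""
    parts = []
    end = 0
    while end < len(widths):
        start = end
        # The first column of a part costs its width plus "| " and " |".
        remaining = limit - margin - widths[start] - 4
        end = start + 1
        # Each further column costs its width plus the " | " separator.
        while end < len(widths) and remaining >= widths[end] + 3:
            remaining -= widths[end] + 3
            end += 1
        parts.append((start, end))
    return parts


print(split_columns([10, 10, 10, 10], limit=30))
```

Note that a part always receives at least one column, so a single
column wider than LIMIT still gets printed on its own and then exceeds
the limit.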
