From: Carsten Dominik
Subject: Re: [PATCH] quote the real csv separator
Date: Sun, 24 Oct 2010 21:37:56 +0200
References: <87mxq4wman.fsf@kotik.lan> <57E157F5-BDE7-4431-B15F-8EE52E571677@gmail.com> <12945.1287939175@gamaville.dokosmarshall.org>
In-Reply-To: <12945.1287939175@gamaville.dokosmarshall.org>
List-Id: "General discussions about Org-mode."
To: nicholas.dokos@hp.com
Cc: Łukasz Stelmach, emacs-orgmode@gnu.org

On Oct 24, 2010, at 6:52 PM, Nick Dokos wrote:

> Carsten Dominik wrote:
>
>> Hi Lukasz,
>>
>> thanks for the patch, but I do not understand it.
>>
>> The separator for csv is always the comma, or am I wrong here?
>> So this function should use comma, hard-coded.  The only place
>> where it is used is when orgtbl-to-csv calls the generic
>> exporter.  It does so with comma as separator and with
>> org-quote-csv-field as formatting function.
>>
>> What use case do you have in mind?
>>
>> - Carsten
>>
>
> [This is *not* a comment on the patch itself, which I have not looked at
> carefully.]
>
> CSV started out simple and grew to be a monster (but it is still useful
> despite all that). It's not formally defined, so there are several
> variations, dialects and subdialects. Here e.g. is the description of
> the python module that handles CSV: it defines an "excel" dialect and an
> "excel_tab" subdialect, the latter using a TAB as a delimiter. If you
> want more details and have python installed, start it up, import csv and
> then say "help(csv)".

Thanks for this, useful info.  I remember when we implemented
csv that it seemed trivial and was not.

Org-mode does have a separate orgtbl-to-tsv which handles
tab-separated data.

>
> HTH,
> Nick
>
> ,----
> | NAME
> |     csv - CSV parsing and writing.
> |
> | FILE
> |     /usr/lib/python2.5/csv.py
> |
> | MODULE DOCS
> |     http://www.python.org/doc/current/lib/module-csv.html
> |
> | DESCRIPTION
> |     This module provides classes that assist in the reading and writing
> |     of Comma Separated Value (CSV) files, and implements the interface
> |     described by PEP 305.  Although many CSV files are simple to parse,
> |     the format is not formally defined by a stable specification and
> |     is subtle enough that parsing lines of a CSV file with something
> |     like line.split(",") is bound to fail.  The module supports three
> |     basic APIs: reading, writing, and registration of dialects.
> |
> |
> |     DIALECT REGISTRATION:
> |
> |     Readers and writers support a dialect argument, which is a convenient
> |     handle on a group of settings.  When the dialect argument is a string,
> |     it identifies one of the dialects previously registered with the module.
> |     If it is a class or instance, the attributes of the argument are used as
> |     the settings for the reader or writer:
> |
> |         class excel:
> |             delimiter = ','
> |             quotechar = '"'
> |             escapechar = None
> |             doublequote = True
> |             skipinitialspace = False
> |             lineterminator = '\r\n'
> |             quoting = QUOTE_MINIMAL
> |
> |     SETTINGS:
> |
> |         * quotechar - specifies a one-character string to use as the
> |             quoting character.  It defaults to '"'.
> |         * delimiter - specifies a one-character string to use as the
> |             field separator.  It defaults to ','.
> |         * skipinitialspace - specifies how to interpret whitespace which
> |             immediately follows a delimiter.  It defaults to False, which
> |             means that whitespace immediately following a delimiter is part
> |             of the following field.
> |         * lineterminator - specifies the character sequence which should
> |             terminate rows.
> |         * quoting - controls when quotes should be generated by the writer.
> |             It can take on any of the following module constants:
> |
> |             csv.QUOTE_MINIMAL means only when required, for example, when a
> |                 field contains either the quotechar or the delimiter
> |             csv.QUOTE_ALL means that quotes are always placed around fields.
> |             csv.QUOTE_NONNUMERIC means that quotes are always placed around
> |                 fields which do not parse as integers or floating point
> |                 numbers.
> |             csv.QUOTE_NONE means that quotes are never placed around fields.
> |         * escapechar - specifies a one-character string used to escape
> |             the delimiter when quoting is set to QUOTE_NONE.
> |         * doublequote - controls the handling of quotes inside fields.  When
> |             True, two consecutive quotes are interpreted as one during read,
> |             and when writing, each quote character embedded in the data is
> |             written as two quotes
> `----
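
For anyone who wants to see the dialect settings quoted above in action,
here is a minimal sketch, assuming Python 3 (the help text is from 2.5).
It writes the same rows with the registered "excel" and "excel-tab"
dialects and reads them back. The sample rows and the helper names dump
and load are made up for illustration; only the csv module calls come
from the standard library.

```python
import csv
import io

# Made-up sample data: one field holds both a quote character and a comma,
# another holds a TAB, so each dialect has something it must quote.
rows = [
    ["name", "note"],
    ["Ada", 'says "hi", then leaves'],
    ["Bob", "tab\there"],
]


def dump(rows, dialect):
    """Write rows with the named registered dialect and return the text."""
    buf = io.StringIO(newline="")  # keep the dialect's '\r\n' untouched
    csv.writer(buf, dialect=dialect).writerows(rows)
    return buf.getvalue()


def load(text, dialect):
    """Parse text with the named registered dialect into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline=""), dialect=dialect))


excel_text = dump(rows, "excel")      # delimiter ','
tab_text = dump(rows, "excel-tab")    # delimiter '\t'

# With QUOTE_MINIMAL and doublequote=True, the embedded quotes are doubled
# and the whole field is wrapped in quotes.
ada_line = excel_text.split("\r\n")[1]

# Naive splitting on the delimiter breaks the quoted field apart,
# which is the failure the module description warns about.
naive = ada_line.split(",")

if __name__ == "__main__":
    print(repr(excel_text))
    print(repr(tab_text))
    print(naive)
    print(load(excel_text, "excel") == rows)
    print(load(tab_text, "excel-tab") == rows)
```

Both dialects round-trip the data, while the naive split of the "Ada"
line yields three pieces instead of two fields.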