From mboxrd@z Thu Jan 1 00:00:00 1970 From: Thierry Banel Subject: Re: table formula help... Date: Wed, 10 Dec 2014 23:55:13 +0100 Message-ID: <5488CF51.8000006@free.fr> References: <87ppbxfi5f.fsf@ericabrahamsen.net> <54838805.8090705@free.fr> <8761do6t01.fsf@ericabrahamsen.net> <5484CD5C.9070604@free.fr> <5487471D.5070807@free.fr> <5487793C.1080800@free.fr> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Return-path: Received: from eggs.gnu.org ([2001:4830:134:3::10]:49094) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XyqA9-0002wN-RR for emacs-orgmode@gnu.org; Wed, 10 Dec 2014 17:55:18 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XyqA8-0007SR-Jb for emacs-orgmode@gnu.org; Wed, 10 Dec 2014 17:55:17 -0500 Received: from smtp5-g21.free.fr ([2a01:e0c:1:1599::14]:16994) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XyqA8-0007S2-DJ for emacs-orgmode@gnu.org; Wed, 10 Dec 2014 17:55:16 -0500 Received: from [IPv6:2a01:e35:2e21:def0:c90f:b89b:d367:7611] (unknown [IPv6:2a01:e35:2e21:def0:c90f:b89b:d367:7611]) by smtp5-g21.free.fr (Postfix) with ESMTP id 06F7DD4805A for ; Wed, 10 Dec 2014 23:53:50 +0100 (CET) In-Reply-To: List-Id: "General discussions about Org-mode." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org To: Org Mode Le 10/12/2014 22:06, Michael Brand a =C3=A9crit : > Hi Thierry > > On Tue, Dec 9, 2014 at 11:35 PM, Thierry Banel w= rote: > >> On output, empty cells are generated when the aggregation function doe= s >> not have enough input. For instance, =3Dmean=3D needs at least one val= ue, >> otherwise a division by zero happens. > The above "not enough input" contradicts with "no input" from the > docstring of orgtbl-aggregate-apply-calc-1arg-function: > > Empty value is returned when all input values are empty. My mistake.Fixed to: "Empty value is returned when not enough non-empty input is available" Thanks > If this function would follow its docstring by having "(if (cdr vec)" > also for sum, min, max and prod then the user could benefit from > adding "E" and/or "N" or not in the mode string of the TBLFM: To further process the aggregations? > #+TBLNAME: test > | Item | Value_1 | Value_2 | > |------+---------+---------| > | a | 2 | | > | a | 2 | 2 | > | b | 2 | 2 | > | b | | 2 | > | c | | 2 | > | c | | 2 | > | d | | 1 | > | d | | -1 | > > #+BEGIN: aggregate :table test :cols ("Item" "sum(Value_1)" "sum(Value_= 2)") > | Item | sum(Value_1) | sum(Value_2) | | | | > |------+--------------+--------------+---+---+----| > | a | 4 | 2 | 3 | 3 | > | > | b | 2 | 4 | 3 | 3 | < | > | c | | 4 | 4 | 2 | NA | > | d | | 0 | 0 | 0 | NA | > #+TBLFM: $4 =3D vmean($2..$3) :: $5 =3D vmean($2..$3); EN :: $6 =3D if(= "$2" > =3D=3D "nan" || "$3" =3D=3D "nan", string("NA"), if($2 > $3, string(">"= ), > if($2 < $3, string("<"), string("eq")))); E > #+END > > The current orgaggregate for comparison: > > | Item | sum(Value_1) | sum(Value_2) | | | | > |------+--------------+--------------+---+---+----| > | a | 4 | 2 | 3 | 3 | > | > | b | 2 | 4 | 3 | 3 | < | > | c | 0 | 4 | 2 | 2 | < | > | d | 0 | 0 | 0 | 0 | eq | > > One could still get the current behavior by adding the column formula > ~$2 =3D if("$2" =3D=3D "nan", 0, $0); E~ etc. for the aggregated column= s. > > What do you think? > > Michael > Well... Many different topics here.First of all, empty input & empty output are unrelated. So let us take them one at a time. * Input In the spreadsheet formulas, we have those modifiers: - E =3D keep empty fields when counting input vector size. - N =3D replace non-numbers (including empties) by zero. The first version of orgaggregate behaved as thought it had the EN modifiers.The latest version behaves as thought it had no modifiers at al= l. For the shake of consistency, it would be nice to have specifiers in orgaggregate. E & N, of course, but also p7 (precision 7 digits), %.3 (three decimal places after dot), F (fraction), and so on. * Output If a function is able to compute a result, then it should give it. If for any reason it is not able to compute a result, it should tell it in some way. Right now, orgaggregate signals problems by leaving a blank output. But it could be anything else: nan, NA, #ERROR, vmean([]), 1/0, whatever. The =3Dsum=3D aggregation is always able to return a value, even for a ze= ro length input vector, in which case the sum is zero. On zero length vectors, =3Dprod=3D gives one. The =3Dmean=3D aggregation has a hard time telling what is the mean of a zero length input vector. Here is how the spreadsheet handles those cases (without modifiers): | | | | sum | mean | prod | |---+---+---+-----+-----------+------| | 1 | 2 | 3 | 6 | 2 | 6 | | | 2 | 3 | 5 | 1.6666667 | 6 | | | | 3 | 3 | 1 | 3 | | | | | 0 | 0 | 1 | <--- see #+TBLFM: $4=3Dvsum($1..$3)::$5=3Dvmean($1..$3)::$6=3Dvprod($1..$3) This is correct. Orgaggregate should behave in a similar way. Fortunately in its latest version it does. * Summary Modifiers are lacking in orgaggregate for it to be fully consistent with the spreadsheet. If someone knowns how to add them easily... Regards Thierry