From mboxrd@z Thu Jan 1 00:00:00 1970
From: Carsten Dominik
Subject: Re: [bug] org-agenda-write does not handle date stamps without day of week
Date: Mon, 19 Mar 2012 10:12:01 +0100
Message-ID: <30E4D375-690C-4798-B341-51268082B826@gmail.com>
References: <2012-03-05T15-36-35@devnull.Karl-Voit.at> <2012-03-16T17-12-15@devnull.Karl-Voit.at> <5033.1331920331@alphaville> <2012-03-16T19-40-36@devnull.Karl-Voit.at> <4859.1331966740@alphaville>
Mime-Version: 1.0 (Apple Message framework v1084)
Content-Type: multipart/mixed; boundary=Apple-Mail-23--493680368
Return-path:
Received: from eggs.gnu.org ([208.118.235.92]:35359) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S9YdP-0000md-7i for emacs-orgmode@gnu.org; Mon, 19 Mar 2012 05:12:17 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S9YdM-0007I2-4T for emacs-orgmode@gnu.org; Mon, 19 Mar 2012 05:12:10 -0400
Received: from mail-ee0-f41.google.com ([74.125.83.41]:35925) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S9YdL-0007Fn-OF for emacs-orgmode@gnu.org; Mon, 19 Mar 2012 05:12:08 -0400
Received: by eeke53 with SMTP id e53so2929070eek.0 for ; Mon, 19 Mar 2012 02:12:04 -0700 (PDT)
In-Reply-To: <4859.1331966740@alphaville>
List-Id: "General discussions about Org-mode."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
To: nicholas.dokos@hp.com
Cc: news1142@Karl-Voit.at, emacs-orgmode@gnu.org

--Apple-Mail-23--493680368
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=us-ascii

On 17.3.2012, at 07:45, Nick Dokos wrote:

> Karl Voit wrote:
>
>> * Nick Dokos wrote:
>>> Karl Voit wrote:
>>>
>>> For me, it was a "no time to work on org - stash it"...
>>
>> OK.
>> I just wanted to make sure that it *is* on someone's todo list
>> :-)
>>
>>>> * Karl Voit wrote:
>>>>>
>>>>> * <2012-03-05 08:00-09:00> Wrong: ends up as full day event
>>>
>>> org-agenda-write calls org-export-icalendar which calls org-print-icalendar-entries
>>> which loops over all the entries and parses them, decomposing them into timestamps.
>>> Each timestamp is then passed to org-parse-time-string. It's this one that cannot
>>> handle non-standard formats: it uses a regexp and assumes that all the matched parts
>>> are going to be in fixed places:
>>>
>>> As to how to fix it, there are several possibilities:
>>>
>>> 1. fix your scripts that produce time stamps to include day-of-week.
>>
>> Sorry, deriving DOW from an arbitrary timestamps from arbitrary data
>> sources is either pretty time consuming (calendar calculations) or
>> simply hard to calculate.
>>
>> Outside Org-mode, DOW is seldom part of time-stamp data :-(
>>
>>> 2. change the callers of org-parse-time-string to make sure that DOW is included.
>>> There are roughly three dozen callers, so 2. is possible but a pain.
>>
>> Ack.
>>
>>> 3. change just one caller: org-print-icalendar-entries to make sure that DOW is included.
>>> 3. is simple but ugly as sin,
>>
>> Ouch, ack :-)
>>
>>> 4. change org-parse-time-string to handle a missing DOW.
>>> 4. is the best way to handle it within org.
>>
>> Full ack.
>>
>>> I vote for 1. where *you* have to do all the work ;-)
>>
>> YMMD :-)
>>
>> If my brain would be compatible to ELISP, I'd send a patch.
>> Promised.
>>
>> But I'll take my chance and wait for someone else (you?)
>> implementing 4. to resolve this issue for everybody. I really
>> appreciate every second you guys invest in maintaining Org-mode!
>>
>
> I don't know about you, but whenever I engage in hand-to-hand combat
> with a complicated regexp, I come out bruised, muddied and a lot worse
> for wear.
> In any case, I'm attaching an org file with my investigations.
> It contains a description and a code block for testing.
>
> I hope that the attachment will come through unscathed: it contains
> regexps, and munging a regexp that looks like hen scratchings in the
> first place through uncooperative mailers is not something to be
> relished.
>
> BTW, I'm not advocating a change: I'll leave it to Karl to do that if he
> really wants to and to the maintainers to decide whether it's worth
> doing. But it can be done (more or less). And maybe somebody will come
> up with a better way than the proof-of-concept that I'm attaching here.

Hi Nick and Karl,

since we did make a change to Org a while ago to allow date stamps
without the name of the day, I think it is only consistent to also do it
for this case. It must have slipped our attention back then.

The only thing we must ensure is that this regexp matches fast, as it is
used a lot.

Nick's proposal works, except that it also matches when the time is
directly attached to the day name. Maybe it is cleaner not to match in
this case. If we are going to make the day name optional, then it is
better to include the matching of the whitespace after the date in the
day-name part of the regexp. I am attaching Nick's file again, with a
third proposal for an updated regexp.

As far as speed is concerned, this regexp will, if both day name and
time are present, match directly and straight. If the day name is
missing, it will notice on the first digit belonging to the time and
switch without backtracking (well, minimal backtracking when there are
multiple spaces) to the regexp section for the time of day. Rematching
the spaces after the date will be the only overhead.
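
To make the three cases concrete, here is a minimal sketch that runs the
third regexp against one stamp of each kind and reports which groups took
part in the match. The names my/ts-re and my/ts-parts are hypothetical
helpers for illustration only, not part of Org; the regexp string is
copied from the attached file.

```elisp
;; Sketch only: `my/ts-re' and `my/ts-parts' are hypothetical names,
;; not part of Org.  The regexp is the third proposal from the attachment.
(defconst my/ts-re
  (concat "\\(\\([0-9]\\{4\\}\\)-\\([0-9]\\{2\\}\\)-\\([0-9]\\{2\\}\\)"
          ;; Group 5: optional day name, including its leading spaces.
          "\\( +[^]+0-9>\r\n -]+\\)?"
          ;; Group 6: optional time of day; groups 7 and 8 are hour, minutes.
          "\\( +\\([0-9]\\{1,2\\}\\):\\([0-9]\\{2\\}\\)\\)?\\)")
  "Copy of the third proposed value for `org-ts-regexp0'.")

(defun my/ts-parts (ts)
  "Return (DATE DOW HOUR MINUTES) as matched in TS.
Parts that did not take part in the match are nil."
  (when (string-match my/ts-re ts)
    (list (substring ts (match-beginning 2) (match-end 4))
          (match-string 5 ts)
          (match-string 7 ts)
          (match-string 8 ts))))

(dolist (ts '("<2012-03-05 Mon 08:00>"   ; day name and time
              "<2012-03-05 08:00>"       ; no day name
              "<2012-03-05 Mon08:00>"))  ; time glued to the day name
  (message "%-24s -> %S" ts (my/ts-parts ts)))
```

If I read the regexp correctly, the first stamp yields both a day name
and a time, the second yields a time but no day name, and the third
yields a day name but no time, because the time group insists on at
least one space in front of the hour.
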
Cheers

- Carsten

--Apple-Mail-23--493680368
Content-Disposition: attachment; filename=reproducer.org
Content-Type: application/octet-stream; name="reproducer.org"
Content-Transfer-Encoding: 7bit

* Modify org-ts-regexp0 to satisfy Karl Voit :-)

This is an attempt to allow timestamps without a day-of-week. Modifying
complicated regexps is no fun: I had to resort to re-builder to figure
out what was going on. The problem was that the first " *", which is
supposed to eat all the spaces between the "2012-03-05" part and the
"Mon" part, would, in the absence of the "Mon" part, eat the space
before the time spec "08:00". But the regexp for the time part
explicitly requires that space (see the second sp in the explanatory
line below).

So I tried making the DOW part optional by appending a ?, but that alone
didn't do the trick. re-builder showed the space problem: there was no
space left to match if DOW is absent, so I changed the literal space " "
to 0 or more spaces " *", and that did the trick (with some caveats
noted below).

In the code block below, tslist is a list of test cases. The code block
sets org-ts-regexp0 first to its original value, then to the changed
value (the first one is there just for comparison between the two). I've
marked the changes (of which the second is the significant one). We map
the function org-parse-time-string over the test cases and get a list of
results. The results look reasonable, but the new regexp allows
funny-looking input like the second and fourth test cases.

One additional caveat is that org-odt uses org-ts-regexp0, so changing
it without checking org-odt's use is fraught with peril, but that's what
I've done: I have no idea whether it causes problems in org-odt.

#+BEGIN_SRC elisp
;                           year_______    -   month______    -   day________    sp   DOW____________      sp   hour_________    :   minutes____
(setq org-ts-regexp0 "\\(\\([0-9]\\{4\\}\\)-\\([0-9]\\{2\\}\\)-\\([0-9]\\{2\\}\\) *\\([^]+0-9>\r\n -]*\\)\\( \\([0-9]\\{1,2\\}\\):\\([0-9]\\{2\\}\\)\\)?\\)")
; changed: "?" appended to the DOW group, and the " " before the hour relaxed to " *"
(setq org-ts-regexp0 "\\(\\([0-9]\\{4\\}\\)-\\([0-9]\\{2\\}\\)-\\([0-9]\\{2\\}\\) *\\([^]+0-9>\r\n -]*\\)?\\( *\\([0-9]\\{1,2\\}\\):\\([0-9]\\{2\\}\\)\\)?\\)")
; New version by Carsten: the leading " +" moves into the DOW group, which is optional but non-empty; " +" is required before the hour
(setq org-ts-regexp0 "\\(\\([0-9]\\{4\\}\\)-\\([0-9]\\{2\\}\\)-\\([0-9]\\{2\\}\\)\\( +[^]+0-9>\r\n -]+\\)?\\( +\\([0-9]\\{1,2\\}\\):\\([0-9]\\{2\\}\\)\\)?\\)")

(setq tslist '(
"<2012-03-05 Mon 08:00>"
"<2012-03-05 Mon08:00>"
"<2012-03-05 Mon 8:00>"
"<2012-03-05 Mon8:00>"
"<2012-03-05 Mon 8:00>"
"<2012-03-05 Mon 8:00>"
"<2012-03-05 Mon >"
"<2012-03-05 Mon>"
"<2012-03-05 08:00>"
"<2012-03-05 8:00>"
"<2012-03-05 8:00>"
"<2012-03-05>"))

(pp (mapcar 'org-parse-time-string tslist))
#+END_SRC

#+RESULTS:
#+begin_example
((0 0 8 5 3 2012 nil nil nil)
 (0 0 8 5 3 2012 nil nil nil)
 (0 0 8 5 3 2012 nil nil nil)
 (0 0 8 5 3 2012 nil nil nil)
 (0 0 8 5 3 2012 nil nil nil)
 (0 0 8 5 3 2012 nil nil nil)
 (0 0 0 5 3 2012 nil nil nil)
 (0 0 0 5 3 2012 nil nil nil)
 (0 0 8 5 3 2012 nil nil nil)
 (0 0 8 5 3 2012 nil nil nil)
 (0 0 8 5 3 2012 nil nil nil)
 (0 0 0 5 3 2012 nil nil nil))
#+end_example

--Apple-Mail-23--493680368
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

>
> Nick
>
> PS. BTW, if you look at the attachment, it helps if you have a wide window,
> something like 165 characters wide.
>
>

--Apple-Mail-23--493680368--