From mboxrd@z Thu Jan  1 00:00:00 1970
From: Markus Heller <hellerm2@gmail.com>
Subject: Re: [OT]: Search for missing :END:
Date: Mon, 21 Nov 2011 15:27:03 -0800
Message-ID: <0vr511xdiw.fsf@gmail.com>
References: <0vvcqdxqf0.fsf@gmail.com>
	<6557.1321911502@alphaville.americas.hpqcorp.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
List-Id: "General discussions about Org-mode." <emacs-orgmode.gnu.org>
To: emacs-orgmode@gnu.org

Nick Dokos <nicholas.dokos@hp.com> writes:

> Markus Heller <hellerm2@gmail.com> wrote:
>
>> Hello all,
>> 
>> I have an OT request that can hopefully be answered by emacs gurus in
>> less than a minute:
>> 
>> I'm looking for an emacs search expression that finds :PROPERTIES:
>> *without* a matching :END: ...
>> 
>
> If you mean a regexp, you are wasting your time[fn:1]. Regexps are
> powerful, but their range of applicability is limited to regular
> languages and even then, you have to worry about their efficiency. The
> above *is* a regular language: if P stands for :PROPERTIES: and E stands
> for :END:, then the regexp is
>
>     ([^EP]*P[^EP]*E)*
>
> In words, the stuff inside the parens says: 0 or more "other" things
> (non-P and non-E), followed by a P, followed by 0 or more "other"
> things, followed by an E. You can then have 0 or more of the
> parenthesized things. This will succeed on well formed "sentences" and
> fail on others.  But it might have to backtrack over the inner [^EP]*
> matches and then the outer matches, and rescan arbitrarily long
> stretches, which in the worst case, can turn your search into an
> exponentially slow descent into the abyss. You might be able to write
> non-greedy regexps that might behave better in this case. In most cases,
> you'd end up with a horrendous-looking regexp: good luck trying to
> understand it next week. That's my biggest problem with complicated regexps.
>
> However, a change of tool will simplify the problem enormously. E.g. here's
> a simple algorithm that can be used for this kind of problem:  start a
> nesting depth at 0 - when you see a P, increment the nesting depth by 1;
> when you see an E, decrement it by 1. If the nesting depth ever becomes
> something other than 0 or 1, you got a problem - also, if at EOF, the
> nesting depth is not 0, you got a problem. Easy variations of this will
> check well-formedness even when nesting *is* allowed.
>
> You can easily write such a program in any language you are familiar
> with (it does not have to be elisp, although you *can* write it in
> elisp - personally, I'd use awk).
>
> But assuming that you are getting some error from org, you don't know
> where the problem is and you are trying to find it, it will be simpler
> to just use egrep:
>
>     grep -E -n ':PROPERTIES:|:END:' foo.org
>
> will filter out the relevant lines, so all you have to do is scan the
> output by eye and spot any irregularity (consecutive :PROPERTIES: or
> consecutive :END: lines). Even if you have hundreds of them, that's
> *easy* for humans to do.[fn:2]
>
> Or, if you prefer, you can write trivial validation programs to operate
> on the output, e.g.:
>
>     grep -E -n ':PROPERTIES:|:END:' foo.org | tee foo.out | grep PROP | wc -l
>     grep END foo.out | wc -l
>
> (the counts had better be the same).
>
> or
>
> 	grep -E -n ':PROPERTIES:|:END:' foo.org | foo.awk
>
> where foo.awk implements the nesting depth algorithm above - something
> like this:
>
> #! /bin/bash
>
> awk '
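> # d is the nesting depth; $1 is the first field of the offending grep -n line.
> BEGIN          { d = 0; bad = 0 }
> /:PROPERTIES:/ { d++; if (d > 1) { print $1, d; bad = 1; exit } }
> /:END:/        { d--; if (d < 0) { print $1, d; bad = 1; exit } }
> # END also runs after exit, so skip the report if one was already printed.
> END            { if (!bad && d != 0) { print $1, d } }'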
>
>
> Even on Windoze, you can probably do all this stuff with cygwin.

Hi Nick,

Thanks for this informative reply.

Unfortunately, I cannot install cygwin on my work computer.  I'll have
to figure something else out ...

As for an example: in one of my org files, I press C-TAB and get the
following error:

OVERVIEW
CONTENTS...done
SHOW ALL
if: :END: line missing at position 18720
Quit
Mark set

Where is position 18720?  I apologize if this is a stupid question, but
I can't seem to figure this out ...
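
For reference, the number in that message is most likely a buffer
position, that is, a character offset counted from 1 at the start of
the buffer, so inside Emacs M-x goto-char RET 18720 RET should jump
straight to it.  Outside Emacs, a rough sketch in the spirit of the awk
approach quoted above can translate such an offset into a line number.
The function name pos_to_line is made up for illustration, and the
sketch assumes Unix line endings and a locale in which awk's length()
counts characters the same way Emacs does:

```shell
# pos_to_line FILE POS (hypothetical helper, not part of Org or Emacs):
# print the number of the line that contains 1-based character offset POS.
pos_to_line() {
  awk -v pos="$2" '
    # Running total of characters seen so far; +1 accounts for the newline.
    { total += length($0) + 1 }
    # The first line whose running total reaches POS contains that offset.
    total >= pos + 0 { print NR; found = 1; exit }
    # END still runs after exit, so only complain if nothing was printed.
    END { if (!found) print "position is beyond the end of the file" }
  ' "$1"
}
```

Called as pos_to_line foo.org 18720, it prints the line containing that
offset.  With CRLF line endings or multibyte text the result may be off
by a few lines, so treat it as a starting point for the search.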

Thanks again
Markus