From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Steven E. Harris" <seh@panix.com>
Subject: Re: Release 6.17
Date: Sun, 04 Jan 2009 15:24:20 -0500
Message-ID: <uhc4elszv.fsf@torus.sehlabs.com>
References: <EEA5854C-A2D7-48A9-98A5-3941EA2FF36A@gmail.com>
	<uljtrkunt.fsf@torus.sehlabs.com>
	<1AD01E3D-3A98-4811-A7A1-0491189CE5C0@uva.nl>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path: <emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org>
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1LJZWV-00064i-IH
	for emacs-orgmode@gnu.org; Sun, 04 Jan 2009 15:24:35 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1LJZWT-00064W-45
	for emacs-orgmode@gnu.org; Sun, 04 Jan 2009 15:24:34 -0500
Received: from [199.232.76.173] (port=53620 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1LJZWS-00064T-VZ
	for emacs-orgmode@gnu.org; Sun, 04 Jan 2009 15:24:33 -0500
Received: from main.gmane.org ([80.91.229.2]:40712 helo=ciao.gmane.org)
	by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32)
	(Exim 4.60) (envelope-from <geo-emacs-orgmode@m.gmane.org>)
	id 1LJZWS-00067g-F2
	for emacs-orgmode@gnu.org; Sun, 04 Jan 2009 15:24:32 -0500
Received: from list by ciao.gmane.org with local (Exim 4.43)
	id 1LJZWO-000463-G2
	for emacs-orgmode@gnu.org; Sun, 04 Jan 2009 20:24:28 +0000
Received: from host-69-95-83-137.pit.choiceone.net ([69.95.83.137])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <emacs-orgmode@gnu.org>; Sun, 04 Jan 2009 20:24:28 +0000
Received: from seh by host-69-95-83-137.pit.choiceone.net with local (Gmexim
	0.1 (Debian)) id 1AlnuQ-0007hv-00
	for <emacs-orgmode@gnu.org>; Sun, 04 Jan 2009 20:24:28 +0000
List-Id: "General discussions about Org-mode." <emacs-orgmode.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-orgmode>,
	<mailto:emacs-orgmode-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-orgmode>
List-Post: <mailto:emacs-orgmode@gnu.org>
List-Help: <mailto:emacs-orgmode-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-orgmode>,
	<mailto:emacs-orgmode-request@gnu.org?subject=subscribe>
Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
To: emacs-orgmode@gnu.org

Carsten Dominik <dominik@science.uva.nl> writes:

> The idea is to make this work in a heuristic way, by using something
> that is unlikely enough to occur in real code.

And that is a tough problem, as code is usually defined as stuff that
contains all kinds of weird (and often paired) delimiters.

[...]

> What would be safer?
>
>  <<name>>    like the other Org-mode targets?  That would make sense.
>              Does anyone know a language where this would be used
>              in real life?  It would make it harder to write about
>              Org-mode, though.
>
> Or do we need another option, so that, if needed, we could switch to a
> different syntax?

This reminds me of the "leaning toothpick" problem with regular
expression syntax; Perl and some other languages adopted the flexibility
to accept any "matching" delimiters (either the same character used
twice or a balancing pair) in lieu of the default '/' delimiter
character. The delimiters needed to be able to "get out of the way" of
the dominant syntax within that particular regular expression. Here,
too, I expect that we'd either need to define language-specific escape
hatches, or stop guessing and force the user to define the active
delimiters.

What if the user could specify, before each code block, some "dispatch
character" that then had to be followed by a more telling string, such
as "#line:def"? In that example, the octothorpe is the dispatch
character, "line:" is the belt-and-suspenders clarifying tag, and "def"
is the label naming that line. Force it to be at the end of the line
(perhaps modulo trailing whitespace), as there should be only one
definition per line.

A regular expression match would look for

  #line:([^)]+)\s*$
  ^
  |
  + (not fixed)

except that the dispatch character would need to be composed in and
regex-quoted appropriately. Also, that pattern accepts anything but a
closing parenthesis in a label; it could be more restrictive and accept
only what is commonly expected of an identifier, such as alphanumerics,
dashes, and underscores.
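
To make the composition step concrete, here is a rough sketch in Python
rather than Emacs Lisp. The function names and the "line" tag default are
my own placeholders, not anything Org-mode defines; it only illustrates
quoting the dispatch character and using the more restrictive label class.

```python
import re


def make_label_regex(dispatch_char, tag="line"):
    """Build a pattern for '<dispatch><tag>:<label>' at the end of a line.

    Hypothetical helper: the names and the default tag are assumptions.
    """
    # re.escape quotes the dispatch character in case it is a regex
    # metacharacter such as '$' or '|'.
    prefix = re.escape(dispatch_char) + re.escape(tag) + ":"
    # Restrictive label: alphanumerics, dashes, and underscores only,
    # followed by optional trailing whitespace up to the end of the line.
    return re.compile(prefix + r"([A-Za-z0-9_-]+)\s*$")


def find_label(line, pattern):
    """Return the label named on this line, or None if there is none."""
    match = pattern.search(line)
    return match.group(1) if match else None


if __name__ == "__main__":
    hash_pattern = make_label_regex("#")
    print(find_label("x = compute()  #line:def  ", hash_pattern))
    dollar_pattern = make_label_regex("$")
    print(find_label("(setq x 1) $line:set-x", dollar_pattern))
```

Because the label must run up to the end of the line, a tag followed by
other text does not count as a definition.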

You could punt even further and just demand that the user provide a
suitable regex for finding the line labels unambiguously. I'm just leery
of trying to pick a default that's expected to work not just within
natural language, but within program source code.
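
If the user supplies the regex, about the only thing the tool can check
is that the pattern compiles and captures a label. A sketch of that, again
in Python and with hypothetical names of my own choosing, assuming the
first capture group holds the label:

```python
import re


def compile_user_label_regex(user_regex):
    """Compile a user-supplied label pattern (hypothetical helper).

    Assumes the first capture group is the label name.
    """
    pattern = re.compile(user_regex)  # raises re.error if malformed
    if pattern.groups < 1:
        raise ValueError("label regex needs a capture group for the label")
    return pattern


def extract_labels(lines, pattern):
    """Map each label found to its 1-based line number."""
    labels = {}
    for lineno, line in enumerate(lines, start=1):
        match = pattern.search(line)
        if match:
            labels[match.group(1)] = lineno
    return labels


if __name__ == "__main__":
    pattern = compile_user_label_regex(r";;\s*ref:(\w+)\s*$")
    source = ["(defun f ()", '  (message "hi")) ;; ref:greet']
    print(extract_labels(source, pattern))
```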

-- 
Steven E. Harris