From: Achim Gratz
Subject: Re: GSoC 2012 -- Elisp backend for Ragel
Date: Wed, 28 Mar 2012 08:34:43 +0200
Message-ID: <87bonhkzcs.fsf@Rainer.invalid>
References: <87vclsykl6.fsf@gmail.com> <87sjgtoi1o.fsf@Rainer.invalid>
List-Id: "General discussions about Org-mode."
To: emacs-orgmode@gnu.org

Aurélien Aptel writes:
> Ragel is written in C++ and has no dependency.

It depends on having a working C++ compiler (presumably with some list
of features / standard conformance).

> * every major platform has a C++ compiler

Yes, but it may not be installed. Or has the wrong version. Or whatever.
> * ragel input along with generated code can be tracked in the repo

It is a bad idea(TM) to track both the sources and the result of a
generation from that source in the same repo. That other projects are
doing that doesn't mean we should follow their example.

> * the generated code is portable since it's elisp (doesn't need to be
> regenerated on different platforms)
> * the parser is a confined part of org-mode
>
> I don't think this is a problem.

It may not be a problem for you. It probably isn't for me. That still
doesn't mean it won't be a problem for some org-mode users. You need to
think about possible problems from the user perspective.

>> Which is just as easily done by specifying the syntax incorrectly.
>
> I think the fix will be shorter and simpler in the syntax because it's
> easier to reason on an abstract definition when it comes to language.
> When you're neck-deep in your handwritten implementation trying to
> figure what you did wrong, it can take a long time.

Please have a look at Nicolas' code first before making such statements.
I haven't seen ragel output, especially not in ELisp, and I don't know
how easy it will be to debug parse errors. The other thing to keep in
mind is that org-mode doesn't have a formal syntax description, much
less one that follows one of the standard grammars. This will be a much
bigger fish to fry then.

>> No, you can (for a suitably restricted set of languages) formally proof
>> that the implementation and the specification is identical for any
>> input.
>
> How would you do that programmatically?

Fundamentally? By induction.

>> The assumption that an FSM running in ELisp is faster than a bunch of
>> regexp has not been actually tested or has it?
>
> I haven't tested anything yet.
> If I remember correctly, the emacs regex API doesn't provide a way to
> compile patterns and thus have to be compiled at each call.

I haven't checked.
But all this happens in machine code, not ELisp, so it is not clear
whether a re-implementation of the regex engine, even if it is vastly
superior to the one Emacs uses now, would be a net win.

> Also the underlying FSM implementation uses NFA which can lead to a
> exponential complexity in time [1] for certain patterns.

That trait is shared by all regex engines that allow backreferences.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptations for KORG EX-800 and Poly-800MkII V0.9:
http://Synth.Stromeko.net/Downloads.html#KorgSDada