* GSoC 2012 -- Elisp backend for Ragel
@ 2012-03-21 18:51 Aurélien Aptel
2012-03-21 19:32 ` Aurélien Aptel
` (4 more replies)
0 siblings, 5 replies; 26+ messages in thread
From: Aurélien Aptel @ 2012-03-21 18:51 UTC (permalink / raw)
To: emacs-orgmode
Hi!
I'm currently studying Computer Science at Lyon 1, France and I was
interested in writing an elisp backend for ragel to use in org-mode
for the Google Summer of Code 2012.
I hope GNU will have enough slots for this project because it's the one
I would really like to do :)
Here's a draft of my application. I was hoping someone (mentors?)
could help me improve it (suggestions, typos, corrections). Keep in mind
English is not my native language.
---
Name: Aurélien Aptel
E-mail: aurelien.aptel@gmail.com
Project name: org-mode -- Implement an Elisp backend for Ragel
Summary:
The objective of the project is to implement an Elisp backend for
Ragel (a "parser generator") in order to replace the slow, complex and
error-prone parsing code in org-mode with fast code generated by Ragel
from a clean and readable grammar.
Benefits:
* Clean, readable and reusable grammar for org-mode files
* A new (fast) alternative for parsing in Elisp; can be relevant for
a lot of Elisp projects
* New language backend for Ragel
Deliverables:
* New language backend for Ragel
* New improved parsing code for org-mode
* A grammar for org-mode files
Plan:
Communication:
I can be reached via email or IRC. I plan on using a DVCS like
Mercurial and publishing my commits on a public hosting service such as
bitbucket.org so everyone can follow my progress. I will also post to
the org-mode ML to present my progress after each meaningful step.
Qualification:
* I've already contributed to emacs.
I've added cross-platform "underwave" support, hopefully included in
future releases.
More info and patch at:
http://lists.gnu.org/archive/html/bug-gnu-emacs/2012-02/msg00238.html
http://lists.gnu.org/archive/html/emacs-devel/2012-01/msg00844.html
* I'm familiar with FSM concepts
I've had classes on languages and automata which involved the
implementation of several algorithms.
* I use emacs every day :)
I read and sometimes post on various emacs MLs and I keep up with
Emacs-related news on the web.
I'm familiar with Lisp-like languages. I wrote an interpreter (in C)
for my own Lisp language with the help of SICP as a personal project.
I have enough Elisp knowledge to automate some of my tasks but I'm no
expert at it.
I have used org-mode a few times but I can't say I'm a frequent user.
I'm a C hacker at heart but I have an acceptable knowledge of C++ :)
---
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-21 18:51 Aurélien Aptel
@ 2012-03-21 19:32 ` Aurélien Aptel
2012-03-21 19:34 ` Samuel Wales
` (3 subsequent siblings)
4 siblings, 0 replies; 26+ messages in thread
From: Aurélien Aptel @ 2012-03-21 19:32 UTC (permalink / raw)
To: emacs-orgmode
Oops. Forgot the "Plan" section. I'm not sure about that one.
Plan:
The project has 2 clearly defined tasks:
* implement the backend
* replace the old parser
I'm still not sure what will take the most time and I'm tempted to
just have an almost-working backend as mid-term evaluation, working
backend as a final evaluation and the replacement of the old parser as
a bonus... The project title is "Add Elisp backend to Ragel" after
all.
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-21 18:51 Aurélien Aptel
2012-03-21 19:32 ` Aurélien Aptel
@ 2012-03-21 19:34 ` Samuel Wales
2012-03-22 12:22 ` Thorsten
` (2 subsequent siblings)
4 siblings, 0 replies; 26+ messages in thread
From: Samuel Wales @ 2012-03-21 19:34 UTC (permalink / raw)
To: Aurélien Aptel; +Cc: emacs-orgmode
Great job.
--
The Kafka Pandemic: http://thekafkapandemic.blogspot.com
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-21 18:51 Aurélien Aptel
2012-03-21 19:32 ` Aurélien Aptel
2012-03-21 19:34 ` Samuel Wales
@ 2012-03-22 12:22 ` Thorsten
2012-03-24 8:16 ` Nicolas Goaziou
2012-03-26 16:03 ` Bastien
4 siblings, 0 replies; 26+ messages in thread
From: Thorsten @ 2012-03-22 12:22 UTC (permalink / raw)
To: emacs-orgmode
Aurélien Aptel <aurelien.aptel@gmail.com> writes:
Hi Aurelien,
> Here's a draft of my application. I was hoping someone (mentors?)
> could help me improve it (suggestion, typo, correction). Keep in mind
> English is not my native language.
I copied and pasted the Ragel proposal from this email:
,-----------------------------------------------------------------------
| From: Rustom Mody <rustompmody@gmail.com>
| Subject: Another gsoc idea -- ragel
| Newsgroups: gmane.emacs.orgmode
| To: emacs-orgmode <emacs-orgmode@gnu.org>
| Date: Fri, 9 Mar 2012 20:24:55 +0530 (1 week, 5 days, 21 hours ago)
|
| Ragel http://www.complang.org/ragel/ is a tool that integrates regular
| expressions and state machines under one umbrella.
| It has backends currently for C, C++, Objective-C, D, Java and Ruby.
| I do not think having an elisp backend would be a very big task.
|
| After that (in my estimate) org-mode code would (could) become half as
| long and twice as fast -- at least those sections that are heavily
| regex oriented
`-----------------------------------------------------------------------
to the GSoC ideas page, and added (a bit deliberately) the author of the
email and Bastien (Guerry) as potential mentors for the project idea -
just assuming they might be interested in mentoring. But I did not
confirm with either of them!
I would therefore suggest that you contact them to find out who might
actually be the mentor for your project.
--
cheers,
Thorsten
* Re: GSoC 2012 -- Elisp backend for Ragel
@ 2012-03-22 13:43 Rustom Mody
2012-03-23 11:12 ` Aurélien Aptel
0 siblings, 1 reply; 26+ messages in thread
From: Rustom Mody @ 2012-03-22 13:43 UTC (permalink / raw)
To: emacs-orgmode
Writing a first-cut backend for elisp into ragel is probably not a big job.
Porting org to that backend is damn ambitious.
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-22 13:43 GSoC 2012 -- Elisp backend for Ragel Rustom Mody
@ 2012-03-23 11:12 ` Aurélien Aptel
2012-03-23 11:36 ` Rustom Mody
0 siblings, 1 reply; 26+ messages in thread
From: Aurélien Aptel @ 2012-03-23 11:12 UTC (permalink / raw)
To: Rustom Mody; +Cc: emacs-orgmode
So, as an experienced org-mode developer you're saying it's very
hard? I should focus on the ragel part in the application and try to
go as far as I can for org-mode then. I still need something I can be
evaluated on for the mid-term and final evaluations.
Can you be my mentor? If no one can I should apply for another project :/
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-23 11:12 ` Aurélien Aptel
@ 2012-03-23 11:36 ` Rustom Mody
[not found] ` <CAJ+TeoebkVTLs9nrDTH_6xvzvkk1vTEZDL2iHmEAkTUfZRjpjQ@mail.gmail.com>
2012-03-26 15:53 ` Bastien
0 siblings, 2 replies; 26+ messages in thread
From: Rustom Mody @ 2012-03-23 11:36 UTC (permalink / raw)
To: Aurélien Aptel; +Cc: ragel-users, emacs-orgmode
On Fri, Mar 23, 2012 at 4:42 PM, Aurélien Aptel <aurelien.aptel@gmail.com> wrote:
> So, as an experienced org-mode developper you're saying it's very
> hard? I should focus on the ragel part in the application and try to
> go as far as i can for org-mode then. I still need something I can be
> evaluated to for the mid-term and final evaluation.
>
> Can you be my mentor? If no one can I should apply for another project :/
>
I am rather far from being an org mode developer; just a user.
Adding a ragel backend for elisp on its own is too small for a gsoc project.
Using ragelized elisp to rewrite orgmode is a wonderful project, but large.
The best solution is to chalk out a subset of the latter and work at that.
I've CCed the ragel list in case Adrian Thurston, the ragel author, thinks
differently.
For the first, Adrian and the ragel list are your best bet.
For the second, the orgmode list and developers.
For myself, I am interested and would like to be informed. I don't think I
know enough of the internals of either to be the sole responsible party.
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-21 18:51 Aurélien Aptel
` (2 preceding siblings ...)
2012-03-22 12:22 ` Thorsten
@ 2012-03-24 8:16 ` Nicolas Goaziou
2012-03-26 16:03 ` Bastien
4 siblings, 0 replies; 26+ messages in thread
From: Nicolas Goaziou @ 2012-03-24 8:16 UTC (permalink / raw)
To: Aurélien Aptel; +Cc: emacs-orgmode
Hello,
Aurélien Aptel <aurelien.aptel@gmail.com> writes:
> The objective of the project is to implement an Elisp backend for
> Ragel (a "parser generator") in order to replace the slow, complex and
> error-prone parsing code in org-mode with fast code generated by Ragel
> from a clean and readable grammar.
FYI, there is already an elisp Org parser being worked on in the development
branch of Org mode. It isn't finished yet, but it is advanced enough that
a generic exporter could be built upon it.
Is there any interest in ignoring it and restarting all the work from
scratch?
Regards,
--
Nicolas Goaziou
* Re: GSoC 2012 -- Elisp backend for Ragel
[not found] ` <CAJ+TeoebkVTLs9nrDTH_6xvzvkk1vTEZDL2iHmEAkTUfZRjpjQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-03-25 4:17 ` Rustom Mody
2012-03-25 6:55 ` Rustom Mody
2012-03-25 10:06 ` Aurélien Aptel
0 siblings, 2 replies; 26+ messages in thread
From: Rustom Mody @ 2012-03-25 4:17 UTC (permalink / raw)
To: emacs-orgmode, ragel-users-99ym4TKX+Hhg9hUCZPvPmw; +Cc: Aurélien Aptel
I asked Adrian Thurston, the author of ragel, if he would support this
project.
His reply to me is here:
> Hi Rustom,
>
> I can certainly help with guidance. I'm about to relocate across much of
> Canada and so I won't be able to crack open code very easily.
>
> Aurélien: my apologies I haven't responded. I certainly support your
> effort, just have had some big career/life changes going on.
>
> Regards,
> Adrian
As for Nicolas, he wrote:
> FYI, there is already an elisp Org parser being worked on in development
> branch of Org mode. It isn't finished yet, but still advanced enough so
> a generic exporter could be built upon it.
>
> Is there any interest in ignoring it and restart all the work from
> scratch?
Yes I agree, no point redoing work unnecessarily. Maybe the optimal
solution would be for Aurélien to work with Nicolas and Adrian to minimize
useless rework?
Aurélien you now need to say what are your preferences in this matter.
Rusi
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-25 4:17 ` Rustom Mody
@ 2012-03-25 6:55 ` Rustom Mody
2012-03-25 10:06 ` Aurélien Aptel
1 sibling, 0 replies; 26+ messages in thread
From: Rustom Mody @ 2012-03-25 6:55 UTC (permalink / raw)
To: emacs-orgmode
Resending this because it seems to not have reached
-------------------------------
I asked Adrian Thurston, the author of ragel, if he would support this
project.
His reply to me is here:
> Hi Rustom,
>
> I can certainly help with guidance. I'm about to relocate across much of
> Canada and so I won't be able to crack open code very easily.
>
> Aurélien: my apologies I haven't responded. I certainly support your
> effort, just have had some big career/life changes going on.
>
> Regards,
> Adrian
>
As for Nicolas, he wrote:
> FYI, there is already an elisp Org parser being worked on in development
> branch of Org mode. It isn't finished yet, but still advanced enough so
> a generic exporter could be built upon it.
>
> Is there any interest in ignoring it and restart all the work from
> scratch?
>
Yes I agree, no point redoing work unnecessarily. Maybe the optimal
solution would be for Aurélien to work with Nicolas and Adrian to minimize
useless rework?
Aurélien you now need to say what are your preferences in this matter.
Rusi
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-25 4:17 ` Rustom Mody
2012-03-25 6:55 ` Rustom Mody
@ 2012-03-25 10:06 ` Aurélien Aptel
2012-03-25 11:40 ` Nicolas Goaziou
2012-03-26 16:01 ` Bastien
1 sibling, 2 replies; 26+ messages in thread
From: Aurélien Aptel @ 2012-03-25 10:06 UTC (permalink / raw)
To: Rustom Mody; +Cc: ragel-users, emacs-orgmode
On Sun, Mar 25, 2012 at 6:17 AM, Rustom Mody <rustompmody@gmail.com> wrote:
> FYI, there is already an elisp Org parser being worked on in development
> branch of Org mode. It isn't finished yet, but still advanced enough so
> a generic exporter could be built upon it.
> Is there any interest in ignoring it and restart all the work from
> scratch?
>
> Yes I agree, no point redoing work unnecessarily. Maybe the optimal
> solution would be for Aurélien to work with Nicolas and Adrian to minimize
> useless rework?
Regardless of the org-mode parser, I think I should work on the elisp
backend for ragel which is something that can benefit any elisp
project.
As for the new org-mode parser, I could not find it on the repo. Could
you point me to the relevant files?
Is it still hand written? If so, I think it's ultimately a bad idea
and it should be rewritten using ragel.
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-25 10:06 ` Aurélien Aptel
@ 2012-03-25 11:40 ` Nicolas Goaziou
2012-03-25 12:52 ` Martyn Jago
[not found] ` <87vclsykl6.fsf-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2012-03-26 16:01 ` Bastien
1 sibling, 2 replies; 26+ messages in thread
From: Nicolas Goaziou @ 2012-03-25 11:40 UTC (permalink / raw)
To: Aurélien Aptel; +Cc: Rustom Mody, ragel-users, emacs-orgmode
Hello,
Aurélien Aptel <aurelien.aptel@gmail.com> writes:
> On Sun, Mar 25, 2012 at 6:17 AM, Rustom Mody <rustompmody@gmail.com> wrote:
>> FYI, there is already an elisp Org parser being worked on in development
>> branch of Org mode. It isn't finished yet, but still advanced enough so
>> a generic exporter could be built upon it.
>
>> Is there any interest in ignoring it and restart all the work from
>> scratch?
>>
>> Yes I agree, no point redoing work unnecessarily. Maybe the optimal
>> solution would be for Aurélien to work with Nicolas and Adrian to minimize
>> useless rework?
>
> Regardless of the org-mode parser, I think I should work on the elisp
> backend for ragel which is something that can benefit any elisp
> project.
Certainly.
> As for the new org-mode parser, I could not find it on the repo. Could
> you point me to the relevant files?
See org-element.el in the contrib/ directory. You need the development version.
> Is it still hand written?
Yes.
> If so, I think it's ultimately a bad idea and it should be rewritten
> using ragel.
It may be. But it allows for flexibility. Org's syntax is evolving, and
I consider org-element.el as a parser, but also as guidance in that
process. Since there is no formal description of Org syntax yet,
org-element.el is more useful than a full-blown parser generator for
now.
I don't know ragel (save for a short excursion on its website), but I'm
pretty sure that even if it generates elisp code without dependencies, any
evolution of Org syntax will require using it again. At that time, it
may be difficult to find someone able and willing to undertake that
updating task within a reasonable time (since we're talking about a core
feature). On the other hand, there are quite a few elisp hackers in
the Emacs world.
Now, if ragel can improve org-element.el while preserving its
flexibility (and a compatible output, since I assume you won't also
rewrite the generic export engine), I'm all ears.
Regards,
--
Nicolas Goaziou
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-25 11:40 ` Nicolas Goaziou
@ 2012-03-25 12:52 ` Martyn Jago
2012-03-27 20:34 ` Aurélien Aptel
[not found] ` <87vclsykl6.fsf-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
1 sibling, 1 reply; 26+ messages in thread
From: Martyn Jago @ 2012-03-25 12:52 UTC (permalink / raw)
To: emacs-orgmode
Hi, and welcome Aurélien,
Nicolas Goaziou <n.goaziou@gmail.com> writes:
> Hello,
>
> Aurélien Aptel <aurelien.aptel@gmail.com> writes:
>
>> On Sun, Mar 25, 2012 at 6:17 AM, Rustom Mody <rustompmody@gmail.com> wrote:
>>> FYI, there is already an elisp Org parser being worked on in development
>>> branch of Org mode. It isn't finished yet, but still advanced enough so
>>> a generic exporter could be built upon it.
>>
>>> Is there any interest in ignoring it and restart all the work from
>>> scratch?
>>>
>>> Yes I agree, no point redoing work unnecessarily. Maybe the optimal
>>> solution would be for Aurélien to work with Nicolas and Adrian to minimize
>>> useless rework?
>>
>> Regardless of the org-mode parser, I think I should work on the elisp
>> backend for ragel which is something that can benefit any elisp
>> project.
>
> Certainly.
>
>> As for the new org-mode parser, I could not find it on the repo. Could
>> you point me to the relevant files?
>
> See org-element.el in contrib/ directory. You need development version.
>
>> Is it still hand written?
>
> Yes.
>
>> If so, I think it's ultimately a bad idea and it should be rewritten
>> using ragel.
>
> It may be. But it allows for flexibility. Org's syntax is evolving, and
> I consider org-element.el as a parser, but also as a guidance in that
> process. Since there is no formal description for Org syntax yet, an
> org-element.el is more useful than a full-blown parser generator for
> now.
>
> I don't know ragel (save for a short excursion in its website), but I'm
> pretty sure that even if it generates elisp code without dependency, any
> evolution to Org syntax will require to use it again. At that time, it
> may be difficult to find someone able and willing to undertake that
> updating task in a reasonable delay (since we're talking about a core
> feature). On the other hand, there are quite a few elisp hackers in
> Emacs's world.
>
> Now, if ragel can improve org-element.el while preserving its
> flexibility (and a compatible output, since I assume you won't also
> rewrite the generic export engine), I'm all ears.
I am a big fan of formalizing as much state design as possible into
formal state-machines, and have been for some time.
For the design process I currently use the excellent plantuml library
from within Org-mode (Aurélien, see lisp/ob-plantuml.el for the
Org-babel interface), although I have previously used a bespoke
Ruby/Graphviz library of my own design, and before that Ragel.
For that reason I think it would be great to see an `Org-Babel'
implementation of Ragel. I think it could be a very useful tool.
However, I don't use code auto-generation by Ragel and the like for the
following reasons...
1) It complicates the build process
2) It adds yet more SOUP (software of unknown provenance) to the build
process which may then need mitigating against (this is relevant to me,
although possibly not to Org-mode).
3) FSM implementation into code is inherently very simple anyway
What I do instead is validate the FSM code against the formal state
representation as an integration test (which is fairly easy to arrange).
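A rough sketch of what such an integration test could look like, written in Python rather than Elisp purely for illustration. Everything here is an assumption made for the example: the toy "headline-like" language (one or more `*`, then a space, then anything), the table `SPEC`, and the function names are hypothetical and are not taken from Org, Ragel, or any existing test suite.

```python
from itertools import product

# Formal state representation (hypothetical): a transition table for
# "one or more '*', then a space, then any characters".
SPEC = {
    ("start", "*"): "stars",
    ("stars", "*"): "stars",
    ("stars", " "): "headline",
    ("headline", "*"): "headline",
    ("headline", " "): "headline",
    ("headline", "x"): "headline",
}
ACCEPTING = {"headline"}


def spec_accepts(s):
    """Run the input through the transition table."""
    state = "start"
    for ch in s:
        state = SPEC.get((state, ch))
        if state is None:  # no transition defined: reject
            return False
    return state in ACCEPTING


def handwritten_accepts(s):
    """Hand-written recognizer for the same toy language."""
    i = 0
    while i < len(s) and s[i] == "*":
        i += 1
    return 0 < i < len(s) and s[i] == " "


def find_disagreement(max_len=6, alphabet="* x"):
    """Return the first input on which the two disagree, or None."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if spec_accepts(s) != handwritten_accepts(s):
                return s
    return None
```

Here `find_disagreement()` compares the two on every input up to a bounded length, so it only gives confidence up to that bound and is not a proof.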
So my point is, I personally don't see the use of FSM code generators as
a panacea to perfect super-fast code. To me their usefulness is in the
visual representation of the state interaction (during development, and
subsequent code documentation), and the resulting code quality.
Just my 10c
Best, Martyn
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-23 11:36 ` Rustom Mody
[not found] ` <CAJ+TeoebkVTLs9nrDTH_6xvzvkk1vTEZDL2iHmEAkTUfZRjpjQ@mail.gmail.com>
@ 2012-03-26 15:53 ` Bastien
1 sibling, 0 replies; 26+ messages in thread
From: Bastien @ 2012-03-26 15:53 UTC (permalink / raw)
To: Rustom Mody; +Cc: ragel-users, emacs-orgmode, Aurélien Aptel
Hi Aurélien,
thanks for your proposal.
Please get in touch with me and Nicolas privately to discuss it
more in depth: we are both French, that can help.
As an Org maintainer, my priority is to integrate Nicolas's parser,
not to rewrite it. And Eric's suggestion of documenting the Org
syntax thoroughly is a good one. But perhaps this doesn't match
what you want to work on as a student.
In any case, please feel free to share your thoughts, we can
work something out.
Best,
--
Bastien
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-25 10:06 ` Aurélien Aptel
2012-03-25 11:40 ` Nicolas Goaziou
@ 2012-03-26 16:01 ` Bastien
[not found] ` <87wr674aii.fsf-mXXj517/zsQ@public.gmane.org>
1 sibling, 1 reply; 26+ messages in thread
From: Bastien @ 2012-03-26 16:01 UTC (permalink / raw)
To: Aurélien Aptel; +Cc: Rustom Mody, ragel-users, emacs-orgmode
Hi Aurélien,
Aurélien Aptel <aurelien.aptel@gmail.com> writes:
> Regardless of the org-mode parser, I think I should work on the elisp
> backend for ragel which is something that can benefit any elisp
> project.
Yes -- that would be great.
> As for the new org-mode parser, I could not find it on the repo. Could
> you point me to the relevant files?
> Is it still hand written? If so, I think it's ultimately a bad idea
> and it should be rewritten using ragel.
I guess that's because you're not fluent in English, but this sounds
a bit "peremptory". We are not only interested in code, we are also
interested in learning from each other. That's what makes this list
a nice place to live in. If you have code or explanations you want
to share, please do!
Best,
--
Bastien
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-21 18:51 Aurélien Aptel
` (3 preceding siblings ...)
2012-03-24 8:16 ` Nicolas Goaziou
@ 2012-03-26 16:03 ` Bastien
4 siblings, 0 replies; 26+ messages in thread
From: Bastien @ 2012-03-26 16:03 UTC (permalink / raw)
To: Aurélien Aptel; +Cc: emacs-orgmode
Hi Aurélien,
Aurélien Aptel <aurelien.aptel@gmail.com> writes:
> I can be reached via email or irc. I plan on using a DVCS like
> Mercurial and publish
We use git for org-mode.
If you plan to contribute, I suggest you learn the basics of
git -- won't be hard, given your skills.
Best,
--
Bastien
* Re: GSoC 2012 -- Elisp backend for Ragel
[not found] ` <87vclsykl6.fsf-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2012-03-27 20:03 ` Aurélien Aptel
0 siblings, 0 replies; 26+ messages in thread
From: Aurélien Aptel @ 2012-03-27 20:03 UTC (permalink / raw)
To: Nicolas Goaziou
Cc: Rustom Mody, ragel-users-99ym4TKX+Hhg9hUCZPvPmw, emacs-orgmode
On Sun, Mar 25, 2012 at 1:40 PM, Nicolas Goaziou <n.goaziou-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> If so, I think it's ultimately a bad idea and it should be rewritten
>> using ragel.
>
> It may be. But it allows for flexibility. Org's syntax is evolving, and
> I consider org-element.el as a parser, but also as a guidance in that
> process. Since there is no formal description for Org syntax yet, an
> org-element.el is more useful than a full-blown parser generator for
> now.
Using a parser generator can be flexible too. Big changes in the
syntax usually imply big changes in the parsing code. With a tool
like ragel the operation is much less painful since the code is
generated. Also, if the org syntax can be written as a grammar, it can
be safely imported into other software that has a parser for it, making
the format more portable.
> I don't know ragel (save for a short excursion in its website), but I'm
> pretty sure that even if it generates elisp code without dependency, any
> evolution to Org syntax will require to use it again. At that time, it
> may be difficult to find someone able and willing to undertake that
> updating task in a reasonable delay (since we're talking about a core
> feature). On the other hand, there are quite a few elisp hackers in
> Emacs's world.
Frankly, I don't know ragel very well either. I've only used it on
very simple things. But it's easy to use. You can even execute an action
in any state while parsing a token (look closely at the example on the
homepage).
> Now, if ragel can improve org-element.el while preserving its
I'm not sure it's possible :/
> flexibility (and a compatible output, since I assume you won't also
> rewrite the generic export engine), I'm all ears.
Yes, the output of the parser has to remain the same otherwise I would
have to rewrite everything :p
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-25 12:52 ` Martyn Jago
@ 2012-03-27 20:34 ` Aurélien Aptel
2012-03-27 21:22 ` Achim Gratz
0 siblings, 1 reply; 26+ messages in thread
From: Aurélien Aptel @ 2012-03-27 20:34 UTC (permalink / raw)
To: Martyn Jago; +Cc: emacs-orgmode
On Sun, Mar 25, 2012 at 2:52 PM, Martyn Jago <martyn.jago@btinternet.com> wrote:
> For the design process I currently use the excellent plantuml library
> from within Org-mode (Aurélien, see lisp/ob-plantuml.el for the
> Org-babel interface), although have previously used a bespoke
> Ruby/Graphviz library of my design, and previous to that Ragel.
I didn't know about plantuml. Nice tool!
> For that reason I think it would be great to see an `Org-Babel'
> implementation of Ragel. I think it could be a very useful tool.
I'm discovering org-mode little by little. The whole org-babel thing
is really nice too. So what you want is an org-babel interface that
will convert a source block input for ragel to an image of the
automata/generated code once you export e.g. to html?
> 1) It complicates the build process
Yes, it adds complexity for the developers. Users won't notice.
> 2) It adds yet more SOUP (software of unknown provenance) to the build
> process which may then need mitigating against (this is relevant to me,
> although possibly not to Org-mode).
You mean it complicates the build process? I'm not sure I understand.
> 3) FSM implementation into code is inherently very simple anyway
Well, I've never done it for a big language but I did not find it that
easy. Mistakes are easily made, e.g. you end up accepting more things
than the language does in order to simplify the code, etc.
> What I do instead is validate the FSM code against the formal state
> representation as an integration test (which is fairly easy to arrange).
I'm not familiar with this. Does this mean you test random valid and
invalid input against the parser (aka fuzzing)?
> So my point is, I personally don't see the use of FSM code generators as
> a panacea to perfect super-fast code. To me their usefulness is in the
> visual representation of the state interaction (during development, and
> subsequent code documentation), and the resulting code quality.
I'm not very experienced (just a student :) but for me their
usefulness is in the robustness and the abstraction they bring. They also
happen to be faster. But I see what you're saying.
* Re: GSoC 2012 -- Elisp backend for Ragel
[not found] ` <87wr674aii.fsf-mXXj517/zsQ@public.gmane.org>
@ 2012-03-27 20:49 ` Aurélien Aptel
2012-03-27 22:10 ` Bastien
0 siblings, 1 reply; 26+ messages in thread
From: Aurélien Aptel @ 2012-03-27 20:49 UTC (permalink / raw)
To: Bastien; +Cc: Rustom Mody, ragel-users-99ym4TKX+Hhg9hUCZPvPmw, emacs-orgmode
On Mon, Mar 26, 2012 at 6:01 PM, Bastien <bzg@gnu.org> wrote:
>> Is it still hand written? If so, I think it's ultimately a bad idea
>> and it should be rewritten using ragel.
>
> I guess that's because you're not fluent in english, but this sounds
> a bit "peremptory". We are not only interested in code, we are also
Sorry if I've made a bad impression or if my remarks feel a bit
passive-aggressive. It's harder (and takes longer) for me to write in
English, yes :p
> interested in learning from each others. That's what make this list
> a nice place to live in. If you have code or explanations you want
> to share, please do!
From my classes, I thought it was widely accepted that sufficiently
complex parsers should now be written by a tool like ragel. But again
I'm not experienced, I never had to parse a complex language.
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-27 20:34 ` Aurélien Aptel
@ 2012-03-27 21:22 ` Achim Gratz
2012-03-27 22:11 ` Aurélien Aptel
0 siblings, 1 reply; 26+ messages in thread
From: Achim Gratz @ 2012-03-27 21:22 UTC (permalink / raw)
To: emacs-orgmode
Aurélien Aptel <aurelien.aptel@gmail.com> writes:
>> 2) It adds yet more SOUP (software of unknown provenance) to the build
>> process which may then need mitigating against (this is relevant to me,
>> although possibly not to Org-mode).
>
> You mean it complicates the build process? I'm not sure I understand.
It needs yet another tool to fully build org-mode from scratch, which
needs to be installed, bug-free and configured correctly. Right now all
one really needs to have to build org-mode is a working Emacs (even make
is optional).
>> 3) FSM implementation into code is inherently very simple anyway
>
> Well I've never done it for big language but I did not find it that
> easy. Mistakes are easily made e.g. you end up accepting more thing
> that the language does in order to simplify the code, etc.
Which is just as easily done by specifying the syntax incorrectly.
>> What I do instead is validate the FSM code against the formal state
>> representation as an integration test (which is fairly easy to arrange).
>
> I'm not familiar with this. Does this mean you test random valid and
> invalid input against the parser (aka fuzzing)?
No, you can (for a suitably restricted set of languages) formally prove
that the implementation and the specification are identical for any
input.
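For regular languages, one standard way to do this is to walk the product of the two automata and look for a state pair where only one side accepts. The sketch below is a minimal illustration in Python, not anything from Org or Ragel: the dictionary layout (`start`, `accept`, `delta`), the function name `equivalent`, and the three example automata are all hypothetical, and both automata are assumed to be complete DFAs over the same alphabet.

```python
from collections import deque


def equivalent(dfa_a, dfa_b, alphabet):
    """Return None if both DFAs accept the same language,
    otherwise a shortest input on which they differ."""
    start = (dfa_a["start"], dfa_b["start"])
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        (qa, qb), word = queue.popleft()
        # A reachable pair where exactly one side accepts is a counterexample.
        if (qa in dfa_a["accept"]) != (qb in dfa_b["accept"]):
            return word
        for ch in alphabet:
            nxt = (dfa_a["delta"][qa][ch], dfa_b["delta"][qb][ch])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + ch))
    return None


# "Specification": strings over {a, b} with an even number of a's.
SPEC = {
    "start": "even",
    "accept": {"even"},
    "delta": {"even": {"a": "odd", "b": "even"},
              "odd": {"a": "even", "b": "odd"}},
}

# "Implementation": also tracks the parity of b's, but accepts the same language.
IMPL = {
    "start": "00",
    "accept": {"00", "01"},
    "delta": {"00": {"a": "10", "b": "01"},
              "10": {"a": "00", "b": "11"},
              "01": {"a": "11", "b": "00"},
              "11": {"a": "01", "b": "10"}},
}

# A deliberately wrong variant: a 'b' in state "odd" resets to "even".
BUGGY = {
    "start": "even",
    "accept": {"even"},
    "delta": {"even": {"a": "odd", "b": "even"},
              "odd": {"a": "even", "b": "even"}},
}
```

Because the search covers every reachable pair of states, a `None` result holds for inputs of any length, unlike bounded or random testing.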
> I'm not very experienced (just a student :) but for me their
> usefulness is in the robustness and the abstraction it brings. It also
> happens to be faster. But I see what you're saying.
The assumption that an FSM running in ELisp is faster than a bunch of
regexps has not actually been tested, or has it?
Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
Wavetables for the Terratec KOMPLEXER:
http://Synth.Stromeko.net/Downloads.html#KomplexerWaves
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-27 20:49 ` Aurélien Aptel
@ 2012-03-27 22:10 ` Bastien
0 siblings, 0 replies; 26+ messages in thread
From: Bastien @ 2012-03-27 22:10 UTC (permalink / raw)
To: Aurélien Aptel; +Cc: Rustom Mody, ragel-users, emacs-orgmode
Hi Aurélien,
Aurélien Aptel <aurelien.aptel@gmail.com> writes:
> On Mon, Mar 26, 2012 at 6:01 PM, Bastien <bzg@gnu.org> wrote:
>>> Is it still hand written? If so, I think it's ultimately a bad idea
>>> and it should be rewritten using ragel.
>>
>> I guess that's because you're not fluent in English, but this sounds
>> a bit "peremptory". We are not only interested in code, we are also
>
> Sorry if I've made a bad impression or if my remarks feel a bit
> passive-aggressive. It's harder (and takes longer) for me to write in
> English, yes :p
No problem!
>> interested in learning from each other. That's what makes this list
>> a nice place to live in. If you have code or explanations you want
>> to share, please do!
>
> From my classes, I thought it was widely accepted that sufficiently
> complex parsers should now be written by a tool like ragel. But again
> I'm not experienced, I never had to parse a complex language.
Writing an Org parser with Ragel looks like an interesting project in
itself, theoretically speaking -- and you will find great minds around
that will follow and support your progress on this.
But for a GSoC project, we have to think in very practical terms ("how
will this improve the current code-base?"). Achim's remarks are good
ones. And remember potential mentors are volunteers, the same ones
that (try to) maintain Org every day.
Anyway, I'm glad to learn about Ragel. But as I said, my priority
goes to anything that can help Nicolas' parser (testing, learning,
adding new backends, etc.)
Thanks,
--
Bastien
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-27 21:22 ` Achim Gratz
@ 2012-03-27 22:11 ` Aurélien Aptel
2012-03-28 6:34 ` Achim Gratz
0 siblings, 1 reply; 26+ messages in thread
From: Aurélien Aptel @ 2012-03-27 22:11 UTC (permalink / raw)
To: Achim Gratz; +Cc: emacs-orgmode
On Tue, Mar 27, 2012 at 11:22 PM, Achim Gratz <Stromeko@nexgo.de> wrote:
> It needs yet another tool to fully build org-mode from scratch, which
> needs to be installed, bug-free and configured correctly. Right now all
> one really needs to have to build org-mode is a working Emacs (even make
> is optional).
Ragel is written in C++ and has no dependencies.
* every major platform has a C++ compiler
* ragel input along with generated code can be tracked in the repo
* the generated code is portable since it's elisp (doesn't need to be
regenerated on different platforms)
* the parser is a confined part of org-mode
I don't think this is a problem.
>>> 3) FSM implementation into code is inherently very simple anyway
>>
>> Well I've never done it for a big language but I did not find it that
>> easy. Mistakes are easily made, e.g. you end up accepting more things
>> than the language does in order to simplify the code, etc.
>
> Which is just as easily done by specifying the syntax incorrectly.
I think the fix will be shorter and simpler in the syntax because it's
easier to reason about an abstract definition when it comes to language.
When you're neck-deep in your handwritten implementation trying to
figure out what you did wrong, it can take a long time.
> No, you can (for a suitably restricted set of languages) formally prove
> that the implementation and the specification are identical for any
> input.
How would you do that programmatically?
> The assumption that an FSM running in ELisp is faster than a bunch of
> regexps has not actually been tested, or has it?
I haven't tested anything yet.
If I remember correctly, the emacs regex API doesn't provide a way to
compile patterns, so they have to be compiled at each call.
Also the underlying FSM implementation uses an NFA, which can lead to
exponential time complexity [1] for certain patterns.
1: http://swtch.com/~rsc/regexp/regexp1.html
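The blow-up described in [1] can be reproduced on a small scale. The following is only a sketch in Python, not Emacs' regex engine: it matches the pattern a?^n a^n (n optional "a"s followed by n mandatory ones) against a string of n "a"s, once with a naive backtracking matcher and once by tracking the set of reachable pattern positions. The function names and the pattern encoding are assumptions made for this illustration.

```python
def make_pattern(n):
    """Pattern a?^n a^n as a list of (char, optional) items (illustrative encoding)."""
    return [("a", True)] * n + [("a", False)] * n


def backtrack_match(pattern, text):
    """Naive backtracking matcher. Returns (matched, number_of_calls)."""
    steps = 0

    def go(i, j):
        nonlocal steps
        steps += 1
        if i == len(pattern):
            return j == len(text)
        char, optional = pattern[i]
        # Greedy: try to consume a character first, then fall back to skipping.
        if j < len(text) and text[j] == char and go(i + 1, j + 1):
            return True
        return optional and go(i + 1, j)

    return go(0, 0), steps


def closure(pattern, positions):
    """Add every position reachable by skipping optional items."""
    result = set(positions)
    stack = list(positions)
    while stack:
        i = stack.pop()
        if i < len(pattern) and pattern[i][1] and i + 1 not in result:
            result.add(i + 1)
            stack.append(i + 1)
    return result


def stateset_match(pattern, text):
    """Track all reachable pattern positions at once. Returns (matched, steps)."""
    steps = 0
    current = closure(pattern, {0})
    for ch in text:
        nxt = set()
        for i in current:
            steps += 1
            if i < len(pattern) and pattern[i][0] == ch:
                nxt.add(i + 1)
        current = closure(pattern, nxt)
    return len(pattern) in current, steps
```

Calling both functions with `make_pattern(n)` and `"a" * n` for growing n shows the backtracking call count at least doubling with each increment of n, while the state-set version stays within n * (2n + 1) steps. Whether this matters for the regexps org-mode actually uses is a separate question that only measurement in Emacs can answer.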
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-27 22:11 ` Aurélien Aptel
@ 2012-03-28 6:34 ` Achim Gratz
2012-03-29 17:50 ` Aurélien Aptel
0 siblings, 1 reply; 26+ messages in thread
From: Achim Gratz @ 2012-03-28 6:34 UTC (permalink / raw)
To: emacs-orgmode
Aurélien Aptel <aurelien.aptel@gmail.com> writes:
> Ragel is written in C++ and has no dependencies.
It depends on having a working C++ compiler (presumably with some list
of features / standard conformance).
> * every major platform has a C++ compiler
Yes, but it may not be installed. Or has the wrong version. Or
whatever.
> * ragel input along with generated code can be tracked in the repo
It is a bad idea(TM) to track both the sources and the result of a
generation from that source in the same repo. That other projects are
doing that doesn't mean we should follow their example.
> * the generated code is portable since it's elisp (doesn't need to be
> regenerated on different platforms)
> * the parser is a confined part of org-mode
>
> I don't think this is a problem.
It may not be a problem for you. It probably isn't for me. That still
doesn't mean it won't be a problem for any org-mode user. You need to
think about possible problems from the user's perspective.
>> Which is just as easily done by specifying the syntax incorrectly.
>
> I think the fix will be shorter and simpler in the syntax because it's
> easier to reason about an abstract definition when it comes to language.
> When you're neck-deep in your handwritten implementation trying to
> figure out what you did wrong, it can take a long time.
Please have a look at Nicolas' code first before making such statements.
I haven't seen ragel output, especially not in ELisp, and I don't know
how easy it will be to debug parse errors. The other thing to keep in
mind is that org-mode doesn't have a formal syntax description, much
less one that follows one of the standard grammars. This will be a much
bigger fish to fry then.
>> No, you can (for a suitably restricted set of languages) formally prove
>> that the implementation and the specification are identical for any
>> input.
>
> How would you do that programmatically?
Fundamentally? By induction.
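For regular languages, one standard way to mechanize this is the product construction: walk both automata in lockstep from their start states and check every reachable pair of states. If no reachable pair disagrees on acceptance, then by induction on input length the two automata agree on every input. The Python sketch below is an illustration under assumptions, not necessarily the procedure meant above: the DFA encoding, the names, and the toy "headline prefix" language (one or more stars, then a space, then anything) are all invented for the example.

```python
from collections import deque

ALPHABET = ["*", " ", "x"]  # "x" stands for any other character


def table(rows):
    """Flatten {state: [targets in ALPHABET order]} into a (state, symbol) map."""
    return {(state, sym): target
            for state, targets in rows.items()
            for sym, target in zip(ALPHABET, targets)}


def find_difference(dfa_a, dfa_b, alphabet):
    """Return a shortest string accepted by exactly one DFA, or None if equivalent.

    Each DFA is (start, accepting_states, transitions); transitions must be
    total, i.e. defined for every (state, symbol) pair.
    """
    start_a, accept_a, delta_a = dfa_a
    start_b, accept_b, delta_b = dfa_b
    start = (start_a, start_b)
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        (sa, sb), word = queue.popleft()
        if (sa in accept_a) != (sb in accept_b):
            return word  # counterexample: the two automata disagree here
        for sym in alphabet:
            nxt = (delta_a[(sa, sym)], delta_b[(sb, sym)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + sym))
    return None


# Specification: one or more stars, a space, then any text.
SPEC = ("start", {"body"}, table({
    "start": ["stars", "dead", "dead"],
    "stars": ["stars", "body", "dead"],
    "body":  ["body", "body", "body"],
    "dead":  ["dead", "dead", "dead"],
}))

# Hypothetical "simplified" implementation that forgot to require a star.
SLOPPY = ("start", {"body"}, table({
    "start": ["start", "body", "dead"],
    "body":  ["body", "body", "body"],
    "dead":  ["dead", "dead", "dead"],
}))
```

Here `find_difference(SPEC, SLOPPY, ALPHABET)` returns a single space, a string the sloppy version accepts but the specification rejects, while `find_difference(SPEC, SPEC, ALPHABET)` returns `None`. The check terminates because there are only finitely many state pairs, which is why it is restricted to languages recognizable by finite automata.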
>> The assumption that an FSM running in ELisp is faster than a bunch of
>> regexps has not actually been tested, or has it?
>
> I haven't tested anything yet.
> If I remember correctly, the emacs regex API doesn't provide a way to
> compile patterns, so they have to be compiled at each call.
I haven't checked. But all this happens in machine code, not ELisp, so
it is not clear whether a re-implementation of the regex engine, even
if it is vastly superior to the one Emacs uses now, would be a net win.
> Also the underlying FSM implementation uses an NFA, which can lead to
> exponential time complexity [1] for certain patterns.
That trait is shared by all regex engines that allow backreferences.
Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
SD adaptations for KORG EX-800 and Poly-800MkII V0.9:
http://Synth.Stromeko.net/Downloads.html#KorgSDada
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-28 6:34 ` Achim Gratz
@ 2012-03-29 17:50 ` Aurélien Aptel
2012-03-29 17:52 ` Samuel Wales
2012-03-29 19:04 ` Achim Gratz
0 siblings, 2 replies; 26+ messages in thread
From: Aurélien Aptel @ 2012-03-29 17:50 UTC (permalink / raw)
To: Achim Gratz; +Cc: emacs-orgmode
Hi again,
I'm going to change my proposal according to what has been said in
this discussion.
* I still want to make an Elisp backend for ragel. I understand it
won't be used in org-mode but it's a nice thing to have anyway. I hope
it's not a problem if this part of the project is not directly related
to org-mode.
* I will help Nicolas integrate his new parser.
Nicolas, what is left to do? In terms of work, is it enough/too much
for the GSoC?
I need to give precise information about how I will be evaluated for
the mid-term. See GNU GSoC guidelines, "Plan" section [1].
1: http://www.gnu.org/software/soc-projects/guidelines.html
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-29 17:50 ` Aurélien Aptel
@ 2012-03-29 17:52 ` Samuel Wales
2012-03-29 19:04 ` Achim Gratz
1 sibling, 0 replies; 26+ messages in thread
From: Samuel Wales @ 2012-03-29 17:52 UTC (permalink / raw)
To: Aurélien Aptel; +Cc: Achim Gratz, emacs-orgmode
Can Ragel be written in elisp or in portable C? I had thought that
was the idea.
--
The Kafka Pandemic: http://thekafkapandemic.blogspot.com
* Re: GSoC 2012 -- Elisp backend for Ragel
2012-03-29 17:50 ` Aurélien Aptel
2012-03-29 17:52 ` Samuel Wales
@ 2012-03-29 19:04 ` Achim Gratz
1 sibling, 0 replies; 26+ messages in thread
From: Achim Gratz @ 2012-03-29 19:04 UTC (permalink / raw)
To: emacs-orgmode
Aurélien Aptel writes:
> * I still want to make an Elisp backend for ragel. I understand it
> won't be used in org-mode but it's a nice thing to have anyway. I hope
> it's not a problem if this part of the project is not directly related
> to org-mode.
Please don't be discouraged by the discussion. If you like your
proposal you'll have to defend it. :-)
Even if org-mode won't directly use a Ragel-generated parser, for
whatever reason, it would still be good to have for equally important
things:
1. Provide a (more) formal specification for org-mode syntax.
2. Provide an alternative implementation to test against.
3. Increase interoperability with other software.
Another, maybe more immediate, application of a Ragel Elisp backend
would probably be the Semantic parsers in Emacs, especially if the
Bovine (LL) and Wisent (LALR) grammars could be directly converted.
Wisent is an Elisp port of Bison. They are both table driven parsers.
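To make "table driven" concrete, here is the general shape of such a recognizer, sketched in Python rather than Elisp. It is not Ragel or Wisent output; the states, character classes, and the simplified headline rule (one or more stars, a space, then any text) are assumptions chosen only for illustration.

```python
# Character classes keep the table small: each input character maps to a column.
STAR, SPACE, OTHER = 0, 1, 2

# States (rows) of the hypothetical headline recognizer.
START, STARS, TITLE, REJECT = 0, 1, 2, 3

# TRANSITIONS[state][char_class] -> next state
TRANSITIONS = [
    # STAR   SPACE   OTHER
    [STARS,  REJECT, REJECT],  # START: a headline must begin with a star
    [STARS,  TITLE,  REJECT],  # STARS: more stars, or the separating space
    [TITLE,  TITLE,  TITLE],   # TITLE: anything goes after "*+ "
    [REJECT, REJECT, REJECT],  # REJECT: dead state
]


def char_class(ch):
    """Map a character to its column in TRANSITIONS."""
    if ch == "*":
        return STAR
    if ch == " ":
        return SPACE
    return OTHER


def headline_level(line):
    """Return the number of leading stars if `line` is a headline, else 0."""
    state = START
    level = 0
    for ch in line:
        cls = char_class(ch)
        if state in (START, STARS) and cls == STAR:
            level += 1  # the "action" attached to the star transitions
        state = TRANSITIONS[state][cls]
        if state == REJECT:
            return 0
    return level if state == TITLE else 0
```

With this table, `headline_level("** TODO write")` yields 2, while `"*bold* text"` and `"***"` yield 0. The driver loop never changes; only the table and the attached actions depend on the grammar, which is what makes generating them from a specification attractive.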
Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
SD adaptations for Waldorf Q V3.00R3 and Q+ V3.54R2:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada