emacs-orgmode@gnu.org archives
* Re: mlorg : yet another parser for org-mode (Written in OCaml contained in org-mode files)
  2012-02-27 16:52     ` Eric Schulte
@ 2010-08-01 23:37       ` Simon Castellan
  2012-03-01 13:20         ` Nicolas Goaziou
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Castellan @ 2010-08-01 23:37 UTC (permalink / raw)
  To: Eric Schulte; +Cc: emacs-orgmode

On Mon. 27 Feb. (09:52), Eric Schulte wrote:
> Simon Castellan <simon.castellan@iuwt.fr> writes:
> 
> > On Mon. 27 Feb. (15:27), Alan Schmitt wrote:
> >> On 26 Feb. 2012, at 17:41, Simon Castellan wrote:
> >> 
> >> > I have been writing a parser for mlorg files in OCaml. This started as an
> >> > experiment to see if the literate programming mode of org-mode could scale to a
> >> > full application (among other things).
> >> 
> >> This looks very interesting, and would very much help in the
> >> dissemination of org-mode. Have you thought of announcing it on the
> >> caml mailing list?
> >> 
> >> Alan
> >
> > I have but prefer to wait mlorg to be more complete. This post was meant mainly
> > to gather info/document about org's syntax. (But as I said feedbacks welcome.)
> >
> 
> Hi Simon,
> 
> Nicolas Goaziou has been working recently on a new emacs-lisp parser of
> Org-mode files, with the goals of
> 1. standardizing the formal syntax of Org-mode files
> 2. parsing Org-mode files to a canonical emacs-lisp list-based
>    representation in memory (like an Org-mode AST)
> 3. re-basing the existing Org-mode exporters off of this canonical
>    representation
> 
> This work is contained in contrib/lisp/org-element.el, which includes a
> large amount of useful commentary at the top of the file.  This should
> serve as a starting point for learning more about the formal syntax of
> Org-mode files (as it is defined).  I think that developing parsers for
> this syntax in multiple language should be very useful to ensure that a
> usable syntax is developed separate from any particular implementation.
> 
> Cheers,
> 

Thank you very much for this pointer. This is what I was looking for: a list of
the syntactic constructions in org-mode. I'd say, though, that it lacks a
more-or-less formal syntactic definition of those constructions.

Simon.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* mlorg : yet another parser for org-mode (Written in OCaml contained in org-mode files)
@ 2012-02-26 16:41 Simon Castellan
  2012-02-27 14:27 ` Alan Schmitt
  2012-07-04 18:18 ` mlorg : yet another parser for org-mode (Written in OCaml) Simon Castellan
  0 siblings, 2 replies; 9+ messages in thread
From: Simon Castellan @ 2012-02-26 16:41 UTC (permalink / raw)
  To: emacs-orgmode

Hello,

I have been writing mlorg, a parser for org-mode files in OCaml. This started as
an experiment to see if the literate programming mode of org-mode could scale to
a full application (among other things).

The project is just beginning but can already "bootstrap" itself (that is, it
parses its own source and extracts the source code). Support for the syntax is
still very far from complete, though.

The goal is also to be able to convert org-mode files to LaTeX/HTML/... without
depending on emacs. Although org-mode files are just plain text, there is still
a feeling of being locked in, because the format is so complicated and there
doesn't seem to be a reference library to deal with it. I hope that more such
libraries will appear, for one main reason: to get a standard syntax we can
build upon. Knowing precisely which syntax org-mode understands is very
difficult, since no document about it exists (or I have found none). When I'm
done with the main syntactic part, I will try to document it.

Besides, I think org-mode is a wonderful editor but does a terrible job at
exporting: it is slow, emacs-specific, gives strange errors on some documents, ...

The code can be found on gitorious:
  http://gitorious.org/mlorg/mlorg

For those who would like to compile it, you will need the batteries library from
git (I hope it will be released before mlorg reaches a releasable state).

One example of a cool feature that I have added to mlorg, and that should be in
the org-mode exporter: org-mode doesn't put location annotations (à la cpp) in
the extracted code, so compilers cannot report correct line numbers; mlorg does.
This is very helpful when compiling quite long files.
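
A minimal sketch of what such location annotations could look like when the
extracted code is OCaml, using the compiler's `# LINE "FILE"` line directive.
The block record and the with_location function are hypothetical names chosen
for illustration, not mlorg's actual types:

#+begin_src ocaml
(* A source block extracted from an org file: where it starts and its lines.
   The record and its field names are hypothetical. *)
type block = { file : string; first_line : int; lines : string list }

(* Prefix the block with an OCaml line directive so that the compiler
   reports errors at positions in the original org file. *)
let with_location (b : block) : string =
  let directive = Printf.sprintf "# %d %S" b.first_line b.file in
  String.concat "\n" (directive :: b.lines) ^ "\n"

let example =
  with_location
    { file = "parser.org";
      first_line = 120;
      lines = [ "let x = 1"; "let y = x + 1" ] }
#+end_src

With the directive in place, an error on the second line of the block would be
reported against line 121 of parser.org rather than against the generated file.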

The point of this message is mainly to attract people interested in testing or
even contributing (I would be very glad: there is so much to do). But I also
hope to make the org-mode community think about standardizing the syntax used
in org-mode, to ease the work of parser maintainers. There is no README yet,
but the mlorg binary doesn't do much so far and the code should be
self-documenting (I hope so).

Simon.

* Re: mlorg : yet another parser for org-mode (Written in OCaml contained in org-mode files)
  2012-02-26 16:41 mlorg : yet another parser for org-mode (Written in OCaml contained in org-mode files) Simon Castellan
@ 2012-02-27 14:27 ` Alan Schmitt
  2012-02-27 17:06   ` Simon Castellan
  2012-07-04 18:18 ` mlorg : yet another parser for org-mode (Written in OCaml) Simon Castellan
  1 sibling, 1 reply; 9+ messages in thread
From: Alan Schmitt @ 2012-02-27 14:27 UTC (permalink / raw)
  To: Simon Castellan; +Cc: emacs-orgmode

On 26 Feb. 2012, at 17:41, Simon Castellan wrote:

> I have been writing a parser for mlorg files in OCaml. This started as an
> experiment to see if the literate programming mode of org-mode could scale to a
> full application (among other things).

This looks very interesting, and would very much help in the dissemination of org-mode. Have you thought of announcing it on the caml mailing list?

Alan

* Re: mlorg : yet another parser for org-mode (Written in OCaml contained in org-mode files)
  2012-02-27 17:06   ` Simon Castellan
@ 2012-02-27 16:52     ` Eric Schulte
  2010-08-01 23:37       ` Simon Castellan
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Schulte @ 2012-02-27 16:52 UTC (permalink / raw)
  To: Simon Castellan; +Cc: Nicolas Goaziou, Alan Schmitt, emacs-orgmode

Simon Castellan <simon.castellan@iuwt.fr> writes:

> On Mon. 27 Feb. (15:27), Alan Schmitt wrote:
>> On 26 Feb. 2012, at 17:41, Simon Castellan wrote:
>> 
>> > I have been writing a parser for mlorg files in OCaml. This started as an
>> > experiment to see if the literate programming mode of org-mode could scale to a
>> > full application (among other things).
>> 
>> This looks very interesting, and would very much help in the
>> dissemination of org-mode. Have you thought of announcing it on the
>> caml mailing list?
>> 
>> Alan
>
> I have but prefer to wait mlorg to be more complete. This post was meant mainly
> to gather info/document about org's syntax. (But as I said feedbacks welcome.)
>

Hi Simon,

Nicolas Goaziou has been working recently on a new emacs-lisp parser of
Org-mode files, with the goals of
1. standardizing the formal syntax of Org-mode files
2. parsing Org-mode files to a canonical emacs-lisp list-based
   representation in memory (like an Org-mode AST)
3. re-basing the existing Org-mode exporters off of this canonical
   representation

This work is contained in contrib/lisp/org-element.el, which includes a
large amount of useful commentary at the top of the file.  This should
serve as a starting point for learning more about the formal syntax of
Org-mode files (as it is defined).  I think that developing parsers for
this syntax in multiple language should be very useful to ensure that a
usable syntax is developed separate from any particular implementation.

Cheers,

>
> Simon.
>

-- 
Eric Schulte
http://cs.unm.edu/~eschulte/

* Re: mlorg : yet another parser for org-mode (Written in OCaml contained in org-mode files)
  2012-02-27 14:27 ` Alan Schmitt
@ 2012-02-27 17:06   ` Simon Castellan
  2012-02-27 16:52     ` Eric Schulte
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Castellan @ 2012-02-27 17:06 UTC (permalink / raw)
  To: Alan Schmitt; +Cc: emacs-orgmode

On Mon. 27 Feb. (15:27), Alan Schmitt wrote:
> On 26 Feb. 2012, at 17:41, Simon Castellan wrote:
> 
> > I have been writing a parser for mlorg files in OCaml. This started as an
> > experiment to see if the literate programming mode of org-mode could scale to a
> > full application (among other things).
> 
> This looks very interesting, and would very much help in the dissemination of org-mode. Have you thought of announcing it on the caml mailing list?
> 
> Alan

I have, but I prefer to wait for mlorg to be more complete. This post was mainly
meant to gather information and documentation about org's syntax. (But, as I
said, feedback is welcome.)

Simon.

* Re: mlorg : yet another parser for org-mode (Written in OCaml contained in org-mode files)
  2010-08-01 23:37       ` Simon Castellan
@ 2012-03-01 13:20         ` Nicolas Goaziou
  2012-03-01 15:44           ` Simon Castellan
  0 siblings, 1 reply; 9+ messages in thread
From: Nicolas Goaziou @ 2012-03-01 13:20 UTC (permalink / raw)
  To: Simon Castellan; +Cc: emacs-orgmode, Eric Schulte

Hello,

Simon Castellan <simon.castellan@iuwt.fr> writes:

> Thank you very much for this pointer, This is what I was looking for :
> a list of syntaxic construction in org-mode. I'd say though that it
> lacks a more-or-less formal syntaxic definition of constructions.

It lacks that, indeed, among many other things. On the other hand, it's
a work in progress, so I guess that explains why.

I had postponed such a definition of constructions, since the model used
to describe the Org format wasn't (and still isn't) complete. Actually,
the list that you can observe in org-element.el will change a little
during the next few weeks.

Anyway, I agree that a document formally describing each element/object
in Org has to be written at some point. Even if I think it's one or two
months too early for that task, I'll gladly offer my help if you decide
to undertake it nonetheless. Do not hesitate to ask if you need more
information.


Regards,

-- 
Nicolas Goaziou

* Re: mlorg : yet another parser for org-mode (Written in OCaml contained in org-mode files)
  2012-03-01 13:20         ` Nicolas Goaziou
@ 2012-03-01 15:44           ` Simon Castellan
  2012-03-01 19:18             ` Nicolas Goaziou
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Castellan @ 2012-03-01 15:44 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode, Eric Schulte

Hello,

Thanks for your answer. I agree that a description of org's syntax would be
better in a separate document. For now I am rebasing my parser on your
categories (I must say I was missing a lot of them). Please let me know when you
change your syntactic categories (by change, do you mean additions only, or
removals as well?). I will try to document, in my sources, the meaning and
(very) informal syntax of the constructions I handle.

Besides, what are "export snippets"? I can't find a reference to them in the manual.

Simon

* Re: mlorg : yet another parser for org-mode (Written in OCaml contained in org-mode files)
  2012-03-01 15:44           ` Simon Castellan
@ 2012-03-01 19:18             ` Nicolas Goaziou
  0 siblings, 0 replies; 9+ messages in thread
From: Nicolas Goaziou @ 2012-03-01 19:18 UTC (permalink / raw)
  To: Simon Castellan; +Cc: emacs-orgmode, Eric Schulte

Simon Castellan <simon.castellan@iuwt.fr> writes:

> For now I am rebasing my parser on your categories (I must say I was
> lacking a lot). Please let me know when you change your syntaxic
> categories (by change you mean additions only or removals as well ?).

I have a couple additions in mind: I'd like to refine table parsing.
I'll probably add "table-row" and "table-cell" elements. I'd like to
introduce a new type of drawers too, but that's another story.

There's no removal in sight, though.

I'll keep you informed on changes in that area.

> I will try in my sources to document meanings and (very) informal
> syntax of handled constructions.

This could be a starter for the complete document to come.

> Besides, what are "export snippets" ? I can't find a reference to it
> in the manual.

They're an experimental syntax I introduced to replace and generalize
HTML tags (@<tag>). They are, more or less, the inline counterpart of
export blocks. For example, one should be able to use @html{<tag>}, but
also @latex{\hfill{}}, etc. and back-ends filter out all but one
category.
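
For a parser written in OCaml, the filtering described above could be sketched
roughly as follows. The inline type, its constructors and the render function
are hypothetical stand-ins, not part of org-element.el or mlorg:

#+begin_src ocaml
(* Hypothetical inline objects: plain text, or a snippet tagged with the
   back-end it is meant for. *)
type inline =
  | Text of string
  | Export_snippet of string * string  (* back-end name, raw contents *)

(* Render a line for one back-end: text is kept, snippets for the current
   back-end are inserted verbatim, all other snippets are dropped. *)
let render ~backend (objs : inline list) : string =
  objs
  |> List.map (function
       | Text s -> s
       | Export_snippet (b, raw) when b = backend -> raw
       | Export_snippet _ -> "")
  |> String.concat ""

let line =
  [ Text "left";
    Export_snippet ("latex", "\\hfill{}");
    Export_snippet ("html", "<br/>");
    Text "right" ]

let as_html = render ~backend:"html" line
let as_latex = render ~backend:"latex" line
#+end_src

Here the html back-end keeps only the <br/> snippet and the latex back-end
keeps only the \hfill{} one.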


Regards,

-- 
Nicolas Goaziou

* Re: mlorg : yet another parser for org-mode (Written in OCaml)
  2012-02-26 16:41 mlorg : yet another parser for org-mode (Written in OCaml contained in org-mode files) Simon Castellan
  2012-02-27 14:27 ` Alan Schmitt
@ 2012-07-04 18:18 ` Simon Castellan
  1 sibling, 0 replies; 9+ messages in thread
From: Simon Castellan @ 2012-07-04 18:18 UTC (permalink / raw)
  To: emacs-orgmode

Hello again,

Four months have passed and a lot of progress has been made.  First, I
removed the literate programming layer, as it was getting too much in
the way.

Second, the support for the syntax has been greatly improved and now
covers almost all constructions mentioned in org-element.el. For most
documents it should be OK, I guess, but I don't know which org features
are the most used.

To debug, and to help mlorg talk to other languages, I coded an XML
backend which dumps the structure of the file as an XML tree.
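
As a rough illustration of dumping a document tree as XML, here is a small
self-contained sketch. The node type and the element names are assumptions
made for the example; the real backend's types and output format may differ:

#+begin_src ocaml
(* A toy document tree, much simpler than a real org document. *)
type node =
  | Heading of string * node list  (* title, children *)
  | Paragraph of string

(* Escape the characters that are special in XML text and attributes. *)
let escape s =
  let b = Buffer.create (String.length s) in
  String.iter
    (function
      | '<' -> Buffer.add_string b "&lt;"
      | '>' -> Buffer.add_string b "&gt;"
      | '&' -> Buffer.add_string b "&amp;"
      | '"' -> Buffer.add_string b "&quot;"
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

(* Dump a node as an XML element, recursing into the children of headings. *)
let rec to_xml = function
  | Paragraph text -> Printf.sprintf "<paragraph>%s</paragraph>" (escape text)
  | Heading (title, children) ->
    Printf.sprintf "<heading title=\"%s\">%s</heading>" (escape title)
      (String.concat "" (List.map to_xml children))

let xml = to_xml (Heading ("Contacts", [ Paragraph "a < b" ]))
#+end_src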


----


What is more interesting to me, and the reason I started mlorg in the
first place, is the quote backend. This backend lets you pick out
a code block in your file (OCaml only for now) and feed it the whole
document as a tree. This code can then extract the particular
information you want. For instance, I have at the end of my contacts.org
this little snippet that exports the contacts as mutt aliases.

(F stands for filter, D for document, and |- is function composition,
as the code is written in point-free style: the argument
isn't explicitly mentioned.)
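
Before the snippet itself, here is a standalone sketch of the point-free
style, for readers who don't know it. It assumes |- composes left to right
(which the order of the steps in the snippet suggests) and defines it locally;
the contact record and the aliases function are toy stand-ins for the F and D
modules, not their actual API:

#+begin_src ocaml
(* Left-to-right composition, defined locally: (f |- g) x = g (f x). *)
let ( |- ) f g x = g (f x)

(* Toy stand-in for a heading: a name and its properties. *)
type contact = { name : string; props : (string * string) list }

let underscore = String.map (fun c -> if c = ' ' then '_' else c)

(* Point-free pipeline: keep the entries that have an EMAIL property,
   format each as a mutt alias, then join the lines. *)
let aliases : contact list -> string =
  List.filter (fun c -> List.mem_assoc "EMAIL" c.props)
  |- List.map (fun c ->
         Printf.sprintf "alias %s %s\n" (underscore c.name)
           (List.assoc "EMAIL" c.props))
  |- String.concat ""

let out =
  aliases
    [ { name = "Ada Lovelace"; props = [ ("EMAIL", "ada@example.org") ] };
      { name = "No Mail"; props = [] } ]
#+end_src

The actual snippet from contacts.org follows.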

#+name:export
#+begin_src ocaml
let replace = Str.global_replace (Str.regexp " ") "_" in
F.run (F.has_property (F.s "EMAIL")) |-
List.map (fun d -> sprintf "alias %s %s\n" (D.name d |> replace) (D.prop_val_ "EMAIL" d)) |-
String.concat "" |-
write
#+end_src

With this, I just need to do

  $ mlorg --filename contacts.org --backend quote

to get my mutt aliases. With this quote feature I plan to let the user
override the HTML/LaTeX exporters by means of inheritance. For
instance, suppose the user has blocks like this in his document:

#+begin_lemma
Some lemma.
#+end_lemma

If he wants to export it in a specific way in HTML, he can put this at the end of his document:

#+name:export
#+begin_src ocaml
let exporter = object(self)
  inherit htmlExporter as super
  method block = function
   | Custom ("lemma", name, contents) ->
     Xml.block "div" ~attr:["class", "lemma"]
       (Xml.data (name ^ " — ") ::
        self#blocks contents)
   | block -> super#block block
end
in exporter#document
#+end_src

(It doesn't work yet but soon will)
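
In the meantime, the mechanism can be tried out with a self-contained sketch.
Everything here (the block type, the html_exporter class and its string
output) is a hypothetical stand-in written for this example, not mlorg's
actual exporter:

#+begin_src ocaml
(* Hypothetical block type, much simpler than a real document. *)
type block =
  | Paragraph of string
  | Custom of string * string * block list  (* kind, name, contents *)

(* Hypothetical base exporter producing HTML as a plain string. *)
class html_exporter =
  object (self)
    method block : block -> string =
      function
      | Paragraph text -> "<p>" ^ text ^ "</p>"
      | Custom (kind, _, contents) ->
        Printf.sprintf "<div class=\"%s\">%s</div>" kind (self#blocks contents)

    method blocks (bs : block list) : string =
      String.concat "" (List.map self#block bs)
  end

(* Override only the lemma case and defer everything else to the parent. *)
let exporter =
  object (self)
    inherit html_exporter as super

    method! block =
      function
      | Custom ("lemma", name, contents) ->
        Printf.sprintf "<div class=\"lemma\"><b>%s.</b> %s</div>" name
          (self#blocks contents)
      | b -> super#block b
  end

let html =
  exporter#block (Custom ("lemma", "Zorn", [ Paragraph "Some lemma." ]))
#+end_src

Because blocks calls self#block, nested lemma blocks also go through the
overridden method.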

---

I wrote a short README available here:

  http://kiwi.iuwt.fr/~asmanur/projets/mlorg/

(This shows that the html backend is pretty basic.)

It briefly comments on every construction of the syntax I support.


Performance-wise, it is not optimized at all and is as such quite slow. When
processing this file, http://doc.norang.ca/org-mode.org, on my computer, the
bytecode version is as fast as org-mode and the native version is about
5-6x faster (tested quickly).

--

What I plan to do next:
- complete the syntax as much as possible
- improve the html & latex backends
- try to be a little faster
- have an agenda backend as well
- implement other languages?


Simon.

Thread overview: 9+ messages
2012-02-26 16:41 mlorg : yet another parser for org-mode (Written in OCaml contained in org-mode files) Simon Castellan
2012-02-27 14:27 ` Alan Schmitt
2012-02-27 17:06   ` Simon Castellan
2012-02-27 16:52     ` Eric Schulte
2010-08-01 23:37       ` Simon Castellan
2012-03-01 13:20         ` Nicolas Goaziou
2012-03-01 15:44           ` Simon Castellan
2012-03-01 19:18             ` Nicolas Goaziou
2012-07-04 18:18 ` mlorg : yet another parser for org-mode (Written in OCaml) Simon Castellan
