From: Sebastian Rose
Subject: Validation and text search for export
Date: Fri, 14 Nov 2008 19:38:58 +0100
Message-ID: <87bpwicful.fsf@kassiopeya.MSHEIMNETZ>
To: emacs-orgmode Org-Mode

Hi,

I have started a little PHP script for validating the output of Org's
XHTML export. Everybody is welcome to use and improve it. There are so
many corner cases in this area (Org files and XHTML export) that I cannot
find them all here myself.

Besides the validation, you get a simple text search form [1], a list of
the 10 most recently changed or added files [2], and a list of the images
referenced in your XHTML files [3].

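For illustration only, the three features listed above could be sketched roughly like this. The sketch is in Python, not the PHP of the actual script, and does not reflect how org-search.php is implemented. All function names are hypothetical. It only checks that a page parses as XML (no validation against a schema), so pages using named entities such as &nbsp; would need extra handling, and the `?OCCUR=` query-string form is an assumption about how the parameter is appended.

```python
import os
import urllib.parse
import xml.etree.ElementTree as ET

# Namespace that XHTML elements live in once parsed as XML.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def recent_files(root, limit=10, suffix=".html"):
    """Return up to `limit` file paths under `root`, newest first."""
    paths = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            if name.endswith(suffix):
                paths.append(os.path.join(dirpath, name))
    # Sort by modification time, most recently changed first.
    paths.sort(key=os.path.getmtime, reverse=True)
    return paths[:limit]


def image_sources(xhtml_text):
    """Return the src attribute of every <img> that has one."""
    tree = ET.fromstring(xhtml_text)  # raises ParseError if not well-formed
    return [img.get("src") for img in tree.iter(XHTML_NS + "img")
            if img.get("src")]


def contains_word(xhtml_text, word):
    """Case-insensitive search in the text content (markup is ignored)."""
    tree = ET.fromstring(xhtml_text)
    text = "".join(tree.itertext())
    return word.lower() in text.lower()


def occur_link(path, word):
    """Build a link to a matching file with OCCUR=WORD appended.

    Assumption: the parameter is passed as a query string.
    """
    return path + "?OCCUR=" + urllib.parse.quote(word)
```

A search page would then call `contains_word` on each exported file and emit `occur_link(path, word)` for every hit, while `recent_files` and `image_sources` feed the two lists.
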
The first raw version is located on GitHub:

  http://github.com/SebastianRose/org-search.php/tree/master

If you use the script and find parser errors, I'd be happy to receive a
copy of the Org file (private data removed or dummy data added) to add it
to the test cases found on
http://github.com/SebastianRose/orghtmlexportdata/tree/master.

Best,

  Sebastian

Footnotes:
----------------

[1] Text search is case insensitive and there are no options yet (like
    `AND' or `OR'). The links to files with matches have `OCCUR=WORD'
    appended to play well with org-info.js. Scanning the files is _very_
    slow because of the way they are validated. A local schema file could
    speed this up.

[2] A pager will be added soon, to go back in time.

[3] I plan to use this to clean up unused images in the future. A pager
    will be added to this list too. At the least, it will be a way to look
    up images and their paths if you plan to reuse them.

-- 
Sebastian Rose, EMMA STIL - mediendesign, Niemeyerstr.6, 30449 Hannover
Tel.: +49 (0)511 - 36 58 472
Fax: +49 (0)1805 - 233633 - 11044
mobil: +49 (0)173 - 83 93 417
Http: www.emma-stil.de