From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Davis
Subject: Re: Importing from Oddmuse?
Date: Mon, 28 Oct 2013 11:01:30 -0400
Message-ID: <526E7C4A.7050804@pfdstudio.com>
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------040606060106040103050908"
Return-path:
Received: from eggs.gnu.org ([2001:4830:134:3::10]:48646) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VaoK5-0007zq-Sh for emacs-orgmode@gnu.org; Mon, 28 Oct 2013 11:01:45 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VaoK0-00032h-Bf for emacs-orgmode@gnu.org; Mon, 28 Oct 2013 11:01:41 -0400
Received: from out4-smtp.messagingengine.com ([66.111.4.28]:36755) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VaoK0-00032Y-7N for emacs-orgmode@gnu.org; Mon, 28 Oct 2013 11:01:36 -0400
In-Reply-To:
List-Id: "General discussions about Org-mode."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
To: emacs-orgmode@gnu.org

This is a multi-part message in MIME format.
--------------040606060106040103050908
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Just to answer my own question, I shamelessly took Alex Schroeder's
raw.pl script and hacked it up a bit to do some conversion from Oddmuse
markup to org-mode. The attached Perl script should run through all the
pages in an Oddmuse Wiki and generate .org versions of them in a
separate directory.

This is still very much a work in progress, but I think the general
framework is useful. One thing I have to fix is the hyperlinks. Right
now, if the Wiki page is "one two.pg", this script will generate a file
named "one_two.org", but any links will refer to
"[[file:one two.org][one two]]".

I concentrated on the small subset of Oddmuse markup that I'm using,
but I think it's easily extensible.

Let me know if this is at all useful to anyone else.

-pd

On 10/25/13 10:54 AM, Peter Davis wrote:
> I'm comparatively new to Org mode (actually, I've used it for years,
> but only a small subset of its functionality). I've used Oddmuse for
> years to maintain my own personal Wiki, but now I'm looking to move to
> Org mode.
>
> I know there are lots of tools for exporting or publishing from Org
> mode to Oddmuse, but how about the other direction? Any tools or tips
> for importing a large number of Oddmuse pages into Org mode? Ideally,
> I'd like to keep them as separate files, with links converted to file
> links, etc.
>

--
Peter Davis
The Tech Curmudgeon
www.techcurmudgeon.com

--------------040606060106040103050908
Content-Type: text/x-perl-script; name="om2org.pl"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="om2org.pl"

#! /usr/bin/perl -w
# Copyright (C) 2005, 2007 Alex Schroeder
#
# Portions copyright (c) 2013, Peter Davis
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .

# Parse an Oddmuse page file ("key: value" records, continuation lines
# start with a tab) into a hash.
sub ParseData {
  my $data = shift;
  my %result;
  while ($data =~ /(\S+?): (.*?)(?=\n[^ \t]|\Z)/sg) {
    my ($key, $value) = ($1, $2);
    $value =~ s/\n\t/\n/g;
    $result{$key} = $value;
  }
  return %result;
}

# Convert Oddmuse markup to Org markup, one line at a time.
sub FixMarkUp {
  my $data = shift;
  my $orgout = "#+STARTUP: showeverything logdone\n#+options: num:nil\n\n";
  my $csvMode = 0;

  foreach (split /\n/, $data) {
    if (length($_)) {
      s/\r//g;

      # csv tables
      if ($_ =~ //) {
        $csvMode = 1;
        s//#+ATTR_HTML: :border 2 :rules all :frame border/g;
      } elsif ($_ =~ /^\s*$/) {
        $csvMode = 0;
      } elsif ($csvMode) {
        s/^/|/g;
        s/,/|/g;
        s/$/|/g;
      }

      # hyperlinks
      s/\[\[([^]]*)\]\]/[[file:$1.org][$1]]/g;

      # strike through
      s/<\/?s>/+/g;

      # verse
      s/:::/#+BEGIN_VERSE/g;

      # bold and italic
      s/'''/*/g;
      s/''/\//g;

      # bullet lists
      s/^\*\*\*\*/ */g;
      s/^\*\*\*/ */g;
      s/^\*\*/ */g;
      s/^\*/ */g;

      # headers
      s/^\=\=\=\=/****/g;
      s/^\=\=\=/***/g;
      s/^\=\=/**/g;
      s/^\=/*/g;
      # s/ \=?$//g;
      s/ \=\=\=\=$//g;
      s/ \=\=\=$//g;
      s/ \=\=$//g;
      s/ \=$//g;

      # numbered lists
      s/^# / 1. /g;
    } else {
      $csvMode = 0;
    }
    $orgout = $orgout . $_ . "\n";
  }
  return $orgout;
}

sub main {
  my ($regexp, $PageDir, $OrgDir) = @_;
  # include dotfiles!
  local $/ = undef;   # Read complete files
  foreach my $file (glob("$PageDir/*/*.pg $PageDir/*/.*.pg")) {
    next unless $file =~ m|/.*/(.+)\.pg$|;
    my $page = $1;
    next if $regexp && $page !~ m|$regexp|o;
    $page = $page . ".org";
    mkdir($OrgDir) or die "Cannot create $OrgDir directory: $!"
      unless -d $OrgDir;
    open(F, $file) or die "Cannot read $page file: $!";
    my $data = <F>;   # slurp the whole page file
    close(F);
    my $ts = (stat("$OrgDir/$page"))[9];
    my %result1 = ParseData($data);
    my $result2 = FixMarkUp($result1{text});
    if ($ts && $ts == $result1{ts}) {
      print "skipping $page because it is up to date\n" if $verbose;
    } else {
      print "writing $page because $ts != $result1{ts}\n" if $verbose;
      open(F,"> $OrgDir/$page") or die "Cannot write $page org file: $!";
      # print F $result1{text};
      print F $result2;
      close(F);
      utime $result1{ts}, $result1{ts}, "$OrgDir/$page"; # touch file
    }
  }
}

use Getopt::Long;
my $regexp = undef;
my $page = 'page';
my $dir = 'org';
GetOptions ("regexp=s" => \$regexp,
            "page=s" => \$page,
            "dir=s" => \$dir,
            "help" => \$help);

if ($help) {
  print qq{
Usage: $0 [--regexp REGEXP] [--page DIR] [--dir DIR]

Writes the org wiki text into plain text files.

--regexp selects a subset of pages whose names match the regular
         expression. Note that spaces have been translated to underscores.

--page designates the page directory. By default this is 'page' in the
       current directory. If you run this script in your data directory,
       the default should be fine.

--dir designates an output directory. By default this is 'org' in the
      current directory.

Example: $0 --regexp '\\.el\$' --dir elisp
}
} else {
  main ($regexp, $page, $dir);
}

--------------040606060106040103050908--
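
The message notes that links still point at "one two.org" while the generated file is named "one_two.org". One possible way to close that gap is sketched below. It is not part of om2org.pl: the helper names org_link and oddmuse_links_to_org are hypothetical, and the sketch assumes that the only difference between a page title and its file name is spaces becoming underscores, as described in the message.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build an Org file link for one page title.
# Assumption: the file name is the title with spaces turned into underscores.
sub org_link {
    my ($title) = @_;
    (my $target = $title) =~ tr/ /_/;
    return "[[file:$target.org][$title]]";
}

# Hypothetical helper: rewrite every [[free link]] in a line of Oddmuse
# markup, keeping the original title as the link description.
sub oddmuse_links_to_org {
    my ($line) = @_;
    $line =~ s/\[\[([^\]]*)\]\]/org_link($1)/ge;
    return $line;
}

print oddmuse_links_to_org("See [[one two]] for details."), "\n";
# prints: See [[file:one_two.org][one two]] for details.
```

If this approach suits the wiki, the same substitution could stand in for the "hyperlinks" line in FixMarkUp. Links that already carry a description or point outside the wiki would need separate handling.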