From: Peter Davis <pfd@pfdstudio.com>
To: emacs-orgmode@gnu.org
Subject: Re: Importing from Oddmuse?
Date: Mon, 28 Oct 2013 11:01:30 -0400
Message-ID: <526E7C4A.7050804@pfdstudio.com>
In-Reply-To: <CAE-e6gkqaWRf=QGHjYXoTR9BjgE+98mecY_w4w9Dc+uzPTCw-Q@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1405 bytes --]
Just to answer my own question, I shamelessly took Alex Schroeder's
raw.pl script and hacked it up a bit to do some conversion from Oddmuse
markup to org-mode. The attached Perl script should run through all the
pages in an Oddmuse Wiki and generate .org versions of them in a
separate directory.
This is still very much a work in progress, but I think the general
framework is useful. One thing I still have to fix is the hyperlinks.
Right now, if the Wiki page is "one two.pg", this script will generate a
file named "one_two.org", but any links will still refer to
"[[file:one two.org][one two]]".
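A minimal sketch of one way the link targets could be fixed, assuming the generated files always use underscores where the page title has spaces (as described above). The helper name convert_links is hypothetical and is not part of the attached script; it would stand in for the single "hyperlinks" substitution in FixMarkUp.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the attached script): rewrite Oddmuse
# [[free links]] as Org file links. Spaces become underscores in the
# file name only; the link description keeps the original title.
sub convert_links {
    my ($line) = @_;
    $line =~ s{\[\[([^\]]*)\]\]}{
        my $title = $1;
        (my $target = $title) =~ tr/ /_/;   # assumed file naming: one_two.org
        "[[file:${target}.org][${title}]]";
    }ge;
    return $line;
}

print convert_links("See [[one two]] and [[three]]."), "\n";
```

The /e modifier evaluates the replacement as Perl code, which is what allows the target and the description to be built differently from the same captured title.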
I concentrated on the small subset of Oddmuse markup that I'm using, but
I think it's easily extensible.
Let me know if this is at all useful to anyone else.
-pd
On 10/25/13 10:54 AM, Peter Davis wrote:
> I'm comparatively new to Org mode (actually, I've used it for years,
> but only a small subset of its functionality). I've used Oddmuse for
> years to maintain my own personal Wiki, but now I'm looking to move to
> Org mode.
>
> I know there are lots of tools for exporting or publishing from Org
> mode to Oddmuse, but how about the other direction? Any tools or tips
> for importing a large number of Oddmuse pages into Org mode? Ideally,
> I'd like to keep them as separate files, with links converted to file
> links, etc.
>
--
Peter Davis
The Tech Curmudgeon
www.techcurmudgeon.com
[-- Attachment #2: om2org.pl --]
[-- Type: text/x-perl-script, Size: 3893 bytes --]
#! /usr/bin/perl -w
# Copyright (C) 2005, 2007 Alex Schroeder <alex@emacswiki.org>
#
# Portions copyright (c) 2013, Peter Davis <pfd@pfdstudio.com>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
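# Split the contents of a page file into key/value pairs. Each entry has
# the form "key: value"; a value runs until the next line that does not
# start with a space or tab, and the tab that follows each embedded
# newline is removed.
sub ParseData {
    my $data = shift;
    my %result;
    while ($data =~ /(\S+?): (.*?)(?=\n[^ \t]|\Z)/sg) {
        my ($key, $value) = ($1, $2);
        $value =~ s/\n\t/\n/g;
        $result{$key} = $value;
    }
    return %result;
}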
sub FixMarkUp {
    my $data = shift;
    my $orgout = "#+STARTUP: showeverything logdone\n#+options: num:nil\n\n";
    my $csvMode = 0;
    foreach (split /\n/, $data) {
        if (length($_)) {
            s/\r//g;
            # csv tables
            if ($_ =~ /<csv>/) {
                $csvMode = 1;
                s/<csv>/#+ATTR_HTML: :border 2 :rules all :frame border/g;
            } elsif ($_ =~ /^\s*$/) {
                $csvMode = 0;
            } elsif ($csvMode) {
                s/^/|/g;
                s/,/|/g;
                s/$/|/g;
            }
            # hyperlinks
            s/\[\[([^]]*)\]\]/[[file:$1.org][$1]]/g;
            # strike through
            s/<\/?s>/+/g;
            # verse
            s/:::/#+BEGIN_VERSE/g;
            # bold and italic
            s/'''/*/g;
            s/''/\//g;
            # bullet lists
            s/^\*\*\*\*/ */g;
            s/^\*\*\*/ */g;
            s/^\*\*/ */g;
            s/^\*/ */g;
            # headers
            s/^\=\=\=\=/****/g;
            s/^\=\=\=/***/g;
            s/^\=\=/**/g;
            s/^\=/*/g;
            # s/ \=?$//g;
            s/ \=\=\=\=$//g;
            s/ \=\=\=$//g;
            s/ \=\=$//g;
            s/ \=$//g;
            # numbered lists
            s/^# / 1. /g;
        } else {
            $csvMode = 0;
        }
        $orgout = $orgout . $_ . "\n";
    }
    return $orgout;
}
sub main {
    my ($regexp, $PageDir, $OrgDir) = @_;
    # include dotfiles!
    local $/ = undef; # Read complete files
    foreach my $file (glob("$PageDir/*/*.pg $PageDir/*/.*.pg")) {
        next unless $file =~ m|/.*/(.+)\.pg$|;
        my $page = $1;
        next if $regexp && $page !~ m|$regexp|o;
        $page = $page . ".org";
        mkdir($OrgDir) or die "Cannot create $OrgDir directory: $!"
            unless -d $OrgDir;
        open(F, '<', $file) or die "Cannot read $file: $!";
        my $data = <F>;
        close(F);
        # modification time of an existing .org file, if any
        my $ts = (stat("$OrgDir/$page"))[9];
        my %result1 = ParseData($data);
        my $result2 = FixMarkUp($result1{text});
        if ($ts && $ts == $result1{ts}) {
            print "skipping $page because it is up to date\n" if $verbose;
        } else {
            my $old = defined $ts ? $ts : '(no file)';
            print "writing $page because $old != $result1{ts}\n" if $verbose;
            open(F, '>', "$OrgDir/$page") or die "Cannot write $page org file: $!";
            # print F $result1{text};
            print F $result2;
            close(F);
            utime $result1{ts}, $result1{ts}, "$OrgDir/$page"; # touch file
        }
    }
}
use Getopt::Long;
my $regexp = undef;
my $page = 'page';
my $dir = 'org';
# $verbose is read in main() but could not be set before; expose it as a flag
GetOptions ("regexp=s" => \$regexp,
            "page=s" => \$page,
            "dir=s" => \$dir,
            "verbose" => \$verbose,
            "help" => \$help);
if ($help) {
    print qq{
Usage: $0 [--regexp REGEXP] [--page DIR] [--dir DIR] [--verbose]
Converts the wiki pages to Org files.
--regexp selects a subset of pages whose names match the regular
    expression. Note that spaces have been translated to underscores.
--page designates the page directory. By default this is 'page' in the
    current directory. If you run this script in your data directory,
    the default should be fine.
--dir designates an output directory. By default this is 'org' in the
    current directory.
--verbose reports each page as it is written or skipped.
Example: $0 --regexp '\\.el\$' --dir elisp
}
} else {
    main ($regexp, $page, $dir);
}
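To illustrate what ParseData expects, here is a small self-contained sketch. The sub repeats the same regular expression as the script so the example runs on its own; the sample record (its timestamp and text) is made up, and its layout is only inferred from that regular expression, not taken from Oddmuse documentation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same logic as ParseData in the script, repeated here so the example
# is self-contained.
sub ParseData {
    my $data = shift;
    my %result;
    while ($data =~ /(\S+?): (.*?)(?=\n[^ \t]|\Z)/sg) {
        my ($key, $value) = ($1, $2);
        $value =~ s/\n\t/\n/g;   # drop the tab that marks a continuation line
        $result{$key} = $value;
    }
    return %result;
}

# Made-up record: two keys, with the multi-line "text" value continued
# on tab-indented lines.
my $sample = "ts: 1382972490\n"
           . "text: = Title =\n"
           . "\tfirst line\n"
           . "\tsecond line\n";

my %page = ParseData($sample);
foreach my $key (sort keys %page) {
    print "[$key]\n$page{$key}\n";
}
```

The "ts" value is what main() compares against the modification time of the existing .org file, and the "text" value is what gets passed to FixMarkUp.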