* Making ePub books
@ 2011-12-11 6:59 Alan L Tyree
2011-12-11 7:07 ` Nick Dokos
0 siblings, 1 reply; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11 6:59 UTC (permalink / raw)
To: emacs-orgmode
Debian Squeeze; org 7.7; emacs 23.2.1
I am back to trying to make ePub books from org articles/books. I am
working on a book which currently produces about 100 pages in LaTeX
export. It will be about 200 pages when finished.
ePub uses XHTML for the main content. So, I export the org file to
HTML. It verifies as a valid XHTML1.0 file at the w3c verification
site: http://validator.w3.org/
OK. Then wrap it up in the mess that is the ePub specification. It
actually reads OK in FBReader and in Iceweasel with the ePub add on,
BUT it does not validate. There are several problems, but most of the
errors involve the "name" attribute. For example:
<h2 id="history"><a name="sec-1" id="sec-1"></a><span class="section-number-2">1</span> History</h2>
ePub does not like the name in there. Wipe out all the name="xxx" and
the problem goes away. Everything else still works.
I know that I can do a post export clean up of the XHTML file, but I
wonder if this is set in some variable that I cannot find.
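A post-export clean-up could be sketched in Emacs Lisp roughly as
follows. This is a minimal sketch, not something shipped with org: the
function names `my-strip-anchor-names' and `my-strip-anchor-names-in-file'
are hypothetical, and the regexp assumes the exporter writes name="..."
with double quotes inside <a> tags, as in the example above.
,---- sketch: post-export clean-up (hypothetical helper names)
| (defun my-strip-anchor-names (xhtml)
|   "Return XHTML with name=\"...\" attributes removed from <a> tags."
|   ;; Group 1 keeps the start of the tag; the name attribute and the
|   ;; whitespace in front of it are dropped.
|   (replace-regexp-in-string
|    "\\(<a\\b[^>]*?\\)[ \t\n]+name=\"[^\"]*\""
|    "\\1"
|    xhtml))
|
| (defun my-strip-anchor-names-in-file (file)
|   "Rewrite FILE in place with name attributes stripped from <a> tags."
|   (with-temp-buffer
|     (insert-file-contents file)
|     (let ((clean (my-strip-anchor-names (buffer-string))))
|       (erase-buffer)
|       (insert clean))
|     (write-region (point-min) (point-max) file)))
`----
The match is restricted to <a> tags on purpose, since name is still a
legitimate attribute on other elements such as <meta>.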
And, as a general question, why have both name="sec-1" and id="sec-1"
in the same element?
I would like to automate everything to go from org to ePub. It doesn't
seem too hard, but I'm a legal academic, not a programmer :-). Any
pointers appreciated.
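For the packaging step, one possible approach is to drive the zip
command from Emacs. A sketch under stated assumptions: the directory
layout (mimetype, META-INF/, OEBPS/), the output name book.epub and the
function name `my-zip-epub' are all hypothetical, and an Info-ZIP
compatible zip program is assumed to be on the PATH. The one requirement
relied on here is that the ePub container wants the mimetype file first
in the archive and stored uncompressed.
,---- sketch: zipping an unpacked ePub tree (hypothetical names)
| (defun my-zip-epub (dir)
|   "Zip the unpacked ePub tree in DIR into DIR/book.epub."
|   (let ((default-directory
|           (file-name-as-directory (expand-file-name dir))))
|     ;; mimetype goes in first, stored (-0) and without extra fields (-X).
|     (call-process "zip" nil nil nil "-X0" "book.epub" "mimetype")
|     ;; The remaining files are added recursively with compression.
|     (call-process "zip" nil nil nil "-Xr9D" "book.epub"
|                   "META-INF" "OEBPS")))
`----
zip updates an existing archive rather than replacing it, so any old
book.epub should probably be deleted before calling this.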
Cheers,
Alan
--
Alan L Tyree http://www2.austlii.edu.au/~alan
Tel: 04 2748 6206 sip:172385@iptel.org
* Re: Making ePub books
2011-12-11 6:59 Making ePub books Alan L Tyree
@ 2011-12-11 7:07 ` Nick Dokos
2011-12-11 7:25 ` Alan L Tyree
0 siblings, 1 reply; 13+ messages in thread
From: Nick Dokos @ 2011-12-11 7:07 UTC (permalink / raw)
To: Alan L Tyree; +Cc: nicholas.dokos, emacs-orgmode
Alan L Tyree <alantyree@gmail.com> wrote:
> Debian Squeeze; org 7.7; emacs 23.2.1
>
> I am back to trying to make ePub books from org articles/books. I am
> working on a book which currently produces about 100 pages in LaTeX
> export. It will be about 200 pages when finished.
>
> ePub uses XHTML for the main content. So, I export the org file to
> HTML. It verifies as a valid XHTML1.0 file at the w3c verification
> site: http://validator.w3.org/
>
> OK. Then wrap it up in the mess that is the ePub specification. It
> actually reads OK in FBReader and in Iceweasel with the ePub add on,
> BUT it does not validate. There are several problems, but most of the
> errors involve the "name" attribute. For example:
>
> <h2 id="history"><a name="sec-1" id="sec-1"></a><span class="section-number-2">1</span> History</h2>
>
> ePub does not like the name in there. Wipe out all the name="xxx" and
> the problem goes away. Everything else still works.
>
> I know that I can do a post export clean up of the XHTML file, but I
> wonder if this is set in some variable that I cannot find.
>
> And, as a general question, why have both name="sec-1" and id="sec-1"
> in the same element?
>
> I would like to automate everything to go from org to ePub. It doesn't
> seem too hard, but I'm a legal academic, not a programmer :-). Any
> pointers appreciated.
>
Back when Avdi Green was working on his book, there was some discussion
of this and Anthony Lander provided a pointer to http://calibre-ebook.com/
- see
http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849
Nick
* Re: Making ePub books
2011-12-11 7:07 ` Nick Dokos
@ 2011-12-11 7:25 ` Alan L Tyree
2011-12-11 7:42 ` Nick Dokos
2011-12-11 7:50 ` Making ePub books Nick Dokos
0 siblings, 2 replies; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11 7:25 UTC (permalink / raw)
Cc: nicholas.dokos, emacs-orgmode
On 11/12/11 18:07:48, Nick Dokos wrote:
> Alan L Tyree <alantyree@gmail.com> wrote:
>
> > Debian Squeeze; org 7.7; emacs 23.2.1
> >
> > I am back to trying to make ePub books from org articles/books. I am
> > working on a book which currently produces about 100 pages in LaTeX
> > export. It will be about 200 pages when finished.
> >
> > ePub uses XHTML for the main content. So, I export the org file to
> > HTML. It verifies as a valid XHTML1.0 file at the w3c verification
> > site: http://validator.w3.org/
> >
> > OK. Then wrap it up in the mess that is the ePub specification. It
> > actually reads OK in FBReader and in Iceweasel with the ePub add on,
> > BUT it does not validate. There are several problems, but most of the
> > errors involve the "name" attribute. For example:
> >
> > <h2 id="history"><a name="sec-1" id="sec-1"></a><span class="section-number-2">1</span> History</h2>
> >
> > ePub does not like the name in there. Wipe out all the name="xxx" and
> > the problem goes away. Everything else still works.
> >
> > I know that I can do a post export clean up of the XHTML file, but I
> > wonder if this is set in some variable that I cannot find.
> >
> > And, as a general question, why have both name="sec-1" and id="sec-1"
> > in the same element?
> >
> > I would like to automate everything to go from org to ePub. It doesn't
> > seem too hard, but I'm a legal academic, not a programmer :-). Any
> > pointers appreciated.
> >
>
> Back when Avdi Green was working on his book, there was some discussion
> of this and Anthony Lander provided a pointer to http://calibre-ebook.com/
> - see
>
> http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849
Yes, Calibre does a nice job of converting XHTML to ePub; it can be
read in all the readers that I use, but it won't pass the validation
tests. OK unless you want to publish on sites that require validation.
Cheers,
Alan
> Nick
>
>
--
Alan L Tyree http://www2.austlii.edu.au/~alan
Tel: 04 2748 6206 sip:172385@iptel.org
* Re: Making ePub books
2011-12-11 7:25 ` Alan L Tyree
@ 2011-12-11 7:42 ` Nick Dokos
2011-12-11 8:28 ` Alan L Tyree
2011-12-11 9:41 ` Making ePub books: further report Alan L Tyree
2011-12-11 7:50 ` Making ePub books Nick Dokos
1 sibling, 2 replies; 13+ messages in thread
From: Nick Dokos @ 2011-12-11 7:42 UTC (permalink / raw)
To: Alan L Tyree; +Cc: nicholas.dokos, emacs-orgmode
Alan L Tyree <alantyree@gmail.com> wrote:
> > http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849
>
> Yes, Calibre does a nice job of converting XHTML to ePub; it can be
> read in all the readers that I use, but it won't pass the validation
> tests. OK unless you want to publish on sites that require validation.
Have you tried submitting an enhancement request to the calibre people?
It sounds (from my vantage point of a million miles away...) like a simple
thing for them to do and it might be a good thing for them as well as
for you: they can integrate the validation step in their testing and
catch errors that might be difficult to catch any other way. And it
seems to be a very active project, so you might get results pronto.
Nick
* Re: Making ePub books
2011-12-11 7:25 ` Alan L Tyree
2011-12-11 7:42 ` Nick Dokos
@ 2011-12-11 7:50 ` Nick Dokos
1 sibling, 0 replies; 13+ messages in thread
From: Nick Dokos @ 2011-12-11 7:50 UTC (permalink / raw)
To: Alan L Tyree; +Cc: nicholas.dokos, emacs-orgmode
I said:
> > Back when Avdi Green was working on his book, there was some
and I managed to mangle Avdi's name pretty badly: it is "Avdi Grimm".
Apologies,
Nick
* Re: Making ePub books
2011-12-11 7:42 ` Nick Dokos
@ 2011-12-11 8:28 ` Alan L Tyree
2011-12-11 9:41 ` Making ePub books: further report Alan L Tyree
1 sibling, 0 replies; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11 8:28 UTC (permalink / raw)
Cc: nicholas.dokos, emacs-orgmode
On 11/12/11 18:42:10, Nick Dokos wrote:
> Alan L Tyree <alantyree@gmail.com> wrote:
>
> > > http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849
> >
> > Yes, Calibre does a nice job of converting XHTML to ePub; it can be
> > read in all the readers that I use, but it won't pass the validation
> > tests. OK unless you want to publish on sites that require validation.
>
> Have you tried submitting an enhancement request to the calibre people?
> It sounds (from my vantage point of a million miles away...) like a simple
> thing for them to do and it might be a good thing for them as well as
> for you: they can integrate the validation step in their testing and
> catch errors that might be difficult to catch any other way. And it
> seems to be a very active project, so you might get results pronto.
Good idea! That would certainly be the nicest way since Calibre is a
very nice piece of work.
Alan
>
> Nick
>
>
--
Alan L Tyree http://www2.austlii.edu.au/~alan
Tel: 04 2748 6206 sip:172385@iptel.org
* Re: Making ePub books: further report
2011-12-11 7:42 ` Nick Dokos
2011-12-11 8:28 ` Alan L Tyree
@ 2011-12-11 9:41 ` Alan L Tyree
2011-12-11 9:52 ` Alan L Tyree
2011-12-11 15:51 ` Bastien
1 sibling, 2 replies; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11 9:41 UTC (permalink / raw)
Cc: nicholas.dokos, emacs-orgmode
On 11/12/11 18:42:10, Nick Dokos wrote:
> Alan L Tyree <alantyree@gmail.com> wrote:
>
> > > http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849
> >
> > Yes, Calibre does a nice job of converting XHTML to ePub; it can be
> > read in all the readers that I use, but it won't pass the validation
> > tests. OK unless you want to publish on sites that require validation.
<SNIP>
I was being unfair to Calibre. If I clean up the XHTML file produced by
org (in the way indicated by my original post plus a couple of things
that I didn't mention), then Calibre produces an ePub book that passes
validation.
So -- back to my original question: is there some variable somewhere
that puts in both name="xxx" and id="xxx" or do I need to write a post
export clean up function?
Thanks,
Alan
--
Alan L Tyree http://www2.austlii.edu.au/~alan
Tel: 04 2748 6206 sip:172385@iptel.org
* Re: Making ePub books: further report
2011-12-11 9:41 ` Making ePub books: further report Alan L Tyree
@ 2011-12-11 9:52 ` Alan L Tyree
2011-12-11 10:02 ` Jambunathan K
2011-12-11 15:52 ` Bastien
2011-12-11 15:51 ` Bastien
1 sibling, 2 replies; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11 9:52 UTC (permalink / raw)
Cc: nicholas.dokos, emacs-orgmode
On 11/12/11 20:41:18, Alan L Tyree wrote:
> On 11/12/11 18:42:10, Nick Dokos wrote:
> > Alan L Tyree <alantyree@gmail.com> wrote:
> >
> > > > http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849
> > >
> > > Yes, Calibre does a nice job of converting XHTML to ePub; it can be
> > > read in all the readers that I use, but it won't pass the validation
> > > tests. OK unless you want to publish on sites that require validation.
> <SNIP>
>
> I was being unfair to Calibre. If I clean up the XHTML file produced by
> org (in the way indicated by my original post plus a couple of things
> that I didn't mention), then Calibre produces an ePub book that passes
> validation.
>
> So -- back to my original question: is there some variable somewhere
> that puts in both name="xxx" and id="xxx" or do I need to write a post
> export clean up function?
Bad form to answer my own question: these seem to be hard coded in
org-html.el along with the other items that give ePub validation a
nervous breakdown. I'll post a full list of the offending items later.
Cheers,
Alan
>
> Thanks,
> Alan
>
> --
> Alan L Tyree http://www2.austlii.edu.au/~alan
> Tel: 04 2748 6206 sip:172385@iptel.org
>
>
>
--
Alan L Tyree http://www2.austlii.edu.au/~alan
Tel: 04 2748 6206 sip:172385@iptel.org
* Re: Making ePub books: further report
2011-12-11 9:52 ` Alan L Tyree
@ 2011-12-11 10:02 ` Jambunathan K
2011-12-11 19:29 ` Alan L Tyree
2011-12-11 15:52 ` Bastien
1 sibling, 1 reply; 13+ messages in thread
From: Jambunathan K @ 2011-12-11 10:02 UTC (permalink / raw)
To: Alan L Tyree; +Cc: nicholas.dokos, emacs-orgmode
Alan L Tyree <alantyree@gmail.com> writes:
> On 11/12/11 20:41:18, Alan L Tyree wrote:
>> On 11/12/11 18:42:10, Nick Dokos wrote:
>> > Alan L Tyree <alantyree@gmail.com> wrote:
>> >
>> > > > http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849
>> > >
>> > > Yes, Calibre does a nice job of converting XHTML to ePub; it can be
>> > > read in all the readers that I use, but it won't pass the validation
>> > > tests. OK unless you want to publish on sites that require validation.
>> <SNIP>
>>
>> I was being unfair to Calibre. If I clean up the XHTML file produced by
>> org (in the way indicated by my original post plus a couple of things
>> that I didn't mention), then Calibre produces an ePub book that passes
>> validation.
>>
>> So -- back to my original question: is there some variable somewhere
>> that puts in both name="xxx" and id="xxx" or do I need to write a post
>> export clean up function?
>
> Bad form to answer my own question: these seem to be hard coded in
> org-html.el along with the other items that give ePub validation a
> nervous breakdown. I'll post a full list of the offending items later.
If you use org-xhtml.el (in contrib/lisp/org-xhtml.el) then you can
re-define some aspects of html export selectively.
For example, you can redefine this
,---- original
| (defun org-xhtml-format-anchor (text name &optional class)
|   (let* ((id name)
|          (extra (concat
|                  (when name (format " name=\"%s\"" name))
|                  (when id (format " id=\"%s\"" id))
|                  (when class (format " class=\"%s\"" class)))))
|     (org-xhtml-format-tags '("<a%s>" . "</a>") text extra)))
`----
to this
,---- modified
| ;; Same as the original, minus the clause that emits the name attribute.
| (defun org-xhtml-format-anchor (text name &optional class)
|   (let* ((id name)
|          (extra (concat
|                  (when id (format " id=\"%s\"" id))
|                  (when class (format " class=\"%s\"" class)))))
|     (org-xhtml-format-tags '("<a%s>" . "</a>") text extra)))
`----
to strip the name attribute from the anchor.
I am not sure whether org-xhtml.el will minimize your efforts. Just a
suggestion.
ps: Add contrib/lisp to load-path and do org-export-as-xhtml.
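In an init file, the setup described in the postscript might look
something like the sketch below. The checkout path is hypothetical, and
the `require' form assumes that org-xhtml.el provides a feature named
`org-xhtml'; adjust both to match the local installation.
,---- sketch: loading org-xhtml from contrib (path is hypothetical)
| ;; Make the contrib directory of the org-mode checkout visible to Emacs.
| (add-to-list 'load-path "~/src/org-mode/contrib/lisp")
| ;; Load the experimental exporter (assumed feature name).
| (require 'org-xhtml)
| ;; Then, in the org buffer: M-x org-export-as-xhtml
`----
Any redefinition of org-xhtml-format-anchor needs to be evaluated after
this `require', since loading the file afterwards would restore the
original definition.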
> Cheers,
> Alan
>
>>
>> Thanks,
>> Alan
>>
>> --
>> Alan L Tyree http://www2.austlii.edu.au/~alan
>> Tel: 04 2748 6206 sip:172385@iptel.org
>>
>>
>>
--
* Re: Making ePub books: further report
2011-12-11 9:41 ` Making ePub books: further report Alan L Tyree
2011-12-11 9:52 ` Alan L Tyree
@ 2011-12-11 15:51 ` Bastien
2011-12-11 20:47 ` Alan L Tyree
1 sibling, 1 reply; 13+ messages in thread
From: Bastien @ 2011-12-11 15:51 UTC (permalink / raw)
To: Alan L Tyree; +Cc: nicholas.dokos, emacs-orgmode
Hi Alan,
Alan L Tyree <alantyree@gmail.com> writes:
> So -- back to my original question: is there some variable somewhere
> that puts in both name="xxx" and id="xxx" or do I need to write a post
> export clean up function?
From latest git, you can now set `org-export-html-headline-anchor-format'
to nil. See the docstring of this new option.
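As an init-file fragment this would presumably be a one-line setting.
The sketch assumes an org build recent enough to define the option; the
exact effect of nil is whatever its docstring says (C-h v on the
variable), and is not spelled out in this message.
,---- sketch: init-file setting
| ;; Only meaningful in an org version that defines this option.
| (setq org-export-html-headline-anchor-format nil)
`----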
Thanks,
--
Bastien
* Re: Making ePub books: further report
2011-12-11 9:52 ` Alan L Tyree
2011-12-11 10:02 ` Jambunathan K
@ 2011-12-11 15:52 ` Bastien
1 sibling, 0 replies; 13+ messages in thread
From: Bastien @ 2011-12-11 15:52 UTC (permalink / raw)
To: Alan L Tyree; +Cc: nicholas.dokos, emacs-orgmode
Alan L Tyree <alantyree@gmail.com> writes:
> Bad form to answer my own question: these seem to be hard coded in
> org-html.el along with the other items that give ePub validation a
> nervous breakdown. I'll post a full list of the offending items later.
Thanks. If you can, please document this on Worg. This will be useful
when we rewrite org-html.el using Nicolas' new export engine.
Best,
--
Bastien
* Re: Making ePub books: further report
2011-12-11 10:02 ` Jambunathan K
@ 2011-12-11 19:29 ` Alan L Tyree
0 siblings, 0 replies; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11 19:29 UTC (permalink / raw)
To: Jambunathan K; +Cc: nicholas.dokos, emacs-orgmode
On 11/12/11 21:02:51, Jambunathan K wrote:
> Alan L Tyree <alantyree@gmail.com> writes:
>
> > On 11/12/11 20:41:18, Alan L Tyree wrote:
> >> On 11/12/11 18:42:10, Nick Dokos wrote:
> >> > Alan L Tyree <alantyree@gmail.com> wrote:
> >> >
> >> > > > http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849
> >> > >
> >> > > Yes, Calibre does a nice job of converting XHTML to ePub; it can be
> >> > > read in all the readers that I use, but it won't pass the validation
> >> > > tests. OK unless you want to publish on sites that require validation.
> >> <SNIP>
> >>
> >> I was being unfair to Calibre. If I clean up the XHTML file produced by
> >> org (in the way indicated by my original post plus a couple of things
> >> that I didn't mention), then Calibre produces an ePub book that passes
> >> validation.
> >>
> >> So -- back to my original question: is there some variable somewhere
> >> that puts in both name="xxx" and id="xxx" or do I need to write a post
> >> export clean up function?
> >
> > Bad form to answer my own question: these seem to be hard coded in
> > org-html.el along with the other items that give ePub validation a
> > nervous breakdown. I'll post a full list of the offending items later.
>
>
> If you use org-xhtml.el (in contrib/lisp/org-xhtml.el) then you can
> re-define some aspects of html export selectively.
>
> For example, you can redefine this
>
> ,---- original
> | (defun org-xhtml-format-anchor (text name &optional class)
> |   (let* ((id name)
> |          (extra (concat
> |                  (when name (format " name=\"%s\"" name))
> |                  (when id (format " id=\"%s\"" id))
> |                  (when class (format " class=\"%s\"" class)))))
> |     (org-xhtml-format-tags '("<a%s>" . "</a>") text extra)))
> `----
>
> to this
>
> ,---- modified
> | (defun org-xhtml-format-anchor (text name &optional class)
> |   (let* ((id name)
> |          (extra (concat
> |                  (when id (format " id=\"%s\"" id))
> |                  (when class (format " class=\"%s\"" class)))))
> |     (org-xhtml-format-tags '("<a%s>" . "</a>") text extra)))
> `----
>
> to strip the name attribute from the anchor.
>
> I am not sure whether org-xhtml.el will minimize your efforts. Just a
> suggestion.
>
> ps: Add contrib/lisp to load-path and do org-export-as-xhtml.
Thanks for this, Jambunathan. I'll give this a try.
Cheers,
Alan
>
> > Cheers,
> > Alan
> >
> >>
> >> Thanks,
> >> Alan
> >>
> >> --
> >> Alan L Tyree http://www2.austlii.edu.au/~alan
> >> Tel: 04 2748 6206 sip:172385@iptel.org
> >>
> >>
> >>
>
> --
>
--
Alan L Tyree http://www2.austlii.edu.au/~alan
Tel: 04 2748 6206 sip:172385@iptel.org
* Re: Making ePub books: further report
2011-12-11 15:51 ` Bastien
@ 2011-12-11 20:47 ` Alan L Tyree
0 siblings, 0 replies; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11 20:47 UTC (permalink / raw)
To: Bastien; +Cc: nicholas.dokos, emacs-orgmode
On 12/12/11 02:51:29, Bastien wrote:
> Hi Alan,
>
> Alan L Tyree <alantyree@gmail.com> writes:
>
> > So -- back to my original question: is there some variable somewhere
> > that puts in both name="xxx" and id="xxx" or do I need to write a post
> > export clean up function?
>
> From latest git, you can now set `org-export-html-headline-anchor-format'
> to nil. See the docstring of this new option.
Thanks, Bastien. I'll give it a try.
Alan
>
> Thanks,
>
> --
> Bastien
>
--
Alan L Tyree http://www2.austlii.edu.au/~alan
Tel: 04 2748 6206 sip:172385@iptel.org