emacs-orgmode@gnu.org archives
* Making ePub books
@ 2011-12-11  6:59 Alan L Tyree
  2011-12-11  7:07 ` Nick Dokos
  0 siblings, 1 reply; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11  6:59 UTC
  To: emacs-orgmode

Debian Squeeze; org 7.7; emacs 23.2.1

I am back to trying to make ePub books from org articles/books. I am 
working on a book which currently produces about 100 pages in LaTeX 
export. It will be about 200 pages when finished.

ePub uses XHTML for the main content. So, I export the org file to 
HTML. It verifies as a valid XHTML1.0 file at the w3c verification 
site: http://validator.w3.org/

OK. Then wrap it up in the mess that is the ePub specification. It 
actually reads OK in FBReader and in Iceweasel with the ePub add-on, 
BUT it does not validate. There are several problems, but most of the 
errors involve the "name" attribute. For example:

<h2 id="history"><a name="sec-1" id="sec-1"></a><span class="section-number-2">1</span> History</h2>

ePub does not like the name in there. Wipe out all the name="xxx" and 
the problem goes away. Everything else still works.

I know that I can do a post export clean up of the XHTML file, but I 
wonder if this is set in some variable that I cannot find.

And, as a general question, why have both name="sec-1" and id="sec-1" 
in the same element?

I would like to automate everything to go from org to ePub. It doesn't 
seem too hard, but I'm a legal academic, not a programmer :-). Any 
pointers appreciated.
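
For reference, the post-export cleanup mentioned above can be sketched in a
few lines of Emacs Lisp. This is only a sketch under stated assumptions: the
function name my-epub-strip-anchor-names is hypothetical (it is not part of
Org), and it only touches name attributes on <a> tags, leaving elements such
as <meta name="..."> alone.

```emacs-lisp
;; Hypothetical helper, not part of Org: remove name="..." attributes
;; from <a> tags in a string holding exported XHTML.
(defun my-epub-strip-anchor-names (xhtml)
  "Return XHTML with name=\"...\" attributes removed from <a> tags.
Other attributes, and name attributes on other elements, are kept."
  (replace-regexp-in-string
   ;; Group 1 is the tag up to the whitespace that precedes name="...".
   "\\(<a\\b[^>]*?\\)[ \t\n]+name=\"[^\"]*\""
   "\\1"
   xhtml
   t))  ; FIXEDCASE: do not let Emacs adjust the case of the replacement
```

Applied to the heading above, this turns <a name="sec-1" id="sec-1"></a> into
<a id="sec-1"></a>. The same function could be run over the contents of the
exported file before it is packaged.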

Cheers,
Alan

-- 
Alan L Tyree                    http://www2.austlii.edu.au/~alan
Tel:  04 2748 6206		sip:172385@iptel.org


* Re: Making ePub books
  2011-12-11  6:59 Making ePub books Alan L Tyree
@ 2011-12-11  7:07 ` Nick Dokos
  2011-12-11  7:25   ` Alan L Tyree
  0 siblings, 1 reply; 13+ messages in thread
From: Nick Dokos @ 2011-12-11  7:07 UTC
  To: Alan L Tyree; +Cc: nicholas.dokos, emacs-orgmode

Alan L Tyree <alantyree@gmail.com> wrote:

> Debian Squeeze; org 7.7; emacs 23.2.1
> 
> I am back to trying to make ePub books from org articles/books. I am
> working on a book which currently produces about 100 pages in LaTeX
> export. It will be about 200 pages when finished.
> 
> ePub uses XHTML for the main content. So, I export the org file to
> HTML. It verifies as a valid XHTML1.0 file at the w3c verification
> site: http://validator.w3.org/
> 
> OK. Then wrap it up in the mess that is the ePub specification. It
> actually reads OK in FBReader and in Iceweasel with the ePub add-on,
> BUT it does not validate. There are several problems, but most of the
> errors involve the "name" attribute. For example:
> 
> <h2 id="history"><a name="sec-1" id="sec-1"></a><span class="section-number-2">1</span> History</h2>
> 
> ePub does not like the name in there. Wipe out all the name="xxx" and
> the problem goes away. Everything else still works.
> 
> I know that I can do a post export clean up of the XHTML file, but I
> wonder if this is set in some variable that I cannot find.
> 
> And, as a general question, why have both name="sec-1" and id="sec-1"
> in the same element?
> 
> I would like to automate everything to go from org to ePub. It doesn't
> seem too hard, but I'm a legal academic, not a programmer :-). Any
> pointers appreciated.
> 

Back when Avdi Green was working on his book, there was some discussion
of this, and Anthony Lander provided a pointer to http://calibre-ebook.com/
- see

  http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849

Nick


* Re: Making ePub books
  2011-12-11  7:07 ` Nick Dokos
@ 2011-12-11  7:25   ` Alan L Tyree
  2011-12-11  7:42     ` Nick Dokos
  2011-12-11  7:50     ` Making ePub books Nick Dokos
  0 siblings, 2 replies; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11  7:25 UTC
  Cc: nicholas.dokos, emacs-orgmode

On 11/12/11 18:07:48, Nick Dokos wrote:
> Back when Avdi Green was working on his book, there was some discussion
> of this and Anthony Lander provided a pointer to http://calibre-ebook.com/
> - see
> 
>   http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849

Yes, Calibre does a nice job of converting XHTML to ePub; it can be 
read in all the readers that I use, but it won't pass the validation 
tests. OK unless you want to publish on sites that require validation.

Cheers,
Alan



-- 
Alan L Tyree                    http://www2.austlii.edu.au/~alan
Tel:  04 2748 6206		sip:172385@iptel.org


* Re: Making ePub books
  2011-12-11  7:25   ` Alan L Tyree
@ 2011-12-11  7:42     ` Nick Dokos
  2011-12-11  8:28       ` Alan L Tyree
  2011-12-11  9:41       ` Making ePub books: further report Alan L Tyree
  2011-12-11  7:50     ` Making ePub books Nick Dokos
  1 sibling, 2 replies; 13+ messages in thread
From: Nick Dokos @ 2011-12-11  7:42 UTC
  To: Alan L Tyree; +Cc: nicholas.dokos, emacs-orgmode

Alan L Tyree <alantyree@gmail.com> wrote:

> >   http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849
> 
> Yes, Calibre does a nice job of converting XHTML to ePub; it can be
> read in all the readers that I use, but it won't pass the validation
> tests. OK unless you want to publish on sites that require validation.

Have you tried submitting an enhancement request to the calibre people?
It sounds (from my vantage point of a million miles away...) like a simple
thing for them to do and it might be a good thing for them as well as
for you: they can integrate the validation step in their testing and
catch errors that might be difficult to catch any other way. And it
seems to be a very active project, so you might get results pronto.

Nick


* Re: Making ePub books
  2011-12-11  7:25   ` Alan L Tyree
  2011-12-11  7:42     ` Nick Dokos
@ 2011-12-11  7:50     ` Nick Dokos
  1 sibling, 0 replies; 13+ messages in thread
From: Nick Dokos @ 2011-12-11  7:50 UTC
  To: Alan L Tyree; +Cc: nicholas.dokos, emacs-orgmode

I said:

> > Back when Avdi Green was working on his book, there was some

and I managed to mangle Avdi's name pretty badly: it is "Avdi Grimm".

Apologies,
Nick


* Re: Making ePub books
  2011-12-11  7:42     ` Nick Dokos
@ 2011-12-11  8:28       ` Alan L Tyree
  2011-12-11  9:41       ` Making ePub books: further report Alan L Tyree
  1 sibling, 0 replies; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11  8:28 UTC
  Cc: nicholas.dokos, emacs-orgmode

On 11/12/11 18:42:10, Nick Dokos wrote:
> Alan L Tyree <alantyree@gmail.com> wrote:
> 
> > >   http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849
> > 
> > Yes, Calibre does a nice job of converting XHTML to ePub; it can be
> > read in all the readers that I use, but it won't pass the validation
> > tests. OK unless you want to publish on sites that require validation.
> 
> Have you tried submitting an enhancement request to the calibre people?
> It sounds (from my vantage point of a million miles away...) like a simple
> thing for them to do and it might be a good thing for them as well as
> for you: they can integrate the validation step in their testing and
> catch errors that might be difficult to catch any other way. And it
> seems to be a very active project, so you might get results pronto.

Good idea! That would certainly be the nicest way since Calibre is a 
very nice piece of work. 

Alan

-- 
Alan L Tyree                    http://www2.austlii.edu.au/~alan
Tel:  04 2748 6206		sip:172385@iptel.org


* Re: Making ePub books: further report
  2011-12-11  7:42     ` Nick Dokos
  2011-12-11  8:28       ` Alan L Tyree
@ 2011-12-11  9:41       ` Alan L Tyree
  2011-12-11  9:52         ` Alan L Tyree
  2011-12-11 15:51         ` Bastien
  1 sibling, 2 replies; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11  9:41 UTC
  Cc: nicholas.dokos, emacs-orgmode

On 11/12/11 18:42:10, Nick Dokos wrote:
> Alan L Tyree <alantyree@gmail.com> wrote:
> 
> > >   http://thread.gmane.org/gmane.emacs.orgmode/41826/focus=41849
> > 
> > Yes, Calibre does a nice job of converting XHTML to ePub; it can be
> > read in all the readers that I use, but it won't pass the validation
> > tests. OK unless you want to publish on sites that require validation.
<SNIP>

I was being unfair to Calibre. If I clean up the XHTML file produced by 
org (in the way indicated by my original post plus a couple of things 
that I didn't mention), then Calibre produces an ePub book that passes 
validation.

So -- back to my original question: is there some variable somewhere 
that puts in both name="xxx" and id="xxx" or do I need to write a post 
export clean up function?

Thanks,
Alan

-- 
Alan L Tyree                    http://www2.austlii.edu.au/~alan
Tel:  04 2748 6206		sip:172385@iptel.org


* Re: Making ePub books: further report
  2011-12-11  9:41       ` Making ePub books: further report Alan L Tyree
@ 2011-12-11  9:52         ` Alan L Tyree
  2011-12-11 10:02           ` Jambunathan K
  2011-12-11 15:52           ` Bastien
  2011-12-11 15:51         ` Bastien
  1 sibling, 2 replies; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11  9:52 UTC
  Cc: nicholas.dokos, emacs-orgmode

On 11/12/11 20:41:18, Alan L Tyree wrote:
> I was being unfair to Calibre. If I clean up the XHTML file produced by
> org (in the way indicated by my original post plus a couple of things
> that I didn't mention), then Calibre produces an ePub book that passes
> validation.
> 
> So -- back to my original question: is there some variable somewhere
> that puts in both name="xxx" and id="xxx" or do I need to write a post
> export clean up function?

Bad form to answer my own question: these seem to be hard coded in
org-html.el along with the other items that give ePub validation a nervous
breakdown. I'll post a full list of the offending items later.

Cheers,
Alan

-- 
Alan L Tyree                    http://www2.austlii.edu.au/~alan
Tel:  04 2748 6206		sip:172385@iptel.org


* Re: Making ePub books: further report
  2011-12-11  9:52         ` Alan L Tyree
@ 2011-12-11 10:02           ` Jambunathan K
  2011-12-11 19:29             ` Alan L Tyree
  2011-12-11 15:52           ` Bastien
  1 sibling, 1 reply; 13+ messages in thread
From: Jambunathan K @ 2011-12-11 10:02 UTC
  To: Alan L Tyree; +Cc: nicholas.dokos, emacs-orgmode

Alan L Tyree <alantyree@gmail.com> writes:

> Bad form to answer my own question: these seem to be hard coded in
> org-html.el along with the other items that give ePub validation a nervous
> breakdown. I'll post a full list of the offending items later.


If you use org-xhtml.el (in contrib/lisp/org-xhtml.el) then you can
re-define some aspects of html export selectively.

For example, you can change this definition

,---- original
| (defun org-xhtml-format-anchor (text name &optional class)
|   (let* ((id name)
|          (extra (concat
|                  (when name (format " name=\"%s\"" name))
|                  (when id (format " id=\"%s\"" id))
|                  (when class (format " class=\"%s\"" class)))))
|     (org-xhtml-format-tags '("<a%s>" . "</a>") text extra)))
`----

to this

,---- modified
| (defun org-xhtml-format-anchor (text name &optional class)
|   ;; Emit only id (and class); the name attribute is dropped.
|   (let* ((id name)
|          (extra (concat
|                  (when id (format " id=\"%s\"" id))
|                  (when class (format " class=\"%s\"" class)))))
|     (org-xhtml-format-tags '("<a%s>" . "</a>") text extra)))
`----

to strip the name attribute from the anchor.

I am not sure whether org-xhtml.el will minimize your efforts. Just a
suggestion.

ps: Add contrib/lisp to load-path and do org-export-as-xhtml.
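
In an init file, the postscript above amounts to something like the
following. This is a sketch only: the checkout location ~/src/org-mode is a
placeholder, and it assumes that org-xhtml.el provides the feature org-xhtml.

```emacs-lisp
;; Assumption: the Org sources live under ~/src/org-mode; adjust the
;; path to wherever your own checkout is.
(add-to-list 'load-path (expand-file-name "~/src/org-mode/contrib/lisp"))

;; Assumption: org-xhtml.el ends with (provide 'org-xhtml).
(require 'org-xhtml)
```

After this, evaluate the modified org-xhtml-format-anchor definition so that
it overrides the one loaded from the file, then run M-x org-export-as-xhtml.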


* Re: Making ePub books: further report
  2011-12-11  9:41       ` Making ePub books: further report Alan L Tyree
  2011-12-11  9:52         ` Alan L Tyree
@ 2011-12-11 15:51         ` Bastien
  2011-12-11 20:47           ` Alan L Tyree
  1 sibling, 1 reply; 13+ messages in thread
From: Bastien @ 2011-12-11 15:51 UTC
  To: Alan L Tyree; +Cc: nicholas.dokos, emacs-orgmode

Hi Alan,

Alan L Tyree <alantyree@gmail.com> writes:

> So -- back to my original question: is there some variable somewhere 
> that puts in both name="xxx" and id="xxx" or do I need to write a post 
> export clean up function?

With the latest git, you can now set `org-export-html-headline-anchor-format'
to nil.  See the docstring of this new option.
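
In an init file this would look like the line below. It is only a sketch: it
assumes an Org checkout recent enough to define the option, as described
above.

```emacs-lisp
;; Assumes a version of Org that defines this option (latest git at the
;; time of this message); nil is the value suggested above.
(setq org-export-html-headline-anchor-format nil)
```

C-h v org-export-html-headline-anchor-format shows the docstring.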

Thanks,

-- 
 Bastien


* Re: Making ePub books: further report
  2011-12-11  9:52         ` Alan L Tyree
  2011-12-11 10:02           ` Jambunathan K
@ 2011-12-11 15:52           ` Bastien
  1 sibling, 0 replies; 13+ messages in thread
From: Bastien @ 2011-12-11 15:52 UTC
  To: Alan L Tyree; +Cc: nicholas.dokos, emacs-orgmode

Alan L Tyree <alantyree@gmail.com> writes:

> Bad form to answer my own question: these seem to be hard coded in
> org-html.el along with the other items that give ePub validation a nervous
> breakdown. I'll post a full list of the offending items later.

Thanks.  If you can, please document this on Worg.  This will be useful
when we rewrite org-html.el using Nicolas' new export engine.

Best,

-- 
 Bastien


* Re: Making ePub books: further report
  2011-12-11 10:02           ` Jambunathan K
@ 2011-12-11 19:29             ` Alan L Tyree
  0 siblings, 0 replies; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11 19:29 UTC
  To: Jambunathan K; +Cc: nicholas.dokos, emacs-orgmode

On 11/12/11 21:02:51, Jambunathan K wrote:
> If you use org-xhtml.el (in contrib/lisp/org-xhtml.el) then you can
> re-define some aspects of html export selectively.
> 
> For example, you can change this definition
> 
> ,---- original
> | (defun org-xhtml-format-anchor (text name &optional class)
> |   (let* ((id name)
> |          (extra (concat
> |                  (when name (format " name=\"%s\"" name))
> |                  (when id (format " id=\"%s\"" id))
> |                  (when class (format " class=\"%s\"" class)))))
> |     (org-xhtml-format-tags '("<a%s>" . "</a>") text extra)))
> `----
> 
> to this
> 
> ,---- modified
> | (defun org-xhtml-format-anchor (text name &optional class)
> |   (let* ((id name)
> |          (extra (concat
> |                  (when id (format " id=\"%s\"" id))
> |                  (when class (format " class=\"%s\"" class)))))
> |     (org-xhtml-format-tags '("<a%s>" . "</a>") text extra)))
> `----
> 
> to strip the name attribute from the anchor.
> 
> I am not sure whether org-xhtml.el will minimize your efforts. Just a
> suggestion.
> 
> ps: Add contrib/lisp to load-path and do org-export-as-xhtml.

Thanks for this, Jambunathan. I'll give this a try.

Cheers,
Alan



-- 
Alan L Tyree                    http://www2.austlii.edu.au/~alan
Tel:  04 2748 6206		sip:172385@iptel.org


* Re: Making ePub books: further report
  2011-12-11 15:51         ` Bastien
@ 2011-12-11 20:47           ` Alan L Tyree
  0 siblings, 0 replies; 13+ messages in thread
From: Alan L Tyree @ 2011-12-11 20:47 UTC
  To: Bastien; +Cc: nicholas.dokos, emacs-orgmode

On 12/12/11 02:51:29, Bastien wrote:
> Hi Alan,
> 
> Alan L Tyree <alantyree@gmail.com> writes:
> 
> > So -- back to my original question: is there some variable somewhere
> > that puts in both name="xxx" and id="xxx" or do I need to write a post
> > export clean up function?
> 
> With the latest git, you can now set `org-export-html-headline-anchor-format'
> to nil.  See the docstring of this new option.

Thanks, Bastien. I'll give it a try.

Alan

-- 
Alan L Tyree                    http://www2.austlii.edu.au/~alan
Tel:  04 2748 6206		sip:172385@iptel.org
