Re: Using org to create a TOC for a compilation of separate PDF documents

emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed

From: AW <alexander.willand@t-online.de>
To: emacs-orgmode@gnu.org
Subject: Re: Using org to create a TOC for a compilation of separate PDF documents
Date: Fri, 24 May 2013 11:09:11 +0200	[thread overview]
Message-ID: <2899371.ZmAkaXjmJC@linux-ik7b.site> (raw)
In-Reply-To: <20130523234803.36aa9323@aga-netbook>

[-- Attachment #1: Type: text/plain, Size: 4396 bytes --]

Am Donnerstag, 23. Mai 2013, 23:48:03 schrieb Marcin Borkowski:
> Dnia 2013-05-23, o godz. 15:21:56
> 
> John Hendy <jw.hendy@gmail.com> napisał(a):
> > I have a use case and am not sure if Org would help or not. I've
> > downloaded a bunch of technical data sheets on various materials from
> > a vendor. I'd like to create a "booklet" of them with a cover page
> > table of contents.
> > 
> > I can create the booklet very easily with Stapler (or similar), but am
> > not sure on the best way to generate a clickable linked PDF of the
> > individual materials contained in the compiled document.[1] What I'm
> > not sure on is how to create a table of contents.
> > 
> > Ideally, I could do something like generate a page count of each
> > document and then use this to create the page numbers I'd use to
> > create links to, which I thought I could do with Org. Even better
> > would be to have [back to top] links as well, since this will end up
> > being a multi-hundred page booklet (~100 documents of 2-4 pages each).
> > 
> > Any thoughts on this?
> > 
> > Is it easier to just generate a list of files and use Org to "include"
> > them somehow via LaTeX instead of using Stapler to combine them?
> 
> I'd just use LaTeX's pdfpages package, possibly with hyperref.  (If you
> encounter any problems, email me - I've done similar things before, so
> I guess I could help you.)
> 

You don't have to do that manually. Some time ago members of the German TEX-D-
List put together a bash script, which takes all the PDFs recursivly and 
creates a *.tex file:

#+begin_src bash

#!/bin/bash
#
# pdfdir OUTPUT_FILE
#
# produces one big PDF file of all PDF files in .
#
if [ $# -ne 1 ] || [ -z "$1" ]; then
  echo "Syntax: pdfdir OUTPUT_FILE"
  exit 1
fi
FILE="$(echo "$1"|sed -e 's/\.\(pdf\|tex\)$//')"
for F in "$FILE" "$FILE.tex" "$FILE.pdf" "$FILE.aux" "$FILE.log" ; do
  if [ -e "$F" ]; then
    echo "$F already exists."
    exit 2
  fi
done
cat >"$FILE.tex" <<EOF
\documentclass{article}%
\usepackage{pdfpages}%
\usepackage{grffile}%
\listfiles%
\begin{document}%
%\tableofcontents%
EOF
# Helperfunction
exist_pdf_files () {
  [ $(find -L "$1" -name \*.pdf -o -name \*.PDF -type f 2>/dev/null|wc -l) -eq 
0 ] && return 1
  return 0
}
list_directories () {
  find -L "$1" -maxdepth 1 -mindepth 1 -type d 2>/dev/null | sort
}
list_pdf_files () {
  # " around filenames:
  find -L "$1" -maxdepth 1 -mindepth 1 -name \*.pdf -o -name \*.PDF -type f 
2>/dev/null | sort | \
    sed -e 's/^/\\includepdf[pages=-]{"/; s/$/"}%/'
  # without " around filenames:
 # find -L "$1" -maxdepth 1 -mindepth 1 -name \*.pdf -o -name \*.PDF -type f 
2>/dev/null | sort | \
  #  sed -e 's/^/\\includepdf[pages=-]{/; s/$/}%/'
}
tex_headline () {
    echo "$1" | sed -e 's/_/\\_/g'
}
# folder level were we are (level 0):
list_pdf_files . >>"$FILE.tex"
# level 1:
list_directories . | while read -r DIR1; do
  # Are there PDF files a level down?
  exist_pdf_files "$DIR1" || continue
  # Yes...
  tex_headline "\section{${DIR1##*/}}%"
  # ... those are ...:
  list_pdf_files "$DIR1"
  # Level 2:
  list_directories "$DIR1" | while read -r DIR2; do
    exist_pdf_files "$DIR2" || continue
    tex_headline "\subsection{${DIR2##*/}}%"
    list_pdf_files "$DIR2"
    # Level 3:
    list_directories "$DIR2" | while read -r DIR3; do
      exist_pdf_files "$DIR3" || continue
      tex_headline "\subsubsection{${DIR3##*/}}%"
      list_pdf_files "$DIR3"
      # Level 4:
      list_directories "$DIR3" | while read -r DIR4; do
        exist_pdf_files "$DIR4" || continue
        tex_headline "\paragraph{${DIR4##*/}}%"
        list_pdf_files "$DIR4"
        # Level 5:
        list_directories "$DIR4" | while read -r DIR5; do
          exist_pdf_files "$DIR5" || continue
          tex_headline "\subparagraph{${DIR5##*/}}%"
          list_pdf_files "$DIR5"
        done
      done
    done
  done
done >>"$FILE.tex"
echo "\end{document}%" >>"$FILE.tex"
echo "Compile source now? [J/n]"
read -r ANSWER
case "$ANSWER" in
[JjYy]) ;;
*) exit 0 ;;
esac
pdflatex "$FILE"
[ $? -eq 0 ] && rm -f "$FILE.aux" "$FILE.log" "$FILE.tex"



#+end_src

I found that very helpfull, but I did not use it recently.

Regards,

Alexander

> > Thanks for any suggestions!
> > John
> 
> Best,

[-- Attachment #2: book-from-PDF.sh --]
[-- Type: application/x-shellscript, Size: 2521 bytes --]

next prev parent reply	other threads:[~2013-05-24  9:04 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-23 20:21 Using org to create a TOC for a compilation of separate PDF documents John Hendy
2013-05-23 21:48 ` Marcin Borkowski
2013-05-24  9:09   ` AW [this message]
2013-05-23 22:20 ` Suvayu Ali
2013-05-23 22:31   ` John Hendy
2013-05-24  0:08 ` Rasmus

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.orgmode.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2899371.ZmAkaXjmJC@linux-ik7b.site \
    --to=alexander.willand@t-online.de \
    --cc=emacs-orgmode@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).