From mboxrd@z Thu Jan  1 00:00:00 1970
From: AW <alexander.willand@t-online.de>
Subject: Re: Using org to create a TOC for a compilation of separate PDF
 documents
Date: Fri, 24 May 2013 11:09:11 +0200
Message-ID: <2899371.ZmAkaXjmJC@linux-ik7b.site>
References: <CA+M2ft-Pi227BcOuPuq5NOZJzr00T=4OyGae9bd2G03oCq7Mxg@mail.gmail.com>
	<20130523234803.36aa9323@aga-netbook>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="nextPart1596572.8bsvxFggIk"
Content-Transfer-Encoding: 7Bit
Return-path: <emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org>
Received: from eggs.gnu.org ([208.118.235.92]:41427)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <alexander.willand@t-online.de>) id 1Ufnv0-0000UH-MA
	for emacs-orgmode@gnu.org; Fri, 24 May 2013 05:04:16 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <alexander.willand@t-online.de>) id 1Ufnuv-0003W2-0v
	for emacs-orgmode@gnu.org; Fri, 24 May 2013 05:04:10 -0400
Received: from mailout01.t-online.de ([194.25.134.80]:53258)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <alexander.willand@t-online.de>) id 1Ufnuu-0003Uc-JB
	for emacs-orgmode@gnu.org; Fri, 24 May 2013 05:04:04 -0400
In-Reply-To: <20130523234803.36aa9323@aga-netbook>
List-Id: "General discussions about Org-mode." <emacs-orgmode.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-orgmode>,
	<mailto:emacs-orgmode-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-orgmode>
List-Post: <mailto:emacs-orgmode@gnu.org>
List-Help: <mailto:emacs-orgmode-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-orgmode>,
	<mailto:emacs-orgmode-request@gnu.org?subject=subscribe>
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
To: emacs-orgmode@gnu.org

This is a multi-part message in MIME format.

--nextPart1596572.8bsvxFggIk
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Am Donnerstag, 23. Mai 2013, 23:48:03 schrieb Marcin Borkowski:
> Dnia 2013-05-23, o godz. 15:21:56
>=20
> John Hendy <jw.hendy@gmail.com> napisa=C5=82(a):
> > I have a use case and am not sure if Org would help or not. I've
> > downloaded a bunch of technical data sheets on various materials fr=
om
> > a vendor. I'd like to create a "booklet" of them with a cover page
> > table of contents.
> >=20
> > I can create the booklet very easily with Stapler (or similar), but=
 am
> > not sure on the best way to generate a clickable linked PDF of the
> > individual materials contained in the compiled document.[1] What I'=
m
> > not sure on is how to create a table of contents.
> >=20
> > Ideally, I could do something like generate a page count of each
> > document and then use this to create the page numbers I'd use to
> > create links to, which I thought I could do with Org. Even better
> > would be to have [back to top] links as well, since this will end u=
p
> > being a multi-hundred page booklet (~100 documents of 2-4 pages eac=
h).
> >=20
> > Any thoughts on this?
> >=20
> > Is it easier to just generate a list of files and use Org to "inclu=
de"
> > them somehow via LaTeX instead of using Stapler to combine them?
>=20
> I'd just use LaTeX's pdfpages package, possibly with hyperref.  (If y=
ou
> encounter any problems, email me - I've done similar things before, s=
o
> I guess I could help you.)
>=20

You don't have to do that manually. Some time ago members of the German=
 TEX-D-
List put together a bash script, which takes all the PDFs recursivly an=
d=20
creates a *.tex file:

#+begin_src bash

#!/bin/bash
#
# pdfdir OUTPUT_FILE
#
# produces one big PDF file of all PDF files in .
#
if [ $# -ne 1 ] || [ -z "$1" ]; then
  echo "Syntax: pdfdir OUTPUT_FILE"
  exit 1
fi
FILE=3D"$(echo "$1"|sed -e 's/\.\(pdf\|tex\)$//')"
for F in "$FILE" "$FILE.tex" "$FILE.pdf" "$FILE.aux" "$FILE.log" ; do
  if [ -e "$F" ]; then
    echo "$F already exists."
    exit 2
  fi
done
cat >"$FILE.tex" <<EOF
\documentclass{article}%
\usepackage{pdfpages}%
\usepackage{grffile}%
\listfiles%
\begin{document}%
%\tableofcontents%
EOF
# Helperfunction
exist_pdf_files () {
  [ $(find -L "$1" -name \*.pdf -o -name \*.PDF -type f 2>/dev/null|wc =
-l) -eq=20
0 ] && return 1
  return 0
}
list_directories () {
  find -L "$1" -maxdepth 1 -mindepth 1 -type d 2>/dev/null | sort
}
list_pdf_files () {
  # " around filenames:
  find -L "$1" -maxdepth 1 -mindepth 1 -name \*.pdf -o -name \*.PDF -ty=
pe f=20
2>/dev/null | sort | \
    sed -e 's/^/\\includepdf[pages=3D-]{"/; s/$/"}%/'
  # without " around filenames:
 # find -L "$1" -maxdepth 1 -mindepth 1 -name \*.pdf -o -name \*.PDF -t=
ype f=20
2>/dev/null | sort | \
  #  sed -e 's/^/\\includepdf[pages=3D-]{/; s/$/}%/'
}
tex_headline () {
    echo "$1" | sed -e 's/_/\\_/g'
}
# folder level were we are (level 0):
list_pdf_files . >>"$FILE.tex"
# level 1:
list_directories . | while read -r DIR1; do
  # Are there PDF files a level down?
  exist_pdf_files "$DIR1" || continue
  # Yes...
  tex_headline "\section{${DIR1##*/}}%"
  # ... those are ...:
  list_pdf_files "$DIR1"
  # Level 2:
  list_directories "$DIR1" | while read -r DIR2; do
    exist_pdf_files "$DIR2" || continue
    tex_headline "\subsection{${DIR2##*/}}%"
    list_pdf_files "$DIR2"
    # Level 3:
    list_directories "$DIR2" | while read -r DIR3; do
      exist_pdf_files "$DIR3" || continue
      tex_headline "\subsubsection{${DIR3##*/}}%"
      list_pdf_files "$DIR3"
      # Level 4:
      list_directories "$DIR3" | while read -r DIR4; do
        exist_pdf_files "$DIR4" || continue
        tex_headline "\paragraph{${DIR4##*/}}%"
        list_pdf_files "$DIR4"
        # Level 5:
        list_directories "$DIR4" | while read -r DIR5; do
          exist_pdf_files "$DIR5" || continue
          tex_headline "\subparagraph{${DIR5##*/}}%"
          list_pdf_files "$DIR5"
        done
      done
    done
  done
done >>"$FILE.tex"
echo "\end{document}%" >>"$FILE.tex"
echo "Compile source now? [J/n]"
read -r ANSWER
case "$ANSWER" in
[JjYy]) ;;
*) exit 0 ;;
esac
pdflatex "$FILE"
[ $? -eq 0 ] && rm -f "$FILE.aux" "$FILE.log" "$FILE.tex"


#+end_src

I found that very helpfull, but I did not use it recently.

Regards,

Alexander

> > Thanks for any suggestions!
> > John
>=20
> Best,

--nextPart1596572.8bsvxFggIk
Content-Disposition: attachment; filename="book-from-PDF.sh"
Content-Transfer-Encoding: 7Bit
Content-Type: application/x-shellscript; name="book-from-PDF.sh"

#!/bin/bash
#
# pdfdir OUTPUT_FILE
#
# produces one big PDF file of all PDF files in .
#
if [ $# -ne 1 ] || [ -z "$1" ]; then
  echo "Syntax: pdfdir OUTPUT_FILE"
  exit 1
fi
FILE="$(echo "$1"|sed -e 's/\.\(pdf\|tex\)$//')"
for F in "$FILE" "$FILE.tex" "$FILE.pdf" "$FILE.aux" "$FILE.log" ; do
  if [ -e "$F" ]; then
    echo "$F already exists."
    exit 2
  fi
done
cat >"$FILE.tex" <<EOF
\documentclass{article}%
\usepackage{pdfpages}%
\usepackage{grffile}%
\listfiles%
\begin{document}%
%\tableofcontents%
EOF
# Helperfunction
exist_pdf_files () {
  [ $(find -L "$1" -name \*.pdf -o -name \*.PDF -type f 2>/dev/null|wc -l) -eq 0 ] && return 1
  return 0
}
list_directories () {
  find -L "$1" -maxdepth 1 -mindepth 1 -type d 2>/dev/null | sort
}
list_pdf_files () {
  # " around filenames:
  find -L "$1" -maxdepth 1 -mindepth 1 -name \*.pdf -o -name \*.PDF -type f 2>/dev/null | sort | \
    sed -e 's/^/\\includepdf[pages=-]{"/; s/$/"}%/'
  # without " around filenames:
 # find -L "$1" -maxdepth 1 -mindepth 1 -name \*.pdf -o -name \*.PDF -type f 2>/dev/null | sort | \
  #  sed -e 's/^/\\includepdf[pages=-]{/; s/$/}%/'
}
tex_headline () {
    echo "$1" | sed -e 's/_/\\_/g'
}
# folder level were we are (level 0):
list_pdf_files . >>"$FILE.tex"
# level 1:
list_directories . | while read -r DIR1; do
  # Are there PDF files a level down?
  exist_pdf_files "$DIR1" || continue
  # Yes...
  tex_headline "\section{${DIR1##*/}}%"
  # ... those are ...:
  list_pdf_files "$DIR1"
  # Level 2:
  list_directories "$DIR1" | while read -r DIR2; do
    exist_pdf_files "$DIR2" || continue
    tex_headline "\subsection{${DIR2##*/}}%"
    list_pdf_files "$DIR2"
    # Level 3:
    list_directories "$DIR2" | while read -r DIR3; do
      exist_pdf_files "$DIR3" || continue
      tex_headline "\subsubsection{${DIR3##*/}}%"
      list_pdf_files "$DIR3"
      # Level 4:
      list_directories "$DIR3" | while read -r DIR4; do
        exist_pdf_files "$DIR4" || continue
        tex_headline "\paragraph{${DIR4##*/}}%"
        list_pdf_files "$DIR4"
        # Level 5:
        list_directories "$DIR4" | while read -r DIR5; do
          exist_pdf_files "$DIR5" || continue
          tex_headline "\subparagraph{${DIR5##*/}}%"
          list_pdf_files "$DIR5"
        done
      done
    done
  done
done >>"$FILE.tex"
echo "\end{document}%" >>"$FILE.tex"
echo "Compile source now? [J/n]"
read -r ANSWER
case "$ANSWER" in
[JjYy]) ;;
*) exit 0 ;;
esac
pdflatex "$FILE"
[ $? -eq 0 ] && rm -f "$FILE.aux" "$FILE.log" "$FILE.tex"

--nextPart1596572.8bsvxFggIk--