From: Matt Price <moptop99@gmail.com>
To: Ramon Diaz-Uriarte <rdiaz02@gmail.com>
Cc: Org Mode <emacs-orgmode@gnu.org>
Subject: Re: Org Mode and PDF Notes!
Date: Wed, 11 Nov 2015 15:33:52 -0500
Message-ID: <CAN_Dec9zt=8roZ5kwsN7UN1xTyvTCvh4Zb2u-N601kXni6-nhw@mail.gmail.com>
In-Reply-To: <877floffyq.fsf@gmail.com>

On Wed, Nov 11, 2015 at 3:17 PM, Ramon Diaz-Uriarte <rdiaz02@gmail.com>
wrote:

> Dear Matt,
>
>
> On Wed, 11-11-2015, at 15:42, Matt Price <moptop99@gmail.com> wrote:
> > I've just written up a post on my workflow for PDFs. Since my blog has, I
> > think, a readership of 0 (surely there's a way to get emacsers to follow
> > me? ah well), I will post a link here in the hopes that someone will be
>
> Add another 1 :-)
>
> > interested:
> >
> > http://matt.hackinghistory.ca/2015/11/11/note-taking-with-pdf-tools/
> >
>
> Really neat! A few comments/questions/ramblings:
>
> - The type of highlights you get from RepliGo contain the text itself. I
>   mean, when in your pdf I use C-c C-a l, the buffer showing the contents
>   of each highlight contains the highlighted text.
>
>   This is not what I get from, say, EzPDF (which is what I use on Android),
>   or from highlighting from pdf-tools itself using C-c C-a h, or from
>   highlighting from Okular. (The contents just give the rectangle.)
>   Hummmm...
>
>
>   Because of this, when I use your code on my pdfs, I only get things
>   such as
>
> Highlight
>
> ([[pdfview:/home/ramon/Zotero-data/storage/ESHHD4KW/Frank_2015_Commentary.pdf::5][Frank_2015_Commentary]],
> 5)
>
>
>   instead of the text. Bummer! I wonder if RepliGO gives you a lot more
>   than the rest, or if I am doing something silly.
>
I think that there is no standard way of storing the highlight contents. I
chose Repligo over EZPDF because it gives you access to the text of the
highlights!

Okular, I think, stores your annotations in its own database, rather than
in the pdf. You can (I think!) attach the annotations to the pdf from
inside Okular.  At least, that's what I remember from when I was looking
around.

Repligo stores the highlighted text in the "subject" field of the
annotation. It's possible that the content of the annotation is stored in
some other field, like "content".  Maybe you can try:

M-: (pdf-annot-get-annots) and look at the output in the *Messages*
buffer.  Can you see any evidence of the text? Can you share what you
learned?

Politza and I are discussing this here:
https://github.com/politza/pdf-tools/issues/137

That might be a good place to continue the conversation.


>
> - You have to call mwp/pdf-multi-extract on each file/set of files. I guess
>   if I knew elisp, I'd find it trivial to iterate over a set of directories
>   and subdirectories (and do this using a cron job at night), and also
>   place everything in one single org file. Would this be something
>   reasonable to do?
>
For sure.  My elisp sucks too, but I bet someone will answer you here on
the list.
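
The iteration asked about above could be sketched roughly as follows. This
is only a sketch under stated assumptions: the names my/collect-pdfs and
my/pdf-notes-to-org are made up for illustration, and the extractor is
passed in as a function that is assumed to take a list of PDF paths and
return Org text as a string. Whether mwp/pdf-multi-extract from the post has
that shape is not confirmed here, so check its definition before plugging it
in. Only built-in Emacs functions are used (directory-files-recursively
needs Emacs 25.1 or later).

```elisp
;;; Sketch: gather PDFs under several directories and write one Org file.
;; The function names below are hypothetical, not part of any package.

(defun my/collect-pdfs (dirs)
  "Return a sorted list of all .pdf files found recursively under DIRS.
Entries in DIRS that are not existing directories are skipped."
  (let (files)
    (dolist (dir dirs)
      (when (file-directory-p dir)
        (setq files (append files
                            (directory-files-recursively dir "\\.pdf\\'")))))
    (sort (delete-dups files) #'string<)))

(defun my/pdf-notes-to-org (dirs org-file extract-fn)
  "Call EXTRACT-FN on every PDF under DIRS and write the result to ORG-FILE.
EXTRACT-FN is assumed (not verified) to accept a list of PDF paths and
to return a string of Org text.  Return the list of PDFs processed."
  (let ((pdfs (my/collect-pdfs dirs)))
    ;; `with-temp-file' overwrites ORG-FILE with the buffer contents.
    (with-temp-file org-file
      (insert (or (funcall extract-fn pdfs) "")))
    pdfs))
```

For a nightly cron job, one option is to load this from a file and call
my/pdf-notes-to-org through emacs --batch with --eval, passing whatever
extractor function you settle on.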


>   (This might be related to your second Todo)
>
Well, that wasn't what I was planning, but it would still be useful.

>
> - I know nothing about how it works, and it does not use pdf-tools, but in
>   your first Todo you mention: "extend the pdfview link type (in
>   org-pdfview) to permit me to specify the precise location of an
>   annotation,".  PDF.js (https://mozilla.github.io/pdf.js/), which is
>   used for instance by zotfile (http://zotfile.com/) does that and it works
>   out of the box with Okular (but I've not been able to get it to work with
>   pdftools).
>
Until I found pdf-tools, I had planned to write a node wrapper for pdf.js
and grab the annotations that way.  But I don't really know how to do that,
so this turned out to be easier :-)

Anyway, I've updated the post, and it's now possible to create links to
individual annotations, though you will have to use my updated version of
org-pdfview, until/unless Markus accepts my patch.

>
> - In case it matters, I have somewhat similar modus operandi.  I do a lot
>   of PDF reading, including note-taking and highlighting, in android
>   tablets ---I use EzPDF, which also embeds the notes in the PDF. I have a
>   cron job that extracts all the highlights and annotations of all the PDFs
>   and places them in a single org file. The kludge is explained here:
>
> https://github.com/rdiaz02/Adios_Mendeley#extracting-all-pdf-annotations-and-placing-them-in-an-org-mode-file
>   The truth is I use two mechanisms for PDF annotation and highlighting
>   extraction, since none is fully satisfactory to me, but the one that uses
>   Ruby (i.e., that does not depend on poppler) is able to actually extract
>   the text of the highlights.
>
Ah, man, that looks really cool and I'm sorry I didn't know about it
earlier! I haven't read through your whole document but it looks like there's
a lot of useful stuff there.



>
> Best, and thanks again for sharing,
>
You're welcome & thank you!
m
