From: Ramon Diaz-Uriarte
Subject: Re: Org Mode and PDF Notes!
Date: Thu, 12 Nov 2015 13:23:43 +0100
Message-ID: <87wptnqucw.fsf@gmail.com>
References: <877floffyq.fsf@gmail.com>
To: Matt Price
Cc: Ramon Diaz-Uriarte, Org Mode
List-Id: "General discussions about Org-mode."

On Wed, 11-11-2015, at 21:33, Matt Price wrote:

>> instead of the text. Bummer! I wonder if RepliGO gives you a lot more
>> than the rest, or if I am doing something silly.
>
> I think that there is no standard way of storing the highlight contents. I
> chose Repligo over EZPDF because it gives you access to the text of the
> highlights!

I'll try to see if I can get RepliGO (I had it a few years ago).

> Okular, I think, stores your annotations in its own database, rather than
> in the pdf. You can (I think!) attach the annotations to the pdf from
> inside Okular. At least, that's what I remember from when I was looking
> around.

Actually, Okular stores the annotations in the PDF itself if you do "Save
As". (It still keeps an internal db, but I never use it anymore.) It is easy
to check by doing that and then opening the file with another reader on
another machine (e.g., on an Android device).

> Repligo stores the highlighted text in the "subject" field of the
> annotation. It's possible that the content of the annotation is stored in
> some other field, like "content". Maybe you can try:
>
> M-: (pdf-annot-get-annots) and look at the output in the *Messages*
> buffer. Can you see any evidence of the text? Can you share what you
> learned?

Nope, no evidence of the text. I get things such as

(((buffer . #) (page . 13) (edges 0.113553 0.31717 0.868657 0.361746)
  (type . highlight) (id . annot-13-0) (flags . 4) (color . "#ffff00")
  (contents . "") (modified 22081 45188) (label . "TF201")
  (subject . "Highlight") (opacity . 1.0) ...)

so we get the location of the highlight (and its properties), but not the
textual contents. And this is the case whether I make the annotation with
EzPDF or Okular or, for that matter, with pdf-tools itself.

So it seems RepliGO is actually giving you a lot more by default :-)

> Politza and I are discussing this here:
> https://github.com/politza/pdf-tools/issues/137
>
> that might be a good place to continue the conversation.

I'll do that. In the meantime, I think this is a limitation coming from
poppler. Other people have mentioned similar things (e.g.,
http://coda.caseykuhlman.com/entries/2014/pdf-extract.html), and other
tools that depend on poppler (such as Leela:
https://github.com/TrilbyWhite/Leela) will not give us the text itself
either.

> Until I found pdf-tools, I had planned to write a node wrapper for pdf.js
> and grab the annotations that way.
> But I don't really know how to do that,
> so this turned out to be easier :-)
>
> Anyway, I've updated the post, and it's now possible to create links to
> individual annotations, though you will have to use my updated version of
> org-pdfview, until/unless Markus accepts my patch.

I just updated packages, and things are working perfectly: I am jumping to
the page and location.

Thanks,

R.

-- 
Ramon Diaz-Uriarte
Department of Biochemistry, Lab B-25
Facultad de Medicina
Universidad Autónoma de Madrid
Arzobispo Morcillo, 4
28029 Madrid
Spain

Phone: +34-91-497-2412

Email: rdiaz02@gmail.com
       ramon.diaz@iib.uam.es

http://ligarto.org/rdiaz