From: Matt Price <moptop99@gmail.com>
To: Ramon Diaz-Uriarte <rdiaz02@gmail.com>
Cc: Org Mode <emacs-orgmode@gnu.org>
Subject: Re: Org Mode and PDF Notes!
Date: Thu, 12 Nov 2015 08:11:23 -0500
Message-ID: <CAN_Dec-XG7eA5PnHvwjvbQjFT+VsThHxB8RdJrEpWoC10+ATyg@mail.gmail.com>
In-Reply-To: <87wptnqucw.fsf@gmail.com>


On Thu, Nov 12, 2015 at 7:23 AM, Ramon Diaz-Uriarte <rdiaz02@gmail.com>
wrote:

>
>
>
> On Wed, 11-11-2015, at 21:33, Matt Price <moptop99@gmail.com> wrote:
> >>
> >>
> >>   instead of the text. Bummer! I wonder if RepliGO gives you a lot more
> >>   than the rest, or if I am doing something silly.
> >>
> > I think that there is no standard way of storing the highlight contents. I
> > chose Repligo over EZPDF because it gives you access to the text of the
> > highlights!
>
>
> I'll try to see if I can get repligo (I had it a few years ago)
>
> >
> > Okular, I think, stores your annotations in its own database, rather than
> > in the pdf. You can (I think!) attach the annotations to the pdf from
> > inside Okular.  At least, that's what I remember from when I was looking
> > around.
>
> Actually, Okular stores the annotations in the PDF itself if you do "Save
> As". (It still keeps an internal db, but I never use it anymore). It is
> easy to check by doing that and then opening the file with another reader
> on another machine (e.g., on an Android device).
>
>
My bad, thx.

>
> >
> > Repligo stores the highlighted text in the "subject" field of the
> > annotation. It's possible that the content of the annotation is stored in
> > some other field, like "content".  Maybe you can try:
> >
> > M-: (pdf-annot-get-annots) and look at the output in the *Messages*
> > buffer.  Can you see any evidence of the text? Can you share what you
> > learned?
>
> Nope, no evidence of the text. I get things such as
>
> (((buffer . #<buffer Frank_2015_Commentary.pdf>) (page . 13) (edges
> 0.113553 0.31717 0.868657 0.361746) (type . highlight) (id . annot-13-0)
> (flags . 4) (color . "#ffff00") (contents . "") (modified 22081 45188)
> (label . "TF201") (subject . "Highlight") (opacity . 1.0) ...)
>
>
> so we get the location of the highlight (and its properties), but not the
> textual contents. And this is the case whether I make the annotation with
> EzPDF or Okular or, for that matter, with pdf-tools itself.
>
> So it seems RepliGO is actually giving you a lot more by default :-)
>
>
Try replacing

(text (assoc-default 'subject annot))

with

(text (pdf-info-gettext page (assoc-default 'edges annot)))


in the lambda function in pdf-annot-markups-as-org-text.  This will fail on
cropped pdfs if you have added highlights using the most recent pdf-tools,
which stores negative values in the 'edges field, but I've found it works
otherwise.  I'd love to hear if it works for you too. (I know you're
following the relevant bug report on the pdf-tools github repo).
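
A minimal sketch of that substitution as a standalone helper, in case it is
easier to test outside the lambda. The names `my/annot-edges-usable-p' and
`my/annot-highlight-text' are hypothetical (they are not part of pdf-tools),
and falling back to the 'subject field when the edges look unusable is my
assumption, not something pdf-tools does itself. The annotation is treated as
the alist shown in the output quoted above.

```elisp
;; Hypothetical helpers; only `pdf-info-gettext' comes from pdf-tools.
(declare-function pdf-info-gettext "pdf-info")

(defun my/annot-edges-usable-p (edges)
  "Return non-nil if EDGES is a list of four non-negative numbers.
Negative values are what show up for highlights on cropped PDFs."
  (and (listp edges)
       (= (length edges) 4)
       (not (memq nil (mapcar (lambda (e) (and (numberp e) (>= e 0)))
                              edges)))))

(defun my/annot-highlight-text (annot)
  "Return the text under ANNOT's edges, or its subject as a fallback.
ANNOT is an alist with `page', `edges' and `subject' entries."
  (let ((page (assoc-default 'page annot))
        (edges (assoc-default 'edges annot))
        (buf (assoc-default 'buffer annot)))
    (if (my/annot-edges-usable-p edges)
        ;; Query the PDF from the annotation's own buffer when it is live.
        (with-current-buffer (if (buffer-live-p buf) buf (current-buffer))
          (pdf-info-gettext page edges))
      ;; Assumption: keep the old behaviour when the edges cannot be used.
      (assoc-default 'subject annot))))
```

With that in place, the binding in the lambda would become
(text (my/annot-highlight-text annot)).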


>
> >
> > Politza and I are discussing this here:
> > https://github.com/politza/pdf-tools/issues/137
> >
> > that might be a good place to continue the conversation.
> >
>
> I'll do so. In the meantime, I think this is a limitation coming from
> poppler. Other people have mentioned similar things (e.g.,
> http://coda.caseykuhlman.com/entries/2014/pdf-extract.html), and other
> tools that depend on poppler (such as Leela:
> https://github.com/TrilbyWhite/Leela) will not give us the text itself
> either.
>
>
>
> >>
> > Until I found pdf-tools, I had planned to write a node wrapper for pdf.js
> > and grab the annotations that way.  But I don't really know how to do that,
> > so this turned out to be easier :-)
> >
> > Anyway, I've updated the post, and it's now possible to create links to
> > individual annotations, though you will have to use my updated version of
> > org-pdfview, until/unless Markus accepts my patch.
>
>
> I just updated packages, and things are working perfectly: I am jumping to
> the page and location.
>
>
>
> Thanks,
>
>
> R.
>
>
>
>
>

