emacs-orgmode@gnu.org archives
From: Robert Horn <rjhorn@alum.mit.edu>
To: Karl Voit <news1142@Karl-Voit.at>
Cc: emacs-orgmode@gnu.org
Subject: Re: How do you store web pages for reference?
Date: Mon, 16 Jan 2017 11:52:59 -0500
Message-ID: <m3tw8zkkus.fsf@quad.robs.office>
In-Reply-To: <2017-01-16T17-27-24@devnull.Karl-Voit.at>


There is also a Firefox plugin, "ScrapBook X", which is a successor to
Scrapbook.  It can capture a single web page (keeping its links to the
outside world), and it lets you select additional pages to capture,
either by link depth or link by link.  (If you have infinite time and
storage, then with the right links you might attempt to capture the
entire Internet.  Something like "capture all pages to link depth 1000"
comes to mind.)
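
To make "link depth" concrete, here is a small Python sketch of a
depth-limited walk over a link graph.  It is only an illustration, not
ScrapBook X's actual implementation: the links dictionary and the
function name pages_within_depth are hypothetical stand-ins, and no
pages are actually fetched.

```python
from collections import deque


def pages_within_depth(links, start, max_depth):
    """Return {page: depth} for pages within max_depth link hops of start.

    links is a hypothetical in-memory map from a page to the pages it
    links to; a real capture tool would discover these by fetching pages.
    """
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if depth[page] == max_depth:
            continue  # do not follow links beyond the requested depth
        for target in links.get(page, []):
            if target not in depth:  # visit each page only once
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth
```

With depth 0 only the starting page is selected; each extra level adds
every page linked from the previous level, which is why a large depth
grows so quickly.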

I use it to capture a variety of things.  Each capture is stored in a
directory tree of html, css, etc., rooted at a directory named with the
date and time at which the capture was performed.

I have neither seen nor attempted any integration with org or other
tools.  It is feasible in theory, since the file
<date-time>/index.html is a valid page starting point and links have
been rewritten as appropriate.  Something like "firefox
scrapbook-root/20170115205014/index.html" would be a proper reference.
The more a page relies on active content like javascript, the less
likely it is that the capture will save what you want, but that is
inherent in active content.
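
As a rough illustration of how such references could be generated for
org, here is a minimal Python sketch.  It assumes that each capture sits
in a directory directly under the ScrapBook root and that the directory
name is a 14-digit timestamp like the 20170115205014 above.  The
function name capture_org_links and the link description text are made
up for this example; nothing here is part of ScrapBook X or org.

```python
from datetime import datetime
from pathlib import Path


def capture_org_links(scrapbook_root):
    """Return one Org file link per capture directory under scrapbook_root."""
    links = []
    for entry in sorted(Path(scrapbook_root).iterdir()):
        index = entry / "index.html"
        if not entry.is_dir() or not index.is_file():
            continue
        try:
            # Assumption: capture directories are named YYYYMMDDHHMMSS.
            stamp = datetime.strptime(entry.name, "%Y%m%d%H%M%S")
        except ValueError:
            continue  # some other directory, not a capture
        description = "Capture " + stamp.strftime("%Y-%m-%d %H:%M:%S")
        links.append(f"[[file:{index}][{description}]]")
    return links
```

Calling capture_org_links("scrapbook-root") and pasting the resulting
lines into an org file would give one clickable link per capture.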

It would be nice to capture more metadata (as Zotero does), but
ScrapBook X preserves only minimal metadata about the capture.

R Horn
rjhorn@alum.mit.edu
