emacs-orgmode@gnu.org archives
From: Karl Voit <devnull@Karl-Voit.at>
To: emacs-orgmode@gnu.org
Subject: Re: How do you store web pages for reference?
Date: Mon, 16 Jan 2017 17:35:50 +0100
Message-ID: <2017-01-16T17-27-24@devnull.Karl-Voit.at>
In-Reply-To: m2d1fnypu0.fsf@charm-wifi.irisa.fr

Hi Alan,

* Alan Schmitt <alan.schmitt@polytechnique.org> wrote:

> On 2017-01-16 15:43, Karl Voit <devnull@Karl-Voit.at> writes:
>
>> I am using the Firefox plugin Shelve[1] which stores all of my web
>> pages visited. Those HTML files are written with an ISO time-stamp
>> in their file name. Therefore, my Memacs filename module (see sig)
>> is indexing all visited URLs and they appear on my agenda.
>>
>> So I do have a direct link between my agenda and the HTML files of
>> all web pages I have visited.
>>
>> [1] https://addons.mozilla.org/en-US/firefox/addon/shelve/
>
> This plugin looks interesting, but it seems to rely on the existing
> functionality of Firefox to save web pages. As I want to save a page
> with its picture and CSS, I would need to choose “Web page, complete”,
> but the FF documentation says “This choice allows you to view it as
> originally shown with pictures, but it may not keep the HTML link
> structure of the original page”, which worries me a little.

Well, this is a hard problem to solve any other way: when you save a
web page A that contains a link to B, do you want a local copy of A
that links to a local copy of B (which you might not have at all), or
one that keeps the URL to the online B? The latter is easy (the links
need no change when downloading).
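
The two choices can be sketched roughly as follows. This is only an
illustration, not what Shelve or Firefox does: resolve_link and the
local_copies mapping (absolute URL -> local file path) are
hypothetical names.

```python
from urllib.parse import urljoin


def resolve_link(page_url, href, local_copies):
    """Pick the target that a saved copy of page_url should use for href.

    local_copies is a hypothetical index mapping absolute URLs to the
    paths of pages that have already been saved locally.
    """
    # Turn a relative link on page A into the absolute URL of page B.
    absolute = urljoin(page_url, href)
    # Use the local copy of B if there is one; otherwise keep the
    # online URL, which is the "no change when downloading" case.
    return local_copies.get(absolute, absolute)
```

With an empty mapping, every link simply stays online, which is the
easy case described above.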

> Do you only save the html or the pictures as well. If it's the latter,
> have you had any issues about links not being preserved?

I save everything.

My settings (with self-translated terms from German):

Settings: MIME: Webpage, complete (HTML)

My default shelve:

Template: 
C:\Users\karl.voit\browser_history\%Y-%M_myhostname\%Y-%M-%DT%h.%m_%{host}_-_%{title}.html

MIME: Standard

This way, I end up with all web pages stored in my file system. When
I open a URL, the browser shows my local copy. Sometimes, embedded
content is not loaded correctly. All links point to their original
target (of course). So in case I want to stay local, I do not click
on any link in my local copy.

I mainly navigate through my agenda and its links: agenda -> local
copy -> back to agenda -> next local copy -> back to agenda -> ...
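
The step from time-stamped file names to agenda entries can be
sketched like this. It is not the Memacs code, only an illustration
under assumptions: that the template above yields names such as
2017-01-16T17.35_example.org_-_Some title.html (zero-padded fields,
%M as month and %m as minute), and that an Org heading with an active
time stamp and a file link is the desired output. org_entry is a
hypothetical helper name.

```python
import re
from datetime import datetime

# Assumed file-name shape: 2017-01-16T17.35_example.org_-_Some title.html
PATTERN = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2}T\d{2}\.\d{2})"
    r"_(?P<host>.+?)_-_(?P<title>.*)\.html$"
)

# Fixed English day names, so the output does not depend on the locale.
DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


def org_entry(filename):
    """Return an Org heading for a saved page, or None if the name does not match."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    when = datetime.strptime(match.group("stamp"), "%Y-%m-%dT%H.%M")
    # Active Org time stamp, e.g. <2017-01-16 Mon 17:35>
    stamp = f"<{when:%Y-%m-%d} {DAYS[when.weekday()]} {when:%H:%M}>"
    label = f"{match.group('host')}: {match.group('title')}"
    return f"** {stamp} [[file:{filename}][{label}]]"
```

Files whose names do not carry a time stamp in this form are skipped
(the function returns None), so unrelated files in the same directory
do not end up on the agenda.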

-- 
get mail|git|SVN|photos|postings|SMS|phonecalls|RSS|CSV|XML into Org-mode:
       > get Memacs from https://github.com/novoid/Memacs <
Personal Information Management > http://Karl-Voit.at/tags/pim/
Emacs-related > http://Karl-Voit.at/tags/emacs/
