* Yet another browser extension for capturing notes - LinkRemark @ 2020-12-25 12:44 Maxim Nikulin 2020-12-25 14:19 ` Ihor Radchenko 2020-12-25 14:26 ` Yet another browser extension for capturing notes - LinkRemark Russell Adams 0 siblings, 2 replies; 21+ messages in thread From: Maxim Nikulin @ 2020-12-25 12:44 UTC (permalink / raw) To: emacs-orgmode

I am experimenting with a browser add-on that is intended to be a bridge between the browser and Org mode. In the family of Org mode capture helpers it is among the ones that add web page metadata to the note.

Source code repository: https://github.com/maxnikulin/linkremark

Examples

Link:

--->8---
Link: Karl Voit: UOMF: Managing web bookmarks with Org Mode
:PROPERTIES:
:DATE_ADDED: [2020-12-25 18:06]
:END:

- Link URL :: [[https://karl-voit.at/2014/08/10/bookmarks-with-orgmode/]]
- Link text :: Karl Voit: UOMF: Managing web bookmarks with Org Mode

On the page

- URL :: [[https://alphapapa.github.io/org-almanac/]]
- title :: org-almanac
- author :: Adam Porter
- referrer :: [[https://www.google.com/]]
---8<---

Page:

--->8---
public voit
:PROPERTIES:
:DATE_ADDED: [2020-12-25 18:11]
:URL_IMAGE: http://Karl-Voit.at/images/public-voit_T_logo_200x200.png
:END:

- URL :: [[https://karl-voit.at/2014/08/10/bookmarks-with-orgmode/]]
- title :: public voit
- author :: Karl Voit
- published_time :: 2014-08-10T17:13+01:00
- referrer :: [[https://alphapapa.github.io/org-almanac/]]

#+begin_quote
In my notes.org file, I collect all kind of snippets, knowledge, ideas, how-tos, and such stuff.
#+end_quote
---8<---

It is not really ready for the wild web, though I believe it is already possible to get a general impression and even to use it for pages where specially crafted data are rather unlikely. Due to the early development stage, there is no stability promise yet.

The extension has not been published to catalogues of browser extensions.
A signed version for Firefox can be found in the "releases" section on GitHub:
https://github.com/maxnikulin/linkremark/releases/download/v0.1/linkremark-0.1-fx.xpi

For Chrome/Chromium it can be loaded as an unpacked extension. Just clone the code and create a symlink to =manifest-chrome.json= named =manifest.json=.

The =README.org= file contains a bit more detail, so visit [[https://github.com/maxnikulin/linkremark]] or just clone the repository.

The mailing list has been quite noisy for the last couple of months, so please do not post lengthy proposals on how to integrate this extension to everything in response. The gift is crafted quite roughly and the glue has not fully cured, so do not be surprised if you get stuck trying to adapt it to your habits.

Merry Christmas and Happy New Year!

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark 2020-12-25 12:44 Yet another browser extension for capturing notes - LinkRemark Maxim Nikulin @ 2020-12-25 14:19 ` Ihor Radchenko 2020-12-26 11:49 ` Maxim Nikulin 2020-12-25 14:26 ` Yet another browser extension for capturing notes - LinkRemark Russell Adams 1 sibling, 1 reply; 21+ messages in thread From: Ihor Radchenko @ 2020-12-25 14:19 UTC (permalink / raw) To: Maxim Nikulin, emacs-orgmode Maxim Nikulin <manikulin@gmail.com> writes: > I am experimenting with a browser add-on that is intended > to be a bridge between browser and Org mode. > In the family of Org mode capture helpers it is among ones > that adds web page metadata to the note. > Source code repository: https://github.com/maxnikulin/linkremark The author of org-capture-ref here. Reading through the code, I can see that you are familiar with metadata conventions. Do you know good references about what og: metadata is commonly used? I looked through the official OpenGraph specification, but popular websites appear to ignore most of the conventions. Also, org-capture-ref does not really force the user to put BiBTeX into the capture. Individual metadata fields are available using org-capture-ref-get-bibtex-field (which extracts data from internal alist structure). It's just that I mostly had BiBTeX in mind (with distant goal of supporting export to LaTeX) for my use-cases. Finally, would you be interested to join efforts on metadata parsing? (I hope this question does not qualify as "integrate this extension to everything"). P.S. Some links I collected myself when working on org-capture-ref. They might also be of interest for you: - https://github.com/ageitgey/node-unfluff - https://github.com/gabceb/node-metainspector - https://github.com/wikimedia/html-metadata - https://github.com/microlinkhq/metascraper - https://github.com/hboisgibault/unicontent Best, Ihor ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark 2020-12-25 14:19 ` Ihor Radchenko @ 2020-12-26 11:49 ` Maxim Nikulin 2020-12-26 13:49 ` Ihor Radchenko 0 siblings, 1 reply; 21+ messages in thread From: Maxim Nikulin @ 2020-12-26 11:49 UTC (permalink / raw) To: emacs-orgmode On 25/12/2020, Ihor Radchenko wrote: > > Reading through the code, I can see that you are familiar with metadata > conventions. Do you know good references about what og: metadata is > commonly used? I looked through the official OpenGraph specification, > but popular websites appear to ignore most of the conventions. I just inspected pages on several sites using developer tools and added code that handles noticed elements. I have not tried to find any resources on metadata (OK, once I searched for LD+JSON, essentially the outcome was the link to schema.org that I have seen in data already). Looking into page source, I realized that almost nobody cares if the site has metadata of appropriate quality. I think, search engines are advanced enough to work without metadata and even decrease page rank if something suspicious was added by SEO. The only force to add some formal data is "share" buttons. Maybe some guides for web developers from social networks or search engines could be more useful than formal references, but I have not had a closer look. > Also, org-capture-ref does not really force the user to put BiBTeX into > the capture. Individual metadata fields are available using > org-capture-ref-get-bibtex-field (which extracts data from internal > alist structure). It's just that I mostly had BiBTeX in mind (with > distant goal of supporting export to LaTeX) for my use-cases. I do not have clear vision how to use collected data for queries. Certainly I want to have more human-friendly representation than BibTeX entries (maybe in addition to machine-parsable data) adjacent to my notes. Personally, I would prefer to avoid http queries from Emacs. 
Sometimes it is better to have current DOM state, not page source, that is why I decided to gather data inside browser, despite security fences that are placed quite strangely in some cases. From my point of view, you should be happy with any of projects you mentioned below. Are all of them have some problems critical for you? Technically it should be possible to push e.g. raw document.head.innerHtml to any external metadata parser using native messaging (to deal with sites requiring authorization). However it could cause an alarm during review before publication of the extension to the browser catalogues. > Finally, would you be interested to join efforts on metadata parsing? Could you, please, share a bit more details on your ideas? There is some room for improvement, but I do not think that quality of metadata for ordinary sites could be dramatically better. The case that is not handled it all is scientific publications, unfortunately currently I have quite little interest in it. Definitely results should be stored in some structured format such as BibTeX. I have seen huge <head> elements describing even all references. Certainly such lists are not for general-purpose notes (at least without explicit request from the user), they should be handled by some bibliography software to display citation graphs in the local library. On the other hand it is not a problem to feed such data to some tool using native messaging protocol. I have no idea if various publisher provide such data in a uniform way, I just hope that pressure from citation indices and bibliography management software has positive influence on standardization. I am not going to blow up the code with recipes for particular sites. However I realize that some special cases still should be handled. I am not ready to adapt user script model used by Greasemonkey/Violentmonkey/Tampermonkey. 
I believe, it is better to create dedicated extension(s) that either adds and overwrites existing meta elements or allows to query gathered data using sendMessage webextensions interface. By the way, scripts for above mentioned extensions could be used as well. It should alleviate cases when some site with insane metadata is important for particular user. > P.S. Some links I collected myself when working on org-capture-ref. They > might also be of interest for you: > > - https://github.com/ageitgey/node-unfluff > - https://github.com/gabceb/node-metainspector > - https://github.com/wikimedia/html-metadata > - https://github.com/microlinkhq/metascraper > - https://github.com/hboisgibault/unicontent Thank you for the links. I should have a closer look at that projects. E.g. I considered itemprop="author" elements but postponed implementation of such features. For some reason I even did not tried to find existing projects for metadata extraction. Maybe I still hope that quite simple implementation could handle most of the cases. ^ permalink raw reply [flat|nested] 21+ messages in thread
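For readers following the metadata discussion in the message above, here is a minimal sketch of the "simple implementation" idea: collect <meta> name/property values and the <title> text from a page and pick a title. It is not LinkRemark's code (the extension runs inside the browser). The names MetaCollector and extract_metadata are hypothetical, and the preference order (og:title, then twitter:title, then the head title) is only an assumption chosen for illustration.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <title> text and <meta> name/property values from HTML."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # OpenGraph uses the "property" attribute, most others use "name".
            key = attrs.get("property") or attrs.get("name")
            value = attrs.get("content")
            if key and value:
                # Keep the first value seen for each key.
                self.meta.setdefault(key.lower(), value.strip())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def extract_metadata(html):
    """Return a small dict of fields; the preference order is an assumption."""
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    meta = parser.meta
    head_title = "".join(parser.title_parts).strip()
    title = meta.get("og:title") or meta.get("twitter:title") or head_title
    return {
        "title": title or None,
        "author": meta.get("author"),
        "published_time": meta.get("article:published_time"),
    }
```

Site-specific quirks (itemprop attributes, LD+JSON, several competing titles) are deliberately ignored here; they are exactly the cases the thread says need extra handling.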
* Re: Yet another browser extension for capturing notes - LinkRemark 2020-12-26 11:49 ` Maxim Nikulin @ 2020-12-26 13:49 ` Ihor Radchenko 2020-12-27 12:18 ` Maxim Nikulin 2021-11-18 17:01 ` LinkRemark Firefox extension approved for addons.mozilla.org Max Nikulin 0 siblings, 2 replies; 21+ messages in thread From: Ihor Radchenko @ 2020-12-26 13:49 UTC (permalink / raw) To: Maxim Nikulin, emacs-orgmode Maxim Nikulin <manikulin@gmail.com> writes: > I just inspected pages on several sites using developer tools and added > code that handles noticed elements. I see. I basically did the same, except some minimal support for OpenGraph (though I stopped when I saw that even YouTube is not following the standard, except the most basic fields). > The only force to add some formal data is "share" buttons. Maybe some > guides for web developers from social networks or search engines could > be more useful than formal references, but I have not had a closer > look. It is also consistent with what I saw. <meta .. twitter:..> fields seems to be very common. >> Also, org-capture-ref does not really force the user to put BiBTeX into >> the capture. Individual metadata fields are available using >> org-capture-ref-get-bibtex-field (which extracts data from internal >> alist structure). It's just that I mostly had BiBTeX in mind (with >> distant goal of supporting export to LaTeX) for my use-cases. > > I do not have clear vision how to use collected data for queries. > Certainly I want to have more human-friendly representation than BibTeX > entries (maybe in addition to machine-parsable data) adjacent to my notes. So far, I found author, website name, publication year, title, and resource type useful. My standard capture template for links is: * <Author> [<Website>] (<Year>) Title Example: * dash-docs-el [Github] Dash-Docs-El Helm-Dash: Browse Dash Docsets Inside Emacs Such headlines can be easily searched later, especially when I also add some #keywords manually. 
> Personally, I would prefer to avoid http queries from Emacs. Sometimes > it is better to have current DOM state, not page source, that is why I > decided to gather data inside browser, despite security fences that are > placed quite strangely in some cases. Completely agree here. That's why I directly reuse the current DOM state from qutebrowser in my own setup. However, extension for qutebrowser was easy to write for me as it can be simply a bash script. I know nothing about Firefox/Chrome extensions and I do not know javascript. On the other hand, having an ability to get html is still useful in my case (Emacs package) when the capture is not done from browser. For example, I often capture links from elfeed - http query from Emacs is useful then. > From my point of view, you should be happy with any of projects you > mentioned below. Are all of them have some problems critical for you? They are all javascript, except one (unicontent), which can be easily replaced with built-in Elisp libraries (dom.el). >> Finally, would you be interested to join efforts on metadata parsing? > > Could you, please, share a bit more details on your ideas? > Technically it should be possible to push e.g. raw > document.head.innerHtml to any external metadata parser using native > messaging (to deal with sites requiring authorization). However it could > cause an alarm during review before publication of the extension to the > browser catalogues. That's unfortunate. Pushing raw html/dom is what I had in mind when talking about joining efforts. Another idea would be providing a callback from elisp to browser (I am not sure if it is possible). org-capture-ref has a mechanism to check if the link was captured in the past. If the link is already captured, the information about the link location and todo-state can be messaged back to the browser. Example message (only qutebrowser is supported now): Bookmark not saved! 
Already captured into org-capture-ref:TODO maxnikulin [Github] linkremark: LinkRemark - page or link notes with context >There is some room for improvement, but I do not think that quality of > metadata for ordinary sites could be dramatically better. The case > that is not handled it all is scientific publications, unfortunately > currently I have quite little interest in it. Definitely results > should be stored in some structured format such as BibTeX. I have seen > huge <head> elements describing even all references. Certainly such > lists are not for general-purpose notes (at least without explicit > request from the user), they should be handled by some bibliography > software to display citation graphs in the local library. On the other > hand it is not a problem to feed such data to some tool using native > messaging protocol. I have no idea if various publisher provide such > data in a uniform way, I just hope that pressure from citation indices > and bibliography management software has positive influence on > standardization. I think https://github.com/microlinkhq/metascraper#core-rules can be used for ideas. It has generic parsing apart from site-specific rules. For the scientific publications, the key point is usually getting DOI/ISBN. Then, most of the metadata can be obtained using standard API of doi.org or various ISBN databases. In addition, reference data is generally available in OpenCitations.net (they also have all kinds of web APIs). Also, do you pass any of the parsed metadata to org-protocol? If you do, it would be trivial to get it into capture templates on Elisp (and org-capture-ref) side. > I am not going to blow up the code with recipes for particular sites. > However I realize that some special cases still should be handled. I am > not ready to adapt user script model used by > Greasemonkey/Violentmonkey/Tampermonkey. 
I believe, it is better to > create dedicated extension(s) that either adds and overwrites existing > meta elements or allows to query gathered data using sendMessage > webextensions interface. By the way, scripts for above mentioned > extensions could be used as well. It should alleviate cases when some > site with insane metadata is important for particular user. I see. This is another point I thought it could be worth collaborating. The parser rules just need to be written once (probably in some common format, like json) and then can be reused. > For some reason I even did not tried to > find existing projects for metadata extraction. Maybe I still hope that > quite simple implementation could handle most of the cases. Actually, simple parsing does fairly good job on most of websites. It's just that it is not ideal. For example, I tweaked title of captured github issues to include "issue#", which helps to distinguish such pages from individual repo bookmarks. I believe that such adjustments should be available for the users, which was where org-capture-ref code started from. Best, Ihor ^ permalink raw reply [flat|nested] 21+ messages in thread
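The headline template quoted in the message above, "* <Author> [<Website>] (<Year>) Title", can be sketched as a small formatting function. This is only an illustration in Python, not org-capture-ref's Elisp; the function name format_headline is hypothetical, and dropping missing fields silently is an assumption based on the dash-docs-el example, which has no year.

```python
def format_headline(title, author=None, website=None, year=None):
    """Build an Org headline of the form '* Author [Website] (Year) Title'.

    Fields that are missing are simply left out.
    """
    parts = []
    if author:
        parts.append(author)
    if website:
        parts.append(f"[{website}]")
    if year:
        parts.append(f"({year})")
    # Collapse runs of whitespace so the headline stays on one line.
    parts.append(" ".join(title.split()))
    return "* " + " ".join(parts)
```

Called with author "dash-docs-el", website "Github" and the title from the example, it reproduces the headline shown in the message.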
* Re: Yet another browser extension for capturing notes - LinkRemark 2020-12-26 13:49 ` Ihor Radchenko @ 2020-12-27 12:18 ` Maxim Nikulin 2021-11-18 17:01 ` LinkRemark Firefox extension approved for addons.mozilla.org Max Nikulin 1 sibling, 0 replies; 21+ messages in thread From: Maxim Nikulin @ 2020-12-27 12:18 UTC (permalink / raw) To: emacs-orgmode On 26/12/2020 20:49, Ihor Radchenko wrote: > Maxim Nikulin <manikulin@gmail.com> writes: I have reordered some parts of discussion > Also, do you pass any of the parsed metadata to org-protocol? If you > do, it would be trivial to get it into capture templates on Elisp > (and org-capture-ref) side. I decided that capture could be too complicated to fit into simple query parameters of org protocol, e.g. it could be a chain of frames. That is why I implemented just simple option title + body (url is available but it is contained in the body). I am considering generating of tree of headings in some cases. On the other hand almost all captured data is available to native messaging backend https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Native_messaging A dumb example is included into the sources. It is python, but you could use any other language. It is just streaming JSON with message size sent in binary form. I have added JSON-RPC to let native messaging host to report errors and to avoid ambiguity related to attribution of response to particular request. I do not think that setting up of org-protocol handler is harder than adding manifest for native messaging backend. It should be even a bit safer since some weird org-protocol message could not be placed behind an innocent link text. I think it should be no problem to call emacs-client from such application. Isn't it enough for customization? Do you still need raw html? Currently I am trying to avoid customization inside the extensions since it is harder to keep history of settings changes in git. Extensions are quite isolated from host. 
Also I do not think that something like mustache/handlebars templates would be warmly welcomed by emacs users. >> I do not have clear vision how to use collected data for queries. >> Certainly I want to have more human-friendly representation than BibTeX >> entries (maybe in addition to machine-parsable data) adjacent to my notes. > > So far, I found author, website name, publication year, title, and > resource type useful. My standard capture template for links is: > > * <Author> [<Website>] (<Year>) Title I see that my current choice to prefer og:title or twitter:title for header is far from been optimal, even head/title text usually is better. However I was writing about a bit more detailed two or three-line representation. Often I prefer a kind of "card" representation to table/columns view. Concerning queries, see below. > Completely agree here. That's why I directly reuse the current DOM state > from qutebrowser in my own setup. However, extension for qutebrowser was > easy to write for me as it can be simply a bash script. I know nothing > about Firefox/Chrome extensions and I do not know javascript. It is too easy to underquote some variable reference in bash and to get executed something unexpected. Almost any other script language is safer in this sense. >> From my point of view, you should be happy with any of projects you >> mentioned below. Are all of them have some problems critical for you? > > They are all javascript, except one (unicontent), which can be easily > replaced with built-in Elisp libraries (dom.el). I mean running them using a very thin wrapper that generates metadata in the form easily parsable in emacs. > Another idea would be providing a callback from elisp to browser (I am > not sure if it is possible). org-capture-ref has a mechanism to check if > the link was captured in the past. If the link is already captured, the > information about the link location and todo-state can be messaged back > to the browser. 
> > Example message (only qutebrowser is supported now): > > Bookmark not saved! > Already captured into org-capture-ref:TODO maxnikulin [Github] linkremark: LinkRemark - page or link notes with context Why it should be a callback from elisp? From my point of view it is extension that should initiate a query if particular URL has been captured already. I have realized that in my drafts I even have a native messaging backend that could filter matched URLs from a text file. It was intended to autocomplete URLs typed in the browser location bar using text file as a kind of bookmark storage, but it could be adapted for checks similar to yours. Though it is better to get link to the header with URL (e.g. CUSTOM_ID), so additional links or quotes could be added and linked to the "main" entry. I have not tried if such query using emacs-client is fast enough. I have seen a thread on Language Server Protocol but have not checked if that protocol supports such queries. I especially like idea of references to existing headers because it allows to avoid cluttering context menus with options to capture link without page metadata in addition to existing ones. ^ permalink raw reply [flat|nested] 21+ messages in thread
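The native messaging framing mentioned in the message above ("streaming JSON with message size sent in binary form") can be sketched as follows. Per the MDN page linked there, each message is UTF-8 JSON preceded by a 32-bit length in native byte order. This is not the example shipped with LinkRemark; the helper names read_message and write_message are hypothetical. A real host would pass the binary streams sys.stdin.buffer and sys.stdout.buffer.

```python
import json
import struct


def write_message(stream, message):
    """Serialize `message` as JSON and write it with a 4-byte length prefix."""
    payload = json.dumps(message).encode("utf-8")
    # "=I" is an unsigned 32-bit integer in native byte order.
    stream.write(struct.pack("=I", len(payload)))
    stream.write(payload)
    stream.flush()


def read_message(stream):
    """Read one length-prefixed JSON message; return None at end of stream."""
    header = stream.read(4)
    if len(header) == 0:
        return None
    if len(header) < 4:
        raise EOFError("truncated message length")
    (length,) = struct.unpack("=I", header)
    payload = stream.read(length)
    if len(payload) < length:
        raise EOFError("truncated message body")
    return json.loads(payload.decode("utf-8"))
```

A JSON-RPC envelope, as described in the message, would simply be the dictionary passed to write_message, for example {"jsonrpc": "2.0", "id": 1, "method": "hello"}, where the method name is made up for illustration.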
* LinkRemark Firefox extension approved for addons.mozilla.org 2020-12-26 13:49 ` Ihor Radchenko 2020-12-27 12:18 ` Maxim Nikulin @ 2021-11-18 17:01 ` Max Nikulin 1 sibling, 0 replies; 21+ messages in thread From: Max Nikulin @ 2021-11-18 17:01 UTC (permalink / raw) To: emacs-orgmode

A year ago I announced the LinkRemark browser extension to save metadata of web pages as notes in Org Mode. A new version is available in the Firefox catalog (it is not published to the Chrome store; the only option there is still to load it as an unpacked extension):
https://addons.mozilla.org/firefox/addon/linkremark/

Capture is not ideal and notes require edits. Some subset of schema.org microdata embedded into HTML markup is extracted now. I addressed some issues from comments on the first release. Example:

#+begin_src org
,* Link: Karl Voit: UOMF: Managing web bookmarks with Org Mode
:PROPERTIES:
:DATE_ADDED: [2021-09-28 Tue 12:15]
:END:

- Link URL :: [[https://karl-voit.at/2014/08/10/bookmarks-with-orgmode/]]
- Link text :: Karl Voit: UOMF: Managing web bookmarks with Org Mode

,#+begin_quote
author: Karl Voit
published: [2014-08-10 Sun]
,#+end_quote

On the page

,** Adam Porter — org-almanac
:PROPERTIES:
:DATE_ADDED: [2021-09-28 Tue 12:15]
:LAST_MODIFIED: [2021-09-18 Tue 01:23] 09/18/2021 01:23:46
:END:

- URL :: [[https://alphapapa.github.io/org-almanac/]]
- title :: org-almanac
- author :: Adam Porter
- referrer :: [[https://www.google.com/]]
#+end_src

On 26/12/2020 20:49, Ihor Radchenko wrote:
>
> Another idea would be providing a callback from elisp to browser (I am
> not sure if it is possible). org-capture-ref has a mechanism to check if
> the link was captured in the past. If the link is already captured, the
> information about the link location and todo-state can be messaged back
> to the browser.

I looked into the org-capture-ref code and stole the idea of using an external tool to search in Org files. LinkRemark can now ask a native messaging helper application whether URLs are already known.
Proof of concept: https://github.com/maxnikulin/burl

> For the scientific publications, the key point is usually getting
> DOI/ISBN.

At least apparent DOIs and links should be recognized now, though no additional actions are taken.

> Also, do you pass any of the parsed metadata to org-protocol? If you do,
> it would be trivial to get it into capture templates on Elisp (and
> org-capture-ref) side.

Actually, even a year ago it was possible to specify the "object" format instead of "org" and to get the extracted metadata in JSON format wrapped into an org-protocol URI. I cannot say that the structure of the data has stabilized or that I will not change it again.

> For example, I tweaked title of captured
> github issues to include "issue#", which helps to distinguish such pages
> from individual repo bookmarks.

In the particular case of GitHub it is better to fetch raw data:

    curl -H 'Accept: application/vnd.github.v3+json' 'https://api.github.com/repos/yantar92/org-capture-ref/issues/2'

On 26/12/2020 05:11, Samuel Wales wrote:
> for
> example, you could have sets of tabs, selected by right click in
> firefox, to save to a bunch of org entries. then you could load that
> particular set of entries into firefox whenever you want. and you
> could keep notes on each page and move the entries wherever you want.
> this would be useful for such things as "i am researching rice
> cookers; these are my tabs, but i don't want them cluttering firefox
> and i want them with my org notes and to make notes on them and will
> re-load them into firefox when i want to revisit"

I implemented capture of a highlighted tab group for Firefox. No ready-to-use solution is provided to restore it. It is just a tree of Org headings.

> now if i can only debug the extra-blank-lines-in-capture problem.

I hope that the hack used to avoid excessive newlines in selected text will not become a source of problems. Clipboard managers might cause trouble though.

^ permalink raw reply [flat|nested] 21+ messages in thread
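The DOI route discussed in the message above (recognize a DOI, then obtain metadata through doi.org) can be sketched like this. It is an illustration only, not what LinkRemark or org-capture-ref do. The names find_doi and bibtex_request are hypothetical, the regular expression is a common heuristic rather than a full DOI grammar, and requesting BibTeX through the Accept header relies on doi.org content negotiation as I understand it. The sketch only builds the request; sending it with urllib.request.urlopen is left to the caller.

```python
import re
import urllib.parse
import urllib.request

# Heuristic pattern: "10.", a registrant code, "/", then a non-space suffix.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def find_doi(text):
    """Return the first DOI-looking substring of `text`, or None."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Trailing punctuation usually belongs to the sentence, not to the DOI.
    return match.group(0).rstrip(".,;)")


def bibtex_request(doi):
    """Build (but do not send) a request asking doi.org for BibTeX."""
    url = "https://doi.org/" + urllib.parse.quote(doi, safe="/")
    return urllib.request.Request(url, headers={"Accept": "application/x-bibtex"})
```

Keeping the network call out of the helper makes it easy to test the parsing separately and to route the request through whatever tool already handles authorization.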
* Re: Yet another browser extension for capturing notes - LinkRemark 2020-12-25 12:44 Yet another browser extension for capturing notes - LinkRemark Maxim Nikulin 2020-12-25 14:19 ` Ihor Radchenko @ 2020-12-25 14:26 ` Russell Adams 2020-12-25 22:11 ` Samuel Wales 1 sibling, 1 reply; 21+ messages in thread From: Russell Adams @ 2020-12-25 14:26 UTC (permalink / raw) To: emacs-orgmode On Fri, Dec 25, 2020 at 07:44:22PM +0700, Maxim Nikulin wrote: > I am experimenting with a browser add-on that is intended > to be a bridge between browser and Org mode. > In the family of Org mode capture helpers it is among ones > that adds web page metadata to the note. > Source code repository: https://github.com/maxnikulin/linkremark That's a really neat idea! I hadn't previously considered having a Firefox plugin to capture information. Now I must look! ------------------------------------------------------------------ Russell Adams RLAdams@AdamsInfoServ.com PGP Key ID: 0x1160DCB3 http://www.adamsinfoserv.com/ Fingerprint: 1723 D8CA 4280 1EC9 557F 66E8 1154 E018 1160 DCB3 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark 2020-12-25 14:26 ` Yet another browser extension for capturing notes - LinkRemark Russell Adams @ 2020-12-25 22:11 ` Samuel Wales 2020-12-26 9:16 ` Maxim Nikulin 0 siblings, 1 reply; 21+ messages in thread From: Samuel Wales @ 2020-12-25 22:11 UTC (permalink / raw) To: emacs-orgmode maxim, it is great to see new work in this area. thanks for sharing. russell, i use the org-capture extension for firefox, which is on the firefox extensions site. it is for if you want a different set of data captured [it uses your org capture template]. it works well for me. [not a suggestion for maxim to integrate into everything; ignore please. i can imagine great things possible with such extensions. for example, you could have sets of tabs, selected by right click in firefox, to save to a bunch of org entries. then you could load that particular set of entries into firefox whenever you want. and you could keep notes on each page and move the entries wherever you want. this would be useful for such things as "i am researching rice cookers; these are my tabs, but i don't want them cluttering firefox and i want them with my org notes and to make notes on them and will re-load them into firefox when i want to revisit".] [now if i can only debug the extra-blank-lines-in-capture problem.] On 12/25/20, Russell Adams <RLAdams@adamsinfoserv.com> wrote: > On Fri, Dec 25, 2020 at 07:44:22PM +0700, Maxim Nikulin wrote: >> I am experimenting with a browser add-on that is intended >> to be a bridge between browser and Org mode. >> In the family of Org mode capture helpers it is among ones >> that adds web page metadata to the note. >> Source code repository: https://github.com/maxnikulin/linkremark > > That's a really neat idea! > > I hadn't previously considered having a Firefox plugin to capture > information. Now I must look! 
> > ------------------------------------------------------------------ > Russell Adams RLAdams@AdamsInfoServ.com > > PGP Key ID: 0x1160DCB3 http://www.adamsinfoserv.com/ > > Fingerprint: 1723 D8CA 4280 1EC9 557F 66E8 1154 E018 1160 DCB3 > > -- The Kafka Pandemic Please learn what misopathy is. https://thekafkapandemic.blogspot.com/2013/10/why-some-diseases-are-wronged.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark 2020-12-25 22:11 ` Samuel Wales @ 2020-12-26 9:16 ` Maxim Nikulin 2022-01-17 2:29 ` Samuel Wales 0 siblings, 1 reply; 21+ messages in thread From: Maxim Nikulin @ 2020-12-26 9:16 UTC (permalink / raw) To: emacs-orgmode On 26/12/2020, Samuel Wales wrote: > [... i can imagine great things possible with such extensions. for > example, you could have sets of tabs, selected by right click in > firefox, to save to a bunch of org entries. then you could load that > particular set of entries into firefox whenever you want. and you > could keep notes on each page and move the entries wherever you want. > this would be useful for such things as "i am researching rice > cookers; these are my tabs, but i don't want them cluttering firefox > and i want them with my org notes and to make notes on them and will > re-load them into firefox when i want to revisit".] It should be possible since some tab management extension were used in mozilla to evaluate if webextensions are mature enough and if support of XUL add-ons could be dropped. On the other hand do not expect such feature soon. A kind of semi-blocker is absence of automatic tests to run before every release, and it will require a lot of time. In the meanwhile, have you looked at the following comment? https://github.com/sprig/org-capture-extension/issues/12#issuecomment-323569334 alphapapa commented Aug 20, 2017 > You can do this with the "Copy all URLs" extension (ID: > djdmadneanknadilpjiknlnanaolmbfk). Use this as the custom format (note > the linebreak): > > [[$url][$title]] I am almost sure that similar extension should exist for Firefox as well. Some points should be clarified in my opinion - Do you expect that metadata should be captured in addition to URLs and titles? Browsers can unload some tabs making page content unavailable. 
- Are you going to capture reviews of "rice cookers" that could be considered ordinary pages, or are you going to save items from online stores? I do not know the current state of affairs, but I have heard about some activity around special metadata that allows search engines to display products in a special way. Could you check, using the page source or inspect element tools, whether the head element of pages in your favorite stores contains the desired metadata?

- Should a tab group be captured as a single Org heading, or should it be a tree with a section per tab? I am not sure that capture will have no problems with a subtree. Certainly the Emacs interface for org-protocol + capture is not suitable for sending each tab as a separate link. Another option is to create nested lists; anyway, the Org formatter in my extension needs improvements. Are you expecting a subtree of headings or nested lists?

> [now if i can only debug the extra-blank-lines-in-capture problem.]

Fully agree that it is really annoying. It is among the high-priority items in my TODO list. Accidentally I pressed =C-x C-o= and discovered [[help:delete-blank-lines]]

innerText is not exactly the same as selection range toString, but the rules could work in a similar way. Table rows, floating and absolutely positioned elements require newlines. Such elements are often abused by designers.
https://html.spec.whatwg.org/multipage/dom.html#dom-innertext

^ permalink raw reply [flat|nested] 21+ messages in thread
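The two layouts asked about in the message above (a tree with a section per tab versus a plain list) can be compared with a small sketch. It is not the formatter used by LinkRemark. The names org_link and tabs_to_org are hypothetical, the tab records are assumed to be dictionaries with "url" and "title" keys, and replacing square brackets in titles with parentheses is a simplification of Org link escaping.

```python
def org_link(url, title):
    """Format an Org link; brackets in the description are replaced."""
    description = title.replace("[", "(").replace("]", ")")
    return f"[[{url}][{description}]]"


def tabs_to_org(group_title, tabs, as_tree=True):
    """Render a tab group as an Org heading with subheadings or list items."""
    lines = [f"* {group_title}"]
    for tab in tabs:
        link = org_link(tab["url"], tab["title"])
        # One subheading per tab, or one list item per tab.
        lines.append(f"** {link}" if as_tree else f"- {link}")
    return "\n".join(lines) + "\n"
```

With as_tree=True every tab becomes a heading that can hold its own notes and be refiled; with as_tree=False the group stays a compact list under a single heading.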
* Re: Yet another browser extension for capturing notes - LinkRemark 2020-12-26 9:16 ` Maxim Nikulin @ 2022-01-17 2:29 ` Samuel Wales 2022-01-18 1:03 ` Samuel Wales 2022-01-18 10:34 ` Max Nikulin 0 siblings, 2 replies; 21+ messages in thread From: Samuel Wales @ 2022-01-17 2:29 UTC (permalink / raw) To: Maxim Nikulin; +Cc: emacs-orgmode more below. On 12/26/20, Maxim Nikulin <manikulin@gmail.com> wrote: > On 26/12/2020, Samuel Wales wrote: > >> [... i can imagine great things possible with such extensions. for >> example, you could have sets of tabs, selected by right click in >> firefox, to save to a bunch of org entries. then you could load that >> particular set of entries into firefox whenever you want. and you >> could keep notes on each page and move the entries wherever you want. >> this would be useful for such things as "i am researching rice >> cookers; these are my tabs, but i don't want them cluttering firefox >> and i want them with my org notes and to make notes on them and will >> re-load them into firefox when i want to revisit".] > > It should be possible since some tab management extension were used in > mozilla to evaluate if webextensions are mature enough and if support of > XUL add-ons could be dropped. On the other hand do not expect such > feature soon. A kind of semi-blocker is absence of automatic tests to > run before every release, and it will require a lot of time. interesting. i do note tab selection features in recent firefox-esr and i was just assuming something like that. > > In the meanwhile, have you looked at the following comment? > https://github.com/sprig/org-capture-extension/issues/12#issuecomment-323569334 > alphapapa commented Aug 20, 2017 > >> You can do this with the "Copy all URLs" extension (ID: >> djdmadneanknadilpjiknlnanaolmbfk). Use this as the custom format (note >> the linebreak): >> >> [[$url][$title]] > > I am almost sure that similar extension should exist for Firefox as well. 
i think this is for copying all tabs, not selected ones. so a workaround for my idea would be to have a fresh firefox window dedicated to rice cookers and then save them all. but it does not save over existing canonical location for each url or similar. which would be needed for my idea so as to not have duplicates etc. also i think this extension does not exist any more in firefox. i used to use it for storing as org links. but it was just to store links in case firefox screwed up session restore. which it usually does. for that purpose, i use one that does not save as orglinks. > > Some points should be clarified in my opinion > > - Do you expect that metadata should be captured in addition to URLs and > titles? Browsers can unload some tabs making page content unavailable. i wouldn't need this i think. i'd want page title, just as in ordinary org links, but in principle that can be assumed from the existing org entry if exists, and if not exists and you are capturing, the page is already loaded. so i think not a metadata issue. > - Are you going to capture reviews of "rice cookers" that could be > considered as ordinary pages or you are going to save items from online > stores? I do not current state of affairs but I have heard about some > activity for special metadata that allows search engines to display > products in a special way. Could you inspect head element of pages in > your favorite stores contains desired metadata using page source or > inspect element tools? my web knowledge is too limited to understand your question, but i am just hoping it would capture ordinary amazon links, review sites, and so on. and i never use js if i can avoid it so i'm expecting pretty normal website stuff i think. so i'm flexible. 
[of course, amazon per se links might need cleaning or uniquification of some type for finding the version in org maybe, or maybe for improving privacy by removing amazon's data about you in the url, but that might not even need any special amazon link knowledge. [fanciness might look for the amazon id, if implementer willing or something exists for that.]] > - Should tab group be captured as single Org heading or it should be a > tree with a section per tab? I am not sure that capture will have no > problem with subtree. Certainly Emacs interface for org-protocol + > capture are not suitable for sending each tab as a separate link. > Another option is to create nested lists, anyway org formatter in my > extension need improvements. Are you expecting headings subtree or > nested lists? the status quo is that there is nothing, so using lists would be a huge improvement and work great. but fanciness by using org sections if poss [i assume this means header and metadata and content and maybe descendants] could be more flexible. > >> [now if i can only debug the extra-blank-lines-in-capture problem.] > > Fully agree that it is really annoying. It is among high priority items > in my TODO list. we might be talking about different things. i am referring to something in org that adds blank lines when my particular org capture templates are used. i think it is outside all of the hooks that are available for org capture so not fixable using those. recent org might fix it dunno. i am limited in computer use so i have not tried to debug it further. just delete the extra lines. > > Accidentally I pressed =C-x C-o= and discovered > [[help:delete-blank-lines]] innerText is not exactly the same as > selection range toString but the rules could work in a similar way. > Table rows, floating and absolutely positioned elements require > newlines. Such elements are often abused by designers. 
> https://html.spec.whatwg.org/multipage/dom.html#dom-innertext web stuff is above my knowledge and so i think maybe different things we are talking about. you would still keep notes on each thing and org metadata. then you load all links in an org subtree or list, or all with a :firefox: tag, into firefox. one question is making sure there is a canonical place for each topic. [rice cookers, a research topic, etc.] metadata snags like you mention are best figured out by those who understand them unlike myself and i'd be flexible. i'd be pleased with anything i think. i don't need metadata most of the time, just link and page title. this is all just an idea for cogitation. tldr you'd have a set of canonical tabs that is in org and sometimes in firefox as you please. you can keep org notes on the org links and they won't be overwritten when you save from firefox. you also won't create duplicates when you do so. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark 2022-01-17 2:29 ` Samuel Wales @ 2022-01-18 1:03 ` Samuel Wales 2022-01-18 5:43 ` Samuel Banya 2022-01-18 10:34 ` Max Nikulin 1 sibling, 1 reply; 21+ messages in thread From: Samuel Wales @ 2022-01-18 1:03 UTC (permalink / raw) To: Maxim Nikulin; +Cc: emacs-orgmode my amazon example was silly and confusing. the point isn't shopping for something; it's anything. science papers, news outlets, nerd blogs. -- The Kafka Pandemic A blog about science, health, human rights, and misopathy: https://thekafkapandemic.blogspot.com ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark 2022-01-18 1:03 ` Samuel Wales @ 2022-01-18 5:43 ` Samuel Banya 2022-01-18 10:57 ` Max Nikulin 0 siblings, 1 reply; 21+ messages in thread From: Samuel Banya @ 2022-01-18 5:43 UTC (permalink / raw) To: Charles Berry Not sure if it helps, but you could also use the w3m browser's mentality of just keeping an HTML file that contains all of your bookmarks. I'm sure there's probably even a way to use 'eww' in the same fashion too. Maybe even making your own personal wiki of a webring of sorts would help too. I don't personally bookmark anything anymore but just store links on a webring on my site. Hope this helps. Sam ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark 2022-01-18 5:43 ` Samuel Banya @ 2022-01-18 10:57 ` Max Nikulin 0 siblings, 0 replies; 21+ messages in thread From: Max Nikulin @ 2022-01-18 10:57 UTC (permalink / raw) To: emacs-orgmode On 18/01/2022 12:43, Samuel Banya wrote: > Not sure if it helps, but you could also use the w3m browser's mentality > of just keeping an HTML file that contains all of your bookmarks. I'm > sure there's probably even a way to use 'eww' in the same fashion too. > > Maybe even making your own personal wiki of a webring of sorts would > help too. > > I don't personally bookmark anything anymore but just store links on a > webring on my site. Actually, Samuel Wales added more details to his message posted a year ago. I started that thread to announce the LinkRemark browser extension https://github.com/maxnikulin/linkremark It was me who tried to revive the thread a month ago. The idea is to store bookmarks in an Org file, and a bookmark should be more than just a URL and a page title. A rich "bookmark" should have more metadata and may have user comments. In eww you can likely use org-store-link or org-capture directly. An example of a project that extracts metadata: https://github.com/yantar92/org-capture-ref Isn't Org mode better than any wiki? At least in some aspects. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark 2022-01-17 2:29 ` Samuel Wales 2022-01-18 1:03 ` Samuel Wales @ 2022-01-18 10:34 ` Max Nikulin 2022-01-19 3:28 ` Ihor Radchenko 1 sibling, 1 reply; 21+ messages in thread From: Max Nikulin @ 2022-01-18 10:34 UTC (permalink / raw) To: emacs-orgmode Samuel, since a significant part of your message is dedicated to capturing tab groups, I should ask whether you have tried the version of the LinkRemark add-on currently available from the browser extension catalogues: - https://addons.mozilla.org/firefox/addon/linkremark/ - https://chrome.google.com/webstore/detail/mgmcoaemjnaehlliifkgljdnbpedihoe Groups of tabs or selected (highlighted) tabs are supported for Chromium. Firefox has no built-in tab groups, but it is still possible to capture selected tabs. Your feature requests: - Clean-up URLs. I have such an idea, but I have not started implementing it. Maybe URLs should be sent to another extension that excels at such a task. If you have some comments on which add-ons are great and which work rather poorly, the suggestions may be helpful. - Deduplicate URLs from tab groups. It requires some work to merge selected text, links, or nested frames from each tab. The complication is that some sites use internal navigation not reflected in location, so the same URL may have completely different content. Some sites have their top pages as canonical URLs, so some measures against false positives are required. Currently the extension may check whether a URL is already present in Org files. It requires the https://github.com/maxnikulin/burl helper application, which is at the proof-of-concept stage. - Restore set of tabs. It requires some elisp code to iterate over a subtree and to pick the first "Link URL" or "URL" from description lists. Currently I am thinking about some changes to the interface, since sometimes I just want to check whether some URL is in my notes already. I would prefer to avoid adding more context menu items. Additional details are inline. 
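As a rough illustration of the "restore set of tabs" item above, here is a minimal sketch in Python rather than elisp. It assumes entries formatted like the examples in the announcement ("- Link URL :: [[...]]" and "- URL :: [[...]]" description items under a heading); the name `first_urls` is hypothetical, and this is not code from LinkRemark or burl. It picks the first such URL under each heading and skips exact duplicates, which also touches on the deduplication item in its simplest form.

```python
import re

# A heading line: one or more stars followed by a space.
HEADING_RE = re.compile(r"^\*+ ")
# A description-list item whose tag is "Link URL" or "URL", holding a bracket link.
URL_ITEM_RE = re.compile(r"^\s*- (?:Link URL|URL) :: \[\[([^\]]+)\]")


def first_urls(org_text):
    """Return the first URL found under each heading, without exact duplicates."""
    urls = []
    seen = set()
    found_in_entry = False
    for line in org_text.splitlines():
        if HEADING_RE.match(line):
            # A new entry starts; look for its first URL again.
            found_in_entry = False
            continue
        if found_in_entry:
            continue
        match = URL_ITEM_RE.match(line)
        if match:
            found_in_entry = True
            url = match.group(1)
            if url not in seen:
                seen.add(url)
                urls.append(url)
    return urls
```

Each returned URL could then be handed to the standard library's `webbrowser.open_new_tab(url)`. Exact string comparison is a deliberately naive notion of "duplicate"; as noted above, real sites need protection against false positives.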
On 17/01/2022 09:29, Samuel Wales wrote: > On 12/26/20, Maxim Nikulin <manikulin@gmail.com> wrote: >> On 26/12/2020, Samuel Wales wrote: >> >>> [... i can imagine great things possible with such extensions. for >>> example, you could have sets of tabs, selected by right click in >>> firefox, to save to a bunch of org entries. then you could load that >>> particular set of entries into firefox whenever you want. > > interesting. i do note tab selection features in recent firefox-esr > and i was just assuming something like that. There is no ready-to-use recipe for loading saved tabs, but saving should work to some extent. >>> You can do this with the "Copy all URLs" extension (ID: >>> djdmadneanknadilpjiknlnanaolmbfk). Use this as the custom format (note >>> the linebreak): >> >> I am almost sure that similar extension should exist for Firefox as well. > > i think this is for copying all tabs, not selected ones. ... > also i think this extension does not exist any more in firefox. I have not tried them: - https://github.com/piroor/copy-selected-tabs-to-clipboard/ - https://github.com/yorkxin/copy-as-markdown >> - Are you going to capture reviews of "rice cookers" that could be >> considered as ordinary pages or you are going to save items from online >> stores? ... >> Could you inspect head element of pages in >> your favorite stores contains desired metadata using page source or >> inspect element tools? > > my web knowledge is too limited to understand your question, but i am > just hoping it would capture ordinary amazon links, review sites, and > so on. It seems that the quality of metadata in marketplaces like Amazon depends heavily on the particular seller. The extension attempts to treat some data specially if there are microdata or JSON-LD with the Product schema.org type. If I remember correctly, Amazon does not expose a canonical link explicitly. >>> [now if i can only debug the extra-blank-lines-in-capture problem.] >> >> Fully agree that it is really annoying. 
It is among high priority items >> in my TODO list. > > we might be talking about different thinks. i am referring to > something in org that adds blank lines when my particular org capture > templates are used. See info "(org) Template elements" https://orgmode.org/manual/Template-elements.html :empty-lines, :empty-lines-after, :empty-lines-before; however, I cannot say that I really understand their meaning. Actually, I do not mind having an empty line before the next heading when refile is completed. My impression is that it depends on the number of empty lines at the end of the capture buffer. I usually add some comments to captured pages. On 18/01/2022 08:03, Samuel Wales wrote: > my amazon example was silly and confusing. the point isn't shopping > for something; it's anything. science papers, news outlets, nerd > blogs. Scientific papers require more work: it is necessary to make them available to org-cite somehow. Some nerds use quite peculiar blog engines and strange metadata settings. So shopping on some sites might work better than other cases. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark 2022-01-18 10:34 ` Max Nikulin @ 2022-01-19 3:28 ` Ihor Radchenko 2022-01-19 8:45 ` András Simonyi 2022-01-20 0:23 ` Samuel Wales 0 siblings, 2 replies; 21+ messages in thread From: Ihor Radchenko @ 2022-01-19 3:28 UTC (permalink / raw) To: Max Nikulin; +Cc: emacs-orgmode Max Nikulin <manikulin@gmail.com> writes: > Scientific papers require more work, it is necessary to make them > available to org-cite somehow. Some nerds use quite peculiar blog > engines and strange setting of metadata. So shopping on some sites might > work better than other cases. I have plans to implement something called oc-org.el. The plan is to use ol-bibtex-compatible Org headings as a source of citations. Best, Ihor ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark 2022-01-19 3:28 ` Ihor Radchenko @ 2022-01-19 8:45 ` András Simonyi 2022-01-19 10:00 ` Ihor Radchenko 2022-01-20 0:23 ` Samuel Wales 1 sibling, 1 reply; 21+ messages in thread From: András Simonyi @ 2022-01-19 8:45 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Max Nikulin, emacs-orgmode Dear All, On Wed, 19 Jan 2022 at 04:24, Ihor Radchenko <yantar92@gmail.com> wrote: > > Scientific papers require more work, it is necessary to make them > > available to org-cite somehow. Some nerds use quite peculiar blog > > engines and strange setting of metadata. So shopping on some sites might > > work better than other cases. > > I have plans to implement something called oc-org.el The plan is > using ol-bibtex-compatible Org headings as a source of citations. Just wanted to note that the CSL-based export processor, oc-csl.el, already supports this: you can add an Org file as a bibliography, cite items described by ol-bibtex style headings and export the citations. It'd be very nice indeed if other built-in processors supported the format too (e.g., "basic"). As for external ones, the CSL-based activation processor I wrote (https://github.com/andras-simonyi/org-cite-csl-activate) also supports it and there are plans to add support to Citar as well (through parsebib); see the discussion at https://github.com/bdarcus/citar/issues/397. best wishes, András > Best, > Ihor > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark 2022-01-19 8:45 ` András Simonyi @ 2022-01-19 10:00 ` Ihor Radchenko 2022-01-19 10:58 ` András Simonyi 0 siblings, 1 reply; 21+ messages in thread From: Ihor Radchenko @ 2022-01-19 10:00 UTC (permalink / raw) To: András Simonyi; +Cc: Max Nikulin, emacs-orgmode András Simonyi <andras.simonyi@gmail.com> writes: > Just wanted to note that the CSL-based export processor, oc-csl.el, > already supports this: you can add an Org file as a bibliography, cite > items described by ol-bibtex style headings and export the citations. Thanks for telling! oc-csl is tricky because it relies on an external library. So, it's hard to know what it can do and what it cannot do. As a side note, citeproc-el currently has poor performance on large Org files. It is unusable for me. > It'd be very nice indeed if other built-in processors supported the > format too (e.g., "basic"). As for external ones, the CSL-based > activation processor I wrote > (https://github.com/andras-simonyi/org-cite-csl-activate) also > supports it Interesting. By the way, I recommend using composition instead of the display property for rendering. See prettify-symbols-mode. Best, Ihor ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark 2022-01-19 10:00 ` Ihor Radchenko @ 2022-01-19 10:58 ` András Simonyi 2022-01-19 11:42 ` Ihor Radchenko 0 siblings, 1 reply; 21+ messages in thread From: András Simonyi @ 2022-01-19 10:58 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Max Nikulin, emacs-orgmode Dear All, On Wed, 19 Jan 2022 at 10:56, Ihor Radchenko <yantar92@gmail.com> wrote: > As a side note, citeproc-el currently has poor performance on large org > files. It is unusable for me. Could you elaborate? In theory, oc-csl.el's performance should depend only on the number of citations (as opposed to the size of the Org document) and be in the same ballpark as pandoc's citeproc. It'd be interesting to know the details since I plan to work on speeding up citeproc-el's rendering, although you are the first one to actually complain :-). best wishes, András ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Yet another browser extension for capturing notes - LinkRemark
From: Ihor Radchenko @ 2022-01-19 11:42 UTC
To: András Simonyi; +Cc: Max Nikulin, emacs-orgmode

András Simonyi <andras.simonyi@gmail.com> writes:

>> As a side note, citeproc-el currently has poor performance on large org
>> files. It is unusable for me.
>
> Could you elaborate? In theory, oc-csl.el's performance should depend
> only on the number of citations (as opposed to the size of the Org
> document) and be in the same ballpark as pandoc's citeproc. It'd be
> interesting to know the details since I plan to work on speeding up
> citeproc-el's rendering, although you are the first one to actually
> complain :-).

It is no wonder that I complain: my "bibliography" file is 15 MB. The oc-csl.el performance depends on the size of the Org document during the caching stage. Moreover, caching is repeated every time I change the Org document, every time I open the file using oc-csl.el, and every time I revert the file using oc-csl.el.

I think that the easiest solution for citeproc would be to stop calling org-bibtex-headline on every single headline and to use a regexp search for the "BTYPE" property instead.

Best,
Ihor
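Editorial illustration of the regexp idea above. This is only a minimal sketch in Python, not code from citeproc-el or oc-csl.el: it assumes that bibliography entries are Org headings whose property drawer contains a BTYPE line, and the names `BTYPE_RE`, `heading_before` and `find_bib_entries` are hypothetical. Instead of visiting every heading, it searches the text for BTYPE lines and walks back to the nearest heading only for those matches. It does not skip matches inside source or example blocks.

```python
import re

# A property line such as ":BTYPE: article" (Org property names are case-insensitive).
BTYPE_RE = re.compile(r"^[ \t]*:BTYPE:[ \t]+(\S+)[ \t]*$", re.MULTILINE | re.IGNORECASE)
# An Org heading line: one or more stars, at least one space, then the title.
HEADING_RE = re.compile(r"\*+ +(.*)")


def heading_before(org_text, pos):
    """Return the title of the nearest heading above line start `pos`, or None."""
    line_end = pos - 1  # index of the newline that ends the previous line
    while line_end >= 0:
        line_start = org_text.rfind("\n", 0, line_end) + 1
        match = HEADING_RE.match(org_text[line_start:line_end])
        if match:
            # The title still includes any TODO keyword and tags.
            return match.group(1).strip()
        line_end = line_start - 1
    return None


def find_bib_entries(org_text):
    """Return (heading title, entry type) pairs found via BTYPE property lines."""
    entries = []
    for match in BTYPE_RE.finditer(org_text):
        title = heading_before(org_text, match.start())
        if title is not None:
            entries.append((title, match.group(1).lower()))
    return entries


if __name__ == "__main__":
    sample = (
        "* Plain note\n"
        "Some text\n"
        "* A paper\n"
        ":PROPERTIES:\n"
        ":BTYPE: article\n"
        ":END:\n"
    )
    for title, entry_type in find_bib_entries(sample):
        print(entry_type, title)
```

The work done is proportional to the number of BTYPE matches plus the distance from each match to its heading, so headings without the property are never inspected.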
* Re: Yet another browser extension for capturing notes - LinkRemark
From: Samuel Wales @ 2022-01-20 0:23 UTC
To: Ihor Radchenko; +Cc: Max Nikulin, emacs-orgmode

just a quick fwiw before i try to reply to the longer message by max.

my own suggestion is modest for metadata [even for science papers and things with funny web construction]. just the title, like the org-capture extension. no need to cite in my case.

my needs for saving and restoring, however, are more fancy. something like achieving a 1:1 mapping from firefox selected tabs, or a tree style tabs extension tree, to their counterparts in org, even when those counterparts have notes and such. this might include marking the org version as deleted/doneified merely by closing the tab in firefox. vice-versa would be straightforward. so it's really a "get organized and don't get confused by having both firefox and org" kinda thing.
* Org mode and firefox tabs (feature request)
From: Max Nikulin @ 2022-01-20 12:16 UTC
To: emacs-orgmode

On 20/01/2022 07:23, Samuel Wales wrote:

>
> my needs for saving and restoring, however, are more fancy. something
> like achieving a 1:1 mapping from firefox selected tabs, or a tree
> style tabs extension tree, to their counterparts in org, even when
> those counterparts have notes and such. this might include marking
> the org version as deleted/doneified merely by closing tab in
> firefox. vice-versa would be straightforward. so it's really a "get
> organized and don't get confused by having both firefox and org" kinda
> thing.

Let's split this into a separate subthread. I am not planning to adopt such a workflow; you may, however, try to inspire somebody else with this idea.

What may be done in LinkRemark is some API to capture tabs when another add-on requests it. Tab management extensions offer some plugin APIs, so it may be implemented as a third add-on that glues Tree Style Tab and LinkRemark together.

I have not tried KDE Plasma Integration or mapping of browser tabs onto a filesystem, but I expect that some implementation ideas may be borrowed from these projects.

- https://community.kde.org/Plasma/Browser_Integration
- https://omar.website/tabfs/ https://news.ycombinator.com/item?id=25600338
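Editorial illustration of the tab-to-Org mapping discussed above. This is a rough sketch in Python under stated assumptions, not an API of LinkRemark, Tree Style Tab or Firefox: the `Tab` record and the `tabs_to_org` function are hypothetical, and the tab tree is assumed to have already been exported by some glue add-on. It only shows how a tree of tabs could be rendered as an Org outline, one heading per tab.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tab:
    """Hypothetical record for one browser tab and its child tabs."""
    title: str
    url: str
    children: List["Tab"] = field(default_factory=list)


def tabs_to_org(tabs, level=1):
    """Render a tab tree as Org heading lines, one heading per tab."""
    lines = []
    for tab in tabs:
        # Square brackets in the description would end the Org link early,
        # so they are replaced with parentheses.
        title = tab.title.replace("[", "(").replace("]", ")")
        lines.append(f"{'*' * level} [[{tab.url}][{title}]]")
        # Child tabs become subheadings one level deeper.
        lines.extend(tabs_to_org(tab.children, level + 1))
    return lines


if __name__ == "__main__":
    tree = [
        Tab("org-almanac", "https://alphapapa.github.io/org-almanac/", [
            Tab("public voit", "https://karl-voit.at/2014/08/10/bookmarks-with-orgmode/"),
        ]),
    ]
    print("\n".join(tabs_to_org(tree)))
```

The reverse direction (marking a heading as done when its tab is closed) would additionally need a stable identifier shared by the tab and the heading, which this sketch does not attempt.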
end of thread, other threads:[~2022-01-20 16:27 UTC | newest]

Thread overview: 21+ messages
2020-12-25 12:44 Yet another browser extension for capturing notes - LinkRemark Maxim Nikulin
2020-12-25 14:19 ` Ihor Radchenko
2020-12-26 11:49 ` Maxim Nikulin
2020-12-26 13:49 ` Ihor Radchenko
2020-12-27 12:18 ` Maxim Nikulin
2021-11-18 17:01 ` LinkRemark Firefox extension approved for addons.mozilla.org Max Nikulin
2020-12-25 14:26 ` Yet another browser extension for capturing notes - LinkRemark Russell Adams
2020-12-25 22:11 ` Samuel Wales
2020-12-26 9:16 ` Maxim Nikulin
2022-01-17 2:29 ` Samuel Wales
2022-01-18 1:03 ` Samuel Wales
2022-01-18 5:43 ` Samuel Banya
2022-01-18 10:57 ` Max Nikulin
2022-01-18 10:34 ` Max Nikulin
2022-01-19 3:28 ` Ihor Radchenko
2022-01-19 8:45 ` András Simonyi
2022-01-19 10:00 ` Ihor Radchenko
2022-01-19 10:58 ` András Simonyi
2022-01-19 11:42 ` Ihor Radchenko
2022-01-20 0:23 ` Samuel Wales
2022-01-20 12:16 ` Org mode and firefox tabs (feature request) Max Nikulin