From: John Kitchin
Subject: Re: exported contacts problem
Date: Fri, 02 Aug 2019 18:19:47 -0400
In-reply-to: <20190802160236.GR17561@protected.rcdrun.com>
To: emacs-orgmode@gnu.org
Cc: Jude DaShiell

There are a few options for contacts in org-mode that I have tried. I
agree that for a lot of contacts (probably more than a hundred or so),
native org-contacts might be too slow. In scimax I have tried a few
different approaches to deal with this.

The first is all org/elisp and uses a cache to speed up looking up
contacts. See
https://github.com/jkitchin/scimax/blob/master/contacts.el
The gist is that you define a list of files that contacts will be in.
As long as a file hasn't changed since the cache was built, the cache
is used; if it has changed, the cache for that file is rebuilt.
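
The per-file check described above can be sketched roughly as follows.
This is a minimal illustration in Python, not the elisp in contacts.el.
The names load_contacts and parse_contacts, the cache layout, and the
toy parser (one contact per top-level heading, reading only an :EMAIL:
property) are all assumptions made for the example.

```python
import os
from typing import Callable, Dict, List, Tuple

# Hypothetical cache layout: path -> (mtime when parsed, parsed contacts).
Cache = Dict[str, Tuple[float, List[dict]]]


def parse_contacts(path: str) -> List[dict]:
    """Toy parser (assumption): one contact per top-level heading,
    picking up an :EMAIL: property line if one follows the heading."""
    contacts: List[dict] = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            stripped = line.strip()
            if line.startswith("* "):
                contacts.append({"name": line[2:].strip(), "email": None})
            elif stripped.startswith(":EMAIL:") and contacts:
                contacts[-1]["email"] = stripped[len(":EMAIL:"):].strip()
    return contacts


def load_contacts(files: List[str],
                  parse: Callable[[str], List[dict]],
                  cache: Cache) -> List[dict]:
    """Return contacts from all files, re-parsing only files whose
    modification time differs from the one recorded in the cache."""
    contacts: List[dict] = []
    for path in files:
        mtime = os.path.getmtime(path)
        entry = cache.get(path)
        if entry is None or entry[0] != mtime:
            # File is new or has changed: rebuild the cache for this file only.
            entry = (mtime, parse(path))
            cache[path] = entry
        contacts.extend(entry[1])
    return contacts
```

Calling load_contacts twice with the same cache dictionary parses each
unchanged file only once. Persisting the cache between sessions is left
out of the sketch.
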

For large files, the org parsing in this approach is not that fast, and
I have not spent any time optimizing it. There are lots of ways to
speed it up, but it hasn't been slow enough to need solving yet. I use
this code regularly, but nothing critical depends on it working. The
downside is that you have to maintain a list of files that serve as
contact sources. I am pretty sure everything in it is compatible with
org-contacts. This method works fine, and I have about 5.8K contacts in
this cache. I use this approach very regularly and have built some nice
org-speedkeys and other interesting ideas on top of these contacts.

The second approach is also based on caching, and uses a sqlite
database as the cache:
https://github.com/jkitchin/scimax/blob/master/org-db.el
This approach uses a hook function to update the db any time a file is
updated. I can access about 7500 contacts pretty quickly this way
(evidently I have a lot of contacts outside the list of files
designated as contact sources in the first approach). I don't use this
approach as much for contacts as for searching all headlines in all my
org files, but I built a contacts feature because I could. It could be
expanded to a server-based database, but so far sqlite has served all
my needs, and it has about 72K headlines indexed right now. The main
benefit is that it is mostly set it and forget it; contacts get updated
as you create or edit them.

The main benefit of both approaches is that they let you keep org as
the primary way of creating contacts, including keeping all manner of
notes associated with them in sub-headings. You can import contacts
from anywhere just by writing code that creates the headings in
org-contacts form. Since there is a cache, you could probably also
write directly to the cache instead of making intermediate org files.
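
For the .vcf question that started this thread, the kind of import code
meant above might look like the following. It is a minimal Python
sketch, not part of scimax or org-contacts. The function name
vcard_to_org is made up for the example. It assumes a simple, unfolded
vCard with FN, EMAIL and TEL fields, and it assumes the EMAIL and PHONE
property names; folded lines, encodings and grouped properties are not
handled.

```python
from typing import List


def vcard_to_org(vcard_text: str) -> str:
    """Convert one simple vCard to an org heading with a property drawer.

    Assumption: each field sits on a single 'KEY;params:value' line.
    """
    name = None
    emails: List[str] = []
    phones: List[str] = []
    for raw in vcard_text.splitlines():
        if ":" not in raw:
            continue
        key, value = raw.split(":", 1)
        # Drop parameters such as ';TYPE=work' from the field name.
        field = key.split(";", 1)[0].upper()
        if field == "FN":
            name = value.strip()
        elif field == "EMAIL":
            emails.append(value.strip())
        elif field == "TEL":
            phones.append(value.strip())
    if name is None:
        raise ValueError("vCard has no FN field")

    lines = [f"* {name}", ":PROPERTIES:"]
    if emails:
        # Several addresses go into one EMAIL property, space separated.
        lines.append(":EMAIL: " + " ".join(emails))
    if phones:
        # PHONE as the property name is an assumption of this sketch.
        lines.append(":PHONE: " + " ".join(phones))
    lines.append(":END:")
    return "\n".join(lines) + "\n"
```

Running this over each .vcf attachment and appending the results to one
org file would give a file that could then be added to the list of
contact sources.
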

The downside, in my opinion, is that the cache only mostly works most
of the time, which I consider acceptable. It is not hard for the cache
and the files to get out of sync (nor is it difficult to resync them).
For my work, there is very little consequence if I can't find a
contact. You can organize your contacts as you see fit all over your
file system, which is at times convenient and at times hard to deal
with. Duplicates, for example, are a challenge, and sometimes contacts
seem to belong in more than one place. This is a limitation of org at
the moment: there is no way to "transclude" a heading yet. You can
create groups, which are just headings with multiple emails in the
EMAIL property, and you can use todo states and tags on the contacts to
select them with ivy/helm.

I haven't converged on the best way to do this kind of persistent
caching. I also use it in org-ref for very large (20K) bibtex
databases, and it works fine there too.

I also use mu4e, which indexes everything into a xapian database. It
has about 9K contacts in it right now, and indexes about 100K messages.
So far this works for me. It is more complicated to set up than the
previous two options, though.

I wouldn't claim any of these can scale to 192K contacts, but it seems
there are paths towards it. It would certainly take a dedicated effort
with an eye towards performance, though.

Jean Louis writes:

> * Jude DaShiell [2019-08-02 17:48]:
>> I have one email message with several .vcf file attachments. Has orgmode
>> got any tool or tools I can use to import contacts from such a message
>> into an orgmode table?
>
> And by all means, I would never keep contact in Org file, that is for
> short list fine, but for any future planning, contacts shall be in a
> real database sorted by its lists.
>
> A list could be group of people, or account name, or company name, or
> organization, or interest lists.
>
> Neither bbdb nor Org is suitable for any serious collection of
> contacts.
> I have 192,000+ contacts, and when they are in database and
> I am using PostgreSQL, it gives me most of benefits, I can sort people
> into lists, groups, I can contact them, count interactions, open up
> their files, emails with a fast command, edit their data, add notes,
> send them faxes and SMS, maintain relations.
>
> Jean

--
Professor John Kitchin
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu