From mboxrd@z Thu Jan 1 00:00:00 1970
To: emacs-orgmode@gnu.org
From: Maxim Nikulin
Subject: Re: [PATCH] Bug: fragile org refile cache
Date: Thu, 29 Apr 2021 23:51:53 +0700
References: <87v98598un.fsf@localhost> <87k0olxjpz.fsf@localhost> <877dklxecq.fsf@localhost>
In-Reply-To: <877dklxecq.fsf@localhost>

On 29/04/2021 23:08, Ihor Radchenko wrote:
> Maxim Nikulin writes:
>
>> Curiously my experience is that avoiding this lazy cache with
>> backtracking and maintaining custom structure during sequential scan of
>> the buffer works several times faster.
>
> My experience is exactly opposite. Or maybe I miss something. Can you
> elaborate?

My benchmarks may be incorrect because I did not compile the files of
the development version. I did not purge the outline path cache either.
https://orgmode.org/list/s1qola$158l$1@ciao.gmane.io/

> Outline path without cache:
>
> (benchmark-run 1
>   (goto-char (point-min))
>   (while (re-search-forward "^\\*+" nil t)
>     (org-get-outline-path t nil))) => (6.051079914 1 0.2864724879999869)
>
> Outline path with cache:
>
> (benchmark-run 1
>   (goto-char (point-min))
>   (while (re-search-forward "^\\*+" nil t)
>     (org-get-outline-path t nil))) => (1.658461165 0 0.0)

The last argument of org-get-outline-path should be t in the second
benchmark, I suppose. I agree with such a test.

Notice, however, the following patch (warning: :level and :max-level are
cached with the same key):
https://orgmode.org/list/s209r8$16en$1@ciao.gmane.io/
Avoiding calls to org-get-outline-path and using a custom structure
during a single-pass scan through the buffer significantly improved
performance.

> Just cleanup heading text:
>
> (benchmark-run 1
>   (goto-char (point-min))
>   (while (re-search-forward "^\\*+" nil t)
>     (let ((case-fold-search nil))
>       (looking-at org-complex-heading-regexp)
>       (if (not (match-end 4)) ""
>         ;; Remove statistics cookies.
>         (org-trim
>          (org-link-display-format
>           (replace-regexp-in-string
>            "\\[[0-9]+%\\]\\|\\[[0-9]+/[0-9]+\\]" ""
>            (match-string-no-properties 4)))))))) => (0.013364877 0 0.0)

I may be wrong about the following. Profiling org-refile-get-targets
itself could give quite different results. I have seen a note that Emacs
internally caches only 5 compiled regular expressions. With just one
extra regexp in the loop, every matching function has to recompile its
regexp, because it has just been evicted from the cache, and compilation
is time-consuming. I am unsure whether your benchmark includes all
regexps used (directly or through function calls) by the inner loop of
org-refile-get-targets.
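
To illustrate what I mean above by a custom structure maintained during
a single pass, here is a minimal sketch. It is not the code from the
linked patch. The function name my/outline-paths-single-pass is
hypothetical, the heading regexp is deliberately simplified, and titles
are kept raw (TODO keywords, tags and statistics cookies are not
stripped). The idea is only that a stack of ancestors replaces
backtracking to parent headings.

```elisp
(defun my/outline-paths-single-pass ()
  "Return the outline path of every heading in the current buffer.
Hypothetical helper for illustration: the buffer is scanned once,
top to bottom, and a stack of ancestor headings is updated on each
match, so parents are never searched for by moving backwards."
  (save-excursion
    (goto-char (point-min))
    (let ((case-fold-search nil)
          (stack nil)   ; list of (LEVEL . TITLE), innermost heading first
          (paths nil))
      ;; Simplified heading regexp (an assumption, not Org's own):
      ;; stars, at least one space, then the raw title text.
      (while (re-search-forward "^\\(\\*+\\) +\\(.*\\)$" nil t)
        (let ((level (length (match-string 1)))
              (title (match-string-no-properties 2)))
          ;; Drop stack entries that are not ancestors of this heading.
          (while (and stack (>= (caar stack) level))
            (pop stack))
          (push (cons level title) stack)
          ;; The stack is innermost-first, so reverse it for the path.
          (push (reverse (mapcar #'cdr stack)) paths)))
      (nreverse paths))))
```

For a buffer containing "* A", "** B", "*** C", "** D" and "* E" on
separate lines, this returns
(("A") ("A" "B") ("A" "B" "C") ("A" "D") ("E")).
Only one regexp is used inside the loop, so a sketch like this should
also be less sensitive to the regexp cache issue mentioned above.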