To: emacs-orgmode@gnu.org
From: Max Nikulin
Subject: Re: test-org-table/sort-lines: Failing test on macOS
Date: Tue, 22 Nov 2022 23:01:26 +0700
In-Reply-To: <87leo3dc42.fsf@localhost>

On 22/11/2022 08:14, Ihor Radchenko wrote:
> Max Nikulin writes:
>
>>> 2. `org-sort-list'
>>> 5. `org-sort-entries'
>> `downcase' is used, not proper case folding, so a potential issue
>
> `downcase' is used to determine user input about sorting type.
> Not for sorting itself.

See the `case-func' variable. Its initialization depends on the
IGNORE-CASE argument: strings to sort are passed through either
`identity' or `downcase'.

>>> 4. `org-set-tags' (tag order), when `org-tags-sort-function' is set to
>>> "Alphabetical" or "Reverse alphabetical".
>>
>> IGNORE-CASE argument is not used, perhaps `downcase' is hidden in the code.
>
> I feel like we are slightly miscommunicating here.
> I mostly tried to list the uses of libc-sensitive sorting. Not
> specifically cases when we try to ignore the case.
>
> The problem is not limited to case-sensitive comparisons. Some systems
> may fail to implement specific locales and thus sorting may downgrade to
> simple string-lessp.

When case folding is not involved, I consider `string-lessp' a graceful
degradation. Although locale rules are not applied, strings are mostly
sorted. Exceptions exist, but the order is usually reasonable. Completely
disregarding the IGNORE-CASE argument of `string-collate-lessp' on macOS
(which is not a heavily stripped embedded OS) is a bad surprise for me.

>>> 6. Agenda sorting, when alphabetical sorting is involved
>>
>> `string-lessp' and `downcase' so even more severe locale-related issues
>> might be expected.
>
> Could you please elaborate?

I admit that `downcase' may be an acceptable workaround, since
`string-collate-lessp' may not honor IGNORE-CASE, but I believe that,
when available, `string-collate-lessp' should be the preferred option
for sorting.

>> Achieving consistency across Org code requires additional efforts.
>
> Well. Just using `string-lessp' would make things very consistent.
> Easily and with no efforts.

In the hope that clang will get better Unicode support, I would move in
the opposite direction, namely toward wider usage of
`string-collate-lessp'. Just using `string-lessp' means no
case-insensitive sorting even where it is available now.

I have an idea for a compatibility wrapper around `string-collate-lessp'
with special treatment of case-insensitive comparison on a bad libc
implementation: apply `downcase' before passing the arguments to
`string-lessp'. It should provide consistency, the best user experience
when locales work properly, and graceful degradation otherwise. I hope
it is acceptable for Org, even though such a trick is undesirable for
Emacs itself for performance reasons.

However, I am wary of compatibility shims after

d3a9c424b 2022-08-16 17:15:27 +0800 Ihor Radchenko: org-encode-time:
Refactor into top-level `defmacro'

P.S. I am not motivated enough to build Emacs on Linux using clang to
check whether locale information will be available. I am almost sure
that some locale information is available on macOS, e.g. at least
strcasecmp, even if full CLDR cannot be easily accessed from C. I do not
have a Mac to check the state of affairs. For Objective-C there is e.g.
caseInsensitiveCompare:.

I do not like that Emacs relies on libc for locale support (and for time
zones as well). It becomes a problem as soon as more than one locale has
to be used simultaneously. I agree that there are enough complications:
sometimes the locale depends on the document (e.g. #+LANGUAGE:), and
sometimes a specific locale is even restricted to a part of a document.
It is tricky to handle such cases, but the current limitations are too
strict (the defective `string-collate-lessp' on macOS is an example).
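
To make the compatibility wrapper idea above more concrete, here is a
minimal sketch, assuming Emacs 25.1 or later for `string-collate-lessp'.
The names `my-org--collate-ignore-case-works-p' and
`my-org-string-collate-lessp' are hypothetical, they are not existing Org
or Emacs functions. The probe that compares "a" with "B" is only my
assumption about how an implementation that disregards IGNORE-CASE could
be detected.

```elisp
;; Hypothetical names; nothing below is part of Org or Emacs.

(defvar my-org--collate-ignore-case-works-p
  ;; Assumption: when IGNORE-CASE is honored, "a" sorts before "B".
  ;; With plain code point comparison "B" (66) precedes "a" (97),
  ;; so the probe evaluates to nil on such systems.
  (and (fboundp 'string-collate-lessp)
       (string-collate-lessp "a" "B" nil t))
  "Non-nil when `string-collate-lessp' appears to honor IGNORE-CASE.")

(defun my-org-string-collate-lessp (s1 s2 &optional locale ignore-case)
  "Return non-nil if S1 sorts before S2.
Use `string-collate-lessp' with LOCALE and IGNORE-CASE when the probe
suggests that case folding works.  Otherwise degrade gracefully: for
case-insensitive comparison apply `downcase' to both strings and
compare them with `string-lessp'."
  (cond
   ;; Collation seems to work properly: delegate completely.
   (my-org--collate-ignore-case-works-p
    (string-collate-lessp s1 s2 locale ignore-case))
   ;; Degraded mode, case-insensitive: fold case ourselves.
   (ignore-case
    (string-lessp (downcase s1) (downcase s2)))
   ;; Degraded mode, case-sensitive: still try locale collation.
   ((fboundp 'string-collate-lessp)
    (string-collate-lessp s1 s2 locale))
   ;; No collation support at all.
   (t (string-lessp s1 s2))))

;; Usage: case-insensitive sort of a fresh list.
(sort (list "b" "A" "c")
      (lambda (x y) (my-org-string-collate-lessp x y nil t)))
;; => ("A" "b" "c")
```

The probe is evaluated once at load time, so the per-comparison cost in
the working case is a single variable lookup. Whether a one-time probe is
reliable enough (the locale may change during a session) is an open
question for this sketch.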