emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
* [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
@ 2022-07-20 19:12 Janek F
  2022-07-23  5:22 ` [PATCH] " Ihor Radchenko
  2022-08-12 15:35 ` Max Nikulin
  0 siblings, 2 replies; 13+ messages in thread
From: Janek F @ 2022-07-20 19:12 UTC (permalink / raw)
  To: emacs-orgmode

When setting org-id-method to 'ts or 'org,
org-attach seems to use org-attach-id-ts-folder-format
to create its hierarchy.

However I tend to customize IDs for important files by hand,
causing any attempt to use org-attach on that file to fail
if the ID is shorter than six characters:

    org-attach-id-ts-folder-format: Args out of range: "ftt", 0, 6

This method should be adjusted to handle non-ts-ids just as well,
as org-id-method does not dictate the format of existing ids.

Regards,
Janek


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
  2022-07-20 19:12 [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)] Janek F
@ 2022-07-23  5:22 ` Ihor Radchenko
  2022-08-02 22:26   ` Janek F
  2022-08-03 16:03   ` Max Nikulin
  2022-08-12 15:35 ` Max Nikulin
  1 sibling, 2 replies; 13+ messages in thread
From: Ihor Radchenko @ 2022-07-23  5:22 UTC (permalink / raw)
  To: Janek F; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 874 bytes --]

Janek F <xerusx@pm.me> writes:

> When setting org-id-method to 'ts or 'org,
> org-attach seems to use org-attach-id-ts-folder-format
> to create its hierarchy.
>
> However I tend to customize IDs for important files by hand,
> causing any attempt to use org-attach on that file to fail
> if the ID is shorter than six characters:
>
>     org-attach-id-ts-folder-format: Args out of range: "ftt", 0, 6
>
> This method should be adjusted to handle non-ts-ids just as well,
> as org-id-method does not dictate the format of existing ids.

Thanks for reporting!
Tentative patch is attached.

I have added some fallbacks for the default attach folder formatters, so
that they can work when the ID does not conform to specific format.

This change is somewhat opinionated, so feel free to suggest alternative
solutions/fallback options.

Best,
Ihor


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-org-attach-dir-from-id-Do-not-rely-on-ID-being-over-.patch --]
[-- Type: text/x-patch, Size: 2341 bytes --]

From e004752ba39f0d328645a1f6053ad3ce3b06ac28 Mon Sep 17 00:00:00 2001
Message-Id: <e004752ba39f0d328645a1f6053ad3ce3b06ac28.1658553404.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Sat, 23 Jul 2022 13:13:24 +0800
Subject: [PATCH] org-attach-dir-from-id: Do not rely on ID being over 6 chars
 long

* lisp/org-attach.el (org-attach-id-uuid-folder-format): Fall back to
using ID md5 hash when the ID contains 2 chars or less and cannot be
split into the "xy/z...." path.
(org-attach-id-ts-folder-format): Fall back to "unknown/ID" path
format when the ID contains less than 7 chars and cannot be split into
the "YYYYMM/rest" path.

Fixes https://orgmode.org/list/KC8PcypJapBpJQtJxM0kX5N7Z0THL2Lq6EQjBMzpw1-vgQf72egZ2JOIlTbPYiqAVD4MdSBhrhBZr2Ykf5DN1mocm1ANvvuKKZShlkgzKYM=@pm.me
---
 lisp/org-attach.el | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/lisp/org-attach.el b/lisp/org-attach.el
index 36c21b702..7c72fd7ee 100644
--- a/lisp/org-attach.el
+++ b/lisp/org-attach.el
@@ -159,19 +159,28 @@ (defcustom org-attach-archive-delete nil
 (defun org-attach-id-uuid-folder-format (id)
   "Translate an UUID ID into a folder-path.
 Default format for how Org translates ID properties to a path for
-attachments.  Useful if ID is generated with UUID."
-  (format "%s/%s"
-	  (substring id 0 2)
-	  (substring id 2)))
+attachments.  Useful if ID is generated with UUID.
+
+When ID is too short (less than 3 chars), use its md5 hash to create
+the path."
+  (if (< (length id) 3)
+      (org-attach-id-uuid-folder-format (md5 id))
+    (format "%s/%s"
+	    (substring id 0 2)
+	    (substring id 2))))
 
 (defun org-attach-id-ts-folder-format (id)
   "Translate an ID based on a timestamp to a folder-path.
 Useful way of translation if ID is generated based on ISO8601
 timestamp.  Splits the attachment folder hierarchy into
-year-month, the rest."
-  (format "%s/%s"
-	  (substring id 0 6)
-	  (substring id 6)))
+year-month, the rest.
+
+When ID is too short (less than 7 chars), return \"unknown/ID\"."
+  (if (< (length id) 7)
+      (format "unknown/%s" id)
+    (format "%s/%s"
+	    (substring id 0 6)
+	    (substring id 6))))
 
 (defcustom org-attach-id-to-path-function-list '(org-attach-id-uuid-folder-format
 						 org-attach-id-ts-folder-format)
-- 
2.35.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
  2022-07-23  5:22 ` [PATCH] " Ihor Radchenko
@ 2022-08-02 22:26   ` Janek F
  2022-08-03 16:03   ` Max Nikulin
  1 sibling, 0 replies; 13+ messages in thread
From: Janek F @ 2022-08-02 22:26 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

Just now had a closer look at your patch,
and I don't like the hashing.
My whole idea customizing ids is to make the file structure useful by itself, too.
This is how I patched it for me:

  (defun xf/org-attach-id-folder-format (id)
    "Translate any ID into a folder-path."
    (format "%s/%s"
            (substring id 0 2)
            (if (> (seq-length id) 2) (substring id 2) id))
    )
  (setq org-attach-id-to-path-function-list '(xf/org-attach-id-folder-format))


------- Original Message -------
Ihor Radchenko <yantar92@gmail.com> schrieb am Samstag, 23. Juli 2022 um 07:21:


> Janek F xerusx@pm.me writes:
>
> > When setting org-id-method to 'ts or 'org,
> > org-attach seems to use org-attach-id-ts-folder-format
> > to create its hierarchy.
> >
> > However I tend to customize IDs for important files by hand,
> > causing any attempt to use org-attach on that file to fail
> > if the ID is shorter than six characters:
> >
> > org-attach-id-ts-folder-format: Args out of range: "ftt", 0, 6
> >
> > This method should be adjusted to handle non-ts-ids just as well,
> > as org-id-method does not dictate the format of existing ids.
>
>
> Thanks for reporting!
> Tentative patch is attached.
>
> I have added some fallbacks for the default attach folder formatters, so
> that they can work when the ID does not conform to specific format.
>
> This change is somewhat opinionated, so feel free to suggest alternative
> solutions/fallback options.
>
> Best,
> Ihor


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
  2022-07-23  5:22 ` [PATCH] " Ihor Radchenko
  2022-08-02 22:26   ` Janek F
@ 2022-08-03 16:03   ` Max Nikulin
  2022-08-03 22:25     ` Ihor Radchenko
  1 sibling, 1 reply; 13+ messages in thread
From: Max Nikulin @ 2022-08-03 16:03 UTC (permalink / raw)
  To: emacs-orgmode

On 23/07/2022 12:22, Ihor Radchenko wrote:
> Tentative patch is attached.

> +  (if (< (length id) 3)
> +      (org-attach-id-uuid-folder-format (md5 id))

Ihor, I am afraid of collisions due to short input to md5. Long hash 
value gives false impression of high entropy but actually there are not 
so many variants for 1 or 2 characters strings. Either there is no point 
in making long name from short id or random string of appropriate length 
should be used instead.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
  2022-08-03 16:03   ` Max Nikulin
@ 2022-08-03 22:25     ` Ihor Radchenko
  2022-08-10 11:43       ` [PATCH v2] " Ihor Radchenko
  0 siblings, 1 reply; 13+ messages in thread
From: Ihor Radchenko @ 2022-08-03 22:25 UTC (permalink / raw)
  To: Max Nikulin; +Cc: emacs-orgmode

Max Nikulin <manikulin@gmail.com> writes:

> On 23/07/2022 12:22, Ihor Radchenko wrote:
>> Tentative patch is attached.
>
>> +  (if (< (length id) 3)
>> +      (org-attach-id-uuid-folder-format (md5 id))
>
> Ihor, I am afraid of collisions due to short input to md5. Long hash 
> value gives false impression of high entropy but actually there are not 
> so many variants for 1 or 2 characters strings. Either there is no point 
> in making long name from short id or random string of appropriate length 
> should be used instead.

Random is not an option. If we use random string, how will we get the
right attachment path on a second call? This function must return the
same value given the same input.

Having said that, I do agree (taking into account the other comment)
that md5 may not be a very good idea. Maybe simply something like
"__/id". Ideally, we should be able to recover the original id from the
path to attachment.

Best,
Ihor


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
  2022-08-03 22:25     ` Ihor Radchenko
@ 2022-08-10 11:43       ` Ihor Radchenko
  2022-08-10 12:17         ` Max Nikulin
  0 siblings, 1 reply; 13+ messages in thread
From: Ihor Radchenko @ 2022-08-10 11:43 UTC (permalink / raw)
  To: Max Nikulin; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 873 bytes --]

Ihor Radchenko <yantar92@gmail.com> writes:

>> Ihor, I am afraid of collisions due to short input to md5. Long hash 
>> value gives false impression of high entropy but actually there are not 
>> so many variants for 1 or 2 characters strings. Either there is no point 
>> in making long name from short id or random string of appropriate length 
>> should be used instead.
>
> Random is not an option. If we use random string, how will we get the
> right attachment path on a second call? This function must return the
> same value given the same input.
>
> Having said that, I do agree (taking into account the other comment)
> that md5 may not be a very good idea. Maybe simply something like
> "__/id". Ideally, we should be able to recover the original id from the
> path to attachment.

I have updated the patch to use "__/id" when id is too short.
Any objections?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: v2-0001-org-attach-dir-from-id-Do-not-rely-on-ID-being-ov.patch --]
[-- Type: text/x-patch, Size: 2310 bytes --]

From 7ca5fe66c53a1399943a2e2d4a1675ca8cd14194 Mon Sep 17 00:00:00 2001
Message-Id: <7ca5fe66c53a1399943a2e2d4a1675ca8cd14194.1660131747.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Sat, 23 Jul 2022 13:13:24 +0800
Subject: [PATCH v2] org-attach-dir-from-id: Do not rely on ID being over 6
 chars long

* lisp/org-attach.el (org-attach-id-uuid-folder-format): Fall back to
"--/ID" when the ID contains 2 chars or less and cannot be split into
the "xy/z...." path.
(org-attach-id-ts-folder-format): Fall back to "unknown/ID" path
format when the ID contains less than 7 chars and cannot be split into
the "YYYYMM/rest" path.

Fixes https://orgmode.org/list/KC8PcypJapBpJQtJxM0kX5N7Z0THL2Lq6EQjBMzpw1-vgQf72egZ2JOIlTbPYiqAVD4MdSBhrhBZr2Ykf5DN1mocm1ANvvuKKZShlkgzKYM=@pm.me
---
 lisp/org-attach.el | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/lisp/org-attach.el b/lisp/org-attach.el
index fe49af6f3..bb7c787f7 100644
--- a/lisp/org-attach.el
+++ b/lisp/org-attach.el
@@ -159,19 +159,28 @@ (defcustom org-attach-archive-delete nil
 (defun org-attach-id-uuid-folder-format (id)
   "Translate an UUID ID into a folder-path.
 Default format for how Org translates ID properties to a path for
-attachments.  Useful if ID is generated with UUID."
-  (format "%s/%s"
-	  (substring id 0 2)
-	  (substring id 2)))
+attachments.  Useful if ID is generated with UUID.
+
+When ID is too short (less than 3 chars), use its md5 hash to create
+the path."
+  (if (< (length id) 3)
+      (format "--/%s" id)
+    (format "%s/%s"
+	    (substring id 0 2)
+	    (substring id 2))))
 
 (defun org-attach-id-ts-folder-format (id)
   "Translate an ID based on a timestamp to a folder-path.
 Useful way of translation if ID is generated based on ISO8601
 timestamp.  Splits the attachment folder hierarchy into
-year-month, the rest."
-  (format "%s/%s"
-	  (substring id 0 6)
-	  (substring id 6)))
+year-month, the rest.
+
+When ID is too short (less than 7 chars), return \"unknown/ID\"."
+  (if (< (length id) 7)
+      (format "unknown/%s" id)
+    (format "%s/%s"
+	    (substring id 0 6)
+	    (substring id 6))))
 
 (defcustom org-attach-id-to-path-function-list '(org-attach-id-uuid-folder-format
 						 org-attach-id-ts-folder-format)
-- 
2.35.1


[-- Attachment #3: Type: text/plain, Size: 207 bytes --]



-- 
Ihor Radchenko,
Org mode contributor,
Learn more about Org mode at https://orgmode.org/.
Support Org development at https://liberapay.com/org-mode,
or support my work at https://liberapay.com/yantar92

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
  2022-08-10 11:43       ` [PATCH v2] " Ihor Radchenko
@ 2022-08-10 12:17         ` Max Nikulin
  2022-08-10 13:23           ` [PATCH v3] " Ihor Radchenko
  0 siblings, 1 reply; 13+ messages in thread
From: Max Nikulin @ 2022-08-10 12:17 UTC (permalink / raw)
  To: emacs-orgmode

On 10/08/2022 18:43, Ihor Radchenko wrote:
> Ihor Radchenko writes:
> 
> I have updated the patch to use "__/id" when id is too short.
> Any objections?
> 
> +When ID is too short (less than 3 chars), use its md5 hash to create

Misleading docs, you do not use md5.

> +the path."
> +  (if (< (length id) 3)
> +      (format "--/%s" id)

Please, do not use path components starting with dash, it is terrible 
for CLI tools. By the way, you promised underscores, not dashes.

> +    (format "%s/%s"
> +	    (substring id 0 2)
> +	    (substring id 2))))

Ihor, I have not look into the code around, so my suggestions may have 
no sense.

Is it possible to pass empty string as ID to these functions? Should it 
be explicitly checked?

What if ID contains "/" that can not be used in file name? Windows has 
more forbidden characters.

Do you expect any problem if here (and for timestamp-based ids) 
directory component is just padded with some character (unsure what can 
better than underscore) to required length?

"x" -> "_x/x" or "______x/x"
"xy" -> "xy/xy" or "_____xy/xy"
etc.

 From my point of view it might be a bit better than "__" and "unknown".



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
  2022-08-10 12:17         ` Max Nikulin
@ 2022-08-10 13:23           ` Ihor Radchenko
  2022-08-10 15:18             ` Max Nikulin
  0 siblings, 1 reply; 13+ messages in thread
From: Ihor Radchenko @ 2022-08-10 13:23 UTC (permalink / raw)
  To: Max Nikulin; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 2860 bytes --]

Max Nikulin <manikulin@gmail.com> writes:

> On 10/08/2022 18:43, Ihor Radchenko wrote:
>> Ihor Radchenko writes:
>> 
>> I have updated the patch to use "__/id" when id is too short.
>> Any objections?
>> 
>> +When ID is too short (less than 3 chars), use its md5 hash to create
>
> Misleading docs, you do not use md5.

Oops. Will fix.

>> +the path."
>> +  (if (< (length id) 3)
>> +      (format "--/%s" id)
>
> Please, do not use path components starting with dash, it is terrible 
> for CLI tools. By the way, you promised underscores, not dashes.

Why?

I have no opinion about the possible dummy folder name, except that it
should fit the general pattern we already have "xx/..." or "YYYYMM/...".

>> +    (format "%s/%s"
>> +	    (substring id 0 2)
>> +	    (substring id 2))))
>
> Ihor, I have not look into the code around, so my suggestions may have 
> no sense.
>
> Is it possible to pass empty string as ID to these functions? Should it 
> be explicitly checked?

It should not be possible under normal operation.

> What if ID contains "/" that can not be used in file name? Windows has 
> more forbidden characters.

Then, creating the attachment dir will fail in front of the user. I am
not sure if we really need to do much about this. At least, not until we
get a bug report.

If we really need to solve the edge cases with attach dir generation, we
may need to change the overall design in org-attach/org-id to something
more restrictive. I am not sure if it is worth the effort compared to
other directions where to improve Org.

> Do you expect any problem if here (and for timestamp-based ids) 
> directory component is just padded with some character (unsure what can 
> better than underscore) to required length?
>
> "x" -> "_x/x" or "______x/x"
> "xy" -> "xy/xy" or "_____xy/xy"
> etc.
>
>  From my point of view it might be a bit better than "__" and "unknown".

Does it make sense?
Timestamp-based IDs are like 20220810T210121.478800 by default.
Then, the top-level folders will be like 202208/...
If the actual format is different, we cannot possibly expect the
timestamp to follow the same pattern, which is why I chose "unknown"
referring to unknown date.

On the other hand, if an ID has more than 6 characters, it will generate
nonsense top-level folder with or without the patch. We could go further
and match the ID against [1-9][0-9]\{5\} and put the folder into
"unknown/ID" instead, but that would be a breaking change.

In summary, I am more of less neutral towards this fallback format,
except that it would be nice to be able to recover the ID from the
directory path. Padding should be recoverable.

I slightly dislike the "___xx" compared to "______" because it will
create a proliferation of top-level folders as opposed to cramping the
non-standard IDs into a single "______" folder.

Attaching the updated patch.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: v3-0001-org-attach-dir-from-id-Do-not-rely-on-ID-being-ov.patch --]
[-- Type: text/x-patch, Size: 2286 bytes --]

From 50d0f9de0acdf5d67b797476816cbeb40b19f554 Mon Sep 17 00:00:00 2001
Message-Id: <50d0f9de0acdf5d67b797476816cbeb40b19f554.1660137585.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Sat, 23 Jul 2022 13:13:24 +0800
Subject: [PATCH v3] org-attach-dir-from-id: Do not rely on ID being over 6
 chars long

* lisp/org-attach.el (org-attach-id-uuid-folder-format): Fall back to
"__/ID" when the ID contains 2 chars or less and cannot be split into
the "xy/z...." path.
(org-attach-id-ts-folder-format): Fall back to "______/ID" path format
when the ID contains less than 7 chars and cannot be split into the
"YYYYMM/rest" path.

Fixes https://orgmode.org/list/KC8PcypJapBpJQtJxM0kX5N7Z0THL2Lq6EQjBMzpw1-vgQf72egZ2JOIlTbPYiqAVD4MdSBhrhBZr2Ykf5DN1mocm1ANvvuKKZShlkgzKYM=@pm.me
---
 lisp/org-attach.el | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/lisp/org-attach.el b/lisp/org-attach.el
index fe49af6f3..0f5d5af82 100644
--- a/lisp/org-attach.el
+++ b/lisp/org-attach.el
@@ -159,19 +159,27 @@ (defcustom org-attach-archive-delete nil
 (defun org-attach-id-uuid-folder-format (id)
   "Translate an UUID ID into a folder-path.
 Default format for how Org translates ID properties to a path for
-attachments.  Useful if ID is generated with UUID."
-  (format "%s/%s"
-	  (substring id 0 2)
-	  (substring id 2)))
+attachments.  Useful if ID is generated with UUID.
+
+When ID is too short (less than 3 chars), return \"__/ID\"."
+  (if (< (length id) 3)
+      (format "__/%s" id)
+    (format "%s/%s"
+	    (substring id 0 2)
+	    (substring id 2))))
 
 (defun org-attach-id-ts-folder-format (id)
   "Translate an ID based on a timestamp to a folder-path.
 Useful way of translation if ID is generated based on ISO8601
 timestamp.  Splits the attachment folder hierarchy into
-year-month, the rest."
-  (format "%s/%s"
-	  (substring id 0 6)
-	  (substring id 6)))
+year-month, the rest.
+
+When ID is too short (less than 7 chars), return \"______/ID\"."
+  (if (< (length id) 7)
+      (format "______/%s" id)
+    (format "%s/%s"
+	    (substring id 0 6)
+	    (substring id 6))))
 
 (defcustom org-attach-id-to-path-function-list '(org-attach-id-uuid-folder-format
 						 org-attach-id-ts-folder-format)
-- 
2.35.1


[-- Attachment #3: Type: text/plain, Size: 206 bytes --]


-- 
Ihor Radchenko,
Org mode contributor,
Learn more about Org mode at https://orgmode.org/.
Support Org development at https://liberapay.com/org-mode,
or support my work at https://liberapay.com/yantar92

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
  2022-08-10 13:23           ` [PATCH v3] " Ihor Radchenko
@ 2022-08-10 15:18             ` Max Nikulin
  2022-08-11  4:19               ` Ihor Radchenko
  0 siblings, 1 reply; 13+ messages in thread
From: Max Nikulin @ 2022-08-10 15:18 UTC (permalink / raw)
  To: emacs-orgmode

On 10/08/2022 20:23, Ihor Radchenko wrote:
> Max Nikulin writes:
> 
>>> +the path."
>>> +  (if (< (length id) 3)
>>> +      (format "--/%s" id)
>>
>> Please, do not use path components starting with dash, it is terrible
>> for CLI tools. By the way, you promised underscores, not dashes.
> 
> Why?

It is inconvenient in interactive sessions (in scripts appropriate 
measures must be taken anyway), the following does not list content of 
the "--" directory:

     ls -l --

> I slightly dislike the "___xx" compared to "______" because it will
> create a proliferation of top-level folders as opposed to cramping the
> non-standard IDs into a single "______" folder.

I believed that proliferation of folders is for purpose. Intermediate 
directories allows to avoid excessive number of files in single 
directory. ext4 with directory tree index usually is not the case, but 
other filesystems may have rather poor performance when too much files 
are stuffed into single folder. Some applications become really slow for 
huge directories.

On the other hand there are not so many distinct file names that may 
fall into "__" directory, so perhaps it is OK. Concerning "unknown" I am 
less sure.

I do not have strong opinion how to properly deal with short IDs.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
  2022-08-10 15:18             ` Max Nikulin
@ 2022-08-11  4:19               ` Ihor Radchenko
  2022-08-11 14:43                 ` Max Nikulin
  0 siblings, 1 reply; 13+ messages in thread
From: Ihor Radchenko @ 2022-08-11  4:19 UTC (permalink / raw)
  To: Max Nikulin; +Cc: emacs-orgmode

Max Nikulin <manikulin@gmail.com> writes:

>>> Please, do not use path components starting with dash, it is terrible
>>> for CLI tools. By the way, you promised underscores, not dashes.
>> 
>> Why?
>
> It is inconvenient in interactive sessions (in scripts appropriate 
> measures must be taken anyway), the following does not list content of 
> the "--" directory:
>
>      ls -l --

Thanks for the clarification.

>> I slightly dislike the "___xx" compared to "______" because it will
>> create a proliferation of top-level folders as opposed to cramping the
>> non-standard IDs into a single "______" folder.
>
> I believed that proliferation of folders is for purpose. Intermediate 
> directories allows to avoid excessive number of files in single 
> directory. ext4 with directory tree index usually is not the case, but 
> other filesystems may have rather poor performance when too much files 
> are stuffed into single folder. Some applications become really slow for 
> huge directories.

I was referring to TS style timestamp resolver here. It is designed to
group directories by creation time, not to distribute them homogeneously.

> On the other hand there are not so many distinct file names that may 
> fall into "__" directory, so perhaps it is OK.

Yes. Such short IDs are not supposed to be common and even if they are,
the maximum possible number of such IDs is relatively small. 


-- 
Ihor Radchenko,
Org mode contributor,
Learn more about Org mode at https://orgmode.org/.
Support Org development at https://liberapay.com/org-mode,
or support my work at https://liberapay.com/yantar92


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
  2022-08-11  4:19               ` Ihor Radchenko
@ 2022-08-11 14:43                 ` Max Nikulin
  0 siblings, 0 replies; 13+ messages in thread
From: Max Nikulin @ 2022-08-11 14:43 UTC (permalink / raw)
  To: emacs-orgmode; +Cc: Janek F

On 11/08/2022 11:19, Ihor Radchenko wrote:
> Max Nikulin writes:
> 
>>> I slightly dislike the "___xx" compared to "______" because it will
>>> create a proliferation of top-level folders as opposed to cramping the
>>> non-standard IDs into a single "______" folder.
>>
>> I believed that proliferation of folders is for purpose. Intermediate
>> directories allows to avoid excessive number of files in single
>> directory. ext4 with directory tree index usually is not the case, but
>> other filesystems may have rather poor performance when too much files
>> are stuffed into single folder. Some applications become really slow for
>> huge directories.
> 
> I was referring to TS style timestamp resolver here. It is designed to
> group directories by creation time, not to distribute them homogeneously.

My bad, I have realizes that my idea of mapping

     "x" -> "______x/x"
     "xy" -> "_____xy/xy"

was a really bad one. It leads to a separate directory for each short 
ID. However I still believe that the purpose of 
`org-attach-id-ts-folder-format' is avoid concentration of huge number 
of file in a single directory. Distribution of attachments over 
subdirectories is not perfectly even but it still works reasonably well 
for long-lasting projects while IDs follow assumed format and month is 
included into directory names.

Back to the original problem. First of all, I believe that if a user 
decided to use non-standard IDs then it is their responsibility to 
customize `org-attach-id-to-path-function-list' and to provide a 
function that implements alternative layout of attachment files. 
Depending on expected amount, they may be put into single directory, 
spread using some hash function, or accordingly to project structure. So 
today I am against default fallback directories especially in 
`org-attach-id-ts-folder-format'.

I believe that interaction between `org-attach-dir-from-id' and members 
of `org-attach-id-to-path-function-list' could be improved. If a too 
short ID is passed to `org-attach-id-uuid-folder-format', 
`org-attach-id-ts-folder-format', or a user-defined function, they may 
return nil and `org-attach-dir-from-id' should try next element from the 
list. If nothing succeeds then a user error should be signaled demanding 
to implement a strategy to properly deal with peculiar IDs.

There should be no problem to put single character IDs into a common 
directory (UUID case), but there are enough variants for 5 characters 
long names to create delayed performance problem. The worst case is when 
a user decides to use some common prefix, e.g. "id-" before UUID and all 
files go to the same directory. On the other hand I do not think that 
code should detect such cases.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
  2022-07-20 19:12 [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)] Janek F
  2022-07-23  5:22 ` [PATCH] " Ihor Radchenko
@ 2022-08-12 15:35 ` Max Nikulin
  2022-08-12 16:08   ` Janek F
  1 sibling, 1 reply; 13+ messages in thread
From: Max Nikulin @ 2022-08-12 15:35 UTC (permalink / raw)
  To: Janek F, emacs-orgmode

On 21/07/2022 02:12, Janek F wrote:
> 
> However I tend to customize IDs for important files by hand,
> causing any attempt to use org-attach on that file to fail
> if the ID is shorter than six characters:
> 
>      org-attach-id-ts-folder-format: Args out of range: "ftt", 0, 6
> 
> This method should be adjusted to handle non-ts-ids just as well,
> as org-id-method does not dictate the format of existing ids.

There are should be no such restriction for CUSTOM_ID links (referenced 
using #anchor). IDs used for attachments assume particular format and 
consistent layout of files. Janek, could you, please, provide some 
details concerning your use case. I am interested why you decided to 
avoid CUSTOM_ID for headings important to you.

As I wrote yesterday, you can customize Org to use directories for 
attachments you like.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
  2022-08-12 15:35 ` Max Nikulin
@ 2022-08-12 16:08   ` Janek F
  0 siblings, 0 replies; 13+ messages in thread
From: Janek F @ 2022-08-12 16:08 UTC (permalink / raw)
  To: Max Nikulin; +Cc: emacs-orgmode

I use org-roam which creates links between org-files using the ID, and I would like to be able to manually add such links to common pages even if autocompletion doesn't work smoothly.

Considering the other messages, let's not get lost in details, the default function does not need to consider all edge cases. It simply should not error out on ids in a different format.

------- Original Message -------
Max Nikulin <manikulin@gmail.com> schrieb am Freitag, 12. August 2022 um 17:35:


> On 21/07/2022 02:12, Janek F wrote:
>
> > However I tend to customize IDs for important files by hand,
> > causing any attempt to use org-attach on that file to fail
> > if the ID is shorter than six characters:
> >
> > org-attach-id-ts-folder-format: Args out of range: "ftt", 0, 6
> >
> > This method should be adjusted to handle non-ts-ids just as well,
> > as org-id-method does not dictate the format of existing ids.
>
>
> There are should be no such restriction for CUSTOM_ID links (referenced
> using #anchor). IDs used for attachments assume particular format and
> consistent layout of files. Janek, could you, please, provide some
> details concerning your use case. I am interested why you decided to
> avoid CUSTOM_ID for headings important to you.
>
> As I wrote yesterday, you can customize Org to use directories for
> attachments you like.


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-08-12 16:29 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-20 19:12 [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)] Janek F
2022-07-23  5:22 ` [PATCH] " Ihor Radchenko
2022-08-02 22:26   ` Janek F
2022-08-03 16:03   ` Max Nikulin
2022-08-03 22:25     ` Ihor Radchenko
2022-08-10 11:43       ` [PATCH v2] " Ihor Radchenko
2022-08-10 12:17         ` Max Nikulin
2022-08-10 13:23           ` [PATCH v3] " Ihor Radchenko
2022-08-10 15:18             ` Max Nikulin
2022-08-11  4:19               ` Ihor Radchenko
2022-08-11 14:43                 ` Max Nikulin
2022-08-12 15:35 ` Max Nikulin
2022-08-12 16:08   ` Janek F

Code repositories for project(s) associated with this inbox:

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).