emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
* [PATCH] Accept more :tangle-mode specification forms
@ 2021-09-30 18:14 Timothy
  2021-10-01  1:24 ` Tom Gillespie
  2021-10-05 14:45 ` Timothy
  0 siblings, 2 replies; 33+ messages in thread
From: Timothy @ 2021-09-30 18:14 UTC (permalink / raw)
  To: Org Mode List

[-- Attachment #1: Type: text/plain, Size: 1191 bytes --]

Hello,

Currently, the only way to set a file mode when tangling seems to be 
:tangle-mode (identity #o755)

In a [prior thread], Jeremy proposed that :tangle-mode should convert octal
number strings to the required decimal form. I think we should go further, and
so have prepared the attached patch based on a snippet I shared in the thread.

To quote the docstring of the new function I’m introducing, this patch now
accepts the following :tangle-mode forms:
• an integer (returned without modification)
• “#o755” (elisp-style octal)
• “0755” (c style octal)
• “755” (chmod style octal)
• “rwxrw-r–” (ls style specification)
• “a=rw,u+x” (chmod style)
• “rwx” (interpreted as “u=rwx”)

Why be so permissive? I’d refer you to my reasoning in the prior thread:

I think there are a few arguably “sensible” formats that a user could reasonably
assume, and if we can support most of them without introducing ambiguity in
parsing or interpretation (and I think we can), can’t we make everyone happy?

All the best,
Timothy


[prior thread] <https://list.orgmode.org/20210928145448.245883-1-jeremy@cowgar.com/>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-ob-tangle-Accept-more-tangle-mode-forms.patch --]
[-- Type: text/x-patch, Size: 3691 bytes --]

From 5087de0d70151c33d66eb13dda84d78a361d7053 Mon Sep 17 00:00:00 2001
From: TEC <tec@tecosaur.com>
Date: Fri, 1 Oct 2021 02:02:22 +0800
Subject: [PATCH] ob-tangle: Accept more :tangle-mode forms

* lisp/ob-tangle.el (org-babel-tangle): Accept many more forms for
:tangle-mode, including octal strings (#o755, 0755, 755), ls forms (rwx,
rw-r--r--), and chmod forms (a=rw,u+x).  The interpretation of the input
is now handled by the new function `org-babel-interpret-file-mode' which
references the new variable `org-babel-tangle-default-mode' when
considering relative mode forms.
---
 lisp/ob-tangle.el | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index 2dd1d031c..28a235429 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -140,6 +140,14 @@ (defcustom org-babel-process-comment-text 'org-remove-indentation
   :version "24.1"
   :type 'function)
 
+(defcustom org-babel-tangle-default-mode #o544
+  "The default mode used for tangled files, as an integer.
+The default value 356 correspands to the octal #o544, which is
+read-write permissions for the user, read-only for everyone else."
+  :group 'org-babel
+  :version "9.6"
+  :type 'integer)
+
 (defun org-babel-find-file-noselect-refresh (file)
   "Find file ensuring that the latest changes on disk are
 represented in the file."
@@ -255,7 +263,7 @@ (defun org-babel-tangle (&optional arg target-file lang-re)
 		        (when she-bang
 			  (unless tangle-mode (setq tangle-mode #o755)))
 		        (when tangle-mode
-			  (add-to-list 'modes tangle-mode))
+			  (add-to-list 'modes (org-babel-interpret-file-mode tangle-mode)))
 		        ;; Possibly create the parent directories for file.
 		        (let ((m (funcall get-spec :mkdirp)))
 			  (and m fnd (not (string= m "no"))
@@ -298,6 +306,38 @@ (defun org-babel-tangle (&optional arg target-file lang-re)
 	   path-collector))
 	path-collector))))
 
+(defun org-babel-interpret-file-mode (mode)
+  "Determine the integer representation of a file MODE specification.
+The following forms are currently recognised:
+- an integer (returned without modification)
+- \"#o755\" (elisp-style octal)
+- \"0755\" (c style octal)
+- \"755\" (chmod style octal)
+- \"rwxrw-r--\" (ls style specification)
+- \"a=rw,u+x\" (chmod style) *
+- \"rwx\" (interpreted as \"u=rwx\") *
+
+* The interpretation of these forms relies on `file-modes-symbolic-to-number',
+  and uses `org-babel-tangle-default-mode' as the base mode."
+  (cond
+   ((integerp mode) mode)
+   ((not (stringp mode))
+    (error "File mode %S not recognised as a valid format." mode))
+   ((string-match-p "^0?[0-7][0-7][0-7]$" mode)
+    (string-to-number mode 8))
+   ((string-match-p "^#o[0-7][0-7][0-7]$" mode)
+    (string-to-number (substring mode 2) 8))
+   ((string-match-p "^[ugoa]*\\(?:[+-=][rwxXstugo]*\\)+\\(,[ugoa]*\\(?:[+-=][rwxXstugo]*\\)+\\)*$" mode)
+    (file-modes-symbolic-to-number mode org-babel-tangle-default-mode))
+   ((string-match-p "^[rwx-]\\{3\\}$" mode)
+    (file-modes-symbolic-to-number (concat "u=" mode) org-babel-tangle-default-mode))
+   ((string-match-p "^[rwx-]\\{9\\}$" mode)
+    (file-modes-symbolic-to-number (concat  "u=" (substring mode 0 3)
+                                            ",g=" (substring mode 3 6)
+                                            ",a=" (substring mode 6 9))
+                                   0))
+   (t (error "File mode %S not recognised as a valid format." mode))))
+
 (defun org-babel-tangle-clean ()
   "Remove comments inserted by `org-babel-tangle'.
 Call this function inside of a source-code file generated by
-- 
2.33.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-09-30 18:14 [PATCH] Accept more :tangle-mode specification forms Timothy
@ 2021-10-01  1:24 ` Tom Gillespie
  2021-10-01  6:59   ` Timothy
  2021-10-01  8:39   ` Christian Moe
  2021-10-05 14:45 ` Timothy
  1 sibling, 2 replies; 33+ messages in thread
From: Tom Gillespie @ 2021-10-01  1:24 UTC (permalink / raw)
  To: Timothy; +Cc: Org Mode List

I strongly oppose this patch. It adds far too much complexity to the
org grammar. Representation of numbers is an extremely nasty part of
nearly every language, and I suggest that org steer well clear of
trying to formalize this. With an eye to future portability I suggest
that no special cases be given to something as important for security
as tangle mode without very careful consideration. Emacs lisp closures
have clear semantics in Org and the number syntax is clear. If users
are concerned about the verbosity of (identity #o0600) they could go
with the sorter (or #o0600). Best,
Tom


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-10-01  1:24 ` Tom Gillespie
@ 2021-10-01  6:59   ` Timothy
  2021-10-01  8:00     ` Stefan Nobis
  2021-10-01  8:39   ` Christian Moe
  1 sibling, 1 reply; 33+ messages in thread
From: Timothy @ 2021-10-01  6:59 UTC (permalink / raw)
  To: Tom Gillespie; +Cc: Org Mode List

[-- Attachment #1: Type: text/plain, Size: 2738 bytes --]

Hi Tom,

Thanks for giving me your thoughts on this. I have a few thoughts in response :)

> I strongly oppose this patch. It adds far too much complexity to the
> org grammar. Representation of numbers is an extremely nasty part of
> nearly every language, and I suggest that org steer well clear of
> trying to formalize this.

I’m not quite sure I see your point here, as I don’t see how this affects the
grammar of Org at all. The :attribute value syntax is unaffected, this just
changes how a particular :attribute’s value is interpreted. Attribute specific
interpretation is normal, with “:file ~/hello” you expect `~' to be interpreted as
`$HOME', but were I to give “:session ~/hello” I would not expect `~' to be
expanded etc.

Similarly, with regard to the representation of numbers, I’m not sure that
applies here, as the value is still a string not a number, it’s just
interpreted. Arguably, we’re not even representing numbers here but representing
file permissions which are currently abstracted by a numerical representation.

> With an eye to future portability I suggest that no special cases be given to
> [snipped for later] tangle mode without very careful consideration.

Mmmm, we defiantly want to think about what options we allow for, but I don’t
think that precludes us from accepting more than one common permissions
representations.

> [the snip]: something as important for security as tangle mode

Thank you for considering potential security implications, this is something
that I didn’t consider when writing the patch, but if we allow for a confusing
format that could deceive people into tangling files in modes they didn’t
realise they were tangling to.

I think there are two relevant points here
⁃ If we only allow very widely-understood, standard representations, I think the
  risk of people misunderstanding a :tangle-mode value is acceptably low
⁃ If you consider things this way, since arbitrary lisp closures are currently
  permitted, one can already trivially create a much more misleading
  :tangle-mode value with the current code.

> Emacs lisp closures have clear semantics in Org and the number syntax is clear

See my earlier comments on the semantics being unaffected, and this not being a
number syntax.

> If users are concerned about the verbosity of (identity #o0600) they could go
> with the sorter (or #o0600).

Perhaps, but I personally find it easier to interpret “rwxr-xr–” for example
than “(or #o754)”, and I feel quite confident in guessing that
a. I’m not alone
b. Nobody that understands “#o754” will have difficult understanding “rwxr-xr–”

All the best,
Timothy

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-10-01  6:59   ` Timothy
@ 2021-10-01  8:00     ` Stefan Nobis
  2021-10-01 10:05       ` Eric S Fraga
  0 siblings, 1 reply; 33+ messages in thread
From: Stefan Nobis @ 2021-10-01  8:00 UTC (permalink / raw)
  To: emacs-orgmode

Timothy <tecosaur@gmail.com> writes:

> Thank you for considering potential security implications

BTW: Security-wise I would argue to even forbid the integer case. From
my view next to nobody uses and is used to the decimal codes of file
modes. So this decimal integer representation is the most error prone,
I would say. The more explicit (like "rwx-r-x-r--" etc.) the better.

I would also tend to only support something like "#o755" and forbid
"755" as well as "0755", just to be more explicit and to avoid
misinterpretation.

But all in all I support your patch and your arguments for the change.
Currently, I also do not really see why this attribute value (just a
string if the integer case is forbidden) will make the Org syntax more
complex.

-- 
Until the next mail...,
Stefan.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-10-01  1:24 ` Tom Gillespie
  2021-10-01  6:59   ` Timothy
@ 2021-10-01  8:39   ` Christian Moe
  1 sibling, 0 replies; 33+ messages in thread
From: Christian Moe @ 2021-10-01  8:39 UTC (permalink / raw)
  To: Tom Gillespie; +Cc: Org Mode List, Timothy


Tom Gillespie writes:

> I strongly oppose this patch. It adds far too much complexity to the
> org grammar.  Representation of numbers is an extremely nasty part of
> nearly every language, and I suggest that org steer well clear of
> trying to formalize this.

I'd like to understand these objections better. Aren't you overstating
what is at issue? The patch allows a single keyword option,
:tangle-mode, to accept a few different ways of setting file
permissions. I'm not sure that amounts to formalizing representation of
numbers in Org, or even modifying Org grammar as such. (And I can't
think of other parts of Org where this would be relevant, so I wouldn't
expect demands for further feature creep.)

> With an eye to future portability I suggest
> that no special cases be given to something as important for security
> as tangle mode without very careful consideration.

When you say portability, what are you thinking about? If you're talking
about tangling Org-Babel code with other processors than Emacs Org-mode,
this seems like fairly trivial functionality to reproduce. In fact,
wouldn't this be easier than the current arrangements, which would
require the processor to be able to evaluate an arbitrary Emacs Lisp
expression outside Emacs?

What is the added security problem here, given that file permissions can
already be set by tangle mode?

I suppose that the greater complexity of the patch provides maintainers
with somewhat more opportunities for making code errors. And the greater
choice of representations perhaps gives the user more opportunities for
making user errors (though speaking strictly for myself, I'm more likely
to make those errors calculating octals than using the more intuitive
representations Timothy is helpfully making available). But these
problems seem marginal to me. Are there others?

> Emacs lisp closures have clear semantics in Org and the number syntax
> is clear. If users are concerned about the verbosity of (identity
> #o0600) they could go with the sorter (or #o0600).

But why would anyone want to write a lisp closure a number literal would
suffice? It's not what a user would expect.

Yours,
Christian


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-10-01  8:00     ` Stefan Nobis
@ 2021-10-01 10:05       ` Eric S Fraga
  2021-10-01 10:29         ` tomas
  0 siblings, 1 reply; 33+ messages in thread
From: Eric S Fraga @ 2021-10-01 10:05 UTC (permalink / raw)
  To: emacs-orgmode

> BTW: Security-wise I would argue to even forbid the integer case. 

Completely agree with this.  If you look at the chmod(1) man page, only
symbolic and octal cases are described.  These are the options most
people will be comfortable with as a result.

> I would also tend to only support something like "#o755" and forbid
> "755" as well as "0755", just to be more explicit and to avoid
> misinterpretation.

Here I disagree; again, in the manual, the notation used, as an example,
is 0755.  I see no need for the #o syntax personally.  This is
especially true if we don't allow integer (i.e. base 10) values.  

-- 
: Eric S Fraga via Emacs 28.0.50, Org 9.5-g9a4a24
: Latest paper written in org: https://arxiv.org/abs/2106.05096


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-10-01 10:05       ` Eric S Fraga
@ 2021-10-01 10:29         ` tomas
  2021-10-01 18:04           ` Tom Gillespie
  0 siblings, 1 reply; 33+ messages in thread
From: tomas @ 2021-10-01 10:29 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1139 bytes --]

On Fri, Oct 01, 2021 at 11:05:17AM +0100, Eric S Fraga wrote:

[...]

> > I would also tend to only support something like "#o755" and forbid
> > "755" as well as "0755", just to be more explicit and to avoid
> > misinterpretation.
> 
> Here I disagree; again, in the manual, the notation used, as an example,
> is 0755.  I see no need for the #o syntax personally.  This is
> especially true if we don't allow integer (i.e. base 10) values.  

Chiming in, I might be the culprit (in this thread) for the #o755
idea: I proposed it only because I was seeing that the argument
was being interpreted as (a decimal representation of) an int, and
thought it to be a good idea to stay compatible to Elisp notation.

Since then, the movement was rather towards consistency with the
shell and coreutils (which also makes sense, perhaps more [1]).

I wouldn't mix both :)

Cheers

[1] If you get over the wart that there is a little embedded
   domain specific language in the arg of this one specific
   keyword. I can also understand Tom Gillespie's hesitations,
   since he's trying to formalise the grammar.

 - t

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-10-01 10:29         ` tomas
@ 2021-10-01 18:04           ` Tom Gillespie
  2021-10-01 18:14             ` Timothy
  0 siblings, 1 reply; 33+ messages in thread
From: Tom Gillespie @ 2021-10-01 18:04 UTC (permalink / raw)
  To: emacs-orgmode; +Cc: tomas, mail, TEC

> I'd like to understand these objections better. Aren't you overstating
what is at issue?

Yes, after hitting send I realized I overstated my position a bit.
In the meantime the comments in this thread are encouraging,
however I have finally figured out what I was really trying to say.

tl;dr file permission modes are not universal and should thus not
be part of the Org implementation, Org itself knows nothing about
files or permissions, it is the system that Org is running in/on.
Therefore, so long as we make it abundantly clear that the
value for :tangle-mode is not expected to be portable and that
it is always up to the user to ensure correct behavior, then we
are ok. I'm not happy about this conclusion from a security
perspective, but it isn't really worse than the situation we have
right now.


As many have pointed out, the grammar itself will not be affected.
However, other parts of the spec will. In general my objective is to
try to reduce the number of special cases that an org implementation
has to know about and delegate them to something else.

However in this case it is a bit tricky because of the security implications
and due to the fact that octal modes for file permissions are NOT universal
and should not be expected to be universal!

I actually think that my gut reaction was correct, but was expressed
in the wrong way.

Unix file modes are not universal and should thus not be encoded as
part of a portable document format. This means that it is up to the
user to know what representation is suitable.

Right now that representation is delegated to Emacs, because Emacs
handles file permissions for Org, and Emac's language for modes is
octal.

There are some octal modes that do not translate on Windows, and cannot
be correctly set. There will (hopefully) be some happy day in the future
where there is an operating system that will run Org babel where octal
file modes do not exist at all!

Therefore I suggest that we do not enshrine a particularly obscure way
of expressing file modes into Org itself. Right now Org is confined to
Emacs' representations, which in a sense protects Org from becoming
too ossified by bad designs of the past --- Emacs can keep all that
for us!

If we want a more user friendly syntax for this I would suggest that we do
something like what has been done for Org babel :results, i.e. like
:tangle-mode read write execute, unfortunately that does not compose
well at all with user, group, and other and becomes exceedingly verbose.


Final conclusion, after all that rambling, is that I'd actually be ok with
any of the solutions proposed, so long as it is clear that :tangle-mode
will always be implementation dependent, and may or may not be
meaningful depending on which operating system you are using.
Unfortunate for security, but I don't see any way around tha. The
best we could do for security would be for implementations to
test the file modes after tangling to ensure that they match,
which is more important I think.

That said, reducing the number of forms as Eric suggests would
be a happy medium.

Best!
Tom


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-10-01 18:04           ` Tom Gillespie
@ 2021-10-01 18:14             ` Timothy
  0 siblings, 0 replies; 33+ messages in thread
From: Timothy @ 2021-10-01 18:14 UTC (permalink / raw)
  To: Tom Gillespie; +Cc: tomas, emacs-orgmode, mail

[-- Attachment #1: Type: text/plain, Size: 1255 bytes --]

Hi Tom,

Thanks for going through the replies so far and refining your thoughts.

> *snip a whole bunch of comments*

I think I’m of the same mind as you that if we try to mentally separate Org the
markup format and Org the emacs mode, the format should not specify the
interpretation of the :tangle-mode value.

I think the best way to have it would be,
⁃ Org the format, :tangle-mode takes a value representing the file permissions
  the tangled file should have. Optional: here are some common examples that
  might be recognised
⁃ Org the emacs mode, :tangle-mode’s value is interpreted like so (…)

> That said, reducing the number of forms as Eric suggests would
> be a happy medium.

Indeed, I’ve basically supported every form I could think of. I’m currently
inclined to cut it down to:
• 755
• “rwxrw-r–” (`ls -l' style)
• chmod style with `org-babel-tangle-default-mode' and `file-modes-symbolic-to-number'

Maybe with (if anybody says they would like this)
• #o755 (elisp octal)
• 0755 (C octal)
• “rwx” = user perm, bit-or’d with `org-babel-tangle-default-mode' for the rest
  (i.e. `org-babel-tangle-default-mode', but not exceeding the user perm)

All the best,
Timothy

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-09-30 18:14 [PATCH] Accept more :tangle-mode specification forms Timothy
  2021-10-01  1:24 ` Tom Gillespie
@ 2021-10-05 14:45 ` Timothy
  2021-10-05 15:54   ` unknown@email.com
                     ` (3 more replies)
  1 sibling, 4 replies; 33+ messages in thread
From: Timothy @ 2021-10-05 14:45 UTC (permalink / raw)
  To: Org Mode List

[-- Attachment #1: Type: text/plain, Size: 841 bytes --]

Hi  Everyone,

It feels like we’re near a patch that would be good to merge. I would very much
like to get feedback on what I proposed in my reply to Tom though (see below).

>> That said, reducing the number of forms as Eric suggests would
>> be a happy medium.
>
> Indeed, I’ve basically supported every form I could think of. I’m currently
> inclined to cut it down to:
> • 755
> • “rwxrw-r–” (`ls -l’ style)
> • chmod style with `org-babel-tangle-default-mode’ and `file-modes-symbolic-to-number’
>
> Maybe with (if anybody says they would like this)
> • #o755 (elisp octal)
> • 0755 (C octal)
> • “rwx” = user perm, bit-or’d with `org-babel-tangle-default-mode’ for the rest
>   (i.e. `org-babel-tangle-default-mode’, but not exceeding the user perm)

All the best,
Timothy

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-10-05 14:45 ` Timothy
@ 2021-10-05 15:54   ` unknown@email.com
  2021-10-05 16:13     ` Timothy
  2021-10-05 16:06   ` tomas
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 33+ messages in thread
From: unknown@email.com @ 2021-10-05 15:54 UTC (permalink / raw)
  To: emacs-orgmode

Timothy <tecosaur@gmail.com> writes:

> It feels like we’re near a patch that would be good to merge. I would very much
> like to get feedback on what I proposed in my reply to Tom though (see below).
>
>> Maybe with (if anybody says they would like this)
>> • #o755 (elisp octal)
>> • 0755 (C octal)
>> • “rwx” = user perm, bit-or’d with `org-babel-tangle-default-mode’ for the rest

I think this is a good idea and don't see any problems (or other
suggestions) with the proposed formats.

The existing (identity #o0755) will still function, correct?
i.e. backward compatibility.

--
Jeremy Cowgar


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-10-05 14:45 ` Timothy
  2021-10-05 15:54   ` unknown@email.com
@ 2021-10-05 16:06   ` tomas
  2021-10-06 11:59   ` Max Nikulin
  2021-11-18 10:20   ` Timothy
  3 siblings, 0 replies; 33+ messages in thread
From: tomas @ 2021-10-05 16:06 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1143 bytes --]

On Tue, Oct 05, 2021 at 10:45:59PM +0800, Timothy wrote:
> Hi  Everyone,
> 
> It feels like we’re near a patch that would be good to merge. I would very much
> like to get feedback on what I proposed in my reply to Tom though (see below).

OK. Since I made some noises, I feel compelled to feed back :)

> >> That said, reducing the number of forms as Eric suggests would
> >> be a happy medium.
> >
> > Indeed, I’ve basically supported every form I could think of. I’m currently
> > inclined to cut it down to:
> > • 755
> > • “rwxrw-r–” (`ls -l’ style)
> > • chmod style with `org-babel-tangle-default-mode’ and `file-modes-symbolic-to-number’

This looks perfect to me.

> > Maybe with (if anybody says they would like this)
> > • #o755 (elisp octal)
> > • 0755 (C octal)
> > • “rwx” = user perm, bit-or’d with `org-babel-tangle-default-mode’ for the rest
> >   (i.e. `org-babel-tangle-default-mode’, but not exceeding the user perm)

I wouldn't miss those, having the above. Less is more, IMHO.

Thanks for your work, and for the way you do it. You rock!

Cheers
 - t

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-10-05 15:54   ` unknown@email.com
@ 2021-10-05 16:13     ` Timothy
  0 siblings, 0 replies; 33+ messages in thread
From: Timothy @ 2021-10-05 16:13 UTC (permalink / raw)
  To: unknown@email.com; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 326 bytes --]

Hi  Jeremy,

> I think this is a good idea and don’t see any problems (or other
> suggestions) with the proposed formats.
>
> The existing (identity #o0755) will still function, correct?
> i.e. backward compatibility.

It should yes. I’ll double check before I actually push the commit.

All the best,
Timothy

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-10-05 14:45 ` Timothy
  2021-10-05 15:54   ` unknown@email.com
  2021-10-05 16:06   ` tomas
@ 2021-10-06 11:59   ` Max Nikulin
  2021-11-18 10:20   ` Timothy
  3 siblings, 0 replies; 33+ messages in thread
From: Max Nikulin @ 2021-10-06 11:59 UTC (permalink / raw)
  To: emacs-orgmode

On 05/10/2021 21:45, Timothy wrote:
>>
>> Indeed, I’ve basically supported every form I could think of. I’m currently
>> inclined to cut it down to:
>> • 755
>> • “rwxrw-r–” (`ls -l’ style)
>> • chmod style with `org-babel-tangle-default-mode’ and `file-modes-symbolic-to-number’
>>
>> Maybe with (if anybody says they would like this)
>> • #o755 (elisp octal)
>> • 0755 (C octal)
>> • “rwx” = user perm, bit-or’d with `org-babel-tangle-default-mode’ for the rest
>>    (i.e. `org-babel-tangle-default-mode’, but not exceeding the user perm)

My opinion (discussion should not mean that I insist) is that since the 
following is currently working

#+begin_src bash :tangle yes :tangle-mode 755
   echo "Hello"
#+end_src

naked numbers (strings looking like numbers) should be forbidden. 
`org-lint' should report such lines and it should be at least warning 
for `org-babel-tagle'.

It is safer to define macros (namely macros, not just functions) that 
check argument type, e.g.

(filemode-octal "755") ; OK
(filemode-octal #o755) ; OK
(filemode-octal 755) ; Error

Maybe they should return not just raw number but e.g. tagged pair
(#o755 . 'filemode)

The point is that it should be hard to pass decimal or hex number 
(though it might be possible through a function generating tagged cons).

It is better to be a bit more verbose and explicit than to allow weird 
hard to notice later errors. Problem cases are too close to valid ones 
with current behavior.




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-10-05 14:45 ` Timothy
                     ` (2 preceding siblings ...)
  2021-10-06 11:59   ` Max Nikulin
@ 2021-11-18 10:20   ` Timothy
  2021-11-18 17:22     ` Timothy
  3 siblings, 1 reply; 33+ messages in thread
From: Timothy @ 2021-11-18 10:20 UTC (permalink / raw)
  To: Org Mode List


Hi All,

This has just been pushed as described in fa97f9a39.
See some tests I performed before pushing below.

#+begin_src text :tangle-mode (identity #o345) :tangle t1.txt
this works
#+end_src

#+begin_src text :tangle-mode 433 :tangle t2.txt
this works
#+end_src

#+begin_src text :tangle-mode u+x :tangle t3.txt
this works
#+end_src

#+begin_src text :tangle-mode rw-r--r-- :tangle t4.txt
this works
#+end_src

#+begin_src text :tangle-mode hey :tangle t5.txt
invalid, error message given
#+end_src

--
Timothy


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-18 10:20   ` Timothy
@ 2021-11-18 17:22     ` Timothy
  2021-11-18 23:33       ` Tom Gillespie
  2021-11-19 16:31       ` Tim Cross
  0 siblings, 2 replies; 33+ messages in thread
From: Timothy @ 2021-11-18 17:22 UTC (permalink / raw)
  To: Org Mode List

[-- Attachment #1: Type: text/plain, Size: 842 bytes --]

Hi All,

I thought I’d checked for this, but I’ve just noticed that :tangle-mode 755
doesn’t actually work as expected. I assumed 755 would be passed as a string but
org-babel-parse-header-arguments actually turns it into an integer, just like
(identity #o755). Obviously 755 != #o755 and so this causes issues.

As it stands “755” works, but that isn’t great (most importantly, it’s easy to
confuse). Since it’s easier to add than remove things like this, we could just
get rid of this for now, but a convenient octal notation was a large chunk of
the motivation here IIRC.

We could also change the implementation to handle :tangle-mode o755, which will
make org-babel-parse-header-arguments parse the argument as a string.

I’m be keen to hear other people’s thoughts on this.

All the best,
Timothy

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-18 17:22     ` Timothy
@ 2021-11-18 23:33       ` Tom Gillespie
  2021-11-19 16:31       ` Tim Cross
  1 sibling, 0 replies; 33+ messages in thread
From: Tom Gillespie @ 2021-11-18 23:33 UTC (permalink / raw)
  To: Timothy; +Cc: Org Mode List

Hi Timothy,
    The confusion with 755 and "755" could lead to security issues in
cases like 600 vs "600" vs #o600. The need to protect against the 600
case is fairly important, however I don't think there is anything we
can do about it, because someone might want to enter their modes as
base 10 integers.

If we were to prepend every integer with #o (or setting the radix to 8
when reading this particular field) before passing it to
org-babel-parse-header-arguments then it would be impossible to use
base 10 integers unless they were provided in the #10r600 form (Emacs
doesn't support #d600 notation).

I think the best bet is to change the radix for bare integers to 8
when reading that particular header, however I don't know how complex
that would be to implement.

If we don't want to change the radix to 8 then here are some suggestions.

If #o0600 already parses correctly, then I suggest we leave things as
is. Adding complexity just to drop the leading # seems wasteful.

We may want to warn or raise an error if someone uses a value such as
the base 10 integer 600 which does not map to the usual expected octac
codes so that they don't silently get bad file modes that could leave
files readable to the world.

Best,
Tom


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-18 17:22     ` Timothy
  2021-11-18 23:33       ` Tom Gillespie
@ 2021-11-19 16:31       ` Tim Cross
  2021-11-19 18:10         ` tomas
                           ` (2 more replies)
  1 sibling, 3 replies; 33+ messages in thread
From: Tim Cross @ 2021-11-19 16:31 UTC (permalink / raw)
  To: emacs-orgmode


Timothy <tecosaur@gmail.com> writes:

> Hi All,
>
> I thought I’d checked for this, but I’ve just noticed that :tangle-mode 755
> doesn’t actually work as expected. I assumed 755 would be passed as a string but
> org-babel-parse-header-arguments actually turns it into an integer, just like
> (identity #o755). Obviously 755 != #o755 and so this causes issues.
>
> As it stands “755” works, but that isn’t great (most importantly, it’s easy to
> confuse). Since it’s easier to add than remove things like this, we could just
> get rid of this for now, but a convenient octal notation was a large chunk of
> the motivation here IIRC.
>
> We could also change the implementation to handle :tangle-mode o755, which will
> make org-babel-parse-header-arguments parse the argument as a string.
>
> I’m be keen to hear other people’s thoughts on this.
>

Thanks for your work on this. I am a little concerned we are making a
rod for our back by trying to make this overly clever in order to
provide as much convenience to the user as possible. As this setting
does have significant security implications, I would favour a simple and
easily testable option which is concise and unambiguous over
convenience. I would drop the 'rwxrw-r--' format as it isn't typical,
not allow base10 mode specifications and possibly even limit what
can be set (i.e. no sticky bit etc, just read, write and execute).

Security issues are more often than not, caused by complexity. Things
become complex when we try to satisfy too many options. In this case,
rather than being liberal in what we accept and precise in what we
send/do, I think we need to be precise in what we accept and do.

I would just accept two formats, both being strings with either "o400"
(or perhaps "#o400") and "u+rwx" symbolic form and anything else
generates an error (a hard error, not a warning i.e. stop processing,
don't tangle). 
 
Making the octal version be "#o600" rather than just "o600" would
possibly make interpretation easier as it would avoid "o600" and "o+r" -
if it starts with "#o" interpret as octal, otherwise try to parse as
symbolic names.

this would mean there will be some edge cases where you cannot set the
mode precisely to the value you want. However, these will be fringe
cases and requiring the user to take additional/special steps in this
case is IMO not too much to ask in exchange for reliability and
correctness for the majority and avoiding dangerous corner cases. 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-19 16:31       ` Tim Cross
@ 2021-11-19 18:10         ` tomas
  2021-11-20  4:31         ` Greg Minshall
  2021-11-20  8:08         ` Timothy
  2 siblings, 0 replies; 33+ messages in thread
From: tomas @ 2021-11-19 18:10 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 770 bytes --]

On Sat, Nov 20, 2021 at 03:31:16AM +1100, Tim Cross wrote:
> 
> Timothy <tecosaur@gmail.com> writes:
> 
> > Hi All,
> >
> > I thought I’d checked for this, but I’ve just noticed that :tangle-mode 755

[...]

> Thanks for your work on this. I am a little concerned we are making a
> rod for our back by trying to make this overly clever in order to
> provide as much convenience to the user as possible [...]

That's my feeling too. At the beginning I was a fan of the octal variant
(tradition), but I fear the interface, in its really good intent to "do-
what-i-mean" becomes unpredictable. Because different people do mean
different things.

An unpredictable interface having security implications tends to be...
interesting :)

Cheers
 - t

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-19 16:31       ` Tim Cross
  2021-11-19 18:10         ` tomas
@ 2021-11-20  4:31         ` Greg Minshall
  2021-11-20  8:08         ` Timothy
  2 siblings, 0 replies; 33+ messages in thread
From: Greg Minshall @ 2021-11-20  4:31 UTC (permalink / raw)
  To: Tim Cross; +Cc: emacs-orgmode

> Timothy <tecosaur@gmail.com> writes:

> I would just accept two formats, both being strings with either "o400"
> (or perhaps "#o400") and "u+rwx" symbolic form and anything else
> generates an error (a hard error, not a warning i.e. stop processing,
> don't tangle). 

+1.  (i'm neutral w.r.t. "o400" vs. "#o400".)

cheers, Greg


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-19 16:31       ` Tim Cross
  2021-11-19 18:10         ` tomas
  2021-11-20  4:31         ` Greg Minshall
@ 2021-11-20  8:08         ` Timothy
  2021-11-20 12:25           ` tomas
  2021-11-20 19:49           ` Tim Cross
  2 siblings, 2 replies; 33+ messages in thread
From: Timothy @ 2021-11-20  8:08 UTC (permalink / raw)
  To: Tim Cross; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 3195 bytes --]

Hi Tom, Tim, Thomas, and Greg,

Thank you all for your thoughts. I’ll try to respond to all the main points
raised below.

First off, in case it wasn’t clear in my earlier email when I said “#o555 works”
or the like I was being lazy and really meaning “(identity #o555)” works.

The parsing of “555” to the integer 555 is done by
org-babel-parse-header-arguments, and so can’t really be changed. For a simple
octal notation our best bet is adding non-digit characters so it is left as a
string, e.g. “o555”.

┌────
│ 1 -> (org-babel-parse-header-arguments ":tangle-mode 555" light)
│ 1 <- org-babel-parse-header-arguments: ((:tangle-mode . 555))
│ ======================================================================
│ 1 -> (org-babel-parse-header-arguments ":tangle-mode o555" light)
│ 1 <- org-babel-parse-header-arguments: ((:tangle-mode . "o555"))
└────

So while I’d like it if we could easily apply Tom’s suggestion to “change the
radix for bare integers to 8 when reading that particular header”, I don’t think
that’s very feasible, unfortunately.

Giving errors when a base 10 value has been given by accident would be a nice
idea, but in practice isn’t easy as many base 10 values are valid octal values,
e.g. #o555 = 365. Likewise, I don’t think Tim’s suggestion of disallowing base
10 values is feasible.

Regarding the concern that “we are making a rod for our back by trying to make
this overly clever” and that the change makes it too complex — I disagree.
Having had this discussion earlier we’ve ended up with,
• a shorthand for octal
• ls-style
• chmod-style
I think this small collection of distinct and simple input methods isn’t overly
clever or complex, and feel that it strikes the right balance between too many
options and too little flexibility.

Octal is great, for those that are familiar with it. Likewise, chmod-style is
quite natural for those who are used to chmod. There’s little added complexity
to the Org code base as we simply pass this to the Emacs function
file-modes-symbolic-to-number. I think the ls-style is quite valuable for two
reasons. It’s well-established thanks to ls, and is particularly good for
understanding permissions at a glance. For reading Org files I think this is
advantageous compared to the other styles. I’m don’t find assertions that this
is non-typical or unpredictable well-founded. Each style/syntax is well-defined,
simple, distinct, and taken from very common/wide spread usage.

Tim suggested that invalid forms should cause tangling to fail, but I feel this
may be a bit much. Personally I’m inclined to either
• warn, and don’t touch the file permissions (this is what currently occurs)
• use a very conservative file permission (e.g. rw——-).

So, as I see it the main decision that needs to be made is how to handle the
octal shorthand, now that it’s clear that the original plan is flawed? I’m
thinking either “o555” or “#o555” would be a good improvement over “(identity
#o555), but am open to other suggestions.

All the best,
Timothy

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-20  8:08         ` Timothy
@ 2021-11-20 12:25           ` tomas
  2021-11-20 14:50             ` Timothy
  2021-11-20 19:49           ` Tim Cross
  1 sibling, 1 reply; 33+ messages in thread
From: tomas @ 2021-11-20 12:25 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 871 bytes --]

On Sat, Nov 20, 2021 at 04:08:16PM +0800, Timothy wrote:
> Hi Tom, Tim, Thomas, and Greg,

[...]

> • a shorthand for octal
> • ls-style
> • chmod-style
> I think this small collection of distinct and simple input methods isn’t overly
> clever or complex, and feel that it strikes the right balance between too many
> options and too little flexibility.

That sounds good.

> [...] I’m
> thinking either “o555” or “#o555” would be a good improvement over “(identity
> #o555), but am open to other suggestions.

That's reasonable. I'd even tend to disallow decimal. In usage, it's too
exotic and the potential for someone entering "just a number" expecting
for it to be read as octal is too high. Is it worth the risk?

Timothy, I really admire your patience, and your incredibly friendly way
of doing things :)

Cheers
 - t

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-20 12:25           ` tomas
@ 2021-11-20 14:50             ` Timothy
  2021-11-20 16:09               ` tomas
                                 ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Timothy @ 2021-11-20 14:50 UTC (permalink / raw)
  To: tomas; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1006 bytes --]

Hi Thomas (& co.),

>> […] I’m thinking either “o555” or “#o555” would be a good improvement over
>> “(identity #o555), but am open to other suggestions.
>
> That’s reasonable. I’d even tend to disallow decimal. In usage, it’s too
> exotic and the potential for someone entering “just a number” expecting
> for it to be read as octal is too high. Is it worth the risk?

I’ve just pushed three commits that
1. Add “o555” as an octal shorthand
2. Perform a simple check that integer modes are valid*
3. Make the ls-style regex stricter

> Timothy, I really admire your patience, and your incredibly friendly way
> of doing things :)

Thanks. It helps that this list is fairly friendly to begin with :)

All the best,
Timothy

* For example, “:tangle-mode 755” will now produce the warning:
  “1363 is not a valid file mode octal. Did you give the decimal value 755 by
  mistake?”. Maybe it would be worth adding “if so try o755” or similar?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-20 14:50             ` Timothy
@ 2021-11-20 16:09               ` tomas
  2021-11-20 21:32               ` Tim Cross
  2021-11-21  4:08               ` Greg Minshall
  2 siblings, 0 replies; 33+ messages in thread
From: tomas @ 2021-11-20 16:09 UTC (permalink / raw)
  To: Timothy; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 531 bytes --]

On Sat, Nov 20, 2021 at 10:50:40PM +0800, Timothy wrote:
> Hi Thomas (& co.),

[...]

> Thanks. It helps that this list is fairly friendly to begin with :)

Friendly lists are made of friendly people, and there, your contribution
is... special.

> * For example, “:tangle-mode 755” will now produce the warning:
>   “1363 is not a valid file mode octal. Did you give the decimal value 755 by
>   mistake?”. Maybe it would be worth adding “if so try o755” or similar?

Awesome :)

Thanks & cheers
 - t

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-20  8:08         ` Timothy
  2021-11-20 12:25           ` tomas
@ 2021-11-20 19:49           ` Tim Cross
  2021-11-21  4:02             ` Timothy
  1 sibling, 1 reply; 33+ messages in thread
From: Tim Cross @ 2021-11-20 19:49 UTC (permalink / raw)
  To: Timothy; +Cc: emacs-orgmode


Timothy <tecosaur@gmail.com> writes:

> The parsing of “555” to the integer 555 is done by
> org-babel-parse-header-arguments, and so can’t really be changed. For a simple
> octal notation our best bet is adding non-digit characters so it is left as a
> string, e.g. “o555”.

I don't understand this. Why can't it be changed? Doing so may not be
trivial, but I cannot see any reason it cannot be changed if we wanted
to. However, perhaps all we really need to do is look at the value
returned and if the value for mode is not a string, throw an error and
if it is a string, either match symbolic format or verify it has a
leading 'o' or '#o' and treat as octal, else throw an error. 

> Giving errors when a base 10 value has been given by accident would be a nice
> idea, but in practice isn’t easy as many base 10 values are valid octal values,
> e.g. #o555 = 365. Likewise, I don’t think Tim’s suggestion of disallowing base
> 10 values is feasible.
>

Did you mean many base 10 values are valid mode setting values? All base
10 values can be expressed in octal, so I'm assuming you meant some base
10 values will translate into octal values which are valid mode setting
values. My point here is that such translations are unlikely to be what
the user expected.

I have never seen anyone use base 10 to specify mode settings. It takes
a lot more mental agility to derive a base 10 version for a valid mode
setting than it does to use octal, which has a clear and natural mapping
of the octal characters to the underlying bits used to represent the
mode for user, grop and other (as well as setuid, setgid and sticky
bit).

When I said disable base 10 values, what I meant was generate an error
if the value looks like it is base 10 i.e. does not have a leading o or
#o. This makes it clear it has to be specified as an octal value and
will alert users to this fact. 

> Regarding the concern that “we are making a rod for our back by trying to make
> this overly clever” and that the change makes it too complex — I disagree.
> Having had this discussion earlier we’ve ended up with,
> • a shorthand for octal
> • ls-style
> • chmod-style
> I think this small collection of distinct and simple input methods isn’t overly
> clever or complex, and feel that it strikes the right balance between too many
> options and too little flexibility.
>

We will probably have to disagree here as I remain unconvinced. I guess
time will tell.

> Octal is great, for those that are familiar with it. Likewise, chmod-style is
> quite natural for those who are used to chmod. There’s little added complexity
> to the Org code base as we simply pass this to the Emacs function
> file-modes-symbolic-to-number. I think the ls-style is quite valuable for two
> reasons. It’s well-established thanks to ls, and is particularly good for
> understanding permissions at a glance.

I would agree it is an established way to display the permissions
associated with a filesystem object, but disagree it is a well
established way to set such values - I know of no other tool which uses
this format. It is also not the same as the ls -l display (no object
type indicator). The ls -l also displays sticky bit, uid and gid. Does
your format also support setting these values (something which can be
done with the octal or symbolic formats) i.e. support s, S, t and T for
the 'executable' bit for user/group?

>For reading Org files I think this is
> advantageous compared to the other styles. I’m don’t find assertions that this
> is non-typical or unpredictable well-founded. Each style/syntax is well-defined,
> simple, distinct, and taken from very common/wide spread usage.
>

Again, I know of no other tool which uses the ls -l format to set file
mode permissions, so I cannot agree with the assertion it is well
established. Personally, I prefer the symbolic form as it is shorter and
clear. I find the ls -l form too easy to get wrong (especially with
getting the number of '-' correct).

> Tim suggested that invalid forms should cause tangling to fail, but I feel this
> may be a bit much. Personally I’m inclined to either
> • warn, and don’t touch the file permissions (this is what currently occurs)
> • use a very conservative file permission (e.g. rw——-).

I'm unsure on this. My concern is people may not notice the warning and
then be surprised later. Given the potential security issues, a later
surprise is something to be avoided even if it is inconvenient. With
respect to the default action to take, I would suggest we also need to
look at the default umask setting for the user and if that is more
restrictive, use that rather than some value like rw------- For all we
know, the user has set a specific umask for a valid reason and will be
surprised if emacs just ignores that to do what it wants (another
surprise to be avoided).

The user is not required to specify a mode. However, if they do and if
we cannot interpret what they have specified without ambiguity, we
should throw an error and cease processing. Making a guess as to what
they intended in this situation is IMO a big mistake. 

>
> So, as I see it the main decision that needs to be made is how to handle the
> octal shorthand, now that it’s clear that the original plan is flawed? I’m
> thinking either “o555” or “#o555” would be a good improvement over “(identity
> #o555), but am open to other suggestions.
>

I agree the (identity ...) approach doesn't feel terribly 'natural'.

My only preference for "#o" over just "o" prefix is to clearly signal to
the user that it is an octal value and avoid the situation where you
might glance a value and see the leading 'o' as a '0', missing the point
it is an octal value not a decimal one. However, this seems like a low
risk, so I'm happy either way. 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-20 14:50             ` Timothy
  2021-11-20 16:09               ` tomas
@ 2021-11-20 21:32               ` Tim Cross
  2021-11-21  4:08               ` Greg Minshall
  2 siblings, 0 replies; 33+ messages in thread
From: Tim Cross @ 2021-11-20 21:32 UTC (permalink / raw)
  To: emacs-orgmode


Timothy <tecosaur@gmail.com> writes:

> * For example, “:tangle-mode 755” will now produce the warning:
>   “1363 is not a valid file mode octal. Did you give the decimal value 755 by
>   mistake?”. Maybe it would be worth adding “if so try o755” or similar?


I think that warning will be confusing for users. They will look at the
value 1363 and be confused where that value came from, likely assuming
they have found a bug in org mode. Perhaps something simpler like

"Decimal 555 is not a valid file mode specification. Did you mean to
specify it as an octal value? If so, prefix it with a 'o'".


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-20 19:49           ` Tim Cross
@ 2021-11-21  4:02             ` Timothy
  2021-11-21 13:51               ` Tim Cross
  0 siblings, 1 reply; 33+ messages in thread
From: Timothy @ 2021-11-21  4:02 UTC (permalink / raw)
  To: Tim Cross; +Cc: emacs-orgmode


Hi Tim,

>> The parsing of “555” to the integer 555 is done by
>> org-babel-parse-header-arguments, and so can’t really be changed.
>
> I don't understand this. Why can't it be changed?

Well, it can't be changed without changing
org-babel-parse-header-arguments, which is quite a major change and I
suspect may be a breaking change.

Maybe it's fine? I just don't feel confident about this.

> When I said disable base 10 values, what I meant was generate an error
> if the value looks like it is base 10 i.e. does not have a leading o or
> #o. This makes it clear it has to be specified as an octal value and
> will alert users to this fact.

This is all well and good, but the point I'm making is that since due to
the current behaviour of org-babel-parse-header-arguments we can't
distinguish between :tangle-mode (identity #o555) and :tangle-mode 365
putting this idea /into practice/ doesn't look like it will be easy.

It's also worth noting that with regard to the recent changes that have
been made, this is not a new issue.

>> I think the ls-style is quite valuable for two reasons. It’s
>> well-established thanks to ls, and is particularly good for
>> understanding permissions at a glance.
>
> I would agree it is an established way to display the permissions
> associated with a filesystem object, but disagree it is a well
> established way to set such values - I know of no other tool which uses
> this format.

The driving motivation here is that while the tools which you mention do
not use this for setting permissions (e.g. chmod), they are only used
for /setting/ permissions. Since Org files aren't write-only, and
occasionally we go back and look at what we've written :P I think
allowing a format that is optimised for ease-of-reading instead of
ease-of-writing makes some sense. It is in this respect that I think
ls -l style is a good idea.

> It is also not the same as the ls -l display (no object
> type indicator). The ls -l also displays sticky bit, uid and gid. Does
> your format also support setting these values (something which can be
> done with the octal or symbolic formats) i.e. support s, S, t and T for
> the 'executable' bit for user/group?

Ah, I'm afraid that I'm not that up-together on sticky bits. I suspect
that it's not just the ls -l style that will need tweaking. I'll read up
on this in the next few days and update the ML.

> Personally, I prefer the symbolic form as it is shorter and clear. I
> find the ls -l form too easy to get wrong (especially with getting the
> number of '-' correct).

Isn't choice a great thing? :D In seriousness, this is exactly why I
think it's worth providing these options. At least thinking back to when
I started using Linux a few years ago I would have said the same thing
about the octal form, and was completely unfamiliar with chmod. I expect
different people to have different preferences with these three options,
but for one of them to be something most people are happy with.

>> Tim suggested that invalid forms should cause tangling to fail, but I feel this
>> may be a bit much. Personally I’m inclined to either
>> • warn, and don’t touch the file permissions (this is what currently occurs)
>> • use a very conservative file permission (e.g. rw——-).
>
> I'm unsure on this. My concern is people may not notice the warning and
> then be surprised later. Given the potential security issues, a later
> surprise is something to be avoided even if it is inconvenient. With
> respect to the default action to take, I would suggest we also need to
> look at the default umask setting for the user and if that is more
> restrictive, use that rather than some value like rw------- For all we
> know, the user has set a specific umask for a valid reason and will be
> surprised if emacs just ignores that to do what it wants (another
> surprise to be avoided).

> The user is not required to specify a mode. However, if they do and if
> we cannot interpret what they have specified without ambiguity, we
> should throw an error and cease processing. Making a guess as to what
> they intended in this situation is IMO a big mistake.

I don't see how using the default permissions is a potential security
risk? Could it surprise the user, yes, but that seems unavoidable when
the user is doing something invalid.

> My only preference for "#o" over just "o" prefix is to clearly signal to
> the user that it is an octal value and avoid the situation where you
> might glance a value and see the leading 'o' as a '0', missing the point
> it is an octal value not a decimal one. However, this seems like a low
> risk, so I'm happy either way.

Well, I've got "o" for now but that's not set in stone. With a lower
case "o" I feel the risk for confusion with "0" is quite low (besides
which a "0" prefix seems to be the C-style for octal anyway).

All the best,
Timothy


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-20 14:50             ` Timothy
  2021-11-20 16:09               ` tomas
  2021-11-20 21:32               ` Tim Cross
@ 2021-11-21  4:08               ` Greg Minshall
  2021-11-21  4:27                 ` Timothy
  2 siblings, 1 reply; 33+ messages in thread
From: Greg Minshall @ 2021-11-21  4:08 UTC (permalink / raw)
  To: Timothy; +Cc: tomas, emacs-orgmode

hi, Timothy,

> I’ve just pushed three commits that
> 1. Add “o555” as an octal shorthand
> 2. Perform a simple check that integer modes are valid*
> 3. Make the ls-style regex stricter
> ...
> * For example, “:tangle-mode 755” will now produce the warning:
>   “1363 is not a valid file mode octal. Did you give the decimal value 755 by
>   mistake?”. Maybe it would be worth adding “if so try o755” or similar?

i'd push back, even here, on allowing decimal.  file modes are bit
masks.  to me, offering a way to set a bit mask via a *decimal* value
seems a mistake.

my +/- one cent.

cheers, Greg


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-21  4:08               ` Greg Minshall
@ 2021-11-21  4:27                 ` Timothy
  2021-11-21  5:11                   ` Greg Minshall
  0 siblings, 1 reply; 33+ messages in thread
From: Timothy @ 2021-11-21  4:27 UTC (permalink / raw)
  To: Greg Minshall; +Cc: tomas, emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 508 bytes --]

Hi Greg,

> i’d push back, even here, on allowing decimal.  file modes are bit
> masks.  to me, offering a way to set a bit mask via a *decimal* value
> seems a mistake.

Just a quick note (see my long recent reply to Tim where I expand on this more),
but this isn’t new behaviour (isn’t actually affected by my recent changes) and
my concerns are with the viability of the necessary changes rather than whether
this would be good (we are of the same mind I think).

All the best,
Timothy

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-21  4:27                 ` Timothy
@ 2021-11-21  5:11                   ` Greg Minshall
  0 siblings, 0 replies; 33+ messages in thread
From: Greg Minshall @ 2021-11-21  5:11 UTC (permalink / raw)
  To: Timothy; +Cc: tomas, emacs-orgmode

Timothy,

> Just a quick note (see my long recent reply to Tim where I expand on this more),
> but this isn’t new behaviour (isn’t actually affected by my recent changes) and
> my concerns are with the viability of the necessary changes rather than whether
> this would be good (we are of the same mind I think).

yes, i see.  i hadn't realized that.

maybe, though, we could start off by warning people who are writing
integer values; then deprecate it; then, N-releases from now, make it
illegal?

such a path might be worth it.  or, at least the warnings --
irrespective, i would say, of whether or not the decimal value turns
into some "reasonable" file mode value.

cheers, Greg


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-21  4:02             ` Timothy
@ 2021-11-21 13:51               ` Tim Cross
  2021-11-21 14:33                 ` Timothy
  0 siblings, 1 reply; 33+ messages in thread
From: Tim Cross @ 2021-11-21 13:51 UTC (permalink / raw)
  To: Timothy; +Cc: emacs-orgmode


Timothy <tecosaur@gmail.com> writes:

> Hi Tim,
>
>>> The parsing of “555” to the integer 555 is done by
>>> org-babel-parse-header-arguments, and so can’t really be changed.
>>
>> I don't understand this. Why can't it be changed?
>
> Well, it can't be changed without changing
> org-babel-parse-header-arguments, which is quite a major change and I
> suspect may be a breaking change.
>
> Maybe it's fine? I just don't feel confident about this.
>
>> When I said disable base 10 values, what I meant was generate an error
>> if the value looks like it is base 10 i.e. does not have a leading o or
>> #o. This makes it clear it has to be specified as an octal value and
>> will alert users to this fact.
>
> This is all well and good, but the point I'm making is that since due to
> the current behaviour of org-babel-parse-header-arguments we can't
> distinguish between :tangle-mode (identity #o555) and :tangle-mode 365
> putting this idea /into practice/ doesn't look like it will be easy.
>

I don't see the complication here. When you call
org-babel-parse-header-arguments, it returns an alist of argument name
and value. If you had a header like :tangle-mode 555, you get the value
(:tangle-mode . 555), if you have :tangle-mode o555, you get
(:tangle-mode . "o555"). My suggestion is simply to look at the value
and if it is not a string, reject it outright. If it is a string, then
match that string to the allowed formats (i.e. an octal value if it
starts with "#0o" or "o", a symbolic value if it matches a regexp for
symbolic representation and even possiblly your ls -l format if it
matches that. This is not a change in org-babel-parse-arguments, but
rather a change in how we interpret those arguments. 


> It's also worth noting that with regard to the recent changes that have
> been made, this is not a new issue.

Which I don't think is a valid argument for the status quo. Allowing
base 10 mode specifications is just a bad idea. The fact it has been
permitted in the past is likely just an oversight rather than
intentional behaviour. Given the potential security issues involved
here, I think we can legitimately obsolete such features (using
acceptable change management approaches with suitable notification for
users etc).


>
>>> I think the ls-style is quite valuable for two reasons. It’s
>>> well-established thanks to ls, and is particularly good for
>>> understanding permissions at a glance.
>>
>> I would agree it is an established way to display the permissions
>> associated with a filesystem object, but disagree it is a well
>> established way to set such values - I know of no other tool which uses
>> this format.
>
> The driving motivation here is that while the tools which you mention do
> not use this for setting permissions (e.g. chmod), they are only used
> for /setting/ permissions. Since Org files aren't write-only, and
> occasionally we go back and look at what we've written :P I think
> allowing a format that is optimised for ease-of-reading instead of
> ease-of-writing makes some sense. It is in this respect that I think
> ls -l style is a good idea.
>

I understand your motivation. I dispute your claim it is an established
and common convention. 

>> It is also not the same as the ls -l display (no object
>> type indicator). The ls -l also displays sticky bit, uid and gid. Does
>> your format also support setting these values (something which can be
>> done with the octal or symbolic formats) i.e. support s, S, t and T for
>> the 'executable' bit for user/group?
>
> Ah, I'm afraid that I'm not that up-together on sticky bits. I suspect
> that it's not just the ls -l style that will need tweaking. I'll read up
> on this in the next few days and update the ML.
>

This is very difficult Timothy. I really do appreciate all the work you
have done here and your patience, which makes what I'm going to say all
the more difficult. My big concern here is that you don't understand
this area with sufficient depth of understanding to really appreciate
some of the subtleties involved here. This makes me concerned that any
implementation of a new way to specify mode settings may have some
unexpected side effects or hidden issues which could be exploited in
unexpected ways (worse case) or simply have unexpected bugs which result
in mode settings which are not what the user intended. As someone who
has used Unix since 1980 and Linux since 1995, I've experienced my share
of issues relating to file modes and their interaction with different
filesystems, NFS, different platforms, character sets, locales etc.
Implementing a new approach as you are proposing is something I would be
extremely wary about.

If you do plan to go forward with this approach, please ensure you are
very confident in your understanding of the sticky bit, setuid/setgid
and associated interaction with execute etc. 

I suspect you do have the skills to implement correctly given sufficient
time, testing and motivation. However, I am concerned your initial stab
at this may be somewhat naive and requires more consideration. My main
concern is that once you take all the subtleties into consideration, the
complexity of your ls -l approach will exceed its utility and
maintainability. I could also be completely wrong.  

>> Personally, I prefer the symbolic form as it is shorter and clear. I
>> find the ls -l form too easy to get wrong (especially with getting the
>> number of '-' correct).
>
> Isn't choice a great thing? :D In seriousness, this is exactly why I
> think it's worth providing these options. At least thinking back to when
> I started using Linux a few years ago I would have said the same thing
> about the octal form, and was completely unfamiliar with chmod. I expect
> different people to have different preferences with these three options,
> but for one of them to be something most people are happy with.
>

Choice is good. However, as I mentioned ina previous post, the problem
with choice is complexity and the biggest source of security problems is
complexity. Keep in mind that what you do now is something which will
need to be maintained by others in the future. It needs to be clear,
concise and as simple as possible (but of course, no simpler!). 


>>> Tim suggested that invalid forms should cause tangling to fail, but I feel this
>>> may be a bit much. Personally I’m inclined to either
>>> • warn, and don’t touch the file permissions (this is what currently occurs)
>>> • use a very conservative file permission (e.g. rw——-).
>>
>> I'm unsure on this. My concern is people may not notice the warning and
>> then be surprised later. Given the potential security issues, a later
>> surprise is something to be avoided even if it is inconvenient. With
>> respect to the default action to take, I would suggest we also need to
>> look at the default umask setting for the user and if that is more
>> restrictive, use that rather than some value like rw------- For all we
>> know, the user has set a specific umask for a valid reason and will be
>> surprised if emacs just ignores that to do what it wants (another
>> surprise to be avoided).
>
>> The user is not required to specify a mode. However, if they do and if
>> we cannot interpret what they have specified without ambiguity, we
>> should throw an error and cease processing. Making a guess as to what
>> they intended in this situation is IMO a big mistake.
>
> I don't see how using the default permissions is a potential security
> risk? Could it surprise the user, yes, but that seems unavoidable when
> the user is doing something invalid.

Consider this simple scenario. The tangled output is being written to a
directory which requires specific permissions for security reasons -
perhaps it is some type of ftp server where permissions will determine
what can be seen by whom. The data might contain sensitive information.
The user specifies what they thing is the correct mode specification,
but the7y get it wrong. Org comes along and looks at what they specified
and decides it cannot interpret what the user has specified, so applies
a default (and potentially completely wrong) mode, exposing sensitive
data to others who should not have had access.

Use a default if the user does not specify a mode, but if they specify a
mode and org cannot interpret what they specified, then do nothing -
don't generate the tangled output at all. This is the only safe
approach. 

>
>> My only preference for "#o" over just "o" prefix is to clearly signal to
>> the user that it is an octal value and avoid the situation where you
>> might glance a value and see the leading 'o' as a '0', missing the point
>> it is an octal value not a decimal one. However, this seems like a low
>> risk, so I'm happy either way.
>
> Well, I've got "o" for now but that's not set in stone. With a lower
> case "o" I feel the risk for confusion with "0" is quite low (besides
> which a "0" prefix seems to be the C-style for octal anyway).
>

I am not overly concerned about this one. I merely mention the use of
"#0" as a possible approach and why. If the preference is just to have
'o', that is fine with me. My main point is that when it comes to
parsing the value, having "#o" as a prefix for octal values is possibly
slightly easier wrt regexp specification than just 'o' as 'o' could also
be the first character for a symbolic mode specification as in o+w.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-21 13:51               ` Tim Cross
@ 2021-11-21 14:33                 ` Timothy
  2021-11-29 18:57                   ` Timothy
  0 siblings, 1 reply; 33+ messages in thread
From: Timothy @ 2021-11-21 14:33 UTC (permalink / raw)
  To: Tim Cross; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 6101 bytes --]

Hi  Tim,

Thanks for the way in which you’ve responded to my comments. I appreciate the
effort you’ve gone to to explain your views as opposed to simply saying you
disagree with some of my current thoughts :)

Tim Cross <theophilusx@gmail.com> writes:

> My suggestion is simply to look at the value and if it is not a string, reject
> it outright. […] This is not a change in org-babel-parse-arguments,
> but rather a change in how we interpret those arguments.

Ah I see, I did not think you were suggesting this. I’m quite wary of this as it
would break all current `:tangle-mode (identity #oXYZ)' headers, which as I
understand it are /the/ way tangle-mode has been previously set.

Perhaps this is a good argument for deprecation? I think we’d need to give a
decently long transition period though (a few years).

> Allowing base 10 mode specifications is just a bad idea. The fact it has been
> permitted in the past is likely just an oversight rather than intentional
> behaviour.

I think we are of the same mind regarding base10 permissions specifications
being bad, but are niggling over what to do about this.

> Given the potential security issues involved
> here, I think we can legitimately obsolete such features (using
> acceptable change management approaches with suitable notification for
> users etc).

I’m certainly open to deprecation, though I’d like to hear from some of the
other “main Org people” (e.g. Bastien, Nicolas) before doing anything along
these lines. Perhaps this would be worth making a second thread for?

> I understand your motivation. I dispute your claim it is an established
> and common convention.

Ok, cool. Since we should have a while till the next Org release, nothing is set
in stone yet and hopefully this will give us time to see if the ls -l format is
fit for purpose or not. I think it will be, but correctness is of course more
important than niceness.

>> [sticky bits and the ls -l format]
>
> This is very difficult Timothy. I really do appreciate all the work you
> have done here and your patience, which makes what I’m going to say all
> the more difficult. My big concern here is that you don’t understand
> this area with sufficient depth of understanding to really appreciate
> some of the subtleties involved here. This makes me concerned that any
> implementation of a new way to specify mode settings may have some
> unexpected side effects or hidden issues which could be exploited in
> unexpected ways (worse case) or simply have unexpected bugs which result
> in mode settings which are not what the user intended. As someone who
> has used Unix since 1980 and Linux since 1995, I’ve experienced my share
> of issues relating to file modes and their interaction with different
> filesystems, NFS, different platforms, character sets, locales etc.
> Implementing a new approach as you are proposing is something I would be
> extremely wary about.
>
> If you do plan to go forward with this approach, please ensure you are
> very confident in your understanding of the sticky bit, setuid/setgid
> and associated interaction with execute etc.

Thanks for explaining your position thoroughly here. I don’t need any convincing
that a solid understanding is necessary for a good implementation. I had a quick
peek and estimated the time I’d need to properly come to terms with sticky bits —
hence my estimate of a few days before I look at implementing this.

> I suspect you do have the skills to implement correctly given sufficient
> time, testing and motivation. However, I am concerned your initial stab
> at this may be somewhat naive and requires more consideration. My main
> concern is that once you take all the subtleties into consideration, the
> complexity of your ls -l approach will exceed its utility and
> maintainability. I could also be completely wrong.

I’ll let you (/the ML) know when I’ve taken a look at stiky bits, and whether I
think I’m able to create something that works. My intuition is that if ls -l can
properly represent sticky bits (and my rudimentary understanding is that it can)
it should be fine for specifying them too. We’ll see.

> Choice is good. However, as I mentioned in a previous post, the problem
> with choice is complexity and the biggest source of security problems is
> complexity. Keep in mind that what you do now is something which will
> need to be maintained by others in the future. It needs to be clear,
> concise and as simple as possible (but of course, no simpler!).

I like to think that the current implementation is fairly short and clean. If I
can’t add sticky bits “nicely”, I’ll reconsider things.

> Consider this simple scenario. The tangled output is being written to a
> directory which requires specific permissions for security reasons -
> perhaps it is some type of ftp server where permissions will determine
> what can be seen by whom. The data might contain sensitive information.
> The user specifies what they thing is the correct mode specification,
> but the7y get it wrong. Org comes along and looks at what they specified
> and decides it cannot interpret what the user has specified, so applies
> a default (and potentially completely wrong) mode, exposing sensitive
> data to others who should not have had access.
>
> Use a default if the user does not specify a mode, but if they specify a
> mode and org cannot interpret what they specified, then do nothing -
> don’t generate the tangled output at all. This is the only safe
> approach.

Thanks for the example. This sounds reasonable to me, and has prompted me to
test what’s actually happening at the moment.

Seems like the user-error I raise actually causes the tangling for that file
/and all subsequent files/ to fail, i.e. no content is written. It looks like we
actually want to loosen the current behaviour :P

Alright, that’s it for now. You can expect an update in a few days on sticky
bits.

All the best,
Timothy

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Accept more :tangle-mode specification forms
  2021-11-21 14:33                 ` Timothy
@ 2021-11-29 18:57                   ` Timothy
  0 siblings, 0 replies; 33+ messages in thread
From: Timothy @ 2021-11-29 18:57 UTC (permalink / raw)
  To: Tim Cross; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 868 bytes --]

Hi Tim,

> I’ll let you (/the ML) know when I’ve taken a look at stiky bits, and whether I
> think I’m able to create something that works. My intuition is that if ls -l can
> properly represent sticky bits (and my rudimentary understanding is that it can)
> it should be fine for specifying them too. We’ll see.

I’ve gone away and had a look, then come back and had a think. This has resulted
in 9ce7802. Since we just have to worry about suid/sgid (as :tangle-mode is only
applied to the file produced [with the sticky bit being dir-specific]), not much
is required — we just allow the `s' in place of `x' for the user/group executable
flag.

This is pretty trivial as we’re still relying on `file-modes-symbolic-to-number'
for this, and in a very straight-forward way. So, I think we should be fine now 🙂.

All the best,
Timothy

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2021-11-29 19:04 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-30 18:14 [PATCH] Accept more :tangle-mode specification forms Timothy
2021-10-01  1:24 ` Tom Gillespie
2021-10-01  6:59   ` Timothy
2021-10-01  8:00     ` Stefan Nobis
2021-10-01 10:05       ` Eric S Fraga
2021-10-01 10:29         ` tomas
2021-10-01 18:04           ` Tom Gillespie
2021-10-01 18:14             ` Timothy
2021-10-01  8:39   ` Christian Moe
2021-10-05 14:45 ` Timothy
2021-10-05 15:54   ` unknown@email.com
2021-10-05 16:13     ` Timothy
2021-10-05 16:06   ` tomas
2021-10-06 11:59   ` Max Nikulin
2021-11-18 10:20   ` Timothy
2021-11-18 17:22     ` Timothy
2021-11-18 23:33       ` Tom Gillespie
2021-11-19 16:31       ` Tim Cross
2021-11-19 18:10         ` tomas
2021-11-20  4:31         ` Greg Minshall
2021-11-20  8:08         ` Timothy
2021-11-20 12:25           ` tomas
2021-11-20 14:50             ` Timothy
2021-11-20 16:09               ` tomas
2021-11-20 21:32               ` Tim Cross
2021-11-21  4:08               ` Greg Minshall
2021-11-21  4:27                 ` Timothy
2021-11-21  5:11                   ` Greg Minshall
2021-11-20 19:49           ` Tim Cross
2021-11-21  4:02             ` Timothy
2021-11-21 13:51               ` Tim Cross
2021-11-21 14:33                 ` Timothy
2021-11-29 18:57                   ` Timothy

Code repositories for project(s) associated with this inbox:

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).