Large Capsicum patch for review.

Wed Feb 27 22:15:44 UTC 2013

On 25.02.2013 22:50, Pawel Jakub Dawidek wrote:
> On Mon, Feb 25, 2013 at 11:32:27AM +0100, Christoph Mallon wrote:
>> On 25.02.2013 00:59, Pawel Jakub Dawidek wrote:
>>> On Sun, Feb 24, 2013 at 04:07:12PM +0100, Christoph Mallon wrote:
>>>> On 24.02.2013 07:05, Christoph Mallon wrote:
>>>> - Why did you choose INT_MAX to represent "all capabilities"?
>>>>   The data type used is ssize_t, so SSIZE_MAX seems more logical.
>>>>   Further, there are several places where size_t and ssize_t come into contact, which need careful tests and casts to get right.
>>>>   Maybe it is better to always use size_t and represent "all capabilities" with SIZE_T_MAX or (size_t)-1 (which is guaranteed by C to be the maximal value).
>>>
>>> This is not used for capabilities, but for white-listing ioctl commands.
>>> INT_MAX seems to just be least odd. We only allow for 256 anyway.
>>> I could provide some dedicated define for it, eg.
>>>
>>> #define	CAP_IOCTLS_ALL	<some random big number>
>>
>> A nice name is a good step.
>> Still, I would prefer, if consistently only one type (size_t) would be used instead of two/three (ssize_t, size_t and int for the constant).
> 
> Using three types here is not inconsistent, it is a common practise.

Using three different types for the same context /is/ inconsistent.

> Check out even read(2) and write(2) - they take size as size_t, as there
> is no need for a negative size, but they return ssize_t, because they
> can return -1 if an error occured. This is exactly what I do in
> cap_ioctls_get().

"Common practise" does not automatically turn something into a good idea.

> I use size_t as this is preferred type for size, but I don't need size_t
> for iterator, because I know the value will never need more than 16
> bits, so I use int as a more CPU-friendly type.

See my earlier mail about this.

>>>> - I could not verify many changes, e.g. changed requested permissions, because this is just a big diff with no information about the intention of the changes.
>>>>   A series of smaller diffs with commit logs to state their intent would be really useful for reviewing.
>>>
>>> The entire history is in perforce, but there are many commits in there.
>>
>> So effectively the history is lost.
> 
> The entire history is there, nothing is lost:
> 
> 	http://p4db.freebsd.org/

So effectively it is lost.
FreeBSD's history is recorded in its SVN respository.
Throwing big patch bombs into it only makes it harder to comprehend changes.

>>> The key goals of the patch are:
>>>
>>> - Move from special capability descriptors that were used to keep
>>>   capability rights in the file structure to ability to configure
>>>   capability rights on any descriptor type.
>>>
>>> - Make capability rights more practical.
>>>
>>> - Allow for ioctl(2) in capability mode by providing a way to limit
>>>   permitted ioctl commands.
>>>
>>> - Allow for limiting permitted fcntl(2) commands.
>>
>> In a branch in svn one could reread these steps on every commit.
>> Or even better, use git(-svn), which can automatically create a nice patch series (like the one I provided).
> 
> Your changes were pretty simple, so they look nice as separate commits.

It's the other way round:
The changes are simple, because they are separate commits, which do a single thing each.
So it is clear what to look for in each commit.

> The patch you review is a result of ~2 months of work and exporting
> every single change would not create such nice output. If those changes
> were implemented totally separated, I could generate several independent
> patches, but those changes were implemented together and trying to
> separate them now would be, if possible, very time consuming. So
> eventhough I agree it would be easier to review them separately, I only
> have this single patch.

I understand, p4 does not provide a suitable basis for code review.
As we have investigated before, it is even necessary to massage its output to make it usable with common tools at all.

>>>> --- a/sys/kern/sys_capability.c
>>>> +++ b/sys/kern/sys_capability.c
>>>> @@ -314,12 +314,12 @@ cap_ioctl_limit_check(struct filedesc *fdp, int fd, const u_long *cmds,
>>>>  
>>>>  	ocmds = fdp->fd_ofiles[fd].fde_ioctls;
>>>>  	for (i = 0; i < ncmds; i++) {
>>>> -		for (j = 0; j < oncmds; j++) {
>>>> +		for (j = 0;; j++) {
>>>> +			if (j == oncmds)
>>>> +				return (ENOTCAPABLE);
>>>>  			if (cmds[i] == ocmds[j])
>>>>  				break;
>>>>  		}
>>>> -		if (j == oncmds)
>>>> -			return (ENOTCAPABLE);
>>>>  	}
>>>>  
>>>>  	return (0);
>>>
>>> I decided to skip this one. My version is more readable, IMHO, as it is
>>> used for other cases like TAILQ_FOREACH(), etc. where the check is
>>> already done by the macro.
>>
>> The duplicate tests do not even match!
>> You have to carefully read the code to verify, that if indeed only triggers, when the for loop terminated without break.
>> Code duplication is almost always bad, especially if the duplicated code does not fully match.
> 
> I don't agree, sorry. Your loop looks like endless loop at first. If the
> loop would be more complex, it would be hard to tell at first glance if
> the loop even terminates. For me it looks awkward - there is a place in
> 'for ()' for a check, which defines when the loop should end; IMHO it is
> there for a reason. The construction I use is widely used, at least in
> FreeBSD code and looks obvious to me.

At least make the duplicate checks match: == and < are not complements.

>>> Applied, but I removed first goto and now out label is placed before
>>> unlock.
>>
>> This leaves code duplication in place, which is error prone.
> 
> But eliminates extra goto label that is used for exactly one case.

Two cases: explicitly jumping to it and reaching it from the preceeding statement.
If it was only reached in one case, then it would be pointless.

> Once we grow at least one more error condition that doesn't need
> unlocking and I'm fine with adding it.

There is code duplication present right now.

>>>> diff --git a/sys/kern/sys_capability.c b/sys/kern/sys_capability.c
>>>> index e3622b0..2306811 100644
>>>> --- a/sys/kern/sys_capability.c
>>>> +++ b/sys/kern/sys_capability.c
>>>> @@ -409,10 +409,8 @@ sys_cap_ioctls_get(struct thread *td, struct cap_ioctls_get_args *uap)
>>>>  		if (error != 0)
>>>>  			goto out;
>>>>  	}
>>>> -	if (fdep->fde_nioctls == -1)
>>>> -		td->td_retval[0] = INT_MAX;
>>>> -	else
>>>> -		td->td_retval[0] = fdep->fde_nioctls;
>>>> +	td->td_retval[0] =
>>>> +	    fdep->fde_nioctls != -1 ? fdep->fde_nioctls : INT_MAX;
>>>>  	error = 0;
>>>>  
>>>>  out:
>>>
>>> Well, IMHO my version is more readable, especially in the first case.
>>
>> I had to read these lines very carefully to be sure that they actually assign to the same variables.
>> Code duplication is really hurts readability.
> 
> Trying to squeeze as much as possible into one line and then breaking
> the line into three doesn't cure readability either, IMHO.
> It's probably a matter of taste, but my version is more readable for me.

The version with the if-else is simply a different way to break the line.
It just does it with more code duplication.

> Ok, I applied your change, after looking at it more it definiately is
> simpler, although I don't think ?:'s result is boolean, but that's not
> a big deal.

The result type of ?: is whatever the combined type of the second and third operand of the ?: is.
The result type of all logical and comparison operators is int, so the result is int (with a hint of boolean, because the result of these operators is always 0 or 1).

	Christoph