svn commit: r277199 - in head/sys: fs/devfs kern
Hans Petter Selasky
hps at selasky.org
Thu Jan 15 10:48:26 UTC 2015
On 01/15/15 10:38, Konstantin Belousov wrote:
> On Thu, Jan 15, 2015 at 08:41:31AM +0100, Hans Petter Selasky wrote:
>> On 01/15/15 04:31, Konstantin Belousov wrote:
>>> On Wed, Jan 14, 2015 at 10:07:13PM +0000, Hans Petter Selasky wrote:
>>>> Author: hselasky
>>>> Date: Wed Jan 14 22:07:13 2015
>>>> New Revision: 277199
>>>> URL: https://svnweb.freebsd.org/changeset/base/277199
>>>>
>>>> Log:
>>>> Avoid race with "dev_rel()" when using the recently added
>>>> "delist_dev()" function. Make sure the character device structure
>>>> doesn't go away until the end of the "destroy_dev()" function due to
>>>> concurrently running cleanup code inside "devfs_populate()".
>>>>
>>>> MFC after: 1 week
>>>> Reported by: dchagin@
>>>>
>>>> Modified:
>>>> head/sys/fs/devfs/devfs_devs.c
>>>> head/sys/kern/kern_conf.c
>>>>
>>>> Modified: head/sys/fs/devfs/devfs_devs.c
>>>> ==============================================================================
>>>> --- head/sys/fs/devfs/devfs_devs.c Wed Jan 14 22:05:57 2015 (r277198)
>>>> +++ head/sys/fs/devfs/devfs_devs.c Wed Jan 14 22:07:13 2015 (r277199)
>>>> @@ -137,6 +137,12 @@ devfs_alloc(int flags)
>>>> vfs_timestamp(&ts);
>>>> cdev->si_atime = cdev->si_mtime = cdev->si_ctime = ts;
>>>> cdev->si_cred = NULL;
>>>> + /*
>>>> + * Avoid race with dev_rel() by setting the initial
>>>> + * reference count to 1. This last reference is taken
>>>> + * by the destroy_dev() function.
>>>> + */
>>>> + cdev->si_refcount = 1;
>>> This is wrong. Not all devices are destroyed with destroy_dev().
>>> dev_rel() must be allowed to clean up allocated device.
>>>
>>> That said, I do not understand what race you are trying to solve.
>>> Freeing of the accessible cdev memory cannot happen in parallel while
>>> dev_mtx is owned.
>>>
>>> Please do not commit (to devfs) without seeking for the review first.
>>
>> Hi Konstantin,
>>
>> From my analysis there are basically three ways for a cdev to die:
>>
>> 1) Through dev_free_devlocked()
>> 2) Through destroy_devl() which then later calls dev_free_devlocked()
>> 3) Through destroy_dev_sched() which really is a wrapper around
>> destroy_devl().
> You only look from the consumers PoV. Devfs cdev can be dereferenced
> because e.g. clone handler decides that cdev is not valid/needed,
> and now the memory is never freed due to extra reference.
>
> Do not assume that all cdevs go through destroy_dev().
Hi,
All cdevs go through either case #2 or case #1 eventually from what I
can see, including clone devices, which call destroy_devl() in the end
aswell. See the "clone_destroy()" function!
I did a simple test with /dev/dspX.Y which use clone devices. I did:
vmstat -m | grep -i devfs1
1) Before plugging USB audio device:
DEVFS1 157 79K - 189 512
2) Plug USB audio device:
DEVFS1 164 82K - 196 512
3) Play something (env AUDIODEV=/dev/dsp2.4 play track01.wav)
DEVFS1 165 83K - 197 512
4) Stop playing (clone device still exits):
DEVFS1 165 83K - 197 512
5) Detach USB audio device:
DEVFS1 157 79K - 197 512
I see no leakage in that case!
Other case:
1) After "kldload if_tap"
DEVFS1 158 79K - 201 512
2) After creating TAP device (cat /dev/tap99)
DEVFS1 159 80K - 204 512
3) After creating TAP device (cat /dev/tap101)
DEVFS1 160 80K - 207 512
5) After "kldunload if_tap":
DEVFS1 158 79K - 207 512
6) After "kldload if_tap" again:
DEVFS1 158 79K - 207 512
I see no leakage in that case either!
Are there more cases which I don't see?
>
>>
>> In the case of direct free through #1, the reference count is ignored
>> and it doesn't matter if it is one or zero. Only in the case of
>> destruction through destroy_dev() it matters.
>>
>> Like the comment says in destroy_devl():
>>
>> /* Avoid race with dev_rel() */
>>
>> The problem is that the "cdev->si_refcount" is zero when the initial
>> devfs_create() is called. Then one ref is made. When we clear the
>> CDP_ACTIVE flag in devfs_destroy() it instructs a !parallel! running
>> process to destroy all the FS related structures and the reference count
>> goes back to zero when the "cdp" is removed from the "cdevp_list". Then
>> the cdev is freed too early. This happens because destroy_devl() is
>> dropping the dev_lock() to sleep waiting for pending references.
> Basically, this is very good explanation why your delist hack is wrong,
> for one of the reason. Another reason is explained below.
> You are trying to cover it with additional reference, but this is wrong
> as well.
>
>>
>> Do you see something else?
>
> I think that what you are trying to do with the CDP_ACTIVE hack is doomed
> anyway, because you are allowing for devfs directory to have two entries
> with the same name, until the populate loop cleans up the inactive one.
> In the meantime, any access to the directory operates on random entry.
The entry will not be random, because upon an open() call to a character
device, I believe the devfs_lookup() function will be called, which
always populate the devfs tree at first by calls to
devfs_populate_xxx(). Any delisted devices which don't have the
"CDP_ACTIVE" bit set, will never be seen by any open.
Regarding leftover filedescriptors which still access the old "cdev"
this is not a problem, and these will be closed when the si_refcount
goes to zero after the destroy_devl() call.
>
> The checks for existent names in make_dev() are performed for the reason,
> and you makes the rounds to effectively ignore it.
>
These checks are still correct and don't conflict with my patch from
what I can see. Else the existing destroy_devl() would also be broken
even before my patch with regard to the "random" selection of character
devices at open() from userspace.
--HPS
More information about the svn-src-all
mailing list