Running linux ldconfig on tmpfs results in unkillable process
Beat Gätzi
beat at chruetertee.ch
Thu Jan 20 00:35:11 UTC 2011
On 19.01.2011 13:24, Kostik Belousov wrote:
> On Tue, Jan 18, 2011 at 05:40:14PM +0100, Beat Gätzi wrote:
>> On 18.01.2011 17:13, Kostik Belousov wrote:
>>> On Tue, Jan 18, 2011 at 04:34:10PM +0100, Beat Gätzi wrote:
>>>> On 18.01.2011 15:46, Kostik Belousov wrote:
>>>>> On Tue, Jan 18, 2011 at 03:16:27PM +0100, Beat Gätzi wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I've a tinderbox which uses tmpfs to build ports. Every time I build a
>>>>>> port which executes linux ldconfig it results in an unkillable process
>>>>>> which uses 100% CPU. The problem is reproducible without tinderbox:
>>>>>>
>>>>>> # uname -a
>>>>>> FreeBSD daedalus.network.local 9.0-CURRENT FreeBSD 9.0-CURRENT #3
>>>>>> r216761: Tue Dec 28 15:32:26 CET 2010
>>>>>> root@daedalus.network.local:/usr/obj/usr/src/sys/GENERIC i386
>>>>>> # mkdir /compat/test
>>>>>> # mount -t tmpfs tmpfs /compat/test
>>>>>> # cp -Rp /compat/linux/* /compat/test/
>>>>>> # mount -t linprocfs linprocfs /compat/test/proc
>>>>>> # /compat/linux/sbin/ldconfig -r /compat/test/
>>>>>> # pgrep ldconfig
>>>>>> 3449
>>>>>> # procstat -i 3449 | grep KILL
>>>>>> 3449 ldconfig KILL ---
>>>>>> # kill -9 3449
>>>>>> # procstat -i 3449 | grep KILL
>>>>>> 3449 ldconfig KILL P--
>>>>>>
>>>>>> From top(1):
>>>>>> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
>>>>>> 3449 root 1 44 0 992K 712K CPU1 1 10:06 100.00% ldconfig
>>>>>>
>>>>>> When I reboot the machine it hangs after "All buffers synced.".
>>>>>>
>>>>>> I've uploaded some additional output of procstat and ktrace here:
>>>>>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs.txt
>>>>>>
>>>>>> Does anyone know how to fix this?
>>>>> kdump output for the trace of a linux binary is garbage. You need to
>>>>> use linux_kdump (from ports).
>>>>>
>>>>> I think that your process is looping in the kernel; you can confirm this
>>>>> by dropping into ddb and doing "bt <pid>".
>>>>
>>>> I've uploaded a screenshot from the output of bt <pid> in ddb:
>>>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs-bt.jpg
>>>
>>> Please try this.
>>>
>>> diff --git a/sys/compat/linux/linux_file.c b/sys/compat/linux/linux_file.c
>>> index 9ff1cf0..44ad193 100644
>>> --- a/sys/compat/linux/linux_file.c
>>> +++ b/sys/compat/linux/linux_file.c
>>> @@ -369,7 +369,6 @@ getdents_common(struct thread *td, struct linux_getdents64_args *args,
>>> lbuf = malloc(LINUX_MAXRECLEN, M_TEMP, M_WAITOK | M_ZERO);
>>> vn_lock(vp, LK_SHARED | LK_RETRY);
>>>
>>> -again:
>>> aiov.iov_base = buf;
>>> aiov.iov_len = buflen;
>>> auio.uio_iov = &aiov;
>>> @@ -506,8 +505,10 @@ again:
>>> break;
>>> }
>>>
>>> - if (outp == (caddr_t)args->dirent)
>>> - goto again;
>>> + if (outp == (caddr_t)args->dirent) {
>>> + nbytes = resid;
>>> + goto eof;
>>> + }
>>>
>>> fp->f_offset = off;
>>> if (justone)
>>> diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
>>> index 84a2038..62dd0bf 100644
>>> --- a/sys/fs/tmpfs/tmpfs_subr.c
>>> +++ b/sys/fs/tmpfs/tmpfs_subr.c
>>> @@ -827,9 +827,10 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio *uio, off_t *cntp)
>>> /* Copy the new dirent structure into the output buffer and
>>> * advance pointers. */
>>> error = uiomove(&d, d.d_reclen, uio);
>>> -
>>> - (*cntp)++;
>>> - de = TAILQ_NEXT(de, td_entries);
>>> + if (error == 0) {
>>> + (*cntp)++;
>>> + de = TAILQ_NEXT(de, td_entries);
>>> + }
>>> } while (error == 0 && uio->uio_resid > 0 && de != NULL);
>>>
>>> /* Update the offset and cache. */
>>
>> This patch solves the problem.
>>
> Thank you, but apparently this is not the end of the story.
>
> I committed the linuxolator part of the change, but I think that the tmpfs
> change is still incomplete. Strictly following getdirentries(2), tmpfs
> must return EINVAL when not even a single record can be returned.
> Currently, it indicates EOF instead. I think this could be a complete
> solution, but it might break e.g. Linux ldconfig(8), since it exposed
> the linuxolator problem.
>
> Can you apply the patch below on top of the latest HEAD, with r217578
> included, and retest? Thanks.
>
> diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
> index 84a2038..62dd0bf 100644
> --- a/sys/fs/tmpfs/tmpfs_subr.c
> +++ b/sys/fs/tmpfs/tmpfs_subr.c
> @@ -827,9 +827,10 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio *uio, off_t *cntp)
> /* Copy the new dirent structure into the output buffer and
> * advance pointers. */
> error = uiomove(&d, d.d_reclen, uio);
> -
> - (*cntp)++;
> - de = TAILQ_NEXT(de, td_entries);
> + if (error == 0) {
> + (*cntp)++;
> + de = TAILQ_NEXT(de, td_entries);
> + }
> } while (error == 0 && uio->uio_resid > 0 && de != NULL);
>
> /* Update the offset and cache. */
> diff --git a/sys/fs/tmpfs/tmpfs_vnops.c b/sys/fs/tmpfs/tmpfs_vnops.c
> index 059a790..a57c1f2 100644
> --- a/sys/fs/tmpfs/tmpfs_vnops.c
> +++ b/sys/fs/tmpfs/tmpfs_vnops.c
> @@ -1349,7 +1349,7 @@ outok:
> MPASS(error >= -1);
>
> if (error == -1)
> - error = 0;
> + error = (cnt != 0) ? 0 : EINVAL;
>
> if (eofflag != NULL)
> *eofflag =
I've applied the new patch on top of r217615 and was not able to
reproduce the problem.
Thanks again,
Beat
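
For readers following the thread, here is a minimal user-space sketch of the
behaviour the two tmpfs hunks aim for, as I understand it: the position only
advances past a record that was actually copied, and a buffer too small for
even one remaining record yields EINVAL rather than looking like EOF. This is
not the kernel code. struct rec and read_records() are hypothetical names made
up for this illustration, and a plain array stands in for the tmpfs directory
list.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size record standing in for struct dirent. */
struct rec {
	size_t reclen;		/* size of this record in bytes */
	char name[32];
};

/*
 * Copy as many whole records as fit into buf, starting at dir[*idx].
 * Returns 0 on success or at end of directory (*copied is then the
 * number of bytes written, 0 at EOF). Returns EINVAL when records
 * remain but not even one of them fits into the buffer.
 */
static int
read_records(const struct rec *dir, size_t nrec, size_t *idx,
    char *buf, size_t buflen, size_t *copied)
{
	size_t cnt = 0;

	*copied = 0;
	while (*idx < nrec) {
		const struct rec *r = &dir[*idx];

		/* Stop without advancing if the record does not fit. */
		if (r->reclen > buflen - *copied)
			break;
		memcpy(buf + *copied, r, r->reclen);
		*copied += r->reclen;
		(*idx)++;	/* advance only after a successful copy */
		cnt++;
	}

	/* Nothing copied although entries remain: error, not EOF. */
	if (cnt == 0 && *idx < nrec)
		return (EINVAL);
	return (0);
}
```

With this shape, a caller that retries while it has made no progress gets an
error it can act on, instead of an apparent EOF or a position that has skipped
an entry it never received.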