svn commit: r194204 - in head/sys: amd64/conf i386/conf
Bruce Evans
brde at optusnet.com.au
Mon Jun 15 22:16:09 UTC 2009
On Mon, 15 Jun 2009, Bruce Evans wrote:
> ...
> This version for RELENG_7 seems to work as intended. jhb should look at
> its atomic ops.
No reply yet.
I thought of some problems with this version, mainly with the delaying code:
- Some versions of DELAY() need to lock the hardware, so calling DELAY()
can deadlock.  E.g., on amd64 before the TSC is initialized, and on
i386 with no TSC and/or before the TSC is initialized, and when kdb
is not active in both cases, DELAY() calls getit(), and getit() locks the
clock hardware unconditionally using a non-recursive spin mutex.
Contrary to what I said in a previous mail, detection of erroneous
recursion isn't broken in the usual case.  The usual case is probably
INVARIANTS, and then recursion is detected.  The handling of erroneous
recursion then causes endless recursion on printf(): it is a failing
KASSERT() which will call panic(), which will call printf(), which
will reach the failing KASSERT() again.  The non-recursive spinlock
in cnputs() has the same bug (deadlock --> recursive deadlock); see
the sketch after this list.  This problem in DELAY() is well known,
so it is worked around when kdb is active by not calling getit() then.
Nearby bugs in DELAY(): if DELAY() is first called after the TSC is
initialized, then its debugging code is never reached.  Its debugging
code is a non-NOTEd non-option and could have been removed after the
getit() version of DELAY() was verified to give reasonably accurate
timing, but it is more useful for the TSC version since the TSC version
has not been verified to give reasonably accurate timing.  The TSC version
must fail any reasonable verifications except probably for P-state
invariant TSCs (since the TSC varies and DELAY() makes no attempt to
compensate for its variance).  If DELAY() is first called before the
TSC is initialized, then the debugging code still works for the i8254
but its placement is confusing, and when the counter is changed to the
TSC there is no code to debug the change.
- timecounters are no better than DELAY() for implementing the delaying:
apart from possibly not working on cold and/or deadlocked systems, the
timecounter hardware code might need to use a lock even though the
upper-level timecounter code is lock-free.  Again, the i8254
timecounter hardware code uses the clock spinlock.
- KTR uses plain printf(), and KTR can produce a lot of output, so the
delay should be as short as possible, as for mcount_trylock(), and
1ms is too long. Recursion is a relatively unimportant part of the
problem here. Too-long delays are possible in normal operation,
when one CPU is in a normal printf() and other CPUs want to do KTR
printfs. Serialization of the printf()s is especially important for
voluminous concurrent KTR output, but so is printing such output fast.
jhb should look at this too. I use KTR less than once a year.
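
To make the recursion chain concrete, here is a minimal sketch (with
hypothetical names, not from any patch) of a spin lock that records its
owning CPU, so that the KASSERT() -> panic() -> printf() -> same-lock
path can re-enter instead of deadlocking or recursing endlessly.  It is
simplified: the recursive acquire is not counted, which is tolerable
only on a panic path that never returns.

static volatile u_int clock_lock_owner = (u_int)-1;	/* no owner */

static void
clock_lock_enter(void)
{
	u_int cpu = PCPU_GET(cpuid);

	if (clock_lock_owner == cpu)
		return;		/* tolerate re-entry from the panic path */
	while (!atomic_cmpset_acq_int(&clock_lock_owner, (u_int)-1, cpu))
		cpu_spinwait();
}

static void
clock_lock_exit(void)
{

	atomic_store_rel_int(&clock_lock_owner, (u_int)-1);
}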
> % Index: subr_prf.c
> % ===================================================================
> % RCS file: /home/ncvs/src/sys/kern/subr_prf.c,v
> % retrieving revision 1.130.2.1
> % diff -u -2 -r1.130.2.1 subr_prf.c
> % --- subr_prf.c 21 Jan 2009 00:26:45 -0000 1.130.2.1
> % +++ subr_prf.c 15 Jun 2009 05:32:03 -0000
> % @@ -112,4 +112,27 @@
> % &always_console_output, 0, "Always output to console despite TIOCCONS.");
> %
> % +static int printf_lockcount;
> % +
> % +static void
> % +printf_lock(void)
> % +{
> % + int timeout;
> % +
> % + timeout = 1000;
> % + do {
> % + if (atomic_cmpset_acq_int(&printf_lockcount, 0, 1))
> % + return;
> % + DELAY(1000);
> % + } while (--timeout != 0);
> % + atomic_add_acq_int(&printf_lockcount, 1);
> % +}
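
For reference, a hypothetical matching unlock (not part of the quoted
patch) would decrement the count, so that a CPU which barged in after
the timeout leaves the count consistent:

static void
printf_unlock(void)
{

	atomic_subtract_rel_int(&printf_lockcount, 1);
}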
If the DELAY() is removed, the initial value of `timeout' would need to
be (possibly dynamically) calibrated.
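
One way to calibrate it dynamically, sketched here with hypothetical
names and without error handling: once at boot, while the timecounters
work, measure how many cpu_spinwait() iterations fit in a known
interval, and derive the loop count from that.  The lock's timeout loop
then needs no clock at panic time.

static u_int spins_per_usec = 1;		/* conservative fallback */

static void
calibrate_printf_spin(void *arg __unused)
{
	struct timeval t0, t1;
	uint64_t iters, us;

	microuptime(&t0);
	for (iters = 0; iters < 10 * 1000 * 1000; iters++)
		cpu_spinwait();
	microuptime(&t1);
	us = (uint64_t)(t1.tv_sec - t0.tv_sec) * 1000000 +
	    (t1.tv_usec - t0.tv_usec);
	if (us != 0)
		spins_per_usec = iters / us;
}
SYSINIT(printf_spin, SI_SUB_CLOCKS, SI_ORDER_ANY, calibrate_printf_spin,
    NULL);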
The timeouts for the panics for spinlocks and threadlocks in kern_mutex.c
have similar problems and could benefit from calibration. First they
do up to 10 million cpu_spinwait()s. 10 million might be too small
or too large. Then they do up to 60 million DELAY(1)s. DELAY() can
deadlock as above. 60 million is supposed to give a delay of 60
seconds, but short delays can be very inaccurate (the minimum DELAY()
is about 5 us with i8254 hardware on fast CPUs and about 30 us with
i8254 hardware on 1990's CPUs), so the timeouts can be more like 5
minutes than 1 minute.
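
The pattern being described, condensed into a sketch (simplified, not
the actual kern_mutex.c code):

static void
spin_timeout_wait(volatile u_int *lockp)
{
	u_int i;

	/* Phase 1: up to 10 million bare spins. */
	for (i = 0; i < 10 * 1000 * 1000; i++) {
		if (*lockp == 0)
			return;
		cpu_spinwait();
	}
	/*
	 * Phase 2: up to 60 million nominally-1-us delays; see above
	 * for why DELAY(1) may deadlock or take 5-30 us instead of 1 us.
	 */
	for (i = 0; i < 60 * 1000 * 1000; i++) {
		if (*lockp == 0)
			return;
		DELAY(1);
	}
	panic("spin lock held too long");
}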
A non-dynamically calibrated loop using uncached memory or device
accesses has a better chance of working accurately than the non-dynamically
calibrated TSC loop in the amd64 and i386 DELAY()s, since throttling
of the TSC is more common than throttling of memory systems.
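
For example, a delay loop paced by device accesses might read an unused
ISA diagnostic port, on the theory that ISA bus read timing does not
vary with CPU throttling.  The roughly-1-us-per-read pacing and the
choice of port 0x84 are assumptions in this sketch, not measured
constants:

#include <machine/cpufunc.h>

static void
isa_delay(int usec)
{

	while (usec-- > 0)
		(void)inb(0x84);	/* ~1 us per ISA bus read */
}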
Bruce