svn commit: r225372 - head/sys/kern

Sun Sep 4 18:04:34 UTC 2011

On Sun, 4 Sep 2011, Attilio Rao wrote:

>> Log:
>> Â Interrupts are disabled/enabled when entering and exiting the KDB context.
>> Â While this is generally good, it brings along a serie of problems,
>> Â like clocks going off sync and in presence of SW_WATCHDOG, watchdogs
>> Â firing without a good reason (missed hardclock wdog ticks update).

Please fix your mailer.  It corrupts spaces to binary data \xc2\xa0.

My version resets the timecounter 3 seconds after leaving ddb (not
immediately, to avoid setting it thousands or millions of times per
second for trace traps and possibly making it drift a few usec every
time (I get a drift of ~ 1 usec per second stopped)).  In fact, I don't
see how anyone can use ddb without this.  When running current, I
have to remember to stop ntpd before running ddb so that ntpd doesn't
corrupt its database (just the driftfile and logs).

> Also please notice that intr enable/disable happens in the wrong way
> as it is done via the MD (x86 specific likely) interface. This is
> wrong for 2 reasons:

No, intr_disable() is MI.  It is also used by witness.  disable_intr()
is the corresponding x86 interface that you may be thinking of.  The MI
interface intr_disable() was introduced to avoid the MD'ness of
intr_disable().

> 1) There may be some codepaths leading to explicit preemption
> 2) It should  really use an MI interface
>
> The right way to do this should be via spinlock_enter().

spinlock_enter() is MI, but has wrong semantics.  In my version of i386,
spinlocks don't disable any h/w interrupt, as is needed for fast interrupt
handlers to actually work.  I believe sparc64 is similar, except its
spinlock_enter() disables most h/w interrupts and this includes fast
interrupt handlers.  I don't understand sparc64, but it looks like its
spinlock_enter() disables all interrupts visible in C code, but not
all interrupts:

from cpufunc.h:
% static __inline register_t
% intr_disable(void)
% {
% 	register_t s;
% 
% 	s = rdpr(pstate);
% 	wrpr(pstate, s & ~PSTATE_IE, 0);
% 	return (s);
% }

This seems to mask all interrupts, as required.

     (The interface here is slightly broken (non-MI).  It returns register_t.
     This assumes that the interrupt state can be represented in a single
     register.  The type critical_t exists to avoid the same bug in an
     old version of critical_enter().  Now this type is just bogus.
     critical_enter() no longer returns it.  Instead, spinlock_enter() uses
     a non-reentrant interface which stores what used to be the return value
     of critical_enter() in a per-thread MD data structure (md_saved_pil
     in the above).  Most or all arches use register_t for this.  This
     leaves critical_t as pure garbage -- the only remaining references to
     it are for its definition.)

From machep.c:
% void
% spinlock_enter(void)
% {
% 	struct thread *td;
% 	register_t pil;
% 
% 	td = curthread;
% 	if (td->td_md.md_spinlock_count == 0) {
% 		pil = rdpr(pil);
% 		wrpr(pil, 0, PIL_TICK);
% 		td->td_md.md_spinlock_count = 1;
% 		td->td_md.md_saved_pil = pil;
% 	} else
% 		td->td_md.md_spinlock_count++;
% 	critical_enter();
% }

I believe this disables all interrupts at and below a certain level.

From intr_machdep.h:
% #define	PIL_LOW		1	/* stray interrupts */
% #define	PIL_ITHREAD	2	/* interrupts that use ithreads */
% #define	PIL_RENDEZVOUS	3	/* smp rendezvous ipi */
% #define	PIL_AST		4	/* ast ipi */
% #define	PIL_STOP	5	/* stop cpu ipi */
% #define	PIL_PREEMPT	6	/* preempt idle thread cpu ipi */
% #define	PIL_HARDCLOCK	7	/* hardclock broadcast */
% #define	PIL_FILTER	12	/* filter interrupts */
% #define	PIL_BRIDGE	13	/* bridge interrupts */
% #define	PIL_TICK	14	/* tick interrupts */

And this level includes all interrupts that have a level, including
tick interrupts.

In my version of x86, spinlocks mask everything except the equivalent
of filter interrupts (= non-broken fast interrupts in my version). 
Non-broken fast-interrupt handlers have to be written very carefully
to be reentrant.  Since full reentrancy is too hard, they must run
with _all_ h/w interrupts disabled.  kdb needs all h/w interrupts
disabled for similar reasons.  It cannot even call things like
spinlock_enter() without knowing too much about their internals.
In fact, the non-reentrant state for spinlock_enter() and/or critical
_enter() was a reliable cause of panics for many years even when it
was accessed externally near kdb -- tracing through these functions
often paniced.

Below the filter level is a much higher level than sparc64.  I think
the interrupts not masked at PIL_TICK are mostly handled in asm, which
I am further from understanding.  Anyway, this is all at a low level
for sparc64.  Higher levels just need to use the correct interface to
disable all interrupts.  That is disable_intr(), not spinlock_enter().

Bruce