RFC: Hiding per-CPU kernel output behind bootverbose
Konstantin Belousov
kib at freebsd.org
Sat Apr 21 20:19:47 UTC 2018
On Sat, Apr 21, 2018 at 01:52:46PM -0600, Warner Losh wrote:
> On Sat, Apr 21, 2018 at 4:35 AM, Gary Jennejohn <gljennjohn at gmail.com>
> wrote:
>
> > On Sat, 21 Apr 2018 12:20:49 +0300
> > Konstantin Belousov <kib at freebsd.org> wrote:
> >
> > > On Sat, Apr 21, 2018 at 12:11:07AM +0000, Colin Percival wrote:
> > > > On 04/19/18 14:45, Konstantin Belousov wrote:
> > > > > On Thu, Apr 19, 2018 at 02:37:56PM -0700, Conrad Meyer wrote:
> > > > >> On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov wrote:
> > > > >>> The 'CPU XX Launched' messages are very useful for initial
> > diagnostic
> > > > >>> of the SMP startup failures. You need to enable bootverbose to see
> > the
> > > > >>> hang details, but for initial hint they are required.
> > Unfortunately, AP
> > > > >>> startup hangs occur too often to pretend that this can be
> > delegated to
> > > > >>> very specific circumstances.
> > > > >>
> > > > >> Really? I don't know that I've ever seen an AP startup hang. How
> > > > >> often do they occur?
> > > > >
> > > > > It was epidemic with Sandy Bridge, mostly correlated to specific BIOS
> > > > > supplier and its interaction with the x2APIC enablement, see
> > madt.c:170
> > > > > and below.
> > > > >
> > > > > There were several recent reports of the issue with Broadwell Xeon
> > > > > machines, no additional data or resolution.
> > > > >
> > > > > There are sporadic reports of the problem, where I do not see
> > > > > a clear commonality.
> > > >
> > > > Would it be sufficient for debugging purposes if I change the
> > !bootverbose
> > > > case from printing many lines of
> > > >
> > > > SMP: AP CPU #N Launched!
> > > >
> > > > to instead have a single
> > > >
> > > > SMP: Launching AP CPUs: 86 73 111 21 8 77 100 28 57 42 10 60 87 88 41
> > 113 36
> > > > 19 72 46 92 52 24 81 90 3 107 96 9 14 80 118 29 121 62 74 56 55 1 12
> > 63 18 67
> > > > 13 45 102 33 94 69 68 93 83 48 31 30 32 51 89 71 78 64 84 123 61 40 47
> > 37 22
> > > > 54 101 38 4 97 44 17 109 104 5 85 43 2 99 39 65 95 53 58 66 91 125 23
> > 115 16
> > > > 35 79 112 103 82 7 75 11 6 98 15 126 127 20 70 34 105 27 50 116 120 49
> > 25 108
> > > > 106 122 117 114 26 110 59 76 124 119
> > > >
> > > > ? (With each AP printing its number as it reaches the appropriate
> > point?)
> > > >
> > > > This yields almost the same gain as silencing the launch messages
> > completely,
> > > > while still allowing you to see each CPU announcing itself.
> > > I am fine with the behaviour, but I am not sure how would you implement
> > > this. printf(9) buffers the output, you need to flush it somehow.
> > >
> >
> > And printf(9) calls vprintf(9) which in turn calls _vprintf which
> > will allocate a buffer on the stack (bad idea) if the option
> > PRINTF_BUFR_SIZE is set in the kernel config file. So it seems
> > that output may even be double buffered.
> >
>
> I don't think either of these is true (the double buffered part,. nor the
> buffered part). Well, they are both kinda true, but they don't matter. It's
> all to make dmesg and friends work. There's other issues, but at this stage
> of the boot we're single threaded and the CPUs that are launching take out
> a spin lock.
>
> We always call putchar for each character that kvprintf(). So any buffering
> that's done is in addition to putchar going to the console.
>
> diff --git a/sys/x86/x86/mp_x86.c b/sys/x86/x86/mp_x86.c
> index 3fcf7aa25152..06973b1ea6d5 100644
> --- a/sys/x86/x86/mp_x86.c
> +++ b/sys/x86/x86/mp_x86.c
> @@ -1020,7 +1020,7 @@ init_secondary_tail(void)
> smp_cpus++;
>
> CTR1(KTR_SMP, "SMP: AP CPU #%d Launched", cpuid);
> - printf("SMP: AP CPU #%d Launched!\n", cpuid);
> + printf("%d%s", cpuid, smp_cpus == mp_ncpus ? "\n" : " ");
With this, my concern is to have the cpu ids printed immediately, instead
of being buffered until \n is printed as well. Please ensure that this is
the case.
>
> /* Determine if we are a logical CPU. */
> if (cpu_info[PCPU_GET(apic_id)].cpu_hyperthread)
>
> is enough to give them all to me on a single line. We'd need another printf
> that says 'Starting APs: ' since otherwise we see:
>
> andom: unblocking device.
> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> ioapic1 <Version 2.0> irqs 24-47 on motherboard
> 1 8 10 20 2 19 11 12 22 4 14 23 16 13 3 18 5 17 21 9 15 6 7
> Timecounter "TSC-low" frequency 1350029824 Hz quality 1000
>
> which doesn't make a lot of sense.
>
> Comments?
>
> Warner
More information about the freebsd-hackers
mailing list