isp driver + clustered NetApp failover = strangeness

Mon May 30 09:26:02 PDT 2005

The message you see indicates that the disk, as far as FreeBSD is
concerned, has gone away. Since we don't really support dynamic
reattachment, except possibly with some cleverness with parts of
FreeBSD I haven't worked much with (like the new GEOM stuff), it's
possible you're somewhat toast.

The other approach is to fool with the isp driver to *not* have the
disks go away. The scenario *does* sound like it should work, in that
what is happening should look like one port going away and
reappearing elsewhere.

To do this right involves some more use case scenarios about what can
occur and deciding what policies to apply. I did this for FreeBSD 4.X
at a company, but they declined to contribute this work back to
FreeBSD.

A short term fix *might* be (I haven't tested this) to comment out the
clause around line 2891 in isp_freebsd.c that looks like:

                        } else {
                                xpt_async(AC_LOST_DEVICE, tmppath, NULL);
                        }

That is, don't tell the upper layers that a device vanished (that's
the ISPASYNC_PROMENADE case where I report back up noting that a
particular WWN has left the loop or fabric). What I really need to do
here is set a policy knob that allows users to establish how long to
wait for WWNs to reappear.
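
To make the idea of such a knob concrete, here is a minimal user-space
sketch of the bookkeeping, not code from isp_freebsd.c. Every name in it
(struct lost_port, wwn_return_grace, port_departed, port_arrived,
port_loss_expired) is hypothetical, and the 30 second default is an
arbitrary assumption. The departure is recorded instead of being reported
at once, and the loss is only passed up if the same WWPN has not come back
within the grace period.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port record; not a structure from the isp driver. */
struct lost_port {
    uint64_t wwpn;     /* port WWN that left the loop or fabric */
    uint32_t lost_at;  /* time of departure, in seconds */
    bool     pending;  /* true while we wait for the WWPN to return */
};

/* Hypothetical policy knob: seconds to wait before reporting the loss. */
uint32_t wwn_return_grace = 30;

/* Record a departure instead of reporting it upward immediately. */
void port_departed(struct lost_port *p, uint64_t wwpn, uint32_t now)
{
    p->wwpn = wwpn;
    p->lost_at = now;
    p->pending = true;
}

/* A WWPN has appeared; returns true if it cancels a pending loss. */
bool port_arrived(struct lost_port *p, uint64_t wwpn)
{
    if (p->pending && p->wwpn == wwpn) {
        p->pending = false;
        return true;
    }
    return false;
}

/*
 * Polled periodically (assumes now >= lost_at). Returns true exactly once,
 * when the grace period has run out and the loss should be reported.
 */
bool port_loss_expired(struct lost_port *p, uint32_t now)
{
    if (p->pending && now - p->lost_at >= wwn_return_grace) {
        p->pending = false;
        return true;
    }
    return false;
}
```

In a driver, the point where port_loss_expired() returns true would be
where the xpt_async(AC_LOST_DEVICE, ...) call is finally made. How
outstanding commands are held or retried during the grace period is a
separate question this sketch does not address.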

There's a long history of discussion about this going back some years:
how much validation do you need to do when something "leaves" and then
"returns". There are those who believe that WWN validation is enough.
There are those who believe that WWN validation and checking
parameters like size is enough. There are those who believe that a
device that "leaves" and then "returns" has to be treated like a
complete removal and reattachment event.
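
The three positions can be written down as a small policy check. This is
only an illustrative sketch: enum return_policy, struct dev_ident and
same_device() are hypothetical names, not anything in FreeBSD, and the
"parameters" are reduced to sector count and sector size as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy levels, one per school of thought. */
enum return_policy {
    RETURN_WWN_ONLY,        /* a matching WWN is enough */
    RETURN_WWN_AND_PARAMS,  /* WWN plus parameters such as size must match */
    RETURN_ALWAYS_NEW       /* always a full removal and reattachment */
};

/* Hypothetical snapshot of what we knew about a device. */
struct dev_ident {
    uint64_t wwn;
    uint64_t sectors;
    uint32_t sector_size;
};

/*
 * Decide whether a device that "left" and then "returned" may be treated
 * as the same device under the chosen policy.
 */
bool same_device(enum return_policy policy,
                 const struct dev_ident *before,
                 const struct dev_ident *after)
{
    switch (policy) {
    case RETURN_WWN_ONLY:
        return before->wwn == after->wwn;
    case RETURN_WWN_AND_PARAMS:
        return before->wwn == after->wwn &&
               before->sectors == after->sectors &&
               before->sector_size == after->sector_size;
    case RETURN_ALWAYS_NEW:
    default:
        return false;
    }
}
```

Under RETURN_ALWAYS_NEW the answer is always "no", which corresponds to
the current behavior of reporting the device as lost and rediscovering it.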

On 5/29/05, Tim Spencer <tspencer at hungry.com> wrote:
> Hey there!
> 
>     I've got a pair of NetApp 940c heads that are exporting LUNs out
> to a bunch of FreeBSD hosts with qla2312 cards in them over a Brocade
> 2850 FC switch.  Everything works great until I test out standby
> cluster failover on the NetApps.  To quote NetApp's manual:
> "Port A on each target HBA operates as the active port, and Port B
> operates as a standby port. When the cluster is in normal operation,
> Port A provides access to local LUNs, and Port B is not available to
> the initiator. When one filer fails, Port B on the partner filer
> becomes active and provides access to the LUNs on the failed filer.
> The Port B assumes the WWPN of the Port A on the failed partner."
> 
>     So, to me, it sounds like this _should_ work for our FreeBSD
> hosts, which don't support multipathing, and thus must use this sort
> of failover.  When the failover happens, the WWPN moves over to port
> B on the other head, perhaps a link reset happens or something, and
> everything keeps going.  Well, it turns out that this is only partly
> true.  If there is no I/O happening during the swap, then everything
> does seem to work out fine.  But if there is I/O going on, then
> things quickly go downhill.  I see this:
> 
> May 28 19:35:56 toc2-db1 /kernel: (da0:isp0:0:1:0): Invalidating pack
> May 28 19:35:58 toc2-db1 /kernel: (da0:isp0:0:1:0): Invalidating pack
> May 28 19:36:50 toc2-db1 /kernel: (da0:isp0:0:1:0): isp0: watchdog
> timeout for handle 0x1f3
> 
>     After this, sometimes the system locks up completely, and
> sometimes the system is operational, but anything that has to do with
> the filesystem in question hangs, etc.
> 
>     So here's my question:  Is this something that we can make
> work?  I really don't know all that much about the lower levels of
> how Fibre-Channel and the isp driver work, but it sounds like this
> ought to work.  Is there anybody out there who knows more about the
> driver who might be willing to work on this?  I can't guarantee
> anything, but our company does support FreeBSD development, and we
> might be able to swing some cash towards somebody who would be able
> to make this work.  Is there anything else that I can include to help
> figure out what is going wrong?  Below, I include dmesg from one of
> the hosts so you can see what sort of system is running this, but if
> you've got more things that I can do to diagnose this, let me know.
>     Thanks, and have fun!
> 
>         -tspencer
> 
>  : toc2-db2 []$; dmesg
> Copyright (c) 1992-2005 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>         The Regents of the University of California. All rights
> reserved.
> FreeBSD 4.11-STABLE #0: Wed May 25 05:39:38 GMT 2005
>     root@:/usr/src/sys/compile/BSD4.11.GODSPEED-SMP
> Timecounter "i8254"  frequency 1193182 Hz
> CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2786.13-MHz 686-class CPU)
>   Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
> 
> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE
> ,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>   Hyperthreading: 2 logical CPUs
> real memory  = 3221094400 (3145600K bytes)
> avail memory = 3134447616 (3060984K bytes)
> Changing APIC ID for IO APIC #0 from 0 to 8 on chip
> Changing APIC ID for IO APIC #1 from 0 to 9 on chip
> Changing APIC ID for IO APIC #2 from 0 to 10 on chip
> Programming 16 pins in IOAPIC #0
> IOAPIC #0 intpin 2 -> irq 0
> Programming 16 pins in IOAPIC #1
> Programming 16 pins in IOAPIC #2
> FreeBSD/SMP: Multiprocessor motherboard: 4 CPUs
> cpu0 (BSP): apic id:  0, version: 0x00050014, at 0xfee00000
> cpu1 (AP):  apic id:  1, version: 0x00050014, at 0xfee00000
> cpu2 (AP):  apic id:  6, version: 0x00050014, at 0xfee00000
> cpu3 (AP):  apic id:  7, version: 0x00050014, at 0xfee00000
> io0 (APIC): apic id:  8, version: 0x000f0011, at 0xfec00000
> io1 (APIC): apic id:  9, version: 0x000f0011, at 0xfec01000
> io2 (APIC): apic id: 10, version: 0x000f0011, at 0xfec02000
> Preloaded elf kernel "kernel" at 0x9f3d2000.
> Warning: Pentium 4 CPU: PSE disabled
> Pentium Pro MTRR support enabled
> md0: Malloc disk
> Using $PIR table, 9 entries at 0x9f0fc410
> npx0: <math processor> on motherboard
> npx0: INT 16 interface
> pcib0: <Host to PCI bridge> on motherboard
> IOAPIC #1 intpin 3 -> irq 2
> IOAPIC #1 intpin 7 -> irq 7
> IOAPIC #1 intpin 11 -> irq 10
> pci0: <PCI bus> on pcib0
> pci0: <unknown card> (vendor=0x1028, dev=0x000c) at 4.0 irq 2
> pci0: <unknown card> (vendor=0x1028, dev=0x0008) at 4.1 irq 7
> pci0: <unknown card> (vendor=0x1028, dev=0x000d) at 4.2 irq 10
> pci0: <ATI Mach64-GR graphics accelerator> at 14.0
> atapci0: <ServerWorks CSB5 ATA100 controller> port 0x8b0-0x8bf,
> 0x8d8-0x8db,0x8d0-0x8d7,0x8c8-0x8cb,0x8c0-0x8c7 at device 15.1 on pci0
> ata0: at 0x1f0 irq 14 on atapci0
> ata1: at 0x170 irq 15 on atapci0
> pci0: <OHCI USB controller> at 15.2 irq 5
> isab0: <PCI to ISA bridge (vendor=1166 device=0225)> at device 15.3
> on pci0
> isa0: <ISA bus> on isab0
> pcib1: <Host to PCI bridge> on motherboard
> IOAPIC #1 intpin 4 -> irq 11
> pci1: <PCI bus> on pcib1
> fxp0: <Intel 82550 Pro/100 Ethernet> port 0xdcc0-0xdcff mem
> 0xfcf00000-0xfcf1ffff,0xfcf20000-0xfcf20fff irq 11 at device 8.0 on pci1
> fxp0: Ethernet address 00:0e:0c:62:9e:17
> inphy0: <i82555 10/100 media interface> on miibus0
> inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> pcib2: <Host to PCI bridge> on motherboard
> IOAPIC #1 intpin 8 -> irq 13
> pci2: <PCI bus> on pcib2
> isp0: <Qlogic ISP 2312 PCI FC-AL Adapter> port 0xcc00-0xccff mem
> 0xfcd00000-0xfcd00fff irq 13 at device 6.0 on pci2
> isp0: bad execution throttle of 0- using 16
> pcib3: <Host to PCI bridge> on motherboard
> IOAPIC #1 intpin 12 -> irq 16
> IOAPIC #1 intpin 13 -> irq 17
> pci3: <PCI bus> on pcib3
> bge0: <Broadcom BCM5703 Gigabit Ethernet, ASIC rev. 0x1002> mem
> 0xfcb10000-0xfcb1ffff irq 16 at device 6.0 on pci3
> bge0: Ethernet address: 00:11:43:34:7b:3f
> miibus1: <MII bus> on bge0
> brgphy0: <BCM5703 10/100/1000baseTX PHY> on miibus1
> brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
> 1000baseTX-FDX, auto
> bge1: <Broadcom BCM5703 Gigabit Ethernet, ASIC rev. 0x1002> mem
> 0xfcb00000-0xfcb0ffff irq 17 at device 8.0 on pci3
> bge1: Ethernet address: 00:11:43:34:7b:40
> miibus2: <MII bus> on bge1
> brgphy1: <BCM5703 10/100/1000baseTX PHY> on miibus2
> brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
> 1000baseTX-FDX, auto
> pcib4: <ServerWorks host to PCI bridge(unknown chipset)> on motherboard
> IOAPIC #1 intpin 14 -> irq 18
> pci4: <PCI bus> on pcib4
> pcib8: <PCI to PCI bridge (vendor=8086 device=0309)> at device 8.0 on
> pci4
> pci5: <PCI bus> on pcib8
> aac0: <Dell PERC 3/Di> mem 0xf0000000-0xf7ffffff irq 18 at device 8.1
> on pci4
> aac0: i960RX 100MHz, 118MB cache memory, optional battery present
> aac0: Kernel 2.8-0, Build 6089, S/N 74a1d3
> aac0: Supported
> Options=275c<WCACHE,DATA64,HOSTTIME,WINDOW4GB,SOFTERR,NORECOND,SGMAP64>
> pcib5: <ServerWorks host to PCI bridge(unknown chipset)> on motherboard
> pci6: <PCI bus> on pcib5
> pcib6: <ServerWorks host to PCI bridge(unknown chipset)> on motherboard
> pci7: <PCI bus> on pcib6
> pcib7: <ServerWorks host to PCI bridge(unknown chipset)> on motherboard
> pci8: <PCI bus> on pcib7
> orm0: <Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,
> 0xc9800-0xcd7ff,0xcd800-0xcefff,0xec000-0xeffff on isa0
> pmtimer0 on isa0
> fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on
> isa0
> fdc0: FIFO enabled, 8 bytes threshold
> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
> atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on
> isa0
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
> sio0: type 16550A
> sio1 at port 0x2f8-0x2ff irq 3 on isa0
> sio1: type 16550A
> APIC_IO: Testing 8254 interrupt delivery
> APIC_IO: Broken MP table detected: 8254 is not connected to IOAPIC #0
> intpin 2
> APIC_IO: routing 8254 via 8259 and IOAPIC #0 intpin 0
> IP packet filtering initialized, divert disabled, rule-based
> forwarding enabled, default to accept, logging limited to 100 packets/
> entry by default
> ata0-slave: ATAPI identify retries exceeded
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #3 Launched!
> SMP: AP CPU #1 Launched!
> acd0: CDROM <TEAC CD-ROM CD-224E> at ata0-master PIO4
> aacd0: <RAID 0/1> on aac0
> aacd0: 139997MB (286714368 sectors)
> Mounting root from ufs:/dev/aacd0s1a
> da0 at isp0 bus 0 target 0 lun 0
> da0: <NETAPP LUN 0.2> Fixed Direct Access SCSI-4 device
> da0: 200.000MB/s transfers, Tagged Queueing Enabled
> da0: 817152MB (1673527296 512 byte sectors: 255H 63S/T 38636C)
> WARNING: / was not properly dismounted
> bge0: gigabit link up
> ohci0: <OHCI (generic) USB controller> mem 0xfe100000-0xfe100fff irq
> 5 at device 15.2 on pci0
> usb0: OHCI version 1.0, legacy support
> usb0: <OHCI (generic) USB controller> on ohci0
> usb0: USB revision 1.0
> uhub0: 4 ports with 4 removable, self powered
> : toc2-db2 []$;
> 
> _______________________________________________
> freebsd-scsi at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe at freebsd.org"
>