"read defect list" with 2.0.30-pre7 and patch Aug19
Doug Ledford
dledford at dialnet.net
Mon Aug 25 08:24:58 PDT 1997
On Mon, 25 Aug 1997, Ulrich Windl wrote:
> On 23 Aug 97 at 15:09, Doug Ledford wrote:
>
> To your patch: If changing things like below, you might also add a
> comment (inside the patch) saying why you changed it. It will help
> anybody outside this discussion group.
OK...for people outside of this group, Leonard Zubkoff has looked at the
patch, agrees with it, has changed the comments in the code as well to
agree with what I did, and is sending it off to Linus. So, my patch is
somewhat deprecated in the sense that it isn't complete ;)
> And finally another one with DEBUG enabled in scsi.c and sd.c:
>
> Aug 22 23:57:52 elf kernel: SMalloc: 4096 0000a000
> Aug 22 23:57:52 elf kernel: scsi_do_cmd (host = 0, channel = 0 target = 0, buffer =0000a000, bufflen = 4096, done = 001ab754, timeout = 1000, retries = 5)
> Aug 22 23:57:52 elf kernel: command : 37 00 14 00 00 00 00 20 00 00
> Aug 22 23:57:52 elf kernel: (scsi0:0:0) Data overrun of 16773218 bytes detected in Data-In phase, tag 7; forcing a retry.
> Aug 22 23:57:52 elf kernel: Have seen Data Phase. Length=4096, NumSGs=1.
> Aug 22 23:57:52 elf kernel: sg[0] - Addr 0xa000 : Length 4096
>
> The size of 4096 seems to be the single page originally stated. Despite
> of the suggestion to increase the size of the buffer (maybe per
> ioctl(): set timeout and buffer size), I have only the hint that
> reading the defect list is not the fastest operation.
OK...looking at this (and noting that you still got crashes), one of three
things is true. First, you might not have applied the patch to
scsi_ioctl.c yet, second you could have applied it but it failed to
recompile the file on your kernel build (a dependency/time stamp problem),
or third, you applied it but your copy of scsiinfo is not using the
scsi_ioctl to send the read defects list command (mine does, that's what
led me there). The reason I say one of these three is true is the debug
output you have above. Let me explain somewhat.
> Aug 22 23:57:52 elf kernel: SMalloc: 4096 0000a000
This indicates that the command allocated a 4096 byte transfer buffer
(and the later aic7xxx debug statements concur, as well as giving the
address and the fact that it is a single buffer that we built into a one
element SG array).
> Aug 22 23:57:52 elf kernel: command : 37 00 14 00 00 00 00 20 00 00
This indicates, on the other hand, that the command itself is requesting
8192 bytes from the drive. (cmd[7] << 8) | cmd[8] == length of request. In
this case, (0x20 << 8) | 0x00 == 8192. Now, my patch to scsi_ioctl.c would
have stopped this from even being allowed to go through (tested and
verified on my system). So, that's why I say one of the three above
scenarios is true. The next thing I did on my system at home was to
recompile the source to scsiinfo. Before I did so, I went to the read
defects list function (the only place in the entire source code that has
the number 8192 hard coded into it) and changed both occurrences of 8192 to
4096, at which point scsiinfo started working again after the patch. I
should note here that scsiinfo is very broken in the sense that it doesn't
even bother to check the return status of the ioctl before it prints out
the table, so if you leave 8192 in the source code after applying the
patch to scsi_ioctl.c, then you get a 614 element defect list and a 614
element grown defect list if I remember correctly, and they are all 0 ;)
Now, also, scsiinfo is again broken because when it tries to read this
table, it uses the number of defects the drive says is in the list without
doing any bounds checking on whether or not that number of elements will
extend beyond the amount of data read, so it will return bogus information
beyond the initial 4k of defect data. In other words, scsiinfo is fairly
broken with regard to defect lists, but at least the patch I made to
scsi_ioctl.c should keep it from crashing your system. Now, if your copy
of scsiinfo is using something other than scsi_ioctl to send its command,
then we may have to look into that as well.
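
To make the arithmetic and the missing bounds check concrete, here is a
minimal sketch in C. It is not the code from my patch or from scsiinfo; the
function names are made up for illustration. It assumes the usual 10-byte
READ DEFECT DATA layout (allocation length in CDB bytes 7 and 8, a 4-byte
response header with the defect list length in bytes 2 and 3, both
big-endian), and it leaves the descriptor size to the caller since that
depends on the requested defect list format.

```c
#include <assert.h>
#include <stddef.h>

/* Allocation length of a 10-byte CDB: byte 7 is the MSB, byte 8 the LSB.
 * (Hypothetical helper name, for illustration only.) */
static unsigned int cdb10_alloc_len(const unsigned char *cmd)
{
    assert(cmd != NULL);
    return ((unsigned int)cmd[7] << 8) | cmd[8];
}

/* Returns nonzero if the transfer the CDB asks for fits in a buffer of
 * buflen bytes. A check of this kind is what would reject the command
 * logged above: 8192 bytes requested, 4096 bytes allocated. */
static int cdb10_fits_buffer(const unsigned char *cmd, size_t buflen)
{
    return cdb10_alloc_len(cmd) <= buflen;
}

/* Number of defect descriptors that can safely be read out of buf.
 * Assumes a 4-byte header whose bytes 2-3 hold the list length in bytes.
 * The count claimed by the drive is clamped to what was actually
 * transferred, so the caller never walks past the end of the buffer. */
static size_t defect_entries_in_buffer(const unsigned char *buf,
                                       size_t buflen, size_t desc_size)
{
    size_t claimed, avail;

    assert(buf != NULL && desc_size > 0);
    if (buflen < 4)
        return 0;               /* not even a complete header */

    claimed = ((((size_t)buf[2]) << 8) | buf[3]) / desc_size;
    avail = (buflen - 4) / desc_size;
    return claimed < avail ? claimed : avail;
}
```

With the command bytes from the log (37 00 14 00 00 00 00 20 00 00),
cdb10_alloc_len() gives 8192 and cdb10_fits_buffer(cmd, 4096) is false. A
caller would also still need to check the return status of the ioctl before
looking at the buffer at all.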
*****************************************************************************
* Doug Ledford * Unix, Novell, Dos, Windows 3.x, *
* dledford at dialnet.net 873-DIAL * WfW, Windows 95 & NT Technician *
* PPP access $14.95/month *****************************************
* Springfield, MO and surrounding * Usenet news, e-mail and shell account.*
* communities. Sign-up online at * Web page creation and hosting, other *
* 873-9000 V.34 * services available, call for info. *
*****************************************************************************