[PATCH] gcc 4.x cleanups of cam code
Bruce Evans
bde at zeta.org.au
Thu Jan 18 09:36:49 UTC 2007
On Wed, 17 Jan 2007, Scott Long wrote:
> Craig Rodrigues wrote:
>>
>> Could someone who is more familiar with CAM take a look
That's not me.
>> at the pass I did to try to clean up some gcc 4.x compiler warnings?
>> gcc 4.x is more intolerant than gcc 3.x of mixing up assignments of
>> char * and unsigned char *.
> I'm not a language guru, nor do I care to be. However, your patch seems
I play better as a language guru for 1 language.
> to imply that gcc 4.x requires an increasing amount of casting
> obfuscation in otherwise working source code. Is the point of gcc 4.x
> really to merely make C programming even more tedious, or should we be
> fixing our interfaces do that they work without all of the clumsy casts
> that you are proposing?. Below are my amateur comments:
Both. More warnings make programming more tedious, especially if you
try to be careful and use lots of types, but we should be fixing
interfaces and not hiding bugs behind casts.
> Index: cam.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/cam/cam.c,v
> retrieving revision 1.10
> diff -u -u -r1.10 cam.c
> --- cam.c 18 Apr 2006 21:53:39 -0000 1.10
> +++ cam.c 18 Jan 2007 02:05:36 -0000
> @@ -104,7 +104,7 @@
> #endif
>
> void
> -cam_strvis(u_int8_t *dst, const u_int8_t *src, int srclen, int dstlen)
> +cam_strvis(char *dst, const char *src, int srclen, int dstlen)
I think the main problem addressed by this set of patches is that gcc
finally started warning about the error of mixing pointers to plain char
with pointers to unsigned char. If chars are signed, then there is a
type mismatch, and if chars are unsigned then the code is unportable.
This error is most common in code that tries to use the correct types.
Old sloppy code tends to use plain chars for everything and not worry
about the sign conversion problems or 1's complement problems from this.
Here the code was apparently trying to be even more careful and use
u_int8_t instead of u_char. Userland strvis() just uses char *, but
that doesn't make it sloppy, since strings actually are char * in
userland. I think cam_strvis()'s use of u_int8_t is correct too, since
CAM wants to make visible non-strings which should be declared as
u_int8_t[] in SCSI data structures. Howver, the only relevant non-
strings are vendor[], product[] and revision[], and these are declared
as plain char[], hence the type mismatch. I think the bug is in the
declaration of these non-strings. AFAIK (not far), the non-strings
are specified by the SCSI standard as consisting of 8-bit bytes.
scsi_all.h uses u_int8_t for most byte data except these.
> {
>
> /* Trim leading/trailing spaces, nulls. */
> @@ -115,9 +115,9 @@
> srclen--;
>
> while (srclen > 0 && dstlen > 1) {
> - u_int8_t *cur_pos = dst;
> + char *cur_pos = dst;
>
> - if (*src < 0x20 || *src >= 0x80) {
> + if ((u_char)*src < 0x20 || (u_char)*src >= 0x80) {
> /* SCSI-II Specifies that these should never occur.
> */
> /* non-printable character */
> if (dstlen > 4) {
>
> You've gone from an unsigned quantity to a signed quantity. On the down
> side, this breaks the old API, though in a fairly trivial way. On the plus
> side, it brings the function more in line with the standard strvis(3)
> function. Not sure exactly how I feel about this. Since cam.c and cam.h are
> shared with userland, have you verified that this
> works there too?
For conversions between chars and u_int8_t, something like this is needed,
but it is better to start with u_int8_t and avoid casts if possible.
Casts should never be applied before range checks because they may move
the value in or out of the range. In the above, the casts are just ugly
and not needed. Range checking has already been broken in the 1's complement
case (the invalid value (u_int8_t)0xFF is -0 as a signed 8-bit char, and
of course things would be completely broken if the data is 8 bits but char
is > 8 bits (maybe this can't happen if u_int8_t exists). The casts in
the above are to avoid changing the semantics of cam_strvis() provided
problems have not already occurred (they might even give identical object
code). With more uglyness, API problems can probably be fixed up too.
The code would reduce to what the old sign mismatches did, but to be
strictly correct it has to access the 8-bit unsigned chars (it that is
what the were) using something like memcpy() to get at the individual
bits. The bits can then be interpreted.
> ...
> Same as above.
Poisoning for sign mismatches is like const poisoning, but worse, since
any change in signedness risks sign extension bugs.
> Index: cam_periph.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/cam/cam_periph.c,v
> retrieving revision 1.64
> diff -u -u -r1.64 cam_periph.c
> --- cam_periph.c 5 Dec 2006 07:45:27 -0000 1.64
> +++ cam_periph.c 18 Jan 2007 02:05:36 -0000
> @@ -648,7 +648,7 @@
> mapinfo->bp[i]->b_saveaddr = mapinfo->bp[i]->b_data;
>
> /* put our pointer in the data slot */
> - mapinfo->bp[i]->b_data = *data_ptrs[i];
> + mapinfo->bp[i]->b_data = (caddr_t)*data_ptrs[i];
>
> /* set the transfer length, we know it's < DFLTPHYS */
> mapinfo->bp[i]->b_bufsize = lengths[i];
> @@ -676,7 +676,7 @@
> }
>
> /* set our pointer to the new mapped area */
> - *data_ptrs[i] = mapinfo->bp[i]->b_data;
> + *data_ptrs[i] = (u_int8_t *)mapinfo->bp[i]->b_data;
>
> mapinfo->num_bufs_used++;
> }
>
> I've seen this cast in other patches that have been pushed out for gcc
> 4.x. It seems like every single assignment involving bp->b_data in the
> kernel is going to need a clumsy cast from now on. How thrilling. I
> wonder if there is a better way to do this. Also, I thought that the
> use of caddr_t had been frowned upon many years ago.
b_data is still (bogusly) caddr_t, so it needs to be fixed someday and
it's surprising that more casts aren't needed when assigning to it.
The above seems to be just another case of the sign poisoning and not
closely related to b_data. caddr_t is char *, so it has signedness,
so with warnings about signedness mismatches for pointers you can now
only assign pointers to signed quantities to it. C has a grandfather
clause (kludge) which allows void * to be misspelled as char * in
some contexts without a diagnostic for type mismatches which are much
larger than signedness mismatches being required. I wonder if the new
warnings are allowed with this. Maybe we are asking for them.
> ===================================================================
> RCS file: /home/ncvs/src/sys/cam/scsi/scsi_ses.c,v
> retrieving revision 1.33
> diff -u -u -r1.33 scsi_ses.c
> --- scsi/scsi_ses.c 5 Dec 2006 07:45:28 -0000 1.33
> +++ scsi/scsi_ses.c 18 Jan 2007 02:05:36 -0000
> @@ -676,7 +676,7 @@
> }
>
> ccb = cam_periph_getccb(ssc->periph, 1);
> - cam_fill_csio(&ccb->csio, 0, sesdone, ddf, MSG_SIMPLE_Q_TAG, dptr,
> + cam_fill_csio(&ccb->csio, 0, sesdone, ddf, MSG_SIMPLE_Q_TAG,
> (u_int8_t *)dptr,
> dlen, sizeof (struct scsi_sense_data), cdbl, 60 * 1000);
> bcopy(cdb, ccb->csio.cdb_io.cdb_bytes, cdbl);
>
>
> Blah, another ugly cast. Avoiding the cast looks to require a lot of work,
> but I don't know if it's ultimately the right thing.
Also, an expansion of a lines beyond 80 characters and subsequent mangling
of the long line in the mail.
> @@ -762,7 +762,7 @@
> return (SES_NONE);
> }
>
> - if (STRNCMP((char *)&iqd[SAFTE_START], "SAF-TE", SAFTE_LEN - 2) == 0)
> {
> + if (STRNCMP(&iqd[SAFTE_START], "SAF-TE", SAFTE_LEN - 2) == 0) {
> return (SES_SAFT);
> }
> return (SES_NONE);
>
>
> Finally, a cast is removed!
Any code that tries to be very careful and use u_int8_t for byte data has
this problem when interfacing with standard string functions.
Bruce
More information about the freebsd-scsi
mailing list