Inconsistent behavior with dd(1)
William Orr
will at worrbase.com
Fri Sep 12 00:15:16 UTC 2014
Hey, any thoughts on this? Does it need more work? Can it get merged?
On 08/25/2014 09:49 PM, William Orr wrote:
> On 8/18/14 12:00 PM, William Orr wrote:
>> Reply inline.
>>
>> On 08/16/2014 10:34 AM, John-Mark Gurney wrote:
>>> Alan Somers wrote this message on Fri, Aug 15, 2014 at 10:42 -0600:
>>>> On Thu, Aug 14, 2014 at 11:55 PM, William Orr <will at worrbase.com>
>>>> wrote:
>>>>> Hey,
>>>>>
>>>>> I found some inconsistent behavior with dd(1) when it comes to
>>>>> specifying arguments in -CURRENT.
>>>>>
>>>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null
>>>>> count=18446744073709551616
>>>>> dd: count: Result too large
>>>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null
>>>>> count=18446744073709551617
>>>>> dd: count: Result too large
>>>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null
>>>>> count=18446744073709551615
>>>>> dd: count cannot be negative
>>>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null
>>>>> count=-18446744073709551615
>>>>> 1+0 records in
>>>>> 1+0 records out
>>>>> 512 bytes transferred in 0.000373 secs (1373071 bytes/sec)
>>>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=-1
>>>>> dd: count cannot be negative
>>>>>
>>>>> ???
>>>>>
>>>>> Any chance someone has the time and could take a look?
>>>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191263
>>>>>
>>>>> Thanks,
>>>>> William Orr
>>>>>
>>>>> ???
>>>>
>>>> IMHO, this is a bug in strtouq(3), not in dd(1). Why should it parse
>>>> negative numbers at all, when there is stroq(3) for that purpose? The
>>>> standard is clear that it must, though. Oddly enough, stroq would
>>>> probably not accept -18446744073709551615, even though strtouq does.
>>>> Specific comments on your patch below:
>>>>
>>>>
>>>>> Here???s the patch:
>>>>>
>>>>> Index: bin/dd/args.c
>>>>> ===================================================================
>>>>> --- bin/dd/args.c (revision 267712)
>>>>> +++ bin/dd/args.c (working copy)
>>>>> @@ -186,46 +186,31 @@
>>>>> static void
>>>>> f_bs(char *arg)
>>>>> {
>>>>> - uintmax_t res;
>>>>> -
>>>>> - res = get_num(arg);
>>>>> - if (res < 1 || res > SSIZE_MAX)
>>>>> - errx(1, "bs must be between 1 and %jd",
>>>>> (intmax_t)SSIZE_MAX);
>>>>> - in.dbsz = out.dbsz = (size_t)res;
>>>>> + in.dbsz = out.dbsz = get_num(arg);
>>>>> + if (in.dbsz < 1 || out.dbsz < 1)
>>>> Why do you need to check both in and out? Aren't they the same?
>>>> Also, you eliminated the check for overflowing SSIZE_MAX. That's not
>>>> ok, because these values get passed to places that expect signed
>>>> numbers, for example in dd.c:303.
>>> The type of dbsz is size_t, so really:
>>>
>>>>> + errx(1, "bs must be between 1 and %ju",
>>>>> (uintmax_t)-1);
>>> This should be SIZE_MAX, except there isn't a define for this? So maybe
>>> the code really should be:
>>> (uintmax_t)(size_t)-1
>>>
>>> to get the correct value for SIZE_MAX...
>>>
>>> Otherwise on systems that uintmax_t is >32bits and size_t is 32bits,
>>> the error message will be wrong...
>> Yes, this should probably be SIZE_MAX rather than that cast. Same with
>> the others
>>
>>>>> }
>>>>>
>>>>> static void
>>>>> f_cbs(char *arg)
>>>>> {
>>>>> - uintmax_t res;
>>>>> -
>>>>> - res = get_num(arg);
>>>>> - if (res < 1 || res > SSIZE_MAX)
>>>>> - errx(1, "cbs must be between 1 and %jd",
>>>>> (intmax_t)SSIZE_MAX);
>>>>> - cbsz = (size_t)res;
>>>>> + cbsz = get_num(arg);
>>>>> + if (cbsz < 1)
>>>>> + errx(1, "cbs must be between 1 and %ju",
>>>>> (uintmax_t)-1);
>>>>> }
>>>> Again, you eliminated the check for SSIZE_MAX, but cbsz must be signed.
>>> What do you mean by this? cbsz is size_t which is unsigned...
>> I believe he's referring to this use of cbsz/in.dbsz/out.dbsz:
>>
>> https://svnweb.freebsd.org/base/head/bin/dd/dd.c?revision=265698&view=markup#l171
>>
>>
>> Really, this is more wrong since there is math inside of a malloc(3)
>> call without any overflow handling. By virtue of making this max out at
>> a ssize_t, it becomes more unlikely that you'll have overflow.
>>
>> This math should probably be done ahead of time with proper overflow
>> handling. I'll include that in my next patch, if there's no objection.
>>
>> I don't see any other reason why in.dbsz, out.dbsz or cbsz should be
>> signed, but it's very possible that I didn't look hard enough.
>>
>>> Again, the cast above is wrong... Maybe we should add a SIZE_MAX
>>> define so we don't have to see the double cast...
>>>
>>>>> static void
>>>>> f_count(char *arg)
>>>>> {
>>>>> - intmax_t res;
>>>>> -
>>>>> - res = (intmax_t)get_num(arg);
>>>>> - if (res < 0)
>>>>> - errx(1, "count cannot be negative");
>>>>> - if (res == 0)
>>>>> - cpy_cnt = (uintmax_t)-1;
>>>> This is a special case. See dd_in(). I think that eliminating this
>>>> special case will have the unintended effect of breaking count=0.
>>>>
>>>>> - else
>>>>> - cpy_cnt = (uintmax_t)res;
>>>>> + cpy_cnt = get_num(arg);
>>>>> }
>>>>>
>>>>> static void
>>>>> f_files(char *arg)
>>>>> {
>>>>> -
>>> Don't eliminate these blank lines.. they are intentional per style(9):
>>> /* Insert an empty line if the function has no local
>>> variables. */
>>>
>>>>> files_cnt = get_num(arg);
>>>>> if (files_cnt < 1)
>>>>> - errx(1, "files must be between 1 and %jd",
>>>>> (uintmax_t)-1);
>>>>> + errx(1, "files must be between 1 and %ju",
>>>>> (uintmax_t)-1);
>>>> Good catch.
>>>>
>>>>> }
>>>>>
>>>>> static void
>>>>> @@ -241,14 +226,10 @@
>>>>> static void
>>>>> f_ibs(char *arg)
>>>>> {
>>>>> - uintmax_t res;
>>>>> -
>>>>> if (!(ddflags & C_BS)) {
>>>>> - res = get_num(arg);
>>>>> - if (res < 1 || res > SSIZE_MAX)
>>>>> - errx(1, "ibs must be between 1 and %jd",
>>>>> - (intmax_t)SSIZE_MAX);
>>>>> - in.dbsz = (size_t)res;
>>>>> + in.dbsz = get_num(arg);
>>>>> + if (in.dbsz < 1)
>>>>> + errx(1, "ibs must be between 1 and %ju",
>>>>> (uintmax_t)-1);
>>>> Again, you eliminated the check for SSIZE_MAX, but dbsz must be signed.
>>> If dbsz must be signed, we should change it's definition to ssize_t
>>> instead of size_t... Can you point to the line that says this?
>>>
>>> In investigating this, it looks like we may have a bug in ftruncate in
>>> that out.offset * out.dbsz may overflow and return incorrect results...
>>> We should probably check that the output (cast to off_t) is greater than
>>> both the inputs before calling ftruncate... This is safe as both are
>>> unsigned...
>> Yeah, there probably ought to be integer overflow handling here as well.
>>
>>>>> }
>>>>> }
>>>>>
>>>>> @@ -262,14 +243,10 @@
>>>>> static void
>>>>> f_obs(char *arg)
>>>>> {
>>>>> - uintmax_t res;
>>>>> -
>>>>> if (!(ddflags & C_BS)) {
>>>>> - res = get_num(arg);
>>>>> - if (res < 1 || res > SSIZE_MAX)
>>>>> - errx(1, "obs must be between 1 and %jd",
>>>>> - (intmax_t)SSIZE_MAX);
>>>>> - out.dbsz = (size_t)res;
>>>>> + out.dbsz = get_num(arg);
>>>>> + if (out.dbsz < 1)
>>>>> + errx(1, "obs must be between 1 and %ju",
>>>>> (uintmax_t)-1);
>>>>> }
>>>>> }
>>>> Again, you eliminated the check for SSIZE_MAX, but dbsz must be signed.
>>>>
>>>>> @@ -378,11 +355,14 @@
>>>>> uintmax_t num, mult, prevnum;
>>>>> char *expr;
>>>>>
>>>>> + if (val[0] == '-')
>>>>> + errx(1, "%s: cannot be negative", oper);
>>>>> +
>>>> In general, I like this part of the diff. Every user of get_num
>>>> checks for negative values, so why not move the check into get_num
>>>> itself? But you changed it from a numeric check to a text check, and
>>>> writing text parsers makes me nervous. I can't see any problems,
>>>> though.
>> Funnily enough this part of the diff was wrong. I didn't account for
>> spaces, so I'll add that in my upcoming diff.
>>
>>>>> errno = 0;
>>>>> num = strtouq(val, &expr, 0);
>>>>> if (errno != 0) /* Overflow or
>>>>> underflow. */
>>>>> err(1, "%s", oper);
>>>>> -
>>>>> +
>>>>> if (expr == val) /* No valid
>>>>> digits. */
>>>>> errx(1, "%s: illegal numeric value", oper);
>>>>>
>>>>> Index: bin/dd/dd.c
>>>>> ===================================================================
>>>>> --- bin/dd/dd.c (revision 267712)
>>>>> +++ bin/dd/dd.c (working copy)
>>>>> @@ -284,8 +284,6 @@
>>>>>
>>>>> for (;;) {
>>>>> switch (cpy_cnt) {
>>>>> - case -1: /* count=0 was
>>>>> specified */
>>>>> - return;
>>>> Again, I don't think this will do what you want it to do. Previously,
>>>> leaving count unspecified resulted in cpy_cnt being 0, and specifying
>>>> count=0 set cpy_cnt to -1. With your patch, setting count=0 will have
>>>> the same effect as leaving it unspecified.
>> Nope. It didn't do what I wanted. I'll submit an updated diff with this
>> fixed as well as the other things I mentioned, provided there's no
>> objection to my direction.
>>
>>>>> case 0:
>>>>> break;
>>>>> default:
>>>>>
>>>>>
> Here is a new version of this patch. I've now changed the members of the
> struct that are used as ssize_ts to ssize_ts, and have fixed casts
> appropriately. I have also opted for strtoull over the deprecated
> strtouq. I changed some struct members that were declared as uintmax_t's
> to size_t's.
>
> Any comments? Criticisms? Gross attacks on my character?
>
> Index: args.c
> ===================================================================
> --- args.c (revision 270645)
> +++ args.c (working copy)
> @@ -41,6 +41,7 @@
>
> #include <sys/types.h>
>
> +#include <ctype.h>
> #include <err.h>
> #include <errno.h>
> #include <inttypes.h>
> @@ -171,8 +172,7 @@
> */
> if (in.offset > OFF_MAX / (ssize_t)in.dbsz ||
> out.offset > OFF_MAX / (ssize_t)out.dbsz)
> - errx(1, "seek offsets cannot be larger than %jd",
> - (intmax_t)OFF_MAX);
> + errx(1, "seek offsets cannot be larger than %jd", OFF_MAX);
> }
>
> static int
> @@ -186,37 +186,28 @@
> static void
> f_bs(char *arg)
> {
> - uintmax_t res;
>
> - res = get_num(arg);
> - if (res < 1 || res > SSIZE_MAX)
> - errx(1, "bs must be between 1 and %jd", (intmax_t)SSIZE_MAX);
> - in.dbsz = out.dbsz = (size_t)res;
> + in.dbsz = out.dbsz = get_num(arg);
> + if (out.dbsz < 1 || out.dbsz > SSIZE_MAX)
> + errx(1, "bs must be between 1 and %jd", SSIZE_MAX);
> }
>
> static void
> f_cbs(char *arg)
> {
> - uintmax_t res;
>
> - res = get_num(arg);
> - if (res < 1 || res > SSIZE_MAX)
> - errx(1, "cbs must be between 1 and %jd", (intmax_t)SSIZE_MAX);
> - cbsz = (size_t)res;
> + cbsz = get_num(arg);
> + if (cbsz < 1 || cbsz > SSIZE_MAX)
> + errx(1, "cbs must be between 1 and %jd", SSIZE_MAX);
> }
>
> static void
> f_count(char *arg)
> {
> - intmax_t res;
>
> - res = (intmax_t)get_num(arg);
> - if (res < 0)
> - errx(1, "count cannot be negative");
> - if (res == 0)
> - cpy_cnt = (uintmax_t)-1;
> - else
> - cpy_cnt = (uintmax_t)res;
> + cpy_cnt = get_num(arg);
> + if (cpy_cnt == 0)
> + cpy_cnt = -1;
> }
>
> static void
> @@ -225,7 +216,7 @@
>
> files_cnt = get_num(arg);
> if (files_cnt < 1)
> - errx(1, "files must be between 1 and %jd", (uintmax_t)-1);
> + errx(1, "files must be between 1 and %ju", SIZE_MAX);
> }
>
> static void
> @@ -241,14 +232,11 @@
> static void
> f_ibs(char *arg)
> {
> - uintmax_t res;
>
> if (!(ddflags & C_BS)) {
> - res = get_num(arg);
> - if (res < 1 || res > SSIZE_MAX)
> - errx(1, "ibs must be between 1 and %jd",
> - (intmax_t)SSIZE_MAX);
> - in.dbsz = (size_t)res;
> + in.dbsz = get_num(arg);
> + if (in.dbsz < 1 || in.dbsz > SSIZE_MAX)
> + errx(1, "ibs must be between 1 and %ju", SSIZE_MAX);
> }
> }
>
> @@ -262,14 +250,11 @@
> static void
> f_obs(char *arg)
> {
> - uintmax_t res;
>
> if (!(ddflags & C_BS)) {
> - res = get_num(arg);
> - if (res < 1 || res > SSIZE_MAX)
> - errx(1, "obs must be between 1 and %jd",
> - (intmax_t)SSIZE_MAX);
> - out.dbsz = (size_t)res;
> + out.dbsz = get_num(arg);
> + if (out.dbsz < 1 || out.dbsz > SSIZE_MAX)
> + errx(1, "obs must be between 1 and %jd", SSIZE_MAX);
> }
> }
>
> @@ -378,11 +363,17 @@
> uintmax_t num, mult, prevnum;
> char *expr;
>
> + while (isspace(val[0]))
> + val++;
> +
> + if (val[0] == '-')
> + errx(1, "%s: cannot be negative", oper);
> +
> errno = 0;
> - num = strtouq(val, &expr, 0);
> + num = strtoull(val, &expr, 0);
> if (errno != 0) /* Overflow or underflow. */
> err(1, "%s", oper);
> -
> +
> if (expr == val) /* No valid digits. */
> errx(1, "%s: illegal numeric value", oper);
>
> Index: conv.c
> ===================================================================
> --- conv.c (revision 270645)
> +++ conv.c (working copy)
> @@ -133,7 +133,7 @@
> */
> ch = 0;
> for (inp = in.dbp - in.dbcnt, outp = out.dbp; in.dbcnt;) {
> - maxlen = MIN(cbsz, in.dbcnt);
> + maxlen = MIN(cbsz, (size_t)in.dbcnt);
> if ((t = ctab) != NULL)
> for (cnt = 0; cnt < maxlen && (ch = *inp++) != '\n';
> ++cnt)
> @@ -146,7 +146,7 @@
> * Check for short record without a newline. Reassemble the
> * input block.
> */
> - if (ch != '\n' && in.dbcnt < cbsz) {
> + if (ch != '\n' && (size_t)in.dbcnt < cbsz) {
> (void)memmove(in.db, in.dbp - in.dbcnt, in.dbcnt);
> break;
> }
> @@ -228,7 +228,7 @@
> * translation has to already be done or we might not recognize the
> * spaces.
> */
> - for (inp = in.db; in.dbcnt >= cbsz; inp += cbsz, in.dbcnt -= cbsz) {
> + for (inp = in.db; (size_t)in.dbcnt >= cbsz; inp += cbsz, in.dbcnt
> -= cbsz) {
> for (t = inp + cbsz - 1; t >= inp && *t == ' '; --t)
> ;
> if (t >= inp) {
> Index: dd.c
> ===================================================================
> --- dd.c (revision 270645)
> +++ dd.c (working copy)
> @@ -168,10 +168,10 @@
> * record oriented I/O, only need a single buffer.
> */
> if (!(ddflags & (C_BLOCK | C_UNBLOCK))) {
> - if ((in.db = malloc(out.dbsz + in.dbsz - 1)) == NULL)
> + if ((in.db = malloc((size_t)out.dbsz + in.dbsz - 1)) == NULL)
> err(1, "input buffer");
> out.db = in.db;
> - } else if ((in.db = malloc(MAX(in.dbsz, cbsz) + cbsz)) == NULL ||
> + } else if ((in.db = malloc(MAX((size_t)in.dbsz, cbsz) + cbsz)) ==
> NULL ||
> (out.db = malloc(out.dbsz + cbsz)) == NULL)
> err(1, "output buffer");
>
> @@ -343,7 +343,7 @@
> ++st.in_full;
>
> /* Handle full input blocks. */
> - } else if ((size_t)n == in.dbsz) {
> + } else if ((size_t)n == (size_t)in.dbsz) {
> in.dbcnt += in.dbrcnt = n;
> ++st.in_full;
>
> @@ -493,7 +493,7 @@
> outp += nw;
> st.bytes += nw;
>
> - if ((size_t)nw == n && n == out.dbsz)
> + if ((size_t)nw == n && n == (size_t)out.dbsz)
> ++st.out_full;
> else
> ++st.out_part;
> Index: dd.h
> ===================================================================
> --- dd.h (revision 270645)
> +++ dd.h (working copy)
> @@ -38,10 +38,9 @@
> typedef struct {
> u_char *db; /* buffer address */
> u_char *dbp; /* current buffer I/O address */
> - /* XXX ssize_t? */
> - size_t dbcnt; /* current buffer byte count */
> - size_t dbrcnt; /* last read byte count */
> - size_t dbsz; /* block size */
> + ssize_t dbcnt; /* current buffer byte count */
> + ssize_t dbrcnt; /* last read byte count */
> + ssize_t dbsz; /* block size */
>
> #define ISCHR 0x01 /* character device (warn on short) */
> #define ISPIPE 0x02 /* pipe-like (see position.c) */
> @@ -57,13 +56,13 @@
> } IO;
>
> typedef struct {
> - uintmax_t in_full; /* # of full input blocks */
> - uintmax_t in_part; /* # of partial input blocks */
> - uintmax_t out_full; /* # of full output blocks */
> - uintmax_t out_part; /* # of partial output blocks */
> - uintmax_t trunc; /* # of truncated records */
> - uintmax_t swab; /* # of odd-length swab blocks */
> - uintmax_t bytes; /* # of bytes written */
> + size_t in_full; /* # of full input blocks */
> + size_t in_part; /* # of partial input blocks */
> + size_t out_full; /* # of full output blocks */
> + size_t out_part; /* # of partial output blocks */
> + size_t trunc; /* # of truncated records */
> + size_t swab; /* # of odd-length swab blocks */
> + size_t bytes; /* # of bytes written */
> struct timespec start; /* start time of dd */
> } STAT;
>
> Index: position.c
> ===================================================================
> --- position.c (revision 270645)
> +++ position.c (working copy)
> @@ -178,7 +178,7 @@
> n = write(out.fd, out.db, out.dbsz);
> if (n == -1)
> err(1, "%s", out.name);
> - if ((size_t)n != out.dbsz)
> + if (n != out.dbsz)
> errx(1, "%s: write failure", out.name);
> }
> break;
>
> _______________________________________________
> freebsd-current at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe at freebsd.org"
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freebsd.org/pipermail/freebsd-current/attachments/20140911/f975a9f8/attachment.sig>
More information about the freebsd-current
mailing list