Inconsistent behavior with dd(1)
William Orr
will at worrbase.com
Tue Aug 26 04:49:24 UTC 2014
On 8/18/14 12:00 PM, William Orr wrote:
> Reply inline.
>
> On 08/16/2014 10:34 AM, John-Mark Gurney wrote:
>> Alan Somers wrote this message on Fri, Aug 15, 2014 at 10:42 -0600:
>>> On Thu, Aug 14, 2014 at 11:55 PM, William Orr <will at worrbase.com> wrote:
>>>> Hey,
>>>>
>>>> I found some inconsistent behavior with dd(1) when it comes to specifying arguments in -CURRENT.
>>>>
>>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=18446744073709551616
>>>> dd: count: Result too large
>>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=18446744073709551617
>>>> dd: count: Result too large
>>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=18446744073709551615
>>>> dd: count cannot be negative
>>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=-18446744073709551615
>>>> 1+0 records in
>>>> 1+0 records out
>>>> 512 bytes transferred in 0.000373 secs (1373071 bytes/sec)
>>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=-1
>>>> dd: count cannot be negative
>>>>
>>>> ???
>>>>
>>>> Any chance someone has the time and could take a look? https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191263
>>>>
>>>> Thanks,
>>>> William Orr
>>>>
>>>> ???
>>>
>>> IMHO, this is a bug in strtouq(3), not in dd(1). Why should it parse
>>> negative numbers at all, when there is stroq(3) for that purpose? The
>>> standard is clear that it must, though. Oddly enough, stroq would
>>> probably not accept -18446744073709551615, even though strtouq does.
>>> Specific comments on your patch below:
>>>
>>>
>>>> Here???s the patch:
>>>>
>>>> Index: bin/dd/args.c
>>>> ===================================================================
>>>> --- bin/dd/args.c (revision 267712)
>>>> +++ bin/dd/args.c (working copy)
>>>> @@ -186,46 +186,31 @@
>>>> static void
>>>> f_bs(char *arg)
>>>> {
>>>> - uintmax_t res;
>>>> -
>>>> - res = get_num(arg);
>>>> - if (res < 1 || res > SSIZE_MAX)
>>>> - errx(1, "bs must be between 1 and %jd", (intmax_t)SSIZE_MAX);
>>>> - in.dbsz = out.dbsz = (size_t)res;
>>>> + in.dbsz = out.dbsz = get_num(arg);
>>>> + if (in.dbsz < 1 || out.dbsz < 1)
>>> Why do you need to check both in and out? Aren't they the same?
>>> Also, you eliminated the check for overflowing SSIZE_MAX. That's not
>>> ok, because these values get passed to places that expect signed
>>> numbers, for example in dd.c:303.
>> The type of dbsz is size_t, so really:
>>
>>>> + errx(1, "bs must be between 1 and %ju", (uintmax_t)-1);
>> This should be SIZE_MAX, except there isn't a define for this? So maybe
>> the code really should be:
>> (uintmax_t)(size_t)-1
>>
>> to get the correct value for SIZE_MAX...
>>
>> Otherwise on systems that uintmax_t is >32bits and size_t is 32bits,
>> the error message will be wrong...
> Yes, this should probably be SIZE_MAX rather than that cast. Same with
> the others
>
>>>> }
>>>>
>>>> static void
>>>> f_cbs(char *arg)
>>>> {
>>>> - uintmax_t res;
>>>> -
>>>> - res = get_num(arg);
>>>> - if (res < 1 || res > SSIZE_MAX)
>>>> - errx(1, "cbs must be between 1 and %jd", (intmax_t)SSIZE_MAX);
>>>> - cbsz = (size_t)res;
>>>> + cbsz = get_num(arg);
>>>> + if (cbsz < 1)
>>>> + errx(1, "cbs must be between 1 and %ju", (uintmax_t)-1);
>>>> }
>>> Again, you eliminated the check for SSIZE_MAX, but cbsz must be signed.
>> What do you mean by this? cbsz is size_t which is unsigned...
> I believe he's referring to this use of cbsz/in.dbsz/out.dbsz:
>
> https://svnweb.freebsd.org/base/head/bin/dd/dd.c?revision=265698&view=markup#l171
>
> Really, this is more wrong since there is math inside of a malloc(3)
> call without any overflow handling. By virtue of making this max out at
> a ssize_t, it becomes more unlikely that you'll have overflow.
>
> This math should probably be done ahead of time with proper overflow
> handling. I'll include that in my next patch, if there's no objection.
>
> I don't see any other reason why in.dbsz, out.dbsz or cbsz should be
> signed, but it's very possible that I didn't look hard enough.
>
>> Again, the cast above is wrong... Maybe we should add a SIZE_MAX
>> define so we don't have to see the double cast...
>>
>>>> static void
>>>> f_count(char *arg)
>>>> {
>>>> - intmax_t res;
>>>> -
>>>> - res = (intmax_t)get_num(arg);
>>>> - if (res < 0)
>>>> - errx(1, "count cannot be negative");
>>>> - if (res == 0)
>>>> - cpy_cnt = (uintmax_t)-1;
>>> This is a special case. See dd_in(). I think that eliminating this
>>> special case will have the unintended effect of breaking count=0.
>>>
>>>> - else
>>>> - cpy_cnt = (uintmax_t)res;
>>>> + cpy_cnt = get_num(arg);
>>>> }
>>>>
>>>> static void
>>>> f_files(char *arg)
>>>> {
>>>> -
>> Don't eliminate these blank lines.. they are intentional per style(9):
>> /* Insert an empty line if the function has no local variables. */
>>
>>>> files_cnt = get_num(arg);
>>>> if (files_cnt < 1)
>>>> - errx(1, "files must be between 1 and %jd", (uintmax_t)-1);
>>>> + errx(1, "files must be between 1 and %ju", (uintmax_t)-1);
>>> Good catch.
>>>
>>>> }
>>>>
>>>> static void
>>>> @@ -241,14 +226,10 @@
>>>> static void
>>>> f_ibs(char *arg)
>>>> {
>>>> - uintmax_t res;
>>>> -
>>>> if (!(ddflags & C_BS)) {
>>>> - res = get_num(arg);
>>>> - if (res < 1 || res > SSIZE_MAX)
>>>> - errx(1, "ibs must be between 1 and %jd",
>>>> - (intmax_t)SSIZE_MAX);
>>>> - in.dbsz = (size_t)res;
>>>> + in.dbsz = get_num(arg);
>>>> + if (in.dbsz < 1)
>>>> + errx(1, "ibs must be between 1 and %ju", (uintmax_t)-1);
>>> Again, you eliminated the check for SSIZE_MAX, but dbsz must be signed.
>> If dbsz must be signed, we should change it's definition to ssize_t
>> instead of size_t... Can you point to the line that says this?
>>
>> In investigating this, it looks like we may have a bug in ftruncate in
>> that out.offset * out.dbsz may overflow and return incorrect results...
>> We should probably check that the output (cast to off_t) is greater than
>> both the inputs before calling ftruncate... This is safe as both are
>> unsigned...
> Yeah, there probably ought to be integer overflow handling here as well.
>
>>>> }
>>>> }
>>>>
>>>> @@ -262,14 +243,10 @@
>>>> static void
>>>> f_obs(char *arg)
>>>> {
>>>> - uintmax_t res;
>>>> -
>>>> if (!(ddflags & C_BS)) {
>>>> - res = get_num(arg);
>>>> - if (res < 1 || res > SSIZE_MAX)
>>>> - errx(1, "obs must be between 1 and %jd",
>>>> - (intmax_t)SSIZE_MAX);
>>>> - out.dbsz = (size_t)res;
>>>> + out.dbsz = get_num(arg);
>>>> + if (out.dbsz < 1)
>>>> + errx(1, "obs must be between 1 and %ju", (uintmax_t)-1);
>>>> }
>>>> }
>>> Again, you eliminated the check for SSIZE_MAX, but dbsz must be signed.
>>>
>>>> @@ -378,11 +355,14 @@
>>>> uintmax_t num, mult, prevnum;
>>>> char *expr;
>>>>
>>>> + if (val[0] == '-')
>>>> + errx(1, "%s: cannot be negative", oper);
>>>> +
>>> In general, I like this part of the diff. Every user of get_num
>>> checks for negative values, so why not move the check into get_num
>>> itself? But you changed it from a numeric check to a text check, and
>>> writing text parsers makes me nervous. I can't see any problems,
>>> though.
> Funnily enough this part of the diff was wrong. I didn't account for
> spaces, so I'll add that in my upcoming diff.
>
>>>> errno = 0;
>>>> num = strtouq(val, &expr, 0);
>>>> if (errno != 0) /* Overflow or underflow. */
>>>> err(1, "%s", oper);
>>>> -
>>>> +
>>>> if (expr == val) /* No valid digits. */
>>>> errx(1, "%s: illegal numeric value", oper);
>>>>
>>>> Index: bin/dd/dd.c
>>>> ===================================================================
>>>> --- bin/dd/dd.c (revision 267712)
>>>> +++ bin/dd/dd.c (working copy)
>>>> @@ -284,8 +284,6 @@
>>>>
>>>> for (;;) {
>>>> switch (cpy_cnt) {
>>>> - case -1: /* count=0 was specified */
>>>> - return;
>>> Again, I don't think this will do what you want it to do. Previously,
>>> leaving count unspecified resulted in cpy_cnt being 0, and specifying
>>> count=0 set cpy_cnt to -1. With your patch, setting count=0 will have
>>> the same effect as leaving it unspecified.
> Nope. It didn't do what I wanted. I'll submit an updated diff with this
> fixed as well as the other things I mentioned, provided there's no
> objection to my direction.
>
>>>> case 0:
>>>> break;
>>>> default:
>>>>
>>>>
Here is a new version of this patch. I've now changed the members of the
struct that are used as ssize_ts to ssize_ts, and have fixed casts
appropriately. I have also opted for strtoull over the deprecated
strtouq. I changed some struct members that were declared as uintmax_t's
to size_t's.
Any comments? Criticisms? Gross attacks on my character?
Index: args.c
===================================================================
--- args.c (revision 270645)
+++ args.c (working copy)
@@ -41,6 +41,7 @@
#include <sys/types.h>
+#include <ctype.h>
#include <err.h>
#include <errno.h>
#include <inttypes.h>
@@ -171,8 +172,7 @@
*/
if (in.offset > OFF_MAX / (ssize_t)in.dbsz ||
out.offset > OFF_MAX / (ssize_t)out.dbsz)
- errx(1, "seek offsets cannot be larger than %jd",
- (intmax_t)OFF_MAX);
+ errx(1, "seek offsets cannot be larger than %jd", OFF_MAX);
}
static int
@@ -186,37 +186,28 @@
static void
f_bs(char *arg)
{
- uintmax_t res;
- res = get_num(arg);
- if (res < 1 || res > SSIZE_MAX)
- errx(1, "bs must be between 1 and %jd", (intmax_t)SSIZE_MAX);
- in.dbsz = out.dbsz = (size_t)res;
+ in.dbsz = out.dbsz = get_num(arg);
+ if (out.dbsz < 1 || out.dbsz > SSIZE_MAX)
+ errx(1, "bs must be between 1 and %jd", SSIZE_MAX);
}
static void
f_cbs(char *arg)
{
- uintmax_t res;
- res = get_num(arg);
- if (res < 1 || res > SSIZE_MAX)
- errx(1, "cbs must be between 1 and %jd", (intmax_t)SSIZE_MAX);
- cbsz = (size_t)res;
+ cbsz = get_num(arg);
+ if (cbsz < 1 || cbsz > SSIZE_MAX)
+ errx(1, "cbs must be between 1 and %jd", SSIZE_MAX);
}
static void
f_count(char *arg)
{
- intmax_t res;
- res = (intmax_t)get_num(arg);
- if (res < 0)
- errx(1, "count cannot be negative");
- if (res == 0)
- cpy_cnt = (uintmax_t)-1;
- else
- cpy_cnt = (uintmax_t)res;
+ cpy_cnt = get_num(arg);
+ if (cpy_cnt == 0)
+ cpy_cnt = -1;
}
static void
@@ -225,7 +216,7 @@
files_cnt = get_num(arg);
if (files_cnt < 1)
- errx(1, "files must be between 1 and %jd", (uintmax_t)-1);
+ errx(1, "files must be between 1 and %ju", SIZE_MAX);
}
static void
@@ -241,14 +232,11 @@
static void
f_ibs(char *arg)
{
- uintmax_t res;
if (!(ddflags & C_BS)) {
- res = get_num(arg);
- if (res < 1 || res > SSIZE_MAX)
- errx(1, "ibs must be between 1 and %jd",
- (intmax_t)SSIZE_MAX);
- in.dbsz = (size_t)res;
+ in.dbsz = get_num(arg);
+ if (in.dbsz < 1 || in.dbsz > SSIZE_MAX)
+ errx(1, "ibs must be between 1 and %ju", SSIZE_MAX);
}
}
@@ -262,14 +250,11 @@
static void
f_obs(char *arg)
{
- uintmax_t res;
if (!(ddflags & C_BS)) {
- res = get_num(arg);
- if (res < 1 || res > SSIZE_MAX)
- errx(1, "obs must be between 1 and %jd",
- (intmax_t)SSIZE_MAX);
- out.dbsz = (size_t)res;
+ out.dbsz = get_num(arg);
+ if (out.dbsz < 1 || out.dbsz > SSIZE_MAX)
+ errx(1, "obs must be between 1 and %jd", SSIZE_MAX);
}
}
@@ -378,11 +363,17 @@
uintmax_t num, mult, prevnum;
char *expr;
+ while (isspace(val[0]))
+ val++;
+
+ if (val[0] == '-')
+ errx(1, "%s: cannot be negative", oper);
+
errno = 0;
- num = strtouq(val, &expr, 0);
+ num = strtoull(val, &expr, 0);
if (errno != 0) /* Overflow or underflow. */
err(1, "%s", oper);
-
+
if (expr == val) /* No valid digits. */
errx(1, "%s: illegal numeric value", oper);
Index: conv.c
===================================================================
--- conv.c (revision 270645)
+++ conv.c (working copy)
@@ -133,7 +133,7 @@
*/
ch = 0;
for (inp = in.dbp - in.dbcnt, outp = out.dbp; in.dbcnt;) {
- maxlen = MIN(cbsz, in.dbcnt);
+ maxlen = MIN(cbsz, (size_t)in.dbcnt);
if ((t = ctab) != NULL)
for (cnt = 0; cnt < maxlen && (ch = *inp++) != '\n';
++cnt)
@@ -146,7 +146,7 @@
* Check for short record without a newline. Reassemble the
* input block.
*/
- if (ch != '\n' && in.dbcnt < cbsz) {
+ if (ch != '\n' && (size_t)in.dbcnt < cbsz) {
(void)memmove(in.db, in.dbp - in.dbcnt, in.dbcnt);
break;
}
@@ -228,7 +228,7 @@
* translation has to already be done or we might not recognize the
* spaces.
*/
- for (inp = in.db; in.dbcnt >= cbsz; inp += cbsz, in.dbcnt -= cbsz) {
+ for (inp = in.db; (size_t)in.dbcnt >= cbsz; inp += cbsz, in.dbcnt
-= cbsz) {
for (t = inp + cbsz - 1; t >= inp && *t == ' '; --t)
;
if (t >= inp) {
Index: dd.c
===================================================================
--- dd.c (revision 270645)
+++ dd.c (working copy)
@@ -168,10 +168,10 @@
* record oriented I/O, only need a single buffer.
*/
if (!(ddflags & (C_BLOCK | C_UNBLOCK))) {
- if ((in.db = malloc(out.dbsz + in.dbsz - 1)) == NULL)
+ if ((in.db = malloc((size_t)out.dbsz + in.dbsz - 1)) == NULL)
err(1, "input buffer");
out.db = in.db;
- } else if ((in.db = malloc(MAX(in.dbsz, cbsz) + cbsz)) == NULL ||
+ } else if ((in.db = malloc(MAX((size_t)in.dbsz, cbsz) + cbsz)) ==
NULL ||
(out.db = malloc(out.dbsz + cbsz)) == NULL)
err(1, "output buffer");
@@ -343,7 +343,7 @@
++st.in_full;
/* Handle full input blocks. */
- } else if ((size_t)n == in.dbsz) {
+ } else if ((size_t)n == (size_t)in.dbsz) {
in.dbcnt += in.dbrcnt = n;
++st.in_full;
@@ -493,7 +493,7 @@
outp += nw;
st.bytes += nw;
- if ((size_t)nw == n && n == out.dbsz)
+ if ((size_t)nw == n && n == (size_t)out.dbsz)
++st.out_full;
else
++st.out_part;
Index: dd.h
===================================================================
--- dd.h (revision 270645)
+++ dd.h (working copy)
@@ -38,10 +38,9 @@
typedef struct {
u_char *db; /* buffer address */
u_char *dbp; /* current buffer I/O address */
- /* XXX ssize_t? */
- size_t dbcnt; /* current buffer byte count */
- size_t dbrcnt; /* last read byte count */
- size_t dbsz; /* block size */
+ ssize_t dbcnt; /* current buffer byte count */
+ ssize_t dbrcnt; /* last read byte count */
+ ssize_t dbsz; /* block size */
#define ISCHR 0x01 /* character device (warn on short) */
#define ISPIPE 0x02 /* pipe-like (see position.c) */
@@ -57,13 +56,13 @@
} IO;
typedef struct {
- uintmax_t in_full; /* # of full input blocks */
- uintmax_t in_part; /* # of partial input blocks */
- uintmax_t out_full; /* # of full output blocks */
- uintmax_t out_part; /* # of partial output blocks */
- uintmax_t trunc; /* # of truncated records */
- uintmax_t swab; /* # of odd-length swab blocks */
- uintmax_t bytes; /* # of bytes written */
+ size_t in_full; /* # of full input blocks */
+ size_t in_part; /* # of partial input blocks */
+ size_t out_full; /* # of full output blocks */
+ size_t out_part; /* # of partial output blocks */
+ size_t trunc; /* # of truncated records */
+ size_t swab; /* # of odd-length swab blocks */
+ size_t bytes; /* # of bytes written */
struct timespec start; /* start time of dd */
} STAT;
Index: position.c
===================================================================
--- position.c (revision 270645)
+++ position.c (working copy)
@@ -178,7 +178,7 @@
n = write(out.fd, out.db, out.dbsz);
if (n == -1)
err(1, "%s", out.name);
- if ((size_t)n != out.dbsz)
+ if (n != out.dbsz)
errx(1, "%s: write failure", out.name);
}
break;
More information about the freebsd-current
mailing list