svn commit: r267675 - head/lib/libc/regex
Pedro Giffuni
pfg at freebsd.org
Sat Jun 21 01:16:16 UTC 2014
Il giorno 20/giu/2014, alle ore 16:59, Bruce Evans <brde at optusnet.com.au> ha scritto:
>
>> El 6/20/2014 1:23 PM, Stefan Farfeleder escribió:
>>> On Fri, Jun 20, 2014 at 03:29:10PM +0000, Pedro F. Giffuni wrote:
>>>> ...
>>>> Log:
>>>> regex: Make use of reallocf().
>>>> Use of reallocf is useful in libraries as we are not certain the
>>>> application will exit after NULL.
>>>> ...
>>>> This somewhat reduces portability but if since you are building
>>>> this as part of libc it is likely you have our non-standard
>>>> reallocf(3) already.
>
> It also somewhat increases namespace pollution.
>
>>>> Modified: head/lib/libc/regex/regcomp.c
>>>> ==============================================================================
>>>> --- head/lib/libc/regex/regcomp.c Fri Jun 20 13:26:49 2014 (r267674)
>>>> +++ head/lib/libc/regex/regcomp.c Fri Jun 20 15:29:09 2014 (r267675)
>>>> @@ -1111,7 +1111,7 @@ allocset(struct parse *p)
>>>> {
>>>> cset *cs, *ncs;
>>>> - ncs = realloc(p->g->sets, (p->g->ncsets + 1) * sizeof(*ncs));
>>>> + ncs = reallocf(p->g->sets, (p->g->ncsets + 1) * sizeof(*ncs));
>>>> if (ncs == NULL) {
>>>> SETERROR(REG_ESPACE);
>>>> return (NULL);
>>>> @@ -1174,7 +1174,7 @@ CHadd(struct parse *p, cset *cs, wint_t
>>>> if (ch < NC)
>>>> cs->bmp[ch >> 3] |= 1 << (ch & 7);
>>>> else {
>>>> - newwides = realloc(cs->wides, (cs->nwides + 1) *
>>>> + newwides = reallocf(cs->wides, (cs->nwides + 1) *
>>>> sizeof(*cs->wides));
>>>> if (newwides == NULL) {
>>>> SETERROR(REG_ESPACE);
>>> I don't think these changes are OK. If reallocf() fails here, the
>>> cs->wides pointer will be freed and later freeset() will call
>>> free(cs->wides), probably crashing. The other cases are most probably
>>> similar though I haven't examined them closely.
>>
>> OK ...
>>
>> I don't think there is any problem:
>>
>> If reallocf fails, newwides will be set to NULL and if free() is called it doesn't do anything when the argument is NULL.
>>
>> Also freeset() is meant to be called to "free a now-unused set" and it is not called within the library. I would think using a value when the allocation has failed is a much more serious issue than attempting to fail to free it after trying to use it. ;-).
>
> It doesn't look OK to me. It turns the careful use of newwides, etc.,
> into garbage, and leaves a garbage pointer in cs->wides, etc., to cause
> problems later. It might work without this careful use of a temporary
> variable, but even that is not clear. Consider:
>
> - start with a consistent data structure with some pointers to malloced
> storage in it; foo->p say
> - simple reallocf() allocation strategy:
> foo->p = reallocf(foo->p, size)
> if (foo->p == NULL)
> return (ERROR);
> This leaves the data structure consistent if and only if the only thing
> in it relevant to p is p itself, with foo->p serving as a flag for
> validity.
> - more complicated reallocf() allocation strategy:
> foo->p = reallocf(foo->p, size)
> if (foo->p == NULL) {
> foo->psize = 0;
> foo->pvalid = 0;
> foo->foovalid = 0;
> ...
> return (ERROR);
> }
> This still can't do things like cleaning up what foo->p points to, since
> reallocf() clobbered that.
> This shows that reallocf() is only useful in simple cases. In general, you
> must keep the pointer valid to clean things up:
> newp = reallocf(foo->p, size)
> if (newp == NULL) {
> for (i = 0; i < foo->psize; i++)
> free(foo->p[i]);
> foo->p = NULL;
> foo->psize = 0;
> foo->pvalid = 0;
> foo->foovalid = 0;
> ...
> return (ERROR);
> }
>
> Of course, malloc never fails so all this error checking is more a source
> of complexity and bugs than useful.
>
This is interesting, however I have problems understanding that a library would let you ignore an error condition and pass you garbage as part of it’s normal behavior. Plus being a standard library I am not sure if you can count on other implementations doing the same ...
> Re-quoting/recovering some context:
>
> @ Modified: head/lib/libc/regex/regcomp.c
> @ ==============================================================================
> @ --- head/lib/libc/regex/regcomp.c Fri Jun 20 13:26:49 2014 (r267674)
> @ +++ head/lib/libc/regex/regcomp.c Fri Jun 20 15:29:09 2014 (r267675)
> @ @@ -1111,7 +1111,7 @@ allocset(struct parse *p)
> @ {
> @ cset *cs, *ncs;
> @ @ - ncs = realloc(p->g->sets, (p->g->ncsets + 1) * sizeof(*ncs));
> @ + ncs = reallocf(p->g->sets, (p->g->ncsets + 1) * sizeof(*ncs));
> @ if (ncs == NULL) {
> @ SETERROR(REG_ESPACE);
> @ return (NULL);
>
> This code shows that even Henry didn't know how to program memory allocation
> in 1988. The code was much worse in 4.4BSD-Lite2:
>
> old@ if (p->g->sets == NULL)
> old@ p->g->sets = (cset *)malloc(nc * sizeof(cset));
> old@ else
> old@ p->g->sets = (cset *)realloc((char *)p->g->sets,
> old@ nc * sizeof(cset));
>
> This just leaked memory. It was improved by assigning to ncs and by
> removing the pre-C90 and C++ casts. Now it is unimproved by turning the
> careful use of ncs into garbage and leaving garbage in *p.
>
> old@ if (p->g->setbits == NULL)
> old@ p->g->setbits = (uch *)malloc(nbytes);
> old@ else {
> old@ p->g->setbits = (uch *)realloc((char *)p->g->setbits,
> old@ nbytes);
> old@ /* xxx this isn't right if setbits is now NULL */
> old@ for (i = 0; i < no; i++)
> old@ p->g->sets[i].ptr = p->g->setbits + css*(i/CHAR_BIT);
> old@ }
>
> Honest memory mismanagement. It just assumes that malloc() and realloc()
> can't fail. In -current, the function is much simpler and doesn't have
> this code. I think part of the simplication is to use realloc() of NULL
> instead of malloc() to start up.
>
> @ @@ -1174,7 +1174,7 @@ CHadd(struct parse *p, cset *cs, wint_t @ if (ch < NC)
> @ cs->bmp[ch >> 3] |= 1 << (ch & 7);
> @ else {
> @ - newwides = realloc(cs->wides, (cs->nwides + 1) *
> @ + newwides = reallocf(cs->wides, (cs->nwides + 1) *
> @ sizeof(*cs->wides));
> @ if (newwides == NULL) {
> @ SETERROR(REG_ESPACE);
> @ @@ -1203,7 +1203,7 @@ CHaddrange(struct parse *p, cset *cs, wi
> @ CHadd(p, cs, min);
> @ if (min >= max)
> @ return;
> @ - newranges = realloc(cs->ranges, (cs->nranges + 1) *
> @ + newranges = reallocf(cs->ranges, (cs->nranges + 1) *
> @ sizeof(*cs->ranges));
> @ if (newranges == NULL) {
> @ SETERROR(REG_ESPACE);
> @ @@ -1227,7 +1227,7 @@ CHaddtype(struct parse *p, cset *cs, wct
> @ for (i = 0; i < NC; i++)
> @ if (iswctype(i, wct))
> @ CHadd(p, cs, i);
> @ - newtypes = realloc(cs->types, (cs->ntypes + 1) *
> @ + newtypes = reallocf(cs->types, (cs->ntypes + 1) *
> @ sizeof(*cs->types));
> @ if (newtypes == NULL) {
> @ SETERROR(REG_ESPACE);
> @ @@ -1350,7 +1350,7 @@ enlarge(struct parse *p, sopno size)
> @ if (p->ssize >= size)
> @ return 1;
> @ @ - sp = (sop *)realloc(p->strip, size*sizeof(sop));
> @ + sp = (sop *)reallocf(p->strip, size*sizeof(sop));
> @ if (sp == NULL) {
> @ SETERROR(REG_ESPACE);
> @ return 0;
>
> Similarly in 4 more cases, except most of them weren't in old versions.
>
> @ @@ -1368,7 +1368,7 @@ static void
> @ stripsnug(struct parse *p, struct re_guts *g)
> @ {
> @ g->nstates = p->slen;
> @ - g->strip = (sop *)realloc((char *)p->strip, p->slen * sizeof(sop));
> @ + g->strip = (sop *)reallocf((char *)p->strip, p->slen * sizeof(sop));
> @ if (g->strip == NULL) {
> @ SETERROR(REG_ESPACE);
> @ g->strip = p->strip;
>
> This was the only realloc() that had not been cleaned up, so using
> reallocf() has a chance of improving it. It still has the pre-C90 and
> C++ casts. But there is an obvious new bug visible without reading more
> than the patch context:
> - the local variable might not be needed now, since the variable assigned
> to, g->strip, is different from the variable realloced, p->strip
> - on realloc failure, p->strip becomes invalid
> - the error handling is completely broken. It points g->strip at the old
> (now deallocated) storage. This is immediate use of the garbage pointer
> created by using reallocf().
>
> Clearly, only realloc() code of the form "p = realloc(p, size);", where the
> pointer assigned to is the same as the pointer realloced, can be improved
> by blindly converting it to use reallocf().
>
Ah yes, my fault there.
> The last function is most interesting. It seems to be to reduce the size.
> So an allocation failure is even less likely than usually, except lots of
> small allocations and reallocations are more likely to waste space than
> save it. But if failure occurs it is almost harmless, and the error
> handling of keeping using the previous allocation was good. Maybe Henry
> knew how to program memory allocation after all. (The last function
> survived previously-unchanged from 4.4BSDLite-2 except for removing K&R
> support and 'register’.
OK, it’s pretty tricky, but I agree that reallocf() simply won’t do anything here. I will revert.
Thanks!
Pedro.
More information about the svn-src-head
mailing list