Re: FYI: Why META_MODE rebuilds so much for building again after installworld (no source changes) [code level bug evidence]
Date: Thu, 23 Feb 2023 20:23:02 UTC
On Feb 23, 2023, at 11:53, Mark Millard <marklmi@yahoo.com> wrote:

> cached_realpath only reports its "cached_realpath:" notice
> (not the purging one) when it does not find the value via
> HashTable_FindValue and so does a HashTable_Set :
>
> const char *
> cached_realpath(const char *pathname, char *resolved)
> {
>     const char *rp;
>
>     if (pathname == NULL || pathname[0] == '\0')
>         return NULL;
>
>     rp = HashTable_FindValue(&cached_realpaths, pathname);
>     if (rp != NULL) {
>         /* a hit */
>         strncpy(resolved, rp, MAXPATHLEN);
>         resolved[MAXPATHLEN - 1] = '\0';
>         return resolved;
>     }
>
>     rp = realpath(pathname, resolved);
>     if (rp != NULL) {
>         HashTable_Set(&cached_realpaths, pathname, bmake_strdup(rp));
>         DEBUG2(DIR, "cached_realpath: %s -> %s\n", pathname, rp);
>         return resolved;
>     }
>
>     /* should we negative-cache? */
>     return NULL;
> }
>
> cached_realpaths is global:
>
> static HashTable cached_realpaths;
>
> So with -ddM why do I see lots of "cached_realpath:"
> notices for the same path? For example:
>
> # grep "tmp/legacy/usr/sbin/ln\>" /usr/obj/BUILDs/main-amd64-nodbg-clang/sys-typescripts/typescript-make-amd64-nodbg-clang-amd64-host-2023-02-23:10:20:26 | more
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
> Caching 02:49:37 Feb 23, 2023 for /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/usr.bin/awk/awkgram.tab.h.meta: 22: file '/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln' is newer than the target...
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
> Caching 02:49:37 Feb 23, 2023 for /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
> Caching 02:49:37 Feb 23, 2023 for /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
> Caching 02:49:37 Feb 23, 2023 for /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
> Caching 02:49:37 Feb 23, 2023 for /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
> Caching 02:49:37 Feb 23, 2023 for /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/usr.bin/awk/awkgram.tab.h.meta: 22: file '/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln' is newer than the target...
> . . .
>
> A possible cause is something I ran into while looking around:
>
> /* A read-only range of a character array, NOT null-terminated. */
> typedef struct Substring {
>     const char *start;
>     const char *end;
> } Substring;
> . . .
> MAKE_STATIC Substring
> Substring_Init(const char *start, const char *end)
> {
>     Substring sub;
>
>     sub.start = start;
>     sub.end = end;
>     return sub;
> }
> . . .
> /* Find the entry corresponding to the key, or return NULL. */
> HashEntry *
> HashTable_FindEntry(HashTable *t, const char *key)
> {
>     const char *keyEnd;
>     unsigned int h = Hash_String(key, &keyEnd);
>     return HashTable_Find(t, Substring_Init(key, keyEnd), h);
> }
> . . .
> /* This hash function matches Gosling's Emacs and java.lang.String. */
> static unsigned int
> Hash_String(const char *key, const char **out_keyEnd)
> {
>     unsigned int h;
>     const char *p;
>
>     h = 0;
>     for (p = key; *p != '\0'; p++)
>         h = 31 * h + (unsigned char)*p;
>
>     *out_keyEnd = p;
>     return h;
> }
>
> But after the loop: *p=='\0' so *out_keyEnd=='\0'
> and the FindEntry Substring_Init(key, keyEnd) ends
> up including the '\0' byte.
>
> But note that the h in Hash_String did not include the
> '\0' byte. Call this h value h_VALUE0 for later reference.
> Then look at:
>
> /* This hash function matches Gosling's Emacs and java.lang.String. */
> unsigned int
> Hash_Substring(Substring key)
> {
>     unsigned int h;
>     const char *p;
>
>     h = 0;
>     for (p = key.start; p != key.end; p++)
>         h = 31 * h + (unsigned char)*p;
>     return h;
> }
>
> This h does include the '\0' byte so
> h==(unsigned int)(31*h_VALUE0).

Dumb mistake on my part. Actually, *(key.end) is never used,
even if *(key.end) != '\0': the Hash_Substring loop stops once
p == key.end, so it hashes the same bytes as Hash_String and the
two h values match.

> I expect the mismatched hash values explain the repeated
> "cached_realpath:" notices for the same path: inserted
> but never found.

Still, the comments and code do not match, and I've not checked
all usage for assumptions about *(key.end) vs. '\0'.

===
Mark Millard
marklmi at yahoo.com
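
P.S. For anyone wanting to check the correction above, here is a
minimal sketch, assuming the two hash functions are exactly as quoted
(copied here outside of make). The helpers hashes_agree, prefix_hash
and self_check are hypothetical names for illustration only and are
not part of make.

```c
#include <assert.h>
#include <stddef.h>

/* A read-only range of a character array: the bytes in [start, end). */
typedef struct Substring {
	const char *start;
	const char *end;
} Substring;

/* As quoted: hash a NUL-terminated string and report where it ends. */
static unsigned int
Hash_String(const char *key, const char **out_keyEnd)
{
	unsigned int h = 0;
	const char *p;

	for (p = key; *p != '\0'; p++)
		h = 31 * h + (unsigned char)*p;

	*out_keyEnd = p;	/* points AT the '\0', which was not hashed */
	return h;
}

/* As quoted: hash [key.start, key.end); the byte at key.end is not read. */
static unsigned int
Hash_Substring(Substring key)
{
	unsigned int h = 0;
	const char *p;

	for (p = key.start; p != key.end; p++)
		h = 31 * h + (unsigned char)*p;
	return h;
}

/* Hypothetical helper: nonzero if both functions give the same hash. */
static int
hashes_agree(const char *key)
{
	const char *keyEnd;
	unsigned int h = Hash_String(key, &keyEnd);
	Substring sub = { key, keyEnd };

	return h == Hash_Substring(sub);
}

/* Hypothetical helper: hash only the first n bytes of s. */
static unsigned int
prefix_hash(const char *s, size_t n)
{
	Substring sub = { s, s + n };

	return Hash_Substring(sub);
}

/* Hypothetical self-check covering the cases discussed above. */
static void
self_check(void)
{
	assert(hashes_agree("/tmp/legacy/usr/sbin/ln"));
	assert(hashes_agree(""));
	/* The byte at key.end ('c' here, not '\0') does not affect the hash. */
	assert(prefix_hash("abc", 2) == prefix_hash("ab", 2));
}
```

Calling self_check() from a small test program should pass silently
if the two functions agree, which would mean the repeated
"cached_realpath:" notices need some other explanation.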