Re: git: 151abc80cde7 - main - if_vlan: avoid hash table thrashing when adding and removing entries
Date: Tue, 26 Jul 2022 09:03:47 UTC
On 25 Jul 2022, at 18:20, John Baldwin wrote: > On 7/22/22 4:19 PM, Kristof Provost wrote: >> The branch main has been updated by kp: >> >> URL: >> https://cgit.FreeBSD.org/src/commit/?id=151abc80cde778bc18b91c334d07fbd52bbb38fb >> >> commit 151abc80cde778bc18b91c334d07fbd52bbb38fb >> Author: Kristof Provost <kp@FreeBSD.org> >> AuthorDate: 2022-07-22 17:17:04 +0000 >> Commit: Kristof Provost <kp@FreeBSD.org> >> CommitDate: 2022-07-22 17:18:41 +0000 >> >> if_vlan: avoid hash table thrashing when adding and removing >> entries >> vlan_remhash() uses incorrect value for b. >> When using the default value for VLAN_DEF_HWIDTH (4), the >> VLAN hash-list table >> expands from 16 chains to 32 chains as the 129th entry is added. >> trunk->hwidth >> becomes 5. Say a few more entries are added and there are now >> 135 entries. >> trunk-hwidth will still be 5. If an entry is removed, >> vlan_remhash() will >> calculate a value of 32 for b. refcnt will be decremented to >> 134. The if >> comparison at line 473 will return true and vlan_growhash() will >> be called. The >> VLAN hash-list table will be compressed from 32 chains wide to >> 16 chains wide. >> hwidth will become 4. This is an error, and it can be seen when >> a new VLAN is >> added. The table will again be expanded. If an entry is then >> removed, again >> the table is contracted. >> If the number of VLANS stays in the range of 128-512, each >> time an insert >> follows a remove, the table will expand. Each time a remove >> follows an >> insert, the table will be contracted. >> The fix is simple. The line 473 should test that the number >> of entries has >> decreased such that the table should be contracted using what >> would be the new >> value of hwidth. line 467 should be: >> b = 1 << (trunk->hwidth - 1); >> PR: 265382 >> Reviewed by: kp >> MFC after: 2 weeks >> Sponsored by: NetApp, Inc. > > This does mean that there are some odd edge cases (e.g. if you remove > the 513th entry > only to turn around and re-add it) that can still thrash even with > this fixed. Not > sure if those are worth worrying about though? If they were, perhaps > you could > imagine some sort of "dampener" where you have to remove some number N > of entries > to actually rehash to a smaller table. But that is probably rather > complex to > implement and probably not worth it? > Good point, although I think I agree it’s probably not worth worrying too much about this. There’s a tangentially related issue I also want to look at one of these days, namely that the current `trunk->refcnt > (b * b) / 2` check results in far too many entries in each hash row. For example, until 128 vlan interfaces we stick with width = 4, so end up with an average of 8 entries in each row. That means that for every inbound packet we scan through a linked list of 8 entries to find the correct vlan interface. We should tweak the check to reduce this, as all it’ll cost is a tiny bit of memory. (Or we should just default VLAN_ARRAY on, paying a bit more memory for no linked-list scanning). Kristof