NFS server bottlenecks
Ivan Voras
ivoras at freebsd.org
Mon Oct 15 21:58:28 UTC 2012
On 15 October 2012 22:58, Rick Macklem <rmacklem at uoguelph.ca> wrote:
> The problem is that UDP entries very seldom time out (unless the
> NFS server is seeing hardly any load) and are mostly trimmed
> because the size exceeds the highwater mark.
>
> With your code, it will clear out all of the entries in the first
> hash buckets that aren't currently busy, until the total count
> drops below the high water mark. (If you monitor a busy server
> with "nfsstat -e -s", you'll see the cache never goes below the
> high water mark, which is 500 by default.) This would delete
> entries of fairly recent requests.
You are right about that. If testing by Nikolay goes reasonably well,
I'll work on it.
> If you are going to replace the global LRU list with ones for
> each hash bucket, then you'll have to compare the time stamps
> on the least recently used entries of all the hash buckets and
> then delete those. If you keep the timestamp of the least recent
> one for that hash bucket in the hash bucket head, you could at least
> use that to select which bucket to delete from next, but you'll still
> need to:
> - lock that hash bucket
> - delete a few entries from that bucket's lru list
> - unlock hash bucket
> - repeat for various buckets until the count is below the high
> water mark
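
The quoted steps can be sketched in user-space C roughly as follows. This
is only an illustration under stated assumptions, not the nfsd code: the
names (struct bucket, cache_add, cache_trim, the "oldest" field, the batch
parameter) are hypothetical, pthread mutexes stand in for kernel mutexes,
and the global count is a plain int where a real server would need an
atomic counter or a separate lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/queue.h>

#define NBUCKETS 4

struct entry {
	TAILQ_ENTRY(entry) lru;		/* per-bucket LRU linkage */
	unsigned long stamp;		/* time of last use */
};
TAILQ_HEAD(lruhead, entry);

struct bucket {
	pthread_mutex_t mtx;
	struct lruhead lru;		/* head is the least recently used */
	unsigned long oldest;		/* stamp of the LRU head, if any */
};

static struct bucket table[NBUCKETS];
static int total;			/* assumption: not atomic in this sketch */

static void
cache_init(void)
{
	for (int i = 0; i < NBUCKETS; i++) {
		pthread_mutex_init(&table[i].mtx, NULL);
		TAILQ_INIT(&table[i].lru);
	}
	total = 0;
}

/* Append an entry as most recently used; stamps must not decrease. */
static void
cache_add(int b, unsigned long stamp)
{
	struct entry *e = malloc(sizeof(*e));

	assert(e != NULL && b >= 0 && b < NBUCKETS);
	e->stamp = stamp;
	pthread_mutex_lock(&table[b].mtx);
	if (TAILQ_EMPTY(&table[b].lru))
		table[b].oldest = stamp;
	TAILQ_INSERT_TAIL(&table[b].lru, e, lru);
	total++;
	pthread_mutex_unlock(&table[b].mtx);
}

/* Trim until the count is no longer above highwater. */
static void
cache_trim(int highwater, int batch)
{
	assert(batch > 0);
	while (total > highwater) {
		struct bucket *bp;
		struct entry *e;
		int best = -1;

		/* Unlocked peek at the bucket heads: only a hint. */
		for (int i = 0; i < NBUCKETS; i++) {
			if (TAILQ_EMPTY(&table[i].lru))
				continue;
			if (best < 0 || table[i].oldest < table[best].oldest)
				best = i;
		}
		if (best < 0)
			break;
		bp = &table[best];
		pthread_mutex_lock(&bp->mtx);
		for (int n = 0; n < batch && total > highwater; n++) {
			if ((e = TAILQ_FIRST(&bp->lru)) == NULL)
				break;
			TAILQ_REMOVE(&bp->lru, e, lru);
			free(e);
			total--;
		}
		if ((e = TAILQ_FIRST(&bp->lru)) != NULL)
			bp->oldest = e->stamp;
		pthread_mutex_unlock(&bp->mtx);
	}
}

static int
cache_count(void)
{
	return (total);
}

/* Stamp of the bucket's LRU head, or 0 if the bucket is empty. */
static unsigned long
bucket_oldest(int b)
{
	return (TAILQ_EMPTY(&table[b].lru) ? 0 : table[b].oldest);
}
```

With batch set to 1 this removes entries in global least-recently-used
order; a larger batch takes fewer lock round trips but may delete an entry
that is newer than the oldest entry of some other bucket.
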
Ah, I think I get it: is the reliance on the high watermark as a
criterion for cache expiry the reason the list is an LRU instead of an
ordinary unordered list?
> Or something like that. I think you'll find it a lot more work than
> one LRU list and one mutex. Remember that mutex isn't held for long.
It could be, but the current state of my code is just groundwork for
the next things I have planned:
1) Move the expiry code (the trim function) into a separate thread,
run periodically (or as a callout, I'll need to talk with someone
about which one is cheaper)
2) Replace the mutex with a rwlock. The only thing which is preventing
me from doing this right away is the LRU list, since each read access
modifies it (and requires a write lock). This is why I was asking you
if we can do away with the LRU algorithm.
> Btw, the code looks very nice. (If I was being a style(9) zealot,
> I'd remind you that it likes "return (X);" and not "return X;".)
Thanks, I'll make it more style(9) compliant as I go along.