svn commit: r270759 - in head/sys: cddl/compat/opensolaris/kern cddl/compat/opensolaris/sys cddl/contrib/opensolaris/uts/common/fs/zfs vm
Peter Wemm
peter at wemm.org
Sat Aug 30 17:37:53 UTC 2014
On Saturday 30 August 2014 02:03:42 Steven Hartland wrote:
> ----- Original Message -----
> From: "Peter Wemm" <peter at wemm.org>
>
> > On Friday 29 August 2014 21:42:15 Steven Hartland wrote:
>
> > If this function returns non-zerp, ARC is given back:
> >
> > static int
> > arc_reclaim_needed(void)
> > {
> >
> > if (kmem_free_count() < zfs_arc_free_target) {
> >
> > return (1);
> >
> > }
> >
> > /*
> > * Cooperate with pagedaemon when it's time for it to scan
> > * and reclaim some pages.
> > */
> >
> > if (vm_paging_needed()) {
> >
> > return (1);
> >
> > }
> >
> > ie: if v_free (ignoring v_cache free pages) gets below the threshold,
> > stop
> > evertyhing and discard ARC pages.
> >
> > The vm_paging_needed() code is a NO-OP at this point. It can never
> > return
> >
> > true. Consider:
> > vm_cnt.v_free_target = 4 * vm_cnt.v_free_min +
> >
> > vm_cnt.v_free_reserved;
> > vs
> >
> > vm_pageout_wakeup_thresh = (vm_cnt.v_free_min / 10) * 11;
> >
> > zfs_arc_free_target defaults to vm_cnt.v_free_target, which is 400% of
> > v_free_min, and compares it against the smaller v_free pool.
> >
> > vm_paging_needed() compares the total free pool (v_free + v_cache)
> > against the
> > smaller wakeup threshold - 110% of v_free_min.
> >
> > Comparing a larger value against a smaller target than the previous
> > test will
> > never succeed unless you manually change the arc_free_target sysctl.
>
> I'm aware of the values involved, and as I said what you're proposing
> was more akin to where I started, but I was informed that it had already
> been tested and didn't work well.
And Karl also said that his tests are on machines that have no v_cache, so
he's not testing the scenario.
The code, as written, is wrong. It's as simple as that.
The logic is wrong.
You've introduced dead code.
Your code changes introduce a scenario that CAUSES one of the very problems
you're using as a justtification for the changes.
Your own testers have admitted that they don't test the scenario that the
problem exists with.
> > Also, what about the magic numbers here:
> > u_int zfs_arc_free_target = (1 << 19); /* default before pagedaemon
> > init only */
>
> That is just a total fall back case and should never be triggered unless
> as the comment states the pagedaemon isn't initialised.
>
> > That's half a million pages, or 2GB of physical ram on a 4K page size
> > system
> > How is this going to work on early boot in the machines in the cluster
> > with
> > less than 2GB of ram?
>
> Its there to ensure that ARC doesn't run wild ARC for the few
> milliseconds
> / seconds before pagedaemon is initalised.
>
> We can change the value no problem, what would you suggest 1<<16 aka
> 256MB?
Please stop picking magic numbers out of thin air. You are working with file
system and VM - critical parts of the system. This is NOT the place to be
screwing around with things you don't understand. alc@ was trying to be
polite.
> Thanks for all the feedback, its great to have my understanding of
> how things work in this area confirmed by those who know.
>
> Hopefully we'll be able to get to the bottom of this with everyones
> help and get a solid fix for these issues that have plaged 10 into
> 10.1 :)
I'm very disappointed in the attention to detail and errors in the commit.
I'm almost at the point where I want to ask for the whole thing to be backed
out.
--
Peter Wemm - peter at wemm.org; peter at FreeBSD.org; peter at yahoo-inc.com; KI6FJV
UTF-8: for when a ' or ... just won\342\200\231t do\342\200\246
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 473 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freebsd.org/pipermail/svn-src-all/attachments/20140830/22b5fa92/attachment.sig>
More information about the svn-src-all
mailing list