svn commit: r317061 - in head: libexec/rpc.rstatd sys/amd64/amd64 sys/amd64/include sys/arm/arm sys/arm/include sys/arm64/include sys/cddl/contrib/opensolaris/uts/common/fs/zfs sys/compat/linprocfs...
Konstantin Belousov
kostikbel at gmail.com
Tue May 2 12:17:18 UTC 2017
On Tue, May 02, 2017 at 09:25:40PM +1000, Bruce Evans wrote:
> On Tue, 2 May 2017, Konstantin Belousov wrote:
>
> > On Mon, May 01, 2017 at 06:25:11PM +1000, Bruce Evans wrote:
> >> XX Index: subr_counter.c
> >> XX ===================================================================
> >> XX --- subr_counter.c (revision 317604)
> >> XX +++ subr_counter.c (working copy)
> >> XX @@ -78,11 +78,15 @@
> >> XX sysctl_handle_counter_u64(SYSCTL_HANDLER_ARGS)
> >> XX {
> >> XX uint64_t out;
> >> XX + uint32_t out32;
> >> XX int error;
> >> XX
> >> XX out = counter_u64_fetch(*(counter_u64_t *)arg1);
> >> XX
> >> XX - error = SYSCTL_OUT(req, &out, sizeof(uint64_t));
> >> XX + if (req->oldlen == 4 && (out32 = out) == out)
> >> XX + error = SYSCTL_OUT(req, &out32, sizeof(out32));
> >> XX + else
> >> XX + error = SYSCTL_OUT(req, &out, sizeof(out));
> >> XX
> >> XX if (error || !req->newptr)
> >> XX return (error);
> >>
> >> This does the minimum necessary to fix the current problem.
> >>
> >> This works until the counters become too large for a u_int. There
> >> could be an option to get truncation with no error, but that would
> >> require an API change, so applications should be required to do the
> >> truncation for themself if that is what they want.
> >
> > Yes, this is approximately what I consider to be enough fix, but with
> > some details which I do not like and which did not worked well on my
> > test machine with enough memory to generate lot of increments.
> >
> > Below is what I consider to be good enough. I will proceed with this
> > version unless very strong objections appear.
>
> I prefer my version. It is the same as for sysctl_bufspace(), except
> for simplifications and bug fixes that are easier because the types
> are unsigned.
>
> Your version is especially broken for the case of a lot of increments.
You consider the 'differentiators' as the main users, but I only want
systat and top to not abort, I have no intent of making them work
for updated counters if overflown. Instead, I want something human-visible
so that person looking at raw counters values using old tools detected the
failure.
>
> Clamping breaks many cases in systat -v which just needs unsigned counters
> to calculate small differences. Not many counters overflow faster than
> the refresh interval in systat or top. These cases used to appear to
> work perfectly for most counters when the kernel did blind truncations.
> Clamping breaks the API. Not returing ENOMEM breaks the API.
>
> In my version, this is handled as specified by the API by truncating and
> returning ENOMEM. Then:
> - top sees the ENOMEM and mishandles it by aborting
> - systat sees the ENOMEM (and also sees a short count, if it has the newer
> breakage of hard-coding 64-bit counters instead of u_int ones, and is
> run on a kernel with u_int counters) and mishandles it better by
> printing an error message but not aborting. Thus it appears to work
> just as well as when the kernel did blind truncation, except for the
> error message.
ENOMEM is, of course, the situation which I want to avoid.
>
> Clamping would be difficult for the application to handle even if an
> error is returned. This would be like the situation for clamping for
> strtoul(), but worse since it changes the API. Many applications don't
> understand how to handle errno for strtoul(), but they do do a bogus
> check of the clamped value. For sysctl(), clamping is not part of
> the API now, so applications shouldn't check for the clamped values.
> Ones that work with deltas like systat -v would see the counters stop
> incrementing when the clamped value is reached, while truncated values
> allow them to continue to appear to work perfectly.
>
> > diff --git a/sys/vm/vm_meter.c b/sys/vm/vm_meter.c
> > index 5f4cd46ab1e..93d5161de89 100644
> > --- a/sys/vm/vm_meter.c
> > +++ b/sys/vm/vm_meter.c
> > @@ -266,8 +266,29 @@ static SYSCTL_NODE(_vm_stats, OID_AUTO, vm, CTLFLAG_RW, 0,
> > "VM meter vm stats");
> > SYSCTL_NODE(_vm_stats, OID_AUTO, misc, CTLFLAG_RW, 0, "VM meter misc stats");
> >
> > +static int
> > +sysctl_handle_vmstat(SYSCTL_HANDLER_ARGS)
> > +{
> > + uint64_t out;
> > +#ifdef COMPAT_FREEBSD11
> > + uint32_t out1;
> > +#endif
>
> The extra wrapper and the ifdef limit the damage from the API change.
> Otherwise, I don't like them.
>
> I prefer my variable name out32 to the nondescript out1. ('out' should
> also be renamed out64, but I wanted to keep the diffs small.)
Changed to the out32 name.
>
> > +
> > + out = counter_u64_fetch(*(counter_u64_t *)arg1);
> > +#ifdef COMPAT_FREEBSD11
> > + if (req->oldlen == sizeof(uint32_t)) {
>
> sizeof(uint32_t) is an obfuscated spelling of '4'. The size is already
> hard-coded in '32'. This depends on sizeof(char) being 1. It is, and
> won't change. POSIX started specifying this in 2001, and the mere
> existence of uint8_t probably requires 8-bit chars since u_char must
> be the smallest type of an object. This depends on the size of a u_int
> being 32 bits/4 bytes in the old ABI. This won't change, and is already
> hard-coded as 32, as it must be because sizeof(u_int) may change.
I prefer the sizeof over 4.
>
> > + out1 = 0xffffffff;
> > + if (out < out1)
> > + out1 = out;
>
> This does the clamping. Clamping is probably better expressed as:
>
> out1 = MIN(out, 0xffffffff)
Changed as well.
>
> > + return (SYSCTL_OUT(req, &out1, sizeof(uint32_t)));
>
> Here sizeof(uint32_t) is an obfuscated spelling of sizeof(out1). I think
> the sizeof() is justified in this case, to limit the hard-coding to of
> sizes to 1 place. Truncation by assigment to the destination variable also
> helps limit this
> (sysctl_bufspace() could also be improved by truncation by assignment.
> Its integers are signed, so the result is implementation-defined, but
> we know that it is benign 2's complement and don't really care what it
> is provided it is reasonable. This is better than laboriously checking
> the lower limit or neglecting to check it).
> Here the limit 0xffffffff also depends on the size.
Yes, there is no UINT32_MAX, and never will.
>
> > + }
> > +#endif
> > + return (SYSCTL_OUT(req, &out, sizeof(uint64_t)));
> > +}
> > +
> > #define VM_STATS(parent, var, descr) \
> > - SYSCTL_COUNTER_U64(parent, OID_AUTO, var, CTLFLAG_RD, &vm_cnt.var, descr)
> > + SYSCTL_OID(parent, OID_AUTO, var, CTLTYPE_U64 | CTLFLAG_MPSAFE | \
> > + CTLFLAG_RD, &vm_cnt.var, 0, sysctl_handle_vmstat, "QU", descr);
>
> I see a new problem, shared with my version. The sysctl data says that this
> sysctl is pure 64 bits, but it can return 32 bits. The sysctl data just
> can't describe the variation. I guess this is not much a problem, since it
> takes magic to read the sysctl data from the kernel and not many programs
> except sysctl know how to do it. There programs will see the pure 64
> bits and only attempt to use 64 bit types.
It can return 32bit only on improper use, which we allow for ABI compat.
So I do not see this as a problem either.
diff --git a/sys/vm/vm_meter.c b/sys/vm/vm_meter.c
index 5f4cd46ab1e..67f24609b5a 100644
--- a/sys/vm/vm_meter.c
+++ b/sys/vm/vm_meter.c
@@ -266,8 +266,27 @@ static SYSCTL_NODE(_vm_stats, OID_AUTO, vm, CTLFLAG_RW, 0,
"VM meter vm stats");
SYSCTL_NODE(_vm_stats, OID_AUTO, misc, CTLFLAG_RW, 0, "VM meter misc stats");
+static int
+sysctl_handle_vmstat(SYSCTL_HANDLER_ARGS)
+{
+ uint64_t out;
+#ifdef COMPAT_FREEBSD11
+ uint32_t out32;
+#endif
+
+ out = counter_u64_fetch(*(counter_u64_t *)arg1);
+#ifdef COMPAT_FREEBSD11
+ if (req->oldlen == sizeof(uint32_t)) {
+ out32 = MIN(out, 0xffffffff);
+ return (SYSCTL_OUT(req, &out32, sizeof(uint32_t)));
+ }
+#endif
+ return (SYSCTL_OUT(req, &out, sizeof(uint64_t)));
+}
+
#define VM_STATS(parent, var, descr) \
- SYSCTL_COUNTER_U64(parent, OID_AUTO, var, CTLFLAG_RD, &vm_cnt.var, descr)
+ SYSCTL_OID(parent, OID_AUTO, var, CTLTYPE_U64 | CTLFLAG_MPSAFE | \
+ CTLFLAG_RD, &vm_cnt.var, 0, sysctl_handle_vmstat, "QU", descr);
#define VM_STATS_VM(var, descr) VM_STATS(_vm_stats_vm, var, descr)
#define VM_STATS_SYS(var, descr) VM_STATS(_vm_stats_sys, var, descr)
More information about the svn-src-head
mailing list