Terrible NFS performance under 9.2-RELEASE?

Rick Macklem rmacklem at uoguelph.ca
Tue Jan 28 14:55:59 UTC 2014


J David wrote:
> Another way to test this is to instrument the virtio driver, which
> turned out to be very straightforward:
> 
> Index: if_vtnet.c
> ===================================================================
> --- if_vtnet.c	(revision 260701)
> +++ if_vtnet.c	(working copy)
> @@ -1886,6 +1887,7 @@
>  	return (virtqueue_enqueue(vq, txhdr, &sg, sg.sg_nseg, 0));
>  
>  fail:
> +	sc->vtnet_stats.tx_excess_mbuf_drop++;
>  	m_freem(*m_head);
>  	*m_head = NULL;
>  
> @@ -2645,6 +2647,9 @@
>  	SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "tx_task_rescheduled",
>  	    CTLFLAG_RD, &stats->tx_task_rescheduled,
>  	    "Times the transmit interrupt task rescheduled itself");
> +	SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "tx_excess_mbuf_drop",
> +	    CTLFLAG_RD, &stats->tx_excess_mbuf_drop,
> +	    "Times packets were dropped due to excess mbufs");
>  }
>  
>  static int
> 
> Index: if_vtnetvar.h
> ===================================================================
> --- if_vtnetvar.h	(revision 260701)
> +++ if_vtnetvar.h	(working copy)
> @@ -48,6 +48,7 @@
>  	unsigned long tx_csum_bad_ethtype;
>  	unsigned long tx_tso_bad_ethtype;
>  	unsigned long tx_task_rescheduled;
> +	unsigned long tx_excess_mbuf_drop;
>  };
>  
>  struct vtnet_softc {
> 
> This patch shouldn't hurt performance, since the counter increment is
> only reached on the failure path; if things are working, it never gets
> hit.
> 
> With this change, I re-ran some 64k tests.  I found that the number of
> drops was very small, but not zero.
> 
> On the client, doing the write-append test (which has no reads), the
> counter slowly builds up to 8, with what appears to be some sort of
> backoff (each drop takes longer to appear than the last):
> 
> $ sysctl dev.vtnet.1.tx_excess_mbuf_drop
> dev.vtnet.1.tx_excess_mbuf_drop: 8
> 
> But after 8, it appears congestion control has clamped down so hard
> that no more happen.
> 
> Once read activity starts, the same counter on the server climbs
> higher:
> 
> dev.vtnet.1.tx_excess_mbuf_drop: 53
> 
> So while there aren't many of these, they definitely do exist, and
> there's no way they're good for performance.
> 
It would be nice to also count the number of times m_collapse() gets
called, since that generates a lot of overhead that I think will show
up in your test, given that you don't have any disk activity.
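
If you want to count them, a minimal sketch along the same lines as your
patch might look like this (the counter name tx_collapsed is made up, and
the exact call site depends on where the driver falls back to m_collapse()
after a failed enqueue):

	/* in struct vtnet_statistics (if_vtnetvar.h): */
	unsigned long tx_collapsed;

	/* in if_vtnet.c, wherever the transmit path retries with
	 * m_collapse() after running out of sg segments (max_segs is
	 * a stand-in for the driver's actual segment limit): */
	sc->vtnet_stats.tx_collapsed++;
	m = m_collapse(m, M_NOWAIT, max_segs);

	/* and a matching sysctl next to the existing ones: */
	SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "tx_collapsed",
	    CTLFLAG_RD, &stats->tx_collapsed,
	    "Times m_collapse() was called on an outgoing packet");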

And I'd state that having any of these is near-disastrous for performance,
since each one means a timeout/retransmit of a TCP segment. For a LAN
environment, I would consider even 1 timeout/retransmit per million
packets to be a lot.
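
You can cross-check that against the TCP statistics on either end while
the test runs; something like

	$ netstat -s -p tcp | grep -i retrans

should show the retransmit counters climbing in step with the drops, if
that is what's happening.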

rick
ps: I've cc'd Bryan, since he's the guy handling virtio, I think.

> Thanks!