Socket option to configure Ethernet PCP / CoS per-flow

Matthew Grooms mgrooms at shrew.net
Fri Sep 11 22:19:42 UTC 2020


On 9/11/2020 12:15 PM, Scheffenegger, Richard wrote:
> Thank you for the quick feedback.
>
> On a related note - it just occurred to me, that the PCP functionality could be extended to make more effective use of PFC (priority flow control) without explicitly managing it on an application level directly.
>
> Right now, PFC typically degenerates to good-old Flow control, as all traffic is handled just in the default class (0, or whatever is set up using the IOCTL interface API).
>
> Typically, the different Ethernet classes come with a notion of prioritization between them - traffic in a "higher" class may be forwarded prior to traffic in a lower class. But that is not a strong requirement - using WRR with 1/8th bandwidth "reserved" for each class in a switch, assigning flows to a random PCP value, PFC could work in a more scalable fashion - only blocking a fraction of traffic, that is actually queue building (has to go over a lower bandwidth link, or a NIC excessively pausing its ingress), thus reducing the chance of the formation of congestion trees...
>
> E.g. PCP runs from 0 (default) to 7;
>
> Adding a socket option to explicitly assign traffic to one of these flows would allow testing and configuring applications to make use of "real" prioritization capabilities of modern switches.
>
> And what I was just pondering was a special interface level setting (e.g. 8), which results in a socket to pick a "random" value when created, to distribute packets across all the queues available in hardware, allowing PFC to no longer collapse in effect to old FC style "on"/"off" for all traffic...
>
> Perhaps someone here has experience with congestion tree formation in multi-hop switching environments, and can comment if the above approach would be feasible to address that FC issue?
>
>
> Richard Scheffenegger

Hey There Richard,

I live in Austin where we are fortunate enough to have Google Fiber. And
while I love the service, I hate the idea of being forced to use the
Google Fiber black box as my edge device. But to get full use of the
service, you have to set VLAN + PCP values appropriately or you hit a
Google-imposed traffic shaping bottleneck. In any case, I was able to do
this using pf as the packet classifier. You simply write a rule to match
the traffic and assign the desired value. Perhaps this may be a way to
accomplish what you're trying to do without having to add a new socket
option. Have a look at the pf.conf man page and search for 'set prio'. I
assume ipfw has an equivalent feature as well ...

      set prio priority | (priority, priority)
            Packets matching this rule will be assigned a specific queueing
            priority.  Priorities are assigned as integers 0 through 7.  If the
            packet is transmitted on a vlan(4) interface, the queueing priority
            will be written as the priority code point in the 802.1Q VLAN
            header.  If two priorities are given, packets which have a TOS of
            lowdelay and TCP ACKs with no data payload will be assigned to the
            second one.
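
As a rough illustration, a minimal pf.conf sketch of such rules might look
like the following. The interface name (vlan2) and the priority values
(3 and 5) are placeholders chosen for the example, not values taken from
any particular provider's requirements, so adjust them to your setup.

      # Assumed vlan(4) interface name; replace with your own.
      ext_if = "vlan2"

      # Assign PCP 3 to all outbound traffic on the VLAN interface.
      pass out on $ext_if all set prio 3

      # Two-priority form for TCP: packets with a TOS of lowdelay and
      # empty TCP ACKs get priority 5, all other TCP gets priority 3.
      pass out on $ext_if inet proto tcp all set prio (3, 5)

Since pf uses the last matching rule, TCP traffic ends up governed by the
second rule while everything else keeps the single priority from the first.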

Hope this helps,

-Matthew



More information about the freebsd-net mailing list