mbuf doubts

Tue Sep 23 20:54:05 PDT 2003

I'm not an expert on all BSD-derived stacks and the way mbufs are 
defined and used in each, but:

On Tuesday, September 23, 2003, at 07:12 PM, Giovanni P. Tirloni wrote:

>  struct mbuf *m;
>
>   1. Normal mbuf using m->M_databuf

     M_databuf is the beginning of the data area in a plain mbuf, one
     with neither M_PKTHDR nor M_EXT set.

>   2. Normal mbuf with external storage (cluster?) in m->m_hdr->mh_data

     mh_data *always* points to the beginning of valid data or available
     space.  The M_EXT bit indicates whether mh_data points into the
     external storage or into the mbuf's own data area (the one beginning
     at M_databuf, or at m_pktdat in a header mbuf).

>   3. Header mbuf using m->m_pktdat;

     This is used to access the data in an mbuf when the M_PKTHDR bit is
     set in the m_flags word.  It exists because part of the space in this
     lead mbuf is taken up by local information pertaining to the packet
     and its handling, so the data area starts later than M_databuf.
     I'm not entirely clear on how that information is used.

>   4. Header mbuf with ext. storage (cluster?) in m->m_ext->ext_buf

     This points to the external storage buffer.  It can be a cluster, or
     it can be some other data area.  I believe the distinction is made
     based on the ext_free field in the m_ext structure: if non-null, it
     points to a routine to free the data, and thus the external storage
     is *not* a cluster.
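
To make the four cases above concrete, here is a simplified sketch of the
layout, loosely modeled on the 4.4BSD-era definitions.  It is not copied from
any real header: the toy_ names, the TOY_MSIZE value of 256, the flag values,
and the helper toy_data_start() are all assumptions made for illustration,
and real stacks differ in the details.

```c
#include <stddef.h>

#define TOY_MSIZE    256        /* assumed total size of one mbuf */
#define TOY_M_EXT    0x1        /* assumed flag value: external storage in use */
#define TOY_M_PKTHDR 0x2        /* assumed flag value: first mbuf of a packet */

struct toy_mbuf;

struct toy_m_hdr {              /* present in every mbuf */
    struct toy_mbuf *mh_next;
    struct toy_mbuf *mh_nextpkt;
    char            *mh_data;   /* start of valid data or free space */
    int              mh_len;
    short            mh_type;
    short            mh_flags;
};

struct toy_pkthdr {             /* per-packet info, only with M_PKTHDR */
    void *rcvif;
    int   len;
};

struct toy_m_ext {              /* describes external storage, only with M_EXT */
    char     *ext_buf;
    void    (*ext_free)(char *, unsigned);
    unsigned  ext_size;
};

#define TOY_MLEN  (TOY_MSIZE - sizeof(struct toy_m_hdr))
#define TOY_MHLEN (TOY_MLEN - sizeof(struct toy_pkthdr))

struct toy_mbuf {
    struct toy_m_hdr m_hdr;
    union {
        struct {
            struct toy_pkthdr MH_pkthdr;
            union {
                struct toy_m_ext MH_ext;            /* cases 2 and 4 */
                char MH_databuf[TOY_MHLEN];         /* case 3 (m_pktdat) */
            } MH_dat;
        } MH;
        char M_databuf[TOY_MLEN];                   /* case 1 */
    } M_dat;
};

/* Where the storage for this mbuf's data begins, for each of the cases. */
static char *toy_data_start(struct toy_mbuf *m)
{
    if (m->m_hdr.mh_flags & TOY_M_EXT)
        return m->M_dat.MH.MH_dat.MH_ext.ext_buf;
    if (m->m_hdr.mh_flags & TOY_M_PKTHDR)
        return m->M_dat.MH.MH_dat.MH_databuf;
    return m->M_dat.M_databuf;
}
```

Because everything after m_hdr is a union, the internal data area and the
m_ext descriptor share the same bytes, which is why the internal area goes
unused once M_EXT is set.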

>  Other questions:
>   1. When using ext. storage is the space allocated by M_databuf 
> wasted?

Yes.

>   2. How the system decides 256 bytes for each mbuf isn't enough and it
>      needs a mbuf cluster? Isn't chaining useful there?

There is a constant (MINCLSIZE) that the system uses to decide when to 
allocate a cluster, and when to use a chain of normal mbufs.  If the 
size is greater than MINCLSIZE, it opts for a cluster.

Note that you can sometimes notice the effect of MINCLSIZE on the
performance of both the system and the network, so the choice of this
value can be important.  It is normally set so that the system switches
to a cluster once the data no longer fits in two mbufs.
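
As a rough sketch of that decision (not the code of any particular stack),
the test boils down to a single comparison.  The SK_ names and the sizes
below are assumptions; the threshold is taken to be the data capacity of one
header mbuf plus one plain mbuf, and whether the comparison is strict varies
between implementations.

```c
#include <stddef.h>

enum {
    SK_MLEN      = 224,                 /* assumed data bytes in a plain mbuf */
    SK_MHLEN     = 208,                 /* assumed data bytes in a header mbuf */
    SK_MINCLSIZE = SK_MHLEN + SK_MLEN   /* what two chained mbufs can hold */
};

enum sk_storage { SK_USE_MBUFS, SK_USE_CLUSTER };

/* Use a cluster only when the data will not fit in two ordinary mbufs. */
static enum sk_storage sk_choose_storage(size_t len)
{
    return len > (size_t)SK_MINCLSIZE ? SK_USE_CLUSTER : SK_USE_MBUFS;
}
```

With these assumed sizes, a 100-byte write stays in ordinary mbufs, while a
1460-byte TCP segment goes to a cluster.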

>   3. How does changing MSIZE affects the whole thing?

Significantly :-}.  This is a gnarly subject.  You have to balance 
wasted space, time, and other subtle details (typical packet sizes vs. 
mbuf size; time spent dealing with chains vs. time spent dealing with 
clusters; ...).  At one point, for example, packet sizes on the 
internet were strongly "bi-modal" (small packets for telnet; max-sized 
packets for ftp).  More recently, I suspect that this has changed, but 
I don't know what the distribution looks like now.
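
One way to get a feel for the trade-off is to count how many mbufs a packet
needs and how many allocated data bytes go unused for a given MSIZE, if no
clusters are used at all.  The function below is only a back-of-the-envelope
sketch: ch_chain_cost() and its parameters are hypothetical, and the header
sizes you pass in depend on the platform.

```c
#include <stddef.h>

struct ch_cost {
    size_t mbufs;   /* number of mbufs in the chain */
    size_t slack;   /* allocated data bytes left unused */
};

/*
 * Cost of holding pkt_len bytes in a chain of plain mbufs.
 * The first mbuf also carries the packet header, so it holds less.
 */
static struct ch_cost ch_chain_cost(size_t pkt_len, size_t msize,
                                    size_t hdr_len, size_t pkthdr_len)
{
    struct ch_cost c = { 0, 0 };
    size_t first, rest, remaining, extra;

    if (msize <= hdr_len + pkthdr_len)
        return c;                       /* no room for data at all */

    first = msize - hdr_len - pkthdr_len;
    rest  = msize - hdr_len;

    c.mbufs = 1;
    if (pkt_len <= first) {
        c.slack = first - pkt_len;
        return c;
    }

    remaining = pkt_len - first;
    extra = (remaining + rest - 1) / rest;  /* round up */
    c.mbufs += extra;
    c.slack = extra * rest - remaining;
    return c;
}
```

Assuming a 32-byte mbuf header and a 16-byte packet header, a 64-byte packet
fits in one mbuf either way but wastes 144 bytes with MSIZE 256 against 16
with MSIZE 128, while a 1460-byte packet needs 7 mbufs with MSIZE 256 and 16
with MSIZE 128.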

>   4. What about MCLBYTES?

Same set of issues.  AIX, for example, has a "power-of-2" collection of
mbuf pools, and tries to allocate from the best pool for the requested
size, bumping up at most two levels to fill empty pools.  Other
BSD-derived stacks stick with a single cluster size, generally 2048
bytes; this makes jumbo Ethernet packets kind of expensive, since each
one has to be spread over several clusters.
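
The power-of-2 idea can be sketched as follows.  This is not AIX code: the
pw_ names, the pool sizes (32 to 4096 bytes), and my reading of "bumping up"
as "fall back to at most two larger pools when the best-fit pool is empty"
are all assumptions.

```c
#include <stddef.h>

#define PW_NPOOLS    8   /* assumed: pools of 32, 64, ..., 4096 bytes */
#define PW_MIN_SHIFT 5   /* assumed: smallest buffer is 1 << 5 bytes */
#define PW_MAX_BUMP  2   /* look at most two pools above the best fit */

struct pw_pool {
    size_t   buf_size;   /* size of each buffer in this pool */
    unsigned free_count; /* buffers currently available */
};

static void pw_init(struct pw_pool *pools, unsigned free_each)
{
    int i;

    for (i = 0; i < PW_NPOOLS; i++) {
        pools[i].buf_size = (size_t)1 << (PW_MIN_SHIFT + i);
        pools[i].free_count = free_each;
    }
}

/* Index of the smallest pool whose buffers can hold len bytes, or -1. */
static int pw_best_fit(const struct pw_pool *pools, size_t len)
{
    int i;

    for (i = 0; i < PW_NPOOLS; i++)
        if (pools[i].buf_size >= len)
            return i;
    return -1;
}

/* Best-fit pool if it has a free buffer, else up to two pools larger. */
static int pw_pick(const struct pw_pool *pools, size_t len)
{
    int best = pw_best_fit(pools, len);
    int i;

    if (best < 0)
        return -1;
    for (i = best; i < PW_NPOOLS && i <= best + PW_MAX_BUMP; i++)
        if (pools[i].free_count > 0)
            return i;
    return -1;
}
```

Capping the fallback at two levels bounds the waste: under these assumptions
a request never lands in a buffer more than four times the best-fit size.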

Check out Wright/Stevens, "TCP/IP Illustrated, V.2", Addison-Wesley, 
1995.  Ch. 2 is a fairly in-depth discussion of the above.  It deals 
with a long-dead version of BSD, but the fundamentals have not changed 
that much.  In addition, the book is a very well-done code walkthrough 
of the networking code in BSD (again, from long ago, but the "bones" 
are good).

Regards,

Justin

--
Justin C. Walker, Curmudgeon-At-Large  *
Institute for General Semantics        | It's not whether you win or lose...
                                       |  It's whether *I* win or lose.
*--------------------------------------*-------------------------------*