Question about bridging code

Wed Jul 9 12:23:53 PDT 2003

Hi guys,

My first attempts at hacking FreeBSD kernel code have not been very fruitful, so
I'm hoping someone with more experience and know-how might be able to point out
the mistakes that I'm making.

Firstly, let me explain what I'm trying to do. I'm currently working on a
university project that performs some type of transformation (compression,
security, string replacement, etc.) on packets as they pass through the system.
The current setup has the FreeBSD machine configured as a router, and the
transformation is performed on packets that are routed. This is done via divert
sockets, and everything is fine and dandy: we're getting great results from
this setup.

However, what we want to do next is to have the machine set up as an Ethernet
bridge instead, with the transformation performed on the bridged packets.
Unfortunately, as most of you probably know, divert sockets do not work with
bridges as of yet.

So I've been trying to add somewhat hack-ish support for divert sockets over
bridges. The concession that I'm making is that instead of diverting IP
packets, I'll be diverting Ethernet frames. In userspace, my program will
reattach the Ethernet header to the packet before passing it back to the
divert socket. A second concession is that when I call sendto on the divert
socket, the sin_zero in the sockaddr must contain the source network adaptor
name. Both concessions are necessary (I think), as I would otherwise not know
how to output the data in an IP-less bridge.
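
To illustrate the second concession, here is a minimal userspace sketch of
storing the adaptor name in sin_zero before the sendto call. The helper name
set_divert_ifname is hypothetical (it is not part of any existing API), and the
sketch assumes the name plus a terminating NUL has to fit in the 8 bytes of
sin_zero.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: store an interface name (e.g. "fxp0") in the
 * sin_zero padding of a sockaddr_in before calling sendto() on the
 * divert socket. Returns 0 on success, -1 if the name does not fit.
 */
int
set_divert_ifname(struct sockaddr_in *sin, const char *ifname)
{
    size_t len;

    assert(sin != NULL && ifname != NULL);

    len = strlen(ifname);
    /* Leave room for the terminating NUL inside sin_zero. */
    if (len == 0 || len >= sizeof(sin->sin_zero))
        return -1;

    memset(sin->sin_zero, 0, sizeof(sin->sin_zero));
    memcpy(sin->sin_zero, ifname, len);
    return 0;
}
```

A caller would fill in the sockaddr_in, call set_divert_ifname(&sin, "fxp0"),
and only go on to sendto if the helper returned 0.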

Here is what my code changes involve so far. BTW, I'm using FreeBSD 4.8.

1) Removed the check in ipfw_chk (ip_fw2.c) for whether the packet is layer 2
or not. This allows bridged packets to still match the ipfw2 divert rules.

2) In bdg_forward (bridge.c), after the call through ip_fw_chk_ptr (and after
the check for dummynet, around line 974), I added the following code fragment:

    if (i != 0 && (i & IP_FW_PORT_DYNT_FLAG) == 0) {
        struct mbuf *m;

        /* Need to determine whether this is an IP packet. If not,
         * just forward it. */
        if (ntohs(eh->ether_type) != ETHERTYPE_IP)
            goto forward;

        if ( shared ) {
            int j = min(m0->m_pkthdr.len + ETHER_HDR_LEN, max_protohdr) ;

            m0 = m_pullup(m0, j) ;
            if (m0 == NULL)
                return NULL;
        }

        if (shared == 0 && once ) { /* no need to copy */
            m = m0 ;
            m0 = NULL ; /* original is gone */
        } else {
            m = m_copypacket(m0, M_DONTWAIT);
            if (m == NULL) {
                printf("bdg_forward: sorry, m_copypacket failed!\n");
                return m0 ; /* the original is still there... */
            }
        }

        if ( (void *)(eh + 1) == (void *)m->m_data) {
            m->m_data -= ETHER_HDR_LEN ;
            m->m_len += ETHER_HDR_LEN ;
            m->m_pkthdr.len += ETHER_HDR_LEN ;
            bdg_predict++;
        } else {
            M_PREPEND(m, ETHER_HDR_LEN, M_DONTWAIT);
            if (m == NULL)
            {
                printf("M_PREPEND failed\n");
                /* Should probably return original instead of NULL */
                /* return NULL; */
                return m0;
            }
            bcopy(&save_eh, mtod(m, struct ether_header *), ETHER_HDR_LEN);
        }

        divert_packet(m, 1, i & 0xffff, args.divert_rule);
        return NULL;
    }

This allows me to divert the ethernet frames to userspace.
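
On the userspace side, the program then has to split each diverted frame into
its Ethernet header and IP payload before transforming it. A rough sketch of
that step, assuming untagged Ethernet II framing with a 14-byte header; the
function name and constants are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_HDR_LEN    14      /* dst MAC (6) + src MAC (6) + type (2) */
#define FRAME_TYPE_IPV4  0x0800

/*
 * Hypothetical helper: given a raw Ethernet frame as read from the divert
 * socket, return a pointer to the IPv4 payload and store its length in
 * *payload_len. Returns NULL if the frame is too short or is not IPv4.
 */
const uint8_t *
frame_ipv4_payload(const uint8_t *frame, size_t frame_len, size_t *payload_len)
{
    uint16_t ether_type;

    assert(frame != NULL && payload_len != NULL);

    if (frame_len < FRAME_HDR_LEN)
        return NULL;

    /* The EtherType is big-endian on the wire; assemble it byte by byte. */
    ether_type = (uint16_t)((frame[12] << 8) | frame[13]);
    if (ether_type != FRAME_TYPE_IPV4)
        return NULL;

    *payload_len = frame_len - FRAME_HDR_LEN;
    return frame + FRAME_HDR_LEN;
}
```

The first FRAME_HDR_LEN bytes can be kept aside unchanged and copied back in
front of the transformed payload before the frame is written to the socket.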

3) To allow me to inject Ethernet frames back into the system via divert
sockets, I've modified div_output so that it calls ether_output_frame. The
following are my changes to div_output, which are added before ip_output is
called:

    /*  rcvif is copied from sin_zero, and is required to be valid
        for the current system to work
    */
    if (m->m_pkthdr.rcvif != NULL && BDG_USED(m->m_pkthdr.rcvif))
    {
        if (m->m_len < sizeof(struct ether_header)) {
            /* XXX error in the caller. */
            error = EINVAL;
            goto cantsend;
        }

        return ether_output_frame(m->m_pkthdr.rcvif, m);
    }

4) In userspace, for testing purposes, I have a program that simply reads from
the divert socket and writes back out to it. Here's the core snippet of the
code:

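    while (true)
    {
        // recvfrom treats the length as a value-result argument,
        // so it must be reset before every call.
        AddrLen = sizeof(SockAddr);

        sstBytes = ::recvfrom(nFD, kpucInPacket, sizeof(kpucInPacket), 0,
            (struct sockaddr *) &SockAddr, &AddrLen);

        if (sstBytes == -1)
            ::err(errno, "recvfrom");

        // Pass the interface name from the received address back unchanged.
        ::bcopy(SockAddr.sin_zero, 
            SockAddrSend.sin_zero, 
            sizeof(SockAddr.sin_zero));

        // Write the frame back out, untouched, to the divert socket.
        int nSendBytes = ::sendto(nSendFD, (void*)kpucInPacket, sstBytes, 0,
            (struct sockaddr *) &SockAddrSend, sizeof(SockAddrSend));

        if (nSendBytes != sstBytes)
            ::err(errno, "sendto");
    }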

Now I understand I'm breaking lots of abstractions/layers, but I do plan to 
clean that up a bit later. And I also understand that perhaps no one else in 
the world needs this functionality - although I can see a couple of other 
possible applications for it. 

The changes do seem to work: I'm able to receive the Ethernet frame and also
reinject it via the divert sockets, and ping, ftp, etc. all work over the
bridge when my test program is running. However, I'm finding that I'm
losing/leaking mbufs. sbdrop will complain and panic that the sb_cc doesn't
match up with what the mbuf chain says; usually the sb_cc will be larger by a
couple of hundred bytes. Furthermore, netstat -m will show that I have mbufs
allocated to socket names and addresses even after the diverting program has
terminated. This only seems to happen when I transfer a really large file
(>100M) over ftp at high speed (full line speed of a 100Mbps network). Pinging
and ftping small files do not seem to cause the mbuf leakage.
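
To make the symptom concrete, the mismatch described above is between a
recorded byte count and the lengths actually found in the buffer chain. Below
is a simplified userspace model of that comparison. struct fake_mbuf and both
functions are illustrative stand-ins, not the kernel's definitions, and
recorded_len plays the role of a counter such as sb_cc.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a kernel mbuf; only the fields needed here. */
struct fake_mbuf {
    struct fake_mbuf *m_next;   /* next buffer in the same chain */
    int               m_len;    /* bytes of data in this buffer */
};

/* Sum m_len over the chain, i.e. what the chain itself "says". */
int
chain_len(const struct fake_mbuf *m)
{
    int total = 0;

    for (; m != NULL; m = m->m_next)
        total += m->m_len;
    return total;
}

/*
 * Return 1 if the recorded byte count agrees with the chain, 0 otherwise.
 * A recorded count larger than the chain total corresponds to the
 * situation described above.
 */
int
chain_is_consistent(const struct fake_mbuf *m, int recorded_len)
{
    assert(recorded_len >= 0);
    return chain_len(m) == recorded_len;
}
```

For a two-buffer chain holding 14 and 86 bytes, chain_is_consistent returns 1
for a recorded length of 100 and 0 for any other value.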

So my question is, does anyone see where I might be losing the mbufs? Are there
some mbufs that must be freed or not freed that I'm not aware of? I've never
worked on the FreeBSD kernel before, so I'm not 100% sure how to correctly
manage the mbufs. Any advice, tips, discussion, anything will be highly
appreciated! =) If anyone needs any more clarification/information, just ask
and I'll try my best to explain myself better.

Thanks!!
Bernie
