realtek performance (was Re: good ATI chipset results)
Sean McNeil
sean at mcneil.com
Wed Oct 19 14:49:58 PDT 2005
On Thu, 2005-10-13 at 16:17 -0400, John Baldwin wrote:
> On Thursday 13 October 2005 03:46 pm, Scott Long wrote:
> > John Baldwin wrote:
> > > On Thursday 13 October 2005 01:07 pm, Sean McNeil wrote:
> > >>On Thu, 2005-10-13 at 11:49 -0400, John Baldwin wrote:
> > >>>On Thursday 13 October 2005 11:13 am, Sean McNeil wrote:
> > >>>>On Thu, 2005-10-13 at 09:17 -0400, Mike Tancsa wrote:
> > >>>>>Haven't really seen anyone else use this board, but I have had good
> > >>>>>luck with it so far.
> > >>>>>
> > >>>>>http://www.ecs.com.tw/ECSWeb/Products/ProductsDetail.aspx?DetailID=506&MenuID=90&LanID=0
> > >>>>>
> > >>>>>It's a micro-ATX form factor with built-in video, and the onboard NIC
> > >>>>>is a Realtek. (Although it's not the fastest NIC, its driver is stable
> > >>>>>and mature -- especially compared to the headaches people seem to have
> > >>>>>with the NVIDIA NICs.)
> > >>>>
> > >>>>Is this the RealTek 8169S Single-chip Gigabit Ethernet?
> > >>>>
> > >>>>For those interested, here are some changes I always use to increase
> > >>>>the performance of the above NIC. With these mods, I can stream
> > >>>>20 Mbps of multicast video and do other stuff over the network
> > >>>>without issues. Without the changes, transmit is horrible, with
> > >>>>severe UDP packet loss.
> > >>>
> > >>>So, I see two changes. One is to up the number of descriptors from 32
> > >>>rx and 64 tx to 64 rx and 64 tx on some models and 1024 rx and 1024 tx
> > >>>on other models. The other thing is that you seem to pessimize TX
> > >>>performance by always forcing outgoing packets to be coalesced into one
> > >>>mbuf (which requires doing an alloc and then copying all of the data)
> > >>>instead of making use of scatter/gather for sending packets. Do you
> > >>>need both changes, or do the higher descriptor counts alone make the
> > >>>difference?
> > >>
> > >>Actually, I've found that the higher descriptor counts do not make a
> > >>noticeable difference. The only thing that mattered was eliminating
> > >>scatter/gather when sending packets. I can't remember why I left the
> > >>descriptor increase in there. I think it was to get the best use out
> > >>of the hardware.
> > >
> > > Hmm, odd. Scott, do you have any ideas why m_defrag() plus one
> > > descriptor would be faster than s/g DMA for re(4)?
> >
> > There are two things that I would consider. First is that
> > bus_dmamap_load_mbuf_sg() should be used, as that cuts out some
> > indirection (and thus latency) in the code. Second is that not all
> > DMA engines are created equal, and I honestly wouldn't expect a whole
> > lot out of Realtek given the price point of this chip. It might be
> > optimized for operating on only a single S/G element, for example.
> > Maybe it's really slow at pre-fetching S/G elements, or maybe it has
> > some sort of a stall after each DMA segment transfer while it
> > restarts a state machine. I've seen evidence in other hardware that
> > only one S/G element should be used even though there are slots for 2
> > (or 3 in the case of 9k jumbo frames). One thing to keep in mind is
> > the difference in the driver models between Windows and BSD that Bill
> > Paul talked about the other day. In the Windows world, the driver
> > owns the network packet memory, whereas in BSD the stack owns it (in
> > the form of mbufs). This means that a Windows driver can
> > pre-allocate a contiguous slab and populate the descriptor rings with
> > it without ever having to worry about s/g fragmentation, while in BSD
> > fragmentation is a fact of life. So it's likely yet another case of
> > hardware being optimized for certain characteristics of Windows at
> > the expense of other operating systems.
>
> Ok. Sean, do you think you can trim the patch down to just the m_defrag()
> changes and test that to make sure that is all that is needed?
I was working on this when I fried my motherboard. I have two sets of
patches for re(4) and managed to do some testing. I didn't, however,
have the opportunity to run my 19 Mbps multicast stream test.
Unfortunately, I couldn't find a comparable motherboard and now have an
sk(4) instead of re(4), so I won't be able to assist with this driver
anymore.
The patches are broken into two:
1. Just the m_defrag() change.
2. Using bus_dmamap_load_mbuf_sg() as suggested by Scott.
The first is trivial, the second a little more involved.
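In essence, the first patch reduces to defragmenting unconditionally
before the DMA load, instead of only as an EFBIG fallback. A minimal
sketch of the idea (error handling trimmed):

    /*
     * Coalesce the outgoing chain into one contiguous mbuf so the
     * chip only ever sees a single DMA segment per packet.
     * m_defrag() returns NULL if it can't get mbufs/clusters; in
     * that case we simply try to map the original chain as-is.
     */
    struct mbuf *m_new;

    m_new = m_defrag(*m_head, M_DONTWAIT);
    if (m_new != NULL)
            *m_head = m_new;

The cost is an extra allocation and copy per packet, but on this chip
that appears to be cheaper than handing it a scatter/gather list.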
Since I've had issues with attachments, I'll include the patches inline
as text as well as attaching them.
Cheers,
Sean
First patch (just m_defrag() stuff):
--- sys/dev/re/if_re.c.orig Wed Oct 19 14:39:32 2005
+++ sys/dev/re/if_re.c Wed Oct 19 14:40:13 2005
@@ -1901,40 +1901,26 @@
arg.sc = sc;
arg.rl_idx = *idx;
arg.rl_maxsegs = sc->rl_ldata.rl_tx_free;
- if (arg.rl_maxsegs > 4)
- arg.rl_maxsegs -= 4;
arg.rl_ring = sc->rl_ldata.rl_tx_list;
map = sc->rl_ldata.rl_tx_dmamap[*idx];
+
+ /*
+ * Network performance is extremely poor in gigE mode without
+ * defragmenting each time. It is believed this might be an
+ * issue with the realtek part not being able to DMA anything
+ * smaller than a complete segment.
+ */
+ m_new = m_defrag(*m_head, M_DONTWAIT);
+ if (m_new != NULL)
+ *m_head = m_new;
+
error = bus_dmamap_load_mbuf(sc->rl_ldata.rl_mtag, map,
*m_head, re_dma_map_desc, &arg, BUS_DMA_NOWAIT);
- if (error && error != EFBIG) {
+ if (error) {
if_printf(sc->rl_ifp, "can't map mbuf (error %d)\n", error);
return (ENOBUFS);
- }
-
- /* Too many segments to map, coalesce into a single mbuf */
-
- if (error || arg.rl_maxsegs == 0) {
- m_new = m_defrag(*m_head, M_DONTWAIT);
- if (m_new == NULL)
- return (ENOBUFS);
- else
- *m_head = m_new;
-
- arg.sc = sc;
- arg.rl_idx = *idx;
- arg.rl_maxsegs = sc->rl_ldata.rl_tx_free;
- arg.rl_ring = sc->rl_ldata.rl_tx_list;
-
- error = bus_dmamap_load_mbuf(sc->rl_ldata.rl_mtag, map,
- *m_head, re_dma_map_desc, &arg, BUS_DMA_NOWAIT);
- if (error) {
- if_printf(sc->rl_ifp, "can't map mbuf (error %d)\n",
- error);
- return (EFBIG);
- }
}
/*
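The heart of the second patch is swapping the re_dma_map_desc()
callback for bus_dmamap_load_mbuf_sg(), which returns the segment array
directly so the driver can fill the descriptors inline and report
errors normally. Roughly, the TX load path becomes (a sketch only; the
descriptor setup is abbreviated, see the full diff below):

    bus_dma_segment_t segs[RL_NSEG_NEW];
    int error, nseg, i;

    error = bus_dmamap_load_mbuf_sg(sc->rl_ldata.rl_mtag, map,
        *m_head, segs, &nseg, BUS_DMA_NOWAIT);
    if (error != 0)
            return (ENOBUFS);
    for (i = 0; i < nseg; i++) {
            /*
             * Program descriptor i from segs[i].ds_addr and
             * segs[i].ds_len, setting the SOF/EOF/EOR/OWN bits
             * as needed (exactly as the diff below does).
             */
    }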
Second patch (_sg changes):
--- sys/pci/if_rlreg.h.orig Thu Oct 6 13:17:17 2005
+++ sys/pci/if_rlreg.h Wed Oct 19 14:43:52 2005
@@ -645,14 +645,6 @@
struct rl_softc;
-struct rl_dmaload_arg {
- struct rl_softc *sc;
- int rl_idx;
- int rl_maxsegs;
- uint32_t rl_flags;
- struct rl_desc *rl_ring;
-};
-
struct rl_list_data {
struct mbuf *rl_tx_mbuf[RL_TX_DESC_CNT];
struct mbuf *rl_rx_mbuf[RL_TX_DESC_CNT];
--- sys/dev/re/if_re.c.orig Wed Oct 19 14:40:13 2005
+++ sys/dev/re/if_re.c Wed Oct 19 14:42:23 2005
@@ -159,6 +159,8 @@
#define RE_CSUM_FEATURES (CSUM_IP | CSUM_TCP | CSUM_UDP)
+#define RL_NSEG_NEW 32
+
/*
* Various supported device vendors/types and their names.
*/
@@ -205,8 +207,6 @@
static int re_encap (struct rl_softc *, struct mbuf **, int *);
static void re_dma_map_addr (void *, bus_dma_segment_t *, int, int);
-static void re_dma_map_desc (void *, bus_dma_segment_t *, int,
- bus_size_t, int);
static int re_allocmem (device_t, struct rl_softc *);
static int re_newbuf (struct rl_softc *, int, struct mbuf *);
static int re_rx_list_init (struct rl_softc *);
@@ -856,82 +856,6 @@
}
/*
- * This routine takes the segment list provided as the result of
- * a bus_dma_map_load() operation and assigns the addresses/lengths
- * to RealTek DMA descriptors. This can be called either by the RX
- * code or the TX code. In the RX case, we'll probably wind up mapping
- * at most one segment. For the TX case, there could be any number of
- * segments since TX packets may span multiple mbufs. In either case,
- * if the number of segments is larger than the rl_maxsegs limit
- * specified by the caller, we abort the mapping operation. Sadly,
- * whoever designed the buffer mapping API did not provide a way to
- * return an error from here, so we have to fake it a bit.
- */
-
-static void
-re_dma_map_desc(arg, segs, nseg, mapsize, error)
- void *arg;
- bus_dma_segment_t *segs;
- int nseg;
- bus_size_t mapsize;
- int error;
-{
- struct rl_dmaload_arg *ctx;
- struct rl_desc *d = NULL;
- int i = 0, idx;
-
- if (error)
- return;
-
- ctx = arg;
-
- /* Signal error to caller if there's too many segments */
- if (nseg > ctx->rl_maxsegs) {
- ctx->rl_maxsegs = 0;
- return;
- }
-
- /*
- * Map the segment array into descriptors. Note that we set the
- * start-of-frame and end-of-frame markers for either TX or RX, but
- * they really only have meaning in the TX case. (In the RX case,
- * it's the chip that tells us where packets begin and end.)
- * We also keep track of the end of the ring and set the
- * end-of-ring bits as needed, and we set the ownership bits
- * in all except the very first descriptor. (The caller will
- * set this descriptor later when it start transmission or
- * reception.)
- */
- idx = ctx->rl_idx;
- for (;;) {
- u_int32_t cmdstat;
- d = &ctx->rl_ring[idx];
- if (le32toh(d->rl_cmdstat) & RL_RDESC_STAT_OWN) {
- ctx->rl_maxsegs = 0;
- return;
- }
- cmdstat = segs[i].ds_len;
- d->rl_bufaddr_lo = htole32(RL_ADDR_LO(segs[i].ds_addr));
- d->rl_bufaddr_hi = htole32(RL_ADDR_HI(segs[i].ds_addr));
- if (i == 0)
- cmdstat |= RL_TDESC_CMD_SOF;
- else
- cmdstat |= RL_TDESC_CMD_OWN;
- if (idx == (RL_RX_DESC_CNT - 1))
- cmdstat |= RL_TDESC_CMD_EOR;
- d->rl_cmdstat = htole32(cmdstat | ctx->rl_flags);
- i++;
- if (i == nseg)
- break;
- RL_DESC_INC(idx);
- }
-
- d->rl_cmdstat |= htole32(RL_TDESC_CMD_EOF);
- ctx->rl_maxsegs = nseg;
- ctx->rl_idx = idx;
-}
-
-/*
* Map a single buffer address.
*/
@@ -958,17 +882,15 @@
struct rl_softc *sc;
{
int error;
- int nseg;
int i;
/*
* Allocate map for RX mbufs.
*/
- nseg = 32;
error = bus_dma_tag_create(sc->rl_parent_tag, ETHER_ALIGN, 0,
BUS_SPACE_MAXADDR_32BIT, BUS_SPACE_MAXADDR, NULL,
- NULL, MCLBYTES * nseg, nseg, MCLBYTES, BUS_DMA_ALLOCNOW,
- NULL, NULL, &sc->rl_ldata.rl_mtag);
+ NULL, MCLBYTES * RL_NSEG_NEW, RL_NSEG_NEW, MCLBYTES,
+ BUS_DMA_ALLOCNOW, NULL, NULL, &sc->rl_ldata.rl_mtag);
if (error) {
device_printf(dev, "could not allocate dma tag\n");
return (ENOMEM);
@@ -1163,7 +1085,6 @@
/*
* Allocate the parent bus DMA tag appropriate for PCI.
*/
-#define RL_NSEG_NEW 32
error = bus_dma_tag_create(NULL, /* parent */
1, 0, /* alignment, boundary */
BUS_SPACE_MAXADDR_32BIT,/* lowaddr */
@@ -1374,9 +1295,11 @@
int idx;
struct mbuf *m;
{
- struct rl_dmaload_arg arg;
struct mbuf *n = NULL;
- int error;
+ struct rl_desc *d = &sc->rl_ldata.rl_rx_list[idx];
+ bus_dma_segment_t segs[1];
+ int error, nseg;
+ u_int32_t cmdstat;
if (m == NULL) {
n = m_getcl(M_DONTWAIT, MT_DATA, M_PKTHDR);
@@ -1400,22 +1323,23 @@
*/
m_adj(m, RE_ETHER_ALIGN);
#endif
- arg.sc = sc;
- arg.rl_idx = idx;
- arg.rl_maxsegs = 1;
- arg.rl_flags = 0;
- arg.rl_ring = sc->rl_ldata.rl_rx_list;
-
- error = bus_dmamap_load_mbuf(sc->rl_ldata.rl_mtag,
- sc->rl_ldata.rl_rx_dmamap[idx], m, re_dma_map_desc,
- &arg, BUS_DMA_NOWAIT);
- if (error || arg.rl_maxsegs != 1) {
+ error = bus_dmamap_load_mbuf_sg(sc->rl_ldata.rl_mtag,
+ sc->rl_ldata.rl_rx_dmamap[idx], m, segs, &nseg, BUS_DMA_NOWAIT);
+ if (error || nseg != 1) {
if (n != NULL)
m_freem(n);
return (ENOMEM);
}
- sc->rl_ldata.rl_rx_list[idx].rl_cmdstat |= htole32(RL_RDESC_CMD_OWN);
+ d->rl_bufaddr_lo = htole32(RL_ADDR_LO(segs[0].ds_addr));
+ d->rl_bufaddr_hi = htole32(RL_ADDR_HI(segs[0].ds_addr));
+
+ cmdstat = segs[0].ds_len | RL_RDESC_CMD_OWN;
+ if (idx == (RL_RX_DESC_CNT - 1))
+ cmdstat |= RL_TDESC_CMD_EOR;
+
+ d->rl_cmdstat = htole32(cmdstat);
+
sc->rl_ldata.rl_rx_mbuf[idx] = m;
bus_dmamap_sync(sc->rl_ldata.rl_mtag,
@@ -1872,10 +1796,13 @@
int *idx;
{
struct mbuf *m_new = NULL;
- struct rl_dmaload_arg arg;
- bus_dmamap_t map;
- int error;
struct m_tag *mtag;
+ bus_dmamap_t map;
+ bus_dma_segment_t segs[RL_NSEG_NEW];
+ struct rl_desc *d;
+ int error, nseg, i;
+ int nidx = *idx;
+ uint32_t flags = 0;
RL_LOCK_ASSERT(sc);
@@ -1889,21 +1816,14 @@
* chip. This is a requirement.
*/
- arg.rl_flags = 0;
-
if ((*m_head)->m_pkthdr.csum_flags & CSUM_IP)
- arg.rl_flags |= RL_TDESC_CMD_IPCSUM;
+ flags |= RL_TDESC_CMD_IPCSUM;
if ((*m_head)->m_pkthdr.csum_flags & CSUM_TCP)
- arg.rl_flags |= RL_TDESC_CMD_TCPCSUM;
+ flags |= RL_TDESC_CMD_TCPCSUM;
if ((*m_head)->m_pkthdr.csum_flags & CSUM_UDP)
- arg.rl_flags |= RL_TDESC_CMD_UDPCSUM;
-
- arg.sc = sc;
- arg.rl_idx = *idx;
- arg.rl_maxsegs = sc->rl_ldata.rl_tx_free;
- arg.rl_ring = sc->rl_ldata.rl_tx_list;
+ flags |= RL_TDESC_CMD_UDPCSUM;
- map = sc->rl_ldata.rl_tx_dmamap[*idx];
+ map = sc->rl_ldata.rl_tx_dmamap[nidx];
/*
* Network performance is extremely poor in gigE mode without
@@ -1915,25 +1835,52 @@
if (m_new != NULL)
*m_head = m_new;
- error = bus_dmamap_load_mbuf(sc->rl_ldata.rl_mtag, map,
- *m_head, re_dma_map_desc, &arg, BUS_DMA_NOWAIT);
+ error = bus_dmamap_load_mbuf_sg(sc->rl_ldata.rl_mtag, map,
+ *m_head, segs, &nseg, BUS_DMA_NOWAIT);
if (error) {
if_printf(sc->rl_ifp, "can't map mbuf (error %d)\n", error);
return (ENOBUFS);
}
+ KASSERT(nseg <= sc->rl_ldata.rl_tx_free, ("too many DMA segments"));
+
+ /*
+ * Map the segment array into descriptors. We keep track of the
+ * end of the ring and set the end-of-ring bits as needed, and
+ * we set the ownership bits in all except the very first descriptor.
+ * (It is set later, when transmission actually starts.)
+ */
+ for (i = 0;;) {
+ u_int32_t cmdstat;
+ d = &sc->rl_ldata.rl_tx_list[nidx];
+ KASSERT((le32toh(d->rl_cmdstat) & RL_RDESC_STAT_OWN)==0,
+ ("Descriptor still owned"));
+ cmdstat = segs[i].ds_len;
+ d->rl_bufaddr_lo = htole32(RL_ADDR_LO(segs[i].ds_addr));
+ d->rl_bufaddr_hi = htole32(RL_ADDR_HI(segs[i].ds_addr));
+ cmdstat |= (i == 0) ? RL_TDESC_CMD_SOF : RL_TDESC_CMD_OWN;
+ if (nidx == (RL_TX_DESC_CNT - 1))
+ cmdstat |= RL_TDESC_CMD_EOR;
+ d->rl_cmdstat = htole32(cmdstat | flags);
+ i++;
+ if (i == nseg)
+ break;
+ RL_DESC_INC(nidx);
+ }
+
+ d->rl_cmdstat |= htole32(RL_TDESC_CMD_EOF);
+
/*
* Insure that the map for this transmission
* is placed at the array index of the last descriptor
* in this chain. (Swap last and first dmamaps.)
*/
- sc->rl_ldata.rl_tx_dmamap[*idx] =
- sc->rl_ldata.rl_tx_dmamap[arg.rl_idx];
- sc->rl_ldata.rl_tx_dmamap[arg.rl_idx] = map;
+ sc->rl_ldata.rl_tx_dmamap[*idx] = sc->rl_ldata.rl_tx_dmamap[nidx];
+ sc->rl_ldata.rl_tx_dmamap[nidx] = map;
- sc->rl_ldata.rl_tx_mbuf[arg.rl_idx] = *m_head;
- sc->rl_ldata.rl_tx_free -= arg.rl_maxsegs;
+ sc->rl_ldata.rl_tx_mbuf[nidx] = *m_head;
+ sc->rl_ldata.rl_tx_free -= nseg;
/*
* Set up hardware VLAN tagging. Note: vlan tag info must
@@ -1941,21 +1888,19 @@
* transmission attempt.
*/
+ d = &sc->rl_ldata.rl_tx_list[*idx];
+
mtag = VLAN_OUTPUT_TAG(sc->rl_ifp, *m_head);
if (mtag != NULL)
- sc->rl_ldata.rl_tx_list[*idx].rl_vlanctl =
+ d->rl_vlanctl =
htole32(htons(VLAN_TAG_VALUE(mtag)) | RL_TDESC_VLANCTL_TAG);
/* Transfer ownership of packet to the chip. */
- sc->rl_ldata.rl_tx_list[arg.rl_idx].rl_cmdstat |=
- htole32(RL_TDESC_CMD_OWN);
- if (*idx != arg.rl_idx)
- sc->rl_ldata.rl_tx_list[*idx].rl_cmdstat |=
- htole32(RL_TDESC_CMD_OWN);
+ d->rl_cmdstat |= htole32(RL_TDESC_CMD_OWN);
- RL_DESC_INC(arg.rl_idx);
- *idx = arg.rl_idx;
+ RL_DESC_INC(nidx);
+ *idx = nidx;
return (0);
}