[CFT/review] new sendfile(2)
Gleb Smirnoff
glebius at FreeBSD.org
Sun Aug 31 16:48:29 UTC 2014
Hi!
Just a followup with a fresh version of the patch. For details,
see below.
On Thu, May 29, 2014 at 02:20:54PM +0400, Gleb Smirnoff wrote:
T> Hello!
T>
T> At Netflix and Nginx we are experimenting with improving FreeBSD
T> wrt sending large amounts of static data via HTTP.
T>
T> One of the approaches we are experimenting with is a new sendfile(2)
T> implementation that doesn't block on the I/O done on the file
T> descriptor.
T>
T> The problem with the classic sendfile(2) is that if the request
T> length is large enough, and the file data is not cached in VM, then
T> the sendfile(2) syscall will not return until it has filled the
T> socket buffer with data. On the modern internet, socket buffers can
T> be up to 1 MB, so the time taken by the syscall rises by an order of
T> magnitude. All that time the nginx worker is blocked in the syscall
T> and doesn't process data from other clients. The best current
T> practice to mitigate that is known as "sendfile(2) + aio_read(2)".
T> This is a special mode of nginx operation on FreeBSD. The sendfile(2)
T> call is issued with the SF_NODISKIO flag, which forbids the syscall
T> from performing disk I/O and makes it send only data that is already
T> cached by VM. If sendfile(2) reports that I/O needs to be done (but
T> is forbidden), then nginx does an aio_read() of a chunk of the file.
T> The data read is cached by VM as a side effect. Then sendfile() is
T> called again.
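T>
T> A rough sketch of that pattern, for illustration only (the chunk
T> buffer handling and the send_cached_or_schedule() name are my
T> simplifications for the sketch, not nginx's actual code):
T>
T> 	#include <sys/types.h>
T> 	#include <sys/socket.h>
T> 	#include <sys/uio.h>
T> 	#include <aio.h>
T> 	#include <errno.h>
T> 	#include <string.h>
T>
T> 	/*
T> 	 * Try to send [off, off + len) of filefd to sockfd without
T> 	 * touching the disk; on EBUSY schedule an aio_read() so the
T> 	 * data ends up in the VM cache and sendfile() can be retried.
T> 	 */
T> 	static int
T> 	send_cached_or_schedule(int filefd, int sockfd, off_t off,
T> 	    size_t len, struct aiocb *cb, void *buf, size_t buflen)
T> 	{
T> 		off_t sbytes = 0;
T>
T> 		if (sendfile(filefd, sockfd, off, len, NULL, &sbytes,
T> 		    SF_NODISKIO) == 0)
T> 			return (0);	/* everything came from VM cache */
T> 		if (errno != EBUSY)
T> 			return (-1);	/* a real error */
T>
T> 		/* Not cached: read a chunk, caching it as a side effect. */
T> 		memset(cb, 0, sizeof(*cb));
T> 		cb->aio_fildes = filefd;
T> 		cb->aio_offset = off + sbytes;
T> 		cb->aio_nbytes = buflen;
T> 		cb->aio_buf = buf;
T> 		return (aio_read(cb));	/* retry sendfile() on completion */
T> 	}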
T>
T> Now for the new sendfile. The core idea is that sendfile()
T> schedules the I/O, but doesn't wait for it to complete. It
T> returns immediately to the process, and I/O completion is
T> processed in kernel context. Unlike aio(4), no additional
T> kernel threads are created. The new sendfile is a drop-in
T> replacement for the old one. Applications (like nginx) don't
T> need to be recompiled, nor do they need any configuration
T> change. The SF_NODISKIO flag is ignored.
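T>
T> From the application's point of view nothing changes; a plain call
T> like the sketch below (a simplified illustration assuming a
T> non-blocking socket, not code taken from nginx) keeps working as-is,
T> it just no longer sleeps waiting for the disk:
T>
T> 	#include <sys/types.h>
T> 	#include <sys/socket.h>
T> 	#include <sys/uio.h>
T> 	#include <err.h>
T> 	#include <errno.h>
T>
T> 	static void
T> 	send_chunk(int filefd, int sockfd, off_t *offset, size_t nbytes)
T> 	{
T> 		off_t sbytes = 0;
T>
T> 		/*
T> 		 * Same syscall as before.  With the new implementation it
T> 		 * only schedules the disk reads; the pages are appended to
T> 		 * the socket buffer by the kernel once the I/O completes.
T> 		 */
T> 		if (sendfile(filefd, sockfd, *offset, nbytes, NULL, &sbytes,
T> 		    0) == -1 && errno != EAGAIN)
T> 			err(1, "sendfile");
T> 		*offset += sbytes;	/* partial progress on EAGAIN */
T> 	}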
T>
T> The patch for review is available at:
T>
T> https://phabric.freebsd.org/D102
T>
T> And for those who prefer email attachments, it is also attached.
T> The patch consists of 3 logically separate changes:
T>
T> 1) Split of the socket buffer sb_cc field into sb_acc and sb_ccc, where
T> sb_acc stands for "available character count" and sb_ccc for "claimed
T> character count". This allows us to write data to a socket that is
T> not ready yet. The data sits in the socket buffer, consumes its space,
T> and keeps its order relative to earlier and later writes to the socket,
T> but it can be sent only after it is marked as ready (a small model of
T> this accounting follows the list below). This change is split across
T> many files.
T>
T> 2) A new vnode operation: VOP_GETPAGES_ASYNC(). This one lives in sys/vm.
T>
T> 3) The actual implementation of the new sendfile(2). This one lives in
T> kern/uipc_syscalls.c.
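T>
T> To illustrate the sb_acc/sb_ccc accounting from change 1) (the real
T> code is in the uipc_sockbuf.c hunks below, see sballoc() and
T> sbready()), here is a tiny userland model; the struct and the numbers
T> are made up for the sketch, this is not kernel code:
T>
T> 	#include <stdio.h>
T>
T> 	/* Minimal stand-in for an mbuf: just a length and a ready flag. */
T> 	struct mb {
T> 		int	len;
T> 		int	ready;
T> 	};
T>
T> 	int
T> 	main(void)
T> 	{
T> 		struct mb chain[] = { { 100, 1 }, { 200, 0 }, { 300, 1 } };
T> 		int i, acc, ccc, blocked;
T>
T> 		/* What sballoc() does as each mbuf is appended. */
T> 		acc = ccc = blocked = 0;
T> 		for (i = 0; i < 3; i++) {
T> 			ccc += chain[i].len;	/* claimed: every byte written */
T> 			if (!chain[i].ready)
T> 				blocked = 1;	/* first not-ready mbuf found */
T> 			if (!blocked)
T> 				acc += chain[i].len; /* available: ready prefix */
T> 		}
T> 		printf("ccc %d acc %d\n", ccc, acc);	/* ccc 600 acc 100 */
T>
T> 		/*
T> 		 * What sbready() does when the I/O completes: the middle
T> 		 * mbuf becomes ready, the ready mbufs queued behind it are
T> 		 * unblocked, and acc catches up with ccc.
T> 		 */
T> 		chain[1].ready = 1;
T> 		for (acc = i = 0; i < 3 && chain[i].ready; i++)
T> 			acc += chain[i].len;
T> 		printf("after sbready: ccc %d acc %d\n", ccc, acc); /* 600 600 */
T> 		return (0);
T> 	}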
T>
T>
T>
T> At Netflix, we already see improvements with the new sendfile(2).
T> We can send more data utilizing the same amount of CPU, and we can
T> push closer to 0% idle without experiencing short lags.
T>
T> However, we have a somewhat modified VM subsystem that behaves
T> optimally for our task, but suboptimally for an average FreeBSD
T> system. I'd like someone from the community to try the new
T> sendfile(2) on a different setup and see how it serves you.
T>
T> To be an early tester you need to check out the projects/sendfile
T> branch and build a kernel from it. The world from head/ will
T> run fine with it.
T>
T> svn co http://svn.freebsd.org/base/projects/sendfile
T> cd sendfile
T> ... build kernel ...
T>
T> Limitations:
T> - No testing was done on serving files on NFS.
T> - No testing was done on serving files on ZFS.
T>
T> --
T> Totus tuus, Glebius.
T> Index: sys/dev/ti/if_ti.c
T> ===================================================================
T> --- sys/dev/ti/if_ti.c (.../head) (revision 266804)
T> +++ sys/dev/ti/if_ti.c (.../projects/sendfile) (revision 266807)
T> @@ -1629,7 +1629,7 @@ ti_newbuf_jumbo(struct ti_softc *sc, int idx, stru
T> m[i]->m_data = (void *)sf_buf_kva(sf[i]);
T> m[i]->m_len = PAGE_SIZE;
T> MEXTADD(m[i], sf_buf_kva(sf[i]), PAGE_SIZE,
T> - sf_buf_mext, (void*)sf_buf_kva(sf[i]), sf[i],
T> + sf_mext_free, (void*)sf_buf_kva(sf[i]), sf[i],
T> 0, EXT_DISPOSABLE);
T> m[i]->m_next = m[i+1];
T> }
T> @@ -1694,7 +1694,7 @@ nobufs:
T> if (m[i])
T> m_freem(m[i]);
T> if (sf[i])
T> - sf_buf_mext((void *)sf_buf_kva(sf[i]), sf[i]);
T> + sf_mext_free((void *)sf_buf_kva(sf[i]), sf[i]);
T> }
T> return (ENOBUFS);
T> }
T> Index: sys/dev/cxgbe/tom/t4_cpl_io.c
T> ===================================================================
T> --- sys/dev/cxgbe/tom/t4_cpl_io.c (.../head) (revision 266804)
T> +++ sys/dev/cxgbe/tom/t4_cpl_io.c (.../projects/sendfile) (revision 266807)
T> @@ -338,11 +338,11 @@ t4_rcvd(struct toedev *tod, struct tcpcb *tp)
T> INP_WLOCK_ASSERT(inp);
T>
T> SOCKBUF_LOCK(sb);
T> - KASSERT(toep->sb_cc >= sb->sb_cc,
T> + KASSERT(toep->sb_cc >= sbused(sb),
T> ("%s: sb %p has more data (%d) than last time (%d).",
T> - __func__, sb, sb->sb_cc, toep->sb_cc));
T> - toep->rx_credits += toep->sb_cc - sb->sb_cc;
T> - toep->sb_cc = sb->sb_cc;
T> + __func__, sb, sbused(sb), toep->sb_cc));
T> + toep->rx_credits += toep->sb_cc - sbused(sb);
T> + toep->sb_cc = sbused(sb);
T> credits = toep->rx_credits;
T> SOCKBUF_UNLOCK(sb);
T>
T> @@ -863,15 +863,15 @@ do_peer_close(struct sge_iq *iq, const struct rss_
T> tp->rcv_nxt = be32toh(cpl->rcv_nxt);
T> toep->ddp_flags &= ~(DDP_BUF0_ACTIVE | DDP_BUF1_ACTIVE);
T>
T> - KASSERT(toep->sb_cc >= sb->sb_cc,
T> + KASSERT(toep->sb_cc >= sbused(sb),
T> ("%s: sb %p has more data (%d) than last time (%d).",
T> - __func__, sb, sb->sb_cc, toep->sb_cc));
T> - toep->rx_credits += toep->sb_cc - sb->sb_cc;
T> + __func__, sb, sbused(sb), toep->sb_cc));
T> + toep->rx_credits += toep->sb_cc - sbused(sb);
T> #ifdef USE_DDP_RX_FLOW_CONTROL
T> toep->rx_credits -= m->m_len; /* adjust for F_RX_FC_DDP */
T> #endif
T> - sbappendstream_locked(sb, m);
T> - toep->sb_cc = sb->sb_cc;
T> + sbappendstream_locked(sb, m, 0);
T> + toep->sb_cc = sbused(sb);
T> }
T> socantrcvmore_locked(so); /* unlocks the sockbuf */
T>
T> @@ -1281,12 +1281,12 @@ do_rx_data(struct sge_iq *iq, const struct rss_hea
T> }
T> }
T>
T> - KASSERT(toep->sb_cc >= sb->sb_cc,
T> + KASSERT(toep->sb_cc >= sbused(sb),
T> ("%s: sb %p has more data (%d) than last time (%d).",
T> - __func__, sb, sb->sb_cc, toep->sb_cc));
T> - toep->rx_credits += toep->sb_cc - sb->sb_cc;
T> - sbappendstream_locked(sb, m);
T> - toep->sb_cc = sb->sb_cc;
T> + __func__, sb, sbused(sb), toep->sb_cc));
T> + toep->rx_credits += toep->sb_cc - sbused(sb);
T> + sbappendstream_locked(sb, m, 0);
T> + toep->sb_cc = sbused(sb);
T> sorwakeup_locked(so);
T> SOCKBUF_UNLOCK_ASSERT(sb);
T>
T> Index: sys/dev/cxgbe/tom/t4_ddp.c
T> ===================================================================
T> --- sys/dev/cxgbe/tom/t4_ddp.c (.../head) (revision 266804)
T> +++ sys/dev/cxgbe/tom/t4_ddp.c (.../projects/sendfile) (revision 266807)
T> @@ -224,15 +224,15 @@ insert_ddp_data(struct toepcb *toep, uint32_t n)
T> tp->rcv_wnd -= n;
T> #endif
T>
T> - KASSERT(toep->sb_cc >= sb->sb_cc,
T> + KASSERT(toep->sb_cc >= sbused(sb),
T> ("%s: sb %p has more data (%d) than last time (%d).",
T> - __func__, sb, sb->sb_cc, toep->sb_cc));
T> - toep->rx_credits += toep->sb_cc - sb->sb_cc;
T> + __func__, sb, sbused(sb), toep->sb_cc));
T> + toep->rx_credits += toep->sb_cc - sbused(sb);
T> #ifdef USE_DDP_RX_FLOW_CONTROL
T> toep->rx_credits -= n; /* adjust for F_RX_FC_DDP */
T> #endif
T> - sbappendstream_locked(sb, m);
T> - toep->sb_cc = sb->sb_cc;
T> + sbappendstream_locked(sb, m, 0);
T> + toep->sb_cc = sbused(sb);
T> }
T>
T> /* SET_TCB_FIELD sent as a ULP command looks like this */
T> @@ -459,15 +459,15 @@ handle_ddp_data(struct toepcb *toep, __be32 ddp_re
T> else
T> discourage_ddp(toep);
T>
T> - KASSERT(toep->sb_cc >= sb->sb_cc,
T> + KASSERT(toep->sb_cc >= sbused(sb),
T> ("%s: sb %p has more data (%d) than last time (%d).",
T> - __func__, sb, sb->sb_cc, toep->sb_cc));
T> - toep->rx_credits += toep->sb_cc - sb->sb_cc;
T> + __func__, sb, sbused(sb), toep->sb_cc));
T> + toep->rx_credits += toep->sb_cc - sbused(sb);
T> #ifdef USE_DDP_RX_FLOW_CONTROL
T> toep->rx_credits -= len; /* adjust for F_RX_FC_DDP */
T> #endif
T> - sbappendstream_locked(sb, m);
T> - toep->sb_cc = sb->sb_cc;
T> + sbappendstream_locked(sb, m, 0);
T> + toep->sb_cc = sbused(sb);
T> wakeup:
T> KASSERT(toep->ddp_flags & db_flag,
T> ("%s: DDP buffer not active. toep %p, ddp_flags 0x%x, report 0x%x",
T> @@ -897,7 +897,7 @@ handle_ddp(struct socket *so, struct uio *uio, int
T> #endif
T>
T> /* XXX: too eager to disable DDP, could handle NBIO better than this. */
T> - if (sb->sb_cc >= uio->uio_resid || uio->uio_resid < sc->tt.ddp_thres ||
T> + if (sbused(sb) >= uio->uio_resid || uio->uio_resid < sc->tt.ddp_thres ||
T> uio->uio_resid > MAX_DDP_BUFFER_SIZE || uio->uio_iovcnt > 1 ||
T> so->so_state & SS_NBIO || flags & (MSG_DONTWAIT | MSG_NBIO) ||
T> error || so->so_error || sb->sb_state & SBS_CANTRCVMORE)
T> @@ -935,7 +935,7 @@ handle_ddp(struct socket *so, struct uio *uio, int
T> * payload.
T> */
T> ddp_flags = select_ddp_flags(so, flags, db_idx);
T> - wr = mk_update_tcb_for_ddp(sc, toep, db_idx, sb->sb_cc, ddp_flags);
T> + wr = mk_update_tcb_for_ddp(sc, toep, db_idx, sbused(sb), ddp_flags);
T> if (wr == NULL) {
T> /*
T> * Just unhold the pages. The DDP buffer's software state is
T> @@ -960,8 +960,9 @@ handle_ddp(struct socket *so, struct uio *uio, int
T> */
T> rc = sbwait(sb);
T> while (toep->ddp_flags & buf_flag) {
T> + /* XXXGL: shouldn't here be sbwait() call? */
T> sb->sb_flags |= SB_WAIT;
T> - msleep(&sb->sb_cc, &sb->sb_mtx, PSOCK , "sbwait", 0);
T> + msleep(&sb->sb_acc, &sb->sb_mtx, PSOCK , "sbwait", 0);
T> }
T> unwire_ddp_buffer(db);
T> return (rc);
T> @@ -1123,8 +1124,8 @@ restart:
T>
T> /* uio should be just as it was at entry */
T> KASSERT(oresid == uio->uio_resid,
T> - ("%s: oresid = %d, uio_resid = %zd, sb_cc = %d",
T> - __func__, oresid, uio->uio_resid, sb->sb_cc));
T> + ("%s: oresid = %d, uio_resid = %zd, sbused = %d",
T> + __func__, oresid, uio->uio_resid, sbused(sb)));
T>
T> error = handle_ddp(so, uio, flags, 0);
T> ddp_handled = 1;
T> @@ -1134,7 +1135,7 @@ restart:
T>
T> /* Abort if socket has reported problems. */
T> if (so->so_error) {
T> - if (sb->sb_cc > 0)
T> + if (sbused(sb))
T> goto deliver;
T> if (oresid > uio->uio_resid)
T> goto out;
T> @@ -1146,7 +1147,7 @@ restart:
T>
T> /* Door is closed. Deliver what is left, if any. */
T> if (sb->sb_state & SBS_CANTRCVMORE) {
T> - if (sb->sb_cc > 0)
T> + if (sbused(sb))
T> goto deliver;
T> else
T> goto out;
T> @@ -1153,7 +1154,7 @@ restart:
T> }
T>
T> /* Socket buffer is empty and we shall not block. */
T> - if (sb->sb_cc == 0 &&
T> + if (sbused(sb) == 0 &&
T> ((so->so_state & SS_NBIO) || (flags & (MSG_DONTWAIT|MSG_NBIO)))) {
T> error = EAGAIN;
T> goto out;
T> @@ -1160,18 +1161,18 @@ restart:
T> }
T>
T> /* Socket buffer got some data that we shall deliver now. */
T> - if (sb->sb_cc > 0 && !(flags & MSG_WAITALL) &&
T> + if (sbused(sb) && !(flags & MSG_WAITALL) &&
T> ((sb->sb_flags & SS_NBIO) ||
T> (flags & (MSG_DONTWAIT|MSG_NBIO)) ||
T> - sb->sb_cc >= sb->sb_lowat ||
T> - sb->sb_cc >= uio->uio_resid ||
T> - sb->sb_cc >= sb->sb_hiwat) ) {
T> + sbused(sb) >= sb->sb_lowat ||
T> + sbused(sb) >= uio->uio_resid ||
T> + sbused(sb) >= sb->sb_hiwat) ) {
T> goto deliver;
T> }
T>
T> /* On MSG_WAITALL we must wait until all data or error arrives. */
T> if ((flags & MSG_WAITALL) &&
T> - (sb->sb_cc >= uio->uio_resid || sb->sb_cc >= sb->sb_lowat))
T> + (sbused(sb) >= uio->uio_resid || sbused(sb) >= sb->sb_lowat))
T> goto deliver;
T>
T> /*
T> @@ -1190,7 +1191,7 @@ restart:
T>
T> deliver:
T> SOCKBUF_LOCK_ASSERT(&so->so_rcv);
T> - KASSERT(sb->sb_cc > 0, ("%s: sockbuf empty", __func__));
T> + KASSERT(sbused(sb) > 0, ("%s: sockbuf empty", __func__));
T> KASSERT(sb->sb_mb != NULL, ("%s: sb_mb == NULL", __func__));
T>
T> if (sb->sb_flags & SB_DDP_INDICATE && !ddp_handled)
T> @@ -1201,7 +1202,7 @@ deliver:
T> uio->uio_td->td_ru.ru_msgrcv++;
T>
T> /* Fill uio until full or current end of socket buffer is reached. */
T> - len = min(uio->uio_resid, sb->sb_cc);
T> + len = min(uio->uio_resid, sbused(sb));
T> if (mp0 != NULL) {
T> /* Dequeue as many mbufs as possible. */
T> if (!(flags & MSG_PEEK) && len >= sb->sb_mb->m_len) {
T> Index: sys/dev/cxgbe/iw_cxgbe/cm.c
T> ===================================================================
T> --- sys/dev/cxgbe/iw_cxgbe/cm.c (.../head) (revision 266804)
T> +++ sys/dev/cxgbe/iw_cxgbe/cm.c (.../projects/sendfile) (revision 266807)
T> @@ -585,8 +585,8 @@ process_data(struct c4iw_ep *ep)
T> {
T> struct sockaddr_in *local, *remote;
T>
T> - CTR5(KTR_IW_CXGBE, "%s: so %p, ep %p, state %s, sb_cc %d", __func__,
T> - ep->com.so, ep, states[ep->com.state], ep->com.so->so_rcv.sb_cc);
T> + CTR5(KTR_IW_CXGBE, "%s: so %p, ep %p, state %s, sbused %d", __func__,
T> + ep->com.so, ep, states[ep->com.state], sbused(&ep->com.so->so_rcv));
T>
T> switch (state_read(&ep->com)) {
T> case MPA_REQ_SENT:
T> @@ -602,11 +602,11 @@ process_data(struct c4iw_ep *ep)
T> process_mpa_request(ep);
T> break;
T> default:
T> - if (ep->com.so->so_rcv.sb_cc)
T> - log(LOG_ERR, "%s: Unexpected streaming data. "
T> - "ep %p, state %d, so %p, so_state 0x%x, sb_cc %u\n",
T> + if (sbused(&ep->com.so->so_rcv))
T> + log(LOG_ERR, "%s: Unexpected streaming data. ep %p, "
T> + "state %d, so %p, so_state 0x%x, sbused %u\n",
T> __func__, ep, state_read(&ep->com), ep->com.so,
T> - ep->com.so->so_state, ep->com.so->so_rcv.sb_cc);
T> + ep->com.so->so_state, sbused(&ep->com.so->so_rcv));
T> break;
T> }
T> }
T> Index: sys/dev/iscsi/icl.c
T> ===================================================================
T> --- sys/dev/iscsi/icl.c (.../head) (revision 266804)
T> +++ sys/dev/iscsi/icl.c (.../projects/sendfile) (revision 266807)
T> @@ -758,7 +758,7 @@ icl_receive_thread(void *arg)
T> * is enough data received to read the PDU.
T> */
T> SOCKBUF_LOCK(&so->so_rcv);
T> - available = so->so_rcv.sb_cc;
T> + available = sbavail(&so->so_rcv);
T> if (available < ic->ic_receive_len) {
T> so->so_rcv.sb_lowat = ic->ic_receive_len;
T> cv_wait(&ic->ic_receive_cv, &so->so_rcv.sb_mtx);
T> Index: sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c
T> ===================================================================
T> --- sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c (.../head) (revision 266804)
T> +++ sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c (.../projects/sendfile) (revision 266807)
T> @@ -445,8 +445,8 @@ t3_push_frames(struct socket *so, int req_completi
T> * Autosize the send buffer.
T> */
T> if (snd->sb_flags & SB_AUTOSIZE && VNET(tcp_do_autosndbuf)) {
T> - if (snd->sb_cc >= (snd->sb_hiwat / 8 * 7) &&
T> - snd->sb_cc < VNET(tcp_autosndbuf_max)) {
T> + if (sbused(snd) >= (snd->sb_hiwat / 8 * 7) &&
T> + sbused(snd) < VNET(tcp_autosndbuf_max)) {
T> if (!sbreserve_locked(snd, min(snd->sb_hiwat +
T> VNET(tcp_autosndbuf_inc), VNET(tcp_autosndbuf_max)),
T> so, curthread))
T> @@ -597,10 +597,10 @@ t3_rcvd(struct toedev *tod, struct tcpcb *tp)
T> INP_WLOCK_ASSERT(inp);
T>
T> SOCKBUF_LOCK(so_rcv);
T> - KASSERT(toep->tp_enqueued >= so_rcv->sb_cc,
T> - ("%s: so_rcv->sb_cc > enqueued", __func__));
T> - toep->tp_rx_credits += toep->tp_enqueued - so_rcv->sb_cc;
T> - toep->tp_enqueued = so_rcv->sb_cc;
T> + KASSERT(toep->tp_enqueued >= sbused(so_rcv),
T> + ("%s: sbused(so_rcv) > enqueued", __func__));
T> + toep->tp_rx_credits += toep->tp_enqueued - sbused(so_rcv);
T> + toep->tp_enqueued = sbused(so_rcv);
T> SOCKBUF_UNLOCK(so_rcv);
T>
T> must_send = toep->tp_rx_credits + 16384 >= tp->rcv_wnd;
T> @@ -1199,7 +1199,7 @@ do_rx_data(struct sge_qset *qs, struct rsp_desc *r
T> }
T>
T> toep->tp_enqueued += m->m_pkthdr.len;
T> - sbappendstream_locked(so_rcv, m);
T> + sbappendstream_locked(so_rcv, m, 0);
T> sorwakeup_locked(so);
T> SOCKBUF_UNLOCK_ASSERT(so_rcv);
T>
T> @@ -1768,7 +1768,7 @@ wr_ack(struct toepcb *toep, struct mbuf *m)
T> so_sowwakeup_locked(so);
T> }
T>
T> - if (snd->sb_sndptroff < snd->sb_cc)
T> + if (snd->sb_sndptroff < sbused(snd))
T> t3_push_frames(so, 0);
T>
T> out_free:
T> Index: sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_cm.c
T> ===================================================================
T> --- sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_cm.c (.../head) (revision 266804)
T> +++ sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_cm.c (.../projects/sendfile) (revision 266807)
T> @@ -1515,11 +1515,11 @@ process_data(struct iwch_ep *ep)
T> process_mpa_request(ep);
T> break;
T> default:
T> - if (ep->com.so->so_rcv.sb_cc)
T> + if (sbavail(&ep->com.so->so_rcv))
T> printf("%s Unexpected streaming data."
T> " ep %p state %d so %p so_state %x so_rcv.sb_cc %u so_rcv.sb_mb %p\n",
T> __FUNCTION__, ep, state_read(&ep->com), ep->com.so, ep->com.so->so_state,
T> - ep->com.so->so_rcv.sb_cc, ep->com.so->so_rcv.sb_mb);
T> + sbavail(&ep->com.so->so_rcv), ep->com.so->so_rcv.sb_mb);
T> break;
T> }
T> return;
T> Index: sys/kern/uipc_debug.c
T> ===================================================================
T> --- sys/kern/uipc_debug.c (.../head) (revision 266804)
T> +++ sys/kern/uipc_debug.c (.../projects/sendfile) (revision 266807)
T> @@ -403,7 +403,8 @@ db_print_sockbuf(struct sockbuf *sb, const char *s
T> db_printf("sb_sndptroff: %u\n", sb->sb_sndptroff);
T>
T> db_print_indent(indent);
T> - db_printf("sb_cc: %u ", sb->sb_cc);
T> + db_printf("sb_acc: %u ", sb->sb_acc);
T> + db_printf("sb_ccc: %u ", sb->sb_ccc);
T> db_printf("sb_hiwat: %u ", sb->sb_hiwat);
T> db_printf("sb_mbcnt: %u ", sb->sb_mbcnt);
T> db_printf("sb_mbmax: %u\n", sb->sb_mbmax);
T> Index: sys/kern/uipc_mbuf.c
T> ===================================================================
T> --- sys/kern/uipc_mbuf.c (.../head) (revision 266804)
T> +++ sys/kern/uipc_mbuf.c (.../projects/sendfile) (revision 266807)
T> @@ -389,7 +389,7 @@ mb_dupcl(struct mbuf *n, struct mbuf *m)
T> * cleaned too.
T> */
T> void
T> -m_demote(struct mbuf *m0, int all)
T> +m_demote(struct mbuf *m0, int all, int flags)
T> {
T> struct mbuf *m;
T>
T> @@ -405,7 +405,7 @@ void
T> m_freem(m->m_nextpkt);
T> m->m_nextpkt = NULL;
T> }
T> - m->m_flags = m->m_flags & (M_EXT|M_RDONLY|M_NOFREE);
T> + m->m_flags = m->m_flags & (M_EXT | M_RDONLY | M_NOFREE | flags);
T> }
T> }
T>
T> Index: sys/kern/sys_socket.c
T> ===================================================================
T> --- sys/kern/sys_socket.c (.../head) (revision 266804)
T> +++ sys/kern/sys_socket.c (.../projects/sendfile) (revision 266807)
T> @@ -167,20 +167,17 @@ soo_ioctl(struct file *fp, u_long cmd, void *data,
T>
T> case FIONREAD:
T> /* Unlocked read. */
T> - *(int *)data = so->so_rcv.sb_cc;
T> + *(int *)data = sbavail(&so->so_rcv);
T> break;
T>
T> case FIONWRITE:
T> /* Unlocked read. */
T> - *(int *)data = so->so_snd.sb_cc;
T> + *(int *)data = sbavail(&so->so_snd);
T> break;
T>
T> case FIONSPACE:
T> - if ((so->so_snd.sb_hiwat < so->so_snd.sb_cc) ||
T> - (so->so_snd.sb_mbmax < so->so_snd.sb_mbcnt))
T> - *(int *)data = 0;
T> - else
T> - *(int *)data = sbspace(&so->so_snd);
T> + /* Unlocked read. */
T> + *(int *)data = sbspace(&so->so_snd);
T> break;
T>
T> case FIOSETOWN:
T> @@ -246,6 +243,7 @@ soo_stat(struct file *fp, struct stat *ub, struct
T> struct thread *td)
T> {
T> struct socket *so = fp->f_data;
T> + struct sockbuf *sb;
T> #ifdef MAC
T> int error;
T> #endif
T> @@ -261,15 +259,18 @@ soo_stat(struct file *fp, struct stat *ub, struct
T> * If SBS_CANTRCVMORE is set, but there's still data left in the
T> * receive buffer, the socket is still readable.
T> */
T> - SOCKBUF_LOCK(&so->so_rcv);
T> - if ((so->so_rcv.sb_state & SBS_CANTRCVMORE) == 0 ||
T> - so->so_rcv.sb_cc != 0)
T> + sb = &so->so_rcv;
T> + SOCKBUF_LOCK(sb);
T> + if ((sb->sb_state & SBS_CANTRCVMORE) == 0 || sbavail(sb))
T> ub->st_mode |= S_IRUSR | S_IRGRP | S_IROTH;
T> - ub->st_size = so->so_rcv.sb_cc - so->so_rcv.sb_ctl;
T> - SOCKBUF_UNLOCK(&so->so_rcv);
T> - /* Unlocked read. */
T> - if ((so->so_snd.sb_state & SBS_CANTSENDMORE) == 0)
T> + ub->st_size = sbavail(sb) - sb->sb_ctl;
T> + SOCKBUF_UNLOCK(sb);
T> +
T> + sb = &so->so_snd;
T> + SOCKBUF_LOCK(sb);
T> + if ((sb->sb_state & SBS_CANTSENDMORE) == 0)
T> ub->st_mode |= S_IWUSR | S_IWGRP | S_IWOTH;
T> + SOCKBUF_UNLOCK(sb);
T> ub->st_uid = so->so_cred->cr_uid;
T> ub->st_gid = so->so_cred->cr_gid;
T> return (*so->so_proto->pr_usrreqs->pru_sense)(so, ub);
T> Index: sys/kern/uipc_usrreq.c
T> ===================================================================
T> --- sys/kern/uipc_usrreq.c (.../head) (revision 266804)
T> +++ sys/kern/uipc_usrreq.c (.../projects/sendfile) (revision 266807)
T> @@ -790,11 +790,10 @@ uipc_rcvd(struct socket *so, int flags)
T> u_int mbcnt, sbcc;
T>
T> unp = sotounpcb(so);
T> - KASSERT(unp != NULL, ("uipc_rcvd: unp == NULL"));
T> + KASSERT(unp != NULL, ("%s: unp == NULL", __func__));
T> + KASSERT(so->so_type == SOCK_STREAM || so->so_type == SOCK_SEQPACKET,
T> + ("%s: socktype %d", __func__, so->so_type));
T>
T> - if (so->so_type != SOCK_STREAM && so->so_type != SOCK_SEQPACKET)
T> - panic("uipc_rcvd socktype %d", so->so_type);
T> -
T> /*
T> * Adjust backpressure on sender and wakeup any waiting to write.
T> *
T> @@ -807,7 +806,7 @@ uipc_rcvd(struct socket *so, int flags)
T> */
T> SOCKBUF_LOCK(&so->so_rcv);
T> mbcnt = so->so_rcv.sb_mbcnt;
T> - sbcc = so->so_rcv.sb_cc;
T> + sbcc = sbavail(&so->so_rcv);
T> SOCKBUF_UNLOCK(&so->so_rcv);
T> /*
T> * There is a benign race condition at this point. If we're planning to
T> @@ -843,7 +842,10 @@ uipc_send(struct socket *so, int flags, struct mbu
T> int error = 0;
T>
T> unp = sotounpcb(so);
T> - KASSERT(unp != NULL, ("uipc_send: unp == NULL"));
T> + KASSERT(unp != NULL, ("%s: unp == NULL", __func__));
T> + KASSERT(so->so_type == SOCK_STREAM || so->so_type == SOCK_DGRAM ||
T> + so->so_type == SOCK_SEQPACKET,
T> + ("%s: socktype %d", __func__, so->so_type));
T>
T> if (flags & PRUS_OOB) {
T> error = EOPNOTSUPP;
T> @@ -994,7 +996,7 @@ uipc_send(struct socket *so, int flags, struct mbu
T> }
T>
T> mbcnt = so2->so_rcv.sb_mbcnt;
T> - sbcc = so2->so_rcv.sb_cc;
T> + sbcc = sbavail(&so2->so_rcv);
T> sorwakeup_locked(so2);
T>
T> /*
T> @@ -1011,9 +1013,6 @@ uipc_send(struct socket *so, int flags, struct mbu
T> UNP_PCB_UNLOCK(unp2);
T> m = NULL;
T> break;
T> -
T> - default:
T> - panic("uipc_send unknown socktype");
T> }
T>
T> /*
T> Index: sys/kern/vfs_default.c
T> ===================================================================
T> --- sys/kern/vfs_default.c (.../head) (revision 266804)
T> +++ sys/kern/vfs_default.c (.../projects/sendfile) (revision 266807)
T> @@ -111,6 +111,7 @@ struct vop_vector default_vnodeops = {
T> .vop_close = VOP_NULL,
T> .vop_fsync = VOP_NULL,
T> .vop_getpages = vop_stdgetpages,
T> + .vop_getpages_async = vop_stdgetpages_async,
T> .vop_getwritemount = vop_stdgetwritemount,
T> .vop_inactive = VOP_NULL,
T> .vop_ioctl = VOP_ENOTTY,
T> @@ -726,10 +727,19 @@ vop_stdgetpages(ap)
T> {
T>
T> return vnode_pager_generic_getpages(ap->a_vp, ap->a_m,
T> - ap->a_count, ap->a_reqpage);
T> + ap->a_count, ap->a_reqpage, NULL, NULL);
T> }
T>
T> +/* XXX Needs good comment and a manpage. */
T> int
T> +vop_stdgetpages_async(struct vop_getpages_async_args *ap)
T> +{
T> +
T> + return vnode_pager_generic_getpages(ap->a_vp, ap->a_m,
T> + ap->a_count, ap->a_reqpage, ap->a_vop_getpages_iodone, ap->a_arg);
T> +}
T> +
T> +int
T> vop_stdkqfilter(struct vop_kqfilter_args *ap)
T> {
T> return vfs_kqfilter(ap);
T> Index: sys/kern/uipc_socket.c
T> ===================================================================
T> --- sys/kern/uipc_socket.c (.../head) (revision 266804)
T> +++ sys/kern/uipc_socket.c (.../projects/sendfile) (revision 266807)
T> @@ -1459,12 +1459,12 @@ restart:
T> * 2. MSG_DONTWAIT is not set
T> */
T> if (m == NULL || (((flags & MSG_DONTWAIT) == 0 &&
T> - so->so_rcv.sb_cc < uio->uio_resid) &&
T> - so->so_rcv.sb_cc < so->so_rcv.sb_lowat &&
T> + sbavail(&so->so_rcv) < uio->uio_resid) &&
T> + sbavail(&so->so_rcv) < so->so_rcv.sb_lowat &&
T> m->m_nextpkt == NULL && (pr->pr_flags & PR_ATOMIC) == 0)) {
T> - KASSERT(m != NULL || !so->so_rcv.sb_cc,
T> - ("receive: m == %p so->so_rcv.sb_cc == %u",
T> - m, so->so_rcv.sb_cc));
T> + KASSERT(m != NULL || !sbavail(&so->so_rcv),
T> + ("receive: m == %p sbavail == %u",
T> + m, sbavail(&so->so_rcv)));
T> if (so->so_error) {
T> if (m != NULL)
T> goto dontblock;
T> @@ -1746,9 +1746,7 @@ dontblock:
T> SOCKBUF_LOCK(&so->so_rcv);
T> }
T> }
T> - m->m_data += len;
T> - m->m_len -= len;
T> - so->so_rcv.sb_cc -= len;
T> + sbmtrim(&so->so_rcv, m, len);
T> }
T> }
T> SOCKBUF_LOCK_ASSERT(&so->so_rcv);
T> @@ -1913,7 +1911,7 @@ restart:
T>
T> /* Abort if socket has reported problems. */
T> if (so->so_error) {
T> - if (sb->sb_cc > 0)
T> + if (sbavail(sb) > 0)
T> goto deliver;
T> if (oresid > uio->uio_resid)
T> goto out;
T> @@ -1925,7 +1923,7 @@ restart:
T>
T> /* Door is closed. Deliver what is left, if any. */
T> if (sb->sb_state & SBS_CANTRCVMORE) {
T> - if (sb->sb_cc > 0)
T> + if (sbavail(sb) > 0)
T> goto deliver;
T> else
T> goto out;
T> @@ -1932,7 +1930,7 @@ restart:
T> }
T>
T> /* Socket buffer is empty and we shall not block. */
T> - if (sb->sb_cc == 0 &&
T> + if (sbavail(sb) == 0 &&
T> ((so->so_state & SS_NBIO) || (flags & (MSG_DONTWAIT|MSG_NBIO)))) {
T> error = EAGAIN;
T> goto out;
T> @@ -1939,18 +1937,18 @@ restart:
T> }
T>
T> /* Socket buffer got some data that we shall deliver now. */
T> - if (sb->sb_cc > 0 && !(flags & MSG_WAITALL) &&
T> + if (sbavail(sb) > 0 && !(flags & MSG_WAITALL) &&
T> ((sb->sb_flags & SS_NBIO) ||
T> (flags & (MSG_DONTWAIT|MSG_NBIO)) ||
T> - sb->sb_cc >= sb->sb_lowat ||
T> - sb->sb_cc >= uio->uio_resid ||
T> - sb->sb_cc >= sb->sb_hiwat) ) {
T> + sbavail(sb) >= sb->sb_lowat ||
T> + sbavail(sb) >= uio->uio_resid ||
T> + sbavail(sb) >= sb->sb_hiwat) ) {
T> goto deliver;
T> }
T>
T> /* On MSG_WAITALL we must wait until all data or error arrives. */
T> if ((flags & MSG_WAITALL) &&
T> - (sb->sb_cc >= uio->uio_resid || sb->sb_cc >= sb->sb_hiwat))
T> + (sbavail(sb) >= uio->uio_resid || sbavail(sb) >= sb->sb_hiwat))
T> goto deliver;
T>
T> /*
T> @@ -1964,7 +1962,7 @@ restart:
T>
T> deliver:
T> SOCKBUF_LOCK_ASSERT(&so->so_rcv);
T> - KASSERT(sb->sb_cc > 0, ("%s: sockbuf empty", __func__));
T> + KASSERT(sbavail(sb) > 0, ("%s: sockbuf empty", __func__));
T> KASSERT(sb->sb_mb != NULL, ("%s: sb_mb == NULL", __func__));
T>
T> /* Statistics. */
T> @@ -1972,7 +1970,7 @@ deliver:
T> uio->uio_td->td_ru.ru_msgrcv++;
T>
T> /* Fill uio until full or current end of socket buffer is reached. */
T> - len = min(uio->uio_resid, sb->sb_cc);
T> + len = min(uio->uio_resid, sbavail(sb));
T> if (mp0 != NULL) {
T> /* Dequeue as many mbufs as possible. */
T> if (!(flags & MSG_PEEK) && len >= sb->sb_mb->m_len) {
T> @@ -1983,6 +1981,8 @@ deliver:
T> for (m = sb->sb_mb;
T> m != NULL && m->m_len <= len;
T> m = m->m_next) {
T> + KASSERT(!(m->m_flags & M_NOTAVAIL),
T> + ("%s: m %p not available", __func__, m));
T> len -= m->m_len;
T> uio->uio_resid -= m->m_len;
T> sbfree(sb, m);
T> @@ -2107,9 +2107,9 @@ soreceive_dgram(struct socket *so, struct sockaddr
T> */
T> SOCKBUF_LOCK(&so->so_rcv);
T> while ((m = so->so_rcv.sb_mb) == NULL) {
T> - KASSERT(so->so_rcv.sb_cc == 0,
T> - ("soreceive_dgram: sb_mb NULL but sb_cc %u",
T> - so->so_rcv.sb_cc));
T> + KASSERT(sbavail(&so->so_rcv) == 0,
T> + ("soreceive_dgram: sb_mb NULL but sbavail %u",
T> + sbavail(&so->so_rcv)));
T> if (so->so_error) {
T> error = so->so_error;
T> so->so_error = 0;
T> @@ -3157,7 +3157,7 @@ filt_soread(struct knote *kn, long hint)
T> so = kn->kn_fp->f_data;
T> SOCKBUF_LOCK_ASSERT(&so->so_rcv);
T>
T> - kn->kn_data = so->so_rcv.sb_cc - so->so_rcv.sb_ctl;
T> + kn->kn_data = sbavail(&so->so_rcv) - so->so_rcv.sb_ctl;
T> if (so->so_rcv.sb_state & SBS_CANTRCVMORE) {
T> kn->kn_flags |= EV_EOF;
T> kn->kn_fflags = so->so_error;
T> @@ -3167,7 +3167,7 @@ filt_soread(struct knote *kn, long hint)
T> else if (kn->kn_sfflags & NOTE_LOWAT)
T> return (kn->kn_data >= kn->kn_sdata);
T> else
T> - return (so->so_rcv.sb_cc >= so->so_rcv.sb_lowat);
T> + return (sbavail(&so->so_rcv) >= so->so_rcv.sb_lowat);
T> }
T>
T> static void
T> @@ -3350,7 +3350,7 @@ soisdisconnected(struct socket *so)
T> sorwakeup_locked(so);
T> SOCKBUF_LOCK(&so->so_snd);
T> so->so_snd.sb_state |= SBS_CANTSENDMORE;
T> - sbdrop_locked(&so->so_snd, so->so_snd.sb_cc);
T> + sbdrop_locked(&so->so_snd, sbused(&so->so_snd));
T> sowwakeup_locked(so);
T> wakeup(&so->so_timeo);
T> }
T> Index: sys/kern/vnode_if.src
T> ===================================================================
T> --- sys/kern/vnode_if.src (.../head) (revision 266804)
T> +++ sys/kern/vnode_if.src (.../projects/sendfile) (revision 266807)
T> @@ -477,6 +477,19 @@ vop_getpages {
T> };
T>
T>
T> +%% getpages_async vp L L L
T> +
T> +vop_getpages_async {
T> + IN struct vnode *vp;
T> + IN vm_page_t *m;
T> + IN int count;
T> + IN int reqpage;
T> + IN vm_ooffset_t offset;
T> + IN void (*vop_getpages_iodone)(void *);
T> + IN void *arg;
T> +};
T> +
T> +
T> %% putpages vp L L L
T>
T> vop_putpages {
T> Index: sys/kern/uipc_sockbuf.c
T> ===================================================================
T> --- sys/kern/uipc_sockbuf.c (.../head) (revision 266804)
T> +++ sys/kern/uipc_sockbuf.c (.../projects/sendfile) (revision 266807)
T> @@ -68,7 +68,152 @@ static u_long sb_efficiency = 8; /* parameter for
T> static struct mbuf *sbcut_internal(struct sockbuf *sb, int len);
T> static void sbflush_internal(struct sockbuf *sb);
T>
T> +static void
T> +sb_shift_nrdy(struct sockbuf *sb, struct mbuf *m)
T> +{
T> +
T> + SOCKBUF_LOCK_ASSERT(sb);
T> + KASSERT(m->m_flags & M_NOTREADY, ("%s: m %p !M_NOTREADY", __func__, m));
T> +
T> + m = m->m_next;
T> + while (m != NULL && !(m->m_flags & M_NOTREADY)) {
T> + m->m_flags &= ~M_BLOCKED;
T> + sb->sb_acc += m->m_len;
T> + m = m->m_next;
T> + }
T> +
T> + sb->sb_fnrdy = m;
T> +}
T> +
T> +int
T> +sbready(struct sockbuf *sb, struct mbuf *m, int count)
T> +{
T> + u_int blocker;
T> +
T> + SOCKBUF_LOCK(sb);
T> +
T> + if (sb->sb_state & SBS_CANTSENDMORE) {
T> + SOCKBUF_UNLOCK(sb);
T> + return (ENOTCONN);
T> + }
T> +
T> + KASSERT(sb->sb_fnrdy != NULL, ("%s: sb %p NULL fnrdy", __func__, sb));
T> +
T> + blocker = (sb->sb_fnrdy == m) ? M_BLOCKED : 0;
T> +
T> + for (int i = 0; i < count; i++, m = m->m_next) {
T> + KASSERT(m->m_flags & M_NOTREADY,
T> + ("%s: m %p !M_NOTREADY", __func__, m));
T> + m->m_flags &= ~(M_NOTREADY | blocker);
T> + if (blocker)
T> + sb->sb_acc += m->m_len;
T> + }
T> +
T> + if (!blocker) {
T> + SOCKBUF_UNLOCK(sb);
T> + return (EWOULDBLOCK);
T> + }
T> +
T> + /* This one was blocking all the queue. */
T> + for (; m && (m->m_flags & M_NOTREADY) == 0; m = m->m_next) {
T> + KASSERT(m->m_flags & M_BLOCKED,
T> + ("%s: m %p !M_BLOCKED", __func__, m));
T> + m->m_flags &= ~M_BLOCKED;
T> + sb->sb_acc += m->m_len;
T> + }
T> +
T> + sb->sb_fnrdy = m;
T> +
T> + SOCKBUF_UNLOCK(sb);
T> +
T> + return (0);
T> +}
T> +
T> /*
T> + * Adjust sockbuf state reflecting allocation of m.
T> + */
T> +void
T> +sballoc(struct sockbuf *sb, struct mbuf *m)
T> +{
T> +
T> + SOCKBUF_LOCK_ASSERT(sb);
T> +
T> + sb->sb_ccc += m->m_len;
T> +
T> + if (sb->sb_fnrdy == NULL) {
T> + if (m->m_flags & M_NOTREADY)
T> + sb->sb_fnrdy = m;
T> + else
T> + sb->sb_acc += m->m_len;
T> + } else
T> + m->m_flags |= M_BLOCKED;
T> +
T> + if (m->m_type != MT_DATA && m->m_type != MT_OOBDATA)
T> + sb->sb_ctl += m->m_len;
T> +
T> + sb->sb_mbcnt += MSIZE;
T> + sb->sb_mcnt += 1;
T> +
T> + if (m->m_flags & M_EXT) {
T> + sb->sb_mbcnt += m->m_ext.ext_size;
T> + sb->sb_ccnt += 1;
T> + }
T> +}
T> +
T> +/*
T> + * Adjust sockbuf state reflecting freeing of m.
T> + */
T> +void
T> +sbfree(struct sockbuf *sb, struct mbuf *m)
T> +{
T> +
T> +#if 0 /* XXX: not yet: soclose() call path comes here w/o lock. */
T> + SOCKBUF_LOCK_ASSERT(sb);
T> +#endif
T> +
T> + sb->sb_ccc -= m->m_len;
T> +
T> + if (!(m->m_flags & M_NOTAVAIL))
T> + sb->sb_acc -= m->m_len;
T> +
T> + if (sb->sb_fnrdy == m)
T> + sb_shift_nrdy(sb, m);
T> +
T> + if (m->m_type != MT_DATA && m->m_type != MT_OOBDATA)
T> + sb->sb_ctl -= m->m_len;
T> +
T> + sb->sb_mbcnt -= MSIZE;
T> + sb->sb_mcnt -= 1;
T> + if (m->m_flags & M_EXT) {
T> + sb->sb_mbcnt -= m->m_ext.ext_size;
T> + sb->sb_ccnt -= 1;
T> + }
T> +
T> + if (sb->sb_sndptr == m) {
T> + sb->sb_sndptr = NULL;
T> + sb->sb_sndptroff = 0;
T> + }
T> + if (sb->sb_sndptroff != 0)
T> + sb->sb_sndptroff -= m->m_len;
T> +}
T> +
T> +/*
T> + * Trim some amount of data from (first?) mbuf in buffer.
T> + */
T> +void
T> +sbmtrim(struct sockbuf *sb, struct mbuf *m, int len)
T> +{
T> +
T> + SOCKBUF_LOCK_ASSERT(sb);
T> + KASSERT(len < m->m_len, ("%s: m %p len %d", __func__, m, len));
T> +
T> + m->m_data += len;
T> + m->m_len -= len;
T> + sb->sb_acc -= len;
T> + sb->sb_ccc -= len;
T> +}
T> +
T> +/*
T> * Socantsendmore indicates that no more data will be sent on the socket; it
T> * would normally be applied to a socket when the user informs the system
T> * that no more data is to be sent, by the protocol code (in case
T> @@ -127,7 +272,7 @@ sbwait(struct sockbuf *sb)
T> SOCKBUF_LOCK_ASSERT(sb);
T>
T> sb->sb_flags |= SB_WAIT;
T> - return (msleep_sbt(&sb->sb_cc, &sb->sb_mtx,
T> + return (msleep_sbt(&sb->sb_acc, &sb->sb_mtx,
T> (sb->sb_flags & SB_NOINTR) ? PSOCK : PSOCK | PCATCH, "sbwait",
T> sb->sb_timeo, 0, 0));
T> }
T> @@ -184,7 +329,7 @@ sowakeup(struct socket *so, struct sockbuf *sb)
T> sb->sb_flags &= ~SB_SEL;
T> if (sb->sb_flags & SB_WAIT) {
T> sb->sb_flags &= ~SB_WAIT;
T> - wakeup(&sb->sb_cc);
T> + wakeup(&sb->sb_acc);
T> }
T> KNOTE_LOCKED(&sb->sb_sel.si_note, 0);
T> if (sb->sb_upcall != NULL) {
T> @@ -519,7 +664,7 @@ sbappend(struct sockbuf *sb, struct mbuf *m)
T> * that is, a stream protocol (such as TCP).
T> */
T> void
T> -sbappendstream_locked(struct sockbuf *sb, struct mbuf *m)
T> +sbappendstream_locked(struct sockbuf *sb, struct mbuf *m, int flags)
T> {
T> SOCKBUF_LOCK_ASSERT(sb);
T>
T> @@ -529,8 +674,8 @@ void
T> SBLASTMBUFCHK(sb);
T>
T> /* Remove all packet headers and mbuf tags to get a pure data chain. */
T> - m_demote(m, 1);
T> -
T> + m_demote(m, 1, flags & PRUS_NOTREADY ? M_NOTREADY : 0);
T> +
T> sbcompress(sb, m, sb->sb_mbtail);
T>
T> sb->sb_lastrecord = sb->sb_mb;
T> @@ -543,38 +688,59 @@ void
T> * that is, a stream protocol (such as TCP).
T> */
T> void
T> -sbappendstream(struct sockbuf *sb, struct mbuf *m)
T> +sbappendstream(struct sockbuf *sb, struct mbuf *m, int flags)
T> {
T>
T> SOCKBUF_LOCK(sb);
T> - sbappendstream_locked(sb, m);
T> + sbappendstream_locked(sb, m, flags);
T> SOCKBUF_UNLOCK(sb);
T> }
T>
T> #ifdef SOCKBUF_DEBUG
T> void
T> -sbcheck(struct sockbuf *sb)
T> +sbcheck(struct sockbuf *sb, const char *file, int line)
T> {
T> - struct mbuf *m;
T> - struct mbuf *n = 0;
T> - u_long len = 0, mbcnt = 0;
T> + struct mbuf *m, *n, *fnrdy;
T> + u_long acc, ccc, mbcnt;
T>
T> SOCKBUF_LOCK_ASSERT(sb);
T>
T> + acc = ccc = mbcnt = 0;
T> + fnrdy = NULL;
T> +
T> for (m = sb->sb_mb; m; m = n) {
T> n = m->m_nextpkt;
T> for (; m; m = m->m_next) {
T> - len += m->m_len;
T> + if ((m->m_flags & M_NOTREADY) && fnrdy == NULL) {
T> + if (m != sb->sb_fnrdy) {
T> + printf("sb %p: fnrdy %p != m %p\n",
T> + sb, sb->sb_fnrdy, m);
T> + goto fail;
T> + }
T> + fnrdy = m;
T> + }
T> + if (fnrdy) {
T> + if (!(m->m_flags & M_NOTAVAIL)) {
T> + printf("sb %p: fnrdy %p, m %p is avail\n",
T> + sb, sb->sb_fnrdy, m);
T> + goto fail;
T> + }
T> + } else
T> + acc += m->m_len;
T> + ccc += m->m_len;
T> mbcnt += MSIZE;
T> if (m->m_flags & M_EXT) /*XXX*/ /* pretty sure this is bogus */
T> mbcnt += m->m_ext.ext_size;
T> }
T> }
T> - if (len != sb->sb_cc || mbcnt != sb->sb_mbcnt) {
T> - printf("cc %ld != %u || mbcnt %ld != %u\n", len, sb->sb_cc,
T> - mbcnt, sb->sb_mbcnt);
T> - panic("sbcheck");
T> + if (acc != sb->sb_acc || ccc != sb->sb_ccc || mbcnt != sb->sb_mbcnt) {
T> + printf("acc %ld/%u ccc %ld/%u mbcnt %ld/%u\n",
T> + acc, sb->sb_acc, ccc, sb->sb_ccc, mbcnt, sb->sb_mbcnt);
T> + goto fail;
T> }
T> + return;
T> +fail:
T> + panic("%s from %s:%u", __func__, file, line);
T> }
T> #endif
T>
T> @@ -800,6 +966,7 @@ sbcompress(struct sockbuf *sb, struct mbuf *m, str
T> if (n && (n->m_flags & M_EOR) == 0 &&
T> M_WRITABLE(n) &&
T> ((sb->sb_flags & SB_NOCOALESCE) == 0) &&
T> + !(m->m_flags & M_NOTREADY) &&
T> m->m_len <= MCLBYTES / 4 && /* XXX: Don't copy too much */
T> m->m_len <= M_TRAILINGSPACE(n) &&
T> n->m_type == m->m_type) {
T> @@ -806,7 +973,9 @@ sbcompress(struct sockbuf *sb, struct mbuf *m, str
T> bcopy(mtod(m, caddr_t), mtod(n, caddr_t) + n->m_len,
T> (unsigned)m->m_len);
T> n->m_len += m->m_len;
T> - sb->sb_cc += m->m_len;
T> + sb->sb_ccc += m->m_len;
T> + if (sb->sb_fnrdy == NULL)
T> + sb->sb_acc += m->m_len;
T> if (m->m_type != MT_DATA && m->m_type != MT_OOBDATA)
T> /* XXX: Probably don't need.*/
T> sb->sb_ctl += m->m_len;
T> @@ -843,13 +1012,13 @@ sbflush_internal(struct sockbuf *sb)
T> * Don't call sbcut(sb, 0) if the leading mbuf is non-empty:
T> * we would loop forever. Panic instead.
T> */
T> - if (!sb->sb_cc && (sb->sb_mb == NULL || sb->sb_mb->m_len))
T> + if (sb->sb_ccc == 0 && (sb->sb_mb == NULL || sb->sb_mb->m_len))
T> break;
T> - m_freem(sbcut_internal(sb, (int)sb->sb_cc));
T> + m_freem(sbcut_internal(sb, (int)sb->sb_ccc));
T> }
T> - if (sb->sb_cc || sb->sb_mb || sb->sb_mbcnt)
T> - panic("sbflush_internal: cc %u || mb %p || mbcnt %u",
T> - sb->sb_cc, (void *)sb->sb_mb, sb->sb_mbcnt);
T> + KASSERT(sb->sb_ccc == 0 && sb->sb_mb == 0 && sb->sb_mbcnt == 0,
T> + ("%s: ccc %u mb %p mbcnt %u", __func__,
T> + sb->sb_ccc, (void *)sb->sb_mb, sb->sb_mbcnt));
T> }
T>
T> void
T> @@ -891,7 +1060,9 @@ sbcut_internal(struct sockbuf *sb, int len)
T> if (m->m_len > len) {
T> m->m_len -= len;
T> m->m_data += len;
T> - sb->sb_cc -= len;
T> + sb->sb_ccc -= len;
T> + if (!(m->m_flags & M_NOTAVAIL))
T> + sb->sb_acc -= len;
T> if (sb->sb_sndptroff != 0)
T> sb->sb_sndptroff -= len;
T> if (m->m_type != MT_DATA && m->m_type != MT_OOBDATA)
T> @@ -977,8 +1148,8 @@ sbsndptr(struct sockbuf *sb, u_int off, u_int len,
T> struct mbuf *m, *ret;
T>
T> KASSERT(sb->sb_mb != NULL, ("%s: sb_mb is NULL", __func__));
T> - KASSERT(off + len <= sb->sb_cc, ("%s: beyond sb", __func__));
T> - KASSERT(sb->sb_sndptroff <= sb->sb_cc, ("%s: sndptroff broken", __func__));
T> + KASSERT(off + len <= sb->sb_acc, ("%s: beyond sb", __func__));
T> + KASSERT(sb->sb_sndptroff <= sb->sb_acc, ("%s: sndptroff broken", __func__));
T>
T> /*
T> * Is off below stored offset? Happens on retransmits.
T> @@ -1091,7 +1262,7 @@ void
T> sbtoxsockbuf(struct sockbuf *sb, struct xsockbuf *xsb)
T> {
T>
T> - xsb->sb_cc = sb->sb_cc;
T> + xsb->sb_cc = sb->sb_ccc;
T> xsb->sb_hiwat = sb->sb_hiwat;
T> xsb->sb_mbcnt = sb->sb_mbcnt;
T> xsb->sb_mcnt = sb->sb_mcnt;
T> Index: sys/kern/uipc_syscalls.c
T> ===================================================================
T> --- sys/kern/uipc_syscalls.c (.../head) (revision 266804)
T> +++ sys/kern/uipc_syscalls.c (.../projects/sendfile) (revision 266807)
T> @@ -132,9 +132,10 @@ static int filt_sfsync(struct knote *kn, long hint
T> */
T> static SYSCTL_NODE(_kern_ipc, OID_AUTO, sendfile, CTLFLAG_RW, 0,
T> "sendfile(2) tunables");
T> -static int sfreadahead = 1;
T> +
T> +static int sfreadahead = 0;
T> SYSCTL_INT(_kern_ipc_sendfile, OID_AUTO, readahead, CTLFLAG_RW,
T> - &sfreadahead, 0, "Number of sendfile(2) read-ahead MAXBSIZE blocks");
T> + &sfreadahead, 0, "Read this more pages than socket buffer can accept");
T>
T> #ifdef SFSYNC_DEBUG
T> static int sf_sync_debug = 0;
T> @@ -1988,7 +1989,7 @@ filt_sfsync(struct knote *kn, long hint)
T> * Detach mapped page and release resources back to the system.
T> */
T> int
T> -sf_buf_mext(struct mbuf *mb, void *addr, void *args)
T> +sf_mext_free(struct mbuf *mb, void *addr, void *args)
T> {
T> vm_page_t m;
T> struct sendfile_sync *sfs;
T> @@ -2009,13 +2010,42 @@ int
T> sfs = addr;
T> sf_sync_deref(sfs);
T> }
T> - /*
T> - * sfs may be invalid at this point, don't use it!
T> - */
T> return (EXT_FREE_OK);
T> }
T>
T> /*
T> + * Same as above, but forces the page to be detached from the object
T> + * and go into free pool.
T> + */
T> +static int
T> +sf_mext_free_nocache(struct mbuf *mb, void *addr, void *args)
T> +{
T> + vm_page_t m;
T> + struct sendfile_sync *sfs;
T> +
T> + m = sf_buf_page(args);
T> + sf_buf_free(args);
T> + vm_page_lock(m);
T> + vm_page_unwire(m, 0);
T> + if (m->wire_count == 0) {
T> + vm_object_t obj;
T> +
T> + if ((obj = m->object) == NULL)
T> + vm_page_free(m);
T> + else if (!vm_page_xbusied(m) && VM_OBJECT_TRYWLOCK(obj)) {
T> + vm_page_free(m);
T> + VM_OBJECT_WUNLOCK(obj);
T> + }
T> + }
T> + vm_page_unlock(m);
T> + if (addr != NULL) {
T> + sfs = addr;
T> + sf_sync_deref(sfs);
T> + }
T> + return (EXT_FREE_OK);
T> +}
T> +
T> +/*
T> * Called to remove a reference to a sf_sync object.
T> *
T> * This is generally done during the mbuf free path to signify
T> @@ -2608,106 +2638,181 @@ freebsd4_sendfile(struct thread *td, struct freebs
T> }
T> #endif /* COMPAT_FREEBSD4 */
T>
T> + /*
T> + * How much data to put into page i of n.
T> + * Only first and last pages are special.
T> + */
T> +static inline off_t
T> +xfsize(int i, int n, off_t off, off_t len)
T> +{
T> +
T> + if (i == 0)
T> + return (omin(PAGE_SIZE - (off & PAGE_MASK), len));
T> +
T> + if (i == n - 1 && ((off + len) & PAGE_MASK) > 0)
T> + return ((off + len) & PAGE_MASK);
T> +
T> + return (PAGE_SIZE);
T> +}
T> +
T> +/*
T> + * Offset within object for i page.
T> + */
T> +static inline vm_offset_t
T> +vmoff(int i, off_t off)
T> +{
T> +
T> + if (i == 0)
T> + return ((vm_offset_t)off);
T> +
T> + return (trunc_page(off + i * PAGE_SIZE));
T> +}
T> +
T> +/*
T> + * Pretend as if we don't have enough space, subtract xfsize() of
T> + * all pages that failed.
T> + */
T> +static inline void
T> +fixspace(int old, int new, off_t off, int *space)
T> +{
T> +
T> + KASSERT(old > new, ("%s: old %d new %d", __func__, old, new));
T> +
T> + /* Subtract last one. */
T> + *space -= xfsize(old - 1, old, off, *space);
T> + old--;
T> +
T> + if (new == old)
T> + /* There was only one page. */
T> + return;
T> +
T> + /* Subtract first one. */
T> + if (new == 0) {
T> + *space -= xfsize(0, old, off, *space);
T> + new++;
T> + }
T> +
T> + /* Rest of pages are full sized. */
T> + *space -= (old - new) * PAGE_SIZE;
T> +
T> + KASSERT(*space >= 0, ("%s: space went backwards", __func__));
T> +}
T> +
T> +struct sf_io {
T> + u_int nios;
T> + int npages;
T> + struct file *sock_fp;
T> + struct mbuf *m;
T> + vm_page_t pa[];
T> +};
T> +
T> +static void
T> +sf_io_done(void *arg)
T> +{
T> + struct sf_io *sfio = arg;
T> + struct socket *so;
T> +
T> + if (!refcount_release(&sfio->nios))
T> + return;
T> +
T> + so = sfio->sock_fp->f_data;
T> +
T> + if (sbready(&so->so_snd, sfio->m, sfio->npages) == 0) {
T> + struct mbuf *m;
T> +
T> + m = m_get(M_NOWAIT, MT_DATA);
T> + if (m == NULL) {
T> + panic("XXXGL");
T> + }
T> + m->m_len = 0;
T> + CURVNET_SET(so->so_vnet);
T> + /* XXXGL: curthread */
T> + (void )(so->so_proto->pr_usrreqs->pru_send)
T> + (so, 0, m, NULL, NULL, curthread);
T> + CURVNET_RESTORE();
T> + }
T> +
T> + /* XXXGL: curthread */
T> + fdrop(sfio->sock_fp, curthread);
T> + free(sfio, M_TEMP);
T> +}
T> +
T> static int
T> -sendfile_readpage(vm_object_t obj, struct vnode *vp, int nd,
T> - off_t off, int xfsize, int bsize, struct thread *td, vm_page_t *res)
T> +sendfile_swapin(vm_object_t obj, struct sf_io *sfio, off_t off, off_t len,
T> + int npages, int rhpages)
T> {
T> - vm_page_t m;
T> - vm_pindex_t pindex;
T> - ssize_t resid;
T> - int error, readahead, rv;
T> + vm_page_t *pa = sfio->pa;
T> + int nios;
T>
T> - pindex = OFF_TO_IDX(off);
T> + nios = 0;
T> VM_OBJECT_WLOCK(obj);
T> - m = vm_page_grab(obj, pindex, (vp != NULL ? VM_ALLOC_NOBUSY |
T> - VM_ALLOC_IGN_SBUSY : 0) | VM_ALLOC_WIRED | VM_ALLOC_NORMAL);
T> + for (int i = 0; i < npages; i++)
T> + pa[i] = vm_page_grab(obj, OFF_TO_IDX(vmoff(i, off)),
T> + VM_ALLOC_WIRED | VM_ALLOC_NORMAL);
T>
T> - /*
T> - * Check if page is valid for what we need, otherwise initiate I/O.
T> - *
T> - * The non-zero nd argument prevents disk I/O, instead we
T> - * return the caller what he specified in nd. In particular,
T> - * if we already turned some pages into mbufs, nd == EAGAIN
T> - * and the main function send them the pages before we come
T> - * here again and block.
T> - */
T> - if (m->valid != 0 && vm_page_is_valid(m, off & PAGE_MASK, xfsize)) {
T> - if (vp == NULL)
T> - vm_page_xunbusy(m);
T> - VM_OBJECT_WUNLOCK(obj);
T> - *res = m;
T> - return (0);
T> - } else if (nd != 0) {
T> - if (vp == NULL)
T> - vm_page_xunbusy(m);
T> - error = nd;
T> - goto free_page;
T> - }
T> + for (int i = 0; i < npages;) {
T> + int j, a, count, rv;
T>
T> - /*
T> - * Get the page from backing store.
T> - */
T> - error = 0;
T> - if (vp != NULL) {
T> - VM_OBJECT_WUNLOCK(obj);
T> - readahead = sfreadahead * MAXBSIZE;
T> + if (vm_page_is_valid(pa[i], vmoff(i, off) & PAGE_MASK,
T> + xfsize(i, npages, off, len))) {
T> + vm_page_xunbusy(pa[i]);
T> + i++;
T> + continue;
T> + }
T>
T> - /*
T> - * Use vn_rdwr() instead of the pager interface for
T> - * the vnode, to allow the read-ahead.
T> - *
T> - * XXXMAC: Because we don't have fp->f_cred here, we
T> - * pass in NOCRED. This is probably wrong, but is
T> - * consistent with our original implementation.
T> - */
T> - error = vn_rdwr(UIO_READ, vp, NULL, readahead, trunc_page(off),
T> - UIO_NOCOPY, IO_NODELOCKED | IO_VMIO | ((readahead /
T> - bsize) << IO_SEQSHIFT), td->td_ucred, NOCRED, &resid, td);
T> - SFSTAT_INC(sf_iocnt);
T> - VM_OBJECT_WLOCK(obj);
T> - } else {
T> - if (vm_pager_has_page(obj, pindex, NULL, NULL)) {
T> - rv = vm_pager_get_pages(obj, &m, 1, 0);
T> - SFSTAT_INC(sf_iocnt);
T> - m = vm_page_lookup(obj, pindex);
T> - if (m == NULL)
T> - error = EIO;
T> - else if (rv != VM_PAGER_OK) {
T> - vm_page_lock(m);
T> - vm_page_free(m);
T> - vm_page_unlock(m);
T> - m = NULL;
T> - error = EIO;
T> + for (j = i + 1; j < npages; j++)
T> + if (vm_page_is_valid(pa[j], vmoff(j, off) & PAGE_MASK,
T> + xfsize(j, npages, off, len)))
T> + break;
T> +
T> + while (!vm_pager_has_page(obj, OFF_TO_IDX(vmoff(i, off)),
T> + NULL, &a) && i < j) {
T> + pmap_zero_page(pa[i]);
T> + pa[i]->valid = VM_PAGE_BITS_ALL;
T> + pa[i]->dirty = 0;
T> + vm_page_xunbusy(pa[i]);
T> + i++;
T> + }
T> + if (i == j)
T> + continue;
T> +
T> + count = min(a + 1, npages + rhpages - i);
T> + for (j = npages; j < i + count; j++) {
T> + pa[j] = vm_page_grab(obj, OFF_TO_IDX(vmoff(j, off)),
T> + VM_ALLOC_NORMAL | VM_ALLOC_NOWAIT);
T> + if (pa[j] == NULL) {
T> + count = j - i;
T> + break;
T> }
T> - } else {
T> - pmap_zero_page(m);
T> - m->valid = VM_PAGE_BITS_ALL;
T> - m->dirty = 0;
T> + if (pa[j]->valid) {
T> + vm_page_xunbusy(pa[j]);
T> + count = j - i;
T> + break;
T> + }
T> }
T> - if (m != NULL)
T> - vm_page_xunbusy(m);
T> +
T> + refcount_acquire(&sfio->nios);
T> + rv = vm_pager_get_pages_async(obj, pa + i, count, 0,
T> + &sf_io_done, sfio);
T> +
T> + KASSERT(rv == VM_PAGER_OK, ("%s: pager fail obj %p page %p",
T> + __func__, obj, pa[i]));
T> +
T> + SFSTAT_INC(sf_iocnt);
T> + nios++;
T> +
T> + for (j = i; j < i + count && j < npages; j++)
T> + KASSERT(pa[j] == vm_page_lookup(obj,
T> + OFF_TO_IDX(vmoff(j, off))),
T> + ("pa[j] %p lookup %p\n", pa[j],
T> + vm_page_lookup(obj, OFF_TO_IDX(vmoff(j, off)))));
T> +
T> + i += count;
T> }
T> - if (error == 0) {
T> - *res = m;
T> - } else if (m != NULL) {
T> -free_page:
T> - vm_page_lock(m);
T> - vm_page_unwire(m, 0);
T>
T> - /*
T> - * See if anyone else might know about this page. If
T> - * not and it is not valid, then free it.
T> - */
T> - if (m->wire_count == 0 && m->valid == 0 && !vm_page_busied(m))
T> - vm_page_free(m);
T> - vm_page_unlock(m);
T> - }
T> - KASSERT(error != 0 || (m->wire_count > 0 &&
T> - vm_page_is_valid(m, off & PAGE_MASK, xfsize)),
T> - ("wrong page state m %p off %#jx xfsize %d", m, (uintmax_t)off,
T> - xfsize));
T> VM_OBJECT_WUNLOCK(obj);
T> - return (error);
T> +
T> + return (nios);
T> }
T>
T> static int
T> @@ -2814,41 +2919,26 @@ vn_sendfile(struct file *fp, int sockfd, struct ui
T> struct vnode *vp;
T> struct vm_object *obj;
T> struct socket *so;
T> - struct mbuf *m;
T> + struct mbuf *m, *mh, *mhtail;
T> struct sf_buf *sf;
T> - struct vm_page *pg;
T> struct shmfd *shmfd;
T> struct vattr va;
T> - off_t off, xfsize, fsbytes, sbytes, rem, obj_size;
T> - int error, bsize, nd, hdrlen, mnw;
T> + off_t off, sbytes, rem, obj_size;
T> + int error, serror, bsize, hdrlen;
T>
T> - pg = NULL;
T> obj = NULL;
T> so = NULL;
T> - m = NULL;
T> - fsbytes = sbytes = 0;
T> - hdrlen = mnw = 0;
T> - rem = nbytes;
T> - obj_size = 0;
T> + m = mh = NULL;
T> + sbytes = 0;
T>
T> error = sendfile_getobj(td, fp, &obj, &vp, &shmfd, &obj_size, &bsize);
T> if (error != 0)
T> return (error);
T> - if (rem == 0)
T> - rem = obj_size;
T>
T> error = kern_sendfile_getsock(td, sockfd, &sock_fp, &so);
T> if (error != 0)
T> goto out;
T>
T> - /*
T> - * Do not wait on memory allocations but return ENOMEM for
T> - * caller to retry later.
T> - * XXX: Experimental.
T> - */
T> - if (flags & SF_MNOWAIT)
T> - mnw = 1;
T> -
T> #ifdef MAC
T> error = mac_socket_check_send(td->td_ucred, so);
T> if (error != 0)
T> @@ -2856,31 +2946,27 @@ vn_sendfile(struct file *fp, int sockfd, struct ui
T> #endif
T>
T> /* If headers are specified copy them into mbufs. */
T> - if (hdr_uio != NULL) {
T> + if (hdr_uio != NULL && hdr_uio->uio_resid > 0) {
T> hdr_uio->uio_td = td;
T> hdr_uio->uio_rw = UIO_WRITE;
T> - if (hdr_uio->uio_resid > 0) {
T> - /*
T> - * In FBSD < 5.0 the nbytes to send also included
T> - * the header. If compat is specified subtract the
T> - * header size from nbytes.
T> - */
T> - if (kflags & SFK_COMPAT) {
T> - if (nbytes > hdr_uio->uio_resid)
T> - nbytes -= hdr_uio->uio_resid;
T> - else
T> - nbytes = 0;
T> - }
T> - m = m_uiotombuf(hdr_uio, (mnw ? M_NOWAIT : M_WAITOK),
T> - 0, 0, 0);
T> - if (m == NULL) {
T> - error = mnw ? EAGAIN : ENOBUFS;
T> - goto out;
T> - }
T> - hdrlen = m_length(m, NULL);
T> + /*
T> + * In FBSD < 5.0 the nbytes to send also included
T> + * the header. If compat is specified subtract the
T> + * header size from nbytes.
T> + */
T> + if (kflags & SFK_COMPAT) {
T> + if (nbytes > hdr_uio->uio_resid)
T> + nbytes -= hdr_uio->uio_resid;
T> + else
T> + nbytes = 0;
T> }
T> - }
T> + mh = m_uiotombuf(hdr_uio, M_WAITOK, 0, 0, 0);
T> + hdrlen = m_length(mh, &mhtail);
T> + } else
T> + hdrlen = 0;
T>
T> + rem = nbytes ? omin(nbytes, obj_size - offset) : obj_size - offset;
T> +
T> /*
T> * Protect against multiple writers to the socket.
T> *
T> @@ -2900,21 +2986,13 @@ vn_sendfile(struct file *fp, int sockfd, struct ui
T> * The outer loop checks the state and available space of the socket
T> * and takes care of the overall progress.
T> */
T> - for (off = offset; ; ) {
T> + for (off = offset; rem > 0; ) {
T> + struct sf_io *sfio;
T> + vm_page_t *pa;
T> struct mbuf *mtail;
T> - int loopbytes;
T> - int space;
T> - int done;
T> + int nios, space, npages, rhpages;
T>
T> - if ((nbytes != 0 && nbytes == fsbytes) ||
T> - (nbytes == 0 && obj_size == fsbytes))
T> - break;
T> -
T> mtail = NULL;
T> - loopbytes = 0;
T> - space = 0;
T> - done = 0;
T> -
T> /*
T> * Check the socket state for ongoing connection,
T> * no errors and space in socket buffer.
T> @@ -2990,53 +3068,44 @@ retry_space:
T> VOP_UNLOCK(vp, 0);
T> goto done;
T> }
T> - obj_size = va.va_size;
T> + if (va.va_size != obj_size) {
T> + if (nbytes == 0)
T> + rem += va.va_size - obj_size;
T> + else if (offset + nbytes > va.va_size)
T> + rem -= (offset + nbytes - va.va_size);
T> + obj_size = va.va_size;
T> + }
T> }
T>
T> + if (space > rem)
T> + space = rem;
T> +
T> + if (off & PAGE_MASK)
T> + npages = 1 + howmany(space -
T> + (PAGE_SIZE - (off & PAGE_MASK)), PAGE_SIZE);
T> + else
T> + npages = howmany(space, PAGE_SIZE);
T> +
T> + rhpages = SF_READAHEAD(flags) ?
T> + SF_READAHEAD(flags) : sfreadahead;
T> + rhpages = min(howmany(obj_size - (off & ~PAGE_MASK) -
T> + (npages * PAGE_SIZE), PAGE_SIZE), rhpages);
T> +
T> + sfio = malloc(sizeof(struct sf_io) +
T> + (rhpages + npages) * sizeof(vm_page_t), M_TEMP, M_WAITOK);
T> + refcount_init(&sfio->nios, 1);
T> +
T> + nios = sendfile_swapin(obj, sfio, off, space, npages, rhpages);
T> +
T> /*
T> * Loop and construct maximum sized mbuf chain to be bulk
T> * dumped into socket buffer.
T> */
T> - while (space > loopbytes) {
T> - vm_offset_t pgoff;
T> + pa = sfio->pa;
T> + for (int i = 0; i < npages; i++) {
T> struct mbuf *m0;
T>
T> /*
T> - * Calculate the amount to transfer.
T> - * Not to exceed a page, the EOF,
T> - * or the passed in nbytes.
T> - */
T> - pgoff = (vm_offset_t)(off & PAGE_MASK);
T> - rem = obj_size - offset;
T> - if (nbytes != 0)
T> - rem = omin(rem, nbytes);
T> - rem -= fsbytes + loopbytes;
T> - xfsize = omin(PAGE_SIZE - pgoff, rem);
T> - xfsize = omin(space - loopbytes, xfsize);
T> - if (xfsize <= 0) {
T> - done = 1; /* all data sent */
T> - break;
T> - }
T> -
T> - /*
T> - * Attempt to look up the page. Allocate
T> - * if not found or wait and loop if busy.
T> - */
T> - if (m != NULL)
T> - nd = EAGAIN; /* send what we already got */
T> - else if ((flags & SF_NODISKIO) != 0)
T> - nd = EBUSY;
T> - else
T> - nd = 0;
T> - error = sendfile_readpage(obj, vp, nd, off,
T> - xfsize, bsize, td, &pg);
T> - if (error != 0) {
T> - if (error == EAGAIN)
T> - error = 0; /* not a real error */
T> - break;
T> - }
T> -
T> - /*
T> * Get a sendfile buf. When allocating the
T> * first buffer for mbuf chain, we usually
T> * wait as long as necessary, but this wait
T> @@ -3045,17 +3114,18 @@ retry_space:
T> * threads might exhaust the buffers and then
T> * deadlock.
T> */
T> - sf = sf_buf_alloc(pg, (mnw || m != NULL) ? SFB_NOWAIT :
T> - SFB_CATCH);
T> + sf = sf_buf_alloc(pa[i],
T> + m != NULL ? SFB_NOWAIT : SFB_CATCH);
T> if (sf == NULL) {
T> SFSTAT_INC(sf_allocfail);
T> - vm_page_lock(pg);
T> - vm_page_unwire(pg, 0);
T> - KASSERT(pg->object != NULL,
T> - ("%s: object disappeared", __func__));
T> - vm_page_unlock(pg);
T> + for (int j = i; j < npages; j++) {
T> + vm_page_lock(pa[j]);
T> + vm_page_unwire(pa[j], 0);
T> + vm_page_unlock(pa[j]);
T> + }
T> if (m == NULL)
T> - error = (mnw ? EAGAIN : EINTR);
T> + error = ENOBUFS;
T> + fixspace(npages, i, off, &space);
T> break;
T> }
T>
T> @@ -3063,36 +3133,26 @@ retry_space:
T> * Get an mbuf and set it up as having
T> * external storage.
T> */
T> - m0 = m_get((mnw ? M_NOWAIT : M_WAITOK), MT_DATA);
T> - if (m0 == NULL) {
T> - error = (mnw ? EAGAIN : ENOBUFS);
T> - (void)sf_buf_mext(NULL, NULL, sf);
T> - break;
T> - }
T> - if (m_extadd(m0, (caddr_t )sf_buf_kva(sf), PAGE_SIZE,
T> - sf_buf_mext, sfs, sf, M_RDONLY, EXT_SFBUF,
T> - (mnw ? M_NOWAIT : M_WAITOK)) != 0) {
T> - error = (mnw ? EAGAIN : ENOBUFS);
T> - (void)sf_buf_mext(NULL, NULL, sf);
T> - m_freem(m0);
T> - break;
T> - }
T> - m0->m_data = (char *)sf_buf_kva(sf) + pgoff;
T> - m0->m_len = xfsize;
T> + m0 = m_get(M_WAITOK, MT_DATA);
T> + (void )m_extadd(m0, (caddr_t )sf_buf_kva(sf), PAGE_SIZE,
T> + (flags & SF_NOCACHE) ? sf_mext_free_nocache :
T> + sf_mext_free, sfs, sf, M_RDONLY, EXT_SFBUF,
T> + M_WAITOK);
T> + m0->m_data = (char *)sf_buf_kva(sf) +
T> + (vmoff(i, off) & PAGE_MASK);
T> + m0->m_len = xfsize(i, npages, off, space);
T> + m0->m_flags |= M_NOTREADY;
T>
T> + if (i == 0)
T> + sfio->m = m0;
T> +
T> /* Append to mbuf chain. */
T> if (mtail != NULL)
T> mtail->m_next = m0;
T> - else if (m != NULL)
T> - m_last(m)->m_next = m0;
T> else
T> m = m0;
T> mtail = m0;
T>
T> - /* Keep track of bits processed. */
T> - loopbytes += xfsize;
T> - off += xfsize;
T> -
T> /*
T> * XXX eventually this should be a sfsync
T> * method call!
T> @@ -3104,47 +3164,51 @@ retry_space:
T> if (vp != NULL)
T> VOP_UNLOCK(vp, 0);
T>
T> + /* Keep track of bytes processed. */
T> + off += space;
T> + rem -= space;
T> +
T> + /* Prepend header, if any. */
T> + if (hdrlen) {
T> + mhtail->m_next = m;
T> + m = mh;
T> + mh = NULL;
T> + }
T> +
T> + if (error) {
T> + free(sfio, M_TEMP);
T> + goto done;
T> + }
T> +
T> /* Add the buffer chain to the socket buffer. */
T> - if (m != NULL) {
T> - int mlen, err;
T> + KASSERT(m_length(m, NULL) == space + hdrlen,
T> + ("%s: mlen %u space %d hdrlen %d",
T> + __func__, m_length(m, NULL), space, hdrlen));
T>
T> - mlen = m_length(m, NULL);
T> - SOCKBUF_LOCK(&so->so_snd);
T> - if (so->so_snd.sb_state & SBS_CANTSENDMORE) {
T> - error = EPIPE;
T> - SOCKBUF_UNLOCK(&so->so_snd);
T> - goto done;
T> - }
T> - SOCKBUF_UNLOCK(&so->so_snd);
T> - CURVNET_SET(so->so_vnet);
T> - /* Avoid error aliasing. */
T> - err = (*so->so_proto->pr_usrreqs->pru_send)
T> - (so, 0, m, NULL, NULL, td);
T> - CURVNET_RESTORE();
T> - if (err == 0) {
T> - /*
T> - * We need two counters to get the
T> - * file offset and nbytes to send
T> - * right:
T> - * - sbytes contains the total amount
T> - * of bytes sent, including headers.
T> - * - fsbytes contains the total amount
T> - * of bytes sent from the file.
T> - */
T> - sbytes += mlen;
T> - fsbytes += mlen;
T> - if (hdrlen) {
T> - fsbytes -= hdrlen;
T> - hdrlen = 0;
T> - }
T> - } else if (error == 0)
T> - error = err;
T> - m = NULL; /* pru_send always consumes */
T> + CURVNET_SET(so->so_vnet);
T> + if (nios == 0) {
T> + free(sfio, M_TEMP);
T> + serror = (*so->so_proto->pr_usrreqs->pru_send)
T> + (so, 0, m, NULL, NULL, td);
T> + } else {
T> + sfio->sock_fp = sock_fp;
T> + sfio->npages = npages;
T> + fhold(sock_fp);
T> + serror = (*so->so_proto->pr_usrreqs->pru_send)
T> + (so, PRUS_NOTREADY, m, NULL, NULL, td);
T> + sf_io_done(sfio);
T> }
T> + CURVNET_RESTORE();
T>
T> - /* Quit outer loop on error or when we're done. */
T> - if (done)
T> - break;
T> + if (serror == 0) {
T> + sbytes += space + hdrlen;
T> + if (hdrlen)
T> + hdrlen = 0;
T> + } else if (error == 0)
T> + error = serror;
T> + m = NULL; /* pru_send always consumes */
T> +
T> + /* Quit outer loop on error. */
T> if (error != 0)
T> goto done;
T> }
T> @@ -3179,6 +3243,8 @@ out:
T> fdrop(sock_fp, td);
T> if (m)
T> m_freem(m);
T> + if (mh)
T> + m_freem(mh);
T>
T> if (error == ERESTART)
T> error = EINTR;
T> Index: sys/netgraph/bluetooth/socket/ng_btsocket_l2cap.c
T> ===================================================================
T> --- sys/netgraph/bluetooth/socket/ng_btsocket_l2cap.c (.../head) (revision 266804)
T> +++ sys/netgraph/bluetooth/socket/ng_btsocket_l2cap.c (.../projects/sendfile) (revision 266807)
T> @@ -1127,9 +1127,8 @@ ng_btsocket_l2cap_process_l2ca_write_rsp(struct ng
T> /*
T> * Check if we have more data to send
T> */
T> -
T> sbdroprecord(&pcb->so->so_snd);
T> - if (pcb->so->so_snd.sb_cc > 0) {
T> + if (sbavail(&pcb->so->so_snd) > 0) {
T> if (ng_btsocket_l2cap_send2(pcb) == 0)
T> ng_btsocket_l2cap_timeout(pcb);
T> else
T> @@ -2510,7 +2509,7 @@ ng_btsocket_l2cap_send2(ng_btsocket_l2cap_pcb_p pc
T>
T> mtx_assert(&pcb->pcb_mtx, MA_OWNED);
T>
T> - if (pcb->so->so_snd.sb_cc == 0)
T> + if (sbavail(&pcb->so->so_snd) == 0)
T> return (EINVAL); /* XXX */
T>
T> m = m_dup(pcb->so->so_snd.sb_mb, M_NOWAIT);
T> Index: sys/netgraph/bluetooth/socket/ng_btsocket_rfcomm.c
T> ===================================================================
T> --- sys/netgraph/bluetooth/socket/ng_btsocket_rfcomm.c (.../head) (revision 266804)
T> +++ sys/netgraph/bluetooth/socket/ng_btsocket_rfcomm.c (.../projects/sendfile) (revision 266807)
T> @@ -3274,7 +3274,7 @@ ng_btsocket_rfcomm_pcb_send(ng_btsocket_rfcomm_pcb
T> }
T>
T> for (error = 0, sent = 0; sent < limit; sent ++) {
T> - length = min(pcb->mtu, pcb->so->so_snd.sb_cc);
T> + length = min(pcb->mtu, sbavail(&pcb->so->so_snd));
T> if (length == 0)
T> break;
T>
T> Index: sys/netgraph/bluetooth/socket/ng_btsocket_sco.c
T> ===================================================================
T> --- sys/netgraph/bluetooth/socket/ng_btsocket_sco.c (.../head) (revision 266804)
T> +++ sys/netgraph/bluetooth/socket/ng_btsocket_sco.c (.../projects/sendfile) (revision 266807)
T> @@ -906,7 +906,7 @@ ng_btsocket_sco_default_msg_input(struct ng_mesg *
T> sbdroprecord(&pcb->so->so_snd);
T>
T> /* Send more if we have any */
T> - if (pcb->so->so_snd.sb_cc > 0)
T> + if (sbavail(&pcb->so->so_snd) > 0)
T> if (ng_btsocket_sco_send2(pcb) == 0)
T> ng_btsocket_sco_timeout(pcb);
T>
T> @@ -1744,7 +1744,7 @@ ng_btsocket_sco_send2(ng_btsocket_sco_pcb_p pcb)
T> mtx_assert(&pcb->pcb_mtx, MA_OWNED);
T>
T> while (pcb->rt->pending < pcb->rt->num_pkts &&
T> - pcb->so->so_snd.sb_cc > 0) {
T> + sbavail(&pcb->so->so_snd) > 0) {
T> /* Get a copy of the first packet on send queue */
T> m = m_dup(pcb->so->so_snd.sb_mb, M_NOWAIT);
T> if (m == NULL) {
T> Index: sys/ofed/drivers/infiniband/ulp/sdp/sdp_main.c
T> ===================================================================
T> --- sys/ofed/drivers/infiniband/ulp/sdp/sdp_main.c (.../head) (revision 266804)
T> +++ sys/ofed/drivers/infiniband/ulp/sdp/sdp_main.c (.../projects/sendfile) (revision 266807)
T> @@ -746,7 +746,7 @@ sdp_start_disconnect(struct sdp_sock *ssk)
T> ("sdp_start_disconnect: sdp_drop() returned NULL"));
T> } else {
T> soisdisconnecting(so);
T> - unread = so->so_rcv.sb_cc;
T> + unread = sbused(&so->so_rcv);
T> sbflush(&so->so_rcv);
T> sdp_usrclosed(ssk);
T> if (!(ssk->flags & SDP_DROPPED)) {
T> @@ -888,7 +888,7 @@ sdp_append(struct sdp_sock *ssk, struct sockbuf *s
T> m_adj(mb, SDP_HEAD_SIZE);
T> n->m_pkthdr.len += mb->m_pkthdr.len;
T> n->m_flags |= mb->m_flags & (M_PUSH | M_URG);
T> - m_demote(mb, 1);
T> + m_demote(mb, 1, 0);
T> sbcompress(sb, mb, sb->sb_mbtail);
T> return;
T> }
T> @@ -1258,7 +1258,7 @@ sdp_sorecv(struct socket *so, struct sockaddr **ps
T> /* We will never ever get anything unless we are connected. */
T> if (!(so->so_state & (SS_ISCONNECTED|SS_ISDISCONNECTED))) {
T> /* When disconnecting there may be still some data left. */
T> - if (sb->sb_cc > 0)
T> + if (sbavail(sb))
T> goto deliver;
T> if (!(so->so_state & SS_ISDISCONNECTED))
T> error = ENOTCONN;
T> @@ -1266,7 +1266,7 @@ sdp_sorecv(struct socket *so, struct sockaddr **ps
T> }
T>
T> /* Socket buffer is empty and we shall not block. */
T> - if (sb->sb_cc == 0 &&
T> + if (sbavail(sb) == 0 &&
T> ((so->so_state & SS_NBIO) || (flags & (MSG_DONTWAIT|MSG_NBIO)))) {
T> error = EAGAIN;
T> goto out;
T> @@ -1277,7 +1277,7 @@ restart:
T>
T> /* Abort if socket has reported problems. */
T> if (so->so_error) {
T> - if (sb->sb_cc > 0)
T> + if (sbavail(sb))
T> goto deliver;
T> if (oresid > uio->uio_resid)
T> goto out;
T> @@ -1289,7 +1289,7 @@ restart:
T>
T> /* Door is closed. Deliver what is left, if any. */
T> if (sb->sb_state & SBS_CANTRCVMORE) {
T> - if (sb->sb_cc > 0)
T> + if (sbavail(sb))
T> goto deliver;
T> else
T> goto out;
T> @@ -1296,18 +1296,18 @@ restart:
T> }
T>
T> /* Socket buffer got some data that we shall deliver now. */
T> - if (sb->sb_cc > 0 && !(flags & MSG_WAITALL) &&
T> + if (sbavail(sb) && !(flags & MSG_WAITALL) &&
T> ((so->so_state & SS_NBIO) ||
T> (flags & (MSG_DONTWAIT|MSG_NBIO)) ||
T> - sb->sb_cc >= sb->sb_lowat ||
T> - sb->sb_cc >= uio->uio_resid ||
T> - sb->sb_cc >= sb->sb_hiwat) ) {
T> + sbavail(sb) >= sb->sb_lowat ||
T> + sbavail(sb) >= uio->uio_resid ||
T> + sbavail(sb) >= sb->sb_hiwat) ) {
T> goto deliver;
T> }
T>
T> /* On MSG_WAITALL we must wait until all data or error arrives. */
T> if ((flags & MSG_WAITALL) &&
T> - (sb->sb_cc >= uio->uio_resid || sb->sb_cc >= sb->sb_lowat))
T> + (sbavail(sb) >= uio->uio_resid || sbavail(sb) >= sb->sb_lowat))
T> goto deliver;
T>
T> /*
T> @@ -1321,7 +1321,7 @@ restart:
T>
T> deliver:
T> SOCKBUF_LOCK_ASSERT(&so->so_rcv);
T> - KASSERT(sb->sb_cc > 0, ("%s: sockbuf empty", __func__));
T> + KASSERT(sbavail(sb), ("%s: sockbuf empty", __func__));
T> KASSERT(sb->sb_mb != NULL, ("%s: sb_mb == NULL", __func__));
T>
T> /* Statistics. */
T> @@ -1329,7 +1329,7 @@ deliver:
T> uio->uio_td->td_ru.ru_msgrcv++;
T>
T> /* Fill uio until full or current end of socket buffer is reached. */
T> - len = min(uio->uio_resid, sb->sb_cc);
T> + len = min(uio->uio_resid, sbavail(sb));
T> if (mp0 != NULL) {
T> /* Dequeue as many mbufs as possible. */
T> if (!(flags & MSG_PEEK) && len >= sb->sb_mb->m_len) {
T> @@ -1509,7 +1509,7 @@ sdp_urg(struct sdp_sock *ssk, struct mbuf *mb)
T> if (so == NULL)
T> return;
T>
T> - so->so_oobmark = so->so_rcv.sb_cc + mb->m_pkthdr.len - 1;
T> + so->so_oobmark = sbused(&so->so_rcv) + mb->m_pkthdr.len - 1;
T> sohasoutofband(so);
T> ssk->oobflags &= ~(SDP_HAVEOOB | SDP_HADOOB);
T> if (!(so->so_options & SO_OOBINLINE)) {
T> Index: sys/ofed/drivers/infiniband/ulp/sdp/sdp_rx.c
T> ===================================================================
T> --- sys/ofed/drivers/infiniband/ulp/sdp/sdp_rx.c (.../head) (revision 266804)
T> +++ sys/ofed/drivers/infiniband/ulp/sdp/sdp_rx.c (.../projects/sendfile) (revision 266807)
T> @@ -183,7 +183,7 @@ sdp_post_recvs_needed(struct sdp_sock *ssk)
T> * Compute bytes in the receive queue and socket buffer.
T> */
T> bytes_in_process = (posted - SDP_MIN_TX_CREDITS) * buffer_size;
T> - bytes_in_process += ssk->socket->so_rcv.sb_cc;
T> + bytes_in_process += sbused(&ssk->socket->so_rcv);
T>
T> return bytes_in_process < max_bytes;
T> }
T> Index: sys/sys/socket.h
T> ===================================================================
T> --- sys/sys/socket.h (.../head) (revision 266804)
T> +++ sys/sys/socket.h (.../projects/sendfile) (revision 266807)
T> @@ -602,12 +602,15 @@ struct sf_hdtr_all {
T> * Sendfile-specific flag(s)
T> */
T> #define SF_NODISKIO 0x00000001
T> -#define SF_MNOWAIT 0x00000002
T> +#define SF_MNOWAIT 0x00000002 /* unused since 11.0 */
T> #define SF_SYNC 0x00000004
T> #define SF_KQUEUE 0x00000008
T> +#define SF_NOCACHE 0x00000010
T> +#define SF_FLAGS(rh, flags) (((rh) << 16) | (flags))
T>
T> #ifdef _KERNEL
T> #define SFK_COMPAT 0x00000001
T> +#define SF_READAHEAD(flags) ((flags) >> 16)
T> #endif /* _KERNEL */
T> #endif /* __BSD_VISIBLE */
T>
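
From userland the new flag encoding would be used roughly as below. This is only a
sketch that assumes the patched sys/socket.h; the read-ahead value of 16 is an
arbitrary example and its interpretation is up to the kernel side via SF_READAHEAD():

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

int
send_file_range(int filefd, int sock, off_t off, size_t len)
{
	off_t sbytes;

	/*
	 * Upper 16 bits carry the read-ahead hint, lower bits the usual
	 * SF_* flags.  SF_NODISKIO is accepted but ignored by the new
	 * implementation.
	 */
	return (sendfile(filefd, sock, off, len, NULL, &sbytes,
	    SF_FLAGS(16, SF_NOCACHE)));
}
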
T> Index: sys/sys/sockbuf.h
T> ===================================================================
T> --- sys/sys/sockbuf.h (.../head) (revision 266804)
T> +++ sys/sys/sockbuf.h (.../projects/sendfile) (revision 266807)
T> @@ -89,8 +89,13 @@ struct sockbuf {
T> struct mbuf *sb_lastrecord; /* (c/d) first mbuf of last
T> * record in socket buffer */
T> struct mbuf *sb_sndptr; /* (c/d) pointer into mbuf chain */
T> + struct mbuf *sb_fnrdy; /* (c/d) pointer to first not ready buffer */
T> +#if 0
T> + struct mbuf *sb_lnrdy; /* (c/d) pointer to last not ready buffer */
T> +#endif
T> u_int sb_sndptroff; /* (c/d) byte offset of ptr into chain */
T> - u_int sb_cc; /* (c/d) actual chars in buffer */
T> + u_int sb_acc; /* (c/d) available chars in buffer */
T> + u_int sb_ccc; /* (c/d) claimed chars in buffer */
T> u_int sb_hiwat; /* (c/d) max actual char count */
T> u_int sb_mbcnt; /* (c/d) chars of mbufs used */
T> u_int sb_mcnt; /* (c/d) number of mbufs in buffer */
T> @@ -120,10 +125,17 @@ struct sockbuf {
T> #define SOCKBUF_LOCK_ASSERT(_sb) mtx_assert(SOCKBUF_MTX(_sb), MA_OWNED)
T> #define SOCKBUF_UNLOCK_ASSERT(_sb) mtx_assert(SOCKBUF_MTX(_sb), MA_NOTOWNED)
T>
T> +/*
T> + * Socket buffer private mbuf(9) flags.
T> + */
T> +#define M_NOTREADY M_PROTO1 /* m_data not populated yet */
T> +#define M_BLOCKED M_PROTO2 /* M_NOTREADY in front of m */
T> +#define M_NOTAVAIL (M_NOTREADY | M_BLOCKED)
T> +
T> void sbappend(struct sockbuf *sb, struct mbuf *m);
T> void sbappend_locked(struct sockbuf *sb, struct mbuf *m);
T> -void sbappendstream(struct sockbuf *sb, struct mbuf *m);
T> -void sbappendstream_locked(struct sockbuf *sb, struct mbuf *m);
T> +void sbappendstream(struct sockbuf *sb, struct mbuf *m, int flags);
T> +void sbappendstream_locked(struct sockbuf *sb, struct mbuf *m, int flags);
T> int sbappendaddr(struct sockbuf *sb, const struct sockaddr *asa,
T> struct mbuf *m0, struct mbuf *control);
T> int sbappendaddr_locked(struct sockbuf *sb, const struct sockaddr *asa,
T> @@ -136,7 +148,6 @@ int sbappendcontrol_locked(struct sockbuf *sb, str
T> struct mbuf *control);
T> void sbappendrecord(struct sockbuf *sb, struct mbuf *m0);
T> void sbappendrecord_locked(struct sockbuf *sb, struct mbuf *m0);
T> -void sbcheck(struct sockbuf *sb);
T> void sbcompress(struct sockbuf *sb, struct mbuf *m, struct mbuf *n);
T> struct mbuf *
T> sbcreatecontrol(caddr_t p, int size, int type, int level);
T> @@ -162,59 +173,54 @@ void sbtoxsockbuf(struct sockbuf *sb, struct xsock
T> int sbwait(struct sockbuf *sb);
T> int sblock(struct sockbuf *sb, int flags);
T> void sbunlock(struct sockbuf *sb);
T> +void sballoc(struct sockbuf *, struct mbuf *);
T> +void sbfree(struct sockbuf *, struct mbuf *);
T> +void sbmtrim(struct sockbuf *, struct mbuf *, int);
T> +int sbready(struct sockbuf *, struct mbuf *, int);
T>
T> +static inline u_int
T> +sbavail(struct sockbuf *sb)
T> +{
T> +
T> +#if 0
T> + SOCKBUF_LOCK_ASSERT(sb);
T> +#endif
T> + return (sb->sb_acc);
T> +}
T> +
T> +static inline u_int
T> +sbused(struct sockbuf *sb)
T> +{
T> +
T> +#if 0
T> + SOCKBUF_LOCK_ASSERT(sb);
T> +#endif
T> + return (sb->sb_ccc);
T> +}
T> +
T> /*
T> * How much space is there in a socket buffer (so->so_snd or so->so_rcv)?
T> * This is problematical if the fields are unsigned, as the space might
T> - * still be negative (cc > hiwat or mbcnt > mbmax). Should detect
T> - * overflow and return 0. Should use "lmin" but it doesn't exist now.
T> + * still be negative (ccc > hiwat or mbcnt > mbmax).
T> */
T> -static __inline
T> -long
T> +static inline long
T> sbspace(struct sockbuf *sb)
T> {
T> - long bleft;
T> - long mleft;
T> + long bleft, mleft;
T>
T> +#if 0
T> + SOCKBUF_LOCK_ASSERT(sb);
T> +#endif
T> +
T> if (sb->sb_flags & SB_STOP)
T> return(0);
T> - bleft = sb->sb_hiwat - sb->sb_cc;
T> +
T> + bleft = sb->sb_hiwat - sb->sb_ccc;
T> mleft = sb->sb_mbmax - sb->sb_mbcnt;
T> - return((bleft < mleft) ? bleft : mleft);
T> -}
T>
T> -/* adjust counters in sb reflecting allocation of m */
T> -#define sballoc(sb, m) { \
T> - (sb)->sb_cc += (m)->m_len; \
T> - if ((m)->m_type != MT_DATA && (m)->m_type != MT_OOBDATA) \
T> - (sb)->sb_ctl += (m)->m_len; \
T> - (sb)->sb_mbcnt += MSIZE; \
T> - (sb)->sb_mcnt += 1; \
T> - if ((m)->m_flags & M_EXT) { \
T> - (sb)->sb_mbcnt += (m)->m_ext.ext_size; \
T> - (sb)->sb_ccnt += 1; \
T> - } \
T> + return ((bleft < mleft) ? bleft : mleft);
T> }
T>
T> -/* adjust counters in sb reflecting freeing of m */
T> -#define sbfree(sb, m) { \
T> - (sb)->sb_cc -= (m)->m_len; \
T> - if ((m)->m_type != MT_DATA && (m)->m_type != MT_OOBDATA) \
T> - (sb)->sb_ctl -= (m)->m_len; \
T> - (sb)->sb_mbcnt -= MSIZE; \
T> - (sb)->sb_mcnt -= 1; \
T> - if ((m)->m_flags & M_EXT) { \
T> - (sb)->sb_mbcnt -= (m)->m_ext.ext_size; \
T> - (sb)->sb_ccnt -= 1; \
T> - } \
T> - if ((sb)->sb_sndptr == (m)) { \
T> - (sb)->sb_sndptr = NULL; \
T> - (sb)->sb_sndptroff = 0; \
T> - } \
T> - if ((sb)->sb_sndptroff != 0) \
T> - (sb)->sb_sndptroff -= (m)->m_len; \
T> -}
T> -
T> #define SB_EMPTY_FIXUP(sb) do { \
T> if ((sb)->sb_mb == NULL) { \
T> (sb)->sb_mbtail = NULL; \
T> @@ -224,13 +230,15 @@ sbspace(struct sockbuf *sb)
T>
T> #ifdef SOCKBUF_DEBUG
T> void sblastrecordchk(struct sockbuf *, const char *, int);
T> +void sblastmbufchk(struct sockbuf *, const char *, int);
T> +void sbcheck(struct sockbuf *, const char *, int);
T> #define SBLASTRECORDCHK(sb) sblastrecordchk((sb), __FILE__, __LINE__)
T> -
T> -void sblastmbufchk(struct sockbuf *, const char *, int);
T> #define SBLASTMBUFCHK(sb) sblastmbufchk((sb), __FILE__, __LINE__)
T> +#define SBCHECK(sb) sbcheck((sb), __FILE__, __LINE__)
T> #else
T> -#define SBLASTRECORDCHK(sb) /* nothing */
T> -#define SBLASTMBUFCHK(sb) /* nothing */
T> +#define SBLASTRECORDCHK(sb) do {} while (0)
T> +#define SBLASTMBUFCHK(sb) do {} while (0)
T> +#define SBCHECK(sb) do {} while (0)
T> #endif /* SOCKBUF_DEBUG */
T>
T> #endif /* _KERNEL */
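
To restate the sb_acc/sb_ccc split in plain terms: sbavail() answers "how much can be
read or sent right now", while sbused() and sbspace() account for everything that
occupies the buffer, ready or not, so pending sendfile data still limits further
writers. A toy userland model of the two counters (invented names, not the kernel
structures):

#include <stdio.h>

struct toy_sb {
	unsigned acc;	/* available: ready characters */
	unsigned ccc;	/* claimed: all characters, ready or not */
	unsigned hiwat;
};

static long
toy_space(const struct toy_sb *sb)
{
	/*
	 * Like the patched sbspace(): charge claimed bytes, so data that
	 * is still waiting on disk I/O already consumes buffer space.
	 */
	return ((long)sb->hiwat - (long)sb->ccc);
}

int
main(void)
{
	struct toy_sb sb = { 0, 0, 65536 };

	sb.ccc += 32768;	/* sendfile claimed space, I/O still pending */
	printf("avail %u, space %ld\n", sb.acc, toy_space(&sb));
	sb.acc += 32768;	/* I/O done: the sbready() equivalent */
	printf("avail %u, space %ld\n", sb.acc, toy_space(&sb));
	return (0);
}
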
T> Index: sys/sys/protosw.h
T> ===================================================================
T> --- sys/sys/protosw.h (.../head) (revision 266804)
T> +++ sys/sys/protosw.h (.../projects/sendfile) (revision 266807)
T> @@ -209,6 +209,7 @@ struct pr_usrreqs {
T> #define PRUS_OOB 0x1
T> #define PRUS_EOF 0x2
T> #define PRUS_MORETOCOME 0x4
T> +#define PRUS_NOTREADY 0x8
T> int (*pru_sense)(struct socket *so, struct stat *sb);
T> int (*pru_shutdown)(struct socket *so);
T> int (*pru_flush)(struct socket *so, int direction);
T> Index: sys/sys/sf_buf.h
T> ===================================================================
T> --- sys/sys/sf_buf.h (.../head) (revision 266804)
T> +++ sys/sys/sf_buf.h (.../projects/sendfile) (revision 266807)
T> @@ -52,7 +52,7 @@ struct sfstat { /* sendfile statistics */
T> #include <machine/sf_buf.h>
T> #include <sys/systm.h>
T> #include <sys/counter.h>
T> -struct mbuf; /* for sf_buf_mext() */
T> +struct mbuf; /* for sf_mext_free() */
T>
T> extern counter_u64_t sfstat[sizeof(struct sfstat) / sizeof(uint64_t)];
T> #define SFSTAT_ADD(name, val) \
T> @@ -61,6 +61,6 @@ extern counter_u64_t sfstat[sizeof(struct sfstat)
T> #define SFSTAT_INC(name) SFSTAT_ADD(name, 1)
T> #endif /* _KERNEL */
T>
T> -int sf_buf_mext(struct mbuf *mb, void *addr, void *args);
T> +int sf_mext_free(struct mbuf *mb, void *addr, void *args);
T>
T> #endif /* !_SYS_SF_BUF_H_ */
T> Index: sys/sys/vnode.h
T> ===================================================================
T> --- sys/sys/vnode.h (.../head) (revision 266804)
T> +++ sys/sys/vnode.h (.../projects/sendfile) (revision 266807)
T> @@ -719,6 +719,7 @@ int vop_stdbmap(struct vop_bmap_args *);
T> int vop_stdfsync(struct vop_fsync_args *);
T> int vop_stdgetwritemount(struct vop_getwritemount_args *);
T> int vop_stdgetpages(struct vop_getpages_args *);
T> +int vop_stdgetpages_async(struct vop_getpages_async_args *);
T> int vop_stdinactive(struct vop_inactive_args *);
T> int vop_stdislocked(struct vop_islocked_args *);
T> int vop_stdkqfilter(struct vop_kqfilter_args *);
T> Index: sys/sys/socketvar.h
T> ===================================================================
T> --- sys/sys/socketvar.h (.../head) (revision 266804)
T> +++ sys/sys/socketvar.h (.../projects/sendfile) (revision 266807)
T> @@ -205,7 +205,7 @@ struct xsocket {
T>
T> /* can we read something from so? */
T> #define soreadabledata(so) \
T> - ((so)->so_rcv.sb_cc >= (so)->so_rcv.sb_lowat || \
T> + (sbavail(&(so)->so_rcv) >= (so)->so_rcv.sb_lowat || \
T> !TAILQ_EMPTY(&(so)->so_comp) || (so)->so_error)
T> #define soreadable(so) \
T> (soreadabledata(so) || ((so)->so_rcv.sb_state & SBS_CANTRCVMORE))
T> Index: sys/sys/mbuf.h
T> ===================================================================
T> --- sys/sys/mbuf.h (.../head) (revision 266804)
T> +++ sys/sys/mbuf.h (.../projects/sendfile) (revision 266807)
T> @@ -922,7 +922,7 @@ struct mbuf *m_copypacket(struct mbuf *, int);
T> void m_copy_pkthdr(struct mbuf *, struct mbuf *);
T> struct mbuf *m_copyup(struct mbuf *, int, int);
T> struct mbuf *m_defrag(struct mbuf *, int);
T> -void m_demote(struct mbuf *, int);
T> +void m_demote(struct mbuf *, int, int);
T> struct mbuf *m_devget(char *, int, int, struct ifnet *,
T> void (*)(char *, caddr_t, u_int));
T> struct mbuf *m_dup(struct mbuf *, int);
T> Index: sys/vm/vnode_pager.h
T> ===================================================================
T> --- sys/vm/vnode_pager.h (.../head) (revision 266804)
T> +++ sys/vm/vnode_pager.h (.../projects/sendfile) (revision 266807)
T> @@ -41,7 +41,7 @@
T> #ifdef _KERNEL
T>
T> int vnode_pager_generic_getpages(struct vnode *vp, vm_page_t *m,
T> - int count, int reqpage);
T> + int count, int reqpage, void (*iodone)(void *), void *arg);
T> int vnode_pager_generic_putpages(struct vnode *vp, vm_page_t *m,
T> int count, boolean_t sync,
T> int *rtvals);
T> Index: sys/vm/vm_pager.h
T> ===================================================================
T> --- sys/vm/vm_pager.h (.../head) (revision 266804)
T> +++ sys/vm/vm_pager.h (.../projects/sendfile) (revision 266807)
T> @@ -51,18 +51,21 @@ typedef vm_object_t pgo_alloc_t(void *, vm_ooffset
T> struct ucred *);
T> typedef void pgo_dealloc_t(vm_object_t);
T> typedef int pgo_getpages_t(vm_object_t, vm_page_t *, int, int);
T> +typedef int pgo_getpages_async_t(vm_object_t, vm_page_t *, int, int,
T> + void(*)(void *), void *);
T> typedef void pgo_putpages_t(vm_object_t, vm_page_t *, int, int, int *);
T> typedef boolean_t pgo_haspage_t(vm_object_t, vm_pindex_t, int *, int *);
T> typedef void pgo_pageunswapped_t(vm_page_t);
T>
T> struct pagerops {
T> - pgo_init_t *pgo_init; /* Initialize pager. */
T> - pgo_alloc_t *pgo_alloc; /* Allocate pager. */
T> - pgo_dealloc_t *pgo_dealloc; /* Disassociate. */
T> - pgo_getpages_t *pgo_getpages; /* Get (read) page. */
T> - pgo_putpages_t *pgo_putpages; /* Put (write) page. */
T> - pgo_haspage_t *pgo_haspage; /* Does pager have page? */
T> - pgo_pageunswapped_t *pgo_pageunswapped;
T> + pgo_init_t *pgo_init; /* Initialize pager. */
T> + pgo_alloc_t *pgo_alloc; /* Allocate pager. */
T> + pgo_dealloc_t *pgo_dealloc; /* Disassociate. */
T> + pgo_getpages_t *pgo_getpages; /* Get (read) page. */
T> + pgo_getpages_async_t *pgo_getpages_async; /* Get page asyncly. */
T> + pgo_putpages_t *pgo_putpages; /* Put (write) page. */
T> + pgo_haspage_t *pgo_haspage; /* Query page. */
T> + pgo_pageunswapped_t *pgo_pageunswapped;
T> };
T>
T> extern struct pagerops defaultpagerops;
T> @@ -103,6 +106,8 @@ vm_object_t vm_pager_allocate(objtype_t, void *, v
T> void vm_pager_bufferinit(void);
T> void vm_pager_deallocate(vm_object_t);
T> static __inline int vm_pager_get_pages(vm_object_t, vm_page_t *, int, int);
T> +static __inline int vm_pager_get_pages_async(vm_object_t, vm_page_t *, int,
T> + int, void(*)(void *), void *);
T> static __inline boolean_t vm_pager_has_page(vm_object_t, vm_pindex_t, int *, int *);
T> void vm_pager_init(void);
T> vm_object_t vm_pager_object_lookup(struct pagerlst *, void *);
T> @@ -131,6 +136,27 @@ vm_pager_get_pages(
T> return (r);
T> }
T>
T> +static __inline int
T> +vm_pager_get_pages_async(vm_object_t object, vm_page_t *m, int count,
T> + int reqpage, void (*iodone)(void *), void *arg)
T> +{
T> + int r;
T> +
T> + VM_OBJECT_ASSERT_WLOCKED(object);
T> +
T> + if (*pagertab[object->type]->pgo_getpages_async == NULL) {
T> + /* Emulate async operation. */
T> + r = vm_pager_get_pages(object, m, count, reqpage);
T> + VM_OBJECT_WUNLOCK(object);
T> + (iodone)(arg);
T> + VM_OBJECT_WLOCK(object);
T> + } else
T> + r = (*pagertab[object->type]->pgo_getpages_async)(object, m,
T> + count, reqpage, iodone, arg);
T> +
T> + return (r);
T> +}
T> +
T> static __inline void
T> vm_pager_put_pages(
T> vm_object_t object,
T> Index: sys/vm/vm_page.c
T> ===================================================================
T> --- sys/vm/vm_page.c (.../head) (revision 266804)
T> +++ sys/vm/vm_page.c (.../projects/sendfile) (revision 266807)
T> @@ -2689,6 +2689,8 @@ retrylookup:
T> sleep = (allocflags & VM_ALLOC_IGN_SBUSY) != 0 ?
T> vm_page_xbusied(m) : vm_page_busied(m);
T> if (sleep) {
T> + if (allocflags & VM_ALLOC_NOWAIT)
T> + return (NULL);
T> /*
T> * Reference the page before unlocking and
T> * sleeping so that the page daemon is less
T> @@ -2716,6 +2718,8 @@ retrylookup:
T> }
T> m = vm_page_alloc(object, pindex, allocflags & ~VM_ALLOC_IGN_SBUSY);
T> if (m == NULL) {
T> + if (allocflags & VM_ALLOC_NOWAIT)
T> + return (NULL);
T> VM_OBJECT_WUNLOCK(object);
T> VM_WAIT;
T> VM_OBJECT_WLOCK(object);
T> Index: sys/vm/vm_page.h
T> ===================================================================
T> --- sys/vm/vm_page.h (.../head) (revision 266804)
T> +++ sys/vm/vm_page.h (.../projects/sendfile) (revision 266807)
T> @@ -390,6 +390,7 @@ vm_page_t PHYS_TO_VM_PAGE(vm_paddr_t pa);
T> #define VM_ALLOC_IGN_SBUSY 0x1000 /* vm_page_grab() only */
T> #define VM_ALLOC_NODUMP 0x2000 /* don't include in dump */
T> #define VM_ALLOC_SBUSY 0x4000 /* Shared busy the page */
T> +#define VM_ALLOC_NOWAIT 0x8000 /* Return NULL instead of sleeping */
T>
T> #define VM_ALLOC_COUNT_SHIFT 16
T> #define VM_ALLOC_COUNT(count) ((count) << VM_ALLOC_COUNT_SHIFT)
T> Index: sys/vm/vnode_pager.c
T> ===================================================================
T> --- sys/vm/vnode_pager.c (.../head) (revision 266804)
T> +++ sys/vm/vnode_pager.c (.../projects/sendfile) (revision 266807)
T> @@ -83,6 +83,8 @@ static int vnode_pager_input_smlfs(vm_object_t obj
T> static int vnode_pager_input_old(vm_object_t object, vm_page_t m);
T> static void vnode_pager_dealloc(vm_object_t);
T> static int vnode_pager_getpages(vm_object_t, vm_page_t *, int, int);
T> +static int vnode_pager_getpages_async(vm_object_t, vm_page_t *, int, int,
T> + void(*)(void *), void *);
T> static void vnode_pager_putpages(vm_object_t, vm_page_t *, int, boolean_t, int *);
T> static boolean_t vnode_pager_haspage(vm_object_t, vm_pindex_t, int *, int *);
T> static vm_object_t vnode_pager_alloc(void *, vm_ooffset_t, vm_prot_t,
T> @@ -92,6 +94,7 @@ struct pagerops vnodepagerops = {
T> .pgo_alloc = vnode_pager_alloc,
T> .pgo_dealloc = vnode_pager_dealloc,
T> .pgo_getpages = vnode_pager_getpages,
T> + .pgo_getpages_async = vnode_pager_getpages_async,
T> .pgo_putpages = vnode_pager_putpages,
T> .pgo_haspage = vnode_pager_haspage,
T> };
T> @@ -664,6 +667,40 @@ vnode_pager_getpages(vm_object_t object, vm_page_t
T> return rtval;
T> }
T>
T> +static int
T> +vnode_pager_getpages_async(vm_object_t object, vm_page_t *m, int count,
T> + int reqpage, void (*iodone)(void *), void *arg)
T> +{
T> + int rtval;
T> + struct vnode *vp;
T> + int bytes = count * PAGE_SIZE;
T> +
T> + vp = object->handle;
T> + VM_OBJECT_WUNLOCK(object);
T> + rtval = VOP_GETPAGES_ASYNC(vp, m, bytes, reqpage, 0, iodone, arg);
T> + KASSERT(rtval != EOPNOTSUPP,
T> + ("vnode_pager: FS getpages_async not implemented\n"));
T> + VM_OBJECT_WLOCK(object);
T> + return rtval;
T> +}
T> +
T> +struct getpages_softc {
T> + vm_page_t *m;
T> + struct buf *bp;
T> + vm_object_t object;
T> + vm_offset_t kva;
T> + off_t foff;
T> + int size;
T> + int count;
T> + int unmapped;
T> + int reqpage;
T> + void (*iodone)(void *);
T> + void *arg;
T> +};
T> +
T> +int vnode_pager_generic_getpages_done(struct getpages_softc *);
T> +void vnode_pager_generic_getpages_done_async(struct buf *);
T> +
T> /*
T> * This is now called from local media FS's to operate against their
T> * own vnodes if they fail to implement VOP_GETPAGES.
T> @@ -670,11 +707,11 @@ vnode_pager_getpages(vm_object_t object, vm_page_t
T> */
T> int
T> vnode_pager_generic_getpages(struct vnode *vp, vm_page_t *m, int bytecount,
T> - int reqpage)
T> + int reqpage, void (*iodone)(void *), void *arg)
T> {
T> vm_object_t object;
T> vm_offset_t kva;
T> - off_t foff, tfoff, nextoff;
T> + off_t foff;
T> int i, j, size, bsize, first;
T> daddr_t firstaddr, reqblock;
T> struct bufobj *bo;
T> @@ -684,6 +721,7 @@ vnode_pager_generic_getpages(struct vnode *vp, vm_
T> struct mount *mp;
T> int count;
T> int error;
T> + int unmapped;
T>
T> object = vp->v_object;
T> count = bytecount / PAGE_SIZE;
T> @@ -891,8 +929,8 @@ vnode_pager_generic_getpages(struct vnode *vp, vm_
T> * requires mapped buffers.
T> */
T> mp = vp->v_mount;
T> - if (mp != NULL && (mp->mnt_kern_flag & MNTK_UNMAPPED_BUFS) != 0 &&
T> - unmapped_buf_allowed) {
T> + unmapped = (mp != NULL && (mp->mnt_kern_flag & MNTK_UNMAPPED_BUFS));
T> + if (unmapped && unmapped_buf_allowed) {
T> bp->b_data = unmapped_buf;
T> bp->b_kvabase = unmapped_buf;
T> bp->b_offset = 0;
T> @@ -905,7 +943,6 @@ vnode_pager_generic_getpages(struct vnode *vp, vm_
T>
T> /* build a minimal buffer header */
T> bp->b_iocmd = BIO_READ;
T> - bp->b_iodone = bdone;
T> KASSERT(bp->b_rcred == NOCRED, ("leaking read ucred"));
T> KASSERT(bp->b_wcred == NOCRED, ("leaking write ucred"));
T> bp->b_rcred = crhold(curthread->td_ucred);
T> @@ -923,10 +960,88 @@ vnode_pager_generic_getpages(struct vnode *vp, vm_
T>
T> /* do the input */
T> bp->b_iooffset = dbtob(bp->b_blkno);
T> - bstrategy(bp);
T>
T> - bwait(bp, PVM, "vnread");
T> + if (iodone) { /* async */
T> + struct getpages_softc *sc;
T>
T> + sc = malloc(sizeof(*sc), M_TEMP, M_WAITOK);
T> +
T> + sc->m = m;
T> + sc->bp = bp;
T> + sc->object = object;
T> + sc->foff = foff;
T> + sc->size = size;
T> + sc->count = count;
T> + sc->unmapped = unmapped;
T> + sc->reqpage = reqpage;
T> + sc->kva = kva;
T> +
T> + sc->iodone = iodone;
T> + sc->arg = arg;
T> +
T> + bp->b_iodone = vnode_pager_generic_getpages_done_async;
T> + bp->b_caller1 = sc;
T> + BUF_KERNPROC(bp);
T> + bstrategy(bp);
T> + /* Good bye! */
T> + } else {
T> + struct getpages_softc sc;
T> +
T> + sc.m = m;
T> + sc.bp = bp;
T> + sc.object = object;
T> + sc.foff = foff;
T> + sc.size = size;
T> + sc.count = count;
T> + sc.unmapped = unmapped;
T> + sc.reqpage = reqpage;
T> + sc.kva = kva;
T> +
T> + bp->b_iodone = bdone;
T> + bstrategy(bp);
T> + bwait(bp, PVM, "vnread");
T> + error = vnode_pager_generic_getpages_done(&sc);
T> + }
T> +
T> + return (error ? VM_PAGER_ERROR : VM_PAGER_OK);
T> +}
T> +
T> +void
T> +vnode_pager_generic_getpages_done_async(struct buf *bp)
T> +{
T> + struct getpages_softc *sc = bp->b_caller1;
T> + int error;
T> +
T> + error = vnode_pager_generic_getpages_done(sc);
T> +
T> + vm_page_xunbusy(sc->m[sc->reqpage]);
T> +
T> + sc->iodone(sc->arg);
T> +
T> + free(sc, M_TEMP);
T> +}
T> +
T> +int
T> +vnode_pager_generic_getpages_done(struct getpages_softc *sc)
T> +{
T> + vm_object_t object;
T> + vm_offset_t kva;
T> + vm_page_t *m;
T> + struct buf *bp;
T> + off_t foff, tfoff, nextoff;
T> + int i, size, count, unmapped, reqpage;
T> + int error = 0;
T> +
T> + m = sc->m;
T> + bp = sc->bp;
T> + object = sc->object;
T> + foff = sc->foff;
T> + size = sc->size;
T> + count = sc->count;
T> + unmapped = sc->unmapped;
T> + reqpage = sc->reqpage;
T> + kva = sc->kva;
T> +
T> if ((bp->b_ioflags & BIO_ERROR) != 0)
T> error = EIO;
T>
T> @@ -939,7 +1054,7 @@ vnode_pager_generic_getpages(struct vnode *vp, vm_
T> }
T> if ((bp->b_flags & B_UNMAPPED) == 0)
T> pmap_qremove(kva, count);
T> - if (mp != NULL && (mp->mnt_kern_flag & MNTK_UNMAPPED_BUFS) != 0) {
T> + if (unmapped) {
T> bp->b_data = (caddr_t)kva;
T> bp->b_kvabase = (caddr_t)kva;
T> bp->b_flags &= ~B_UNMAPPED;
T> @@ -995,7 +1110,8 @@ vnode_pager_generic_getpages(struct vnode *vp, vm_
T> if (error) {
T> printf("vnode_pager_getpages: I/O read error\n");
T> }
T> - return (error ? VM_PAGER_ERROR : VM_PAGER_OK);
T> +
T> + return (error);
T> }
T>
T> /*
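
The vnode_pager restructuring above boils down to this: the old post-bwait() tail of
vnode_pager_generic_getpages() now lives in a separate *_done() routine, and the async
path runs that same routine from b_iodone and then calls the caller's iodone hook
(sendfile's sf_io_done). A rough standalone model of that shape, with all names
invented:

#include <stdio.h>
#include <stdlib.h>

struct req {
	int	error;
	void	(*iodone)(void *);	/* caller's completion hook */
	void	*arg;
};

/* Shared post-I/O work, used by both the sync and the async path. */
static int
req_done(struct req *r)
{
	/* ...unmap buffers, update page state, etc. in the real code... */
	return (r->error);
}

/* Async path: runs from "I/O finished" context, then notifies the caller. */
static void
req_done_async(struct req *r)
{
	(void)req_done(r);
	r->iodone(r->arg);
	free(r);
}

static void
my_iodone(void *arg)
{
	printf("completed: %s\n", (const char *)arg);
}

int
main(void)
{
	struct req sync_req = { 0, NULL, NULL };
	struct req *async_req;

	/* Sync path: issue I/O, wait, then finish inline. */
	(void)req_done(&sync_req);

	/*
	 * Async path: the request outlives the caller's stack frame, so it
	 * is heap-allocated, like getpages_softc above.
	 */
	async_req = malloc(sizeof(*async_req));
	if (async_req == NULL)
		return (1);
	async_req->error = 0;
	async_req->iodone = my_iodone;
	async_req->arg = "page-in request";
	req_done_async(async_req);	/* in the kernel, b_iodone runs this */
	return (0);
}
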
T> Index: sys/rpc/clnt_vc.c
T> ===================================================================
T> --- sys/rpc/clnt_vc.c (.../head) (revision 266804)
T> +++ sys/rpc/clnt_vc.c (.../projects/sendfile) (revision 266807)
T> @@ -860,7 +860,7 @@ clnt_vc_soupcall(struct socket *so, void *arg, int
T> * error condition
T> */
T> do_read = FALSE;
T> - if (so->so_rcv.sb_cc >= sizeof(uint32_t)
T> + if (sbavail(&so->so_rcv) >= sizeof(uint32_t)
T> || (so->so_rcv.sb_state & SBS_CANTRCVMORE)
T> || so->so_error)
T> do_read = TRUE;
T> @@ -913,7 +913,7 @@ clnt_vc_soupcall(struct socket *so, void *arg, int
T> * buffered.
T> */
T> do_read = FALSE;
T> - if (so->so_rcv.sb_cc >= ct->ct_record_resid
T> + if (sbavail(&so->so_rcv) >= ct->ct_record_resid
T> || (so->so_rcv.sb_state & SBS_CANTRCVMORE)
T> || so->so_error)
T> do_read = TRUE;
T> Index: sys/rpc/svc_vc.c
T> ===================================================================
T> --- sys/rpc/svc_vc.c (.../head) (revision 266804)
T> +++ sys/rpc/svc_vc.c (.../projects/sendfile) (revision 266807)
T> @@ -546,7 +546,7 @@ svc_vc_ack(SVCXPRT *xprt, uint32_t *ack)
T> {
T>
T> *ack = atomic_load_acq_32(&xprt->xp_snt_cnt);
T> - *ack -= xprt->xp_socket->so_snd.sb_cc;
T> + *ack -= sbused(&xprt->xp_socket->so_snd);
T> return (TRUE);
T> }
T>
T> Index: sys/ufs/ffs/ffs_vnops.c
T> ===================================================================
T> --- sys/ufs/ffs/ffs_vnops.c (.../head) (revision 266804)
T> +++ sys/ufs/ffs/ffs_vnops.c (.../projects/sendfile) (revision 266807)
T> @@ -105,6 +105,7 @@ extern int ffs_rawread(struct vnode *vp, struct ui
T> static vop_fsync_t ffs_fsync;
T> static vop_lock1_t ffs_lock;
T> static vop_getpages_t ffs_getpages;
T> +static vop_getpages_async_t ffs_getpages_async;
T> static vop_read_t ffs_read;
T> static vop_write_t ffs_write;
T> static int ffs_extread(struct vnode *vp, struct uio *uio, int ioflag);
T> @@ -125,6 +126,7 @@ struct vop_vector ffs_vnodeops1 = {
T> .vop_default = &ufs_vnodeops,
T> .vop_fsync = ffs_fsync,
T> .vop_getpages = ffs_getpages,
T> + .vop_getpages_async = ffs_getpages_async,
T> .vop_lock1 = ffs_lock,
T> .vop_read = ffs_read,
T> .vop_reallocblks = ffs_reallocblks,
T> @@ -847,18 +849,16 @@ ffs_write(ap)
T> }
T>
T> /*
T> - * get page routine
T> + * Get page routines.
T> */
T> static int
T> -ffs_getpages(ap)
T> - struct vop_getpages_args *ap;
T> +ffs_getpages_checkvalid(vm_page_t *m, int count, int reqpage)
T> {
T> - int i;
T> vm_page_t mreq;
T> int pcount;
T>
T> - pcount = round_page(ap->a_count) / PAGE_SIZE;
T> - mreq = ap->a_m[ap->a_reqpage];
T> + pcount = round_page(count) / PAGE_SIZE;
T> + mreq = m[reqpage];
T>
T> /*
T> * if ANY DEV_BSIZE blocks are valid on a large filesystem block,
T> @@ -870,24 +870,48 @@ static int
T> if (mreq->valid) {
T> if (mreq->valid != VM_PAGE_BITS_ALL)
T> vm_page_zero_invalid(mreq, TRUE);
T> - for (i = 0; i < pcount; i++) {
T> - if (i != ap->a_reqpage) {
T> - vm_page_lock(ap->a_m[i]);
T> - vm_page_free(ap->a_m[i]);
T> - vm_page_unlock(ap->a_m[i]);
T> + for (int i = 0; i < pcount; i++) {
T> + if (i != reqpage) {
T> + vm_page_lock(m[i]);
T> + vm_page_free(m[i]);
T> + vm_page_unlock(m[i]);
T> }
T> }
T> VM_OBJECT_WUNLOCK(mreq->object);
T> - return VM_PAGER_OK;
T> + return (VM_PAGER_OK);
T> }
T> VM_OBJECT_WUNLOCK(mreq->object);
T>
T> - return vnode_pager_generic_getpages(ap->a_vp, ap->a_m,
T> - ap->a_count,
T> - ap->a_reqpage);
T> + return (-1);
T> }
T>
T> +static int
T> +ffs_getpages(struct vop_getpages_args *ap)
T> +{
T> + int rv;
T>
T> + rv = ffs_getpages_checkvalid(ap->a_m, ap->a_count, ap->a_reqpage);
T> + if (rv == VM_PAGER_OK)
T> + return (rv);
T> +
T> + return (vnode_pager_generic_getpages(ap->a_vp, ap->a_m, ap->a_count,
T> + ap->a_reqpage, NULL, NULL));
T> +}
T> +
T> +static int
T> +ffs_getpages_async(struct vop_getpages_async_args *ap)
T> +{
T> + int rv;
T> +
T> + rv = ffs_getpages_checkvalid(ap->a_m, ap->a_count, ap->a_reqpage);
T> + if (rv == VM_PAGER_OK) {
T> + (ap->a_vop_getpages_iodone)(ap->a_arg);
T> + return (rv);
T> + }
T> + return (vnode_pager_generic_getpages(ap->a_vp, ap->a_m, ap->a_count,
T> + ap->a_reqpage, ap->a_vop_getpages_iodone, ap->a_arg));
T> +}
T> +
T> /*
T> * Extended attribute area reading.
T> */
T> Index: sys/tools/vnode_if.awk
T> ===================================================================
T> --- sys/tools/vnode_if.awk (.../head) (revision 266804)
T> +++ sys/tools/vnode_if.awk (.../projects/sendfile) (revision 266807)
T> @@ -254,16 +254,26 @@ while ((getline < srcfile) > 0) {
T> if (sub(/;$/, "") < 1)
T> die("Missing end-of-line ; in \"%s\".", $0);
T>
T> - # pick off variable name
T> - if ((argp = match($0, /[A-Za-z0-9_]+$/)) < 1)
T> - die("Missing var name \"a_foo\" in \"%s\".", $0);
T> - args[numargs] = substr($0, argp);
T> - $0 = substr($0, 1, argp - 1);
T> -
T> - # what is left must be type
T> - # remove trailing space (if any)
T> - sub(/ $/, "");
T> - types[numargs] = $0;
T> + # pick off argument name
T> + if ((argp = match($0, /[A-Za-z0-9_]+$/)) > 0) {
T> + args[numargs] = substr($0, argp);
T> + $0 = substr($0, 1, argp - 1);
T> + sub(/ $/, "");
T> + delete fargs[numargs];
T> + types[numargs] = $0;
T> + } else { # try to parse a function pointer argument
T> + if ((argp = match($0,
T> + /\(\*[A-Za-z0-9_]+\)\([A-Za-z0-9_*, ]+\)$/)) < 1)
T> + die("Missing var name \"a_foo\" in \"%s\".",
T> + $0);
T> + args[numargs] = substr($0, argp + 2);
T> + sub(/\).+/, "", args[numargs]);
T> + fargs[numargs] = substr($0, argp);
T> + sub(/^\([^)]+\)/, "", fargs[numargs]);
T> + $0 = substr($0, 1, argp - 1);
T> + sub(/ $/, "");
T> + types[numargs] = $0;
T> + }
T> }
T> if (numargs > 4)
T> ctrargs = 4;
T> @@ -286,8 +296,13 @@ while ((getline < srcfile) > 0) {
T> if (hfile) {
T> # Print out the vop_F_args structure.
T> printh("struct "name"_args {\n\tstruct vop_generic_args a_gen;");
T> - for (i = 0; i < numargs; ++i)
T> - printh("\t" t_spc(types[i]) "a_" args[i] ";");
T> + for (i = 0; i < numargs; ++i) {
T> + if (fargs[i]) {
T> + printh("\t" t_spc(types[i]) "(*a_" args[i] \
T> + ")" fargs[i] ";");
T> + } else
T> + printh("\t" t_spc(types[i]) "a_" args[i] ";");
T> + }
T> printh("};");
T> printh("");
T>
T> @@ -301,8 +316,14 @@ while ((getline < srcfile) > 0) {
T> printh("");
T> printh("static __inline int " uname "(");
T> for (i = 0; i < numargs; ++i) {
T> - printh("\t" t_spc(types[i]) args[i] \
T> - (i < numargs - 1 ? "," : ")"));
T> + if (fargs[i]) {
T> + printh("\t" t_spc(types[i]) "(*" args[i] \
T> + ")" fargs[i] \
T> + (i < numargs - 1 ? "," : ")"));
T> + } else {
T> + printh("\t" t_spc(types[i]) args[i] \
T> + (i < numargs - 1 ? "," : ")"));
T> + }
T> }
T> printh("{");
T> printh("\tstruct " name "_args a;");
T> Index: sys/netinet/tcp_reass.c
T> ===================================================================
T> --- sys/netinet/tcp_reass.c (.../head) (revision 266804)
T> +++ sys/netinet/tcp_reass.c (.../projects/sendfile) (revision 266807)
T> @@ -248,7 +248,7 @@ present:
T> m_freem(mq);
T> else {
T> mq->m_nextpkt = NULL;
T> - sbappendstream_locked(&so->so_rcv, mq);
T> + sbappendstream_locked(&so->so_rcv, mq, 0);
T> wakeup = 1;
T> }
T> }
T> Index: sys/netinet/accf_http.c
T> ===================================================================
T> --- sys/netinet/accf_http.c (.../head) (revision 266804)
T> +++ sys/netinet/accf_http.c (.../projects/sendfile) (revision 266807)
T> @@ -92,7 +92,7 @@ sbfull(struct sockbuf *sb)
T> "mbcnt(%ld) >= mbmax(%ld): %d",
T> sb->sb_cc, sb->sb_hiwat, sb->sb_cc >= sb->sb_hiwat,
T> sb->sb_mbcnt, sb->sb_mbmax, sb->sb_mbcnt >= sb->sb_mbmax);
T> - return (sb->sb_cc >= sb->sb_hiwat || sb->sb_mbcnt >= sb->sb_mbmax);
T> + return (sbused(sb) >= sb->sb_hiwat || sb->sb_mbcnt >= sb->sb_mbmax);
T> }
T>
T> /*
T> @@ -162,13 +162,14 @@ static int
T> sohashttpget(struct socket *so, void *arg, int waitflag)
T> {
T>
T> - if ((so->so_rcv.sb_state & SBS_CANTRCVMORE) == 0 && !sbfull(&so->so_rcv)) {
T> + if ((so->so_rcv.sb_state & SBS_CANTRCVMORE) == 0 &&
T> + !sbfull(&so->so_rcv)) {
T> struct mbuf *m;
T> char *cmp;
T> int cmplen, cc;
T>
T> m = so->so_rcv.sb_mb;
T> - cc = so->so_rcv.sb_cc - 1;
T> + cc = sbavail(&so->so_rcv) - 1;
T> if (cc < 1)
T> return (SU_OK);
T> switch (*mtod(m, char *)) {
T> @@ -215,7 +216,7 @@ soparsehttpvers(struct socket *so, void *arg, int
T> goto fallout;
T>
T> m = so->so_rcv.sb_mb;
T> - cc = so->so_rcv.sb_cc;
T> + cc = sbavail(&so->so_rcv);
T> inspaces = spaces = 0;
T> for (m = so->so_rcv.sb_mb; m; m = n) {
T> n = m->m_nextpkt;
T> @@ -304,7 +305,7 @@ soishttpconnected(struct socket *so, void *arg, in
T> * have NCHRS left
T> */
T> copied = 0;
T> - ccleft = so->so_rcv.sb_cc;
T> + ccleft = sbavail(&so->so_rcv);
T> if (ccleft < NCHRS)
T> goto readmore;
T> a = b = c = '\0';
T> Index: sys/netinet/sctp_os_bsd.h
T> ===================================================================
T> --- sys/netinet/sctp_os_bsd.h (.../head) (revision 266804)
T> +++ sys/netinet/sctp_os_bsd.h (.../projects/sendfile) (revision 266807)
T> @@ -405,7 +405,7 @@ typedef struct callout sctp_os_timer_t;
T> #define SCTP_SOWAKEUP(so) wakeup(&(so)->so_timeo)
T> /* clear the socket buffer state */
T> #define SCTP_SB_CLEAR(sb) \
T> - (sb).sb_cc = 0; \
T> + (sb).sb_ccc = 0; \
T> (sb).sb_mb = NULL; \
T> (sb).sb_mbcnt = 0;
T>
T> Index: sys/netinet/tcp_output.c
T> ===================================================================
T> --- sys/netinet/tcp_output.c (.../head) (revision 266804)
T> +++ sys/netinet/tcp_output.c (.../projects/sendfile) (revision 266807)
T> @@ -322,7 +322,7 @@ after_sack_rexmit:
T> * to send then the probe will be the FIN
T> * itself.
T> */
T> - if (off < so->so_snd.sb_cc)
T> + if (off < sbavail(&so->so_snd))
T> flags &= ~TH_FIN;
T> sendwin = 1;
T> } else {
T> @@ -348,7 +348,8 @@ after_sack_rexmit:
T> */
T> if (sack_rxmit == 0) {
T> if (sack_bytes_rxmt == 0)
T> - len = ((long)ulmin(so->so_snd.sb_cc, sendwin) - off);
T> + len = ((long)ulmin(sbavail(&so->so_snd), sendwin) -
T> + off);
T> else {
T> long cwin;
T>
T> @@ -357,8 +358,8 @@ after_sack_rexmit:
T> * sending new data, having retransmitted all the
T> * data possible in the scoreboard.
T> */
T> - len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd)
T> - - off);
T> + len = ((long)ulmin(sbavail(&so->so_snd), tp->snd_wnd) -
T> + off);
T> /*
T> * Don't remove this (len > 0) check !
T> * We explicitly check for len > 0 here (although it
T> @@ -457,12 +458,15 @@ after_sack_rexmit:
T> * TODO: Shrink send buffer during idle periods together
T> * with congestion window. Requires another timer. Has to
T> * wait for upcoming tcp timer rewrite.
T> + *
T> + * XXXGL: should there be used sbused() or sbavail()?
T> */
T> if (V_tcp_do_autosndbuf && so->so_snd.sb_flags & SB_AUTOSIZE) {
T> if ((tp->snd_wnd / 4 * 5) >= so->so_snd.sb_hiwat &&
T> - so->so_snd.sb_cc >= (so->so_snd.sb_hiwat / 8 * 7) &&
T> - so->so_snd.sb_cc < V_tcp_autosndbuf_max &&
T> - sendwin >= (so->so_snd.sb_cc - (tp->snd_nxt - tp->snd_una))) {
T> + sbused(&so->so_snd) >= (so->so_snd.sb_hiwat / 8 * 7) &&
T> + sbused(&so->so_snd) < V_tcp_autosndbuf_max &&
T> + sendwin >= (sbused(&so->so_snd) -
T> + (tp->snd_nxt - tp->snd_una))) {
T> if (!sbreserve_locked(&so->so_snd,
T> min(so->so_snd.sb_hiwat + V_tcp_autosndbuf_inc,
T> V_tcp_autosndbuf_max), so, curthread))
T> @@ -499,10 +503,11 @@ after_sack_rexmit:
T> tso = 1;
T>
T> if (sack_rxmit) {
T> - if (SEQ_LT(p->rxmit + len, tp->snd_una + so->so_snd.sb_cc))
T> + if (SEQ_LT(p->rxmit + len, tp->snd_una + sbavail(&so->so_snd)))
T> flags &= ~TH_FIN;
T> } else {
T> - if (SEQ_LT(tp->snd_nxt + len, tp->snd_una + so->so_snd.sb_cc))
T> + if (SEQ_LT(tp->snd_nxt + len, tp->snd_una +
T> + sbavail(&so->so_snd)))
T> flags &= ~TH_FIN;
T> }
T>
T> @@ -532,7 +537,7 @@ after_sack_rexmit:
T> */
T> if (!(tp->t_flags & TF_MORETOCOME) && /* normal case */
T> (idle || (tp->t_flags & TF_NODELAY)) &&
T> - len + off >= so->so_snd.sb_cc &&
T> + len + off >= sbavail(&so->so_snd) &&
T> (tp->t_flags & TF_NOPUSH) == 0) {
T> goto send;
T> }
T> @@ -660,7 +665,7 @@ dontupdate:
T> * if window is nonzero, transmit what we can,
T> * otherwise force out a byte.
T> */
T> - if (so->so_snd.sb_cc && !tcp_timer_active(tp, TT_REXMT) &&
T> + if (sbavail(&so->so_snd) && !tcp_timer_active(tp, TT_REXMT) &&
T> !tcp_timer_active(tp, TT_PERSIST)) {
T> tp->t_rxtshift = 0;
T> tcp_setpersist(tp);
T> @@ -786,7 +791,7 @@ send:
T> * fractional unless the send sockbuf can
T> * be emptied.
T> */
T> - if (sendalot && off + len < so->so_snd.sb_cc) {
T> + if (sendalot && off + len < sbavail(&so->so_snd)) {
T> len -= len % (tp->t_maxopd - optlen);
T> sendalot = 1;
T> }
T> @@ -889,7 +894,7 @@ send:
T> * give data to the user when a buffer fills or
T> * a PUSH comes in.)
T> */
T> - if (off + len == so->so_snd.sb_cc)
T> + if (off + len == sbavail(&so->so_snd))
T> flags |= TH_PUSH;
T> SOCKBUF_UNLOCK(&so->so_snd);
T> } else {
T> Index: sys/netinet/siftr.c
T> ===================================================================
T> --- sys/netinet/siftr.c (.../head) (revision 266804)
T> +++ sys/netinet/siftr.c (.../projects/sendfile) (revision 266807)
T> @@ -781,9 +781,9 @@ siftr_siftdata(struct pkt_node *pn, struct inpcb *
T> pn->flags = tp->t_flags;
T> pn->rxt_length = tp->t_rxtcur;
T> pn->snd_buf_hiwater = inp->inp_socket->so_snd.sb_hiwat;
T> - pn->snd_buf_cc = inp->inp_socket->so_snd.sb_cc;
T> + pn->snd_buf_cc = sbused(&inp->inp_socket->so_snd);
T> pn->rcv_buf_hiwater = inp->inp_socket->so_rcv.sb_hiwat;
T> - pn->rcv_buf_cc = inp->inp_socket->so_rcv.sb_cc;
T> + pn->rcv_buf_cc = sbused(&inp->inp_socket->so_rcv);
T> pn->sent_inflight_bytes = tp->snd_max - tp->snd_una;
T> pn->t_segqlen = tp->t_segqlen;
T>
T> Index: sys/netinet/sctp_indata.c
T> ===================================================================
T> --- sys/netinet/sctp_indata.c (.../head) (revision 266804)
T> +++ sys/netinet/sctp_indata.c (.../projects/sendfile) (revision 266807)
T> @@ -70,7 +70,7 @@ sctp_calc_rwnd(struct sctp_tcb *stcb, struct sctp_
T>
T> /*
T> * This is really set wrong with respect to a 1-2-m socket. Since
T> - * the sb_cc is the count that everyone as put up. When we re-write
T> + * the sb_ccc is the count that everyone as put up. When we re-write
T> * sctp_soreceive then we will fix this so that ONLY this
T> * associations data is taken into account.
T> */
T> @@ -77,7 +77,7 @@ sctp_calc_rwnd(struct sctp_tcb *stcb, struct sctp_
T> if (stcb->sctp_socket == NULL)
T> return (calc);
T>
T> - if (stcb->asoc.sb_cc == 0 &&
T> + if (stcb->asoc.sb_ccc == 0 &&
T> asoc->size_on_reasm_queue == 0 &&
T> asoc->size_on_all_streams == 0) {
T> /* Full rwnd granted */
T> @@ -1358,7 +1358,7 @@ sctp_process_a_data_chunk(struct sctp_tcb *stcb, s
T> * When we have NO room in the rwnd we check to make sure
T> * the reader is doing its job...
T> */
T> - if (stcb->sctp_socket->so_rcv.sb_cc) {
T> + if (stcb->sctp_socket->so_rcv.sb_ccc) {
T> /* some to read, wake-up */
T> #if defined(__APPLE__) || defined(SCTP_SO_LOCK_TESTING)
T> struct socket *so;
T> Index: sys/netinet/sctp_pcb.c
T> ===================================================================
T> --- sys/netinet/sctp_pcb.c (.../head) (revision 266804)
T> +++ sys/netinet/sctp_pcb.c (.../projects/sendfile) (revision 266807)
T> @@ -3328,7 +3328,7 @@ sctp_inpcb_free(struct sctp_inpcb *inp, int immedi
T> if ((asoc->asoc.size_on_reasm_queue > 0) ||
T> (asoc->asoc.control_pdapi) ||
T> (asoc->asoc.size_on_all_streams > 0) ||
T> - (so && (so->so_rcv.sb_cc > 0))) {
T> + (so && (so->so_rcv.sb_ccc > 0))) {
T> /* Left with Data unread */
T> struct mbuf *op_err;
T>
T> @@ -3556,7 +3556,7 @@ sctp_inpcb_free(struct sctp_inpcb *inp, int immedi
T> TAILQ_REMOVE(&inp->read_queue, sq, next);
T> sctp_free_remote_addr(sq->whoFrom);
T> if (so)
T> - so->so_rcv.sb_cc -= sq->length;
T> + so->so_rcv.sb_ccc -= sq->length;
T> if (sq->data) {
T> sctp_m_freem(sq->data);
T> sq->data = NULL;
T> @@ -4775,7 +4775,7 @@ sctp_free_assoc(struct sctp_inpcb *inp, struct sct
T> inp->sctp_flags |= SCTP_PCB_FLAGS_WAS_CONNECTED;
T> if (so) {
T> SOCK_LOCK(so);
T> - if (so->so_rcv.sb_cc == 0) {
T> + if (so->so_rcv.sb_ccc == 0) {
T> so->so_state &= ~(SS_ISCONNECTING |
T> SS_ISDISCONNECTING |
T> SS_ISCONFIRMING |
T> Index: sys/netinet/sctp_pcb.h
T> ===================================================================
T> --- sys/netinet/sctp_pcb.h (.../head) (revision 266804)
T> +++ sys/netinet/sctp_pcb.h (.../projects/sendfile) (revision 266807)
T> @@ -369,7 +369,7 @@ struct sctp_inpcb {
T> } ip_inp;
T>
T>
T> - /* Socket buffer lock protects read_queue and of course sb_cc */
T> + /* Socket buffer lock protects read_queue and of course sb_ccc */
T> struct sctp_readhead read_queue;
T>
T> LIST_ENTRY(sctp_inpcb) sctp_list; /* lists all endpoints */
T> Index: sys/netinet/sctp_usrreq.c
T> ===================================================================
T> --- sys/netinet/sctp_usrreq.c (.../head) (revision 266804)
T> +++ sys/netinet/sctp_usrreq.c (.../projects/sendfile) (revision 266807)
T> @@ -586,7 +586,7 @@ sctp_must_try_again:
T> if (((flags & SCTP_PCB_FLAGS_SOCKET_GONE) == 0) &&
T> (atomic_cmpset_int(&inp->sctp_flags, flags, (flags | SCTP_PCB_FLAGS_SOCKET_GONE | SCTP_PCB_FLAGS_CLOSE_IP)))) {
T> if (((so->so_options & SO_LINGER) && (so->so_linger == 0)) ||
T> - (so->so_rcv.sb_cc > 0)) {
T> + (so->so_rcv.sb_ccc > 0)) {
T> #ifdef SCTP_LOG_CLOSING
T> sctp_log_closing(inp, NULL, 13);
T> #endif
T> @@ -751,7 +751,7 @@ sctp_disconnect(struct socket *so)
T> }
T> if (((so->so_options & SO_LINGER) &&
T> (so->so_linger == 0)) ||
T> - (so->so_rcv.sb_cc > 0)) {
T> + (so->so_rcv.sb_ccc > 0)) {
T> if (SCTP_GET_STATE(asoc) !=
T> SCTP_STATE_COOKIE_WAIT) {
T> /* Left with Data unread */
T> @@ -916,7 +916,7 @@ sctp_flush(struct socket *so, int how)
T> inp->sctp_flags |= SCTP_PCB_FLAGS_SOCKET_CANT_READ;
T> SCTP_INP_READ_UNLOCK(inp);
T> SCTP_INP_WUNLOCK(inp);
T> - so->so_rcv.sb_cc = 0;
T> + so->so_rcv.sb_ccc = 0;
T> so->so_rcv.sb_mbcnt = 0;
T> so->so_rcv.sb_mb = NULL;
T> }
T> @@ -925,7 +925,7 @@ sctp_flush(struct socket *so, int how)
T> * First make sure the sb will be happy, we don't use these
T> * except maybe the count
T> */
T> - so->so_snd.sb_cc = 0;
T> + so->so_snd.sb_ccc = 0;
T> so->so_snd.sb_mbcnt = 0;
T> so->so_snd.sb_mb = NULL;
T>
T> Index: sys/netinet/sctp_structs.h
T> ===================================================================
T> --- sys/netinet/sctp_structs.h (.../head) (revision 266804)
T> +++ sys/netinet/sctp_structs.h (.../projects/sendfile) (revision 266807)
T> @@ -982,7 +982,7 @@ struct sctp_association {
T>
T> uint32_t total_output_queue_size;
T>
T> - uint32_t sb_cc; /* shadow of sb_cc */
T> + uint32_t sb_ccc; /* shadow of sb_ccc */
T> uint32_t sb_send_resv; /* amount reserved on a send */
T> uint32_t my_rwnd_control_len; /* shadow of sb_mbcnt used for rwnd
T> * control */
T> Index: sys/netinet/tcp_input.c
T> ===================================================================
T> --- sys/netinet/tcp_input.c (.../head) (revision 266804)
T> +++ sys/netinet/tcp_input.c (.../projects/sendfile) (revision 266807)
T> @@ -1729,7 +1729,7 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th,
T> tcp_timer_activate(tp, TT_REXMT,
T> tp->t_rxtcur);
T> sowwakeup(so);
T> - if (so->so_snd.sb_cc)
T> + if (sbavail(&so->so_snd))
T> (void) tcp_output(tp);
T> goto check_delack;
T> }
T> @@ -1837,7 +1837,7 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th,
T> newsize, so, NULL))
T> so->so_rcv.sb_flags &= ~SB_AUTOSIZE;
T> m_adj(m, drop_hdrlen); /* delayed header drop */
T> - sbappendstream_locked(&so->so_rcv, m);
T> + sbappendstream_locked(&so->so_rcv, m, 0);
T> }
T> /* NB: sorwakeup_locked() does an implicit unlock. */
T> sorwakeup_locked(so);
T> @@ -2541,7 +2541,7 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th,
T> * Otherwise we would send pure ACKs.
T> */
T> SOCKBUF_LOCK(&so->so_snd);
T> - avail = so->so_snd.sb_cc -
T> + avail = sbavail(&so->so_snd) -
T> (tp->snd_nxt - tp->snd_una);
T> SOCKBUF_UNLOCK(&so->so_snd);
T> if (avail > 0)
T> @@ -2676,10 +2676,10 @@ process_ACK:
T> cc_ack_received(tp, th, CC_ACK);
T>
T> SOCKBUF_LOCK(&so->so_snd);
T> - if (acked > so->so_snd.sb_cc) {
T> - tp->snd_wnd -= so->so_snd.sb_cc;
T> + if (acked > sbavail(&so->so_snd)) {
T> + tp->snd_wnd -= sbavail(&so->so_snd);
T> mfree = sbcut_locked(&so->so_snd,
T> - (int)so->so_snd.sb_cc);
T> + (int)sbavail(&so->so_snd));
T> ourfinisacked = 1;
T> } else {
T> mfree = sbcut_locked(&so->so_snd, acked);
T> @@ -2805,7 +2805,7 @@ step6:
T> * actually wanting to send this much urgent data.
T> */
T> SOCKBUF_LOCK(&so->so_rcv);
T> - if (th->th_urp + so->so_rcv.sb_cc > sb_max) {
T> + if (th->th_urp + sbavail(&so->so_rcv) > sb_max) {
T> th->th_urp = 0; /* XXX */
T> thflags &= ~TH_URG; /* XXX */
T> SOCKBUF_UNLOCK(&so->so_rcv); /* XXX */
T> @@ -2827,7 +2827,7 @@ step6:
T> */
T> if (SEQ_GT(th->th_seq+th->th_urp, tp->rcv_up)) {
T> tp->rcv_up = th->th_seq + th->th_urp;
T> - so->so_oobmark = so->so_rcv.sb_cc +
T> + so->so_oobmark = sbavail(&so->so_rcv) +
T> (tp->rcv_up - tp->rcv_nxt) - 1;
T> if (so->so_oobmark == 0)
T> so->so_rcv.sb_state |= SBS_RCVATMARK;
T> @@ -2897,7 +2897,7 @@ dodata: /* XXX */
T> if (so->so_rcv.sb_state & SBS_CANTRCVMORE)
T> m_freem(m);
T> else
T> - sbappendstream_locked(&so->so_rcv, m);
T> + sbappendstream_locked(&so->so_rcv, m, 0);
T> /* NB: sorwakeup_locked() does an implicit unlock. */
T> sorwakeup_locked(so);
T> } else {
T> Index: sys/netinet/sctp_input.c
T> ===================================================================
T> --- sys/netinet/sctp_input.c (.../head) (revision 266804)
T> +++ sys/netinet/sctp_input.c (.../projects/sendfile) (revision 266807)
T> @@ -1042,7 +1042,7 @@ sctp_handle_shutdown_ack(struct sctp_shutdown_ack_
T> if (stcb->sctp_socket) {
T> if ((stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE) ||
T> (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_IN_TCPPOOL)) {
T> - stcb->sctp_socket->so_snd.sb_cc = 0;
T> + stcb->sctp_socket->so_snd.sb_ccc = 0;
T> }
T> sctp_ulp_notify(SCTP_NOTIFY_ASSOC_DOWN, stcb, 0, NULL, SCTP_SO_NOT_LOCKED);
T> }
T> Index: sys/netinet/sctp_var.h
T> ===================================================================
T> --- sys/netinet/sctp_var.h (.../head) (revision 266804)
T> +++ sys/netinet/sctp_var.h (.../projects/sendfile) (revision 266807)
T> @@ -82,9 +82,9 @@ extern struct pr_usrreqs sctp_usrreqs;
T>
T> #define sctp_maxspace(sb) (max((sb)->sb_hiwat,SCTP_MINIMAL_RWND))
T>
T> -#define sctp_sbspace(asoc, sb) ((long) ((sctp_maxspace(sb) > (asoc)->sb_cc) ? (sctp_maxspace(sb) - (asoc)->sb_cc) : 0))
T> +#define sctp_sbspace(asoc, sb) ((long) ((sctp_maxspace(sb) > (asoc)->sb_ccc) ? (sctp_maxspace(sb) - (asoc)->sb_ccc) : 0))
T>
T> -#define sctp_sbspace_failedmsgs(sb) ((long) ((sctp_maxspace(sb) > (sb)->sb_cc) ? (sctp_maxspace(sb) - (sb)->sb_cc) : 0))
T> +#define sctp_sbspace_failedmsgs(sb) ((long) ((sctp_maxspace(sb) > (sb)->sb_ccc) ? (sctp_maxspace(sb) - (sb)->sb_ccc) : 0))
T>
T> #define sctp_sbspace_sub(a,b) ((a > b) ? (a - b) : 0)
T>
T> @@ -195,10 +195,10 @@ extern struct pr_usrreqs sctp_usrreqs;
T> }
T>
T> #define sctp_sbfree(ctl, stcb, sb, m) { \
T> - SCTP_SAVE_ATOMIC_DECREMENT(&(sb)->sb_cc, SCTP_BUF_LEN((m))); \
T> + SCTP_SAVE_ATOMIC_DECREMENT(&(sb)->sb_ccc, SCTP_BUF_LEN((m))); \
T> SCTP_SAVE_ATOMIC_DECREMENT(&(sb)->sb_mbcnt, MSIZE); \
T> if (((ctl)->do_not_ref_stcb == 0) && stcb) {\
T> - SCTP_SAVE_ATOMIC_DECREMENT(&(stcb)->asoc.sb_cc, SCTP_BUF_LEN((m))); \
T> + SCTP_SAVE_ATOMIC_DECREMENT(&(stcb)->asoc.sb_ccc, SCTP_BUF_LEN((m))); \
T> SCTP_SAVE_ATOMIC_DECREMENT(&(stcb)->asoc.my_rwnd_control_len, MSIZE); \
T> } \
T> if (SCTP_BUF_TYPE(m) != MT_DATA && SCTP_BUF_TYPE(m) != MT_HEADER && \
T> @@ -207,10 +207,10 @@ extern struct pr_usrreqs sctp_usrreqs;
T> }
T>
T> #define sctp_sballoc(stcb, sb, m) { \
T> - atomic_add_int(&(sb)->sb_cc,SCTP_BUF_LEN((m))); \
T> + atomic_add_int(&(sb)->sb_ccc,SCTP_BUF_LEN((m))); \
T> atomic_add_int(&(sb)->sb_mbcnt, MSIZE); \
T> if (stcb) { \
T> - atomic_add_int(&(stcb)->asoc.sb_cc,SCTP_BUF_LEN((m))); \
T> + atomic_add_int(&(stcb)->asoc.sb_ccc,SCTP_BUF_LEN((m))); \
T> atomic_add_int(&(stcb)->asoc.my_rwnd_control_len, MSIZE); \
T> } \
T> if (SCTP_BUF_TYPE(m) != MT_DATA && SCTP_BUF_TYPE(m) != MT_HEADER && \
T> Index: sys/netinet/sctp_output.c
T> ===================================================================
T> --- sys/netinet/sctp_output.c (.../head) (revision 266804)
T> +++ sys/netinet/sctp_output.c (.../projects/sendfile) (revision 266807)
T> @@ -7104,7 +7104,7 @@ one_more_time:
T> if ((stcb->sctp_socket != NULL) && \
T> ((stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE) ||
T> (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_IN_TCPPOOL))) {
T> - atomic_subtract_int(&stcb->sctp_socket->so_snd.sb_cc, sp->length);
T> + atomic_subtract_int(&stcb->sctp_socket->so_snd.sb_ccc, sp->length);
T> }
T> if (sp->data) {
T> sctp_m_freem(sp->data);
T> @@ -11382,7 +11382,7 @@ jump_out:
T> drp->current_onq = htonl(asoc->size_on_reasm_queue +
T> asoc->size_on_all_streams +
T> asoc->my_rwnd_control_len +
T> - stcb->sctp_socket->so_rcv.sb_cc);
T> + stcb->sctp_socket->so_rcv.sb_ccc);
T> } else {
T> /*-
T> * If my rwnd is 0, possibly from mbuf depletion as well as
T> Index: sys/netinet/tcp_usrreq.c
T> ===================================================================
T> --- sys/netinet/tcp_usrreq.c (.../head) (revision 266804)
T> +++ sys/netinet/tcp_usrreq.c (.../projects/sendfile) (revision 266807)
T> @@ -826,7 +826,7 @@ tcp_usr_send(struct socket *so, int flags, struct
T> m_freem(control); /* empty control, just free it */
T> }
T> if (!(flags & PRUS_OOB)) {
T> - sbappendstream(&so->so_snd, m);
T> + sbappendstream(&so->so_snd, m, flags);
T> if (nam && tp->t_state < TCPS_SYN_SENT) {
T> /*
T> * Do implied connect if not yet connected,
T> @@ -858,7 +858,8 @@ tcp_usr_send(struct socket *so, int flags, struct
T> socantsendmore(so);
T> tcp_usrclosed(tp);
T> }
T> - if (!(inp->inp_flags & INP_DROPPED)) {
T> + if (!(inp->inp_flags & INP_DROPPED) &&
T> + !(flags & PRUS_NOTREADY)) {
T> if (flags & PRUS_MORETOCOME)
T> tp->t_flags |= TF_MORETOCOME;
T> error = tcp_output(tp);
T> @@ -884,7 +885,7 @@ tcp_usr_send(struct socket *so, int flags, struct
T> * of data past the urgent section.
T> * Otherwise, snd_up should be one lower.
T> */
T> - sbappendstream_locked(&so->so_snd, m);
T> + sbappendstream_locked(&so->so_snd, m, flags);
T> SOCKBUF_UNLOCK(&so->so_snd);
T> if (nam && tp->t_state < TCPS_SYN_SENT) {
T> /*
T> @@ -908,10 +909,12 @@ tcp_usr_send(struct socket *so, int flags, struct
T> tp->snd_wnd = TTCP_CLIENT_SND_WND;
T> tcp_mss(tp, -1);
T> }
T> - tp->snd_up = tp->snd_una + so->so_snd.sb_cc;
T> - tp->t_flags |= TF_FORCEDATA;
T> - error = tcp_output(tp);
T> - tp->t_flags &= ~TF_FORCEDATA;
T> + tp->snd_up = tp->snd_una + sbavail(&so->so_snd);
T> + if (!(flags & PRUS_NOTREADY)) {
T> + tp->t_flags |= TF_FORCEDATA;
T> + error = tcp_output(tp);
T> + tp->t_flags &= ~TF_FORCEDATA;
T> + }
T> }
T> out:
T> TCPDEBUG2((flags & PRUS_OOB) ? PRU_SENDOOB :
T> Index: sys/netinet/accf_dns.c
T> ===================================================================
T> --- sys/netinet/accf_dns.c (.../head) (revision 266804)
T> +++ sys/netinet/accf_dns.c (.../projects/sendfile) (revision 266807)
T> @@ -75,7 +75,7 @@ sohasdns(struct socket *so, void *arg, int waitfla
T> struct sockbuf *sb = &so->so_rcv;
T>
T> /* If the socket is full, we're ready. */
T> - if (sb->sb_cc >= sb->sb_hiwat || sb->sb_mbcnt >= sb->sb_mbmax)
T> + if (sbused(sb) >= sb->sb_hiwat || sb->sb_mbcnt >= sb->sb_mbmax)
T> goto ready;
T>
T> /* Check to see if we have a request. */
T> @@ -115,7 +115,7 @@ skippacket(struct sockbuf *sb) {
T> unsigned long packlen;
T> struct packet q, *p = &q;
T>
T> - if (sb->sb_cc < 2)
T> + if (sbavail(sb) < 2)
T> return DNS_WAIT;
T>
T> q.m = sb->sb_mb;
T> @@ -122,7 +122,7 @@ skippacket(struct sockbuf *sb) {
T> q.n = q.m->m_nextpkt;
T> q.moff = 0;
T> q.offset = 0;
T> - q.len = sb->sb_cc;
T> + q.len = sbavail(sb);
T>
T> GET16(p, packlen);
T> if (packlen + 2 > q.len)
T> Index: sys/netinet/sctputil.c
T> ===================================================================
T> --- sys/netinet/sctputil.c (.../head) (revision 266804)
T> +++ sys/netinet/sctputil.c (.../projects/sendfile) (revision 266807)
T> @@ -67,9 +67,9 @@ sctp_sblog(struct sockbuf *sb, struct sctp_tcb *st
T> struct sctp_cwnd_log sctp_clog;
T>
T> sctp_clog.x.sb.stcb = stcb;
T> - sctp_clog.x.sb.so_sbcc = sb->sb_cc;
T> + sctp_clog.x.sb.so_sbcc = sb->sb_ccc;
T> if (stcb)
T> - sctp_clog.x.sb.stcb_sbcc = stcb->asoc.sb_cc;
T> + sctp_clog.x.sb.stcb_sbcc = stcb->asoc.sb_ccc;
T> else
T> sctp_clog.x.sb.stcb_sbcc = 0;
T> sctp_clog.x.sb.incr = incr;
T> @@ -4356,7 +4356,7 @@ sctp_add_to_readq(struct sctp_inpcb *inp,
T> {
T> /*
T> * Here we must place the control on the end of the socket read
T> - * queue AND increment sb_cc so that select will work properly on
T> + * queue AND increment sb_ccc so that select will work properly on
T> * read.
T> */
T> struct mbuf *m, *prev = NULL;
T> @@ -4482,7 +4482,7 @@ sctp_append_to_readq(struct sctp_inpcb *inp,
T> * the reassembly queue.
T> *
T> * If PDAPI this means we need to add m to the end of the data.
T> - * Increase the length in the control AND increment the sb_cc.
T> + * Increase the length in the control AND increment the sb_ccc.
T> * Otherwise sb is NULL and all we need to do is put it at the end
T> * of the mbuf chain.
T> */
T> @@ -4694,10 +4694,10 @@ sctp_free_bufspace(struct sctp_tcb *stcb, struct s
T>
T> if (stcb->sctp_socket && (((stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_IN_TCPPOOL)) ||
T> ((stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE)))) {
T> - if (stcb->sctp_socket->so_snd.sb_cc >= tp1->book_size) {
T> - stcb->sctp_socket->so_snd.sb_cc -= tp1->book_size;
T> + if (stcb->sctp_socket->so_snd.sb_ccc >= tp1->book_size) {
T> + stcb->sctp_socket->so_snd.sb_ccc -= tp1->book_size;
T> } else {
T> - stcb->sctp_socket->so_snd.sb_cc = 0;
T> + stcb->sctp_socket->so_snd.sb_ccc = 0;
T>
T> }
T> }
T> @@ -5232,11 +5232,11 @@ sctp_sorecvmsg(struct socket *so,
T> in_eeor_mode = sctp_is_feature_on(inp, SCTP_PCB_FLAGS_EXPLICIT_EOR);
T> if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_RECV_RWND_LOGGING_ENABLE) {
T> sctp_misc_ints(SCTP_SORECV_ENTER,
T> - rwnd_req, in_eeor_mode, so->so_rcv.sb_cc, uio->uio_resid);
T> + rwnd_req, in_eeor_mode, so->so_rcv.sb_ccc, uio->uio_resid);
T> }
T> if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_RECV_RWND_LOGGING_ENABLE) {
T> sctp_misc_ints(SCTP_SORECV_ENTERPL,
T> - rwnd_req, block_allowed, so->so_rcv.sb_cc, uio->uio_resid);
T> + rwnd_req, block_allowed, so->so_rcv.sb_ccc, uio->uio_resid);
T> }
T> error = sblock(&so->so_rcv, (block_allowed ? SBL_WAIT : 0));
T> if (error) {
T> @@ -5255,7 +5255,7 @@ restart_nosblocks:
T> (inp->sctp_flags & SCTP_PCB_FLAGS_SOCKET_ALLGONE)) {
T> goto out;
T> }
T> - if ((so->so_rcv.sb_state & SBS_CANTRCVMORE) && (so->so_rcv.sb_cc == 0)) {
T> + if ((so->so_rcv.sb_state & SBS_CANTRCVMORE) && (so->so_rcv.sb_ccc == 0)) {
T> if (so->so_error) {
T> error = so->so_error;
T> if ((in_flags & MSG_PEEK) == 0)
T> @@ -5262,7 +5262,7 @@ restart_nosblocks:
T> so->so_error = 0;
T> goto out;
T> } else {
T> - if (so->so_rcv.sb_cc == 0) {
T> + if (so->so_rcv.sb_ccc == 0) {
T> /* indicate EOF */
T> error = 0;
T> goto out;
T> @@ -5269,9 +5269,9 @@ restart_nosblocks:
T> }
T> }
T> }
T> - if ((so->so_rcv.sb_cc <= held_length) && block_allowed) {
T> + if ((so->so_rcv.sb_ccc <= held_length) && block_allowed) {
T> /* we need to wait for data */
T> - if ((so->so_rcv.sb_cc == 0) &&
T> + if ((so->so_rcv.sb_ccc == 0) &&
T> ((inp->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE) ||
T> (inp->sctp_flags & SCTP_PCB_FLAGS_IN_TCPPOOL))) {
T> if ((inp->sctp_flags & SCTP_PCB_FLAGS_CONNECTED) == 0) {
T> @@ -5307,7 +5307,7 @@ restart_nosblocks:
T> }
T> held_length = 0;
T> goto restart_nosblocks;
T> - } else if (so->so_rcv.sb_cc == 0) {
T> + } else if (so->so_rcv.sb_ccc == 0) {
T> if (so->so_error) {
T> error = so->so_error;
T> if ((in_flags & MSG_PEEK) == 0)
T> @@ -5364,11 +5364,11 @@ restart_nosblocks:
T> SCTP_INP_READ_LOCK(inp);
T> }
T> control = TAILQ_FIRST(&inp->read_queue);
T> - if ((control == NULL) && (so->so_rcv.sb_cc != 0)) {
T> + if ((control == NULL) && (so->so_rcv.sb_ccc != 0)) {
T> #ifdef INVARIANTS
T> panic("Huh, its non zero and nothing on control?");
T> #endif
T> - so->so_rcv.sb_cc = 0;
T> + so->so_rcv.sb_ccc = 0;
T> }
T> SCTP_INP_READ_UNLOCK(inp);
T> hold_rlock = 0;
T> @@ -5489,11 +5489,11 @@ restart_nosblocks:
T> }
T> /*
T> * if we reach here, not suitable replacement is available
T> - * <or> fragment interleave is NOT on. So stuff the sb_cc
T> + * <or> fragment interleave is NOT on. So stuff the sb_ccc
T> * into the our held count, and its time to sleep again.
T> */
T> - held_length = so->so_rcv.sb_cc;
T> - control->held_length = so->so_rcv.sb_cc;
T> + held_length = so->so_rcv.sb_ccc;
T> + control->held_length = so->so_rcv.sb_ccc;
T> goto restart;
T> }
T> /* Clear the held length since there is something to read */
T> @@ -5790,10 +5790,10 @@ get_more_data:
T> if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_SB_LOGGING_ENABLE) {
T> sctp_sblog(&so->so_rcv, control->do_not_ref_stcb ? NULL : stcb, SCTP_LOG_SBFREE, cp_len);
T> }
T> - atomic_subtract_int(&so->so_rcv.sb_cc, cp_len);
T> + atomic_subtract_int(&so->so_rcv.sb_ccc, cp_len);
T> if ((control->do_not_ref_stcb == 0) &&
T> stcb) {
T> - atomic_subtract_int(&stcb->asoc.sb_cc, cp_len);
T> + atomic_subtract_int(&stcb->asoc.sb_ccc, cp_len);
T> }
T> copied_so_far += cp_len;
T> freed_so_far += cp_len;
T> @@ -5938,7 +5938,7 @@ wait_some_more:
T> (sctp_is_feature_on(inp, SCTP_PCB_FLAGS_FRAG_INTERLEAVE))) {
T> goto release;
T> }
T> - if (so->so_rcv.sb_cc <= control->held_length) {
T> + if (so->so_rcv.sb_ccc <= control->held_length) {
T> error = sbwait(&so->so_rcv);
T> if (error) {
T> goto release;
T> @@ -5965,8 +5965,8 @@ wait_some_more:
T> }
T> goto done_with_control;
T> }
T> - if (so->so_rcv.sb_cc > held_length) {
T> - control->held_length = so->so_rcv.sb_cc;
T> + if (so->so_rcv.sb_ccc > held_length) {
T> + control->held_length = so->so_rcv.sb_ccc;
T> held_length = 0;
T> }
T> goto wait_some_more;
T> @@ -6113,13 +6113,13 @@ out:
T> freed_so_far,
T> ((uio) ? (slen - uio->uio_resid) : slen),
T> stcb->asoc.my_rwnd,
T> - so->so_rcv.sb_cc);
T> + so->so_rcv.sb_ccc);
T> } else {
T> sctp_misc_ints(SCTP_SORECV_DONE,
T> freed_so_far,
T> ((uio) ? (slen - uio->uio_resid) : slen),
T> 0,
T> - so->so_rcv.sb_cc);
T> + so->so_rcv.sb_ccc);
T> }
T> }
T> stage_left:
T> Index: sys/netinet/sctputil.h
T> ===================================================================
T> --- sys/netinet/sctputil.h (.../head) (revision 266804)
T> +++ sys/netinet/sctputil.h (.../projects/sendfile) (revision 266807)
T> @@ -284,10 +284,10 @@ do { \
T> } \
T> if (stcb->sctp_socket && ((stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE) || \
T> (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_IN_TCPPOOL))) { \
T> - if (stcb->sctp_socket->so_snd.sb_cc >= tp1->book_size) { \
T> - atomic_subtract_int(&((stcb)->sctp_socket->so_snd.sb_cc), tp1->book_size); \
T> + if (stcb->sctp_socket->so_snd.sb_ccc >= tp1->book_size) { \
T> + atomic_subtract_int(&((stcb)->sctp_socket->so_snd.sb_ccc), tp1->book_size); \
T> } else { \
T> - stcb->sctp_socket->so_snd.sb_cc = 0; \
T> + stcb->sctp_socket->so_snd.sb_ccc = 0; \
T> } \
T> } \
T> } \
T> @@ -305,10 +305,10 @@ do { \
T> } \
T> if (stcb->sctp_socket && ((stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE) || \
T> (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_IN_TCPPOOL))) { \
T> - if (stcb->sctp_socket->so_snd.sb_cc >= sp->length) { \
T> - atomic_subtract_int(&stcb->sctp_socket->so_snd.sb_cc,sp->length); \
T> + if (stcb->sctp_socket->so_snd.sb_ccc >= sp->length) { \
T> + atomic_subtract_int(&stcb->sctp_socket->so_snd.sb_ccc,sp->length); \
T> } else { \
T> - stcb->sctp_socket->so_snd.sb_cc = 0; \
T> + stcb->sctp_socket->so_snd.sb_ccc = 0; \
T> } \
T> } \
T> } \
T> @@ -320,7 +320,7 @@ do { \
T> if ((stcb->sctp_socket != NULL) && \
T> ((stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE) || \
T> (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_IN_TCPPOOL))) { \
T> - atomic_add_int(&stcb->sctp_socket->so_snd.sb_cc,sz); \
T> + atomic_add_int(&stcb->sctp_socket->so_snd.sb_ccc,sz); \
T> } \
T> } while (0)
T>
T> Index: usr.bin/bluetooth/btsockstat/btsockstat.c
T> ===================================================================
T> --- usr.bin/bluetooth/btsockstat/btsockstat.c (.../head) (revision 266804)
T> +++ usr.bin/bluetooth/btsockstat/btsockstat.c (.../projects/sendfile) (revision 266807)
T> @@ -255,8 +255,8 @@ hcirawpr(kvm_t *kvmd, u_long addr)
T> (unsigned long) pcb.so,
T> (unsigned long) this,
T> pcb.flags,
T> - so.so_rcv.sb_cc,
T> - so.so_snd.sb_cc,
T> + so.so_rcv.sb_ccc,
T> + so.so_snd.sb_ccc,
T> pcb.addr.hci_node);
T> }
T> } /* hcirawpr */
T> @@ -303,8 +303,8 @@ l2caprawpr(kvm_t *kvmd, u_long addr)
T> "%-8lx %-8lx %6d %6d %-17.17s\n",
T> (unsigned long) pcb.so,
T> (unsigned long) this,
T> - so.so_rcv.sb_cc,
T> - so.so_snd.sb_cc,
T> + so.so_rcv.sb_ccc,
T> + so.so_snd.sb_ccc,
T> bdaddrpr(&pcb.src, NULL, 0));
T> }
T> } /* l2caprawpr */
T> @@ -361,8 +361,8 @@ l2cappr(kvm_t *kvmd, u_long addr)
T> fprintf(stdout,
T> "%-8lx %6d %6d %-17.17s/%-5d %-17.17s %-5d %s\n",
T> (unsigned long) this,
T> - so.so_rcv.sb_cc,
T> - so.so_snd.sb_cc,
T> + so.so_rcv.sb_ccc,
T> + so.so_snd.sb_ccc,
T> bdaddrpr(&pcb.src, local, sizeof(local)),
T> pcb.psm,
T> bdaddrpr(&pcb.dst, remote, sizeof(remote)),
T> @@ -467,8 +467,8 @@ rfcommpr(kvm_t *kvmd, u_long addr)
T> fprintf(stdout,
T> "%-8lx %6d %6d %-17.17s %-17.17s %-4d %-4d %s\n",
T> (unsigned long) this,
T> - so.so_rcv.sb_cc,
T> - so.so_snd.sb_cc,
T> + so.so_rcv.sb_ccc,
T> + so.so_snd.sb_ccc,
T> bdaddrpr(&pcb.src, local, sizeof(local)),
T> bdaddrpr(&pcb.dst, remote, sizeof(remote)),
T> pcb.channel,
T> Index: usr.bin/systat/netstat.c
T> ===================================================================
T> --- usr.bin/systat/netstat.c (.../head) (revision 266804)
T> +++ usr.bin/systat/netstat.c (.../projects/sendfile) (revision 266807)
T> @@ -333,8 +333,8 @@ enter_kvm(struct inpcb *inp, struct socket *so, in
T> struct netinfo *p;
T>
T> if ((p = enter(inp, state, proto)) != NULL) {
T> - p->ni_rcvcc = so->so_rcv.sb_cc;
T> - p->ni_sndcc = so->so_snd.sb_cc;
T> + p->ni_rcvcc = so->so_rcv.sb_ccc;
T> + p->ni_sndcc = so->so_snd.sb_ccc;
T> }
T> }
T>
T> Index: usr.bin/netstat/netgraph.c
T> ===================================================================
T> --- usr.bin/netstat/netgraph.c (.../head) (revision 266804)
T> +++ usr.bin/netstat/netgraph.c (.../projects/sendfile) (revision 266807)
T> @@ -119,7 +119,7 @@ netgraphprotopr(u_long off, const char *name, int
T> if (Aflag)
T> printf("%8lx ", (u_long) this);
T> printf("%-5.5s %6u %6u ",
T> - name, sockb.so_rcv.sb_cc, sockb.so_snd.sb_cc);
T> + name, sockb.so_rcv.sb_ccc, sockb.so_snd.sb_ccc);
T>
T> /* Get info on associated node */
T> if (ngpcb.node_id == 0 || csock == -1)
T> Index: usr.bin/netstat/unix.c
T> ===================================================================
T> --- usr.bin/netstat/unix.c (.../head) (revision 266804)
T> +++ usr.bin/netstat/unix.c (.../projects/sendfile) (revision 266807)
T> @@ -287,7 +287,8 @@ unixdomainpr(struct xunpcb *xunp, struct xsocket *
T> } else {
T> printf("%8lx %-6.6s %6u %6u %8lx %8lx %8lx %8lx",
T> (long)so->so_pcb, socktype[so->so_type], so->so_rcv.sb_cc,
T> - so->so_snd.sb_cc, (long)unp->unp_vnode, (long)unp->unp_conn,
T> + so->so_snd.sb_cc, (long)unp->unp_vnode,
T> + (long)unp->unp_conn,
T> (long)LIST_FIRST(&unp->unp_refs),
T> (long)LIST_NEXT(unp, unp_reflink));
T> }
T> Index: usr.bin/netstat/inet.c
T> ===================================================================
T> --- usr.bin/netstat/inet.c (.../head) (revision 266804)
T> +++ usr.bin/netstat/inet.c (.../projects/sendfile) (revision 266807)
T> @@ -137,7 +137,7 @@ pcblist_sysctl(int proto, const char *name, char *
T> static void
T> sbtoxsockbuf(struct sockbuf *sb, struct xsockbuf *xsb)
T> {
T> - xsb->sb_cc = sb->sb_cc;
T> + xsb->sb_cc = sb->sb_ccc;
T> xsb->sb_hiwat = sb->sb_hiwat;
T> xsb->sb_mbcnt = sb->sb_mbcnt;
T> xsb->sb_mcnt = sb->sb_mcnt;
T> @@ -479,7 +479,8 @@ protopr(u_long off, const char *name, int af1, int
T> printf("%6u %6u %6u ", tp->t_sndrexmitpack,
T> tp->t_rcvoopack, tp->t_sndzerowin);
T> } else {
T> - printf("%6u %6u ", so->so_rcv.sb_cc, so->so_snd.sb_cc);
T> + printf("%6u %6u ",
T> + so->so_rcv.sb_cc, so->so_snd.sb_cc);
T> }
T> if (numeric_port) {
T> if (inp->inp_vflag & INP_IPV4) {
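
Note on the netstat hunks: the kvm path keeps exporting a single counter,
sbtoxsockbuf() simply fills the legacy xsb->sb_cc from the kernel's sb_ccc,
so the printing code below it only changes its line wrapping. A standalone
sketch of that translation idea, with invented types (not the real structs):

    #include <stdio.h>

    /* Invented stand-ins for the kernel and exported buffer layouts. */
    struct toy_kernel_sb {
            unsigned avail;         /* readable/sendable bytes */
            unsigned claimed;       /* all queued bytes */
    };

    struct toy_export_sb {
            unsigned cc;            /* legacy single counter */
    };

    static void
    toy_sbtoxsockbuf(const struct toy_kernel_sb *sb, struct toy_export_sb *xsb)
    {
            xsb->cc = sb->claimed;  /* old tools keep seeing one number */
    }

    int
    main(void)
    {
            struct toy_kernel_sb sb = { 1024, 4096 };
            struct toy_export_sb xsb;

            toy_sbtoxsockbuf(&sb, &xsb);
            printf("Recv-Q/Send-Q column would show %u\n", xsb.cc);
            return (0);
    }
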
--
Totus tuus, Glebius.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sendfile.diff
Type: text/x-diff
Size: 123147 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-arch/attachments/20140831/1c7fba68/attachment-0001.diff>