Netgraph performance with ng_ipfw
Евгений
ekamyshev at omsk.multinex.ru
Fri Jan 22 21:48:22 UTC 2010
Hi,
I have several routers under heavy load, running FreeBSD 7.2.
These routers use Netgraph to implement traffic shaping and accounting
(using ng_car and ng_netflow nodes).
Packets are passed from the firewall to netgraph using the following rules:
accounting:
netgraph 100 ip from any to any in
shaping:
netgraph tablearg ip from any to table(118) out
netgraph tablearg ip from table(118) to any in
Table 118 contains users' IP addresses, with tablearg referencing each user's individually configured ng_car node.
At peak, there are 1500-2000 entries in the table and as many configured nodes.
The problem is that at peak load the router loses packets. After studying the sources and doing some debugging,
it became clear that packets are being dropped at the netgraph queue, in the ng_alloc_item function:
static __inline item_p
ng_alloc_item(int type, int flags)
{
	item_p item;
	KASSERT(((type & ~NGQF_TYPE) == 0),
	    ("%s: incorrect item type: %d", __func__, type));
	item = uma_zalloc((type == NGQF_DATA)?ng_qdzone:ng_qzone,
	    ((flags & NG_WAITOK) ? M_WAITOK : M_NOWAIT) | M_ZERO);
	if (item) {
		item->el_flags = type;
#ifdef NETGRAPH_DEBUG
		mtx_lock(&ngq_mtx);
		TAILQ_INSERT_TAIL(&ng_itemlist, item, all);
		allocated++;
		mtx_unlock(&ngq_mtx);
#endif
	}
	return (item);
}
It returns NULL if it is unable to allocate an entry in ng_qdzone.
When it is called from ng_package_data, this causes the packet to be dropped:
item_p
ng_package_data(struct mbuf *m, int flags)
{
	item_p item;
	if ((item = ng_alloc_item(NGQF_DATA, flags)) == NULL) {
		NG_FREE_M(m);
		return (NULL);
	}
	ITEM_DEBUG_CHECKS;
	item->el_flags |= NGQF_READER;
	NGI_M(item) = m;
	return (item);
}
After tuning the maxdata parameter, I was able to decrease losses (at the cost of increased delays), but the question is: why
does the system not contain some kind of counter of packets dropped at the Netgraph queue? It seems to be
a trivial task to add, for example, a sysctl variable that would reflect the number of dropped packets, and it would
really simplify things.
The second question is about the effectiveness of Netgraph queueing and the ng_ipfw node with an SMP kernel.
In the ng_ipfw_connect function, when ng_ipfw is being connected to some other node,
the hook is set to queueing mode to avoid recursion:
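As an illustration of the counter being asked for, here is a minimal userspace sketch, not kernel code. The names item_zone, zone_alloc, zone_free and the drops field are hypothetical; the struct only models a bounded zone such as ng_qdzone with its maxdata limit. In the kernel, the equivalent counter would presumably be incremented on the NULL path of ng_alloc_item and exported read-only through sysctl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a bounded item zone (stands in for ng_qdzone). */
struct item_zone {
	size_t   limit;		/* models the maxdata limit */
	size_t   in_use;	/* items currently allocated */
	uint64_t drops;		/* hypothetical counter of failed allocations */
};

static void
zone_init(struct item_zone *z, size_t limit)
{
	z->limit = limit;
	z->in_use = 0;
	z->drops = 0;
}

/*
 * Returns 1 on success, 0 when the zone is exhausted. The failure path
 * is where a real packet would be freed, so that is where we count.
 */
static int
zone_alloc(struct item_zone *z)
{
	if (z->in_use >= z->limit) {
		z->drops++;
		return (0);
	}
	z->in_use++;
	return (1);
}

/* Returns one item to the zone; freeing from an empty zone is a bug. */
static void
zone_free(struct item_zone *z)
{
	assert(z->in_use > 0);
	z->in_use--;
}
```

With such a counter, raising maxdata could be judged by watching whether drops keeps growing at peak load, instead of by debugging the allocator.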
/*
 * Set hooks into queueing mode, to avoid recursion between
 * netgraph layer and ip_{input,output}.
 */
static int
ng_ipfw_connect(hook_p hook)
{
	NG_HOOK_FORCE_QUEUE(hook);
	return (0);
}
This causes packets to be queued when they are passed back to the ng_ipfw node.
On SMP kernels, several kernel processes are created to process
the queues (they are seen as ng_queue* processes in ps).
Now, here is the code of ngthread, which processes the queue:
static void
ngthread(void *arg)
{
	for (;;) {
		node_p node;
		/* Get node from the worklist. */
		NG_WORKLIST_LOCK();
		while ((node = TAILQ_FIRST(&ng_worklist)) == NULL)
			NG_WORKLIST_SLEEP();
		TAILQ_REMOVE(&ng_worklist, node, nd_work);
		NG_WORKLIST_UNLOCK();
		CTR3(KTR_NET, "%20s: node [%x] (%p) taken off worklist",
		    __func__, node->nd_ID, node);
		/*
		 * We have the node. We also take over the reference
		 * that the list had on it.
		 * Now process as much as you can, until it won't
		 * let you have another item off the queue.
		 * All this time, keep the reference
		 * that lets us be sure that the node still exists.
		 * Let the reference go at the last minute.
		 */
		for (;;) {
			item_p item;
			int rw;
			NG_QUEUE_LOCK(&node->nd_input_queue);
			item = ng_dequeue(&node->nd_input_queue, &rw);
			if (item == NULL) {
				atomic_clear_int(&node->nd_flags, NGF_WORKQ);
				NG_QUEUE_UNLOCK(&node->nd_input_queue);
				break; /* go look for another node */
			} else {
				NG_QUEUE_UNLOCK(&node->nd_input_queue);
				NGI_GET_NODE(item, node); /* zaps stored node */
				ng_apply_item(node, item, rw);
				NG_NODE_UNREF(node);
			}
		}
		NG_NODE_UNREF(node);
	}
}
It takes a node from ng_worklist and tries to process as many items
in its queue as possible, until the ng_dequeue function returns NULL (no more items).
Note that ng_worklist usually contains only one node, ng_ipfw (as long as other nodes
have not configured queueing for themselves, which is the case with ng_car and ng_netflow nodes).
If a large number of packets is being passed back to the ng_ipfw node
from other nodes, one kernel process (ng_queue*) will simply take that one node, and
if the packets arrive faster than they are processed in ng_ipfw (sent further to
ip_input or ip_output), that ng_queue* process will take 100% of one CPU core while the others
process nothing.
I have seen such behavior on my routers: at peak load, one of the ng_queue* processes takes 100% of one core,
and the other processes are seen in top taking 0% of CPU.
This seems to be a problem with ng_ipfw: it doesn't seem to work well with SMP.
My question is, can it somehow be fixed?
The third question is about the algorithm for finding hooks in ng_ipfw.
When a packet is passed from the firewall, ng_ipfw_input is called; it in turn
calls the ng_ipfw_findhook1 function to find the hook matching the cookie from
struct ip_fw_args *fwa:
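The effect described above can be reproduced with a small deterministic model. This is only a userspace sketch under simplifying assumptions: sim_node, sim_run and NWORKERS are hypothetical names, there is no real threading, and each worker gets one chance to take a node off the worklist, standing in for the other workers running while the first is still busy.

```c
#include <assert.h>
#include <stddef.h>

#define NWORKERS 4	/* models the ng_queue* processes */

/* Hypothetical model of a node with a pending input queue. */
struct sim_node {
	int queued;		/* items waiting in the node's input queue */
	int on_worklist;	/* nonzero while the node sits on the worklist */
};

/*
 * Each worker in turn looks for a node that is still on the worklist.
 * The worker that takes a node removes it from the worklist and drains
 * its whole queue, as the inner for (;;) loop of ngthread does. A worker
 * that finds the worklist empty processes nothing.
 */
static void
sim_run(struct sim_node *nodes, size_t nnodes, int processed[NWORKERS])
{
	assert(nodes != NULL || nnodes == 0);
	for (int w = 0; w < NWORKERS; w++) {
		processed[w] = 0;
		for (size_t i = 0; i < nnodes; i++) {
			if (!nodes[i].on_worklist)
				continue;
			nodes[i].on_worklist = 0;	/* like TAILQ_REMOVE */
			processed[w] += nodes[i].queued;
			nodes[i].queued = 0;
			break;	/* this worker is now busy with its node */
		}
	}
}
```

With a single queueing node holding all the items, the first worker ends up with every item and the remaining workers with none, which matches the one busy ng_queue* process seen in top. Work only spreads across workers in this model when several nodes are on the worklist at once.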
	if (fw_node == NULL ||
	    (hook = ng_ipfw_findhook1(fw_node, fwa->cookie)) == NULL) {
		if (tee == 0)
			m_freem(*m0);
		return (ESRCH); /* no hook associated with this rule */
	}
The ng_ipfw_findhook function, which looks a hook up by name, converts the name to its numeric representation
and also calls ng_ipfw_findhook1:
/* Look up hook by name */
hook_p
ng_ipfw_findhook(node_p node, const char *name)
{
	u_int16_t n; /* numeric representation of hook */
	char *endptr;
	n = (u_int16_t)strtol(name, &endptr, 10);
	if (*endptr != '\0')
		return NULL;
	return ng_ipfw_findhook1(node, n);
}
and ng_ipfw_findhook1 simply goes through the whole list of hooks to find the one matching
the given cookie:
/* Look up hook by rule number */
static hook_p
ng_ipfw_findhook1(node_p node, u_int16_t rulenum)
{
	hook_p hook;
	hpriv_p hpriv;
	LIST_FOREACH(hook, &node->nd_hooks, hk_hooks) {
		hpriv = NG_HOOK_PRIVATE(hook);
		if (NG_HOOK_IS_VALID(hook) && (hpriv->rulenum == rulenum))
			return (hook);
	}
	return (NULL);
}
When a large number of hooks is present, as in the configuration given at the beginning of this message,
this causes an obvious decrease in performance: for each packet passed from ipfw to netgraph,
anywhere from 1 to 1500-2000 iterations are needed to find the matching hook. And again, it seems to be a trivial task to rewrite
this code to find the hook by hash or even by array.
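The "by array" variant could look roughly like the following userspace sketch. It is not the ng_ipfw code: sim_hook, hook_table and the hook_table_* functions are hypothetical names, and the sketch assumes that hook numbers are unique 16-bit values, as the u_int16_t rulenum argument suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HOOK_TABLE_SIZE 65536	/* one slot per possible 16-bit number */

/* Hypothetical stand-in for a netgraph hook and its private data. */
struct sim_hook {
	uint16_t rulenum;
};

/* Direct-indexed table: slot[n] is the hook numbered n, or NULL. */
struct hook_table {
	struct sim_hook *slot[HOOK_TABLE_SIZE];
};

/* Would be called when a hook is created. */
static void
hook_table_insert(struct hook_table *t, struct sim_hook *h)
{
	assert(h != NULL);
	t->slot[h->rulenum] = h;
}

/* Would be called when a hook is disconnected. */
static void
hook_table_remove(struct hook_table *t, uint16_t rulenum)
{
	t->slot[rulenum] = NULL;
}

/* Constant-time replacement for the LIST_FOREACH scan. */
static struct sim_hook *
hook_table_find(const struct hook_table *t, uint16_t rulenum)
{
	return (t->slot[rulenum]);
}
```

The table costs 65536 pointers (512 KB with 8-byte pointers) per ng_ipfw node, in exchange for a lookup that no longer depends on the number of hooks. A hash table keyed on the same number would use less memory at the price of slightly more work per lookup.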