kqueue disable on delivery...

Igor Sysoev is at rambler-co.ru
Wed Sep 27 09:16:54 PDT 2006


On Sat, 23 Sep 2006, John-Mark Gurney wrote:

> Igor Sysoev wrote this message on Sat, Sep 23, 2006 at 11:40 +0400:
>> On Fri, 22 Sep 2006, John-Mark Gurney wrote:
>>
>>> Igor Sysoev wrote this message on Fri, Sep 22, 2006 at 17:25 +0400:
>>>> On Sun, 17 Sep 2006, John-Mark Gurney wrote:
>>>>
>>>>> I have implemented a couple additional features to kqueue.  These allow
>>>>> kqueue to be a multithreaded event delivery system that can guarantee
>>>>> that the event will only be active in one thread at any time.
>>>>>
>>>>> The first is EV_DOD, aka disable on delivery.  When the event will be
>>>>> delivered to userland, the knote is marked disabled so we don't
>>>>> have to go through the expense of reallocing the knote each time.
>>>>> (Reallocation of the knote is also lock intensive, and disabling is
>>>>> cheap.)
>>>>
>>>> In my opinion, it's too implementation-specific a flag.
>>>
>>> How else are you going to solve having multiple threads servicing
>>> the same queue at the same time?  Also, Apple is planning on having
>>> a similar flag to EV_DOD, but I don't know what they are naming it..
>>> I've tried for a while to find out, but haven't been able to...
>>
>> As I understand it, EV_DOD or EV_CLEAR|EV_DOD are like a simple
>> EV_ONESHOT, except the filter is not deleted on delivery, but is
>> disabled, skipping some in-kernel lock overhead. That's why I called
>> it too implementation specific.
>>
>> Yes, EV_CLEAR|EV_DOD guarantees that the event will be active
>> in one thread only at any time. But in practice I have seen that it is
>> necessary to guarantee that the socket (both events, EVFILT_READ
>> and EVFILT_WRITE) will be active in one thread only at any time.
>> It seems that is the reason why the heavily threaded Solaris 10 event
>> ports use the oneshot-only model, where a socket is deleted from the
>> port on delivery.
>
> Only if you need both read and write active on the socket at once...
> In some/many servers, you only need one or the other, such as file
> transfer servers like http and ftp...

I have thought about your flags and believe they would be useful.
I also think that the cheapness of EV_DISABLE and EV_DOD should be
documented.
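
To make the cost difference discussed above concrete, here is a minimal
userland sketch. It is only a toy model: the toy_* names and flag values
are hypothetical and are not the kernel's knote code or the constants
from <sys/event.h>. It assumes that a oneshot note is freed on delivery
and has to be added again, while a disable-on-delivery note just flips a
state bit and is re-enabled by the handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy flags for this model only; NOT the values from <sys/event.h>. */
enum { TOY_ONESHOT = 0x1, TOY_DOD = 0x2 };

struct toy_knote {
	int	flags;		/* TOY_ONESHOT and/or TOY_DOD */
	bool	allocated;	/* false once the note has been freed */
	bool	disabled;	/* true while one thread owns the event */
	bool	active;		/* the filter reported the event as ready */
};

/* Try to deliver the event; returns true if it was handed to the caller. */
static bool
toy_deliver(struct toy_knote *kn)
{
	if (!kn->allocated || kn->disabled || !kn->active)
		return (false);
	if (kn->flags & TOY_ONESHOT)
		kn->allocated = false;	/* freed: must be added again later */
	else if (kn->flags & TOY_DOD)
		kn->disabled = true;	/* cheap: only a state bit changes */
	return (true);
}

/* What a handler would do when it has finished with a DOD event. */
static void
toy_reenable(struct toy_knote *kn)
{
	kn->disabled = false;
}
```

In this model a second toy_deliver() on a TOY_DOD note returns false
until toy_reenable() is called, so only one caller holds the event at a
time, and the note itself stays allocated throughout.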

>>>>> Even though this means that the event will only ever be active in a
>>>>> thread at a time, (when you're done handling the event, you reenable
>>>>> it), removing the event from the queue outside the event handler (say
>>>>> a timeout handler for the connection) poses to be a problem.  If you
>>>>> simply close the socket, the event disappears, but then there is a
>>>>> race between another event being created with the same socket, and
>>>>> notification of the handler that you want the event to stop.
>>>>>
>>>>> In order to handle that situation, I have come up w/ EV_FORCEOS, aka
>>>>> FORCE ONE_SHOT.  EV_ONESHOT events have the advantage that once queued,
>>>>> they don't care if they have been activated or not, they will be returned
>>>>> the next round.  This means that the timeout handler can safely set
>>>>> EV_FORCEOS on the handler, and either if it's _DISABLED (handler running
>>>>> and will reenable it), or it's _ENABLED, it will get dispatched, allowing
>>>>> the handler to detect the EV_FORCEOS flag and teardown the connection.
>>>>
>>>> I think it should be an EVFILT_USER event, allowing one to
>>>> EV_SET(&kev, fd, EVFILT_USER, 0, 0, 0, udata);
>>>> and the event should automatically set the EV_ONESHOT flag internally.
>>>
>>> I'll agree EV_FORCEOS is open for discussion, but you did see how much
>>> code it adds right?  I was surprised at how small the patch was for the
>>> additional functionality..
>>
>> Yes, EV_FORCEOS is a small patch. However, EVFILT_USER is more generic
>> (by the way, Solaris 10 event ports allow sending a user-specific
>> PORT_SOURCE_USER notification).
>
> I agree EVFILT_USER would be a useful thing, but it is still different
> from EV_FORCEOS...  Would you like to contribute some the to
> EVFILT_USER?  I'll look at integrating it...

Here are a patch and a test program. The patch is against 6.2-PRERELEASE.
On 7.0, EVFILT_LIO should be taken into account.

The test program should show a oneshot user event:
>./t
n: 1, id: 0x55, filt: -10, fl: 0x0010, ff:0, data:0x0, udata: 0x5678
n: 0, id: 0x0, filt: 0, fl: 0x0000, ff:0, data:0x0, udata: 0x0
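
The test program itself is not reproduced in this message. As a rough
illustration of why the output has one event and then none, here is a
userland toy model of the semantics in the attached patch. It is not the
test program and not kernel code, and all toy_* names are hypothetical.
It assumes only what the patch shows: attaching the filter sets
EV_ONESHOT (0x0010), the filter always reports the event as ready, and a
oneshot note is dropped after its first delivery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_EV_ONESHOT	0x0010	/* same numeric value as EV_ONESHOT */
#define TOY_EVFILT_USER	(-10)	/* value the patch gives EVFILT_USER */

struct toy_kevent {
	uintptr_t	ident;
	short		filter;
	unsigned short	flags;
	void		*udata;
};

/* A "queue" holding at most one registered note, to keep the model small. */
struct toy_queue {
	struct toy_kevent note;
	int		registered;
};

/* Mirrors filt_userattach(): the oneshot flag is set automatically. */
static void
toy_register(struct toy_queue *q, uintptr_t ident, void *udata)
{
	q->note.ident = ident;
	q->note.filter = TOY_EVFILT_USER;
	q->note.flags = TOY_EV_ONESHOT;
	q->note.udata = udata;
	q->registered = 1;
}

/* Mirrors filt_user(): the event is always considered ready. */
static int
toy_filt_user(void)
{
	return (1);
}

/* Returns the number of events copied to *out (0 or 1). */
static int
toy_scan(struct toy_queue *q, struct toy_kevent *out)
{
	if (!q->registered || !toy_filt_user())
		return (0);
	*out = q->note;
	if (out->flags & TOY_EV_ONESHOT)
		q->registered = 0;	/* oneshot: dropped after delivery */
	return (1);
}
```

Registering ident 0x55 with udata 0x5678 and calling toy_scan() twice
gives one event on the first call and none on the second, matching the
shape of the two output lines above.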

>> Two years ago I was implementing threads for my server nginx
>> on FreeBSD 4.x, using rfork(). In the absence of EVFILT_USER I made
>> the condition variables using kill() and EVFILT_SIGNAL, and this
>> user-level code could panic the kernel.
>
> Does it still?

It seems it was fixed in revisions 1.80, 1.79.2.1, and 1.2.2.11 of
src/sys/kern/kern_event.c, but there is a report that this is not so:
http://freebsd.rambler.ru/bsdmail/cvs-src_2005/msg04709.html
Currently nginx has threads disabled, so I cannot test it.


Igor Sysoev
http://sysoev.ru/en/
-------------- next part --------------
--- src/sys/sys/event.h	Fri Jul  1 20:28:32 2005
+++ src/sys/sys/event.h	Wed Sep 27 17:35:09 2006
@@ -38,8 +38,9 @@
 #define EVFILT_TIMER		(-7)	/* timers */
 #define EVFILT_NETDEV		(-8)	/* network devices */
 #define EVFILT_FS		(-9)	/* filesystem events */
+#define EVFILT_USER		(-10)	/* user events */
 
-#define EVFILT_SYSCOUNT		9
+#define EVFILT_SYSCOUNT		10
 
 #define EV_SET(kevp_, a, b, c, d, e, f) do {	\
 	struct kevent *kevp = (kevp_);		\
--- src/sys/kern/kern_event.c	Mon Sep  4 21:17:25 2006
+++ src/sys/kern/kern_event.c	Wed Sep 27 18:53:25 2006
@@ -132,6 +132,9 @@
 static int	filt_timerattach(struct knote *kn);
 static void	filt_timerdetach(struct knote *kn);
 static int	filt_timer(struct knote *kn, long hint);
+static int	filt_userattach(struct knote *kn);
+static void	filt_userdetach(struct knote *kn);
+static int	filt_user(struct knote *kn, long hint);
 
 static struct filterops file_filtops =
 	{ 1, filt_fileattach, NULL, NULL };
@@ -142,6 +145,8 @@
 	{ 0, filt_procattach, filt_procdetach, filt_proc };
 static struct filterops timer_filtops =
 	{ 0, filt_timerattach, filt_timerdetach, filt_timer };
+static struct filterops user_filtops =
+	{ 0, filt_userattach, filt_userdetach, filt_user };
 
 static uma_zone_t	knote_zone;
 static int 		kq_ncallouts = 0;
@@ -247,6 +252,7 @@
 	{ &timer_filtops },			/* EVFILT_TIMER */
 	{ &file_filtops },			/* EVFILT_NETDEV */
 	{ &fs_filtops },			/* EVFILT_FS */
+	{ &user_filtops },			/* EVFILT_USER */
 };
 
 /*
@@ -495,6 +501,27 @@
 {
 
 	return (kn->kn_data != 0);
+}
+
+static int
+filt_userattach(struct knote *kn)
+{
+	kn->kn_flags |= EV_ONESHOT;		/* automatically set */
+	kn->kn_status &= ~KN_DETACHED;		/* knlist_add usually sets it */
+	return (0);
+}
+
+static void
+filt_userdetach(struct knote *kn)
+{
+	kn->kn_status |= KN_DETACHED;	/* knlist_remove usually clears it */
+}
+
+static int
+filt_user(struct knote *kn, long hint)
+{
+
+	return (1);
 }
 
 /*

