vnode_pager_putpages errors and DOS?
Uwe Doering
gemini at geminix.org
Fri Nov 5 13:08:24 PST 2004
Igor Sysoev wrote:
> [...]
> I've tried your patch from the second email (it requires including
> <sys/conf.h> for devsw and D_DISK): the system also became unresponsive.
>
> The main problem is that I could not kill the offending process - it
> was stuck in the biowr state.

In the meantime I've investigated this further. The two patches I
provided so far certainly have their merits, since they deal with some
unwanted side effects. However, I found that the root cause of the
eventual system lock-up lies elsewhere.

In an earlier email I already pointed out that the function
vnode_pager_generic_putpages() doesn't actually care whether the write
operation failed or not. It always returns VM_PAGER_OK.

Now, if the write operation succeeds, the file system code takes care
that the formerly dirty pages associated with the I/O buffer get marked
clean. If the write attempt fails, on the other hand, for instance in an
out-of-disk-space situation, the pages are left dirty. At this point the
syncer enters an infinite loop, trying to flush the same dirty pages to
disk over and over again.
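
The loop can be illustrated with a minimal user-space sketch in C. This
is a model, not kernel code: struct page, write_pages(), count_dirty()
and sync_passes() are hypothetical names made up for the illustration,
and MAX_PASSES exists only so that the demonstration terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NPAGES     4
#define MAX_PASSES 1000	/* demo cap only; the real syncer has no such limit */

struct page { bool dirty; };	/* hypothetical stand-in for a VM page */

/*
 * Model of the write path: on success the file system marks the pages
 * clean; on failure (here: disk full) they are left dirty.
 */
static int
write_pages(struct page *pg, int n, bool disk_full)
{
	int i;

	assert(n > 0);
	if (disk_full)
		return (ENOSPC);
	for (i = 0; i < n; i++)
		pg[i].dirty = false;
	return (0);
}

/* Count the pages that still need to be written. */
static int
count_dirty(const struct page *pg, int n)
{
	int i, dirty = 0;

	for (i = 0; i < n; i++)
		if (pg[i].dirty)
			dirty++;
	return (dirty);
}

/*
 * Model of the syncer: keep flushing while dirty pages remain.  The
 * write error is dropped, as in the unpatched putpages function.
 * Returns the number of passes made.
 */
static int
sync_passes(bool disk_full)
{
	struct page pg[NPAGES];
	int i, passes = 0;

	for (i = 0; i < NPAGES; i++)
		pg[i].dirty = true;
	while (count_dirty(pg, NPAGES) > 0 && passes < MAX_PASSES) {
		(void)write_pages(pg, NPAGES, disk_full);	/* error ignored */
		passes++;
	}
	return (passes);
}
```

With a working disk, sync_passes(false) finishes after a single pass.
With disk_full set, no pass ever cleans a page, so the loop stops only
at the artificial cap.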

The fix is actually quite simple. In case of a write error we have to
make sure ourselves that the associated pages get marked clean. We do
this by returning VM_PAGER_BAD instead of VM_PAGER_OK. The caller treats
these two result codes alike, except that for VM_PAGER_BAD it
additionally marks the respective page clean. For the details, please
have a look at the caller function vm_pageout_flush() in 'vm_pageout.c'.
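
As a rough user-space sketch of that mechanism, assuming a much
simplified model: PAGER_OK and PAGER_BAD stand in for the kernel's
VM_PAGER_* codes, and struct page, putpages_status() and flush_done()
are hypothetical names. The real vm_pageout_flush() handles more result
codes and does more work per page.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for VM_PAGER_OK and VM_PAGER_BAD. */
enum pager_status { PAGER_OK, PAGER_BAD };

struct page { bool dirty; };	/* hypothetical stand-in for a VM page */

/* Status selection as in the fix: any write error becomes PAGER_BAD. */
static enum pager_status
putpages_status(int error)
{
	return (error ? PAGER_BAD : PAGER_OK);
}

/*
 * Model of the caller.  For PAGER_OK it leaves the page alone, because
 * the file system has already marked it clean.  For PAGER_BAD it marks
 * the page clean itself, giving up on the copy on disk.
 */
static void
flush_done(struct page *pg, int n, const enum pager_status *rtvals)
{
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++) {
		switch (rtvals[i]) {
		case PAGER_OK:
			break;
		case PAGER_BAD:
			pg[i].dirty = false;
			break;
		}
	}
}
```

In this model a page whose write failed is clean once flush_done() has
run, so a flusher that only looks at the dirty flag has no reason to
pick it up again.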

What this modification means is that in case of a write error the
affected pages remain intact in memory until they get recycled, but
their contents never reach the copy on disk. I believe this is
acceptable (and possibly even originally intended) because giving up on
syncing is about the best thing we can do in this situation, anyway.
And it is certainly a much better choice than halting the whole system
due to an infinite loop.

I've attached an updated version of the patch for 'vnode_pager.c'. On
my test system it resolved the issue. Please let us know whether it
works for you as well.

Uwe
--
Uwe Doering | EscapeBox - Managed On-Demand UNIX Servers
gemini at geminix.org | http://www.escapebox.net
-------------- next part --------------
--- src/sys/vm/vnode_pager.c.orig Tue Dec 31 10:34:51 2002
+++ src/sys/vm/vnode_pager.c Fri Nov 5 20:41:15 2004
@@ -954,7 +954,9 @@
 	struct uio auio;
 	struct iovec aiov;
 	int error;
+	int status;
 	int ioflags;
+	static int last_elog, last_rlog;
 
 	object = vp->v_object;
 	count = bytecount / PAGE_SIZE;
@@ -1035,15 +1037,18 @@
 	cnt.v_vnodeout++;
 	cnt.v_vnodepgsout += ncount;
 
-	if (error) {
+	if (error && last_elog != time_second) {
+		last_elog = time_second;
 		printf("vnode_pager_putpages: I/O error %d\n", error);
 	}
-	if (auio.uio_resid) {
+	if (auio.uio_resid && last_rlog != time_second) {
+		last_rlog = time_second;
 		printf("vnode_pager_putpages: residual I/O %d at %lu\n",
 		    auio.uio_resid, (u_long)m[0]->pindex);
 	}
+	status = error ? VM_PAGER_BAD : VM_PAGER_OK;
 	for (i = 0; i < ncount; i++) {
-		rtvals[i] = VM_PAGER_OK;
+		rtvals[i] = status;
 	}
 	return rtvals[0];
 }