ZFS registering ENOSPC as pool write errors

Steven Hartland killing at multiplay.co.uk
Tue Jun 18 11:16:32 UTC 2013


We've been testing the use of a swap-backed md device for temporary
storage via a zfs pool.
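
For reference, the setup was along these lines (the size shown is just
an example, not the production value):

    # create a swap-backed memory disk; mdconfig prints the new device name (md0)
    mdconfig -a -t swap -s 10g
    # create a single-vdev pool on it
    zpool create ramdisk md0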

This was working well until an application error caused the pool to
fill. That shouldn't have been an issue in itself, but it appears to
have caused write errors to be registered against the pool, apparently
for error 28 (ENOSPC).

What's unusual is that the only backing device, md0, shows 0 errors.

zpool status -x output:
  pool: ramdisk
 state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-HC
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        ramdisk     ONLINE       0 11.1K     0
          md0       ONLINE       0     0     0

errors: 1393 data errors, use '-v' for a list

/var/log/messages showed:-
Jun 12 10:46:48 node9 root: ZFS: zpool I/O failure, zpool=ramdisk error=28
Jun 12 10:46:49 node9 last message repeated 999 times
Jun 12 10:46:49 node9 root: ZFS: vdev I/O failure, zpool=ramdisk path= offset= size= error=

These are logged via devd triggering the following:-
notify 10 {
        match "system"          "ZFS";
        match "type"            "data";
        action "logger -p kern.warn 'ZFS: zpool I/O failure, zpool=$pool error=$zio_err'";
};
notify 10 {
        match "system"          "ZFS";
        match "type"            "io";
        action "logger -p kern.warn 'ZFS: vdev I/O failure, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size 
error=$zio_err'";
};

The first of these (type=data) corresponds to FM_EREPORT_ZFS_DATA and
the second (type=io) corresponds to FM_EREPORT_ZFS_IO, both of which
are reported via zio_done().
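
For anyone wanting to confirm where these come from, both calls are
easy to find in zio_done() in the source (assuming a stock source tree
under /usr/src; adjust the path for your setup):

    grep -En 'FM_EREPORT_ZFS_(DATA|IO)' \
        /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c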

I've tried to reproduce this by filling an identically configured pool,
deleting some data and filling it again, but have had no joy so far, so
it seems to be a fairly rare edge case.
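
In shell terms the attempt looked something like this, repeated a number
of times (the mount point and block size are illustrative):

    # fill the pool until writes fail with ENOSPC
    dd if=/dev/zero of=/ramdisk/fill bs=1m
    # delete some data, then fill it again
    rm /ramdisk/fill
    dd if=/dev/zero of=/ramdisk/fill bs=1m
    # check whether any errors have been registered against the pool
    zpool status -v ramdisk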

So the questions:-
1. I'm assuming not, but should ENOSPC ever get registered as a pool
   write error?
2. Has anyone seen similar behaviour before?

Unfortunately I was away when this happened, and as it was a production
machine the pool was destroyed before I could do any further diagnosis.

The machine in question is running FreeBSD 8.3, so it's a little old in
terms of ZFS code base, but I wanted to get general feedback from both
illumos and FreeBSD, given this looks like an edge case and is probably
not that common in practice.

    Regards
    Steve 

