possible memory corruption prior to r268075 [Was: r268075 (MFV r267565) undone FreeBSD-specific changes]

Justin T. Gibbs gibbs at FreeBSD.org
Wed Sep 17 16:33:37 UTC 2014


On Sep 17, 2014, at 3:28 AM, Andriy Gapon <avg at FreeBSD.org> wrote:

> 
> While looking at the changes discussed in the previous email, another
> issue caught my attention.
> Both l2arc_compress_buf() and zio_write_bp_init() allocate the buffer for
> compressed data using the original / logical buffer size.  The allocation
> is done with zio_buf_alloc() or zio_data_buf_alloc(), so the requested size
> is rounded up to one of the standard buffer sizes if needed.  The standard
> buffer sizes are: 512B - 4KB in steps of 512B, 4KB - 8KB in steps of 1KB,
> 8KB - 16KB in steps of 2KB, and 16KB - 128KB in steps of 4KB.
> That size should always be sufficient for the compressed data, because
> compressed data is never supposed to be larger than the original data, even
> after rounding up to a multiple of 512B.  In other words, psize <= lsize.
> 
> But in FreeBSD, prior to commit r268075, we used to round the compressed
> size up to a multiple of the vdev sector size.
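
For reference, the size-class rounding Andriy describes can be sketched as a
small standalone program.  This is only an illustration of the bucket
boundaries listed above, not the actual zio_init()/zio_buf_alloc() code:

	#include <assert.h>
	#include <stddef.h>
	#include <stdio.h>

	/*
	 * Round a requested size up to the next standard zio buffer size:
	 * 512B steps up to 4KB, 1KB steps up to 8KB, 2KB steps up to 16KB,
	 * and 4KB steps up to 128KB.
	 */
	static size_t
	zio_bucket_size(size_t size)
	{
		size_t step;

		assert(size > 0 && size <= 128 * 1024);

		if (size <= 4 * 1024)
			step = 512;
		else if (size <= 8 * 1024)
			step = 1024;
		else if (size <= 16 * 1024)
			step = 2 * 1024;
		else
			step = 4 * 1024;

		/* step is a power of two; round up with a mask. */
		return ((size + step - 1) & ~(step - 1));
	}

	int
	main(void)
	{
		printf("%zu\n", zio_bucket_size(3000));	/* 3072 */
		printf("%zu\n", zio_bucket_size(5000));	/* 5120 */
		return (0);
	}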

There is no data corruption due to the earlier code, which truncates d_len so that compression is only accepted when at least a 12.5% reduction fits in a minblocksize-sized and -aligned buffer:

	/* Compress at least 12.5% */
	d_len = P2ALIGN(s_len - (s_len >> 3), minblocksize);
	if (d_len == 0) {
		ZCOMPSTAT_BUMP(zcompstat_skipped_minblocksize);
		return (s_len);
	}

d_len is strictly less than s_len, and c_len is tested to be <= d_len.  Because P2ALIGN() makes d_len itself a multiple of minblocksize, the roundup of c_len that follows later can never exceed d_len, let alone the original buffer size (s_len).
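
A small standalone check of that invariant, using the sys/sysmacros.h
definitions of P2ALIGN() and P2ROUNDUP() (the 512B minblocksize and the loop
bounds are just illustrative):

	#include <assert.h>
	#include <stddef.h>
	#include <stdio.h>

	/* The illumos/FreeBSD sysmacros.h definitions (align: power of 2). */
	#define	P2ALIGN(x, align)	((x) & -(align))
	#define	P2ROUNDUP(x, align)	(-(-(x) & -(align)))

	int
	main(void)
	{
		const size_t minblocksize = 512;
		size_t s_len, c_len, d_len;

		for (s_len = minblocksize; s_len <= 128 * 1024;
		    s_len += minblocksize) {
			d_len = P2ALIGN(s_len - (s_len >> 3), minblocksize);
			if (d_len == 0)
				continue;	/* compression is skipped */
			/* d_len is minblocksize-aligned and below s_len... */
			assert(d_len < s_len);
			/* ...so no c_len <= d_len can round up past d_len. */
			for (c_len = 1; c_len <= d_len; c_len++)
				assert(P2ROUNDUP(c_len, minblocksize) <= d_len);
		}
		printf("invariant holds\n");
		return (0);
	}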

r268075 lost this optimization, which avoided attempting to compress anything that could never achieve even a 12.5% improvement given the minimum ashift.  This was done to support data embedded in block pointers.  However, I have no experience with how likely we are to compress 4K of data down to 112 bytes (roughly a 97% reduction).  Perhaps it’s quite likely for metadata, but not so likely for user data?  If so, we should restore the old check for user data blocks to avoid useless consumption of CPU; see the sketch below.
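
If we did restore it, one possible shape is sketched here.  This is
hypothetical: compress_target_len() and its is_metadata flag are made up for
illustration, not existing code.  Metadata keeps the post-r268075 "any gain
counts" target so that tiny results can still be embedded in the block
pointer, while user data gets the old 12.5% cutoff back:

	#include <stddef.h>
	#include <stdio.h>

	#define	P2ALIGN(x, align)	((x) & -(align))

	static size_t
	compress_target_len(size_t s_len, size_t minblocksize, int is_metadata)
	{
		if (is_metadata)
			return (s_len);	/* accept any result below s_len */
		/* User data: skip unless at least 12.5% can be saved. */
		return (P2ALIGN(s_len - (s_len >> 3), minblocksize));
	}

	int
	main(void)
	{
		/* 4K of user data on an ashift=12 vdev: target 0, skip. */
		printf("%zu\n", compress_target_len(4096, 4096, 0));
		/* The same block as metadata is still worth trying. */
		printf("%zu\n", compress_target_len(4096, 4096, 1));
		return (0);
	}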

> I think that in general our code had, and may still have, some confusion
> between psize (possibly compressed data size) and asize (actual disk
> allocation size).  I think that psize should always be no greater than
> lsize, while asize is always no less than psize:
>  psize <= lsize
>  psize <= asize
> Obviously, there is no rule on the relation between lsize and asize.
> But our code used to force ashift alignment on psize, which made it a kind
> of asize, or something in between the classic psize and asize.

Where did we force this?  Do you mean the 12.5% compression test, or the roundup/zero-fill during compression?
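
To make the second reference concrete: by roundup/zero-fill I mean the
padding at the tail of the old zio_compress_data().  A from-memory,
self-contained sketch (pad_compressed() is an illustrative helper, not the
verbatim kernel code):

	#include <stddef.h>
	#include <string.h>

	#define	P2ROUNDUP(x, align)	(-(-(x) & -(align)))

	/*
	 * Pad the compressed output up to a minblocksize multiple and zero
	 * the slack, so the recorded psize is already sector-aligned.  This
	 * is the step that blurs psize toward asize.
	 */
	static size_t
	pad_compressed(char *dst, size_t c_len, size_t minblocksize)
	{
		size_t rounded = P2ROUNDUP(c_len, minblocksize);

		if (rounded != c_len)
			memset(dst + c_len, 0, rounded - c_len);
		return (rounded);
	}

	int
	main(void)
	{
		char buf[1024] = { 0 };

		/* 700 bytes of output pads to 1024 with 512B sectors. */
		return (pad_compressed(buf, 700, 512) == 1024 ? 0 : 1);
	}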

—
Justin

