Current gptzfsboot limitations

Matt Reimer mattjreimer at gmail.com
Mon Nov 23 22:04:32 UTC 2009


On Mon, Nov 23, 2009 at 7:18 AM, John Baldwin <jhb at freebsd.org> wrote:
> On Friday 20 November 2009 7:46:54 pm Matt Reimer wrote:
>> I've been analyzing gptzfsboot to see what its limitations are. I
>> think it should now work fine for a healthy pool with any number of
>> disks, with any type of vdev, whether single disk, stripe, mirror,
>> raidz or raidz2.
>>
>> But there are currently several limitations (likely in loader.zfs
>> too), mostly due to the limited amount of memory available (< 640KB)
>> and the simple memory allocators used (a simple malloc() and
>> zfs_alloc_temp()).
...
>>
>> I think I've also hit a stack overflow a couple of times while debugging.
>>
>> I don't know enough about the gptzfsboot/loader.zfs environment to
>> know whether the heap size could be easily enlarged, or whether there
>> is room for a real malloc() with free(). loader(8) seems to use the
>> malloc() in libstand. Can anyone shed some light on the memory
>> limitations and possible solutions?
>>
>> I won't be able to spend much more time on this, but I wanted to pass
>> on what I've learned in case someone else has the time and boot fu to
>> take it the next step.
>
> One issue is that disk transfers need to happen in the lower 1MB due to BIOS
> limitations.  The loader uses a bounce buffer (in biosdisk.c in libi386) to
> make this work ok.  The loader uses memory > 1MB for malloc().  You could
> probably change zfsboot to do that as well if not already.  Just note that
> drvread() has to bounce buffer requests in that case.  The text + data + bss
> + stack is all in the lower 640k and there's not much you can do about that.
> The stack grows down from 640k, and the boot program text + data starts at
> 64k with the bss following.

Ah, the stack growing down from 640k explains a problem I was seeing
where a memcpy() into a temp buffer would restart gptzfsboot: it must
have been overwriting the stack.
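
A crude guard would have caught that (purely illustrative; there's no
such check in gptzfsboot, and the constants below are assumptions based
on the layout you describe):

#include <stddef.h>
#include <stdint.h>

#define STACK_TOP  0xA0000u    /* 640k; the stack grows down from here */
#define STACK_SLOP 0x1000u     /* assumed minimum headroom to leave */

/* Nonzero if writing len bytes at buf stays below both the 640k
 * ceiling and the live stack. */
static int
buf_is_safe(const void *buf, size_t len)
{
    uintptr_t sp = (uintptr_t)__builtin_frame_address(0);
    uintptr_t end = (uintptr_t)buf + len;

    return (end <= STACK_TOP && end + STACK_SLOP <= sp);
}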

> Hmm, drvread() might already be bounce buffering
> since boot2 has to do so since it copies the loader up to memory > 1MB as
> well.

Looks like it's already bounce buffering. All of drvread()'s I/O goes
to statically allocated char arrays, and the data is copied out when
necessary, e.g. in vdev_read():

                if (drvread(dsk, dmadat->rdbuf, lba, nb))
                        return -1;
                memcpy(p, dmadat->rdbuf, nb * DEV_BSIZE);
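
For context, the enclosing loop looks roughly like this (a sketch
reconstructed from the zfsboot.c sources as I remember them; the
chunking constant and some declarations may differ from the tree):

static int
vdev_read(vdev_t *vdev, void *priv, off_t off, void *buf, size_t bytes)
{
    char *p = buf;
    struct dsk *dsk = (struct dsk *)priv;
    daddr_t lba = off / DEV_BSIZE + dsk->start;
    unsigned int nb;

    while (bytes > 0) {
        nb = bytes / DEV_BSIZE;
        if (nb > READ_BUF_SIZE / DEV_BSIZE)
            nb = READ_BUF_SIZE / DEV_BSIZE;
        /* The BIOS read lands in the static low-memory bounce
         * buffer... */
        if (drvread(dsk, dmadat->rdbuf, lba, nb))
            return -1;
        /* ...and is then copied out to the caller's buffer, which
         * is free to live anywhere the allocator put it. */
        memcpy(p, dmadat->rdbuf, nb * DEV_BSIZE);
        p += nb * DEV_BSIZE;
        lba += nb;
        bytes -= nb * DEV_BSIZE;
    }
    return 0;
}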


> You might need to use memory > 2MB for zfsboot's malloc() so that the
> loader can be copied up to 1MB.  It looks like you could patch malloc() in
> zfsboot.c to use 4*1024*1024 as heap_next and maybe 64*1024*1024 as heap_end
> (this assumes all machines that boot ZFS have at least 64MB of RAM, which is
> probably safe).
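
Presumably something like this, then (a sketch; only the bounds come
from your numbers, and I'm assuming the allocator keeps its current
bump-pointer shape):

#include <stddef.h>

static char *heap_next = (char *)(4 * 1024 * 1024);  /* leave room below for the loader copy */
static char *heap_end  = (char *)(64 * 1024 * 1024); /* assumes >= 64MB of RAM */

static void *
malloc(size_t n)
{
    char *p = heap_next;

    if (p + n > heap_end)
        return NULL;    /* bump allocator: no free() */
    heap_next = p + n;
    return p;
}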

So are the page tables etc. already configured such that RAM above 1MB
is ready to use in gptzfsboot? (I'm not familiar with the details of
how virtual memory is handled on i386.)

Thanks for your help, John.

Matt

