Re: BHYVE SNAPSHOT image format proposal

From: Corvin Khne <corvink_at_FreeBSD.org>
Date: Tue, 06 Jun 2023 12:59:53 UTC
Hi Vitaliy,

thanks for your answers. See comments below.

On Tue, 2023-06-06 at 14:25 +0300, Vitaliy Gusev wrote:
> Hi Corvin, 
> 
> Thanks for your comments and advices. 
> 
> Answers are below,
> 
> > On 5 Jun 2023, at 18:32, Corvin Köhne <corvink@FreeBSD.org> wrote:
> > 
> > On Tue, 2023-05-23 at 19:05 +0300, Vitaliy Gusev wrote:
> > > 2. HEADER PHYS format: 
> > > 
> > >  0    +—————————————————————————————————————————+ 
> > >       |        IDENT STRING  - 64 BYTES         |
> > >  64   +—————————————————————————————————————————+   
> > >       | NVLIST SIZE  - 4 BYTES   |  NVLIST DATA |
> > >  72   +—————————————————————————————————————————+
> > >       |                                         |
> > >       |               NVLIST DATA               |
> > >       |                                         |
> > >  4096 +—————————————————————————————————————————+
> > > 
> > > > 
> > > > IDENT STRING - Each producer can set its own value to specify
> > > > image.
> > > > NVLIST SIZE  - The following packed header nvlist data size.
> > > > NVLIST DATA - Packed nvlist header data.
> > > > 
> > > > 4KB should be enough for the HEADER to keep basic information
> > > > about
> > > > Sections. However, it can
> > > > be enlarged lately, without breaking backward compatibility. 
> > > > 
> > 
> > I can't see an advantage of using a fixed sized header of 4KB. You
> > have
> > to parse the offset and size of every section anyways. If it's for
> > alignment requirements you can still align all sections on save and
> > set
> > the offset accordingly. So, why complicating things by using a
> > fixed
> > header size?
> 
> 
> You are right about 4KB restriction. I will correct it in updated
> format proposal. Idea is
> to reserve enough space for HEADER and write it after all finished
> stages at the beginning 
> of a snapshot file.
> 
> Implementation (snapshot path) should know estimated maximum size of
> the header and can
> use the possible maximum. Currently 4KB is enough and easily can be
> increased in the bhyve’s code without any problem. 
> 
> Alignment is useful to debug and looking into snapshot image file.
> 

Ah okay, I see the issue. Bhyve has to estimate the size. It's fine to
use a fixed header size in the code. However, our image format
shouldn't be defined that way.

> > 
> > The IDENT STRING seems to be very large. Even a GUID which should
> > be a
> > global unique identifier uses just 16 Bytes. Additionally, it might
> > be
> > worth using a dedicated ident and version field for an easier
> > version
> > parsing. E.g.:
> 
> 
> Intention is to add enough space for the future version (as
> reservation) and other producers
> and companies to specify it’s own ID string with possible add-on
> information. So adding  64 bytes
> for the future is not so huge pay, but can be very useful.

> During resume, if IDENT string is not the same as in bhyve, resume
> can fail before parsing
> other data, because it could be that internal format is not as
> expected.
> 
> I would not to fix IDENT string format and just apply rule:
> 
> > During resume, bhyve compares its own IDENT string and IDENT string
> > from an
> > Snapshot image. If it is not the same, further assumption about
> > format cannot be done,
> > and resume should fail.
> 

We may have different version of the format from the same produce.
IMHO, it makes sense to have a dedicated IDENT and VERSION field to
easily figure out

1) if the producer of the image is known
2) if we support that version of the producer

Even if you allocated a huge amount of free space, someone would need
more. So, what do you think about this format:

+---------------------------------------------------------------------+
| IDENT - 16 BYTES                                                    |
+-------------------+-----------------------+-------------------------+
| VERSION - 4 BYTES | NVLIST SIZE - 4 BYTES | NVLIST OFFSET - 8 BYTES |
+-------------------+-----------------------+-------------------------+
| POSSIBLE FREE SPACE (e.g. for custom data, alignment etc.)          |
+---------------------------------------------------------------------+
| NVLIST DATA                                                         |
+---------------------------------------------------------------------+
| POSSIBLE FREE SPACE (for whatever reason)                           |
+---------------------------------------------------------------------+
| SNAPSHOT DATA                                                       |
+---------------------------------------------------------------------+

> > 
> > +------------------+-------------------+
> > | IDENT - 56 BYTES | VERSION - 8 BYTES |
> > +------------------+-------------------+
> > 
> > IDENT - "BHYVE CHECKPOINT IMAGE"
> > VERSION - 1 (as uint64_t)
> > 
> > Btw: I don't care but here we could leave some free space for
> > possible
> > forward compatibility. E.g.:
> > 
> > +------------------+-------------------+-------------------------+
> > | IDENT - 16 BYTES | VERSION - 8 BYTES | _FREE_SPACE_ - 40 BYTES |
> > +------------------+-------------------+————————————+
> ...
> > > 4. EXAMPLE:
> > > 
> > > 
> > >  IDENT STRING:
> > > 
> > >        "BHYVE CHECKPOINT IMAGE VERSION 1"
> > > 
> > >  NVLIST HEADER: 
> > > 
> > >   [config]
> > >         config.offset = 0x1000 (4096)
> > >         config.size = 0x1f6 (502)
> > >         config.type = "text"
> > > 
> > 
> > Not sure if it's just an example for the "text" type. bhyve
> > converts it
> > into a nvlist, so it could be saved directly as nvlist.
> > Btw: I would only implement the "text" type if there's an usecase
> > that
> > can't be solved by one of the other types.
> 
> 
> 
> Intention is to use current engine to dump bhyve’s config and read
> config
> from a file (-k option).
> 
> Advantage of using “text” type - simple implementation and as an
> example
> of flexibility of proposed image format. Image file can keep any
> types that
> a producer would like to use: text, nvlist, binary, diff-pages, etc.
> 

nvlist is bhyve's internal format. So, implementing it should be even
more simple.
We don't need an example of how flexible the image format is in our
final implementation. Adding it as an example for this thread is fine.

> > 
> > All in all, it looks good. Keep on your work!
> > 
> > Regards checksum feature:
> > We should focus on enabling this feature by default before adding
> > advanced features. So, keep it simple and small.
> 
> 
> Could you give a more example what you meant about “checksum”
> feature? Did you mean as
> TAR’s checksum, i.e. only header?
> 

Sry for the confusion. I was referring to the discussion about
checksums (like checksums for section data) in this thread. Just ignore
my comment and focus on the rest.

> 
> > 
> > Regards forward compatibility:
> > Backward compatibility is way more important than forward
> > compatibility. Nevertheless, forward compatibility would be nice to
> > have. So, we should keep it in mind when modifying the layout. For
> > the
> > moment, just focus on a format which is backward compatible.
> > 
> 
> 
> It seems that having information about forward compatibility could be
> very
> useful, at least to get it in advance if it is impossible to restore.
> I will add it during
> implementing this format.
> 

Forward compatibility would be great but please just ignore it at the
moment to speed up the development. We have versioning. So, we can add
forward compatibility in a future version.


-- 
Kind regards,
Corvin