From: Corvin Köhne
To: Elena Mihailescu, freebsd-virtualization@freebsd.org
Cc: Mihai Carabas, Matthew Grooms
Date: Mon, 26 Jun 2023 09:16:08 +0200
Subject: Re: Warm and Live Migration Implementation for bhyve
Message-ID: <3d7ee1f6ff98fe9aede5a85702b906fc3014b6b6.camel@FreeBSD.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-virtualization

Hi Elena,

thanks for posting this proposal here.
Some open questions from my side:

1. How is the data sent to the target? Does the host send a complete
   dump that the target then parses? Or does the target request the
   data piece by piece, with the host sending each piece in response?
2. What happens if we add a new data section?
3. What happens if the bhyve version differs between the host and the
   target machine?

-- 
Kind regards,
Corvin

On Fri, 2023-06-23 at 13:00 +0300, Elena Mihailescu wrote:
> Hello,
>
> This mail presents the migration feature we have implemented for
> bhyve. Any feedback from the community is much appreciated.
>
> We have opened a stack of reviews on Phabricator
> (https://reviews.freebsd.org/D34717) that is meant to split the code
> into smaller parts so it can be more easily reviewed. A brief history
> of the implementation can be found at the bottom of this email.
>
> The migration mechanism we propose needs two main components in order
> to move a virtual machine from one host to another:
> 1. the guest's state (vCPUs, emulated and virtualized devices)
> 2. the guest's memory
>
> For the first part, we rely on the suspend/resume feature. We call
> the same functions as the ones used by suspend/resume, but instead of
> saving the data in files, we send it via the network.
>
> The most time-consuming aspect of migration is transmitting guest
> memory. The UPB team has implemented two options to accomplish this:
> 1. Warm Migration: The guest execution is suspended on the source
>    host while the memory is sent to the destination host. This method
>    is less complex but may cause extended downtime.
> 2. Live Migration: The guest continues to execute on the source host
>    while the memory is transmitted to the destination host. This
>    method is more complex but offers reduced downtime.
>
> The proposed live migration procedure (pre-copy live migration)
> migrates the memory in rounds:
> 1. In the initial round, we migrate all the guest memory (all pages
>    that are allocated)
> 2. In the subsequent rounds, we migrate only the pages that were
>    modified since the previous round started
> 3. In the final round, we suspend the guest, migrate the remaining
>    pages that were modified since the previous round, and migrate the
>    guest's internal state (vCPU, emulated and virtualized devices).
>
> To detect the pages that were modified between rounds, we propose an
> additional dirty bit (virtualization dirty bit) for each memory page.
> This bit would be set every time the page's dirty bit is set.
> However, this virtualization dirty bit is reset only when the page is
> migrated.
>
> The proposed implementation is split into two parts:
> 1. The first one, the warm migration, is just a wrapper on the
>    suspend/resume feature which, instead of saving the suspended
>    state on disk, sends it via the network to the destination
> 2. The second part, the live migration, uses the layer previously
>    presented, but sends the guest's memory in rounds, as described
>    above.
>
> The migration process works as follows:
> 1. we identify:
>  - VM_NAME - the name of the virtual machine which will be migrated
>  - SRC_IP - the IP address of the source host
>  - DST_IP - the IP address of the destination host
>  - DST_PORT - the port we want to use for migration (default is
>    24983)
> 2. we start a virtual machine on the destination host that will wait
>    for a migration. Here, we must specify SRC_IP (and the port we
>    want to open for migration, default is 24983).
>    e.g.: bhyve ... -R SRC_IP:24983 guest_vm_dst
> 3. using bhyvectl on the source host, we start the migration process.
>    e.g.: bhyvectl --migrate=DST_IP:24983 --vm=guest_vm
>
> A full tutorial on this can be found here:
> https://github.com/FreeBSD-UPB/freebsd-src/wiki/Virtual-Machine-Migration-using-bhyve
>
> For sending the migration request to a virtual machine, we use the
> same thread/socket that is used for suspend.
> For receiving a migration request, we used a similar approach to the
> resume process.
>
> As some of you may remember seeing similar emails from us on the
> freebsd-virtualization list, I'll present a brief history of this
> project:
> The first part of the project was the suspend/resume implementation
> which landed in bhyve in 2020, under the BHYVE_SNAPSHOT guard
> (https://reviews.freebsd.org/D19495).
> After that, we focused on two tracks:
> 1. adding various suspend/resume features (multiple device support -
>    https://reviews.freebsd.org/D26387, Capsicum support -
>    https://reviews.freebsd.org/D30471, a uniform file format -
>    https://reviews.freebsd.org/D29262; during the bhyve bi-weekly
>    calls, we concluded that JSON was the most suitable format at
>    that time) so we can remove the #ifdef BHYVE_SNAPSHOT guard.
> 2. implementing the migration feature for bhyve. Since this one
>    relies on the save/restore, but does not modify its behaviour, we
>    considered we could work on both tracks in parallel.
> We gave various presentations in the FreeBSD community on these
> topics: AsiaBSDCon2018, AsiaBSDCon2019, BSDCan2019, BSDCan2020,
> AsiaBSDCon2023.
>
> The first patches for warm and live migration were opened in 2021:
> https://reviews.freebsd.org/D28270,
> https://reviews.freebsd.org/D30954. However, the general feedback on
> these was that the patches were too big to be reviewed, so we should
> split them into smaller chunks (this was also true for some of the
> suspend/resume improvements). Thus, we split them into smaller parts.
> Also, as things changed in bhyve (e.g., Capsicum support for
> suspend/resume was added this year), we rebased and updated our
> reviews.
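
The pre-copy rounds and the virtualization dirty bit described in the
quoted proposal can be illustrated with a small simulation. This is
only a sketch in Python, not bhyve code: the function name,
guest_writes, max_rounds and stop_threshold are assumptions made up
for the example, and the proposal does not say how the number of
rounds is bounded. Guest writes are modelled as happening after each
round's pages have been sent, which is a simplification.

```python
def precopy_migrate(src, guest_writes, max_rounds=4, stop_threshold=1):
    """Simulate pre-copy migration of a page table.

    src            -- dict page_number -> contents (source guest memory)
    guest_writes   -- hypothetical callback: round number -> list of
                      (page, value) writes the running guest performs
    max_rounds     -- assumed cap on the number of live rounds
    stop_threshold -- assumed dirty-page count at which we stop iterating
    Returns (destination memory, number of pages sent in each round).
    """
    dst = {}
    # Round 1 sends every allocated page, so all of them start out dirty.
    dirty = set(src)
    sent = []
    for round_no in range(max_rounds):
        # Take the pages to migrate and reset their (virtualization)
        # dirty bits, mirroring "reset only when the page is migrated".
        batch, dirty = dirty, set()
        for page in batch:
            dst[page] = src[page]
        sent.append(len(batch))
        # The guest keeps running: each write sets the dirty bit again.
        for page, value in guest_writes(round_no):
            src[page] = value
            dirty.add(page)
        if len(dirty) <= stop_threshold:
            break
    # Final round: the guest is suspended, so no further writes occur.
    # (The vCPU and device state would be sent here as well.)
    for page in dirty:
        dst[page] = src[page]
    sent.append(len(dirty))
    return dst, sent
```

Calling it with a guest that dirties fewer pages each round shows the
amount of data per round shrinking until the final, suspended round
transfers the remainder; a guest that dirties pages faster than they
can be sent would only stop because of the assumed max_rounds cap.
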
>
> Thank you,
> Elena
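
To make the three questions at the top of this mail more concrete,
here is a minimal sketch of one possible answer: a stream with a
version header followed by named, length-prefixed sections. It is
purely hypothetical and is not the format used by the patches under
review; the magic value, field sizes, section names and both function
names are invented for the illustration.

```python
import struct

MAGIC = b"BHMG"                    # hypothetical stream identifier
VERSION = 1                        # hypothetical format version
HEADER = struct.Struct("!4sH")     # magic, version (network byte order)
SECTION = struct.Struct("!16sI")   # NUL-padded section name, payload length


def encode_stream(sections, version=VERSION):
    """Serialize (name, payload_bytes) pairs into one byte string."""
    out = [HEADER.pack(MAGIC, version)]
    for name, payload in sections:
        out.append(SECTION.pack(name.encode("ascii"), len(payload)))
        out.append(payload)
    return b"".join(out)


def decode_stream(data, known, version=VERSION):
    """Parse a stream; return (restored sections, names that were skipped).

    A version mismatch is rejected outright. Sections whose name is not
    in `known` are skipped using their length prefix.
    """
    magic, peer_version = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("not a migration stream")
    if peer_version != version:
        raise ValueError(f"version mismatch: got {peer_version}, want {version}")
    offset = HEADER.size
    restored, skipped = {}, []
    while offset < len(data):
        raw_name, length = SECTION.unpack_from(data, offset)
        offset += SECTION.size
        name = raw_name.rstrip(b"\0").decode("ascii")
        payload = data[offset:offset + length]
        offset += length
        if name in known:
            restored[name] = payload
        else:
            skipped.append(name)
    return restored, skipped
```

With such a layout the sender pushes one complete dump and the target
parses it section by section. Whether an unknown section may be
skipped or must abort the migration is a policy decision; silently
dropping device state is probably unsafe, so a real implementation
might prefer to refuse the migration.
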