Date: Mon, 19 Dec 2022 17:36:05 +0000 (UTC)
From: "Bjoern A. Zeeb"
To: Rick Macklem
cc: Konstantin Belousov, James Gritton, freebsd-current@freebsd.org
Subject: Re: RFC: nfsd in a vnet jail
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Mon, 19 Dec 2022, Rick Macklem wrote:

> Hi,
>
> Kostik expressed some concern w.r.t. using a non-default VNET_NFSD kernel
> build option and I understand his concern, given that many prefer to use
> a GENERIC kernel and binary updates.

Yes, I may have hinted towards that (at least in my mind) while looking at
the review.

There is a reason that (at least for now) I do like having it.
Personally (mostly due to lack of time) I haven't figured out whether I
would want this to be a vnfs one day or part of vnet. The former (like the
kernel option) could more easily address possible security concerns for
people (especially while this is a moving target with more possibly coming
later). Hence my having asked for a dedicated macro rather than using the
VNET macros directly in place, as this gives us a lot of flexibility to
easily move this around in the future if we wanted to.

Removing the option will make the code simpler.

> Right now there are 29 NFS variables that are VNET_DEFINE()'d and several
> of them are arrays currently sized at 500. One of the reasons for the
> non-default VNET_NFSD kernel option was the bloat this caused to the vnet.
> (Chris expressed concern that adding mountd/nfsd to the vnet would result
> in bloat/overhead in a previous post to this thread.)
>
> Another issue with putting all these variables in the vnet is that the
> nfsd.ko cannot be loaded (it complains the vnet is out of space) and, as
> such, options NFSD must be used with VNET_NFSD so that the nfsd is linked
> into the kernel.
>
> As such, I am wondering what others think of this alternate plan?
> - Pull all the VNET_DEFINE()'d variables (except the 3 manipulated by
>   sysctls) into a structure.
> - Define a single VNET_DEFINE()'d variable that is a pointer to that
>   structure and then malloc() the structure in the function called by
>   VNET_SYSINIT().
> This would result in a malloc'd structure for each vnet jail (for kernels
> built with VIMAGE), but would only add 4 variables to the vnet.
>
> If a small C file that only consists of the VNET_DEFINE()s for the 4
> variables is linked into the kernel whenever the VIMAGE option is
> specified, then I think nfsd.ko would be loadable.

It is not the number of variables added that is the problem; it is (as you
point out above) their size. So 500 uint8_t variables are as expensive
as one uint8_t[500].

There's only a "small" amount of space allocated per-vnet to replicate state
for modules. Multicast also had that problem, needing huge chunks, and we
eventually switched to malloc to fix this and make the module loadable
again as well. (See 1a117215c7f90e6ef8c50ef3bfe099490aaa98f9 for one of the
changes. I think I screwed something up there about sizing, so there are
probably follow-ups, but it'll show the concept.)

So whether you make this one huge malloc or multiple small ones for the
few big variables is up to you in the end.

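To make the space argument concrete, here is a minimal userland sketch, not
kernel code. Every sim_* / SIM_* name is a hypothetical stand-in invented for
illustration; the real code would use VNET_DEFINE(), VNET_SYSINIT() and the
kernel malloc(9) instead. It only shows that a large array costs its full
size when it lives in the per-vnet area, while a pointer to a malloc'd
structure costs one pointer, and that an access macro in the spirit of
NFSD_VNET() hides the difference from callers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Userland illustration only. All names here are hypothetical stand-ins,
 * not the kernel's VNET_DEFINE()/VNET_SYSINIT() machinery.
 */
#define SIM_TABLE_SIZE	500	/* array size mentioned in the thread */

/* Variant A: the big array sits directly in the per-vnet data area. */
struct sim_vnet_inline {
	uint8_t	table[SIM_TABLE_SIZE];
};

/* Variant B: the big items are collected in one structure ... */
struct sim_nfsd_globals {
	uint8_t	table[SIM_TABLE_SIZE];
	int	counter;
};

/* ... and the per-vnet data area only holds a pointer to it. */
struct sim_vnet_ptr {
	struct sim_nfsd_globals	*nfsd;
};

/* Stand-in for an access macro; callers never see the indirection. */
#define SIM_NFSD_VNET(vp, n)	((vp)->nfsd->n)

/* Stand-in for the function that would run when a vnet is created. */
static int
sim_nfsd_vnet_init(struct sim_vnet_ptr *vp)
{
	assert(vp != NULL);
	vp->nfsd = calloc(1, sizeof(*vp->nfsd));	/* zero-filled state */
	return (vp->nfsd == NULL ? -1 : 0);
}

/* Matching teardown for when the vnet goes away. */
static void
sim_nfsd_vnet_uninit(struct sim_vnet_ptr *vp)
{
	assert(vp != NULL);
	free(vp->nfsd);
	vp->nfsd = NULL;
}
```

Each simulated vnet calls sim_nfsd_vnet_init() once and then gets its own
private copy of the state through SIM_NFSD_VNET(vp, counter) and friends.
Allocating only the large members and leaving the small ones inline would
work the same way.
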
I also wonder which is easier to deal with for "vnet0/prison0/'nfs0-bits'",
as NFS_ROOT and other parts will eventually need some of these things to be
in place early on for the base.

Also, sysctl and malloc and virtualisation can be a bit tricky.
The "common" theme would be to malloc only the complex data types but leave
the simple-type variables as normal virtualised ones.

> Unfortunately, this does not deal with vnet'ng the kgssapi, rpcsec_gss for
> Kerberized mounts or vnet'ng NFS-over-TLS, but those could be handled in a
> similar manner, I think?

Could be, yes.

> So, what do others think of this alternate plan?
>
> rick
> ps: Every use of the vnet'd variables is currently wrapped in a macro called
>     NFSD_VNET(), so the change is pretty easy to do by just re-writing this
>     macro.
>

-- 
Bjoern A. Zeeb                                                     r15:7