From: Jeff Roberson
Date: Fri, 31 Mar 2023 14:33:16 -0700 (PDT)
To: freebsd-hackers@freebsd.org
Subject: Re: ULE process to resolution
Message-ID: <11380305-6261-6c08-fd15-299e695fa342@jroberson.net>
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

I found an old patch of mine that addresses some of the issues with rapid
sleeping/waking batch processes here:

https://reviews.freebsd.org/D15985

Seems there are some bits relevant to behavior described earlier on
hackers@. I was not subscribed to this list so I can't reply to the
specific message.

Jeff

On Fri, 31 Mar 2023, Jeff Roberson wrote:

> Hi Folks,
>
> For those who don't know, I am the original author of ULE. I have not had
> much time for FreeBSD in recent years but this thread was forwarded to me
> and I am disheartened at the state of things. I will give my perspective
> and propose a path to resolve this systematically.
>
> The fundamental benefit of ULE is also the fundamental challenge. That is:
> N CPU-local decisions need to add up to a reasonable approximation of a
> correct global decision. This is necessary to scale to large core counts,
> large thread counts, and preserve some affinity. You could permute 4BSD
> further towards these goals but I posit that you would simply have to work
> through the same bugs.
>
> As I read these threads I can state with a high degree of confidence that
> many of these tests worked with superior results with ULE at one time. It
> may be that tradeoffs have changed or exposed weaknesses; it may also be
> that it's simply been broken over time.
> I see a large number of commits intended to address point issues and
> wonder whether we adequately explored the consequences. Indeed I see
> solutions involving tunables proposed here that will definitively break
> other cases.
>
> I know that CPU tradeoffs have changed. ULE was written so that the
> topology can be annotated and the cost of migration specified. It is
> adaptable to this but someone has to put in the effort. The cost function
> was written in ticks, which does not scale down properly; accurate CPU
> tick counters could now be used for more precise time-keeping and more
> specific affinity. Over time people have also added additional searches to
> pickcpu which don't scale well to very high core count systems. NUMA and
> heterogeneous CPUs are also possible in the graph framework but need
> further investment.
>
> The other thing that has changed over time is the ability of the
> interactivity score to correctly detect truly interactive applications.
> When I wrote it you could do a buildworld on a single core or small
> multi-core system and play mp3s and browse the web without a hiccup.
> However, web browsers have evolved to be significantly more resource
> intensive. I'm not sure a heuristic can or should catch this case. We're
> probably long overdue to add X window focus hints as most other operating
> systems do. I don't think tossing the interactivity score is really going
> to produce the desired results. Linux CFS disagrees with me but I have
> always been able to achieve superior responsiveness with ULE. My intuition
> is that with an X window focus hint we could dial back the interactive
> threshold and have better tradeoffs with the soft real-time score.
>
> schedgraph is also no longer adequate for modern systems.
> In my professional life I have taken the same types of data sources and
> built text-based processes on top because graphical representations just
> can't scale to the number of events and cores for full system scheduling.
> For complex scheduling issues you need detailed introspection. You're not
> going to tweak variables and run buildworlds to arrive at success by
> supposition with any kind of reasonable velocity.
>
> The first step to resolving this is to come up with a list of regression
> tests and catalog how they behave compared to 4BSD. When I wrote the
> scheduler I also wrote a simple fixed duty cycle program that could be run
> with different scheduling parameters and report on its CPU usage and
> latency. Combining many copies of this program you can simulate various
> kinds of interactions. It is available at
> people.freebsd.org/~jeff/late.tgz. I know there is also a Linux scheduler
> benchmark that may be worth porting.
>
> If someone would start making regression tests I am happy to fix bugs or
> review bug fixes. Personally I would start from fairness given different
> nice values on a single CPU, and then multi-CPU. Evaluate allocation with
> variation on load to core count ratios. It should not take more than a few
> hours to iterate through the interesting cases here before going on to
> more complex questions about buildworld or firefox etc. This would need to
> be something we carried forward in the source tree and asked people to
> re-run as part of scheduler CRs or we're just going to find ourselves back
> in this spot again.
>
> I also have a backlog of improvements for large multi-core systems from
> work I did years ago that have not made it into the tree. And I have an
> old review for patches to improve the reliability of priority in causing
> scheduling events that may be germane. If we can collaborate on a testing
> framework I could trickle these in.
>
> Thanks,
> Jeff
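
For anyone who wants a feel for the fixed duty cycle test described in the quoted message, here is a minimal sketch of the idea: spin for a fixed time, sleep for a fixed time, and record how much CPU the process actually received and how late each wake-up was. This is not the late program from late.tgz, whose source and options are not shown in this thread. The function name duty_cycle, its parameters, and the reported fields are all hypothetical, and Python is used only to keep the example short.

```python
import time


def duty_cycle(run_ms, sleep_ms, cycles):
    """Alternate busy work and sleep; report CPU share and wake-up latency.

    Hypothetical helper for illustration only; not the interface of late.
    """
    run_s = run_ms / 1000.0
    sleep_s = sleep_ms / 1000.0
    latencies = []
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for _ in range(cycles):
        # Busy phase: spin until run_s of wall-clock time has elapsed.
        deadline = time.perf_counter() + run_s
        while time.perf_counter() < deadline:
            pass
        # Sleep phase: anything beyond the requested sleep is time spent
        # waiting to be woken up and put back on a CPU.
        before = time.perf_counter()
        time.sleep(sleep_s)
        latencies.append(max(0.0, time.perf_counter() - before - sleep_s))
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return {
        "cpu_share": cpu / wall,
        "requested_share": run_s / (run_s + sleep_s),
        "mean_latency_ms": 1000.0 * sum(latencies) / len(latencies),
        "max_latency_ms": 1000.0 * max(latencies),
    }


if __name__ == "__main__":
    # A 20% duty cycle: 20 ms of work, then 80 ms of sleep, ten times.
    print(duty_cycle(run_ms=20, sleep_ms=80, cycles=10))
```

On an idle machine cpu_share should sit close to requested_share. Running several copies at once under different nice(1) values, with more copies than cores, and comparing the two numbers per copy is one rough way to probe the fairness cases mentioned above.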