From: paige@paige.bio
Subject: Re: Provisions to the contribution guidelines for using LLM generated code
In-Reply-To: <7F5CCEEE-A8A9-459A-A2C1-9ADC31BC91C6@FreeBSD.org>
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
Date: Fri, 31 Jan 2025 12:43:55 -0800
To: David Chisnall
Cc: Sulev-Madis Silber, freebsd-hackers@freebsd.org
Message-Id: <1E478400-5DFD-4C45-B466-F29EFD76A29E@paige.bio>

> In the second case, I have deliberately used a plagiarism machine

I get what you're saying, that it makes easy work of an otherwise difficult task, but I don’t think that is inherently what makes it a plagiarism machine. I think people who have lives and kids to raise generally like to contribute anything that adds quality to their own lives and, given the circumstances, will want to take the path of least resistance. It’s entirely possible for somebody with good intentions to use something like an LLM, and for things like Microsoft’s obscure hash table patent to be completely lost on the people responsible for deciding whether or not something gets merged.
There are of course people who will blatantly break the rules with the intent to deceive and put things in places where they don’t belong, but that is a different problem from the one I have in mind. My point is that even though the two are separate problems, they are not always handled separately like they should be, and that’s unfortunate for people who have good intentions and for the overall reputation of LLMs.

> Microsoft has issued an explicit patent grant of the exFAT patents *for Linux*. The ‘Open’ Innovation Network

Sorry to mix threads here, but you’re right, and this is also what I mean: a lot of people might see that something has a GPL implementation and won't immediately arrive at the conclusion that it exists only because the implementers have permission to implement that idea and make it GPL. The only reason I know any better is that I’ve watched Paragon Software try for more than 20 years to make NTFS-3G a thing for Linux users. If I’m being honest with you, Microsoft doesn’t just have an idea; they have a monopoly on how you can exchange data between computers, one that effectively makes it impossible (still to this day) to use anything they’re not vetting.

> If a committer deliberately violates copyright, the code will be removed and the committer will, most likely, lose commit access.

Honestly, I know it doesn’t do a whole lot of good to speculate about what could become of LLMs at the moment, but I feel like if they keep improving, pretty soon somebody will be able to generate their own driver for virtually anything they want, and they won’t need to share it because anybody else will be able to do the same.
For a few hours of work I already have:

- a KEXT for exFAT (compiles)
- fsck_exfat (compiles)
- newfs_exfat (compiles)
- mount_exfat (compiles)

Granted, none of them produces a correct filesystem (though it tries to) or handles a filesystem created by any other means (it makes a concerted effort at this too), but it’s really close. I think we might actually see something powerful enough to create a solution like this from a prompt in the next couple of years, and realistically contributions won’t mean much, because people will be able to make whatever they want or need for themselves and they won’t have to distribute it.

I guess what I’m wondering is: how will FreeBSD stay relevant when this becomes a reality? I understand that reality is much different from this as it stands, but I also gather the intention is to improve LLMs to bring this reality to fruition. I think there’s an opportunity to embrace the technology that is coming, but there should be rules and a vision behind it. I think it’s coming faster than a lot of people can keep up with, and there might not be any time as good as the present to start thinking about it.

> Next year, I believe, all patents on the original version of exFAT will have expired

I mean... they could renew their patents, but one has to wonder to what end at this point? As far as I know, the only benefit of patenting something is that nobody else can reproduce your idea and reap the benefits of redistributing it. Really makes you think…

-Paige

> On Jan 31, 2025, at 3:23 AM, David Chisnall wrote:
>
> On 30 Jan 2025, at 12:03, Sulev-Madis Silber wrote:
>>
>> what happens if you take the word llm out and put a human in there?
>>
>> there are ton of fbsd contributors and i often wonder if some of them bring something in.
>> apparently it's no "code-id" where we can put code for checks. esp i worry about all those linuxkpi things. where's the voluntary no consequences drug test that proves you didn't smoke any gpl before you opened code editor
>>
>> it's like llm is right out but humans are all ok?
>
> No, as I said, the following two are equivalent:
>
> - I copy some GPL’d code (or code with a license that requires an attribution) and contribute it in such a way that violates the license.
> - I use an LLM to copy some GPL’d code (or code with a license that requires an attribution) and contribute it in such a way that violates the license.
>
> The difference is that, in the first case, I *know* that I am doing so. In the second case, I have deliberately used a plagiarism machine but don’t know whether this specific output is copyright violation or not.
>
> If a committer deliberately violates copyright, the code will be removed and the committer will, most likely, lose commit access. Committers are responsible for the code that they commit, but if they are using a plagiarism machine then the chances of them committing accidental copyright infringement are much higher and that’s a risk to the project.
>
> David