Re: Provisions to the contribution guidelines for using LLM generated code
- Reply: Sulev-Madis Silber: "Re: Provisions to the contribution guidelines for using LLM generated code"
- Reply: paige_a_paige.bio: "Re: Provisions to the contribution guidelines for using LLM generated code"
- In reply to: paige_a_paige.bio: "Provisions to the contribution guidelines for using LLM generated code"
Date: Thu, 30 Jan 2025 08:47:32 UTC
I am not a lawyer. If you want legal advice, you should talk to a lawyer. As a not-a-lawyer, my opinion is:

Copyright law, in general, does not in any way describe how copying occurs. If you photocopy a book, or if someone reads it to you and you write it down, that’s equally copyright infringement or fair use based on the result: the mechanism does not matter.

If you take a load of existing exFAT implementations, apply a lossy compression algorithm to them (neural network training) and then decompress, is the output a derived work of the input? That will depend on a load of tests that a court can apply to judge similarity and so on.

In general, a good legal rule of thumb is that judges are not idiots (ignoring the Texas West District). If you use an obfuscated process to hide your illegal action then they will regard it as both illegal and wilful (and be annoyed with you), which is not a good place to be.

Is your exFAT implementation a new creative work or a derived work of something else? Does it infringe Microsoft’s exFAT patents? I don’t know and going to court is probably the only way of getting a definitive answer. Please don’t expose the FreeBSD project to that legal risk, defending it would cost more than the annual budget of the Foundation.

David

> On 30 Jan 2025, at 02:05, paige@paige.bio wrote:
>
> Hi there,
>
> As y’all have probably heard AI is the new big thing in town and people are at a bit of a loss for what it means. Despite the news about the stock market sell off that came in the wake of the new DeepSeek thing, I’ve actually been playing around with this thing called Claude for the past couple of weeks and I’m still not really sure what to think of it. I think it’s really cool to say the least, but I still have a lot of questions myself.
>
> More specifically, I’m not really sure at what point using something like Claude to create something like a native ExFAT filesystem becomes an issue of attribution:
>
> https://github.com/paigeadelethompson/exfat/tree/main/sys/fs/exfat
>
> It presumably created this based on the parameters in its model (presumably; it is not actually known how Anthropic’s models work because, as far as I know, that information is proprietary). I vaguely understand how it is able to do this and, to the best of my knowledge, it doesn’t plagiarize code, but it does generate code based on facts that it can find in its own model about ideas which are potentially subject to patent restrictions. For what this is worth, I think that people are going to find this to be incredibly valuable regardless of whether or not it produces an exact desired result. What it doesn’t get right the first time is often still really damn close.
>
> I’m so dumbfounded by how much it can actually do that I haven’t even tried to compile the code for the filesystem it created; it didn’t take me more than an hour of saying “yes” following the initial “I'd like to make an ExFAT driver for FreeBSD in C can you give me the best starting point possible?” To be honest I kinda had to fact check it a couple of times; it wanted to do things like implementing extattrs, which this filesystem patently doesn’t have. But as soon as I asked it, it seemed to know exactly what I meant:
>
> “No, you're right - I apologize for adding unnecessary complexity. The ExFAT specification doesn't include support for extended attributes like other filesystems (e.g., UFS or ext4). The only attributes ExFAT supports are the basic DOS/FAT attributes we already have defined”
>
> And then it proceeded to make changes to remove the stubs and so forth (which it may not have done right, but I haven’t gotten that far yet).
> In fact, I don’t really feel like I can realistically move forward with this (because I’ll have to fork over $20 to get more time out of it) but also I just don’t really know whether or not this is okay. Obviously I want to say yes, but I get the impression that some people might not be okay with this, especially if what it creates is not well understood or violates copyright laws.
>
> “Under U.S. law, you cannot patent an idea, but you may be able to protect your idea by bringing it to life.” As far as I know the licensing for ExFAT is a little bit of a gray area. It’s Microsoft’s patent, and there’s a GPL implementation that exists, but aside from that I don’t know if it’s technically okay to make another implementation that is licensed any other way. I assume so, but it’s not unimaginable that even simply ingesting an ExFAT filesystem could come with some kind of stipulation.
>
> And I’m sure some people might even think “why would you, there’s a FUSE implementation for this already” and, you know, because FUSE is FUSE and this is an implementation of ExFAT that uses VFS. Also ExFAT/fuse does have problems but it works (sorta) in a pinch. I’d personally be more interested in improving something that is part of core FreeBSD than I would anything having to do with a port that I have to install in addition to the OS itself in order to use it.
>
> The reason why it matters: I just really like ExFAT. Virtually everything now has native support for it out of the box except for UEFI (they should; I'm surprised Microsoft hasn’t pushed the standard to adopt it given that .WIM files can certainly exceed 4.3GB on modern versions of Windows). It just makes good sense to me to use it, even though it’s not a journaled filesystem. Using parchive is not lost on me, but I’ve seldom ever truly needed it even with ExFAT.
>
> Maybe I’m not even really trying to drive this to completion as much as I just needed an example and just want to understand: are people already doing this?
> Is it possible that people have already done this and nobody is really aware of it? I’d like to think if you can then you certainly should, but where do you draw the line, and should there perhaps be conventions for keeping track of code in FreeBSD that is produced by LLMs? Maybe there already are and I just haven’t found them yet, but it wouldn’t come as any surprise if there weren’t, given this is all still kind of novel. Either way I’m sure there are things much more substantial than ExFAT worth trying, but there should probably be something of an understanding about what is and isn’t okay. I wonder if what we don’t know about proprietary LLMs like Claude could potentially be an easily overlooked problem that could have legal consequences later.
>
> In any case I’m sure people will figure it out, but if anybody was looking for a cue to discuss this, I mean... it’d be really useful to me if FreeBSD supported ExFAT out of the box (especially since I can’t get to my offline archive of the ports and its distfiles without it). The only available implementations at present are GPL, so can we just like… generate an implementation with Claude and license it BSD? I honestly wish that my friend hadn’t insisted on showing me this, kinda because I hoped to avoid something that I know is certainly going to have repercussions for the way things are currently done, but I can’t unsee this and I feel like I’ve been “doing it wrong” my whole life.
>
> -Paige