|
You are correct, the laws surrounding unjust or unlawful enrichment are tricky. The NYT will have to prove in court that the AI is not randomly piecing articles together or merely following a rule (like the standard "Who, What, When, Where, Why and How" of news article structure), but is instead using the stylistic pattern it learned from being trained on NYT articles. That pattern, when applied to "new" news articles, will allow the AI to impersonate the successful NYT style and unfairly compete with the NYT.
You are correct that there is nothing stopping you from studying the NYT article style and copying that style. But to compete with the NYT you would also need to raise money to start your own newspaper. You as a person will not be able to compete with a complete news organization. You would need to hire people and in the end, your organization would be similar but not identical to the NYT. However, an AI with proper hardware can replicate the work of hundreds of people. It can be identical because it is not creative. It is not sentient, it is not conscious. It is an algorithm.
The NYT is claiming that the news articles were not used for their intended purpose, which is to inform the public of events. Instead, they were used to train a machine to replicate the style that makes the NYT unique, and the result will be a machine that can unfairly compete with the NYT.
For that valuable training, the NYT wants to be compensated or the material removed from the training dataset.
It remains to be seen how this will play out in court.
|
|
|
|
|
Gary Stachelski 2021 wrote: the laws surrounding unjust or unlawful enrichment are tricky.
A follow-up video (CNN?) suggested that the NYT provided an 'example': a post where a real person could not find anything, so they used an AI, which responded with the first three paragraphs of an existing article.
Now one might say that is problematic. But any standard paywall is likely going to do something similar. The only alternative with a paywall is either to show only the headline or to provide a synopsis for every article.
The user/reader, if they wanted to see the entire article, would still need to access NYT.
So at least with that example I am not convinced where the problem lies.
Gary Stachelski 2021 wrote: nothing stopping you from studying the NYT article style
Nothing I have seen suggests that has anything to do with it. The problem is content in everything that I have seen.
|
|
|
|
|
Here is an article that just came out that sheds more light on NYT suit.
One thing that I did not consider is that AI responses often hallucinate (fabricate) results. In some of the NYT examples, a GPT model completely fabricated an article that it claimed the NYT published on January 10, 2020, titled "Study Finds Possible Link between Orange Juice and Non-Hodgkin's Lymphoma". The NYT never published such an article. Other examples show a mix of fact and fabricated info. I never thought about that aspect of AI responses.
NY Times sues Open AI, Microsoft over copyright infringement | Ars Technica[^]
|
|
|
|
|
But I doubt that is actionable. Not in this suit.
Their current claim is about how it is using the data it collected. Obviously this demonstrates something it didn't collect.
Not to mention they would also need to prove that what they publish is a standard in truth telling and thus this would hurt them.
But the following, as an example, suggests otherwise.
What the New York Times UFO Report Actually Reveals[^]
|
|
|
|
|
|
Thanks for the link!
I don't think there is a problem with re-reading an article three times.
But an AI in learning mode can read thousands, hundreds of thousands, or more text paragraphs or articles and re-read them for each Optimization Loop.
And there can be a huge number of loops.
|
|
|
|
|
Remember the Napster debacle?
They stab it with their steely knives but they just can't kill the beast.
|
|
|
|
|
Can go one step further.
All the words of the NYT articles are taken from a standard English dictionary, and the AI is just rearranging/reusing words from that dictionary into meaningful (sometimes meaningless?) sentences.
So the publishers of that dictionary can indeed sue the AI, can't they?
|
|
|
|
|
That's already settled law. An English dictionary publisher cannot sue everyone writing in English for breach of copyright.
What is protected in a copyright is not the individual words, but the creativity required to arrange them in a particular order. It is for this reason that derived works (set in the same "universe" as the original work) are also protected under copyright.
EDIT: corrected syntax
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.
-- 6079 Smith W.
|
|
|
|
|
Maybe they should sue the NYT first for using their words.
|
|
|
|
|
I'm probably way more amused by this than I have any right to be.
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
Begs the question of whether they’re protecting their journalists’ work or their paywall.
Time is the differentiation of eternity devised by man to measure the passage of human events.
- Manly P. Hall
Mark
Just another cog in the wheel
|
|
|
|
|
|
Good for them, it's certainly an issue that needs to be addressed. In their terms, AI isn't intelligence, it's parroting back what 'it' reads. Sometimes verbatim, sometimes glued together, and often mis-cited; appearing to come from sources that don't reflect the content.
In our industry, we can look at this from two different perspectives. One is the "Houston, we have a problem building AI" and the other is "yeah, we need better IP protections". Creating intelligent content costs money; in some cases a lot of money. If AI is allowed to trample IP rights, what is the motivation to invest the time and resources to create that content? What happens if the Times and other media cease to exist since their ability to make money ends? AI can't replace it and the information age will be permanently stuck in 2023 to some extent.
As an aside, in my opinion, the Fed needs to revisit the entire IP realm. We, as an industry, have been stuck between the lame copyright protection and the extreme bar of patent protection. The day is liable to come at some point, where AI could get into recreating software on its own, possibly eliminating any IP protection. There's a lot of things that need to be sorted out.
|
|
|
|
|
I've come up with a simple defense that the OpenAI team of lawyers can utilize and that no one can possibly defend against.
If the President of Harvard can do it then ChatGPT can do it because if the President of Harvard can do it because she is a "protected" class then what is more of a minority than the very first instance of an AI and shouldn't that then be a protected class that is allowed to also break the law and all forms of ethics if the Harvard President is also allowed to otherwise keep her job after having so many clear instances of plagiarism?
|
|
|
|
|
upvoted.
Charlie Gilley
“They who can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety.” BF, 1759
Has never been more appropriate.
|
|
|
|
|
Wait till they figure out everything is a derivative and nothing is original.
|
|
|
|
|
Bear with me, because as much as I am loath to holy roll about technology, I still have my peeves.
I went about porting my DFA lexer engine from C# to TypeScript. It was primarily an exercise in teaching myself TypeScript, plus brushing up on my JS.
So I implemented the bones of it, and after adjusting my mental map to the JS way of doing things I got it mostly working.
Then I went about trying to use a Map keyed by Sets.
Turns out JS Map and Set only compare by value for "scalar" (primitive) types, including strings; for everything else they use reference comparisons. You can't supply your own equality mechanism either.
how to customize object equality for javascript set - Stack Overflow[^]
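To make the behavior described above concrete, here is a minimal sketch. The variable names are made up for illustration; the only library features used are the standard `Map` and `Set`.

```typescript
// Two distinct Set objects with the same contents.
const first = new Set<number>([1, 2]);
const second = new Set<number>([1, 2]);

const bySet = new Map<Set<number>, string>();
bySet.set(first, "state A");

// Lookup with the same reference succeeds.
const hitByReference = bySet.get(first);
// Lookup with an equal-looking but distinct object finds nothing.
const hitByContents = bySet.get(second);

// Primitive keys, including strings, are compared by value.
const byString = new Map<string, string>();
byString.set("1,2", "state A");
const hitByString = byString.get([1, 2].join(","));

console.log(hitByReference, hitByContents, hitByString);
```

Only the object-keyed lookup with a different reference comes back `undefined`, which is exactly what gets in the way of keying a Map by Sets.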
Consequently, there is no performant way to do subset construction to convert an NFA to a DFA in this language.
I've seen others solve this problem by using string keys, but this falls down for machines of non-trivial size.
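For reference, the string-key workaround looks roughly like the sketch below. The `Nfa` and `Dfa` shapes and the function names are hypothetical, invented for this example, and epsilon transitions are left out to keep it short. Every DFA state is identified by a sorted, comma-joined list of NFA state ids, so building and hashing those keys gets more expensive as the state sets grow.

```typescript
// Hypothetical NFA shape for illustration: numeric state ids, no epsilon moves.
type Nfa = {
  start: number;
  accepting: Set<number>;
  transitions: Map<number, Map<string, number[]>>; // state -> symbol -> next states
};

type Dfa = {
  start: string;
  accepting: Set<string>;
  transitions: Map<string, Map<string, string>>; // key -> symbol -> next key
};

// Sort and de-duplicate so that equal sets always produce the same key.
function normalize(states: number[]): number[] {
  return Array.from(new Set(states)).sort((x, y) => x - y);
}

function subsetConstruction(nfa: Nfa): Dfa {
  const startKey = String(nfa.start);
  const dfa: Dfa = { start: startKey, accepting: new Set<string>(), transitions: new Map() };
  // Remember which NFA states each string key stands for.
  const members = new Map<string, number[]>([[startKey, [nfa.start]]]);
  const work: string[] = [startKey];

  while (work.length > 0) {
    const key = work.pop() as string;
    const states = members.get(key) as number[];
    if (states.some((s) => nfa.accepting.has(s))) dfa.accepting.add(key);

    // Collect the union of target states for each input symbol.
    const bySymbol = new Map<string, number[]>();
    for (const s of states) {
      const outgoing = nfa.transitions.get(s);
      if (!outgoing) continue;
      outgoing.forEach((targets, symbol) => {
        bySymbol.set(symbol, (bySymbol.get(symbol) ?? []).concat(targets));
      });
    }

    const row = new Map<string, string>();
    bySymbol.forEach((targets, symbol) => {
      const sorted = normalize(targets);
      const targetKey = sorted.join(",");
      if (!members.has(targetKey)) {
        members.set(targetKey, sorted);
        work.push(targetKey);
      }
      row.set(symbol, targetKey);
    });
    dfa.transitions.set(key, row);
  }
  return dfa;
}

// Small NFA for (a|b)*ab: 0 -a-> {0,1}, 0 -b-> {0}, 1 -b-> {2}.
const nfa: Nfa = {
  start: 0,
  accepting: new Set<number>([2]),
  transitions: new Map<number, Map<string, number[]>>([
    [0, new Map<string, number[]>([["a", [0, 1]], ["b", [0]]])],
    [1, new Map<string, number[]>([["b", [2]]])],
  ]),
};
const dfa = subsetConstruction(nfa);
console.log(Array.from(dfa.transitions.keys()));
```

This is only meant to show the shape of the workaround, not to settle how it performs on large machines.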
Regex FA visualizer[^] is one example but I can basically crash it or stall it out for a long time at least with any non-trivial expression. This one also doesn't work properly besides, but I have no other link handy for you to try.
This may be academic, but it is also basic computer science. A language should be able to allow you to implement computer sciencey algorithms and constructs - especially those that have been adapted to countless other programming languages. DFA by subset construction is basic.
And you can't do it in JS.
I can't even begin to imagine what LALR table generation would look like.
You may be wondering why do I care?
Because node.js.
Because Angular
Because React-Native
It's not just for web front ends anymore. JS is an almost virulent technology these days. It needs to, if not be Turing complete, at least cover the fundamentals, or you're just spreading garbage around.
Without a way to do custom comparisons at the very least on hashed containers, your language isn't going to be able to do a lot of things other high level languages can accomplish handily.
Is it even a "real" language? Is it ready for primetime, or is it just being adopted because we can?
modified 27-Dec-23 9:13am.
|
|
|
|
|
My experience has been that programming languages evolve mostly to meet practical needs. You can substitute the words economic or business for the word practical and still have a valid statement. While there is a certain amount of 'need' for the ability to implement computer-sciencey algorithms in a language in a performant way, I think it's a lower priority than other features that simplify or extend expression of common idioms.
honey the codewitch wrote: Consequently, there is no performant way to do subset construction to convert an NFA to a DFA in this language
I take it that it's not impossible, and your objection is to the performance of the implementation required by the language? It sounds like an edge case you run into with almost every language that needs an alternative solution.
For example: Since you're implementing this in TypeScript, it's a web app. That implies a server. What about serializing the NFA, shipping it to the server for conversion, and deserializing the DFA returned? 'Out-of-the-box', as it were.
Software Zen: delete this;
|
|
|
|
|
TypeScript (JS really in this case, since TS is just syntactic sugar and validation) doesn't imply a web app anymore, which was part of the point I was making in my original, arguably too verbose post.
It's used on the backend (node.js). It's used on the desktop (angular, react-native). Kevin only knows where next?
My concern is it doesn't seem ready for it.
As far as computer sciencey algorithms not being needed, consider that constructs in computer science make up nearly every programming problem you'll ever solve.
DFA by subset construction is not the only place you'd ever need custom equality.
From ECMAScript 6: maps and sets[^]
5.2 Why can’t I configure how maps and sets compare keys and values?
Question: It would be nice if there were a way to configure what map keys and what set elements are considered equal. Why isn’t there?
Answer: That feature has been postponed, as it is difficult to implement properly and efficiently. One option is to hand callbacks to collections that specify equality.
Another option, available in Java, is to specify equality via a method that object implement (equals() in Java). However, this approach is problematic for mutable objects: In general, if an object changes, its “location” inside a collection has to change, as well. But that’s not what happens in Java. JavaScript will probably go the safer route of only enabling comparison by value for special immutable objects (so-called value objects). Comparison by value means that two values are considered equal if their contents are equal. Primitive values are compared by value in JavaScript.
Read that carefully and you'll see the problem is more fundamental than simply maps and sets. You can't override equality. You can't implement custom value equality for objects.
That hamstrings your ability to use Sets and Maps in the first place, but that's not the only place it limits you.
It also speaks to a larger issue: if this is missing/incomplete/problematic-to-implement based on how the language works under the covers, what else can't it do that is fundamental?
|
|
|
|
|
honey the codewitch wrote: As far as computer sciencey algorithms not being needed, consider that constructs in computer science make up nearly every programming problem you'll ever solve.
I was trying to express the notion that the ability to implement algorithms efficiently depends upon the utility of that algorithm in the target environment.
honey the codewitch wrote: Read that carefully and you'll see the problem is more fundamental than simply maps and sets. You can't override equality. You can't implement custom value equality for objects.
My knowledge of TypeScript/JavaScript is limited. Would it be possible re-implement maps and sets that supported at least a limited form of the capabilities you require?
honey the codewitch wrote: if this is missing/incomplete/problematic-to-implement based on how the language works under the covers, what else can't it do that is fundamental?
There's the key word (no pun intended).
|
|
|
|
|
Gary R. Wheeler wrote: I was trying to express the notion that the ability to implement algorithms efficiently depends upon the utility of that algorithm in the target environment.
Okay, fair enough. Though again, I'm still worried about this, given this is my first attempt to do something non-trivial with TypeScript, or even, in years, JavaScript. On the first attempt I ran into such a fundamental and showstopping limitation that it definitely makes the hair on the back of my neck stand on end, you know?
Gary R. Wheeler wrote: My knowledge of TypeScript/JavaScript is limited. Would it be possible re-implement maps and sets that supported at least a limited form of the capabilities you require?
Not efficiently using JS itself, as far as I can tell. I haven't profiled to be sure, but it's a lot of work for something that probably won't solve the issue.
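For anyone wondering what such a re-implementation might look like, here is a minimal sketch. `HashMapBy`, `hashStates`, and `sameStates` are hypothetical names made up for this example, not part of any library. It layers caller-supplied hash and equality functions over a built-in `Map` keyed by the numeric hash, and it has not been profiled, so it says nothing about whether the result would be fast enough.

```typescript
// Hypothetical helper: a map whose key equality is defined by the caller.
class HashMapBy<K, V> {
  private buckets = new Map<number, Array<[K, V]>>();
  private count = 0;
  private readonly hash: (key: K) => number;
  private readonly equals: (a: K, b: K) => boolean;

  constructor(hash: (key: K) => number, equals: (a: K, b: K) => boolean) {
    this.hash = hash;
    this.equals = equals;
  }

  get size(): number {
    return this.count;
  }

  get(key: K): V | undefined {
    const bucket = this.buckets.get(this.hash(key));
    if (!bucket) return undefined;
    for (const entry of bucket) {
      if (this.equals(entry[0], key)) return entry[1];
    }
    return undefined;
  }

  set(key: K, value: V): this {
    const h = this.hash(key);
    let bucket = this.buckets.get(h);
    if (!bucket) {
      bucket = [];
      this.buckets.set(h, bucket);
    }
    for (const entry of bucket) {
      if (this.equals(entry[0], key)) {
        entry[1] = value; // overwrite the value for an equal key
        return this;
      }
    }
    bucket.push([key, value]);
    this.count++;
    return this;
  }
}

// Order-independent hash for a set of integer state ids (addition commutes).
function hashStates(states: Set<number>): number {
  let h = 0;
  states.forEach((s) => {
    h = (h + Math.imul(s + 1, 0x9e3779b1)) | 0;
  });
  return h;
}

// Two sets are equal when they have the same size and the same members.
function sameStates(a: Set<number>, b: Set<number>): boolean {
  if (a.size !== b.size) return false;
  let same = true;
  a.forEach((s) => {
    if (!b.has(s)) same = false;
  });
  return same;
}

const dfaStates = new HashMapBy<Set<number>, string>(hashStates, sameStates);
dfaStates.set(new Set([1, 2]), "q0");
const found = dfaStates.get(new Set([2, 1])); // a different object with equal contents
console.log(found, dfaStates.size);
```

Every lookup pays for a user-level hash computation plus a linear scan of the bucket, which is the kind of overhead the concern above is about.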
Gary R. Wheeler wrote: There's the key word (no pun intended).
Which brings me back to my above response, about this being my first real go at TS, and my first time running into a major wall with it. It doesn't bode well, at least to me.
|
|
|
|
|
honey the codewitch wrote: if an object changes, its “location” inside a collection has to change, as well. But that’s not what happens in Java. JavaScript will probably go the safer route of only enabling comparison by value for special immutable objects (so-called value objects).
I realize that is not your statement but I will note it is a silly response.
In ANY language, if you change the contents of an entity within a collection, there is a risk that you violate the constraints of the collection.
Any competent programmer that actually understands a Java HashMap (and similar) must understand the impact of attempting to change the semantics of what equals() and hashCode() actually mean.
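The same hazard is easy to reproduce in TypeScript with a content-derived key. This is a minimal sketch with made-up names (`keyFor`, `seen`), shown only to illustrate the point that it is not specific to Java.

```typescript
// The key is derived from the contents once, at insertion time.
const keyFor = (states: number[]): string => states.join(",");

const seen = new Map<string, number[]>();
const current = [1, 2];
seen.set(keyFor(current), current);

const foundBefore = seen.has(keyFor(current));

// Mutating the stored object does not move its entry in the collection.
current.push(3);
const foundAfter = seen.has(keyFor(current));

// The entry is still filed under the old key, but now holds the mutated array.
const staleEntry = seen.get("1,2");

console.log(foundBefore, foundAfter, staleEntry);
```

After the mutation, the entry can no longer be found through its current contents, and the old key now points at data that no longer matches it.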
I can see a Junior developer doing that.
I can also see a Junior developer failing to correctly manage memory allocations in C/C++.
But Java is not JavaScript, and neither of them is C/C++.
honey the codewitch wrote: But that’s not what happens in Java
That is perhaps more ludicrous. Is there any language where that happens?
|
|
|
|
|
I agree with you about that response. It seems like they refuse to implement the feature until they can overengineer it.
|
|
|
|
|
Quote: It's used on the backend (node.js). It's used on the desktop (angular, react-native).
That is just bad application design. JavaScript on the backend or the desktop is like putting a lawnmower engine in a Chevy Corvette and expecting to get a speeding ticket on the Interstate.
C#, VB.NET, Java/Kotlin, Rust, Python, etc. are much better server-side and desktop languages than JavaScript.
|
|
|
|