FutureLaw 2024 - Generative AI and Intellectual Property
Estimated read time: 1:20
Summary
The FutureLaw 2024 panel at Stanford Law School discussed the implications of generative AI for intellectual property (IP). Moderated by Professor Mark Lemley, the panelists, including Angela Dunning and Max Sills, debated the legality of training AI models using copyrighted data. Key issues included whether AI training constitutes fair use and the potential for licensing markets. The discussion emphasized the complexity of balancing innovation with creative rights, reflecting diverse legal and emotional perspectives. While fair use and transformative use were central themes, the panel acknowledged that legal outcomes are unpredictable due to evolving technological and societal contexts.
Highlights
- Professor Mark Lemley moderated the panel at Stanford Law. 🎓
- Discussion focused on the law of AI rather than AI for law. ⚖️
- Experts debated if AI training is fair use or needs licensing. 🤔
- Training using copyrighted works poses existential questions for AI. 🧠
- Potential licensing schemes for AI training are complicated. 💼
Key Takeaways
- Generative AI is reshaping the legal landscape of intellectual property! 🤖
- The distinction between AI training and outputs is crucial. 🎓
- Fair use and transformative use are key legal concepts under scrutiny. ⚖️
- Emotions around AI and copyright vary from excitement to concern. 😅
- The potential for licensing AI training data is being explored but is complex. 🔍
Overview
The FutureLaw 2024 session at Stanford Law School brought together experts to discuss a hot topic: generative AI and its implications for intellectual property. With Professor Mark Lemley at the helm, the panel explored the legality and ethical considerations of AI learning from copyrighted materials. Does it fall under fair use? Or is there a need for new licensing frameworks? These questions were at the heart of the discussion, reflecting the complex interplay between law, innovation, and creativity.
One of the key discussions was the separation between AI training and the resulting outputs. It's a nuanced distinction that's crucial in legal debates, especially regarding fair use and transformative use. Panelists shared differing opinions, from those who see AI's ability to learn as a natural extension of human creativity, to those worried about the economic implications and potential misuse of intellectual property.
The conversation also touched on emotional and societal perspectives, illustrating how people’s feelings about AI's role in IP range from innovative excitement to protective concern. As industries grapple with these changes, the possibility of licensing training data emerges, though it presents logistical challenges. Ultimately, the panel underscored that the outcomes of these legal battles are unpredictable, significantly influenced by technological advancements and cultural shifts.
Chapters
- 00:00 - 00:30: Introduction and Overview The chapter introduces a panel discussion on Generative AI (Gen AI) and Intellectual Property (IP), moderated by Professor Lemley. It humorously notes that this is Professor Lemley's appearance outside of his 'Jedi outfit'. The initial remarks set a light-hearted tone, jesting about the AI's potential uses, including creating a picture of Professor Lemley as a Jedi.
- 00:30 - 01:00: Moderator Introduction - Professor Lemley The chapter titled 'Moderator Introduction - Professor Lemley' introduces Professor Lemley, who is the Faculty Director of the Law, Science, and Technology program. The speaker humorously mentions seeing actual pictures of themselves in a Jedi outfit, joking that AI is not needed for that. Professor Lemley is described as the speaker's boss at the Law School, and is noted to be one of the most published and cited authors, not only in intellectual property (IP) law but in all areas of legal academia.
- 01:00 - 02:00: Panel Introduction The chapter titled 'Panel Introduction' introduces the panel's focus on Generative AI and Intellectual Property. The speaker highlights a prominent figure in legal technology, who is a creator of a legal tech project that evolved into the well-known startup Lex Machina. This individual will be leading the discussion for the panel. The audience is invited to welcome the panel and its leader, Mark.
- 02:00 - 03:00: Panel Focus: Law of AI The chapter titled 'Panel Focus: Law of AI' begins with the moderator addressing the audience, expressing gratitude, and setting the stage for the panel discussion. The panel is unique in the context of the conference as it shifts the focus from AI for law to the law governing AI. The panelists are introduced and there is an emphasis on the significance of this discussion because it addresses key questions that will influence the future and legal landscape of AI technologies like ChatGPT and other similar tools.
- 03:00 - 04:00: Issues in AI Ownership and Control The chapter discusses the complex issues surrounding ownership and control of artificial intelligence (AI). Legal considerations are highlighted as critical factors that will influence the development and use of AI, particularly in legal contexts. The chapter emphasizes that current legal challenges related to AI will significantly impact its future development and application in society. Mark Lemley, a notable figure in the legal field, is mentioned, indicating the discussion draws on expert insights to explore these issues.
- 04:00 - 06:00: Introduction of Panelists The chapter 'Introduction of Panelists' begins with panelists being introduced, highlighting their professional backgrounds and areas of expertise. A lawyer from Lex Lumina mentions litigating some relevant cases to be discussed. Angela Dunning self-introduces as a litigation partner at Cleary in Palo Alto, specializing in copyright, trademark, false advertising, and right of publicity cases. She has been practicing for approximately 25 years in the Valley and also engages in teaching.
- 06:00 - 08:00: Overview of AI-related Litigation The chapter discusses AI-related litigation, specifically focusing on copyright class action lawsuits linked to the development, training, and output of generative AI tools. The narrator shares their background in trademark law and expresses enthusiasm about discussing these matters with esteemed colleagues.
- 08:00 - 12:00: Fair Use in AI Training This chapter features a discussion with Max Sills, who is the General Counsel of Midjourney and runs Open Advisory Services, an advisory practice for AI startups. It also includes Danielle Van Leer, a former Senior Assistant General Counsel at SAG-AFTRA, where she worked on legal issues concerning AI, performer rights, name, image, and likeness rights, publicity rights, intellectual property, and privacy. The conversation touches upon the complex legal landscape of AI, especially regarding contracts and compliance that do not easily fit into collective bargaining frameworks.
- 12:00 - 16:00: Fair Use Defense and Case Study The chapter titled 'Fair Use Defense and Case Study' begins with a character reflecting on their career, contemplating a change in direction after a strike disrupted their field. The narrative introduces Paul Goldstein, a long-serving faculty member at an academic institution, specializing in teaching copyright law, particularly focusing on international intellectual property. Goldstein shares insights from his extensive experience dating back to 1975, offering a perspective that predates many in his audience.
- 16:00 - 20:00: Transformative Use and Copyright Challenges This chapter discusses the ongoing legal battles concerning the use of copyrighted material in AI training, with a focus on the numerous lawsuits that address these issues in the United States and internationally. The legal landscape is complex, with about 20 active lawsuits in the US alone at the moderator's last count, plus several in other countries. This indicates the significant legal challenges AI developers face over the use of copyrighted content for machine learning purposes.
- 20:00 - 26:00: Licensing and Market Effects This chapter discusses the early stages of creating generative AI, focusing on the training process and the construction of datasets. It highlights the immense volume of content required for these datasets to understand language, concepts, and image relationships. Additionally, the chapter touches on the prevalent issue of copyright, noting that nearly everything in today's world is protected by copyright laws.
- 26:00 - 30:00: International Regulation and Compliance This chapter discusses the legality of training generative AI databases, specifically focusing on the aspects of copyright law. It highlights notable exceptions in the law and questions whether it is legal to use copyrighted material in training datasets. The speaker Angela suggests that while it would be ideal to declare it legal unambiguously, there remain challenges and complexities to navigate within this legal sphere.
- 30:00 - 35:20: Economic Realities of Licensing This chapter discusses the ongoing legal challenges related to the use of licensing in the context of training AI models. The central legal question being addressed is whether the act of making copies for AI training can be considered fair use. Although many lawsuits have been filed, few have directly put this fair use question to the court. The resolution of these cases is expected to hinge on this crucial issue. As it stands, the definitive legal determination on this matter is still forthcoming.
- 35:20 - 43:00: Output Infringement The chapter titled 'Output Infringement' discusses a legal case involving music publishers seeking a preliminary injunction against Anthropic. The concern is over Anthropic's Claude model being trained on content owned by the music publishers, aiming to prevent further training that infringes on their rights.
- 43:00 - 46:00: Future of AI and Legal Implications The chapter explores the future of Artificial Intelligence and its legal ramifications. It delves into a case involving plaintiffs where the central legal question revolves around the 'fair use' doctrine and its application. The discussion highlights the importance of establishing whether plaintiffs are likely to succeed on the merits of their claims, which is crucial for determining the provision of injunctive relief. Both sides have presented arguments regarding the legality and implications of training AI on certain content, making the issue of 'fair use' a pivotal point in this legal debate.
- 46:00 - 47:00: Closing and Audience Q&A The chapter covers the closing remarks and an audience Q&A session. It highlights a legal case involving former Arkansas Governor Mike Huckabee and other writers in the Southern District of New York. The case involved multiple parties, most of whom were either dismissed or transferred out, leaving Bloomberg as the remaining party.
FutureLaw 2024 - Generative AI and Intellectual Property Transcription
- 00:00 - 00:30 Alright, this is a highly anticipated panel on Gen AI and IP. We're continuing with our next session, which will be moderated by our very own Professor Lemley. This is how, this is how he looks when he's not wearing his Jedi outfit. I, I, I have to say all the uses of AI that could be made and a picture of me in a Jedi outfit, it seems particularly
- 00:30 - 01:00 useless because you can find actual pictures of me in a Jedi outfit with illegal eyes. You don't need AI for that. Anyway. So please come on in. And so, and Professor Lemley is of course the Faculty Director of our program in Law, Science, and Technology. He's my boss here at the Law School so I have to be on my, my best behavior here today. But yeah, he's one, not only one of the, the most published and cited authors in IP, but in all of legal academia.
- 01:00 - 01:30 And he's also a legal tech innovator. He created a legal tech project here several years ago, which became a startup that we all know Lex Machina. And he will be running this Gen AI and IP panel here for us today. So please join me in welcoming our panel and over to you, Mark.
- 01:30 - 02:00 All right. Thanks everybody. So I'm going to, let me just, I want to ask the panelists to self introduce and in a minute, but just this panel is a little different than everything else at the conference. We're going to talk about the law of AI rather than AI for law. And in part that's because it's kind of an interesting question, but in part because I think it's a question that is going to determine whether we have things like ChatGPT and other tools
- 02:00 - 02:30 that can be used to do these things, who owns them, who controls them, and how they work. And so a lot of the sort of legal issues that are going on right now, I think, are going to be dealt with, significantly impacting the way AI develops and how it can be used or not used for law. So I am, as Roland said, Mark Lemley, here at the law school. I will note that I am also a practicing
- 02:30 - 03:00 lawyer at Lex Lumina, where I am litigating some of the cases that we are gonna talk about, and we'll mention that as relevant. And let me ask, just going down the line, Angela, to self introduce. Hi everyone, I'm Angela Dunning. I'm a litigation partner at Cleary based just down the street in Palo Alto. I've been practicing copyright and trademark, false advertising, right of publicity cases for the better part of about a quarter century now here in the Valley. I also teach
- 03:00 - 03:30 trademark law at a little law school across the bay that won't be mentioned in this environment. But I, like Mark, am litigating several of the copyright class action lawsuits that have been directed to the development, training, and output of generative AI tools. And I'm excited to be here to talk to you about that with my esteemed co panelists. Hi,
- 03:30 - 04:00 I'm Max Sills. I'm the General Counsel of Midjourney, and I also run Open Advisory Services, which is an advisory practice for AI startups. I'm Danielle Van Leer. Until about a month ago, I was a Senior Assistant GC for Contracts and Compliance at SAG-AFTRA, where I worked on AI and performer rights issues, name, image, and likeness rights, rights of publicity, IP, privacy, all kinds of stuff. Whatever kind of didn't fit into the collective bargaining
- 04:00 - 04:30 hopper kind of fell in my area. Now I'm trying to figure out what I want to be when I grow up because I needed a change after the strike. I'm Paul Goldstein. I've been on the faculty here since 1975 before some of you were born. And teaching principally copyright and generally around intellectual property, but mostly copyright and international and
- 04:30 - 05:00 comparative copyright. Like Angela, like Mark, I am involved in some of the litigation that surrounds training, AI training as well. Great. So, I want to start with that set of litigation. There are, at my last count, 20 lawsuits going on in the United States as well as several in other countries. And most of those lawsuits are focused right now
- 05:00 - 05:30 at least on the sort of early stage creation of generative AI: the training, the building and use of a data set to train generative AI. Those data sets of course take enormous amounts of content to try to learn how language works, to try to learn concepts and image relationships. And because almost everything in the world is copyrighted in the modern era, almost everything
- 05:30 - 06:00 that goes into a training dataset is copyrighted. There are some notable exceptions, particularly in law that we'll talk about. So where are we with the sort of fundamental question, right? Is it even legal to train a generative AI database? Angela? Well, I'd love to declare that it is and have that be the end of it. But I think that we've got a little ways
- 06:00 - 06:30 to go. So as Mark said, there are an awful lot of lawsuits. To date, very few of them have actually put the question to the court yet directly with respect to whether the training of an AI model constitutes fair use in connection with the making of copies for purposes of training. That issue will likely be the decisive factor in most of these cases. But we are a ways from
- 06:30 - 07:00 getting a ruling. I would just highlight a couple of cases that are out in front. One is the case filed by music publishers against Anthropic in which a preliminary injunction has been sought to block the further future training of Anthropic's Claude model on content owned by
- 07:00 - 07:30 those plaintiffs. And in that case the fair use question has been briefed in connection with the inquiry into whether plaintiffs are likely to succeed on the merits of their claim, which is a key factor in determining whether injunctive relief should be granted. And there, there are arguments that have been put in both ways on why the training on that content is
- 07:30 - 08:00 fair or not but there is not yet a ruling. And then I would also just point out to the room the case that was filed by former Arkansas Governor Mike Huckabee among a class of writers in the Southern District of New York. That case originally was filed against a number of parties. All have been transferred out or dismissed except Bloomberg. And in Bloomberg's motion to dismiss,
- 08:00 - 08:30 which is not yet fully briefed, they have sought dismissal at the pleading stage on fair use grounds arguing that the tool they developed has never been commercially released and has never been used by anyone, hasn't generated any revenue. It was a research tool in and of, and, and a research tool deployed by a news organization. And so, therefore, on the papers very squarely within what fair use was intended to protect. So we're watching those cases and it'll be some
- 08:30 - 09:00 time before fair use issues properly bubble up in the other cases that we're working on. Yeah, and so just to note on the procedure there, right? I mean, that's because these cases are all at the motion to dismiss stage. Fair use is a defense, so it's not part of the pleading. So you've got to wait until the case is at issue. What we've seen, I think, in the cases is the whittling out of a lot of sort of ancillary theories of liability right? So the courts
- 09:00 - 09:30 have pretty much across the board rejected claims that you are violating Section 1202 of the Digital Millennium Copyright Act by removing copyright management information in the training data set. They have across the board rejected the theory that the model itself is somehow a derivative work of all of the billions of works that went into training it. Some of the state law cases have gone away. But, so let's talk about the sort of heart of the fair use question,
- 09:30 - 10:00 right? Is there, uh, I mean, there is a lot of copying going on here, right? I mean, these models are built on a database, right? That trains maybe on Common Crawl, maybe on the LAION image database, right? But that are billions of different copyrighted works. That at least in the outset, right, in their entirety go into the, to the database. Is that fair use?
- 10:00 - 10:30 Why? Or why not? And Angela, you're welcome to jump in, but anybody's welcome to jump in with thoughts on this. I'll start us off. I, I think fundamentally it has to be. What we as human beings have done from the dawn of time is ingested what knowledge exists in the world. Through the reading of books, the viewing of art, the general perception
- 10:30 - 11:00 of language and all forms of learning. We take that information, we then think on it, iterate on it in our minds and produce new content. Obviously if the content we produce is substantially similar in protected expression to somebody else's book or somebody else's artwork, then that may raise a real copyright concern. But the learning aspect is not just permissible. That's the whole point of the Copyright Act. That's the whole point of
- 11:00 - 11:30 the constitutional provision that guarantees this limited copyright monopoly for purposes of ensuring that there is a promotion of the development of the arts and sciences. We want information in the form of literature, text, data, music, art, to be made available to all so that it can be learned from and developed further. And at the training stage in, when you're talking about an AI model, you're, you're talking about the making of copies, not so those copies
- 11:30 - 12:00 can then be reproduced, which may raise issues, but so that you can take the information from those copies, whether it's art or literature or post, figure out how language works, figure out what a cat is from its spatial dimensions so that the AI can produce new content the same way that humans can. Maybe just to add, I completely agree, just to add to that I think it's a good
- 12:00 - 12:30 time to just check in on how we're feeling. So I think that people are overloading copyright with a lot of feelings that, that we have around AI. So one issue that's going on is we're having another period of industrialization and automation. And people are upset because it seems like the monetary gains of that are accruing to a small amount of people. That's a separate question from what is current copyright law. There's also the idea of what,
- 12:30 - 13:00 what do we want copyright law to be? So just to underline the point Angela was making I think if you feel upset, and you want money, it makes sense to advance a theory that you own a property interest in something. But, the whole reason we gave people those property interests was to advance culture. Advance society. And, I think we might be at a point where we're forgetting the merger doctrine. Forgetting the idea expression dichotomy.
- 13:00 - 13:30 Because we're confusing feelings of upset about the economic gains of automation with why we have this other law over here. So I kind of think that asking whether, under current law, this is fair use is a boring question with a boring answer. It's yes, clearly. So I think we should be prepping ourselves for, like, what do we really want to say? What direction do we want society to move in? We probably want society to maximize creative
- 13:30 - 14:00 expression rather than let everyone assert a property interest over every idea they have. Alright, I'm going to disagree and, you know. And for the feelings perspective, I actually, I, one thing I didn't mention is that I'm a screenwriter and photographer on the side. So, I, you know, I do have an interest on kind of on all sides. I do use, you know, ChatGPT, for instance, in kind of ideating things, just trying to help people, writer's block. One of
- 14:00 - 14:30 the guys I played D&D with uses Midjourney all the time to create really cool photos. But, you know, there's a, the flip side of that is one is like where the training data is coming from. I mean, these are like, a lot of this training data is coming from being harvested off the internet that people didn't necessarily, you know, anticipate it would be used and maybe they would have kept their IP private if they, if that was the case. I mean, certainly I have photographs that I wouldn't have posted publicly if I knew that that was going to happen.
- 14:30 - 15:00 I'm not even thrilled that I have photos up on Getty, iStock and Shutterstock. I'm not thrilled that they licensed my photos for training data. But I think it, you know, I think there is a, it depends context here too because it depends on the, the particular algorithm, right? I mean, if you look at, say, going back to the face swap videos, where they are literally taking clips from films and repurposing those, the output of that. Is that training data fair use when the purpose
- 15:00 - 15:30 is to reproduce that exact clip or those exact people? You know, I do think. I think this is something that, I am a proponent of very strong copyright, but I also have friends that, like I said, have used Midjourney for stuff. I played around with it once to create my D&D character picture. You know her, my little avatar. So, you know, I think they're, you know, I think there's a middle ground here that we're not finding it right
- 15:30 - 16:00 now because these are all shaking out everything is at the far ends of the extreme and we're not seeing many middle ground positions right now, but I do think it depends on the intent behind the training in large part when you start looking at the four factors and it, it does depend on I, I, I, I don't know that everybody here even knows the, the, are there folks who don't know the U.S. four factors of fair use or four plus factors. I'm sure the answer to that is yes,
- 16:00 - 16:30 it's not a copyright crowd. So there's, you know, under US copyright law, there's a, a, the defense, you know, in, in fair use entails balancing four factors, but there's some question, depending who you ask, if it's exclusively four factors, if it's more, I'm on the side that advocates that it is more than just the four factors, but, and I always screw them up when I'm just speaking off the cuff, but it's the, the nature and purpose of the use. Like, so are you using it for for-profit purposes?
- 16:30 - 17:00 So, you know, are they training the algorithm to make money? The type of work that's being trained, and that can be whether it's a creative work, or a like factual work. That gets to the idea expression dichotomy. The, you know, the amount used, essentially, like I said, I always screw it up when I'm thinking off the cuff here the amount of the work that you're using, are you, you training it on the whole book, the whole movie, or are you training it on snippets?
- 17:00 - 17:30 And also on the, the effect on the market for that work. So if, you know, and that can include taking away licensing revenue. So we do have to keep all that in mind. I mean, that was intended to balance, you know, the common law approach to limiting copyrights, so I think you know, not to mention when you start looking internationally, you get into a whole other mess where the Berne Convention has which most countries, most major countries are
- 17:30 - 18:00 signatories to, have more restrictions, and I think we're starting to see some interesting things happen internationally. I believe Japan is allowing training data and the EU is adopting an opt-out, or notice-and-opt-out, process. So you know, this is, I think we can't look at it just with a U.S. lens. We need to also look at it in the context of international. Yeah, just building on Max's observation about feelings and, and Danielle's earlier beginning points.
- 18:00 - 18:30 I think there are feelings in conflict here. On the one side, you have the feeling that we should be advancing research. The Constitution authorizes copyright to promote the progress of science and the useful arts, and certainly that's what training is doing. On the other hand, there is a feeling, certainly in the creative community of what my colleague in the political
- 18:30 - 19:00 science department has characterized in this context as a need for reparation. And it's interesting to take the reparation discussion and plant it squarely in the consciousness of people who feel that they've been ripped off, and whatever the technicalities of the law may be, deserve to be compensated in some way. I think that's the feelings that, that are in conflict. On the technicalities of, of fair use you know, Mark said, well
- 19:00 - 19:30 there's just a lot of stuff that we're copying. Yeah, the 20 million books that Google digitized for the Google Books project was also a lot of copying and was held to be fair use. I think when the fair use question is addressed in the ongoing litigation, the question will be does the rule of the Google Books case from the Second Circuit apply here. Most of the litigation is
- 19:30 - 20:00 in the Ninth Circuit, although there is some in the Second, and I imagine that what courts might have a close eye on is that the Second Circuit decision in Google Books turned on the notion of transformative use. The opinion was written by a judge who invented transformative use. Transformative use has come under subsequent review in the Warhol case, a quite different
- 20:00 - 20:30 context, but it leads to some uncertainty about what does transformative use mean today, post Warhol, or outside the Second Circuit. Yeah, so I'd love to just build on a couple of points that were made and Danielle and I have consensus on a number of things we, we typically find actually. I mean, I think it's really important in
- 20:30 - 21:00 setting the stage that we distinguish between the outputs of these models and the act of training, right? So again, any particular output that is generated from a model. If you put that up against an original work on which it was trained, or, or any copyrighted work for that matter, and it is substantially similar in protected expression and copied from the original, there may very well be a copyright problem. And we're going to turn to that issue next. Yeah,
- 21:00 - 21:30 so I'll, I'll skip that. That's a contextual analysis of that particular output. What we're talking about in this question is just the act of training an AI tool to be able to generate language or to be able to generate images of any kind. Right. If I want to create an image of a cat on a surfboard in Venezuela, that image probably doesn't exist anywhere. It's not infringing anybody's copyright, but without a tool that's been trained on lots and lots and lots of images. There's no way of doing that. So, first setting the stage there,
- 21:30 - 22:00 I think we're talking just about training. Then I think it's important to put in context this, the concept generally of fair use and this idea expression dichotomy. So, the idea expression dichotomy is just the basic rule that nobody can own an idea. Nobody can own a concept. Nobody can own information. Why not? Because we want everybody to be able to use those ideas and concepts and information to create new works, to write about them, to expand knowledge. So what we protect under copyright law is
- 22:00 - 22:30 just the particular expression of that idea, the actual words used, the actual image created, the particular notes and rhythms that create a song. And so separating those two things is important because again, everybody is free to take the idea, the Supreme Court has said absolutely, definitively, over and over. You are allowed to use the ideas and concepts from other people's work. That's what all knowledge is based on. So then the style comes up, right?
- 22:30 - 23:00 In the context of these cases, there are artists and writers who say, you're taking my style. But style's an idea. Nobody owns a style either. I can go to a museum and do my darndest to emulate Picasso's style. Now, I'll never be as good as Picasso. I may not be able to compete with him. But if I want to make a thousand works in the style of Picasso, so long as I'm not taking his expression, copyright law permits that. And so in the
- 23:00 - 23:30 context of training, what we're talking about is ingesting copies of works. So that you can take information from those works about how language works, syntax, structure, how often words appear next to other words. This kind of information is arguably not even protected by copyright, and if what you're doing is taking that for the purposes of creating a completely new ability to generate new language and content, I would argue that is quintessentially transformative and exactly what the courts have held,
- 23:30 - 24:00 including in the Google Books case, is permitted. So, so the Well, that's owed. Go ahead. Yeah, sorry Mark. That's, that's why I kind of said we need to look also at the intent, right? Because I think if you were to take the scripts from all the Star Wars movies train them, to train an AI with the purpose of creating, I mean, you know, the same person creating or even just a company doing it training an AI with the, all the Star Wars scripts for the purpose of creating Star Wars content. And I think, you know, you start getting to
- 24:00 - 24:30 that intent and I think you do obviate the fair use. So I do think that is going to potentially... So this, I mean, both of these comments to me sort of get to the problem that it is actually really hard to get people to focus on the distinction between the training and the output, right? So to me, right, if I decide to train only on Star Wars content, I don't think there's anything inherently problematic about that. Except that it's almost certainly going to give you output that is substantially similar to Star
- 24:30 - 25:00 Wars content. And if that's what I'm going to do, then I think we're going to be in a different and more challenging problem. Right. But can I add one more thing? I'm sorry, Mark. Even the creation of identical works, even with that intent, doesn't mean that it isn't fair use. You know, every time I go to photocopy something, I'm making an identical work. But if I'm doing that for purposes of scholarship, research, teaching, criticism, that may very well be fair. If I'm a brand owner, I may very well want to upload my content for purposes of… But again, we are wandering past training to
- 25:00 - 25:30 content. I want to talk about that, but I want to close the loop on training. Right? But the very fact that it is hard to separate these two, I think, is really important. Right? Because this is what we're seeing happen in the lawsuits. Right? Why are people upset about training? Maybe they're upset about training because they sort of conceptually don't like the idea that a machine might learn from their work, even if the machine is gonna produce cats surfing in Venezuela, right, that has nothing to do with their work.
- 25:30 - 26:00 That's a kind of weird objection though, right? And when you get down to it, I think the objections mostly end up taking one of two forms, right? One is, I'm afraid that the output will be too similar, right? That you will in fact sort of end up copying my work in the output. Right? And that's an issue we're going to turn to in a minute, right? Or the objection is, right, I'm afraid of competition from something that isn't infringing my work, right? And that's an objection, and it goes a little bit to Max's point, right? But it's not an objection copyright law cares about.
- 26:00 - 26:30 So let me just sort of close the loop on training with two points to make, right? One is, the reason that the lawsuits so far are mostly focused on training and not output is that training is an existential question for AI. It's an existential question for AI because of the way we have structured copyright's remedies regime, right? Copyright has a statutory damages provision. So for every registered work, if you show infringement, you can get not just the actual harm you suffered, you can get a minimum of $750
- 26:30 - 27:00 per work and a maximum of $150,000 per work, depending on intent. If in fact training on 2 billion images is an act of infringement of all of those 2 billion images, then even assuming we pick the minimum threshold, you know, we pick the minimum statutory damages that we are required by law to give, right, we're at 1.5 trillion dollars in damages. Plaintiff's
- 27:00 - 27:30 class action lawyers can do that math, right? And 1.5 trillion dollars in damages sounds pretty good. Even if you're not gonna get 1.5 trillion, maybe there's a settlement, maybe there's a lot of money to be had here. And so rather than the, hey, sometimes occasionally there is an output that's substantially similar and that's infringing, we want the big hit, right? We want to say the whole thing is infringing and we are entitled to damages for all of that. Whether that's true, of course, depends a little bit on the issue we've also talked about, right,
- 27:30 - 28:00 which is transformative use, but it also depends on another question, the fourth factor of fair use, right, which is the market effect. Now traditionally we think of the market effect as, is my work substituting for yours, right? Are people buying my copy of this song rather than your copy of this song? If so, that's unlikely to be a fair use. But as Paul previously mentioned, right, one of the things we think about in market effect is also the possibility of a licensing market. And so one of the things that I think is an
- 28:00 - 28:30 unsettled question in the fair use inquiry is, will we see, can we see a market for licensing training data? Is this a feasible thing? And Paul, I'd love to hear your thoughts on sort of, is the world going to end up not with lawsuits and statutory damages, but with a kind of, like... I think the answer to that, Mark, is yes, but it is a highly qualified yes. I think that however the current raft of lawsuits resolve themselves, there
- 28:30 - 29:00 is going to be licensing going forward. There is a very substantial chance that there will be, in part because of this instinct about reparation. But also in part because of stuff that is happening in Europe right now, and Danielle properly alluded to foreign activities as being very important. And
- 29:00 - 29:30 I'll get to that at the... It's the fourth of my licensing possibilities. But there are four ways in which licensing of training activities, exclusively training activities, could go forward. One is the traditional negotiated two-party license. There is some precedent for that. I think one of the early ones was OpenAI's license with the Associated Press and there
- 29:30 - 30:00 are others going forward. The problem with that is scalability and transaction costs. There are just too many licenses to be negotiated. There is a variable factor there, though. There was a really ill-conceived bill introduced in the Congress this week by Congressman Schiff that would impose a duty of transparency on platforms that are doing training and require them to list
- 30:00 - 30:30 all copyrighted works that they've trained on. Happily, it's a very short bill. But it is just totally ill-conceived. And among other things, the sanction for noncompliance is a one-time fine of $5,000, which I... Well, let's put that to the side. I think individual negotiated licensing is probably a non-starter. The second and third are collective
- 30:30 - 31:00 licensing and compulsory licensing. Both are topics that the Copyright Office asked for input on in their current notice of inquiry, and we'll be getting a report from them, I think, starting in a couple of months. And there will be a series of reports. In the responses to that, on collective licensing, there was some support among authors' groups particularly
- 31:00 - 31:30 and that is, if you think of ASCAP, BMI, and SESAC in the music area, having collective licensing by collective management organizations, CMOs. It got some support. The problem in the U.S. with collective licensing as a solution to transaction costs is that the U.S. has four CMOs for musical
- 31:30 - 32:00 performance rights and virtually no CMOs for all the other kinds of content that are subject to copyright. You contrast that with Europe and Latin America and Asia, where there is at least one, but typically one, CMO for each area, for photography, for visual arts generally, for writing, and so the thought of getting those collecting organizations in place in
- 32:00 - 32:30 the U.S. is certainly problematic. There was less support for compulsory licensing. And for good reason. The major proponents of it were student groups, one from the University of Hawaii, and the other from Brooklyn Law School. The two student groups liked compulsory licensing; nobody else seemed to care for it. And again, for good reason: compulsory
- 32:30 - 33:00 licenses are frowned upon. As Danielle alluded to, there is the Berne Convention, which in Article 9(2) puts limits on a country's ability to subject normal free-market licenses to compulsory licensing. The fourth kind of licensing, and the one that I think is most likely to come into place, and to do so within the next two years, is automated licensing.
- 33:00 - 33:30 Metadata attached to individual works that will communicate with the platform prior to training, saying, I don't want to be copied. Or, I will agree to be copied, but I need to be compensated in this amount. Or, you know, go ahead and copy, but give me this other consideration. I'm sure many of you recognize that as Content ID, which makes YouTube possible.
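The three signals described here (refuse, license for a stated amount, license for some other consideration) can be sketched as a small decision routine. This is a minimal illustration under stated assumptions: `Signal`, `TrainingTerms`, and `may_train` are hypothetical names invented for this sketch and do not correspond to any real metadata standard or to Content ID's actual interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set


class Signal(Enum):
    """Hypothetical per-work signals, mirroring the three cases in the talk."""
    DENY = "deny"                   # "I don't want to be copied"
    PAID = "paid"                   # "compensate me in this amount"
    OTHER = "other_consideration"   # "give me this other consideration"


@dataclass(frozen=True)
class TrainingTerms:
    """Hypothetical metadata record attached to a single work."""
    signal: Signal
    price_usd: Optional[float] = None
    consideration: Optional[str] = None


def may_train(terms: TrainingTerms,
              max_price_usd: float,
              acceptable: Set[str]) -> bool:
    """Return True if the platform is willing to train on these terms."""
    if terms.signal is Signal.DENY:
        return False
    if terms.signal is Signal.PAID:
        # A paid offer with no stated price cannot be accepted.
        return terms.price_usd is not None and terms.price_usd <= max_price_usd
    # OTHER: accept only forms of consideration the platform can honor.
    return terms.consideration in acceptable
```

Under this sketch, a work whose metadata asks for more than the platform's ceiling is simply skipped rather than negotiated over, which is the low-transaction-cost behavior the speaker attributes to an automated scheme.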
- 33:30 - 34:00 Within a world of safe harbors where otherwise they're subject to notice and takedown. I think that's likely for a couple of reasons. It has precedent, the Content ID precedent. It's an elegant, low-cost solution. And probably the most compelling reason is we're not gonna have any choice. Some of you may be familiar with Article 4(3) of
- 34:00 - 34:30 the Digital Single Market Copyright Directive that carves out an exception for training, but says in the event that a rights holder gives notice that it objects to the training, that notice must be honored. Well, Article 4(3), the opt-out provision, applies only within,
- 34:30 - 35:00 applies only to copying, to training going on in Europe, in the individual countries, under the principle of territoriality. What's happened more recently, last month, was the adoption in the European Union of the AI Act, which in Article 53C makes the Article
- 35:00 - 35:30 4(3) obligation an operational obligation across the board. Not only for training that occurs within the countries of the European Union, but for training that occurs anywhere. Let me just raise real quick: there's an obligation among nations to put in place a policy to comply with Union copyright law, Article 4(3), and in particular to identify and comply with, including
- 35:30 - 36:00 through state-of-the-art technology, a reservation of rights expressed pursuant to Article 4(3). Now that is going to apply extraterritorially. It's going to apply to any platform that does its training anywhere, so long as they're doing business in the European Union. It's akin in that respect to the GDPR, the General Data Protection Regulation, which similarly imposes
- 36:00 - 36:30 an extraterritorial obligation, respecting data privacy, on companies outside the European Union, if they are doing business in the European Union. And that has had the effect among American companies that are in that business of conforming their conduct, in part because of a vacuum of privacy law under U.S. law, but in part because they need to do business in the European Union. This is the so-called Brussels effect, and we will find ourselves,
- 36:30 - 37:00 I believe, when the AI Act comes into force, it's about two years and a couple of months from now, in a position where that kind of compliance will be required. And, worth noting, there is no fair use doctrine in Europe. There is no fair use doctrine in Europe. So I want to say on the licensing issue, I mean, I am troubled by this, right, because I think the economics are fundamentally different than the economics of other places where
- 37:00 - 37:30 licensing has worked, right? We have licensing for satellite broadcasts of transmissions, right? We have licensing for kind of covers of songs, right? And that works because the thing I'm using is one, or maybe a couple of, individual copyrighted works. And so we know who the people we want to pay are and so forth. But I struggle with sort of what it would mean to say we'll pay a fee for training on two billion images selected from the
- 37:30 - 38:00 LAION database. So Stability AI, right, the whole company's worth two billion dollars, maybe less right now after recent developments, right? But right, even if we said, all right, you know what, we're gonna take half of the entire value of the company and pay it in compulsory license fees to copyright owners, everybody gets fifty cents. I don't think when people think about, like, compulsory licensing, they think, I want my 50 cents. That's not $0.50 per use, that's $0.50, period, for training, right? And so what I worry about is that if in fact
- 38:00 - 38:30 we're in a world that Paul's talking about, right, what we're gonna see is a bunch of people who say, sure, I'll license my thing to be trained on for $5,000, right? Or maybe $500, right? And that's just impractical, right? No one can build a large training data set. There may be specialized ones, right? It may be that you want to train on a few particular things. Music might be a great example of where people would be willing to pay a certain amount of money.
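The back-of-envelope figures in this part of the discussion can be reproduced directly. The sketch below uses only numbers quoted by the speaker (two billion works, the $750 statutory minimum mentioned earlier, and a company valued at roughly two billion dollars paying out half its value); the function and constant names are made up for illustration, and nothing here is a legal conclusion.

```python
# Figures as quoted in the panel; names are illustrative only.
WORKS_TRAINED_ON = 2_000_000_000        # "two billion images"
STATUTORY_MIN_PER_WORK_USD = 750        # minimum statutory damages per registered work
COMPANY_VALUE_USD = 2_000_000_000       # speaker's rough valuation of the company
SHARE_PAID_AS_LICENSE_FEES = 0.5        # "half of the entire value of the company"


def statutory_damages_floor(works: int, minimum_per_work: int) -> int:
    """Total exposure if every work is infringed and each gets the minimum award."""
    return works * minimum_per_work


def license_fee_per_work(company_value: float, share: float, works: int) -> float:
    """One-time payout per work if a fixed pool is split evenly across all works."""
    return company_value * share / works


floor = statutory_damages_floor(WORKS_TRAINED_ON, STATUTORY_MIN_PER_WORK_USD)
per_work = license_fee_per_work(COMPANY_VALUE_USD, SHARE_PAID_AS_LICENSE_FEES,
                                WORKS_TRAINED_ON)

print(f"Statutory floor: ${floor:,}")         # $1,500,000,000,000 (1.5 trillion)
print(f"License pool per work: ${per_work:.2f}")  # $0.50
```

The gap between the two results is the point of the argument: the same two billion works yield a trillion-dollar damages floor but only cents per work when a realistic fee pool is divided among them.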
- 38:30 - 39:00 Right. To train a music data set, although that's often going to be because we want to generate things that seem very similar to your song. And that might not be a very popular... So I worry a little bit that sort of the practical effect of this is it's not going to be, we'll get a licensing scheme that works. I just don't think the economics work for it. It's gonna be, we'll get a bunch of people who say, right, sure, I'll do it for money. And then we just opt out completely. Right? The result will have to be, they're just,
- 39:00 - 39:30 we can't train on anyone who doesn't give it to us for free. Or, we can't train unless we happen to be Google or some other company that has gotten the ability to collect this information for other purposes and has put somewhere in their terms and conditions that we can use this for whatever purpose we want. If my royalty statements for my photos are any indication, it's about a hundredth of a cent for a photograph, give or take,
- 39:30 - 40:00 I think. I can't remember if it was a thousandth of a cent or a hundredth of a cent, so. Don't spend it all in one place. I know, it's, you know, I think the total for the number of photos I have up in that library for that training data was about a penny. Mark, two responses. One is, you started off talking about compulsory licensing, which is going to be a non-starter. But notwithstanding that, bear in mind, we do have a compulsory license that was implemented in the Music Modernization Act a couple of years ago that creates a blanket
- 40:00 - 40:30 license for making copies of all music subject to a rate set by some administrators in Washington. So we do have that possibility for dealing with millions of pieces of content. It's a so-called... It's sort of a hybrid. It's a compulsory blanket license. And so everybody's work comes under it. On the second observation you made, which I think relates more to Content ID, which I think
- 40:30 - 41:00 is right, is going to be the path going forward. It's a Content ID-like system. The situation there is, sure, there can be plenty of people who say, I want $5,000. And they're going to get rejected. And pretty quickly the market is going to drop down to the 0.5 cents per use or per training use, whatever it is, because that's the most you're going to be paid. It's like a Spotify royalty, which is, you know, close to zero. So I think the market is going to
- 41:00 - 41:30 drive down those demands. I'm not saying $5,000 is unreasonable. What I'm saying is the market is not going to sustain it. Alright, so people wanted earlier to talk about output infringement and I think we want to talk about the shift from training to output infringement. So I noted earlier that training is an existential question, right, because the potential amounts of money here if you can't do it legally are enormous. Output
- 41:30 - 42:00 infringement by contrast is a more specific and targeted problem, right? The vast majority of things generated by generative AI are not substantially similar to any copyrighted work. Alright, they are not infringing, they are not a problem. But sometimes it happens. Right, why does it happen? One reason it happens is what we call in computer science the deduplication problem. Right, so it's not that generative AI decided to copy your
- 42:00 - 42:30 particular text from this particular instance. Right, it turns out there are 10,000 or 15,000 copies of this particular image or of the Harry Potter books floating around on the internet that got into the training data set. And so when you ask it a specific enough prompt, right, you get a result that is an amalgam of the closest universe of things, and all of those closest universe of things turn out to be the same image. Right, and so the result looks like the same image
- 42:30 - 43:00 or looks like the same text. Second reason it might happen is sort of deliberate prompting of infringement. So when I work with the folks in the computer science department, right, if you ask ChatGPT to give you a story about kids who go to a wizarding school in Britain, it doesn't give you Harry Potter or anything similar to Harry Potter. But if you ask it to give you a story about kids who go to a wizarding school in Britain
- 43:00 - 43:30 that begins with, and then you feed into the prompt the first paragraph of Harry Potter, well, then it actually does spit out something relatively close to the first chapter, because it recognizes that that particular unique combination of words is likely to occur only in specific contexts, along with other combinations of words. And we see some examples of the prompting infringement problem in the New York Times lawsuit, right? Where the New York Times says, hey, look, OpenAI spat out our news story. And if you go look behind the scenes
- 43:30 - 44:00 at the exhibits, right? It turns out that OpenAI spat out the New York Times news story when they said, give us a New York Times story with this title that begins with the first nine paragraphs. And, you know, I mean, that is an act of infringement, I think, alright, although the question of who's responsible for it, I think, is an interesting one. And then there's another set of problems which I think of as the Baby Yoda problem, right, and that is, there may just be concepts, right, that the software recognizes as a
- 44:00 - 44:30 concept in the same way it recognizes coffee cups to generate an image. There are enough Baby Yodas out there in the world and they all look similar enough that if you ask it, give me Baby Yoda, it knows what a Baby Yoda is and it will give you a very realistic looking image of something that, it turns out, is copyrighted. So I think Angela said earlier, this is a potential problem. Right? I mean, even if you think training is fair use, generating an image of Baby Yoda
- 44:30 - 45:00 seems much less likely to be a fair use. What do we do about it? I think we're pattern matching to stuff we saw before. And that's, I mean, this is a legal audience. Yeah, I mean, that's our bread and butter. But I just want to see if we can forecast a year or two in the future about what these outputs are. What is AI for at all? People are still trying to figure that out. But I want to relate a very short story.
- 45:00 - 45:30 Someone emailed me, and things were getting a little tense. They sent me a winky face emoji and that was like a hit of dopamine. I was like, oh, maybe things are okay. It wasn't being conveyed in the language, but in that emoji, there was something being communicated that they couldn't before. And at least what we're trying to do, and I think lots of procedural generation companies are trying to do, we're trying to expand
- 45:30 - 46:00 the way that people can make outputs at all. Expand the vocabulary that you can, we want you to be able to communicate new things with each other. And so from that perspective, I think we're getting very stuck just pattern matching to the world we see. And it's like caveman thoughts, like, money there. It's all it. I made it. I want it. Give me money. Or also like, I'm scared because new, how do I do new? What is new for? But I think very soon generative AI is going to be in, it's going to be stuck to
- 46:00 - 46:30 human culture. It's going to be like language. We're not going to be able to communicate with each other without it intermediating. So I think it's really important to think now how we want, what do we want the laws to be. Do we want copyright law to restrict how we think when we start using it as such an inseparable tool in our thinking? Do we want copyright law
- 46:30 - 47:00 to restrict how we communicate with the people we love? I think we're going to see it go, you know, taking Max's challenge, looking a year or two ahead, I think we're going to see some of the litigation go some of the same ways that we saw with the file sharing cases. And I do think, you know, maybe the Grokster case, the inducement liability might, you know, to the example I gave earlier about if you're training an AI algorithm specifically
- 47:00 - 47:30 for a specific purpose, so you're creating one and training it on the Star Wars universe to create Star Wars content and, you know, to encourage people to create Star Wars content that is infringing, I think we're going to see, you know, something along the lines of an inducement liability on something like that. I think it'll be interesting to see with the Midjourney-type and OpenAI ChatGPT stuff, where it's a more general algorithm,
- 47:30 - 48:00 more general purpose algorithm. It's gonna be interesting to see where those go. But I do think, in some cases, we're going to see an inducement form of liability come into play. So, I'll just add on to that a few things, and this is the point that maybe I started to make on the training question that was better suited here. If you imagine a scenario in which a work is generated through an AI tool that plainly incorporates the, you know,
- 48:00 - 48:30 the protected expression of a work, or is an identical copy. I mean, there are easier ways to do that than trying to intentionally engineer a prompt to deliver that. Again, back to our photocopier example. But assume that you are able to generate that. There are any number of perfectly valid and appropriate reasons you might want to do that. If I'm a brand owner that has a really fantastic logo, for instance, and I want to feed that into
- 48:30 - 49:00 a tool to figure out and ideate around potential variations, ways that I might expand upon that, I have every right to do that and it would return content that I own. If I want to ask a tool to produce some image for me that I want to use in an article to highlight, for instance, concerns over the creative content of someone, I'm allowed to use that for news reporting, even if in another context, with another intent, it might be infringing. So there is this,
- 49:00 - 49:30 there is this desire, I think, as we have this conversation, to put everything in yes or no, black or white boxes, but it doesn't quite work that way. It's very contextual. There is no question that these tools can be misused for ill purposes, and copyright law does not allow that. What can't be said is that the fact that these tools are
- 49:30 - 50:00 capable of generating content that in certain instances will look very much like content, copyrighted content on which it is trained, is necessarily improper. And I want to take us back to Warhol. So the professor mentioned that in Warhol, Andy Warhol had been, through a license, entitled to make a silkscreen of a Lynn Goldsmith photograph of Prince. Now, he made more than one. He exceeded the scope of the license. He made 16. And later, when Prince died, a magazine wanted to write an article about Prince.
- 50:00 - 50:30 They went back to the Andy Warhol Foundation and found out that there were additional silk screens and used one that hadn't been authorized. And the court found that's not okay. There was actually a licensing market for this work. In fact, that's how you got your hands on it in the first place. And you used it for the specific purpose that it had originally been licensed, right? You used the photograph of Prince to create a work for illustration of him in an article.
- 50:30 - 51:00 What the court stopped short of saying is that the making of the work in and of itself, the other silkscreens that never appeared anywhere, was not fair. Because until those are deployed, until we know how they're being used, we don't know if it's fair or not. It's sort of a Heisenberg Uncertainty Principle, or, you know. There is an element of, if the painting had been hung in a gallery, or if it had been used for educational purposes,
- 51:00 - 51:30 that may very well have been a fair use. And the court said we can't decide that here. We're not going to decide that here. We're going to decide on this particular use. So I think that's right. Output infringement, unlike training infringement, which is a kind of existential question, is a fact-specific contextual question, right? What particular thing has been generated? And who has generated it, although I do think, to go to Danielle's point, one difference here between this and the internet cases, which may matter a lot for the AI companies, is it's not obvious that your liability is only for inducement, right?
- 51:30 - 52:00 And so one of the questions we resolved early on in the internet was this sort of question of volition, right? If I ask a machine to give me something and it gives me something, who's making the copy? Who's the direct infringer, right? On the internet, right, we were happy to say that if I'm just hosting content that other people put up there, the fact that I sort of deliver it in an automated way doesn't make me the maker of the copy,
- 52:00 - 52:30 it's the person who's asking for it. AI looks a little different though, right? Because I am generating a new thing in response to a prompt. I think there may be a line at which a specific enough prompt that is clearly designed to misuse the system might make the prompter and not the AI directly responsible for infringement. But I think there are gonna be a bunch of things where a prompt generates some output that is infringing, but it is the AI itself that is making the thing and therefore it's the AI itself that is
- 52:30 - 53:00 the direct infringer. That matters, I think, because copyright is a strict liability offense. The fact that you didn't intend to do it, the fact that you sort of took efforts to try to prevent it from happening, won't necessarily mean that you avoid liability. Max. I guess I wonder who would do that? Like, I mean, if you want a direct image, you can just go find it on the internet. Who is using generative AI tools to intentionally copy what's in there? So, plaintiff's lawyers is one answer, right? That's kind of a very small
- 53:00 - 53:30 answer, right? Most of those things have in fact been the plaintiff's lawyers in the case. But it's a fair question, right? So my guess is it's not. It's a very bad copy machine. It's a very inefficient way to use as a copy machine. So I agree with that. In contrast to Grokster. But what it's gonna be, I think, is gonna be more the style case. It's going to be the sort of like, I want something that looks similar to but not identical to this thing, and maybe it's too similar, maybe it's not. But that again is going to be a sort of
- 53:30 - 54:00 case by case. Well, it's also the derivative works more than the direct infringement, just to answer that question. Alright, alright. So, we have lots more things that I would love to talk about but I also see people lining up at questions and so maybe we should hear what you want to talk about. Pablo. Alright. So, Professor Goldstein the fair use. I'm going to use this question, which is sort of an existential issue. You mentioned sort of coming down to the transformation question and noted that there's fuzziness around that, which I think we would, not surprising given where that doctrine has to operate. But it occurs to me that it, you know,
- 54:00 - 54:30 back when you taught us copyright, if you said we're now going to all go brainstorm, the most transformational use cases we can think of, like what is the most, almost a caricature of transforming something. Nobody could have come up with, I'm going to create 30, Pointing randomly and then have them reorient themselves so that they can guess the next word of these texts. Like, that would have been, and so even in a world where there's some fuzziness, what I don't understand is how is this anywhere close to a close call? And
- 54:30 - 55:00 tell me what I'm missing because it seems to me that like, any fuzziness of the doctrine aside, we're so beyond, we're so off the charts for transformation that then game over, no licensing, nothing, we just all go back to, you know. Yeah. As usual, Pablo, you're not far off the mark. You weren't in class and you weren't there. I, no, I didn't mean to say you weren't present in class. You were always present in class. No, I think, you know, the way transformative
- 55:00 - 55:30 use worked in the Google Books case was the use to which ultimately the copied works were put, the snippets from the copied works were put. And so I think transformative use here, in looking at training activities, would say: to what use are these measurements, these tokens, effectively being put? And they're being put to a use that is totally transformative as compared
- 55:30 - 56:00 to the material on which they were trained. So does that take care of it? Good. Okay. And I would just add to that though, that, you know, I wrote an amicus brief in the Warhol case and there were a huge number of amicus briefs. And you know, I think one of the concerns that we had and that others have had was that the focus of fair use analysis has turned to, is it transformative?
- 56:00 - 56:30 Forgetting that there are three other factors, and even transformative is nowhere in the fair use... So it's just been something that has become an easy way to just say, oh, it's transformative, so clearly it's fair use. But I believe, if I recall correctly from the opinion, the court said we have to analyze all four factors.
- 56:30 - 57:00 And I think they even did in the Oracle v. Google case, which also turned on one particular factor, but that, that's gotten lost in fair use jurisprudence, is that transformativeness is not the only determinant factor. I wondered if you could comment on the acquisition of the training data, in particular terms of service on websites. Say I downloaded, I don't know, millions of hours of YouTube videos, let's say, and then built a video generating system, you know,
- 57:00 - 57:30 as an example. Yeah. Just a random example. You know, just a random example. And, you know, I didn't do this, but one may have done this. Yeah. Some would. Yeah, so I think the answer is probably not a copyright problem per se, but quite potentially a sort of breach of contract, breach of terms of use problem, if in fact that's how you get it, right? Now, most of the databases, I think most of the generalized ones, right, have used Common Crawl, right, which respects the robots.txt header. So they said, we're gonna go
- 57:30 - 58:00 crawl the internet for the universe of things that have said, yes, please crawl me and index me in a technical system. I think probably many of the people who set robots.txt to yes didn't have generative AI in mind. They had, I want to appear in search engines, in mind, and it might be that we, in the future, start to distinguish those things, right? Or the LAION database, right? Which is a sort of database of image categorizations that is in turn taken out of Common Crawl, right? That is an effort to sort of get around those
- 58:00 - 58:30 problems. I don't think it gets around all of the copyright problems, in part because many of the people who put up information or data on a website and said, sure, please index me, might have illegally taken that information, right? The Books3 database might be an example of that. But I do think if you are going to a specific website to crawl that website and it does not have a robots.txt header that says no problem, you ought to
- 58:30 - 59:00 be worried about that. Hello, so my question is more about what would be the model for compensating the creators for the content, whether that is text, like a webpage, a video, maybe uploaded on YouTube, or a photography work. So as industry and technology developed, it seems that people found ways to compensate the creators. So, for web, we can embed AdSense
- 59:00 - 59:30 and Google pays the, you know, web page creators for their work. For video, we have YouTube, kind of determining algorithmically how to pay YouTube creators, and, you know, Spotify is doing that for audio. My question: do you think there would be a kind of emergence of maybe a Spotify or YouTube or AdSense, some type of a technology that would
- 59:30 - 60:00 be embedded in the content itself that could help, you know, companies like Midjourney or OpenAI to actually compensate those creators? Paul, you want to take it first? Yeah, I think it's going to be various. It's good. You know, it's interesting that you use the example of Spotify, which is a smoothly functioning, not very low paying, albeit, but a smoothly functioning operation, which rests on a compulsory
- 60:00 - 60:30 license. It's the digital phonorecord delivery compulsory license of Section 115. And that is basically what drives Spotify, and they have periodically to negotiate rates between them and the publishers with the Copyright Royalty Board. But that's one way to do it. Another way is the YouTube way. Interestingly, the legal infrastructure ultimately makes, whether it's an
- 60:30 - 61:00 effectively negotiated license with Content ID or a compulsory license with Spotify, the legal infrastructure at the end of the day makes very little difference. It's the economic arrangements built on top of it that really count. Angela, real quickly, and then I think maybe one more. I just wanted to say it really depends too on whether you're suggesting that there should be some sort of licensing regime for outputs versus training.
- 61:00 - 61:30 I mean, to Mark's point earlier, the only conceivable regime would involve some mechanism for a single training license fee. You cannot conceivably imagine that everybody who's taken a picture of a mountain somewhere, where a platform has been trained on two million pictures of mountains and an output may have a mountain in it that looks nothing like those two million mountains, would be entitled to any kind of compensation.
- 61:30 - 62:00 That's never how copyright law was meant to work. And it would destroy the technology before it's had the ability to come into full fruition, to solve all sorts of problems we haven't even talked about on this panel. I mean, there are amazing, as-yet-undiscovered reasons we need these technologies developed: drug discovery, disease pathways, solving traffic, solving climate. Big questions, right? Not just cats on
- 62:00 - 62:30 surfboards in Venezuela. And I'm so going to Google that now, Angela. If I can just make one comment on the business models, because what we didn't get to, that we were going to talk about, was like the celebrity impersonation and digital avatars, digital replicas. There are a ton of companies out there, and in music actually also, trying to develop compensation and tracking models and different, you know, products to try to handle
- 62:30 - 63:00 this licensing concept. So I think there is, and that to me, right, that's going to work because it is specific to an individual or a few individuals, right? If I want to use your image, right, then I ought to be paying you. Right. I think it's much harder if it's everyone who's ever taken a picture of a mountain should get a billionth of a cent. All right. Last question. Yeah. Question regarding the current court cases. What is unique about them compared to some of the historical ones regarding, you know, copyrighted data, IPs indexed by some of the search engines we have seen in so many cases?
- 63:00 - 63:30 What makes the current cases different regarding usage of radio? I mean, well, they're current, they're ongoing, so we don't know how they're going to come out. I think, as Angela has kind of suggested, right, and Paul suggested in response to Pablo's question, if the question is what does existing precedent say as to training, existing precedent lines up pretty strongly in favor of: this is a very different purpose,
- 63:30 - 64:00 it's a new technology, that's going to be a fair use. Things that might change it, right? A licensing market, right? If we thought there was a working licensing market and therefore you're depriving people of revenue, that can change the fourth factor and it could change the analysis. And techlash, right? We are in a moment of sort of like AI moral panic. And I think that affects judges, and it might well be that a legal decision that in a different technology, in a different kind of psychological
- 64:00 - 64:30 era would clearly have come out in favor of the tech company might come out differently because people are afraid of AI. That's not what should happen, but it might happen. And I think with that, we're going to have to stop. Thank you all.