“Fake Drake”: Vindicating Copyright Ownership in the Advent of Generative AI Music

INTRODUCTION

In April 2023, “Heart on My Sleeve” almost instantly went viral on TikTok, grabbing the attention of millions of viewers who were intrigued by what seemed to be an unreleased collaboration between Drake and The Weeknd.1Amanda Silberling, A New Drake x The Weeknd Track Just Blew Up—But It’s an AI Fake, TechCrunch (Apr. 17, 2023, 9:41 AM), https://techcrunch.com/2023/04/17/uh-oh-an-ai-generated-song-by-drake-and-the-weeknd-went-viral [https://perma.cc/ZAT6-6DG6]. The song not only sounded extremely similar to its alleged vocalists and their music styles, but the lyrics also reflected events and people relevant to their lives, resulting in a very convincing piece of music. But it quickly became clear that this song was, in fact, neither created nor sung by Drake and The Weeknd; instead, it was the product of artificial intelligence (“AI”) music-generating programs used by Ghostwriter977, the poster of the video.2Samantha Murphy Kelly, The Viral New ‘Drake’ and ‘Weeknd’ Song Is Not What It Seems, CNN (Apr. 19, 2023, 9:14 AM), https://www.cnn.com/2023/04/19/tech/heart-on-sleeve-ai-drake-weeknd [https://perma.cc/6DWJ-6E5A]. After the song amassed millions of views across various platforms in just a few days, streaming services pulled it,3The original video of the song posted to TikTok was also seemingly deleted. Id. and those searching for it on YouTube were met with a message stating the video was “no longer available due to a copyright claim by Universal Music Group.”4Daysia Tolentino, Viral AI-Powered Drake and The Weeknd Song Is Removed from Streaming Services, NBC News (Apr. 18, 2023, 12:04 PM), https://www.nbcnews.com/pop-culture/viral-ai-powered-drake-weeknd-song-removed-streaming-services-rcna80098 [https://perma.cc/4YG9-G49J]. Despite the message displayed, Universal Music Group (“UMG”) declined at that time to clarify whether it had formally sent takedown requests. 
Laura Snapes, AI Song Featuring Fake Drake and Weeknd Vocals Pulled from Streaming Services, Guardian (Apr. 18, 2023, 5:37 PM), https://www.theguardian.com/music/2023/apr/18/ai-song-featuring-fake-drake-and-weeknd-vocals-pulled-from-streaming-services [https://perma.cc/MNZ3-ZWGG].

While concerns about this particular song seem to have been adequately addressed by streaming services quickly pulling it from their platforms, the impact of Ghostwriter977’s video was profound and widespread. While generative AI had already aroused questions and concerns generally, 5See, e.g., Abreanna Blose, As ChatGPT Enters the Classroom, Teachers Weigh Pros and Cons, neaToday (Apr. 12, 2023), https://www.nea.org/nea-today/all-news-articles/chatgpt-enters-classroom-teachers-weigh-pros-and-cons [https://perma.cc/35P7-LB4S] (“On the one hand, many educators fear [ChatGPT] . . . encourag[es] new methods of cheating and plagiarism. . . . On the other, [it] . . . appeal[s] to educators who see its potential to improve education.”); Benj Edwards, Artists File Class-Action Lawsuit Against AI Image Generator Companies, Ars Technica (Jan. 16, 2023, 3:36 PM), https://arstechnica.com/information-technology/2023/01/artists-file-class-action-lawsuit-against-ai-image-generator-companies [https://perma.cc/5FNU-TLHW] (“Since the mainstream emergence of AI image synthesis in the last year, AI-generated artwork has been highly controversial among artists . . . .”). “Heart on My Sleeve” directed the world’s attention to the music context. While this is not the first instance of a controversial AI-generated musical work,6See, e.g., Sonia Horon, Drake Responds to AI-Generated Cover of Him Rapping Ice Spice’s Hit Song Munch and Calls It ‘The Final Straw’, Daily Mail (Apr. 14, 2023, 7:31 PM), https://www.dailymail.co.uk/tvshowbiz/article-11974861/Drake-calls-AI-Generated-cover-rapping-Ice-Spices-song-Munch-final-straw.html [https://perma.cc/FRA4-Q96J] (“Drake appeared less than pleased with a recent AI-Generated cover of him rapping Ice Spice’s hit song Munch.”); Jem Aswad, AI and Copyright: Human Artistry Campaign Launches to Support Songwriters and Musicians’ Rights, Variety (Mar. 
17, 2023, 7:17 AM), https://variety.com/2023/music/news/ai-copyright-human-artistry-campaign-musicians-songwriters-artificial-intelligence-1235557582 [https://perma.cc/79QD-WR6V] (noting that the “music industry is alarmed” following instances like David Guetta’s song using an AI-generated Eminem track). the nature and quality of the song revealed just how advanced generative AI technology has become, sparking strong responses ranging from excited curiosity to extreme outrage.7Singer-songwriter Grimes posted on X, in response to “Heart on My Sleeve,” that she would “split 50% [of] royalties on any successful AI generated song that uses [her] voice,” noting, in a reply to her initial post, that she thinks “it’s cool to be fused w[ith] a machine.” Grimes (@Grimezsz), X (Apr. 23, 2023, 6:02 PM), https://x.com/Grimezsz/status/1650304051718791170 [https://perma.cc/X5Q7-8VJV]. A more cautious John Legend conceded that “AI’s going to be a part of our lives, . . . [a]nd that’s fine,” but he believes artists’ “rights should still be protected.” Daniella Genovese, John Legend Calls for Regulation on AI-Generated Music, Fox Bus. (Apr. 27, 2023, 9:07 AM), https://www.foxbusiness.com/lifestyle/john-legend-calls-regulation-ai-generated-music [https://perma.cc/SF9C-ZD7H].

The key question, as artists, labels, and music representatives wave the flag of “copyright infringement,” is whether U.S. copyright law, as it stands today, offers artists legal recourse against AI-generated music. Because the technology is novel and copyright law in the music context is nuanced, there is little of the legal precedent one would usually consult for a more definitive answer. Copyright holders’ concerns are pressing, and nothing suggests that copyright law will soon be amended to address them; predicting how courts will rule in a copyright case of Artist v. AI User therefore requires analogizing to similar cases and drawing on the fundamental principles of, and rationales for, copyright protection.

Copyright is concerned with protecting the rights of creators and encouraging innovation, meaning that there remains an additional concern about being overly restrictive and inhibiting creativity and progress. In the context of AI-generated music and copyright infringement, we are placed at what some deem a crossroads,8A spokesperson for UMG asked, “which side of history [do] all stakeholders in the music ecosystem want to be on: the side of artists, fans and human creative expression, or on the side of . . . fraud and denying artists their due compensation”? Snapes, supra note 4. left to decide whether we value human artists’ creativity and resulting work more or less than we value technological innovation and its potential for important advancements. On one side of this policy debate is the music industry, which generated $15.9 billion in revenue in 2022 in the United States alone,9Jem Aswad, U.S. Recorded Music Revenue Scores All-Time High of $15.9 Billion in 2022, Per RIAA Report, Variety (Mar. 9, 2023, 5:57 AM), https://variety.com/2023/music/news/riaa-2022-report-revenue-all-time-high-15-billion-1235547400 [https://perma.cc/A9AT-YV9E]. and represents an art form that has brought humans together since the beginning of time. There is a high barrier to achieving conventional success in the music industry, which some interpret to mean that only the very best succeed as a result of their hard work and dedication. But the other side of the debate takes these same ideas to highlight how innovative generative AI music should be encouraged. Unlike the music industry, which is extremely difficult to break into, there is a very low barrier to entry for generative AI use, as it is largely accessible and there are many tools one can use to learn how to harness the technology.10Ziv Epstein, Aaron Hertzmann, the Investigators of Human Creativity, Memo Akten, Hany Farid, Jessica Fjeld, Morgan R. 
Frank, Matthew Groh, Laura Herman, Neil Leach, Robert Mahari, Alex “Sandy” Pentland, Olga Russakovsky, Hope Schroeder & Amy Smith, Art and the Science of Generative AI, 380 Sci. 1110, 1110 (2023). Some see this as an opportunity to diversify music and the people making it, which has many benefits. There are strong opinions on both sides, placing this debate squarely within the realm of what legislators anticipated would be a subject of copyright controversy—how can we balance protecting existing creations and encouraging future innovations? 11Artificial Intelligence and Intellectual Property—Part II: Copyright: Hearing Before the Subcomm. of Intell. Prop. of the S. Comm. on the Judiciary, 118th Cong. 2 (2023) (statement of Sen. Christopher A. Coons) (“We should also consider whether changes to our copyright laws . . . may be necessary to strike the right balance between creators’ rights and AI’s ability to enhance innovation and creativity.”).

Absent both a clear answer to this question and any indication that existing copyright law will soon be amended to specifically address potential copyright infringement by generative AI music outputs, we must look to how current copyright law has been interpreted in similar situations. This Note uses case law to shed light on how courts might treat copyright infringement suits involving AI-generated music. To illustrate how current copyright law would apply to real AI-generated music, this Note uses two hypothetical songs as examples, both based on songs that could be created using existing generative AI music systems.12MuseNet, one of the AI systems used, is not currently functional. However, there is significantly more information available about MuseNet than about comparable platforms, and it uses modeling similar to that of other operating platforms, which means this application will be generalizable to similar modeling systems.

Sample Song A is a rap song created by User A using Uberduck.ai (“Uberduck”). Sample Song A was created using a generic punk rap beat provided by Uberduck. The voice used to create Sample Song A is an option specifically labeled as Kanye West in the era of Yeezus, West’s provocative 2013 album. The lyrics are generated by Uberduck, using the prompt “rebellion, slavery, superiority, unapologetic, perseverance, individuality, and power,” all of which are words that have been used to describe West’s reputation, as well as the themes of Yeezus and particularly, the hit song “Black Skinhead.”13Mark Chinapen, Yeezus by Kanye West Retrospective—The Anti-Rap Album, Medium (Jan. 29, 2021), https://medium.com/modern-music-analysis/yeezus-by-kanye-west-retrospective-the-anti-rap-album-39d57d618723 [https://perma.cc/HG57-JZVL]; James McNally, Review: Yeezus by Kanye West, Ethnomusicology Rev. (July 14, 2013), https://ethnomusicologyreview.ucla.edu/content/review-yeezus-kanye-west [https://perma.cc/4TGF-XH4L]. The resulting rap sounds nearly identical to West, with lyrics closely tied to themes he has focused on. The unsuspecting listener may very likely mistake the song for a new release by West himself. While the song sounds like it would fit in with West’s discography, the actual music and lyrics are completely different from any of his prior releases. 

Sample Song B is an emotional ballad, and User B created the musical composition using MuseNet. In creating Sample Song B, they selected Adele as the vocal style for the song, and the selected instrument was limited to piano. The introduction to Sample Song B uses the well-known piano phrase that functions as a melodic hook throughout Adele’s “Someone Like You,” an option provided by MuseNet. This piano segment is arguably the most distinctive musical feature of “Someone Like You,” and is known as an arpeggio, which melodizes chords.14Arpeggio, GW Law: Music Copyright Infringement Resource, https://blogs.law.gwu.edu/mcir/2018/12/20/arpeggio [https://perma.cc/ES9C-RV2L]. The exact piano chords and resulting melody are used—just slightly sped up—but after the introduction, the chords begin to differ. However, the song returns to the piano phrase after the chorus, resulting in a song that is musically similar to “Someone Like You.” User B added lyrics using an outside platform after MuseNet finalized the composition. Sample Song B’s lyrics were written to evoke feelings of both love and despair, and the words themselves speak to a failed relationship, regret, and a longing for love; thus, the song, both lyrically and musically, bears a notable resemblance to “Someone Like You” and Adele’s music generally.15Kitty Empire, Adele: 21—Review, Guardian (Jan. 22, 2011, 7:05 PM), https://www.theguardian.com/music/2011/jan/23/adele-adkins-21-review [https://perma.cc/3W55-NMDN]; Doug Waterman, The Story Behind the Song: Adele, “Someone Like You”, Am. Songwriter (Oct. 12, 2021, 12:59 PM), https://americansongwriter.com/someone-like-you-adele-behind-the-song [https://perma.cc/GN6Q-L4GA]; Michaeleen Doucleff, Anatomy of a Tear-Jerker, Wall St. J. (Feb. 11, 2012), https://www.wsj.com/articles/SB10001424052970203646004577213010291701378 [https://perma.cc/4T3Z-AAJZ]. 
The lyrics are sung in a feminine, mezzo-soprano voice, but unlike Sample Song A, the voice does not directly imitate its style inspiration.

Before applying copyright law to the sample songs, this Note provides relevant background information. Part I introduces generative AI, providing an overview of how the technology works and details on how the systems used to make the sample songs produce musical works. Additionally, the U.S. Copyright Office’s statements about AI are discussed. Part II focuses on current copyright law—what it requires, what it protects, and how infringement actions work. Music occupies a unique area of copyright law because of the separation between the composition and the sound recording, so limitations and exclusions are discussed in detail. Because courts have not specifically addressed AI on many occasions, analogizing to other cases involving technology helps anticipate the judicial response to this novel technology. Part III applies copyright law to the sample songs and predicts likely outcomes. This includes an analysis of how the songs may fare in all steps of an infringement action, from defenses to statutorily imposed limitations on what can be the basis of a lawsuit. This analysis reveals how copyright law might help artists and how it may hurt them. While artists may potentially find support in trademark law or the right of publicity, this Note will focus solely on copyright law as a vehicle for attempting to vindicate their rights. Finally, Part IV discusses policy implications associated with trying to fit AI-generated music into our developed system of copyright law, highlights the key concerns for artists, and points to gray areas that warrant clarification. The conclusion of this Note summarizes anticipated outcomes and the complicated nature of fitting new technology into the current framework of copyright law.

I. BACKGROUND: GENERATIVE ARTIFICIAL INTELLIGENCE

A. How the Technology Works

AI is “a science and a set of computational techniques that are inspired by the way in which human beings use their nervous system and their body to feel, learn, reason, and act.”16Pradeep Kumar Garg, Overview of Artificial Intelligence, in Artificial Intelligence: Technologies, Applications, and Challenges 3, 3 (Lavanya Sharma & Pradeep Kumar Garg, eds., 2022) (citation omitted). More simply, AI can be thought of as “a man-made object with thinking power.”17This meaning can be derived from the root words of artificial intelligence: “artificial” means “human-created” and “intelligence” means “thinking power.” Id. At the foundation of any AI program is data input, a starting point akin to the intake of information that constitutes the first step of the human learning process; the difference between AI and human learning in this respect, however, is that AI systems require massive amounts of data to be effective.18Id. Exactly how a system uses data to produce the desired results depends on its learning approach. The most prominent approaches are machine learning (“ML”) and deep learning (“DL”).

ML is the “most promising and most relevant domain” in which to apply AI.19R. Lalitha, AI vs. Machine Learning vs. Deep Learning, in Artificial Intelligence (AI): Recent Trends and Applications 73, 75 (S. Kanimozhi Suguna, M. Dhivya & Sara Paiva, eds., 2021). ML is a way of learning from big data, and its algorithm is self-adaptive, meaning that through experience, it can identify new patterns and improve “perception, knowledge, decisions, or actions.”20Id.; Christopher Manning, Artificial Intelligence Definitions, Stanford University: Human-Centered A.I. (Apr. 2022), https://hai.stanford.edu/sites/default/files/2023-03/AI-Key-Terms-Glossary-Definition.pdf [https://perma.cc/5SZ9-V94M]. The key feature that distinguishes ML is that the goal is for the algorithm to learn to find its own solutions, as opposed to learning to follow human-defined rules.21Garg, supra note 16, at 9; Philip Boucher, Artificial Intelligence: How Does It Work, Why Does It Matter, and What Can We Do About It?, Eur. Parl. Rsch. Servs. VII (2020). DL uses “large multi-layer (artificial) neural networks”22Manning, supra note 20. (“ANNs”) to carry out tasks.23Boucher, supra note 21, at VI (“Artificial neural networks process data to make decisions in a way that is inspired by the structure and functionality of the human brain.”). DL algorithms “filter[] the input through many layers,” resulting in the ability to “classify and predict the data.”24Lalitha, supra note 19, at 76. “Computational nodes” are created and trained, and ultimately make decisions through a filtering process that is similar to that of the human brain.25Id. (“It is exactly similar to how the human brain filters any information into deep layers to understand in depth.”).
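For readers who want a concrete picture of what “filter[ing] the input through many layers” means, the following is a minimal Python sketch of a two-layer network's forward pass. It is an illustration only: the weights are hand-picked, hypothetical values rather than learned ones, and the function names (`dense`, `relu`, `sigmoid`, `forward`) are assumptions of this sketch, not part of any system discussed in this Note.

```python
import math


def dense(inputs, weights, biases):
    """One layer: each output node is a weighted sum of the inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Pass positive signals through and zero out negative ones."""
    return [max(0.0, v) for v in values]


def sigmoid(value):
    """Squash a raw score into a probability-like number between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-value))


def forward(inputs, layers):
    """Filter the input through every hidden layer, then score it with the last one."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return sigmoid(dense(activations, weights, biases)[0])


# Hypothetical, hand-picked parameters; a real system would learn these from data.
EXAMPLE_LAYERS = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # hidden layer with two nodes
    ([[1.0, 2.0]], [-3.0]),                   # output layer with one node
]

score = forward([1.0, 2.0], EXAMPLE_LAYERS)
```

In an actual ML system the numbers in `EXAMPLE_LAYERS` would not be written by a person; training adjusts them until the network's outputs match the examples it is shown, which is the sense in which the algorithm finds its own solution.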

This Note will focus specifically on generative AI applications, which are created using generative modeling.26Stefan Feuerriegel, Jochen Hartmann, Christian Janiesch & Patrick Zschech, Generative AI, 66 Bus. & Info. Sys. Eng’g 111, 112 (2024) (“[G]enerative modeling aims to infer some actual data distribution . . . [and] [b]y doing so, a generative model offers the ability to produce new synthetic samples.”). Generative AI models have a “machine learning architecture” and use learned patterns to generate new data samples.27Id. There are various generative AI systems, each tailored to a desired output goal; for example, ChatGPT is a generative AI system that generates text and is based on an “X-to-text” model.28Id. Because generative AI is a subset of ML, the training process requires substantial amounts of data. How models are trained can vary greatly, so this Note will focus on the training used for the specific systems that generate music.
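The idea of inferring a data distribution and then producing new synthetic samples can be sketched in a few lines of Python. The example below is deliberately simplistic and rests on stated assumptions: the “training data” is a hypothetical list of note names, the “model” is nothing more than the observed frequency of each note, and `fit_distribution` and `generate` are illustrative names rather than functions of any real system.

```python
import random
from collections import Counter


def fit_distribution(samples):
    """Estimate how often each item appears in the training data."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {item: count / total for item, count in counts.items()}


def generate(distribution, n, seed=0):
    """Draw n new samples according to the estimated frequencies."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    items = list(distribution)
    weights = [distribution[item] for item in items]
    return rng.choices(items, weights=weights, k=n)


# Hypothetical training data: note names observed in existing music.
training_notes = ["C", "C", "E", "G"]
note_distribution = fit_distribution(training_notes)
new_notes = generate(note_distribution, 8)
```

Real generative models learn far richer distributions, including how each element depends on what came before it, but the two steps are the same in outline: estimate patterns from existing data, then sample new output from those patterns.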

B. Generative AI in the Music Context

There are important nuances to note when discussing generative AI systems that create music as opposed to other output domains. Systems that generate music have attracted a lot of attention purely because the output is something we have long considered to be an “innate pursuit of human beings,” as music is viewed as a human expression that encompasses both “creativity” and “collaboration.”29Weiming Liu, Literature Survey of Multi-Track Music Generation Model Based on Generative Confrontation Network in Intelligent Composition, 79 J. Supercomputing 6560, 6561 (2022). While many people remain very opposed to generative AI music,30In response to an AI-generated song intended to be in the style of his music, singer and songwriter Nick Cave stated that the song was “bullshit, a grotesque mockery of what it is to be human.” Sian Cain, ‘This Song Sucks’: Nick Cave Responds to ChatGPT Song Written in Style of Nick Cave, Guardian (Jan. 16, 2023, 7:39 PM), https://www.theguardian.com/music/2023/jan/17/this-song-sucks-nick-cave-responds-to-chatgpt-song-written-in-style-of-nick-cave [https://perma.cc/JJ4E-8L4T]. it is undeniable that the technology has advanced rapidly in ways that have vastly improved the output quality; many generative AI music systems are now able to account for the subtle but important nuances in recorded music and generate output accordingly.31Eric Sunray, Note, Sounds of Science: Copyright Infringement in AI Music Generator Outputs, 29 Cath. U. J.L. & Tech. 185, 192–93 (2021).

Most music-generating systems involve combinations of ML, DL, and ANNs. The sample songs guiding this Note’s application of copyright law to AI-generated music were created with two noteworthy systems, Uberduck.ai and MuseNet, which sit at opposite ends of the technology spectrum. Although the systems differ in relevant ways discussed below, they share a key similarity: both are trained on existing music, so it is almost guaranteed that at least some of the input includes copyrighted songs that train the model to evoke a sound or style.

Uberduck, used for Sample Song A, is a speech synthesis system powered by DL that generates “high-quality and expressive voice output.”32UberDuck, Welcome.AI, https://welcome.ai/solution/uberduck [https://perma.cc/4KUC-376P]. Uberduck utilizes several models for speech synthesis, including SO-VITS-SVC, HiFi-GAN, and other text-to-speech models.33Id. Other models include Tacotron 2 and zero-shot RADTTS. Id. SO-VITS-SVC is a DL model, trained using audio files to convert recordings into singing voices.34Matt Mullen, How to Make an AI Cover Song with Any Artist’s Voice, MusicRadar (Nov. 28, 2023), https://www.musicradar.com/how-to/ai-vocal-covers [https://perma.cc/AWG2-L2JD]. The name SO-VITS-SVC references “SoftVC,” “[c]onditional [v]ariational [a]utoencoder with [a]dversarial [l]earning,” and “singing voice conversion.”35Amal Tyagi, How to Turn Your Voice into Any Celebrity’s (so-vits-svc 4.0), Medium (May 17, 2023), https://medium.com/@amaltyagi/how-to-turn-your-voice-into-any-celebritys-so-vits-svc-4-0-e92222a287e2 [https://perma.cc/W3EM-S3S4]. Using source audio, SoftVC, or “soft voice conversion,” separates a singer’s voice into “frequency bands,” which are encoded to analyze “distinct characteristics” of a voice.36Id.; Benj Edwards, Hear Elvis Sing Baby Got Back Using AI—and Learn How It Was Made, Ars Technica (Aug. 4, 2023, 8:32 AM), https://arstechnica.com/information-technology/2023/08/hear-elvis-sing-baby-got-back-using-ai-and-learn-how-it-was-made [https://perma.cc/EBP5-LMJ5]. A conditional variational autoencoder with adversarial learning uses adversarial training aimed at enabling text-to-speech models to handle more varied data.37Tyagi, supra note 35. 
Lastly, singing voice conversion, which can be thought of like a voice cloner, converts one singing voice into another while maintaining features like pitch, rhythm, and notes from the original input.38Id.; What Is SVC Technology?, Voice.ai (May 10, 2023), https://voice.ai/hub/voice-technology/svc-technology [https://perma.cc/24JZ-F954]. Uberduck also uses HiFi-GAN, which is a specialized variant of the generative model Generative Adversarial Network (“GAN”).39Jiaqi Su, Zeyu Jin & Adam Finkelstein, HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks, 2020 Interspeech 4506, 4506 (2020); K. Rakesh and V. Uma, Generative Adversarial Network: Concepts, Variants, and Applications, in Artificial Intelligence (AI): Recent Trends and Applications 131, 132 (S. Kanimozhi Suguna et al. eds., 2021). GANs use generators and discriminators, which work together in a repeated feedback process to help the generator produce results that pass the discriminator’s authenticity test.40Sunray, supra note 31, at 189. The discriminator is trained to determine whether an audio sample is real or fake, which aids the generator in “better approximat[ing] the distribution of real data,” resulting in more realistic-sounding outputs.41Su et al., supra note 39, at 1. Through its “loss function,” the generator improves its output by incorporating feedback from the error in results, which is the difference between actual and predicted outputs.42Id. This process is illustrated in Figure 1 below. The difference with HiFi-GAN, specifically, is that it is tailored to “transform recorded speech to sound as though it had been recorded in a studio.”43Id. The use of HiFi-GAN is an important component of making the resulting song sound believable. Together, these technologies and the other text-to-speech models work to mimic the voice of an input audio and make it sound as authentic as possible.

Figure 1. The HiFi-GAN Process
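The generator-discriminator feedback loop described above can be made more concrete with the textbook GAN objective, sketched below in Python. This is a simplified illustration under stated assumptions, not HiFi-GAN's actual training code: the inputs `p_real` and `p_fake` are hypothetical discriminator outputs (its estimated probability that a sample is real), and the function names are chosen for this sketch only.

```python
import math

EPSILON = 1e-12  # keeps the logarithms finite at exactly 0 or 1


def _clamp(probability):
    return min(max(probability, EPSILON), 1.0 - EPSILON)


def discriminator_loss(p_real, p_fake):
    """Low when the discriminator rates real audio as real and generated audio as fake."""
    return -(math.log(_clamp(p_real)) + math.log(1.0 - _clamp(p_fake)))


def generator_loss(p_fake):
    """Low when the discriminator is fooled into rating generated audio as real."""
    return -math.log(_clamp(p_fake))


# Hypothetical scores from early and late in training.
early_generator_loss = generator_loss(0.1)  # discriminator easily spots the fake
late_generator_loss = generator_loss(0.9)   # discriminator is mostly fooled
```

Each side's loss is the other's gain: the generator's loss shrinks as its output becomes harder to distinguish from real recordings, which is why repeated rounds of this feedback push the output toward a more realistic sound.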

While both systems use DL, MuseNet, used for Sample Song B, is not a text-to-speech system, and is instead a music composition generator that uses a transformer model, which is illustrated in Figure 2 below. MuseNet uses MIDI files encompassing a wide variety of musical styles as its training data.44Christine Payne, MuseNet, OpenAI (Apr. 25, 2019), https://openai.com/index/musenet [https://perma.cc/2WBS-4T88]. MIDI files, unlike conventional audio files, contain information on the notes and how those notes are to be played, which allows the model to “extract patterns in the way notes are played, with what instruments, and for how long.” Raghav Srinivasan, MuseNet and the Future of AI, Medium (Mar. 31, 2021), https://raghav-srinivasan.medium.com/musenet-and-the-future-of-ai-f0a971fc6ed7 [https://perma.cc/XYA9-NF88]. In training the system, sequential data is provided in the form of sets of notes, and it is asked to predict what the next note will be.45Payne, supra note 44. Data is encoded in a way that “combines expressivity with conciseness.”46Id. Similar to the adversarial elements of Uberduck, MuseNet has an “inner critic” during training which asks the model if a sample was generated by the model or from the dataset.47Id. Additionally, MuseNet created composer and instrumentation tokens which are used during training to teach the model to utilize such information when making predictions; the result is that the model can be conditioned to generate output in a certain style using prompts.48Id. Essentially, MuseNet uses the music styles and MIDI files it has been trained on to generate note sequences that sound realistic, as if human-generated.49Srinivasan, supra note 44.

Figure 2. Transformer Model Training
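The next-note training task described above can be sketched as follows. To stay short and runnable, the sketch swaps the transformer for a simple frequency table, so it illustrates only the shape of the task (context in, next note out), not MuseNet's model. The token spellings such as `style:ballad` and `instrument:piano` are hypothetical stand-ins for the composer and instrumentation tokens the source describes, and all function names are assumptions of this sketch.

```python
from collections import Counter, defaultdict


def make_training_pairs(tokens, context_size):
    """Turn one token sequence into (preceding tokens, next token) examples."""
    pairs = []
    for i in range(1, len(tokens)):
        context = tuple(tokens[max(0, i - context_size):i])
        pairs.append((context, tokens[i]))
    return pairs


def train_counts(pairs):
    """Stand-in for model training: count which token follows each context."""
    table = defaultdict(Counter)
    for context, target in pairs:
        table[context][target] += 1
    return table


def predict_next(table, context):
    """Return the most frequently seen next token, or None for an unseen context."""
    counts = table.get(tuple(context))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical sequence: conditioning tokens followed by a repeating arpeggio.
sequence = ["style:ballad", "instrument:piano", "A3", "C4", "E4", "A3", "C4", "E4"]
pairs = make_training_pairs(sequence, context_size=2)
table = train_counts(pairs)
next_note = predict_next(table, ["A3", "C4"])
```

Because the conditioning tokens sit at the start of the sequence, they become part of the context the model sees, which is one way a prompt for a given style or instrument can steer the predicted notes.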

With the internal side of the technology having been established, the next component is the user side. When using Uberduck—specifically the “AI Generated Rap” feature used to create Sample Song A—the user is able to select a beat from a list of premade generic beats.50AI Generated Rap Beat, Uberduck, https://www.uberduck.ai/app/rap#beat [https://perma.cc/3TPM-RVHG]. The other options are simpler “Text to Voice” and “Voice to Voice” features. Id. After that is chosen, users have a choice to input custom lyrics or utilize Uberduck’s AI lyric generator, which requires entering a detailed “description of what you want your rap to be about.”51Id. Finally, the user selects an artist from a list of “[r]appers” to be the voice of their song.52Id. Users are also able to use their own voice, but that is not relevant to this discussion since there would likely not be anything to point to in the output as infringing if the lyrics are original and one’s own voice is the basis of the audio. Uberduck’s interface has since changed, but previously certain artists had several options, indicating different eras of their music. The end result is a complete rap song. As for MuseNet, the initial prompts include style, introduction, instruments, and number of tokens.53Devin Coldewey, MuseNet Generates Original Songs in Seconds, from Bollywood to Bach (or Both), TechCrunch (Apr. 25, 2019, 1:31 PM), https://techcrunch.com/2019/04/25/musenet-generates-original-songs-in-seconds-from-bollywood-to-bach-or-both [https://perma.cc/Z78E-QWS9]. Style options range from Mozart to Lady Gaga to Disney.54Id.; Payne, supra note 44. Similarly, the introduction options cover a wide range, including the intro from “Someone Like You” by Adele, which is used in Sample Song B.55Coldewey, supra note 53. The number of tokens used corresponds to the length of the song. 
The end product is a musical composition, to which lyrics can be added outside the platform.56This can be done through simple applications, such as GarageBand, or more advanced technology like that used in a professional music studio. An interesting note that could be studied in the future is that, theoretically, lyrics could be generated in the voice of an artist using a system like Uberduck and added to a composition from a system like MuseNet utilizing an outside application. While the result may sound disjointed or unnatural, it may raise interesting copyright or trademark issues with regard to the interaction of vocal style, musical style, and potential fragmented literal similarity with regard to the music.

C. Copyright Office on AI

In August 2023, the U.S. Copyright Office (“Office”) published a notice of inquiry on copyright and AI, which followed the March 2023 launch of the Office’s AI Initiative.57Notice of Inquiry, 88 Fed. Reg. 59942 (Aug. 30, 2023). This inquiry specifically focused on policy issues relating to copyrighted works being used to train models, the copyrightability of AI-generated works, potential liability for AI-generated work that infringes on a copyright, and how to treat AI-generated works that imitate artists.58Id. at 59945. In July 2024, the Office published Part 1 of the Report on Copyright and Artificial Intelligence (“Report”), which addresses the topic of digital replicas.59See generally U.S. Copyright Off., Copyright and Artificial Intelligence Part 1: Digital Replicas (2024). Specifically referencing “Heart on My Sleeve,” the Office concluded that the time has come for a new federal law to address unauthorized digital replicas.60Id. at 7. Notably, the Office uses the term “digital replicas” to refer to “video[s], image[s], or audio recording[s] that [have] been digitally created or manipulated to realistically but falsely depict an individual,” and uses the term “deepfake” interchangeably. Id. at 2. With respect to copyright law specifically, the Office broadly indicated that a victim of a digital replica in the form of a musical work may have a claim for infringement of the copyrighted work, but clarified that a replica of one’s voice alone does not seem to constitute copyright infringement.61Id. at 17. Because Part 1 of the Report provides little insight into the potential viability of such copyright claims and primarily focuses on legislative suggestions, the Office’s previous statements and approaches in similar technology-related contexts remain potentially instructive.

While this inquiry is the Office’s most comprehensive look into AI, it is not the first time it has addressed AI. The Office addressed concerns about technology-generated works in 1965, especially after receiving an application for registration of a “musical composition created by a computer.”62U.S. Copyright Off., 68th Annual Report of the Register of Copyrights 4–5 (1966). Although the issues posed by AI today are, in many respects, far more complex given the vast technological advancements in recent years, the general questions about how non-human-generated works fit or do not fit into copyright have been pondered for nearly six decades. The Office, in operating a copyright registration system, necessarily adjusts its practices according to shifts in technology.63Oversight of the U.S. Copyright Office: Hearing Before the Subcomm. on Cts., Intell. Prop. & the Internet of the H. Comm. on the Judiciary, 113th Cong. 4 (2014) (statement of Maria A. Pallante, Register of Copyrights and Director of the U.S. Copyright Office). In deciding whether to register a claim, a “registration specialist” is tasked with determining whether a work qualifies as copyrightable subject matter and satisfies the formal and legal requirements of the copyright statutes and the Office’s practices.64U.S. Copyright Off., Compendium of U.S. Copyright Office Practices § 206 (3d ed. 2021). As such, the Office’s practices regarding what is registered generally reflect contemporary understandings of the scope of copyright law in light of modern developments.

The question of copyright protection for AI-generated works has notably been addressed in three recent situations. The first situation, which ripened into litigation, involved the Office’s denial of registration for “A Recent Entrance to Paradise,” an artwork created by an AI system, the “Creativity Machine,” which was listed as the author. The Office cited the lack of human authorship as its basis for denial, a requirement that derives from the statutory criterion that protection is extended only to “original works of authorship.”65Letter from U.S. Copyright Off. Rev. Bd. to Ryan Abbott, Esq., at 2–3 (Feb. 14, 2022); 17 U.S.C. § 102. While “original work of authorship” is not defined statutorily, courts have uniformly interpreted it to limit protection to human authors,66See Burrow-Giles Lithographic Co. v. Sarony, 111 U.S. 53, 61 (1884) (using the words “man” and “person” to describe an author); Goldstein v. California, 412 U.S. 546, 561 (1973) (describing an author as an “individual”); Kelley v. Chi. Park Dist., 635 F.3d 290, 304 (7th Cir. 2011) (“[A]uthorship is an entirely human endeavor.” (citation omitted)). and the Office has adhered to that interpretation.67U.S. Copyright Off., supra note 64, at § 306. The Office also rejected the argument that AI can be an author under a “work-for-hire” theory.68U.S. Copyright Off. Rev. Bd., supra note 65, at 6–7 (explaining that an AI system cannot enter into a contract). The user challenged the denial as an “arbitrary, capricious, . . . abuse of discretion . . . not in accordance with the law, . . . and in excess of [the Office’s] statutory authority.”69Thaler v. Perlmutter, 687 F. Supp. 3d 140, 144 (D.D.C. 2023). The court upheld the denial, stating that the lack of human involvement pointed to the “clear and straightforward answer” that the work does not give rise to copyright.70Id. at 146–47, 150 (describing the human authorship requirement as a “bedrock requirement of copyright,” following from the statutory text that limits protection to “original works of authorship”). The court did not address the plaintiff’s theories of ownership but mentioned that “doctrines of property transfer cannot be implicated where no property right exists to transfer in the first instance,” and the “work-for-hire provisions of the Copyright Act” similarly presume that there is an existing right that can be claimed. Id. This situation differs from a second scenario in which the Office registered “Zarya of the Dawn,” a comic book created using an AI system known as Midjourney.71Letter from U.S. Copyright Off. to Van Lindberg 1–2 (Feb. 21, 2023). The images in the book were created by Midjourney in response to the user’s text prompts, but the user did not control the creation process; as such, the images themselves were not protectable based on the human authorship requirement, so copyright extended only to the text she wrote herself and the selection and arrangement of the elements of the book, including the images.72Id. at 6–12. The registration of the work explicitly excluded “artwork generated by [AI].” Id. at 12. The third situation involved the denial of copyright registration for an AI-generated artwork entitled “Théâtre D’opéra Spatial” based on the Office’s conclusion that it contained “more than a de minimis amount of content generated by [AI].”73Letter from U.S. Copyright Off. Rev. Bd. to Tamara Pester, Esq. 1–3 (Sept. 5, 2023). The Office offered to register the work if the user would exclude AI-generated features, as there were some elements of human creation, but he refused and challenged that requirement; nonetheless, the Office stood by the requirement of disclosing AI generation.74Id. at 7–8.

Due to situations like these,75Note that this excludes “Théâtre D’opéra Spatial,” which occurred after the statement. the Office clarified how AI-generated works are examined and registered in a recent statement.76Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence, 88 Fed. Reg. 16190, 16190 (Mar. 16, 2023). In the statement, the Office explains that in making registration decisions about works created using AI, the first question is whether the work is “basically one of human authorship, with the computer [or other device] merely being an assisting instrument,” or if a machine conceived and executed the traditional elements of human authorship.77Id. at 16192. The Office notes that when AI systems receive prompts from humans that enable the generation of “complex . . . musical works,” the author is the technology, not the prompt-writing human, so the resulting work would not be registered.78Id. This scenario is an example of a work in which the “traditional elements of authorship” are attributable to a machine and therefore lack the requisite human authorship for copyright protection. The Office also states that there are cases in which AI is used in conjunction with sufficient human effort to permit registration. In such situations, copyright protects only the human-authored elements.79Id. at 16192–93. While AI adds nuance to registration inquiries, an important takeaway is that the Office stands firmly behind the human authorship requirement.

II. LEGAL BACKGROUND: COPYRIGHT LAW

Codified in Title 17 of the United States Code, the Copyright Act of 1976 (“Copyright Act”), including its subsequent amendments, is the governing source of copyright law.8017 U.S.C. §§ 101–1511. Congressional authority to enact such legislation arises from the “Copyright Clause” in the U.S. Constitution, which vests in Congress the power to “promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.”81U.S. Const. art. I, § 8, cl. 8. Since the enactment of the Copyright Act, there have been many amendments, resulting in a large body of law that simultaneously outlines rules and requirements with specificity and leaves considerable room for judicial interpretation.

A. Requirements for Protection

Under the Copyright Act, copyright “subsists . . . in original works of authorship fixed in any tangible medium of expression, now known or later developed, from which they can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device.”8217 U.S.C. § 102(a). Copyright does not extend to underlying ideas.83Id. § 102(b) (“In no case does copyright protection . . . extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery . . . .”); Harper & Row, Publishers, Inc. v. Nation Enters., 471 U.S. 539, 547 (1985) (“[N]o author may copyright facts or ideas. . . . [C]opyright is limited to those aspects of the work—termed ‘expression’—that display the stamp of the author’s originality.”). The Copyright Act explicitly includes “musical works, including any accompanying words” and “sound recordings.”8417 U.S.C. § 102(a)(2), (7). Generally, the requirements for copyright protection break down into four separate but interrelated requirements: (1) work of authorship, (2) tangible fixation, (3) originality, and (4) creativity.

Legislative history indicates that the phrase “work of authorship” is intended to provide flexibility.85Id. § 102(a); H.R. Rep. No. 94-1476, at 51 (1976). The broad categories of works of authorship in § 102 of the Copyright Act are illustrative, not exclusive.86H.R. Rep. No. 94-1476, at 53 (1976) (noting that the general outline provides for “sufficient flexibility to free the courts from rigid or outmoded concepts of the scope of particular categories”). As mentioned, this requirement has been interpreted to require human authorship, but the Office’s recent statement suggests technology can be involved in the “authorship,” so long as there is sufficient human involvement.87Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence, 88 Fed. Reg. 16190, 16190 (Mar. 16, 2023). What constitutes “sufficient” involvement remains to be determined. A work satisfies the fixation requirement if it is fixed in a “tangible medium of expression” that is “sufficiently permanent or stable.”8817 U.S.C. § 101. A “phonorecord” is defined as a “material object[] in which sounds, . . . are fixed by any method now known or later developed, and from which the sounds can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device.” A “copy,” on the other hand, is a “material object[], other than [a] phonorecord[], in which a work is fixed by any method now known or later developed, and from which the work can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device.” Id. Congress has indicated that fixation form does not matter.89H.R. Rep. No. 94-1476, at 52. A fixed composition may be written sheet music, while a fixed sound recording may be a recording saved onto a compact disc.90 U.S. Copyright Off., supra note 64, at § 803.4.

Fixed works of authorship must also satisfy the requirements of originality and creativity,91Some characterize originality as “embodying creativity,” while others view creativity as a “necessary adjunct to originality.” 1 Melville B. Nimmer & David Nimmer, Nimmer on Copyright § 2.01(B)(2) (Matthew Bender, rev. ed. 2024). Regardless of the characterization, the two require distinction from one another. which require “independent creation plus a modicum of creativity.”92Feist Publ’ns, Inc. v. Rural Tel. Serv. Co., 499 U.S. 340, 346 (1991). The Court in Feist explained that the originality requirement is “not particularly stringent,” as it “requires only that the author make the selection or arrangement independently (i. e., without copying that selection or arrangement from another work), and that it display some minimal level of creativity.” Id. at 358. Therefore, so long as the work is independently created, a lack of novelty does not preclude copyright protection.931 Nimmer & Nimmer, supra note 91, § 2.01(A)(1) (“[A] work is original and may command copyright protection even if it is completely identical with a prior work, provided it was not copied from that prior work but is instead a product of the independent efforts of its author.”). The “modicum of creativity” standard is a relatively low threshold, requiring only that the work goes beyond independent effort94See Feist, 499 U.S. at 345 (“[T]he requisite level of creativity is extremely low; even a slight amount will suffice.”). and bears a “spark of distinctiveness in copyrightable expression.”95Clanton v. UMG Recordings, Inc., 556 F. Supp. 3d 322, 331 (S.D.N.Y. 2021). 

There are unique considerations with regard to these requirements in the context of musical works because determining the requisite creativity in music can be contentious.961 Nimmer & Nimmer, supra note 91, § 2.05(B) (“As applied to music, the requirement of originality is straightforward . . . . It is within the domain of creativity that special considerations rise to the fore.”). It is important to note that courts typically combine originality and creativity under the term “originality,” requiring a closer look at which requirement is really being addressed. Id. § 2.01(B)(2). Creativity is said to inhere in one of three key elements of a musical work: harmony, melody, or rhythm.97Newton v. Diamond, 204 F. Supp. 2d 1244, 1249 (C.D. Cal. 2002), aff’d, 388 F.3d 1189 (9th Cir. 2004). While the typical source of protection for compositions is melody, courts vary in this regard, with sufficient creativity being found and denied on each basis.98See, e.g., N. Music Corp. v. King Rec. Distrib. Co., 105 F. Supp. 393, 400 (S.D.N.Y. 1952) (suggesting that finding creativity in rhythm is rare, if not impossible, and harmony is not likely the subject of copyright in itself); Santrayll v. Burrell, No. 91-cv-3166, 1996 U.S. Dist. LEXIS 3538, at *4 (S.D.N.Y. Mar. 25, 1996) (holding that repetition of a word in a distinct rhythm was copyrightable); Levine v. McDonald’s Corp., 735 F. Supp. 92, 99 (S.D.N.Y. 1990) (suggesting that melody is not required for copyright if sufficient rhythm and harmony are present). Protection for musical works includes “accompanying words” or lyrics;9917 U.S.C. § 102(a)(2). when lyrics and musical elements are integrated into one work, they are protected together and on their own.100Marya v. Warner/Chappell Music, Inc., 131 F. Supp. 3d 975, 984 (C.D. Cal. 2015). Lyrics must also satisfy the requirements for protection, and whether lyrics qualify for protection is highly situation-dependent.101Clanton v. UMG Recordings, Inc., 556 F. Supp. 3d 322, 332 (S.D.N.Y. 2021) (holding that the expression “I’m tryna make my momma proud” does not satisfy the creativity and originality requirement); TufAmerica, Inc. v. Diamond, 968 F. Supp. 2d 588, 604 (S.D.N.Y. 2013) (denying a motion to dismiss the claim which was based on the phrase “say what,” which was both in the song and the title). Note, however, that infringement claims regarding lyrics are often addressed more thoroughly in the context of fair use and substantial similarity.

B. Rights Conferred by Copyright Ownership

Section 106 of the Copyright Act outlines the exclusive rights of a copyright holder, which broadly include reproduction, distribution, adaptation, performance, and display rights.10217 U.S.C. § 106. Actionable copying may pertain to infringement of any of these exclusive rights but must include infringement of at least one.103S.O.S., Inc. v. Payday, Inc., 886 F.2d 1081, 1085 n.3 (9th Cir. 1989) (“The word ‘copying’ is shorthand for the infringing of any of the copyright owner’s five exclusive rights, described at 17 U.S.C. § 106.”). AI-generated music is most likely to implicate the reproduction, adaptation, and distribution rights.

  1. Reproduction Right

The first exclusive right relevant to AI music is the right to “reproduce the copyrighted work in copies or phonorecords.”10417 U.S.C. § 106. The introductory language of § 106 further specifies that copyright owners have exclusive rights to authorize the exercise of the six rights. In the music context, a USB drive containing a sound recording would qualify as a phonorecord, while a written composition of the song, like sheet music, would be considered a copy.105Copyright Registration of Musical Compositions and Sound Recordings, Copyright Off., https://www.copyright.gov/register/pas-r.html#:~:text=A%20musical%20composition%20may%20be,%2C%20spoken%2C%20or%20other%20sounds [https://perma.cc/Z6UG-FKHH]. It is important to distinguish a phonorecord from the actual recording: the sound recording itself is not a phonorecord, but the medium on which it is stored is. To infringe on the reproduction right, the subsequent work must be a tangible, material, fixed object. An important music-specific caveat in 17 U.S.C. § 114 (“section 114”) is that the reproduction right in recordings is “limited to the right to duplicate the sound recording in . . . phonorecords or copies that directly or indirectly recapture the actual sounds fixed in the recording.”10617 U.S.C. § 114(b) (emphasis added). This means that phonorecords with sounds that merely imitate the original sound, as opposed to actually recapturing the original sounds, do not infringe on the reproduction right, “even though such sounds imitate or simulate those in the copyrighted sound recording.”107Id. This has been interpreted as precluding liability for substantially similar imitations of a recording absent any exact copying; this is important in the context of music sampling, as it requires proof of exact duplication.108Bridgeport Music, Inc. v. Dimension Films, 410 F.3d 792, 800 (6th Cir. 2005) (“This means that the world at large is free to imitate or simulate the creative work fixed in the recording so long as an actual copy of the sound recording itself is not made.”).

  2. Adaptation Right

Copyright owners also have the exclusive right to “prepare derivative works based upon the copyrighted work,” as well as to authorize others to do so.10917 U.S.C. § 106(2). A derivative work is one that must be “based upon one or more pre-existing works,” which is interpreted to mean that a latter work incorporates a sufficient amount of the original work to go beyond mere inspiration.110Id. § 101; 2 Nimmer & Nimmer, supra note 91, § 8.09(A)(1). The adaptation right is closely tied to the other exclusive rights, namely the reproduction and performance rights. When a work is deemed to be a derivative, there is a necessary implication that the reproduction or performance right was also infringed because the second work is substantially similar.111Twin Peaks Prods., Inc. v. Publ’ns Int’l, Ltd., 996 F.2d 1366, 1373 (2d Cir. 1993). With respect to sound recordings, the right to produce derivative works is limited to those in which “actual sounds fixed in the sound recording are rearranged, remixed, or otherwise altered in sequence or quality.”11217 U.S.C. § 114(b). The independent fixation exclusion to the reproduction right also applies to the adaptation right.113Id. (“The exclusive rights of the owner of copyright in a sound recording under clauses (1) and (2) . . . do not extend to the making or duplication of another sound recording that consists entirely of an independent fixation of other sounds, even though such sounds imitate or simulate those in the copyrighted sound recording.”). As with the reproduction right, this limitation finds notable importance in the realm of music sampling and licensing.114Bridgeport Music, Inc. v. Dimension Films, 410 F.3d 792, 800 (6th Cir. 2005).

  3. Distribution Right

The third exclusive right relevant to music is the right to “distribute copies or phonorecords of the copyrighted work to the public by sale or other transfer of ownership.”11517 U.S.C. § 106(3). To violate the distribution right, there must be a tangible product, whether a phonorecord or a copy. The distribution right in the music context involves the right to sell copies, like sheet music, and phonorecords, such as CDs, of the musical work to the public. In the context of Internet platforms, specifically music platforms for sharing sound recordings, there are questions as to whether making copyrighted works available to the public constitutes a violation of this right. Although courts have not unanimously agreed on the answer, it seems clear that making sound recordings available for download by the public on file sharing networks is likely sufficient to demonstrate infringement.1162 Nimmer & Nimmer, supra note 91, § 8.11(D)(4)(a). This question would generally relate more to the potential liability of the generative AI platforms themselves, as opposed to users. For more background on the differing interpretations of this question, however, see generally A&M Recs., Inc. v. Napster, Inc., 239 F.3d 1004 (9th Cir. 2001); UMG Recordings, Inc. v. Hummer Winblad Venture Partners, 377 F. Supp. 2d 796 (N.D. Cal. 2005). 
Section 114 names the reproduction and adaptation rights, but not the distribution right, in limiting exclusive rights in a recording to exact copies; however, this omission is likely immaterial because a mere imitation of sounds in the original would seemingly fall outside the definition of the right as applying to distributing copies or phonorecords of the original work.117Section 114(b) only explicitly limits the reproduction and adaptation rights to literal duplications; however, if an independent fixation mimicking sounds is not a copy or phonorecord for the purposes of clauses (1) and (2) of section 106, it seems fair that the same understanding would implicitly apply to clause (3). See 17 U.S.C. §§ 106, 114.

C. Additional Music-Specific Considerations

1. Musical Composition Versus Sound Recordings

One unique aspect of music copyright is that there are two sources of protection in a song: the musical composition and the sound recording.118A musical composition, which itself consists of music and lyrics, is typically the work of composers or lyricists, or both. A sound recording, often in the form of a master recording, is the “physical embodiment of a particular performance of the musical composition.” Hutson v. Notorious B.I.G., LLC, No. 14-2307, 2015 U.S. Dist. LEXIS 170733, at *9 n.2 (S.D.N.Y. Dec. 21, 2015). These are considered distinct elements of a musical work, with each being independently copyrightable.119Prior to the enactment of the Copyright Act, the 1909 Act required musical works to be recorded on sheet music or another manuscript in order to be protected, excluding protection for sound recordings as a matter of statutory law. 1 Nimmer & Nimmer, supra note 91, §§ 2.05(A)(1)(a), 2.10(A)(1)(c). This Note, however, will focus exclusively on musical works that are governed by the Copyright Act, which protects compositions and recordings. While both elements are subject to the same requirements for protection, it is important to distinguish between the two, as the law applies differently to each in certain respects. This distinction plays an overall significant role in infringement actions, from whether something is actionable to what royalties are owed for a use.

While some cases have blurred the line between the composition and recording,120In Bridgeport Music, Inc. v. UMG Recordings, Inc., the court found infringement of the musical composition. Confusingly, however, this was based on the appropriation of elements exclusive to the sound recording, despite the fact that the plaintiff did not own the recording; not owning the recording would seemingly mean infringement of the recording would not be actionable, but the court allowed the suit to proceed. 585 F.3d 267, 276 (6th Cir. 2009). others reflect the importance of keeping them separate, as it is clear that determining applicable case law and potential arguments depends on whether the claim is based on recording or composition. Cases are also revelatory of how outcomes differ based on which element is allegedly infringed.121See, e.g., Newton v. Diamond, 204 F. Supp. 2d 1244, 1250–52, 1260 (C.D. Cal. 2002) (dismissing an infringement claim based on the composition because the alleged infringement related to elements of performance only reflected in the recording, which plaintiff neither owned nor alleged infringed), aff’d, 388 F.3d 1189 (9th Cir. 2004). Pertinent to this Note’s discussion, it is both possible and not necessarily uncommon for a work to infringe on the rights of ownership of the composition, but not the recording. Because infringement of the recording has been read to require actual duplication of sounds, a work that recreates but does not directly sample a guitar solo can infringe on the composition but give rise to no cause of action for infringement of the sound recording. Thus, this Note will continue to emphasize the line between these two elements, and how AI-generated music may or may not infringe on each.

  2. Licensing and Sampling

Licensing and sampling are unique considerations in the music context. Licensing, whether it is compulsory and imposed by the Copyright Act or voluntarily negotiated,122See 17 U.S.C. §§ 114–15. The central licensing provisions in the U.S. Copyright Act (“Copyright Act”) that would potentially be relevant in this context are those in §§ 114 and 115. Section 114 applies to sound recordings and § 115 applies to musical compositions. functions as a means of ensuring that owners are compensated for the use of their work. How licenses are obtained and what they allow a licensee to do depends on what aspect of the musical work is involved and who is seeking to license it. Central to the discussion in this Note, however, is the royalty aspect of licensing. Because the hypothetical uses analyzed in this Note did not involve licensing the songs, the artists did not receive compensation in royalty payments for these uses.

A very common practice in the music industry that potentially implicates the need for obtaining a license is sampling. “Sampling” refers to the practice of incorporating short segments of sound recordings into new recordings.123Newton, 388 F.3d at 1191. Typically, when the word sampling is used, it means there is a literal duplication of some portion of the original work, not merely an imitation.124This may be a question for the factfinder, however, as it is not always clear, or admitted, that a use was effectively “copied and pasted” rather than independently recreated. Because sampling involves using a clip in an identical-sounding way or with limited alterations, the issues presented by sampling usually fall under the substantial similarity inquiry.125Newton, 388 F.3d at 1195 (explaining that the substantiality requirement applies throughout copyright law, including cases involving samples). Courts are divided on how to approach sampling, particularly with regard to whether applying the de minimis doctrine is appropriate. On one end of the spectrum, the Sixth Circuit in Bridgeport Music, Inc. v. Dimension Films held that sound recording owners have exclusive rights to sample their own recordings, which led to the strong recommendation to “[g]et a license or do not sample.”126Bridgeport Music, Inc. v. Dimension Films, 410 F.3d 792, 801 (6th Cir. 2005). The court explained that requiring licensing does not stifle creativity and will be kept under control by the market; it was also noted that sampling is “never accidental” because sampling involves knowledge of taking another’s work, thereby making licensing requirements fair. Id. This indicated a bright-line rule that any unauthorized use of the recording constitutes infringement, dispensing with the substantial similarity requirement as it pertains to sound recordings.127Id. at 801 n.18. This view has been sharply criticized by many courts on the other end of the spectrum. Rejecting the Bridgeport view, the Ninth Circuit in VMG Salsoul, LLC v. Ciccone held that the de minimis doctrine extends to sound recordings, thereby necessitating the usual substantial similarity inquiry.128VMG Salsoul, LLC v. Ciccone, 824 F.3d 871, 880–87 (9th Cir. 2016) (creating a circuit split with its holding that the de minimis exception applies to allegations of infringement involving sound recordings); see also Batiste v. Lewis, 976 F.3d 493, 505–06 (5th Cir. 2020); Saregama India Ltd. v. Mosley, 687 F. Supp. 2d 1325, 1338–41 (S.D. Fla. 2009), aff’d, 635 F.3d 1284 (11th Cir. 2011). As such, the assessment of sampling in AI-generated music will differ based upon whether the court applies a sampling-friendly or sampling-unfriendly approach.

D. Copyright Infringement Actions

To establish an actionable copyright infringement claim, the owner must prove the following: (1) they own a valid copyright and (2) there has been copying of the original expression contained therein.12917 U.S.C. § 501(a)–(b); Feist Publ’ns, Inc. v. Rural Tel. Serv. Co., 499 U.S. 340, 361 (1991).

  1. Ownership of a Valid Copyright

As to the first requirement, valid copyright exists when an original work falls within the protectable subject matter of copyright law and adheres to statutory formalities, including fixation, duration, and national origin.130See Varsity Brands, Inc. v. Star Athletica, LLC, 799 F.3d 468, 476 (6th Cir. 2015), aff’d, 580 U.S. 405 (2017). Additionally, registration of the work with the Office is typically a prerequisite for an infringement claim and serves as prima facie evidence of both a valid copyright and ownership thereof.131Id. at 477. The second prong, ownership, is a legal conclusion based on relevant facts;13217 U.S.C. § 201. ownership is particularly important in the music context given the separation of the composition and recording. Once this is established, one can draw a conclusion as to which exclusive rights the owner has, which then form the basis of an infringement claim.

  2. Copying

Despite extensive similarity, there can be no infringement without copying. Actionable copying must relate to protectable elements of the original work.133Feist, 499 U.S. at 361. This requirement is best understood as consisting of two elements: factual copying and legal copying.134Peter Letterese & Assocs., Inc. v. World Inst. of Scientology Enters., 533 F.3d 1287, 1300 (11th Cir. 2008). Factual copying poses a purely factual question: did the defendant know of the protected work, have access to it, and use it in some way in the production of their work?135New Old Music Grp., Inc. v. Gottwald, 122 F. Supp. 3d 78, 85, 93 (S.D.N.Y. 2015). To establish that the defendant actually copied the original, direct or indirect evidence may be used.136Jorgensen v. Epic/Sony Recs., 351 F.3d 46, 51 (2d Cir. 2003) (citation omitted). Absent direct proof, copying can be established circumstantially if the plaintiff can show the defendant “had access to the copyrighted material,”137Id. (citing Herzog v. Castle Rock Ent., 193 F.3d 1241, 1249 (11th Cir. 1999)). Access speaks to a “reasonable possibility” of access, not simply a “bare possibility.” Gaste v. Kaiserman, 863 F.2d 1061, 1066 (2d Cir. 1988). However, access may be inferred when the works are “so strikingly similar as to preclude the possibility of independent creation.” Repp v. Webber, 132 F.3d 882, 889 (2d Cir. 1997) (citation omitted). and similarities exist between the works that are “probative of copying.”138Jorgensen, 351 F.3d at 51 (citing Repp, 132 F.3d at 889).

Legal copying is often referred to as “improper appropriation” or “substantial similarity.”1394 Nimmer & Nimmer, supra note 91, § 13D.02(B)(2). This Note will use the term “substantial similarity.” Copying does not require verbatim replication of the original work; rather, it requires that the copying result in the production of a substantially similar work.140Ringgold v. Black Ent. Television, Inc., 126 F.3d 70, 74 (2d Cir. 1997) (describing “substantial similarity” as the threshold for whether copying is actionable). Experts describe the question of when similarity rises to the level of “substantial” as one of the toughest questions in copyright law.1414 Nimmer & Nimmer, supra note 91, § 13.03(A) (noting also that a “mere distinguishable variation [may] constitute a sufficient quantum of originality so as to support a copyright in such variation, that same distinguishable variation . . . may not sufficiently alter its substantial similarity to another” (internal quotation marks omitted)). Similarity exists on a spectrum, spanning from the most trivial similarities, which are not actionable, to absolute, literal similarity that renders a second work identical. One approach to similarity divides it into two types: “comprehensive nonliteral similarity” and “fragmented literal similarity.”142Id. Although this distinction has not widely been recognized by courts in an express manner, the terminology has been endorsed in a variety of cases and can be helpful in keeping straight the types of similarities that are presented in this Note’s sample songs. Comprehensive nonliteral similarity speaks to similarity in the “fundamental essence or structure” of a work. Fragmented literal similarity refers to duplication of literal elements of an original, but only in a fragmented manner, such as the exact duplication of only three lines of text. Fragmented literal similarity is often described as a de minimis doctrine, as the question gets at whether a use is de minimis or not.143See Warner Bros. Inc. v. ABC, 720 F.2d 231, 242 (2d Cir. 1983).

Regardless of the type of similarity involved, courts impose one additional barrier for copying of protected elements to be actionable: the copying must not be de minimis.144De minimis non curat lex, usually shortened to de minimis, is a legal maxim that represents the idea that “[t]he law does not concern itself with trifles.” De minimis non curat lex, Black’s Law Dictionary (11th ed. 2019). In the context of copyright, “de minimis copying” can be understood as the opposite of substantial similarity.145Newton v. Diamond, 388 F.3d 1189, 1193 (9th Cir. 2004) (“To say that a use is de minimis because no audience would recognize the appropriation is thus to say that the use is not sufficiently significant.”). While the idea of de minimis copying sounds simple, its application is not necessarily straightforward because it is highly fact dependent. A de minimis determination pertains to both the quantity and the quality of the use; a “simple word count” alone is therefore not enough to determine infringement.146Nihon Keizai Shimbun, Inc. v. Comline Bus. Data, Inc., 166 F.3d 65, 71 (2d Cir. 1999). In the music context, whether uses are deemed de minimis can vary greatly; in one instance, a six-second segment of a four-and-a-half-minute song was deemed a de minimis use,147Newton, 388 F.3d at 1195–96 (concluding that the portion used was neither quantitatively nor qualitatively important to the original work). but in another, a three-second orchestra sequence was not.148TufAmerica, Inc. v. Diamond, 968 F. Supp. 2d 588, 606–07 (S.D.N.Y. 2013) (holding that a sequence was repeated in the original work and ultimately constituted fifty-one seconds, which gave it qualitative and quantitative importance).

Courts have developed a wide variety of approaches to determine when similarity rises to the level of substantial in these types of cases. The three test categories that are most commonly used in similar music-related cases are the extrinsic-intrinsic, ordinary observer, and fragmented literal similarity tests.149There are other judicially formulated tests for substantial similarity, but these three appear to be the most commonly used in music cases, particularly in recent years. While they each take slightly different approaches to determining the presence of substantial similarity, they are all ultimately rooted in the foundational question of whether there is similarity in those elements to which copyright protection would extend.

  1. Fair Use Defense

Section 107 carves out a limitation on exclusive rights, commonly known as the fair use defense. Four factors are considered in determining whether a use is a fair use:

(1) [T]he purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and (4) the effect of the use upon the potential market for or value of the copyrighted work.15017 U.S.C. § 107.

While the Copyright Act dictates that these four factors “shall” be considered, how they have actually factored in has developed over time through judicial interpretation. The seminal case that guides all applications of the fair use defense is Campbell v. Acuff-Rose Music, Inc., a 1994 Supreme Court case that addressed a musical parody.151Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569, 572 (1994) (holding that the commerciality prong of a fair use analysis is insufficient to determine whether a use qualifies for the § 107 exception). The Court cautioned against simplifying the analysis to bright-line rules, emphasizing that fair use determinations must be done on a case-by-case basis, weighing each factor together.152Id. at 577–78 (“The fair use doctrine thus permits [and requires] courts to avoid rigid application of the copyright statute when, on occasion, it would stifle the very creativity which the law is designed to foster.”) (alteration in original) (citation omitted) (internal quotation marks omitted). While the general principles from Campbell remain, the Supreme Court recently addressed fair use again in Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith, in which the Court limited the fair use defense with regard to the first factor’s transformation inquiry.153Andy Warhol Found. for the Visual Arts v. Goldsmith, 598 U.S. 508 (2023). This holding was likely welcomed by lower courts who criticized how the factor had expanded. See Kienitz v. Sconnie Nation LLC, 766 F.3d 756, 758 (7th Cir. 2014) (“[Courts have] run with the suggestion [of transformative use] and concluded that [it] is enough to bring a modified copy within the scope of § 107.”). This will likely have particular salience in infringement cases involving AI because AI is inherently transformative; however, this type of transformation may not hold as much weight under the new understanding of the first factor post-Goldsmith.

While fair use is regularly litigated in copyright cases generally, musicians tend to avoid it.154Edward Lee, Fair Use Avoidance in Music Cases, 59 B.C. L. Rev. 1873, 1877 (2018). This initially seems odd given that the seminal case for fair use, Campbell, involves music; but Campbell is really a parody case. Outside the context of parody,155There has been at least one case finding fair use of copyrighted music by schools, but that is excluded from this discussion because the court found that the use fell “plainly within the enumerated fair use purposes of teaching and nonprofit education,” so the analysis was very different. Tresóna Multimedia, LLC v. Burbank High Sch. Vocal Music Ass’n, 953 F.3d 638, 654 (9th Cir. 2020). Estate of Smith v. Cash Money Records, Inc., is the only federal case recognizing a songwriter’s fair use in copying another song.156Estate of Smith v. Cash Money Recs., Inc., 253 F. Supp. 3d 737, 752 (S.D.N.Y. 2017). This case is described as a music case but involved only lyrics. Some have questioned whether the use should have even really been considered a “musical work” because it was a spoken acapella rap. Lee, supra note 154, at 1876. There is one other case, Chapman v. Maraj, in which the court said the use of part of a song in a non-parodic manner was fair use. Chapman v. Maraj, No. 18-cv-09088, 2020 U.S. Dist. LEXIS 198684, at *34 (C.D. Cal. Sept. 16, 2020). However, in Chapman, the use was never released and was only for “artistic experimentation” while waiting on license approval from the owner. Id. at *33. While artist-defendants have pled fair use in their answers to infringement cases, they typically defend their work on other grounds.157Compare Answer of Defendants at 28, Skidmore v. Led Zeppelin, 2016 U.S. Dist. LEXIS 51006 (C.D. Cal. Apr. 8, 2016) (No. 15-3462) (asserting a fair use affirmative defense), with Skidmore v. Led Zeppelin, 952 F.3d 1051, 1079 (9th Cir. 2020) (affirming conclusion that there was no infringement, but not discussing fair use at all). A 2018 empirical study revealed that, up to that point, no defendant had successfully established a non-parody fair use of another work’s musical notes.158Lee, supra note 154, at 1878. Therefore, any assessment of how fair use will operate in this context is somewhat speculative.

III.  APPLICATION AND ANALYSIS

A.  Sample Song A

Sample Song A is highly similar to “Heart on My Sleeve” by “Fake Drake.” While it sounds deceptively like Kanye West, both in the voice and in that it employs lyrics that intentionally evoke similar themes to his recent works, these similarities are highly unlikely to be cognizable under copyright law for several reasons. Rather than being copyright infringement, this Kanye-inspired song is almost certain to be considered what the courts have called a “soundalike.” But because songs like this have already been the source of contention regarding music and copyright, it is helpful to understand the basis for why this is unlikely to be a successful claim.

For the purposes of this application, it is assumed that there are valid copyrights for the songs from Yeezus that were used in creating Sample Song A, including “Black Skinhead.” It is also assumed that West owns the valid copyrights for both the sound recordings and underlying compositions.159West’s label likely owns the rights to Yeezus and “Black Skinhead,” but the copyright ownership is attributed to West for the ease of application; see Detailed Record View: Registration Record SR0000724178, Copyright Pub. Recs. Sys., https://publicrecords.copyright.gov/detailed-record/26242659 [https://perma.cc/33D7-8XDX] (Yeezus registration); Detailed Record View: Registration Record PA0001890242, Copyright Pub. Recs. Sys., https://publicrecords.copyright.gov/detailed-record/26654806 [https://perma.cc/Q7ZD-ESAZ] (“Black Skinhead” registration). As noted earlier, there may be an important discussion to be had regarding copyright liability on the part of the owner of the AI system or program, as such systems are trained on these songs. For the purpose of this Note, however, that claim is being set aside to instead focus on output liability. Thus, the first requirement of a copyright infringement claim, ownership of a valid copyright, is presumed to be satisfied. This means that West is entitled to the exclusive rights outlined in the Copyright Act. Infringement of one of these rights must be the basis of his claim against User A, which presents just one of many road bumps in an attempted lawsuit based on this type of activity: copying as it relates to his voice or style can pertain only to the sound recording. As such, he is limited to claiming infringement of his rights to reproduce, adapt, distribute, and perform the sound recording.16017 U.S.C. §§ 106, 114. Note that the public performance right noted here is only that which pertains to the sound recording, meaning performance by means of digital audio transmission. Id. § 106(6).

  1. Factual Copying

Whether or not there is any possibility of an actionable claim will depend on the second requirement of copying, which is divided into two prongs: factual copying and legal copying. West’s claim would most likely have to rest on an infringement of a right associated with “Black Skinhead” specifically because satisfying the copying requirements for an entire album comprised of a variety of types of songs seems very unlikely. Turning first to factual copying, this prong asks the question of whether the defendant knew of, had access to, and in some way used the protected work in the production of their work. This requirement would seemingly be satisfied by the AI system’s owner, as the question could be answered by looking at the songs the system is trained on to produce work that sounds like West. However, it is likely more complicated when the infringer is merely the user who is not responsible for or involved with inputting data. While the prompt used by User A strongly suggests their desire and intent to use Yeezus and “Black Skinhead” in some way, it is not obvious whether this satisfies the factual copying requirement. This inquiry raises two key questions: can the use by Uberduck be imputed onto User A or can indirect evidence be used to sufficiently prove factual copying by User A themselves?

While it can arguably be assumed that Uberduck is trained on Yeezus and “Black Skinhead” given its option of West’s voice in the style of Yeezus, it cannot be verified for certain absent an admission from Uberduck’s programmer. However, this is not detrimental to a claim by West because factual copying can be proven using indirect evidence, which requires only demonstrating that defendant had access to the copyrighted work and that there are substantial similarities between the works that are “probative of copying.”161Jorgensen v. Epic/Sony Recs., 351 F.3d 46, 51 (2d Cir. 2003) (quoting Repp v. Webber, 132 F.3d 882, 889 (2d Cir. 1997)). While access cannot be demonstrated by showing a bare possibility that the defendant accessed it, a reasonable possibility of access can.162Id. (citing Gaste v. Kaiserman, 863 F.2d 1061, 1066 (2d Cir. 1988)). Where these two key questions diverge is on how that possibility of access is demonstrated, whether it be access by the system imputed onto User A or access by User A themselves. Starting with the system, the offering of a Yeezus-style voice suggests a reasonable possibility of access to “Black Skinhead” for a few reasons. First, from a technological perspective, Uberduck utilizes DL, which alone requires significant amounts of data input for the system to learn; for a model to be able to replicate West’s voice from a specific album, it can be inferred that the whole album would have been used to provide as much learning material as possible to create the most authentic results. So-VITS-SVC, the specific DL model Uberduck uses to make songs that sound like West, involves a process of using relevant source audios of West to separate out his voice, which is then encoded to analyze and use the distinctive characteristics of his voice from those songs. Additionally, the HiFi-GAN model that Uberduck uses helps to train the generator to recognize authentic versus fake West samples until it can produce highly realistic-sounding speech.

Asserting that the voice can sound specifically like West in Yeezus, together with the technological understanding that this would require as much relevant training data as possible, it seems fair to conclude it is reasonably possible that the system had access to “Black Skinhead,” which is one of only ten songs on the album. Even considering the unlikely possibility that not all ten songs were used to create a Yeezus-inspired voice, it seems very reasonable to conclude that “Black Skinhead” would be used because it was the first single released from the album,163David Greenwald, Kanye West Prepping ‘Black Skinhead’ as First ‘Yeezus’ Single, Billboard (June 28, 2013), https://www.billboard.com/music/rb-hip-hop/kanye-west-prepping-black-skinhead-as-first-yeezus-single-1568684 [https://perma.cc/UD8X-P5BT]. it has since been certified platinum in the United States three times, and West performed it repeatedly,164Gold & Platinum, RIAA, https://www.riaa.com/gold-%20platinum/?se=Kanye+west&tab_active=default-award&col=title&ord=asc [https://perma.cc/RL72-KN2Q].   all of which arguably make it a hallmark of the Yeezus era.165See, e.g., Miriam Coleman, Kanye West Unleashes the Fury of ‘Black Skinhead’ on ‘SNL’, Rolling Stone (May 19, 2013), https://www.rollingstone.com/music/music-news/kanye-west-unleashes-the-fury-of-black-skinhead-on-snl-167279 [https://perma.cc/E7NF-26Y6]; Edwin Ortiz, Watch Kanye West Perform “Black Skinhead” on “Le Grand Journal”, Complex (Sept. 23, 2013), https://www.complex.com/music/a/edwin-ortiz/kanye-west-black-skinhead-performance-on-le-grand-journal [https://perma.cc/LKP8-6ZXB]; Marc Hogan, Drake Welcomes Kanye West for ‘Black Skinhead’ Live in Berlin, Spin (Feb. 28, 2014), https://www.spin.com/2014/02/drake-kanye-west-black-skinhead-berlin-live-video [https://web.archive.org/web/20240524193340/https://www.spin.com/2014/02/drake-kanye-west-black-skinhead-berlin-live-video]. 
It is difficult to imagine a Yeezus-style voice could be trained without the use of this song. Technology aside, access can also be shown through a theory of widespread dissemination,166Three Boys Music Corp. v. Bolton, 212 F.3d 477, 482 (9th Cir. 2000), overruled by Skidmore v. Led Zeppelin, 952 F.3d 1051 (9th Cir. 2020) (overruling the use of the inverse ratio rule). and, for the reasons just stated, “Black Skinhead” was clearly widely disseminated. However, this theory of access is likely not applicable to the system itself outside the context of liability for input.

Having established a relatively strong claim of reasonably likely access, the next question turns on whether that access could be imputed onto User A. Courts have held that there was a reasonable possibility of access by the defendant in certain circumstances in which such access is inferred based on an “intermediary.”167Jorgensen, 351 F.3d at 53. One iteration of this theory of access is that access can be inferred if the intermediary or third party is connected to the copyright owner and the infringer.168Gaste v. Kaiserman, 863 F.2d 1061, 1067 (2d Cir. 1988). Courts that have entertained this argument have varied on the relationship the intermediary must have with both parties, but a key characterization appears to be that it is a “close relationship,” which might be found when the intermediary contributes creative ideas to the infringer, supervises the infringer’s work, or has worked together in the same department as the infringer.169Jorgensen, 351 F.3d at 54–55; Towler v. Sayles, 76 F.3d 579, 583 (4th Cir. 1996); Meta-Film Assocs., Inc. v. MCA, Inc., 586 F. Supp. 1346, 1355–56 (C.D. Cal. 1984); Moore v. Columbia Pictures Indus., Inc., 972 F.2d 939, 942 (8th Cir. 1992). Note that some courts refer to this as the “Corporate Receipt Doctrine,” but not all, and that name might add potential confusion to this analysis. There are two wrinkles in trying to apply this argument here. First, most cases involve the intermediary being given the copyrighted work by the owner.170For example, in Jorgensen, the conclusion of access largely rested on the fact that the intermediary admitted to receiving the work and telling the owner he would forward it to the later infringer. 351 F.3d at 54–55. This is potentially less damaging because it still seems relevant whether the third party heard the song, as this also factors into the conclusions in addition to whether the intermediary was given a copy.171Lessem v. Taylor, 766 F. Supp. 2d 504, 509–11 (S.D.N.Y. 2011). 
Second, the relevant cases involving inferences based on intermediary access have involved a human intermediary.172There are discussions of Internet intermediaries in the context of copyright infringement, but these cases typically involve secondary liability because Internet programs were used to infringe, which is different from the issue of access. This may be particularly problematic for a plaintiff in a situation like West because it is hard to apply a framework of a close human relationship to the relationship between a computer program, a user, and input data. However, given the novelty of generative AI technology and the unique issues presented by generative AI music, there is a chance courts will not deem this fatal.

One reason to think courts may be flexible here is the expanded willingness to hold Internet intermediary sites vicariously or contributorily liable for failing to monitor infringing material available on or through the use of the Internet’s system.173See generally A&M Recs., Inc. v. Napster, Inc., 239 F.3d 1004 (9th Cir. 2001) (embracing an expansive understanding of vicarious liability in holding a music downloading platform liable for infringement by users). While this speaks more to potential liability of the system as the sole infringer, it may still help convince a court to accept arguments based on non-traditional assistance in infringement, which is required here to first find the technology to have been an intermediary, and then impute liability onto a user. An indication that courts may be less likely to consider an AI system to be an intermediary turns on the assessment of AI in Thaler v. Perlmutter. As discussed, the court in Thaler emphasized the importance of human authorship for copyright protection.174Thaler v. Perlmutter, 687 F. Supp. 3d 140, 142 (D.D.C. 2023). The court rejected the plaintiff’s “work-for-hire” argument, which he used to suggest that he had hired the AI system to create the painting for him. It did so for several reasons, but most importantly noted that such provisions of the Copyright Act clearly contemplated only the involvement of humans as employees and that the contractual relationship outlined in the provision required a meeting of the minds that cannot occur with a non-human entity.175Id. at 150 n.3. While again, this speaks to a different type of imputation onto technology, it nonetheless reflects a hesitancy to treat technology itself like a human. This provides good reason to question whether a court would find an AI system to be a sufficient intermediary to justify an inference of access.

Given that courts have at times expressed the need to be careful in imposing liability when infringement is not done directly,176Metro-Goldwyn-Mayer Studios Inc. v. Grokster, Ltd., 545 U.S. 913, 929 (2005) (explaining that there is a concern about imposing indirect liability based on the potential that it might “limit further development of beneficial technologies”). The Court in Grokster found that there was a powerful argument for imposing indirect liability in those circumstances, given the amount of infringement that was occurring on the platform, which was the party being held indirectly liable. Id. it is worth considering the possibility that a court assessing generative AI may have trepidations about holding a user liable for infringement that may technically be executed through the complex algorithm of an AI system without any input from the user besides a brief prompt.177Similar concerns may apply in a lawsuit against the platform, especially at this point when there remains much to be learned about how the technology actually works; however, this Note is focused on the liability of users, as the current state of technology often involves the use of multiple different platforms. However, case law has consistently indicated that a finding of infringement is not dependent upon finding that the defendant intended to infringe.178See Coleman v. ESPN, Inc., 764 F. Supp. 290, 294 (S.D.N.Y. 1991) (“Intent is not an element of copyright infringement.”); Pinkham v. Sara Lee Corp., 983 F.2d 824, 829 (8th Cir. 1992) (“[D]efendant is liable even for innocent or accidental infringement.”) (internal quotation marks omitted). As such, it seems unlikely that an individual could escape potential imputation of access by simply arguing they intended to use the system to create a new song, not to infringe on the copyright of another.

Assuming the inference of access could not be imputed onto User A by way of an intermediary theory, there remains the question of whether factual copying by User A can be proven through the same indirect evidence approach without any imputation or involvement of the AI system. As mentioned earlier, one avenue for demonstrating a reasonable possibility of access is by pointing to widespread dissemination of the song, which certainly seems like an available option here.179Three Boys Music Corp. v. Bolton, 212 F.3d 477, 482 (9th Cir. 2000), overruled by Skidmore v. Led Zeppelin, 952 F.3d 1051 (9th Cir. 2020) (overruling the use of the inverse ratio rule). This assertion is likely bolstered by the fact that User A clearly knew of Yeezus, as they selected the Yeezus style, and had to have been familiar with the album generally because of the themes in their prompt. These facts, in addition to the widespread dissemination of the song and selection of a rap beat and lyrical themes so similar to “Black Skinhead,” form a strong basis for concluding there is a reasonable likelihood of access to the song by User A. The potential issue that could arise is that User A may argue that they were not involved in the creation aside from the prompt and the few general selections. They may try to argue that, even if they had heard the song, this would not matter because their awareness was not involved in the actual creation of the song or what it sounds like. Ultimately, this would likely come down to a determination of whether the selections and prompt constitute sufficient involvement in the creation, but it seems possible that it would be enough because User A did in fact direct Uberduck in a very pointed direction, even if they did so through simple or general means. Additionally, this is unlikely to be where West’s case completely crumbles, and User A has stronger, more important arguments in other areas.

Even if access is proven, the factual copying prong remains unsatisfied until West can demonstrate probative similarity. The probative similarity prong is likely much more straightforward in this case than the access prong. The idea behind probative similarity is that, combined with a reasonable probability of access, a level of similarity will give rise to a reasonable inference that the copyrighted work served as the source for the allegedly infringing work.1804 Nimmer & Nimmer, supra note 91, § 13D.06. Determining the presence of probative similarity requires an examination of the two works as wholes to assess whether similarities are those which would not be expected to arise independently.181Id. An important difference between this inquiry and the legal inquiry of substantial similarity is that probative similarity is not limited to protectable elements, meaning the inquiry takes a holistic approach focused on drawing a historical conclusion as to whether the copyrighted work was the basis in some way for the second work.182Positive Black Talk Inc. v. Cash Money Recs. Inc., 394 F.3d 357, 369–70 n.9 (5th Cir. 2004). This could give West a small glimmer of hope because the songs may sound sufficiently similar when compared side-by-side, especially given that unprotectable elements of his style and voice can technically be considered. Because the song sounds like West and expresses themes common to “Black Skinhead” and Yeezus more generally, a jury looking holistically at the two songs may find the similarity to be probative of copying. The level of similarity required to satisfy this requirement is lower than that of substantial similarity, as West must show only that Sample Song A overall is similar to “Black Skinhead” in a way that would be unexpected had User A not had access to the original.183Id. at 370; see also Ringgold v. Black Ent. Television, Inc., 126 F.3d 70, 75 (2d Cir. 1997) (explaining that the factual copying requirement of probative similarity “requires only the fact that the infringing work copies something from the copyrighted work; . . . [substantial similarity] requires that the copying is . . . sufficient to support the legal conclusion that infringement (actionable copying) has occurred”). But this is an uncertain outcome because it ultimately comes down to the jury’s assessment of how the songs actually sound and is not dependent upon any legal criteria aside from the general rule of what probative similarity is. Although there is a chance West might prevail on factual copying by demonstrating access and probative similarity, any such victory is likely short-lived because the legal copying inquiry remains.

  2. Legal Copying

The end of the road for those like West who seek to vindicate their exclusive rights by legally challenging soundalikes almost certainly comes at the legal copying phase, if the claim even reaches that point. The substantial similarity prong of the copying requirement raises questions that a song like Sample Song A cannot satisfactorily answer. The chief problem here is that we are assuming the only real similarity is that it sounds like West’s voice or is sung in his distinctive style, neither of which is a copyrightable element of his work. The exclusion of voice and style from the scope of copyright protection was confirmed solidly in the well-known case Midler v. Ford Motor Co., in which Bette Midler lost on a claim of infringement based on a soundalike song that mimicked her voice almost exactly; the infringement claim relied solely on her voice, as the user had obtained rights to the song itself.184Midler v. Ford Motor Co., 849 F.2d 460, 461–62 (9th Cir. 1988). The Ninth Circuit stated bluntly that “voice[s] [are] not copyrightable,” as they are not fixed works of authorship as required by the Copyright Act.185Id. at 462. While West may try to point to the similar themes, copyright extends only to expression and not ideas. Regardless of what test is used, when a work is substantially similar only in regard to separate, unprotectable elements, there can be no infringement. There are instances in which unprotectable elements together can form the basis of substantial similarity, but that would not be possible when two songs do not sound alike aside from the voice and general genre or theme. Absent some concrete similarity, such as instrumental interludes, phrases, or even lyrics, there can be no actionable substantial similarity. Section 114 of the Copyright Act likely blocks this type of claim, as it states that the reproduction and adaptation rights do not extend to independent fixations, even if the recording imitates a copyrighted recording.18617 U.S.C. § 114(b). Therefore, Sample Song A would not qualify as a derivative work because, as a mere imitation, it cannot infringe on the adaptation right.

While all signs point to dismissal, there are two potential unique considerations that may be worth discussing. First, there is the question of whether Sample Song A should be considered a reproduction and adaptation, even though it is not the exact same, because the exact song was used to train the outputs of the generative AI system. Technically, AI is trained to the point that it can create its own patterns, but ultimately those are still developed using the copyrighted work. In the case of Sample Song A and Uberduck, So-VITS-SVC isolates the artist’s voice, uses that voice to create and encode frequency bands that correspond to the distinctive characteristics of the voice in that audio, and then learns to make audio that uses those frequencies. There is potentially an argument that this is a literal reproduction of sounds in a way that should be separated from the intangible concept of a voice or style, and instead look at it like a remixed sample of audio of West’s voice.187This argument would require convincing a court that the use of frequencies extracted from the songs is equivalent to sampling a section and remixing it to say something else. While from a technological standpoint this could theoretically be true, it is both a stretch and would be difficult to prove those frequencies came from a certain song in the first place. Under this theory, not only could the use be an infringement of the reproduction and distribution right, but Sample Song A would also potentially qualify as a derivative work, as it is a new song based on parts of West’s recording in “Black Skinhead.”188Frisby v. Sony Music Ent., No. 19-1712, 2021 U.S. Dist. LEXIS 51218, at *26–27 (C.D. Cal. Mar. 11, 2021). If this were to be considered a sample, under the Bridgeport view, this would qualify as infringement without even delving into the substantial similarity inquiry.189Bridgeport Music, Inc. v. Dimension Films, 410 F.3d 792, 801 (6th Cir. 2005) (“Get a license or do not sample.”). 
However, this is far from the only approach to sampling. Likely, the question of substantial similarity will remain central to determining whether this use of sampling constitutes infringement. As already discussed, Sample Song A and “Black Skinhead” cannot be substantially similar because their chief “similarity,” West’s voice and style, is not a protectable element of the song, so it would not be able to serve as the sole basis for infringement under any of the judicial tests. The use of West’s vocal frequency bands would likely be deemed a de minimis use, which is a use in which “the average audience would not recognize the appropriation.”190VMG Salsoul, LLC v. Ciccone, 824 F.3d 871, 878 (9th Cir. 2016) (quoting Newton v. Diamond, 388 F.3d 1189, 1193 (9th Cir. 2004)). It seems very unlikely that the average audience would recognize Sample Song A’s use of vocal frequency bands extracted from “Black Skinhead” and West’s other music, even though they might recognize that the voice generally sounds alike. This is certainly more complicated than an ordinary sampling inquiry because the use involves very small fragments used in very different ways; nonetheless, because the statutory language prohibits only that which is actually duplicated, the substantial similarity inquiry and de minimis interpretation would have to be based solely on those exact duplications of frequency bands. As such, if this is considered sampling, it would nonetheless likely be dismissed as a de minimis use.

However, even if the use is considered sampling, fair use will likely be an issue for West, whether or not the legal copying issue is addressed with a substantial similarity inquiry. If the sets of sounds from the source audio were actually sampled to make Sample Song A, they are fundamentally different because the frequencies inherently change when forming sounds that say different words. Therefore, if that could be considered an exact reproduction and adaptation of those sounds, it seems likely that a court would find that to be a fair use. While Goldsmith instructed the transformation inquiry to be reined in, this type of use is undeniably transformative in a way similar to the code transformed in Google LLC v. Oracle America, Inc.191See Andy Warhol Found. for the Visual Arts, Inc. v. Goldsmith, 598 U.S. 508, 527–41 (2023); Google LLC v. Oracle Am., Inc., 593 U.S. 1, 29–32 (2021). While the basis for the sound of West’s voice, the frequencies, were used, they were manipulated and restructured to such a significant degree, as evidenced by the creation of an entirely new set of lyrics rapped. This is comparable to the reverse engineering of object code in Sega Enterprises Ltd. v. Accolade, Inc., in which the Ninth Circuit found reverse engineering in order to transform code into something entirely new to be a fair use.192Sega Enters. Ltd. v. Accolade, Inc., 977 F.2d 1510, 1514–15 (9th Cir. 1992). In Sega, the court rejected the argument that a use in order to create competing products precludes a fair use finding, and emphasized the need to focus on several factors, including but not limited to commercial purposes; there, the use of copyrighted code was to understand the program’s mechanisms and then create something entirely new that would be compatible with the program, which outweighed its purpose of creating an ultimately commercial product.193Id. at 1522–23. 
Here, the decoding of songs into frequency bands could be understood as an attempt to understand why West’s voice sounds the way it does, and the subsequent use of such frequency bands to say new words and make an entirely new song is a transformative purpose sufficient to count toward a fair use. While User A likely hoped their song would achieve commercial success, that does not negate the transformative purpose behind their use of frequency bands from West’s music. Thus, the first fair use factor leans strongly in favor of the user.

As to the second factor, the nature of the work, West’s music is inherently creative, which tends to count against fair use.194Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569, 586 (1994). However, this is often not the most significant factor, and courts have not refused to find a fair use in situations involving creative works.195The work at issue in Campbell was a song, as well, which is a work “closer to the core of intended copyright protection.” Id. The third factor, amount and substantiality used, counts very strongly in favor of fair use. Vocal frequency bands constitute a very small amount of everything that goes into a song. Considering that all other elements, including instrumentals and lyrics, are entirely different, the use of frequency bands is a minor taking from the original, although West may try to argue that whole songs, presumably including “Black Skinhead,” were encoded. In Sega, in which the entire program was encoded, the court noted that while that fact counts against fair use, the factor is of little weight when the actual use of that information is so limited.196Sega, 977 F.2d at 1526–27. Here, certainly not all of that which is encoded is used. What was encoded was a sufficient amount of frequency bands to analyze and understand vocal characteristics for future imitations;197Google LLC v. Oracle Am., Inc., 593 U.S. 1, 34 (2021) (“The ‘substantiality’ factor will generally weigh in favor of fair use where, as here, the amount of copying was tethered to a valid, and transformative, purpose.” (citation omitted)). while this may have involved a large number of frequency bands, that was what was required to serve the ultimately transformative purpose of creating a high-quality song that did not itself utilize all that was encoded for training purposes.198Estate of Smith v. Cash Money Recs., Inc., 253 F. Supp. 3d 737, 751 (S.D.N.Y. 
2017) (finding that the third factor counted toward a fair use finding because the amount taken in sampling a song was “reasonable in proportion to the needs of the intended transformative use”). Because the third fair use factor asks about substantiality as well, there is an opening for West to try to argue that, even if frequency bands are one small part of a song, they are nonetheless substantial in relation to the whole work because they are responsible for creating his distinctive voice. This argument would face a few barriers, the first being that it is completely acceptable to make a song that simply sounds like someone else. Additionally, he may have a more compelling argument if those vocal frequencies were placed together and used to rap lyrics from one of his songs. But the frequency bands themselves, isolated from the other bands that together create his voice, are hardly the “heart” of his original work, especially with how they have been changed in Sample Song A.199Elsmere Music, Inc. v. Nat’l Broad. Co., 482 F. Supp. 741, 744 (S.D.N.Y. 1980) (holding that a small use was nonetheless substantial because the small amount used happened to be the “heart of the composition”).

The fourth fair use factor, the effect on the market, has received limited attention in the context of music. However, in Frisby v. Sony Music Entertainment, the court noted that two songs in the similar genres of rap and hip-hop were marketplace competitors.200Frisby v. Sony Music Ent., No. 19-1712, 2021 U.S. Dist. LEXIS 51218, at *40–41 (C.D. Cal. Mar. 11, 2021). As competitors, one copying the other could reasonably be expected to diminish the value and sales of the original.201Id. Here, Sample Song A and “Black Skinhead” are certainly in the same genre, so they may properly be considered competitors in the music market. Following the line of reasoning in Frisby, this means it can be assumed that Sample Song A would have a negative impact on the value of “Black Skinhead” and, further, would harm the market for derivatives because it was used without a license.202Id. at *41 (explaining that the harm to the market for derivatives must also be considered). Because sampling is so prevalent in the rap and hip-hop genres, this is particularly relevant here; West may argue that finding this a fair use would set the precedent that following proper sampling procedures is unnecessary. However, the facts here separate this case from that of Frisby because the potential sampling that occurred could have easily gone unnoticed absent the knowledge that it was created using an AI system that had encoded these vocal frequencies. To suggest that this use of “Black Skinhead” would have such a chilling effect on licensing in the industry seems to be taking Frisby’s presumptions too far.

Taking all four factors together, it seems that the highly transformative purpose and minimal amount used may be enough to weigh in favor of finding this to be a fair use, especially in light of the highly speculative arguments about market harm given that this does not involve sampling in the traditional sense. However, because the fourth factor is “undoubtedly the single most important element of fair use,”203Harper & Row, Publishers, Inc. v. Nation Enters., 471 U.S. 539, 566 (1985). it is possible that if a court adopts the view that sampling without a license has such an impact on the market for future derivatives, the fourth factor could be enough to compel the finding that this is not fair use. Of course, this would be a judicial determination, so it is not impossible that a court would accept these arguments, but it does not seem overly promising at this point. Given how courts have viewed voice and style thus far, it seems like a stretch to imagine the argument that vocals are really just compilations of protectable sounds would suddenly work because of the technology involved.

The second consideration is that some may believe Williams v. Gaye opened the opportunity to argue style infringement. While the dissenting opinion in Gaye criticized the majority’s conclusion as endorsing the idea of copyright protection for a musical style,204Williams v. Gaye, 885 F.3d 1150, 1183–86 (9th Cir. 2018) (Nguyen, J., dissenting). the bases for infringement included elements like signature phrases, hooks, and structural similarities.205Id. at 1172. These were similarities that, although alone may not have been protected, together led to substantial enough similarity that a jury concluded rights had been infringed. While these elements could be considered aspects of the plaintiff-artist’s style, they clearly went beyond sounding like a voice. Additionally, Gaye focused on the composition, whereas Sample Song A’s mimicking of West’s voice could only speak to infringement of the recording because the alleged similarities relate only to what the vocals sound like, which is not fixed on a page like the phrases in Gaye. Putting aside the differences between Sample Song A and the infringing song in Gaye, a key weakness in West’s style argument and whether Gaye made that argument an option is that this idea has not been embraced by other courts. While some courts have embraced a “total concept and feel” test for substantial similarity, both on its own and as part of an “intrinsic” test,206See infra Sections III.B.2.i–ii. that test requires at least a claim based on original arrangement of unprotected elements.207Skidmore v. Led Zeppelin, 952 F.3d 1051, 1074 (9th Cir. 2020) (“We have extended copyright protection to a combination of unprotectable elements . . . only if . . . their selection and arrangement [are] original enough that their combination constitutes an original work of authorship.”) (citation omitted) (internal quotation marks omitted). Without some protectable element, whether it be lyrics or a drum beat,208See, e.g., New Old Music Grp., Inc. v. 
Gottwald, 122 F. Supp. 3d 78, 95 (S.D.N.Y. 2015). a similar “feeling” song will not pass a substantial similarity test.209See Skidmore, 952 F.3d at 1064 (explaining that “only substantial similarity in protectable expression may constitute actionable copying that results in infringement liability”) (emphasis added). Here, even if a lay person has an initial reaction that the songs sound similar because the voice mimics West, that, again, is not copyrightable. Given that there are no elements of the instrumental track or lyrics to be the basis of this claim because these are original lyrics and a generic rap beat unlike “Black Skinhead,” the mimicking of his voice is the only thing West could point to and that cannot pass the test. Therefore, even if Gaye introduced a way to litigate style infringement, which is debatable given other courts’ avoidance of such a conclusion, it appears that there must be some sort of protectable expression in that style to base one’s claim on. While West’s voice may evoke a certain aesthetic style and certainly speaks to his creative expression, there is nothing in that expression that can be the source of a successful claim here.

None of this discussion is intended to minimize the frustration on the part of West and similarly situated artists who understandably want to fight back against AI-generated songs that intentionally mimic their voices and do so in a way that misleads listeners. This certainly reflects Drake’s perspective in response to “Heart on My Sleeve,” which nearly duped the world.210See Snapes, supra note 4 (following “Heart on My Sleeve,” Drake also fell victim to an AI-generated verse added to an Ice Spice song, to which he responded, “[t]his is the final straw AI.”). However, these valid concerns do not bear a clear or logical connection to copyright law and its subject matter. Instead, these concerns likely find more coherence in the protections afforded by the laws relating to trademark, unfair competition, and state rights of publicity, which are tailored to protect against the unauthorized use of one’s identity.211Jennifer E. Rothman, Navigating the Identity Thicket: Trademark’s Lost Theory of Personality, the Right of Publicity, and Preemption, 135 Harv. L. Rev. 1271, 1272 (2022). This is not to suggest that such claims are certain to be successful, or even actionable, but the aims of those laws, which include protecting identity, are likely more amenable to the concerns of West and others.212There may be barriers in these cases if there is reason for federal copyright law to preempt the rights of publicity. See generally Laws v. Sony Music Ent., Inc., 448 F.3d 1134 (9th Cir. 2006) (holding that right of publicity claims were preempted by the Copyright Act because the subject matter of the claim fell within the subject matter of the Copyright Act and the rights asserted were equivalent to those contained in the Copyright Act).

B. Sample Song B

Unlike Sample Song A, Sample Song B presents questions of infringement that, on their face, seem more likely to be answerable with copyright law. While Sample Song B also seems to generally mimic Adele’s style in “Someone Like You,” it importantly incorporates more than that, particularly by way of a nearly identical melodic hook. As with Sample Song A, it is assumed that Adele owns a valid copyright in both the sound recording and the musical composition of “Someone Like You.”213As with Sample Song A, this is for the purpose of streamlining the application, even though she likely does not own both herself; see Detailed Record View: Registration Record PA0001734868, Copyright Pub. Recs. Sys., https://publicrecords.copyright.gov/detailed-record/24702018 [https://perma.cc/ESH4-UFW8] (registration record for “Someone Like You” CD). Accordingly, Adele would have a potential claim for infringement upon her rights of reproduction, adaptation, distribution, and performance. With valid ownership established, the inquiry begins with the copying requirement as it pertains to the composition.

  1. Factual Copying

The trajectory for proving factual copying is much clearer for Sample Song B. On MuseNet, User B specifically selected the introduction from “Someone Like You” by Adele, and that introduction, though slightly modified, is present from the starting note of Sample Song B. If admitted or witnessed, this would constitute direct evidence of factual copying. However, direct proof is often unavailable because “[p]lagiarists rarely work in the open.”214Johnson v. Gordon, 409 F.3d 12, 18 (1st Cir. 2005). Nonetheless, it seems very likely that indirect evidence would satisfy this requirement. Regarding access, the theory of widespread dissemination would operate well here. When dealing with songs that have gained notable popularity, plaintiffs have tended to invoke a variety of data points to support theories of widespread dissemination including references to airplay frequency and locations, Billboard charts, certifications, record sales, nominations and awards, and royalty revenues.215Batiste v. Lewis, 976 F.3d 493, 503 (5th Cir. 2020). See generally ABKCO Music, Inc. v. Harrisongs Music, Ltd., 722 F.2d 988 (2d Cir. 1983) (pointing to statistics such as weeks on the Billboard chart to support a theory of widespread dissemination); Guzman v. Hacienda Recs. & Recording Studio, Inc., 808 F.3d 1031 (5th Cir. 2015) (explaining that the lack of data representing widespread dissemination was problematic for the argument of inferring access). Here, Adele will be able to construct a very convincing claim of widespread dissemination because she can invoke all of these data points with regard to “Someone Like You”: the song has been streamed over two billion times on Spotify alone;216Adele, Spotify, https://open.spotify.com/artist/4dpARuHxo51G3z768sgnrY [https://perma.cc/QK28-W7PB]. won several awards, including a Grammy;217Grammy Awards 2012: Winners and Nominees, L.A. Times (Mar. 22, 2014), https://www.latimes.com/la-env-grammy-awards-2012-winners-nominees-list-htmlstory.html [https://perma.cc/QH9G-4WFT]. was certified platinum five times in the United States;218Gold & Platinum, RIAA, https://www.riaa.com/gold-platinum/?tab_active=default-award&ar=Adele&ti=Someone+like+You&format=Single&type=#search_section [https://perma.cc/668Y-6PJL]. and is the twenty-fifth-best-selling song of all time in the United Kingdom.219The Best-Selling Singles of All Time on the Official UK Chart, Off. Charts (Nov. 8, 2023), https://www.officialcharts.com/chart-news/the-best-selling-singles-of-all-time-on-the-official-uk-chart__21298 [https://perma.cc/VQ4J-FNZX]. Occasionally, widespread dissemination arguments are accompanied by theories of subconscious copying, which speak to the fact that copyright infringement does not have a scienter requirement.220Williams v. Gaye, 885 F.3d 1150, 1167–68 (9th Cir. 2018). User B did, in fact, see on MuseNet that the intro was “Someone Like You,” suggesting this was not subconscious copying. However, the leeway to argue that the use did not need to be with full knowledge of the circumstances may help Adele’s case; at a minimum, if User B does not admit selecting the intro, they cannot invoke a willful blindness-type argument. Therefore, an attempt to rebut the argument of widespread dissemination is unlikely to be persuasive.

As discussed with Sample Song A, substantial probability of access usually needs to be accompanied by probative similarity to successfully prove factual copying with indirect evidence. However, there are instances in which the probative similarity is convincing enough that it alone can satisfy the copying requirement. This is often referred to as “striking similarity,” and it arises when the similarity is so extensive that it is “effectively impossible for one to have arisen independently of the other.”2214 Nimmer & Nimmer, supra note 91, § 13D.07. In analyzing striking similarity in music, it has been held that degree of similarity cannot pertain only to the quantity of identical notes, but must also look to the uniqueness and intricateness of the similar aspects and the places in which the two are dissimilar.222See Selle v. Gibb, 741 F.2d 896, 903–05 (7th Cir. 1984) (holding that a plaintiff failed to demonstrate striking similarity because there was no testimony to suggest the similarities could not have occurred absent copying); Wilkie v. Santly Bros., 91 F.2d 978, 980 (2d Cir. 1937) (holding that both the differences in the “plan and construction of the compositions” and the use of common “cadences and final chords” were irrelevant given the striking similarity resulting from thirty-two virtually identical bars). Oftentimes, because of how high the bar is set for striking similarity, expert testimony is needed when the subject matter is as highly technical as music. Here, while the melodic hook created by the use of an arpeggio is very recognizable and may seem unique to “Someone Like You,” the use of arpeggios generally is common.223Arpeggio, supra note 14. 
While there seems to be a possibility that, with the help of an expert, Sample Song B could be found strikingly similar to “Someone Like You,” the high bar for such a determination, coupled with the infrequency of successful arguments for striking similarity, makes it reasonable to assume that the normal requirements of access and probative similarity will need to be met; this is not damaging for Adele’s claim, as those are almost certainly provable.

Assuming striking similarity is not found, the indirect evidence need only show probative similarity. Comparing the two works side-by-side, protected and unprotected elements alike, a factfinder could certainly conclude that “Someone Like You” was the basis, at least in part, for Sample Song B. This holistic comparison would likely highlight the nearly identical melodic hook, which consists of arpeggiated chords and underlies the distinctive harmony, along with the general similarities in terms of the theme and vocal range. While the use of an arpeggio is not itself uncommon and could occur absent copying, the distinctive chord progression, melody, and harmony created in Sample Song B are similar in all the ways that make the instrumental portion of “Someone Like You” so memorable and impactful. Although the distinction between probative and substantial similarity must be maintained, there is likely enough similarity to be probative of copying; whether that similarity is substantial in a legal sense remains to be addressed.

  2. Legal Copying

Substantial similarity is thought of as existing on a spectrum, thereby requiring close examination to attempt to identify the line between trivial similarities and actionable improper appropriation. Here, Adele’s infringement action would allege both comprehensive nonliteral and fragmented literal similarity. The most obvious claim is that of literal similarity with regard to the piano phrase, which functions as a melodic hook, because it is reproduced nearly identically in Sample Song B. A potentially important note is that an arpeggio would appear on the sheet music for a composition because it is notated to guide the playing of chord progressions.224Types of Arpeggio Signs, Steinberg.Help, https://archive.steinberg.help/dorico_pro/v3/en/dorico/topics/notation_reference/notation_reference_arpeggio_signs/notation_reference_arpeggio_signs_types_r.html [https://perma.cc/6S98-98W7]. Further, the use of an arpeggio is key here because it melodizes the chords being used, which tends to then be an important aspect of the resulting harmony; thus, it is potentially very significant to the substantial similarity analysis because arpeggios may take harmony into the protectable range of copyright law.225See Arpeggio, supra note 14. As for nonliteral similarity, this is a situation in which the nonliteral similarity may be characterized as comprehensive; both songs are played in common time, have a somber, emotional sound, and nearly identical lyrical themes, although they are different on a word-for-word basis. As noted, courts use different tests for determining substantial similarity. While these tests are similar in many ways and may yield similar results, the most thorough prediction of how a song like Sample Song B will fare against infringement allegations must consider the nuances of each. Expert testimony is almost always used to help guide complex questions of infringement in music, so any conclusions are subject to elaboration or criticism by a technical expert.

Before applying any of the tests, it is an appropriate moment to address the doctrine of de minimis copying. Because a determination that a use is de minimis negates the need for a full substantial similarity inquiry, courts often address this “defense”226Though sometimes called a defense, it does not necessarily function as such. at the outset. De minimis copying essentially means there is a lack of substantial similarity, so the conclusion that a use is de minimis generally arises when “the average audience would not recognize the appropriation.”227Newton v. Diamond, 388 F.3d 1189, 1193 (9th Cir. 2004) (citation omitted) (holding that the use of three notes that constitute about six seconds in the original song was a de minimis use and therefore not actionable). It is important to keep this concept separate from that of characterizing an element as de minimis itself, such as saying that one note is de minimis and not protectable. As the inverse of substantial similarity, the de minimis inquiry similarly must consider the quantitative and qualitative importance of a use because both get at what an ordinary listener would find substantial. Essentially, the inquiry here would follow the same steps as the fragmented literal similarity test, as that test is viewed as a de minimis doctrine.228See Warner Bros. Inc. v. Am. Broad. Co., 720 F.2d 231, 242 (2d Cir. 1983) (explaining that in cases of fragmented literal similarity, a de minimis rule applies and allows “the literal copying of a small and usually insignificant portion of the plaintiff’s work”); Williams v. Broadus, No. 99 Civ. 10957, 2001 U.S. Dist. LEXIS 12894, at *11 (S.D.N.Y. Aug. 24, 2001) (calling fragmented literal similarity a “de minimis doctrine”). Because those steps will be discussed in detail in applying the fragmented literal similarity test,229See infra Section II.B.2.iii. they need not be laid out here, largely because it seems unlikely that a court would deem the copying here to be de minimis. The focus of this inquiry is on how much of the original was used or copied; the piano phrase is repeated throughout most of “Someone Like You,” so it seems highly likely an audience would recognize the appropriation. Given that the phrase constitutes a quantitatively large part of the original and arguably has significant qualitative importance because the piano is intentionally the only instrument to create a particular feeling, the phrase opens the song instrumentally, and it may be seen as the song’s backbone, a determination that this use is de minimis copying seems unlikely. Thus, it is appropriate to analyze potential outcomes under each of the substantial similarity tests.

i. Extrinsic-Intrinsic Test

The extrinsic-intrinsic test is a two-prong test. The extrinsic prong is the objective prong and requires identifying concrete elements of expression that are similar.230Sid & Marty Krofft Television Prods., Inc. v. McDonald’s Corp., 562 F.2d 1157, 1164 (9th Cir. 1977) (“[Specific] criteria include the type of artwork involved, the materials used, the subject matter, and the setting for the subject.”), overruled on other grounds by Skidmore v. Led Zeppelin, 952 F.3d 1051 (9th Cir. 2020) (overruling the use of the inverse ratio rule). Because this test is part of a substantial similarity inquiry, the dissection of elements involves identifying those that are and are not protected by copyright. Music often presents a more complicated case for analysis because, unlike books and films, it cannot easily be classified into a few protectable and unprotectable elements;231Swirsky v. Carey, 376 F.3d 841, 848–49 (9th Cir. 2004). Literary works, including films, TV shows, and books, can be broken down into elements more easily than music because relevant elements like plot, character, event sequence, and dialogue are more discrete than elements like melody or harmony. Id. at 849 n.15 (citation omitted). thus, courts applying the extrinsic prong have looked to a wide variety of elements, including title hooks, lyrics, melodies, chord progression, pitch, instrumentation, accents, and basslines.232Id. at 849; see also Three Boys Music Corp. v. Bolton, 212 F.3d 477, 485–86 (9th Cir. 2000) (upholding jury’s finding of infringement based on compilation of unprotectable elements of a song), overruled on other grounds by Skidmore v. Led Zeppelin, 952 F.3d 1051 (9th Cir. 2020) (overruling the use of the inverse ratio rule). The combination of these expressive elements can be protected by copyright and often forms the basis of claims involving instrumental phrases.233Swirsky, 376 F.3d at 848–49.
Therefore, it can be helpful to think of the first question as relating to separating protectable elements or compilations of elements, and the second question as analyzing those elements to determine whether they are objectively substantially similar. In Skidmore v. Led Zeppelin, the district court concluded on a summary judgment motion that there was sufficient extrinsic similarity for the issue to go to the jury; the basis for such similarity focused on a “repeated A-minor descending chromatic bass lines lasting [thirteen] seconds” that appeared within the first two minutes of both songs and was arguably the “most recognizable and important segments of the respective works.”234Skidmore v. Led Zeppelin, No. CV 15-3462, 2016 U.S. Dist. LEXIS 51006, at *50 (C.D. Cal. Apr. 8, 2016), aff’d, 952 F.3d 1051 (9th Cir. 2020). Additionally, the “harmonic setting” of the sections used the same chords.235Id. The court concluded that even though a “descending chromatic four-chord progression” is common, the placement in the song, pitch, and recognizability make it appropriate for analysis under the extrinsic test.236Id. Ultimately, however, the jury concluded that, despite the combination of objective similarities, the songs were not extrinsically similar. The jury reached a different conclusion in Three Boys Music Corp. v. Bolton, in which the jury found substantial extrinsic similarity in the compilation of five unprotectable elements.237In Three Boys Music, an expert testified to the similarity in the combination of “(1) the title hook phrase (including the lyric, rhythm, and pitch); (2) the shifted cadence; (3) the instrumental figures; (4) the verse/chorus relationship; and (5) the fade ending.” 212 F.3d at 485.

Here, Adele could likely make an argument similar to that of the plaintiffs in both Skidmore and Three Boys Music, arguing that although arpeggiating chords to achieve certain melodic or harmonic goals is not uncommon, the very same chord progression starts both songs without lyrical accompaniment, is repeated several times in both songs at the same pitch, and is “arguably the most recognizable and important”238Skidmore, 2016 U.S. Dist. LEXIS 51006, at *50. part of each work; invoking the device that made the Three Boys Music plaintiffs successful, Adele would want to emphasize that it is the compilation of expressive elements that forms the basis of actionable extrinsic similarity. While the knowledge that MuseNet took the actual intro from “Someone Like You” and used generative AI to make “predictions” for the rest of the song according to prompts suggests objective similarity of these elements, expert testimony would still be needed to confirm which elements are really present in Sample Song B; for example, there may be subtle note differences that do not necessarily make the song sound different, but are objective differences, nonetheless.239Because generative AI music technology is still being explored, expert testimony as to the specifics of the musical elements would likely be needed because it is not clear whether selecting the “Someone Like You” intro means that it is being copied and pasted into the new song, or if it is instead composing something that closely resembles the phrase. The fact that the generated song has an almost identical-sounding piano phrase is addressed in the intrinsic prong. This conclusion is ultimately a question of fact requiring technical breakdown by an expert to evaluate the compilation of expressive elements, including those that are part of the melodic hook, for originality. Based on this analysis, a jury can make an informed determination as to whether these elements are sufficiently original to be protected, and if so, whether Sample Song B is substantially similar with regard to that protected expression. Assuming an expert can corroborate the objective similarity that appears to exist, there seems to be a strong case against User B as it pertains to the extrinsic prong. This is especially true in light of cases in which experts found extrinsic similarity in hooks and signature phrases,240See, e.g., Williams v. Gaye, 885 F.3d 1150, 1172 (9th Cir. 2018). as well as those that emphasized compilations as sufficient for extrinsic similarity.241See, e.g., Three Boys Music, 212 F.3d at 485. Within this framework, the copied melodic hook (consisting of the same or at least similar chord progressions, use of arpeggio, pitch, and harmony), coupled with the prominence and similar repetition in both songs, sets up a strong claim for extrinsic similarity.

Importantly in the context of AI-generated music, Adele may want to point to the fact that the song is “in her style” and that the voice sounds very similar to hers. As discussed with Sample Song A, however, courts have been very reluctant to recognize copyright in a style or someone’s voice. Especially in the case of Sample Song B—which is even closer to what has been identified as a soundalike in past cases, as Adele’s voice is not being used at all—it is at most an imitation of her voice type, and thus it seems unlikely that this part of the similarity between the songs could be actionable itself.242Unlike Sample Song A, in which West’s voice was used in some way to create the vocals for the AI-generated song, User B just used vocals that were in a similar mezzo-soprano voice. While the practical result is that it sounds like Adele, this seems like a classic case of a soundalike. See generally Midler v. Ford Motor Co., 849 F.2d 460 (9th Cir. 1988). However, this similarity may work to Adele’s benefit under the intrinsic test.

If satisfied, the extrinsic test must be followed by an intrinsic test, which is the subjective prong that puts aside analytical dissection in favor of taking the approach of a reasonable listener. The intrinsic test asks whether ordinary listeners would find the “total concept and feel of the works to be substantially similar.”243Three Boys Music, 212 F.3d at 485 (quoting Pasillas v. McDonald’s Corp., 927 F.2d 440, 442 (9th Cir. 1991)). A jury may find substantial similarity from an overall view, even when individual similarities alone seem trivial.244Gaye, 885 F.3d at 1164. This may be important for Adele’s case because the similarity technically boils down to a few chords and how they are played. However, the impact of the arrangement resulted in an internationally recognized piano phrase, as well as a melody and harmony that have been highly successful in conveying a message. In both songs, the phrase starts at the first second, plays without lyrics initially, and repeats after the chorus. While there are some differences in instrumental content and lyrics, a jury could subjectively find that the repeated phrase is substantial. The ordinary listener would likely also find subjective similarity in the combination of those instrumental choices and thematically similar lyrics, suggesting that the songs genuinely evoke similar meanings. In a subjective analysis of the total concept and feel, the similar-sounding vocals may factor in, particularly because both songs are sung by mezzo-sopranos. However, this is unlikely to be the most salient reason for finding intrinsic similarity because mezzo-soprano is the most common female singing voice, and the intrinsic test assumes an untrained listener who would likely attribute the similarity to the unremarkable fact that both vocalists sound feminine, rather than recognizing the specific vocal range.245Stefan Joubert, 7 Vocal Types and How to Determine Yours, London Singing Inst. (Oct. 30, 2020), https://www.londonsinginginstitute.co.uk/7-vocal-types-and-how-to-determine-yours [https://perma.cc/M3TL-24LF]. Nonetheless, it seems reasonable to conclude that the songs are substantially similar overall. But because the ordinary listener is supposed to truly reflect an ordinary person with no music expertise, it could also go the other way. While the hook phrase is distinctive and impactful, a jury could conclude that in Sample Song B, because of the variation in the accompaniment aside from the phrase, it is not as salient, therefore finding that the works holistically lack the requisite similarity. This ultimately speaks to the challenging nature of anticipating intrinsic analysis results, as the conclusions depend on unknown variables and subjective judgments. Courts consistently reiterate that they will not question the jury’s intrinsic conclusions; therefore, there is less to rely on by way of case law because it is not judges who engage in this inquiry.246See generally Gaye, 885 F.3d; Swirsky v. Carey, 376 F.3d 841 (9th Cir. 2004); Three Boys Music, 212 F.3d; Sid & Marty Krofft Television Prods., Inc. v. McDonald’s Corp., 562 F.2d 1157 (9th Cir. 1977), overruled on other grounds by Skidmore v. Led Zeppelin, 952 F.3d 1051 (9th Cir. 2020) (overruling the use of the inverse ratio rule).

The extrinsic-intrinsic test has been criticized for lack of clarity as to both prongs. As will also be discussed with aspects of the following tests, the “total concept and feel” approach seems to conflict with copyright law’s very specific intent to protect original expressions rather than ideas or commonplace expressions of ideas.2474 Nimmer & Nimmer, supra note 91, § 13.03(A)(1)(c). Assuming this test remains in use, however, it may be the approach applied in the litigation of User B. Without knowing the quality of potential expert testimony, it is hard to predict the outcome with certainty. However, case law does suggest that the type of elements that were copied could, if framed as a compilation, satisfy the extrinsic test because there are clearly musical elements that are objectively the same. As for the intrinsic test, the subjective conclusions of the factfinder will ultimately determine the outcome; however, the prominence of the copied phrase, as well as the concept and feel of the emotional ballads, suggests that a jury could find the songs to be substantially similar.

ii. Ordinary Observer Test

The ordinary observer test asks “whether defendant took from plaintiff’s works so much of what is pleasing to the ears of lay listeners, who comprise the audience for whom such popular music is composed, that defendant wrongfully appropriated something which belongs to the plaintiff.”248Arnstein v. Porter, 154 F.2d 464, 473 (2d Cir. 1946). Here, because there are similarities between protectable and unprotectable elements, the test will probably be more discerning. In conducting the more discerning inquiry, courts are to try to extract the unprotectable elements and ask whether the remaining protectable elements are substantially similar.249Velez v. Sony Discos, No. 05 Civ. 0615, 2007 U.S. Dist. LEXIS 5495, at *24 (S.D.N.Y. Jan. 16, 2007). Protectable elements may either be completely original or original contributions by way of selection, coordination, or arrangement.250Id. (“In other words, unoriginal elements, combined in an original way, can constitute protectible elements of a copyrighted work.”). For Adele, this would likely mean focusing on the original selection, coordination, and arrangement of the piano phrase itself and its function in the song through repetition. Once those elements are identified, the factfinder will look to the total concept and feel, focusing on whether the defendant misappropriated the original aspects of the copyright owner’s work. While the original formulation of the ordinary observer test in Arnstein v. Porter references the intended audience, that factor has not typically played a large role and is usually understood to mean the lay listener.251Arnstein, 154 F.2d at 473; see Dawson v. Hinshaw Music, Inc., 905 F.2d 731, 737 (4th Cir. 1990) (suggesting that a departure from the lay audience serving as the representative of the intended audience is appropriate only when “the intended audience possesses specialized expertise”) (internal quotation marks omitted). 
Because the emphasis is almost entirely on total concept and feel, whether MuseNet made minor, audibly imperceptible changes to the phrase may be less important than in the extrinsic inquiry of the extrinsic-intrinsic test.252It may also not be any less important depending on testimony. However, since the focus is so much more directly on whether the second work took something important from the first, these minor changes may factor in much less. Nevertheless, this potential small change would not be fatal to the claim, because we are discussing substantial similarity of the composition, meaning that it need not be completely identical.

The analysis of Sample Song B under an ordinary observer test will likely resemble the analysis in New Old Music Group, Inc. v. Gottwald.253New Old Music Grp., Inc. v. Gottwald, 122 F. Supp. 3d 78, 95–97 (S.D.N.Y. 2015). In New Old Music, the infringement claim was based on a drum part consisting of a single measure, which was repeated throughout the allegedly infringing work, ultimately accounting for eighty-three percent of the original work.254Id. at 97. The defendant argued that the individual elements were not sufficiently original to be protected, but the court held that the totality of the drum part could suffice as copyrightable based on its original selection, coordination, and arrangement.255The court in New Old Music was ruling on a summary judgment motion, so it did not determine whether the selection, coordination, or arrangement of the drum part was sufficiently original. Instead, it simply pointed to the defendant’s failure to show that it was not original and emphasized that protection for the plaintiff is not limited to the originality of the individual elements. Id. at 95–96. A reasonable juror in New Old Music could have concluded that the use of the drum part, which could be seen as the original song’s “backbone,” took so much of “what is pleasing to the ears of lay listeners, . . . that [the] defendant wrongfully appropriated something” from the plaintiff.256Id. at 97 (quoting Repp v. Webber, 132 F.3d 882, 889 (2d Cir. 1997)). Here, the repeated piano phrase could be described as the backbone of “Someone Like You,” and be protected as a unique and original arrangement despite the unoriginality of any individual note. Analyzing the total concept and feel of both songs, a reasonable jury could likely conclude User B substantially misappropriated Adele’s original compilations and thereby infringed on her copyright.

Because this test relies on subjective judgments, the outcome could go the other way. A jury could conclude that the piano phrase and its arrangement were not original,257To determine that the selection or arrangement of the piano in “Someone Like You” is unoriginal, evidence must be presented that suggests as much. While nothing readily apparent suggests this upon researching the song, that does not preclude the possibility that an expert in music and music theory could demonstrate its unoriginality. or that it is a de minimis aspect of the work258The term “de minimis” in this context refers to the violation being trivial; this differs slightly from “de minimis copying,” a term used to describe copying that falls below the substantial similarity threshold. See Ringgold v. Black Ent. Television, Inc., 126 F.3d 70, 74 (2d Cir. 1997). and therefore the similarity does not pertain to what lay listeners deem pleasing in “Someone Like You.” This was the case in Velez v. Sony Discos, in which the combination of eight-measure phrases was a structure widely used and therefore not original to the plaintiff’s song, and also constituted de minimis aspects of the original song.259Velez v. Sony Discos, No. 05 Civ. 0615, 2007 U.S. Dist. LEXIS 5495, at *38–40 (S.D.N.Y. Jan. 16, 2007). Sample Song B differs from the allegedly infringing song in Velez in that, aside from that structure of phrases, the song in Velez was not otherwise similar to the original in melody, harmony, or lyrics;260Id. at *39. Sample Song B, on the other hand, can be alleged to infringe on the arrangement of piano phrases, as well as the resulting melody and harmony that are affected by other expressive choices like arpeggiating the chords. Because of these similarities, it seems likely that a jury could find for Adele under the ordinary observer test, assuming expert testimony does not exclude the possibility of originality.

A key reason the ordinary observer test, discerning or traditional, comes under criticism is that it asks a factfinder to simultaneously separate protectable elements for careful examination and determine substantial similarity based solely on the total concept and feel.2614 Nimmer & Nimmer, supra note 91, § 13.03(E)(1)(b). Additionally, ordinary listeners’ impressions regarding whether copying has occurred do not necessarily prove that a violation of the Copyright Act has taken place. These shortcomings could affect Adele’s case against User B in two opposing ways. On one hand, the meticulous separation of protectable elements before conducting a net effect-type of analysis might lead the jury to conclude that they are merely dealing with individual phrases. Focusing too closely on the individual phrases, as opposed to the whole arrangement, might cause this similarity to be overlooked in a total concept and feel inquiry. If, however, the jury recognizes the arrangement as the “backbone” of the song, this could lessen the issue. On the other hand, in focusing on the total concept and feel, a jury might unintentionally be overinclusive when the vibe of the songs is as similar as “Someone Like You” and Sample Song B. If anything, this emphasizes the importance of expert testimony regarding the originality, or lack thereof, of the elements—whether on their own or as a compilation—to guide the jury before their total concept and feel analysis.

iii. Fragmented Literal Similarity Test

The last test is the fragmented literal similarity test, which has less applicable case law. This test focuses on “localized” similarity based on the idea that identifiable fragments of identical or nearly identical expression should be the basis for an infringement action.262TufAmerica, Inc. v. Diamond, 968 F. Supp. 2d 588, 597 (S.D.N.Y. 2013). As such, the substantial similarity question under this test turns on whether the copying involves trivial or substantial elements of the original work, which is determined by quantitative and qualitative assessments.263Id. at 598. Most cases specifically addressing fragmented literal similarity involve lyrics, so the qualitative significance of instrumental phrases is less explored. However, when considering the qualitative importance of instrumental phrases outside the context of fragmented literal similarity, it has been recognized that small sections can have great qualitative import, such as the four-note opening melody in Beethoven’s Fifth Symphony.264Newton v. Diamond, 388 F.3d 1189, 1197 (9th Cir. 2004) (Graber, J., dissenting). See generally Williams v. Broadus, No. 99 Civ. 10957, 2001 U.S. Dist. LEXIS 12894 (S.D.N.Y. Aug. 24, 2001); Jarvis v. A & M Recs., 827 F. Supp. 282 (D. N.J. 1993). Here, the specific piano phrase appears at the first second of “Someone Like You,” initially without lyrics for about fourteen seconds; the same phrase continues through nearly three and a half minutes of the song, although there are some additional notes played and volume changes.265A trained musical expert would need to testify as to the specific breakdown of how long the exact same chords are played, but the progression is present through approximately three and a half minutes of the song. “Someone Like You” is four minutes and forty-five seconds in total. Someone Like You, Spotify, https://open.spotify.com/track/5lkpeJwmQKgY3bX2zChjxX [https://perma.cc/RJ2Z-XZLW]. Quantitatively, this is clearly significant. 
In TufAmerica, Inc. v. Diamond, the court determined that a “distinctive orchestra sequence” from the original song that was about three seconds and consisted of “a series of five punchy ascending chords” was quantitatively significant given that it was repeated seventeen times to ultimately constitute about fifteen percent of the song.266TufAmerica, 968 F. Supp. 2d at 606–07. While a musical expert would need to confirm the actual length of time the phrase appears in original form in “Someone Like You,” it certainly seems to exceed that threshold. The qualitative importance also seems convincing given that the piano is the only instrument, the phrase opens the song instrumentally, making it very recognizable, and the phrase continues with only slight alterations, thereby functioning as a common thread through the whole work. Under this test, it seems highly likely Adele would prevail.

However, this test seems least likely to apply. First, it is not as commonly used as the other tests. Second, there is much more at issue than just fragmented literal similarity, especially considering that the desire to legally target Sample Song B likely has as much to do with the fact that User B used AI to create a song that intentionally sounds like Adele as it has to do with the use of the phrase; “local” and “global” similarity are expected concerns for artists whose works are pirated by AI. Third, the fact that the phrase is slightly sped up and may contain slight differences due to how it was generated suggests the other tests may be better suited for this case.  

User B’s final opportunity to argue that their conduct falls within the bounds of the Copyright Act without constituting infringement is by asserting the fair use defense. Because the same analysis likely applies to User B’s use of the recording as well, the fair use discussion below addresses both components of the song together.

  2. The Sound Recording

The analysis thus far has focused on the composition. Infringement of the sound recording of “Someone Like You” requires a literal duplication of the recording.26717 U.S.C. § 114(b). As discussed earlier, while not explicitly included, there is reason to believe the same applies to the distribution right as well; see supra text accompanying note 117. Based on the language of the Copyright Act, whether the rights in the recording have been infringed depends entirely on how MuseNet creates music using introductions from existing songs:

(a) The exclusive rights of the owner of copyright in a sound recording are limited to the rights specified by [the] clauses [pertaining to the reproduction, adaptation, distribution, and the public performance by digital audio transmission rights] . . . . (b) The exclusive right of the owner of copyright in a sound recording under [the reproduction right] is limited to the right to duplicate the sound recording in the form of phonorecords or copies that directly or indirectly recapture the actual sounds fixed in the recording. The exclusive right of the owner of copyright in a sound recording under [the adaptation right] is limited to the right to prepare a derivative work in which the actual sounds fixed in the sound recording are rearranged, remixed, or otherwise altered in sequence or quality.26817 U.S.C. § 114(a)–(b) (emphasis added).

MuseNet trains on MIDI files, which capture data that can be seen as a “symbolic representation of music.”269David Rizo, Pedro J. Ponce de León, Carlos Pérez-Sancho, Antonio Pertusa & José M. Iñesta, A Pattern Recognition Approach for Melody Track Selection in MIDI Files, 7th Int’l Conf. on Music Info. Retrieval (2006). Essentially, a MIDI file records data about the notes in a song, including pitch, volume, and time nodes, which can then instruct the reproduction of musical compositions.270Liu, supra note 29, at 6564; Christos P. Badavas, MIDI Files: Copyright Protection for Computer-Generated Works, 35 Wm. & Mary L. Rev. 1135, 1140–41 (1994). Importantly, MIDI files are not audio recordings and cannot transmit audio.271Badavas, supra note 270, at 1139 (“The gestures made on a keyboard are translated into the serial computer language that is MIDI, sent out of the MIDI Out port, are received at the MIDI In port of a second (and third, and fourth, ad infinitum) instrument, and that instrument faithfully reproduces those gestures.”). This means that, unlike Uberduck, MuseNet technically never even “hears” the sound recording; it only trains on the computer language that indicates how the composition is played. Therefore, a MIDI file of “Someone Like You” could not possibly result in exact duplication of the protected recording being used in Sample Song B because the recording itself is not transmitted. This information alone suggests that User B cannot be liable for infringement of the sound recording of “Someone Like You,” and Adele would have to rely on allegations of infringement of the composition as discussed earlier.

While the literal language of the statute suggests that copying using a MIDI file is not an actionable infringement of the recording, a more in-depth inquiry as to whether this is so black-and-white is warranted considering that many AI music generators train on MIDI files. The starting point for this inquiry is legislative intent. The Digital Performance Right in Sound Recordings Act of 1995 (“DPRA”) created an exclusive performance right for sound recordings, specifically granting the right to perform by “means of a digital audio transmission.”27217 U.S.C. § 106(6). In doing so, section 114 was also amended to add the relevant limitations on the performance right. The House Report accompanying the DPRA explicitly states that the right applies only to digital audio transmissions, which is consistent with the language of section 114 concerning reproduction and adaptation rights.273H.R. Rep. No. 104-274, at 14 (1995). Additionally, it specifies that a “digital phonorecord delivery” refers to the delivery of a recording by digital transmission.274Id. at 28. From this, it is clear that while the rights associated with sound recordings were expanded to adapt to technological developments, they were not explicitly extended beyond the transmission of the actual recording. However, the House Report does note that because the bill does not “precisely anticipate particular technological changes,” they intend that the rights, exemptions, and limitations created should be interpreted to “achieve their intended purposes.”275Id. at 13. This is at least suggestive of the understanding that the language may not be precise enough to cover all technologies and potential infringements. In 2018, Congress passed the Musical Works Modernization Act with the intent of updating copyright law to increase fairness for creators regarding statutory licensing.276Musical Works Modernization Act §§ 101–106; 17 U.S.C. §§ 114, 115. 
While this points to an ongoing concern about protecting artists in the advent of technological innovation, it does not change how digital transmission is defined. Legislative intent seems to indicate that Congress’s focus is to protect the actual sound recording. However, the concern about the future evolution of technology nonetheless remains relevant. 

The Office has also provided some perspective on MIDI files and the sound recording requirement. As of 2021, the Office “does not consider standard [MIDI] files to be phonorecords and will not register a copyright claim in a sound recording contained in a standard [MIDI] file.”277U.S. Copyright Off., Compendium of U.S. Copyright Office Practices § 803.4(C) (3d ed. 2021). The Office elaborates that, because MIDI files do not capture sounds and only capture the underlying score, they are insufficiently fixed to be copyrighted as sound recordings, though they may suffice for musical works.278Id. While this does not directly address MIDI files in the context of infringement, this is clear evidence that the Office is aware of how MIDI files operate in the music context and continues to view them as fundamentally different from sound recordings. If the Office does not consider MIDI files to be fixations of the recording itself, it is difficult to argue that a MIDI file should constitute a sound recording for the purposes of infringement.

Case law does not seem to have addressed this issue directly. However, there is a wealth of judicial interpretation of section 114 and what is meant by the requirement that sound recordings be duplicated to qualify as infringement.279See Bridgeport Music, Inc. v. Dimension Films, 410 F.3d 792, 800 (6th Cir. 2005) (“[17 U.S.C. § 114(b)] means that the world at large is free to imitate or simulate the creative work fixed in the recording so long as an actual copy of the sound recording itself is not made.”) (emphasis added); VMG Salsoul, LLC v. Ciccone, 824 F.3d 871, 883 (9th Cir. 2016) (“A new recording that mimics the copyrighted recording is not an infringement, even if the mimicking is very well done, so long as there was no actual copying.”); Batiste v. Lewis, 976 F.3d 493, 506 (5th Cir. 2020) (“[A]n artist infringes a copyrighted sound recording by sampling all or any substantial portion of the actual sounds from that recording.”) (citation omitted) (internal quotation marks omitted). This conclusion aligns with the language of the statute and its intended purpose. Therefore, even if Sample Song B sounds like it was sampled, current interpretations of the Copyright Act would instruct a court to conclude that Sample Song B did not infringe on Adele’s exclusive rights in the sound recording of “Someone Like You.” Undeniably this would be incredibly frustrating for an artist in Adele’s shoes; changing one fact—how the song was duplicated—could open the door to receiving royalties for sampling. This bears similarity to the frustration artists feel in cases involving songs like Sample Song A in which they justifiably feel that their hard work has been “appropriated,” yet that appropriation is simply not cognizable under current copyright law.

However, given that this case presents new issues that have not yet been addressed directly, it is possible that using the original in this specific way could be considered an exact duplication. Based on the DPRA and Congress’s intent to protect the ability to earn royalty revenues in the digital age, it may be a fair extension to consider the extraction and use of exact portions of a song using MIDI technology to be within what was meant by an actual duplication. There is no human involvement in using MIDI files to recreate the exact instrumentals; they are fed to the AI system to learn, train on, and reproduce with predictions. By possessing the MIDI file, the system autonomously makes an exact replica of the song. In fact, the point of MIDI files is to enable the creation of exact replicas, as it is a type of file that can direct notes and instruments to be played. While that seems to sound like a process akin to a person who uses their own instrument to recreate a song, which is acceptable under the Copyright Act, the lack of human involvement may persuade a court to conclude that this process falls outside the scope of what Congress intended to allow without obtaining a license.

If this is considered to be sampling, Adele could allege infringement of several rights; by its very nature, sampling may infringe on the reproduction and distribution rights, and courts have found that sampling infringes on the adaptation right by harming the market for future derivatives.280Frisby v. Sony Music Ent., No. 19-1712, 2021 U.S. Dist. LEXIS 51218, at *40–41 (C.D. Cal. Mar. 11, 2021). In determining whether this sample infringed on those rights, courts would likely apply the same requirements for a successful infringement action. The only instance in which the fact of sampling alone would be sufficient is if a court strictly adheres to the holding and reasoning from Bridgeport. Because this would be considered an exact duplication, the factual copying prong would easily be satisfied. As to the legal prong, it seems that Sample Song B would likely be found to be substantially similar to “Someone Like You” for the same reasons as discussed regarding the musical composition. Further, the fair use inquiry would be important in determining whether User B is liable for infringing Adele’s copyright.

Absent such a change in interpretation or amendment of the Copyright Act, it seems unlikely that Adele would succeed on a claim of infringement on the sound recording. Given that AI systems often train on MIDI data, this is something that may be addressed in the Office’s future reports. While arguments about style pirating by generative AI systems seem unlikely to influence changes in copyright protections, arguments about near-duplication by MIDI files align more with adjusting copyright law to address technological changes. Ongoing concerns about royalties and protecting rights in ownership of a sound recording may demand attention to this MIDI “loophole.” Because this situation presents a good opportunity to reconsider what exactly is meant by exact duplications, it is worth considering how Adele’s infringement action would proceed if User B’s use of MIDI files does qualify as sampling. Since the required elements of an infringement cause of action are likely satisfied, the outcome for the recording probably depends on fair use, as that is User B’s last opportunity to attempt to show that their conduct is not prohibited by the Copyright Act. 

  3. Fair Use Defense

Regarding both the musical composition and the sound recording, User B will likely at least plead fair use in their answer to a suit alleging infringement by Adele. Nevertheless, like other music copyright cases, it is not guaranteed that this defense will be litigated. In asserting a fair use defense, User B will have the burden of justifying their use of the original phrase, including its intact melody, harmony, and rhythm. If successful, they will be relieved from liability because fair use is an affirmative defense.28117 U.S.C. § 107. Because there are only a handful of fair use music cases that involve non-parody uses, with a notable absence of case law addressing the use of instrumental sections, the following analysis largely relies on analogies to other applications of the defense.

The first factor is the “purpose and character” of the use.282Id. § 107(1). The key question is one of transformation. Post-Goldsmith, this inquiry is more demanding and requires looking beyond whether the use adds something new. When the use is essentially the same as the original, as is the case here, a compelling justification is required.283Andy Warhol Found. for the Visual Arts v. Goldsmith, 598 U.S. 508, 547 (2023). There is certainly an argument that the use here is transformative, simply based on the nature of MuseNet and the resulting composition. The intro to “Someone Like You” served as the basis for Song B, but then the AI system used predictive technology to construct much of the remaining composition, revisiting the original phrase only occasionally. In a literal sense, User B, via MuseNet, transformed the phrase by pairing it with new instrumental phrases. While this fits the definition of literal transformation, a more compelling argument would exist if the song retained less of the original in its essentially unchanged form. Since most uses incorporate some addition, the inquiry must also consider the extent to which the purpose differs.284Id. at 525. Sample Song B does not fit into any of the criteria from the preamble of § 107,285The preamble explicitly lists the following purposes: “criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research.” 17 U.S.C. § 107. but that does not preclude a sufficiently different purpose. In Estate of Smith, the court found that the use of lyrics to discuss music generally served a “sharply different” purpose than the lyrics’ original purpose or goal of commenting on the “primacy of jazz music.”286Estate of Smith v. Cash Money Recs., 253 F. Supp. 3d 737, 750 (S.D.N.Y. 2017). The original lyrics were: “Jazz is the only real music that’s gonna last. All that other bullshit is here today and gone tomorrow. But jazz was, is and always will be.” In the second work, the lyrics were edited to say: “Only real music is gonna last.” Id. at 749. Whether this conclusion would be accepted under Goldsmith, which was decided later, is questionable because the Court held that transformation cannot be based on the “stated or perceived intent of the artist.”287Goldsmith, 598 U.S. at 545.

Regardless, while there are changes in the instrumental phrasing and added lyrics, the lyrics reflect very similar themes, and the music serves the same purpose of setting a somber tone. While more specifics about the lyrics and the message of Sample Song B are needed to confirm this conclusion, the available information suggests that the purpose of using the piano phrase is not even as different as that of the use in Estate of Smith, which also arguably lacked significant differences. Because of the exact portions of piano used, along with several other nonliteral similarities, it seems unlikely that User B could sufficiently demonstrate a compelling justification or a distinct purpose. The Goldsmith Court noted that Campbell cannot be read to say that any use that adds something new counts in favor of fair use because, if it did, a “commercial remix of Prince’s ‘Purple Rain’” would weigh in favor of fair use purely because it added some new expression to the song.288Id. at 541. Thus, Sample Song B is arguably just a remix of the instrumentals in “Someone Like You,” which fails to serve any significant unique purpose because it uses the phrasing to evoke the same theme and musical vibe. Therefore, it seems unlikely that a court would find the first factor to favor fair use here.

The second factor is “the nature of the copyrighted work.”28917 U.S.C. § 107(2). This factor examines whether the work is creative or expressive.290Estate of Smith, 253 F. Supp. 3d at 751. This factor weighs strongly against fair use because the copyrighted work is an original, creative musical work. Because this is somewhat uncharted territory, User B could argue that the creative nature of the original song is less relevant because what was used can be broken down into a chord progression, and there are only so many combinations of such progressions; User B may then argue that courts should look at these chords more like facts or nonfiction works. This argument is not particularly persuasive given that Sample Song B uses the same arrangement of the chord progressions, maintaining the original melody and harmony, which clearly speaks to the creative choices made in “Someone Like You.” Nonetheless, this factor is rarely significant in a final fair use determination.291Authors Guild v. Google, Inc., 804 F.3d 202, 220 (2d Cir. 2015).  

The third factor pertains to the “amount and substantiality of the portion used in relation to the copyrighted work as a whole.”29217 U.S.C. § 107(3). User B will certainly argue that they used only what was required for the generative AI system to create predictions and compose a new song in accordance with those predictions. While User B is not required to use only the minimum amount needed for the system to function,293Estate of Smith, 253 F. Supp. 3d at 751. the significant amount used, coupled with the lack of obvious transformation in the resulting song, will likely work against them. This factor is less likely to favor fair use when there is extensive copying or when the use encompasses “the most important parts of the original.”294Authors Guild, 804 F.3d at 221. While in Oracle, the amount of code used was reasonable in proportion to the transformative use,295Google LLC v. Oracle Am., Inc., 593 U.S. 1, 33–35 (2021). the use of exact news segments in Fox News Network, LLC v. TVEyes, Inc. was extensive and included all of the important parts of the original news segments, thereby failing to qualify as fair use.296Fox News Network, LLC v. TVEyes, Inc., 883 F.3d 169, 179 (2d Cir. 2018). User B’s use of the piano phrase likely falls between these two cases, as it does not use the entire composition, but still uses so much of what is important from it. As with the other two factors, this factor would likely count against fair use here.

The final factor, often deemed the most important, asks about the “effect of the use upon the potential market for or value of the copyrighted work.”29717 U.S.C. § 107(4). This factor requires looking beyond the immediate situation to consider whether widespread conduct of this kind “[might] adversely affect the potential market for the copyrighted work.”298Sony Corp. of Am. v. Universal City Studios, Inc., 464 U.S. 417, 451 (1984), superseded by statute, Digital Millennium Copyright Act, Pub. L. No. 105-304, 112 Stat. 2860, as recognized in Monge v. Maya Mags, Inc., 688 F.3d 1164 (9th Cir. 2012). As noted earlier, this factor’s application in the music context is unclear, as it has received little judicial attention. Since the use is unlikely to be deemed transformative, Song B is more likely to pose a risk of market substitution. However, this conclusion is based on an approach that is not typically applied to music cases like this one. User B will certainly argue that listening preferences are subjective and the use of the piano phrase to create a similarly emotional ballad may not clearly harm the market for the original the way the complete replication of news segments and distribution of clips would render paying for the original largely unnecessary.299Fox News, 883 F.3d at 179–180. However, a California court, addressing an allegedly infringing song in Frisby, held that two songs within similar genres were competitors; as such, the court concluded that when a latter song copies important elements of the original, the value and sales of the original are expected to be diminished because “the copy supersedes the objects of the original creation thereby supplanting [it].”300Frisby v. Sony Music Ent., No. 19-1712, 2021 U.S. Dist. LEXIS 51218, at *40 (C.D. Cal. Mar. 11, 2021). Sample Song B is clearly within the same genre as “Someone Like You,” so a court may deem them to be market competitors. 
Assuming these two songs qualify as market competitors, the subsequent question becomes whether Sample Song B copies an important element of “Someone Like You,” thereby supplanting the original. For the reasons discussed throughout this Note, the copied piano phrase is clearly a critical part of “Someone Like You,” as it is recognizable and serves as the instrumental accompaniment for most of the song. If a court agrees with this determination of importance, it will likely count against fair use.

The court in Frisby further explained the importance of considering the market for derivative works that may be affected by a later use; in that case, the court found that if the sample were considered fair use, it would “destroy the market for derivative works based on [the original song].”301Id. at *41. While that conclusion was linked to the existence of a “flourishing market” for derivatives of the original song,302Id. the premise that such a decision would result in future users not bothering to pay licensing fees would still apply here, even if there is no such flourishing market for “Someone Like You.” Fair use cases pertaining to all types of work often consider the potential chilling effects on the market. Finding Sample Song B’s use to be fair use could certainly undermine the efficacy and profitability of an established system of licensing.303See, e.g., id. at *41–42 (“[F]inding fair use in this case would have an extremely adverse effect on the potential market for and value of [the original].”); Fox News, 883 F.3d at 180 (finding that the use “usurp[ed] a market that properly belongs to the copyright-holder”) (citation omitted); Sega Enters., Ltd. v. Accolade, Inc., 977 F.2d 1510, 1523 (9th Cir. 1992) (explaining that if widespread conduct involving the use at issue would diminish sales, interfere with marketability, or usurp the market, “all other considerations might be irrelevant”); A&M Recs., Inc. v. Napster, Inc., 239 F.3d 1004, 1017 (9th Cir. 2001) (finding that the use harms the market for the original by affecting the present and future market for digital downloads). By referencing sound recordings, the DPRA reflects congressional concern about the livelihoods of artists and individuals who rely on licensing revenue. Allowing this substantial amount of copying to be fair use would likely lead many future users to forgo obtaining a license. Further, the court in Sony Music Entertainment v. Vital Pharmaceuticals, Inc. held that when a user “completely ignore[d] the market for music licensing,” the burden shifts to the user to demonstrate that their use is not likely to harm the market for the original.304Sony Music Ent. v. Vital Pharms., Inc., No. 21-22825, 2022 U.S. Dist. LEXIS 183358, at *37–38 (S.D. Fla. 2022) (holding that a company’s use of a record company’s songs for commercial purposes was not a fair use). Therefore, because User B did not obtain a license to use any part of “Someone Like You,” they would be responsible for producing evidence that Sample Song B did not negatively affect the market for the original. Adele’s unrealized royalties in this case would be limited to licensing revenues for “traditional, reasonable, or likely to be developed markets.”305Fox News, 883 F.3d at 180 (quoting Am. Geophysical Union v. Texaco Inc., 60 F.3d 913, 930 (2d Cir. 1994)). However, based on statutory requirements and industry practices, music licensing qualifies as a developed market. Therefore, this limitation is unlikely to have a significant impact in the music context.

Even if the use of MIDI files renders the use a mere imitation rather than a duplication infringing upon Adele’s rights in the recording, the result may be the same for this fourth factor, as a finding of fair use would necessarily imply that the MIDI loophole provides an acceptable way to avert infringement. This is problematic for the sampling and licensing market because those who would normally obtain a license to sample “Someone Like You” and other songs may instead copy the songs via MIDI technology. While such an approach would be unwise, considering that it does not remove potential liability for infringement of the musical composition, it would nonetheless provide a way to avoid paying licensing fees, which some AI users would likely exploit. Therefore, the chilling effect is likely to occur regardless of whether the use is characterized as sampling or a literal duplication. Further, the piano phrase is an important part of “Someone Like You,” both in the actual recording and in the composition, which is copied exactly. Therefore, Sample Song B may supplant the composition and thereby harm the sales and value of “Someone Like You.”

While predictions about fair use are necessarily speculative given the unique factors here, the application of analogous precedent suggests that, at a minimum, User B does not have a very compelling fair use defense. Future application of fair use in music by courts will be instructive, as will opinions addressing generative AI more specifically. A particularly important question to be answered will be how generative AI works that use predictive models will hold up against a transformation inquiry, as that factor typically seeps into the other three as well. Until courts provide such insight on how fair use and infringement apply to generative AI songs, Adele seems to have a decent case for infringement of the composition, so long as the subjective assessment leans in her favor. Infringement of the rights in the sound recording copyright, however, seems to present a less promising case under current interpretations of the Copyright Act.

IV. POLICY IMPLICATIONS

The analyses of Sample Songs A and B clearly suggest that current copyright law does not provide obvious answers to several questions that arise in the context of generative AI music and, more generally, AI technology. While certain provisions of the Copyright Act are intentionally broad to allow for changes, and amendments have addressed specific deficiencies identified by Congress, a fundamental deficiency arises from the fact that Congress did not design the Act with technology this advanced in mind. For example, the limitation of rights in a sound recording to exact duplications was not promulgated with the expectation that machine learning algorithms would eventually train on data and duplicate it exactly through what technically qualifies as an independent fixation under the statute. Whether these deficiencies are addressed through amendments, judicial decisions, or administrative policies, a determination stands to be made as to whether specific new rules or exceptions are needed, or if the broad language of the Act should remain, with adjusted, AI-specific or AI-sensitive interpretations.306While judicial interpretation has certainly shaped our understanding of copyright law, substantial changes necessary to address these issues are unlikely to come from the courts alone. See Sony Corp. of Am. v. Universal City Studios, Inc., 464 U.S. 417, 429–31 (1984) (“Sound policy, as well as history, supports our consistent deference to Congress when major technological innovations alter the market for copyrighted materials.”), superseded by statute, Digital Millennium Copyright Act, Pub. L. No. 105-304, 112 Stat. 2860, as recognized in Monge v. Maya Mags., Inc., 688 F.3d 1164 (9th Cir. 2012).

Specific rules aside, the contentious situations created by generative AI music highlight the continuing struggle to balance protection for creators with the benefits of rapidly advancing technology. As the Court noted in Twentieth Century Music Corporation v. Aiken, the Copyright Act and its provisions are intended to reflect “a balance of competing claims upon the public interest.”307Twentieth Century Music Corp. v. Aiken, 422 U.S. 151, 156 (1975). On one side of the spectrum, it is important to recognize the societal value of music and properly appreciate the talent it takes to release authentic, moving pieces of work.308The Court in Twentieth Century Music described this end of the spectrum as reflecting the goal of “secur[ing] a fair return for an ‘author’s’ creative labor.” Id. If we want musically talented individuals to continue to pursue these creative aims and provide us with entertainment, their creative expression must continue to enjoy protection. This is a particularly salient concern given the sensitivity of the creation involved, as one artist is a vulnerable human, baring their soul, and the other “artist” is an inherently non-creative and non-vulnerable trained machine.

On the other end of the spectrum is the necessary recognition of the importance of encouraging technological advancement and pursuing a more efficient society. If the use of generative AI is aggressively cabined by the risk of copyright infringement litigation, the world may miss out on valuable works. While the protection of artists is undeniably important, it cannot be forgotten that protections are limited because the ultimate goal is to promote creativity for the public good.309See id.; Authors Guild, Inc. v. HathiTrust, 755 F.3d 87, 94–95 (2d Cir. 2014) (explaining that copyright law does not confer natural rights of “absolute ownership” on authors, but is “designed rather to stimulate activity and progress in the arts for the intellectual enrichment of the public”) (citing Pierre N. Leval, Toward a Fair Use Standard, 103 Harv. L. Rev. 1105, 1107 (1990)). Further, this could have a chilling effect beyond the music industry, impacting industries in which the use and advancement of this technology could change the world or save lives. Even within the music industry, if we limit the usage of AI by non-owners, how might that precedent impact the use of AI by owners themselves? Currently, similar technology is used in recording studios to make original songs and, particularly, to improve songs before they are released.310The idea of protecting innovation speaks not only to new creations, but also to building upon existing processes to improve them, a continual process that is clearly important in the music industry where quality improvements are constant and arguably beneficial for everyone involved. See Perfect 10, Inc. v. Amazon.com, Inc., 508 F.3d 1146, 1163 (9th Cir. 2007) (highlighting the importance of encouraging “the development of new ideas that build on earlier ones”). 
Artists would agree that this use is not the aim of cracking down on copyright infringement, but it would potentially be difficult to keep these uses separate and may result in frivolous and undesired suits between disgruntled artists and producers. Further, we need to determine the weight that the creative input of the user has on what uses are more permissible because not all AI systems dominate the creation without meaningful human input. Determining how and where to draw this line is far from simple and will necessarily depend on an increased understanding of the technology, assessment of policy priorities, and, to some degree, value judgments regarding what aims our society deems most important.

CONCLUSION

Generative AI music presents a whole host of new questions, considerations, and potential implications for how copyright holders vindicate their ownership. While the application of current copyright law and precedents to these situations involving AI-generated music does not provide fully satisfying answers as to what will happen when songs like these land on court dockets, it does direct attention to the chief policy concerns and areas in which artists are vulnerable. With regard to “Fake Drake,” the analysis of Sample Song A suggests that an infringement suit based on AI-generated soundalikes is unlikely to be successful. While a better understanding of the technology involved in AI-generated music may lead to stronger sampling claims, addressing “Fake Drake” is likely a matter better suited for trademark law and the right of publicity. Sample Song B presents slightly brighter prospects for artists to litigate AI-generated songs they believe infringe on their existing, copyrighted work. But these results are somewhat tentative, pending a better understanding of the technology and, ideally, insight from the Office.

What can be said for certain is that our understanding and expectation of how these cases will unfold are crucially informed by our understanding of the generative technology that ultimately creates the works. From the amount of user input to training data, there are many more considerations for actionable infringement than in a case of one person consciously copying the lyrics of a song by copying and pasting them onto new sheet music. The more that is understood about how this technology actually uses existing songs to create new ones, the more we can apply the principles of copyright law and identify the gray areas that need clarification. To call these situations and concerns complicated would be a vast understatement. But if copyright law is to achieve its aims of “promot[ing] the Progress of Science and useful Arts,”311U.S. Const. art. I, § 8, cl. 8. while also continuing to provide adequate protection for “original works of authorship,”31217 U.S.C. § 102(a); see also H.R. Rep. No. 94-1476, at 51 (1976). even in the face of alluring technological developments, work must be done to distinguish between these considerations and identify those that are legally cognizable. While Drake likely cannot vindicate his copyright ownership rights by taking Fake Drake to court, future artists similarly affected might face a different trajectory thanks to “Heart on My Sleeve” and how it turned the country’s attention to the question of how copyright law interacts with generative AI music.

98 S. Cal. L. Rev. 663


* Executive Senior Editor, Southern California Law Review, Volume 98; J.D. Candidate 2025, University of Southern California Gould School of Law; B.A. 2022, University of Arizona, W.A. Franke Honors College. Thank you to Professor Barnett for his support and guidance, and to the members of the Southern California Law Review for their thoughtful suggestions.

Data Valuation and Law

Data has become an increasingly valuable asset. Numerous areas of law—including contracts, corporate law, intellectual property (“IP”), antitrust, tax, privacy, and bankruptcy—require parties and courts to determine the value of assets, including data. Unfortunately, data valuation has been hindered by a lack of clarity over what data is and why it is valuable. This lack of clarity also increases the chances of legal decisionmakers valuing data in inconsistent ways, which would create further confusion, inefficiencies, and opportunities for regulatory arbitrage.

This Article proposes a unified framework for valuing data that will promote consistent valuations across fields of law. It begins by conceptualizing data as building blocks: It is of little value on its own. But when placed in skillful and creative hands, it can unlock choices for its holders—choices they would not otherwise have—that can generate tremendous profits. Thus, data constitutes what is known as a “real option.” This Article shows how using real options to value data can significantly improve upon existing data valuation practices.

INTRODUCTION

The rise of data analytics has been staggering. In 2021, 1.134 trillion megabytes were created every day, totaling 74 zettabytes for the year.1See Louie Andre, 53 Important Statistics About How Much Data Is Created Every Day, Fins. Online (July 16, 2023), https://financesonline.com/how-much-data-is-created-every-day [https://perma.cc/RKL6-9L8S]. As large as this is, projections for 2022 are over 25% higher.2Approximately 94 zettabytes of new data were projected to be created during 2022. Id. Big data and new information technology are changing the tools, business models, operations, and mindset that firms, nonprofits, and governments use every day, quietly transforming business and society.3See generally Geoffrey G. Parker, Marshall Van Alstyne & Paul Sangeet Choudary, Platform Revolution: How Networked Markets Are Transforming the Economy and How to Make Them Work for You (2016); Marco Iansiti & Karim R. Lakhani, Competing in the Age of AI: Strategy and Leadership When Algorithms and Networks Run the World (2020); Ajay Agrawal, Joshua Gans & Avi Goldfarb, Power and Prediction: The Disruptive Economics of Artificial Intelligence (2022).

These changes come with challenges. A variety of legal regimes govern economic activity; in many instances, those legal regimes must determine the value of owning or using particular assets, including data.

For example, one area in which data valuation plays an important role is in contracting. Firms contract with each other daily with regard to the sale of data. This includes first-party data sales, such as when Target sells data that it has collected to Procter & Gamble, as well as third-party data sales, in which data aggregators or brokers sell data that others have collected. If one party breaches the contract, what remedies are available to their counterparty?4Cemre Bedir, Contract Law in the Age of Big Data, 16 Eur. Rev. Cont. L. 347, 362–64 (2020). In corporate law, target boards have fiduciary duties to make sure their shareholders are being appropriately compensated during mergers and acquisitions. This requires having a handle on the value of the target firm’s assets, including its data.5Doron Nissim, Big Data, Accounting Information, and Valuation, 8 J. Fin. & Data Sci. 69, 70 (2022). In tax, the taxation of intangible assets and specifically of data is a growing issue of concern.6Young Ran (Christine) Kim & Darien Shanske, State Digital Services Taxes: A Good and Permissible Idea (Despite What You Might Have Heard), 98 Notre Dame L. Rev. 741, 797–798 (2022).

These questions can potentially be even thornier when specific aspects of data must be valued, rather than full ownership. To take another example, suppose that one firm’s negligence results in another firm’s proprietary data leaking to the public. To award damages, a court must determine how much the damaged firm lost from having the data become public—but how much is that?7D. Daniel Sokol & Tawei Wang, A Review of Empirical Literature in Information Security, 95 S. Cal. L. Rev. 95, 109 (2021). Similarly, in antitrust, when control of data plays an important role in anticompetitive behavior, is it ownership of the data itself that creates the problem, or the use of the data?8See Tilman Kuhn, Kristen O’Shaughnessy, Tobias Pesch, Jaclyn Phillips & D. Daniel Sokol, Big Data and Data-Related Abuses of Market Power, in Research Handbook on Abuse of Dominance and Monopolization 438, 438–55 (Pinar Akman, Or Brook & Kristianos Stylianou eds., 2023) (providing an overview of cases in the United States and European Union). Does sharing the data with competitors make matters better or worse?9Id. The rise of generative artificial intelligence (“AI”), which requires data for its machine learning models, may create additional concerns as to the value of various data usage rights.

Unfortunately, the difficulties of conceptualizing data have hampered law’s attempts to incorporate the data revolution into multiple legal doctrines. This has opened the door to confusion, inconsistency, and inefficiency. Decisionmakers have confused data with algorithms, and struggled with how to apply certain doctrines to the legal rights that data owners and data users possess. This increases the risks that regulators in different substantive areas of law, as well as in different jurisdictions, will take inconsistent approaches. This creates inefficiencies as parties subject to multiple regimes work to navigate them. Different legal regimes also create opportunities for regulatory arbitrage, in which regulated parties take advantage of divergent regulatory rules to achieve the regulatory treatment they want while making only minor changes to their economic activities.

To address these concerns, this Article offers a general framework for valuing data based on real options valuation. The financial economics literature pioneered the use of real options to better assess business decision-making under uncertainty.10See generally Avinash K. Dixit & Robert S. Pindyck, Investment Under Uncertainty (1994). This approach has since been extended beyond finance to address other areas of uncertainty.11See, e.g., Joseph A. Grundfest & Peter H. Huang, The Unexpected Value of Litigation: A Real Options Perspective, 58 Stan. L. Rev. 1267, 1282–91 (2006); Andrew Chin, Teaching Patents as Real Options, 95 N.C. L. Rev. 1433, 1434–35 (2017). Real option analysis provides a better path forward than the current patchwork of doctrinal and analytical approaches. A real options approach is conceptually correct and thus has the potential to ameliorate the confusion, inconsistency, and inefficiency of existing approaches. To our knowledge, this is the first article to utilize real options as a method to value data, in law or otherwise.
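The intuition behind treating an asset as a real option can be illustrated with a one-period example. The sketch below is a simplified textbook-style illustration, not the valuation method this Article develops: all figures are hypothetical, the function names are invented for illustration, and it discounts expected payoffs at a single assumed rate rather than using risk-neutral pricing. It compares a firm that must commit to a data project today with a firm that holds the data, waits one period, and invests only if the favorable state materializes.

```python
def commit_now_value(p_up, payoff_up, payoff_down, cost, rate):
    """Net present value if the firm pays `cost` today and takes
    whatever payoff arrives next period (no flexibility)."""
    expected_payoff = p_up * payoff_up + (1 - p_up) * payoff_down
    return expected_payoff / (1 + rate) - cost


def wait_and_decide_value(p_up, payoff_up, payoff_down, cost, rate):
    """Value today if the firm can wait one period and pay `cost`
    only in states where the payoff exceeds it (the option)."""
    gain_up = max(payoff_up - cost, 0.0)      # invest only if worthwhile
    gain_down = max(payoff_down - cost, 0.0)  # otherwise walk away
    return (p_up * gain_up + (1 - p_up) * gain_down) / (1 + rate)


# Hypothetical inputs: 50% chance the project pays 200, 50% chance it pays 40;
# the project costs 100; the discount rate is 25% per period.
now = commit_now_value(0.5, 200.0, 40.0, 100.0, 0.25)
wait = wait_and_decide_value(0.5, 200.0, 40.0, 100.0, 0.25)
print(now, wait)
```

Under these assumed numbers, committing today has a negative expected value, while the right to decide later has a positive one. The difference is the value of the choice that holding the data preserves.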

Along with its potential benefits as a method of data valuation, real options analysis does have its drawbacks. Real options theory is complicated, which creates implementation challenges that must be overcome, or at least managed, to achieve the benefits described above. That said, real options analysis is an improvement over existing approaches. Applying a more unified theory also allows for a more standardized approach that can then be tailored to specific doctrines and areas of law.

This Article proceeds as follows. Part I provides context regarding the big data revolution and the growing importance of data. In doing so, it reviews the extant theoretical and empirical literatures on data valuation. Part II identifies the implications of data valuation for law by providing some case studies across fields. It includes vignettes demonstrating the types of issues that emerge and some current legal approaches. Next, in Part III, the Article explores how real options analysis offers a viable potential solution to the current patchwork of legal approaches. The Article concludes on how agencies and courts would benefit from such an approach, notes limitations on the use of real options, and offers avenues of future research.

I.  THE DATA REVOLUTION AND THE VALUE OF DATA

To understand the importance of data valuation methods to the law, one must understand two other, related points. First, one must have a grounding in why and how data is used in the modern economy. Second, one must consider how to think about how those use cases translate into value estimates.

A.  Digital Transformation

To understand the role of data in the modern economy, one must consider three related points: (1) The increase in AI techniques that can generate value from data; (2) The increase in data to which such AI techniques can be applied; and (3) The amount of value that these techniques are creating. Understanding these dynamics allows us to explore specific case studies that apply these insights across a number of areas of law.

1.  Generating Value from Data with AI

As a starting point, companies across the economy have moved to increasingly digitized, AI-enabled business strategies, producing profound effects on value creation and innovation.12Iansiti & Lakhani, supra note 3, at 28–40; Ajay Agrawal, Joshua Gans & Avi Goldfarb, Prediction Machines: The Simple Economics of Artificial Intelligence 11–13 (2018); Hau L. Lee, Big Data and the Innovation Cycle, 27 Prod. & Operations Mgmt. 1642, 1645–46 (2018); Hal R. Varian, Big Data: New Tricks for Econometrics, 28 J. Econ. Persps. 3, 7–25 (2014) (analyzing the uses of big data in economics). Many companies have become platforms, where the ability to create economies of scale and scope has allowed for a generation of “new opportunities to create, appropriate, and deliver value for firms and [users] . . . .” D. Daniel Sokol, Technology Driven Government Law and Regulation, 26 Va. J.L. & Tech. 1, 2 (2023). We use the term AI broadly here, as a way to encompass algorithms that improve prediction and decision-making.13For applications in law, see for example, Amy L. Stein, Artificial Intelligence and Climate Change, 37 Yale J. on Reg. 890, 895–900 (2020); Ashley Deeks, The Judicial Demand for Explainable Artificial Intelligence, 119 Colum. L. Rev. 1829, 1829–32 (2019); W. Nicholson Price II, Regulating Black-Box Medicine, 116 Mich. L. Rev. 421, 432–37 (2017). There are different approaches to AI, such as neural networks and machine learning, among others.14Xiao Liu, Dokyun Lee & Kannan Srinivasan, Large-Scale Cross-Category Analysis of Consumer Review Content on Sales Conversion Leveraging Deep Learning, 56 J. Mktg. Rsch. 918, 924–25 (2019) (using neural networks in marketing research); Michael L. Rich, Machine Learning, Automated Suspicion Algorithms, and the Fourth Amendment, 164 U. Pa. L. Rev. 871, 871–80 (2016) (discussing machine learning in a legal context).

When thinking about data and AI, it can be helpful to consider a simple, three-tier vertical model of how companies and other actors use data and AI to further their goals.

 

Figure 1.

At the first stage is data. If AI is the product or output, data serve as the input. Data feed the needs of AI-enabled technologies. Data underlie machine learning and prediction models, and it is data that has fueled digital transformation.15Marshall Fisher & Ananth Raman, Using Data and Big Data in Retailing, 27 Prod. & Operations Mgmt. 1665, 1666–67 (2018); Anindya Ghose & Vilma Todri-Adamopoulos, Toward a Digital Attribution Model: Measuring the Impact of Display Advertising on Online Consumer Behavior, 40 Mgmt. Info. Sys. Q. 1, 2–3 (2016). Without sufficient quantity and quality of data, many current AI techniques simply cannot produce very good results.

Data often is the input to the next stage—powering an algorithm. The algorithm itself is not the end of the production. Rather, the algorithm simply enables better prediction. It is at the stage of prediction where there are outputs to AI—outputs that can generate tremendous value.

For example, when a user types terms into a search engine, that engine might consider data about what sites other users who typed in similar terms ultimately clicked on (among other data) when deciding what results should appear. Diagnostic software might compare a patient’s MRI to millions of MRI images that have already been analyzed by doctors to estimate the likelihood that the patient has breast cancer. Data drives the AI, the AI makes predictions, and those predictions enable better decision-making, which creates economic value.
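The data, algorithm, and prediction stages described above can be sketched with the search example. This is a deliberately minimal illustration under assumed inputs: the click log, the site names, and the `rank_results` function are hypothetical, and counting past clicks stands in for the far more elaborate models that real search engines use.

```python
from collections import Counter


def rank_results(query, click_log):
    """Order sites by how often earlier users who typed the same
    query ended up clicking on them (most-clicked first)."""
    # Stage 1 (data): keep only the log entries for this query.
    clicked_sites = [site for past_query, site in click_log if past_query == query]
    # Stage 2 (algorithm): count clicks per site.
    click_counts = Counter(clicked_sites)
    # Stage 3 (prediction): the ranking is a guess at what this user wants.
    return [site for site, _ in click_counts.most_common()]


# Hypothetical click data: (query typed, site ultimately clicked).
click_log = [
    ("piano ballads", "site-a.example"),
    ("piano ballads", "site-b.example"),
    ("piano ballads", "site-b.example"),
    ("piano ballads", "site-b.example"),
    ("piano ballads", "site-a.example"),
    ("tax law", "site-c.example"),
]

print(rank_results("piano ballads", click_log))
```

With more logged clicks, the counts become a more reliable signal of what users want, which is the sense in which the data feeds the quality of the prediction.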

2.  Increase in Data

While many facets of AI are themselves not new, the speed of data collection and processing has significantly improved these tools’ impact.16Ajay Agrawal, Joshua Gans & Avi Goldfarb, Prediction, Judgment, and Complexity: A Theory of Decision-Making and Artificial Intelligence, in The Economics of Artificial Intelligence 89, 93 (Ajay Agrawal, Joshua Gans & Avi Goldfarb eds., 2019). Data is vast and the various ways to use it have grown significantly, such that there are distinct data-related strategies that firms may adopt.

The data ecosystem is worth exploring briefly. Data can be bought and sold like many other inputs.17Maryam Farboodi & Laura Veldkamp, Data and Markets 1 (Mass. Inst. of Tech. Sloan, Research Paper No. 6887–22, 2022), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4284192 [https://perma.cc/M4JS-4Y2A]. It can be acquired from public sources. It can be collected from what can be termed data suppliers. For example, first-party companies such as Netflix or Spotify can sell their data and databases to other companies; firms regularly sell large quantities of this type of data through basic business transactions.18Firms also sell “exhaust” data; this is data that is unrelated to the business transactions that produced it but that has a secondary purpose for other kinds of business. Third-party data brokers, apps and internet service providers (“ISPs”) that can provide locational or other data, and data aggregators also play significant roles in the data ecosystem.19Llewellyn D.W. Thomas & Aija Leiponen, Big Data Commercialization, 44 Inst. Elec. & Electronics Eng’rs: Eng’g Mgmt. Rev. 74, 80 (2016). Data brokers buy and sell data, thereby allowing firms to acquire new data to make better predictions.20See Nico Neumann & Catherine Tucker, Data Deserts and Black Boxes: The Impact of Socio-Economic Status on Consumer Profiling (February 27, 2023) (unpublished presentation) (on file with the Southern California Law Review); Arion Cheong, D. Daniel Sokol & Tawei Wang, Cookie Intermediaries: Does Competition Leads to More Privacy? 2–5 (April 16, 2023) (unpublished manuscript) (on file with Southern California Law Review). This increase in data sources is an important change, as it makes data more widely available. This enables more actors both to put it to use and to experiment and innovate with it.21To the extent that data is accessible from many sources, that weakens arguments that data access is a key barrier to entry.

Indeed, data has become both a make and buy decision.22See Jordan M. Barry & Victor Fleischer, Tax and the Boundary of the Firm 2–7 (Aug. 28, 2023) (unpublished manuscript) (on file with Southern California Law Review). See generally R.H. Coase, The Nature of the Firm, 4 Economica 386 (1937). That is, firms have significant opportunities to generate their own data—such as Target keeping track of what consumers buy at Target—and to acquire third-party data from other actors. This is especially true with respect to end-consumer data.23See Alessandro Bonatti, Munther Dahleh, Thibaut Horel & Amir Nouripour, Selling Information in Competitive Environments 4–5 (Mass. Inst. of Tech. Sloan Sch. of Mgmt., Working Paper No. 6532-21, 2022), https://arxiv.org/pdf/2202.08780 [https://perma.cc/7MWJ-AZNQ]; Anja Lambrecht & Catherine E. Tucker, Can Big Data Protect a Firm From Competition?, Competition Pol’y Int’l Antitrust J. (Jan. 17, 2017), https://www.competitionpolicyinternational.com/can-big-data-protect-a-firm-from-competition [https://perma.cc/JK39-W2CR]; Thomas & Leiponen, supra note 19, at 80.

3.  Amount of Value

What is this power of data? Typically, data is defined across four “V’s”: velocity, veracity, volume and variety.24See A.B.A. Section of Antitrust, Artificial Intelligence & Machine Learning: Emerging Legal and Self-Regulatory Considerations (Part One) 2 (2019), https://www.americanbar.org/content/dam/aba/administrative/antitrust_law/comments/october-2019/clean-antitrust-ai-report-pt1-093019.pdf [https://perma.cc/F9S2-8P5Q]. Combined, these four Vs create data value. Velocity is the speed at which data is collected and used. Volume is the sheer amount of data that is generated, which (at least at present) overwhelms our ability to process it; there is more data than ever before and every day we create 328.77 million terabytes of new data.25See Petroc Taylor, Volume of Data/Information Created, Captured, Copied, and Consumed Worldwide from 2010 to 2020, with Forecasts from 2021 to 2025, Statista (Sept. 8, 2022), https://www.statista.com/statistics/871513/worldwide-data-created [https://perma.cc/LZ5B-CSFM]. Veracity goes to the increasingly important issues of data accuracy and trustworthiness. Finally, variety reflects the diversity of data types that can be collected and used, such as e-mails, PDFs, and videos.

Data may come from many sources. The general rule of data is that the more the data, the greater the ability to feed AI and the better the ability to improve prediction,26Iansiti & Lakhani, supra note 3, at 16–27; Andrei Hagiu & Julian Wright, Data-Enabled Learning, Network Effects and Competitive Advantage 3 (May 2021) (unpublished manuscript), https://app.scholarsite.io/julian-wright/articles/data-enabled-learning-network-effects-and-competitive-advantage-3 [https://perma.cc/6J8A-L8MU]. although there are limits to what data alone can do.27See, e.g., Carmelo Cennamo, Building the Value of Next-Generation Platforms: The Paradox of Diminishing Returns, 44 J. Mgmt. 3038, 3039–41 (2018) (identifying diminishing returns to data); Hanna Halaburda, Mikolaj Jan Piskorski & Pinar Yildirim, Competing by Restricting Choice: The Case of Matching Platforms, 64 Mgmt. Sci. 3574, 3574–76 (2017) (identifying network saturation allowing for competition through differentiation in platforms); D. Daniel Sokol & Roisin Comerford, Antitrust and Regulating Big Data, 23 Geo. Mason L. Rev. 1129, 1135–40 (2016) (illustrating that it is not the data but what you do with them that matters as well as other limits to data). Data must be processed, via AI or otherwise, to reap benefits.28Ron Berman & Ayelet Israeli, The Value of Descriptive Analytics: Evidence from Online Retailers, 41 Mktg. Sci. 1074, 1076 (2022) (finding that e-commerce data analytics dashboards increase weekly revenues between 4%–10%). When properly processed, big data allows firms to improve their products and services and to develop new such products and services.29Sokol & Comerford, supra note 27, at 1134.

The academic and practitioner literature on data valuation is complex. First, there is the literature on data brokers. In some senses, the costs of data are lower now than ever before.30Avi Goldfarb & Catherine Tucker, Digital Economics, 57 J. Econ. Literature 3, 3 (2019). The reduced cost of data allows for the creation of a wide variety of sophisticated algorithms that can produce insights that would elude unassisted humans.31Iansiti & Lakhani, supra note 3, at 62–70. The ability to utilize data to feed AI allows for opportunities to better create, appropriate, and deliver economic value not merely for AI-driven firms but for the different users of digital platforms such as advertisers, complementors, and customers.32Ron Adner, Phanish Puranam & Feng Zhu, What Is Different About Digital Strategy? From Quantitative to Qualitative Change, 4 Strategy Sci. 253, 258 (2019); Michael G. Jacobides, Carmelo Cennamo & Annabelle Gawer, Towards a Theory of Ecosystems, 39 Strategic Mgmt. J. 2255, 2257 (2018); Geoffrey Parker, Marshall Van Alstyne & Xiaoyue Jiang, Platform Ecosystems: How Developers Invert the Firm, 41 Mgmt. Info. Sys. Q. 255, 259 (2017).

This transformation creates significant economic value, but the drivers of that value are not well understood by courts and regulatory bodies. In some cases, regulation might stymie the use of data and chill innovation and investment.33See Jian Jia, Ginger Zhe Jin & Liad Wagman, The Short-Run Effects of the General Data Protection Regulation on Technology Venture Investment, 40 Mktg. Sci. 661, 677 (2021) (finding a decrease in venture capital investment as a result of GDPR); Rebecca Janssen, Reinhold Kesler, Michael E. Kummer & Joel Waldfogel, GDPR and the Lost Generation of Innovative Apps 1 (Nat’l Bureau of Econ. Rsch., Working Paper No. 30028, 2022) (finding a reduction of apps by one third as a result of GDPR). In other cases, the potential portability of certain types of data has motivated increased legislative and regulatory action.34Org. for Econ. Coop. & Dev., Data Portability, Interoperability and Digital Platform Competition 42 (2021). In other situations, courts have held that owners of certain types of data have certain rights, such as the right to exclude others from such data. The exact value—either of the underlying data itself or of the rights to exclude others—may not always be clear.35Francesco Decarolis & Gabriele Rovigatti, From Mad Men to Maths Men: Concentration and Buyer Power in Online Advertising, 111 Am. Econ. Rev. 3299, 3299–303 (2021) (discussing ad auctions). There are yet other areas in which data-related transactions occur on a regular basis, but which have not produced judicial decisions to date.36Id.

It is these sorts of complexities as to law and data to which we next turn.

B.  Disagreements on How to Think About Data Creating Value

Valuing data presents conceptual challenges because data is unlike other assets, including other intangible assets. The first problem is to understand how data, although a building block for constructing a final product, is not like traditional tangible assets such as the bricks and steel used to build a factory. Data can be collected and mixed in a number of different, complex ways. Further, unlike bricks, data is non-rivalrous; more than one firm can use the same data.37Charles I. Jones & Christopher Tonetti, Nonrivalry and the Economics of Data, 110 Am. Econ. Rev. 2819, 2834 (2020). For instance, someone’s driving history can be used at the same time by multiple firms, in the same or different industries (for example, advertisers, insurance companies, credit card companies). As Jones and Tonetti explain:

An analogy may be helpful. Because capital is rival, each firm must have its own building, each worker needs her own desk and computer, and each warehouse needs its own collection of forklifts. But if capital were nonrival, it would be as if every auto worker in the economy could use the entire industry’s stock of capital at the same time. Clearly this would produce tremendous economic gains. This is what is possible with data.38Id. at 2820.

Thus, non-rivalry means that valuation may be harder across a number of the traditional measurements.

Further complicating data is that it is (mostly) non-exclusive.39But see Autorité de la concurrence, Décision n° 14-MC-02 du 9 septembre 2014 relative à une demande de mesures conservatoires présentée par la société Direct Energie dans les secteurs du gaz et de l’électricité (2014) (identifying unique data because of regulation as to customer data and contracts). For example, if someone collects public records about home purchases into a comprehensive database, that does not prevent others from collecting that same information in the same way. This is in stark contrast to some other intangible assets, including traditional forms of IP such as patents and copyrights, which create value by conferring exclusive rights on their holders.40John P. Conley & Christopher S. Yoo, Nonrivalry and Price Discrimination in Copyright Economics, 157 U. Pa. L. Rev. 1801, 1818–19 (2009).

Both of these indicia suggest that the underlying value of the data, rather than that of the algorithm, may be small. When the input (data) is easily available to all, it is the actor’s ability to make use of the input—that is, the algorithm—that creates the value, not the input itself. For example, a classic crème brûlée recipe has only four ingredients—cream, sugar, egg yolks, and vanilla. All of these items are widely available. The ability to charge a premium for the final product is a function of the baking skill of the pastry chef.

Beyond non-rivalry and non-excludability, some regulation, such as the European Digital Markets Act,41Proposal for a Regulation of the European Parliament and of the Council on Contestable and Fair Markets in the Digital Sector (Digital Markets Act), COM (2020) 842 (Dec. 15, 2020) [hereinafter Proposal for a Regulation]. requires fair, reasonable, and non-discriminatory (“FRAND”) licensing. Even in IP and antitrust, FRAND terms are not always clearly understood.42Herbert Hovenkamp, FRAND and Antitrust, 105 Cornell L. Rev. 1683, 1684 (2020). It stands to reason that in data, with fewer cases to provide guidance across different areas of law, the nature of FRAND obligations is even less clear. Further, certain types of data have sharing requirements in practice that may change the valuation of data, such as requirements for data portability.

Data is also unlike some other intangible assets because of the speed at which data can become obsolete.43Ehsan Valavi, Joel Hestness, Marco Iansiti, Newsha Ardalani, Feng Zhu & Karim R. Lakhani, Time Dependency, Data Flow, and Competitive Advantage 10 (Harv. Bus. Sch., Working Paper No. 21-099, 2021) (“High perishability undermines the importance of data volume or historical data in creating a competitive advantage.”). Much data gets stale over time.44Ehsan Valavi, Joel Hestness, Newsha Ardalani & Marco Iansiti, Time and the Value of Data 1 (Harv. Bus. Sch., Working Paper No. 21-016, 2020). This suggests that much data is a diminishing asset, something which IP such as patents or copyrights do not face nearly as quickly because those rights last for longer periods.

II.  THE IMPLICATIONS OF DATA VALUATION FOR LAW

There are many areas of law for which valuation of various assets is important. Data is an increasingly valuable asset. Unfortunately, there is currently relatively little law on how to value data. Courts and regulators have generally avoided the question whenever possible, perhaps out of concern for the difficulty of the problem or uncertainty about how to proceed, and such cases often get decided on other grounds. This raises the chances that different legal areas will use different valuation methods. Such inconsistency creates dilemmas as to how to allocate legal rights and responsibilities. Perhaps the clearest way of understanding this tension across areas of law is to consider the purpose of damages. Damages exist to compensate a potential victim for violations of law and/or to deter the violator from doing so again.45Gary S. Becker, Crime and Punishment: An Economic Approach, 76 J. Pol. Econ. 169, 172–73 (1968). There are other potential justifications for damages, such as retributivism, but these are the two justifications raised most frequently in the civil context. Methods across areas of law might include: (1) a cost-based approach based on the replacement cost; (2) a market-based approach based on similar acquisitions of data (or companies with data); and (3) an income-based approach, to the extent that the data is producing income via sales or even licensing. To this, we add the importance of a fourth possibility, an options-based approach. Often, outcomes seem to be highly contextual rather than based on valuation methodology.46Feng Chen, Kenton K. Yee & Yong Keun Yoo, Robustness of Judicial Decisions to Valuation-Method Innovation: An Exploratory Empirical Study, 37 J. Bus. Fin. & Acct. 1094, 1097 (2010). A lack of consistency is significant because of the growing stake of data as an important part of economic activity.

Which approach to take ultimately varies across areas of law such as IP, antitrust, mergers and acquisitions (M&A), bankruptcy, and torts. One important driver is what information courts and parties can easily measure. When contracts (and comparable transactions) are not easy to find, private negotiations between contracting parties in the shadow of the law are another important driver. These questions become more salient as we try to understand how issues involving big data reverberate across a number of areas of law and in terms of the value of data overall. The biggest question is how much value we think is in big data.47We assume that data creates value. See Maryam Farboodi, Roxana Mihet, Thomas Philippon & Laura Veldkamp, Big Data and Firm Dynamics, 109 Am. Econ. Assoc. Papers & Proc. 38, 42 (2019). We might also imagine that information is simply a byproduct of economic activity. See Pablo D. Fajgelbaum, Edouard Schaal & Mathieu Taschereau-Dumouchel, Uncertainty Traps, 132 Q. J. Econ. 1641, 1642 (2017).

A.  Valuation Is Important to Many Areas of Law

Below we offer some examples of how data valuation plays a role across various areas of law. We highlight these examples as a way to understand some of the complexity that requires a more generalized rethink as to valuation method of data. Understanding these complexities helps clarify the value of data as well as some of the struggles that different areas of law are currently experiencing as they seek to value data.

1.  Antitrust

Antitrust has tried to address the questions of competition and the exercise of market power in two contexts—mergers and conduct cases. These produce two types of antitrust cases—those where data is an input and those in which data is a product. However, there is little case law in each area. Consequently, the problem with both sets of circumstances is that we tend not to see litigated cases that get to the valuation issue of the data.

Antitrust primarily addresses behavior in one of two ways. The first is through ex ante enforcement through merger control. Essentially, regulators can block mergers that are expected to produce antitrust problems. On the mergers side, most cases do not go to court, which means that litigated cases may not be representative. Even in those cases for which there is a judicial opinion, not all issues may get addressed. Scholars have expressed general frustration with what gets decided under the shadow of merger law.48D. Daniel Sokol & James A. Fishkin, Antitrust Merger Efficiencies in the Shadow of the Law, 64 Vand. L. Rev. En Banc 45, 45–46 (2011). Thus, the basis for decisions on many issues, including data valuation, is limited or incomplete. As Professors Katz and Shelanski lament, “The overall picture of current merger enforcement practice is, therefore, murky.”49Michael L. Katz & Howard A. Shelanski, Merger Analysis and the Treatment of Uncertainty: Should We Expect Better?, 74 Antitrust L.J. 537, 547 (2007).

Cases provide some guidance on how antitrust courts and agencies think about data, which gives some insight on how to think about data’s value. Yet much uncertainty remains. As of this writing, no mergers have been blocked on data theory grounds in the United States. Nor have there been any decided cases that explain the valuation method used for such transactions that weigh the data rather than its use to a specific platform.

In the case of data, let us begin with mergers and the possibility that data is itself the market. One such deal that included data as the market is the 2014 CoreLogic-DataQuick merger.50See Decision & Order at 5–8, In re CoreLogic, Inc., Docket No. C-4458 (F.T.C. May 21, 2014). In that transaction, the Federal Trade Commission cleared the transaction with a database divestiture but did not explain the valuation technique employed. Alas, this has been typical with regard to antitrust analysis of mergers that include data as a market. Similarly, people generally have not discussed mergers that include valuable data as an input (for example, Microsoft/LinkedIn, Apple/Shazam) as matters of valuation. At best, there are transactions that have received some sort of conditional approval such as Nielsen/Arbitron but without an explicit discussion of data valuation.51See Decision & Order at 5–7, In re Nielsen Holdings N.V., Docket No. C-4439 (F.T.C. Feb. 28, 2014).

Antitrust, through public and private enforcement, polices against anticompetitive conduct by one or more firms that harms competition. Conduct cases in antitrust involving data issues have not resolved the data valuation question, either. Complicating antitrust further is that duties to deal with competitors are limited, which means that such data sharing cases do not get to the data valuation stage of the case. Rather, these cases are decided based on the premise that data is not required to be shared in the first place. Yet, understanding such cases helps to explore the value of data because the discussion helps to inform the value of data use and ownership.

For example, Section 2 of the Sherman Act generally imposes no requirements to deal with one’s competitors.52Sherman Act, 15 U.S.C. § 2 (1982). In Aspen Skiing Co. v. Aspen Highlands Skiing Corp., the Supreme Court held that there are some limited circumstances under which Section 2 requires monopolistic firms to deal with their rivals.53Aspen Skiing Co. v. Aspen Highlands Skiing Corp., 472 U.S. 585, 585 (1985). Courts have further narrowed Aspen Skiing’s holding since. Most recently, the D.C. Circuit dismissed a monopolization case that forty-six states brought against Meta based on the court’s narrow reading of Aspen Skiing.54New York v. Meta Platforms, Inc., 66 F.4th 288, 305 (D.C. Cir. 2023). Guam and the District of Columbia were also plaintiffs in the litigation.

Cases brought under other provisions of the Sherman Act have also implicated the value of data. However, much like the Section 2 monopolization cases, courts examining Section 1 of the Sherman Act have offered little guidance on how to value data. For example, in Authenticom, Inc. v. CDK Global, LLC, Authenticom brought a claim against CDK for closing its system for data and thereby barring data scrapers from access. The Seventh Circuit ruled in favor of CDK on the basis that forced data sharing was inconsistent with precedent.55Authenticom, Inc. v. CDK Global, LLC, 874 F.3d 1019, 1025–27 (7th Cir. 2017). Because of this ruling, which dismissed the case on essential facilities grounds, the data valuation issue was never addressed. Of course, that does not mean that the data does not have value, merely that the court was able to dispose of the case without determining what the data’s value was.

Similar to antitrust enforcement, competition regulation increasingly plays an important role in big data valuation. This comes up specifically in the case of the Digital Markets Act (“DMA”), the European approach to ex ante regulation of data.56Proposal for a Regulation, supra note 41, at 7. See Nicolas Petit, The Proposed Digital Markets Act (DMA): A Legal and Policy Review, 12 J. Eur. Competition L. & Prac. 529, 529–32 (2021) (providing an overview of the Digital Markets Act). Regarding “gatekeeper” firms, the DMA states:

The gatekeeper shall provide to any third-party undertaking providing online search engines, at its request, with access on fair, reasonable and non-discriminatory terms to ranking, query, click and view data in relation to free and paid search generated by end users on its online search engines. Any such query, click and view data that constitutes personal data shall be anonymised.57Digital Markets Act, 2022 O.J. (L 265) art. 6 ¶ 11.

Of course, data from a gatekeeper will not generate profits on its own; gatekeeper data must still be combined with some effort by recipients. But this reality makes it harder to assess the incremental profits the recipient earns as a result of having access to the data.58Incremental revenue, which one might hope to observe, will overstate the benefits; one must also consider incremental costs. 

2.  Business Law

Business law increasingly confronts data valuation. Unfortunately, it does so in ways that do not always show the precision that we believe is necessary to unlock a more accurate value of data assets. For example, data valuation questions arise within the context of both mergers and acquisitions (“M&A”) and bankruptcy. A number of factors arise in each context that make data valuation more difficult. Within the merger context, the purpose of valuation is to best help the acquiring and target boards to fulfill their fiduciary duties to ensure that the price paid for the acquisition is an appropriate one.

Overall, corporate law has grappled with how to account for intangibles. Many assets, including branding and intangibles such as IP, are lumped together under the heading of “goodwill.” However, the goodwill from reputation and branding is different from goodwill that is the basis of a regenerative asset such as data. Further, how data is stored and how easily it can be processed and integrated make such a valuation more challenging.59Chengxin Cao, Gautum Ray, Mani Subramani & Alok Gupta, Enterprise Systems and M&A Outcomes for Acquirers and Targets, 46 Mgmt. Info. Sys. Q. 1295, 1299–300 (2022) (identifying similar issues in the context of integration of business enterprise software in M&A).

Different data sets may have different levels of privacy requirements, such as data that is protected under the Health Insurance Portability and Accountability Act (“HIPAA”) versus commercial health data, which has less stringent requirements. Identifying what sort of data companies may keep, for how long, how stale such data get, and the potential liabilities of such data is complex.60Sometimes firms might unknowingly buy a data lemon, with liabilities that attach because of a data breach, such as Marriott’s acquisition of Starwood’s hotel chain. However, this is a somewhat different question than valuing the data set itself. Chirantan Chatterjee & D. Daniel Sokol, Don’t Acquire a Company Until You Evaluate Its Data Security, Harv. Bus. Rev. (April 16, 2019), https://hbr.org/2019/04/dont-acquire-a-company-until-you-evaluate-its-data-security [https://perma.cc/XH4E-BK6M]. Yet, there are very few cases that offer direct guidance on how to value data in the corporate and M&A setting. Thus, data valuation ends up a financial black box with potentially large implications if and when such cases go to litigation. This sort of uncertainty creates potential risk for deals, particularly those deals for which the underlying data may be a significant asset.61Michel Benaroch, Yossi Lichtenstein & Karl Robinson, Real Options in Information Technology Risk Management: An Empirical Validation of Risk-Option Relationships, 30 Mgmt. Info. Sys. Q. 827, 828 (2006) (suggesting a risk management-based approach to address the uncertainty associated with data breaches).

Finally, unresolved issues include requirements of how to store data62Woodrow Hartzog & Neil Richards, Privacy’s Constitutional Moment and the Limits of Data Protection, 61 B.C. L. Rev. 1687, 1706 (2020). as well as how to destroy data.63Some forms of data disposal have specific regulation. See, e.g., Disposing of Consumer Report Information? Rule Tells How, U.S. Fed. Trade Comm’n (June 2005), https://www.ftc.gov/business-guidance/resources/disposing-consumer-report-information-rule-tells-how [https://perma.cc/RWW9-2EXJ]. The lack of uniform federal privacy legislation makes such analysis more difficult. Federal agencies, especially the FTC, enforce privacy protections,64Ginger Zhe Jin & Andrew Stivers, Protecting Consumers in Privacy and Data Security: A Perspective of Information Economics 1 n.2 (May 22, 2017) (unpublished manuscript), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3006172 [https://perma.cc/N3E3-4NGV]. but private actions also play a role.65See generally Daniel J. Solove & Woodrow Hartzog, Breached! Why Data Security Law Fails and How to Improve It (2022) (discussing the shortcomings of data privacy and privacy laws). Moreover, states can impose additional rules on top of the federal ones. For example, California took inspiration from the General Data Protection Regulation (“GDPR”) and adopted the California Consumer Privacy Act (“CCPA”) and the California Privacy Rights Act (“CPRA”).66California Consumer Privacy Act of 2018, Cal. Civ. Code §§ 1798.100–199.100 (2018); California Privacy Rights Act, Cal. Civ. Code §§ 1798.100–199.100 (2018).

This issue of data valuation similarly plays itself out in the bankruptcy setting. In some settings, the data itself, such as customers’ spending behavior,67Perhaps this is a more sophisticated version of a customer list, which gets trade secret protection under the Defend Trade Secrets Act. may be the asset. Take the example of the bankruptcy proceeding for Caesar’s Entertainment Operating Corp casinos.68James E. Short & Steve Todd, What’s Your Data Worth?, 58 Mass. Inst. Tech. Sloan Mgmt. Rev. 17, 17 (2017). Creditors viewed the company’s data (customer-specific data on spending habits) as one of the company’s most important assets. Yet, as is often the case in bankruptcy proceedings, this issue was resolved through negotiations in the shadow of the law, leaving behind no case law to help shape future data valuation inquiries. On one side, there was a note by the bankruptcy court examiner that properties of Caesar’s that were sold off were worse off because they could not leverage the data of the rewards program—but at the same time, the examiner recognized that it would be difficult to sell the rewards program to other buyers.69Id. Thus, the court never ultimately decided how to value the data in light of these complexities. This is common in bankruptcy, where few decisions come in the form of a bankruptcy court ruling.70Douglas G. Baird & Robert K. Rasmussen, The End of Bankruptcy, 55 Stan. L. Rev. 751, 786–88 (2002).

3.  Synthesis

These case studies lead to a number of conclusions. First, courts do not always get to valuation questions. This may be because cases are decided on other grounds for legitimate reasons or because judges feel uncomfortable getting to the actual valuation and so they rule on different grounds to avoid the exercise. Second, there is uncertainty of valuation methodologies across areas of law, as well as potential for some such issues to simultaneously emerge in multiple contexts (for example, M&A and antitrust, M&A and bankruptcy, antitrust and data privacy) that may employ different methodologies. Accordingly, we believe that a more consistent approach may better facilitate business certainty with regard to valuation models.

III.  REAL OPTIONS AS A SOLUTION

Real options analysis provides a framework that can be used to value data across different contexts, including different areas of law. We provide a basic introduction to real options before discussing the advantages and disadvantages of using them to value data. We then discuss how this approach might be employed in the real world.

A.  Real Options

An option is the right, but not the obligation, to do something. For example, if Maria has the right to paint her house green, to travel to Paris, or to order pizza for lunch, those are all options.

In finance, the most well-known options give their holders the right to buy or sell a specific quantity of a particular asset at a specified time for a specified price. These options are known as financial options.71See Investment Products: Options, Fin. Inv. Regul. Auth., https://www.finra.org/investors/investing/investment-products/options [https://perma.cc/J6VN-7GPR] (last visited Aug. 28, 2023). For instance, Jacinta might have the right to buy 1,000 shares of Apple stock in three months’ time at a price of $150 per share. That right would be quite valuable if, three months from now, Apple stock is trading at $200 per share: Jacinta could buy 1,000 Apple shares for $150,000,72 1,000 shares * $150 purchase price per share = $150,000. then immediately sell them to other investors for $200,000,73 1,000 shares * $200 sale price per share = $200,000. netting her $50,000 of profit.74$200,000 revenue from sale of Apple shares – $150,000 paid for Apple shares = $50,000 profit.
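The arithmetic in Jacinta's example can be sketched in a few lines of Python. This is a minimal illustration, assuming the option cost nothing to acquire and ignoring transaction costs and the time value of money. The function name `call_option_profit` and its parameter names are hypothetical labels introduced here for illustration. The `max(..., 0)` step reflects the defining feature of an option: because it is a right and not an obligation, Jacinta simply declines to buy if Apple is trading below $150.

```python
def call_option_profit(shares, strike_price, market_price):
    """Profit from exercising a right to buy `shares` at `strike_price`
    and immediately reselling them at `market_price`.

    Illustrative assumption: the option itself was free, and there are
    no transaction costs. The holder walks away (profit of zero)
    whenever exercising would lose money.
    """
    cost_to_exercise = shares * strike_price
    resale_proceeds = shares * market_price
    return max(resale_proceeds - cost_to_exercise, 0)


# Jacinta's option: 1,000 Apple shares at $150 per share.
profit_if_stock_at_200 = call_option_profit(1_000, 150, 200)
profit_if_stock_at_120 = call_option_profit(1_000, 150, 120)

print(profit_if_stock_at_200)  # 50000
print(profit_if_stock_at_120)  # 0 (she does not exercise)
```

The second call uses a $120 market price, a figure not in the text, chosen only to show the case in which the option goes unexercised.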

Real options, like financial options, reflect the value of being able to react to changing conditions. However, rather than representing merely the right to buy or sell, they can encompass one’s ability to change one’s behavior in all manner of ways.75Real options are also called strategic options. Ivo Welch, Corporate Finance 363 (3rd ed. 2014). This ability to change course can be extremely valuable. A pair of simple, stylized examples help illustrate this point.

Example 1. Suppose that you are an executive at a company, and you are considering whether the company should launch a new product. It is unclear how consumers will react to the product; they may love it (iPods) or they may not (Zunes). Suppose that there is a 50% chance that the product will be a success, in which case it will generate $10 million of profits per year for the next ten years.76For conceptual clarity, and to avoid complicating the example with issues related to time value of money and discount rates, we assume that all of the payment values discussed in this example are present values—that is, the profit you will earn in year one (or two, or three, or seven, etc.) is worth $10 million to you today. On the other hand, there is a 50% chance that the product will be a commercial failure, in which case it will cost the company $20 million per year for the next ten years.

Under the facts of Example 1, the company should not launch the product.77For simplicity, this analysis assumes that you are risk-neutral. If you were risk-averse, the case against the project would be even stronger. Half of the time, the product will produce $100 million of profit;78$10 million in annual profits * 10 years = $100 million in total profits. the other half of the time it will produce losses of $200 million.79$20 million in annual losses * 10 years = $200 million in total losses. On average, then, launching the new product will cost the company $50 million.80 50% * $100 million + 50% * -$200 million = $50 million + -$100 million = -$50 million. Equivalently, the net present value (NPV) of this project is -$50 million.

Example 2. The facts are the same as in Example 1, except that now the company has the ability to stop making the new product after its first year on the market.

Under the facts of Example 2, the company should absolutely launch the product. When the product is a success, it will keep the product on the market. Everything will remain the same in that circumstance, and the company will earn $100 million of profit. But when the product is a commercial failure, the company can now cut its losses after one year. By doing so, the company will reduce its total losses when the product fails from $200 million to only $20 million.81The difference is between 1 year of $20 million annual losses and 10 such years. On average, the new product will now generate $40 million of profit.82 50% * $100 million + 50% * -$20 million = $50 million + -$10 million = $40 million. Equivalently, the NPV of this project is $40 million.

Taken together, Examples 1 and 2 show how valuable the ability to change course can be. Simply having the ability to give up on the product when it is not profitable transforms a project that loses $50 million into one that earns $40 million—a $90 million swing.83$40 million – -$50 million = $90 million. Since the only difference between these two Examples was the real option to give up on the product after a year, that option is worth $90 million.
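The calculations in Examples 1 and 2 can be reproduced with a short Python sketch. It follows the simplifying assumptions stated in the text: the decision-maker is risk-neutral and all payments are already expressed as present values. The function name `project_npv` and the constant names are hypothetical labels introduced here for illustration. The only input that differs between the two Examples is how many years the company must keep selling a product that has failed.

```python
# All dollar figures are in millions and are treated as present values,
# matching the simplifying assumption in the Examples.
P_SUCCESS = 0.5      # chance the product succeeds
ANNUAL_PROFIT = 10   # profit per year if it succeeds
ANNUAL_LOSS = 20     # loss per year if it fails
HORIZON_YEARS = 10   # years the product would otherwise stay on the market


def project_npv(years_sold_if_failure):
    """Expected value of launching, given how long a failed product
    stays on the market before the company can stop selling it."""
    payoff_if_success = ANNUAL_PROFIT * HORIZON_YEARS
    payoff_if_failure = -ANNUAL_LOSS * years_sold_if_failure
    return P_SUCCESS * payoff_if_success + (1 - P_SUCCESS) * payoff_if_failure


# Example 1: no ability to stop, so a failure runs all ten years.
npv_without_option = project_npv(HORIZON_YEARS)

# Example 2: the company can abandon a failed product after one year.
npv_with_option = project_npv(1)

# The real option's value is the difference between the two projects.
option_value = npv_with_option - npv_without_option

print(npv_without_option)  # -50.0
print(npv_with_option)     # 40.0
print(option_value)        # 90.0
```

Changing `P_SUCCESS` or the number of years before abandonment shows how sensitive the option's value is to the underlying uncertainty and to how quickly the company can change course.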

Real options come in a variety of common forms. Companies can expand or contract their businesses, such as by opening new locations or closing existing facilities. They can accelerate or delay projects, such as by hiring more workers to build a factory or pausing construction. They can switch production processes, trade off between workers and automated processes, or shift production between in-house divisions and outside contractors. Taken together, real options encompass a wide range of actions spread across an expansive set of possible circumstances.

B.  Real Options as a Model for Data Valuation

As a framework for valuing data, real option analysis has many virtues. First, the value of data is that it enables a person to take new actions that were not available previously.84This feature is not unique to data. For example, the value of lumber comes from what you can build with it, or what someone will give you in exchange for it—which depends on what they can build with it or what they can sell it for, and so on. Real option analysis is how finance values the ability to take new courses of action. Thus, as a conceptual matter, real option analysis is a natural fit for valuing data. Further, real option analysis is a flexible and expansive tool that can be used to model an extraordinarily wide range of scenarios and circumstances. This makes it capable of handling the range of new possible outcomes that data, paired with modern statistical analysis, can produce.

Moreover, as noted previously, current approaches to data valuation offer little guidance. This increases the potential for confusion, inconsistency, and regulatory arbitrage. In some instances, they assign data no value at all.85Interestingly, this parallels the most common mistake that managers make with respect to real options. Welch, supra note 75, at 368. In some instances, holding data can have negative expected value, even accounting for the real options it creates. This could happen if the uses for the data generate little profit (for example, if legislation narrowly circumscribes their permitted uses), but the firm would suffer large costs if the data leaks, and the chance of a leak remains significant even after the firm takes precautions.     Applying real options analysis to data valuation would help ameliorate all of these problems. Real options analysis gives a clear theoretical framework, providing guidance and structure for those trying to determine data’s value. This would help align and unify the disparate valuation approaches that have been employed to date. Improved alignment would also reduce the opportunities for regulatory arbitrage that can result when different regulatory regimes adopt inconsistent valuation methodologies.86See Victor Fleischer, Regulatory Arbitrage, 89 Tex. L. Rev. 227, 230 (2010) (describing regulatory regime arbitrage); cf. Jordan Barry, Response, On Regulatory Arbitrage, 89 Tex. L. Rev. See Also 69, 73–78 (2010) (arguing that regulatory regime arbitrage is a subset of economic substance arbitrage, and that true regulatory arbitrage is only possible in that context when at least one of the regulatory regimes in question is using a regulatory rule that does not track the relevant underlying economic substance).

While real option valuation offers a number of benefits, it also entails a significant drawback: correctly valuing real options is quite difficult. To do so precisely, one must anticipate, and then think through, all of the possible future states of the world, their respective likelihoods of occurring, how one would respond to them all, and how much one would ultimately reap as a result. From there, one can work backwards from these endpoints to determine the right course of action at each decision point and the scenario’s expected value overall. This is a tall order—especially when valuing data, an asset whose value depends in part on future developments in statistical analysis.
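The process of working backwards from the endpoints can be sketched on a very small tree. The Python below is a minimal illustration under stated assumptions: the class and function names are hypothetical, the tree encodes Example 2 (payoffs in millions of dollars), and a real valuation would involve far more states, probabilities, and choices than this toy tree does.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Payoff:
    """An endpoint: total profit (in $ millions) if this state is reached."""
    amount: float


@dataclass
class Chance:
    """A point where uncertainty resolves: (probability, next node) pairs."""
    branches: List[Tuple[float, "Node"]]


@dataclass
class Decision:
    """A point where the firm chooses: (label, next node) pairs."""
    choices: List[Tuple[str, "Node"]]


Node = Union[Payoff, Chance, Decision]


def value(node: Node) -> float:
    """Work backwards from the endpoints to the expected value of a node."""
    if isinstance(node, Payoff):
        return node.amount
    if isinstance(node, Chance):
        return sum(p * value(child) for p, child in node.branches)
    # At a decision point, a rational firm takes the most valuable branch.
    return max(value(child) for _, child in node.choices)


def best_choice(node: Decision) -> str:
    """Label of the most valuable choice at a decision point."""
    return max(node.choices, key=lambda choice: value(choice[1]))[0]


# Example 2 as a tree: launch or not; if the product fails, continue or stop.
after_failure = Decision([("continue", Payoff(-200)), ("stop", Payoff(-20))])
launch = Chance([(0.5, Payoff(100)), (0.5, after_failure)])
root = Decision([("launch", launch), ("do not launch", Payoff(0))])

print(best_choice(root), value(root))  # launch 40.0
print(best_choice(after_failure))      # stop
```

Even in this toy case, the answer at the root depends on first resolving the later decision, which is the sense in which the analysis must proceed from the endpoints backwards.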

To put a somewhat finer point on it, consider financial options once more. Valuing financial options is a difficult mathematical problem. Fischer Black, Myron Scholes, and Robert Merton’s options pricing model was a watershed advance for the field, ultimately garnering a Nobel Prize in 1997.87Fischer Black & Myron Scholes, The Pricing of Options and Corporate Liabilities, 81 J. Pol. Econ. 637, 640–45 (1973); Robert C. Merton, Theory of Rational Option Pricing, 4 Bell J. Econ. & Mgmt. Sci. 141, 162–71 (1973); Press Release, The Nobel Prize, Royal Swedish Academy of Sciences, The Bank of Sweden Prize in Economic Sciences in Memory of Alfred Nobel 1997 (Oct. 14, 1997), https://www.nobelprize.org/prizes/economic-sciences/1997/press-release [https://perma.cc/AP7W-9Z4H]. Even with the solution in hand, the mathematics remain challenging. As important as options are to modern finance, many undergraduate finance courses do not cover the application of their formula, let alone its derivation.88See, e.g., A. Craig MacKinlay, The Wharton School, U. Pa., Finance 1000: Corporate Finance (2022), https://apps.wharton.upenn.edu/syllabi/202230/FNCE1000001 [https://perma.cc/YV4T-3U7H]; Albers Sch. Bus. & Econ., Seattle University, FINC 3400 Business Finance & FINC 3420 Intermediate Corporate Finance, https://www.seattleu.edu/business/undergraduate/courses–syllabi/finance [https://perma.cc/W5DD-9C6N] (last visited on Aug. 28, 2023).

Valuing real options is even harder than valuing financial ones. There are more possibilities to consider, more actions available, and more variables of interest.89See, e.g., Tom Copeland & Peter Tufano, A Real-World Way to Manage Real Options, Harv. Bus. Rev. (Mar. 2004), https://hbr.org/2004/03/a-real-world-way-to-manage-real-options [https://perma.cc/BJL8-TE64] (“As many executives point out, options embedded in management decisions are far more complex and ambiguous than financial options. Their concern is that it would be dangerous to try to reduce those complexities into standard option models, such as the Black-Scholes-Merton model, which have only five or six variables.”). It would be extremely difficult to write and apply a regulation with a precise formula that generalized across different types of data from diffuse contexts and industries. The complexity of real options also poses challenges for parties, for judges, and for juries.

This is a serious problem. A valuation method that has attractive theoretical properties, but that is impossible to apply in practice, would seem to be of extremely limited value.

C.  A Way Forward

Despite its complexities, we nonetheless believe that real options analysis holds great promise as a framework for valuing data. If one wants to value data accurately, one must have the right model. In our view, real options analysis captures what makes data useful, and thus offers the best framework to think about data’s value. If data’s value is complicated and depends on many factors, then this is not a fault of the model; the model can only help a user identify and focus on the things that matter, even if that’s a long list.90The complexity of real options may not be an entirely bad thing. For example, complexity in the valuation process may impede parties’ ability to strategically manipulate valuations. Put another way, to get the right answer, one must ask the right question. The right question may be a hard one—but answering a different, easier question means avoiding the problem, not solving it.

Moreover, it is worth stating what may be obvious: the real options approach need not be perfect to be an improvement over existing practices.91Harold Demsetz, Information and Efficiency: Another Viewpoint, 12 J.L. & Econ. 1, 1 (1969) (identifying the nirvana fallacy of a first-best comparative institutional analysis). Getting all interested parties asking the right question—or even the same question—would be valuable. It would reduce conceptual confusion, inconsistencies, and opportunities for regulatory arbitrage. Moreover, real options always have positive value.92This is also true of financial options.  Whenever taking an available course of action is profitable, one can do so; if that course of action is not profitable, one can simply decline to take that action.93This assumes that actors are rational. If that is not the case, then it may be beneficial to remove some of one’s choices, such as Odysseus tying himself to the mast to avoid being lured by the Sirens’ song. Homer, The Odyssey (Emily R. Wilson trans., W.W. Norton & Co. 1st ed. 2018). It can also be valuable to remove options from your choice set if that will change others’ behavior in a way that is favorable to you. See, e.g., Deepak Malhotra, Six Steps for Making Your Threat Credible, Harv. Bus. Sch.: Working Knowledge (May 30, 2005), https://hbswk.hbs.edu/item/six-steps-for-making-your-threat-credible [https://perma.cc/J58N-D7AS] (describing how, when playing chicken, the best strategy is to remove your steering wheel and throw it out the window; that way, your adversary knows that you cannot swerve even if you wish to, and must then act accordingly). See also supra note 85 and accompanying text.  Real options analysis would underscore the point that data has value and thus should not be ignored.94Cf. Welch, supra note 75, at 368. These combined benefits may be considerable.

Furthermore, if decisionmakers use real options analysis to value data, they may find ways to ameliorate the complexity problems over time. Trial and error can produce insights. As agencies and courts experiment with the framework, approximations may arise that are easier to calculate. Even if these approximations are not precisely accurate, they may be close enough to be useful. In particular, they may be significant improvements over existing data valuation methods.

That dynamic, of finding heuristics that are simpler but informative, has been borne out in other settings. For example, basic corporate finance theory teaches that profit-maximizing firms should use net present value analysis to allocate their resources.95See, e.g., id. at 61–66. Yet many firms, including large, sophisticated ones, analyze other metrics as well.96See John R. Graham, Presidential Address: Corporate Finance and Reality, 77 J. Fin. 1975, 2038 (2022) (surveying corporate managers on how they make capital allocation decisions and finding that, among large firms, 64% use the payback method and 39% use the profitability index); John R. Graham & Campbell R. Harvey, The Theory and Practice of Corporate Finance: Evidence from the Field, 60 J. Fin. Econ. 187, 199 (2001) (finding that 57% used the payback method, 30% used the discounted payback method, and 12% used the profitability index). These metrics include the profitability index, which measures how much profit a project generates per dollar invested, and the payback rule, which considers how long it takes for a project to repay its startup costs.97Welch, supra note 75, at 75–78. Both of these simple rules have well-known flaws that can cause them to produce absurd results.98Profitability index can produce the wrong decision rules because firms seek to maximize their total profits, not their profits per dollar invested. For example, consider two mutually exclusive projects: Project A costs $100 and produces $1000 in revenue. Project B costs $1 and produces $100 in revenue. Both projects are good, but if one must choose between them, Project A is clearly better; its $900 in profit dwarfs Project B’s $99 profit. Yet Project B has a much higher profitability index ($100 / $1 = 100) than Project A does ($1000 / $100 = 10). Id. at 75–76. The payback rule evaluates projects based on how long they take to return their initial costs. Discounted payback does the same, but discounts the project’s future cash flows to account for the fact that they do not come immediately. Both have the same problem; they ignore any cash flows that the project generates after it has paid back its initial costs. Consider Project C, which costs $100 today and returns $110 in a year, and Project D, which costs $100 today and returns $1000 in a year and a day. Project D is clearly a superior project, but the payback method will select Project C instead. Id. at 77. Why, then, do these rules remain common?

One possible answer is that these simple rules produce information about projects’ real option value. For example, recouping one’s initial investment means that those recovered dollars can be redeployed toward other purposes, increasing the range of decisions available to the firm.99This assumes that capital markets are imperfect, which is true of real-world markets. See id. at 511–39. Researchers have found that, under a variety of circumstances, such simple rules can allow firms to make nearly optimal decisions.100See Robert L. McDonald, Real Options and Rules of Thumb in Capital Budgeting, in Project Flexibility, Agency, and Competition 13 (M.J. Brennan & L. Trigeorgis eds., 2000); Achim Wambach, Payback Criterion, Hurdle Rates and the Gain of Waiting, 9 Int’l Rev. Fin. Analysis 247, 257 (2000); Glenn W. Boyle & Graeme A. Gutherie, Payback and the Value of Waiting to Invest 13–14 (Apr. 29, 1997) (unpublished manuscript), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=74 [https://perma.cc/8K39-B95L]. The relative accuracy of these rules, combined with their simplicity, may explain why firms use them more frequently than real options analysis.101See Graham, supra note 96, at 1985 (finding that only 38% of large firms frequently use real options in decision-making, which was less frequent than profitability index (39%) or payback rule (64%)); see also Graham & Harvey, supra note 96, at 188 (finding that payback rule was more commonly used than real options); H. Kent Baker, Shantanu Dutta & Samir Saadi, Management Views on Real Options in Capital Budgeting, 21 J. Applied Fin. 1, 8 (2011) (surveying Canadian firms and finding that only 10% often or always used real options analysis when deciding among projects, while 67% used the payback rule, 25% used the discounted payback rule, and 11% used the profitability index). These types of heuristics, and others, may prove useful to valuing data.
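The known flaws of the profitability index and the payback rule can be seen by working through the four hypothetical projects used to illustrate them in the footnote above (Projects A through D). The Python sketch below is illustrative only; the function names are hypothetical, and the profitability index is computed as revenue divided by cost, as in that footnote.

```python
def profit(cost, revenue):
    """Total profit: what a profit-maximizing firm ultimately cares about."""
    return revenue - cost


def profitability_index(cost, revenue):
    """Revenue generated per dollar invested."""
    return revenue / cost


def payback_time(cost, cash_flows):
    """Years until cumulative inflows cover the initial cost (None if never).

    cash_flows is a list of (time_in_years, amount) pairs.
    """
    recovered = 0.0
    for year, amount in sorted(cash_flows):
        recovered += amount
        if recovered >= cost:
            return year
    return None


# Projects A and B: (cost, revenue).
project_a = (100, 1000)
project_b = (1, 100)
print(profitability_index(*project_a), profit(*project_a))  # 10.0 900
print(profitability_index(*project_b), profit(*project_b))  # 100.0 99

# Projects C and D: cost of 100 today, one later inflow each.
payback_c = payback_time(100, [(1.0, 110)])            # $110 after one year
payback_d = payback_time(100, [(1 + 1 / 365, 1000)])   # $1000 after a year and a day
print(payback_c < payback_d)  # True: the payback rule prefers Project C
```

The profitability index ranks Project B above Project A even though A earns roughly nine times the profit, and the payback rule ranks Project C above Project D because it stops counting once the initial cost has been recovered.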

Alternatively, real option analysis can inform other modes of valuation. One response to complicated valuation problems is the method of comparables: To determine an item’s value, identify similar items whose values are known (that is, comparables), then make appropriate adjustments. This method is frequently employed to value items such as real estate, art, and active businesses.102Welch, supra note 75, at 431–36. Under the right circumstances, this method can produce accurate valuations.

The method of comparables can be tricky to apply to data for several reasons. First, it may be difficult to identify similar data sets with known values. Sale prices are often used as the measure of value for comparable items, and sale prices for data may not be public. But even when sale prices are available, data sets can differ from each other along a variety of dimensions. Which of those differences are important, and how much should value estimates be adjusted to account for these differences? For example, which is more valuable—a data set that is twice as large, or one that includes data drawn from twice as much time? Is data more valuable when the future is more uncertain or less? These are but a few of the dimensions one might wish to consider.

Real option theory offers insight into some of these questions. It identifies a number of factors that directly affect real option value, and thus the value of data. These factors can then be considered and adjusted for when using comparables to value data.

One factor that informs a data set’s value is its informational uniqueness. To what extent does that data tell its user something that they otherwise would not know? Having insights that no one else has can be extremely valuable. On the other hand, when competitors have access to comparably informative data, profitably exploiting the data gets harder, as competition among firms puts the firm’s counterparties in a comparatively stronger position.

Two other factors stem from the payoffs available from exploiting data. Unsurprisingly, the higher the potential future profits that the data can unlock, the more valuable the data is. What is less obvious is that the value of data increases as the future becomes less certain. This is somewhat abnormal; in finance, safer cash flows are usually considered more valuable than riskier ones.103Id. at 124, 197. Options are an important exception to this general rule, however. Because options allow one to change behavior in response to different circumstances, they actually become more valuable when a project has a wider range of possible future payouts.104Id. at 364.
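A small numerical illustration may help show why options gain value as outcomes become more dispersed. The Python sketch below uses entirely hypothetical figures: a project whose payoff is equally likely to land a fixed distance above or below an average of 10. A firm without an option must accept either outcome; a firm with an option can walk away from a bad outcome, which is modeled here as a payoff floor of zero.

```python
def committed_value(mean, spread):
    """Expected payoff when the firm must accept whichever outcome occurs."""
    return 0.5 * (mean + spread) + 0.5 * (mean - spread)


def flexible_value(mean, spread):
    """Expected payoff when the firm can abandon a losing outcome (floor of 0)."""
    return 0.5 * max(mean + spread, 0) + 0.5 * max(mean - spread, 0)


# Hypothetical average payoff of 10, with progressively wider outcomes.
for spread in (5, 20, 50):
    print(spread, committed_value(10, spread), flexible_value(10, spread))
# 5 10.0 10.0
# 20 10.0 15.0
# 50 10.0 30.0
```

The committed firm's expected payoff does not change as the spread widens, while the flexible firm's expected payoff rises, because the option lets it keep the larger upside while avoiding the larger downside.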

Another important factor in real option valuation is the length of time over which one can continue to change one’s behavior.105This is also an important factor in financial option valuation. See generally Merton, supra note 87. The longer that one can change direction, the more actions that one has available, and the more valuable the option. In the data context, this corresponds to the useful life of the data. As noted earlier, some data remains useful and informative for years or even decades; other data grows stale quickly.106Of course, distinguishing one from the other may be challenging in particular cases. The task gets easier when one at least knows to ask the question, however. All else equal, the former is more useful than the latter.107This factor relates to the first. If the data is informationally unique, or more unique, for a longer period of time, the firm possessing that data will have more attractive choices available to it for a longer period of time (that is, a longer-lived option).

Relatedly, interest rates affect the value of real options, and thus of data.108This is also true of financial options. See generally Merton, supra note 87. Profits earned in the future are more valuable when interest rates are low than when rates are high.109More precisely, firms should care about the discount rate they apply to future cash flows rather than about interest rates, but the two concepts are similar. In practice, the latter is easier to observe and may closely correlate with the former. Interest rates have more of an effect on data with a longer useful life, and less of an effect on shorter-lived data.
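The interaction between interest rates and useful life can be illustrated with a standard present value calculation. The Python sketch below is a simplified illustration under assumed figures: it treats data as producing one unit of profit at the end of each year of its useful life, and compares a hypothetical 2% rate with a hypothetical 10% rate. The function names are likewise hypothetical.

```python
def present_value(annual_cash_flow, years, rate):
    """Present value of a level cash flow received at the end of each year."""
    return sum(annual_cash_flow / (1 + rate) ** t for t in range(1, years + 1))


def percent_drop(years, low_rate=0.02, high_rate=0.10):
    """Percentage fall in present value when rates rise from low_rate to high_rate."""
    low = present_value(1.0, years, low_rate)
    high = present_value(1.0, years, high_rate)
    return 100 * (low - high) / low


short_lived = percent_drop(years=2)    # data that goes stale quickly
long_lived = percent_drop(years=10)    # data that stays useful for a decade

print(round(short_lived, 1), round(long_lived, 1))
print(long_lived > short_lived)  # True
```

Under these assumptions, the same increase in rates removes a larger share of value from the long-lived data than from the short-lived data, because more of the long-lived data's profits arrive in distant years that are discounted more heavily.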

How quickly and cheaply one can change one’s behavior also affects a real option’s value. The quicker one can act, the more nimble one is, the more ways in which one can profitably change one’s behavior. Similarly, options that can be exercised at little cost are more valuable than those which are expensive to utilize.110This is analogous to the strike price for a financial call option; all else equal, options with lower strike prices are more valuable.

These factors are more amenable to forming legal standards than a strict formula for valuing real options would be. Accordingly, they may provide a path forward for data valuation.

Finally, real options theory could inform attempts to value data in a different way. Experience may convince policymakers that valuing data is simply too hard, and that they should act accordingly. Such actions could take multiple forms.

One response to a difficult valuation problem is to simply exit the field as much as possible. Section 83 of the Internal Revenue Code provides a good example of this approach.11126 U.S.C. § 83 (2023). It addresses the questions of how much income a taxpayer has when they receive property in exchange for performing services, and when the taxpayer is taxed on that income. Section 83’s general rule is that employees are taxed on property based on its fair market value, and they are taxed at the time it becomes clear that they will get to keep the property.

For example, startup companies frequently include some form of equity interest in the company as part of their employees’ compensation packages.112See, e.g., Abraham J.B. Cable, Fool’s Gold? Equity Compensation & the Mature Startup, 11 Va. L. & Bus. Rev. 613, 613 (2017). These interests can come in various forms, including stock, restricted stock units, or stock options.113Id. If employees leave their employer before a certain date—if they quit to take a new job or are fired—then they forfeit some or all of their equity interests. The date after which an employee gets to keep an equity interest, even if the employee leaves the firm, is known as that interest’s vesting date. If an employee leaves the employer before the vesting date, they lose their unvested equity.

Under the general rule of Section 83, an employee is typically taxed on the value of their equity interests at the time those interests vest.11426 U.S.C. § 83 (2023). However, as noted previously, valuing stock options is difficult. Accordingly, Section 83 exempts stock options from its general rule, unless they have a visible market price (in which case they are easy to value).11526 U.S.C. § 83(e) (2023); Treas. Reg. § 1.83–7(b) (as amended in 2004). Stock options can also have a readily ascertainable fair market value even if they are not actively traded, but this is unusual; the relevant regulations recognize that the possibility of future price changes increases the value of an option and require (among other conditions) that this component of value be measurable with reasonable accuracy. Treas. Reg. § 1.83–7(b)(2), (3) (as amended in 2004). Instead, employees who receive stock options generally are not taxed until they exercise those options, at which point they receive stock in their employer, which is easier to value.116This assumes that the stock is vested. The general rule of Section 83 applies to the stock; if the employee may have to surrender the stock to the employer in the future if they do not continue their employment past a specified date, then the employee is not taxed on the value of the stock until the stock vests. This limits taxpayers’ ability to take aggressive valuations of hard-to-value stock options.117For example, absent these rules, an employee could assign a low value to a stock option, thereby recognizing little ordinary income at the time of the grant. They would then recognize greater gains on the eventual sale of their stock, but those gains would generally be long-term capital gains and would be subject to a significantly lower tax rate. Because options are hard to value, it could be difficult for the IRS to prove that the employee’s valuation was too low. Regulators can adopt similar tactics in the context of data valuation.

A potentially complementary approach would be to foster a market for data, with standardized features, in order to make private transaction prices more visible and data sets more easily comparable. In a number of instances, legislative and regulatory interventions have helped shift markets characterized by bespoke arrangements toward more commoditized features and greater transparency.118Financial derivatives provide a useful recent example. See Dodd-Frank Wall Street Reform and Consumer Protection Act, Pub. L. No. 111–203, §§ 701–774, 124 Stat. 1376 (2010). Such standardized markets can make the job of valuation much easier, and can also protect unsophisticated parties operating in those markets.119See Burton G. Malkiel, A Random Walk Down Wall Street: The Time-Tested Strategy for Successful Investing 26 (2015) (“Taken to its logical extreme, it means that a blindfolded monkey throwing darts at the stock listings could select a portfolio that would do just as well as one selected by experts.”).

CONCLUSION

While data has become increasingly valuable and important, the law’s attempts to value data have lagged, remaining confused and underdeveloped. Situating data valuation law within an economic framework built on real options analysis would resolve conceptual confusion among courts, agencies, and legislatures. It would also create greater predictability among private actors, which in turn would reduce the risk of regulatory uncertainty and facilitate investment. A clearer legal approach that cuts across different areas of law and jurisdictions would limit opportunities for regulatory arbitrage across fields of law addressing data valuation. Furthermore, a consistent approach reduces politicization of results, preventing favored groups from shifting unclear legal rules in their favor when there is no economic basis for such a shift. A consistent approach also makes decision-making less opaque, thereby increasing the legitimacy of outcomes.

While the real options approach is not without potential problems, we believe that it is the least bad alternative available. Moreover, increased use of real options analysis over time may generate heuristics that simplify data valuation by courts and agencies. These heuristics may prove so effective that private parties incorporate them into arm’s length transactions. Further research is needed to identify what heuristics work best in the data valuation context, as well as how to encourage more transparent and comparable pricing in burgeoning data markets worldwide.

96 S. Cal. L. Rev. 1545

* John B. Milliken Professor of Law and Taxation, USC Gould School of Law.

† Carolyn Craig Franklin Chair in Law, Professor of Law and Business, USC Gould School of Law and USC Marshall School of Business, and Senior Advisor, White & Case LLP.

AI-Generated Inventions: Implications for the Patent System

This symposium Article discusses issues raised for patent processes and policy created by inventions generated by artificial intelligence (“AI”). The Article begins by examining the normative desirability of allowing patents on AI-generated inventions. While it is unclear whether patent protection is needed to incentivize the creation of AI-generated inventions, a stronger case can be made that AI-generated inventions should be patent eligible to encourage the commercialization and technology transfer of AI-generated inventions. Next, the Article examines how the emergence of AI inventions will alter patentability standards, and whether a differentiated patent system that treats AI-generated inventions differently from human-generated inventions is normatively desirable. This Article concludes by considering the larger implications of allowing patents on AI-generated inventions, including changes to the patent examination process, a possible increase in the concentration of patent ownership and patent thickets, and potentially unlimited inventions.

INTRODUCTION

AI-generated inventions (inventions autonomously created by AI software) are around the corner.1See Hiroaki Kitano, Nobel Turing Challenge: Creating the Engine for Scientific Discovery, 7 Nature Partner Js.: Sys. Biology and Applications 1, 1–2 (2021). They have already surfaced in some applications, including genomics.2See Ross D. King, Kenneth E. Whelan, Ffion M. Jones, Philip G. K. Reiser, Christopher H. Bryant, Stephen H. Muggleton, Douglas B. Kell & Stephen G. Oliver, Functional Genomic Hypothesis Generation and Experimentation by a Robot Scientist, 427 Nature 247, 247–51 (2004). Genomics is the study of genes, including interactions of those genes with each other and the environment. William S. Klug, Michael R. Cummings, Charlotte A. Spencer, Michael A. Palladino & Darrell J. Killian, Concepts of Genetics 46 (12th ed. 2019). “Invention machines,” as we will generically call them, will, in all likelihood, become more prevalent in the future with more and better data, methods, and computers. They will also fundamentally alter the innovation process, with inventions becoming cheaper and faster to produce, at least in some technological fields or for some types of inventions.

If the innovation process changes, so, perhaps, should the support schemes put in place to encourage it. Scholars have traditionally seen innovation activities as needing policy support with tools such as the patent system, grants, research and development (“R&D”) tax subsidies, and prizes, among others.3See Jakob Edler & Jan Fagerberg, Innovation Policy: What, Why, and How, 33 Oxford Rev. Econ. Pol’y 2, 2–6 (2017); Johan Schot & W. Edward Steinmueller, Three Frames for Innovation Policy: R&D, Systems of Innovation and Transformative Change, 47 Rsch. Pol’y 1554, 1554–55 (2018); Nicholas Bloom, John Van Reenen & Heidi Williams, A Toolkit of Policies to Promote Innovation, 33 J. Econ. Persps. 163, 163–65 (2019). It is not clear that the current policy toolbox is well adapted to this changing landscape.

One concrete question that has received a great deal of scholarly attention is whether AI-generated inventions can be protected by patents under existing intellectual property (“IP”) laws.4E.g., Research Handbook on the Law of Artificial Intelligence 411–537 (Woodrow Barfield & Ugo Pagallo eds., 2018) [hereinafter “Research Handbook on Law of AI”]; Ryan Abbott, The Reasonable Robot: Artificial Intelligence and the Law (2020); Marta Duque Lizarralde & Claudia Tapia, Artificial Intelligence: IP Challenges and Proposed Way Forward, 2022 Pat. Law. 16, 16–21 (2022). See, e.g., Dan L. Burk, AI Patents and the Self-Assembling Machine, 105 Minn. L. Rev. Headnotes 301, 301–03 (2021); W. Michael Schuster, Artificial Intelligence and Patent Ownership, 75 Wash. & Lee L. Rev. 1945, 1946–52 (2018); Liza Vertinsky, Thinking Machines and Patent Law, in Research Handbook on Law of AI, supra, at 489; Shlomit Yanisky Ravid & Xiaoqiong (Jackie) Liu, When Artificial Intelligence Systems Produce Inventions: An Alternative Model for Patent Law at the 3A Era, 39 Cardozo L. Rev. 2215, 2217 (2018); Ryan Abbott, I Think, Therefore I Invent: Creative Computers and the Future of Patent Law, 57 B.C. L. Rev. 1079, 1079–83 (2016); Liza Vertinsky & Todd M. Rice, Thinking About Thinking Machines: Implications of Machine Inventors for Patent Law, 8 B.U. J. Sci. & Tech. L. 574, 581 (2002); John Villasenor, Reconceptualizing Conception: Making Room for Artificial Intelligence Inventions, 39 Santa Clara High Tech. L.J. 197, 199–203 (2022); Kemal Bengi & Christopher Heath, Patents and Artificial Intelligence Inventions, in Intellectual Property Law and the Fourth Industrial Revolution 127, 127–30 (Christopher Heath, Anselm Kamperman Sanders & Anke Moerland eds., 2020). There is also a growing literature addressing whether AI-generated work can be protected by copyright. See, e.g., Daniel J. Gervais, The Machine as Author, 105 Iowa L. Rev. 2053, 2053–55 (2020); Matthew Sag, The New Legal Landscape for Text Mining and Machine Learning, 66 J. Copyright Soc’y 291, 291–92 (2019). The issue also received coverage from mainstream media when Professor Ryan Abbott’s team from the University of Surrey filed patent applications, as part of the Artificial Inventor Project, designating an AI system as the inventor at several patent offices worldwide.5See, e.g., AJ Willingham, Artificial Intelligence Can’t Technically Invent Things, Says Patent Office, CNN (Apr. 30, 2020, 4:39 AM), https://edition.cnn.com/2020/04/30/us/artificial-intelligence-inventing-patent-office-trnd/index.html [https://perma.cc/625V-FUZK]; Leo Kelion, AI System ‘Should Be Recognized as Inventor’, BBC (Aug. 1, 2019), https://www.bbc.com/news/technology-49191645 [https://perma.cc/ETP2-NXKN]; Angela Chen, Can an AI be an Inventor? Not Yet., Mass. Inst. Tech. Tech. Rev. (Jan. 8, 2020), https://www.technologyreview.com/2020/01/08/102298/ai-inventor-patent-dabus-intellectual-property-uk-european-patent-office-law [https://perma.cc/7UKU-8DDE]. The applications were (so far) rejected by some patent offices (including in the United States, the European Patent Office, and the United Kingdom), but accepted by others (including in South Africa and Australia).6In Australia, the initial decision to accept the AI-inventor patent has been overturned by a five-judge panel. This decision can still be appealed to the highest court. Commissioner of Patents v Thaler [2022] FCAFC 62 (13 Apr. 2022) (Austl.), rev’d, Thaler v. Commissioner of Patents [2021] FCA 879 (30 July 2021) (Austl.) (holding inventor for a patent application must be a natural person). The issues posed in that case were whether an AI-generated invention can be patented and whether an AI system can be named as an inventor in a patent application.
The patentability of AI-generated inventions is also high on the policy agenda, with the main patent offices actively discussing the issue.7See, e.g., Artificial Intelligence, European Pat. Off., https://www.epo.org/news-events/in-focus/ict/artificial-intelligence.html [https://perma.cc/BGR2-3KXC] (May 2, 2022); Artificial Intelligence, U.S. Pat. & Trademark Off., https://www.uspto.gov/initiatives/artificial-intelligence [https://perma.cc/S36W-WQ37] (last visited Aug. 31, 2023); Artificial Intelligence and Intellectual Property, World Intell. Prop. Org., https://www.wipo.int/about-ip/en/frontier_technologies/ai_and_ip.html [https://perma.cc/9LZR-AQSX] (last visited Aug. 31, 2023); Artificial Intelligence and IP: Copyright and Patents, U.K. Intell. Prop. Off., https://www.gov.uk/government/consultations/artificial-intelligence-and-ip-copyright-and-patents [https://perma.cc/5K9N-KPVN] (June 28, 2022).

However, the question of the patentability of AI-generated inventions under current patent laws is too narrow a framing of the issue. The important question is whether and how the emergence of this new invention technology changes our judgment as to how the patent system can best operate to achieve its objectives. The fundamental aspects of patent laws have barely changed since the 1474 Venetian Patent Statute. Having resisted two industrial revolutions, it is not immediately apparent that the patent system must adapt to the digital revolution. However, whereas the previous industrial revolutions essentially concerned invention-driven changes in the organization of production, AI affects the invention process itself and, consequently, the incentives for innovation that are the focus of the patent system.

This Article takes a normative approach to how the patent system should handle AI-generated inventions. It also discusses the implications of invention machines for the patent system. It draws on arguments from economic theory and evidence from empirical analyses of analogous situations. The focus is on technical inventions that would clearly and unambiguously meet the novelty, non-obviousness, and utility criteria if invented by a human. We are concerned with inventions that AI has fully and autonomously invented; we are not considering the use of AI as a mere tool in the invention process. However, we note that many of the points we raise apply to this broader issue as well. The fact that AI speeds up and lowers the cost of inventing does change the innovation incentives—and, perhaps, the way we should conceive the patent system.

This Article proceeds in four parts. Part I considers whether patent protection for AI-generated inventions is normatively desirable. Part II examines how invention machines could affect the patentability standards, especially the non-obviousness requirement. Part III argues against a differentiated patent system for AI-generated inventions versus human-made inventions. Part IV discusses some systemic consequences of invention machines for patent systems and proposes potential solutions. The last Part offers concluding remarks.

I.  SHOULD AI-GENERATED INVENTIONS BE PATENTABLE?

Artificial Intelligence is notoriously difficult to define but is commonly associated with the ability of a computer to learn. We utilize the term AI to refer to computer systems that can perform tasks that normally require human intelligence. AI is used in hundreds of ways all around us. Apple uses AI technology in its voice recognition software, Tesla in its self-driving technology, and Spotify and Amazon use AI to learn customer preferences. AI is used to identify the shape of proteins, which could lead to breakthroughs in drug discovery and development. AI chatbots like ChatGPT are poised to change the way students learn and study.8ChatGPT and other natural language processing algorithms raise normative issues for copyright policy that are analogous to those considered here for patent policy. We do not consider AI-driven copyright policy issues herein because the incentive issues are different in the copyright and patent contexts.

AI, however, can also invent. Perhaps the most infamous AI-generated inventions include those associated with DABUS. DABUS is an AI system developed by Stephen Thaler. According to Thaler, DABUS created inventions that Thaler did not conceive.9See Jared Council, Can an AI System Be Given a Patent?, Wall St. J. (Oct. 11, 2019, 9:45 AM), https://www.wsj.com/articles/can-an-ai-system-be-given-a-patent-11570801500 [https://perma.cc/F3BX-2WKS] (stating with respect to two inventions that, according to a group associated with Thaler, he “didn’t conceive of those two products and didn’t direct the machine to invent them”). However, DABUS is far from the only AI system that has created inventions without human intervention, which rise to the level of inventor under current patent law.10See Michael McLaughlin, Computer-Generated Inventions, 101 J. Pat. & Trademark Off. Soc’y 224, 238–39 (2019). For other examples of AI-generated inventions, see Ben Hattenbach & Joshua Glucoft, Patents in an Era of Infinite Monkeys and Artificial Intelligence, 19 Stan. Tech. L. Rev. 32, 32 (2015). Among other examples, AI-generated inventions currently include an AI-designed airplane cabin and an AI-designed race car chassis.11See McLaughlin, supra note 10.

In this Part, we address the fundamental economic question of whether society would be better off granting patent protection for AI-generated inventions instead of keeping them unprotected in the public domain. We do so by examining three canonical reasons for granting patent protection: the incentive to innovate, the incentive to commercialize inventions, and the ability of patents to encourage technology transfer. Throughout our analysis, we assume that the invention machine autonomously creates patentable inventions at zero cost.

A.  Do We Need Patents to Encourage AI-Generated Inventions?

The primary justification for the patent system is to provide incentives to innovate.12See Kenneth J. Arrow, Economic Welfare and the Allocation of Resources for Invention, in The Rate and Direction of Inventive Activity: Economic and Social Factors 609, 609 (Nat’l Bureau of Econ. Rsch. ed., 1962). Patents enable inventors to recoup their research and development expenses by granting inventors the time-limited ability to exclude others from making, selling, or importing their inventions. By doing so, patents provide dynamic incentives for investments in new technologies.

Despite its primacy in theoretical discussions of the patent system, it is not immediately apparent that patents are needed to incentivize the act of inventing. Curiosity is a fundamental human trait, and exploration for its own sake is a widespread human activity. Inventions would undoubtedly occur in the absence of patents. It is possible that the incentive created by patents increases the rate of invention over its natural rate. This proposition is difficult to test because we do not have good “natural experiments” comparing societies with and without patent systems.13See Eric Budish, Benjamin N. Roin & Heidi Williams, Patents and Research Investments: Assessing the Empirical Evidence, 106 Am. Econ. Rev. 183, 183 (2016).

The issue of incentives to bring inventions to market plays out similarly for AI as for human-made inventions. With AI, the act of creating inventions moves away from a costly, time-consuming trial-and-error process towards an automated data-crunching task. This approach drastically reduces the cost and time of inventions, such that it costs nothing for the AI machine to produce an invention—bar the computing costs.14Whether there is some critical human input in the creation of inventions is an important consideration in the legal literature to establish that inventions are allowed patent protection. The distinction between AI-generated versus AI-aided inventions (autonomy versus automation) does not matter so much in the present discussion, where the cost and speed of creation carry more weight. If inventions are cheap and fast to come up with, one could argue that there is a priori no need to incentivize inventive activities. Producing inventions is cheap, and machines do not need to be incentivized.

However, producing the invention machines is presumably costly. Thus, the relevant question is whether these machines would be developed in a world in which their output cannot be patented. In other words, would a patent on the invention machine itself provide enough of an incentive to create the machine, or would the machine’s outputs also need to be patent eligible?15See Deepak Somaya & Lav R. Varshney, Embodiment, Anthropomorphism, and Intellectual Property Rights for AI Creations, 2018 Proc. AAAI/ACM Conf. on AI, Ethics & Soc’y 278, 278–83 (2018). This question is difficult to answer, as the answer depends upon a number of factors, including the costs to produce an invention machine and the ability to monetize any invention the machine creates without patent protection. At the most, if innovators cannot secure the property of their AI inventions, there is limited financial incentive to produce invention machines in the first place. On the other hand, allowing every invention produced by an invention machine to be patentable seems like a windfall to the inventor. At some point, the reward will substantially outweigh the original incentive to innovate. As a result, it is unclear whether AI-generated inventions should be patentable based on the incentive to innovate alone.

B.  Do We Need Patents to Encourage the Commercialization of AI-Generated Inventions?

Although it is uncertain whether we need patents on AI-generated inventions to maintain invention incentives, patents also play a critical role in invention commercialization. To be clear, we differentiate between “invention costs,” which are assumed close to zero with the invention machine, and “commercialization costs,” which are necessary to bring the invention to market—covering activities such as development, optimization of design, market research, scale-up of production, distribution, and the like.16Ted Sichelman, Commercializing Patents, 62 Stan. L. Rev. 341, 348–55 (2010). In the particular but important case of pharmaceuticals and medical devices, human safety and efficacy testing also form part of commercialization costs.

Recent history provides part of the answer to that question. Let us go back in time, to 1980, and call the invention machine a “public research organization” (“PRO”). The U.S. government used to retain title to inventions and license them only non-exclusively. As we now know, this situation led to many valuable inventions being left unused. According to a governmental report, at the time, “fewer than 5 percent of the 28,000 patents being held by federal agencies had been licensed,” compared with 25–30 percent of the federal patents for which the government allowed companies to retain title to the invention.17U.S. Gen. Acct. Off., GAO/RCED-98-126, Technology Transfer: Administration of the Bayh–Dole Act by Research Universities (1998).  Thus, many valuable inventions fell into oblivion.

The context changed with the Government Patent Policy Act of 1980, also known as the Bayh-Dole Act, which allowed PROs and universities to patent and exclusively license federally-funded inventions. Research on the effects of the Bayh-Dole Act shows that university patenting and licensing revenues increased after 1980, suggesting greater use of inventions.18See David C. Mowery, Richard R. Nelson, Bhaven N. Sampat & Arvids A. Ziedonis, The Growth of Patenting and Licensing by U.S. Universities: An Assessment of the Effects of the Bayh–Dole Act of 1980, 30 Rsch. Pol’y 99, 99 (2001); Jerry G. Thursby & Marie C. Thursby, University Licensing and the Bayh–Dole Act, 301 Sci. Mag. 1052, 1052 (2003); Scott Shane, Encouraging University Entrepreneurship? The Effect of the Bayh-Dole Act on University Patenting in the United States, 19 J. Bus. Venturing 127, 127 (2004). Several countries in Europe adopted similar legislation, including Germany and Italy.19Dirk Czarnitzki, Wolfgang Glänzel & Katrin Hussinger, Heterogeneity of Patenting Activity and its Implication for Scientific Research, 38 Rsch. Pol’y 26, 28 (2009).

This situation is known in economics as the free-good problem.20Wendy Gordon, Fair Use as Market Failure: A Structural and Economic Analysis of the Betamax Case and its Predecessors, 82 Colum. L. Rev. 1600, 1611 (1982). A free good has zero opportunity cost, and the textbook example is air, which everyone can freely consume. By its very nature, nobody can possibly sell a free good. The picture changes when one introduces scarcity. Consider Swissbreeze, a startup that sells “the best, most pristine and freshest Swiss canned air, gathered in the most beautiful and remote lake and mountain regions.”21Martha Cliff, Would You Pay £19 for a Bottle of Fresh Air? Swiss Company Sells Containers of Oxygen Collected in the Mountains to ‘Clear Your Mind,’ DailyMail (Jan. 21, 2018, 11:54 AM), https://www.dailymail.co.uk/femail/article-5294701/Would-pay-19-bottle-fresh-AIR.html [https://perma.cc/9Q59-8JRB]. Swissbreeze’s business model only works because not everyone has access to fresh air, let alone from the Swiss mountains. It is easy to imagine that wealthy consumers in Delhi, India, or Anyang, China—two of the world’s most polluted cities—may want to pay a high price for a shot of fresh air. Fresh air in these cities is scarce, and breathing it has a high opportunity cost.

Only scarcity makes the business model of bottling and selling fresh air viable. By the same reasoning, only scarcity makes viable the business model of bringing an invention to the market. Put differently, the inability to secure exclusive rights to an invention limits firms’ appetite for that invention. This fate was that of many PRO and university inventions before the Bayh-Dole Act. The need for investment to bring the product to market means that at least some level of scarcity (achieved with patent protection) is warranted.22See Benjamin N. Roin, Unpatentable Drugs and the Standards of Patentability, 87 Tex. L. Rev. 503, 509–10 (2009). The present reasoning does not apply to inventions with zero commercialization costs, that is, inventions that can be directly implemented in products without further investment. In the absence of patent protection, firms in a competitive market would immediately adopt the invention, and consumers would absorb all the surplus. However, most inventions require some amount of investment to get them from concept to market. Yet, the corner case of inventions with zero commercialization cost is an interesting one because it suggests another argument in favor of patent protection: ensuring disclosure. There has been ongoing debate regarding the extent to which patents actually disclose helpful information. See, e.g., Lisa Larrimore Ouellette, Do Patents Disclose Useful Information?, 25 Harv. J.L. & Tech. 545, 546–50 (2012) (summarizing the existing debate and arguing that benefits of disclosure are stronger than generally thought). We have assumed thus far that inventions are disclosed publicly. It is clearly the case for university and PRO inventions, but it will not necessarily be the case for AI-generated inventions. In the absence of protection, many ready-to-market inventions—but also inventions with non-zero commercialization costs—would be kept secret, severely limiting the diffusion of these inventions. See, e.g., Daniel P. Gross, The Consequences of Invention Secrecy: Evidence from the USPTO Patent Secrecy Program in World War II 2–3 (Harv. Bus. Sch., Working Paper, No. 19-090, 2019); Gaétan de Rassenfosse, Gabriele Pellegrino & Emilio Raiteri, Do Patents Enable Disclosure? Evidence from the Invention Secrecy Act (Mar. 26, 2020) (unpublished manuscript), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3561896 [https://perma.cc/7YKN-MR7F]; Jeffrey L. Furman, Markus Nagler & Martin Watzinger, Disclosure and Subsequent Innovation: Evidence from the Patent Depository Library Program, 13 American Econ. J.: Econ. Pol’y 239, 241–42 (2021). Thus, patent protection may be necessary to ensure commercial opportunities for the output of invention machines and, consequently, for creating invention machines themselves.23Such a situation will have admittedly a lower impact for “integrated” innovators, who are both invention creators and implementors. They may obtain high enough returns from commercializing their own AI-generated inventions.
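The scarcity argument can be illustrated with a stylized sketch. It assumes, purely for illustration, that an invention costs nothing to create, that a firm with exclusive rights earns a fixed margin on the resulting product, and that without exclusive rights free entry by imitators competes that margin down to zero. The function names and the numbers are hypothetical and are not drawn from the sources cited above.

```python
def commercialization_profit(commercialization_cost, exclusive_margin, exclusive):
    """Net profit from bringing a zero-cost invention to market.

    Illustrative assumption: without exclusive rights, imitators enter
    freely and competition drives the product margin to zero.
    """
    margin = exclusive_margin if exclusive else 0.0
    return margin - commercialization_cost


def will_commercialize(commercialization_cost, exclusive_margin, exclusive):
    """A firm invests only if it expects at least to break even."""
    profit = commercialization_profit(
        commercialization_cost, exclusive_margin, exclusive
    )
    return profit >= 0


# Positive commercialization cost: the firm invests only with exclusivity.
print(will_commercialize(10.0, 25.0, exclusive=True))
print(will_commercialize(10.0, 25.0, exclusive=False))

# Corner case discussed in note 22: zero commercialization cost, so the
# invention is adopted even without protection.
print(will_commercialize(0.0, 25.0, exclusive=False))
```

Under these assumptions, the sketch reproduces the two cases in the text: an invention with positive commercialization costs reaches the market only when some scarcity is secured, while a ready-to-market invention is adopted regardless.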

By analogy with tangible goods, one might argue that patenting the machine and its output is inappropriate. One does not get a patent for a screw machine and additional protection for the screw it produces. This analogy is misleading as the economic appropriation of tangible goods is inherently different from that of intangible goods. The “public good” nature of knowledge calls for additional protection mechanisms.

C.  Do We Need Patents to Encourage Technology Transfer?

The third rationale for granting patents is to enhance technology transfer. If an invention is not patented, inventors may keep the invention secret.24One might object that secrecy creates scarcity, solving the free good problem. Indeed, nothing would prevent the owner of an invention machine from approaching would-be licensees or buyers to transfer the secret inventions. However, secrecy is not always an adequate protection mechanism. See Edwin Mansfield, Patents and Innovation: An Empirical Study, 32 Mgmt. Sci. 173, 176 (1986); Wesley M. Cohen, Richard R. Nelson & John P. Walsh, Protecting Their Intellectual Assets: Appropriability Conditions and Why U.S. Manufacturing Firms Patent (or Not) 6 (Nat’l Bureau of Econ. Rsch., Working Paper No. 7552, 2000). It offers no protection for inventions that can be easily reverse-engineered, with drugs being a notable example. Secrecy hampers transactions in markets for technology, as it hurts the search for a licensing partner. Secrecy reduces the search to a one-sided process, in which only the owner has the ability to reach out to interested parties.25More generally, the option of keeping an invention secret is available by default for all inventions, patentable or not. Although secrecy is sometimes used in lieu of patent protection, we do not generally judge that the option of secrecy (or other possible appropriation methods) means that patents are not a valuable policy tool. We see no reason why AI inventions are different in this regard. Furthermore, even if the owner of the invention identifies an interested party, contracting over the information is notoriously difficult. Once the owner discloses the information, the interested party may be able to take it without paying.

Patents help increase technology transfer in two ways. First, a patent helps enable a two-sided search process where licensees and licensors search for each other. Hegde and Luo provide evidence that the publication of U.S. patent applications 18 months after their filing date rather than at the time of the patent grant has sped up licensing transactions.26Deepak Hegde & Hong Luo, Patent Publication and the Market for Ideas, 64 Mgmt. Sci. 652, 652 (2017).  They attribute this effect to the patent system being a “credible, standardized, and centralized repository [that] mitigates information costs for buyers and sellers.”27Id. Second, patents may help solve the information disclosure paradox. Patent rights are legal title that protects buyers against the expropriation of the traded idea, including when searching for a licensing partner, which also facilitates technology transactions.28See Joshua S. Gans, David H. Hsu & Scott Stern, The Impact of Uncertain Intellectual Property Rights on the Market for Ideas: Evidence from Patent Grant Delays, 54 Mgmt. Sci. 982, 988 (2008); Gaétan de Rassenfosse, Alfons Palangkaraya & Elizabeth Webster, Why Do Patents Facilitate Trade in Technology? Testing the Disclosure and Appropriation Effects, 45 Rsch. Pol’y 1326, 1326 (2016). But see Michael J. Burstein, Exchanging Information Without Intellectual Property, 91 Tex. L. Rev. 227, 235–46 (2012) (arguing that there is a range of ways in which to exchange information without patent protection). See generally Benjamin Mitra-Kahn, Economic Reasons to Recognise AI Inventors, in Research Handbook on Intellectual Property and Artificial Intelligence 376, 378 (Ryan Abbott ed., 2022) (arguing that recognizing AI inventors will facilitate technology transfer).

Implicit in this argument is that a transfer must occur between invention producers and implementers. Such transfers are necessary in the case of PROs and universities, which produce non-market-ready inventions and do not commercialize products. However, owners of invention machines may very well implement the inventions themselves. In practice, many patented inventions are traded and licensed on markets for technology.29Ashish Arora, Andrea Fosfuri & Alfonso Gambardella, Markets for Technology: The Economics of Innovation and Corporate Strategy 15–45 (2001). Using patent reassignment data, Serrano found that about 12–16 percent of U.S. patents are traded over their lifecycle,30Carlos J. Serrano, The Dynamics of the Transfer and Renewal of Patents, 41 RAND J. Econ. 686, 693 (2010). while Ciaramella et al. found 12 percent of European patents in medical technologies are traded.31Laurie Ciaramella, Catalina Martínez & Yann Ménière, Tracking Patent Transfers in Different European Countries: Methods and a First Application to Medical Technologies, 112 Scientometrics 817, 817–20 (2017). Furthermore, we may speculate that invention machines will exacerbate the division of innovative labor. Creating invention machines is costly, but producing inventions is cheap and fast, which may lead to more specialization (in other words, inventors versus implementers). In addition, the skills and capabilities required for creating invention machines differ drastically from those required to commercialize the inventions. If invention machines lead to a greater division of labor (where producers of inventions do not implement them), the issue of technology transfer will become particularly salient.

*******

In summary, under the traditional theory of incentives to innovate, it is uncertain whether AI-generated inventions should be patent eligible. AI makes inventing cheap, and AI machines do not need to be incentivized to invent. However, producing the AI invention machine is presumably costly. It is unclear whether these machines would be developed if their outputs cannot be patented. A stronger case for patenting AI-generated inventions is made under commercialization and technology transfer rationales for patents. Without protection, the output of the invention machine becomes more challenging to transfer and commercialize, which reduces the incentives to invent and develop such machines in the first place. Moreover, AI-generated inventions may result in the further stratification of labor markets, where producers of inventions do not commercialize them. This division of labor would make it more critical for AI-generated inventions to be patentable, as patents facilitate both technology transfer and commercialization. Of course, patents also impose costs on society, such as limiting competition and access to the invention. Thus, the benefits of allowing patents on AI-generated inventions should outweigh the costs. While we believe these arguments taken together make the uneasy case for allowing AI-generated inventions to be patented, we acknowledge that it is difficult to say so definitively. Before considering whether the patent system should treat AI-generated inventions differently, we discuss a potentially significant implication of AI systems for the patentability standards.

II.  THE EXISTENCE OF AI-GENERATED INVENTIONS AND IMPLICATIONS FOR THE NON-OBVIOUSNESS STANDARD

The emergence of inventions generated by AI systems also has implications for how we interpret patent validity. At any given time, there is an unknown but presumably large set of inventions that are makeable in the sense that humanity’s underlying knowledge and technology base has advanced to the point where they are a feasible step beyond what has come before—an argument known as the “inevitability of inventions” at least since Ogburn and Thomas,32William F. Ogburn & Dorothy Thomas, Are Inventions Inevitable? A Note on Social Evolution, 37 Pol. Sci. Q. 83, 88 (1922). and Ihde.33Aaron J. Ihde, The Inevitability of Scientific Discovery, 67 Sci. Monthly 427, 427 (1948). Historically, the flow of patent applications from this unknown feasible pool has been determined by some combination of the contemporary socio-economic context, the breadth of human ingenuity, and the resources devoted to finding them. The addition of AI systems to the technology for fishing in this pool of potential inventions will likely significantly relax the latter two constraints. Human ingenuity will quite literally no longer be necessary, and the cost of exploration may be so dramatically reduced that resources available for inventing will be much less binding (perhaps almost irrelevant) as a constraint.

To begin, countries are not uniform in allowing a machine to be an inventor of a patent. Appeals courts in both the United States34This conclusion seems to follow a straightforward interpretation of the Patent Act. The Patent Act defines an inventor as an “individual or, if a joint invention, the individuals collectively who invented or discovered the subject matter of the invention.” 35 U.S.C. § 100(f). The Federal Circuit interpreted the term “individual” to be a natural person and that the term inventor, as used in patent statutes, does not include machines. Thaler v. Vidal, 43 F.4th 1207, 1211 (Fed. Cir. 2022). The Federal Circuit did so in part by noting that the Patent Act refers to an individual inventor with the gendered pronouns “himself” and “herself,” which would exclude a machine from comprising an individual. Id. at 1209. and England35Thaler v. Comptroller Gen. of Pats. Trade Marks and Designs, [2021] EWCA (Civ) 1374 (U.K.). have held that machines cannot be inventors of patents. In contrast, Australia and South Africa allow machines to be inventors of patents.36In Australia, the initial decision to accept the AI-inventor patent has been overturned by a five-judge panel. This decision can still be appealed to the highest court. Commissioner of Patents v Thaler [2022] FCAFC 62, rev’g Thaler v. Commissioner of Patents [2021] FCA 879 (holding inventor for a patent application must be a natural person). Thus, we acknowledge that the patent acts of some countries, such as the United States, may need to be amended in order for machines to be inventors of patents. Assuming such reform efforts will occur, the rest of this Part examines how AI-generated inventions may affect the non-obviousness standard of patentability.

An invention is deemed obvious (and, therefore, not patentable) if the differences between what is claimed and what has been done before are such that it is obvious to a person having ordinary skill in the art (“PHOSITA”) how to adapt existing technology to make the proposed invention.37Graham v. John Deere Co., 383 U.S. 1, 17 (1966). The level of skill associated with the PHOSITA is critical in the non-obviousness inquiry. The PHOSITA is defined as an average person in a given field with “ordinary creativity, not an automaton,”38KSR Int’l Co. v. Teleflex Inc., 550 U.S. 398, 421 (2007). The MPEP provides guidance on the level of ordinary skill in the art. See U.S. Pat. & Trademark Off., Manual of Patent Examining Procedure § 2141.03 (9th ed. 2023); see also John F. Duffy & Robert P. Merges, The Story of Graham v. John Deere Company: Patent Law’s Evolving Standard of Creativity, in Intellectual Property Stories 109, 110 (Jane C. Ginsburg & Rochelle Cooper Dreyfuss eds., 2006) (noting that determining the appropriate level of ordinary skill for the nonobviousness standard “is one of the most important policy issues in all of patent law”). who has access to the same tools, skills, and knowledge base. The more skilled the PHOSITA, the more likely a new invention is obvious. Another key determinant of the obviousness inquiry is establishing what constitutes prior art, that is, which references (such as scientific articles) may be used to determine whether an invention is obvious. The more prior art that can be considered, the more likely an invention is obvious. The emergence of AI systems for invention will likely have at least two ramifications for the obviousness inquiry.

First, we must confront the question of whether the PHOSITA includes AI systems. Said differently, if a proposed invention could have been adapted from existing technologies by a normally-skilled AI system, does that make the invention obvious and, hence, invalid? Currently, because most fields do not use AI, inventors do not have to disclose the use of AI to the Patent Office. Consider a scientist who decides to use neural networks to help come up with a new microchip design. The AI might help her calculate the ways that different materials can impact the microchip’s operations. The new microchip may represent an improvement in the technology, but if an ordinary microchip inventor could have arrived at the same invention, then the new microchip would not qualify for a patent. However, suppose the AI assists in developing a novel microchip design that is beyond the skill of the ordinary microchip inventor to design. In that case, the invention may qualify for a patent. As more companies and inventors use AI to create new inventions, the legal standard will have to adapt. At some point, patent examiners will have to start assuming that a PHOSITA, a legal fiction presumed to know the relevant prior art, has access to AI, which will raise the bar for non-obviousness in the patent process.

Second, AI machines may alter the analogous art doctrine, which limits the prior art considered in an obviousness inquiry to only prior art in the same field of the invention39KSR Int’l Co., 550 U.S. at 417. or reasonably pertinent to the problem faced by the inventor.40In re Bigio, 381 F.3d 1320, 1325 (Fed. Cir. 2004). Because an obviousness inquiry often involves combining multiple prior art references to render the invention unpatentable, the analogous art doctrine was adopted by courts to reflect the practical conditions facing an inventor. An inventor likely would focus on this type of prior art when inventing. Adopting a “normally skilled” AI system as the PHOSITA could lead to a reconsideration of the analogous art doctrine. A normally skilled AI system may easily search the entire world of prior art (including patents and printed publications, but also technical blogs, standard documents, and other resources), and thus removing the analogous art limitation on the obviousness inquiry may reflect the practical realities of shifting the skilled artisan to a skilled AI system. Such removal would also result in raising the bar to patentability.

There are, however, some difficulties associated with trying to define a “normally skilled” AI system. Making the determination as to what represents an inventive enough leap for a person of ordinary skill is challenging enough; doing so for an AI machine may be even more challenging. To begin, it seems difficult to distinguish the AI system that did find the invention from the fictional one that could have. This problem does not arise with human inventors because we accept as a matter of course that each human is unique, and a given invention can come from one human’s spark of genius without suggesting that any skilled human could have done it. Making this distinction for AI systems seems much harder.

A number of commentators argue that a PHOSITA AI system will place the bar for non-obviousness implausibly high, as a PHOSITA using AI can potentially create every invention—rendering “everything obvious.”41Ryan Abbott, Everything is Obvious, 66 UCLA L. Rev. 2, 4–10 (2019); see also Tabrez Y. Ebrahim, Data-Centric Technologies: Patent and Copyright Doctrinal Disruptions, 43 Nova L. Rev. 287, 310 (2019). Notably, this would be true even for an inventor who did not have access to AI. That is, once inventors in the field are assumed to have access to AI, this will raise the legal standard for nonobviousness across the board, including for those inventors in the field who do not have access to AI. However, as several commentators also note, this conception of AI currently is more science fiction than science,42Burk, supra note 4, at 301. in that AI only works within circumscribed attributes that humans input. Importantly, this Article explicitly assumes AI-generated inventions. In such a scenario, it is important to keep in mind that AI systems likely would raise the non-obviousness bar, making patents harder to obtain in the future.

III.  A DIFFERENTIATED PATENT SYSTEM?

The previous Part considers how AI inventions may affect the non-obviousness standard. Assuming for the sake of the argument that AI-generated inventions are patentable, we turn now to considering whether we should treat AI-generated patents differently from patents on inventions generated by humans.

While the first Part of this Article makes the case for patent protection of AI-generated inventions, we have not yet addressed how strong such patent protection should be. At first, this problem seems a special case of sequential innovation with just one chain—that is, the invention machine and its inventions. Unfortunately, the vast literature on IP rights and sequential innovations is of little help. It usually assumes (1) that firms compete in the generation of follow-on inventions and (2) that follow-on inventions improve or complement, in some ways, the original invention.43Suzanne Scotchmer, Standing on the Shoulders of Giants: Cumulative Research and the Patent Law, 5 J. Econ. Persps. 29, 29–30 (1991). In the present case, the same firm controls both the invention machine and the downstream inventions. Furthermore, downstream inventions are quite distinct from the invention machine itself.

It might be helpful to think of the invention machine and its offspring as one “mega invention.” This mega invention is characterized by high fixed costs (the cost of producing the machine) and low marginal costs (the cost of producing one more invention using the machine). Taking such a perspective leads to an intuitive parallel with the existing literature on optimal patent strength. If we allow downstream inventions to be patented, the fractional nature of the mega invention implies that more valuable (or fruitful) mega inventions will receive stronger protection. Put differently, mega inventions associated with a larger offspring will receive a larger number of patents—and thus, broader patent protection. In that simple setup, the breadth of patent protection is proportional to the inventive potential of the mega invention. A priori, such a naturally differentiated breadth of protection may seem desirable.
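The cost structure described above can be written compactly. As a stylized sketch, let $F$ denote the fixed cost of building the invention machine, $c$ the marginal cost of each additional invention, and $n$ the number of downstream inventions. These symbols and the linear cost form are illustrative assumptions, not a model drawn from the literature cited here.

```latex
\[
  C(n) = F + c\,n,
  \qquad
  \frac{C(n)}{n} = \frac{F}{n} + c \;\longrightarrow\; c
  \quad \text{as } n \to \infty .
\]
```

Under these assumptions, the average cost per invention falls toward $c$ as the machine's offspring grows, while the number of patents available to the owner grows with $n$ if each downstream invention may be patented.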

However, simply allowing more patents to more fruitful mega inventions may not be the first best. This discussion naturally takes us back to the literature on optimal patent breadth.44See, e.g., Richard Gilbert & Carl Shapiro, Optimal Patent Length and Breadth, 21 RAND J. Econ. 106, 108–12 (1990) (providing conditions for optimal patent policy); Paul Klemperer, How Broad Should the Scope of Patent Protection Be?, 21 RAND J. Econ. 113, 120–24 (1990) (exploring the tradeoff between patent length and width). From a theoretical perspective, optimal patent incentives will always depend on the incentive structure of the invention and investment processes, which clearly differ across technologies and markets. Thus, the first-best patent policy has to be a highly differentiated one, in which many aspects of the patent process and characteristics of patent protection differ for different kinds of inventions.45See David Encaoua, Dominique Guellec & Catalina Martínez, Patent Systems for Encouraging Innovation: Lessons from Economic Analysis, 35 Rsch. Pol’y 1423, 1425 (2006); Angus C. Chu, The Welfare Cost of One-Size-Fits-All Patent Protection, 35 J. Econ. Dynamics & Control 876, 877 (2011). This route is sometimes encouraged in the policy literature, which argues in favor of “a more differentiated approach to patent protection that depends on specific characteristics of the inventions.”46Org. for Econ. Coop. & Dev., Patents and Innovation: Trends and Policy Challenges 6 (2004).

In the present context, the first best might be a differentiated system for AI-generated and man-made inventions, reflecting the fact that the invention processes are intrinsically different. A differentiated system requires a sui generis IP right, as already pointed out by some scholars.47Deepak Somaya & Lav R. Varshney, Ownership Dilemmas in an Age of Creative Machines, 36 Issues Sci. & Tech. 79, 85 (2020); Alexandra George & Toby Walsh, Can AI Invent?, 4 Nature Mach. Intel. 1057, 1057–58 (2022). In practice, we do not and cannot implement first-best policies; political and institutional realities and myriad information and transaction frictions constrain actual policies.48See generally Adam B. Jaffe & Josh Lerner, Innovation and Its Discontents (2004). At the most fundamental level, the theoretical argument for differentiated patent treatment assumes that it is costless to separate different types of inventions from each other. A patent policy that awards AI longer/shorter or stronger/weaker patents than other inventions would require an articulated set of criteria that determine whether an invention is “AI” or “not AI.” If being “AI” resulted in less desirable treatment, we can be sure that applicants will figure out ways to characterize their inventions to meet the “not AI” criteria—and even more so if AI-generated inventions are deemed not patentable. We cannot know what fraction of truly-AI inventions would manage to escape the screen, but this positioning battle would inevitably waste resources and confuse the examination process. Recent history confirms this fear. In 2000, the United States Patent and Trademark Office (“USPTO”) held that business method patent applications would be subject to a “second pair of eyes” review (“SPER”), unlike other patent applications.49John R. Allison & Starling D. Hunter, On the Feasibility of Improving Patent Quality One Technology at a Time: The Case of Business Methods, 21 Berkeley Tech. L.J. 729, 734 (2006). Allison and Hunter show that the introduction of SPER led business method patent applicants to write their applications so that they would not be subject to the extra review.50Id.

A second problem with a differentiated patent system is that any differences in treatment would have to be introduced by statute, at least in the United States. Patent policy already is a highly political process. If legislation treating AI inventions differently were to be passed, it does not require a high degree of cynicism to expect that the differentiation eventually ending up in the legislation might bear little relation to what was suggested by the first-best theoretical analysis of incentives.

Further, there is a danger that such discussion would open a bigger door: if AI patents are to be treated differently, other interests would be sure to jump in and argue that their patents should be treated differently. And in each case, the interests most affected by such differentiation would be those who expect to apply for the new special category. They have much more at stake in seeking favorable treatment than anyone has at stake in protecting the broader public interest. Opening the door to special treatment might well result in a series of differentiations in which particularly active groups get favorable treatment. Again, believing or hoping that theoretical results from welfare optimization would drive the differentiation seems naïve.

Moreover, the creation of a sui generis right could distort the innovation ecosystem in unintended ways. Consider the case of a company that has a choice between allocating human pharmacologists and investing in an AI system to develop a new vaccine. It is not clear that we want to create a system whereby the firm decides to pursue one option over another depending on the type of right it will get at the end. The new vaccine should be produced in the most efficient manner, and IP rights should be neutral to this choice.

Finally, the creation of a differentiated patent system might run afoul of international treaty obligations under the Trade-Related Aspects of Intellectual Property Rights Agreement (“TRIPS”). TRIPS requires signatories to provide a minimum set of standards for all patents, such as the stipulation that the term of a patent must last at least twenty years from the filing date.51Agreement on Trade-Related Aspects of Intellectual Property Rights, Apr. 15, 1994, 1869 U.N.T.S. 299, 33 I.L.M. 1197, sec. 5, art. 33, ¶ 1 [hereinafter TRIPS Agreement]. However, it may be possible to create new sui generis intellectual property rights for AI-generated inventions that do not violate TRIPS obligations if such rights are not conceived as patents.52There is also an open question as to whether new intellectual property rights, such as database protection, violates TRIPS. The European Union created a new form of intellectual property rights with respect to database protection, which so far has survived TRIPS challenges. See generally Guido Westkamp, TRIPS Principles, Reciprocity and the Creation of Sui-Generis-Type Intellectual Property Rights for New Forms of Technology, 6 J. World Intell. Prop. 827 (2003). A complete examination of this issue is beyond the scope of this Article.

Overall, although a differentiated system might be the first best solution, the realpolitik of the patent system suggests that developing a patent policy specifically for AI inventions is not likely to improve public policy and may violate international obligations.

IV.  TAKING INVENTION MACHINES SERIOUSLY

This Part examines the bigger-picture implications of allowing patents on AI-generated inventions. In particular, this Part argues that patents on AI-generated inventions may overwhelm the examination capacity of national patent offices, increase the concentration of patent ownership, increase patent thickets, and lead to unlimited inventions. This Part also begins to examine changes to patent practice that might be desirable in light of these potential implications.

A.  The Examination Process

It is easy to see why invention machines pose significant challenges to the functioning of the patent system. The first challenge is a potential backlog at patent offices that would come with a patent application explosion. Examining patent applications is (currently) a labor-intensive, time-consuming task. If inventing becomes cheap and fast, patent offices may not keep up with the increasing demand for examination.53Cf. George & Walsh, supra note 47, at 1059–60 (making a similar point). The “global patent warming” of the mid-2000s,54See generally Bronwyn H. Hall & Rosemarie Ham Ziedonis, The Patent Paradox Revisited: An Empirical Study of Patenting in the U.S. Semiconductor Industry, 1979–1995, 32 RAND J. Econ. 101 (2001) (documenting the rise of patenting in the semiconductor industry); Joseph Straus, Is There a Global Warming of Patents?, 11 J. World Intell. Prop. 58 (2008) (examining the reasons behind the surge in patent application filings); Jérôme Danguy, Gaétan de Rassenfosse & Bruno van Pottelsberghe de la Potterie, On the Origins of the Worldwide Surge in Patenting: An Industry Perspective on the R&D-Patent Relationship, 23 Indus. & Corp. Change 535 (2014) (same). which put the U.S. and European patent systems under strain, might look pale in comparison. Pendency could reach excessively long delays, which is detrimental to welfare.55See Alfons Palangkaraya, Paul H. Jensen & Elizabeth Webster, Applicant Behaviour in Patent Examination Request Lags, 101 Econ. Letters 243, 243 (2008); Warren K. Mabey, Jr., Deconstructing the Patent Application Backlog: . . . A Story of Prolonged Pendency, PCT Pandemonium & Patent Pending Pirates, 92 J. Pat. & Trademark Off. Soc’y 208, 237–46 (2010); Lily J. Ackerman, Prioritization: Addressing the Patent Application Backlog at the United States Patent and Trademark Office, 26 Berkeley Tech. L.J. 67, 67–68 (2011); Stuart J. H. Graham & Galen Hancock, The USPTO Economics Research Agenda, 39 J. Tech. Transfer 335, 341 (2014).

The obvious policy response is that patent offices must also use AI to speed up the examination process. Currently, a third-party contractor with the USPTO utilizes AI to classify new patent applications so that they route to patent examiners with the right technological expertise.56U.S. Pat. & Trademark Off., U.S. Dept. Com., PTOC-016-00: Privacy Impact Assessment for the Serco Patent Processing System (PPS) 1 (2018); Serco Processes 4 Millionth Patent Application for U.S. Patent and Trademark Office, PR Newswire (Nov. 15, 2018), https://www.prnewswire.com/news-releases/serco-processes-4-millionth-patent-application-for-us-patent-and-trademark-office-300751330.html [https://perma.cc/GM86-EWPT] (“Since 2006, Serco has performed classification and other analysis services through awarded contracts including Pre-Grant Publication (PGPubs) Classification Services, Initial Classification and Reclassification (ICR) Services, and Full Classification Services (FCS) contracts.”). The USPTO has also considered incorporating AI to improve prior art searching of patent examiners.57U.S. Pat. & Trademark Off., U.S. Dept. Com., Patent-End-To-End Search Artificial Intelligence Capability: Request for Information & Notice of Vendor Engagement 3 (Aug. 25, 2023), https://sam.gov/opp/e10a9492b5f94f738a4790190303e552/view [https://perma.cc/MEH3-KZ57]. AI holds great potential for improving the search process associated with patent examination as well as locating relevant passages in the prior art, mapping them to elements of the current application’s claims, and hence suggesting potential rejections. Admittedly, AI may not be as helpful in reviewing patent applications on newer subject matters where inventors are just developing new patentable technologies.

Moreover, it seems unlikely that legislators will authorize a fully autonomous examination, that is, the automatic granting of traditional patent rights without a human in the loop. Some human intervention in the patent examination process may be necessary to satisfy a patent applicant’s due process rights or administrative law’s reason-giving requirements under current law.58Although the case law is far from settled on this matter. See, e.g., Arti K. Rai, Machine Learning at the Patent Office: Lessons for Patents and Administrative Law, 104 Iowa L. Rev. 2617, 2625–29 (2019); Aziz Z. Huq, A Right to a Human Decision, 106 Va. L. Rev. 611, 661–71 (2020). Moreover, effectively keeping up with the increase in patent numbers requires patent offices to adopt AI tools as sophisticated as those of the most advanced applicants, which does not seem likely.59Cf. Rai, supra note 58, at 2638 (“To the extent that the Al-assisted search used by the Patent Office does not account for potentially rapid change in the average skill of practitioners itself spurred by AI, it will fall short.”). Because the need for human intervention puts a hard constraint on examination time, it is safe to assume that, on balance, pendency most likely will increase.60Interestingly, one might say that invention machines will reduce the demand for scientists and engineers. The pool of redundant inventors could then be hired by patent offices to examine the inventions of the very machines that took their job. For a modern example of machine slavery, see Modern Times (United Artists 1936).

The USPTO has some experience with past surges in patent applications. In the 1990s, the agency experienced a torrential rise in the number of patent applications filed on expressed sequence tags (“EST”), or small fragments of DNA.61This rise in patent applications was due to changes in technology that made the sequencing of DNA easier. See Eliot Marshall, Patent Office Faces 90-Year Backlog, 272 Science 643, 643 (1996). The USPTO estimated that it would take a single examiner over 90 years and cost the Agency upwards of 20 million dollars to review the EST patent applications in its queue. As a result, then-USPTO Commissioner Bruce Lehman considered several possible changes to combat the growing backlog of DNA patent applications, including requiring patent applicants to do more work themselves or contract out the research for searching the prior art.62In the EST context, the Agency successfully lobbied for an elevated utility standard with respect to EST—which required the patent applicant to describe the function and utility of the gene that the EST comprised. In re Fisher, 421 F.3d 1365, 1370–71 (Fed. Cir. 2005). Patent offices can consider these same approaches with respect to AI-generated patent applications.

Contracting out the research, however, would have the same problems as addressed above. That is, any contractor likely would need access to AI tools as sophisticated as those of the most advanced applicants. An alternative may be to require patent applicants on AI-generated applications to conduct their own patentability search and identify the most relevant prior art when they submit their applications to patent offices. Shifting the prior art search to the applicant would ease the burden on the patent offices as well as harness the most up-to-date AI search tools.63The current duty of candor whose breach can lead to a charge of inequitable conduct attempts to harness applicants’ knowledge. 37 C.F.R. § 1.56(a) (“Each individual associated with the filing and prosecution of a patent application has a duty of candor and good faith in dealing with the Office, which includes a duty to disclose to the Office all information known to that individual to be material to patentability as defined in this Section.”). Moreover, the common refrain against requiring more search efforts of patent applicants—that such efforts would increase the cost of patenting and hence reduce patenting efforts for cost-conscious applicants—has less force for AI-generated inventions.64John M. Golden, Proliferating Patents and Patent Law’s “Cost Disease,” 51 Hous. L. Rev. 455, 494 (2013). Given that invention machines presumably have processed and screened the prior art for coming up with the invention, it would be reasonably straightforward to identify the closest prior art. Nonetheless, shifting the patentability search to the applicant has its own set of drawbacks. Applicants, whose incentives may arguably cut against doing an exhaustive search, may find ways to game the search process.65Cf. Jeffrey M. Kuhn, Information Overload at the U.S. Patent and Trademark Office: Reframing the Duty of Disclosure in Patent Law as a Search and Filter Problem, 13 Yale J.L. & Tech. 90, 112–19 (2010) (documenting that examiners receive so much information on prior art disclosure from patent applicants that examiners cannot process the information and often ignore it).

Other work-sharing options may also ease the administrative burden associated with a rapid influx of AI-generated patent applications. The USPTO has patent work-sharing arrangements with foreign intellectual property offices to improve patent examination efficiency. Patent work-sharing permits patent offices to collaborate in the examination of commonly filed patent applications, reducing inefficiencies that patent offices experience when doing largely duplicative research into questions relating to patentability.66Mabey, supra note 55, at 231. The most famous of these work sharing efforts occurs through the Patent Prosecution Highway (“PPH”) programs, in which the partial examination of an application in one office can result in the expedited review of that application in another office.67Toshinao Yamazaki, Patent Prosecution Highways (PPHs): Their First Five Years and Recent Developments Seen from Japan, 34 World Pat. Info. 279, 279 (2012) (providing an overview of PPH programs); U.S. Pat. & Trademark Off., Performance and Accountability Report 105 (2021), https://www.uspto.gov/sites/default/files/documents/USPTOFY21PAR.pdf [https://perma.cc/HB76-XGY8]. Various reports suggest that PPH results in faster and cheaper reviews of patent applications.68See Yamazaki, supra note 67, at 280–82 (claiming PPH benefits in terms of speed of “patent acquisition,” increased allowance rates, and reduced costs). Nevertheless, in the fiscal year of 2021, the 6,000 patent applications filed under PPH are minuscule in comparison to the 650,000 patent applications filed at the USPTO.69U.S. Pat. & Trademark Off., supra note 67 (in the fiscal year 2021, 5,821 patent applications were filed under PPH while over 650,000 patent applications were filed in total at the USPTO). As a result, work-sharing efforts seem unlikely to do much to combat the increase in filings associated with AI-generated patent applications.
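As a rough check on the magnitudes, the fiscal year 2021 figures from the USPTO report cited above can be compared directly. The total is reported as "over 650,000," so treating it as exactly 650,000 is a simplifying assumption that slightly overstates the share.

```latex
\[
  \frac{5{,}821}{650{,}000} \approx 0.009 = 0.9\%
\]
```

On these figures, PPH filings made up under one percent of all applications filed at the USPTO that year.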

A more radical approach might be to, in effect, aggregate examinations of patents produced by the same AI invention machine. Applicants could apply to have a specific AI algorithm certified as reliably generating novel and non-obvious inventions. Subsequent applications that could be shown to be the output of certified machines would be presumed valid and granted patent protection.70To guarantee the quality of the certification, machines could be checked regularly and major changes to the algorithms would trigger re-examination. Examiners could also randomly select some AI-generated inventions at regular intervals and examine them. This two-track system does not necessarily imply a differentiated patent system since the nature of the patent right granted is the same across both tracks. Further, it does not seem to introduce the problem of people gaming the system to qualify for or avoid special treatment. An applicant could submit an “invention machine” for approval. Examiners would not need to determine whether the submitted “machine” meets some definition of AI; they would need only to determine whether or not it reliably produces inventions that meet the standards for patentability.
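The periodic spot checks contemplated for certified machines can be illustrated with a short sketch. It is only one possible reading of the proposal: the function name, the sample size, the failure-rate threshold, and the `examine` callback (which stands in for a full human examination) are all hypothetical and do not appear in the proposal itself.

```python
import random
from typing import Callable, Sequence


def spot_check(applications: Sequence[str],
               examine: Callable[[str], bool],
               sample_size: int,
               max_failure_rate: float,
               rng: random.Random) -> bool:
    """Return True if a certified machine keeps its certification.

    `examine` is a hypothetical stand-in for a human examination; it
    returns True when an application meets the patentability standards.
    """
    if not applications:
        return True  # nothing to audit in this period
    # Draw a random sample without replacement from the machine's output.
    k = min(sample_size, len(applications))
    sample = rng.sample(list(applications), k)
    # Count sampled applications that fail full examination.
    failures = sum(1 for app in sample if not examine(app))
    # Certification survives only if the observed failure rate is tolerable.
    return failures / k <= max_failure_rate


if __name__ == "__main__":
    rng = random.Random(0)
    apps = [f"app-{i}" for i in range(100)]
    # Hypothetical audit in which every sampled application passes.
    print(spot_check(apps, lambda app: True, 10, 0.1, rng))
```

The design choice worth noting is that the audit tests the machine's output, not whether the machine meets some definition of AI, which mirrors the point made in the text.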

B.  Market Impacts

The second challenge of cheap and fast inventions is the potential effects on the markets for innovation. This Section identifies two potential market impacts of allowing patents on AI-generated inventions. First, AI-generated inventions could result in an increase in the concentration of patent ownership. Owners of invention machines would have the opportunity to amass vast patent portfolios, possibly conferring on them strategic advantages over their rivals.71Hall & Ziedonis, supra note 54, at 108–10; Gideon Parchomovsky & R. Polk Wagner, Patent Portfolios, 154 U. Pa. L. Rev. 1, 72–74 (2005). Along this line, Professors Choi and Gerlach have shown that an increase in one firm’s patent portfolio unambiguously reduces the rival firm’s incentives to develop a new product. One could also think of more severe chilling and blocking effects.72Jay Pil Choi & Heiko Gerlach, A Theory of Patent Portfolios, 9 Am. Econ. J.: Microeconomics 315, 315–16 (2017).

Second, a burst of inventions would exacerbate the problem of patent thickets, namely overlapping and fragmented patent rights.73Carl Shapiro, Navigating the Patent Thicket: Cross Licenses, Patent Pools, and Standard Setting, in 1 Innovation Policy and the Economy 119, 119–22 (Adam B. Jaffe, Josh Lerner & Scott Stern eds., 2000); Rosemarie Ham Ziedonis, Don’t Fence Me In: Fragmented Markets for Technology and the Patent Acquisition Strategies of Firms, 50 Mgmt. Sci. 804, 804–06 (2004). Intertwined patent rights increase litigation risks for innovators, and the transaction costs associated with clearing these rights may become prohibitively expensive. This is especially true in industries in which many patent-protected technologies are necessary to manufacture a single product, such as a smartphone.

Relatedly, increased market concentration of patenting and patent thickets could also lead to the emergence of a new genre of patent assertion entities (“PAEs”), taking hold-ups and nuisance settlements to new heights. The leading critique of PAEs is that they assert weak or invalid patents against product manufacturers to extract nuisance settlements, which in turn stunt innovation.74Ashley Chuang, Note, Fixing the Failures of Software Patent Protection: Deterring Patent Trolling by Applying Industry-Specific Patentability Standards, 16 S. Cal. Interdisc. L.J. 215, 232 (2006) (“Because of a patent troll’s approach to generating revenue, a troll’s charges of infringement and litigation can often be baseless and thus clog the legal system.”); Spencer Hosie, Patent Trolls and the New Tort Reform: A Practitioner’s Perspective, 4 I/S: J.L. & Pol’y for Info. Soc’y 75, 78 (2008) (“Perhaps the most common refrain in the patent debate is that plaintiffs will bring frivolous cases to extort unjustified settlements.”); Sannu K. Shrestha, Trolls or Market-Makers? An Empirical Analysis of Non-Practicing Entities, 110 Colum. L. Rev. 114, 119 (2010) (“One of the most prominent criticisms against NPEs is that they acquire weak and obscure patents and use them to pursue ‘baseless’ litigation.”); Robert P. Merges, The Trouble with Trolls: Innovation, Rent-Seeking, and Patent Law Reform, 24 Berkeley Tech. L.J. 1583, 1603–04 (2009) (discussing allegations that NPEs file suits on weaker patents). While there is no reason to think that AI-generated inventions are inherently of lower quality than human-generated inventions, the rise of patenting fueled by AI-generated inventions could lead to more overlapping patent rights and could decrease the costs of amassing vast patent portfolios. Product manufacturers who face patent thickets often settle through cross-licensing agreements. This process is not possible for PAEs as they do not produce any products or services that could potentially infringe anyone else’s patents. Thus, an increase in patent thickets and a decrease in barriers to amassing vast patent portfolios may create tantalizing opportunities for PAEs.

The adverse welfare effects of vast patent portfolios and patent thickets suggest that rewarding machine-made inventions with as many patents as inventions produced may offer too large a reward. Considering that invention machines have high fixed costs and low marginal costs, there must be a point at which the machines are generating large numbers of very low value inventions. Past this point, additional patents have value to their owners only through the market power generated by a larger portfolio.75Alfonso Gambardella, Dietmar Harhoff & Bart Verspagen, The Economic Value of Patent Portfolios, 26 J. Econ. & Mgmt. Strategy 735, 735–36 (2017). This optimal threshold is private information and varies across invention machines.

One could imagine several mechanisms to limit patent portfolios’ strength. The suggestion above of creating the applicant option to have an invention machine certified as producing patentable inventions likely would exacerbate the portfolio market power and patent thickets problem, but it also offers potentially incentive-compatible ways to limit such market power. Patents granted through this route could bear limitations such as a shorter validity period or forced availability under Fair, Reasonable, and Non-Discriminatory (“FRAND”) clauses—although FRAND clauses come with their own set of challenges.76Michael A. Carrier, Why Is FRAND Hard?, 2023 Utah L. Rev. 931, 932–53 (2023) (describing eight reasons why FRAND licensing is challenging). However, as noted in Part III, these limitations would need to be carefully crafted so as not to violate international treaty obligations under the TRIPS Agreement. Other options exist, such as increasing application fees with the size of the assignee’s patent portfolio or for each new invention produced by the same machine.

Putting conditions on patents from invention machines that potentially reduce the value of the patents would, again, introduce greater differentiation into the system. But this could perhaps be incentive-compatible rather than wasteful. It will be up to the applicants to decide whether to seek approval of an invention machine, and if they have an approved machine, whether to submit each new invention as a product of the machine or as a standard application. The machine route will yield faster but less valuable patents, while the standard route will yield slower but more valuable patents. In principle, these tradeoffs could be calibrated to limit the market power of vast portfolios while still affording appropriate incentives to patent the best inventions. Nevertheless, a differentiated system would still suffer from the political economy concerns set forth in Part III.

While it seems a priori desirable to limit the strength of AI-generated patent portfolios, the best mechanism to achieve this aim is unclear and deserves a careful theoretical investigation.

C.  Unlimited Inventions?

Finally, even if the flood of inventions from AI is not all patented, the democratization of invention machines could still have systemic consequences for the patent system. Owners of such machines might not patent their inventions but generate a vast amount of prior art. This prior art would naturally form part of the literature used to assess the non-obviousness of inventions, implicitly raising the bar to obtain patents in these areas—perhaps to a point where it would be extremely challenging to obtain patents in a given area.77One such initiative is already under way. See All Prior Art, http://allpriorart.com [https://perma.cc/4RFE-8SQL] (last visited Sept. 6, 2023). A key question is whether the disclosures by the AI would be enabling. Firms may want to flood a technological area with prior art to ensure freedom to operate.78Firms did something similar with DNA gene fragments before the law required that for a DNA gene fragment to be patentable, the utility of the underlying gene must be identified. This practice could essentially impose patent-free technological zones with unknown consequences on product development and commercialization. Such a situation would have similar consequences to allowing an AI-augmented PHOSITA. The issue would not be that the AI-augmented PHOSITA could have produced the invention, but an acknowledgement of the fact that a large pool of prior art exists that renders the invention obvious.

Taking this argument a step further (and maybe too far), suppose AI got so skilled at invention that invention itself became essentially irrelevant. Imagine a world where in some sense every invention that could possibly be made at a point in time was known to everyone, or knowable to anyone who cared at very low cost. At this point, there would be no need to provide any incentive for people to invent; indeed it would become somewhat unclear what it even meant to invent something. But there may still be a social need to provide incentives for people to invest in commercializing inventions, as argued above.79Unless we had AI that, without cost, could tell us exactly how to adapt, manufacture, scale-up, and market a new product. We have trouble imagining how this would work, but it would be silly to rule it out ex ante.

To make this consideration concrete, consider the (admittedly artificial) hypothetical case in which every chemical compound that might have therapeutic benefits to humans was known or knowable, so no one could meaningfully “invent” a new drug. But it still costs millions to test the drug in humans. We would want companies to pay to run those tests, but they would not do so if anyone could then sell the drug because it was proved safe and effective. In that world, we might want to give companies some kind of exclusive right to test and then market new drugs. But we couldn’t use first to file as the criterion to determine who got that right. One could imagine a different kind of examination system, where companies made proposals for developing products out of the pool of available inventions, and were somehow evaluated on how much they proposed to invest and/or how good their development plan was. But that sounds hard. To economists, an obvious solution would be to auction the rights. The development of a particular invention out of a publicly-known pool is somewhat like a slice of electromagnetic spectrum in a given geographic area. We want someone to use it, but we don’t want more than one entity to use it, so we auction it off.

We raise these possibilities neither to say that we know that AI will get that good, nor to suggest that we have done any careful analysis of the merits of public auctions for invention development rights. Rather, we only want to suggest that if AI becomes extremely successful at invention, we will need to think about potentially radical changes to innovation policy.

CONCLUSION

Patent law has traditionally adapted slowly to a changing environment. In 2004, the U.S. National Research Council issued a report entitled “A Patent System for the 21st Century.”80Nat’l Rsch. Council of the Nat’l Acads., A Patent System for the 21st Century (Stephen A. Merrill, Richard C. Levin & Mark B. Myers eds., 2004). The report addressed issues that had plagued the U.S. patent system for decades or more, including questionable patent quality, impediments to disseminating information through patents, and international inconsistencies.81Id. Some inconsistencies, such as the United States’ first-to-invent principle compared to the rest of the world’s first-to-file principle, had existed since the Patent Act of 1790. Many of the issues discussed in the report have not yet come to the fore. While they could materialize sooner than expected, the legislator is unlikely to act faster than expected. We hope that discussing these issues now will help make the patent system ready for the 22nd century.

In our view, some form of IP protection for AI-generated inventions is likely desirable. However, the nature of the IP regime is unclear and deserves in-depth theoretical and empirical examination. Regardless of whether AI-generated inventions are patentable, if AI radically reduces the cost and increases the production rate for inventions, it will have implications for the patentability standards that will have to be addressed. In addition, AI-generated inventions will have significant implications for the patent ecosystem more generally. A large increase in the rate of generation of patentable ideas will potentially overwhelm the examination process (if AI-generated inventions are patentable), make patents unavailable in wide swaths of technology (if AI-generated inventions are not patentable but saturate the prior art), and increase the concentration of patent ownership and the likelihood of patent thickets.

We have proposed a series of potential solutions to these problems. We do not claim that any of our proposed solutions is the best. We note also that AI-generated inventions have the potential to exacerbate the problem of increasing market power from highly concentrated patent portfolios, and that certifying invention machines might make this problem worse. Our hope is that this Article illustrates a need to seriously consider the protection of AI-generated inventions and that creative solutions do exist, but those solutions may have complex ramifications that should be thought through. In addition, these solutions require global cooperation to harmonize legislation. Meanwhile, some concrete steps could be implemented already, such as a change in disclosure requirements. Requiring patent applicants to disclose the extent of AI involvement in the invention process would make it possible to track AI-generated inventions. This step is necessary to quantify the phenomenon and empirically study its effects.

The pressure for changes in the system that AI-generated inventions may create is also an opportunity. The structure of our current system is essentially the result of historical accident. As noted, it is difficult to measure the consequences of the system, or of specific aspects of the system, because we do not have natural experiments that allow us to test one practice against another. If changes are to be made in response to these new pressures, they should be structured initially to provide explicitly for quantified evaluation of the effects of new policies and procedures, potentially including structures such as randomized control trials that isolate the causal effect of specific changes.82For an example of the use of an RCT to measure the effect of a change in patent examination procedure, see Nicholas A. Pairolero, Andrew Toole, Peter-Anthony Pappas, Charles DeGrazia & Mike Teodorescu, Closing the Gender Gap in Patenting: Evidence from a Randomized Control Trial at the USPTO 2–5 (U.S. Pat. & Trademark Off., Econ. Working Paper No. 2022-1, 2022).

There is little doubt that confronting the implications of AI playing a role in the invention process is now on the agenda and is likely to become more and more important. This Article’s focus on one set of issues should not be taken to mean that these issues are the main challenges facing tomorrow’s patent system. Nor does it mean that there are no other ways of modernizing the patent system.83For example, proposals to “decentralize” the patent system using distributed ledger (also known as blockchain) technologies may very well be an important component of a 22nd-century patent system. Lital Helman, Decentralized Patent System, 20 Nev. L.J. 67, 68–71 (2019); Gaétan de Rassenfosse & Kyle Higham, Decentralising the Patent System, 38 Gov’t Info. Q. 1, 1 (2021). In the context of a burst of inventions, a “block-chained” patent system can mitigate the transaction costs associated with intertwined patent rights. A license to an antecedent patent, essential to the use of a new invention, could be executed automatically by means of a smart contract under set conditions, should the owner of the antecedent patent allow it. But AI is a rapidly evolving set of technologies, and the longer we delay determining how the innovation system should respond, the more likely we are to see socially undesirable consequences.

96 S. Cal. L. Rev. 1453

* Associate Professor, College of Management of Technology, Ecole polytechnique fédérale de Lausanne, Switzerland.

† Professor Emeritus of Economics, Brandeis University; Senior Research Associate, Motu Research, Wellington, New Zealand.

‡ Charles Tilford McCormick Professor of Law, Associate Dean for Research, University of Texas School of Law.