The Search for Ethical AI
The writers, designers and artists of the future may be destined for lives of perpetual underemployment.
Without a continuous supply of fresh human creativity, AI eventually suffers “model collapse.” Compensating creators isn't just the fair thing to do. It may also be necessary to keep the technology from eating itself. Part 2 of Bill Sparks’ look at the harsh economics of creator compensation.
Start with a number: $3,000.
That's roughly what authors whose books were pirated and used to train Anthropic's Claude AI stood to receive per title from the largest AI copyright settlement resolved to date. The case, Bartz v. Anthropic, totaled $1.5 billion — a figure that sounds significant until you do a little math. Anthropic reportedly downloaded around 7 million books from pirate repositories. About 500,000 titles qualified for compensation under the settlement terms. Each one gets approximately $3,000, split by default between the author and publisher.
For a book that took years to research and write, $1,500 is a complicated thing to feel good about.
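The arithmetic behind those figures is simple enough to check. A minimal sketch, using the article's round numbers rather than the court's exact accounting:

```python
# Rough arithmetic behind the Bartz v. Anthropic settlement figures
# cited above. These are the article's round numbers, not official
# court accounting.

settlement_total = 1_500_000_000   # $1.5 billion settlement fund
qualifying_titles = 500_000        # titles that qualified for compensation
books_downloaded = 7_000_000       # books reportedly taken from pirate sites

per_title = settlement_total / qualifying_titles
author_share = per_title / 2       # default 50/50 author/publisher split

print(f"Payout per title: ${per_title:,.0f}")    # $3,000
print(f"Author's share:   ${author_share:,.0f}") # $1,500
print(f"Share of downloaded books compensated: "
      f"{qualifying_titles / books_downloaded:.0%}")
```

Roughly 7 percent of the books Anthropic reportedly downloaded qualified for any payment at all, which is part of why the headline figure reads differently up close.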
What makes the settlement even harder to celebrate is what it doesn't resolve. The case was built on piracy — Anthropic acquired those books from illegal repositories, which is where its legal exposure came from. A separate ruling by the same judge found that training AI on legitimately acquired books is fair use. Legally acquired means no compensation required. The biggest creator victory so far mostly addresses the smaller part of the problem, and the central question — whether using someone's life's work to build a commercial competitor requires any payment at all — remains unanswered.
What the Law Currently Says
The short version is that the law currently says very little that helps individual creators.
In June 2025, Judge William Alsup of the U.S. District Court for the Northern District of California ruled that if an AI company obtains books through legal channels, training on them without permission or payment is fair use. That's not a quirky interpretation. It reflects how courts have historically thought about the difference between copying a work and learning from it. A person can read every novel ever written and use what they've learned to write their own. Courts are wrestling with whether an AI doing the same thing at a billion times the scale is meaningfully different.
So far, the answer has largely been no.
The U.S. Copyright Office issued guidance in May 2025 that made things only marginally clearer. It found little support for mandating licensing schemes, instead expressing preference for market-based approaches where creators and AI companies negotiate deals on their own. Given that one side of those negotiations controls some of the most valuable companies in history and the other side is often a freelance illustrator trying to pay rent, "negotiate on your own" is doing quite a bit of work as policy advice.
Europe is further along. The EU AI Act, coming into fuller effect in 2026, requires AI developers to disclose their training data sources, respect copyright opt-outs, and label AI-generated content. EU creators can formally reserve their rights to prevent their work from being used in training. It's not a complete solution, but it's a meaningful protection. It also only covers Europeans.
What's Actually Being Paid
Major publishing companies have struck lucrative licensing deals with AI companies. Individual creators, by contrast, can expect little, if any, compensation when their work is used for training.
A licensing market is emerging, though it mostly benefits organizations rather than individuals.
OpenAI has signed 18 licensing agreements with publishers globally. News Corp reached a five-year deal reportedly worth more than $250 million. The Associated Press, The Washington Post, The Guardian, the Financial Times, and The New York Times all have agreements in place with major AI companies. The basic structure of these deals: the publisher's content gets used in AI training or surfaces in AI-generated responses with attribution, and in return the publisher gets access to AI tools and, sometimes, additional funding. The Axios deal included money from OpenAI to open four local newsrooms, which is a creative arrangement even if it raises its own questions about editorial independence.
These are real agreements involving real money. They are also almost entirely invisible to the individual writer, photographer, or illustrator whose work sits in the same training datasets. A staff writer at The Guardian doesn't see a share of whatever licensing revenue the paper negotiates with OpenAI. A freelancer who sold a piece to the Financial Times five years ago certainly doesn't.
Music has moved somewhat further in developing individual compensation structures. By 2026, some AI music licensing contracts define which specific tracks can be included in training datasets, require platforms to log attribution metadata, and calculate royalties based on usage metrics — how many times a track's contribution influenced a generated output. The model is closer to streaming royalties than a one-time fee. It's still early and still contested, but there is at least a framework being built.
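A minimal sketch of what that usage-metered structure could look like in practice. The track IDs, per-attribution rate, and log format below are invented for illustration; no real platform's formula is being described.

```python
# Hypothetical sketch of a usage-metered AI music royalty, loosely
# modeled on the streaming-style structure described above. Track IDs,
# the rate, and the attribution log format are illustrative assumptions.

from collections import Counter

RATE_PER_ATTRIBUTION = 0.004  # hypothetical dollars per attributed influence

# Attribution log: each generated output records which licensed tracks
# the platform's metadata says influenced it.
attribution_log = [
    {"output_id": "gen-001", "influenced_by": ["track-A", "track-B"]},
    {"output_id": "gen-002", "influenced_by": ["track-A"]},
    {"output_id": "gen-003", "influenced_by": ["track-C", "track-A"]},
]

def royalties(log, rate):
    """Count attributions per track, then convert counts to dollar amounts."""
    counts = Counter(t for entry in log for t in entry["influenced_by"])
    return {track: n * rate for track, n in counts.items()}

# track-A is attributed three times, track-B and track-C once each.
print(royalties(attribution_log, RATE_PER_ATTRIBUTION))
```

The design choice worth noticing is that payment scales with logged influence rather than a one-time buyout, which is exactly what distinguishes this framework from the flat licensing fees in other creative fields.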
For most other creative fields, that framework doesn't exist yet.
The Most Ambitious Proposal
In December 2025, three academics published what may be the most carefully thought-through solution to this problem so far. Frank Pasquale of Cornell Law and Cornell Tech, Thomas Malone of MIT Sloan, and Andrew Ting of George Washington University proposed a new legal right they call "learnright" — a seventh exclusive right to be added to the six that copyright already grants creators, this one specifically covering the use of a work in AI training.
The concept is modeled on how copyright has evolved before. When digital audio transmission became a significant new use of music, Congress created new protections for it. The argument is that Congress could do the same for AI training ingestion. Under a learnright regime, companies building generative AI would need to license the right to train on specific datasets, much as some already do with news archives and stock photo libraries. Collective licensing organizations — similar to ASCAP or BMI in the music industry — would handle the aggregation and distribution, reducing the burden of individual negotiations.
Malone describes the appeal directly: "Learnright law provides an elegant way of balancing all these competing perspectives. It provides compensation to the people who create the content needed for AI systems to work effectively. It removes the legal uncertainties about copyright law that AI companies face today. In short, it addresses a growing legal problem in a way that is simpler, fairer, and better for society than current copyright law."
Pasquale adds an argument aimed squarely at the companies: "At present, AI firms richly compensate their own management and employees, as well as those at suppliers like NVIDIA. But the copyrighted works used as training data are also at the foundation of AI innovation. So it's time to ensure its creators are compensated as well."
There is also a self-interest argument buried in the proposal that AI companies would be wise to take seriously. Research suggests that AI systems fed on their own outputs over time suffer "model collapse" — the quality of generated content degrades as models increasingly train on AI-generated rather than human-generated material. Without a continuous supply of fresh human creativity, the training data pipeline eventually runs thin. Compensating creators isn't just the fair thing to do. It may also be necessary to keep the technology from eating itself.
The Counterarguments
How do you determine what a single poem or illustration is worth in a dataset of several billion items?
The case for learnright is compelling, but it's not without legitimate criticism.
The most straightforward objection is that mandating licensing could slow innovation and create prohibitive costs for smaller AI developers who can't afford the legal infrastructure to negotiate rights across millions of individual works. The Copyright Office's preference for market-based solutions reflects a real concern that the wrong regulatory structure could calcify an industry still figuring out what it is.
There's also the practical problem of valuation. How do you determine what a single poem or illustration is worth in a dataset of several billion items? The music industry spent decades building BMI, ASCAP, and eventually the complex machinery of streaming royalties — and that system is still messy, still contested, and still the subject of ongoing litigation. Replicating it across every category of creative work simultaneously is a genuinely difficult undertaking, both legal and logistical.
And then there's the fair use argument that has historically given society breathing room to learn, build, and create. Courts have long recognized that progress depends on being able to engage with existing work without permission at every turn. The question of where AI training falls on that spectrum isn't settled, and reasonable people disagree.
Where Things Actually Stand
Honest assessment: not far enough along for most creators.
The settlements resolve piracy, not principle. The licensing deals reach publishers, not the people who did the work. Learnright has no legislative champion yet. The EU opt-out framework is the most concrete protection currently available, and it only covers Europeans. The Copyright Office prefers market solutions, but the market is negotiating across an enormous power imbalance with no floor under the weaker side.
The music industry parallel is instructive, though not entirely comforting. It took decades of legal battles, Congressional action, and collective organizing before streaming royalties became a functional — if still disputed — system. Napster launched in 1999. Meaningful streaming compensation didn't arrive until well into the following decade, and artists were still fighting over rates years after that. Creators in other fields are at roughly the same point the music industry was at the turn of the century: watching their work get used at scale, arguing about whether it's legal, and trying to establish what fair looks like before the window closes.
The question isn't whether a fair system is possible. Systems like this have been built before. The question is whether one will arrive before the creative ecosystem AI depends on has been altered in ways that are difficult to reverse.
Which brings us back to $3,000 per book
It's not nothing. For some authors, particularly those with large catalogs of qualifying titles, the Anthropic settlement represents real money. But for a work that took years to produce, that trained a system now generating billions of dollars in revenue, and that a court has ruled was fair game if the company had simply acquired it through legitimate channels — it's also a preview of what "fair" looks like when the law hasn't caught up and the market is left to set its own terms.
Whether that changes depends on whether the people making the rules decide it should.
The Advantage Journal arrives every week. What matters in sport, mobility, media, and technology — curated and contextualized by Bill Sparks, Bill Long, and Paul Pfanner. No hedging. No filler. Subscribe — It’s Free

