Should AI companies be allowed to mine creative content to train models?

Published · Updated · Thomas Wood

The effects of AI companies training on creative works

When AI companies train models on text, video and images from creative industries, the end product is an AI model which can create near-human quality visual designs or copywriting. Many artists have argued the AI companies are exploiting creatives by profiting from their work, while the original creators do not receive any compensation. By flooding the market with low-cost content, AI has driven down rates for human artists, which is threatening creative careers.

Artists are also distressed by style mimicry, where AI models can create instant artworks “in the style of” Banksy, Van Gogh, Frida Kahlo, or Tracey Emin.

What does the law say about training on others’ copyrighted work?

In the recent Getty Images v Stability AI case, a judge ruled that Stability AI’s image generation model, which was trained on images belonging to Getty Images, didn’t infringe Getty Images’ copyright because an AI model doesn’t store the original photos and is not a copy in the sense of copyright law. It was also difficult for Getty Images to enforce their copyright because Stability’s training took place outside the UK, which suggests that copyright can only be enforced in the jurisdiction where the model training took place. The Getty ruling revealed gaps in copyright law that the UK government is now trying to fix.

The UK government has proposed changing copyright rules to make it easier for AI companies to mine content, by assuming that creative works are fair game for training AI models, unless the creators choose to “opt out” of such arrangements. In other words, all your artwork can be used to train an AI unless you take proactive steps to prevent it.

The backlash from creative industries

TV companies and film makers have claimed that this opt-out system is unworkable. Eric Fellner, co-chair of Working Title Films, called the proposal an “existential threat” to creative industries.

In February 2025, over 1,000 musicians including Kate Bush, Annie Lennox, and Damon Albarn released a silent protest album titled Is This What We Want?, whose 12 silent tracks had titles that collectively spelled out: “The British Government Must Not Legalise Music Theft To Benefit AI Companies.”

The House of Lords’ recommendations to the Government

In the last week, British artists were granted a small victory, as the House of Lords released a landmark report titled AI, Copyright and the Creative Industries[1], which criticised all of the government’s opt-out models. (For readers outside the UK, you might be surprised to learn that in 2026, unlike in other countries, we still have an upper chamber partly composed of people who have inherited their special position and right to vote on Acts of Parliament from their ancestors.) However, the Lords’ report has been criticised in the Financial Times as “kicking [copyright] down the road”.[2]

The House of Lords recommended that the UK government abandon the “opt-out” models for creatives, introduce protections against “in the style of” AI outputs, and force companies to be transparent about where their training data came from. In other words, the UK should move towards a more regulated approach to AI training.

The Lords also recommended the UK should build its own sovereign AI using licensed data.

What does the Lords’ report mean for artists and AI companies?

We heard that one of the most acute harms arising from generative AI outputs is outputs ‘in the style of’ an artist. These can imitate their recognisable style, voice or personality in a way that displaces commissions or devalues the distinctive appeal of their work without reproducing a substantial part of any particular work.

  • House of Lords, AI, copyright and the creative industries[1]

Unfortunately, I don’t think the House of Lords’ recommendations will have any effect at all.

AI training and model use can take place anywhere in the world, so whatever the UK does to enforce certain restrictions will be easily bypassed.

I don’t see how AI can be prevented from creating works “in the style of” an artist. It is currently legal for a human to manually create an artwork in the style of an artist, and I struggle to see an effective way that we can forbid AI models from doing this without granting artists a monopoly over a particular creative style, which ultimately harms other artists. If “painting like David Hockney” becomes a legal infringement, it could lead to a wave of frivolous lawsuits against human artists.

UK-based AI companies have argued that under the government’s proposals or the proposals in the Lords’ report, it will be hard to compete with US and Chinese companies, which operate under more permissive rules, and that the UK would miss out on the AI boost to the economy. The idea of a sovereign AI has also been criticised as unrealistic: it would be hard for the UK to train a publicly-owned “clean” model that could compete with GPT-5 or Gemini.

I do not foresee any large changes coming quickly off the back of the Lords’ report, and even with the best of intentions, any legislative measures can be bypassed by tech companies using more permissive jurisdictions for training their models.

References

  1. House of Lords Communications and Digital Committee, AI, copyright and the creative industries, 2026.
  2. Anna Gross and Daniel Thomas, UK to delay difficult decisions on AI copyright rules, Financial Times, 7 March 2026.
