[Guest Post] UK House of Lords Inquiry into Large Language Models

As the UK Government grapples with how to address the new technological landscape of artificial intelligence, several Committee Inquiries are underway to consider whether and how the Government should intervene, including the House of Lords Communications and Digital Committee Inquiry into Large Language Models, at which this Kat was invited to provide evidence. This report is provided by Aiswarya Deepa Padmakumar, Simon Parayemkuzhiyil Abraham, and Aashish Murali Krishnan, who attended the session and are students of mine pursuing their Masters in Law at Brunel University London. Here's what they have to say about the evidence session:

These large language models infringe copyright at multiple parts of the process: when they collect the information, how they store the information, and how they handle it.

A Large Language Model (LLM) is an artificial intelligence (AI) algorithm that uses deep learning and large data sets to analyse, summarize, generate, and predict content. LLMs are trained in multiple steps on large volumes of data, including copyright-protected works that may be used with or without the permission of the creator or an appropriate license.

The House of Lords Communications and Digital Committee investigates public policy areas related to the media, digital, communications, and creative industries. The Committee has been conducting an Inquiry into LLMs since 12 September 2023 and is exploring what needs to happen over the next 1–3 years to ensure the UK can respond to the opportunities and risks.

Dan Conway, Hayleigh Bosher, Arnav Joshi
and Richard Mollet giving evidence at the Committee

In the latest evidence session for the Inquiry, which took place on Tuesday 7 November 2023, experts engaged in a compelling discussion highlighting the intricate relationship between AI, intellectual property (IP) law, and data protection. The intersection of AI technology and copyright regulation has become a central point of legal discourse, with industry experts and policymakers offering valuable insights. You can watch the full session here.

This session provided a platform for legal scholars, industry representatives, and decision-makers to explore the multifaceted challenges and opportunities arising from integrating LLMs into the realm of AI development. Here are some key points raised during this discussion.

Do LLMs Infringe Copyright?

The committee explored the question of infringement of copyright by LLMs. Dan Conway, CEO of the Publishers Association, stressed the need for a licensing framework, involving permission, transparency, remuneration, and attribution. The discussion also touched upon the role of the Intellectual Property Office in shaping policies around AI and copyright.

A crucial aspect of the conversation revolved around finding a balance between encouraging innovation and respecting the rights of content creators. The panel emphasized that copyright laws inherently strike this balance and should be upheld in the context of AI development.

The principle of copyright is the same, but the context is slightly different.

Mr Conway emphasized the importance of copyright in the context of AI model development. He raised concerns about the alleged infringement of copyright-protected content by LLMs on a massive scale. He said that the existing market conditions have led to the development of AI in ways that may not align with safe, responsible, reliable, and ethical practices.

Dr Hayleigh Bosher, Reader in Intellectual Property Law at Brunel University London and a copyright expert, agreed with Mr Conway that copyright should be applied to AI models. She explained that copyright law aims to be technologically neutral so that it remains relevant even as technologies evolve. She argued that licensing and seeking permission are fundamental when reproducing copyright-protected works, emphasising that a license is required to cover the various stages of the AI process.

Richard Mollet, Head of European Government Affairs at RELX plc, echoed these views, highlighting the need for transparency to uphold copyright and incentivise the creation of high-quality data. He asserted that copyright is crucial for maintaining trust in the outputs of AI systems.

The tech industry and the creative industry are not separate entities. Creative industries are also developing AI. AI tech companies are also creative. Everyone can benefit from the copyright framework.

The discussion also touched on statements from major tech companies, including Google, Amazon, and Meta, claiming compliance with copyright when training their AI models. They say that this training is fair use under US law. However, whether fair use would in fact apply was contested, and it was clarified that, in any event, fair use does not apply in the UK.

Is GDPR enough to protect personal data used by AI?

Arnav Joshi, Senior Associate at Clifford Chance focusing on digital regulation and data protection, provided insights into the GDPR compliance landscape. He suggested that while most companies aim to comply with data protection laws, there is a need for clearer guidance. He emphasized the importance of conducting data protection impact assessments and highlighted that GDPR offers individuals various rights, such as access, erasure, and rectification.

Baroness Healy of Primrose Hill raised concerns about AI models inadvertently leaking private information. Mr Joshi said that GDPR rights can be exercised by individuals, allowing them to request access to, correction of, or erasure of their data. He also mentioned technical solutions, such as input and output filters, to prevent the misuse of personal data.

There is no extensive guidance on how you would apply the test to large language models, which is where some of the risks start to come in.

The discussion continued with concerns about the transparency of AI models and their potential to generate inaccurate or misleading information. Dr Bosher emphasized the importance of allowing content creators to enforce their rights through transparency, enabling them to identify if their works have been used in AI model training.

The debate touched upon the issue of opting in or out regarding data usage in AI models. The consensus was that opting out, particularly in the context of personal data, only affects future models, not the current ones. The Lord Bishop of Leeds, Nicholas Baines, raised concerns about the difficulty of correcting misinformation about individuals, emphasizing the need for effective redress mechanisms. Mr Joshi emphasized the sliding scale of compliance and risks, pointing out that while GDPR is a powerful tool, challenges remain in addressing issues like misinformation generated by AI models.

Aiswarya, Hayleigh, Simon, and Aashish
outside Parliament after the session!

In closing, the Lord Bishop of Leeds asked what the role of government is in determining the balance between innovation and IP rights. The panel agreed that the Government should play a role in ensuring that innovation is balanced with the protection of copyright.

In conclusion, the Committee evidence session underscored the intricate challenges at the intersection of AI, creativity, copyright, and data protection. A proactive and collaborative effort is essential amongst policymakers, legal experts, and industry stakeholders to establish a clear and comprehensive legal framework. The ongoing dialogue highlights the complexity of issues and the crucial need for balancing innovation with the protection of copyright and privacy.

Watch the full session here and keep up to date with the Inquiry and future sessions here.

1 comment:

Santa, Thursday, 23 November 2023 at 11:23:00 GMT
I think whilst most of our legislation was coming from the EU the parliamentary committees deteriorated in quality, and are only now coming back up to speed. What that means is that they don't have in-depth 'in-house' knowledge of the relevant issues, and rely too much on talking to experts (as described in this blog article). Unfortunately, as a result the most appropriate framework is not used to look at each issue. In this case there should be an understanding of how the concept of a 'commons' relates to copyright, and how that in turn relates to the economics of allowing a monopoly on a work. Based on that framework one can then look at how AI's use of copyrighted works should be governed, if at all. It is clear that US and EU legislative processes are presently far superior to ours (the UK) in so many ways, but we won't be able to recover quickly from decades of outsourcing our legislative process to the EU. [However it must also be noted that the US has also struggled with the concept of 'eligibility' and its use to limit the scope of patent protection, with only the US Supreme Court able to really understand the need for this and how it could be applied in practice (the Federal Circuit for example not understanding it at all)]
