The European regulation requires transparency about the content used in training in order to protect third-party intellectual property rights. The aim is to facilitate this protection by increasing transparency while at the same time enabling providers of AI systems to protect their own intellectual property rights and trade secrets.

Although intellectual property is not the main objective of the Artificial Intelligence Regulation (AIR), the impact of these technologies on content-intensive sectors has led to the inclusion of certain requirements in this area. In short, the various agents involved in the AI value chain are required to be transparent about their use of third-party content, particularly in the field of model training. The AIR makes it quite clear that these transparency obligations should not affect the right of developers to protect their own intellectual property and trade secrets. Reconciling the two is unlikely to be an easy task.

Compulsory intellectual property policies for the most powerful AI systems

Among the special obligations that the AIR imposes on providers of general-purpose AI (GPAI) models, the most powerful and versatile, are those aimed at ensuring respect for third-party intellectual property rights. In particular, Article 53(1) of the AIR contains two significant requirements in this area.

Firstly, in paragraph (c), the AIR requires providers of GPAI models to “put in place a policy to comply with Union law on copyright and related rights, and in particular to identify and comply with, including through state-of-the-art technologies, a reservation of rights expressed pursuant to Article 4(3) of Directive (EU) 2019/790;” That is, the AIR takes into account the general application of the so-called text and data mining exception – which we have already addressed here – but it also seeks to ensure that rightsholders of protected works who do not want their content used to train AI systems can exercise their right to be excluded or to opt out. This is a “by design” requirement which does not specify the appropriate tools for this purpose or how they should be used, leaving it to individual providers to decide on the most suitable method.

Secondly, paragraph (d) of the same article requires these providers to “draw up and make publicly available a sufficiently detailed summary about the content used for training of the general-purpose AI model, according to a template provided by the AI Office.” To date, that template is not yet available, so we will have to wait and see how the AI Office reconciles the transparency obligations imposed on GPAI model providers with their right to protect their intellectual property and trade secrets. In any case, certification of third-party training data appears to be the most natural way of demonstrating compliance and, to this end, qualified time stamping services (QTS) are particularly interesting, as they provide reliable proof of the content used and thereby dispel doubts about its quality. We have already discussed the usefulness of QTS in relation to software, and what we wrote here applies equally to training data.

According to the AIR, providers can demonstrate compliance with these obligations by adhering to the codes of practice approved under the processes set out in the regulation. This is a presumption of compliance which, as such, admits proof to the contrary. It is nonetheless a vote of confidence in the industry, which is left to decide how to address this task, subject always to the final say of the supervisory bodies.

The information that providers submit to the supervisory authorities is in any event subject to the confidentiality requirements contained in the AIR, which stipulates that both the Commission and the supervisory bodies must respect the confidentiality of the information they obtain in the exercise of their duties. In fact, specific mention is made of both the providers’ intellectual property rights and their trade secrets. Nonetheless, it is puzzling that the confidentiality obligations established in Article 78 of the AIR focus on source code, which in AI matters is not always the most valuable or most vulnerable element requiring protection.

Transparency requirements for providers of components and processes designed for integration into high-risk AI systems

Secondly, it is important to consider the obligations that the AIR imposes on providers of components and/or processes needed to develop high-risk AI systems. For example, it is common for an AI system provider to rely on third parties to supply components and/or processes designed for training, testing or evaluating models, for integration into other computer programs, or for other aspects of model development.

In order to ensure high standards of compliance throughout the value chain of high-risk systems, the AIR requires these suppliers to provide all the information, capabilities, technical access and assistance necessary – taking into account the state of the art – for the provider to fulfill its obligations under the AIR. This is logical, as the provider bears primary responsibility for complying with the AIR. Without a minimum of transparency regarding the elements of these systems, compliance would be impossible.

However, at the same time, these transparency requirements could well affect both the intellectual property rights and the trade secrets of the suppliers. Aware of this, the regulation stipulates that this cooperation should enable the provider “to fully comply with the obligations set out in this Regulation, without compromising their own intellectual property rights or trade secrets.” It is clearly a difficult balancing act. Keeping the models, or even the training data – which will undoubtedly compete on quality – secret may be the only way to protect the very assets over which the AIR requires transparency, and the obligation may therefore end up acting as a disincentive to supplying these kinds of components and services.

One exception worth noting is that providers of tools, services, processes or components made available under a free and open-source license (except in the case of general-purpose AI models) are not subject to this transparency obligation. This is unquestionably an advantage that providers should take into account when designing their systems.

Conclusion

As we mentioned at the start of this post, although the AIR does not set out to protect intellectual property, it introduces some essential mechanisms that will make this protection possible, mainly through the creation of transparency obligations. It does not create new exceptions, nor does it reinterpret existing ones; instead, it requires transparency about the content used for training, the manner in which it is obtained, and the measures implemented to respect the right of rightsholders who do not want their content used for training to be excluded or to opt out. It is important in this process to maintain a balance so that AI system providers are able to protect their own intellectual property rights, including their increasingly valuable trade secrets. An analysis of the text of the AIR reveals that the lawmakers were aware of this problem and sought to strike a balance between conflicting interests. Nevertheless, we are inclined to think that, in practice, finding this balance is going to be complicated. We will have to wait and see.


Cristina Mesa Sánchez

Intellectual Property Department