How do we ensure lawfulness in AI?
Latest update - 28 October 2024
We have updated the reference to inferred special category data on this page to reflect the latest changes to the detailed special category data guidance on inferences. This is because the guidance no longer focuses on the certainty of an inference as a relevant factor to decide whether it counts as special category data. Our underlying policy position has not changed, but we’re explaining it in a different way to make our position clearer.
At a glance
This section sets out what to consider when you decide on your lawful basis for processing personal data in the context of AI development and deployment. It also includes up-to-date content on the status of AI-driven inferences.
Who is this section for?
This section is aimed at compliance-focused roles, including senior management, who are responsible for ensuring that processing using AI is lawful, fair and transparent. It will also be of interest to AI developers, to help them understand what kind of data they are likely to have access to, depending on the purpose they are using it for.
In detail
- What should we consider when deciding lawful bases?
- How should we distinguish purposes between AI development and deployment?
- Can we rely on consent?
- Can we rely on performance of a contract?
- Can we rely on legal obligation, public task or vital interests?
- Can we rely on legitimate interests?
- What about special category data and data about criminal offences?
- Using AI systems to make inferences
- What about inferences and affinity groups?
What should we consider when deciding lawful bases?
The development and deployment of AI systems involve processing personal data in different ways for different purposes. You must break down and separate each distinct processing operation, and identify the purpose and an appropriate lawful basis for each one, in order to comply with the principle of lawfulness.
Whenever you are processing personal data – whether to train a new AI system, or make predictions using an existing one – you must have an appropriate lawful basis to do so.
Different lawful bases may apply depending on your particular circumstances. However, some lawful bases may be more likely to be appropriate for the training and/or deployment of AI than others. In some cases, more than one lawful basis may be appropriate.
At the same time, you must remember that:
- it is your responsibility to decide which lawful basis applies to your processing;
- you must always choose the lawful basis that most closely reflects the true nature of your relationship with the individual and the purpose of the processing;
- you should make this determination before you start your processing;
- you should document your decision;
- you cannot swap lawful bases at a later date without good reason;
- you must include your lawful basis in your privacy notice (along with the purposes); and
- if you are processing special categories of data you need both a lawful basis and an additional condition for processing.
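As a purely illustrative sketch (the record structure, field names and example entries are hypothetical, not an ICO template), the snippet below shows one way you could document each distinct processing operation alongside its purpose and lawful basis before processing begins, keeping development and deployment as separate entries.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical record structure for documenting each distinct processing
# operation, its purpose and its lawful basis before processing begins.
@dataclass
class ProcessingRecord:
    operation: str                     # eg "model training", "making predictions"
    purpose: str                       # the specific purpose for this operation
    lawful_basis: str                  # eg "consent", "legitimate interests"
    special_category_condition: Optional[str] = None  # Article 9 condition, if needed
    documented_in_privacy_notice: bool = False

# Example register: development and deployment are recorded separately
# because they are distinct purposes with different circumstances and risks.
register = [
    ProcessingRecord(
        operation="training a credit-risk model",
        purpose="research and development of an AI system",
        lawful_basis="legitimate interests",
        documented_in_privacy_notice=True,
    ),
    ProcessingRecord(
        operation="scoring loan applicants with the deployed model",
        purpose="making credit decisions about applicants",
        lawful_basis="contract",
        documented_in_privacy_notice=True,
    ),
]

# A simple completeness check: every operation needs a documented lawful basis.
for record in register:
    assert record.lawful_basis, f"No lawful basis recorded for {record.operation}"
```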
Further reading outside this guidance
Read our guidance on lawful basis for processing.
How should we distinguish purposes between AI development and deployment?
In many cases, when determining your purpose(s) and lawful basis, it will make sense for you to separate the research and development phase (including conceptualisation, design, training and model selection) of AI systems from the deployment phase. This is because these are distinct and separate purposes, with different circumstances and risks.
Therefore, it may sometimes be more appropriate to choose different lawful bases for your AI development and deployment. For example, you need to do this when:
- the AI system was developed for a general-purpose task, and you then deploy it in different contexts for different purposes. For example, a facial recognition system could be trained to recognise faces, but that functionality could be used for multiple purposes, such as preventing crime, authentication, and tagging friends in a social network. Each of these further applications might require a different lawful basis;
- you implement an AI system from a third party – any processing of personal data undertaken by the developer will have been for a different purpose (eg to develop the system) from the one you intend to use the system for, so you may need to identify a different lawful basis; and
- processing of personal data for the purposes of training a model may not directly affect the individuals, but once the model is deployed, it may automatically make decisions which have legal or significant effects. This means the provisions on automated decision-making apply; as a result, a different range of available lawful bases may apply at the development and deployment stages.
The following sections outline some AI-related considerations for each of the UK GDPR’s lawful bases. They do not consider Part 3 of the DPA (law enforcement processing) at this stage.
Can we rely on consent?
Consent may be an appropriate lawful basis in cases where you have a direct relationship with the individuals whose data you want to process.
However, you must ensure that consent is freely given, specific, informed and unambiguous, and involves a clear affirmative act on the part of the individuals.
The advantage of consent is that it can lead to more trust and buy-in from individuals when they are using your service. Providing individuals with control can also be a factor in your DPIAs.
However, for consent to apply, individuals must have a genuine choice about whether you can use their data. This may have implications depending on what you intend to do with the data – it can be difficult to ensure you collect valid consent for more complicated processing operations, such as those involved in AI. For example, the more things you want to do with the data, the more difficult it is to ensure that consent is genuinely specific and informed.
The key is that individuals understand how you are using their personal data and have consented to this use. For example, if you want to collect a wide range of features to explore different models to predict a variety of outcomes, consent may be an appropriate lawful basis, provided that you inform individuals about these activities and obtain valid consent.
Consent may also be an appropriate lawful basis for the use of an individual’s data during deployment of an AI system (eg for purposes such as personalising the service or making a prediction or recommendation).
However, you should be aware that for consent to be valid, individuals must also be able to withdraw consent as easily as they gave it. If you are relying on consent as the basis of processing data with an AI system during deployment (eg to drive personalised content), you should be ready to accommodate the withdrawal of consent for this processing.
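The sketch below is a minimal, hypothetical illustration (the function and store names are assumptions, not taken from this guidance) of how a deployed system might check that valid consent is in place before running AI-driven personalisation, and fall back to non-personalised content when consent is withdrawn.

```python
# Hypothetical consent record; in practice this would be a persistent,
# auditable store rather than an in-memory dictionary.
consent_store = {}  # user_id -> bool

def record_consent(user_id: str, given: bool) -> None:
    """Record the user's consent decision; withdrawal is a single call."""
    consent_store[user_id] = given

def personalise_content(user_id: str, default_content: list, model_recommend) -> list:
    """Only run the recommendation model if the user currently consents."""
    if consent_store.get(user_id, False):
        return model_recommend(user_id)
    # No consent, or consent withdrawn: serve non-personalised content instead.
    return default_content
```

The point of the sketch is simply that withdrawal takes effect immediately and is no harder than giving consent in the first place.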
Further reading outside this guidance
Read our guidance on consent.
Further reading – European Data Protection Board
The European Data Protection Board (EDPB), which has replaced the Article 29 Working Party (WP29), includes representatives from the data protection authorities of each EU member state. It adopts guidelines for complying with the requirements of the EU version of the GDPR.
The EDPB has produced guidelines on consent.
EDPB guidelines are no longer directly relevant to the UK regime and are not binding under the UK regime. However, they may still provide helpful guidance on certain issues.
Can we rely on performance of a contract?
This lawful basis applies where the processing using AI is objectively necessary to deliver a contractual service to the relevant individual, or to take steps prior to entering into a contract at the individual’s request (eg to provide an AI-derived quote for a service).
If there is a less intrusive way of processing their data to provide the same service, or if the processing is not in practice objectively necessary for the performance of the contract, then you cannot rely on this lawful basis for the processing of data with AI.
Furthermore, even if it is an appropriate ground for the use of the system, this may not be an appropriate ground for processing personal data to develop an AI system. If an AI system can perform well enough without being trained on the individual’s personal data, performance of the contract does not depend on such processing. Since machine learning models are typically built using very large datasets, whether or not a single individual’s data is included in the training data should have a negligible effect on the system’s performance.
Similarly, even if you can use performance of a contract as a lawful basis to provide a quote prior to a contract, this does not mean you can also use it to justify using that data to develop the AI system.
You should also note that you are unlikely to be able to rely on this basis for processing personal data for purposes such as ‘service improvement’ of your AI system. This is because, in most cases, collecting personal data about the use of a service or about how users engage with that service, or processing it to develop new functions within that service, is not objectively necessary for the performance of a contract: the service can be delivered without such processing.
Conversely, use of AI to process personal data for purposes of personalising content may be regarded as necessary for the performance of a contract – but only in some cases. Whether this processing can be regarded as ‘intrinsic’ to your service depends on:
- the nature of the service;
- the expectations of individuals; and
- whether you can provide your service without this processing (ie if the personalisation of content by means of an AI system is not integral to the service, you should consider an alternative lawful basis).
Further reading outside this guidance
Read our guidance on contracts.
Further reading – European Data Protection Board
EDPB guidelines on processing under Article 6(1)(b) in the context of online services.
Can we rely on legal obligation, public task or vital interests?
There are some cases in which the use of an AI system to process personal data may be a legal obligation. You may also be required to audit your AI systems to ensure they comply with various legislation (including but not limited to data protection), and this may involve processing personal data – for example, to test how the system performs on different kinds of people. Such processing could rely on legal obligation as a basis, but this would only cover the auditing and testing of the system, not any other use of that data. You must be able to identify the obligation in question, either by reference to the specific legal provision or by pointing to an appropriate source of advice or guidance that sets it out clearly.
Similarly, if you use AI as part of the exercise of your official authority, or to perform a task in the public interest set out by law, the necessary processing of personal data involved may be based on those grounds. This is likely to be relevant to public authorities using AI to deliver public services.
In a limited number of cases, the processing of personal data by an AI system might be based on protecting the vital interests of individuals – for example, emergency medical diagnosis of patients who are otherwise incapable of providing consent (eg an AI diagnostic system processing the fMRI scan of an unconscious patient).
It is however very unlikely that vital interests could also provide a basis for developing an AI system, because this would rarely directly and immediately result in protecting the vital interests of those individuals, even if the models that are eventually built might later be used to save the lives of other individuals. For the development of potentially life-saving AI systems, it would be better to rely on other lawful bases.
Further reading outside this guidance
Read our guidance on legal obligation, public task and vital interests.
Can we rely on legitimate interests?
Depending on your circumstances, you could base your processing of personal data for both development and ongoing use of AI on the legitimate interests lawful basis.
It is important to note that while legitimate interests is the most flexible lawful basis for processing, it is not always the most appropriate – for example, if the way you intend to use people’s data would be unexpected or could cause unnecessary harm. Relying on it also means you are taking on additional responsibility for considering and protecting people’s rights and interests. You must also be able to demonstrate the necessity and proportionality of the processing.
Additionally, if you are a public authority you can only rely on legitimate interests if you are processing for a legitimate reason other than performing your tasks as a public authority.
There are three elements to the legitimate interests lawful basis, and it can help to think of these as the ‘three-part test’. You need to:
- identify a legitimate interest (the ‘purpose test’);
- show that the processing is necessary to achieve it (the ‘necessity test’); and
- balance it against the individual’s interests, rights and freedoms (the ‘balancing test’).
There can be a wide range of interests that constitute ‘legitimate interests’ in data protection law. These can be your own or those of third parties, as well as commercial or societal interests. However, the key is understanding that while legitimate interests may be more flexible, it comes with additional responsibilities. It requires you to assess the impact of your processing on individuals and be able to demonstrate that there is a compelling benefit to the processing.
You should address and document these considerations as part of your legitimate interests assessment (LIA). As described above, in the initial research and development phase of your AI system, your purposes may be quite broad, but as more specific purposes are identified, you may need to review your LIA accordingly (or identify a different lawful basis).
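As a purely illustrative aid (the structure and field names are hypothetical, not prescribed by the UK GDPR or this guidance), the sketch below captures the three parts of the test in a single record that can be reviewed and revised as your purposes are refined.

```python
from dataclasses import dataclass

# Hypothetical structure mirroring the 'three-part test' for legitimate interests.
@dataclass
class LegitimateInterestsAssessment:
    purpose_test: str     # the legitimate interest you have identified
    necessity_test: str   # why the processing is necessary to achieve it
    balancing_test: str   # how you weighed individuals' interests, rights and freedoms
    outcome: str          # eg "proceed", "proceed with safeguards", "do not proceed"
    review_date: str      # the LIA should be revisited as purposes are refined

# Hypothetical example entry for training a fraud-detection model.
lia = LegitimateInterestsAssessment(
    purpose_test="Train a model to detect fraudulent transactions",
    necessity_test="Historical transaction data is needed; the outcome cannot be achieved with less intrusive means",
    balancing_test="Data minimised and pseudonymised; low expectation of harm; individuals informed",
    outcome="proceed with safeguards",
    review_date="2025-04-01",
)
```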
Example
An organisation seeks to rely on legitimate interests for processing personal data for the purposes of training a machine learning model.
Legitimate interests may allow the organisation the most room to experiment with different variables for its model.
However, as part of its legitimate interests assessment, the organisation has to demonstrate that the range of variables and models it intends to use is a reasonable approach to achieving its outcome.
It can best achieve this by properly defining all of its purposes and justifying the use of each type of data collected – this will allow the organisation to work through the necessity and balancing aspects of its LIA. Over time, as purposes are refined, the LIA is revisited.
For example, the mere possibility that some data might be useful for a prediction is not by itself sufficient for the organisation to demonstrate that processing this data is necessary for building the model.
Further reading outside this guidance
Read our guidance on legitimate interests.
We have also published a lawful basis assessment tool which you can use to help you decide what basis is appropriate for you, as well as a legitimate interests template (Word).
What about special category data and data about criminal offences?
If you intend to use AI to process special category data or data about criminal offences, then you will need to ensure you comply with the requirements of Articles 9 and 10 of the UK GDPR, as well as the DPA 2018.
Special category data is personal data that needs more protection because it is sensitive. In order to process it you need a lawful basis under Article 6, as well as a separate condition under Article 9, although these do not have to be linked. For more detail, see our detailed guidance on special category data and ‘How should we address risks of bias and discrimination’.
Further reading outside this guidance
Read our guidance on special category data and on criminal offence data.
Using AI systems to make inferences
You may intend to use AI systems to:
- guess or predict details about someone, using information from various sources; or
- analyse and find correlations between datasets, and use these to categorise, profile or make predictions.
In other words, you may use AI systems to make inferences about individuals or groups. Whether an inference is personal data depends on whether it relates to an identified or identifiable individual.
It may also be possible for you to infer details about someone which fall within the special categories of data. Whether or not this counts as special category data and triggers Article 9 depends on whether you are intentionally drawing that inference, or intentionally treating someone differently based on an inference.
That inference is likely to be special category data if your use of AI means you:
- can (or intend to) infer relevant information about an individual; or
- intend to treat someone differently on the basis of the inference (even if the inference is not drawn with a reasonable degree of certainty).
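Purely as an illustration (the category names and function below are hypothetical, not taken from this guidance), the sketch shows how a team might flag at design time which outputs a model is intended to infer that would fall within the special categories, so that an Article 9 condition is identified before processing starts.

```python
# Hypothetical, abbreviated list of the Article 9 UK GDPR special categories.
SPECIAL_CATEGORIES = {
    "health", "racial_or_ethnic_origin", "political_opinions",
    "religious_or_philosophical_beliefs", "trade_union_membership",
    "genetic_data", "biometric_data_for_identification",
    "sex_life_or_sexual_orientation",
}

def flag_special_category_inferences(intended_outputs: dict) -> list:
    """Return the intended model outputs that would be special category data.

    intended_outputs maps an output name to the category of information it
    is intended to infer, eg {"likely_health_condition": "health"}.
    """
    return [
        name for name, category in intended_outputs.items()
        if category in SPECIAL_CATEGORIES
    ]

# Example: the team intends to infer a health condition, so an Article 9
# condition (as well as an Article 6 lawful basis) must be in place first.
flags = flag_special_category_inferences({
    "likely_health_condition": "health",
    "predicted_spending_band": "financial",
})
assert flags == ["likely_health_condition"]
```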
Further reading in ICO guidance
See our guidance on “What is personal data”.
What about inferences and affinity groups?
Whether inferences about groups are personal data depends on the circumstances – for example, how easy it is to identify an individual through group membership.
AI systems may aim to make predictions based on patterns within a population. In this sense they may appear to solely concern groups. However, if your AI system involves making inferences about a group – creating affinity groups – and linking these to a specific individual, then data protection law applies at multiple stages of the processing. More specifically:
- the development stage, involving processing of individuals’ personal data to train the model; and
- the deployment stage, where you apply the results of the model to other individuals that were not part of the training dataset on the basis of its predictive features.
This means that even if an individual’s personal data is not part of your training dataset, data protection law applies when you use that model on them. This is because it involves you processing their personal data to make a decision or prediction about them, using your model to do so.
It is also important that you don’t just consider obvious and immediate tangible damage to people, but also more subtle intangible harms and how the system might affect people’s rights and freedoms more generally. This includes any impact on society as a whole. For example, DPIAs require you to consider risks to the rights and freedoms of all those that the system might affect.
Additionally, data protection by design requires you to take appropriate steps to:
- implement the data protection principles effectively; and
- integrate necessary safeguards into your processing at the design stage, and throughout the lifecycle.
In the context of AI, your data protection considerations therefore must include:
- the individuals whose personal data you process to train your system; and
- the impact your system has on the rights and freedoms of individuals and society once it is deployed.
If you use an affinity group to profile individuals, you need to comply with the data protection principles, including fairness.