Documentation

At a glance

  • It is essential to document each stage of the process behind the design and deployment of an AI decision-support system in order to provide a full explanation for how you made a decision.
  • In the case of explaining AI-assisted decisions, this includes both documenting the processes behind the design and implementation of the AI system and documenting the actual explanation of its outcome.
  • The suggested areas for documentation may not apply to all organisations, but are intended to give you an indication of what might help you provide the evidence to establish how a decision was made.
  • The key objective is to provide good documentation that can be understood by people with varying levels of technical knowledge and that covers the whole process from designing your AI system to the decision you make at the end.

Checklist

☐ We have documented what we are required to do under the GDPR.

☐ We have documented how each stage of our use of AI contributes to building an explanation, from concept to deployment.

☐ Our documentation provides an audit trail about who we give explanations to, and how we provide them.

☐ We have considered how best to organise our documentation so that relevant information can be easily accessed and understood by those providing explanations to decision recipients.

In more detail

What documentation is legally required under the GDPR?

Article 5 of the GDPR says that “The controller shall be responsible for, and able to demonstrate compliance with, paragraph 1 (‘accountability’).”

Article 12 of the GDPR requires you to provide information to the data subject in “concise, transparent, intelligible and easily accessible form, using clear and plain language…”. It also states that you can provide the information “in combination with standardised icons in order to give in an easily visible, intelligible and clearly legible manner a meaningful overview of the intended processing.”

Article 13 of the GDPR requires you to provide:

  • your DPO’s contact details, which aligns with the responsibility explanation;
  • the purpose for which you are processing the data subject’s personal data, as well as the legal basis for that processing, which in many cases should form part of your explanation; and
  • the existence of automated decision-making, including profiling, referred to in Article 22(1) and (4) and, at least in those cases, meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject.

You must document all of this to ensure you remain accountable.

Article 14 of the GDPR applies to cases where you have not obtained personal data from the data subject directly. You should provide data subjects with the following information in addition to that required under Article 13, within a reasonable period after obtaining the personal data, but at the latest within one month, having regard to the specific circumstances in which the personal data are processed:

  • what categories of personal data you are processing; and
  • the source from which you obtained their personal data, and if applicable, whether it came from publicly accessible sources.

See Article 14 of the GDPR for further information on when it is not required to provide this information to the data subject. This includes when you are processing personal data for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes. This is subject to certain conditions and safeguards.

Article 15 of the GDPR gives data subjects an additional right of access to the personal data that you hold on them. This means you should document how you will provide them with a copy of the personal data that you process.

Article 21 of the GDPR gives data subjects the right to object at any time, on grounds relating to their particular situation, to processing of personal data concerning them, including profiling. This means you should document how you ensure data subjects are aware of this right, and how you record if they have exercised this right.

Article 22 of the GDPR gives individuals the right not to be subject to a solely automated decision producing legal or similarly significant effects, unless certain conditions apply. It obliges you to adopt suitable measures to safeguard individuals, including the right to obtain human intervention, to express their view, and to contest the decision. This means you need to document how you will do this.

Article 30 of the GDPR helps you to fulfil the accountability principle. It states that an organisation shall “…maintain a record of processing activities under its responsibility.”

Article 35 of the GDPR requires organisations to carry out a Data Protection Impact Assessment (DPIA) when they are doing something with personal data, particularly when using new technologies, which is likely to have high risks for individuals. A DPIA is always required for any systematic and extensive profiling or other automated evaluation of individuals’ personal aspects which are used for decisions that produce legal or similarly significant effects.

You can meet some of these documentation requirements through your privacy notice, which must contain certain elements:

  • The lawful basis for the processing – one or more of the bases laid out in Article 6(1) of the GDPR.
  • If applicable, the legitimate interests for the processing – these are the interests pursued by you or a third party if you are relying on the lawful basis for processing under Article 6(1)(f) of the GDPR. You could also include a link to the record of your assessment of whether legitimate interests apply to the particular processing purpose.
  • The rights available to individuals regarding the processing – eg access, rectification, erasure, restriction, data portability, and objection. The rights vary depending on the lawful basis for processing. Your documentation can reflect these differences.
  • If applicable, the existence of automated decision-making, including profiling. In certain circumstances you will need to tell people about the logic involved and the envisaged consequences.
  • If applicable, the source of the personal data. This is relevant when you didn’t obtain personal data directly from an individual.

You should be aware that different documentation requirements may apply for law enforcement processing under Part 3 of the DPA 2018 and for intelligence services processing under Part 4 of the Act.

While this guidance focusses on the documentation required to support explanations, the auditing framework covers other aspects of an AI system. In particular, it details the aspects that require documentation for data protection compliance and good information governance.

What documentation can help us to demonstrate the explainability of our AI system?

The list below should help you to provide an explanation to the decision recipient, and to maintain an audit trail of who you give explanations to and how you provide them.

You may not need to provide or document all of this information, and you may have to obtain some of it from your vendor if you procure a system. It is up to you to decide what information is required, and how the documented information can help you provide an explanation to the decision recipient. As with the policies and procedures section, a risk-based approach can help. For example, an AI system that recommends which groceries to buy or which films to watch will require less detail than one used in recruitment.

Decision to use an AI system

  • What the system is intended to be used for, so you can explain to the decision recipient why you are planning to use it.
  • Who the ultimate decision recipient will be.
  • What the AI system you have chosen will do from a technical perspective, in a way that the decision recipient can also understand.
  • How the specifications of the system were determined, and by whom – as well as alternative specifications that were considered, and why they were not chosen.
  • If your AI system has been procured from a third party or outsourced, how you can change or retool its specifications to meet changing performance and explainability needs over time.
  • What trade-offs are involved for the data subject whose data will be used in the model and who will often also be the decision recipient. For example, the data subject’s personal data may be used in training the AI system to produce a highly accurate model, but this use of data may not be in the data subject’s interest.
  • What the demographics and background of the development team are, in order to be aware of the diversity within the team responsible for designing, deploying and maintaining the system, and how that may impact on the decision for the decision recipient.
  • What domain you will be using the AI system in, and how this system has been tested and validated in that domain.
  • What other impact assessments are relevant to your domain, in addition to the DPIA mentioned above.
  • Which people within the organisation have responsibility for providing explanations along the design and implementation pipeline of your AI system.

Explanation types this supports: rationale, responsibility, fairness, safety and performance, impact.

Scoping and selecting explanation types

  • What processes you have set up to optimise the end-to-end accountability of your AI model.
  • What setting or sector your AI model will be used in and the bearing this has on the types of explanation you will offer.
  • Why you have prioritised certain explanation type(s), based on your AI system’s potential impact.
  • Why you have chosen to handle the remaining, non-prioritised explanation types in the way that you have.
  • How you have set up process-based and outcome-based aspects of the explanation types that you will offer.
  • Why you have chosen the depth and comprehensiveness of the explanation you provide, given the potential impacts of the system. This includes the general risks of deploying the system, and the risks for the specific person receiving the AI-assisted decision.
  • Who within your organisation is responsible for selecting the appropriate type(s) of explanation.

Explanation types this supports: rationale, responsibility, data, fairness, safety and performance, impact.

Data collection and procurement

  • Where the data came from, for what purpose the data was originally collected, and for whom – this will help you explain to the decision recipient how relevant the data you have used is to the decision the AI system has made about them.
  • What the components of the dataset are, together with a brief summary of why each element is being included (see the sketch at the end of this section for one way to record this).
  • How the data is representative of the people that will be subject to the AI decisions you will make – for example through consideration by a domain expert.
  • How you have made sure that the data is reliable, accurately measured and obtained from a source of integrity.
  • How you have examined your datasets for any potential inherent bias.
  • Whether the data is recent, up-to-date and appropriately timely given the rate of change in the underlying distribution you are modelling. This will demonstrate that you have accounted for concept drift and data fluctuation from the start. The rate of change will depend on the domain you are operating in, and the specific case you are considering.
  • If you use synthetic data, provide documentation on when and how it was created, and its properties – this helps you explain and justify to the decision recipient why created data has been used in the training of the model and why this is appropriate.
  • What the risks associated with using the data are, and the risks to those whose data is included.
  • How individuals can opt out of being included in the data used either to train or run the AI system.

Explanation types this supports: data, fairness, safety and performance, impact.
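
One way to capture several of the items above in a consistent, retrievable form is a machine-readable record kept alongside each dataset. The sketch below is illustrative only: the DatasetRecord structure, its field names and the example values are assumptions for the purpose of the example, not something this guidance prescribes.

```python
# A minimal sketch of a dataset documentation record, assuming you keep such
# records alongside each dataset. Field names and values are illustrative.
from dataclasses import dataclass, field, asdict
import json


@dataclass
class DatasetRecord:
    name: str
    source: str                    # where the data came from
    original_purpose: str          # why the data was originally collected
    collected_for: str             # for whom it was originally collected
    components: dict               # element -> why it is included
    collection_period: tuple       # (start, end) as ISO date strings
    contains_synthetic_data: bool
    synthetic_data_notes: str = ""
    bias_checks: list = field(default_factory=list)
    opt_out_mechanism: str = ""


record = DatasetRecord(
    name="loan_applications_2023",          # hypothetical dataset
    source="internal CRM export",
    original_purpose="credit risk assessment",
    collected_for="lending operations team",
    components={"income": "affordability signal", "postcode": "regional pricing"},
    collection_period=("2023-01-01", "2023-12-31"),
    contains_synthetic_data=False,
    bias_checks=["age distribution compared with applicant population"],
    opt_out_mechanism="customers can opt out via their privacy settings",
)

# Store the record next to the dataset so it can be retrieved whenever an
# explanation needs to reference data provenance, purpose or recency.
print(json.dumps(asdict(record), indent=2))
```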

Data pre-processing

  • How, especially in cases where social and demographic data is involved, you have ensured that the pre-processing of your data has produced a feature space containing variables that are understandable, relevant and reasonable indicators of your model’s target variable, and that excludes variables whose relationship to that target variable is opaque or difficult to understand.
  • How you have labelled the data and why you have labelled it in that way. This should include tagging and annotating what a piece of data is, and the reasons for that tag.
  • How you have mitigated any bias in your data through pre-processing techniques such as re-weighting, up-weighting, masking, or excluding features and their proxies (a minimal re-weighting sketch follows at the end of this section).
  • If you are using ‘raw’, observed, or unconventional data, documentation of what interpretively significant features such data is supposed to indicate about the individual whose data is being processed, and evidence that this has been included in the metadata.
  • Who within your organisation is responsible for data collection and pre-processing.

Explanation types this supports: responsibility, data, fairness, safety and performance, impact.
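
As a minimal illustration of the re-weighting technique mentioned in the list above, the sketch below assigns each record a sample weight so that the protected group and the label look statistically independent. The column names, the toy data and the particular reweighing scheme are assumptions for the purpose of the example, not a recommended or required method.

```python
# A minimal re-weighting sketch, assuming a pandas DataFrame with a binary
# label column and a protected-attribute column. Column names are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B", "B", "A"],
    "label": [1, 0, 1, 0, 0, 1, 0, 1],
})

n = len(df)
p_group = df["group"].value_counts(normalize=True)
p_label = df["label"].value_counts(normalize=True)
p_joint = df.groupby(["group", "label"]).size() / n

# Weight each (group, label) cell so that group and label appear statistically
# independent: expected frequency divided by observed frequency.
df["sample_weight"] = df.apply(
    lambda row: (p_group[row["group"]] * p_label[row["label"]])
    / p_joint[(row["group"], row["label"])],
    axis=1,
)
print(df)

# Many estimators accept these weights via a sample_weight argument at
# training time; the choice and its rationale should be documented.
```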

Model selection

  • What the specific interpretability or transparency standards, conventions and requirements are in the domain where your AI system will be applied.
  • How the specific type of application and the impact on individuals informs the type of model you choose.
  • How the types of data you are using, for example social or demographic data, or biophysical data, have influenced your model selection regarding its interpretability.
  • Whether your use case enables you to use maximally interpretable algorithmic techniques, and if not, why not.
  • When using ‘black box’ models, what the risks are of using them, and how you will provide supporting evidence that your team has determined that your use case and your organisational capacities and resources support the responsible design and implementation of these systems.
  • When using opaque algorithmic techniques such as ‘black boxes’, how the supplementary tools that you will use to explain the model provide a domain-appropriate level of explainability. Your documentation should demonstrate how the supplementary tool will mitigate the potential risks of using a ‘black box’ system, and how the use of the tool will help you to provide meaningful information about the rationale of any given outcome.
  • If you use ‘challenger’ models alongside more interpretable models, what the purpose of these models is and how you will use them.

Explanation types this supports: rationale, responsibility, data, fairness, safety and performance, impact.

Model building, testing and monitoring

  • What the accuracy rate and other performance metrics you have chosen for the model are, as well as any tuning of cost ratios to constrain error allocation, and how and why you have selected these. You should be able to explain to the decision recipient how this choice may affect the decision that you have made about them.
  • If relevant, what the group-specific error rates are and how the model has been tuned to redress any significant imbalances (see the sketch at the end of this section).
  • How you have monitored and assessed potential for biases in the model design and what measures you have taken to mitigate those you have identified.
  • How you have tested the model, including the test results and which portions of the data you used for training, for testing, and as holdout data.
  • How frequently you will monitor, update and re-examine the model after it is deployed in the real world.
  • How often you will update the training data after model production and deployment. You should also document what you have put in place to establish the appropriate frequency of updates.
  • How you will track each time the model has been updated, and how each version has changed, so that you can explain to the decision recipient how that particular version of the model came to the decision, and why this might differ from the output of a subsequent or prior model.

Explanation types this supports: rationale, data, fairness, safety and performance.
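
To illustrate recording group-specific error rates and which portions of the data were used for training, testing and holdout, here is a minimal sketch. The dataset, the feature and group names, and the model are placeholders; the point is that the split, the per-group figures and the model version are recorded together so the numbers quoted in an explanation can be traced back to a specific model and test set.

```python
# A minimal sketch of recording group-specific error rates on a test set,
# with a separate untouched holdout portion. All data here is synthetic.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "feature_1": rng.normal(size=400),
    "feature_2": rng.normal(size=400),
    "group": rng.choice(["A", "B"], size=400),
    "label": rng.integers(0, 2, size=400),
})
X, y = df[["feature_1", "feature_2"]], df["label"]

# Keep an untouched holdout portion for final checks; split the rest into
# training and test sets for development.
X_dev, X_holdout, y_dev, y_holdout, g_dev, g_holdout = train_test_split(
    X, y, df["group"], test_size=0.2, random_state=42
)
X_train, X_test, y_train, y_test, g_train, g_test = train_test_split(
    X_dev, y_dev, g_dev, test_size=0.25, random_state=42
)

model = LogisticRegression().fit(X_train, y_train)
pred = model.predict(X_test)

# Error rate per group on the test set.
report = {}
for group in sorted(g_test.unique()):
    mask = (g_test == group).to_numpy()
    report[group] = round(float((pred[mask] != y_test[mask].to_numpy()).mean()), 3)

# Store alongside a model version identifier so the figures are traceable.
print({"model_version": "v1.0-example", "group_error_rates": report})
```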

Tools for extracting an explanation

  • When using more inherently interpretable models, what measures you have taken to ensure optimal explainability, for example the sparsity constraints placed on the feature space so that explanations can remain human understandable.
  • When using supplementary interpretability tools for ‘black box’ models, an outline of what local and global techniques you have used to provide explanations. This may be in the form of detailed specifications of the supplementary tools you have used (one illustrative global technique is sketched at the end of this section).
  • How you plan to combine these different explanation tools to produce meaningful information about the rationale of the system’s results.
  • Who is responsible for ensuring that the explanations generated by the supplementary tools are accessible to the people they are intended to inform.
  • How you will translate the statistical output of your model and supplementary tools into a plain-language explanation, for example by establishing and documenting appropriate implementer training and providing users with comprehensive guidelines for responsible implementation.

Explanation types this supports: rationale, responsibility.
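
As one illustrative global technique (not the only option, and not one this guidance mandates), the sketch below applies permutation importance to a fitted model and then phrases the ranking in plain language for the person it is intended to inform. The model, the synthetic data and the feature names are assumptions for the purpose of the example.

```python
# A minimal sketch of a global supplementary technique (permutation importance)
# and a plain-language summary of its output. Feature names are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=300) > 0).astype(int)
feature_names = ["income", "tenure_years", "num_dependants"]

model = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# Rank features by how much shuffling them degrades performance, then phrase
# the ranking in plain language for the implementer.
ranking = sorted(
    zip(feature_names, result.importances_mean), key=lambda t: t[1], reverse=True
)
for name, score in ranking:
    print(f"Across the model as a whole, '{name}' contributed {score:.3f} to "
          f"predictive performance, measured by permutation importance.")
```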

Explanation delivery

  • Why you will prioritise certain explanation types when you deliver the explanation to the affected individual, given the contextual factors you determine to be relevant in the particular case you are considering.
  • How and why you have ordered the remaining explanation types.
  • What training you have provided to implementers to enable them to use the model’s results responsibly and fairly.
  • How the implementer will be presented with the model’s result, including:
    • how you present performance metrics and error rates for the model as a whole and for sub-groups if appropriate;
    • how you present uncertainty measures like error bars and confidence intervals (see the sketch at the end of this section);
    • how you use visualisation tools and present indicators of relative variable importance or variable interactions; and
    • in the case of ‘black box’ models, how you present information from supplementary tools as well as indicators of the limitations and uncertainty levels of these tools.
  • What reasonable adjustments you will make for the form in which you deliver the explanation, as required under the Equality Act 2010.
  • What information you will proactively share with your customers and stakeholders, so that they are able to make informed choices in advance of engaging with the decision-making process.
  • Who decision recipients can contact to query a decision.

Explanation types this supports: rationale, responsibility, data, fairness, safety and performance, impact.
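
As a small illustration of presenting an uncertainty measure alongside a result, the sketch below computes a bootstrap 95% confidence interval around a model’s test accuracy and phrases it for an implementer. The labels and predictions are synthetic stand-ins for a real test set, and the bootstrap is only one of several ways to convey uncertainty.

```python
# A minimal sketch: a bootstrap 95% confidence interval around test accuracy,
# presented as a sentence rather than a bare number. Data is synthetic.
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)
y_pred = np.where(rng.random(200) < 0.8, y_true, 1 - y_true)  # roughly 80% accurate

accuracies = []
for _ in range(1000):
    idx = rng.integers(0, len(y_true), size=len(y_true))  # resample with replacement
    accuracies.append((y_true[idx] == y_pred[idx]).mean())

lower, upper = np.percentile(accuracies, [2.5, 97.5])
point = (y_true == y_pred).mean()

# Present the figure with its interval so the implementer can judge how much
# weight to give the model's output in the decision at hand.
print(f"Estimated accuracy: {point:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```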

How should we organise this documentation?

In each part of this guidance, we have emphasised that preparing your organisation to explain AI-assisted decisions is a holistic and end-to-end activity. It involves both demonstrating that you have undertaken the processes behind the design, development, and deployment of your AI system responsibly and clarifying the outcomes of that system’s decision-support in a clear, understandable, and context-sensitive way. We have called these aspects of explaining AI process-based and outcome-based explanations.

Whether you are a developer building and supplying AI applications, or an organisation developing your own AI systems in-house, one of the challenges you may face is figuring out how best to organise the documentation of your innovation practices to help with your process-based explanations.

This may, at first, seem like a daunting task, because it involves:

  • documenting diverging governance activities across the AI design and deployment lifecycle;
  • consolidating this information to easily convey it to a diverse range of stakeholders with varying needs and levels of technical and domain expertise; and
  • differentially organising how to provide the information, so that different stakeholders receive the appropriate kinds and quantities of information (for example, ensuring that decision recipients are not overwhelmed by technical details and vast amounts of text or provided with commercially sensitive information).

One method for organising the documentation for process-based explanations is building argument-based assurance cases for the high-level properties of your AI model (like safety or fairness). You can find details about how to do this in Annexe 5.

However you choose to organise your documentation, you should do it in a way that:

  • allows you to easily access the relevant information required for each explanation type;
  • is supported by your current document management system; and
  • is accessible to those within your organisation that provide explanations to decision recipients.

If you plan to procure a system, you should ensure that the process you choose allows you to communicate with your vendor in a way that mutually manages expectations. If your vendor is able to offer evidence of their compliance and ability to explain decisions (through justification, evidence, and documentation), you will be able to better provide this information to your decision recipients. You will also be able to assess whether the model offered by the vendor meets the acceptable criteria and standards you have set for an AI system.

Further reading

For further guidance on procuring systems, you may wish to read the World Economic Forum’s AI Government Procurement Guidelines. Although that guidance is primarily aimed at public sector organisations, many of its principles are not sector-specific, so you should be able to apply them to your organisation.