Anonymization and GDPR compliance: an overview


Anonymization of personal data is the process of encrypting or removing personally identifiable data from data sets so that the person can no longer be identified directly or indirectly. When a person cannot be re-identified, the data is no longer considered personal data and the GDPR does not apply to its further use.

What is anonymization of data?

Data anonymization is the processing of data with the aim of irreversibly preventing the identification of the individual. In other words, for data to count as anonymized, it must be impossible to connect it to an individual. EU legislation does not prescribe a standard anonymization technique.

True data anonymization:

  1. is irreversible
  2. makes it nearly impossible to identify a natural person

Anonymization definition

Data anonymization is defined in an ISO standard (ISO/IEC 29100:2011) as:

“Process by which personally identifiable information (PII) is irreversibly altered in such a way that a PII principal can no longer be identified directly or indirectly, either by the PII controller alone or in collaboration with any other party”

The main criterion that can be picked out from this definition is that identifiable information must be irreversibly altered so that the person can no longer be identified, directly or indirectly.

The GDPR itself does not define anonymization, but the requirements in Recital 26 GDPR must be met for data to be considered anonymized:

“The principles of data protection should therefore not apply to anonymous information, namely, information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable”

To determine whether a natural person is identifiable, you must consider all the means reasonably likely to be used to identify the person. Whether a means is reasonably likely to be used depends on objective factors such as the cost, the time required, the technology available at the time of the processing, and technological developments.

Anonymization techniques and challenges

Which technique works best depends on the situation. The Article 29 Working Party, the EU's former advisory group on data protection, issued guidance on anonymization techniques in Opinion 05/2014 (WP216).

Here is an introduction to anonymization techniques that can support GDPR compliance:

The first type of technique is Randomization, which is built on altering the data. The purpose is to cut the link between the individual and the data without losing the value the data has for the controller. This type of technique is therefore a good fit when the processing does not need precise information. Noise addition, permutation and differential privacy all fall within the family of randomization techniques.
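
As a rough illustration of two of the randomization techniques named above, the following sketch applies noise addition and permutation to a list of numbers. The function names, the noise scale and the salary figures are all made up for the example, and the sketch does not implement differential privacy, which needs a calibrated noise mechanism.

```python
import random


def add_noise(values, scale, rng):
    """Return a copy of values with zero-mean Gaussian noise added to each entry."""
    return [v + rng.gauss(0, scale) for v in values]


def permute(values, rng):
    """Return a shuffled copy so values no longer line up with their original records."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


# Fixed seed only so the sketch is repeatable; a real process would not use one.
rng = random.Random(42)
salaries = [31000, 45500, 52000, 38750, 61000]  # hypothetical values

noisy = add_noise(salaries, scale=500, rng=rng)
shuffled = permute(salaries, rng)
```

Noise addition keeps each value roughly in place but no longer exact, while permutation keeps the exact values but detaches them from the records they came from. Whether either is sufficient depends on the rest of the data set.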

The second type is Generalization, whose purpose is to reduce the granularity of the data, so that you disclose less about each data subject and it becomes less likely that an individual can be singled out. This technique only works if you store the data of multiple data subjects together. For example, a database storing the ages of data subjects can be altered so that only the age band each data subject falls into is recorded (e.g. 18-25; 26-35; 36-45; etc.). You would then not be able to identify a specific data subject from the age alone, provided the database holds enough persons that nobody can be singled out within a category. The family of generalization techniques includes aggregation, k-anonymity, l-diversity and t-closeness.
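
The age-band example can be sketched as follows. This is a minimal illustration that assumes non-overlapping bands and treats the age band as the only quasi-identifier; the band boundaries, function names and ages are hypothetical. A real k-anonymity check would look at combinations of all quasi-identifiers, not a single column.

```python
from collections import Counter

# Assumed, non-overlapping age bands (inclusive bounds).
BANDS = [(18, 25), (26, 35), (36, 45), (46, 55)]


def age_band(age):
    """Replace an exact age with the label of the band it falls into."""
    for low, high in BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    return "other"


def is_k_anonymous(bands, k):
    """True if every band that occurs is shared by at least k records."""
    counts = Counter(bands)
    return min(counts.values()) >= k


ages = [19, 22, 24, 27, 31, 33, 34, 41]  # hypothetical data subjects
bands = [age_band(a) for a in ages]

# The single person aged 41 sits alone in the 36-45 band,
# so this small data set is not even 2-anonymous.
print(is_k_anonymous(bands, 2))
```

The check fails here because one band contains a single person, which is exactly the situation in which generalization does not prevent singling out.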

The last technique is Masking, which often works as a supplement to other anonymization techniques. It builds on removing obvious personal identifiers from the data, such as names, images and addresses. Masking alone will often not be enough to make the data anonymous, so you usually need to combine it with one of the other techniques.
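
A minimal sketch of masking, assuming records are stored as dictionaries: the field names, the set of direct identifiers and the sample record are hypothetical. The fields that remain may still identify a person indirectly, which is why masking is rarely enough on its own.

```python
# Assumed names of fields that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "address", "photo"}


def mask_record(record, identifiers=DIRECT_IDENTIFIERS):
    """Return a copy of the record without the listed direct identifiers."""
    return {key: value for key, value in record.items() if key not in identifiers}


record = {
    "name": "Jane Doe",             # hypothetical person
    "address": "1 Example Street",
    "age": 34,
    "postcode": "2100",
}

masked = mask_record(record)
# masked still holds age and postcode, which could be combined with
# other data to single someone out.
```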

Legal basis for anonymization

Processing personal data in order to anonymize it is still processing, and it must have a legal basis under Article 6 GDPR. The anonymization process is what is known as “further processing”. As such, the new processing must comply with the principle of purpose limitation.

Most often, the controller’s or processor’s legal basis will be the performance of a contract or legitimate interests, provided the principles governing collection, purpose and retention have been complied with.

Data anonymization used for GDPR compliance

If the anonymization is done correctly, the data will no longer be linked to an identified or identifiable natural person and will therefore not be considered personal data. The GDPR does not apply to anonymous data, which means that you can use such data more freely.

You can use the process of anonymization to improve your organization’s data protection compliance in two main ways:

  • as part of the “privacy by design” strategic work – with the goal to improve the protection of the processed data; or
  • as part of the “data minimisation” strategy – where data can be anonymized and used without the risk of harming the data subjects.

Failed anonymization subject to GDPR sanction

In March 2019 the Danish Data Protection Agency (Danish DPA) recommended that the taxi company Taxa 4×35 be fined DKK 1.2 million for failing to delete or anonymize user data. The Danish DPA found that Taxa 4×35 had kept personal data on nearly 9 million taxi rides from the previous 5 years. Taxa 4×35 had failed to limit the data to what was necessary for the purposes for which it was processed and to delete it once it was no longer needed (the data minimization and storage limitation principles in Article 5 GDPR).

Taxa 4×35 claimed that it anonymized its customers’ data after two years: it deleted the customer’s name from its system after two years and treated the data as anonymized from then on. The remaining data was deleted after 5 years. However, since the customers’ telephone numbers and other data were still available in the system, deleting the name after two years was not enough to make the data subject anonymous. Linkability, that is, the ability to link records concerning the same person, remained possible. The Danish DPA stated that the data was not de-identified, as the company could use the remaining data in the system to identify the customer indirectly. The personal data was therefore not irreversibly altered.
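
The linkability problem can be illustrated with a small sketch. The records, field names and values below are invented and are not taken from the Danish DPA’s decision. They only show how ride records with the name removed can still be grouped per person as long as a customer number is retained.

```python
from collections import defaultdict

# Hypothetical ride records after the name has been deleted.
rides = [
    {"customer_number": "555-0101", "pickup": "Example Street 1", "date": "2017-03-01"},
    {"customer_number": "555-0199", "pickup": "Sample Road 8", "date": "2017-03-02"},
    {"customer_number": "555-0101", "pickup": "Example Street 1", "date": "2017-04-12"},
]


def link_by_key(records, key):
    """Group records that share the same value for the given field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


linked = link_by_key(rides, "customer_number")
# Both rides for 555-0101 end up in the same group, so a travel pattern
# for one person can be rebuilt without ever knowing the name.
```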

Full text of the Danish DPA’s statement on Taxa 4×35 (in Danish).

Conclusion

Processing datasets to render the data anonymous is not a one-off task. You will need to revisit the techniques chosen to keep up with the technological advancements and any changes in your organization’s practices.