Thursday, May 1, 2014

Legal Impact of Anonymisation Techniques and Geospatial Data

The Article 29 Data Protection Working Party recently published Opinion 05/2014 on Anonymisation Techniques. The purpose of the opinion was to "analyze the effectiveness and limits of existing anonymisation techniques against the EU legal background of data protection and provide recommendations . . . "

The report discusses a number of anonymisation techniques, including randomization - through noise addition, permutation and differential privacy - and generalization - through aggregation, k-anonymity, l-diversity and t-closeness. The opinion examines the "robustness" of each technique against three criteria: (i) whether it is still possible to single out an individual; (ii) whether it is still possible to link records relating to an individual; and (iii) whether information concerning an individual can be inferred. The group also explored pseudonymisation, primarily to "clarify some pitfalls and misconceptions: pseudonymisation is not a method of anonymisation." The report concludes that "anonymisation techniques can provide privacy guarantees and may be used to generate efficient anonymisation processes, but only if their application is engineered appropriately." This requires that the context and objectives of the anonymisation process be clearly identified, with the appropriate technique determined on a "case-by-case basis". Moreover, anonymisation should not be a one-off exercise: privacy risks should be regularly reassessed.
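To make the "singling out" criterion concrete, here is a minimal sketch of a k-anonymity check. It is my own illustration and is not taken from the opinion: the records, the field names (`zip`, `birth_year`) and the helper functions are all hypothetical. A dataset is k-anonymous if every combination of quasi-identifier values is shared by at least k records, so k = 1 means at least one person can be singled out.

```python
from collections import Counter

# Hypothetical quasi-identifiers chosen for this illustration only.
QUASI_IDENTIFIERS = ("zip", "birth_year")

def k_anonymity(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return the size of the smallest group of records that share
    the same quasi-identifier values (0 for an empty dataset)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0

def generalise(record):
    """Coarsen the quasi-identifiers: mask the last zip digit and
    round the birth year down to its decade."""
    out = dict(record)
    out["zip"] = record["zip"][:4] + "*"
    out["birth_year"] = record["birth_year"] // 10 * 10
    return out

# Made-up records, not real data.
records = [
    {"zip": "20147", "birth_year": 1971},
    {"zip": "20147", "birth_year": 1971},
    {"zip": "20148", "birth_year": 1978},
    {"zip": "20149", "birth_year": 1975},
]

k_raw = k_anonymity(records)
k_generalised = k_anonymity([generalise(r) for r in records])
print(k_raw, k_generalised)
```

In this toy example the raw records include people who are unique on zip code and birth year, while the generalised records place everyone in a single group. As the opinion stresses, a check like this addresses only one of the three criteria; it says nothing by itself about linkability or inference.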

The report includes a number of references to geospatial information. For example, it states that "if an organisation collects data on individual travel movements, the individual travel patterns at event level would still qualify as personal data for any party, as long as the data controller (or any other party) still has access to the original raw data, even if direct identifiers have been removed from the set provided to third parties. But if the data controller would delete the raw data, and only provide aggregate statistics to third parties on a high level, such as 'on Mondays on trajectory X there are 160% more passengers than on Tuesdays', that would qualify as anonymous data." (p. 9). The report also refers to a 1997 study in which an academic researcher was able to link the identity of specific data subjects to the attributes in an anonymised dataset using only a zip code and two other attributes. (pp. 33-34)

With respect to pseudonymised datasets, the report cites a 2013 study by MIT researchers which found that, using 15 months of spatial-temporal mobility coordinates of 1.5 million people on a territory within a radius of 100 km, "95% of the population could be singled-out with four location points, and that just two points were enough to single-out more than 50% of the data subjects (if one of the points is known)" - even if the individuals' identities were pseudonymised. (p. 23)
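The idea behind that finding can be sketched in a few lines. This is a simplified illustration of my own, not the method used in the MIT study: the traces are invented, each location point is a hypothetical (cell, hour) pair, and the "known" points are simply the first p points of each trace in sorted order rather than a random sample.

```python
# Invented pseudonymised traces: pseudonym -> set of (cell, hour) points.
traces = {
    "u1": {("A", 8), ("B", 9), ("C", 18)},
    "u2": {("A", 8), ("B", 9), ("D", 18)},
    "u3": {("A", 8), ("E", 9), ("C", 18)},
    "u4": {("F", 8), ("G", 9), ("C", 18)},
}

def matching_users(traces, known_points):
    """Pseudonyms whose trace contains every known point."""
    return [user for user, points in traces.items() if known_points <= points]

def fraction_singled_out(traces, p):
    """Share of users matched by exactly one trace when an observer
    knows p of their points (here: the first p in sorted order)."""
    singled_out = 0
    for user, points in traces.items():
        known = set(sorted(points)[:p])
        if matching_users(traces, known) == [user]:
            singled_out += 1
    return singled_out / len(traces)

for p in (1, 2, 3):
    print(p, fraction_singled_out(traces, p))
```

Even in this tiny example, the share of people who can be singled out climbs quickly as more points are known, although no name or other direct identifier appears anywhere in the data. That is the sense in which the Working Party says pseudonymisation is not anonymisation.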

The Article 29 Data Protection Working Party consists of representatives from Data Protection authorities (and a few others) across Europe. As such, although the opinion does not constitute the law in Europe, it provides useful guidance for those collecting, processing, using, storing or distributing data in the region. Since Europe is considered one of the leaders in data protection and privacy, many other nations will consider the European position when drafting their own laws and policies. In addition, I point out that the author of the 1997 study cited above is currently the Chief Technology Officer for the U.S. Federal Trade Commission (FTC), and the FTC is becoming the de facto federal authority for privacy in the U.S. As a result, the opinion is a useful marker for organizations that are attempting to anonymise datasets that contain geospatial attributes.
