Launching the report - Privacy Preserving Data Sharing Frameworks - on Wednesday, the ACS outlined a basis for balancing the need for governments and businesses to share information against the need to maintain citizens' privacy.
Delivered by a team led by NSW Chief Scientist and ACS Vice President, Dr Ian Oppermann, the paper stated “deidentifying data is a complex issue but one that needs to be addressed by industry and government”.
“Answering the question of ‘will linking deidentified datasets actually lead to being able to identify someone?’ turns out to be a very subtle and complex challenge,” Dr Oppermann said.
The paper describes a framework for privacy preserving data sharing, addressing technical challenges as well as data sharing issues more broadly. It builds on last year’s ACS Report, Privacy in Data Sharing: A Guide for Business and Government, expanding the concept of a Personal Information Factor and introducing a Utility Factor with worked examples.
ACS President, Yohan Ramasundara, said: "With the invention of digitised data, information is plentiful and creatively leveraged by public and private interests. While data is very important for governments and businesses, preserving individual privacy is critical.
"This paper is an important milestone in developing a framework that gives our society the benefits of shared data while protecting citizens' personal information."
The framework addresses the technical, regulatory, and authorising mechanisms required for smart-services creation and cross-jurisdictional data sharing between governments and industry.
The report came to seven conclusions:
- Many of the voiced concerns about data sharing are expressed as concerns about privacy. In practice, they stem from concerns about the sensitivity of the data and about how the outputs will be used.
- The use case for data strongly influences the risk framework required and the methods (aggregation, suppression, obfuscation, perturbation) appropriate for increasing data safety.
- It is feasible to develop a meaningful Personal Information Factor (PIF) giving a measure of personal information in de-identified, people-centric data. Information theoretic metrics show promise for many common protection methods and can be enhanced to cover perturbed data.
- Re-identification risk and levels of personal information in data are related but different concepts.
- Understanding the relationship between different features in a dataset helps to identify those that have the greatest impact on data utility after protection methods are applied.
- Development of a meaningful measure of relative utility is feasible for datasets protected through aggregation, generalisation, obfuscation and perturbation. Information theoretic metrics based on Mutual Information (between original and protected datasets) show promise.
- Dealing with "trajectories" (or pathways) in data is critical to its safe use and release; approaches show promise, but implementation remains complex.
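The mutual-information idea in the conclusions above can be sketched with a few lines of code. This is an illustration of the general technique, not code or data from the report: it measures how much information about an "age" column survives a simple generalisation (10-year bands), expressed relative to the entropy of the original column.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two paired sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p_xy * log2( p_xy / (p_x * p_y) ), with p_x = px[x]/n, p_y = py[y]/n
        mi += p_xy * log2(p_xy * n * n / (px[x] * py[y]))
    return mi

# Hypothetical toy data: exact ages, then a generalised (banded) version.
ages = [23, 27, 31, 36, 42, 47, 53, 58]
banded = [a // 10 * 10 for a in ages]        # e.g. 23 -> 20, 36 -> 30

# Relative utility: I(original; protected) / H(original).
# H(X) = I(X; X), so a value of 1.0 would mean no information loss.
h_original = mutual_information(ages, ages)
utility = mutual_information(ages, banded) / h_original
print(round(utility, 3))                     # prints 0.667
```

Coarser bands would push the ratio further below 1.0, which is the trade-off the report's Utility Factor is meant to capture: stronger protection generally means lower relative utility.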