Identifiable Data

Responsible use of personal data means protecting identities. Before identifiable information can be collected, used or shared, researchers must consider legal and ethical requirements, such as privacy legislation and informed consent.

While access to data should ideally be as open as possible, access to sensitive and identifiable data should be as closed as necessary.

It’s possible to reduce the identifiability of data through techniques referred to as ‘de-identification’, ‘anonymisation’, or ‘de-personalising’. However, in the current age of big data and triangulation, where separate datasets can be combined to re-identify individuals, there is debate over whether any method can reliably remove all identifiable information.

This does not mean that data cannot be used or shared for research. But it does mean that well-defined approaches for managing and working with data must be implemented.

Managing identifiable data

Research data often needs to contain personal information to help with study administration and qualitative analysis. Establishing a well-defined data management plan before starting your research is the best way to meet ethical and privacy requirements through access control and data security.

A safe data management plan can include:

control of access through physical or digital means, such as passwords
encryption of data, particularly if it is being moved between locations
never putting identifiable and unencrypted data on easily lost items such as USB keys, laptops and external hard drives
taking reasonable actions to prevent the inadvertent disclosure, release or loss of sensitive personal information.

Five Safes: Working with sensitive data

The UK Data Service uses the Five Safes framework for controlled access to sensitive or confidential data: safe data, safe projects, safe people, safe settings and safe output.

Australia also has guidelines. The Commonwealth Privacy Act 1988 sets out 13 Australian Privacy Principles, and most states and territories have their own privacy legislation. The Office of the Australian Information Commissioner provides details.

Learn about CADRE, the ARDC co-investment project that’s establishing a shared and distributed sensitive data access management platform for the social sciences and related disciplines.

De-identifying data

Data de-identification can protect individuals, organisations and businesses, and protect information such as the spatial location of mineral or archaeological findings or endangered species.

De-identification is not an exact science, and judgement calls may still need to be made. It’s also not a ‘magic bullet’ for sharing and publishing sensitive data. It should be considered as one of a range of activities to protect the privacy of research participants, alongside obtaining informed consent for data sharing and controlling access to the data. The validity of some research may also be reduced if it uses de-identified data.

Best practice basics for managing de-identification

It’s critical to have a clear plan for managing identifiable data through all research stages and when publishing data. Understanding the requirements and risks will help inform the kinds of consent, data security, and access controls required.

Here are some tips to start your de-identification:

plan de-identification early in the research as part of your data management planning
make sure the consent process specifies the level of anonymity required and clearly states what may and may not be recorded, transcribed, or shared
retain original unedited versions of data for use within the research team and for preservation
create a de-identification log of all replacements, aggregations or removals made
store the log separately from the de-identified data files
identify replacements in text in a meaningful way, e.g. in transcribed interviews, indicate replaced text with [brackets] or use XML markup tags such as <anon>…</anon>
for qualitative data (such as transcribed interviews or survey textual answers), use pseudonyms or generic descriptors rather than blanking out information
digitally manipulate audio and image files to remove identifying information
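
The replacement and logging tips above can be sketched in Python. This is a minimal illustration under stated assumptions, not a complete de-identification tool: the function names deidentify and write_log, the sample transcript and the replacement table are all hypothetical. Exact string matching will miss misspellings, nicknames and indirect identifiers, so manual review is still needed.

```python
import json
import re
from collections import Counter


def deidentify(text, replacements):
    """Replace identifying strings with [bracketed] generic descriptors.

    `replacements` maps each identifying string to a descriptor.
    Returns the edited text and a log of the replacements made.
    """
    if not replacements:
        return text, []

    # Try longer strings first so "Mary Smith" is matched before "Mary".
    names = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in names) + r")\b"
    )
    counts = Counter()

    def substitute(match):
        original = match.group(0)
        counts[original] += 1
        return f"[{replacements[original]}]"

    # A single pass over the text, so inserted descriptors are never re-scanned.
    edited = pattern.sub(substitute, text)
    log = [
        {
            "original": name,
            "replacement": f"[{replacements[name]}]",
            "occurrences": counts[name],
        }
        for name in replacements
        if counts[name]
    ]
    return edited, log


def write_log(log, log_path):
    """Save the log as JSON; keep it apart from the de-identified files."""
    with open(log_path, "w", encoding="utf-8") as handle:
        json.dump(log, handle, indent=2, ensure_ascii=False)


# Hypothetical sample data, for illustration only.
transcript = "Mary Smith said she met Dr Lee in Ballarat. Mary moved in 2019."
replacements = {
    "Mary Smith": "Participant 1",
    "Mary": "Participant 1",
    "Dr Lee": "GP",
    "Ballarat": "regional town",
}

edited, log = deidentify(transcript, replacements)
print(edited)
# [Participant 1] said she met [GP] in [regional town]. [Participant 1] moved in 2019.
```

Because the log maps pseudonyms back to real names, it is itself identifiable data. Calling write_log with a path in a separate, access-controlled location keeps it away from the de-identified files, and the original unedited transcript stays with the research team.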

Australian and international resources

For more in-depth information on de-identification, explore the following Australian resources:

Australian Government’s guide, the ‘De-identification Decision-Making Framework’
Office of the Australian Information Commissioner’s guidance on de-identification of data and information and guide to securing personal information
Australian Government’s Guidelines for the Disclosure of Secondary Use Health Information
ABS Data Confidentiality Guide
Queensland Office of the Information Commissioner’s Guidelines: privacy and de-identification
Office of the National Data Commissioner: Assessing Data Requests

The following international resources are also available:

The Future of Privacy Forum’s visual guide to practical data de-identification
US Department of Health & Human Services’ de-identification guide
US National Institute of Standards and Technology’s guides to de-identifying government datasets and personal information
UK Anonymisation Network’s Anonymisation Decision-Making Framework
UK Data Service’s research data management advice
UK Research Data Network’s resources list for managing personal data
UK Information Commissioner’s Office’s Anonymisation guide
UK Data Archive’s advice on anonymising qualitative data
Irish Qualitative Data Archive’s tool for anonymising qualitative data.

