Identifiable Data

Read our guide for responsibly handling identifiable data, and find a list of Australian and international resources for data de-identification.

This resource is for:

  • Higher-degree researchers (HDRs) / PhD candidates
  • Early- and mid-career researchers (EMCRs)
  • Data custodians/managers
  • Government

After reading this resource, you should be able to:

  • develop a strategy for addressing identifiable data in your data management plan
  • understand the Five Safes framework
  • seek further information from Australian and international agencies.

Australian Research Data Commons 2025, Identifiable Data, viewed 15 May 2026, https://ardc.edu.au/resource/identifiable-data/.

Responsible use of personal data means protecting identities. Before identifiable information can be collected, used or shared, researchers must consider legal and ethical requirements, such as privacy legislation and informed consent.

While access to data should ideally be as open as possible, access to sensitive and identifiable data should be as closed as necessary. 

It’s possible to reduce the identifiability of data through techniques referred to as de-identification, anonymisation, or de-personalisation. In the current age of big data and triangulation, however, there is debate over whether any method can reliably remove all identifiable information.

This does not mean that data cannot be used or shared for research, but it does mean that well-defined approaches for managing and working with data must be implemented.
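To illustrate the triangulation concern raised above, here is a minimal Python sketch. It assumes a small, invented set of records from which names have already been removed, and checks which combinations of the remaining fields occur only once. The field names (`postcode`, `birth_year`, `occupation`) and the helper `unique_combinations` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical records with names already removed. All fields and values
# are invented for illustration only.
records = [
    {"postcode": "2600", "birth_year": 1984, "occupation": "nurse"},
    {"postcode": "2600", "birth_year": 1984, "occupation": "nurse"},
    {"postcode": "2600", "birth_year": 1951, "occupation": "judge"},
    {"postcode": "3000", "birth_year": 1990, "occupation": "teacher"},
    {"postcode": "3000", "birth_year": 1990, "occupation": "teacher"},
]


def unique_combinations(rows, fields):
    """Return the value combinations of `fields` that occur in exactly one row."""
    counts = Counter(tuple(row[field] for field in fields) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


# A combination that appears only once could single out one person when
# matched against other data sources, even though no name is present.
at_risk = unique_combinations(records, ["postcode", "birth_year", "occupation"])
print(at_risk)
```

A check like this does not prove that data is safe. It only flags records that stand out on the chosen fields, which is one reason access controls are still needed alongside de-identification.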

Managing Identifiable Data

Research data often needs to contain personal information to help with study administration and qualitative analysis. Establishing a well-defined data management plan before starting your research is the best way to meet ethical and privacy requirements through access control and data security.

A safe data management plan can include:

  • control of access through physical or digital means, such as passwords
  • encryption of data, particularly if it is being moved between locations
  • never putting unencrypted identifiable data on easily lost items such as USB keys, laptops and external hard drives
  • taking reasonable actions to prevent the inadvertent disclosure, release or loss of sensitive personal information.

Five Safes: Working with Sensitive Data

The Five Safes framework, used by the UK Data Service, supports controlled access to sensitive or confidential data through five dimensions: safe data, safe projects, safe people, safe settings and safe outputs.

Australia also has guidelines. The Commonwealth Privacy Act 1988 sets out 13 Australian Privacy Principles, and most states have their own privacy legislation.

An ARDC co-investment project has established CADRE, a shared and distributed sensitive data access management platform for the social sciences and related disciplines.

De-Identifying Data

Data de-identification can protect individuals, organisations and businesses, and protect information such as the spatial location of mineral or archaeological findings or endangered species. 

De-identification is not an exact science, and judgement calls may still need to be made. Nor is it a magic bullet for sharing and publishing sensitive data. It should be considered as one of a range of activities that protect the privacy of research participants, such as obtaining informed consent for data sharing and controlling access to the data. The validity of some research may also be reduced if it uses de-identified data.

Managing de-identification: best-practice basics

It’s critical to have a clear plan for managing identifiable data through all research stages and when publishing data. Understanding the requirements and risks will help inform the kinds of consent, data security, and access controls required.

Here are some tips to start your de-identification:

  • plan de-identification early in the research as part of your data management planning
  • make sure the consent process specifies the level of anonymity required and clearly states what may and may not be recorded, transcribed or shared
  • retain original unedited versions of data for use within the research team and for preservation
  • create a de-identification log of all replacements, aggregations or removals made
  • store the log separately from the de-identified data files
  • identify replacements in text in a meaningful way – such as indicating replaced text with square brackets [] in transcribed interviews, or using XML markup tags like <anon>…</anon>
  • for qualitative data like transcribed interviews or survey textual answers, use pseudonyms or generic descriptors rather than blanking out information
  • digitally manipulate audio and image files to remove identifying information.
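Several of the tips above (meaningful replacements in square brackets, generic descriptors rather than blanks, and a de-identification log) can be sketched in Python. This is a minimal illustration under stated assumptions, not a complete de-identification tool: the function names `pseudonymise` and `write_log`, the sample transcript and the replacement terms are all hypothetical, and simple whole-word matching will miss misspellings, nicknames and indirect identifiers.

```python
import csv
import re


def pseudonymise(text, replacements):
    """Replace each identifying term with a bracketed generic descriptor.

    Returns the edited text and a log of (original, replacement, count) rows.
    Terms are matched as whole words and applied in dictionary order.
    """
    log = []
    for original, descriptor in replacements.items():
        marker = f"[{descriptor}]"
        pattern = re.compile(r"\b" + re.escape(original) + r"\b")
        # A function replacement avoids any backslash handling in `marker`.
        text, count = pattern.subn(lambda _match: marker, text)
        log.append((original, marker, count))
    return text, log


def write_log(log, path):
    """Write the de-identification log to a CSV file at `path`."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["original", "replacement", "occurrences"])
        writer.writerows(log)


# Invented sample transcript and replacement terms, for illustration only.
transcript = (
    "Maria said she met Dr Nguyen at Canberra Hospital. Maria was nervous."
)
replacements = {
    "Maria": "Participant 1",
    "Dr Nguyen": "treating doctor",
    "Canberra Hospital": "a public hospital",
}

deidentified, log = pseudonymise(transcript, replacements)
print(deidentified)
```

Because the log maps each replacement back to the original term, it is itself identifiable data. In practice `write_log` would be pointed at a separate, access-controlled location rather than the folder holding the de-identified files, and the original transcript would be retained unedited for the research team.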

Australian resources

International resources