Identifiable Data

Responsible use of personal data means protecting identities. Before identifiable information can be collected, used or shared, researchers must consider legal and ethical requirements, such as privacy legislation and informed consent.

While access to data should ideally be as open as possible, access to sensitive and identifiable data should be as closed as necessary. 

It’s possible to reduce the identifiability of data through techniques referred to as ‘de-identification’, ‘anonymisation’, or ‘de-personalising’. However, in the current age of big data and triangulation (combining several sources to work out who a record belongs to), there is debate over whether any method can reliably remove all identifiable information.

This does not mean that data cannot be used or shared for research. But it does mean that well-defined approaches for managing and working with data must be implemented.
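
The triangulation concern can be made concrete with a small sketch. It counts how many records become unique once several indirect identifiers are combined, even though no names are present. The records, field names and the helper unique_record_count are all hypothetical and exist only for illustration; this is not a formal risk assessment method.

```python
from collections import Counter

# Hypothetical records with names already removed. The remaining
# fields are indirect identifiers invented for this illustration.
records = [
    {"postcode": "3000", "birth_year": 1984, "occupation": "nurse"},
    {"postcode": "3000", "birth_year": 1984, "occupation": "teacher"},
    {"postcode": "3000", "birth_year": 1991, "occupation": "nurse"},
    {"postcode": "2600", "birth_year": 1984, "occupation": "nurse"},
    {"postcode": "3000", "birth_year": 1984, "occupation": "nurse"},
]


def unique_record_count(rows, fields):
    """Count rows whose combination of values in `fields` occurs only once."""
    combos = Counter(tuple(row[f] for f in fields) for row in rows)
    return sum(1 for count in combos.values() if count == 1)


# One field on its own singles out few people...
one_field = unique_record_count(records, ["postcode"])
# ...but combining fields singles out more of them.
all_fields = unique_record_count(
    records, ["postcode", "birth_year", "occupation"]
)

print(one_field, all_fields)
```

In this toy dataset one record is unique by postcode alone, while three of the five are unique once all three fields are combined, which is why removing direct identifiers is not always enough.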

Managing identifiable data

Research data often needs to contain personal information to help with study administration and qualitative analysis. Establishing a well-defined data management plan before starting your research is the best way to meet ethical and privacy requirements through access control and data security.

A safe data management plan can include:

  • control of access through physical or digital means, such as passwords
  • encryption of data, particularly if it is being moved between locations
  • never putting identifiable and unencrypted data on easily lost items such as USB keys, laptops and external hard drives
  • taking reasonable actions to prevent the inadvertent disclosure, release or loss of sensitive personal information.

Five Safes: Working with sensitive data

The UK Data Service uses the Five Safes framework for controlled access to sensitive or confidential data: safe data, safe projects, safe people, safe settings and safe outputs.

Australia has its own requirements. The Commonwealth Privacy Act 1988 sets out 13 Australian Privacy Principles, and most states have their own privacy legislation. The Office of the Australian Information Commissioner has details.

Learn about the ARDC co-investment project that’s establishing a shared and distributed sensitive data access management platform for the social sciences and related disciplines, CADRE.

De-identifying data

Data de-identification can protect individuals, organisations and businesses, and protect information such as the spatial location of mineral or archaeological findings or endangered species. 

De-identification is not an exact science, and judgement calls may still need to be made. It’s also not a ‘magic bullet’ for sharing and publishing sensitive data. De-identification should be considered within a range of activities to protect the privacy of research participants, such as obtaining informed consent for data sharing and controlling access to the data. The validity of some research may also be reduced if it uses de-identified data.

Best practice basics for managing de-identification

It’s critical to have a clear plan for managing identifiable data through all research stages and when publishing data. Understanding the requirements and risks will help inform the kinds of consent, data security, and access controls required.

Here are some tips to start your de-identification:

  • plan de-identification early in the research as part of your data management planning
  • make sure the consent process covers the agreed level of anonymity and clearly states what may and may not be recorded, transcribed, or shared
  • retain original unedited versions of data for use within the research team and for preservation
  • create a de-identification log of all replacements, aggregations or removals made
  • store the log separately from the de-identified data files
  • identify replacements in text in a meaningful way, e.g. in transcribed interviews indicate replaced text with [brackets] or use XML markup tags, such as <anon>…</anon>
  • for qualitative data (such as transcribed interviews or survey textual answers), use pseudonyms or generic descriptors rather than blanking out information
  • digitally manipulate audio and image files to remove identifying information
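
Several of the tips above (using pseudonyms, marking replacements with tags, and keeping a separate log) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete de-identification tool: the function names pseudonymise and save_log, the sample transcript, the names in it and the replacement table are all hypothetical, and a real project would still need manual review of each transcript.

```python
import csv
import re


def pseudonymise(text, replacements, tag="anon"):
    """Replace identifying terms with tagged pseudonyms and log each change.

    `replacements` maps an identifying term to a pseudonym or generic
    descriptor. Returns the edited text and a list of log rows.
    Terms are applied one after another, so a pseudonym should not
    itself contain a term that appears later in the mapping.
    """
    log = []
    for original, pseudonym in replacements.items():
        pattern = re.compile(r"\b" + re.escape(original) + r"\b")
        marked = f"<{tag}>{pseudonym}</{tag}>"
        # A function replacement avoids backslash processing in `marked`.
        text, count = pattern.subn(lambda _match: marked, text)
        if count:
            log.append(
                {"original": original,
                 "replacement": pseudonym,
                 "occurrences": count}
            )
    return text, log


def save_log(log, path):
    """Write the log to a CSV file, kept apart from the de-identified data."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=["original", "replacement", "occurrences"]
        )
        writer.writeheader()
        writer.writerows(log)


# Hypothetical transcript and replacement table, invented for illustration.
transcript = (
    "Maria Lopez said she met Dr Chen at Ballarat Hospital. "
    "Maria Lopez was there weekly."
)
replacements = {
    "Maria Lopez": "Participant 1",
    "Dr Chen": "the doctor",
    "Ballarat Hospital": "a regional hospital",
}

deidentified, log = pseudonymise(transcript, replacements)
print(deidentified)
```

The log links pseudonyms back to real identities, so save_log would be pointed at a location with tighter access controls than the de-identified files themselves.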

Australian and international resources

For more in-depth information on de-identification, explore the following Australian resources:

The following international resources are also available: