Enhancing Metadata for Inclusive Research on Entrenched Disadvantage

Increasing the utility of important social science datasets for researchers.
Who will benefit
Social science researchers and analysts, government data custodians

The Challenge

Australia is creating data systems, including public sector data assets, that can support research, policy analysis and evaluation in areas such as education, health, employment, inequality and disadvantage. However, the state of administrative data and the associated metadata needs to be improved so that considerably less time, effort and resources are required to conduct this research.

The Response

This pilot project aims to demonstrate how to improve the metadata of an important dataset for researchers, particularly in the social sciences.

The project will improve the metadata of the Australian Bureau of Statistics (ABS) Higher Education (HE) administrative data within the Person Level Integrated Data Asset (PLIDA), formerly known as the Multi-Agency Data Integration Project (MADIP), with a view to increasing the utility of the data for researchers and analysts.

HE administrative data refers to a comprehensive set of statistics related to higher education institutions.

The PLIDA dataset is currently used by over 200 research projects led by government, academia and private institutions. Integrated data assets hold a broad range of data that allow complex questions to be analysed, yielding insights that aren’t available from a single data source.

The project will focus on the content of metadata (including best practice metadata standards), rather than on the technicalities surrounding information management, or the quality of the data per se.

The proposed deliverables for the project include: 

  • an outline and synthesised summary of existing metadata standards with a focus on administrative and social science data
  • assessment of existing metadata of HE administrative data (within PLIDA/MADIP, and external to it)
  • a user experience report, based on targeted consultations, outlining perceived metadata shortcomings when working with PLIDA/MADIP/HE data, and metadata user preferences
  • best practice metadata elements for HE data (and social science administrative data more broadly, as applicable)
  • guidelines for implementing best practice metadata standards for HE data over time (and in the context of the ABS’s DataLab environment) with relevance for data custodians and the ABS
  • a forward plan for administrative data (with social science relevance) with guidelines and accompanying notes on best practice metadata implementation for data custodians and the ABS.

Who Will Benefit?

Social science researchers and analysts, government data custodians and providers, particularly those using PLIDA and contributing data to the dataset.

The Partners

  • University of Queensland (project lead)
  • Australian Bureau of Statistics (ABS)
  • Australian Government Department of Education
  • ARDC

Target Outcomes

The expected longer-term outcomes of this project include:

  • improved knowledge and awareness among data custodians and providers of good-practice data curation and documentation, including the CARE and FAIR principles
  • improved, and ongoing improvement of, documentation of government administrative data within the DataLab environment
  • enhanced usability of PLIDA/MADIP Higher Education data for researchers
  • increased demand by social researchers for working with administrative data in the DataLab environment.

This project is a pilot for a broader program of work to establish social science research infrastructure.