ARDC Community Data Lab

The ARDC Community Data Lab will share tools and datasets for collaborative HASS (humanities, arts and social sciences) research projects that use data from archives, libraries and collections.

The Challenge

During the consultation for the Trove Researcher Consultation Report, researchers said that HASS is not a single entity: researchers in this domain bring diverse questions and approaches to interrogating data held in archives and collections. However, participants recognised that a way to pool research approaches could be of value, enabling researchers to use, re-use and augment datasets in ways that support further additions and developments.

A number of suggestions during the consultation pointed to the need for a platform where tools, code and datasets that make use of Trove data could be shared, organised and annotated by researchers. Such a platform would create a ‘collaboration layer’ on top of an improved Trove API.

The Response

The ARDC Community Data Lab will enable the sharing of a growing range of tools and datasets, provide environments for running the tools, and offer options for researchers to analyse and annotate datasets. It will be a platform that not only shares instruments and infrastructure, but also provides governance, protocols and practices for managing and recording research as a process.

The ARDC Community Data Lab will facilitate cross-institutional collaboration, enabling researchers to work together in groups and conduct open research.

It is being constructed in phases and will be extendable based on the needs of the research community. The project currently includes 2 phases, which will be delivered over 2023–2024.

Phase 1 will focus on Trove as a central data source. The ARDC Community Data Lab will provide extensive documentation and examples of how to process data from Trove and other archives and collections using existing distributed research infrastructure available to Australian researchers. This will include ARDC Nectar Research Cloud computing services, publicly available data repositories such as Zenodo and Figshare, and workspace services such as GitHub and GitLab.

The ARDC Community Data Lab will include a range of tools, several of which are described under Target Outcomes.

Phase 2 will focus on data and analytics training guides.

Feedback on the project is welcome. Please send your feedback to [email protected].

Who Will Benefit?

  • HASS researchers and all those interrogating collections and archives.

Researcher Advisory Group

The ARDC Community Data Lab Researcher Advisory Group will provide focused and specific input and feedback to the project team as the project progresses, to ensure the project outputs have broad applicability to HASS and Indigenous researchers who would benefit from using data lab tools.

The group will provide domain knowledge, independent critical thinking and advice on the defined project work packages and deliverables.

The members of the Researcher Advisory Group are:

  • Professor Catherine Travis, Chair of Modern European Languages at the College of Arts and Social Sciences, Australian National University
  • Dr Yorick Smaal, Senior Lecturer in History in the School of Humanities, Languages and Social Science, Griffith University
  • Dr Trent Ryan, Research Fellow to the Indigenous Data Network at the Melbourne School of Population and Global Health, University of Melbourne
  • Professor Adrian Vickers, Professor of Southeast Asian Studies, University of Sydney
  • Dr Mike Jones, Postdoctoral Research Fellow in the College of Arts and Social Sciences, Australian National University
  • Dr Terhi Nurmikko-Fuller, Senior Research Fellow at the Centre for Social Research & Methods, Australian National University.

Target Outcomes

This project will produce the ARDC Community Data Lab, containing datasets, tools and software, and training guides. It will be freely accessible to Australian researchers. Read the latest updates:

The National Library’s Trove service is consistently highlighted as a valuable resource for researchers in the humanities, arts and social sciences. As part of the ARDC Community Data Lab, Tim Sherratt, creator of GLAM Workbench, has published the Trove Data Guide.

The Trove Data Guide explores the different types of data available from Trove, covering:

  • What is Trove
  • Understanding search
  • Accessing data
  • Digitised newspapers and gazettes
  • Other digitised resources
  • Research Pathways.


The following tools and services are in development and we can’t wait to share them with you, so stay tuned for further updates.

Image annotation workbench

We have been working with researchers on their needs and have commenced preliminary wireframing to create this new workbench – coming soon!

Stylometrics Tool (IA Workbench) (SIA)

Intelligent Archive (IA) Workbench is a platform for computational and statistical analysis of style in texts, developed by the University of Newcastle. It enables you to easily organise large collections of texts into sets, manage metadata, generate word and n-gram frequencies, handle XML tags, and generate results that can be exported for analysis and visualisation. This is in development – watch this space!
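
To give a sense of what generating word and n-gram frequencies involves, here is a minimal Python sketch. It is not the Intelligent Archive implementation: the tokenisation rule, the function names (tokenize, ngram_frequencies) and the sample sentence are all assumptions made for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Assumption: a token is a run of letters or apostrophes; real
    stylometric tools usually offer more careful tokenisation.
    """
    return re.findall(r"[a-z']+", text.lower())


def ngram_frequencies(tokens, n=1):
    """Count every run of n consecutive tokens (n=1 gives word counts)."""
    ngrams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in ngrams)


# Hypothetical sample text, used only to exercise the functions.
sample = "the cat sat on the mat and the cat slept"
tokens = tokenize(sample)

word_counts = ngram_frequencies(tokens, n=1)
bigram_counts = ngram_frequencies(tokens, n=2)

print(word_counts.most_common(3))
print(bigram_counts.most_common(2))
```

Counts like these are the raw material for stylometric comparison: once each text in a set has a frequency table, the tables can be exported and compared with statistical or visualisation tools.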

Spatio-temporal hotspot mapping

The first iteration of the Spacetime (spatio-temporal) Hot Spot analysis tools, for use with TLCMap data and beyond, has been demonstrated to gather feedback before wider release.

Trove query interface with additional data formats

A data transformation service will provide a simplified user interface for users to construct Trove API queries, and allow them to access Trove data in additional data formats, including TEI and Atom XML, which they can then import into other applications.
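
As a rough illustration of what constructing a Trove API query involves, the Python sketch below assembles a search URL from a few parameters. It does not describe the planned service. The base URL and the parameter names (q, category, encoding, n) are assumptions that should be checked against the current Trove API documentation, and the function name build_trove_query is hypothetical. The sketch only builds the URL: it sends no request, and it omits the API key that Trove requires.

```python
from urllib.parse import urlencode

# Assumption: base URL of the Trove API search endpoint.
# Verify against the current Trove API documentation before use.
TROVE_SEARCH_URL = "https://api.trove.nla.gov.au/v3/result"


def build_trove_query(query, category="newspaper", encoding="json", n=20):
    """Return a search URL for the given query string.

    All parameter names here are assumptions for illustration;
    no network request is made and no API key is attached.
    """
    params = {
        "q": query,            # search terms
        "category": category,  # which part of Trove to search
        "encoding": encoding,  # response format requested from the API
        "n": n,                # number of results per page
    }
    return f"{TROVE_SEARCH_URL}?{urlencode(params)}"


url = build_trove_query("federation drought")
print(url)
```

A simplified interface would hide this kind of parameter assembly from the user. Converting the results into formats such as TEI or Atom XML would be a separate step that this sketch does not attempt.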

Key Resources