The Challenge
Australian galleries, libraries, archives and museums hold a wealth of data on our history, culture, language and more. Traditional research using these collections involved days or weeks in a reading room, but many collections are now digitised, available online from anywhere.
The emergence of digitised collections has created exciting opportunities for data-driven research. However, researchers need new skills and tools to use computational methods, which have not traditionally been taught in universities and institutions.
In 2022, the ARDC commissioned a report on findings from consultation with the research community about how they use Trove, an online platform run by the National Library of Australia. During the consultation, researchers from across the humanities, arts and social sciences (HASS) raised the multitude of diverse questions and approaches that could be brought to such a rich source of data. Given this diversity, it was recognised that establishing a way to pool approaches for research would be of value, by enabling researchers to use, reuse, share and enhance tools and datasets.
A recommendation of the report was the creation of a platform where tools, code and datasets that make use of data on Trove could be shared, organised and annotated by researchers. It would create a ‘collaboration layer’ on top of an improved Trove API.
The Response
The ARDC Community Data Lab (CDL) fosters the development of tools, datasets, and documentation that enable researchers to use data from libraries, museums, archives and other collections. It does this by:
- gathering information on researcher needs through community co-design processes
- partnering with other organisations to develop resources that meet identified researcher needs
- creating frameworks and policies to guide the development of new resources
- sharing details of new resources and supporting related initiatives for training in digital research skills, creating a pool of approaches for research.
By focusing on processes for efficient, collaborative, and sustainable development, the ARDC CDL will be able to respond quickly to new research needs. The co-design framework will foster connections between researchers and developers, building capacity and engagement. The ARDC CDL is part of the Connections focus area of the HASS and Indigenous Research Data Commons.
Feedback on the project is welcome. Please contact us.
Who Will Benefit
- Anyone interested in using collections in galleries, libraries, archives and museums (GLAM) for research will benefit from understanding the data held in those collections and the methods that can be used to analyse it.
- GLAM institutions will benefit from greater exposure of, and interest in, their holdings, and from research outcomes that build upon those holdings.
- The Digital Research Infrastructure sector will benefit from the development of good practice for building infrastructure in this way.
Outcomes
The outputs from Phase 1 include an initial suite of underpinning services and guides, which are available to use now:
- ARDC BinderHub Service, on which a Jupyter notebook and JupyterHub-based approach to developing tools and guidance can be built
- Trove Data Guide
- GLAM Workbench, with enhancements focused on adding machine-readable metadata to Trove-related notebooks and datasets
- Glycerine, an image annotation workbench
- Stylometric Intelligent Archive (SIA), a stylometric analysis workbench
- Spatio-temporal hotspot mapping data guide (access the notebook)
- Searching on the Gazetteer of Historical Placenames guide (access the notebook)
Learn more and access them via the ARDC CDL service page. Read our guide to Trove for researchers.
Also developed were proof of concept services, to test the viability of the architectural principles developed alongside them. These include:
- Trove proxy interface for TEI representation (access the code)
- Trove query builder interface (access the code)
- custom configuration of Voyant Tools, with the proxy and query builder included (access the code)
- generic dockerised version of Voyant Tools (access the code).
We recommend these outputs to developers interested in extending them.
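As an illustration of what a query builder for Trove might do, the following is a minimal sketch in Python, not the project's own code. The helper name `build_trove_query` is hypothetical, and the endpoint and the parameter names (`category`, `q`, `encoding`) are assumptions based on version 3 of the Trove API; check them against the current Trove API documentation before use. The sketch only constructs a search URL. Sending a real request would also require an API key issued by the National Library of Australia.

```python
from urllib.parse import urlencode

# Assumed search endpoint for version 3 of the Trove API (not taken from the project code).
TROVE_API_BASE = "https://api.trove.nla.gov.au/v3/result"


def build_trove_query(terms, category="newspaper"):
    """Return a Trove search URL for a list of keywords (hypothetical helper)."""
    if not terms:
        raise ValueError("at least one search term is required")
    # Wrap multi-word terms in double quotes so they are searched as phrases.
    query = " AND ".join(f'"{term}"' if " " in term else term for term in terms)
    # Parameter names are assumptions; urlencode handles the escaping.
    params = {"category": category, "q": query, "encoding": "json"}
    return f"{TROVE_API_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_trove_query(["wheat", "bush fire"]))
```

Keeping URL construction separate from the network request makes a helper like this easy to test and to reuse from other tools, such as a notebook or a proxy service.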
This project has delivered outcomes for 3 different audiences and continues to do so:
The primary outcome is to facilitate easier, faster or new ways to access and benefit from GLAM sector holdings. We have done this by creating a growing set of guides and tools, which are available via our resources and services pages.
If you are a researcher comfortable with writing code, you may also be interested in outcomes for the national digital research infrastructure sector.
We are always seeking partnerships with the GLAM sector to focus on your holdings. Right now, we are doing this through guidance and tools for Trove, which aggregates the holdings of many GLAM institutions. Get in touch if you’re interested in drawing attention to your holdings.
If you are a GLAM sector employee comfortable with writing code, you may also be interested in the outcomes for the national digital research infrastructure sector.
We are seeking to grow a body of best practice to:
- develop a ‘collaboration layer’ built upon existing holdings (especially via APIs)
- document useful patterns of development
- build a cohesive community of practice
- identify and build out supporting services and activities to enable the above.
To this end, we have produced documentation and architectures, including:
- the Trove researcher consultation report
- the Phase 1 project plan
- a preliminary architecture
- a set of principles to guide participation in the development of infrastructure in the lab.
Researcher Advisory Group
The ARDC Community Data Lab Researcher Advisory Group gives the project team focused, specific input and feedback as the project progresses, to ensure that the project outputs are broadly applicable to researchers who would benefit from using data lab tools.
The group provides domain knowledge, independent critical thinking and advice on the defined project work packages and deliverables.
The members of the Researcher Advisory Group are:
- Professor Catherine Travis, Chair of Modern European Languages at the College of Arts and Social Sciences, Australian National University
- Dr Yorick Smaal, Senior Lecturer in History in the School of Humanities, Languages and Social Science, Griffith University
- Dr Trent Ryan, Research Fellow to the Indigenous Data Network at the Melbourne School of Population and Global Health, University of Melbourne
- Professor Adrian Vickers, Professor of Southeast Asian Studies, University of Sydney
- Dr Mike Jones, Postdoctoral Research Fellow in the College of Arts and Social Sciences, Australian National University
- Dr Terhi Nurmikko-Fuller, Senior Research Fellow at the Centre for Social Research & Methods, Australian National University
- Jacinta Walsh, PhD Candidate, Monash Indigenous Studies Centre (MISC)
- Dr Leah Henrickson, Lecturer in Digital Media and Cultures, School of Communication and Arts, The University of Queensland
- Professor James Smithies, Director, Digital Research (HASS) Australian National University College of Arts and Social Sciences
- Dr Imogen Wegman, Lecturer in Humanities, Office of the School of Humanities, University of Tasmania
Key Resources
- Guides and tools developed by the ARDC Community Data Lab
- ARDC Community Data Lab project plan: Phase 1
- Trove Researcher Consultation Report
- ARDC Community Data Lab architecture: Phase 1 (note: the ARDC Community Data Lab was previously called the HASS Community Data Lab)
Related Articles
- Digital History and Trove Data Guide at AHA Conference 2024
- Latest Updates from the HASS and Indigenous Research Data Commons
- HASS and Indigenous Research Data Community Exchange Knowledge at Annual Symposium
- Summer School Shares Computational Skills for HASS and Indigenous Research
- Implementing Indigenous Data Licensing and Access: Empowering Communities and Upholding Cultural Rights
- AARNet and the ARDC Collaboration Brings BinderHub to Researchers
- Empowering HASS and Indigenous Researchers with Essential Computational Skills
- Advancing HASS and Indigenous Research Infrastructure: A Symposium
- End of Year Update on the HASS Research Data Commons and Indigenous Research Capability Program