Federating APPN Data Collections

The Challenge
The Australian Plant Phenomics Network (APPN) is a nationally funded program that delivers state-of-the-art plant phenotyping infrastructure for growing crops in controlled environments and at field locations, using direct measurements, local sensors and, increasingly, UAV and satellite imagery.
Beyond the phenotyping infrastructure that APPN provides, the research community increasingly needs access to FAIR (findable, accessible, interoperable, reusable) agricultural data collections to make sense of the vast number of datasets published through repositories.
The lack of digital connectivity between research organisations means datasets of national significance are siloed, making discovery and aggregation challenging. Treating plant phenotyping datasets as a single discoverable resource would allow datasets to be combined and faceted to support much richer discovery and aggregated data downloads.
Enhancing the digital connectivity between these core national research infrastructure platforms will offer immediate benefits to the APPN community. In addition, solutions developed through this project can be applied to national research infrastructure in other domains to support the efficient management of federated data collections.
The Response
This project aims to use national information infrastructure as a middleware layer that allows discovery of, and access to, APPN data packages published by its partner nodes. This infrastructure includes the repository services of APPN node partners, persistent identifier systems (including DataCite), and the ARDC’s Research Data Australia and Research Vocabularies Australia platforms.
Key goals of the project are to:
- use consistent metadata elements and the persistent identifier (PID) ecosystem (DOI, ROR, ORCID) to describe APPN-related data resources at each of its institutional nodes
- make APPN-related data resources discoverable on the Research Data Australia (RDA) catalogue through an APPN federated view
- develop a strategy to discover candidate datasets for inclusion as part of the APPN federated data collection in the RDA catalogue
- develop recommendations that enable APPN node partners and external stakeholders to tag or nominate datasets for inclusion in the RDA catalogue by using consistent metadata elements and PIDs
- enable machine retrieval of the data resource itself (rather than just the landing page) from a metadata record. This would allow national research infrastructure to use a federated data collection for future data aggregation services or rich content presentation
- develop an APPN portal lens within RDA to manage the federated data collections and further enhance the views and streams of the data assets.
Achieving these goals will allow researchers, policymakers and the public to access and use APPN and partner data more easily, and will make that data discoverable through the ARDC’s Research Data Australia.
Overall, these efforts will ensure that the data produced by APPN and collaborators is easily discoverable and accessible, amplifying the impact of the national research infrastructure for the research community.
This project is part of the Domain Data Portals program within the ARDC’s Planet Research Data Commons.
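The goals above on consistent PIDs and machine retrieval can be illustrated with a minimal Python sketch. It assumes a hypothetical metadata record: the field names (`doi`, `publisher_ror`, `creator_orcid`, `links`, `rel`) and all identifiers and URLs are placeholders for illustration, not an APPN, RDA or DataCite schema. The only real identifier is ORCID's documented sample iD. The checks are loose format checks rather than authoritative validation.

```python
import re

# Loose format patterns; these are assumptions for illustration only.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ROR_RE = re.compile(r"^https://ror\.org/0[a-hj-km-np-tv-z0-9]{6}\d{2}$")
ORCID_RE = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_checksum_ok(orcid):
    """Check the ORCID format and its ISO 7064 MOD 11-2 check digit."""
    if not ORCID_RE.match(orcid):
        return False
    chars = orcid.replace("-", "")
    total = 0
    for ch in chars[:-1]:
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[-1] == expected


def check_pids(record):
    """Report which PIDs in a (hypothetical) record look well formed."""
    return {
        "doi": bool(DOI_RE.match(record.get("doi", ""))),
        "ror": bool(ROR_RE.match(record.get("publisher_ror", ""))),
        "orcid": orcid_checksum_ok(record.get("creator_orcid", "")),
    }


def pick_data_url(record):
    """Return the first link marked as the data itself, not the landing page."""
    for link in record.get("links", []):
        if link.get("rel") == "data":
            return link.get("url")
    return None


# Hypothetical record: the DOI, ROR and URLs are made-up placeholders.
record = {
    "title": "Example wheat phenotyping dataset",
    "doi": "10.1234/example-appn-0001",
    "publisher_ror": "https://ror.org/012345678",
    "creator_orcid": "0000-0002-1825-0097",  # ORCID's documented sample iD
    "links": [
        {"rel": "landing_page", "url": "https://repo.example.org/appn/0001"},
        {"rel": "data", "url": "https://data.example.org/appn/0001/traits.csv"},
    ],
}

print(check_pids(record))
print(pick_data_url(record))
```

Keeping the data link separate from the landing page link is what would let an aggregation service fetch the file directly instead of scraping a web page.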
Who Will Benefit
- researchers and research organisations
- government
- industry
- non-government organisations
The Partners
This project is led by APPN in partnership with the ARDC.
APPN is working with its partner nodes and host institutions’ repository and library services.
Target Outcomes
The key outcomes of this project will be:
- crop researchers will be able to find datasets published by APPN as a single discoverable resource in RDA
- researchers will be able to easily combine and facet APPN datasets to support much richer discovery and aggregated data downloads, maximising overall interoperability.
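As a rough illustration of what combining and faceting could look like once records from several nodes share consistent metadata, the Python sketch below counts and filters records by facet. The records and the facet names (`node`, `crop`, `environment`) are invented for the example and do not reflect an actual RDA or APPN data model.

```python
from collections import Counter

# Hypothetical records aggregated from several nodes; all values are made up.
records = [
    {"title": "Wheat canopy imagery", "node": "Node A",
     "crop": "wheat", "environment": "field"},
    {"title": "Barley root traits", "node": "Node B",
     "crop": "barley", "environment": "controlled"},
    {"title": "Wheat drought trial", "node": "Node B",
     "crop": "wheat", "environment": "controlled"},
]


def facet_counts(records, field):
    """Count how many records carry each value of one facet field."""
    return Counter(r[field] for r in records if field in r)


def filter_by(records, **facets):
    """Keep only the records that match every requested facet value."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in facets.items())
    ]


print(facet_counts(records, "crop"))
print([r["title"] for r in filter_by(records, crop="wheat", environment="controlled")])
```

Faceting of this kind only works when every node describes its datasets with the same metadata elements, which is why the consistent metadata goal comes first.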
Key Resources
Learn more about the Australian Plant Phenomics Network (APPN)