Data Retention Program Phase 4: Identifying Important Data Collections

Storage capacity subsidy for active research projects.
Project
Data Retention Program Phase 4: Identifying Important Data Collections
Project lead
ARDC
Who will benefit
Universities and NCRIS facilities with growing data storage burdens

Timeframe

Dec 2022 to Dec 2023

Current Phase

In progress

ARDC Co-investment

$2,500,000

The Challenge

Research data takes considerable time and effort to collect and curate, and can be used again to answer new research questions. However, to get the most value from data collections, they need to be stored with metadata that enables reuse, and be easy to find. The most accurate metadata is collected during project activity that creates research data.

The volume of research data continues to grow while the cost of data storage capacity is flat. As data storage can be expensive and long-lasting, investment into it needs to be planned carefully. The options to store data are also more diverse than ever: it can be done locally, in regional data centres or in distributed cloud environments.

The Response

Phase 4 of the Data Retention Program will provide a subsidy to universities and NCRIS facilities to purchase infrastructure for storing research data.

The Data Retention Program supports the retention of research data collections of national significance with a coherent and consistent international metadata standard. Phase 4 will provide co-investment for existing research projects to support anticipated or existing research data collections.

Previous phases of the Data Retention Program indicated that there is an opportunity to support data collections earlier. Important metadata should be recorded at the same time as the data, rather than trying to do so retrospectively, which is always more challenging.

Supported data collections will require 8 project-level metadata items to be recorded in a private project register.

Estimated data storage needs can be applied retrospectively for up to 2 years. The maximum capacity investment period will be 3 years per project, with projects running concurrently.

Further support for data output may be possible from the ARDC Data Retention Investment model by registering formal data collections into DataCite with a further 6 metadata concepts, 14 in total, as part of the primary Data Retention model (see Data Retention Project Phase 2).

Who Will Benefit?

Institutional IT service operators will benefit because the project lowers the risk posed by growing data storage needs: the metadata it collects on content and capacity enables realistic and timely decisions on storage allocations.

Researchers will benefit from this investment through formal recognition of their contribution to research data output, in a similar manner to publication in the scholarly record. When researchers provide contemporary metadata, their research data output will transition more easily to the scholarly record as a recognised corpus of Australian research.

Institutions will benefit from this investment by ensuring the facilities they provide are stable and enduring and in line with the code for responsible conduct, while recognising a more complete output from the research their institution supports.

Collaboration

We sought applications for Phase 4 of the Data Retention Program from those who manage research data storage needs at universities and NCRIS facilities. Universities and NCRIS facilities can apply for a co-investment subsidy for their research data storage infrastructure costs.

The co-investment subsidy will be calculated at $100 per TB from a register of all eligible projects. Eligible projects will be required to fulfil all 8 metadata requirements at the time of application. There are examples in the project documentation, provided below.
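The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the ARDC's actual method: the `Project` class, the function names, and the placeholder metadata field names (`metadata_1` to `metadata_8`) are assumptions, since the 8 required items are defined in the project documentation rather than on this page. The sketch also assumes the $100 rate applies once to each project's registered capacity; whether it applies per year is not stated here.

```python
from dataclasses import dataclass, field

RATE_PER_TB = 100  # dollars per TB, as stated in the program description

# Hypothetical placeholder names: the page requires 8 metadata items
# per project but does not list them.
REQUIRED_FIELDS = tuple(f"metadata_{i}" for i in range(1, 9))


@dataclass
class Project:
    name: str
    capacity_tb: float  # estimated storage need in terabytes
    metadata: dict = field(default_factory=dict)


def is_eligible(project: Project) -> bool:
    """A project is eligible only if all 8 metadata items are filled in."""
    return all(project.metadata.get(key) not in (None, "") for key in REQUIRED_FIELDS)


def subsidy(register: list) -> float:
    """Sum the subsidy over eligible projects in the register."""
    return sum(p.capacity_tb * RATE_PER_TB for p in register if is_eligible(p))


complete = {key: "recorded" for key in REQUIRED_FIELDS}
register = [
    Project("Project A", 50, complete),
    Project("Project B", 20, {"metadata_1": "recorded"}),  # incomplete metadata
]
print(subsidy(register))
```

In this example only Project A has all 8 items recorded, so the register yields a subsidy of 50 TB at $100 per TB; Project B contributes nothing until its metadata is complete.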

Timeline

29 September, 13 October and 28 October 2022 – Public information sessions about the project

6 October 2022 – Applications for support open

4 November 2022 – Applications for support close

18 November 2022 – Contracts offered to successful applicants from this date

December 2022 – Project begins

December 2023 – Project concludes

Project Documentation

To learn more about the project, please download the following documents:

Phase 4 Data Retention Project documentation [pdf]

Application proforma [doc]

Register template [xls]

Budget template [xls]

Target Outcomes

The Data Retention Program is a stepping stone towards recognisable and enduring research data as a first-class research output, and towards further support via the Data Retention Investment Model.

The Program aims to provide a flexible investment subsidy depending on project requirements. For example, the subsidy can be used for skilled people, capacity, and third-party services. It ensures a manageable and minimal change in operational processes to effect consistent metadata management for research data output.

Contact the ARDC