Building a National High-Resolution Geophysics Reference Collection for 2030 Computation

Universities, industry, and federal and state government agencies have acquired large volumes of geophysical data since the 1950s. The 2030 Geophysics project will make the rawer, high-resolution versions of AuScope and TERN funded data accessible online, and will ensure that they comply with the FAIR and CARE principles and can be integrated with existing government datasets at NCI. The project will also make high-resolution datasets suitable for programmatic access in HPC environments at NCI, and will lay the foundations for more rapid data processing by the next-generation scalable, data-intensive computation of 2030, including Artificial Intelligence (AI)/Machine Learning (ML) and data assimilation.

Start date 1 October 2021
Expected completion date 31 May 2023
Investment by ARDC $400,000
Co-investment partners
Lead node
1 Geophysical data survey:
Conduct a survey of raw and other derivative, associated or other higher-processed geophysical data that could be part of an integrated national high-resolution reference collection. The survey will initially focus on AuScope funded Magnetotelluric (MT) and Passive Seismic (PS) datasets.
2 Data ingest and organisation:
Targeted raw geophysical datasets will be ingested and organised on the NCI filesystem so that they can be (re)processed with computational tools available within the NCI. Derivative versions will be linked back to the source datasets.
3 Data publication:
Geophysical data releases will be discoverable in the NCI data catalogue. Catalogue metadata will be structured to enable ‘vertical’ integration with repositories that hold a higher-level product but need to reference the rawer data at NCI. Data will also be discoverable in Research Data Australia.
4 Data repository coordination
Where derivative data products hosted in other repositories need to reference less processed data at NCI, a review of relevant data catalogues will be undertaken to determine if they comply with the FAIR principles. Gaps and inconsistencies will be identified and priority issues targeted for remedial action.
5 Identifiers
Unique identifiers will be assigned to each version of each dataset, including identifiers for the relevant funding agencies and for the various roles of persons and organisations involved in the acquisition, processing and publication of a dataset.
6 Geophysics data/metadata standard identified
A review will be undertaken to determine international community-preferred standards for raw and derivative geophysical datasets. Related domain-specific vocabulary standards will be assessed, and the vocabulary will be hosted at an ARDC vocabulary service or on fit-for-purpose infrastructure at NCI.
7 Scalable computing and data analysis
Establish software suitable for NCI’s computing environments that focuses on processing raw geophysical data into higher-level products. This includes Jupyter analysis notebooks that make use of NCI’s scalable data analysis software environments.
8 FAIR implementation profiles determined
A candidate FAIR Implementation profile, compliant with current international standards, will be developed that could be used for the whole data ecosystem from acquisition to publication.
9 Completion of projects
Projects are due to be completed by the end of May 2023 and final reports will be published.

Core features

Developing new Multi-geophysical Research techniques
Colocating national high-resolution geophysical reference collections at NCI will enable development of multi-geophysical analytical techniques which can provide new insights into geophysical properties from the surface of the Earth to the core.
Scaling Geophysics to Exascale Research Communities
At exascale, geophysical research will require a community approach with shared community codes built around high-performance, high-resolution datasets, so that geophysicists from different disciplines can collaborate and share their modelling, workflows, results and analysis.
Increase confidence in decision making
Stakeholders will be able to trace transparently from the data products back to the source datasets and to reproduce workflows, which in turn will lead to greater confidence in decision making.

Who is this project for?

  • Researchers
  • Peak bodies
  • Research organisations
  • Infrastructure providers
  • Commercial eInfrastructure providers
  • Government (state and commonwealth)
  • Geophysicists
  • Environmental Researchers
  • Data analysts

What does this project enable?

The 2030 Geophysics project will create a national, high-resolution geophysical data collection that will:

  • Be ‘vertically’ integrated from source datasets at NCI to derivative products;
  • Enable ‘horizontal’ integration of remotely-sensed datasets with observational datasets; and
  • Be linked by identifiers that also cite the roles and organisations involved in any phase of the dataset.
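
As a rough illustration of what ‘vertical’ integration by identifiers could look like in practice, the Python sketch below links each dataset version to the less processed version it was derived from, then traces a product back to its source. It is a minimal sketch under stated assumptions, not the project's catalogue schema: the class, the field names, the `example:` identifiers and the organisation names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DatasetVersion:
    """One version of a dataset. All field names are illustrative assumptions."""
    identifier: str                       # placeholder ID, not a real persistent identifier
    title: str
    derived_from: Optional[str] = None    # identifier of the less processed source, if any
    roles: Dict[str, str] = field(default_factory=dict)  # role -> organisation


def trace_to_source(identifier: str, catalogue: Dict[str, DatasetVersion]) -> List[str]:
    """Follow derived_from links from a product back to its rawest source."""
    chain: List[str] = []
    current: Optional[str] = identifier
    while current is not None:
        if current in chain:
            # Guard against a circular link so the loop always terminates.
            raise ValueError(f"circular provenance at {current}")
        chain.append(current)
        current = catalogue[current].derived_from
    return chain


# Hypothetical records: a raw dataset, a processed product, and a model built on it.
raw = DatasetVersion("example:raw-v1", "Raw survey data",
                     roles={"funder": "Example Funder"})
processed = DatasetVersion("example:processed-v1", "Processed product",
                           derived_from="example:raw-v1",
                           roles={"processor": "Example Organisation"})
model = DatasetVersion("example:model-v1", "Derived model",
                       derived_from="example:processed-v1",
                       roles={"publisher": "Example Repository"})

catalogue = {d.identifier: d for d in (raw, processed, model)}
lineage = trace_to_source("example:model-v1", catalogue)
print(lineage)
```

Storing only a single backward link per version keeps each record small, while the full lineage and the roles attached at each step can still be recovered by walking the chain.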

Handy resources

AuScope
NCI Australia
TERN