Sequencing DNA at population scale leads to a better understanding of the causes of disease, improved diagnosis and detection, and more options for tailored treatments. These outcomes require data that is searchable, securely shareable, and often linked across multiple jurisdictions to create cohorts large enough to identify correlations between DNA sequence and health consequences.

Part of an Australian BioCommons Human Genome Informatics Initiative to realise benefits for national consortia, including Australian Genomics and Zero Childhood Cancer, the Human Genomes Platform project will deliver a services toolbox for improving the FAIRness of genomic data at the institutions that hold most of the human genomes collected for research in Australia. These institutions include the Garvan Institute, Children’s Cancer Institute, QIMR Berghofer Medical Research Institute, and The University of Melbourne Centre for Cancer Research, which together hold data from tens of thousands of individuals.

The Human Genomes Platform project will implement standards and APIs from the Global Alliance for Genomics and Health (GA4GH), and bring the partners' data holdings into alignment with the global human genome repository, the European Genome-phenome Archive (EGA). The project will investigate best practice technologies for human genome data sharing, and deploy a ‘services toolbox’ built on existing browse and search functions in use at participating repositories. This toolbox will replace the manual systems and bespoke solutions currently used by the partners.

Research gains will include fundamental improvements in data management and access to new capabilities, especially the identification of cohorts within and across existing data sets. Critically, the project will also produce a working template that any other institution can adopt and deploy.

1 Search tools
Tools to enable querying across participating repositories will be explored from a technical and operational requirements point of view, and a technical design document will be produced. Systems will then be implemented to search across participating repositories and return the results.
2 Data Access Management Services
Data Access Management Services appropriate for human genomic data (including DUOS and REMS) will be explored from a technical, policy and operational requirements point of view, and a report outlining the findings will be produced. Appropriate services will be implemented at participating repositories to semi-automate the work of Data Access Committees.
3 User Authorisation and Authentication (AuthN/AuthZ)
Community needs around User Authentication (AuthN) and Authorisation (AuthZ) for human genome data sharing will be ascertained through inclusive surveys and workshops and articulated in a report. International solutions will be compared, including the Global Alliance for Genomics and Health (GA4GH) Passports and Authentication & Authorization Infrastructure specifications, the NIH Researcher Auth Service, and CILogon. A National Human Genome AuthN/AuthZ system will be piloted against one of the use cases, as a demonstration and as proof of principle for any potential future deployments.
4 EGA metadata generation and data upload systems
Systems will be deployed to generate European Genome-phenome Archive (EGA) compliant metadata reports from Gen3 or Vectis (including double-deidentification when required), along with semi-automated methods for uploading that metadata to EGA. Accompanying streamlined EGA data upload processes will be deployed across participating repositories, using the encryption (EGACryptor) and file transfer (FTP/Aspera) technologies that EGA requires.
5 Feasibility report
A report will be produced on the feasibility of hosting Local EGA Nodes in Australia and on the optimal housing for genome data (to enable long term preservation as well as availability for recomputing), including the technical, policy and funding requirements.
6 Knowledgebase and Training
A knowledgebase will be produced to enable the community to ask questions and contribute content. Training material on how to use the systems will be produced for both self-paced use and use in workshops. Training events on how to use the systems will be held for researchers, data repository custodians and Data Access Committee members, and for developers and engineers at other Australian providers and institutions wishing to deploy the resources elsewhere.
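
As an illustration of the cross-repository search described under deliverable 1, the following is a minimal sketch in Python and not the project's design. It assumes that each repository answers a query with an aggregate count only, so that no individual records leave the repository. Every name here (`SampleRecord`, `Repository`, `federated_count`, the repository names and the record fields) is hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical record shape; real repositories hold far richer metadata.
@dataclass(frozen=True)
class SampleRecord:
    sample_id: str
    phenotype: str
    consent: str


# Hypothetical stand-in for one participating repository.
@dataclass
class Repository:
    name: str
    records: List[SampleRecord]

    def count_matching(self, predicate: Callable[[SampleRecord], bool]) -> int:
        # Only an aggregate count is returned, never the records themselves.
        return sum(1 for record in self.records if predicate(record))


def federated_count(
    repos: List[Repository],
    predicate: Callable[[SampleRecord], bool],
) -> Dict[str, int]:
    # Send the same query to every repository and collect per-repository counts.
    return {repo.name: repo.count_matching(predicate) for repo in repos}


repos = [
    Repository("repo_a", [
        SampleRecord("a1", "neuroblastoma", "research"),
        SampleRecord("a2", "melanoma", "research"),
    ]),
    Repository("repo_b", [
        SampleRecord("b1", "melanoma", "research"),
        SampleRecord("b2", "melanoma", "clinical_only"),
    ]),
]

# Example cohort query: melanoma samples consented for research use.
counts = federated_count(
    repos,
    lambda r: r.phenotype == "melanoma" and r.consent == "research",
)
total = sum(counts.values())
print(counts, total)
```

Returning counts rather than records is one possible design choice for cohort discovery: a researcher can see whether a large enough cohort exists across repositories before requesting access through the relevant Data Access Committees.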

Core features

Global alignment
Australian systems will be better aligned with global systems, and Australian researchers will be connected more closely to international human genome data sharing initiatives.
Global scale
It will become possible to share genomic data from thousands of Australians securely and responsibly on national and global scales, enabling comparison with very large numbers of other genomes so that the full research value of the data can be realised.

Who is this project for?

  • Researchers who conduct human genome analysis
  • Human genome data repository custodians
  • Data access committee members
  • Developers/engineers at other Australian providers/institutions wishing to deploy the resources elsewhere

What does this project enable?

This project will implement standards and APIs from the Global Alliance for Genomics and Health (GA4GH), and bring the partners' data holdings into alignment with the global human genome repository, the European Genome-phenome Archive (EGA). It will become possible to share genomic data from thousands of Australians securely and responsibly on national and global scales, enabling comparison with very large numbers of other genomes so that the full research value of the data can be realised.

Handy resources

  • European Genome-phenome Archive – EGA
  • Global Alliance for Genomics and Health – GA4GH
  • Full title: “Global technologies and standards for sharing human genomics research data” https://doi.org/10.47486/PL032
Australian BioCommons
The University of Melbourne Centre for Cancer Research (UMCCR)
The Children’s Cancer Institute (CCI)
ZERO Childhood Cancer (ZERO)
Garvan Institute of Medical Research (Garvan)
QIMR-Berghofer Medical Research Institute (QIMRB)
Australian Genomics (AG)
Australian Access Federation (AAF)
National Computational Infrastructure (NCI)
Bioplatforms Australia (BPA)