About the Event
At this meeting of the Australian Sensitive Data Interest Group (AUSDIG), we’ll have two presentations on two synthetic data projects at CSIRO.
Presentations and Speakers
Genomator: generating synthetic genomes
Synthetic data is a valuable resource. We introduce a novel technique for generating synthetic genome data using SAT solvers. The process is demonstrably efficient and accurate, and has interesting privacy properties. We show how the power of SAT solvers can be harnessed to deductively generate such synthetic data, and how this deductive process can be reversed to provide an absolute quantification of the privacy of the results. The resulting private synthetic genome data has potential industry applications.
This presentation will be given by Mark Alexander Burgess, a Research Engineer who has recently joined CSIRO’s Transformational Bioinformatics group. Mark has been applying constraint programming, and SAT techniques in particular, to the creation of synthetic data. He works in C and Python and explores unconventional programming languages.
Learn more about Genomator.
An approach for generating realistic Australian synthetic healthcare data
Healthcare data is a scarce resource, and access is often cumbersome. While medical software development would benefit from real datasets, patient privacy takes priority.
Realistic synthetic healthcare data can fill this gap by providing a dataset for quality control while preserving patients’ anonymity and privacy. Existing methods focus on American or European patient healthcare data, but none focuses specifically on Australia, which has a highly diverse population and a unique healthcare system.
To overcome this problem, we used a popular publicly available tool, Synthea, to generate disease progressions based on the Australian population. With this approach, we were able to generate 100,000 patients following Queensland demographics.
This presentation will be given by Dr Ibrahima Diouf, a Research Scientist in the Health Intelligence team at the CSIRO Australian e-Health Research Centre. Ibrahima has extensive experience in the analytics of observational data. His main research interests include statistical methodologies for biomedical research, and he has experience in developing and applying causal inference methods.
Recording
This event will be recorded. The recording will be provided to all registrants and published here.
About AUSDIG
AUSDIG provides an opportunity for anyone interested in discussing the challenges and strategies for managing sensitive data. To watch previous meetings and join the mailing list, visit the AUSDIG website.
Do you have questions about this event? Email [email protected].