This ARDC series aims to drive recognition of research software and its authors. Each month, we talk to leading actors in the research software engineer (RSE) space and share their experience creating, sustaining and improving software for research.
This month we talked with Dr Sawitchaya (Nancy) Tippaya, a Senior Data Scientist at the Curtin Institute for Data Science (CIDS). Nancy has a background in electronics and engineering with expertise in computer vision, health and sports data analytics, and hardware design analysis. She previously worked with biostatisticians and epidemiologists at Curtin University to develop an antenatal prediction model, and with the WA Department of Health to develop machine learning and natural language processing models to extract cancer staging data automatically. She joined the CIDS in 2019 and is currently working on developing predictive modelling and machine learning applications using various data, including sensor, imagery and video data.
Nancy has worked on multiple projects in domains ranging from health research to astronomy, each requiring unique knowledge and skills. In this interview, she discusses her approach to adapting her expertise to such diverse fields.
What are your day-to-day tasks at CIDS?
As a Senior Data Scientist at CIDS, I work on research and industry projects driven by artificial intelligence or machine learning (AI/ML). My role includes data exploration, model development, and collaborating with the CIDS software team to deploy scalable solutions. I engage with researchers and industry partners to scope projects and translate data challenges into actionable plans. A typical day involves coding, data analysis, model refinement, and solving technical challenges. I also collaborate with fellow data scientists, share knowledge, and contribute to publications and presentations.
Our projects include developing an automated cancer staging system in collaboration with the WA Department of Health, enhancing a mental health chatbot using large language models (LLMs), improving mining operations through machine vision, and more.

You work across government, industry and academia, which have their own expectations. How do you switch between the different contexts?
First, you need to understand the culture, goals, and expectations of each sector. Since I often work on distinct projects, I stay adaptive, ensuring I have a clear plan and a deep understanding of each project’s objectives. Effective communication is also key to aligning with stakeholders and addressing their unique needs. When switching contexts, I focus entirely on the task at hand, minimising distractions to maintain efficiency. Balancing structured planning with flexibility helps me navigate diverse projects while delivering project outcomes across different domains.
A good example would be juggling the cancer staging project and a mining industry project. These two projects are in completely different domains, each with its own internal processes, which means I have to switch contexts frequently. On top of that, my role is not just about coding; I also manage technical aspects of the projects. It took some time and effort to adapt at first, but the key strategies I mentioned earlier have really helped me handle these transitions more smoothly.
What transferable skills or techniques have you developed that influenced your approach to designing practical data science workflows?
In recent years, the rise of AI/ML has expanded projects beyond just model training to include deployment and SaaS implementation. This shift has shaped my approach to designing practical data science workflows. Key transferable skills include writing explainable and maintainable code, ensuring clear documentation, and adopting software development best practices. Collaborating with software engineers has strengthened my skills in model deployment and scalability. I continuously learn from colleagues and embrace software development principles to ensure AI/ML solutions are not only effective but also robust, reproducible, and aligned with real-world applications.
What data challenges do you most frequently observe when developing predictive models using diverse data types?
From my experience, developing predictive models with diverse data types presents challenges such as data quality issues, format inconsistencies, and missing values – sometimes even key features. Structured and unstructured data require different preprocessing techniques. Handling imbalanced datasets, feature engineering, and mitigating data bias are also critical. Scalability and computational efficiency become concerns with large datasets, as do storage limitations. An effective data management plan or workflow and strong domain knowledge are essential to overcoming these challenges and ensuring models remain accurate, reliable, and applicable in real-world scenarios.
What criteria do you use to define which computer infrastructure to use, particularly when co-designing solutions with partners?
When choosing computer infrastructure, the first step is defining the project scope – what we are building, how much data we are handling, and the computing power needed. One useful guide is the Technology Readiness Levels for ML Systems, which helps set clear expectations for all partners. Good communication with partners is key to making sure we are aligned on goals and requirements. I also consider risks like scalability, security, and costs, and plan ways to manage them. The goal is to pick an infrastructure that is reliable, efficient, and fits both the project’s needs and long-term plans.
How do you keep learning?
AI/ML is evolving so fast that keeping up can be a challenge! I focus on domains I enjoy and want to grow in, making learning more engaging. I use online resources, work on hands-on projects, and learn from colleagues. Being part of communities and networking with experts help me stay updated on industry trends. I also follow key researchers and practitioners on social media for the latest updates. Continuous learning isn’t just about staying current – it’s about exploring new ideas, collaborating, and applying knowledge in real-world projects.
Keep In Touch
You can connect with Nancy via LinkedIn.
If you’d like to be part of a growing community of RSEs in Australia, become a member of RSE-AUNZ – it’s free!
ARDC-Sponsored Research Software Awards
The ARDC is proud to sponsor awards for research software and research software engineers in all stages of their careers. The goal of the awards is to strengthen the recognition of research software and those who develop and maintain it as being vital to modern research.
The ARDC continues to sponsor research software awards for 2025.
The ARDC is funded through the National Collaborative Research Infrastructure Strategy (NCRIS) to support national digital research infrastructure for Australian researchers.