This ARDC series aims to drive recognition of research software and its authors. Each month, we talk to leaders in the research software engineering (RSE) space and share their experience creating, sustaining and improving software for research.
Each year, thousands of research projects produce software. In the US, around 20% of NSF-funded grants are software-intensive; in Australia, nearly half of ARC grants between 2010 and 2019 resulted in software infrastructure, tools or code. Public archives like Software Heritage now host over 80 million software projects, highlighting the vast scale of code underpinning modern science.
Zara Hassan from the Australian National University has been researching reproducibility debt – the gap between how research software is built and what’s needed to ensure that the results it produces can be trusted and reused. This investigation resulted in a practical framework to help researchers and research software engineers manage this debt.
What is reproducibility debt?
Zara explains, “In my research, I use the term ‘reproducibility debt (RpD)’ to describe the gap between how scientific software is developed and what is needed to make its results reproducible.” At its core, this debt is largely driven by human-centric issues, such as limited time and resources, inadequate documentation practices, lack of training, and communication gaps within teams. These challenges are compounded by technical factors, including missing dependencies, unstable computing environments, incomplete or missing data and metadata, non-standardised formats for data, and insufficient unit testing.
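To make one of these technical factors concrete, the short sketch below records the interpreter, platform and exact package versions alongside a result, so that untracked dependencies are captured rather than silently lost. This is a minimal illustration using only the Python standard library; the function name and the example package list are assumptions for this article, not part of Zara's framework.

```python
import json
import platform
import sys
from importlib import metadata

def capture_environment(packages):
    """Record interpreter, platform and exact package versions.

    Storing this snapshot next to each result makes missing or
    untracked dependencies visible instead of silently lost.
    """
    return {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": {name: metadata.version(name) for name in packages},
    }

# Illustrative package list; replace with your project's real dependencies.
snapshot = capture_environment(["pip"])
print(json.dumps(snapshot, indent=2))
```

Committing a snapshot like this with every published result is a small habit that closes one of the most common reproducibility gaps.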
Zara continues, “To help researchers and RSEs understand this challenge, I have developed a conceptual model for reproducibility debt. It frames reproducibility debt as the interaction of short-term sub-optimal activities, developer challenges and technical issues.”
Through a systematic literature review of 214 primary studies, coupled with interviews and surveys with practitioners, Zara found that these issues are spread across multiple disciplines. Domain scientists and RSEs often have limited incentives to prioritise reproducibility. Moreover, they often lack the training and technical support needed to capture and share the full computational context. If left unmanaged, this debt compounds, making it harder for future researchers to reproduce or expand prior work. Reproducibility is not just a technical challenge but a cultural one, requiring a shift in practice, recognition and support for both researchers and RSEs.
What are the common barriers researchers face when trying to reproduce results?
Barriers can be technical, cultural or caused by short-term practices that accumulate debt. They can be categorised as data-, code-, documentation-, human- and organisational-, or tool-centric. For example, an open-source research software package may have missing or incomplete data, poorly documented code, untracked dependencies or outdated tools. This makes it nearly impossible to reproduce results accurately.
Limited training, strict timelines and inadequate incentives also lead researchers to adopt quick fixes that compromise reproducibility.
To help teams understand and prevent these barriers, I developed probabilistic cause and effect diagrams. These diagrams show which causes are most likely to contribute to reproducibility debt and which effects will likely occur. Combined with the conceptual model, they provide a theoretical and practical framework, allowing researchers and RSEs to understand, communicate and address reproducibility challenges.
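The idea behind a probabilistic cause-and-effect view can be sketched in a few lines. The causes and likelihoods below are invented purely for illustration – the real diagrams in Zara's research assign evidence-based values – but the sketch shows how a team might rank likely contributors to target first.

```python
# Hypothetical cause -> likelihood-of-contribution map. These values are
# invented for illustration and are NOT taken from the study itself.
causes = {
    "untracked dependencies": 0.35,
    "missing metadata": 0.25,
    "incomplete documentation": 0.20,
    "no unit tests": 0.12,
    "non-standard data formats": 0.08,
}

# Rank causes so the team can address the most likely contributors first.
ranked = sorted(causes.items(), key=lambda kv: kv[1], reverse=True)
for cause, likelihood in ranked:
    print(f"{likelihood:.2f}  {cause}")
```

Even this toy ranking conveys the framework's core move: turn a vague sense of "reproducibility risk" into an ordered, discussable list.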
What prevention strategies do you recommend to reduce challenges related to reproducibility?
Prevention is always more effective than repair. My research shows that reproducibility issues can be addressed with less effort earlier in the development cycle. By adopting systematic software engineering practices and using the appropriate technical tools, much of the debt can be avoided.
Researchers and RSEs can adopt the applicable prevention strategies from the list I have compiled from existing literature and practitioner insights. This means the strategies are evidence-based and grounded in real-world experience. They serve as practical guidelines to prevent reproducibility debt or mitigate its effects if it has already been incurred, helping researchers and RSEs reduce risks early, strengthen reproducibility and ensure the long-term sustainability of scientific software. Read a complete list of developed strategies.
What are the long-term consequences of irreproducible research on policy, funding, and public trust?
Over time, irreproducible research generates ‘interest’ – extra effort, cost and risk that future research must bear. Much of this debt can be avoided if reproducibility is managed properly.
Policymakers might make decisions based on uncertain evidence. Funding bodies might invest in projects that cannot be extended, and public trust in science can erode. Small gaps in documentation, code, data or computational environments, if not addressed early, compound across projects and disciplines. Over time, irreproducible software slows scientific progress, wastes resources and reduces confidence in research outcomes.
How can institutions foster a culture of reproducibility?
Reproducibility is at the heart of trustworthy science, yet it often takes a back seat in the face of short-term pressures like publishing quickly.
Institutions can foster a culture of reproducibility through concrete actions:
- providing training programs and workshops on reproducible workflows
- offering mentorship and guidance from experienced researchers or research software engineers (RSEs)
- establishing clear guidelines and standards
- recognising reproducible practices in promotions or evaluations.
Incentives for both researchers and RSEs to embed reproducibility in their workflows help make these practices routine. Crucially, institutions should allocate dedicated time for reproducibility activities, much like the software industry schedules time to manage technical debt. This gives researchers the opportunity to properly clean and document code, organise data and address accumulated debt without the pressure of publication. By doing so, researchers can produce more reliable, well-documented and reproducible work.
How can your research be applied in a research software project?
My research provides a practical framework for managing reproducibility debt.
It helps researchers and RSEs identify, measure and prevent reproducibility debt by systematically analysing risks and applying prevention strategies. Central to this framework are probabilistic cause-effect diagrams, which help teams identify potential causes and effects of reproducibility debt, prioritise the most critical risks, and guide the application of prevention strategies across the project lifecycle.
To apply the framework in practice, researchers and RSEs can follow these steps (open the accompanying diagram in full size):
1. Audit your scientific workflow
2. Map causes with probabilistic diagrams
3. Prioritise interventions
4. Implement best practices
5. Monitor and iterate

If certain debt items are identified in your project but cannot be resolved or paid off immediately, make sure to document them clearly along with the prevention strategies that should be applied in the future. This ensures that other team members are aware of these issues and that the debt can be systematically addressed later.
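A debt item recorded this way needs little more than a description, a planned prevention strategy and a status. Here is a minimal, hypothetical register sketch – the field names and example entry are assumptions for illustration, not a prescribed schema:

```python
from dataclasses import dataclass

@dataclass
class DebtItem:
    """One unresolved reproducibility debt item and its planned fix."""
    description: str
    prevention_strategy: str
    status: str = "open"

# Example register; in practice this could live in the repository
# alongside the code so every team member sees the outstanding debt.
register = [
    DebtItem(
        description="Simulation depends on an unpinned library version",
        prevention_strategy="Pin versions in a lock file before release",
    ),
]

open_items = [item for item in register if item.status == "open"]
print(f"{len(open_items)} open debt item(s) to address later")
```

Keeping the register under version control means the debt is visible in code review, not buried in someone's notes.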
As research software continues to scale in complexity and influence, so too does the impact of reproducibility debt. With billions of dollars invested in software-driven research and over 80 million projects already archived, the cumulative cost of reproducibility debt in wasted time, unreliable findings and lost trust is likely substantial.
Addressing this debt is therefore essential to scientific progress. Through my work, I aim to make reproducibility debt visible, manageable and ultimately preventable, helping ensure that the science built on code remains credible and enduring.
Keep In Touch
You can connect with Zara via LinkedIn and her personal website.
If you’d like to be part of a growing community of RSEs in Australia, become a member of RSE-AUNZ – it’s free!
Now Open: ARDC-Sponsored Award for New Developers of Open Source Software in Ecology
The ARDC is proud to sponsor awards for research software and research software engineers in all stages of their careers. The goal of the awards is to strengthen the recognition of research software and those who develop and maintain it as being vital to modern research.
The Ecological Society of Australia (ESA) has an ARDC-sponsored award for New Developers of Open Source Software in Ecology. It aims to support efforts to develop and share methodology, models and data in ecology and management of Australia’s ecological communities. It will also focus on supporting researchers new to software development.
Entries are now open. Learn more and apply by Sunday 31 May.
Closing Soon: 2026 International Research Software Engineering (RSE) Survey
The Software Sustainability Institute has launched the 2026 International RSE Survey to help better understand what research software engineers need and how the community can be supported. The survey produces an incredibly valuable trove of data that anyone can use to understand the RSE community, including national associations, funders and policymakers.
The ARDC is funded through the National Collaborative Research Infrastructure Strategy (NCRIS) to support national digital research infrastructure for Australian researchers.