Shaping Research Software: An Interview with Dr Kate Harborne

We spoke to Dr Kate Harborne, an astrophysicist and Research Associate at the International Centre for Radio Astronomy Research who specialises in numerical simulations and galaxy evolution. Kate was given the 2024 ARDC-sponsored Emerging Leaders in Astronomy Software Development Prize by the Astronomical Society of Australia for developing SimSpin, a software package that bridges a tremendous gap between theoretical and observational astronomy.

The ARDC is working with the research community to drive recognition of research software. Our purpose is to provide Australian researchers with competitive advantage through data. Each month, we talk to a leading research software engineer (RSE), sharing their experience and tips on creating, sustaining and improving software for research. 

This month, we spoke to Dr Kate Harborne, an astrophysicist and Research Associate at the International Centre for Radio Astronomy Research (ICRAR) at the University of Western Australia. 

A specialist in numerical simulations and galaxy evolution, Kate was given the 2024 ARDC-sponsored Emerging Leaders in Astronomy Software Development Prize by the Astronomical Society of Australia (ASA) for developing SimSpin, a software package that enables consistent comparisons between real-life and simulated observations of galaxies. SimSpin bridges a tremendous gap between theoretical and observational astronomy.

How did you come to be in your current role?

As an extra-galactic astrophysicist, I look at galaxies beyond our own Milky Way to try to understand how they have grown and changed over time. One can do this through observations or simulations, and for me it’s the latter: building computer models of galaxies using mathematics that reflects first-principles physics.

A rotating simulation of a galaxy
An isolated galaxy model made up of 4,555,392 particles with the stars traced in green and the gas in red. This simulation used the GalIC code and evolved under the force of gravity using Gadget2.

Simulations are a powerful tool for designing cosmic experiments. In observations, we can capture snapshots of a galaxy at a given point in time, but with a simulation, we can model changes through time from any number of projections. By modelling from first principles, we may determine for certain how a feature came to be. It’s an incredibly creative as well as mathematical process, which is why I was drawn to this career in the first place.

What is SimSpin? How has it evolved?

To determine if a galaxy model is plausible, the key is to visualise the simulation as if we were actually observing it, taking into account the real-life observational noise.  

SimSpin began from one such experiment, where we wanted to understand the effect of Earth’s atmosphere on observations of galaxies that extend across several years. I had built many hundreds of idealised galaxy simulations in isolation and wanted a consistent and repeatable method for analysing them. SimSpin helps with this by allowing users to define two key classes of objects in a simulation:

  • the instrument used to mock-observe the galaxy, specifically an integral field spectrograph (IFS) telescope – parameters include the spatial resolution, the spectral resolution and the field-of-view
  • the setup of the mock observation – parameters include the projection angle, atmospheric conditions and distance at which the simulation is placed from the observer. 

With these properties defined and a simulated galaxy input, we can group particles into pixels both spatially and in velocity space at the telescope’s defined resolution. Having then mimicked noise and the atmospheric effects, SimSpin returns a mock observation of that simulated galaxy as if it had been spied through the telescope of choice.
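To make the binning step concrete, here is a minimal Python sketch of the idea. SimSpin itself is an R package and nothing below is its API: the names Telescope, Observation, Particle, bin_index and mock_cube are hypothetical, the projection is a simple rotation about the x-axis, and the sketch leaves out the noise and atmospheric effects mentioned above.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import cos, floor, radians, sin


@dataclass
class Telescope:
    """Hypothetical instrument description (not SimSpin's own class)."""
    n_pixels: int         # pixels along each side of the square field of view
    pixel_size: float     # pixel width, in the same length unit as the particles
    n_channels: int       # number of velocity channels
    channel_width: float  # channel width, in the same velocity unit as the particles


@dataclass
class Observation:
    """Hypothetical observing setup: only the projection angle is modelled."""
    inclination_deg: float  # 0 = face-on, 90 = edge-on


@dataclass
class Particle:
    x: float
    y: float
    z: float
    vx: float
    vy: float
    vz: float
    mass: float


def bin_index(value, width, n_bins):
    """Bin index on an axis centred on zero, or None if the value falls outside."""
    index = floor(value / width + n_bins / 2)
    return index if 0 <= index < n_bins else None


def mock_cube(particles, telescope, observation):
    """Sum particle mass in each (x pixel, y pixel, velocity channel) cell."""
    inc = radians(observation.inclination_deg)
    cube = defaultdict(float)
    for p in particles:
        # Rotate about the x-axis so the line of sight is the rotated z-axis.
        sky_y = p.y * cos(inc) + p.z * sin(inc)
        v_los = -p.vy * sin(inc) + p.vz * cos(inc)
        ix = bin_index(p.x, telescope.pixel_size, telescope.n_pixels)
        iy = bin_index(sky_y, telescope.pixel_size, telescope.n_pixels)
        iv = bin_index(v_los, telescope.channel_width, telescope.n_channels)
        if None in (ix, iy, iv):
            continue  # particle lies outside the field of view or velocity range
        cube[(ix, iy, iv)] += p.mass
    return dict(cube)


telescope = Telescope(n_pixels=4, pixel_size=1.0, n_channels=4, channel_width=50.0)
observation = Observation(inclination_deg=0.0)
particles = [
    Particle(0.5, 0.5, 0.0, 0.0, 0.0, 60.0, mass=2.0),
    Particle(0.6, 0.4, 0.0, 0.0, 0.0, 70.0, mass=3.0),
    Particle(10.0, 0.0, 0.0, 0.0, 0.0, 0.0, mass=1.0),  # outside the field of view
]
cube = mock_cube(particles, telescope, observation)
```

The first two particles fall in the same pixel and velocity channel, so their masses add; the third lies outside the field of view and is dropped. A real mock observation would go on to convert these sums into spectra or kinematic maps and apply the observational effects.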

These mock observations come in FITS format, just like real-life observations, so they can be easily used in comparisons and are compatible with other observational tools.

Working with Professor Aaron Robotham, a big proponent of open-source code, it was only natural that I wrap up the code into a shareable R package. I was lucky to have plenty of examples from Aaron’s own plethora of R packages – such as magicaxis, celestial, RFits, ProFit, ProFound and ProSpect – and the wonderful R Packages book!

In 2017, I uploaded the first commit to GitHub. The integrated testing and code coverage plug-ins provided by testthat, GitHub Actions and Codecov are particularly useful for ensuring consistency for SimSpin users as I tweak and add new features. Most recently, I’ve also released live documentation and examples on the SimSpin website.

Eight graphs indicating measurements of gas and star properties of a simulated galaxy
Demonstrating some of the possible outputs of a SimSpin mock observation using a MUSE-like telescope on a simulated inclined disk galaxy. On the left are outputs as measured for the properties of its gas, and along the bottom are outputs as measured for the properties of its stars (Harborne, 2023)

What collaborations have you taken part in?

Comparing different cosmological simulations has been a huge part of my collaboration with the MAGPI Survey. MAGPI, or Middle-Ages Galaxy Properties with Integral field spectroscopy, was Australia’s first and largest program using the European Southern Observatory’s Very Large Telescope and the Multi Unit Spectroscopic Explorer (MUSE) instrument. 

As a Principal Investigator on this survey, I have the joy of working with a multi-disciplinary and international team of observers and theorists. From the theory side, we’re looking at four key cosmological simulations and statistically comparing SimSpin mock observations with the exquisite observations from MUSE. 

This survey began operations in late 2020 and has nearly completed its observations. Exciting results are already being published, many of which lean into this direct comparison with simulations, and more works are in preparation.

I also collaborated with Astronomy Data and Computing Services (ADACS), which provides support to Australian astronomers. In 2022, I proposed developing a web application of the SimSpin code with an ADACS merit allocation. The aim was to build an API and web-based GUI that could interface with a containerised version of the R package. This meant that as long as a user was happy to query the API in their language of choice, I could continue to maintain the R package while providing access to the broader astronomical community. 
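As a rough illustration of what querying such an API from another language could look like, here is a small Python sketch that only builds the request and does not send it. The URL, the JSON field names and the function name are all hypothetical placeholders, not the real SimSpin web API.

```python
import json
from urllib.request import Request

# Hypothetical endpoint: a placeholder, not the real SimSpin web API.
API_URL = "https://example.org/api/mock-observations"


def build_mock_observation_request(simulation_id, telescope, inclination_deg):
    """Build (but do not send) a JSON POST request describing a mock observation.

    All field names below are illustrative assumptions.
    """
    payload = {
        "simulation_id": simulation_id,
        "telescope": telescope,
        "inclination_deg": inclination_deg,
    }
    body = json.dumps(payload).encode("utf-8")
    return Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_mock_observation_request("galaxy-001", "MUSE", 60.0)
```

The design point is the one made above: the client only needs to speak HTTP and JSON, so the R package behind the API can keep evolving without users needing to install R.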

Thanks to Liz Mannering, Simon O’Toole and others, a beautiful web app was created around an API that allows users to interact with the code visually and intuitively. This web app has dramatically increased the use of SimSpin around the world, with users coming from Germany, the US and the UK as well as Australia. 

Through the ADACS collaboration, SimSpin has been picked up by Data Central, which hosts and provides access to observational astronomical data. Together, we’re hoping to provide similarly consistent and comparable mock observations of galaxies from more simulations. Watch this space!

In my next academic appointment, I’ll work with Kyle Oman at the University of Durham, who has designed MARTINI, a mock observation code that compares simulations and observations at different wavelengths from those SimSpin targets. This will enable multi-wavelength comparisons of mock observations and expand our understanding of galaxy evolution.

What are your favourite software tools? 

Machine learning is particularly useful for moving from simulations to observations via mocks. I’ve been involved in various ML activities over the last few years. One of them was the Carl Zeiss Stiftung Summer School in August 2023, run by a colleague of mine at the University of Heidelberg in Germany. At the summer school, we learnt about auto-differentiation tools embedded within Julia and JAX, interpretable AI, and how social media algorithms might help us find similarities in astrophysical data. I’m looking forward to applying these skills to explore our MAGPI data.

I also love standards. It’s understandable that different simulations don’t necessarily output their data in the same way, but it causes headaches trying to accommodate these differences. Teams such as EAGLE and IllustrisTNG have been doing standardisation work for some time now. Using HDF5 files, you can embed the data’s units within the dataset’s attributes. I hope to see this become the standard as bigger and better simulations go public.
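For readers who haven’t seen units stored this way, here is a minimal Python sketch using the h5py and NumPy libraries. The attribute name "unit", the dataset path and the helper functions are illustrative assumptions; each simulation team defines its own attribute names and conventions.

```python
import h5py
import numpy as np


def write_with_units(h5file, name, values, unit):
    """Store an array and record its unit in the dataset's attributes."""
    dataset = h5file.create_dataset(name, data=np.asarray(values, dtype="f8"))
    dataset.attrs["unit"] = unit  # "unit" is an illustrative attribute name
    return dataset


def read_with_units(h5file, name):
    """Return the stored array together with the unit recorded alongside it."""
    dataset = h5file[name]
    return dataset[...], dataset.attrs["unit"]


# Use an in-memory file so the sketch leaves nothing on disk.
with h5py.File("demo.hdf5", "w", driver="core", backing_store=False) as f:
    # The dataset path is only an example; intermediate groups are created automatically.
    write_with_units(f, "PartType4/Masses", [1.0e6, 2.5e6], "Msun")
    masses, mass_unit = read_with_units(f, "PartType4/Masses")
```

Because the unit travels inside the file next to the numbers, any tool that reads the dataset can check or convert units without relying on separate documentation.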

Finally, Docker is simply amazing at enabling code deployment on any number of different computing clusters. When working with many thousands of galaxies from different universe models, it’s best to run SimSpin in a containerised and isolated environment on a supercomputer. Docker was also used as the base container for the web application API.

Which communities are you part of and do you recommend?

  • The Astronomical Society of Australia (ASA) promotes collaboration, training and awareness of our research. The ARDC-sponsored Emerging Leaders in Astronomy Software Development Prize is a great initiative by the ASA to encourage and recognise the impact of well-designed, sustainable software on astronomy’s development.
  • STEM Women is brilliant for connecting with STEM experts identifying as women. I have been part of this group for a number of years, and through school talks and visits with the ICRAR Outreach and Education team, I’ve seen a positive impact on school kids with a passion for science and astronomy.  
  • The Astrophysics Source Code Library (ASCL) is a great repository for open-source astronomy software code. It assigns a citable Digital Object Identifier (DOI) to code bases (ascl:1903.006 for SimSpin) and helps promote software as an important part of astronomical research.
  • Astro3D is the current astronomy-focused ARC Centre of Excellence. It’s been integral in supporting astronomers and astronomy RSEs. At the concluding Science Meeting held in Sydney earlier this year, I was ecstatic to hear about the many teams using SimSpin. Alongside the ASA, they have enabled so much of my collaboration and SimSpin wouldn’t have had the same impact without Astro3D.
Kate Harborne receiving the Emerging Leaders in Astronomy Software Development Prize at the 2024 ASA Annual Scientific Meeting in Perth

Keep In Touch

You can connect with Kate via GitHub, LinkedIn and the ICRAR website.

If you’d like to be part of a growing community of RSEs in Australia, become a member of RSE-AUNZ – it’s free!

Research Software Award Updates

The ARDC is proud to sponsor awards for research software and research software engineers in all stages of their careers. The goal of the awards is to strengthen the recognition of research software and those who develop and maintain it as being vital to modern research.

The ARDC continues to sponsor a wide range of research software awards for 2024, some of which are now open:

The Ecological Society of Australia (ESA) has an ARDC-sponsored award for New Developers of Open Source Software in Ecology.

Entry deadline (extended)

Friday 2 August 2024

Who should apply

New software developers from academia, industry or government working on developing open source software for ecology

Further information

Learn more about this award on the ESA website.

The Statistical Society of Australia (SSA) Bill Venables Award is for new developers of open source software for data analytics, sponsored by the ARDC.

Entry deadline (reopened)

Friday 1 November 2024

Who should apply

Early to mid-career researchers/developers of new open source software primarily developed in Australia.

Further information

Learn more about this award on the SSA website.

Also stay tuned for the announcement of the finalists for the 2024 Eureka Prize for Excellence in Research Software. Read about the 2023 finalists and winners.

The ARDC is funded through the National Collaborative Research Infrastructure Strategy (NCRIS) to support national digital research infrastructure for Australian researchers.

Author

Kate Harborne (ICRAR)

Reviewed by

Nick Jenkins, Dr Paula Andrea Martinez, Jason Yuen (ARDC)
