Shaping Research Software: An Interview with Ben Foley

We spoke with Ben Foley, Language Data Scientist at the University of Queensland School of Languages and Cultures, who is now assisting with software development at the ARDC-supported Language Data Commons of Australia (LDaCA).

As part of our Research Software Agenda for Australia, the ARDC is working with the research community to shape better research software. Each month, we talk to a leading research software engineer (RSE), sharing their experience and tips on creating, sustaining and improving software for research.

This month, we spoke with Ben Foley, Language Data Scientist at the University of Queensland School of Languages and Cultures. Having previously worked on a wide range of language technologies including speech recognition, Ben is now assisting with software development at the ARDC-supported Language Data Commons of Australia (LDaCA), which is making nationally significant language data available for academic and non-academic use.

Tell us about your background and academic interests. How did you become an RSE?

Since I was a kid, I’ve been interested in the nexus of art, design and technology. After finishing high school, and a brief taste of engineering and graphic design undergraduate courses, it was a delight to find that Toowoomba TAFE ran an extraordinary certificate course mixing art, design and tech. Lecturers had international artistic credibility, there was a lab with the latest computers, and the course offerings mixed programming sessions with life drawing classes. This was a foundational experience in developing my awareness of cross-disciplinary learning and experimentation.

After this year of artistic exploration, I enrolled at QUT in a new course mixing art, design and tech. Here, I learnt to use high-end audio-visual, 3D graphics, animation and multimedia gear. My final year was a major in printmaking combined with programming.

After graduation, I worked as a graphic designer in Sydney, then ended up in the Kimberley region of Western Australia. I had my PowerBook G3 with me, and got into making language resources with the Kimberley Language Resource Centre. This was another foundational influence, where I learnt that the process of making resources is often more important than the end product.

In Central Australia some years later, I worked with the Batchelor Institute language program. We combined media training, the use of language in artistic representation of stories, and cutting-edge technologies in language research projects. These projects had academic and non-academic community interests at their heart, typically with a focus on apps and websites for language content and research.

Give us a brief overview of your software work.

Depending on the project, I work in different roles. I have managed projects, developed software, worked on user experience, interaction and visual design, and reviewed code. The Transcription Acceleration Project (TAP) with the ARC Centre of Excellence for the Dynamics of Language (CoEDL) gave me an opportunity to work across many roles, and again to work in a rich cross-disciplinary research space.

In my current role with the ARDC-supported Language Data Commons of Australia (LDaCA), I draw on these many years of experience of working with people and language materials. Now I am designing ways of working with software and data to ensure that language material is appropriately accessible in the future. This work focuses on organising data for longevity. By now I have a lived experience of how people work with data, and an understanding of what may be ideal but unattainable, and what’s realistic in terms of working with language collections.

Tell us about some of your software projects. How were they conceived, and what are their applications?

TAP led to development of the Elpis speech recognition tool. At the time, automatic speech recognition (ASR) technology was well beyond the reach of a regular working linguist. The Elpis project aimed to make speech recognition technology accessible for people who didn’t have the knowledge required to install and use ASR tools. The Elpis ASR tool has been used as a component in a translation pipeline, as well as providing transcriptions for linguistic and language teaching research applications.

Another project that I’m particularly proud of is the First Languages Australia Gambay map. The Gambay map reflects the many Aboriginal and Torres Strait Islander languages, along with videos of people telling their language stories. It’s a valuable resource for understanding our country’s cultural diversity. The map has grown from a quirky PHP framework into a very neat React project using custom Mapbox map tiles and GeoJSON data.

More recently, my work with LDaCA is helping to ensure that language collections are not lost due to technical or administrative obsolescence. This work is informed by a lifetime’s interaction with many types of language workers, and awareness of the idiosyncrasies of how people work with language material.

What impact have your projects had?

The projects I work on now tend to have small numbers of users. It’s a change from my early days of designing websites for global audiences! It’s important to recognise that impact can be quite significant for individuals. Many hours of laborious manual work may be saved by providing a linguist with access to a transcription tool. And sometimes the impact is unexpected. For example, I worked on a story mapping project recently, collaborating with a researcher to write data analysis code. Understanding a computational way of thinking significantly changed their approach to research and the types of questions that could be opened up.

Ben Foley speaking at a University of Melbourne branded podium next to a pull-up banner for the LDaCA project
Ben presenting at the LDaCA Co-Design Workshop in February 2024. Image: David Hannah / ARDC

Keep In Touch

You can connect with Ben via GitHub.

If you’d like to be part of a growing community of RSEs in Australia, become a member of RSE-AUNZ – it’s free!

LDaCA at Upcoming ARDC Events

LDaCA is one of the 6 focus areas of the ARDC HASS and Indigenous Research Data Commons (HASS and Indigenous RDC).

Research Software Awards Open

The ARDC is proud to sponsor awards for research software and research software engineers in all stages of their careers. The goal of the awards is to strengthen the recognition of research software and those who develop and maintain it as being vital to modern research.

The ARDC continues to sponsor a wide range of research software awards for 2024, including:

Award now open

The Ecological Society of Australia (ESA) has an ARDC-sponsored award for New Developers of Open Source Software in Ecology.

Entry deadline (extended)

Friday 2 August 2024

Who should apply

New software developers from academia, industry or government working on developing open source software for ecology

Further information

Learn more about this award on the ESA website.

Awards opening soon

This award from the Australian Bioinformatics and Computational Biology Society (ABACBS), sponsored by the ARDC, recognises an outstanding early- or mid-career researcher (EMCR) who develops bioinformatics software in the Australian community, with a view to promoting further efforts to develop and share bioinformatics methodologies.

Entry opening

Second half of 2024

Who should apply

Early and mid-career developers of bioinformatics software

Further information

Learn more about this award on the ABACBS website.

This award from the Australian Bioinformatics and Computational Biology Society (ABACBS) recognises bioinformaticians from the Australian community who provide outstanding levels of support and maintenance for widely used bioinformatics software. It aims to promote efforts among community members to document, maintain and support high-quality bioinformatics software.

Entry opening

Second half of 2024

Who should apply

Maintainers of bioinformatics software

Further information

Learn more about this award on the ABACBS website.

Learn more about the ARDC’s Research Software Agenda for Australia.

The ARDC is funded through the National Collaborative Research Infrastructure Strategy (NCRIS) to support national digital research infrastructure for Australian researchers.

Author

Ben Foley (UQ)

Reviewed by

Nick Jenkins, Dr Paula Andrea Martinez, Jason Yuen (ARDC)
