Why software citation is important
Software is pervasive in research. A UK Research Software Survey of 1,000 randomly chosen researchers showed more than 90% of researchers acknowledge software as being important for their own research and about 70% say their research would not be possible without it.
A separate study, "The top 100 papers", analysed the top 100 Nature papers and found that the vast majority described experimental methods or software that have become essential in their fields. These studies provide evidence that software plays an important role in research and should be treated in the same way as other research inputs and outputs, such as research data and paper publications.
Proper citation of software has the following benefits:
- ensures scientific transparency and reasonable accountability of a researcher
- aids scientific reproducibility through direct, unambiguous references to the precise software used in a particular study
- provides fair credit for software developers or researchers who spend time developing software
- assists in tracking the use and reuse of software through reference in scientific literature and within other software
- helps developers verify how their software is being used.
How to cite research software
In general, software should be cited in a similar fashion to data and research papers. The core required elements of a citation are:
- Author(s) – the people or organisations responsible for the intellectual work to develop the software.
- Publication Year – the year when the software was published to a repository or any other publication venue.
- Title – the formal title of the software/service.
- Version – the precise version of the software used. Careful version tracking is critical for accurate citation.
- Publisher – the repository where the software is held, archived, distributed, released or produced, ideally an institutional or disciplinary repository that provides curation of software over the long term. For example, Climate Data Gateway at NCAR, NASA Earth Exchange, Zenodo, GitHub.
- Locator/Identifier – a persistent identifier (PID) for the software, such as a DOI, Handle or ARK, that resolves to the landing page. Using a DOI is considered best practice for software citation. A DOI is a unique, persistent identifier that can be used to track software citation metrics and to link related research outputs such as journal articles and research data.
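The core elements listed above can be captured in a small record and checked for completeness before a citation is written. The following is a minimal sketch only: the class name, field names, helper function and the example values (including the DOI) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class SoftwareCitation:
    """Core elements of a software citation (field names are illustrative)."""
    authors: List[str]       # people or organisations responsible for the software
    publication_year: int    # year the software was published
    title: str               # formal title of the software or service
    version: str             # precise version used in the research
    publisher: str           # repository that holds or distributes the software
    identifier: str          # ideally a persistent identifier such as a DOI


def missing_elements(citation: SoftwareCitation) -> List[str]:
    """Return the names of core elements that are empty or unset."""
    return [f.name for f in fields(citation) if not getattr(citation, f.name)]


# Hypothetical record with the version left blank to show the check.
record = SoftwareCitation(
    authors=["Doe, Jane"],
    publication_year=2020,
    title="Example Analysis Toolkit",
    version="",
    publisher="Example Repository",
    identifier="https://doi.org/10.1234/example",
)
print(missing_elements(record))  # ['version']
```

A check like this makes it harder to forget the version, which is the element most often omitted and the one that matters most for reproducibility.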
Various international organisations have been working to develop guidelines for software citation. Examples of these include:
- Force11 Software Citation Principles developed CHORUS, a centralised software citation policy index with links to the publishers' sites.
- Katz DS, Chue Hong NP, Clark T et al. Recognizing the value of software: a software citation guide [version 2; peer review: 2 approved]. F1000Research 2021, 9:1257 https://doi.org/10.12688/f1000research.26932.2
- DataCite Metadata Schema 4.1 (with additions to describe software and examples for software citation)
The DataCite DOI Citation Formatter is a simple web-based tool that takes a DOI and quickly formats the corresponding citation in hundreds of different styles.
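A formatted citation can also be requested programmatically. The sketch below assumes DOI content negotiation, in which a request to https://doi.org/ carries an Accept header of the form "text/x-bibliography; style=..."; whether the online formatter uses exactly this mechanism is an assumption. The function name is hypothetical, and the DOI is the ARDC guide cited later in this document. The code only builds the request; the network call is left commented out.

```python
import urllib.request


def build_citation_request(doi: str, style: str = "apa") -> urllib.request.Request:
    """Build (but do not send) a request for a formatted citation of a DOI."""
    url = f"https://doi.org/{doi}"
    # Assumption: the resolver honours this media type and style parameter.
    headers = {"Accept": f"text/x-bibliography; style={style}"}
    return urllib.request.Request(url, headers=headers)


request = build_citation_request("10.5281/zenodo.5003989")
print(request.full_url)
print(request.get_header("Accept"))

# To fetch the formatted citation (requires network access):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

The returned text still needs to be checked against the target journal's style, since automatically generated citations may omit the version or resource type.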
If a DOI/PID doesn’t exist, a URL can be used instead, but it must be accompanied by the access date:
- Access Date (optional when a DOI is used) – ongoing development of software may not always be reflected in release dates and versions. It is important to indicate when the software was accessed, especially when it is referenced not through a DOI but through a URL indicating the software’s location.
Software citation format
Software citation should follow this general citation format:
Creator (Publication Year): Title. Version No. Publisher. (Resource Type). Identifier.
In the case of software citation, use resource type “Software”.
- Xu, C., & Christoffersen, B. (2017). The Functionally-Assembled Terrestrial Ecosystem Simulator Version 1. Los Alamos National Laboratory (LANL), Los Alamos, NM (United States). (Software). https://doi.org/10.11578/dc.20171025.1962
Where the software is a library that was developed and run on a software platform, for example, a kinetic analysis software library with Matlab (TM) wrappers, it can be cited as follows:
- Dowson, Nicholas; Baker, Charles; Raffelt, David; Smith, Jye; Thomas, Paul; Rose, Stephen; Salvado, Olivier (2014): InsightToolkit Kinetic Analysis (itkka) Software Library. v1. CSIRO. (Software). https://doi.org/10.4225/08/540E9A7D11EB0
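The general format above can be assembled mechanically from the individual elements. The following is a minimal sketch, using the itkka example from this section as input; the function name and parameter names are hypothetical, and joining multiple creators with "; " simply follows that example rather than any particular citation style.

```python
from typing import Sequence


def format_software_citation(
    creators: Sequence[str],
    year: int,
    title: str,
    version: str,
    publisher: str,
    identifier: str,
    resource_type: str = "Software",
) -> str:
    """Format: Creator (Publication Year): Title. Version. Publisher. (Resource Type). Identifier."""
    creator_text = "; ".join(creators)
    return (
        f"{creator_text} ({year}): {title}. {version}. "
        f"{publisher}. ({resource_type}). {identifier}"
    )


citation = format_software_citation(
    creators=[
        "Dowson, Nicholas", "Baker, Charles", "Raffelt, David", "Smith, Jye",
        "Thomas, Paul", "Rose, Stephen", "Salvado, Olivier",
    ],
    year=2014,
    title="InsightToolkit Kinetic Analysis (itkka) Software Library",
    version="v1",
    publisher="CSIRO",
    identifier="https://doi.org/10.4225/08/540E9A7D11EB0",
)
print(citation)
```

The output will usually still need adjusting to the citation style required by the publication venue.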
Where the software does not have a DOI, but is accessible from a URL, a suggested citation format is as follows:
Creator (Publication Year): Title. Version. Publisher. (Resource Type). URL. Access Date.
Again, use resource type “Software”.
- Jones E, Oliphant T, Peterson P, et al. (2001). SciPy: Open Source Scientific Tools for Python. (Software). http://www.scipy.org/ [Online; accessed on 2018-07-26].
When the locator/identifier is a URL that doesn’t point to the exact version used in the research, it is important to include an access date, as this may help to identify the version.
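The URL-based format, including the access date, can be sketched in the same way. All names and example values below are hypothetical, and the "Accessed YYYY-MM-DD" wording is an assumption; use whatever phrasing the target citation style prescribes. Version and publisher are treated as optional here because URL-only sources do not always state them.

```python
from datetime import date
from typing import Optional, Sequence


def format_url_citation(
    creators: Sequence[str],
    year: int,
    title: str,
    url: str,
    accessed: date,
    version: Optional[str] = None,
    publisher: Optional[str] = None,
    resource_type: str = "Software",
) -> str:
    """Format: Creator (Publication Year): Title. Version. Publisher. (Resource Type). URL. Access Date."""
    parts = [f"{'; '.join(creators)} ({year}): {title}."]
    if version:
        parts.append(f"{version}.")
    if publisher:
        parts.append(f"{publisher}.")
    parts.append(f"({resource_type}).")
    parts.append(f"{url}.")
    # ISO 8601 dates avoid day/month ambiguity between regions.
    parts.append(f"Accessed {accessed.isoformat()}.")
    return " ".join(parts)


# Hypothetical software with no DOI, cited by URL.
example = format_url_citation(
    creators=["Doe, Jane"],
    year=2020,
    title="Example Analysis Toolkit",
    url="https://example.org/toolkit",
    accessed=date(2021, 3, 15),
    version="v2.1",
    publisher="Example University",
)
print(example)
```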
Some repositories provide a recommended format for citing software from that repository, but you will need to modify it to match the citation style you are using in your publication.
- The NCAR Command Language (Version 6.4.0). (Software). (2017). Boulder, Colorado: UCAR/NCAR/CISL/TDD. http://dx.doi.org/10.5065/D6WD3XH5
How to make your own software citable
The ARDC provides a simple guide to making your research software citable. It is accompanied by a slide deck and a licensing information guide.
- Liffers, Matthias, & Honeyman, Tom. (2021, July 1). ARDC Guide to making software citable. Zenodo. http://doi.org/10.5281/zenodo.5003989
- Liffers, Matthias. (2021, July). Software publishing, licensing, and citation. Zenodo. http://doi.org/10.5281/zenodo.5091717
- Australian Research Data Commons. (2021, June 21). ARDC Research Software Rights Management Guide. Zenodo. http://doi.org/10.5281/zenodo.5003962
What is research software?
“Research Software” refers to many different types of software, depending on the context in which it is used. Here it means software, in source code or compiled form, created to support research. This ranges from small-scale scripts created by researchers to aid any step within a research project, all the way up to complex software products created by software engineers and used primarily by researchers directly in aid of their research. Most importantly, however, it is the authors of research software who are being asked to make their software citable, so it is up to the authors to identify their software as research software.