FAIR Data Self-Assessment Tool

Use this handy, researcher-friendly tool to discover how findable, accessible, interoperable and reusable (FAIR) your research dataset is and get practical tips on how to enhance its FAIRness.

This tool is divided into 4 sections, focusing on your dataset’s:

  • findability
  • accessibility
  • interoperability
  • reusability.

At the beginning of each section, you will be presented with an explanation of what the relevant principle (findable, accessible, interoperable or reusable) means for your dataset.

You will then answer questions related to the extent to which your dataset aligns with the 4 principles.

Once you have completed a section, you will be given a percentage score showing how closely your dataset aligns with that principle.

When all sections are completed, you will be given an overall percentage score showing how FAIR your dataset is.
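
As a rough illustration of how such percentages could be calculated, the sketch below scores each section as points earned over points available and averages the four sections for the overall figure. The point values, the `example` answers and the unweighted average are all assumptions made for illustration; this page does not state how the tool weights its questions.

```python
from typing import Dict, List, Tuple

# Hypothetical answers: (points earned, maximum points) for each question.
# The real tool's point values and weights are not given on this page.
Answers = Dict[str, List[Tuple[int, int]]]


def section_percentage(scores: List[Tuple[int, int]]) -> float:
    """Return points earned as a percentage of the maximum for one section."""
    earned = sum(e for e, _ in scores)
    maximum = sum(m for _, m in scores)
    return 100.0 * earned / maximum if maximum else 0.0


def overall_percentage(answers: Answers) -> float:
    """Assumption: the overall score is the unweighted mean of the section scores."""
    sections = [section_percentage(s) for s in answers.values()]
    return sum(sections) / len(sections) if sections else 0.0


# Four sections with 4, 3, 3 and 2 questions (12 in total), with made-up points.
example: Answers = {
    "findable": [(2, 2), (1, 1), (1, 3), (2, 4)],
    "accessible": [(3, 3), (1, 1), (0, 1)],
    "interoperable": [(1, 2), (1, 3), (0, 2)],
    "reusable": [(2, 4), (1, 4)],
}

for name, scores in example.items():
    print(f"{name}: {section_percentage(scores):.0f}%")
print(f"overall: {overall_percentage(example):.0f}%")
```

A weighted average (for example, by number of questions per section) would be an equally plausible design; the choice changes the overall figure but not the per-section scores.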

Along the way, you will find explanatory pop-ups (“What is this?”), which clarify key concepts and provide further resources around the FAIR principles.

This tool is designed for:

  • researchers
  • data librarians
  • IT staff.

Research software engineers (RSEs) developing FAIR data tools and services can use the FAIR Software Checklist, developed by the ARDC and the Netherlands eScience Center.

The FAIR principles emerged from the 2016 paper “The FAIR Guiding Principles for scientific data management and stewardship”. FAIR provides a useful framework for thinking about sharing data in a way that will enable maximum use and reuse.

We are continually improving this tool and welcome feedback. Tell us what you think of this tool by completing the feedback form at the bottom of this page.

Findable

The data has sufficiently rich metadata and a unique and persistent identifier to be easily discovered by others. This includes assigning a persistent identifier (e.g. DOI, Handle), having rich metadata to describe the data, and making sure it is findable through data registries, repositories and other discovery portals.

Question 1

Does the dataset have any identifiers assigned?

Identifiers
Identifiers are unique alphanumeric codes that positively identify entities such as people, places and things. Digital Object Identifiers (DOIs) can be used to uniquely identify datasets, and provide a persistent link to their description on the internet. Other persistent identifiers for datasets include PURLs and Handles.
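
To make the idea of a persistent link concrete, here is a minimal sketch that tidies a DOI written in one of several common forms and turns it into a resolvable `https://doi.org/` URL. The function names and the regular expression are illustrative assumptions; the pattern is a practical check for typical modern DOIs, not a full validation against the DOI specification.

```python
import re

# Practical pattern for typical modern DOIs: "10.", a 4-9 digit registrant
# code, "/", then a non-empty suffix. Assumption: this is not exhaustive.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes people commonly put in front of a DOI.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(raw: str) -> str:
    """Strip whitespace and any known prefix, then check the basic DOI shape."""
    value = raw.strip()
    lowered = value.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            value = value[len(prefix):]
            break
    if not DOI_PATTERN.match(value):
        raise ValueError(f"Not a recognisable DOI: {raw!r}")
    return value


def doi_url(raw: str) -> str:
    """Return the persistent link for a DOI."""
    return "https://doi.org/" + normalise_doi(raw)


# The DOI of the 2016 FAIR Guiding Principles paper.
print(doi_url("doi:10.1038/sdata.2016.18"))
```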

Question 2

Is the dataset identifier included in all metadata records/files describing the data?

Question 3

How is the data described by a metadata record?

Metadata record
A metadata record contains information about a dataset that describes characteristics such as content, quality, format, location and contact information as well as identifiers for related people, places or things. It describes physical items as well as digital items and can take many different forms, from free text (such as read-me files) to standardised, structured, machine-readable content.
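
As an illustration of the structured, machine-readable end of that range, the sketch below builds a small metadata record, checks that it carries the dataset identifier and a few other descriptive fields, and serialises it to JSON. The field names, the `REQUIRED_FIELDS` list and all values (including the placeholder DOI) are hypothetical and do not follow any particular metadata standard.

```python
import json
from typing import Dict, List, Sequence

# Hypothetical minimum set of fields; a real schema would define its own.
REQUIRED_FIELDS = ("identifier", "title", "creators", "description", "format", "contact")

record: Dict[str, object] = {
    "identifier": "https://doi.org/10.5555/example-dataset",  # placeholder DOI
    "title": "Example rainfall observations",
    "creators": [{"name": "A. Researcher"}],
    "description": "Daily rainfall totals from two example sites.",
    "format": "text/csv",
    "contact": "data-owner@example.org",  # placeholder address
}


def missing_fields(rec: Dict[str, object], required: Sequence[str] = REQUIRED_FIELDS) -> List[str]:
    """List required fields that are absent or empty in the record."""
    return [field for field in required if not rec.get(field)]


# Structured output that both people and software can read.
record_json = json.dumps(record, indent=2)
print(record_json)
print("Missing:", missing_fields(record))
```

A free-text README can hold the same information, but a check like `missing_fields` is only possible once the record has a predictable structure.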

Question 4

What type of searchable repository or registry is the metadata record in?

Registries and repositories

Generally, registries contain metadata records (data descriptions) only, while repositories contain metadata records as well as providing access and storage for the data itself.

Examples of registries include:


Accessible

The data is retrievable by humans and machines through a standardised communication protocol, with authentication and authorisation where necessary. The data does not necessarily have to be open. Data can be sensitive due to privacy concerns, national security or commercial interests. When data cannot be made open, the conditions governing access and reuse should be clear and transparent.

Question 5

How accessible is the data?

Data access
Data access covers who may access the data and when access may occur (including any embargo). Restrictions may be based on security, privacy or other policies.

Question 6

Is the data available online once access has been approved?

Application programming interfaces (APIs)

Application programming interfaces (APIs) allow computer applications to share and access machine-readable data. There are a number of well-documented APIs used for the exchange of data and metadata. For example, OGC WMS is used for geo-registered map images, and OAI-PMH is used for exchanging repository metadata.

Read our page on standardised communications protocols for more information.
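
To show what a request over one of these protocols looks like, the sketch below builds OAI-PMH request URLs using the standard `verb` and `metadataPrefix` arguments. The base URL and record identifier are hypothetical, and the code only constructs the URLs; it does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used for illustration only.
BASE_URL = "https://repository.example.org/oai"


def oai_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Ask the repository to describe itself.
identify_url = oai_request_url(BASE_URL, "Identify")

# Harvest records as unqualified Dublin Core.
list_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Fetch a single record by its (hypothetical) OAI identifier.
get_url = oai_request_url(
    BASE_URL,
    "GetRecord",
    identifier="oai:repository.example.org:1234",
    metadataPrefix="oai_dc",
)

for url in (identify_url, list_url, get_url):
    print(url)
```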

Question 7

Will the metadata record be available even if the data is no longer available?

Metadata record availability
This ensures that even if the dataset itself is no longer available (it may have been changed, removed or superseded), its metadata record will persist so that the dataset can be cited and followed up in future.

Interoperable

The associated data and metadata use a “formal, accessible, shared, and broadly applicable language for knowledge representation”. This involves using community accepted languages, formats and vocabularies in the data and metadata. Metadata should reference and describe relationships to other data, metadata and information through identifiers. Highly interoperable data makes it easier for researchers to work with data from different sources.

Question 8

What (file) format(s) is the data available in?

Data standards

Data standards support data interoperability, processing and management.

Machine-readable data is data that is in an open and structured format so that it can be automatically read and processed by a computer, e.g. JSON (JavaScript Object Notation), XML (eXtensible Markup Language), RDF (Resource Description Framework) and CSV (Comma-Separated Values). Open file formats can be freely accessed and used with free and open source software. They are not proprietary or locked to specific software.
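
As a small illustration, the sketch below writes the same made-up observations to two open, machine-readable formats (CSV and JSON) and reads the CSV back, using only the Python standard library. The column names and values are invented for the example.

```python
import csv
import io
import json
from typing import Dict, List

# Invented example data; column names are assumptions for illustration.
rows: List[Dict[str, object]] = [
    {"site": "A", "date": "2024-01-01", "rainfall_mm": 12.5},
    {"site": "B", "date": "2024-01-01", "rainfall_mm": 0.0},
]


def to_csv(data: List[Dict[str, object]]) -> str:
    """Serialise rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(data[0].keys()))
    writer.writeheader()
    writer.writerows(data)
    return buffer.getvalue()


def to_json(data: List[Dict[str, object]]) -> str:
    """Serialise rows to JSON text."""
    return json.dumps(data, indent=2)


def from_csv(text: str) -> List[Dict[str, str]]:
    """Parse CSV text back into rows; note that every value comes back as a string."""
    return list(csv.DictReader(io.StringIO(text)))


print(to_csv(rows))
print(to_json(rows))
```

CSV keeps no type information, so numbers return as strings when read back, whereas JSON preserves the distinction between numbers and text. That difference is one reason to document column types in the metadata.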

Question 9

What best describes the types of vocabularies used to describe/define the data elements?

Vocabularies

Data described with vocabularies communicates what the data is and what it is about, such that it can be read and understood by humans and machines. Controlled vocabularies reflect agreements on meaning reached within a project or an organisation, within a domain, or across domains.

There are many types of controlled vocabularies from simple term lists to complex machine-readable ontologies. They include authority files, glossaries, dictionaries, gazetteers, code lists, taxonomies, subject headings, thesauri and semantic networks.

The controlled vocabulary used to describe datasets can be documented and made resolvable using globally unique, and preferably persistent, identifiers.
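
The sketch below illustrates the idea with a tiny, made-up controlled vocabulary in which each agreed term resolves to a unique identifier. The terms and the `vocab.example.org` URIs are hypothetical; a real project would use a published vocabulary service.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical controlled vocabulary: agreed term -> resolvable identifier.
VOCABULARY: Dict[str, str] = {
    "rainfall": "https://vocab.example.org/climate/rainfall",
    "air temperature": "https://vocab.example.org/climate/air-temperature",
    "wind speed": "https://vocab.example.org/climate/wind-speed",
}


def tag_keywords(keywords: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Split keywords into those found in the vocabulary and those that are not."""
    matched: Dict[str, str] = {}
    unmatched: List[str] = []
    for keyword in keywords:
        term = keyword.strip().lower()
        if term in VOCABULARY:
            matched[term] = VOCABULARY[term]
        else:
            unmatched.append(keyword)
    return matched, unmatched


matched, unmatched = tag_keywords(["Rainfall", "precip", "Wind Speed"])
print("Matched:", matched)
print("Needs review:", unmatched)
```

Free-text keywords such as "precip" still help human readers, but only the matched terms carry an identifier that software can resolve.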

Question 10

How is the relationship to other data and resources described in the metadata?

Linked data

Meaningful links to data and resources enrich the contextual knowledge about the data, including how the data integrates with other data and/or interoperates with services/workflows. Examples of related resources include publications, services, organisations, people, grants, provenance and reuse information.

‘Linked data’ is machine-readable because all data is explicitly described in meaning and in format, using agreed standards and published on the internet. URI (or URL) links by themselves do not have this structure, though they still provide useful context for humans.
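
To illustrate the difference, the sketch below describes a dataset in JSON-LD using schema.org terms, so that each link states what kind of relationship it represents (`isBasedOn`, `citation`, `creator`). All identifiers are placeholders, and the choice of schema.org is an assumption for the example; other vocabularies could serve the same purpose.

```python
import json
from typing import Dict, List, Tuple

# All identifiers below are placeholders for illustration.
dataset: Dict[str, object] = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "@id": "https://doi.org/10.5555/example-dataset",
    "name": "Example rainfall observations",
    # Typed relationships: the property name says how the resources relate.
    "isBasedOn": {"@id": "https://doi.org/10.5555/source-dataset"},
    "citation": {
        "@type": "ScholarlyArticle",
        "@id": "https://doi.org/10.5555/example-article",
    },
    "creator": {
        "@type": "Person",
        "@id": "https://orcid.org/0000-0000-0000-0000",
        "name": "A. Researcher",
    },
}


def related_links(doc: Dict[str, object]) -> List[Tuple[str, str]]:
    """Return (relationship, target identifier) pairs for linked resources."""
    links = []
    for key, value in doc.items():
        if isinstance(value, dict) and "@id" in value:
            links.append((key, value["@id"]))
    return links


print(json.dumps(dataset, indent=2))
for relationship, target in related_links(dataset):
    print(f"{relationship} -> {target}")
```

A bare list of URLs would give a reader the same targets, but software could not tell a source dataset from a citing article without the property names.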


Reusable

The associated metadata provides rich and accurate information, and the data comes with a clear usage licence and detailed provenance information. Reusable data should maintain its initial richness. For example, it should not be diminished for the purpose of explaining the findings in one particular publication. It needs a clear machine-readable licence and provenance information on how the data was formed. It should also use discipline-specific data and metadata standards to give it rich contextual information that will allow accurate interpretation and reuse.

Question 11

Which of the following best describes the licence/usage rights assigned to the data and included in the metadata record?

Data licences

Licences are legal statements giving official permission to do something with a dataset. Standard licences for data include the Creative Commons licences. Non-standard licences do not conform to an agreed standard and include custom licences and data sharing agreements.

Machine-readable metadata is metadata in an open and structured format that can be automatically read and processed by a computer, e.g. JSON (JavaScript Object Notation), XML (eXtensible Markup Language), RDF (Resource Description Framework) and CSV (Comma-Separated Values).
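
As a sketch of what a machine-readable licence entry might look like, the example below records a Creative Commons licence by its SPDX identifier and canonical URL, and distinguishes it from a free-text statement. The `licence` field name, the helper functions and the category labels are assumptions for illustration and are not the tool's own answer options.

```python
from typing import Dict

# A few Creative Commons licences keyed by SPDX identifier.
STANDARD_LICENCES: Dict[str, str] = {
    "CC0-1.0": "https://creativecommons.org/publicdomain/zero/1.0/",
    "CC-BY-4.0": "https://creativecommons.org/licenses/by/4.0/",
    "CC-BY-SA-4.0": "https://creativecommons.org/licenses/by-sa/4.0/",
    "CC-BY-NC-4.0": "https://creativecommons.org/licenses/by-nc/4.0/",
}


def licence_metadata(spdx_id: str) -> Dict[str, str]:
    """Build a structured licence entry for a metadata record."""
    if spdx_id not in STANDARD_LICENCES:
        raise KeyError(f"Unknown licence identifier: {spdx_id}")
    return {"id": spdx_id, "url": STANDARD_LICENCES[spdx_id]}


def describe_licence(record: dict) -> str:
    """Classify the licence information in a record (labels are illustrative)."""
    licence = record.get("licence")
    if licence is None:
        return "no licence"
    if isinstance(licence, dict) and licence.get("id") in STANDARD_LICENCES:
        return "standard, machine-readable"
    return "non-standard or free text"


print(describe_licence({"licence": licence_metadata("CC-BY-4.0")}))
print(describe_licence({"licence": "Contact the author before reuse"}))
print(describe_licence({}))
```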

Question 12

How much provenance information has been captured to facilitate data reuse?

Data provenance information
Data provenance metadata:
  • documents why and how the data was produced
  • documents where, when and by whom the data was collected
  • includes elements such as creation date, creator, instrument or software used and data processing methods.

Provenance information can be recorded in a text string (e.g. a single README text file) or in a more structured way by applying generic metadata standards (e.g. Dublin Core) through to discipline-specific metadata standards such as ISO 19115-2 and to highly abstract data models such as the W3C Provenance Data Model (PROV-DM) and Provenance Ontology (PROV-O).

Provenance information can be represented in machine-readable (e.g. in RDF, JSON, NetCDF, XML) and/or human readable form (e.g. textual description).
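
As one possible machine-readable form, the sketch below expresses a simple provenance chain in JSON-LD using terms from the W3C PROV-O namespace (`prov:Entity`, `prov:Activity`, `prov:Agent`, `prov:wasGeneratedBy`, `prov:used`). The `ex:` identifiers, the timestamp and the helper function are hypothetical, and the record is deliberately minimal.

```python
import json
from typing import Dict, List, Optional

# Minimal provenance description; all "ex:" identifiers are placeholders.
provenance: Dict[str, object] = {
    "@context": {
        "prov": "http://www.w3.org/ns/prov#",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        "ex": "https://example.org/",
    },
    "@graph": [
        {
            "@id": "ex:dataset",
            "@type": "prov:Entity",
            "prov:wasGeneratedBy": {"@id": "ex:processing-run"},
            "prov:wasAttributedTo": {"@id": "ex:researcher"},
        },
        {
            "@id": "ex:processing-run",
            "@type": "prov:Activity",
            "prov:used": {"@id": "ex:raw-observations"},
            "prov:wasAssociatedWith": {"@id": "ex:researcher"},
            "prov:endedAtTime": {"@value": "2024-01-31T00:00:00Z", "@type": "xsd:dateTime"},
        },
        {"@id": "ex:raw-observations", "@type": "prov:Entity"},
        {"@id": "ex:researcher", "@type": "prov:Agent"},
    ],
}


def generated_by(doc: Dict[str, object], entity_id: str) -> Optional[str]:
    """Return the identifier of the activity that generated an entity, if recorded."""
    graph: List[dict] = doc["@graph"]  # type: ignore[assignment]
    for node in graph:
        if node.get("@id") == entity_id and "prov:wasGeneratedBy" in node:
            return node["prov:wasGeneratedBy"]["@id"]
    return None


print(json.dumps(provenance, indent=2))
print(generated_by(provenance, "ex:dataset"))
```

The same facts could sit in a README sentence; the structured form simply lets software follow the chain from a dataset back to its inputs.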


Get Personalised Resources And Save Your Results

Get personalised resources to help you make your data more FAIR. You’ll receive an email listing the best resources for you, based on your answers. You’ll also get a unique link so you can return to your assessment later.

Tool Disclaimer

This FAIR Data Self-Assessment Tool was developed by the ARDC. It is provided purely for educational and informational purposes. It is based on our interpretation of the FAIR Data Principles with the acknowledgement that there are other interpretations of the principles. Other tools like the CSIRO’s 5-Star Data Rating tool and the DANS’ FAIRdat tool provided valuable inspiration in developing this tool.

The scores arising from this tool are intended for self-assessment purposes only and to trigger thinking and discussion around possible ways of making data more FAIR.

This tool was updated in October 2024. The code for the original tool is available for reuse on GitHub.