Australian Text Analytics Platform (ATAP)

A powerful toolset for processing and analysing unstructured texts.
Who will benefit
Researchers, research organisations, higher-degree research candidates, text analytics coursework students

The Challenge

Text analytics is the process of enabling data-driven research by extracting and analysing machine-readable information from within unstructured text. 

Because large amounts of unstructured text are increasingly available, techniques for analysing it are becoming more important across research disciplines.

This can range from extracting social and cultural information from texts in the humanities, arts and social sciences (HASS) to extracting machine-readable information from technical texts in engineering and the sciences, helping researchers develop hypotheses and projections.

The Response

Text analytics in research tends to happen either at a basic, generic level handled with standard packages, or through custom code developed specifically for a particular project.

The Australian Text Analytics Platform (ATAP) provides researchers with a toolset that is more powerful and customisable than those contained in the standard packages, while being accessible to a large number of researchers who do not have strong coding skills.

ATAP transforms and accelerates the data-driven research possibilities across disciplines by providing Australian researchers with access to an online platform for processing and analysing unstructured texts. 

The platform includes self-service training in text analytics techniques and promotes greater flexibility and transparency in research workflows. This project aims to foster a community that brings together developers and users of text analytics in an accessible and collaborative environment.

This project involves the following elements:

  • Text analytics notebooks for data processing and analysis – A library of Jupyter Notebooks incorporating open source scripts for cleaning, transforming, analysing, and visualising text data. The ready-to-use notebooks will contain core functionalities that can also be further built upon and customised for more complex text analysis.
  • Text analysis workbench – The workbench is a web-based, authenticated environment that enables researchers to import their own text datasets into an analytics sandbox, such as text data scraped from websites, collections of journal articles, or transcripts of media files. The workbench and its support services allow researchers to customise text analysis notebooks without needing a strong background in coding.
  • Online text analytics training environment – Web-based training in text analytics and community development initiatives (such as hacky hours and user groups) will support the needs of the community of emerging users of text analysis tools. The training environment presents a selection of case studies demonstrating the entire process of text analytics across a range of domains and applications. This is complemented by a series of education and training workshops targeting researchers from beginners through to more experienced practitioners.
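As a rough illustration of the kind of cleaning, transforming and analysing steps that a text analytics notebook cell might contain, here is a minimal sketch using only the Python standard library. It is not taken from the ATAP notebooks; the function names, the small stopword list and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for"}


def clean(text):
    """Lowercase the text and collapse runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def tokenise(text):
    """Split already-cleaned text into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text)


def word_frequencies(text, top_n=5):
    """Return the top_n most frequent non-stopword tokens with their counts."""
    tokens = [t for t in tokenise(clean(text)) if t not in STOPWORDS]
    return Counter(tokens).most_common(top_n)


sample = (
    "Text analytics helps researchers. "
    "Researchers analyse text, and TEXT is everywhere!"
)
print(word_frequencies(sample, top_n=2))
# prints [('text', 3), ('researchers', 2)]
```

A real workflow would typically swap in a fuller stopword list and a dedicated tokeniser, but the clean, tokenise and count structure stays the same.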

Who Will Benefit

Researchers, research organisations, higher-degree research candidates and text analytics coursework students will benefit from the project’s core features:

  • Powerful, accessible tools – Jupyter Notebooks containing ready-to-use, customisable scripts from simple processing tasks through to complex text analyses.
  • Online training – Enables researchers who do not know how to code to undertake text analytics. Case studies across a range of disciplines will demonstrate how to use the available notebooks to produce research outputs.

The Partners

Our partners are:

Outcomes

ATAP supports FAIR data management principles by providing tools that automate the production of transparent, replicable text analytics outputs.

Exploring the large or complex datasets used for text analytics would otherwise require the use of high performance computing resources. 

ATAP makes this exploration possible in a web-based analysis environment with easy access to coding tools and training resources that enable individuals to complete complex text analyses.

Key Resources

To access the platform, tools and training materials, visit the Australian Text Analytics Platform (ATAP).
