A data management plan (or DMP) is a living document that describes:
- what data will be created
- what policies will apply to the data
- who will own and have access to the data
- what data management practices will be used
- what facilities and equipment will be required
- who will be responsible for each of these activities.
How data are managed often depends on the disciplinary context. The list below includes a few examples of DMPs specific to a research domain.
- Best Practices for Preparing Environmental Data Sets to Share and Archive by the Oak Ridge National Laboratory Distributed Active Archive Center.
- LIBER Data Management Planning Catalogue – public DMPs from Europe which have been assessed for robustness and completeness.
Why do I need a data management plan?
Data management is an essential component of planning and delivering a successful research project. A DMP can help to ensure that data generated from the research are well managed and able to be reused efficiently. In the past, data management was typically done at the last minute, using the first method that came to mind, an approach that can be time-consuming and error-prone. Taking time at the start of a research project to put in place robust, easy-to-use data management procedures can pay off several times over in the later stages of the project. Inadequate data management can lead to catastrophes such as the loss of data or the violation of people’s privacy.
The Australian Code for the Responsible Conduct of Research, developed jointly by the National Health and Medical Research Council (NHMRC), the Australian Research Council (ARC) and Universities Australia (UA), outlines the broad principles and responsibilities that underpin the conduct of Australian research, including basic data management requirements. To receive funding from the NHMRC and the ARC, all research projects need to comply with the Code.
What does a data management plan need to cover?
A DMP should cover the topics listed in the table below. Different disciplines have different conventions and requirements for DMPs. To facilitate cooperation, make sure that your data management is compatible with the prevailing standards in your discipline. This mostly applies to file formats and metadata standards, but can extend to specific data repositories and workflows.
|Backup||This is probably the single most important item on this list. You must have a credible backup strategy of regular backups, and you must then follow it. Consider including an off-site backup so that your data will not be lost if a local catastrophe occurs. Consider an automated backup process.|
|Survey of existing data||What existing data will need to be managed?|
|Data to be created||What data will your project create?|
|Data owners & stakeholders||Who will own the data created, and who would be interested in it?|
|File formats||What file formats will you use for your data?|
|Metadata||What metadata will you keep? What format or standard will you follow?|
|Access and security||Who will have access to your data? If the data are sensitive, how will you protect them from unauthorised access?|
|Data organisation||How will you name your data files? How will you organise your data into folders? How will you manage transfers and synchronisation of data between different machines? How will you manage collaborative writing with your colleagues? How will you keep track of the different versions of your data files and documents?|
|Storage||Where will your data be stored? Who will pay for the hardware? Who will manage it?|
|Bibliography management||What bibliography management tools will you use? How will you share references with the other members of your group?|
|Data sharing, publishing and archiving||What data will you share with others? What license will you apply?|
|Destruction||What data will you destroy? When? How?|
|Responsibilities||Who will be responsible for each of the items in this plan?|
|Budget||What will this plan cost? Possible costs include hardware for backups, research assistant time for data curation, metadata creation, archiving etc.|
|Anything else||Don't restrict yourself to the items above. Stop and think. What is missing from this list? If you think of something, please let us know so that we can update this information.|
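
The Backup row above suggests an automated backup process. As a rough illustration only, the Python sketch below copies new or changed files from a working folder to a backup folder and verifies each copy with a checksum. The function names (`sha256_of`, `backup_tree`) and the folder layout are assumptions made for this example, not part of any recommended tool, and a real strategy would also need scheduling and an off-site destination.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_tree(source: Path, destination: Path) -> List[Path]:
    """Copy new or changed files from source to destination, verifying each copy.

    Hypothetical helper: assumes destination is not located inside source.
    Returns the relative paths of the files that were copied.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        relative = src_file.relative_to(source)
        dst_file = destination / relative
        # Skip files whose backup copy already has identical content.
        if dst_file.exists() and sha256_of(dst_file) == sha256_of(src_file):
            continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)
        # Verify the copy so that a silent write failure is caught early.
        if sha256_of(dst_file) != sha256_of(src_file):
            raise IOError(f"Backup verification failed for {relative}")
        copied.append(relative)
    return copied
```

Calling `backup_tree(Path("project_data"), Path("/mnt/backup/project_data"))` (both paths are placeholders) from a scheduled job would be one way to make the process regular. Files deleted from the source are deliberately left in the backup in this sketch.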
Next generation approaches
While the efficacy of data management plans is still an open question, work continues on multiple fronts to improve them and to use them in a way that truly supports the research enterprise. Institutions are beginning to move from the long, early versions of DMPs to a next generation of DMP tools and approaches that consider whether DMPs can or should be:
- public rather than private documents
- machine readable as well as human readable
- flexible living documents that can be changed through the course of a research project
- measurable (i.e. did researcher X do what they said they would do in their DMP?)
- connected to at least one other system rather than standalone forms.
A report on Machine-actionable Data Management Plans (maDMPs) by Stephanie Simms, Sarah Jones, Daniel Mietchen and Tomasz Miksa reflects collective thinking on this next generation approach. In 2019 Sarah Jones (Digital Curation Centre) gave an update on emerging trends in data management planning in an ARDC webinar.
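
To make the ideas of "machine readable" and "measurable" more concrete, the Python sketch below stores a small plan as structured data, serialises it to JSON, and reports which planned datasets have not yet been deposited. It is an illustration only: the class names and fields (`PlannedDataset`, `DataManagementPlan`, `deposited` and so on) are hypothetical and are not taken from the maDMP report or from any published DMP standard.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class PlannedDataset:
    """One dataset the project has committed to producing (hypothetical fields)."""
    title: str
    file_format: str
    licence: str
    deposited: bool = False  # updated as the project progresses


@dataclass
class DataManagementPlan:
    """A minimal, illustrative plan record; not a standard schema."""
    project: str
    contact: str
    datasets: List[PlannedDataset] = field(default_factory=list)

    def to_json(self) -> str:
        # Machine readable: other systems can parse this output.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    def outstanding(self) -> List[str]:
        # Measurable: which commitments in the plan are not yet met?
        return [d.title for d in self.datasets if not d.deposited]


plan = DataManagementPlan(
    project="Example project",
    contact="researcher@example.org",
    datasets=[
        PlannedDataset("Survey responses", "CSV", "CC BY 4.0", deposited=True),
        PlannedDataset("Interview transcripts", "plain text", "restricted"),
    ],
)
print(plan.to_json())
print(plan.outstanding())
```

Because the plan is data rather than free text, it can be updated during the project and checked or exchanged by other systems, which is the kind of connection the list above describes.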
Data management plans (DMPs) interest group
The group is facilitated by the ARDC and open to anybody interested in DMPs, DMP tools and their effectiveness. The group provides a forum for discussion about local DMP tools and approaches as well as international developments. It meets by video conference on a bimonthly basis and is supported by an open Slack channel.