Developing a genomics-specific Data Management Plan (DMP) using the Data Stewardship Wizard
Professor Cris Print, Molecular Medicine and Pathology; Libby Li, Centre for eResearch; Braden Woodhouse, Oncology
What is Genomics Into Medicine (GIM)?
Demand for genomic analysis is rapidly expanding in both health research and clinical medicine worldwide. Genomics Into Medicine (GIM) is a University of Auckland Strategic Research Initiative for research-led clinical genomics. The initial GIM initiative worked to establish a multi-disciplinary connected network of researchers and clinicians within the Auckland Academic Health Alliance (AAHA). This connected network is aligned around a common vision of improving patient care through genomics, and guided by Māori medical and scientific leaders.
Data Management Planning with DSWizard
In collaboration with the Centre for eResearch (CeR), GIM have been developing a customised genomics Data Management Plan (DMP). CeR has provided seminars and training workshops together with some one-on-one mentorship in DMP practice and design as well as data science related to genomics. The goal is to raise data management maturity and meet the challenges of data science literacy in genomics amongst clinicians and researchers.
This work also involved setting up a machine-actionable DMP tool for the Auckland Genomics Group using the Data Stewardship Wizard (DSWizard). The DSWizard generates a DMP using a template-based smart questionnaire that guides researchers in building their plan. Using the DSWizard not only facilitates the generation of DMPs by the research community (which are increasingly required by funders and ethics committees) but also increases awareness of good research data management practices.
Why DSWizard?
Several DMP tools are widely used worldwide, including the DSWizard. What sets the DSWizard apart is its dynamic form, which can be updated as a research project progresses (Figures 1, 2a and 2b). Decisions made in the planning phase can be changed and re-prioritised, and researchers can see how their DMP answers score against FAIR metrics (Figure 3).
The tool is also collaborative (Figure 4) and customisable in style. It offers questions for different research domains and lets researchers add a to-do list, work on the DMP with collaborators, and export the DMP in multiple formats. It also provides the potential for future automated provisioning of appropriate IT infrastructure, such as storage, to researchers based on data classification and sensitivity.
The research project information captured in the DSWizard will help our GIM researchers and clinicians think more deeply about their data management priorities, facilitating their understanding of the practical challenges of meeting the FAIR and CARE data principles and the application of appropriate stewardship and governance, including Māori Data Sovereignty.
Genomic Data Management
Data management is a key challenge in genomic research, given the large volume and velocity of data generation, the need for data sharing and reproducibility, and the sensitivity of a person’s genomic information. GIM aims to progress the secure, standardised management and sharing of genomic data and related information in alignment with international frameworks such as the Global Alliance for Genomics and Health Framework for Responsible Sharing of Genomic and Health-Related Data, the FAIR data principles¹, and leading international best practice under the CARE² principles, together with adherence to Māori Data Sovereignty.
We have customised the DSWizard to provide a user-friendly interface, alongside a how-to-use guide and a DMP template specific to projects that involve genomic data. This question template, also known as a Knowledge Model (KM), has been developed through numerous workshops with researchers who work on genomic data. It contains a combination of suggested answers and guidance, as well as important references throughout, to help researchers determine the best plan for their context. Although template development is still in progress, the template can be modified quickly, allowing for periodic updates once it is in use. Researchers whose DMP is based on an older version of the template are notified when a more recent version is available, and can then migrate the changes into their existing DMP.
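The update notification described above can be illustrated with a minimal sketch. This is not the DSWizard's own implementation or API; it assumes, purely for illustration, that template versions are written as three dot-separated numbers (for example "1.4.0"), and the function names are hypothetical.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split an assumed 'major.minor.patch' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def needs_migration(plan_template_version: str, latest_template_version: str) -> bool:
    """Return True when the DMP was created from an older template version.

    Versions are compared numerically, component by component, so that
    '1.10.0' is correctly treated as newer than '1.9.0'.
    """
    return parse_version(plan_template_version) < parse_version(latest_template_version)


if __name__ == "__main__":
    # Hypothetical example: a DMP built on template 1.2.0 when 1.3.0 is available.
    if needs_migration("1.2.0", "1.3.0"):
        print("A newer template version is available; consider migrating this DMP.")
```

Comparing parsed integers rather than raw strings avoids the common mistake of ordering "1.10.0" before "1.9.0".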
In this tool, a researcher or clinician creates their DMP and shares it with other project members. The responses are used to provide guidance for further self-help and to build a common understanding of data stewardship requirements amongst members of the research project. The DMP covers a number of areas within data management planning for a project: Introduction, Project Overview, Ethics and Māori Data Sovereignty, Data Collection, Analysis plan, Discover & Reuse, and Publishing & Licensing.
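As a rough illustration of how the areas listed above could be tracked in a machine-actionable form, the sketch below lists the sections and reports which ones still have no answer. The data structure and function name are assumptions made for this example and do not reflect how the DSWizard stores questionnaires internally.

```python
from typing import Dict, List

# Section names taken from the DMP template described above.
DMP_SECTIONS: List[str] = [
    "Introduction",
    "Project Overview",
    "Ethics and Māori Data Sovereignty",
    "Data Collection",
    "Analysis plan",
    "Discover & Reuse",
    "Publishing & Licensing",
]


def unanswered_sections(answers: Dict[str, str]) -> List[str]:
    """Return the sections with no answer, or only whitespace, in template order."""
    return [name for name in DMP_SECTIONS if not answers.get(name, "").strip()]


if __name__ == "__main__":
    # Hypothetical, partially completed plan.
    draft = {
        "Introduction": "Pilot tumour sequencing study.",
        "Data Collection": "   ",  # whitespace only, so still counted as unanswered
    }
    for section in unanswered_sections(draft):
        print(f"Still to complete: {section}")
```

A list like this could feed the kind of to-do list the tool supports, although that link is an assumption of the sketch.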
Figure 1. DSWizard
Figure 2a. Customisable questionnaire
Figure 2b. Customisable questionnaire
Figure 3. How DMP answers score in the FAIR metrics
Figure 4. Sharing with collaborators
¹ FAIR data principles: Findable, Accessible, Interoperable, Reusable
² CARE: Collective benefit, Authority to control, Responsibility, Ethics