
Developing a genomics-specific Data Management Plan (DMP) using the Data Stewardship Wizard
Professor Cris Print, Molecular Medicine and Pathology; Libby Li, Centre for eResearch; Braden Woodhouse, Oncology
What is Genomics into Medicine (GIM)?
Demand for genomic analysis is rapidly expanding in both health research and clinical medicine worldwide. Genomics into Medicine (GIM) is a University of Auckland Strategic Research Initiative for research-led clinical genomics. The initial GIM initiative worked to establish a multi-disciplinary connected network of researchers and clinicians within the Auckland Academic Health Alliance (AAHA). This connected network is aligned around a common vision of improving patient care through genomics, and is guided by Māori medical and scientific leaders.
Data Management Planning with DSWizard
In collaboration with the Centre for eResearch (CeR), GIM have been developing a customised genomics Data Management Plan (DMP). CeR has provided seminars and training workshops together with some one-on-one mentorship in DMP practice and design as well as data science related to genomics. The goal is to raise data management maturity and meet the challenges of data science literacy in genomics amongst clinicians and researchers.
This work also involved a machine-actionable DMP tool for the Auckland Genomics Group using the Data Stewardship Wizard (DSWizard). The DSWizard generates a DMP using a template-based smart questionnaire that guides researchers in building their plan. Using the DSWizard tool not only makes it easier for the research community to generate DMPs (which are increasingly required by funders and ethics committees) but also increases awareness of good research data management practices.
Why DSWizard?
Several DMP tools are widely used worldwide, including DSWizard. What sets the DSWizard apart is its dynamic form, which can be updated as a research project progresses (Figures 1, 2a and 2b). Researchers can change and re-prioritise decisions during the planning phase and see how their DMP answers score against FAIR metrics (Figure 3).
The tool is collaborative (Figure 4) and customisable in style, and offers questions for different research domains. Researchers can add a to-do list, work on the DMP with collaborators, and export the DMP in multiple formats. It also offers the potential for future automated provisioning of appropriate IT infrastructure, such as storage, to researchers based on data classification and sensitivity.
The research project information captured in the DSWizard will help our GIM researchers and clinicians think more deeply about their data management priorities, facilitating their understanding of the practical challenges of meeting the FAIR and CARE data principles and the application of appropriate stewardship and governance, including Māori Data Sovereignty.
Genomic Data Management
Data management is a key challenge in genomic research because of the large volume and velocity of data generation, the need for data sharing and reproducibility, and the sensitivity of a person’s genomic information. GIM aims to progress the secure, standardised management and sharing of genomic data and related information in alignment with international frameworks such as the Global Alliance for Genomics and Health Framework for Responsible Sharing of Genomic and Health Related Data, the FAIR data principles¹, leading international best practice across the CARE² principles, and adherence to Māori Data Sovereignty.
We have customised the DSWizard to provide a user-friendly interface, a how-to-use guide, and a DMP template specific to projects that involve genomic data. This question template, also known as a Knowledge Model (KM), has been developed through numerous workshops with researchers working on genomics data. It contains a combination of suggested answers and guidance, as well as important references throughout, to help researchers determine the best plan for their context. Although template development is still in progress, the template can be modified quickly, which allows for periodic updates once it is in use. Researchers whose DMP is based on an older version are notified when a more recent version of their template is available, and can then migrate the changes into their existing DMP.
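As a rough illustration of the update notification described above, the sketch below compares the template version a DMP was created with against the latest published version. The dotted version strings and the `parse_version` and `needs_migration` helpers are hypothetical names chosen for this example; they are not part of the DSWizard API, and the real tool may track versions differently.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '1.2.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def needs_migration(dmp_template_version: str, latest_template_version: str) -> bool:
    """Return True when the DMP was built on an older template version.

    Both arguments are assumed to be dotted numeric strings (an assumption
    made for this sketch, not a documented DSWizard format).
    """
    return parse_version(dmp_template_version) < parse_version(latest_template_version)


if __name__ == "__main__":
    # A DMP created on template 1.2.0 when 1.10.0 is the latest version
    if needs_migration("1.2.0", "1.10.0"):
        print("A newer template version is available; consider migrating.")
```

Comparing tuples of integers rather than raw strings avoids ordering "1.10.0" before "1.2.0".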
In this tool, a researcher or clinician creates their DMP and shares it with other project members. The responses are used to provide guidance for further self-help and to build a common understanding of data stewardship requirements among members of the research project. The DMP covers a number of areas within data management planning for a project: Introduction, Project Overview, Ethics and Māori Data Sovereignty, Data Collection, Analysis plan, Discover & Reuse, and Publishing & Licensing.
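The areas listed above can be pictured as chapters of a questionnaire. The following is a minimal sketch, assuming a simple in-memory structure, of how answers might be grouped by chapter and how chapters with no answers yet could be reported. The `DataManagementPlan` class, its methods, and the sample question are hypothetical and only illustrate the idea; they do not reflect how the DSWizard stores a DMP.

```python
from dataclasses import dataclass, field

# Chapter names taken from the DMP areas listed in the text
CHAPTERS = [
    "Introduction",
    "Project Overview",
    "Ethics and Māori Data Sovereignty",
    "Data Collection",
    "Analysis plan",
    "Discover & Reuse",
    "Publishing & Licensing",
]


@dataclass
class DataManagementPlan:
    """Hypothetical container for DMP answers, grouped by chapter."""
    project: str
    answers: dict[str, dict[str, str]] = field(default_factory=dict)

    def answer(self, chapter: str, question: str, response: str) -> None:
        """Record a response to a question within a known chapter."""
        if chapter not in CHAPTERS:
            raise ValueError(f"Unknown chapter: {chapter}")
        self.answers.setdefault(chapter, {})[question] = response

    def unanswered_chapters(self) -> list[str]:
        """List chapters that have no recorded answers yet."""
        return [c for c in CHAPTERS if not self.answers.get(c)]


if __name__ == "__main__":
    dmp = DataManagementPlan(project="Example genomics project")
    # Illustrative question and answer, not taken from the real template
    dmp.answer("Data Collection", "Where will raw data be stored?", "Institutional storage")
    print(dmp.unanswered_chapters())
```

A structure like this makes it easy for project members to see at a glance which parts of the plan still need attention.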

Figure 1. DSWizard

Figure 2a. Customisable questionnaire

Figure 2b. Customisable questionnaire

Figure 3. How DMP answers score in the FAIR metrics

Figure 4. Sharing with collaborators
¹ FAIR data principles: Findable, Accessible, Interoperable, Reusable
² CARE: Collective benefit, Authority to control, Responsibility, Ethics