Metadata Catalogue in High Value Nutrition (National Science Challenge)
Rob Carter, Dr Dharani Sontam, Yvette Wharton, Professor Mark Gahegan, Centre for eResearch; Dr Simmon Hofstetter, Operations Manager, Professor Richard Mithen, Liggins Institute; Joanne Todd, Challenge Director, High Value Nutrition, National Science Challenge.
Nutrition and research data
You’d think that research projects would share methodologies with libraries when it comes to research collections. In practice, however, it is uncommon to build a Data Catalogue that itemises what research data was created.
The Centre for eResearch’s Research Data Management team provides consultancy on processes and structures that support healthy data workflows. Central to this work, we must be able to answer the question: “Where is the data?”
It’s important to keep standardised records about the research data that is collected and created. Metadata enables future researchers to discover previous knowledge, because metadata records are publicly searchable. Metadata is what makes services like Google search possible.
Apart from being good for exposure and internet search, a Metadata Catalogue should be flexible enough to make statements about this data in the context of Tikanga Māori, for example where Māori assert kaitiakitanga over a particular species that is the subject of research.
The Data Catalogue sets out contact information for individuals and organisations who are involved in the guardianship of the data. In this way, it is possible to involve these people in future decisions around the data. We use an industry standard knowledge repository platform to publish these metadata records. Records are regularly syndicated to data.govt.nz (Figure 1) and included in their collection of datasets. Syndication helps to ensure that the data is more likely to continue to be available into the future.
Ko Ngā Kai Whai Painga, High Value Nutrition (HVN) National Science Challenge, is a multi-year, multi-study research programme. It asks questions about nutrition and diet from early childhood onwards. Its aim: to grow the science excellence and knowledge Aotearoa New Zealand needs to create and deliver food to the world that people choose to stay healthy and well.
The Centre for eResearch (CeR) provides two staff members, Robert Carter and Dharani Sontam, to develop the Metadata Catalogue from the ground up. The scale and number of studies being conducted, along with the added complexities of COVID, required a flexible, collaborative approach. With many data types spread across multiple organisations, the project has benefited from previous work with CeR on Data Management Plans and Standard Operating Procedures. The Metadata Catalogue links each study with its Ethics Registry approval records, providing detailed, searchable clinical information.
Seeding Through Feeding (SUN): Nourishing the infant microbiome to support immune health
The SUN Study is a double-blind, randomised controlled trial designed to recruit 300 infants from urban and central Auckland, New Zealand. It aims to determine the associations, and possible causality, between prebiotic feeding, growth of immune-health-beneficial microbes in the infant gut, and a reduced number of respiratory infections and improved vaccination responses in infants 6 to 12 months of age. We are working with the project team to define and include information relating to 19 individual data types resulting from the work.
He Rourou Whai Painga: An Aotearoa New Zealand diet for metabolic health and whānau wellbeing
This national Aotearoa New Zealand dietary intervention study is a randomised controlled trial. It evaluates the effect of a 12-week whole-diet intervention, incorporating nutritious domestically-produced food and beverage products and dietary change support, compared with habitual diet, on the MetS-Z score in individuals at risk of developing metabolic disease. In this case, CeR collected details of 11 different data result sets for inclusion in the Metadata Record.
Future-proofing research data
With technology changing at such a rapid pace, the phrase ‘future proof’ might raise alarm bells. Institutional archives have been replaced by Digital Object Stores, and filing cabinets by cloud storage, all in the space of one lifetime. How do you make provision for longevity in the digital age?
The tools of Archivists come into play when making calculated guesses about what the future will hold. Metadata must follow a consistent format that is both human and machine readable. Short of inscribing the information on a brass plaque, we try to ensure that only the minimum of technology is required to read and make use of the Metadata Record. The record must carry with it a description of what the fields in the record mean.
To this end, three items of data are collected for each field in the record: the field name, a description of the data the field contains, and the data itself. The intention is to make the Metadata self-descriptive, rather than relying on some external pre-existing schema. Data formats change over time, so where possible we use plain text, which is probably the most accessible format. On top of this, the project employs JSON as its baseline machine readable format.
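As a rough illustration of the self-descriptive idea, the sketch below stores each field as a name, a description and a value, writes the record out as JSON, and then reads it back using nothing but what the record itself carries. The key names ("name", "description", "value"), the field names and the helper functions are hypothetical choices made for this example; they are not the catalogue's actual layout.

```python
import json


def make_field(name, description, value):
    """Bundle the three items collected for each field."""
    return {"name": name, "description": description, "value": value}


# Hypothetical record: field names and descriptions are illustrative only.
record = [
    make_field(
        "study_title",
        "Full title of the study the data belongs to",
        "Seeding Through Feeding (SUN)",
    ),
    make_field(
        "data_type_count",
        "Number of individual data types described for the study",
        19,
    ),
]


def to_json(fields):
    """Serialise the record as plain-text JSON, keeping macrons readable."""
    return json.dumps(fields, indent=2, ensure_ascii=False)


def describe(json_text):
    """Render a record as text using only the descriptions it carries."""
    lines = []
    for field in json.loads(json_text):
        lines.append(
            f"{field['name']} ({field['description']}): {field['value']}"
        )
    return lines


if __name__ == "__main__":
    for line in describe(to_json(record)):
        print(line)
```

Because every field travels with its own description, a future reader needs only a text viewer or a generic JSON parser to interpret the record, with no external schema to track down.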
Where to from here?
As the team continues to build the Metadata Catalogue, we measure what we have done in terms of the number of studies covered by the work, and in terms of the discoverability of the research. While it sometimes seems that digital data has begun to take on an ephemeral quality, the HVN Metadata Catalogue provides visibility of research data to a standard that supports researchers in the years to come.