The EU Child Cohort Network's core data: establishing a set of findable, accessible, interoperable and re-usable (FAIR) variables

Bibliographic Details
Authors: Pinot de Moira, Angela, Fernández-Barrés, Sílvia, Vrijheid, Martine, LifeCycle Project Group
Format: article
Status: Published version
Publication Date: 2021
Country: Spain
Institution: Universitat Pompeu Fabra
Repository: Repositorio Digital de la UPF
OAI Identifier: oai:repositori.upf.edu:10230/53183
Online Access: http://hdl.handle.net/10230/53183
http://dx.doi.org/10.1007/s10654-021-00733-9
Access Level: Open access
Keyword: Birth cohort
Cross-cohort collaboration
Data harmonisation
FAIR (findable, accessible, interoperable and reusable) principles
Lifecourse epidemiology
Description
Summary: The Horizon2020 LifeCycle Project is a cross-cohort collaboration which brings together data from multiple birth cohorts from across Europe and Australia to facilitate studies on the influence of early-life exposures on later health outcomes. A major product of this collaboration has been the establishment of a FAIR (findable, accessible, interoperable and reusable) data resource known as the EU Child Cohort Network. Here we focus on the EU Child Cohort Network's core variables. These are a set of basic variables, derivable by the majority of participating cohorts and frequently used as covariates or exposures in lifecourse research. First, we describe the process by which the list of core variables was established. Second, we explain the protocol according to which these variables were harmonised in order to make them interoperable. Third, we describe the catalogue developed to ensure that the network's data are findable and reusable. Finally, we describe the core data, including the proportion of variables harmonised by each cohort and the number of children for whom harmonised core data are available. EU Child Cohort Network data will be analysed using a federated analysis platform, removing the need to physically transfer data and thus making the data more accessible to researchers. The network will add value to participating cohorts by increasing statistical power and exposure heterogeneity, as well as facilitating cross-cohort comparisons, cross-validation and replication. Our aim is to motivate other cohorts to join the network and encourage the use of the EU Child Cohort Network by the wider research community.