ACE Seminar: DataSHIELD: taking the analysis to the data not the data to the analysis

Speaker: Prof. Paul Burton

Date/Time: 19-Mar-2015, 16:00 UTC

Venue: MPEB 1.03



Research in modern biomedicine and social science requires sample sizes so large that they can often only be achieved through a pooled co-analysis of data from several studies. But pooling individual-level information in a central database that researchers may query raises important ethico-legal questions and can be controversial. These questions reflect societal and professional concerns about privacy, confidentiality and intellectual property. DataSHIELD provides a novel technological solution that circumvents some of the most basic challenges in giving researchers and other healthcare professionals access to individual-level data. Commands are sent from a central analysis computer (AC) to several data computers (DCs) that store the data to be co-analysed. Each DC is located at one of the studies contributing data to the analysis. The data sets are analysed simultaneously and in parallel. The separate parallelized analyses are linked by non-disclosive summary statistics and commands transmitted back and forth between the DCs and the AC. The technical implementation of DataSHIELD uses a specially modified R statistical environment linked to an Opal database deployed behind the firewall of each DC. Analysis is then controlled through a standard R environment at the AC. DataSHIELD is currently being developed as a flexible, easily extensible, open-source way to provide secure data access to a single study or data repository, as well as secure co-analysis of several studies.
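The AC/DC exchange described in the abstract can be illustrated with a small, purely conceptual sketch. It is written in Python rather than R and does not use the DataSHIELD or Opal APIs. The class `DataComputer`, the function `pooled_mean` and the threshold `MIN_COUNT` are hypothetical names chosen for illustration. The sketch assumes each DC releases only a record count and a sum, which the AC then combines into a pooled mean without ever seeing individual-level values.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical disclosure threshold for this sketch, not a DataSHIELD setting.
MIN_COUNT = 5


@dataclass
class DataComputer:
    """Stands in for one study's DC; raw values are only read inside it."""
    name: str
    _values: List[float]

    def summary(self) -> Tuple[int, float]:
        """Return only (count, sum); refuse if the count is too small."""
        n = len(self._values)
        if n < MIN_COUNT:
            raise ValueError(f"{self.name}: too few records to release a summary")
        return n, sum(self._values)


def pooled_mean(dcs: List[DataComputer]) -> float:
    """Runs on the AC: combine per-study summaries into a pooled mean."""
    summaries = [dc.summary() for dc in dcs]
    total_n = sum(n for n, _ in summaries)
    total_sum = sum(s for _, s in summaries)
    return total_sum / total_n


if __name__ == "__main__":
    # Made-up values for two illustrative studies.
    study_a = DataComputer("study_a", [1.0, 2.0, 3.0, 4.0, 5.0])
    study_b = DataComputer("study_b", [10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
    print(pooled_mean([study_a, study_b]))
```

The pooled mean computed this way equals the mean that would be obtained from the physically pooled data, which is the sense in which the analysis travels to the data rather than the reverse. A real deployment would need much stricter disclosure controls than the single count check assumed here.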



Paul Burton is Professor of Infrastructural Epidemiology at the University of Bristol. He is Head of Data Science in the transdisciplinary D2K (Data to Knowledge) Research Group and leads the DataSHIELD project (EU FP7, MRC, WT). He chairs the national Data Access Committee (DAC) that oversees access to the biomedical components of the 1958, 1970 and Millennium Cohorts and First Steps. He also chairs the Steering Committee of P3G (the Montreal-based Public Population Project in Genomics and Society) and the Scientific Advisory Board of Alberta’s Tomorrow Project, based in Edmonton, Canada. He sits on the Expert Advisory Group for Data Access (WT, MRC, ESRC, CRUK).


This page was last modified on 27 Mar 2014.