Article
How can health care data and biobanks be adapted and linked for research – establishing an enterprise clinical research data warehouse
Published: August 27, 2018
Text
Introduction: The reuse of electronic health record (EHR) data for research purposes has become an important topic in the national discussion [1]. A typical large university hospital is characterized by a heterogeneous IT system landscape with clinical, laboratory and radiology information systems as well as further specialized research-related information systems such as biobanks [2]. The Hannover Unified Biobank (HUB) was established at the Hannover Medical School (MHH) in 2012.
A data warehouse-based approach enables researchers to use heterogeneous data sets by consolidating and aggregating data from various sources. Data quality can also be controlled during data integration. Since 2013 the Enterprise Clinical Research Data Warehouse (ECRDW) [3], [4] has operated as an interdisciplinary platform for research-relevant questions at the MHH.
Currently the ECRDW holds data on over 2 million patients (10 million diagnoses) and more than 500 million further data points. The HUB manages data on approximately 1.7 million biosamples.
To support research scenarios that combine health care data with biosamples, a solution for linking the two systems is required.
Methods: We used the classic data warehouse approach based on a three-layer architecture [5] for the ECRDW. The data warehouse core is technically based on Microsoft SQL Server. Microsoft SQL Server Integration Services (SSIS) is used to extract, transform and load (ETL) data into the data repository of the ECRDW, and Microsoft Analysis Services is used for data provisioning. The presentation layer of the ECRDW consists of SharePoint Server and Power BI self-service tools, with interfaces to standard statistics and analysis tools that are familiar to researchers and clinicians (SPSS, R, etc.).
To improve the handling of data and processes, we are developing user interfaces and tools that facilitate and standardize the use of the ECRDW, e.g. full-text search engines, text extraction and metadata search (OPS, ICD, LOINC).
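The layered ETL flow described above can be illustrated with a minimal sketch. It is not the ECRDW implementation: an in-memory SQLite database stands in for Microsoft SQL Server and SSIS, and the table name `diagnosis`, the sample rows and all function names are hypothetical.

```python
import sqlite3

# Hypothetical source extract; real sources would be clinical or laboratory systems.
SOURCE_DIAGNOSES = [
    {"patient_id": " P001 ", "icd10": "e11.9", "date": "2017-03-01"},
    {"patient_id": "P002", "icd10": "I10", "date": "2017-04-12"},
    {"patient_id": "P002", "icd10": "", "date": "2017-05-02"},  # fails the quality check
]


def extract():
    """Staging step: return raw rows as delivered by the source system."""
    return list(SOURCE_DIAGNOSES)


def transform(rows):
    """Normalise values and split rows into accepted and rejected ones."""
    accepted, rejected = [], []
    for row in rows:
        clean = {
            "patient_id": row["patient_id"].strip(),
            "icd10": row["icd10"].strip().upper(),
            "date": row["date"],
        }
        # Simple data quality rule (an assumption): a diagnosis needs a code.
        (accepted if clean["icd10"] else rejected).append(clean)
    return accepted, rejected


def load(conn, rows):
    """Core step: persist cleaned rows in the integrated repository."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS diagnosis "
        "(patient_id TEXT, icd10 TEXT, date TEXT)"
    )
    conn.executemany(
        "INSERT INTO diagnosis VALUES (:patient_id, :icd10, :date)", rows
    )
    conn.commit()


def diagnoses_per_code(conn):
    """Provisioning step: aggregated view for analysis tools."""
    cur = conn.execute(
        "SELECT icd10, COUNT(*) FROM diagnosis GROUP BY icd10 ORDER BY icd10"
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
accepted, rejected = transform(extract())
load(conn, accepted)
summary = diagnoses_per_code(conn)
```

Keeping the rejected rows separate, rather than dropping them silently, is one way to make data quality visible during integration.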
Results: We successfully linked health care data to the corresponding biosamples stored in the HUB by integrating the basic sample information (such as material type and collection date) into the ECRDW.
Harmonized processes for central consulting services and for collaboration requests in research projects and studies have been set up. Requests for data, samples or advice can be submitted via a web-based form and are handled in a standardized process. All requests are tracked and managed using the ticketing system of the MHH.
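Once basic sample information sits next to clinical data, a query across both becomes a simple join. The following sketch assumes that both data sets share a patient identifier (`patient_id`), which the text does not state; SQLite again stands in for the actual warehouse, and all table names, column names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE diagnosis (patient_id TEXT, icd10 TEXT);
    CREATE TABLE biosample (
        sample_id TEXT, patient_id TEXT,
        material_type TEXT, collection_date TEXT
    );
    """
)
# Illustrative rows only; P003 has a diagnosis but no stored sample.
conn.executemany(
    "INSERT INTO diagnosis VALUES (?, ?)",
    [("P001", "E11.9"), ("P002", "I10"), ("P003", "E11.9")],
)
conn.executemany(
    "INSERT INTO biosample VALUES (?, ?, ?, ?)",
    [
        ("S-1", "P001", "serum", "2017-03-02"),
        ("S-2", "P001", "plasma", "2017-03-02"),
        ("S-3", "P002", "serum", "2017-04-13"),
    ],
)


def samples_for_diagnosis(conn, icd10):
    """Return basic sample information for patients with the given ICD-10 code."""
    cur = conn.execute(
        """
        SELECT DISTINCT b.sample_id, b.material_type, b.collection_date
        FROM biosample AS b
        JOIN diagnosis AS d ON d.patient_id = b.patient_id
        WHERE d.icd10 = ?
        ORDER BY b.sample_id
        """,
        (icd10,),
    )
    return cur.fetchall()


matches = samples_for_diagnosis(conn, "E11.9")
```

A request such as "all serum samples from patients with a given diagnosis" could then be answered by adding a filter on `material_type`.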
Discussion: The ECRDW and the HUB are established as central service units serving all research use cases involving biosamples and data at the MHH. However, data integration processes for national and international projects (HiGHmed [6], GBA [7], EHR4CR [8]) are evolving rapidly. Semantic modelling of clinical concepts, as well as data modelling and the analysis of unstructured data and omics data, are crucial points.
Intensive exchange between the national projects and the MHH infrastructures and facilities involved is required to take advantage of synergy effects. The involvement of both the ECRDW and the HUB is an important factor in the sustainability concept within the Data Integration Centers of the Medical Informatics Initiative and GBA.
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.
References
1. Tolxdorff T, Puppe F. Klinisches Data Warehouse. Inform Spektrum. 2016;39(3):233-7.
2. Bernemann I, Kersting M, Prokein J, Hummel M, Klopp N, Illig T. Centralized biobanks - a basis for medical research. Bundesgesundheitsblatt. 2016;59(3):336-43. DOI: 10.1007/s00103-015-2295-2
3. Gerbel S, Laser H, Haarbrandt B. Das Klinische Data Warehouse der Medizinischen Hochschule Hannover. Forum der Medizin Dokumentation und Medizin Informatik. 2014;16(2):49-52.
4. Raßmann T. Entwicklung eines Verfahrens zur integrierten Abbildung und Analyse der Qualität von Forschungsdaten in einem klinischen Datawarehouse [dissertation]. Hannover: Medizinische Hochschule; 2018.
5. Bauer A, Günzel H. Data-Warehouse-Systeme: Architektur, Entwicklung, Anwendung. dpunkt; 2013. p. 132.
6. HiGHmed Medical Informatics. Core Partners. [cited 2018 May 25]. Available from: http://www.highmed.org/consortium/core-partners/
7. German Biobank Node. Projektpartner. [cited 2018 May 25]. Available from: http://bbmri.de/biobanking/it/projektpartner/
8. Electronic Health Records for Clinical Research. EHR4CR project. [cited 2018 May 25]. Available from: http://www.ehr4cr.eu/