gms | German Medical Science

54. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

September 7 to 10, 2009, Essen

Improving data quality of routine data for clinical research

Meeting Abstract

  • Thomas Fischer - Universität Erlangen Nürnberg, Erlangen
  • Gregor Hohmann - Universität Erlangen Nürnberg, Erlangen
  • Tino Münster - Universitätsklinikum Erlangen, Erlangen
  • Frank Lauterwald - Universität Erlangen Nürnberg, Erlangen
  • Richard Lenz - Universität Erlangen Nürnberg, Erlangen

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 54. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds). Essen, 07.-10.09.2009. Düsseldorf: German Medical Science GMS Publishing House; 2009. Doc09gmds267

doi: 10.3205/09gmds267, urn:nbn:de:0183-09gmds2674

Published: September 2, 2009

© 2009 Fischer et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License. You are free to share (to copy, distribute and transmit the work), provided the original author and source are credited.




Large amounts of data are typically buried in operational IT systems or electronic medical records (EMRs), where they are unavailable for clinical research. Extracting data from these sources for research purposes is often tedious, and the quality of the extracted data is frequently questionable [1].

The QuadSys project aims to improve this situation through IT-supported Data Quality Management (DQM). We present the first results of the project, which indicate typical data quality problems and their root causes. In addition, we propose an infrastructure for IT-supported quality improvement.


To identify typical data quality problems in clinical research and their root causes.

To provide appropriate tooling that supports DQM in clinical research projects.


The data production process in an interdisciplinary clinical research project with multiple data sources (“Genetische Grundlagen individueller Unterschiede in der entzündungsbedingten Schmerzsensibilisierung”, i.e. the genetic basis of individual differences in inflammation-induced pain sensitization) was modeled and analyzed. Based on a broad catalog of data quality (DQ) dimensions distinguished in the DQ literature [2], [3], specific problems were identified and traced back to their root causes. A multi-step plan for continuous quality improvement was elaborated.


The process analysis revealed various problems in multiple quality dimensions including interpretability, accessibility, accuracy, completeness and correctness. Other dimensions like currency and timeliness were found to be less important for research data.

Major root causes for insufficient data quality are poor quality of the database schema, manual data transfer between independent systems, and insufficient synchronization between interconnected systems.
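One of the root causes named above, insufficient synchronization between interconnected systems, can be made visible by reconciling records across two systems. The following is a minimal sketch and not the QuadSys tooling: the function name, the dictionary-based record layout, and the sample data are all hypothetical and chosen for illustration only.

```python
def find_sync_discrepancies(source, target, fields):
    """Compare records keyed by ID across two systems (hypothetical layout).

    Returns the IDs present in `source` but absent from `target`, and the
    (id, field) pairs whose values differ between the two systems.
    """
    missing = sorted(set(source) - set(target))
    mismatched = []
    for rec_id in sorted(set(source) & set(target)):
        for field in fields:
            if source[rec_id].get(field) != target[rec_id].get(field):
                mismatched.append((rec_id, field))
    return missing, mismatched


# Invented sample data: a clinical system and a manually maintained research copy.
clinical = {
    "p1": {"weight_kg": 70},
    "p2": {"weight_kg": 82},
    "p3": {"weight_kg": 64},
}
research = {
    "p1": {"weight_kg": 70},
    "p2": {"weight_kg": 28},  # transposed digits, as may happen in manual transfer
}

missing, mismatched = find_sync_discrepancies(clinical, research, ["weight_kg"])
print("missing in research copy:", missing)
print("mismatched values:", mismatched)
```

A check of this kind only detects discrepancies; deciding which system holds the correct value still requires knowledge of the data production process.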

The multi-step quality improvement plan comprises: 1) Model the data production process and the data quality requirements, and identify relevant indicators for measuring data quality. 2) Integrate the data sources and automate the capture of data quality indicators. 3) Establish an IT infrastructure comprising a dedicated metadata repository for DQ indicators.
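Steps 2 and 3 of the plan can be illustrated with a small sketch in which a completeness indicator is computed automatically and stored in a repository. This is an assumption-laden illustration rather than the infrastructure proposed in the project: the table name `dq_indicator`, its columns, the function names, and the sample records are hypothetical, and an in-memory SQLite database merely stands in for a dedicated metadata repository.

```python
import sqlite3
from datetime import datetime, timezone


def completeness(records, field):
    """Share of records with a non-missing value for `field` (0.0 to 1.0)."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def open_repository(path=":memory:"):
    """Open a stand-in metadata repository (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dq_indicator ("
        "source TEXT, field TEXT, indicator TEXT, value REAL, measured_at TEXT)"
    )
    return conn


def record_indicator(conn, source, field, indicator, value):
    """Store one indicator measurement together with a UTC timestamp."""
    conn.execute(
        "INSERT INTO dq_indicator VALUES (?, ?, ?, ?, ?)",
        (source, field, indicator, value, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


# Invented sample records; None and "" are both treated as missing.
records = [
    {"patient_id": "p1", "pain_score": 4},
    {"patient_id": "p2", "pain_score": None},
    {"patient_id": "p3", "pain_score": 7},
    {"patient_id": "p4", "pain_score": ""},
]

repo = open_repository()
score = completeness(records, "pain_score")
record_indicator(repo, "research_db", "pain_score", "completeness", score)
stored = repo.execute("SELECT indicator, value FROM dq_indicator").fetchall()
print(stored)
```

Storing each measurement with a timestamp makes it possible to track an indicator over time, which is what continuous quality monitoring requires.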


Continuous measurement of data quality based on relevant indicators is needed to improve quality awareness and thereby the value of clinical research data.


[1] Nonnemacher M, Weiland D, Stausberg J. Datenqualität in der medizinischen Forschung. Leitlinie zum adaptiven Management von Datenqualität in Kohortenstudien und Registern. Berlin: Medizinisch Wissenschaftliche Verlagsgesellschaft (Schriftenreihe der Telematikplattform für Medizinische Forschungsnetze, 4); 2007.
[2] Ge M, Helfert M. A Review of Information Quality Research. Develop a Research Agenda. In: Robbert MA, O'Hare R, Markus ML, Klein B, editors. Proceedings of the 2007 International Conference on Information Quality. MIT IQ Conference. 2007.
[3] Strong DM, Lee YW, Wang RY. Data quality in context. Communications of the ACM. 1997;40(5):103–110.