gms | German Medical Science

66th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), 12th Annual Congress of the Technology, Methods, and Infrastructure for Networked Medical Research (TMF)

26-30 September 2021, online

Introducing a data quality framework for data collections in observational health research

Meeting Abstract

  • Carsten Oliver Schmidt - Universität Greifswald, Greifswald, Germany
  • Stephan Struckmann - Universitätsmedizin Greifswald, Greifswald, Germany
  • Cornelia Enzenbach - Universitätsmedizin Greifswald, Greifswald, Germany
  • Adrian Richter - Institut für Community Medicine, Universitätsmedizin Greifswald, Greifswald, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 66. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 12. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e.V. (TMF). sine loco [digital], 26.-30.09.2021. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 210

doi: 10.3205/21gmds100, urn:nbn:de:0183-21gmds1002

Published: 24 September 2021

© 2021 Schmidt et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License.



Introduction: Epidemiologic research would benefit from more homogeneous approaches to addressing data quality. Yet the concepts applied [1] and the tools used to implement them vary considerably, and standards are lacking. This talk presents a recently published data quality framework to guide data quality assessments in observational health research [2].

Methods: The developed data quality framework focuses on intrinsic data quality, i.e. quality which can be assessed without contextual information. Its development was guided by an existing data quality framework, the 2nd edition of the TMF (Technology, Methods, and Infrastructure for Networked Medical Research) guideline for data quality [3], and its evaluation by a group of representatives of German cohort studies [4]. In addition, literature reviews and overviews of data quality concepts in health research informed the choice of indicators. To facilitate the computation of data quality indicators, the R package dataquieR [5] was developed.

Results: The framework distinguishes four data quality dimensions comprising 34 data quality indicators. The first dimension, integrity, relates to structural and technical requirements on the data. The second, completeness, covers the degree to which data values are present in a data collection. Two further dimensions relate to data correctness: consistency targets inadmissible, impossible, or uncertain data values, while accuracy covers unexpected distributions (e.g. outliers) and associations (e.g. observer effects). The indicators of the framework are linked to generic statistical implementations, including documentation, on a dedicated web page.
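
To make the four dimensions more concrete, the following Python sketch computes one simplified check per dimension on a toy data set. It is an illustration only and not the framework's or dataquieR's implementation: the metadata layout, the function names, the admissible ranges, and the use of Tukey fences for outlier detection are all assumptions chosen for this example.

```python
from statistics import quantiles

# Hypothetical metadata: expected type and admissible range per variable.
METADATA = {
    "age": {"type": int, "min": 18, "max": 100},
    "sbp": {"type": float, "min": 60.0, "max": 260.0},  # systolic blood pressure, mmHg
}

def integrity_issues(records, metadata):
    """Integrity: variables that are absent from a record or hold a wrongly typed value."""
    issues = []
    for var, spec in metadata.items():
        for rec in records:
            value = rec.get(var)
            if var not in rec or (value is not None and not isinstance(value, spec["type"])):
                issues.append(var)
                break
    return issues

def missing_rate(records, var):
    """Completeness: proportion of records in which the value is missing (None)."""
    return sum(1 for rec in records if rec.get(var) is None) / len(records)

def inadmissible_values(records, var, metadata):
    """Consistency: observed values outside the admissible range in the metadata."""
    low, high = metadata[var]["min"], metadata[var]["max"]
    return [rec[var] for rec in records
            if rec.get(var) is not None and not low <= rec[var] <= high]

def univariate_outliers(records, var):
    """Accuracy: values outside the Tukey fences (1.5 x IQR beyond the quartiles)."""
    values = [rec[var] for rec in records if rec.get(var) is not None]
    q1, _, q3 = quantiles(values, n=4)
    spread = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - spread or v > q3 + spread]

# Toy study data (invented for illustration).
records = [
    {"age": 34, "sbp": 118.0},
    {"age": 51, "sbp": 124.0},
    {"age": 47, "sbp": None},    # missing measurement
    {"age": 140, "sbp": 121.0},  # age outside the admissible range
    {"age": 62, "sbp": 127.0},
    {"age": 58, "sbp": 122.0},
    {"age": 45, "sbp": 119.0},
    {"age": 39, "sbp": 190.0},   # admissible but unusual value
]

print(integrity_issues(records, METADATA))
print(missing_rate(records, "sbp"))
print(inadmissible_values(records, "age", METADATA))
print(univariate_outliers(records, "sbp"))
```

In this toy data, the systolic blood pressure of 190.0 passes the consistency check because it lies within the assumed admissible range, yet it is flagged by the accuracy check as an unexpected value. This mirrors the distinction the framework draws between the two correctness dimensions.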

Discussion: The data quality framework provides an improved basis for assessing intrinsic data quality in observational health research data collections. While the overall structure of the framework may apply to data of a different provenance, e.g. administrative data collections, the statistical implementations would need to be revised for such data. The scope of indicators will be expanded in the future to cover additional aspects of data quality.

Conclusion: The introduced data quality framework provides an improved conceptual basis for the assessment of data collections in observational health research.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


[1] Stausberg J, Nasseh D, Nonnemacher M. Measuring data quality: A review of the literature between 2005 and 2013. Stud Health Technol Inform. 2015;210:712-6.
[2] Schmidt CO, Struckmann S, Enzenbach C, Reinecke A, Stausberg J, Damerow S, et al. Facilitating harmonized data quality assessments. A data quality framework for observational health research data collections with software implementations in R. BMC Med Res Methodol. 2021;21:63.
[3] Nonnemacher M, Nasseh D, Stausberg J. Datenqualität in der medizinischen Forschung: Leitlinie zum Adaptiven Datenmanagement in Kohortenstudien und Registern. Berlin: TMF e.V.; 2014.
[4] Schmidt CO, Richter A, Enzenbach C, Pohlabeln H, Meisinger C, Wellman J, et al. Assessment of a data quality guideline by representatives of German epidemiologic cohort studies. GMS Med Inform Biom Epidemiol. 2019;15(1):Doc09. DOI: 10.3205/mibe000203
[5] Richter A, Schmidt CO, Struckmann S. dataquieR: Data quality in epidemiological research. 2021.