GMS | 67. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 13. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e. V. (TMF) | A glimpse at representing data quality rules for their collaborative governance in the Medical Informatics Initiative

21.08. - 25.08.2022, online

A glimpse at representing data quality rules for their collaborative governance in the Medical Informatics Initiative

Meeting Abstract

Erik Tute - Peter L. Reichertz Institut für Medizinische Informatik der Technischen Universität Braunschweig und der Medizinischen Hochschule Hannover, Hannover, Germany
Christian Draeger - Universität Leipzig, Leipzig, Germany
Kerstin Gierend - Medizinische Fakultät Mannheim der Universität Heidelberg, Mannheim, Germany
Matthias Löbe - Universität Leipzig, Leipzig, Germany
Julia Palm - Institut für Medizinische Statistik, Informatik und Datenwissenschaften, Universitätsklinikum Jena, Jena, Germany
Carsten Oliver Schmidt - Universität Greifswald, Greifswald, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 67. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 13. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e.V. (TMF). sine loco [digital], 21.-25.08.2022. Düsseldorf: German Medical Science GMS Publishing House; 2022. DocAbstr. 166

doi: 10.3205/22gmds018, urn:nbn:de:0183-22gmds0186

Published: August 19, 2022

© 2022 Tute et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.

Text

Introduction: The Medical Informatics Initiative (MII) supports multicenter data usage both in research and in clinical use cases. Both involve cross-site, multidisciplinary teams, so the methods used to check whether data meet the data quality (DQ) requirements of a given data usage need to be governed collaboratively. This work explores the suitability of a proposed 5-tuple approach [1] for representing such DQ-methods for their collaborative governance. We explored the following questions using the example of an MII use case:

1. Can the representation express all desired DQ-rules? (question 1)
2. Does the 5-tuple representation help to improve unambiguousness and completeness of the specification compared to free-text DQ-rules? (question 2)
3. Is the representation comprehensible and subjectively suitable for collaborative DQ-method governance? (question 3)
4. Are the resulting executable methods (R-scripts) subjectively a suitable basis for applying common DQ-methods in MII data integration centers? (question 4)

Methods: The MII Taskforce Metadaten had prepared a set of free-text DQ-rules (based on HIDQF [2]) targeting the use case VHF_MI in the 6th MII Projectathon. One author (ET) formalized these rules using the 5-tuple approach and documented issues, e.g. ambiguous or incomplete specifications, or rules that the 5-tuple representation could not cover. The MII Taskforce Metadaten was invited by e-mail to provide feedback on these DQ-methods, focusing on research questions 2-4. The DQ-methods, an example R-script showing an executable DQ-method, an explanation of the research idea and the research questions were provided in a shared online document, since the Taskforce already shared related information in a cloud folder. Experts provided feedback either by e-mail or online in the shared document.

Results: The formalized DQ-methods are publicly available in a Git repository ([3] - Commit 2101515a). The representation could express all desired VHF_MI DQ-rules (cf. question 1). Formalization revealed some incomplete free-text DQ-rule specifications (cf. question 2). Most of them were missing information about the desired output. For example, rules defined constraints for valid values but did not specify whether the result should report the number of violations, list problematic instances or show something else. Four experts provided feedback regarding questions 2-4. One expert addressed question 2, expressing a positive perception regarding unambiguousness. Four experts responded positively regarding the comprehensibility of the formalized DQ-methods (cf. question 3). One expert even demonstrated understanding of the methods by suggesting minor corrections to the contents of two methods. One expert proposed ideas for better DQ-method presentation. Two experts addressed question 4, both positively.
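The output gap described above can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the authors' executable methods are R-scripts, and it does not reproduce the 5-tuple representation from [1]. The class `ValueRule`, the function `apply_rule`, the field `heart_rate`, the plausibility limits and the sample records are all hypothetical and invented for illustration. The point is only that a validity constraint alone leaves the result underspecified until an output mode is stated explicitly.

```python
from dataclasses import dataclass, replace
from typing import Callable, Literal


@dataclass(frozen=True)
class ValueRule:
    """Hypothetical DQ-rule: a validity constraint plus an explicit output mode."""
    name: str
    field: str
    is_valid: Callable[[object], bool]
    output: Literal["count", "list"]  # what the rule should report


def apply_rule(rule, records):
    """Return the number of violations or the violating records, as the rule demands."""
    violations = [r for r in records if not rule.is_valid(r.get(rule.field))]
    if rule.output == "count":
        return len(violations)
    return violations


def plausible_heart_rate(value):
    # Assumption for this sketch: missing values count as violations,
    # and the limits 20..300 are arbitrary illustration values.
    return isinstance(value, (int, float)) and 20 <= value <= 300


# Made-up sample data, not taken from the VHF_MI use case.
records = [
    {"id": "p1", "heart_rate": 72},
    {"id": "p2", "heart_rate": 400},
    {"id": "p3", "heart_rate": None},
]

count_rule = ValueRule("hr_range", "heart_rate", plausible_heart_rate, "count")
list_rule = replace(count_rule, output="list")

n_violations = apply_rule(count_rule, records)
violating_ids = [r["id"] for r in apply_rule(list_rule, records)]
print(n_violations, violating_ids)
```

The same constraint yields two different results depending on the declared output, which is the kind of information the free-text rules left open.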

Discussion and conclusion: The results suggest that the proposed 5-tuple approach is suitable for representing DQ-methods for collaborative governance. Applying DQ-methods represented in this way is not bound to a certain stage in the data’s lifecycle (e.g. data integration) and requires no consortium-specific technical infrastructure. R-scripts whose input data are defined on common data definitions seem to be a potential common ground for the MII. Since no actual collaborative DQ-method governance or application took place and the number of responding experts was small, the robustness of these findings is limited. Positive expert feedback must not be confused with endorsement for application in the MII. Further evaluation of the approach is planned.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.

References

1. Tute E, Scheffner I, Marschollek M. A method for interoperable knowledge-based data quality assessment. BMC Med Inform Decis Mak. 2021 Mar 9;21(1):93. DOI: 10.1186/s12911-021-01458-1
2. Kahn MG, Callahan TJ, Barnard J, Bauck AE, Brown J, Davidson BN, et al. A Harmonized Data Quality Assessment Terminology and Framework for the Secondary Use of Electronic Health Record Data. EGEMS (Wash DC). 2016 Sep 11;4(1):1244. DOI: 10.13063/2327-9214.1244
3. Tute E. Erik Tute/openCQA – GitLab [Internet]. Braunschweig: Peter L. Reichertz Institut für Medizinische Informatik der Technischen Universität Braunschweig und der Medizinischen Hochschule Hannover; [cited 2022 Apr 06]. Available from: https://gitlab.plri.de/tute/openehr-dq/-/tree/MII_projektathon
