gms | German Medical Science

66. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 12. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e. V. (TMF)

26. - 30.09.2021, online

Towards a Data Dictionary Minimal Information Model – Consensus for Research Metadata Exchange

Meeting Abstract

  • Dennis Kadioglu - Medical Informatics Group (MIG), Johann Wolfgang Goethe-University Frankfurt, Frankfurt am Main, Germany; Data Integration Center, University Hospital Frankfurt, Frankfurt am Main, Germany
  • Matthias Löbe - Institute for Medical Informatics, Statistics and Epidemiology (IMISE), University of Leipzig, Leipzig, Germany
  • Mark R. Stöhr - UGMLC, German Center for Lung Research (DZL), Justus-Liebig-University Gießen, Gießen, Germany
  • Abishaa Vengadeswaran - Medical Informatics Group (MIG), Johann Wolfgang Goethe-University Frankfurt, Frankfurt am Main, Germany
  • Raphael W. Majeed - UGMLC, German Center for Lung Research (DZL), Justus-Liebig-University Gießen, Gießen, Germany; Institute for Medical Informatics, University Hospital RWTH Aachen, Aachen, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 66. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 12. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e.V. (TMF). sine loco [digital], 26.-30.09.2021. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 167

doi: 10.3205/21gmds030, urn:nbn:de:0183-21gmds0303

Published: September 24, 2021

© 2021 Kadioglu et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

In biomedical research, reusing patient and subject data is essential for future research. The collection of such data is usually tailored to its original context, either the documentation of healthcare or research such as clinical studies. For the latter in particular, a so-called data dictionary (DD) is usually created at the beginning. Such a DD contains valuable information, such as the structure and meaning of data elements, which is necessary when interpreting and processing the data for further analysis. However, as there is no general standard for the content and structure of a DD, or for how it should be made available, finding and reading DDs is often difficult. Additionally, when data are shared as part of a scientific publication or as a package through a research data repository, metadata definitions easily get separated or lost. Looking at various DDs (e.g. from recent studies or Electronic Data Collection software solutions), the most obvious difficulty when comparing them is the varying amount of information they contain. Where one DD contains a short and a long description for every data element, another contains only a short description. The understanding of what a long description should contain and look like also differs.

This workshop aims to bring together relevant experts from GMDS, such as Medical Documentalists, Medical Data Scientists, Medical Informaticists, Biometricians, Clinical Researchers and any researcher with an interest in reusing patient and subject data for future research. A first draft standard will be discussed with the goal of reaching a consensus on a Minimal Information Model for Data Dictionaries.

In preparation for this workshop, relevant standards and software solutions will be identified and analyzed to devise a first list of candidate attributes for describing a data element. Furthermore, all items on this list will be categorized according to how frequently they occur as well as their importance from the authors' perspective. The final list will then be presented during the workshop as an introduction and as a trigger for its main part, an open discussion. All feedback regarding this list will be collected, and first areas of common ground and possible additions will be identified.
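The frequency-based categorization described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the source names, the attribute names, the thresholds and the category labels ("core", "common", "rare") are hypothetical and are not taken from the abstract or from any real standard or software product.

```python
from collections import Counter

# Hypothetical attribute sets, one per analyzed data dictionary source.
# These names are illustrative assumptions, not findings of the authors.
SOURCES = {
    "dd_a": {"name", "short_description", "long_description", "data_type", "unit"},
    "dd_b": {"name", "short_description", "data_type", "unit"},
    "dd_c": {"name", "short_description", "data_type", "permitted_values"},
}


def attribute_frequency(sources):
    """Return, for each attribute, the fraction of sources that contain it."""
    counts = Counter(attr for attrs in sources.values() for attr in attrs)
    total = len(sources)
    return {attr: count / total for attr, count in counts.items()}


def categorize(frequencies, core_threshold=1.0, common_threshold=0.5):
    """Assign each attribute a category based on assumed frequency thresholds."""
    categories = {}
    for attr, freq in frequencies.items():
        if freq >= core_threshold:
            categories[attr] = "core"
        elif freq >= common_threshold:
            categories[attr] = "common"
        else:
            categories[attr] = "rare"
    return categories


if __name__ == "__main__":
    freq = attribute_frequency(SOURCES)
    for attr, category in sorted(categorize(freq).items()):
        print(f"{attr}: {category} ({freq[attr]:.2f})")
```

Frequency alone would not settle which attributes belong in a minimal model; the abstract also weighs importance from the authors' perspective, which a sketch like this leaves to manual review.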

The refined list, as the main result of this workshop, will then be used to design a first version of the specification.

To support adoption as well as evolution of the standard, the stakeholders will demonstrate its usefulness via first implementations in the context of various research networks. One possible implementation would be a common exchange data set for Metadata Repositories (MDR). In addition to providing (un)structured templates (PDF, Excel) as DDs for researchers who want to annotate existing data, MDRs from different vendors implementing this common data set would improve the overall comparability of DDs. Furthermore, basing a common API on this specification and implementing it in various MDRs would foster the findability of DDs: central or decentralized search portals could be built on top of it, improving the accessibility of DDs. Overall, this would improve the interoperability and reusability of metadata.
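To make the idea of a common exchange data set more concrete, the following sketch shows one way a minimal data element record could be serialized, exchanged between repositories and searched. The class name, field names, JSON layout and search logic are all hypothetical assumptions for illustration; the abstract does not define the specification, and no existing MDR API is implied.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class DataElement:
    """Hypothetical minimal description of one data element in a DD."""
    name: str
    short_description: str
    data_type: str
    long_description: Optional[str] = None
    unit: Optional[str] = None
    permitted_values: List[str] = field(default_factory=list)


def export_dictionary(elements):
    """Serialize a list of data elements to a JSON exchange document."""
    return json.dumps([asdict(e) for e in elements], indent=2, sort_keys=True)


def import_dictionary(payload):
    """Rebuild data elements from a JSON exchange document."""
    return [DataElement(**item) for item in json.loads(payload)]


def search(elements, term):
    """Case-insensitive search over names and short descriptions."""
    term = term.lower()
    return [
        e for e in elements
        if term in e.name.lower() or term in e.short_description.lower()
    ]


# Illustrative example entries (assumed, not from a real study).
EXAMPLE_DD = [
    DataElement("fev1", "Forced expiratory volume in 1 second", "float", unit="L"),
    DataElement("sex", "Sex of the subject", "code",
                permitted_values=["female", "male", "other", "unknown"]),
]

if __name__ == "__main__":
    payload = export_dictionary(EXAMPLE_DD)
    restored = import_dictionary(payload)
    print([e.name for e in search(restored, "volume")])
```

Keeping the required fields few and the optional ones explicit reflects the observation that existing DDs differ mainly in how much information they carry; a repository with sparse metadata could still export a valid record.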

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.