gms | German Medical Science

Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH)

08.09. - 13.09.2024, Dresden

Pseudonymization in REDCap with E-PIX and gPAS

Meeting Abstract

  • Christian Erhardt - Medical Data Integration Center (meDIC), University Hospital Tübingen, Tübingen, Germany; Hertie-Institute for Clinical Brain Research, University Hospital Tübingen, Tübingen, Germany
  • Lars-Christian Achauer - Medical Data Integration Center (meDIC), University Hospital Tübingen, Tübingen, Germany
  • Martin Bialke - Institut für Community Medicine, Universitätsmedizin Greifswald, Greifswald, Germany
  • Dana Stahl - Trusted Third Party of the University Medicine Greifswald, Greifswald, Germany
  • Michaela Hardt - Medical Data Integration Center (meDIC), University Hospital Tübingen, Tübingen, Germany
  • Oliver Kohlbacher - Interfaculty Institute for Biomedical Informatics (IBMI), University Tübingen, Tübingen, Germany
  • Raphael Verbücheln - Medical Data Integration Center (meDIC), University Hospital Tübingen, Tübingen, Germany
  • Stephanie Biergans - Medical Data Integration Center (meDIC), University Hospital Tübingen, Tübingen, Germany

Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH). Dresden, 08.-13.09.2024. Düsseldorf: German Medical Science GMS Publishing House; 2024. DocAbstr. 625

doi: 10.3205/24gmds101, urn:nbn:de:0183-24gmds1011

Published: September 6, 2024

© 2024 Erhardt et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: REDCap is a widely used electronic data capture software for managing pseudonymized research data [1]. However, researchers often maintain the mapping between pseudonyms and identifying information (the study list) in separate spreadsheets, which is labor-intensive and error-prone. This work presents a novel REDCap module that integrates record linkage and pseudonymization into REDCap. The module aims to improve data entry workflows, data privacy, security, and data quality while reducing manual effort.

Methods: We developed a REDCap module [2] that integrates with E-PIX® (for identity management and record linkage) and gPAS® (for pseudonym management), both from the University Medicine Greifswald [3], [4].

The module processes person-identifying information (PII) and stores it in E-PIX. E-PIX generates a Master Patient Index (MPI), which is used to create study pseudonyms in gPAS. REDCap entries for the study pseudonyms are generated automatically. The reverse data flow is also covered: when a dataset is accessed, the associated MPI of the person is retrieved from gPAS, and the PII is then loaded from E-PIX and displayed within the REDCap UI. To simplify data entry and improve data quality, the module interfaces with the hospital information system (HIS). This interface provides a search function for patients and allows PII to be imported from the HIS into E-PIX based on the patient ID. An API gateway is used to securely connect REDCap to the E-PIX, gPAS, and HIS APIs within the internal hospital network.
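
The forward and reverse data flows described above can be illustrated with a minimal sketch. It does not use the real E-PIX or gPAS interfaces: `IdentityStore` and `PseudonymStore` are hypothetical in-memory stand-ins, and the MPI format, the pseudonym format, and the function names `enroll` and `depseudonymize` are assumptions made purely for illustration.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class IdentityStore:
    """Hypothetical in-memory stand-in for E-PIX: holds PII keyed by MPI."""
    _pii_by_mpi: dict = field(default_factory=dict)

    def register(self, pii: dict) -> str:
        # Naive linkage: reuse the MPI if identical PII is already known.
        for mpi, known in self._pii_by_mpi.items():
            if known == pii:
                return mpi
        mpi = f"MPI-{len(self._pii_by_mpi) + 1:06d}"  # assumed format
        self._pii_by_mpi[mpi] = dict(pii)
        return mpi

    def lookup(self, mpi: str) -> dict:
        return dict(self._pii_by_mpi[mpi])


@dataclass
class PseudonymStore:
    """Hypothetical stand-in for gPAS: maps MPI <-> study pseudonym."""
    domain: str
    _psn_by_mpi: dict = field(default_factory=dict)
    _mpi_by_psn: dict = field(default_factory=dict)

    def get_or_create(self, mpi: str) -> str:
        if mpi not in self._psn_by_mpi:
            psn = f"{self.domain}-{secrets.token_hex(4)}"
            while psn in self._mpi_by_psn:  # retry on the rare collision
                psn = f"{self.domain}-{secrets.token_hex(4)}"
            self._psn_by_mpi[mpi] = psn
            self._mpi_by_psn[psn] = mpi
        return self._psn_by_mpi[mpi]

    def resolve(self, psn: str) -> str:
        return self._mpi_by_psn[psn]


def enroll(pii, identities, pseudonyms, redcap_records):
    """Forward flow: PII -> MPI -> study pseudonym -> empty REDCap record."""
    mpi = identities.register(pii)
    psn = pseudonyms.get_or_create(mpi)
    redcap_records.setdefault(psn, {})  # the record holds no PII
    return psn


def depseudonymize(psn, identities, pseudonyms):
    """Reverse flow: study pseudonym -> MPI -> PII."""
    return identities.lookup(pseudonyms.resolve(psn))
```

In this sketch the REDCap side only ever stores the pseudonym as a record key, while the PII lives in the separate identity store, mirroring the separation of concerns described in the text.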

Results: The innovative REDCap module provides a search interface to find matching persons from HIS/E-PIX and options to create/edit/delete study participants in REDCap. Aligned automated workflows for pseudonymization and de-pseudonymization are implemented with the help of gPAS.

Key functionalities include:

1. Pseudonym generation based on identifying data or external pseudonyms
2. Separate storage of PII with record linkage to avoid and handle duplicate entries of study participants
3. Lookup and import of patient information from the HIS
4. Integration of the study list in REDCap with import/export capabilities
5. Automatic de-pseudonymization when displaying datasets
6. Role-based access control in REDCap for managing access to the different features
7. Separate user authentication to prevent unauthorized HIS access
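
The idea behind avoiding duplicate participants through record linkage (item 2) can be sketched as follows. This is a deliberately simplified exact-match approach on normalized fields, not the matching procedure E-PIX actually applies; the field names `first_name`, `last_name`, and `birth_date` and the helper names are assumptions for illustration.

```python
import unicodedata


def normalize(value: str) -> str:
    """Lowercase, trim whitespace, and strip accents/diacritics."""
    decomposed = unicodedata.normalize("NFKD", value.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def linkage_key(person: dict) -> tuple:
    """Exact-match key built from illustrative (assumed) PII fields."""
    return (
        normalize(person["first_name"]),
        normalize(person["last_name"]),
        person["birth_date"],
    )


def find_or_add(person: dict, index: dict) -> tuple:
    """Return (person_id, is_duplicate) for a candidate participant."""
    key = linkage_key(person)
    if key in index:
        return index[key], True  # already known: reuse the identity
    person_id = len(index) + 1
    index[key] = person_id
    return person_id, False
```

With such a key, entries that differ only in capitalization, whitespace, or diacritics resolve to the same person instead of creating a second study participant.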

The module supports different user roles (e.g., physician with PII access, data scientist without PII access) and can handle various participant types (patients, external participants, external pseudonyms).
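
A role-dependent view of a record could look roughly like the following sketch. The role names follow the examples given above; the `Role` enum, the `PII_ROLES` set, and the `render_record` function are hypothetical and do not reflect how REDCap or the module implement user rights.

```python
from enum import Enum


class Role(Enum):
    PHYSICIAN = "physician"            # may see PII
    DATA_SCIENTIST = "data_scientist"  # pseudonymized data only


# Assumed policy: only these roles get a de-pseudonymized view.
PII_ROLES = {Role.PHYSICIAN}


def render_record(psn, study_data, role, resolve_pii):
    """Build the view of one record for a user with the given role.

    resolve_pii is a callable mapping a pseudonym to a PII dict. It is
    only invoked when the role is allowed to see identifying data.
    """
    view = {"pseudonym": psn, "data": dict(study_data)}
    if role in PII_ROLES:
        view["pii"] = resolve_pii(psn)
    return view
```

Passing the resolver as a callable keeps the de-pseudonymization lookup from being triggered at all for roles without PII access, rather than fetching the PII and hiding it afterwards.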

Conclusion: The developed REDCap module successfully integrates established open-source solutions for record linkage (E-PIX) and pseudonym management (gPAS) to support pseudonymization workflows within REDCap, while considerably reducing manual effort. It offers enhanced data quality through HIS integration and enforces strict access control. The module goes beyond simple pseudonymization by providing a de-pseudonymized view for data quality checks. The REDCap module is publicly available on GitHub [2].

While the system addresses the initial objectives, it is not a universal solution due to potential site-specific requirements, particularly regarding access control. Future work will explore additional configuration options to accommodate diverse organizational needs. Overall, the module significantly improves the management of pseudonymized research data within REDCap, enhancing both efficiency and data protection.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap) - A metadata-driven methodology and workflow process for providing translational research informatics support. Journal of Biomedical Informatics. 2009 Apr 1;42(2):377–81.
2. Erhardt C. cerhardt/redcap-pseudo-service [Internet]. 2024 [cited 2024 Apr 21]. Available from: https://github.com/cerhardt/redcap-pseudo-service
3. NFDI4Health - Nationale Forschungsdateninfrastruktur für personenbezogene Gesundheitsdaten. White Paper - Verbesserung des Record Linkage für die Gesundheitsforschung in Deutschland. August 2023. DOI: 10.4126/FRL01-006461895
4. Gött R, Stäubert S, Strübing A, Winter A, Merzweiler A, Bergh B, et al. 3LGM2IHE: Requirements for Data-Protection-Compliant Research Infrastructures - A Systematic Comparison of Theory and Practice-Oriented Implementation. Methods Inf Med. 2022 Dec;61(S 02):e134–48.