gms | German Medical Science

Gesundheit – gemeinsam. Joint conference of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), the German Society for Social Medicine and Prevention (DGSMP), the German Society for Epidemiology (DGEpi), the German Society for Medical Sociology (DGMS), and the German Society for Public Health (DGPH)

September 8-13, 2024, Dresden

Studying Privacy Aspects of Learned Knowledge Bases in the Context of Synthetic and Medical Data

Meeting Abstract

  • Valentin Henkys - Johannes Gutenberg University Mainz, Institute of Computer Science, Mainz, Germany
  • Xenia Heilmann - Johannes Gutenberg University Mainz, Institute of Computer Science, Mainz, Germany
  • Daan Apeldoorn - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
  • Konstantin Strauch - University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
  • Bertil Schmidt - Johannes Gutenberg University Mainz, Institute of Computer Science, Mainz, Germany
  • Timm Lilienthal - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
  • Torsten Panholzer - University Medical Center Mainz, Mainz, Germany

Gesundheit – gemeinsam. Joint conference of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), the German Society for Social Medicine and Prevention (DGSMP), the German Society for Epidemiology (DGEpi), the German Society for Medical Sociology (DGMS), and the German Society for Public Health (DGPH). Dresden, September 8-13, 2024. Düsseldorf: German Medical Science GMS Publishing House; 2024. DocAbstr. 462

doi: 10.3205/24gmds021, urn:nbn:de:0183-24gmds0212

Published: September 6, 2024

© 2024 Henkys et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information, see http://creativecommons.org/licenses/by/4.0/.


Text

Retrieving comprehensible rule-based knowledge from medical data by machine learning is a beneficial task, e.g., for automating the creation of decision support systems. While this has recently been studied by means of exception-tolerant hierarchical knowledge bases (i.e., knowledge bases in which rule-based knowledge is represented on several levels of abstraction), privacy concerns have not yet been addressed extensively in this context. However, privacy plays an important role, especially in medical applications. When parts of the original dataset can be restored from a learned knowledge base, there may be a practically and legally relevant risk of re-identification for individuals. In this paper, we study privacy issues of exception-tolerant hierarchical knowledge bases that are learned from data. We propose approaches for detecting and eliminating privacy issues in the learned knowledge bases and present results for synthetic as well as real-world datasets, showing that our approach effectively prevents privacy breaches while only moderately decreasing the inference quality.
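To make the re-identification risk concrete, the following minimal sketch flags rules of a learned knowledge base that are backed by fewer than k records and removes them. It assumes a k-anonymity-style support threshold, which the abstract does not confirm as the authors' actual method. The `Rule` type, the helper functions `support`, `risky_rules`, and `prune`, and the toy records are all hypothetical and serve illustration only.

```python
from typing import Dict, List, NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule: if all premise attributes match, infer the conclusion."""
    premise: Dict[str, str]  # attribute -> required value; empty = most general level
    conclusion: str


def support(rule: Rule, records: List[Dict[str, str]]) -> int:
    """Count the records whose attributes satisfy the rule's premise."""
    return sum(
        all(rec.get(attr) == val for attr, val in rule.premise.items())
        for rec in records
    )


def risky_rules(rules: List[Rule], records: List[Dict[str, str]], k: int) -> List[Rule]:
    """Rules covering at least one but fewer than k records (assumed risk criterion)."""
    return [r for r in rules if 0 < support(r, records) < k]


def prune(rules: List[Rule], records: List[Dict[str, str]], k: int) -> List[Rule]:
    """Keep only rules that cover at least k records."""
    return [r for r in rules if support(r, records) >= k]


# Toy data, invented for illustration (not taken from the study).
records = [
    {"age": "60-69", "smoker": "yes", "diagnosis": "copd"},
    {"age": "60-69", "smoker": "yes", "diagnosis": "copd"},
    {"age": "60-69", "smoker": "no", "diagnosis": "healthy"},
    {"age": "30-39", "smoker": "no", "diagnosis": "healthy"},
    {"age": "30-39", "smoker": "yes", "diagnosis": "asthma"},
]

# Three levels of abstraction: a general rule, a more specific rule, an exception.
rules = [
    Rule({}, "healthy"),
    Rule({"smoker": "yes"}, "copd"),
    Rule({"smoker": "yes", "age": "30-39"}, "asthma"),
]

if __name__ == "__main__":
    for rule in risky_rules(rules, records, k=2):
        print("covers fewer than k records:", rule)
    print("rules kept:", len(prune(rules, records, k=2)))
```

In this toy setting, the most specific exception rule matches a single record, so publishing it would effectively disclose that record. Removing such rules trades some inference quality for privacy, which is the kind of trade-off the abstract reports.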

This work investigates and compares approaches that build on preliminary work by one of the authors.

The authors declare that an ethics committee vote is not required.

