Studying Privacy Aspects of Learned Knowledge Bases in the Context of Synthetic and Medical Data
Published: September 6, 2024
Text
Retrieving comprehensible rule-based knowledge from medical data by machine learning is a beneficial task, e.g., for automating the creation of decision support systems. While this has recently been studied by means of exception-tolerant hierarchical knowledge bases (i.e., knowledge bases in which rule-based knowledge is represented on several levels of abstraction), privacy concerns have not yet been addressed extensively in this context. However, privacy plays an important role, especially in medical applications. When parts of the original dataset can be restored from a learned knowledge base, there may be a practically and legally relevant risk of re-identifying individuals. In this paper, we study privacy issues of exception-tolerant hierarchical knowledge bases that are learned from data. We propose approaches for determining and eliminating privacy issues of the learned knowledge bases and present results for synthetic as well as real-world datasets, showing that our approach effectively prevents privacy breaches while only moderately decreasing the inference quality.
This work compares/investigates approaches involving preliminary work by one of the authors.
The authors declare that an ethics committee vote is not required.
References
1. Apeldoorn D. Comprehensible Knowledge Base Extraction for Learning Agents – Practical Challenges and Applications in Games. Dissertation at TU Dortmund University. Aachen, Mainz; 2023.
2. Sweeney L. k-Anonymity: A Model for Protecting Privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems. 2002;10(5):557–570.
3. LeFevre K, DeWitt DJ, Ramakrishnan R. Mondrian Multidimensional K-Anonymity. In: 22nd International Conference on Data Engineering (ICDE'06). Piscataway: IEEE; 2006. p. 25.
4. Dix J, Faber W, Subrahmanian VS. The Relationship between Reasoning about Privacy and Default Logics. In: Sutcliffe G, Voronkov A, editors. Logic for Programming, Artificial Intelligence, and Reasoning. Berlin, Heidelberg: Springer; 2005. p. 637–650.
5. McKenna R, Miklau G, Sheldon D. Winning the NIST Contest: A scalable and general approach to differentially private synthetic data [preprint]. arXiv. 2021. arXiv:2108.04978.