gms | German Medical Science

GMDS 2013: 58. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

01. - 05.09.2013, Lübeck

A Business Logic System for Mining German Patient Records

Meeting Abstract


  • Philipp Senger - Fraunhofer Institute SCAI, St. Augustin, DE
  • Alexander Klenner - Fraunhofer Institute SCAI, St. Augustin, DE
  • Juliane Fluck - Fraunhofer Institute SCAI, St. Augustin, DE

GMDS 2013. 58. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Lübeck, 01.-05.09.2013. Düsseldorf: German Medical Science GMS Publishing House; 2013. DocAbstr.248

doi: 10.3205/13gmds056, urn:nbn:de:0183-13gmds0564

Published: August 27, 2013

© 2013 Senger et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License. You are free to share (to copy, distribute and transmit) the work, provided the original author and source are credited.



Introduction and goals: Today, service providers in the public health sector face the major challenge of integrating innovations from research and development, improving the quality of treatment, raising patient safety, and reducing the costs of health services. The secondary use of existing biomedical routine data is one way to exploit available data resources in order to improve service quality. This paper presents an application scenario for the secondary use of electronic orthopaedic patient records. The goal is to analyse unstructured German endoprosthetic surgery reports automatically. The approach uses a German Named Entity Recognition (NER) system, followed by a system based on business rules that finds relations between the identified biomedical entities.

Materials and Methods: Corpus of electronic patient records: The corpus of German health records and surgery reports was provided by the University of Erlangen-Nürnberg [1] and the RHÖN-KLINIKUM AG [2]. The corpus contains 256 unique and anonymised reports (with an average length of about 257 words) with diverse structure and content.

Annotation of biomedical entities in patient records: The ProMiner NER system [3] was used to identify entities of different terminologies. These manually curated terminologies cover relevant aspects of the use case such as model and manufacturer of an endoprosthesis, (previous) surgeries, and human anatomy.
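The annotation step can be illustrated with a minimal dictionary lookup. This is only a sketch of the general idea of terminology-based entity recognition, not ProMiner's actual algorithm or API; the terminology names, their contents, and the `annotate` function are hypothetical and chosen to match the example record quoted in the Results.

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    text: str         # matched surface form
    start: int        # character offset where the match begins
    end: int          # character offset just past the match
    terminology: str  # name of the terminology the term came from


# Hypothetical toy terminologies (assumption): the curated terminologies
# described in the abstract are far larger and are not reproduced here.
TERMINOLOGIES = {
    "diagnosis": ["Gonarthrose"],
    "operation": ["Implantation"],
    "endoprosthesis_category": ["Oberflächenersatzprothese"],
    "model": ["balanSys"],
    "manufacturer": ["Mathys"],
    "body_site": ["links", "rechts"],
}


def annotate(text: str) -> List[Entity]:
    """Return all whole-word, case-insensitive terminology matches in text."""
    entities = []
    for terminology, terms in TERMINOLOGIES.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                entities.append(
                    Entity(match.group(0), match.start(), match.end(), terminology)
                )
    # Sort by position so later steps see entities in reading order.
    return sorted(entities, key=lambda e: e.start)


SAMPLE = (
    "Diagnose: Mediale Gonarthrose links, Verfahren: Implantation einer "
    "unikondylären zementierten Oberflächenersatzprothese links "
    "(Typ balanSys der Fa. Mathys)"
)

for entity in annotate(SAMPLE):
    print(entity.terminology, entity.text, entity.start, entity.end)
```

A plain exact-match lookup like this misses spelling variants, inflected forms, and compounds, which are common in German clinical text; a production NER system has to handle them.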

Finding relationships between entities: The subsequent step is to identify relevant relationships between the previously identified entities. The business logic integration platform Drools [4] was used to build an infrastructure for defining rules (currently 64 rules) in a domain-specific language (DSL). The output format is the Operational Data Model (ODM). ODM ensures compatibility with clinical systems and allows easy integration into the clinical context.
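The idea of relating annotated entities through declarative rules can be sketched as follows. This is a simplified Python illustration, not Drools syntax and not one of the 64 rules mentioned above; the rule names, the character-gap criterion, and the `apply_rules` function are assumptions made for the example.

```python
from typing import List, NamedTuple


class Entity(NamedTuple):
    text: str
    start: int
    end: int
    terminology: str


class Relation(NamedTuple):
    kind: str
    head: str
    tail: str


class Rule(NamedTuple):
    kind: str       # name of the relation the rule produces
    head_type: str  # terminology of the first entity
    tail_type: str  # terminology of the second entity
    max_gap: int    # max characters allowed between head end and tail start


# Hypothetical rules (assumption): each links two entity types that occur
# close together, with the tail entity following the head entity.
RULES = [
    Rule("diagnosis_at_site", "diagnosis", "body_site", 10),
    Rule("category_at_site", "endoprosthesis_category", "body_site", 10),
    Rule("model_by_manufacturer", "model", "manufacturer", 20),
]


def apply_rules(entities: List[Entity], rules: List[Rule] = RULES) -> List[Relation]:
    """Fire every rule on every ordered entity pair that satisfies it."""
    relations = []
    for rule in rules:
        for head in entities:
            if head.terminology != rule.head_type:
                continue
            for tail in entities:
                gap = tail.start - head.end
                if tail.terminology == rule.tail_type and 0 <= gap <= rule.max_gap:
                    relations.append(Relation(rule.kind, head.text, tail.text))
    return relations


def make_entity(text: str, term: str, terminology: str) -> Entity:
    """Build an Entity from the first occurrence of term in text."""
    start = text.index(term)
    return Entity(term, start, start + len(term), terminology)


TEXT = "Gonarthrose links, Typ balanSys der Fa. Mathys"
ENTITIES = [
    make_entity(TEXT, "Gonarthrose", "diagnosis"),
    make_entity(TEXT, "links", "body_site"),
    make_entity(TEXT, "balanSys", "model"),
    make_entity(TEXT, "Mathys", "manufacturer"),
]

for relation in apply_rules(ENTITIES):
    print(relation)
```

Keeping the rules as data separate from the matching loop mirrors the motivation for a rule engine: domain experts can add or adjust rules without touching the extraction code.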

Results: Preliminary results are shown for the correct identification of endoprostheses, anatomy, and relevant previous surgeries in unstructured free text. The following is a typical excerpt from a German record: “Diagnose: Mediale Gonarthrose links, Verfahren: Implantation einer unikondylären zementierten Oberflächenersatzprothese links (Typ balanSys der Fa. Mathys, zementiertes Tibiametallimplantat Größe 2, 20 g Refobacin-Palacos-Knochenzement)” The defined rule set extracts the relations between entities such as “Gonarthrose” (Eng. gonarthrosis -> diagnosis and anatomy) and “links” (Eng. left -> body site), as well as between “Implantation” (Eng. implantation -> operation), “Oberflächenersatzprothese” (Eng. surface replacement prosthesis -> category of endoprosthesis), and “links” (-> body site). Additionally, the rule set extracts the model “balanSys”, the manufacturer “Mathys”, and the type of cement applied, “Refobacin-Palacos-Knochenzement”, from this report. This approach achieves a preliminary F-score of 0.66 (precision 0.75, recall 0.58) on a first representative subset of 10 documents with 95 manually annotated relations, 800 entities, and 180 different ODM types. The authors are currently working on a larger gold standard and more complex business rules to extract further information.
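For readers who want to relate the three reported figures, the F-score is the harmonic mean of precision and recall. The short sketch below assumes the balanced F1 variant (the abstract does not state the weighting); the `f_score` function is illustrative and not part of the described system.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta is 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was found correctly
    beta_sq = beta * beta
    return (1.0 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Values as reported in the abstract, already rounded to two decimals.
print(f"{f_score(0.75, 0.58):.3f}")
```

Computed from the rounded values, the result is about 0.654; the reported 0.66 is consistent with this if precision and recall were rounded after the F-score was computed from unrounded counts.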

Discussion: Analysing unstructured patient records makes it possible to detect putative causal relationships in the data that are currently unknown. One possible finding would be a positive correlation between an endoprosthesis model and a high revision rate. Our system enables researchers to identify such facts automatically and hence to improve the quality of patient care in the long term.


1. External link
2. External link
3. Hanisch D, Fundel K, Mevissen HT, Zimmer R, Fluck J. ProMiner: Rule based protein and gene entity recognition. BMC Bioinformatics. 2005;6(Suppl 1):S14.
4. External link