gms | German Medical Science

49. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds)
19. Jahrestagung der Schweizerischen Gesellschaft für Medizinische Informatik (SGMI)
Jahrestagung 2004 des Arbeitskreises Medizinische Informatik (ÖAKMI)

26 to 30 September 2004, Innsbruck/Tyrol

Tools and data structures for content annotation in medical reference images

Meeting Abstract (gmds2004)

  • Thomas Wittenberg (corresponding author, presenter) - Fraunhofer IIS, Erlangen, Germany
  • Matthias Grobe - Fraunhofer IIS, Erlangen, Germany
  • Heiko Kuziela - Fraunhofer IIS, Erlangen, Germany
  • Christian Münzenmayer - Fraunhofer IIS, Erlangen, Germany
  • Klaus Spinnler - Fraunhofer IIS, Erlangen, Germany
  • Robert Schmidt - Fraunhofer IIS, Erlangen, Germany

Kooperative Versorgung - Vernetzte Forschung - Ubiquitäre Information. 49. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds), 19. Jahrestagung der Schweizerischen Gesellschaft für Medizinische Informatik (SGMI) und Jahrestagung 2004 des Arbeitskreises Medizinische Informatik (ÖAKMI) der Österreichischen Computer Gesellschaft (OCG) und der Österreichischen Gesellschaft für Biomedizinische Technik (ÖGBMT). Innsbruck, 26.-30.09.2004. Düsseldorf, Köln: German Medical Science; 2004. Doc04gmds086

The electronic version of this article is the complete one and can be found online at:

Published: September 14, 2004

© 2004 Wittenberg et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License. You are free: to Share – to copy, distribute and transmit the work, provided the original author and source are credited.



Introduction and Objective

Depending on the amount and type of reference image datasets used, different stages in the development of medical image processing (MIP) algorithms can be identified. For radiologic image acquisition modalities such as X-ray, angiography, CT, or MRI, the first two stages of algorithm development are usually based on images obtained from phantoms (stage 1) or modality simulators (stage 2). In the first case, the geometrical and radiological properties of the depicted phantom objects (the reference knowledge known a priori) can be evaluated with regard to the modality used. In the second case, different known properties of the image acquisition modality can easily be simulated, controlled and evaluated with respect to the depicted objects. For non-radiologic medical image acquisition modalities, such as endoscopy or microscopy, algorithm development starts with stage 3, the development dataset (DD). A development dataset can be regarded as a minimum dataset in the sense that it comprises no more images than necessary to reflect the (a) problem-, (b) domain- and (c) modality-specific image variety. The purpose of a DD is to foster the development of algorithms for medical image analysis by serving as the base reference for method evaluation by the developer [1]. The statistical properties and the variety of images in this dataset should be similar to those of the stage 4 dataset, the so-called system trial dataset (STD). Compared to the DD, the STD is a rather large dataset that allows exhaustive testing and evaluation of MIP algorithms and should contain as many images as possible of the respective image modality and domain [2]. Finally, the last stage (stage 5) consists of the evaluation of the algorithms in the field in a multi-centre evaluation study, in which image data are acquired and processed under real conditions in one or more clinics or medical centres.

Besides the numerous images contained in each of these MIP datasets, the reference information about their content with regard to the corresponding image processing task is as important as the images themselves. This reference information holds the so-called ground truth or gold standard, and is needed for the evaluation of the algorithms in each development stage.

Taxonomy of reference information types

Since three of the most important tasks in medical image processing are segmentation, classification and registration, we distinguish several types of images, with respect to their MIP tasks and the needed reference annotation:

Type I: Images with just one verbal or coded annotation. The related MIP tasks are image retrieval, case-based reasoning or full image classification, that is, finding the best possible match for a given image in a large image database, based on global image features and usually rather independent of the image contents. According to this best match, the input image is given a class label. For this task, the only reference annotation needed is the correct class label for each image.

Type II: Images depicting one single object of interest. For such images, the image processing task is twofold: segmentation of the object, and its classification. An example of this case is a dermatoscopic image depicting one skin lesion. Both tasks require reference information. For the segmentation task, a binary image mask can serve as visual annotation reference. For classification, predefined coded classes such as "naevi" or "melanoma" are needed.

Type III: Images showing more than one object. In this case, three tasks can be identified: segmentation of all objects, classification of each object, and counting the objects. Thus, for each object, a reference description (class label) as well as a visual annotation of its correct form is needed. In addition, a reference number of the objects depicted is mandatory. Examples of this image type are micrographs depicting different types of cells, where each cell has to be extracted and classified according to its morphological and textural appearance, and where the correct count of cells per class and in total is important.

Type IV: Multi-modal images, where the same anatomic region is depicted with different imaging modalities. Examples are combinations of brightfield and fluorescence images of cells, depicting different cell attributes. Here one further task, namely image registration, can be identified. The respective reference information is a set of corresponding markers or a transfer function between the images.
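The taxonomy above can be summarized as a small lookup table that lists, per image type, which reference annotations have to be present. The following is only a minimal sketch: the names `ImageType`, `REQUIRED_REFERENCE` and `missing_reference` as well as the item labels are hypothetical, and it assumes that Type IV requires the Type III items plus a registration reference, which the text implies but does not state explicitly.

```python
from enum import Enum


class ImageType(Enum):
    TYPE_I = 1    # one verbal or coded annotation per image
    TYPE_II = 2   # one single object of interest
    TYPE_III = 3  # more than one object
    TYPE_IV = 4   # multi-modal images of the same anatomic region


# Reference annotations needed per image type (labels are illustrative).
REQUIRED_REFERENCE = {
    ImageType.TYPE_I: {"class_label"},
    ImageType.TYPE_II: {"class_label", "binary_mask"},
    ImageType.TYPE_III: {"class_label", "binary_mask", "object_count"},
    # Assumption: Type IV adds a registration reference (corresponding
    # markers or a transfer function) on top of the Type III items.
    ImageType.TYPE_IV: {"class_label", "binary_mask", "object_count",
                        "registration_reference"},
}


def missing_reference(image_type, provided):
    """Return the reference items still missing for the given image type."""
    return REQUIRED_REFERENCE[image_type] - set(provided)
```

Such a check could be run before an image is admitted to a reference dataset, for example `missing_reference(ImageType.TYPE_III, ["class_label"])` reports that the mask and the object count are still missing.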

In order to organize and handle these different reference annotations with respect to their acquisition and use for evaluation, tools and data structures are needed. Such a tool is presented in the remainder of this paper.


In our recent work in the field of segmentation and classification of cells [3], we developed a reference annotation tool to acquire and organize all the needed reference information from the experts.

Our annotation tool consists of three parts: (a) a graphical user interface (GUI) to display and interactively annotate image datasets, (b) a hierarchical data structure to organize the image datasets and their corresponding reference information, and (c) an XML data structure for persistence and archiving. At its root level, the data structure contains the set of all image tuples present in the database, where each tuple holds the images of one scene in the available modalities. For most applications, one image modality is sufficient.

Also located at the root level is a list of classes, each consisting of an enumeration value, a short application-dependent class description (e.g. "benign", "malignant") and a class color code used to visualize the image segmentation with respect to the different classes.
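The root level described above could be modelled roughly as follows. This is a sketch under stated assumptions, not the tool's actual implementation: the class names `ClassEntry`, `ImageTuple` and `AnnotationDatabase`, the RGB color representation and the file names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClassEntry:
    class_id: int      # enumeration value
    description: str   # short, application-dependent description
    color: tuple       # assumed RGB color code for visualization


@dataclass
class ImageTuple:
    # Maps a modality name to an image file; often just one modality.
    modalities: dict


@dataclass
class AnnotationDatabase:
    image_tuples: list = field(default_factory=list)
    classes: list = field(default_factory=list)

    def class_by_id(self, class_id):
        """Look up a class entry by its enumeration value."""
        for entry in self.classes:
            if entry.class_id == class_id:
                return entry
        raise KeyError(class_id)


# Hypothetical example content.
db = AnnotationDatabase()
db.classes.append(ClassEntry(0, "benign", (0, 255, 0)))
db.classes.append(ClassEntry(1, "malignant", (255, 0, 0)))
db.image_tuples.append(ImageTuple({
    "brightfield": "cell_001_bf.png",
    "fluorescence": "cell_001_fl.png",
}))
```

Keeping the class list at the root, rather than inside each image, means that every annotation refers to a class by its enumeration value only, so descriptions and colors can be changed in one place.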

Using the graphical user interface, all image tuples of the image dataset can be displayed. For each image tuple, the displayed modality can easily be switched. For an image classification task (Type I), a rectangle covering the complete image is used as the spatial reference for the complete image content. By connecting this rectangle (referred to as an image object) to a certain class, the corresponding image is annotated with this class label.
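The Type I case can be illustrated with a short sketch in which a rectangle spanning the whole image is linked to a class. The names `Rectangle`, `ImageObject` and `annotate_whole_image` are hypothetical, and the pixel coordinate convention (origin at the top-left corner) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rectangle:
    x: int
    y: int
    width: int
    height: int


@dataclass
class ImageObject:
    shape: Rectangle   # spatial reference of the annotation
    class_id: int      # enumeration value of the assigned class


def annotate_whole_image(width, height, class_id):
    """Create an image object whose rectangle covers the complete image."""
    return ImageObject(Rectangle(0, 0, width, height), class_id)


# Hypothetical 640x480 image labelled with class 1.
whole = annotate_whole_image(640, 480, 1)
```

Treating the full-image label as an ordinary image object means that Type I annotations need no special case in the data structure.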

To annotate image objects with a form and a class, the objects have to be drawn manually on the screen with a tool similar to a paintbrush. An image object is annotated by drawing its contour and flood-filling it with the corresponding class color. In the internal data structure, the object is represented by a binary pixel mask and its coordinates, which can easily be saved in standard image file formats. If the object of interest is depicted as several parts, the visible parts can be drawn individually, implicitly dividing the object into its visible parts, which are denoted as object regions. Thus, each object can be represented by one or more image regions. If more than one image modality is used, the regions of one object can also be distributed over the different image modalities. As mentioned, one example is multi-modal images of cells. In the fluorescence image, only the cell nuclei are visible, while in the brightfield image the corresponding cytoplasm can be seen. Thus, to annotate a complete cell, the cell nucleus region is drawn in the fluorescence image, while the related stained cell region is marked in the brightfield image. Through the class annotation, each complete cell object receives an individual class label. Besides rectangles and regions, point markers can also be used to annotate images and objects. This technique may be useful to denote landmarks and corresponding points in the different modalities (e.g. PET and CT) for registration problems. For persistence, the complete data structure can be saved using standard XML together with its binary image masks.
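How one annotated object with regions in several modalities might be written to XML is sketched below using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`object`, `region`, `marker`, `classId`, `mask`) and the file names are assumptions made for illustration; the abstract does not specify the tool's actual XML schema.

```python
import xml.etree.ElementTree as ET


def object_to_xml(obj_id, class_id, regions, markers=()):
    """Serialize one annotated object.

    regions: iterable of (modality, mask_file) pairs, one per object region;
             the binary masks themselves are stored as separate image files.
    markers: iterable of (modality, x, y) point markers.
    """
    elem = ET.Element("object", id=str(obj_id), classId=str(class_id))
    for modality, mask_file in regions:
        ET.SubElement(elem, "region", modality=modality, mask=mask_file)
    for modality, x, y in markers:
        ET.SubElement(elem, "marker", modality=modality, x=str(x), y=str(y))
    return elem


# A complete cell: nucleus region from the fluorescence image,
# stained cell region from the brightfield image (hypothetical files).
cell = object_to_xml(
    obj_id=1,
    class_id=2,
    regions=[("fluorescence", "cell1_nucleus.png"),
             ("brightfield", "cell1_cytoplasm.png")],
)
xml_text = ET.tostring(cell, encoding="unicode")
```

Referencing the mask files from the XML, instead of embedding pixel data, matches the statement that the structure is saved as XML together with its binary image masks.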


The benefit of the described structured annotation tool is that it allows standardized, quantitative and objective comparisons between different annotations of the same image dataset. A typical scenario is the evaluation of MIP algorithms, in which the classification, segmentation or registration results obtained on the image dataset are compared to the manually annotated reference information. Another important feature is the comparison of two manual reference annotations, since it is known that two individuals seldom yield the same results, especially with regard to difficult classification tasks. Therefore, because an image dataset is organized together with its reference annotation, standardized comparisons become much more efficient, more objective and easier to reproduce.
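The abstract does not name the measures used for these comparisons. As one possible illustration, the sketch below computes the Dice overlap between two binary masks and the fraction of matching class labels between two annotators; both measures and the function names `dice` and `label_agreement` are assumptions chosen for this example.

```python
def dice(mask_a, mask_b):
    """Dice overlap of two binary masks given as nested lists of 0/1."""
    a = [bool(p) for row in mask_a for p in row]
    b = [bool(p) for row in mask_b for p in row]
    if len(a) != len(b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    # Two empty masks are treated as identical.
    return 1.0 if total == 0 else 2.0 * intersection / total


def label_agreement(labels_a, labels_b):
    """Fraction of objects that received the same class label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(labels_a, labels_b) if x == y)
    return matches / len(labels_a)


# Hypothetical comparison of an algorithm result with the reference.
overlap = dice([[1, 1], [0, 0]], [[1, 0], [0, 0]])
agreement = label_agreement(["benign", "malignant", "benign"],
                            ["benign", "benign", "benign"])
```

The same two functions apply to both scenarios mentioned above: algorithm output versus manual reference, and one manual annotation versus another.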


This work has been partially funded by the Bavarian Research Foundation ('Bayerische Forschungsstiftung') under contract number 409/00.


[1] Horsch A, Wittenberg A, Spinnler K. Concept and Roadmap for Establishing an International Reference Image Database for Medical Image Processing R&D Groups. In: Proc. Sixth Korea-Germany Joint Workshop on Advanced Medical Image Processing; 2002.
[2] Prinz M, Horsch A, Schneider S, et al. A Reference Image Database for Medical Image Processing. In: 2nd Annual Conf, Austrian Scientific Soc. for Telemed., OCG-Schriftenreihe, Vienna; 2002: 45-51.
[3] Wittenberg T, Grobe G, Münzenmayer C, Kuziela H, Spinnler K. A semantic approach for automatic segmentation of overlapping cells. Methods of Informatics in Medicine; to appear, 2004.