gms | German Medical Science

51. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (gmds)

September 10-14, 2006, Leipzig

Identifying pattern in microarray expression series using algorithmic information theory

Meeting Abstract

  • S.E. Ahnert - Theory of Condensed Matter, Cavendish Laboratory, Cambridge
  • F.C.S. Brown - Département de Mathématiques et Applications, ENS, Paris
  • T.M.A. Fink - Institut Curie, Paris
  • K. Willbrand - Laboratoire de Physique Statistique, ENS, Paris

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (gmds). 51. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. Leipzig, 10.-14.09.2006. Düsseldorf, Köln: German Medical Science; 2006. Doc06gmds149

The electronic version of this article is the complete one and can be found online at: http://www.egms.de/en/meetings/gmds2006/06gmds088.shtml

Published: September 1, 2006

© 2006 Ahnert et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc-nd/3.0/deed.en). You are free: to Share – to copy, distribute and transmit the work, provided the original author and source are credited.


Text

We introduce a method for detecting patterns in data series that is independent of the nature of the pattern. Our approach is to replace each data series with an alternative description from which the original data can be fully recovered. Data series with short descriptions, that is, series that are significantly compressible, are more likely to result from simple underlying mechanisms than incompressible series. We assume that compressible gene expression profiles are biologically or medically interesting. We show that the compression k, measured in bits, is a universal currency: it is independent of the type of noise underlying the data, and it lets us order data series by significance. The method is successfully tested on microarray time series of the yeast cell cycle.
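
The idea of ordering series by their compression in bits can be illustrated with a small sketch. It is not the authors' description scheme, which this abstract does not specify. As stand-ins it assumes zlib as the compressor, a coarse quantisation of each series (lossy, unlike the fully recoverable description described above), and the median over shuffled copies of the same series as the "incompressible" baseline. The names description_bits, compression_bits, levels and n_shuffles are hypothetical.

```python
import math
import random
import statistics
import zlib


def description_bits(series, levels=8):
    """Size in bits of a zlib-compressed, quantised encoding of the series.

    Assumption: zlib stands in for the (unspecified) description scheme.
    The series is quantised to `levels` values and stored as its first
    value followed by successive differences (mod 256).
    """
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for constant series
    quantized = [int((x - lo) / span * (levels - 1) + 0.5) for x in series]
    diffs = [(b - a) % 256 for a, b in zip(quantized, quantized[1:])]
    payload = bytes(quantized[:1] + diffs)
    return 8 * len(zlib.compress(payload, 9))


def compression_bits(series, n_shuffles=200, seed=0):
    """Bits saved relative to the median of randomly shuffled copies.

    Assumption: shuffling destroys any ordering pattern, so the shuffled
    copies approximate how well a patternless series would compress.
    """
    rng = random.Random(seed)
    observed = description_bits(series)
    baseline = []
    for _ in range(n_shuffles):
        shuffled = list(series)
        rng.shuffle(shuffled)
        baseline.append(description_bits(shuffled))
    return statistics.median(baseline) - observed


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic profiles, not microarray data.
    profiles = {
        "periodic": [math.sin(2 * math.pi * t / 16 + 0.3) for t in range(64)],
        "noise": [rng.gauss(0, 1) for _ in range(64)],
    }
    # Order profiles from most to least compressible.
    ranked = sorted(profiles, key=lambda n: compression_bits(profiles[n]), reverse=True)
    for name in ranked:
        print(name, compression_bits(profiles[name]))
```

General-purpose compressors carry a fixed overhead that dominates on very short series, so a sketch like this is only meaningful for ranking series of equal length against each other; the absolute values should not be read as the k of the abstract.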