gms | German Medical Science

49. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds)
19. Jahrestagung der Schweizerischen Gesellschaft für Medizinische Informatik (SGMI)
Jahrestagung 2004 des Arbeitskreises Medizinische Informatik (ÖAKMI)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie
Schweizerische Gesellschaft für Medizinische Informatik (SGMI)

26. bis 30.09.2004, Innsbruck/Tirol

Integration of Genomic Data in Electronic Health Records: Chances and Dilemmas

Meeting Abstract (gmds2004)


  • Ulrich Sax (corresponding author, presenting speaker) - Children's Hospital Informatics Program, Boston, USA
  • Steffen Schmidt - Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, USA
  • Isaac S. Kohane - Children's Hospital Informatics Program, Harvard Medical School, Boston, USA

Kooperative Versorgung - Vernetzte Forschung - Ubiquitäre Information. 49. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds), 19. Jahrestagung der Schweizerischen Gesellschaft für Medizinische Informatik (SGMI) und Jahrestagung 2004 des Arbeitskreises Medizinische Informatik (ÖAKMI) der Österreichischen Computer Gesellschaft (OCG) und der Österreichischen Gesellschaft für Biomedizinische Technik (ÖGBMT). Innsbruck, 26.-30.09.2004. Düsseldorf, Köln: German Medical Science; 2004. Doc04gmds080

The electronic version of this article is the complete one and can be found online at:

Published: September 14, 2004

© 2004 Sax et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License. You are free to share (to copy, distribute and transmit) the work, provided the original author and source are credited.




Although the human genome has now been sequenced [1], [2], the difficult part is only beginning, as the function of many genomic regions is still unknown. The post-genomic era of medicine not only challenges patients, clinicians and the healthcare system, but also has a huge impact on healthcare information systems.

Genomic patient information will soon be part of the patient's healthcare record, while most clinical information systems are not prepared to deal with genomic data.

Therefore, we need new data models and data structures to integrate genomic data into the Electronic Health Record (EHR), as well as inference mechanisms to connect genotype and phenotype data. Furthermore, genomic data is an unusual type of data: predictions and interpretations drawn from a patient's DNA sequence will have to be repeated frequently as research yields new insights. The raw data should therefore remain easily accessible.


Integration strategy

Genomic data could be integrated into an EHR in several ways. For patient-centered integration, we need the lab metadata, such as the specimen, the methods used and the corresponding interpretations. It also seems crucial to include the raw data, whether from sequencing labs or from microarray experiments on patients. Another strategy would be to include pointers to external data sources in the EHR, but this could cause security and reliability problems.
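The two strategies can be contrasted in a small sketch. This is only an illustration under stated assumptions: the class `GenomicLabEntry`, its field names, the example URI and the use of a SHA-256 checksum to detect a changed or unreachable external source are all hypothetical and are not taken from any EHR standard.

```python
from dataclasses import dataclass
from typing import Optional
import hashlib


@dataclass(frozen=True)
class GenomicLabEntry:
    """One patient-centered genomic entry in a hypothetical EHR."""
    patient_id: str
    specimen: str                          # lab metadata, e.g. "peripheral blood"
    method: str                            # lab metadata, e.g. "sequencing"
    raw_data: Optional[str] = None         # strategy 1: raw data embedded in the record
    external_uri: Optional[str] = None     # strategy 2: pointer to an external source
    external_sha256: Optional[str] = None  # checksum noted when the pointer was stored

    def is_embedded(self) -> bool:
        return self.raw_data is not None


def checksum(data: str) -> str:
    """SHA-256 hex digest of an ASCII sequence string."""
    return hashlib.sha256(data.encode("ascii")).hexdigest()


def pointer_still_valid(entry: GenomicLabEntry, fetched: Optional[str]) -> bool:
    """Compare externally fetched data with the checksum stored in the record.

    `fetched` is None when the external source could not be reached.
    """
    if fetched is None or entry.external_sha256 is None:
        return False
    return checksum(fetched) == entry.external_sha256


# Hypothetical example entries (assumed values, not real patient data).
embedded = GenomicLabEntry("p1", "peripheral blood", "sequencing", raw_data="ACGTACGT")
linked = GenomicLabEntry(
    "p2", "tumor biopsy", "sequencing",
    external_uri="https://example.org/seq/p2",
    external_sha256=checksum("ACGTTTGA"),
)
```

The embedded entry is self-contained, while the linked entry depends on the external source staying reachable and unchanged, which is where the reliability concern arises.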

Inference mechanisms

The current interpretation of a patient's raw data has to be versioned, because interpretations can change as science gains new insights. We need electronic agents that regularly re-analyze the stored data using current scientific knowledge.
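A minimal sketch of such versioning might look as follows. The names `Interpretation`, `PatientGenome` and `reanalyze`, the knowledge-release labels and the toy motif rule are all assumptions made for illustration; a real agent would apply actual annotation knowledge.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Interpretation:
    version: int            # increases with every re-analysis
    knowledge_release: str  # label of the knowledge used (hypothetical)
    finding: str


@dataclass
class PatientGenome:
    patient_id: str
    raw_sequence: str       # the raw data stays unchanged
    interpretations: List[Interpretation] = field(default_factory=list)

    def current(self) -> Interpretation:
        return self.interpretations[-1]


def reanalyze(genome: PatientGenome, knowledge_release: str,
              interpret: Callable[[str], str]) -> Interpretation:
    """Run an interpreter on the raw data and append the result as a new version."""
    result = Interpretation(
        version=len(genome.interpretations) + 1,
        knowledge_release=knowledge_release,
        finding=interpret(genome.raw_sequence),
    )
    genome.interpretations.append(result)
    return result


# Toy interpreters standing in for older and newer scientific knowledge.
def interpret_old(seq: str) -> str:
    return "no known variant"


def interpret_new(seq: str) -> str:
    return "motif GATTACA present" if "GATTACA" in seq else "no known variant"


genome = PatientGenome("p1", "CCGATTACAGG")
reanalyze(genome, "release-A", interpret_old)
reanalyze(genome, "release-B", interpret_new)
```

Earlier interpretations are kept rather than overwritten, so it remains visible which knowledge led to which finding.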

Additionally, horizontal inference agents aim to find correlations between genotype and phenotype, in both directions, across the EHRs of many patients.
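One very simple way a horizontal agent could look for such a correlation is a 2x2 contingency table with an odds ratio. This is a sketch of one possible technique, not a method described by the authors; the function names and the example counts are invented for illustration.

```python
from typing import Iterable, Tuple


def contingency(records: Iterable[Tuple[bool, bool]]) -> Tuple[int, int, int, int]:
    """Count patients by (has_variant, has_phenotype).

    Returns (a, b, c, d):
      a = variant and phenotype,  b = variant only,
      c = phenotype only,         d = neither.
    """
    a = b = c = d = 0
    for has_variant, has_phenotype in records:
        if has_variant and has_phenotype:
            a += 1
        elif has_variant:
            b += 1
        elif has_phenotype:
            c += 1
        else:
            d += 1
    return a, b, c, d


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio with 0.5 added to every cell (Haldane-Anscombe correction),
    so that empty cells do not cause a division by zero."""
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


# Invented example: 20 patients, each reduced to (has_variant, has_phenotype).
records = (
    [(True, True)] * 8 + [(True, False)] * 2
    + [(False, True)] * 3 + [(False, False)] * 7
)
table = contingency(records)
ratio = odds_ratio(*table)
```

An odds ratio well above 1 would only flag a candidate association for further study; it says nothing about causation or statistical significance.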

Data Types

We have to deal with several new data types that are not yet standardized for EHR integration. Genomic data (base sequences) are simple ASCII streams, as are proteomic data. Protein data (amino acid sequences) might be derived directly from genomic data (annotation of the genome), so one would simply need a description of the exact location of the genes in the genome. Microarray data can be delivered in the "Minimum Information About a Microarray Experiment" (MIAME) format.
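The idea of deriving protein data from a base sequence plus a gene location can be sketched as follows. The sketch makes simplifying assumptions that are labeled in the code: the gene lies on the forward strand, has no introns, and its location is given as a 0-based half-open interval. `GeneLocation` and `translate` are hypothetical names; only the standard genetic code table is established fact.

```python
from dataclasses import dataclass
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


@dataclass(frozen=True)
class GeneLocation:
    """Hypothetical annotation: gene position within a base sequence."""
    name: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def translate(genome: str, gene: GeneLocation) -> str:
    """Translate the annotated region into an amino acid sequence.

    Assumes a forward-strand gene without introns; stops at the first stop codon.
    """
    coding = genome[gene.start:gene.end].upper()
    protein = []
    for i in range(0, len(coding) - 2, 3):
        amino_acid = CODON_TABLE[coding[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy sequence (invented): the region 2..14 reads ATG GCT AAA TGA.
genome = "CCATGGCTAAATGACC"
toy_gene = GeneLocation("toy_gene", 2, 14)
protein = translate(genome, toy_gene)
```

With this approach only the raw sequence and the annotation need to be stored; the protein sequence can be recomputed whenever the annotation changes.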

Data Model

The Health Level Seven (HL7) Clinical Genomics Special Interest Group (SIG) proposed changes to the HL7 Reference Information Model (RIM), as well as to the clinical-study-centered Clinical Data Interchange Standards Consortium (CDISC) standard [3]. A patient's DNA sequence should be attached at the top layer of the model, with the possibility of versioning it. Alternatively, somatic mutations could be stored like single nucleotide polymorphisms (SNPs). All derived genomic information, such as differences from a reference DNA sequence, can be considered as lab tests.
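Treating differences from a reference sequence as lab-test-like results could be sketched as below. This does not reproduce the HL7 or CDISC models; `VariantObservation` and `substitutions` are hypothetical names, and the sketch assumes pre-aligned sequences of equal length, so it only detects single-base substitutions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VariantObservation:
    """A derived result, comparable to a single lab test value."""
    position: int   # 1-based position in the reference sequence
    reference: str  # base in the reference
    observed: str   # base in the patient's sequence


def substitutions(reference: str, patient: str) -> List[VariantObservation]:
    """List single-base differences between two pre-aligned sequences."""
    if len(reference) != len(patient):
        raise ValueError("this sketch only handles equal-length, pre-aligned sequences")
    return [
        VariantObservation(i + 1, ref_base, pat_base)
        for i, (ref_base, pat_base) in enumerate(zip(reference, patient))
        if ref_base != pat_base
    ]


# Invented toy sequences.
observations = substitutions("ACGTACGT", "ACGAACGC")
```

Because the observations are derived, they can be regenerated from the stored raw sequence whenever the reference sequence is updated.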


Historically, clinical patient data and research data as well as clinical study data were separated in different databases and systems.

This classic separation of clinical data and research data constrains new insight more than ever before. In order to find the relationship between genotype and phenotype, we need both clinical and research data accessible in the EHR.

As the genotype alone does not capture the individual patient's state, we additionally need to assess and quantify the environmental influences. This comprises the patient history, physical condition, laboratory studies and imaging data [4].

Currently, genomic data is not represented in the EHR standard models, but an HL7 SIG is creating an HL7 Clinical Genomics Model [3].

Data integration seems to be necessary at different layers. Unlike with clinical chemistry data, for example, we need not only the results but also the raw data. Deriving results is not a one-time step; it has to be repeated frequently, as new insights could lead to a different result.

In the current bioinformatics world, using references to external information can be difficult, as this information may not be persistent (consider the gi numbers assigned to each submission to GenBank at NCBI).

Horizontal inference agents could try to infer relationships between genotype and phenotype information using EHR data from many patients.

As the enriched EHR will contain very sensitive genomic data, confidentiality and privacy concerns have to be addressed, for two reasons: genomic data is much more predictive of the patient's health status than any other test, and the genome uniquely identifies the patient [5].


Given the necessity to capture both the environment and the genomic state of a patient, as well as their interaction, clinical information systems have to be redesigned. Early examples of such systems include IBM's Genomics Messaging System [6], which was recently rolled out at the Mayo Clinic [7].

Beyond the development of new data models, both vertical (patient-centered) and horizontal (study-centered) agents have to be developed in order to link genotype and phenotype. While genotyping seems easy to automate, this is not the case for clinical information, so we need standard ontologies. First attempts show the complexity of this task [3], [8].


[1] Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, et al. The sequence of the human genome. Science 2001;291(5507):1304-51.
[2] Istrail S, Sutton GG, Florea L, Halpern AL, Mobarry CM, Lippert R, et al. Whole-genome shotgun assembly and comparison of human genome assemblies. Proc Natl Acad Sci U S A 2004;101(7):1916-21.
[3] HL7 Clinical Genomics SIG. HL7 Clinical Genomics SIG San Diego Meeting Minutes Jan 21-22, 2004. In: HL7; 2004. CG%20SIG%20Meeting%20Minutes%202004%2001%2021%20v2.doc
[4] Ford JH 2nd, Turner A, Yoshii A. Information requirements of genomics researchers from the patient clinical record. J Healthc Inf Manag 2002;16(4):56-61.
[5] Kohane IS. Bioinformatics and clinical informatics: the imperative to collaborate. J Am Med Inform Assoc 2000;7(5):512-6.
[6] IBM_Research. Genomics Messaging System (GMS). In: IBM Haifa Labs; 2004.
[7] Snow D. Mayo Amassed Mounds of Data. In: Wired News; 2003. medtech/0,1286,61633,00.html?tw=wn_tophead_4
[8] Holloway E. Meeting Review: From Genotype to Phenotype: Linking Bioinformatics and Medical Informatics Ontologies. Comp Funct Genom 2002;2002(3):447-450.