### Article

## Selection of robust and efficient two-color microarray designs

Published: 14 September 2004

### Text

#### Introduction

Several papers on the efficient design and analysis of factorial two-color microarray experiments have been published in recent years [Ref. 1], [Ref. 2]. The standard common reference design has been shown to be less efficient than other competing designs. The simple loop design, for example, compares the samples with one another in a daisy-chain fashion. For larger numbers of samples, designs which interweave two or more loops can be an efficient alternative to these two designs. A topic which has not been covered in the literature is the robustness of different microarray designs with respect to missing values. As long as the principles of balance and replication are applied, the common reference design seems to be robust to a loss of arrays resulting from poor-quality hybridization (or any other reason). With more complex designs, the efficiency advantage that loop designs gain by creating multiple links among the samples is jeopardized by a potential lack of robustness when one or more arrays are discarded. Since arbitrary pairs of samples can be contrasted only indirectly through chains of comparisons, the loss of even a few arrays may break the chain.

#### Methods

In this talk we investigate the robustness of several designs by comparing them for different experimental questions. We apply an ANOVA model to the residuals of an initial normalization. Because experimenters often want to address very specific questions with microarray experiments, we use special contrast vectors or matrices which describe the biological questions. We then search for efficient designs, taking into account (*i*) the estimability of the contrasts and (*ii*) the variances of the treatment effect estimators. To this end, an efficiency measure *e*(*X*) is computed for a given design *X*. If the experimental question is given in the form of a single contrast vector, we minimize the variance factor associated with the design(s) under investigation. If the experimental question is given in the form of a contrast matrix (e.g., in factorial designs), we minimize the maximum eigenvalue of the (normalized) dispersion matrix. This efficiency measure can be used to compare several designs with each other, where *e*(*X*) is defined in such a way that larger values indicate a better design [Ref. 3].
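The following Python sketch illustrates one way such a measure could be computed. It is not the authors' definition (that is given in [Ref. 3]); it assumes that each array yields one log-ratio modelled as a common dye effect plus the difference of two sample effects, and that *e*(*X*) is the reciprocal of the largest eigenvalue of the dispersion matrix of the contrast estimators, without any further normalization. The names `design_matrix` and `efficiency` and the contrast scaling of 1/2 are hypothetical choices made for illustration.

```python
import numpy as np

def design_matrix(pairs, n_samples):
    """One row per array: a dye-effect column, then +1 (Cy5) / -1 (Cy3) per sample."""
    X = np.zeros((len(pairs), n_samples + 1))
    X[:, 0] = 1.0  # assumption: the same dye effect enters every log-ratio
    for row, (cy5, cy3) in enumerate(pairs):
        X[row, 1 + cy5] = 1.0
        X[row, 1 + cy3] = -1.0
    return X

def efficiency(X, C, tol=1e-8):
    """1 / largest eigenvalue of C (X'X)^- C'; 0.0 if any contrast is not estimable."""
    X = np.asarray(X, dtype=float)
    C = np.atleast_2d(np.asarray(C, dtype=float))
    if X.shape[0] == 0:
        return 0.0
    A = C @ np.linalg.pinv(X, rcond=1e-10)   # least-squares weights of each estimator
    if not np.allclose(A @ X, C, atol=tol):  # rows of C must lie in the row space of X
        return 0.0
    dispersion = A @ A.T                     # equals C (X'X)^- C' for estimable C
    return float(1.0 / np.linalg.eigvalsh(dispersion).max())

# Samples 0..3 are A1B1, A1B2, A2B1, A2B2; sample 4 is the common reference.
# Parameter order: dye effect, then the five sample effects.
interaction = [0.0, 0.5, -0.5, -0.5, 0.5, 0.0]

X_ref = design_matrix([(i, 4) for i in range(4)] * 2, n_samples=5)            # 8 arrays
X_loop = design_matrix([(0, 1), (1, 3), (3, 2), (2, 0)] * 2, n_samples=5)     # 8 arrays

print(round(efficiency(X_ref, interaction), 6))
print(round(efficiency(X_loop, interaction), 6))
```

For a single contrast vector the dispersion matrix is 1x1, so the same function returns the reciprocal of the variance factor. Under the stated assumptions the two printed values are 2 for the common reference design and 8 for the simple loop design with 8 arrays each.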

The main topic of this talk is to compare the behavior of several designs for a varying number *m* of missing slides using the efficiency measure above. The comparisons are performed with respect to efficiency (as long as the related contrasts remain estimable) and network breakdown (the minimum number of missing arrays leading to one or more non-estimable effects). The comparisons are performed for different factorial treatment structures: 1x3, 2x2 and 2x3 designs. Depending on the number of factors and their levels, a variety of basic designs *X* are investigated, including, among others, the common reference design, swap designs, and simple loop and swapped loop designs. Experimental questions under investigation include simple and main effects for all factors as well as interaction effects. For each combination of factorial treatment structure, *X* and experimental question, we compute *e*(*X*) by systematically leaving out *m* = 0, 1, ..., 4 arrays (which are treated as missing). For each *m* and *X* we compute (*i*) the median efficiency across all configurations when leaving out *m* arrays, (*ii*) the range of efficiency and (*iii*) the proportion of robust designs, i.e., the proportion of array constellations leading to estimable contrasts. In addition, worst-case scenarios are derived, in which a minimum number of missing values leads to non-estimable effects.
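The leave-*m*-out procedure can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a log-ratio model with a common dye effect, takes the efficiency as the reciprocal of the variance factor of a single contrast, and computes the median and range over the estimable constellations only. The function names and the contrast scaling are hypothetical.

```python
from itertools import combinations
from statistics import median
import numpy as np

def efficiency(X, c, tol=1e-8):
    """Reciprocal variance factor of contrast c; 0.0 if c is not estimable."""
    if X.shape[0] == 0:
        return 0.0
    a = np.asarray(c, dtype=float) @ np.linalg.pinv(X, rcond=1e-10)
    if not np.allclose(a @ X, c, atol=tol):   # c must lie in the row space of X
        return 0.0
    return float(1.0 / (a @ a))               # a @ a equals c' (X'X)^- c

def robustness(X, c, m):
    """Summarize the efficiency over all ways of losing m of the arrays (rows of X)."""
    n = X.shape[0]
    values = []
    for missing in combinations(range(n), m):
        keep = [i for i in range(n) if i not in missing]
        values.append(efficiency(X[keep], c))
    estimable = [v for v in values if v > 0.0]
    return {
        "median": median(estimable) if estimable else None,
        "range": (min(estimable), max(estimable)) if estimable else None,
        "robust": len(estimable) / len(values),  # proportion of estimable constellations
    }

# Common reference design, 2x2 structure, 8 arrays: each condition vs. reference, twice.
# Columns: dye effect, A1B1, A1B2, A2B1, A2B2, reference (assumed parametrization).
X_ref = np.zeros((8, 6))
X_ref[:, 0] = 1.0                  # dye effect
X_ref[:, 5] = -1.0                 # reference sample in the second channel
for row in range(8):
    X_ref[row, 1 + row % 4] = 1.0  # condition in the first channel

interaction = np.array([0.0, 0.5, -0.5, -0.5, 0.5, 0.0])

for m in range(5):
    print(m, robustness(X_ref, interaction, m))
```

With *m* = 4 there are 70 constellations of the 8 arrays. Under the assumptions above, the interaction remains estimable only in the 16 constellations that retain exactly one array per condition, which corresponds to about 23%.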

#### Results

The results are highly complex in the sense that different factorial treatment structures and different experimental questions lead to different efficient designs. All in all, it transpires that the common reference design is inferior to many, but not all, of the competing designs. However, for any experimental question, a design can be selected which is at least as good as the common reference design. The difference in median efficiency can be quite substantial in some cases. Thus, the use of common reference designs is not recommended. Swapped loop designs seem to be the most efficient on average. Although a more efficient design may exist for a particular scenario, swapped loop designs often (but not always) lead to reasonably good robustness. The recommended approach is to fix the number of factors and levels as well as the experimental question before conducting the experiment and then to select a specific robust and efficient design, which may be better than a swapped loop design and much better than the common reference design.

For illustration purposes, consider a study with 2 treatments and 2 cell lines (2x2 factorial structure). Assume that we are interested in the interaction effect and that we start the experiment with 8 arrays. For the common reference design, the median efficiency decreases from 2 (*m* = 0) to 1 (*m* = 4). The simple loop design leads to 8 (*m* = 0) and 4 (*m* = 4). Thus, in this example the common reference design would need four times as many arrays as the simple loop design to achieve the same efficiency, irrespective of the number of missing slides. The swapped loop design leads to an efficiency of 8 (*m* = 0) and 2 (*m* = 4), and finally the swap over B (B-swap) design leads to 8 (*m* = 0) and 2.67 (*m* = 4). All three alternative designs are more efficient than the common reference design on average. Although the simple loop design leads to the highest median efficiency, we do not recommend its use in this situation, because it leads to estimable contrasts in only 23% of all array constellations with *m* = 4 missing values (common reference: 23%; swapped loop design: 91%; B-swap design: 94%). Thus, if only the interaction effect is of interest, the B-swap design is the most efficient of the designs that remain robust among the four competing designs in this example. However, if other comparisons are also of interest, the B-swap design may lose drastically in efficiency or may even fail to guarantee the estimability of the contrasts of interest.
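The factor of four follows directly from the quoted median efficiencies, under the assumption (not stated explicitly in the text) that *e*(*X*) grows in proportion to the number of replicate arrays, as is the case for the reciprocal of a variance factor. The subscripts "loop" and "ref" are shorthand introduced here for the simple loop and common reference designs:

$$
\left.\frac{e(X_{\mathrm{loop}})}{e(X_{\mathrm{ref}})}\right|_{m=0} = \frac{8}{2} = 4,
\qquad
\left.\frac{e(X_{\mathrm{loop}})}{e(X_{\mathrm{ref}})}\right|_{m=4} = \frac{4}{1} = 4 .
$$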

Considerations like those illustrated in the previous example lead to the selection of robust and efficient designs, which may be substantially better than competing designs.

### References

1. Kerr M. Design considerations for efficient and effective microarray studies. Biometrics 2003; 59: 822-828.
2. Kerr M, Churchill G. Experimental design for gene expression microarrays. Biostatistics 2001; 2: 183-201.
3. Bretz F, Landgrebe J, Brunner E. Efficient design and analysis of two colour factorial microarray experiments. Submitted.