### Article

## A systematic framework to analyse and classify measures of association in 2x2 probability tables

Published: September 20, 2011

### Text

**Background:** Measures of association are used to select, from high-dimensional binary data, those 2x2 tables that exhibit strong associations. Several measures of association are in use, namely mutual information (MutInf), the correlation coefficient (R), odds ratio (OR) based measures such as Yule’s Q and Y, and, in genetics, Lewontin’s D’. These measures differ markedly on specific tables and in their dependence on the margins. There is no consensus on which measure to use for which purpose.
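
For orientation, the textbook definitions of most of these measures can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the nested-list table layout `[[p11, p12], [p21, p22]]` are assumptions made here, mutual information is computed in bits, and D’ is left out to keep the block short.

```python
import math

def margins(t):
    """Return row margins (r1, r2) and column margins (c1, c2) of a 2x2 table."""
    (a, b), (c, d) = t
    return a + b, c + d, a + c, b + d

def odds_ratio(t):
    """OR = p11*p22 / (p12*p21); assumes both off-diagonal cells are positive."""
    (a, b), (c, d) = t
    return (a * d) / (b * c)

def yule_q(t):
    """Yule's Q = (OR - 1) / (OR + 1)."""
    o = odds_ratio(t)
    return (o - 1) / (o + 1)

def yule_y(t):
    """Yule's Y = (sqrt(OR) - 1) / (sqrt(OR) + 1)."""
    s = math.sqrt(odds_ratio(t))
    return (s - 1) / (s + 1)

def correlation(t):
    """Correlation coefficient R (phi) of the two binary variables."""
    (a, b), (c, d) = t
    r1, r2, c1, c2 = margins(t)
    return (a * d - b * c) / math.sqrt(r1 * r2 * c1 * c2)

def mutual_information(t):
    """Mutual information in bits; zero cells contribute nothing."""
    r1, r2, c1, c2 = margins(t)
    expected = [[r1 * c1, r1 * c2], [r2 * c1, r2 * c2]]
    return sum(
        p * math.log2(p / e)
        for row, exp_row in zip(t, expected)
        for p, e in zip(row, exp_row)
        if p > 0
    )

table = [[0.4, 0.1], [0.1, 0.4]]  # illustrative symmetric table
for f in (odds_ratio, yule_q, yule_y, correlation, mutual_information):
    print(f.__name__, f(table))
```

On a symmetric table like this one, Yule’s Y and R coincide; the measures start to disagree once the margins become skewed.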

**Methods:** We study a 2-dimensional group of margin transformations on the 3-dimensional manifold T of all 2x2 probability tables. All measures of association that are independent of the margins are monotone functions of the odds ratio. The margin transformations allow us to introduce natural coordinates that identify T with real 3-space such that the z-axis corresponds to log(sqrt(OR)) and the margins vary on the planes z=const. We use these coordinates to visualise how each measure of association depends on the margins by plotting the measure restricted to tables with constant odds ratio.
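
One way to picture such margin transformations is as a rescaling of one row and one column followed by renormalisation, which changes the margins but leaves the odds ratio untouched. The sketch below assumes this reading; the function names and the two scale parameters are hypothetical and need not match the parametrisation used in the paper.

```python
import math

def odds_ratio(t):
    """OR = p11*p22 / (p12*p21); assumes all cells are positive."""
    (a, b), (c, d) = t
    return (a * d) / (b * c)

def margin_transform(t, row_scale, col_scale):
    """Multiply the first row by row_scale and the first column by col_scale,
    then renormalise so the cells sum to 1 again."""
    (a, b), (c, d) = t
    scaled = [[a * row_scale * col_scale, b * row_scale],
              [c * col_scale, d]]
    total = sum(sum(row) for row in scaled)
    return [[x / total for x in row] for row in scaled]

def z_coordinate(t):
    """The coordinate log(sqrt(OR)), constant under margin transformations."""
    return math.log(math.sqrt(odds_ratio(t)))

table = [[0.4, 0.1], [0.1, 0.4]]
moved = margin_transform(table, row_scale=3.0, col_scale=0.2)

print("margins before:", table[0][0] + table[0][1], table[0][0] + table[1][0])
print("margins after: ", moved[0][0] + moved[0][1], moved[0][0] + moved[1][0])
print("z before/after:", z_coordinate(table), z_coordinate(moved))
```

The two scale factors give the two degrees of freedom of the margins, while z stays fixed, so every table with a given odds ratio can be reached from the symmetric one by such a move.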

**Results:** The measures listed above represent different selection criteria for interesting tables. MutInf is maximal only for the table with ½ in the diagonal and, given the odds ratio, down-weights tables with any skewed margins. R is maximal for diagonal tables and down-weights deviations from the diagonal shape. D’ is maximal whenever one cell goes to zero and up-weights L-shaped tables with one small entry. Unfortunately, although D’ is extensively used in genetics, some degenerate tables with a small row or column also receive high weights. This explains the well-known fact that D’ exhibits erratic behaviour when estimated for tables with skewed margins.
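
The contrast between D’ and R can be made concrete with three small tables: a diagonal one, an L-shaped one, and a degenerate one with a nearly empty row. The sketch below uses the standard definition of Lewontin’s D’ (D divided by its maximum given the margins); the example tables and function names are illustrative assumptions, not taken from the paper.

```python
import math

def d_prime(t):
    """Lewontin's D' = D / D_max, with D = p11 - p1. * p.1 (signed)."""
    (a, b), (c, d) = t
    p_row, p_col = a + b, a + c
    dis = a - p_row * p_col
    if dis >= 0:
        d_max = min(p_row * (1 - p_col), (1 - p_row) * p_col)
    else:
        d_max = min(p_row * p_col, (1 - p_row) * (1 - p_col))
    # Undefined when a margin is exactly zero
    return dis / d_max if d_max > 0 else float("nan")

def correlation(t):
    """Correlation coefficient R (phi) of the two binary variables."""
    (a, b), (c, d) = t
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

examples = {
    "diagonal":   [[0.5, 0.0], [0.0, 0.5]],
    "L-shaped":   [[1 / 3, 0.0], [1 / 3, 1 / 3]],
    "degenerate": [[0.001, 0.0], [0.499, 0.5]],  # first row almost empty
}

for name, t in examples.items():
    print(f"{name:10s}  D' = {d_prime(t):.3f}  R = {correlation(t):.3f}")
```

All three tables have one zero cell, so D’ equals 1 for each, whereas R drops from 1 on the diagonal table to 0.5 on the L-shaped table and to roughly 0.03 on the degenerate one.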

As an alternative to D’ without these defects, we develop a novel measure of association, HS, based on the odds ratio, in which tables with skewed margins are weighted according to their relative entropy among tables with the same odds ratio. Entropy is a principled measure of the combinatorial plausibility of a table. Relative entropy given the odds ratio is maximal on symmetric tables for odds ratios ≤ 12.89. We show analytically that at about 12.89 a bifurcation occurs such that for large odds ratios higher weights are given to L-shaped tables. HS behaves well in down-weighting tables with very skewed margins.
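
The bifurcation can be probed numerically without implementing HS itself, whose exact definition is not given here. The sketch below assumes that "entropy" means the Shannon entropy of the four cells, and it moves away from the symmetric table along a hypothetical one-parameter path `table_at` that shifts mass between the two off-diagonal cells while keeping the odds ratio fixed. The sign of the entropy's curvature at the symmetric table then tells whether that table is a local maximum.

```python
import math

def table_at(odds_ratio, t):
    """Cells (p11, p12, p21, p22) with the given odds ratio.
    t = 0 is the symmetric table; t != 0 tilts the off-diagonal cells."""
    z = math.log(odds_ratio) / 4.0
    w = [math.exp(z), math.exp(-z + 2 * t), math.exp(-z - 2 * t), math.exp(z)]
    total = sum(w)
    return [x / total for x in w]

def entropy(cells):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in cells if p > 0)

def curvature(odds_ratio, h=1e-3):
    """Central second difference of the entropy along the path at t = 0."""
    centre = entropy(table_at(odds_ratio, 0.0))
    left = entropy(table_at(odds_ratio, -h))
    right = entropy(table_at(odds_ratio, h))
    return (left + right - 2 * centre) / h ** 2

def critical_odds_ratio(lo=2.0, hi=100.0, tol=1e-6):
    """Bisect for the odds ratio at which the curvature changes sign."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if curvature(mid) < 0:
            lo = mid  # symmetric table is still a local maximum
        else:
            hi = mid
    return 0.5 * (lo + hi)

print("curvature at OR = 4: ", curvature(4.0))
print("curvature at OR = 50:", curvature(50.0))
print("sign change near OR =", critical_odds_ratio())
```

Under these assumptions the curvature is negative for small odds ratios and positive for large ones, and the sign change should land close to the value of about 12.89 quoted above. Beyond that point, tilted tables with one small off-diagonal cell have higher entropy than the symmetric table.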

**Conclusion:** We present a mathematical framework to investigate the relative merits of measures of association and propose a new entropy and odds ratio based measure that is useful when the interest is in L-shaped tables.