## Towards sequential statistical testing as some standard: Pearson’s correlation coefficient

Published: October 28, 2014

### Abstract

Standard statistical packages seldom offer sequential tests, and where they do, these mainly concern tests about means. A simulation study shows that a new sequential triangular test of the null-hypothesis *H*_{0}: 0<*ρ*≤*ρ*_{0} gives reasonable results for given precision requirements (type-I-risk, type-II-risk, and a practically relevant effect *δ*=*ρ*_{1}–*ρ*_{0}). Over 100,000 simulation runs, the average sample size of the sequential triangular test was smaller than the sample size of the corresponding fixed sample size test. Where practically possible, we recommend using the triangular test instead of a one-step procedure with a sample size fixed in advance.

### Introduction

Standard statistical packages seldom offer sequential tests, and where they do, these mainly concern tests about means. For instance, [1] provides R programs for sequential triangular tests for:

- comparing a mean with a constant,
- comparing two means,
- comparing a probability with a constant,
- comparing two probabilities.

In this paper we present simulation results for a newly developed triangular sequential test for comparing a correlation coefficient with a constant.

In correlation analyses, the null-hypothesis *H*_{0}: *ρ*=0 is most often tested (against the two-sided alternative hypothesis *H*_{A}: *ρ*≠0). But a significant correlation coefficient often has no practical meaning, however small the type-I-risk *α* is chosen and however small the resulting *p*-value. It is therefore often more reasonable to test the null-hypothesis *H*_{0}: *ρ*=*ρ*_{0} for some 0<*ρ*_{0}<1 against the alternative *H*_{A}: *ρ*=*ρ*_{1} for some *ρ*_{1}>*ρ*_{0} or *ρ*_{1}<*ρ*_{0}. Kubinger et al. [2] made the fixed sample size test, which has long been known in statistics for this problem, available even to SPSS users. In the present paper we show the attractiveness of a corresponding sequential triangular test proposed by Schneider et al. [3]. In general, sequential triangular tests have two advantages: their average sample size is smaller than that of the corresponding fixed sample size test, and their maximum sample size is known in advance in any case.

### Method

The null-hypothesis *H*_{0}: *ρ*=*ρ*_{0} can – without loss of generality – be replaced by *H*_{0}: 0<*ρ*≤*ρ*_{0} or by *H*_{0}: *ρ*≥*ρ*_{0}. In an analogous way the alternative hypothesis may be replaced by a composite alternative.

Under the assumption that the random variables are normally distributed with second moments *σ*_{x}^{2}, *σ*_{y}^{2}, and *σ*_{xy}, and a correlation coefficient *ρ*=*σ*_{xy}/(*σ*_{x}*σ*_{y}), the null-hypothesis should be tested against the alternative hypothesis *H*_{A}: 0<*ρ*_{0}≤*ρ*≤*ρ*_{1} with a type-I-risk *α* – i.e. the probability of wrongly rejecting *H*_{0} – and a type-II-risk *β* – i.e. the probability of wrongly accepting *H*_{0} (in particular as long as *δ*≥*ρ*_{1}–*ρ*_{0} with a *ρ*_{1} to be fixed in advance).

The empirical correlation coefficient *r*=*s*_{xy}/(*s*_{x}*s*_{y}), based on *k* observations (*x*_{i}, *y*_{i}) (*i*=1,…,*k*), is an estimate of the parameter *ρ* (*s*_{xy}, *s*_{x}^{2}, *s*_{y}^{2} are the empirical covariance and variances, respectively). Instead of *r* itself we use its transformed (Fisher transformation [4]) value

$$z = \frac{1}{2}\ln\frac{1+r}{1-r}$$

as a test statistic. The distribution of the corresponding random variable ***z*** is approximately normal even if *k* is rather small. The expectation of ***z***, being a function of *ρ*, amounts approximately to

$$E(\boldsymbol{z}) = \frac{1}{2}\ln\frac{1+\rho}{1-\rho} + \frac{\rho}{2(k-1)},$$

the variance to

$$\operatorname{var}(\boldsymbol{z}) = \frac{1}{k-3}.$$

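The statistic *z* can be used to test, for a fixed sample size *k*=*n*, the hypothesis *H*_{0}: *ρ*≤*ρ*_{0} against the alternative hypothesis *H*_{A}: *ρ*>*ρ*_{0} (respectively *H*_{0}: *ρ*≥*ρ*_{0} against *H*_{A}: *ρ*<*ρ*_{0}, a problem discussed in [5], Example 11.15). *H*_{0}: *ρ*≤*ρ*_{0} is rejected with error probability *α* if

$$z \ge \frac{1}{2}\ln\frac{1+\rho_0}{1-\rho_0} + \frac{\rho_0}{2(n-1)} + \frac{z_{1-\alpha}}{\sqrt{n-3}}$$

(respectively, *H*_{0}: *ρ*≥*ρ*_{0} is rejected if

$$z \le \frac{1}{2}\ln\frac{1+\rho_0}{1-\rho_0} + \frac{\rho_0}{2(n-1)} - \frac{z_{1-\alpha}}{\sqrt{n-3}};$$

*z*_{1-α} is the (1–*α*)-quantile of the standard normal distribution).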
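The fixed sample size test can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code: the function names are hypothetical, and the approximate expectation of *z* (Fisher's transform of *ρ* plus *ρ*/(2(*n*–1))) and the variance 1/(*n*–3) are assumed here as the usual textbook approximations.

```python
import math
from statistics import NormalDist


def fisher_z(r: float) -> float:
    """Fisher transformation z = 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def z_mean(rho: float, n: int) -> float:
    """Approximate expectation of z for n pairs (assumed textbook form)."""
    return fisher_z(rho) + rho / (2.0 * (n - 1))


def reject_h0_upper(r: float, n: int, rho0: float, alpha: float = 0.05) -> bool:
    """Fixed-sample test of H0: rho <= rho0 against HA: rho > rho0."""
    quantile = NormalDist().inv_cdf(1.0 - alpha)  # (1 - alpha)-quantile
    critical = z_mean(rho0, n) + quantile / math.sqrt(n - 3)
    return fisher_z(r) >= critical


# Hypothetical data summary: r = 0.9 observed in n = 50 pairs, rho0 = 0.6.
print(reject_h0_upper(0.9, 50, 0.6))
```

For the mirrored problem *H*_{0}: *ρ*≥*ρ*_{0} the same sketch would subtract the quantile term and reverse the inequality.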

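An approximate lower bound for the sample size *n* needed to keep the type-II-risk at or below *β*, given *ρ*=*ρ*_{1}>*ρ*_{0}, is the smallest integer *n*>3 for which

$$\sqrt{n-3}\left[\frac{1}{2}\ln\frac{1+\rho_1}{1-\rho_1} - \frac{1}{2}\ln\frac{1+\rho_0}{1-\rho_0} + \frac{\rho_1-\rho_0}{2(n-1)}\right] \ge z_{1-\alpha} + z_{1-\beta}.$$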
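As a rough check of the sample size bound, the following Python sketch searches for the smallest *n* at which the standardized distance between the approximate expectations of *z* under *ρ*_{0} and *ρ*_{1} reaches *z*_{1-α}+*z*_{1-β}. The function name `n_fixed` and the exact form of the inequality are assumptions made for illustration (normal approximation with variance 1/(*n*–3)); they are not taken from the authors' programs.

```python
import math
from statistics import NormalDist


def n_fixed(rho0: float, rho1: float, alpha: float = 0.05,
            beta: float = 0.2, n_max: int = 100_000) -> int:
    """Smallest n satisfying the approximate power requirement."""
    nd = NormalDist()
    z_sum = nd.inv_cdf(1.0 - alpha) + nd.inv_cdf(1.0 - beta)
    for n in range(4, n_max + 1):
        # math.atanh is Fisher's transformation 0.5 * ln((1 + rho) / (1 - rho)).
        shift = (math.atanh(rho1) - math.atanh(rho0)
                 + (rho1 - rho0) / (2.0 * (n - 1)))
        if abs(shift) * math.sqrt(n - 3) >= z_sum:
            return n
    raise ValueError("no sample size up to n_max satisfies the requirement")


# Parameters of the closing example of the paper: rho0 = .7, rho1 = .8.
print(n_fixed(0.7, 0.8, alpha=0.05, beta=0.2))
```

With these parameters the search stops at 119, the fixed sample size quoted in the summary of the paper.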

A special group of sequential tests are the sequential triangular tests, going back to Whitehead [6] and Schneider [7]. Their main characteristic is that the data sampled so far determine whether sampling has to be continued or whether the null- or the alternative hypothesis can be accepted.

We split the sequence of data pairs into sub-samples of equal length *k*>3. For each sub-sample *j* (*j*=1, 2,…, *m*) we calculate a statistic whose distribution is a function of *ρ* and *k* only.

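A triangular test must be based on a statistic with expectation 0 under the null-hypothesis. We therefore transform *z* into a realization of a standardized variable

$$z^{*} = \left(z - \frac{1}{2}\ln\frac{1+\rho_0}{1-\rho_0} - \frac{\rho_0}{2(k-1)}\right)\sqrt{k-3} \qquad (1)$$

which has the expectation 0 if the null-hypothesis is true.

The parameter

$$\theta = E(\boldsymbol{z}^{*}) = \left(\frac{1}{2}\ln\frac{1+\rho}{1-\rho} - \frac{1}{2}\ln\frac{1+\rho_0}{1-\rho_0} + \frac{\rho-\rho_0}{2(k-1)}\right)\sqrt{k-3} \qquad (2)$$

is used as a test-parameter. For *ρ*=*ρ*_{0} the parameter *θ* is 0 (as demanded). For *ρ*=*ρ*_{1} we obtain:

$$\theta_1 = \left(\frac{1}{2}\ln\frac{1+\rho_1}{1-\rho_1} - \frac{1}{2}\ln\frac{1+\rho_0}{1-\rho_0} + \frac{\rho_1-\rho_0}{2(k-1)}\right)\sqrt{k-3} \qquad (3)$$

The difference *δ*=*ρ*_{1}–*ρ*_{0} is the practically relevant difference which should be detected with the power 1–*β*.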
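The standardization and the test parameter can be illustrated with a short Python sketch. It assumes that the standardized value is the Fisher-transformed *r* minus its approximate expectation under *ρ*_{0}, scaled by √(*k*–3), and that *θ* is the expectation of that quantity; the function names `z_star` and `theta` are hypothetical.

```python
import math


def z_star(r: float, rho0: float, k: int) -> float:
    """Standardized Fisher z of one sub-sample of size k (assumed form)."""
    centre = math.atanh(rho0) + rho0 / (2.0 * (k - 1))
    return (math.atanh(r) - centre) * math.sqrt(k - 3)


def theta(rho: float, rho0: float, k: int) -> float:
    """Approximate expectation of z_star when the true correlation is rho."""
    shift = math.atanh(rho) - math.atanh(rho0) + (rho - rho0) / (2.0 * (k - 1))
    return shift * math.sqrt(k - 3)


# Settings of Example 1 (rho0 = .6, rho1 = .8, k = 7) and its mirror image.
print(theta(0.6, 0.6, 7), theta(0.8, 0.6, 7), theta(0.6, 0.8, 7))
```

Under these assumptions *θ* vanishes at *ρ*=*ρ*_{0}, is about 0.844 for the settings of Example 1, and changes sign when *ρ*_{1}<*ρ*_{0}.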

From each sub-sample *j* we now calculate the sample correlation coefficient *r*_{j} as well as its transformed values *z*_{j} and *z*_{j}^{*} (*j*=1, 2,…, *m*).

Now by

$$Z_m = \sum_{j=1}^{m} z_j^{*}$$

and *V*_{m}=*m* the sequential path is defined by the points (*V*_{m}, *Z*_{m}) for *m*=1, 2,… up to the maximum of *V* below or exactly at the point where a decision can be made. The continuation region is a triangle whose three sides depend on *α*, *β*, and *θ*_{1} via two constants *a* and *c* (4), which involve the percentiles *z*_{P} of the standard normal distribution. That is, one side of the looked-for triangle lies between –*a* and *a* on the ordinate of the (*V*, *Z*) plane (*V*=0). The two other borderlines are defined by the lines *L*_{1}: *Z*=*a*+*cV* and *L*_{2}: *Z*=–*a*+3*cV*, which intersect at

$$V_{\max} = \frac{a}{c}, \qquad Z_{\max} = 2a \qquad (5)$$
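The intersection point can be verified using only the two border lines *L*_{1} and *L*_{2} as defined in the text. Equating them gives:

```latex
\begin{aligned}
a + cV &= -a + 3cV \\
2a &= 2cV \\
V_{\max} &= \frac{a}{c}, \qquad
Z_{\max} = a + c\,\frac{a}{c} = 2a .
\end{aligned}
```

This agrees with the numbers of Example 1, where 4.849/0.168=28.863 and 2·4.849=9.698.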

The maximum sample size is of course *k*·*V*_{max}. The decision rule now is: continue sampling as long as –*a*+3*cV*_{m}<*Z*_{m}<*a*+*cV*_{m} if *θ*_{1}>0, or –*a*+3*cV*_{m}>*Z*_{m}>*a*+*cV*_{m} if *θ*_{1}<0. Otherwise:

- Given *θ*_{1}>0, accept *H*_{A} if *Z*_{m} reaches or exceeds *L*_{1}, and accept *H*_{0} if *Z*_{m} reaches or falls below *L*_{2}.
- Given *θ*_{1}<0, accept *H*_{A} if *Z*_{m} reaches or falls below *L*_{1}, and accept *H*_{0} if *Z*_{m} reaches or exceeds *L*_{2}.

If the point (*V*_{max}, *Z*_{max}) is reached, *H*_{A} is to be accepted.
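The decision rule translates almost literally into code. The Python sketch below takes the constants *a* and *c* as given inputs and assumes, as in the two worked examples of the paper, that both carry the sign of *θ*_{1}; the function name `decide` and the returned labels are hypothetical.

```python
def decide(z_m: float, v_m: float, a: float, c: float) -> str:
    """Classify the path point (v_m, z_m) against the lines L1 and L2."""
    line_1 = a + c * v_m          # L1: Z = a + cV
    line_2 = -a + 3.0 * c * v_m   # L2: Z = -a + 3cV
    if a > 0:                     # case theta_1 > 0
        if z_m >= line_1:         # checked first, so the apex yields H_A
            return "accept_HA"
        if z_m <= line_2:
            return "accept_H0"
    else:                         # case theta_1 < 0: inequalities reversed
        if z_m <= line_1:
            return "accept_HA"
        if z_m >= line_2:
            return "accept_H0"
    return "continue"


# Constants of Example 1 (a = 4.849, c = 0.168) with hypothetical path points.
print(decide(0.0, 1, 4.849, 0.168))
print(decide(6.0, 2, 4.849, 0.168))
print(decide(-4.5, 1, 4.849, 0.168))
```

Checking *L*_{1} before *L*_{2} implements the convention that *H*_{A} is accepted when the path ends exactly at the apex of the triangle.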

### Example 1

We wish to test the null hypothesis *ρ*≤.6 against the alternative hypothesis *ρ*>.6 with *α*=.05, *β*=.1 and *δ*=*ρ*_{1}–.6=.2, i.e. *ρ*_{1}=.8. We use *k*=7 and evaluate *θ*_{1} from (3). With *z*_{0.9}=1.282 and *z*_{0.95}=1.645, (4) then gives *a*=4.849 and *c*=0.168.

From (5) we get *V*_{max}=4.849/0.168=28.863, *Z*_{max}=9.698.
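The arithmetic of this example can be reproduced with a few lines of Python. The sketch assumes only that the apex of the triangle is the intersection of the two border lines; the helper name `triangle_apex` is hypothetical, and the constants 4.849 and 0.168 are the values quoted in the example.

```python
def triangle_apex(a: float, c: float) -> tuple:
    """Intersection of L1: Z = a + cV and L2: Z = -a + 3cV."""
    return a / c, 2.0 * a


k = 7
v_max, z_max = triangle_apex(4.849, 0.168)
print(round(v_max, 3), round(z_max, 3))   # apex of the continuation region
print(round(k * v_max, 1))                # k * V_max, the bound named in the text
```

With *k*=7 the product *k*·*V*_{max} is roughly 202 pairs.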

The corresponding triangle is shown in Figure 1 [Fig. 1].

### Example 2

Finally we demonstrate the case of Example 11.15 in [5]. We wish to test the null hypothesis *ρ*≥.8 against the alternative hypothesis *ρ*<.6 with *α*=.05, *β*=.1 and *δ*=.8–.6=.2. We use *k*=7 and evaluate *θ*_{1} from (3), which is now negative. With *z*_{0.9}=1.282 and *z*_{0.95}=1.645, (4) then gives *a*=–4.849 and *c*=–0.168.

From (5) we get *V*_{max}=–4.849/–0.168=28.863, *Z*_{max}=–9.698.

The corresponding triangle is shown in Figure 2 [Fig. 2].

### Simulation study

The reasonability of the sequential triangular test for hypotheses about the correlation coefficient was checked with simulated paths (*Z*, *V*) generated from bivariate normally distributed random numbers *x* and *y* with means *µ*_{x}=*µ*_{y}=0, variances *σ*_{x}^{2}=*σ*_{y}^{2}=1, and a correlation coefficient *ρ* (which equals the covariance *σ*_{xy}, since both variances are 1). Simulations were performed with the nominal risks *α*_{nom}=.05, *β*_{nom}=.2 and several values of *ρ*_{0}, *ρ*_{1} and *k*. For each parameter combination 100,000 paths were generated. Simulations as well as the calculation of the introduced sequential triangular test were done in R [8].

As criteria for the test quality we calculated:

- the relative frequency of wrongly accepting *H*_{A}, given *ρ*=*ρ*_{0}, which is the empirical risk of the first kind, say *α*_{emp};
- the relative frequency of rejecting *H*_{A}, given *ρ*=*ρ*_{1}, which is the empirical risk of the second kind, say *β*_{emp};
- the average number of sample pairs *ASN* for *ρ*_{0} and *ρ*_{1} used for the calculation of *r* and *z** until data sampling stopped (that is, until the path left the continuation region).

Here, *ASN* is the mean number of sample pairs over all 100,000 runs of the simulation study. Bear in mind that in any particular case the number of sample pairs needed for a terminal decision may lie either below or above *ASN*. The procedure is reasonable if *α*_{emp}≤*α* and *β*_{emp}≤*β*. Of course, the sequential test should also lead to *ASN*<*n*_{fix}, where *n*_{fix} is the sample size necessary in a fixed sample size test with a power 1–*β*, given *ρ*=*ρ*_{1}, when testing the hypothesis *ρ*≤*ρ*_{0} with the type-I-risk *α*=.05.

Results are shown in Table 1 [Tab. 1], where the differences between *α*_{emp} and *α* are not too large. The *ASN* strongly depends on the real value of *ρ*. From Table 1 [Tab. 1] the reader finds reasonable values of *k* for the pairs (*ρ*_{0}, *ρ*_{1}) to be tested.
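A simulation of this kind can be sketched with the Python standard library. The sketch is not the authors' R program. It covers only the case *θ*_{1}>0, takes the constants *a* and *c* as inputs (here the values quoted in Example 1, which used *β*=.1 rather than the *β*_{nom}=.2 of the study), assumes the standardization of *z* by its approximate expectation under *ρ*_{0} and by √(*k*–3), and uses far fewer runs than the 100,000 of the paper. All function names are hypothetical.

```python
import math
import random


def sample_r(rho, k, rng):
    """Pearson r from k pairs of a standard bivariate normal with correlation rho."""
    xs, ys = [], []
    for _ in range(k):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(g1)
        ys.append(rho * g1 + math.sqrt(1.0 - rho * rho) * g2)
    mx, my = sum(xs) / k, sum(ys) / k
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_path(rho, rho0, k, a, c, rng):
    """One path for theta_1 > 0 (a, c > 0); returns (decision, pairs used)."""
    z_sum, m = 0.0, 0
    while True:  # ends at the latest once m >= a / c, where L2 lies above L1
        m += 1
        r = max(min(sample_r(rho, k, rng), 0.999999), -0.999999)  # guard atanh
        centre = math.atanh(rho0) + rho0 / (2.0 * (k - 1))
        z_sum += (math.atanh(r) - centre) * math.sqrt(k - 3)
        if z_sum >= a + c * m:
            return "accept_HA", m * k
        if z_sum <= -a + 3.0 * c * m:
            return "accept_H0", m * k


def estimate(rho, rho0, k, a, c, runs, seed=1):
    """Share of paths accepting H_A and mean number of pairs per path."""
    rng = random.Random(seed)
    results = [simulate_path(rho, rho0, k, a, c, rng) for _ in range(runs)]
    share_ha = sum(d == "accept_HA" for d, _ in results) / runs
    asn = sum(n for _, n in results) / runs
    return share_ha, asn


alpha_emp, asn0 = estimate(0.6, 0.6, 7, 4.849, 0.168, runs=500)
power_emp, asn1 = estimate(0.8, 0.6, 7, 4.849, 0.168, runs=500)
print(alpha_emp, 1.0 - power_emp, asn0, asn1)
```

The first call estimates the empirical risk of the first kind and the *ASN* under *ρ*=*ρ*_{0}; the second gives the empirical risk of the second kind (one minus the share of paths accepting *H*_{A}) and the *ASN* under *ρ*=*ρ*_{1}.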

To summarize: if one actually wishes to check whether a correlation coefficient is large enough, we suggest first fixing a lower bound of the coefficient of determination, *ρ*_{0}^{2}, as the value that must at least be reached in order to speak of a meaningful percentage of the variance of *y* explained by *x*, and vice versa. A second value of the coefficient of determination, *ρ*_{1}^{2}, is the one that, if it holds, the researcher wants to detect with a probability of at least 1–*β*. It is easy to adapt the procedure for negative roots of these squares. For example, one might decide on *ρ*_{0}^{2}=.5, so that *ρ*_{0}≈.7; suppose *ρ*_{1}=.8, *α*=.05, *β*=.2. Then, using *k*=26 (by interpolation), a total sample size of about 108 results, instead of 119 for a fixed sample size study.

### References

1. Rasch D, Pilz J, Verdooren RL, Gebhardt A. Optimal experimental design with R. New York: Chapman & Hall/CRC; 2011.
2. Kubinger KD, Rasch D, Šimeckova M. Testing a correlation coefficient's significance: Using H0: 0<ρ≤λ is preferable to H0: ρ=0. Psychol Sci. 2007;49:74-87.
3. Schneider B, Rasch D, Kubinger KD, Yanagida T. A Sequential Triangular Test of a Correlation Coefficient's Null-Hypothesis: 0<ρ≤ρ0. Stat Pap. 2014 Jun 12. DOI: 10.1007/s00362-014-0604-8
4. Fisher RA. Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika. 1915;10(4):507-21.
5. Rasch D, Kubinger KD, Yanagida T. Statistics in Psychology. Using R and SPSS. Chichester: Wiley; 2011. DOI: 10.1002/9781119979630
6. Whitehead J. The Design and Analysis of Sequential Clinical Trials. 2nd ed. Chichester: Ellis Horwood; 1992.
7. Schneider B. An interactive computer program for design and monitoring of sequential clinical trials. In: Proceedings of the XVIth international biometric conference; 1992 Dec 7-11; Hamilton, New Zealand. p. 237-50.
8. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2013. Available from: http://www.R-project.org/