GMS Medizinische Informatik, Biometrie und Epidemiologie

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS)

ISSN 1860-9171

Towards sequential statistical testing as some standard: Pearson’s correlation coefficient

Hin zu sequentiellem statistischem Testen als Standardverfahren: Pearsons Korrelationskoeffizient

Research Article

Dieter Rasch - Department of Landscape, Spatial and Infrastructure Sciences, University of Natural Resources and Life Sciences, Vienna, Austria
Takuya Yanagida - School of Applied Health and Social Sciences, University of Applied Sciences Upper Austria, Linz, Austria
Klaus D. Kubinger - Division of Psychological Assessment and Applied Psychometrics, Faculty of Psychology, University of Vienna, Vienna, Austria
Berthold Schneider - Institute for Biometry, Hannover Medical School, Hannover, Germany

GMS Med Inform Biom Epidemiol 2014;10(1):Doc07

doi: 10.3205/mibe000156, urn:nbn:de:0183-mibe0001569

Published: 28 October 2014

© 2014 Rasch et al.
This is an Open Access article distributed under the terms of the Creative Commons license (http://creativecommons.org/licenses/by-nc-nd/3.0/deed.de). It may be reproduced, distributed and made publicly available, provided that the author and source are credited.

Abstract

Standard statistical packages seldom offer sequential tests, and where they do, the tests mainly concern means. A simulation study shows that a new sequential triangular test of the null-hypothesis H₀: 0<ρ≤ρ₀, for given precision requirements (type-I-risk, type-II-risk, and a practically relevant effect δ=ρ₁–ρ₀), gives reasonable results. Over 100,000 simulated runs, the average sample size of the sequential triangular test is smaller than the sample size of the corresponding fixed sample size test. Where practically possible, we recommend using the triangular test instead of a one-step procedure with a sample size fixed in advance.

Keywords: sequential test, correlation coefficient, simulation

Introduction

Standard statistical packages seldom offer sequential tests, and where they do, the tests mainly concern means. For instance, R programs for sequential triangular tests can be found in [1] for:

comparing a mean with a constant,
comparing two means,
comparing a probability with a constant,
comparing two probabilities.

In this paper we present simulation results for a newly developed triangular sequential test for comparing a correlation coefficient with a constant.

In correlation analyses, the null-hypothesis H₀: ρ=0 (versus the two-sided alternative hypothesis H_A: ρ≠0) is tested most of the time. But a significant correlation coefficient often has no practical meaning, regardless of how small the type-I-risk α is chosen and, all the more, regardless of how small the resulting p-value is. It is therefore often more reasonable to test the null-hypothesis H₀: ρ=ρ₀ for some 0<ρ₀<1 against the alternative H_A: ρ=ρ₁ for some ρ₁>ρ₀ or ρ₁<ρ₀. Kubinger et al. [2] made the fixed sample size test, which has long been known in statistics for this problem, available even to SPSS users. In the present paper we show the attractiveness of a corresponding sequential triangular test proposed by Schneider et al. [3]. In general, sequential triangular tests have the advantage that not only is their average sample size smaller than that of the corresponding fixed sample size test, but the maximum sample size needed is also known in advance in every case.

Method

The null-hypothesis H₀: ρ=ρ₀ can – without loss of generality – be replaced by H₀: 0<ρ≤ρ₀ or by H₀: ρ≥ρ₀. In an analogous way the alternative hypothesis may be replaced by a composite alternative.

Under the assumption that the random variables x and y are normally distributed with second moments σ²_x, σ²_y, and σ_xy, and a correlation coefficient ρ=σ_xy/(σ_xσ_y), the null-hypothesis is to be tested against the alternative hypothesis H_A: 0<ρ₀≤ρ≤ρ₁ with a type-I-risk α, i.e. the probability of wrongly rejecting H₀, and a type-II-risk β, i.e. the probability of wrongly accepting H₀ (in particular as long as δ≥ρ₁–ρ₀ with a ρ₁ to be fixed in advance).

The empirical correlation coefficient r=s_xy/(s_x s_y) based on k observations (x_i,y_i) (i=1,…,k) is an estimate of the parameter ρ (s_xy, s²_x, s²_y are the empirical covariance and variances, respectively). Instead of using r as a test statistic we use the transformed (Fisher transformation [4]) value

$$z = \frac{1}{2}\ln\frac{1+r}{1-r}$$

as a test statistic. The distribution of the corresponding random variable is approximately normal even if k is rather small. The expectation of z, being a function of ρ, amounts to

$$E(z) = \zeta(\rho) \approx \frac{1}{2}\ln\frac{1+\rho}{1-\rho} + \frac{\rho}{2(k-1)},$$

the variance to

$$\mathrm{var}(z) \approx \frac{1}{k-3}.$$

For a fixed sample size k=n, the statistic z can be used to test the hypothesis H₀: ρ≤ρ₀ against the alternative hypothesis H_A: ρ>ρ₀ (or H₀: ρ≥ρ₀ against H_A: ρ<ρ₀, a problem discussed in [5], Example 11.15). With ζ(ρ₀) denoting the expectation of z for ρ=ρ₀ and k=n, H₀: ρ≤ρ₀ is rejected with error probability α if

$$z \ge \zeta(\rho_0) + \frac{z_{1-\alpha}}{\sqrt{n-3}}$$

(or, for H₀: ρ≥ρ₀, if

$$z \le \zeta(\rho_0) - \frac{z_{1-\alpha}}{\sqrt{n-3}};$$

z_{1–α} is the (1–α)-quantile of the standard normal distribution).
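The fixed sample size test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the usual approximations E(z) ≈ ½ ln((1+ρ)/(1–ρ)) + ρ/(2(n–1)) and var(z) ≈ 1/(n–3), and the function names `fisher_z`, `zeta` and `reject_h0_upper` are hypothetical, not taken from the paper or from [5].

```python
import math
from statistics import NormalDist


def fisher_z(r: float) -> float:
    """Fisher transformation z = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)


def zeta(rho: float, n: int) -> float:
    """Assumed approximate expectation of z for correlation rho and sample size n."""
    return math.atanh(rho) + rho / (2 * (n - 1))


def reject_h0_upper(r: float, n: int, rho0: float, alpha: float = 0.05) -> bool:
    """One-sided test of H0: rho <= rho0 against HA: rho > rho0."""
    z_crit = NormalDist().inv_cdf(1 - alpha)  # (1 - alpha)-quantile of N(0, 1)
    # Standardize z with the assumed variance 1 / (n - 3).
    statistic = (fisher_z(r) - zeta(rho0, n)) * math.sqrt(n - 3)
    return statistic >= z_crit


# Illustrative values only: an observed r of 0.85 from n = 119 pairs, rho0 = 0.7.
print(reject_h0_upper(0.85, 119, 0.7))
```

For the reverse hypothesis H₀: ρ≥ρ₀ the same standardized statistic would be compared with the negative quantile instead.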

An approximate lower bound for the sample size n needed to keep the type-II-error probability at or below β, given ρ=ρ₁, is the smallest positive integer n for which

$$n \ge 3 + 4\left(\frac{z_{1-\alpha}+z_{1-\beta}}{\ln\frac{1+\rho_1}{1-\rho_1} - \ln\frac{1+\rho_0}{1-\rho_0}}\right)^2.$$
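As a rough check of the sample size bound, the following Python sketch computes the smallest n under the assumption that the bound has the familiar form n ≥ 3 + ((z_{1–α}+z_{1–β})/(artanh ρ₁ – artanh ρ₀))², i.e. with the small bias term of E(z) ignored. The function name `fixed_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def fixed_sample_size(rho0: float, rho1: float,
                      alpha: float = 0.05, beta: float = 0.2) -> int:
    """Approximate fixed sample size for testing rho0 against rho1 (one-sided)."""
    quantile = NormalDist().inv_cdf
    # Distance between the two hypotheses on the Fisher z scale.
    effect = math.atanh(rho1) - math.atanh(rho0)
    bound = 3 + ((quantile(1 - alpha) + quantile(1 - beta)) / effect) ** 2
    return math.ceil(bound)


print(fixed_sample_size(0.7, 0.8))  # 119
```

Under this assumption, the setting ρ₀=.7, ρ₁=.8, α=.05, β=.2 gives n=119, the fixed sample size the paper quotes for that case.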

A special group of sequential tests are the sequential triangular tests, going back to Whitehead [6] and Schneider [7]. Their main characteristic is that, based on the data sampled so far, one decides whether sampling has to be continued or whether either the null-hypothesis or the alternative hypothesis can be accepted.

We split the sequence of data pairs into sub-samples of length k>3 each. For each sub-sample j (j=1, 2,…, m) we calculate a statistic whose distribution is a function of ρ and k only.

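A triangular test must be based on a statistic with expectation 0 under the null-hypothesis; we therefore transform z into a realization of a standardized variable

$$z^* = \left(z - \frac{1}{2}\ln\frac{1+\rho_0}{1-\rho_0} - \frac{\rho_0}{2(k-1)}\right)\sqrt{k-3} \qquad (1)$$

which has the expectation 0 if the null-hypothesis is true.

The parameter

$$\theta = E(z^*) = \left(\frac{1}{2}\ln\frac{1+\rho}{1-\rho} + \frac{\rho}{2(k-1)} - \frac{1}{2}\ln\frac{1+\rho_0}{1-\rho_0} - \frac{\rho_0}{2(k-1)}\right)\sqrt{k-3} \qquad (2)$$

is used as a test-parameter. For ρ=ρ₀ the parameter θ is 0 (as demanded). For ρ=ρ₁ we obtain:

$$\theta_1 = \left(\frac{1}{2}\ln\frac{1+\rho_1}{1-\rho_1} + \frac{\rho_1}{2(k-1)} - \frac{1}{2}\ln\frac{1+\rho_0}{1-\rho_0} - \frac{\rho_0}{2(k-1)}\right)\sqrt{k-3} \qquad (3)$$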

The difference δ=ρ₁–ρ₀ is the practically relevant difference that should be detected with power 1–β.
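To make the role of the test-parameter concrete, here is a minimal Python sketch. It assumes that θ is the difference of the approximate expectations of z under ρ and ρ₀, scaled by √(k–3), with E(z) ≈ ½ ln((1+ρ)/(1–ρ)) + ρ/(2(k–1)); the function name `theta` is hypothetical.

```python
import math


def theta(rho: float, rho0: float, k: int) -> float:
    """Assumed test-parameter: standardized shift of E(z) away from its value at rho0."""
    def zeta(p: float) -> float:
        # Assumed approximate expectation of Fisher's z for sub-sample size k.
        return math.atanh(p) + p / (2 * (k - 1))

    return (zeta(rho) - zeta(rho0)) * math.sqrt(k - 3)


print(theta(0.6, 0.6, 7))  # zero under the null-hypothesis value
print(theta(0.8, 0.6, 7))  # positive when rho1 > rho0
print(theta(0.6, 0.8, 7))  # negative when rho1 < rho0
```

Under these assumptions the sign of θ₁ follows the sign of ρ₁–ρ₀, which is what decides the orientation of the triangle.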

From each sub-sample j we now calculate the sample correlation coefficient r_j as well as its transformed values z_j and z*_j (j=1, 2,…, m).
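The per-sub-sample computation can be sketched as follows. The sketch assumes the standardization z*_j = (z_j – E(z | ρ₀))·√(k–3) with E(z | ρ₀) ≈ ½ ln((1+ρ₀)/(1–ρ₀)) + ρ₀/(2(k–1)); the helper names `pearson_r` and `sub_sample_z_stars` and the data are made up for illustration.

```python
import math
from itertools import accumulate


def pearson_r(xs, ys):
    """Empirical correlation coefficient r = s_xy / (s_x * s_y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sub_sample_z_stars(xs, ys, k, rho0):
    """Split the pairs into consecutive sub-samples of length k and standardize each z_j.

    An incomplete trailing sub-sample is ignored.
    """
    zeta0 = math.atanh(rho0) + rho0 / (2 * (k - 1))  # assumed E(z) under rho0
    z_stars = []
    for start in range(0, len(xs) - k + 1, k):
        r_j = pearson_r(xs[start:start + k], ys[start:start + k])
        z_stars.append((math.atanh(r_j) - zeta0) * math.sqrt(k - 3))
    return z_stars


# Made-up data: 15 pairs give two complete sub-samples of length k = 7.
xs = [float(i) for i in range(1, 16)]
ys = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0, 10.0, 9.0, 12.0, 11.0, 14.0, 13.0, 16.0]
z_stars = sub_sample_z_stars(xs, ys, k=7, rho0=0.6)
print(z_stars)
print(list(accumulate(z_stars)))  # running sums of the z*_j
```

The running sums in the last line are the quantity that is tracked against the number of sub-samples in the sequential procedure.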

Now by

$$Z_m = \sum_{j=1}^{m} z_j^*$$

and V_m=m the sequential path is defined by the points (V_m, Z_m) for m=1, 2, … up to the maximum of V below or exactly at the point where a decision can be made. The continuation region is a triangle whose three sides depend on α, β, and θ₁ via the constants a and c (4), with the percentiles z_P of the standard normal distribution. That is, one side of the triangle lies between –a and a on the ordinate of the (V, Z) plane (V=0). The two other borderlines are the lines L₁: Z=a+cV and L₂: Z=–a+3cV, which intersect at

$$(V_{max}, Z_{max}) = \left(\frac{a}{c},\ 2a\right) \qquad (5)$$

The maximum sample size is of course k·V_max. The decision rule is: continue sampling as long as –a+3cV_m<Z_m<a+cV_m if θ₁>0, or –a+3cV_m>Z_m>a+cV_m if θ₁<0. Given θ₁>0, accept H_A if Z_m reaches or exceeds L₁, and accept H₀ if Z_m reaches or falls below L₂. Given θ₁<0, accept H_A if Z_m reaches or falls below L₁, and accept H₀ if Z_m reaches or exceeds L₂. If the intersection point of L₁ and L₂ is reached exactly, H_A is to be accepted.
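The decision rule translates almost directly into code. The sketch below takes the constants a and c as given inputs and assumes that their sign matches the sign of θ₁ (positive for θ₁>0, negative for θ₁<0), as in the two examples of this paper; the function name `triangular_decision` is hypothetical.

```python
def triangular_decision(v: float, z: float, a: float, c: float) -> str:
    """Return 'HA', 'H0' or 'continue' for the path point (v, z)."""
    l1 = a + c * v        # borderline L1: Z = a + cV
    l2 = -a + 3 * c * v   # borderline L2: Z = -a + 3cV
    if a > 0:
        # theta_1 > 0: L1 is the upper boundary, L2 the lower one.
        if z >= l1:
            return "HA"
        if z <= l2:
            return "H0"
    else:
        # theta_1 < 0: L1 is the lower boundary, L2 the upper one.
        if z <= l1:
            return "HA"
        if z >= l2:
            return "H0"
    return "continue"


# Constants from Example 1 (a = 4.849, c = 0.168); the path points are made up.
print(triangular_decision(1, 0.0, 4.849, 0.168))   # inside the triangle
print(triangular_decision(10, 7.0, 4.849, 0.168))  # at or above L1
print(triangular_decision(10, 0.0, 4.849, 0.168))  # at or below L2
```

Checking H_A before H₀ means that the intersection point of the two lines, where both conditions hold at once, leads to acceptance of H_A, in line with the rule stated above.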

Example 1

We wish to test the null hypothesis ρ≤.6 against the alternative hypothesis ρ>.6 with α=.05, β=.1 and δ=ρ₁–.6=.2. We use k=7. From the expectations of z under ρ₀=.6 and ρ₁=.8 we obtain θ₁ by (3). We find z_0.9=1.282 and z_0.95=1.645, and hence, by (4), a=4.849 and c=0.168.

From (5) we get V_max=4.849/0.168=28.863, Z_max=9.698.

The corresponding triangle is shown in Figure 1 [Fig. 1].
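The apex of the triangle in this example can be re-computed from the two constants alone. The sketch below takes a=4.849 and c=0.168 as given (they are the values used in the V_max calculation of the example) and only evaluates the intersection of L₁ and L₂; the function name `triangle_apex` is hypothetical.

```python
def triangle_apex(a: float, c: float) -> tuple:
    """Intersection of L1: Z = a + cV and L2: Z = -a + 3cV."""
    # a + cV = -a + 3cV  =>  V = a / c, and then Z = a + c * (a / c) = 2a.
    return a / c, 2 * a


k = 7
v_max, z_max = triangle_apex(4.849, 0.168)
print(round(v_max, 3), round(z_max, 3))  # 28.863 9.698
print(round(k * v_max, 1))               # 202.0, the bound k * V_max on the number of pairs
```

Because V_m only takes integer values, a decision is reached after at most 29 sub-samples in this setting.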

Example 2

Finally we demonstrate the case of Example 11.15 in [5]. We wish to test the null hypothesis ρ≥.8 against the alternative hypothesis ρ<.6 with α=.05, β=.1 and δ=.8–.6=.2. We use k=7. From the expectations of z under ρ₀=.8 and ρ₁=.6 we obtain θ₁<0 by (3). We find z_0.9=1.282 and z_0.95=1.645, and hence, by (4), a=–4.849 and c=–0.168.

From (5) we get V_max=–4.849/–0.168=28.863, Z_max=–9.698.

The corresponding triangle is shown in Figure 2 [Fig. 2].

Simulation study

The adequacy of the sequential triangular test for hypotheses about the correlation coefficient was examined with simulated paths (V, Z) generated from bivariate normally distributed random numbers x and y with means µ_x=µ_y=0, variances σ_x²=σ_y²=1, and covariance σ_xy=ρ, which here equals the correlation coefficient. Simulations were performed with the nominal risks α_nom=.05, β_nom=.2 and several values of ρ₀, ρ₁ and k. For each parameter combination 100,000 paths were generated. The simulations as well as the calculations for the sequential triangular test were done in R [8].

As criteria for the test quality we calculated:

the relative frequency of wrongly accepting H_A, given ρ=ρ₀, which is the empirical risk of the first kind, say α_emp,
the relative frequency of wrongly accepting H₀, given ρ=ρ₁, which is the empirical risk of the second kind, say β_emp,
the average sample number ASN for ρ₀ and for ρ₁, that is, the number of sample pairs used for the calculation of r and z* until data sampling stopped (the path leaves the continuation region).

Here, ASN is the mean number of sample pairs over all 100,000 runs of the simulation study. Bear in mind that in a particular case the number of sample pairs needed for a terminal decision may lie either below or above ASN. The procedure is reasonable if α_emp≤α and β_emp≤β. Of course, the sequential test should also lead to ASN<n_fix, where n_fix is the sample size necessary in a fixed sample size test with power 1–β, given ρ=ρ₁, when testing the hypothesis ρ≤ρ₀ with the type-I-risk α=.05. Results are shown in Table 1 [Tab. 1], where the differences between α_emp and α are not too large. The ASN strongly depends on the true value of ρ. From Table 1 [Tab. 1] the reader can find reasonable values of k for the pairs (ρ₀, ρ₁) to be tested.
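The structure of such a simulation can be sketched in Python as follows. This is not the authors' R code. It assumes the standardization z* = (z – E(z | ρ₀))·√(k–3) with E(z | ρ₀) ≈ ½ ln((1+ρ₀)/(1–ρ₀)) + ρ₀/(2(k–1)), takes the triangle constants a>0 and c>0 as inputs (here the Example 1 values, which belong to β=.1 rather than the β_nom=.2 of the study), and uses far fewer runs than the 100,000 of the paper. All function names are hypothetical.

```python
import math
import random


def z_star(pairs, rho0):
    """Standardized Fisher z of one sub-sample (assumed form of the statistic)."""
    k = len(pairs)
    mx = sum(x for x, _ in pairs) / k
    my = sum(y for _, y in pairs) / k
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    r = sxy / math.sqrt(sxx * syy)
    zeta0 = math.atanh(rho0) + rho0 / (2 * (k - 1))
    return (math.atanh(r) - zeta0) * math.sqrt(k - 3)


def run_path(rho, rho0, k, a, c, rng):
    """Simulate one path for theta_1 > 0; return (decision, number of pairs used)."""
    z_sum, m = 0.0, 0
    while True:
        pairs = []
        for _ in range(k):
            # Standard bivariate normal pair with correlation rho.
            x = rng.gauss(0.0, 1.0)
            y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0.0, 1.0)
            pairs.append((x, y))
        m += 1
        z_sum += z_star(pairs, rho0)
        if z_sum >= a + c * m:        # reached or exceeded L1
            return "HA", m * k
        if z_sum <= -a + 3 * c * m:   # reached or fell below L2
            return "H0", m * k


def simulate(rho, rho0, k, a, c, runs, seed=1):
    """Return (relative frequency of accepting HA, average number of pairs)."""
    rng = random.Random(seed)
    results = [run_path(rho, rho0, k, a, c, rng) for _ in range(runs)]
    share_ha = sum(decision == "HA" for decision, _ in results) / runs
    asn = sum(n for _, n in results) / runs
    return share_ha, asn


# Under rho = rho0 the share of HA decisions estimates the empirical type-I-risk.
alpha_emp, asn_null = simulate(rho=0.6, rho0=0.6, k=7, a=4.849, c=0.168, runs=2000)
print(alpha_emp, asn_null)
```

Running the same function with rho set to ρ₁ gives the share of H_A decisions under the alternative, so that one minus this share estimates β_emp, together with the corresponding ASN.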

To summarize: if one actually wishes to check whether a correlation coefficient is large enough, we suggest first fixing a lower bound ρ₀² for the coefficient of determination, that is, the value that must at least be reached in order to speak of a meaningful percentage of the variance of y being explained by x, and vice versa. A second value of the coefficient of determination, ρ₁², is the one that the researcher should detect with a probability of at least 1–β if it holds. It is easy to adapt the procedure for negative roots of these squares. For example, one might decide on ρ₀²=.5, so that ρ₀≈.7; with ρ₁=.8, α=.05, β=.2 and k=26 (by interpolation), the total sample size will be about 108, instead of 119 for a fixed sample size study.

Note

Conflict of interest

The authors declare that they have no competing interests.

References

1. Rasch D, Pilz J, Verdooren RL, Gebhardt A. Optimal experimental design with R. New York: Chapman & Hall/CRC; 2011.
2. Kubinger KD, Rasch D, Šimečková M. Testing a correlation coefficient's significance: Using H0: 0<ρ≤λ is preferable to H0: ρ=0. Psychol Sci. 2007;49:74-87.
3. Schneider B, Rasch D, Kubinger KD, Yanagida T. A Sequential Triangular Test of a Correlation Coefficient's Null-Hypothesis: 0<ρ≤ρ0. Stat Pap. 2014 Jun 12. DOI: 10.1007/s00362-014-0604-8
4. Fisher RA. Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika. 1915;10(4):507-21.
5. Rasch D, Kubinger KD, Yanagida T. Statistics in Psychology. Using R and SPSS. Chichester: Wiley; 2011. DOI: 10.1002/9781119979630
6. Whitehead J. The Design and Analysis of Sequential Clinical Trials. 2nd ed. Chichester: Ellis Horwood; 1992.
7. Schneider B. An interactive computer program for design and monitoring of sequential clinical trials. In: Proceedings of the XVIth international biometric conference; 1992 Dec 7-11; Hamilton, New Zealand. p. 237-50.
8. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2013. Available from: http://www.R-project.org/
