### Article

## A bivariate survival model with cure fraction

### Authors

Published: September 14, 2004

### Text

#### Introduction

Models for survival analysis typically assume that everybody in the study population is susceptible to the event under study and will eventually experience it if follow-up is sufficiently long. This is often an unstated assumption of the widely used proportional hazards models and of their extensions, the frailty models. However, there are situations in which a fraction of individuals is not expected to experience the event of interest; that is, those individuals are cured or insusceptible. For example, researchers may be interested in analyzing the recurrence of a disease. Many individuals may never experience a recurrence, so a cured fraction of the population exists. Cure models are survival models that allow for such a cured fraction and have historically been used to estimate its size. These models extend the understanding of time-to-event data by supporting more accurate and informative conclusions, which are unobtainable from an analysis that fails to account for a cured or insusceptible fraction of the population. If a cured component is not present, the analysis reduces to the standard approaches of survival analysis.
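The idea of a cured fraction can be illustrated with the usual mixture form of a cure model, in which the population survival function levels off at the insusceptible proportion instead of falling to zero. The sketch below is only an illustration: the function names, the exponential survival function for susceptible individuals, and the numerical values are assumptions chosen for the example, not quantities taken from the study.

```python
import math


def cure_mixture_survival(t, susceptible_fraction, susceptible_survival):
    """Population survival S_pop(t) = (1 - phi) + phi * S_u(t).

    phi is the susceptible fraction and S_u is the survival function
    of the susceptible individuals.
    """
    phi = susceptible_fraction
    return (1.0 - phi) + phi * susceptible_survival(t)


def exponential_survival(rate):
    """Return S_u(t) = exp(-rate * t); an illustrative choice only."""
    return lambda t: math.exp(-rate * t)


# Hypothetical values: 22% susceptible, constant hazard 0.05 per time unit.
s_u = exponential_survival(rate=0.05)
curve = [(t, cure_mixture_survival(t, 0.22, s_u)) for t in (0, 20, 50, 100, 500)]
```

As follow-up grows, the curve approaches the insusceptible proportion `1 - phi` rather than zero, and setting `phi = 1` recovers the standard survival function without a cured component.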

We suggest a cure-mixture model to analyze bivariate time-to-event data, motivated by the paper of Chatterjee and Shih [Ref. 1], but with a simpler estimation procedure and with the correlated gamma-frailty model in place of the shared gamma-frailty model. This approach allows us to deal with left-truncated and right-censored lifetime data and accounts for heterogeneity as well as for an insusceptible (cure) fraction in the study population. We perform a simulation study to evaluate the properties of the estimates in the proposed model and apply the model to breast cancer incidence data for 5,857 Swedish female monozygotic and dizygotic twin pairs from the so-called old cohort of the Swedish Twin Registry. The model is used to estimate the size of the susceptible fraction and the correlation between the frailties of the twin partners. Possible extensions, advantages and limitations of the method are discussed.

#### Methods

In the following we apply the correlated gamma-frailty model (Pickles et al. [Ref. 2]; Yashin et al. [Ref. 3]; Petersen [Ref. 4]; Wienke et al. [Ref. 5], [Ref. 6], [Ref. 7], among others), extended by an insusceptible fraction, to fit bivariate time-to-event data, the event being the occurrence of breast cancer. The correlated gamma-frailty model provides a specific parameter for the correlation between the two frailties. The interesting point here is that the individual frailties in twin pairs cannot be observed, but their correlation can be estimated by applying the correlated gamma-frailty model. We use a parametric approach, fitting a Gamma-Gompertz model to the data.

For a combined analysis of monozygotic and dizygotic twins we include two correlation coefficients, ρMZ and ρDZ, respectively. Comparing the frailty correlations of monozygotic and dizygotic twins provides information about genetic and environmental influences on frailty.
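To make the structure of the correlated gamma-frailty model concrete, the following sketch evaluates the bivariate survival function in the form commonly given for this model, with Gamma-Gompertz marginals and without the insusceptible fraction. It assumes that both twin partners share the same Gompertz parameters `a` and `b` and the same frailty variance `sigma2`, with frailty mean 1. The function names and parameter values are hypothetical and are not the fitted values from the study.

```python
import math


def gompertz_cum_hazard(t, a, b):
    """Cumulative Gompertz hazard H(t) = (a / b) * (exp(b * t) - 1)."""
    return (a / b) * math.expm1(b * t)


def gamma_gompertz_survival(t, a, b, sigma2):
    """Marginal survival S(t) = (1 + sigma2 * H(t)) ** (-1 / sigma2)."""
    return (1.0 + sigma2 * gompertz_cum_hazard(t, a, b)) ** (-1.0 / sigma2)


def correlated_frailty_survival(t1, t2, a, b, sigma2, rho):
    """Bivariate survival under correlated gamma frailty.

    S(t1, t2) = (S1 * S2) ** (1 - rho)
                * (S1 ** -sigma2 + S2 ** -sigma2 - 1) ** (-rho / sigma2)
    where rho is the correlation between the two frailties.
    """
    s1 = gamma_gompertz_survival(t1, a, b, sigma2)
    s2 = gamma_gompertz_survival(t2, a, b, sigma2)
    dependent_part = (s1 ** -sigma2 + s2 ** -sigma2 - 1.0) ** (-rho / sigma2)
    return (s1 * s2) ** (1.0 - rho) * dependent_part


# Hypothetical parameters; rho would be rho_MZ or rho_DZ depending on zygosity.
a, b, sigma2 = 1e-4, 0.1, 2.0
s_joint = correlated_frailty_survival(60.0, 70.0, a, b, sigma2, rho=0.5)
```

Two special cases show why this form is more flexible than the shared frailty model: `rho = 0` gives independent lifetimes (the product of the marginals), while `rho = 1` reduces to the shared gamma-frailty model.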

#### Results

In this paper we have suggested a cure-mixture model for the modeling of correlations in bivariate time-to-event data. This model extends the approach outlined in the paper of Chatterjee and Shih [Ref. 1] in several ways. First, instead of the shared gamma-frailty model we use the much more flexible correlated gamma-frailty model, which includes the shared gamma-frailty model as a special case. Second, we propose a direct estimation procedure in the parametric model instead of the two-step estimation procedure used by Chatterjee and Shih [Ref. 1]. Third, we think that our twin data are a more appropriate illustrative example for such bivariate models than the family data of Chatterjee and Shih [Ref. 1], who ignored higher-order correlations in their family data. Nevertheless, our estimate of the size of the fraction susceptible to breast cancer, 0.222 (0.045), is very close to the estimate of 0.22 (0.0093) that Chatterjee and Shih [Ref. 1] found with the parametric model in a completely different study population. Fourth, we allow the lifetimes to be truncated in our model.
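How truncation and censoring enter a direct likelihood-based estimation can be sketched in a simplified univariate setting. The code below is not the bivariate likelihood of the model described here: it assumes a single individual, a Gamma-Gompertz survival function for susceptible individuals, and a mixture cure fraction, and it conditions on survival up to the age at study entry in the standard way for left-truncated data. All function names and parameter values are hypothetical.

```python
import math


def gompertz_cum_hazard(t, a, b):
    """Cumulative Gompertz hazard H(t) = (a / b) * (exp(b * t) - 1)."""
    return (a / b) * math.expm1(b * t)


def susceptible_survival(t, a, b, sigma2):
    """Gamma-Gompertz survival of susceptible individuals."""
    return (1.0 + sigma2 * gompertz_cum_hazard(t, a, b)) ** (-1.0 / sigma2)


def susceptible_density(t, a, b, sigma2):
    """Density -dS_u/dt = a * exp(b * t) * (1 + sigma2 * H(t)) ** (-1/sigma2 - 1)."""
    base = 1.0 + sigma2 * gompertz_cum_hazard(t, a, b)
    return a * math.exp(b * t) * base ** (-1.0 / sigma2 - 1.0)


def population_survival(t, phi, a, b, sigma2):
    """Mixture survival (1 - phi) + phi * S_u(t) with susceptible fraction phi."""
    return (1.0 - phi) + phi * susceptible_survival(t, a, b, sigma2)


def log_likelihood_contribution(entry_age, exit_age, event, phi, a, b, sigma2):
    """Log-likelihood of one left-truncated, possibly right-censored record.

    An observed event contributes the population density phi * f_u(exit_age);
    a censored record contributes the population survival at exit_age.
    Both are divided by the population survival at entry_age (left truncation).
    """
    if event:
        numerator = phi * susceptible_density(exit_age, a, b, sigma2)
    else:
        numerator = population_survival(exit_age, phi, a, b, sigma2)
    denominator = population_survival(entry_age, phi, a, b, sigma2)
    return math.log(numerator) - math.log(denominator)


# Hypothetical record: entered at age 50, censored at age 80.
ll = log_likelihood_contribution(50.0, 80.0, False, 0.22, 1e-4, 0.1, 2.0)
```

Summing such contributions over all records and maximising over the parameters gives a one-step (direct) estimate in this simplified setting; the bivariate case additionally requires the joint survival function and its partial derivatives.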

#### Discussion

Cure models with right-censored observations suffer from an inherent identifiability problem. For a censored observation, the event under study has not occurred either because the person is insusceptible or because the person is susceptible but follow-up was not long enough for the event to be observed. The identifiability problem grows with increasing censoring but is lessened by parametric modeling of the baseline hazard. The simulation study shows that the estimation procedure works well under the truncation and censoring scheme of our sample data set, whereas stronger right censoring causes severe identifiability problems. In cure models with fixed censoring times (caused by the end of the study), censoring is no longer non-informative, even when the censoring times and the survival times are independent: the proportion of censored observations contains important information about the model parameters.
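The identifiability problem can be made tangible with a small numerical sketch. It assumes, purely for illustration, a mixture cure model with an exponential survival function for susceptible individuals, and compares two hypothetical parameter pairs: a small susceptible fraction with a high event rate, and a larger susceptible fraction with a lower rate. Neither the model form nor the values are taken from the study.

```python
import math


def cure_exponential_survival(t, phi, rate):
    """Population survival (1 - phi) + phi * exp(-rate * t)."""
    return (1.0 - phi) + phi * math.exp(-rate * t)


def max_survival_gap(params_a, params_b, follow_up, steps=1000):
    """Largest absolute difference between two survival curves on [0, follow_up]."""
    gap = 0.0
    for i in range(steps + 1):
        t = follow_up * i / steps
        diff = abs(
            cure_exponential_survival(t, *params_a)
            - cure_exponential_survival(t, *params_b)
        )
        gap = max(gap, diff)
    return gap


# Hypothetical parameter pairs (phi, rate) with the same product phi * rate.
few_fast = (0.2, 0.10)
many_slow = (0.4, 0.05)

gap_short = max_survival_gap(few_fast, many_slow, follow_up=1.0)
gap_long = max_survival_gap(few_fast, many_slow, follow_up=100.0)
```

With these values the two curves differ by less than 0.001 when follow-up ends after one time unit, so heavily censored data can barely distinguish a susceptible fraction of 0.2 from one of 0.4. Over 100 time units the gap approaches 0.2, because the curves level off at different plateaus.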

The present paper is restricted to the parametric case, meaning that the marginal survival functions are specified parametrically up to a few (one-dimensional) parameters. From a statistical point of view such a parametric assumption is unsatisfactory because it cannot be justified. Frailty models for univariate data have been strongly criticised because assumptions have to be made about both the shape of the underlying mortality trajectory and the distribution of the frailty: different pairs of assumptions can result in equally good fits to the data. Without an insusceptible fraction in the population, this problem can be solved by using the non-parametric correlated gamma-frailty model (Yashin and Iachine [Ref. 8]). When the (true) parametric and the semi-parametric estimation procedures are applied to the same simulated data generated from the correlated gamma-frailty model, the semi-parametric procedure performs well, even though it does not use the additional information about the parametric structure of the marginal survival functions. The estimates of σ^{2} and ρ are similar in both cases (results not shown). Nevertheless, using the wrong parametric model may result in biased parameter estimates.

To what extent this method is applicable in the much more complicated semi-parametric model with a cure fraction is still an open question that needs further careful consideration. A disease with a late age of onset produces heavily censored data, which may lead to problems in estimating the (infinite-dimensional) nuisance parameter, namely the marginal survival function, and consequently in estimating the parameters of interest, σ^{2} and ρ.

Our study points to the existence of an important insusceptible fraction. The suggested model gives a clear illustration of how survival analysis and cure models could be merged for analysis of time-to-event data of related individuals.

#### Acknowledgments

The authors wish to thank the Swedish Twin Registry for providing the twin data. The research was partly supported by NIH/NIA grant 7PO1AG08761-09. The Swedish Twin Registry is funded by a grant from the Department of Higher Education, the Swedish Scientific Council, and AstraZeneca.

### References

1. Chatterjee, N., Shih, J. (2001) A bivariate cure-mixture approach for modeling familial association in diseases. Biometrics 57, 779-786.
2. Pickles, A., Crouchley, R., Simonoff, E., Eaves, L., Meyer, J., Rutter, M., Hewitt, J., Silberg, J. (1994) Survival Models for Developmental Genetic Data: Age of Onset of Puberty and Antisocial Behavior in Twins. Genetic Epidemiology 11, 155-170.
3. Yashin, A.I., Vaupel, J.W., Iachine, I.A. (1995) Correlated individual frailty: An advantageous approach to survival analysis of bivariate data. Mathematical Population Studies 5, 145-159.
4. Petersen, J.H. (1998) An additive frailty model for correlated lifetimes. Biometrics 54, 646-661.
5. Wienke, A., Christensen, K., Skytthe, A., Yashin, A.I. (2002) Genetic analysis of cause of death in a mixture model with bivariate lifetime data. Statistical Modelling 2, 89-102.
6. Wienke, A., Lichtenstein, P., Yashin, A.I. (2003) A bivariate frailty model with a cure fraction for modeling familial correlations in diseases. Biometrics 59, 1178-1183.
7. Wienke, A., Holm, N., Christensen, K., Skytthe, A., Vaupel, J., Yashin, A.I. (2003) The heritability of cause-specific mortality: a correlated gamma-frailty model applied to mortality due to respiratory diseases in Danish twins born 1870-1930. Statistics in Medicine 22, 3873-3887.
8. Yashin, A.I., Iachine, I.A. (1995) Genetic analysis of durations: correlated frailty model applied to the survival of Danish twins. Genetic Epidemiology 12, 529-538.