Article
Tree-Based Modeling of Subdistribution Hazards in Discrete Time
Published: February 26, 2021
Text
In many studies, individuals may experience events of various types. Typical examples in clinical research are the development of different kinds of disease or the occurrence of specific causes of death. Analyzing such data requires suitable techniques for competing risks analysis. Traditional methods usually assume that the event times are measured on a continuous scale. In practice, however, the exact (continuous) event times are often not observed; only the intervals (i.e., pairs of fixed consecutive points in time) in which the events of interest took place are known. Thus, time is measured on a discrete scale.
Here, interest lies in the observation time T until the occurrence of one of J competing events, measured on a discrete time scale t = 1, 2, ..., k. The key quantity for describing competing risks data is the discrete cumulative incidence function, which for event j is defined by Fj(t | x) = P(T ≤ t, ε = j | x), where the random variable ε ∈ {1, ..., J} represents the event type and x = (x1, ..., xp) is a vector of covariates.
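To make the definition concrete, the following minimal sketch computes the empirical counterpart of Fj(t) on a small toy sample. It assumes that every event time is fully observed (no censoring) and ignores covariates; the function name `empirical_cif` and the data are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def empirical_cif(times, events, j, k):
    """Empirical F_j(t) for t = 1..k: share of individuals with T <= t and event type j.

    Assumption: all event times are observed (no censoring).
    """
    n = len(times)
    return [
        Fraction(sum(1 for T, e in zip(times, events) if T <= t and e == j), n)
        for t in range(1, k + 1)
    ]

# Hypothetical toy data: discrete event times and event types (J = 2, k = 4)
times = [1, 2, 2, 3, 4, 4]
events = [1, 2, 1, 1, 2, 1]

cif_1 = empirical_cif(times, events, j=1, k=4)
cif_2 = empirical_cif(times, events, j=2, k=4)
```

Because every individual in this toy sample experiences one of the two events by t = 4, the two cumulative incidence functions add up to one at the last time point.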
A popular approach is the proportional subdistribution hazard model [1], which models the cumulative incidence function of one event of interest j directly. Subdistribution hazard models have been extended to the discrete-time case by [2]. The methodology in [2] uses parametric regression models in which linear combinations of the covariates determine the subdistribution hazard λj(t | x), which is directly linked to Fj(t | x).
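The link between the two quantities can be illustrated with a short sketch. It assumes the usual discrete-time relation Fj(t | x) = 1 − ∏ (1 − λj(s | x)), with the product taken over s = 1, ..., t; the helper name `cif_from_hazards` and the hazard values are hypothetical.

```python
def cif_from_hazards(hazards):
    """Map subdistribution hazards lambda_j(1..k | x) to F_j(1..k | x).

    Assumes F_j(t | x) = 1 - prod_{s <= t} (1 - lambda_j(s | x)).
    """
    cif, surv = [], 1.0
    for lam in hazards:
        surv *= 1.0 - lam       # probability that event j has not occurred up to t
        cif.append(1.0 - surv)  # cumulative incidence of event j at t
    return cif

# Hypothetical hazards for t = 1, 2, 3 at a fixed covariate value x
hazards = [0.10, 0.20, 0.05]
cif = cif_from_hazards(hazards)
```

Any model that yields estimates of λj(t | x), whether parametric or tree-based, can be turned into cumulative incidence estimates in this way.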
When parametric models are too restrictive, for example because unknown interactions between covariates are present, an alternative strategy is to apply recursive partitioning techniques (trees). Following the tree-based method by [3], which was designed for discrete hazard models with a single type of event, a discrete subdistribution hazard model of the form λj(t | x) = fj(t, x) is proposed, where the function fj(·) is determined by a Classification and Regression Tree (CART) with binary outcome. For tree building, the covariates x1, ..., xp as well as the time t (coded as an ordinal variable) are considered as candidates for splitting. As in the classical CART approach, the proposed splitting criterion is based on impurity measures. During tree building, the minimum node size serves as the main tuning parameter for pruning; it can be determined either by cross-validation of the log-likelihood or by information criteria such as AIC and BIC. Controlling the tree size keeps the variance of the resulting subdistribution hazard estimates from becoming too large, since this variance is inversely related to the terminal node size.
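The tree-building step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it assumes no censoring, a single covariate x, the Gini index as impurity measure (the text only speaks of impurity measures in general), and it searches for one split only. The names `augment`, `gini` and `best_split` are hypothetical.

```python
def augment(times, events, xs, j, k):
    """Build person-period rows (t, x, y) with binary outcome y.

    y = 1 only in the period where event j occurs. Individuals with a
    competing event stay in the risk set until k with y = 0
    (assumption: no censoring, so no weighting is needed).
    """
    rows = []
    for T, e, x in zip(times, events, xs):
        last = T if e == j else k
        for t in range(1, last + 1):
            rows.append((t, x, int(e == j and t == T)))
    return rows

def gini(ys):
    """Gini impurity of a list of binary outcomes."""
    if not ys:
        return 0.0
    p = sum(ys) / len(ys)
    return 2.0 * p * (1.0 - p)

def best_split(rows, min_node_size):
    """Return (impurity decrease, feature index, threshold) or None.

    Feature 0 is the time t (ordinal), feature 1 is the covariate x.
    Splits producing a node smaller than min_node_size are skipped.
    """
    n = len(rows)
    parent = gini([r[2] for r in rows])
    best = None
    for f in (0, 1):
        for c in sorted({r[f] for r in rows})[:-1]:
            left = [r[2] for r in rows if r[f] <= c]
            right = [r[2] for r in rows if r[f] > c]
            if min(len(left), len(right)) < min_node_size:
                continue
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or gain > best[0]:
                best = (gain, f, c)
    return best

# Hypothetical toy data: event times, event types and one binary covariate
times = [1, 2, 2, 3, 4, 4]
events = [1, 2, 1, 1, 2, 1]
xs = [0, 0, 1, 1, 0, 1]

rows = augment(times, events, xs, j=1, k=4)
split = best_split(rows, min_node_size=3)
```

In a full tree, the split search would be applied recursively to the resulting nodes, and the subdistribution hazard in each terminal node would be estimated by the proportion of rows with y = 1. A larger `min_node_size` yields fewer, larger nodes and thus less variable hazard estimates.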
The proposed approach is illustrated by an analysis of age-related macular degeneration (AMD) among elderly people who were monitored at annual study visits.
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.
References
1. Fine JP, Gray RJ. A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association. 1999;94:496-509.
2. Berger M, Schmid M, Welchowski T, Schmitz-Valckenberg S, Beyersmann J. Subdistribution hazard models for competing risks in discrete time. Biostatistics. 2018:kxy069.
3. Schmid M, Küchenhoff H, Hörauf A, Tutz G. A survival tree method for the analysis of discrete event times in clinical and epidemiological studies. Statistics in Medicine. 2016;35:734-751.