ROC-guided survival trees

Fits a "rocTree" model.

rocTree(formula, data, id, subset, ensemble = TRUE, splitBy = c("dCON",
  "CON"), control = list())

Arguments

formula	is a formula object, with the response on the left of a '~' operator, and the terms on the right. The response must be a survival object returned by the 'Surv' function.
data	is an optional data frame in which to interpret the variables occurring in the 'formula'.
id	is an optional vector of subject identifiers used to group longitudinal observations by subject. The length of 'id' should equal the total number of observations. If 'id' is missing, each row of `data` is treated as a distinct subject and all covariates are treated as baseline covariates.
subset	is an optional vector specifying a subset of observations to be used in the fitting process.
ensemble	is an optional logical value. If `TRUE` (default), the ensemble method is fitted; otherwise, a single survival tree is fitted.
splitBy	is a character string specifying the splitting algorithm. The available options are 'CON' and 'dCON', corresponding to the splitting algorithm based on the total concordance measure or the difference in concordance measure, respectively. The default value is 'dCON'.
control	is a list of control parameters. See 'Details' for important special features of control parameters.
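
The requirement that 'id' has one entry per observation can be checked before fitting. The following is a minimal base-R sketch. The data frame `dat` is hypothetical: its column names mirror those used in the Examples, but the values are made up and the layout is only an assumption about how longitudinal rows might look, not a statement of what rocTree() requires.

```r
# Hypothetical long-format data: several rows per subject (values are made up).
dat <- data.frame(
  id    = c(1, 1, 1, 2, 2, 3),
  Time  = c(2.1, 2.1, 2.1, 0.8, 0.8, 3.5),
  death = c(1, 1, 1, 0, 0, 1),
  z1    = c(0.10, 0.25, 0.40, 0.70, 0.65, 0.30)
)

# 'id' should have exactly one entry per row of the data.
stopifnot(length(dat$id) == nrow(dat))

# Number of observations contributed by each subject.
obs_per_subject <- table(dat$id)
print(obs_per_subject)

# If every subject contributes a single row, all covariates are
# baseline covariates and 'id' could be left out of the call.
baseline_only <- all(obs_per_subject == 1)
print(baseline_only)
```

Here some subjects contribute more than one row, so `id = id` would be passed to rocTree() as in the Examples.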

Value

An object of S4 class "rocTree" representing the fit, with the following components:

Details

The argument "control" defaults to a list with the following values:

tau: is the maximum follow-up time; the default value is the 90th percentile of the unique observed survival times.
maxTree: is the number of survival trees to be used in the ensemble method (when ensemble = TRUE).
maxNode: is the maximum node number allowed to be in the tree; the default value is 500.
numFold: is the number of folds used in the cross-validation. When numFold > 0, the survival tree will be pruned; when numFold = 0, the unpruned survival tree will be presented. The default value is 10.
h: is the smoothing parameter (bandwidth) used in the kernel; the default value is tau / 20.
minSplitTerm: is the minimum number of baseline observations in each terminal node; the default value is 15.
minSplitNode: is the minimum number of baseline observations in each splittable node; the default value is 30.
disc: is a logical vector specifying whether the covariates in formula are discrete (TRUE) or continuous (FALSE). The length of disc should be the same as the number of covariates in formula. When not specified, the rocTree() function assumes continuous covariates for all.
K: is the number of time points on which the concordance measure is computed. A coarser time grid (smaller K) generally runs faster, but a very small K is not recommended. The default value is 20.
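
The defaults listed above can be spelled out as an explicit control list. The following is a rough base-R sketch, not the package's internal code. The vector `time` is made-up data, and using quantile() with R's default settings for the "90th percentile" is an assumption; rocTree() may compute it differently.

```r
# Made-up follow-up times, for illustration only.
time <- c(0.4, 1.2, 1.2, 2.5, 3.1, 4.8, 5.0, 6.3, 7.7, 9.9)

# Documented defaults computed by hand.
# Assumption: "percentile" is taken to mean quantile() with its default type.
tau <- unname(quantile(unique(time), 0.9))
h   <- tau / 20

# A control list that states each documented entry explicitly.
# maxTree is left out because no default is documented for it.
ctrl <- list(
  tau          = tau,
  h            = h,
  K            = 20,
  numFold      = 10,
  maxNode      = 500,
  minSplitTerm = 15,
  minSplitNode = 30,
  disc         = c(FALSE, FALSE)  # two covariates, both continuous
)
str(ctrl)
```

Such a list would be supplied through the `control` argument, for example `control = ctrl`. In practice only the entries that differ from the defaults need to be given, as in `control = list(numFold = 0)` in the Examples.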

References

Sun, Y. and Wang, M.C. (2018+). ROC-guided classification and survival trees. Technical report.

Examples

data(simDat)

## Fitting a pruned survival tree
rocTree(Surv(Time, death) ~ z1 + z2, id = id, data = simDat, ensemble = FALSE)
#>  ROC-guided survival tree
#> 
#>  node), split
#>    * denotes terminal node
#>   
#> Root                     
#>  ¦--2) z1 <= 0.32338*    
#>  °--3) z1 > 0.32338      
#>      ¦--6) z2 <= 0.60199*
#>      °--7) z2 > 0.60199* 
#> 

## Fitting an unpruned survival tree
rocTree(Surv(Time, death) ~ z1 + z2, id = id, data = simDat, ensemble = FALSE,
        control = list(numFold = 0))
#>  ROC-guided survival tree
#> 
#>  node), split
#>    * denotes terminal node
#>   
#> Root                          
#>  ¦--2) z1 <= 0.32338          
#>  ¦   ¦--4) z1 <= 0.16418*     
#>  ¦   °--5) z1 > 0.16418*      
#>  °--3) z1 > 0.32338           
#>      ¦--6) z2 <= 0.60199      
#>      ¦   ¦--12) z2 <= 0.22388*
#>      ¦   °--13) z2 > 0.22388* 
#>      °--7) z2 > 0.60199*      
#> 

# NOT RUN {
## Fitting the ensemble algorithm (default)
rocTree(Surv(Time, death) ~ z1 + z2, id = id, data = simDat, ensemble = TRUE)
# }
