Recur objects
In this vignette, we demonstrate how to create a recurrent event object with the Recur() function from the reda package (Wang et al. 2021). The Recur() function is imported when the reReg package is loaded. A Recur object bundles together a set of recurrent event times, a failure time, and a censoring status, with the convenience that it can be used as the response in model formulas in the reReg package. We illustrate the usage of Recur() with the cgd data set from the survival package (Therneau 2021) and the readmission data set from the frailtypack package (Rondeau, Mazroui, and González 2012; González et al. 2005).
> library(reReg)
> packageVersion("reReg")
[1] '1.4.6'
> data(readmission, package = "frailtypack")
> head(readmission)
id enum t.start t.stop time event chemo sex dukes charlson death
1 1 1 0 24 24 1 Treated Female D 3 0
2 1 2 24 457 433 1 Treated Female D 0 0
3 1 3 457 1037 580 0 Treated Female D 0 0
4 2 1 0 489 489 1 NonTreated Male C 0 0
5 2 2 489 1182 693 0 NonTreated Male C 0 0
6 3 1 0 15 15 1 NonTreated Male C 3 0
> readmission <- subset(readmission, !(id %in% c(60, 109, 280)))
Recur interface
The Recur() function is modeled after the Surv() function in the survival package (Therneau 2021). The function interface of Recur() is
> args(Recur)
function (time, id, event, terminal, origin, check = c("hard",
"soft", "none"), ...)
NULL
The six arguments are
time: event and censoring times. It can be a vector that represents the times of recurrent events and censoring, or a list of time intervals that contains the starting and ending times of each interval. In the latter case, the intervals are assumed to be open on the left and closed on the right, where the right endpoints are the times of recurrent events and censoring.
id: subject’s id. It can be a numeric vector, a character vector, or a factor. If it is left unspecified, Recur() will assume that each row represents a distinct subject.
event: event indicator of recurrent events. This is a numeric vector that represents the types of the recurrent events. A logical vector is allowed and is converted to a numeric vector. Non-positive values are internally converted to zero, indicating censoring status.
terminal: event indicator of terminal events. This is a numeric vector that represents the status of the terminal event. A logical vector is allowed and is converted to a numeric vector. Non-positive values are internally converted to zero, indicating censoring status. If a scalar value is specified, all subjects will have the same terminal event status at their last recurrent episodes. The length of the specified terminal should equal either the number of subjects or the number of data rows. In the latter case, each subject may have at most one positive entry of terminal, at the last recurrent episode.
origin: time origin of subjects. This is a numeric vector indicating the time origin of each subject. If a scalar value is specified, all subjects will have the same origin at the specified value. The length of the specified origin should equal either the number of subjects or the number of data rows. In the latter case, different subjects may have different origins, but all rows of the same subject must share the same origin. In addition to numeric values, Date and difftime values are also supported and are converted to numeric values.
check: indicates how to run the data checking procedure. This is a character value specifying how to perform the checks for recurrent event data. Errors are thrown if check = "hard" (default), and warnings are thrown if check = "soft". If check = "none" is specified, no data checking procedure is run.
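As a minimal sketch of how the arguments above fit together, the following builds a Recur object from a small made-up data set and then tries a deliberately malformed data set under each check mode. The data frames toy and bad, their columns, and the helper try_check() are hypothetical and not part of reda or reReg. Which records the check actually flags is an assumption here, so the outcomes are printed rather than stated.

```r
library(reda)  # Recur() and %to% come from reda

# Hypothetical toy data: subject 1 has two events and dies at t = 12;
# subject 2 has one event and is censored at t = 10.
toy <- data.frame(
  id    = c(1, 1, 1, 2, 2),
  start = c(0, 5, 9, 0, 4),
  stop  = c(5, 9, 12, 4, 10),
  event = c(1, 1, 0, 1, 0),
  death = c(0, 0, 1, 0, 0)
)

# time given as intervals (start, stop] built with %to%
r_interval <- with(toy, Recur(time = start %to% stop, id = id,
                              event = event, terminal = death))
# time given as a vector of event and censoring times
r_vector <- with(toy, Recur(time = stop, id = id,
                            event = event, terminal = death))
r_interval

# Hypothetical malformed data: subject 1 has an event recorded after censoring.
bad <- data.frame(id = c(1, 1, 2), time = c(5, 9, 7), event = c(0, 1, 0))

# Report whether Recur() finished quietly, warned, or stopped with an error.
try_check <- function(check) {
  tryCatch({
    with(bad, Recur(time, id, event, check = check))
    "ok"
  }, warning = function(w) "warning", error = function(e) "error")
}
outcomes <- vapply(c("hard", "soft", "none"), try_check, character(1))
outcomes
```

If the check flags the malformed record, one would expect an error under "hard", a warning under "soft", and a silent result under "none", in line with the description of the check argument.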
Recur object
For the readmission data set, the time argument can be specified with time = t.stop or with time = t.start %to% t.stop, where the infix operator %to% is used to create a list of two elements containing the endpoints of the time intervals. When check = "hard" or check = "soft", the Recur() function performs an internal check for possible issues in the data structure. If the check fails, Recur() terminates with an error message when check = "hard" (default) and proceeds with a warning message when check = "soft". When check = "none", no check is run and Recur() proceeds without a message. The checking criteria include the following:
The Recur() function matches arguments by position when argument names are not specified. Among all the arguments, only time has no default value and must be specified by the user. The default value for id is seq_along(time); thus, Recur() assumes that each row specifies the time point for a different subject when id is not specified. However, relying on the default id defeats the purpose of using recurrent event methods. The default value for event is a numeric vector, where the values 0 and 1 indicate whether the endpoint of each time interval in time is a non-recurrent event or a recurrent event, respectively. The event argument can accommodate more than one type of recurrent event; in this case, the reference level (value 0) is used to indicate a non-recurrent event. Finally, a zero vector is used as the default value for the arguments terminal and origin.
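The matching and default rules described above can be illustrated with a short sketch. The data frame toy and its columns are hypothetical; the example assumes the rows are already sorted by subject and time, so that the rows of the stored matrix line up with the input.

```r
library(reda)

# Hypothetical toy data, already sorted by subject and time
toy <- data.frame(id = c(1, 1, 1, 2, 2), stop = c(5, 9, 12, 4, 10))

by_position <- with(toy, Recur(stop, id))             # matched as time, id
by_name     <- with(toy, Recur(id = id, time = stop)) # same call with names

# event, terminal, and origin are left at their defaults;
# the last row of each subject is treated as a censoring time
by_position@.Data[, c("event", "terminal", "origin")]
```

Naming the arguments is usually the safer choice, since a swapped position (for example, passing the terminal indicator where event is expected) would not necessarily be caught by the data check.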
The default values in Recur() are chosen so that Recur() can be conveniently adopted in common situations. For example, when the recurrent events are observed continuously and there are no terminal events, the event and terminal arguments can be left unspecified. In this case, the last entry within each subject is treated as a censoring time. One example is the cgd data from the survival package, where the recurrent event is a serious infection observed in a placebo-controlled trial of gamma interferon in chronic granulomatous disease. A terminal event was not defined in the cgd data, and the patients were observed through the end of the study. For this data set, the Recur object can be constructed as below:
> data(cgd, package = "survival")
> (recur1 <- with(cgd, Recur(tstart %2% tstop, id)))
...
[1] 1: (0, 219], (219, 373], (373, 414+]
[2] 2: (0, 8], (8, 26], ..., (350, 439+]
[3] 3: (0, 382+]
[4] 4: (0, 388+]
[5] 5: (0, 246], (246, 253], (253, 383+]
[6] 6: (0, 364+]
[7] 7: (0, 292], (292, 364+]
[8] 8: (0, 363+]
[9] 9: (0, 294], (294, 349+]
[10] 10: (0, 371+]
...
For each subject, the printed Recur object shows intervals that represent the duration until the next event (a recurrent event or a terminal event). The Recur object for the readmission data set can be constructed as below:
> (recur2 <- with(readmission, Recur(t.stop, id, event, death)))
...
[1] 1: (0, 24], (24, 457], (457, 1037+]
[2] 2: (0, 489], (489, 1182+]
[3] 3: (0, 15], (15, 783*]
[4] 4: (0, 163], (163, 288], ..., (686, 2048+]
[5] 5: (0, 1134], (1134, 1144+]
[6] 6: (0, 627], (627, 1190], ..., (1406, 1407+]
[7] 7: (0, 38], (38, 42], ..., (63, 1049+]
[8] 8: (0, 1466*]
[9] 9: (0, 148], (148, 1474+]
[10] 10: (0, 1113+]
...
The readmission example above shows that patient id #1 experienced two hospital readmissions and was followed until t = 1037 (days). The + at t = 1037 indicates that the terminal time was censored, i.e., this patient did not experience the event of interest (death) by t = 1037. Similarly, patient id #3 had one readmission and died at t = 783 (days), as indicated by the * at 783. On the other hand, patient id #4 had more than 3 readmissions and was censored at t = 2048 (days). Some of the readmission intervals were suppressed to prevent the printed results from exceeding the screen width. The number of intervals to be printed can be tuned through options() with the option reda.Recur.maxPrint.
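A minimal sketch of adjusting the print width follows. The data frame long is hypothetical, and the value 10 is only an illustration; the exact way the option truncates the output is assumed from the description above rather than shown here.

```r
library(reda)

# Hypothetical subject with six recurrent events, censored at t = 70
long <- data.frame(id = rep(1, 7), stop = seq(10, 70, by = 10))
r <- with(long, Recur(stop, id))

# Raise the print limit (assumed to count intervals per subject),
# print the object, then restore the previous setting.
old <- options(reda.Recur.maxPrint = 10)
print(r)
options(old)
```

Saving the value returned by options() and passing it back afterwards keeps the change local to this snippet.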
Recur output
Recur() returns an S4 object representing the model response for recurrent event data. The following shows the structure of the Recur object created for the cgd data.
> str(recur1)
Formal class 'Recur' [package "reda"] with 9 slots
..@ .Data : num [1:203, 1:6] 0 219 373 0 8 26 152 241 249 322 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:6] "time1" "time2" "id" "event" ...
..@ call : language Recur(time = tstart %2% tstop, id = id)
..@ ID : chr [1:128] "1" "2" "3" "4" ...
..@ ord : int [1:203] 1 2 3 4 5 6 7 8 9 10 ...
..@ rev_ord : int [1:203] 1 2 3 4 5 6 7 8 9 10 ...
..@ first_idx : int [1:128] 1 4 12 13 14 17 18 20 21 23 ...
..@ last_idx : int [1:128] 3 11 12 13 16 17 19 20 22 23 ...
..@ check : chr "hard"
..@ time_class: chr "integer"
..$ dim : int [1:2] 203 6
..$ dimnames:List of 2
.. ..$ : NULL
.. ..$ : chr [1:6] "time1" "time2" "id" "event" ...
The slots of the Recur S4 class are
.Data: a numeric matrix with columns time1, time2, id, event, terminal, and origin.
call: a function call producing the Recur object.
ID: a character vector storing the original subject IDs.
ord: indices that sort the response matrix by rows. Sorting is in increasing order by id, time2, and -event.
rev_ord: indices that revert the response matrix sorted by ord to its original ordering.
first_idx: indices that indicate the first record of each subject in the sorted matrix.
last_idx: indices that indicate the last record of each subject in the sorted matrix.
check: a character string that records the check argument specified in Recur().
time_class: a character string that records the class of the original times, for example when they are specified as calendar dates.
The summary of a Recur object can be printed with summary().
> summary(recur1)
Call:
Recur(time = tstart %2% tstop, id = id)
Sample size: 128
Number of recurrent event observed: 75
Average number of recurrent event per subject: 0.586
Proportion of subjects with a terminal event: 0
Median follow-up time: 293
> summary(recur2)
Call:
Recur(time = t.stop, id = id, event = event, terminal = death)
Sample size: 400
Number of recurrent event observed: 452
Average number of recurrent event per subject: 1.13
Proportion of subjects with a terminal event: 0.265
Median follow-up time: 1143
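The roles of the ord, rev_ord, first_idx, and last_idx slots listed earlier can be mimicked in base R. This is only a sketch of the idea on a hypothetical, unsorted response table resp; it is not the code reda uses internally.

```r
# Hypothetical unsorted response table with two subjects
resp <- data.frame(
  id    = c(2, 1, 1, 2, 1),
  time2 = c(4, 9, 5, 10, 12),
  event = c(1, 1, 1, 0, 0)
)

# Sort by id, then time2, then decreasing event (events before censoring on ties)
ord <- with(resp, order(id, time2, -event))
sorted <- resp[ord, ]

# Indices that undo the sorting
rev_ord <- order(ord)
restored <- sorted[rev_ord, ]

# First and last record of each subject in the sorted table
first_idx <- which(!duplicated(sorted$id))
last_idx  <- which(!duplicated(sorted$id, fromLast = TRUE))

ord        # 3 2 5 1 4
first_idx  # 1 4
last_idx   # 3 5
```

Storing both ord and rev_ord lets downstream code work on the sorted matrix and still return results in the order of the user's original rows.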
Readers are referred to a separate vignette on Recur() for a detailed introduction of Recur(). The reSurv() function is deprecated as of version 1.2.0. In the current version, reSurv() can still be used, but the reSurv object is automatically transformed to the corresponding Recur object.
González, Juan Ramón, Esteve Fernandez, Víctor Moreno, Josepa Ribes, Mercè Peris, Matilde Navarro, Maria Cambray, and Josep Maria Borrás. 2005. “Sex Differences in Hospital Readmission Among Colorectal Cancer Patients.” Journal of Epidemiology & Community Health 59 (6): 506–11.
Rondeau, Virginie, Yassin Mazroui, and Juan Ramón González. 2012. “frailtypack: An R Package for the Analysis of Correlated Survival Data with Frailty Models Using Penalized Likelihood Estimation or Parametrical Estimation.” Journal of Statistical Software 47 (4): 1–28.
Therneau, Terry M. 2021. A Package for Survival Analysis in R. https://CRAN.R-project.org/package=survival.
Wang, Wenjie, Haoda Fu, Sy Han Chiou, and Jun Yan. 2021. reda: Recurrent Event Data Analysis. https://github.com/wenjie2wang/reda.