Peter Murakami: simp

simp() function

added November 15, 2012

Code File

The code for the simp() function is contained in the file simp.R. You can source the file directly into R by calling

      source("http://www.biostat.jhsph.edu/~pmurakam/simp.R")

Manual page


Description:
Estimates the SI (a sensitivity) and SSI (a positive predictive value) of reflux events 
for obstructive apnea events, and reports the p-value for the test of the null hypothesis 
that reflux events and obstructive apnea events occur independently of one another in time 
(which is not necessarily the hypothesis that SI=0 or SSI=0).

Usage:
simp(tab, numiter, interval, cause, outcome, stat, method, gap.symptom, gap.rf, seed=4243859, verbose=TRUE)

Arguments:
tab         - a data frame with columns named "event" and "seconds".  The event column must contain
              values listed in the cause and outcome arguments, and one "study end".  Rows with any 
              other value in the event column will be ignored.  The seconds column must have numeric 
              values giving the time at which the event identified in the event column occurred.  Study 
              start is assumed to occur at seconds=0.
numiter     - number of iterations to perform
interval    - the interval of time that defines the window of association.  Can be a vector.
cause       - the name(s) of the cause event(s) in the "event" column of tab.  If a vector is provided,
              all elements of that vector are considered as causes.
outcome     - the name of the outcome event in the "event" column of tab.
stat        - "sensitivity" or "ppv" to calculate the SI or SSI, respectively.
method      - 1 for permuting times between events, 2 for just simulating event times from a 
              uniform distribution, and 3 for the wrap-around method.
gap.symptom - remove all symptom events that happen within 'gap.symptom' seconds after the previous 
              symptom event with no reflux event happening in between. Defaults to 0 if method is 1 
              or 3, and defaults to the interval argument if method is 2.
gap.rf      - remove all reflux events that happen within 'gap.rf' seconds before the next reflux 
              event with no symptom event happening in between.  Defaults to 0 if method is 1 or 3, 
              and defaults to the interval argument if method is 2.
seed        - random seed
verbose     - logical value, whether to print the progress bar
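
The three values of the method argument can be illustrated with a small sketch.  This is only 
one plausible reading of each description, not the code in simp.R: the helper name null_times, 
its arguments, and the choice to perturb a single vector of event times are all assumptions 
made for illustration.

```r
# Hypothetical helper, NOT taken from simp.R: draws one simulated set of
# event times on [0, end] under one reading of each 'method' value.
null_times <- function(times, end, method) {
  times <- sort(times)
  if (method == 1) {
    # Method 1 (assumed): shuffle the waiting times between events,
    # including start-to-first and last-to-end, then rebuild the times.
    gaps <- diff(c(0, times, end))
    shuffled <- gaps[sample.int(length(gaps))]
    out <- head(cumsum(shuffled), -1)
  } else if (method == 2) {
    # Method 2 (assumed): draw the same number of times uniformly.
    out <- sort(runif(length(times), 0, end))
  } else if (method == 3) {
    # Method 3 (assumed): shift every time by one random offset and
    # wrap times that pass the study end back to the start.
    shift <- runif(1, 0, end)
    out <- sort((times + shift) %% end)
  } else {
    stop("method must be 1, 2, or 3")
  }
  out
}

set.seed(1)
obs  <- c(10, 50, 400, 900)   # made-up event times, in seconds
sim1 <- null_times(obs, end = 1000, method = 1)
sim2 <- null_times(obs, end = 1000, method = 2)
sim3 <- null_times(obs, end = 1000, method = 3)
```

Under this reading, methods 1 and 3 keep the observed spacing between events (and so any 
clustering), while method 2 discards it.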

Details:
SI (a sensitivity) is what percent of the apneas were within 'interval' seconds after a
reflux (i.e. what percent of apneas were correctly predicted by reflux), and 
SSI (a positive-predictive value) is what percent of the refluxes were 'interval' seconds 
before an apnea (i.e., what percent of refluxes correctly predicted apnea).
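
The two definitions above can be sketched for a single association window as follows.  The 
helper si_ssi is hypothetical and not part of simp.R; treating the window as inclusive at both 
ends (so that simultaneous events count as associated) is an assumption.

```r
# Hypothetical helper, NOT part of simp.R: SI and SSI (as percentages)
# for one association window of 'interval' seconds.
si_ssi <- function(reflux, apnea, interval) {
  # For each apnea: did some reflux occur at most 'interval' seconds before it?
  apnea_hit <- vapply(apnea, function(a) {
    any(reflux <= a & a - reflux <= interval)
  }, logical(1))
  # For each reflux: did some apnea occur at most 'interval' seconds after it?
  reflux_hit <- vapply(reflux, function(r) {
    any(apnea >= r & apnea - r <= interval)
  }, logical(1))
  c(SI = 100 * mean(apnea_hit), SSI = 100 * mean(reflux_hit))
}

reflux <- c(100, 500, 900)        # made-up reflux times, in seconds
apnea  <- c(110, 300, 905, 2000)  # made-up apnea times, in seconds
res <- si_ssi(reflux, apnea, interval = 30)
```

In this toy data, two of the four apneas follow a reflux within 30 seconds (SI = 50), and two 
of the three refluxes are followed by an apnea within 30 seconds (SSI of about 66.7).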

There is a choice to make about how close in time two events must be in order to be treated 
as a single event.  If gap.symptom = interval (the association window), then no two symptoms 
can be associated with the same reflux, and if gap.rf = interval, then no two refluxes can be 
associated with the same symptom.  If a gap is set to 0 (for example gap.symptom=0), then two 
symptoms that happen one right after the other, even one second apart, are counted as two 
separate symptom events, each associated or not associated with a reflux.  This is probably a 
problem for method 2's assumption of uniformly distributed event times if the events actually 
tend to cluster in time.  Setting a gap greater than the association window is probably not 
justified, and there is probably also no reason to set a gap lower than what can be empirically 
observed in practice.  SI and SSI estimates may change if gap.symptom and gap.rf are changed, 
but the p-values from methods 1 and 3 will probably remain relatively unaffected.
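
The effect of gap.symptom can be illustrated with a minimal sketch.  The helper 
drop_close_symptoms is hypothetical and not part of simp.R.  It assumes that each symptom is 
compared with the immediately preceding symptom in the original data, and that "within" means 
strictly less than the gap, so that a gap of 0 removes nothing.

```r
# Hypothetical helper, NOT part of simp.R: drops symptom events that occur
# less than 'gap' seconds after the previous symptom event, unless a
# reflux event happened in between.
drop_close_symptoms <- function(symptom, reflux, gap) {
  symptom <- sort(symptom)
  if (length(symptom) < 2) return(symptom)
  prev <- head(symptom, -1)   # each symptom's predecessor
  curr <- symptom[-1]         # every symptom except the first
  # Is there a reflux strictly between each pair of consecutive symptoms?
  reflux_between <- vapply(seq_along(curr), function(i) {
    any(reflux > prev[i] & reflux < curr[i])
  }, logical(1))
  too_close <- (curr - prev) < gap & !reflux_between
  c(symptom[1], curr[!too_close])
}

symptom <- c(100, 110, 300, 310)  # made-up symptom times, in seconds
reflux  <- c(305)                 # made-up reflux time, in seconds
kept     <- drop_close_symptoms(symptom, reflux, gap = 30)
kept_all <- drop_close_symptoms(symptom, reflux, gap = 0)
```

Here the symptom at 110 is dropped because it follows the symptom at 100 by 10 seconds with no 
reflux in between, while the symptom at 310 is kept because the reflux at 305 separates it from 
the symptom at 300.  The gap.rf rule would be the mirror image, applied to reflux events.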

Value:
A list with two elements: "resx", a table reporting the SI or SSI (depending on the stat 
argument) and the p-value for each value of the interval argument, and "resn", reporting 
the number of reflux events in the data ("n.rf", reflux events being defined by the cause 
argument) and the number of obstructive apneas in the data ("n.symptom").

See also:
SAP and Ghillebert probability methods

References:
Glen D, Murakami PN, & Nunez J (2012). "Symptom Index p-Value and Symptom Sensitivity 
Index p-Value (SIP and SSIP) to Determine Symptom Association between Apnea and Reflux 
in Premature Infants at Term", Diseases of the Esophagus (forthcoming)

Examples:
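# Simulated data: 200 events of type "pH", "MII", or "OA" at sorted random times (in seconds)
Tab = data.frame(event=c("study start",sample(c("pH","MII","OA"),200,replace=TRUE),"study end"),
                 seconds=c(0,round(sort(runif(200,1,20000))),20001), stringsAsFactors=FALSE)
# SI and p-value for association windows of 15 to 300 seconds, using method 1
out = simp(tab=Tab, numiter=100, interval=seq(15,300,by=15), cause=c("pH","MII"), outcome="OA",
           stat="sensitivity", method=1, gap.symptom=0, gap.rf=0, seed=1234, verbose=TRUE)
out$resx   # SI and p-value for each interval
out$resn   # n.rf and n.symptom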