CH9 ANALYZING RESEARCH QUESTIONS ABOUT SURVIVAL (Outcomes)

Intro to Clinical Statistics
by Timothy Bilash MD
June 2004

based on:
Review of Basic & Clinical Biostatistics by
Beth Dawson, Robert Trapp (2001) CH9

Survival analysis adds a complexity to interpretation of events and rates (TDB)
1. patients do not enter study at same time
2. time is mixed in with event occurrence, and both contribute indistinguishably to a given outcome
  1. time and event are mixed and equated
    1. compound variable (person-years)
    2. change in time-to-event (earlier time) can have same effect as additional occurrence (number)
    3. cannot distinguish statistically with survival analysis
  2. exposure and event are not time distinct (sum of given amounts of both)
    1. event determines amount of exposure, not given exposure leads to event
      1. time dimension feeds event occurrence back into exposure
      2. event can occur prior to or during beginnings of exposure yet be unrelated to it, even though in the exposure group
    2. length of time can be compared, but important to note exposure and event are in different time periods
  3. withdrawal for event mixed with withdrawal from attrition or non-compliance
    1. drop out for event not the same as drop out for compliance
  4. startpoint problem at beginning of period (partial exposure amount to risk)
    1. limits exposure for early event group
  5. endpoint problem at end of period (partial time observation for outcome to occur)
    1. cutoff for inclusion at same given time
    2. limits events for later exposure group
3. survival analysis is different from experiment
  1. for survival must analyze data before all events have occurred, as opposed to after all events to be counted have occurred
  2. events cumulative over time
  3. always have a truncation
  4. alters interpretation of the statistics (conditional probabilities)
4. Survival parameters that affect risk and outcome
  1. exposure to risk
    1. onset delay
    2. length of exposure (integrated exposure)
    3. dose
      1. total
      2. maximum
    4. confounders/modifiers
      1. before
      2. during
      3. after
  2. possibility of outcome
    1. incubation time to event (delay of time to event after exposure has occurred)
    2. time to symptom
    3. time to diagnosis (test+)
    4. threshold for detection
    5. "accuracy" of test (PPV, NPV)
    6. gating
    7. censoring
5. Censored observations
  1. Time of Entry
    1. Simultaneously Censored (entry time simultaneous, like experiment)
      1. [Fig9-1 p 212]
    2. Progressively Censored (entry time not simultaneous)
      1. [Fig 9-2 p212]
SURVIVAL CURVES (Characterizing one Group)
1. Life Table (Actuarial) vs Kaplan-Meier Methods
  1. both calculate proportion surviving thru an interval
    1. Life Table: # of events in a given (fixed) time interval
    2. Kaplan-Meier: 1 event defines each (varying) time interval
  2. different (weighted) averages to give survival proportion
    1. Kaplan-Meier is exact
    2. Life Table is approximation (averages over interval)
  3. equivalent if constant rate of events over each interval and over study time
2. Life Table ( Actuarial Table) Analysis
  1. also called
    Cutler-Ederer Method
  2. different ways to collect life table data
    1. Current life table
      1. cross sectional data
        
        different people at one point in time
      2. used by insurers
    2. Cohort life table
      1. (longitudinal data)
        
        same people over a period of time
      2. most medical studies
    3. not the same statistic
  3. assumptions for Life Table method
    1. intervals fixed
      1. # of events in given time interval
    2. allows mild censoring
    3. assumes events average out to midpoint of intervals
      1. equivalent to a random withdrawal during the period
      2. compares to constant rate of withdrawal [?]
    4. assumes survival in one period not dependent on survival in other periods
    5. time interval duration used in the analysis is somewhat arbitrary but should be selected so that the number of censored observations in any interval is small
  4. uses conditional probability: to be counted in an interval, a patient must not have had the event in any of the periods before that interval
    1. survival function
    2. assumes probability of survival in a given period does not depend on survival in any other period
      1. "probably violated in much medical research but does not appear to cause major concern to biostatisticians"
    3. count patients in the study at the beginning of each interval who are not left by the end of the interval (had an event)
      1. used for the numerator
    4. count patients who were left at the beginning of each interval
      1. used for denominator
      2. some not in the study that long (stopped)
      3. some lost to followup
      4. denominator reduced by half the number of patients withdrawn for other reasons during the period
      5. assumes patients withdraw randomly throughout the interval
        
        on average, patient withdrawals for an interval occur at the midpoint of the interval
        so subtract 1/2 of the number who withdraw during that period instead of all
        less of a concern if time intervals are short
      6. varying event# for fixed time interval
        
        n events for that time interval
    5. Survival Function per intervals = proportion surviving thru the ith interval =
      S(i) = p(i) p(i-1) ... p(1) [D&T p 216]
      1. i= ith time interval ( fixed time in denominator of rate)
      2. p(i)= 1-q(i)= proportion surviving (with no event) for ith time interval
      3. q(i)= d(i)/[n(i)-w(i)/2]= event rate for ith time interval
      4. n(i) = n(i-1) - d(i-1) - w(i-1) = #pts at beginning of interval
        
        affects denominator not numerator
        events and withdrawals in previous interval (i-1) reduce the number in current interval (i)
      5. d(i) = #termination events in the interval
      6. w(i) = #withdrawals in interval (censored for other reasons than event)
    6. uses Greenwood's formula for the standard error SE [D&T p 215]
      1. formula for confidence interval (CI)
        
        assumes mild censoring
        assumes the proportion surviving in an interval is approximately normally distributed
        assumes sufficiently large sample sizes
        assumes a normal distribution for proportion surviving
      2. SE for S(i)= S(i) * Sqrt[Sum( q(k) / (n(k) - d(k) - w(k)/2) )], summed over intervals k = 1 to i
  5. typically as the interval from entry into the study gets longer, the number of patients remaining for the next interval gets smaller.
    1. this means that the uncertainty (standard error) of the proportion surviving gets larger (statistics get worse with time)
    2. 95% confidence intervals get wider (often drawn on the graph as a band on either side of the curve)
  6. considerable bias can occur [p217]
    1. if the intervals are large
    2. if many withdrawals occur
    3. if the withdrawals do not occur midway in the interval
      1. Kaplan-Meier removes this problem
3. Kaplan-Meier Product Limit method [D&T p217]
  1. actuarial-type method, with analysis at each event occurrence (time since entry divided into unequal intervals by each event)
    1. event proportions in group estimated at the variable event moment, rather than at fixed intervals
    2. good for studies even involving small numbers of patients
    3. gives exact survival proportions because it uses exact survival times
    4. fixed event# (1) per varying time
      1. events per interval time= 1
  2. Survival Function per event = proportion surviving thru jth event = S(j)=p(j)p(j-1)...p(1)
    1. j= jth event ( varying time in denominator of rate)
    2. p(j)=1-q(j)= percent survival (with no event) for time interval of the jth event
    3. q(j)= d(j)/n(j)= event rate for jth event interval
    4. d(j)= jth event (1)
    5. n(j)= #pts at jth event
  3. [SE for S(j)] = S(j) * Sqrt[Sum( d(k) / (n(k) * (n(k) - d(k))) )], summed over events k = 1 to j [see D&T p218]
  4. note that withdrawals are ignored
    1. patients who are lost to follow-up and those who drop out before an event time merely drop out of the calculations by no longer being considered.
    2. [ISLT] effects of censoring
      1. if no censored observations occur, the Wilcoxon rank sum test is appropriate for comparing the ranks of survival times
      2. according to D&T, Kaplan-Meier removes the problem of withdrawals not occurring midway in the interval, however:
      3. K-M gives valid rates only if withdrawals and lost-to-follow-ups occur at a constant rate, or as an approximation at a random rate over the time of study (that is withdrawals are uncorrelated)
      4. other problems from many withdrawals would still remain however [?]
        
        no way to account if the patients who have been censored (withdrawals and lost to followup) would be more or less likely to have an event in a given group, if the withdrawals were related to treatment medication or other events related to treatment group or perceptions
        for example, if a patient drops out of an estrogen study because of bleeding (from higher unopposed estrogen levels), and does not have an event (if estrogen lowers the event risk), then the event rate would be artificially increased
        q(j) for later intervals would be inflated because the denominator n(j) no longer contains these patients
        so only removes the withdrawal problem if withdrawals are constant over time (or random), withdrawals are random as to patients who are exposed and not to risk, and withdrawals are random as to risk factors for outcome.
      5. large interval problem would still remain (for low rates) [?]
4. Summarizing Survival Data (within a group) [p225]
  1. Hazard Function or Hazard Rate
    1. (a Rate is within group as opposed to Ratio is between groups)
    2. also called conditional failure rate (events per total time)
    3. used to obtain estimate of Mean Survival Time for survival data
    4. useful for comparing two groups at risk
    5. if assumption of an exponential distribution is reasonable (ie, constant event rate)
    6. allows censored observations (makes corrections to denominator for dropouts and exposure).
    7. often used to characterize Kaplan-Meier Curves and in Cox Proportional Model
    8. H = D / (SumF + SumC)
      1. D is number of deaths
      2. SumF is the sum of event times
      3. SumC is the sum of censored times
      4. assumes exponential distribution for survival curve (constant event rate)
      5. [see pg 225 for formula]
  2. when assumption of an exponential distribution is not appropriate , other forms of the hazard function based on different probability distributions are used [Lee 1992]
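
The Life Table, Kaplan-Meier, and hazard rate formulas in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure from Dawson and Trapp: the function names (`life_table`, `kaplan_meier`, `exponential_hazard`) and the small data sets in the usage note are hypothetical, events are coded 1 and censored observations 0, and a patient censored at the same time as an event is assumed to be still at risk for that event.

```python
from math import sqrt


def life_table(n_start, deaths, withdrawals):
    """Actuarial estimate for fixed intervals.

    deaths[i] and withdrawals[i] are the counts d(i) and w(i) for each interval.
    Returns one (n, q, S, SE) tuple per interval.
    """
    n, surv, var_sum, rows = n_start, 1.0, 0.0, []
    for d, w in zip(deaths, withdrawals):
        n_eff = n - w / 2.0             # withdrawals count for half the interval
        q = d / n_eff                   # q(i) = d(i) / [n(i) - w(i)/2]
        surv *= 1.0 - q                 # S(i) = p(i) p(i-1) ... p(1)
        if n_eff > d:                   # skip the term if nobody survives
            var_sum += q / (n_eff - d)  # Greenwood term q / (n - d - w/2)
        rows.append((n, q, surv, surv * sqrt(var_sum)))
        n = n - d - w                   # patients entering the next interval
    return rows


def kaplan_meier(times, events):
    """Product-limit estimate; returns (time, n at risk, S, SE) at each event time."""
    data = sorted(zip(times, events))
    surv, var_sum, curve = 1.0, 0.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        at_risk = len(data) - i                        # n(j): still followed at t
        same_t = [e for tt, e in data[i:] if tt == t]  # everyone leaving at t
        d = sum(same_t)                                # d(j): events at t
        if d > 0:                                      # censored-only times add no step
            surv *= 1.0 - d / at_risk                  # p(j) = 1 - d(j)/n(j)
            if at_risk > d:
                var_sum += d / (at_risk * (at_risk - d))
            curve.append((t, at_risk, surv, surv * sqrt(var_sum)))
        i += len(same_t)
    return curve


def exponential_hazard(times, events):
    """H = D / (SumF + SumC), assuming a constant event rate."""
    return sum(events) / sum(times)
```

For example, `kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])` steps the survival proportion down at times 2, 3, and 5 only; the censored patients at times 3 and 8 shrink the number at risk without producing a step. With the same data `exponential_hazard` gives 3 events over 21 units of follow-up, and the reciprocal of that rate is the mean survival time under the exponential assumption.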
MEASURES OF SIGNIFICANCE (Comparing groups)
1. "Little information is available to guide investigators in deciding which procedure is appropriate in which situation." [p224]
  1. in addition, a reader sometimes cannot determine which procedure was used to compare survival distributions
  2. multiple names
  3. research on biostatistical methods for analyzing survival data is still underway
2. Independent groups t-test is not appropriate for comparing survival curves directly because survival times are not normally distributed and tend to be positively skewed (p220)
3. Comparing Survival Data (between groups - Uncensored data)
  1. Wilcoxon rank sum test
    1. assumes constant rates
    2. compares the ranks of survival time
    3. if no censored observations occur, appropriate for comparing the ranks of survival times [D&T p221]
4. Comparing Survival Data (between groups - Censored or Uncensored data)
  1. Tests for significance to compare survival curves with censored observations
    1. presence of censored observations requires special methods for comparing two or more survival distributions
    2. conclusions that result are approximations and can be calculated in different ways (also have name confusions for techniques)
    3. Hazard Ratio
    4. Logrank statistic
      1. most commonly reported
      2. Logrank compares the observed and expected numbers of events in each group (group sums over all periods)
    5. Mantel-Haenszel Chi-Square statistic
      1. can be applied to any set of 2x2 tables
      2. Mantel-Haenszel combines the series of 2x2 tables to estimate an odds ratio
    6. independent-groups t-test is not appropriate for Kaplan-Meier because survival time (denominator) is not normally distributed, and tends to be positively skewed. [D&T p220]
  2. Constant Hazard Ratio of Hazard Rates between groups,
    Constant (or Proportional) Hazard Rate in each group
    1. Hazard Ratio for comparing two groups [p221]
      1. ratio of proportions (proportion of outcomes in at risk group to proportion of outcomes in not at risk group for bi-valued risk)
        
        HR = (O_1 / E_1) / (O_2 / E_2)
        
        O_i is the observed number of events in group i
        E_i is the expected number of events in group i
        
        ratio of rates
      2. constant Hazard Ratio of Hazard Rates between groups
        
        assumes constant Hazard Rates of events in each group throughout the time of study
        or could also have non-constant but proportional Hazard rates to make the ratio constant (Cox)
      3. allows censored observations
      4. interpreted as odds ratio between groups if bi-valued (binary) outcome
      5. HR can be calculated from logrank statistics [p221]
      6. used in Cox Proportional Model
    2. Logrank Statistic
      1. also called
        Mantel logrank statistic
        Cox-Mantel logrank statistic (more general use than just survival curves)
      2. assumes constant hazard ratio between groups throughout time of study
        
        rate in each group may vary , but rates stay in constant ratio
        ie, the non-constant rates are at least proportional
      3. for each interval, the number of events observed in each group is compared to the number expected in that group; the expected number is the total number of events in the interval, from all groups combined, divided among the groups in proportion to the number at risk in each group (as if group membership did not matter); these are used to calculate a Chi-square test for significance
      4. at fixed intervals (or computer programs can determine at each event instead)
        
        determine the number at risk in that jth interval
        
        remove those not in study at start of that time interval from the number at risk
        
        died in previous interval
        censored because of event/outcome in previous interval
        censored because of attrition
        
        calculate the number of observed events O_ij in the jth interval, for failures/events/deaths in each ith group
        calculate the number of expected events E_ij in the jth interval, for failures/events/deaths in each ith group
        
        divides up the expected events for the interval by the proportion in each group (as if by chance)
        E_ij = (total # of events/deaths in all groups in the jth interval) * (proportion of patients at risk who are in the ith group in the jth interval)
        
        total the numbers of observed events O_ij and expected events E_ij over all j intervals for each i group to get O_i, E_i
        
        Oi, Ei are sums over all periods (or events) for each group
      5. sums these O_i , E_i totals in an approximate Chi-square test for significance (if distributions are not the same as expected)
        
        Chi-square statistic for events over all intervals
        X² = SUM [(O_i - E_i)² / E_i] (over all groups)
        
        approximates Chi-square distribution with N-1 degrees of freedom (N is # of groups)
        O_i is sum of the observed events over all j intervals for the i group
        E_i is sum of the expected events over all j intervals for the i group
        X² larger than the critical value for N-1 degrees of freedom (about 3.84 at the 0.05 level for 2 groups) indicates distributions are not the same and there is a difference between groups
        alternate rewrite X² = SUM [E_i(1 - O_i /E_i)²]
        
        expected value times the square of fractional risk ratio subtracted from 1
        
        for bi-valued outcome (yes/no, 1 degree of freedom):
        X²_yes/no = [O_yes - E_yes]² / E_yes + [O_no - E_no]² / E_no
        
        X² > ~7 indicates observed is different from expected at the 0.01 level (1%)
      6. allows censored observations
      7. better with exponential distributions [p224]
      8. unweighted
      9. Peto logrank test
        
        weighted logrank test
        weighted by number of patients at risk
        gives more weight to early events when the number of patients at risk is large (more patients happen to be at risk at the beginning of the study period, and since time and events are mixed in the survival function they cannot be separated)
    3. Mantel-Haenszel test (chi-square statistic) [D&T p223]
      1. non-constant hazard rates OK but assumes a constant hazard ratio
      2. sometimes called logrank test but not the same
      3. test for homogeneity of a relationship between 2 factors across changes in a third factor; is the relationship the same across levels of the third factor
      4. time intervals determined by events
        
        calculate the observed and expected numbers each time an event occurs
        that is, compare percentage in each group caused by that event over time to event interval
      5. approximate test (compares to expected based on average for the period, a pooled odds ratio) [p222]
      6. unweighted
      7. similar results to logrank tests
      8. can be used to compare any distributions [p223]
      9. MH Chi Square =
        [SUM(observed numbers) - SUM(expected numbers)]² / [SUM(variances)]
        with each SUM taken over every time period ª
      10. Would be subject to small number (frequency) restrictions of chi-square analysis?
    4. Cox Proportional Hazard Model
      1. assumes the hazard ratio between groups for risk of an event is constant throughout the study (p214 Dawson and Trapp)
        
        constant hazard ratios ( rates are proportional )
      2. allows censored observations
      3. uses the Hazard Function to evaluate the length of time to event
      4. entry point problem (start point)
        
        divides periods into intervals
        mismatch endpoint problem for time until event if less than a year, and the analysis being done in yearly segments (like compounding interest yearly instead of daily)
        all patients with an event treated as if event happened at one year mark, whether in first month or last day of period
        for example, a patient dying on day 357 after entering is not counted for a tally of one-year survival with Cox Model since did not make it to the second year
        Kaplan-Meier product limit method gives credit for time up to actual event rather than within interval which corrects this
      5. see Chapter 10
  3. Non-constant (arbitrary) Hazard ratio between groups of Hazard Rates for each group
    1. Methods that are difficult to quantify statistically
      1. Person-years [p214 D&T]
        
        patient contributes to the average however long they are in the study
        
        event marks one in numerator with time to event in denominator
        number of events divided by average time to event of all patients
        
        used to compare numbers for a period with a different time period or from another study
        no statistical methods available to compare these numbers however
        mixes time and number
        
        same number is obtained by observing 1000 patients for 1 year as observing 10 patients for 100 years
        
        assumes chance of an event is constant throughout the study
        
        risk of exposure
      2. 1-Year, 5-Year Survival(Mortality) Rates
        
        endpoint problem for patients that don't stay in group for that total time period
        lose partial participants (have to be in study at least that long)
        Life Table and Kaplan-Meier product limit methods give credit for the amount of time subjects survive up to the time when data are analyzed.
    2. Generalized Wilcoxon test
      1. non-constant hazard ratios OK
      2. other names for this test
        
        Generalized Kruskal-Wallis test
        Gehan test
        Breslow test
      3. extension of Wilcoxon rank sum test allowing for censored data
      4. weights earlier events more
  4. Intention-to-Treat Principle [p228]
    1. The results for each patient who entered the trial are included in the analysis of the group to which the patient was randomized, regardless of any subsequent events.
    2. dropouts
      1. possible that the patients who dropped out of the treatment group had some characteristics that, independent of treatment, could affect the outcome
    3. crossovers
      1. patients cross over from one treatment group to the other (or have poor compliance if comparing to placebo group)
      2. do not know why cross-overs occur
    4. Comparison of techniques
      1. analyze the patients by group they were randomized to (intention-to-treat)
      2. analyze the patients by group they ended up in at end of study
      3. eliminate all crossovers
      4. the last two approaches (by final group, or eliminating crossovers) are both potentially biased
      5. advocates of evidence-based medicine recommend intention-to-treat
      6. [ISLT] these issues are more problematic than is currently addressed
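
The logrank steps outlined earlier in these notes (count those at risk at each event time, split the events among groups as if group membership did not matter, then total observed and expected events into a chi-square) can be sketched for two groups as follows. This is an illustrative sketch only: the function name `logrank_two_groups` and the data in the usage note are hypothetical, events are coded 1 and censored observations 0, intervals are taken at each event time, and the statistic is the approximate SUM[(O - E)² / E] form given in the notes rather than the variance-based version that most software reports.

```python
def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Approximate logrank chi-square (1 degree of freedom) and hazard ratio.

    Returns (observed, expected, chi_square, hazard_ratio), where observed and
    expected are two-element lists for groups A and B.
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})

    observed = [sum(events_a), sum(events_b)]   # O_i: total events per group
    expected = [0.0, 0.0]                       # E_i: built up over event times
    for t in event_times:
        # patients still in the study (no earlier event or censoring)
        risk_a = sum(1 for x in times_a if x >= t)
        risk_b = sum(1 for x in times_b if x >= t)
        # total events at this time in both groups combined
        d_total = sum(1 for x, e in zip(all_times, all_events) if x == t and e == 1)
        # divide the events by the proportion at risk in each group
        expected[0] += d_total * risk_a / (risk_a + risk_b)
        expected[1] += d_total * risk_b / (risk_a + risk_b)

    # assumes both expected totals are positive (each group at risk at some event)
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    hazard_ratio = (observed[0] / expected[0]) / (observed[1] / expected[1])
    return observed, expected, chi_square, hazard_ratio
```

As a usage example, `logrank_two_groups([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 0])` compares a group whose three patients all have early events with a group that has two later events and one censored patient. The expected totals always add up to the same number as the observed totals, which is a quick check on the bookkeeping. The resulting chi-square is compared with the critical value for 1 degree of freedom.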

ªcorrection 8.27.2004