(5.25.2004/08.28.2012)

Statistics of WHI Hazard Ratios - What They Really Mean (IV)
Women's Health Initiative
Timothy Bilash MD MS OBGYN
May 25, 2004 (additional 2012 results added**)

  1. The results of the prematurely halted estrogen-only arm of the WHI are in. This outline combines the published results of that report (2004 E) with the Preliminary (2002 E+P) and Final (2003 E+P) estrogen-progestin reports.

      1. Two different studies were done, both using daily oral dosing (with multiple update publications through 2012**)
        1. July 2002 (E+P) WHI Estrogen plus Progestin, Preliminary (Final Revised results follow)
          (Prempro: conjugated equine estrogen 0.625 mg / medroxyprogesterone acetate 2.5 mg)
        2. April 2004 (E) WHI Estrogen only
          (Premarin: conjugated equine estrogen 0.625 mg)
      2. Both studies were stopped early

  2. WHI reported Preliminary Hazard Ratios (HR) in 2002

    1. The Hazard Ratio is the ratio of the Hazard Rates for the two groups, that is
      [ outcome rate with hormone, divided by outcome rate without hormone ]
      1. the hazard rate is the number of events within a group divided by the total number within that group, over a given time period (strictly, events per unit of person-time at risk)
      2. the hazard ratio is the ratio of the hazard rate for group1 to that for group2 (it's a pure number), over the same time period, and with the same risks except for hormone
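As a minimal sketch of the two definitions above, the following computes a crude rate for each group and then their ratio. The counts and the function names `hazard_rate` and `hazard_ratio` are made up for illustration and are not WHI data; a published HR normally comes from a survival model fitted to person-time, not from this simple division.

```python
def hazard_rate(events, group_size):
    """Crude rate: events in the group divided by the group size, over one fixed period."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return events / group_size


def hazard_ratio(events_hormone, n_hormone, events_placebo, n_placebo):
    """Rate with hormone divided by rate without hormone (a pure number)."""
    placebo_rate = hazard_rate(events_placebo, n_placebo)
    if placebo_rate == 0:
        raise ValueError("placebo rate is zero; the ratio is undefined")
    return hazard_rate(events_hormone, n_hormone) / placebo_rate


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
hr = hazard_ratio(events_hormone=30, n_hormone=1000, events_placebo=20, n_placebo=1000)
print(f"HR = {hr:.2f}")  # prints "HR = 1.50"
```

Both groups must be followed over the same time period for the ratio to be meaningful, which is why the sketch takes counts for a single shared period.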

    2. Hazard Ratio Significance
      1. Hazard Ratio = 1.0 is equal risk rates [ (YELLOW) ] (i.e., any differences are likely due to chance)
      2. HR >1.0 is increased risk rate in hormone group [ (RED) ]
      3. HR <1.0 is decreased risk rate in hormone group [ (GREEN) ]
      4. [Author Note]
        -The interpretation of a HR near 1.0 requires an estimate of the Error Bars for that HR.
        -At a 5% level of significance for measures used in the ratio, a HR in the range
        [HR=0.8 to HR=1.2] is equivalent to a HR=1.0 (differences are still likely due to chance).
        -Many statisticians prefer that the HR fall outside the range [HR=0.5 to HR=2.0],
        or even [HR=0.3 to HR=3.0], before calling it a meaningful statistical difference.

    3. Confidence Interval (CI) is the range of HRs likely to contain the mean HR
      1. it is more accurate to say "the mean Hazard Ratio has a 95% chance of being somewhere in the CI" than to say "there is a 95% chance that the HR for a given patient has the stated value". (**These cannot be distinguished in general, and for nice (gaussian) distributions they are equivalent statements.)
      2. CI contains 1.0: equal risk rates [ (YELLOW) ] (even if the Hazard Ratio itself differs from 1.0, the difference is likely due to chance)
      3. CI entirely above 1.0 (excluding 1.0): increased risk rate in hormone group [ (RED) ]
      4. CI entirely below 1.0 (excluding 1.0): decreased risk rate in hormone group [ (GREEN) ]
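The three-color rule for confidence intervals can be sketched as a small function. The name `classify_ci` is hypothetical; the two intervals in the usage lines are the ones quoted in the Summary of this outline.

```python
def classify_ci(lower, upper):
    """Map a hazard-ratio confidence interval to the outline's color code."""
    if not 0 < lower <= upper:
        raise ValueError("expected 0 < lower <= upper")
    if lower > 1.0:
        return "RED"      # whole interval above 1.0: increased risk rate with hormone
    if upper < 1.0:
        return "GREEN"    # whole interval below 1.0: decreased risk rate with hormone
    return "YELLOW"       # interval contains 1.0: difference may be due to chance


print(classify_ci(0.59, 1.01))  # E-only breast cancer, HR = 0.77 -> YELLOW
print(classify_ci(1.03, 3.63))  # DVT interval from the Summary -> RED
```

Note that the rule looks only at the interval, never at the point estimate: an HR of 0.77 still codes YELLOW because its interval reaches 1.01.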

    4. The large number of dropouts and crossovers reduces the accuracy of the outcome results, and mixes some exposed patients into the non-exposed group (and the reverse) after randomization was completed.

    5. Adjustment for multiple testing and multiple outcomes
      1. (E+P) arm (2002, 2003) reported both unadjusted (uncorrected) and adjusted (corrected) risk ratios
        (for statistical testing over multiple times and over multiple outcomes)
      2. (E) arm (2004) reported only unadjusted (uncorrected) risk ratios
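To show why an adjustment for multiple outcomes widens a confidence interval, here is a minimal sketch using a Bonferroni correction on the log scale. Bonferroni is used only as a familiar illustration and is not claimed to be the exact procedure in the WHI reports; the function name `widen_ci` and the count of 7 comparisons are assumptions.

```python
from math import exp, log
from statistics import NormalDist


def widen_ci(lower, upper, n_comparisons, level=0.95):
    """Widen a nominal CI for a hazard ratio with a Bonferroni correction.

    Works on the log scale, where the interval is roughly symmetric
    around log(HR). Returns the adjusted (lower, upper) bounds.
    """
    alpha = 1.0 - level
    z_nominal = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_adjusted = NormalDist().inv_cdf(1.0 - alpha / (2.0 * n_comparisons))
    center = (log(lower) + log(upper)) / 2.0              # estimate of log(HR)
    std_err = (log(upper) - log(lower)) / (2.0 * z_nominal)
    half_width = z_adjusted * std_err
    return exp(center - half_width), exp(center + half_width)


# Nominal interval quoted in this outline for E-only breast cancer (HR = 0.77).
# n_comparisons=7 is a hypothetical count, not the number used by WHI.
low, high = widen_ci(0.59, 1.01, n_comparisons=7)
print(f"adjusted CI = [{low:.2f}, {high:.2f}]")
```

With `n_comparisons=1` the function returns the nominal interval unchanged; any larger count pushes both bounds outward, so an interval that barely excludes 1.0 before adjustment may include it afterward.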

  3. WHI Hazard Ratios (see Table1, Table2)
    SubGroups Risks for CHD, Stroke, DVT, PE, BreastCA, ColonCA, HipFx, Global Index

    (E+P Unadjusted)
    (note final risks for CHD/BreastCA flip for unadjusted)
                                preliminary unadjusted result             final unadjusted result
                                (E+P HR) 2002                             (E+P HR) 2003
    higher risk than Placebo    CHD, Stroke, DVT, PE, Global Index        BreastCA, Stroke, DVT, PE, Global Index
    same risk as Placebo (=)    BreastCA                                  CHD
    lower risk than Placebo     ColonCA, HipFx                            ColonCA, HipFx

    (E+P Adjusted)
    (after adjustment for multiple statistical testing and multiple outcome effects)
    (note preliminary and final results are the same for adjusted)
                                preliminary adjusted                      final adjusted
                                (E+P HR) 2002                             (E+P HR) 2003
    higher risk than Placebo    DVT                                       DVT
    same risk as Placebo (=)    CHD, BreastCA, Stroke, PE,                CHD, BreastCA, Stroke, PE,
                                ColonCA, HipFx, Global Index              ColonCA, HipFx, Global Index
    lower risk than Placebo     (none)                                    (none)

    (E-only Unadjusted)
                                (E HR) 2004
    higher risk than Placebo    DVT, Stroke
    same risk as Placebo (=)    CHD, PE, Global Index, ColonCA, (**BreastCA 2002 - 5.6 year)
    lower risk than Placebo     HipFx, **BreastCA (2012 - 12 yr followup)
    (*DVT and adjusted results were not reported for E-only trial
    **12 year followup of Premarin only 2012 changed results for BreastCA 2002)

  4. Preliminary (Table1: 2002/E+P, 2004/E-Only)
    WHI SubGroup: Hazard Ratio Confidence Interval Ranges

  5. The (E+P) results were revised after Final collection and review of all patient data.
    In particular, the final vs. preliminary CHD and Invasive Breast Cancer results flipped.

    Final Adjudicated Table2: (2003/E+P, 2004/E-only)
    WHI SubGroup: Hazard Ratio Confidence Interval Ranges
    (shown with Mean HR to left)

  1. Inaccuracies in outcomes (disagreement in diagnosis)
    1. Note that there was significant disagreement for some diagnoses (outcomes) between the hospitals and the review center
    2. This would affect the CHD, Stroke, DVT, PE Hazard Rates, decreasing the statistical significance of the Hazard Ratio (HR) differences for these outcomes
    3. See last column of Table1
    4. E+P preliminary HR are close to final HR
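One way diagnostic disagreement can blur a comparison is sketched below: if the same imperfect diagnosis is applied to both groups, the recorded ratio is pulled toward 1.0. The risks, sensitivity and specificity are hypothetical values chosen for illustration, and the assumption that errors are equal in both groups is mine, not a finding of the WHI reports.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Risk recorded when diagnoses are imperfect: true positives plus false positives."""
    return sensitivity * true_risk + (1.0 - specificity) * (1.0 - true_risk)


def observed_ratio(risk_hormone, risk_placebo, sensitivity, specificity):
    """Ratio of recorded risks when both groups share the same diagnostic error."""
    return (observed_risk(risk_hormone, sensitivity, specificity)
            / observed_risk(risk_placebo, sensitivity, specificity))


# Hypothetical true risks (ratio 2.0) and hypothetical error rates.
true_ratio = 0.02 / 0.01
diluted = observed_ratio(0.02, 0.01, sensitivity=0.90, specificity=0.99)
print(f"true ratio = {true_ratio:.2f}, recorded ratio = {diluted:.2f}")
```

False positives add roughly the same number of recorded events to each group, so they make up a larger share of the smaller rate and shrink the ratio.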

  2. WHI has low power. No adjusted results were reported for estrogen-only arm (2004)

    1. Presumably an adjustment of the estrogen-only data for multiple statistical testing would widen the Confidence Intervals for Stroke and Hip Fx (comparable to the E+P arm adjustments). This would indicate no statistical significance for ANY of these outcomes with adjusted data, E or E+P.

    2. That is, the power of the study was insufficient to determine a difference between the groups. There would be no demonstrated difference between the hormone and placebo subgroups based on this adjusted analysis
      (but see the re-analysis of the E+P Mortality results by DrTim).
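To give a feel for what "insufficient power" means, the sketch below uses Schoenfeld's standard approximation for the total number of events a two-group survival comparison needs in order to detect a given hazard ratio. It assumes equal-sized groups and a two-sided test; the target HRs are hypothetical, and this is not the WHI design calculation.

```python
from math import ceil, log
from statistics import NormalDist


def events_needed(hazard_ratio, alpha=0.05, power=0.80):
    """Approximate total events needed to detect `hazard_ratio` (Schoenfeld, 1:1 allocation)."""
    if hazard_ratio <= 0 or hazard_ratio == 1.0:
        raise ValueError("hazard_ratio must be positive and different from 1.0")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    return ceil(4.0 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2)


# Hypothetical target effects: the closer the HR is to 1.0, the more events are needed.
for hr in (2.0, 1.5, 1.2):
    print(hr, events_needed(hr))
```

The required count grows with the inverse square of log(HR), so rare outcomes with modest hazard ratios need far more events than a trial stopped early is likely to accumulate.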

  3. Surrogate Exposure problem
    1. It is assumed in this survival analysis that patients who are randomized to a hormone group have the same exposure to the hormone: the same amount (dose), over the same time course (constancy, frequency), with no missed or extra doses (compliance), and with no other hormone received (external exposure). The study reports clearly state that this is not the case. Many patients had their hormones interrupted, were non-compliant, crossed over between medication groups, dropped out, or were exposed to external and unquantified hormones.
    2. In fact the crossovers and dropouts were far greater than had been allowed for in the study design.
    3. It would have been very useful (although expensive) to compare blood levels of estrogen and progesterone for patients in each group, as not every patient had perfect exposure/non-exposure to the risk.
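The effect of imperfect exposure on a comparison of the groups as randomized can be sketched with simple mixing arithmetic. All rates and crossover fractions below are hypothetical, and the sketch assumes a patient's risk follows actual exposure immediately, which is a simplification.

```python
def intention_to_treat_ratio(rate_exposed, rate_unexposed, drop_out, drop_in):
    """Ratio of group rates when some assigned patients end up with the other exposure.

    drop_out: fraction of the hormone group that stops taking hormone.
    drop_in:  fraction of the placebo group that starts taking hormone.
    """
    hormone_group = (1.0 - drop_out) * rate_exposed + drop_out * rate_unexposed
    placebo_group = (1.0 - drop_in) * rate_unexposed + drop_in * rate_exposed
    return hormone_group / placebo_group


# Hypothetical rates (true ratio 2.0) and hypothetical crossover fractions.
full = intention_to_treat_ratio(0.02, 0.01, drop_out=0.0, drop_in=0.0)
mixed = intention_to_treat_ratio(0.02, 0.01, drop_out=0.40, drop_in=0.10)
print(f"perfect exposure: {full:.2f}, with crossovers: {mixed:.2f}")
```

Each group becomes a blend of exposed and unexposed patients, so the ratio between the groups as assigned understates the ratio between actual exposure and non-exposure.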

  4. Summary

    1. Even though the WHI Subgroup Hazard Ratios differ from 1.0, they are not statistically significant when the confidence intervals include 1.0.
    2. For instance, although a possible reduction in breast cancer risk with estrogen is reported in the estrogen-only arm (HR= 0.77), this is not statistically significant (CI = [0.59-1.01]). Also, no adjusted confidence intervals for estrogen-only were published.
    3. DVT is the exception among these outcomes, but note that accounting for the 0.11 disagreement in diagnosis moves the CI to [1.03 to 3.63], whose lower bound is not far from 1.0.
    4. Interpretation of the analysis is problematic because of the many shortcomings in these studies. Other aspects of the WHI study remain to be clarified. Numerous problems have already been discussed in publications and on this site which severely hamper the interpretation of the data and the ability to draw meaningful conclusions. See the very clear discussions in reference 3, which support the views expressed here.

  5. References

    1. Anderson GL et al, JAMA 2004 Apr 14;291(14):1701-12, Effects of conjugated equine estrogen in postmenopausal women with hysterectomy: the Women's Health Initiative randomized controlled trial
    2. Writing Group for the Women's Health Initiative (WHI) Investigators, JAMA 2002 Jul 17;288(3):321-33, Risks and Benefits of Estrogen Plus Progestin in Healthy Postmenopausal Women: Principal Results from the Women's Health Initiative (WHI) Randomized Controlled Trial
    3. Patricia Kelly, PhD, Recent HRT Studies: The Findings in Perspective (9.2002/2006), WHI Study: One Year Later (2006), San Francisco Medical Society Website
