Intro to Clinical Statistics
by Timothy Bilash MD
July 2005
based on:
a review of Basic & Clinical Biostatistics, Beth Dawson & Robert Trapp (2001)
from Ch 6 and Ch 10
I. TWO SEPARATE OR INDEPENDENT GROUPS (from Chapter 6)
 Levene Test (p141)

Levene Test is a test for the equality of variances (a check done before the two-sample t-test on means)
 uses absolute deviations from the mean
 tests the hypothesis that the average of the absolute values of the observation deviations from the mean is the same in each group
 a two-sample t-test is done using the absolute values of the deviations (the absolute distance each observation is from the mean in that group) rather than the raw deviations
 if the t-test on these deviations is significant (p < .05), the hypothesis of equal variances is rejected and the t-test on the difference of means is not appropriate

if the Levene Test is significant, the deviations from the mean in one group on average exceed those in the other
 valid for normal and non-normal population distributions
 a modified Levene Test replaces the mean with the median
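The procedure above can be sketched in a few lines. This is my own illustration, not code from Dawson & Trapp: each observation is replaced by its absolute deviation from the group center, then an ordinary pooled two-sample t statistic is computed on those deviations. Roughly, |t| > 2 at the 5% level suggests unequal variability, so the pooled t-test on the means would not be appropriate. Passing `center=median` gives the modified Levene test.

```python
from statistics import mean, median
import math

def levene_t(a, b, center=mean):
    # absolute deviations of each observation from its group's center
    da = [abs(x - center(a)) for x in a]
    db = [abs(x - center(b)) for x in b]
    na, nb = len(da), len(db)
    ma, mb = mean(da), mean(db)
    # pooled sample variance of the deviations
    sa2 = sum((x - ma) ** 2 for x in da) / (na - 1)
    sb2 = sum((x - mb) ** 2 for x in db) / (nb - 1)
    sp2 = ((na - 1) * sa2 + (nb - 1) * sb2) / (na + nb - 2)
    # two-sample t statistic on the absolute deviations
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))
```

For example, `levene_t([1, 2, 3, 4, 5], [3, 3, 3, 3, 3])` is large because one group has no spread at all, while two groups with identical spread give a t near zero.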

[ANOTE] Statistical Significance
 statistical significance is determined by the deviations of the observed values from some expected value, not by the observed values themselves. Whether a test is done for equality of the expected value, or for equality of some function of the expected value, we want the deviations from the expected value to be minimized.

deviation = y - f(y)

mean: f(y)=ybar

least squares: f(y)=a+bx
 identical population distributions may be more important than equal population variances for the validity of the t-test
 Welch Test (p141)
 another test for comparing means in two independent groups (does not assume equal variances)
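A stdlib sketch of the Welch statistic follows (an assumed implementation for illustration; in practice `scipy.stats.ttest_ind(a, b, equal_var=False)` computes it). The t statistic uses each group's own variance rather than a pooled one, and the degrees of freedom are adjusted by the Welch-Satterthwaite formula.

```python
from statistics import mean, variance
import math

def welch_t(a, b):
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)   # sample variances, kept separate
    se2 = va / na + vb / nb             # squared standard error of the difference
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximate degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df
```

When the two samples happen to be identical the statistic is zero and the degrees of freedom reduce to the usual na + nb - 2.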
II. STATISTICAL METHODS FOR MULTIPLE VARIABLES (from Chapter 10)
 Predicting Group Outcomes (Nominal = Categorical = Grouped)
 three methods are used for Outcomes measured on a nominal scale (bivalued: present or not-present, yes or no, true or false, + or -)

Logistic Regression (Curve Fitting)
 can be transformed into Odds Ratio
 Controls (adjusts) for confounding variables using analysis of covariance

Discriminant Analysis
 used less now

Loglinear Analysis
 rarely used

Logistic Regression (Curve Fitting)
 fits an exponential (log-linearized) curve to the data, obtaining a regression coefficient (b_i) for each factor
 probability of the outcome given the (N in number) X factors:

P(outcome) = 1 / [1 + exp(-(b_0 + b_1 X_1 + b_2 X_2 + ... + b_N X_N))], where b_0 is the constant term

 Independent Variables are then selected or derived independently from the fit
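The logistic probability formula can be illustrated directly (the coefficient and X values below are made-up numbers, not data from the book):

```python
import math

def logistic_prob(b0, bs, xs):
    # linear predictor: b0 + b1*X1 + ... + bN*XN
    z = b0 + sum(b * x for b, x in zip(bs, xs))
    # the logistic transform maps any z onto a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))
```

With all coefficients zero the predicted probability is 0.5; increasing a factor that has a positive coefficient pushes the probability toward 1, and the result always stays strictly between 0 and 1.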
 Multiple outcomes vs multiple risks
 Logistic Regression
 finds the best fit for multiple Risk Factors
 Multiple Risk Factors are Independent variables that are Numerical or Nominal (Categories), i.e., the X's
 the Single Outcome Variable is bivalued (Logical), i.e., Y; can also be used for Multiple Outcomes if they are Categories
 sometimes also called multiple regression
 Multiple Regression
 finds the best fit for multiple Outcome Data, holding the values of all other variables constant
 sometimes used to mean the combination of Multiple Risk Factors with Multiple Outcome Data
 (note the confusion between these different uses)
 if the independent variables (predictors or risk factors) are bivalued, then the regression coefficients can be interpreted as Odds Ratios
 the odds ratio is a summary statistic
 the odds ratio is the ratio of the odds of the outcome for the two values of a bivalued risk
 contrast a 95% chance in one patient with a 100% chance in 95% of patients
 these are equivalent if the odds ratio is constant
 it is an underlying problem for survival curves to distinguish between these (especially if time affects outcomes or risks)
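Why exponentiating a coefficient gives the odds ratio can be checked numerically: compute the model's outcome probability with a 0/1 risk factor absent and present, and take the ratio of the odds. The intercept and coefficient below are hypothetical, chosen only for illustration.

```python
import math

def prob(z):
    # logistic probability for linear predictor z
    return 1.0 / (1.0 + math.exp(-z))

b0, b = -1.0, 0.693        # assumed intercept and coefficient of a 0/1 risk factor
p0 = prob(b0)              # P(outcome | X = 0)
p1 = prob(b0 + b)          # P(outcome | X = 1)
odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
# algebraically the odds are exp(z), so odds_ratio equals exp(b) regardless of b0
```

Here `odds_ratio` comes out near 2: a coefficient of about 0.693 doubles the odds of the outcome.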
 a Chi-square Test is used to determine the significance of each variable's regression coefficient when the Outcome is Multiple Categories/Nominal (can't use a t or F test)
 a t or F test can be used with single input and single output variables to determine whether each regression coefficient is different from zero (if one binary risk and one binary outcome, see elsewhere)
 the t Distribution can be used to form confidence intervals for each regression coefficient
 if the 95% confidence interval for the odds ratio does not include the value one, then we are 95% confident that the factor associated with the odds ratio is significant
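A sketch of that confidence interval (the coefficient `b` and standard error `se` below are made-up numbers): the interval is built on the log-odds scale as b ± 1.96·se and then exponentiated, and the factor is significant at the 5% level when the resulting interval excludes 1.

```python
import math

b, se = 0.8, 0.3                     # hypothetical coefficient estimate and its standard error
lo = math.exp(b - 1.96 * se)         # lower 95% limit for the odds ratio
hi = math.exp(b + 1.96 * se)         # upper 95% limit for the odds ratio
significant = not (lo <= 1.0 <= hi)  # does the interval exclude an odds ratio of 1?
```

With these numbers the interval is roughly (1.24, 4.01), which excludes 1, so this hypothetical factor would be judged significant.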
 Regression tends to underpredict the probability that a risk factor is present for a given outcome
 some advocate the Kappa Statistic as a more correct percentage (but see R. Wilcox for disagreement)