How to Learn Everything You Ever Wanted to Know About Biostatistics

Daniel W. Byrne
Director of Biostatistics and Study Design
General Clinical Research Center
Vanderbilt University Medical Center

The presenter has no financial interests in the products mentioned in this talk.

Objective of This Workshop
To provide a 1-hour overview of the important practical information that a clinical investigator needs to know about biostatistics to be successful.

I. You Will Need the Right Tools

Install a powerful, yet easy-to-use, statistical software package on your computer.
I recommend SPSS for Windows.
Bring an 1180 for $80 to Karen Montefiori in 143 Hill Student Center (3-1630). She will lend you the SPSS CD for the day and you can install this software easily.

SPSS is the 2nd most popular package. It is much easier to use than SAS and Stata.

Rank  Package     Count
1     SAS         163
2     SPSS        52
3     STATA       48
4     Epi Info    36
5     SUDAAN      22
6     S-PLUS      19
7     Statxact    12
8     BMDP        8
9     Statistica  6
10    Statview    5

Install additional software for statistical “odds and ends”
Instat by GraphPad (graphpad.com)
True Epistat by Epistat Services (true-epistat.com)
- $395 for summary data analysis
- $100 for random number table, etc.
CIA (Confidence Interval Analysis) (bmj.com) for confidence intervals
- $35.95 with book

Install a sample size program.
If you can afford to spend $400, buy nQuery Advisor from Statistical Solutions (www.statsol.com).
If you can afford to spend $0, download PS from the Vanderbilt web site: http://www.mc.vanderbilt.edu/prevmed/ps/index.htm
Both packages are on the CRC’s statistical

II. You Will Need a Plan

Use the scientific method to keep your project focused.
State the problem
Formulate the null hypothesis
Design the study
Collect the data
Interpret the data
Draw conclusions

State the Problem
Among patients hospitalized for a hip fracture who develop pneumonia during their stay in the hospital, the mortality rate is 2.3 times higher at non-trauma centers compared with trauma centers (48.7% vs. 21.1%, P=0.043). It is not clear if, or how, those who will develop pneumonia could be identified on admission.

Formulate the Null Hypothesis
Among patients hospitalized for treatment of a hip fracture, there are no factors known upon admission that are statistically different between those who develop pneumonia during their stay and those who do not.

Why bother with a null hypothesis?
For the same reason that we assume that a person is innocent until proven guilty. The burden of responsibility is on the prosecutor to present enough evidence to convince the members of a jury that the charges are true and to change their minds.
Example: Outcome after treatment with Drug A will not be significantly different from placebo.

Design the Study
Data on 933 patients with a hip fracture from a New York trauma registry will be analyzed. The 58 patients with pneumonia will be compared with the 875 without pneumonia.

The Most Common Type of Flaw

Type of flaw                    Number of Responses
Study design                    20
Interpretation of the findings  4
Importance of the topic         4
Presentation of the results     0

Example of Recall Bias
A control group is asked, “Two weeks ago from today, did you eat X for breakfast?”
Two weeks after their MI, patients are asked, “Did you eat X for breakfast on the day of your heart attack?”
You can prove any food causes an MI using this method (X=bacon, X=Flintstone vitamins, etc.).

John Bailar’s Quote: “Study design and bias are much more important than complex statistical methods.”
Devote more time to improving the study design, and minimizing and measuring bias.
Become an expert at study design issues and biases in your area of research.

What is the statistical power of the study?
Power
Beta
Alpha
Sample size
Ratio of treated to control group
Measure of outcome

Sample Size Table
See Table 9-1 in the handout “Sample Size Requirements for Each of Two Groups”.

Collect the Data
See the handouts for: ITEC Trauma Systems Study

III. You Will Need Data Management Skills

Enter your data with statistical analysis in mind.
For small projects, enter data into Microsoft Excel or directly into SPSS.
For large projects, create a database with Microsoft Access.
Keep variable names in the first row, with <=8 characters and no internal spaces.
Enter as little text as possible and use codes for categories, such as 1=male, 2=female.

Spreadsheet from Hell

Spreadsheet from Heaven

IV. You Will Need to Learn Descriptive Statistics

Descriptive vs. Inferential
Descriptive statistics summarize your group: average age 78.5, 89.3% white.
Inferential statistics use the theory of probability to make inferences about larger populations from your sample: White patients were significantly older than black and Hispanic patients, P<0.001.

Import your data into a statistical program for screening and analysis.

Screen your data thoroughly for errors and inconsistencies before doing ANY analyses.
Check the lowest and highest value for each variable. For example, age 1-777.
Look at histograms to detect typos.
Cross-check variables to detect impossible combinations. For example, pregnant males, survivors discharged to the morgue, patients in the ICU for 25 days with no complications.

Analyze, Descriptive Statistics, Frequencies, select the variable AGE

Statistics: AGE
N Valid         933
N Missing       0
Mean            79.292
Median          81.300
Mode            90.0
Std. Deviation  26.537
Range           763.0
Minimum         14.0
Maximum         777.0

[Histogram of AGE: Frequency by AGE; Std. Dev = 26.54, Mean = 79.3, N = 933.00]

Analyze, Descriptive Statistics, Crosstabs

[SPSS crosstabulation of SURVIVAL by 48-DISPOSITION (counts); totals: 63 expired, 870 survived, 933 overall]

Correct the data in the original database or spreadsheet and import a revised version into the statistical package.
The age of 777 should be checked and changed to the correct age.
Suspicious values, such as an age of 106, should be checked. In this case it is correct.

Interpret the Data

Run descriptive statistics to summarize your data.

SURVIVAL
           Frequency  Percent  Valid Percent  Cumulative Percent
EXPIRED    63         6.8      6.8            6.8
SURVIVED   870        93.2     93.2           100.0
Total      933        100.0    100.0

Statistics: 49-DAYS IN HOSPITAL
N Valid         933
N Missing       0
Mean            23.34
Median          19.00
Mode            20
Std. Deviation  18.03
Range           236
Minimum         1
Maximum         237

[Histogram of 49-DAYS IN HOSPITAL: Frequency by days; Std. Dev = 18.03, Mean = 23.3, N = 933.00]

V. You Will Need to Learn Inferential Statistics

P Value
A P value is an estimate of the probability that results such as yours could have occurred by chance alone if there truly were no difference or association.
P < 0.05: less than a 5% chance (1 in 20).
P < 0.01: less than a 1% chance (1 in 100).
Alpha is the threshold. If P is below this threshold, you consider the result statistically significant.

Basic formula for inferential tests

$$\text{Test Statistic} = \frac{\text{Observed} - \text{Expected}}{\text{Variability}}$$

Based on the total number of observations and the size of the test statistic, one can determine the P value.

How many noise units?

$$\text{Test Statistic} = \frac{\text{Signal}}{\text{Noise}}$$

The test statistic and the sample size (degrees of freedom) convert to a probability, or P value.
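The signal-to-noise idea above can be sketched in a few lines of Python. This is a minimal illustration, not the calculation SPSS performs: it assumes a one-sample z-test with a known standard deviation, and the function names and the numbers in the usage lines are hypothetical.

```python
import math

def z_statistic(observed_mean, expected_mean, sd, n):
    """Signal / noise: (observed - expected) / standard error of the mean."""
    standard_error = sd / math.sqrt(n)  # the "variability" of a sample mean
    return (observed_mean - expected_mean) / standard_error

def two_sided_p(z):
    """Two-sided P value for a z statistic under the standard normal curve."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical numbers, chosen only for illustration.
z = z_statistic(observed_mean=105, expected_mean=100, sd=15, n=36)
p = two_sided_p(z)
print(f"z = {z:.2f}, P = {p:.4f}")  # z = 2.00, P = 0.0455
```

With the same 5-point difference, a larger sample shrinks the standard error, so the test statistic grows and the P value falls.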
Use inferential statistics to test for differences and associations.
There are hundreds of statistical tests. A clinical researcher does not need to know them all.
Learn how to perform the most common tests on SPSS.
Learn how to use the statistical flowchart to determine which test to use.

VI. You Will Need to Understand the Statistical Terminology Required to Select the Proper Inferential Test

Univariate vs. Multivariate
Univariate analysis usually refers to one predictor variable and one outcome variable.
Is gender a predictor of pneumonia?
Multivariate analysis usually refers to more than one predictor variable or more than one outcome variable being evaluated simultaneously.
After adjusting for age, is gender a predictor of pneumonia?

Difference vs. Association
Some tests are designed to assess whether there are statistically significant differences between groups.
Is there a statistically significant difference between the age of patients with and without pneumonia?
Some tests are designed to assess whether there are statistically significant associations between variables.
Is the age of the patient associated with the number of days in the hospital?

Unmatched vs. Matched
Some statistical tests are designed to assess groups that are unmatched or independent.
Is the admission systolic blood pressure different between men and women?
Some statistical tests are designed to assess groups that are matched or data that are paired.
Is the systolic blood pressure different between admission and discharge?

Level of Measurement
Categorical vs. continuous variables
If you take the average of a continuous variable, it has meaning: average age, blood pressure, days in the hospital.
If you take the average of a categorical variable, it has no meaning: average gender, race, smoker.
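A short Python sketch of why averaging category codes is meaningless. The records below are made up for illustration; the 1=male, 2=female coding follows the data-entry advice earlier in the talk.

```python
from collections import Counter
from statistics import mean

# Hypothetical records (illustrative values only).
ages = [82, 79, 91, 68, 85]      # continuous variable
gender_codes = [1, 2, 2, 2, 1]   # categorical variable: 1=male, 2=female

print(mean(ages))          # an average age is interpretable
print(mean(gender_codes))  # 1.6 is not a gender; the number has no meaning

# For a categorical variable, report counts and percentages instead.
labels = {1: "male", 2: "female"}
counts = Counter(labels[code] for code in gender_codes)
percents = {name: 100 * k / len(gender_codes) for name, k in counts.items()}
print(counts, percents)
```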
Level of Measurement
Nominal: categorical (gender, race, hypertensive)
Ordinal: categories that can be ranked (none, light, moderate, heavy smoker)
Interval: continuous (blood pressure, age, days in the hospital)

Horse race example
Nominal: Did this horse come in first place? 0=no, 1=yes
Ordinal: In what position did this horse finish? 1=first, 2=second, 3=third, etc.
Interval (scale): How long did it take for this horse to finish? 60 seconds, etc.

Normal vs. Skewed Distributions
Parametric statistical tests can be used to assess variables that have a “normal” or symmetrical bell-shaped distribution curve for a histogram.
Nonparametric statistical tests can be used to assess variables that are skewed or nonnormal.
Look at a histogram to decide.

Examples of Normal and Skewed
[Histogram of 35-SYSTOLIC BLOOD PRESSURE FIRST ER: Std. Dev = 27.74, Mean = 146.9, N = 925.00]
[Histogram of 44-DAYS IN ICU: Std. Dev = 3.99, Mean = .9, N = 933.00]

VII. You Will Need to Know Which Statistical Test to Use

Flowchart of common inferential statistics
See the handout, Figure 16-1, pages 78-79.

Commonly used statistical methods
1. Chi-square
2. Logistic regression
3. Student's t-test
4. Fisher's exact test
5. Cox proportional-hazards
6. Kaplan-Meier method
7. Wilcoxon rank-sum test
8. Log-rank test
9. Linear regression analysis
10. Mantel-Haenszel method
11. One-way analysis of variance (ANOVA)
12. Mann-Whitney U test
13. Kruskal-Wallis test
14. Repeated-measures analysis of variance
15. Paired t-test
16. Chi-square test for trend
17. Wilcoxon signed-rank test
18. Analysis of variance (two-way)
19. Spearman rank-order correlation

Chi-square
The most commonly used statistical test.
Used to test if two or more percentages are different.
For example, suppose that in a study of 933 patients with a hip fracture, 10% of the men (22/219) develop pneumonia compared with 5% of the women (36/714). What is the probability that this could happen by chance alone?
Univariate, difference, unmatched, nominal, >=2 groups, n>=20.

Chi-square example
[SPSS crosstabulation and Chi-Square Tests output]

Fisher’s Exact Test
This test can be used for 2 by 2 tables when the number of cases is too small to satisfy the assumptions of the chi-square:
Total number of cases is <20, or
The expected number of cases in any cell is <1, or
More than 25% of the cells have expected frequencies <5.

[SPSS output: PNEUMONIA COMPLICATION 480.00-486.99 * CIRRHOSIS OR CHRONIC LIVER 571 crosstabulation and Chi-Square Tests]

How to calculate the expected number in a cell

PNEUMONIA COMPLICATION 480.00-486.99 * CIRRHOSIS OR CHRONIC LIVER 571 Crosstabulation

                                          CIRRHOSIS OR CHRONIC LIVER 571
PNEUMONIA                                 ABSENT    PRESENT   Total
ABSENT    Count                           870       5         875
          Expected Count                  867.5     7.5       875.0
          % within PNEUMONIA COMPLICATION 99.4%     .6%       100.0%
          % within CIRRHOSIS              94.1%     62.5%     93.8%
PRESENT   Count                           55        3         58
          Expected Count                  57.5      .5        58.0
          % within PNEUMONIA COMPLICATION 94.8%     5.2%      100.0%
          % within CIRRHOSIS              5.9%      37.5%     6.2%
Total     Count                           925       8         933
          Expected Count                  925.0     8.0       933.0
          % within PNEUMONIA COMPLICATION 99.1%     .9%       100.0%
          % within CIRRHOSIS              100.0%    100.0%    100.0%

Chi-square for a trend test
Used to assess a nominal variable and an ordinal variable.
Does the pneumonia rate increase with the total number of comorbidities?
Univariate, association, nominal.
Analyze, Descriptive Statistics, Crosstabs.

Chi-Square Tests
                              Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square            43.381a   5    .000
Likelihood Ratio              34.576    5    .000
Linear-by-Linear Association  30.522    1    .000
N of Valid Cases              933
a. 2 cells (16.7%) have expected count less than 5. The minimum expected count is .37.
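As a sketch of how the expected counts in the pneumonia by cirrhosis crosstabulation shown earlier arise: each expected count is the row total times the column total divided by the grand total, and the Pearson chi-square sums (observed - expected)^2 / expected over all cells. The Python below uses the observed counts from that table. The function names are illustrative, and the P value shortcut shown is valid only for 1 degree of freedom (a 2 by 2 table).

```python
import math

def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

# Rows: pneumonia absent, present. Columns: cirrhosis absent, present.
observed = [[870, 5], [55, 3]]
expected = expected_counts(observed)
chi2 = pearson_chi_square(observed)
p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with df = 1
cells_below_5 = sum(e < 5 for row in expected for e in row)

print([[round(e, 1) for e in row] for row in expected])
print(round(chi2, 2), p, cells_below_5)
```

Here one of the four cells has an expected count below 5, and that count (about 0.5) is also below 1, so by the criteria listed for Fisher's exact test this table calls for that test rather than the chi-square.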
PNEUMONIA COMPLICATION 480.00-486.99 * NUMBER OF COMORBIDITES (0-9) Crosstabulation

                             NUMBER OF COMORBIDITES (0-9)
PNEUMONIA                    .00     1.00    2.00    3.00    4.00    5.00    Total
ABSENT   Count               250     292     213     98      19      3       875
         % within NUMBER     98.8%   94.2%   93.0%   86.0%   90.5%   50.0%   93.8%
PRESENT  Count               3       18      16      16      2       3       58
         % within NUMBER     1.2%    5.8%    7.0%    14.0%   9.5%    50.0%   6.2%
Total    Count               253     310     229     114     21      6       933
         % within NUMBER     100.0%  100.0%  100.0%  100.0%  100.0%  100.0%  100.0%

Mantel-Haenszel Method
Used to assess a factor across a number of 2 by 2 tables.
Is the mortality rate associated with pneumonia different between trauma centers and nontrauma centers?
Analyze, Descriptive Statistics, Crosstabs.

Student’s t-test
Used to compare the average (mean) in one group with the average in another group.
Is the average age of patients significantly different between those who developed pneumonia and those who did not?
Univariate, Difference, Unmatched, Interval, Normal, 2 groups.

[SPSS independent-samples t-test output]

Mann-Whitney U test
Same as the Wilcoxon rank-sum test.
Used in place of the Student’s t-test when the data are skewed.
A nonparametric test that uses the rank of the value rather than the actual value.
Univariate, Difference,

Paired t-test
Used to compare the average for measurements made twice within the same person, before vs. after.
Used to compare a treatment group and a matched control group.
For example, did the systolic blood pressure change significantly from the scene of the injury to admission?
Univariate, Difference, Matched, Interval, Normal, 2 groups.

Wilcoxon signed-rank test
Used to compare two skewed continuous variables that are paired or matched.
Nonparametric equivalent of the paired t-test.
For example, “Was the Glasgow Coma Scale score different between the scene and admission?”
Univariate, Difference, Matched, Interval, Nonnormal, 2 groups.

ANOVA
One-way: used to compare 3 or more means from independent groups.
“Is the age different between White, Black, Hispanic patients?”
Two-way: used to compare 2 or more means by 2 or more factors.
“Is the age different between Males and Females, With and Without Pneumonia?”

Tests of Between-Subjects Effects
Dependent Variable: AGE

Source          Type III Sum of Squares  df   Mean Square  F         Sig.
Model           5769944a                 4    1442486      8664.775  .000
SEX             1981.683                 1    1981.683     11.904    .001
PNEUMON         1299.320                 1    1299.320     7.805     .005
SEX * PNEUMON   519.282                  1    519.282      3.119     .078
Error           154657.2                 929  166.477
Total           5924601                  933
a. R Squared = .974 (Adjusted R Squared = .974)

Kruskal-Wallis One-Way ANOVA
Used to compare continuous variables that are not normally distributed between more than 2 groups.
Nonparametric equivalent to the one-way ANOVA.
Is the length of stay different by ethnicity?
Analyze, Nonparametric Tests, K Independent Samples.

Repeated-Measures ANOVA
Used to assess the change in 2 or more continuous measurements made on the same person. Can also compare groups and adjust for covariates.
Do changes in the vital signs within the first 24 hours of a hip fracture predict which patients will develop pneumonia?
Analyze, General Linear Model, Repeated Measures.

Pearson Correlation
Used to assess the linear association between two continuous variables.
r=1.0 perfect correlation
r=0.0 no correlation
r=-1.0 perfect inverse correlation
Univariate, Association, Interval

Correlations (Pearson Correlation, with Sig. (2-tailed) in parentheses)

Variables:
1 = AGE
2 = 49-DAYS IN HOSPITAL
3 = NUMBER OF COMORBIDITES (0-9)
4 = 43-TOTAL NUMBER OF COMPLICATIONS
5 = 35-SYSTOLIC BLOOD PRESSURE FIRST ER
6 = 35-GLASGOW COMA SCALE FIRST ER
7 = 35-PULSE FIRST ER

     1               2               3               4               5               6
2    .088** (.007)
3    .211** (.000)   .167** (.000)
4    .137** (.000)   .453** (.000)   .222** (.000)
5    .149** (.000)   .039 (.237)     .034 (.296)     -.033 (.310)
6    -.030 (.356)    .016 (.633)     -.079* (.017)   -.028 (.393)    .043 (.196)
7    -.008 (.809)    .022 (.499)     .055 (.093)     .046 (.161)     .069* (.035)    -.100** (.002)

N per pair: 933 among variables 1-4; 925 for variable 5 with variables 1-4 and 6; 926 for variable 6 with variables 1-4; 923 for variable 7 with any other variable.
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).

Spearman rank-order correlation
Used to assess the relationship between two ordinal variables or two skewed continuous variables.
Nonparametric equivalent of the Pearson correlation.
Univariate, Association, Ordinal (or skewed).

Correlations (Spearman's rho Correlation Coefficient, with Sig. (2-tailed) in parentheses; variables numbered as in the Pearson table)

     1               2               3               4               5               6
2    .089** (.007)
3    .158** (.000)   .142** (.000)
4    .145** (.000)   .389** (.000)   .229** (.000)
5    .091** (.005)   .073* (.027)    .037 (.257)     -.014 (.676)
6    -.146** (.000)  .048 (.149)     -.091** (.006)  -.076* (.020)   .079* (.017)
7    -.008 (.806)    .037 (.268)     .042 (.202)     .043 (.196)     .080* (.015)    -.038 (.252)

N per pair: 933 among variables 1-4; 925 for variable 5 with variables 1-4 and 6; 926 for variable 6 with variables 1-4; 923 for variable 7 with any other variable.
**. Correlation is significant at the .01 level (2-tailed).
*. Correlation is significant at the .05 level (2-tailed).

Summary of Inferential Tests

Unpaired vs. Paired

Unpaired                Paired
Student’s t-test        Paired t-test
Chi-square              McNemar’s test
One-way ANOVA           Repeated-measures
Mann-Whitney U test     Wilcoxon signed-rank
Kruskal-Wallis H test   Friedman ANOVA

Parametric vs. Nonparametric

Parametric                                        Nonparametric
Student’s t-test                                  Mann-Whitney U test
One-way ANOVA                                     Kruskal-Wallis test
Paired t-test                                     Wilcoxon signed-rank
Pearson correlation                               Spearman’s r
Correlated F ratio (repeated-measures ANOVA)      Friedman ANOVA

A Good Rule to Follow
Always check your results with a nonparametric test.
If you test your null hypothesis with a Student’s t-test, also check it with a Mann-Whitney U test.
It will only take an extra 25 seconds.

VIII. You Will Need to Understand Regression Techniques

Linear Regression
Used to assess how one or more predictor variables can be used to predict a continuous outcome variable.
“Do age, number of comorbidities, or admission vital signs predict the length of stay in the hospital after a hip fracture?”
Multivariate, Association, Interval/Ordinal dependent variable.

Coefficients(a), Model 1

                                      Unstandardized Coefficients  Standardized Coefficients
                                      B           Std. Error       Beta                       t       Sig.
(Constant)                            -4.451      18.889                                      -.236   .814
AGE                                   7.136E-02   .045             .053                       1.571   .117
NUMBER OF COMORBIDITES (0-9)          2.606       .548             .159                       4.757   .000
35-SYSTOLIC BLOOD PRESSURE FIRST ER   1.562E-02   .022             .024                       .726    .468
35-GLASGOW COMA SCALE FIRST ER        1.067       1.170            .030                       .912    .362
35-PULSE FIRST ER                     2.581E-02   .047             .019                       .554    .580
35-RESPIRATION RATE FIRST ER          -8.00E-02   .188             -.014                      -.425   .671
a. Dependent Variable: 49-DAYS IN HOSPITAL

Logistic Regression
Used to assess the predictive value of one or more variables on an outcome that is a yes/no question.
“Do age, gender, and comorbidities predict which hip fracture patients will develop pneumonia?”
Multivariate, Difference, Nominal dependent variable, not time-dependent, 2 groups.

1. Total number of comorbidities
2. Cirrhosis
3. COPD
4. Gender
5. Age

Draw Conclusions
We reject the null hypothesis.
Patients who are at high risk of developing pneumonia during their hospitalization for a hip fracture can be identified by:
total number of pre-existing conditions
cirrhosis
COPD
male gender

How this information could be used to predict pneumonia on admission

$$\text{Probability of Pneumonia} = \frac{1}{1 + e^{-Z}}$$

$$Z = -4.899 + (\text{number of comorbidities} \times 0.469) + (\text{cirrhosis} \times 2.275) + (\text{COPD} \times 0.714) + (\text{age} \times 0.021) + (\text{gender} \times -0.715)$$

Gender is coded female=1, male=0; e=2.718.

Example: an 80-year-old male with cirrhosis and one other comorbidity (but not COPD):

$$Z = -4.899 + (2 \times 0.469) + (1 \times 2.275) + (0 \times 0.714) + (80 \times 0.021) + (0 \times -0.715) = -0.006$$

$$\text{Probability of Pneumonia} = \frac{1}{1 + e^{0.006}} \approx 0.50$$

With these coefficients, this patient has about a 50% chance of developing pneumonia.

Survival Analysis
Kaplan-Meier method: used to plot cumulative survival
Log-rank test: used to compare survival curves
Cox proportional-hazards: used to adjust for covariates in survival

Odds and Ends You Will Need

95% Confidence Intervals
A 95% confidence interval is an estimate that you make from your sample as to where the true population value lies.
If your study were to be repeated 100 times, you would expect the 95% CIs to include the true value for the population in 95 of these 100 studies.
The value might be a mean, a percentage, or a relative risk (RR).
Confidence intervals should be included in publications for the major findings of the study.

Prevalence vs. Incidence
Prevalence: How many of you now have the flu?
Incidence: How many of you have had the flu in the past year?

Random
Random is not the same as haphazard, unplanned, incidental.
Allocating patients to the treatment group on even days and to the control group on odd days is systematic, not random.
Random refers to the idea that each element in a set has an equal probability of occurrence.

Improving an RCT
See the handout, Table 3-2, pages 18-19: “Checklist to Be Used by Authors When Preparing or by Readers When Analyzing a Report of a Randomized Controlled Trial”.

IX. You Will Need to Continue Learning About Statistics

Recommended books on statistics
Kuzma, Statistics in the Health Sciences
Norusis, Data Analysis with SPSS
Altman, Statistics with Confidence
Friedman, Fundamentals of Clinical Trials
Pagano, Principles of Biostatistics
Encyclopedia of Biostatistics
SPSS manuals

Future CRC Workshops
Oct 11 - How to use wireless hand-helds for clinical research (Paul St Jacques, MD, Anesthesiology)
Oct 18 - How to conduct ANOVA statistical tests - Part 1/3 (Ayumi Shintani, PhD, MPH, Center for Health Services Research)
Oct 25 - How to conduct ANOVA statistical tests - Part 2/3 (Ayumi Shintani, PhD, MPH, Center for Health Services Research)
Nov 1 - How to conduct ANOVA statistical tests - Part 3/3 (Ayumi Shintani, PhD, MPH, Center for Health Services Research)
Nov 8 - How to write a data and safety-monitoring plan (Harvey Murff, MD)

X. One Final Skill You Will Need to Master

A response to the comment: “You’re comparing apples and oranges”
“No, this is comparing apples and oranges!”