Statistics for Oral Health Professionals
L11: Sample Size

Sample Size Determination
by Lin Naing
School of Dental Sciences, Universiti Sains Malaysia

Sample Size Calculation for Estimation

Estimating a mean
• A study is planned to estimate the knowledge level among adults in Kampong X.
• The result should be reported as "mean knowledge score and its 95% CI", e.g. mean K score 10.14 units (95% CI: 9.29, 10.99).
• The value of a study can be judged by the width of the confidence interval (CI). A wide CI means a poor study.

The width of the CI is determined by the term after the ± sign:

$$\text{CI of } \mu = \bar{x} \pm Z \cdot \frac{\sigma}{\sqrt{n}}$$

For example, 5 ± 1 = (4, 6) and 5 ± 2 = (3, 7). We call this part the "precision", ∆, so the width of the CI is 2∆. Solving for n:

$$\Delta = Z \cdot \frac{\sigma}{\sqrt{n}} \quad\Rightarrow\quad \sqrt{n} = \frac{Z \cdot \sigma}{\Delta} \quad\Rightarrow\quad n = \left(\frac{Z \cdot \sigma}{\Delta}\right)^2$$

To estimate the knowledge score
• If we plan for 95% confidence (5% error), Z = 1.96.
• SD (σ) is estimated as 4.3 (K score), either from a previous study or from a pilot study. If it comes from a previous study, state the reference.
• It is impossible to check the normality assumption.
• Now, it is the researcher's decision to select which sample size is appropriate for the study.

How to report? (in Methodology)
• Sample size was determined as follows.
• The following formula (Daniel, 1999) is used to calculate the sample size for objective 1 (to determine the K level).

$$n = \left(\frac{Z \cdot \sigma}{\Delta}\right)^2$$

Z = 1.96 for 95% confidence
σ = SD of K score = 4.3 (Brian, 2002??)
∆ = Precision = 1 unit

$$n = \left(\frac{1.96 \times 4.3}{1}\right)^2 \approx 71.03, \text{ rounded up to } 72$$

• We need 72 people in order to estimate the mean K score with a precision of 1 unit.
• We decided to take 87 people (an additional 20%) to allow for anticipated non-response.

Estimating a proportion
• A study is planned to estimate the prevalence of POds. in Kampong X.
• The result should be reported as "prevalence (proportion) of POds. and its 95% CI". In our example data, we get 37% (95% CI: 27%, 47%).

$$n = \left(\frac{Z}{\Delta}\right)^2 \cdot P(1 - P)$$

If we plan for 95% confidence (5% error), Z = 1.96, and P is estimated as 40% (prevalence of POds., from the literature or a pilot study).

[Figure: Relationship between P and sample size. x-axis: P (0.00 to 1.00); y-axis: sample size (0 to 800).]

With a given sample size, you can "expect" a result such as: prevalence 40% (95% CI: 37.5, 42.5), i.e. 2.5% precision. This is considered a very good precision, but we need a sample size of 1,476 people to achieve it. That may be IMPOSSIBLE!

Now, it is the researcher's decision to select which sample size is appropriate for the study.

How to decide the precision?

$$n = \left(\frac{Z \cdot \sigma}{\Delta}\right)^2 \qquad\qquad n = \left(\frac{Z}{\Delta}\right)^2 \cdot P(1 - P)$$

The level of confidence is conventionally set at 95% (Z = 1.96). P or SD is estimated from the literature or a pilot study. The remaining question is "HOW TO DECIDE THE PRECISION?"
(1) Generally, a smaller ∆ (a narrower CI) is better.
(2) However, researchers are commonly limited by the availability of resources.
(3) It may depend on previous studies:
  - If this is the first study on the topic, a relatively wide CI is still considered valuable.
  - If previous studies have reported CIs of a certain width and we want to repeat the study, we should achieve a narrower CI (added value).

Sample Size Calculation for Hypothesis Testing

Important Concepts
1. Type I (α error)
2. Type II (β error) / Power of the Study (1-β)
3. Detectable Difference (Detectable Alternative)

Two types of error
• Alpha error (Type I), a false positive: the woman is NOT pregnant, but the test is 'positive'. In reality there is no association, but your result gives a 'significant association'. Allow 0.05 (5%).
• Beta error (Type II), a false negative: the woman is pregnant, but the test is 'negative'. In reality there is an association, but your result gives 'no significant association'. Allow 0.2 (20%).

Two types of error: an example
• In reality, say there is no difference between male and female, but your result is: "The mean K score is significantly different between the two groups (P=0.009)." This is a Type I or alpha error.
• In reality, say there is a difference between the 2 groups, but your result is: "The mean K score is not significantly different between the two groups (P=0.234)." This is a Type II or beta error. It means that even though there is a difference, you do not have enough 'power' to detect it.

Power of the study
• Let's say our study has a β error of 20%. It means that there is a 20% chance that we will get 'no sig. association' even though there is an association in reality.
• In other words, there is an 80% chance that we will get 'sig. association' if there is an association in reality. It means that our study has a power of 80%.
• Power: the probability of achieving a 'significant' result if there truly is an association.

Important Concepts: Detectable Difference
1. What is Detectable Difference?
2. How to decide the Detectable Difference?
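The alpha error and power described above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the SD of 4.0, the group size of 30, the true difference of 3.0 and the function name `rejection_rate` are all hypothetical choices, and a two-sided z-test with known SD stands in for whatever test a real study would use.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(true_diff, sd, n_per_group, alpha=0.05, n_trials=2000, seed=1):
    """Fraction of simulated studies that report a 'significant' difference.

    Each simulated study draws two groups of size n_per_group from normal
    populations with the same (known) SD and compares the two sample means
    with a two-sided z-test.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    se = sd * sqrt(2 / n_per_group)               # SE of the difference in means
    significant = 0
    for _ in range(n_trials):
        group_a = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        group_b = [rng.gauss(true_diff, sd) for _ in range(n_per_group)]
        z = (fmean(group_b) - fmean(group_a)) / se
        if abs(z) > z_crit:
            significant += 1
    return significant / n_trials


# Hypothetical settings, chosen only for illustration.
# No true difference: every 'significant' result is a Type I error.
type_i_rate = rejection_rate(true_diff=0.0, sd=4.0, n_per_group=30)
# A true difference exists: the 'significant' fraction estimates the power.
power = rejection_rate(true_diff=3.0, sd=4.0, n_per_group=30)

print(f"Simulated Type I error rate: {type_i_rate:.3f}")
print(f"Simulated power:             {power:.3f}")
```

With no true difference, the fraction of 'significant' results should sit near the chosen alpha of 0.05. With a true difference, the same fraction estimates the power, and one minus that value estimates the beta error.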
Important Concepts
What is Detectable Difference (Detectable Alternative)?
• The "minimum size of the difference between groups" that the study could detect.
• "The study could detect" means the following. Let's say you are comparing the means of 2 groups and, in reality, the 2 means are truly different. If at the end of the study you get the result "the two group means are significantly different" (one is more than the other), it means that "you detect the difference". If you get the result "the difference is not significant", it means that "you fail to detect it".
• For example, 60.0 kg ≈ 60.1 kg, but 60.0 kg < 60.5 kg.

Comparing means of two (2) POPULATIONS

[Figure: distributions of populations A (µ = 15.00) and B (µ = 15.01), with the Type I (α) and Type II (β) error regions and the detectable difference (DD) marked.]

$$n = \frac{2\sigma^2 \left(Z_{\alpha/2} + Z_{\beta}\right)^2}{\Delta^2}$$

where n is the sample size in each group.
• For alpha 0.05, Zα/2 = 1.96.
• For beta 0.2, Zβ = 0.84.
• σ (SD) comes from a previous study or a pilot study.
• How should researchers decide the DD (∆)?

How to decide the Detectable Difference?
It should reflect the "Clinically Significant Difference" (CSD). We should be able to detect the CSD. In other words, we should design a study to detect the CSD.

How to decide the "Detectable Difference" or CSD?
The 2 means in this example (15.00 versus 15.01 units) are numerically different. However, this difference would not be considered important by any reasonable person. In other words, 15.00 versus 15.01 units is NOT a meaningful difference, and NOT a practically/clinically important difference.
Then, at what point would you call it a "meaningful difference" or a "practically/clinically important difference"? 15.0 versus 15.1? 15.0 versus 15.2? 15.0 versus 15.5? 15.0 versus 16.0? 15.0 versus 16.5? 15.0 versus 17.0?

Let's say the researchers consider that 15.0 versus 18.0 is an important difference (3 units), and that a difference of less than 3 is 'not important'. Then we should design the study to detect a difference of 3 units, meaning that the DD should be set at 3 units.
WHO can make this decision? Experts who know well the importance of the level of K score. These experts are the researchers who plan this study.

Using PS software
• Comparing 2 means
• Comparing 2 proportions
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize

Using PS software: comparing 2 means
Researchers want to compare the K score between male and female college students.
Objective: To compare the mean K score between male and female students.

$$n = \frac{2\sigma^2 \left(Z_{\alpha/2} + Z_{\beta}\right)^2}{\Delta^2}$$

Don't use the formula for the calculation; it is shown for explanation only. The inputs are:
• Alpha (0.05)
• Power (start with 80%)
• σ = SD (within-group SD of the K score), from another study or a pilot study
• ∆ = Detectable Difference (clinically important difference)
• m = ratio between the 2 groups (m=1 means 1:1)

[Screenshot: PS software with numbered steps 1 to 5; step 4 is "Fill all 5 inputs".]

With a sample size of 33 in each group, we will achieve 80% power to detect a difference of 3 units of K score with alpha at 0.05.
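The result of 33 per group quoted above can be approximately reproduced from the explanatory formula. The sketch below is a normal-approximation illustration, not the method PS itself uses (the source does not describe the PS internals, and its results may differ slightly for other inputs). The function names are hypothetical, and the SD of 4.3 is the K score SD used earlier in the lecture. The power function is simply the same formula solved for power, and it ignores the negligible chance of a significant result in the wrong direction.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and SD 1


def n_per_group_two_means(sd, detectable_diff, alpha=0.05, power=0.80):
    """Per-group n from n = 2*sd^2*(Z_alpha/2 + Z_beta)^2 / diff^2, rounded up."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = _STD_NORMAL.inv_cdf(power)           # about 0.84 for 80% power
    n = 2 * sd**2 * (z_alpha + z_beta) ** 2 / detectable_diff**2
    return ceil(n)


def approx_power_two_means(sd, true_diff, n_per_group, alpha=0.05):
    """Approximate power of a two-sided comparison of two means (1:1 ratio)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    se = sd * sqrt(2 / n_per_group)  # SE of the difference between the two means
    return _STD_NORMAL.cdf(abs(true_diff) / se - z_alpha)


# Inputs from the lecture example: SD = 4.3, detectable difference = 3 units.
n = n_per_group_two_means(sd=4.3, detectable_diff=3)
print("n per group:", n)

# How the power changes with the true difference, at that sample size.
for true_diff in (1, 3, 5):
    p = approx_power_two_means(sd=4.3, true_diff=true_diff, n_per_group=n)
    print(f"true difference {true_diff} units -> approximate power {p:.2f}")
```

Under these assumptions the first function returns 33 per group. The approximate power is about 0.81 when the true difference equals the detectable difference of 3 units, above 0.99 for a true difference of 5 units, and below 0.2 for a true difference of 1 unit.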
Example: Say that, in reality, the difference between male and female is 5 units. With this sample size, you have at least an 80% chance of getting a 'significant' or 'positive' result. (You have at least 80% power to reject the null hypothesis.)
Say the difference is only 1 unit. This sample size will probably fail to detect that difference. But that is OK: we don't want to detect such a small difference, because it is not clinically/practically important.

IN SUMMARY, for comparing 2 means
We need to decide:
• Alpha (0.05 by convention)
• Power (80% = 0.8)
• SD (of the variable of interest, from a previous study or a pilot study)
• Detectable Difference (should reflect clinical/practical importance)
• Ratio of sample size between the 2 groups (m=1 "1:1"; m=2 "2:1")

How to report?
We use PS software (Dupont & Plummer, 1997) to calculate the sample size based on comparing two means. To detect a difference of 3 units (of K score) with 80% power and alpha 0.05, we need 33 students in each study group (SD was estimated as 4.3, reference??). We have decided to take 40 male and 40 female students (an additional 20%) in anticipation of some non-response.

Reference:
Dupont WD and Plummer WD (1997). PS power and sample size program available for free on the Internet. Controlled Clin Trials, 18:274. Available at http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize

Exercise
• Researchers want to compare the SBP between treated and untreated hypertensive patients.
• A recent study revealed that the SD of SBP among hypertensive patients was 10 mmHg (state ref.??).
• The researchers feel that it is important to detect a difference of 5 mmHg between the 2 study groups.
• They plan to take equal sample sizes (1:1) for the 2 study groups (m=1).
• Set alpha at 0.05 as usual.
• Calculate the sample size to achieve a power of 80%.

Exercise
• Researchers want to compare the SBP between treated and untreated hypertensive patients.
• SD was 10 mmHg (state reference??).
• DD is set at 5 mmHg.
• They plan to take a 1:2 ratio for untreated : treated (m=2), because it is difficult to find 'untreated' patients.
• Set alpha at 0.05 as usual.
• Calculate the sample size to achieve a power of 80%.
The sample size that the software gives is for the '1' side of the 1:2 ratio (here, the untreated group).

Using PS software: comparing 2 proportions
Researchers want to compare the prevalence of POds. between male and female students.
Objective: To compare the prevalence of POds. between male and female students.
Remember: we have to set Alpha, Power and Detectable Difference.
• Alpha = 0.05
• Power = 80% (0.8)
• Detectable Difference = ???

The inputs are:
• Alpha (0.05)
• Power (80% = 0.8)
• ∆ = Detectable Difference (clinically important difference) = (P1-P0)
• P0 = prevalence of POds. among males (get from the literature)
• P1 = prevalence of POds. among females (set based on the desired DD)
• m = 1 (equal ratio between male and female)

In this example:
• P0 = 0.27 (say, we get this from the literature)
• P1 = 0.37 (this is our decision, based on the DD. By putting 0.37 here, we are setting the DD in this study as 0.10 or 10%, considering that a difference of <10% is not important.)
• m = 1 (equal ratio between male and female)

[Screenshot: PS software with numbered steps 1 to 5; step 4 is "Fill all 5 inputs".]
• P0 comes from a previous study or a pilot study.
• (P1-P0) is the Detectable Difference.
• Ratio between the 2 groups: m=1 means 1:1.
With a sample size of 340 in each group, we will achieve 80% power to detect a difference of 10% (POds. prevalence) with alpha at 0.05.

IN SUMMARY, for comparing 2 proportions
We need to decide:
• Alpha (0.05 by convention)
• Power (80% = 0.8)
• P0 (prevalence of POds. among males, 27%, reference?)
• Detectable Difference (should reflect clinical/practical importance; in this example, a 10% difference is decided, therefore P1 = 37%)
• Ratio between the 2 groups (m=1 "1:1"; m=2 "2:1")

How to report?
We use PS software (Dupont & Plummer, 1997) to calculate the sample size based on comparing two proportions. To detect a difference of 10% in the prevalence of POds. (P0 27% versus P1 37%) between the two study groups with 80% power and alpha 0.05, we need 340 male and 340 female students. (P0, the prevalence of POds. among males, was estimated as 27%, ref?) (You may add some extra, e.g. 10%.)

Exercise!
• Calculate the sample size for a detectable difference in prevalence of 20%. It means P0 = 27% and P1 = 47%.
• Calculate the sample size for a detectable difference in prevalence of 20% (as above), with the "male : female" ratio as 2:1 (m=2).

Final COMMENTS
• For each objective, we should calculate the sample size.
• Sometimes one objective involves more than one variable of interest (e.g. multiple linear regression). In this case, we need to calculate the sample size for each variable of interest.
• Then, the biggest sample size will be "the sample size of the study".
• We need to add 10-20% because we may get non-response, loss to follow-up, or other losses.

Summary
• Estimating a mean and estimating a proportion: using formulae
• Comparing two means and comparing two proportions: using PS software
• Not only "How to Calculate" but also "How to Report"
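As a recap of the "using formulae" part of the summary, the two estimation formulas from the first half of the lecture can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function names are hypothetical, results are rounded up to the next whole person, and the allowance helper simply applies the 10-20% addition mentioned in the final comments.

```python
from math import ceil
from statistics import NormalDist


def z_for_confidence(confidence=0.95):
    """Two-sided Z value for a confidence level (about 1.96 for 95%)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def n_to_estimate_mean(sd, precision, confidence=0.95):
    """n = (Z * sd / precision)^2, rounded up."""
    z = z_for_confidence(confidence)
    return ceil((z * sd / precision) ** 2)


def n_to_estimate_proportion(p, precision, confidence=0.95):
    """n = (Z / precision)^2 * p * (1 - p), rounded up."""
    z = z_for_confidence(confidence)
    return ceil((z / precision) ** 2 * p * (1 - p))


def add_allowance(n, extra=0.20):
    """Inflate n by a fraction (for example 0.10 to 0.20) for anticipated losses."""
    return ceil(n * (1 + extra))


# Lecture examples: SD 4.3 with precision 1 unit; P = 40% with precision 2.5%.
n_mean = n_to_estimate_mean(sd=4.3, precision=1)
n_prop = n_to_estimate_proportion(p=0.40, precision=0.025)
print("Estimating the mean K score:", n_mean, "->", add_allowance(n_mean), "with 20% extra")
print("Estimating the prevalence:  ", n_prop)

# One sample size per objective; the largest becomes the study sample size.
study_n = max(n_mean, n_prop)
print("Sample size of the study (before allowance):", study_n)
```

With the lecture's inputs, these functions give 72 for the mean (87 after adding 20%) and 1,476 for the proportion, in line with the figures quoted in the slides.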