Sample Size Determination

Assistant Professor 謝宗成
Institute of Medical Sciences, Tzu Chi University (慈濟大學醫學研究所)
[email protected]
TEL: 03-8565301 ext 2015
Office: 勤耕樓 712

Topics

Part I: Basic concepts
- Why calculate the sample size?
- Should the sample be large or small?
- How large a sample is enough?
- Factors to consider when calculating sample size

Part II: Software operation
- Sample size formula vs. effect size
- Software for sample size calculation
- Sample size for means comparison
- Sample size for proportions comparison
- Sample size for linear regressions

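Why calculate the sample size?

- When the population of interest is very large (effectively infinite), a finite random sample is drawn and statistical methods are used to infer the characteristics of the population.
  - E.g., studying the treatment effect of a new antihypertensive drug in hypertensive patients in Taiwan.
- The "finite" sample must be "large enough" to represent the population and reflect its characteristics.
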
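Should the sample be large or small?

Drawbacks of a sample that is too small
- The sample is not representative enough to support inference about the population.
- A submitted proposal will be challenged.
- A submitted paper will not be accepted.

Drawbacks of a sample that is too large
- Resources and money are wasted unnecessarily.
- It can produce spuriously "significant" differences: statistically significant but not clinically significant. E.g., a test of a correlation coefficient is statistically significant, yet the correlation coefficient computed from the sample is only 0.1.
- When the sample is a large fraction of the population, the infinite-population assumption of statistical theory no longer holds.
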
Should the sample be large or small?

When the population of interest is finite, is it still necessary to sample, or to calculate a sample size?

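How large a sample is enough?

- Statistics: using the information in a finite sample to estimate the characteristics of the population, so that an appropriate decision can be made.
  => information from a finite sample vs. a right or wrong decision
- How to improve decision quality: use a sample large enough that
  - the probability of making a wrong decision is kept within an acceptable range => Type I error;
  - the probability of making a correct decision is at least some required level => power.
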
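How large a sample is enough?

Statistical hypothesis testing
- Two different and mutually exclusive hypotheses are stated about the population characteristic of interest, and the sample information is used to test which hypothesis is more plausible.
- The hypothesis judged plausible on the basis of the sample is the product of the statistical analysis: the decision.
- E.g., evaluating whether a new antihypertensive drug is effective
  - Clinical trial of the new antihypertensive drug vs. placebo
  - Hypothesis 1: the new antihypertensive drug and placebo have the same effect
  - Hypothesis 2: the new antihypertensive drug and placebo have different effects
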
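How large a sample is enough?

Decision                  | Truth: H0 is true                 | Truth: H1 is true
Accept H0                 | Correct inference                 | Incorrect inference
Reject H0 (≒ accept H1)   | Incorrect inference: Type I error | Correct inference: power

Type I error
- The probability of erring (making the wrong decision) by rejecting H0 (accepting H1) when H0 is true.
- E.g., the new antihypertensive drug and placebo in fact have the same effect, but the statistical analysis concludes that their effects differ.

Power
- The probability of being right (making the correct decision) by accepting H1 (rejecting H0) when H1 is true.
- E.g., the new antihypertensive drug and placebo have different effects, and the statistical analysis also concludes that their effects differ.
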
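How large a sample is enough?

When should the null hypothesis be rejected (the alternative hypothesis accepted)?
- When the P-value is smaller than the pre-specified maximum acceptable Type I error (i.e., α).
  - E.g., P-value < 0.05
- Statistics learns about a population from a finite sample, so any decision based on the sample carries a risk (probability) of error.
- P-value: assuming the null hypothesis is in fact true, the probability of obtaining sample evidence at least as extreme as that observed. Rejecting H0 only when the P-value is below α keeps the probability of wrongly rejecting a true H0 (a wrong decision, i.e., Type I error) at or below α.
  - E.g., if the new antihypertensive drug and placebo in fact have the same effect, this is the probability that, after analyzing the sample collected in the trial, we wrongly conclude that their effects differ.
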
Factors to consider when calculating sample size

The sample size is calculated from the sample size formula derived for the corresponding statistical analysis method. The inputs include:
- Prior data (experienced data)
- Study objective
- Study design
- Number of treatment groups
- Outcome measure
- Statistical method
- Statistical hypothesis

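Factors to consider when calculating sample size

- Detectable treatment effect => clinically meaningful effect
- Type I error and power
- Allocation ratio (ratio of sample sizes between groups)
- Anticipated dropout rate (proportion of subjects who leave the study early, so that no outcome can be obtained)
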
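Factors to consider when calculating sample size

Prior data (experienced data)
- What is known about the population before sampling (before running the trial)
  - From a pilot study, e.g., a phase II study
  - From references
  - Based on a guess
- E.g., planning a phase III study using experience from the phase II results
  - Mean blood pressure reduction: new antihypertensive drug vs. placebo = 10.5 mmHg vs. 1.2 mmHg
  - SD of the blood pressure reduction: new antihypertensive drug vs. placebo = 5.2 mmHg vs. 0.4 mmHg
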
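Factors to consider when calculating sample size

Study objective
- Compare the mean effect between treatment groups: t-test, ANOVA
- Build a prediction model: regression models
- Explore the association between variables: correlation analysis
- Establish a diagnostic criterion: ROC curve analysis
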
Factors to consider when calculating sample size

Study design
- Parallel design: unpaired t-test, ANOVA
- Crossover design: mixed effect model
- Group sequential design: group sequential analysis

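Factors to consider when calculating sample size

Parallel design: after randomization, each subject is assigned to either the test arm (A medication) or the control arm (B medication).

Crossover design: after randomization, each subject is assigned to one of two sequences.

Sequence    | Period I     | Washout period | Period II
Sequence I  | A medication | washout        | B medication
Sequence II | B medication | washout        | A medication
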
Factors to consider when calculating sample size

Number of treatment groups
- Single-arm (without control group): 1 => paired t-test
- Two-arm (with control group): 2 => unpaired t-test
- Dose-response study: possibly more than 2 => ANOVA

Outcome measure
- Quantitative variable: blood pressure, blood glucose, etc.
- Qualitative variable: good / bad, response / no response, etc.
- Time to event

Factors to consider when calculating sample size

Statistical method
- Quantitative variable
  - Comparison of means: paired t-test, unpaired t-test, ANOVA
  - Prediction model: regression analysis
- Qualitative variable
  - Comparison of population proportions: Chi-square test
  - Prediction model: logistic regression
- Time to event
  - Survival analysis

Factors to consider when calculating sample size

Allocation ratio (ratio of sample sizes between groups)
- Active drug vs. placebo: 2 vs. 1 or more

Anticipated dropout rate (proportion of subjects who leave the study early, so that no outcome can be obtained)
- Dropout rate = P
- Total sample size = (no. of evaluable subjects) / (1 - P)
- E.g., the sample size formula requires 40 evaluable subjects and the dropout rate is 0.2. The required sample size is then 40 / (1 - 0.2) = 50.

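The dropout adjustment can be sketched in a few lines of Python. The helper name `inflate_for_dropout` is hypothetical (it does not come from the slides); it divides the number of evaluable subjects by (1 - P) and rounds up to a whole subject.

```python
import math

def inflate_for_dropout(n_evaluable: int, dropout_rate: float) -> int:
    """Number of subjects to enrol so that about n_evaluable remain after dropout."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    # The small tolerance guards against floating-point noise just above an integer.
    return math.ceil(n_evaluable / (1 - dropout_rate) - 1e-9)

# Slide example: 40 evaluable subjects, 20% dropout.
print(inflate_for_dropout(40, 0.2))
```

Rounding up rather than to the nearest integer is a deliberate choice here, so that the expected number of evaluable subjects does not fall below the target.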
Factors to consider when calculating sample size

- Detectable treatment effect: clinically meaningful effect vs. statistically significant effect
- Statistical hypothesis
  - Test for equality (difference = 0)
  - Test for superiority
  - Test for noninferiority
- Type I error and power

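Factors to consider when calculating sample size

Types of statistical hypothesis: example
- A clinical trial of a new drug, DRUGN, is run to evaluate its effect in treating hypertension.
- Control group: DRUGC
- Outcome measure: reduction in blood pressure after 6 months of treatment.
- μN: mean reduction in blood pressure after 6 months of treatment with DRUGN
- μC: mean reduction in blood pressure after 6 months of treatment with DRUGC
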
Factors to consider when calculating sample size

Test for equality (difference)
- Purpose: do DRUGN and DRUGC differ in efficacy?
- H0: μN = μC vs. H1: μN ≠ μC, or equivalently H0: μN - μC = 0 vs. H1: μN - μC ≠ 0

Test for superiority
- Purpose: is DRUGN more effective than DRUGC?
- If the clinically meaningful difference is δ (positive), then H0: μN - μC ≦ δ vs. H1: μN - μC > δ

Factors to consider when calculating sample size

Test for noninferiority
- Purpose: is DRUGN no worse than DRUGC?
- If the clinically meaningful difference is δ (negative), then H0: μN - μC ≦ δ vs. H1: μN - μC > δ

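Factors to consider when calculating sample size

Type I error
- Probability of a wrong decision: H0 is true, but the decision treats H0 as false (rejects H0).
- Usually set to 0.05.

Power
- Probability of a correct decision: H0 is false (i.e., H1 is true), and the decision also treats H0 as false (rejects H0, i.e., accepts H1).
- Usually set to 0.8.
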
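As an illustration, the conventional settings α = 0.05 and power = 0.8 translate into the standard normal quantiles that appear in the hand calculations (1.96, 1.645 and 0.84). The sketch below uses only Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

def z_quantile(p: float) -> float:
    """Quantile of the standard normal distribution at probability p."""
    return NormalDist().inv_cdf(p)

alpha, power = 0.05, 0.80
z_two_sided = z_quantile(1 - alpha / 2)  # used for two-sided tests (equality)
z_one_sided = z_quantile(1 - alpha)      # used for one-sided tests (superiority, noninferiority)
z_power = z_quantile(power)              # Z_{1-beta}, with beta = 1 - power

print(round(z_two_sided, 3), round(z_one_sided, 3), round(z_power, 3))
```

The hand calculations round these quantiles to two or three decimals, which can shift a result that sits very close to a whole number.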
Sample Size formula vs. Effect Size

- Effect size: a measure of the strength of the relationship between two variables in a statistical population, or a sample-based estimate of that quantity.
- Some rules for deciding effect size are useful for sample size calculation.
- Example: comparison of two independent means
  - Effect size
  - Sample size formula:

$$n_c = \frac{(Z_{1-\alpha} + Z_{1-\beta})^2\,(\sigma_c^2 + \sigma_t^2/k)}{(\mu_t - \mu_c - \delta)^2}, \qquad n_t = k\,n_c$$

Software for Sample Size Calculation

G*Power 3.1
- GUI for Windows
- Covers many statistical tests and designs
- Online user manual available
- Free!
- Website for download: http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/

Sample Size for Comparison of Means between 2 groups

Test for equality
- H0: μt - μc = 0 vs. H1: μt - μc ≠ 0
- Two-sided unpaired test
- k = nt / nc (allocation ratio)
- Desired significance level α, power 1-β
- Formula 1 (hand calculation):

$$n_c = \frac{(Z_{1-\alpha/2} + Z_{1-\beta})^2\,(\sigma_c^2 + \sigma_t^2/k)}{(\mu_t - \mu_c)^2}, \qquad n_t = k\,n_c$$

The values of μt, μc, σt², σc² in the formula must be filled in from prior data.

Sample Size for Comparison of Means between 2 groups

Test for equality
- Formula 2 (used by G*Power):

$$d = \frac{\mu_t - \mu_c}{\sqrt{(\sigma_c^2 + \sigma_t^2)/2}}$$

$$\text{noncentrality parameter} = d\,\sqrt{\frac{n_t\,n_c}{n_t + n_c}}, \qquad df = n_t + n_c - 2, \qquad n_t = k\,n_c$$

nt and nc are obtained by solving the equation power = 1-β.
The values of μt, μc, σt², σc² in the formula must be filled in from prior data.

Sample Size for Comparison of Means between 2 groups

Test for equality (by hand calculation)

Example: clinical trial of a new antihypertensive drug (Phase III study)
- Study design: two-arm, randomized, parallel, controlled study
- Efficacy endpoint: the decrease of SBP from baseline after 6-month treatment
- Treatment group: DRUGN; control group: placebo
- Allocation ratio: 2
- From the Phase II study results, μt, μc, σt, σc are approximately 23, 10, 20, 25
- Desired significance level α = 5%, power 1-β = 80%
- Dropout rate: 20%

$$n_c = \frac{(1.96 + 0.84)^2\,(25^2 + 20^2/2)}{(23 - 10)^2} = 38.3 \approx 39, \qquad n_t = 2 \times n_c = 78$$

Total sample size = (39 + 78)/(1 - 0.2) ≒ 147

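The hand calculation for the equality test can be reproduced with a short Python sketch. The function name and argument names are illustrative assumptions; the quantiles 1.96 and 0.84 are the rounded table values used on the slide.

```python
import math

def n_control_equality(mu_t, mu_c, sd_t, sd_c, k, z_alpha_half=1.96, z_beta=0.84):
    """Control-group size for a two-sided test of mu_t - mu_c = 0 (Formula 1)."""
    numerator = (z_alpha_half + z_beta) ** 2 * (sd_c ** 2 + sd_t ** 2 / k)
    return numerator / (mu_t - mu_c) ** 2

k = 2                                                      # allocation ratio n_t / n_c
n_c = math.ceil(n_control_equality(23, 10, 20, 25, k))     # round up to whole subjects
n_t = k * n_c
total = math.ceil((n_c + n_t) / (1 - 0.2))                 # inflate for 20% dropout
print(n_c, n_t, total)
```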
Sample Size for Comparison of Means between 2 groups

Test for equality (by G*Power)
- Test family: t tests
- Statistical test: Difference between two independent means (two groups)
- Type of power analysis: A priori: compute required sample size, given α, power, and effect size
- Input parameters:
  - Tails: Two
  - Effect size d: click "Determine =>"
    - Select the n1 = n2 panel
    - Mean group 1: 23
    - Mean group 2: 10
    - SD group 1: 20
    - SD group 2: 25
    - Click "Calculate and transfer to main window"

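For orientation, the effect size d that Formula 2 implies for these inputs can be computed directly. This is a sketch of the formula as stated in the slides (pooled SD taken as the square root of the average of the two variances); it is assumed, not verified here, that the "Determine" panel returns the same value.

```python
import math

def cohens_d(mean_1: float, mean_2: float, sd_1: float, sd_2: float) -> float:
    """Standardized mean difference with SD = sqrt((sd_1^2 + sd_2^2) / 2)."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd

d = cohens_d(23, 10, 20, 25)
print(round(d, 3))
```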
Sample Size for Comparison of Means between 2 groups

Test for equality (by G*Power)
- Input parameters:
  - α error prob: 0.05
  - Power: 0.8
  - Allocation ratio N2/N1: 0.5
- Click "Calculate"
- Click "X-Y plot for the range of values"
  - Click "Draw plot"

Sample Size for Comparison of Means between 2 groups

Test for equality (by G*Power): calculation of sample size
[Screenshot: G*Power window. Callouts: "If the two groups have equal SDs, select this option" and "Calculated sample size".]

Test for equality (by G*Power): power analysis for power vs. sample size
[Figure: G*Power X-Y plot of power against sample size.]

Sample Size for Comparison of Means between 2 groups

Test for superiority
- H0: μt - μc ≦ δ vs. H1: μt - μc > δ
- One-sided unpaired test
- k = nt / nc (allocation ratio)
- Desired significance level α, power 1-β
- Superiority margin (clinically meaningful difference) δ (positive)
- Formula 1 (hand calculation):

$$n_c = \frac{(Z_{1-\alpha} + Z_{1-\beta})^2\,(\sigma_c^2 + \sigma_t^2/k)}{(\mu_t - \mu_c - \delta)^2}, \qquad n_t = k\,n_c$$

The values of μt, μc, σt², σc² in the formula must be filled in from prior data.

Sample Size for Comparison of Means between 2 groups

Test for superiority
- Formula 2 (used by G*Power):

$$d = \frac{\mu_t - \mu_c - \delta}{\sqrt{(\sigma_c^2 + \sigma_t^2)/2}}$$

$$\text{noncentrality parameter} = d\,\sqrt{\frac{n_t\,n_c}{n_t + n_c}}, \qquad df = n_t + n_c - 2, \qquad n_t = k\,n_c$$

nt and nc are obtained by solving the equation power = 1-β at the α significance level.
The values of μt, μc, σt², σc² in the formula must be filled in from prior data.

Sample Size for Comparison of Means between 2 groups

Test for superiority

Example: clinical trial of a new antihypertensive drug (Phase III study)
- Study design: two-arm, randomized, parallel, controlled study
- Efficacy endpoint: SBP after 6-month treatment
- Treatment group: DRUGN; control group: placebo
- Allocation ratio: 2
- From the Phase II study results, μt, μc, σt, σc are approximately 23, 10, 20, 25
- Desired significance level α = 5%, power 1-β = 80%
- Superiority margin (clinically meaningful difference) δ = 10
- Dropout rate: 20%

$$n_c = \frac{(1.645 + 0.84)^2\,(25^2 + 20^2/2)}{(23 - 10 - 10)^2} = \frac{(1.645 + 0.84)^2\,(25^2 + 20^2/2)}{3^2} = 566.1 \approx 567, \qquad n_t = 2 \times n_c = 1134$$

Total sample size = (567 + 1134)/(1 - 0.2) ≒ 2127

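The superiority calculation differs from the equality case in two places: the one-sided quantile 1.645 replaces 1.96, and the margin δ is subtracted in the denominator. A minimal Python sketch follows; the function name is a hypothetical helper and the quantiles are the rounded values used on the slide.

```python
import math

def n_control_with_margin(mu_t, mu_c, sd_t, sd_c, k, margin, z_alpha=1.645, z_beta=0.84):
    """Control-group size for the one-sided test H1: mu_t - mu_c > margin (Formula 1)."""
    numerator = (z_alpha + z_beta) ** 2 * (sd_c ** 2 + sd_t ** 2 / k)
    return numerator / (mu_t - mu_c - margin) ** 2

k = 2
raw = n_control_with_margin(23, 10, 20, 25, k, margin=10)
n_c = math.ceil(raw)                          # round up to whole subjects
n_t = k * n_c
total = math.ceil((n_c + n_t) / (1 - 0.2))    # inflate for 20% dropout
print(round(raw, 1), n_c, n_t, total)
```

Because the denominator shrinks from 13² to 3², the required sample size grows by more than an order of magnitude compared with the equality test on the same data.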
Sample Size for Comparison of Means between 2 groups

Test for superiority (by G*Power)
- Test family: t tests
- Statistical test: Difference between two independent means (two groups)
- Type of power analysis: A priori: compute required sample size, given α, power, and effect size
- Input parameters:
  - Tails: One (one-tailed test)
  - Effect size d: click "Determine =>"
    - Select the n1 = n2 panel
    - Because μt - μc - δ = 3, enter the two group means so that their difference is 3:
    - Mean group 1: 3
    - Mean group 2: 0
    - SD group 1: 20
    - SD group 2: 25
    - Click "Calculate and transfer to main window"

Sample Size for Comparison of Means between 2 groups

Test for superiority (by G*Power)
- Input parameters:
  - α error prob: 0.05
  - Power: 0.8
  - Allocation ratio N2/N1: 0.5
- Click "Calculate"
- Click "X-Y plot for the range of values"
  - Click "Draw plot"

Test for superiority (by G*Power): calculation of sample size
[Screenshot: G*Power window with the calculated sample size.]
Note: the hand-calculated sample size is the more conservative of the two.

Sample Size for Comparison of Means between 2 groups

Test for noninferiority
- H0: μt - μc ≦ δ vs. H1: μt - μc > δ
- One-sided unpaired test
- k = nt / nc (allocation ratio)
- Desired significance level α, power 1-β
- Noninferiority margin (clinically meaningful difference) δ (negative)
- Formula 1 (hand calculation):

$$n_c = \frac{(Z_{1-\alpha} + Z_{1-\beta})^2\,(\sigma_c^2 + \sigma_t^2/k)}{(\mu_t - \mu_c - \delta)^2}, \qquad n_t = k\,n_c$$

The values of μt, μc, σt², σc² in the formula must be filled in from prior data.

Sample Size for Comparison of Means between 2 groups

Test for noninferiority
- Formula 2 (used by G*Power):

$$d = \frac{\mu_t - \mu_c - \delta}{\sqrt{(\sigma_c^2 + \sigma_t^2)/2}}$$

$$\text{noncentrality parameter} = d\,\sqrt{\frac{n_t\,n_c}{n_t + n_c}}, \qquad df = n_t + n_c - 2, \qquad n_t = k\,n_c$$

nt and nc are obtained by solving the equation power = 1-β at the α significance level.
The values of μt, μc, σt², σc² in the formula must be filled in from prior data.

Sample Size for Comparison of Means between 2 groups

Test for noninferiority

Example: clinical trial of a new antihypertensive drug (Phase III study)
- Study design: two-arm, randomized, parallel, controlled study
- Efficacy endpoint: SBP after 6-month treatment
- Treatment group: DRUGN; control group: placebo
- Allocation ratio: 2
- From the Phase II study results, μt, μc, σt, σc are approximately 18, 21, 20, 25
- Desired significance level α = 5%, power 1-β = 80%
- Noninferiority margin (clinically meaningful difference) δ = -10
- Dropout rate: 20%

$$n_c = \frac{(1.645 + 0.84)^2\,(25^2 + 20^2/2)}{(18 - 21 - (-10))^2} = \frac{(1.645 + 0.84)^2\,(25^2 + 20^2/2)}{7^2} = 103.9 \approx 104, \qquad n_t = 2 \times n_c = 208$$

Total sample size = (104 + 208)/(1 - 0.2) ≒ 390

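One way to sanity-check the noninferiority result is to plug the computed group sizes back into a normal-approximation power formula and see whether the power lands near the 80% target. The sketch below is an assumption-laden check (one-sided z-test, known SDs), not the method used by G*Power; the function name is illustrative.

```python
import math
from statistics import NormalDist

def approx_power_one_sided(mu_t, mu_c, sd_t, sd_c, n_t, n_c, margin, alpha=0.05):
    """Normal-approximation power for H1: mu_t - mu_c > margin."""
    std_error = math.sqrt(sd_t ** 2 / n_t + sd_c ** 2 / n_c)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Power is the chance that the standardized difference exceeds the critical value.
    return NormalDist().cdf((mu_t - mu_c - margin) / std_error - z_crit)

power = approx_power_one_sided(18, 21, 20, 25, n_t=208, n_c=104, margin=-10)
print(round(power, 3))
```

The value comes out very close to 0.80. Since the hand calculation rounds the quantiles (0.84 rather than the unrounded value), the attained power can sit marginally on either side of the target.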
Sample Size for Comparison of Means between 2 groups

Test for noninferiority (by G*Power)
- Test family: t tests
- Statistical test: Difference between two independent means (two groups)
- Type of power analysis: A priori: compute required sample size, given α, power, and effect size
- Input parameters:
  - Tails: One (one-tailed test)
  - Effect size d: click "Determine =>"
    - Select the n1 = n2 panel
    - Because μt - μc - δ = 7, enter the two group means so that their difference is 7:
    - Mean group 1: 7
    - Mean group 2: 0
    - SD group 1: 20
    - SD group 2: 25
    - Click "Calculate and transfer to main window"

Sample Size for Comparison of Means between 2 groups

Test for noninferiority (by G*Power)
- Input parameters:
  - α error prob: 0.05
  - Power: 0.8
  - Allocation ratio N2/N1: 0.5
- Click "Calculate"
- Click "X-Y plot for the range of values"
  - Click "Draw plot"

Test for noninferiority (by G*Power): calculation of sample size
[Screenshot: G*Power window with the calculated sample size.]
Note: the hand-calculated sample size is the more conservative of the two.

Sample Size for Comparison of Means between 2 groups

The sample size for a nonparametric method can also be obtained.
- Mann-Whitney test
- Steps in G*Power:
  - Test family: t tests
  - Statistical test: Means: Wilcoxon-Mann-Whitney (two groups)
  - Type of power analysis: A priori: compute required sample size, given α, power, and effect size
  - The remaining steps are similar to those described in the previous slides.

Sample Size for Comparison of Means between 2 groups

If nothing is known about the values of μt, μc, σt², σc², the following effect sizes d proposed by Cohen J. (1969) can be considered:
- Small effect size: d = 0.2
- Medium effect size: d = 0.5
- Large effect size: d = 0.8

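To give a feel for what these conventions imply, the sketch below converts d into an approximate per-group sample size. It assumes equal SDs and equal allocation (k = 1), under which Formula 1 for the two-sided equality test reduces to n = 2(Z_{1-α/2} + Z_{1-β})² / d². This is a normal approximation; a t-based calculation such as G*Power's may give slightly larger numbers. The function name is illustrative.

```python
import math
from statistics import NormalDist

def n_per_group_from_d(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison with effect size d."""
    normal = NormalDist()
    z_sum = normal.inv_cdf(1 - alpha / 2) + normal.inv_cdf(power)
    return math.ceil(2 * z_sum ** 2 / d ** 2)

for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(label, d, n_per_group_from_d(d))
```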
Sample Size for Comparison of Means among 3 or more groups

Test for equality
- H0: μ1 = μ2 = ... = μk for k ≧ 3 vs. H1: μi ≠ μj for some i ≠ j
- Analysis of Variance (ANOVA)
- Desired significance level α, power 1-β
- n: required sample size in each group
- Formula 1 (hand calculation):

$$\bar{\mu} = \frac{1}{k}\sum_{i=1}^{k}\mu_i, \qquad \Delta = \frac{1}{\sigma^2}\sum_{i=1}^{k}(\mu_i - \bar{\mu})^2, \qquad n = \frac{\lambda}{\Delta}$$

where σ is the SD within each group and μi is the mean of group i.
The values of μi and σ in the formula must be filled in from prior data; λ is read from the table on the next slide.

Sample Size for Comparison of Means among 3 or more groups

[Table: values of λ for Formula 1.]

Sample Size for Comparison of Means among 3 or more groups

Test for equality
- Formula 2 (used by G*Power):

$$f = \frac{\sigma_m}{\sigma}, \qquad \sigma_T^2 = \sigma_m^2 + \sigma^2 \quad (\text{total variability of the samples})$$

$$\bar{\mu} = \frac{1}{k}\sum_{i=1}^{k}\mu_i, \qquad \sigma_m^2 = \frac{\sum_{i=1}^{k} a_i\,(\mu_i - \bar{\mu})^2}{\sum_{i=1}^{k} a_i} \quad (\text{variability explained by treatment})$$

where μi is the mean of group i,
σ²: variance within group (MSE), i.e., variability due to random error,
ai: sample size in each group.
The values of μi, σ, ai in the formula must be filled in from prior data.

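Formula 2 can be sketched in Python as follows. The function name is hypothetical, and the grand mean is weighted by the group sizes; with equal group sizes this coincides with the unweighted mean shown in the slide formula. The inputs are the three-arm example values used elsewhere in these slides (means 9.25, 11.75, 12; within-group SD 6).

```python
import math

def cohens_f(means, sizes, sd_within):
    """Effect size f = sigma_m / sigma for a one-way layout."""
    total = sum(sizes)
    grand_mean = sum(a * m for a, m in zip(sizes, means)) / total
    # Between-group variance: size-weighted mean squared deviation from the grand mean.
    var_between = sum(a * (m - grand_mean) ** 2 for a, m in zip(sizes, means)) / total
    return math.sqrt(var_between) / sd_within

f = cohens_f([9.25, 11.75, 12], sizes=[5, 5, 5], sd_within=6)
print(round(f, 3))
```

With equal sizes the specific value entered for each group cancels out, so any common size gives the same f.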
Sample Size for Comparison of Means among 3 or more groups

Test for equality

Example: clinical trial of a new antihypertensive drug (Phase III study)
- Study design: three-arm, randomized, parallel, controlled study
- Efficacy endpoint: SBP after 6-month treatment
- Treatment groups: DRUGA, DRUGB, DRUGC
- From the Phase II study results, μ1, μ2, μ3, σ are approximately 9.25, 11.75, 12, 6
- Desired significance level α = 5%, power 1-β = 80%
- Dropout rate: 20%

$$\bar{\mu} = \frac{1}{k}\sum_{i=1}^{k}\mu_i = 11, \qquad \Delta = \frac{1}{\sigma^2}\sum_{i=1}^{k}(\mu_i - \bar{\mu})^2 = 0.12847, \qquad n = \frac{\lambda}{\Delta} = \frac{9.64}{0.12847} \approx 76$$

where σ is the SD in each group and μi is the mean of group i.

Total sample size = (76 × 3)/(1 - 0.2) = 285

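The three-arm hand calculation can be checked with a short Python sketch. The function name is illustrative, and λ = 9.64 is taken as given from the slide (it is read from a table, not computed here).

```python
import math

def anova_delta(means, sd_within):
    """Delta = (1 / sigma^2) * sum of squared deviations of group means from their average."""
    grand_mean = sum(means) / len(means)
    return sum((m - grand_mean) ** 2 for m in means) / sd_within ** 2

means, sd_within, lam = [9.25, 11.75, 12], 6, 9.64
delta = anova_delta(means, sd_within)
n_per_group = math.ceil(lam / delta)                                # round up to whole subjects
total = math.ceil(n_per_group * len(means) / (1 - 0.2) - 1e-9)      # 20% dropout; tolerance for float noise
print(round(delta, 5), n_per_group, total)
```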
Sample Size for Comparison of Means among 3 or more groups

Test for equality (by G*Power)
- Test family: F tests
- Statistical test: ANOVA: Fixed effects, omnibus, one-way
- Type of power analysis: A priori: compute required sample size, given α, power, and effect size
- Input parameters:
  - Effect size f: click "Determine =>"
    - Select procedure: Effect size from means
    - Number of groups: 3
    - SD within each group: 6
    - Group 1: Mean = 9.25, size = 5
    - Group 2: Mean = 11.75, size = 5
    - Group 3: Mean = 12, size = 5
    - Click "Calculate and transfer to main window"

Note: "size" here is the sample size of the previous study; any value can be entered as long as it is the same for every group.

Sample Size for Comparison of Means among 3 or more groups

Test for equality (by G*Power)
- Input parameters:
  - α error prob: 0.05
  - Power: 0.8
  - Number of groups: 3
- Click "Calculate"
- Click "X-Y plot for the range of values"
  - Click "Draw plot"

Test for equality (by G*Power): calculation of sample size
[Screenshot: G*Power window with the calculated sample size.]

Sample Size for Comparison of Means among 3 or more groups

If nothing is known about the values of μi and σ², the following effect sizes f proposed by Cohen J. (1969) can be considered:
- Small effect size: f = 0.1
- Medium effect size: f = 0.25
- Large effect size: f = 0.4

Suggested minimum sample size
- More than 20 per cell is preferred.

Sample Size for Comparison of Proportions between 2 groups

Test for equality
- H0: Pt - Pc = 0 vs. H1: Pt - Pc ≠ 0
- Two-sided Chi-square test (or Z-test)
- k = nt / nc (allocation ratio)
- Desired significance level α, power 1-β
- Formula:

$$n_c = \frac{(Z_{1-\alpha/2} + Z_{1-\beta})^2}{(P_t - P_c)^2}\left[P_c(1 - P_c) + \frac{P_t(1 - P_t)}{k}\right], \qquad n_t = k\,n_c$$

The values of Pt and Pc in the formula must be filled in from prior data.

Sample Size for Comparison of Proportions between 2 groups

Test for equality (by hand calculation)

Example: clinical trial of a new antihypertensive drug (Phase III study)
- Study design: two-arm, randomized, parallel, controlled study
- Efficacy endpoint: response to treatment after 6-month treatment
- Treatment group: DRUGN; control group: placebo
- Allocation ratio: 2
- From the Phase II study results, Pt and Pc are approximately 85% and 65%
- Desired significance level α = 5%, power 1-β = 80%
- Dropout rate: 20%

$$n_c = \frac{(1.96 + 0.84)^2}{(0.85 - 0.65)^2}\left[0.65(1 - 0.65) + \frac{0.85(1 - 0.85)}{2}\right] \approx 58, \qquad n_t = 2 \times 58 = 116$$

Total sample size = (58 + 116)/(1 - 0.2) = 217.5, i.e., about 219 once the control group is rounded up to 73 and the treatment group is set to 2 × 73 = 146.

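The proportions example can be reproduced with the sketch below. The function name is a hypothetical helper and the quantiles are the rounded slide values. The last step shows one way to arrive at a total of 219 (inflate the control group for dropout, round up, then apply the allocation ratio); whether the slide used exactly this rounding order is an assumption.

```python
import math

def n_control_two_proportions(p_t, p_c, k, z_alpha_half=1.96, z_beta=0.84):
    """Control-group size for a two-sided test of p_t - p_c = 0."""
    variance_term = p_c * (1 - p_c) + p_t * (1 - p_t) / k
    return (z_alpha_half + z_beta) ** 2 / (p_t - p_c) ** 2 * variance_term

k = 2
n_c = math.ceil(n_control_two_proportions(0.85, 0.65, k))  # evaluable control subjects
n_t = k * n_c                                              # evaluable treatment subjects

# Assumed rounding order: adjust the control group for 20% dropout first, then scale by k.
n_c_enrolled = math.ceil(n_c / (1 - 0.2))
n_t_enrolled = k * n_c_enrolled
print(n_c, n_t, n_c_enrolled + n_t_enrolled)
```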
Sample Size for Comparison of Proportions between 2 groups

Test for equality (by G*Power)
- Test family: z tests
- Statistical test: Difference between two independent proportions
- Type of power analysis: A priori: compute required sample size, given α, power, and effect size
- Input parameters:
  - Tails: Two
  - Proportion p2: 0.65
  - Proportion p1: 0.85
  - α error prob: 0.05
  - Power: 0.8
  - Allocation ratio N2/N1: 0.5
- Click "Calculate"
- Click "X-Y plot for the range of values"
  - Click "Draw plot"

Test for equality (by G*Power): calculation of sample size
[Screenshot: G*Power window with the calculated sample size.]

Sample Size for Comparison of Proportions between 2 groups

Test for superiority
- H0: Pt - Pc ≦ δ vs. H1: Pt - Pc > δ
- One-sided Chi-square test (or Z-test)
- k = nt / nc (allocation ratio)
- Desired significance level α, power 1-β
- Superiority margin δ (positive)
- Formula:

$$n_c = \frac{(Z_{1-\alpha} + Z_{1-\beta})^2}{(P_t - P_c - \delta)^2}\left[P_c(1 - P_c) + \frac{P_t(1 - P_t)}{k}\right], \qquad n_t = k\,n_c$$

The values of Pt and Pc in the formula must be filled in from prior data.

Sample Size for Comparison of Proportions between 2 groups

Test for superiority (by hand calculation)

Example: clinical trial of a new antihypertensive drug (Phase III study)
- Study design: two-arm, randomized, parallel, controlled study
- Efficacy endpoint: response to treatment after 6-month treatment
- Treatment group: DRUGN; control group: placebo
- Allocation ratio: 1
- From the Phase II study results, Pt and Pc are approximately 85% and 65%
- Desired significance level α = 5%, power 1-β = 80%
- Superiority margin δ = 5%
- Dropout rate: 20%

$$n_c = \frac{(1.645 + 0.84)^2}{(0.85 - 0.65 - 0.05)^2}\left[0.65(1 - 0.65) + \frac{0.85(1 - 0.85)}{1}\right] \approx 98, \qquad n_t = 1 \times 98 = 98$$

Total sample size = (98 + 98)/(1 - 0.2) = 245, i.e., 246 when each group is rounded up to 123.

Not available in G*Power.

Sample Size for Comparison of Proportions between 2 groups

Test for noninferiority
- H0: Pt - Pc ≦ δ vs. H1: Pt - Pc > δ
- One-sided Chi-square test (or Z-test)
- k = nt / nc (allocation ratio)
- Desired significance level α, power 1-β
- Noninferiority margin δ (negative)
- Formula:

$$n_c = \frac{(Z_{1-\alpha} + Z_{1-\beta})^2}{(P_t - P_c - \delta)^2}\left[P_c(1 - P_c) + \frac{P_t(1 - P_t)}{k}\right], \qquad n_t = k\,n_c$$

The values of Pt and Pc in the formula must be filled in from prior data.

Sample Size for Comparison of Proportions between 2 groups

Test for noninferiority (by hand calculation)

Example: clinical trial of a new antihypertensive drug (Phase III study)
- Study design: two-arm, randomized, parallel, controlled study
- Efficacy endpoint: response to treatment after 6-month treatment
- Treatment group: DRUGN; control group: placebo
- Allocation ratio: 2
- From the Phase II study results, Pt and Pc are approximately 70% and 75%
- Desired significance level α = 5%, power 1-β = 80%
- Noninferiority margin δ = -10%
- Dropout rate: 20%

$$n_c = \frac{(1.645 + 0.84)^2}{(0.70 - 0.75 - (-0.1))^2}\left[0.75(1 - 0.75) + \frac{0.70(1 - 0.70)}{2}\right] \approx 723, \qquad n_t = 2 \times 723 = 1446$$

Total sample size = (723 + 1446)/(1 - 0.2) ≒ 2712

Not available in G*Power.

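Since these margin-based tests are noted as not available in G*Power, a small script is one practical alternative to working by hand. The sketch below implements the one-sided formula with a margin; the function name is hypothetical and the quantiles are the rounded values from the slides. It reproduces both the superiority example (δ = 0.05, k = 1) and the noninferiority example (δ = -0.10, k = 2).

```python
import math

def n_control_proportions_margin(p_t, p_c, k, margin, z_alpha=1.645, z_beta=0.84):
    """Control-group size for the one-sided test H1: p_t - p_c > margin."""
    variance_term = p_c * (1 - p_c) + p_t * (1 - p_t) / k
    return (z_alpha + z_beta) ** 2 / (p_t - p_c - margin) ** 2 * variance_term

# Superiority: margin is positive.
n_c_sup = math.ceil(n_control_proportions_margin(0.85, 0.65, k=1, margin=0.05))
# Noninferiority: margin is negative.
n_c_noninf = math.ceil(n_control_proportions_margin(0.70, 0.75, k=2, margin=-0.10))
print(n_c_sup, n_c_noninf, 2 * n_c_noninf)
```

With unrounded quantiles the noninferiority result moves up by one subject, so the rounded values are passed explicitly here to match the slide.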
Sample Size for building a linear regression model

Test for all regression coefficients = 0
- H0: R² = 0 vs. H1: R² ≠ 0
- F-test
- Desired significance level α, power 1-β
- Formula:

$$\text{Power} = P\!\left(\frac{n - 1 - p}{p}\cdot\frac{R^2}{1 - R^2} > F_{1-\alpha;\,p,\,n-1-p} \;\middle|\; R^2 = R_a^2\right)$$

p: number of predictors (independent variables)
Ra²: R² value obtained from previous study results
n can be obtained by solving the above equation.

In G*Power, the effect size is

$$f^2 = \frac{R^2}{1 - R^2}$$

Sample Size for building a linear regression model

Test for all regression coefficients = 0 (by G*Power)

Example: multiple linear regression model for predicting blood pressure from LDL, HDL, gender, age and TG
- Number of predictors: 5 (p = 5)
- From previous study results, R² is approximately 0.3
- Desired significance level α = 5%, power 1-β = 95%

Steps:
- Test family: F tests
- Statistical test: Linear multiple regression: Fixed model, R² deviation from zero
- Type of power analysis: A priori: compute required sample size, given α, power, and effect size
- Input parameters:
  - Effect size: click "Determine =>"
    - Click "From correlation coefficient"
    - Squared multiple correlation ρ²: 0.3
    - Click "Calculate and transfer to main window"

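The conversion from R² to the effect size f² is simple enough to check by hand or in code. The helper below is an illustrative sketch of f² = R² / (1 - R²), applied to the example value R² = 0.3.

```python
def f_squared(r_squared: float) -> float:
    """Cohen's f^2 for a multiple regression with the given R^2."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    return r_squared / (1 - r_squared)

effect = f_squared(0.3)
print(round(effect, 4))
```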
Sample Size for building a linear regression model

Test for all regression coefficients = 0 (by G*Power)
- Input parameters:
  - α error prob: 0.05
  - Power: 0.95
  - Number of predictors: 5
- Click "Calculate"
- Click "X-Y plot for the range of values"
  - Click "Draw plot"

Test for all regression coefficients = 0 (by G*Power): calculation of sample size
[Screenshot: G*Power window with the calculated sample size.]

Sample Size for building a linear regression model

If nothing is known about the value of R², the following effect sizes f² proposed by Cohen J. (1969) can be considered:
- Small effect size: f² = 0.02
- Medium effect size: f² = 0.15
- Large effect size: f² = 0.35

Suggested minimum sample size
- Minimum: 5 cases per predictor (5:1)
- Ideally 20 cases per predictor (20:1), with an overall N of at least 100
- N should ideally be 50 + 8k for testing a full regression model, or 104 + k when testing individual predictors (where k is the number of predictors)

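The rules of thumb above are easy to tabulate for a given number of predictors. The sketch below does so for k = 5, matching the blood pressure example; the function name and dictionary keys are illustrative, and reading "20 per predictor with N of at least 100" as the larger of the two numbers is an interpretation.

```python
def regression_rules_of_thumb(k: int) -> dict:
    """Suggested minimum N under each rule of thumb, for k predictors."""
    return {
        "minimum, 5 cases per predictor": 5 * k,
        "ideal, 20 cases per predictor and N >= 100": max(20 * k, 100),
        "full model, 50 + 8k": 50 + 8 * k,
        "individual predictors, 104 + k": 104 + k,
    }

for rule, n in regression_rules_of_thumb(5).items():
    print(f"{rule}: {n}")
```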
Key Summary

Clarify the following factors before sample size calculation:
- Study objective
- Study design
- Number of treatment groups
- Allocation ratio (ratio of sample sizes between groups)
- Anticipated dropout rate (proportion of subjects who leave the study early, so that no outcome can be obtained)
- Statistical method
- Outcome measure
- Statistical hypothesis
- Detectable treatment effect => clinically meaningful effect
- Type I error and power

Key Summary

- The hand calculation can be used when no software is available.
- The effect size rules proposed by Cohen J. (1969) can be considered if nothing is known about the values of the parameters.
- The suggested minimum sample size should be achieved.