Probability Distribution

Probability and Probability Distribution
Lecturer: Dr. Sony Sunaryo, M.Si.
Refresher Course
Business Statistics
MMT-ITS
2014
Examples of Probability Statements in Production
• The probability of a scratch in steel plate production is 10%.
• The probability of a product being underweight is 1%.
• The probability of finding a defective product during an inspection is 5%.
• Once in operation, the probability that a machine breaks down within 8 hours is 3%.
• The probability that a manufactured product is defective is higher for factory A than for factory B.
• What do these statements mean?
Random Variable
A random variable is a numerical measure of the outcome of a probability experiment, so its value is determined by chance. Random variables are denoted by capital letters such as X.
Two Types of Random Variables
• A discrete random variable has a finite or countable number of possible values. Its values are obtained by counting.
• A continuous random variable has an uncountably infinite number of possible values. Its values are obtained by measuring.
EXAMPLE Distinguishing Between Discrete and
Continuous Random Variables
Determine whether the following random variables
are discrete or continuous. State possible values
for the random variable.
(a) The number of light bulbs that burn out in a
room of 10 light bulbs in the next year.
(b) The number of leaves on a randomly selected tree.
(c) The length of time between calls to 109.
Probability Distributions
• The histogram (or stem-and-leaf plot, or box
plot) is used to describe sample data.
A sample is a collection of measurements
selected from some larger source or
population.
• A probability distribution is a mathematical
model that relates the value of the variable
with the probability of occurrence of that
value in the population.
Probability Distribution
A probability distribution provides the
possible values of the random variable and
their corresponding probabilities.
A probability distribution can be in the form
of a table, graph or mathematical formula.
The table below shows the probability distribution for the random variable X (why?), where X represents the number of DVDs a person rents from a video store during a single visit.
EXAMPLE 1 Identifying Probability Distributions
Is the following a probability distribution?
Why?
EXAMPLE 2 Identifying Probability Distributions
Is the following a probability distribution?
Why?
A probability histogram is a histogram in
which the horizontal axis corresponds to
the value of the random variable and the
vertical axis represents the probability of
that value of the random variable.
EXAMPLE
Drawing a Probability Histogram
Draw a probability histogram of the following
probability distribution which represents the
number of DVDs a person rents from a video store
during a single visit.
Probability Distribution

x    P(X = x)
0    0.06
1    0.58
2    0.22
3    0.10
4    0.03
5    0.01

[Figure: probability histogram of the distribution above; horizontal axis: random variable values (0 to 5); vertical axis: probabilities (0 to 0.7)]
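The two conditions a table must satisfy to be a probability distribution (every probability between 0 and 1, and the probabilities summing to 1) can be checked mechanically. The following is a minimal sketch using the DVD rental values listed above; the function names `is_probability_distribution` and `mean` are illustrative choices, not part of the course material.

```python
import math

# X = number of DVDs rented per visit, with the probabilities listed above.
dvd_dist = {0: 0.06, 1: 0.58, 2: 0.22, 3: 0.10, 4: 0.03, 5: 0.01}

def is_probability_distribution(dist, tol=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in dist.values())
    sums_to_one = math.isclose(sum(dist.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

def mean(dist):
    """Expected value: sum of x * P(X = x) over all values x."""
    return sum(x * p for x, p in dist.items())

print(is_probability_distribution(dvd_dist))  # True
print(round(mean(dvd_dist), 2))               # 1.49
```

The same check returns False for a table whose probabilities sum to more than 1 or contain a negative entry, which is the kind of reasoning the "Identifying Probability Distributions" examples ask for.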
Continuous Probability Distributions
Probability Density Functions
• Unlike a discrete random variable, a
continuous random variable is one that can
assume an uncountable number of values.
• We cannot list the possible values because
there is an infinite number of them.
• Because there are uncountably many values, the probability of each individual value is 0.
Point Probabilities are Zero
Because a continuous random variable has uncountably many possible values, the probability of each individual value is 0. Thus, we can determine the probability of a range of values only.
• E.g. with a discrete random variable like the number of defects, it is meaningful to talk about P(X = 5), say.
• In a continuous setting (e.g. with time as a random variable), the probability that the random variable of interest, say task length, takes exactly 5 minutes is zero: P(X = 5) = 0.
• It is meaningful to talk about P(X ≤ 5).
Important Discrete Distributions
Binomial Distribution
• The binomial distribution in production:
- The random variable that counts the defective items in a sample of n products, drawn from a process with average defect rate p, follows a binomial distribution.
The figure below shows the result of inspecting n = 30 items taken from the output of a process with defect rate p = 0.16. It gives the relative frequency of the number of defective items.
[Figure: relative frequency (vertical axis, 0 to 0.25) of the number of defective items (horizontal axis) for n = 30, p = 0.16]
When we want to compute a probability, for example the probability of getting fewer than 2 defective items among the 30 items sampled, the binomial distribution can be applied.
Binomial distribution function: when the outcome of each trial is classified in one of two ways, such as good/defective or success/failure, and the trial is repeated n times, the variable X that counts the successes among the n trials follows a binomial distribution with probabilities
$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$
n: the total number of trials
p: the probability of success on each trial, a value between 0 and 1
x: the number of successes among the n trials
Notation: Bin(n, p)
$\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$
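The binomial formula above translates directly into code. The sketch below is illustrative only: the helper names `binom_pmf` and `binom_cdf` are assumptions of this example, and it uses the n = 30, p = 0.16 inspection case mentioned earlier to compute the probability of fewer than 2 defective items.

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """P(X <= x), obtained by summing the pmf from 0 to x."""
    return sum(binom_pmf(i, n, p) for i in range(x + 1))

n, p = 30, 0.16
# "Fewer than 2 defective items" means X = 0 or X = 1, i.e. P(X <= 1).
p_fewer_than_2 = binom_cdf(1, n, p)
print(f"P(X < 2) = {p_fewer_than_2:.4f}")
```

Summing `binom_pmf` over x = 0, ..., n returns 1 (up to rounding), which is a quick sanity check on the formula.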
Binomial Probability Histogram
Figure (a) Bin(10, 0.4) (b) Bin(20, 0.1)
Mean, variance, and standard deviation of the binomial distribution:
Mean: np, Variance: np(1 - p),
Standard deviation: $\sqrt{np(1 - p)}$
Exercise:
A production process has a defect rate of 1%. What is the probability of finding at most 1 defective piece in a sample of n = 10 taken from the goods produced? What are the mean and variance? (Solve by hand and with Minitab.)
Answer:
$P(X \le 1) = P(X = 0) + P(X = 1)$
$= 1 \times 0.01^0 \times 0.99^{10} + 10 \times 0.01^1 \times 0.99^9$
= 0.9044 + 0.0914
≈ 0.9957
mean = np = 10 × 0.01 = 0.1
variance = np(1 - p) = 10 × 0.01 × 0.99 = 0.099
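As a cross-check on the exercise, the mean and variance can also be computed from their definitions, E[X] = Σ x·P(X = x) and V[X] = Σ (x - E[X])²·P(X = x), and compared with np and np(1 - p). This is a sketch for the n = 10, p = 0.01 case; the variable names are illustrative.

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.01
p_at_most_1 = binom_pmf(0, n, p) + binom_pmf(1, n, p)

# Mean and variance from the definitions, summing over all possible x.
mean_from_pmf = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
var_from_pmf = sum((x - mean_from_pmf) ** 2 * binom_pmf(x, n, p)
                   for x in range(n + 1))

print(f"P(X <= 1) = {p_at_most_1:.4f}")                    # 0.9957
print(f"mean: {mean_from_pmf:.4f} vs np = {n * p:.4f}")
print(f"variance: {var_from_pmf:.4f} vs np(1-p) = {n * p * (1 - p):.4f}")
```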
(Using Minitab, method 1)
Step 1.
Enter the data in the worksheet as shown below.
Step 2.
Fill in Calc > Probability Distributions > Binomial as shown below. The three options in the dialog answer these questions:
Probability: P(X = a) = ?
Cumulative probability: P(X ≤ a) = ?
Inverse cumulative probability: P(X ≤ ?) = b
Step 3.
Confirm the results.
P(X = 0) = 0.9044, P(X = 1) = 0.0914,
P(X ≤ 1) = P(X = 0) + P(X = 1)
= 0.9044 + 0.0914
≈ 0.9957
(Using Minitab, method 2)
Step 1. Fill in Calc > Probability Distributions > Binomial as shown below.
Step 2. Confirm the result.
The result is P(X ≤ 1) = 0.9957.
Poisson Distribution
Examples of the Poisson distribution: the number of occurrences of a low-frequency, rare event within a specified time or space, for example:
- The number of defects per given area.
- The number of machine breakdowns in a day.
- The number of traffic accidents at an intersection.
[Figure: frequency histogram of the number of defects (0 to 4) on the surface of a steel plate]
- The figure shows the number of defects on the surface of a steel plate.
- The Poisson distribution applies to "number of defects" data.
• The Poisson distribution function:
$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$
λ: the average number of defects per unit
• A feature of the Poisson distribution: the mean and the variance are identical, and both equal the number of defects per unit (dpu).
• E[X] = λ, V[X] = λ
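The statement that the Poisson mean and variance both equal λ can be checked numerically from the distribution function above. The sketch below uses a hypothetical rate of 2.5 defects per unit, chosen only for illustration, and truncates the infinite sum at x = 199, where the remaining tail is negligible for this rate.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), evaluated in log space for stability."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

lam = 2.5          # hypothetical defects per unit, for illustration only
xs = range(200)    # truncation point; the tail beyond it is negligible here

mean = sum(x * poisson_pmf(x, lam) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)
print(round(mean, 6), round(variance, 6))  # both equal lam = 2.5
```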
Poisson Probability Histogram
Figure (a) Poisson(1) (b) Poisson(10)
Example:
The billing department of a credit card company wants to control billing mistakes. If the number of mistakes per bill follows a Poisson distribution with mean 0.01, what is the probability of at most 1 mistake in a randomly selected bill? (Solve by hand and with Minitab.)
Answer:
$P(X \le 1) = \frac{e^{-0.01}\, 0.01^0}{0!} + \frac{e^{-0.01}\, 0.01^1}{1!}$
= 0.9900 + 0.0099
= 0.9999
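The hand calculation for the billing example can be reproduced in a few lines. This is a sketch; the function name `poisson_pmf` is an illustrative choice.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**x / math.factorial(x)

lam = 0.01  # average number of mistakes per bill
p0 = poisson_pmf(0, lam)
p1 = poisson_pmf(1, lam)

print(f"P(X = 0)  = {p0:.4f}")        # 0.9900
print(f"P(X = 1)  = {p1:.4f}")        # 0.0099
print(f"P(X <= 1) = {p0 + p1:.6f}")   # 0.999950
```

Adding the two four-decimal values gives 0.9999; carried to more digits the sum is 0.99995.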
(Using Minitab, method 1)
Step 1.
Enter the data in the worksheet as shown below.
Step 2.
Fill in Calc > Probability Distributions > Poisson as shown below.
Step 3.
Confirm the results.
P(X = 0) = 0.9900, P(X = 1) = 0.0099,
P(X ≤ 1) = P(X = 0) + P(X = 1)
= 0.9900 + 0.0099
= 0.9999
(Using Minitab, method 2)
Step 1.
Fill in Calc > Probability Distributions > Poisson as shown below.
Step 2. Confirm the result.
The result is P(X ≤ 1) = 0.9999.
Hypergeometric Distribution
The hypergeometric distribution has the following properties:
1. A random sample of size n is drawn without replacement from N items.
2. k of the N items are labeled successes and the remaining N - k are labeled failures.
HYPERGEOMETRIC DISTRIBUTION
N = population size
k = number of items in the population with a given attribute (e.g. success)
n = sample size
X = random variable (the number of items with the attribute among the n sampled)
The probability distribution of the hypergeometric random variable X, the number of successes in a random sample of size n drawn from N objects of which k are successes and N - k are failures, is:
$h(x; N, n, k) = \frac{\binom{k}{x}\binom{N - k}{n - x}}{\binom{N}{n}}, \quad x = 0, 1, 2, \ldots, n$
Example
A committee of 5 people is selected at random from 3 chemists and 5 physicists. Find the probability distribution of the number of chemists on the committee.
Answer:
Let X be the number of chemists on the committee, with X ∈ {0, 1, 2, 3}.
The probability distribution is given by
$h(x; 8, 5, 3) = \frac{\binom{3}{x}\binom{5}{5 - x}}{\binom{8}{5}}, \quad x = 0, 1, 2, 3$
$x = 0: \quad h(0; 8, 5, 3) = \frac{\binom{3}{0}\binom{5}{5}}{\binom{8}{5}} = \frac{1}{56}$
$x = 1: \quad h(1; 8, 5, 3) = \frac{\binom{3}{1}\binom{5}{4}}{\binom{8}{5}} = \frac{15}{56}$
$x = 2: \quad h(2; 8, 5, 3) = \frac{\binom{3}{2}\binom{5}{3}}{\binom{8}{5}} = \frac{30}{56}$
$x = 3: \quad h(3; 8, 5, 3) = \frac{\binom{3}{3}\binom{5}{2}}{\binom{8}{5}} = \frac{10}{56}$
Hypergeometric distribution table

x                0       1       2       3
h(x; 8, 5, 3)    1/56    15/56   30/56   10/56
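The committee table can be reproduced exactly with rational arithmetic. The sketch below is illustrative; `hypergeom_pmf` is a name chosen for this example, and Python's `Fraction` prints results in lowest terms, so 30/56 appears as 15/28 and 10/56 as 5/28.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): probability of x successes in a sample of n drawn
    without replacement from N items, of which k are successes."""
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

# Committee of n = 5 from N = 8 people, of whom k = 3 are chemists.
table = {x: hypergeom_pmf(x, N=8, n=5, k=3) for x in range(4)}
for x, prob in table.items():
    print(x, prob)
print("total:", sum(table.values()))  # total: 1
```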
The hypergeometric distribution h(x; N, n, k) has the following mean and variance:
$\mu = \frac{nk}{N}$ and $\sigma^2 = \frac{N - n}{N - 1} \cdot n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right)$
Example
Find the mean and variance of the hypergeometric distribution with N = 40, n = 5, and k = 3.
Answer:
$\mu = \frac{(5)(3)}{40} = \frac{3}{8} = 0.375$
$\sigma^2 = \frac{40 - 5}{39}\,(5)\left(\frac{3}{40}\right)\left(1 - \frac{3}{40}\right) = 0.3113$
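The mean and variance formulas can be verified against a direct summation over the distribution. The sketch below uses N = 40, n = 5, k = 3, the values that appear in the worked answer above; the helper names are illustrative.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k) as a floating-point probability."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def hypergeom_mean_var(N, n, k):
    """Closed-form mean n*k/N and variance (N-n)/(N-1) * n * (k/N) * (1 - k/N)."""
    mean = n * k / N
    var = (N - n) / (N - 1) * n * (k / N) * (1 - k / N)
    return mean, var

N, n, k = 40, 5, 3
mean, var = hypergeom_mean_var(N, n, k)

# Cross-check: compute the same quantities from their definitions.
support = range(max(0, n - (N - k)), min(n, k) + 1)
mean_sum = sum(x * hypergeom_pmf(x, N, n, k) for x in support)
var_sum = sum((x - mean_sum) ** 2 * hypergeom_pmf(x, N, n, k) for x in support)

print(round(mean, 4), round(var, 4))          # 0.375 0.3113
print(round(mean_sum, 4), round(var_sum, 4))  # 0.375 0.3113
```

Calling `hypergeom_mean_var(8, 5, 3)` applies the same formulas to the committee example.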
Example
A tire manufacturer's records show that a shipment of 5000 tires to a particular store contains 1000 defective tires. If a customer buys 10 of these tires at random from the store, what is the probability that exactly 3 of them are defective?
Answer:
Because n = 10 is small relative to N = 5000, the probability can be approximated by the binomial distribution with p = 1000/5000 = 0.2, the probability that a single tire is defective. The probability of getting exactly 3 defective tires is therefore:
$h(3; 5000, 10, 1000) \approx b(3; 10, 0.2)$
$= \sum_{x=0}^{3} b(x; 10, 0.2) - \sum_{x=0}^{2} b(x; 10, 0.2)$
= 0.8791 - 0.6778
= 0.2013
Probability Density Function
Hypergeometric with N = 5000, M = 1000, and n = 10
x    P( X = x )
3    0.201478
Probability Density Function
Binomial with n = 10 and p = 0.2
x    P( X = x )
3    0.201327
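The two Minitab results for the tire example can be compared side by side without Minitab. This is a sketch under the same setup (N = 5000, 1000 defective, n = 10, x = 3); the function names are illustrative.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """Exact probability when sampling without replacement."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """Binomial approximation, treating draws as independent."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

exact = hypergeom_pmf(3, N=5000, n=10, k=1000)
approx = binom_pmf(3, n=10, p=1000 / 5000)

print(f"hypergeometric: {exact:.6f}")
print(f"binomial:       {approx:.6f}")
print(f"difference:     {abs(exact - approx):.6f}")
```

The difference is small because the sample is only 0.2% of the lot, so removing sampled tires barely changes the defect proportion.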
Example:
A buyer orders one truckload of 10000 oranges under an agreement that only 1% of the oranges are sour. The decision to accept or reject the order follows this criterion:
"Of 10 oranges tested, if at most 1 is sour the order is accepted; if more than 1 is sour it is rejected."
a) Compute the producer's risk (Type I error: rejecting a good lot).
b) Suppose the seller "manipulates" the shipment so that 5% of the oranges are sour. Compute the consumer's risk (Type II error: accepting a bad lot).
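One way to approach the orange example is sketched below. It rests on two assumptions that are not stated in the problem: the lot of 10000 is large enough relative to the sample of 10 that the binomial distribution can stand in for the hypergeometric, and the acceptance rule is read as "accept when the number of sour oranges is 0 or 1". The names `producer_risk` and `consumer_risk` are illustrative.

```python
from math import comb

def binom_cdf(c, n, p):
    """P(X <= c) for X ~ Bin(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))

n = 10   # oranges tested
c = 1    # assumed acceptance number: accept if at most 1 is sour

# Producer's risk: the lot meets the 1% agreement but is rejected (X > c).
producer_risk = 1 - binom_cdf(c, n, 0.01)

# Consumer's risk: the lot actually has 5% sour oranges but is accepted (X <= c).
consumer_risk = binom_cdf(c, n, 0.05)

print(f"producer's risk: {producer_risk:.4f}")
print(f"consumer's risk: {consumer_risk:.4f}")
```

Under these assumptions the plan protects the producer well but accepts a 5% lot most of the time, which is the trade-off the example is meant to expose.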
Example:
Suppose the probability of a flaw in a 1-mile strand of steel wire made by a certain factory is 0.01. A steel cable consists of 100 fine wire strands, and the cable can carry its design load if at least 99 strands are in good condition. Compute the probability that the cable can carry its design load.
Example:
Out of 1000 students, 2 admit to always arriving late to class. If there are 5000 students on a given day, what is the probability that more than 3 of them are late?
Important Continuous Distributions
Normal Distribution
• Features of the normal distribution:
- It is the typical distribution for continuous data.
- Much of the data collected in the field is approximately normally distributed.
• Applications of the normal distribution:
- It can be used to calculate the Sigma level of a process that yields continuous data.
- When the data take the form of defective goods or a number of defects, it can also be used to calculate the Sigma level.
• The shape, mean, and standard deviation of the normal distribution:
- The normal distribution is bell-shaped and symmetric.
- Its form is determined by the mean (μ) and the standard deviation (σ).
• The shape of the normal distribution for different means and standard deviations
[Figure: three normal density curves labeled μ = 0, σ = 1; μ = 3.0, σ = 2; μ = 2.0, σ = 3.5; horizontal axis from -15 to 15]
- As the figure shows, the position of the curve is determined by the mean, and the spread of the curve is determined by the standard deviation.
• Examples of normally distributed quantities in the field:
- outer diameter of a bolt
- diameter of a bearing
- curing time of cement
- tensile strength of tungsten alloy steel
- flatness of a floor in construction
- time taken to receive an e-mail
- weight of product in a filling plant
- purity of product in a chemical process
- diameter of a car engine piston
Empirical Rule
This figure represents a plot of the normal probability density function with mean μ and standard deviation σ. Note that the curve is symmetric about μ, so that μ is the median as well as the mean. It is also the case that, for any normal population:
• About 68% of the population is in the interval μ ± σ.
• About 95% of the population is in the interval μ ± 2σ.
• About 99.7% of the population is in the interval μ ± 3σ.
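The three percentages in the empirical rule follow from the standard normal cumulative distribution function. A minimal sketch using `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability of falling within k standard deviations of the mean.
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}
for k, prob in coverage.items():
    print(f"within {k} standard deviation(s): {prob:.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```

Because the result depends only on k, the same percentages hold for every choice of μ and σ.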
• Calculating probabilities with Minitab:
- Probabilities for the normal distribution can be calculated with Minitab.
Example: For a normal distribution with mean 20 and standard deviation 5, calculate the following probabilities (use Minitab).
(a) The probability that X < 15
(b) The probability that X ≥ 30
(c) The probability that X is between 10 and 25
• (a) P(X < 15)
Step 1.
Fill in Calc > Probability Distributions > Normal as shown below.
Step 2.
Confirm the result.
• (b) Calculating P[X ≥ 30]
P[X ≥ 30] = 1 - P[X < 30]
Step 1. Fill in Calc > Probability Distributions > Normal as shown below.
Step 2.
Confirm the result.
The result is 1 - 0.9772 = 0.0228.
• (c) P[10 < X < 25] = P(X < 25) - P(X < 10)
Step 1. Enter the data in the worksheet as shown below.
Step 2. Fill in Calc > Probability Distributions > Normal as shown below.
Step 3. Confirm the result.
The result is 0.8413 - 0.0228 = 0.8185.
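The three probabilities for the normal distribution with mean 20 and standard deviation 5 can also be computed outside Minitab. This is a sketch using the standard library; the variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=20, sigma=5)

p_a = X.cdf(15)              # (a) P(X < 15)
p_b = 1 - X.cdf(30)          # (b) P(X >= 30) = 1 - P(X < 30)
p_c = X.cdf(25) - X.cdf(10)  # (c) P(10 < X < 25)

print(f"(a) {p_a:.4f}")  # 0.1587
print(f"(b) {p_b:.4f}")  # 0.0228
print(f"(c) {p_c:.4f}")  # 0.8186
```

For (c), subtracting the four-decimal values 0.8413 and 0.0228 gives 0.8185; the unrounded difference is 0.8186.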
Example: The tensile strength of a carbon steel is approximately normally distributed with mean 171 kg/mm² and standard deviation 5 kg/mm². When we measure the tensile strength of samples taken from a steel plate, what is the probability that the tensile strength is below 165 kg/mm²?
(Using Minitab)
Step 1.
Fill in Calc > Probability Distributions > Normal as shown below.
Step 2. Confirm the result.
Standard Units
• The proportion of a normal population that is within a
given number of standard deviations of the mean is the
same for any normal population.
• For this reason, when dealing with normal populations,
we often convert from the units in which the
population items were originally measured to standard
units.
• Standard units tell how many standard deviations an
observation is from the population mean.
Standard Normal Distribution
In general, we convert to standard units by subtracting the mean and dividing by the standard deviation. Thus, if x is an item sampled from a normal population with mean μ and variance σ², the standard unit equivalent of x is the number z, where
z = (x - μ)/σ.
The number z is sometimes called the "z-score" of x. The z-score is an item sampled from a normal population with mean 0 and standard deviation 1. This normal distribution is called the standard normal distribution.
• Standard normal distribution: the normal distribution with mean 0 and standard deviation 1.
• Z-transformation: if the random variable X is normally distributed with mean μ and standard deviation σ, then $Z = \frac{X - \mu}{\sigma}$ follows the standard normal distribution with mean 0 and standard deviation 1.
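The Z-transformation can be illustrated with the tensile strength example (mean 171 kg/mm², standard deviation 5 kg/mm², limit 165 kg/mm²): the probability computed on the original scale equals the standard normal probability at the z-score. A minimal sketch:

```python
from statistics import NormalDist

mu, sigma = 171.0, 5.0   # tensile strength in kg/mm^2
x = 165.0

z = (x - mu) / sigma                      # z-score of 165
p_direct = NormalDist(mu, sigma).cdf(x)   # P(X < 165) on the original scale
p_standard = NormalDist().cdf(z)          # P(Z < z) on the standard scale

print(f"z = {z}")                          # z = -1.2
print(f"P(X < 165) = {p_direct:.4f}")      # 0.1151
print(f"P(Z < z)   = {p_standard:.4f}")    # 0.1151
```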
• Relation between the Sigma level and Z, when only an upper specification limit (USL) is given:
- The value Z_USL = (USL - μ)/σ is the Sigma level.
- The larger the value of Z_USL, the better the performance of the process.
• Meaning of the Sigma level of a process:
Exercise
1. A process has a defect rate of 5%. What is the probability of finding fewer than 3 defective pieces among 15 samples taken from the process? (Use Minitab.)
2. The billing department of a credit card company wants to control billing mistakes. If the number of mistakes per bill follows a Poisson distribution with mean 0.05, what is the probability of fewer than 3 mistakes in a randomly selected bill? (Use Minitab.)
3. The weight of goods produced in a filling process has mean 5 kg and standard deviation 0.5 kg. When we pick one product among them, what is the probability that 5 kg ≤ weight < 5.5 kg?
4. The defect rate of parts at incoming inspection is 10%. When we inspect 100 pieces, what is
(a) the probability of fewer than 15 defective pieces?
(b) the probability of more than 25 defective pieces?
5. On average, five customers of a bank make transactions above 10 million rupiah each hour. Assuming the transactions follow no particular distribution and keep a steady pattern over a given period, find the probability that, during a particular hour, transactions above 10 million rupiah occur more than 10 times.
6. Suppose 90% of the products made by a company are of good quality. The head of production takes 5 products. What is the probability that one product is not of good quality?
7. Given a normal distribution with mean 50 and standard deviation 10, find the probability that X takes a value between 45 and 62.
8. A spare part survives a certain shock test with probability 0.75. Compute the probability that exactly 2 of 4 parts tested survive.
9. The probability that a person recovers from a heart disease after surgery is 0.4. If 15 people are known to have this disease, what is the probability that:
a) at least 10 people recover
b) between 3 and 8 people recover
c) exactly 5 people recover
10. The strength of steel rods made by a certain process is known to be approximately normally distributed with mean 24 and standard deviation 3. Customers require that at least 95% of the rods have a strength greater than 20. Does the quality of the steel rods meet the customers' requirement?
11. According to the general manager, the average number of chocolate chips per batch is taken to be seven. If a batch has fewer than 3 or more than 10 chips, the baking process is adjusted. Use the Poisson distribution to compute the probability that the specification limits set by the manager are met.
Rule of Thumb
For most populations, if the sample size is greater than 30,
the Central Limit Theorem approximation is good.
Normal approximation to the Binomial:
If X ~ Bin(n, p) and if np > 5 and n(1 - p) > 5, then
X ~ N(np, np(1 - p)) approximately.
Normal Approximation to the Poisson:
If X ~ Poisson(λ), where λ (the dpu) > 10, then X ~ N(λ, λ) approximately.
Normal approximation to Binomial
Bin(100, 0.2) and N (20, 16)
Continuity Correction
• The binomial distribution is discrete, while the normal
distribution is continuous.
• The continuity correction is an adjustment, made
when approximating a discrete distribution with a
continuous one, that can improve the accuracy of the
approximation.
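• If you want to include the endpoints in your probability calculation, extend each endpoint outward by 0.5 (subtract 0.5 from the lower endpoint and add 0.5 to the upper endpoint), then proceed with the calculation.
• If you want to exclude the endpoints from your probability calculation, move each endpoint inward by 0.5 (add 0.5 to the lower endpoint and subtract 0.5 from the upper endpoint).
Continuity Correction
P(45 ≤ X ≤ 55): use the area under the normal curve between 44.5 and 55.5
P(45 < X < 55): use the area under the normal curve between 45.5 and 54.5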
Example
If a fair coin is tossed 100 times, use the normal curve to approximate the probability that the number of heads is between 45 and 55 inclusive.
Answer: 0.7287
Use Minitab
Cumulative Distribution Function
Binomial with n = 100 and p = 0.5
x     P( X <= x )
44    0.135627
55    0.864373
P(45 ≤ X ≤ 55) = 0.864373 - 0.135627 = 0.728746
With the continuity correction, the limits 44.5 and 55.5 standardize to z = (44.5 - 50)/5 = -1.1 and z = (55.5 - 50)/5 = 1.1.
Cumulative Distribution Function
Normal with mean = 0 and standard deviation = 1
x       P( X <= x )
1.1     0.864334
-1.1    0.135666
P(-1.1 ≤ Z ≤ 1.1) = 0.864334 - 0.135666 = 0.728668
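The effect of the continuity correction in the coin example can be seen by computing the exact binomial probability next to the normal approximation with and without the correction. This is a sketch; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5

# Exact binomial probability P(45 <= X <= 55).
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(45, 56))

# Normal approximation N(np, np(1-p)): mean 50, standard deviation 5.
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
with_cc = approx.cdf(55.5) - approx.cdf(44.5)   # endpoints extended by 0.5
without_cc = approx.cdf(55) - approx.cdf(45)    # no correction

print(f"exact binomial:        {exact:.6f}")
print(f"normal, corrected:     {with_cc:.6f}")
print(f"normal, not corrected: {without_cc:.6f}")
```

The corrected value agrees with the exact one to about four decimal places, while the uncorrected value drops to roughly 0.68 because it leaves out half of the probability mass at 45 and at 55.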
Example
The number of hits on a website follows a Poisson distribution, with a mean of 27 hits per hour. Find the probability that there will be 90 or more hits in three hours.
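A possible approach to the website example is sketched below. Over three hours the count is Poisson with λ = 27 × 3 = 81, which is above 10, so the normal approximation N(81, 81) applies. The code compares that approximation, with and without a continuity correction, to the exact Poisson tail; adding the continuity correction is a choice made for this sketch rather than something the example asks for.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), evaluated in log space for stability."""
    return exp(-lam + x * log(lam) - lgamma(x + 1))

lam = 27 * 3  # expected hits in three hours

# Exact: P(X >= 90) = 1 - P(X <= 89).
exact = 1 - sum(poisson_pmf(x, lam) for x in range(90))

# Normal approximation with mean lam and variance lam.
approx = NormalDist(mu=lam, sigma=sqrt(lam))
no_cc = 1 - approx.cdf(90)       # z = (90 - 81) / 9 = 1
with_cc = 1 - approx.cdf(89.5)   # continuity-corrected lower limit

print(f"exact Poisson:         {exact:.4f}")
print(f"normal, not corrected: {no_cc:.4f}")
print(f"normal, corrected:     {with_cc:.4f}")
```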
Approximations Between Probability Distributions
H: Hypergeometric; B: Binomial; P: Poisson; N: Normal
Example:
A company produces 5000 chip components for computers each day, with a defect rate of 0.65%. The decision to buy or not follows this criterion: of 20 chips tested, if at most 2 are defective the lot is bought; otherwise it is rejected.
Compute the producer's risk using:
a) the hypergeometric distribution
b) the binomial distribution
c) the Poisson distribution
d) the normal distribution as an approximation to the binomial
Exercises
• Suppose that a lot contains 100 items, 5 of
which do not conform to requirements. If 10
items are selected at random without
replacement, then what is the probability of
finding one or fewer nonconforming items in
the sample?
Exercises
• A lightbulb has a normally distributed light
output with mean 5,000 end foot-candles and
standard deviation of 50 end foot-candles.
Find a lower specification limit such that only
0.5 % of the bulbs will not exceed this limit.
84
Exercises
85
Exercises
86
Exercises
87
88