
Introduction to Monte Carlo Method
Kadi Bouatouch
IRISA
Email: [email protected]
Why Monte Carlo Integration?
• To generate realistic looking images, we need to solve integrals of 2 or higher dimension
– Pixel filtering and lens simulation both involve solving a 2-dimensional integral
– Combining pixel filtering and lens simulation requires solving a 4-dimensional integral
• Normal quadrature algorithms don't extend well beyond 1 dimension

Continuous Probability
• A continuous random variable x is a variable that randomly takes on a value from its domain
• The behavior of x is completely described by the distribution of values it takes
• The distribution of values that x takes on is described by a probability distribution function p
– We say that x is distributed according to p, or x~p
– p(x) ≥ 0
– ∫_{−∞}^{+∞} p(x) dx = 1
• The probability that x takes on a value in the interval [a, b] is:
  P(x ∈ [a, b]) = ∫_a^b p(x) dx
Expected Value
• The expected value of x~p is defined as:
  E(x) = ∫ x p(x) dx
– Since a function of a random variable is itself a random variable, the expected value of f(x) is:
  E(f(x)) = ∫ f(x) p(x) dx
• The expected value of a sum of random variables is the sum of the expected values:
  E(x + y) = E(x) + E(y)

Multi-Dimensional Random Variables
• For some space S, we can define a pdf p: S → ℜ
• If x is a random variable and x~p, the probability that x takes on a value in S_i ⊂ S is:
  P(x ∈ S_i) = ∫_{S_i} p(x) dµ
• The expected value of a real-valued function f: S → ℜ extends naturally to the multidimensional case:
  E(f(x)) = ∫_S f(x) p(x) dµ
Monte Carlo Integration
• Suppose we have a function f(x) defined over the domain x ∈ [a, b]
• We would like to evaluate the integral:
  I = ∫_a^b f(x) dx
• The Monte Carlo approach is to consider N samples, selected randomly with pdf p(x), to estimate the integral
• We get the following estimator:
  ⟨I⟩ = (1/N) Σ_{i=1}^N f(x_i)/p(x_i)
• Taking the expected value of the estimator we get:
  E[⟨I⟩] = E[(1/N) Σ_{i=1}^N f(x_i)/p(x_i)]
         = (1/N) Σ_{i=1}^N E[f(x_i)/p(x_i)]
         = (1/N) · N ∫ (f(x)/p(x)) p(x) dx
         = ∫ f(x) dx = I
• In other words:
  lim_{N→∞} (1/N) Σ_{i=1}^N f(x_i)/p(x_i) = I
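
As a minimal Python sketch of this estimator (not part of the original slides), assuming the simplest possible pdf, the uniform density p(x) = 1/(b − a); the integrand and interval in the demo are arbitrary choices:

    import random
    import math

    def mc_integrate(f, a, b, n=100_000):
        # Uniform pdf on [a, b]: p(x) = 1 / (b - a)
        p = 1.0 / (b - a)
        total = 0.0
        for _ in range(n):
            x = random.uniform(a, b)   # sample x_i ~ p
            total += f(x) / p          # accumulate f(x_i) / p(x_i)
        return total / n

    # Demo: the integral of sin(x) over [0, pi] is exactly 2
    print(mc_integrate(math.sin, 0.0, math.pi))  # ~2.0

The estimate fluctuates around 2 and tightens as n grows, in line with the limit statement above.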
Variance
• The variance of the estimator is:
  σ² = (1/N) ∫ (f(x)/p(x) − I)² p(x) dx
• As the error in the estimator is proportional to σ, the error is proportional to 1/√N
– So, to halve the error we need to use four times as many samples

Variance Reduction
• Increase the number of samples
• Choose p such that f/p has low variance (f and p should have similar shape)
– This is called importance sampling, because if p is large when f is large and small when f is small, there will be more samples in the important regions
• Partition the domain of the integral into several smaller regions and evaluate the integral as the sum of the integrals over the smaller regions
– This is called stratified sampling
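
As a toy illustration of importance sampling (an added example, not from the slides), take f(x) = x² on [0, 1]: choosing p(x) = 3x², which has exactly the shape of f, makes f/p constant, so the estimator has zero variance:

    import random

    def estimate_uniform(n=10_000):
        # Uniform pdf p(x) = 1 on [0, 1]
        return sum(random.random() ** 2 for _ in range(n)) / n

    def estimate_importance(n=10_000):
        # Importance pdf p(x) = 3x^2 on [0, 1]; CDF P(x) = x^3,
        # so inversion gives x = xi^(1/3)
        total = 0.0
        for _ in range(n):
            xi = 1.0 - random.random()          # in (0, 1], avoids x = 0
            x = xi ** (1.0 / 3.0)
            total += (x ** 2) / (3.0 * x ** 2)  # f/p = 1/3 for every sample
        return total / n

    print(estimate_uniform())     # ~1/3, with noise
    print(estimate_importance())  # exactly 1/3, no noise

In practice p can only approximate the shape of f (a perfect match would require already knowing the integral), but the closer the match, the lower the variance.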
Variance Reduction: Stratified sampling
• Region D divided into n disjoint sub-regions D_i = [x_i, x_{i+1}], with Σ_i P_i = 1, where P_i = ∫_{D_i} p(x) dx
• p(x)/P_i: pdf for sub-region i
• Splitting the integral over the n sub-regions:
  I = ∫_0^1 f(x) dx
    = ∫_{x_0}^{x_1} (f(x)/p(x)) p(x) dx + ∫_{x_1}^{x_2} (f(x)/p(x)) p(x) dx + ... + ∫_{x_{n−1}}^{x_n} (f(x)/p(x)) p(x) dx
• For each sub-region:
  ∫_{D_i} (f(x)/p(x)) p(x) dx = P_i ∫_{D_i} (f(x)/p(x)) (p(x)/P_i) dx
• Summing over the sub-regions:
  I = Σ_{i=0}^{n−1} P_i ∫_{D_i} (f(x)/p(x)) (p(x)/P_i) dx
• With N_i samples X_{k_i} per sub-region i:
  I ≈ Σ_{i=0}^{n−1} (P_i/N_i) Σ_{k_i=1}^{N_i} f(X_{k_i})/p(X_{k_i})
[Figure: f(x) over [0, 1] partitioned into n sub-regions]
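
A minimal sketch of this stratified estimator (an added example), assuming the uniform base pdf p(x) = 1 on [0, 1] and n equal sub-regions, so that P_i = 1/n:

    import random
    import math

    def stratified_estimate(f, n_strata=100, n_per_stratum=1):
        # Uniform p(x) = 1 on [0, 1], equal strata: P_i = 1/n_strata
        width = 1.0 / n_strata
        total = 0.0
        for i in range(n_strata):
            sub = 0.0
            for _ in range(n_per_stratum):
                x = (i + random.random()) * width  # sample inside stratum i
                sub += f(x) / 1.0                  # f(X)/p(X) with p = 1
            total += (width / n_per_stratum) * sub # (P_i / N_i) weight
        return total

    # Demo: the integral of sin(pi*x) over [0, 1] is 2/pi ~ 0.6366
    print(stratified_estimate(lambda x: math.sin(math.pi * x)))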
Variance Reduction: Stratified sampling
• 2D example: divide the domain into an N × N grid of cells and place one random sample in each cell → N² samples
[Figure: N × N grid with one stratified sample point per cell]
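
A sketch of this 2D stratification (often called jittered sampling), assuming a unit-square domain; the grid size is a demo choice:

    import random

    def jittered_samples(n):
        # One random sample inside each cell of an n x n grid -> n^2 samples
        points = []
        for i in range(n):
            for j in range(n):
                points.append(((i + random.random()) / n,
                               (j + random.random()) / n))
        return points

    print(len(jittered_samples(4)))  # 16 samples, one per cell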
Sampling Random Variables
• Given a pdf p(x), defined over the interval [x_min, x_max], we can sample a random variable x~p from a set of uniform random numbers ξ_i ∈ [0, 1)
• To do this, we need the cumulative probability distribution function:
  prob(α < x) = P(x) = ∫_{x_min}^x p(µ) dµ
– To get x_i we transform ξ_i: x_i = P⁻¹(ξ_i)
– P⁻¹ is guaranteed to exist for all valid pdfs

Example
• Sample the pdf p(x) = 3x²/2, x ∈ [−1, 1]:
– First we need to find P(x): P(x) = ∫ 3x²/2 dx = x³/2 + C
– Choosing C such that P(−1) = 0 and P(1) = 1, we get P(x) = (x³ + 1)/2 and P⁻¹(ξ) = ∛(2ξ − 1)
– Thus, given a random number ξ_i ∈ [0, 1), we can 'warp' it according to p(x) to get x_i: x_i = ∛(2ξ_i − 1)

Sampling 2D Random Variables
• If we have a 2D random variable α = (α_x, α_y) with pdf p(α_x, α_y), we need the two-dimensional cumulative pdf:
  prob(α_x < x & α_y < y) = P(x, y) = ∫_{y_min}^y ∫_{x_min}^x p(µ_x, µ_y) dµ_x dµ_y
– We can choose x_i using the marginal distribution p_G(x) and then choose y_i according to p(y|x), where:
  p_G(x) = ∫_{y_min}^{y_max} p(x, y) dy  and  p(y|x) = p(x, y)/p_G(x)
– If p is separable, that is p(x, y) = q(x) r(y), the one-dimensional technique can be used on each dimension instead
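
A small sketch of this inversion; the pdf and its inverse are exactly those of the example above, and the mean check is an added sanity test (E(x) = ∫ x · 3x²/2 dx = 0 on [−1, 1]):

    import random
    import math

    def sample_p(xi):
        # Invert P(x) = (x^3 + 1)/2  ->  x = cbrt(2*xi - 1)
        y = 2.0 * xi - 1.0
        # copysign handles the cube root of negative values
        return math.copysign(abs(y) ** (1.0 / 3.0), y)

    samples = [sample_p(random.random()) for _ in range(100_000)]
    print(sum(samples) / len(samples))  # ~0.0, the expected value of x~p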
2D example: Uniform Sampling of Triangles
• To uniformly sample a triangle, we use barycentric coordinates in a parametric space
• Let A, B, C be the 3 vertices of a triangle
• Then a point P on the triangle is expressed as:
– P = αA + βB + γC
– α + β + γ = 1
– α = 1 − γ − β
– P = (1 − γ − β)A + βB + γC
[Figure: triangle with vertices A, B, C and point P; parametric space with β, γ ∈ [0, 1] bounded by β + γ = 1]
• Integrating the constant 1 across the triangle gives:
  ∫_{γ=0}^1 ∫_{β=0}^{1−γ} dβ dγ = 0.5
– Thus our pdf is p(β, γ) = 2
• Since β depends on γ (or γ depends on β), we use the marginal density for γ, p_G(γ):
  p_G(γ′) = ∫_0^{1−γ′} 2 dβ = 2 − 2γ′
• From p_G(γ′), we find p(β′|γ′) = p(γ′, β′)/p_G(γ′) = 2/(2 − 2γ′) = 1/(1 − γ′)
• To find γ′ we look at the cumulative pdf for γ:
  ξ₁ = P_G(γ′) = ∫_0^{γ′} p_G(γ) dγ = ∫_0^{γ′} (2 − 2γ) dγ = 2γ′ − γ′²
• Solving for γ′ we get: γ′ = 1 − √(1 − ξ₁)
• We then turn to β′:
  ξ₂ = P(β′|γ′) = ∫_0^{β′} p(β|γ′) dβ = ∫_0^{β′} (1/(1 − γ′)) dβ = β′/(1 − γ′)
• Solving for β′ we get:
  β′ = ξ₂ (1 − γ′) = ξ₂ (1 − (1 − √(1 − ξ₁))) = ξ₂ √(1 − ξ₁)
• Thus, given a set of random numbers ξ₁ and ξ₂, we warp these to a set of barycentric coordinates sampling a triangle:
  (β, γ) = (ξ₂ √(1 − ξ₁), 1 − √(1 − ξ₁))
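
A sketch of this warping applied to concrete vertices (the vertex values in the demo are arbitrary):

    import random
    import math

    def sample_triangle(a, b, c):
        # Warp two uniform numbers into barycentric coordinates:
        # gamma = 1 - sqrt(1 - xi1), beta = xi2 * sqrt(1 - xi1)
        xi1, xi2 = random.random(), random.random()
        s = math.sqrt(1.0 - xi1)
        gamma = 1.0 - s
        beta = xi2 * s
        alpha = 1.0 - beta - gamma
        # P = alpha*A + beta*B + gamma*C, componentwise
        return tuple(alpha * pa + beta * pb + gamma * pc
                     for pa, pb, pc in zip(a, b, c))

    print(sample_triangle((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)))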
2D example: Uniform Sampling of Discs
• Disc of radius R
– Generate a random point on the disk with probability density: p(x) = 1/(πR²)
– φ ∈ [0, 2π] and r ∈ [0, R]
– dµ = dS = r dr dφ
– 2D CDF: F(r, φ) = ∫_0^φ ∫_0^r (r′/(πR²)) dr′ dφ′ = ∫_0^φ ∫_0^r (2r′/R²)(1/(2π)) dr′ dφ′
– Compute the marginal pdf and the associated CDF
– With ζ₁ and ζ₂ ∈ [0, 1] uniform random numbers:
  φ = 2πζ₁ and r = R√ζ₂

2D example: Uniform Sampling of Spheres
• Sampling directions Θ on the hemisphere
[Figure: hemisphere of directions with θ_in ∈ [0, π/2], φ_in ∈ [0, 2π] and differential solid angle dω_{Θin}]
– Θ = (θ, φ) ⇒ dµ = dω_Θ = sin θ · dθ · dφ
– Cosine sampling: p(Θ) = cos θ/π ⇒ θ = acos(√ξ₁) and φ = 2πξ₂
– Cosine-lobe sampling: p(Θ) = ((n+1)/(2π)) cosⁿ θ ⇒ θ = acos(ξ₁^{1/(n+1)}) and φ = 2πξ₂
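
A sketch of both mappings (added for illustration); the disc radius defaults to 1 and the hemisphere sampler returns a unit direction in Cartesian coordinates:

    import random
    import math

    def sample_disc(radius=1.0):
        # phi = 2*pi*z1, r = R*sqrt(z2) -> uniform density 1/(pi*R^2)
        phi = 2.0 * math.pi * random.random()
        r = radius * math.sqrt(random.random())
        return (r * math.cos(phi), r * math.sin(phi))

    def sample_cosine_hemisphere():
        # p(Theta) = cos(theta)/pi -> theta = acos(sqrt(xi1)), phi = 2*pi*xi2
        theta = math.acos(math.sqrt(random.random()))
        phi = 2.0 * math.pi * random.random()
        return (math.sin(theta) * math.cos(phi),
                math.sin(theta) * math.sin(phi),
                math.cos(theta))

    print(sample_disc())
    print(sample_cosine_hemisphere())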
Summary
• Given a function f(µ), defined over an n-dimensional domain S, we can estimate the integral of f over S by a sum:
  ∫_S f(µ) dµ ≈ (1/N) Σ_{i=1}^N f(x_i)/p(x_i)
  where x~p is a random variable and x_i are samples of x selected according to p.
• To reduce the variance and get faster convergence we:
– Use importance sampling: p should have a similar shape to f
– Use stratified sampling: subdivide S into smaller regions, evaluate the integral for each region and sum these together