Download Report

MA3236
Non-linear Programming
AY 2009/2010 Sem 1
NATIONAL UNIVERSITY OF SINGAPORE
MATHEMATICS SOCIETY
PAST YEAR PAPER SOLUTIONS
with credits to Sean Lim Wei Xinq, Zheng Shaoxuan
MA3236
Non-linear Programming
AY 2009/2010 Sem 1
Throughout this paper, for any matrix A, A > 0 denotes A is positive definite, A ≥ 0 denotes A is
positive semidefinite, A < 0 denotes A is negative definite, A ≤ 0 denotes A is negative semidefinite and
A ≈ 0 denotes A is indefinite.
Question 1
(a) We have
4x1 − 2x2 + 6x21 + 4x31
2x2 − 2x1
∇f (x) =
Then, setting ∇f (x) = 0 gives us
2x1 − x2 + 3x21 + 2x31 = 0
(1)
x1 = x2
(2)
From (1) we have
2x31 + 3x21 + x1 = 0
x1 (2x21 + 3x1 + 1) = 0
x1 (2x1 + 1)(x1 + 1) = 0
1
x1 = 0 or x1 = −
or x1 = −1
2
−1/2
−1
0
(3)
(1)
(2)
.
and x∗ =
So the stationary points are x∗ =
, x∗ =
−1/2
−1
0
(b) The Hessian of f is given by
Hf (x) =
4 + 12x1 + 12x21 −2
−2
2
So we have
(1)
Hf (x∗ )
(2)
Hf (x∗ )
(3)
4 −2
−2 2
1 −2
−2 2
4 −2
−2 2
=
=
Hf (x∗ ) =
NUS Math LaTeXify Proj Team
Page: 1 of 11
NUS Mathematics Society
MA3236
Non-linear Programming
(1)
AY 2009/2010 Sem 1
(3)
(1)
(3)
For Hf (x∗ ) and Hf (x∗ ), we have ∆1 = ∆2 = 4 > 0, and hence, Hf (x∗ ) = Hf (x∗ ) > 0,
(1)
(3)
(2)
which implies that x∗ and x∗ are strict local minimizers. Then, det(Hf (x∗ )) < 0 implies
(2)
(2)
Hf (x∗ ) ≈ 0, which implies that x∗ is a saddle point.
(c) We can write f as
f (x) = xT Qx + x41 + 2x31
where
Q=
2 −1
−1 1
Then, note that Q is symmetric positive definite. So if λmin (Q) > 0 is the smallest eigenvalue of
Q, we have
f (x) ≥ λmin (Q)||x||2 + x41 + 2x31
From here, we can observe that f is coercive. Hence, f has a global minimizer but no global
maximizer.
Question 2
Since f is continuous and coercive, it follows that for all K > 0, we can find a k > 0 such that if
||x|| > k, then f (x) > K. Let K > 0 and x0 ∈ D such that K > f (x0 ). Now consider the set
S := {x ∈ D | ||x|| ≤ k}
Note that S is non-empty and bounded. Also, S is closed by definition of S. Hence, by Weierstrass
Theorem, there is a global minimizer in S, i.e. there is an x∗ ∈ S such that
f (x∗ ) ≤ f (x)
for all x ∈ S
Now if x ∈
/ S, then we have
f (x∗ ) ≤ K < f (x)
which shows that x∗ is also a global minimizer in D.
Question 3
(a) For any x ∈ Rq we have, for any B ∈ Rn×q we have
xT BT ABx = (Bx)T A(Bx) ≥ 0
since A is positive definite. This shows that BT AB is positive semidefinite.
To have the fact that BT AB > 0, we require that Bx 6= 0 whenever x 6= 0. This will happen
if N (B) = {0}, where N denotes the nullspace of B. Hence, Ak must be positive definite since
Ak = ITn×k AIn×k , where In×k = (bij ) where
(
1 if i = j
bij =
0 otherwise
NUS Math LaTeXify Proj Team
Page: 2 of 11
NUS Mathematics Society
MA3236
Non-linear Programming
AY 2009/2010 Sem 1
and N (In×k ) = {0} for all k = 1, · · · , n.
(b) Let P be an orthogonal matrix such that A = PT DP for some diagonal matrix D with its diagonal
entries λ1 , · · · , λn as the eigenvalues of A and Pk = PIn×k where In×k is defined as in part (a) so
that Ak = PTk DPk . If µ is an eigenvalue of Ak and v its corresponding eigenvector, then we have
µv = Ak v
T
v µv = vT Ak v
= vT PTk DPk v
= xT Dx
for some vector x ∈ Rn . Then we have
2
µ||v|| =
n
X
λi x2i
i=1
≤
n
X
λmax x2i = λmax ||x||2 = λmax ||v||2
i=1
Hence we have µ ≤ λmax . Similarly, we have
µ||v||2 =
n
X
λi x2i
i=1
≥
n
X
λmin x2i = λmin ||x||2 = λmin ||v||2
i=1
So µ ≥ λmin and this completes the proof.
(c) By Taylor’s Theorem, there exists w ∈ [x, xk ] such that
f (x) = f (xk ) + h∇f (xk ), x − xk i +
1
hHf (w), x − xk i
2
for all k = 1, · · · , p. Since f is convex, Hf (x) ≥ 0 and hence, hHf (w), x − xk i ≥ 0 and therefore
f (x) ≥ f (xk ) + h∇f (xk ), x − xk i
∀x ∈ S
∀1 ≤ k ≤ p
Hence, we have
f (x) ≥ max{Lk (x) := f (xk ) + h∇f (xk ), x − xk i | 1 ≤ k ≤ p}
∀x ∈ S
Question 4
(a) (Correction in paper during day of exam: “Width is twice the length” instead of “Length is twice
the width”)
Let x1 , x2 and x3 denote the length, width and height of the box respectively. Now since the width
is twice the length, we have 2x1 = x2 . Then the volume of the box, V (x) is given by
V (x) = x1 x2 x3 = 2x21 x3
NUS Math LaTeXify Proj Team
Page: 3 of 11
NUS Mathematics Society
MA3236
Non-linear Programming
AY 2009/2010 Sem 1
Since it is not necessary to use all the wire and paper, we have inequality constraints instead of
equality. For the wire, we have
4x1 + 4x2 + 4x3 − 20 = 4x1 + 8x1 + 4x3 − 20 = 12x1 + 4x3 − 20 ≤ 0
Dividing by 4 gives us
h1 (x) := 3x1 + x3 − 5 ≤ 0
Next we consider the paper, which is the total surface area. We have
2x1 x2 + 2x1 x3 + 2x2 x3 − 16 = 2x1 (2x1 ) + 2x1 x3 + 2(2x1 )x3 − 16 = 4x21 + 6x1 x3 − 16 ≤ 0
Dividing by 2 gives us
h2 (x) := 2x21 + 3x1 x3 − 8 ≤ 0
Next, since x1 and x3 represent the length and height of the box respectively, we require that they
are non-negative, i.e. x1 , x3 ≥ 0.
The objective function, if we are translating the problem into a minimization problem is the
negation of the volume, i.e.
f (x) := −V (x) = −2x21 x3 ,
x ∈ R2
Hence our optimization problem is
min f (x)
s.t. h1 (x) ≤ 0
h2 (x) ≤ 0
x1 , x3 ≥ 0
(b) The feasible set S is given by
S := {x ∈ R2 | h1 (x) ≤ 0, h2 (x) ≤ 0, x1 , x3 ≥ 0}
To show that all feasible points are regular, we consider 4 cases. Define the set R(x) by
R(x) = {∇hi (x)}
where hi (x) = 0, i = 1, 2. Now we have
3
1
4x1 + 3x3
3x1
∇h1 (x) =
∇h2 (x) =
Case 1: J(x) = ∅. Then R(x) = ∅ which is linearly independent by definition.
Case 2: J(x) = {1}. Then R(x) = {∇h1 (x)}, and ∇h1 (x) 6= 0 for all x ∈ R2 . So the set is linearly
independent.
Case 3: J(x) = {2}. Then R(x) = {∇h2 (x)}, and ∇h2 (x) = 0 if and only if x = 0. But J(x) = {2}
implies that h2 (x) = 0 ⇒ −8 = 0, a contradiction. So the set is always linearly independent.
Case 4: J(x) = {1, 2}. Then R(x) = {∇h1 (x), ∇h2 (x)}. Now consider the matrix
NUS Math LaTeXify Proj Team
Page: 4 of 11
NUS Mathematics Society
MA3236
Non-linear Programming
R(x) :=
3 4x1 + 3x3
1
3x1
AY 2009/2010 Sem 1
Then, the set R(x) is linearly independent if and only if the matrix R(x) is invertible. Now
det(R(x)) = 5x1 − 3x3 = 0 if and only if x1 = 53 x3 . But this implies
h1 (x) = 3
3
x3
5
+ x3 − 5 = 0
14
x3 = 5
5
25
x3 =
14
Then, we have x1 =
15
14 .
But this gives us
h2 (x) = 2
15
14
2
+3
15
14
25
14
− 8 6= 0
which cannot happen since J(x) = {1, 2}.
So in all cases, we have shown that x ∈ S must be regular.
−4x1 x3
(c) We have ∇f (x) =
. Hence the KKT conditions are
−2x21
−4x1 x3
3
4x1 + 3x3
+ µ2
+ µ1
= 0,
1
−2x21
3x1
µ1 , µ2 ≥ 0
Case 1: J(x) = ∅. Then µ1 = µ2 = 0 and we have
−4x1 x3 = 0
−2x21 = 0
But this implies x1 = x3 = 0, which contradicts our assumption that x1 , x3 > 0.
Case 2: J(x) = {1}. Then µ2 = 0 and µ1 > 0 and we have
−4x1 x3 + 3µ1 = 0
−2x21
(3)
+ µ1 = 0
(4)
3x1 + x3 − 5 = 0
(5)
Now (4) implies µ1 = 2x21 . Then, from (3) we have −4x1 x3 + 6x21 = 0 ⇒ x1 (−4x3 + 6x1 ) = 0 ⇒
x1 = 0 or 3x1 = 2x3 .
If x1 = 0, then from (3) we have µ1 = 0, a contradiction!
If 3x1 = 2x3 , then from (5) we have x3 = 5/3, which then gives us x1 = 10/9. Then, µ1 = 200/81.
But observe that
2
10
10
5
2
10/9
h2
=2
+3
−8=
6≤ 0
5/3
9
9
3
81
Hence it is NOT feasible.
NUS Math LaTeXify Proj Team
Page: 5 of 11
NUS Mathematics Society
MA3236
Non-linear Programming
AY 2009/2010 Sem 1
Case 3: J(x) = {2}. Then µ1 = 0 and µ2 > 0 and we have
−4x1 x3 + 4µ2 x1 + 3µ2 x3 = 0
−2x21 + 3x1 µ2
2x21 + 3x1 x3 − 8
(6)
=0
(7)
=0
(8)
Here, (7) implies x1 (−2x1 + 3µ2 ) = 0 which implies that either x1 = 0 or 3µ2 = 2x1 . But x1 = 0
implies −8 = 0, a contradiction.
But 3µ2 = 2x1 implies from (6) that
8
−2x1 x3 + x21 = 0
3
(9)
√
√
Then, 2(8) + 3(9) implies 12x21 = 16 ⇒ x21 = 4/3 ⇒ x1 = 2/ 3. This gives us µ2 = 4/3 3 and
from (8) we have
√
8
+ 4 3x3 − 8 = 0
3
16 1
4
√ = √
x3 =
3 4 3
3 3
Case 4: J(x) = {1, 2}. Then µ1 , µ2 > 0 and we have
−4x1 x3 + 3µ1 + 4µ2 x1 + 3µ2 x3 = 0
(10)
−2x21 + µ1 + 3µ2 x1 = 0
(11)
3x1 + x3 − 5 = 0
(12)
+ 3x1 x3 − 8 = 0
(13)
2x21
Now from (12) we have x3 = 5 − 3x1 , and from (13) we have
2x21 + 3x1 (5 − 3x1 ) − 8 = 0
−7x21 + 15x1 − 8 = 0
7x21 − 15x1 + 8 = 0
(7x1 − 8)(x1 − 1) = 0
8
x1 = , 1
7
If x1 = 1 then x3 = 2 and we have from (10) and (11)
3µ1 + 10µ2 = 8
µ1 + 3µ2 = 2
Solving the simultaneous equation yields µ1 = −4, µ2 = 2, a contradiction since µ1 > 0.
If x1 = 8/7, then x3 = 11/7 and we have from (10) and (11)
65
352
µ2 =
7
49
24
128
µ1 + µ2 =
7
49
3µ1 +
Solving the simultaneous equation yields µ1 = 128/343, µ2 = 32/49. Check again that
is feasible.
Hence the KKT points are
NUS Math LaTeXify Proj Team
x∗1
=
8/7
11/7
√ 2/ √3
8/7
∗
and x2 =
.
11/7
4/3 3
Page: 6 of 11
NUS Mathematics Society
MA3236
Non-linear Programming
AY 2009/2010 Sem 1
(d) Let the sets S1 and S2 be defined as
S1 := {x ∈ R2 | h1 (x) ≤ 0, x1 ≥ 0, x3 ≥ 0}
S2 := {x ∈ R2 | h2 (x) ≤ 0}
Now observe that S1 is closed and bounded and S2 is closed since h2 is continuous, and that
S = S1 ∩ S2 . Since S1 is bounded, S ⊆ S1 implies that S is also bounded, and S is closed since the
intersection of closed sets is closed. So by Weierstrass Theorem, there is a global minimizer. Since
all feasible points are regular, it follows that every global minimizer is a local minimizer, and by
the KKT necessary conditions, every global min is a KKT point. Hence we simply evaluate
32
f (x∗1 ) = − √ = −2.053
9 3
1408
f (x∗2 ) = −
= −4.105
343
So the global minimizer is x∗2 .
Question 5
(a) Since it is a convex programming problem, a KKT point is also a global minimizer. Let h1 (x) :=
x21 + 4x22 − 4 + , h2 (x) = −x1 and h3 (x) = −x2 . Now we have
−2
∇f (x) =
−4
2x1
∇h1 (x) =
8x2
−1
∇h2 (x) =
0
0
∇h3 (x) =
−1
So the KKT conditions are
−1
0
2x1
−2
=0
+ µ3
+ µ1
+ µ2
0
−1
8x2
−4
where µ1 , µ2 , µ3 ≥ 0.
Case 1: J(x) = ∅. Then µ1 = µ2 = µ3 = 0 and hence,
−2
−4
= 0, a contradiction.
Case 2: J(x) = {1}. Then µ1 > 0 and µ2 = µ3 = 0 and hence,
−2 + 2µ1 x1 = 0
(14)
−4 + 8µ1 x2 = 0
(15)
x21 + 4x22 − 4 + = 0
(16)
x1 , x2 > 0
NUS Math LaTeXify Proj Team
Page: 7 of 11
(17)
NUS Mathematics Society
MA3236
Non-linear Programming
AY 2009/2010 Sem 1
Now (14) implies that x1 = 1/µ1 and (15) implies that x2 = 1/2µ1 . Substituting them into (16)
yields
1
1
+
−4+=0
µ21 µ21
2
=4−
µ21
2
µ21 =
4−
r
2
µ1 =
4−
Then we have
r
x1 =
√
4−
2
and x2 =
4−
√
2 2
From here, we have a KKT point and hence a global minimizer. If there is another KKT point,
then their function values will be the same. So we have
p
(4 − )/2
∗
√
√
x () =
4 − /2 2
with the multipliers µ∗1 () =
p
2/(4 − ), µ∗2 () = µ∗3 () = 0.
(b) Let F () = f (x∗ ()). Then
r
F () = −2
Then,
r
√
4−
4−
4−
− 4 √ = −4
2
2
2 2
√
1
−2 2
·−
F () = √
2
4−
0
by the chain rule. Simplifying gives us
r
0
F () =
2
= µ∗1 ()
4−
as required.
Question 6
(a) We are given
f (x) = 4x21 + 4x22 − 2x1 x2 + x23
Observe that
f (x) = xT
where
1
1
Q x = xT Qx
2
2


8 −2 0
Q =  −2 8 0 
0
0 2
(You may want to check that Q is indeed symmetric positive definite)
NUS Math LaTeXify Proj Team
Page: 8 of 11
NUS Mathematics Society
MA3236
Non-linear Programming
AY 2009/2010 Sem 1
(b) By Proposition 5.1(a) we have
h∇q(xk ), ∇q(xk )i2
E(xk+1 )
=1−
E(xk )
h∇q(xk ), Q∇q(xk )i h∇q(xk ), Q−1 ∇q(xk )i
Since f is a quadratic function, then its quadratic Taylor expansion, q is equal to the function
itself. Also, by definition of E(xk ) and dk we have
f (xk+1 ) − f (x∗ )
||dk ||4
=
1
−
f (xk ) − f (x∗ )
hdk , Qdk i (∇q(xk )T Q−1 ∇q(xk ))
Now ∇f (x∗ ) = 0 implies x∗ = 0, and hence, f (x∗ ) = 0, and so we have
f (xk+1 )
||dk ||4
=1−
f (xk )
hdk , Qdk i (∇q(xk )T Q−1 ∇q(xk ))
Next, we multiply both sides by f (xk ) to obtain
f (xk+1 ) = f (xk ) −
||dk ||4 (f (xk ))
hdk , Qdk i (∇q(xk )T Q−1 ∇q(xk ))
Then, by definition of f and ∇q(xk ) we have
f (xk+1 ) = f (xk ) −
||dk ||4 (xTk Qxk )
1
2 hdk , Qdk i xTk QT Q−1 Qxk
Since Q is symmetric, it follows that QT = Q and canceling from numerator and denominator
gives us
1 ||dk ||4
f (xk+1 ) = f (xk ) −
2 hdk , Qdk i
as required.
(c) We first determine the eigenvalues of Q. We have
8 − λ −2
0
−2 8 − λ
0 = 0
0
0
2 − λ
8 − λ −2 =0
(2 − λ) −2 8 − λ
λ=2
or
(8 − λ)2 − 4 = 0
λ = 2, 6, 10
Now since κ(Q) = λmax (Q)/λmin (Q) = 10/2 = 5 is large, the convergence rate, ρ is given by
ρ(Q) = 1 −
1
4
=
κ(Q)
5
Here, hence, in the worst case, the number of iterations, k is given by
log k=
+1
log ρ(Q)
It is given that = 10−8 , so k = 12.
NUS Math LaTeXify Proj Team
Page: 9 of 11
NUS Mathematics Society
MA3236
Non-linear Programming
AY 2009/2010 Sem 1
Question 7
(a)
(i) Consider the figure below.
(ii) Consider the figure below.
(b) The Lagrangian function, L(x, µ) is given by
L(x, µ) = −2x1 − x2 + µ(x21 + 4x22 − 2x1 x2 − 4)
= µx21 + 4µx22 − 2µx1 x2 − 2x1 − x2 − 4µ
where µ ≥ 0 and x ∈ R2 .
Now we want to determine
θ(µ) = inf L(x, µ)
x∈R2
Hence we have
2µx1 − 2µx2 − 2
Lx (x, µ) =
8µx2 − 2µx1 − 1
Setting Lx = 0 gives us
µx1 − µx2 − 1 = 0
(18)
−2µx1 + 8µx2 − 1 = 0
(19)
Taking (19) + 2(18) yields
6µx2 − 3 = 0 ⇒ x2 =
NUS Math LaTeXify Proj Team
Page: 10 of 11
1
2µ
NUS Mathematics Society
MA3236
Non-linear Programming
AY 2009/2010 Sem 1
Note here that µ 6= 0 since that would imply that −1 = 0, a contradiction! Then, we have
x1 = 3/2µ. Hence we obtain
9
1
3
3
1
+ −
− −
− 4µ
4µ µ 2µ µ 2µ
7
=−
− 4µ
4µ
θ(µ) =
So the dual problem is
max θ(µ) = −
7
− 4µ
4µ
s.t. µ > 0
Then, we have
√
7
7
7
2
⇒ µ∗ =
θ (µ∗ ) = 2 − 4 = 0 ⇒ µ∗ =
4µ∗
16
4
0
(c) The Lagrangian function is given by
1
L(x, λ, µ) = xT Qx + cT x + λT (b − Ax) − µT x
2
Now we want to find the infimum of L for µ ≥ 0. Observe that L is convex, hence, the minimizer
is obtained at the stationary point. So we have
Lx = Qx + c − AT λ − µ
Rewriting L gives us
1
L(x, λ, µ) = xT Qx + bT λ + (c − AT λ − µ)T x
2
Setting Lx = 0 gives us
−Qx = c − AT λ − µ
and hence we have
θ(λ, µ) = inf L(x, λ, µ)
x∈X
1
= xT Qx + bT λ − xT Qx
2
1
= − xT Qx + bT λ
2
So the dual problem is
1
− xT Qx + bT λ
2
s.t. Qx + c − AT λ − µ = 0
max
µ≥0
as desired.
NUS Math LaTeXify Proj Team
Page: 11 of 11
NUS Mathematics Society