2. OPTIMISATION & LAGRANGE METHOD

We already know how to find the maxima and minima of functions of one variable.
We will now look at the analogous problem for functions of two (or more) variables.
We usually call such points critical points for functions of one variable, and
stationary points for functions of two variables. We start by reviewing critical points:
(2.1) CRITICAL POINTS (Functions of 1 variable)
Given a function (of one variable) $f(x)$, we can find its critical points (i.e.
local maxima, local minima, points of inflection, etc.) by solving the equation
$$\frac{df}{dx} = 0. \qquad (2 \cdot 1)$$
To classify this (or these) critical point(s), we simply calculate $f''(x)$ and if
a) $f''(x) < 0$ then $x$ is a Maximum,
b) $f''(x) > 0$ then $x$ is a Minimum, $\qquad (2 \cdot 2)$
c) $f''(x) = 0$ then the test is ‘Inconclusive’.
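The one-variable test can be sketched in a few lines of Python. This is only an illustration: the function name `classify_critical_point`, the tolerance `tol` and the sample function $f(x) = x^3 - 3x$ are assumptions made here, not part of the notes.

```python
def classify_critical_point(second_derivative, tol=1e-12):
    """Apply the second-derivative test at a critical point."""
    if second_derivative < -tol:
        return "maximum"
    if second_derivative > tol:
        return "minimum"
    return "inconclusive"


def f_second(x):
    """f''(x) for the illustrative choice f(x) = x**3 - 3*x."""
    return 6.0 * x


# f'(x) = 3*x**2 - 3 vanishes at x = -1 and x = +1.
for x0 in (-1.0, 1.0):
    print(x0, classify_critical_point(f_second(x0)))
```

Here $x = -1$ is reported as a maximum and $x = +1$ as a minimum, in line with the signs of $f''$ at those points.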
We can also find and classify stationary points for functions of two variables:
(2.2) STATIONARY POINTS (Functions of 2 variables)
Given a function (of two variables) $f(x, y)$, we can find its stationary points
(i.e. local maxima, local minima, saddle points, etc.) by solving the equations
$$\frac{\partial f}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} = 0. \qquad (2 \cdot 3)$$
To classify this (or these) stationary point(s), we calculate $f_{xx} f_{yy} - f_{xy}^2$. If
a) $f_{xx} f_{yy} - f_{xy}^2 < 0$ then $(x, y)$ is a Saddle point,
b) $f_{xx} f_{yy} - f_{xy}^2 > 0$ then $(x, y)$ is a Max./Min. point, $\qquad (2 \cdot 4)$
c) $f_{xx} f_{yy} - f_{xy}^2 = 0$ then the test is ‘Inconclusive’.
In case a) we immediately conclude that the stationary point is a saddle
point.
As the name suggests, a saddle point looks like a maximum in some directions and
like a minimum in others. Viewed from behind the horse, the seat of a saddle looks
like a maximum (the saddle arches over the horse’s back); viewed from the side,
it looks like a minimum (the saddle dips between its front and its rear).
In case b) we need to test whether the stationary point is a maximum or a minimum.
If
i) $f_{xx} < 0$ and $f_{yy} < 0$ then $(x, y)$ is a Maximum,
ii) $f_{xx} > 0$ and $f_{yy} > 0$ then $(x, y)$ is a Minimum. $\qquad (2 \cdot 5)$
In case c) the test is inconclusive, and we have no choice but to admit defeat.
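The two-variable test (cases a, b, c and the sub-cases i, ii) can be sketched as a small Python helper. The name `classify_stationary_point`, the tolerance and the three sample functions in the comments are assumptions chosen for illustration.

```python
def classify_stationary_point(fxx, fyy, fxy, tol=1e-12):
    """Classify a stationary point from its second partial derivatives."""
    discriminant = fxx * fyy - fxy ** 2
    if discriminant < -tol:
        return "saddle"
    if discriminant > tol:
        # fxx and fyy share a sign whenever the discriminant is positive,
        # so checking fxx alone is enough here.
        return "maximum" if fxx < 0 else "minimum"
    return "inconclusive"


print(classify_stationary_point(2.0, 2.0, 0.0))   # f = x**2 + y**2 at (0, 0)
print(classify_stationary_point(2.0, -2.0, 0.0))  # f = x**2 - y**2 at (0, 0)
print(classify_stationary_point(0.0, 0.0, 0.0))   # f = x**4 + y**4 at (0, 0)
```

The third call shows why case c) is called inconclusive: $x^4 + y^4$ clearly has a minimum at the origin, but the test cannot detect it.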
Note: We will derive each of the above results later on (sections 2.3 and 2.4),
using Taylor series expansions.
In the meantime, however, we will get our feet wet with some basic examples.
Example 1 Find and classify every stationary point of the following function:
$$f(x, y) = 2x^3 - 2y^3 - 3ax^2 + 3by^2 + 100,$$
and evaluate the function at each of them (assuming that both $a$ and $b$ are positive).
First of all we have to find the stationary points by letting $f_x = 0$ and
$f_y = 0$:
$$f_x = 2(3x^2) - 3a(2x) = 6(x^2 - ax) = 6x(x - a),$$
$$f_y = -2(3y^2) + 3b(2y) = -6(y^2 - by) = -6y(y - b).$$
We find the stationary points by solving the resulting simultaneous equations:
x (x − a) = 0, (1)
y (y − b) = 0. (2)
We find that there are four stationary points: (0, 0), (0, b), (a, 0) and (a, b).
Now, we have to classify the stationary points. Calculating $f_{xx}$, $f_{yy}$ and $f_{xy}$:
$$f_{xx} = 12x - 6a, \qquad f_{yy} = -12y + 6b, \qquad f_{xy} = 0,$$
we can then calculate $f_{xx} f_{yy} - f_{xy}^2$ and, in turn, classify these stationary
points:
$$\begin{aligned}
f_{xx} f_{yy} - f_{xy}^2 &= (12x - 6a)(-12y + 6b) - (0)^2, \\
&= 6^2 (2x - a)(-2y + b) - 0, \\
&= -36 (2x - a)(2y - b).
\end{aligned}$$
Unfortunately, we have to check each of the four stationary points individually:
For the point $(0, 0)$, with $f(0, 0) = 100$:
$f_{xx} f_{yy} - f_{xy}^2 = -36ab$, so $(0, 0)$ is a ‘Saddle point’.
For the point $(a, 0)$, with $f(a, 0) = 100 - a^3$:
$f_{xx} f_{yy} - f_{xy}^2 = +36ab$, so $(a, 0)$ is a ‘Max/Minimum’,
$f_{xx} = +6a$, $f_{yy} = +6b$, so $(a, 0)$ is a ‘Minimum’.
For the point $(0, b)$, with $f(0, b) = 100 + b^3$:
$f_{xx} f_{yy} - f_{xy}^2 = +36ab$, so $(0, b)$ is a ‘Max/Minimum’,
$f_{xx} = -6a$, $f_{yy} = -6b$, so $(0, b)$ is a ‘Maximum’.
For the point $(a, b)$, with $f(a, b) = 100 - a^3 + b^3$:
$f_{xx} f_{yy} - f_{xy}^2 = -36ab$, so $(a, b)$ is a ‘Saddle point’.
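As a numerical spot-check of Example 1, the sketch below plugs in the sample values $a = 2$ and $b = 3$ (an arbitrary choice made here; any positive values would do) and classifies the four stationary points. The helper names are hypothetical.

```python
def f(x, y, a, b):
    """The function of Example 1."""
    return 2 * x**3 - 2 * y**3 - 3 * a * x**2 + 3 * b * y**2 + 100


def second_derivatives(x, y, a, b):
    """fxx, fyy, fxy of Example 1, as computed in the text."""
    return 12 * x - 6 * a, -12 * y + 6 * b, 0.0


def classify(fxx, fyy, fxy):
    discriminant = fxx * fyy - fxy ** 2
    if discriminant < 0:
        return "saddle"
    if discriminant > 0:
        return "maximum" if fxx < 0 else "minimum"
    return "inconclusive"


a, b = 2.0, 3.0  # sample positive values, chosen only for illustration
results = {}
for point in [(0.0, 0.0), (a, 0.0), (0.0, b), (a, b)]:
    kind = classify(*second_derivatives(*point, a, b))
    results[point] = (kind, f(*point, a, b))
    print(point, kind, results[point][1])
```

With these values the function takes the values 100, 92, 127 and 119 at the four points, matching $100$, $100 - a^3$, $100 + b^3$ and $100 - a^3 + b^3$.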
Note(1): Finding some of the stationary points is almost always very easy,
but being sure we have found each and every one of them can be pretty tricky.
Note(2): Although we usually refer to stationary points as maxima, minima, etc. – and we will continue to do so – it is (strictly speaking) more
accurate to refer to them as the points at which the function is maximised,
minimised, etc.
Example 2 Find and classify every stationary point of the following function:
$$f(x, y) = x^3 + 3xy^2 - 6xy + 1.$$
First of all we have to find the stationary points by letting $f_x = 0$ and
$f_y = 0$:
$$f_x = 3x^2 + 3(1)y^2 - 6(1)y + 0 = 3x^2 + 3y^2 - 6y = 3(x^2 + y^2 - 2y),$$
$$f_y = 0 + 3x(2y) - 6x(1) + 0 = 6xy - 6x = 6x(y - 1).$$
However, we have a snag. Solving the resulting set of simultaneous equations:
$$x^2 + y^2 - 2y = 0, \qquad (1)$$
$$x(y - 1) = 0, \qquad (2)$$
seems much more difficult now, since equation (1) cannot be easily factorised,
although (2) can. We can get around this problem by substituting the (easily
found) solutions of the simpler equation into the more complicated equation.
The solutions of the simpler equation, $x(y - 1) = 0$, are $x = 0$ and $y = 1$:
For $x = 0$: $(0)^2 + y^2 - 2y = 0$, so $y(y - 2) = 0$, giving $y = 0, 2$.
For $y = 1$: $x^2 + (1)^2 - 2(1) = 0$, so $(x - 1)(x + 1) = 0$, giving $x = 1, -1$.
And so, we end up with four stationary points: $(0, 0)$, $(0, 2)$, $(1, 1)$ and $(-1, 1)$.
Note: These kinds of coupled/simultaneous equations can potentially be
very tricky. We are used to straight-forward linear coupled equations, of the
form:
ax + by = A,
cx + dy = D,
which are trivially solved – by Gauss-Elimination for example – but solving these less straight-forward simultaneous equations is (generally) more involved.
Classifying the corresponding stationary points, however, is always very easy.
Now, we have to classify the stationary points. Calculating $f_{xx}$, $f_{yy}$ and $f_{xy}$:
$$f_{xx} = 6x, \qquad f_{yy} = 6x, \qquad f_{xy} = 6(y - 1),$$
we can then calculate $f_{xx} f_{yy} - f_{xy}^2$ and, in turn, classify these stationary
points:
$$f_{xx} f_{yy} - f_{xy}^2 = 36\left[x^2 - (y - 1)^2\right].$$
As before, we have to check each of these four stationary points individually:
For the point $(0, 0)$, with $f(0, 0) = +1$:
$f_{xx} f_{yy} - f_{xy}^2 = -36$, so $(0, 0)$ is a ‘Saddle point’.
For the point $(0, 2)$, with $f(0, 2) = +1$:
$f_{xx} f_{yy} - f_{xy}^2 = -36$, so $(0, 2)$ is a ‘Saddle point’.
For the point $(+1, 1)$, with $f(+1, 1) = -1$:
$f_{xx} f_{yy} - f_{xy}^2 = +36$, so $(1, 1)$ is a ‘Max/Minimum’,
$f_{xx} = +6$, $f_{yy} = +6$, so $(1, 1)$ is a ‘Minimum’.
For the point $(-1, 1)$, with $f(-1, 1) = +3$:
$f_{xx} f_{yy} - f_{xy}^2 = +36$, so $(-1, 1)$ is a ‘Max/Minimum’,
$f_{xx} = -6$, $f_{yy} = -6$, so $(-1, 1)$ is a ‘Maximum’.
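When the second derivatives are awkward to compute by hand, they can be estimated numerically. The sketch below is not the method used in these notes: it swaps in central finite differences (with a step size `h` and tolerance `tol` chosen here by assumption) to re-check the classification of the four points of Example 2.

```python
def f(x, y):
    """The function of Example 2."""
    return x**3 + 3 * x * y**2 - 6 * x * y + 1


def second_derivatives(func, x, y, h=1e-3):
    """Estimate fxx, fyy, fxy by central finite differences."""
    f0 = func(x, y)
    fxx = (func(x + h, y) - 2 * f0 + func(x - h, y)) / h**2
    fyy = (func(x, y + h) - 2 * f0 + func(x, y - h)) / h**2
    fxy = (func(x + h, y + h) - func(x + h, y - h)
           - func(x - h, y + h) + func(x - h, y - h)) / (4 * h**2)
    return fxx, fyy, fxy


def classify(fxx, fyy, fxy, tol=1e-6):
    discriminant = fxx * fyy - fxy ** 2
    if discriminant < -tol:
        return "saddle"
    if discriminant > tol:
        return "maximum" if fxx < 0 else "minimum"
    return "inconclusive"


for point in [(0.0, 0.0), (0.0, 2.0), (1.0, 1.0), (-1.0, 1.0)]:
    print(point, classify(*second_derivatives(f, *point)))
```

The numerical estimates reproduce the hand classification: two saddle points, a minimum at $(1, 1)$ and a maximum at $(-1, 1)$.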
Example 3 Show, in the usual way, that the function (of two variables)
below:
$$f(x, y) = x^3 + y^3 - 2x^2 - 2y^2 + 3xy,$$
has stationary points at $(0, 0)$ and $(1/3, 1/3)$, and investigate the nature of these
points.
First of all we have to find the stationary points. Letting $f_x = 0$ and $f_y = 0$:
$$f_x = 3x^2 + 0 - 2(2x) - 0 + 3(1)y = 3x^2 - 4x + 3y = x(3x - 4) + 3y,$$
$$f_y = 0 + 3y^2 - 0 - 2(2y) + 3x(1) = 3y^2 - 4y + 3x = y(3y - 4) + 3x,$$
which results in the (even more) difficult set of simultaneous equations below:
$$x(3x - 4) + 3y = 0, \qquad (1)$$
$$y(3y - 4) + 3x = 0. \qquad (2)$$
We need to exploit the symmetry in the above equations in order to solve
them.
Important Note: An important property of symmetric functions is that
the derivatives $f_x$ and $f_y$ also exhibit symmetry, as we saw in the previous
section on Partial Differentiation. In this case, these equations have the
symmetry:
$$y = x, \qquad (2 \cdot 6)$$
since swapping $x$ for $y$, and $y$ for $x$, in either equation gives us the other one!
This being the case, we can replace $y$ with $x$ in both of the original equations:
$$x(3x - 4) + 3(x) = 0 \ \text{ and } \ (x)(3(x) - 4) + 3x = 0 \quad \Rightarrow \quad x(3x - 1) = 0,$$
and we end up with the pair of solutions $(x, y) = (0, 0)$ and $(x, y) = (1/3, 1/3)$,
since $x = 0$ and $x = 1/3$ give $y = 0$ and $y = 1/3$, respectively.
(Symmetry alone does not rule out solutions with $y \neq x$. Subtracting (2) from (1)
gives $(x - y)(3x + 3y - 7) = 0$, and the branch $x + y = 7/3$ leads to
$3x^2 - 7x + 7 = 0$, which has no real roots, so here every real solution does lie on
$y = x$.) Now, we have to classify the stationary points. Calculating $f_{xx}$, $f_{yy}$
and $f_{xy}$:
$$f_{xx} = 6x - 4, \qquad f_{yy} = 6y - 4, \qquad f_{xy} = 3,$$
we can then calculate $f_{xx} f_{yy} - f_{xy}^2$ and, in turn, classify these stationary
points:
$$\begin{aligned}
f_{xx} f_{yy} - f_{xy}^2 &= (6x - 4)(6y - 4) - (3)^2, \\
&= 2^2 (3x - 2)(3y - 2) - 9, \\
&= 4 (3x - 2)(3y - 2) - 9.
\end{aligned}$$
And once again, we have to check both of these stationary points individually:
For the point $(0, 0)$, with $f(0, 0) = 0$:
$f_{xx} f_{yy} - f_{xy}^2 = +7$, so $(0, 0)$ is a ‘Max/Minimum’,
$f_{xx} = -4$, $f_{yy} = -4$, so $(0, 0)$ is a ‘Maximum’.
For the point $(1/3, 1/3)$, with $f(1/3, 1/3) = -1/27$:
$f_{xx} f_{yy} - f_{xy}^2 = -5$, so $(1/3, 1/3)$ is a ‘Saddle point’.
Example 4 (Area & Volume) An open rectangular five-sided box has a
volume of $4\,\mathrm{m}^3$ (four cubic metres). Prove that the dimensions of the box
are 2m × 2m × 1m, if the external surface area of the box is at a minimum.
If the box is open at the top, and its dimensions are $x$ = length, $y$ =
breadth and $z$ = height, then the Area $A$ and the Volume $V$ of the box can
be written:
$$\text{Area: } A(x, y, z) = 2(x + y)z + xy,$$
$$\text{Volume: } V(x, y, z) = xyz = 4.$$
We cannot minimise the area as it stands, since it is a function of three
variables. However, we can express $A(x, y, z)$ as a function of just two variables
by eliminating, say, the variable $z$ using the equation $xyz = 4$ (what is called a
constraint):
$$z = \frac{4}{xy}.$$
We can hence write the expression for the area of the box in terms of $x$ and
$y$:
$$A(x, y) = 8(x + y)\frac{1}{xy} + xy.$$
So now, to minimise the surface area $A(x, y)$, we need only solve the equations
$$\frac{\partial A}{\partial x} = 0 \quad \text{and} \quad \frac{\partial A}{\partial y} = 0.$$
In fact, as we will see later on when we look at the Lagrange Multiplier
Method, there is another way of minimising the area, as a function of three
variables, subject to the constraint (a term we will see a lot of) that the
volume is fixed.
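As usual, $A_x$ and $A_y$ are easily found using standard partial differentiation:
$$\frac{\partial A}{\partial x} = \frac{\partial}{\partial x}\left(8(x + y)x^{-1}y^{-1} + xy\right) = \frac{\partial}{\partial x}\left(8x^{-1} + 8y^{-1} + xy\right) = -8x^{-2} + y,$$
$$\frac{\partial A}{\partial y} = \frac{\partial}{\partial y}\left(8(x + y)x^{-1}y^{-1} + xy\right) = \frac{\partial}{\partial y}\left(8x^{-1} + 8y^{-1} + xy\right) = -8y^{-2} + x,$$
and since the minimum area is found by solving for both $A_x = 0$ and $A_y = 0$:
$$-8x^{-2} + y = 0 \ \Rightarrow \ -8 + x^2 y = 0 \ \Rightarrow \ x^2 y = 8,$$
$$-8y^{-2} + x = 0 \ \Rightarrow \ -8 + xy^2 = 0 \ \Rightarrow \ xy^2 = 8.$$
As usual, we find the minimum by solving the resulting simultaneous equations:
$$x^2 y = 8, \qquad (1)$$
$$xy^2 = 8. \qquad (2)$$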
These two equations (again) have the symmetry $y = x$, since (again) swapping
$x$ for $y$ and $y$ for $x$ in either of the two equations gives us the other.
Therefore:
$$x^2(x) = 8 \ \text{ and } \ x(x)^2 = 8 \quad \Rightarrow \quad x^3 - 8 = 0,$$
with the obvious solution $x = 2$ and $y = 2$. (All other solutions for $x$ and
$y$ are complex, and have no physical meaning in this context.) To find the
corresponding value of $z$, we substitute $x = 2$, $y = 2$ into our expression for
$z$:
$$z = \frac{4}{(2)(2)} = 1.$$
The dimensions of the open rectangular five-sided box with a volume of $4\,\mathrm{m}^3$
and a minimum external surface area are 2m × 2m × 1m, as expected. QED.
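The worked solution finds the stationary point but does not run the second-derivative test on it. A minimal sketch of that check follows, assuming the reduced area function $A(x, y) = 8/x + 8/y + xy$ from above; the second derivatives $A_{xx} = 16/x^3$, $A_{yy} = 16/y^3$, $A_{xy} = 1$ are worked out here and are not stated in the notes.

```python
def area(x, y):
    """Surface area of the open box after eliminating z = 4 / (x * y)."""
    return 8.0 / x + 8.0 / y + x * y


def area_second_derivatives(x, y):
    """Axx, Ayy, Axy, obtained by differentiating Ax and Ay once more."""
    return 16.0 / x**3, 16.0 / y**3, 1.0


axx, ayy, axy = area_second_derivatives(2.0, 2.0)
discriminant = axx * ayy - axy**2
print(area(2.0, 2.0), axx, ayy, discriminant)

# Compare with a ring of nearby boxes that have the same volume.
neighbours = [area(2.0 + dx, 2.0 + dy)
              for dx in (-0.1, 0.0, 0.1)
              for dy in (-0.1, 0.0, 0.1)
              if (dx, dy) != (0.0, 0.0)]
print(min(neighbours) > area(2.0, 2.0))
```

Since the discriminant is positive and $A_{xx} > 0$, the point $(2, 2)$ is a minimum of the area, with $A = 12\,\mathrm{m}^2$.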
Example 5 Find and classify every stationary point of the following function:
$$f(x, y) = x^2 + y^2 + \frac{1}{x^2 y^2}.$$
First of all we have to find the stationary points. Letting $f_x = 0$ and $f_y = 0$:
$$f_x = 2x + 0 + (-2x^{-3})y^{-2} = 2x - 2x^{-3}y^{-2} = 2x^{-3}y^{-2}(x^4 y^2 - 1),$$
$$f_y = 0 + 2y + x^{-2}(-2y^{-3}) = 2y - 2x^{-2}y^{-3} = 2x^{-2}y^{-3}(x^2 y^4 - 1),$$
we end up with the following (extremely tricky) set of simultaneous equations:
$$x^4 y^2 - 1 = 0, \qquad (1)$$
$$x^2 y^4 - 1 = 0. \qquad (2)$$
As with the previous question, we clearly need some kind of trick to solve
these.
Important Note: These simultaneous equations have even more symmetry
than the two in Example 3 (where we had $y = x$). Now we have the
symmetry:
$$y^2 = x^2, \qquad (2 \cdot 7)$$
since swapping $x^2$ for $y^2$, or $y^2$ for $x^2$, in either of those equations
automatically gives us the other. By factorising the symmetry equation $y^2 = x^2$, we
see that it is equivalent to $(y - x)(y + x) = 0$, giving us the symmetry $y = x$
and $y = -x$.
We can once again exploit this symmetry, by making the substitution
$y = \pm x$:
$$x^4 (\pm x)^2 - 1 = 0 \ \text{ and } \ x^2 (\pm x)^4 - 1 = 0 \quad \Rightarrow \quad x^6 - 1 = 0,$$
with the obvious real solutions $x = 1$ and $x = -1$. Since the solutions must
satisfy $y = \pm x$, the stationary points are $(x, y) = (1, 1)$, $(1, -1)$, $(-1, -1)$ and
$(-1, 1)$.
Now, we have to classify the stationary points. Calculating $f_{xx}$, $f_{yy}$ and $f_{xy}$:
$$f_{xx} = 2 + 6x^{-4}y^{-2}, \qquad f_{yy} = 2 + 6x^{-2}y^{-4}, \qquad f_{xy} = 4x^{-3}y^{-3},$$
we can then calculate $f_{xx} f_{yy} - f_{xy}^2$ and, in turn, classify these stationary
points:
$$\begin{aligned}
f_{xx} f_{yy} - f_{xy}^2 &= (2 + 6x^{-4}y^{-2})(2 + 6x^{-2}y^{-4}) - (4x^{-3}y^{-3})^2, \\
&= 4(1 + 3x^{-4}y^{-2})(1 + 3x^{-2}y^{-4}) - 16x^{-6}y^{-6}.
\end{aligned}$$
Thankfully, we can actually check all four stationary points at the same time.
So, for each of these four points $(x, y) = (1, 1)$, $(1, -1)$, $(-1, 1)$ and $(-1, -1)$:
$f_{xx} f_{yy} - f_{xy}^2 = +48$, so all points are ‘Max/Minima’,
$f_{xx} = +8$, $f_{yy} = +8$, so all points are ‘Minima’.
We have seen a number of different types of problems to do with optimisation
now, but we have not explained where the equations (2 · 1)−(2 · 5) came from.
The following derivations of critical and stationary points are non-examinable:
(2.3) CRITICAL POINTS (Derivation)
Let us assume that critical points of $f(x)$ are found by solving the equation
$f'(x) = 0$.
We now derive the conditions we would use to classify these critical points.
The Taylor series expansion of an arbitrary function (of just one variable)
$f(x)$, a very small distance along the $x$ axis away from a critical point $x_0$, is
$$\begin{aligned}
f(x_0 + h) &= f(x_0) + \frac{1}{1!} h^1 f'(x_0) + \frac{1}{2!} h^2 f''(x_0) + \cdots, \\
&= f(x_0) + h\{0\} + \tfrac{1}{2} h^2 f''(x_0) + \cdots, \\
&= f(x_0) + \tfrac{1}{2} h^2 f''(x_0) + \cdots,
\end{aligned}$$
where $h$ denotes this very small distance away from the critical point $x = x_0$.
Apart from taking into consideration that $f'(x_0) = 0$, since $x = x_0$ is a critical
point, we can also ignore all terms of higher order (and thus smaller) than
$h^2$.
For a function of just one variable f (x) a Minimum is defined as shown
below:
f (x0 + h) > f (x0 ) for all (small) h,
and therefore, we can see that for f ′′ (x0 ) > 0, we have a minimum at x = x0 .
For a function of just one variable f (x) a Maximum is defined as shown
below:
f (x0 + h) < f (x0 ) for all (small) h,
and therefore, we can see that for f ′′ (x0 ) < 0, we have a maximum at x = x0 .
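To see how good the quadratic Taylor term is near a critical point, the sketch below compares the exact change $f(x_0 + h) - f(x_0)$ with $\tfrac{1}{2} h^2 f''(x_0)$. The sample function $f(x) = x^3 - 3x$ and the critical point $x_0 = 1$ (where $f'' = 6$) are assumptions chosen for illustration.

```python
def f(x):
    """Illustrative function with a critical point at x = 1."""
    return x**3 - 3.0 * x


X0 = 1.0               # critical point: f'(1) = 3 - 3 = 0
F_SECOND_AT_X0 = 6.0   # f''(x) = 6x evaluated at x = 1


def exact_change(h):
    return f(X0 + h) - f(X0)


def quadratic_estimate(h):
    return 0.5 * h**2 * F_SECOND_AT_X0


for h in (0.1, -0.1, 0.01, -0.01):
    print(h, exact_change(h), quadratic_estimate(h))
```

For this cubic the neglected remainder is exactly $h^3$, so the quadratic term dominates for small $h$ and the change is positive on both sides of $x_0$, as expected at a minimum.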
(2.4) STATIONARY POINTS (Derivation)
Assume that stationary points of $f(x, y)$ are found by solving the equations:
$$f_x(x, y) = 0 \quad \text{and} \quad f_y(x, y) = 0.$$
We now derive the conditions we would use to classify the stationary
points.
The Taylor series of a function (of two variables) $f(x, y)$, very small
distances $h$ (along the $x$ axis) and $k$ (along the $y$ axis) away from a stationary point
$(x_0, y_0)$, is, to second order:
$$\begin{aligned}
f(x_0 + h, y_0 + k) &= f(x_0, y_0) + \frac{1}{1!}\left[h f_x(x_0, y_0) + k f_y(x_0, y_0)\right] \\
&\quad + \frac{1}{2!}\left[h^2 f_{xx}(x_0, y_0) + 2hk f_{xy}(x_0, y_0) + k^2 f_{yy}(x_0, y_0)\right], \\
&= f(x_0, y_0) + \tfrac{1}{2}\left[h^2 f_{xx}(x_0, y_0) + 2hk f_{xy}(x_0, y_0) + k^2 f_{yy}(x_0, y_0)\right],
\end{aligned}$$
since $f_x(x_0, y_0) = 0$ and $f_y(x_0, y_0) = 0$, given that $(x_0, y_0)$ is a stationary
point.
By ‘completing the square’, we can write $f(x_0 + h, y_0 + k)$ in two useful
forms:
$$f(x_0 + h, y_0 + k) = f(x_0, y_0) + \frac{1}{2f_{xx}}\left[(h f_{xx} + k f_{xy})^2 + k^2\left(f_{xx} f_{yy} - f_{xy}^2\right)\right], \qquad (1)$$
$$f(x_0 + h, y_0 + k) = f(x_0, y_0) + \frac{1}{2f_{yy}}\left[h^2\left(f_{xx} f_{yy} - f_{xy}^2\right) + (k f_{yy} + h f_{xy})^2\right], \qquad (2)$$
where $f_{xx}$, $f_{yy}$ and $f_{xy}$ mean $f_{xx}(x_0, y_0)$, $f_{yy}(x_0, y_0)$ and $f_{xy}(x_0, y_0)$ respectively.
For a function of two variables $f(x, y)$, a Minimum is defined as shown
below:
$$f(x_0 + h, y_0 + k) > f(x_0, y_0) \quad \text{for all (small) } h, k,$$
and so for $f_{xx} f_{yy} - f_{xy}^2 > 0$ and $f_{xx} > 0$, $f_{yy} > 0$, we have a minimum at
$(x_0, y_0)$.
For a function of two variables $f(x, y)$, a Maximum is defined as shown
below:
$$f(x_0 + h, y_0 + k) < f(x_0, y_0) \quad \text{for all (small) } h, k,$$
and so for $f_{xx} f_{yy} - f_{xy}^2 > 0$ and $f_{xx} < 0$, $f_{yy} < 0$, we have a maximum at
$(x_0, y_0)$.
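The two completed-square forms can be checked against the original quadratic term $\tfrac{1}{2}\left[h^2 f_{xx} + 2hk f_{xy} + k^2 f_{yy}\right]$ numerically. The sketch below draws random values (with $f_{xx}$ and $f_{yy}$ kept away from zero, since both forms divide by them); the function names and the sampling ranges are assumptions made for this illustration.

```python
import random


def quadratic_form(h, k, fxx, fyy, fxy):
    """Second-order Taylor term before completing the square."""
    return 0.5 * (h * h * fxx + 2 * h * k * fxy + k * k * fyy)


def completed_square_1(h, k, fxx, fyy, fxy):
    """Form obtained by completing the square in h (divides by fxx)."""
    d = fxx * fyy - fxy**2
    return ((h * fxx + k * fxy) ** 2 + k * k * d) / (2 * fxx)


def completed_square_2(h, k, fxx, fyy, fxy):
    """Form obtained by completing the square in k (divides by fyy)."""
    d = fxx * fyy - fxy**2
    return (h * h * d + (k * fyy + h * fxy) ** 2) / (2 * fyy)


def max_mismatch(trials=1000, seed=0):
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        h, k, fxy = (rng.uniform(-1.0, 1.0) for _ in range(3))
        fxx = rng.choice((-1.0, 1.0)) * rng.uniform(0.5, 3.0)
        fyy = rng.choice((-1.0, 1.0)) * rng.uniform(0.5, 3.0)
        q = quadratic_form(h, k, fxx, fyy, fxy)
        worst = max(worst,
                    abs(q - completed_square_1(h, k, fxx, fyy, fxy)),
                    abs(q - completed_square_2(h, k, fxx, fyy, fxy)))
    return worst


print(max_mismatch())
```

The largest mismatch is at the level of floating-point rounding, which is what we expect if both forms are exact rewritings of the same quadratic.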
The above conditions for finding maxima and minima
can be derived from either one of equations (1) and (2), since for
$f_{xx} f_{yy} - f_{xy}^2 > 0$ we must have either that $f_{xx}$ and $f_{yy}$ are both positive,
or are both negative. Similarly, the following condition for finding saddle
points can be derived using either of the two (modified versions of) equations
(1) and (2). Finally, by converting to the polar coordinates $h = \varepsilon \cos\theta$,
$k = \varepsilon \sin\theta$, where $\varepsilon$ is a very small radius, we can write $f(x_0 + h, y_0 + k)$ in
the following forms:
$$f(x_0 + h, y_0 + k) = f(x_0, y_0) + \frac{\varepsilon^2 \sin^2\theta}{2f_{xx}}\left[(f_{xx} \cot\theta + f_{xy})^2 + \left(f_{xx} f_{yy} - f_{xy}^2\right)\right],$$
$$f(x_0 + h, y_0 + k) = f(x_0, y_0) + \frac{\varepsilon^2 \cos^2\theta}{2f_{yy}}\left[\left(f_{xx} f_{yy} - f_{xy}^2\right) + (f_{yy} \tan\theta + f_{xy})^2\right].$$
(In the first form the ratio $h/k = \cot\theta$ appears, and in the second the ratio $k/h = \tan\theta$.)
For a function of two variables $f(x, y)$, a Saddle Point is defined as follows:
$$f(x_0 + h, y_0 + k) \gtrless f(x_0, y_0) \quad \text{depending on } \theta,$$
(i.e. in some directions we have a maximum, in other directions a minimum),
and so for $f_{xx} f_{yy} - f_{xy}^2 < 0$ we have a saddle point, i.e. for some angles
$\theta$ we have maxima, for others, minima, regardless of the values of $f_{xx}$, $f_{yy}$,
$f_{xy}$.
It may not be totally obvious why this is the case, but in light of the fact
that:
$$-\infty < \tan\theta < +\infty \quad \text{for} \quad -\pi/2 < \theta < +\pi/2,$$
we can see that regardless of whether $f_{xx}$ or $f_{yy}$ are positive or negative,
we can always find an angle which forces the conditions for minima or
maxima (for instance, the squared term vanishes when $\tan\theta = -f_{xy}/f_{yy}$, leaving
only the negative term, whereas for large $|\tan\theta|$ the squared term dominates):
$$f(x_0 + h, y_0 + k) > f(x_0, y_0) \quad \text{or} \quad f(x_0 + h, y_0 + k) < f(x_0, y_0).$$
(This can be a very difficult point to explain on paper, and so this derivation
may fall short – If this is the case, please just ask me to talk ye through it).
(2.5) CONSTRAINTS
So far, we have found the stationary points for functions where $x$ and $y$ were
independent variables. But what if $x$ and $y$ were instead dependent variables?
(a) Independent Variables We find the stationary points of the function:
$$f(x, y) = x^2 + y^2,$$
where $x$ and $y$ are the usual independent variables, by solving these equations:
$$f_x = 2x + 0 = 2x = 0, \qquad f_y = 0 + 2y = 2y = 0.$$
This gives us a stationary point at $(x, y) = (0, 0)$ corresponding to a minimum.
(b) Dependent Variables We find the stationary points of the function:
$$f(x, y) = x^2 + y^2,$$
where $x$ and $y$ are now dependent variables, related by the straight line
equation
$$3x + 4y = 25,$$
(more generally called a constraint), by eliminating one of the two variables,
and finding the stationary points of the resulting function, as in Example
4. By eliminating $y$ using $y = (25 - 3x)/4$, we can re-write $f(x, y)$ as the
function
$$f(x) = x^2 + \tfrac{1}{16}(25 - 3x)^2.$$
Therefore, to minimise this function (of one variable) we only need to solve
for
$$\begin{aligned}
f' &= \frac{d}{dx}\left[x^2 + \tfrac{1}{16}(25 - 3x)^2\right] = 0, \\
&= 2x + \tfrac{1}{16} \cdot 2(25 - 3x)(-3) = 0, \\
&= \tfrac{1}{8}(25x - 75) = 0.
\end{aligned}$$
So, the solution is $x = 3$, and since the constraint is $y = \tfrac{1}{4}(25 - 3x)$, $y = 4$.
This gives us a stationary point at $(x, y) = (3, 4)$ corresponding to a minimum.
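The elimination approach can be sketched numerically: substitute the constraint into $f$, then locate the zero of the resulting one-variable derivative. The bisection routine and the bracketing interval $[0, 10]$ below are assumptions made for this illustration, not part of the notes.

```python
def f_on_line(x):
    """f(x, y) = x**2 + y**2 restricted to the line 3x + 4y = 25."""
    y = (25.0 - 3.0 * x) / 4.0
    return x * x + y * y


def df_on_line(x):
    """Derivative of f_on_line, i.e. (25x - 75) / 8."""
    return (25.0 * x - 75.0) / 8.0


def bisect_root(func, lo, hi, iterations=60):
    """Find a root of func in [lo, hi], assuming a sign change there."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(lo) * func(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


x_star = bisect_root(df_on_line, 0.0, 10.0)
y_star = (25.0 - 3.0 * x_star) / 4.0
print(x_star, y_star, f_on_line(x_star))
```

The root lands on $x = 3$, giving $y = 4$ and $f = 25$, the square of the distance from the origin to the line.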
Note that the absolute minimum of the function $f(x, y) = x^2 + y^2$, attained at $(0, 0)$,
is
$$f(0, 0) = 0,$$
when it is unconstrained, whereas if we impose the constraint $3x + 4y = 25$, we
find that the absolute minimum of the function $f(x, y) = x^2 + y^2$, now attained at $(3, 4)$,
is
$$f(3, 4) = 25.$$
Clearly the absolute minimum is higher (and thus worse) due to the constraint.
Furthermore, the more constraints and restrictions we place on the system,
the lower we expect the maxima to be, and the higher we expect the minima
to be!
Strictly, it would be possible for a system to have the same maxima/minima
if a constraint were placed on it, but these maxima/minima could never be
better!
Exercise 1: Find the stationary point(s) of each of the following functions:
a) $f(x, y) = x^2 + y^2$, b) $f(x, y) = xy$, c) $f(x, y) = (x - y)^2$.
Find these stationary points once again, subject now to each of the constraints:
i) $Ax + By = C$, ii) $y - Ax^2 - Bx - C = 0$, iii) $x - y^2 = 0$,
by either substituting for, or eliminating, one of the variables $x$, $y$ in each case.
Note: In many cases in both Engineering and Mathematical-Physics, it can
be extremely difficult to eliminate one of the variables using the equation of
constraint. The Lagrange-Multiplier Method can get around this problem.
Consider, for example, if we wanted to minimise the function of two variables
f (x, y) = xy,
subject to the constraint x2 + 8xy + 7y 2 = 180. In this case, it would be very
difficult to get x in terms of y, or y in terms of x, but (as we will now see),
this can be solved easily and systematically using the Lagrange-Multiplier
Method:
(2.6) LAGRANGE MULTIPLIER METHOD
To find the stationary point(s) of the function of two variables $f(x, y)$,
subject to the constraint equation $g(x, y) = 0$, we need to solve the following
equations:
$$\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} = 0, \qquad (2 \cdot 7)$$
where $\lambda$ is the Lagrange multiplier. The Lagrange-Multiplier method is
broadly:
1) Solve the above (Lagrange–Multiplier) Equations;
2) Determine the value(s) of the Lagrange-Multiplier;
3) Use the constraint to find $(x, y)$ for each value of $\lambda$;
4) Determine the value(s) of $f(x, y)$ for each value of $(x, y)$.
Example 6 Use Lagrange’s Multiplier method to find the minimum value
of
$$f(x, y) = x^2 + y^2,$$
subject to the following equation of constraint (equation of a hyperbolic
curve):
$$x^2 + 8xy + 7y^2 = 180.$$
Before starting, it is always a good idea to be absolutely clear on the problem:
Find Stationary–Points of $f(x, y) = x^2 + y^2$,
Subject to the Constraint $g(x, y) = x^2 + 8xy + 7y^2 - 180 = 0$.
Note(1): We must always express these constraints in the form g (x, y) =
0. Obviously, this will always be possible, since an arbitrary equation of the
form:
LHS = RHS,
can always be re-written in the form g (x, y) = 0; in this case LHS − RHS =
0.
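First of all we substitute $f(x, y)$, the main function, and $g(x, y)$, the function
appearing in the constraint equation, into the Lagrange-Multiplier Equations:
$$\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} = 0 \ \Rightarrow \ (2x) + \lambda(2x + 8y) = 0 \ \Rightarrow \ (1 + \lambda)x + 4\lambda y = 0,$$
$$\frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} = 0 \ \Rightarrow \ (2y) + \lambda(8x + 14y) = 0 \ \Rightarrow \ 4\lambda x + (1 + 7\lambda)y = 0.$$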
Our next move is to determine the value(s) of $\lambda$, the Lagrange-Multiplier(s).
Note(2): The solutions will correspond to the minimum distance (squared)
between the origin $(0, 0)$ and any point $(x, y)$ on the given hyperbolic curve.
For the time being, however, we will ignore this geometrical interpretation.
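Now we have to determine the value(s) of $\lambda$ that give us the most general
non-trivial solution for $(x, y)$. In this case, the best way to do this is by
re-writing the simultaneous equations using standard matrix notation (as
shown below):
$$\left.\begin{aligned} (1 + \lambda)x + 4\lambda y &= 0, \\ 4\lambda x + (1 + 7\lambda)y &= 0, \end{aligned}\right\} \quad \Rightarrow \quad \begin{pmatrix} (1 + \lambda) & 4\lambda \\ 4\lambda & (1 + 7\lambda) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
so that the $\lambda$ giving us the most general solution for $x$ and $y$ can be found
via:
$$\begin{aligned}
\begin{vmatrix} (1 + \lambda) & 4\lambda \\ 4\lambda & (1 + 7\lambda) \end{vmatrix} &= 0, \\
(1 + \lambda)(1 + 7\lambda) - 16\lambda^2 &= 0, \\
7\lambda^2 - 16\lambda^2 + 8\lambda + 1 &= 0, \\
-(9\lambda + 1)(\lambda - 1) &= 0,
\end{aligned}$$
so that $\lambda = -\tfrac{1}{9}$ or $\lambda = 1$.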
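We have to determine the stationary points corresponding to each value of $\lambda$.
So, for the first value, being $\lambda = -\tfrac{1}{9}$, our two equations collapse into just
one:
$$\left.\begin{aligned} \left(1 + \left(-\tfrac{1}{9}\right)\right)x + 4\left(-\tfrac{1}{9}\right)y &= 0, \\ 4\left(-\tfrac{1}{9}\right)x + \left(1 + 7\left(-\tfrac{1}{9}\right)\right)y &= 0, \end{aligned}\right\} \quad \Rightarrow \quad 2x - y = 0.$$
Substituting this equation ($y = 2x$) into the constraint $x^2 + 8xy + 7y^2 = 180$:
$$\begin{aligned}
x^2 + 8x(2x) + 7(2x)^2 - 180 &= 0, \\
45x^2 - 180 &= 0, \\
45(x + 2)(x - 2) &= 0,
\end{aligned}$$
and hence, we get the solutions $x = 2$ (with $y = 4$), and $x = -2$ (with
$y = -4$):
Solutions: $(x, y) = (\pm 2, \pm 4)$ corresponding to $f = 20$.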
And for the second value, being $\lambda = 1$, our equations again collapse into one:
$$\left.\begin{aligned} (1 + (1))x + 4(1)y &= 0, \\ 4(1)x + (1 + 7(1))y &= 0, \end{aligned}\right\} \quad \Rightarrow \quad x + 2y = 0.$$
Substituting this equation ($x = -2y$) into the constraint $x^2 + 8xy + 7y^2 = 180$:
$$\begin{aligned}
(-2y)^2 + 8(-2y)y + 7y^2 - 180 &= 0, \\
-5y^2 - 180 &= 0, \\
-5(y + 6i)(y - 6i) &= 0,
\end{aligned}$$
and hence, we get no real solutions. If the above values were acceptable,
we would have the solutions $y = 6i$ (with $x = -12i$) and $y = -6i$ (with
$x = 12i$):
Solutions: $(x, y) = (\mp 12i, \pm 6i)$ corresponding to $f = -180$.
However, we cannot accept complex solutions here; complex variables and
conformal mappings are a separate topic.
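As a sanity check on the real branch $\lambda = -1/9$, where $y = 2x$ and $45x^2 = 180$ give $x = \pm 2$, $y = \pm 4$, the sketch below evaluates the constraint and both Lagrange-multiplier equations at those points. The helper names are hypothetical and the tolerance is left to the reader.

```python
def f(x, y):
    return x * x + y * y


def g(x, y):
    """Constraint of Example 6, written in the form g(x, y) = 0."""
    return x * x + 8.0 * x * y + 7.0 * y * y - 180.0


def lagrange_residuals(x, y, lam):
    """Left-hand sides of fx + lam*gx = 0 and fy + lam*gy = 0."""
    return (2.0 * x + lam * (2.0 * x + 8.0 * y),
            2.0 * y + lam * (8.0 * x + 14.0 * y))


LAMBDA = -1.0 / 9.0
for x, y in ((2.0, 4.0), (-2.0, -4.0)):
    print((x, y), g(x, y), lagrange_residuals(x, y, LAMBDA), f(x, y))
```

Both points lie on the curve, both residuals vanish up to rounding, and $f = 20$ at each, so the minimum distance from the origin to this hyperbola is $\sqrt{20}$.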
Example 7 Using the Lagrange-Multiplier Method, or otherwise, find both
the maximum and minimum values of the function $f(x, y) = xy$ on the circle:
$$x^2 + y^2 = 8.$$
As in Example 6, it is important to lay out the problem as clearly as
possible:
Find Stationary–Points of $f(x, y) = xy$,
Subject to the Constraint $g(x, y) = x^2 + y^2 - 8 = 0$.
So, we substitute $f(x, y)$ and $g(x, y)$ into the Lagrange-Multiplier Equations:
$$\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} = 0 \ \Rightarrow \ (y) + \lambda(2x) = 0 \ \Rightarrow \ 2\lambda x + y = 0,$$
$$\frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} = 0 \ \Rightarrow \ (x) + \lambda(2y) = 0 \ \Rightarrow \ x + 2\lambda y = 0.$$
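And as before, we determine the value(s) of $\lambda$ that give us the most general
non-trivial solution for $(x, y)$, by re-writing these equations in matrix
notation:
$$\left.\begin{aligned} 2\lambda x + y &= 0 \\ x + 2\lambda y &= 0 \end{aligned}\right\} \quad \Rightarrow \quad \begin{pmatrix} 2\lambda & 1 \\ 1 & 2\lambda \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
so that the $\lambda$s giving us the most general solution for $(x, y)$ can be found via:
$$\begin{aligned}
\begin{vmatrix} 2\lambda & 1 \\ 1 & 2\lambda \end{vmatrix} &= 0, \\
(2\lambda)^2 - 1^2 &= 0, \\
(2\lambda + 1)(2\lambda - 1) &= 0.
\end{aligned}$$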
For the first value, being $\lambda = -\tfrac{1}{2}$, these equations again collapse into just
one:
$$\left.\begin{aligned} 2\left(-\tfrac{1}{2}\right)x + y &= 0 \\ x + 2\left(-\tfrac{1}{2}\right)y &= 0 \end{aligned}\right\} \quad \Rightarrow \quad y - x = 0.$$
Substituting this equation ($y = x$) into the constraint $x^2 + y^2 - 8 = 0$ we get:
$$\begin{aligned}
x^2 + (x)^2 - 8 &= 0, \\
2(x^2 - 4) &= 0, \\
2(x - 2)(x + 2) &= 0,
\end{aligned}$$
and the solutions are therefore $x = 2$ (with $y = 2$), and $x = -2$ (with
$y = -2$):
Solutions: $(x, y) = (\pm 2, \pm 2)$ corresponding to $f = +4$.
(As it turns out, each of these stationary points will correspond to a maximum).
For the second value, being $\lambda = \tfrac{1}{2}$, the equations collapse into one yet
again:
$$\left.\begin{aligned} 2\left(+\tfrac{1}{2}\right)x + y &= 0 \\ x + 2\left(+\tfrac{1}{2}\right)y &= 0 \end{aligned}\right\} \quad \Rightarrow \quad y + x = 0.$$
Substituting this equation ($y = -x$) into the constraint $x^2 + y^2 - 8 = 0$ we
get:
$$\begin{aligned}
x^2 + (-x)^2 - 8 &= 0, \\
2(x^2 - 4) &= 0, \\
2(x - 2)(x + 2) &= 0,
\end{aligned}$$
and the solutions are therefore $x = 2$ (with $y = -2$), and $x = -2$ (with
$y = 2$):
Solutions: $(x, y) = (\pm 2, \mp 2)$ corresponding to $f = -4$.
(As it turns out, each of these stationary points will correspond to a minimum).
The maximum of $f(x, y) = xy$, subject to $x^2 + y^2 = 8$, is therefore $f = +4$, and the
minimum is $f = -4$.
Note: We could also have solved this problem if we had known that the
parametric equations of the circle $x^2 + y^2 = 8$, centred at the origin $(0, 0)$, are:
$$x(\theta) = 2\sqrt{2}\cos(\theta), \qquad y(\theta) = 2\sqrt{2}\sin(\theta).$$
Using the above, the constraint is automatically satisfied, and the function
is:
$$\begin{aligned}
f(\theta) &= 2\sqrt{2}\cos(\theta) \cdot 2\sqrt{2}\sin(\theta), \\
&= 8\cos(\theta)\sin(\theta), \\
&= 4\sin(2\theta).
\end{aligned}$$
It is easy to show that the stationary points of this function are at
$\theta = \pm\tfrac{\pi}{4}, \pm\tfrac{3\pi}{4}$.
(These correspond to each of the same four solutions as before, as we expect).
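The parametric form also makes a brute-force check easy: sample $\theta$ around the circle and record the largest and smallest values of $f = xy$. The grid of 3600 angles below is an arbitrary choice made for this sketch (it happens to include $\theta = \pm\pi/4$ and $\pm 3\pi/4$ up to rounding).

```python
import math


def f_on_circle(theta):
    """f(x, y) = x*y evaluated on the circle x**2 + y**2 = 8."""
    r = 2.0 * math.sqrt(2.0)
    x, y = r * math.cos(theta), r * math.sin(theta)
    return x * y


thetas = [-math.pi + 2.0 * math.pi * i / 3600 for i in range(3600)]
values = [f_on_circle(t) for t in thetas]

f_max, f_min = max(values), min(values)
theta_at_max = thetas[values.index(f_max)]
print(f_max, f_min, theta_at_max)
```

The scan returns values very close to $+4$ and $-4$, in agreement with the Lagrange-multiplier result.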
Example 8 (Standard Ellipse)
Use Lagrange’s multiplier method to obtain the maximum and the minimum
values that the function $f(x, y) = x^2 + y^2$ can have on the ellipse described
by
$$b^2 x^2 + a^2 y^2 = a^2 b^2.$$
As usual, we substitute $f(x, y)$, the function we wish to optimise, and
$g(x, y) = b^2 x^2 + a^2 y^2 - a^2 b^2$, the function appearing in the constraint equation,
into the Lagrange equations:
$$\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} = 0 \ \Rightarrow \ (2x) + \lambda(2b^2 x) = 0 \ \Rightarrow \ 2(\lambda b^2 + 1)x = 0,$$
$$\frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} = 0 \ \Rightarrow \ (2y) + \lambda(2a^2 y) = 0 \ \Rightarrow \ 2(\lambda a^2 + 1)y = 0.$$
(Normally, we might be tempted to solve coupled/simultaneous equations by
simply eliminating $x$ or $y$ in what we might call the ‘old-fashioned’ way, but
in this example we can clearly see the advantage of using matrix-methods).
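We write these two simultaneous equations in matrix notation in order to
find the value(s) of $\lambda$ that give us the most general non-trivial solution for
$(x, y)$:
$$\left.\begin{aligned} 2(\lambda b^2 + 1)x + (0)y &= 0 \\ (0)x + 2(\lambda a^2 + 1)y &= 0 \end{aligned}\right\} \quad \Rightarrow \quad \begin{pmatrix} 2(\lambda b^2 + 1) & 0 \\ 0 & 2(\lambda a^2 + 1) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
so that the $\lambda$s giving us the most general solution for $(x, y)$ can be found via:
$$\begin{aligned}
\begin{vmatrix} 2(\lambda b^2 + 1) & 0 \\ 0 & 2(\lambda a^2 + 1) \end{vmatrix} &= 0, \\
4(\lambda b^2 + 1)(\lambda a^2 + 1) &= 0.
\end{aligned}$$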
We have to determine the stationary points corresponding to each value of $\lambda$.
For $\lambda = -1/b^2$, one of the two equations vanishes altogether, leaving us with:
$$\left.\begin{aligned} 2\left((-b^2/b^2) + 1\right)x + (0)y &= 0 \\ (0)x + 2\left((-a^2/b^2) + 1\right)y &= 0 \end{aligned}\right\} \quad \Rightarrow \quad 2\left(1 - a^2/b^2\right)y = 0.$$
Because $a \neq b$ (since the condition $a = b$ would give us a circle equation),
the solution $y = 0$ can be substituted into the constraint $b^2 x^2 + a^2 y^2 = a^2 b^2$
to get
$$\begin{aligned}
b^2 x^2 + a^2 (0)^2 - a^2 b^2 &= 0, \\
b^2 (x^2 - a^2) &= 0, \\
b^2 (x - a)(x + a) &= 0.
\end{aligned}$$
This then gives us the solutions $x = a$ (with $y = 0$), and $x = -a$ (with
$y = 0$):
Solutions: $(x, y) = (\pm a, 0)$ corresponding to $f = +a^2$.
(This solution corresponds to a minimum if $b > a$, and a maximum if $b < a$).
For $\lambda = -1/a^2$, however, the other equation vanishes, leaving us instead
with:
$$\left.\begin{aligned} 2\left((-b^2/a^2) + 1\right)x + (0)y &= 0 \\ (0)x + 2\left((-a^2/a^2) + 1\right)y &= 0 \end{aligned}\right\} \quad \Rightarrow \quad 2\left(1 - b^2/a^2\right)x = 0.$$
Again, because $a \neq b$, the solution $x = 0$ can be substituted into the constraint:
$$\begin{aligned}
b^2 (0)^2 + a^2 y^2 - a^2 b^2 &= 0, \\
a^2 (y^2 - b^2) &= 0, \\
a^2 (y - b)(y + b) &= 0.
\end{aligned}$$
This then gives us the solutions $y = b$ (with $x = 0$), and $y = -b$ (with $x = 0$):
Solutions: $(x, y) = (0, \pm b)$ corresponding to $f = +b^2$.
(This solution corresponds to a maximum if $b > a$, and a minimum if $b < a$).
Note: We could also have solved this problem if we had known that the
parametric equations of the ellipse $b^2 x^2 + a^2 y^2 = a^2 b^2$, centred at $(0, 0)$,
are:
$$x(\theta) = a\cos(\theta), \qquad y(\theta) = b\sin(\theta).$$
Using these, the constraint is automatically satisfied and we write the function:
$$\begin{aligned}
f(\theta) &= a^2\cos^2(\theta) + b^2\sin^2(\theta), \\
&= a^2 + (b^2 - a^2)\sin^2(\theta).
\end{aligned}$$
Since $f'(\theta) = (b^2 - a^2)\sin(2\theta)$, the stationary points of this function are to be
found at $\theta = 0, \pm\pi/2, \pi$ (corresponding to exactly the same solutions,
$(\pm a, 0)$ and $(0, \pm b)$, as before).
Example 9 (Minimum Distance)
Use the Lagrange multiplier method to obtain the least distance between the
origin $(0, 0)$ and the curve described by $3x^2 + 4xy + 6y^2 = 140$ (an ellipse,
since $4^2 - 4(3)(6) < 0$).
Note (Distance): Generally speaking, the distance between $(a, b)$ (any
point) and $(x, y)$ (a point on a curve $g(x, y) = 0$) is given by the following
expression:
$$D_{(a,b)} = \sqrt{(x - a)^2 + (y - b)^2}, \qquad (2 \cdot 8)_1$$
and therefore, the distance between $(0, 0)$ (the origin) and $(x, y)$ (some point
on a curve $g(x, y) = 0$) is given by the special case below:
$$D_{(0,0)} = \sqrt{x^2 + y^2}. \qquad (2 \cdot 8)_2$$
Strictly speaking, we are supposed to minimise $\sqrt{x^2 + y^2}$, the distance
between $(0, 0)$ (the origin) and $(x, y)$ (a point on the curve). However, it is
convenient (and mathematically equivalent) to minimise the ‘distance-squared’
function
$$f(x, y) = x^2 + y^2,$$
subject to the same constraint $g(x, y) = 3x^2 + 4xy + 6y^2 - 140 = 0$. Therefore:
Find Stationary–Points of $f(x, y) = x^2 + y^2$,
Subject to the Constraint $g(x, y) = 3x^2 + 4xy + 6y^2 - 140 = 0$.
Because $x$ and $y$ are both real variables (not imaginary or complex
variables), the distance will never be negative, as we intuitively expect.
Therefore, minimising the distance function is, at least mathematically
speaking, exactly equivalent to minimising the distance-squared function, i.e.
$$f(x, y) = x^2 + y^2.$$
(We have to remember that this distance-squared function is only used to find
the point(s) which minimise/maximise the distance; it is only a middle-man.
The solutions must then be substituted into the expression for the distance
$D$).
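As per usual, we start by solving the Lagrange-Multiplier Equations (as
below):
$$\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} = 0 \ \Rightarrow \ (2x) + \lambda(6x + 4y) = 0 \ \Rightarrow \ (1 + 3\lambda)x + 2\lambda y = 0,$$
$$\frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} = 0 \ \Rightarrow \ (2y) + \lambda(4x + 12y) = 0 \ \Rightarrow \ 2\lambda x + (1 + 6\lambda)y = 0.$$
Once again, we want to determine the value(s) of $\lambda$, the Lagrange multiplier,
that give us the most general (non-trivial) solution for $(x, y)$. And yet again,
by re-writing the simultaneous equations in standard matrix notation, we get:
$$\left.\begin{aligned} (1 + 3\lambda)x + 2\lambda y &= 0, \\ 2\lambda x + (1 + 6\lambda)y &= 0, \end{aligned}\right\} \quad \Rightarrow \quad \begin{pmatrix} (1 + 3\lambda) & 2\lambda \\ 2\lambda & (1 + 6\lambda) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$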
(If we had tried instead to minimise the distance function $\sqrt{x^2 + y^2}$, we would
have had great difficulty solving the resulting simultaneous equations, despite
the fact that they ultimately would have yielded precisely the same solutions).
To determine the $\lambda$s giving us the most general solution for $x$ and $y$, we solve:
$$\begin{aligned}
\begin{vmatrix} (1 + 3\lambda) & 2\lambda \\ 2\lambda & (1 + 6\lambda) \end{vmatrix} &= 0, \\
(1 + 3\lambda)(1 + 6\lambda) - 4\lambda^2 &= 0, \\
(7\lambda + 1)(2\lambda + 1) &= 0.
\end{aligned}$$
We have to determine the stationary points corresponding to each value of $\lambda$.
For the first value, being $\lambda = -\tfrac{1}{7}$, our equations once again collapse into one:
$$\left.\begin{aligned} \left(1 + 3\left(-\tfrac{1}{7}\right)\right)x + 2\left(-\tfrac{1}{7}\right)y &= 0, \\ 2\left(-\tfrac{1}{7}\right)x + \left(1 + 6\left(-\tfrac{1}{7}\right)\right)y &= 0, \end{aligned}\right\} \quad \Rightarrow \quad 2x - y = 0.$$
Substituting this equation ($y = 2x$) into the constraint $3x^2 + 4xy + 6y^2 = 140$:
$$\begin{aligned}
3x^2 + 4x(2x) + 6(2x)^2 - 140 &= 0, \\
35(x^2 - 4) &= 0, \\
35(x + 2)(x - 2) &= 0,
\end{aligned}$$
and we hence find the solutions $x = 2$ (with $y = 4$), and $x = -2$ (with
$y = -4$):
Solutions: $(x, y) = (\pm 2, \pm 4)$ corresponding to $f = 20$.
At this point, we have to recall that the function we are trying to minimise
is the distance between $(0, 0)$ (the origin) and $(x, y)$ (a point on the curve),
thus
$$D(\pm 2, \pm 4) = \sqrt{(\pm 2)^2 + (\pm 4)^2} = \sqrt{4 + 16} = \sqrt{20}.$$
(This corresponds to what turns out to be a minimum length of $\sqrt{20} = 2\sqrt{5}$).
For the second value, being λ = − 21 , the equations yet again collapse into
one:
)
1 + 3 − 12 x + 2 − 12 y = 0,
⇒ −x − 2y = 0.
2 − 12 x + 1 + 6 − 21 y = 0,
Substituting this equation (x = −2y) into the constraint $3x^2 + 4xy + 6y^2 = 140$:
\[
3(-2y)^2 + 4(-2y)\,y + 6y^2 - 140 = 0, \qquad 10\,(y^2 - 14) = 0, \qquad 10\,(y + \sqrt{14})(y - \sqrt{14}) = 0,
\]
and we get solutions $y = -\sqrt{14}$ (with $x = 2\sqrt{14}$), and $y = \sqrt{14}$ (with $x = -2\sqrt{14}$);
Solutions: $(x, y) = (\mp 2\sqrt{14}, \pm\sqrt{14})$, corresponding to f = 70.
Again, we have to recall that the function we are trying to optimise is the
distance between (0, 0) (the origin) and (x, y) (a point on the curve), and so
\[
L(\mp 2\sqrt{14}, \pm\sqrt{14}) = \sqrt{(\mp 2\sqrt{14})^2 + (\pm\sqrt{14})^2} = \sqrt{56 + 14} = \sqrt{70}.
\]
(This solution corresponds to what turns out to be a maximum length of $\sqrt{70}$.)
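The two extreme lengths can be cross-checked without any multipliers. The sketch below is an independent numerical check, not the method of these notes: it writes a point on the curve as (r cos θ, r sin θ), so that the constraint fixes r² for every direction θ, and then scans θ. The helper name `distance_squared_on_curve` and the grid size `n` are illustrative choices.

```python
import math

def distance_squared_on_curve(theta):
    """Squared distance from the origin to the curve 3x^2 + 4xy + 6y^2 = 140
    along the direction (cos(theta), sin(theta))."""
    c, s = math.cos(theta), math.sin(theta)
    # With x = r c and y = r s the constraint reads r^2 * q = 140.
    q = 3.0 * c * c + 4.0 * c * s + 6.0 * s * s
    return 140.0 / q

n = 100000  # number of sampled directions
values = [distance_squared_on_curve(2.0 * math.pi * k / n) for k in range(n)]
f_min, f_max = min(values), max(values)
print(f_min, f_max)  # should be close to 20 and 70
```

The scan agrees with the values of f found from the two Lagrange multipliers, which supports reading the first as the minimum and the second as the maximum.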
(2.7)LAGRANGE MULTIPLIER METHOD(Derivation)
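Since the stationary points of the function f (x, y) are, by definition, given
by
\[
df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = 0,
\]
and because, given some constraint in the form g (x, y) = 0, we must also
have
\[
dg = \frac{\partial g}{\partial x}\,dx + \frac{\partial g}{\partial y}\,dy = 0,
\]
the stationary points must simultaneously satisfy the following two equations:
\[
\frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = 0, \qquad (1)
\]
\[
\frac{\partial g}{\partial x}\,dx + \frac{\partial g}{\partial y}\,dy = 0. \qquad (2)
\]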
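We can simplify this by re-writing the equations in matrix form (shown
below):
\[
\left.
\begin{aligned}
f_x\,dx + f_y\,dy &= 0, \\
g_x\,dx + g_y\,dy &= 0,
\end{aligned}
\right\}
\;\Rightarrow\;
\begin{pmatrix} f_x & f_y \\ g_x & g_y \end{pmatrix}
\begin{pmatrix} dx \\ dy \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\]
and finding values of $f_x$, $f_y$, $g_x$ and $g_y$ giving non-trivial solutions for dx, dy.
The most general (non-trivial) solution(s) for dx and dy are found by
solving:
\[
\begin{vmatrix} f_x & f_y \\ g_x & g_y \end{vmatrix} = 0 \;\Rightarrow\; f_x g_y - g_x f_y = 0.
\]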
Rearranging, we can see that the condition to optimise
f (x, y), subject to the (equation of) constraint g (x, y) = 0, can be written in the form:
\[
f_x g_y - g_x f_y = 0, \qquad f_x g_y = f_y g_x, \qquad f_x / g_x = f_y / g_y .
\]
To find the points (x, y) that satisfy the above equation, it is more convenient to call the common ratio −λ, i.e. $f_x / g_x = f_y / g_y = -\lambda$, and to solve instead the following equivalent pair of coupled/simultaneous
equations:
\[
f_x + \lambda g_x = 0, \qquad f_y + \lambda g_y = 0,
\]
where λ is an unknown constant called the Lagrange Multiplier. QED
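To make the equivalence concrete, the following Python sketch evaluates the determinant condition and the multiplier at the stationary points of the earlier example, assuming f = x² + y² and g = 3x² + 4xy + 6y² − 140. The helper names `grad_f`, `grad_g` and `check` are illustrative only.

```python
import math

def grad_f(x, y):
    """Partial derivatives (f_x, f_y) of f = x^2 + y^2."""
    return (2.0 * x, 2.0 * y)

def grad_g(x, y):
    """Partial derivatives (g_x, g_y) of g = 3x^2 + 4xy + 6y^2 - 140."""
    return (6.0 * x + 4.0 * y, 4.0 * x + 12.0 * y)

def check(x, y):
    """Return (f_x g_y - f_y g_x, lambda, f_y + lambda g_y) at the point (x, y)."""
    fx, fy = grad_f(x, y)
    gx, gy = grad_g(x, y)
    det = fx * gy - fy * gx
    lam = -fx / gx              # multiplier read off from f_x + lambda g_x = 0
    return det, lam, fy + lam * gy

r14 = math.sqrt(14.0)
for point in [(2.0, 4.0), (-2.0 * r14, r14)]:
    print(point, check(*point))
```

At both points the determinant vanishes (up to rounding), and the multiplier recovered from the first equation also satisfies the second.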
Note: The standard Lagrange-Multiplier Equations (shown above) can easily
be modified to handle three or more dimensions, and two or more constraints.
Clearly, we could imagine a whole host of problems where we might have a
function of not two but three variables, subject to an equation of constraint
also involving three variables. For example, we might want to optimise the
distance between a point and a surface in three dimensions, rather than the
distance between a point and a curve in two dimensions. We will show that
such problems can be solved relatively easily using a (modified) Lagrange-Multiplier
Method.
(2.8)LAGRANGE MULTIPLIER METHOD(3-Dimensions)
To find the stationary point(s) of the function of three variables f (x, y, z),
subject to the constraint equation g (x, y, z) = 0, we solve the three equations:
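\[
\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} = 0, \qquad
\frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} = 0, \qquad
\frac{\partial f}{\partial z} + \lambda \frac{\partial g}{\partial z} = 0,
\tag{$2 \cdot 9$}
\]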
where λ is the Lagrange multiplier. Then, we proceed as usual:
1) Solve the above (Lagrange-Multiplier) Equations;
2) Determine the value(s) of the Lagrange-Multiplier;
3) Use the constraint to find (x, y, z) for each value of λ;
4) Determine the value(s) of f (x, y, z) for each value of (x, y, z).
Example 10 Use the Lagrange multiplier method to determine the minimum distance from the origin (0, 0, 0) to any point (x, y, z) on the plane
defined by
4x − 4y + 2z = 36.
As usual, it is a good idea to have no ambiguity about the problem in question:
Find Stationary–Points of $f(x, y, z) = x^2 + y^2 + z^2$,
Subject to the Constraint $g(x, y, z) = 4x - 4y + 2z - 36$,
bearing in mind that the distance between the origin (0, 0, 0) and a point
(x, y, z) on the plane is Distance $= \sqrt{x^2 + y^2 + z^2}$, and that the distance-squared function $f(x, y, z) = x^2 + y^2 + z^2$ is again used purely for the sake of convenience.
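First off, we solve the Lagrange Multiplier equations (of which there are
three):
\[
\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} = 0, \qquad (2x) + \lambda (4) = 0, \qquad 2x + 4\lambda = 0,
\]
\[
\frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} = 0, \qquad (2y) + \lambda (-4) = 0, \qquad 2y - 4\lambda = 0,
\]
\[
\frac{\partial f}{\partial z} + \lambda \frac{\partial g}{\partial z} = 0, \qquad (2z) + \lambda (2) = 0, \qquad 2z + 2\lambda = 0.
\]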
Our next move is to determine the value(s) of λ, the Lagrange multiplier(s).
Note: The easiest way to do this is to use the above equations (which can be
simplified to x = −2λ, y = 2λ, z = −λ) and substitute them into the
constraint:
\[
4x - 4y + 2z = 36, \qquad 4(-2\lambda) - 4(2\lambda) + 2(-\lambda) = 36, \qquad -8\lambda - 8\lambda - 2\lambda = 36, \qquad -18\lambda = 36.
\]
Hence λ = −2. Because the constraint has already been used to find λ, back-substituting λ = −2 into x = −2λ, y = 2λ, z = −λ gives the solution (x, y, z) = (4, −4, 2) directly, without any further use of the constraint. The minimum distance is therefore $\sqrt{4^2 + (-4)^2 + 2^2} = \sqrt{36} = 6$.
Amazingly, this question was quicker than its two-dimensional counterparts.
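The steps of Example 10 carry over to any plane ax + by + cz = d. The Python sketch below is a minimal illustration under that assumption; the function name `closest_point_on_plane` and its parameters are hypothetical and are not taken from these notes.

```python
import math

def closest_point_on_plane(a, b, c, d):
    """Closest point to the origin on the plane a x + b y + c z = d,
    from the Lagrange-multiplier equations for f = x^2 + y^2 + z^2."""
    # 2x + lam*a = 0 (and similarly for y, z) gives (x, y, z) = -(lam/2) (a, b, c);
    # substituting into the constraint then fixes lam.
    lam = -2.0 * d / (a * a + b * b + c * c)
    point = (-lam * a / 2.0, -lam * b / 2.0, -lam * c / 2.0)
    distance = math.sqrt(sum(p * p for p in point))
    return lam, point, distance

lam, point, distance = closest_point_on_plane(4.0, -4.0, 2.0, 36.0)
print(lam, point, distance)  # expect lam = -2, point (4, -4, 2), distance 6
```

The distance also matches the standard point-to-plane formula |d| / √(a² + b² + c²) = 36/6 = 6.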
Example 11 Use the Lagrange multiplier method to determine the maximum
and minimum temperature on the sphere of radius $5\sqrt{2}$ centred at
the origin (0, 0, 0), where the temperature is the function of position shown
below:
\[
T(x, y, z) = 273 + 2z\,(3x + 4y).
\]
As usual, it is a good idea to have no ambiguity about the problem in question:
Find Stationary–Points of T (x, y, z) = 273 + 6xz + 8yz,
Subject to the Constraint $g(x, y, z) = x^2 + y^2 + z^2 - 50$.
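First off, we solve the Lagrange Multiplier equations (of which there are
three):
\[
\frac{\partial T}{\partial x} + \lambda \frac{\partial g}{\partial x} = 0, \qquad (6z) + \lambda (2x) = 0, \qquad \lambda x + 3z = 0,
\]
\[
\frac{\partial T}{\partial y} + \lambda \frac{\partial g}{\partial y} = 0, \qquad (8z) + \lambda (2y) = 0, \qquad \lambda y + 4z = 0,
\]
\[
\frac{\partial T}{\partial z} + \lambda \frac{\partial g}{\partial z} = 0, \qquad (6x + 8y) + \lambda (2z) = 0, \qquad 3x + 4y + \lambda z = 0.
\]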
Our next move is to determine the value(s) of λ, the Lagrange multiplier(s).
However, we will not get the ‘free lunch’ here that we got in Example 10!
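We must once again re-write these equations in matrix notation, in order to
find the values of λ that give us the most general non-trivial solution for
(x, y, z):
\[
\left.
\begin{aligned}
\lambda x + (0)\,y + 3z &= 0, \\
(0)\,x + \lambda y + 4z &= 0, \\
3x + 4y + \lambda z &= 0,
\end{aligned}
\right\}
\;\Rightarrow\;
\begin{pmatrix} \lambda & 0 & 3 \\ 0 & \lambda & 4 \\ 3 & 4 & \lambda \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\]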
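so that the λs giving us the most general solution for (x, y, z) can be found
via
\[
\begin{vmatrix} \lambda & 0 & 3 \\ 0 & \lambda & 4 \\ 3 & 4 & \lambda \end{vmatrix} = 0,
\qquad \lambda^3 - 16\lambda - 9\lambda = 0,
\qquad \lambda\,(\lambda - 5)(\lambda + 5) = 0.
\]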
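The three roots can be checked by evaluating the determinant directly. The following Python sketch does this with a cofactor expansion along the first row; the helper names `det3` and `characteristic` are illustrative only.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def characteristic(lam):
    """Determinant of the coefficient matrix of the three Lagrange equations."""
    return det3([[lam, 0.0, 3.0],
                 [0.0, lam, 4.0],
                 [3.0, 4.0, lam]])

for lam in (-5.0, 0.0, 5.0):
    print(lam, characteristic(lam))  # each value should be 0
```

For any other value of λ the determinant is non-zero (for example, λ = 1 gives 1 − 25 = −24), so only the trivial solution x = y = z = 0 would remain, which cannot lie on the sphere.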
We have to determine the stationary points corresponding to each value of λ.
For λ = 0, the first two equations both reduce to z = 0, leaving us with:
\[
\left.
\begin{aligned}
3z &= 0, \\
4z &= 0, \\
3x + 4y &= 0,
\end{aligned}
\right\}
\;\Rightarrow\;
\begin{aligned}
z &= 0, \\
y &= -3x/4.
\end{aligned}
\]
By substituting these back into the constraint $x^2 + y^2 + z^2 = 50$, we find that
the stationary points are $(\pm 4\sqrt{2}, \mp 3\sqrt{2}, 0)$, giving us a temperature of T = 273.
For λ = +5, we find that once again, one of these equations disappears,
since the third equation (below) is a superposition of the first two, leaving
us with:
\[
\left.
\begin{aligned}
5x + 3z &= 0, \\
5y + 4z &= 0, \\
3x + 4y + 5z &= 0,
\end{aligned}
\right\}
\;\Rightarrow\;
\begin{aligned}
5x + 3z &= 0, \\
5y + 4z &= 0.
\end{aligned}
\]
By substituting these (i.e. x = −3z/5, y = −4z/5) back into the constraint,
we get the stationary points (±3, ±4, ∓5), giving us a temperature of T = 23.
For λ = −5, we find that once again, one of these equations disappears,
since the third equation is again a superposition of the first two, leaving us
with:
\[
\left.
\begin{aligned}
-5x + 3z &= 0, \\
-5y + 4z &= 0, \\
3x + 4y - 5z &= 0,
\end{aligned}
\right\}
\;\Rightarrow\;
\begin{aligned}
5x - 3z &= 0, \\
5y - 4z &= 0.
\end{aligned}
\]
By substituting these (i.e. x = +3z/5, y = +4z/5) back into the constraint,
we get the stationary points (±3, ±4, ±5), giving us a temperature of T = 523.
Clearly the sphere is hottest at the points (±3, ±4, ±5), at 523 °K (where °K
denotes degrees Kelvin), and is coolest at the points (±3, ±4, ∓5), at 23 °K.
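The stationary points of Example 11 can be verified by substituting them back into the three Lagrange equations and the constraint. The Python sketch below does this for one representative point per value of λ; the helper names `temperature` and `residuals` are illustrative only.

```python
import math

def temperature(x, y, z):
    """T(x, y, z) = 273 + 2z(3x + 4y)."""
    return 273.0 + 2.0 * z * (3.0 * x + 4.0 * y)

def residuals(x, y, z, lam):
    """Left-hand sides of the three Lagrange equations and of the constraint."""
    return (lam * x + 3.0 * z,
            lam * y + 4.0 * z,
            3.0 * x + 4.0 * y + lam * z,
            x * x + y * y + z * z - 50.0)

r2 = math.sqrt(2.0)
candidates = [
    (0.0, (4.0 * r2, -3.0 * r2, 0.0)),   # lambda = 0
    (5.0, (3.0, 4.0, -5.0)),             # lambda = +5
    (-5.0, (3.0, 4.0, 5.0)),             # lambda = -5
]
for lam, p in candidates:
    # Every residual should vanish (up to rounding) at a stationary point.
    print(lam, p, temperature(*p), residuals(*p, lam))
```

The printed temperatures should be 273, 23 and 523 respectively; the sign-flipped partner of each point gives the same values, because T and the constraint are unchanged under (x, y, z) → (−x, −y, −z).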
Example 12 Use the Lagrange multiplier method to determine the maximum and minimum distance from the origin to any point on the ellipsoid
defined by
\[
x^2/4 + y^2/9 + z^2/16 = 1.
\]
As usual, it is a good idea to have no ambiguity about the problem in question:
Find Stationary–Points of $f(x, y, z) = x^2 + y^2 + z^2$,
Subject to the Constraint $g(x, y, z) = x^2/4 + y^2/9 + z^2/16 - 1$,
bearing in mind that the distance between the origin (0, 0, 0) and a point
(x, y, z) on the ellipsoid is Distance $= \sqrt{x^2 + y^2 + z^2}$, and that the distance-squared function $f(x, y, z) = x^2 + y^2 + z^2$ is yet again used just for the sake of
convenience.
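First off, we solve the Lagrange Multiplier equations (of which there are
three):
\[
\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} = 0, \qquad (2x) + \lambda (2x/4) = 0, \qquad (\lambda + 4)\, x = 0,
\]
\[
\frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} = 0, \qquad (2y) + \lambda (2y/9) = 0, \qquad (\lambda + 9)\, y = 0,
\]
\[
\frac{\partial f}{\partial z} + \lambda \frac{\partial g}{\partial z} = 0, \qquad (2z) + \lambda (2z/16) = 0, \qquad (\lambda + 16)\, z = 0.
\]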
Our next move is to determine all three values of λ, the Lagrange multiplier.
The three values that the Lagrange multiplier will take on are actually in
plain view, but it is still possible, albeit a little heavy-handed, to solve this
(as before) by re-writing the three equations in matrix notation, as shown
below:
\[
\left.
\begin{aligned}
(\lambda + 4)\,x + (0)\,y + (0)\,z &= 0, \\
(0)\,x + (\lambda + 9)\,y + (0)\,z &= 0, \\
(0)\,x + (0)\,y + (\lambda + 16)\,z &= 0,
\end{aligned}
\right\}
\;\Rightarrow\;
\begin{pmatrix} \lambda + 4 & 0 & 0 \\ 0 & \lambda + 9 & 0 \\ 0 & 0 & \lambda + 16 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\]
so that the λs giving us the most general solution for (x, y, z) can be found
via
\[
\begin{vmatrix} \lambda + 4 & 0 & 0 \\ 0 & \lambda + 9 & 0 \\ 0 & 0 & \lambda + 16 \end{vmatrix} = 0,
\qquad (\lambda + 4)(\lambda + 9)(\lambda + 16) = 0.
\]
We have to determine the stationary points corresponding to each value of
λ.
For λ = −4, we obviously get y = 0, z = 0, and – after substituting
y = z = 0 into the constraint equation – x = ±2. The solutions are therefore
(±2, 0, 0).
For λ = −9, we obviously get x = 0, z = 0, and – after substituting
x = z = 0 into the constraint equation – y = ±3. The solutions are therefore
(0, ±3, 0).
For λ = −16, we obviously get x = 0, y = 0, and – after substituting
x = y = 0 into the constraint equation – z = ±4. The solutions are therefore
(0, 0, ±4).
Clearly the shortest distance to the ellipsoid is 2 units, and the longest
distance is twice that. From a geometrical point of view, this result is not
surprising, since the shortest and longest semi-axes of the ellipsoid are 2 and 4.
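As an independent check of Example 12, the sketch below parametrises the ellipsoid by two angles and scans a grid for the smallest and largest distance from the origin. This is a brute-force numerical check rather than the Lagrange method; the parametrisation, the helper name `distance_on_ellipsoid` and the grid size `n` are assumptions made for illustration.

```python
import math

def distance_on_ellipsoid(theta, phi):
    """Distance from the origin to the point of x^2/4 + y^2/9 + z^2/16 = 1
    with angular parameters theta (azimuth) and phi (polar angle)."""
    x = 2.0 * math.sin(phi) * math.cos(theta)
    y = 3.0 * math.sin(phi) * math.sin(theta)
    z = 4.0 * math.cos(phi)
    return math.sqrt(x * x + y * y + z * z)

n = 360  # grid resolution; the grid contains (2, 0, 0) and (0, 0, 4) up to rounding
distances = [distance_on_ellipsoid(2.0 * math.pi * i / n, math.pi * j / n)
             for i in range(n) for j in range(n + 1)]
d_min, d_max = min(distances), max(distances)
print(d_min, d_max)  # should be close to 2 and 4
```

The scan reproduces the extreme distances 2 and 4; the intermediate stationary value 3, at (0, ±3, 0), is neither a maximum nor a minimum.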