Vector and Matrix Norms
Matrix Computations - CPSC 5006 E
Julien Dompierre
Department of Mathematics and Computer Science
Laurentian University
Sudbury, September 29, 2010

Why Norms (p. 52)
Norms serve the same purpose on vector spaces that absolute
values do on the real line: they furnish a measure of distance.
More precisely, $\mathbb{R}^n$ together with a norm on $\mathbb{R}^n$ defines a metric
space. Therefore, we have the familiar notions of neighborhood,
open sets, convergence, and continuity when working with vectors
and vector-valued functions.

Reminder: Inner Product
The inner product $(x, y)$ of two vectors $x$ and $y$ in $\mathbb{R}^n$ is given by
$$(x, y) = x^T y = \sum_{i=1}^{n} x_i y_i = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$$
For complex vectors, the inner product $(x, y)$ of two vectors $x$
and $y$ in $\mathbb{C}^n$ is given by
$$(x, y) = \sum_{i=1}^{n} x_i \bar{y}_i = x_1 \bar{y}_1 + x_2 \bar{y}_2 + \cdots + x_n \bar{y}_n = y^H x.$$

Vector Norm on a Vector Space
Definition
A vector norm on a vector space $V$ over a field $K$ ($K = \mathbb{R}$ or $K = \mathbb{C}$)
is a function $f : V \to \mathbb{R}$ that satisfies the following properties:
1. $f(x) \ge 0$ for all $x \in V$.
2. $f(x) = 0$ if and only if $x = 0$.
3. $f(\alpha x) = |\alpha| f(x)$ for all $\alpha \in K$ and for all $x \in V$.
4. $f(x + y) \le f(x) + f(y)$ for all $x, y \in V$.
Note. Property 4 is called the triangle inequality.

Vector Norm on $\mathbb{R}^n$
An important example is when the vector space $V$ is $\mathbb{R}^n$ and the
field $K$ is $\mathbb{R}$.
Definition
A vector norm on the vector space $\mathbb{R}^n$ over the field $\mathbb{R}$ is a function
$f : \mathbb{R}^n \to \mathbb{R}$ that satisfies the following properties:
1. $f(x) \ge 0$ for all $x \in \mathbb{R}^n$.
2. $f(x) = 0$ if and only if $x = 0$.
3. $f(\alpha x) = |\alpha| f(x)$ for all $\alpha \in \mathbb{R}$ and for all $x \in \mathbb{R}^n$.
4. $f(x + y) \le f(x) + f(y)$ for all $x, y \in \mathbb{R}^n$.
Note. Property 4 is called the triangle inequality.
We denote such a function $f$ with a double bar notation:
$f(x) = \|x\|$. Subscripts on the double bar are used to distinguish
between various norms.

Important Example: Euclidean Norm
Suppose that $V = \mathbb{C}^2$ and $K = \mathbb{C}$. Then the function
$$f(x) = \|x\|_2 = \sqrt{x^H x} = (x, x)^{1/2} = \sqrt{|x_1|^2 + |x_2|^2}$$
is a vector norm.
Suppose that $V = \mathbb{C}^n$ and $K = \mathbb{C}$. Then the function
$$f(x) = \|x\|_2 = \sqrt{x^H x} = (x, x)^{1/2} = \sqrt{|x_1|^2 + |x_2|^2 + \cdots + |x_n|^2}$$
is a vector norm.
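
The Euclidean norm and the four norm properties can be spot-checked numerically. The following is a minimal sketch, assuming vectors are plain Python lists of complex numbers; the helper names `inner` and `norm2` and the sample vectors are illustrative choices, not part of the course material. A check on a few vectors is of course not a proof.

```python
import math

def inner(x, y):
    """Inner product (x, y) = sum_i x_i * conj(y_i) = y^H x."""
    return sum(xi * complex(yi).conjugate() for xi, yi in zip(x, y))

def norm2(x):
    """Euclidean norm ||x||_2 = (x, x)^(1/2); (x, x) is real and nonnegative."""
    return math.sqrt(inner(x, x).real)

# Illustrative sample data (assumed, not from the slides).
x = [3 + 4j, 0j]
y = [1j, 2 - 1j]
alpha = 2 - 3j
tol = 1e-12

print(norm2(x))  # 5.0

# Property 1 and 2: nonnegativity, and f(0) = 0.
assert norm2(x) >= 0 and norm2([0j, 0j]) == 0.0
# Property 3: f(alpha x) = |alpha| f(x).
assert abs(norm2([alpha * xi for xi in x]) - abs(alpha) * norm2(x)) < tol
# Property 4: triangle inequality.
x_plus_y = [xi + yi for xi, yi in zip(x, y)]
assert norm2(x_plus_y) <= norm2(x) + norm2(y) + tol
```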

The p-Norms (p. 52)
The norm $\|\cdot\|_2$ is the most common vector norm in linear algebra. It is
a special case of the $p$-norms, $p \ge 1$, defined by
$$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p} = (|x_1|^p + |x_2|^p + \cdots + |x_n|^p)^{1/p}.$$
Of these, the 1, 2, and $\infty$ norms are the most important in
practice:
$$\|x\|_1 = |x_1| + |x_2| + \cdots + |x_n|,$$
$$\|x\|_2 = (|x_1|^2 + |x_2|^2 + \cdots + |x_n|^2)^{1/2} = (x^H x)^{1/2},$$
$$\|x\|_\infty = \max(|x_1|, |x_2|, \ldots, |x_n|) = \max_{1 \le i \le n} |x_i|.$$
Note. The infinity norm is the limit of the $p$-norm as $p$ tends to
infinity, i.e.
$$\lim_{p \to \infty} \|x\|_p = \max_{1 \le i \le n} |x_i| = \|x\|_\infty.$$

Hölder Inequality (p. 53)
A classic result concerning $p$-norms is the Hölder inequality:
$$|x^T y| \le \|x\|_p \|y\|_q, \quad \text{where} \quad \frac{1}{p} + \frac{1}{q} = 1.$$
The numbers $p$ and $q$ above are said to be Hölder conjugates of
each other.
A very important special case of the Hölder inequality is the
Cauchy-Schwarz inequality:
$$|x^T y| \le \|x\|_2 \|y\|_2.$$
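
The $p$-norm definition, the limit $p \to \infty$, and the Hölder inequality can be illustrated with a short sketch. It assumes vectors stored as Python lists; the function name `p_norm`, the sample vectors, and the choice $p = 3$ are illustrative assumptions.

```python
import math

def p_norm(x, p):
    """p-norm of a vector for p >= 1; p = math.inf gives max |x_i|."""
    if p == math.inf:
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

# Illustrative sample vectors (assumed, not from the slides).
x = [3.0, -4.0, 1.0]
y = [1.0, 2.0, -2.0]

print(p_norm(x, 1), p_norm(x, 2), p_norm(x, math.inf))

# As p grows, the p-norm approaches the infinity norm (here 4.0).
for p in (2, 8, 32, 128):
    print(p, p_norm(x, p))

# Hoelder inequality with conjugate exponents p = 3 and q = p / (p - 1) = 3/2.
dot = abs(sum(a * b for a, b in zip(x, y)))
p = 3.0
q = p / (p - 1.0)
holder_ok = dot <= p_norm(x, p) * p_norm(y, q)
# Cauchy-Schwarz is the special case p = q = 2.
cauchy_schwarz_ok = dot <= p_norm(x, 2) * p_norm(y, 2)
print(holder_ok, cauchy_schwarz_ok)
```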

Norm Equivalence (p. 53)
All norms on $\mathbb{R}^n$ are equivalent, i.e., if $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$ are norms
on $\mathbb{R}^n$, then there exist positive constants $c_1$ and $c_2$ such that
$$c_1 \|x\|_\alpha \le \|x\|_\beta \le c_2 \|x\|_\alpha$$
for all $x \in \mathbb{R}^n$.
For example, if $x \in \mathbb{R}^n$, then
$$\|x\|_2 \le \|x\|_1 \le \sqrt{n}\, \|x\|_2,$$
$$\|x\|_\infty \le \|x\|_2 \le \sqrt{n}\, \|x\|_\infty,$$
$$\|x\|_\infty \le \|x\|_1 \le n\, \|x\|_\infty.$$

Unit Circles
What are the unit circles $C_p = \{x \text{ such that } \|x\|_p = 1\}$ associated
with the norm $\|\cdot\|_p$ for $p = 1, 2, \infty$, in $\mathbb{R}^2$?
[Figure: level curves (0.33, 0.67, 1, 1.3) of $\|x\|_p$ for $p = 1, 2, \infty$ in the plane, with axes X and Y running from $-1.5$ to $1.5$.]

Convergence of Vectors (p. 54)
We say that a sequence $\{x^{(k)}\}$, $k = 1, 2, \ldots$, of $n$-vectors
converges to a vector $x$ with respect to the norm $\|\cdot\|$ if
$$\lim_{k \to \infty} \|x^{(k)} - x\| = 0.$$
Note that because of the equivalence of norms, convergence in the
$\alpha$-norm implies convergence in the $\beta$-norm and vice versa.
We will use the notation
$$\lim_{k \to \infty} x^{(k)} = x.$$
The sequence $\{x^{(k)}\}$ of $n$-vectors (here $n = 3$)
$$\{x^{(k)}\} = \left\{ \begin{pmatrix} 1 + 1/k \\ k/(k+2) \\ 1/k \end{pmatrix} \right\} = \begin{pmatrix} 2 \\ 1/3 \\ 1 \end{pmatrix}, \begin{pmatrix} 3/2 \\ 2/4 \\ 1/2 \end{pmatrix}, \begin{pmatrix} 4/3 \\ 3/5 \\ 1/3 \end{pmatrix}, \begin{pmatrix} 5/4 \\ 4/6 \\ 1/4 \end{pmatrix}, \cdots$$
converges to
$$x = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}.$$
Note: Convergence of $x^{(k)}$ to $x$ is the same as the convergence of
each individual component $x_i^{(k)}$ of $x^{(k)}$ to the corresponding
component $x_i$ of $x$.
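
The convergence of the example sequence $x^{(k)} = (1 + 1/k,\ k/(k+2),\ 1/k)^T$ to $(1, 1, 0)^T$ can be observed in the 1-, 2-, and $\infty$-norms at once, together with the equivalence inequalities between these norms. This is a minimal sketch; the names `x_k`, `norms`, and `LIMIT`, and the sampled values of $k$, are illustrative assumptions.

```python
import math

def x_k(k):
    """k-th term of the example sequence x^(k) = (1 + 1/k, k/(k+2), 1/k)."""
    return [1 + 1 / k, k / (k + 2), 1 / k]

LIMIT = [1.0, 1.0, 0.0]

def norms(v):
    """Return (||v||_1, ||v||_2, ||v||_inf)."""
    a = [abs(vi) for vi in v]
    return sum(a), math.sqrt(sum(ai * ai for ai in a)), max(a)

tol = 1e-12
for k in (1, 10, 100, 1000):
    err = [xi - li for xi, li in zip(x_k(k), LIMIT)]
    n1, n2, ninf = norms(err)
    n = len(err)
    # Equivalence inequalities between the 1-, 2-, and infinity norms.
    assert n2 <= n1 + tol and n1 <= math.sqrt(n) * n2 + tol
    assert ninf <= n2 + tol and n2 <= math.sqrt(n) * ninf + tol
    assert ninf <= n1 + tol and n1 <= n * ninf + tol
    # All three error norms shrink together as k grows.
    print(k, n1, n2, ninf)
```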

Matrix Norm (p. 55)
Since $\mathbb{R}^{m \times n}$ is isomorphic to $\mathbb{R}^{mn}$, i.e., we can consider $m \times n$
matrices as vectors in $\mathbb{R}^{mn}$, the definition of a matrix norm
should be equivalent to the definition of a vector norm. In
particular, $f : \mathbb{R}^{m \times n} \to \mathbb{R}$ is a matrix norm if the following four
properties hold:
1. $f(A) \ge 0$ for all $A \in \mathbb{R}^{m \times n}$.
2. $f(A) = 0$ if and only if $A = 0$.
3. $f(\alpha A) = |\alpha| f(A)$ for all $\alpha \in \mathbb{R}$ and for all $A \in \mathbb{R}^{m \times n}$.
4. $f(A + B) \le f(A) + f(B)$ for all $A, B \in \mathbb{R}^{m \times n}$.
Note. As with vector norms, we use a double bar notation with
subscripts to designate matrix norms, i.e., $f(A) = \|A\|$.

Frobenius Norm (p. 55)
The most frequently used matrix norm in numerical linear algebra
is the Frobenius norm (or Euclidean norm or Schur norm).
This norm is the same as the 2-norm of the column vector in $\mathbb{R}^{mn}$
consisting of all the columns (respectively rows) of $A$, i.e.
$$\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2}.$$

Submultiplicativity Property of Matrix Norms (p. 55)
A matrix norm $\|\cdot\|$ satisfies the submultiplicativity property (or
consistency) if
$$\|AB\| \le \|A\| \|B\|.$$
The Frobenius norm is submultiplicative. (Hint: Use
Cauchy-Schwarz.)
However, the Frobenius norm of the identity matrix is not one (for $n > 1$), i.e.,
$$\|I_n\|_F = \sqrt{n} \ne 1.$$
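
The Frobenius norm, its submultiplicativity on one pair of matrices, and the value $\|I_n\|_F = \sqrt{n}$ can be checked with a short sketch. It assumes matrices stored as lists of rows; the helper names and the sample matrices are illustrative assumptions, and one pair of matrices does not prove the general property.

```python
import math

def frobenius(A):
    """Frobenius norm: square root of the sum of |a_ij|^2 over all entries."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Illustrative sample matrices (assumed, not from the slides).
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, -1.0], [2.0, 1.0]]

lhs = frobenius(matmul(A, B))       # ||AB||_F
rhs = frobenius(A) * frobenius(B)   # ||A||_F ||B||_F
print(lhs <= rhs)                   # True

print(frobenius(identity(4)))       # 2.0, i.e. sqrt(4), not 1
```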

Vector Norms in $\mathbb{R}^{m \times n}$ and Submultiplicativity
In general, vector norms on $\mathbb{R}^{mn}$, viewed as matrix norms, are not
submultiplicative.
For example, consider $\|A\|_\Delta = \max_{i,j} |a_{ij}|$, which is the infinity
norm $\|\cdot\|_\infty$ in $\mathbb{R}^{mn}$, and let
$$A = B = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},$$
then
$$2 = \|AB\|_\Delta > \|A\|_\Delta \|B\|_\Delta = 1 \cdot 1 = 1.$$

Induced Matrix Norm by a Vector Norm
Let $\|\cdot\|$ be a vector norm on $\mathbb{R}^n$. From this vector norm, we can
define a corresponding matrix norm as follows:
$$\|A\| = \sup_{x \in \mathbb{R}^n,\ x \ne 0} \frac{\|Ax\|}{\|x\|}.$$
These norms satisfy the usual properties of vector norms.
These norms are submultiplicative (or consistent).
With these norms, the identity matrix satisfies $\|I_n\| = 1$.
Again, important cases are with vector norms $\|\cdot\|_p$ with $p = 1, 2$,
and $\infty$:
$$\|A\|_p = \max_{x \in \mathbb{R}^n,\ x \ne 0} \frac{\|Ax\|_p}{\|x\|_p}.$$
The matrix norm $\|\cdot\|_p$ is induced by the vector norm $\|\cdot\|_p$.
The matrix norm $\|\cdot\|_p$ is subordinate to the vector norm $\|\cdot\|_p$.
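
The ratio $\|Ax\|/\|x\|$ in the definition of the induced norm can be explored by sampling. The sketch below uses the vector $\infty$-norm and the all-ones matrix $A$ from the max-entry example. Random sampling only gives a lower bound on the supremum, so this is an illustration rather than a way to compute the norm; the names `ratio` and `sampled_lower_bound`, the number of trials, and the seed are assumptions.

```python
import random

def inf_norm(v):
    """Vector infinity norm: max |v_i|."""
    return max(abs(vi) for vi in v)

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def ratio(A, x):
    """||Ax||_inf / ||x||_inf for a nonzero vector x."""
    return inf_norm(matvec(A, x)) / inf_norm(x)

def sampled_lower_bound(A, trials=1000, seed=0):
    """Largest ratio found over random x; a lower bound on the induced norm."""
    rng = random.Random(seed)
    n = len(A[0])
    best = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        if inf_norm(x) > 0.0:
            best = max(best, ratio(A, x))
    return best

A = [[1.0, 1.0], [1.0, 1.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]

print(ratio(A, [1.0, 1.0]))     # 2.0, so the induced infinity norm of A is at least 2
print(sampled_lower_bound(A))   # never exceeds 2 for this A
print(sampled_lower_bound(I2))  # 1.0: every ratio equals 1 for the identity
```

For this $A$ the max-entry value is $\|A\|_\Delta = 1$, while the ratio reaches 2 at $x = (1, 1)^T$, so the max-entry norm is not the norm induced by the vector $\infty$-norm.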
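
Maximum on Unit Vectors (p. 55)
It is clear that $\|A\|_p$ is the $p$-norm of the largest vector obtained by
applying $A$ to a vector of unit $p$-norm:
$$\|A\|_p = \max_{x \ne 0} \frac{\|Ax\|_p}{\|x\|_p} = \max_{x \ne 0} \frac{\|Ax\|_p \|x\|_p^{-1}}{\|x\|_p \|x\|_p^{-1}} = \max_{x \ne 0} \left\| A \frac{x}{\|x\|_p} \right\|_p = \max_{\|x\|_p = 1} \|Ax\|_p.$$

Submultiplicativity of Matrix Norms
A consequence of the submultiplicativity of matrix norms is
$$\|A^k\|_p \le \|A\|_p^k.$$
This implies the following theorem. Let $A \in \mathbb{R}^{n \times n}$. The following
four conditions are equivalent:
1. $\|A\| < 1$ for some induced matrix norm $\|\cdot\|$.
2. $\lim_{k \to \infty} A^k = 0$.
3. $\lim_{k \to \infty} A^k x = 0$ for all $x \in \mathbb{R}^n$.
4. $\rho(A) < 1$, where $\rho(A)$ denotes the spectral radius of $A$.
In particular, $\|A\|_p < 1$ for some $p$ is sufficient for conditions 2 to 4,
but it is not necessary: a matrix with $\rho(A) < 1$ can have $\|A\|_p \ge 1$ for every $p$.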
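
The bound $\|A^k\| \le \|A\|^k$ and the decay of $A^k$ when $\|A\| < 1$ can be observed numerically. The sketch below uses the $\infty$-norm, computed as the maximum absolute row sum; the sample matrix, the helper names, and the number of powers are illustrative assumptions.

```python
def inf_norm(A):
    """Induced infinity norm, computed as the maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Illustrative matrix with ||A||_inf = 0.75 < 1 (assumed, not from the slides).
A = [[0.5, 0.25], [0.125, 0.5]]
norm_A = inf_norm(A)

P = A  # P holds A^k at the start of iteration k
for k in range(1, 31):
    # Consequence of submultiplicativity: ||A^k|| <= ||A||^k.
    assert inf_norm(P) <= norm_A ** k * (1 + 1e-12)
    if k in (1, 5, 10, 30):
        print(k, inf_norm(P), norm_A ** k)
    P = matmul(P, A)

# After the loop P = A^31, whose norm is far below 1.
print(inf_norm(P))
```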

Convenient Expression for $\|\cdot\|_1$ and $\|\cdot\|_\infty$
$$\|A\|_1 = \max_{\|x\|_1 = 1} \|Ax\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{m} |a_{ij}|$$
$$\|A\|_\infty = \max_{\|x\|_\infty = 1} \|Ax\|_\infty = \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{ij}|$$
The 1-norm of a matrix is the maximum absolute column sum and the
$\infty$-norm is the maximum absolute row sum.

Spectral Radius of a Matrix
If $A \in \mathbb{R}^{n \times n}$, then the spectral radius of $A$, denoted by $\rho(A)$, is
given by
$$\rho(A) = \max_{1 \le i \le n} |\lambda_i(A)|,$$
where $\lambda_i(A)$, $i = 1, 2, \ldots, n$, are all the eigenvalues of $A$.
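
Properties of the Spectral Radius
$\rho(A)$ is not a norm. Indeed, for
$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},$$
we have $\rho(A) = 0$ while $A \ne 0$. Also, the triangle inequality is not
satisfied for the pair $A$ and $B = A^T$: $\rho(A + B) = 1$, while $\rho(A) + \rho(B) = 0$.
Another property is $\rho(A) \le \|A\|$ for any submultiplicative (consistent) matrix norm.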
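
The column-sum and row-sum expressions for $\|A\|_1$ and $\|A\|_\infty$, and the properties of the spectral radius, can be checked on $2 \times 2$ matrices. This is a minimal sketch that assumes $2 \times 2$ matrices only, so that the eigenvalues follow from the quadratic $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A) = 0$; the function names and the matrix `M` are illustrative assumptions.

```python
import cmath

def one_norm(A):
    """Matrix 1-norm: maximum absolute column sum."""
    return max(sum(abs(A[i][j]) for i in range(len(A)))
               for j in range(len(A[0])))

def inf_norm(A):
    """Matrix infinity norm: maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def spectral_radius_2x2(A):
    """rho(A) for a 2x2 matrix, from the roots of the characteristic polynomial."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)  # may be complex
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]   # B = A^T
S = [[0.0, 1.0], [1.0, 0.0]]   # S = A + B

# rho(A) = 0 although A is not the zero matrix.
print(spectral_radius_2x2(A))  # 0.0
# Triangle inequality fails: rho(A + B) = 1 > rho(A) + rho(B) = 0.
print(spectral_radius_2x2(S), spectral_radius_2x2(A) + spectral_radius_2x2(B))

# rho(M) is bounded by the induced 1- and infinity norms of M.
M = [[1.0, -2.0], [3.0, 4.0]]  # illustrative matrix with complex eigenvalues
print(spectral_radius_2x2(M), one_norm(M), inf_norm(M))
```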
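
$\|\cdot\|_2$ and Spectral Radius
$$\|A\|_2 = \sqrt{\rho(A^H A)} = \sqrt{\rho(A A^H)}.$$
Recall that $\operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii} = \sum_{i=1}^{n} \lambda_i(A)$. Then
$$\|A\|_F = \sqrt{\operatorname{tr}(A^H A)} = \sqrt{\operatorname{tr}(A A^H)}.$$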
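
Both identities can be verified on a small real matrix, where $A^H = A^T$. The sketch below assumes a real $2 \times 2$ matrix, so that $A^T A$ and $A A^T$ are symmetric $2 \times 2$ matrices with real eigenvalues given by the quadratic formula; all helper names and the sample matrix are illustrative assumptions.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def rho_sym_2x2(S):
    """Spectral radius of a real symmetric 2x2 matrix (real eigenvalues)."""
    tr = S[0][0] + S[1][1]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))  # guard against rounding
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))

A = [[1.0, 2.0], [3.0, 4.0]]  # illustrative real matrix
At = transpose(A)
AtA, AAt = matmul(At, A), matmul(A, At)

two_norm = math.sqrt(rho_sym_2x2(AtA))      # sqrt(rho(A^T A))
two_norm_alt = math.sqrt(rho_sym_2x2(AAt))  # sqrt(rho(A A^T))
frob_trace = math.sqrt(trace(AtA))          # sqrt(tr(A^T A))
frob_entries = math.sqrt(sum(a * a for row in A for a in row))

print(two_norm, two_norm_alt)    # the two values agree
print(frob_trace, frob_entries)  # the two values agree
```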
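
Equivalence of Matrix Norms (p. 56)
The Frobenius and $p$-norms (especially $p = 1, 2, \infty$) satisfy certain
inequalities that are frequently used in the analysis of matrix
computations. For $A \in \mathbb{R}^{m \times n}$ we have
$$\|A\|_2 \le \|A\|_F \le \sqrt{n}\, \|A\|_2,$$
$$\max_{i,j} |a_{ij}| \le \|A\|_2 \le \sqrt{mn}\, \max_{i,j} |a_{ij}|,$$
$$\frac{1}{\sqrt{n}} \|A\|_\infty \le \|A\|_2 \le \sqrt{m}\, \|A\|_\infty,$$
$$\frac{1}{\sqrt{m}} \|A\|_1 \le \|A\|_2 \le \sqrt{n}\, \|A\|_1.$$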
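
These four inequalities can be spot-checked on a rectangular matrix. The sketch below assumes a real matrix with exactly two rows ($m = 2$), so that $\|A\|_2 = \sqrt{\rho(A A^T)}$ only requires the eigenvalues of a symmetric $2 \times 2$ matrix; the function names and the sample matrix are illustrative assumptions.

```python
import math

def one_norm(A):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def inf_norm(A):
    """Maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def frobenius(A):
    return math.sqrt(sum(a * a for row in A for a in row))

def max_entry(A):
    return max(abs(a) for row in A for a in row)

def two_norm_two_rows(A):
    """2-norm of a real 2 x n matrix as sqrt of the largest eigenvalue of A A^T."""
    r0, r1 = A
    s00 = sum(a * a for a in r0)
    s11 = sum(a * a for a in r1)
    s01 = sum(a * b for a, b in zip(r0, r1))
    tr, det = s00 + s11, s00 * s11 - s01 * s01
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return math.sqrt((tr + disc) / 2)

def check_inequalities(A, tol=1e-12):
    """Return one boolean per inequality, in the order listed above."""
    m, n = len(A), len(A[0])
    n2 = two_norm_two_rows(A)
    return [
        n2 <= frobenius(A) + tol and frobenius(A) <= math.sqrt(n) * n2 + tol,
        max_entry(A) <= n2 + tol and n2 <= math.sqrt(m * n) * max_entry(A) + tol,
        inf_norm(A) / math.sqrt(n) <= n2 + tol and n2 <= math.sqrt(m) * inf_norm(A) + tol,
        one_norm(A) / math.sqrt(m) <= n2 + tol and n2 <= math.sqrt(n) * one_norm(A) + tol,
    ]

A = [[1.0, -2.0, 0.0], [3.0, 4.0, -1.0]]  # illustrative 2 x 3 matrix
print(check_inequalities(A))  # [True, True, True, True]
```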