MATH10212 Linear Algebra Lecture Notes

Textbook

Students are strongly advised to acquire a copy of the Textbook: D. C. Lay. Linear Algebra and its Applications. Pearson, 2006. ISBN 0-521-28713-4. Other editions can be used as well; the book is easily available on Amazon (this includes some very cheap used copies). These lecture notes should be treated only as an indication of the content of the course; the textbook contains detailed explanations, worked-out examples, etc.

Normally, homework assignments will consist of some odd-numbered exercises from the sections of the Textbook covered in the lectures of the previous week. The Textbook contains answers to most odd-numbered exercises.

About Homework

The Student Handbook 2.10 (f) says: As a rough guide you should be spending approximately twice the number of instruction hours in private study, mainly working through the examples sheets and reading your lecture notes and the recommended text books.

In respect of this course, MATH10212 Linear Algebra B, this means that students are expected to spend 8 (eight!) hours a week in private study of Linear Algebra.

What is Linear Algebra?

It is well known that the total cost of a purchase of amounts $g_1, g_2, g_3$ of some goods at prices $p_1, p_2, p_3$ is the expression
$$p_1g_1 + p_2g_2 + p_3g_3 = \sum_{i=1}^{3} p_ig_i.$$
Expressions of this kind,
$$a_1x_1 + \cdots + a_nx_n,$$
are called linear forms in the variables $x_1, \ldots, x_n$ with coefficients $a_1, \ldots, a_n$. Linear Algebra studies the mathematics of linear forms.

Lecture 1: Systems of linear equations [Lay 1.1]

A linear equation in the variables $x_1, \ldots, x_n$ is an equation that can be written in the form
$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = b,$$
where $b$ and the coefficients $a_1, \ldots, a_n$ are real numbers. The subscript $n$ can be any natural number.

A system of simultaneous linear equations is a collection of one or more linear equations involving the same variables, say $x_1, \ldots, x_n$. For example,
$$x_1 + x_2 = 3$$
$$x_1 - x_2 = 1.$$
We shall abbreviate the words "a system of simultaneous linear equations" to just "a linear system".

A solution of the system is a list $(s_1, \ldots, s_n)$ of numbers that makes each equation a true identity when the values $s_1, \ldots, s_n$ are substituted for $x_1, \ldots, x_n$, respectively. For example, in the system above $(2, 1)$ is a solution. The set of all possible solutions is called the solution set of the linear system. Two linear systems are equivalent if they have the same solution set.

We shall prove later in the course that a system of linear equations has either

• no solution, or
• exactly one solution, or
• infinitely many solutions,

under the assumption that the coefficients and solutions of the systems are real numbers.

A system of linear equations is said to be consistent if it has solutions (either one or infinitely many), and a system is inconsistent if it has no solution.

Solving a linear system

The basic strategy is to replace one system with an equivalent system (that is, with the same solution set) which is easier to solve.

Existence and uniqueness questions

• Is the system consistent?
• If a solution exists, is it unique?

Equivalence of linear systems

• When are two linear systems equivalent?

Lecture 2: Row reduction and echelon forms [Lay 1.2]

Matrix notation

It is convenient to write the coefficients of a linear system in the form of a matrix, a rectangular table.
For example, the system
$$x_1 - 2x_2 + 3x_3 = 1$$
$$x_1 + x_2 = 2$$
$$x_2 + x_3 = 3$$
has the matrix of coefficients
$$\begin{pmatrix} 1 & -2 & 3 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$$
and the augmented matrix
$$\begin{pmatrix} 1 & -2 & 3 & 1 \\ 1 & 1 & 0 & 2 \\ 0 & 1 & 1 & 3 \end{pmatrix};$$
notice how the coefficients are aligned in columns, and how missing coefficients are replaced by 0.

The augmented matrix in the example above has 3 rows and 4 columns; we say that it is a 3 × 4 matrix. Generally, a matrix with m rows and n columns is called an m × n matrix.

Elementary row operations

Replacement: Replace one row by the sum of itself and a multiple of another row.
Interchange: Interchange two rows.
Scaling: Multiply all entries in a row by a nonzero constant.

Two matrices are row equivalent if there is a sequence of elementary row operations that transforms one matrix into the other.

Note:
• The row operations are reversible.
• Row equivalence of matrices is an equivalence relation on the set of matrices.

Theorem: Row Equivalence. If the augmented matrices of two linear systems are row equivalent, then the two systems have the same solution set.

A nonzero row or column of a matrix is a row or column which contains at least one nonzero entry.

We can now formulate a theorem (to be proven later).

Theorem: Equivalence of linear systems. Two linear systems are equivalent if and only if the augmented matrix of one of them can be obtained from the augmented matrix of the other system by row operations and insertion / deletion of zero rows.

Lecture 3: Solution of Linear Systems [Lay 1.2]

A nonzero row or column of a matrix is a row or column which contains at least one nonzero entry. A leading entry of a row is the leftmost nonzero entry (of a nonzero row).

Definition. A matrix is in echelon form (or row echelon form) if it has the following three properties:

1. All nonzero rows are above any row of zeroes.
2. Each leading entry of a row is in a column to the right of the leading entry of the row above it.
3. All entries in a column below a leading entry are zeroes.

If, in addition, the following two conditions are satisfied,

4. all leading entries are equal to 1,
5. each leading 1 is the only nonzero entry in its column,

then the matrix is in reduced echelon form.

An echelon matrix is a matrix in echelon form. Any nonzero matrix can be row reduced (that is, transformed by elementary row operations) into a matrix in echelon form (but the same matrix can give rise to different echelon forms).

Examples. The following is a schematic presentation of an echelon matrix:
$$\begin{pmatrix} \ast & \ast & \ast & \ast \\ 0 & \ast & \ast & \ast \\ 0 & 0 & 0 & \ast \end{pmatrix}$$
and this is a reduced echelon matrix:
$$\begin{pmatrix} 1 & 0 & \ast & 0 & \ast \\ 0 & 1 & \ast & 0 & \ast \\ 0 & 0 & 0 & 1 & \ast \end{pmatrix}$$

Theorem 1.2.1: Uniqueness of the reduced echelon form. Each matrix is row equivalent to one and only one reduced echelon matrix.

Definition. A pivot position in a matrix A is a location in A that corresponds to a leading 1 in the reduced echelon form of A. A pivot column is a column of A that contains a pivot position. A pivot is a nonzero number in a pivot position which is used to create zeroes in the column below it.

A rule for row reduction:

1. Pick the leftmost nonzero column and in it the topmost nonzero entry; it is a pivot.
2. Using scaling, make the pivot equal to 1.
3. Using replacement row operations, kill all nonzero entries in the column below the pivot.
4. Mark the row and column containing the pivot as pivoted.
5. Repeat the same with the matrix made of the rows and columns not yet pivoted.
6. When this is over, interchange the rows, making sure that the resulting matrix is in echelon form.
7. Using replacement row operations, kill all nonzero entries in the columns above the pivot entries.

Example for solving in the lecture (The Row Reduction Algorithm):
$$\begin{pmatrix} 0 & 2 & 2 & 2 & 2 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 3 & 3 \\ 1 & 1 & 1 & 2 & 2 \end{pmatrix}$$
Solution of Linear Systems

When we have converted the augmented matrix of a linear system into its reduced row echelon form, we can write out the entire solution set of the system.

Example. Let
$$\begin{pmatrix} 1 & 0 & -5 & 1 \\ 0 & 1 & 1 & 4 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
be the augmented matrix of a linear system; then the system is equivalent to
$$x_1 - 5x_3 = 1$$
$$x_2 + x_3 = 4$$
$$0 = 0$$

The variables $x_1$ and $x_2$ correspond to pivot columns in the matrix and are called basic variables (also leading or pivot variables). The other variable, $x_3$, is a free variable. Free variables can be assigned arbitrary values, and then the leading variables can be expressed in terms of the free variables:
$$x_1 = 1 + 5x_3$$
$$x_2 = 4 - x_3$$
$$x_3 \text{ is free.}$$

Theorem 1.2.2: Existence and Uniqueness. A linear system is consistent if and only if the rightmost column of the augmented matrix is not a pivot column, that is, if and only if an echelon form of the augmented matrix has no row of the form
$$(0 \;\cdots\; 0 \;\; b)$$
with b nonzero. If a linear system is consistent, then the solution set contains either (i) a unique solution, when there are no free variables, or (ii) infinitely many solutions, when there is at least one free variable.

Using row reduction to solve a linear system

1. Write the augmented matrix of the system.
2. Use the row reduction algorithm to obtain an equivalent augmented matrix in echelon form. Decide whether the system is consistent.
3. If the system is consistent, get the reduced echelon form.
4. Write the system of equations corresponding to the matrix obtained in Step 3.
5. Express each basic variable in terms of any free variables appearing in the equation.

Lectures 5-6: Vector equations [Lay 1.3]

A matrix with only one column is called a column vector, or simply a vector. The set of all column vectors with n entries is denoted $\mathbb{R}^n$. A row vector is a matrix with one row.

Two vectors are equal if and only if they have
• the same shape,
• the same number of rows,
• and their corresponding entries are equal.

The sum u + v of two vectors u and v in $\mathbb{R}^n$ is obtained by adding corresponding entries in u and v. For example, in $\mathbb{R}^2$,
$$\begin{pmatrix}1\\2\end{pmatrix} + \begin{pmatrix}-1\\-1\end{pmatrix} = \begin{pmatrix}0\\1\end{pmatrix}.$$
The scalar multiple cv of a vector v and a real number ("scalar") c is the vector obtained by multiplying each entry in v by c. For example, in $\mathbb{R}^3$,
$$1.5\begin{pmatrix}1\\0\\-2\end{pmatrix} = \begin{pmatrix}1.5\\0\\-3\end{pmatrix}.$$
The vector whose entries are all zeroes is called the zero vector and denoted 0:
$$0 = \begin{pmatrix}0\\0\\\vdots\\0\end{pmatrix}.$$
Operations with row vectors are defined in a similar way.

Algebraic properties of $\mathbb{R}^n$

For all u, v, w ∈ $\mathbb{R}^n$ and all scalars c and d:

1. u + v = v + u
2. (u + v) + w = u + (v + w)
3. u + 0 = 0 + u = u
4. u + (−u) = −u + u = 0
5. c(u + v) = cu + cv
6. (c + d)u = cu + du
7. c(du) = (cd)u
8. 1u = u

(Here −u denotes (−1)u.)

Linear combinations

Given vectors $v_1, v_2, \ldots, v_p$ in $\mathbb{R}^n$ and scalars $c_1, c_2, \ldots, c_p$, the vector
$$y = c_1v_1 + \cdots + c_pv_p$$
is called a linear combination of $v_1, v_2, \ldots, v_p$ with weights $c_1, c_2, \ldots, c_p$.
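Linear combinations are easy to experiment with on a computer. Here is a minimal Python/NumPy sketch (not part of Lay's text; the vectors and weights are illustrative values):

    import numpy as np

    # Two vectors in R^3 and two weights (illustrative values).
    v1 = np.array([0.0, 1.0, 1.0])
    v2 = np.array([1.0, 1.0, 1.0])
    c1, c2 = 2.0, -1.0

    y = c1 * v1 + c2 * v2   # the linear combination c1*v1 + c2*v2
    print(y)                # [-1.  1.  1.]

Changing the weights sweeps out Span{v1, v2}, the set of all such combinations.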
Rewriting a linear system as a vector equation

Consider an example: the linear system
$$x_2 + x_3 = 2$$
$$x_1 + x_2 + x_3 = 3$$
$$x_1 + x_2 - x_3 = 2$$
can be written as an equality of two vectors:
$$\begin{pmatrix} x_2 + x_3 \\ x_1 + x_2 + x_3 \\ x_1 + x_2 - x_3 \end{pmatrix} = \begin{pmatrix} 2\\3\\2 \end{pmatrix},$$
which is the same as
$$x_1\begin{pmatrix}0\\1\\1\end{pmatrix} + x_2\begin{pmatrix}1\\1\\1\end{pmatrix} + x_3\begin{pmatrix}1\\1\\-1\end{pmatrix} = \begin{pmatrix}2\\3\\2\end{pmatrix}.$$

Let us write the matrix
$$\begin{pmatrix} 0&1&1&2 \\ 1&1&1&3 \\ 1&1&-1&2 \end{pmatrix}$$
in a way that calls attention to its columns:
$$(a_1 \;\; a_2 \;\; a_3 \;\; b).$$
Denote
$$a_1 = \begin{pmatrix}0\\1\\1\end{pmatrix}, \quad a_2 = \begin{pmatrix}1\\1\\1\end{pmatrix}, \quad a_3 = \begin{pmatrix}1\\1\\-1\end{pmatrix} \quad\text{and}\quad b = \begin{pmatrix}2\\3\\2\end{pmatrix};$$
then the vector equation can be written as
$$x_1a_1 + x_2a_2 + x_3a_3 = b.$$
Notice that to solve this equation is the same as to express b as a linear combination of $a_1, a_2, a_3$, and to find all such expressions. Therefore solving a linear system is the same as finding an expression of the vector of the right-hand side of the system as a linear combination of the columns of its matrix of coefficients.

A vector equation
$$x_1a_1 + x_2a_2 + \cdots + x_na_n = b$$
has the same solution set as the linear system whose augmented matrix is
$$(a_1 \;\; a_2 \;\; \cdots \;\; a_n \;\; b).$$
In particular, b can be generated by a linear combination of $a_1, a_2, \ldots, a_n$ if and only if there is a solution of the corresponding linear system.

Definition. If $v_1, \ldots, v_p$ are in $\mathbb{R}^n$, then the set of all linear combinations of $v_1, \ldots, v_p$ is denoted by Span{$v_1, \ldots, v_p$} and is called the subset of $\mathbb{R}^n$ spanned (or generated) by $v_1, \ldots, v_p$. That is, Span{$v_1, \ldots, v_p$} is the collection of all vectors which can be written in the form
$$c_1v_1 + c_2v_2 + \cdots + c_pv_p$$
with $c_1, \ldots, c_p$ scalars.

We say that vectors $v_1, \ldots, v_p$ span $\mathbb{R}^n$ if Span{$v_1, \ldots, v_p$} = $\mathbb{R}^n$.

Lecture 4: The matrix equation Ax = b [Lay 1.4]

Definition. If A is an m × n matrix with columns $a_1, \ldots, a_n$, and if x is in $\mathbb{R}^n$, then the product of A and x, denoted Ax, is the linear combination of the columns of A using the corresponding entries in x as weights:
$$Ax = (a_1 \;\; a_2 \;\; \cdots \;\; a_n)\begin{pmatrix}x_1\\\vdots\\x_n\end{pmatrix} = x_1a_1 + x_2a_2 + \cdots + x_na_n.$$

Example. The system
$$x_2 + x_3 = 2$$
$$x_1 + x_2 + x_3 = 3$$
$$x_1 + x_2 - x_3 = 2$$
was written as $x_1a_1 + x_2a_2 + x_3a_3 = b$, where
$$a_1 = \begin{pmatrix}0\\1\\1\end{pmatrix}, \quad a_2 = \begin{pmatrix}1\\1\\1\end{pmatrix}, \quad a_3 = \begin{pmatrix}1\\1\\-1\end{pmatrix} \quad\text{and}\quad b = \begin{pmatrix}2\\3\\2\end{pmatrix}.$$
In the matrix product notation it becomes
$$\begin{pmatrix}0&1&1\\1&1&1\\1&1&-1\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix} = \begin{pmatrix}2\\3\\2\end{pmatrix},$$
or Ax = b, where
$$A = (a_1\;\;a_2\;\;a_3), \qquad x = \begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}.$$

Theorem 1.4.3: If A is an m × n matrix with columns $a_1, \ldots, a_n$, and if b is in $\mathbb{R}^m$, the matrix equation
$$Ax = b$$
has the same solution set as the vector equation
$$x_1a_1 + x_2a_2 + \cdots + x_na_n = b,$$
which has the same solution set as the system of linear equations whose augmented matrix is
$$(a_1 \;\; a_2 \;\; \cdots \;\; a_n \;\; b).$$

Existence of solutions. The equation Ax = b has a solution if and only if b is a linear combination of the columns of A.

Theorem 1.4.4: Let A be an m × n matrix. Then the following statements are equivalent.
(a) For each b ∈ $\mathbb{R}^m$, the equation Ax = b has a solution.
(b) Each b ∈ $\mathbb{R}^m$ is a linear combination of the columns of A.
(c) The columns of A span $\mathbb{R}^m$.
(d) A has a pivot position in every row.

Row-vector rule for computing Ax. If the product Ax is defined, then the ith entry in Ax is the sum of the products of corresponding entries from row i of A and from the vector x.

Theorem 1.4.5: Properties of the matrix-vector product Ax. If A is an m × n matrix, u, v ∈ $\mathbb{R}^n$, and c is a scalar, then
(a) A(u + v) = Au + Av;
(b) A(cu) = c(Au).
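The worked example above can be checked numerically. Below is a minimal NumPy sketch (not part of the notes) that solves Ax = b and then verifies that Ax really is the linear combination of the columns of A with the entries of x as weights:

    import numpy as np

    A = np.array([[0., 1.,  1.],
                  [1., 1.,  1.],
                  [1., 1., -1.]])
    b = np.array([2., 3., 2.])

    x = np.linalg.solve(A, b)    # the unique solution, since A is invertible
    print(x)                     # [1.  1.5 0.5]

    # Ax as a linear combination of the columns of A (the definition above):
    combo = sum(x[j] * A[:, j] for j in range(3))
    print(np.allclose(combo, b))   # True

So (x1, x2, x3) = (1, 3/2, 1/2) expresses b as a linear combination of a1, a2, a3.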
Solution sets of linear equations [Lay 1.5]

Homogeneous linear systems

A linear system is homogeneous if it can be written as Ax = 0. A homogeneous system always has at least one solution, x = 0 (the trivial solution). Therefore for homogeneous systems an important question is the existence of a nontrivial solution, that is, of a nonzero vector x which satisfies Ax = 0:

The homogeneous system Ax = 0 has a nontrivial solution if and only if the equation has at least one free variable.

Example.
$$x_1 + 2x_2 - x_3 = 0$$
$$x_1 + 3x_2 + x_3 = 0$$

Nonhomogeneous systems

When a nonhomogeneous system has many solutions, the general solution can be written in parametric vector form as one vector plus an arbitrary linear combination of vectors that satisfy the corresponding homogeneous system.

Example.
$$x_1 + 2x_2 - x_3 = 0$$
$$x_1 + 3x_2 + x_3 = 5$$

Theorem 1.5.6: Suppose the equation Ax = b is consistent for some given b, and let p be a solution. Then the solution set of Ax = b is the set of all vectors of the form $w = p + v_h$, where $v_h$ is any solution of the homogeneous equation Ax = 0.

Lecture 7: Linear independence [Lay 1.7]

Definition. An indexed set of vectors {$v_1, \ldots, v_p$} in $\mathbb{R}^n$ is linearly independent if the vector equation
$$x_1v_1 + \cdots + x_pv_p = 0$$
has only the trivial solution.

The set {$v_1, \ldots, v_p$} in $\mathbb{R}^n$ is linearly dependent if there exist weights $c_1, \ldots, c_p$, not all zero, such that
$$c_1v_1 + \cdots + c_pv_p = 0.$$

Linear independence of matrix columns. The matrix equation Ax = 0, where A is made of columns, A = ($a_1 \cdots a_n$), can be written as
$$x_1a_1 + \cdots + x_na_n = 0.$$
Therefore the columns of a matrix A are linearly independent if and only if the equation Ax = 0 has only the trivial solution.

A set of one vector {$v_1$} is linearly dependent if and only if $v_1$ = 0.

A set of two vectors {$v_1, v_2$} is linearly dependent if and only if at least one of the vectors is a multiple of the other.

Theorem 1.7.7: Characterisation of linearly dependent sets. An indexed set S = {$v_1, \ldots, v_p$} of two or more vectors is linearly dependent if and only if at least one of the vectors in S is a linear combination of the others.

Theorem 1.7.8: Linear dependence of "big" sets. If a set contains more vectors than there are entries in each vector, then the set is linearly dependent. Thus, any set {$v_1, \ldots, v_p$} in $\mathbb{R}^n$ is linearly dependent if p > n.

Lecture 6: Introduction to linear transformations [Lay 1.8]

Transformation. A transformation T from $\mathbb{R}^n$ to $\mathbb{R}^m$ is a rule that assigns to each vector x in $\mathbb{R}^n$ a vector T(x) in $\mathbb{R}^m$. The set $\mathbb{R}^n$ is called the domain of T, the set $\mathbb{R}^m$ is the codomain of T.

Matrix transformations. With every m × n matrix A we can associate the transformation
$$T: \mathbb{R}^n \to \mathbb{R}^m, \qquad x \mapsto Ax.$$
In short: T(x) = Ax.

The range of a matrix transformation. The range of T is the set of all linear combinations of the columns of A. Indeed, this can be immediately seen from the fact that each image T(x) has the form
$$T(x) = Ax = x_1a_1 + \cdots + x_na_n.$$

Definition: Linear transformations. A transformation T: $\mathbb{R}^n \to \mathbb{R}^m$ is linear if:
• T(u + v) = T(u) + T(v) for all vectors u, v ∈ $\mathbb{R}^n$;
• T(cu) = cT(u) for all vectors u and all scalars c.

The identity matrix. An n × n matrix with 1's on the diagonal and 0's elsewhere is called the identity matrix $I_n$. For example,
$$I_3 = \begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}, \qquad I_4 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix}.$$
The columns of the identity matrix $I_n$ will be denoted $e_1, e_2, \ldots, e_n$. For example, in $\mathbb{R}^3$,
$$e_1 = \begin{pmatrix}1\\0\\0\end{pmatrix}, \quad e_2 = \begin{pmatrix}0\\1\\0\end{pmatrix}, \quad e_3 = \begin{pmatrix}0\\0\\1\end{pmatrix}.$$
Observe that if x is an arbitrary vector in $\mathbb{R}^n$,
$$x = \begin{pmatrix}x_1\\\vdots\\x_n\end{pmatrix},$$
then
$$x = x_1\begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix} + x_2\begin{pmatrix}0\\1\\\vdots\\0\end{pmatrix} + \cdots + x_n\begin{pmatrix}0\\0\\\vdots\\1\end{pmatrix} = x_1e_1 + x_2e_2 + \cdots + x_ne_n.$$

Properties of linear transformations. If T is a linear transformation then
$$T(0) = 0 \quad\text{and}\quad T(cu + dv) = cT(u) + dT(v).$$

The identity transformation. It is easy to check that $I_nx = x$ for all x ∈ $\mathbb{R}^n$. Therefore the linear transformation associated with the identity matrix is the identity transformation of $\mathbb{R}^n$:
$$\mathbb{R}^n \to \mathbb{R}^n, \qquad x \mapsto x.$$

The matrix of a linear transformation [Lay 1.9]

Theorem 1.9.10: The matrix of a linear transformation. Let T: $\mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. Then there exists a unique matrix A such that T(x) = Ax for all x ∈ $\mathbb{R}^n$. In fact, A is the m × n matrix whose jth column is the vector T($e_j$), where $e_j$ is the jth column of the identity matrix in $\mathbb{R}^n$:
$$A = (T(e_1) \;\cdots\; T(e_n)).$$

Proof. First we express x in terms of $e_1, \ldots, e_n$:
$$x = x_1e_1 + x_2e_2 + \cdots + x_ne_n,$$
and compute, using the definition of a linear transformation,
$$T(x) = T(x_1e_1 + \cdots + x_ne_n) = T(x_1e_1) + \cdots + T(x_ne_n) = x_1T(e_1) + \cdots + x_nT(e_n),$$
and then switch to matrix notation:
$$T(x) = (T(e_1) \;\cdots\; T(e_n))\begin{pmatrix}x_1\\\vdots\\x_n\end{pmatrix} = Ax.$$
The matrix A is called the standard matrix for the linear transformation T.

Definition. A transformation T: $\mathbb{R}^n \to \mathbb{R}^m$ is onto $\mathbb{R}^m$ if each b ∈ $\mathbb{R}^m$ is the image of at least one x ∈ $\mathbb{R}^n$. A transformation T is one-to-one if each b ∈ $\mathbb{R}^m$ is the image of at most one x ∈ $\mathbb{R}^n$.

Theorem: One-to-one transformations: a criterion. A linear transformation T: $\mathbb{R}^n \to \mathbb{R}^m$ is one-to-one if and only if the equation T(x) = 0 has only the trivial solution.

Theorem: "One-to-one" and "onto" in terms of matrices. Let T: $\mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation and let A be the standard matrix for T. Then:
• T maps $\mathbb{R}^n$ onto $\mathbb{R}^m$ if and only if the columns of A span $\mathbb{R}^m$.
• T is one-to-one if and only if the columns of A are linearly independent.

Lectures 7-9: Matrix operations [Lay 2.1]

Labeling of matrix entries. Let A be an m × n matrix:
$$A = (a_1 \;\; a_2 \;\; \cdots \;\; a_n) = \begin{pmatrix} a_{11} & \cdots & a_{1j} & \cdots & a_{1n} \\ \vdots & & \vdots & & \vdots \\ a_{i1} & \cdots & a_{ij} & \cdots & a_{in} \\ \vdots & & \vdots & & \vdots \\ a_{m1} & \cdots & a_{mj} & \cdots & a_{mn} \end{pmatrix}.$$
Notice that in $a_{ij}$ the first subscript i denotes the row number and the second subscript j the column number of the entry $a_{ij}$. In particular, the column $a_j$ is
$$a_j = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{ij} \\ \vdots \\ a_{mj} \end{pmatrix}.$$

Diagonal matrices, zero matrices. The diagonal entries in A are $a_{11}, a_{22}, a_{33}, \ldots$ For example, the diagonal entries of the matrix
$$A = \begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix}$$
are 1, 5, and 9.

A square matrix is a matrix with equal numbers of rows and columns. A diagonal matrix is a square matrix whose non-diagonal entries are zeroes. The matrices
$$\begin{pmatrix}1&0\\0&2\end{pmatrix}, \qquad \begin{pmatrix}\pi&0&0\\0&\sqrt{2}&0\\0&0&3\end{pmatrix}$$
are all diagonal. The identity matrices
$$(1), \qquad \begin{pmatrix}1&0\\0&1\end{pmatrix}, \qquad \begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}, \;\ldots$$
are diagonal.

Zero matrix. By definition, 0 is an m × n matrix whose entries are all zero. For example, the matrices
$$\begin{pmatrix}0&0\end{pmatrix}, \qquad \begin{pmatrix}0&0\\0&0\end{pmatrix}$$
are zero matrices. Notice that zero square matrices, like
$$\begin{pmatrix}0&0\\0&0\end{pmatrix}, \qquad \begin{pmatrix}0&0&0\\0&0&0\\0&0&0\end{pmatrix},$$
are diagonal!

Sums. If
$$A = (a_1 \;\; a_2 \;\; \cdots \;\; a_n) \quad\text{and}\quad B = (b_1 \;\; b_2 \;\; \cdots \;\; b_n)$$
are m × n matrices, then we define the sum A + B as
$$A + B = (a_1 + b_1 \;\;\; a_2 + b_2 \;\;\; \cdots \;\;\; a_n + b_n) = \begin{pmatrix} a_{11}+b_{11} & \cdots & a_{1j}+b_{1j} & \cdots & a_{1n}+b_{1n} \\ \vdots & & \vdots & & \vdots \\ a_{m1}+b_{m1} & \cdots & a_{mj}+b_{mj} & \cdots & a_{mn}+b_{mn} \end{pmatrix}.$$
Scalar multiple. If c is a scalar then we define
$$cA = (ca_1 \;\; ca_2 \;\; \cdots \;\; ca_n) = \begin{pmatrix} ca_{11} & \cdots & ca_{1j} & \cdots & ca_{1n} \\ \vdots & & \vdots & & \vdots \\ ca_{m1} & \cdots & ca_{mj} & \cdots & ca_{mn} \end{pmatrix}.$$

Theorem 2.1.1: Properties of matrix addition. Let A, B, and C be matrices of the same size and let r and s be scalars. Then

1. A + B = B + A
2. (A + B) + C = A + (B + C)
3. A + 0 = A
4. r(A + B) = rA + rB
5. (r + s)A = rA + sA
6. r(sA) = (rs)A.

Lectures 8-9: Matrix multiplication [Lay 2.1]

Composition of linear transformations. Let B be an m × n matrix and A a p × m matrix. They define linear transformations
$$T: \mathbb{R}^n \to \mathbb{R}^m, \quad x \mapsto Bx$$
and
$$S: \mathbb{R}^m \to \mathbb{R}^p, \quad y \mapsto Ay.$$
Their composition (S ∘ T)(x) = S(T(x)) is a linear transformation
$$S \circ T: \mathbb{R}^n \to \mathbb{R}^p.$$
From the previous lecture, we know that linear transformations are given by matrices. What is the matrix of S ∘ T?

Multiplication of matrices. To answer the above question, we need to compute A(Bx) in matrix form. Write x as
$$x = \begin{pmatrix}x_1\\\vdots\\x_n\end{pmatrix}$$
and observe that
$$Bx = x_1b_1 + \cdots + x_nb_n.$$
Hence
$$A(Bx) = A(x_1b_1 + \cdots + x_nb_n) = A(x_1b_1) + \cdots + A(x_nb_n) = x_1Ab_1 + \cdots + x_nAb_n = (Ab_1 \;\; Ab_2 \;\; \cdots \;\; Ab_n)\,x.$$
Therefore multiplication by the matrix
$$C = (Ab_1 \;\; Ab_2 \;\; \cdots \;\; Ab_n)$$
transforms x into A(Bx). Hence C is the matrix of the linear transformation S ∘ T, and it is natural to call C the product of A and B and denote it
$$C = A \cdot B$$
(but the multiplication symbol "·" is frequently skipped).

Definition: Matrix multiplication. If A is a p × m matrix and B is an m × n matrix with columns $b_1, \ldots, b_n$, then the product AB is the p × n matrix whose columns are $Ab_1, \ldots, Ab_n$:
$$AB = A(b_1 \;\cdots\; b_n) = (Ab_1 \;\cdots\; Ab_n).$$

Columns of AB. Each column $Ab_j$ of AB is a linear combination of the columns of A with weights taken from the jth column of B:
$$Ab_j = (a_1 \;\cdots\; a_m)\begin{pmatrix}b_{1j}\\\vdots\\b_{mj}\end{pmatrix} = b_{1j}a_1 + \cdots + b_{mj}a_m.$$

Mnemonic rules:
$$[m \times n \text{ matrix}] \cdot [n \times p \text{ matrix}] = [m \times p \text{ matrix}]$$
$$\mathrm{column}_j(AB) = A \cdot \mathrm{column}_j(B)$$
$$\mathrm{row}_i(AB) = \mathrm{row}_i(A) \cdot B$$

Theorem 2.1.2: Properties of matrix multiplication. Let A be an m × n matrix and let B and C be matrices for which the indicated sums and products are defined. Then the following identities are true:

1. A(BC) = (AB)C
2. A(B + C) = AB + AC
3. (B + C)A = BA + CA
4. r(AB) = (rA)B = A(rB) for any scalar r
5. $I_mA = A = AI_n$.
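The column-by-column and row-by-row mnemonic rules above can be checked directly. A minimal NumPy sketch (not part of the notes; the random matrices are just illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.random((4, 3))   # a p x m matrix
    B = rng.random((3, 5))   # an m x n matrix

    # Column j of AB is A times column j of B:
    AB_cols = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])
    print(np.allclose(AB_cols, A @ B))              # True

    # Row i of AB is row i of A times B:
    print(np.allclose(A[1, :] @ B, (A @ B)[1, :]))  # True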
Lecture 10: The inverse of a matrix [Lay 2.2]

Powers of a matrix. As is usual in algebra, we define, for a square matrix A,
$$A^k = A \cdots A \quad (k \text{ times}).$$
If A ≠ 0 then we set $A^0 = I$.

The transpose of a matrix. The transpose $A^T$ of an m × n matrix A is the n × m matrix whose rows are formed from the corresponding columns of A:
$$\begin{pmatrix}1&2&3\\4&5&6\end{pmatrix}^T = \begin{pmatrix}1&4\\2&5\\3&6\end{pmatrix}.$$

Theorem 2.1.3: Properties of the transpose. Let A and B denote matrices whose sizes are appropriate for the following sums and products. Then we have:

1. $(A^T)^T = A$
2. $(A + B)^T = A^T + B^T$
3. $(rA)^T = r(A^T)$ for any scalar r
4. $(AB)^T = B^TA^T$

Invertible matrices. An n × n matrix A is invertible if there is an n × n matrix C such that
$$CA = I \quad\text{and}\quad AC = I.$$
C is called the inverse of A. The inverse of A, if it exists, is unique (!) and is denoted $A^{-1}$:
$$A^{-1}A = I \quad\text{and}\quad AA^{-1} = I.$$

Singular matrices. A non-invertible matrix is called a singular matrix. An invertible matrix is nonsingular.

Theorem 2.2.4: Inverse of a 2 × 2 matrix. Let
$$A = \begin{pmatrix}a&b\\c&d\end{pmatrix}.$$
If ad − bc ≠ 0 then A is invertible and
$$A^{-1} = \frac{1}{ad-bc}\begin{pmatrix}d&-b\\-c&a\end{pmatrix}.$$
The quantity ad − bc is called the determinant of A:
$$\det A = ad - bc.$$

Theorem 2.2.5: Solving matrix equations. If A is an invertible n × n matrix, then for each b ∈ $\mathbb{R}^n$ the equation Ax = b has the unique solution $x = A^{-1}b$.

Theorem 2.2.6: Properties of invertible matrices.
(a) If A is an invertible matrix, then $A^{-1}$ is also invertible and $(A^{-1})^{-1} = A$.
(b) If A and B are n × n invertible matrices, then so is AB, and $(AB)^{-1} = B^{-1}A^{-1}$.
(c) If A is an invertible matrix, then so is $A^T$, and $(A^T)^{-1} = (A^{-1})^T$.

Definition: Elementary matrices. An elementary matrix is a matrix obtained by performing a single elementary row operation on an identity matrix. If an elementary row operation is performed on an m × n matrix A, the resulting matrix can be written as EA, where the m × m matrix E is made by the same row operation on $I_m$.

Each elementary matrix E is invertible. The inverse of E is the elementary matrix of the same type that transforms E back into I.

Theorem 2.2.7: Invertible matrices. An n × n matrix A is invertible if and only if A is row equivalent to $I_n$, and in this case, any sequence of elementary row operations that reduces A to $I_n$ also transforms $I_n$ into $A^{-1}$.

Computation of inverses.
• Form the augmented matrix (A  I) and row reduce it.
• If A is row equivalent to I, then (A  I) is row equivalent to (I  $A^{-1}$).
• Otherwise A has no inverse.

Lectures 11-12: Characterizations of invertible matrices. Partitioned matrices [Lay 2.4]

The Invertible Matrix Theorem 2.3.8: For an n × n matrix A, the following are equivalent:
(a) A is invertible.
(b) A is row equivalent to $I_n$.
(c) A has n pivot positions.
(d) Ax = 0 has only the trivial solution.
(e) The columns of A are linearly independent.
(f) The linear transformation x ↦ Ax is one-to-one.
(g) Ax = b has at least one solution for each b ∈ $\mathbb{R}^n$.
(h) The columns of A span $\mathbb{R}^n$.
(i) x ↦ Ax maps $\mathbb{R}^n$ onto $\mathbb{R}^n$.
(j) There is an n × n matrix C such that CA = I.
(k) There is an n × n matrix D such that AD = I.
(l) $A^T$ is invertible.

One-sided inverse is the inverse. Let A and B be square matrices. If AB = I, then both A and B are invertible, with $B = A^{-1}$ and $A = B^{-1}$.

Theorem 2.3.9: Invertible linear transformations. Let T: $\mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation and A its standard matrix. Then T is invertible if and only if A is an invertible matrix. In that case, the linear transformation
$$S(x) = A^{-1}x$$
is the only transformation satisfying
$$S(T(x)) = x \text{ for all } x \in \mathbb{R}^n \quad\text{and}\quad T(S(x)) = x \text{ for all } x \in \mathbb{R}^n.$$

Partitioned matrices [Lay 2.4]

Example: Partitioned matrix.
$$A = \begin{pmatrix}1&2&a\\3&4&b\\p&q&z\end{pmatrix} = \begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{pmatrix}$$
A is a 3 × 3 matrix which can be viewed as a 2 × 2 partitioned (or block) matrix with blocks
$$A_{11} = \begin{pmatrix}1&2\\3&4\end{pmatrix}, \quad A_{12} = \begin{pmatrix}a\\b\end{pmatrix}, \quad A_{21} = (p\;\;q), \quad A_{22} = (z).$$

Addition of partitioned matrices. If matrices A and B are of the same size and partitioned the same way, they can be added block-by-block. Similarly, partitioned matrices can be multiplied by a scalar blockwise.

Multiplication of partitioned matrices. If the column partition of A matches the row partition of B, then AB can be computed by the usual row-column rule, with blocks treated as matrix entries.
Example. Let
$$A = \begin{pmatrix}1&2&a\\3&4&b\\p&q&z\end{pmatrix} = \begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{pmatrix}, \qquad B = \begin{pmatrix}\alpha&\beta\\\gamma&\delta\\\Gamma&\Delta\end{pmatrix} = \begin{pmatrix}B_1\\B_2\end{pmatrix},$$
where
$$B_1 = \begin{pmatrix}\alpha&\beta\\\gamma&\delta\end{pmatrix}, \qquad B_2 = (\Gamma\;\;\Delta).$$
Then
$$AB = \begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{pmatrix}\begin{pmatrix}B_1\\B_2\end{pmatrix} = \begin{pmatrix}A_{11}B_1 + A_{12}B_2\\A_{21}B_1 + A_{22}B_2\end{pmatrix} = \begin{pmatrix}\begin{pmatrix}1&2\\3&4\end{pmatrix}\begin{pmatrix}\alpha&\beta\\\gamma&\delta\end{pmatrix} + \begin{pmatrix}a\\b\end{pmatrix}(\Gamma\;\;\Delta)\\[2mm] (p\;\;q)\begin{pmatrix}\alpha&\beta\\\gamma&\delta\end{pmatrix} + z\,(\Gamma\;\;\Delta)\end{pmatrix}.$$

Theorem 2.4.10: Column-Row Expansion of AB. If A is m × n and B is n × p, then
$$AB = (\mathrm{col}_1(A)\;\;\mathrm{col}_2(A)\;\cdots\;\mathrm{col}_n(A))\begin{pmatrix}\mathrm{row}_1(B)\\\mathrm{row}_2(B)\\\vdots\\\mathrm{row}_n(B)\end{pmatrix} = \mathrm{col}_1(A)\,\mathrm{row}_1(B) + \cdots + \mathrm{col}_n(A)\,\mathrm{row}_n(B).$$

Lecture 12: Subspaces of $\mathbb{R}^n$ [Lay 2.8]

Subspace. A subspace of $\mathbb{R}^n$ is any set H that has the properties
(a) 0 ∈ H;
(b) for all vectors u, v ∈ H, the sum u + v ∈ H;
(c) for each vector u ∈ H and each scalar c, cu ∈ H.

$\mathbb{R}^n$ is a subspace of itself. {0} is a subspace, called the zero subspace.

Span. Span{$v_1, \ldots, v_p$} is the subspace spanned (or generated) by $v_1, \ldots, v_p$.

The column space of a matrix A is the set Col A of all linear combinations of the columns of A.

The null space of a matrix A is the set Nul A of all solutions to the homogeneous system Ax = 0.

Theorem 2.8.12: The null space. The null space of an m × n matrix A is a subspace of $\mathbb{R}^n$. Equivalently, the set of all solutions to a system Ax = 0 of m homogeneous linear equations in n unknowns is a subspace of $\mathbb{R}^n$.

Basis of a subspace. A basis of a subspace H of $\mathbb{R}^n$ is a linearly independent set in H that spans H.

Theorem 2.8.13: The pivot columns of a matrix A form a basis of Col A.

Dimension and rank [Lay 2.9]

Theorem: Given a basis $b_1, \ldots, b_p$ of a subspace H of $\mathbb{R}^n$, every vector u ∈ H is uniquely expressed as a linear combination of $b_1, \ldots, b_p$.

Coordinate system. Let B = {$b_1, \ldots, b_p$} be a basis for a subspace H. For each x ∈ H, the coordinates of x relative to the basis B are the weights $c_1, \ldots, c_p$ such that
$$x = c_1b_1 + \cdots + c_pb_p.$$
The vector
$$[x]_B = \begin{pmatrix}c_1\\\vdots\\c_p\end{pmatrix}$$
is the coordinate vector of x.

Solving homogeneous systems. Finding a parametric solution of a system Ax = 0 of homogeneous linear equations means finding a basis of Nul A.

Dimension. The dimension of a nonzero subspace H, denoted by dim H, is the number of vectors in any basis of H. The dimension of the zero subspace {0} is defined to be 0.

Rank of a matrix. The rank of a matrix A, denoted by rank A, is the dimension of the column space Col A of A:
$$\operatorname{rank} A = \dim \operatorname{Col} A.$$

The Rank Theorem 2.9.14: If a matrix A has n columns, then
$$\operatorname{rank} A + \dim \operatorname{Nul} A = n.$$

The Basis Theorem 2.9.15: Let H be a p-dimensional subspace of $\mathbb{R}^n$.
• Any linearly independent set of exactly p elements in H is automatically a basis for H.
• Also, any set of p elements of H that spans H is automatically a basis of H.

For example, this means that the vectors
$$\begin{pmatrix}1\\0\\0\end{pmatrix}, \quad \begin{pmatrix}2\\3\\0\end{pmatrix}, \quad \begin{pmatrix}4\\5\\6\end{pmatrix}$$
form a basis of $\mathbb{R}^3$: there are three of them and they are linearly independent (the latter should be obvious to students at this stage).

The Invertible Matrix Theorem (continued). Let A be an n × n matrix. Then the following statements are each equivalent to the statement that A is an invertible matrix:
(m) The columns of A form a basis of $\mathbb{R}^n$.
(n) Col A = $\mathbb{R}^n$.
(o) dim Col A = n.
(p) rank A = n.
(q) Nul A = {0}.
(r) dim Nul A = 0.
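The Rank Theorem can be observed numerically. Below is a small sketch (not from the notes) using NumPy together with SciPy's null_space helper; the matrix is an illustrative rank-1 example:

    import numpy as np
    from scipy.linalg import null_space   # returns an orthonormal basis of Nul A

    A = np.array([[1., 2., -1.],
                  [2., 4., -2.]])         # the rows are proportional, so rank A = 1

    rank = np.linalg.matrix_rank(A)
    N = null_space(A)                      # columns of N form a basis of Nul A
    print(rank, N.shape[1])                # 1 2
    print(rank + N.shape[1] == A.shape[1]) # True: rank A + dim Nul A = n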
Lecture 14: Introduction to determinants [Lay 3.1]

The determinant det A of a square matrix A is a certain number assigned to the matrix; it is defined recursively, that is, we first define determinants of matrices of sizes 1 × 1, 2 × 2, and 3 × 3, and then supply a formula which expresses determinants of n × n matrices in terms of determinants of (n − 1) × (n − 1) matrices.

The determinant of a 1 × 1 matrix A = ($a_{11}$) is defined simply as being equal to its only entry $a_{11}$:
$$\det(a_{11}) = a_{11}.$$

The determinant of a 2 × 2 matrix is defined by the formula
$$\det\begin{pmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{pmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$

The determinant of a 3 × 3 matrix
$$A = \begin{pmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{pmatrix}$$
is defined by the formula
$$\det A = a_{11}\cdot\det\begin{pmatrix}a_{22}&a_{23}\\a_{32}&a_{33}\end{pmatrix} - a_{12}\cdot\det\begin{pmatrix}a_{21}&a_{23}\\a_{31}&a_{33}\end{pmatrix} + a_{13}\cdot\det\begin{pmatrix}a_{21}&a_{22}\\a_{31}&a_{32}\end{pmatrix}.$$

Example. The determinant det A of the matrix
$$A = \begin{pmatrix}2&0&7\\0&1&0\\1&0&4\end{pmatrix}$$
equals
$$2\cdot\det\begin{pmatrix}1&0\\0&4\end{pmatrix} + 7\cdot\det\begin{pmatrix}0&1\\1&0\end{pmatrix},$$
which further simplifies as
$$2\cdot4 + 7\cdot(-1) = 8 - 7 = 1.$$

Submatrices. By definition, the submatrix $A_{ij}$ is obtained from the matrix A by crossing out row i and column j. For example, if
$$A = \begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix},$$
then
$$A_{22} = \begin{pmatrix}1&3\\7&9\end{pmatrix}, \qquad A_{31} = \begin{pmatrix}2&3\\5&6\end{pmatrix}.$$

Recursive definition of determinant. For n > 3, the determinant of an n × n matrix A is defined as the expression
$$\det A = a_{11}\det A_{11} - a_{12}\det A_{12} + \cdots + (-1)^{1+n}a_{1n}\det A_{1n} = \sum_{j=1}^{n}(-1)^{1+j}a_{1j}\det A_{1j},$$
which involves the determinants of the smaller (n−1) × (n−1) submatrices $A_{1j}$; these, in their turn, can be evaluated by a similar formula which reduces the calculations to the (n−2) × (n−2) case, and so on, all the way down to determinants of size 2 × 2.

Cofactors. The (i, j)-cofactor of A is
$$C_{ij} = (-1)^{i+j}\det A_{ij}.$$
Then
$$\det A = a_{11}C_{11} + a_{12}C_{12} + \cdots + a_{1n}C_{1n}$$
is the cofactor expansion across the first row of A.

Theorem 3.1.1: Cofactor expansion. For any row i, the result of the cofactor expansion across row i is the same:
$$\det A = a_{i1}C_{i1} + a_{i2}C_{i2} + \cdots + a_{in}C_{in}.$$

The chessboard pattern of signs in cofactor expansions. The signs $(-1)^{i+j}$ which appear in the formula for cofactors form the easy-to-recognise and easy-to-remember "chessboard" pattern:
$$\begin{pmatrix} + & - & + & \cdots \\ - & + & - & \\ + & - & + & \\ \vdots & & & \ddots \end{pmatrix}$$

Theorem 3.1.2: The determinant of a triangular matrix. If A is a triangular n × n matrix then det A is the product of the diagonal entries of A.

Proof: This proof contains more details than the one given in the textbook. It is based on induction on n which, to avoid the use of "general" notation, is illustrated by a simple example. Let
$$A = \begin{pmatrix} a_{11}&0&0&0&0\\ a_{21}&a_{22}&0&0&0\\ a_{31}&a_{32}&a_{33}&0&0\\ a_{41}&a_{42}&a_{43}&a_{44}&0\\ a_{51}&a_{52}&a_{53}&a_{54}&a_{55} \end{pmatrix}.$$
All entries in the first row of A, with the possible exception of $a_{11}$, are zeroes: $a_{12} = \cdots = a_{15} = 0$. Therefore in the cofactor expansion across the first row all summands, with the possible exception of the first one, $a_{11}\det A_{11}$, are zeroes, and therefore
$$\det A = a_{11}\det A_{11} = a_{11}\det\begin{pmatrix} a_{22}&0&0&0\\ a_{32}&a_{33}&0&0\\ a_{42}&a_{43}&a_{44}&0\\ a_{52}&a_{53}&a_{54}&a_{55} \end{pmatrix}.$$
But the smaller matrix $A_{11}$ is also lower triangular, and therefore we can conclude by induction that
$$\det A_{11} = a_{22}\cdots a_{55},$$
hence
$$\det A = a_{11}\cdot(a_{22}\cdots a_{55}) = a_{11}a_{22}\cdots a_{55}$$
is the product of the diagonal entries of A.

The basis of induction is the case n = 2; of course, in that case
$$\det\begin{pmatrix}a_{11}&0\\a_{21}&a_{22}\end{pmatrix} = a_{11}a_{22} - 0\cdot a_{21} = a_{11}a_{22}$$
is the product of the diagonal entries of A.
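The recursive definition translates directly into a short program. A minimal sketch (not from the notes) implementing cofactor expansion across the first row; it is exponentially slow for large n and is only meant to illustrate the definition:

    import numpy as np

    def det_cofactor(A):
        """Determinant by cofactor expansion across the first row."""
        n = A.shape[0]
        if n == 1:
            return A[0, 0]
        total = 0.0
        for j in range(n):
            # Submatrix A_{1j}: delete row 1 and column j+1 (0-indexed: row 0, col j).
            sub = np.delete(np.delete(A, 0, axis=0), j, axis=1)
            total += (-1) ** j * A[0, j] * det_cofactor(sub)  # (-1)**j is (-1)^{1+j}
        return total

    A = np.array([[2., 0., 7.],
                  [0., 1., 0.],
                  [1., 0., 4.]])
    print(det_cofactor(A))    # 1.0, matching the worked example above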
The basis of induction is the case n = 2; of course, in that case a11 0 det = a11 a22 − 0 · a21 a21 a22 = a11 a22 25 MATH10212 • Linear Algebra B • Lecture 14 • Introduction to determinants is the product of the diagonal entries of A. Corollary. The determinant of a diagonal matrix equals is the product of its diagonal elements: d1 0 · · · 0 0 d2 0 det .. .. = d1 d2 · · · dn . . . . . . 0 · · · dn Corollary. The determinant of the identity matrix equals 1: det In = 1. 26 MATH10212 • Linear Algebra B • Lectures 15–16 • Properties of determinants Lectures 15–16: Properties of determinants [Lay 3.2] Theorem 3.2.3: Row Operations. Let A be a square matrix. Similarly det A (a) If a multiple of one row of A is added to another row to produce a matrix B, then det B = det A. (b) It two rows of A are swapped to produce B, then det B = − det A. (c) If one row of A is multiplied by k to produce B, then det B = k det A. Theorem 3.2.5: The determinant of the transposed matrix. If A is an n × n matrix then det AT = det A. Example. Let 1 2 0 A = 0 1 0 , 0 3 4 then 1 0 0 AT = 2 1 3 0 0 4 T 1 3 = 1 · det +0+0 0 4 = 1 · 4 = 4; we got the same value. Corollary: Cofactor expansion across a column. For any column j, det A = a1j C1j + a2j C2j + · · · + anj Cnj Example. We have already computed the determinant of the matrix 2 0 7 A = det 0 1 0 1 0 4 by expansion across the first row. Now we do it by expansion across the third column: 0 1 1+3 det A = 7 · (−1) · det 1 0 2 0 2+3 +0 · (−1) · det 1 0 2 0 3+3 +4 · (−1) · det 0 1 = −7 − 0 + 8 = 1. and 1 0 det A = 1 · det 3 4 0 0 −2 · det 0 4 0 1 +0 · det 0 3 = 1·4−2·0+0·0 = 4. Proof: Columns of A are rows of AT which has the same determinant as A. Corollary: Column operations. Let A be a square matrix. (a) If a multiple of one column of A is added to another column to produce a matrix B, then det B = det A. 27 MATH10212 • Linear Algebra B • Lectures 15–16 • Properties of determinants (b) It two columns of A are swapped to produce B, then • If a row or a column has a convenient scalar factor, take it out of the determinant. det B = − det A. (c) If one column of A is multiplied by k to produce B, then det B = k det A. Proof: Column operations on A are row operation of AT which has the same determinant as A. Computing determinants. In computations by hand, the quickest method of computing determinants is to work with both columns and rows: • If convenient, swap two rows or two columns— but do not forget to change the sign of the determinant. • Adding to a row (column) scalar multiples of other rows (columns), simplify the matrix to create a row (column) with just one nonzero entry. • Expand across this row (column). • Repeat until you get the value of the determinant. Example. 1 0 3 det 1 3 1 1 1 −1 C3 −3C1 = 2 out of C3 = 1 0 0 det 1 3 −2 1 1 −4 1 0 0 2 · det 1 3 −1 1 1 −2 −1 out of C3 = expand = R1 −3R2 = R1 ↔R2 = = = Example. Compute the determinant 1 1 1 1 0 1 1 1 0 1 1 1 0 1 1 ∆ = det 1 0 0 −2 · det 1 3 1 1 1 2 3 1 −2 · 1 · (−1) · det 1 2 0 −5 −2 · det 1 2 1 2 −2 · (−1) · det 0 −5 1+1 2 · (−5) −10. Solution: Subtracting Row 1 from Rows 2, 3, and 4, we rearrange the de- 28 MATH10212 • Linear Algebra B • Lectures 15–16 • Properties of determinants terminant as the first column we have ∆ = 1 1 1 1 0 0 0 0 −1 1 ∆ = det 0 0 −1 0 1 0 −1 0 0 1 0 1 1 1 1 R3 +R2 After expanding the determinant across the first column, we get = = 2+1 −(−1) · (−1) −1 1 · det 1 3 = = −1 1 − det 1 3 −1 1 − det 0 4 −(−4) 4. 
As an exercise, I leave you with this question: wouldn't you agree that the determinant of the similarly constructed matrix of size n × n should be ±(n − 1)? But how can one determine the correct sign?

And one more exercise: you should now be able to instantly compute
$$\det\begin{pmatrix}0&2&3&4&5\\1&0&3&4&5\\1&2&0&4&5\\1&2&3&0&5\\1&2&3&4&0\end{pmatrix}.$$

Theorem 3.2.4: Invertible matrices. A square matrix A is invertible if and only if
$$\det A \neq 0.$$
Recall that this theorem is already known to us in the case of 2 × 2 matrices A.

Example. The matrix
$$A = \begin{pmatrix}2&0&7\\0&1&0\\1&0&4\end{pmatrix}$$
has the determinant
$$\det A = 2\cdot\det\begin{pmatrix}1&0\\0&4\end{pmatrix} + 7\cdot\det\begin{pmatrix}0&1\\1&0\end{pmatrix} = 2\cdot4 + 7\cdot(-1) = 8 - 7 = 1,$$
hence A is invertible. Indeed,
$$A^{-1} = \begin{pmatrix}4&0&-7\\0&1&0\\-1&0&2\end{pmatrix}$$
(check!).

Theorem 3.2.6: Multiplicativity Property. If A and B are n × n matrices then
$$\det(AB) = (\det A)\cdot(\det B).$$

Example. Take
$$A = \begin{pmatrix}1&0\\3&2\end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix}1&5\\0&1\end{pmatrix};$$
then det A = 2 and det B = 1. But
$$AB = \begin{pmatrix}1&0\\3&2\end{pmatrix}\begin{pmatrix}1&5\\0&1\end{pmatrix} = \begin{pmatrix}1&5\\3&17\end{pmatrix}$$
and det(AB) = 1 · 17 − 5 · 3 = 2, which is exactly the value of (det A) · (det B).

Corollary. If A is invertible, then
$$\det A^{-1} = (\det A)^{-1}.$$
Proof. Set B = $A^{-1}$; then AB = I, and Theorem 3.2.6 yields
$$\det A \cdot \det B = \det I = 1.$$
Hence det B = (det A)$^{-1}$.

Cramer's Rule [Lay 3.3]

Cramer's Rule is an explicit (closed) formula for solving systems of n linear equations with n unknowns and a nonsingular (invertible) matrix of coefficients. It has important theoretical value, but is unsuitable for practical application.

For any n × n matrix A and any b ∈ $\mathbb{R}^n$, denote by $A_i(b)$ the matrix obtained from A by replacing column i by the vector b:
$$A_i(b) = (a_1 \;\cdots\; a_{i-1} \;\; b \;\; a_{i+1} \;\cdots\; a_n).$$

Theorem 3.3.7: Cramer's Rule. Let A be an invertible n × n matrix. For any b ∈ $\mathbb{R}^n$, the unique solution x of the linear system Ax = b is given by
$$x_i = \frac{\det A_i(b)}{\det A}, \qquad i = 1, 2, \ldots, n.$$

For example, for the system
$$x_1 + x_2 = 3$$
$$x_1 - x_2 = 1$$
Cramer's rule gives the answer
$$x_1 = \frac{\det\begin{pmatrix}3&1\\1&-1\end{pmatrix}}{\det\begin{pmatrix}1&1\\1&-1\end{pmatrix}} = \frac{-4}{-2} = 2, \qquad x_2 = \frac{\det\begin{pmatrix}1&3\\1&1\end{pmatrix}}{\det\begin{pmatrix}1&1\\1&-1\end{pmatrix}} = \frac{-2}{-2} = 1.$$

Theorem 3.3.8: An Inverse Formula. Let A be an invertible n × n matrix. Then
$$A^{-1} = \frac{1}{\det A}\begin{pmatrix}C_{11}&C_{21}&\cdots&C_{n1}\\C_{12}&C_{22}&\cdots&C_{n2}\\\vdots&\vdots&&\vdots\\C_{1n}&C_{2n}&\cdots&C_{nn}\end{pmatrix},$$
where the cofactors $C_{ji}$ are given by
$$C_{ji} = (-1)^{j+i}\det A_{ji}$$
and $A_{ji}$ is the matrix obtained from A by deleting row j and column i.

Notice that for a 2 × 2 matrix
$$A = \begin{pmatrix}a&b\\c&d\end{pmatrix}$$
the cofactors are
$$C_{11} = (-1)^{1+1}\cdot d = d, \quad C_{21} = (-1)^{2+1}\cdot b = -b, \quad C_{12} = (-1)^{1+2}\cdot c = -c, \quad C_{22} = (-1)^{2+2}\cdot a = a,$$
and we get the familiar formula
$$A^{-1} = \frac{1}{\det A}\begin{pmatrix}d&-b\\-c&a\end{pmatrix} = \frac{1}{ad-bc}\begin{pmatrix}d&-b\\-c&a\end{pmatrix}.$$

Lecture 17: Eigenvalues and eigenvectors

Quiz. What is the value of this determinant?
$$\Delta = \det\begin{pmatrix}1&2&3\\0&2&3\\1&2&6\end{pmatrix}$$

Quiz. Is Γ positive, negative, or zero?
$$\Gamma = \det\begin{pmatrix}0&0&71\\0&93&0\\87&0&0\end{pmatrix}$$

Eigenvectors. An eigenvector of an n × n matrix A is a nonzero vector x such that Ax = λx for some scalar λ.

Eigenvalues. A scalar λ is called an eigenvalue of A if there is a non-trivial solution x of Ax = λx; x is called an eigenvector corresponding to λ.
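The defining property Ax = λx is easy to check numerically. A minimal sketch (not from the notes; the symmetric 2 × 2 matrix is an illustrative choice with eigenvalues 3 and 1):

    import numpy as np

    A = np.array([[2., 1.],
                  [1., 2.]])
    eigvals, eigvecs = np.linalg.eig(A)

    for k in range(len(eigvals)):
        lam, v = eigvals[k], eigvecs[:, k]
        print(lam, np.allclose(A @ v, lam * v))   # each column satisfies Av = lambda*v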
Eigenvalue. A scalar λ is an eigenvalue of A if and only if the equation
$$(A - \lambda I)x = 0$$
has a non-trivial solution.

Eigenspace. The set of all solutions of
$$(A - \lambda I)x = 0$$
is a subspace of $\mathbb{R}^n$, called the eigenspace of A corresponding to the eigenvalue λ.

Theorem: Linear independence of eigenvectors. If $v_1, \ldots, v_p$ are eigenvectors that correspond to distinct eigenvalues $\lambda_1, \ldots, \lambda_p$ of an n × n matrix A, then the set {$v_1, \ldots, v_p$} is linearly independent.

Lecture 17: The characteristic equation

Characteristic polynomial. The polynomial det(A − λI) in the variable λ is the characteristic polynomial of A. For example, in the 2 × 2 case,
$$\det\begin{pmatrix}a-\lambda&b\\c&d-\lambda\end{pmatrix} = \lambda^2 - (a+d)\lambda + (ad-bc).$$
The equation
$$\det(A - \lambda I) = 0$$
is the characteristic equation for A.

Characterisation of eigenvalues. Theorem. A scalar λ is an eigenvalue of an n × n matrix A if and only if λ satisfies the characteristic equation
$$\det(A - \lambda I) = 0.$$

Zero as an eigenvalue. λ = 0 is an eigenvalue of A if and only if det A = 0.

The Invertible Matrix Theorem (continued). An n × n matrix A is invertible iff:
(s) The number 0 is not an eigenvalue of A.
(t) The determinant of A is not zero.

Theorem: Eigenvalues of a triangular matrix. The eigenvalues of a triangular matrix are the entries on its main diagonal.

Proof. This immediately follows from the fact that the determinant of a triangular matrix is the product of its diagonal elements. Therefore, for example, for the 3 × 3 triangular matrix
$$A = \begin{pmatrix}d_1&a&b\\0&d_2&c\\0&0&d_3\end{pmatrix}$$
its characteristic polynomial det(A − λI) equals
$$\det\begin{pmatrix}d_1-\lambda&a&b\\0&d_2-\lambda&c\\0&0&d_3-\lambda\end{pmatrix} = (d_1-\lambda)(d_2-\lambda)(d_3-\lambda)$$
and therefore has roots (the eigenvalues of A) λ = $d_1, d_2, d_3$.

Lectures 18-21: Diagonalisation

Similar matrices. A is similar to B if there exists an invertible matrix P such that
$$A = PBP^{-1}.$$
Similarity is an equivalence relation.

Conjugation. The operation
$$A^P = P^{-1}AP$$
is called conjugation of A by P.

Properties of conjugation:
• $I^P = I$
• $(AB)^P = A^PB^P$; as a corollary:
• $(A + B)^P = A^P + B^P$
• $(c\cdot A)^P = c\cdot A^P$
• $(A^k)^P = (A^P)^k$
• $(A^{-1})^P = (A^P)^{-1}$
• $(A^P)^Q = A^{PQ}$

Theorem: Similar matrices have the same characteristic polynomial. If matrices A and B are similar, then they have the same characteristic polynomials and eigenvalues.

Diagonalisable matrices. A square matrix A is diagonalisable if it is similar to a diagonal matrix, that is,
$$A = PDP^{-1}$$
for some diagonal matrix D and some invertible matrix P. Of course, the diagonal entries of D are eigenvalues of A.

The Diagonalisation Theorem. An n × n matrix A is diagonalisable iff A has n linearly independent eigenvectors. In fact, A = $PDP^{-1}$ with D diagonal iff the columns of P are n linearly independent eigenvectors of A.

Theorem: Matrices with all eigenvalues distinct. An n × n matrix with n distinct eigenvalues is diagonalisable.

Example: Non-diagonalisable matrices.
$$A = \begin{pmatrix}2&1\\0&2\end{pmatrix}$$
has repeated eigenvalues $\lambda_1 = \lambda_2 = 2$. A is not diagonalisable. Why?
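The Diagonalisation Theorem can be illustrated numerically: for a matrix with distinct eigenvalues, the eigenvector matrix P returned by NumPy is invertible and reassembles A. A minimal sketch (not from the notes; the 2 × 2 matrix is illustrative, with eigenvalues 5 and 2):

    import numpy as np

    A = np.array([[4., 1.],
                  [2., 3.]])              # two distinct eigenvalues, hence diagonalisable

    lam, P = np.linalg.eig(A)             # columns of P are eigenvectors of A
    D = np.diag(lam)
    print(np.allclose(A, P @ D @ np.linalg.inv(P)))   # True: A = P D P^{-1}

Try the same with the non-diagonalisable matrix from the example above: its eigenvector matrix is (numerically) singular.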
Lectures 22 and 24: Vector spaces

A vector space is a non-empty set of elements (vectors), with operations of addition and multiplication by a scalar. By definition, the following holds for all u, v, w in a vector space V and for all scalars c and d:

1. The sum of u and v is in V.
2. u + v = v + u.
3. (u + v) + w = u + (v + w).
4. There is a zero vector 0 such that u + 0 = u.
5. For each u there exists −u such that u + (−u) = 0.
6. The scalar multiple cu is in V.
7. c(u + v) = cu + cv.
8. (c + d)u = cu + du.
9. c(du) = (cd)u.
10. 1u = u.

Definitions

Most concepts introduced in the previous lectures for the vector spaces $\mathbb{R}^n$ can be transferred to arbitrary vector spaces. Let V be a vector space.

Given vectors $v_1, v_2, \ldots, v_p$ in V and scalars $c_1, c_2, \ldots, c_p$, the vector
$$y = c_1v_1 + \cdots + c_pv_p$$
is called a linear combination of $v_1, v_2, \ldots, v_p$ with weights $c_1, c_2, \ldots, c_p$.

If $v_1, \ldots, v_p$ are in V, then the set of all linear combinations of $v_1, \ldots, v_p$ is denoted by Span{$v_1, \ldots, v_p$} and is called the subset of V spanned (or generated) by $v_1, \ldots, v_p$. That is, Span{$v_1, \ldots, v_p$} is the collection of all vectors which can be written in the form
$$c_1v_1 + c_2v_2 + \cdots + c_pv_p$$
with $c_1, \ldots, c_p$ scalars.

Examples. Of course, the most important example of vector spaces is provided by the standard spaces $\mathbb{R}^n$ of column vectors with n components, with the usual operations of component-wise addition and multiplication by a scalar.

Another example is the set $P_n$ of polynomials of degree at most n in the variable x,
$$P_n = \{a_0 + a_1x + \cdots + a_nx^n\},$$
with the usual operations of addition of polynomials and multiplication by a constant.

And here is another example: the set C[0, 1] of real-valued continuous functions on the segment [0, 1], with the usual operations of addition of functions and multiplication by a constant.

Span{$v_1, \ldots, v_p$} is a vector subspace of V, that is, it is closed with respect to addition of vectors and multiplication of vectors by scalars. V is a subspace of itself. {0} is a subspace, called the zero subspace.

An indexed set of vectors {$v_1, \ldots, v_p$} in V is linearly independent if the vector equation
$$x_1v_1 + \cdots + x_pv_p = 0$$
has only the trivial solution.

The set {$v_1, \ldots, v_p$} in V is linearly dependent if there exist weights $c_1, \ldots, c_p$, not all zero, such that
$$c_1v_1 + \cdots + c_pv_p = 0.$$

Theorem: Characterisation of linearly dependent sets. An indexed set S = {$v_1, \ldots, v_p$} of two or more vectors is linearly dependent if and only if at least one of the vectors in S is a linear combination of the others.

A basis of V is a linearly independent set which spans V.

A vector space V is called finite dimensional if it has a finite basis. Any two bases in a finite dimensional vector space have the same number of vectors; this number is called the dimension of V and is denoted dim V.

If B = ($b_1, \ldots, b_n$) is a basis of V, every vector x ∈ V can be written, in a unique way, as a linear combination
$$x = x_1b_1 + \cdots + x_nb_n;$$
the scalars $x_1, \ldots, x_n$ are called the coordinates of x with respect to the basis B.

A transformation T from a vector space V to a vector space W is a rule that assigns to each vector x in V a vector T(x) in W. The set V is called the domain of T, the set W is the codomain of T.

A transformation T: V → W is linear if:
• T(u + v) = T(u) + T(v) for all vectors u, v ∈ V;
• T(cu) = cT(u) for all vectors u and all scalars c.

Properties of linear transformations. If T is a linear transformation, then T(0) = 0 and
$$T(cu + dv) = cT(u) + dT(v).$$

The identity transformation of V is
$$V \to V, \qquad x \mapsto x.$$
The zero transformation of V is
$$V \to V, \qquad x \mapsto 0.$$

If T: V → W is a linear transformation, its kernel
$$\operatorname{Ker} T = \{x \in V : T(x) = 0\}$$
is a subspace of V, while its image
$$\operatorname{Im} T = \{y \in W : T(x) = y \text{ for some } x \in V\}$$
is a subspace of W.
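Coordinates turn questions about abstract vector spaces into questions about $\mathbb{R}^n$. For instance, a polynomial in $P_2$ can be represented by its coordinate vector relative to the basis {1, x, x²}, and linear dependence can then be tested by the rank of a matrix. A minimal sketch (not from the notes; the three polynomials are illustrative):

    import numpy as np

    # p(x) = a0 + a1*x + a2*x^2 in P_2, as the coordinate vector (a0, a1, a2).
    p1 = np.array([1., 0., 1.])    # 1 + x^2
    p2 = np.array([0., 1., 1.])    # x + x^2
    p3 = np.array([1., 1., 2.])    # 1 + x + 2x^2, which equals p1 + p2

    M = np.column_stack([p1, p2, p3])
    print(np.linalg.matrix_rank(M))   # 2 < 3, so {p1, p2, p3} is linearly dependent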
Lectures 25-27: Symmetric matrices and inner product

Symmetric matrices. A matrix A is symmetric if $A^T = A$. An example:
$$A = \begin{pmatrix}1&2&3\\2&3&4\\3&4&5\end{pmatrix}.$$
Notice that symmetric matrices are necessarily square.

Inner (or dot) product. For vectors u, v ∈ $\mathbb{R}^n$ their inner product (also called scalar product or dot product) u · v is defined as
$$u\cdot v = u^Tv = (u_1\;\;u_2\;\cdots\;u_n)\begin{pmatrix}v_1\\v_2\\\vdots\\v_n\end{pmatrix} = u_1v_1 + u_2v_2 + \cdots + u_nv_n.$$

Properties of the dot product. Theorem 6.1.1. Let u, v, w ∈ $\mathbb{R}^n$, and let c be a scalar. Then
(a) u · v = v · u
(b) (u + v) · w = u · w + v · w
(c) (cu) · v = c(u · v)
(d) u · u ≥ 0, and u · u = 0 if and only if u = 0.

Properties (b) and (c) can be combined as
$$(c_1u_1 + \cdots + c_pu_p)\cdot w = c_1(u_1\cdot w) + \cdots + c_p(u_p\cdot w).$$

The length of a vector v is defined as
$$\|v\| = \sqrt{v\cdot v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}.$$
In particular, $\|v\|^2 = v\cdot v$. We call v a unit vector if $\|v\|$ = 1.

Orthogonal vectors. Vectors u and v are orthogonal to each other if
$$u\cdot v = 0.$$

Theorem. Vectors u and v are orthogonal if and only if
$$\|u\|^2 + \|v\|^2 = \|u+v\|^2.$$
That is, vectors u and v are orthogonal if and only if the Pythagoras Theorem holds for the triangle formed by u and v as two sides (so that its third side is u + v).

Proof is a straightforward computation:
$$\|u+v\|^2 = (u+v)\cdot(u+v) = u\cdot u + u\cdot v + v\cdot u + v\cdot v = u\cdot u + 2u\cdot v + v\cdot v = \|u\|^2 + 2u\cdot v + \|v\|^2.$$
Hence
$$\|u+v\|^2 = \|u\|^2 + \|v\|^2 + 2u\cdot v,$$
and $\|u+v\|^2 = \|u\|^2 + \|v\|^2$ iff u · v = 0.

Orthogonal sets. A set of vectors {$u_1, \ldots, u_p$} in $\mathbb{R}^n$ is orthogonal if $u_i\cdot u_j = 0$ whenever i ≠ j.

Theorem 6.2.4. If S = {$u_1, \ldots, u_p$} is an orthogonal set of non-zero vectors in $\mathbb{R}^n$, then S is linearly independent.

Proof. Assume the contrary, that S is linearly dependent. This means that there exist scalars $c_1, \ldots, c_p$, not all of them zeroes, such that
$$c_1u_1 + \cdots + c_pu_p = 0.$$
Forming the dot product of the left hand side and the right hand side of this equation with $u_i$, we get
$$(c_1u_1 + \cdots + c_pu_p)\cdot u_i = 0\cdot u_i = 0.$$
After opening the brackets we have
$$c_1u_1\cdot u_i + \cdots + c_{i-1}u_{i-1}\cdot u_i + c_iu_i\cdot u_i + c_{i+1}u_{i+1}\cdot u_i + \cdots + c_pu_p\cdot u_i = 0.$$
In view of the orthogonality of the set S, all dot products on the left hand side, with the exception of $u_i\cdot u_i$, equal 0. Hence what remains of the equality is
$$c_iu_i\cdot u_i = 0.$$
Since $u_i$ is non-zero, $u_i\cdot u_i \neq 0$ and therefore $c_i = 0$. This argument works for every index i = 1, ..., p. We get a contradiction with our assumption that one of the $c_i$ is non-zero.

An orthogonal basis for $\mathbb{R}^n$ is a basis which is also an orthogonal set. For example, the two vectors
$$\begin{pmatrix}1\\1\end{pmatrix}, \qquad \begin{pmatrix}1\\-1\end{pmatrix}$$
form an orthogonal basis of $\mathbb{R}^2$ (check!).

Theorem: Eigenvectors of symmetric matrices. Let $v_1, \ldots, v_p$ be eigenvectors of a symmetric matrix A for pairwise distinct eigenvalues $\lambda_1, \ldots, \lambda_p$. Then {$v_1, \ldots, v_p$} is an orthogonal set.

Proof. We need to prove the following: if u and v are eigenvectors of A for eigenvalues λ ≠ μ, then u · v = 0. For a proof, consider the following sequence of equalities:
$$\lambda u\cdot v = (\lambda u)\cdot v = (Au)\cdot v = (Au)^Tv = (u^TA^T)v = (u^TA)v = u^T(Av) = u^T(\mu v) = \mu u^Tv = \mu u\cdot v.$$
Hence
$$\lambda u\cdot v = \mu u\cdot v \quad\text{and}\quad (\lambda - \mu)u\cdot v = 0.$$
Since λ − μ ≠ 0, we have u · v = 0.
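This orthogonality is visible numerically. NumPy's eigh routine, which is tailored to symmetric matrices, returns eigenvectors that are orthonormal. A minimal sketch (not from the notes; the symmetric 3 × 3 matrix is illustrative):

    import numpy as np

    A = np.array([[2., 1., 0.],
                  [1., 3., 1.],
                  [0., 1., 2.]])            # a symmetric matrix

    lam, Q = np.linalg.eigh(A)              # columns of Q are eigenvectors of A
    print(np.allclose(Q.T @ Q, np.eye(3)))  # True: the eigenvectors are orthonormal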
Corollary. If A is an n × n symmetric matrix with n distinct eigenvalues $\lambda_1, \ldots, \lambda_n$, then the corresponding eigenvectors $v_1, \ldots, v_n$ form an orthogonal basis of $\mathbb{R}^n$.

An orthonormal basis in $\mathbb{R}^n$ is an orthogonal basis $u_1, \ldots, u_n$ made of unit vectors: $\|u_i\| = 1$ for all i = 1, 2, ..., n. Notice that this definition can be reformulated as
$$u_i\cdot u_j = \begin{cases}1 & \text{if } i = j\\ 0 & \text{if } i \neq j.\end{cases}$$

Theorem: Coordinates with respect to an orthonormal basis. If {$u_1, \ldots, u_n$} is an orthonormal basis and v a vector in $\mathbb{R}^n$, then
$$v = x_1u_1 + \cdots + x_nu_n, \quad\text{where}\quad x_i = v\cdot u_i \;\text{ for all } i = 1, 2, \ldots, n.$$
Proof. Let v = $x_1u_1 + \cdots + x_nu_n$; then, for i = 1, 2, ..., n,
$$v\cdot u_i = x_1u_1\cdot u_i + \cdots + x_nu_n\cdot u_i = x_iu_i\cdot u_i = x_i.$$

Theorem: Eigenvectors of symmetric matrices. Let A be a symmetric n × n matrix with n distinct eigenvalues $\lambda_1, \ldots, \lambda_n$. Then $\mathbb{R}^n$ has an orthonormal basis made of eigenvectors of A.

Lectures 28 and 29: Inner product spaces

An inner (or dot) product is a function which associates with each pair of vectors u and v in a vector space V a real number, denoted u · v, subject to the following axioms:

1. u · v = v · u
2. (u + v) · w = u · w + v · w
3. (cu) · v = c(u · v)
4. u · u ≥ 0, and u · u = 0 if and only if u = 0

Inner product space. A vector space with an inner product is called an inner product space.

Example: School Geometry. The ordinary Euclidean plane of school geometry is an inner product space:
• Vectors: directed segments starting at the origin O.
• Addition: the parallelogram rule.
• Product by scalar: stretching the vector by a factor of c.
• Dot product: u · v = $\|u\|\|v\|\cos\theta$, where θ is the angle between the vectors u and v.

Example: C[0, 1]. The vector space C[0, 1] of real-valued continuous functions on the segment [0, 1] becomes an inner product space if we define the inner product by the formula
$$f\cdot g = \int_0^1 f(x)g(x)\,dx.$$
In this example, the tricky bit is to show that the dot product on C[0, 1] satisfies the axiom
u · u ≥ 0, and u · u = 0 if and only if u = 0;
that is, if f is a continuous function on [0, 1] and
$$\int_0^1 f(x)^2\,dx = 0,$$
then f(x) = 0 for all x ∈ [0, 1]. This requires the use of properties of continuous functions; since we study linear algebra, not analysis, I am leaving it to the readers as an exercise.

Length. The length of the vector u is defined as
$$\|u\| = \sqrt{u\cdot u}.$$
Notice that $\|u\|^2 = u\cdot u$, and $\|u\| = 0 \Leftrightarrow u = 0$.

Orthogonality. We say that vectors u and v are orthogonal if u · v = 0.

The Pythagoras Theorem. Vectors u and v are orthogonal if and only if
$$\|u\|^2 + \|v\|^2 = \|u+v\|^2.$$
Proof is a straightforward computation, exactly the same as in Lectures 25-27:
$$\|u+v\|^2 = (u+v)\cdot(u+v) = u\cdot u + 2u\cdot v + v\cdot v = \|u\|^2 + 2u\cdot v + \|v\|^2.$$
Hence
$$\|u+v\|^2 = \|u\|^2 + \|v\|^2 + 2u\cdot v,$$
and $\|u+v\|^2 = \|u\|^2 + \|v\|^2$ if and only if u · v = 0.

The Cauchy-Schwarz Inequality. In the Euclidean plane,
$$u\cdot v = \|u\|\|v\|\cos\theta,$$
hence
$$|u\cdot v| \leqslant \|u\|\|v\|.$$
The following theorem shows that this is a general property of inner product spaces.

Theorem: The Cauchy-Schwarz Inequality. In any inner product space,
$$|u\cdot v| \leqslant \|u\|\|v\|.$$

The Cauchy-Schwarz Inequality: an example. In the vector space $\mathbb{R}^n$ with the dot product of
$$u = \begin{pmatrix}u_1\\\vdots\\u_n\end{pmatrix} \quad\text{and}\quad v = \begin{pmatrix}v_1\\\vdots\\v_n\end{pmatrix}$$
defined as $u\cdot v = u^Tv = u_1v_1 + \cdots + u_nv_n$, the Cauchy-Schwarz Inequality becomes
$$|u_1v_1 + \cdots + u_nv_n| \leqslant \sqrt{u_1^2 + \cdots + u_n^2}\cdot\sqrt{v_1^2 + \cdots + v_n^2},$$
or, which is its equivalent form,
$$(u_1v_1 + \cdots + u_nv_n)^2 \leqslant (u_1^2 + \cdots + u_n^2)\cdot(v_1^2 + \cdots + v_n^2).$$
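The $\mathbb{R}^n$ form of the inequality just stated can be sanity-checked on random vectors. A minimal sketch (not from the notes; the seed and dimension are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    u, v = rng.normal(size=5), rng.normal(size=5)

    lhs = abs(u @ v)
    rhs = np.linalg.norm(u) * np.linalg.norm(v)
    print(lhs <= rhs + 1e-12)    # True for every pair u, v (tolerance for rounding)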
The Cauchy-Schwarz Inequality: one more example. In the inner product space C[0, 1] the Cauchy-Schwarz Inequality takes the form
$$\left|\int_0^1 f(x)g(x)\,dx\right| \leqslant \sqrt{\int_0^1 f(x)^2\,dx}\cdot\sqrt{\int_0^1 g(x)^2\,dx},$$
or, equivalently,
$$\left(\int_0^1 f(x)g(x)\,dx\right)^2 \leqslant \left(\int_0^1 f(x)^2\,dx\right)\cdot\left(\int_0^1 g(x)^2\,dx\right).$$

The proof of the Cauchy-Schwarz Inequality given here is different from the one in the textbook by Lay. Our proof is based on the following simple fact from school algebra:

Let $q(t) = at^2 + bt + c$ be a quadratic function in the variable t with the property that q(t) ≥ 0 for all t. Then
$$b^2 - 4ac \leqslant 0.$$

Now consider the function q(t) in the variable t defined as
$$q(t) = (u + tv)\cdot(u + tv) = u\cdot u + 2t\,u\cdot v + t^2\,v\cdot v = \|u\|^2 + (2u\cdot v)t + (\|v\|^2)t^2.$$
Notice that q(t) is a quadratic function in t and q(t) ≥ 0. Hence
$$(2u\cdot v)^2 - 4\|u\|^2\|v\|^2 \leqslant 0,$$
which can be simplified as
$$|u\cdot v| \leqslant \|u\|\|v\|.$$

Distance. We define the distance between vectors u and v as
$$d(u, v) = \|u - v\|.$$
It satisfies the axioms of a metric:

1. d(u, v) = d(v, u).
2. d(u, v) ≥ 0.
3. d(u, v) = 0 iff u = v.
4. The Triangle Inequality: d(u, v) + d(v, w) ≥ d(u, w).

Axioms 1-3 immediately follow from the axioms for the dot product, but the Triangle Inequality requires some attention.

Proof of the Triangle Inequality. It suffices to prove that
$$\|u + v\| \leqslant \|u\| + \|v\|,$$
or
$$\|u + v\|^2 \leqslant (\|u\| + \|v\|)^2,$$
or
$$(u+v)\cdot(u+v) \leqslant \|u\|^2 + 2\|u\|\|v\| + \|v\|^2,$$
or
$$u\cdot u + 2u\cdot v + v\cdot v \leqslant u\cdot u + 2\|u\|\|v\| + v\cdot v,$$
or
$$2u\cdot v \leqslant 2\|u\|\|v\|.$$
But this is the Cauchy-Schwarz Inequality. Now observe that all rearrangements are reversible, hence the Triangle Inequality holds.

Lecture 30: Orthogonalisation and the Gram-Schmidt Process

Orthogonal basis. A basis $v_1, \ldots, v_n$ of an inner product space V is called orthogonal if $v_i\cdot v_j = 0$ for i ≠ j.

Coordinates with respect to an orthogonal basis:
$$u = c_1v_1 + \cdots + c_nv_n \quad\text{iff}\quad c_i = \frac{u\cdot v_i}{v_i\cdot v_i}.$$

Orthonormal basis. A basis $v_1, \ldots, v_n$ of an inner product space V is called orthonormal if it is orthogonal and $\|v_i\| = 1$ for all i. Equivalently,
$$v_i\cdot v_j = \begin{cases}1 & \text{if } i = j\\0 & \text{if } i \neq j.\end{cases}$$

Coordinates with respect to an orthonormal basis:
$$u = c_1v_1 + \cdots + c_nv_n \quad\text{iff}\quad c_i = u\cdot v_i.$$

The Gram-Schmidt Orthogonalisation Process makes an orthogonal basis $v_1, \ldots, v_p$ from a basis $x_1, \ldots, x_p$:
$$v_1 = x_1$$
$$v_2 = x_2 - \frac{x_2\cdot v_1}{v_1\cdot v_1}v_1$$
$$v_3 = x_3 - \frac{x_3\cdot v_1}{v_1\cdot v_1}v_1 - \frac{x_3\cdot v_2}{v_2\cdot v_2}v_2$$
$$\vdots$$
$$v_p = x_p - \frac{x_p\cdot v_1}{v_1\cdot v_1}v_1 - \frac{x_p\cdot v_2}{v_2\cdot v_2}v_2 - \cdots - \frac{x_p\cdot v_{p-1}}{v_{p-1}\cdot v_{p-1}}v_{p-1}$$

Main Theorem about inner product spaces. Every finite dimensional inner product space has an orthonormal basis and is isomorphic to one of $\mathbb{R}^n$ with the standard dot product
$$u\cdot v = u^Tv = (u_1 \;\cdots\; u_n)\begin{pmatrix}v_1\\\vdots\\v_n\end{pmatrix}.$$
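The Gram-Schmidt formulas above translate line-for-line into code. A minimal sketch (not from the notes; the three basis vectors of $\mathbb{R}^3$ are illustrative):

    import numpy as np

    def gram_schmidt(X):
        """Orthogonalise the columns of X by the Gram-Schmidt process above."""
        V = []
        for x in X.T:
            v = x.copy()
            for w in V:
                v -= (x @ w) / (w @ w) * w   # subtract the projection of x onto each earlier v
            V.append(v)
        return np.column_stack(V)

    X = np.array([[1., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 1.]])             # columns form a basis of R^3
    V = gram_schmidt(X)
    G = V.T @ V
    print(np.allclose(G, np.diag(np.diag(G))))   # True: off-diagonal dot products vanish

Dividing each resulting column by its length then yields an orthonormal basis, as in the Main Theorem.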
© Copyright 2024