Research Matters How and Why to Estimate Condition February Numbers Matrix Functions 25,for 2009 Nick Higham Nick Higham Director SchoolofofResearch Mathematics The University of Manchester School of Mathematics http://www.maths.manchester.ac.uk/~higham @nhigham, nickhigham.wordpress.com Joint work with Sam Relton. Computational Linear Algebra and Optimization for the Digital Economy 1/6 Edinburgh, October 2013 Sources of Error f : Cn×n → Cn×n . Want to compute f (A) but given A + ∆A not A. ∆A may come from Rounding errors in storing A. Measurement errors. Errors from an earlier computation. Nick Higham Matrix Function Condition Numbers 2 / 33 Sources of Error f : Cn×n → Cn×n . Want to compute f (A) but given A + ∆A not A. ∆A may come from Rounding errors in storing A. Measurement errors. Errors from an earlier computation. Photocopying errors! Nick Higham Matrix Function Condition Numbers 2 / 33 Photocopier http://en.wikipedia.org/wiki/Photocopier “Most current photocopiers use a technology called xerography, a dry process that uses electrostatic charges on a light sensitive photoreceptor to first attract and then transfer toner particles (a powder) onto paper in the form of an image” “There is an increasing trend for new photocopiers to adopt digital technology, thus replacing the older analog technology. With digital copying, the copier effectively consists of an integrated scanner and laser printer.” Nick Higham Matrix Function Condition Numbers 3 / 33 Xerox WorkCentre 7535, 7556 August 2013: German computer scientist David Kriesel discovered the machines often change “6” to “8”. Jbig2 compression algorithm implicated. Nick Higham Matrix Function Condition Numbers 4 / 33 Maxim Every code for solving a numerical problem should return an error estimate or bound with the computed result. Nick Higham Matrix Function Condition Numbers 5 / 33 Conditioning of f (x) Scalar function f ∈ C 2 , and y = f (x), y + ∆y = f (x + ∆x). Then 0 ∆y xf (x) ∆x = + O ∆x 2 . y f (x) x The relative condition number 0 xf (x) . c(x) = f (x) Nick Higham Matrix Function Condition Numbers 6 / 33 Conditioning of f (x) Scalar function f ∈ C 2 , and y = f (x), y + ∆y = f (x + ∆x). Then 0 ∆y xf (x) ∆x = + O ∆x 2 . y f (x) x The relative condition number 0 xf (x) . c(x) = f (x) Condition number of the condition number: assuming xf 0 (x) > 0 and f (x) > 0, 0 00 xc (x) f (x) f 0 (x) [2] c (x) := = 1+x − . c(x) f 0 (x) f (x) Nick Higham Matrix Function Condition Numbers 6 / 33 Fréchet Derivative Fréchet derivative of f : Cn×n → Cn×n at X ∈ Cn×n A linear mapping Lf : Cn×n → Cn×n s.t. for all E ∈ Cn×n f (X + E) − f (X ) − Lf (X , E) = o(kEk). Nick Higham Matrix Function Condition Numbers 7 / 33 Fréchet Derivative Fréchet derivative of f : Cn×n → Cn×n at X ∈ Cn×n A linear mapping Lf : Cn×n → Cn×n s.t. for all E ∈ Cn×n f (X + E) − f (X ) − Lf (X , E) = o(kEk). Examples: (X + E)2 − X 2 = XE + EX + E 2 ⇒ Lx 2 (X , E) = XE + EX . (X + E)−1 − X −1 = −X −1 EX −1 + O(E 2 ) ⇒ Lx −1 (X , E) = −X −1 EX −1 . Nick Higham Matrix Function Condition Numbers 7 / 33 Fréchet Derivative Fréchet derivative of f : Cn×n → Cn×n at X ∈ Cn×n A linear mapping Lf : Cn×n → Cn×n s.t. for all E ∈ Cn×n f (X + E) − f (X ) − Lf (X , E) = o(kEk). Examples: (X + E)2 − X 2 = XE + EX + E 2 ⇒ Lx 2 (X , E) = XE + EX . (X + E)−1 − X −1 = −X −1 EX −1 + O(E 2 ) ⇒ Lx −1 (X , E) = −X −1 EX −1 . Scalar case: Lf (x, e) = f (0 x)e. Nick Higham Matrix Function Condition Numbers 7 / 33 Gâteaux Derivative Gâteaux derivative of matrix function f at A in direction E: f (A + E) − f (A) . →0 Gf (A, E) = lim Weaker notion of differentiability. Nick Higham Matrix Function Condition Numbers 8 / 33 Applications of Fréchet Derivative Computation of correlated choice probabilities (2013), registration of MRI images (2013), Markov models applied to cancer data (2013), matrix geometric mean computation (2012), model reduction (2012), first and second Fréchet derivative in Halley’s method on Banach space (2003). Nick Higham Matrix Function Condition Numbers 9 / 33 Componentwise Sensitivity (1) Let E = ei ejT . Then f (X + ei ejT ) − f (X ) lim = Lf (X , ei ejT ). →0 (Lf (X , ei ejT ))rs measures the sensitivity of f (X )rs to perturbations in aij . Nick Higham Matrix Function Condition Numbers 10 / 33 Componentwise Sensitivity (2) Since Lf is a linear operator, vec(Lf (A, E)) = Kf (A)vec(E) where Kf (A) ∈ Cn derivative. 2 ×n2 is the Kronecker form of the Fréchet a11 a21 a31 . . . ann f11 f21 f31 .. . Kf (A) ≡ Lf (A, ei ejT ) fnn Nick Higham Matrix Function Condition Numbers 11 / 33 Condition Number kf (A + E) − f (A)k . →0 kEk≤ condabs (f , A) = lim sup kLf (A)k := max E6=0 kLf (A, E)k . kEk Lemma condabs (f , A) = kLf (A)k. Nick Higham Matrix Function Condition Numbers 12 / 33 Condition Number kf (A + E) − f (A)k . →0 kEk≤ condabs (f , A) = lim sup kLf (A)k := max E6=0 kLf (A, E)k . kEk Lemma condabs (f , A) = kLf (A)k. Scalar case: condabs (f , x) = |f 0 (x)|. Nick Higham Matrix Function Condition Numbers 12 / 33 Relative Condition Number kf (A + E) − f (A)k →0 kEk≤kAk kf (A)k condrel (f , A) := lim sup = condabs (f , A) Nick Higham kAk . kf (A)k Matrix Function Condition Numbers 13 / 33 Computing Lf : Methods for Specific f exponential Kenney & Laub (1998), Al-Mohy & H (2009) logarithm Al-Mohy, H & Relton (2013) fractional power H & Lin (2013) Latter three methods obtained by differentiating alg for f . Nick Higham Matrix Function Condition Numbers 14 / 33 Computing Lf : Via 2n × 2n Matrix Theorem (Mathias, 1996) If f is 2n − 1 times ctsly diffble, A E f (A) Lf (A, E) f = . 0 A 0 f (A) Note that Lf (A, αE) = αLf (A, E), but α may effect alg used for the evaluation. Nick Higham Matrix Function Condition Numbers 15 / 33 Computing Lf : Complex Step Assume that f : Rn×n → Rn×n and A, E ∈ Rn×n . Then f (A + i hE) − f (A) − i hLf (A, E) = o(h). Thus (Al-Mohy & H, 2010) f (A) ≈ Re f (A + i hE), Lf (A, E) ≈ Im f (A + i hE) . h Errors O(h2 ). h not restricted by fl pt arith considerations. Can take h = 10−100 . f alg must not employ complex arith. Nick Higham Matrix Function Condition Numbers 16 / 33 Condition Estimation (1) Since Lf is a linear operator, vec(Lf (A, E)) = Kf (A)vec(E) where Kf (A) ∈ Cn derivative. 2 ×n2 is the Kronecker form of the Fréchet Can show kLf (A)kF = kKf (A)k2 , kLf (A)k1 ≤ kKf (A)k1 ≤ nkLf (A)k1 . n So problem reduces to matrix norm estimation. Nick Higham Matrix Function Condition Numbers 17 / 33 Condition Estimation (2) Use the block 1-norm estimator of H & Tisseur (2000). For kBk1 it needs Bx and B ∗ y for several x and y . Let vec(X ) = x. Then Kf (A)x = vec(Lf (A, X )). Theorem (H & Lin, 2013) Let f ∈ C 2n−1 and and ef (z) := f (z). If ef (A)∗ = ef (A∗ ) for all A ∈ Cn×n then Kf (A)∗ x = vec(Lef (A, X ∗ )∗ ), where vec(X ) = x. ef = f for most functions of interest. ef = f ⇒ ef (A∗ )∗ = ef (A∗ ). Nick Higham Matrix Function Condition Numbers 18 / 33 Software for 1-Norm Estimator MATLAB: normest1. NAG library: F04YD, F04ZD, Mark 24. Python: Nick Higham Matrix Function Condition Numbers 19 / 33 Python: SciPy 0.13.0 linalg/_expm_frechet.py linalg/_expm_multiply.py linalg/_sqrtm.py (blocked) Nick Higham Matrix Function Condition Numbers 20 / 33 Higher Derivatives Needed for level-2 condition number, understanding accuracy of algorithms for computing Lf (A, E). (2) Second Fréchet derivative Lf (A, E1 , E2 ) is unique mutlilinear function of E1 , E2 satisfying (2) Lf (A+E2 , E1 ) − Lf (A, E1 ) − Lf Nick Higham (A, E1 , E2 ) = o(kE2 k). Matrix Function Condition Numbers 21 / 33 Higher Derivatives Needed for level-2 condition number, understanding accuracy of algorithms for computing Lf (A, E). (2) Second Fréchet derivative Lf (A, E1 , E2 ) is unique mutlilinear function of E1 , E2 satisfying (2) Lf (A+E2 , E1 ) − Lf (A, E1 ) − Lf (A, E1 , E2 ) = o(kE2 k). k th Fréchet derivative defined by (k −1) Lf (k −1) (A+Ek , E1 , . . . , Ek −1 ) − Lf (k ) − Lf Nick Higham (A, E1 , . . . , Ek −1 ) (A, E1 , . . . , Ek ) = o(kEk k). Matrix Function Condition Numbers 21 / 33 Existing Literature Large literature on Fréchet derivatives in Banach space. Need specialized results for matrix functions. Nick Higham Matrix Function Condition Numbers 22 / 33 Existence & Continuity of Fréchet Derivatives D = open subset of C. Cn×n (D, p) = matrices with spectrum in D and largest Jordan block of size p. Theorem (H & Relton, 2013) Let f be 2k p − 1 times continuously differentiable on D. (k ) Then for A ∈ Cn×n (D, p) the kth Fréchet derivative Lf (A) (k ) exists and Lf (A, E1 , . . . , Ek ) is continuous in A and E1 , . . . , Ek ∈ Cn×n . Proof uses Gâteaux derivative. k = 1: Mathias (1996). Nick Higham Matrix Function Condition Numbers 23 / 33 Properties Assume from now on conditions of theorem satisfied. Then the Ei are interchangeable: (2) (2) Lf (A, E1 , E2 ) = Lf (A, E2 , E1 ). Indeed (k ) Lf (A, E1 , . . . , Ek ) ∂ f (A + s1 E1 + · · · + sk Ek ). = ∂s1 · · · ∂sk s=0 Nick Higham Matrix Function Condition Numbers 24 / 33 The Setup Wish to estimate norms and condition numbers. Knowing how to evaluate exact quantities helps develop and test the estimates. In practice we only ever work with n × n matrices. Nick Higham Matrix Function Condition Numbers 25 / 33 How to Compute Lf(2) X1 = Let h i A E f (A) Lf (A,E) 1 . Know f (X ) = . 1 0 f (A) 0 A A E E 0 1 2 0 A 0 E2 0 1 X2 = I2 ⊗ X1 + ⊗ E2 = 0 0 A E1 . 0 0 0 0 0 A Then (2) f (A) Lf (A, E1 ) Lf (A, E2 ) Lf (A, E1 , E2 ) 0 f (A) 0 Lf (A, E2 ) . f (X2 ) = 0 0 f (A) Lf (A, E1 ) 0 0 0 f (A) Nick Higham Matrix Function Condition Numbers 26 / 33 ) How to Compute L(k f i i Define Xi ∈ C2 n×2 n by 0 1 Xi = I2 ⊗ Xi−1 + ⊗ I2i−1 ⊗ Ei , 0 0 X0 = A. Theorem (H & Relton, 2013) (k ) The (1, n) block of f (Xk ) is Lf (A, E1 , . . . , Ek ). Nick Higham Matrix Function Condition Numbers 27 / 33 Level-2 Condition Number “Condition number of the condition number”. Demmel (1987) showed that for matrix inversion (and D. J. Higham, 1995), the eigenproblem, polynomial zero-finding, pole assignment in linear control problems (relative) level-1 and level-2 cond no’s are equivalent. Cheung & Cucker (2005) show same holds when “condition number = 1 / distance to nearest ill-posed problem”. Nick Higham Matrix Function Condition Numbers 28 / 33 Level-2 Condition Number [2] |condabs (f , A + Z ) − condabs (f , A)| . →0 kZ k≤ condabs (f , A) = lim sup Theorem (H & Relton, 2013) [2] (2) condabs (f , A) ≤ kKf (A)k2 , (2) for a Kronecker matrix Kf (A) ∈ Cn Nick Higham 4 ×n2 . Matrix Function Condition Numbers 29 / 33 Level-2 Condition Number: Exponential Theorem Let A ∈ Cn×n be normal. Then in the 2-norm [2] condabs (exp, A) = condabs (exp, A). Nick Higham Matrix Function Condition Numbers 30 / 33 Level-2 Condition Number: Exponential Theorem Let A ∈ Cn×n be normal. Then in the 2-norm [2] condabs (exp, A) = condabs (exp, A). Theorem Let A ∈ Cn×n be normal. Then in the 2-norm [2] 1 ≤ condrel (exp, A) ≤ 2condrel (exp, A) + 1. Nick Higham Matrix Function Condition Numbers 30 / 33 Level-2 Condition Number: Matrix Inverse Theorem For nonsingular A ∈ Cn×n , [2] condabs (x −1 , A) = 2condabs (x −1 , A)3/2 . Latter is what is expected from the scalar case: f (x) = x −1 ⇒ |f 00 | = |2(f 0 )3/2 |. Have experimental evidence that matrix case can be similar to scalar case. Nick Higham Matrix Function Condition Numbers 31 / 33 Condition Number of the Fréchet Derivative kLf (A + ∆A, E + ∆E) − Lf (A, E)k . →0 k∆Ak≤ condabs (Lf , A, E) = lim sup k∆Ek≤ Theorem (H & Relton, 2013) We have condabs (f , A) ≤ condabs (Lf , A, E) (2) ≤ max kLf (A, E, Z )k + condabs (f , A). kZ k=1 We have developed method to estimate the bounds. Nick Higham Matrix Function Condition Numbers 32 / 33 Summary and Open Questions Showed that given an O(n3 ) flop method for f : (k ) Computing Lf (A, E) costs O(8k n3 ) flops. (k ) Computing Kf (A) costs O(8k n3+2k ) flops. Cost of estimates is O(n3 ) flops, given an O(n3 ) flop method for Lf . Open questions about relation between level-1 and level-2 condition numbers. How can we exploit the symmetry of (k ) Lf (E1 , E2 , . . . , Ek ) in the Ei ? Looking at componentwise condition numbers. Nick Higham Matrix Function Condition Numbers 33 / 33 References I S. D. Ahipasaoglu, X. Li, and K. Natarajan. A convex optimization approach for computing correlated choice probabilities with many alternatives. Preprint 4034, Optimization Online, 2013. 26 pp. A. H. Al-Mohy and N. J. Higham. Computing the Fréchet derivative of the matrix exponential, with an application to condition number estimation. SIAM J. Matrix Anal. Appl., 30(4):1639–1657, 2009. Nick Higham Matrix Function Condition Numbers 1/8 References II A. H. Al-Mohy and N. J. Higham. The complex step approximation to the Fréchet derivative of a matrix function. Numer. Algorithms, 53(1):133–148, 2010. A. H. Al-Mohy, N. J. Higham, and S. D. Relton. Computing the Fréchet derivative of the matrix logarithm and estimating the condition number. SIAM J. Sci. Comput., 35(4):C394–C410, 2013. S. Amat, S. Busquier, and J. M. Gutiérrez. Geometric constructions of iterative functions to solve nonlinear equations. J. Comput. Appl. Math., 157(1):197–205, 2003. Nick Higham Matrix Function Condition Numbers 2/8 References III J. Ashburner and G. R. Ridgway. Symmetric diffeomorphic modelling of longitudinal structural MRI. Frontiers in Neuroscience, 6, 2013. Article 197. J. W. Demmel. On condition numbers and the distance to the nearest ill-posed problem. Numer. Math., 51:251–289, 1987. Nick Higham Matrix Function Condition Numbers 3/8 References IV B. García-Mora, C. Santamaría, G. Rubio, and J. L. Pontones. Computing survival functions of the sum of two independent Markov processes. An application to bladder carcinoma treatment. Int. Journal of Computer Mathematics, 2013. D. J. Higham. Condition numbers and their condition numbers. Linear Algebra Appl., 214:193–213, 1995. Nick Higham Matrix Function Condition Numbers 4/8 References V N. J. Higham. Functions of Matrices: Theory and Computation. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008. ISBN 978-0-898716-46-7. xx+425 pp. N. J. Higham and A. H. Al-Mohy. Computing matrix functions. Acta Numerica, 19:159–208, 2010. N. J. Higham and L. Lin. An improved Schur–Padé algorithm for fractional powers of a matrix and their Fréchet derivatives. SIAM J. Matrix Anal. Appl., 34(3):1341–1360, 2013. Nick Higham Matrix Function Condition Numbers 5/8 References VI N. J. Higham and S. D. Relton. The condition number of the Fréchet derivative of a matrix function. MIMS EPrint, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, 2013. In preparation. N. J. Higham and S. D. Relton. Higher order Fréchet derivatives of matrix functions and the level-2 condition number. MIMS EPrint 2013.58, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, Nov. 2013. 19 pp. Nick Higham Matrix Function Condition Numbers 6/8 References VII B. Jeuris, R. Vandebril, and B. Vandereycken. A survey and comparison of contemporary algorithms for computing the matrix geometric mean. Electron. Trans. Numer. Anal., 39:379–402, 2012. C. S. Kenney and A. J. Laub. A Schur–Fréchet algorithm for computing the logarithm and exponential of a matrix. SIAM J. Matrix Anal. Appl., 19(3):640–663, 1998. R. Mathias. A chain rule for matrix functions and applications. SIAM J. Matrix Anal. Appl., 17(3):610–620, 1996. Nick Higham Matrix Function Condition Numbers 7/8 References VIII D. Petersson. A Nonlinear Optimization Approach to H2 -Optimal Modeling and Control. PhD thesis, Department of Electrical Engineering, Linköping University, Sweden, SE-581 83 Linköping, Sweden, 2013. Dissertation No. 1528. D. Petersson and J. Löfberg. Model reduction using a frequency-limited H2 -cost. Technical report, Department of Electrical Engineering, Linköpings Universitet, SE-581 83, Linköping, Sweden, Mar. 2012. 13 pp. Nick Higham Matrix Function Condition Numbers 8/8
© Copyright 2025