Matrix Reference Manual
Matrix Calculus

Contents of Calculus Section

- Notation
- Derivatives of Linear, Quadratic and Cubic Products
- Derivatives of Inverses, Trace and Determinant
- Jacobians and Hessian matrices

Notation

- d/dx (y), for vector y and scalar x, is a vector whose (i) element is dy(i)/dx
- d/dx (y), for scalar y and vector x, is a vector whose (i) element is dy/dx(i)
- d/dx (y^T), for vectors x and y, is a matrix whose (i,j) element is dy(j)/dx(i)
- d/dx (Y), for matrix Y and scalar x, is a matrix whose (i,j) element is dy(i,j)/dx
- d/dX (y), for scalar y and matrix X, is a matrix whose (i,j) element is dy/dx(i,j)
- x_R and x_I are the real and imaginary parts of x
- x^* is the complex conjugate of x
- j is the square root of -1

An expression, y, can only be differentiated with respect to a complex x if it satisfies the Cauchy-Riemann equations: dy/dx_R = -j dy/dx_I. Expressions involving the complex conjugate or Hermitian transpose do not normally satisfy this requirement, so separate expressions for dy/dx_R and dy/dx_I are given in these cases.

In the expressions below matrices and vectors A, B, C do not depend on X.
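The Cauchy-Riemann requirement can be illustrated numerically. The following Python sketch is only an illustration: the helper names `partials` and `satisfies_cauchy_riemann`, the step size and the tolerance are assumptions, not part of the manual. It estimates dy/dx_R and dy/dx_I by central differences and tests the condition in its standard form dy/dx_I = j dy/dx_R (equivalently dy/dx_R = -j dy/dx_I).

```python
def partials(f, x, h=1e-6):
    """Central-difference estimates of (dy/dx_R, dy/dx_I) for y = f(x) at complex x."""
    d_real = (f(x + h) - f(x - h)) / (2 * h)            # perturb the real part
    d_imag = (f(x + 1j * h) - f(x - 1j * h)) / (2 * h)  # perturb the imaginary part
    return d_real, d_imag


def satisfies_cauchy_riemann(f, x, tol=1e-6):
    """True if dy/dx_R = -j dy/dx_I holds at x to within tol."""
    d_real, d_imag = partials(f, x)
    return abs(d_real + 1j * d_imag) < tol


x0 = 1.5 - 0.5j  # arbitrary test point


def analytic(x):
    return x * x  # no conjugate involved


def with_conjugate(x):
    return x.conjugate() * x  # scalar analogue of x^H x


print(satisfies_cauchy_riemann(analytic, x0))        # True
print(satisfies_cauchy_riemann(with_conjugate, x0))  # False
```

For the conjugate expression the two partial derivatives come out close to 2x_R = 3 and 2x_I = -1, which is why such expressions need separate formulas for the real and imaginary parts.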

Derivatives of Linear Products

- d/dx (AYB) = A * d/dx (Y) * B
  - d/dx (Ay) = A * d/dx (y)
- d/dx (x^T A) = A
  - d/dx (x^T) = I
  - d/dx (x^T a) = d/dx (a^T x) = a
- d/dX (a^T X b) = a b^T
  - d/dX (a^T X a) = d/dX (a^T X^T a) = a a^T
- d/dX (a^T X^T b) = b a^T
- d/dx (YZ) = Y * d/dx (Z) + d/dx (Y) * Z
- d/dx_R (Y^H) = (d/dx_R (Y))^H
- d/dx_I (Y^H) = (d/dx_I (Y))^H
- d/dx_R (x^H A) = A
  - d/dx_R (x^H) = I
- d/dx_I (x^H A) = -jA
  - d/dx_I (x^H) = -jI

Derivatives of Quadratic Products

- d/dx (Ax+b)^T C (Dx+e) = A^T C (Dx+e) + D^T C^T (Ax+b)
  - d/dx (x^T C x) = (C + C^T) x
    - [C=C^T]: d/dx (x^T C x) = 2Cx
    - d/dx (x^T x) = 2x
  - d/dx (Ax+b)^T (Dx+e) = A^T (Dx+e) + D^T (Ax+b)
    - d/dx (Ax+b)^T (Ax+b) = 2 A^T (Ax+b)
  - [C=C^T]: d/dx (Ax+b)^T C (Ax+b) = 2 A^T C (Ax+b)
- d/dX (a^T X^T X b) = X (a b^T + b a^T)
  - d/dX (a^T X^T X a) = 2 X a a^T
- d/dX (a^T X^T C X b) = C^T X a b^T + C X b a^T
  - d/dX (a^T X^T C X a) = (C + C^T) X a a^T
  - [C=C^T]: d/dX (a^T X^T C X a) = 2 C X a a^T
- d/dX ((Xa+b)^T C (Xa+b)) = (C + C^T)(Xa+b) a^T
- d/dx_R (Ax+b)^H C (Dx+e) = A^H C (Dx+e) + D^T C^T (Ax+b)^*
  - d/dx_R (x^H C x) = Cx + C^T x^* = Cx + (x^H C)^T
    - [C=C^T]: d/dx_R (x^H C x) = 2C x_R
    - [C=C^H]: d/dx_R (x^H C x) = 2 (Cx)_R
    - d/dx_R (x^H x) = 2 x_R
- d/dx_I (Ax+b)^H C (Dx+e) = j( D^T C^T (Ax+b)^* - A^H C (Dx+e) )
  - d/dx_I (x^H C x) = j( C^T x^* - Cx ) = j( (x^H C)^T - Cx )
    - [C=C^T]: d/dx_I (x^H C x) = 2C x_I
    - [C=C^H]: d/dx_I (x^H C x) = 2 (Cx)_I
    - d/dx_I (x^H x) = 2 x_I

Derivatives of Cubic Products

- d/dx (x^T A x x^T) = (A + A^T) x x^T + x^T A x I

Derivatives of Inverses

- d/dx (Y^{-1}) = -Y^{-1} d/dx (Y) Y^{-1} [2.1]
- d/dX (a^T X^{-1} b) = -X^{-T} a b^T X^{-T} [2.6]

Derivative of Trace

Note: matrix dimensions must result in an n*n argument for tr().

- d/dX (tr(X)) = d/dX (tr(X^T)) = I [2.4]
- d/dX (tr(X^k)) = k (X^{k-1})^T
- d/dX (tr(A X^k)) = SUM_{r=0:k-1} (X^r A X^{k-r-1})^T
- d/dX (tr(A X^{-1} B)) = -(X^{-1} B A X^{-1})^T = -X^{-T} A^T B^T X^{-T} [2.5]
  - d/dX (tr(A X^{-1})) = d/dX (tr(X^{-1} A)) = -X^{-T} A^T X^{-T}
- d/dX (tr(A^T X B^T)) = d/dX (tr(B X^T A)) = AB [2.4]
  - d/dX (tr(X A^T)) = d/dX (tr(A^T X)) = d/dX (tr(X^T A)) = d/dX (tr(A X^T)) = A
- d/dX (tr(A X B X^T C)) = A^T C^T X B^T + C A X B
  - d/dX (tr(X A X^T)) = d/dX (tr(A X^T X)) = d/dX (tr(X^T X A)) = X (A + A^T)
  - d/dX (tr(X^T A X)) = d/dX (tr(A X X^T)) = d/dX (tr(X X^T A)) = (A + A^T) X
- d/dX (tr(A X B X)) = A^T X^T B^T + B^T X^T A^T
- [C: symmetric] d/dX (tr((X^T C X)^{-1} A)) = d/dX (tr(A (X^T C X)^{-1})) = -(C X (X^T C X)^{-1}) (A + A^T) (X^T C X)^{-1}
- [B, C: symmetric] d/dX (tr((X^T C X)^{-1} (X^T B X))) = d/dX (tr((X^T B X) (X^T C X)^{-1})) = -2 (C X (X^T C X)^{-1}) X^T B X (X^T C X)^{-1} + 2 B X (X^T C X)^{-1}

Derivative of Determinant

Note: matrix dimensions must result in an n*n argument for det(). Some of the expressions below involve inverses: these forms apply only if the quantity being inverted is square and non-singular.

- d/dX (det(X)) = d/dX (det(X^T)) = ADJ(X)^T = det(X) * X^{-T}
  - d/dX (det(AXB)) = A^T ADJ(AXB)^T B^T = det(AXB) * A^T (AXB)^{-T} B^T = det(AXB) * X^{-T}
  - d/dX (ln(det(AXB))) = A^T (AXB)^{-T} B^T = X^{-T}
- d/dX (det(X^k)) = k * det(X^k) * X^{-T}
  - d/dX (ln(det(X^k))) = k X^{-T}
- [Real] d/dX (det(X^T C X)) = det(X^T C X) * (C + C^T) X (X^T C X)^{-1}
  - [C: Real, Symmetric] d/dX (det(X^T C X)) = 2 det(X^T C X) * C X (X^T C X)^{-1}
- [C: Real, Symmetric] d/dX (ln(det(X^T C X))) = 2 C X (X^T C X)^{-1}

Jacobian

If y is a function of x, then dy^T/dx is the Jacobian matrix of y with respect to x. Its determinant, |dy^T/dx|, is the Jacobian of y with respect to x and represents the ratio of the hypervolumes dy and dx. The Jacobian occurs when changing variables in an integration: Integral(f(y) dy) = Integral(f(y(x)) |dy^T/dx| dx).

Hessian matrix

If f is a function of x then the symmetric matrix d^2f/dx^2 = d/dx^T (df/dx) is the Hessian matrix of f(x).
A value of x for which df/dx = 0 corresponds to a minimum, maximum or saddle point according to whether the Hessian is positive definite, negative definite or indefinite.

- d^2/dx^2 (a^T x) = 0
- d^2/dx^2 (Ax+b)^T C (Dx+e) = A^T C D + D^T C^T A
  - d^2/dx^2 (x^T C x) = C + C^T
    - d^2/dx^2 (x^T x) = 2I
  - d^2/dx^2 (Ax+b)^T (Dx+e) = A^T D + D^T A
    - d^2/dx^2 (Ax+b)^T (Ax+b) = 2 A^T A
  - [C: symmetric]: d^2/dx^2 (Ax+b)^T C (Ax+b) = 2 A^T C A

The Matrix Reference Manual is written by Mike Brookes, Imperial College, London, UK. Please send any comments or suggestions to [email protected]
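
As an illustration of the Hessian section, the following Python sketch spot-checks the identity d^2/dx^2 (x^T C x) = C + C^T by finite differences and then classifies the stationary point at x = 0 from the definiteness of the Hessian. All helper names (`quad`, `hessian_quadratic`, `numeric_hessian`, `classify_2x2`) are hypothetical. The closed-form eigenvalue test is an assumption chosen for brevity and only handles symmetric 2x2 matrices.

```python
import math


def quad(C):
    """Return f(x) = x^T C x for a square matrix C given as nested lists."""
    def f(x):
        n = len(x)
        return sum(x[i] * C[i][j] * x[j] for i in range(n) for j in range(n))
    return f


def hessian_quadratic(C):
    """Hessian of x^T C x according to the identity C + C^T."""
    n = len(C)
    return [[C[i][j] + C[j][i] for j in range(n)] for i in range(n)]


def numeric_hessian(f, x, h=1e-4):
    """Central-difference estimate of the matrix of second partial derivatives."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return f(y)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


def classify_2x2(H):
    """Classify a stationary point from the eigenvalues of a symmetric 2x2 Hessian."""
    a, b, d = H[0][0], H[0][1], H[1][1]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    lo, hi = mean - radius, mean + radius  # the two eigenvalues
    if lo > 0:
        return "minimum"          # positive definite
    if hi < 0:
        return "maximum"          # negative definite
    if lo < 0 < hi:
        return "saddle point"     # indefinite
    return "inconclusive (singular Hessian)"


C = [[2.0, 1.0], [0.0, 3.0]]   # deliberately non-symmetric
H = hessian_quadratic(C)       # [[4.0, 1.0], [1.0, 6.0]]
H_num = numeric_hessian(quad(C), [0.0, 0.0])
print(H, H_num)
print(classify_2x2(H))                                            # minimum
print(classify_2x2(hessian_quadratic([[1.0, 0.0], [0.0, -1.0]])))  # saddle point
```

Because x^T C x is quadratic, the finite-difference estimate agrees with C + C^T up to rounding error. For a general f the agreement would also depend on the step size h.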