Advanced Topics in Control: Distributed Systems and Control
Project #2

Note: the project is due Monday, May 25th

Leaders, followers and likeness in networks

Introduction

Consider a digraph $G = (V, E)$ describing the web, where the vertex set $V$ of cardinality $n$ represents web pages and the edge set $E$ of cardinality $m$ represents the links between pages, with $(i, j) \in E$ if page $i$ contains a link to page $j$. Let $A_{01} \in \{0, 1\}^{n \times n}$ be the 0-1 adjacency matrix of the graph; let $D$ be the diagonal matrix whose entry $[D]_{ii}$ corresponds to the out-degree of node $i$ and define $A := D^{-1} A_{01}$; notice that $A$ is nonnegative and row-stochastic.

In exercise session 3, we introduced the page-rank algorithm, in order to rank web pages of $G$ according to their relevance. The idea behind the page-rank algorithm is that the importance of a page $i$ should be increased by every page $j$ that links to $i$, by an amount proportional to the importance of page $j$.

For every page $i \in \{1, \dots, n\}$ we denote by $x_i \in [0, 1]$, s.t. $\sum_{i=1}^{n} x_i = 1$, its relative importance. According to the definition given before, this can be computed as

$$x_i = \sum_{j=1}^{n} [A]_{ji}\, x_j , \tag{1}$$

or in vector form

$$x = A^\top x . \tag{2}$$

Finding the solution of the algebraic equalities (2) is computationally too expensive for a network as big as the web. Therefore, in the page-rank algorithm the solution of (2) is obtained as the asymptotic value $\bar{x} := \lim_{k \to \infty} x_k$ of the iterative algorithm

$$x_{k+1} = A^\top x_k , \qquad x_0 = p , \tag{3}$$

with the initial value $p$ such that $p_i \in [0, 1]$ and $\sum_{i=1}^{n} p_i = 1$. As explained in exercise session 3 and in paragraph 5.6 of the lecture notes, the page-rank algorithm is actually a slight modification of (3).

Task 1: modeling

Leader-follower index

In this project you will develop and extend the concept of importance of a vertex in a graph. Coming back to the example of the web, we now consider two indices for every web page $i$: a leader index $\ell_i$, and a follower index $f_i$.
If we perform a web search with the keyword “university”, the homepages of ETH, University of Zurich and other universities are good leaders, while web pages pointing to these home pages are good followers. In other words, good followers are pages that link to good leaders, and good leaders are pages that are pointed to by good followers.

Task (1.A) Based on the description above, write one equation expressing the follower index $f_i$ and one equation expressing the leader index $\ell_i$ of node $i$; your equations should have a structure similar to (1), but with $[A]_{ji}$ replaced by $[A_{01}]_{ji}$. Introduce the compact notation

$$x = \begin{bmatrix} f \\ \ell \end{bmatrix} \in \mathbb{R}^{2|V|} ,$$

where the vector $x$ stacks together the follower and leader indices of all the vertices; the two equations that you wrote can be expressed in compact form as

$$x = M x , \tag{4}$$

where the matrix $M$ is symmetric and nonnegative. Express $M$ in terms of the 0-1 adjacency matrix $A_{01}$.

1-2-3 index

We can give a different interpretation to the leader score $\ell_i$ and follower score $f_i$ of vertex $i$ by considering the graph with two vertices

$$\text{follower} \longrightarrow \text{leader} \tag{5}$$

Then $\ell_i$ can be interpreted as a measure of how much vertex $i$ of the graph $G$ and vertex “leader” in graph (5) are alike; in the same way, we interpret $f_i$ as a measure of how much vertex $i$ of the graph $G$ and vertex “follower” in graph (5) are alike. As you expressed in task (1.A), the likeness index between $i$ and “leader” is the sum of the likeness indices between the in-neighbors of $i$ and “follower”. Analogously, the likeness index between $i$ and “follower” is the sum of the likeness indices between the out-neighbors of $i$ and “leader”.
Building on this interpretation, instead of computing the likeness index between vertex $i$ of $G$ and the two vertices of (5), we compute in what follows the likeness index between vertex $i$ of $G$ and the vertices of the following 1-2-3 graph

$$1 \longrightarrow 2 \longrightarrow 3 \tag{6}$$

For each vertex $i$ of the graph $G$, we can define the index $x_{i1}$, indicating how much $i$ in $G$ and 1 in (6) are alike, the index $x_{i2}$, indicating how much $i$ in $G$ and 2 in (6) are alike, and the index $x_{i3}$, indicating how much $i$ in $G$ and 3 in (6) are alike. As for the leader and follower indices, the 1, 2 and 3 indices of $i$ depend in turn on the 1, 2 and 3 indices of the neighbors of $i$ in $G$. Specifically:
• $x_{i1}$ is the sum of the likeness indices between $j$ and 2, for all the out-neighbors $j$ of $i$;
• $x_{i2}$ is the sum of the likeness indices between $j$ and 1, for all the in-neighbors $j$ of $i$, plus the sum of the likeness indices between $j$ and 3, for all the out-neighbors $j$ of $i$;
• $x_{i3}$ is the sum of the likeness indices between $j$ and 2, for all the in-neighbors $j$ of $i$.

Task (1.B) Based on the description above, write one equation describing the 1-index $x_{i1}$, one equation describing the 2-index $x_{i2}$ and one describing the 3-index $x_{i3}$ of node $i$. These equations have the structure of (1), but with $[A]_{ji}$ replaced by $[A_{01}]_{ji}$. Introduce the compact notation

$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \in \mathbb{R}^{3|V|} ,$$

where the vector $x$ stacks together the 1-index, 2-index and 3-index of all the vertices; the three equations that you derived can be rewritten in compact form as

$$x = M x , \tag{7}$$

where the matrix $M$ is symmetric and nonnegative. Express $M$ in terms of the 0-1 adjacency matrix $A_{01}$.

General formulation: likeness index

We now come to a more general description, where we introduce the likeness between the graph $G = (V, E)$ and an arbitrary reference graph $G_R = (V_R, E_R)$.
For the leader-follower case, the reference graph $G_R$ was given by (5), while in the 1-2-3 case the reference graph was given by (6). For each node $i$ in $G$ and each node $j$ in $G_R$, we introduce an index $x_{ij}$, describing how much the node $i$ and the node $j$ are alike. Generalizing what was done for the leader-follower graph (5) and for the 1-2-3 graph (6), we say that the likeness index $x_{ij}$ is the sum of the likeness indices between each in-neighbor of $i$ in $G$ and each in-neighbor of $j$ in $G_R$, plus the sum of the likeness indices between each out-neighbor of $i$ in $G$ and each out-neighbor of $j$ in $G_R$.

Task (1.C) Based on the description above, write an equation describing the likeness index $x_{ij}$ between node $i$ in $G$ and node $j$ in $G_R$. This equation generalizes the structure of (1). If we define the likeness matrix $X \in \mathbb{R}^{|V| \times |V_R|}$ such that $[X]_{ij} = x_{ij}$, then the equation that you wrote for one node $i$ in $G$ and one node $j$ in $G_R$ can be expressed for all the nodes of $G$ and of $G_R$ at the same time in matrix form, by making use of $X$, the 0-1 adjacency matrix $A_{01}$ of $G$ and the 0-1 adjacency matrix $A_{01,R}$ of $G_R$; report this expression, which we refer to in the following as the matrix-form likeness equation.

This equation is linear in the entries of the likeness matrix $X$; it is possible to make this linear dependence more explicit by means of the matrix-to-vector operator called vectorization (and denoted $\mathrm{vec}$), which transforms the matrix $X$ into the vector $x$ by stacking its columns one after the other. It is then possible to express the matrix-form likeness equation as

$$x = M x , \tag{8}$$

which generalizes (4) and (7) and where the matrix $M$ is symmetric and nonnegative. Provide an expression for the matrix $M$ in (8), by exploiting the properties that relate the operator $\mathrm{vec}$, the matrix multiplication and the Kronecker product.

As for the solution of (2), finding the solution of (8) (which generalizes (4) and (7)) can be computationally too expensive for a very large network.
For this reason, as was done for page-rank in (3), we introduce the iterative algorithm

$$z_{k+1} = \frac{M z_k}{\| M z_k \|_2} , \qquad z_0 > 0 , \tag{9}$$

where the normalization enforces the Euclidean norm of $z_k$ to be equal to one for all $k > 0$. Since we are interested in solutions to (8), we investigate the convergence properties of the sequence (9) in the following Task 2.

Task 2: convergence analysis

Assumption 1: the spectral radius $\rho$ of the matrix $M$ in (9) is larger than the magnitude of any other eigenvalue of $M$.

Task (2.A) Consider a nonnegative symmetric matrix $M$ satisfying Assumption 1 (not necessarily of the form derived in the previous tasks); prove that the iteration (9) converges and provide an expression for the limit value it converges to. To this end, you could use some of the following facts:
• symmetric matrices admit an eigenvalue decomposition;
• every symmetric nonnegative matrix can be permuted to a block-diagonal matrix, with irreducible blocks.

Task (2.B) Assumption 1 is restrictive and in the following we remove it; consider a nonnegative symmetric matrix $M$; by modifying the proof of task (2.A), show that the two subsequences $z_{2k}$ and $z_{2k+1}$ in (9) converge, i.e.

$$\lim_{k \to \infty} z_{2k} = z_{\text{even}}(z_0) , \qquad \lim_{k \to \infty} z_{2k+1} = z_{\text{odd}}(z_0) ,$$

and provide an expression for $z_{\text{even}}(z_0)$ and $z_{\text{odd}}(z_0)$.

Task (2.C) Provide a characterization of $\{ z_{\text{even}}(z_0) : z_0 > 0 \} \cup \{ z_{\text{odd}}(z_0) : z_0 > 0 \}$. Moreover, by using the Schwarz inequality, show that $z_{\text{even}}(\mathbb{1})$, where $\mathbb{1}$ denotes the vector of all ones, is the vector of largest 1-norm in that set.

Since $z_{\text{even}}$ and $z_{\text{odd}}$ are in general different and depend on the initial condition $z_0$, in the following we will adopt $z_{LK} := z_{\text{even}}(\mathbb{1})$ as the definition of likeness vector; as a consequence, we can define the likeness matrix $Z_{LK} \in \mathbb{R}^{|V| \times |V_R|}$, where $[Z_{LK}]_{ij} = [z_{LK}]_{(j-1)|V| + i}$. In other terms, if we define the inverse of the vectorization operator $\mathrm{vec}$ introduced in task (1.C) as $\mathrm{vec}^{-1}$, then $Z_{LK} = \mathrm{vec}^{-1}(z_{LK})$.
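As a numerical companion to iteration (9), the following minimal sketch runs the normalized iteration on a small symmetric nonnegative matrix. The function name `normalized_iteration` and the 2×2 toy matrix are assumptions made for illustration; the matrix is not one of the matrices $M$ requested in Task 1. Its eigenvalues are $+1$ and $-1$, so Assumption 1 does not hold, and the even and odd iterates settle on different vectors.

```python
import math

def normalized_iteration(M, z0, steps):
    """Run z_{k+1} = M z_k / ||M z_k||_2 and return [z_0, z_1, ..., z_steps]."""
    z = list(z0)
    history = [z]
    for _ in range(steps):
        # Matrix-vector product w = M z
        w = [sum(m_ij * z_j for m_ij, z_j in zip(row, z)) for row in M]
        norm = math.sqrt(sum(w_i * w_i for w_i in w))
        if norm == 0.0:
            raise ValueError("M z_k is zero, so the normalization in (9) is undefined")
        z = [w_i / norm for w_i in w]
        history.append(z)
    return history

# Hypothetical toy matrix: symmetric, nonnegative, eigenvalues +1 and -1
M = [[0.0, 1.0],
     [1.0, 0.0]]
history = normalized_iteration(M, [1.0, 2.0], 6)
z_even = history[6]  # an even-indexed iterate
z_odd = history[5]   # an odd-indexed iterate
print(z_even, z_odd)
```

For larger matrices a numerical tool with native matrix operations (Matlab, Octave or similar) would be the natural choice; the pure-Python loop above is only meant to make the structure of (9) explicit.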
Task 3: Self-likeness

If we compare the graph $G$ with a reference graph which is $G$ itself, i.e. $G = G_R$, then the likeness matrix $Z_{LK}$ is a square matrix and we refer to it as the self-likeness matrix. We expect each vertex to have a high likeness with itself; this intuition is to be formalized in the following task.

Task (3.A) Given a graph $G$, show that the largest element of its self-likeness matrix appears on the diagonal. If a diagonal element is zero, what can be said about the corresponding rows and columns?

Hint: Express the analogue of iteration (9) for the matrix-form likeness equation; notice that $Z(k) \to Z_{LK}$ and study whether $Z(k)$ is positive semi-definite, positive definite or indefinite.

Task 4: Example

Consider the butterfly graph reported in Figure 1, where the central node is pointed to by $n$ nodes and points to $m$ nodes and where each node is assigned a label.

Figure 1: The butterfly graph

Task (4.A) Find the leader index and the follower index for each node of the butterfly graph (making a distinction between the case $m > n$ and $m < n$); moreover, find the 1-index, the 2-index and the 3-index for each node of the butterfly graph; justify your findings with analytical and/or numerical evidence. Which index do you think is more appropriate to describe the structure of the butterfly graph and why?
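For numerical experiments on the butterfly graph of Task 4, the 0-1 adjacency matrix can be generated for arbitrary $n$ and $m$. The sketch below is only an illustration: the function name `butterfly_adjacency` and the 0-based labelling (sources first, then the central node, then the sinks) are assumptions and need not match the labels of Figure 1.

```python
def butterfly_adjacency(n, m):
    """0-1 adjacency matrix of a butterfly graph: n sources -> centre -> m sinks.

    Assumed 0-based labelling: sources 0..n-1, centre n, sinks n+1..n+m.
    Entry [i][j] is 1 if there is an edge from i to j.
    """
    size = n + m + 1
    centre = n
    A01 = [[0] * size for _ in range(size)]
    for i in range(n):
        A01[i][centre] = 1      # each source points to the centre
    for j in range(n + 1, size):
        A01[centre][j] = 1      # the centre points to each sink
    return A01

A01 = butterfly_adjacency(2, 3)
out_degree_centre = sum(A01[2])
in_degree_centre = sum(row[2] for row in A01)
print(out_degree_centre, in_degree_centre)
```

The resulting matrix can then be used to assemble the matrices $M$ derived in Task 1 for different choices of $n$ and $m$.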
Task 5: Application

Task (5.A) Make up three graphs and on each of them compute (analytically and/or numerically) the self-likeness matrix. Choose the three graphs so as to show different patterns of the self-likeness matrix. You can focus on graphs with a low number of vertices.

Task (5.B) Download the thesaurus graph from http://control.ee.ethz.ch/~ifaatic/2015/Material/projects/thesaurus.zip. The thesaurus graph has a vertex for every word, and there exists an edge from $i$ to $j$ if $j$ appears in the definition of $i$.

Let us now consider the set of four words $W = \{\text{meeting}, \text{learn}, \text{salt}, \text{possibility}\}$. Given the vertex associated to a word $w \in W$, construct $G_w$ as the subgraph relative to $w$, composed of all the neighboring vertices of $w$ and the corresponding edges. We consider here a subgraph because handling the entire graph would result in time-consuming computations.

Using a simulation tool (such as Matlab, Octave or similar), compute the 1-index, the 2-index and the 3-index for every vertex of $G_w$, for all the words $w \in W$. For every word $w \in W$, sort the vertices of $G_w$ by their 2-index and give an interpretation of this ranking; how do the top 2-index scorers relate to the original word $w$?
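The construction of the subgraph $G_w$ in Task (5.B) can be sketched as follows. Several points are assumptions made for illustration only: the graph is given as a list of `(i, j)` edge pairs, "neighboring vertices" is read as both in-neighbors and out-neighbors, the word $w$ itself is kept in $G_w$, and the toy edge list is invented (it is not taken from the thesaurus file, whose format is not described here).

```python
def neighborhood_subgraph(edges, w):
    """Return (vertices, sub_edges) of the subgraph induced by w and its neighbors.

    edges: iterable of (i, j) pairs, meaning that j appears in the definition of i.
    Assumption: w itself and both its in- and out-neighbors are kept.
    """
    edges = list(edges)
    nodes = {w}
    for i, j in edges:
        if i == w:
            nodes.add(j)        # out-neighbor of w
        elif j == w:
            nodes.add(i)        # in-neighbor of w
    # Keep every edge whose two endpoints both belong to the selected vertices
    sub_edges = [(i, j) for i, j in edges if i in nodes and j in nodes]
    return sorted(nodes), sub_edges

def adjacency_matrix(vertices, edges):
    """0-1 adjacency matrix with rows and columns ordered as in `vertices`."""
    index = {v: k for k, v in enumerate(vertices)}
    A01 = [[0] * len(vertices) for _ in vertices]
    for i, j in edges:
        A01[index[i]][index[j]] = 1
    return A01

# Invented toy edge list, for illustration only
toy_edges = [("salt", "sodium"), ("salt", "sea"), ("brine", "salt"),
             ("brine", "sea"), ("sea", "water"), ("sodium", "metal")]
vertices, sub_edges = neighborhood_subgraph(toy_edges, "salt")
A01_w = adjacency_matrix(vertices, sub_edges)
print(vertices)
print(A01_w)
```

Restricting to the neighborhood of $w$ keeps the adjacency matrix small, so the indices can be computed on it in a short time.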