Università di Pisa
Scuola di dottorato in Scienze di base "Galileo Galilei"
Dottorato in Fisica, Ciclo XXVII

The Euclidean Matching Problem
Statistical Physics Methods for Optimisation

A Thesis for the degree of Philosophiæ Doctor in Theoretical Physics

Supervisor: Prof. Sergio Caracciolo
Candidate: Gabriele Sicuro

Pisa, December 17, 2014

Contents

Introduction
    A matching history

1. Optimisation, disorder and statistical mechanics
    1.1. Graph Theory
    1.2. Combinatorial Optimisation: generalities
    1.3. Weighted matching problems
        1.3.1. The Hungarian algorithm
        1.3.2. Random matching problems
    1.4. Optimisation and statistical mechanics: replicas for optimisation
        1.4.1. Spin glasses
        1.4.2. The solution of the random matching problem via replicas
    1.5. Models on graphs and Belief propagation
        1.5.1. Factor graph and Belief propagation

2. Euclidean Matching Problems
    2.1. Euclidean optimisation problems
        2.1.1. Euclidean optimisation and randomness: empirical measures
        2.1.2. Asymptotics for Euclidean functionals
    2.2. Euclidean matching problems
    2.3. The Monge–Kantorovič transport problem
    2.4. One-dimensional Ebmp: an exact solution for convex weights
        2.4.1. Ebmp on the interval
        2.4.2. Ebmp on the circumference
        2.4.3. General solution and universality for p > 1
    2.5. Higher-dimensional Euclidean matching problem
    2.6. Correlation functions for the Ebmp
    2.7. Functional approach
        2.7.1. Uniform density on the hypertorus
        2.7.2. Matching problem on the line
    2.8. Emmp via replicas
        2.8.1. Recovering the mean field approximation
        2.8.2. Polygonal approximation and zero temperature limit

Conclusions

A. The Wiener process and the Brownian bridge process
    A.1. Wiener process and Brownian bridge process
    A.2. Probability distributions

B. The Simplex Method
    B.1. Linear optimisation problems
    B.2. The algorithm

Bibliography

Introduction

A matching history

In 1781 a French geometer, Gaspard Monge (1746–1818), published his Mémoire sur la théorie des déblais et des remblais, in which he discussed the following very simple problem.
Let us suppose that we have a certain number of mines and the same number of deposits. We want to associate each mine with one deposit only (where the production of the considered mine will be transported and stored). How can we perform this matching in such a way that the total transport cost is minimal?

Monge made the quite natural assumption that the transport cost from a certain mine to a certain deposit is a given function of the distance between the mine and the deposit. Moreover, in this formulation of the problem everything is deterministic: we know where mines and deposits are, we know their distances, and therefore the problem is fixed in all its details. However, the problem of finding an optimal matching between mines and deposits, given their positions e.g. on a chart, is simple in its formulation but quite difficult to solve: if the number of mines is N, we have N! ways to match mines and deposits, and we have to select the cheapest one. It is self-evident that, if N is quite large, a brute-force approach is not feasible: if we are able to compute the cost of a matching in, let us say, 10⁻³ seconds, for N = 20 we have to wait about 7 · 10⁷ years to obtain the solution in the worst case. It is clear therefore that a smarter approach is needed. Only in 1955¹ did Harold Kuhn propose an algorithm, called the Hungarian algorithm, able to solve the problem in a computation time that scales as N³ in the size N of the original problem [29]: we will discuss this algorithm in Section 1.3.1. The Hungarian algorithm proves that the problem is in the so-called P computational complexity class, but the computation time required to solve it still grows quite rapidly as N increases.

The idea of an algorithmic solution of the matching problem was interesting by itself in the 18th century, but of little practical use, due to the lack of computational resources. Monge therefore reformulated the problem in a "continuum" version, in which the matching between points of different types was replaced by a transport map between two different measures on the same domain, to be found in such a way that a certain functional of the map is minimised: in Measure Theory this problem is called the optimal transport problem. The combinatorial problem became therefore an interesting problem at the edge between Measure Theory and Geometry. This approach was developed in 1938 by Leonid V. Kantorovič (1912–1986): his (dual) reformulation of the problem is of paramount importance in Measure Theory, Economics and Linear Programming, and led him to the Nobel Prize in Economics in 1975 (we will briefly discuss his contributions in Section 2.3). During the last decade the interest in the theory of optimal transport has increased enormously in the mathematical community, due to the results of Luigi Ambrosio, Luis A. Caffarelli, Alessio Figalli, Cédric Villani and others on the existence and the properties of optimal transport maps (see the recent review paper of Bogachev and Kolesnikov [9] and the monograph by Villani [53]).

Both the previous approaches (the continuum one and the combinatorial one) assume that no disorder or randomness is present in the problem (the point positions or the measures are supposed given).

¹ Remarkably, in 2006 it was discovered that the problem had been solved by Carl Gustav Jacobi, but his work was published only in 1890, in Latin, and it was ignored at the time.
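The worst-case estimate above is easy to reproduce; a minimal sketch, assuming the 10⁻³ s per evaluation quoted in the text:

```python
import math

# Brute force evaluates all N! matchings; at the 10^-3 s per evaluation
# assumed in the text, N = 20 already requires ~7.7e7 years.
seconds = math.factorial(20) * 1e-3
print(seconds / (3600 * 24 * 365))
```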
When we find the optimal matching between the two sets of points, or the optimal map between the two measures, the problem is solved. But we can consider the problem from a different point of view. Let us suppose, for example, that we have two sets of random points on a certain domain and that we ask for the optimal matching between them, in such a way that a certain functional of the matching itself and of the point positions is minimal. The specific solution in this case is not of great interest: more interestingly, we could ask what the average properties of the optimal matching are (e.g., the average optimal cost) as functions of the parameters of the problem and of the specific cost functional adopted. The problem, called in this version the random Euclidean matching problem, is not trivial at all: indeed, correlation among the distances is present, due to the Euclidean constraint. Marc Mézard and Giorgio Parisi treated this problem in a set of papers published between 1985 and 1988 [35–37, 40]: they considered as a first-order approximation the so-called random matching problem, in which the distances are supposed totally uncorrelated, as a sort of mean field model of the original problem; subsequently, they introduced correlations as corrections to the mean field results, only sketching the complete computation. Their results are remarkable for different important reasons: firstly, they were able to give a complete solution of the purely random case, obtaining the correct average optimal cost and its distribution; secondly, their results were obtained using statistical physics techniques developed for the study of disordered systems. Their approach was therefore not rigorous, but the results were completely confirmed sixteen years later by rigorous probability arguments; they showed that statistical physics methods are extremely powerful tools for treating combinatorial problems in which randomness is assumed.

In the present thesis we overview the main results obtained on the Euclidean matching problem in the last fifty years, and we present the results of our investigation on this subject. The material is organised in two chapters.

Optimisation and physics
In the first chapter we review the two main subjects of this work, optimisation and the physics of disordered systems. On the one hand, the availability in the last decades of powerful computational resources encouraged the development of efficient algorithms to solve difficult optimisation problems on graphs; on the other hand, dating back to the work of Edwards and Anderson [22] on spin glasses, the physics of disordered systems developed a plethora of powerful techniques with an impressive impact in many scientific fields: we will present in particular the contribution of this field to Combinatorial Optimisation.

The Euclidean Matching Problem and our contributions
In the second part we deal with the main subject of the thesis, the Euclidean matching problem. We followed very different approaches, each of them able to provide an insight on the problem and its peculiarities.
In particular:

• we solved the one-dimensional matching problem with convex cost function using Measure Theory arguments, showing that, in the thermodynamical limit, the solution of the problem is equivalent to a Brownian bridge process [12];

• we proposed a scaling ansatz for the optimal cost in any dimension, obtaining a precise analytical prediction for its scaling with respect to the system size and for the finite size corrections to the average optimal cost (this part of the work was developed with Carlo Lucibello and Giorgio Parisi, from University "La Sapienza" in Rome) [14];

• we computed the correlation function in any dimension for the optimal transport map in the thermodynamical limit, using both a generalisation of the previous ansatz and a functional argument that provided a recipe for a very general case [11, 13];

• we developed the computation of Mézard and Parisi for the corrections to the mean field approximation of the Euclidean matching problem; we showed also that there is a formal correspondence between the corrections to the mean field contribution and the finite size corrections in the purely random case, already computed by Parisi and Ratiéville [46]. This part, like the scaling ansatz, was again developed with Carlo Lucibello and Giorgio Parisi.

Chapter 1

Optimisation, disorder and statistical mechanics

Summary of the chapter
In the first part of this chapter we collect some well-known mathematical results on Combinatorial Optimisation, Matchings and Graph Theory. The aim of this part is to provide a brief introduction to the subject. As concerns the section on Graph Theory, we refer to the work of Diestel [19]. The book of Papadimitriou and Steiglitz [43] is a classical introduction to Combinatorial Optimisation in general. We recommend also the monograph of Lovász and Plummer [31] on Matching Theory. In the second part we recall some classical results on Spin Glass Theory. For a pedagogical introduction, we refer to the nice paper of Castellani and Cavagna [15]. A classical book on the subject was published by Mézard, Parisi, and Virasoro [33]. We refer also to the review paper of Binder and Young [8] and to the modern expositions of Nishimori [42] and Fischer and Hertz [25]. We will recall some fundamental results on the application of techniques developed in the Theory of Disordered Systems to optimisation problems, like the Rmmp or the Rbmp: again, this type of investigation was pioneered by Mézard and Parisi [37] in a famous seminal paper. The fruitful application of the celebrated Cavity Method is also discussed. A modern and quite complete exposition of the techniques discussed here can be found in the recent monograph by Mézard and Montanari [34]. The material presented here is a collection and re-elaboration of the main results presented in the above works.

1.1. Graph Theory

In the present thesis we will consider some relevant combinatorial problems defined on graphs. We will therefore give a brief introduction to Graph Theory, without claiming to be complete: we will fix the fundamental definitions and the notation to be used throughout the present work.

Fundamental definitions
A graph G = Graph(V; E) is a pair of sets (V, E) such that E ⊆ V × V. The elements of V are called vertices (or nodes), whilst the elements of E are usually called edges. We will denote by¹ V = |V| the number of elements of V (sometimes called the order of G), and we will always suppose that V ∈ ℕ is finite unless otherwise specified, ℕ being the set of natural numbers.

¹ In the present work, given a set A of N ∈ ℕ elements, we will use the notation |A| = N for the cardinality of the set.
Moreover, given a vertex v and an edge e, we say that v is incident with e if v ∈ e: in this case e is an edge at v. We will call E(v) the set of edges at v and |E(v)| the degree of v. We say that u, v ∈ V are adjacent if (u, v) ∈ E; we will denote by ∂v the set of vertices adjacent to v. Finally, we define the complete graph K_V as the graph with V vertices in which each vertex is adjacent to all the others.

[Figure 1.1.1: Three examples of graphs. (a) An example of separable planar graph. (b) The Sephiroth is an example of a graph with 10 vertices and 22 edges. (c) The Petersen graph is an example of a non-planar graph: its genus is g = 1.]

Two graphs G = Graph(V; E) and G′ = Graph(V′; E′) are isomorphic, or G ∼ G′, if a bijection ϕ: V → V′ exists such that (u, v) ∈ E ⇔ (ϕ(u), ϕ(v)) ∈ E′. Finally, given two graphs G = Graph(V; E) and G′ = Graph(V′; E′), if V ⊆ V′ and E ⊆ E′, then we say that G is a subgraph of G′ and G′ is a supergraph of G: in symbols, G ⊆ G′. We say that G is a spanning subgraph of G′ if and only if V = V′.

A certain graph G = Graph(V; E), V = {v_i}_{i=1,...,V}, E = {e_i}_{i=1,...,E}, can also be represented by a V × E matrix, called the incidence matrix, B := (b_{ij})_{ij}, in such a way that

$$b_{ij} = \begin{cases} 1 & \text{if } v_i \in e_j, \\ 0 & \text{otherwise.} \end{cases} \qquad (1.1.1)$$

Similarly, we introduce the more commonly used V × V adjacency matrix A := (a_{ij})_{ij}, such that

$$a_{ij} = \begin{cases} 1 & \text{if } (v_i, v_j) \in \mathcal{E}, \\ 0 & \text{otherwise.} \end{cases} \qquad (1.1.2)$$

A vertex cover of G is a subset of V such that any edge of G has at least one endpoint in it. The vertex covering number c_V(G) is the smallest possible size of a vertex cover of G. Similarly, an edge cover of G is a subset of E such that any vertex of G is the endpoint of at least one element in it. The edge covering number c_E(G) is the smallest possible size of an edge cover of G. A directed graph (or digraph) is a graph in which we assign an initial vertex and a terminal vertex to each edge in the edge set: in a digraph, an edge in which the initial vertex and the terminal vertex coincide is called a loop, and corresponds to a non-zero element on the diagonal of the adjacency matrix. A weighted graph is a graph endowed with a certain function w: E → ℝ; given an edge e, we say that w(e) is the weight of the edge. For a given weighted graph G, we can introduce the weighted adjacency matrix as W := (w(e_{ij}) a_{ij})_{ij}. Given two graphs G = Graph(V; E) and G′ = Graph(V′; E′) with the same vertex set V, we define

$$G \triangle G' := \mathrm{Graph}(\mathcal{V}_{G\triangle G'};\, \mathcal{E}\triangle\mathcal{E}'), \qquad (1.1.3)$$

where E△E′ is the symmetric difference between the two edge sets and V_{G△G′} is the set of the vertices that are ends of the edges in E△E′.

Paths, cycles, trees and forests
A path P = Graph(V_P; E_P) in a graph G is a particular subgraph P ⊆ G such that V_P = {v_0, ..., v_k} is a set of distinct vertices and the edge set is given by E_P = {(v_0, v_1), (v_1, v_2), ..., (v_{k−1}, v_k)}. We say that the path links v_0 and v_k and has length k. The length of the shortest path joining two vertices u and v of G is called the distance of u and v on G. If any two vertices of a graph G can be connected by a path, we say that G is connected: any graph can be expressed as a union of maximal connected subgraphs (possibly given by single vertices), called components, and a connected graph is a graph with only one component.
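As a concrete illustration (the small graph and all names are ours), the incidence and adjacency matrices (1.1.1)-(1.1.2) can be built directly from an edge list:

```python
import numpy as np

V = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # the edge set E

A = np.zeros((V, V), dtype=int)           # adjacency matrix, eq. (1.1.2)
B = np.zeros((V, len(edges)), dtype=int)  # incidence matrix, eq. (1.1.1)
for j, (u, v) in enumerate(edges):
    A[u, v] = A[v, u] = 1
    B[u, j] = B[v, j] = 1

print(A.sum(axis=0))  # vertex degrees |E(v)|: [3 2 3 2]
```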
Given a connected graph G = Graph(V; E), a subset X ⊂ V ∪ E is said to be a separating set if G′ = Graph(V \ X; E \ X) is not connected. If X contains only a single vertex, the vertex is called a cutvertex; similarly, if X contains one edge, we say that the selected edge is a bridge. For k ≥ 3, a cycle C = Graph(V_C; E_C) in a graph G is a subgraph C ⊆ G such that V_C = {v_0, ..., v_{k−1}} is a set of distinct vertices and E_C = {(v_0, v_1), (v_1, v_2), ..., (v_{k−2}, v_{k−1}), (v_{k−1}, v_0)}; we say that such a cycle has length k. The minimum length of a cycle contained in a certain graph G is called the girth of G, whilst the maximum length is called the circumference of G. Clearly, not all graphs contain cycles: an acyclic graph is called a forest, and a connected forest is called a tree. A non-trivial forest (i.e., a forest with |E| ≠ 0) always has 1-degree vertices, called leaves. Given a tree, sometimes a specific vertex is considered special and called the root of the tree: with reference to the root, we define the height of a vertex as the distance of the vertex itself from the root.

Multipartite graphs
A graph G = Graph(V; E) is called q-partite if we can partition V into q subsets (or classes), V = ∪_{i=1}^q V_i, with V_i ∩ V_j = ∅ for i ≠ j, in such a way that every edge in E connects vertices in different classes: we will denote such a graph by G = Graph(V_1, ..., V_q; E). A q-partite graph is called complete if, given two vertices in two different classes, there exists an edge connecting them; we will denote it by K_{V_1,...,V_q}, V_i := |V_i|. If q = 2 a multipartite graph is called bipartite: bipartite graphs have the characterising property of having no odd cycles.

[Figure 1.1.2: The graph K_{3,3}; observe that c_V(K_{3,3}) = 3 (for complete bipartite graphs, c_V(K_{V,V}) = V).]

Matchings
Given a certain graph G = Graph(V; E), we say that M = Graph(V_M; E_M) ⊆ G is a matching of size |E_M| in G if, given two edges in M, they have no vertex in common. The size of the largest matching in G, m(G), is called the matching number of G; if V_M = V we say that M is perfect. For a bipartite graph G = Graph(V_1, V_2; E) the König minimax theorem holds:

$$c_V(G) = m(G). \qquad (1.1.4)$$

Moreover, Hall proved that G has a matching saturating V_1 if and only if, ∀X ⊆ V_1,

$$\left|\bigcup_{x\in\mathcal{X}} \partial x\right| \geq |\mathcal{X}|. \qquad (1.1.5)$$

In particular, if |V_1| = |V_2| = V the previous inequality guarantees the existence of a perfect matching (Marriage theorem): observe that in this case, numbering separately the vertices of each class in the graph, a perfect matching can be expressed as a permutation of V elements. For a proof of the previous statements, see [31]. Finally, given a path P (or a cycle) in G, we say that the path P is M-alternating if the edges of P are alternately in and not in M.

[Figure 1.1.3: On the left, the complete graph K_8 and an example of perfect matching on it. On the right, the complete bipartite graph K_{4,4} and an example of perfect bipartite matching on it.]
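Condition (1.1.5) can be verified by brute force on small instances; a minimal sketch (the example graph and names are ours):

```python
from itertools import combinations

# adj[x] is the neighbourhood of the vertex x in V1, as a subset of V2.
adj = {0: {0, 1}, 1: {1, 2}, 2: {0, 2}}

def hall_condition(adj):
    V1 = list(adj)
    for r in range(1, len(V1) + 1):
        for X in combinations(V1, r):
            if len(set().union(*(adj[x] for x in X))) < len(X):
                return False  # the subset X violates (1.1.5)
    return True

print(hall_condition(adj))  # True: a perfect matching exists
```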
Planar graphs and the Euler formula
Let us consider a connected graph G = Graph(V; E) with E = |E| edges and V = |V| vertices. It can be proved that there always exists at least one surface S on which the graph can be embedded through a proper map, in such a way that no edge-crossing appears. In particular, let us consider the set S_G of the surfaces on which we can perform such an operation. We say that g is the genus of the graph if

$$g(G) = \min_{S\in\mathcal{S}_G} g(S), \qquad (1.1.6)$$

where g(S) is the genus of the surface S. By definition, G is planar if g(G) = 0. We say that L + 1 is the number of faces of the graph, i.e., the number of regions delimited by edges, without any edge inside, when the graph is drawn on a surface of genus g(G). Euler proved that, for a graph of genus g, the following relation holds:

$$V + L - E = 1 - 2g. \qquad (1.1.7)$$

1.2. Combinatorial Optimisation: generalities

In the previous section we have introduced some fundamental definitions of Graph Theory. To introduce our main subject, let us now give a brief survey of Combinatorial Optimisation from a mathematical point of view. An instance of an optimisation problem is a pair of mathematical objects, i.e., a space of configurations Ξ and a cost function E: Ξ → ℝ. The space Ξ can in general be a set with a finite, countable or uncountable number of elements. The target is to find the (globally) optimal solution (if it exists), i.e. an element ξ* ∈ Ξ such that

$$E[\xi^*] \leq E[\xi] \quad \forall \xi \in \Xi. \qquad (1.2.1)$$

The set of all instances of a certain optimisation problem is, by definition, the optimisation problem itself. Combinatorial Optimisation deals with optimisation problems in which the cardinality of Ξ is finite for all their instances, |Ξ| ∈ ℕ: obviously, in this case the problem always has a solution. However, a direct inspection of the optimal solution requires the evaluation of the cost function over all possible elements of Ξ, and this brute-force approach is usually computationally unfeasible. As a specific example, let us consider the Travelling Salesman Problem (Tsp). In this problem, a (finite) set Σ of N points is given; moreover, to each pair (i, j) of points a weight w_{ij} ∈ ℝ⁺ is associated. In the Tsp, the space Ξ is given by all possible closed paths ξ passing only once through each point, whilst the cost function is given by

$$\xi \mapsto E_{\mathrm{Tsp}}[\xi] := \sum_{i=1}^{N} w_{\xi(i)\xi(i+1)}. \qquad (1.2.2)$$

From a graph theoretical point of view, the Tsp can be expressed as the problem of finding the minimum weight Hamiltonian cycle². Observe that there are ½(N − 1)! possible paths, and therefore the direct inspection of the solution by computing all possible values of the cost function requires at least ½(N − 1)! steps, i.e. a huge number of evaluations as N grows: it is clear that a smarter approach is needed.

Optimisation problems can be classified in a more fundamental way with reference to their complexity class, i.e., with reference to the number of computational operations (running time) needed to solve the problem, as a function of the size of the instance, through a certain algorithm. Algorithms themselves can be classified as polynomial if the running time is bounded from above by a certain polynomial in the size of the input, and superpolynomial if such a bound does not exist. We say that a certain optimisation problem belongs to the class P if there exists a polynomial algorithm that solves it. There is a larger class of problems, called the NP class, containing non-deterministic polynomial problems, i.e., problems that can be solved in polynomial time by a non-deterministic algorithm. Clearly P ⊆ NP, but it is still unknown whether P = NP or P ≠ NP. The Tsp belongs to the NP class, but no polynomial algorithm to solve it is known. In the following we will analyse in some detail a specific combinatorial problem belonging to the P class, i.e., the matching problem.

² A Hamiltonian cycle is a closed path traversing all vertices of a certain graph exactly once.
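For the bipartite matching problem discussed below, the brute-force approach amounts to scanning all N! permutations; a minimal sketch (the random instance and all names are ours), already impractical beyond N ≈ 10:

```python
from itertools import permutations
import numpy as np

rng = np.random.default_rng(0)
N = 7
W = rng.random((N, N))  # a random instance: the weight matrix

# Scan all N! perfect matchings of K_{N,N}, encoded as permutations.
best = min(permutations(range(N)),
           key=lambda p: sum(W[i, p[i]] for i in range(N)))
print(best, sum(W[i, best[i]] for i in range(N)))
```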
As explained below, for our problem there exists a quite efficient polynomial algorithm that provides the solution for a given instance.

1.3. Weighted matching problems

Let us now consider a weighted graph G = Graph(V; E) with weight function w: E → ℝ. In the (monopartite) matching problem we want to find a matching M ⊆ G such that a certain functional E_G[M] is minimal. Usually, the cost functional has the form, in the notation above,

$$E_G[\mathcal{M}] := \frac{1}{|\mathcal{E}_M|} \sum_{e\in\mathcal{E}_M} w(e). \qquad (1.3.1)$$

If w(e) = 1 ∀e ∈ E, the problem is sometimes called the cardinality matching problem. In the relevant case in which G is a bipartite graph, the problem is called the bipartite matching problem or assignment problem. The matching problem, both in its monopartite and in its bipartite version, belongs to the P computational complexity class. As explained in Section 1.3.1, a fast algorithm is available for the solution of the assignment problem on K_{N,N} in polynomial time with respect to the size of the input N. In the following we will consider weighted matching problems on complete graphs only.

1.3.1. The Hungarian algorithm

A lot of different algorithms to treat optimisation problems are available in the literature. The most popular algorithm for the solution of linear optimisation problems is the Simplex Algorithm (see Appendix B); this algorithm is, however, in general superpolynomial, and less efficient than algorithms designed for specific problems, e.g., matching problems. In the present section we will analyse in some detail the celebrated Hungarian algorithm for the solution of the assignment problem on the complete bipartite graph³ K_{N,N}, proposed by Kuhn [29] in 1955: this algorithm has known polynomial computational complexity, proving that the assignment problem belongs to the P class.

Let us consider therefore a weighted complete bipartite graph K_{N,N} = Graph(V, U; E), |V| = |U| = N, V = {v_i}_i, U = {u_i}_i. Denoting by w_{ij} the weight associated with the edge e_{ij} := (v_i, u_j), let us introduce also the N × N weight matrix W = (w_{ij})_{i,j=1,...,N}. A matching on this graph can be represented by a matrix M = (m_{ij})_{i,j=1,...,N} such that

$$m_{ij} = \begin{cases} 1 & \text{if } e_{ij} \text{ is included in the matching,} \\ 0 & \text{otherwise,} \end{cases} \qquad \sum_{i=1}^{N} m_{ij} = \sum_{j=1}^{N} m_{ij} = 1. \qquad (1.3.2)$$

We associate to the matching M the matching cost

$$E_N[M] := \frac{1}{N}\sum_{i,j} w_{ij} m_{ij}. \qquad (1.3.3)$$

The optimal matching cost is therefore given by

$$E_N := \min_{M} \frac{1}{N}\sum_{i,j} w_{ij} m_{ij}. \qquad (1.3.4)$$

Denoting by M* the matrix associated with the optimal matching, we want to find the expression of M* for a given instance W of the problem. To obtain an algorithm for this matrix, observe firstly that our problem is invariant under certain gauge transformations: in particular, given two sets of real values L = {λ_i}_{i=1,...,N} and M = {µ_i}_{i=1,...,N}, we can map w_{ij} ↦ w′_{ij} := w_{ij} − λ_i − µ_j, so that E_N[M] ↦ E_N[M] − e_0, where e_0 = (1/N) Σ_{i=1}^N (λ_i + µ_i): every matching cost is shifted by the same constant, and therefore the optimality of a certain solution is not altered by the gauge transformation. The pair (L, M) defines the gauge transformation; we say that the gauge is proper if w′_{ij} ≥ 0 ∀i, j.

³ Observe that more general polynomial algorithms are available to solve the matching problem on K_{N,M}, N ≠ M.
Examples of proper gauges are the following ones:

$$(L^1, M^1):\ \left\{\lambda_i = \min_j w_{ij},\ \mu_i = 0\right\}_{i=1,\dots,N}, \qquad (1.3.5a)$$
$$(L^2, M^2):\ \left\{\lambda_i = 0,\ \mu_i = \min_j w_{ji}\right\}_{i=1,\dots,N}. \qquad (1.3.5b)$$

Let us suppose that a proper gauge transformation (L, M), mapping W into W′, has been performed on our matrix, and let us define the zeroes graph Z = Graph(V; E_Z) ⊆ K_{N,N} as the subgraph of the complete bipartite graph such that e_{ij} ∈ E_Z ⇔ w′_{ij} = 0: if Z contains a perfect matching on K_{N,N} we say that the proper gauge (L, M) is Hungarian. It is clear that to find M* we have to find a suitable Hungarian gauge: observe, however, that the space of Hungarian gauges is in general quite large. It can be proved that the optimal matching cost is given by

$$E_N = \max_{(L,M)\ \text{proper}} e_0; \qquad (1.3.6)$$

moreover, a proper gauge (L*, M*) that realises the previous equality always exists, it is Hungarian, and the corresponding zeroes graph Z admits a matching. This set of non-trivial properties was proved by Egerváry in 1931. On the basis of the König theorem, eq. (1.1.4), and Egerváry's results, Kuhn elaborated the Hungarian algorithm, so called in honour of the two mathematicians, as a sequence of gauge operations on the original weight matrix. His algorithm can be summarised in the flow chart in figure 1.3.1. The output of the algorithm is a weight matrix W^out = (w^out_{ij})_{ij} with at least N zero entries, in such a way that we can construct a matching matrix M* with m_{ij} = 1 ⇒ w^out_{ij} = 0, corresponding to an optimal matching⁴. The Hungarian algorithm is polynomial and its time complexity is O(N⁴). A modified version by Edmonds and Karp achieves O(N³): it follows that the assignment problem is a "simple" combinatorial problem, but still not trivial at all. Remarkably, the algorithm presented above is deeply connected with the more general theory of the Cavity Method, which will be presented in Section 1.5.

[Figure 1.3.1: Flow chart of the Hungarian algorithm; at the first step the gauge transformations (1.3.5) are applied. At each iteration the zero entries of the current matrix W′ are covered with the minimum number of lines drawn through rows and columns: if N lines are needed, the algorithm stops and outputs W^out; otherwise the minimum uncovered entry c is selected and a further gauge transformation, with λ_i = c on the uncovered rows (λ_i = 0 otherwise) and µ_i = −c on the covered columns (µ_i = 0 otherwise), is applied before repeating the covering step.]

⁴ Observe that different matching solutions could be possible.
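In practice the assignment problem can also be solved with an off-the-shelf polynomial-time routine; a minimal sketch (the random instance and names are ours) using SciPy's linear_sum_assignment, which implements a different polynomial algorithm than the flow chart above:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(1)
N = 500
W = rng.random((N, N))  # a random instance of the weight matrix

rows, cols = linear_sum_assignment(W)  # optimal matching M*
print(W[rows, cols].sum() / N)         # optimal cost E_N, eq. (1.3.4)
```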
1.3.2. Random matching problems

Given a certain instance of the matching problem, expressed by the weighted adjacency matrix W of the complete graph under analysis, we can find a solution of the problem using well-known algorithms, both in the monopartite case and in the bipartite case (e.g., the Hungarian algorithm presented above for the bipartite matching). Here, however, we are interested in the case in which the weights {w(e)}_{e∈E} on the complete graph are random variables with a certain distribution. In the simplest case all weights are independently and identically distributed random variables with a distribution density ρ(w). The distribution defines an ensemble of random instances, and we ask for the typical properties (average optimal cost, optimal cost distribution, ...) for a given ensemble. In this case we say that we are considering a random matching problem; in particular, in the random monopartite matching problem (Rmmp) the matching problem is defined on a complete graph K_{2N}, N ∈ ℕ, whilst in the random bipartite matching problem (Rbmp) the problem is formulated on the complete bipartite graph K_{N,N}, N ∈ ℕ. In the latter case, the matching can be expressed as a permutation of N elements π ∈ S_N, S_N being the set of all permutations of N elements: indeed, numbering the vertices of each class in K_{N,N} = Graph(V, U; E) as V = {v_1, ..., v_N} and U = {u_1, ..., u_N}, the optimal cost can be expressed as

$$E_N = \min_{\pi\in S_N} \frac{1}{N}\sum_{i=1}^{N} w_{i\,\pi(i)}, \qquad w_{ij} = w((v_i, u_j)). \qquad (1.3.7)$$

Both the Rmmp and the Rbmp have been investigated using purely combinatorial and probabilistic arguments, but also through statistical physics approaches that proved themselves particularly powerful. In this section we will summarise some mathematical results: remarkably, many of them were obtained first using non-rigorous statistical physics arguments, and were proved rigorously only subsequently. We will consider the Rmmp in Section 1.4, where a replica-based approach is used. Here we discuss, from a probabilistic point of view, some interesting results on the Rbmp.

The Rbmp: Aldous' solution and the Parisi formula
In 2001 Aldous [2] provided a rigorous treatment of the Rbmp on the complete bipartite graph K_{N,N} in the N → ∞ limit, adopting the weight distribution density

$$\rho(w) = e^{-w}. \qquad (1.3.8)$$

Due to the importance of his solution, we will give a brief sketch of his results. Aldous introduced an unfolding map Φ⁻¹ to construct, from the weighted complete graph K_{N,N} = Graph(V, U; E), V = {v_1, ..., v_N} and U = {u_1, ..., u_N}, an infinite ordered tree T = Graph(V_T; E_T), in the following way. Let us select a vertex of K_{N,N} at random and let us call it φ: for the sake of simplicity, let us suppose that φ ≡ v_N. We label by 1, ..., N the vertices at the other ends of the edges starting from φ, ordering them in such a way that w_{φi} < w_{φj} ∀i, j such that i < j. Given a vertex i connected to φ, we denote by ik the vertex connected to i (but different from φ) such that w_{ik\,i} is the k-th weight, in increasing order, in the set of the weights w_{li}, l = 1, ..., N − 1. We proceed in this way, constructing an infinite tree as in figure 1.3.2: each vertex of the tree has N − 1 "children", and each child is associated with one and only one vertex in the original K_{N,N} graph. Taking the limit N → ∞, an element of the infinite tree is denoted by φ ∈ V_T, or by ω = ω_1ω_2⋯ω_d ∈ V_T at level d, where ω_i ∈ ℕ. Observe that the inverse Φ of the unfolding map is a function Φ: V_T → V ∪ U: denoting by e = (ω, ω′) an edge of the tree, there is only one corresponding edge (Φ(ω), Φ(ω′)) ∈ E.

[Figure 1.3.2: Structure of the unfolding map for a K_{4,4} graph. In bold, an example of matching. The directed edge ~e = (33, 3) is represented with its children ~e_1 = (3, φ), ~e_2 = (3, 31), ~e_3 = (3, 32).]

As anticipated, Aldous moreover supposed that the weights {w(e)} are independently and identically distributed random variables with probability density (1.3.8). For this reason the constructed tree is called the Poisson-weighted infinite tree. An optimal matching on this tree is a matching such that

$$E_T = \min_{\mathcal{M}} \sum_{e\in\mathcal{M}} w(e), \qquad (1.3.9)$$

where the minimum is taken over all possible matchings M ⊆ T on the tree. Observe that the previous quantity is formally divergent in the N → ∞ limit.
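Numerically, one generation of the Poisson-weighted infinite tree is easy to sample: in the N → ∞ limit the ordered weights out of any vertex become the points of a rate-1 Poisson process on (0, +∞), i.e. cumulative sums of independent Exp(1) variables. A minimal sketch (names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)

def children_weights(k):
    """Weights of the first k children of a vertex of the tree:
    points of a rate-1 Poisson process on (0, +oo)."""
    return np.cumsum(rng.exponential(1.0, size=k))

print(children_weights(5))  # increasing weights, mean spacing 1
```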
Denoting by ~e = (ω, ω′) a directed edge with tail in ω and head in ω′, belonging to the set of directed edges on the tree, we say that ~e_1, ..., ~e_{N−1} are children of ~e if they are directed edges with tail in ω′. Let us consider a certain edge ~e = (ω, ω′) on the Poisson-weighted infinite tree, and let us introduce the tree T_{~e}, obtained considering the descendants of ~e only, and the forest T^{nc}_{~e}, obtained considering all the descendants of ~e with the exclusion of its children. Then, with reference to eq. (1.3.9), we introduce

$$X(\vec e) := E_{T_{\vec e}} - E_{T^{nc}_{\vec e}}. \qquad (1.3.10)$$

X(~e) is therefore the difference between the optimal matching costs with and without the children of ~e. We can write E_{T_{~e}} in the previous expression as

$$E_{T_{\vec e}} = \min_{\vec e_i\ \text{child}} \Big[ w(\vec e_i) + E_{T^{nc}_{\vec e_i}} + \sum_{j\neq i} E_{T_{\vec e_j}} \Big]. \qquad (1.3.11)$$

It follows that

$$X(\vec e) = \min_{\vec e_i\ \text{child}} \Big[ w(\vec e_i) + E_{T^{nc}_{\vec e_i}} + \sum_{j\neq i} E_{T_{\vec e_j}} \Big] - \sum_{j} E_{T_{\vec e_j}} = \min_{\vec e_i\ \text{child}} \big[ w(\vec e_i) - X(\vec e_i) \big]. \qquad (1.3.12)$$

Denoting by ~e* the child realising the minimum, ~e* is an edge in the optimal matching and

$$w(\vec e^{\,*}) = X(\vec e) + X(\vec e^{\,*}). \qquad (1.3.13)$$

From eq. (1.3.12), and assuming a distribution of the form (1.3.8) for the weights w, it follows that X(~e) has a logistic probability distribution density:

$$p_X(x) = \frac{1}{\left(e^{\frac{x}{2}} + e^{-\frac{x}{2}}\right)^2}. \qquad (1.3.14)$$

From this result and from eq. (1.3.13), Aldous proved a formula obtained by Mézard and Parisi [36] in 1985 using the replica method, i.e. that in the N → ∞ limit the average total optimal cost converges,

$$\overline{N E_N} \xrightarrow{N\to\infty} \frac{\pi^2}{6}, \qquad (1.3.15)$$

where we denote by •̄ the average over all instances. Aldous' approach also gives the distribution density of the weights of the edges included in the optimal matching, already found by Mézard and Parisi:

$$\rho_E(x) = \frac{x\,e^x}{(e^x - 1)^2} - \frac{1}{e^x - 1}. \qquad (1.3.16)$$

Finally, Aldous showed that e ∈ E_T is an edge of the optimal matching if and only if

$$w(e) < X(\vec e) + X(\overleftarrow{e}), \qquad (1.3.17)$$

where ~e and ←e denote the two orientations of e. Aldous' results are valid in the N → ∞ limit, but no information about finite size effects can be obtained from the arguments above. In 1998 Parisi [44] conjectured that, for the Rbmp on K_{N,N}, the expected total cost of the optimal matching in the hypothesis of exponential distribution (1.3.8) is given by

$$\overline{N E_N} = \sum_{k=1}^{N} \frac{1}{k^2}. \qquad (1.3.18)$$

This conjecture was generalised by Coppersmith and Sorkin one year later, and rigorously proved a few years afterwards (we refer to the monograph of Mézard and Montanari [34] for a complete proof).
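Formula (1.3.18) is easy to check numerically; a minimal sketch (sample sizes are ours), averaging the total optimal cost over random instances with exponential weights:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(3)
N, samples = 10, 2000

costs = []
for _ in range(samples):
    W = rng.exponential(1.0, size=(N, N))  # weights with density (1.3.8)
    r, c = linear_sum_assignment(W)
    costs.append(W[r, c].sum())            # total optimal cost

print(np.mean(costs))                          # Monte Carlo estimate
print(sum(1 / k**2 for k in range(1, N + 1)))  # Parisi formula: 1.5498...
```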
1.4. Optimisation and statistical mechanics: replicas for optimisation

In the present work we deal with combinatorial problems in which disorder plays an important role, and therefore we are not interested in the properties of a specific instance of our problem, but in its average properties, possibly depending on the parameters of the problem itself. In Statistical Physics, the study of the average properties of systems with a huge number of degrees of freedom in which disorder appears has a long-standing tradition. In these systems disorder is usually introduced in the Hamiltonian function of the system, H = H(σ, J), as a set of parameters {J} randomly extracted from a certain distribution: these parameters usually couple with the degrees of freedom {σ} of the system itself. Different techniques were developed to treat these systems properly, depending on whether the disordered parameters are constant on the time scale over which the degrees of freedom of the system fluctuate (quenched disorder) or not (annealed disorder). In a famous set of seminal papers, Mézard and Parisi [37] discussed the application of the methods of spin glass theory to some optimisation problems: in particular, they studied the Tsp and the Rmmp. To understand their results, some concepts from the Theory of Disordered Systems are necessary. We will present here a brief survey of Spin Glass Theory and, subsequently, we will reproduce their argument for the Rmmp, neglecting the first order finite size corrections obtained by Parisi and Ratiéville [46]: remarkably, these results on random matching problems, published in 1985, were rigorously proved by Aldous for the Rbmp only in 2001, as explained in Section 1.3.2.

1.4.1. Spin glasses

In the context of the Theory of Disordered Systems, spin glasses played the role of a reference framework in which physicists analysed the peculiar effects of disorder on the behaviour of systems with a large number of degrees of freedom. To summarise the important results obtained in the last forty years, let us briefly introduce the Ising model and the corresponding mean field disordered models. After that, we will discuss the theory of spin glasses, with particular attention to the SK-model and its solution.

Ising model
Let us consider a graph G = Graph(V; E) and suppose that we assign to each node i ∈ V of G an Ising spin variable σ_i ∈ {−1, 1}, and to each edge e ≡ (i, j) ∈ E an interaction energy −Jσ_iσ_j, J ∈ ℝ, depending on the value of the Ising spins on the end points of the edge. We introduce also an energy functional for the graph in the following form:

$$H_G[\sigma; J, h] := -J\sum_{(ij)\in\mathcal{E}} \sigma_i\sigma_j - h\sum_{i\in\mathcal{V}} \sigma_i. \qquad (1.4.1)$$

A model defined as above is an Ising model on the graph G: if G is a complete graph, the model is sometimes called the Curie–Weiss model (in the literature, Ising-type models on complete graphs are also called infinite range models). In the problem of magnetism, the Ising spins represent microscopic magnetic moments on a certain lattice, coupled in a ferromagnetic (J > 0) or antiferromagnetic (J < 0) way, and in the presence of a certain external uniform magnetic field h. In eq. (1.4.1) we denoted by σ := {σ_i}_{i∈V} the generic configuration of the system. To each configuration a Gibbs–Boltzmann weight is associated,

$$\mu_G[\sigma;\beta,J,h] := \frac{e^{-\beta H_G[\sigma;J,h]}}{Z_G(\beta,J,h)}, \qquad Z_G(\beta,J,h) := \sum_{\sigma} e^{-\beta H_G[\sigma;J,h]}, \qquad (1.4.2)$$

in such a way that Σ_σ µ_G[σ; β, J, h] = 1. The parameter β⁻¹ > 0 is called, as is known, the temperature of the system, whilst Z_G(β, J, h) is its partition function. In the following, given a function f := f(σ), we will denote

$$\langle f\rangle := \sum_{\sigma} f(\sigma)\,\mu_G[\sigma;\beta,J,h]. \qquad (1.4.3)$$

If N := |V| ∈ ℕ, Z_G(β, J, h) is an analytic function of its arguments β, J and h; however, in the limit N → ∞, some singularities may appear at some points (critical points) or surfaces (critical surfaces) in the (β, J, h) space. The existence and location of these critical points or surfaces is of paramount interest, being the fingerprint of phase transitions, i.e. they identify the interfaces between different macroscopic behaviours of the system. This change of macroscopic behaviour is in general characterised by the sharp change of the value of a certain parameter, called the order parameter. Unfortunately, there are no general criteria to identify the proper order parameter for a certain system.

To be more precise in our exposition, let us consider a hypercubic lattice L_d in d dimensions of side L, in which there are L^d ≡ N vertices, each of them connected to its 2d nearest neighbours.
On this lattice, let us consider an Ising model (fig. 1.4.1a) and let us denote by

$$H_N[\sigma; J, h] := -J\sum_{\langle ij\rangle} \sigma_i\sigma_j - h\sum_{i=1}^{N} \sigma_i \qquad \text{and} \qquad Z_N(\beta,J,h) := \sum_{\sigma} e^{-\beta H_N[\sigma;J,h]} \qquad (1.4.4)$$

its Hamiltonian functional and its partition function respectively. We denote by ⟨ij⟩ pairs of nearest neighbour spins, and we assume also J > 0 (ferromagnetic model). The Gibbs–Boltzmann measure is therefore

$$\mu_N[\sigma;\beta,J,h] := \frac{e^{-\beta H_N[\sigma;J,h]}}{Z_N(\beta,J,h)}. \qquad (1.4.5)$$

In the large N limit (i.e., for an infinite lattice) a useful order parameter is the so-called magnetisation:

$$m(\beta,J,h) := \lim_{N\to\infty} \frac{1}{N}\sum_i \langle\sigma_i\rangle \equiv -\lim_{N\to\infty} \frac{1}{N}\frac{\partial F_N(\beta,J,h)}{\partial h} \in [-1,1], \qquad (1.4.6)$$

where F_N := −β⁻¹ ln Z_N is the free energy.

[Figure 1.4.1: Pictorial representation of two-dimensional lattice models, (a) Ising model and (b) EA-model, and of a mean field model with Ising variables, (c) SK-model. Different edge colours represent different values of the coupling constants.]

Magnetisation is an order parameter in a strict sense: in fact, |m(β, J, h)| ≈ 1 is indicative of the fact that, on average, almost all spins have the same value, whilst |m(β, J, h)| ≈ 0 suggests that the system is disordered. Suppose now J > 0 fixed. Remarkably, for d > 1 the function m(β, J, h) has a branch cut in the (β, h) plane for h = 0 and β > β_c(d, J), where β_c(d, J) is a certain critical value depending on the dimensionality: formally, for d = 1

$$\beta_c(1,J) = +\infty, \qquad (1.4.7)$$

whilst for d = 2 a famous duality argument by Kramers and Wannier [28] shows that

$$\beta_c(2,J) = \frac{\ln\left(1+\sqrt{2}\right)}{2J}. \qquad (1.4.8)$$

For d > 2 only numerical estimates of β_c are available. The branch cut of m(β, J, h) is a signal of the non-analyticity of Z_N(β, J, h) in the N → ∞ limit; moreover, for β > β_c(d, J), we have that

$$\lim_{h\to 0^+} m(\beta,J,h) = -\lim_{h\to 0^-} m(\beta,J,h) \neq 0, \qquad (1.4.9)$$

and therefore

$$\lim_{h\to 0^+} \mu_N[\sigma;\beta,J,h] \neq \lim_{h\to 0^-} \mu_N[\sigma;\beta,J,h]. \qquad (1.4.10)$$

In other words, on the critical line two different equilibrium measures coexist, and we have a spontaneous symmetry breaking of the ℤ₂ invariance {σ_i} ↦ {−σ_i} of the Hamiltonian functional for h = 0. We say that for β > β_c(d, J) and h = 0 we have two different phases, each of them identified by the value of the order parameter m. The intuitive concept of pure phase can be more rigorously formulated in the context of Statistical Field Theory [57]. In this context, we simply state that an equilibrium measure µ describes a pure phase if and only if, given two local observables A(x) and B(x) (where x is a graph site),

$$\lim_{\delta(x,y)\to+\infty} \left[\langle A(x)B(y)\rangle - \langle A(x)\rangle\langle B(y)\rangle\right] = 0, \qquad (1.4.11)$$

where δ(x, y) is the distance between the node x and the node y in the graph. The previous condition is called the cluster property.

The Ising model is easily solvable on a one-dimensional lattice; in 1944 Lars Onsager announced his celebrated solution of the Ising model with h = 0 in d = 2. Unfortunately, no exact solutions are available for d > 2, even in the absence of an external magnetic field. For this reason a proper approximation can be useful to understand, at least qualitatively, the properties of our system. The most used approximation is the mean field approach: we assume that σ_i = m + δσ_i, where m ≡ m(β, J, h), and we neglect fluctuations in the non-linear terms of the Hamiltonian. For the Ising model on a d-dimensional lattice,

$$H_N[\sigma;J,h] = -J\sum_{\langle ij\rangle}(m+\delta\sigma_i)(m+\delta\sigma_j) - h\sum_{i=1}^{N}\sigma_i \approx dJNm^2 - (2dJm + h)\sum_{i=1}^{N}\sigma_i. \qquad (1.4.12)$$

In this approximation one easily obtains

$$Z_N(\beta,J,h) = e^{-\beta dNJm^2}\left[2\cosh\beta(2dJm + h)\right]^N, \qquad m = \tanh\beta(2dJm + h). \qquad (1.4.13)$$

From the last equation, we see that a non-vanishing magnetisation can be obtained for h = 0 only if 2dβJ > 1, i.e. for β > 1/(2dJ). The critical value obtained,

$$\beta_c^{\mathrm{mf}} = \frac{1}{2dJ}, \qquad (1.4.14)$$

is incorrect: however, the mean field theory qualitatively reproduces the critical behaviour of the original model, and correctly predicts the existence of a critical temperature. The mean field free energy for the Ising model has the form

$$F_N(\beta,J,h) = -\frac{1}{\beta}\ln Z_N(\beta,J,h) \approx -\frac{N}{\beta}\ln 2 + \frac{N}{2\beta_c^{\mathrm{mf}}}\left(1 - \frac{\beta}{\beta_c^{\mathrm{mf}}}\right)m^2 + \frac{N\beta^3}{12\left(\beta_c^{\mathrm{mf}}\right)^4}\,m^4 + o(m^4), \qquad (1.4.15)$$

in which we performed a power expansion in m up to the fourth order (for h = 0). Note that the functional above has a global minimum at m = 0 for β < β_c^mf, which becomes a local maximum for β > β_c^mf; on the other hand, two new minima appear for β > β_c^mf. This simple observation, proposed for the first time by Landau, gives a qualitative explanation of the spontaneous symmetry breaking discussed above.
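The self-consistency equation in (1.4.13) is easily solved numerically; a minimal sketch (parameters are ours), assuming plain fixed-point iteration:

```python
import numpy as np

def magnetisation(beta, d=2, J=1.0, h=0.0, m0=0.5, iters=10_000):
    """Fixed-point iteration of m = tanh(beta (2 d J m + h))."""
    m = m0
    for _ in range(iters):
        m = np.tanh(beta * (2 * d * J * m + h))
    return m

for beta in (0.20, 0.24, 0.26, 0.30):  # beta_c^mf = 1/(2 d J) = 0.25
    print(beta, magnetisation(beta))
```

Below β_c^mf the iteration collapses onto m = 0; above it, onto one of the two non-trivial minima of (1.4.15).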
Edwards–Anderson model and Sherrington–Kirkpatrick model
Let us consider an Ising-type model on a graph G = Graph(V; E), as introduced above. We consider the following Hamiltonian functional:

$$H_G[\sigma;\{J\},h] := -\sum_{(i,j)\in\mathcal{E}} J_{ij}\sigma_i\sigma_j - h\sum_i \sigma_i. \qquad (1.4.16)$$

This time, each J_ij is supposed to be extracted, independently from all the others, from a certain given probability distribution density ρ(J_ij), identical for all e = (i, j) ∈ E. The set of values is extracted once and for all (for this reason we say that the disorder is quenched), and usually both positive and negative values of J_ij are admitted. If we consider as graph the hypercubic lattice L_d, the model obtained is called the Edwards–Anderson model (EA-model), and it is an example of a disordered system. A disordered system is a system in which both frustration⁵ and randomness appear. Clearly we are not interested in the properties of a precise realisation of our system, but we have to average, in some sense, over the possible values of {J}. How to perform this average, however, is a delicate matter: in fact, it turns out that the free energy F_G for a disordered system on a generic graph G is self averaging, i.e.,

$$\lim_{|\mathcal{V}|\to\infty} \frac{\overline{(F - \bar F)^2}}{\bar F^2} = 0, \qquad (1.4.17)$$

where we denote by •̄ the average over the disorder, but the partition function is not. It follows that we have to compute in general the following configurational average:

$$\bar F_G[\beta] := -\frac{1}{\beta}\int \prod_{(i,j)\in\mathcal{E}} \rho(J_{ij})\,\mathrm{d}J_{ij}\, \ln Z_G[\{J\};\beta]. \qquad (1.4.18)$$

Replica approach
To perform the integral (1.4.18), a famous and powerful trick is usually adopted, called the replica trick. The manipulation is based on the following identity:

$$\ln x = \lim_{n\to 0} \frac{x^n - 1}{n}, \qquad x > 0. \qquad (1.4.19)$$

To proceed with our computation, let us assume that G ≡ K_N is a complete graph on which the Hamiltonian functional (1.4.16) is defined: the model obtained (an infinite range version of the EA-model) is called the Sherrington–Kirkpatrick model (SK-model).

⁵ A lattice model is frustrated whenever there exists a set of local constraints in conflict with each other. Suppose that we have a cycle of length four in the EA-model in which the product of the coupling constants J along the cycle is negative; then no configuration of Ising spins exists capable of minimising simultaneously the energy contribution of all edges separately: there is no way to satisfy locally all the interactions in such a plaquette.
In this model the probability distribution density

$$\rho(J_{ij}) = \sqrt{\frac{N}{2\pi J^2}}\, e^{-\frac{N J_{ij}^2}{2J^2}} \qquad (1.4.20)$$

is assumed⁶. We want to compute the average free energy density

$$-\beta\bar f := \lim_{\substack{n\to 0\\ N\to\infty}} \frac{\overline{Z^n} - 1}{nN}. \qquad (1.4.21)$$

It follows that

$$\overline{Z^n} = \int \prod_{(ij)\in\mathcal{E}} \rho(J_{ij})\,\mathrm{d}J_{ij} \sum_{\{\sigma^\alpha\}} \exp\left(\beta\sum_{i<j} J_{ij}\sum_{\alpha=1}^{n}\sigma_i^\alpha\sigma_j^\alpha + \beta h\sum_{i=1}^{N}\sum_{\alpha=1}^{n}\sigma_i^\alpha\right)$$
$$= e^{\frac{N\beta^2J^2 n}{4}} \sum_{\{\sigma^\alpha\}} \exp\left(\frac{\beta^2J^2}{2N}\sum_{\alpha<\beta}\Big(\sum_i \sigma_i^\alpha\sigma_i^\beta\Big)^2 + \beta h\sum_{i,\alpha}\sigma_i^\alpha\right)$$
$$\propto e^{\frac{N\beta^2J^2 n}{4}} \int \prod_{\alpha<\beta}\mathrm{d}q_{\alpha\beta}\, \exp\left(-\frac{N\beta^2J^2}{2}\sum_{\alpha<\beta} q_{\alpha\beta}^2 + N\ln z[q]\right), \qquad (1.4.22)$$

where we have introduced

$$z[q] := \sum_{\sigma^1,\dots,\sigma^n} \exp\left(\beta^2J^2\sum_{\alpha<\beta} q_{\alpha\beta}\sigma^\alpha\sigma^\beta + \beta h\sum_\alpha \sigma^\alpha\right). \qquad (1.4.23)$$

The exponent of the integrand being proportional to N, we can use the steepest descent method to evaluate the expression. In particular, the extremality condition with respect to the generic element q_αβ of the matrix q := (q_αβ)_αβ gives

$$q_{\alpha\beta} = \frac{1}{\beta^2J^2}\frac{\partial\ln z[q]}{\partial q_{\alpha\beta}} \equiv \langle\sigma^\alpha\sigma^\beta\rangle_z, \qquad (1.4.24)$$

where we denoted

$$\langle O(\sigma^1,\dots,\sigma^n)\rangle_z := \frac{1}{z}\sum_{\sigma^1,\dots,\sigma^n} O(\sigma^1,\dots,\sigma^n)\, \exp\left(\beta^2J^2\sum_{\alpha<\beta} q_{\alpha\beta}\sigma^\alpha\sigma^\beta + \beta h\sum_\alpha \sigma^\alpha\right). \qquad (1.4.25)$$

Let Q := (Q_αβ)_αβ be the solution of eq. (1.4.24). Calling

$$S[Q] := -\frac{\beta^2J^2}{4n}\sum_{\alpha\neq\beta} Q_{\alpha\beta}^2 + \frac{1}{4}\beta^2J^2 + \frac{1}{n}\ln z[Q], \qquad H_{(\alpha\beta)(\gamma\delta)} := \left.\frac{\partial^2 S[q]}{\partial q_{\alpha\beta}\,\partial q_{\gamma\delta}}\right|_{q=Q}, \qquad (1.4.26)$$

it follows that

$$\overline{Z^n} \sim \frac{1}{\sqrt{\det H}}\left[1 + nN\left(-\frac{\beta^2J^2}{4n}\sum_{\alpha\neq\beta} Q_{\alpha\beta}^2 + \frac{1}{4}\beta^2J^2 + \frac{1}{n}\ln z[Q]\right)\right]. \qquad (1.4.27)$$

Notably, the Q_αβ are not merely a set of variables: they express a sort of "average superposition" between replicas, being (as can be directly checked)

$$Q_{\alpha\beta} = \overline{\left\langle\sigma_i^\alpha\sigma_i^\beta\right\rangle_R}, \qquad (1.4.28)$$

where ⟨•⟩_R is the average with respect to the replicated system, whose Hamiltonian function is

$$H^R := -\sum_{\alpha=1}^{n}\sum_{i<j} J_{ij}\sigma_i^\alpha\sigma_j^\alpha - h\sum_{\alpha=1}^{n}\sum_{i=1}^{N}\sigma_i^\alpha. \qquad (1.4.29)$$

The variables Q_αβ play the role of spin glass order parameters, as we will see. If we suppose that all replicas are equivalent (replica symmetric hypothesis),

$$Q_{\alpha\beta} = \overline{\left\langle\sigma_i^\alpha\sigma_i^\beta\right\rangle_R} = \overline{\left\langle\sigma_i^\alpha\right\rangle_R\left\langle\sigma_i^\beta\right\rangle_R} = \overline{\left\langle\sigma_i\right\rangle^2} = q =: q_{EA}. \qquad (1.4.30)$$

The quantity q_EA is called the Edwards–Anderson order parameter: we expect that for β → 0 the spins are randomly oriented, and therefore q_EA = 0, whilst in the β → +∞ limit q_EA > 0, having ⟨σ_i⟩ ≠ 0 for each realisation.

Stability of the replica symmetric solution
A first question that arises about the solution obtained is whether it is stable with respect to fluctuations around the solution Q_αβ = q. From a mathematical point of view, the solution is stable if the Hessian matrix H is positive definite at Q_αβ = q. We have to solve the eigenvalue problem

$$H\eta = \lambda\eta. \qquad (1.4.31)$$

We will not reproduce here all the steps of this computation, but we will only present the results, obtained for the first time by de Almeida and Thouless [3]. A direct computation shows that

$$H_{(\alpha\beta)(\gamma\delta)} = \delta_{\alpha\gamma}\delta_{\beta\delta} - \beta^2J^2\left(\langle\sigma^\alpha\sigma^\beta\sigma^\gamma\sigma^\delta\rangle_z - \langle\sigma^\alpha\sigma^\beta\rangle_z\langle\sigma^\gamma\sigma^\delta\rangle_z\right). \qquad (1.4.32)$$

Therefore, the structure of the Hessian matrix is the following:

$$H_{(\alpha\beta)(\alpha\beta)} = 1 - \beta^2J^2\left(1 - \langle\sigma^\alpha\sigma^\beta\rangle_z^2\right) \equiv a, \qquad (1.4.33a)$$
$$H_{(\alpha\beta)(\alpha\gamma)} = -\beta^2J^2\left(\langle\sigma^\beta\sigma^\gamma\rangle_z - \langle\sigma^\alpha\sigma^\beta\rangle_z\langle\sigma^\alpha\sigma^\gamma\rangle_z\right) \equiv b, \qquad (1.4.33b)$$
$$H_{(\alpha\beta)(\gamma\delta)} = -\beta^2J^2\left(\langle\sigma^\alpha\sigma^\beta\sigma^\gamma\sigma^\delta\rangle_z - \langle\sigma^\alpha\sigma^\beta\rangle_z\langle\sigma^\gamma\sigma^\delta\rangle_z\right) \equiv c, \qquad (1.4.33c)$$

i.e., the Hessian matrix has a particular symmetric form. In the paramagnetic phase, being q = 0, a direct computation shows that b = c = 0 and a = 1, so the solution is stable.

⁶ The normalisation constant is due to the fact that we want the energy to be an extensive quantity.
In the ferromagnetic phase, de Almeida and Thouless [3] found, for the eigenvector symmetric under the exchange of replica indices (i.e. of the form η^αβ = η ∀α, β), the eigenvalue

$$\lambda_1 = a + 2(n-2)\,b + \frac{(n-2)(n-3)}{2}\,c. \qquad (1.4.34)$$

The eigenvalue λ₂ = a + (n − 4)b − (n − 3)c, obtained for the n − 1 eigenvectors of the form η^αβ ≡ η₀(δ_αθ + δ_βθ) + η₁ for a certain θ, is such that λ₂ → λ₁ in the n → 0 limit, where both tend to a − 4b + 3c. The third eigenvalue, corresponding to the n(n−3)/2 eigenvectors of the form η^θν ≡ η₁, η^θα = η^να ≡ η₂ and η^αβ ≡ η₃ ∀α, β, for fixed ν, θ, is

$$\lambda_3 = a - 2b + c. \qquad (1.4.35)$$

It can be seen that, for Q_αβ ≡ q, λ₁ is always positive. However,

$$\lambda_3 = a - 2b + c > 0 \quad\Rightarrow\quad \frac{1}{\beta^2J^2} > \frac{1}{\sqrt{2\pi}}\int e^{-\frac{w^2}{2}}\operatorname{sech}^4\left(\beta J\sqrt{q}\,w + \beta h\right)\mathrm{d}w. \qquad (1.4.36)$$

The previous inequality defines a region in the (β, h) space in which the replica symmetric solution is stable and, as a consequence, a region in which this solution is unstable. The boundary between these two regions is called the Almeida–Thouless line (AT-line). The eigenvector η₃ corresponding to the eigenvalue λ₃ is called the replicon.

[Figure 1.4.2: Structure of the Hessian matrix H for n = 4.]

Returning to eq. (1.4.27), a careful computation gives

$$\frac{1}{\sqrt{\det H}} = e^{-\frac{1}{2}\ln\lambda_1 - \frac{n-1}{2}\ln\lambda_2 - \frac{n(n-3)}{4}\ln\lambda_3}$$
$$\approx 1 + n\left[-\frac{1}{2\lambda_1}\frac{\mathrm{d}\lambda_1}{\mathrm{d}n} + \frac{1}{2\lambda_2}\frac{\mathrm{d}\lambda_2}{\mathrm{d}n} - \frac{\ln\lambda_2}{2} + \frac{3}{4}\ln\lambda_3\right]_{n=0} + o(n)$$
$$= 1 - \frac{n}{2}\left[\frac{b - \frac{3}{2}c}{\lambda_1} + \ln\lambda_1 - \frac{3}{2}\ln\lambda_3\right]_{n=0} + o(n). \qquad (1.4.37)$$

In the n → 0 limit, at first order we obtain simply 1/√(det H) → 1. If all replicas are equivalent, i.e., Q_αβ ≡ q,

$$-\beta\bar f = \frac{\beta^2J^2}{4}(1-q)^2 + \frac{1}{\sqrt{2\pi}}\int e^{-\frac{w^2}{2}}\ln\left[2\cosh\left(\beta J\sqrt{q}\,w + \beta h\right)\right]\mathrm{d}w, \qquad (1.4.38)$$

from which we get

$$m = \frac{1}{\sqrt{2\pi}}\int e^{-\frac{w^2}{2}}\tanh\left(\beta J\sqrt{q}\,w + \beta h\right)\mathrm{d}w, \qquad (1.4.39)$$

whilst the value of q to be inserted in the previous equation can be obtained from the extremisation condition on eq. (1.4.38):

$$q = \frac{1}{\sqrt{2\pi}}\int e^{-\frac{w^2}{2}}\tanh^2\left(\beta J\sqrt{q}\,w + \beta h\right)\mathrm{d}w. \qquad (1.4.40)$$

The replica approach for the Sherrington–Kirkpatrick model seems to give a complete solution of the model, and some information can be extracted from a proper Landau theory on eq. (1.4.38). However, our solution presents an anomalous behaviour in the β → +∞ limit. For example, a direct computation of the entropy density s = β² ∂f̄/∂β gives s → −1/(2π) for β → +∞. The origin of these pathological behaviours (instability, negative entropy) was identified and fixed by Parisi [45].
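The replica symmetric equations (1.4.39)-(1.4.40) are easily solved numerically; a hedged sketch (all names and parameters are ours), assuming Gauss–Hermite quadrature for the Gaussian averages and plain iteration for q:

```python
import numpy as np

# Gaussian averages by Gauss-Hermite quadrature.
x, wts = np.polynomial.hermite.hermgauss(101)
gauss = lambda f: np.dot(wts, f(np.sqrt(2.0) * x)) / np.sqrt(np.pi)

def rs_solution(beta, J=1.0, h=0.0, q0=0.5, iters=2000):
    q = q0
    for _ in range(iters):  # iterate eq. (1.4.40)
        q = gauss(lambda w: np.tanh(beta * (J * np.sqrt(q) * w + h)) ** 2)
    m = gauss(lambda w: np.tanh(beta * (J * np.sqrt(q) * w + h)))  # (1.4.39)
    return m, q

print(rs_solution(beta=1.5))  # q > 0 in the low temperature phase
```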
Replica symmetry breaking and Parisi's solution
In the replica symmetric hypothesis, i.e. under the hypothesis that the invariance under the exchange of replicas is satisfied, we impose Q_αα = 0 ∀α and Q_αβ = q for α ≠ β, so that the matrix Q = (Q_αβ)_αβ can be represented as

$$Q_{\alpha\beta} = q\,(1 - \delta_{\alpha\beta}) \qquad (1.4.41)$$

(this stage is usually denoted by rs). Parisi suggested to break this symmetry in a sequence of steps. At the first step we choose an integer m₁ such that all replicas are divided into n/m₁ groups: if α and β belong to the same group, then Q_αβ = q₁, otherwise Q_αβ = q₀. The matrix Q takes therefore a block form (1rsb stage); in the example n = 12, m₁ = 4,

$$Q = \begin{pmatrix} \hat Q_{q_1} & q_0 & q_0 \\ q_0 & \hat Q_{q_1} & q_0 \\ q_0 & q_0 & \hat Q_{q_1} \end{pmatrix}, \qquad (1.4.42)$$

where Q̂_{q₁} denotes the m₁ × m₁ block with zeroes on the diagonal and q₁ elsewhere, and q₀ stands for the constant m₁ × m₁ block. We can proceed further, subdividing each one of the n/m₁ groups of replicas into m₁/m₂ subgroups, and so on. We finally obtain a set of integer numbers n ≥ m₁ ≥ m₂ ≥ ⋯ ≥ 1 and a gradually finer subdivision of the matrix Q, in which at the k-th step of symmetry breaking we admit k + 1 possible different values q₀, q₁, ..., q_k. At the second step (2rsb stage), for example with n = 12, m₁ = 4, m₂ = 2, each diagonal block of (1.4.42) is itself broken:

$$Q = \begin{pmatrix} \tilde Q & q_0 & q_0 \\ q_0 & \tilde Q & q_0 \\ q_0 & q_0 & \tilde Q \end{pmatrix}, \qquad \tilde Q = \begin{pmatrix} \hat Q_{q_2} & q_1 \\ q_1 & \hat Q_{q_2} \end{pmatrix}, \qquad (1.4.43)$$

where Q̂_{q₂} is the m₂ × m₂ block with zeroes on the diagonal and q₂ elsewhere. The idea of evaluating the free energy taking into account different levels of replica symmetry breaking is not justified by any general principle, but the results obtained are in excellent agreement with the data. Moreover, infinitely many steps of replica symmetry breaking (full rsb) are not always needed: for certain disordered systems a finite number of steps is sufficient to reproduce correctly the behaviour of all the interesting quantities.
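The hierarchical block structure of eqs. (1.4.41)-(1.4.43) is easy to generate programmatically; a minimal sketch (the function name is ours):

```python
import numpy as np

def parisi_matrix(n, ms, qs):
    """Overlap matrix at len(ms) steps of rsb: block sizes ms = (m1, m2, ...)
    with n >= m1 >= m2 >= ... and values qs = (q0, q1, ..., qk)."""
    Q = np.full((n, n), qs[0])
    for m, q in zip(ms, qs[1:]):
        for i in range(0, n, m):        # fill the diagonal blocks
            Q[i:i + m, i:i + m] = q
    np.fill_diagonal(Q, 0)
    return Q

print(parisi_matrix(12, (4, 2), (0.1, 0.2, 0.3)))  # the 2rsb example above
```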
(1.4.48a) (1.4.48b) The complete expression for f¯ in the full rsb scheme can be obtained after some computations [42]: Z1 p Z1 w2 2 2 1 β J 2 ¯ P 1 + q (x) d x − 2q(1) − √ q(0)w e− 2 d w, βf = − 4 2π (1.4.49) 0 0 where the function P (x, h) satisfies the so called Parisi equation: ∂ P (x, h) ∂P (x, h) J 2 dq(x) ∂ 2 P (x, h) +x =− , ∂x 2 dx ∂h ∂h2 P (1, h) = ln (2 cosh βh) . (1.4.50) As before, q must be found extremising the functional (1.4.49): this is a quite difficult task but numerical simulations are in perfect agreement with Parisi’s solution when the limit β → +∞ is taken. The reason of the success of Parisi’s approach is that, in the thermodynamical limit, the free energy of a spin glass has a multivalley structure in which many minima separated by indefinitely high barriers appear: the rsb is necessary to probe all these different valleys simultaneously. It is said that each valley corresponds to a phase (or state) and ergodicity breaking occurs. To better understand this concept, let us introduce the overlap between states for a certain realization {J} of the system in the form Qab := N 1 X hσi ia hσi ib , N i=1 (1.4.51) 1.4. Optimisation and statistical mechanics: replicas for optimisation 19 where h•ia denote the average restricted to the state a, and let be X ρJ (Q) := hδ(Q − Qab )i = wa wb δ(Q − Qab ), (1.4.52) ab where wa is the probability of being in the state a. We are interested, as usual, to the average over the disorder of ρJ , (1.4.53) ρ(Q) := ρJ (Q). Note that in the Ising model for β → +∞ and h = 0 only two phases are possible and ρ(Q) = δ(Q−1) + δ(Q+1) . In the full rsb scheme, it can be proved [45] that 2 2 1 Qk Z := Z X 1 k Q ρ(Q) d Q = lim Qαβ = q k (x) d x, n→0 n(n − 1) k α6=β (1.4.54) 0 average of all Qαβ , solutions of (1.4.24). Here we used eq. (1.4.46). In particular, Q = R1 R1 q(x) d x = −1 q dx dq d q, i.e., denoting by x(q) the inverse function of q(x), 0 Zq ρ(Q) d Q. x(q) = (1.4.55) −∞ This equation stresses the strong relation between the overlap between different replicas Qαβ and the overlap between different states Qαβ : in particular, it is sufficient to invert q(x) to obtain the cumulative distribution of the overlaps. Thouless–Anderson–Palmer equations A different but instructive approach was proposed in 1977 by Thouless, Anderson, and Palmer [52] to obtain a set of equations for the local magnetisation in the SK-model. Given a certain realisation of the SK-model, the free energy can be written as X X X − βF [β; {h}; {J}] := ln exp β Jij σi σj + β hi σi . (1.4.56) {σi } i i6=j Here we suppose that the external magnetic field depends on the site position. As known from the general theory, the local magnetisation is given by mi := hσi i = −∂hi F . Let us now perform a Legendre transform, introducing " # X G[β; {m}; {J}] = max F [β; {h}; {J}] + mi hi . (1.4.57) hi i The new functional is a constrained free energy, in the form X X X − βG[β; {m}; {J}] = ln exp β Jij σi σj + β hi (mi )(σi − mi ) , {σi } (1.4.58) i i6=j where hi (mi ) are such that hσi i = mi . Let us now expand the previous functional around β = 0. We obtain − βG[β; {m}; {J}] = N X 1 − mi 2 i=1 +β X ij ln 1 − mi 2 Jij mi mj + 1 + mi − ln 2 1 + mi 2 β2 X 2 J (1 − m2i )(1 − m2j ) + o(β 2 ). (1.4.59) 2 ij ij 20 Chapter 1. Optimisation, disorder and statistical mechanics σ1 σ2 σ5 σ3 σ4 (a) B2N . (b) G2N,5 σ1 σ1 σ2 σ3 σ0 σ2 σ5 σ4 σ5 σ0 σ3 σ4 (d) Example of both link addition σ1 − σ2 , and site addition, σ0 . (c) Addition of the site σ0 . Figure 1.4.3.: Cavity method. 
In a pictorial way we stressed that in a Bethe lattice loops at large distance N ) preserving the frustration property of the disordered system. are present (of lenght of order ln ln k Plefka [48] showed that, for the SK-model, the additional terms can be neglected in the N → +∞. The minimum condition respect to mi , ∂mi G = 0, gives the celebrated Thouless– Anderson–Palmer equations (TAP-equations): mi = tanh β X Jij mj + hi − mi j6=i X 2 Jij β(1 − mj )2 (1.4.60) j6=i In the previous expression an additional term appears respect to the Ising mean field magnetisation (1.4.13), i.e. the so called Onsager reaction hO i := −mi X 2 Jij β(1 − mj )2 . (1.4.61) j6=i This additional term is not negligible in the spin glass case due to the fact that Jij ∼ and not Jij ∼ N1 as in the ferromagnetic case. √1 N 1.4. Optimisation and statistical mechanics: replicas for optimisation 21 The cavity method The cavity method provides a set of iterative equations to obtain information on the ground state of a disordered system: in a quite dated formalism, the method was applied to the random matching problem by M´ezard and Parisi [35], as we will discuss later. We briefly present this method for a glass on a Bethe lattice following a more recent paper [39]. Suppose that we have a Bethe lattice BkN = Graph(V; E), i.e. a random graph of N 0 vertices with fixed connectivity k+1 (see figure 1.4.3a). Moreover, let σi be the spin variable on site i, Jij the random coupling on the link (i, j). We suppose that Jij are identically and independently distributed random variables with probability distribution density ρ(J), and that the Hamiltonian has the form X HBk [σ; {J}] = − Jij σi σj . (1.4.62) N (i,j)∈E Let EN be the global ground state (ggs) energy for this system. Consider a new graph GkN,q ⊆ BkN obtained from BkN decreasing the degree of q > k spins from k + 1 to k (cavity spins, see figure 1.4.3b). Fixing the values of these spins to σ1 , . . . , σq , we suppose that the ggs energy of this new system is related to the old one in the following way EN (σ1 , . . . , σq ) − EN = q X hi σi , (1.4.63) i=1 where hi are called local cavity fields. Due to the random nature of the system, we assume that {hi }i are i.i.d. random variables with distribution %(h). Now suppose that a new spin σ0 is added to the lattice and coupled to k cavity spins σ1 , . . . , σk by J1 , . . . , Jk respectively (see figure 1.4.3c). We fix the value of σ0 and change the value of σ1 , . . . , σk in such a way that the new ggs energy is minimized. The energy of each link (i, 0) is minimized taking i = min [(−hi − Ji σ0 )σi ] =: −a(Ji , hi ) − σ0 b(Ji , hi ). σi (1.4.64) It is clear that the final cavity field on σ0 is h0 = k X b(Ji , hi ), (1.4.65) i=1 implying that the following recursion relation holds %(h) = k Z Y %(hi ) d hi δ h − i=1 k X ! b(Ji , hj ) (1.4.66) i=1 Now, it can be observed that, denoting by δ10 the energy shift of the ggs due to a site addition and by δ20 the energy shift of the ggs due to link addition to GkN,q [38], EN k+1 2 = δ10 − δ0 . N →∞ N 2 lim (1.4.67) If we are able to solve eq. (1.4.66), then we can compute the ggs energy in the β → ∞ limit by k+1 k+1 k+1 Y Z X X %(hi ) d hi a(Jj , hj ) + δ10 = − b(Jj , hj ), (1.4.68a) j=1 i=1 j=1 δ20 = − 2 Z Y i=1 %(hi ) d hi max (h1 σ1 + h2 σ2 + J12 σ1 σ2 ). σ1 ,σ2 (1.4.68b) 22 Chapter 1. Optimisation, disorder and statistical mechanics Finally, observe that the method, in the formulation presented here, works only in the replica symmetric hypothesis. 
In other words, we assume that there are not local ground states, i.e. local minima in energy whose corresponding configurations can be obtained from the global minimum ones only by an infinite number of spin flip. This eventuality is not so rare and it is necessary to break the replica symmetry to correctly reproduce the ggs energy. 1.4.2. The solution of the random matching problem via replicas After the digression above on the general Theory of Disordered Systems, let us turn back to the Rmmp. As explained before, in the Rmmp we work on the complete graph G = K2N with a weight function w in which the edge weights are random values extracted from a certain probability density function ρ(w). We suppose that this distribution is ρ(w) = wr e−w . r! (1.4.69) Labelling by i = 1, . . . 2N the vertices of the graph, the 2N × 2N weighted adjacency matrix W = (wij ), wij = wji weight on the edge joining i and j, identifies a realization of our problem. The cost of a matching on this graph and the optimal matching cost can be expressed respectively as EN [M ] = 1 N X EN := min EN [M ], mij wij , M 1≤i<j≤2N (1.4.70) where the symmetric matrix M = (mij )ij is such that mij = 1 if the vertices i and j are P2N connected, zero otherwise, mii = 0 ∀i, and j=1 mij = 1 ∀i. To study the optimal matching properties, we can write down a “partition function” for our “system”, assuming EN [M ] as an “energy” density functional of the “configuration” M , in the form 2N 2N Y X X δ 1, mij e−βN EN [M ] Z[β; W ] := {nij =0,1} = i=1 2N Z2π iλi Y i=1 0 e j=1 (1.4.71) d λi Y 1 + e−βwjk −i(λj +λk ) , 2π j<k where we introduced the Kronecker symbol δ (a, b) ≡ δa,b . To get a finite free energy 1 ˆ r+1 density in the N → ∞ limit we impose the scaling β = βN . We proceed using the replica approach. After some computations, we get in the large N limit Z Y n n Y X X d q 1 1 α ...α 1 p p Zn = exp N − qα2 1 ...αp + 2 ln z[q] , 2 g(p) 2πN g(p) α <···<α α <···<α p=1 p=1 p 1 p 1 (1.4.72) where g(p) := 1 ˆ r+1 (pβ) and n Z2π iλα n n α Y X X X α1 αp e d λ exp i z[q] := λα + qα1 ...αp e−i(λ +···+λ ) . 2π α=1 α=1 p=1 α <···<α 1 0 p (1.4.73) The saddle point equation gives D E α1 αp qα1 ...αp = 2g(p) e−i(λ +···+λ ) , z (1.4.74) 1.5. Models on graphs and Belief propagation 23 where h•iz is the average computed respect to the measure defined by the partition function z[q]. In the following we will denote by Qα1 ...αp the solution of the saddle point equation. In the replica symmetric hypothesis Qα1 ...αp ≡ Qp and − β f¯ = − Z∞ ∞ w 1 X (−1)p−1 2 e− e − e−Gr (w) d w, Qp + 2 2 p→0 g(p)p Gr (x) := ∞ X (−1)p−1 p=1 −∞ Qp exp . p! (1.4.75) The saddle point equation can be rewritten in terms of Gr (w) as Gr (w) = 2 βˆr+1 +∞ Z Br (w + y) e−Gr (y) d y, Br (x) := ∞ X (−1)p−1 p=1 −∞ epx . pr (p!)2 From the expression of f¯ the average mean energy can be easily obtained: R +∞ 1 Gr (w) e−Gr (w) d w. hE (β)i = N − r+1 r+1 −∞ βˆ (1.4.76) (1.4.77) Remarkably, in the limit βˆ → +∞ 1 hE (β)i = N − r+1 (r + 1) +∞ Z ˜ r (w) e−G˜ r (w) d w, G ˜ r (w) := 2 G −∞ +∞ Z (w + y)r −G˜ r (w) e d y. r! −w (1.4.78) and for r = 0, the computation can be performed exactly and the the famous result for the average optimal cost in the flat Rmmp is obtained: lim N EN = lim N hE (β)i = N →∞ β→∞ π2 . 12 (1.4.79) Moreover, M´ezard and Parisi [37] were able to obtain the distribution ρ(w) of the optimal 1 weights: rescaling by w 7→ wN − r+1 , they obtained for r = 0 ρ(w) = w − e−w sinh w . 
sinh2 w (1.4.80) By similar arguments, M´ezard and Parisi [36] showed that in the Rbmp for r = 0 lim N EN = N →∞ π2 . 6 (1.4.81) Finally, observe that, also in the replica symmetric hypothesis, an infinite number of order parameters is necessary to deal with such combinatorial problems, contrary to what happens for the SK-model. 1.5. Models on graphs and Belief propagation 1.5.1. Factor graph and Belief propagation The idea of a map between an infinite bipartite graph and a tree, suggested by Aldous and presented in section 1.3.2, meets the statistical physics approach in an algorithmic method that can be used to solve the Rbmp called Belief Propagation (Bp): this method is indeed quite general, as we will see below. Moreover, this algorithmic approach is strictly connected to the cavity method. Let us now introduce the general theory of Bp following the pedagogical exposition of M´ezard and Montanari [34]. Suppose that we have a model in 24 Chapter 1. Optimisation, disorder and statistical mechanics N variables xi , i = 1, . . . , N , taking their values in a finite set and suppose also that their joint probability has the form M 1 Y ψa (x∂a ), µ(x1 , . . . , xN ) = Z a=1 (1.5.1) where by x∂a we denote a certain subset of the set of variables, depending on a and by Z a proper normalization constant. A joint probability of the form (1.5.1) can be graphically represented by a factor graph, i.e. a bipartite graph in which two types of vertices are present: in a factor graph each “function–type” vertex a is connected to the corresponding subset ∂a of the N “variable–type” vertices. We will denote by Vf , |Vf | = M , the set of function nodes, by Vv the set of variable nodes and by Gµ = Graph(Vv , Vf ; E) the factor graph itself, E set of the edges. We can define on the edges of this bipartite graph two functions (in this context called messages), one for each direction in which the edge can be crossed. In particular, for an edge e = (i, a) joining the vertex corresponding to xi to the function vertex corresponding to ψa , we define ui→a (xi ) and va→i (xi ). The messages are elements in the space of probability distribution functions, subject to the normalization conditions N M X X ui→a = 1 va→i = 1. (1.5.2) a=1 i=1 We can write down an iterative “message–passing” algorithm (the so-called Bp-equations), in which we update a message on an edge on the basis of the values of the incoming messages on the tail of the directed edge, i.e., up to multiplicative (normalising) constants, Y t ut+1 vb→i (xi ), (1.5.3a) i→a (xi ) ' b∈∂i\a t+1 va→i (xi ) ' X ψa (x∂a ) {x∂a\i } Y utk→a (xk ). (1.5.3b) k∈∂a\i where ∂i denote the set of function nodes connected to the variable node i and t is the index of the iteration. After T iterations Y T −1 uTi (xi ) ' va→i (xi ), (1.5.4) a∈∂i It can be proved that, if the factor graph Gµ is a tree, eq. (1.5.3) converge after at most T ∗ iterations, where T ∗ = diam(Gµ ) is the diameter of the graph, irrespectively of the initial conditions, to some functions u∗i , vi∗ . Moreover, in this case the obtained functions are strictly related to the marginal probability distributions: X u∗i (xi ) ≡ µ(x1 , . . . , xN ). (1.5.5) {xj6=i } This provides the connection between our initial µ function and the message functions. There is a simple correspondence between the cavity method and the Bp algorithm. 
To make clearer this relation between the cavity equations and the Bp equations, let us consider the glass model in Section 1.4.1 and let us start from the cavity equations (1.4.68a): we can consider for a moment a non zero temperature, β < ∞; the marginal distribution of σ0 µ0 (σ0 ) ' eβh0 σ0 is simply given by X X µ0 (σ0 ) ' eβJi σ0 σi +βhi σi ≡ eβJi σ0 σi µi (σi ). (1.5.6) {σi } Comparing with eq. (1.5.3) and observing that: {σi } 1.5. Models on graphs and Belief propagation 25 21 2 1 23 22 11 2 12 32 13 31 3 1 33 3 Figure 1.5.1.: Factor graph for eq. (1.5.8) for the assignment problem on K3,3 , see figure 1.1.2: each factor node i represents the constraint P j mij = 1; similarly circle nodes ij represent the variables nij and j represents the constraint represents the factor e −βmij wij P i mij = 1; the . 1. in the factor graph of this model each factor node has degree 2, being, in the notation of the previous section, ψa ≡ ψ(σi , σj ) = Z1 eβJij σi σj ∀a; 2. µi (σi ) ≡ u∗i (σi ); it follows that the cavity equations at zero temperature are the fixed point Bp equations. Observe that the model introduced in Section 1.4.1 is supposed on a Bethe lattice, i.e. on a locally tree-like graph in which however large loops are present. Belief propagation for loopy factor graphs: the Rbmp. It is not obvious that the algorithm above works (even approximately) on “loopy” factor graphs (i.e., factor graphs with cycles), but the general idea in this case is to adapt our method imposing local consistent conditions. It can be proved that, in this case, if ψa (x∂a ) > 0 ∀a, then the Bp algorithm has at least one fixed point. We will not treat this delicate problem in details, being outside the purposes of the present work (for a treatment of the problem, see [34]). To exemplify the described method on a loopy factor graph, let us consider its application to the Rbmp. Given a complete bipartite graph KN,N = Graph(V; E), let us call W = (wij )ij the weight matrix, where wij are i.i.d random variable with a probability density ρ(w). We define also M = (mij )ij , matching matrix, such that mij = 1 if the edge (i, j) is occupied, 0 otherwise. Note that in this case the first index and the second one in M and W run on different sets (i.e., W is not a weight adjacency matrix in a strict sense). Again, M PN PN represents a matching if, and only if, i=1 mij = j=1 mij = 1 and the energy of the matching M is given by X EN [M ] = mij wij . (1.5.7) ij In the following we identify M with the matching itself. We can write the joint probability µ(M ) = Y (i,j) δ ! N X e−βmij wij . mij , 1 δ mij , 1 Z i=1 j=1 N X (1.5.8) The corresponding factor graph for N = 3 is depicted in figure 1.5.1. Looking to the 26 Chapter 1. Optimisation, disorder and statistical mechanics previous formula, the Bp-equations become in this case ut+1 (ij)→i (mij ) ' t+1 vi→(ij) (mij ) ' t (mij ) e−βmij wij , vj→(ij) X X Y δ mij + mkj , 1 ut(kj)→j (mkj ). mij =0,1 k6=i (1.5.9a) (1.5.9b) k6=i −−→ To take the β → ∞ limit, let us introduce, for each oriented edge ~e = (i, j) of the original bipartite graph, t 1 vi→(ij) (1) X t (~e) := ln t . (1.5.10) β vi→(ij) (0) Through this change of variable and taking the β → ∞ limit, we obtain the so called min-sum algorithm for the assignment problem: X t+1 (~e) = min w(ej ) − X t (~ej ) . (1.5.11) ~ ej child Observe moreover that for a given edge e ∈ E we have two message functions, i.e. X(~e) and X(e~) and we have to update both simultaneously. Comparing eq. (1.5.11) with eq. 
(1.3.12), it is easily seen that the fixed point solution of the previous equation X(~e) is the same function introduced by Aldous on the Poisson weighted directed tree: indeed, our Bp-equations suggest that the problem of finding an optimal matching on the (loopy) bipartite complete graph is equivalent to the searching for an optimal matching on an (infinite) tree in the N → ∞ limit. This fact however does not guarantee the convergence for finite N . Indeed, Fichera [24] showed that, for finite N , after a certain time tc the message functions enter in a quasi-periodic regime that however still make possible to identify the correct solution. To summarize his results, let us define the drift of a matching M with corresponding matching matrix M = (mij )ij from the optimal matching M∗ , whose corresponding matching matrix is M ∗ = (m∗ij )ij , as EN [M ] − EN [M ∗ ] P δE [M] = . (1.5.12) N − ij mij m∗ij Introducing the set of matching differing from the optimal one for one loop, D := {M matching on KN,N : M4M∗ is a cycle} , (1.5.13) we can define ∆1 = M1 = min δE [M], (1.5.14a) arg min δE [M], (1.5.14b) M∈D M∈D ∆2 = min M∈D\{M1 } δE [M]. (1.5.14c) Let us now consider a certain edge e = (i, j) ∈ E. Fichera proved that, running the algorithm, we have that ∗ min (maxi,j (wij ), EN ) + N − 1, sign X t (~e) = (−1)mij −1 for t > 2∆1 (1.5.15) where as usual EN ≡ EN [M ∗ ] := minM EN [M ] is the optimal matching cost. The previous formula gives a criterion to identify the optimal matching in the hypothesis ∆1 6= 0. If ∆1 6= 0 and ∆1 6= ∆2 , then there exists a period T and a time tc such that ∗ X t+T (~e) = X t (~e) + (−1)mij −1 T ∆1 for t > tc . (1.5.16) 1.5. Models on graphs and Belief propagation 27 It follows that the algorithm does not converge but it still provides useful informations on the solution through the relation (1.5.15). More importantly, Fichera proved that, in the quasi-periodic regime, denoting as above by e = (i, j) ∈ E, λi = − t+T −1 1 X min X τ (~e), T τ =t j∈∂i µj = − t+T −1 1 X (2) τ min X (e~) T τ =t i∈∂j (1.5.17) (2) are an Hungarian gauge for our instance. In the previous formula, min gives as output the second minimum in a certain set of values. If ∆1 1 or ∆1 ≈ ∆2 the algorithm converges very slowly. Chapter 2 Euclidean Matching Problems Summary of the chapter In the present Chapter we present our original contributions to the main subject of the thesis, the Euclidean matching problem. Firstly, we review some well known results on Euclidean matching problems. We discuss also the so called Monge–Kantoroviˇ c transport problem mainly following the lecture notes by Ambrosio [4] and Evans [23] and the monograph of Villani [54]; some fundamental results from Measure Theory on the existence and properties of optimal maps are reviewed. Subsequently, we discuss and present our results on the Euclidean matching problem: we give an exact solution for the one dimensional case both on the line and on the circle; we discuss a proper scaling ansatz to correctly reproduce the scaling behaviour of the optimal cost in any dimension; we present two different approaches for the computation of the correlation function of the optimal matching map in any dimension; we present a replica approach showing the analogies between the Euclidean problem and the purely random problem that manifests itself as a mean field approximation of the Euclidean version. 2.1. 
Euclidean optimisation problems In the previous chapter we presented some optimisation problems on weighted graphs and some useful tools from Mathematics and Physics to solve them. However, we did not discuss the effect of eventual correlations between the weights on the specific weighted graph G = Graph(V; E) on which the problem is formulated, or we supposed that no correlation at all was present (as in the random matching problems). In the present chapter, we will consider the case in which the vertices of the graph correspond to geometrical points on a certain d-dimensional space Ω ⊆ Rd . To be more precise, we will consider the case Ω ≡ Ωd (L) := [0, L]d ⊆ Rd , Ωd := Ωd (1) (2.1.1) and a mapping Φ : V → Ωd (L) such that vi ∈ V 7→ Φ(vi ) ≡ ξi ∈ Ωd (L). (2.1.2) In an Euclidean problem, given an edge eij = (vi , vj ) of the weighted graph, the corresponding weight wij := w(eij ) is a function of the Euclidean distance kξi − ξj k between the geometric points corresponding to the ends of the considered edge; in particular, we will consider the case p eij = (vi , vj ) ∈ E 7→ wp (eij ) ≡ wp,ij = kξi − ξj k , p ∈ R+ . (2.1.3) As in the non-Euclidean case, Euclidean optimisation problems are therefore formulated as the search for the subgraph g∗ ⊆ G among a set of feasible subgraphs F such that a certain functional 1 X w(e), g = Graph(Vg , Eg ) ⊆ G, (2.1.4) E (p) [g; Φ] := |Eg | e∈Eg 29 30 Chapter 2. Euclidean Matching Problems is minimised: E (p) [gΦ ] = min E (p) [g; Φ]. g∈F (2.1.5) In the previous expression we denoted by gΦ the subgraph that realises the optimality. In the Tsp, for example, G ≡ KN , whilst F is the set of all Hamiltonian cycles; in the monopartite matching problem, G ≡ K2N and the set of feasible solutions is given by the set of all possible perfect matchings on G. The Euclidean versions of many optimisation problems are of great interest, due to the fact that some optimisation problems are actually defined in the geometric space (consider, for example, the Tsp to be solved in a certain geographical area); on the other hand, the Euclidean constraint determines correlation between the weights that cannot be neglected when we deal with average properties of some Euclidean random optimisation problems. 2.1.1. Euclidean optimisation and randomness: empirical measures As in the general case, we are not interested in a specific solution of a certain instance of an Euclidean optimisation problem: indeed, from an algorithmic point of view, the fact that the weights are obtained from Euclidean distances or not, does not matter. Instead, we are interested in the average properties of a certain Euclidean optimisation problem in presence of disorder. In a random Euclidean optimisation problem, the vertices of the graph G, on which the problem is defined, are associated to points randomly generated on Ωd ⊆ Rd . The weights to be associated to the edges are then computed as functions of the point positions. As anticipated above, this fact determines, above all, the presence of correlations (due to the underlying Euclidean structure) between the so obtained random weights. The average procedure on the interesting quantities is therefore more subtle than in the purely uncorrelated case and the point generation procedure becomes of paramount importance. This generation takes place usually considering a set of points independently and identically distributed on ΩdRand using a certain probability distribution density ρ(x) : Ωd → R+ on the same domain, Ωd ρ(x) dd x = 1. 
In the present work we assume for Ωd the form (2.1.1) and therefore we deal with a compact subset of Rd . Let us suppose then that a set XN = {ξi }i=1,...,N ∈ Ωd is generated on the considered domain with probability distribution density ρ; we introduce the empirical measure N 1 X (d) δ (x − ξi ) . ρN [x; XN ] = N i=1 (2.1.6) It can be proved [21] that for N → ∞ the empirical measures ρN converge to ρ almost surely. 2.1.2. Asymptotics for Euclidean functionals The asymptotics of Euclidean functionals in the form (2.1.5) has been widely studied in the hypothesis of randomly generated points on a certain domain Ω ⊆ Rd (see the monograph of Yukich [56] for a complete treatment). In particular, the asymptotics of the Tsp problem was firstly analysed by Beardwood, Halton, and Hammersley [6] in a celebrated result subsequently generalised by Redmond and Yukich [49]: we will reproduce here only the main results, being the analysis of the specific probabilistic methods used in the proofs outside the purpose of this thesis. The scaling properties of the optimal solution will be however of great interest in the following. In the previous paragraph we introduced the optimal value of a certain Euclidean functional E [gΦ ] (2.1.5), in which we assume the form (2.1.3) for the weight function on the edges of the considered graph. The defined functional is homogeneous and translationally 2.2. Euclidean matching problems 31 invariant, i.e., if xi 7→ λxi + r ∀i, λ ∈ R+ , r ∈ Rd , then E [gΦ ] 7→ λp E [gΦ ]. Moreover, we say that the considered functional is continuous if, assuming Ω ≡ Ωd , 1− p (2.1.7) |EgΦ |E (p) [gΦ ] − EgΦ0 E (p) [gΦ0 ] ≤ cVgΦ 4gΦ0 d , for some constant c ∈ R+ . Finally, we distinguish three type of Euclidean functionals in relation to their behaviour respect to a partition of Ωd in M d cubes {ωi }i=1,...,M d of edge 1 length M . Subadditive functional An Euclidean functional E (p) is said subadditive if there exist a constant cs ∈ R+ such that X |EgΦ |E (p) [gΦ ] ≤ |EgΦ ∩ωi |E (p) [gΦ ∩ ωi ] + cs M d−p (2.1.8) i for all possible Φ. In the previous expression gΦ ∩ ωi denotes the graph obtained suppressing all vertices v ∈ VgΦ such that v 7→ ξ ∈ Ωd \ ωi and all edges at them. Superadditive functional An Euclidean functional E (p) [gΦ ] is said superadditive if there exist a constant cS ∈ R+ such that X |EgΦ |E (p) [gΦ ] ≥ |EgΦ ∩ωi |E (p) [gΦ ∩ ωi ] − cS M d−p (2.1.9) i for all possible Φ in Ωd . Quasi-additive functional A subadditive Euclidean functional E (p) [gΦ ] is said quasi-additive (p) (p) if there exist a superadditive Euclidean functional ES [gΦ ] such that ES [gΦ ] ≤ (p) E [gΦ ] for all possible Φ and (p) 1− p |Eg |E (p) [gΦ ] − ES [gΦ ] = o |Vg | d , (2.1.10) where the average • is intended over all possible Φ. Redmond and Yukich [49] proved that for a quasi-additive Euclidean functional with 0 < p < d and for independently and identically distributed positions of the vertices of the graph, generated following the distribution density ρ(x) on Ωd , the identity lim |VgΦ |→∞ |EgΦ |E (p) [gΦ ] |VgΦ | d 1− p Z = αE (p) ,d p 1− d [ρ(x)] dd x (2.1.11) Ωd holds. Moreover they proved that the Euclidean functional (2.1.4) for the Tsp and for the Euclidean monopartite matching problem on the complete graph K2N are quasi-additive p and therefore the optimal cost scales N − d for p ≥ 1. In the following we will propose new arguments and, moreover, we will derive by our considerations finite size corrections to the leading scaling with exact coefficients. 2.2. 
Euclidean matching problems As anticipated in many occasions, in the present work we will consider Euclidean matching problems on the unit hypercube Ωd . Random points on the unit hypercube are intended generated independently and with the same uniform distribution ρ(x) = 1, unless otherwise specified. We will consider here two different types of Euclidean matching problems. 32 Chapter 2. Euclidean Matching Problems (a) p = 0.5. (b) p = 1. (c) p = 1.5. Figure 2.2.1.: Optimal grid–Poisson Euclidean bipartite matching with N = 100 on the square with open boundary conditions for the same instance: observe that for p = 1 there are no intersecting links. Euclidean monopartite matching problem In the Euclidean monopartite matching problem (Emmp), we consider a matching problem on the complete graph K2N , assuming that to each vertex v ∈ V is associated a point ξ = Φ(v) randomly generated on the unit hypercube Ωd . We are interested in the perfect matching m∗ ⊆ K2N that minimises the functional (2.1.4): 1 X wp (e), m = Graph(Vm ; Em ) matching; (2.2.1) E (p) [m; Φ] := N e∈Em where wp (e) has the expression (2.1.3). In the following we will denote by (p,m) EN [Φ] := min m matching E (p) [m; Φ] (2.2.2) the optimal matching cost and by (p,m) EN (p,m) (d) := EN [Φ] (2.2.3) the average optimal matching cost in d dimensions. Euclidean bipartite matching problem In the Euclidean bipartite matching problem (Ebmp), we consider a matching problem on the complete bipartite graph KN,N = Graph(V, U; E), N := |V| = |U| ∈ N. Randomness can be introduced in two different ways: we can assume that both vertices in V and vertices in U are associated to randomly generated points in Ωd (Poisson–Poisson Euclidean bipartite matching problem, PP Ebmp) or generating randomly one set of points, and fixing the other ones on a certain lattice positions: if a square lattice is adopted, we say that we are dealing with a grid–Poisson Euclidean bipartite matching problem (gP Ebmp). In the Ebmp we are interested on the perfect matching m such that the functional 1 X (p) EN [m; Φ] := w(e), m = Graph(Vm ; Em ) matching; (2.2.4) N e∈Em is minimised as usual. Denoting by V := {vi }i=1,...,N and by U := {ui }i=1,...,N , a given matching can be uniquely associated to an element σ ∈ SN of the set of permutations of N elements, in such a way that σ ↔ m if (vi , uσ(i) ) ∈ Em . In the following we will use quite often this correspondence and the previous functional (2.2.4) will be written as N 1 X (p) w((vi , uσ(i)) ); (2.2.5) EN [σ; Φ] := N i=1 2.2. Euclidean matching problems 33 Figure 2.2.2.: Optimal gP Euclidean bipartite matching with N = 225 and p = 2 on the torus. We will denote by (p) EN [Φ] := (p) min m matching (p) EN [m; Φ] = min EN [σ; Φ] (2.2.6) σ∈SN the optimal matching cost and by (p,PP) EN (p) (p,gP) (d) := EN [Φ], EN (p) (d) := EN [Φ] , (2.2.7) one set fixed the average optimal costs for the PP Ebmp and the gP Ebmp respectively in d dimensions. In the present work we will consider both matching problems on the unit hypercube Ωd and matching problems on the unit hypercube with periodic boundary conditions, i.e., on the flat hypertorus Td := Rd /Zd : in this case the Euclidean norm in eq. (2.1.3) is intended on this flat manifold. Observe finally that, although the Ebmp appears as a slight variation of the Emmp, it presents a crucial complication. 
Let S us consider for example an Euclidean matching problem on Ωd and a certain partition i ωi ⊂ Ωd : in a greedy approach, in the monopartite case, we can always perform a optimal matching procedure inside each ωi leaving at worst one point unmatched for each element of the partition; in the bipartite case it could happen that a relevant fraction of points remains unmatched for each ωi and therefore they have to be connected with points outside ωi . This suggests that local fluctuations in the number of points of different type in the Ebmp can be relevant in the scaling properties of the average optimal cost of the Ebmp in a non trivial way. The scaling of the average optimal cost for the Emmp can be directly obtained from the general results on subadditive Euclidean functional. It turns out indeed that, for 0 < p < d, in the Emmp on Ωd we have that (p,m) EN p (d) ∼ N − d . (2.2.8) Remarkably, from a mathematical point of view, the treatment of the scaling of the Ebmp appears more difficult, in particular in low dimensions. We will treat exactly this problem for d = 1 later. For d = 2, in 1983 Ajtai, Koml´os, and Tusn´ady [1] proved that the average optimal cost scales as p ln N 2 (p,PP) EN (2) ∼ . (2.2.9) N 34 Chapter 2. Euclidean Matching Problems Observe that a logarithmic correction appears respect to the monopartite case: this correction is indeed due to the relevance of local fluctuations in the number of points of different type. For d ≥ 3 and p = 1 Dobri´c and Yukich [20] proved that (1,PP) EN 1 (d) ∼ N − d . (2.2.10) The previous result can be justified by an heuristic argument: given a certain point on 1 , and therefore we Ωd , its nearest neighbour of different type is expected at a distance √ d N expect that p (p,PP) (2.2.11) EN (d) ∼ N − d . This result is indeed confirmed by numerical simulations for d ≥ 3. To our knowledge, however, no further scaling properties of the average optimal costs for Euclidean matching problems are known from a rigorous point of view. In the following, we will show that, using a proper scaling ansatz, we derived the correct known scaling properties for the Ebmp in any dimension and we obtained also the finite size corrections to the leading terms for d ≥ 2. ˇ transport problem 2.3. The Monge–Kantorovic To properly introduce our scaling ansatz and our results on the Ebmp, let us consider a “continuum version” of the Ebmp, the so called Monge–Kantoroviˇc problem, or optimal transport problem, a well studied problem in Measure Theory. Let us suppose that two non-negative measures ρR (x) and ρB (y) are given on Rd , such that the overall mass balance condition is satisfied Z Z d ρR (x) = d ρB (y). (2.3.1) Rd Rd Let us define also the set of maps Z Z M(ρR , ρB ) := T : Rd → Rd h(T(x)) d ρR (x) = h(y) d ρB (y) ∀h ∈ C0 (Rd ) . Rd Rd (2.3.2) We ask for the map M ∈ M(m, n) such that, given the cost functional Z def E[T; ρR , ρB ] = w(x, T(x)) d ρR (x), (2.3.3) Rn and the cost function w(x, y) : Rd × Rd → R+ , we have E[M; ρR , ρB ] =: E[ρR , ρB ] := inf T∈M(ρR ,ρB ) E[T; ρR , ρB ]. (2.3.4) The previous formulation is due to Monge (1780): in the original problem, Monge considered a finite set of piles of soil and a finite set of excavations of the same cardinality and he wanted to find a proper matching to minimise the transportation cost. It is self-evident that the problem in the formulation above appears as a generalisation of the Ebmp, in which points are replaced with measures on a certain support. 
In the 1940s Kantoroviˇc suggested a slightly different approach. He considered the set of transfer plans Z Z MK (ρR , ρB ) := π : Rd × Rd → R+ d π(x, y) = d ρR (x), d π(x, y) = d ρB (y) . d d y∈R x∈R (2.3.5) 2.3. The Monge–Kantoroviˇc transport problem 35 and the relaxed functional Z EK [π; ρR , ρB ] := w(x, y) d ρ(x, y); (2.3.6) Rn ×Rn ∗ he asked therefore for the transfer plan π such that EK [π∗ ρR , ρB ] := inf π∈MK (ρR ,ρB ) EK [π; ρR , ρB ]. (2.3.7) As mentioned before, this problem is a weak formulation of the original Monge transport problem: the delicate problem of the existence of an optimal transport plan is treated in [54]. In general we note that the existence of a transport plan π does not guarantee that an optimal map M in the Monge sense exists. We will not discuss this problem here and we will assume that, in our hypothesis, an optimal map always exists. It can be proved indeed [4] that, if w is a continuous function on its support and ρR (x) has no atoms1 , an optimal transport plan and a corresponding optimal map always exists. In the following we will consider only cost function in the form w(x, y) ≡ w(x − y). Kantoroviˇc introduced also a dual formulation, defining the following space: K(w) := (u, v)|u, v : Rd → R, u, v ∈ C0 (Rd ), u(x) − v(y) ≤ w(x − y) (x, y) ∈ Rd × Rd . (2.3.8) and asking for the couple (u∗ , v ∗ ) ∈ K(w) that maximise the following functional Z Z K[u, v; ρR , ρB ] := u(x) d ρR (x) − v(y) d ρB (y). (2.3.9) Rd Rd It can be proved [23] that, if w is uniformly continuous, such a couple of functions exists and they satisfy the following relations ( u∗ (x) = inf y∈Rd [v ∗ (y) + w(x − y)] (2.3.10) v ∗ (y) = supx∈Rd [u∗ (x) − w(x − y)] The correspondence between the previous set of equations and the Bp equations is remarkable. To understand what is the relation between the couple of functions (u∗ , v ∗ ) and the optimal map of the original problem M, we present the following theorem: Theorem (Strictly convex cost). Let w ∈ C 1 (Rd × Rd ) a strictly convex function, i.e., w(αx1 + (1 − α)x2 ) < αw(x1 ) + (1 − α)w(x2 ) for α ∈ (0, 1). Then an optimal map M for the original plan exists almost everywhere and it satisfies the following differential equation ∇w (x − M(x)) = ∇u∗ (x); (2.3.11) K[u∗ , v ∗ ; ρR , ρB ] = E[M; ρR , ρB ]. (2.3.12) moreover If the cost function w is not strictly convex (e.g., w(x) = |x|), then the uniqueness of the map is not guaranteed any more. If the cost function w is strictly convex, under some additional assumptions on m and n, we can also write a differential equation for M in the form of a change-of-variable formula (sometimes called Jacobian equation) d ρR (x) = d ρB (M(x)) det JM (x), (2.3.13) where JM (x) is the Jacobian matrix for the map M. Remarkably, this formula does not hold in general, as we will see later; note that the optimal map is not usually “smooth” and the meaning of the Jacobian matrix itself must be clarified. The validity conditions of the Jacobian equation (2.3.13) are pointed out in the following theorem (a more general statement can be found in [54, Chapter 11]) 1 In Measure Theory, given a measurable space (m, Σ), A ⊂ Σ is called atom if m(A) > 0 and ∀B ⊂ A measurable such that m(B) < m(A), m(B) = 0. 36 Chapter 2. Euclidean Matching Problems Theorem (Jacobian equation). Let be ρR , ρB two non-negative measures on Rd and suppose that d ρR (x) = ρR (x) dd x, where ρR (x) ∈ L1 (Rd ). 
Let us assume that M ∈ M(ρR , ρB ) and • ∃Σ ⊂ Rd such that ρR (x) = 0 almost everywhere outside Σ and M is injective on Σ; • M is approximately differentiable2 on Σ almost everywhere. If ∇M(x) is the approximate gradient of M and by definition JM (x) := det |∇M(x)|, then Eq. (2.3.13) holds. Working with strictly convex potentials and compact submanifolds of Rd , the hypothesis of the previous theorem are satisfied. As an important particular case, let us consider the following cost function: 2 w(x − y) := kx − yk , x, y ∈ Rd , (2.3.14) where k•k is the Euclidean norm. The cost is strictly convex; we suppose also that two measures are given on Rd such that d ρR (x) := ρR (x) dd x, d ρB (y) := ρB (y) dd y, (2.3.15) such that ρR (x), ρB (x) ∈ L1 (Rd ). The optimal map is a minimum of the following functional Z h i 2 L[M, λ] := kx − M(x)k + λ(x) (ρB (M(x)) det JM (x) − ρR (x)) dd x (2.3.16) Rd where λ(x) is a Lagrange multiplier. A direct computation of the Euler–Lagrange equations gives that M(x) = ∇φ(x) (2.3.17) for a certain (convex) potential φ, i.e., in the quadratic case, the optimal map can be expressed as a gradient of a convex potential. The Jacobian equation assumes the well known form of a Monge–Amp`ere equation, ρR (x) = ρB (∇φ(x)) det Hessφ (x), (2.3.18) where Hessφ is the Hessian matrix of φ. 2.4. L One dimensional Ebmp: an exact solution for convex weights Let us start considering the Ebmp on the line and on the circle. The low dimensionality allows us to give an exact solution to the problem in the case of convex cost functions. In particular, we will consider the gP Ebmp in one dimension both with open boundary conditions (obc) and with periodic boundary conditions (pbc) on the interval Ω1 = [0, 1]. We will suppose that we work on the complete graph KN,N = Graph(V, U; E), in which the first set of vertices is associated to a set of fixed points on the interval and in particular vi ∈ V 7→ Φ(vi ) = ri ≡ 2 It 2i − 1 , 2N i = 1, . . . , N, (2.4.1) is said that F(x) is approximately differentiable in x if there exists a differentiable function Φ(x) such that the set {F(x) = Φ(x)} has zero density in x; ∇Φ(x) is called approximate gradient of F(x). 2.4. One dimensional Ebmp: an exact solution for convex weights 37 whilst the set of vertices U is associated to a set of N points, {bi }i=1,...,N , randomly generated in the interval, such that ui ∈ U 7→ Φ(ui ) = bi . We will suppose the U-vertices indexed in such a way that i ≤ j ⇒ bi ≤ bj . Finally, we will consider the following cost functionals v u N u1 X (p,gP) (p,gP) p ˜ := min E˜N(p,gP) [σ; Φ], (2.4.2a) EN [σ; Φ] = t wp ri − bσ(i) , E˜N σ N i=1 (p,gP) EN [σ; Φ] = N 1 X wp ri − bσ(i) , N i=1 (p,gP) EN := min EN(p,gP) [σ; Φ], σ (2.4.2b) in which Φ is the map between the complete graph and the points on the line defined above (i.e. represent as usual the random process by which the points are generated), p ∈ R+ 0 and the function wp : [0, 1] → [0, 1] is given by: ( p |x| for obc, (2.4.3) wp (x) = p 1 1 p |x| θ 2 − |x| + (1 − |x|) θ |x| − 2 for pbc. Before any analytical consideration on the optimal cost, let us discuss the structure of the optimal matching. The point configurations in our problem are of this type: Let us consider for a moment the cost functional (2.4.2a) assuming p ∈ (0, +∞). It can be easily observed that, in the hypothesis p > 1, the optimal matching is always ordered, i.e. the optimal permutation σ ∗ is ( i for obc, ∗ (2.4.4) σ (i) = i + λ mod N for pbc, for a certain λ ∈ {1, . . . , N }. 
The optimal matching can be pictorially represented as This simple fact follows directly from the convexity of the weight function wp for p > 1: indeed, it can be seen by direct inspection in the N = 2 case that the ordered solution minimises the cost functional among all possibilities. If instead p < 1, the weight function wp is concave and the ordered solution is not necessarily the optimal one. Indeed, the optimal matching σ ∗ has to satisfy a different requirement, i.e., it must be uncrossing, or nested [10, 17]: given two interval [ri , bσ∗ (i) ] and [rj , bσ∗ (j) ], i 6= j, either [ri , bσ∗ (i) ] ∩ [rj , bσ∗ (j) ] = ∅ or one of the two intervals is a subset of the other one. Pictorially, drawing arcs connecting matched pairs above the line, they must be uncrossing, as in the picture below: Clearly many nested solutions are possible: the uncrossing condition is necessary for the optimal solution in the p ∈ (0, 1) case but not sufficient to identify it: this simple fact in the discrete case, appears as a no crossing rule for the optimal map in the one dimensional Monge–Kantoroviˇc transport problem in presence of concave cost functions [32]. Remarkably, for p = 1 it is possible to have instances in which the same optimal cost is obtained by different matching solutions. 38 Chapter 2. Euclidean Matching Problems The observation above on the properties of the solution for p > 1 solves completely the problem. Indeed, we can perform a simple probabilistic argument to obtain the correct average optimal cost and its distribution in the large N limit both on the line and on the circumference [10]. Here we will perform also a different calculation following our paper [12] and inspired by the Monge–Kantoroviˇc theory. 2.4.1. Ebmp on the interval As shown before, in the case of the gP Ebmp on Ω1 , the optimal matching is always ordered for p ∈ (1, +∞), i.e. the optimal permutation is simply σ ∗ (i) = i. Note that, due to the monotony of the function x 7→ xp for p > 1, the optimal solution for the cost functional (p,gP) (p,gP) E˜N [σ; Φ] is also optimal for the cost functional EN [σ; Φ]. The probability density distribution for the position of the i-th b-point is: N i i (2.4.5) Pr (bi ∈ d b) = y (1 − b)N −i d b, b i where we used the short-hand notation x ∈ d z ⇔ x ∈ [z, z + d z]. In the N → ∞ limit, a non trivial result is obtained introducing the variable µ(y) √ i −b (2.4.6) µ(b) := N N expressing the rescaled (signed) distance between a b-point in [b, b+d b] and its corresponding r-point in the optimal matching. We finally obtain a distribution for the variable µ(t) depending on the position on the interval t ∈ [0, 1]: µ2 e− 2t(1−t) d µ. Pr (µ(t) ∈ d µ) = p 2πt(1 − t) (2.4.7) The distribution (2.4.7) is the one of a Brownian bridge on the interval [0, 1], a continuous time stochastic process defined as B(t) := W(t) − tW(1), t ∈ [0, 1], (2.4.8) where W(t) is a Wiener process (see Appendix A). The joint distribution of the process can be derived similarly. In particular, the covariance matrix for the 2-points joint distribution has the form, for s, t ∈ [0, 1], 2φ(s) φ(s) + φ(t) − φ(t − s) Σ2 = , (2.4.9) φ(s) + φ(t) − φ(t − s) 2φ(t) where we introduced the function φ(x) := |x| 1 − |x| . 2 Averaging over the positions s, t and fixing the distance τ := |t − s|, we have 1 c c − φ(τ ) hΣ2 i (τ ) = , c= . 
c − φ(τ ) c 6 (2.4.10) (2.4.11) The Euclidean matching problem on the interval [0, 1] with open boundary conditions and cost functional (2.4.2a) is therefore related to the Brownian bridge in the N → ∞ limit. By using this correspondence, the correlation function ∀p > 1 is computed as hµ(t)µ(t + τ )i = 1 − φ(τ ). 6 (2.4.12) µ(t) 2.4. One dimensional Ebmp: an exact solution for convex weights 39 0 −0.5 0 0.1 0.2 0.3 0.4 0.5 t 0.6 0.7 0.8 0.9 1 (a) Plot of µ(t) for a certain realisation of the problem with N = 4096. 0.15 2 ρ˜p (u) µ(r)µ(0) p=2 p = 50 p = 500 p→∞ 3 0.1 1 0.05 0 0 0.1 0.2 0.3 0.4 0.5 0 0.5 r 1 1.5 2 u (b) Correlation function ∀p > 1 in the obc case for N = 1000 obtained averaging over 104 realisations (for clarity, not all data are represented). (c) Distribution density of the rescaled cost p (p,gP) N 2 E˜N for different values of p computed with 5 · 105 iterations and N = 5000 for the obc case. Figure 2.4.1.: One dimensional gP Ebmp on the interval. where the average h•i is intended on the position t, whilst we denoted by • the average over different realisations of the problem. This theoretical prediction was confirmed numerically. Introducing the normalised variable σ(t) = µ(t) = sign(µ(t)), |µ(t)| (2.4.13) Boniolo, Caracciolo, and Sportiello [10] computed also the correlation function for this quantity, finding s √ Z1−t 2 min(s, t)(1 − max(s, t)) 1− t √ . (2.4.14) σ(s)σ(t) = arctan ⇒ σ(s)σ(s + t) d s = π |t − s| 1+ t 0 Both formulas were confirmed numerically. To compute the average cost of the matching, from eq. (2.4.7) N p 2 (p,gP) N →∞ EN −−−−→ Z1 p |B(t)| d t = p 1 Γ +1 . 2 2 (p + 1) p 2 (2.4.15) 0 The optimal cost √ (p,gP) N E˜N in the N → ∞ limit can be written as v uZ1 u √ (p,gP) N →∞ u p p N E˜N −−−−→ t |B(t)| d t. 0 (2.4.16) 40 Chapter 2. Euclidean Matching Problems Although the previous expression is difficult to evaluate exactly for finite p (see for example the paper of Shepp [51] for additional information about the distribution of the random R1 p variable Xp := 0 |B(t)| d t), the calculation can be easily performed in the relevant limit p → ∞, being v uZ1 u u p→∞ p p t |B(t)| d t −−−→ sup |B(t)|. (2.4.17) t∈[0,1] 0 The distribution of the sup of the absolute value of the Brownian bridge is the well known Kolmogorov distribution [21] ! +∞ X 2 2 Pr sup |B(t)| < u = (−1)k e−2k u (2.4.18) t∈[0,1] and therefore √ k=−∞ (p,gP) N →∞ N E˜N −−−−→ r π ln 2. 2 (2.4.19) h p i (p,gP) ≤u for different values of p: In figure 2.4.1c we plotted ρp (u) := ddu Pr N 2 E˜N observe that ρp approaches the Kolmogorov distribution in the large p limit. (p,gP) Finally, observe also that the variance of EN 2 (1 + p)Γ(p + 2) − (2p + 1)Γ2 p2 + 1 (p,gP) (p,gP) p N EN − EN = (2.4.20) 2p (p + 1)2 (2p + 1) increases with p and that (p,gP) EN (p,gP) − EN (p,gP) EN 2 = 2 (p + 1)Γ(p + 2) p→∞ 2 − 1 −−−→ +∞. p (2p + 1)Γ 2 + 1 (2.4.21) (p,gP) requires a larger From a numerical point of view, this means that a computation of EN amount of iterations as p increases and fluctuations around the mean value become extremely large in the p → ∞ limit. On the other hand, fluctuations of the optimal cost (p,gP) (p,gP) E˜ around its mean value E˜ for p → ∞ remain finite N N 2 p→∞ π 2 π (p,gP) (p,gP) − E˜N −−−→ − ln2 2. N E˜N 12 2 (2.4.22) (p,gP) This fact allows to perform a precise computation of E˜N for large p. 2.4.2. Ebmp on the circumference Let us now consider the case of periodic boundary conditions. 
For p > 1, the solution for both the functionals (2.4.2), with p ∈ (1, +∞), is again ordered; however, in this case the optimal permutation is i 7→ σ ∗ (i) = i + λ mod N for a certain λ ∈ {0, 1, . . . , N − 1}. In the continuum limit, the solution is a generalised Brownian bridge, µp (t) = B(t) + λp , t ∈ [0, 1], for a certain constant λp ∈ R depending on p. The constant λp can be found by optimality condition on the cost functional (2.4.2a): v uZ1 u ∂ u p p t |B(t) + λp | d t = 0. (2.4.23) ∂λp 0 2.4. One dimensional Ebmp: an exact solution for convex weights 41 The R 1 previous equation can be solved only R 1 for p = 2 and p → +∞. For p = 2, λ2 = − 0 B(τ ) d τ ; therefore, m2 (t) = B(t) − 0 B(t) d t and hµ2 (t)µ2 (t + τ )i = 1 − φ(τ ). 12 (2.4.24) R p1 p→∞ 1 p −−−→ supt∈[0,1] |B(t) + λ∞ | and therefore the optimality For p → ∞, 0 |B(t) + λp | d t condition becomes ! supt∈[0,1] B(t) + inf t∈[0,1] B(t) ∂ sup |B(t) + λ| = 0 ⇒ λ∞ = − . (2.4.25) ∂λ t∈[0,1] 2 We have therefore, for t ∈ [0, 1], µ∞ (t) = B(t) − sups∈[0,1] B(s) + inf s∈[0,1] B(s) . 2 (2.4.26) The correlation function can be directly found using the known joint distributions for the Brownian bridge and its sup and for the sup and inf of a Brownian bridge (see Appendix A). After some calculations we obtain hµ∞ (t)µ∞ (t + τ )i = 2 − ζ(2) − φ(τ ), 4 (2.4.27) P∞ 2 where ζ(2) := n=1 n12 = π6 . The value µ2∞ (t) = 2−ζ(2) = 0.0887665 . . . is very close to 4 1 the value obtained for p = 2, µ22 (t) = 12 = 0.08¯3. In figure 2.4.2b we plot the values of cp := µ2p (t) (2.4.28) as function of p. Note, finally, that due to the fact that we have imposed pbc, h•i ≡ • holds in all the previous formulas. Let us now introduce the normalised transport field σp (t) := µp (t) = sign(µp (t)). |µp (t)| (2.4.29) The correlation function σp (s)σp (t) can be computed from the covariance matrix observing that the process is still Gaussian. The correlation function is found in the form σp (s)σp (t) = 2 cp − φ(t − s) arctan p π φ(t − s) (2cp − φ(t − s)) (2.4.30) 1 where c2 = 12 for p = 2 and c∞ = 2−ζ(2) . 4 The optimal cost in the p → ∞ limit can be evaluated as the average spread of the Brownian bridge. Denoting ξ := sup B(s) − inf B(s), s∈[0,1] (2.4.31) s∈[0,1] the distribution of the spread ξ is given by [21] 2 2 d Pr (ξ < u) = θ3 e−2u + u θ3 e−2u , du where θ3 (x) ≡ ϑ3 (0, x), ϑ3 (z, q) := 1 + 2 ∞ X n=1 2 q n cos(2nz) (2.4.32) (2.4.33) 42 Chapter 2. Euclidean Matching Problems p=2 p=5 p = 200 p→∞ 0.088 0.05 cp hµp (t)µp (t + τ )i 0.1 0.086 0 0.084 p=2 −0.05 0 0.1 0.2 0.3 0.4 5 0.5 10 15 (a) Correlation function in the pbc case for N = 1000 obtained averaging over 104 realisations (for clarity, not all data are represented). The continuous line is the theoretical prediction for p = 2; the dashed line corresponds to the p → ∞ case. (b) Value of cp in the pbc case for N = 1000 obtained from the fit of the correlation function, averaging over 5000 realisations, for different values of p. 101 ρ˜p (u) tan π2 σp (t)σp (t + τ ) p=2 p = 50 p = 200 p→∞ 4 2 100 10−1 10−2 p=2 p = 200 0 0 (c) d du 20 p τ 0.2 Density p (p,gP) Pr N 2 E˜N 0.4 0.6 u distribution <u of 0.8 1 1.2 0 0.1 0.2 0.3 0.4 0.5 τ ρ˜p (u) the := rescaled cost for different values of p computed with 2 · 104 iterations and N = 1000 for the pbc case. (d) Plot for the normalised map correlation function (2.4.30) obtained with 104 iterations and N = 1000 for the pbc case. 
The continuous line is the theoretical prediction for p = 2, whilst the dotted line is obtained from the theoretical prediction for p → ∞ The pole is at τ (p) = √ 1 2 − 1−8cp . 2 Figure 2.4.2.: One dimensional gP Ebmp on the circumference. 2.4. One dimensional Ebmp: an exact solution for convex weights 43 is the third Jacobi theta function. From eq. (2.4.32) the distribution of the optimal cost in the p → ∞ limit is easily derived. Moreover, r ! √ ξ 1 1 π N →∞ (p,gP) (2.4.34) N E˜N −−−−→ = sup B(s) − inf B(s) = p→∞ 2 2 s∈[0,1] 2 2 s∈[0,1] with corresponding variance 2 p→∞ π 2 − 3π (p,gP) (p,gP) N E˜N − E˜N . −−−→ 24 (2.4.35) 2.4.3. General solution and universality for p > 1 In the present section we will justify and generalise the previous results looking at the so Monge–Kantoroviˇc formulation of the problem. Let us consider the interval Ω1 (L) := [0, L] ⊂ R, L ∈ R+ and let us suppose also that two different measures are given on Ω, i.e., the uniform (Lebesgue) measure d m(x) := 1 d x, L (2.4.36) and a non uniform measure d n(x) with measure density ν(x), ∞ X 2πkx 2πkx dx ν1 (k) cos + dx + ν2 (k) sin . d n(x) := ν(x) d x = L L L (2.4.37) k=1 We ask for the optimal map M : Ω1 (L) → Ω1 (L) such that the following transport condition is satisfied Z Z 1 dx = d n(x) ∀A ⊂ Ω measurable. (2.4.38) L A M −1 (A) and M minimises the following functional ZL Ep [M ; m, n] := p |x − M (x)| d n(x), p ∈ R+ . (2.4.39) 0 In the hypothesis p > 1 the Jacobian equations (2.3.13) and eq. (2.4.38) can be rewritten as a change-of-variable formula: L d n(x) = d M (x). (2.4.40) Imposing pbc, that is M (0) = M (L) − L, the solution of (2.4.40) determines the optimal map up to a constant as M = x + Mp (0) + ϕ(x). (2.4.41) In the previous equation we have introduced ϕ(x) X ∞ ∞ X L2 sin kπx kπx L2 kπx L cos + ν2 (k) sin2 . ϕ(x) := ν1 (k) πk L πk L k=1 (2.4.42) k=1 Note that ϕ(0) = ϕ(L) = 0. The value of Mp (0) must be determined requiring that the functional (2.4.39) is minimum: we have that ZL p−1 sign (Mp (0) + ϕ(x)) |µp (0) + ϕ(x)| p 0 d n(x) = 0. (2.4.43) 44 Chapter 2. Euclidean Matching Problems If instead obc are considered, then Mp (0) = 0 and the solution is obtained explicitly ∀p > 1. Let us now suppose that L ≡ N ∈ N and that the measure n(d x) is obtained as a limit measure of a random atomic measure of the form ! N N dx X 1 X d nN (x) := δ (x − ηi ) = d θ (x − ηi ) , (2.4.44) N i=1 N i=1 where {ηi }i=1,...,N is a set of N points uniformly randomly distributed in Ω1 (N ). The previous measure can be written as N ∞ r X 2 Zk πkx x 1 X sin + + ηi − 1. (2.4.45) nN (x) = N πk N N N i=1 k=1 where we have introduced N 1 X Zk ≡ Zk (x) := zi (x), N i=1 √ 2ηi + x zi (x) := − 2N cos kπ . N (2.4.46) Observe now that Zk (x) is a sum of independent identically distributed random variables. From the central limit theorem, we have that Zk (x) is normally distributed as Zk ∼ N (0, 1) ∀k ∈ N. (2.4.47) Remarkably the previous distribution does not depend on x. Moreover, the Zk and Zl are independent random variables for k 6= l, being Gaussian distributed and hZl Zk i = 0, where the average h•i is intended over the possible values {ηi }i . In eq. (2.4.45) the Karhunen– Lo`eve expansion for the Brownian bridge [5] on the interval [0, N ] appears: BN (x) := ∞ X √ k=1 2N πkx Zk sin , πk N Zk ∼ N (0, 1) ∀k ∈ N. 
(2.4.48) It follows that nN (x) can be written for large N , up to irrelevant additive constants, as nN (x) ' 1 x BN (x) + N N (2.4.49) and therefore we cannot associate a density measure to it, due to the fact that nN (x) is not differentiable. However the solution of the matching problem in the continuum can be still obtained directly from eq. (2.4.37); considering pbc, then Mp (x) = Mp (0) + x + BN (x), x ∈ [0, N ]. (2.4.50) Denoting by µp (x) := Mp (x) − x, (2.4.51) it follows that ∀p > 1 µp (x)µp (y) = cp (N ) − N φ x−y N . (2.4.52) where cp (N ) is a constant depending on N and p. Adopting the notation F (N ) ∼ G(N ) ⇔ F (N ) 0 < limN →∞ G(N ) < +∞, for two positive real functions F and G depending on N , note that cp (N ) ∼ cp N (2.4.53) for some positive constant cp depending on p. Indeed, in the discrete case, from the fact that the solution must be ordered, in the large N limit it can be easily seen that p min Ep [M ; m, n] ∼ N 2 , M (2.4.54) 2.4. One dimensional Ebmp: an exact solution for convex weights 45 where Ep is the cost functional (2.4.39) in which the measure (2.4.44) is adopted. Therefore, √ Mp (0) = µp (0) ∼ µp (x) ∼ N . Moreover, note also that Mp2 (t) < N 2 ⇒ cp (0) = 0. For p = 2, eq. (2.4.43) becomes 1 M2 (0) = − N ZN 1 BN (x) ◦ d BN (x) − N 0 ZN 1 BN (x) d x = − N 0 ZN BN (x) d x (2.4.55) 0 where the first integral is intended in the Stratonoviˇc sense. The result is in agreement with the one presented in the Subsection 2.4.2. If we consider the transport cost functional v uN uZ u p p ˜ Ep [M ; m, n] := t |x − M (x)| d n(x) (2.4.56) 0 the matching problem has the same solution M˜p ≡ Mp obtained √ for the cost (2.4.39) for all finite values of p, due to the fact that the function f (x) = p x is monotone. However, for the functional cost (2.4.56), in the p → ∞ limit, we can reproduce the computations of Section 2.4.2, obtaining lim M˜p (0) = − p→∞ sups∈[0,N ] BN (s) + inf s∈[0,N ] BN (s) . 2 (2.4.57) If obc are considered, then Mp (0) = 0 and we have simply, ∀p > 1, µp (x) ≡ µ(x) = BN (x), x ∈ [0, N ]; (2.4.58) It can be easily seen that 1 N ZN µ(x)µ(x + r) d x = c(N ) + N φ r , N c(N ) = N . 6 (2.4.59) 0 Note that, therefore, if we consider the problem of a matching of N random b-points to N lattice r-points on the interval [0, N ], for s, t ∈ [0, N ] the correlation function assumes the form |τ | |τ | G(τ ) := hµp (t)µp (t + τ )i = cp N − 1− , (2.4.60) 2 N where Mp (t) is the signed distance between t ∈ [0, N ] and its corresponding inverse image under the action of the optimal map. It follows that for N → ∞ the correlation function G(τ ) has a divergent part, G(0) = cp N , depending through cp on the specific details of the problem (e.g., the boundary conditions adopted or the value of p), a universal finite τ2 part − |τ2| and a (universal) finite size correction 2N . We obtained also numerical evidences of the validity of eq. (2.4.60) for different values of p both with obc and with pbc: an exact derivation of cp for obc ∀p > 1 and for c2 and c∞ for pbc was presented. This fact suggests that all Euclidean matching problems in one dimension with strictly convex cost functionals belong to the same universality class and that the specific details of the model determine only the value of the constant cp in the divergent contribution G(0). Similarly, on the interval [0, N ] eq. 
Similarly, on the interval [0, N] eq. (2.4.30) becomes

  ⟨σ_p(t)σ_p(t+τ)⟩ = (2/π) arctan[ (2c_pN − τ(1 − τ/N)) / ( √(τ(1 − τ/N)) √(4c_pN − τ(1 − τ/N)) ) ] = 1 − (2/π)√(τ/(c_pN)) + o(√(τ/N)),   (2.4.61)

in which the universal part is given by the constant 1 and the finite size corrections scale as 1/√N, up to a scaling factor depending on c_p.

2.5. Higher dimensional Euclidean matching problem

In Section 2.3 we discussed the Monge–Kantorovič problem and we observed that it appears as a "continuum" generalisation of our discrete matching problem. In the previous section, moreover, we gave a general solution of the one dimensional case for p > 1 using the Jacobian equation (2.3.13), which for d = 1 is linear. Inspired by these results and by the continuum theory, we tried, in a non-conventional way, to obtain useful information on the discrete problem in any dimension from the continuum problem, by means of proper regularisation procedures. In particular, we made an ansatz on the functional dependence of the optimal cost ⟨E_N^{(2,PP)}⟩ of the Ebmp on the empirical measures of the two sets of points. Indeed, let us consider a bipartite matching problem on K_{N,N} := Graph(V, U; E), associating to the vertices of K_{N,N} 2N points independently and uniformly generated on the hypercube Ω_d = [0,1]^d ⊂ ℝ^d through the map Φ:

  V := {vᵢ}_{i=1,…,N} ↦ R := Φ(V) = {rᵢ := Φ(vᵢ)}_{i=1,…,N} ⊂ Ω_d,   (2.5.1a)
  U := {uᵢ}_{i=1,…,N} ↦ B := Φ(U) = {bᵢ := Φ(uᵢ)}_{i=1,…,N} ⊂ Ω_d.   (2.5.1b)

As above, we want the optimal permutation σ* minimising

  E_N^{(2,PP)}[σ; Φ] := (1/N) Σ_{i=1}^N ||b_{σ(i)} − rᵢ||².   (2.5.2)

We associate to the sets R and B two empirical measures,

  ρ_R(x) := (1/N) Σᵢ δ(x − rᵢ),   (2.5.3a)
  ρ_B(x) := (1/N) Σᵢ δ(x − bᵢ),   (2.5.3b)

such that ρ_R(x), ρ_B(x) → ρ(x) = 1 weakly as N → ∞. We suppose that periodic boundary conditions are imposed, so we work on the d-dimensional flat torus T^d. Let us call

  δρ(x) := ρ_R(x) − ρ_B(x)   (2.5.4)

the difference between the two densities and

  δρ̂(k) := (1/N) Σᵢ (e^{−2πik·rᵢ} − e^{−2πik·bᵢ}),   k ∈ ℤ^d,   (2.5.5)

the corresponding Fourier modes. We ask therefore for an optimal map M: Ω_d → Ω_d such that

  M(rᵢ) = b_{σ*(i)}.   (2.5.6)

The previous map is a map between the two atomic densities (2.5.3) that minimises the transport functional

  E[ν; ρ_R, ρ_B] := ∫_{Ω_d} ||x − ν(x)||² ρ_R(x) d^dx ≡ (1/N) Σᵢ ||rᵢ − ν(rᵢ)||²   (2.5.7)

among all possible maps between the two given atomic measures. Observe that the optimal map is not unique in our hypotheses (indeed, there are infinitely many maps satisfying the constraint (2.5.6)). Neglecting for a moment the highly non-regular structure of our measures, for a quadratic cost such as (2.5.7) the optimal map can be expressed as a gradient of a scalar function ϕ(x),

  M(x) := x + µ(x),   µ(x) = ∇ϕ(x),   (2.5.8)

satisfying the Monge–Ampère equation (2.3.18):

  ρ_R(x) = ρ_B(x + ∇ϕ(x)) det( 𝟙 + ∂²ϕ/∂xᵢ∂xⱼ ).   (2.5.9)

In the limit N → ∞, however, we expect |µ(x)| ≪ 1 (the two empirical measures converge weakly to the same uniform measure and therefore M(x) → x, the identity map): under this working hypothesis the Monge–Ampère equation can be linearised as

  ∆ϕ(x) = δρ(x).   (2.5.10)

The previous equation has only one non-trivial solution on the d-dimensional hypertorus and therefore identifies our optimal map.
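Before substituting back analytically (as done just below), the linearised picture can be probed numerically. The following is a hedged sketch (ours): it compares the exact optimal cost of a periodic 2d bipartite matching, computed with a Hungarian-type solver, with the Fourier-mode estimate obtained from the solution of (2.5.10); the cutoff kmax is an arbitrary choice.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
N, kmax = 512, 12
R = rng.random((N, 2))          # red points on the unit torus
B = rng.random((N, 2))          # blue points

# Exact optimal cost with periodic quadratic distance.
diff = np.abs(R[:, None, :] - B[None, :, :])
diff = np.minimum(diff, 1.0 - diff)          # torus convention
C = (diff ** 2).sum(axis=2)
i, j = linear_sum_assignment(C)
E_exact = C[i, j].mean()

# Fourier estimate: sum_{k!=0} |drho(k)|^2 / (4 pi^2 |k|^2), with drho(k)
# the empirical modes (2.5.5); agreement holds at leading order provided
# kmax grows like sqrt(N).
ks = np.array([(kx, ky) for kx in range(-kmax, kmax + 1)
               for ky in range(-kmax, kmax + 1) if (kx, ky) != (0, 0)])
drho = (np.exp(-2j * np.pi * R @ ks.T).mean(axis=0)
        - np.exp(-2j * np.pi * B @ ks.T).mean(axis=0))
E_fourier = (np.abs(drho) ** 2 / (4 * np.pi ** 2 * (ks ** 2).sum(axis=1))).sum()

print(E_exact, E_fourier)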
Substituting into eq. (2.5.7), for the optimal map M we have

  E[M; ρ_R, ρ_B] = Σ_{k∈ℤ^d\{0}} |δρ̂(k)|² / (4π²||k||²).   (2.5.11)

Our hypothesis is that the previous functional at large N captures the leading terms of the exact optimal cost E_N^{(2,PP)}[σ*; Φ] of our original matching problem; in particular, observing that

  ⟨|δρ̂(k)|²⟩ = 2/N,   (2.5.12)

we expect

  ⟨E_N^{(2,PP)}(d)⟩ ≈ (1/2π²N) Σ_{k∈ℤ^d\{0}} 1/||k||².   (2.5.13)

One dimensional case — Let us consider the Ebmp in dimension d = 1. In low dimensions, that is for d = 1, 2, the optimal cost is no longer a subadditive quantity and the fluctuations of the density of points are dominant; moreover, for d = 1 the optimal cost is not even self-averaging [27]. The one dimensional case is the simplest application of our formula, being the only one where no regularisation is needed: since Σ_{k≠0} k^{−2} = π²/3, we obtain straightforwardly

  ⟨E_N^{(2,PP)}(1)⟩ = 1/(6N).   (2.5.14)

This result is exact, and it can be re-obtained reproducing arguments similar to the gP case presented in Section 2.4. We checked the validity of our prediction numerically: the results are presented in fig. 2.5.1.

Figure 2.5.1.: Numerical simulations for d = 1 [panels: N⟨E_N^{(2,PP)}(1)⟩ and N⟨E_N^{(2,PP)}(1)⟩ − 1/6 against 1/N]. The fit was performed using the three-parameter fit function f₁⁽²⁾(N) = α₁⁽²⁾ + e₁⁽²⁾/N + c/N². We averaged the optimal cost of the Ebmp given by an exact algorithm [18] for system sizes up to N = 2048. From a least-squares fit we obtained the coefficient α₁⁽²⁾ = 0.166668(3), in perfect agreement with our analytical prediction. Once verified the validity of our theoretical result, we used it to extrapolate the subleading coefficient e₁⁽²⁾, fixing α₁⁽²⁾ = 1/6 and using the fitting function f₁⁽²⁾(N) with two free parameters (see table 2.1).

Two dimensional case — For d ≥ 2 the sum appearing in eq. (2.5.13) diverges; we therefore have to find a suitable regularisation to give meaning to the expression. Let us introduce a proper cut-off in momentum space, i.e., let us consider only the elements of the sum with

  ||k|| ≤ 2π/ℓ,   ℓ := N^{−1/d},   (2.5.15)

where ℓ is the characteristic length of the system at finite N, of the order of the distance between two points of different type. Clearly 2πℓ⁻¹ → +∞ for N → +∞. Let us start with the d = 2 case. We choose a regularising smooth function F(x) such that F(0) = 1 and F(x) → 0 for x → +∞, decreasing rapidly enough to make the series

  Σ_{n∈ℤ²\{0}} (1/||n||²) F(||n||/(2πℓ⁻¹))   (2.5.16)

converge. Finally, let us denote by

  N_d(r) := #{x ∈ ℤ^d\{0} : ||x|| < r}   (2.5.17)

the number of lattice points (origin excluded) included in a ball of radius r centred at the origin in dimension d. Then, for arbitrary a ∈ (0,1), we can write

  Σ_{k∈ℤ²\{0}} (1/||k||²) F(||k||/√N)
   = lim_{R→∞} [ ∫_a^R (1/r²) F(r/√N) (∂N₂(r)/∂r − 2πr) dr + 2π ∫_a^R F(r/√N) dr/r ]
   ≈ lim_{R→∞} ∫_a^R (1/r²) (∂N₂(r)/∂r − 2πr) dr + 2π ∫_{a/√N}^∞ F(r) dr/r,   (2.5.18)

where we have isolated in the last term the singular part of the series. The first integral in the right hand side is well behaved: integrating by parts,

  ∫_a^R (1/r²)(∂N₂(r)/∂r − 2πr) dr = [(N₂(r) − πr²)/r²]_a^R + 2 ∫_a^R (N₂(r) − πr²)/r³ dr,   (2.5.19)

and both terms are finite in the R → ∞ limit, due to the classical bound |N₂(r) − πr²| ≤ 1 + 2√2 πr [26].
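As a quick illustration (ours, not part of the original derivation), the bound just quoted can be checked by direct lattice enumeration:

```python
import numpy as np

# Count lattice points N_2(r) of eq. (2.5.17) and verify
# |N_2(r) - pi r^2| <= 1 + 2*sqrt(2)*pi*r; grid size is ours.
m = 120
X, Y = np.meshgrid(np.arange(-m, m + 1), np.arange(-m, m + 1))
d2 = (X ** 2 + Y ** 2).ravel()
d2 = d2[d2 > 0]                   # exclude the origin

r = np.linspace(0.5, 100.0, 500)
N2 = np.array([(d2 < rr ** 2).sum() for rr in r])
print(np.all(np.abs(N2 - np.pi * r ** 2) <= 1 + 2 * np.sqrt(2) * np.pi * r))
```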
Therefore we have

  Σ_{k∈ℤ²\{0}} (1/||k||²) F(||k||/√N) ≈ 2 ∫_a^∞ (N₂(r) − πr²)/r³ dr + 2π ln(√N/a) + 2π ∫_a^∞ F(r) dr/r,   (2.5.20)

up to N-independent terms. Eq. (2.5.13) for the case d = 2 can then be rewritten as

  ⟨E_N^{(2,PP)}(2)⟩ ∼ ln N/(2πN) + e₂⁽²⁾/N,   (2.5.21)

where e₂⁽²⁾ is some constant. To our knowledge the result

  lim_{N→∞} N ⟨E_N^{(2,PP)}(2)⟩ / ln N = 1/2π   (2.5.22)

is new to the literature. The validity of eq. (2.5.21) has been confirmed by numerical simulations, see fig. 2.5.2.

Figure 2.5.2.: Numerical simulations for d = 2 [panels: N⟨E_N^{(2,PP)}(2)⟩ and N⟨E_N^{(2,PP)}(2)⟩ − ln N/2π against 1/N]. We fitted our numerical data for N⟨E_N^{(2,PP)}(2)⟩ using the function f₂⁽²⁾(N) = a ln N + e₂⁽²⁾ + b/ln N; the 1/ln N correction was suggested by the right hand plot. From a least-squares fit we obtained the coefficient 2πa = 1.0004(6), in perfect agreement with our analytical prediction. Once verified the theoretical prediction for a, we used it to extrapolate the subleading coefficient e₂⁽²⁾, fixing a = 1/2π and fitting the other two parameters (see table 2.1).

Table 2.1.: Results of numerical simulations for p = 2. In the first line the analytical prediction for α_d⁽²⁾ for d ≠ 2 is presented.

               d = 1          d = 2       d = 3         d = 4         d = 5
 ζ_d(1)/2π²    1/6            —           −0.45157…     −0.28091…     −0.21423…
 α_d⁽²⁾        0.166668(3)    —           −0.4489(16)   −0.282(4)     −0.2139(13)
 e_d⁽²⁾        −0.1645(13)    0.1332(5)   0.66251(2)    0.571284(6)   0.584786(2)

Higher dimensional case — As in the previous paragraph, we use a sufficiently rapidly decaying function F(x), with lim_{x→∞} F(x) = 0 and lim_{x→0} F(x) = 1, to regularise our divergent series. Denoting as before by N_d(r) the number of lattice points inside a sphere of radius r centred at the origin (origin excluded), we can write

  Σ_{k∈ℤ^d\{0}} (1/||k||²) F(||k||/N^{1/d})
   = ∫₀^∞ (1/r²) F(r/N^{1/d}) (∂N_d(r)/∂r − S_d r^{d−1}) dr + N^{(d−2)/d} S_d ∫₀^∞ F(r) r^{d−3} dr
   ≈ ∫₀^∞ (1/r²) (∂N_d(r)/∂r − S_d r^{d−1}) dr + N^{(d−2)/d} S_d ∫₀^∞ F(r) r^{d−3} dr,   (2.5.23)

where S_d = 2π^{d/2}/Γ(d/2) is the unit sphere surface in d dimensions. The last term in eq. (2.5.23) cannot be explicitly computed, since it depends on the choice of the regularising function F(x). We name

  Σ_d := ∫₀^∞ (1/r²) (∂N_d(r)/∂r − S_d r^{d−1}) dr.   (2.5.24)

Figure 2.5.3.: Numerical simulations for d = 3, 4, 5 [panels: N^{2/d}⟨E_N^{(2,PP)}(d)⟩ against N^{−γ_d}]. We fitted our numerical data for N^{2/d}⟨E_N^{(2,PP)}(d)⟩ using the function f_d⁽²⁾(N) = e_d⁽²⁾ + α_d⁽²⁾N^{−γ_d} + c/N; the expected value for α_d⁽²⁾ was ζ_d(1)/2π². We verified the validity of eq. (2.5.33) with numerical simulations on systems with sizes up to N = 10648 in dimension d = 3, N = 14641 in dimension d = 4 and N = 32768 in dimension d = 5. The scaling exponents γ_d are readily confirmed to be the right ones, and the fitted coefficients α_d⁽²⁾ are in strong agreement with the analytical prediction. We also fixed α_d⁽²⁾ = ζ_d(1)/2π² in f_d⁽²⁾(N) to extrapolate the extensive coefficients e_d⁽²⁾ reported in table 2.1.
It can be shown that Σ_d = ζ_d(1), the analytic continuation to the point s = 1 of the function

  ζ_d(s) := Σ_{k∈ℤ^d\{0}} 1/||k||^{2s},   Re s > d/2.   (2.5.25)

The previous function is a particular case of a more general class of zeta functions, called Epstein zeta functions. Indeed, let us fix d > 2 and a > 0. Then, for Re s > d/2, we can rewrite eq. (2.5.25) as

  ζ_d(s) = lim_{R→∞} [ Σ_{k≠0, ||k||≤R} ||k||^{−2s} − S_d ∫_a^R r^{d−1−2s} dr + S_d a^{d−2s}/(2s−d) ].   (2.5.26)

Assuming that the limit in the last equation exists also for Re s < d/2, we have isolated the singular term. The analytic continuation of ζ_d(s) then reads

  ζ_d(s) = lim_{R→∞} [ Σ_{k≠0, ||k||≤R} ||k||^{−2s} − S_d ∫₀^R r^{d−1−2s} dr ],   Re s < d/2,   (2.5.27)

where the limit a → 0 has been taken. Note that, comparing the previous equation with eq. (2.5.24), ζ_d(1) ≡ Σ_d. On the other hand, for Re s > d/2,

  ζ_d(s) = Σ_{k≠0} (1/Γ(s)) ∫₀^∞ z^{s−1} e^{−||k||²z} dz = (π^s/Γ(s)) ∫₀^∞ z^{s−1} (Θ^d(z) − 1) dz,   (2.5.28)

where Θ(z) := Σ_{n=−∞}^{+∞} e^{−πn²z} ≡ ϑ₃(0; iz), ϑ₃(z; τ) being the third Jacobi theta function. Noticing that asymptotically Θ(z) ∼ 1/√z for z ≪ 1, while Θ(z) ∼ 1 + 2e^{−πz} for z ≫ 1, we can isolate the singular parts of ζ_d(s), writing

  ζ_d(s) = (π^s/Γ(s)) [ 2/(2s−d) − 1/s + ∫₀^1 z^{s−1} (Θ^d(z) − z^{−d/2}) dz + ∫₁^∞ z^{s−1} (Θ^d(z) − 1) dz ].   (2.5.29)

The last expression can be readily continued to the region Re s < d/2. Using the property Θ(t⁻¹) = √t Θ(t), we can write

  Σ_d = ζ_d(1) = π [ 2/(2−d) − 1 + ∫₁^∞ (1 + z^{d/2−2}) (Θ^d(z) − 1) dz ].   (2.5.30)

Therefore we finally have, for d > 2,

  ⟨E_N^{(2,PP)}(d)⟩ ∼ e_d⁽²⁾ N^{−2/d} + ζ_d(1)/(2π²N).   (2.5.31)

The expression for ζ_d(1) is given by eq. (2.5.30), while e_d⁽²⁾ has to be determined numerically. Note that for d → +∞ we recover the correct mean field scaling behaviour already analysed by Houdayer, Boutet de Monvel, and Martin [27] for the average optimal cost ⟨E_N^{(mf)}⟩ of the random assignment problem,

  N^{2/d} ⟨E_N^{(2,PP)}(d)⟩ --(d→∞)--> ⟨E_N^{(mf)}⟩.   (2.5.32)

Observe indeed that the rescaled variable above has the scaling behaviour

  N^{2/d} ⟨E_N^{(2,PP)}(d)⟩ ∼ e_d⁽²⁾ + (ζ_d(1)/2π²) N^{−γ_d},   γ_d := 1 − 2/d.   (2.5.33)

However, for finite d the scaling behaviour can be very different from the mean field one that Houdayer, Boutet de Monvel, and Martin [27] used to fit also the Euclidean optimal cost. The predictions above were confirmed by proper numerical simulations (see fig. 2.5.3).

Numerical results for generic p — The asymptotic form we proposed for the average optimal cost in the Ebmp with quadratic cost and periodic boundary conditions could not be extended to cover the case of generic cost exponent p. Nonetheless, our numerical simulations give strong evidence that, for d > 2 and any p ≥ 1, ⟨E_N^{(p,PP)}(d)⟩ has the asymptotic form

  N^{p/d} ⟨E_N^{(p,PP)}(d)⟩ ∼ e_d⁽ᵖ⁾ + α_d⁽ᵖ⁾/N^{γ_d}.   (2.5.34)

We thus find the same scaling exponent γ_d analytically predicted only for the case p = 2.

Table 2.2.: Results of the fit of numerical data for N^{p/d}⟨E_N^{(p,PP)}(d)⟩ by a function of the form f_d⁽ᵖ⁾(N) = e_d⁽ᵖ⁾ + α_d⁽ᵖ⁾N^{−γ_d} + c/N.

 p = 1       d = 3        d = 4        d = 5
 α_d⁽¹⁾      −0.233(2)    −0.152(17)   −0.127(16)
 e_d⁽¹⁾      0.7472(2)    0.7181(2)    0.73905(5)

 p = 3       d = 3        d = 4        d = 5
 α_d⁽³⁾      −0.652(3)    −0.36(1)     −0.28(4)
 e_d⁽³⁾      0.6313(2)    0.4804(1)    0.4830(1)

 p = 4       d = 3        d = 4        d = 5
 α_d⁽⁴⁾      −0.863(3)    −0.44(2)     −0.34(1)
 e_d⁽⁴⁾      0.6286(2)    0.4183(2)    0.41043(4)
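As an aside, the constant ζ_d(1) appearing in eq. (2.5.30) and in table 2.1 is easy to evaluate numerically. A minimal sketch (ours), truncating the theta series:

```python
import numpy as np
from scipy.integrate import quad

def theta(z):
    # Theta(z) = sum_n exp(-pi n^2 z); for z >= 1 a few terms suffice.
    n = np.arange(1, 20)
    return 1.0 + 2.0 * np.sum(np.exp(-np.pi * n ** 2 * z))

def zeta_d_1(d):
    # analytic continuation (2.5.30)
    integrand = lambda z: (1.0 + z ** (d / 2.0 - 2.0)) * (theta(z) ** d - 1.0)
    integral, _ = quad(integrand, 1.0, np.inf)
    return np.pi * (2.0 / (2.0 - d) - 1.0 + integral)

for d in (3, 4, 5):
    print(d, zeta_d_1(d) / (2 * np.pi ** 2))
# expected: -0.45157, -0.28091, -0.21423 (first line of table 2.1)
```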
Again, the non-trivial scaling exponent γ_d differs from the mean field exponent γ_∞ = 1 of the random link matching problem [46] and of the Euclidean monopartite matching [27]. The identification of the correct exponent γ_d is crucial to the extrapolation of e_d⁽ᵖ⁾ from numerical data. Since in the literature a mean-field-like scaling has been considered [27], we report in table 2.2 the values of e_d⁽ᵖ⁾ and α_d⁽ᵖ⁾ for different values of p. They were obtained fitting N^{p/d}⟨E_N^{(p,PP)}(d)⟩ with functions of the form

  f_d⁽ᵖ⁾(N) = e_d⁽ᵖ⁾ + α_d⁽ᵖ⁾/N^{γ_d} + c/N.   (2.5.35)

2.6. Correlation functions for the Ebmp

Let us now discuss a different aspect of the Ebmp, with reference to the optimal matching map µ(rᵢ) := b_{σ*(i)} − rᵢ, σ* optimal permutation, between a set of N r-points and a set of N b-points, in the notation above, in the large N limit. In particular, let us analyse the correlation function of the optimal map in the Ebmp with quadratic cost, both in the gP case and in the PP case. To avoid finite size effects we will work on the unit hypertorus T^d (i.e., we will assume periodic boundary conditions). The correlation function is defined as follows:

  C_d(x) := ⟨µ(rᵢ)·µ(rⱼ)⟩|_{rᵢ−rⱼ=x}.   (2.6.1)

To obtain the previous quantity we will proceed in two different ways:
• we will generalise the scaling ansatz proposed for the average optimal cost [13];
• we will introduce a functional formalism to compute the two-point correlation function of the optimal map [11].
Both the methods and the results are, to our knowledge, new to the literature. Let us start with the first approach; in the next Section we will introduce the second one in full generality.

Let us consider the Ebmp between two sets of points, R = {rᵢ}_{i=1,…,N} ⊂ Ω_d and B = {bᵢ}_{i=1,…,N} ⊂ Ω_d, in the unit hypercube with periodic boundary conditions and quadratic cost, adopting the same notation of Section 2.5. Using eq. (2.5.10), obtained by a proper linearisation of the Monge–Ampère equation in the large N limit, we can write down an expression for the correlation function assuming that

  µ(x) = ∇ϕ(x),   ∆ϕ(x) = δρ(x) := ρ_R(x) − ρ_B(x).   (2.6.2)

Being ∫_{Ω_d} δρ(x) d^dx = 0, eq. (2.5.10) has a unique solution on the compact manifold T^d, given by

  ϕ(x) = ∫_{T^d} δρ(y) G_d(y, x) d^dy,   (2.6.3)

where G_d is the Green's function of the Laplace operator ∆ on the flat torus T^d, defined by

  ∆_y G_d(x, y) = δ^{(d)}(x − y) − 1;   (2.6.4)

in a Fourier mode expansion we can write

  G_d(x, y) ≡ G_d(x − y) = −Σ_{n∈ℤ^d\{0}} e^{2πin·(x−y)}/(4π²||n||²).   (2.6.5)

Assuming that at least one set of points is randomly generated, we introduce the correlation function

  C_d^{PP}(x − y) := ⟨∇ϕ(x)·∇ϕ(y)⟩ = ∫∫ ∇_z G_d(z−x)·∇_w G_d(w−y) ⟨δρ(z)δρ(w)⟩ d^dz d^dw,   (2.6.6)

where we used the periodic boundary conditions and eq. (2.6.3), with

  δρ(x) := ρ_R(x) − ρ_B(x) = (1/N) Σᵢ [δ^{(d)}(x − rᵢ) − δ^{(d)}(x − bᵢ)],   (2.6.7)

the empirical measures being (2.5.3); the average ⟨•⟩ is intended over all possible instances as usual. As above, we distinguish two different cases.

PP Ebmp — The points of R and B are uniformly random in Ω_d. In this case we obtain

  ⟨δρ(x)δρ(y)⟩ = (2/N) [δ^{(d)}(x − y) − 1],   (2.6.8)

and therefore the correlation function is

  C_d^{PP}(x − y) = −(2/N) G_d(x − y).   (2.6.9)
gP Ebmp — In the gP Ebmp we suppose that N = L^d for some L ∈ ℕ and that one set of points, e.g. R = {rᵢ}, is fixed on the vertices of a hypercubic lattice, {rᵢ}_{i=1,…,N} = {k/L | k ∈ [0,L)^d ∩ ℕ^d}, whilst B ⊂ Ω_d is obtained as before from randomly generated points. We have

  ⟨δρ(x)δρ(y)⟩ = (1/N) δ^{(d)}(x−y) + (N²−N)/N² + (1/N²) Σᵢⱼ δ^{(d)}(x−rᵢ) δ^{(d)}(y−rⱼ) − (1/N) Σᵢ [δ^{(d)}(x−rᵢ) + δ^{(d)}(y−rᵢ)].   (2.6.10)

In this case the correlation function is therefore

  C_d^{gP}(x − y) = −(1/N) G_d(x − y).   (2.6.11)

We will consider also the correlation function for the normalised transport field, i.e.

  c_d^{PP}(x − y) := ⟨σ(x)·σ(y)⟩,   (2.6.12)

in which the correlation between the values of the normalised transport field

  σ(x) := µ(x)/||µ(x)|| = ∇ϕ(x)/||∇ϕ(x)||   (2.6.13)

at different positions is evaluated. Note that σ takes values on the unit sphere in ℝ^d. To compute the correlation function (2.6.12) for the normalised field in the PP case, we assume a Gaussian behaviour for the joint probability distribution of two values of the optimal transport field, and therefore

  c_d^{PP}(x − y) = ∫∫ (µ₁·µ₂)/(||µ₁|| ||µ₂||) e^{−½ µᵀ·Σ⁻¹(x,y)·µ} / [2π √(det Σ)]^d d^dµ₁ d^dµ₂,   (2.6.14)

where µ = (µ₁, µ₂)ᵀ and Σ(x,y) is the covariance matrix

  Σ(x,y) := ( ⟨µ(x)·µ(x)⟩  ⟨µ(x)·µ(y)⟩ ; ⟨µ(y)·µ(x)⟩  ⟨µ(y)·µ(y)⟩ ) ≡ ( C_d^{PP}(0)  C_d^{PP}(x−y) ; C_d^{PP}(x−y)  C_d^{PP}(0) ).   (2.6.15)

For d ≥ 2 (the case d = 1 was studied in [10]), introducing

  A := C_d^{PP}(0)/det Σ(x,y),   B := C_d^{PP}(x−y)/det Σ(x,y),   (2.6.16)

observe that B/A → 0 for N → ∞, N C_d^{PP}(x) being finite for x ≠ 0 while N C_d^{PP}(0) ∼ N^{1−2/d} for d > 2 and N C_d^{PP}(0) ∼ ln N for d = 2. We have therefore, in the notation above,

  det Σ = [C_d^{PP}(0)]² [1 − (B/A)²]   (2.6.17)

and

  c_d^{PP}(x−y) = (2Γ²((d+1)/2)/(d Γ²(d/2))) (B/A) (1 − B²/A²)^{(d+1)/2} ₂F₁((d+1)/2, (d+1)/2; d/2+1; B²/A²)
   --(N→∞, B/A→0)--> (2/d) (Γ((d+1)/2)/Γ(d/2))² C_d^{PP}(x−y)/⟨E_N^{(2,PP)}(d)⟩.   (2.6.18)

In the previous expression we have introduced the hypergeometric function

  ₂F₁(a, b; c; z) := Σ_{n=0}^∞ (a)ₙ(b)ₙ/(c)ₙ zⁿ/n!,   (a)ₙ := Γ(a+n)/Γ(a).   (2.6.19)

Observe that the corresponding correlation function for the normalised field in the gP case has the same expression, i.e.

  c_d^{gP}(x−y) = (2/d) (Γ((d+1)/2)/Γ(d/2))² C_d^{gP}(x−y)/⟨E_N^{(2,gP)}(d)⟩.   (2.6.20)

Finally, for d ≥ 2 we can compute also the so-called wall-to-wall correlation function for the PP case,

  W_d^{PP}(r) := ∫₀^1 ∏_{i=2}^d dxᵢ c_d^{PP}(r, x₂, …, x_d) = −(4/d) (Γ((d+1)/2)/Γ(d/2))² G₁(r)/(N⟨E_N^{(2,PP)}(d)⟩),   (2.6.21)

and similarly for the gP case,

  W_d^{gP}(r) := ∫₀^1 ∏_{i=2}^d dxᵢ c_d^{gP}(r, x₂, …, x_d) = −(2/d) (Γ((d+1)/2)/Γ(d/2))² G₁(r)/(N⟨E_N^{(2,gP)}(d)⟩).   (2.6.22)

In the following we consider explicitly the cases d = 1 and d = 2 and we verify numerically the results presented above.

One dimensional case — In Section 2.4 we discussed an exact solution of the one dimensional matching problem in the grid–Poisson case. Let us consider now the PP case (as always, similar considerations can be carried out for the gP case). We have

  G₁(r) = −Σ_{n≠0} e^{2πinr}/(4π²n²) = −1/12 + (|r|/2)(1 − |r|).   (2.6.23)

It follows from eq. (2.6.9) that

  C₁^{PP}(x − y) = (1/N) [1/6 − |x−y|(1 − |x−y|)];   (2.6.24)

Figure 2.6.1.: Laplacian Green's function on T².

moreover, note that the average total cost of the optimal matching is given by

  ⟨E_N^{(2,PP)}(1)⟩ = C₁^{PP}(0) = 1/(6N).   (2.6.25)
In the gP case the results of Section 2.4 are easily recovered; Boniolo, Caracciolo, and Sportiello [10] obtained the correlation function for the normalised transport field as

  c₁^{PP}(x) = c₁^{gP}(x) = (2/π) arctan[ (1 − 6x(1−x)) / √(12x(1−x)(1 − 3x(1−x))) ].   (2.6.26)

Two dimensional case — For d = 2, denoting by x = (x₁, x₂) and y = (y₁, y₂), we have [30]

  G₂(x − y) = −Σ_{n≠0} e^{2πin·(x−y)}/(4π²||n||²) = (1/2π) ln| 2π^{3/4} ϑ₁(πz|i)/Γ(1/4) |_{z=(x₁−y₁)+i(x₂−y₂)} − (x₂−y₂)²/2,   (2.6.27)

where, denoting by q := e^{iπτ},

  ϑ₁(z|τ) := 2q^{1/4} Σ_{n=0}^∞ (−1)ⁿ q^{n(n+1)} sin[(2n+1)z]   (2.6.28)

is the first Jacobi theta function. From eq. (2.6.9) we have simply

  C₂^{PP}(x) = −(2/N) [ (1/2π) ln| 2π^{3/4} ϑ₁(πz|i)/Γ(1/4) |_{z=x₁+ix₂} − x₂²/2 ].   (2.6.29)

In the gP case, we have as usual

  C₂^{gP}(x − y) = ½ C₂^{PP}(x − y).   (2.6.30)

Observe that the previous expressions contain no free parameters, and therefore a direct comparison with numerical data is possible. We present our numerical results, both for the gP case and the PP case, in fig. 2.6.2a. The average optimal cost for the PP Ebmp is given by ⟨E_N^{(2,PP)}(2)⟩ = C₂^{PP}(0); the right hand side of (2.6.29) is divergent for x → 0, but, as explained in Section 2.5, a proper regularisation leads to the correct scaling,

  C₂^{PP}(0) = ln N/(2πN) + β_PP/N + o(1/N),   β_PP = 0.1332(5).   (2.6.31)

A numerical fit of the optimal costs of the gP Ebmp for d = 2 gives

  C₂^{gP}(0) = (1/2) ln N/(2πN) + β_gP/N + o(1/N),   β_gP = 0.3758(5).   (2.6.32)

The correlation function (2.6.12) for the normalised matching field in the PP case has the expression (2.6.18),

  c₂^{PP}(x − y) = (π/4) C₂^{PP}(x − y)/C₂^{PP}(0).   (2.6.33)

Observe that the only free parameter in this quantity is C₂^{PP}(0): inserting the value obtained by Caracciolo et al. [14], eq. (2.6.31), we obtain the theoretical prediction of fig. 2.6.2b, where we also present numerical results for c₂(x,y) showing the agreement with the analytical prediction. As remarked above, the form of the correlation function for the normalised transport field in the gP case, c₂^{gP}(x−y), is the same as in the PP case, the derivation of eq. (2.6.18) being independent of the explicit form of C₂^{gP}(x). Using the value (2.6.32) in eq. (2.6.20) for d = 2 we obtained the theoretical curve depicted in fig. 2.6.2b, where, once again, an excellent agreement with numerical data is found.

Finally, let us compute the wall-to-wall correlation function for the PP Euclidean matching problem on the flat torus. The theoretical prediction is given by eq. (2.6.21),

  W₂^{PP}(r) = −π G₁(r)/(2N C₂^{PP}(0)).   (2.6.34)

In the gP case, instead, we have

  W₂^{gP}(r) = −π G₁(r)/(4N C₂^{gP}(0)).   (2.6.35)

Numerical results, both for the PP case and for the gP case, are presented in figs. 2.6.2c and 2.6.2d. Once again, observe that the values of the average optimal cost in the corresponding cases, eq. (2.6.31) and eq. (2.6.32), completely fix the expression of the wall-to-wall correlation function.

Three-dimensional case — For d = 3, eq. (2.7.24) and eq. (2.6.9) give

  C₃^{PP}(x − y) = (1/2π²N) Σ_{n∈ℤ³\{0}} e^{2πin·(x−y)}/||n||².   (2.6.36)

Clearly the previous function cannot be represented in a single plot, but we computed the correlation function for a system of size N = 8000, averaging over 10⁴ realisations. From the correlation function C₃^{PP}(x), the wall-to-wall correlation function can be obtained as before, in the form

  W₃^{PP}(r) = −16 G₁(r)/(3πN C₃^{PP}(0)).   (2.6.37)
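Predictions such as (2.6.34) can be probed at modest sizes. The following is a hedged sketch (ours, not the thesis code) for the two dimensional PP case: it estimates the wall-to-wall correlation by binning σ·σ over the separation along one axis (which averages over the transverse coordinate) and compares with −πG₁(r)/(2N C₂(0)), using the measured mean cost as C₂(0); sizes and bin counts are arbitrary choices.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(1)
N, samples, nbins = 400, 100, 25
acc, cnt, cost = np.zeros(nbins), np.zeros(nbins), 0.0

for _ in range(samples):
    R, B = rng.random((N, 2)), rng.random((N, 2))
    diff = R[:, None, :] - B[None, :, :]
    diff -= np.rint(diff)                 # periodic signed differences
    C = (diff ** 2).sum(axis=2)
    i, j = linear_sum_assignment(C)
    cost += C[i, j].mean() / samples      # estimate of C2(0), the mean cost
    mu = diff[i, j]                       # transport field at the red points
    sigma = mu / np.linalg.norm(mu, axis=1, keepdims=True)
    x = R[i, 0]
    dx = np.abs(x[:, None] - x[None, :])
    dx = np.minimum(dx, 1.0 - dx)         # torus separation along one axis
    corr = sigma @ sigma.T
    w = np.ones_like(corr)
    np.fill_diagonal(corr, 0.0); np.fill_diagonal(w, 0.0)  # drop self-pairs
    b = np.minimum((dx / 0.5 * nbins).astype(int), nbins - 1)
    np.add.at(acc, b, corr); np.add.at(cnt, b, w)

r = (np.arange(nbins) + 0.5) * 0.5 / nbins
G1 = -1.0 / 12.0 + r * (1.0 - r) / 2.0
print(np.c_[r, acc / cnt, -np.pi * G1 / (2 * N * cost)])   # data vs (2.6.34)
```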
As in the previous cases, C₃^{PP}(0) can be evaluated from the cost fit: C₃^{PP}(0) = 0.66251(2) N^{−2/3} − 0.45157…/N (see Section 2.5). Following the same procedure as in the PP case, we can compute the wall-to-wall correlation function on the unit hypertorus in the three-dimensional case for the gP matching problem as well. Reproducing the computations of the d = 2 case, we have

  W₃^{gP}(r) = −8 G₁(r)/(3πN C₃^{gP}(0)).   (2.6.38)

We evaluated C₃^{gP}(0) from the cost scaling, obtaining

  C₃^{gP}(0) = 0.4893(4) N^{−2/3} − 0.23(5)/N.   (2.6.39)

The prediction obtained and the numerical data are presented in fig. 2.6.3.

Figure 2.6.2.: Correlation functions for the Euclidean bipartite matching problem in two dimensions and numerical results. (a) Sections C₂^{PP}(r₁, 0) (N = 10⁴) and C₂^{gP}(r₁, 0) (N = 3600) with the corresponding theoretical predictions. (b) Sections c₂^{PP}(r₁, 0) (N = 10⁴) and c₂^{gP}(r₁, 0) (N = 3600) with the theoretical predictions, eqs. (2.6.18) and (2.6.20); note that the two theoretical curves almost overlap. (c) Wall-to-wall correlation function for the PP matching, N = 10⁴, on the unit flat torus; the continuous line is the analytical prediction. (d) The same for the gP matching, N = 3600.

Figure 2.6.3.: Wall-to-wall correlation function in three dimensions for the PP matching and the gP matching, d = 3, N = 8000, on the unit flat hypertorus. We attribute the discrepancy between numerical data and theoretical prediction for r → 0 in the PP case to binning effects.

2.7. Functional approach

The previous results suggest that the Ebmp with quadratic cost, in the large N limit, behaves as a Gaussian free theory on the d-dimensional flat hypertorus, in such a way that the correlation function of the matching map is directly related to the free propagator of the theory itself. To properly prove this fact, let us consider the Ebmp on the unit hypercube and, for the sake of clarity, restate the problem once again. We suppose that we are working on a complete bipartite graph K_{N,N} = Graph(V, U; E) and that we have a map Φ sending each vertex v ∈ V ∪ U to a point Φ(v) ∈ Ω, randomly generated on a certain domain Ω (with given boundary conditions) with a certain probability density ρ(x). In particular, we will use the following notation:

  vᵢ ∈ V ↦ rᵢ := Φ(vᵢ) ∈ Ω,   (2.7.1a)
  uᵢ ∈ U ↦ bᵢ := Φ(uᵢ) ∈ Ω.   (2.7.1b)

We define the empirical measures

  ρ_R(x) := (1/N) Σᵢ δ^{(d)}(x − rᵢ),   (2.7.2a)
  ρ_B(y) := (1/N) Σᵢ δ^{(d)}(y − bᵢ).   (2.7.2b)

A perfect matching m = Graph(V_m; E_m) on K_{N,N} can be represented by an N × N square matrix M = (m_ij)_{ij} such that

  m_ij = 1 if (vᵢ, uⱼ) ∈ E_m, 0 otherwise;   Σᵢ m_ij = Σⱼ m_ij = 1.   (2.7.3)
We define the following functional:

  E_N^{(p)}[M; Φ] := (1/N) Σ_{i,j} m_ij ||bⱼ − rᵢ||^p = ∫∫_Ω ||x − y||^p π(x, y) d^dx d^dy = ∫_Ω ||µ(x)||^p ρ_R(x) d^dx,   (2.7.4)

where we have introduced

  π(x, y) := (1/N) Σ_{i,j} m_ij δ^{(d)}(x − rᵢ) δ^{(d)}(y − bⱼ)   (2.7.5)

and the transport vector µ(x) on Ω such that

  µ(r_k) = Σⱼ m_kj (bⱼ − r_k),   ∀k = 1, …, N.   (2.7.6)

In the previous expressions, ||•|| is the Euclidean norm on the considered domain, taking into account the imposed boundary conditions. Observe that we can write

  π(x, y) = δ^{(d)}(y − x − µ(x)) ρ_R(x) = δ^{(d)}(x − y − µ⁻¹(y)) ρ_B(y) = δ^{(d)}(y − x − µ(x)) ρ_B(y).   (2.7.7)

The constraints (2.7.3) can be expressed in terms of π as

  ∫_Ω π(x, y) d^dx = ρ_B(y),   (2.7.8a)
  ∫_Ω π(x, y) d^dy = ρ_R(x),   (2.7.8b)

or, alternatively, in terms of µ(x), substituting the expression (2.7.7) in the previous relations. Observe however that, if (2.7.7) is adopted, only one constraint is needed: the condition ∫_Ω π(x, y) d^dy = ρ_R(x) is automatically satisfied, while

  ∫_Ω π(x, y) d^dx = ∫_Ω δ^{(d)}(y − x − µ(x)) ρ_R(x) d^dx = ρ_B(y)   (2.7.9)

is the non-trivial condition. We can now introduce a partition function for our problem in the form

  Z[β, N; Φ] := Σ_M e^{−βN E_N^{(p)}[M; Φ]}
   = Σ_{{m_ij=0,1}} ∏ⱼ δ(1 − Σᵢ m_ij) ∏ᵢ δ(1 − Σⱼ m_ij) e^{−β Σ_{ij} m_ij ||bⱼ − rᵢ||^p}
   ≃ −β²N² ∫∫∫_{π>0} [Dπ Du Dv] e^{−βN S_π[π,u,v]}
   ≃ iβN ∫∫ [Dµ Dλ] e^{−βN S_µ[µ,λ]},   (2.7.10)

where we introduced the two actions

  S_π[π, u, v] := ∫∫_Ω (||x − y||^p − u(x) − v(y)) π(x, y) d^dx d^dy + ∫_Ω [u(x)ρ_R(x) + v(x)ρ_B(x)] d^dx   (2.7.11a)

and

  S_µ[µ, λ] := ∫_Ω [||µ(x)||^p ρ_R(x) − λ(x)ρ_B(x)] d^dx + ∫∫_Ω λ(y) δ^{(d)}(y − x − µ(x)) ρ_R(x) d^dx d^dy
           = ∫_Ω [||µ(x)||^p ρ_R(x) + λ(x)%(x)] d^dx + ∫_Ω ρ_R(x) µ(x)·∇λ(x) d^dx + s[µ, λ],   (2.7.11b)

where we have introduced

  %(x) := ρ_R(x) − ρ_B(x)   (2.7.12)

and s[µ, λ] collects the higher-order non-linear terms of the Taylor expansion around µ = 0. Observe that we have implicitly relaxed the condition m_ij ∈ {0,1}, requiring only Σᵢ m_ij = Σⱼ m_ij = 1.

Let us use the saddle point approximation: for S_π we have

  δS_π/δπ(x,y) = ||x − y||^p − u(x) − v(y) = 0,   (2.7.13a)
  δS_π/δu(x) = ρ_R(x) − ∫_Ω π(x, y) d^dy = 0,   (2.7.13b)
  δS_π/δv(x) = ρ_B(x) − ∫_Ω π(y, x) d^dy = 0.   (2.7.13c)

Let us denote by π*, u*, v* a solution of the previous set of equations; observe that in the functional integration u and v are supposed imaginary, whilst at the saddle point u* + v* is real. The saddle point equations for the functional S_µ can be written instead as

  δS_µ/δµ(x) = p µ(x)||µ(x)||^{p−2} ρ_R(x) − ρ_R(x)∇λ(x) + δs/δµ(x) = 0,   (2.7.14a)
  δS_µ/δλ(x) = %(x) − ∇·(ρ_R(x)µ(x)) + δs/δλ(x) = 0,   (2.7.14b)

where we supposed that either periodic boundary conditions are imposed on Ω or µ vanishes on its boundary. In the thermodynamic limit, i.e. for N → ∞, we expect that ||µ(x)|| → 0 ∀x ∈ Ω, both empirical measures converging weakly to the same probability density ρ(x): in this limit it is more convenient to use the functional S_µ, neglecting the higher-order contribution s to the action. Moreover, in the βN → ∞ limit we can perform a saddle point approximation: the saddle point is given by the solutions µ₀(x), λ₀(x) of eq. (2.7.14), and in particular we obtain

  ⟨µ(x)·µ(y)⟩ = µ₀(x)·µ₀(y).   (2.7.15)

To properly perform the approximation, we have to take into account that ||µ|| ∼ λ ∼ %. For the sake of simplicity, let us consider the quadratic case, p = 2.
In the spirit of the saddle point approximation, we therefore neglect the non-quadratic terms as higher-order corrections in the inspected limit. The approximate action³ is

  S_µ[µ, λ] = ∫_Ω [||µ(x)||² ρ(x) + λ(x)%(x)] d^dx + ∫_Ω ρ(x) µ(x)·∇λ(x) d^dx.   (2.7.18)

³ We observe here that the functional reproduces the correct discrete problem in the limit of validity of the approximation

  µ(rᵢ)·∇λ(x)|_{x=rᵢ} ≈ Σⱼ m_ij (λ(bⱼ) − λ(rᵢ)),   i = 1, …, N,   (2.7.16)

for a given function λ: Ω → ℝ: for d > 2 and ρ(x) = 1, for example, the optimal transport vector scales as ||µ|| ∼ N^{−1/d}, and therefore the error on the previous relation is o(N^{−2/d}). If we assume that eq. (2.7.16) holds, the partition function of the discrete case is recovered as

  Z = iβN ∫ [DµDλ] exp( −βN ∫_Ω [||µ(x)||^p ρ_R(x) + λ(x)%(x) + ρ_R(x)µ(x)·∇λ(x)] d^dx )
    = Σ_{{m_ij}} ∏ᵢ δ(Σⱼ m_ij − 1) δ(Σⱼ m_ji − 1) e^{−β Σᵢ m_ij ||bⱼ − rᵢ||^p},   (2.7.17)

up to the relaxed constraint m_ij ∈ {0,1}.

In this approximation the Euler–Lagrange equations greatly simplify:

  %(x) = ∇·(ρ(x)µ(x)),   (2.7.19a)
  µ(x) = i∇λ(x)/2.   (2.7.19b)

The same linearisation procedure can be performed in the functional S_π, obtaining the same result: we can indeed adopt the expression

  π(x, y) = δ^{(d)}(y − x) ρ_R(x) − ρ(x)µ(x)·∇_y δ^{(d)}(y − x).   (2.7.20)

Substituting in eq. (2.7.11a) and observing that

  ∫_Ω u(x) [∫_Ω π(x, y) d^dy − ρ_R(x)] d^dx = 0,
  ∫_Ω v(y) [∫_Ω π(x, y) d^dx − ρ_B(y)] d^dy = ∫_Ω v(x)%(x) d^dx + ∫_Ω ρ(x)µ(x)·∇v(x) d^dx,   (2.7.21)

we reproduce the contribution in eq. (2.7.18) up to the identification v(x) ≡ λ(x). As suggested by the notation, the Lagrange multipliers therefore correspond to the functions u, v introduced by Kantorovič in his dual formulation of the transport problem, presented in Section 2.3. Remarkably, in eq. (2.7.19b) we recover the exact result (2.3.17) for the optimal transport map; we also justify the scaling ansatz proposed in Section 2.5, generalised in eq. (2.7.19a). We can therefore compute directly, by Gaussian integration, the correlation function

  ⟨µ(x)·µ(y)⟩ = ⟨∇ϕ(x)·∇ϕ(y)⟩/4,   (2.7.22)

where we denoted by ϕ(x) the solution of the following equation on the domain:

  ∇·[ρ(x)∇ϕ(x)] = 2%(x).   (2.7.23)

Observing that ∫_Ω %(x) d^dx = 0, we introduce the Green's function G_ρ(x, y) on Ω defined by

  ∇_y·[ρ(y)∇_y G_ρ(x, y)] = δ^{(d)}(x − y) − Σᵢ φ₀ⁱ(x)φ₀ⁱ(y)/∫_Ω |φ₀ⁱ(z)|² d^dz,   (2.7.24)

where the φ₀ⁱ(y) are the zero-mode eigenfunctions,

  ∇_x·[ρ(x)∇_x φ₀ⁱ(x)] = 0.   (2.7.25)

We can write an explicit expression for ϕ(x) as

  ϕ(x) = 2 ∫_Ω G_ρ(x, y)%(y) d^dy,   (2.7.26)

from which we obtain

  ⟨µ(x)·µ(y)⟩ = ∫∫_Ω %(z)%(w) ∇_x G_ρ(x, z)·∇_y G_ρ(y, w) d^dz d^dw.   (2.7.27)

Averaging over the disorder, we easily obtain

  ⟨µ(x)·µ(y)⟩ = ∫∫_Ω ⟨%(z)%(w)⟩ ∇_x G_ρ(x, z)·∇_y G_ρ(y, w) d^dz d^dw,   (2.7.28)

where ⟨•⟩ denotes the average over all instances and

  ⟨%(z)%(w)⟩ = (2/N) ρ(z) [δ^{(d)}(z − w) − ρ(w)].   (2.7.29)

Therefore, up to higher order corrections, we have

  ⟨µ(x)·µ(y)⟩ = (2/N) ∫_Ω ρ(z) ∇_x G_ρ(x, z)·∇_y G_ρ(y, z) d^dz − (2/N) ∫∫_Ω ρ(z)ρ(w) ∇_x G_ρ(x, z)·∇_y G_ρ(y, w) d^dz d^dw.   (2.7.30)

The previous expressions provide a recipe for the calculation of the average optimal cost and of the correlation function in the random Euclidean bipartite matching problem with quadratic cost. Observe that the distribution from which the points are extracted is assumed completely general.
2.7.1. Uniform density on the hypertorus

If ρ(x) = 1 and Ω ≡ T^d, the previous expressions greatly simplify. Observe indeed that in this case G_ρ(x, y) ≡ G_d(x − y), where G_d(x − y) is the Green's function of the Laplacian on the unit flat hypertorus T^d:

  ∇²_y G_d(x − y) = δ^{(d)}(x − y) − 1.   (2.7.31)

Under these hypotheses we have, up to o(1/N) terms,

  ⟨µ(x)·µ(y)⟩ = −(2/N) G_d(x − y) + o(1/N),   (2.7.32)

where we used the fact that, if ρ(x) = 1,

  ⟨%(z)%(w)⟩ = (2/N) [δ^{(d)}(z − w) − 1].   (2.7.33)

We recover therefore the same expression already obtained through the scaling ansatz.

2.7.2. Matching problem on the line

Let us now consider the case Ω = ℝ with a certain probability density ρ(x) on it, ∫_ℝ ρ(x) dx = 1, lim_{x→±∞} ρ(x) = lim_{x→±∞} ∂ρ(x)/∂x = 0, ρ(x) > 0 ∀x ∈ Ω. We assume Neumann boundary conditions for the scalar potential ϕ(x),

  lim_{x→±∞} ∂_x ϕ(x) = 0   (2.7.34)

(i.e., the transport field vanishes at large distances). It follows that

  ∂_x ϕ(x) = (2/ρ(x)) ∫_{−∞}^x %(y) dy,   (2.7.35)

and therefore

  ⟨µ(x)µ(y)⟩ = ∫_{−∞}^x dz₁ ∫_{−∞}^y dz₂ ⟨%(z₁)%(z₂)⟩/(ρ(x)ρ(y)) = (2/N) [Φ_ρ(min{x,y}) − Φ_ρ(x)Φ_ρ(y)]/(ρ(x)ρ(y)),   (2.7.36)

where we introduced the cumulative distribution function

  Φ_ρ(x) := ∫_{−∞}^x ρ(ξ) dξ.   (2.7.37)

If we assume, for example, a Gaussian distribution,

  ρ(x) = (1/√(2π)σ) exp(−x²/2σ²),   (2.7.38)

we obtain the theoretical prediction for ⟨µ(x)µ(0)⟩ plotted, for σ = 1, in fig. 2.7.1; the numerical data are in excellent agreement with it.

Figure 2.7.1.: Numerical result and theoretical prediction for the correlation function N⟨µ(x)µ(0)⟩ of the optimal Euclidean matching on the line between two sets of N = 5000 points independently and identically distributed according to a standard Gaussian distribution. The numerical data are obtained averaging over 5000 realisations.

Observe also that eq. (2.7.36) provides the correct correlation function for the Euclidean matching problem on the segment Ω = [0,1] with uniform distribution. Assuming indeed

  ρ(x) = 1 for x ∈ [0,1], 0 otherwise,   (2.7.39)

we have

  N⟨µ(x)µ(y)⟩ = 2(min{x,y} − xy) for (x,y) ∈ [0,1]², 0 otherwise.   (2.7.40)

The covariance of a Brownian bridge process naturally appears, as shown in [10] (observe however that in the cited work one set of points is supposed fixed, and therefore the factor 2 is not present).
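Eq. (2.7.36) is easy to test numerically: on the line with convex cost the optimal matching is the order-preserving one, so no assignment solver is needed. A minimal sketch (ours; the probe points are an arbitrary choice):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
N, samples = 5000, 2000
x, y = 0.0, 1.0                      # probe points

est = 0.0
for _ in range(samples):
    r = np.sort(rng.standard_normal(N))
    b = np.sort(rng.standard_normal(N))
    mu = b - r                       # transport field at the sorted red points
    ix = min(np.searchsorted(r, x), N - 1)
    iy = min(np.searchsorted(r, y), N - 1)
    est += mu[ix] * mu[iy] / samples

# prediction (2.7.36) with the Gaussian cumulative and density
pred = 2.0 / N * (norm.cdf(min(x, y)) - norm.cdf(x) * norm.cdf(y)) \
       / (norm.pdf(x) * norm.pdf(y))
print(N * est, N * pred)             # e.g. N*pred = 3.14 at x = y = 0
```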
2.8. Emmp via replicas

Let us now consider the problem of the Emmp in a replica approach. The computation follows directly the method used for the Rmmp, but in this context a subtler correlation between the different weights appears, due to the presence of the Euclidean structure. We present here a detailed computation for this case: we consider the weighted matching problem on the complete graph K_{2N} = Graph(V; E), whose vertices are associated to points independently and uniformly generated on the unit hypercube Ω_d,

  vᵢ ∈ V ↦ Φ(vᵢ) = rᵢ ∈ Ω_d,   (2.8.1)

with weights given by

  w_ij := w((vᵢ, vⱼ)) = ||rᵢ − rⱼ||^p,   p ∈ ℝ⁺.   (2.8.2)

We want to find a perfect matching m on K_{2N} such that the functional

  E_N^{(p,m)}[m, Φ] := (1/N) Σ_{e∈m} w(e)   (2.8.3)

is minimised. In particular, introducing the adjacency matrix M of the matching m,

  M = (m_ij ∈ {0,1})_{1≤i,j≤2N},   m_ij = m_ji,   Σᵢ m_ij = 1,   (2.8.4)

the functional above can be written as

  E_N^{(p,m)}[m, Φ] ≡ E_N^{(p)}[M; Φ] := (1/N) Σ_{i<j} m_ij w_ij.   (2.8.5)

In a celebrated paper, Mézard and Parisi [40] showed that the Emmp can be studied via a replica approach, considering the correlations between different weights as a perturbation of the purely random case and computing the very first corrections due to correlations among three different weights (two weights are always uncorrelated). We want to perform here a general computation following their perturbative approach. We start from the partition function

  Z[β, N; Φ] := Σ_M e^{−βN^{p/d+1} E_N^{(p)}[M;Φ]}
   = Σ_{{m_ij=0,1}} ∏_{i=1}^{2N} δ(1 − Σⱼ m_ij) exp(−βN^{p/d} Σ_{i<j} m_ij w_ij)
   = ∏_{i=1}^{2N} ∫₀^{2π} (e^{iλᵢ} dλᵢ/2π) ∏_{i<j} [1 + exp(−βN^{p/d} w_ij − iλᵢ − iλⱼ)].   (2.8.6)

In the previous expression we have properly rescaled the temperature, β → βN^{p/d}, to obtain a finite average cost in the large N limit (note that the average optimal cost scales as N^{−p/d} in any dimension in the Emmp). As in the purely random case, we have to average over the disorder, i.e. over all the possible maps Φ, and we therefore proceed via a replica trick, as in Section 1.4 for the Rmmp. The replicated partition function is

  Z^n[β, N; Φ] = ∏_{a=1}^n ∏_{i=1}^{2N} ∫₀^{2π} (e^{iλᵢᵃ} dλᵢᵃ/2π) ∏_{i<j} ∏_a [1 + exp(−βN^{p/d}w_ij − iλᵢᵃ − iλⱼᵃ)]
             = ∏_a ∏_i ∫₀^{2π} (e^{iλᵢᵃ} dλᵢᵃ/2π) ∏_{i<j} (1 + u_ij),   (2.8.7)

  u_ij := Σ_{r=1}^∞ e^{−rβN^{p/d}w_ij} Σ_{a₁<…<a_r} e^{−i Σ_{m=1}^r (λᵢ^{a_m} + λⱼ^{a_m})}.

In the Rmmp the average can be performed directly, the joint probability distribution of the weights factorising [37]. Here this is no longer true, due to the Euclidean constraint. Therefore we write the following:

  ∏_{i<j} (1 + u_ij) = 1 + Σ_{E=1}^∞ Σ'_{{e_k}_{k=1,…,E}⊂E} ∏_{k=1}^E u_{e_k},   (2.8.8)

where the primed sum runs over all the (N(2N−1) choose E) sets of E distinct edges. In other words, the sum runs over all the possible subgraphs g ⊆ K_{2N}. In particular, let us suppose that g = Graph(V_g; E_g) ⊆ K_{2N} is a generic connected subgraph with V = |V_g| ≤ 2N vertices and E = |E_g| edges. Then we have to evaluate ⟨∏_{e∈E_g} u_e⟩. The average must be performed using the following joint distribution⁴:

  ρ_g(w(e₁), …, w(e_E)) = ∫ ∏_{i=1}^E d^dzᵢ δ(w(eᵢ) − ||zᵢ||^p) ∏_{f∈F(g)} δ^{(d)}( Σ_{i: eᵢ∈E_f} zᵢ ),   (2.8.9)

where F(g) is the set of cycles f := Graph(V_f; E_f) surrounding the faces of g.⁵ To evaluate ⟨∏_{k=1}^E u_{e_k}⟩ we have to compute integrals of the form

  ∫ ∏_{k=1}^E ( dw(e_k) e^{−r_k βN^{p/d} w(e_k)} ) ρ_g(w(e₁), …, w(e_E)),   r₁, …, r_E ∈ ℕ.   (2.8.10)

In the relevant limit of short links, i.e. taking into account that the occupied links are of order N^{−p/d}, the previous quantity can be rewritten as

  ∫ ∏_{k=1}^E ( dw(e_k) e^{−r_k β w(e_k)} ) ρ_g(w(e₁), …, w(e_E)) =: W_β(g; r₁, …, r_E)/N^{E−L},   (2.8.11)

where L := |F(g)| is the number of faces of the graph. Using the Euler formula, V − E + L = 1 − 2𝔤, where 𝔤 is the genus of the graph, the contribution of the subgraph g scales as 1/N^{V−1+2𝔤}.

⁴ Note that ρ_g({αw}) = α^{d(V−1)/p − E} ρ_g({w}).
⁵ We consider here also cycles of length 3.

Now we must evaluate the number of distinct subgraphs of K_{2N} homeomorphic to g. There are (2N)!/(2N−V)! ∼ (2N)^V different ordered subsets of V vertices in K_{2N}; the graph being complete, to each set we can associate a subgraph homeomorphic to g. However, if g has some additional symmetry under the exchange of vertices, we overcount the number of subgraphs, and a correcting factor σ(g), taking this symmetry into account, must be introduced⁶:
note that this factor depends on the subgraph itself and not on the size N of the complete graph in which it is embedded. Finally, a given topology gives a contribution of order N^{1−2𝔤}. If we restrict ourselves to the leading order contribution, we must consider planar graphs, for which 𝔤 = 0 and the Euler formula holds in its simpler form V − E + L = 1.

⁶ Consider for example the square graph with vertices 1, 2, 3, 4. A given set {x₁, x₂, x₃, x₄} of points can be arranged on it in 4! ways, but only 3! are really different, due to the symmetry under the exchanges 1 ↔ 3 and 2 ↔ 4. For this graph σ(g) = 1/4.

To proceed further, note that for a generic connected subgraph g of the complete graph K_{2N}, if, e.g., v is a vertex separating g = Graph(V_g; E_g) into g₁ = Graph(V₁; E₁) and g₂ = Graph(V₂; E₂), V₁ ∩ V₂ = {v}, then ⟨∏_{e∈E_g} u_e⟩ = ⟨∏_{e∈E₁} u_e⟩ ⟨∏_{e∈E₂} u_e⟩. Remember that each connected graph can be decomposed into non-separable components in a unique manner [55]. For a generic subgraph g = Graph(V_g; E_g) of the complete weighted graph K_{2N} we can therefore introduce the Ursell function of g,

  U(g) := Σ_Π (−1)^{|Π|−1} (|Π|−1)! ∏_{π∈Π} ⟨∏_{e∈π} u_e⟩ ≡ ∏_{e∈E_g} (∂/∂z_e) ln⟨exp( Σ_{e∈E_g} z_e u_e )⟩ |_{z_e=0},   (2.8.12)

where Π is a partition of the set E_g into |Π| subsets: if g is separable, U(g) = 0. Indeed, in this case ∃! Π̃ such that

  ln⟨exp( Σ_{e∈E_g} z_e u_e )⟩ = Σ_{π∈Π̃} ln⟨exp( Σ_{e∈π} z_e u_e )⟩.   (2.8.13)

Finally, we can rewrite eq. (2.8.8), up to o(1/N) terms, as

  ⟨∏_{i<j} (1 + u_ij)⟩ ≈ exp[ Σ_{E=1}^∞ Σ_{[g,E]} σ(g) Σ'_{[E]} U(g) ],   (2.8.14)

where the exponent collects the non-separable planar diagrams weighted by their symmetry factors: 1/2 for the single edge, 1/6 for the triangle, 1/8 for the square, 1/10 for the pentagon, 1/4 for the first non-polygonal diagram, and so on. Here the primed sum Σ'_{[E=x]} runs over all possible subsets of E = x distinct edges, and Σ_{[g,E]} is a sum over all different planar topologies of non-separable planar graphs with E edges. The previous expression generalises an analogous one appearing in [40], in which E ≤ 3 was considered. We can further simplify our computation observing that, given a planar non-separable graph g,

  U(g) = ⟨∏_{e∈E_g} u_e⟩ − Σ_e ⟨u_e⟩ ⟨∏_{ẽ≠e} u_ẽ⟩ + …   (2.8.15)

The first term of the previous sum is of order O(1/N^{V−1}), as explained above; all other terms are of order O(1/N^{V+2|Π|−3}) (each "cut" increases the number of vertices by 2). Therefore at first order we can simply consider

  U(g) ≃ ⟨∏_{e∈E_g} u_e⟩.   (2.8.16)

Following Mézard and Parisi [40], let us now introduce, using a proper set of Lagrange multipliers q̂_{a₁…a_k}, the order parameter

  q_{a₁…a_k} := (1/2N) Σ_{i=1}^{2N} exp(−i Σ_{j=1}^k λᵢ^{a_j}),   1 ≤ k ≤ n,   (2.8.17)

symmetric under permutations of the replica indices. For a given value of E and a planar graph g = Graph(V_g; E_g), |V_g| = V, we have, in the notation above,

  Σ'_{[E]} U(g) = N Σ'_{(a¹)…(a^E)} ( W_β(g; {rᵢ}) / ∏_{k=1}^E r_k! ) ∏_{v∈V_g} 2q_{a(v)},   (2.8.18)

where (aⁱ) = (a₁ⁱ, …, a_{rᵢ}ⁱ) is a set of rᵢ replica indices, i = 1, …, E, and the primed sum runs over all different sets of replica indices. In the expression above we used also the notation

  a(v) = ∪_{k: e_k∈E(v)} (aᵏ),   (2.8.19)

where E(v) is the set of edges at v. Once again, restricting to the E = 3 contribution, the previous formula reduces to the results of Mézard and Parisi [40]. We are interested in the replica symmetric solution, in which q_{a₁…a_p} ≡ Q_p, and therefore

  Σ'_{[E]} U(g) = 2^V N Σ_{{r_e}_{e∈E_g}} ( W_β(g; {r_e}) / ∏_{e∈E_g} r_e! ) ς_n(g; {r_e}) ∏_{v∈V_g} Q_{r(v)},   r(v) := Σ_{j: e_j∈E(v)} r_j.   (2.8.20, 2.8.21)
In the previous expression we introduced the combinatorial coefficient ς_n(g; {r_e}), denoting the number of ways to select Σ_e r_e (not necessarily distinct) replica indices organised in V sets R_v, v = 1, …, V, in such a way that R_v ∩ R_{v'} = ∅ for all v' ∈ ∂v. Note that ς₀ = 0; we denote ς := lim_{n→0} ς_n/n. The averaged replicated partition function is therefore

  ⟨Z^n⟩ = ∫ dµ(q, q̂) e^{−NβS[β; q, q̂]},   (2.8.22a)

  −βS[β; q, q̂] := Σ(q) + 2 ln z(q̂) − 2 Σ_{k=1}^n (n choose k) q̂_k q_k,   (2.8.22b)

where dµ(q, q̂) := ∏_{k=1}^n ∫_{−∞}^{+∞} dq_k ∫_{−i∞}^{+i∞} dq̂_k and

  Σ(q) := Σ_{E=1}^∞ Σ_{[g,E]} 2^V σ(g) Σ_{{r_e}} ( W_β(g; {r_e}) / ∏_e r_e! ) ς_n(g; {r_e}) ∏_{v=1}^V q_{r(v)}
   --(n→0)--> n Σ_{E=1}^∞ Σ_{[g,E]} 2^V σ(g) Σ_{{r_e}} ( W_β(g; {r_e}) / ∏_e r_e! ) ς(g; {r_e}) ∏_v q_{r(v)}.   (2.8.23a)

The purely random-link contribution is related to the E = 1 term; omitting all other terms, we recover the Rmmp as a mean field approximation. Following [37],

  2 ln z(q̂) := 2 ln [ ∏_{a=1}^n ∫₀^{2π} (e^{iλᵃ} dλᵃ/2π) exp( Σ_{r=1}^n q̂_r Σ_{a₁<…<a_r} e^{−i Σⱼ λ^{a_j}} ) ]
   --(n→0)--> 2nβ ∫_{−∞}^{+∞} ( e^{−e^{βx}} − e^{−𝒢(x)} ) dx,   (2.8.23b)

where 𝒢(x) := G(βx),

  G(x) := Σ_{r=1}^∞ (−1)^{r−1} q̂_r e^{rx}/r!,   (2.8.24)

and we used the fact that

  (n choose k) --(n→0)--> n (−1)^{k−1}/k.   (2.8.25)

The saddle point equations can be written as

  Q_r = β ∫_{−∞}^{+∞} e^{βrx − 𝒢(x)} dx / (r−1)!,   (2.8.26a)

  𝒢(x) = Σ_{E=1}^∞ Σ_{[g,E]} 2^{V−1} σ(g) Σ_{{r_e}} ( W_β(g;{r_e}) / ∏_e r_e! ) ς_n(g;{r_e}) Σ_{i=1}^V ( Q_{r(vᵢ)} e^{r(vᵢ)βx} / Γ(r(vᵢ)) ) ∏_{j≠i} Q_{r(vⱼ)}
     = −2 (∂/∂x) ∫∫ ρ(w) e^{−𝒢(y)} J₀(2 e^{β(x+y−w)/2}) dw dy
       + Σ_{E=3}^∞ Σ_{[g,E]} 2^{V−1} σ(g) β Σ_{i=1}^V ∫ ∏_{e∈g} ρ_g({w}) dw(e) ∏_{v≠i} ∫ dx_v 𝒢'(x_v) e^{−𝒢(x_v)} (∂/∂xᵢ) K_g[{β(x_a + x_b − w_{ab})}] |_{xᵢ≡x},   (2.8.26b)

where J₀(x) is a Bessel function of the first kind. In the previous equation we introduced the following function, depending on the topology of the considered graph:

  K_g({x_e}) := Σ_{{r_e∈ℕ}} ς(g; {r_e}) ∏_e ( e^{r_e x_e} / r_e! ) / ∏_{v∈V} r(v)!.   (2.8.27)

We can finally write down a general expression for the free energy:

  −f̄(β) := lim_{N→∞, n→0} (⟨Z^n⟩ − 1)/(nNβ)
   = 2 ∫_{−∞}^{+∞} ( e^{−e^{βx}} − e^{−𝒢(x)} ) dx − ∫_{−∞}^{+∞} 𝒢(x) e^{−𝒢(x)} dx
     + Σ_{E=3}^∞ Σ_{[g,E]} 2^V σ(g) ∫ ∏_{e∈E} dw_e ( ρ_g({w_k})/β ) ∫ ∏_{v=1}^V dx_v 𝒢'(x_v) e^{−𝒢(x_v)} K_g[{β(x_a + x_b − w_{ab})}].   (2.8.28)

From the previous expression we have

  lim_N N^{p/d} ⟨E_N^{(p)}(d)⟩ := lim_N N^{p/d} ⟨min_M E_N^{(p)}[M; Φ]⟩ = lim_{β→∞} f̄(β).   (2.8.29)

2.8.1. Recovering the mean field approximation

In the mean field approximation only the E = 1 contribution is considered (i.e., the different weights on the complete graph are treated as identically and independently distributed). The saddle point equations become

  Q_r = β ∫_{−∞}^{+∞} e^{βrx − 𝒢(x)} dx / (r−1)!,   (2.8.30a)

  𝒢(x) = −2 (∂/∂x) ∫∫ ρ(w) e^{−𝒢(y)} J₀(2 e^{β(x+y−w)/2}) dw dy,   (2.8.30b)

where in the short-link limit ρ(w) ≃ (S_d/p) w^{d/p−1}, S_d := 2π^{d/2}/Γ(d/2) being the surface of the d-dimensional unit sphere as usual. In the large β limit we have

  J₀(2 e^{βx/2}) --(β→∞)--> θ(−x)   (2.8.31)

(in the sense of distributions), and therefore eq. (2.8.30b) becomes

  𝒢(x) = (4π^{d/2}/(pΓ(d/2))) ∫₀^∞ w^{d/p−1} e^{−𝒢(w−x)} dw.   (2.8.32)

The previous equation is the mean field equation for the function 𝒢, as already found in [37, 40], and can be solved numerically for given values of p, d. The mean field contribution to the average optimal cost is therefore obtained from eq. (2.8.28) in the β → ∞ limit:

  lim_N N^{p/d} ⟨E_N^{(p,mf)}(d)⟩ = ∫_{−∞}^∞ 𝒢(x) e^{−𝒢(x)} dx − 2 ∫_{−∞}^{+∞} ( θ(−x) − e^{−𝒢(x)} ) dx.   (2.8.33)
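Equations (2.8.32) and (2.8.33) are straightforward to treat numerically. The following is a hedged sketch (ours): a damped fixed-point iteration of (2.8.32) on a grid, checked against the closed-form solution available for p = d (eqs. (2.8.35)–(2.8.36) below); grid and iteration counts are arbitrary choices.

```python
import numpy as np
from math import gamma, pi

p = d = 2
c = 4 * pi ** (d / 2) / (p * gamma(d / 2))
x, h = np.linspace(-20.0, 20.0, 1601, retstep=True)
G = np.log1p(np.exp(x))                    # generic starting guess

# Kernel of the right-hand side of (2.8.32): w^{d/p-1} theta(w), w = x_k + y.
W = x[:, None] + x[None, :]
P = np.where(W > 0, np.abs(W) ** (d / p - 1), 0.0)

for _ in range(300):
    G = 0.5 * G + 0.5 * c * h * (P @ np.exp(-G))   # damped iteration

a = 2 * pi ** (d / 2) / gamma(d / 2 + 1)
print(np.abs(G - np.logaddexp(0.0, a * x)).max())  # ~0 for p = d, cf. (2.8.35)

cost = h * np.sum(G * np.exp(-G)) - 2 * h * np.sum((x < 0) - np.exp(-G))
alpha = pi ** (d / 2) / gamma(d / 2 + 1)           # alpha(d) = S_d / d
print(cost, pi ** 2 / (12 * alpha))                # cf. eq. (2.8.36)
```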
The previous equation and eq. (2.8.32) show that, up to a rescaling factor, the mean field solution for the Emmp with parameters p, d corresponds to the mean field solution of a Rmmp whose weight distribution behaves as ρ(w) ∼ w^{d/p−1} for w → 0. Observing that, for α > 0,

  f(x) = α ∫₀^∞ e^{−f(y−x)} dy ⇒ f(x) = ln(1 + e^{αx}),   (2.8.34)

for p = d we have the explicit solution

  𝒢(x) = ln[ 1 + exp( 2π^{d/2} x / Γ(d/2 + 1) ) ]   (2.8.35)

and the average optimal cost is

  lim_N N ⟨E_N^{(d,mf)}(d)⟩ = (1/α(d)) π²/12,   α(d) := S_d/d.   (2.8.36)

2.8.2. Polygonal approximation and zero temperature limit

Preliminaries — Let us now consider the polygonal corrections in the evaluation of the free energy: we will show that a formal correspondence between the polygonal contribution and the finite size corrections evaluated by Parisi and Ratiéville [46] for the Rmmp appears in the computation. Let us denote by p_E the polygon with E edges: for this graph σ(p_E) = 1/(2E). Moreover, we can write the saddle point action (2.8.22a) in the n → 0 limit as

  S[β; Q, Q̂] := S_mf[β; Q, Q̂] + Σ_{E=3}^∞ S_E[β; Q, Q̂]/(2E),   (2.8.37)

where

  −βS_mf[β; Q, Q̂] := 2nβ ∫_{−∞}^{+∞} ( e^{−e^{βx}} − e^{−𝒢(x)} ) dx − nβ ∫_{−∞}^{+∞} 𝒢(x) e^{−𝒢(x)} dx   (2.8.38)

is the well-known mean field contribution, whilst, denoting by

  ρ_E(w₁, …, w_E) := ∫ ∏_{i=1}^E d^dzᵢ δ(wᵢ − ||zᵢ||^p) δ^{(d)}( Σ_{i=1}^E zᵢ ),   (2.8.39)

the series terms are given by

  −βS_E[β; Q, Q̂] := 2^E Σ_{r₁,…,r_E} ς(p_E, {r_k}) ∏_{i=1}^E ∫ dwᵢ e^{−rᵢβwᵢ} ( Q_{rᵢ+rᵢ₊₁}/rᵢ! ) ρ_E(w₁, …, w_E)
   = (2^E S_d/(2π)^d) Σ_{r₁,…,r_E} ς(p_E, {r_k}) ∫₀^∞ k^{d−1} ∏_{i=1}^E ( Q_{rᵢ+rᵢ₊₁}/rᵢ! ) g_{rᵢ}(k) dk.   (2.8.40)

In the previous expression S_d = 2π^{d/2}/Γ(d/2) is the solid angle in d dimensions, and we have introduced

  g_r(k) := S_{d−1} ∫₀^∞ dz ∫₀^π dφ z^{d−1} sin^{d−2}φ e^{ikz cosφ − rβz^p} = S_d ∫₀^∞ z^{d−1} e^{−rβz^p} ₀F₁(−; d/2; −k²z²/4) dz.   (2.8.41)

In analogy with the computation of Parisi and Ratiéville [46] for the finite size corrections to the Rmmp, we can introduce the (2ⁿ−1) × (2ⁿ−1) matrix

  T_{αα'} := δ_{α∩α'=∅} Q_{|α|+|α'|} √( g_{|α|}(k) g_{|α'|}(k) ),   (2.8.42)

where α is an element of the power set of the replica indices, with cardinality |α|, and

  δ_{α∩α'=∅} = 1 if α ∩ α' = ∅, 0 otherwise.   (2.8.43)

The contribution of the polygon p_E can be written as

  −βS_E[β; Q, Q̂] = (2^E S_d/(2π)^d) ∫₀^∞ k^{d−1} tr T^E(k) dk.   (2.8.44)

To proceed further, we will attempt to diagonalize the operator T following the strategy of de Almeida and Thouless [3]. An eigenvector c_α of the matrix T must satisfy

  Σ_{γ: α∩γ=∅} Q_{|α|+|γ|} √( g_{|α|}(k) g_{|γ|}(k) ) c_γ = λ c_α.   (2.8.45)

We will look for eigenvectors c_α^q with q distinguished replicas, in the form

  c_α^q = 0 if |α| < q;   c_α^q = d_{|α|}^i if α contains q − i + 1 distinguished replica indices, i = 1, …, q+1.   (2.8.46)

If we consider q − 1 distinguished replicas, q ≥ 2, it can be proved [36] that the following orthogonality condition holds:

  Σ_{k=0}^{q−j} (q−j choose k) (n−q choose |α|−(k+j)) d_{|α|}^{q+1−(k+j)} = 0.   (2.8.47)

The orthogonality conditions provide a relation between all the different values d_{|α|}^i, showing that we can keep only one value, say d_{|α|}^1, as independent. Using this assumption, it can be shown [46] that the eigenvalues of the original matrix T can be evaluated diagonalising, in the n → 0 limit, the infinite-dimensional matrices

  N_{ab}^{(q)} = (−1)^b ( Γ(a+b)Γ(b) / (Γ(a)Γ(b−q+1)Γ(b+q)) ) Q_{a+b} √( g_a(k) g_b(k) ).   (2.8.48)

In particular, for q = 0 a direct computation gives

  N_{ab}^{(0)} = (n−a choose b) Q_{a+b} √( g_a(k) g_b(k) ) --(n→0)--> (−1)^b ( Γ(a+b)/(Γ(a) b!) ) Q_{a+b} √( g_a(k) g_b(k) ),   (2.8.49)
(2.8.49) and for q = 1 we obtain (1) Nab = p n−a b Qa+b ga (k)gb (k). b−n b (2.8.50) (q) Remarkably, limn→0 N (1) = N (0) . Computing the spectrum of the matrix Nab is equivalent to the computation of the spectrum of s gb+q (k) Γ(a + 1)Γ(b + q) (q) (q) Mab := (−1)a+b N ga+q (k) Γ(b + 1)Γ(a + q) b+q a+q (2.8.51) a+q Γ(a + b + 2q) = (−1) Qa+b+2q gb+q (k); Γ(a + 2q)b! n each eigenvalue of the previous matrix has multiplicity nq − q−1 in the spectrum of T . (q) The eigenvalue equation for M has the form Z 0 ∞ X (q) (q) (−1)r e(r+q)βu X e(r +q)βu−G (u) (q) q M c = β(−1) λc(q) = cr0 gr0 +q (k) d u. r rr 0 r 0 Γ(r + 2q) r0 ! r0 r 0 =0 (2.8.52) Defining now G (u) ∞ X e(r+q)βu− 2 (q) (q) φ (u; k) := cr gr+q (k) (2.8.53) r! r=0 the eigenvalue equation becomes λφ(q) (u; k) = (−1)q Z A(q) (u, v; k)φ(q) (v; k) d v, (2.8.54) where A(q) (u, v; k) := β e− G (u)+G (v) +qβ(u+v) 2 ∞ X (−1)r erβ(u+v) r=0 − = βSd e G (u)+G (v) +qβ(u+v) 2 Γ(r + 2q)r! gr+q (k) +∞ Z p ∞ − k 2 z 2 −qβzp X (−1)r erβ(u+v−z ) d−1 dz z e . 0 F1 d ; − 4 Γ(r + 2q)r! 2 r=0 0 (2.8.55) 2.8. Emmp via replicas 73 The relation between A(q) (u, v; k) and βSE can be explicitly written as Z∞ ∞ E 2E Sd X n n qE d−1 (q) ˆ − βSE [β; Q, Q] = k (−1) − tr A (u, v; k) d k. (2π)d q=0 q q−1 0 (2.8.56) The combinatorial coefficient in the n → 0 limit has the 1 n n n→0 − −−−→ −1 + n q q−1 n(−1)q 2q−1 q(1−q) form if q = 0, if q = 1, + o(n). if q > 1, (2.8.57) Sectors q ≥ 2 Let us consider the contribution of the q ≥ 2 sectors; in the n → 0 limit we have that the second line of (2.8.56) becomes Z∞ ∞ E 2E Sd X n n qE d−1 (q) (−1) − k tr A (u, v; k) dk (2π)d q=2 q q−1 0 2E Sd −−−→ n (2π)d n→0 ∞ X (−1) q(E+1) q=2 2q − 1 q(1 − q) Z∞ k d−1 tr A(q) (u, v; k) E dk 0 ∞ Z ∞ E 2E Sd X 4q − 1 (2q) d−1 A (u, v; k) k tr dk =n (2π)d 2q(1 − 2q) +(−1)E q=1 0 ∞ X Z∞ q=1 4q + 1 2q(2q + 1) k d−1 tr 0 E A(2q+1) (u, v; k) d k . (2.8.58) To perform the zero temperature limit, let us follow Parisi and Rati´eville [46] writing for q≥2 A (q) iβSd − G (u)+G (v) 2 (u, v; k) = e 2π +∞ Z 0 d 2 Z ` p −1 d ` − k2 ` p es(q,β;z,u+v−`) d z 0 F1 d ; − p 4 2 γ s(q, β; z, u + v − `) := βq(u + v − `) − z − 2q ln(−z) + eβ(u+v−`) , (2.8.59) z where γ is the contour in fig. 2.8.1 and we used the fact that ∞ X (−1)r eβ(q+r)x r=0 Γ(r + 2q)r! = i eβqx 2π Z e−z−2q ln(−z)+ eβx z d z. (2.8.60) γ To compute the β → ∞ limit, let us consider the saddle point equation for s, obtaining ( ∂s ∂z ∂s ∂` = 0, = 0, ( ⇒ zsp = −q, q `sp = u + v − 2 ln β . ∂ 2 s ∂ 2 s q ln2 q , = 0, = − sp sp t ∂z 2 z=z ∂`2 z=z `=` `=` sp sp ∂ 2 s ln q = . (2.8.61) z=z sp ∂z∂` t `=`sp 74 Chapter 2. Euclidean Matching Problems Moreover, observe that the saddle point has fixed position assuming that ln q = tβ for some t: we will consider this case, with q → ∞. Indeed, taking q fixed and assuming β → ∞, it is easily seen from eq. (2.8.58) that A(q) (u, v; k) → ∞ for u + v > 0, whilst, for u + v < 0 A(q) (u, v; k) → 0. Observing that only for u + v − 2t > 0 the saddle point is inside the range of integration, we have that Figure 2.8.1.: Path of integration γ . =z γε <z Ht,k (u, v) := lim A(q) (u, v; k) β→∞ q→∞ ln q=βt (2.8.62) Sd − G (u)+G (v) d,p 2 Θk (u + v − 2t) , e ≈ p where we introduced Θd,p k (x) := x d p −1 0 F1 − d 2 2 k2 x p ;− θ(x). 4 (2.8.63) For k = t = 0 the operator H0,0 has eigenvalue 1 with the eigenvector h10,0 (u) = G 0 (u) e− Observing that P∞ 1 q=2 βq → R +∞ 0 ( E (p, d) := G (u) 2 . (2.8.64) d t let us define 2E+1 Sd (2π)d RR ∞ 0 h i E k d−1 tr Hk,t dtdk E odd, (2.8.65) E even. 
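A sketch of such a discretisation (ours, under the assumption p = d, for which 𝒢 is known in closed form, eq. (2.8.35)): build H_{k,t} of eq. (2.8.62) on a grid and extract its largest eigenvalue.

```python
import numpy as np
from math import gamma, pi
from scipy.special import hyp0f1

p = d = 2
a = 2 * pi ** (d / 2) / gamma(d / 2 + 1)
S_d = 2 * pi ** (d / 2) / gamma(d / 2)
u, h = np.linspace(-12.0, 12.0, 500, retstep=True)
G = np.logaddexp(0.0, a * u)                 # G(u) = ln(1 + e^{a u})

def lambda1(k, t):
    x = u[:, None] + u[None, :] - 2 * t      # argument u + v - 2t
    theta = np.where(x > 0,
                     np.abs(x) ** (d / p - 1) *
                     hyp0f1(d / 2, -k ** 2 * np.abs(x) ** (2 / p) / 4), 0.0)
    H = (2 * S_d / p) * np.exp(-(G[:, None] + G[None, :]) / 2) * theta * h
    return np.linalg.eigvalsh(H).max()       # H is symmetric

print(lambda1(0.0, 0.0))    # ~1: the eigenvalue of eq. (2.8.64)
```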
Sectors q = 0 and q = 1 — Let us compute now the contribution of the two sectors q = 0 and q = 1 in eq. (2.8.56). Up to a global multiplicative constant, it is given by

  ∫₀^∞ k^{d−1} tr [ (A^{(0)}(u,v;k))^E + (n−1)(−A^{(1)}(u,v;k))^E ] dk.   (2.8.66)

Observe that, from eq. (2.8.49) and eq. (2.8.50),

  Ñ_{ab}^{(0)}(k) := N_{ab}^{(1)}(k) − N_{ab}^{(0)}(k) = (n/b) (−1)^b ( Γ(a+b)/(Γ(a) b!) ) Q_{a+b} √( g_a(k) g_b(k) ) + o(n).   (2.8.67)

In the usual approximation in which Q_a is given by (2.8.30a), we can introduce an integral operator to diagonalise. Indeed, denoting by (c̃_p)_p the eigenvector of

  M̃_{ab}^{(0)}(k) := (−1)^{a+b} √( ((a+1) g_b(k)) / ((b+1) g_a(k)) ) Ñ_{ab}^{(0)}(k)/n = (−1)^a ( Γ(a+b)/(Γ(a) b! b) ) Q_{a+b} g_b(k)   (2.8.68)

corresponding to the eigenvalue λ, we have that

  Ã^{(0)}(u, v; k) := βS_d e^{−(𝒢(u)+𝒢(v))/2} ∫₀^∞ z^{d−1} ₀F₁(−; d/2; −k²z²/4) J₀(2 e^{β(u+v−z^p)/2}) dz   (2.8.69)

has the same eigenvalues of M̃^{(0)}(k), with eigenvectors

  φ̃^{(0)}(u; k) := Σ_{r=0}^∞ c̃_r g_r(k) e^{rβu − 𝒢(u)/2}/(r! r).   (2.8.70)

Under the hypotheses above we can write

  ∫₀^∞ k^{d−1} tr [ (A^{(0)})^E + (n−1)(A^{(0)} + nÃ^{(0)})^E ] dk
   = n ∫₀^∞ k^{d−1} tr (A^{(0)})^E dk + nE ∫₀^∞ k^{d−1} tr [ (A^{(0)})^{E−1} Ã^{(0)} ] dk + o(n).   (2.8.71)

Observe now that A^{(1)}(u,v;k) = −A^{(0)}(u,v;k) and that

  A^{(0)}(u, v; k) = −βS_d e^{−(𝒢(u)+𝒢(v))/2} ∫₀^∞ dz z^{d−1} ₀F₁(−; d/2; −k²z²/4) e^{β(u+v−z^p)/2} J₁(2 e^{β(u+v−z^p)/2})
               = S_d e^{−(𝒢(u)+𝒢(v))/2} ∫₀^∞ dz z^{d−1} ₀F₁(−; d/2; −k²z²/4) (∂/∂u) J₀(2 e^{β(u+v−z^p)/2}).   (2.8.72)

Remembering eq. (2.8.31), in the large β limit we obtain the finite expression

  A^{(0)}(u, v; k) --(β→∞)--> −(S_d/p) e^{−(𝒢(u)+𝒢(v))/2} (u+v)^{d/p−1} ₀F₁(−; d/2; −k²(u+v)^{2/p}/4) θ(u+v) =: −K_k(u, v).   (2.8.73)

Similarly, we have that

  Ã^{(0)}(u, v; k) --(β→∞)--> −(βS_d/d) e^{−(𝒢(u)+𝒢(v))/2} (u+v)^{d/p} ₀F₁(−; d/2+1; −k²(u+v)^{2/p}/4) θ(u+v) =: −β K̃_k(u, v).   (2.8.74)

Results — We can collect the previous results in a unique expression. Defining

  κ_E(p, d) := (2^E S_d/(2π)^d) ∫₀^∞ k^{d−1} tr [ (K_k)^{E−1} K̃_k ] dk,   (2.8.75)

the average optimal cost, up to non-polygonal contributions, is therefore given by

  lim_{N→∞} N^{p/d} ⟨E_N^{(p)}⟩ = lim_{N→∞} N^{p/d} ⟨E_N^{(p,mf)}(d)⟩ + Σ_{E=3}^∞ (−1)^E κ_E(p, d) + Σ_{E=3, E odd}^∞ ℰ_E(p, d)/(2E) + …,   (2.8.76)

where ℰ_E(p, d) is given by eq. (2.8.65), whilst lim_N N^{p/d}⟨E_N^{(p,mf)}(d)⟩ is obtained as in eq. (2.8.33). Observe that three different contributions appear in the polygonal approximation: the first one is related to the mean field approximation of the Emmp; the second and the third are corrections due to the correlations among weights. Remarkably, the third contribution has an expression totally analogous to the finite size corrections for the Rmmp computed by Parisi and Ratiéville [46].

Conclusions

In the present work we discussed some connections between combinatorial optimisation and statistical physics. In particular, we analysed the so-called Euclidean bipartite matching problem, in which an optimal matching between two sets of points in a certain domain of ℝ^d must be found in such a way that a given functional of the matching (depending on the positions of the points) is minimised.
Conclusions

In the present work we discussed some connections between Combinatorial optimisation and Statistical physics. In particular, we analysed the so-called Euclidean bipartite matching problem, in which an optimal matching between two sets of points in a certain domain of $\mathbb R^d$ must be found in such a way that a given functional of the matching (depending on the positions of the points) is minimised. We considered the random version of the problem, assuming that the points are randomly generated, independently and identically distributed according to a given probability distribution density, supposed in general uniform. The interplay between randomness and the Euclidean constraints makes the study of the average properties of the solution highly non-trivial: Statistical Physics in general, and the Theory of disordered systems and of Stochastic Processes in particular, can be very useful in the inspection of these properties. For what concerns the Euclidean matching problem, in this thesis we presented the following results.

1. We provided a complete and general solution for the one dimensional problem in the case of a convex cost functional, showing that the solution can be mapped into specific stochastic processes related to the Brownian bridge process, depending on the imposed boundary conditions. We also computed the correlation function for the solution of the one dimensional problem.
2. We proposed a scaling ansatz for the average optimal matching cost in the quadratic case, obtaining an analytical expression for the finite size corrections to the thermodynamical limit in any dimension.
3. We generalised the scaling ansatz above to compute the correlation function of the optimal matching in any dimension.
4. We obtained, using a functional approach, a general recipe for the computation of the correlation function of the optimal matching in any dimension and any domain, in the hypothesis of points generated according to a given (in general non uniform) probability distribution density.
5. We performed a replica computation (in the replica symmetric hypothesis) for the Euclidean monopartite matching problem in $d$ dimensions, generalising some results known in the literature in which the correlations between the weights are treated as a perturbation of a mean field model.

Many of the results above suggest further directions of investigation.

1. The case of convex cost functionals has been largely studied in this thesis; however, very little is known about the thermodynamical limit of the concave case, even on the line.
2. The proposed functional approach seems able to fully justify the scaling ansatz and to generalise our results, even to the case of a concave cost functional: a proper investigation in this sense is required (it is actually in progress).
3. The replica approach proved itself quite powerful in the treatment of random optimisation problems; in random Euclidean optimisation problems, however, the correlations highly complicate the computation. We isolated the simplest part (that we called the polygonal contribution), generalising the classical results, but we completely neglected the non-polygonal terms: a proper (even numerical) treatment of these contributions is required.

As the present thesis, in its own small way, suggests, combinatorial optimisation problems in general, and Euclidean combinatorial optimisation problems in particular, can be analysed using very different approaches from very different research areas and, in return, they provide insights on the methodologies adopted and on the interconnections between apparently distant research fields of Mathematics and Physics.

Appendix A
The Wiener process and the Brownian bridge process

Summary of the appendix
In the present appendix we will introduce some fundamental properties of the Wiener process and of the Brownian bridge process.
A complete treatment of these processes is outside the scope of this short appendix, and excellent books on the subject are already available in the literature (see, for example, the work of Rogers and Williams [50] or the monograph of Mörters and Peres [41]): here we will restrict ourselves to the main properties, presenting also some useful probability distributions for the Wiener process and the Brownian bridge process.

A.1. Wiener process and Brownian bridge process

Let us suppose that $\Omega$ is the sample space of the process and that $\mathcal F$ corresponds to the set of events, a $\sigma$-algebra¹ over $\Omega$; let us suppose also that a probability $P\colon\mathcal F\to\mathbb R^+$ is given on the space of events. A Wiener process $\mathsf W$, or standard Brownian process, $\mathsf W\colon\mathbb R^+\to\mathbb R$ on this probability space is such that $\mathsf W_\omega(0)=0$ $\forall\omega\in\Omega$ and the map $t\mapsto\mathsf W_\omega(t)$ is continuous $\forall t\in\mathbb R^+$ and $\forall\omega\in\Omega$. Moreover, we require that the process satisfies the fundamental property that $\forall t,\tau\in\mathbb R^+$ the increment $\mathsf W(t+\tau)-\mathsf W(t)$ is independent of $\mathsf W(u)$, $u\in[0,t]$, and normally distributed as $\mathsf W(t+\tau)-\mathsf W(t)\sim\mathcal N(0,\tau)$. This set of properties uniquely identifies the Wiener process: it can be proved that such a process exists. A Wiener process can be introduced also using different probabilistic terminologies, connected to its main properties, that follow directly from the definition.

¹ Recall that a $\sigma$-algebra on a certain set $A$ is a collection $\Sigma$ of subsets of $A$ such that $A\in\Sigma$, $a\in\Sigma\Rightarrow A\setminus a\in\Sigma$ and, for any countable family $\{a_i\}_{i\in\mathbb N}\subseteq\Sigma$, $\bigcup_i a_i\in\Sigma$. If $C$ is a set of subsets of $A$, then $\sigma(C)$ is the smallest $\sigma$-algebra on $A$ containing $C$.

Figure A.1.1.: A realisation of a Wiener process.

Figure A.1.2.: A realisation of a Brownian bridge process.

Gaussian process
A Gaussian process is a stochastic process $X(t)\colon T\subseteq\mathbb R\to\mathbb R$ such that, given a set of $n$ elements $\{t_1,\dots,t_n\}\subset T$, the joint distribution of $(X(t_1),\dots,X(t_n))$ is a multivariate Gaussian distribution. It follows that we have to specify only the mean² $\mu(t):=\mathsf E(X(t))$ and the covariance $\mathsf E(X(t)X(t'))$. The Wiener process is a Gaussian process with mean $\mathsf E(\mathsf W(t))=0$ and covariance $\mathsf E(\mathsf W(t)\mathsf W(t'))=\min\{t,t'\}$. Most importantly, if a continuous process satisfies the previous relations, it is a Wiener process.
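As a quick numerical illustration of the covariance characterisation above, the following Monte Carlo snippet (ours, not part of the appendix; all parameters are arbitrary) builds discretised Wiener paths from independent Gaussian increments and checks that $\mathsf E(\mathsf W(s)\mathsf W(t))\approx\min\{s,t\}$.

```python
import numpy as np

# Monte Carlo check of E[W(s) W(t)] = min(s, t) for the Wiener process.
# Paths are built from independent N(0, dt) increments on a uniform grid.
rng = np.random.default_rng(0)
n_steps, n_paths = 1000, 100000
dt = 1.0 / n_steps
W = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps)), axis=1)

i_s, i_t = 299, 699                    # grid indices of s = 0.3 and t = 0.7
print(np.mean(W[:, i_s] * W[:, i_t]))  # expected: min(0.3, 0.7) = 0.3
```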
Martingale
The concept of martingale is at the basis of the modern analysis of stochastic processes. Due to its importance, let us briefly introduce the fundamental ideas. Let us firstly consider a probability space $(\Omega,\mathcal F,P)$ as above; we say that a family $\{\mathcal F_t: t\in\mathbb R^+\}$ of sub-$\sigma$-algebras of $\mathcal F$ is a filtration if, for $s<t$,
$$\mathcal F_s\subseteq\mathcal F_t\subseteq\mathcal F_\infty := \sigma\Big(\bigcup_\tau \mathcal F_\tau\Big)\subseteq\mathcal F. \tag{A.1.1}$$
A certain process $X(t)$, $t\in\mathbb R^+$, on our probability space is said to be adapted to the filtration if $X(t)$ is measurable with respect to $\mathcal F_t$. Given an adapted process such that $\mathsf E(|X(t)|)<+\infty$ $\forall t$ and $\mathsf E(X(t)|\mathcal F_s)=X(s)$ for $s<t$ almost surely, the process is called a martingale. In this sense we can say that the Wiener process is a martingale: denoting by $\mathcal W_s:=\sigma(\{\mathsf W(\tau):\tau\leq s\})$ the filtration of the probability space, $\mathsf E(\mathsf W(t)-\mathsf W(s)|\mathcal W_s)=0 \Rightarrow \mathsf E(\mathsf W(t)|\mathcal W_s)=\mathsf W(s)$ directly from the defining properties of the Wiener process.

Markov process
An adapted process $X(t)$, $t\in\mathbb R^+$, with filtration $\{\mathcal F_s\}_{s\in\mathbb R^+}$, is a Markov process if there exists a Markov kernel $p_X(\tau, X(s), A)$, for $A$ open subset of $\mathbb R$, such that
$$\Pr(X(t)\in A|\mathcal F_s) = p_X(t-s, X(s), A). \tag{A.1.2}$$
The Wiener process is a Markov process, having
$$p_{\mathsf W}(\tau, y, (x, x+\mathrm dx)) = \frac{e^{-\frac{(x-y)^2}{2\tau}}}{\sqrt{2\pi\tau}}\,\mathrm dx =: \rho(x-y;\tau)\,\mathrm dx. \tag{A.1.3}$$
It is well known that the probability density $\rho(x,t)$ is the fundamental solution of the heat equation
$$\partial_t\rho(x,t) = \frac12\,\partial^2_x\rho(x,t), \tag{A.1.4}$$
obtained by Einstein in his celebrated work on Brownian diffusion.

² We denote by $\mathsf E(\bullet)$ the expected value of a certain quantity $f\colon\Omega\to\mathbb R$, $\mathsf E(f)=\int_\Omega P(\omega)f(\omega)\,\mathrm d\omega$; similarly, if $\mathcal G\subset\mathcal F$ is a sub-$\sigma$-algebra of $\mathcal F$, $\mathsf E(f|\mathcal G):=\int_\Omega f(\omega)P(\omega|\mathcal G)\,\mathrm d\omega$.

A stochastic process strictly related to the Wiener process is the Brownian bridge process. A Brownian bridge $\mathsf B(t)$ is a Gaussian process on the unit interval $[0,1]$ such that $\mathsf E(\mathsf B(t))=0$ and $\mathsf E(\mathsf B(s)\mathsf B(t))=\min\{s,t\}-st$. It can be written in terms of a Wiener process on the unit interval as
$$\mathsf B(t) := \mathsf W(t) - t\,\mathsf W(1), \qquad t\in[0,1]. \tag{A.1.5}$$
It can be proved that the previous process can be obtained simply by conditioning a Wiener process to be equal to zero at $t=1$. We can also introduce the Gaussian stochastic process
$$\mathsf B^{(-1)}(t) := \int_0^t \mathsf B(s)\,\mathrm ds, \qquad t\in[0,1], \tag{A.1.6}$$
whose value at $t=1$ gives the area of the stochastic process. It is easily proved that $\mathsf E(\mathsf B^{(-1)}(t)) = 0$ and $\mathsf E\big[\big(\mathsf B^{(-1)}(1)\big)^2\big] = \frac1{12}$. Finally, the zero-area Brownian bridge can be defined as
$$\mathsf B^0(t) = \mathsf B(t) - \mathsf B^{(-1)}(1). \tag{A.1.7}$$
The previous process has
$$\mathsf E(\mathsf B^0(t)) = 0, \qquad \mathsf E(\mathsf B^0(s)\mathsf B^0(t)) = \frac12\left(|s-t|-\frac12\right)^2 - \frac1{24}. \tag{A.1.8}$$

A.2. Probability distributions

A series of mathematical results is available for the probability distributions of some interesting quantities related to the Wiener process or the Brownian bridge on the unit interval. We collect here some of these results without proving them (for further details and more general results, see the detailed paper of Beghin and Orsingher [7]). We remark, however, that the probability analysis is usually carried out through the so-called reflection principle: the general idea is that, given a certain time $t^*$ and a value $\mathsf W(t^*)$ of the Wiener process, any set of possible realisations starting at $\mathsf W(t^*)$ for $t=t^*$ and the set obtained reflecting them in the line $w=\mathsf W(t^*)$ have the same probability. By this simple procedure, it can be easily proved that
$$\Pr\left(\sup_{\tau\in(0,t)} \mathsf W(\tau) > W\right) = 2\int_W^\infty \frac{e^{-\frac{z^2}{2t}}}{\sqrt{2\pi t}}\,\mathrm dz. \tag{A.2.1}$$
Similarly it can be proved that
$$\Pr\left(\sup_{\tau\in(0,t)} \mathsf W(\tau) > W \,\Big|\, \mathsf W(t)=w\right) = \begin{cases} \exp\left(-\dfrac{2W(W-w)}{t}\right) & W > w,\\[4pt] 1 & W \leq w. \end{cases} \tag{A.2.2}$$
If $w=0$ and $t=1$ we have the probability distribution for the sup of the Brownian bridge process, $\Pr\left(\sup_{\tau\in(0,1)}\mathsf B(\tau) > B\right) = e^{-2B^2}$. For the absolute value of the Wiener process we have that
$$\Pr\left(\sup_{\tau\in(0,t)} |\mathsf W(\tau)| < W \,\Big|\, \mathsf W(t)=w\right) = \sum_{k\in\mathbb Z} (-1)^k \exp\left(-\frac{2kW(kW-w)}{t}\right). \tag{A.2.3}$$
Again, for $w=0$ and $t=1$ we have $\Pr\left(\sup_{\tau\in(0,1)}|\mathsf B(\tau)| < B\right) = \sum_{k\in\mathbb Z}(-1)^k\exp(-2k^2B^2)$. Darling [16] proved that for the supremum of the zero-area Brownian bridge the following distribution holds:
$$\Pr\left(\sup_{\tau\in(0,1)}\mathsf B^0(\tau) < B\right) = \frac{4\sqrt\pi}{3}\sum_{n=1}^\infty \frac1{\alpha_n}\,\psi\!\left(\sqrt{\frac{8B}{3\alpha_n}}\right), \qquad B>0, \tag{A.2.4}$$
where $0<\alpha_1<\dots<\alpha_n<\dots$ are the zeros of the combination of Bessel functions
$$f(\alpha) := J_{\frac13}(\alpha) + J_{-\frac13}(\alpha), \tag{A.2.5}$$
whilst $\psi(x)$ is the solution of the following integral equation:
$$\int_0^\infty e^{-\lambda x}\psi(x)\,\mathrm dx = e^{-\lambda^{\frac23}}. \tag{A.2.6}$$
The theorem of Darling is not trivial at all, both in the derivation and in the final result: the explicit treatment of the probability distribution density is indeed quite involved. More implicit results on the Laplace transform of the probability distribution of $\mathsf B^{(-1)}$ have been obtained by Perman and Wellner [47], but we do not present them here.
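As a concrete illustration of the reflection-principle formulas above, the following Monte Carlo sketch (ours; all numerical parameters are arbitrary) samples discretised Brownian bridges through eq. (A.1.5) and compares the empirical tail of the supremum with the exact value $e^{-2B^2}$.

```python
import numpy as np

# Monte Carlo check of Pr(sup_{[0,1]} B(t) > B) = exp(-2 B^2), i.e. eq. (A.2.2)
# with w = 0 and t = 1. Bridges are built from Wiener paths via B = W - t W(1).
rng = np.random.default_rng(42)
n_steps, n_paths, B_level = 2000, 20000, 0.5
dt = 1.0 / n_steps
t = np.arange(1, n_steps + 1) * dt

W = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps)), axis=1)
bridge = W - t * W[:, -1:]                  # eq. (A.1.5) on the grid

estimate = np.mean(bridge.max(axis=1) > B_level)
print(estimate, np.exp(-2 * B_level**2))    # the grid sup slightly underestimates
```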
Appendix B
The Simplex Method

B.1. Linear optimisation problems

The simplex algorithm is the most widely used algorithm for the solution of linear optimisation problems. A linear optimisation problem can be stated as follows: let us consider an $m\times n$ matrix of integers $A=(a_{ij})_{ij}$, a vector of $m$ integers $\mathbf b\in\mathbb Z^m$ and a vector of $n$ integers $\mathbf c\in\mathbb Z^n$. We want to find a vector $\mathbf X=(X_i)_{i=1,\dots,n}\in(\mathbb R^+)^n$ such that
$$z := \mathbf c\cdot\mathbf X = \min_{\mathbf x\in F} \mathbf c^\intercal\mathbf x \tag{B.1.1}$$
on a certain non-empty space $F$ of feasible solutions,
$$F := \{\mathbf x=(x_1,\dots,x_n)\in\mathbb R^n : A\cdot\mathbf x=\mathbf b,\ x_i\geq 0\ \forall i=1,\dots,n\} \neq \emptyset. \tag{B.1.2}$$
The linear optimisation problem, when stated in the previous form, is said to be in standard form. The algorithmic solution of the previous optimisation problem is due to George B. Dantzig, who formulated the celebrated simplex algorithm. To have a non-trivial optimisation problem it is necessary that $n-\operatorname{rank}[A]>0$ (otherwise at most one point satisfies all the constraints): we therefore require for simplicity that $\operatorname{rank}[A]=m<n$. In $\mathbb R^n$, the constraint condition $A\cdot\mathbf x=\mathbf b$, with the additional constraints $x_i\geq 0$ $\forall i=1,\dots,n$, possibly delimits an $(n-m)$-dimensional bounded convex subspace, called a convex polytope.

Figure B.1.1.: An example of a 3-dimensional convex polytope.
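Before describing the algorithm itself, here is a minimal sanity check of the standard form above using an off-the-shelf solver; the specific matrix and vectors are made-up illustrative data, not taken from the appendix.

```python
import numpy as np
from scipy.optimize import linprog

# A tiny linear programme in standard form: minimise c.x subject to
# A x = b, x >= 0 (eqs. (B.1.1)-(B.1.2)), with illustrative data.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])   # m = 2 equality constraints, n = 4
b = np.array([4.0, 2.0])
c = np.array([-1.0, -2.0, 0.0, 0.0])

res = linprog(c, A_eq=A, b_eq=b, bounds=(0, None), method="highs")
print(res.x, res.fun)                   # optimal bfs and optimal cost
```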
B.2. The algorithm

In the hypotheses above, we can select $m$ linearly independent columns of $A$, let us call them $B:=\{\mathbf a_{i_k}\}_{k=1,\dots,m}$, such that the system
$$A\cdot\mathbf x=\mathbf b, \qquad x_j=0 \text{ if } j\neq i_k\ \forall k=1,\dots,m \tag{B.2.1}$$
has a unique solution, i.e. we have a problem of the form
$$\underbrace{\begin{pmatrix} a_{1,1} & a_{2,1} & \cdots & a_{9,1}\\ a_{1,2} & a_{2,2} & \cdots & a_{9,2}\\ \vdots & \vdots & & \vdots\\ a_{1,6} & a_{2,6} & \cdots & a_{9,6} \end{pmatrix}}_{A} \cdot \underbrace{\begin{pmatrix} x_1\\ x_2\\ x_3\\ 0\\ x_5\\ x_6\\ 0\\ x_8\\ 0 \end{pmatrix}}_{\mathbf x} = \underbrace{\begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_6 \end{pmatrix}}_{\mathbf b} \tag{B.2.2}$$
where, in this example with $m=6$ and $n=9$, the columns $\{\mathbf a_1,\mathbf a_2,\mathbf a_3,\mathbf a_5,\mathbf a_6,\mathbf a_8\}$ corresponding to the non-zero entries of $\mathbf x$ form the selected submatrix $B$. If the solution $\mathbf x$ of the previous problem has $x_i\geq 0$, it is called a basic feasible solution (bfs). Remarkably, the optimal solution is a bfs. It can be proved that
$$\mathbf x\in F \text{ and } \mathbf x \text{ is a bfs} \iff \mathbf x \text{ is a vertex of the polytope}. \tag{B.2.3}$$
Moreover, for any instance of a linear programming problem there is an optimal bfs: it follows that an optimal solution can be found among the vertices of the polytope corresponding to the instance. Let us suppose that a certain bfs $\mathbf x^*$ is known and, for the sake of simplicity, that this bfs corresponds to the set of columns $\{\mathbf a_i\}_{i=1,\dots,m}$, in such a way that $\mathbf x^*=(x^*_1,\dots,x^*_m,0,\dots,0)$. In the simplex method we move from one vertex to another, through proper pivoting operations, until the optimal vertex is found. The first step is to write down the simplex tableau:
$$\begin{cases} z-\mathbf c^\intercal\cdot\mathbf x=0\\ A\cdot\mathbf x=\mathbf b \end{cases} \;\Rightarrow\; \left(\begin{array}{cc|c} A & \mathbf 0 & \mathbf b\\ -\mathbf c^\intercal & 1 & 0 \end{array}\right). \tag{B.2.4}$$
Let us now give a pictorial representation of the algorithm (in the pictures, $m=6$ and $n=9$). The simplex matrix is
$$\left(\begin{array}{ccccc|c|c} a_{1,1} & a_{2,1} & \cdots & a_{8,1} & a_{9,1} & 0 & b_1\\ \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots\\ a_{1,6} & a_{2,6} & \cdots & a_{8,6} & a_{9,6} & 0 & b_6\\ \hline -c_1 & -c_2 & \cdots & -c_8 & -c_9 & 1 & 0 \end{array}\right). \tag{B.2.5}$$
After a sequence of row operations we can transform the simplex tableau in the following way:
$$\left(\begin{array}{ccccccccc|c|c} 1 & 0 & 0 & 0 & 0 & 0 & \tilde a_{7,1} & \tilde a_{8,1} & \tilde a_{9,1} & 0 & \tilde b_1\\ 0 & 1 & 0 & 0 & 0 & 0 & \tilde a_{7,2} & \tilde a_{8,2} & \tilde a_{9,2} & 0 & \tilde b_2\\ 0 & 0 & 1 & 0 & 0 & 0 & \tilde a_{7,3} & \tilde a_{8,3} & \tilde a_{9,3} & 0 & \tilde b_3\\ 0 & 0 & 0 & 1 & 0 & 0 & \tilde a_{7,4} & \tilde a_{8,4} & \tilde a_{9,4} & 0 & \tilde b_4\\ 0 & 0 & 0 & 0 & 1 & 0 & \tilde a_{7,5} & \tilde a_{8,5} & \tilde a_{9,5} & 0 & \tilde b_5\\ 0 & 0 & 0 & 0 & 0 & 1 & \tilde a_{7,6} & \tilde a_{8,6} & \tilde a_{9,6} & 0 & \tilde b_6\\ \hline 0 & 0 & 0 & 0 & 0 & 0 & -\tilde c_7 & -\tilde c_8 & -\tilde c_9 & 1 & z_B \end{array}\right), \tag{B.2.6}$$
where $z_B$ is the value of $z$ on the current bfs. If all the entries $\{\tilde c_i\}_{i=m+1,\dots,n}$ are non-negative, then the current bfs is optimal and $z_B$ is the optimal cost. Otherwise we have to proceed further: we choose a non-zero pivot element $\tilde a_{rc}\neq 0$ in the simplex tableau and we multiply the corresponding row by $\frac{1}{\tilde a_{rc}}$; proper multiples of the new row are then added to the remaining rows of $\tilde A$, in such a way that the $c$th column of the new matrix has $1$ in correspondence of the original pivot element and zero otherwise. The chosen variable is a new basic variable (entering variable), substituting the old $r$th basic variable (leaving variable): we switch the $r$th column with the current $c$th column to obtain a new simplex tableau of the form (B.2.6). Due to the fact that the value of $z$ must be minimised, the entering variable is chosen in a column $c$ such that $\tilde c_c<0$ (non-zero values of the newly selected variable decrease $z$). The condition that the new solution must be feasible determines a criterion for the row: it can be shown that, being $c$ the chosen pivot column, the row $r$ must be such that $\frac{\tilde b_r}{\tilde a_{rc}}$ is minimum among all rows $r$ with $\tilde a_{rc}>0$. The iteration of this sequence of steps leads to the optimal solution, exploring the bfss (a minimal implementation of these pivoting rules is sketched at the end of this appendix); note that the method requires a bfs as starting point. Finally, observe that if we have constraints expressed in terms of inequalities, e.g.
$$\sum_{j=1}^n a_{ij}x_j \leq b_i, \tag{B.2.7}$$
we can introduce a new slack variable $s_i\geq 0$ for each inequality and write it as $\sum_{j=1}^n a_{ij}x_j + s_i = b_i$. Obviously, we can introduce a "dependence" of $z$ on the new variables in a trivial way, as
$$z = \sum_{i=1}^n c_i x_i + \sum_j 0\cdot s_j. \tag{B.2.8}$$
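The pivoting rules just described translate almost line by line into code. The following is a minimal sketch (ours, with made-up names) of the tableau iteration, assuming a starting bfs with the given basic columns is already available, as provided for instance by slack variables; anti-cycling safeguards such as Bland's rule are omitted.

```python
import numpy as np

def simplex(A, b, c, basis, tol=1e-9):
    """Minimise c.x subject to A x = b, x >= 0, pivoting from the initial
    basic feasible solution whose basic columns are listed in `basis`."""
    A = np.asarray(A, float).copy()
    b = np.asarray(b, float).copy()
    c = np.asarray(c, float)
    basis = list(basis)
    m, n = A.shape
    # put the tableau in the canonical form (B.2.6): identity on the basis
    for r, j in enumerate(basis):
        piv = A[r, j]
        A[r] /= piv; b[r] /= piv
        for i in range(m):
            if i != r:
                b[i] -= A[i, j] * b[r]
                A[i] -= A[i, j] * A[r]
    while True:
        # reduced costs c~_j: all non-negative means the current bfs is optimal
        red = c - c[basis] @ A
        col = int(np.argmin(red))
        if red[col] >= -tol:
            x = np.zeros(n)
            x[basis] = b
            return x, float(c @ x)
        # ratio test: the leaving row minimises b~_r / a~_rc over a~_rc > 0
        pos = A[:, col] > tol
        if not pos.any():
            raise ValueError("unbounded problem")
        ratios = np.where(pos, b / np.where(pos, A[:, col], 1.0), np.inf)
        row = int(np.argmin(ratios))
        # pivot on (row, col): scale the row, clear the column elsewhere
        piv = A[row, col]
        A[row] /= piv; b[row] /= piv
        for i in range(m):
            if i != row:
                b[i] -= A[i, col] * b[row]
                A[i] -= A[i, col] * A[row]
        basis[row] = col

# Inequalities as in (B.2.7): x1 + x2 <= 4, x2 <= 2, minimising -x1 - 2 x2,
# with slack variables x3, x4 providing the starting bfs (basis = [2, 3]).
x, z = simplex([[1, 1, 1, 0], [0, 1, 0, 1]], [4, 2], [-1, -2, 0, 0], basis=[2, 3])
print(x, z)   # expected: x = [2, 2, 0, 0], z = -6
```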
Bibliography

[1] M. Ajtai, J. Komlós, and G. Tusnády. "On optimal Matchings". In: Combinatorica 4.4 (1984), pp. 259–264.
[2] D. J. Aldous. "The ζ(2) limit in the random assignment problem". In: Random Structures and Algorithms 2 (2001), pp. 381–418.
[3] J. R. de Almeida and D. J. Thouless. "Stability of the Sherrington-Kirkpatrick solution of a spin glass model". In: Journal of Physics A: Mathematical and General 11.5 (1978).
[4] L. Ambrosio. Lecture notes on optimal transport problems. 2003, pp. 1–52.
[5] M. Barczy and E. Iglói. "Karhunen-Loève expansions of α-Wiener bridges". In: Central European Journal of Mathematics 9.1 (Dec. 2010), pp. 65–84.
[6] J. Beardwood, J. H. Halton, and J. M. Hammersley. "The shortest path through many points". In: Mathematical Proceedings of the Cambridge Philosophical Society 55.04 (Oct. 1959), p. 299.
[7] L. Beghin and E. Orsingher. "On the maximum of the generalized Brownian bridge". In: Lithuanian Mathematical Journal 39.2 (Apr. 1999), pp. 157–167.
[8] K. Binder and A. P. Young. "Spin glasses: Experimental facts, theoretical concepts, and open questions". In: Reviews of Modern Physics 58.4 (1986).
[9] V. I. Bogachev and A. V. Kolesnikov. "The Monge-Kantorovich problem: achievements, connections, and perspectives". In: Russian Mathematical Surveys 67.5 (Oct. 2012), pp. 785–890.
[10] E. Boniolo, S. Caracciolo, and A. Sportiello. "Correlation function for the Grid-Poisson Euclidean matching on a line and on a circle". In: Journal of Statistical Mechanics: Theory and Experiment 2014.11 (Nov. 2014), P11023.
[11] S. Caracciolo and G. Sicuro. "Functional Approach for the Euclidean Bipartite Matching Problem". In preparation. 2014.
[12] S. Caracciolo and G. Sicuro. "One-dimensional Euclidean matching problem: Exact solutions, correlation functions, and universality". In: Physical Review E 90.4 (Oct. 2014), p. 042112.
[13] S. Caracciolo and G. Sicuro. "Scaling hypothesis for the Euclidean bipartite matching problem. II. Correlation functions". In preparation. 2014.
[14] S. Caracciolo et al. "Scaling hypothesis for the Euclidean bipartite matching problem". In: Physical Review E 90.1 (2014), p. 012118.
[15] T. Castellani and A. Cavagna. "Spin-glass theory for pedestrians". In: Journal of Statistical Mechanics: Theory and Experiment (May 2005), P05012.
[16] D. A. Darling. "On the supremum of certain Gaussian processes". In: The Annals of Probability 11.3 (1983), pp. 803–806.
[17] J. Delon, J. Salomon, and A. Sobolewski. "Minimum-weight perfect matching for non-intrinsic distances on the line". In: Zapiski Nauchnykh Seminarov POMI 390.52 (2011).
[18] B. Dezső, A. Jüttner, and P. Kovács. "LEMON – an Open Source C++ Graph Template Library". In: Electronic Notes in Theoretical Computer Science 264.5 (July 2011), pp. 23–45.
[19] R. Diestel. Graph Theory. Springer Graduate Texts in Mathematics (GTM) 173. 2012.
[20] V. T. Dobrić and J. E. Yukich. "Asymptotics for Transportation Cost in High Dimensions". In: Journal of Theoretical Probability 8.1 (1995), pp. 97–118.
[21] R. Dudley. Real Analysis and Probability. Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2002.
[22] S. F. Edwards and P. W. Anderson. "Theory of spin glasses". In: Journal of Physics F: Metal Physics 5 (1975), pp. 965–974.
[23] L. C. Evans. "Partial differential equations and Monge-Kantorovich mass transfer". In: Current developments in mathematics (1997).
[24] D. Fichera. "Cavity methods in optimization problems: exact and approximated algorithms". PhD thesis. Università di Milano, 2008.
[25] K. Fischer and J. Hertz. Spin Glasses. Cambridge Studies in Magnetism. Cambridge University Press, 1993.
[26] G. H. Hardy. Ramanujan: twelve lectures on subjects suggested by his life and work. AMS Chelsea Publishing Series, 1999.
[27] J. Houdayer, J. H. Boutet de Monvel, and O. C. Martin. "Comparing mean field and Euclidean matching problems". In: The European Physical Journal B 6.3 (Nov. 1998), pp. 383–393.
[28] H. A. Kramers and G. H. Wannier. "Statistics of the two-dimensional ferromagnet. Part I". In: Physical Review 60 (1941).
[29] H. W. Kuhn. "The Hungarian method for the assignment problem". In: Naval Research Logistics Quarterly 2 (1955), pp. 83–97.
[30] C.-S. Lin and C.-L. Wang. "Elliptic functions, Green functions and the mean field equations on tori". In: Annals of Mathematics 172.2 (2010), pp. 911–954.
[31] L. Lovász and D. Plummer. Matching Theory. AMS Chelsea Publishing Series. 2009.
[32] R. J. McCann. "Exact Solutions to the Transportation Problem on the Line". In: Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 455.1984 (1999), pp. 1341–1380.
[33] M. Mézard, G. Parisi, and M. Virasoro. Spin Glass Theory and Beyond. Lecture Notes in Physics Series. World Scientific Publishing Company, 1987.
[34] M. Mézard and A. Montanari. Information, Physics, and Computation. Oxford Graduate Texts. Oxford University Press, 2009.
[35] M. Mézard and G. Parisi. "Mean-field equations for the matching and the travelling salesman problems". In: Europhysics Letters 2.12 (1986), pp. 913–918.
[36] M. Mézard and G. Parisi. "On the solution of the random link matching problems". In: Journal de Physique 48.9 (1987), pp. 1451–1459.
[37] M. Mézard and G. Parisi. "Replicas and optimization". In: Journal de Physique Lettres 46.17 (1985), pp. 771–778.
[38] M. Mézard and G. Parisi. "The Bethe lattice spin glass revisited". In: The European Physical Journal B 20.2 (Mar. 2001), pp. 217–233.
[39] M. Mézard and G. Parisi. "The cavity method at zero temperature". In: Journal of Statistical Physics 111.1/2 (2003), pp. 1–24. arXiv:cond-mat/0207121.
[40] M. Mézard and G. Parisi. "The Euclidean matching problem". In: Journal de Physique 49 (1988), pp. 2019–2025.
[41] P. Mörters and Y. Peres. Brownian Motion. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2010.
[42] H. Nishimori. Statistical Physics of Spin Glasses and Information Processing: An Introduction. Oxford: Clarendon Press, 2001.
[43] C. Papadimitriou and K. Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Dover Books on Computer Science Series. Dover Publications, 1998.
[44] G. Parisi. "A conjecture on random bipartite matching". arXiv:cond-mat/9801176 (1998).
[45] G. Parisi. "Order parameter for spin-glasses". In: Physical Review Letters 50.24 (1983), pp. 1946–1948.
[46] G. Parisi and M. Ratiéville. "On the finite size corrections to some random matching problems". In: The European Physical Journal B 29.3 (Oct. 2002), pp. 457–468.
[47] M. Perman and J. A. Wellner. "On the distribution of Brownian areas". In: The Annals of Applied Probability 6.4 (1996), pp. 1091–1111.
[48] T. Plefka. "Convergence condition of the TAP equation for the infinite-ranged Ising spin glass model". In: Journal of Physics A: Mathematical and General 15 (1982), pp. 1971–1978.
[49] C. Redmond and J. E. Yukich. "Asymptotics for Euclidean functionals with power-weighted edges". In: Stochastic Processes and their Applications 61.2 (Feb. 1996), pp. 289–304.
[50] L. Rogers and D. Williams. Diffusions, Markov Processes, and Martingales: Volume 1, Foundations. Cambridge Mathematical Library. Cambridge University Press, 2000.
[51] L. A. Shepp. "On the integral of the absolute value of the pinned Wiener process". In: The Annals of Probability 10.1 (1982), pp. 234–239.
[52] D. Thouless, P. Anderson, and R. Palmer. "Solution of 'Solvable model of a spin glass'". In: Philosophical Magazine 35.3 (1977), pp. 593–601.
[53] C. Villani. "Current trends in optimal transportation". pp. 1–13.
[54] C. Villani. Optimal transport. Old and new. Springer-Verlag, 2009.
[55] H. Whitney. "Non-separable and planar graphs". In: Proceedings of the National Academy of Sciences of the United States of America 17.1 (1931), pp. 125–127.
[56] J. Yukich. Probability theory of classical Euclidean optimization problems. Lecture Notes in Mathematics. Springer, 1998.
[57] J. Zinn-Justin. Quantum Field Theory and Critical Phenomena. The International Series of Monographs on Physics. Clarendon Press, 2002.