Cayley digraphs - a computational approach.

A Computational Approach to Finding the
Diameter of a Cayley Digraph of S_n
Jordan Almeter
April 16, 2015
Basic definitions
A Cayley digraph is defined on a group G with a generator set
S ⊆ G: for every s ∈ S and a, b ∈ G, if as = b, then there is an
edge from a to b. Note that this does not mean there is an edge
from b to a.
The distance between two vertices in a graph is the length of the
shortest path between these vertices, and the diameter is the
largest distance in a graph.
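As a minimal sketch of these definitions, the following Python builds the Cayley digraph of S_3 by right multiplication and finds its diameter with a breadth-first search. The tuple encoding of permutations, the composition convention in compose, and all function names are assumptions made for illustration, not the implementation used for the results in these slides.

```python
from collections import deque
from itertools import permutations

def compose(a, s):
    """Product a*s as tuples: apply s first, then a (one possible convention)."""
    return tuple(a[i] for i in s)

def cayley_edges(n, gens):
    """Map each vertex a of S_n to its out-neighbours a*s for s in gens."""
    return {a: [compose(a, s) for s in gens] for a in permutations(range(n))}

def distances_from(start, edges):
    """Breadth-first search: shortest directed path length from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        a = queue.popleft()
        for b in edges[a]:
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return dist

def diameter(n, gens):
    """Largest distance from the identity.

    A Cayley digraph looks the same from every vertex, so when gens
    generates S_n this equals the diameter of the whole digraph.
    """
    edges = cayley_edges(n, gens)
    identity = tuple(range(n))
    return max(distances_from(identity, edges).values())

# Assumed example generators for S_3 in one-line notation:
# L sends i to i + 1 (mod 3), S swaps 0 and 1.
L = (1, 2, 0)
S = (1, 0, 2)
print(diameter(3, [L, S]))  # prints 2
```

Storing every vertex in a dictionary is only practical for small n; it is meant to make the definition concrete rather than to scale.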
Importance of diameter
In any Cayley digraph, a path from the identity to some element x
gives a factorization of x in which every factor is from the
generator set. In the case of sorting algorithms, the diameter
gives the maximum number of steps necessary to sort a list of
values.
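To illustrate how a path yields a factorization, here is a hedged sketch that records, for each vertex reached by breadth-first search, the generator used to reach it, and then reads a shortest word back from the target. The function names and the dictionary of named generators are assumptions for this example only.

```python
from collections import deque

def compose(a, s):
    """Product a*s as tuples: apply s first, then a (assumed convention)."""
    return tuple(a[i] for i in s)

def shortest_factorization(x, gens):
    """Return a shortest word (list of generator names) whose product is x.

    gens maps a name such as "L" to a permutation tuple. Assumes the
    generators actually generate x; otherwise a KeyError is raised.
    """
    identity = tuple(range(len(x)))
    parent = {identity: None}  # vertex -> (previous vertex, generator name)
    queue = deque([identity])
    while queue:
        a = queue.popleft()
        if a == x:
            break
        for name, s in gens.items():
            b = compose(a, s)
            if b not in parent:
                parent[b] = (a, name)
                queue.append(b)
    word = []
    a = x
    while parent[a] is not None:
        a, name = parent[a]
        word.append(name)
    return word[::-1]

def evaluate(word, gens, n):
    """Multiply the generators named in word, left to right, from the identity."""
    a = tuple(range(n))
    for name in word:
        a = compose(a, gens[name])
    return a

gens = {"L": (1, 2, 3, 0), "S": (1, 0, 2, 3)}
word = shortest_factorization((1, 0, 3, 2), gens)
print("".join(word), evaluate(word, gens, 4))
```

Which shortest word is returned depends on the order in which generators are tried; any of them has the same length.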
A Computational Approach
How do we go about computing with permutations in a Cayley
digraph?
- We need a function T : S_n → Z_{n!} to use as a hash function,
  for indexing.
- We need a data structure which can be stored on a typical
  hard drive.
- Given the size of these graphs, a by-hand analysis of the entire
  graph is impractical. Can we generate a smaller graph which
  gives us the information we need yet is still easily understood?
The function T : S_n → Z_{n!}
Consider that, from a combinatorics perspective, there are n
choices for the first element of a permutation x, n − 1 choices for
the second element, and so on. Conceptually, T(x) is equal to
(n − 1)! times the first choice, plus T of the rest of the
permutation, relabeled as a permutation of n − 1 values.
For a permutation of 0, 1, ..., n − 1 we can say
T([a_1, a_2, ..., a_n]) =
a_1 (n − 1)! + T([g(a_2, a_1), g(a_3, a_1), ..., g(a_n, a_1)]),
where g(a, b) = a if a < b, and g(a, b) = a − 1 if a ≥ b.
There is an inverse function, T^{-1}, which we can find
computationally.
Unfortunately, S_n is not isomorphic to Z_{n!} for n ≥ 3 (S_n is
non-abelian), so operations like T(x_1) ∗ T(x_2) do not give
meaningful results.
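The recursive definition can be sketched directly in Python. This assumes permutations of 0, ..., n − 1 written as lists in one-line notation; T_inverse is one possible way to compute the inverse (peeling off factorial-base digits) and is an assumption, since the slides do not say how the inverse was implemented.

```python
from math import factorial

def T(perm):
    """Index of a permutation of 0..n-1 as an integer in 0..n!-1."""
    perm = list(perm)
    n = len(perm)
    if n <= 1:
        return 0
    first = perm[0]
    # Relabel the remaining entries to 0..n-2 by closing the gap left by first.
    rest = [a if a < first else a - 1 for a in perm[1:]]
    return first * factorial(n - 1) + T(rest)

def T_inverse(index, n):
    """Permutation of 0..n-1 whose index under T is the given integer."""
    available = list(range(n))
    perm = []
    for k in range(n - 1, -1, -1):
        digit, index = divmod(index, factorial(k))
        perm.append(available.pop(digit))
    return perm

print(T([1, 0, 3, 2]))   # prints 7
print(T_inverse(7, 4))   # prints [1, 0, 3, 2]
```

Under this encoding the identity permutation has index 0, which is convenient when the index is used as an array position.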
Computational limits of data structures
The largest integer that can be stored in a signed 32-bit integer is
2^31 − 1 = 2147483647. The largest factorial that fits in this range
is 12!. For n > 12, we must use 64-bit integers.
The smallest possible representation for a k-regular (each vertex
has k out-neighbors) Cayley digraph is n! indices, each with k
64-bit integers representing its neighbors. For n = 13, this
representation would require ≈ 400k gigabits, or ≈ 50k gigabytes.
For k > 1, this is a large amount of space for a consumer-level
hard drive, and n > 13 is at least an order of magnitude higher.
An adjacency matrix can be constructed with (n!)^2 bits, which would
require over 16 GB when n = 9, and is unreasonably large for
n > 9.
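The arithmetic behind these limits can be reproduced with a few lines of Python. This is a sketch under stated assumptions: a gigabyte is taken as 10^9 bytes, each neighbor is stored as an 8-byte integer, and the function names are illustrative. Under these assumptions the neighbor table for n = 13 comes to roughly 49.8k gigabytes (about 398k gigabits), and the adjacency matrix for n = 9 to roughly 16.5 gigabytes.

```python
from math import factorial

INT32_MAX = 2**31 - 1

def largest_factorial_below(limit):
    """Largest n with n! <= limit."""
    n = 1
    while factorial(n + 1) <= limit:
        n += 1
    return n

def neighbor_table_bytes(n, k):
    """n! vertices, each storing k neighbours as 8-byte (64-bit) integers."""
    return factorial(n) * k * 8

def adjacency_matrix_bytes(n):
    """(n!)^2 bits, one per ordered pair of vertices, converted to bytes."""
    return factorial(n) ** 2 // 8

GB = 10**9  # assumed decimal gigabyte
print(largest_factorial_below(INT32_MAX))
print(neighbor_table_bytes(13, 1) / GB)
print(adjacency_matrix_bytes(9) / GB)
```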
Computational limits: Array of distances
Assuming that the diameter of a graph is a reasonable number, say
less than 256, then we can create an array of length n! which has
one byte per entry. Each entry stores the distance between that
vertex and the identity. For n = 12, this requires almost 0.5 GB,
and for n = 13, over 6 GB.
In addition, it is possible to calculate this array without a separate
data structure for the breadth-first search: set every element to a
sentinel value such as −1, and use this as a control value marking
unvisited vertices. This is computationally expensive, requiring
about 15 minutes for n = 12 on my personal computer, but it
minimizes memory use.
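One way to realize this idea is sketched below: the distance array itself plays the role of the search frontier, and each pass scans the whole array for vertices at the current depth. This is an assumption about how such a search could be organized, not the code behind the timings quoted above. It uses an unsigned bytearray, so the sentinel is 255 rather than −1, and the helper names rank and unrank are hypothetical (rank computes the same index as T, written iteratively).

```python
from math import factorial

def rank(perm):
    """Index of a permutation of 0..n-1 (same value as the recursive T)."""
    n = len(perm)
    index = 0
    for i in range(n):
        smaller = sum(1 for later in perm[i + 1:] if later < perm[i])
        index += smaller * factorial(n - 1 - i)
    return index

def unrank(index, n):
    """Inverse of rank."""
    available = list(range(n))
    perm = []
    for k in range(n - 1, -1, -1):
        digit, index = divmod(index, factorial(k))
        perm.append(available.pop(digit))
    return perm

def distance_array(n, gens, unvisited=255):
    """One byte per vertex: distance from the identity, or the sentinel."""
    size = factorial(n)
    dist = bytearray([unvisited]) * size
    dist[0] = 0  # the identity has index 0
    depth = 0
    found = True
    while found:
        found = False
        # Full scan: expand every vertex currently at distance depth.
        for index in range(size):
            if dist[index] == depth:
                a = unrank(index, n)
                for s in gens:
                    j = rank([a[i] for i in s])  # right multiplication a*s
                    if dist[j] == unvisited:
                        dist[j] = depth + 1
                        found = True
        depth += 1
    return dist

L = (1, 2, 3, 4, 0)
S = (1, 0, 2, 3, 4)
print(max(distance_array(5, [L, S])))
```

The repeated full scans are what make this approach slow, since the array is traversed once per distance level, but no queue or visited set is needed beyond the n! bytes.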
Digraphs of interest: ⟨L, S, L^{-1}⟩ and ⟨L, S⟩
In S_n, we define L = (0 1 2 ... n − 1) and S = (0 1). Computed
values for the diameter of each graph are listed below. Note that
the value 87 is not available on OEIS; the relevant sequences are
A039745 and A186783 in the On-Line Encyclopedia of Integer
Sequences. Also note that, apart from S_3, the values for
⟨L, S, L^{-1}⟩ follow C(n, 2) = n(n − 1)/2.
It is a conjecture shared by Professor Li and others that this holds
for all larger n.
Diameter for S_n
Group   ⟨L, S⟩   ⟨L, S, L^{-1}⟩
S_2     1        1
S_3     2        2
S_4     6        6
S_5     11       10
S_6     18       15
S_7     25       21
S_8     35       28
S_9     45       36
S_10    58       45
S_11    71       55
S_12    87       66
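As a quick check of the binomial pattern, the short sketch below compares the tabulated diameters of the digraph generated by L, S and L^{-1} with n(n − 1)/2. The dictionary simply copies the values from the table above; the function name is illustrative.

```python
from math import comb

# Diameters of <L, S, L^-1> copied from the table above, keyed by n.
observed = {2: 1, 3: 2, 4: 6, 5: 10, 6: 15, 7: 21,
            8: 28, 9: 36, 10: 45, 11: 55, 12: 66}

def mismatches(table):
    """Values of n where the tabulated diameter differs from C(n, 2)."""
    return [n for n, d in table.items() if d != comb(n, 2)]

print(mismatches(observed))  # prints [3]
```

Only S_3 departs from the pattern in the computed range.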
New conjectures on the diameter of ⟨L, S⟩
From the numbers we have, we can make the following
observations, and hypothesize that the pattern will continue:
- For ⟨L, S⟩ and ⟨L, S, L^{-1}⟩, most patterns treat S_3 as an
  exception.
- For ⟨L, S⟩, the even and odd values fall on different parabolas
  for n ≠ 3, 4 and n ≤ 12. For odd values of n we have
  d = 3n^2/4 − 2n + 9/4, and for even values of n we have
  d = 3n^2/4 − 2n + 3. The proposed conjecture is that this
  pattern continues.
- For n > 4, we can say
  d = ⌊3n^2/4 − 2n + 3⌋ = ⌈3n^2/4 − 2n + 9/4⌉.
- Another formula: for n > 4, the diameter of ⟨L, S⟩ is
  d = C(n, 2) + ⌊(n − 1)/2⌋ ⌊(n − 4)/2⌋ + 1,
  where C(n, 2) = n(n − 1)/2 is the binomial coefficient.
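The formulas in this list can be checked against the computed diameters with a short script. The sketch below copies the values for the digraph generated by L and S from the diameter table for 5 ≤ n ≤ 12 and uses exact fractions to avoid rounding; the function names are illustrative.

```python
from fractions import Fraction
from math import ceil, comb, floor

# Diameters of <L, S> for n = 5..12, copied from the diameter table.
observed = {5: 11, 6: 18, 7: 25, 8: 35, 9: 45, 10: 58, 11: 71, 12: 87}

def parabola(n):
    """Odd n: 3n^2/4 - 2n + 9/4; even n: 3n^2/4 - 2n + 3."""
    constant = Fraction(9, 4) if n % 2 else Fraction(3)
    return Fraction(3 * n * n, 4) - 2 * n + constant

def floor_form(n):
    return floor(Fraction(3 * n * n, 4) - 2 * n + 3)

def ceil_form(n):
    return ceil(Fraction(3 * n * n, 4) - 2 * n + Fraction(9, 4))

def binomial_form(n):
    return comb(n, 2) + ((n - 1) // 2) * ((n - 4) // 2) + 1

for n, d in observed.items():
    print(n, d, parabola(n) == d, floor_form(n) == d,
          ceil_form(n) == d, binomial_form(n) == d)
```

All four expressions agree with the table on this range; for n = 4 they give 7 rather than the computed 6, which is why the statements are restricted to n > 4.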
⟨L, S, L^{-1}⟩ and ⟨L, S⟩
For these graphs, we can ask: how many different vertices are at the
maximum distance from the identity? The table below shows the
computed values.
Number of vertices with maximum distance from identity
Group   ⟨L, S⟩   ⟨L, S, L^{-1}⟩
S_2     1        1
S_3     3        2
S_4     3        1
S_5     2        1
S_6     1        1
S_7     2        1
S_8     1        1
S_9     2        1
S_10    1        1
S_11    2        1
S_12    1        1
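Counts like these can be obtained by expanding the graph one distance level at a time and keeping the last non-empty level. The sketch below does this for small n; the tuple encoding, the composition convention, and the function names are assumptions for illustration.

```python
def compose(a, s):
    """Product a*s as tuples: apply s first, then a (assumed convention)."""
    return tuple(a[i] for i in s)

def farthest_vertices(n, gens):
    """Return (maximum distance from the identity, vertices at that distance)."""
    identity = tuple(range(n))
    seen = {identity}
    frontier = [identity]
    depth = 0
    while True:
        next_level = []
        for a in frontier:
            for s in gens:
                b = compose(a, s)
                if b not in seen:
                    seen.add(b)
                    next_level.append(b)
        if not next_level:
            return depth, frontier
        frontier = next_level
        depth += 1

L = (1, 2, 0)
S = (1, 0, 2)
L_inv = (2, 0, 1)
depth, far = farthest_vertices(3, [L, S])
print(depth, len(far))  # prints 2 3
depth, far = farthest_vertices(3, [L, S, L_inv])
print(depth, len(far))  # prints 2 2
```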
Creating simple metrics
Once n! exceeds 24 (that is, for n > 4), a list of the distance of
each vertex from the identity (vertex 0) becomes difficult to
analyze by hand, so we can attempt to generate more useful
metrics. Of interest are the permutations which are at maximum
distance from the identity, as well as the paths that connect the
identity to these vertices, and the factorizations that each
represents. A metric which combines all of these is presented for
select values of n.
Paths of incrementing distance
A path of incrementing distance is defined here as one where the
distance from the identity for each vertex is one greater than that
of the vertex preceding it. A maximum incrementing path is one
whose final vertex v is at the maximum possible distance from the
identity. If we take the union of all maximum incrementing paths,
we have a subgraph showing the paths leading to the vertices
furthest from the identity. Call this subgraph the distance
incrementing subgraph.
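A hedged sketch of how this subgraph could be computed: first find the distance of every vertex by levels, then sweep backward from the farthest level, keeping an edge a → b only when b is already kept and lies exactly one level deeper than a. The representation (permutation tuples, named generators, edges as triples) is an assumption for this example.

```python
def compose(a, s):
    """Product a*s as tuples: apply s first, then a (assumed convention)."""
    return tuple(a[i] for i in s)

def distance_incrementing_subgraph(n, gens):
    """Return (vertices, edges) of the distance incrementing subgraph.

    gens maps generator names to permutation tuples; each edge is a
    triple (a, name, b) with b = a * gens[name].
    """
    identity = tuple(range(n))
    dist = {identity: 0}
    levels = [[identity]]  # levels[d] lists the vertices at distance d
    while levels[-1]:
        next_level = []
        for a in levels[-1]:
            for s in gens.values():
                b = compose(a, s)
                if b not in dist:
                    dist[b] = len(levels)
                    next_level.append(b)
        levels.append(next_level)
    levels.pop()  # drop the empty final level

    keep = set(levels[-1])  # start from the farthest vertices
    edges = []
    for level in reversed(levels[:-1]):
        for a in level:
            for name, s in gens.items():
                b = compose(a, s)
                if b in keep and dist[b] == dist[a] + 1:
                    edges.append((a, name, b))
                    keep.add(a)
    return keep, edges

gens = {"L": (1, 2, 0), "S": (1, 0, 2)}
vertices, edges = distance_incrementing_subgraph(3, gens)
print(len(vertices), len(edges))  # prints 6 5
```

Reading the generator names along any path from the identity to a farthest vertex in this subgraph gives one maximum factorization.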
Distance Incrementing Subgraph visualization
This is a simple case, where n = 3.
[Figure: distance incrementing subgraphs for ⟨L, S, L^{-1}⟩, n = 3 and ⟨L, S⟩, n = 3]
[Figure: distance incrementing subgraph for ⟨L, S⟩, n = 4]
⟨L, S⟩, n = 4
Factorizations:
SLLSLL, LLSLLS, LLSLLL, SLSLLS, LLLSLS, SLSLLL, LLLSLL,
SLLSLS
[Figure: distance incrementing subgraph for ⟨L, S⟩, n = 5]
⟨L, S⟩, n = 5
Maximum factorizations:
LLLLSLSLLLL, LSLSLLLSLSL, LSLSLLSLLLL, LSLLLSLLLSL,
LLLLSLLSLSL
[Figure: distance incrementing subgraph for ⟨L, S⟩, n = 6]
⟨L, S⟩, n = 6
Maximum factorizations:
SLSLLSLLSLSLSLLLLL, SLSLLSLSLLLLLSLLLL,
SLLLSLSLLLSLLSLLLL, LLLSLSLLSLSLLSLLLL,
SLSLLLLSLSLSLLSLLL, SLSLLLSLLLLLSLSLLL,
SLLLLSLLLSLLSLSLLL, LLLLSLLSLSLLSLSLLL,
SLLSLSLLSLSLLSLLSL, SLLLSLLSLSLLSLSLSL,
SLLLSLLSLSLSLLLLLS, SLLLSLSLLLLLSLLLLS,
LLLSLSLLSLLLSLLLLS, LSLSLSLLSLSLLSLLLS,
SLLLLLSLSLSLLSLLLS, SLLLLSLLLLLSLSLLLS,
LLLLSLLSLLLSLSLLLS, LSLLSLLSLSLLSLSLLS,
LLLSLLSLSLSLLLLSLS, LLLSLSLLLLLSLLLSLS,
LLLLLSLSLSLLSLLSLS, LLLLSLLLLLSLSLLSLS
[Figure: distance incrementing subgraph for ⟨L, S, L^{-1}⟩, n = 4]
⟨L, S, L^{-1}⟩, n = 4
The following is a list of all shortest factorizations of the
maximum-distance vertex (1 0 3 2), writing R for L^{-1}:
SLLSLL, SRRSLL, RSLLSL, SLSRSL, RSRRSL, LLSLLS, RRSLLS,
LSRSLS, RSLSRS, LLSRRS, RRSRRS, LSLLSR, SRSLSR,
LSRRSR, SLLSRR, SRRSRR
[Figure: distance incrementing subgraph for ⟨L, S, L^{-1}⟩, n = 5]
⟨L, S, L^{-1}⟩, n = 5
SRSLSLLSLL, SRRSRSLSLL, SLSLLSLLSL, RSRSLSLLSL,
RSRRSRSLSL, SRRSRRSRSL, RSLSLLSLLS, RRSRSLSLLS,
LSLLSLLSLS, RRSRRSRSLS, LLSLLSLSRS, RSRRSRRSRS,
LLSLSRSRRS, LSRSRRSRRS, SLLSLLSLSR, LSLLSLSRSR,
LSLSRSRRSR, SRSRRSRRSR, SLLSLSRSRR, SLSRSRRSRR
Swapping adjacent vertices, n = 4
The graph here uses the generating set of transpositions
swapping adjacent values. While this set’s behavior is well
understood, its inclusion here illustrates that this metric applies to
all Cayley digraphs on S_n.
[Figure: distance incrementing subgraph for adjacent transpositions, n = 4]
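For this generating set the distances have a known closed form: multiplying on the right by an adjacent transposition swaps two neighboring entries, so the distance from the identity is the number of inversions, as in bubble sort. The sketch below checks this for n = 4; the encoding and function names are assumptions for illustration.

```python
from collections import deque

def adjacent_swaps(n):
    """Generators (i i+1) for i = 0..n-2, as tuples in one-line notation."""
    gens = []
    for i in range(n - 1):
        s = list(range(n))
        s[i], s[i + 1] = s[i + 1], s[i]
        gens.append(tuple(s))
    return gens

def distances(n, gens):
    """Breadth-first distances from the identity under right multiplication."""
    identity = tuple(range(n))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        a = queue.popleft()
        for s in gens:
            b = tuple(a[i] for i in s)
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return dist

def inversions(p):
    """Number of pairs i < j with p[i] > p[j]."""
    return sum(1 for i in range(len(p))
               for j in range(i + 1, len(p)) if p[i] > p[j])

dist = distances(4, adjacent_swaps(4))
print(max(dist.values()))                                   # prints 6
print(all(d == inversions(p) for p, d in dist.items()))     # prints True
```

The unique farthest vertex is the reversal (3 2 1 0) in one-line notation, at distance 6 = C(4, 2).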
Conclusion
- The extreme size of S_n makes a Cayley digraph impractical to
  compute for large values of n.
- Even when n is less than 12, the structure of a Cayley digraph
  is difficult to understand.
- I propose the distance incrementing subgraph as a new metric
  for understanding the structure of a Cayley digraph.