Parallel String Sorting with Super Scalar String Sample Sort Timo Bingmann and Peter Sanders | September 4th, 2013 @ ESA’13 I NSTITUTE OF T HEORETICAL I NFORMATICS – A LGORITHMICS KIT – University of the State of Baden-Wuerttemberg and National Laboratory of the Helmholtz Association www.kit.edu Abstract We present the currently fastest parallel string sorting algorithm for modern multi-core shared memory architectures. First, we describe the challenges posed by these new architectures, and discuss key points to achieving high performance gains. Then we give an overview of existing sequential and parallel string sorting algorithms and implementations. Thereafter, we continue by developing super scalar string sample sort (S5 ), which is easily parallelizable and yields higher parallel speedups than all previously known algorithms. Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 2 / 29 Overview 1 Introduction and Motivation Parallel Memory Bandwidth Test 2 String Sorting Algorithms Radix Sort Multikey Quicksort Super Scalar String Sample Sort 3 Experimental Results 4 Conclusion Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 3 / 29 String Sorting Algorithms Theoretical Parallel Algorithms “Optimal Parallel String Algorithms: . . .” [Hagerup ’94] Existing Basic Sequential Algorithms Radix Sort Multikey Quicksort Burstsort LCP-Mergesort [McIlroy et al. ’95] [Bentley, Sedgewick ’97] [Sinha, Zobel ’04] [Ng, Kakehi ’08] Existing Algorithm Library by Tommi Rantala (for Radix Sort [Kärkkäinen, Rantala ’09]) http://github.com/rantala/string-sorting Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 4 / 29 String Sorting Algorithms Theoretical Parallel Algorithms “Optimal Parallel String Algorithms: . . .” [Hagerup ’94] Existing Basic Sequential Algorithms Radix Sort Multikey Quicksort Burstsort LCP-Mergesort [McIlroy et al. ’95] [Bentley, Sedgewick ’97] [Sinha, Zobel ’04] [Ng, Kakehi ’08] Existing Algorithm Library by Tommi Rantala (for Radix Sort [Kärkkäinen, Rantala ’09]) http://github.com/rantala/string-sorting Our Contribution: Practical Parallel Algorithms Parallel Super Scalar String Sample Sort (pS5 ) [This Work] Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 4 / 29 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L3 Cache L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L2 L3 Cache RAM RAM L3 Cache RAM RAM Modern Multi-Core Architecture L3 Cache Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 5 / 29 Parallel Memory Bandwidth Test // ScanRead64IndexUnrollLoop for (size_t i=0; i<n; i+=16) { uint64_t x0 = array[i+0]; // ... 14 times uint64_t x15 = array[i+15]; } // PermRead64SimpleLoop uint64_t p = *array; while((uint64_t*)p != array) p = *(uint64_t*)p; n p1 p2 p3 p4 n p1 p2 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 6 / 29 ScanRead64IndexUnrollLoop 1,200 p=1 p=2 p=4 p=8 p = 16 p = 32 p = 64 bandwidth [GiB/s] 1,000 800 600 400 200 0 210 215 220 225 area size n [B] 230 235 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 7 / 29 ScanRead64IndexUnrollLoop 1,200 p=1 p=2 p=4 p=8 p = 16 p = 32 p = 64 bandwidth [GiB/s] 1,000 1,137 GiB/s 800 600 400 16 GiB/s 200 0 210 215 220 225 area size n [B] 230 235 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 7 / 29 ScanRead64IndexUnrollLoop 50 p=1 p=2 p=4 p=8 p = 16 p = 32 p = 64 bandwidth [GiB/s] 40 30 20 10 0 210 215 220 225 area size n [B] 230 235 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 8 / 29 ScanRead64IndexUnrollLoop 50 p=1 p=2 p=4 p=8 p = 16 p = 32 p = 64 bandwidth [GiB/s] 40 30 35 GiB/s 20 2.6 GiB/s 10 0 210 215 220 225 area size n [B] 230 235 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 8 / 29 PermRead64SimpleLoop bandwidth [GiB/s] 300 p=1 p=2 p=4 p=8 p = 16 p = 32 p = 64 200 100 0 210 215 220 225 area size n [B] 230 235 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 9 / 29 access latency [ns] PermRead64SimpleLoop p=1 p=2 p=4 p=8 p = 16 p = 32 p = 64 600 400 200 0 210 215 220 225 area size n [B] 230 235 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 10 / 29 access latency [ns] PermRead64SimpleLoop p=1 p=2 p=4 p=8 p = 16 p = 32 p = 64 600 400 6 ns 200 0.028 ns 0 210 215 220 225 area size n [B] 230 235 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 10 / 29 Log-Log Access Latency access latency [s] 10−6 Scan p = 1 Scan p = 64 Perm p = 1 Perm p = 64 10−7 10−8 10−9 10−10 10−11 210 215 220 225 area size n [B] 230 235 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 11 / 29 Parallel Memory Bandwidth Test Scanning an Array p1 p2 n p3 p4 Random Access in an Array n p1 p 1 16 32 Cache (n = 16 KiB) Scan Random 35 GiB/s 4.4 GiB/s 567 GiB/s 69 GiB/s 1131 GiB/s 137 GiB/s p2 p 1 16 32 RAM (n = 4 GiB) Scan Random 2.6 GiB/s 19 MiB/s 10.3 GiB/s 403 MiB/s 10.6 GiB/s 763 MiB/s Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 12 / 29 Sorting Strings a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c a y a a n c t a e c p h a n k e h e d g e l e n n e u s t on a i c ⇒ a a a a a a k k k k k k k b l r r r r a e i i i i r a p c c r r y r t t t t y c h a a a a a n u a d i n y k e s e c g e l c h e n e t e n p t on Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 13 / 29 Sorting Strings a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c a y a a n c t a e c p h a n k e h e d g e l e n n e u s t on a i c ⇒ a a a a a a k k k k k k k b l r r r r a e i i i i r a p c c r r y r t t t t y c h a a a a a n u a d i n y k e s e c g e l c h e n e t e n p t on Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 13 / 29 Sorting Strings: Radix Sort a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c a y a a n c t a e c p h a n k e h e d g e 0 a k σ− 1 0 6 7 0 l e n n e u s t on a i c Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 14 / 29 Sorting Strings: Radix Sort a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c a y a a n c t a e c p h a n k e h e d 0 a k σ− 1 g e 0 6 7 0 l e n n e 0 0 6 6 13 13 13 u s t on a i c Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 14 / 29 Sorting Strings: Radix Sort a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c a y a a n c t a e c p h a n k e h e d 0 a k σ− 1 g e 0 6 7 0 l e n n e 0 0 6 6 13 13 13 u s t on a i c Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 14 / 29 Sorting Strings: Radix Sort a a a a a a k k k k k k k r r r b l r i a e i i i r r r c a p c t y r t t t y a a a c h a y n d u a i a n c t e p k e l h e n e n g e e s 0 a k σ− 1 0 6 7 0 0 0 6 6 13 13 13 c t on Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 14 / 29 Sorting Strings: Radix Sort a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c a y a a n c t a e c p h a n k e h e d 0 a k σ− 1 g e 0 6 7 0 l e n n e 0 0 6 6 13 13 13 2 2 2 3 3 1 0 0 0 p1 0 p2 0 p3 0 u s t on a i c Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 14 / 29 Sorting Strings: Radix Sort a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c a y a a n c t a e c p h a n k e h e d 0 a k σ− 1 g e 0 6 7 0 l e n n e 0 0 6 6 13 13 13 p1 0 p2 0 p3 0 2 2 2 3 3 1 0 0 0 p1 0 p2 0 p3 0 0 6 2 6 4 6 6 13 9 13 12 13 13 13 13 13 u s t on a i c Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 14 / 29 Sorting Strings: Multikey Quicksort a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c [Bentley, Sedgewick ’97] a y a a n c t a e c p h a n k e h e d g e l e n n e u s t on a i c Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 15 / 29 Sorting Strings: Multikey Quicksort a a k k a a a a k k k k k r r a e r b l r i i i i r r r y r c a p c t t t t y a a a n a c h a y n k e d u a i [Bentley, Sedgewick ’97] g e l e s < c c h e n t e n e p t on = > Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 15 / 29 Sorting Strings: Multikey Quicksort a a k k a a a a k k k k k r r a e r b l r i i i i r r r y r c a p c t t t t y a a a n a c h a y n k e d u a i [Bentley, Sedgewick ’97] = g e l e s < ? > = < c c h e n t e n e p t on = > Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 15 / 29 Sorting Strings: Multikey Quicksort a a k k a a a a k k k k k r r a e r b l r i i i i r r r y r c a p c t t t t y a a a n a c h a y n k e d u a i [Bentley, Sedgewick ’97] < g e l e s = > < c c h e n t e n e p t on = > Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 15 / 29 Sorting Strings: Multikey Quicksort a a k k a a a a k k k k k r r a e r b l r i i i i r r r y r c a p c t t t t y a a a n a c h a y n k e d u a i [Bentley, Sedgewick ’97] < g e l e s > [Rantala ’??] < partition by w = 8 characters cache characters ⇒ fewer random accesses c c h e n t e n e p t on = fastest sequential algorithm = > Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 15 / 29 Sorting Strings: Multikey Quicksort a a k k a a a a k k k k k r r a e r b l r i i i i r r r y r c a p c t t t t y a a a n a c h a y n k e d u a i [Bentley, Sedgewick ’97] < g e l e s > [Rantala ’??] < partition by w = 8 characters cache characters ⇒ fewer random accesses c c h e n t e n e p t on = fastest sequential algorithm [This Work] = ? ? ? ? ? ? ? ? ? ? ? ? ? > parallelize using blocks Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 15 / 29 Sorting Strings: Multikey Quicksort a a k k a a a a k k k k k r r a e r b l r i i i i r r r y r c a p c t t t t y a a a n a c h a y n k e d u a i g e l e s ? ? ? ? ? ? ? ? ? ? ? ?? < c c h e n t e n e p t on = > Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29 Sorting Strings: Multikey Quicksort a a k k a a a a k k k k k r r a e r b l r i i i i r r r y r c a p c t t t t y a a a n a c h a y n k e d u a i g e l e s ? ? ? ? ? ? ? ? ? ? ? ?? < c c h e n t e n e p t on p1 ? ? ? p2 ? ? ? p3 ? ? ? = > Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29 Sorting Strings: Multikey Quicksort a a k k a a a a k k k k k r r a e r b l r i i i i r r r y r c a p c t t t t y a a a n a c h a y n k e d u a i g e l e s ? ? ? ? ? ? ? ? ? ? ? ?? < < =? >? p2 < ? = ? p3 < ? c c h e n t e n e p t on p1 = > >? = > Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29 Sorting Strings: Multikey Quicksort r r a e r b l r i i i i r r r y r c a p c t t t t y a a a n a c h a y n k e d u a i g e l e s ? ? ? ? ? ? ? ? ? ? ? ?? < ? =? >? p2 < ? = ? p3 < ? c c h e n t e n e p t on p1 = > Output a a k k a a a a k k k k k ? ? >? < = > Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29 Sorting Strings: Multikey Quicksort r r a e r b l r i i i i r r r y r c a p c t t t t y a a a n a c h a y n k e d u a i g e l e s ? ? ? ? ? ? ? ? ? ? ? ?? < p2 < = > > =? >? p3 < ? = ? c c h e n t e n e p t on p1 < ? = ? Output a a k k a a a a k k k k k > < = > Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29 Sorting Strings: Multikey Quicksort r r a e r b l r i i i i r r r y r c a p c t t t t y a a a n a c h a y n k e d u a i g e l e s ? ? ? ? ? ? ? ? ? ? ? ?? < p2 0 =? >? p3 < ? = ? c c h e n t e n e p t on p1 < ? = ? ? 0 = > Output a a k k a a a a k k k k k < < 0 < = > > Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29 Sorting Strings: Multikey Quicksort r r y r c a p c t t t t y a a a n a c h a y n k e d u a i g e l e s ? ? ? ? ? ? ? ? ? ? ? ?? < p2 < 0 = 0 > 0 p3 < 0 = 0 > 0 c c h e n t e n e p t on p1 < 0 = 0 > 0 = > Output r r a e r b l r i i i i r < < More Output a a k k a a a a k k k k k < = > > Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29 Super Scalar String Sample Sort (S5 ) a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c a y a a n c t a e c p h a n k e h e d g e l e n n e u s t on a i c Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29 Super Scalar String Sample Sort (S5 ) a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c a y a a n c t a e c p h a n k e h e d g e l e n n e ar < > = ab ki < = > < = > b0 b1 b2 b3 b4 b5 b6 u s t on a i c Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29 Super Scalar String Sample Sort (S5 ) a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c a y a a n c t a e c p h a n k e h e d g e l e n n e u s t on a i c ar < > = ab ki < = > < = > b0 b1 b2 b3 b4 b5 b6 b0 b1 b2 b3 b4 b5 b6 b0 b1 b2 b3 b4 b5 b6 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29 Super Scalar String Sample Sort (S5 ) a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c a y a a n c t a e c p h a n k e h e d ar g e l e n n e u s t on a i c < > = ab ki < = > < = > 0 0 0 0 1 0 0 0 1 2 1 1 2 0 0 1 3 0 0 0 1 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29 Super Scalar String Sample Sort (S5 ) a k a k k k k a k a k a a r i r a e i i r i b r l r r t r y r t t c t a y p c a y a a n c t a e c p h a n k e h e d ar g e l e n n e u s t on a i c < > = ab ki < = > < = > 0 0 0 0 1 1 1 1 2 4 5 6 8 8 8 9 12 12 12 12 13 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29 Super Scalar String Sample Sort (S5 ) a a a a a a k k k k k k k b l r r r r a e i i i i r a p r r c c y r t t t t y c h a a a a a n u a y n d i k e s ar < > g e e c = ab ki < = > < = > l 0 0 0 c h e n t e n e p t on 0 1 1 1 1 2 4 5 6 8 8 8 9 12 12 12 12 13 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29 Super Scalar String Sample Sort (S5 ) prefix a b a c u s 2 1 a l ph a a a a a k k k k k k k r r r r a e i i i i r r r c c y r t t t t y a a a a a n y n d i k e g e 2 e c l 0 c h e n 2 t e n e p t on 0 increase prefix by LCP of splitters or key size. > ar < 1 0 = ab ki < = > < = > 0 0 0 0 1 1 1 1 2 4 5 6 8 8 8 9 12 12 12 12 13 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29 Super Scalar String Sample Sort (S5 ) prefix a b a c u s 2 1 a l ph a a a a a k k k k k k k r r r r a e i i i i r r r c c y r t t t t y a a a a a n y n d i k e g e 2 e c l 0 c h e n 2 t e n e p t on 0 increase prefix by LCP of splitters or key size. > ar < 1 0 = ab ki < = > < = > 0 0 0 0 1 1 1 1 2 4 5 6 8 8 8 9 12 12 12 12 13 partitions by w = 8 chars easy parallelization Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29 Super Scalar String Sample Sort (S5 ) predicated instructions ar < > = ab ki < = > < = > 1 2 3 ar ab ki i := 2i + 0/1 b0 b1 b2 b3 b4 b5 b6 partitions by w = 8 chars easy parallelization 256 KiB L2 cache: 13 levels Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 18 / 29 Super Scalar String Sample Sort (S5 ) predicated instructions ar < > = ab ki < = > < = > b0 b1 b2 b3 b4 b5 b6 partitions by w = 8 chars 1 2 3 ar ab ki i := 2i + 0/1 equality checking: 1 at each node 2 after full descent easy parallelization 256 KiB L2 cache: 13 levels Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 18 / 29 Super Scalar String Sample Sort (S5 ) predicated instructions ar < > = ab ki < = > < = > b0 b1 b2 b3 b4 b5 b6 partitions by w = 8 chars easy parallelization 256 KiB L2 cache: 13 levels 1 2 3 ar ab ki i := 2i + 0/1 equality checking: 1 at each node 2 after full descent interleave tree descents: classify 4 strings at once ⇒ super scalar parallelism Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 18 / 29 Parallel S5 – Sub-Algorithms |S| ≥ n p n p > |S| ≥ 216 fully parallel S5 sequential S5 216 > |S| ≥ 64 caching multikey quicksort 64 > |S| insertion sort Important: dynamic load balancing with voluntary work freeing Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 19 / 29 URLs – 1.1 G Lines, 70.7 GiB http://algo2.iti.kit.edu/index.php http://algo2.iti.kit.edu/english/index.php http://algo2.iti.kit.edu/1483.php http://algo2.iti.kit.edu/1484.php http://www.kit.edu/ http://algo2.iti.kit.edu/286.php http://algo2.iti.kit.edu/1294.php http://algo2.iti.kit.edu/research.php http://algo2.iti.kit.edu/publications.php http://algo2.iti.kit.edu/members.php http://algo2.iti.kit.edu/lehre.php http://algo2.iti.kit.edu/1844.php http://algo2.iti.kit.edu/294.php http://algo2.iti.kit.edu/basic-toolbox-page.php http://map.project-osrm.org?dest=49.0137004,8.419307&destname=%22Am%20Fa... http://www.uni-karlsruhe.de/fs/Uni/info/campusplan/index.php?id=50.34 http://www.informatik.kit.edu/1158.php http://algo2.iti.kit.edu/emailform.php?id=eea49c752e4c1329710cba2efae10511 http://algo2.iti.kit.edu/routeplanning.php http://algo2.iti.kit.edu/sanders.php http://www.mwk.baden-wuerttemberg.de/forschung/forschungsfoerderung/land... http://algo2.iti.kit.edu/1996.php Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 20 / 29 URLs – Speedup on 32-core Intel E5 10 speedup 8 pS5 pMultikeyQuicksort pRadixSort 6 4 2 0 1 8 16 32 48 number of threads 64 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 21 / 29 Wikipedia – Suffixes of 4 GiB <page> <title>Karlsruhe Institute of Technology</title> <ns>0</ns> <id>5977314</id> <revision> <id>491222814</id> <timestamp>2013-04-24T03:17:12Z</timestamp> <text xml:space="preserve">{{Infobox university |name = Karlsruhe Institute of Technology |image = [[Image:KIT logo.png|270px|]] |tagline = ’’{{lang|de|Forschungsuniversität}}’’ (Research University) |established = Fridericiana: 1825 as polytechnical school, 1865 as university;<br... |president = Horst Hippler, Eberhard Umbach |students = 23,905 (October 2012) |staff = 3.423<ref name="kit.edu"/>}} The ’’’Karlsruhe Institute of Technology’’’ (’’’KIT’’’) is one of the largest research and educations institution in Germany, resulting from a merger of the university (’’Universität Karlsruhe (TH)’’) and the research center (’’Forschungszentrum Karlsruhe’’)<ref>[[ Federal Ministry of Education and Research (Germany)]]: http://www.bmbf.de/pub/eck punktepapier_kit.pdf</ref> of the city of [[Karlsruhe]]. The university, also known as ’’’’’Fridericiana’’’’’, was founded in 1825. In 2009, it merged with the former national nuclear research center founded in 1956 as the ’’Kernforschungszentrum Karlsruhe (KfK)’’. Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 22 / 29 Suffixes – Speedup on 32-core Intel E5 20 speedup 15 10 5 pS5 pMultikeyQuicksort pRadixSort 0 1 8 16 32 48 number of threads 64 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 23 / 29 GOV2 – 3.1 G Lines in 128 GiB <p align="left"><font face="Verdana, Arial, Helvetica, sans-serif"><b>BACKGROUND INFORMATION. </b>The Forest Ecosystem Dynamics (FED) Project is concerned with modeling and monitoring ecosystem processes and patterns in response to natural and anthropogenic effects. The project uses coupled ecosystem models and remote sensing models and measurements to predict and observe ecosystem change. The overall objective of the FED project is to link and use models of forest dynamics, soil processes, and canopy energetics to understand how ecosystem response to change affects patterns and processes in northern and boreal forests and to assess the implications for global change. See <a href="http://fedwww.gsfc.nasa.gov/html/conceptdiagram.html"> Conceptual Diagram</a> for model schematic. </font></p> <p align="left"><font face="Verdana, Arial, Helvetica, sans-serif">The Forest Ecosystem Dynamics World-Wide-Web server has been online since July 1994. The FED server was created for the dissemination of project information, to archive numerous spatial and scientific data sets, and demonstrate the linking of ecosystem and remote sensing models. </font></p> </td> <td valign="top" width="28%" align="left" height="565"> <p><img src="vertrule.gif" alt="vertical rule" width="2" height="580" hspace="5" ... <p><font size="2" face="Verdana, Arial, Helvetica, sans-serif"><a href="http://fedwww... Abstract</a> </font></p> <p><font face="Verdana, Arial, Helvetica, sans-serif" size="2"><a href="http://fedwww... Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 24 / 29 GOV2 – Speedup on 32-core Intel E5 12 10 speedup 8 pS5 pMultikeyQuicksort pRadixSort 6 4 2 0 1 8 16 32 48 number of threads 64 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 25 / 29 Words – 31.6 M Lines, 382 MiB stereosto tocellular cellularand andportable wellas asdiscussion ofcomputer computersecurity securityissues issuesin ingeneral generalWhile Whilewe wewelcome welcomeall allmanner mannerof ofquestion questionabout aboutencrypted encryptedmessages messagesHow doesPGP PGPencrypt encrypta amessage messageWhat apublic publickey keyencrypted messagesthemselves themselvesare offtopic topicHost CamelotMessage MessageFile FileBBS BBSCulver CulverCity CityCA ListGraphics GraphicsGeneral GeneralGraphics Graphics15 15Discussions aboutall alltypes typesof computergraphics graphicsand languageconference forthe theexchange exchangeof ofideas ideasin Learnmore moreabout theSpanish Spanishculture culturewhile communicatingin thelanguage languageHost ListFantsySciF FantsySciFFantasy Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 26 / 29 Words – Speedup on 4-core Intel i7 4 speedup 3 2 pS5 pMultikeyQuicksort pRadixSort 1 1 2 3 5 4 6 number of threads 7 8 Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 27 / 29 Future Work Non-uniform memory architecture (NUMA) effects distributed string sorting algorithms distributed text index construction and query high-performance middleware? Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 28 / 29 Thank you for your attention! Questions? Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample Sort Institute of Theoretical Informatics – Algorithmics September 4th, 2013 29 / 29
© Copyright 2025