How to correctly deal with pseudorandom numbers in manycore environments?
Application to GPU programming with Shoverand

Jonathan Passerat-Palmbach
High Performance Computing and Simulation Conference (HPCS 2012)
ISIMA - LIMOS UMR CNRS 6158
Clermont Université - Université Blaise Pascal
July 2nd, 2012

Outline

1 Theoretical Aspects
2 Guidelines to deal with pseudorandom streams distribution
3 The ShoveRand solution

1 Theoretical Aspects

GPU characteristics

- Graphics board
- SIMD architecture
- Favours computing
- Memory hierarchy
- Controlled from a host machine

Legacy constraints from PRNGs

- Good statistical quality
- Independent sequences
- Low memory footprint
- Number throughput
- Reproducibility

GPU-related constraints

- No global synchronization
- Memory hierarchy
- 'SIMD-compliant' algorithm

What is the right way to do it?

2 Guidelines to deal with pseudorandom streams distribution

Seed Status

- Current state of the PRNG
- Data structure depending on the PRNG

Example:
- Array of integers for Mersenne Twisters
- 6 integers for MRG32k3a
- Single counter for Threefry

Parameterized Status

- PRNGs' intrinsic parameters

Example:
- Output of Dynamic Creator for Mersenne Twisters
- Matrices computed through Jump Ahead for MRG32k3a
- Integer key for Threefry

3 The ShoveRand solution

Two main goals

- Simplify the use of PRNGs on GPUs for developers
- Safely integrate new PRNG algorithms

User side

- Nice interface
- Swap PRNGs in the blink of an eye
- No intrusive declarations
- CMake!

API close to high-level languages (C++, Java)

    __global__ void testMRG32k3a(double* ddata) {

        RNG < float, MRG32k3a > rng;

        ddata[blockDim.x * blockIdx.x + threadIdx.x] = rng.next();
    }
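
The "swap PRNGs in the blink of an eye" point follows from the algorithm being a template parameter of the wrapper. The host-only C++ sketch below illustrates that idea under stated assumptions: Rng, ToyLcg, ToyXorshift and meanOfDraws are hypothetical names invented for this example, the two toy generators are not part of ShoveRand, and nothing here reproduces the library's actual implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy algorithm 1: 32-bit linear congruential generator
// (multiplier and increment from Numerical Recipes).
struct ToyLcg {
    std::uint32_t state;
    explicit ToyLcg(std::uint32_t seed) : state(seed) {}
    std::uint32_t nextBits() {
        state = 1664525u * state + 1013904223u;  // wraps modulo 2^32
        return state;
    }
};

// Hypothetical toy algorithm 2: Marsaglia's 32-bit xorshift (13, 17, 5).
struct ToyXorshift {
    std::uint32_t state;
    // The all-zero state is a fixed point, so replace a zero seed.
    explicit ToyXorshift(std::uint32_t seed) : state(seed ? seed : 1u) {}
    std::uint32_t nextBits() {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        return state;
    }
};

// Stand-in for the wrapper on the slide (an assumption, not the library
// class): callers only see next(), never the concrete algorithm.
template <typename T, typename Algo>
class Rng {
public:
    explicit Rng(std::uint32_t seed) : algo_(seed) {}
    // Keep the top 24 bits and scale to [0, 1); 24 bits convert to float exactly.
    T next() {
        return static_cast<T>(algo_.nextBits() >> 8) / static_cast<T>(16777216);
    }
private:
    Algo algo_;
};

// Changing the generator means editing this one line.
typedef Rng<float, ToyLcg> random_engine;

// Calling code written once for any engine type.
template <typename Engine>
float meanOfDraws(std::uint32_t seed, int draws) {
    Engine rng(seed);
    float sum = 0.0f;
    for (int i = 0; i < draws; ++i) {
        const float x = rng.next();
        assert(x >= 0.0f && x < 1.0f);
        sum += x;
    }
    return sum / static_cast<float>(draws);
}
```

Calling meanOfDraws<random_engine>(42u, 1000) and meanOfDraws<Rng<float, ToyXorshift> >(42u, 1000) runs the same code with two different generators; only the template argument differs.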

Demo 1

Example of use: Pi calculation through a Monte Carlo method

Summary: device side

    #include <shoverand/prng/mrg32k3a/MRG32k3a.hxx>
    #include <shoverand/core/RNG.hxx>

    using shoverand::RNG;
    using shoverand::MRG32k3a;

    typedef RNG < float, MRG32k3a > random_engine;

    __global__ void kernelMonteCarloPi( float * outArray ) {
        random_engine rng;
        ...
        rng.next();
    }

Summary: host side

    int main() {
        ...
        random_engine::init(nbBlocks);
        ...
        kernel<<< nbBlocks, nbThreads >>> (array);
        ...
        random_engine::release();
        ...
    }

Developer side

- Provide a PRNG class
- No inheritance required
- Few constraints to satisfy

Safe integration of PRNGs: concepts

- Ability of an object to match some requirements

Example:
- Objects stored in sorted containers must be comparable
- Serializable, Assignable, Default constructible
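
The "sorted containers must be comparable" example can be made concrete with the standard library. In this minimal C++ sketch, the StreamId type and the smallestBlock function are hypothetical names invented for illustration. The type satisfies the requirement of std::set by providing operator<, with no base class involved.

```cpp
#include <cassert>
#include <set>

// Hypothetical value type: identifies a pseudorandom stream by block index.
struct StreamId {
    unsigned block;
};

// Providing operator< is all std::set needs (through std::less): the type
// meets the "comparable" requirement without inheriting from anything.
inline bool operator<(const StreamId& a, const StreamId& b) {
    return a.block < b.block;
}

// Insert ids in arbitrary order and return the smallest one. Without the
// operator< above, this function would fail to compile rather than fail
// at run time.
inline unsigned smallestBlock() {
    std::set<StreamId> ids;
    const unsigned blocks[] = {3u, 1u, 2u, 1u};
    for (unsigned i = 0; i < 4; ++i) {
        StreamId id = {blocks[i]};
        ids.insert(id);
    }
    assert(ids.size() == 3);  // the duplicate 1 is stored only once
    return ids.begin()->block;
}
```

The requirement is checked when the container is instantiated, which is the property the slides rely on for integrating new PRNGs safely.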

RNGAlgorithm concept in action

    BOOST_CONCEPT_USAGE(RNGAlgorithm) {
        // require Algo<T>::init()
        al_.init(42);
        // require Algo<T>::release()
        al_.release();
        // require T Algo<T>::next()
        value_ = al_.next();

        // require Algo<T>::ss_ to be of SeedStatus<Algo> type
        same_type(ss_, al_.ss_);

        // require Algo<T>::ps_ to be of ParameterizedStatus<Algo> type
        same_type(ps_, al_.ps_);
    }

Demo 2

Embedding a dummy PRNG

Summary: Outside PRNG class

    #include <shoverand/core/SeedStatus.hxx>
    #include <shoverand/core/ParameterizedStatus.hxx>

    template <class T> class DummyGenerator;

    template <> class SeedStatus<DummyGenerator> { ... };

    template <> class ParameterizedStatus<DummyGenerator> { ... };

    typedef :: SeedStatus < DummyGenerator > SeedStatusDummyGenerator;
    typedef :: ParameterizedStatus < DummyGenerator > ParameterizedStatusDummyGenerator;

Summary: Inside PRNG class

    template < typename T >
    T DummyGenerator<T>::next() {
        ...
    }

    template < typename T >
    void DummyGenerator<T>::init(unsigned int inNbBlocks) {
        ...
    }

    template < typename T >
    T DummyGenerator<T>::release() {
        ...
    }
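
The two summaries can be combined into one host-only C++ sketch of a dummy generator exposing the members the RNGAlgorithm usage pattern asks for. Several parts are assumptions made for illustration: the primary SeedStatus and ParameterizedStatus templates are declared locally instead of coming from the ShoveRand headers, the member names counter_, step_ and nbBlocks_ are invented, release() is assumed to return void, and exerciseRngAlgorithm is a Boost-free imitation of the concept check, not the library's code.

```cpp
#include <cassert>

// Assumed shape of the status templates: parameterized by the generator
// class template, then specialized for each generator.
template <template <class> class Algo> class SeedStatus;
template <template <class> class Algo> class ParameterizedStatus;

template <class T> class DummyGenerator;

// Seed status: the evolving state, here a single counter.
template <> class SeedStatus<DummyGenerator> {
public:
    unsigned int counter_;
};

// Parameterized status: intrinsic parameters, here a fixed increment.
template <> class ParameterizedStatus<DummyGenerator> {
public:
    unsigned int step_;
};

typedef ::SeedStatus<DummyGenerator> SeedStatusDummyGenerator;
typedef ::ParameterizedStatus<DummyGenerator> ParameterizedStatusDummyGenerator;

// No base class: the generator only exposes the expected members.
template <typename T>
class DummyGenerator {
public:
    DummyGenerator() { ss_.counter_ = 0u; ps_.step_ = 1u; }

    static void init(unsigned int inNbBlocks) { nbBlocks_ = inNbBlocks; }
    static void release() { nbBlocks_ = 0u; }

    // Deterministic and statistically useless: cycles through k/8 in [0, 1).
    T next() {
        ss_.counter_ = (ss_.counter_ + ps_.step_) % 8u;
        return static_cast<T>(ss_.counter_) / static_cast<T>(8);
    }

    SeedStatusDummyGenerator ss_;
    ParameterizedStatusDummyGenerator ps_;
    static unsigned int nbBlocks_;
};

template <typename T> unsigned int DummyGenerator<T>::nbBlocks_ = 0u;

// Compiles only when both arguments have the same type.
template <typename X> void same_type(const X&, const X&) {}

// Imitation of the usage pattern: fails to compile if a member is missing
// or has the wrong type, then returns the first value drawn.
template <typename T, template <class> class Algo>
T exerciseRngAlgorithm() {
    Algo<T> al;
    SeedStatus<Algo> ss = al.ss_;
    ParameterizedStatus<Algo> ps = al.ps_;
    same_type(ss, al.ss_);
    same_type(ps, al.ps_);
    al.init(42);
    T value = al.next();
    al.release();
    assert(value >= static_cast<T>(0) && value < static_cast<T>(1));
    return value;
}
```

The typedefs declared outside the class let the generator name its own status types without writing SeedStatus<DummyGenerator> inside the class body, where DummyGenerator also names the current instantiation.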

Conclusion

Future work: more to come in ShoveRand

- Distribution functions
- OpenCL port?
- More and more top-quality RNGs (Threefry, ...)

Any questions?

Thank you for listening!
[email protected]
http://forge.clermont-universite.fr/projects/shoverand