How to correctly deal with pseudorandom numbers in manycore environments?

Application to GPU programming with ShoveRand
Jonathan Passerat-Palmbach
High Performance Computing and Simulation Conference
HPCS 2012
ISIMA - LIMOS UMR CNRS 6158
Clermont Université - Université Blaise Pascal
July 2nd, 2012
Outline
1. Theoretical Aspects
2. Guidelines to deal with pseudorandom streams distribution
3. The ShoveRand solution
Theoretical Aspects
GPU characteristics
Graphics board
SIMD Architecture
Favours computing
Memory hierarchy
Controlled from a host machine
Legacy constraints from PRNGs
Good statistical quality
Independent sequences
Low memory footprint
Number throughput
Reproducibility
GPU-related constraints
No global synchronization
Memory hierarchy
’SIMD-compliant’ algorithm
What is the right way to do it?
Guidelines to deal with pseudorandom streams distribution
Seed Status
Current state of the PRNG
Data structure depending on the PRNG
Example
Array of integers for Mersenne Twisters
6 integers for MRG32k3a
Single counter for Threefry
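To make the "6 integers for MRG32k3a" case concrete, here is a minimal host-side C++ sketch of that seed status and of one generation step. The names Mrg32k3aState and mrg32k3aNext are hypothetical and are not ShoveRand's classes; the constants are those of L'Ecuyer's published MRG32k3a, and the code runs on the CPU rather than in a kernel.

```cpp
#include <cassert>
#include <cstdint>

// Seed status of MRG32k3a: six integers, three per component.
// (Hypothetical layout for illustration, not the ShoveRand SeedStatus class.)
struct Mrg32k3aState {
    std::int64_t s1[3];  // first component, values in [0, m1), not all zero
    std::int64_t s2[3];  // second component, values in [0, m2), not all zero
};

const std::int64_t kMrgM1 = 4294967087LL;  // modulus of component 1
const std::int64_t kMrgM2 = 4294944443LL;  // modulus of component 2

// Advance the state by one step and return a number in (0, 1).
double mrg32k3aNext(Mrg32k3aState& st) {
    // Component 1: x_n = (1403580 * x_{n-2} - 810728 * x_{n-3}) mod m1
    std::int64_t p1 = (1403580LL * st.s1[1] - 810728LL * st.s1[0]) % kMrgM1;
    if (p1 < 0) p1 += kMrgM1;
    st.s1[0] = st.s1[1]; st.s1[1] = st.s1[2]; st.s1[2] = p1;

    // Component 2: y_n = (527612 * y_{n-1} - 1370589 * y_{n-3}) mod m2
    std::int64_t p2 = (527612LL * st.s2[2] - 1370589LL * st.s2[0]) % kMrgM2;
    if (p2 < 0) p2 += kMrgM2;
    st.s2[0] = st.s2[1]; st.s2[1] = st.s2[2]; st.s2[2] = p2;

    // Combine both components and scale by 1 / (m1 + 1).
    std::int64_t z = p1 - p2;
    if (z <= 0) z += kMrgM1;
    return static_cast<double>(z) * 2.328306549295728e-10;
}
```

Copying the six integers is enough to replay the stream from that point, which is what makes reproducibility cheap for this generator.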
Parameterized Status
PRNGs’ intrinsic parameters
Example
Output of Dynamic Creator for Mersenne Twisters
Matrices computed through Jump Ahead for MRG32k3a
Integer key for Threefry
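The "matrices computed through Jump Ahead" item can be sketched for the first component of MRG32k3a: its one-step recurrence is a 3x3 matrix A1 modulo m1, so advancing n steps amounts to multiplying the state by A1^n mod m1. The helper names below (Mat3, Vec3, matPowMod, jumpAhead1) are hypothetical, and this is a CPU-side illustration under those assumptions, not the way ShoveRand stores its parameterized status.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint64_t u64;
struct Mat3 { u64 v[3][3]; };
struct Vec3 { u64 v[3]; };

const u64 kJumpM1 = 4294967087ULL;  // modulus of MRG32k3a's first component

// (a * b) mod m. Entries are below m < 2^32, so each product fits in 64 bits.
Mat3 matMulMod(const Mat3& a, const Mat3& b, u64 m) {
    Mat3 c;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) {
            u64 acc = 0;
            for (int k = 0; k < 3; ++k)
                acc = (acc + (a.v[i][k] * b.v[k][j]) % m) % m;
            c.v[i][j] = acc;
        }
    return c;
}

// base^n mod m by binary exponentiation: O(log n) matrix products.
Mat3 matPowMod(Mat3 base, u64 n, u64 m) {
    Mat3 result = {{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}};  // identity
    while (n > 0) {
        if (n & 1ULL) result = matMulMod(result, base, m);
        base = matMulMod(base, base, m);
        n >>= 1;
    }
    return result;
}

// (a * x) mod m for a state vector x.
Vec3 matVecMod(const Mat3& a, const Vec3& x, u64 m) {
    Vec3 y;
    for (int i = 0; i < 3; ++i) {
        u64 acc = 0;
        for (int k = 0; k < 3; ++k)
            acc = (acc + (a.v[i][k] * x.v[k]) % m) % m;
        y.v[i] = acc;
    }
    return y;
}

// One-step transition matrix of component 1; -810728 is stored as m1 - 810728.
Mat3 mrg32k3aA1() {
    Mat3 a = {{{0, 1, 0}, {0, 0, 1}, {kJumpM1 - 810728ULL, 1403580ULL, 0}}};
    return a;
}

// State of component 1 after n steps, without generating the n numbers.
Vec3 jumpAhead1(const Vec3& state, u64 n) {
    return matVecMod(matPowMod(mrg32k3aA1(), n, kJumpM1), state, kJumpM1);
}
```

Precomputing such matrices for a fixed spacing lets each block start at its own offset in the same sequence; the second component works the same way with its own matrix and modulus.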
The ShoveRand solution
Two main goals
Simplify the use of PRNGs on GPUs for developers
Safely integrate new PRNG algorithms
User side
Nice interface
Swap PRNGs in the blink of an eye
No intrusive declarations
CMake!
API close to high-level languages (C++, Java)
    __global__ void testMRG32k3a(double* ddata) {

        RNG<float, MRG32k3a> rng;

        ddata[blockDim.x * blockIdx.x + threadIdx.x] = rng.next();
    }
Demo 1
Example of use: Pi calculation through a Monte Carlo method
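The demo itself is not part of these slides. As a rough idea of the computation, the following CPU-only C++ sketch estimates pi by counting uniform points that fall inside the quarter unit disc. The function estimatePi is hypothetical, and std::mt19937 only stands in for whichever generator the GPU kernel would use.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>

// Estimate pi from `samples` points drawn uniformly in the unit square.
// The fraction landing inside the quarter disc x^2 + y^2 <= 1 tends to pi / 4.
double estimatePi(std::size_t samples, unsigned int seed) {
    std::mt19937 engine(seed);  // stand-in generator (assumption)
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    std::size_t inside = 0;
    for (std::size_t i = 0; i < samples; ++i) {
        const double x = uniform(engine);
        const double y = uniform(engine);
        if (x * x + y * y <= 1.0) ++inside;
    }
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}
```

Calling estimatePi(1000000, 42u) twice returns the same value, since the seed fixes the stream. On a GPU each thread would run its own share of the loop on its own stream, and the partial counts would be reduced afterwards.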
Summary: device side
    #include <shoverand/prng/mrg32k3a/MRG32k3a.hxx>
    #include <shoverand/core/RNG.hxx>

    using shoverand::RNG;
    using shoverand::MRG32k3a;

    // Same alias name as the one used on the host side
    typedef RNG<float, MRG32k3a> random_engine;

    __global__ void kernelMonteCarloPi(float* outArray)
    {
        random_engine rng;
        ...
        rng.next();
    }
Summary: host side
    int main() {
        ...
        random_engine::init(nbBlocks);
        ...
        kernel<<< nbBlocks, nbThreads >>>(array);
        ...
        random_engine::release();
        ...
    }
Developer side
Provide a PRNG class
No inheritance required
Few constraints to satisfy
Safe integration of PRNGs: concepts
Ability of an object to match some requirements
Example
Objects stored in sorted containers must be comparable
Serializable, Assignable, Default constructible
RNGAlgorithm concept in action
    BOOST_CONCEPT_USAGE(RNGAlgorithm) {
        // require Algo<T>::init()
        al_.init(42);

        // require Algo<T>::release()
        al_.release();

        // require T Algo<T>::next()
        value_ = al_.next();

        // require Algo<T>::ss_ to be of SeedStatus<Algo> type
        same_type(ss_, al_.ss_);

        // require Algo<T>::ps_ to be of ParameterizedStatus<Algo> type
        same_type(ps_, al_.ps_);
    }
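For readers without Boost at hand, the same idea can be approximated in standard C++11 with a detection trait. The sketch below checks only the three member functions (init, release, next), not the ss_ and ps_ member types; models_rng_algorithm, CounterAlgo and NotAnAlgo are hypothetical names used for illustration and are not part of ShoveRand.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// C++11-compatible void_t: maps any well-formed list of types to void.
template <typename... Ts> struct make_void { typedef void type; };
template <typename... Ts> using void_t = typename make_void<Ts...>::type;

// Primary template: by default a type does not model the interface.
template <typename Algo, typename = void>
struct models_rng_algorithm : std::false_type {};

// Specialization selected only when all three calls are well-formed.
template <typename Algo>
struct models_rng_algorithm<Algo, void_t<
    decltype(std::declval<Algo&>().init(42u)),
    decltype(std::declval<Algo&>().release()),
    decltype(std::declval<Algo&>().next())> > : std::true_type {};

// Hypothetical generator that provides the three required members.
struct CounterAlgo {
    unsigned int counter;
    void init(unsigned int) { counter = 0; }
    void release() {}
    unsigned int next() { return counter++; }
};

// Hypothetical type that lacks next(), so it must be rejected.
struct NotAnAlgo {
    void init(unsigned int) {}
    void release() {}
};

// Both checks happen at compile time, before any kernel is launched.
static_assert(models_rng_algorithm<CounterAlgo>::value,
              "CounterAlgo should model the interface");
static_assert(!models_rng_algorithm<NotAnAlgo>::value,
              "NotAnAlgo lacks next()");
```

A wrapper could place such a static_assert on its template parameter, so that a generator missing a member fails with a short message instead of a long template error.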
Demo 2
Embedding a dummy PRNG
Summary: Outside PRNG class
    #include <shoverand/core/SeedStatus.hxx>
    #include <shoverand/core/ParameterizedStatus.hxx>

    template <class T>
    class DummyGenerator;

    template <> class SeedStatus<DummyGenerator> { ... };
    template <> class ParameterizedStatus<DummyGenerator> { ... };

    typedef ::SeedStatus<DummyGenerator> SeedStatusDummyGenerator;
    typedef ::ParameterizedStatus<DummyGenerator> ParameterizedStatusDummyGenerator;
Summary: Inside PRNG class
    template <typename T>
    T DummyGenerator<T>::next() {
        ...
    }

    template <typename T>
    void DummyGenerator<T>::init(unsigned int inNbBlocks) {
        ...
    }

    // release() returns nothing, consistent with the RNGAlgorithm concept check
    template <typename T>
    void DummyGenerator<T>::release() {
        ...
    }
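The bodies above are elided on the slides. Purely as an illustration of the shape such a class could take, here is a host-only C++ sketch in which a 32-bit linear congruential generator plays the dummy PRNG. Everything in it is an assumption: the names HostDummyGenerator, DummySeedStatus and DummyParameterizedStatus, the choice of an LCG (with the Numerical Recipes constants), and the fact that init ignores the block count.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the two status types (not the ShoveRand classes).
struct DummySeedStatus { std::uint32_t state; };             // current state
struct DummyParameterizedStatus { std::uint32_t mul, inc; };  // intrinsic parameters

template <typename T>
class HostDummyGenerator {
public:
    DummySeedStatus ss_;
    DummyParameterizedStatus ps_;

    // Set parameters and seed. The block count is ignored in this sketch.
    void init(unsigned int inNbBlocks) {
        (void)inNbBlocks;
        ps_.mul = 1664525u;      // LCG multiplier (Numerical Recipes)
        ps_.inc = 1013904223u;   // LCG increment (Numerical Recipes)
        ss_.state = 1u;          // fixed seed, for reproducibility
    }

    // Advance the LCG modulo 2^32 (unsigned wraparound) and map the
    // top 24 bits to [0, 1), which is exact for both float and double.
    T next() {
        ss_.state = ps_.mul * ss_.state + ps_.inc;
        return static_cast<T>(ss_.state >> 8) / static_cast<T>(16777216);
    }

    // Nothing was allocated in init, so release only clears the state.
    void release() { ss_.state = 0u; }
};
```

An LCG of this kind is far too weak for real simulations; it is used here only because it keeps the seed status and the parameterized status down to a few integers.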
Conclusion
Future work
More to come in ShoveRand
Distribution functions
OpenCL port?
More and more top-quality RNGs (Threefry, ...)
Any Questions?
Thank you for listening!
http://forge.clermont-universite.fr/projects/shoverand