Random number generation

TDA 351: Random Number Generation
December 16, 2014
Michał Pałka <[email protected]>
Caption: RFC 1149.5 specifies 4 as the standard IEEE-vetted random number.
http://xkcd.com/221/
Uses of random data
•
Cryptography
•
Simulation
•
Testing
In cryptography
•
Random keys
•
Passwords
•
Secret values
•
Encryption & MAC — pseudorandom values
Generating random numbers
s = getSeed();      // Get an unpredictable value from the environment
rng = seed (s);     // Instantiate a PRNG
x1 = rng.next ();   // Generate many random numbers
x2 = rng.next ();
x3 = rng.next ();
//...
// Everything after the seed has been obtained is a deterministic computation.
Problems
Example 1: RSA, recall
In RSA the public key consists of:
•
n = pq, where p and q are randomly chosen primes
•
Random e, such that gcd(e, φ(n)) = 1
The private key is n and d = e^(-1) mod φ(n)
p and q must be kept private, otherwise the adversary can compute
φ(n) = (p - 1)(q - 1) and, from it, d.
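A minimal sketch of why p and q must stay secret, using toy primes that are far too small for real use (the concrete numbers are illustrative assumptions, not from the lecture). Anyone who knows p and q can compute φ(n) and then the private exponent d with one modular inversion.

```python
from math import gcd

# Toy parameters (hypothetical, chosen only so the arithmetic is easy to follow).
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)        # phi(n), computable once p and q are known
e = 17                         # public exponent
assert gcd(e, phi) == 1

# Modular inverse of e modulo phi(n) (three-argument pow with -1, Python 3.8+).
d = pow(e, -1, phi)

# Round trip: encrypt with (n, e), decrypt with (n, d).
message = 65
ciphertext = pow(message, e, n)
recovered = pow(ciphertext, d, n)
```

With real key sizes the only change is the size of p and q; the adversary's computation of d from p and q is just as cheap.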
Example 1: Debian OpenSSL fiasco
(2006–2008)
MD_Update(&m,buf,j);
// …
MD_Update(&m,buf,j); /* purify complains */
•
OpenSSL mixed the contents of an uninitialized array into the seed.
•
The code caused Valgrind and Purify (run-time memory-checking tools) to produce
warnings.
•
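In September 2006 the line was commented out in Debian’s version of
OpenSSL, together with the call that mixed in the real entropy, so the seed
depended on little more than the process ID.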
•
In May 2008 Luciano Bello [Schneier08] found the vulnerability.
•
Nearly all the keys generated by faulty OpenSSL were breakable.
Example 1: In addition to that…
•
2012: 0.4% of all public HTTPS keys could be factored
because of bad randomness (lecture 6)
•
1% of keys were repeated!
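One way bad randomness can make RSA keys factorable: if two devices pick the same prime for different moduli, a single gcd reveals it. The sketch below uses toy primes and hypothetical variable names; it illustrates the idea only and is not a reconstruction of how the 0.4% figure was measured.

```python
from math import gcd

# Two toy RSA moduli that accidentally share one prime factor
# (hypothetical values; real primes have hundreds of digits).
p_shared, q1, q2 = 101, 103, 107
n1 = p_shared * q1
n2 = p_shared * q2

# The adversary sees only n1 and n2, both public.
g = gcd(n1, n2)

# A gcd greater than 1 factors both moduli at once.
factors_n1 = (g, n1 // g)
factors_n2 = (g, n2 // g)
```

The gcd is cheap even for full-size moduli, which is why a shared prime is fatal for both keys.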
Example 2: Forgot password?
newPassword = rng.nextString(n);
•
A new password is generated using an RNG and sent to the user.
•
Adversary should not be able to guess it.
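A minimal sketch of the password-reset step in Python, assuming the standard `secrets` module as the RNG (it draws from the operating system's generator). The alphabet, the length and the function name `new_password` are illustrative choices, not part of the lecture.

```python
import secrets
import string

# Characters allowed in the generated password (an assumption for this sketch).
ALPHABET = string.ascii_letters + string.digits

def new_password(length: int = 16) -> str:
    """Pick each character independently with a cryptographically strong RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

password = new_password(16)
```

The important design choice is the source of randomness: a general-purpose generator seeded from the clock would make the same code guessable.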
Example 2: Randomness from time,
process ids, etc.
t1 = time ();
t2 = microtime ();
pid = getmypid ();
rng = seed (combine (t1, t2, pid));
// Use the rng...
x1 = rng.next ();
•
Sometimes the time, the pid and other nondeterministic standard functions are used for seeding.
•
Their results are easy to predict.
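To see why such seeds are easy to predict, here is a hedged sketch of a brute-force search. It assumes the adversary knows the time to within a few seconds, that pids are below 4096, and that one PRNG output (for example a reset token) is visible. The `combine` function and all constants are hypothetical.

```python
import random

def combine(t: int, pid: int) -> int:
    """Hypothetical seed-combining function: time in seconds and pid."""
    return t * 100_000 + pid

def make_token(seed: int) -> int:
    """Victim side: first 64-bit output of a PRNG seeded with `seed`."""
    return random.Random(seed).getrandbits(64)

# Victim generates a token at some time, in some process.
victim_time, victim_pid = 1_418_688_000, 2741
token = make_token(combine(victim_time, victim_pid))

def recover_seed(token: int, approx_time: int, window: int = 5, max_pid: int = 4096):
    """Adversary side: try every (time, pid) pair near the approximate time."""
    for t in range(approx_time - window, approx_time + window + 1):
        for pid in range(1, max_pid):
            if make_token(combine(t, pid)) == token:
                return t, pid
    return None

# The adversary's clock is off by three seconds; the search still succeeds.
found = recover_seed(token, approx_time=victim_time + 3)
```

The search space here is about 45,000 seeds, so the loop finishes almost immediately. Once the seed is known, every later output of the victim's generator is known too.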
Example 2: RNG seeding flaws in PHP
(2012)
From George Argyros and Aggelos Kiayias, I forgot your password: Randomness
attacks against PHP applications [AK12].
•
When PHP creates a new process, its RNG is initialized using time and pid values.
•
This RNG is then used to create the SESSIONID.
•
A malicious client can then guess the seed of the RNG in 2^27 tries.
•
It is possible to trigger the creation of new PHP processes by flooding the server with
requests.
•
Many popular PHP applications have been found vulnerable, e.g. MediaWiki.
Example 3
Edward Snowden 2013
Source: http://commons.wikimedia.org/wiki/Category:Edward_Snowden_in_2013
In September 2013, the New York Times, the Guardian and
ProPublica reported on a secret National Security Agency
SIGINT Enabling Project with the mission to “actively
[engage] the US and foreign IT industries to covertly
influence and/or overtly leverage their commercial products’
designs.” The project aims to influence commercial
encryption products to make the encrypted connections
vulnerable to electronic surveillance. Named targets include
protocols for “TLS/SSL, https (e.g., webmail), SSH,
encrypted chat, VPNs and encrypted VOIP.”
—http://dualec.org
In particular, leaked NSA documents indicate that particular
NIST and ISO cryptography standards may have been
influenced by the NSA in order to weaken the security of
U.S. and non-US cryptography products. These standards
include NIST Special Publication 800-90A and ISO 18031,
both of which contain algorithms for generating the random
numbers used, for example, to generate keys for
cryptographic systems.
—http://dualec.org
One of the algorithms contained within these documents is a
pseudorandom number generator called the Dual Elliptic
Curve Deterministic Random Bit Generator
(Dual_EC_DRBG) that has long been known to admit a
serious potential back door in the event that an attacker
generates the standard algorithm parameters. While no one
is claiming that NIST or NSA designed the generator to
facilitate such attacks, the September 2013 news appears to
provide evidence for this possibility.
—http://dualec.org
Timeline
•
2004: First draft of NIST SP 800-90A published (under different name)
•
Standard implemented by RSA’s BSAFE crypto library — Dual_EC_DRBG is the default
RNG
•
2006: NIST SP 800-90A published
•
2006: Bias found in Dual_EC_DRBG (NIST SP 800-90A includes an optional
workaround) [Schneier08A]
•
2007: Suspected backdoor found (secret constant) — no reaction from NIST
•
Microsoft Internet Explorer and IIS, OpenSSL include it as an optional RNG
Timeline, cont.
•
September 2013: Snowden revelations about Dual_EC_DRBG
surface
•
September 2013: RSA Security advises its customers to stop using
Dual_EC_DRBG in RSA Security's BSAFE
•
NIST removes Dual_EC_DRBG from NIST SP 800-90A
According to the New York Times story, the NSA spends
$250 million per year to insert backdoors in software and
hardware as part of the Bullrun program; In December 2013,
a Reuters news article alleged that in 2004, before NIST
standardized Dual_EC_DRBG, NSA paid RSA Security $10
million in a secret deal to use Dual_EC_DRBG as the
default in the RSA BSAFE cryptography library, which
resulted in RSA Security becoming the most important
distributor of the backdoored algorithm.
—http://en.wikipedia.org/wiki/Dual_EC_DRBG
Example 4: Monte Carlo simulations
Non-crypto applications of RNGs:
•
To generate large amounts of random data
•
Simulations: economy, finance, engineering, etc.
•
Software testing
Example 4: Problems with some common RNGs (2009)
From B. D. McCullough, The accuracy of econometric software [McC09]
[Figure: two panels, "Marsaglia Multicarry uniform with Kinderman−Ramage transformation" and "Wichmann−Hill uniform with Kinderman−Ramage transformation". Vertical axis: lower tail of the standard normal (−3.0 to −5.5). Horizontal axis: number of observations (0 to 14000).]
Figure 2.3 The effect of failing randomness tests on the Kinderman-Ramage transform to
normality. Notice the bands on the left when the RNG is Marsaglia-Multicarry. The bands
disappear on the right when the RNG is Wichmann-Hill.
Summary
•
Examples 1 and 2: Wrong (predictable) initialisation
•
In ex. 2 the PRNGs are also broken
•
Example 3: possible backdoor in RNG
•
Example 4: PRNG is biased
•
RNGs hard to get right
"#251 Introduzca su PIN [08-09-11]" by Jose M. Rus available under Creative Commons Attribution Non-commercial No-Derivatives 2.0 license.
ATM card PIN
•
4 digit PIN
•
How many different possibilities?
•
How many tries (on average)?
•
Adversary knows that the last two digits are 8 and 5
•
Adversary knows that PINs never start with 0
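The questions on these slides can be answered by counting. The sketch below assumes the PIN is uniformly distributed over the remaining candidates and that the adversary tries each candidate once, in which case the expected number of tries over N candidates is (N + 1) / 2. The helper names are illustrative.

```python
def average_tries(n: int) -> float:
    """Expected number of guesses to hit one uniformly chosen value out of n."""
    return (n + 1) / 2

# All 4-digit PINs, as strings with leading zeros.
all_pins = [f"{i:04d}" for i in range(10_000)]

# Adversary knows that the last digits are 8 and 5.
ending_85 = [pin for pin in all_pins if pin.endswith("85")]

# Adversary additionally knows that PINs never start with 0.
ending_85_no_leading_zero = [pin for pin in ending_85 if not pin.startswith("0")]

counts = (len(all_pins), len(ending_85), len(ending_85_no_leading_zero))
tries = tuple(average_tries(c) for c in counts)
```

Each piece of side information shrinks the candidate set: 10000 candidates become 100, then 90.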
Entropy
•
Measure of unpredictability
•
Ex. a uniformly random key in the range 0 to 2^128-1 has 128 bits of entropy
•
A uniformly random PIN 0000 to 9999 has 13.29 bits of entropy
•
PIN xx85 has 6.64 bits of entropy
•
When some keys are more probable, adversary will use
this (less entropy)
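The figures above follow from the fact that a value chosen uniformly among N possibilities has log2(N) bits of entropy. A short check (the function name is an illustrative choice):

```python
from math import log2

def uniform_entropy_bits(n: int) -> float:
    """Entropy, in bits, of a value chosen uniformly among n possibilities."""
    return log2(n)

key_bits = uniform_entropy_bits(2**128)      # random 128-bit key
pin_bits = uniform_entropy_bits(10_000)      # PIN 0000 to 9999
pin_xx85_bits = uniform_entropy_bits(100)    # PIN whose last digits are known
```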
Entropy, cont.
•
Defines amount of information in Information
Theory
•
When some keys are more probable, adversary
will use this (less entropy)
•
In that case entropy does not measure exactly
what we want
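A sketch of why plain (Shannon) entropy can mislead when some values are more probable. It assumes a hypothetical population in which half of all users pick the same PIN and the rest choose uniformly among the other 9999. Min-entropy, which looks only at the most likely value, is one common way to capture the adversary's best first guess; it is offered here as an illustration, not as the lecture's own definition.

```python
from math import log2

def shannon_entropy(probs) -> float:
    """Average information content, in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

def min_entropy(probs) -> float:
    """Bits of unpredictability against the single best guess."""
    return -log2(max(probs))

# Hypothetical skewed distribution over 10000 PINs:
# one PIN has probability 1/2, the others share the remaining 1/2 equally.
probs = [0.5] + [0.5 / 9999] * 9999

h_shannon = shannon_entropy(probs)   # about 7.6 bits
h_min = min_entropy(probs)           # 1 bit: the first guess succeeds half the time
```

Shannon entropy still reports more than 7 bits, yet the adversary breaks half of all accounts with a single guess.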
Examples
•
Examples 1 and 2: Wrong (predictable) initialisation
•
Example 4: PRNG is biased
•
RNGs hard to get right
Requirements
•
External sources of unpredictable events/data
•
Even if some sources are partially/fully predicted by the adversary,
the generated random numbers should not be predictable (to the
extent possible)
Why not use a hardware RNG?
•
Ex. Intel RdRand is a fast hardware RNG (included in common
CPUs)
•
Randomness is essential for cryptography
•
Black-box designs discouraged (trust)
•
It’s impossible to determine the quality of random data based on a
single sample
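The trust problem with black boxes can be illustrated with a toy generator whose output looks balanced yet is fully predictable to whoever knows a built-in secret. This is a hypothetical construction for illustration only; it makes no claim about how any real hardware RNG works.

```python
import hashlib

class BlackBoxRNG:
    """Toy 'device': output is SHA-256(secret || counter), nothing else."""

    def __init__(self, secret: bytes):
        self._secret = secret
        self._counter = 0

    def next_bytes(self) -> bytes:
        block = hashlib.sha256(self._secret + self._counter.to_bytes(8, "big")).digest()
        self._counter += 1
        return block

def ones_fraction(data: bytes) -> float:
    """Very crude statistical check: fraction of bits that are 1."""
    ones = sum(bin(b).count("1") for b in data)
    return ones / (8 * len(data))

# The user sees only the output and checks that it looks balanced.
device = BlackBoxRNG(b"designer-secret")
sample = b"".join(device.next_bytes() for _ in range(1000))
frac = ones_fraction(sample)

# The designer, who knows the secret, reproduces the same output exactly.
designer_copy = BlackBoxRNG(b"designer-secret")
predicted = b"".join(designer_copy.next_bytes() for _ in range(1000))
```

The statistical check passes, but it says nothing about predictability, which is the property cryptography needs.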
How about a secret algorithm?
•
Kerckhoffs's principle — security should not depend on secrecy of
algorithm
•
Black-box designs discouraged (trust)
Our first try
t1 = time ();
t2 = microtime ();
pid = getmypid ();
rng = seed (combine (t1, t2, pid));   // combine: random-looking combining function
// Use the rng...
x1 = rng.next ();                     // next: random-looking generating function
x2 = rng.next ();
What else should we add?
Sources of entropy
•
Time, PID, CPU counters
•
OS events
•
Network events
•
Keyboard, mouse
•
Seed saved in a file (beware of backup/restore, cloning!)
•
Hardware RNGs
Seeding: operating system support
•
Recommended to collect entropy on OS/hypervisor level
•
UNIXes provide /dev/random and /dev/urandom special files
•
Windows: CryptGenRandom from the Win32 API
•
Based on unpredictable events
•
May use hardware RNGs too
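From application code, the operating system's generator is usually reached through a library call rather than by opening the special files directly. A minimal Python sketch, using the standard `os.urandom` and `secrets` functions (the variable names are illustrative):

```python
import os
import secrets

# 16 bytes (128 bits) from the operating system's random source.
key = os.urandom(16)

# The secrets module wraps the same source with convenience helpers.
token = secrets.token_hex(16)   # 16 random bytes, hex-encoded
```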
Linux
•
/dev/random: blocking source of randomness, hangs when there is not
enough new entropy; disadvantage: might block without need
•
/dev/urandom: non-blocking variant; disadvantage: might give
predictable results just after bootup
•
Uses entropy estimation — unreliable
•
Might use a seed that has been saved (/var/run/random-seed?) —
distribution specific
Another try
t1 = time ();
t2 = microtime ();
pid = getmypid ();
src1 = more_sources ();                     // /dev/random etc.
rng = seed (combine (t1, t2, pid, src1));   // combine: random-looking combining function
// Use the rng...
x1 = rng.next ();                           // next: random-looking generating function
x2 = rng.next ();
Pseudorandom number generator
•
Start with random seed
•
Generate a long sequence of numbers that is unpredictable (to
somebody that does not know the seed)
One time pad
•
A sequence of random bits
•
We can fake it if we have a random key and a stream cipher
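A sketch of "faking" a one-time pad: expand a short key into a long keystream and XOR it with the message. SHA-256 over a counter stands in for the stream cipher here; that substitution is an assumption made to keep the example dependency-free, not a vetted cipher, and the all-zero key is for demonstration only.

```python
import hashlib

def keystream(key: bytes, n_bytes: int) -> bytes:
    """Expand `key` into n_bytes of pad by hashing key || counter."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two byte strings (truncated to the shorter one)."""
    return bytes(x ^ y for x, y in zip(a, b))

key = bytes(32)                      # demo key; a real key must be random
pad = keystream(key, 40)

message = b"attack at dawn"
ciphertext = xor(message, pad)
plaintext = xor(ciphertext, pad)     # same pad decrypts
```

The same key always yields the same pad, so the generated sequence is only as unpredictable as the key.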
Another try
t1 = time ();
t2 = microtime ();
pid = getmypid ();
src1 = more_sources ();                     // /dev/random etc.
rng = seed (combine (t1, t2, pid, src1));   // combine: random-looking combining function
// Use the rng...
x1 = rng.next ();                           // next: stream cipher
x2 = rng.next ();
Key Derivation Function (KDF)
•
Turns imperfect source of randomness into a random-looking key
•
Variable-length input
•
Cryptographic hash functions are often used
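A minimal sketch of a hash-based KDF that accepts any number of variable-length inputs. The function name `derive_seed` and the length-prefix encoding are illustrative assumptions; the prefix ensures that different splits of the same bytes give different seeds.

```python
import hashlib

def derive_seed(*sources: bytes) -> bytes:
    """Compress arbitrary inputs into a fixed 32-byte seed with SHA-256."""
    h = hashlib.sha256()
    for s in sources:
        # Length prefix keeps ("ab", "c") distinct from ("a", "bc").
        h.update(len(s).to_bytes(8, "big"))
        h.update(s)
    return h.digest()

seed_a = derive_seed(b"ab", b"c")
seed_b = derive_seed(b"a", b"bc")
```

The output length is fixed regardless of how much input is supplied, which is what makes it usable as a key.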
Another try
t1 = time ();
t2 = microtime ();
pid = getmypid ();
src1 = more_sources ();                     // /dev/random etc.
rng = seed (combine (t1, t2, pid, src1));   // combine: Key Derivation Function
// Use the rng...
x1 = rng.next ();                           // next: stream cipher
x2 = rng.next ();
General scheme
•
Collect (imperfect) entropy
•
Use a KDF to ‘compress’ it into a high-quality seed
•
Use a stream cipher to ‘expand’ the seed into a long sequence of
random-looking numbers
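The three steps can be sketched in a few lines with an extendable-output hash: absorbing the inputs plays the role of the KDF and squeezing output plays the role of the stream cipher. Using SHAKE-256 for both roles is a simplification chosen for this sketch, and the function name and the choice of sources are assumptions.

```python
import hashlib
import os
import time

def random_bytes_from(sources, n: int) -> bytes:
    """Compress all sources into the hash state, then expand to n bytes."""
    xof = hashlib.shake_256()
    for s in sources:
        xof.update(len(s).to_bytes(8, "big"))   # unambiguous encoding
        xof.update(s)
    return xof.digest(n)

# Step 1: collect (imperfect) entropy from several places.
sources = [
    time.time_ns().to_bytes(8, "big"),   # predictable on its own
    os.getpid().to_bytes(4, "big"),      # predictable on its own
    os.urandom(32),                      # operating system source
]

# Steps 2 and 3: compress, then expand into a long output.
output = random_bytes_from(sources, 1000)
```

As long as at least one source is unpredictable to the adversary, the output is too, even though time and pid contribute little.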
Additional requirements
•
Some external events/data may be predictable (time, PID) — RNG
should not allow the adversary to learn more than it is possible by
brute force
•
If the adversary learns the state of RNG, he/she should not be able to
predict future states
•
If the adversary learns the state of RNG, he/she should not be able to
learn anything about past states
Another try
t1 = time ();
t2 = microtime ();
pid = getmypid ();
src1 = more_sources ();                     // /dev/random etc.
rng = seed (combine (t1, t2, pid, src1));   // combine: Key Derivation Function
// Use the rng...
x1 = rng.next ();                           // next: stream cipher with rekeying
x2 = rng.next ();
src2 = more_sources ();                     // Add more entropy
rng.reseed(combine(src2));
x3 = rng.next ();
Reseeding
•
Implemented by all crypto RNGs
•
Not unanimously recommended
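A toy generator showing how reseeding and a one-way state update address the two state-compromise requirements. The class `RatchetRNG` and its hash-based construction are hypothetical, written for this sketch; it is not Yarrow or Fortuna.

```python
import hashlib

class RatchetRNG:
    def __init__(self, seed: bytes):
        self._state = hashlib.sha256(b"init" + seed).digest()

    def next(self) -> bytes:
        out = hashlib.sha256(b"out" + self._state).digest()
        # One-way update: the previous state cannot be computed from the new one,
        # so a later compromise does not reveal past outputs.
        self._state = hashlib.sha256(b"next" + self._state).digest()
        return out

    def reseed(self, entropy: bytes) -> None:
        # Mix fresh entropy into the state so a past compromise stops helping.
        self._state = hashlib.sha256(b"reseed" + self._state + entropy).digest()

rng = RatchetRNG(b"example seed")
x1 = rng.next()

# The adversary learns the current state and builds a copy of the generator.
clone = RatchetRNG(b"")
clone._state = rng._state

# Without new entropy the copy predicts the next output.
predicted_before = clone.next() == rng.next()

# After reseeding with input the adversary does not know, prediction fails.
rng.reseed(b"fresh unpredictable input")
predicted_after = clone.next() == rng.next()
```

Recovery after a compromise depends entirely on the reseed input being unknown to the adversary; reseeding with guessable data would not help.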
State of the art
•
The Yarrow algorithm (Bruce Schneier, John Kelsey, and Niels
Ferguson, 1999)
•
The Fortuna algorithm (Bruce Schneier and Niels Ferguson, 2003
[FSK10])
Conclusion
•
RNGs difficult to get right (depend on environment)
•
Cryptography needs RNGs
•
RNGs use cryptography
•
The book Cryptography Engineering [FSK10] is a good starting place
References
•
George Argyros and Aggelos Kiayias, I forgot your password:
Randomness attacks against PHP applications, Proceedings of the
21st USENIX Conference on Security Symposium, Security’12,
2012, pp. 6–6.
•
Yevgeniy Dodis, David Pointcheval, Sylvain Ruhault, Damien
Vergnaud, and Daniel Wichs, Security Analysis of Pseudo-Random
Number Generators with Input: /dev/random is not Robust, ACM
CCS 2013 - 20th ACM Conference on Computer and
Communications Security, ACM, November 2013.
•
Niels Ferguson, Bruce Schneier, and Tadayoshi Kohno,
Cryptography engineering - design principles and practical
applications, Wiley, 2010.
•
B. D. McCullough, The accuracy of econometric software,
Handbook of Computational Econometrics, Wiley, 2009, pp. 55–79.
•
Phillip Rogaway, Evaluation of some blockcipher modes of
operation, Unpublished manuscript, 2011.
•
Bruce Schneier, https://www.schneier.com/blog/archives/2008/05/random_number_b.html, 2008.
•
Bruce Schneier, https://www.schneier.com/blog/archives/2007/11/the_strange_sto.html, 2008A.