CS253 Random Numbers
Philosophy
“Computers can’t do anything truly random. Only a person can do that.”
- Stop trying to prove your superiority.
- If you believe that you have something special that distinguishes you
from machines, you’re talking religion, not CS.
- My dog is pretty random.
- You’re somewhat predictable.
- An online rock-paper-scissors
program beats people 60% of the time over more than a million games,
because people are lousy at being random.
Old Stuff
- There are several C random number generators,
of varying degrees of standardization:
- They still work ok, but avoid them for new C++ code.
- They mix up generation and distribution something terrible.
- Also, each family has a separate seeding function.
- Also also, there’s no way to save/restore state!
Traditional Method
Traditional random number generators work like this:
unsigned long long n = 1; // at least 64 bits, so n * 16807 can't overflow
for (int i=0; i<5; i++) {
n = n * 16807 % 2147483647;
cout << n << '\n';
}
16807
282475249
1622650073
984943658
1144108930
- It’s fast, simple, and good enough for many tasks. However …
- What happens if n is zero?
- What number always follows 16807?
- How many possible states does this RNG
(Random Number Generator) have?
Overview
- In C++, random numbers have:
- Generators
Generate uniformly-distributed random integers,
typically zero or one to a big number.
- Distributions
Take uniformly-distributed random integers, and transform them into
other distributions with different ranges.
- Examples:
- Picking a card (uniform, but discrete)
- Rolling 3d6 (bell-shaped, but discrete)
- Human height (bell-shaped, continuous)
Generators
Default Engine
Define a random-number generator, and use () to generate a number.
This is not a function call, because gen is an object, not a function. It’s operator().
That sequence looks familiar …
#include <random>
#include <iostream>
using namespace std;
int main() {
default_random_engine gen;
for (int i=0; i<5; i++)
cout << gen() << '\n';
}
16807
282475249
1622650073
984943658
1144108930
I won’t bother with the #includes in subsequent examples.
Mersenne Twister
- Here’s a different, 64-bit generator.
- Use .min() and .max() to find out the range of a given generator.
mt19937_64 gen;
cout << "range is " << gen.min() << "…" << gen.max() << "\n\n";
for (int i=0; i<3; i++)
cout << gen() << '\n';
range is 0…18446744073709551615
14514284786278117030
4620546740167642908
13109570281517897720
Ranges
Generators have varying ranges:
ranlux24 rl;
minstd_rand mr;
random_device rd;
mt19937_64 mt;
cout << "ranlux24: " << rl.min() << "…" << rl.max() << '\n'
<< "minstd_rand: " << mr.min() << "…" << mr.max() << '\n'
<< "random_device: " << rd.min() << "…" << rd.max() << '\n'
<< "mt19937_64: " << mt.min() << "…" << mt.max() << '\n';
ranlux24: 0…16777215
minstd_rand: 1…2147483646
random_device: 0…4294967295
mt19937_64: 0…18446744073709551615
Hey, look! Zero is not a possible return value for minstd_rand.
Save/Restore
A generator can save & restore state to an I/O stream:
ranlux24 gen;
cout << gen() << ' ';
cout << gen() << endl;
ofstream("state") << gen;
system("wc -c state");
cout << gen() << ' ';
cout << gen() << '\n';
ifstream("state") >> gen;
cout << gen() << ' ';
cout << gen() << '\n';
15039276 16323925
209 state
14283486 7150092
14283486 7150092
endl! Isn’t that a sin? 😈 🔥
Needed to flush output before wc ran.
True randomness
random_device a, b, c;
cout << a() << '\n'
<< b() << '\n'
<< c() << '\n';
1005106450
502010410
2614722604
- random_device is, ideally, truly random, and not pseudo-random.
- Intel computers have an RDRAND instruction.
- It might depend on random things like human typing intervals,
network packets arrival times, or radioactive decay.
- If true randomness isn’t available, it resorts to pseudo-random numbers.
- It could pause waiting for randomness to become available.
- Use it sparingly.
Cloudflare
The hosting service Cloudflare uses a unique source of randomness.
Seeding
minstd_rand a, b, c(123);
cout << a() << ' ' << a() << '\n';
cout << b() << ' ' << b() << '\n';
cout << c() << ' ' << c() << '\n';
48271 182605794
48271 182605794
5937333 985676192
- Great—we can “seed” the random number generator with a value.
- This way, we can reproduce our pseudo-random sequences.
- Consider random testing: we want to be able to reproduce the sequence
if we find an error.
- How to choose the random seed?
- It should probably be … random.
Seed with process ID
auto seed = getpid();
minstd_rand a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
149512993
1596631183
2090711057
1866925329
1348793451
- You can seed with your process id.
- OK for casual use, but the seed is easily guessed.
- Process IDs were traditionally 15- or 16-bit quantities, so there are
often only 32768 or 65536 of them. Even where the limit is higher,
it is small enough that somebody could easily try them all.
Seed with time
// seconds since start of 1970
auto seed = time(nullptr);
minstd_rand a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
1209322977
181446366
1159220720
1909468888
1974563408
- You can seed with a time-related value.
- Two runs may occur within the same second,
and so produce identical random sequences.
- OK for casual use, but the seed is easily guessed.
- There are only 86,400 seconds in a day.
Somebody could easily try them all.
Seed with more accurate time
Nanoseconds make more possibilities:
auto seed = chrono::high_resolution_clock::now()
.time_since_epoch().count();
cout << "Seed: " << seed << '\n';
minstd_rand a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
Seed: 1711703562338213548
1815412596
1563722034
523594811
710080238
296678731
- There are 86,400,000,000,000 nanoseconds in a day.
Better Seeding
- Many generators have more than 32 or 64 bits of state.
- Therefore, you can seed them with more than 32 or 64 bits.
- If you’re doing something very important, and somebody guessing
your seed, and hence predicting your sequence, would be catastrophic:
- on-line poker
🂺 🂻 🂽 🂾 🂱
- encryption of military communications
⚔️ 🔫 💣 🥆 ☢️
- encrypted email re: extra-marital affairs 💔
- That’s beyond the scope of this discussion.
Seed with random_device
random_device rd;
auto seed = rd();
minstd_rand0 a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
1425637582
1215791095
494032460
1031775918
127404301
You can seed with random_device, if you know that it’s truly random.
Not good enough.
- Great, so we know how to generate a number 1…2,147,483,646
or perhaps 0…18,446,744,073,709,551,615
- How often do we want to do that?
- Sometimes, we want integers with different ranges.
- Or, perhaps we want floating-point numbers.
- Maybe spread out linearly, or a bell-shaped curve, Poisson, etc.
- This is a job for a distribution.
Caution
- Resist the urge to create your own distribution using division
or modulus.
- This is harder than you think.
- Your home-grown code will be off by one, or have some bias because
the range of the generator isn’t a perfect multiple of what you want.
- Just use the standard distributions.
Distributions
- Uniform:
- Bernoulli (yes/no) trials:
- Piecewise distributions:
- Related to Normal distribution:
- Rate-based distributions:
uniform_int_distribution
auto seed = random_device()(); //❓❓❓
mt19937 gen(seed);
uniform_int_distribution<int> dist(1,6);
for (int y=0; y<10; y++) {
for (int x=0; x<40; x++)
cout << dist(gen) << ' ';
cout << '\n';
}
6 2 2 2 2 4 6 5 1 3 5 5 4 2 6 2 2 1 5 5 2 6 2 2 1 5 1 5 2 2 6 4 1 5 1 4 6 4 6 1
5 1 6 5 2 2 2 5 1 1 3 6 1 1 3 5 6 1 4 2 2 1 5 6 2 2 3 6 5 4 3 5 6 3 3 4 5 4 2 5
4 3 3 2 2 4 3 3 6 4 3 2 2 4 1 5 4 6 3 6 2 4 3 4 5 3 6 3 4 6 5 3 3 6 4 5 4 6 1 2
6 2 5 4 5 5 4 5 4 2 3 3 4 3 2 5 3 3 4 6 3 2 4 1 3 1 2 6 4 6 2 5 1 6 5 6 5 5 6 1
2 1 6 1 1 1 4 5 6 2 5 6 5 1 4 1 1 6 6 3 3 6 6 1 6 5 4 4 5 1 2 3 4 4 6 4 5 4 6 2
6 3 1 2 5 5 5 6 5 1 5 2 1 1 6 5 1 3 1 2 4 1 3 4 3 3 2 4 1 4 1 3 2 2 5 3 4 2 2 6
6 5 5 1 5 6 4 1 6 6 3 1 3 3 5 2 6 1 3 6 5 3 1 3 3 5 3 2 1 3 1 2 2 1 1 5 6 4 1 3
4 4 1 4 2 4 2 5 2 4 3 3 2 6 2 3 4 2 5 6 4 2 3 3 4 5 2 3 6 3 5 5 6 2 5 6 5 5 6 2
1 3 4 6 3 6 5 2 6 2 2 4 3 4 5 3 2 3 3 5 4 2 5 3 2 2 3 6 1 1 5 3 3 2 3 2 1 6 3 4
6 1 3 6 6 5 3 4 4 5 4 4 2 3 6 6 1 4 5 1 5 5 3 6 4 4 6 1 3 3 4 6 1 2 4 4 3 4 5 5
uniform_real_distribution
auto seed = random_device()();
ranlux48 gen(seed);
uniform_real_distribution<> dist(18.0, 25.0);
for (int y=0; y<5; y++) {
for (int x=0; x<10; x++)
cout << fixed << setprecision(3) << dist(gen) << ' ';
cout << '\n';
}
24.315 24.280 21.380 18.825 22.262 18.882 19.133 19.496 18.499 24.292
22.942 22.339 24.125 18.456 22.606 21.832 20.863 20.238 23.182 24.744
18.924 22.409 18.500 23.670 18.440 21.759 19.196 21.118 22.440 21.648
24.117 18.496 19.125 22.592 19.920 24.549 20.888 24.801 21.944 18.627
22.270 20.684 19.711 24.833 22.247 20.751 21.109 23.752 18.755 23.041
OMG, what’s that <> doing there?
uniform_real_distribution’s template argument defaults to double, because … real.
Boolean Values
Yield true 42% of time:
random_device rd;
knuth_b gen(rd());
bernoulli_distribution dist(0.42);
constexpr int nrolls = 1'000'000;
int count=0;
for (int i=0; i<nrolls; i++)
if (dist(gen))
count++;
cout << "true: " << count*100.0/nrolls << "%\n";
true: 42.0913%
Histogram
random_device rd;
mt19937_64 gen(rd());
normal_distribution<> dist(21.5, 1.5);
map<int,int> tally;
for (int i=0; i<10000; i++)
tally[dist(gen)]++;
for (auto p : tally)
cout << p.first << ": " << string(p.second/100,'#') << '\n';
15:
16:
17:
18: ###
19: ##########
20: ####################
21: ###########################
22: #####################
23: ###########
24: ###
25:
26:
27:
Passwords
random_device rd;
auto seed = rd();
ranlux24 gen(seed);
// char is not a permitted type for uniform_int_distribution, so use int
uniform_int_distribution<int> dist('A','~');
for (int y=0; y<8; y++) {
string pw;
for (int x=0; x<32; x++)
pw += static_cast<char>(dist(gen));
cout << "Password: " << pw << '\n';
}
Password: vZmNevyFmv_G\[xF{y]UX_XikN[}AdwU
Password: xlstO~Le_QYlqEeD\lRJHGNDj}Qi|{X\
Password: {agfAPai`rtseaX^qJE~|MojexI\vkov
Password: SqabRnwxVXZGQLcoRScNW^TmSp[VRHhX
Password: XbSZ`}CSCjc]rPRLKwWmTBBGk]g\GI\S
Password: O[\oNoEOLGVGvdkRNMt{uq\Kh]VO|b`b
Password: X~ZFeMsYrKYeUEPblCV[`R~gjl[{oQkl
Password: Ykqq[Do{HdI^wYRG[ypqDyrRxE_PKJPA
It’s tempting to write uniform_int_distribution<char> here, but the standard only allows short, int, long, long long, and their unsigned versions as the template argument.
char is not on that list, so uniform_int_distribution<char> is undefined behavior, and some compilers reject it.
Instead, pick an int between 'A' and '~', and convert it to a char when appending.
Think of characters as 8-bit integers that display differently.