CS253 Random Numbers
Philosophy
“Computers can’t do anything truly random. Only a person can do that.”
- Stop trying to prove your superiority.
- If you believe that you have something special that distinguishes you
from machines, you’re talking religion, not CS.
- My dog is pretty random.
- You’re somewhat predictable.
- An online rock-paper-scissors
program beats people 60% of the time over more than a million games,
because people are lousy at being random.
Old Stuff
There are several C random number generators,
of varying degrees of standardization:
They still work ok, but avoid them for new C++ code.
They mix up generation and distribution something terrible.
Traditional Method
Traditional random number generators work like this:
unsigned long n = 1;
for (int i=0; i<5; i++) {
n = n * 16807 % 2147483647;
cout << n << '\n';
}
16807
282475249
1622650073
984943658
1144108930
- It’s fast, simple, and good enough for many tasks. However …
- What happens if n is zero?
- What number always follows 16807?
- How many possible states does this RNG
(Random Number Generator) have?
Overview
- In C++, random numbers have:
- Generators
Generate uniformly-distributed random integers,
typically zero or one to a big number.
- Distributions
Take uniformly-distributed random integers, and transform them into
other distributions with different ranges.
- Examples:
- Picking a card (uniform, but discrete)
- Rolling 3d6 (bell-shaped, but discrete)
- Human height (bell-shaped, continuous)
Generators
Default Engine
Define a random-number generator, and use () to generate a number.
This is not a function call, because gen is an object, not a function. It’s operator().
That sequence looks familiar …
#include <random>
#include <iostream>
using namespace std;
int main() {
default_random_engine gen;
for (int i=0; i<5; i++)
cout << gen() << '\n';
}
16807
282475249
1622650073
984943658
1144108930
I won’t bother with the #includes in subsequent examples.
Mersenne Twister
- Here’s a different, 64-bit generator.
- Use .min() and .max() to find out the range of a given generator.
mt19937_64 gen;
cout << "min=" << gen.min() << '\n'
<< "max=" << gen.max() << "\n\n";
for (int i=0; i<5; i++)
cout << gen() << '\n';
min=0
max=18446744073709551615
14514284786278117030
4620546740167642908
13109570281517897720
17462938647148434322
355488278567739596
Ranges
Not all generators have the same range:
mt19937_64 mt;
minstd_rand mr;
cout << "mt19937_64: " << mt.min() << "…" << mt.max() << '\n'
<< "minstd_rand " << mr.min() << "…" << mr.max() << '\n';
mt19937_64: 0…18446744073709551615
minstd_rand 1…2147483646
Hey, look! Zero is not a possible return value for minstd_rand.
Save/Restore
A generator can save & restore state to an I/O stream:
ranlux24 gen;
cout << gen() << ' ';
cout << gen() << endl;
ofstream("state") << gen;
system("wc -c state");
cout << gen() << ' ';
cout << gen() << '\n';
ifstream("state") >> gen;
cout << gen() << ' ';
cout << gen() << '\n';
15039276 16323925
209 state
14283486 7150092
14283486 7150092
endl! Isn’t that a sin? 😈 🔥
True randomness
random_device a, b, c;
cout << a() << '\n'
<< b() << '\n'
<< c() << '\n';
4036411824
2516287698
1822046769
random_device is, ideally, truly random, and not pseudo-random.
- Intel computers have an RDRAND instruction.
- It might depend on random things like human typing intervals,
network packet arrival times, or radioactive decay.
- If true randomness isn’t available, it resorts to pseudo-random numbers.
- It could pause waiting for randomness to become available.
- Use it sparingly.
Cloudflare
The hosting service Cloudflare uses a unique source of randomness.
Seeding
minstd_rand a, b, c(123);
cout << a() << ' ' << a() << '\n';
cout << b() << ' ' << b() << '\n';
cout << c() << ' ' << c() << '\n';
48271 182605794
48271 182605794
5937333 985676192
- Great—we can “seed” the random number generator with a value.
- This way, we can reproduce our pseudo-random sequences.
- Consider random testing: we want to be able to reproduce the sequence
if we find an error.
- How to choose the random seed?
- It should probably be … random.
Seed with process ID
auto seed = getpid();
minstd_rand a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
1322953785
580944896
949612290
696405375
1622330134
- You can seed with your process id.
- OK for casual use, but the seed is easily guessed.
- Process IDs are small numbers: traditionally 15- or 16-bit quantities, so
generally only 32768 or 65536 of them, though some systems allow a few million.
Somebody could easily try them all.
Seed with time
// seconds since start of 1970
auto seed = time(nullptr);
minstd_rand a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
1303312068
1643395563
301301393
1360284019
809890477
- You can seed with a time-related value.
- Two runs may occur within the same second,
and so produce identical random sequences.
- OK for casual use, but the seed is easily guessed.
- There are only 86,400 seconds in a day.
Somebody could easily try them all.
Y2038
int biggest = 0x7fffffff; // largest 32-bit signed value
time_t epoch = 0,
now = time(nullptr),
end = biggest,
endp1 = biggest + 1; // int overflow: formally undefined, wraps around here
cout << "epoch:" << setw(12) << epoch << ' ' << ctime(&epoch);
cout << "now: " << setw(12) << now << ' ' << ctime(&now);
cout << "end: " << setw(12) << end << ' ' << ctime(&end);
cout << "end+1:" << setw(12) << endp1 << ' ' << ctime(&endp1);
epoch: 0 Wed Dec 31 17:00:00 1969
now: 1713529520 Fri Apr 19 06:25:20 2024
end: 2147483647 Mon Jan 18 20:14:07 2038
end+1: -2147483648 Fri Dec 13 13:45:52 1901
I hope that nobody’s still using 32-bit signed time representations by then!
Seed with more accurate time
Nanosecond resolution allows many more possible seeds:
auto seed = chrono::high_resolution_clock::now()
.time_since_epoch().count();
cout << "Seed: " << seed << '\n';
minstd_rand a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
Seed: 1713529520708663234
1971452526
394549388
1408526552
1652927572
859407374
- There are 86,400,000,000,000 nanoseconds in a day.
Better Seeding
- Many generators have more than 32 or 64 bits of state.
- Therefore, you can seed them with more than 32 or 64 bits.
- If you’re doing something very important, and somebody guessing
your seed, and hence predicting your sequence, would be catastrophic:
- on-line poker
🂺 🂻 🂽 🂾 🂱
- encryption of military communications
⚔ 🔫 💣 🥆 ☢
- encrypted email re: extra-marital affairs 💔
- That’s beyond the scope of this discussion.
Seed with random_device
random_device gen;
auto seed = gen();
minstd_rand0 a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
1381443080
1468137843
405623271
1197220119
1904251290
You can seed with random_device, if you know that it’s truly random.
Not good enough.
- Great, so we know how to generate a number 1…2,147,483,646
or perhaps 0…18,446,744,073,709,551,615
- How often do we want to do that?
- Sometimes, we want integers with different ranges.
- Or, perhaps we want floating-point numbers.
- Maybe spread out linearly, or a bell-shaped curve, Poisson, etc.
- This is a job for a distribution.
Distributions
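- Uniform: uniform_int_distribution, uniform_real_distribution
- Bernoulli (yes/no) trials: bernoulli_distribution, binomial_distribution, geometric_distribution, negative_binomial_distribution
- Piecewise distributions: discrete_distribution, piecewise_constant_distribution, piecewise_linear_distribution
- Related to Normal distribution: normal_distribution, lognormal_distribution, chi_squared_distribution, cauchy_distribution, fisher_f_distribution, student_t_distribution
- Rate-based distributions: poisson_distribution, exponential_distribution, gamma_distribution, weibull_distribution, extreme_value_distribution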
uniform_int_distribution
auto seed = random_device()(); //❓❓❓
mt19937 gen(seed);
uniform_int_distribution<int> dist(1,6);
for (int y=0; y<10; y++) {
for (int x=0; x<40; x++)
cout << dist(gen) << ' ';
cout << '\n';
}
2 2 5 2 4 1 2 4 4 6 5 1 6 1 1 6 6 4 4 1 5 2 1 5 2 6 4 6 6 5 2 2 3 2 4 1 4 4 5 3
4 4 5 5 1 5 1 6 2 2 2 2 5 1 4 3 5 4 4 3 6 5 1 1 1 3 3 4 1 3 3 3 6 4 1 2 6 5 5 1
2 4 3 1 1 2 1 4 4 6 1 1 2 4 3 4 6 5 1 3 3 4 5 2 6 4 2 4 2 2 3 4 4 2 2 2 4 6 5 4
1 5 2 3 6 3 3 3 6 6 1 3 2 5 4 6 6 1 5 6 2 3 2 5 5 5 4 3 1 5 3 4 5 4 3 4 6 1 4 2
5 6 5 5 4 1 4 6 4 3 4 2 4 3 5 4 3 2 5 5 3 1 5 3 2 2 2 1 5 1 6 5 5 6 6 6 4 2 3 2
6 2 2 3 3 3 5 2 5 3 3 6 5 5 6 5 6 3 5 5 6 4 2 1 6 2 2 5 5 4 1 6 2 2 6 2 4 6 5 1
2 6 2 6 6 3 2 2 2 5 2 3 1 6 2 1 3 3 3 3 6 4 4 2 6 4 2 6 5 1 6 6 1 5 4 1 5 1 1 6
5 6 3 5 4 2 1 2 5 6 2 5 3 1 4 5 6 4 1 6 1 3 3 4 6 4 6 5 6 3 1 3 3 3 6 4 6 3 5 2
6 5 5 3 1 6 4 1 5 2 5 5 2 2 5 3 3 3 4 4 6 1 4 6 4 1 1 5 1 5 3 6 4 1 2 3 1 6 6 4
5 1 4 1 1 5 1 4 6 3 6 6 1 1 5 1 6 6 3 3 5 1 5 1 5 6 6 4 6 5 4 3 5 6 4 3 3 6 6 2
uniform_real_distribution
auto seed = random_device()();
ranlux48 gen(seed);
uniform_real_distribution<> dist(18.0, 25.0);
for (int y=0; y<10; y++) {
for (int x=0; x<10; x++)
cout << fixed << setprecision(3) << dist(gen) << ' ';
cout << '\n';
}
18.684 19.495 24.615 23.185 21.794 22.469 23.794 23.885 20.893 23.051
21.536 19.679 22.428 23.435 24.874 23.425 23.625 18.668 23.416 22.463
23.722 24.469 23.413 20.599 20.397 20.661 23.993 22.598 20.190 23.271
21.192 18.884 18.954 22.550 24.605 22.683 19.673 22.626 22.242 22.541
18.075 24.143 22.767 20.414 23.877 21.055 23.965 23.973 19.908 22.450
20.439 24.455 24.492 18.895 21.146 18.625 21.333 23.636 23.496 24.991
22.954 20.443 23.387 21.261 22.645 21.504 22.949 20.333 18.205 22.334
20.034 23.629 20.541 18.261 18.574 22.176 18.859 22.993 21.264 19.035
18.898 19.894 22.469 21.895 19.736 24.756 23.861 20.631 22.500 22.020
22.439 20.349 18.077 20.092 21.631 22.194 22.039 20.126 19.998 23.501
OMG, what’s that <> doing there?
Binding
auto seed = random_device()();
minstd_rand gen(seed);
uniform_real_distribution<> dist(18.0, 25.0);
auto r = bind(dist, gen);
for (int y=0; y<10; y++) {
for (int x=0; x<10; x++)
cout << fixed << setprecision(3) << r() << ' ';
cout << '\n';
}
23.243 19.291 24.655 20.558 22.004 23.208 18.216 20.103 24.981 20.224
23.046 18.744 24.681 19.679 22.821 18.755 20.750 19.270 23.059 21.050
21.392 18.280 18.171 19.074 22.865 22.022 21.947 23.191 18.357 20.342
20.143 21.511 20.543 18.261 21.427 22.923 21.693 22.744 24.454 23.484
18.249 23.494 23.004 22.249 21.121 20.146 23.793 20.598 22.001 23.035
20.775 23.109 19.720 18.909 21.876 24.889 22.014 23.761 19.804 22.958
23.853 19.025 24.655 18.416 21.081 21.751 22.220 19.849 21.282 23.155
22.630 21.121 24.287 21.033 22.575 20.398 22.825 18.026 21.139 24.026
21.407 22.908 18.663 20.360 24.108 21.350 20.292 18.587 18.577 22.094
22.387 18.661 23.620 24.031 19.726 22.373 22.451 24.485 21.784 23.203
Binding with temporaries
auto seed = random_device()();
auto r = bind(uniform_real_distribution<>(18.0, 25.0), mt19937(seed));
for (int y=0; y<10; y++) {
for (int x=0; x<10; x++)
cout << fixed << setprecision(3) << r() << ' ';
cout << '\n';
}
18.639 20.263 19.510 24.125 18.502 19.259 18.598 22.135 20.499 22.477
20.085 20.754 21.449 20.121 24.801 23.624 23.085 20.162 19.871 18.204
24.682 24.925 23.242 18.995 21.186 23.258 24.826 18.772 21.796 21.162
21.157 18.127 20.961 19.320 18.523 19.211 21.369 19.091 23.778 23.933
21.032 20.442 20.945 22.208 21.269 19.492 19.746 24.807 24.630 18.401
20.526 21.992 19.944 22.996 22.955 18.043 24.687 23.234 24.477 24.708
24.163 20.634 21.882 19.123 20.473 20.801 22.587 20.227 22.421 19.836
24.419 21.993 20.832 23.318 19.232 24.129 22.045 24.648 18.297 22.230
18.796 19.034 18.216 19.157 23.259 21.243 19.345 23.546 24.458 18.854
20.533 20.259 24.562 20.946 20.303 20.404 18.289 22.939 18.357 19.397
Boolean Values
Yield true 42% of the time:
auto seed = random_device()();
constexpr int nrolls = 1'000'000;
auto r = bind(bernoulli_distribution(0.42), knuth_b(seed));
int count=0;
for (int i=0; i<nrolls; i++)
if (r())
count++;
cout << "true: " << count*100.0/nrolls << "%\n";
true: 42.0563%
Histogram
auto seed = random_device()();
mt19937_64 gen(seed);
normal_distribution<> dist(21.5, 1.5);
auto r = bind(dist, gen);
map<int,int> tally;
for (int i=0; i<10000; i++)
tally[r()]++;
for (auto p : tally)
cout << p.first << ": " << string(p.second/100,'#') << '\n';
16:
17:
18: ###
19: ##########
20: #####################
21: ##########################
22: ####################
23: ##########
24: ###
25:
26:
27:
Passwords
random_device rd;
auto seed = rd();
ranlux24 gen(seed);
uniform_int_distribution<char> dist('a','z');
for (int y=0; y<8; y++) {
string pw;
for (int x=0; x<12; x++)
pw += dist(gen);
cout << "Password: " << pw << '\n';
}
Password: rksvepwhmmuo
Password: jmxsphbjrisw
Password: rhbxbrdepaha
Password: ipvbmubigbla
Password: rqlkwgfatjsq
Password: wvwiazwusehd
Password: nloqlvpkobgj
Password: nhxddtduxnmj
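Even though we’re using uniform_int_distribution, which has int right there in its name, it’s uniform_int_distribution<char>, so we get characters.
Think of them as 8-bit integers that display differently.
Strictly speaking, the standard only allows short, int, long, long long, and their unsigned versions as the template argument, so uniform_int_distribution<char> is not portable: it works with some libraries, but others (Microsoft’s, for one) reject it.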
Passwords
With binding:
auto seed = random_device()();
ranlux24 gen(seed);
uniform_int_distribution<char> dist('a','z');
auto r = bind(dist, gen);
for (int y=0; y<8; y++) {
string pw;
for (int x=0; x<12; x++)
pw += r();
cout << "Password: " << pw << '\n';
}
Password: uuxqeoeqzgvk
Password: oadfttmdrpzl
Password: ttnfatbxefmh
Password: cxkrsqidngwy
Password: cukiazyunzmg
Password: odokbqayzrvj
Password: jsvrxokwzzqv
Password: sgixjqijjuqp
Passwords
With extreme binding:
auto r = bind(uniform_int_distribution<char>('a','z'),
ranlux24((random_device())()));
for (int y=0; y<8; y++) {
string pw;
for (int x=0; x<12; x++)
pw += r();
cout << "Password: " << pw << '\n';
}
Password: fxkbjxsrskvh
Password: yajgvytddwmy
Password: ugtsgxqxghuv
Password: joohkyvyhsvt
Password: meiqrbuxjylg
Password: ndnndicxzzra
Password: mjleioygzezw
Password: vzyxgdkuokpm