CS253: Software Development with C++

Spring 2020

Sort Algorithm

The sort() algorithm

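The sort() algorithm (from the header file <algorithm>) has two forms: sort(begin, end), which compares elements with <, and sort(begin, end, comparison), which uses the comparison that you supply.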

Containers

Default comparison

string s = "Kokopelli";
sort(s.begin(), s.end());
cout << s << '\n';
Keiklloop

Explicit comparison

string s = "Kokopelli";
sort(s.begin(), s.end(), less<char>);
cout << s << '\n';
c.cc:2: error: expected primary-expression before ')' token

Explicit comparison

string s = "Kokopelli";
sort(s.begin(), s.end(), less<char>());
cout << s << '\n';
Keiklloop

Reverse sort

string s = "Kokopelli";
sort(s.begin(), s.end(), greater<char>());
cout << s << '\n';
poollkieK

Comparison function

bool lt(char a, char b) {
    return a < b;
}

int main() {
    string s = "Kokopelli";
    sort(s.begin(), s.end(), lt);
    cout << s << '\n';
}
Keiklloop

λ-function

string s = "Kokopelli";
sort(s.begin(), s.end(),
     [](char a, char b){return a<b;});
cout << s << '\n';
Keiklloop

Case folding

bool lt(char a, char b) {
    return toupper(a) < toupper(b);
}

int main() {
    string s = "Kokopelli";
    sort(s.begin(), s.end(), lt);
    cout << s << '\n';
}
eiKklloop

Unique

If you want to avoid duplicates, then use unique().

bool lt(char a, char b) {
    return toupper(a) < toupper(b);
}

int main() {
    string s = "Kokopelli";
    sort(s.begin(), s.end(), lt);
    auto it = unique(s.begin(), s.end());
    s.resize(it-s.begin());
    cout << s << '\n';
}
eiKklop

unique() removes only adjacent duplicates, so it assumes that equal elements are already next to each other, as they are after sorting.

Unique

Case-independent uniqueness doesn’t come free:

bool lt(char a, char b) {
    return toupper(a) < toupper(b);
}

bool eq(char a, char b) {
    return toupper(a) == toupper(b);
}

int main() {
    string s = "Kokopelli";
    sort(s.begin(), s.end(), lt);
    auto it = unique(s.begin(), s.end(), eq);
    s.resize(it-s.begin());
    cout << s << '\n';
}
eiKlop

Unfortunately, we’ve duplicated the calls to toupper().

Unique and DRY

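Duplication of code is a bad thing, but avoiding it sometimes has a cost. Here, eq() is defined in terms of lt(), so toupper() appears in only one place, but a single call to eq() may now call lt() twice.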

bool lt(char a, char b) {
    return toupper(a) < toupper(b);
}

bool eq(char a, char b) {
    return !(lt(a,b) || lt(b,a));
}

int main() {
    string s = "Kokopelli";
    sort(s.begin(), s.end(), lt);
    auto it = unique(s.begin(), s.end(), eq);
    s.resize(it-s.begin());
    cout << s << '\n';
}
eiKlop

Generality

It’s not just about strings:

int a[] = {333, 22, 4444, 1};
sort(begin(a), end(a));
for (auto val : a)
    cout << val << '\n';
1
22
333
4444

vector<double> v = {1.2, 0.1, 6.7, 4.555};
sort(v.begin(), v.end(), greater<double>());
for (auto val : v)
    cout << val << '\n';
6.7
4.555
1.2
0.1

Why didn’t I say a.begin() in the first example?

a is a C array. It doesn’t have methods, so we use the free functions begin() and end(), which accept arrays as well as containers.