CS 551: Distributed Operating Systems
- Many of the algorithms for distributed systems
require some centralization,
some site with leader/coordinator activities.
- All other sites need to recognize this leader;
often there is accomplished by
an initial agreement/election.
- When the site of the leader fails
or goes down for some reason,
it is necessary to elect a new leader.
- This is the purpose of election or agreement algorithms.
- There are two basic criteria for an election/agreement algorithm.
- One way to decide the leader is to use some global priority.
The Bully algorithm by Garcia-Molina (1982)
falls into this category.
- The second is a more general, preference-based algorithm,
that permits some nodes to have heavier votes.
Chang & Roberts's Token Ring Election algorithm belongs here.
- Assumptions for most election algorithms:
- A complete topology,
i.e. one message hop between any two processes.
- All process ids are unique and known to all other processes.
- All communication networks are reliable,
i.e. only communicating processes may fail.
This assures that no messages are
- A recovering process is aware that it failed.
- Failure is reliably detected by setting the time-out
interval to be a little larger than the sum of the
round-trip message delay and the message
- A failed process can rely on the coordinator to
poll periodically for recovered process
so that they may rejoin the pool of processes.
- The following pages discuss a few such algorithms:
- The Bully algorithm by Garcia-Molina (1982)
- The Token Ring Election algorithm by Chang & Roberts
- A Tree-based Election algorithm
- Yet another algorithm is related to the Byzantine
General paper by Lamport et al.
Copyright © 2001-2003:
Colorado State University for CS 551.
All rights reserved.