#### Discussion 5: Pipelined Merge Sort

Consider a pipeline of sorters S0 to Sm.

S0 has one input stream (the input sequence), and two output streams. Si (i = 1 to m-1) has two input streams and two output streams. The output streams of Si are the input streams of Si+1, for i = 0 to m-1. Sm has two input streams and one output stream.

S0 reads the input stream, creates "sorted" sub-sequences of size one and sends these intermittently to one of its two output streams: firast to the top channel, then to the boteom, then to the top again, etc.

Si repeatedly reads two sorted sub-sequences, one from each input stream, merges them, and writes the double sized sorted sub-sequences intermittently two one of its output streams.

Sm reads two sorted sub-sequences, one from each input stream, merges these and produces the resulting output sequence. Here is an example for a sequence of 8 numbers, where a bar | delimits sorted sub sequences

```

2 | 1 | 6 | 8  3 1 | 8 4  8 6 5 4
7 2 3 1 5 6 4 8   ------------>  -------->  ------>   8 7 6 5 4 3 2 1
--------------> S0             S1         S2       S3 -------------->
------------>  -------->  ------>
7 | 3 | 5 | 4  7 2 | 6 5  7 3 2 1
```

Assume that input size 'n' is a power of two, that sorters can only read or write exactly one number in a time step and that reading or writing one number takes one time step, while internal computation such as comparing two numbers or storing a number takes zero time steps.

Sorters can start working as soon as they have data on all their input channels. Once Si has sent its first data item to the bottom channel, Si+1 can start working in parallel with Si. So this is a pipeline parallel algorithm.

Questions:

• 1. How many sorters are needed to sort n numbers.
• 2. How many time steps does it take to sort n numbers.
• 3. Is this algorithm cost optimal?

Provide explanations and/or derivations for each sub-part of the problem.