HW3E - Report Questions
Report the following in a document (maximum of 4 pages):
You will need at least a few hours to run the tests before you interpret the data. Hence, make sure that you have ample time to run these tests and report them.
For parts HW3B, HW3C and HW3D, report performance results for the input data
files k500x100M and k50kx1M. Record times for the part of your code that computes the table, and report
speedups for 1 through 20 threads (pick a few points, e.g. 1, 2, 4, 8, 12,
16, 20 threads).
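As a rough illustration of the speedup calculation requested above, the sketch below turns a set of measured times into speedup and efficiency values. It assumes the baseline is the 1-thread run of the same program (you may prefer to compare against your sequential code instead). The function name `speedup_table` is hypothetical, and the timings in the usage lines are placeholders, not measurements.

```python
def speedup_table(times_by_threads):
    """Return {threads: (speedup, efficiency)} relative to the 1-thread time.

    times_by_threads maps a thread count to a measured time in seconds.
    Assumption: the 1-thread run is used as the baseline.
    """
    if 1 not in times_by_threads:
        raise ValueError("need a 1-thread baseline time")
    base = times_by_threads[1]
    table = {}
    for threads in sorted(times_by_threads):
        speedup = base / times_by_threads[threads]
        # Efficiency is speedup per thread; 1.0 would be ideal scaling.
        table[threads] = (speedup, speedup / threads)
    return table


# Placeholder timings in seconds, NOT real measurements.
placeholder_times = {1: 8.0, 2: 4.0, 4: 2.5}
for threads, (s, e) in speedup_table(placeholder_times).items():
    print(f"{threads:2d} threads: speedup {s:.2f}, efficiency {e:.2f}")
```

Plotting the speedup column against the thread count gives the speedup plots asked for in HW3B, HW3C and HW3D.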
- HW3A:
- How does the execution time of this program compare with that of
the sequential code of HW0A (for the problem sizes that both can
handle)?
- By changing the N and C values (modify the kxxx.txt file)
can you empirically validate the asymptotic complexity as a function of N
and C? Remember that the complexity is a function of two independent
parameters, so you may end up with a very large number of data points.
Choose 4-6 independent values of each of N and C, and choose them so that a
reasonable number have the same product, N*C in order to confirm the
expected complexity. In your graph, plot the execution time as a function
of N*C.
- What is the largest capacity that your program is able to handle,
based on the configuration of capital machines?
- HW3B: For this program we are going to ask you to tweak the
parameters to obtain the best performance.
- For the coarse grain parallel code, measure the execution time as a
function of the number of processors, and observe the speedup. Do this for
a range of values of depth for all three variations of coarse grain parallelism.
Show your speedup plots (3 plots, one for each parallel implementation), and report the
value that provides the best performance for each type of variation.
- HW3C:
- For the fine grain parallel code, measure the execution time as a
function of the number of processors. Show your speedup plot.
- HW3D: For this program we are going to ask you to tweak the
parameters to obtain the best performance.
- For the hybrid code, measure the execution time as a function of the
number of processors, and observe the speedup. Do this for a range of
values of depth. Show your speedup plots, and report the value
that provides the best performance. Discuss the differences you observe
between the optimal value for this and the coarse grain parallel code.
- Based on the above questions, was the code memory-bound or compute-bound? Why?
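For the HW3A question about validating the complexity as a function of N and C, one possible way to pick the test points is sketched below. It enumerates every (N, C) pair from two candidate lists and keeps only the products N*C that are reached by more than one pair, so you can check whether runs with the same product take similar time. The function name `group_by_product` and the candidate values are hypothetical; choose values that suit your program and the machines you run on.

```python
from collections import defaultdict
from itertools import product


def group_by_product(n_values, c_values):
    """Group (N, C) pairs by N*C, keeping products hit by at least two pairs."""
    groups = defaultdict(list)
    for n, c in product(n_values, c_values):
        groups[n * c].append((n, c))
    return {p: pairs for p, pairs in sorted(groups.items()) if len(pairs) > 1}


# Hypothetical candidate values; doubling both lists by the same factor
# makes many pairs share a product.
n_values = [100, 200, 400, 800]
c_values = [10_000, 20_000, 40_000, 80_000]
for nc, pairs in group_by_product(n_values, c_values).items():
    print(nc, pairs)
```

Using lists that grow by the same factor is a design choice: it keeps the number of runs small while still giving several distinct (N, C) pairs per product.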
Submit, using the Checkin tab, your sequential and parallel codes, your time and speedup statistics, and your observations and complexity analysis in a PDF (maximum 4 pages).
Your submission must be named "HW3R.pdf". Submit the report via the checkin link HW3R (NOTE that there is a separate submission for CODE and REPORT).