HW3E - Report Questions

Report the following in a document (maximum of 4 pages):

    You will need at least a few hours to run the tests before you interpret the data. Hence, make sure that you have ample time to run these tests and report them.

    For parts HW3B, HW3C, and HW3D, report performance results for the input data files k500x100M and k50kx1M. Time only the part of your code that computes the table, and report speedups for 1 through 20 threads (pick a few points, e.g., 1, 2, 4, 8, 12, 16, 20 threads).
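
    As a reminder of how speedup is usually derived from timings, the sketch below divides the 1-thread time by the time at each thread count. It is only an illustration: the function names `speedups` and `efficiencies` are hypothetical, and it assumes your recorded times are wall-clock seconds for the table computation, keyed by thread count, with a 1-thread entry present.

```python
def speedups(times_by_threads):
    """Return {threads: T(1) / T(threads)} for measured times in seconds.

    Assumes the dictionary contains an entry for 1 thread, which is
    used as the baseline.
    """
    baseline = times_by_threads[1]
    return {p: baseline / t for p, t in sorted(times_by_threads.items())}


def efficiencies(times_by_threads):
    """Return {threads: speedup / threads}, the parallel efficiency."""
    return {p: s / p for p, s in speedups(times_by_threads).items()}
```

    Efficiency (speedup divided by thread count) is not required by the questions above, but it can make it easier to see where scaling starts to flatten.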

    • HW3A:

      1. How does the execution time of this program compare with that of the sequential code of HW0A (for the problem sizes that both can handle)?
      2. By changing the N and C values (modify the kxxx.txt file), can you empirically validate the asymptotic complexity as a function of N and C? Remember that the complexity is a function of two independent parameters, so you may end up with a very large number of data points. Choose 4-6 independent values for each of N and C, and choose them so that a reasonable number of (N, C) pairs have the same product N*C, in order to confirm the expected complexity. In your graph, plot the execution time as a function of N*C.
      3. What is the largest capacity that your program is able to handle, based on the configuration of capital machines?
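
    One possible way to check the HW3A measurements numerically, in addition to the plot, is sketched below. It assumes the expected complexity is linear in N*C (as the plotting instruction suggests) and fits time = k * N * C through the origin by least squares. The function name `fit_linear_in_product` and the (N, C, seconds) sample layout are hypothetical, and the returned R^2 is only a rough goodness-of-fit indicator.

```python
def fit_linear_in_product(samples):
    """Fit seconds ~= k * N * C through the origin.

    samples: list of (N, C, seconds) tuples from your own runs, with at
    least two distinct times. Returns (k, r_squared).
    """
    xs = [n * c for n, c, _ in samples]
    ys = [t for _, _, t in samples]

    # Least-squares slope for a line constrained to pass through (0, 0).
    k = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

    # Compare residuals against the spread of the measured times.
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - k * x) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return k, 1.0 - ss_res / ss_tot
```

    If pairs with the same product N*C give noticeably different times, or R^2 is far from 1, that is worth discussing in the report rather than hiding.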

    • HW3B: For this program we are going to ask you to tweak the parameters to obtain the best performance.

      1. For the coarse grain parallel code, measure the execution time as a function of the number of processors, and observe the speedup. Do this for a range of values of depth for all three variations of coarse grain parallelism. Show your speedup plots (3 plots, one for each parallel implementation), and report the depth value that provides the best performance for each variation.

    • HW3C:

      1. For the fine grain parallel code, measure the execution time as a function of the number of processors. Show your speedup plot.

    • HW3D: For this program we are going to ask you to tweak the parameters to obtain the best performance.

      1. For the hybrid code, measure the execution time as a function of the number of processors, and observe the speedup. Do this for a range of values of depth. Show your speedup plots, and report the depth value that provides the best performance. Discuss the differences you observe between the optimal depth for the hybrid code and that for the coarse grain parallel code.
      2. Based on the above questions, was the code memory-bound or compute-bound? Why?

    Using the Checkin tab, submit your sequential and parallel codes, your time and speedup statistics, and your observations and complexity analysis in a PDF (maximum 4 pages). Your submission must be named "HW3R.pdf". Submit the report via checkin link HW3R (NOTE that there are separate submissions for CODE and REPORT).