The objectives are to (i) analyze several data-sets and deduce the analytical functions that match the measured data, and (ii) report your findings in a simple, concise form. Rather than using sophisticated packages for data analysis, we want you to go back to basics and use the simple yet effective techniques described in the lectures. Essentially, you should massage the data and plot it so that it appears as a straight line. We also want you to use these plots in your reports.
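As an illustration of the straight-line idea, here are two common cases, written as a short derivation. They are examples only, not necessarily the functions hidden in the data-sets or the exact transformations covered in the lectures; the symbols T, n, c, k and b are chosen here for illustration.

```latex
% Power law: a straight line on log-log axes, with slope k
T(n) = c\,n^{k}
\quad\Longrightarrow\quad
\log T(n) = \log c + k \log n

% Exponential: a straight line on semi-log axes (log T against n), with slope \log b
T(n) = c\,b^{n}
\quad\Longrightarrow\quad
\log T(n) = \log c + n \log b
```

In both cases the slope and intercept of the resulting line give the unknown constants of the original function.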
This homework assignment will be done in two passes: the first is due on Tuesday, Aug 30, and the second, a revision based on feedback on the first pass, will be due in mid-September. For this reason, the first pass does not require you to analyze all the functions.
Even if you are more familiar with other tools (e.g., Matlab, R, Mathematica, Excel) for fitting curves to data and for plotting and visualizing the results, we want you to use Gnuplot for this assignment. A tutorial on using this tool is available in Lab1. We DO NOT want you to use any other, more sophisticated data-fitting tool for this assignment.
You do not have to write the programs; we have already collected a large database of running times for a range of programs.
We have also provided an access function, tfun, that lets you query the database of running times for different programs, input sizes, and numbers of runs. The executable is in ~cs475/provided/Homeworks/HW0/tfun. It takes three arguments: the first specifies which program (data-set) to report; the second is the value of the independent variable (the input size, n, for the first seven data-sets and the number of processors, p, for the last two); and the third is the number of independent runs to report.
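One possible way to gather measurements is a small shell loop around tfun. This is a minimal sketch under assumptions the text does not confirm: that data-sets are selected by a number, and that tfun prints its timings to standard output. The function name collect_times, the file name set1.dat, and the sizes shown are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumptions not stated in the assignment: data-sets are
# selected by a number, and tfun prints its timings to standard output.

# Location of the provided executable (override TFUN to point elsewhere).
TFUN=${TFUN:-~cs475/provided/Homeworks/HW0/tfun}

# collect_times DATASET RUNS SIZE...
# Calls tfun once per size; each result is preceded by a "# n=SIZE" comment line.
collect_times() {
    dataset=$1
    runs=$2
    shift 2
    for n in "$@"; do
        printf '# n=%s\n' "$n"
        "$TFUN" "$dataset" "$n" "$runs"
    done
}

# Hypothetical usage: 5 runs of data-set 1 at doubling input sizes.
# collect_times 1 5 1000 2000 4000 8000 > set1.dat
```

Gnuplot treats lines beginning with # as comments, so the marker lines should not disturb plotting; whether the remaining output can be plotted directly depends on the actual format tfun prints, which you should check first.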
After making a few test plots, you should estimate (hypothesize) an analytic function for each of the data-sets and validate your hypothesis by doing a least-squares fit of the data.
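For reference, the least-squares fit of a straight line can be written out as follows. This is the standard textbook result for fitting y = a + b x to N points (x_i, y_i), where x and y stand for whatever transformed quantities make your data look linear; the symbols are chosen here for illustration.

```latex
% Choose a and b to minimize the sum of squared residuals
S(a, b) = \sum_{i=1}^{N} \left( y_i - a - b\,x_i \right)^2

% Setting \partial S / \partial a = 0 and \partial S / \partial b = 0 gives
b = \frac{N \sum x_i y_i - \sum x_i \sum y_i}{N \sum x_i^2 - \left( \sum x_i \right)^2},
\qquad
a = \frac{\sum y_i - b \sum x_i}{N}
```

A fitted slope and intercept close to the values implied by your hypothesized function supports the hypothesis.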