For this assignment, you are required to step back and analyze what you have done so far for this course. Computer scientists are superb at informal experimentation; it is at the root of debugging: hypothesize what is wrong, fix it, and see what happens. Now you are going to practice formal experimentation. This should also help prepare you for your project.
Not surprisingly, you will be evaluating the timetabling solvers that have been developed. The intent is to learn more about their performance relative to each other.
To support your second experiment, each student will be asked to determine the best known objective function values for 2 problem instances. The problem instance set will comprise the 24 test cases posted for assignment 3 plus enough additional test cases so that each student gets 2. We will post the additional test cases HERE. You will be sent email on 3/13 telling you which 2 instances from the instance set you need to determine.
It would be tempting to post poor best known solutions and compare against them. Certainly, such values could make a solver look good! To motivate everyone to find really good values, simply submitting plausible values for the objective functions earns 4 points per instance (2 points per objective function). To earn 6 points, you need to submit really good values that cannot be bested.
We do not know the optimal values for the problems in the instance set. So I am allowing other students to submit new best known values for extra credit. If you submit a better objective function value (with a verifiable solution!) than what is posted, you earn 1 point per problem instance/objective function up to 5 points total, with no more than one try per problem. (So you can't submit slots=200, then 199, then...) An assignment will be set up on Canvas for this extra credit; it will be due March 28 at noon with no late period. Note: only one attempt per student will be allowed, so you will need to include all of the instances and objective functions (up to 5) in a single submission.
When you submit a baseline, you also need to say how you determined it (cite a website or paper, or provide a solution in verifiable form).
You need to decide on the settings of everything else (P, D, O, and M) and describe and justify them in your report. From assignment 2 and Part A, you now have access to a large number of problem instances, although you are welcome to go outside that set to others on the web or ones you have generated. You then need to run the experiment as you have designed it, collecting the data (O). Your report must include table(s) presenting either raw or summarized data.
The last part is to compare the performance as measured by M (which may be the same as O or computed from O) using some statistical tests. This has two parts: first, decide on a hypothesis that compares the two; second, run a statistical analysis to test the hypothesis. Your report should present summaries of the results/data, state the hypothesis, and report the results of the statistical test (including why you picked the test you did). I'm not expecting statistical sophistication; for simple statistics, you can use Microsoft Excel or basic R functionality (tutorial available here).
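As a rough illustration, a paired comparison in R might look like the following sketch. The data values here are made up, and the choice of a paired t-test (or the Wilcoxon signed-rank test as a nonparametric alternative) is only one reasonable option, not the required one:

    # Hypothetical per-instance metric values (M) for two solvers,
    # paired by problem instance -- replace with your own data.
    solver_a <- c(3.1, 0.0, 7.2, 1.5, 4.8)
    solver_b <- c(2.9, 0.4, 6.0, 1.7, 5.1)

    # Paired t-test: the null hypothesis is that mean performance
    # is the same for the two solvers.
    t.test(solver_a, solver_b, paired = TRUE)

    # If you doubt the normality assumption, a paired Wilcoxon
    # signed-rank test is a common nonparametric alternative.
    wilcox.test(solver_a, solver_b, paired = TRUE)

Part of your job is to justify whichever test you pick; a paired test is natural here because both solvers are run on the same problem instances.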
The characteristic must be computable from the problem input. Additionally, its value should vary across the test cases in the instance test set (see Part A).
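To make this concrete, here is one purely hypothetical example of such a characteristic (yours can be any quantity you can justify): enrollment density, the fraction of possible (student, event) pairs that actually occur in the instance. A minimal R sketch, assuming you have parsed the three counts from the instance input file:

    # Hypothetical characteristic: enrollment density. All three
    # counts are assumed to be parsed from the problem input.
    enrollment_density <- function(n_students, n_events, n_enrollments) {
      n_enrollments / (n_students * n_events)
    }

    # Example: 80 students, 20 events, 400 enrollments -> 0.25.
    enrollment_density(80, 20, 400)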
Given the characteristic, you need to run an experiment to determine how your algorithm's performance varies as the characteristic value varies. For this experiment, the independent variable is the problem set. The dependent variable will be the objective functions (as a stand-in for solution quality). You will run your algorithm for five minutes (on the capital machines) on each of the test cases and record the objective function values of the solution found. Note: you should run your algorithm with one of the settings for the objective function, but report both objective function values from that run.
The objective functions need to be converted into a performance metric. In this case, the metric will be the percentage difference from the baseline value for each problem in the instance set from Part A. The best known solution values for each of the problems will be posted in the same directory as the instance set, in a file to be made available after Part A is turned in. We will update the best known values as soon as we get them. In your results, you should list the baseline values that you used, along with the date on which you gathered them.
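For concreteness, a percentage difference from a baseline is usually computed relative to the baseline value; a minimal R sketch, assuming that convention:

    # Percentage difference of an objective value from the posted
    # best known (baseline) value, relative to the baseline.
    percent_from_baseline <- function(value, baseline) {
      100 * (value - baseline) / baseline
    }

    # Example: baseline of 20 slots, solution uses 22 slots -> 10%.
    percent_from_baseline(22, 20)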
CHANGE AS OF MARCH 16: You need to produce a csv (comma-separated values) file with five columns: algorithm, problem name, characteristic value, percentage from optimal for number of slots, and percentage from optimal for student costs. The algorithm name will be the same in every line. The file should have 53 lines: a header line and one for each problem. The header line should be
algorithm, instance, characteristic-name, percent-from-optimal-slots, percent-from-optimal-costs

where characteristic-name is the name of your characteristic. You will have to compute the characteristic value for each of the problems. The file name should be mylastname-characteristicname.csv
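As a sketch of how you might assemble this file in R (the solver name "mysolver", the characteristic name "density", the instance names, and all values below are invented placeholders; substitute your own data):

    # Assemble results into the required five-column csv.
    results <- data.frame(
      algorithm = "mysolver",
      instance  = c("instance01", "instance02"),
      density   = c(0.42, 0.57),            # your characteristic values
      `percent-from-optimal-slots` = c(10.0, 0.0),
      `percent-from-optimal-costs` = c(3.5, 12.1),
      check.names = FALSE                   # keep hyphenated headers
    )
    write.csv(results, "mylastname-density.csv",
              row.names = FALSE, quote = FALSE)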
You will be following a set experimental procedure, but you do have three degrees of freedom. First, you can do multiple runs of your algorithm and, if you wish, report the average in the file. If you choose to do so, the data you analyze can also include the multiple runs.
Second, you are only required to explore a single characteristic. If you wish to explore more than one, that is acceptable as long as you add columns to the result file, with the last columns still being the required performance metrics.
Third, you can include both algorithms if you like.
Your report for this experiment must describe your characteristic, explaining why you thought it would be informative. The most important part of this report will be analyzing and discussing the data to show how your characteristic predicts or explains the performance. You can try to model the percentage from optimal based on the characteristic value, or you can bin the results into sets describing gradations from good to poor performance and show the relationship between the characteristic and the bins.
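Both approaches are easy to prototype in R. Here is a sketch of each, reusing the hypothetical "density" characteristic and file name from the earlier example (the bin thresholds are arbitrary placeholders you would choose from your own data):

    # Read the results file, keeping the hyphenated column names.
    d <- read.csv("mylastname-density.csv", check.names = FALSE)

    # (a) Model: simple linear regression of performance on the
    # characteristic value.
    fit <- lm(`percent-from-optimal-slots` ~ density, data = d)
    summary(fit)

    # (b) Bins: split performance into good/medium/poor and compare
    # the characteristic's average across bins.
    d$bin <- cut(d$`percent-from-optimal-slots`,
                 breaks = c(-Inf, 5, 15, Inf),
                 labels = c("good", "medium", "poor"))
    aggregate(density ~ bin, data = d, FUN = mean)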
You need to submit a short description of the characteristic you developed for experiment 2 (you can simply submit what you wrote up for Experiment 2). You should indicate why you thought it would be valuable and summarize how well it characterized performance based on your data.
You will need to submit three votes, so you will turn in something that looks like: