CS 540, Spring 2013: Assignment 1
Search and TSP

Locations due by Thursday 1/31 at noon MST
Distances due by Monday 2/4 at noon MST
Code and Questions due by Thursday 2/14 at noon MST

For this assignment, you are required to implement a search algorithm to solve asymmetric TSPs (ATSPs). You have some latitude in your choice of language and algorithm. To use a language other than C, C++, Java or Lisp, you must obtain the instructor's permission. For the algorithm, you should use one of the algorithms (with tuning) described in class or in the text; to use any other algorithm, you also need the instructor's permission.

A New ATSP Dataset

The basics of the problem have already been described in class. You will be using problems derived from the distance matrix for Fort Collins locations, which will be made available via RamCT in the posting of Assignment1 Part 2.

The current version has 60 locations in Fort Collins. Unfortunately, 60 locations is not particularly large, so everyone will be asked to contribute to the data collection. An Excel-formatted spreadsheet (the distance matrix mentioned above) will be used. Each student will be asked to submit 2 locations in Colorado (not necessarily in Fort Collins) that are accessible by car, and then to add the driving distances for them to the spreadsheet. Locations are designated by a name and an address. See the top of this page for the due dates for these parts.

Each location must be unique, so don't repeat locations that are already in the dataset. Also, to encourage uniqueness, we will add locations to the list below as we receive them from students. The "due date" is an end date; locations can be submitted at any time prior to that.

Student submitted locations:

Once the set of locations is finalized, an updated spreadsheet with the new locations will be provided via RamCT. You will need to add distances for the new locations to the spreadsheet. We recommend using your favorite route calculator (Google Maps, MapQuest, ...) to determine driving distances.

ATSP Code

The format for problems will be similar to that found in the TSPLIB, but not exactly the same. An example looks like:
NAME:  br17
DIMENSION:  17
EDGE_WEIGHT_SECTION
 9999    3    5   48   48    8    8    5    5    3    3    0    3    5    8    8    5
    3 9999    3   48   48    8    8    5    5    0    0    3    0    3    8    8    5
    5    3 9999   72   72   48   48   24   24    3    3    5    3    0   48   48   24
   48   48   74 9999    0    6    6   12   12   48   48   48   48   74    6    6   12
   48   48   74    0 9999    6    6   12   12   48   48   48   48   74    6    6   12
    8    8   50    6    6 9999    0    8    8    8    8    8    8   50    0    0    8
    8    8   50    6    6    0 9999    8    8    8    8    8    8   50    0    0    8
    5    5   26   12   12    8    8 9999    0    5    5    5    5   26    8    8    0
    5    5   26   12   12    8    8    0 9999    5    5    5    5   26    8    8    0
    3    0    3   48   48    8    8    5    5 9999    0    3    0    3    8    8    5
    3    0    3   48   48    8    8    5    5    0 9999    3    0    3    8    8    5
    0    3    5   48   48    8    8    5    5    3    3 9999    3    5    8    8    5
    3    0    3   48   48    8    8    5    5    0    0    3 9999    3    8    8    5
    5    3    0   72   72   48   48   24   24    3    3    5    3 9999   48   48   24
    8    8   50    6    6    0    0    8    8    8    8    8    8   50 9999    0    8
    8    8   50    6    6    0    0    8    8    8    8    8    8   50    0 9999    8
    5    5   26   12   12    8    8    0    0    5    5    5    5   26    8    8 9999
EOF
The first line gives the name of the problem. The second gives the number of "cities" (or, in the case of Fort Collins, locations). The next line, "EDGE_WEIGHT_SECTION", flags the start of the matrix of costs. By convention, locations are numbered from 1 to DIMENSION. Because the problem is asymmetric, direction matters: the entry in row i, column j is the distance from location i to location j. So the second column in the first row indicates the distance from location 1 to location 2.
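As an illustration of the input format just described, here is a minimal parsing sketch. It is written in Python purely for brevity (your submission must still use one of the permitted languages), and the names parse_atsp and SAMPLE, the tiny 3-location instance, and the assumption that all weights are integers are illustrative, not part of the assignment.

```python
# Hypothetical 3-location instance in the assignment's format (not real data).
SAMPLE = """NAME:  tiny3
DIMENSION:  3
EDGE_WEIGHT_SECTION
 9999    2    7
    4 9999    1
    3    6 9999
EOF
"""


def parse_atsp(text):
    """Return (name, matrix) where matrix[i][j] is the cost from i+1 to j+1."""
    name, dimension, numbers = None, None, []
    in_matrix = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "EOF":
            break
        if in_matrix:
            # Read weights as a flat stream so wrapped rows are tolerated.
            numbers.extend(int(tok) for tok in line.split())
        elif line.startswith("NAME"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("DIMENSION"):
            dimension = int(line.split(":", 1)[1])
        elif line.startswith("EDGE_WEIGHT_SECTION"):
            in_matrix = True
    if dimension is None or len(numbers) != dimension * dimension:
        raise ValueError("matrix size does not match DIMENSION")
    # Slice the flat list into DIMENSION rows of DIMENSION entries each.
    matrix = [numbers[r * dimension:(r + 1) * dimension]
              for r in range(dimension)]
    return name, matrix


name, matrix = parse_atsp(SAMPLE)
```

Note that the file numbers locations from 1 while the list indices start at 0, so the distance from location 1 to location 2 is matrix[0][1].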

A solution should be represented as shown below. The first line indicates the total cost for the tour (sum of distances including from the last location back to the first). Subsequent lines show the ordering for the locations. To make it simple, each location will be on its own line and the lines together (after the cost) should form a permutation. So the first location should not appear at the end. For example, a solution to the problem above would look like:

167
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
To make sure that everyone has implemented it properly, we will be verifying your solutions as formatted above to check your cost and that it is a legal tour (all locations once and only once). So it is very important that you follow the format EXACTLY.
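The checks described above (total cost including the return to the first location, and each location appearing exactly once) can be sketched as follows. This is an illustrative Python sketch, not the grading script; the function names and the small matrix M are assumptions made for the example, and tours use the 1-based numbering of the file format.

```python
def tour_cost(matrix, tour):
    """Sum of directed distances along the tour, including last -> first."""
    n = len(tour)
    return sum(matrix[tour[i] - 1][tour[(i + 1) % n] - 1] for i in range(n))


def is_legal_tour(tour, dimension):
    """True if the tour lists every location 1..dimension exactly once."""
    return sorted(tour) == list(range(1, dimension + 1))


def format_solution(matrix, tour):
    """Cost on the first line, then one location per line."""
    lines = [str(tour_cost(matrix, tour))] + [str(loc) for loc in tour]
    return "\n".join(lines) + "\n"


# Hypothetical 3-location cost matrix, used only to exercise the functions.
M = [[9999, 2, 7],
     [4, 9999, 1],
     [3, 6, 9999]]

print(format_solution(M, [1, 2, 3]), end="")
```

For the small matrix here, the tour 1, 2, 3 costs 2 + 1 + 3 = 6. Applying the same sum to the br17 matrix above with the tour 1, 2, ..., 17 gives 167, which matches the example solution.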

For testing your program, you should run the problem above and another based on the original Fort Collins locations.

You should name your program atsp. Your program should take as its first 2 arguments: input file and output file. File formats MUST conform to above. You may add arguments after those if you have parameters you wish to vary. Note that locations are numbered starting from 1.

Make sure that your README file describes what parameter settings should be used (if you have parameters beyond the two mentioned above) as well as what compiler and settings you use. Include a MAKEFILE if needed to run your program. In other words, make this as easy as possible for the instructor and GTA to run the code.

Search Algorithm

You should implement your choice of search algorithm from what we covered in class. You must also select parameters for implementation, e.g., ordering heuristics, neighborhood, initialization. In comments in your code, you should name the algorithm and cite your source for it.
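To make the choices named above concrete (initialization, neighborhood), here is one possible sketch: a nearest-neighbor starting tour followed by hill climbing over a swap neighborhood. It is only an example of how those choices fit together, not necessarily an algorithm covered in class and not a recommended design. It is in Python for brevity, uses 0-based indices internally (add 1 when writing the output file), and all function names and the matrix M are illustrative.

```python
def tour_cost(matrix, tour):
    """Directed tour cost, including the edge from the last city to the first."""
    n = len(tour)
    return sum(matrix[tour[i]][tour[(i + 1) % n]] for i in range(n))


def nearest_neighbor(matrix, start=0):
    """Initialization: repeatedly move to the cheapest unvisited location."""
    tour, unvisited = [start], set(range(len(matrix))) - {start}
    while unvisited:
        here = tour[-1]
        # Ties are broken by the lower index so the result is deterministic.
        nxt = min(unvisited, key=lambda j: (matrix[here][j], j))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def hill_climb_swaps(matrix, tour):
    """Neighborhood: swap two positions; accept any strict improvement."""
    best, best_cost = tour[:], tour_cost(matrix, tour)
    improved = True
    while improved:
        improved = False
        for i in range(len(best) - 1):
            for j in range(i + 1, len(best)):
                cand = best[:]
                cand[i], cand[j] = cand[j], cand[i]
                cost = tour_cost(matrix, cand)
                if cost < best_cost:
                    best, best_cost, improved = cand, cost, True
    return best, best_cost


# Hypothetical 4-location instance whose cheap edges form the cycle 0-1-2-3-0.
M = [[9999, 1, 9, 9],
     [9, 9999, 1, 9],
     [9, 9, 9999, 1],
     [1, 9, 9, 9999]]

start_tour = nearest_neighbor(M)
final_tour, final_cost = hill_climb_swaps(M, start_tour)
```

Each candidate is re-costed from scratch here, which is simple but slow; computing only the change in cost for the edges a move touches is a common efficiency improvement. Moves that reverse a segment (as in symmetric 2-opt) need extra care in the asymmetric case, because reversing changes the cost of every edge in the segment.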

The last 10 points will be awarded based on how well your program does on four test cases, which the GTA and instructor will select from our Colorado locations dataset and from the benchmark problems available at TSPLIB (formatted to match my requirements). We will not tell you in advance what tests we will be running. Each evaluation trial will be allotted up to 10 minutes of CPU time, and your program will be terminated at 10 minutes. These points will be awarded as a direct function of rank based on minimizing the tour costs. Ranks for each test case will be combined for a final ranking. Your program will be run on one of the machines in the first floor Linux lab (120-unix-lab); make sure your code runs on those machines and produces some output before the end of 10 minutes. You can find a list of those machines by running the command "more ~info/machines | grep 120-unix-lab" on one of the department's machines.
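Because the program is terminated at the limit, one way to handle the CPU budget is an "anytime" loop that always holds a complete best-so-far tour and stops once its own CPU clock reaches a deadline. The sketch below is illustrative Python, not a required design: the function names, the random swap move, the on_improve callback and the budget value are all assumptions. In practice you would likely pick a budget somewhat under 10 minutes to leave time for writing the output file.

```python
import random
import time


def tour_cost(matrix, tour):
    """Directed tour cost, including the edge from the last city to the first."""
    n = len(tour)
    return sum(matrix[tour[i]][tour[(i + 1) % n]] for i in range(n))


def anytime_search(matrix, cpu_budget_seconds, seed=0, on_improve=None):
    """Random swap moves until the CPU budget is spent; returns the best found."""
    rng = random.Random(seed)
    # process_time() counts CPU time of this process, not wall-clock time.
    deadline = time.process_time() + cpu_budget_seconds
    n = len(matrix)
    best = list(range(n))
    best_cost = tour_cost(matrix, best)
    while time.process_time() < deadline:
        cand = best[:]
        i, j = rng.sample(range(n), 2)
        cand[i], cand[j] = cand[j], cand[i]
        cost = tour_cost(matrix, cand)
        if cost < best_cost:
            best, best_cost = cand, cost
            if on_improve is not None:
                # For example, rewrite the output file so a result always exists.
                on_improve(best, best_cost)
    return best, best_cost


# Hypothetical 4-location instance; 0.05 CPU seconds stands in for the real budget.
M = [[9999, 5, 1, 9],
     [9, 9999, 9, 1],
     [9, 1, 9999, 9],
     [1, 9, 9, 9999]]

tour, cost = anytime_search(M, cpu_budget_seconds=0.05)
```

The tour here uses 0-based indices, so add 1 to each entry when writing the solution file.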

Part of your grade will depend on how well you followed directions and on the quality of your implementation: how well it implements the chosen method, how efficiently it is written, how easy it is to read, and how creative and well justified your design decisions are.

Questions

Answers to the questions are worth 24 points.
  1. Why did you implement the search algorithm that you did? How did you choose parameter settings for it? Cite any relevant literature or pilot experiments that you did.
  2. Is your solution suited for both ATSP and TSP? Why or why not? What knowledge is the program using in constructing its solutions?
  3. How did the time limit impact your implementation?
  4. On what problems did you test your code? What differences did you notice in your program's performance on the different problems?
Each answer is expected to be 1/2 to 1 page in length.

What to hand in

You can submit everything electronically.
  1. Output from two runs, one for each of the two test problems, in ASCII.
  2. Written answers to your questions in ASCII, PDF or PS.
  3. A file (tarfile or zip) containing the source code. You should submit this via RamCT by the due date/time for the assignment.
  4. A README file describing exactly how to compile and run your code and how to set the parameters, if you have any. The README should include a line that can be cut-and-pasted into the instructor's script for running as batch.
  5. And if appropriate, a makefile for compiling your code and/or a script file for running it.
Note: Your code MUST accept input in exactly the format specified. Your code MUST produce an output file whose name can be specified (no hardcoding!) as an argument and be in exactly the format specified in this document. An automated test script will be used to run your code and validate your answers. If your format does not match, you may lose all points for program correctness and quality of results!

If you are concerned about your output format, you may send an output file to the GTA UP TO 24 hours before the assignment is due to be checked for format correctness.

Details