Structured versus Random Benchmarks for the Permutation Flow-Shop Scheduling Problem

For most well-known NP-hard optimization problems, there are established benchmarks suites that are used to compare the performance of approximation algorithms. Very often, the problem instances in these benchmark suites are generated by selecting problem features at random: e.g., the city grid locations in the TSP. However, real-world problems generally exhibit some form of structure, and it is unclear whether algorithms that perform well on random instances will also perform well on more realistic, structured instances. To date, relatively little research has considered structured benchmarks for scheduling problems, despite the ubiquity of scheduling in real-world problems. We ( Jean-Paul Watson , Laura Barbulescu , L. Darrell Whitley , and Adele E. Howe ) consider this issue in the context of the Permutation Flow-Shop Scheduling Problem, or PFSP. Our paper, entitled Contrasting Structured and Random Permutation Flow-Shop Scheduling Problems: Search-Space Topology and Algorithm Performance, is due to appear in the Spring 2002 issue of the INFORMS Journal on Computing . Below, we provide links to both the structured and random problem instances we analyze in our paper. We also make available the code for our problem generator, and document the various generator input parameters. NOTE: the contents of this web page pretty much assume that you've read the journal paper.

Benchmark Problems

Random Benchmark Problems

We analyze two types of random PFSPs. In the first type (rand), the operation durations are independently and uniformly sampled from the interval [1,99], following Taillard (1993) . In the second type (rand-narrrow), the operation durations are independently and uniformly sampled from the interval [45,55]. The instances used in our experiments are available via the following links:

20 x 20 rand          20x20 rand-narrow
50 x 20 rand          50 x 20 rand-narrow
100 x 20 rand        100 x 20 rand-narrow
200 x 20 rand        200 x 20 rand-narrow

Each link points to a g'zipped tar file containing 100 problem instances and a file summarizing the lower bounds on the makespan and the best makespans found by any of the algorithms we considered in our paper. The contents of each archive are extracted to a sub-directory with an obvious name (e.g., 2020rand). The sub-directory contains a 'summary' file that reports the lower bounds/best-known solutions, and a 'problems' sub-directory containing the actual problem instances. The file format for the individual problem instances is identical to the format introduced by Taillard (1993) , and is the same format used in the PFSP benchmark suite available from the OR Library .

Structured Benchmark Problems

We introduce and analyze three types of structured PFSPs: job-correlated, machine-correlated, and mixed-correlated. The exact details of the generation process are discussed in the paper. The instances used in our experiments are available via the following links:

Job-Correlated                         Machine-Correlated                         Mixed-Correlated
20 x 20                                         20 x 20                                           20 x 20
50 x 20                                         50 x 20                                           50 x 20
100 x 20                                       100 x 20                                         100 x 20
200 x 20                                       200 x 20                                         200 x 20

Each link points to a g'zipped tar file containing 100 instances for each of the following values of a: 0.0, 0.1, 0.2, ..., 1.0. The contents of each archive are extracted to a sub-directory with an obvious name (i.e., 2020jc). The sub-directory contains 11 sub-directories, one for each value of a considered. The structure of each a sub-directory is identical to that reported for our random problem instances, as discussed above .

Problem Generator

We have made available the problem instances used in our INFORMS Journal on Computing paper to allow replication and further analysis of our results. However, the goal of our research was specifically not to introduce new PFSP benchmark problems. Rather, we are interested in analyzing search space structure and algorithm performance on different types of problem instances, in order to better understand the circumstances in which particular algorithms are likely to perform well. To facilitate such research on the PFSP, we have made the code for our problem generator publically available.

Download/Build

The code for our PFSP problem generator is available via the following link: fspgen.tar.gz . The g'zipped tar file contains contains 2 files: fspgen.cc and a makefile. The code is written in standard C++, and makes use of the Standard Template Library (STL). We have successfully built and tested the code on the following platforms: HP-UX (standard C++ compiler), Solaris 2.8 (GNU gcc 3.0 C++ compiler), and Linux (GNU gcc 3.0 C++ compiler). However, the code should compile successfully on any platform with a reasonably modern C++ compiler and an ANSI-compliant STL library. To build the generator, simply type 'make', which should produce an executable called fspgen. If you have any problems building the generator code, or find any errors in either our implementation or C++/STL compliance, please contact Jean-Paul Watson .

Generator Parameters

The command-line arguments to fspgen are as follows:

fspgen problem-type num-jobs num-machines random-#-seed [optional-arguments]*

The 'num-jobs', 'num-machines', and 'random-#-seed' are self-explanatory; if a value of 0 is specified for the 'random-#-seed' parameter, the random number generator is seeded using the value returned from the time function required by the ANSI C standard. There are 5 valid values of the 'problem-type' parameters:

taillard
gaussian
job-correlated
machine-correlated
mixed-correlated

For all problem types, the operation durations are limited to the interval [LB,UB]. By default, LB=1 and UB=99, following Taillard . LB and UB can be over-ridden by specifying the optional arguments '-durationLB=X' and '-durationUB=Y', respectively.

For the taillard problem type, the operation durations are independently and uniformly sampled from the interval [LB,UB].

For the gaussian problem type, the operation durations are independently and uniformly sampled from a normal distribution with mean ((UB-LB)/2)+LB and standard deviation (UB-LB)/6. We did not analyze this type of PFSP in our INFORMS Journal on Computing paper; it is considered in an earlier paper ( Watson et al. 1999 ), where we compare the search space structure of instances generated by sampling the operation durations both uniformly and from a normal distribution.

The details of the generation process for the three correlated problem types can be found in the paper. For each correlated problem type, lower and upper bounds on the distribution half-widths are central in the instance generation process; the default values are 1 and 5, respectively, and can be over-ridden by specifying the optional arguments '-distHalfWidthLB=X' and '-distHalfWidthUB=Y', respectively. Similarly, the expected overlap in the job/machine distributions is specified via the a parameter; the default value is 0.5, and can be over-ridden by specifying the optional argument '-alpha=Z'.

In the mixed-correlated mode, there is an additional noise parameter, which defaults to 0 and can be over-ridden by specifying the optional argument '-durationNoise=X'.

Output Format

All problem instances are output in a format identical to that used by the PFSP benchmark suite available from the OR Library . Following the specifications of the problem dimensions and operations durations, we additionally output the following:

The lower bound on the optimal makespan of the problem instance, as defined by Taillard (1993) .
The lower bound on the optimal makespan of the problem instance, as computed via a transform to a proportionate flow-shop (Pinedo 1995).
The minimum of the prior two lower bounds
The random number seed used to generate the problem instances

References.

Pinedo, M. 1995. Scheduling: Theory, Algorithms, and Systems. Prentice Hall.
Taillard, E. 1993. Benchmarks for basic scheduling problems. European Journal of Operational Research 64 278-285.
Watson, J-P., Barbulescu, L., Howe, A.E., and Whitley, L.D. 1999. Algorithm performance and problem structure for flow-shop scheduling.
Appears in the Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI-99).

Questions/Errors/Comments?

If you have any questions or comments regarding the contents of this web page, please contact Jean-Paul Watson .

Web page last updated: January 15th, 2002.