Reducing the Search

Next: Results Up: Local Search Dependency Detection Previous: Improving the Matching

Reducing the Search

The purpose of the search is to test models of co-occurrence and find those that seem to represent significant dependencies between events in time. Because of the computational problems of increasing the number and character of the potential models, the dependency detection algorithm also was modified to apply local instead of exhaustive search. In this case, local search refers to a process of starting with a randomly selected initial pattern (a precursor and target event combination) and applying operators to find the best pattern from that one; the search continues until no better pattern can be found through operator application. Multiple initial starts of local search are done to collect different dependencies and explore different parts of the search space.

Local search has several advantages over exhaustive search for this application. First, we can control how much time is devoted to search through the operators applied and the number of trials. Second, we can be assured of getting the strongest dependencies; local search prunes the number of dependencies to be considered and usually prunes spurious dependencies based on few data.

The algorithm incorporates local search as follows:

Gather event streams of salient events.
Select parameters for the pattern: a specific target event, a length for the precursor (P) and a length for the window (W).
For each of T trials
1. Construct an initial pattern by randomly selecting P events in some order.
2. Find the best pattern by searching from the initial pattern:
  1. Apply the permutation operator to the best pattern found so far.
  2. For each permutation, test its significance by constructing a contingency table for the relative order matching and applying the G test to it.
  3. Select the best permutation.
  4. If the permutation is no better than the previous best, then halt search and return the best that was found; otherwise, repeat from step i.
Return a list of the best patterns found in each trial.

For the second step, as described in the previous section, patterns may be relative order and of a specified length with a specific target event. The window length indicates the number of allowed spurious events. For the third step, the search finds the best related pattern: the one with the highest G value for its contingency table. For each of the T trials, the initial pattern is composed of a precursor of length P with the elements chosen randomly plus the target event at the end.

New patterns are generated by applying a local search operator to the current best. Several local search operators were considered that permute the pattern by reordering and replacing pattern elements, but the best operator, in terms of efficiency and efficacy, is 1-OPT. 1-OPT optimizes one element in the pattern at a time. Progressing through the pattern positions in order, this operator considers all available elements as replacements; the operator replaces the element originally in the position with the best element for that position by testing the possibilities.

The algorithm continues to apply 1-OPT to subsequent best patterns until the resulting G value cannot be improved. The G values are obtained by constructing contingency tables and applying the G test as was described in the last subsection. The output is a list of the patterns returned in each of the T trials.

Next: Results Up: Local Search Dependency Detection Previous: Improving the Matching

adele howe
Wed Mar 6 09:01:10 MST 1996