The original dependency detection required an exact match of pattern constituents to events observed in the traces. Many domains may not produce uninterrupted sequences, e.g., ones in which parallel processes are incorporated in a single event trace or regularly occurring, but uninteresting events may disrupt patterns.
To make the matching insensitive to the insertion of random events, the contingency table construction was changed to match a relative order pattern within a window. As before, the pattern consists of a precursor and a final target event. For example, we could use a precursor of any four recovery actions and a target of a particular failure. For the construction of a contingency table, we need to determine whether the precursor matches (thus, we add one to a count in the top row of the contingency table) and whether the target event matches (in which case we add one to a count in the left column of the table).
In this algorithm, the precursor is a relative order of a specified number of events. A window designates an area over which the algorithm will match the precursor. A pattern matches a given window position when the target event is at the last position and when the elements of the precursor appear in the window in the same order as they do in the pattern. If the window is exactly the length of the precursor, then the match must be exact as in the original dependency detection algorithm; if the window length is greater than that of the precursor, then the match checks the relative order of the constituents of the precursor. For example, if the precursor is [A B] and the window is of length three, then the precursor will match a portion of the event streams in which A is followed immediately by B or A and B are separated by a single event.
The algorithm for constructing the contingency table is:
Figure 1: Sliding a window to construct a contingency table.
For a given pattern, incidence counts are gathered by sliding an imaginary match window over the event streams. The window defines an area of length W. By default, this length is 2P+1, which indicates that twice as many elements can be found in the window as are expected to be part of the precursor (2P) and 1 accommodates the target event in the pattern. In other words, we can allow P spurious events to slip between salient events. The window is re-positioned by moving its last position (the target) to the next event of the same type as the target. As in the original formulation, four counts are tallied for the contingency table: precursor with target, precursor without target, not precursor with target and not precursor without target. Each window positioning contributes to one and only one of the four positions in the contingency table. Figure 1 shows four positionings of the sliding window and their contribution to the contingency table for the [ , ] pattern (the target is of type A and has the value ). After sliding the window through all event streams, a G-test is run on the resulting contingency table.