Planner Components

The following organizes planner components into those involved with translation, preprocesing, search, heuristics, and postprocessing. The planners as described by their authors will rarely fall neatly into these categories and unfortunately the source code is typically even more tighly coupled and more difficult to disentagle. Nevertheless, several strategies within each component are often reused by many planners. The focus is primarily on planners that use some heuristic to guide search through state-space.

A page organized by planners can be found here


Components:


Translation: What tools are used to translate PDDL (usually lex/flex and yacc/bison) and how facts and operators are represented.

Binary representation of facts
The state of the world is represented as a bit vector, one bit for each fact that is true. Operators therefore have one bit vector for preconditions (each fact that must be true for the operator to be executed is set), one for added effects, and one for deleted effects.
Spike representation of planning graph
(Spike Rep) Motivation:
During construction (especially later construction) of the planning graph very little new information is added at each new layer. Not only must information be duplicated to the next layer, but mutex relations must also be rechecked.
(Spike Rep) Basic idea:
Separate the layer-dependent and layer-independent information about facts and operators. Distinguish between permanent and temporary mutex relations.
(Spike Rep) Data Structures:
  1. Array of fact headers - one for each fact. These are created dynamically. Each fact header contains:
    1. Name - description only (probably just for reconstructing the plan)
    2. Index - position in fact array
    3. Bit mask - Bit vector with its index bit set, all others unset
    4. Reference - to achieving no-op (if exists)
    5. Consumers - Bit vector with bits set of all actions (their indices) that need this fact as a precondition
    6. Fact level package - one for each layer in the graph. This is what stores the layer dependent information and consists of:
      1. FMV (fact mutex vector) - Bit vector with bit set for each fact the fact is mutex with in this rank.
      2. AV (achievement vector) - Bit vector with bit set for all actions that achieve this fact in the previous action rank.
  2. Array of action headers - one for each action (also created dynamically). Each action header contains:
    1. Name
    2. Index
    3. Bit mask
    4. Flag - is it a no-op or not
    5. 3 bit vectors for preconditions, adds, and deletes
    6. Action level package - one for each layer that stores layer dependent information. Each consists of:
      1. AMV (action mutex vector) - Bit vector that corresponds to all actions the action is mutex with.

(Spike Rep) Algorithm:
Continue constructing spike until graph opens (all goals are pairwise non-mutex). Once this occurs alternate search and expansion until fix point is reached (no changes in new layers). After the fix point is reached use wave front.
  1. Construct new action rank (another action level package for each action header)
    1. Add no-ops for each fact header in previous fact rank. Update reference in fact header so it points to this no-op.
    2. Enact all possible actions. OR the FMVs (of previous rank) of all preconditions and AND that with precondition vector of same action. If result is non-zero then can't enact action.
    3. From add effects of action, create new fact headers for new facts, update fact level packages for existing facts.
  2. Update action mutexes
    1. Permanent mutex relationships never need to be retested (but AMV is recopied into next package). Permanent mutex relationships exist if actions have conflicting add and delete lists or conflicting preconditions and delete lists.
    2. Temporary mutex relationships occur if preconditions are mutex and must be retested. This is similar to testing if an action is applicable except OR the FMVs of preconditions of one action and AND that with preconditions of the other action.
  3. Finish modification of new fact rank. New facts were already added from the add vectors of enacted actions. Update fact mutexes.
    1. Any non-mutex facts are still non-mutex.
    2. Two facts are mutex if all their achievers are mutex. Examine every fact with every other fact (but check is symmetric so only need to do in one direction).
(Spike Rep) Comments:
Details of memoization (TODO) are left out of this description. Also the spike representation does not account for conditional or quantified effects... how to implement that is described in other papers.
(Spike Rep) References:
"Efficient Implementation of the Plan Graph in STAN." Reference

Pre-processing: Domain/problem representation is analyzed, reduced, or modified in order to speed up search or heuristic evaluation. Techniques include removing irrelevant facts/actions/objects, adding constraints, detecting inertia, and detecting symmetry

RIFO (Removal of Irrelevant Facts and Operators)
(RIFO) Motivation:
Many planning algorithms do regression search to focus search on relevant actions. Planning graph systems, however, search forward. Irrelevant facts and actions can seriously hinder this forward search, so methods of detecting these can aid planning graph type systems.
(RIFO) Definitions: (RIFO) Data Structure: Fact Generation Tree
A fact generation tree consists of OR and AND nodes. (RIFO) Algorithm:
  1. Begin with an AND node that contains all goals.
  2. Compute possibility set of goals (either proceed until initial facts are reached or to a certain depth).
    • To compute the possibility set of an AND node, compute the union of all combinations of possibility sets of all children OR nodes. For example, if there were two OR nodes with possibility sets {{a},{b}} and {{c}} then the possibility set would be {{a,c} {b,c}}.
    • To compute the possibility set of an OR node, simply take the union of the preconditions of the operators that generate that node.
  3. Compute the set of probably relevant initial facts from the possibility set of the goals. Since the possibility set is really an approximation (see below) and the possiblity set likely contains more than one element the question is how to choose. There are four options and some can eliminate relevant facts.
    "All-sets" - union over all elements
    Union over all set-inclusion minimal sets (TODO)
    "All-best-sets" - union over all minimum sets
    "One-best-set" - pick one minimum set
  4. Decide what to do with the set of probably relevant initial facts. There are three options of increasing restriction.
    1. Count all objects that are in the set and exclude any initial facts that are not one of these objects.
    2. Count only the initial facts that are in the set as relevant
    3. Count only the ground operators that occur in the fact generation tree and are generated by the set.
  5. Approximations: Since the fact generation tree can be very expensive to compute and since it is only meant as a heuristic to trim some of the irrelevant facts, the tree can be constructed in an approximate fashion. The authors suggest two ways of doing this:
    1. Memoization - making it a directed graph more than a tree (TODO)
    2. Only storing the 10 smallest sets computed so far at each node
(RIFO) References:
"Ignoring irrelevant facts and operators in plan generation." Reference
GRT Remove Irrelevant Objects
(GRT Remove Irrelevant Objects) Motivation:
The motivation is similar to RIFO. Even though GRT does not construct a planning graph of any kind, initial calculation of the heuristic and forward search are all slowed by irrelevant objects.
(GRT Remove Irrelevant Objects) Basic Idea:
Find irrelevant objects and remove all facts and actions containing them. Technique only works on pure STRIPS domains with no negation in the goal or any preconditions. Much more restrictive than RIFO but maintains completeness (RIFO can potentially remove relevant facts/actions) and may run faster.
(GRT Remove Irrelevant Objects) Algorithm:
An object is irrelevant if
  1. It does not appear in any goal fact, unless the same fact is also included in the initial state AND
  2. There is no action containing the object in its preconditions, unless the object is also conntained in all the action's effects
Author's example is adding capabilities to paint an object in logistics domain with a painted predicate, a color predicate, and a paint action for each package and color. If the goal does not specify any colors then colors are irrelevant objects and all facts and actions containing them can be removed.
(GRT Remove Irrelevant Objects) Related:
RIFO, the work of Ulrich Scholz and Haslum and Jonsson. (TODO)
(GRT Remove Irrelevant Objects) References:
"The GRT planning system: Backward heuristic construction in forward state-space planning." (2 pages devoted to description) Reference
Add Negated Predicates
(Add Negated Predicates) Motivation:
Negatives facts that are implicitly present in the initial state or preconditions can cause problems. For example, in the elevator domain the add effect of the action "board" is that a passenger is boarded. There is no predicate for not_boarded. This causes two problems.
  1. The action can be applied to the same passenger repeatedly.
  2. The initial state does not contain the fact that the passengers are not initially boarded. If the initial state only contains static facts, then a heuristic calculation like GRT's will calculate zero distances from the goal for each state. (TODO why?)
(Add Negated Predicates) Basic Idea/Algorithm:
  1. Use the same mutex calculation algorithm used for completing incomplete goals.
  2. Find facts that are not mutex with any initial state fact. Create negated predicates of such facts.
  3. Add negated predicates to initial state. Add negated predicates to preconditions and delete lists of appropriate actions.
(Add Negated Predicates) References:
"The GRT planning system: Backward heuristic construction in forward state-space planning." (2 pages devoted to description) Reference
Operator graphs
These are relevant to partial order planners in the reference below, but may be an interesting precursor to other operator reduction techniques. Operator graphs can also help detect "recursion" or cycles in a problem (open the trunk, close the trunk, open the truck...).
"Suspending Recursion in Causal-link Planning" Reference
Inferring State Constraints (DISCOPLAN)
(Inferring State Constraints) Basic idea:
One can examine domain and discover constraints that are always true based on the logical structure of the operators. The authors find that in most of the domains they examine (blocksworld, logistics, hanoi, and ferry) that two kinds of constraints dominate.
  1. Implicative constraints - in which all variables occuring anywhere in the constraint occur in one of the antecedent literals.
    Example: ON(x,y) && NEQ(y, TABLE) ⇒ ¬CLEAR(y)
    In the example, x and y are the only variables in the constraint and they occur in one of the literals in the antecedent (ON(x,y)).
  2. Single-valuedness constraints - no literal has an all-inclusive set of variables
(Inferring State Constraints) Comments:
While the authors suggest this technique might be helpful in "unachievability pruning" in deductive or regression planners, adding these constraints is most useful in SAT-based planners.
(Inferring State Constraints) References: "Inferring State Constraints for Domain-Independent Planning" Reference
XOR Constraints
(XOR Constraints) Motivation:
GRT and HSP do not handle grid-like domains such as grid and mystery very well. Click here for a brief description of these domains. The problem is that actions are selected that lead to a state that is closest to the goal, and planners therefore can get stuck in a local optimum.
The author's example is a simple grid problem where a robot (R) must pick up a key (K) at the upper left corner and drop it at the upper right corner (see figure below). The way the heuristic is constructed the distance to the goal is actually less if the robot moves right instead of moving up.
Initial state
K    
     
R    
Goal state
    K
     
    R
(XOR Constraints) Basic Idea:
XOR constraints specify that only one fact of a group can be true in any complete state. For example, in the simple domain above the robot can only be at one location at any one time (although the domain description does not specify this). The authors notation for such a constraint is ((xor(at ?Robot *)) (robot ?Robot)). Combining the grounded constraints with the initial and goal facts, one can create an ordering of intermediate states. By attempting to achieve these intermediate states rather than the goal state itself search is sped up considerably.
(XOR Constraints) Algorithm:
NOTE: This algorithm is not explicitly provided by the authors and is pieced together from their general descriptions.
  1. Compute XOR constraints
    One way to do this is to use the mutex calculations from computing incomplete goals. Using the grid example from above, the mutex calculations would show that that the nine facts ( (at R Location 1) through (at R Location 9) ) are all mutually exclusive with each other. One need only to extract "fully connected subsets" from the set of mutexes and each subset would represent an XOR constraint.
  2. Find pairs of facts (one from initial state and one from goal state) that belong to the same grounded XOR constraint.
  3. Construct the Greedy Regression Graph
    This is similar to the Greedy Regression Table. Instead of a table entry, each ground fact is now a node. Each node/fact now stores the action that achieved it. Arcs in the graph are from preconditions to achieved facts.
  4. For every pair of facts computed in earlier step, find a sequence of actions that change the initial fact to the goal fact.
    • These sequences can be derived from the GRG.
    • These sequences may not be complete. In the grid example above the intial fact (at K upper-left) and the goal fact (at K upper-right) would be part of a XOR constraint stating the key can only be in one location at a time. The sequence would be (get R K upper-left) (drop R K upper-right). Only preconditions related to the XOR constraint are considered. The action drop requires the precondition (hold R K) which involves K. The action get makes (hold R K) true. Preconditions concerining R's location are ignored so the sequence does not need those actions.
  5. Check if facts that are in any grounded XOR constraint are related to actions in the sequences computed in previous step (not counting facts related to the same XOR constraint). If so, these facts are subgoals that need to be achieved before the goal facts of sequences. The two types of subgoals are XOR constrained facts that are either
    1. preconditions of an action in a "foreign" XOR sequence (a sequence related to a different XOR constraint)
    2. add effects of an action from their own XOR sequence, but that action has preconditions which are from a "foreign" XOR constraint
  6. If there are no such facts, then the problem should present no difficulties to the heuristic planner. If there are such facts, then
    1. Construct an ordering graph that contains those facts along with the initial and goal facts. See the paper for rules for constructing ordering constraints.
    2. Construct intermediate states from the ordering graph. Try to insert one fact from each XOR constraint (that is part of the ordering graph) without violating any ordering constraints. If more than one fact can be inserted from a XOR constraint then randomly pick one. If no fact can be inserted then leave the state incomplete. These intermediate states now represent subgoals.
(XOR Constraints) Comments:
It is important to note that GRT requires the constraints to be added to the domain definition. While there are ways to compute them automatically (see Algorithm and Related) this was never fully implemented by the authors. They argue that automatically generated ones would lead to pointless decompositions.
(XOR Constraints) Related:
• XOR constraints are related to the constraints discussed here and the techniques in that paper could be used to automatically compute XOR constraints.
• XOR constraints could also be derived from the mutex calculations for completing the goal state (see Algorithm and Completing Incomplete Goals.
• Using XOR constraints to create intermediate goals seems only to be done by GRT, but a goal ordering/agenda is an important tool implemented by many other planners such as FF (TODO insert link) and its descendents.
(XOR Constraints) References:
• Exploiting State Constraints in Heuristic State-Space Planning 2000 Calculating and using XOR Constraints Reference
• "The GRT planning system: Backward heuristic construction in forward state-space planning." 2001 Much of the same discussion/example of XOR constraints, but discusses how integerated into GRT. Reference
Complete Incomplete Goals
(Complete Incomplete Goals) Motivation:
Regression planners such as GRT estimate the distance from intermediate states to goals by searching backwards. Incomplete goal descriptions (i.e. every fact that is true is not explicitly stated) may make it impossible to apply some actions in reverse since the preconditions to those actions will not be present in the goal state.
NOTE: This description is based on GRT, but the problem of incomplete goal states also affects HSP-r and AltAlt (TODO add HSP link and research). HSP-r checks the validity of each individual state during regression search rather than explicitly completing the goal in a pre-processing phase.
(Complete Incomplete Goals) Basic Idea:
Add facts to the goal state that are not mutually exclusive with any of the facts explicitly stated in the goal. There are two ways to do this (summarized in table below).
  Advantages Disadvantages
1) Simplified Graphplan-like algorithm that primarily keeps track of mutex relations. Automated Time-consuming (10 - 20% of the total time. Ratio, of course, depends on how much time is spent searching. Overhead is significant for easy problems.)
2) Domain knowledge in the form of axioms. Quick (once you have the axioms) and allows more complex relationships than binary mutexes. Not automated although DISCOPLAN and TIM (TODO) have made progress in automating this process.
Once one has the facts to complete the goal state, the authors of GRT consider three ways of acutally adding these facts.
  Advantages Disadvantages
1) Use them all (default for GRT) * Heuristic construction fast (many facts are achieved at the beginning, many actions applicable early).
* Search tends towards breadth-first, produce better plans
* Heuristic less informative (small differences between estimates)
* Search tends towards breadth-first, takes longer
2) Use candidate facts that are also initial facts. Remaining facts that are mutually exclusive with these are not used. More informed heuristic, faster search If initial facts shouldn't be included in goal may mislead search.
3) Progressive - instead of completing goal and then calculating heuristic, only add facts to goal when no action can be applied. To choose which fact to add next try the following.
  1. Facts that can combine with original goals
  2. Facts that can combine with other already achieved facts
  3. Facts included in the initial state
  4. Remaining facts selected randomly
Usually this produces the best tradeoff between speed and plan quality. Plan quality suffers greatly for some problems.
(Complete Incomplete Goals) Algorithm:
• The algorithm is described completely in the paper. The following description is an exercise/reference. The numbering of the algorithm corresponds to that of the paper.
• Notation: For an action α, P(&alpha) is the precondition list, A(&alpha) is the add list, D(&alpha) is the delete list.
• Every time a mutex relation is removed or a new fact is achieved, the variable Counter is reset to 0. Therefore, the while loop will end when no applicable action can achieve any new facts or remove any mutex relations.
• Note that steps 5bII - 5bIV maintain the A1(α) set while 5bV - 5bVII maintain the A2(α) set.
  1. Compute all ground actions and put them in a list. N = number of ground actions.
  2. Compute all facts of domain. Set achieved flag to false for each fact.
  3. Set achieved flag to true for initial facts.
  4. Set Counter = 0
  5. While Counter < N
    1. Take the first action α from the list. Increment Counter.
    2. If all preconditions achieved for α and no mutex between them, then (action can be applied)
      1. Divide the facts in A(α) into two sets.
        • A1(α) = all facts not already achieved
        • A2(&alpha) = all facts already achieved.
      2. Mark all facts in A1(α) as achieved. Counter = 0
      3. Mark all facts in A1(α) as mutex with those in D(α). (Newly achieved facts of an action are mutex with facts deleted by that same action.)
      4. For any facts not in A(α) that are mutex with at least one fact in P(α), mark them as mutex with all facts in A1(α). (If a fact is mutex with preconditions of an action, it is also mutex with add effects of that action (unless those effects are achieved already by some other action).)
      5. If any facts in A2(α) are mutex with each other, remove those mutex relations and set Counter = 0. (Previous actions must have marked them as mutex, but this action just achieved them so they are no longer mutex.)
      6. If any facts in A2(α) are mutex with facts in P(α) that are not in D(α), remove those mutex relations and set Counter = 0. (If a fact was previously mutex with a nondeleted precondition, it no longer is mutex since it was achieved by α.)
      7. For any facts that are not mutex with any P(α) but are mutex with one fact in A2(α), remove those mutex relations and set Counter = 0. (If a fact is not mutex with any preconditions of an action, then it is not mutex with any add effects of that action.)
    3. Remove α from the beginning of the list of actions and add it to the end of the list.
  6. The enriched set of goals includes all goal facts and all facts that are achieved and not mutex with any goal facts.
(Complete Incomplete Goals) References:
"On Determining and Completing Incomplete States in STRIPS Domains" Reference

Searching: Basic search technique(s) used, advanced techniques to make search more efficient, and algorithms to choose between search techniques

Wave front

Heuristics: What heuristics are available and algorithms to choose between heuristics.

Admisssibility (GRT Heuristic)
(GRT Heuristic) Motivation:
The hadd and hmax heuristic of early versions of HSP were recomputed for each intermediate state and do not take into account that goal facts may not be independent. The GRT heuristic computes the distance to the goal only once for each fact and keeps track of related facts to try to improve these deficiencies.
(GRT Heuristic) Basic Idea:
(GRT Heuristic) Data Structures:
(GRT Heuristic) Algorithm:
(GRT Heuristic) Comments:
(GRT Heuristic) Related: (GRT Heuristic) References:

Post-processing: Adaptive or feedback processes that control search or heuristics.


Last modified: Wed Sep 1 13:28:29 MDT 2004