CS553
Colorado State University

===============================================
Data-Flow Analysis, Lattice-Theoretic Framework
===============================================

9/24/09
    - grad school philosophy,
      http://www.cs.unc.edu/~azuma/hitch4.html
      What Ronald Azuma wished he knew before going to graduate school.

        Traits you need for grad school
            Initiative
            Tenacity
            Flexibility
            Interpersonal skills
            Organizational skills
            Communication skills

        Other really useful information to help you through grad school.

------------------
Context for Lattice-Theoretic Framework

    Theory provides ...
        a way to prove an analysis is correct just by proving certain
        properties about the transfer functions and the domain of input
        and output values

        the foundation for data-flow analysis software architecture

        places a bound on the analysis complexity

    Recall examples such as reaching definitions ...
        there was a domain of possible values in the IN and OUT sets

        each statement had an instance of a transfer function that
        converted IN sets to OUT sets or vice versa

        at points in the control flow graph where there were multiple
        edges coming into a node, we performed an operation such as
        union or intersection to compute the IN set for that node

------------------
Slide 3: lattice

    Show V and meet for the example lattice for liveness
        V = 2^S = { {}, {i}, {j}, {k}, {i,j}, {i,k}, {j,k}, {i,j,k} }
            where S is the set of all variables.
        meet = union
    and show how the example lattice satisfies the given properties.

    Have students draw lattices for reaching definitions and available
    expressions, determine what meet is for those lattices, and show that
    the lattices satisfy the properties.

------------------
Slides 3 and 4: greatest lower bound

    - show them on the picture what the intuition is for meet/greatest
      lower bound

    from the book, the glb g of x and y is such that
        1. g <= x
        2. g <= y
        3. if z is such that z <= x and z <= y, then z <= g

    Meet is glb (in the below '^' is used to represent the meet operation)

    1. Assume g = x ^ y and show g <= x
        subgoal: show g ^ x = g using the above assumption
            g ^ x
            (x ^ y) ^ x     // use assumption
            x ^ (y ^ x)     // associativity
            x ^ (x ^ y)     // commutativity
            (x ^ x) ^ y     // associativity
            x ^ y           // idempotence
            g               // assumption
        g ^ x = g iff g <= x
        Thus: g <= x

    2. Similar for g <= y

    3. Assume z <= x and z <= y and g = x ^ y to show z <= g
        based on assumptions: z ^ x = z and z ^ y = z
        subgoal: show z ^ g = z
            z ^ g
            z ^ (x ^ y)     // assumption g = x ^ y
            (z ^ x) ^ y     // associativity
            z ^ y           // assumption z ^ x = z
            z               // assumption z ^ y = z
        z ^ g = z iff z <= g
        Thus: z <= g

------------------
Slide 9: Lattice Example

    What are the data-flow sets for liveness?
    What is the meet operation for liveness?
    What partial order does the meet operation induce?
        Remember: x ^ y = x iff x <= y
    What is the liveness lattice for this example?

------------------
Slide 10: Recall Liveness Analysis

    Many analyses can be expressed with transfer functions of the
    following form:

        f_n( x ) = gen_n union ( x - kill_n )

    where gen_n and kill_n can be computed before iterative data-flow
    analysis starts.

    --> What analysis does this not work for?

------------------
Slide 12: Direction of flow

    The x in the generic transfer function above changes based on the
    analysis direction.

------------------
Slide 14: Merging Flow Values

    Main idea is that statically we do not know what path has been taken
    to get to a node. We have to assume that any path is possible and
    conservatively combine results.

    When we are unsure of our answer, we want to err on the side of
    having too much in our sets.
        For liveness for register allocation, this creates larger
        live-variable sets, which is safe.

        For reaching definitions, the same is true: if we THINK we have
        more reaching definitions than we do, we will inhibit constant
        propagation (or loop invariant code motion), which again will be
        safe.

    For available expressions smaller sets are safe.
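
    A small sketch, not from the slides, that ties the notes above
    together for the liveness lattice over S = {i, j, k}: meet is union,
    the order is the one induced by x ^ y = x iff x <= y, and the
    transfer function has the gen/kill form. The function names
    (powerset, meet, leq, is_glb, transfer) and the gen/kill/successor
    sets at the bottom are made up for illustration.

```python
from itertools import combinations

# Variables from the example liveness lattice in the notes.
S = frozenset({"i", "j", "k"})


def powerset(s):
    """Return every subset of s as a frozenset, i.e. V = 2^S."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


V = powerset(S)


def meet(x, y):
    """Meet for liveness: set union."""
    return x | y


def leq(x, y):
    """Partial order induced by meet: x <= y iff x ^ y == x."""
    return meet(x, y) == x


def is_glb(g, x, y):
    """Check the three glb properties from the notes by brute force."""
    lower = leq(g, x) and leq(g, y)
    greatest = all(leq(z, g) for z in V if leq(z, x) and leq(z, y))
    return lower and greatest


def transfer(gen, kill, x):
    """Generic gen/kill transfer function: gen union (x - kill)."""
    return gen | (x - kill)


# Meet really is the glb for every pair of lattice elements.
assert all(is_glb(meet(x, y), x, y) for x in V for y in V)

# Hypothetical merge: two successors with different live-in sets.
succ_a = frozenset({"i"})
succ_b = frozenset({"j", "k"})
out_set = meet(succ_a, succ_b)      # conservative merge: union
in_set = transfer(frozenset({"i"}), frozenset({"j"}), out_set)
print(sorted(out_set), sorted(in_set))
```

    Note that with meet = union the induced order is reverse inclusion:
    leq(S, frozenset()) holds, so the full set {i,j,k} sits at the bottom
    of the lattice and the empty set at the top.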
------------------
Slide 15: Reaching defs

    What is the lattice?
    What is the initial guess?
    What is the meet operation?

------------------
Slides 16-17: Available Expressions

    Determine the meet operation for the lattice based on the algorithm.
    What is the lattice for this particular example?

------------------
Slide 19: MFP

    Maximal fixed point (MFP)
        Visit nodes and evaluate data-flow equations until no changes
        occur.

        Each time a data-flow equation is re-evaluated, the result is
        more conservative than (partially ordered before or under) the
        result from the last time the equation was evaluated.

------------------
Slide 20: Correctness

    - why does V_mfp <= V_mop indicate correctness?

    - F_r represents the composed function for the whole path after the
      merge, up to the current node

    composition of f_p with f_q
        f_p( x ) = gen_p union ( x - kill_p )
        f_q( x ) = gen_q union ( x - kill_q )

        f_p( f_q( x ) ) = (f_p compose f_q) (x)
            = gen_p union ( (gen_q union ( x - kill_q )) - kill_p )

    -> Is the gen/kill format for transfer functions closed under
       composition?

    Must show both directions to prove the observation.

    Assume: f(x^y) <= f(x) ^ f(y) and x <= y
        x ^ y = x                       // definition of x <= y
        f(x) <= f(x) ^ f(y)             // using x <= y, so f(x^y) = f(x)
        f(x) <= f(x) ^ f(y) <= f(y)     // f(x) ^ f(y) is glb for
                                        // f(x) and f(y)
    To Show: if x <= y then f(x) <= f(y)

    Assume: if x <= y then f(x) <= f(y)
        also know: x ^ y <= y and x ^ y <= x due to meet being glb
        if x ^ y <= y then f(x ^ y) <= f(y)
        if x ^ y <= x then f(x ^ y) <= f(x)
        f(x) ^ f(y) is glb of both so due to property 3 of glb ...
    To Show: f(x^y) <= f(x) ^ f(y)

--------------------
Slide 21: Monotonicity

    -> How can we show that f_p( x ) = gen_p union ( x - kill_p ) is
       monotonic?

--------------------
Slide 23:

    height of lattice: 6 def statements, height = 6
    passes needed: 2
    non-optimal order? when we went backwards it took 7 passes,
    worst case is 7 * 6 * t, they match

    Reaching defs:
        Another way to think of this is that the gen is not dependent on
        the IN set.
        Only need to visit each node 3 times for this example if we
        visit in breadth-first order.
        Not height of lattice, instead depth+2.
        The depth of a control flow graph is the largest number of back
        edges on any acyclic path.
        Determine back edges by performing a depth-first traversal.

--------------------
Slide 24: Accuracy

    Can we show reaching defs is distributive?
        f(u) = gen union (u - kill)
        f(v) = gen union (v - kill)
        f(u^v) = gen union ((u ^ v) - kill)

        f(u^v) =? f(u) ^ f(v)

        // meet is union for reaching defs; G is gen and K is kill
        G union ((u union v) - K) =? (G union (u - K)) union (G union (v - K))
        G union ((u - K) union (v - K))     // set difference distributes
                                            // over union
        (G union (u - K)) union (G union (v - K))
                                            // G union G = G, plus
                                            // associativity and
                                            // commutativity of union

--------------------
Slides 25-28: Powerset and tuple lattices

    Some tradeoffs that we have experienced between the two when
    implementing reaching constants.

--------------------
mstrout@cs.colostate.edu, 9/24/09