Zhang, et al Incremental Pruning

POMDPs for Dummies

The latest approach, Zhang and Liu (1996), called incremental pruning, combines elements of Monahan's enumeration and the witness algorithms. Like witness, it considers constructing sets of vectors for each action individually and then focusing on each one observation at a time.

The basic idea is to eliminate doing all this region business. Since the main problem is in finding all the different combinations of future strategies, we focus on this aspect of Zhang's algorithm. Adding the immediate rewards is an easy step.

For a given action, we first construct all of S(a,z) sets (one for each observation). In Monahan's algorithm we would then try all ways of combining choices from each of these. Zhang's algorithm does this incrementally observation by observation. For example:

Zhang's vector combinations

We do all combinations of the S(a,z1) and S(a,z2) vectors. Which will yield a value function like this:

Zhang's resulting vectors

We then eliminate all those useless (light blue) vectors and are left with just three vectors. We then combine these three with the vectors in S(a,z3) as shown below.

Zhang's second step

This is repeated for the other action and then the sets combined just as in the witness algorithm. There are some more sophisticated generalizations of this algorithm (Cassandra, Littman and Zhang, 1997) that can better exploit the geometry of the values functions.

End of tutorial

Back | POMDP Tutorial