This tutorial aims to build intuition for the solution procedures for partially observable Markov decision processes (POMDPs). It sacrifices completeness for clarity, presenting the main problems geometrically rather than through a series of formulas. In fact, we avoid the actual formulas altogether, keep notation to a minimum, and rely on pictures to build the intuition.
We try to keep the required background to a minimum and provide brief mini-tutorials on the necessary background material. Select one of the options below:
- I am clueless; I don't know anything about Markov decision processes or their algorithms.
- Refresh my memory; I know Markov decision processes, but not the value iteration algorithm for solving them.
- Give me the POMDPs; I know Markov decision processes, and the value iteration algorithm for solving them.
- I'm feeling brave; I know what a POMDP is, but I want to learn how to solve one.
Here is a complete index of all the pages in this tutorial.
- Brief Introduction to MDPs
- Brief Introduction to the Value Iteration Algorithm
- Background on POMDPs
- Background on Solving POMDPs
- POMDP Value Iteration Example
- General Form of a POMDP solution
- Monahan's Enumeration Algorithm
- Sondik's One-Pass Algorithm
- Cheng's Linear Support Algorithm
- Littman et al.'s Witness Algorithm
- Zhang's Incremental Pruning Algorithm