About this Page
The value function is represented as a set of vectors, each with an associated action. Each vector holds the coefficients of a hyperplane passing through the origin. The format specified here is what the 'pomdp-solve' program outputs, and what it requires as input when the '-terminal_values' command line option is used.
The format is simply:
A
V1 V2 V3 ... VN

A
V1 V2 V3 ... VN

...
Where A is an action number and V1 through VN are real values giving the components of the vector associated with that action. The action number is the 0-based index of the action as specified in the input POMDP file. Each vector holds the coefficients of a hyperplane that forms one facet of the piecewise linear and convex (PWLC) value function. Note that the length of each vector (N) must equal the number of states in the POMDP.
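As an illustration of the format described above, here is a minimal Python sketch of a reader. It is not part of 'pomdp-solve'; the function name parse_alpha_vectors is hypothetical, and the sketch assumes that entries are separated by arbitrary whitespace (spaces or newlines) and that the caller already knows the number of states.

```python
from typing import List, Tuple


def parse_alpha_vectors(text: str, num_states: int) -> List[Tuple[int, List[float]]]:
    """Parse alpha-vector text into a list of (action, coefficients) pairs.

    Hypothetical helper: assumes whitespace-separated tokens, where each
    record is one action number followed by num_states real values.
    """
    if num_states <= 0:
        raise ValueError("num_states must be positive")

    tokens = text.split()
    record_len = num_states + 1  # one action number plus N coefficients

    # Every record must be complete, otherwise the vector length is wrong.
    if len(tokens) % record_len != 0:
        raise ValueError(
            f"expected a multiple of {record_len} tokens, got {len(tokens)}"
        )

    vectors = []
    for start in range(0, len(tokens), record_len):
        action = int(tokens[start])  # 0-based action index
        coeffs = [float(t) for t in tokens[start + 1:start + record_len]]
        vectors.append((action, coeffs))
    return vectors


if __name__ == "__main__":
    sample = "0\n1.0 2.0\n\n1\n3.5 -4.0\n"
    for action, coeffs in parse_alpha_vectors(sample, num_states=2):
        print(action, coeffs)
```

The length check enforces the rule that every vector has exactly as many components as the POMDP has states; a file with a short or long vector is rejected rather than silently misaligned.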
To find the "best" action for a given belief state, take the dot product of the belief state probabilities with the coefficients of each alpha vector. The vector with the highest value is the winner, and the action associated with that vector is the best action to take for that belief state under that PWLC value function.
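The selection rule just described can be sketched in a few lines of Python. The function name best_action and the (action, coefficients) pair representation are assumptions made for this example, not something defined by 'pomdp-solve'. Ties are broken in favor of the first vector encountered, which is also an arbitrary choice of this sketch.

```python
from typing import List, Sequence, Tuple


def best_action(
    belief: Sequence[float],
    alpha_vectors: List[Tuple[int, Sequence[float]]],
) -> Tuple[int, float]:
    """Return (action, value) for the alpha vector maximizing belief . vector.

    Hypothetical helper: alpha_vectors is a list of (action, coefficients)
    pairs, and belief is a probability distribution over the states.
    """
    if not alpha_vectors:
        raise ValueError("need at least one alpha vector")

    best_act = -1
    best_val = float("-inf")
    for action, coeffs in alpha_vectors:
        if len(coeffs) != len(belief):
            raise ValueError("vector length must equal the number of states")
        # Dot product of the belief with this vector's coefficients.
        value = sum(b * c for b, c in zip(belief, coeffs))
        if value > best_val:  # strict comparison: the first maximum wins ties
            best_act, best_val = action, value
    return best_act, best_val


if __name__ == "__main__":
    vectors = [(0, [10.0, 0.0]), (1, [0.0, 10.0]), (2, [6.0, 6.0])]
    print(best_action([0.5, 0.5], vectors))  # dot products are 5.0, 5.0, 6.0
    print(best_action([1.0, 0.0], vectors))  # dot products are 10.0, 0.0, 6.0
```

With the uniform belief the third vector wins, so action 2 is selected; when the belief is concentrated on the first state, the first vector wins and action 0 is selected.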