Our planner is based on the symbolic heuristic search value iteration
(symbolic HSVI) [Sim et. al., 2008], which is an extension of the
heuristic search value iteration (HSVI) [Smith and Simmons, 2004] in
order to handle factored POMDPs. Our planner uses the symbolic Perseus
format. For the competition, we modified the symbolic HSVI to handle
finite horizon POMDP models. We improved the performance of the
algorithm by using the alpha vector masking method [Smith and Simmons,
2005]. In the method, each alpha vector is masked with respect to the
belief b that generated the alpha vector. When looking for the best
alpha vector at belief b¢, we can reject alpha vectors from
consideration if b and b¢ are not closely related enough. As a result,
we can retain simple algebraic decision diagram (ADD) representation
of alpha vectors and reduce the number of computations when
determining the best alpha vector at the given belief. Moreover we
reduced redundant computations by eliminating symmetric structures in
the domains. To find the symmetric structures, we formulated the
problem as a graph automorphism (GA) problem. Since our planner keeps
all structures as ADDs we extend [Kim, 2008]¢s algorithms to factored
symbolic Perseus domains and found the symmetries using Bliss software
tool [Junttila and Kaski].
References
[Sim et. al., 2008] Symbolic heuristic search value iteration for
factored POMDPs
[Smith and Simmons, 2004] Heuristic search value iteration for POMDPs
[Smith and Simmons, 2005] Point-based POMDP algorithms: Improved
analysis and implementation
[Kim, 2008] Exploiting symmetries in POMDPs for point-based algorithms
[Junttila and Kaski] http://www.tcs.hut.fi/Software/bliss/