Help   About ProQuest | 

Dissertations & Theses
The world's most comprehensive collection of dissertations and theses.Learn More...

Citation/Abstract

Print  |  Email  |  Order a Copy  
Finite-memory control of partially observable systems
by Hansen, Eric Anton, Ph.D., University of Massachusetts Amherst, 1998, 174 pages; AAT 9909170

Abstract (Summary)

A partially observable Markov decision process (POMDP) is a model of planning and control that enables reasoning about actions with stochastic effects and observations that provide imperfect information. It has applications in diverse fields that include artificial intelligence, operations research and optimal control, although computational difficulties have limited its use.

This thesis presents new dynamic-programming and heuristic-search algorithms for solving infinite-horizon POMDPs. These algorithms represent a plan or policy as a finite-state controller and exploit this representation to improve the efficiency of problem solving.

One contribution of this thesis is an improved policy-iteration algorithm that searches in a policy space of finite-state controllers. It is based on a new interpretation of the dynamic-programming operator for POMDPs as the transformation of a finite-state controller into an improved finite-state controller. Empirically, it outperforms value iteration in solving infinite-horizon POMDPs.

Dynamic-programming algorithms such as policy iteration and value iteration compute an optimal policy for every possible starting state. An advantage of heuristic search is that it can focus computation on finding an optimal policy for a single starting state. However, it has not been used before to find solutions with loops, that is, solutions that take the form of finite-state controllers. A second contribution of this thesis is to show how to generalize heuristic search to find policies that take this more general form.

Three algorithms that use heuristic search to solve POMDPs are presented. Two solve special cases of the POMDP problem. The third solves the general POMDP problem by iteratively improving a finite-state controller in the same way as policy iteration, but focuses computation where it is most likely to improve the value of the controller for a starting state.

Indexing (document details)

Advisor:Zilberstein, Shlomo
School:University of Massachusetts Amherst
School Location:United States -- Massachusetts
Keyword(s):Stochasticity, Imperfect information, Finite-memory control, Memory control, Partially observable systems
Source:DAI-B 59/10, p. 5440, Apr 1999
Source type:Dissertation
Subjects:Computer science, Operations research, Markov analysis, Studies, Algorithms
Publication Number: AAT 9909170
ISBN:9780599073326
Document URL:http://proquest.umi.com/pqdlink?did=732939921&Fmt=7&clientId =79356&RQT=309&VName=PQD
ProQuest document ID:732939921


 

 » Purchase the full text

Dissertations and theses can be purchased in a variety of formats which may include: PDF for web download, softcover, hardcover, or microform. Click the "Order a Copy" button to see the formats available for this item.

Available without purchase:

Preview  Preview

Print  |  Email  |  Order a Copy  
^Back to Top
Copyright © 2009 ProQuest LLC. All rights reserved. Terms and Conditions