J.P. Locquet, J. Perret, et al.
SPIE Optical Science, Engineering, and Instrumentation 1998
An example of undiscounted multichain Markov Renewal Programming shows that there may exist policies to which the Policy Iteration Algorithm (PIA) converges for some, but not all, choices of the additive constants in the relative values; as a consequence, the PIA may cycle if the relative values are improperly determined. A class of rules for choosing the additive constants is given that suffices to guarantee convergence of the PIA, together with necessary and sufficient conditions for a policy to have the property that the PIA can converge to it for any relative-value vector. Finally, we give some properties of the policies that exhibit this foolproof convergence. © 1978.
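The abstract's setting can be illustrated with a minimal sketch of average-reward policy iteration. This is not the paper's multichain algorithm; it is a simplified unichain version on a hypothetical two-state, two-action MDP (all data invented for illustration), where the `anchor` parameter makes explicit the additive-constant choice in the relative values that the abstract is concerned with.

```python
import numpy as np

# Hypothetical toy MDP: P[a][s] is the next-state distribution under
# action a in state s; r[a][s] is the one-step reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.7, 0.3]]])
r = np.array([[1.0, 0.0],
              [2.0, 0.5]])

def evaluate(policy, anchor=0):
    """Solve g + h(s) = r(s) + sum_s' P(s'|s) h(s') for the gain g and
    relative values h, normalized by h(anchor) = 0.  The `anchor` choice
    fixes the additive constant in h (the quantity whose improper choice
    can, in the multichain case, make the PIA cycle)."""
    n = len(policy)
    Pp = np.array([P[policy[s], s] for s in range(n)])
    rp = np.array([r[policy[s], s] for s in range(n)])
    # Unknowns: g, h(0), ..., h(n-1); one extra row imposes h(anchor) = 0.
    A = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    A[:n, 0] = 1.0               # coefficient of the gain g
    A[:n, 1:] = np.eye(n) - Pp   # (I - P) h
    b[:n] = rp
    A[n, 1 + anchor] = 1.0       # normalization h(anchor) = 0
    sol = np.linalg.solve(A, b)
    return sol[0], sol[1:]

def policy_iteration(policy):
    while True:
        g, h = evaluate(policy)
        # Improvement step: maximize r(s,a) + sum_s' P(s'|s,a) h(s').
        q = r + P @ h
        new = q.argmax(axis=0)
        if np.array_equal(new, policy):
            return policy, g, h
        policy = new

pol, g, h = policy_iteration(np.array([0, 0]))
```

On this ergodic example any anchor works; the paper's point is that in the multichain case convergence can depend on how these constants are chosen per recurrent class.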