On generalized policy iteration for continuous-time linear systems

Jae Young Lee, Tae Yoon Chun, Jin Bae Park, Yoon Ho Choi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

This paper investigates the mathematical properties of generalized policy iteration (GPI) applied to a class of continuous-time linear systems with unknown internal dynamics. GPI is a class of dynamic programming (DP) methods that solve an optimal control problem by using two consecutive steps: policy evaluation and policy improvement. We first provide several formulas equivalent to GPI and, as a result, reveal its relation to linear quadratic optimal control problems, as well as the fact that the computational complexity due to backup operations in the policy evaluation steps can be lessened by increasing the time horizon of GPI. A variety of local stability and convergence criteria are also provided, together with their connection to the convergence speed. Finally, several numerical simulations are performed to verify the results.
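The two steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's GPI, which targets unknown internal dynamics and a finite time horizon. It is the classical model-based policy iteration for the linear quadratic regulator (Kleinman's algorithm), which assumes the system matrices are known. The function name `kleinman_policy_iteration`, the matrices `A`, `B`, `Q`, `R`, and the initial gain `K0` are all illustrative assumptions and do not come from the paper.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov


def kleinman_policy_iteration(A, B, Q, R, K0, iterations=15):
    """Model-based policy iteration for continuous-time LQR (Kleinman's algorithm).

    Assumes K0 is stabilizing, i.e. A - B @ K0 is Hurwitz.
    """
    K = K0
    for _ in range(iterations):
        # Policy evaluation: solve the Lyapunov equation
        #   (A - B K)^T P + P (A - B K) + Q + K^T R K = 0
        A_c = A - B @ K
        P = solve_continuous_lyapunov(A_c.T, -(Q + K.T @ R @ K))
        # Policy improvement: K <- R^{-1} B^T P
        K = np.linalg.solve(R, B.T @ P)
    return P, K


# Hypothetical example system (not from the paper); A is Hurwitz, so K0 = 0 is stabilizing.
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.zeros((1, 2))

P_pi, K_pi = kleinman_policy_iteration(A, B, Q, R, K0)

# Reference solution of the algebraic Riccati equation for comparison.
P_are = solve_continuous_are(A, B, Q, R)
```

Under these assumptions, `P_pi` should agree with `P_are` to numerical precision, which shows the link between policy iteration and the linear quadratic optimal control problem mentioned in the abstract.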

Original language: English
Title of host publication: 2011 50th IEEE Conference on Decision and Control and European Control Conference, CDC-ECC 2011
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1722-1728
Number of pages: 7
ISBN (Print): 9781612848006
Publication status: Published - 2011
Event: 2011 50th IEEE Conference on Decision and Control and European Control Conference, CDC-ECC 2011 - Orlando, FL, United States
Duration: 2011 Dec 12 to 2011 Dec 15

Publication series

Name: Proceedings of the IEEE Conference on Decision and Control
ISSN (Print): 0743-1546
ISSN (Electronic): 2576-2370

Other

Other: 2011 50th IEEE Conference on Decision and Control and European Control Conference, CDC-ECC 2011
Country/Territory: United States
City: Orlando, FL
Period: 11/12/12 to 11/12/15

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Modelling and Simulation
  • Control and Optimization
