7 Lecture 7. Dynamic programming II

7.1 Policy iteration

In the previous lecture, we studied dynamic programming for discrete-time systems based on Bellman's principle of optimality. We studied both the finite horizon cost
\[
J = \phi(x_N) + \sum_{k=1}^{N-1} L_k(x_k, u_k), \qquad u_k \in U_k,
\]
and the infinite horizon cost
\[
J = \sum_{k=1}^{\infty} L(x_k, u_k), \qquad u_k \in U(x_k).
\]
The key ingredients we obtained were the Bellman equations. For finite horizon, J∗
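The finite horizon cost above is minimized by the backward recursion of dynamic programming. As a minimal sketch, the following applies that recursion to a hypothetical toy problem (not from the lecture): a scalar system x_{k+1} = x_k + u_k clipped to a small state grid, with quadratic stage and terminal costs; the grid, horizon, and cost choices are illustrative assumptions.

```python
# Hypothetical toy problem (assumed for illustration): states {0, 1, 2},
# actions {-1, 0, +1}, dynamics x_{k+1} = clip(x_k + u_k),
# stage cost L_k(x, u) = x^2 + u^2, terminal cost phi(x) = x^2.

states = [0, 1, 2]
actions = [-1, 0, 1]
N = 4  # horizon length

def step(x, u):
    """Successor state, clipped to the grid."""
    return min(max(x + u, 0), 2)

def stage_cost(x, u):
    return x**2 + u**2

def terminal_cost(x):
    return x**2

# Backward recursion: J_N(x) = phi(x),
# J_k(x) = min_u [ L_k(x, u) + J_{k+1}(f(x, u)) ].
J = {x: terminal_cost(x) for x in states}
policy = []
for k in reversed(range(N)):
    J_new, mu = {}, {}
    for x in states:
        costs = {u: stage_cost(x, u) + J[step(x, u)] for u in actions}
        u_star = min(costs, key=costs.get)
        J_new[x], mu[x] = costs[u_star], u_star
    J, policy = J_new, [mu] + policy

print(J[2])  # optimal cost-to-go from x0 = 2
```

The recursion stores the cost-to-go J_k and the minimizing control mu_k at every state, so the optimal feedback policy is recovered alongside the optimal cost.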
https://www.control.lth.se/fileadmin/control/Education/DoctorateProgram/Optimal_Control/2023/lec7.pdf - 2026-05-09
