A very entry-level book (e.g., no measurability issues are discussed), focusing on the applied aspects.
Appendix A
This appendix collects results from deterministic control theory. Appendix B sketches background material in probability.
Deterministic Control via PMP
Problem formulation (Bolza form): choose a control path $u$ to maximize a running cost plus a terminal cost, subject to the state dynamics
$$\dot{x}_t = f(t, x_t, u_t), \qquad x_0 \text{ given}.$$
Here $x_t$ is an $n$-dimensional state vector; $u_t$ is an $m$-dimensional control vector.
The book takes the control space to be the whole space $\mathbb{R}^m$ if the problem is unconstrained. I think this is a bit loose; normally we would be explicit and say $\mathcal{U}$ is the set of processes valued in $U$, where $U$ is the Euclidean space $\mathbb{R}^m$.
The cost functional is
$$J(u) = \int_0^T F(t, x_t, u_t)\, dt + S(T, x_T),$$
with running cost $F$ and terminal (salvage) cost $S$.
We would search over all critical points: regular points (gradient = 0), boundary points of the control domain, and singular points or other irregular points. Here "singular points" probably refers to the ODE/calculus notion of points where the usual first-order condition breaks down (e.g., the relevant derivative fails to exist or vanishes identically).
Hamiltonian:
$$H(t, x, u, \lambda) = F(t, x, u) + \lambda^\top f(t, x, u),$$
where $\lambda$ is $n$-dimensional, called the Lagrange multiplier / adjoint state / co-state. It extends the objective to incorporate the state dynamics.
PMP, regular control, interior point optimum
If the Hamiltonian $H$ is continuously differentiable in $(x, u)$, then the necessary conditions for an interior point optimum, in terms of the Hamiltonian along the optimal trajectories, are:
$$\dot{x}_t = \frac{\partial H}{\partial \lambda}, \qquad \dot{\lambda}_t = -\frac{\partial H}{\partial x}, \qquad \frac{\partial H}{\partial u} = 0.$$
Note that these do not necessarily apply to boundary or singular points of the control. The associated final conditions were listed in the book, depending on whether terminal time/state are fixed or flexible.
Long proof (including boundary conditions) via formal use of calculus of variations.
The necessary conditions become sufficient if:
- $H$ is twice continuously differentiable in $u$
- the Hessian $\partial^2 H / \partial u^2$ is negative definite (for a maximum) or positive definite (for a minimum); equivalently, $H$ is concave (for a maximum) or convex (for a minimum) in $u$ at the regular point.
There are cases where this principle cannot be used. For example, if $H$ is linear in the control, the first-order condition does not pin down $u$; this is a singular control problem and we need to use more basic principles.
The book gives an example of a regular control problem with an explicit solution: determining the optimal consumption rate. It also illustrates why the second-order condition is needed: if the agent is risk-seeking, there is no interior optimum; the agent would use a bang control and consume at the maximum rate, since they do not value consumption smoothing.
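To see the regular-control machinery end to end, here is a minimal sketch of a log-utility consumption problem solved via PMP. The specific problem, the log utility, and the parameters `r`, `rho`, `T`, `W0` below are my own illustrative assumptions, not necessarily the book's example; log utility is concave in consumption, so the second-order condition for an interior maximum holds.

```python
import numpy as np

# Hypothetical log-utility consumption problem (my own illustrative variant,
# not necessarily the book's example):
#   max  int_0^T exp(-rho*t) * ln(c_t) dt   s.t.  dW/dt = r*W - c,  W(0) = W0,  W(T) = 0.
# PMP with H = exp(-rho*t)*ln(c) + lam*(r*W - c):
#   dH/dc = 0        =>  c_t = exp(-rho*t) / lam_t      (interior optimum; ln is concave)
#   dlam/dt = -dH/dW =>  lam_t = lam_0 * exp(-r*t)
# hence c_t = exp((r - rho)*t) / lam_0, with lam_0 pinned down by W(T) = 0.
r, rho, T, W0 = 0.05, 0.03, 10.0, 1.0

lam0 = (1.0 - np.exp(-rho * T)) / (rho * W0)   # closed form from imposing W(T) = 0
c = lambda t: np.exp((r - rho) * t) / lam0     # optimal consumption path

# Sanity check: simulate wealth forward with a crude Euler scheme.
ts = np.linspace(0.0, T, 10001)
dt = ts[1] - ts[0]
W = W0
for t in ts[:-1]:
    W += (r * W - c(t)) * dt
print(f"c(0) = {c(0.0):.4f},  W(T) ~ {W:.4f}")   # W(T) should be close to 0
```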
(more) Basic Optimum Principle
Common situations: optima at interior points where the derivatives in Hamilton's equations do not exist, optima at boundaries of the control set, or linear control problems.
Optimum Principles: the necessary condition for a maximum is that the Hamiltonian is maximized over the admissible control set: $u_t^* \in \arg\max_{u \in U} H(t, x_t^*, u, \lambda_t)$.
If looking for a minimum, we minimize the Hamiltonian instead.
The book offers another expression of the maximum principle, namely that $H(t, x_t^*, u_t^*, \lambda_t) = H^*(t, x_t^*, \lambda_t)$ along the optimal path, where we define the maximized Hamiltonian:
$$H^*(t, x, \lambda) = \max_{u \in U} H(t, x, u, \lambda).$$
But I find this not very necessary, as long as we remember that the principle is to maximize the Hamiltonian, whether at an interior point or not.
The book then gives an example of linear control and shows the optimal control is bang-bang, plus an example of singular control (a bit convoluted) where the control is bang-singular-bang. Although I find the singular control example illuminating, the explanation could be clearer in places imo, and I find the analysis method hard to generalize to more complicated problems.
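For a concrete feel of bang-bang control, here is a minimal simulation sketch of the textbook minimum-time double integrator (a standard stand-in, not the book's example). PMP makes the Hamiltonian linear in $u$, so the optimal control sits on the boundary $u = \pm 1$ and switches when the switching function changes sign; the sketch uses the known feedback form of that solution.

```python
import numpy as np

# Standard minimum-time double integrator (a textbook stand-in, not the book's example):
#   minimize T   s.t.  x' = v, v' = u, |u| <= 1, steering (x, v) to (0, 0).
# The Hamiltonian is linear in u, so u* = +/-1 with switches where the switching
# function changes sign; the known feedback form of the solution is
#   s(x, v) = x + 0.5 * v * |v|,   u = -sign(s),  and u = -sign(v) on the switching curve.
def bang_bang(x, v, eps=1e-9):
    s = x + 0.5 * v * abs(v)
    if abs(s) > eps:
        return -float(np.sign(s))
    return -float(np.sign(v)) if abs(v) > eps else 0.0

x, v, t, dt = 1.0, 0.0, 0.0, 1e-4
while (abs(x) > 1e-3 or abs(v) > 1e-3) and t < 10.0:
    u = bang_bang(x, v)
    x, v, t = x + v * dt, v + u * dt, t + dt

# Starting from (1, 0), the analytic minimum time is 2*sqrt(x0) = 2.
print(f"reached (x, v) = ({x:.4f}, {v:.4f}) at t = {t:.3f}")
```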
Linear Quadratic Models
The book presents a motivating scalar example first, then moves to the multi-dimensional case. I'll summarize only the multi-dimensional case, since the conditions will be derived again using DP later.
The problem is:
$$\max_{u}\; -\frac{1}{2}\int_0^T \left( x_t^\top Q\, x_t + u_t^\top R\, u_t \right) dt \;-\; \frac{1}{2}\, x_T^\top Q_T\, x_T,$$
where the state LOM is:
$$\dot{x}_t = A x_t + B u_t, \qquad x_0 \text{ given},$$
with $Q, Q_T$ symmetric positive semidefinite and $R$ symmetric positive definite.
Setting up the Hamiltonian
$$H = -\frac{1}{2}\left( x^\top Q x + u^\top R u \right) + \lambda^\top (A x + B u),$$
the optimality conditions are:
1. $\dot{x}_t = \partial H / \partial \lambda = A x_t + B u_t$
2. $\dot{\lambda}_t = -\partial H / \partial x = Q x_t - A^\top \lambda_t$
3. $\partial H / \partial u = -R u_t + B^\top \lambda_t = 0$

From 3, we get $u_t = R^{-1} B^\top \lambda_t$. We will now solve for $(x_t, \lambda_t)$.
Denote $y_t = (x_t^\top, \lambda_t^\top)^\top$ and collect the conditions as $\dot{y}_t = M y_t$, where:
$$M = \begin{pmatrix} A & B R^{-1} B^\top \\ Q & -A^\top \end{pmatrix}.$$
Given the linear structure, assume the solution takes the form $y_t = e^{\alpha t} h$, where $h$ is a vector of dimension $2n$.
Plugging in, the equations become $M h = \alpha h$. Therefore the $\alpha$'s and the $h$'s are eigenvalues and eigenvectors of $M$.
The general solution is of the form $y_t = \sum_{i=1}^{2n} c_i e^{\alpha_i t} h_i$, and we use the initial condition $x_0$ and the terminal/boundary condition on $\lambda_T$ (here $\lambda_T = -Q_T x_T$, from the terminal cost) to pin down the constants $c_i$.
Then we really have solved for everything, and we can derive other terms of interest, like the explicit form of the linear feedback rule.
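A minimal numerical sketch of this eigen-decomposition route, with made-up matrices $A, B, Q, R, Q_T$ and the conventions above (same $M$, boundary conditions $x_0$ given and $\lambda_T = -Q_T x_T$); the point is just to show how the constants $c_i$ are pinned down by a linear system.

```python
import numpy as np

# Minimal sketch of the eigen-decomposition route for the LQ two-point boundary value
# problem, with made-up matrices and the conventions above: y = (x, lambda), ydot = M y,
# boundary conditions x(0) = x0 and lambda(T) = -Q_T x(T).
n = 2
A = np.array([[0.0, 1.0], [0.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q, R, Q_T = np.eye(n), np.array([[1.0]]), np.eye(n)
T, x0 = 5.0, np.array([1.0, 0.0])

M = np.block([[A, B @ np.linalg.solve(R, B.T)], [Q, -A.T]])
alpha, H = np.linalg.eig(M)               # columns of H are the eigenvectors h_i

# y(t) = H @ (c * exp(alpha * t)); stack the two boundary conditions into a linear system in c.
Hx, Hl = H[:n, :], H[n:, :]
top = Hx                                                # Hx c = x0
bot = (Hl + Q_T @ Hx) @ np.diag(np.exp(alpha * T))      # (Hl + Q_T Hx) e^{alpha T} c = 0
lhs = np.vstack([top, bot]).astype(complex)
rhs = np.concatenate([x0, np.zeros(n)]).astype(complex)
c = np.linalg.solve(lhs, rhs)

lam0 = (Hl @ c).real                       # initial co-state; u_0 = R^{-1} B^T lambda_0
print("lambda(0) =", lam0)
```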
There is an alternative approach, which is to start by assuming $\lambda_t = -K(t) x_t$ (so that $u_t = R^{-1} B^\top \lambda_t = -R^{-1} B^\top K(t)\, x_t$, the linear feedback rule), then differentiate this equation with respect to time and match coefficients to get a matrix Riccati equation for $K(t)$.
Note that this $K$ will reappear in the DP approach, as the value function turns out to have the form $V(t, x) = -\frac{1}{2} x^\top K(t) x$. This is to be expected, since $\lambda_t$ in PMP is the gradient of the value function with respect to the state variable: $\lambda_t = V_x(t, x_t)$.
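A numerical sketch of this Riccati route, using the same made-up matrices as in the previous sketch; the ODE in the comment is what the coefficient matching yields under the conventions above, integrated backward from the terminal condition with scipy.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Riccati route with the same made-up matrices as above; under the conventions used here,
# matching coefficients for lambda_t = -K(t) x_t gives
#   dK/dt = -K A - A^T K + K B R^{-1} B^T K - Q,   K(T) = Q_T,
# which we integrate backward in time.
n = 2
A = np.array([[0.0, 1.0], [0.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q, R, Q_T = np.eye(n), np.array([[1.0]]), np.eye(n)
T = 5.0

def riccati_rhs(t, k_flat):
    K = k_flat.reshape(n, n)
    dK = -K @ A - A.T @ K + K @ B @ np.linalg.solve(R, B.T) @ K - Q
    return dK.flatten()

sol = solve_ivp(riccati_rhs, [T, 0.0], Q_T.flatten(), rtol=1e-8, atol=1e-10)
K0 = sol.y[:, -1].reshape(n, n)
gain = np.linalg.solve(R, B.T) @ K0        # feedback rule at t = 0:  u_0 = -gain @ x_0
print("K(0) =\n", K0)
print("feedback gain R^{-1} B^T K(0) =", gain)
```

With $x_0 = (1, 0)^\top$, the quantity $-K(0) x_0$ should match the $\lambda(0)$ from the eigen-decomposition sketch up to numerical error.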
Deterministic Dynamic Programming
Now the initial time and state are arbitrary: define the value function
$$V(t, x) = \max_{u} \left\{ \int_t^T F(s, x_s, u_s)\, ds + S(T, x_T) \right\}, \qquad x_t = x.$$
Bellman's principle:
$$V(t, x) = \max_{u_{[t, t+\Delta t]}} \left\{ \int_t^{t+\Delta t} F(s, x_s, u_s)\, ds + V(t + \Delta t, x_{t+\Delta t}) \right\}.$$
Infinitesimal version of Bellman – the HJB equation:
If $V$ is differentiable in $t$ and $x$, then first define the pseudo-Hamiltonian:
$$\mathcal{H}(t, x, u) = F(t, x, u) + V_x(t, x)^\top f(t, x, u).$$
Then the HJB is written as:
$$-V_t(t, x) = \max_{u \in U} \mathcal{H}(t, x, u), \qquad V(T, x) = S(T, x).$$
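The usual heuristic step from Bellman's principle to the HJB (a sketch, assuming $V$ is smooth enough to Taylor-expand): over a short interval,
$$V(t, x) \approx \max_{u} \left\{ F(t, x, u)\,\Delta t + V(t, x) + V_t(t, x)\,\Delta t + V_x(t, x)^\top f(t, x, u)\,\Delta t \right\} + o(\Delta t);$$
cancelling $V(t, x)$, dividing by $\Delta t$, and letting $\Delta t \to 0$ gives $-V_t(t, x) = \max_u \mathcal{H}(t, x, u)$.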
The book solves the same LQ problem using the DP approach: guess the functional form of the value function and derive the matrix Riccati equation, which is consistent with the one derived via PMP.
Recall the problem was maximizing:
$$-\frac{1}{2}\int_0^T \left( x_t^\top Q x_t + u_t^\top R u_t \right) dt - \frac{1}{2}\, x_T^\top Q_T\, x_T$$
subject to
$$\dot{x}_t = A x_t + B u_t.$$
Thus the (pseudo-)Hamiltonian is:
$$\mathcal{H}(t, x, u) = -\frac{1}{2}\left( x^\top Q x + u^\top R u \right) + V_x(t, x)^\top (A x + B u).$$
Now we guess $V(t, x) = -\frac{1}{2} x^\top K(t) x$, where $K(t)$ is a symmetric matrix.
Then $V_x(t, x) = -K(t) x$, and upon plugging into the HJB and maximizing over $u$ we get:
$$u_t = -R^{-1} B^\top K(t)\, x_t.$$
Note that this is the linear feedback control derived earlier.
Now $V_t(t, x) = -\frac{1}{2} x^\top \dot{K}(t) x$, and plugging the optimal control into the Hamiltonian, we see from the HJB that we get a matrix Riccati equation for $K(t)$ with a boundary condition at $T$, identical to the ones derived via PMP.
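For completeness, carrying out this matching in the notation above (a sketch; the signs track the conventions chosen for $Q$, $R$, $K$ here) gives
$$\dot{K}(t) = -K(t) A - A^\top K(t) + K(t) B R^{-1} B^\top K(t) - Q, \qquad K(T) = Q_T,$$
the same matrix Riccati equation and boundary condition obtained by differentiating $\lambda_t = -K(t) x_t$ in the PMP approach.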
Control of PDE-Driven Dynamics
This is about distributed parameter systems. Better to learn from a more systematic source on this topic.
This concludes the review of the deterministic control part.