Dynamic programming is basically solved by:
v1=U+beta*v
so as to maximize v1 where v1 and v are both n x 1 vector. We find k and kf and then
update v to v1. We are considering a case with a single policy
function. We have to consider n x 1 vector of values in each step.
If we approximate the curve of value function v1 and v, we can reduce the number of
estimates in each step. Thatfs the main idea of this collocation method.
Usually we approximate the value function in a polynomial manner.
Recall that the algorithm of modified policy iterations is
(1) Set a grid consisting of k and k' columnwise and rowwise respectively.
(2) Calculate utility for consumption as U using the grid matrix above.
(3) Starting from a certain v, update v1=U+beta*v' so as to maximize v1.
(a) Find the corresponding kf from (3) in value function iterations.
(b) Calculate utility from k and kf as r.
(c) Starting from a certain vp, update vp1=U+beta*vp where vp is calculate by
vi=r+beta*vi_1
up until vi and vi_1 are almost the same
and set the final vi to vp.
(d) Repeat the whole thing by setting vp=vp1 until some criterion is met.
(4) Repeat (3) by setting v=v1 for many times.
(5) Find the final corresponding value of k as k' according to the maximum value.
as in 0912 and 0913. When we calculate vp by direct inversion, it is going to regular
policy iterations.
In (c ), we approximate vp and find it in terms of coefficients of polynomials so as to
satisfy vp=U+beta*vp when vp1=vp. That is, we can rewrite it by
poly(x)=U+beta*poly(x)
where poly(x) is some polynomial function of x.
Recall that we can draw the curve of value function for x versus value function.
If the order of polynomials is porder, we can reduce the number of estimates from
n to porder+1 including the intercept in each step. It may speed up the calculation
when we consider many number of points on the grid.