Recall the basic value function iteration algorithm:
(1) Set up a grid matrix with k and k' indexed columnwise and rowwise, respectively.
(2) Calculate the utility of consumption, U, for each (k, k') pair on this grid.
(3) Starting from an initial guess v, update v1 = U + beta*v', choosing k' to maximize v1.
(4) Repeat (3), setting v = v1, until a convergence criterion is met (or for a large fixed number of iterations).
(5) Recover the policy: for each k, the optimal k' is the one that attains the maximum in (3).
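As a concrete illustration, here is a minimal sketch of steps (1)-(5) in Python/NumPy, assuming log utility and Cobb-Douglas production with full depreciation; these functional forms, parameter values, and grid bounds are illustrative choices, not taken from the text:

```python
import numpy as np

alpha, beta = 0.3, 0.95
n = 201
kgrid = np.linspace(0.05, 0.5, n)

# (1)-(2): consumption and utility on the grid; rows index k, columns index k'
c = kgrid[:, None] ** alpha - kgrid[None, :]
U = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -1e10)   # large penalty if infeasible

# (3)-(4): Bellman updates until convergence
v = np.zeros(n)
for it in range(1000):
    v1 = np.max(U + beta * v[None, :], axis=1)             # maximize over k'
    if np.max(np.abs(v1 - v)) < 1e-8:
        v = v1
        break
    v = v1

# (5): policy function k'(k) as the maximizing column for each row
policy = kgrid[np.argmax(U + beta * v[None, :], axis=1)]
```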
The modified algorithm with error bounds changes the update in (3). Writing w for the maximized value U + beta*v' from (3), a standard form of this adjustment (the McQueen-Porteus bounds) is

v1 = w + (beta/(1-beta)) * (1/2)*(c_low + c_high),

where

c_low = min(w - v) and c_high = max(w - v)

are the smallest and largest changes in the value function between iterations. That is, we modify (3) by adding this adjustment term to the next iterate. The adjustment converges to zero as the iterations proceed.
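A sketch of this modified update, reusing kgrid, U, beta, and n from the sketch above; since the original equations are not reproduced here, this follows the standard McQueen-Porteus form, and using the bound width as the stopping rule is an illustrative choice:

```python
# Value function iteration with the error-bound adjustment
v = np.zeros(n)
for it in range(1000):
    w = np.max(U + beta * v[None, :], axis=1)              # unadjusted update from (3)
    c_low, c_high = np.min(w - v), np.max(w - v)
    v1 = w + beta / (1 - beta) * 0.5 * (c_low + c_high)    # adjustment; goes to zero at convergence
    if beta / (1 - beta) * (c_high - c_low) < 1e-8:        # width of the error bound
        v = v1
        break
    v = v1
```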
The complementary policy iteration algorithm is:
(a) Find the corresponding k' from step (3) of the value function iteration.
(b) Calculate the utility implied by k and this k', and call it r.
(c) Starting from the initial guess vp = (1/(1-beta))*r, update vp1 = r + beta*vp', where vp' denotes vp evaluated at the chosen k'.
(d) Repeat steps (a)-(c), setting vp = vp1, until a convergence criterion is met.
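A sketch of steps (a)-(d), again reusing kgrid, U, beta, and n from the first sketch; the inner evaluation loop and its tolerances are illustrative choices:

```python
# Policy iteration: alternate policy improvement (a) and policy evaluation (b)-(c)
vp = np.zeros(n)
for it in range(500):
    idx = np.argmax(U + beta * vp[None, :], axis=1)        # (a) choose k' given the current value
    r = U[np.arange(n), idx]                               # (b) utility along the chosen policy
    vp_eval = r / (1 - beta)                               # (c) starting guess vp = r/(1-beta)
    for _ in range(200):
        vp1 = r + beta * vp_eval[idx]                      # vp' = vp evaluated at the chosen k'
        if np.max(np.abs(vp1 - vp_eval)) < 1e-10:
            break
        vp_eval = vp1
    if np.max(np.abs(vp1 - vp)) < 1e-8:                    # (d) outer convergence check
        vp = vp1
        break
    vp = vp1
```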
We now put the two together: we perform the policy iteration on top of the value function iteration with error bounds.
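The text does not spell out the exact combination, but one plausible reading is to apply the error-bound adjusted Bellman update and then evaluate the implied policy before the next sweep. A sketch under that reading, reusing the objects from the sketches above:

```python
# Combined sketch: error-bound Bellman update, then evaluation of the implied policy
v = np.zeros(n)
for it in range(500):
    # value function iteration step with the error-bound adjustment
    w = np.max(U + beta * v[None, :], axis=1)
    c_low, c_high = np.min(w - v), np.max(w - v)
    w = w + beta / (1 - beta) * 0.5 * (c_low + c_high)
    # policy iteration step on the policy implied by the adjusted value
    idx = np.argmax(U + beta * w[None, :], axis=1)
    r = U[np.arange(n), idx]
    vp = r / (1 - beta)
    for _ in range(50):                                    # a few evaluation sweeps
        vp = r + beta * vp[idx]
    if np.max(np.abs(vp - v)) < 1e-8:
        v = vp
        break
    v = vp
```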