Value function iterations with transition matrix

We continue to consider the following case with transition matrix

where

Nothing here is new: we simply combine error bounds with modified policy iterations.

That is, we implement the following modified policy iteration algorithm:

(1) Set up a grid with k along the columns and k' along the rows.

(2) Calculate the utility of consumption, U, on the grid above.

(3) Starting from an initial guess v, update v1=U+beta*P*v', choosing k' so as to maximize v1.

(a) Find the k' that attains the maximum in (3), as in value function iterations.

(b) Calculate the utility implied by k and k', and call it r.

(c) Iterate vi=r+beta*P*vi_1 until vi and vi_1 are almost the same, and set the final vi to vp.

(d) Repeat (a)-(c), setting vp=vp1 each time, until some convergence criterion is met.

(4) Repeat (3) many times, setting v=v1 after each pass.

(5) For each k, find the final k' that attains the maximum value, based on the value functions adjusted by the following error bounds technique:

where the minimum and maximum are taken across all dimensions.
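For reference, a common textbook statement of the error bounds (often attributed to McQueen and Porteus) is given below. It is an assumption about the intended technique, and the symbols \underline{b}, \bar{b}, \tilde{v}, v^* and the state index s are introduced here only for illustration; v and v1 are the iterates from step (3).

```latex
\[
\underline{b} = \frac{\beta}{1-\beta}\,\min_{s}\bigl[v_1(s) - v(s)\bigr],
\qquad
\bar{b} = \frac{\beta}{1-\beta}\,\max_{s}\bigl[v_1(s) - v(s)\bigr],
\]
\[
v_1 + \underline{b} \;\le\; v^{*} \;\le\; v_1 + \bar{b},
\qquad
\tilde{v} = v_1 + \frac{\underline{b} + \bar{b}}{2}.
\]
```

Here the minimum and maximum run over every entry of v1 - v, and \tilde{v} is the adjusted value function. Iterations can stop once \bar{b} - \underline{b} is small.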

Consequently, there is no improvement over the result from modified policy iterations alone, although the value functions on which the result is based are different.
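The steps above, together with the error-bounds adjustment, can be sketched roughly as follows. This is a minimal illustration, not the original implementation: the model (log utility, Cobb-Douglas output z*k^alpha, full depreciation), the two shock values, the transition matrix P, the grid range, and the function name solve_growth are all hypothetical choices made only so that the sketch runs on its own.

```python
import math

def solve_growth(n_k=30, beta=0.95, alpha=0.3, n_eval=20, tol=1e-8, max_iter=500):
    """Modified policy iteration with an error-bounds stopping rule (sketch)."""
    # Hypothetical model ingredients, assumed for illustration only.
    z = [0.9, 1.1]                       # shock values
    P = [[0.8, 0.2], [0.2, 0.8]]         # transition matrix
    k = [0.05 + 0.45 * i / (n_k - 1) for i in range(n_k)]
    S, K = range(len(z)), range(n_k)

    # Steps (1)-(2): utility of moving from k[i] to k[j] under shock z[s].
    def util(s, i, j):
        c = z[s] * k[i] ** alpha - k[j]
        return math.log(c) if c > 0 else -1e10   # penalize infeasible choices
    U = [[[util(s, i, j) for j in K] for i in K] for s in S]

    def expect(w):
        # EV[s][j] = sum over t of P[s][t] * w[t][j]
        return [[sum(P[s][t] * w[t][j] for t in S) for j in K] for s in S]

    # Initial guess: a constant low enough that the first update raises v.
    u_floor = min(max(U[s][i]) for s in S for i in K)
    v = [[u_floor / (1 - beta)] * n_k for _ in S]

    for it in range(max_iter):
        EV = expect(v)
        # Step (3) and (a): maximize over k' and record the maximizer.
        pol = [[max(K, key=lambda j: U[s][i][j] + beta * EV[s][j]) for i in K]
               for s in S]
        v1 = [[U[s][i][pol[s][i]] + beta * EV[s][pol[s][i]] for i in K] for s in S]

        # Error bounds: min and max of v1 - v across all dimensions.
        d = [v1[s][i] - v[s][i] for s in S for i in K]
        lo = beta / (1 - beta) * min(d)
        hi = beta / (1 - beta) * max(d)
        if hi - lo < tol:
            shift = 0.5 * (lo + hi)      # midpoint adjustment
            v_adj = [[x + shift for x in row] for row in v1]
            return k, z, v_adj, pol, it + 1

        # (b)-(c): evaluate the fixed policy, vi = r + beta*P*vi_1.
        r = [[U[s][i][pol[s][i]] for i in K] for s in S]
        vp = v1
        for _ in range(n_eval):
            EVp = expect(vp)
            new = [[r[s][i] + beta * EVp[s][pol[s][i]] for i in K] for s in S]
            gap = max(abs(new[s][i] - vp[s][i]) for s in S for i in K)
            vp = new
            if gap < tol:
                break
        v = vp                           # (d)/(4): restart from the evaluated value
    raise RuntimeError("no convergence within max_iter")
```

Calling solve_growth() returns the capital grid, the shock values, the adjusted value function, the policy as grid indices, and the number of outer iterations. The adjustment adds the same constant to every state, so the maximizing k' is unchanged by it. That is consistent with the remark above that the result does not improve while the value functions differ.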