Value function iterations with transition matrix

 

We continue to consider the following case with transition matrix

 

where

 

Nothing here is new: we combine error bounds with modified policy iterations.

 

That is, we implement the following algorithm of modified policy iterations

 

(1) Set up a grid with k along the columns and k' along the rows.

(2) Calculate the utility of consumption, U, on the grid above.

(3) Starting from an initial guess v, update v1=U+beta*P*v', choosing k' so as to maximize v1.

(a)   Find the corresponding k' from (3) in value function iterations.

(b)   Calculate utility from k and k' as r.

(c)    Starting from a certain vp, update vp1=U+beta*P*vp, where vp is calculated by iterating

      vi=r+beta*P*vi_1

      until vi and vi_1 are almost the same,

      and then setting the final vi to vp.

(d)   Repeat the whole thing by setting vp=vp1 until some criterion is met.

(4) Repeat (3) many times, setting v=v1 each time.

(5) Find the final policy, that is, the k' chosen for each k at the maximum value.

 

based on the value functions adjusted by the following error bounds technique:

 

 

where

 

 

done across dimensions.
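As a rough illustration of the adjustment, here is a minimal Python sketch. It assumes the standard MacQueen-Porteus form of the error bounds, which may differ in detail from the formulas used in this post. The function name error_bounds_adjust and the list-of-rows layout (capital grid by shock state) are hypothetical choices made for this sketch only.

```python
def error_bounds_adjust(v_new, v_old, beta):
    """Shift v_new by the midpoint of the error bounds.

    Assumption: MacQueen-Porteus bounds, with the min and max taken over
    every entry of v_new - v_old, i.e. across both dimensions.
    v_new and v_old are lists of rows with identical shapes.
    """
    diffs = [a - b
             for row_new, row_old in zip(v_new, v_old)
             for a, b in zip(row_new, row_old)]
    scale = beta / (1.0 - beta)
    lower = scale * min(diffs)      # lower bound on (true value - v_new)
    upper = scale * max(diffs)      # upper bound on (true value - v_new)
    shift = 0.5 * (lower + upper)   # midpoint of the two bounds
    adjusted = [[a + shift for a in row] for row in v_new]
    return adjusted, lower, upper


# With a constant reward of 1 and beta = 0.5 the true value is 2 everywhere.
# One update from v_old = 0 gives v_new = 1, and the shift closes the gap.
adjusted, lower, upper = error_bounds_adjust(
    [[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]], beta=0.5)
```

The shift is the same constant for every grid point, so it changes the level of the value function but not the ranking of the choices of k'.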

 

Consequently, there is no improvement over the result from modified policy iterations, but the value functions it is based on are different.
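Steps (1) to (5) above, together with an error-bound shift, might be sketched in Python as follows. This is only an illustration under stated assumptions, not the code behind the results in this post. Assumed here: log utility, production z*k^alpha with depreciation delta, a two-state shock, the MacQueen-Porteus form of the bounds, and a fixed number of policy-evaluation sweeps (n_inner) in place of the "almost the same" stopping rule in step (c). The name solve_mpi and all parameter values are hypothetical.

```python
import math


def solve_mpi(k_grid, z_grid, P, alpha=0.36, beta=0.95, delta=0.1,
              n_inner=20, n_outer=1000, tol=1e-8):
    """Modified policy iteration with a Markov shock and error-bound shifts.

    Returns (v, policy, gap): v[i][s] is the value at capital k_grid[i] and
    shock z_grid[s], policy[i][s] is the grid index of the chosen k', and
    gap is the last computed width of the error bounds.
    """
    nk, nz = len(k_grid), len(z_grid)
    penalty = -1e10  # stands in for infeasible (non-positive) consumption

    # Steps (1)-(2): utility U[s][i][j] of moving from k_i to k_j in shock s.
    U = [[[penalty] * nk for _ in range(nk)] for _ in range(nz)]
    for s, z in enumerate(z_grid):
        for i, k in enumerate(k_grid):
            for j, k_next in enumerate(k_grid):
                c = z * k ** alpha + (1.0 - delta) * k - k_next
                if c > 0.0:
                    U[s][i][j] = math.log(c)

    def expected(v):
        # ev[j][s] = sum_t P[s][t] * v[j][t], the expected value of k_j next period
        return [[sum(P[s][t] * v[j][t] for t in range(nz)) for s in range(nz)]
                for j in range(nk)]

    v = [[0.0] * nz for _ in range(nk)]
    policy = [[0] * nz for _ in range(nk)]
    gap = float("inf")
    for _outer in range(n_outer):
        # Step (3): one maximization step over k'.
        ev = expected(v)
        v1 = [[0.0] * nz for _ in range(nk)]
        for i in range(nk):
            for s in range(nz):
                j_best = max(range(nk),
                             key=lambda j: U[s][i][j] + beta * ev[j][s])
                policy[i][s] = j_best
                v1[i][s] = U[s][i][j_best] + beta * ev[j_best][s]

        # Error bounds: min and max of v1 - v across both dimensions.
        diffs = [v1[i][s] - v[i][s] for i in range(nk) for s in range(nz)]
        lower = beta / (1.0 - beta) * min(diffs)
        upper = beta / (1.0 - beta) * max(diffs)
        shift = 0.5 * (lower + upper)
        vp = [[x + shift for x in row] for row in v1]
        gap = upper - lower
        if gap < tol:
            v = vp
            break

        # Steps (a)-(d): evaluate the fixed policy for n_inner sweeps.
        for _inner in range(n_inner):
            ev = expected(vp)
            vp = [[U[s][i][policy[i][s]] + beta * ev[policy[i][s]][s]
                   for s in range(nz)] for i in range(nk)]
        v = vp  # step (4): carry the evaluated value into the next round
    return v, policy, gap


# Hypothetical inputs: 25 capital points and a persistent two-state shock.
k_grid = [0.5 + 0.25 * i for i in range(25)]
z_grid = [0.9, 1.1]
P = [[0.8, 0.2], [0.2, 0.8]]
v, policy, gap = solve_mpi(k_grid, z_grid, P)
k_next = [[k_grid[j] for j in row] for row in policy]  # step (5)
```

Because the shift is a constant added to every entry, it leaves the maximizing k' unchanged, which is consistent with the observation above that the policy does not improve while the value functions differ. Using the width of the bounds as the stopping criterion is a design choice of this sketch.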