Value function iterations with transition matrix

 

We continue to consider the following case with transition matrix

 

where

 

Recall that the modified policy iteration algorithm for a single value function is

 

(1) Set up a grid with k arranged columnwise and k' rowwise.

(2) Calculate the utility of consumption, U, on the grid above.

(3) Starting from an initial guess v, update v1=U+beta*v', choosing k' to maximize v1.

(a)   Find the k' that attains the maximum in (3), as in value function iterations.

(b)   Calculate the utility implied by k and that k' as r.

(c)   Starting from an initial vp, update vp1=U+beta*vp, where vp is calculated by iterating

      vi=r+beta*vi_1

      until vi and vi_1 are almost the same,

      and then setting the final vi to vp.

(d)   Repeat (a) to (c), setting vp=vp1, until a convergence criterion is met.

(4) Repeat (3), setting v=v1, many times.

(5) Read off the final k' for each k from the position of the maximum value.

 

as in 0913. We now rewrite this using the transition matrix P. The modified algorithm is

 

 

(1) Set up a grid with k arranged columnwise and k' rowwise.

(2) Calculate the utility of consumption, U, on the grid above.

(3) Starting from an initial guess v, update v1=U+beta*P*v', choosing k' to maximize v1.

(e)   Find the k' that attains the maximum in (3), as in value function iterations.

(f)   Calculate the utility implied by k and that k' as r.

(g)   Starting from an initial vp, update vp1=U+beta*P*vp, where vp is calculated by iterating

      vi=r+beta*P*vi_1

      until vi and vi_1 are almost the same,

      and then setting the final vi to vp.

(h)   Repeat (e) to (g), setting vp=vp1, until a convergence criterion is met.

(4) Repeat (3), setting v=v1, many times.

(5) Read off the final k' for each k from the position of the maximum value.

 

where steps (3) and (g) differ from the single-function version: the continuation value is now premultiplied by the transition matrix P.
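The steps above can be sketched in plain Python as follows. This is only an illustrative sketch, not the code used in these notes: the production function z*k^alpha, log utility, the parameter values, the grid, the two-state shock and the names `modified_policy_iteration`, `sweeps` and `discounted_expectation` are all assumptions made for the example. The value function is stored as an n x dim table (n grid points for k, dim shock states), and the inner loop in step (g) is capped at `sweeps` passes.

```python
import math

def modified_policy_iteration(grid, shocks, P, alpha=0.3, beta=0.95,
                              sweeps=20, tol=1e-8, max_iter=500):
    n, dim = len(grid), len(shocks)

    # Steps (1)-(2): utility of moving from k = grid[i] to k' = grid[j] in state z.
    # ASSUMPTION: c = z * k**alpha - k' with log utility; infeasible choices get -inf.
    def utility(i, j, z):
        c = shocks[z] * grid[i] ** alpha - grid[j]
        return math.log(c) if c > 0 else float("-inf")

    U = [[[utility(i, j, z) for z in range(dim)] for j in range(n)] for i in range(n)]

    def discounted_expectation(v):
        # Entry [j][z] is beta * sum_s P[z][s] * v[j][s], i.e. (beta*P*v')'.
        return [[beta * sum(P[z][s] * v[j][s] for s in range(dim))
                 for z in range(dim)] for j in range(n)]

    v = [[0.0] * dim for _ in range(n)]
    policy = [[0] * dim for _ in range(n)]
    for _ in range(max_iter):
        ev = discounted_expectation(v)
        # Steps (3) and (e): maximize over k' and keep the maximizing index.
        policy = [[max(range(n), key=lambda j: U[i][j][z] + ev[j][z])
                   for z in range(dim)] for i in range(n)]
        # Step (f): utility implied by k and the chosen k'.
        r = [[U[i][policy[i][z]][z] for z in range(dim)] for i in range(n)]
        # Step (g): iterate vi = r + beta*P*vi_1 with the policy held fixed.
        vi = v
        for _ in range(sweeps):
            ev = discounted_expectation(vi)
            vi_new = [[r[i][z] + ev[policy[i][z]][z] for z in range(dim)]
                      for i in range(n)]
            gap = max(abs(vi_new[i][z] - vi[i][z])
                      for i in range(n) for z in range(dim))
            vi = vi_new
            if gap < tol:
                break
        # Steps (h) and (4): stop once a full round no longer changes v.
        diff = max(abs(vi[i][z] - v[i][z]) for i in range(n) for z in range(dim))
        v = vi
        if diff < tol:
            break
    return v, policy

# Hypothetical example: 30 grid points and a persistent two-state shock.
grid = [0.05 + 0.45 * i / 29 for i in range(30)]
shocks = [0.9, 1.1]
P = [[0.8, 0.2], [0.2, 0.8]]
v, policy = modified_policy_iteration(grid, shocks, P)
```

Here `policy[i][z]` is the grid index of k' chosen at k = grid[i] in shock state z, which corresponds to step (5). Setting `sweeps=1` reduces the sketch to plain value function iterations with the transition matrix.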

 

In fact, vp, vp1, vi_1 and vi are all n x dim matrices. So we have to transpose each of them, premultiply the result by beta*P, and transpose it back, as in policy iterations with transition matrix.
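To make the transposition concrete, here is a small sketch in plain Python of the operation (beta*P*v')' for an n x dim table v. The helper names `transpose`, `matmul` and `discounted_expectation`, as well as the numbers, are made up for illustration and are not taken from these notes.

```python
def transpose(m):
    # Swap rows and columns of a nested-list matrix.
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    # Plain matrix product of two nested-list matrices.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def discounted_expectation(v, P, beta):
    vt = transpose(v)       # dim x n
    pvt = matmul(P, vt)     # dim x n: row z holds the expectation conditional on state z
    return [[beta * x for x in row] for row in transpose(pvt)]  # back to n x dim

# Hypothetical numbers: n = 3 grid points, dim = 2 shock states.
P = [[0.9, 0.1], [0.2, 0.8]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
ev = discounted_expectation(v, P, beta=0.5)
```

Since (beta*P*v')' equals beta*v*P', the two transposes can also be replaced by a single multiplication of v by P' from the right, which gives the same n x dim result.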