Parsiad Azimzadeh

In this expository post (for the MATH 525 course at Michigan), I will discuss the significance of regular Markov chains. In short, regular Markov chains turn out to be very well-behaved: a regular Markov chain has a unique steady state which is also its limiting distribution.

For matrices $A = (a_{ij})$ and $B = (b_{ij})$, we use $A \leq B$ to mean that $a_{ij} \leq b_{ij}$ for all indices $(i,j)$. $\lt$, $\gt$, and $\geq$ are defined similarly.

Consider a Markov chain with transition matrix $T$. We say the Markov chain is regular if $T^m > 0$ for some positive integer $m$.

In this article, we use the convention that transition matrices $T$ are right stochastic (i.e., $\sum_j T_{ij} = 1$). We are now ready to state the main result regarding the significance of regularity.

A regular Markov chain with an $n\times n$ transition matrix $T$ has a limiting distribution $\pi=(\pi_{1},\ldots,\pi_{n})$. Moreover, this limiting distribution is the Markov chain's only steady state (i.e., it is the only solution $x$ of $xT=x$).

Since $T$ is an eventually positive matrix (i.e., $T^m > 0$), by the Perron-Frobenius theorem, it has a simple eigenvalue $r = \rho(T)$ (see, e.g., Remark 4.2 of Zaslavsky, Boris G., and Bit-Shun Tam. "On the Jordan form of an irreducible matrix with eventually nonnegative powers." Linear Algebra and its Applications 302 (1999): 303-330). Moreover, since $T$ is a transition matrix, $\rho(T) = 1$.

The remainder of the proof proceeds by power iteration, which we explain in detail for exposition's sake. Now, let $A=(a_{ij})$ be the transpose of $T$, which we decompose into its Jordan canonical form $A=VJV^{-1}$. Denoting by $v_{1},\ldots,v_{n}$ the columns of $V$, we choose $v_{1}$ to be the (positive) eigenvector of $A$ corresponding to the eigenvalue $1$. Now, let $p$ be an arbitrary probability vector. We can write $p=c_{1}v_{1}+\cdots+c_{n}v_{n}$ for some constants $c_{1},\ldots,c_{n}$. Note that $$ A^{k}p=VJ^{k}V^{-1}p=VJ^{k}V^{-1}\left(c_{1}v_{1}+\cdots+c_{n}v_{n}\right)=VJ^{k}\left(c_{1}e_{1}+\cdots+c_{n}e_{n}\right) $$ where $e_{i}$ is the $i$-th standard basis vector. Then $$ A^{k}p=c_{1}v_{1}+VJ^{k}\left(c_{2}e_{2}+\cdots+c_{n}e_{n}\right). $$ Since $$ J^{k}\rightarrow\begin{pmatrix}1\\ & 0\\ & & \ddots\\ & & & 0 \end{pmatrix} $$ as $k\rightarrow\infty$, it follows that $A^{k}p\rightarrow c_{1}v_{1}$. Since $$ e^{\intercal}(Ax)=\sum_{i}\sum_{j}a_{ij}x_{j}=\sum_{j}x_{j}\sum_{i}a_{ij}=\sum_{j}x_{j}=e^{\intercal}x $$ for any vector $x$, it follows by continuity that $$ 1=e^{\intercal}p=e^{\intercal}(Ap)=e^{\intercal}(A^{2}p)=\cdots=e^{\intercal}(A^{k}p)\rightarrow e^{\intercal}(c_{1}v_{1}). $$ Therefore, $\pi=c_{1}v_{1}^{\intercal}$ is a probability vector, and we have arrived at the desired result.