Parsiad Azimzadeh

In this expository post (for the MATH 525 course at Michigan), I will discuss the significance of regular Markov chains. In short, regular Markov chains turn out to be very well-behaved: a regular Markov chain has a unique steady state which is also its limiting distribution.

For matrices $A = (a_{ij})$ and $B = (b_{ij})$, we use $A \leq B$ to mean that $a_{ij} \leq b_{ij}$ for all indices $(i,j)$. $\lt$, $\gt$, and $\geq$ are defined similarly.

Consider a Markov chain with transition matrix $T$. We say the Markov chain is regular if $T^m > 0$ for some positive integer $m$.

In this article, we use the convention that transition matrices $T$ are right stochastic (i.e., $\sum_j T_{ij} = 1$). We are now ready to state the main result regarding the significance of regularity.

A regular Markov chain with an $n\times n$ transition matrix $T$ has a limiting distribution $\pi=(\pi_{1},\ldots,\pi_{n})$. Moreover, this limiting distribution is the Markov chain's only steady state (i.e., it is the only solution $x$ of $xT=x$).

Since $T$ is an eventually positive matrix (i.e., $T^m > 0$), by the Perron-Frobenius theorem, it has a simple eigenvalue $r = \rho(T)$ (see, e.g., Remark 4.2 of Zaslavsky, Boris G., and Bit-Shun Tam. "On the Jordan form of an irreducible matrix with eventually nonnegative powers." Linear Algebra and its Applications 302 (1999): 303-330). Moreover, since $T$ is a transition matrix, $\rho(T) = 1$.

The remainder of the proof proceeds by power iteration, which we explain in detail for exposition's sake. Now, let $A=(a_{ij})$ be the transpose of $T$, which we decompose into its Jordan canonical form $A=VJV^{-1}$. Denoting by $v_{1},\ldots,v_{n}$ the columns of $V$, we choose $v_{1}$ to be the (positive) eigenvector of $A$ corresponding to the eigenvalue $1$. Now, let $p$ be an arbitrary probability vector. We can write $p=c_{1}v_{1}+\cdots+c_{n}v_{n}$ for some constants $c_{1},\ldots,c_{n}$. Note that $$ A^{k}p=VJ^{k}V^{-1}p=VJ^{k}V^{-1}\left(c_{1}v_{1}+\cdots+c_{n}v_{n}\right)=VJ^{k}\left(c_{1}e_{1}+\cdots+c_{n}e_{n}\right) $$ where $e_{i}$ is the $i$-th standard basis vector. Then $$ A^{k}p=c_{1}v_{1}+VJ^{k}\left(c_{2}e_{2}+\cdots+c_{n}e_{n}\right). $$ Since $$ J^{k}\rightarrow\begin{pmatrix}1\\ & 0\\ & & \ddots\\ & & & 0 \end{pmatrix} $$ as $k\rightarrow\infty$, it follows that $A^{k}p\rightarrow c_{1}v_{1}$. Since $$ e^{\intercal}(Ax)=\sum_{i}\sum_{j}a_{ij}x_{j}=\sum_{j}x_{j}\sum_{i}a_{ij}=\sum_{j}x_{j}=e^{\intercal}x $$ for any vector $x$, it follows by continuity that $$ 1=e^{\intercal}p=e^{\intercal}(Ap)=e^{\intercal}(A^{2}p)=\cdots=e^{\intercal}(A^{k}p)\rightarrow e^{\intercal}(c_{1}v_{1}). $$ Therefore, $\pi=c_{1}v_{1}^{\intercal}$ is a probability vector, and we have arrived at the desired result.

Parsiad Azimzadeh

The following is the last in a series of posts on optimal stopping. In the previous post, we showed that the value function satisfied a particular partial differential equation (PDE) in the viscosity sense, thereby positing the existence of a solution to that PDE. In this post, we derive a comparison principle for the optimal stopping problem, which in turn guarantees uniqueness of solutions to the PDE. It follows that, roughly speaking, the value function and the solution to the PDE are one and the same.

We now look to transcribe the PDE of the previous post along with the relevant Cauchy data (i.e., terminal condition). Let $\rho > 0$ be arbitrary and $$ F(w,r,q,A)=-\operatorname{trace}(\sigma(w)\sigma(w)^{\top}A)-b(w)\cdot q+\rho r\text{ where }w=(t,x). $$ We can write the Cauchy problem as \begin{align} \min\left\{ -\partial_{t}v+F(\cdot,v(\cdot),Dv(\cdot),D_{x}^{2}v(\cdot)),(v-g)(\cdot)\right\} & =0 & \text{on }[0,T)\times\mathbb{R}^{d};\nonumber \\ (v-g)(T,\cdot) & =0 & \text{on }\mathbb{R}^{d}.\tag{1}\label{eq:pde_boundary} \end{align}

Note that we have introduced a discounting term $\rho$ into the PDE. We can go from the PDE of the previous post to one of the form above by picking an arbitrary (positive) discount term and performing a change of variables (in particular, the $g$ is not the same as the $g$ of the previous post; it is scaled by a factor of the form $e^{\rho t}$). The reader should be able to convince themselves that uniqueness in the setting $\rho > 0$ implies uniqueness in the setting $\rho = 0$ whenever the change of variables can be performed.

For completeness, we extend (in the obvious manner) our notion of viscosity solution from the previous post to take into account the boundary:

Let $\mathcal{O}=[0,T)\times\mathbb{R}^{d}$. A locally bounded function $v\colon\mathcal{O}\rightarrow\mathbb{R}$ is a viscosity subsolution (resp. supersolution) of \eqref{eq:pde_boundary} if \begin{align*} & \min\left\{ -\partial_{t}\varphi(w)+F(w,v^{*}(w),D_{x}\varphi(w),D_{x}^{2}\varphi(w)),(v^{*}-g)(w)\right\} & \leq0 & & \text{if }0 \leq t < T;\\ & (v^{*}-g)(w) & \leq0 & & \text{if }t=T.\\ \text{(resp. } & \min\left\{ -\partial_{t}\varphi(w)+F(w,v_{*}(w),D_{x}\varphi(w),D_{x}^{2}\varphi(w)),(v_{*}-g)(w)\right\} & \geq0 & & \text{if }0 \leq t < T;\\ & (v_{*}-g)(w) & \geq0 & & \text{if }t=T. & \text{ )} \end{align*} for all $(w=(t,x),\varphi)\in\mathcal{O}\times C^{1,2}(\mathcal{O})$ such that $(v^{*}-\varphi)(y)=\max_{\mathcal{O}}(v^{*}-\varphi)=0$ (resp. $(v_{*}-\varphi)(y)=\min_{\mathcal{O}}(v_{*}-\varphi)=0$) and the maximum (resp. minimum) is strict. We say $v$ is a viscosity solution of \eqref{eq:pde_boundary} if it is both a subsolution and supersolution of \eqref{eq:pde_boundary}.
The reader familiar with the theory will note that for ease of presentation, we have decided not to incorporate the boundary conditions in the "strong sense" (see [1 Section 7.A]), so that some sort of a priori continuity needs to be established up to the boundary to establish the results of the previous post with the added boundary condition. A simple solution is to require that $g$ be Lipschitz in the previous posts.

Uniqueness for elliptic (resp. parabolic) equations in the classical setting is often established via a maximum principle. Such a maximum principle often states that if two solutions $u$ and $v$ satisfy $u\leq v$ on the boundary $\partial\Omega$ of the bounded domain $\Omega$ on which the PDE is defined, then $u\leq v$ on the closure of the domain $\overline{\Omega}$ (i.e., everywhere). However, maximum principles are often derived by considering the case of $u$ and $v$ smooth, so that the first and second derivatives of $u-v$ satisfy the usual conditions for maxima (i.e., $D(u-v)=0$ and $D^{2}(u-v)\preceq0$).

However, in the context of viscosity solutions, no smoothness is assumed. The main tool to circumvent this apparent problem is the celebrated Crandall-Ishii Lemma [1]. We use the notation $|A|=\sup\left\{ A\xi\cdot\xi\colon\left|\xi\right|\leq1\right\} $ for $A\in\mathscr{S}(d)$, along with the parabolic semijets $\overline{\mathscr{P}}^{2,\pm}$ as defined in [1]. The semijets give an alternative characterization of viscosity solutions which we will not discuss here. We mention that we are unable to use the "parabolic" Crandall-Ishii Lemma [1 Theorem 8.3] directly due to an issue with the boundedness of the derivatives. We rely instead on the "elliptic" version [1 Theorem 3.2] and a variable-doubling argument.

We consider here the case of bounded solutions (e.g., $g$ is bounded in the first post of the series). We leave it to the reader to derive conditions for more interesting cases (e.g., solutions of sublinear growth).

Let $u$ be a bounded subsolution and $v$ be a bounded supersolution of \eqref{eq:pde_boundary}. Suppose $\rho>0$ and that $b=b(x)$ and $\sigma=\sigma(x)$ are independent of time $t$ and Lipschitz continuous in $x$. Then, $u\leq v$.

It follows from the above that a viscosity solution $u$ (i.e., sub and super) of \eqref{eq:pde_boundary} satisfies $u^{*}\leq g\leq u_{*}$ on $\{T\}\times\mathbb{R}^{d}$ and hence $u^{*}\leq u_{*}$ everywhere. Moreover, the inequality $u_{*}\leq u^{*}$ is trivial from the definition of semicontinuous envelopes. Therefore, $u_{*}=u^{*}$, so that the function $u$ is continuous. Moreover, since for any two viscosity solutions $u$ and $v$ we have $u\leq v$ and $v\leq u$, it follows that $u=v$.

Without loss of generality, we can assume that $u$ (resp. $v$) is upper (resp. lower) semicontinuous (otherwise, replace $u$ and $v$ by their semicontinuous envelopes). To arrive at a contradiction, suppose $$\delta = \sup_{\mathcal{O}} \left\{ u - v \right\} > 0.$$ Letting $\nu > 0$, we can find $(t^\nu, x^\nu) \in \mathcal{O}$ such that $(u - v)(t^\nu, x^\nu) \geq \delta - \nu$. Let $\alpha>0$, $0 < \epsilon\leq1$, and $$\varphi(t,x,s,y)=\frac{\alpha}{2}\left(\left|x-y\right|^{2}+\left|t-s\right|^{2}\right)+\frac{\epsilon}{2}\left(\left|x\right|^{2}+\left|y\right|^{2}\right).$$ Note that \begin{align*} M_{\alpha} & =\sup_{(t,x,s,y)\in([0,T]\times\mathbb{R}^{d})^2}\left\{ u(t,x)-v(s,y)-\varphi(t,x,s,y)\right\} \\ & \geq\sup_{(t,x)\in[0,T]\times\mathbb{R}^{d}}\left\{ (u-v)(t,x)-\varphi(t,x,t,x)\right\} \\ & =\sup_{(t,x)\in[0,T]\times\mathbb{R}^{d}}\left\{ (u-v)(t,x)-\epsilon|x|^{2}\right\} \\ & \geq(u-v)(t^\nu,x^\nu)-\epsilon|x^\nu|^{2}\\ & \geq\delta-\nu-\epsilon|x^\nu|^{2}. \end{align*} We henceforth assume $\epsilon$ is small enough so that $\delta - \nu - \epsilon|x^\nu|^2$ is positive. Since $u$ and $v$ are trivially of subquadratic growth, for each $\alpha>0$, there exists $(t_{\alpha},x_{\alpha},s_{\alpha},y_{\alpha})\in([0,T]\times\mathbb{R}^{d})^2$ such that $$M_{\alpha}=u(t_{\alpha},x_{\alpha})-v(s_{\alpha},y_{\alpha})-\varphi(t_{\alpha},x_{\alpha},s_{\alpha},y_{\alpha}).$$ It follows that $$\left\Vert u\right\Vert _{\infty}+\left\Vert v\right\Vert _{\infty}\geq u(t_{\alpha},x_{\alpha})-v(s_{\alpha},y_{\alpha})\geq\delta-\nu-\epsilon|x^\nu|^{2}+\varphi(t_{\alpha},x_{\alpha},s_{\alpha},y_{\alpha})\geq$$ and hence $$\alpha\left(\left|x_{\alpha}-y_{\alpha}\right|^{2} + \left|t_{\alpha}-s_{\alpha}\right|^{2}\right) + \epsilon(|x_{\alpha}|^{2}+|y_{\alpha}|^{2})$$ is bounded independently of $\alpha$ and $\epsilon$. Now, for fixed $\epsilon$, consider an increasing sequence of $\alpha$, say $(\alpha_{n})_{n}$. To each $\alpha_{n}$ is associated a maximum point $(t_{n},x_{n},s_n,y_{n})=(t_{\alpha_{n}},x_{\alpha_{n}},s_{\alpha_n},y_{\alpha_{n}})$. Since $|x_{\alpha}|^{2}+|y_{\alpha}|^{2}$ is bounded independently of $\alpha$ (for fixed $\epsilon$), it follows that $\{(t_{n},x_{n},y_{n})\}_{n}$ is contained in a compact set. Therefore, $(\alpha_{n},t_{n},x_{n},s_n,y_{n})_n$ admits a subsequence whose four last components converge to some point $(\hat{t},\hat{x},\hat{s},\hat{y})$. It follows that $\hat{x}=\hat{y}$ since otherwise $|\hat{x}-\hat{y}|>0$ and $$\limsup_{n\rightarrow\infty}\left\{ \alpha_{n}\left|x_{n}-y_{n}\right|^{2}\right\} =\limsup_{n\rightarrow\infty}\alpha_{n}\left|\hat{x}-\hat{y}\right|^{2}=\infty,$$ contradicting the boundedness in the discussion above. Similarly, $\hat{t}=\hat{s}$. Moreover, letting $\varphi_n=\varphi(t_n,x_n,s_n,y_n;\alpha_n)$, \begin{align*} 0\leq\limsup_{n\rightarrow\infty}\varphi_n & \leq\limsup_{n\rightarrow\infty}\left\{ u(t_{n},x_{n})-v(s_{n},y_{n})\right\} -\delta+\nu+\epsilon|x^\nu|^{2}\\ & \leq\limsup_{n\rightarrow\infty}u(t_{n},x_{n})-\liminf_{n\rightarrow\infty}v(s_{n},y_{n})-\delta+\nu+\epsilon|x^\nu|^{2}\\ & \leq(u-v)(\hat{t},\hat{x})-\delta+\nu+\epsilon|x^\nu|^{2} \end{align*} and hence $$0 < \delta-\nu-\epsilon|x^\nu|^{2}\leq(u-v)(\hat{t},\hat{x}).$$ Since the left-hand side of the above is positive, it follows that $(\hat{t},\hat{x})\in\mathcal{O}$ (otherwise we would contradict $u\leq v$ on the boundary). Therefore, we can, without loss of generality, assume $(t_{n},x_{n},s_n,y_{n})\in\mathcal{O}$ for all $n$.
We can now apply the Crandall-Ishii Lemma (with $u_{1}=u$ and $u_{2}=-v$) to find $A_{n},B_{n}\in\mathscr{S}(d)$ and $a_{n}\in\mathbb{R}$ such that$$ \left(a_{n},D_x\varphi(x_{n},y_{n}),A_{n}+\epsilon I_d \right)\in\mathscr{\overline{P}}_{\mathcal{U}}^{2,+}u(t_{n},x_{n})\text{ and }\left(a_{n},-D_y\varphi(x_{n},y_{n}),B_{n}-\epsilon I_d \right)\in\mathscr{\overline{P}}_{\mathcal{U}}^{2,-}v(t_{n},x_{n}) $$ and $$-3\alpha_{n}I_{2d}\preceq\left(\begin{array}{cc} A_{n}\\ & -B_{n} \end{array}\right)\preceq3\alpha_{n}\left(\begin{array}{cc} I_{d} & -I_{d}\\ -I_{d} & I_{d} \end{array}\right).$$
Since $u$ is a subsolution and $v$ is a supersolution, it follows that \begin{align*} \min\left\{ -a_{n}+F(t_{n},x_{n},u(t_{n},x_{n}),\alpha(x_{n}-y_{n})+\epsilon x_{n},A_{n}+\epsilon I_d),(u-g)(t_{n},x_{n})\right\} & \leq0;\\ \min\left\{ -a_{n}+F(s_{n},y_{n},v(s_{n},y_{n}),\alpha(x_{n}-y_{n})-\epsilon y_{n},B_{n}-\epsilon I_d),(v-g)(t_{n},y_{n})\right\} & \geq0. \end{align*} With an abuse of notation, suppose $(u-g)(t_{n},x_{n})\leq0$ along some subsequence $(t_{n},x_{n},s_n,y_{n})_{n}$. Then, since $(v-g)(s_{n},y_{n})\geq0$ by the supersolution property, we have that $$\delta-\nu-\epsilon|x^\nu|^{2}+\varphi_n \leq u(t_{n},x_{n})-v(s_{n},y_{n})-\left(g(t_n,x_{n})-g(s_n,y_{n})\right)\leq0.$$ If we take $n$ large enough and $\epsilon$ small enough, the left-hand side of the above becomes strictly positive, yielding a contradiction. Therefore, with yet another abuse of notation, we can pick a subsequence $(t_{n},x_{n},s_n,y_{n})_n$ on which $(u-g)(t_{n},x_{n})>0$ for all $n$. On this subsequence, \begin{align*} -a_{n}+F(t_{n},x_{n},u(t_{n},x_{n}),\alpha(x_{n}-y_{n})+\epsilon x_{n},A_{n}+\epsilon I_d) & \leq0;\\ -a_{n}+F(s_{n},y_{n},v(s_{n},y_{n}),\alpha(x_{n}-y_{n})-\epsilon y_{n},B_{n}-\epsilon I_d) & \geq0. \end{align*} We now claim that if $$\left(\begin{array}{cc} A\\ & -B \end{array}\right)\preceq \operatorname{const.} \alpha\left(\begin{array}{cc} I_{d} & -I_{d}\\ -I_{d} & I_{d} \end{array}\right), $$ then \begin{multline*} F(s,y,r^{\prime},\alpha(x-y)-\epsilon y,B-\epsilon I_d)-F(t,x,r,\alpha(x-y)+\epsilon x,A+\epsilon I_d)\\ \leq\rho\left(r^{\prime}-r\right)+\operatorname{const.}(\alpha\,|x-y|^{2}+\epsilon\,(1+|x|^{2}+|y|^{2})). \end{multline*} $\operatorname{const.}$ denotes some nonnegative constant. We leave this as an exercise to the reader (use the Lipschitz continuity and linear growth of $b$ and $\sigma$). Using this claim, \begin{align*} 0 & \leq F(s_{n},y_{n},v(s_{n},y_{n}),\alpha(x_{n}-y_{n})-\epsilon y_{n},B_{n}-\epsilon I_{d})\\ & \qquad-F(t_{n},x_{n},u(t_{n},x_{n}),\alpha(x_{n}-y_{n})+\epsilon x_{n},A_{n}+\epsilon I_{d})\\ & \leq\rho\,(v(s_{n},y_{n})-u(t_{n},x_{n}))+\operatorname{const.}\,(\varphi(x_{n},y_{n})+\epsilon)\\ & \leq\rho\,(-\delta+\nu+\epsilon|x^\nu|^{2})+\operatorname{const.}\,(\varphi_n+\epsilon) \end{align*} Taking the limit superior of both sides as $n\rightarrow\infty$ and moving some terms around, we get $$\delta\leq\operatorname{const.}\,(\nu+\epsilon+\epsilon|x^\nu|^{2})$$ where $\operatorname{const.}$ is not necessarily the same as it was above. Now, simply take $\epsilon$ small enough to arrive at a contradiction.


  1. Crandall, Michael G., Hitoshi Ishii, and Pierre-Louis Lions. "User’s guide to viscosity solutions of second order partial differential equations." Bulletin of the American Mathematical Society 27.1 (1992): 1-67.