Optimal Stopping

The “system” evolves in time according to a (stochastic) differential equation. “Controls” modify some parameters or stop the process. The “cost criterion” depends on the choice of control and the corresponding state of the system.

The goal is to discover an optimal choice of controls, to minimize the cost criterion.

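Let ${ U }$ be the domain in which the state evolves, with boundary ${ \partial U }$, and let the state ${ X }$ solve the SDE

${ \left\{ \begin{aligned} & \mathrm{d} X = b(X) \mathrm{d} t + B(X) \mathrm{d} W, \\ & X_{0} = x. \end{aligned} \right. }$

Let ${ \tau }$ be the first time ${ X }$ hits ${ \partial U }$. For a stopping time ${ \theta }$, the expected cost is

${ J_{x}(\theta) := \mathbb{E}\left( \underbrace{ \int_{0} ^{\theta \wedge \tau} f(X_{s}) \,\mathrm{d}s }_{ \text{running cost} } + \underbrace{ g(X_{\theta \wedge \tau}) }_{ \text{stopping cost} } \right). }$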
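For stopping times of hitting-time type, the cost ${ J_{x}(\theta) }$ can be estimated by simulation. The following is only a minimal sketch under stated assumptions, not a method taken from these notes: it assumes one space dimension, ${ U = (\text{lower}, \text{upper}) }$, and an Euler-Maruyama discretization of the SDE. The names `estimate_cost`, `in_stopping_set`, and `g_example` are hypothetical.

```python
import numpy as np

def estimate_cost(x0, b, B, f, g, in_stopping_set, lower=-1.0, upper=1.0,
                  dt=2e-3, n_paths=4000, max_steps=100_000, seed=0):
    """Monte Carlo estimate of J_x(theta), theta = first hitting time of a set.

    Assumptions (not from the notes): 1D state, U = (lower, upper),
    Euler-Maruyama time stepping with step dt.
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_paths, float(x0))
    cost = np.zeros(n_paths)
    active = np.ones(n_paths, dtype=bool)
    for _ in range(max_steps):
        # A path ends when it leaves U (time tau) or enters the stopping set (time theta).
        active &= (x > lower) & (x < upper) & ~in_stopping_set(x)
        if not active.any():
            break
        xa = x[active]
        cost[active] += f(xa) * dt                      # accumulate running cost
        dw = rng.normal(0.0, np.sqrt(dt), size=xa.size)  # Brownian increments
        x[active] = xa + b(xa) * dt + B(xa) * dw         # Euler-Maruyama step
    x = np.clip(x, lower, upper)  # project any overshoot back onto the boundary
    return float(np.mean(cost + g(x)))

# Example: dX = dW on U = (-1, 1), no running cost, g(x) = x^4 - x^2.
zero = lambda x: np.zeros_like(x)
one = lambda x: np.ones_like(x)
g_example = lambda x: x**4 - x**2

# Policy 1: stop immediately (theta = 0), so the cost is g(0) = 0.
j_now = estimate_cost(0.0, zero, one, zero, g_example,
                      lambda x: np.ones_like(x, dtype=bool))
# Policy 2: stop the first time |X| >= 0.7.
j_threshold = estimate_cost(0.0, zero, one, zero, g_example,
                            lambda x: np.abs(x) >= 0.7)
print(j_now, j_threshold)
```

In this example the threshold policy gives a lower estimated cost than stopping at once, which illustrates why the choice of ${ \theta }$ matters. The estimate carries both sampling error and time-discretization error.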

Does there exist an optimal stopping time ${ \theta^{\ast} }$ such that

${ J_{x} (\theta^{\ast}) = \min_{\theta } J_{x} ( \theta ) ? }$

It is very difficult to find such a ${ \theta^{\ast} }$ directly.

“Value function”:

${ u(x) := \inf_{\theta} J_{x} ( \theta ). }$

After we get ${ u }$, we can construct ${ \theta^{\ast} }$ by dynamic programming.

Optimality Conditions

If we stop immediately (${ \theta = 0 }$), the cost is ${ g(x) }$, so

${ u(x) \leq g(x), \forall x \in U. }$

On ${ \partial U }$ we have ${ \tau = 0 }$, so

${ u(x) = g(x), \forall x \in \partial U. }$

Next take any point ${ x \in U }$ and fix a small ${ \delta \gt 0 }$. If we do not stop before time ${ \delta }$, and assume we do not hit ${ \partial U }$ in that time, then we have

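${ u(x) \leq \mathbb{E}\left( \int_{0}^{\delta} f(X_{s}) \, \mathrm{d}s + u(X_{\delta}) \right). \quad (\ast) }$

The right-hand side is the cost of running the system up to time ${ \delta }$ and then proceeding optimally from ${ X_{\delta} }$: by the definition of ${ u }$, the best achievable cost from ${ X_{\delta} }$ onward is ${ u(X_{\delta}) }$, the infimum of the expected remaining running cost plus stopping cost.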

Moreover, Itô's formula implies:

${ \mathbb{E}(u(X_{\delta})) - u(x) = \mathbb{E}\left( \int_{0}^{ \delta } L u (X_{s}) \, \mathrm{d}s \right). }$

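Here ${ L }$ is the generator of the diffusion, which depends on the ${ b }$ and ${ B }$ in the SDE: ${ Lu = \sum_{i} b^{i} u_{x_{i}} + \frac{1}{2} \sum_{i,j} a^{ij} u_{x_{i} x_{j}} }$ with ${ a = B B^{T} }$. We also write ${ M := -L }$.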

Hence

${ 0 \leq \mathbb{E}\left( \int_{0}^{\delta} \big( f(X_{s}) + Lu(X_{s}) \big) \, \mathrm{d}s \right). }$

Divide by ${ \delta }$ and then let ${ \delta \to 0 }$ (Lebesgue differentiation theorem):

${ 0 \leq f(x) + Lu(x) }$

i.e.

${ Mu \leq f \text{ in } U. }$
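The limiting step above rests on the fact that ${ (\mathbb{E}(u(X_{\delta})) - u(x)) / \delta \to Lu(x) }$ as ${ \delta \to 0 }$. As a rough numerical illustration only, the sketch below checks this for the simplest case ${ \mathrm{d}X = \mathrm{d}W }$ in one dimension, where ${ Lu = \frac{1}{2} u'' }$ and ${ X_{\delta} = x + \sqrt{\delta} Z }$ with ${ Z }$ standard normal. The choice ${ u = \sin }$, the boundary being ignored, and the helper name `dynkin_quotient` are all assumptions made for the example.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def dynkin_quotient(u, x, delta, n_nodes=40):
    """(E[u(X_delta)] - u(x)) / delta for dX = dW, X_0 = x.

    The expectation over X_delta = x + sqrt(delta) * Z, Z ~ N(0, 1),
    is computed by Gauss-Hermite quadrature (weight exp(-z^2 / 2)).
    """
    z, w = hermegauss(n_nodes)
    expectation = np.sum(w * u(x + np.sqrt(delta) * z)) / np.sqrt(2.0 * np.pi)
    return (expectation - u(x)) / delta

x = 0.3
generator_value = -0.5 * np.sin(x)  # L u = u'' / 2 with u = sin
for delta in (1e-1, 1e-2, 1e-3):
    # The quotient should approach generator_value as delta shrinks.
    print(delta, dynkin_quotient(np.sin, x, delta), generator_value)
```

Quadrature is used here instead of random sampling so that the small difference ${ \mathbb{E}(u(X_{\delta})) - u(x) }$ is not drowned out by Monte Carlo noise.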

Finally, if at some point ${ x \in U }$

${ u(x) \lt g(x), }$

then stopping immediately is not optimal, and we should let the system go for at least some very small time ${ \delta \gt 0 }$, so

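${ u ( x ) = \mathbb{E}\left( \int_{0}^{\delta} f(X_{s}) \, \mathrm{d}s + u(X_{\delta}) \right). }$

Then, repeating the argument above with equality in place of the inequality,

${ Mu = f \text{ wherever } u \lt g. }$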

In conclusion,

${ \left\{ \begin{aligned} & \max(Mu - f, u - g) = 0 &&\text{ in } U, \\ & u = g &&\text{ on }\partial U. \end{aligned}\right. }$

These are called the “optimality conditions”.
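The optimality conditions say that at each point ${ u }$ is the smaller of the stopping cost ${ g }$ and the cost of continuing. A hedged sketch of how this could be solved numerically follows. It assumes the simplest case ${ \mathrm{d}X = \mathrm{d}W }$ on ${ U = (-1, 1) }$, so that ${ Mu = -\frac{1}{2} u'' }$, and it uses a finite-difference grid with a projected fixed-point iteration. This technique is swapped in for illustration and is not taken from the notes, and `solve_obstacle_1d` is a hypothetical name.

```python
import numpy as np

def solve_obstacle_1d(f, g, lower=-1.0, upper=1.0, n=81, tol=1e-12, max_iter=200_000):
    """Solve max(Mu - f, u - g) = 0 in U, u = g on the boundary, for Mu = -u''/2.

    Discretizing -u''/2 = f with grid step h gives
        u_i = (u_{i-1} + u_{i+1}) / 2 + h^2 f_i     (cost of continuing),
    and the optimality conditions become u_i = min(g_i, cost of continuing).
    """
    x = np.linspace(lower, upper, n)
    h = x[1] - x[0]
    fx, gx = f(x), g(x)
    u = gx.copy()  # start from the obstacle; the boundary values u = g never change
    for _ in range(max_iter):
        cont = 0.5 * (u[:-2] + u[2:]) + h**2 * fx[1:-1]
        new_inner = np.minimum(gx[1:-1], cont)  # stop or continue, whichever is cheaper
        change = np.max(np.abs(new_inner - u[1:-1]))
        u[1:-1] = new_inner
        if change < tol:
            break
    return x, u

# Example: no running cost, stopping cost g(x) = x^4 - x^2.
g_example = lambda x: x**4 - x**2
x, u = solve_obstacle_1d(lambda x: np.zeros_like(x), g_example)
stopping = np.isclose(u, g_example(x))  # grid points where u = g
print(x[~stopping].min(), x[~stopping].max())  # extent of the region where u < g
```

Grid points where ${ u = g }$ approximate the set on which stopping is optimal, and points where ${ u \lt g }$ approximate the region where the system should keep running.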

Design an optimal stopping policy

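The stopping set ${ S := \{ x \in \bar{U} : u(x) = g(x) \} }$ is closed (assuming ${ u }$ and ${ g }$ are continuous). Define, for each ${ x \in \bar{U} }$,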

${ \theta^{\ast} = \text{first hitting time of }S. }$

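Let ${ C := U \setminus S = \{ x \in U : u(x) \lt g(x) \} }$ be the continuation set. We have ${ Mu = f }$ on ${ C }$ and ${ u = g }$ on ${ \partial C }$. Remember that ${ \theta^{\ast} \wedge \tau }$ is the exit time from ${ C }$. So, for ${ x \in C }$, ${ u(x) = \mathbb{E}\left( \int_{0}^{\theta^{\ast} \wedge \tau} f(X_{s}) \, \mathrm{d}s + g(X_{\theta^{\ast} \wedge \tau}) \right) = J_{x}(\theta^{\ast}) }$. The same identity holds for ${ x \in S }$, since ${ u = g }$ on ${ S }$ and ${ \theta^{\ast} \wedge \tau = 0 }$ there.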

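We need to show that ${ u(x) \leq J_{x}(\theta) }$ for every stopping time ${ \theta }$, so that ${ \theta^{\ast} }$ is optimal.
By Itô's formula, together with ${ Mu \leq f }$ and ${ u \leq g }$,

${ u(x) = \mathbb{E}\left( \int_{0}^{\theta \wedge \tau} Mu(X_{s}) \, \mathrm{d}s + u(X_{\theta \wedge \tau}) \right) \leq \mathbb{E}\left( \int_{0}^{\theta \wedge \tau} f(X_{s}) \, \mathrm{d}s + g(X_{\theta \wedge \tau}) \right) = J_{x}(\theta). }$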