On one stochastic optimal control problem with variable delay
A stochastic optimal control problem with variable delays in the control is considered. The maximum principle is proved for a nonlinear stochastic control system with constraints on the right end of the trajectory.
Date: | 2007 |
---|---|
Author: | Agayeva, C. |
Format: | Article |
Language: | English |
Published: | Інститут математики НАН України (Institute of Mathematics, NAS of Ukraine), 2007 |
ISSN: | 0321-3900 |
Online access: | http://dspace.nbuv.gov.ua/handle/123456789/4471 |
Journal: | Digital Library of Periodicals of National Academy of Sciences of Ukraine |
Cite: | On one stochastic optimal control problem with variable delay / C. Agayeva // Theory of Stochastic Processes. 2007. Vol. 13 (29), No. 1-2. P. 1-12. Bibliography: 7 titles. In English. |
fulltext |
Theory of Stochastic Processes
Vol.13 (29), no.1-2, 2007, pp.1-12
CHERKEZ AGAYEVA
ON ONE STOCHASTIC OPTIMAL CONTROL
PROBLEM WITH VARIABLE DELAY
A stochastic optimal control problem with variable delays in the control
is considered. The maximum principle is proved for a nonlinear stochastic
control system with constraints on the right end of the trajectory.
1. Introduction
Stochastic differential equations with delay are widely used to describe
real systems that are, to a greater or lesser extent, subject to random
noise. Many problems in automatic regulation theory, mechanical engineering,
economics and automation are described by stochastic differential equations
with delay. Optimal control problems for systems described by such equations
are therefore of current interest [1,2]. Stochastic optimal control problems
with variable delay in the phase variable [3] and with constant delay in the
control [4] were considered earlier. The present work is devoted to a
stochastic optimal control problem with variable delay in the control and
with constraints on the right endpoint of the trajectory. Our objective is to
obtain a necessary condition of optimality when the diffusion coefficient
contains the delayed control variable.
2. Statement of the main problem
Let $(\Omega, F, P)$ be a complete probability space with filtration $\{F^t : t_0 \le t \le t_1\}$ generated by a Wiener process $w_t$, $F^t = \sigma(w_s,\ t_0 \le s \le t)$. Let $L_F^2(t_0, t_1; R^n)$ be the space of predictable processes $x_t(\omega)$ such that

$$E \int_{t_0}^{t_1} |x_t|^2\,dt < +\infty.$$
2000 Mathematics Subject Classifications. 93E20, 49K45
Key words and phrases. Stochastic differential equations with delay, Stochastic control problem, Necessary condition of optimality, Maximum principle, Ekeland's variational principle.

Consider the following stochastic system with variable delay in control:

$$dx_t = g(x_t, u_t, u_{t-h(t)}, t)\,dt + f(x_t, u_t, u_{t-h(t)}, t)\,dw_t, \quad t \in (t_0, t_1], \tag{1}$$

$$x_{t_0} = x_0, \tag{2}$$

$$u_t = Q(t), \quad t \in [t_0 - h(t_0), t_0), \tag{3}$$

$$u_t(\omega) \in U_\partial \equiv \{u(\cdot) \in L_F^2(t_0, t_1; R^m) \mid u(\omega) \in U \subset R^m \text{ a.s.}\}, \tag{4}$$

where $U$ is a nonempty bounded set, $Q(t)$ is a piecewise continuous non-random function, and $h(t) \ge 0$ is a continuously differentiable non-random function such that $\frac{dh(t)}{dt} < 1$.
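To make the structure of system (1)-(3) concrete, the following is a minimal simulation sketch, not the paper's method: an Euler-Maruyama discretization for the scalar case, assuming a deterministic control function `u` (a special case of an admissible control). The names `simulate_delayed_sde`, `g_example`, `f_example`, `h_example` and `u_example` are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_delayed_sde(g, f, u, h, x0, t0, t1, n_steps, rng):
    """Euler-Maruyama path for the scalar equation
    dx = g(x, u_t, u_{t-h(t)}, t) dt + f(x, u_t, u_{t-h(t)}, t) dw.

    `u` must be defined on [t0 - h(t0), t1]; for t < t0 it plays the
    role of the initial function Q in (3).
    """
    dt = (t1 - t0) / n_steps
    x = x0
    path = [x]
    for k in range(n_steps):
        t = t0 + k * dt
        u_now = u(t)
        u_delayed = u(t - h(t))             # nu_t = u_{t - h(t)}
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment over [t, t + dt]
        x = x + g(x, u_now, u_delayed, t) * dt + f(x, u_now, u_delayed, t) * dw
        path.append(x)
    return path


# Hypothetical coefficients, chosen only to illustrate the call.
def g_example(x, u, nu, t):
    return -x + u + 0.5 * nu


def f_example(x, u, nu, t):
    return 0.2 * nu  # the diffusion depends on the delayed control


def h_example(t):
    return 0.5 + 0.25 * math.sin(t)  # h >= 0.25 and h'(t) <= 0.25 < 1


def u_example(t):
    return 1.0 if t < 0.0 else math.cos(t)  # Q(t) = 1 before t0 = 0


path = simulate_delayed_sde(g_example, f_example, u_example, h_example,
                            x0=0.0, t0=0.0, t1=1.0, n_steps=200,
                            rng=random.Random(0))
```

Because $h'(t) < 1$, the delayed time $t - h(t)$ is increasing, so it never falls below $t_0 - h(t_0)$ and the lookup `u(t - h(t))` stays inside the interval on which the control is defined.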
It is required to minimize the following functional over the set of admissible controls:

$$J(u) = E\left\{ p(x_{t_1}) + \int_{t_0}^{t_1} l(x_t, u_t, t)\,dt \right\} \tag{5}$$

subject to the condition

$$Eq(x_{t_1}) \in G \subset R^k, \tag{6}$$

where $G$ is a closed convex set in $R^k$.
Assume that the following requirements are satisfied:

I. The functions $l, g, f$ and their derivatives are continuous in $(x, u, t)$:

$l(x, u, t) : R^n \times R^m \times [t_0, t_1] \to R^1$;

$g(x, u, \nu, t) : R^n \times R^m \times R^m \times [t_0, t_1] \to R^n$;

$f(x, u, \nu, t) : R^n \times R^m \times R^m \times [t_0, t_1] \to R^{n \times n}$.

II. The functions $l, g, f$ are twice continuously differentiable with respect to $x$, the derivatives $l_{xx}, g_{xx}, f_{xx}$ are bounded, and the following linear growth conditions hold:

$$(1 + |x|)^{-1}\big(|g(x, u, \nu, t)| + |f(x, u, \nu, t)| + |g_x(x, u, \nu, t)| + |f_x(x, u, \nu, t)|\big) \le N;$$

$$(1 + |x|)^{-1}\big(|l(x, u, t)| + |l_x(x, u, t)|\big) \le N.$$

III. The function $p(x) : R^n \to R^1$ is twice continuously differentiable and

$$|p(x)| + |p_x(x)| \le N(1 + |x|); \quad |p_{xx}(x)| \le N.$$

IV. The function $q(x) : R^n \to R^k$ is twice continuously differentiable and

$$|q(x)| + |q_x(x)| \le N(1 + |x|); \quad |q_{xx}(x)| \le N.$$
First, the stochastic optimal control problem (1)-(5) is considered.

Theorem 1. Let conditions I-III hold and let $(x_t^0, u_t^0)$ be a solution of problem (1)-(5). Then there exist random processes $(\psi_t, \beta_t) \in L_F^2(t_0, t_1; R^n) \times L_F^2(t_0, t_1; R^{n \times n})$ and $(\Phi_t, K_t) \in L_F^2(t_0, t_1; R^n) \times L_F^2(t_0, t_1; R^{n \times n})$, which are the solutions of the following adjoint equations:
$$\begin{cases}
d\psi_t = -H_x(\psi_t, x_t^0, u_t^0, \nu_t^0, t)\,dt + \beta_t\,dw_t, \quad t_0 \le t < t_1, \\
\psi_{t_1} = -p_x(x_{t_1}^0),
\end{cases} \tag{7}$$
$$\begin{cases}
d\Phi_t = -\big[ g_x^*(x_t^0, u_t^0, \nu_t^0, t)\Phi_t + \Phi_t g_x(x_t^0, u_t^0, \nu_t^0, t) + f_x^*(x_t^0, u_t^0, \nu_t^0, t)\Phi_t f_x(x_t^0, u_t^0, \nu_t^0, t) \\
\qquad\quad + f_x^*(x_t^0, u_t^0, \nu_t^0, t)K_t + K_t f_x(x_t^0, u_t^0, \nu_t^0, t) + H_{xx}(\psi_t, x_t^0, u_t^0, \nu_t^0, t) \big]\,dt + K_t\,dw_t, \quad t_0 \le t < t_1, \\
\Phi_{t_1} = -p_{xx}(x_{t_1}^0),
\end{cases} \tag{8}$$
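and for all $u \in U$ the following inequalities hold a.s.:

$$\begin{cases}
\Delta_u H(\psi_\theta, x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) + \big[ \Delta_\nu H(\psi_z, x_z^0, u_z^0, \nu_z^0, z) + 0.5\,\Delta_\nu f^*(x_z^0, u_z^0, \nu_z^0, z)\,\Phi_z\,\Delta_\nu f(x_z^0, u_z^0, \nu_z^0, z) \big]\Big|_{z=s(\theta)} s'(\theta) \\
\qquad + 0.5\,\Delta_u f^*(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta)\,\Phi_\theta\,\Delta_u f(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) \le 0, \quad \text{a.e. } \theta \in [t_0, t_1 - h(t_1)), \\[4pt]
H(\psi_\theta, x_\theta^0, u, \nu_\theta^0, \theta) - H(\psi_\theta, x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) \\
\qquad + 0.5\,\Delta_u f^*(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta)\,\Phi_\theta\,\Delta_u f(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) \le 0, \quad \text{a.e. } \theta \in [t_1 - h(t_1), t_1],
\end{cases} \tag{9}$$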
where $t = s(\tau)$ is a solution of the equation $\tau = t - h(t)$, $\nu_t = u_{t-h(t)}$,

$$\Delta_u y(x_t, u_t, \nu_t, t) = y(x_t, u, \nu_t, t) - y(x_t, u_t, \nu_t, t),$$

$$\Delta_\nu y(x_t, u_t, \nu_t, t) = y(x_t, u_t, u, t) - y(x_t, u_t, \nu_t, t),$$

$$H(\psi_t, x_t, u_t, \nu_t, t) = \psi_t^* \cdot g(x_t, u_t, \nu_t, t) + \beta_t^* \cdot f(x_t, u_t, \nu_t, t) - l(x_t, u_t, t).$$
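The function $s$ in condition (9) is the inverse of $t \mapsto t - h(t)$, which is strictly increasing because $h'(t) < 1$. A minimal numerical sketch of $s(\tau)$ and of $s'(\tau) = 1/(1 - h'(s(\tau)))$ (obtained by differentiating $\tau = s - h(s)$) is given below. Bisection is our choice for illustration, and the helper names `inverse_delay_time` and `inverse_delay_derivative` and the sample delay are hypothetical.

```python
def inverse_delay_time(tau, h, t_lo, t_hi, tol=1e-12):
    """Solve tau = t - h(t) for t in [t_lo, t_hi] by bisection.

    Relies on t - h(t) being strictly increasing, which holds when h'(t) < 1.
    """
    def phi(t):
        return t - h(t) - tau

    if phi(t_lo) > 0.0 or phi(t_hi) < 0.0:
        raise ValueError("tau is outside the range of t - h(t) on [t_lo, t_hi]")
    lo, hi = t_lo, t_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inverse_delay_derivative(tau, h, dh, t_lo, t_hi):
    """s'(tau) = 1 / (1 - h'(s(tau))), from differentiating tau = s - h(s)."""
    s = inverse_delay_time(tau, h, t_lo, t_hi)
    return 1.0 / (1.0 - dh(s))


# Hypothetical delay h(t) = 0.5 t on [0, 2]: tau = 0.5 t, so s(tau) = 2 tau.
s_value = inverse_delay_time(0.3, lambda t: 0.5 * t, 0.0, 2.0)
s_prime = inverse_delay_derivative(0.3, lambda t: 0.5 * t, lambda t: 0.5, 0.0, 2.0)
```

For a constant delay $h(t) \equiv h_0$ the same routine gives $s(\tau) = \tau + h_0$ and $s'(\tau) = 1$, in which case the factor $s'(\theta)$ in (9) disappears.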
Proof. Let $u_t = u_t^0 + \Delta u_t$ be an admissible control and let $x_t = x_t^0 + \Delta x_t$ be the trajectory of system (1)-(4) corresponding to this control. We use the following identity:

$$\begin{aligned}
d(\Delta x_t) &= [g(x_t, u_t, \nu_t, t) - g(x_t^0, u_t^0, \nu_t^0, t)]\,dt + [f(x_t, u_t, \nu_t, t) - f(x_t^0, u_t^0, \nu_t^0, t)]\,dw_t \\
&= [\Delta_u g(x_t^0, u_t^0, \nu_t, t) + \Delta_\nu g(x_t^0, u_t^0, \nu_t^0, t) + g_x(x_t^0, u_t, \nu_t, t)\Delta x_t + 0.5\,\Delta x_t^* g_{xx}(x_t^0, u_t, \nu_t, t)\Delta x_t]\,dt \\
&\quad + [\Delta_u f(x_t^0, u_t^0, \nu_t^0, t) + \Delta_\nu f(x_t^0, u_t^0, \nu_t^0, t) + f_x(x_t^0, u_t, \nu_t, t)\Delta x_t + 0.5\,\Delta x_t^* f_{xx}(x_t^0, u_t, \nu_t, t)\Delta x_t]\,dw_t + \eta_t^1, \quad t \in (t_0, t_1], \\
\Delta x_t &= 0, \quad t \in [t_0 - h(t_0), t_0],
\end{aligned} \tag{10}$$
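where

$$\begin{aligned}
\eta_t^1 &= \Big\{ \int_0^1 [g_x^*(x_t^0 + \mu\Delta x_t, u_t, \nu_t, t) - g_x^*(x_t^0, u_t^0, \nu_t^0, t)]\Delta x_t\,d\mu \\
&\qquad + 0.5 \int_0^1 \Delta x_t^* [g_{xx}^*(x_t^0 + \mu\Delta x_t, u_t, \nu_t, t) - g_{xx}^*(x_t^0, u_t^0, \nu_t^0, t)]\Delta x_t\,d\mu \Big\}\,dt \\
&\quad + \Big\{ \int_0^1 [f_x^*(x_t^0 + \mu\Delta x_t, u_t, \nu_t, t) - f_x^*(x_t^0, u_t^0, \nu_t^0, t)]\Delta x_t\,d\mu \\
&\qquad + 0.5 \int_0^1 \Delta x_t^* [f_{xx}^*(x_t^0 + \mu\Delta x_t, u_t, \nu_t, t) - f_{xx}^*(x_t^0, u_t^0, \nu_t^0, t)]\Delta x_t\,d\mu \Big\}\,dw_t.
\end{aligned}$$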
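According to Ito's formula [5] we have:

$$\begin{aligned}
d(\psi_t^* \cdot \Delta x_t) &= d\psi_t^*\,\Delta x_t + \psi_t^*\,d\Delta x_t + \Big\{ \beta_t^* [\Delta_u f(x_t^0, u_t^0, \nu_t^0, t) + \Delta_\nu f(x_t^0, u_t^0, \nu_t^0, t) \\
&\qquad + f_x(x_t^0, u_t^0, \nu_t^0, t)\Delta x_t + 0.5\,\Delta x_t^* f_{xx}(x_t^0, u_t^0, \nu_t^0, t)\Delta x_t] \\
&\qquad + \beta_t^* \int_0^1 [f_x(x_t^0 + \mu\Delta x_t, u_t^0, \nu_t^0, t) - f_x(x_t^0, u_t^0, \nu_t^0, t)]\Delta x_t\,d\mu \\
&\qquad + 0.5\,\beta_t^* \int_0^1 \Delta x_t^* [f_{xx}(x_t^0 + \mu\Delta x_t, u_t^0, \nu_t^0, t) - f_{xx}(x_t^0, u_t^0, \nu_t^0, t)]\Delta x_t\,d\mu \Big\}\,dt
\end{aligned} \tag{11}$$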
and

$$\begin{aligned}
d(\Delta x_t^* \cdot \Phi_t \cdot \Delta x_t) &= \Delta x_t^* \cdot d\Phi_t \cdot \Delta x_t + \Delta x_t^* \cdot \Phi_t\,d\Delta x_t + d\Delta x_t^* \cdot \Phi_t \cdot \Delta x_t \\
&\quad + \Big\{ K_t^* [\Delta_u f(x_t^0, u_t^0, \nu_t^0, t) + \Delta_\nu f(x_t^0, u_t^0, \nu_t^0, t) + f_x(x_t^0, u_t^0, \nu_t^0, t)\Delta x_t + 0.5\,\Delta x_t^* f_{xx}(x_t^0, u_t^0, \nu_t^0, t)\Delta x_t] \\
&\qquad + [\Delta_u f(x_t^0, u_t^0, \nu_t^0, t) + \Delta_\nu f(x_t^0, u_t^0, \nu_t^0, t) + f_x(x_t^0, u_t^0, \nu_t^0, t)\Delta x_t + 0.5\,\Delta x_t^* f_{xx}(x_t^0, u_t^0, \nu_t^0, t)\Delta x_t] \\
&\qquad \times \Phi_t \cdot [\Delta_u f(x_t^0, u_t^0, \nu_t^0, t) + \Delta_\nu f(x_t^0, u_t^0, \nu_t^0, t) + f_x(x_t^0, u_t^0, \nu_t^0, t)\Delta x_t + 0.5\,\Delta x_t^* f_{xx}(x_t^0, u_t^0, \nu_t^0, t)\Delta x_t] \Big\}\,dt.
\end{aligned} \tag{12}$$
The almost sure uniqueness of the solutions of the adjoint stochastic
equations (7), (8) follows from [6].
Taking (10)-(12) into account, the increment of the functional (5) along an admissible control takes the form:

$$\begin{aligned}
\Delta J(u^0) &= E\left\{ p(x_{t_1}) - p(x_{t_1}^0) + \int_{t_0}^{t_1} [l(x_t, u_t, t) - l(x_t^0, u_t^0, t)]\,dt \right\} \\
&= -E \int_{t_0}^{t_1} \big[ \psi_t^* \Delta_u g(x_t^0, u_t^0, \nu_t^0, t) + \psi_t^* \Delta_\nu g(x_t^0, u_t^0, \nu_t^0, t) + \beta_t^* \Delta_u f(x_t^0, u_t^0, \nu_t^0, t) \\
&\qquad + \beta_t^* \Delta_\nu f(x_t^0, u_t^0, \nu_t^0, t) - \Delta_u l(x_t^0, u_t^0, t) + 0.5\,\Delta_u f^*(x_t^0, u_t^0, \nu_t^0, t)\,\Phi_t\,\Delta_u f(x_t^0, u_t^0, \nu_t^0, t) \\
&\qquad + \Delta_u f^*(x_t^0, u_t^0, \nu_t^0, t)\,\Phi_t\,\Delta_\nu f(x_t^0, u_t^0, \nu_t^0, t) + \Delta_u f^*(x_t^0, u_t^0, \nu_t^0, t)\,\Phi_t\,\Delta_u f(x_t^0, u_t^0, \nu_t^0, t) \\
&\qquad + \Delta_\nu f^*(x_t^0, u_t^0, \nu_t^0, t)\,\Phi_t\,\Delta_\nu f(x_t^0, u_t^0, \nu_t^0, t) \big]\,dt - \eta_{t_0}^{t_1},
\end{aligned} \tag{13}$$
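where

$$\begin{aligned}
\eta_{t_0}^{t_1} &= E \int_0^1 [p_x^*(x_{t_1}^0 + \mu\Delta x_{t_1}) - p_x^*(x_{t_1}^0)]\Delta x_{t_1}\,d\mu + 0.5\,E \int_0^1 \Delta x_{t_1}^* [p_{xx}^*(x_{t_1}^0 + \mu\Delta x_{t_1}) - p_{xx}^*(x_{t_1}^0)]\Delta x_{t_1}\,d\mu \\
&\quad + E \int_{t_0}^{t_1} \Big\{ \int_0^1 [l_x^*(x_t^0 + \mu\Delta x_t, u_t, t) - l_x^*(x_t^0, u_t, t)]\Delta x_t\,d\mu \Big\}\,dt \\
&\quad + 0.5\,E \int_{t_0}^{t_1} \Big\{ \int_0^1 \Delta x_t^* [l_{xx}^*(x_t^0 + \mu\Delta x_t, u_t, t) - l_{xx}^*(x_t^0, u_t, t)]\Delta x_t\,d\mu \Big\}\,dt \\
&\quad + E \int_{t_0}^{t_1} \Big\{ \int_0^1 [\psi_t^* (g_x(x_t^0 + \mu\Delta x_t, u_t, \nu_t, t) - g_x(x_t^0, u_t^0, \nu_t^0, t))]\Delta x_t\,d\mu \Big\}\,dt \\
&\quad + 0.5\,E \int_{t_0}^{t_1} \Big\{ \int_0^1 \Delta x_t^* [\psi_t^* (g_{xx}(x_t^0 + \mu\Delta x_t, u_t, \nu_t, t) - g_{xx}(x_t^0, u_t^0, \nu_t^0, t))]\Delta x_t\,d\mu \Big\}\,dt \\
&\quad + E \int_{t_0}^{t_1} \int_0^1 \beta_t^* [f_x(x_t^0 + \mu\Delta x_t, u_t, \nu_t, t) - f_x(x_t^0, u_t^0, \nu_t^0, t)]\Delta x_t\,d\mu\,dt \\
&\quad + 0.5\,E \int_{t_0}^{t_1} \int_0^1 \Delta x_t^* \cdot \beta_t^* [f_{xx}(x_t^0 + \mu\Delta x_t, u_t, \nu_t, t) - f_{xx}(x_t^0, u_t^0, \nu_t^0, t)]\Delta x_t\,d\mu\,dt.
\end{aligned} \tag{14}$$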
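Consider the following spike variation:

$$\Delta u_t = \Delta u_{t,\varepsilon}^\theta = \begin{cases}
0, & t \notin [\theta, \theta + \varepsilon), \ \varepsilon > 0, \\
u^* - u_t^0, & t \in [\theta, \theta + \varepsilon), \ u^* \in L^2(\Omega, F^\theta, P; R^m),
\end{cases}$$

where $x_{t,\varepsilon}^\theta$ is the trajectory corresponding to the control $u_{t,\varepsilon}^\theta = u_t^0 + \Delta u_{t,\varepsilon}^\theta$.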
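The spike variation above replaces the reference control by a fixed value on a short interval and leaves it unchanged elsewhere. A minimal sketch follows, assuming a deterministic reference control and a deterministic spike value (in the text $u^*$ is an $F^\theta$-measurable random variable); the name `spike_variation` and the sample values are hypothetical.

```python
def spike_variation(u0, u_star, theta, eps):
    """Control equal to u_star on [theta, theta + eps) and to u0(t) elsewhere."""
    def u(t):
        return u_star if theta <= t < theta + eps else u0(t)
    return u


# Hypothetical reference control and spike value.
u0 = lambda t: 0.0
u_eps = spike_variation(u0, u_star=1.0, theta=0.5, eps=0.1)
values = [u_eps(t) for t in (0.49, 0.5, 0.55, 0.7)]
```

Since $t - h(t)$ is increasing, the delayed argument $\nu_t = u_{t-h(t)}$ of the perturbed control differs from $\nu_t^0$ exactly for $t \in [s(\theta), s(\theta + \varepsilon))$, an interval of length approximately $s'(\theta)\varepsilon$; this is one way to read the factor $s'(\theta)$ in condition (9).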
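Then (13) takes the form:

$$\begin{aligned}
\Delta_\theta J(u^0) &= -E \int_\theta^{\theta+\varepsilon} \big[ \psi_t^* \Delta_{u^*} g(x_t^0, u_t^0, \nu_t^0, t) + \psi_t^* \Delta_{\nu^*} g(x_t^0, u_t^0, \nu_t^0, t) \\
&\qquad + \beta_t^* \Delta_{u^*} f(x_t^0, u_t^0, \nu_t^0, t) + \beta_t^* \Delta_{\nu^*} f(x_t^0, u_t^0, \nu_t^0, t) + 0.5\,\Delta_{u^*} f^*(x_t^0, u_t^0, \nu_t^0, t)\,\Phi_t\,\Delta_{u^*} f(x_t^0, u_t^0, \nu_t^0, t) \\
&\qquad + \Delta_{u^*} f^*(x_t^0, u_t^0, \nu_t^0, t)\,\Phi_t\,\Delta_{\nu^*} f(x_t^0, u_t^0, \nu_t^0, t) + \Delta_{\nu^*} f^*(x_t^0, u_t^0, \nu_t^0, t)\,\Phi_t\,\Delta_{u^*} f(x_t^0, u_t^0, \nu_t^0, t) \\
&\qquad + \Delta_{\nu^*} f^*(x_t^0, u_t^0, \nu_t^0, t)\,\Phi_t\,\Delta_{\nu^*} f(x_t^0, u_t^0, \nu_t^0, t) - \Delta_{u^*} l(x_t^0, u_t^0, t) \big]\,dt + \eta_\theta^{\theta+\varepsilon}.
\end{aligned}$$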
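Lemma 1. Let conditions I-III hold. Then

$$E\big|x_{t,\varepsilon}^\theta - x_t^0\big|^2 \le N\varepsilon \quad \text{as } \varepsilon \to 0,$$

where $x_{t,\varepsilon}^\theta$ is the trajectory of system (1)-(4) corresponding to the control $u_{t,\varepsilon}^\theta = u_t^0 + \Delta u_{t,\varepsilon}^\theta$.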
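Proof. Denote

$$\tilde{x}_{t,\varepsilon} = x_{t,\varepsilon}^\theta - x_t^0.$$

It is clear that $\tilde{x}_{t,\varepsilon} = 0$ for all $t \in [t_0, \theta)$. Then for $t \in [\theta, \theta + \varepsilon)$

$$\begin{aligned}
d\tilde{x}_{t,\varepsilon} &= \big[ g(x_t^0 + \varepsilon\tilde{x}_{t,\varepsilon}, u^*, \nu_t^0, t) - g(x_t^0, u_t^0, \nu_t^0, t) \big]\,dt \\
&\quad + \big[ f(x_t^0 + \varepsilon\tilde{x}_{t,\varepsilon}, u^*, \nu_t^0, t) - f(x_t^0, u_t^0, \nu_t^0, t) \big]\,dw_t, \quad t \in (\theta, \theta + \varepsilon), \\
\tilde{x}_{\theta,\varepsilon} &= -\big(g(x_\theta^0, u^*, \nu_\theta^0, \theta) - g(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta)\big),
\end{aligned}$$

or

$$\begin{aligned}
\tilde{x}_{\theta+\varepsilon,\varepsilon} &= \int_\theta^{\theta+\varepsilon} \big[ g(x_s^0 + \varepsilon\tilde{x}_{s,\varepsilon}, u^*, \nu_s^0, s) - g(x_s^0, u_s^0, \nu_s^0, s) \big]\,ds + \int_\theta^{\theta+\varepsilon} \big[ g(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) - g(x_s^0, u_s^0, \nu_s^0, s) \big]\,ds \\
&\quad + \int_\theta^{\theta+\varepsilon} \big[ f(x_s^0 + \varepsilon\tilde{x}_{s,\varepsilon}, u_s^0, \nu_s^0, s) - f(x_s^0, u_s^0, \nu_s^0, s) \big]\,dw_s + \int_\theta^{\theta+\varepsilon} \big[ g(x_s^0, u^*, \nu_s^0, s) - g(x_\theta^0, u^*, \nu_\theta^0, \theta) \big]\,ds.
\end{aligned}$$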
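Therefore, from conditions I-II and Gronwall's inequality we have

$$\begin{aligned}
E|\tilde{x}_{\theta+\varepsilon,\varepsilon}|^2 \le N \Big[ & \varepsilon^2 E \sup_{\theta \le t \le \theta+\varepsilon} \big|x_{t,\varepsilon}^\theta - x_t^0\big|^2 + \varepsilon^2 E \sup_{\theta \le t \le \theta+\varepsilon} \big|x_t^0 - x_\theta^0\big|^2 \\
& + \sup_{\theta \le t \le \theta+\varepsilon} \varepsilon^2 E \big|g(x_t^0, u^*, \nu_t^0, t) - g(x_\theta^0, u^*, \nu_\theta^0, \theta)\big|^2 \\
& + \varepsilon E \int_\theta^{\theta+\varepsilon} \big|f(x_t^0, u_t^0, \nu_t^0, t) - f(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta)\big|^2\,dt \\
& + \varepsilon^2 E \int_\theta^{\theta+\varepsilon} \big|g(x_t^0, u_t^0, \nu_t^0, t) - g(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta)\big|^2\,dt \Big].
\end{aligned}$$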
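Hence

$$E|\tilde{x}_{t+\varepsilon,\varepsilon}|^2 \le \varepsilon N, \quad \varepsilon \to 0, \quad \forall t \in [\theta, \theta + \varepsilon).$$

For $t \in [\theta + \varepsilon, t_1]$:

$$d\tilde{x}_{t,\varepsilon} = \big[ g(x_t^0 + \varepsilon\tilde{x}_{t,\varepsilon}, u_t^0, u^*, t) - g(x_t^0, u_t^0, \nu_t^0, t) \big]\,dt + \big[ f(x_t^0 + \varepsilon\tilde{x}_{t,\varepsilon}, u_t^0, u^*, t) - f(x_t^0, u_t^0, \nu_t^0, t) \big]\,dw_t.$$

Consequently we have

$$d\tilde{x}_{t,\varepsilon} = \int_0^1 g_x(x_t^0 + \mu\varepsilon\tilde{x}_{t,\varepsilon}, u_t^0, \nu_t^0, t)\,\tilde{x}_{t,\varepsilon}\,d\mu\,dt + \int_0^1 f_x(x_t^0 + \mu\varepsilon\tilde{x}_{t,\varepsilon}, u_t^0, \nu_t^0, t)\,\tilde{x}_{t,\varepsilon}\,d\mu\,dw_t,$$

$$\tilde{x}_{\theta+\varepsilon,\varepsilon} = -\big(g(x_{\theta+\varepsilon}^0, u_{\theta+\varepsilon}^0, u^*, \theta) - g(x_{\theta+\varepsilon}^0, u_{\theta+\varepsilon}^0, \nu_{\theta+\varepsilon}^0, \theta)\big).$$

Hence

$$E|\tilde{x}_{t,\varepsilon}|^2 \le \varepsilon N \quad \text{for all } t \in [\theta + \varepsilon, t_1] \text{ as } \varepsilon \to 0.$$

Thus

$$\sup_{t_0 \le t \le t_1} E|\tilde{x}_{t,\varepsilon}|^2 \le \varepsilon N.$$

Lemma 1 is proved.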
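From Lemma 1 and expression (14) for $\eta_{t_0}^{t_1}$ we obtain $\eta_\theta^{\theta+\varepsilon} = o(\varepsilon)$. Since $\varepsilon$ can be taken arbitrarily small, we obtain:

$$\begin{aligned}
E \Big[ & \Delta_{u^*} H(\psi_\theta, x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) + \Delta_{\nu^*} H(\psi_z, x_z^0, u_z^0, \nu_z^0, z)\Big|_{z=s(\theta)} s'(\theta) - \Delta_{u^*} l(x_\theta^0, u_\theta^0, \theta) \\
& + 0.5\,\Delta_{u^*} f^*(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta)\,\Phi_\theta\,\Delta_{u^*} f(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) \\
& + 0.5\,\Delta_{\nu^*} f^*(x_z^0, u_z^0, \nu_z^0, z)\,\Phi_z\,\Delta_{\nu^*} f(x_z^0, u_z^0, \nu_z^0, z)\Big|_{z=s(\theta)} s'(\theta) \Big] \le 0, \quad \text{a.e. } \theta \in [t_0, t_1 - h(t_1)),
\end{aligned}$$

and

$$\begin{aligned}
E \big[ & \Delta_{u^*} H(\psi_\theta, x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) - \Delta_{u^*} l(x_\theta^0, u_\theta^0, \theta) \\
& + 0.5\,\Delta_{u^*} f^*(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta)\,\Phi_\theta\,\Delta_{u^*} f(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) \big] \le 0, \quad \text{a.e. } \theta \in [t_1 - h(t_1), t_1),
\end{aligned}$$

in other words, (9) is fulfilled. Theorem 1 is proved.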
3. Problem with constraint
Using the result obtained above and Ekeland's variational principle [7],
we now prove the following theorem for the stochastic optimal control problem
with the endpoint constraint (6).
Theorem 2. Let conditions I-IV hold and let $(x_t^0, u_t^0)$ be a solution of problem (1)-(6). Then there exist a nonzero $(\lambda_0, \lambda_1) \in R^{k+1}$ such that $\lambda_0 \ge 0$, $\lambda_1$ is a normal to the set $G$ at the point $Eq(x_{t_1}^0)$, $\lambda_0^2 + |\lambda_1|^2 = 1$, and random processes $(\psi_t, \beta_t) \in L_F^2(t_0, t_1; R^n) \times L_F^2(t_0, t_1; R^{n \times n})$, $(\Phi_t, K_t) \in L_F^2(t_0, t_1; R^n) \times L_F^2(t_0, t_1; R^{n \times n})$ which are solutions of the following adjoint system:

$$\begin{cases}
d\psi_t = -H_x(\psi_t, x_t^0, u_t^0, \nu_t^0, t)\,dt + \beta_t\,dw_t, \quad t_0 \le t < t_1, \\
\psi_{t_1} = -\lambda_0 p_x(x_{t_1}^0) - \lambda_1 q_x(x_{t_1}^0),
\end{cases} \tag{15}$$
$$\begin{cases}
d\Phi_t = -\big[ g_x^*(x_t^0, u_t^0, \nu_t^0, t)\Phi_t + \Phi_t g_x(x_t^0, u_t^0, \nu_t^0, t) + f_x^*(x_t^0, u_t^0, \nu_t^0, t)\Phi_t f_x(x_t^0, u_t^0, \nu_t^0, t) \\
\qquad\quad + f_x^*(x_t^0, u_t^0, \nu_t^0, t)K_t + K_t f_x(x_t^0, u_t^0, \nu_t^0, t) + H_{xx}(\psi_t, x_t^0, u_t^0, \nu_t^0, t) \big]\,dt + K_t\,dw_t, \quad t_0 \le t < t_1, \\
\Phi_{t_1} = -\lambda_0 p_{xx}(x_{t_1}^0) - \lambda_1 q_{xx}(x_{t_1}^0),
\end{cases} \tag{16}$$
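such that for all $u \in U$ the following inequalities hold a.s.:

$$\begin{cases}
\Delta_u H(\psi_\theta, x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) + \big[ \Delta_\nu H(\psi_z, x_z^0, u_z^0, \nu_z^0, z) + 0.5\,\Delta_\nu f^*(x_z^0, u_z^0, \nu_z^0, z)\,\Phi_z\,\Delta_\nu f(x_z^0, u_z^0, \nu_z^0, z) \big]\Big|_{z=s(\theta)} s'(\theta) \\
\qquad + 0.5\,\Delta_u f^*(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta)\,\Phi_\theta\,\Delta_u f(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) \le 0, \quad \text{a.e. } \theta \in [t_0, t_1 - h(t_1)), \\[4pt]
H(\psi_\theta, x_\theta^0, u, \nu_\theta^0, \theta) - H(\psi_\theta, x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) \\
\qquad + 0.5\,\Delta_u f^*(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta)\,\Phi_\theta\,\Delta_u f(x_\theta^0, u_\theta^0, \nu_\theta^0, \theta) \le 0, \quad \text{a.e. } \theta \in [t_1 - h(t_1), t_1].
\end{cases} \tag{17}$$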
Proof. For any natural number $j$ we introduce the following approximating functional:

$$J_j(u) = S_j\Big( Ep(x_{t_1}) + E \int_{t_0}^{t_1} l(x_t, u_t, t)\,dt,\ Eq(x_{t_1}) \Big) = \min_{(c,y) \in \mathcal{E}} \sqrt{\Big| c - 1/j - Ep(x_{t_1}) - E \int_{t_0}^{t_1} l(x_t, u_t, t)\,dt \Big|^2 + \|y - Eq(x_{t_1})\|^2},$$

where $\mathcal{E} = \{(c, y) : c \le J^0,\ y \in G\}$ and $J^0$ is the minimal value of the functional in problem (1)-(6).

Let $V \equiv (U_\partial, d)$ be the space of controls endowed with the metric

$$d(u, \nu) = (l \otimes P)\{(t, \omega) \in [t_0, t_1] \times \Omega : \nu_t \ne u_t\},$$

where $l$ here denotes Lebesgue measure on $[t_0, t_1]$. $V$ is a complete metric space. We now prove some auxiliary results (Lemmas 2, 3, 4).
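The metric $d$ measures the size of the set of pairs $(t, \omega)$ on which two controls differ, not the size of the difference itself. The sketch below estimates it from sampled control paths on a uniform time grid; it is an illustrative Monte Carlo approximation under that assumption, and the name `ekeland_distance` and the data layout (one list of grid values per sample path) are hypothetical.

```python
def ekeland_distance(u_paths, v_paths, t0, t1):
    """Estimate (Lebesgue x P){(t, omega): u_t != v_t} from sampled paths.

    u_paths and v_paths are lists of equally long lists: one inner list per
    sample omega, holding the control values on a uniform grid over [t0, t1].
    """
    n_paths = len(u_paths)
    n_times = len(u_paths[0])
    differing = sum(
        1
        for u_path, v_path in zip(u_paths, v_paths)
        for a, b in zip(u_path, v_path)
        if a != b
    )
    # fraction of differing (omega, t) pairs, scaled by the interval length
    return (t1 - t0) * differing / (n_paths * n_times)


# Two sample paths on a 4-point grid; the controls differ at one grid point.
u_samples = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
v_samples = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
dist = ekeland_distance(u_samples, v_samples, t0=0.0, t1=1.0)
```

Under this metric a spike variation on an interval of length $\varepsilon$ lies within distance $\varepsilon$ of the reference control, however large the spike value is.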
Lemma 2. Assume that conditions I-IV hold, let $u_t^n$ be a sequence of admissible controls from $V$, and let $x_t^n$ be the sequence of corresponding trajectories of system (1)-(3). If $d(u_t^n, u_t) \to 0$ as $n \to \infty$, then

$$\lim_{n \to \infty} \Big\{ \sup_{t_0 \le t \le t_1} E|x_t^n - x_t|^2 \Big\} = 0,$$

where $x_t$ is the trajectory corresponding to the admissible control $u_t$.
Proof. Let $u_t^n$ be a sequence of admissible controls from $V$ and let $x_t^n$ be the sequence of corresponding trajectories. Then for any $t \in (t_0, t_1]$ we have:

$$|x_t^n - x_t| = \Big| \int_{t_0}^t [g(x_s^n, u_s^n, \nu_s^n, s) - g(x_s, u_s, \nu_s, s)]\,ds + \int_{t_0}^t [f(x_s^n, u_s^n, \nu_s^n, s) - f(x_s, u_s, \nu_s, s)]\,dw_s \Big|.$$

Squaring both sides and taking expectations, by assumption II we obtain:

$$\begin{aligned}
E|x_t^n - x_t|^2 &\le NE \int_{t_0}^t |\Delta_{u^n} g(x_s, u_s, \nu_s, s)|^2\,ds + NE \int_{t_0}^t |\Delta_{\nu^n} g(x_s, u_s, \nu_s, s)|^2\,ds \\
&\quad + NE \int_{t_0}^t |\Delta_{u^n} f(x_s, u_s, \nu_s, s)|^2\,ds + NE \int_{t_0}^t |\Delta_{\nu^n} f(x_s, u_s, \nu_s, s)|^2\,ds + NE \int_{t_0}^t |x_s^n - x_s|^2\,ds.
\end{aligned}$$

Hence, from conditions I, II and Gronwall's inequality we have $E|x_t^n - x_t|^2 \le C \exp(C(t - t_0))$, where

$$\begin{aligned}
C &= NE \int_{t_0}^t |\Delta_{u^n} g(x_s, u_s, \nu_s, s)|^2\,ds + NE \int_{t_0}^t |\Delta_{\nu^n} g(x_s, u_s, \nu_s, s)|^2\,ds \\
&\quad + NE \int_{t_0}^t |\Delta_{u^n} f(x_s, u_s, \nu_s, s)|^2\,ds + NE \int_{t_0}^t |\Delta_{\nu^n} f(x_s, u_s, \nu_s, s)|^2\,ds.
\end{aligned}$$

Since $d(u_t^n, u_t) \to 0$, the increments under the integrals vanish outside a set whose measure tends to zero, so $C \to 0$ as $n \to \infty$. Lemma 2 is proved.
Since the functional $J_j : V \to R^1$ is continuous, by Ekeland's variational principle there exists a control $u_t^j$ such that $d(u_t^j, u_t^0) \le \sqrt{\varepsilon_j}$ and for all $u \in V$

$$J_j(u^j) \le J_j(u) + \sqrt{\varepsilon_j}\,d(u^j, u), \quad \varepsilon_j = \frac{1}{j}.$$

This inequality means that $(x_t^j, u_t^j)$ is a solution of the following problem:

$$\begin{cases}
I_j(u) = J_j(u) + \sqrt{\varepsilon_j}\,E \int_{t_0}^{t_1} \delta(u_t, u_t^j)\,dt \to \min, \\
dx_t = g(x_t, u_t, \nu_t, t)\,dt + f(x_t, u_t, \nu_t, t)\,dw_t, \quad t \in (t_0, t_1], \\
u_t = Q(t), \quad t \in [t_0 - h(t_0), t_0), \\
u_t \in U_\partial.
\end{cases} \tag{18}$$
The function $\delta(u, \nu)$ is defined as follows:

$$\delta(u, \nu) = \begin{cases} 0, & u = \nu, \\ 1, & u \ne \nu. \end{cases}$$
Let $(x_t^j, u_t^j)$ be a solution of problem (18). Then, according to Theorem 1, there exist random processes $(\psi_t^j, \beta_t^j) \in L_F^2(t_0, t_1; R^n) \times L_F^2(t_0, t_1; R^{n \times n})$, $(\Phi_t^j, K_t^j) \in L_F^2(t_0, t_1; R^n) \times L_F^2(t_0, t_1; R^{n \times n})$ which are solutions of the following systems:

$$\begin{cases}
d\psi_t^j = -H_x(\psi_t^j, x_t^j, u_t^j, \nu_t^j, t)\,dt + \beta_t^j\,dw_t, \quad t_0 \le t < t_1, \\
\psi_{t_1}^j = -\lambda_0^j p_x(x_{t_1}^j) - \lambda_1^j q_x(x_{t_1}^j),
\end{cases} \tag{19}$$
$$\begin{cases}
d\Phi_t^j = -\big[ g_x^*(x_t^j, u_t^j, \nu_t^j, t)\Phi_t^j + \Phi_t^j g_x(x_t^j, u_t^j, \nu_t^j, t) + f_x^*(x_t^j, u_t^j, \nu_t^j, t)\Phi_t^j f_x(x_t^j, u_t^j, \nu_t^j, t) \\
\qquad\quad + f_x^*(x_t^j, u_t^j, \nu_t^j, t)K_t^j + K_t^j f_x(x_t^j, u_t^j, \nu_t^j, t) + H_{xx}(\psi_t^j, x_t^j, u_t^j, \nu_t^j, t) \big]\,dt + K_t^j\,dw_t, \quad t_0 \le t < t_1, \\
\Phi_{t_1}^j = -\lambda_0^j p_{xx}(x_{t_1}^j) - \lambda_1^j q_{xx}(x_{t_1}^j),
\end{cases} \tag{20}$$
and a nonzero $(\lambda_0^j, \lambda_1^j) \in R^{k+1}$ satisfying

$$(\lambda_0^j, \lambda_1^j) = \Big( -c_j + 1/j + Ep(x_{t_1}^j) + E \int_{t_0}^{t_1} l(x_t^j, u_t^j, t)\,dt,\ -y_j + Eq(x_{t_1}^j) \Big) \Big/ J_j^0, \tag{21}$$
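such that for all $u \in U$ the following inequalities hold a.s.:

$$\begin{cases}
\Delta_u H(\psi_\theta^j, x_\theta^j, u_\theta^j, \nu_\theta^j, \theta) + \big[ \Delta_\nu H(\psi_z^j, x_z^j, u_z^j, \nu_z^j, z) + 0.5\,\Delta_\nu f^*(x_z^j, u_z^j, \nu_z^j, z)\,\Phi_z^j\,\Delta_\nu f(x_z^j, u_z^j, \nu_z^j, z) \big]\Big|_{z=s(\theta)} s'(\theta) \\
\qquad + 0.5\,\Delta_u f^*(x_\theta^j, u_\theta^j, \nu_\theta^j, \theta)\,\Phi_\theta^j\,\Delta_u f(x_\theta^j, u_\theta^j, \nu_\theta^j, \theta) \le 0, \quad \text{a.e. } \theta \in [t_0, t_1 - h(t_1)), \\[4pt]
H(\psi_\theta^j, x_\theta^j, u, \nu_\theta^j, \theta) - H(\psi_\theta^j, x_\theta^j, u_\theta^j, \nu_\theta^j, \theta) \\
\qquad + 0.5\,\Delta_u f^*(x_\theta^j, u_\theta^j, \nu_\theta^j, \theta)\,\Phi_\theta^j\,\Delta_u f(x_\theta^j, u_\theta^j, \nu_\theta^j, \theta) \le 0, \quad \text{a.e. } \theta \in [t_1 - h(t_1), t_1].
\end{cases} \tag{22}$$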
Here

$$J_j^0 = \sqrt{\Big| c_j - 1/j - Ep(x_{t_1}^j) - E \int_{t_0}^{t_1} l(x_t^j, u_t^j, t)\,dt \Big|^2 + \big| y_j - Eq(x_{t_1}^j) \big|^2}.$$
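Since $\|(\lambda_0^j, \lambda_1^j)\| = 1$, we may assume (passing to a subsequence if necessary) that $(\lambda_0^j, \lambda_1^j) \to (\lambda_0, \lambda_1)$.

It is known that $S_j$ is a convex function which is Gateaux differentiable at the point $\big(Ep(x_{t_1}^j) + E \int_{t_0}^{t_1} l(x_t^j, u_t^j, t)\,dt,\ Eq(x_{t_1}^j)\big)$. Then for all $(c, y) \in \mathcal{E}$:

$$\Big( \lambda_0^j,\ c - \frac{1}{j} - Ep(x_{t_1}^j) - E \int_{t_0}^{t_1} l(x_t^j, u_t^j, t)\,dt \Big) + \big( \lambda_1^j,\ y - Eq(x_{t_1}^j) \big) \le \frac{1}{j}.$$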
Passing to the limit in the last inequality, we obtain that $\lambda_0 \ge 0$ and that $\lambda_1$ is a normal to the set $G$ at the point $Eq(x_{t_1}^0)$.

Since $\psi_{t_1}^j = -\lambda_0^j p_x(x_{t_1}^j) - \lambda_1^j q_x(x_{t_1}^j)$, we have $\psi_{t_1}^j \to \psi_{t_1}$ in $L_F^2(t_0, t_1; R^n)$. Similarly, $\Phi_{t_1}^j = -\lambda_0^j p_{xx}(x_{t_1}^j) - \lambda_1^j q_{xx}(x_{t_1}^j)$ implies $\Phi_{t_1}^j \to \Phi_{t_1}$ in $L_F^2(t_0, t_1; R^n)$ as $j \to \infty$.
Lemma 3. Let $\psi_t^j$ be a solution of system (19) and let $\psi_t$ be a solution of system (15). Then

$$E \int_{t_0}^{t_1} |\psi_t^j - \psi_t|^2\,dt + E \int_{t_0}^{t_1} |\beta_t^j - \beta_t|^2\,dt \to 0 \quad \text{if } d(u_t^j, u_t) \to 0,\ j \to \infty.$$

Lemma 4. Let $\Phi_t^j$ be a solution of system (20) and let $\Phi_t$ be a solution of system (16). Then

$$E \int_{t_0}^{t_1} |\Phi_t^j - \Phi_t|^2\,dt + E \int_{t_0}^{t_1} |K_t^j - K_t|^2\,dt \to 0 \quad \text{as } j \to \infty.$$

By Lemma 3 and Lemma 4, taking the limit in (19), (20) as $j \to \infty$ we obtain equalities (15) and (16). Consequently, taking the limit in (22) as $j \to \infty$ we obtain inequality (17). Theorem 2 is proved.
Bibliography
1. Kolmanovskii V.B., Myshkis A.D., Applied Theory of Functional Differential Equations. N.Y., Kluwer Academic Publishers, (1992).
2. Chernousko F.L., Kolmanovskii V.B., Optimal control on random disturbances, M.: Nauka, (1978), 352.
3. Agayeva Ch.A., Allahverdiyeva J.J., Maximum principle for stochastic systems with variable delay, Reports of NSA of Azerbaijan, LIX 5-6, (2003), 61-65.
4. Agayeva Ch.A., The necessary condition of optimality for one stochastic control problem with a constant delay on control, Transactions of NAS of Azerbaijan, Information science and control problems, Baku, 2, (2005), 117-123.
5. Gikhman I.I., Skorokhod A.V., Stochastic Differential Equations. Kiyev: Naukova Dumka, (1968), 352.
6. Bismut J.M., Linear quadratic optimal stochastic control with random coefficients, SIAM J. on Control, 6, (1976), 419-444.
7. Ekeland I., Nonconvex minimization problem, Bull. Amer. Math. Soc., (NS) 1, (1979), 443-474.
Institute of Cybernetics, Baku State University, Baku, Azerbaijan
E-mail address: cher.agayeva@rambler.ru
|