On large deviations in estimation problem with dependent observations

The paper is devoted to the stochastic optimization problem with a stationary ergodic random sequence satisfying the hypermixing condition. It is assumed that we have a finite number of observed elements of the sequence, and instead of solving the former problem we investigate the empirical function, find its points of minimum, and study their asymptotic properties. More precisely, we consider the probabilities of large deviations of the minimizers and of the minimal value of the empirical criterion function from the corresponding characteristics of the main problem. The conditions under which the probabilities of the large deviations decrease exponentially are found.

Bibliographic Details
Date: 2005
Main Authors: Knopov, P.S., Kasitskaya, E.J.
Format: Article
Language: English
Published: Інститут математики НАН України, 2005
Online access: http://dspace.nbuv.gov.ua/handle/123456789/4430
Journal title: Digital Library of Periodicals of National Academy of Sciences of Ukraine
ISSN: 0321-3900
Citation: On large deviations in estimation problem with dependent observations / P.S. Knopov, E.J. Kasitskaya // Theory of Stochastic Processes. — 2005. — Vol. 11 (27), No. 3–4. — P. 97–103. — Bibliography: 4 titles. — English.

Full text

Theory of Stochastic Processes, Vol. 11 (27), no. 3–4, 2005, pp. 97–103. UDC 519.21

P. S. KNOPOV AND E. J. KASITSKAYA

ON LARGE DEVIATIONS IN ESTIMATION PROBLEM WITH DEPENDENT OBSERVATIONS

The paper is devoted to the stochastic optimization problem with a stationary ergodic random sequence satisfying the hypermixing condition. It is assumed that we have a finite number of observed elements of the sequence, and instead of solving the former problem we investigate the empirical function, find its points of minimum, and study their asymptotic properties. More precisely, we consider the probabilities of large deviations of the minimizers and of the minimal value of the empirical criterion function from the corresponding characteristics of the main problem. The conditions under which the probabilities of the large deviations decrease exponentially are found.

2000 AMS Mathematics Subject Classification: Primary 62M10, 62M40.
Key words and phrases: large deviations, stochastic programming problem, empirical estimates, hypermixing.

We investigate the stochastic optimization problem: minimize
\[
(1)\qquad Ef(x) = E f(x, \xi_0), \quad x \in X,
\]
where $\{\xi_i,\ i \in \mathbb{Z}\}$ is a stationary in the strict sense, ergodic random sequence defined on a probability space $(\Omega, \mathcal{F}, P)$ with values in some measurable space $(Y, \mathcal{Y})$, $X$ is a compact subset of some Banach space $S$ with the norm $\|\cdot\|$, and $f : X \times Y \to \mathbb{R}$ is a known function continuous in the first argument and measurable in the second one.

Instead of (1) we minimize the empirical function
\[
(2)\qquad F_n(x) = \frac{1}{n} \sum_{k=1}^{n} f(x, \xi_k), \quad x \in X,
\]
where $\{\xi_k,\ k = 1, \dots, n\}$ are the observed elements of the sequence $\{\xi_i\}$.

If
\[
E\Big\{\max_{x \in X} |f(x, \xi_0)|\Big\} < \infty,
\]
then there exists a solution $x^*$ of problem (1), which we suppose to be unique. It is known that there exists a minimum point $x_n(\omega)$ of the function (2). Under fairly mild conditions (see [1]), $x_n(\omega)$ tends to $x^*$ with probability 1 as $n \to \infty$. The aim of the paper is to estimate the large deviations of $x_n$ and $F_n(x_n)$.
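Before turning to the large deviation machinery, the following minimal numerical sketch may help fix the setup. It assumes the quadratic loss $f(x, \xi) = (x - \xi)^2$ on $X = [-1, 1]$ and a Gaussian AR(1) sequence; both are our own illustrative choices, not taken from the paper, and we do not claim that this sequence satisfies the hypermixing condition introduced below — its stationarity and ergodicity already yield the consistency $x_n \to x^* = 0$ being illustrated.

import numpy as np

rng = np.random.default_rng(0)

def ar1_sequence(n, phi=0.5, burn_in=500):
    # Gaussian AR(1): xi_k = phi * xi_{k-1} + eps_k; the burn-in brings the
    # chain close to its stationary distribution before we start observing.
    eps = rng.standard_normal(n + burn_in)
    xi = np.zeros(n + burn_in)
    for k in range(1, n + burn_in):
        xi[k] = phi * xi[k - 1] + eps[k]
    return xi[burn_in:]

def empirical_minimizer(xi, grid):
    # Minimize F_n(x) = (1/n) * sum_k (x - xi_k)^2 over a grid on the compact X.
    values = np.array([np.mean((x - xi) ** 2) for x in grid])
    i = int(np.argmin(values))
    return grid[i], values[i]

grid = np.linspace(-1.0, 1.0, 2001)   # compact set X = [-1, 1]
x_star = 0.0                          # Ef(x) = x^2 + Var(xi_0) is minimized at x* = 0

for n in (100, 1000, 10000):
    x_n, _ = empirical_minimizer(ar1_sequence(n), grid)
    print(f"n = {n:6d}:  x_n = {x_n:+.4f},  |x_n - x*| = {abs(x_n - x_star):.4f}")

The deviation $|x_n - x^*|$ shrinks as $n$ grows; the theorems below quantify how fast the probability of a deviation of a fixed size $\varepsilon$ decays.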
Let us recall some facts from functional analysis. For any $y \in Y$ the function $f(\cdot, y)$ belongs to the space $C(X)$ of continuous real functions on $X$. We assume that for all $y \in Y$ we have $f(\cdot, y) - Ef(\cdot) \in K$, where $K$ is some convex compact set in $C(X)$. Therefore, for any $n$, $F_n(\cdot) - Ef(\cdot)$ is a random element defined on the probability space $(\Omega, \mathcal{F}, P)$ with values in $K$.

Definition 1 [2]. Let $(V, \|\cdot\|)$ be a normed linear space, $B(x, \rho)$ a closed ball in $V$ with radius $\rho$ and center $x$, $f : V \to [-\infty, +\infty]$ some function, and $f(x_f) = \min\{f(x),\ x \in V\}$. A condition function $\psi$ for $f$ at $x_f$ is a monotone increasing function $\psi : [0, +\infty) \to [0, +\infty]$ with $\psi(0) = 0$ such that for some $\rho > 0$ and for all $x \in B(x_f, \rho)$ we have
\[
f(x) \ge f(x_f) + \psi(\|x - x_f\|).
\]

Assume that $V_0 \subset V$, and denote by $\delta_{V_0}$ the indicator function of $V_0$: $\delta_{V_0}(x) = 0$ for $x \in V_0$, and $\delta_{V_0}(x) = +\infty$ for $x \notin V_0$.

Theorem 1 [2]. Let $(V, \|\cdot\|)$ be a normed linear space, $V_0 \subset V$ closed, and let $f_0, g_0 : V \to \mathbb{R}$ be continuous functions on $V$. Put
\[
\varepsilon = \sup\{|f_0(x) - g_0(x)|,\ x \in V_0\},
\]
and define the functions $f, g : V \to (-\infty, +\infty]$ by $f = f_0 + \delta_{V_0}$, $g = g_0 + \delta_{V_0}$. Then
\[
|\inf\{f(x),\ x \in V\} - \inf\{g(x),\ x \in V\}| \le \varepsilon.
\]
Next, let $x_f$ be a minimum point of $f$: $f(x_f) = \inf\{f(x),\ x \in V\}$, and assume that $\psi$ is a condition function for $f$ at $x_f$ with some constant $\rho > 0$. If $\varepsilon$ is sufficiently small so that $\psi(\|x - x_f\|) \le 2\varepsilon$ implies $\|x - x_f\| \le \rho$, then for any $x_g \in \arg\min\{g(x),\ x \in B(x_f, \rho)\}$ the following inequality is fulfilled:
\[
\psi(\|x_f - x_g\|) \le 2\varepsilon.
\]
When $\psi$ is convex and strictly increasing on $[0, \rho]$, the preceding inequality can also be expressed in the following way: if $\varepsilon$ is small enough so that $\psi^{-1}(2\varepsilon) \le \rho$, then for any $x_g \in \arg\min\{g(x),\ x \in B(x_f, \rho)\}$ one has $\|x_f - x_g\| \le \psi^{-1}(2\varepsilon)$.

Theorem 2 [3, p. 53]. Let $\{\mu_\varepsilon : \varepsilon > 0\}$ be a family of probability measures on $G$, where $G$ is a closed convex subset of a separable Banach space $S$. Assume that
\[
\Lambda(\lambda) \equiv \lim_{\varepsilon \to 0} \varepsilon \Lambda_{\mu_\varepsilon}(\lambda/\varepsilon)
\]
exists for every $\lambda \in S^*$, where $S^*$ is the dual space of $S$ and
\[
\Lambda_\mu(\lambda) = \ln\Big( \int_S \exp[\langle \lambda, x \rangle]\, \mu(dx) \Big)
\]
for an arbitrary probability measure $\mu$ on $S$, $\langle \lambda, x \rangle$ being the corresponding duality relation. Denote
\[
\Lambda^*(q) = \sup\{\langle \lambda, q \rangle - \Lambda(\lambda),\ \lambda \in S^*\}, \quad q \in G.
\]
Then the function $\Lambda^*$ is nonnegative, lower semicontinuous and convex, and for any compact set $A \subset G$
\[
\limsup_{\varepsilon \to 0}\, \varepsilon \ln(\mu_\varepsilon(A)) \le -\inf\{\Lambda^*(q),\ q \in A\}.
\]

Definition 2 [3]. Let $\Sigma$ be a separable Banach space and $\{\xi_i,\ i \in \mathbb{Z}\}$ a stationary in the strict sense random sequence defined on a probability space $(\Omega, \mathcal{F}, P)$ with values in $\Sigma$. Let $\mathcal{B}_m^k$ denote the $\sigma$-algebra over $\Omega$ generated by the random elements $\{\xi_i,\ m \le i \le k\}$. For a given $l \in \mathbb{N}$, the real random variables $\eta_1, \dots, \eta_p$, $p \ge 2$, are called $l$-measurably separated if
\[
-\infty \le m_1 \le k_1 < m_2 \le k_2 < \dots < m_p \le k_p \le +\infty, \qquad m_j - k_{j-1} \ge l,\ j = 2, \dots, p,
\]
and for each $j \in \{1, \dots, p\}$ the random variable $\eta_j$ is $\mathcal{B}_{m_j}^{k_j}$-measurable.

Definition 3 [3]. A random sequence $\{\xi_i\}$ from Definition 2 is called a sequence with hypermixing if there exist a number $l_0 \in \mathbb{N} \cup \{0\}$ and non-increasing functions $\alpha, \beta : \{l > l_0\} \to [1, +\infty)$ and $\gamma : \{l > l_0\} \to [0, 1]$ satisfying
\[
\lim_{l \to \infty} \alpha(l) = 1, \qquad \limsup_{l \to \infty}\, l(\beta(l) - 1) < \infty, \qquad \lim_{l \to \infty} \gamma(l) = 0,
\]
and for which

(H-1) $\displaystyle \|\eta_1 \cdots \eta_p\|_{L^1(P)} \le \prod_{j=1}^{p} \|\eta_j\|_{L^{\alpha(l)}(P)}$ whenever $p \ge 2$, $l > l_0$ and $\eta_1, \dots, \eta_p$ are $l$-measurably separated functions, where $\|\eta\|_{L^r(P)} = \big( \int_\Omega |\eta(\omega)|^r\, dP \big)^{1/r}$;

(H-2) $\displaystyle \Big| \int_\Omega \Big( \xi(\omega) - \int_\Omega \xi(\omega)\, dP \Big) \eta(\omega)\, dP \Big| \le \gamma(l)\, \|\xi\|_{L^{\beta(l)}(P)}\, \|\eta\|_{L^{\beta(l)}(P)}$ for all $l > l_0$ and all $l$-measurably separated $\xi, \eta \in L^1(P)$.
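As a simple sanity check (our remark, not from [3]): every i.i.d. sequence is hypermixing in a trivial way. Random variables that are $l$-measurably separated are then independent for any $l \ge 1$, so
\[
\|\eta_1 \cdots \eta_p\|_{L^1(P)} = \prod_{j=1}^{p} \|\eta_j\|_{L^1(P)}
\qquad \text{and} \qquad
\int_\Omega \Big( \xi - \int_\Omega \xi\, dP \Big)\, \eta\, dP = 0,
\]
and (H-1), (H-2) hold with $l_0 = 0$, $\alpha(l) \equiv 1$, $\beta(l) \equiv 1$, $\gamma(l) \equiv 0$. The point of Definitions 2 and 3 is that these two hypotheses survive, with $\alpha(l) \to 1$ and $\gamma(l) \to 0$ only as the gap $l$ grows, for a much wider class of dependent sequences.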
It is known (see [4]) that $C(X)^* = M(X)$, the set of bounded signed measures on $X$, with
\[
\langle g, Q \rangle = \int_X g(x)\, Q(dx) \quad \text{for any } g \in C(X),\ Q \in M(X).
\]

Theorem 3. Suppose that $\{\xi_i,\ i \in \mathbb{Z}\}$ is a stationary in the strict sense, ergodic random sequence satisfying hypothesis (H-1) of the hypermixing condition, defined on a probability space $(\Omega, \mathcal{F}, P)$ with values in a compact convex set $K \subset C(X)$. Then for any measure $Q \in M(X)$ there exists
\[
\Lambda(Q) = \lim_{n \to \infty} \frac{1}{n} \ln\Big( \int_\Omega \exp\Big\{ \sum_{i=1}^{n} \int_X \xi_i(\omega)(x)\, Q(dx) \Big\}\, dP \Big),
\]
and for any closed $A \subset K$
\[
\limsup_{n \to \infty} \frac{1}{n} \ln\Big( P\Big\{ \frac{1}{n} \sum_{i=1}^{n} \xi_i \in A \Big\} \Big) \le -\inf\{\Lambda^*(g),\ g \in A\},
\]
where $\Lambda^*(g) = \sup\big\{ \int_X g(x)\, Q(dx) - \Lambda(Q),\ Q \in M(X) \big\}$ is a nonnegative, lower semicontinuous and convex function.

Proof. Consider any $Q \in M(X)$, and let $l_0$ be the number from hypothesis (H-1). Fix $l > l_0$ and $m, n \in \mathbb{N}$ with $l < m < n$. Then $n = N_n m + r_n$ with $N_n \in \mathbb{N}$, $r_n \in \mathbb{N} \cup \{0\}$, $r_n < m$. We use the notation
\[
\|g\| = \max\{|g(x)|,\ x \in X\},\ g \in C(X), \qquad c = \max\{\|g\|,\ g \in K\},
\]
\[
(3)\qquad f_n = \ln\Big( \int_\Omega \exp\Big\{ \sum_{i=1}^{n} \int_X \xi_i(\omega)(x)\, Q(dx) \Big\}\, dP \Big),
\]
\[
v(Q, X) = \sup\Big\{ \sum_{i=1}^{k} |Q(E_i)| : E_i \cap E_j = \emptyset \text{ for } i \ne j,\ E_i \in \mathcal{B}(X),\ k \in \mathbb{N} \Big\} < \infty,
\]
where $\mathcal{B}(X)$ is the Borel $\sigma$-algebra on $X$; the last formula (the total variation of $Q$) is taken from [4].

For all $\omega$ we have
\[
(4)\qquad \sum_{i=1}^{n} \int_X \xi_i(\omega)(x)\, Q(dx) = \sum_{j=0}^{N_n - 1} \sum_{i=jm+1}^{(j+1)m-l} \int_X \xi_i(\omega)(x)\, Q(dx) + \sum_{j=0}^{N_n - 1} \sum_{i=(j+1)m-l+1}^{(j+1)m} \int_X \xi_i(\omega)(x)\, Q(dx) + \sum_{i=N_n m+1}^{n} \int_X \xi_i(\omega)(x)\, Q(dx).
\]
Further, in view of (3), for each $i, \omega$,
\[
(5)\qquad \Big| \int_X \xi_i(\omega)(x)\, Q(dx) \Big| \le c\, v(Q, X).
\]
Due to (5), for any $\omega$,
\[
(6)\qquad \sum_{j=0}^{N_n - 1} \sum_{i=(j+1)m-l+1}^{(j+1)m} \int_X \xi_i(\omega)(x)\, Q(dx) \le c\, v(Q, X)\, l N_n,
\]
\[
(7)\qquad \sum_{i=N_n m+1}^{n} \int_X \xi_i(\omega)(x)\, Q(dx) \le c\, v(Q, X)\, r_n.
\]
For each $\omega$ denote by $V_1, V_2, V_3$ the three sums on the right-hand side of (4), in that order. The inequalities (6) and (7) imply that
\[
(8)\qquad \exp\{V_1 + V_2 + V_3\} \le \exp\{V_1\}\, \exp\{c\, v(Q, X)\, l N_n\}\, \exp\{c\, v(Q, X)\, r_n\}, \quad \omega \in \Omega,
\]
and hence
\[
\int_\Omega \exp\{V_1 + V_2 + V_3\}\, dP \le \exp\{c\, v(Q, X)\, l N_n\}\, \exp\{c\, v(Q, X)\, r_n\} \int_\Omega \exp\{V_1\}\, dP.
\]
The $N_n$ block sums entering $V_1$ are $l$-measurably separated, so (H-1) gives
\[
(9)\qquad \int_\Omega \prod_{j=0}^{N_n - 1} \exp\Big\{ \sum_{i=jm+1}^{(j+1)m-l} \int_X \xi_i(\omega)(x)\, Q(dx) \Big\}\, dP \le \prod_{j=0}^{N_n - 1} \Big( \int_\Omega \exp\Big\{ \alpha(l) \sum_{i=jm+1}^{(j+1)m-l} \int_X \xi_i(\omega)(x)\, Q(dx) \Big\}\, dP \Big)^{1/\alpha(l)},
\]
while stationarity gives
\[
(10)\qquad \int_\Omega \exp\Big\{ \alpha(l) \sum_{i=jm+1}^{(j+1)m-l} \int_X \xi_i(\omega)(x)\, Q(dx) \Big\}\, dP = \int_\Omega \exp\Big\{ \alpha(l) \sum_{i=1}^{m-l} \int_X \xi_i(\omega)(x)\, Q(dx) \Big\}\, dP, \quad j = 1, \dots, N_n - 1.
\]
In view of (9) and (10) we get
\[
\int_\Omega \exp\{V_1\}\, dP \le \Big( \int_\Omega \exp\Big\{ \alpha(l) \sum_{i=1}^{m-l} \int_X \xi_i(\omega)(x)\, Q(dx) \Big\}\, dP \Big)^{N_n/\alpha(l)}.
\]
From (4) we get, writing $\alpha(l) = (\alpha(l) - 1) + 1$ and using (5) to bound the extra terms,
\begin{align*}
f_n &= \ln\Big( \int_\Omega \exp\{V_1 + V_2 + V_3\}\, dP \Big) \\
&\le c v(Q,X)\, l N_n + c v(Q,X)\, r_n + \frac{N_n}{\alpha(l)} \ln\Big( \int_\Omega \exp\Big\{ (\alpha(l) - 1) \sum_{i=1}^{m-l} \int_X \xi_i\, Q(dx) + \sum_{i=1}^{m-l} \int_X \xi_i\, Q(dx) \Big\}\, dP \Big) \\
&\le c v(Q,X)\, l N_n + c v(Q,X)\, r_n + \frac{N_n}{\alpha(l)}\, (\alpha(l) - 1)(m - l)\, c v(Q,X) + \frac{N_n}{\alpha(l)} \ln\Big( \int_\Omega \exp\Big\{ \sum_{i=1}^{m-l} \int_X \xi_i\, Q(dx) \Big\}\, dP \Big) \\
&\le c v(Q,X)\, l N_n + c v(Q,X)\, r_n + (\alpha(l) - 1)(m - l)\, c v(Q,X)\, N_n + \frac{N_n}{\alpha(l)} \ln\Big( \int_\Omega \exp\Big\{ \sum_{i=1}^{m} \int_X \xi_i\, Q(dx) - \sum_{i=m-l+1}^{m} \int_X \xi_i\, Q(dx) \Big\}\, dP \Big) \\
&\le c v(Q,X)\, l N_n + c v(Q,X)\, r_n + (\alpha(l) - 1)\, c v(Q,X)\, m N_n + \frac{N_n}{\alpha(l)}\, c v(Q,X)\, l + \frac{N_n}{\alpha(l)}\, f_m \\
(11)\quad &\le 2 c v(Q,X)\, l N_n + c v(Q,X)\, r_n + (\alpha(l) - 1)\, c v(Q,X)\, m N_n + \frac{N_n}{\alpha(l)}\, f_m.
\end{align*}
The inequality (11) implies that
\[
\frac{f_n}{n} \le \frac{2 c v(Q,X)\, l N_n}{N_n m} + \frac{c v(Q,X)\, r_n}{n} + (\alpha(l) - 1)\, c v(Q,X) + \frac{N_n f_m}{\alpha(l)(N_n m + r_n)} = \frac{2 c v(Q,X)\, l}{m} + \frac{c v(Q,X)\, r_n}{n} + (\alpha(l) - 1)\, c v(Q,X) + \frac{f_m}{\alpha(l)(m + r_n/N_n)}.
\]
Therefore
\[
\limsup_{n \to \infty} \frac{f_n}{n} \le \frac{2 c v(Q,X)\, l}{m} + (\alpha(l) - 1)\, c v(Q,X) + \frac{1}{\alpha(l)} \frac{f_m}{m}.
\]
Passing to the $\liminf$ as $m \to \infty$, we obtain
\[
\limsup_{n \to \infty} \frac{f_n}{n} \le (\alpha(l) - 1)\, c v(Q,X) + \frac{1}{\alpha(l)} \liminf_{m \to \infty} \frac{f_m}{m},
\]
and letting $l \to \infty$, so that $\alpha(l) \to 1$, we have
\[
\limsup_{n \to \infty} \frac{f_n}{n} \le \liminf_{m \to \infty} \frac{f_m}{m}.
\]
Consequently, there exists
\[
\lim_{n \to \infty} \frac{f_n}{n} = \Lambda(Q).
\]
Now the theorem follows from Theorem 2. Indeed, take $G = K$, $S = C(X)$, $S^* = M(X)$, $\langle Q, g \rangle = \int_X g(x)\, Q(dx)$, $\varepsilon = 1/n$, and let $\mu_\varepsilon = \mu_{1/n}$ be the probability measure on $K$ defined by the distribution of the random element $\frac{1}{n} \sum_{i=1}^{n} \xi_i$; note that every closed $A \subset K$ is compact, since $K$ itself is compact. Then
\[
(12)\qquad \lim_{\varepsilon \to 0} \varepsilon \Lambda_{\mu_\varepsilon}(Q/\varepsilon) = \lim_{n \to \infty} \frac{1}{n} \ln\Big( \int_K \exp\Big\{ \int_X g(x)\, n Q(dx) \Big\}\, \mu_{1/n}(dg) \Big) = \lim_{n \to \infty} \frac{1}{n} \ln\Big( \int_\Omega \exp\Big\{ \int_X \frac{1}{n} \sum_{i=1}^{n} \xi_i(\omega)(x)\, n Q(dx) \Big\}\, dP \Big) = \lim_{n \to \infty} \frac{f_n}{n} = \Lambda(Q).
\]
The proof is complete.

Now let us consider the problems (1) and (2). Suppose that the given sequence $\{\xi_i,\ i \in \mathbb{Z}\}$ satisfies hypothesis (H-1) of the hypermixing condition. Then the sequence $\zeta_i = f(\cdot, \xi_i) - Ef(\cdot)$, $i \in \mathbb{Z}$, satisfies (H-1) too. Denote
\[
A_\varepsilon = \{z \in K : \|z\| \ge \varepsilon\}, \qquad I(z) = \Lambda^*(z) = \sup\Big\{ \int_X z(x)\, Q(dx) - \Lambda(Q),\ Q \in M(X) \Big\}.
\]
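To see what the condition function buys in the next theorem, consider the classical strongly convex case (our illustration, not part of the paper): if $Ef(x) \ge Ef(x^*) + \kappa \|x - x^*\|^2$ for all $x \in B(x^*, \rho)$ with some $\kappa > 0$, then $\psi(t) = \kappa t^2$ is a condition function for $Ef$ at $x^*$, convex and strictly increasing on $[0, \rho]$, with $\psi^{-1}(2\varepsilon) = \sqrt{2\varepsilon/\kappa}$. For $\varepsilon$ small enough that $\sqrt{2\varepsilon/\kappa} \le \rho$, the bound (15) below then reads
\[
\limsup_{n \to \infty} \frac{1}{n} \ln P\Big\{ \|x_n - x^*\| \ge \sqrt{2\varepsilon/\kappa} \Big\} \le -\inf\{I(z),\ z \in A_\varepsilon\},
\]
i.e. the probability that the empirical minimizer leaves a $\sqrt{2\varepsilon/\kappa}$-ball around $x^*$ decays exponentially in the sample size $n$.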
Theorem 4. Under hypothesis (H-1) of the hypermixing condition we have
\[
(13)\qquad \limsup_{n \to \infty} \frac{1}{n} \ln P\{|\min\{Ef(x),\ x \in X\} - \min\{F_n(x),\ x \in X\}| \ge \varepsilon\} \le -\inf\{I(z),\ z \in A_\varepsilon\}.
\]
Assume that there exists a condition function $\psi$ for $Ef$ at $x^*$ with some constant $\rho$, and let $x_n$ be a minimum point of the function (2) on the set $B(x^*, \rho)$. If $\varepsilon$ is sufficiently small so that the condition
\[
\psi(\|x - x^*\|) \le 2\varepsilon \ \Rightarrow\ \|x - x^*\| \le \rho
\]
is fulfilled, then
\[
(14)\qquad \limsup_{n \to \infty} \frac{1}{n} \ln P\{\psi(\|x_n - x^*\|) \ge 2\varepsilon\} \le -\inf\{I(z),\ z \in A_\varepsilon\}.
\]
Moreover, if $\psi$ is convex and strictly increasing on $[0, \rho]$, then
\[
(15)\qquad \limsup_{n \to \infty} \frac{1}{n} \ln P\{\|x_n - x^*\| \ge \psi^{-1}(2\varepsilon)\} \le -\inf\{I(z),\ z \in A_\varepsilon\}.
\]

Proof. Theorem 1 implies that for each $\omega$
\[
(16)\qquad |\min\{Ef(x),\ x \in X\} - \min\{F_n(x),\ x \in X\}| \le \|F_n - Ef\|.
\]
Further, the conditions of Theorem 3 are fulfilled for the sequence $\{\zeta_i\}$. Therefore, for any $\varepsilon > 0$,
\[
(17)\qquad \limsup_{n \to \infty} \frac{1}{n} \ln P\{\|F_n - Ef\| \ge \varepsilon\} \le -\inf\{I(z),\ z \in A_\varepsilon\}.
\]
The inequality (13) follows from (16) and (17). For the proof of the second part of the theorem we again use Theorem 1. Under the conditions of the theorem we have, for all $\omega$,
\[
(18)\qquad \psi(\|x^* - x_n\|) \le 2\,\|F_n - Ef\|,
\]
or
\[
(19)\qquad \|x_n - x^*\| \le \psi^{-1}(2\,\|F_n - Ef\|).
\]
Taking (17) into account, the inequalities (18) and (19) imply (14) and (15), respectively. The theorem is proved.

Bibliography

1. Knopov, P.S. and Kasitskaya, E.J., Empirical Estimates in Stochastic Optimization and Identification, Kluwer Academic Publishers, Dordrecht, 2002.
2. Kaniovski, Yu.M., King, A.J. and Wets, R.J-B., "Probabilistic bounds (via large deviations) for the solutions of stochastic programming problems", Ann. Oper. Res. 56 (1995), 189–208.
3. Deuschel, J.-D. and Stroock, D.W., Large Deviations, Academic Press, Boston, 1989.
4. Dunford, N. and Schwartz, J., Linear Operators, Part I: General Theory, Interscience Publishers, New York, 1958.

E-mail: knopov1@yahoo.com