On asymptotic information integral inequalities
Asymptotical versions of Bayesian Cramer – Rao inequalities are discussed.
Date: | 2007 |
---|---|
Author: | Veretennikov, A. |
Format: | Article |
Language: | English |
Published: | Інститут математики НАН України (Institute of Mathematics, NAS of Ukraine), 2007 |
Online access: | http://dspace.nbuv.gov.ua/handle/123456789/4498 |
Journal title: | Digital Library of Periodicals of National Academy of Sciences of Ukraine |
Cite as: | On asymptotic information integral inequalities / A. Veretennikov // Theory of Stochastic Processes. 2007. Vol. 13 (29), No. 1-2. P. 294-307. Bibliography: 14 titles. In English. |
Theory of Stochastic Processes
Vol.13 (29), no.1-2, 2007, pp.294-307
ALEXANDER VERETENNIKOV
ON ASYMPTOTIC INFORMATION INTEGRAL
INEQUALITIES
Asymptotical versions of Bayesian Cramér – Rao inequalities are
discussed.
1. Introduction
Apparently, the Cramér–Rao inequality for unbiased estimators and its asymptotic version are among the most basic results of point estimation theory in modern statistics textbooks. This is despite the existence of super-efficiency in a wider class of estimators, which makes the use of this inequality in its conventional form doubtful at best. A logically correct approach that is free from this obstacle can be based on LAN (local asymptotic normality) and minimax concepts, which provide improved ways of using the idea of this inequality (see, e.g., [2, 6]). Unfortunately, the corresponding monographs are hardly suitable for undergraduate courses. At the same time, there is a nice, relatively easy, and also logically correct way of teaching point estimation theory, based on a certain integral version of the famous inequality. The point is that these versions do not require unbiasedness, at least not the usual one. In this setting, both the Maximum Likelihood Estimator and the Bayesian estimator are, under mild assumptions, asymptotically efficient in the integral mean square sense. In other words, there is no super-efficiency phenomenon here. These integral inequalities are used in practical applications, for example, in signal processing; see, e.g., [12].
Some history of integral versions of the Cramér–Rao lower bound can be found, e.g., in the paper [8]. Apparently, the first integral inequality of this sort in a rather general form, although without a precise set of assumptions, was established by Schützenberger [9, 10]; the first of these references is cited in [8]. It should be admitted that neither source (i.e., [9, 10]) is easily available nowadays; however, there is a useful link on the publication list, http://www-igm.univ-mlv.fr/~herstel/Mps. About a decade later, the idea of integration with respect to a parameter was rediscovered by Van Trees, see [12], and since then the result has usually been called the Van Trees inequality. Probably, the name Schützenberger would be more appropriate here. In this respect it is instructive to recall that an earlier version of the classical Cramér–Rao inequality was actually established by Fréchet; this fact was known and seemed to be recognized in the old days. Because of this variety of names (and who knows whether there are other papers that simply remain unknown), we use here the general term information inequality. In any case, the paper [10] mentions two other names, but the author did not manage to trace their papers on the subject.

Invited lecture.
2000 Mathematics Subject Classifications. 62F15; 60E15.
Key words and phrases. Cramér–Rao inequality, integral asymptotic case.
Let us return to asymptotic integral information inequalities, the subject of this paper. They are most important for the construction of a complete asymptotic theory of estimation. We will mainly discuss the lower bound (6), due to Borovkov and Sakhanenko. For some other results in this direction see also [1, 7].
Let us say a few words about assumptions, since the aim of this paper is, in fact, to provide new assumptions for the existing inequality (6). The most standard assumption in this theory is the so-called “weak unbiasedness condition” (see (2) below), which is used in most works on the subject, see, e.g., [8, 5]. The term “weak unbiasedness” should not confuse the reader: it has practically nothing to do with unbiasedness. The paper [3] (see also [2, §30]) introduced the amazing idea of replacing this already weak condition by a very unusual and even weaker technical assumption for the limiting inequality: namely, the “prior density” (which need not even be a density) has to be Riemann integrable if Θ is bounded, or directly Riemann integrable if it is not (see [4, Chapter 11]). We notice that the words “directly integrable” are not used in [2]; however, the calculus in the proof apparently exploits exactly this property. There were some generalizations, see [8, 11] and references therein. However, the author has not seen any discussion of this surprising Riemann integrability condition anywhere but in the textbook [2]. So, the question arises: could one drop even this remarkable condition? Surely, Riemann integrability as an assumption in a theory based on Lebesgue integration looks strange. The aim of this note is to show that some asymptotic versions of this integral inequality are valid without any Riemann integrability at all.
The structure of the paper is as follows. First, we present a new version of Theorem 5 of [3], that is, a rather similar, although not fully identical, result with Riemann integrability, together with a proof. Then we present one more version of this inequality without Riemann integrability at all. In the proofs, we use the approach developed in the proof of Theorem 30.5 of [2] and combine it with some further hints. Notice that the proof of Theorem 30.5 in [2] contains some minor inaccuracies, which apparently can be corrected; see Remarks 4 and 5 below. This does not diminish our deep appreciation of the result by Borovkov and Sakhanenko; however, corrections, being a thankless task, are not the aim of this paper. Following [2], the general idea of dropping weak unbiasedness is to use approximations with a level, say, ε, adjusted to the sample size n. Finally, let us mention that we consider here bounded intervals in $\mathbb{R}^1$ as parametric sets; generalizations to unbounded parametric sets and to any finite dimension are possible. The idea of this work gradually arose from teaching theoretical statistics at the University of Leeds, and earlier at Moscow State University (cf. [13]).
2. Setting and results: Θ = (a, b)
We consider a family of densities $(f(x \mid \theta),\ \theta \in \bar\Theta)$ with a parametric set $\Theta = (a, b)$, $-\infty < a < b < \infty$; independent samples $X = (X_1, \ldots, X_n)$, $n \to \infty$; Fisher's information $I(\theta) := E_\theta\,(\partial \ln L_n(X \mid \theta)/\partial\theta)^2$, where $L_n$ is a conditional density of the sample $X$ given $\theta$; and the “prior density” $q(\theta)$, $\theta \in \bar\Theta$, which is a Borel measurable function. We assume that the function $I$ is also Borel, but do not require it to be continuous. All these are standard assumptions, which are not repeated in the sequel. We just notice that considering $\bar\Theta$ is theoretically rather convenient, see [6].
In addition to these standard requirements, we assume in all Theorems and the Proposition below, as in [2, §30], the following:
(A1)
$$0 < J := \int_a^b \frac{q(t)}{I(t)}\,dt < \infty, \qquad \int_a^b \sqrt{I(t)}\,dt < \infty. \tag{1}$$
The weak unbiasedness condition mentioned above, which we do not use below, reads
$$q(b)\,E_b(\theta^* - b) - q(a)\,E_a(\theta^* - a) = 0, \qquad q \in C[a, b]. \tag{2}$$
In turn, a simple sufficient condition for (2) is
$$q(a) = q(b) = 0, \qquad q \in C[a, b]. \tag{3}$$
On the other hand, (2) could also be guaranteed by ordinary unbiasedness at the two end points, i.e., at a and b. Instead of (2), Theorem 5 of [3] assumes (1) and, in addition, requires that
(ABS)
the function q is Riemann integrable on [a, b]. (4)
We believe that some other minor condition should be added to (ABS) in order for it to be sufficient for the inequality (6), see Remark 5 below. Instead, we will use here a slightly different version of the latter assumption,
(A2)
the function q/I is Riemann integrable on [a, b], and $\inf_{a \le t \le b} q(t) > 0$. (5)
First of all, we propose a new version of Theorem 5 from [3] and [2, §30], under the assumption (A2) instead of (ABS).
Theorem 1. Let (A1) and (A2) hold true. Then
$$\liminf_{n\to\infty} n\,E(\theta^* - \theta)^2 = \liminf_{n\to\infty} n \int E_\theta(\theta^* - \theta)^2\, q(\theta)\,d\theta \ \ge\ J. \tag{6}$$
Now we are going to show that in some cases it is possible to relax the Riemann integrability assumption further, and even to drop it completely; see Theorem 2 below. The following new assumption, which describes one of these cases, will be used:
(A3) there exists C > 0 such that
$$C^{-1} \le \frac{q(t)}{I(t)} \le C, \quad a \le t \le b, \qquad \text{and} \qquad \inf_{a \le t \le b} I(t) > 0.$$
Theorem 2. Let the assumptions (A1) and (A3) be satisfied. Then (6)
holds true.
The “price” of dropping the Riemann integrability assumption is a new restriction on the “prior”, (A3); at least in certain cases this restriction could be considered mild. A certain “mixture” of the two assumptions (A2) and (A3) is possible (for example, one could require (A2) on one part of Θ and (A3) on the complementary part), but we do not pursue this here. Let us state a useful technical lemma, which will be applied in the proof of Theorem 2; it may also be of some independent interest, see Corollary 1 below.
Lemma 1. Let the assumption (A1) be satisfied, let q ≥ 0, and let there be a sequence $0 \le q_m \uparrow q$ (a.e.) as $m \to \infty$ such that, for any estimator $\theta^*$,
$$\liminf_{n\to\infty} n \int E_\theta(\theta^* - \theta)^2\, \tilde q_m(\theta)\,d\theta \ \ge\ \int \frac{\tilde q_m(t)}{I(t)}\,dt,$$
where
$$\tilde q_m(t) := \frac{q_m(t)}{\int q_m(\theta)\,d\theta}.$$
Then (6) holds true with the prior q.
Corollary 1. Let q ≥ 0, and
$$0 \le q_m(t) \uparrow q(t), \quad m \to \infty, \quad \text{a.e.}$$
If every function $q_m$ is Riemann integrable and satisfies (A1), then (6) holds true for this q.
Some independent interest here might arise because the class of functions q that admit a monotone approximation from below by Riemann integrable functions is wider than the class of Riemann integrable functions, although it is still narrower than $L_1[a, b]$.
Remark 1. It is tempting to formulate an analogue of Lemma 1 for a decreasing sequence approximating q, in order to extend Theorem 2 to all Lebesgue integrable densities. However, this idea does not seem to help much, because it apparently only works if the convergence is uniform; and as such, it could be reformulated as a uniform increasing convergence, too, just by multiplying by an appropriate constant close to one. Hence this new lemma, which could indeed be stated, follows from Lemma 1 above and is actually strictly weaker than it.
Remark 2. In certain works, the notion of “Fisher’s integral information” is used,
$$\bar I(n) := E\,(\partial \ln L_n(X, \theta)/\partial\theta)^2; \tag{7}$$
a similar definition via the second derivative could be applied, too. Then, e.g., under the assumption of weak unbiasedness (3), the following Cramér–Rao integral inequality can be established,
$$E(\theta^* - \theta)^2 \ \ge\ \bar I(n)^{-1} = \frac{1}{n \int q(\theta) I(\theta)\,d\theta + \int (q')^2/q\,d\theta}. \tag{8}$$
In the limit this gives
$$\liminf_{n} n\,E(\theta^* - \theta)^2 \ \ge\ \frac{1}{\int q(\theta) I(\theta)\,d\theta}. \tag{9}$$
Due to Jensen’s inequality applied to the strictly convex function $t \mapsto 1/t$, $t > 0$, we have
$$J = \int q(\theta)\,\frac{1}{I(\theta)}\,d\theta \ \ge\ \frac{1}{\int q(\theta) I(\theta)\,d\theta} =: (\bar I)^{-1}.$$
This fact is, of course, well known and mentioned in the literature. Naturally, in most cases the latter inequality is strict. So, the integral asymptotic inequality with the Borovkov–Sakhanenko bound J, as in [3] or in Theorem 1 above, is strictly stronger and therefore clearly more reasonable than the lower bound with $\bar I$, provided, of course, that J is finite. As shown in [2, Example 30.1], and as may be proved for MLEs under mild conditions, the bound with J is often attainable, see [2]. In that Example from [2], of course, $(\bar I)^{-1} < J$; hence the use of $(\bar I)^{-1}$ is not reasonable, because this bound is not attainable. We are not speaking of the case $\bar I = 0$, nor of non-differentiable densities. The question of optimal finite lower bounds has apparently not been studied in the literature.
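The comparison between J and $(\bar I)^{-1}$ in Remark 2 can be checked numerically. The sketch below is only an illustration under assumptions that are not in the paper: Bernoulli observations, for which the Fisher information per observation is $1/(\theta(1-\theta))$, a uniform prior on the hypothetical interval (0.2, 0.8), and a plain midpoint rule for the integrals.

```python
import math

def midpoint_integral(func, lo, hi, steps=20000):
    """Approximate the integral of func over [lo, hi] by the midpoint rule."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

# Hypothetical setting (an assumption, not taken from the paper):
# Bernoulli(theta) observations and a uniform prior on (a, b).
a, b = 0.2, 0.8

def fisher_information(theta):
    """Fisher information of one Bernoulli(theta) observation."""
    return 1.0 / (theta * (1.0 - theta))

def prior_density(theta):
    """Uniform prior density on (a, b)."""
    return 1.0 / (b - a)

# Bound J from (1): the integral of q / I.
J = midpoint_integral(lambda t: prior_density(t) / fisher_information(t), a, b)
# Bound from (9): the reciprocal of the integral of q * I.
I_bar = midpoint_integral(lambda t: prior_density(t) * fisher_information(t), a, b)
weaker_bound = 1.0 / I_bar

print(f"J = {J:.6f}, 1 / I_bar = {weaker_bound:.6f}")
```

In this toy setting the two values are close (about 0.22 against about 0.216), but J is the larger one, in agreement with the Jensen argument.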
We will use the following result, see, e.g., [2, §30, Theorem 4]. Notice that the inequality (10) with $h_\varepsilon = q$ leads to (9), while $h_\varepsilon = q/I$ provides (6); of course, each under the condition that q, or correspondingly q/I, satisfies the assumptions of the Proposition.
Proposition 1. Let $h_\varepsilon$ be an absolutely continuous function (that is, a function that can be represented as an integral of some Lebesgue integrable function) which satisfies the equalities
$$h_\varepsilon(a) = h_\varepsilon(b) = 0$$
and
$$\operatorname{supp}(h_\varepsilon) \subset \operatorname{supp}(q),$$
and let the second condition in (A1) be satisfied. (The first one is not exploited here because we do not use J.) Then
$$n \int E_\theta(\theta^* - \theta)^2\, q(\theta)\,d\theta \ \ge\ \frac{\left(\int_a^b h_\varepsilon(t)\,dt\right)^2}{\int_a^b I(t)\,h_\varepsilon(t)^2/q(t)\,dt + (1/n)\int_a^b (h'_\varepsilon(t))^2/q(t)\,dt}. \tag{10}$$
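For the reader's convenience, let us recall the main idea of the proof, which is an application of the Cauchy inequality to the identity
$$E\left( (\theta^*(X) - \theta)\, \frac{(f(X \mid \theta)\,h_\varepsilon(\theta))'_\theta}{f(X, \theta)} \right) = \int h_\varepsilon(t)\,dt = E\,\frac{h_\varepsilon(\theta)}{q(\theta)},$$
which, in turn, follows from
$$E\left( \theta^*(X)\, \frac{(f(X \mid \theta)\,h_\varepsilon(\theta))'_\theta}{f(X, \theta)} \right) = \int \theta^*(x) \left( \int (f(x \mid t)\,h_\varepsilon(t))'_t\,dt \right) dx = 0,$$
due to $h_\varepsilon(a) = h_\varepsilon(b) = 0$, and
$$\begin{aligned}
-E\left( \theta\, \frac{(f(X \mid \theta)\,h_\varepsilon(\theta))'_\theta}{f(X, \theta)} \right)
&= -\int \left( \int t\,(f(x \mid t)\,h_\varepsilon(t))'_t\,dt \right) dx \\
&= \int \left( \int f(x \mid t)\,h_\varepsilon(t)\,dt \right) dx = \int \left( \int f(x \mid t)\,h_\varepsilon(t)\,dx \right) dt \\
&= \int\!\!\int \frac{f(x \mid t)\,h_\varepsilon(t)}{q(t)}\,q(t)\,dx\,dt = E\,\frac{h_\varepsilon(\theta)}{q(\theta)},
\end{aligned}$$
where $h_\varepsilon(a) = h_\varepsilon(b) = 0$ has also been used explicitly (in the integration by parts). We skip further details, which can be found in [2].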
Remark 3. Notice that in [2, Theorem 30.4] one may wish to use a function $h_\varepsilon$ which is not necessarily non-negative; in this case, one should require $|h_\varepsilon| \le h$ instead of $h_\varepsilon \le h$. We do not need either of these in the Proposition above, because we do not aim to get an inequality with h in this assertion.
3. Proof of Theorem 1
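1. Let us denote
$$q_- := \inf_{a \le t \le b} q(t) > 0,$$
see the assumption (A2). For the integral $\int_a^b (h'_\varepsilon(t))^2/q(t)\,dt$, the following notation will be used,
$$H(\varepsilon) := \int_a^b (h'_\varepsilon(t))^2/q(t)\,dt.$$
Let
$$h_0(t) := q(t)/I(t),$$
$$\bar h_\varepsilon(t) := \min_{|u| \le \varepsilon} \frac{q(t+u)}{I(t+u)}, \qquad \tilde h_\varepsilon(t) := \bar h_\varepsilon(t) \wedge \frac{q_-}{\varepsilon}, \qquad a \le t \le b,$$
and
$$h_\varepsilon(t) := \frac{1}{2\varepsilon} \int_{t-\varepsilon}^{t+\varepsilon} \tilde h_\varepsilon(v)\,dv. \tag{11}$$
With this definition, we clearly have
$$\tilde h_\varepsilon(t) \le h_0(t), \qquad 0 \le h_\varepsilon(t) \le h_0(t). \tag{12}$$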
Now, the function $h_\varepsilon$ defined in (11) is absolutely continuous and differentiable almost everywhere, with
$$|h'_\varepsilon(t)| \le \frac{C q(t)}{\varepsilon} \wedge \frac{q(t)}{I(t)},$$
and $h_\varepsilon(a) = h_\varepsilon(b) = 0$ for any ε > 0; of course, we define q outside [a, b] as identically zero. Due to the assumption (A2), the function H(ε) is finite and, moreover,
$$H(\varepsilon) \le \frac{C}{\varepsilon^2} \int \frac{q^2(t)}{q(t)}\,dt \le \frac{C}{\varepsilon^2}.$$
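The construction (11) and the properties (12) and $h_\varepsilon(a) = h_\varepsilon(b) = 0$ can be illustrated on a grid. The following is a minimal discrete sketch, not the construction of the paper itself: the interval [0, 1], the prior q ≡ 1 (so that $q_- = 1$) and the information I(t) = 1 + t are hypothetical choices, the function q/I is stored cell by cell, and ε is taken to be an integer number k of cells.

```python
def build_h_eps(h0_cells, q_minus, k, dx):
    """Discrete sketch of (11) on a uniform grid.

    h0_cells[j] approximates q/I on the j-th cell [a + j*dx, a + (j+1)*dx];
    cells outside [a, b] carry the value zero, as q vanishes there.
    The smoothing radius is eps = k * dx.
    Returns h_eps at the grid nodes a + i*dx, i = 0..len(h0_cells).
    """
    n = len(h0_cells)
    eps = k * dx

    def h0(j):
        return h0_cells[j] if 0 <= j < n else 0.0

    def h_tilde(j):
        # h_bar: minimum of h0 over the cells within distance eps of cell j
        h_bar = min(h0(i) for i in range(j - k, j + k + 1))
        # truncation at the level q_minus / eps
        return min(h_bar, q_minus / eps)

    # Average of h_tilde over the 2k cells that cover [t - eps, t + eps].
    return [sum(h_tilde(j) for j in range(i - k, i + k)) / (2 * k)
            for i in range(n + 1)]

# Hypothetical example: [a, b] = [0, 1], q = 1, I(t) = 1 + t.
n, k = 200, 5
dx = 1.0 / n
h0_cells = [1.0 / (1.0 + (j + 0.5) * dx) for j in range(n)]
h_eps = build_h_eps(h0_cells, q_minus=1.0, k=k, dx=dx)

integral_h0 = dx * sum(h0_cells)
integral_h_eps = dx * sum(h_eps)
print(h_eps[0], h_eps[-1], integral_h0, integral_h_eps)
```

In this sketch the smoothed function vanishes at both end points and stays below the cell values of q/I, while its integral is only slightly smaller than that of q/I; decreasing k·dx narrows the gap.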
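2. Let us show that
$$\tilde h_\varepsilon(t) \to h_0(t), \quad \varepsilon \downarrow 0 \quad \text{(a.e.)}. \tag{13}$$
For that, due to the Lebesgue dominated convergence theorem, it suffices to show that
$$\int_a^b (h_0(t) - \tilde h_\varepsilon(t))\,dt \downarrow 0, \quad \varepsilon \downarrow 0. \tag{14}$$
This follows similarly to [2, Proof of Theorem 30.5], where this hint is applied to the function q. We have, by virtue of the Riemann integrability condition and the theorem about Darboux integral sums from Calculus,
$$\sum_k \bar h_\delta(2k\delta)\,2\delta \to \int_a^b h_0(t)\,dt, \quad \delta \to 0,$$
$$\sum_k \bar h_\delta((2k+1)\delta)\,2\delta \to \int_a^b h_0(t)\,dt, \quad \delta \to 0.$$
Estimate the difference,
$$0 \le \sum_k \left(\bar h_\delta(2k\delta) - \tilde h_\delta(2k\delta)\right) 2\delta \le 2\delta \sum_k \bar h_\delta(2k\delta)\,1\!\left(\bar h_\delta(2k\delta) > q_-/(2\delta)\right).$$
However, since $h_0$ is Riemann integrable, it must be bounded on [a, b], and so is $\bar h_\delta \le h_0$. Since $\inf_{t \in [a,b]} q(t) > 0$, it follows from (A2) that $\tilde h_\delta \equiv \bar h_\delta$ if δ is small enough. Then, of course,
$$1\!\left(\bar h_\delta(2k\delta) > q_-/(2\delta)\right) = 0.$$
Therefore the sum $\sum_k \bar h_\delta(2k\delta)\,1(\bar h_\delta(2k\delta) > q_-/(2\delta))$ equals zero if δ is small enough. So,
$$0 \le \sum_k \left(\bar h_\delta(2k\delta) - \tilde h_\delta(2k\delta)\right) 2\delta \to 0, \quad \delta \to 0.$$
Similarly,
$$0 \le \sum_k \left(\bar h_\delta((2k+1)\delta) - \tilde h_\delta((2k+1)\delta)\right) 2\delta \to 0, \quad \delta \to 0.$$
Hence,
$$\int_a^b \tilde h_\varepsilon(t)\,dt \ \ge\ \left( \sum_k \tilde h_{2\varepsilon}(4k\varepsilon)\,2\varepsilon + \sum_k \tilde h_{2\varepsilon}((4k+2)\varepsilon)\,2\varepsilon \right) \to \int_a^b h_0(t)\,dt, \quad \varepsilon \to 0. \tag{15}$$
Since $\int \tilde h_\varepsilon \le \int h_0$, the latter convergence implies (14); hence, (13) holds true almost everywhere for a ≤ t ≤ b.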
3. Notice that $h_\varepsilon$ satisfies the assumptions of Proposition 1, being absolutely continuous (hence differentiable almost everywhere) and vanishing at a and b. So, with $\varepsilon = (Cn)^{-1/3}$, we get
$$n\,E(\theta^* - \theta)^2 \ \ge\ \frac{\left(\int_a^b h_\varepsilon(t)\,dt\right)^2}{\int_a^b I(t)\,h_\varepsilon(t)^2/q(t)\,dt + n^{-1/3}}. \tag{16}$$
Hence, to complete the proof, it suffices to establish
$$\int_a^b h_\varepsilon(t)\,dt \to \int_a^b h_0(t)\,dt, \tag{17}$$
and
$$\int_a^b I(t)\,h_\varepsilon(t)^2/q(t)\,dt \to \int_a^b q(t)/I(t)\,dt, \quad \varepsilon \to 0. \tag{18}$$
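4. We have
$$\begin{aligned}
0 &\le \int_a^b (h_0(t) - h_\varepsilon(t))\,dt
= \int_a^b \left( h_0(t) - \frac{1}{2\varepsilon} \int_{t-\varepsilon}^{t+\varepsilon} \tilde h_\varepsilon(v)\,dv \right) dt \\
&= \int_a^b \frac{1}{2\varepsilon} \int_{t-\varepsilon}^{t+\varepsilon} \left( h_0(t) - \tilde h_\varepsilon(v) \right) dv\,dt \\
&= \int_a^b \frac{1}{2\varepsilon} \int_{t-\varepsilon}^{t+\varepsilon} \left( h_0(t) - \tilde h_\varepsilon(t) \right) dv\,dt
+ \int_a^b \frac{1}{2\varepsilon} \int_{t-\varepsilon}^{t+\varepsilon} \left( \tilde h_\varepsilon(t) - \tilde h_\varepsilon(v) \right) dv\,dt.
\end{aligned}$$
Here,
$$\int_a^b \frac{1}{2\varepsilon} \int_{t-\varepsilon}^{t+\varepsilon} \left( h_0(t) - \tilde h_\varepsilon(t) \right) dv\,dt = \int_a^b \left( h_0(t) - \tilde h_\varepsilon(t) \right) dt \to 0, \quad \varepsilon \to 0,$$
due to (14). On the other hand,
$$\int_a^b \frac{1}{2\varepsilon} \int_{t-\varepsilon}^{t+\varepsilon} \left( \tilde h_\varepsilon(t) - \tilde h_\varepsilon(v) \right) dv\,dt
= \int \tilde h_\varepsilon(t)\,dt - \int \tilde h_\varepsilon(v) \left( \frac{1}{2\varepsilon} \int_{v-\varepsilon}^{v+\varepsilon} 1\,dt \right) dv = 0.$$
Thus, indeed, (17) holds true.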
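5. Further, by virtue of (12) and (17), we also have
$$\begin{aligned}
0 &\le \int \frac{I(t)}{q(t)} \left( h_0^2(t) - h_\varepsilon^2(t) \right) dt \\
&= \int h_0^{-1}(t)\,(h_0(t) - h_\varepsilon(t))(h_0(t) + h_\varepsilon(t))\,dt \\
&\le 2 \int h_0^{-1}(t)\,(h_0(t) - h_\varepsilon(t))\,h_0(t)\,dt \\
&= 2 \int (h_0(t) - h_\varepsilon(t))\,dt \to 0, \quad \varepsilon \to 0.
\end{aligned}$$
This is (18), since $I h_0^2/q = q/I$. Whence, from (16), (17) and (18), the desired inequality (6) follows. Q.E.D.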
Remark 4. Here is a simple example showing that the sum $\sum_k q_\delta(2k\delta)\,2\delta$ considered in [2] may decrease when δ decreases. Let a = 0, b = 5, $\delta_1 = 1$, $\delta_2 = 3/2$, and $q(t) = 1(1 < t < 5) \times (1/4)$. Then, clearly, $\delta_1 < \delta_2$, but the sum with $\delta_2$ is not less than that with $\delta_1$: indeed,
$$\sum_k q_{\delta_1}(2k\delta_1)\,2\delta_1 = 0 < 3/4 = \sum_k q_{\delta_2}(2k\delta_2)\,2\delta_2.$$
This does not change the rest of the proof in [2, Theorem 30.5].
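The two sums in Remark 4 can be reproduced with a few lines of code. The sketch below rests on two assumptions that the text does not spell out: $q_\delta(t)$ is read as the minimum of q over $[t - \delta, t + \delta]$, by analogy with $\bar h_\varepsilon$ in the proof of Theorem 1, and the sum runs over those k for which the node 2kδ lies in [a, b]. The minimum is approximated on a finite grid that contains both end points of the window.

```python
def q(t):
    """Prior from Remark 4: 1/4 on the open interval (1, 5), zero elsewhere."""
    return 0.25 if 1.0 < t < 5.0 else 0.0

def q_delta(t, delta, grid=1000):
    """Grid approximation of the minimum of q over [t - delta, t + delta]."""
    lo, hi = t - delta, t + delta
    points = [lo + (hi - lo) * i / grid for i in range(grid)] + [hi]
    return min(q(s) for s in points)

def lower_sum(delta, a=0.0, b=5.0):
    """Sum of q_delta(2 k delta) * 2 delta over the nodes 2 k delta in [a, b]."""
    total, k = 0.0, 0
    while 2 * k * delta <= b:
        node = 2 * k * delta
        if node >= a:
            total += q_delta(node, delta) * 2 * delta
        k += 1
    return total

sum_delta_1 = lower_sum(1.0)   # nodes 0, 2, 4
sum_delta_2 = lower_sum(1.5)   # nodes 0, 3
print(sum_delta_1, sum_delta_2)
```

With δ = 1 every window touches a point where q vanishes, so the sum is zero; with δ = 3/2 the window around the node 3 lies strictly inside (1, 5) and contributes (1/4) × 3.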
Remark 5. More important is that for the construction used in [2], which is slightly different from ours, the inequality $h_\varepsilon \le h_0$ may fail; there is simply no reason why it should be guaranteed. This inequality, which ensures the applicability of Lebesgue’s dominated convergence theorem, might be replaced by $h_\varepsilon \le C h_0$ under additional suitable conditions on q and/or I, for example, under $0 < \inf_{a \le t \le b} I(t) \le \sup_{a \le t \le b} I(t) < \infty$ (cf. (A2)). However, as was explained above, we do not pursue this goal.
4. Proof of Lemma 1
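We have, by the assumptions,
$$\liminf_{n\to\infty} n \int E_\theta(\theta^* - \theta)^2\, \tilde q_m(\theta)\,d\theta \ \ge\ \int \frac{\tilde q_m}{I}(\theta)\,d\theta.$$
From the monotone convergence theorem it follows that
$$\kappa_m := \int q_m \uparrow 1.$$
Again by the monotone convergence theorem,
$$\int \frac{q_m}{I}\,dt \uparrow \int \frac{q}{I}\,dt = J.$$
Thus, we get
$$\begin{aligned}
\liminf_{n\to\infty} n \int E_\theta(\theta^* - \theta)^2\, q(\theta)\,d\theta
&\ge \liminf_{n} n \int E_\theta(\theta^* - \theta)^2\, q_m(\theta)\,d\theta \\
&= \kappa_m \liminf_{n} n \int E_\theta(\theta^* - \theta)^2\, \tilde q_m(\theta)\,d\theta
\ \ge\ \kappa_m \int \frac{\tilde q_m(\theta)}{I(\theta)}\,d\theta = \int \frac{q_m(\theta)}{I(\theta)}\,d\theta.
\end{aligned}$$
Since the left hand side here does not depend on m, we deduce that
$$\liminf_{n\to\infty} n \int E_\theta(\theta^* - \theta)^2\, q(\theta)\,d\theta \ \ge\ \limsup_{m\to\infty} \int \frac{q_m(\theta)}{I(\theta)}\,d\theta = J.$$
Q.E.D.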
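The two monotone limits used in this proof, $\kappa_m \uparrow 1$ and $\int q_m/I \uparrow J$, can be observed numerically. The sketch below uses hypothetical ingredients that are not taken from the paper: a uniform prior on (0, 1), the function $I(t) = 1 + t^2$, and a sequence $q_m$ obtained by cutting off the prior near the end points, which is one possible increasing approximation.

```python
import math

def midpoint_integral(func, lo, hi, steps=20000):
    """Approximate the integral of func over [lo, hi] by the midpoint rule."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

a, b = 0.0, 1.0

def q(t):
    """Hypothetical prior: the uniform density on (a, b)."""
    return 1.0

def fisher_information(t):
    """Hypothetical information function, chosen only for illustration."""
    return 1.0 + t * t

def q_m(t, m):
    """Prior cut off near the end points; increases to q as m grows."""
    return q(t) if a + 1.0 / m < t < b - 1.0 / m else 0.0

J = midpoint_integral(lambda t: q(t) / fisher_information(t), a, b)
ms = [4, 8, 16, 32, 64]
kappas = [midpoint_integral(lambda t, m=m: q_m(t, m), a, b) for m in ms]
J_ms = [midpoint_integral(lambda t, m=m: q_m(t, m) / fisher_information(t), a, b)
        for m in ms]

for m, kappa, J_m in zip(ms, kappas, J_ms):
    print(f"m = {m:3d}  kappa_m = {kappa:.4f}  integral of q_m / I = {J_m:.4f}")
print(f"J = {J:.4f}")
```

For this choice J equals π/4, and both printed columns increase with m towards 1 and J respectively.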
5. Proof of Theorem 2
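1. We will construct approximations $q_m$ suitable for using Lemma 1. Let
$$q_m(t) := q(t)\,1(a + 1/m < t < b - 1/m), \quad m \ge 1.$$
Then $0 \le q_m(t) \uparrow q(t)$, $m \to \infty$. Denote
$$\kappa_m = \int q_m, \qquad \tilde q_m = q_m/\kappa_m.$$
To prove the Theorem, it suffices to show that for every m,
$$\liminf_{n\to\infty} n \int E_\theta(\theta^* - \theta)^2\, \tilde q_m(\theta)\,d\theta \ \ge\ \int \frac{\tilde q_m(t)}{I(t)}\,dt. \tag{19}$$
Denote
$$S_m := \operatorname{supp}(q_m) = [a + 1/m,\ b - 1/m],$$
and
$$h_{0,m}(t) := \frac{\tilde q_m(t)}{I(t)},$$
and consider the following continuous piecewise linear function $\varphi = \varphi_{\varepsilon,m}$, with $\varepsilon < (b - a - 1/m)/2$ and $m > 2/(b - a)$,
$$\varphi_{\varepsilon,m}(t) = \begin{cases}
a + \dfrac{1/m + \varepsilon}{1/m}\,(t - a), & a \le t \le a + 1/m, \\[2mm]
\varepsilon\,\dfrac{a + b}{b - a - 2/m} + t \left( 1 - \dfrac{2\varepsilon}{b - a - 2/m} \right), & a + 1/m \le t \le b - 1/m, \\[2mm]
b + \dfrac{1/m + \varepsilon}{1/m}\,(t - b), & b - 1/m \le t \le b.
\end{cases}$$
Notice that
$$\varphi(a) = a, \quad \varphi(a + 1/m) = a + 1/m + \varepsilon, \quad \varphi(b - 1/m) = b - 1/m - \varepsilon, \quad \varphi(b) = b,$$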
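$$0 < C^{-1} \le \varphi'_{\varepsilon,m} \le C < \infty, \qquad \sup_t \left| \varphi'_{\varepsilon,m}(t) - 1 \right| \to 0, \quad \varepsilon \to 0,$$
$$\sup_t |\varphi_{\varepsilon,m}(t) - t| \to 0, \quad \varepsilon \to 0,$$
and
$$\tilde q_m(a + 1/m-) = \tilde q_m(b - 1/m+) = 0.$$
In particular, it follows that
$$\sup_v \left| 1 - \frac{1}{2\varepsilon} \int_{\varphi^{-1}_{\varepsilon,m}(v-\varepsilon)}^{\varphi^{-1}_{\varepsilon,m}(v+\varepsilon)} dt \right| \to 0, \quad \varepsilon \to 0. \tag{20}$$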
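The node values and the closeness of $\varphi_{\varepsilon,m}$ to the identity can be verified directly. The following is a small sketch of the three linear pieces, with hypothetical numerical values a = 0, b = 1, m = 10 and ε = 0.01 chosen only for illustration; the helper name `make_phi` is not from the paper.

```python
def make_phi(a, b, m, eps):
    """Return the continuous piecewise linear map phi_{eps,m} on [a, b]."""
    width = b - a - 2.0 / m            # length of the middle interval
    edge_slope = (1.0 / m + eps) * m   # slope (1/m + eps) / (1/m) near a and b

    def phi(t):
        if t <= a + 1.0 / m:
            return a + edge_slope * (t - a)
        if t <= b - 1.0 / m:
            return eps * (a + b) / width + t * (1.0 - 2.0 * eps / width)
        return b + edge_slope * (t - b)

    return phi

# Hypothetical values, chosen only for illustration.
a, b, m, eps = 0.0, 1.0, 10, 0.01
phi = make_phi(a, b, m, eps)

nodes = [a, a + 1.0 / m, b - 1.0 / m, b]
expected = [a, a + 1.0 / m + eps, b - 1.0 / m - eps, b]
for t, value in zip(nodes, expected):
    print(f"phi({t:.2f}) = {phi(t):.4f}  (expected {value:.4f})")

# Largest deviation of phi from the identity on a fine grid of [a, b].
grid = [a + (b - a) * i / 10000 for i in range(10001)]
max_deviation = max(abs(phi(t) - t) for t in grid)
print(f"max |phi(t) - t| = {max_deviation:.4f}")
```

Since each piece is linear and the deviation from the identity at the four nodes is 0, ε, −ε and 0, the largest deviation on [a, b] is ε, which is consistent with the uniform convergence stated above.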
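Let
$$h_{\varepsilon,m}(t) := \frac{1}{2\varepsilon} \int_{\varphi_{\varepsilon,m}(t)-\varepsilon}^{\varphi_{\varepsilon,m}(t)+\varepsilon} h_{0,m}(v)\,dv.$$
Then there exists C such that for every ε small enough and every m large enough,
$$|h'_{\varepsilon,m}(t)| \le C/\varepsilon. \tag{21}$$
By virtue of Proposition 1 with this function as $h_\varepsilon$, we get the following,
$$n \int E_\theta(\theta^* - \theta)^2\, \tilde q_m(\theta)\,d\theta \ \ge\ \frac{\left(\int_a^b h_{\varepsilon,m}(t)\,dt\right)^2}{\int_a^b I(t)\,h^2_{\varepsilon,m}(t)/q(t)\,dt + (1/n)\int_a^b (h'_{\varepsilon,m}(t))^2/q(t)\,dt}. \tag{22}$$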
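2. Let us show that
$$\int_a^b h_{\varepsilon,m}(t)\,dt \to \int_a^b h_{0,m}(t)\,dt, \tag{23}$$
and
$$\int_{a+1/m}^{b-1/m} I(t)\,h^2_{\varepsilon,m}(t)/\tilde q_m(t)\,dt \to \int_{a+1/m}^{b-1/m} \tilde q_m(t)/I(t)\,dt, \quad \varepsilon \to 0. \tag{24}$$
To show (23), we simply notice that
$$\begin{aligned}
\int h_{\varepsilon,m}(t)\,dt &= \int dt\,\frac{1}{2\varepsilon} \int_{\varphi_{\varepsilon,m}(t)-\varepsilon}^{\varphi_{\varepsilon,m}(t)+\varepsilon} h_{0,m}(v)\,dv \\
&= \int h_{0,m}(v)\,dv\,\frac{1}{2\varepsilon} \int_{\varphi^{-1}_{\varepsilon,m}(v-\varepsilon)}^{\varphi^{-1}_{\varepsilon,m}(v+\varepsilon)} dt \to \int h_{0,m}\,dt, \quad \varepsilon \to 0,
\end{aligned}$$
due to the Lebesgue dominated convergence theorem and (20). To show (24), we notice that
$$\int \frac{I(t)}{q_m(t)} \left( h_{\varepsilon,m}(t)^2 - h_{0,m}(t)^2 \right) dt = \int \frac{I(t)}{q_m(t)}\,(h_{\varepsilon,m}(t) - h_{0,m}(t))\,(h_{\varepsilon,m}(t) + h_{0,m}(t))\,dt.$$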
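Since the terms $I(t)/q_m(t)$ and $(h_{\varepsilon,m}(t) + h_{0,m}(t))$ are uniformly bounded on $S_m$, it suffices to establish
$$\int |h_{\varepsilon,m}(t) - h_{0,m}(t)|\,dt \to 0, \quad \varepsilon \to 0. \tag{25}$$
Let δ > 0 be any positive value, and let us approximate the function $h_{0,m}$ in $L_1[a, b]$ by some continuous function $h^\delta_{0,m}$ so that
$$\int |h_{0,m} - h^\delta_{0,m}| < \delta.$$
Then, denoting
$$h^\delta_{\varepsilon,m}(t) = \frac{1}{2\varepsilon} \int_{\varphi_{\varepsilon,m}(t)-\varepsilon}^{\varphi_{\varepsilon,m}(t)+\varepsilon} h^\delta_{0,m}(v)\,dv,$$
we get
$$\begin{aligned}
\int |h_{\varepsilon,m} - h^\delta_{\varepsilon,m}|(t)\,dt
&= \int \left| \frac{1}{2\varepsilon} \int_{\varphi_{\varepsilon,m}(t)-\varepsilon}^{\varphi_{\varepsilon,m}(t)+\varepsilon} \left( h_{0,m}(v) - h^\delta_{0,m}(v) \right) dv \right| dt \\
&\le \int \frac{1}{2\varepsilon} \int_{\varphi_{\varepsilon,m}(t)-\varepsilon}^{\varphi_{\varepsilon,m}(t)+\varepsilon} \left| h_{0,m}(v) - h^\delta_{0,m}(v) \right| dv\,dt \\
&= \int dv \left| h_{0,m}(v) - h^\delta_{0,m}(v) \right| \frac{1}{2\varepsilon} \int_{\varphi^{-1}_{\varepsilon,m}(v-\varepsilon)}^{\varphi^{-1}_{\varepsilon,m}(v+\varepsilon)} dt \ \le\ C\delta.
\end{aligned}$$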
Hence,
$$\begin{aligned}
\int |h_{\varepsilon,m}(t) - h_{0,m}(t)|\,dt
&\le \int \left| h_{\varepsilon,m}(t) - h^\delta_{\varepsilon,m}(t) \right| dt + \int \left| h_{0,m}(t) - h^\delta_{0,m}(t) \right| dt + \int \left| h^\delta_{\varepsilon,m}(t) - h^\delta_{0,m}(t) \right| dt \\
&\le C\delta + \int \left| h^\delta_{\varepsilon,m}(t) - h^\delta_{0,m}(t) \right| dt.
\end{aligned}$$
For every fixed δ > 0, the latter integral tends to zero as ε → 0, because the function $h^\delta_{0,m}$ is uniformly continuous and, hence, $\sup_x |h^\delta_{\varepsilon,m} - h^\delta_{0,m}|(x) \to 0$, ε → 0. Therefore, for every δ > 0,
$$\limsup_{\varepsilon \to 0} \int |h_{\varepsilon,m} - h_{0,m}|(t)\,dt \le C\delta;$$
however, the left hand side does not depend on δ; hence, (25) holds true, which implies (24). From (23), (24) and (22) we deduce (19), which, finally, implies the desired asymptotic inequality (6) by virtue of Lemma 1. Q.E.D.
6. Acknowledgements
The author acknowledges support from Leeds University grants for conferences on teaching in higher education and from the grant RFBR 05-01-00449 (Russia). The author is grateful to Mrs Hélène Schützenberger, who has kindly provided a copy of the paper [10] by her father.
Bibliography
1. Bobrovsky, B. Z., Zakai, M. A lower bound on the estimation error for
certain diffusion processes, IEEE Trans. Inform. Theory. IT-22, (1976),
45–52.
2. Borovkov, A. A. Mathematical statistics, Gordon and Breach Science Publishers, Amsterdam, (1998).
3. Borovkov, A. A., Sakhanenko, A. I. Estimates for averaged quadratic risk,
(Russian) Probab. Math. Statist. 1, (1980), no. 2, 185-195 (1981).
4. Feller, W. An introduction to probability theory and its applications, Vol.
2., John Wiley & Sons, New York-London-Sydney, (1966).
5. Gill, R. D., Levit, B. Y. Applications of the Van Trees inequality: a
Bayesian Cramér-Rao bound, Bernoulli, 1, (1995), no. 1-2, 59–79.
6. Ibragimov, I. A.; Hasminskii, R. Z. Statistical estimation. Asymptotic
theory, Springer, New York - Berlin, (1981).
7. Nazin, A. V. Information approach to optimization problems and adaptive
control of discrete stochastic systems, DrSc Thesis, Moscow, Inst. Control
Sci., 1995.
8. Prakasa Rao, B. L. S. On Cramér-Rao type integral inequalities, Calcutta
Statist. Assoc. Bull. 40, (1990/91), no. 157-160, 183–205.
9. Schützenberger, M. P. A generalization of the Frechet–Cramér inequality
to the case of Bayes estimation, Bull. Amer. Math. Soc., 63, (1957), 142.
10. Schützenberger, M. P. A propos de l’inégalité de Fréchet – Cramér, Publ.
Inst. Statist. Univ. Paris 7, no. 3/4 (1958), 3–6.
11. Shemyakin, A. E. Rao–Cramér type integral inequalities for estimates of a
vector parameter, Theory Probab. Appl., 32, (1987), 426–434; translation
from Teor. Veroyatn. Primen. 32(3), (1987), 469–477.
12. Van Trees, H. Detection, Estimation and Modulation Theory, Vol. I., Wiley,
New York, (1968).
13. Veretennikov, A. Yu. Parametric and non-parametric estimation for Markov chains, Lecture notes, Moscow, Publ. Centre Appl. Research, Mech. & Math. Faculty, Moscow State Univ., (2000).
14. Weiss, A. J., Weinstein, E. A lower bound on the mean-square error in random parameter estimation, IEEE Trans. Inf. Theory, 31, (1985), 680–682.
School of Mathematics, University of Leeds, LS2 9JT, Leeds, UK,
& Institute of Information Transmission Problems, Moscow, 127994,
GSP-4, Russia.
E-mail address: a.veretennikov@leeds.ac.uk