Orthogonal Rational Functions on the Unit Circle with Prescribed Poles not on the Unit Circle
Orthogonal rational functions (ORF) on the unit circle generalize orthogonal polynomials (poles at infinity) and Laurent polynomials (poles at zero and infinity). In this paper we investigate the properties of and the relation between these ORF when the poles are all outside or all inside the unit disk, or when they can be anywhere in the extended complex plane outside the unit circle. Some properties of matrices that are the product of elementary unitary transformations will be proved and some connections with related algorithms for direct and inverse eigenvalue problems will be explained.

Date: 2017
Authors: A. Bultheel, R. Cruz-Barroso, A. Lasarow
Format: Article
Language: English
Publisher: Інститут математики НАН України (Institute of Mathematics of the NAS of Ukraine), 2017
Series: Symmetry, Integrability and Geometry: Methods and Applications
ISSN: 1815-0659
Online access: http://dspace.nbuv.gov.ua/handle/123456789/149239
Journal: Digital Library of Periodicals of National Academy of Sciences of Ukraine
Citation: Orthogonal Rational Functions on the Unit Circle with Prescribed Poles not on the Unit Circle / A. Bultheel, R. Cruz-Barroso, A. Lasarow // Symmetry, Integrability and Geometry: Methods and Applications. — 2017. — Vol. 13. — Bibliography: 47 titles. — English.
Symmetry, Integrability and Geometry: Methods and Applications SIGMA 13 (2017), 090, 49 pages
Orthogonal Rational Functions on the Unit Circle
with Prescribed Poles not on the Unit Circle
Adhemar BULTHEEL †, Ruyman CRUZ-BARROSO ‡ and Andreas LASAROW §
† Department of Computer Science, KU Leuven, Belgium
E-mail: adhemar.bultheel@cs.kuleuven.be
URL: https://people.cs.kuleuven.be/~adhemar.bultheel/
‡ Department of Mathematical Analysis, La Laguna University, Tenerife, Spain
E-mail: rcruzb@ull.es
§ Fak. Informatik, Mathematik & Naturwissenschaften, HTWK Leipzig, Germany
E-mail: lasarow@imn.htwk-leipzig.de
Received August 01, 2017, in final form November 20, 2017; Published online December 03, 2017
https://doi.org/10.3842/SIGMA.2017.090
Abstract. Orthogonal rational functions (ORF) on the unit circle generalize orthogonal
polynomials (poles at infinity) and Laurent polynomials (poles at zero and infinity). In this
paper we investigate the properties of and the relation between these ORF when the poles
are all outside or all inside the unit disk, or when they can be anywhere in the extended
complex plane outside the unit circle. Some properties of matrices that are the product
of elementary unitary transformations will be proved and some connections with related
algorithms for direct and inverse eigenvalue problems will be explained.
Key words: orthogonal rational functions; rational Szegő quadrature; spectral method; rational Krylov method; AMPD matrix
2010 Mathematics Subject Classification: 30D15; 30E05; 42C05; 44A60
1 Introduction
Orthogonal rational functions (ORF) on the unit circle are well known as generalizations of
orthogonal polynomials on the unit circle (OPUC). The pole at infinity of the polynomials is
replaced by poles “in the neighborhood” of infinity, i.e., poles outside the closed unit disk. The
recurrence relations for the ORF generalize the Szegő recurrence relations for the polynomials.
If $\mu$ is the orthogonality measure supported on the unit circle and $L^2_\mu$ the corresponding Hilbert space, then the shift operator $T_\mu\colon L^2_\mu\to L^2_\mu\colon f(z)\mapsto zf(z)$, restricted to the polynomials, has a representation with respect to the orthogonal polynomials that is a Hessenberg matrix. However, if instead of a polynomial basis one uses a basis of orthogonal Laurent polynomials, obtained by alternating between poles at infinity and poles at the origin, then a full unitary representation of $T_\mu$ with respect to this basis is a five-diagonal CMV matrix [12].
The previous ideas have been generalized to the rational case by Velázquez in [47]. He
showed that the representation of the shift operator with respect to the classical ORF is not
a Hessenberg matrix but a matrix Möbius transform of a Hessenberg matrix. However, a full
unitary representation can be obtained if the shift is represented with respect to a rational
analog of the Laurent polynomials by alternating between a pole inside and a pole outside the
unit disk. The resulting matrix is a matrix Möbius transform of a five-diagonal matrix.
This paper is a contribution to the Special Issue on Orthogonal Polynomials, Special Functions and Applications (OPSFA14). The full collection is available at https://www.emis.de/journals/SIGMA/OPSFA2017.html
Orthogonal Laurent polynomials on the real line, a half-line, or an interval were introduced
by Jones et al. [31, 32] in the context of moment problems, Padé approximation and quadrature
and this was elaborated by many authors. González-Vera and his co-workers were in particular involved in extending the theory where the poles zero and infinity alternate (the so-called
balanced situation) to a more general case where in each step either infinity or zero can be
chosen as a pole in any arbitrary order [8, 20]. They also identify the resulting orthogonal
Laurent polynomials as shifted versions of the orthogonal polynomials. Hence the orthogonal
Laurent polynomials satisfy the same recurrence as the classical orthogonal polynomials after
an appropriate shifting and normalization is embedded.
The corresponding case of orthogonal Laurent polynomials on the unit circle was introduced
by Thron in [40] and has been studied more recently in, for example, [15, 18]. Papers traditionally deal with the balanced situation, as in [18], but in [15] an arbitrary ordering was also considered. It was only in [16] that Cruz-Barroso and Delvaux investigated the structure of the matrix representation with respect to the basis of the resulting orthogonal Laurent polynomials on the circle. They called it a "snake-shaped" matrix, which generalizes the five-diagonal matrix.
The purpose of this paper is to generalize these ideas, valid for Laurent polynomials on the circle, to the rational case. That is, we choose the poles of the ORF in an arbitrary order, either inside or outside the unit disk. We relate the resulting ORF to the ORF having all their poles outside or all their poles inside the disk, and study the corresponding recurrence relations. With
respect to this new orthogonal rational basis, the shift operator will be represented by a matrix
Möbius transformation of a snake-shaped matrix.
In the papers by Lasarow and coworkers (e.g., [23, 24, 25, 34]) matrix versions of the ORF
are considered. In these papers also an arbitrary choice of the poles is allowed but only with
the restrictive condition that if α is used as a pole, then 1/α cannot be used anymore. This
means that for example the “balanced situation” is excluded. One of the goals of this paper is
to remove this restriction on the poles.
In the context of quadrature formulas, an arbitrary sequence of poles not on the unit circle was also briefly discussed in [19]. The sequence of poles considered there need not be Newtonian, i.e., the poles for the ORF of degree n may depend on n. Since our approach will emphasize the role of the recurrence relation for the ORF, we do need a Newtonian sequence, although some of the results may be generalizable to the situation of a non-Newtonian sequence of poles.
One of the applications of the theory of ORF is the construction of quadrature formulas on
the unit circle that form rational generalizations of the Szegő quadrature. They are exact in
spaces of rational functions having poles inside and outside the unit disk. The nodes of the
quadrature formula are zeros of para-orthogonal rational functions (PORF) and the weights are
all positive numbers. These nodes and weights can (like in Gaussian quadrature) be derived from
the eigenvalue decomposition of a unitary truncation of the shift operator to a finite-dimensional
subspace. One of the results of the paper is that there is no gain in considering an arbitrary sequence of poles inside and outside the unit disk, except in a balanced situation. When all the poles are chosen outside the closed unit disk, or when some of them are reflected in the circle, the same quadrature formula will be obtained. The computational effort in the general case does not increase, but neither does it reduce the cost.
In network applications or differential equations one often has to work with functions of large sparse matrices. If $A$ is a matrix and the matrix function $f(A)$ allows the Cauchy representation
$$f(A)=\int_\Gamma f(z)(z-A)^{-1}\,d\mu(z),$$
where $\Gamma$ is a contour encircling all the eigenvalues of $A$, then numerical quadrature is a possible technique to obtain an approximation for $f(A)$. If for example $\Gamma$ is the unit circle, then expressions like $u^*f(A)u$ for some vector $u$ can be approximated by quadrature formulas discussed in this paper, which will be implemented disguised as Krylov subspace methods (see for example [27, 29, 33]).
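For $\Gamma$ the unit circle, the contour integral discretizes into the trapezoidal rule, which converges geometrically for integrands analytic in a neighborhood of $\mathbb T$. The sketch below is an illustrative toy in plain Python (our own helper name, not the rational Krylov machinery of [27, 29, 33]); the scalar value `a` stands in for an eigenvalue of $A$ inside the contour:

```python
import cmath

def cauchy_quadrature(f, a, n=32):
    """Approximate f(a) = (1/(2*pi*i)) * contour integral of f(z)/(z - a) dz
    over |z| = 1 by the n-point trapezoidal rule; |a| < 1 is required so that
    the pole lies inside the contour."""
    total = 0j
    for k in range(n):
        xi = cmath.exp(2j * cmath.pi * k / n)  # node on the unit circle
        # dz = i*xi*dtheta, and the factor i cancels against 1/(2*pi*i)
        total += f(xi) * xi / (xi - a)
    return total / n
```

With `f = cmath.exp` and `a = 0.3` the error decays roughly like `|a|**n`, so a few dozen nodes already give near machine precision.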
The purpose of the paper though is not to discuss quadrature in particular. It is just an
example application that does not require much extra introduction of new terminology and
notation. The main purpose however is to give a general framework on which to build for the
many applications of ORFs. Just like orthogonal polynomials are used in about every branch
of mathematics, ORFs can be used with the extra freedom to exploit the location of the poles.
For example, it can be shown that the ORFs can be used to solve multipoint moment problems
as well as more general rational interpolation problems where locations of the poles inside and
outside the circle are important for the engineering applications like system identification, model
reduction, filtering, etc. When modelling the transfer function of a linear system, poles should
be chosen inside as well as outside the disk to guarantee that the transient as well as the steady
state of the system is well modelled. It would lead us too far to also include the interpolation
properties of multipoint Padé approximation and the related applications in several branches of
engineering. We only provide the basics in this paper so that it can be used in the context of
more applied papers.
The interpretation of the recursion for the ORFs as a factorization of a matrix into elementary
unitary transformations illustrates that the spectrum of the resulting matrix is independent of
the order in which the elementary factors are multiplied. As far as we know, this fact was
previously unknown in the linear algebra community, except in particular cases such as unitary Hessenberg matrices. As an illustration, we develop some preliminary results in Section 11 in
a linear algebra setting that is slightly more general than the ORF situation.
In the last decades, many papers have appeared on inverse eigenvalue problems for unitary Hessenberg matrices and rational Krylov methods. Some examples are [4, 30, 35, 36, 37, 38, 44].
These use elementary operations that are very closely related to the recurrence that will be
discussed in this paper. However, they are not the same and often miss the flexibility discussed
here. We shall illustrate some of these connections with certain algorithms from the literature
in Section 12.
The outline of the paper is as follows. In Section 2 we introduce the main notations used in
this paper. The linear spaces and the ORF bases are given in Section 3. Section 4 brings the
Christoffel–Darboux relations and the reproducing kernels which form an essential element to
obtain the recurrence relation given in Section 5 but also for the PORF in Section 6 to be used for
quadrature formulas in Section 7. The alternative representation of the shift operator is given in
Section 8 and its factorization in elementary 2×2 blocks in the subsequent Section 9. We end by
drawing some conclusions about the spectrum of the shift operator and about the computation
of rational Szegő quadrature formulas in Section 10. The ideas that we present in this paper, especially the factorization of unitary Hessenberg matrices into elementary unitary factors, are also used in the linear algebra literature, mostly in the finite-dimensional situation. These elementary factors, and what can be said about the spectrum of their product, are the subject of Section 11.
These elementary unitary transformations are intensively used in numerical algorithms such as Arnoldi-based Krylov methods, where they are known as core transformations. Several variants of these rational Krylov methods exist. The algorithms are quite similar to, yet different from, our ORF recursion, as we explain briefly in Section 12, illustrating why we believe the version presented in this paper has clear advantages.
2 Basic definitions and notation
We use the following notation. $\mathbb C$ denotes the complex plane, $\widehat{\mathbb C}$ the extended complex plane (one-point compactification), $\mathbb R$ the real line, $\widehat{\mathbb R}$ the closure of $\mathbb R$ in $\widehat{\mathbb C}$, $\mathbb T$ the unit circle, $\mathbb D$ the open unit disk, $\widehat{\mathbb D}=\mathbb D\cup\mathbb T$, and $\mathbb E=\widehat{\mathbb C}\setminus\widehat{\mathbb D}$. For any number $z\in\widehat{\mathbb C}$ we define $z_*=1/\overline z$ (and set $1/0=\infty$, $1/\infty=0$), and for any complex function $f$ we define $f_*(z)=\overline{f(z_*)}$.
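In code, the substar conjugate is a one-liner, and for a polynomial $p_n(z)=\sum_k a_kz^k$ the reciprocal $p_n^*(z)=z^np_{n*}(z)$ amounts to reversing and conjugating the coefficient vector. A small sketch (the helper names are ours, not the paper's):

```python
def substar(f):
    """f_*(z) = conj(f(1/conj(z))); on the unit circle this reduces to conj(f(z))."""
    return lambda z: f(1 / z.conjugate()).conjugate()

def reciprocal_coeffs(coeffs):
    """Coefficients of p*(z) = z^n p_*(z) for p(z) = sum(coeffs[k] * z**k):
    the list reversed, each entry conjugated."""
    return [c.conjugate() for c in reversed(coeffs)]
```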
To approximate an integral
$$I_\mu(f)=\int_{\mathbb T}f(z)\,d\mu(z),$$
where $\mu$ is a probability measure on $\mathbb T$, one may use Szegő quadrature formulas. The nodes of this quadrature can be computed by using the Szegő polynomials. Orthogonality in this paper will always be with respect to the inner product
$$\langle f,g\rangle=\int_{\mathbb T}\overline{f(z)}\,g(z)\,d\mu(z).$$
The weights of the $n$-point quadrature are all positive, the nodes are on $\mathbb T$, and the formula is exact for all Laurent polynomials $f\in\operatorname{span}\{z^k\colon|k|\le n-1\}$.
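For the normalized Lebesgue measure $d\mu=d\theta/2\pi$ the Szegő polynomials are the monomials, and one admissible $n$-point Szegő rule takes the $n$-th roots of a unimodular $\tau$ as nodes with equal weights $1/n$. The sketch below (plain Python, illustrative only) exhibits the stated exactness on the Laurent monomials $z^k$, $|k|\le n-1$:

```python
import cmath

def szego_rule_lebesgue(n, tau=1 + 0j):
    """Nodes: the n solutions of z**n = tau (tau on the unit circle);
    weights: all equal to 1/n."""
    nodes = [cmath.exp(1j * (cmath.phase(tau) + 2 * cmath.pi * k) / n)
             for k in range(n)]
    weights = [1.0 / n] * n
    return nodes, weights

def quad(f, nodes, weights):
    """Apply the quadrature rule to f."""
    return sum(w * f(x) for x, w in zip(nodes, weights))
```

Since the nodes are $n$-th roots of $\tau$, the node sums of $z^k$ vanish for $1\le|k|\le n-1$ and equal 1 for $k=0$, matching $I_\mu(z^k)$.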
This has been generalized to rational functions with a set of predefined poles. The corresponding quadrature formulas are then rational Szegő quadratures. This has been discussed in many papers and some of the earlier results were summarized in the book [9]. We briefly recall
some of the results that are derived there. The idea is the following. Fix a sequence $\alpha=(\alpha_k)_{k\in\mathbb N}$ with $\{\alpha_k\}_{k\in\mathbb N}\subset\mathbb D$, and consider the subspaces of rational functions defined by
$$\mathcal L_0=\mathbb C,\qquad \mathcal L_n=\left\{\frac{p_n(z)}{\pi_n(z)}\colon p_n\in\mathcal P_n,\ \pi_n(z)=\prod_{k=1}^n\bigl(1-\overline\alpha_kz\bigr)\right\},\qquad n\ge1,$$
where $\mathcal P_n$ is the set of polynomials of degree at most $n$. These rational functions have their poles among the points in $\alpha_*=\{\alpha_{j*}=1/\overline\alpha_j\colon\alpha_j\in\alpha\}$. We denote the corresponding sequence as $\alpha_*=(\alpha_{j*})_{j\in\mathbb N}$. Let $\phi_n\in\mathcal L_n\setminus\mathcal L_{n-1}$, with $\phi_n\perp\mathcal L_{n-1}$, be the $n$th orthogonal rational basis
function (ORF) in a nested sequence. It is well known that these functions have all their zeros
in D (see, e.g., [9, Corollary 3.1.4]). However, the quadrature formulas we have in mind should
have their nodes on the circle T. Therefore, para-orthogonal rational functions (PORF) are
introduced. They are defined by
$$Q_n(z,\tau)=\phi_n(z)+\tau\phi_n^*(z),\qquad\tau\in\mathbb T,$$
where besides the ORF $\phi_n(z)=\frac{p_n(z)}{\pi_n(z)}$, also the "reciprocal" function $\phi_n^*(z)=\frac{p_n^*(z)}{\pi_n(z)}=\frac{z^np_{n*}(z)}{\pi_n(z)}$ is introduced. These PORF have $n$ simple zeros $\{\xi_{nk}\}_{k=1}^n\subset\mathbb T$ (see, e.g., [9, Theorem 5.2.1]), so that they can be used as nodes for the quadrature formulas
$$I_n(f)=\sum_{k=1}^nw_{nk}f(\xi_{nk}),$$
and the weights are all positive, given by $w_{nk}=1\Big/\sum_{j=0}^{n-1}|\phi_j(\xi_{nk})|^2$ (see, e.g., [9, Theorem 5.4.2]). These quadrature formulas are exact for all functions of the form $\{f=g_*h\colon g,h\in\mathcal L_{n-1}\}=\mathcal L_{n-1}\mathcal L_{(n-1)*}$ (see, e.g., [9, Theorem 5.3.4]).
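In the polynomial special case (all $\alpha_j=0$, Lebesgue measure) everything is explicit: $\phi_n(z)=z^n$ and $\phi_n^*(z)=1$, so $Q_n(z,\tau)=z^n+\tau$, the nodes are the $n$-th roots of $-\tau$, and the weight formula gives $w_{nk}=1/\sum_{j=0}^{n-1}|\xi_{nk}^j|^2=1/n$. A sketch under exactly those assumptions (our own function name):

```python
import cmath

def porf_nodes_weights(n, tau):
    """Nodes: zeros of Q_n(z, tau) = z**n + tau with tau on the unit circle,
    i.e., the n-th roots of -tau; weights: w = 1 / sum_{j<n} |phi_j(xi)|**2,
    with phi_j(z) = z**j in this special case."""
    phase = cmath.phase(-tau)
    nodes = [cmath.exp(1j * (phase + 2 * cmath.pi * k) / n) for k in range(n)]
    weights = [1.0 / sum(abs(xi**j) ** 2 for j in range(n)) for xi in nodes]
    return nodes, weights
```

The checks below confirm the two qualitative statements of the text: the nodes lie on $\mathbb T$ and the weights are positive (here all equal to $1/n$).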
The purpose of this paper is to generalize the situation where the αj are all in D to the
situation where they are anywhere in the extended complex plane outside T. This will require
the introduction of some new notation.
So consider a sequence $\alpha$ with $\alpha\subset\mathbb D$ and its reflection in the circle $\beta=(\beta_j)_{j\in\mathbb N}$, where $\beta_j=1/\overline\alpha_j=\alpha_{j*}\in\mathbb E$. We now construct a new sequence $\gamma=(\gamma_j)_{j\in\mathbb N}$ where each $\gamma_j$ is either equal to $\alpha_j$ or $\beta_j$.
Partition $\{1,2,\ldots,n\}$ ($n\in\widehat{\mathbb N}=\mathbb N\cup\{\infty\}$) into two disjoint index sets: the indices where $\gamma_j=\alpha_j$ and the indices where $\gamma_j=\beta_j$:
$$a_n=\{j\colon\gamma_j=\alpha_j\in\mathbb D,\ 1\le j\le n\}\qquad\text{and}\qquad b_n=\{j\colon\gamma_j=\beta_j\in\mathbb E,\ 1\le j\le n\},$$
and define
$$\boldsymbol\alpha_n=\{\alpha_j\colon j\in a_n\}\qquad\text{and}\qquad\boldsymbol\beta_n=\{\beta_j\colon j\in b_n\}.$$
It will be useful to prepend the sequence α with an extra point α0 = 0. That means that β
is preceded by β0 = 1/α0 =∞. For γ, the initial point can be γ0 = α0 = 0 or γ0 = β0 =∞.
With each of the sequences α, β, and γ we can associate orthogonal rational functions. They
will be closely related as we shall show. The ORF for the γ sequence can be derived from the
ORF for the α sequence by multiplying with a Blaschke product just like the orthogonal Laurent
polynomials are essentially shifted versions of the orthogonal polynomials (see, e.g., [15]).
To define the denominators of our rational functions, we introduce the following elementary factors:
$$\varpi_j^\alpha(z)=1-\overline\alpha_jz,\qquad \varpi_j^\beta(z)=\begin{cases}1-\overline\beta_jz,&\text{if }\beta_j\ne\infty,\\-z,&\text{if }\beta_j=\infty,\end{cases}\qquad \varpi_j^\gamma(z)=\begin{cases}\varpi_j^\alpha(z),&\text{if }\gamma_j=\alpha_j,\\\varpi_j^\beta(z),&\text{if }\gamma_j=\beta_j.\end{cases}$$
Note that if $\alpha_j=0$ and hence $\beta_j=\infty$, then $\varpi_j^\alpha(z)=1$ but $\varpi_j^\beta(z)=-z$.
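A direct transcription of these factors (writing the conjugates $\overline\alpha_j$, $\overline\beta_j$ explicitly, following the convention $\varpi_j^\alpha(z)=1-\overline\alpha_jz$ common in the ORF literature, and representing $\beta_j=\infty$ by `math.inf`; the names are ours):

```python
import math

def w_alpha(a, z):
    """varpi_j^alpha(z) = 1 - conj(alpha_j) * z."""
    return 1 - complex(a).conjugate() * z

def w_beta(b, z):
    """varpi_j^beta(z) = 1 - conj(beta_j) * z for finite beta_j, and -z for beta_j = inf."""
    if b == math.inf:  # beta_j = infinity, i.e., alpha_j = 0
        return -z
    return 1 - complex(b).conjugate() * z
```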
To separate the $\alpha$- and the $\beta$-factors in a product, we also define
$$\dot\varpi_j^\alpha(z)=\begin{cases}\varpi_j^\alpha(z),&\text{if }\gamma_j=\alpha_j,\\1,&\text{if }\gamma_j=\beta_j,\end{cases}\qquad\text{and}\qquad \dot\varpi_j^\beta(z)=\begin{cases}\varpi_j^\beta(z),&\text{if }\gamma_j=\beta_j,\\1,&\text{if }\gamma_j=\alpha_j.\end{cases}$$
Because the sequence $\gamma$ is our main focus, we simplify the notation by removing the superscript $\gamma$ when it is not needed, e.g., $\varpi_j=\varpi_j^\gamma=\dot\varpi_j^\alpha\dot\varpi_j^\beta$, etc.
We can now define for $\nu\in\{\alpha,\beta,\gamma\}$
$$\pi_n^\nu(z)=\prod_{j=1}^n\varpi_j^\nu(z)$$
and the reduced products separating the $\alpha$- and the $\beta$-factors
$$\dot\pi_n^\alpha(z)=\prod_{j=1}^n\dot\varpi_j^\alpha(z)=\prod_{j\in a_n}\varpi_j(z),\qquad \dot\pi_n^\beta(z)=\prod_{j=1}^n\dot\varpi_j^\beta(z)=\prod_{j\in b_n}\varpi_j(z),$$
so that
$$\pi_n(z)=\prod_{j=1}^n\varpi_j(z)=\dot\pi_n^\alpha(z)\,\dot\pi_n^\beta(z).$$
We assume here and in the rest of the paper that products over $j\in\varnothing$ equal 1.
The Blaschke factors are defined for $\nu\in\{\alpha,\beta,\gamma\}$ as
$$\zeta_j^\nu(z)=\sigma_j^\nu\,\frac{z-\nu_j}{1-\overline\nu_jz},\qquad \sigma_j^\nu=\frac{\overline\nu_j}{|\nu_j|},\qquad\text{if }\nu_j\notin\{0,\infty\},$$
$$\zeta_j^\nu(z)=\sigma_j^\nu z=z,\qquad \sigma_j^\nu=1,\qquad\text{if }\nu_j=0,$$
$$\zeta_j^\nu(z)=\sigma_j^\nu/z=1/z,\qquad \sigma_j^\nu=1,\qquad\text{if }\nu_j=\infty.$$
Thus
$$\sigma_j^\nu=\begin{cases}\overline\nu_j/|\nu_j|,&\text{for }\nu_j\notin\{0,\infty\},\\1,&\text{for }\nu_j\in\{0,\infty\}.\end{cases}$$
Because $\sigma_n^\alpha=\sigma_n^\beta$, we can remove the superscript and just write $\sigma_n$. If we also use the following notation, which maps complex numbers onto $\mathbb T$,
$$u(z)=\begin{cases}\overline z/|z|\in\mathbb T,&z\in\mathbb C\setminus\{0\},\\1,&z\in\{0,\infty\},\end{cases}$$
then $\sigma_j=u(\alpha_j)=u(\beta_j)=u(\gamma_j)$.
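A quick numerical sanity check of these definitions (helper names are ours; finite nonzero $\nu_j$ assumed, with the convention $\sigma_j=\overline\nu_j/|\nu_j|$): the factor is unimodular on $\mathbb T$, $\sigma_j^\alpha=\sigma_j^\beta$, and $\zeta_j^\alpha=1/\zeta_j^\beta$ for $\beta_j=1/\overline\alpha_j$.

```python
import cmath

def sigma(v):
    """sigma_j = conj(v)/|v| for v not in {0, inf}; equals 1 for v = 0."""
    return 1.0 if v == 0 else v.conjugate() / abs(v)

def zeta(v, z):
    """Blaschke factor zeta_j(z) = sigma_j * (z - v) / (1 - conj(v) * z), finite v."""
    if v == 0:
        return z
    return sigma(v) * (z - v) / (1 - v.conjugate() * z)
```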
Set $(\varpi_j^\nu)^*(z)=\varpi_j^{\nu*}(z)=z\,\varpi_{j*}^\nu(z)$ (e.g., $(1-\overline\alpha_jz)^*=z-\alpha_j$ if $\nu=\alpha$); then $\zeta_j^\nu=\sigma_j\,\dfrac{\varpi_j^{\nu*}}{\varpi_j^\nu}$. Later we shall also use $\pi_n^{\nu*}$ to mean $\prod_{j=1}^n\varpi_j^{\nu*}$. Note that $\zeta_j^\alpha=\zeta_{j*}^\beta=1/\zeta_j^\beta$. Moreover, if $\alpha_j=0$ and hence $\beta_j=\infty$, then $\varpi_j^{\alpha*}(z)=z$ and $\varpi_j^{\beta*}(z)=-1$.
Next define the finite Blaschke products for $\nu\in\{\alpha,\beta\}$:
$$B_0^\nu=1,\qquad B_n^\nu(z)=\prod_{j=1}^n\zeta_j^\nu(z),\qquad n=1,2,\ldots.$$
It is important to note that here $\nu\ne\gamma$. For the definition of $B_n^\gamma=B_n$ see below.
Like we have split up the denominators $\pi_n=\dot\pi_n^\alpha\dot\pi_n^\beta$ into the $\alpha$-factors and the $\beta$-factors, we also define for $n\ge1$
$$\dot\zeta_j^\alpha=\begin{cases}\zeta_j^\alpha,&\text{if }\gamma_j=\alpha_j,\\1,&\text{if }\gamma_j=\beta_j,\end{cases}\qquad \dot\zeta_j^\beta=\begin{cases}\zeta_j^\beta,&\text{if }\gamma_j=\beta_j,\\1,&\text{if }\gamma_j=\alpha_j,\end{cases}$$
and
$$\dot B_n^\alpha(z)=\prod_{j=1}^n\dot\zeta_j^\alpha(z)=\prod_{j\in a_n}\zeta_j(z)\qquad\text{and}\qquad \dot B_n^\beta(z)=\prod_{j=1}^n\dot\zeta_j^\beta(z)=\prod_{j\in b_n}\zeta_j(z),$$
so that we can define the finite Blaschke products for the $\gamma$ sequence:
$$B_n(z)=\begin{cases}\dot B_n^\alpha(z),&\text{if }\gamma_n=\alpha_n,\\\dot B_n^\beta(z),&\text{if }\gamma_n=\beta_n.\end{cases}$$
Note that the reflection property of the factors also holds for the products: $B_n^\alpha=(B_n^\beta)_*=1/B_n^\beta$, $B_{n*}=1/B_n$, and $(\dot B_n^\alpha\dot B_n^\beta)_*=1/(\dot B_n^\beta\dot B_n^\alpha)$. However,
$$\dot B_n^\alpha=\prod_{j\in a_n}\zeta_j^\alpha=\prod_{j\in a_n}\zeta_{j*}^\beta=\prod_{j\in a_n}1/\zeta_j^\beta\ne\prod_{j\in b_n}1/\zeta_j^\beta=1/\dot B_n^\beta.$$
3 Linear spaces and ORF bases
We can now introduce our spaces of rational functions for $n\ge0$:
$$\mathcal L_n^\nu=\operatorname{span}\{B_0^\nu,B_1^\nu,\ldots,B_n^\nu\},\quad\nu\in\{\alpha,\beta,\gamma\},\qquad\text{and}\qquad \dot{\mathcal L}_n^\nu=\operatorname{span}\{\dot B_0^\nu,\dot B_1^\nu,\ldots,\dot B_n^\nu\},\quad\nu\in\{\alpha,\beta\}.$$
The dimension of $\mathcal L_n^\nu$ is $n+1$ for $\nu\in\{\alpha,\beta,\gamma\}$, but note that the dimension of $\dot{\mathcal L}_n^\nu$ for $\nu\in\{\alpha,\beta\}$ can be less than $n+1$. Indeed, some of the $\dot B_j^\nu$ may be repeated, so that for example the dimension of $\dot{\mathcal L}_n^\alpha$ is only $|a_n|+1$, with $|a_n|$ the cardinality of $a_n$, and similarly for $\nu=\beta$. Hence for $\nu=\gamma$:
$$\mathcal L_n=\operatorname{span}\{B_0,\ldots,B_n\}=\operatorname{span}\bigl\{\dot B_0,\dot B_1^\alpha,\ldots,\dot B_n^\alpha,\dot B_1^\beta,\ldots,\dot B_n^\beta\bigr\}=\dot{\mathcal L}_n^\alpha+\dot{\mathcal L}_n^\beta=\dot{\mathcal L}_n^\alpha\dot{\mathcal L}_n^\beta.$$
Because for $n\ge1$
$$\dot B_n^\alpha=\prod_{j\in a_n}\zeta_j^\alpha=\prod_{j\in a_n}\frac1{\zeta_j^\beta}\qquad\text{and}\qquad \dot B_n^\beta=\prod_{j\in b_n}\zeta_j^\beta=\prod_{j\in b_n}\frac1{\zeta_j^\alpha},$$
it should be clear that $B_k^\alpha=\dot B_k^\alpha/\dot B_k^\beta$ and $B_k^\beta=\dot B_k^\beta/\dot B_k^\alpha$, hence that
$$\mathcal L_n^\alpha=\operatorname{span}\left\{\dot B_0,\frac{\dot B_1^\alpha}{\dot B_1^\beta},\ldots,\frac{\dot B_n^\alpha}{\dot B_n^\beta}\right\}\qquad\text{and}\qquad \mathcal L_n^\beta=\operatorname{span}\left\{\dot B_0,\frac{\dot B_1^\beta}{\dot B_1^\alpha},\ldots,\frac{\dot B_n^\beta}{\dot B_n^\alpha}\right\}.$$
Occasionally we shall also need the notation
$$\dot\varsigma_n^\alpha=\prod_{j\in a_n}\sigma_j\in\mathbb T,\qquad \dot\varsigma_n^\beta=\prod_{j\in b_n}\sigma_j\in\mathbb T,\qquad\text{and}\qquad \varsigma_n=\prod_{j=1}^n\sigma_j\in\mathbb T.$$
Lemma 3.1. If $f\in\mathcal L_n$ then $f/\dot B_n^\beta\in\mathcal L_n^\alpha$ and $f/\dot B_n^\alpha\in\mathcal L_n^\beta$. In other words, $\mathcal L_n=\dot B_n^\beta\mathcal L_n^\alpha=\dot B_n^\alpha\mathcal L_n^\beta$. This is true for all $n\ge0$ if we set $\dot B_0^\alpha=\dot B_0^\beta=1$.
Proof. This is trivial for $n=0$ since then $\mathcal L_n=\mathbb C$. If $f\in\mathcal L_n$ and $n\ge1$, then it is of the form
$$f(z)=\frac{p_n(z)}{\pi_n(z)}=\frac{p_n(z)}{\dot\pi_n^\alpha(z)\dot\pi_n^\beta(z)},\qquad p_n\in\mathcal P_n.$$
Therefore
$$\frac{f(z)}{\dot B_n^\beta(z)}=\overline{\dot\varsigma_n^\beta}\,\frac{p_n(z)\,\dot\pi_n^\beta(z)}{\dot\pi_n^\alpha(z)\dot\pi_n^\beta(z)\dot\pi_n^{\beta*}(z)}=\overline{\dot\varsigma_n^\beta}\,\frac{p_n(z)}{\dot\pi_n^\alpha(z)\dot\pi_n^{\beta*}(z)}.$$
Recall that $\varpi_j^{\beta*}=-1$ and $\sigma_j=1$ if $\beta_j=\infty$ (and hence $\alpha_j=0$); we can leave these factors out, and we shall write $\dot{\prod}$ for the product instead of $\prod$, the dot meaning that we leave out all the factors for which $\alpha_j=1/\beta_j=0$. Then
$$\frac{\overline{\dot\varsigma_n^\beta}}{\dot\pi_n^{\beta*}(z)}=\dot{\prod_{j\in b_n}}\frac{\beta_j}{|\beta_j|(z-\beta_j)}=\dot{\prod_{j\in b_n}}\frac{|\alpha_j|}{\overline\alpha_j\bigl(z-1/\overline\alpha_j\bigr)}=\dot{\prod_{j\in b_n}}\frac{-|\alpha_j|}{1-\overline\alpha_jz},$$
and thus
$$\frac{f(z)}{\dot B_n^\beta(z)}=c_n\,\frac{p_n(z)}{\prod_{j=1}^n\bigl(1-\overline\alpha_jz\bigr)}\in\mathcal L_n^\alpha,\qquad c_n=\dot{\prod_{j\in b_n}}\bigl(-|\alpha_j|\bigr)\ne0.$$
The second part is similar. ∎
Lemma 3.2. With our previous definitions we have for $n\ge1$
$$\dot B_n^\beta\mathcal L_{n-1}^\alpha=\operatorname{span}\left\{B_k^\alpha\dot B_n^\beta=\frac{\dot B_k^\alpha}{\dot B_k^\beta}\,\dot B_n^\beta\colon k=0,\ldots,n-1\right\}=\dot\zeta_n^\beta\operatorname{span}\{B_0,B_1,\ldots,B_{n-1}\}=\dot\zeta_n^\beta\mathcal L_{n-1},$$
and similarly
$$\dot B_n^\alpha\mathcal L_{n-1}^\beta=\operatorname{span}\left\{B_k^\beta\dot B_n^\alpha=\frac{\dot B_k^\beta}{\dot B_k^\alpha}\,\dot B_n^\alpha\colon k=0,\ldots,n-1\right\}=\dot\zeta_n^\alpha\operatorname{span}\{B_0,B_1,\ldots,B_{n-1}\}=\dot\zeta_n^\alpha\mathcal L_{n-1}.$$
Proof. By our previous lemma, $\dot B_n^\beta\mathcal L_{n-1}^\alpha=\dot\zeta_n^\beta\dot B_{n-1}^\beta\mathcal L_{n-1}^\alpha=\dot\zeta_n^\beta\mathcal L_{n-1}$. The second relation is proved in a similar way. ∎
To introduce the sequences of orthogonal rational functions (ORF) for the different sequences $\nu$, $\nu\in\{\alpha,\beta,\gamma\}$, recall the inner product, which we can write with our $(\,)_*$-notation as $\langle f,g\rangle=\int_{\mathbb T}f_*(z)g(z)\,d\mu(z)$, where $\mu$ is assumed to be a probability measure positive a.e. on $\mathbb T$.
Then the orthogonal rational functions (ORF) with respect to the sequence $\nu$ with $\nu\in\{\alpha,\beta,\gamma\}$ are defined by $\phi_n^\nu\in\mathcal L_n^\nu\setminus\mathcal L_{n-1}^\nu$ with $\phi_n^\nu\perp\mathcal L_{n-1}^\nu$ for $n\ge1$, and we choose $\phi_0^\nu=1$.
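Numerically, such ORF can be obtained by Gram-Schmidt orthogonalization of the basis $\{B_0^\nu,\ldots,B_n^\nu\}$ with the inner product discretized on a uniform grid on $\mathbb T$. This is a rough sketch for the Lebesgue measure, not the recurrence-based approach developed in this paper; for all $\alpha_j=0$ it must reproduce $\phi_n(z)=z^n$:

```python
import cmath

def inner(f, g, m=256):
    """<f, g> = integral over T of conj(f)*g dmu, discretized for dmu = dtheta/(2 pi)."""
    pts = [cmath.exp(2j * cmath.pi * k / m) for k in range(m)]
    return sum(f(z).conjugate() * g(z) for z in pts) / m

def gram_schmidt_orf(basis):
    """Orthonormalize a list of basis functions (e.g. Blaschke products B_k) in order."""
    orf = []
    for b in basis:
        f = b
        for phi in orf:
            c = inner(phi, f)
            f = (lambda g, p, c: lambda z: g(z) - c * p(z))(f, phi, c)
        nrm = abs(inner(f, f)) ** 0.5
        orf.append((lambda g, nv: lambda z: g(z) / nv)(f, nrm))
    return orf
```

The grid quadrature is exact on Laurent polynomials of degree below `m`, so the polynomial test case incurs only rounding error; for genuinely rational basis functions the grid size would have to be chosen with care.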
Lemma 3.3. The function $\phi_n^\alpha\dot B_n^\beta$ belongs to $\mathcal L_n$ and is orthogonal to the $n$-dimensional subspace $\dot\zeta_n^\beta\mathcal L_{n-1}$ for all $n\ge1$.
Similarly, the function $\phi_n^\beta\dot B_n^\alpha$ belongs to $\mathcal L_n$ and is orthogonal to the $n$-dimensional subspace $\dot\zeta_n^\alpha\mathcal L_{n-1}$, $n\ge1$.
Proof. First note that $\phi_n^\alpha\dot B_n^\beta\in\mathcal L_n$ by Lemma 3.1.
By definition $\phi_n^\alpha\perp\mathcal L_{n-1}^\alpha$. Thus, by Lemma 3.2 and because $\langle f,g\rangle=\bigl\langle\dot B_n^\nu f,\dot B_n^\nu g\bigr\rangle$,
$$\dot B_n^\beta\phi_n^\alpha\perp\dot B_n^\beta\mathcal L_{n-1}^\alpha=\dot\zeta_n^\beta\mathcal L_{n-1}.$$
The second claim follows by symmetry. ∎
Note that $\dot\zeta_n^\beta\mathcal L_{n-1}=\mathcal L_{n-1}$ if $\gamma_n=\alpha_n$. Thus, up to normalization, $\phi_n^\alpha\dot B_n^\beta$ is the same as $\phi_n$, and similarly, if $\gamma_n=\beta_n$ then $\phi_n$ and $\phi_n^\beta\dot B_n^\alpha$ are the same up to normalization.
Lemma 3.4. For $n\ge1$ the function $\dot B_n^\alpha(\phi_n^\alpha)_*$ belongs to $\mathcal L_n$ and is orthogonal to $\dot\zeta_n^\alpha\mathcal L_{n-1}$.
Similarly, for $n\ge1$ the function $\dot B_n^\beta(\phi_n^\beta)_*$ belongs to $\mathcal L_n$ and is orthogonal to $\dot\zeta_n^\beta\mathcal L_{n-1}$.
Proof. Since $\phi_n^\alpha\dot B_n^\beta\perp\dot\zeta_n^\beta\mathcal L_{n-1}$,
$$\bigl(\phi_n^\alpha\dot B_n^\beta\bigr)_*\perp\dot\zeta_{n*}^\beta\mathcal L_{(n-1)*},$$
and thus, by Lemma 3.2 and because
$$\dot B_{n-1}^\alpha\dot B_{n-1}^\beta\mathcal L_{(n-1)*}=\dot B_{n-1}^\alpha\dot B_{n-1}^\beta\,\frac{\mathcal P_{(n-1)*}}{\dot\pi_{(n-1)*}^\alpha\dot\pi_{(n-1)*}^\beta}=\frac{\mathcal P_{n-1}}{\dot\pi_{n-1}^\alpha\dot\pi_{n-1}^\beta}=\mathcal L_{n-1},$$
it follows that
$$\dot B_n^\alpha\phi_{n*}^\alpha=\dot B_n^\alpha\dot B_n^\beta\bigl(\phi_n^\alpha\dot B_n^\beta\bigr)_*\perp\dot\zeta_n^\alpha\dot B_{n-1}^\alpha\dot B_{n-1}^\beta\mathcal L_{(n-1)*}=\dot\zeta_n^\alpha\mathcal L_{n-1}.$$
The other claim follows by symmetry. ∎
We now define the reciprocal ORFs by (recall $f_*(z)=\overline{f(1/\overline z)}$)
$$(\phi_n^\nu)^*=B_n^\nu(\phi_n^\nu)_*,\qquad\nu\in\{\alpha,\beta\}.$$
For the ORF in $\mathcal L_n$, however, we set
$$\phi_n^*=\dot B_n^\alpha\dot B_n^\beta(\phi_n)_*.$$
Note that by definition $B_n$ is either $\dot B_n^\alpha$ or $\dot B_n^\beta$, depending on $\gamma_n$ being $\alpha_n$ or $\beta_n$, while in the previous definition we do not multiply by $B_n$ but by the product $\dot B_n^\alpha\dot B_n^\beta$. The reason is that we want the operation $(\,)^*$ to be a map from $\mathcal L_n^\nu$ to $\mathcal L_n^\nu$ for all $\nu\in\{\alpha,\beta,\gamma\}$.
Remark 3.5. As the operation $(\,)^*$ is a map from $\mathcal L_n^\nu$ to $\mathcal L_n^\nu$, it depends on $n$ and on $\nu$. So to make the notation unambiguous we should in fact use something like $f^{[\nu,n]}$ if $f\in\mathcal L_n^\nu$. However, in order not to overload our notation, we shall stick to the notation $f^*$, since it should always be clear from the context what the space is to which $f$ belongs. Note that we also used the same notation to transform polynomials. This is just a special case of the general definition. Indeed, a polynomial of degree $n$ belongs to $\mathcal L_n^\alpha$ for a sequence $\alpha$ where all $\alpha_j=0$, $j=0,1,2,\ldots$, and for this sequence $B_n^\alpha(z)=z^n$.
Note that for a constant $a\in\mathcal L_0=\mathbb C$ we have $a^*=\overline a$. Although $(\,)^*$ is mostly used for scalar expressions, we shall occasionally use $A^*$ where $A$ is a matrix whose elements are all in $\mathcal L_n$. Then the meaning is that we take the $(\,)_*$ conjugate of each element in its transpose. Thus if $A$ is a constant matrix, then $A^*$ has the usual meaning of the adjoint or complex conjugate transpose of the matrix. We shall need this in Section 8.
Remark 3.6. It might also be helpful for the further computations to note the following. If $p_n$ is a polynomial of degree $n$ with a zero at $\xi$, then $p_n^*$ will have a zero at $\xi_*=1/\overline\xi$. Hence, if $\nu\in\{\alpha,\beta,\gamma\}$ and $\phi_n^\nu=\frac{p_n^\nu}{\pi_n^\nu}$, then $\phi_{n*}^\nu=\frac{p_{n*}^\nu}{\pi_{n*}^\nu}=\frac{p_n^{\nu*}}{\pi_n^{\nu*}}$. We know by [9, Corollary 3.1.4] that $\phi_n^\alpha$ has all its zeros in $\mathbb D$; hence $p_n^\alpha$ does not vanish in $\mathbb E$ and $p_n^{\alpha*}$ does not vanish in $\mathbb D$. By symmetry, $\phi_n^\beta$ has all its zeros in $\mathbb E$ and $p_n^{\beta*}$ does not vanish in $\mathbb E$. For the general $\phi_n$, it depends on $\gamma_n$ being $\alpha_n$ or $\beta_n$. However, from the relations between $\phi_n$ and $\phi_n^\alpha$ or $\phi_n^\beta$ that will be derived below, we will be able to guarantee that at least for $z=\nu_n$ we have $\phi_n^{\nu*}(\nu_n)\ne0$ and $p_n^{\nu*}(\nu_n)\ne0$ for all $\nu\in\{\alpha,\beta,\gamma\}$ (see Corollary 3.11 below).
The orthogonality conditions define $\phi_n$ and $\phi_n^*$ uniquely up to normalization. So let us now make the ORFs unique by imposing an appropriate normalization. First assume that from now on the $\phi_n^\nu$ refer to orthonormal functions in the sense that $\|\phi_n^\nu\|=1$. This makes them unique up to a unimodular constant. Defining this constant is what we shall do now.
Suppose $\gamma_n=\alpha_n$; then $\phi_n$ and $\phi_n^\alpha\dot B_n^\beta$ are both in $\mathcal L_n$ and orthogonal to $\mathcal L_{n-1}$ (Lemma 3.3). If we assume $\|\phi_n\|=1$ and $\|\phi_n^\alpha\|=1$, hence $\|\phi_n^\alpha\dot B_n^\beta\|=\|\phi_n^\alpha\|=1$, it follows that there must be some unimodular constant $s_n^\alpha\in\mathbb T$ such that $\phi_n=s_n^\alpha\phi_n^\alpha\dot B_n^\beta$. Of course, we have by symmetry that for $\gamma_n=\beta_n$ there is some $s_n^\beta\in\mathbb T$ such that $\phi_n=s_n^\beta\phi_n^\beta\dot B_n^\alpha$.
To define the unimodular factors $s_n^\alpha$ and $s_n^\beta$, we first fix $\phi_n^\alpha$ and $\phi_n^\beta$ uniquely as follows. We know that $\phi_n^\alpha$ has all its zeros in $\mathbb D$, and hence $\phi_n^{\alpha*}$ has all its zeros in $\mathbb E$, so that $\phi_n^{\alpha*}(\alpha_n)\ne0$. Thus we can take $\phi_n^{\alpha*}(\alpha_n)>0$ as a normalization for $\phi_n^\alpha$. Similarly, for $\phi_n^\beta$ we can normalize by $\phi_n^{\beta*}(\beta_n)>0$. In both cases we have made the leading coefficient with respect to the basis $\{B_j^\nu\}_{j=0}^n$ positive, since $\phi_n^\alpha(z)=\phi_n^{\alpha*}(\alpha_n)B_n^\alpha(z)+\psi_{n-1}^\alpha(z)$ with $\psi_{n-1}^\alpha\in\mathcal L_{n-1}^\alpha$, and $\phi_n^\beta(z)=\phi_n^{\beta*}(\beta_n)B_n^\beta(z)+\psi_{n-1}^\beta(z)$ with $\psi_{n-1}^\beta\in\mathcal L_{n-1}^\beta$. Before we define the normalization for the $\gamma$ sequence, we prove the following lemma, which is a consequence of the normalization of the $\phi_n^\alpha$ and the $\phi_n^\beta$.
Lemma 3.7. For the orthonormal ORFs it holds that $\phi_n^\alpha=\phi_{n*}^\beta$ and $(\phi_n^\alpha)^*\dot B_n^\beta=\phi_n^\beta\dot B_n^\alpha$, and hence also $(\phi_n^\beta)^*\dot B_n^\alpha=\phi_n^\alpha\dot B_n^\beta$, for all $n\ge0$.
Proof. For $n=0$ this is trivial, since $\phi_0$, $\phi_0^\alpha$, $\phi_0^\beta$, $\dot B_0^\alpha$ and $\dot B_0^\beta$ are all equal to 1.
We give the proof for $n\ge1$ and $\gamma_n=\alpha_n$ (for $\gamma_n=\beta_n$ the proof is similar). Since by the previous lemmas $\dot B_n^\beta(\phi_n^\beta)_*$ and $\phi_n^\alpha\dot B_n^\beta$ are both in $\mathcal L_n$ and orthogonal to $\mathcal L_{n-1}$, and since $\|\dot B_n^\beta(\phi_n^\beta)_*\|=\|\phi_n^\beta\|=1$ and $\|\phi_n^\alpha\dot B_n^\beta\|=\|\phi_n^\alpha\|=1$, there must be some $s_n\in\mathbb T$ such that
$$s_n\phi_n^\alpha\dot B_n^\beta=\phi_{n*}^\beta\dot B_n^\beta\qquad\text{or}\qquad s_n\phi_n^\alpha=\phi_{n*}^\beta.$$
Multiply by $B_n^\beta=B_{n*}^\alpha$ and evaluate at $\beta_n$ to get $s_n\phi_n^\alpha(\beta_n)B_{n*}^\alpha(\beta_n)=\phi_n^{\beta*}(\beta_n)>0$. Thus $s_n$ should arrange for
$$0<s_n\phi_n^\alpha\bigl(1/\overline\alpha_n\bigr)B_{n*}^\alpha\bigl(1/\overline\alpha_n\bigr)=s_n\,\overline{\phi_{n*}^\alpha(\alpha_n)B_n^\alpha(\alpha_n)}=s_n\,\overline{\phi_n^{\alpha*}(\alpha_n)},$$
and since $\phi_n^{\alpha*}(\alpha_n)>0$, it follows that $s_n=1$.
Because $(\phi_n^\alpha)^*=B_n^\alpha\phi_{n*}^\alpha=B_n^\alpha\phi_n^\beta$ and $B_n^\alpha=\dot B_n^\alpha/\dot B_n^\beta$, the other claims also follow. ∎
For the normalization of the φn, we can do two things: either we make the normalization
of φn simple and choose for example φ∗n(γn) > 0, similar to what we did for φαn and φβn (but
this is somewhat problematic as we shall see below), or we can insist on keeping the relation
with φαn and φβn simple as in the previous lemma and arrange that sαn = sβn = 1. We choose for
the second option.
Let us assume that $\gamma_n = \alpha_n$. Denote
$$\phi_n(z) = \frac{p_n(z)}{\dot\pi_n^\alpha(z)\,\dot\pi_n^\beta(z)} \qquad\text{and}\qquad \phi_n^\alpha(z) = \frac{p_n^\alpha(z)}{\pi_n^\alpha(z)},$$
with $p_n$ and $p_n^\alpha$ both polynomials in $\mathcal{P}_n$. Then
$$\phi_n^*(z) = \frac{\varsigma_n\, p_n^*(z)}{\dot\pi_n^\alpha(z)\,\dot\pi_n^\beta(z)} \qquad\text{and}\qquad \phi_n^{\alpha*}(z) = \frac{\varsigma_n\, p_n^{\alpha*}(z)}{\pi_n^\alpha(z)}, \qquad \varsigma_n = \prod_{j=1}^n \sigma_j.$$
We already know that there is some $s_n^\alpha \in \mathbb{T}$ such that $\phi_n = s_n^\alpha \dot B_n^\beta \phi_n^\alpha$. Take the substar conjugate and multiply with $\dot B_n^\alpha \dot B_n^\beta$ to get $\phi_n^* = \overline{s_n^\alpha}\, \dot B_n^\beta \phi_n^{\alpha*}$.
It now takes some simple algebra to reformulate $\phi_n^* = \overline{s_n^\alpha}\, \dot B_n^\beta \phi_n^{\alpha*}$ as
$$\phi_n^*(z) = \frac{\varsigma_n\, p_n^*(z)}{\dot\pi_n^\alpha(z)\,\dot\pi_n^\beta(z)} = \overline{s_n^\alpha}\,\frac{\varsigma_n\, p_n^{\alpha*}(z)}{\dot\pi_n^\alpha(z)\,\dot\pi_n^\beta(z)}\cdot\prod_{j\in\mathfrak{b}_n}(-|\beta_j|).$$
This implies that $p_n^*(z) = \overline{s_n^\alpha}\, p_n^{\alpha*}(z)\cdot\prod_{j\in\mathfrak{b}_n}(-|\beta_j|)$, and thus that $p_n^*(z)$ has the same zeros as $p_n^{\alpha*}(z)$, none of which is in $\mathbb{D}$. Thus the numerator of $\phi_n^*$ will not vanish at $\alpha_n \in \mathbb{D}$, but one of the factors $(1-\overline{\beta_j}\alpha_n)$ from $\dot\pi_n^\beta(\alpha_n)$ could be zero. Thus a normalization $\phi_n^*(\alpha_n) > 0$ is not an option in general. We could, however, make $s_n^\alpha = 1$ when we choose $p_n^*(\alpha_n)/p_n^{\alpha*}(\alpha_n) > 0$ or, since $\phi_n^{\alpha*}(\alpha_n) > 0$, equivalently $\varsigma_n\, p_n^*(\alpha_n)/\pi_n^\alpha(\alpha_n) > 0$. Yet another way to put this is to require that $\phi_n^*(z)/\dot B_n^\beta(z)$ be positive at $z = \alpha_n$. This does not give a problem with $0$ or $\infty$ since
$$\dot B_n^\alpha(z)\,\phi_{n*}(z) = \frac{\phi_n^*(z)}{\dot B_n^\beta(z)} = \frac{\dot\varsigma_n^\alpha\, p_n^*(z)}{\dot\pi_n^\alpha(z)\cdot\prod_{j\in\mathfrak{b}_n}(z-\beta_j)}, \qquad \dot\varsigma_n^\alpha = \prod_{j\in\mathfrak{a}_n}\sigma_j. \tag{3.1}$$
It is clear that neither the numerator nor the denominator vanishes for $z = \alpha_n$.
Of course, a similar argument can be given if $\gamma_n = \beta_n$. Then we choose $\phi_n^*(z)/\dot B_n^\alpha(z)$ to be positive at $z = \beta_n$, or equivalently $\varsigma_n\, p_n^*(\beta_n)/\pi_n^\beta(\beta_n)\cdot\prod_{j\in\mathfrak{a}_n}(-|\alpha_j|) > 0$.
Let us formulate the result about the numerators as a lemma for further reference.
Lemma 3.8. With the normalization that we just imposed, the numerators $p_n^\nu$ of $\phi_n^\nu = p_n^\nu/\pi_n^\nu$, $\nu\in\{\alpha,\beta,\gamma\}$ and $n\geq 1$, are related by
$$p_n(z) = p_n^\alpha(z)\cdot\prod_{j\in\mathfrak{b}_n}(-|\beta_j|) = p_n^{\beta*}(z)\,\varsigma_n\cdot\prod_{j\in\mathfrak{a}_n}(-|\alpha_j|), \qquad\text{if } \gamma_n=\alpha_n,$$
Orthogonal Rational Functions on the Unit Circle 11
and
$$p_n(z) = p_n^\beta(z)\cdot\prod_{j\in\mathfrak{a}_n}(-|\alpha_j|) = p_n^{\alpha*}(z)\,\varsigma_n\cdot\prod_{j\in\mathfrak{b}_n}(-|\beta_j|), \qquad\text{if } \gamma_n=\beta_n,$$
where, as before, $\varsigma_n = \prod_{j=1}^n \sigma_j$.
Proof. The first expression for $\gamma_n = \alpha_n$ has been proved above. The second one follows in a similar way from the relation $\phi_n(z) = \phi_n^{\beta*}(z)\,\dot B_n^\alpha(z)$. Indeed,
$$\frac{p_n(z)}{\pi_n(z)} = \frac{\varsigma_n\, p_n^{\beta*}(z)}{\prod_{j\in\mathfrak{a}_n}\varpi_j^\beta(z)\,\prod_{j\in\mathfrak{b}_n}\varpi_j^\beta(z)}\;\prod_{j\in\mathfrak{a}_n}\sigma_j\,\frac{z-\alpha_j}{1-\overline{\alpha_j}z} = \frac{\varsigma_n\, p_n^{\beta*}(z)}{\prod_{j\in\mathfrak{a}_n}\varpi_j^\alpha(z)\,\prod_{j\in\mathfrak{b}_n}\varpi_j^\beta(z)}\;\prod_{j\in\mathfrak{a}_n}\sigma_j\,\frac{z-\alpha_j}{1-\overline{\beta_j}z} = \frac{\varsigma_n\, p_n^{\beta*}(z)}{\pi_n(z)}\cdot\prod_{j\in\mathfrak{a}_n}\sigma_j(-\alpha_j)\,\frac{z-\alpha_j}{z-\alpha_j}.$$
With $-\sigma_j\alpha_j = -|\alpha_j|$ the result follows.
The case $\gamma_n = \beta_n$ is similar. ■
Note that this normalization again means that we take the leading coefficient of $\phi_n$ to be positive in the following sense. If $\gamma_n = \alpha_n$, then $\phi_n(z) = (\dot B_n^\alpha\phi_{n*})(\alpha_n)\,\dot B_n^\alpha(z) + \psi_{n-1}(z)$ with $\psi_{n-1}\in\mathcal{L}_{n-1}$, while $\dot B_n^\alpha\phi_{n*} = \phi_n^{\alpha*}$ and $\phi_n^{\alpha*}(\alpha_n) > 0$. If $\gamma_n = \beta_n$, then $\phi_n(z) = (\dot B_n^\beta\phi_{n*})(\beta_n)\,\dot B_n^\beta(z) + \psi_{n-1}(z)$ with $\psi_{n-1}\in\mathcal{L}_{n-1}$, and the conclusion follows similarly.
Whenever we use the term orthonormal, we assume this normalization, and $\{\phi_n : n = 0,1,2,\dots\}$ will denote this orthonormal system.
Thus we have proved the following theorem. It says that if $\gamma_n = \alpha_n$, then $\phi_n$ is a 'shifted' version of $\phi_n^\alpha$, where 'shifted' means multiplied by $\dot B_n^\beta$:
$$\dot B_n^\beta(z)\,\phi_n^\alpha(z) = \dot B_n^\beta(z)\big[a_0 B_0^\alpha(z) + \cdots + a_n B_n^\alpha(z)\big] = a_0\,\dot B_n^\beta(z) + \cdots + a_n\,\dot B_n^\alpha(z),$$
and a similar interpretation holds if $\gamma_n = \beta_n$. We summarize this in the following theorem.
Theorem 3.9. Assume all ORFs $\phi_n^\nu$, $\nu\in\{\alpha,\beta,\gamma\}$, are orthonormal with positive leading coefficient, i.e.,
$$\phi_n^{\alpha*}(\alpha_n) > 0 \qquad\text{and}\qquad \phi_n^{\beta*}(\beta_n) > 0 \qquad\text{and}\qquad \begin{cases}\big(\phi_n^*/\dot B_n^\beta\big)(\alpha_n) > 0 & \text{if } \gamma_n=\alpha_n,\\ \big(\phi_n^*/\dot B_n^\alpha\big)(\beta_n) > 0 & \text{if } \gamma_n=\beta_n.\end{cases}$$
Then for all $n\geq 0$
$$\phi_n = \phi_n^\alpha\,\dot B_n^\beta = (\phi_n^\beta)^*\dot B_n^\alpha \qquad\text{and}\qquad \phi_n^* = (\phi_n^\alpha)^*\dot B_n^\beta = \phi_n^\beta\,\dot B_n^\alpha \qquad\text{if } \gamma_n=\alpha_n,$$
while
$$\phi_n = \phi_n^\beta\,\dot B_n^\alpha = (\phi_n^\alpha)^*\dot B_n^\beta \qquad\text{and}\qquad \phi_n^* = (\phi_n^\beta)^*\dot B_n^\alpha = \phi_n^\alpha\,\dot B_n^\beta \qquad\text{if } \gamma_n=\beta_n.$$
Corollary 3.10. We have for all $n\geq 1$ that $(\phi_n^\nu)^* \perp \zeta_n^\nu\mathcal{L}_{n-1}^\nu$, $\nu\in\{\alpha,\beta,\gamma\}$.
Corollary 3.11. The rational functions $\phi_n^\alpha$ and $\phi_n^{\alpha*}$ are in $\mathcal{L}_n^\alpha$ and hence have all their poles in $\{\beta_j : j=1,\dots,n\}\subset\mathbb{E}$, while the zeros of $\phi_n^\alpha$ are all in $\mathbb{D}$ and the zeros of $\phi_n^{\alpha*}$ are all in $\mathbb{E}$.
The rational functions $\phi_n^\beta$ and $\phi_n^{\beta*}$ are in $\mathcal{L}_n^\beta$ and hence have all their poles in $\{\alpha_j : j=1,\dots,n\}\subset\mathbb{D}$, while the zeros of $\phi_n^\beta$ are all in $\mathbb{E}$ and the zeros of $\phi_n^{\beta*}$ are all in $\mathbb{D}$.
The rational functions $\phi_n$ and $\phi_n^*$ are in $\mathcal{L}_n$ and hence have all their poles in $\{\beta_j : j\in\mathfrak{a}_n\}\cup\{\alpha_j : j\in\mathfrak{b}_n\}$.
The zeros of $\phi_n$ are the same as the zeros of $\phi_n^\alpha$, and thus all in $\mathbb{D}$, if $\gamma_n=\alpha_n$; they are the same as the zeros of $\phi_n^\beta$, and thus all in $\mathbb{E}$, if $\gamma_n=\beta_n$.
Proof. It is well known that the zeros of $\phi_n^\alpha$ are all in $\mathbb{D}$ [9, Corollary 3.1.4], and because $\phi_n^\beta = \phi_{n*}^\alpha$, this means that the zeros of $\phi_n^\beta$ are all in $\mathbb{E}$.
Because $\phi_n = \phi_n^\alpha\,\dot B_n^\beta = \phi_n^\alpha/\prod_{j\in\mathfrak{b}_n}\zeta_j^\alpha$ if $\gamma_n=\alpha_n$, i.e., $n\in\mathfrak{a}_n$, and the product with $\dot B_n^\beta$ only exchanges the poles $1/\overline{\alpha_j} = \beta_j$, $j\in\mathfrak{b}_n$, of $\phi_n$ for poles $\alpha_j = 1/\overline{\beta_j}$, the zeros of $\phi_n^\alpha$ are left unaltered.
The proof for $n\in\mathfrak{b}_n$ is similar. ■
One may summarize this as follows: for $f\in\mathcal{L}_n^\nu$, the substar transform $f\mapsto f_*$ reflects both zeros and poles in $\mathbb{T}$, since $z\mapsto z_* = 1/\overline{z}$, while the transform $f\mapsto f^*$, as it is defined in the spaces $\mathcal{L}_n^\nu$, $\nu\in\{\alpha,\beta,\gamma\}$, keeps the poles but reflects the zeros, since the multiplication with the respective factors $B_n^\alpha$, $B_n^\beta$ and $\dot B_n^\alpha\dot B_n^\beta$ only undoes the reflection of the poles that resulted from the substar operation.
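This reflection principle is easy to see concretely in the polynomial special case (all $\alpha_j = 0$), where the transform reduces to reversing and conjugating the coefficient vector. The following sketch (our own illustration, not part of the paper) checks it numerically with numpy:

```python
import numpy as np

# Polynomial special case: for p of degree n with zeros z_j, the transform
# p^*(z) = z^n * conj(p(1/conj(z))) amounts to reversing and conjugating the
# coefficients; its zeros are the reflections 1/conj(z_j) of the zeros of p.
rng = np.random.default_rng(0)
zeros = 0.8 * rng.random(5) * np.exp(2j * np.pi * rng.random(5))  # zeros in D
p = np.poly(zeros)                      # coefficients, highest degree first
p_star = np.conj(p[::-1])               # coefficients of p^*

computed = np.roots(p_star)
reflected = 1.0 / np.conj(zeros)
assert np.allclose(np.sort(np.abs(computed)), np.sort(np.abs(reflected)))
assert all(abs(r) > 1 for r in computed)          # zeros reflected into E
```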
4 Christoffel–Darboux relations and reproducing kernels
For $\nu\in\{\alpha,\beta,\gamma\}$ one may define the reproducing kernels for the space $\mathcal{L}_n^\nu$. Given the ORF $\phi_k^\nu$, the kernels are defined by
$$k_n^\nu(z,w) = \sum_{k=0}^n \phi_k^\nu(z)\,\overline{\phi_k^\nu(w)}.$$
They reproduce $f\in\mathcal{L}_n^\nu$ by $\langle k_n^\nu(\cdot,z), f\rangle = f(z)$.
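In the simplest special case (all $\alpha_j = 0$ and $d\mu = d\theta/2\pi$ the Lebesgue measure), the orthonormal functions are $\phi_k(z) = z^k$ and the reproducing property can be verified by direct numerical integration. A minimal sketch, with our own choice of test function:

```python
import numpy as np

# Reproducing kernel in the polynomial special case (all alpha_j = 0,
# Lebesgue measure): phi_k(z) = z^k, so k_n(z, w) = sum_k z^k conj(w)^k.
# We check <k_n(., z), f> = f(z) with <f, g> = int conj(f) g dmu for f in P_n.
n = 6
c = np.array([1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -0.7])   # f(t) = sum_m c_m t^m
f = lambda t: np.polyval(c[::-1], t)
z = 0.3 + 0.4j

theta = np.linspace(0.0, 2.0 * np.pi, 513)[:-1]       # uniform grid on T
t = np.exp(1j * theta)
k_n = sum((t ** k) * (np.conj(z) ** k) for k in range(n + 1))  # k_n(t, z)
inner = np.mean(np.conj(k_n) * f(t))                  # trapezoid = exact here
assert abs(inner - f(z)) < 1e-12
```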
The proof of the Christoffel–Darboux relations goes exactly as in the classical case and we shall not repeat it here (see, e.g., [9, Theorem 3.1.3]).
Theorem 4.1. The Christoffel–Darboux relations
$$k_n^\nu(z,w) = \frac{\phi_n^{\nu*}(z)\,\overline{\phi_n^{\nu*}(w)} - \zeta_n^\nu(z)\overline{\zeta_n^\nu(w)}\,\phi_n^\nu(z)\,\overline{\phi_n^\nu(w)}}{1-\zeta_n^\nu(z)\overline{\zeta_n^\nu(w)}} = \frac{\phi_{n+1}^{\nu*}(z)\,\overline{\phi_{n+1}^{\nu*}(w)} - \phi_{n+1}^\nu(z)\,\overline{\phi_{n+1}^\nu(w)}}{1-\zeta_{n+1}^\nu(z)\overline{\zeta_{n+1}^\nu(w)}}$$
hold for $n\geq 0$, $\nu\in\{\alpha,\beta,\gamma\}$, and $z,w$ not among the poles of $\phi_n^\nu$ and not on $\mathbb{T}$.
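In the polynomial special case (all $\alpha_j = 0$, Lebesgue measure, so that $\phi_k(z) = z^k$, $\phi_k^*(z) = 1$ and $\zeta_k(z) = z$) the first CD relation reduces to a geometric-sum identity; a quick numerical check:

```python
import numpy as np

# Christoffel-Darboux check in the polynomial special case (all alpha_j = 0,
# Lebesgue measure): phi_k(z) = z^k, phi_k^*(z) = 1, zeta_k(z) = z, so the
# right-hand side of the CD relation is (1 - (z conj(w))^(n+1))/(1 - z conj(w)).
rng = np.random.default_rng(1)
n = 7
z, w = 0.5 * np.exp(2j * np.pi * rng.random(2))        # points in D

kernel = sum((z * np.conj(w)) ** k for k in range(n + 1))   # k_n(z, w)
cd = (1 - (z * np.conj(w)) ** (n + 1)) / (1 - z * np.conj(w))
assert abs(kernel - cd) < 1e-12
```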
As an immediate consequence we have:
Theorem 4.2. The following relations hold true:
$$k_n^\alpha(z,w)\,\dot B_n^\beta(z)\,\overline{\dot B_n^\beta(w)} = k_n(z,w) = k_n^\beta(z,w)\,\dot B_n^\alpha(z)\,\overline{\dot B_n^\alpha(w)}$$
for $n\geq 0$ and $z,w\notin\big(\mathbb{T}\cup\{\beta_j : j\in\mathfrak{a}_n\}\cup\{\alpha_j : j\in\mathfrak{b}_n\}\big)$.
Proof. The first relation was shown directly above for the case $\gamma_n=\alpha_n$. It also follows in the case $\gamma_{n+1}=\alpha_{n+1}$ by using, in the second CD-relation, the first expressions from Theorem 3.9 for $\phi_{n+1}$ and $\phi_{n+1}^*$. The relation is thus valid independently of whether $\gamma_n=\alpha_n$ or $\gamma_n=\beta_n$.
Similarly, the second expression was derived before in the case $\gamma_n=\beta_n$, but again it also follows from the second CD-relation and the first expressions from Theorem 3.9 for $\phi_{n+1}$ and $\phi_{n+1}^*$ in the case $\gamma_{n+1}=\beta_{n+1}$. Again the relation holds independently of whether $\gamma_n=\alpha_n$ or $\gamma_n=\beta_n$.
Alternatively, the second relation can also be derived from the second CD-relation in the case $\gamma_{n+1}=\alpha_{n+1}$, but using the second expressions from Theorem 3.9 for $\phi_{n+1}$ and $\phi_{n+1}^*$. ■
Evaluating the CD-relation at $\nu_n$ for $\nu\in\{\alpha,\beta\}$ results in another useful corollary.
Corollary 4.3. For $\nu\in\{\alpha,\beta\}$ we have for $n\geq 0$
$$k_n^\nu(z,\nu_n) = \phi_n^{\nu*}(z)\,\overline{\phi_n^{\nu*}(\nu_n)} \qquad\text{and}\qquad k_n^\nu(\nu_n,\nu_n) = |\phi_n^{\nu*}(\nu_n)|^2.$$
The latter corollary cannot immediately be used when $\nu=\gamma$, because $\gamma_n$ could be equal to some pole of $\phi_n$ if it equals some $1/\overline{\gamma_j}$ for $j<n$. In that case we can remove the denominators in the CD relation and keep only the numerators. Hence, setting
$$k_n(z,w) = \frac{C_n(z,w)}{\pi_n(z)\,\overline{\pi_n(w)}}, \qquad \phi_n(z) = \frac{p_n(z)}{\pi_n(z)}, \qquad \phi_n^*(z) = \frac{\varsigma_n\,p_n^*(z)}{\pi_n(z)}, \qquad \varsigma_n\in\mathbb{T},$$
the CD relation becomes
$$C_n(z,w) = \frac{p_n^*(z)\,\overline{p_n^*(w)} - \zeta_n(z)\overline{\zeta_n(w)}\,p_n(z)\,\overline{p_n(w)}}{1-\zeta_n(z)\overline{\zeta_n(w)}} = \frac{p_{n+1}^*(z)\,\overline{p_{n+1}^*(w)} - p_{n+1}(z)\,\overline{p_{n+1}(w)}}{\big(1-\zeta_{n+1}(z)\overline{\zeta_{n+1}(w)}\big)\,\varpi_{n+1}(z)\,\overline{\varpi_{n+1}(w)}}. \tag{4.1}$$
Thus, the first form gives
$$C_n(z,\gamma_n) = p_n^*(z)\,\overline{p_n^*(\gamma_n)} \qquad\text{and}\qquad C_n(\gamma_n,\gamma_n) = |p_n^*(\gamma_n)|^2.$$
Evaluating a polynomial at infinity means taking its highest degree coefficient, i.e., if qn(z) is
a polynomial of degree n, then qn(∞) stands for its coefficient of zn.
The second form of (4.1) gives, for $\gamma_{n+1}\neq\infty$ and $\gamma_n\neq\infty$,
$$C_n(z,\gamma_n) = \frac{p_{n+1}^*(z)\,\overline{p_{n+1}^*(\gamma_n)} - p_{n+1}(z)\,\overline{p_{n+1}(\gamma_n)}}{(1-\overline{\gamma_n}z)(1-|\gamma_{n+1}|^2)}$$
and
$$C_n(\gamma_n,\gamma_n) = \frac{|p_{n+1}^*(\gamma_n)|^2 - |p_{n+1}(\gamma_n)|^2}{(1-|\gamma_n|^2)(1-|\gamma_{n+1}|^2)}.$$
Coupling the first and the second form in (4.1) gives
$$\frac{|p_{n+1}^*(\gamma_n)|^2 - |p_{n+1}(\gamma_n)|^2}{(1-|\gamma_n|^2)(1-|\gamma_{n+1}|^2)} = |p_n^*(\gamma_n)|^2.$$
For $\gamma_{n+1}=\infty$ and $\gamma_n\neq\infty$ we get
$$C_n(z,\gamma_n) = \frac{p_{n+1}^*(z)\,\overline{p_{n+1}^*(\gamma_n)} - p_{n+1}(z)\,\overline{p_{n+1}(\gamma_n)}}{-(1-\overline{\gamma_n}z)} = p_n^*(z)\,\overline{p_n^*(\gamma_n)}$$
and
$$C_n(\gamma_n,\gamma_n) = \frac{|p_{n+1}^*(\gamma_n)|^2 - |p_{n+1}(\gamma_n)|^2}{-(1-|\gamma_n|^2)} = |p_n^*(\gamma_n)|^2.$$
If $\gamma_{n+1}=\infty$ and $\gamma_n=\infty$, the denominators in (4.1) have to be replaced by $1$, which gives
$$C_n(z,\gamma_n) = p_{n+1}^*(z)\,\overline{p_{n+1}^*(\gamma_n)} - p_{n+1}(z)\,\overline{p_{n+1}(\gamma_n)} = p_n^*(z)\,\overline{p_n^*(\gamma_n)}$$
and
$$C_n(\gamma_n,\gamma_n) = |p_{n+1}^*(\gamma_n)|^2 - |p_{n+1}(\gamma_n)|^2 = |p_n^*(\gamma_n)|^2.$$
For $\gamma_n=\infty$ and $\gamma_{n+1}\neq\infty$ we obtain in a similar way
$$C_n(z,\gamma_n) = \frac{p_{n+1}^*(z)\,\overline{p_{n+1}^*(\gamma_n)} - p_{n+1}(z)\,\overline{p_{n+1}(\gamma_n)}}{z(1-|\gamma_{n+1}|^2)} = p_n^*(z)\,\overline{p_n^*(\gamma_n)}$$
and
$$C_n(\gamma_n,\gamma_n) = \frac{|p_{n+1}^*(\gamma_n)|^2 - |p_{n+1}(\gamma_n)|^2}{1-|\gamma_{n+1}|^2} = |p_n^*(\gamma_n)|^2.$$
To summarize, the relations of Corollary 4.3 may not hold for the ORF if $\nu=\gamma$, but similar relations do hold for the numerators, as stated in the next corollary.
Corollary 4.4. If $C_n(z,w)$ is the numerator in the CD relation and $p_n(z)$ is the numerator of the ORF $\phi_n$ for the sequence $\gamma$, then we have for $n\geq 0$
$$C_n(z,\gamma_n) = p_n^*(z)\,\overline{p_n^*(\gamma_n)} \qquad\text{and}\qquad C_n(\gamma_n,\gamma_n) = |p_n^*(\gamma_n)|^2.$$
5 Recurrence relation
The recurrence for the $\phi_n^\alpha$ is well known; for a proof see, e.g., [9, Theorem 4.1.1]. For $\phi_n^\beta$ the proof can be copied by symmetry. However, also for $\nu=\gamma$ the same recurrence and its proof can be copied, with the exception that the derivation fails when $p_n^*(\gamma_{n-1}) = 0$, where $p_n = \phi_n\pi_n$. This can (only) happen if $(1-|\gamma_n|)(1-|\gamma_{n-1}|) < 0$ (i.e., one of these $\gamma$'s is in $\mathbb{D}$ and the other is in $\mathbb{E}$). We shall say that $\phi_n$ is regular if $p_n^*(\gamma_{n-1}) \neq 0$. If $\nu=\alpha$ or $\nu=\beta$, then the whole sequence $(\phi_n^\nu)_{n\geq 0}$ is automatically regular. Thus we have the following theorem:
Theorem 5.1. Let $\nu\in\{\alpha,\beta,\gamma\}$, and if $\nu=\gamma$ assume moreover that $\phi_n^\nu$ is regular. Then the following recursion holds, with initial condition $\phi_0^\nu = \phi_0^{\nu*} = 1$:
$$\begin{bmatrix}\phi_n^\nu(z)\\ \phi_n^{\nu*}(z)\end{bmatrix} = H_n^\nu\,\frac{\varpi_{n-1}^\nu(z)}{\varpi_n^\nu(z)}\begin{bmatrix}1 & \lambda_n^\nu\\ \overline{\lambda_n^\nu} & 1\end{bmatrix}\begin{bmatrix}\zeta_{n-1}^\nu(z) & 0\\ 0 & 1\end{bmatrix}\begin{bmatrix}\phi_{n-1}^\nu(z)\\ \phi_{n-1}^{\nu*}(z)\end{bmatrix},$$
where $H_n^\nu$ is a nonzero constant times a unitary matrix:
$$H_n^\nu = e_n^\nu\begin{bmatrix}\eta_{n1}^\nu & 0\\ 0 & \eta_{n2}^\nu\end{bmatrix}, \qquad e_n^\nu\in\mathbb{C}\setminus\{0\}, \qquad \eta_{n1}^\nu,\eta_{n2}^\nu\in\mathbb{T}.$$
The constant $\eta_{n1}^\nu$ is chosen such that the normalization condition for the ORFs is maintained. The other constant $\eta_{n2}^\nu$ is then automatically related to $\eta_{n1}^\nu$ by $\eta_{n2}^\nu = \overline{\eta_{n1}^\nu}\,\sigma_{n-1}\sigma_n$. The Szegő parameter $\lambda_n^\nu$ is given by
$$\lambda_n^\nu = \eta_n^\nu\,\frac{p_n^\nu(\nu_{n-1})}{p_n^{\nu*}(\nu_{n-1})} \qquad\text{with}\qquad \eta_n^\nu = \varsigma_{n-2}\,\frac{\overline{p_{n-1}^{\nu*}(\nu_{n-1})}}{p_{n-1}^{\nu*}(\nu_{n-1})}\in\mathbb{T},$$
where $\phi_n^\nu(z) = p_n^\nu(z)/\pi_n^\nu(z)$.
Proof. We immediately concentrate on the general situation $\nu=\gamma$; of course, $\nu=\alpha$ and $\nu=\beta$ drop out as special cases. For simplicity, assume that $\gamma_n$ and $\gamma_{n-1}$ are not $0$ and not $\infty$. The technicalities when this is not true are left as an exercise: one formally follows the steps of the proof below, but replaces a linear factor involving infinity by the coefficient of infinity (like $1-\infty z = -z$ and $z-\infty = -1$), and evaluating a polynomial at $\infty$ means taking its leading coefficient.
First we show that there are some numbers $c_n$ and $d_n$ such that
$$\phi(z) := \frac{1-\overline{\gamma_n}z}{z-\gamma_{n-1}}\,\phi_n(z) - d_n\,\phi_{n-1}(z) - c_n\,\frac{1-\overline{\gamma_{n-1}}z}{z-\gamma_{n-1}}\,\phi_{n-1}^*(z) \in \mathcal{L}_{n-2}.$$
This can be written as $N(z)/[(z-\gamma_{n-1})\pi_{n-1}(z)]$. Thus the $c_n$ and $d_n$ are defined by the conditions $N(\gamma_{n-1}) = N\big(1/\overline{\gamma_{n-1}}\big) = 0$. If we denote $\phi_k = p_k/\pi_k$ and thus $\phi_k^* = p_k^*\varsigma_k/\pi_k$, it is clear that
$$N(z) = p_n(z) - d_n(z-\gamma_{n-1})\,p_{n-1}(z) - c_n(1-\overline{\gamma_{n-1}}z)\,p_{n-1}^*(z)\,\varsigma_{n-1}.$$
Thus the first condition gives
$$c_n = \frac{\overline{\varsigma_{n-1}}\,p_n(\gamma_{n-1})}{(1-|\gamma_{n-1}|^2)\,p_{n-1}^*(\gamma_{n-1})}$$
and the second one
$$d_n = \frac{p_n\big(1/\overline{\gamma_{n-1}}\big)}{\big(1/\overline{\gamma_{n-1}}-\gamma_{n-1}\big)\,p_{n-1}\big(1/\overline{\gamma_{n-1}}\big)} = \frac{\overline{p_n^*(\gamma_{n-1})}}{(1-|\gamma_{n-1}|^2)\,\overline{p_{n-1}^*(\gamma_{n-1})}}.$$
Note that $p_{n-1}^*(\gamma_{n-1})$ cannot be zero by Corollary 3.11, and that also $p_n^*(\gamma_{n-1})$ does not vanish by our assumption of regularity.
Furthermore, by using the orthogonality of $\phi_n$ and $\phi_{n-1}$ and Corollary 3.10, it is not difficult to show that $\phi\perp\mathcal{L}_{n-2}$, so that it must be identically zero. Thus
$$\phi_n(z) = d_n\sigma_{n-1}\,\frac{1-\overline{\gamma_{n-1}}z}{1-\overline{\gamma_n}z}\,\big[\zeta_{n-1}(z)\,\phi_{n-1}(z) + \lambda_n\,\phi_{n-1}^*(z)\big],$$
with
$$\lambda_n = \eta_n\,\frac{p_n(\gamma_{n-1})}{p_n^*(\gamma_{n-1})}, \qquad \eta_n = \varsigma_{n-1}\sigma_{n-1}\,\frac{\overline{p_{n-1}^*(\gamma_{n-1})}}{p_{n-1}^*(\gamma_{n-1})}.$$
By taking the $(\,\cdot\,)^*$ transform (in $\mathcal{L}_n$) we obtain
$$\phi_n^*(z) = \overline{d_n}\,\sigma_n\,\frac{1-\overline{\gamma_{n-1}}z}{1-\overline{\gamma_n}z}\,\big[\overline{\lambda_n}\,\zeta_{n-1}(z)\,\phi_{n-1}(z) + \phi_{n-1}^*(z)\big].$$
This proves the recurrence by taking $e_n = |d_n|$ and $\eta_{n1} = \sigma_{n-1}\,u(d_n)$.
It remains to show that the initial step for $n=1$ is true. Since $\phi_0 = \phi_0^* = 1$, in case $\gamma_0 = \alpha_0 = 0$, hence $\zeta_0 = z$, we have
$$\phi_1(z) = e_1\eta_{11}\,\frac{z+\lambda_1}{1-\overline{\gamma_1}z} \qquad\text{and}\qquad \phi_1^*(z) = e_1\eta_{12}\,\frac{\overline{\lambda_1}z+1}{1-\overline{\gamma_1}z}.$$
Thus
$$p_1(z) = e_1\eta_{11}(z+\lambda_1) \qquad\text{and}\qquad p_1^*(z) = e_1\eta_{11}\big(\overline{\lambda_1}z+1\big).$$
This implies that $\lambda_1$ is indeed given by the general formula because
$$\lambda_1 = \eta_1\,\frac{p_1(\gamma_0)}{p_1^*(\gamma_0)} = \frac{p_1(0)}{p_1^*(0)} = \frac{e_1\eta_{11}\lambda_1}{e_1\eta_{11}}.$$
In case $\gamma_0 = \beta_0 = \infty$, then $\zeta_0 = 1/z$, so that
$$\phi_1(z) = e_1\eta_{11}\,\frac{-1-\lambda_1 z}{1-\overline{\gamma_1}z} \qquad\text{and}\qquad \phi_1^*(z) = e_1\eta_{12}\,\frac{-\overline{\lambda_1}-z}{1-\overline{\gamma_1}z},$$
and thus
$$p_1(z) = -e_1\eta_{11}(1+\lambda_1 z) \qquad\text{and}\qquad p_1^*(z) = -e_1\eta_{11}\sigma_1\big(\overline{\lambda_1}+z\big),$$
and again $\lambda_1$ is given by the general formula
$$\lambda_1 = \eta_1\,\frac{p_1(\gamma_0)}{p_1^*(\gamma_0)} = 1\cdot\frac{p_1(\infty)}{p_1^*(\infty)} = \frac{e_1\eta_{11}\lambda_1}{e_1\eta_{11}}.$$
This proves the theorem. ■
Remark 5.2. If $\nu\in\{\alpha,\beta\}$, we could rewrite $\lambda_n^\nu$ in terms of $\phi_n^\nu$, because by dividing and multiplying with the appropriate denominators $\pi_n^\nu$ one gets
$$\lambda_n^\nu = \eta_n^\nu\,\frac{\phi_n^\nu(\nu_{n-1})}{\phi_n^{\nu*}(\nu_{n-1})}, \qquad \eta_n^\nu = \sigma_{n-1}\sigma_n\,\frac{1-\overline{\nu_n}\nu_{n-1}}{1-\nu_n\overline{\nu_{n-1}}}\,\frac{\overline{\phi_{n-1}^{\nu*}(\nu_{n-1})}}{\phi_{n-1}^{\nu*}(\nu_{n-1})}, \qquad n\geq 1.$$
Note that also this $\eta_n^\nu\in\mathbb{T}$, but it differs from the $\eta_n^\nu$ in the previous theorem. However, if $\nu=\gamma$, then this expression has the disadvantage that $\gamma_{n-1}$ could be equal to $1/\overline{\gamma_n}$, or it could be equal to a pole of $\phi_n$, in which case it would not make sense to evaluate these expressions at $\gamma_{n-1}$. The latter expressions only make sense if we interpret them as limiting values
$$\frac{\phi_n^\nu(\nu_{n-1})}{\phi_n^{\nu*}(\nu_{n-1})} = \lim_{z\to\nu_{n-1}}\frac{\phi_n^\nu(z)}{\phi_n^{\nu*}(z)} \qquad\text{and}\qquad \frac{1-\overline{\nu_n}\nu_{n-1}}{1-\nu_n\overline{\nu_{n-1}}}\,\frac{\overline{\phi_{n-1}^{\nu*}(\nu_{n-1})}}{\phi_{n-1}^{\nu*}(\nu_{n-1})} = \lim_{z\to\nu_{n-1}}\frac{1-\overline{\nu_n}z}{1-\nu_n\overline{z}}\,\frac{\overline{\phi_{n-1}^{\nu*}(z)}}{\phi_{n-1}^{\nu*}(z)},$$
where one has to assume that $\lim_{\xi\to 0}\big[\xi/\overline{\xi}\big] = 1$. We shall from now on occasionally use these expressions with this interpretation, but the expressions for $\lambda_n^\nu$ from Theorem 5.1 using the numerators are more direct, since they immediately give the limiting values.
Note that $\lambda_n^\alpha$ is the value of a Blaschke product with all its zeros in $\mathbb{D}$, evaluated at $\alpha_{n-1}\in\mathbb{D}$, and therefore $\lambda_n^\alpha\in\mathbb{D}$. Similarly, $\lambda_n^\beta$ is the value of a Blaschke product with all its zeros in $\mathbb{E}$, evaluated at $\beta_{n-1}\in\mathbb{E}$, so that $\lambda_n^\beta\in\mathbb{D}$. Since the zeros of $\phi_n$ are the zeros of $\phi_n^\alpha$ if $n\in\mathfrak{a}_n$, and they are the zeros of $\phi_n^\beta$ if $n\in\mathfrak{b}_n$, it follows that if $n$ and $n-1$ are both in $\mathfrak{a}_n$ or both in $\mathfrak{b}_n$, then $\lambda_n\in\mathbb{D}$, but if $n\in\mathfrak{a}_n$ and $n-1\in\mathfrak{b}_n$, or vice versa, then $\lambda_n\in\mathbb{E}$. Therefore
$$(e_n^\nu)^2 = \frac{1-|\nu_n|^2}{1-|\nu_{n-1}|^2}\,\frac{1}{1-|\lambda_n^\nu|^2} > 0 \tag{5.1}$$
and we can choose $e_n$ as the positive square root of this expression. The above expression is derived in [9, Theorem 4.1.2] for the case $\nu=\alpha$ by using the CD relations. By symmetry, this also holds for $\nu=\beta$. For $\nu=\gamma$, the same relation can be obtained by stripping the denominators, as we explained after the proof of the CD-relation in Section 4.
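The recurrence and the location of the parameters are easy to illustrate in the polynomial special case (all $\gamma_j = \alpha_j = 0$), where Theorem 5.1 reduces, up to positive normalization factors, to the classical Szegő recurrence for the numerators, and any parameters in $\mathbb{D}$ produce polynomials with all zeros in $\mathbb{D}$, in line with Corollary 3.11. A hedged sketch (parameter values our own):

```python
import numpy as np

# Polynomial special case of Theorem 5.1 (all gamma_j = alpha_j = 0):
# dropping the positive normalization factors, the numerator recurrence reads
#   p_n(z) = z p_{n-1}(z) + lambda_n p_{n-1}^*(z),   lambda_n in D,
# where p^* has the reversed, conjugated coefficient vector.
lambdas = [0.5, -0.3 + 0.2j, 0.1j, 0.7, -0.25]        # all in D

p = np.array([1.0 + 0j])                   # p_0 = 1, lowest degree first
for lam in lambdas:
    p_star = np.conj(p[::-1])              # numerator of p_{n-1}^*
    p = np.concatenate(([0], p)) + lam * np.append(p_star, 0)  # z*p + lam*p^*
assert all(abs(r) < 1 for r in np.roots(p[::-1]))  # zeros in D (Cor. 3.11)
```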
What goes wrong with the recurrence relation when $\phi_n$ is not regular? From the proof of Theorem 5.1 it follows that then $d_n = 0$. We still have the relation
$$\phi_n(z) = \frac{\sigma_{n-1}}{(1-|\gamma_{n-1}|^2)\,p_{n-1}^*(\gamma_{n-1})}\,\big[p_n^*(\gamma_{n-1})\,\zeta_{n-1}(z)\,\phi_{n-1}(z) + s_{n-1}\,p_n(\gamma_{n-1})\,\phi_{n-1}^*(z)\big]$$
with
$$s_{n-1} = \frac{\varsigma_{n-1}\,p_{n-1}^*(\gamma_{n-1})}{\sigma_{n-1}\,\overline{p_{n-1}^*(\gamma_{n-1})}}\in\mathbb{T}$$
and $p_n^*(\gamma_{n-1}) = 0$. Thus, there is some positive constant $e_n$ and some $\eta_{n1}\in\mathbb{T}$ such that
$$\phi_n(z) = e_n\eta_{n1}\,\frac{\varpi_{n-1}(z)}{\varpi_n(z)}\,\big[0\cdot\zeta_{n-1}(z)\,\phi_{n-1}(z) + \phi_{n-1}^*(z)\big],$$
i.e., the first term in the sum between square brackets vanishes. Applying Theorem 5.1 in this case would give $\lambda_n = \infty$, and the previous relations show that we only have to replace in Theorem 5.1 the matrix
$$\begin{bmatrix}1 & \lambda_n^\nu\\ \overline{\lambda_n^\nu} & 1\end{bmatrix} \qquad\text{by}\qquad \begin{bmatrix}0 & 1\\ 1 & 0\end{bmatrix}.$$
This is in line with how we have dealt with $\infty$ so far, where the rule of thumb was to set $a-\nu b = -b$ if $\nu = \infty$. So let us therefore also use Theorem 5.1 with this interpretation when $\phi_n$ is not regular and thus $\lambda_n = \infty$. With the expressions at the end of Section 4, it can also be shown that in this case
$$e_n^2 = -\frac{1-|\gamma_n|^2}{1-|\gamma_{n-1}|^2} > 0.$$
Note that this corresponds to replacing $1-|\lambda_n|^2$ by $-1$ when $\lambda_n = \infty$. Since this non-regular situation can only occur when $(1-|\gamma_n|)(1-|\gamma_{n-1}|) < 0$, this expression for $e_n^2$ is indeed positive. A similar rule can be applied if $\gamma_n$ or $\gamma_{n-1}$ is infinite: just replace $1-|\infty|^2$ by $-1$ in this or the previous expression. The positivity of the expressions for $e_n^2$ also follows from the following result.
Theorem 5.3. The Szegő parameters satisfy, for all $n\geq 1$:
If $\gamma_n=\alpha_n$ and $\gamma_{n-1}=\alpha_{n-1}$, then $\lambda_n = \lambda_n^\alpha = \lambda_n^\beta \in \mathbb{D}$.
If $\gamma_n=\beta_n$ and $\gamma_{n-1}=\beta_{n-1}$, then $\lambda_n = \lambda_n^\beta = \lambda_n^\alpha \in \mathbb{D}$.
If $\gamma_n=\alpha_n$ and $\gamma_{n-1}=\beta_{n-1}$, then $\lambda_n = 1/\lambda_n^\beta = 1/\lambda_n^\alpha \in \mathbb{E}$.
If $\gamma_n=\beta_n$ and $\gamma_{n-1}=\alpha_{n-1}$, then $\lambda_n = 1/\lambda_n^\alpha = 1/\lambda_n^\beta \in \mathbb{E}$.
Proof. Suppose $\gamma_n=\alpha_n$ and $\gamma_{n-1}=\alpha_{n-1}$; then, by Theorems 5.1 and 3.9, or better still by Lemma 3.8,
$$\lambda_n = \left(\varsigma_{n-2}\,\frac{\overline{p_{n-1}^*(\alpha_{n-1})}}{p_{n-1}^*(\alpha_{n-1})}\right)\frac{p_n(\alpha_{n-1})}{p_n^*(\alpha_{n-1})} = \left(\varsigma_{n-2}\,\frac{\overline{p_{n-1}^{\alpha*}(\alpha_{n-1})}}{p_{n-1}^{\alpha*}(\alpha_{n-1})}\right)\frac{p_n^\alpha(\alpha_{n-1})}{p_n^{\alpha*}(\alpha_{n-1})} = \lambda_n^\alpha.$$
When using $p_n^\alpha(z) = \varsigma_n\,p_n^{\beta*}(z)\cdot\prod_{j=1}^n(-|\alpha_j|)$ and $\alpha_j = 1/\overline{\beta_j}$, the previous relation becomes
$$\lambda_n = \varsigma_{n-2}\,\frac{\overline{\varsigma_{n-1}\,p_{n-1}^{\beta}\big(1/\overline{\beta_{n-1}}\big)}}{\varsigma_{n-1}\,p_{n-1}^{\beta}\big(1/\overline{\beta_{n-1}}\big)}\,\frac{p_n^{\beta*}\big(1/\overline{\beta_{n-1}}\big)}{p_n^{\beta}\big(1/\overline{\beta_{n-1}}\big)} = \left(\sigma_{n-1}^2\,\varsigma_{n-2}\,\frac{\overline{p_n^{\beta*}(\beta_{n-1})}}{p_n^{\beta*}(\beta_{n-1})}\,\frac{\beta_{n-1}^{\,n-1}}{\overline{\beta_{n-1}}^{\,n-1}}\right)\frac{p_n^{\beta}(\beta_{n-1})}{p_n^{\beta*}(\beta_{n-1})}\,\frac{\overline{\beta_{n-1}}^{\,n}}{\beta_{n-1}^{\,n}} = \sigma_{n-1}^2\,\frac{\overline{\beta_{n-1}}}{\beta_{n-1}}\,\lambda_n^\beta = \lambda_n^\beta.$$
The proof for $\gamma_n=\beta_n$ and $\gamma_{n-1}=\beta_{n-1}$, $n\geq 1$, is similar.
Next consider $\gamma_n=\alpha_n$ and $\gamma_{n-1}=\beta_{n-1}$; then
$$\lambda_n = \varsigma_{n-2}\,\frac{\overline{p_{n-1}^{\beta*}(\beta_{n-1})}}{p_{n-1}^{\beta*}(\beta_{n-1})}\,\frac{p_n^\alpha(\beta_{n-1})}{p_n^{\alpha*}(\beta_{n-1})} = \eta_n^\beta\,\frac{p_n^{\beta*}(\beta_{n-1})}{p_n^\beta(\beta_{n-1})} = \frac{\eta_n^\beta\,\overline{\eta_n^\beta}}{\lambda_n^\beta} = \frac{1}{\lambda_n^\beta} = \frac{1}{\lambda_n^\alpha}.$$
The remaining case, $\gamma_n=\beta_n$ and $\gamma_{n-1}=\alpha_{n-1}$, is again similar. ■
Remark 5.4. It should also be clear that the expressions for the parameters $\lambda_n^\nu$ of Theorem 5.1 are for theoretical use only. They are expressed in terms of $p_n^\nu$, the numerator of $\phi_n^\nu$, which is the object that should be the result of the computation, and hence unknown at the moment of computing $\lambda_n^\nu$. For practical use, these parameters $\lambda_n^\nu$ should be obtained in a different way. In inverse eigenvalue problems the recursion is used backwards, i.e., for decreasing degrees of the ORFs, and then these expressions can of course be used.
Even if we know the $\lambda_n^\nu$, then in Theorem 5.1 we still need the normalizing factor $\eta_{n1}^\nu\in\mathbb{T}$, which is characterized by "$\eta_{n1}^\nu$ is chosen such that the normalization condition for the ORFs is maintained". We could compute $\phi_n^\nu$ with $\eta_{n1}^\nu = 1$, check the value of $\phi_n^{\nu*}(\gamma_{n-1})$, and derive $\eta_{n1}^\nu$ from it. What this means is shown in the next lemma.
Lemma 5.5. For $\nu\in\{\alpha,\beta\}$, the phase $\theta_n^\nu$ of the unitary factor $\eta_{n1}^\nu = e^{i\theta_n^\nu}$ is given by
$$\theta_n^\nu = \arg\big(\sigma_{n-1}\sigma_n\,\varpi_n^\nu(\nu_{n-1})\,\phi_n^{\nu*}(\nu_{n-1})\big),$$
or equivalently
$$\eta_{n1}^\nu = \sigma_{n-1}\sigma_n\,u\big(\varpi_n^\nu(\nu_{n-1})\,\phi_n^{\nu*}(\nu_{n-1})\big).$$
(Recall $u(z) = z/|z|$.)
Proof. Take the first relation of Theorem 5.1 and evaluate at $z = \nu_{n-1}$; then, because $\varpi_{n-1}^{\nu*}(\nu_{n-1}) = 0$, we get
$$\phi_n^\nu(\nu_{n-1})\,\varpi_n^\nu(\nu_{n-1}) = e_n^\nu\,\eta_{n1}^\nu\,\varpi_{n-1}^\nu(\nu_{n-1})\,\lambda_n^\nu\,\phi_{n-1}^{\nu*}(\nu_{n-1})$$
or
$$\eta_{n1}^\nu = \frac{\varpi_n^\nu(\nu_{n-1})\,\phi_n^\nu(\nu_{n-1})}{e_n^\nu\,\varpi_{n-1}^\nu(\nu_{n-1})\,\lambda_n^\nu\,\phi_{n-1}^{\nu*}(\nu_{n-1})}.$$
Use the definition $\lambda_n^\nu = \eta_n^\nu\,\frac{\phi_n^\nu(\nu_{n-1})}{\phi_n^{\nu*}(\nu_{n-1})}$ with $\eta_n^\nu = \sigma_{n-1}\sigma_n\,\frac{\varpi_n^\nu(\nu_{n-1})}{\varpi_{n-1}^\nu(\nu_n)}$, and knowing that $\phi_{n-1}^{\nu*}(\nu_{n-1}) > 0$, we obtain after simplification, leaving out all the factors with phase zero,
$$\theta_n^\nu = \arg\big(\sigma_{n-1}\sigma_n\,\varpi_n^\nu(\nu_{n-1})\,\phi_n^{\nu*}(\nu_{n-1})\big)$$
as claimed. ■
Note that this expression for $\eta_{n1}^\nu$ is well defined because $\phi_n^{\nu*}(\nu_{n-1}) \neq 0$ if $\nu\in\{\alpha,\beta\}$. For $\nu=\gamma$ the expression is a bit more involved, but it can be obtained in a similar way from the normalization conditions given in Theorem 3.9. We skip the details.
Another solution of the recurrence relation is formed by the functions of the second kind. As in the classical case (i.e., for $\nu=\alpha$), we can introduce them for $\nu\in\{\alpha,\beta,\gamma\}$ by (see [9, p. 83])
$$\psi_n^\nu(z) = \int_{\mathbb{T}}\big[E(t,z)\,\phi_n^\nu(t) - D(t,z)\,\phi_n^\nu(z)\big]\,d\mu(t),$$
where $D(t,z) = \frac{t+z}{t-z}$ and $E(t,z) = D(t,z)+1 = \frac{2t}{t-z}$. This results in
$$\psi_0^\nu = 1 \qquad\text{and}\qquad \psi_n^\nu(z) = \int_{\mathbb{T}} D(t,z)\,\big[\phi_n^\nu(t) - \phi_n^\nu(z)\big]\,d\mu(t), \qquad n\geq 1,$$
which may be generalized to (see [9, Lemma 4.2.2])
$$\psi_n^\nu(z)\,f(z) = \int_{\mathbb{T}} D(t,z)\,\big[\phi_n^\nu(t)f(t) - \phi_n^\nu(z)f(z)\big]\,d\mu(t), \qquad n\geq 1,$$
with $f$ arbitrary in $\mathcal{L}_{(n-1)*}^\nu$. It also holds that (see [9, Lemma 4.2.3])
$$\psi_n^{\nu*}(z)\,g(z) = \int_{\mathbb{T}} D(t,z)\,\big[\phi_n^{\nu*}(t)g(t) - \phi_n^{\nu*}(z)g(z)\big]\,d\mu(t), \qquad n\geq 1,$$
with $g$ arbitrary in $\mathcal{L}_{n*}^\nu(\nu_{n*})$. Recall that $\mathcal{L}_{n*}^\nu(\nu_{n*})$ is the space of all functions in $\mathcal{L}_{n*}^\nu$ that vanish for $z = \nu_{n*} = 1/\overline{\nu_n}$. This space is spanned by $\{B_k^\nu/B_n^\nu : k=0,\dots,n-1\}$ if $\nu\in\{\alpha,\beta\}$. For $\nu=\gamma$, the space can be characterized as (see Lemma 3.1)
$$\mathcal{L}_{n*}(\gamma_{n*}) = \operatorname{span}\left\{\frac{B_k}{\dot B_n^\alpha\dot B_n^\beta}\right\}_{k=0}^{n-1} = \operatorname{span}\left\{\frac{B_k^\alpha}{\zeta_n\dot B_{n-1}^\alpha}\right\}_{k=0}^{n-1} = \operatorname{span}\left\{\frac{B_k^\beta}{\zeta_n\dot B_{n-1}^\beta}\right\}_{k=0}^{n-1}.$$
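To make the defining integral concrete: in the polynomial special case with Lebesgue measure ($\phi_n(z) = z^n$), a residue computation gives $\psi_n(z) = z^n$ for $|z| < 1$, which we can confirm by numerical quadrature (grid size our own choice):

```python
import numpy as np

# Functions of the second kind for the Lebesgue measure, phi_n(z) = z^n:
#   psi_n(z) = int_T D(t,z) [phi_n(t) - phi_n(z)] dmu(t),  D(t,z) = (t+z)/(t-z),
# and a residue computation gives psi_n(z) = z^n for |z| < 1.
n, z = 4, 0.3 - 0.2j
theta = np.linspace(0.0, 2.0 * np.pi, 4097)[:-1]
t = np.exp(1j * theta)
integrand = (t + z) / (t - z) * (t ** n - z ** n)
psi = np.mean(integrand)                      # dmu = dtheta / (2*pi)
assert abs(psi - z ** n) < 1e-8
```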
Theorem 5.6. The following relations for the functions of the second kind hold for $n\geq 0$:
$$\psi_n = \psi_n^\alpha\,\dot B_n^\beta = (\psi_n^\beta)^*\dot B_n^\alpha \qquad\text{and}\qquad \psi_n^* = (\psi_n^\alpha)^*\dot B_n^\beta = \psi_n^\beta\,\dot B_n^\alpha \qquad\text{if } \gamma_n=\alpha_n,$$
while
$$\psi_n = \psi_n^\beta\,\dot B_n^\alpha = (\psi_n^\alpha)^*\dot B_n^\beta \qquad\text{and}\qquad \psi_n^* = (\psi_n^\beta)^*\dot B_n^\alpha = \psi_n^\alpha\,\dot B_n^\beta \qquad\text{if } \gamma_n=\beta_n.$$
We assume the normalization of Theorem 3.9.
Proof. This is trivial for $n=0$; hence suppose $n\geq 1$ and $\gamma_n=\alpha_n$. Then
$$\psi_n(z) = \int_{\mathbb{T}} D(t,z)\,[\phi_n(t)-\phi_n(z)]\,d\mu(t) = \int_{\mathbb{T}} D(t,z)\,\big[\phi_n^\alpha(t)\dot B_n^\beta(t) - \phi_n^\alpha(z)\dot B_n^\beta(z)\big]\,d\mu(t) = \psi_n^\alpha(z)\,\dot B_n^\beta(z),$$
because
$$\dot B_n^\beta(z) = \dot B_{n-1}^\beta(z) = \prod_{j\in\mathfrak{b}_{n-1}}\zeta_j^\beta(z) = \prod_{j\in\mathfrak{b}_{n-1}}\zeta_{j*}^\alpha(z) \in \mathcal{L}_{(n-1)*}^\alpha.$$
Moreover, using $\phi_{n*}^\alpha = \phi_n^\beta$, we also have
$$\psi_n(z) = \int_{\mathbb{T}} D(t,z)\,[\phi_n(t)-\phi_n(z)]\,d\mu(t) = \int_{\mathbb{T}} D(t,z)\,\big[\phi_n^\alpha(t)\dot B_n^\beta(t) - \phi_n^\alpha(z)\dot B_n^\beta(z)\big]\,d\mu(t) = \int_{\mathbb{T}} D(t,z)\,\big[\phi_{n*}^\beta(t)\dot B_n^\beta(t) - \phi_{n*}^\beta(z)\dot B_n^\beta(z)\big]\,d\mu(t) = \int_{\mathbb{T}} D(t,z)\,\big[\phi_n^{\beta*}(t)\dot B_n^\alpha(t) - \phi_n^{\beta*}(z)\dot B_n^\alpha(z)\big]\,d\mu(t),$$
and since $\dot B_n^\alpha \in \mathcal{L}_{n*}^\beta(\beta_{n*})$, we also get the second part: $\psi_n = (\psi_n^\beta)^*\dot B_n^\alpha$.
Moreover,
$$\psi_n^* = \big[\psi_n^\alpha\dot B_n^\beta\big]_*\,\dot B_n^\alpha\dot B_n^\beta = \psi_{n*}^\alpha\,\dot B_n^\alpha\,\dot B_n^\beta\,\dot B_{n*}^\beta = \big[\psi_{n*}^\alpha\,\dot B_n^\alpha/\dot B_n^\beta\big]\,\dot B_n^\beta = \psi_n^{\alpha*}\,\dot B_n^\beta.$$
It follows in a similar way that $\psi_n^* = \psi_n^\beta\,\dot B_n^\alpha$.
The case $\gamma_n=\beta_n$ is proved similarly. ■
With these relations, it is not difficult to mimic the arguments of [9, Theorem 4.2.4] and
obtain the following.
Theorem 5.7. The functions of the second kind satisfy the recurrence relation
$$\begin{bmatrix}\psi_n^\nu(z)\\ -\psi_n^{\nu*}(z)\end{bmatrix} = H_n^\nu\,\frac{\varpi_{n-1}^\nu(z)}{\varpi_n^\nu(z)}\begin{bmatrix}1 & \lambda_n^\nu\\ \overline{\lambda_n^\nu} & 1\end{bmatrix}\begin{bmatrix}\zeta_{n-1}^\nu(z) & 0\\ 0 & 1\end{bmatrix}\begin{bmatrix}\psi_{n-1}^\nu(z)\\ -\psi_{n-1}^{\nu*}(z)\end{bmatrix}, \qquad n\geq 1,$$
with $\psi_0^\nu = \psi_0^{\nu*} = 1$ and all other quantities as in Theorem 5.1.
6 Para-orthogonal rational functions
Before we move on to quadrature formulas, we define the para-orthogonal rational functions (PORF) by
$$Q_n^\nu(z,\tau_n^\nu) = \phi_n^\nu(z) + \tau_n^\nu\,\phi_n^{\nu*}(z), \qquad \tau_n^\nu\in\mathbb{T}, \qquad \nu\in\{\alpha,\beta,\gamma\}. \tag{6.1}$$
The PORF $Q_n^\nu(z,\tau_n^\nu)$ is in $\mathcal{L}_n^\nu$; it is called para-orthogonal because it is not orthogonal to $\mathcal{L}_{n-1}^\nu$, but it is orthogonal to a particular subspace of $\mathcal{L}_{n-1}^\nu$ of dimension $n-1$.
Theorem 6.1. The para-orthogonal rational function $Q_n(z) = Q_n(z,\tau_n)$, $\tau_n\in\mathbb{T}$, with $n\geq 2$ is orthogonal to $\dot\zeta_n^\alpha\mathcal{L}_{n-1}\cap\dot\zeta_n^\beta\mathcal{L}_{n-1} = \zeta_n\mathcal{L}_{n-1}\cap\mathcal{L}_{n-1} = \mathcal{L}_{n-1}(\gamma_n)$ with
$$\mathcal{L}_{n-1}(\gamma_n) = \{f\in\mathcal{L}_{n-1} : f(\gamma_n) = 0\} = \left\{\frac{\varpi_n^*(z)\,p_{n-2}(z)}{\pi_{n-1}(z)} : p_{n-2}\in\mathcal{P}_{n-2}\right\}.$$
Recall that $\varpi_n^*(z) = z-\gamma_n$ if $\gamma_n\neq\infty$, and $\varpi_n^*(z) = -1$ if $\gamma_n=\infty$.
Proof. Suppose $\gamma_n=\alpha_n$; then $\dot\zeta_n^\alpha = \zeta_n^\alpha$ and $\dot\zeta_n^\beta = 1$. Hence $\phi_n\perp\mathcal{L}_{n-1}$ and $\phi_n^*\perp\zeta_n^\alpha\mathcal{L}_{n-1}$, and therefore $Q_n\perp\mathcal{L}_{n-1}\cap\zeta_n^\alpha\mathcal{L}_{n-1} = \mathcal{L}_{n-1}(\alpha_n)$. The proof for $\gamma_n=\beta_n$ is similar. ■
One may also define associated functions of the second kind as
$$P_n^\nu(z,\tau_n) = \psi_n^\nu(z) - \tau_n^\nu\,\psi_n^{\nu*}(z), \qquad \nu\in\{\alpha,\beta,\gamma\}.$$
From Theorems 3.9 and 5.6 we can easily obtain the following corollary.
Corollary 6.2. With the notation $Q_n$ and $P_n$ for the PORF and the associated functions just introduced, we have for $n\geq 1$
$$Q_n(z,\tau_n) = \dot B_n^\beta(z)\,Q_n^\alpha(z,\tau_n) = \tau_n\,\dot B_n^\alpha(z)\,Q_n^\beta(z,\overline{\tau_n}) \qquad\text{if } \gamma_n=\alpha_n,$$
while
$$Q_n(z,\tau_n) = \dot B_n^\alpha(z)\,Q_n^\beta(z,\tau_n) = \tau_n\,\dot B_n^\beta(z)\,Q_n^\alpha(z,\overline{\tau_n}) \qquad\text{if } \gamma_n=\beta_n.$$
Similarly,
$$P_n(z,\tau_n) = \dot B_n^\beta(z)\,P_n^\alpha(z,\tau_n) = -\tau_n\,\dot B_n^\alpha(z)\,P_n^\beta(z,\overline{\tau_n}) \qquad\text{if } \gamma_n=\alpha_n,$$
while
$$P_n(z,\tau_n) = \dot B_n^\alpha(z)\,P_n^\beta(z,\tau_n) = -\tau_n\,\dot B_n^\beta(z)\,P_n^\alpha(z,\overline{\tau_n}) \qquad\text{if } \gamma_n=\beta_n.$$
Proof. Assume that $\gamma_n=\alpha_n$; then $\phi_n = \phi_n^\alpha\dot B_n^\beta$ and $\phi_n^* = (\phi_n^\alpha)^*\dot B_n^\beta$. Thus
$$Q_n(\cdot,\tau_n) = \dot B_n^\beta\,\big[\phi_n^\alpha + \tau_n(\phi_n^\alpha)^*\big] = \dot B_n^\beta\,Q_n^\alpha(\cdot,\tau_n).$$
In a similar way one has
$$Q_n(\cdot,\tau_n) = \phi_n^{\beta*}\dot B_n^\alpha + \tau_n\,\phi_n^\beta\dot B_n^\alpha = \tau_n\,\dot B_n^\alpha\,\big[\phi_n^\beta + \overline{\tau_n}\,\phi_n^{\beta*}\big] = \tau_n\,\dot B_n^\alpha\,Q_n^\beta(\cdot,\overline{\tau_n}).$$
The proofs for $P_n$ and for $\gamma_n=\beta_n$ are similar. ■
We are now ready to state that the zeros of the para-orthogonal rational functions $Q_n^\nu(z,\tau_n^\nu)$ will be simple and on $\mathbb{T}$, no matter whether $\nu=\alpha$, $\beta$ or $\gamma$.
Corollary 6.3. Take $n\geq 1$, $\tau_n\in\mathbb{T}$ and $\nu\in\{\alpha,\beta,\gamma\}$; then the zeros of $Q_n^\nu(z,\tau_n)$ are all simple and on $\mathbb{T}$.
Proof. The case $\nu=\alpha$ was proved in [9, Theorem 5.2.1], and by symmetry this also holds for $\nu=\beta$. For $\nu=\gamma$, assume that $\gamma_n=\alpha_n$; then by the previous corollary $Q_n(z,\tau_n) = \dot B_n^\beta(z)\,Q_n^\alpha(z,\tau_n)$, so that the zeros of $Q_n(z,\tau_n)$ are the zeros of $Q_n^\alpha(z,\tau_n)$ (and hence will not cancel against any of the poles).
The proof for $\gamma_n=\beta_n$ is similar. ■
These PORF properties are important because the zeros will deliver the nodes for the rational
Szegő quadrature as we will explain in the next section.
For further reference we give the following property.
Proposition 6.4. For $n\geq 1$ and $\nu\in\{\alpha,\beta,\gamma\}$, the PORF satisfy, using the notation of Theorem 5.1,
$$Q_n^\nu(z,\tau_n) = c_n^\nu\,\frac{\varpi_{n-1}^\nu(z)}{\varpi_n^\nu(z)}\,\big[\zeta_{n-1}^\nu(z)\,\phi_{n-1}^\nu(z) + \tilde\tau_n^\nu\,\phi_{n-1}^{\nu*}(z)\big]$$
with
$$c_n^\nu = e_n^\nu\big(\eta_{n1} + \eta_{n2}\tau_n\overline{\lambda_n^\nu}\big), \qquad \tilde\tau_n^\nu = \frac{\hat\tau_n + \lambda_n^\nu}{1 + \hat\tau_n\overline{\lambda_n^\nu}}\in\mathbb{T}, \qquad \hat\tau_n = \overline{\eta_{n1}}\,\eta_{n2}\,\tau_n.$$
Proof. Just take the recurrence relation of Theorem 5.1 and premultiply it with $[1\;\;\tau_n]$. After rearrangement of the terms, the expression above results. ■
The importance of this property is that, up to a constant nonzero factor $c_n^\nu$, we can compute $Q_n^\nu(z,\tau_n)$ by exactly the same recurrence that gives $\phi_k^\nu$ in terms of $\phi_{k-1}^\nu$ and $\phi_{k-1}^{\nu*}$ for $k=1,2,\dots,n$, except that we have to replace the trailing parameter $\lambda_n^\nu$ by a unimodular constant $\tilde\tau_n^\nu$ to get $Q_n^\nu(z,\tau_n)$. When we are interested in the zeros of $Q_n^\nu(z,\tau_n)$, this normalizing factor $c_n^\nu$ does not matter, since it is nonzero.
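In the polynomial special case this recipe is easy to try out: run the Szegő-type numerator recurrence with parameters in $\mathbb{D}$ and replace the trailing parameter by a unimodular $\tau$; the zeros of the result then lie on $\mathbb{T}$, as Corollary 6.3 predicts. A sketch (parameter values our own):

```python
import numpy as np

# Para-orthogonal construction in the polynomial special case: run the
# Szego-type recurrence p_n = z p_{n-1} + lambda_n p_{n-1}^* with
# lambda_n in D, then replace the trailing parameter by a unimodular tau
# (Proposition 6.4).  All zeros of the result lie on T.
def step(p, lam):
    p_star = np.conj(p[::-1])                 # coefficients lowest first
    return np.concatenate(([0], p)) + lam * np.append(p_star, 0)

p = np.array([1.0 + 0j])
for lam in [0.4, -0.2 + 0.3j, 0.15j]:         # interior parameters in D
    p = step(p, lam)
tau = np.exp(0.7j)                            # |tau| = 1, trailing parameter
q = step(p, tau)                              # para-orthogonal numerator Q_4
assert np.allclose(np.abs(np.roots(q[::-1])), 1.0)   # zeros on the circle
```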
7 Quadrature
We start by proving that the subspace of rational functions in which the quadrature formulas will be exact depends only on the points $\{\alpha_k : k=0,\dots,n-1\}$, no matter whether the points $\alpha_k$ are introduced as a pole $\alpha_k$ or $\alpha_{k*} = 1/\overline{\alpha_k}$ in the sequence $\underline{\gamma}_n = (\gamma_k)_{k=0}^n$.
Lemma 7.1. $\mathcal{R}_n := \mathcal{L}_n\cdot\mathcal{L}_{n*} = \mathcal{R}_n^\alpha := \mathcal{L}_n^\alpha\cdot\mathcal{L}_{n*}^\alpha = \mathcal{R}_n^\beta := \mathcal{L}_n^\beta\cdot\mathcal{L}_{n*}^\beta$.
Proof. The space $\mathcal{L}_n$ contains rational functions of degree at most $n$ whose denominators have zeros in $\{\beta_j : j\in\mathfrak{a}_n\}\cup\{\alpha_j : j\in\mathfrak{b}_n\}$. The space $\mathcal{L}_{n*}$ contains rational functions of degree at most $n$ whose denominators have zeros in $\{\beta_j : j\in\mathfrak{b}_n\}\cup\{\alpha_j : j\in\mathfrak{a}_n\}$. Thus the space $\mathcal{R}_n$ contains rational functions of degree at most $2n$ whose denominators have zeros in $\{\beta_j : j=1,\dots,n\}\cup\{\alpha_j : j=1,\dots,n\}$. Since the denominators of functions in $\mathcal{L}_n^\alpha$ have zeros in $\{\beta_j : j=1,\dots,n\}$ and functions in $\mathcal{L}_{n*}^\alpha$ have zeros in $\{\alpha_j : j=1,\dots,n\}$, the denominators of functions in $\mathcal{R}_n^\alpha$ have zeros in $\{\beta_j : j=1,\dots,n\}\cup\{\alpha_j : j=1,\dots,n\}$, so that $\mathcal{R}_n = \mathcal{R}_n^\alpha$. Of course, a similar argument shows that also $\mathcal{R}_n = \mathcal{R}_n^\beta$. ■
As is well known from the classical case associated with the sequence $\alpha$, discussed for example in [9, Chapter 5], the rational Szegő quadrature formulas are of the form
$$I_n^\alpha(f) = \sum_{j=0}^{n-1} w_{nj}^\alpha\, f\big(\xi_{nj}^\alpha\big), \qquad n\geq 1,$$
where $\xi_{nj}^\alpha = \xi_{nj}^\alpha(\tau_n^\alpha)$, $j=0,\dots,n-1$, are the zeros of the para-orthogonal rational function $Q_n^\alpha(z,\tau_n^\alpha)$ with $\tau_n^\alpha\in\mathbb{T}$, and the weights are given by
$$w_{nj}^\alpha = w_{nj}^\alpha(\tau_n) = \frac{1}{\sum_{k=0}^{n-1}\big|\phi_k^\alpha\big(\xi_{nj}^\alpha(\tau_n^\alpha)\big)\big|^2} > 0.$$
These formulas have maximal degree of exactness, meaning that
$$\int_{\mathbb{T}} f(t)\,d\mu(t) = I_n^\alpha(f) \qquad \forall\, f\in\mathcal{R}_{n-1}^\alpha,$$
and that no $n$-point interpolatory quadrature formula with nodes on $\mathbb{T}$ and positive weights can be exact in a larger $(2n-1)$-dimensional space of the form $\mathcal{L}_n^\alpha\cdot\mathcal{L}_{(n-1)*}^\alpha$ or $\mathcal{L}_{n*}^\alpha\cdot\mathcal{L}_{n-1}^\alpha$. They are, however, exact in a subspace of dimension $2n-1$ of a different form, as shown in [39] and [17] for Laurent polynomials and in [11] for the rational case.
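The simplest instance of such a rule is the polynomial case with Lebesgue measure: there $Q_n(z,\tau) = z^n + \tau$, the nodes are the $n$ roots of $z^n = -\tau$, and all weights equal $1/n$. The exactness on $\mathcal{R}_{n-1}$, here the Laurent polynomials of degree at most $n-1$ in $z$ and $1/z$, can be checked directly:

```python
import numpy as np

# Rational Szego quadrature in the simplest special case: Lebesgue measure
# and all alpha_k = 0.  Then Q_n(z, tau) = z^n + tau, the nodes are the n
# roots of z^n = -tau (all on T), and all weights equal 1/n.  The rule is
# exact on span{z^k : |k| <= n-1}, where int z^k dmu = delta_{k0}.
n = 5
tau = np.exp(1.3j)                                  # any tau on T
nodes = np.roots(np.concatenate(([1], np.zeros(n - 1), [tau])))
assert np.allclose(np.abs(nodes), 1.0)              # nodes on the circle

weights = np.full(n, 1.0 / n)
for k in range(-(n - 1), n):                        # integrand f(z) = z^k
    exact = 1.0 if k == 0 else 0.0
    quad = np.sum(weights * nodes ** k)
    assert abs(quad - exact) < 1e-10
```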
Our previous results show that taking an arbitrary order of selecting the $\gamma$'s does not add new quadrature formulas. Indeed, if $\gamma_n=\alpha_n$, then the zeros of $Q_n(z,\tau_n)$, $Q_n^\alpha(z,\tau_n)$ and $Q_n^\beta(z,\overline{\tau_n})$ are all the same, i.e., $\xi_{nj}(\tau_n) = \xi_{nj}^\alpha(\tau_n) = \xi_{nj}^\beta(\overline{\tau_n})$, $j=1,\dots,n$. Dropping the dependency on $\tau$ from the notation, it is directly seen that
$$\sum_{k=0}^{n-1}|\phi_k(\xi_{nj})|^2 = \sum_{k=0}^{n-1}\big|\phi_k^\alpha(\xi_{nj})\,\dot B_k^\beta(\xi_{nj})\big|^2 = \sum_{k=0}^{n-1}\big|\phi_k^\alpha(\xi_{nj})\big|^2 = \sum_{k=0}^{n-1}\big|\phi_k^\alpha\big(\xi_{nj}^\alpha\big)\big|^2,$$
and thus we also have $w_{nj} = w_{nj}^\alpha$ and similarly $w_{nj} = w_{nj}^\beta$. Therefore $I_n = I_n^\alpha = I_n^\beta$ for appropriate choices of the defining parameters, i.e.,
$$\tau_n^\alpha = \tau_n \quad\text{and}\quad \tau_n^\beta = \overline{\tau_n} \quad\text{if } \gamma_n=\alpha_n, \qquad\qquad \tau_n^\beta = \tau_n \quad\text{and}\quad \tau_n^\alpha = \overline{\tau_n} \quad\text{if } \gamma_n=\beta_n.$$
An alternative expression for the weights is also known ([9, Theorem 5.4.1]), and it will obviously also coincide for $\nu\in\{\alpha,\beta,\gamma\}$:
$$w_{nj}^\nu = \frac{1}{2\,\xi_{nj}^\nu}\;\frac{P_n^\nu(z)}{\frac{d}{dz}Q_n^\nu(z)}\Bigg|_{z=\xi_{nj}^\nu},$$
where $Q_n^\nu$ is the PORF with zeros $\{\xi_{nj}^\nu\}_{j=0}^{n-1}$ and $P_n^\nu$ the associated function of the second kind, as in Corollary 6.2. We have again dropped the obvious dependence on the parameter $\tau_n^\nu$ from the notation.
A conclusion to be drawn from this section is that whether we choose the sequence $\gamma$, $\alpha$ or $\beta$, the resulting quadrature formula is an $n$-point formula that is exact in the space $\mathcal{R}_{n-1} = \mathcal{L}_{n-1}\cdot\mathcal{L}_{(n-1)*}$. Such a quadrature formula is called an $n$-point rational Szegő quadrature, and these are the only positive interpolatory quadrature formulas exact in this space. They are unique up to the choice of the parameter $\tau_n\in\mathbb{T}$. Thus, to obtain the nodes and weights for a general sequence $\gamma$ and a choice of $\tau_n\in\mathbb{T}$, we can as well compute them for the sequence $\alpha$ and an appropriate $\tau_n^\alpha\in\mathbb{T}$, or for the sequence that alternates between one $\alpha$ and one $\beta$. The resulting quadrature will be the same.
In practice, the nodes and weights of a quadrature formula are computed by solving an eigenvalue problem. It will turn out that also in these computations nothing is gained from adopting a general order for the $\gamma$ sequence, but that a sequence alternating between $\alpha$ and $\beta$ (the balanced situation) is the most economical one. We shall come back to this in Section 10, after we provide
Rational interpolatory and Szegő quadrature formulas with arbitrary order of the poles in-
cluding error estimates and numerical experiments were considered in [19].
8 Block diagonal with Hessenberg blocks
The orthogonal polynomials are a special case of ORF obtained by choosing a sequence γ = α
that contains only zeros. Another special case is given by the orthogonal Laurent polynomi-
als (OLP) obtained by choosing an alternating sequence γ = (0,∞, 0,∞, 0,∞, . . .). In [47],
Velázquez described spectral methods for ORF on the unit circle that were based on these two
special choices. The result was that the shift operator $T_\mu\colon L_\mu^2\to L_\mu^2\colon f(z)\mapsto zf(z)$ has a matrix
representation that is a matrix Möbius transform of a structured matrix. The structure is a
Hessenberg structure in the case of γ = α and it is a five-diagonal matrix (a so-called CMV
matrix) for the sequence γ = (0, β1, α2, β3, α4, . . .). In the case of orthogonal (Laurent) polyno-
mials the Möbius transform turns out to be just the identity and we get the plain Hessenberg
matrix for the polynomials and the plain CMV matrix for the Laurent polynomials.
In [16], the OLP case was discussed using an alternative linear algebra approach when the 0,∞
choice did not alternate nicely, but the order in which they were added was arbitrary. Then the
structured matrix generalized to a so-called snake-shape matrix referring to the zig-zag shape of
the graphical representation that fixed the order in which the elementary factors of the matrix
had to be multiplied. This structure is a generalization of both the Hessenberg and the CMV
structures mentioned above. It is a block diagonal where the blocks alternate between upper
and lower Hessenberg structure.
To illustrate this for our ORF, we start in this section by using the approach of Velázquez
in [47], skipping all the details just to obtain the block structure of the matrix that will represent
the shift operator. In the next sections we shall use the linear algebra approach of [16] to analyze
the computational aspects of obtaining the quadrature formulas.
We start from the recurrence for the φαk and transform it into a recurrence relation for the φk,
which will eventually result in a representation of the shift operator.
To avoid a tedious book-keeping of normalizing constants, we will just exploit the fact that there are some constants such that certain dependencies hold. For example, the recurrence
$$\phi_n^\alpha(z) = e_n\eta_{n1}\,\frac{\varpi_{n-1}^\alpha(z)}{\varpi_n^\alpha(z)}\,\big[\zeta_{n-1}^\alpha(z)\,\phi_{n-1}^\alpha(z) + \lambda_n^\alpha\,\phi_{n-1}^{\alpha*}(z)\big],$$
or equivalently
$$\phi_n^\alpha(z) = e_n\eta_{n1}\left[\sigma_{n-1}\,\frac{\varpi_{n-1}^{\alpha*}(z)}{\varpi_n^\alpha(z)}\,\phi_{n-1}^\alpha(z) + \lambda_n^\alpha\,\frac{\varpi_{n-1}^\alpha(z)}{\varpi_n^\alpha(z)}\,\phi_{n-1}^{\alpha*}(z)\right],$$
or
$$\varpi_{n-1}^{\alpha*}(z)\,\phi_{n-1}^\alpha(z) = e_n^{-1}\,\overline{\eta_{n1}\sigma_{n-1}}\,\varpi_n^\alpha(z)\,\phi_n^\alpha(z) - \lambda_n^\alpha\,\overline{\sigma_{n-1}}\,\varpi_{n-1}^\alpha(z)\,\phi_{n-1}^{\alpha*}(z) \tag{8.1}$$
will be expressed as
$$\varpi_{n-1}^{\alpha*}\,\phi_{n-1}^\alpha \in \operatorname{span}\big\{\varpi_n^\alpha\,\phi_n^\alpha,\ \varpi_{n-1}^\alpha\,\phi_{n-1}^{\alpha*}\big\}. \tag{8.2}$$
will be expressed as
$α∗
n−1φ
α
n−1 ∈ span
{
$α
nφ
α
n, $
α
n−1φ
α∗
n−1
}
. (8.2)
Similarly, by combining the recurrences for $\phi^\alpha_n$ and $\phi^{\alpha*}_n$ from Theorem 5.1 and eliminating $\phi^\alpha_{n-1}$, we have
\[
\varpi^\alpha_n\phi^{\alpha*}_n \in \operatorname{span}\bigl\{\varpi^\alpha_n\phi^\alpha_n,\ \varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}\bigr\}. \tag{8.3}
\]
Note that this relation always holds, whether $\lambda^\alpha_n = 0$ or not.
24 A. Bultheel, R. Cruz-Barroso and A. Lasarow
Now suppose that our sequence γ has the following configuration
\[
\gamma = (\ldots, \beta_{k-1}, \alpha_k, \ldots, \alpha_n, \ldots, \alpha_{m-1}, \beta_m, \ldots, \beta_{l-1}, \alpha_l, \ldots)
\]
with $2 \le k \le n < m < l$, where $k$, $m$, $l$ are fixed and $n$ varies from $k$ to $m-1$.
Using (8.2) and then repeatedly (8.3) we get
\[
\varpi^{\alpha*}_n\phi^\alpha_n \in \operatorname{span}\bigl\{\varpi^\alpha_{n+1}\phi^\alpha_{n+1}, \ldots, \varpi^\alpha_k\phi^\alpha_k,\ \varpi^\alpha_{k-1}\phi^{\alpha*}_{k-1}\bigr\}.
\]
Multiply this with $\dot B^\beta_n = \dot B^\beta_{n-1} = \cdots = \dot B^\beta_{k-1}$:
\[
\varpi^{\alpha*}_n\dot B^\beta_n\phi^\alpha_n \in \operatorname{span}\bigl\{\varpi^\alpha_{n+1}\dot B^\beta_n\phi^\alpha_{n+1}, \ldots, \varpi^\alpha_k\dot B^\beta_k\phi^\alpha_k,\ \varpi^\alpha_{k-1}\dot B^\beta_{k-1}\phi^{\alpha*}_{k-1}\bigr\},
\]
and since by Theorem 3.9, $\phi_p = \dot B^\beta_p\phi^\alpha_p$ for $p = n, n-1, \ldots, k$ and $\phi_{k-1} = \dot B^\beta_{k-1}\phi^{\alpha*}_{k-1}$, this becomes
\[
\varpi^{\alpha*}_n\phi_n \in \operatorname{span}\bigl\{\varpi^\alpha_{n+1}\dot B^\beta_n\phi^\alpha_{n+1},\ \varpi^\alpha_n\phi_n, \ldots, \varpi^\alpha_k\phi_k,\ \varpi^\alpha_{k-1}\phi_{k-1}\bigr\}. \tag{8.4}
\]
Thus if $n+1 < m$, then $\gamma_{n+1} = \alpha_{n+1}$, hence $\dot B^\beta_{n+1} = \dot B^\beta_n$ and $\phi_{n+1} = \dot B^\beta_{n+1}\phi^\alpha_{n+1}$, so that
\[
\varpi^{\alpha*}_n\phi_n \in \operatorname{span}\bigl\{\varpi^\alpha_p\phi_p\bigr\}_{p=k-1}^{n+1}, \qquad k \le n < m-1. \tag{8.5}
\]
If $n+1 = m$, then $\gamma_{n+1} = \beta_m$ and we start dealing with the subsequence of β's in $\gamma = (\ldots, \alpha_{m-1}, \beta_m, \ldots, \beta_{l-1}, \alpha_l, \ldots)$.
Therefore we note that
\[
\varpi^\alpha_m(z) = (z - \alpha_m)\frac{1 - \overline{\alpha_m}z}{z - \alpha_m} = (z - \alpha_m)\frac{z - \beta_m}{1 - \overline{\beta_m}z}\,\sigma_m^2 = \sigma_m\varpi^{\alpha*}_m(z)\zeta^\beta_m(z). \tag{8.6}
\]
Thus
\[
\varpi^\alpha_m\dot B^\beta_{m-1}\phi^\alpha_m = \sigma_m\varpi^{\alpha*}_m\dot B^\beta_m\phi^\alpha_m.
\]
Using (8.2) again repeatedly we then see that
\begin{align*}
\varpi^\alpha_m\dot B^\beta_{m-1}\phi^\alpha_m &\in \dot B^\beta_m\operatorname{span}\bigl\{\varpi^\alpha_m\phi^{\alpha*}_m,\ \varpi^\alpha_{m+1}\phi^\alpha_{m+1}\bigr\} = \dot B^\beta_m\operatorname{span}\bigl\{\varpi^\alpha_m\phi^{\alpha*}_m,\ \varpi^{\alpha*}_{m+1}\zeta^\beta_{m+1}\phi^\alpha_{m+1}\bigr\}\\
&= \operatorname{span}\bigl\{\varpi^\alpha_m\dot B^\beta_m\phi^{\alpha*}_m,\ \varpi^{\alpha*}_{m+1}\dot B^\beta_{m+1}\phi^\alpha_{m+1}\bigr\} = \cdots\\
&= \operatorname{span}\bigl\{\varpi^\alpha_m\dot B^\beta_m\phi^{\alpha*}_m, \ldots, \varpi^\alpha_{l-1}\dot B^\beta_{l-1}\phi^{\alpha*}_{l-1},\ \varpi^\alpha_l\dot B^\beta_l\phi^\alpha_l\bigr\}\\
&= \operatorname{span}\bigl\{\varpi^\alpha_m\phi_m, \ldots, \varpi^\alpha_{l-1}\phi_{l-1},\ \varpi^\alpha_l\phi_l\bigr\},
\end{align*}
so that plugging this into (8.4) with $n+1 = m$ gives
\[
\varpi^{\alpha*}_{m-1}\phi_{m-1} \in \operatorname{span}\bigl\{\varpi^\alpha_p\phi_p\bigr\}_{p=k-1}^{l}. \tag{8.7}
\]
The two relations (8.4) and (8.7) tell us how we should express $\varpi^{\alpha*}_n\phi_n$ in terms of the $\{\varpi^\alpha_p\phi_p\colon p \le n+1\}$ in the case that $\gamma_n = \alpha_n$.
The next step is to express $\varpi^{\alpha*}_n\phi_n$ in terms of $\{\varpi^\alpha_p\phi_p\}$, $p \ge n-1$, when $\gamma_n = \beta_n$. This goes along the same lines. Let us consider the mirror situation
\[
\gamma = (\ldots, \alpha_{m-1}, \beta_m, \ldots, \beta_n, \ldots, \beta_{l-1}, \alpha_l, \ldots, \alpha_{r-1}, \beta_r, \ldots)
\]
with $2 \le m \le n < l < r$, where $m$, $l$, $r$ are again fixed and $n$ varies from $m$ to $l-1$.
We first use (8.6) for $m = n$ to write
\[
\varpi^{\alpha*}_n\phi_n = \varpi^{\alpha*}_n\dot B^\beta_n\phi^{\alpha*}_n = \sigma_n\varpi^\alpha_n\dot B^\beta_{n-1}\phi^{\alpha*}_n,
\]
which implies that we shall need expressions for $\varpi^\alpha_n\phi^{\alpha*}_n$. We use repeatedly (8.3) in combination with the previous relation to get
\begin{align*}
\varpi^{\alpha*}_n\phi_n &\in \dot B^\beta_{n-1}\operatorname{span}\bigl\{\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1},\ \varpi^\alpha_n\phi^\alpha_n\bigr\} = \operatorname{span}\bigl\{\dot B^\beta_{n-1}\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1},\ \dot B^\beta_n\varpi^{\alpha*}_n\phi^\alpha_n\bigr\}\\
&= \operatorname{span}\bigl\{\dot B^\beta_{n-1}\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1},\ \dot B^\beta_n\varpi^\alpha_n\phi^{\alpha*}_n,\ \dot B^\beta_n\varpi^\alpha_{n+1}\phi^\alpha_{n+1}\bigr\}\\
&= \operatorname{span}\bigl\{\dot B^\beta_{n-1}\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1},\ \dot B^\beta_n\varpi^\alpha_n\phi^{\alpha*}_n,\ \dot B^\beta_{n+1}\varpi^{\alpha*}_{n+1}\phi^\alpha_{n+1}\bigr\}\\
&= \operatorname{span}\bigl\{\dot B^\beta_{n-1}\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1},\ \dot B^\beta_n\varpi^\alpha_n\phi^{\alpha*}_n,\ \dot B^\beta_{n+1}\varpi^\alpha_{n+1}\phi^{\alpha*}_{n+1},\ \dot B^\beta_{n+1}\varpi^\alpha_{n+2}\phi^\alpha_{n+2}\bigr\}\\
&= \cdots = \operatorname{span}\bigl\{\dot B^\beta_{n-1}\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}, \ldots, \dot B^\beta_{l-1}\varpi^\alpha_{l-1}\phi^{\alpha*}_{l-1},\ \dot B^\beta_{l-1}\varpi^\alpha_l\phi^\alpha_l\bigr\}\\
&= \operatorname{span}\bigl\{\dot B^\beta_{n-1}\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1},\ \varpi^\alpha_n\phi_n, \ldots, \varpi^\alpha_{l-1}\phi_{l-1},\ \varpi^\alpha_l\phi_l\bigr\},
\end{align*}
where in the last step we used $\gamma_l = \alpha_l$ and hence $\dot B^\beta_{l-1} = \dot B^\beta_l$ and $\dot B^\beta_{l-1}\phi^\alpha_l = \phi_l$. Thus if $n > m$, then also $\gamma_{n-1} = \beta_{n-1}$, so that also the first term is $\varpi^\alpha_{n-1}\phi_{n-1}$ and we have found that
\[
\varpi^{\alpha*}_n\phi_n \in \operatorname{span}\bigl\{\varpi^\alpha_p\phi_p\bigr\}_{p=n-1}^{l}, \qquad m < n \le l-1. \tag{8.8}
\]
Now we are left with the remaining case $n = m$, i.e., $\gamma_{n-1} = \alpha_{n-1}$ while $\gamma_n = \beta_n$. This is the missing link that should connect a β block to the previous α block at the boundary $\ldots, \alpha_{m-1}, \beta_m, \ldots$.
Using (8.3) repeatedly we get
\begin{align*}
\varpi^\alpha_{m-1}\phi^{\alpha*}_{m-1} &\in \operatorname{span}\bigl\{\varpi^\alpha_{m-1}\phi^\alpha_{m-1},\ \varpi^\alpha_{m-2}\phi^{\alpha*}_{m-2}\bigr\}\\
&= \operatorname{span}\bigl\{\varpi^\alpha_{m-1}\phi^\alpha_{m-1},\ \varpi^\alpha_{m-2}\phi^\alpha_{m-2},\ \varpi^\alpha_{m-3}\phi^{\alpha*}_{m-3}\bigr\} = \cdots\\
&= \operatorname{span}\bigl\{\varpi^\alpha_{m-1}\phi^\alpha_{m-1},\ \varpi^\alpha_{m-2}\phi^\alpha_{m-2}, \ldots, \varpi^\alpha_k\phi^\alpha_k,\ \varpi^\alpha_{k-1}\phi^{\alpha*}_{k-1}\bigr\}.
\end{align*}
After multiplying with $\dot B^\beta_k = \dot B^\beta_{k+1} = \cdots = \dot B^\beta_{n-1}$ we get
\[
\dot B^\beta_{m-1}\varpi^\alpha_{m-1}\phi^{\alpha*}_{m-1} \in \operatorname{span}\bigl\{\varpi^\alpha_p\phi_p\bigr\}_{p=k-1}^{n-1}.
\]
We plug this in our previous expression for $\varpi^{\alpha*}_n\phi_n$ and we arrive at
\[
\varpi^{\alpha*}_m\phi_m \in \operatorname{span}\bigl\{\varpi^\alpha_p\phi_p\bigr\}_{p=k-1}^{l}. \tag{8.9}
\]
To summarize, (8.4) and (8.7) in the case $\gamma_n = \alpha_n$ and (8.8) and (8.9) in the case $\gamma_n = \beta_n$ show short recurrences for the $\phi_p$ that fully rely on the recursion for the α-related quantities $\phi^\alpha_n$ and $\phi^{\alpha*}_n$. They use only factors $\varpi^\alpha_p$ and $\varpi^{\alpha*}_p$, and in the recurrence relations only the rational Szegő parameters $\lambda^\alpha_p$ for the α sequence are used. The longest recursions are needed to switch from an α to a β block, which is for $n \in \{m-1, m\}$ as given in (8.7) and (8.9), since these need all elements from both blocks.
This analysis should illustrate that as long as we are in a succession of α's we are building up an upper Hessenberg block. At the instant the α sequence switches to a β sequence, one starts building a lower Hessenberg block, which switches back to upper Hessenberg when again α's
enter the γ sequence, etc. See Fig. 1.

Figure 1. Structure of the matrix Ĝ, which is the same as the structure of the matrix G.

Of course, if there are only α's in the sequence, we end up with just an upper Hessenberg matrix as in the classical case. If we alternate between one α and one β, we get the so-called CMV-type matrix, which is a five-diagonal matrix with a characteristic block structure as given in Fig. 2.
Suppose we start with a set of α's, and set
\[
A = \operatorname{diag}(\alpha_0, \alpha_1, \ldots),
\]
and let us recycle our earlier notations $\varpi_n(z) = 1 - \overline{\alpha_n}z$ and $\varpi^*_n(z) = z - \alpha_n$ to now act on an arbitrary operator $T$, doing the transform for the whole sequence α simultaneously as follows: $\varpi_A(T) = I - TA^*$ and $\varpi^*_A(T) = T - A$, where $I$ is the identity operator. Then (at least formally) we have
\[
[\phi_0, \phi_1, \ldots]\,\varpi^*_A(zI) = [\phi_0, \phi_1, \ldots]\,\varpi_A(zI)\,\hat G \tag{8.10}
\]
with Ĝ having the structure shown in Fig. 1. In the special case that the α's and β's alternate as in
\[
\gamma = (\alpha_1, \beta_2, \alpha_3, \beta_4, \alpha_5, \ldots),
\]
we get the five-diagonal matrix as in [12] for polynomials and as in [47] for ORF. Since for $k \ge 1$ each $\alpha_{2k-1}$-column is the last in an α block and each $\beta_{2k}$-column is the first in a β block, we get a succession of $4\times 2$ blocks that shift down by two rows, as illustrated in Fig. 2.
The particular role of the last column in an α block and the first column in a β block is due to the fact that we have chosen to derive these relations for the $\phi_k$ from the recurrence for the $\phi^\alpha_k$, leading to factors $\varpi_A$ and $\varpi^*_A$ and the use of parameters $\lambda^\alpha_k$. A symmetric derivation could have been given by starting from the $\phi^\beta_k$ recursion, in which case the β blocks would correspond to upper Hessenberg blocks and the α blocks to lower Hessenberg blocks. Then the longer recurrence would occur in the last column of the β block and the first column of the α block. The shape of the matrix would be the transposed shape of the figures shown here. However, while it is rather simple to deal with $\alpha_j = 0$ in the current derivation, it requires dealing with more technical issues to write down the formulas with the corresponding $\beta_j$ equal to $\infty$. Therefore we stick to the present form.
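To make the alternating Hessenberg-block picture concrete, here is a small numerical sketch. It is an illustration of ours, not taken from the paper: the elementary $2\times 2$ unitary factors anticipate the factorization of Section 9, and their exact entries, in particular the placement of the conjugate, are an assumption.

```python
import numpy as np

def factor(N, k, delta):
    """Identity with an elementary 2x2 unitary block at (1-based) rows/cols k, k+1.
    The block [[-d, e], [e, conj(d)]], e = sqrt(1-|d|^2), is an assumed normalization."""
    e = np.sqrt(1.0 - abs(delta) ** 2)
    G = np.eye(N, dtype=complex)
    G[k - 1:k + 1, k - 1:k + 1] = [[-delta, e], [e, np.conj(delta)]]
    return G

N = 8
rng = np.random.default_rng(0)
d = 0.8 * rng.random(N - 1) * np.exp(2j * np.pi * rng.random(N - 1))
Gs = {k: factor(N, k, d[k - 1]) for k in range(1, N)}

up = np.linalg.multi_dot([Gs[k] for k in range(1, N)])           # increasing order
down = np.linalg.multi_dot([Gs[k] for k in range(N - 1, 0, -1)])  # decreasing order
assert np.allclose(np.tril(up, -2), 0)    # upper Hessenberg: one subdiagonal
assert np.allclose(np.triu(down, 2), 0)   # lower Hessenberg: one superdiagonal

# a mixed ("snake") order: increasing over an alpha block, decreasing over a beta block
snake = np.linalg.multi_dot([Gs[k] for k in [1, 2, 3, 7, 6, 5, 4]])
assert np.allclose(snake @ snake.conj().T, np.eye(N))  # the product stays unitary
```

Printing `np.abs(snake) > 1e-12` displays the zig-zag sparsity pattern of alternating upper and lower Hessenberg blocks sketched in Fig. 1.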
Figure 2. Structure of the matrix Ĝ when the α's and β's alternate, which is the five-diagonal CMV matrix given in [12, 47].
We can define, as in [47], an operator Möbius transform of an operator $T$ by
\[
\zeta_A(T) = \eta_A\varpi_A(T)^{-1}\varpi^*_A(T)\eta_{A^*}^{-1}, \qquad \varpi_A(T) = I - TA^*, \qquad \varpi^*_A(T) = T - A,
\]
and its inverse
\[
\tilde\zeta_A(T) = \eta_A^{-1}\tilde\varpi^*_A(T)\tilde\varpi_A(T)^{-1}\eta_{A^*}, \qquad \tilde\varpi_A(T) = I + A^*T, \qquad \tilde\varpi^*_A(T) = T + A,
\]
where
\[
\eta_A := \varpi_A(A)^{1/2} = \sqrt{I - AA^*} = \sqrt{I - A^*A} = \eta_{A^*} = \varpi_{A^*}(A^*)^{1/2} = \operatorname{diag}\Bigl(1, \sqrt{1-|\alpha_1|^2}, \sqrt{1-|\alpha_2|^2}, \ldots\Bigr).
\]
Recall the meaning of the $(\,)^*$ notation for a (constant) matrix from Remark 3.5. Note that $\tilde\varpi_A$ is like $\varpi_A$, except that the minus sign is changed into a plus sign and $TA^*$ is changed into $A^*T$. It should also be clear that the alternating case (i.e., the case of the CMV representation) gives the smallest possible bandwidth, as was also reported in [12].
Now it is not difficult to see that (8.10) is equivalent to
\[
[\phi_0(z), \phi_1(z), \phi_2(z), \ldots]\bigl(zI - \tilde\zeta_A(G)\bigr) = 0, \qquad \hat G = \eta_A^{-1}\,G\,\eta_{A^*}.
\]
To see this, exactly the same argument as used in [47] to derive relation (19) of that paper can be applied here. Instead we give a direct derivation. The relation (8.10) can be written as
\[
0 = \Phi\bigl[\varpi^*_A(zI) - \varpi_A(zI)\hat G\bigr] = \Phi\bigl[(zI - A) - (I - zA^*)\hat G\bigr] = \Phi\bigl[z\bigl(I + A^*\hat G\bigr) - \bigl(A + \hat G\bigr)\bigr].
\]
Since $\tilde\varpi_A(\hat G) = I + A^*\hat G$ is invertible, this is equivalent with
\[
0 = \Phi\bigl[zI - \tilde\varpi^*_A\bigl(\hat G\bigr)\tilde\varpi_A\bigl(\hat G\bigr)^{-1}\bigr] = \Phi\bigl[zI - \eta_A^{-1}\tilde\varpi^*_A(G)\tilde\varpi_A(G)^{-1}\eta_{A^*}\bigr] = \Phi\bigl[zI - \tilde\zeta_A(G)\bigr],
\]
where we used the fact that the matrix $\eta_A = \eta_{A^*}$ and its inverse commute with any diagonal matrix.
The matrices $G$ and $\hat G$ have the same shape, but $G$ is rescaled to make it isometric. So we see that on $L = \operatorname{span}\{\phi_0, \phi_1, \ldots\}$ the operator $zI$ has a matrix representation $\tilde\zeta_A(G)$ with respect to the φ-basis. We shall have a more careful discussion of this fact in the next section.
9 Factorization of a general CMV matrix
It is well known (see [6, 47] in the rational case and [16] for the polynomial case) that the Hessenberg matrix $G = H$ (obtained when $\gamma = \alpha$) can be written as an infinite product of elementary matrices, i.e.,
\[
G = H = G_1G_2G_3G_4\cdots,
\]
where
\[
G_k := \begin{bmatrix} I_{k-1} & 0 & 0\\ 0 & \tilde G_k & 0\\ 0 & 0 & I_\infty \end{bmatrix}, \qquad \tilde G_k := \begin{bmatrix} -\delta_k & \eta_k\\ \eta_k & \overline{\delta_k} \end{bmatrix}, \qquad \forall\, k \ge 1, \tag{9.1}
\]
where $I_{k-1}$ and $I_\infty$ are the identity matrices of sizes $(k-1)$ and $\infty$ respectively, $I_0$ is the empty matrix, $\delta_k = \lambda^\alpha_k$ is the $k$-th Szegő parameter, and $\eta_k := \sqrt{1-|\delta_k|^2}$.
Also the CMV matrix $C_\varepsilon$ associated with the alternating sequence $\gamma = \varepsilon$, where $\gamma_{2k-1} = \alpha_{2k-1}$ and $\gamma_{2k} = \beta_{2k}$ for all $k \ge 1$, uses the same elementary transforms, but they are multiplied in a different order:
\[
G = C_oC_e = (\cdots G_9G_7G_5G_3G_1)\cdot(G_2G_4G_6G_8G_{10}\cdots) = (G_1G_3G_5G_7G_9\cdots)\cdot(G_2G_4G_6G_8G_{10}\cdots).
\]
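The five-diagonal structure produced by the odd/even ordering can be sanity-checked numerically. This is an illustration only, with the same assumed elementary factors as in (9.1) (the conjugate placement in the $2\times 2$ block is our assumption):

```python
import numpy as np

def factor(N, k, delta):
    # identity with the elementary 2x2 block of (9.1) at (1-based) position k;
    # the conjugate placement is an assumption of this sketch
    e = np.sqrt(1.0 - abs(delta) ** 2)
    G = np.eye(N, dtype=complex)
    G[k - 1:k + 1, k - 1:k + 1] = [[-delta, e], [e, np.conj(delta)]]
    return G

N = 9
rng = np.random.default_rng(1)
d = 0.7 * rng.random(N - 1) * np.exp(2j * np.pi * rng.random(N - 1))
Gs = {k: factor(N, k, d[k - 1]) for k in range(1, N)}

C = np.linalg.multi_dot([Gs[k] for k in range(1, N, 2)]) \
  @ np.linalg.multi_dot([Gs[k] for k in range(2, N, 2)])   # C = C_o C_e
assert np.allclose(C, np.triu(np.tril(C, 2), -2))  # five-diagonal (CMV) band
assert np.allclose(C @ C.conj().T, np.eye(N))      # unitary
```

The band assertion keeps only the entries with $|i-j|\le 2$ and checks nothing is lost, which is exactly the five-diagonal CMV shape.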
To find explicit expressions for these elementary factors in our case, taking into account the proper normalization, requires a more detailed analysis. We start with the following theorem.

Theorem 9.1. In the case $\gamma = \alpha$, the recurrence relation of Theorem 5.1 can be rewritten in the form
\[
\left[\frac{\varpi^{\alpha*}_{n-1}\phi^\alpha_{n-1}}{\sqrt{1-|\alpha_{n-1}|^2}} \quad \frac{\varpi^\alpha_n\phi^{\alpha*}_n}{\sqrt{1-|\alpha_n|^2}}\right]
= \left[\frac{\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}}{\sqrt{1-|\alpha_{n-1}|^2}} \quad \frac{\varpi^\alpha_n\phi^\alpha_n}{\sqrt{1-|\alpha_n|^2}}\right]\tilde G^\alpha_n \tag{9.2}
\]
with
\[
\tilde G^\alpha_n = \sigma_{n-1}\eta^\alpha_{n1}\begin{bmatrix} -\lambda^\alpha_n\eta^\alpha_{n1} & \sqrt{1-|\lambda^\alpha_n|^2}\\ \sqrt{1-|\lambda^\alpha_n|^2} & \overline{\lambda^\alpha_n\eta^\alpha_{n1}} \end{bmatrix}\begin{bmatrix} 1 & 0\\ 0 & \sigma_n \end{bmatrix},
\]
or
\[
\bigl[\varpi^{\alpha*}_{n-1}\phi^\alpha_{n-1} \quad \varpi^\alpha_n\phi^{\alpha*}_n\bigr] = \bigl[\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1} \quad \varpi^\alpha_n\phi^\alpha_n\bigr]\hat G^\alpha_n \tag{9.3}
\]
with
\[
\hat G^\alpha_n = \begin{bmatrix} \bigl(1-|\alpha_{n-1}|^2\bigr)^{-1/2} & 0\\ 0 & \bigl(1-|\alpha_n|^2\bigr)^{-1/2} \end{bmatrix}\tilde G^\alpha_n\begin{bmatrix} \bigl(1-|\alpha_{n-1}|^2\bigr)^{1/2} & 0\\ 0 & \bigl(1-|\alpha_n|^2\bigr)^{1/2} \end{bmatrix}.
\]
Proof. This is in fact an explicit form of what in the previous section was expressed as (8.2) and (8.3).
Taking the first line of the recurrence relation from Theorem 5.1 and solving for $\varpi^{\alpha*}_{n-1}\phi^\alpha_{n-1}$, we get the explicit form of (8.1), i.e.,
\[
\varpi^{\alpha*}_{n-1}\phi^\alpha_{n-1} = (e^\alpha_n)^{-1}\eta^\alpha_{n1}\sigma_{n-1}\varpi^\alpha_n\phi^\alpha_n - \lambda^\alpha_n\sigma_{n-1}\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}.
\]
Using
\[
e^\alpha_n = \frac{\sqrt{1-|\alpha_n|^2}}{\sqrt{1-|\alpha_{n-1}|^2}}\,\frac{1}{\sqrt{1-|\lambda^\alpha_n|^2}}
\]
and $\eta^\alpha_{n2} = \eta^\alpha_{n1}\sigma_{n-1}\sigma_n$, this becomes
\[
\frac{\varpi^{\alpha*}_{n-1}\phi^\alpha_{n-1}}{\sqrt{1-|\alpha_{n-1}|^2}} = \eta^\alpha_{n1}\sigma_{n-1}\left[-\lambda^\alpha_n\eta^\alpha_{n1}\frac{\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}}{\sqrt{1-|\alpha_{n-1}|^2}} + \sqrt{1-|\lambda^\alpha_n|^2}\,\frac{\varpi^\alpha_n\phi^\alpha_n}{\sqrt{1-|\alpha_n|^2}}\right]. \tag{9.4}
\]
The second line of the recurrence relation in Theorem 5.1 is
\[
\varpi^\alpha_n\phi^{\alpha*}_n = e^\alpha_n\eta^\alpha_{n2}\sigma_{n-1}\lambda^\alpha_n\varpi^{\alpha*}_{n-1}\phi^\alpha_{n-1} + e^\alpha_n\eta^\alpha_{n2}\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}.
\]
Eliminate the term $\varpi^{\alpha*}_{n-1}\phi^\alpha_{n-1}$ with the first line of the recurrence; after inserting the value of $e^\alpha_n$ from (5.1) and rearranging, we get
\[
\frac{\varpi^\alpha_n\phi^{\alpha*}_n}{\sqrt{1-|\alpha_n|^2}} = \eta^\alpha_{n1}\sigma_{n-1}\left[\frac{\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}}{\sqrt{1-|\alpha_{n-1}|^2}}\,\sqrt{1-|\lambda^\alpha_n|^2}\,\sigma_n + \frac{\varpi^\alpha_n\phi^\alpha_n}{\sqrt{1-|\alpha_n|^2}}\,\overline{\lambda^\alpha_n\eta^\alpha_{n1}}\,\sigma_n\right].
\]
This is equivalent with the formula (9.2). Derivation of the other form (9.3) is obvious. ∎
Note that $\tilde G^\alpha_n$ is unitary, but $\hat G^\alpha_n$ is not unless $|\lambda^\alpha_n| = 1$ or $\alpha_n = \alpha_{n-1}$.
Now suppose that $n \in a_n$, i.e., $\gamma_n = \alpha_n$; then $\dot B^\beta_n = \dot B^\beta_{n-1}$ and hence
\[
\bigl[\varpi^{\alpha*}_{n-1}\phi^\alpha_{n-1}\dot B^\beta_{n-1} \quad \varpi^\alpha_n\phi^{\alpha*}_n\dot B^\beta_n\bigr] = \bigl[\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}\dot B^\beta_{n-1} \quad \varpi^\alpha_n\phi^\alpha_n\dot B^\beta_n\bigr]\hat G^\alpha_n. \tag{9.5}
\]
If $n \in b_n$, i.e., $\gamma_n = \beta_n$, then $\dot B^\beta_n = \dot B^\beta_{n-1}\zeta^\beta_n = \sigma_n\dot B^\beta_{n-1}\dfrac{\varpi^\alpha_n}{\varpi^{\alpha*}_n}$. Define then
\[
\tilde G^\beta_n = \begin{bmatrix} \overline{\sigma_n} & 0\\ 0 & 1 \end{bmatrix}\tilde G^\alpha_n\begin{bmatrix} \sigma_n & 0\\ 0 & 1 \end{bmatrix} = \tilde S^*_n\tilde G^\alpha_n\tilde S_n, \qquad \tilde S_n = \operatorname{diag}(\sigma_n, 1), \tag{9.6}
\]
and, as for $\hat G^\alpha_n$,
\[
\hat G^\beta_n = \begin{bmatrix} \bigl(1-|\alpha_{n-1}|^2\bigr)^{-1/2} & 0\\ 0 & \bigl(1-|\alpha_n|^2\bigr)^{-1/2} \end{bmatrix}\tilde G^\beta_n\begin{bmatrix} \bigl(1-|\alpha_{n-1}|^2\bigr)^{1/2} & 0\\ 0 & \bigl(1-|\alpha_n|^2\bigr)^{1/2} \end{bmatrix} = \begin{bmatrix} \overline{\sigma_n} & 0\\ 0 & 1 \end{bmatrix}\hat G^\alpha_n\begin{bmatrix} \sigma_n & 0\\ 0 & 1 \end{bmatrix},
\]
so that
\begin{align*}
\bigl[\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}\dot B^\beta_{n-1} \quad \varpi^{\alpha*}_n\phi^\alpha_n\dot B^\beta_n\bigr]\hat G^\beta_n
&= \bigl[\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}\dot B^\beta_{n-1} \quad \varpi^\alpha_n\phi^\alpha_n\dot B^\beta_{n-1}\bigr]\,\sigma_n\hat G^\alpha_n\begin{bmatrix} \sigma_n & 0\\ 0 & 1 \end{bmatrix}\\
&= \bigl[\varpi^{\alpha*}_{n-1}\phi^\alpha_{n-1}\dot B^\beta_{n-1} \quad \varpi^\alpha_n\phi^{\alpha*}_n\dot B^\beta_{n-1}\bigr]\begin{bmatrix} 1 & 0\\ 0 & \sigma_n \end{bmatrix}\\
&= \bigl[\varpi^{\alpha*}_{n-1}\phi^\alpha_{n-1}\dot B^\beta_{n-1} \quad \varpi^{\alpha*}_n\phi^{\alpha*}_n\dot B^\beta_n\bigr].
\end{align*}
Thus
\[
\bigl[\varpi^{\alpha*}_{n-1}\phi^\alpha_{n-1}\dot B^\beta_{n-1} \quad \varpi^{\alpha*}_n\phi^{\alpha*}_n\dot B^\beta_n\bigr] = \bigl[\varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}\dot B^\beta_{n-1} \quad \varpi^{\alpha*}_n\phi^\alpha_n\dot B^\beta_n\bigr]\hat G^\beta_n. \tag{9.7}
\]
We build the general CMV matrix as a product of $G$-factors. Set $\alpha_0 = 0$ and suppose we consider, for $1 \le n < k < m < \ell < \cdots \le \infty$,
\[
\alpha_0, \alpha_1, \ldots, \alpha_{n-1}\,\big|\,\beta_n, \ldots, \beta_{k-1}\,\big|\,\alpha_k, \ldots, \alpha_{m-1}\,\big|\,\beta_m, \ldots, \beta_{\ell-1}\,\big|\,\alpha_\ell, \ldots.
\]
We use vertical bars to separate the different blocks. These separations are of course clear from the notation α and β, but since that will not be so obvious when we denote the corresponding functions in the vectors below, we introduce them here and keep them at the corresponding positions in the vectors that we are about to introduce now. That should improve the readability of the formulas.
We consider the vector $\Phi = [\phi_0, \phi_1, \phi_2, \ldots]$ so that
\begin{align*}
\Phi(I - A^*z) &= \bigl[\varpi^\alpha_0\phi_0, \ldots, \varpi^\alpha_{n-1}\phi_{n-1}\,\big|\,\varpi^\alpha_n\phi_n, \ldots, \varpi^\alpha_{k-1}\phi_{k-1}\,\big|\,\varpi^\alpha_k\phi_k, \ldots\\
&\qquad\ldots, \varpi^\alpha_{m-1}\phi_{m-1}\,\big|\,\varpi^\alpha_m\phi_m, \ldots, \varpi^\alpha_{\ell-1}\phi_{\ell-1}\,\big|\,\varpi^\alpha_\ell\phi_\ell, \ldots\bigr]\\
&= \bigl[\varpi^\alpha_0\phi^\alpha_0, \ldots, \varpi^\alpha_{n-1}\phi^\alpha_{n-1}\dot B^\beta_{n-1}\,\big|\,\varpi^\alpha_n\phi^{\alpha*}_n\dot B^\beta_n, \ldots, \varpi^\alpha_{k-1}\phi^{\alpha*}_{k-1}\dot B^\beta_{k-1}\,\big|\,\varpi^\alpha_k\phi^\alpha_k\dot B^\beta_k, \ldots\\
&\qquad\ldots, \varpi^\alpha_{m-1}\phi^\alpha_{m-1}\dot B^\beta_{m-1}\,\big|\,\varpi^\alpha_m\phi^{\alpha*}_m\dot B^\beta_m, \ldots, \varpi^\alpha_{\ell-1}\phi^{\alpha*}_{\ell-1}\dot B^\beta_{\ell-1}\,\big|\,\varpi^\alpha_\ell\phi^\alpha_\ell\dot B^\beta_\ell, \ldots\bigr].
\end{align*}
Define for $\nu \in \{\alpha, \beta\}$ (recall $\eta_A = \sqrt{I - A^*A}$)
\[
G^\nu_n = \begin{bmatrix} I_{n-1} & 0 & 0\\ 0 & \tilde G^\nu_n & 0\\ 0 & 0 & I_\infty \end{bmatrix} = \operatorname{diag}\bigl(I_{n-1}, \tilde G^\nu_n, I_\infty\bigr), \qquad\text{and}\qquad \hat G^\nu_n = \eta_A^{-1}G^\nu_n\eta_A. \tag{9.8}
\]
Apply successively the $G^\nu_n$ to the right on the Φ vector. Keep an increasing order for the factors within an α block and a decreasing order for the factors covering a β block. Thus for the example given above the order will be
\[
\gamma = (\alpha_0, \alpha_1, \ldots, \alpha_{n-1}\,\big\|\,\beta_n, \ldots, \beta_{k-1}\,\big|\,\alpha_k\,\big\|\,\alpha_{k+1}, \ldots, \alpha_{m-1}\,\big\|\,\beta_m, \ldots, \beta_{\ell-1}\,\big|\,\alpha_\ell\,\big\|\,\alpha_{\ell+1}, \ldots),
\]
\begin{align*}
\hat G &= \underbrace{\bigl(\hat G^\alpha_1\hat G^\alpha_2\cdots\hat G^\alpha_{n-1}\bigr)}_{\hat G^1_\alpha}\underbrace{\bigl(\hat G^\alpha_k\hat G^\beta_{k-1}\cdots\hat G^\beta_n\bigr)}_{\hat G^1_\beta}\underbrace{\bigl(\hat G^\alpha_{k+1}\cdots\hat G^\alpha_{m-1}\bigr)}_{\hat G^2_\alpha}\underbrace{\bigl(\hat G^\alpha_\ell\hat G^\beta_{\ell-1}\cdots\hat G^\beta_m\bigr)}_{\hat G^2_\beta}\bigl(\hat G^\alpha_{\ell+1}\cdots\bigr) \tag{9.9}\\
&= \eta_A^{-1}\underbrace{\underbrace{\bigl(G^\alpha_1G^\alpha_2\cdots G^\alpha_{n-1}\bigr)}_{G^1_\alpha}\underbrace{\bigl(G^\alpha_kG^\beta_{k-1}\cdots G^\beta_n\bigr)}_{G^1_\beta}\underbrace{\bigl(G^\alpha_{k+1}\cdots G^\alpha_{m-1}\bigr)}_{G^2_\alpha}\underbrace{\bigl(G^\alpha_\ell G^\beta_{\ell-1}\cdots G^\beta_m\bigr)}_{G^2_\beta}\bigl(G^\alpha_{\ell+1}\cdots\bigr)}_{G}\,\eta_A = \eta_A^{-1}\,G\,\eta_A,
\end{align*}
with $\hat G^\nu_n$ and $G^\nu_n$, $\nu \in \{\alpha, \beta\}$, as in (9.8), and similarly
\[
\hat G^n_\nu = \eta_A^{-1}G^n_\nu\eta_A, \qquad n = 1, 2, \ldots, \quad \nu \in \{\alpha, \beta\}, \qquad\text{and}\qquad \hat G = \eta_A^{-1}G\eta_A. \tag{9.10}
\]
Note that we have aligned these formulas properly so that the shift in the blocks from the first to the second line in (9.9) is made clear. The rationale is that in a $G^n_\beta$ group the indices decrease from left to right, while in a $G^n_\alpha$ group they increase from left to right. This is what defines the shape of the result. We used the single vertical bars to separate α and β blocks in the sequence γ as before, and double vertical bars to indicate what corresponds to the $G^n_\alpha$ and $G^n_\beta$ blocks on the third line. On the second line in (9.9) one may note that the first α in an α block of the γ sequence corresponds to a $\hat G^\alpha$ factor that migrates to the front of the product group of the previous β block. This is due to the fact that we analysed the γ sequence using the parameters $\lambda^\alpha_k$ of the α recurrence (recall the definitions of the $\hat G^\alpha_k$ and $\hat G^\beta_k$ from Theorem 9.1 and equation (9.6)).
To see the effect of applying all these elementary operations, we start with the first block $G^1_\alpha$. Using (9.5) when multiplying $\Phi(I - A^*z)$ with $\hat G^1_\alpha$ gives (note $\phi_0 = \phi^*_0$)
\[
\bigl[\varpi^{\alpha*}_0\phi_0, \ldots, \varpi^{\alpha*}_{n-2}\phi_{n-2},\ \varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}\dot B^\beta_{n-1}\,\big|\,\varpi^\alpha_n\phi_n, \ldots\bigr].
\]
While multiplying this result with $\hat G^\alpha_k\hat G^\beta_{k-1}\cdots\hat G^\beta_{n+1}$, we make use of (9.5) and (9.7) to obtain
\[
\bigl[\varpi^{\alpha*}_0\phi_0, \ldots, \varpi^{\alpha*}_{n-2}\phi_{n-2},\ \varpi^\alpha_{n-1}\phi^{\alpha*}_{n-1}\dot B^\beta_{n-1}\,\big|\,\varpi^{\alpha*}_n\phi^\alpha_n\dot B^\beta_n,\ \varpi^{\alpha*}_{n+1}\phi_{n+1}, \ldots, \varpi^{\alpha*}_{k-1}\phi_{k-1}\,\big|\,\varpi^\alpha_k\phi^*_k,\ \varpi^\alpha_{k+1}\phi_{k+1}, \ldots\bigr]
\]
and the remaining multiplication $\hat G^\beta_n$ of $\hat G^1_\beta$ links the current β block to the previous α block, resulting in
\[
\bigl[\varpi^{\alpha*}_0\phi_0, \ldots, \varpi^{\alpha*}_{n-1}\phi_{n-1}\,\big|\,\varpi^{\alpha*}_n\phi_n, \ldots, \varpi^{\alpha*}_{k-1}\phi_{k-1}\,\big|\,\varpi^\alpha_k\phi^*_k,\ \varpi^\alpha_{k+1}\phi_{k+1}, \ldots\bigr].
\]
The next block is again an α block, treated by the product $\hat G^2_\alpha$, and one may continue like this to finally get
\[
\Phi(I - A^*z)\,\hat G = \Phi(zI - A), \qquad \hat G = \eta_A^{-1}\,G\,\eta_A, \tag{9.11}
\]
where $\eta_A$ is invertible. This $\hat G$ can be factored as the product of elementary matrices having only one non-trivial $2\times 2$ diagonal block: $\hat G = \hat G_1\hat G_2\cdots$. By setting $\hat G = \eta_A^{-1}G\eta_A = \bigl(\eta_A^{-1}G_1\eta_A\bigr)\bigl(\eta_A^{-1}G_2\eta_A\bigr)\cdots$, i.e., $G = G_1G_2\cdots$, we made $G$ as well as these elementary factors $G_k$ unitary.
These elementary factors are grouped together in blocks as a consequence of the α and β blocks in the γ sequence. For example, $G^2_\beta = G^\alpha_\ell G^\beta_{\ell-1}\cdots G^\beta_m$ corresponds to a β block. Because $\ell > m$ there must be at least two factors in such a β block: the first one is $G^\alpha_\ell$ and the last one is $G^\beta_m$. However, for an α block we have the product $G^2_\alpha = G^\alpha_{k+1}\cdots G^\alpha_{m-1}$; thus if $k = m-1$, then the initial index $k+1$ is larger than the end index $m-1$. Thus if an α block has only one element, then there are no factors in this product, which means that this $G_\alpha$ is just the identity.
In the case $\gamma = \alpha$ (no β's) there is of course only one infinite block, so $n = \infty$, there are no $k, m, \ell, \ldots$, and
\[
G = G^\alpha_1G^\alpha_2G^\alpha_3\cdots =: H
\]
is the familiar upper Hessenberg matrix. In the case of an alternating α-β sequence $\alpha_1, \beta_2, \alpha_3, \beta_4, \ldots$, we have
\[
G = G^\alpha_1\bigl(G^\alpha_3G^\beta_2\bigr)\bigl(G^\alpha_5G^\beta_4\bigr)\bigl(G^\alpha_7G^\beta_6\bigr)\cdots = \bigl(G^\alpha_1G^\alpha_3G^\alpha_5\cdots\bigr)\bigl(G^\beta_2G^\beta_4G^\beta_6\cdots\bigr) = C = C^\alpha_oC^\beta_e,
\]
where we grouped the product of the $G^\alpha_k$ factors as $C^\alpha_o$ and of the $G^\beta_k$ factors as $C^\beta_e$. This rearrangement of the factors is possible because the blocks commute when their subscripts differ by at least two. If we define $S_e = \operatorname{diag}\bigl(1, \tilde S_2, \tilde S_4, \tilde S_6, \ldots\bigr)$, $\tilde S_{2k} = \operatorname{diag}(\sigma_{2k}, 1)$, then of course
\[
C = C^\alpha_oS^*_eC^\alpha_eS_e.
\]
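The commutation property invoked for this regrouping is easy to check numerically. The sketch below is an illustration only, with the same assumed elementary factors as before (conjugate placement in the $2\times 2$ block is our assumption):

```python
import numpy as np

def factor(N, k, delta):
    # identity with an elementary 2x2 unitary block at (1-based) position k
    e = np.sqrt(1.0 - abs(delta) ** 2)
    G = np.eye(N, dtype=complex)
    G[k - 1:k + 1, k - 1:k + 1] = [[-delta, e], [e, np.conj(delta)]]
    return G

N = 7
rng = np.random.default_rng(2)
d = 0.9 * rng.random(N - 1) * np.exp(2j * np.pi * rng.random(N - 1))
G = {k: factor(N, k, d[k - 1]) for k in range(1, N)}

assert np.allclose(G[1] @ G[3], G[3] @ G[1])   # non-adjacent factors commute

# interleaved order G1 (G3 G2)(G5 G4) G6  versus regrouped (G1 G3 G5)(G2 G4 G6)
lhs = G[1] @ (G[3] @ G[2]) @ (G[5] @ G[4]) @ G[6]
rhs = (G[1] @ G[3] @ G[5]) @ (G[2] @ G[4] @ G[6])
assert np.allclose(lhs, rhs)
```

Only factors whose indices differ by at least two are ever moved past each other, which is exactly why the odd/even regrouping into $C_o$ and $C_e$ is legitimate.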
In the case of an alternating sequence of the form $\beta_1, \alpha_2, \beta_3, \alpha_4, \ldots$, we get
\[
G = \bigl(G^\alpha_2G^\beta_1\bigr)\bigl(G^\alpha_4G^\beta_3\bigr)\bigl(G^\alpha_6G^\beta_5\bigr)\cdots = \bigl(G^\alpha_2G^\alpha_4G^\alpha_6\cdots\bigr)\bigl(G^\beta_1G^\beta_3G^\beta_5\cdots\bigr) = C = C^\alpha_eC^\beta_o.
\]
As in the previous case this is $C = C^\alpha_eS^*_oC^\alpha_oS_o$ with
\[
S_o = \operatorname{diag}\bigl(\tilde S_1, \tilde S_3, \tilde S_5, \ldots\bigr), \qquad \tilde S_{2k-1} = \operatorname{diag}(\sigma_{2k-1}, 1).
\]
These are the classical CMV matrices as also given in [47], except for the $S$ factors, which were not present in [47]. That is because we have used a particular normalization for the $\phi_n$-basis. A slightly different normalization of the $\phi_n$-basis will remove the $S$ factors. Indeed, replacing all $\phi_n$ by $\varphi^\alpha_n = \varsigma^\beta_n\phi_n$, where
\[
\varsigma^\beta_n = \frac{\dot B^\beta_n}{|\dot B^\beta_n|} = \prod_{j\in b_n}\sigma_j,
\]
will do.
To see this, note that the relation (9.6) between $\tilde G^\beta_n$ and $\tilde G^\alpha_n$ can also be written as (multiply with $\sigma_n\overline{\sigma_n} = 1$)
\[
\tilde G^\beta_n = \begin{bmatrix} 1 & 0\\ 0 & \sigma_n \end{bmatrix}\tilde G^\alpha_n\begin{bmatrix} 1 & 0\\ 0 & \overline{\sigma_n} \end{bmatrix}. \tag{9.12}
\]
The first and the last of these factors can then be moved to the $\varphi^\alpha_n$ so that the β-relation (9.7) now becomes
\[
\bigl[\varpi^{\alpha*}_{n-1}\dot\varsigma^\beta_{n-1}\phi^\alpha_{n-1}\dot B^\beta_{n-1} \quad \varpi^{\alpha*}_n\dot\varsigma^\beta_n\phi^{\alpha*}_n\dot B^\beta_n\bigr] = \bigl[\varpi^\alpha_{n-1}\dot\varsigma^\beta_{n-1}\phi^{\alpha*}_{n-1}\dot B^\beta_{n-1} \quad \varpi^{\alpha*}_n\dot\varsigma^\beta_n\phi^\alpha_n\dot B^\beta_n\bigr]\hat G^\alpha_n. \tag{9.13}
\]
Note that the multiplication on the right is now with $\hat G^\alpha_n$ and not with $\hat G^\beta_n$ anymore.
For the α-case, i.e., $\gamma_n = \alpha_n$, nothing essentially changes since then $\dot\varsigma^\beta_n = \dot\varsigma^\beta_{n-1}$.
It should be clear that, following the same arguments used above, the relation (9.11) then becomes
\[
\Phi^\alpha(I - A^*z)\hat G^\alpha = \Phi^\alpha(zI - A) \qquad\text{with}\qquad \Phi^\alpha = [\varphi^\alpha_0, \varphi^\alpha_1, \varphi^\alpha_2, \ldots],
\]
and $\hat G^\alpha$ is exactly like $\hat G$ in (9.10), except that all $G^\beta_j$ should be replaced by a $G^\alpha_j$; that is, replace all the $G^\beta_k$ factors in $G$ by the corresponding $G^\alpha_k$ by removing the $S$ factors in (9.6), and $G$ becomes $G^\alpha$.
From (9.11) we derive that
\[
zI = \bigl(\hat G + A\bigr)\bigl(I + A^*\hat G\bigr)^{-1} = \eta_A^{-1}(G + A)\bigl(I + A^*G\bigr)^{-1}\eta_A = \tilde\zeta_A(G),
\]
which is the matrix representation of the shift operator in $L$ with respect to the basis $(\phi_0, \phi_1, \phi_2, \ldots)$. With respect to the basis $\Phi^\alpha$, the expression is the same except that $G$ should be replaced by $G^\alpha$.
We summarize our previous results.
Theorem 9.2. In the general case of a sequence γ, multiplying the vector of basis functions $\Phi(z) = [\phi_0(z), \phi_1(z), \phi_2(z), \ldots]$ by $z$ corresponds to multiplying it from the right with the infinite matrix
\[
\bigl(\hat G + A\bigr)\bigl(I + A^*\hat G\bigr)^{-1} = \eta_A^{-1}(G + A)(I + A^*G)^{-1}\eta_A,
\]
where $A = \operatorname{diag}(\alpha_0, \alpha_1, \ldots)$, $\eta_A = \sqrt{I - A^*A}$, and $G$ is a product of unitary factors $G_k$ defined in (9.8), where the order of multiplication is from left to right in a block of successive α-values and from right to left in a block of successive β-values, as explained in detail above.
For the sequence α we get the classical Hessenberg matrix
\[
G = G^\alpha_1G^\alpha_2G^\alpha_3\cdots = H.
\]
The classical CMV matrices are obtained for $\alpha_1, \beta_2, \alpha_3, \beta_4, \ldots$ as
\[
G = G^\alpha_1\bigl(G^\alpha_3G^\beta_2\bigr)\bigl(G^\alpha_5G^\beta_4\bigr)\bigl(G^\alpha_7G^\beta_6\bigr)\cdots = \bigl(G^\alpha_1G^\alpha_3G^\alpha_5\cdots\bigr)\bigl(G^\beta_2G^\beta_4G^\beta_6\cdots\bigr) = C = C^\alpha_oC^\beta_e,
\]
and in the case $\beta_1, \alpha_2, \beta_3, \alpha_4, \ldots$,
\[
G = \bigl(G^\alpha_2G^\beta_1\bigr)\bigl(G^\alpha_4G^\beta_3\bigr)\bigl(G^\alpha_6G^\beta_5\bigr)\cdots = \bigl(G^\alpha_2G^\alpha_4G^\alpha_6\cdots\bigr)\bigl(G^\beta_1G^\beta_3G^\beta_5\cdots\bigr) = C = C^\alpha_eC^\beta_o.
\]
If we use the slightly different orthonormalized basis $(\varphi^\alpha_n)_{n\in\mathbb N}$, then the previous relations still hold true, except that all $G^\beta_j$ can be replaced by $G^\alpha_j$.
Note that $\lambda^\alpha_n = 0$ does not give any problem for these formulas of the $G$-factors. If $\gamma = \alpha$ then all $\phi^\alpha_n$ are regular, but for a general sequence γ it is possible that $\phi_n$ is not regular. Recall that then $\lambda_n = \infty$, but by Theorem 5.3 this means that $\lambda^\alpha_n = 0$, and thus there is no problem in using the $G$-matrices introduced above, even if the ORF sequence is not regular.
Another thing to note here is that, as we remarked before, the factors $G^\nu_k$ are unitary, although the factors $\hat G^\nu_k = \eta_A^{-1}G^\nu_k\eta_A$ are in general not. Moreover,
\[
\tilde\zeta_A(G) = \eta_A^{-1}(G + A)(I + A^*G)^{-1}\eta_A
\]
is unitary when $G$ is unitary, as can be directly verified in the finite-dimensional case. For example, $[\tilde\zeta_A(G)][\tilde\zeta_A(G)]^* = I$ if and only if
\[
(I + G^*A)\eta_A^{-2}(I + A^*G) = (G^* + A^*)\eta_A^{-2}(G + A).
\]
Note that $\eta_A$ is invertible but $A$ is not (since for example $\alpha_0 = 0$), and that $A$ and $\eta_A$ commute. To see that the relation holds, we multiply out and bring everything to one side, which gives
\[
G^*\bigl(\eta_A^{-2}A - A\eta_A^{-2}\bigr) + \bigl(A^*\eta_A^{-2} - \eta_A^{-2}A^*\bigr)G + G^*\eta_A^{-2}G - \eta_A^{-2} - G^*A\eta_A^{-2}A^*G + A^*\eta_A^{-2}A.
\]
So we have to prove that this is zero. Because $\eta_A^{-2}A = A\eta_A^{-2}$ and $\eta_A^{-2}A^* = A^*\eta_A^{-2}$, the first two terms vanish. The four remaining terms can be multiplied from the left and the right with $\eta_A = \eta_A^*$ to give
\[
\eta_AG^*\eta_A^{-2}G\eta_A - I - \eta_AG^*A\eta_A^{-2}A^*G\eta_A + \eta_AA^*\eta_A^{-2}A\eta_A.
\]
Because $\eta_AA^* = A^*\eta_A$ and $A\eta_A = \eta_AA$, the last term equals $A^*A$, and combined with the second term $-I$ it equals $-\eta_A^2$. Multiplying everything again with $\eta_A^{-1}$ from the left and from the right gives
\begin{align*}
G^*\eta_A^{-2}G - G^*A\eta_A^{-2}A^*G - \eta_A^{-1}\eta_A^2\eta_A^{-1} &= G^*\eta_A^{-2}G - G^*A\eta_A^{-2}A^*G - I\\
&= \bigl(G^*\eta_A^{-1}\bigr)\bigl(I - \eta_AA\eta_A^{-2}A^*\eta_A\bigr)\bigl(\eta_A^{-1}G\bigr) - I\\
&= \bigl(G^*\eta_A^{-1}\bigr)\bigl(I - A\eta_A\eta_A^{-2}\eta_AA^*\bigr)\bigl(\eta_A^{-1}G\bigr) - I\\
&= \bigl(G^*\eta_A^{-1}\bigr)(I - AA^*)\bigl(\eta_A^{-1}G\bigr) - I\\
&= G^*\eta_A^{-1}\eta_A^2\eta_A^{-1}G - I = G^*G - I,
\end{align*}
which is zero because $G$ is unitary. A similar calculation verifies that $[\tilde\zeta_A(G)]^*[\tilde\zeta_A(G)] = I$. The same holds for the infinite case, i.e., in the whole of $L^2_\mu$ if $L$ is dense (see also [47]).
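The finite-dimensional claim can also be confirmed numerically; a minimal sketch of ours (not from the paper), with a random unitary $G$ and diagonal $A$ with $|\alpha_k| < 1$:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
alpha = np.concatenate(([0.0], 0.7 * rng.random(n - 1) * np.exp(2j * np.pi * rng.random(n - 1))))
A = np.diag(alpha)
I = np.eye(n)
eta = np.diag(np.sqrt(1.0 - np.abs(alpha) ** 2))   # eta_A = sqrt(I - A* A)

# random unitary G from a QR factorization
G, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

# zeta~_A(G) = eta^{-1} (G + A)(I + A* G)^{-1} eta
U = np.linalg.inv(eta) @ (G + A) @ np.linalg.inv(I + A.conj().T @ G) @ eta
assert np.allclose(U @ U.conj().T, I)
assert np.allclose(U.conj().T @ U, I)
```

Since $\|A^*G\| < 1$, the factor $I + A^*G$ is always invertible here, so the construction is well defined for any unitary $G$.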
When using the recursion for the $\phi_n$ with respect to a general sequence γ, Theorem 5.1 completed by Lemma 5.5 says what the Szegő parameters $\lambda_n$ are. Note however that for the basis Φ corresponding to a general sequence γ, the matrix $G$ had factors $G^\alpha_k$ and $G^\beta_k$ that were all defined in terms of $\lambda^\alpha_k$ and not $\lambda_k$ (see Theorem 9.1 and equation (9.6)). Theorem 5.3 explains that, depending on $\gamma_n$ and $\gamma_{n-1}$, $\lambda_n$ is either equal to $\lambda^\alpha_n$ or $\overline{\lambda^\alpha_n}$ or the inverses of these. Thus it should be no problem to obtain all the $\lambda^\alpha_n$ from the $\lambda_n$ or vice versa.
To illustrate how to use the quantities for a general γ sequence in generating the $G^\alpha_n$ matrix, we consider the following example. If $n-1 \in b_n$ and $n \in a_n$, then $\gamma_n = \alpha_n$, $\gamma_{n-1} = \beta_{n-1}$, $\lambda^\alpha_n = 1/\lambda_n$, $\phi^\alpha_n = \phi_n/\dot B^\beta_n$, and $\phi^{\alpha*}_n = \phi^*_n/\dot B^\beta_n$. This allows us to express $\eta^\alpha_{n1}$ and all the other elements of $G_n$ in terms of the γ-related elements. If we assume that $\phi_n$ is regular, then
\[
\tilde G^\alpha_n = \sigma_{n-1}\eta^\alpha_{n1}\begin{bmatrix} -\eta^\alpha_{n1}/\lambda_n & \sqrt{1 - 1/|\lambda_n|^2}\\ \sqrt{1 - 1/|\lambda_n|^2} & \overline{\eta^\alpha_{n1}/\lambda_n} \end{bmatrix}\begin{bmatrix} 1 & 0\\ 0 & \sigma_n \end{bmatrix} = \frac{\sigma_{n-1}\eta^\alpha_{n1}}{|\lambda_n|}\begin{bmatrix} u_n & 0\\ 0 & 1 \end{bmatrix}\begin{bmatrix} -1 & \sqrt{|\lambda_n|^2 - 1}\\ \sqrt{|\lambda_n|^2 - 1} & 1 \end{bmatrix}\begin{bmatrix} 1 & 0\\ 0 & u_n\sigma_n \end{bmatrix},
\]
where $u_n = (\lambda_n/|\lambda_n|)\eta^\alpha_{n1} \in \mathbb T$.
It still requires $\eta^\alpha_{n1}$ in terms of γ-elements. This is a bit messy but still possible using $\phi^{\alpha*}_n = \phi^*_n/\dot B^\beta_n$ and Lemma 5.5. The factor $1/\dot B^\beta_n$ has an essential role here because we have to evaluate $\phi^{\alpha*}_n(z)$ for $z = \alpha_{n-1}$, and if $1/\overline{\alpha_{n-1}} = \beta_{n-1} \in \{\gamma_k \colon k \in b_n\}$, which means that $\alpha_{n-1}$ has been introduced as a pole in a previous step, then this pole in $\phi^*_n$ will cancel against the same pole in $\dot B^\beta_n$. Thus we should use a limiting process, or expand $\phi^{\alpha*}_n = \phi^*_n/\dot B^\beta_n$, or make the cancellation explicit as we did for example in equation (3.1). We skip further details.
If $n$ is not a regular index, then $\lambda_n = \infty$, and in that case the matrix $\tilde G^\alpha_n$ has the form
\[
\tilde G^\alpha_n = \sigma_{n-1}\eta^\alpha_{n1}\begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0\\ 0 & \sigma_n \end{bmatrix} = \sigma_{n-1}\eta^\alpha_{n1}\begin{bmatrix} 0 & \sigma_n\\ 1 & 0 \end{bmatrix}.
\]
10 Spectral analysis
The full spectral analysis of the matrices $\tilde\zeta_A(G)$, and how this relates to the shift operator in $L^2_\mu$, is given in [47] for the case $\gamma = \alpha$ (then $G = H$ is a Hessenberg matrix) and in the case of the alternating $\gamma = \varepsilon$, where either $\varepsilon_{2k} = \alpha_{2k}$ and $\varepsilon_{2k+1} = \beta_{2k+1}$, or $\varepsilon_{2k} = \beta_{2k}$ and $\varepsilon_{2k+1} = \alpha_{2k+1}$ (then $G = C$ is a CMV matrix).
Velázquez shows for the cases $\nu \in \{\alpha, \varepsilon\}$ that if $T_\mu$ is the shift operator on $L^2_\mu$ (i.e., multiplication with $z$ in $L^2_\mu$) and if $L$ is the closure of $L_\infty$ in $L^2_\mu$, then the matrix representation of $T_\mu|_L$ with respect to the basis $\Phi^\alpha = [\varphi^\alpha_0, \varphi^\alpha_1, \varphi^\alpha_2, \ldots]$ is given by $\tilde\zeta_A(G^\alpha)$ (see [47, Theorems 4.1, 4.2, 5.1, 5.4]).
We mention here the basis (ϕαk )k∈N because in [47] only one type of Gαk matrices is used. As
we have explained, this is related to the precise normalization used for the basis functions, but
as long as the basis is orthogonal and has norm 1, a further normalization with a unimodular
constant will not influence the spectral properties of the matrix representations so that similar
results hold for the basis Φ and the matrix ζ̃A(G).
If moreover $L$ is dense in $L^2_\mu$, and hence $L = L^2_\mu$, then $\tilde\zeta_A(G)$ is not only isometric but it is a representation of the full shift operator. For α compactly included in $\mathbb D$ this happens for $\gamma = \alpha$ when $\log\mu' \notin L_1$, and in the case $\gamma = \varepsilon$ when
\[
\sum_{k=1}^\infty\bigl(1 - |\alpha_{2k}|\bigr) = \infty = \sum_{k=1}^\infty\bigl(1 - |\alpha_{2k+1}|\bigr)
\]
(see [47, Theorem 4.1, Proposition 5.3]).
In fact, the proof of [47, Proposition 5.3] needs only minor adaptations to be applicable to the present situation. So we can state without further details:

Theorem 10.1. If the sequence γ is such that $\sum_{k\in a_\infty}(1 - |\alpha_k|) = \infty$ and $\sum_{k\in b_\infty}(1 - |\alpha_k|) = \infty$, then both $(\phi_k)_{k\in a_\infty}$ and $(\phi_k)_{k\in b_\infty}$ are bases of $L^2_\mu$.
Now also [47, Theorem 5.4] can be directly translated to our situation:
Theorem 10.2. For the general sequence γ as in the previous theorem, $U = \tilde\zeta_A(G)$ is the matrix representation of the shift operator $T_\mu$ of $L^2_\mu$ and $\sigma(U) = \operatorname{supp}\mu$. Moreover, if $\Phi = [\phi_0, \phi_1, \ldots]$, then $\xi$ is a mass point of $\mu$ if and only if $\|\Phi(\xi)\|_{\ell^2} < \infty$; in that case $\mu(\{\xi\}) = 1/\|\Phi(\xi)\|^2_{\ell^2}$ and $\Phi(\xi)$ is a (left) eigenvector for the eigenvalue $\xi$ of $U$.
In [47] it is also explained how the spectrum of $\tilde\zeta_A(G) = \eta_A^{-1}\tilde\varpi^*_A(G)\tilde\varpi_A(G)^{-1}\eta_A$ relates to the spectrum of the pencil $(\varpi^*_A(G), \varpi_A(G))$. In the case of an alternating sequence ε, the matrix $G$ was a CMV matrix, which we denoted as $C$. It was then possible to regroup the even and odd elementary factors so that $C = C^\alpha_oC^\beta_e$ or $C = C^\alpha_eC^\beta_o$, with $C^\nu_o = G^\nu_1G^\nu_3G^\nu_5\cdots$ and $C^\nu_e = G^\nu_2G^\nu_4G^\nu_6\cdots$ for $\nu \in \{\alpha, \beta\}$. As a consequence, because all factors are unitary, the previous result could be formulated in terms of the spectral properties of the pencils $\bigl(AC^{\beta*}_e + C^\alpha_o,\ C^{\beta*}_e + A^*C^\alpha_o\bigr)$ or $\bigl(AC^{\beta*}_o + C^\alpha_e,\ C^{\beta*}_o + A^*C^\alpha_e\bigr)$. In the general case, such a regrouping is possible, but rather complicated because it will depend on the sizes of the α and β blocks in the γ sequence, i.e., on the number of elementary $G$-factors that appear in the product in increasing or decreasing order (see (9.9)).
There are also spectral interpretations for the eigenvalues of the finitely truncated sections of the matrices. Again, a careful discussion is given in [47]. If for an operator $T$ we denote its truncation to $L_n$ as $T_n = P_nT|_{L_n}$, with $P_n$ the orthogonal projection operator onto $L_n$, then Velázquez shows that the zeros of $\phi_{n+1}$ are the eigenvalues of the finite matrices $V_n = \tilde\zeta_{A_n}(G_n)$, with $G_n$ the leading principal submatrix of size $(n+1)$ of the matrix $G$ and $A_n = \operatorname{diag}(\alpha_0, \ldots, \alpha_n)$. As illustrated in [47, equation (19)], $G_n$ is the product of unitary matrices $G_k$ of the form (9.1) for $k = 1, \ldots, n+1$, except that $G_{n+1}$ is 'truncated' and is of the form $I_n \oplus (-\delta_{n+1})$, where $\delta_{n+1}$ (up to a unimodular constant) is $\lambda_{n+1}$; thus this truncated factor $G_{n+1}$ is not unitary because $\lambda_{n+1} \notin \mathbb T$ in general. For $\gamma = \alpha$ this is proved in [47, Theorem 4.5] and for $\gamma = \varepsilon$ in [47, Theorem 5.8].
The same arguments can be used for a general sequence γ. Indeed, (9.5) and (9.7) show that in the truncated version we end up with
$$[\phi_0, \ldots, \phi_n]\big(\varpi_{A_n}^* - \varpi_{A_n} G_n\big) = [0, \ldots, 0, \chi_{n+1}],$$
where $\chi_{n+1}$ is of the form $c_{n+1}\phi_{n+1}^\alpha \dot B_{n+1}^\beta$ or $c_{n+1}\phi_{n+1}^{\alpha*} \dot B_{n+1}^\beta$, depending on whether the truncation falls inside an α block or a β block. In both cases, this is a constant multiple of $\phi_{n+1}$ (see Theorem 3.9). As in [47] this can be transformed into
$$[\phi_0(z), \ldots, \phi_n(z)]\big(z - \tilde\zeta_{A_n}(G_n)\big) = [0, \ldots, 0, c_{n+1}\phi_{n+1}(z)],$$
which shows that a zero of $\phi_{n+1}$ corresponds to an eigenvalue of $V_n = \tilde\zeta_{A_n}(G_n)$. Thus the following theorem is still valid for a general γ.
Theorem 10.3. For a general sequence γ, let $G_n$ be the truncation of size $n+1$ of the matrix G and define $V_n = \tilde\zeta_{A_n}(G_n)$. Then the eigenvalues of $V_n$ are the zeros of $\phi_{n+1}$. The (left) eigenvector corresponding to an eigenvalue ξ is given by $\Phi_n(\xi) = [\phi_0(\xi), \ldots, \phi_n(\xi)]$.
For the unitary truncation, we first form a unitary truncation $G_n^u$ of $G_n$ by keeping all $G_k$ with $k < n+1$ and replacing $G_{n+1}$ by $G_{n+1}^u = \mathrm{diag}(1, \ldots, 1, \tau)$ for some $\tau \in \mathbb T$. We then get (recall the definition of the PORF $Q_{n+1}(z, \tau)$ in (6.1))
$$[\phi_0, \ldots, \phi_n]\big(\varpi_{A_n}^* - \varpi_{A_n} G_n^u\big) = [0, \ldots, 0, X_{n+1}],$$
with $X_{n+1}$ of the form $c_{n+1} Q_{n+1}^\alpha(z, \tau) \dot B_{n+1}^\beta(z)$ or $c_{n+1} Q_{n+1}^{\alpha*}(z, \tau) \dot B_{n+1}^\beta(z)$, depending on whether the truncation falls inside an α block or a β block. By Corollary 6.2 this is in both cases proportional to $Q_{n+1}(z, \tau')$ for some appropriate choice of $\tau'$. Therefore the eigenvalues of $U_n = \tilde\zeta_{A_n}(G_n^u)$ give the nodes of the $(n+1)$-point rational Szegő quadrature formula for some particular choice of the τ-parameter.
Again this is worked out in detail in [47] for the Hessenberg and CMV matrices. As in the previous theorem, similar arguments apply to the unitary truncations for a general sequence, and we therefore also have the following.
Theorem 10.4. For a general sequence γ, let $G_n^u$ be the unitary truncation of size $n+1$ of the matrix G and define $U_n = \tilde\zeta_{A_n}(G_n^u)$. Then the eigenvalues of $U_n$ are the zeros of the PORF $Q_{n+1}$ for some value of the τ-parameter. The (left) eigenvector corresponding to an eigenvalue ξ is given by $\Phi_n(\xi) = [\phi_0(\xi), \ldots, \phi_n(\xi)]$.
36 A. Bultheel, R. Cruz-Barroso and A. Lasarow
Remark 10.5. Since $U_n$ is unitary, its right eigenvectors are the conjugate transposes of the left eigenvectors. Thus if F is the matrix whose entry at position $(i, j)$ is $\phi_j(\xi_i)$, and $\Lambda = \mathrm{diag}(\xi_0, \ldots, \xi_n)$ contains the eigenvalues of $U_n$, then $F U_n = \Lambda F$. A row of F is $[\phi_0(\xi_i), \ldots, \phi_n(\xi_i)]$, which is not normalized in the $\ell^2$ sense, but we know that $\phi_0 = 1$. In linear algebra it is more usual to write the eigenvalue decomposition of a unitary (or normal) matrix $U_n$ as $U^* U_n = \Lambda U^*$ with U unitary, thus with orthonormal rows and columns. Hence if D is the diagonal matrix whose diagonal entries are the elements of the first row of U (which in our case are all nonzero), then $F = D^{-1} U^*$. So $F F^* = D^{-1} U^* U D^{-*} = D^{-1} D^{-*}$ is diagonal with entries $\sum_{k=0}^n |\phi_k(\xi_i)|^2$. In other words, the weights $w_i$ of the quadrature formula are given by the squared moduli of the elements in the first row of U.
Note that the full matrices never need to be constructed explicitly. There exist efficient algorithms to perform all the required operations when the matrix is stored as a product of elementary 2×2 factors (see for example [3] or [16]). In fact these techniques can handle an arbitrary order in the product with the same complexity (in terms of floating point operations) as, for example, the Hessenberg or CMV ordering. Using for example a CMV structure will however slightly reduce the programming complexity, since it requires less bookkeeping. In the case of the CMV matrix, we do not even need a product at all to write down the matrices involved explicitly. Indeed, the odd and even factors $C_{o,n}^u$ and $C_{e,n}^u$ of the unitarily truncated CMV matrix $C_n^u$ are just block diagonals with all blocks of size at most 2×2, with the exception of $G_{n+1}$, which is diagonal. Hence a pencil like $(A_n C_{e,n}^{u*} + C_{o,n}^u,\ C_{e,n}^{u*} + A_n^* C_{o,n}^u)$ that allows one to compute the quadrature is extremely simple for numerical computations [7, Theorem 7.3].
We already mentioned in Section 7 that an arbitrary sequence γ does not give rise to quadrature formulas other than those already given by the sequence α or the alternating sequence ε. We now find that there is no computational advantage either, so it is not very rewarding in a practical computation of rational Szegő quadrature to work out all the details in the general case. The CMV matrix obviously has the smallest possible bandwidth; it is conceptually simpler and somewhat easier to implement.
There is however a remarkable linear algebra side effect that became clear from the previous analysis. As far as we know, this was previously unknown in the linear algebra community, and we will discuss some preliminary results in the next section.
11 AMPD matrices
In the previous section, it was recalled from [47] that under the appropriate conditions, the spectrum of the unitary shift $T_\mu$ can be recovered from the matrix representation $\tilde\zeta_A(G^\alpha)$ with respect to the basis $[\varphi_0^\alpha, \varphi_1^\alpha, \ldots]$, whatever the sequence γ is. Now the matrix $G^\alpha$ is a product of elementary matrices that differ from the identity only by one 2×2 block of the general form (9.1). The only difference between the Hessenberg and the CMV form is the order in which these factors are multiplied.
Something of that form was known for a finite unitary Hessenberg matrix. It can be decomposed as a product of unitary matrices of the type (9.1), and multiplying these factors in any order will not change the spectrum of the matrix (see, e.g., [3]).
Since the zeros of the ORFs are given as the eigenvalues of the matrix $\tilde\zeta_{A_n}(G_n)$, where $G_n$ is either a finite Hessenberg or a CMV matrix, and hence these differ only by the order of the elementary factors, we might expect that if $G_n$ is any matrix written as the product of n elementary factors of the form (9.1), then the spectrum of $\tilde\zeta_{A_n}(G_n)$ will not change if we change the order of the elementary factors in the product. This is a consequence of the fact that the spectrum of $G_n$ does not depend on the order of its elementary factors.
Since this section is partly detached from the ORF theory, the notation used is also local and
should not be confused with the same notation used in other sections, unless stated explicitly.
We define an AMPD matrix as a matrix of the form $AM + D$, where A and D are diagonal matrices and M is a product of G-matrices, i.e., matrices that differ from the identity by one 2×2 diagonal block like ($I_0$ is the empty matrix)
$$G_k := \begin{bmatrix} I_{k-1} & 0 & 0 & 0 \\ 0 & \alpha_k & \beta_k & 0 \\ 0 & \gamma_k & \delta_k & 0 \\ 0 & 0 & 0 & I_{n-k} \end{bmatrix}, \qquad 1 \le k \le n, \tag{11.1}$$
which is more general than the matrices defined in (9.1).
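A one-function NumPy sketch of an elementary G-matrix of the form (11.1) (the function name and interface are ours, not from any library):

```python
import numpy as np

def G_factor(n, k, block):
    # G_k of (11.1): an (n+1)x(n+1) identity, except for the 2x2 block
    # placed at (1-based) rows/columns k, k+1, so 1 <= k <= n.
    assert 1 <= k <= n
    M = np.eye(n + 1, dtype=complex)
    M[k - 1:k + 1, k - 1:k + 1] = np.asarray(block, dtype=complex)
    return M

# Example: the block [[1, 2], [3, 4]] sits in rows/columns 2 and 3 of a 5x5 matrix.
G2 = G_factor(4, 2, [[1.0, 2.0], [3.0, 4.0]])
```

Note that $\det(G_k)$ equals the determinant of its 2×2 block, and that two such factors commute whenever their blocks do not overlap.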
The purpose is first to show that the spectrum of an AMPD matrix does not depend on the
order in which the G-matrices are multiplied. The next proof was provided by Marc Van Barel
in private discussion with the first author.
Lemma 11.1. Suppose $M' \in \mathbb C^{n\times n}$ is an arbitrary matrix with $n \ge 2$ and set
$$M = \begin{bmatrix} M' & 0 \\ 0 & 1 \end{bmatrix} \qquad \text{and} \qquad G = \begin{bmatrix} I_{n-1} & 0 & 0 \\ 0 & \alpha & \beta \\ 0 & \gamma & \delta \end{bmatrix},$$
and let $A, D \in \mathbb C^{(n+1)\times(n+1)}$ be two arbitrary diagonal matrices. Then the determinant and the eigenvalues of $AGM + D$ and $AMG + D$ are the same.
Proof. Let us write
$$M' = \begin{bmatrix} M'' & c \\ r & m \end{bmatrix} \qquad \text{and} \qquad A = \mathrm{diag}(A'', a', a), \quad D = \mathrm{diag}(D'', d', d).$$
Then
$$AGM + D = \begin{bmatrix} A''M'' + D'' & A''c & 0 \\ a'\alpha r & a'\alpha m + d' & a'\beta \\ a\gamma r & a\gamma m & a\delta + d \end{bmatrix}$$
and
$$AMG + D = \begin{bmatrix} A''M'' + D'' & \alpha A''c & \beta A''c \\ a'r & a'\alpha m + d' & a'\beta m \\ 0 & a\gamma & a\delta + d \end{bmatrix}.$$
Taking the determinant of $AGM + D$ gives (expand along the last column)
$$\begin{aligned} \det(AGM + D) &= (a\delta + d)a'\alpha \det \tilde M + (a\delta + d)d' \det(A''M'' + D'') - a'\beta a\gamma \det \tilde M \\ &= aa'(\alpha\delta - \beta\gamma)\det \tilde M + a'd\alpha \det \tilde M + (a\delta + d)d' \det(A''M'' + D'') \\ &= [aa'\det(G) + a'd\alpha]\det \tilde M + (a\delta + d)d' \det(A''M'' + D'') \end{aligned}$$
with $\tilde M$ a matrix independent of G, given by
$$\tilde M = \begin{bmatrix} A''M'' + D'' & A''c \\ r & m \end{bmatrix} = \mathrm{diag}(A'', 1)M' + \mathrm{diag}(D'', 0).$$
Evaluating the determinant of $AMG + D$ in a similar way, expanding along the last row, gives exactly the same result. Hence the determinants are the same: $\det(AGM + D) = \det(AMG + D)$.
Note that $D - \lambda I$ is still diagonal if D is, so that if we replace D by $D - \lambda I$ in the determinants, we obtain the characteristic polynomials of the matrices $AGM + D$ and $AMG + D$, respectively. Equality of these polynomials implies the equality of their roots, i.e., of the eigenvalues. ■
In fact the result is more general, since we could replace D as well as A by any diagonal matrices that are functions of λ, and the determinants would still be the same.
It is interesting to note that $A''M'' + D''$ is the same as the matrix $\tilde M$ with the last row and the last column deleted. This will be used in Theorem 11.3 below.
Example 11.2. If in the previous lemma we set
$$M = G_1 = \begin{bmatrix} \alpha_1 & \beta_1 & 0 \\ \gamma_1 & \delta_1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad \text{and} \qquad G = G_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \alpha_2 & \beta_2 \\ 0 & \gamma_2 & \delta_2 \end{bmatrix},$$
then for arbitrary diagonal matrices $A = \mathrm{diag}(a_1, a_2, a_3)$ and $D = \mathrm{diag}(d_1, d_2, d_3)$ a direct evaluation of the matrices $AG_1G_2$ and $AG_2G_1$ gives
$$AG_1G_2 = \begin{bmatrix} a_1\alpha_1 & a_1\beta_1\alpha_2 & a_1\beta_1\beta_2 \\ a_2\gamma_1 & a_2\delta_1\alpha_2 & a_2\delta_1\beta_2 \\ 0 & a_3\gamma_2 & a_3\delta_2 \end{bmatrix} \qquad \text{and} \qquad AG_2G_1 = \begin{bmatrix} a_1\alpha_1 & a_1\beta_1 & 0 \\ a_2\alpha_2\gamma_1 & a_2\alpha_2\delta_1 & a_2\beta_2 \\ a_3\gamma_2\gamma_1 & a_3\gamma_2\delta_1 & a_3\delta_2 \end{bmatrix}.$$
Note that the diagonal elements are the same. If we add D to this diagonal, then with the Laplace rule it is immediately seen that $\det(AG_1G_2 + D) = \det(AG_2G_1 + D)$.
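Lemma 11.1, specialized as in Example 11.2, can be confirmed numerically with random complex data; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

def G_factor(n, k, block):
    # Identity of size n+1 with a 2x2 block at (1-based) rows/columns k, k+1.
    M = np.eye(n + 1, dtype=complex)
    M[k - 1:k + 1, k - 1:k + 1] = block
    return M

cplx = lambda *shape: rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

n = 2
G1, G2 = G_factor(n, 1, cplx(2, 2)), G_factor(n, 2, cplx(2, 2))
A, D = np.diag(cplx(n + 1)), np.diag(cplx(n + 1))

# The two determinants of Lemma 11.1 / Example 11.2 coincide.
det_12 = np.linalg.det(A @ G1 @ G2 + D)
det_21 = np.linalg.det(A @ G2 @ G1 + D)
```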
If $\pi = (\pi_1, \pi_2, \ldots, \pi_k)$ is a permutation of the numbers $(1, 2, \ldots, k)$, then with a notation like $\prod_{i\in\pi} G_i$, where $G_i \in \mathbb C^{(n+1)\times(n+1)}$ are square matrices, we mean the product from left to right in the order given by π. Thus
$$\prod_{i\in\pi} G_i = G_{\pi_1} G_{\pi_2} \cdots G_{\pi_k}. \tag{11.2}$$
Recall that $G_i$ and $G_j$ of the form (11.1) commute whenever $|i - j| > 1$. This means that of the $k!$ possible permutations of the numbers $(1, 2, \ldots, k)$, only $2^{k-1}$ different matrices will result when we consider all possible permutations in the product $\prod_{i\in\pi} G_i$. Indeed, once the order is fixed in which the first $k-1$ matrices $\{G_i : i = 1, 2, \ldots, k-1\}$ are multiplied, $G_k$ will have to multiply this product either from the right or from the left, depending on whether it is to the left or to the right of $G_{k-1}$ in the product (11.2). This is because $G_k$ commutes with all the $G_i$ with $i < k - 1$.
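The collapse from $k!$ permutations to $2^{k-1}$ distinct products can be verified by brute force for small k (a sketch with random blocks; genericity of the entries is assumed, so that genuinely different groupings give different matrices):

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
k = 4                       # number of elementary factors G_1, ..., G_k

def G_factor(n, j, block):
    # Identity of size n+1 with a 2x2 block at (1-based) rows/columns j, j+1.
    M = np.eye(n + 1)
    M[j - 1:j + 1, j - 1:j + 1] = block
    return M

factors = {j: G_factor(k, j, rng.standard_normal((2, 2))) for j in range(1, k + 1)}

# Group the 24 permutation products into equality classes.
distinct = []
for perm in itertools.permutations(range(1, k + 1)):
    P = np.eye(k + 1)
    for j in perm:
        P = P @ factors[j]
    if not any(np.allclose(P, Q) for Q in distinct):
        distinct.append(P)
```

For k = 4 this yields exactly $2^{k-1} = 8$ distinct matrices out of the 24 permutations.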
Let $P_n^N = [I_n\ 0] \in \mathbb C^{n\times N}$ be the matrix projecting onto $\mathbb C^n$. We define for any matrix $M \in \mathbb C^{N\times N}$ its truncation to $\mathbb C^{n\times n}$ as $P_n^N M P_n^{N*}$. The following main theorem states that the determinant, and hence the eigenvalues, of the truncation of an AMPD matrix do not depend on the order in which its $G_i$ factors are multiplied.
Theorem 11.3. Let $\pi = (\pi_1, \pi_2, \ldots, \pi_n)$ be an arbitrary permutation of $(1, 2, \ldots, n)$ and define $M_\pi = \prod_{i\in\pi} G_i$ with $G_i$ of the form (11.1), let A and D both be arbitrary diagonal matrices of size $n+1$, and finally set $P = P_n^{n+1}$. Then $\det(P(AM_\pi + D)P^*)$ does not depend on π.

Proof. The proof is by induction. For $n = 2$ we need to consider $P(AG_1G_2 + D)P^*$ and $P(AG_2G_1 + D)P^*$. It can be checked with the expressions of $AG_1G_2$ and $AG_2G_1$ from Example 11.2 that
$$\det(P(AG_1G_2 + D)P^*) = \det\begin{bmatrix} a_1\alpha_1 + d_1 & a_1\alpha_2\beta_1 \\ a_2\gamma_1 & a_2\alpha_2\delta_1 + d_2 \end{bmatrix}$$
and
$$\det(P(AG_2G_1 + D)P^*) = \det\begin{bmatrix} a_1\alpha_1 + d_1 & a_1\beta_1 \\ a_2\alpha_2\gamma_1 & a_2\alpha_2\delta_1 + d_2 \end{bmatrix}.$$
Both expressions give the same determinant.
For the induction step we recall that $G_n$ commutes with all $G_k$ with $k < n-1$; thus $\prod_{i\in\pi} G_i$ equals either $G_n\big(\prod_{i\in\pi'} G_i\big)$ or $\big(\prod_{i\in\pi'} G_i\big)G_n$, where $\pi'$ is the same as π after removing the value n. In the first case, with $G_n$ at the beginning (see the proof of Lemma 11.1),
$$P M_\pi P^* = P\begin{bmatrix} M'' & c_{n-1} & 0 \\ \alpha_n r_{n-1} & \alpha_n m_{n-1} & \beta_n \\ \gamma_n r_{n-1} & \gamma_n m_{n-1} & \delta_n \end{bmatrix}P^* = P\begin{bmatrix} M'' & c_{n-1} & 0 \\ \alpha_n r_{n-1} & \alpha_n m_{n-1} & 0 \\ 0 & 0 & 1 \end{bmatrix}P^*.$$
Therefore $\det[P(AM_\pi + D)P^*] = \det(\hat A\alpha_n \hat M_{\pi'} + \hat D)$, where we introduced for any matrix F the hat-notation $\hat F = PFP^*$. Note that $\hat A\alpha_n \hat M_{\pi'} + \hat D$ is an AMPD matrix of size n which has only $n-1$ factors (the scalar $\alpha_n$ can be absorbed into $\hat A$). By induction it is independent of the permutation $\pi'$.
The second case, when $G_n$ is at the end of the product, is completely similar. The $\alpha_n$ in front of $r_{n-1}$ will move to $c_{n-1}$ and we find the same expression. This implies that the determinant is not only independent of $\pi'$ but that we can also insert n anywhere in $\pi'$ and still get the same result, which proves the induction step. ■
We have as an immediate consequence that the truncation is not essential.
Corollary 11.4. The determinant, and hence also the eigenvalues, of an AMPD matrix $AM_\pi + D$ do not depend on the permutation π.
Proof. To prove this for an AMPD matrix of size n with $n-1$ factors, consider the AMPD matrix of the previous theorem and assume that $G_n$ is just the identity matrix. It is then clear that the truncated AMPD matrix $\hat A\hat M_\pi + \hat D$ is equal to an AMPD matrix of size n with $\hat M_\pi = P M_{\pi'} P^*$, where $\pi'$ equals π with n removed. By the previous theorem, its determinant is independent of the permutation $\pi'$. ■
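Corollary 11.4 in code: the spectrum of $AM_\pi + D$ computed for every permutation π of generic factors is the same (a hedged numerical sketch, not a proof):

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
n = 4   # factors G_1, ..., G_n acting on C^{n+1}

def G_factor(n, j, block):
    # Identity of size n+1 with a 2x2 block at (1-based) rows/columns j, j+1.
    M = np.eye(n + 1, dtype=complex)
    M[j - 1:j + 1, j - 1:j + 1] = block
    return M

cplx = lambda *shape: rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
factors = {j: G_factor(n, j, cplx(2, 2)) for j in range(1, n + 1)}
A, D = np.diag(cplx(n + 1)), np.diag(cplx(n + 1))

# One spectrum per permutation; all of them should coincide.
spectra = []
for perm in itertools.permutations(range(1, n + 1)):
    M = np.eye(n + 1, dtype=complex)
    for j in perm:
        M = M @ factors[j]
    spectra.append(np.sort_complex(np.linalg.eigvals(A @ M + D)))
```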
As we mentioned earlier, we could have replaced A and D by any two diagonal matrices that are functions of λ, and the determinants would still be independent of the permutation π.
RAMPD matrices are rational AMPD matrices: they are the product of an AMPD matrix and the inverse of an AMPD matrix, for example
$$R = (AM + C)(BM + D)^{-1}$$
with A, B, C, D all diagonal and M a product of G-factors. Again it can be shown that the
spectrum of R is independent of the product order in the matrix M .
Theorem 11.5. Let $R = (AM_\pi + C)(BM_\pi + D)^{-1}$ be a RAMPD matrix as defined above, with $M_\pi = \prod_{k\in\pi} G_k$. Then the eigenvalues of R do not depend on the permutation π. If $BM_\pi + D$ is not invertible, then a slightly more general formulation is that the pencil $(AM_\pi + C, BM_\pi + D)$ has eigenvalues that do not depend on π.
Proof. This problem can be conveniently reduced to the AMPD case. Indeed, the eigenvalues of the pencil are the zeros of the characteristic polynomial, which is the determinant of the matrix
$$(AM_\pi + C) - (BM_\pi + D)\lambda.$$
It is obvious that this can be rewritten as
$$(A - B\lambda)M_\pi + (C - D\lambda) = A'M_\pi + D',$$
in which $A' = A - B\lambda$ and $D' = C - D\lambda$ are diagonal matrices. By what has been proved above, the determinant of the AMPD matrix $A'M_\pi + D'$ does not depend on the permutation π. ■
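The same experiment for the RAMPD case of Theorem 11.5 (random diagonal A, B, C, D; invertibility of $BM_\pi + D$ is assumed, which holds generically):

```python
import itertools
import numpy as np

rng = np.random.default_rng(4)
n = 3

def G_factor(n, j, block):
    # Identity of size n+1 with a 2x2 block at (1-based) rows/columns j, j+1.
    M = np.eye(n + 1, dtype=complex)
    M[j - 1:j + 1, j - 1:j + 1] = block
    return M

cplx = lambda *shape: rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
factors = {j: G_factor(n, j, cplx(2, 2)) for j in range(1, n + 1)}
A, B, C, D = (np.diag(cplx(n + 1)) for _ in range(4))

spectra = []
for perm in itertools.permutations(range(1, n + 1)):
    M = np.eye(n + 1, dtype=complex)
    for j in perm:
        M = M @ factors[j]
    R = (A @ M + C) @ np.linalg.inv(B @ M + D)   # a RAMPD matrix
    spectra.append(np.sort_complex(np.linalg.eigvals(R)))
```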
In the context of ORF, the $G_k$ matrices were unitary, thus of the general form (9.1). Note that then $M_\pi = \prod_{k\in\pi} G_k$ and its unitary truncation will be unitary. Conversely, any unitary matrix can be written as a unitary truncation of such a product, because by a sequence of Givens transformations it can be brought into unitary Hessenberg form, which can be written as a product of such factors. All these elementary transforms can be merged into a unitary truncation of a product of $n+1$ factors $G_k$ if the unitary matrix has size $n+1$.
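The Hessenberg step of this reduction can be sketched directly: a unitary upper Hessenberg matrix is peeled into elementary Givens-type factors times a diagonal unitary (the construction of the test matrix via QR is our own assumption of genericity):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5

# A generic unitary upper Hessenberg matrix: the Q-factor of the QR
# factorization of an upper Hessenberg matrix is again upper Hessenberg.
H = np.triu(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)), -1)
Q, _ = np.linalg.qr(H)

# Peel off elementary factors from the left: after step k, column k
# (and row k) of what remains is a unit coordinate vector.
factors, M = [], Q.copy()
for k in range(n - 1):
    c, s = M[k, k], M[k + 1, k]
    r = np.hypot(abs(c), abs(s))       # equals 1 up to roundoff; renormalize
    c, s = c / r, s / r
    G = np.eye(n, dtype=complex)
    G[k:k + 2, k:k + 2] = [[c, -np.conj(s)], [s, np.conj(c)]]
    factors.append(G)
    M = G.conj().T @ M                 # zeros the subdiagonal entry in column k
D = M                                  # the remainder is a diagonal unitary

# Reassemble: Q = G_1 G_2 ... G_{n-1} D.
P = np.eye(n, dtype=complex)
for G in factors:
    P = P @ G
```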
The results about the zeros of the para-orthogonal functions suggested that the eigenvalues of the RAMPD matrix
$$R_\pi = \eta_A^{-1}(A + M_\pi)(A^*M_\pi + I)^{-1}\eta_A, \qquad \eta_A = (I - A^*A)^{1/2}, \tag{11.3}$$
are independent of the permutation π, and moreover, if U is the unitary matrix of eigenvectors, then the squared absolute values of the entries on the first row of U are also independent of π. This is because these values give the weights of the quadrature whose nodes are the corresponding eigenvalues, and since these quadratures are uniquely defined, the weights computed must be the same.
With the ORF interpretation of the previous section we can be more precise and prove that
the following theorem holds.
Theorem 11.6. Let π be a permutation of $(1, \ldots, n)$ and let $M_\pi = \prod_{k\in\pi} G_k$ be the product of unitary factors $G_k$ of the form (9.1) with all $\delta_k \in \mathbb D$. Let A be a diagonal matrix of size $n+1$ with all entries in $\mathbb D$. Let $R_\pi$ be the unitary matrix defined by (11.3) with eigenvalue decomposition $R_\pi = V_\pi \Lambda_\pi V_\pi^*$, with $V_\pi$ the unitary matrix whose columns are the eigenvectors. Then $\Lambda_\pi$ and the absolute values of all the entries in $V_\pi$ are independent of π.
Proof. First note that it is sufficient to prove this when $M_\pi$ is a unitary truncation of a product of $n+1$ unitary factors. As in Corollary 11.4 it will then also hold for an untruncated product of n unitary factors by choosing the factor $G_{n+1}$ to be the identity.
We shall work with only one type of $G_k$ factors (those of type $G_k^\alpha$). Using the results of the previous section, we can indeed replace all the $G_k^\beta$ by $G_k^\alpha$ if we use the basis $\{\varphi_k^\alpha = \dot\varsigma_k^\beta\phi_k : k = 0, 1, 2, \ldots\}$, where $\dot\varsigma_k^\beta = \prod_{j\in b_k}\sigma_j$, $k = 0, 1, 2, \ldots$. Furthermore we know that row i of $V_\pi$ is proportional to $[\varphi_0^\alpha(\xi_i), \ldots, \varphi_n^\alpha(\xi_i)]$, where $\Lambda_\pi = \mathrm{diag}(\xi_0, \ldots, \xi_n)$ contains the eigenvalues of $R_\pi$. We also know that reordering the entries in π will not change the eigenvalues, but it does change the eigenvectors. Assume for example that $\pi = (1, 2, \ldots, n)$; then all $\varphi_k^\alpha = \dot\varsigma_k^\beta\phi_k^\alpha = \phi_k^\alpha$, $k = 1, \ldots, n$. If we change π, then some of the $\alpha_k \in a_n$ will be replaced by $\beta_k = 1/\overline{\alpha_k}$. For the k that remain in $a_n$ we have $\varphi_k^\alpha = \dot\varsigma_k^\beta\phi_k = \dot\varsigma_k^\beta\phi_k^\alpha\dot B_k^\beta$. The entry $\varphi_k^\alpha(\xi_i) = \dot\varsigma_k^\beta\phi_k^\alpha(\xi_i)$ will thus change into $\varphi_k^\alpha(\xi_i)\dot B_k^\beta(\xi_i)$, which means a multiplication by a unimodular constant. For the other indices k, which migrate to $b_n$, the $\varphi_k^\alpha$ becomes $\varphi_k^\alpha = \dot\varsigma_k^\beta\phi_k = \dot\varsigma_k^\beta\phi_k^{\alpha*}\dot B_k^\beta$ (see Theorem 3.9). But when evaluated at $\xi_i \in \mathbb T$, clearly $\dot B_k^\beta(\xi_i) \in \mathbb T$, while also $|\phi_k^\alpha(\xi_i)| = |\phi_k^{\alpha*}(\xi_i)|$ because $\phi_k^{\alpha*}(\xi_i) = \dot B_k^\alpha(\xi_i)\dot B_k^\beta(\xi_i)\phi_{k*}^\alpha(\xi_i) = \dot B_k^\alpha(\xi_i)\dot B_k^\beta(\xi_i)\overline{\phi_k^\alpha(\xi_i)}$. This proves the theorem. ■
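A numerical probe of Theorem 11.6 (a sketch under assumptions: Θ-type unitary blocks are used as a concrete instance of the factors (9.1) with $\delta_k \in \mathbb D$, the poles are random points in $\mathbb D$, and eigenvalues are matched across permutations by their angle):

```python
import itertools
import numpy as np

rng = np.random.default_rng(6)
n = 3                                   # three unitary factors acting on C^4

def theta_factor(n, j, delta):
    # Unitary elementary factor: identity except a 2x2 unitary block
    # parametrized by delta in the open unit disk (assumed stand-in for (9.1)).
    s = np.sqrt(1.0 - abs(delta) ** 2)
    M = np.eye(n + 1, dtype=complex)
    M[j - 1:j + 1, j - 1:j + 1] = [[np.conj(delta), s], [s, -delta]]
    return M

deltas = rng.uniform(0.2, 0.8, n) * np.exp(2j * np.pi * rng.uniform(size=n))
alphas = rng.uniform(0.1, 0.6, n + 1) * np.exp(2j * np.pi * rng.uniform(size=n + 1))
A = np.diag(alphas)
eta = np.diag(np.sqrt(1.0 - np.abs(alphas) ** 2))   # eta_A = (I - A*A)^{1/2}

results = []
for perm in itertools.permutations(range(1, n + 1)):
    M = np.eye(n + 1, dtype=complex)
    for j in perm:
        M = M @ theta_factor(n, j, deltas[j - 1])
    # R_pi of (11.3); it is unitary for any unitary M and contractive A.
    R = np.linalg.solve(eta, (A + M) @ np.linalg.inv(A.conj().T @ M + np.eye(n + 1)) @ eta)
    vals, V = np.linalg.eig(R)
    order = np.argsort(np.angle(vals))              # match columns across pi
    results.append((vals[order], np.abs(V[:, order])))
```

Both the eigenvalues and the entrywise absolute values of the eigenvector matrix should agree across all permutations, as the theorem predicts.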
Remark 11.7. The condition $|\delta_k| < 1$ in the previous theorem is important, because $\delta_k \in \mathbb T$ can result in a matrix $M_\pi$ that is the direct sum of diagonal blocks. Suppose as an example $\pi = (1, 2, \ldots, n)$, so that $M_\pi$ is Hessenberg; if $\delta_n \in \mathbb T$, then the last column of $M_\pi$ will be $[0, 0, \ldots, 0, \delta_n]^*$, and this gives an eigenvector $[0, \ldots, 0, 1]^*$ whose first entry is obviously zero, so that the link with $\varphi_0^\alpha = 1$ cannot be made. In general, diagonal $G_k$ factors result in a reducible $M_\pi$, which means that it is a direct sum of submatrices. This corresponds to a breakdown of the recurrence relation of the ORFs.
Even in the case of orthogonal Laurent polynomials, that is, when the diagonal matrix A = 0, the refinement of Theorem 11.6 was not observed in [16]. Without our ORF interpretation, a proof based purely on linear algebra arguments seems to be difficult.
12 Relation with direct and inverse eigenvalue problems
We are convinced that the generalization of the ORFs that was developed in the previous sections can be efficiently used to analyse algorithms that are developed and used to solve direct and inverse eigenvalue problems. A direct problem is: given a matrix, find its eigenvalues and eigenvectors, which, as we have explained, allows for example the computation of the quadrature formula. In an inverse problem, we are given the eigenvalues and the weights of the quadrature and we have to find the matrix. Of course, the latter only makes sense if some structure is imposed on the matrix. The problem related to what has been discussed in this paper is the inverse problem for unitary Hessenberg matrices (see [14, Section 8] or much earlier papers like [1, 2, 28]). The methods described there are restricted to orthogonal polynomials. The more general recursion for orthogonal Laurent polynomials of [16] can be used to construct more general snake-shaped matrices, including Hessenberg and CMV matrices. The algorithms to solve these problems apply the recurrence relation for the orthogonal (Laurent) polynomials in reverse order, to descend from degree n to degree 0. These inverse problems relate to discrete least squares approximation. This is described in a linearized form in [41], but proper rational approximation with preassigned poles could be applied when working not with vector polynomials but with ORFs on the unit circle. A proposal is made in [42, 43], but the ORFs there are different from ours. We explain the difference in Section 12.1 below.
The direct methods apply the recursion in the forward direction. The idea is simple: the power method for a matrix A computes the vectors $v_k = A^k v$ for some vector $v = v_0$ and $k = 0, 1, 2, \ldots, q$. With proper normalization, this converges to an eigenvector corresponding to the dominant eigenvalue, while the inverse power method computes $v_{-k} = A^{-k}v$ for $k = 0, 1, 2, \ldots, p$, which approximates the dominant eigenvalue of $A^{-1}$, i.e., the smallest eigenvalue of A. For the spaces spanned by $\{v_k\}_{k=-p}^{q}$ one wants to find an orthogonal basis, to project the problem onto a space that is usually much smaller than the size of the matrix A. In polynomial terms, this means finding a set of orthogonal (Laurent) polynomials spanning the space generated by $\{z^k\}_{k=-p}^{q}$. In each step of the algorithm either p or q is increased by 1. We are thus in the case considered in [16], i.e., the sequence γ has only entries 0 and ∞. For a unitary matrix, the eigenvalues are all on the unit circle, and then the poles 0 and ∞ of the Laurent polynomials are the points inside and outside the circle that are as far away from the boundary as possible. So one may not expect the best possible approximation of the spectrum when considering a unitary truncation of the Hessenberg or general snake-shaped matrix. Therefore it is essential to select poles at strategic places much closer to the circle, to put more emphasis on regions where the eigenvalues are more important. The effect of the location of the poles can be observed for example in the numerical experiments reported in [6, 7, 10].
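The two iterations described here can be sketched in a few lines (a symmetric test matrix with known, well-separated spectrum is an assumption made for fast convergence; `power_iteration` is our own helper):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 25
Qo, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Qo @ np.diag(np.linspace(1.0, 10.0, n)) @ Qo.T   # known spectrum in [1, 10]

def power_iteration(B, steps=400):
    v = rng.standard_normal(B.shape[0])
    for _ in range(steps):
        v = B @ v
        v /= np.linalg.norm(v)         # normalize at every step
    return v @ B @ v                   # Rayleigh quotient of the unit vector

lam_max = power_iteration(A)                        # dominant eigenvalue of A
lam_min = 1.0 / power_iteration(np.linalg.inv(A))   # via the inverse power method
```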
When the matrix is not unitary, then the underlying theory is for polynomials and ORFs that are orthogonal with respect to a measure supported on the real line or a real interval, and convergence has been studied based on the asymptotic behaviour of the ORFs and the approximation error (see for example [5, 13]). For the unitary case such a theory is much less developed. One of the reasons may be that the link with the ORFs as explained in this paper is not clear. Part of the reason is that the setup in the literature is slightly different, since the Hessenberg matrix is replaced by its inverse and an extra diagonal is introduced. Since a unitary Hessenberg matrix can be written as the product of the elementary $G_k$ factors, its inverse is also the product of (inverse) $G_k$ factors and is therefore computationally as simple as a Hessenberg matrix. The study of the inverse of a Hessenberg matrix has a long history; see the many references in [45, 46]. Such an inverse is known to be related to semi-separable matrices. These are matrices in which every rectangular block that can be taken from the lower (or upper) triangular part has low rank. Several more detailed variants of the definition exist in the literature, but a simple one goes as follows: a matrix is lower (upper) semi-separable of semi-separability rank r if all submatrices which can be chosen out of the lower (upper) triangular part of the matrix have rank at most r, and rank r is reached for at least one of them. Such matrices can be characterized by few (say O(n)) parameters. For a simple example, take two vectors $v, w \in \mathbb C^n$ and form a matrix A whose upper triangular part (including the diagonal) is equal to the upper triangular part of $vw^*$ and whose strictly lower triangular part is equal to the strictly lower triangular part of $wv^*$; then A is a semi-separable matrix of rank 1 (Hermitian, except for the diagonal, which need not be real). The vectors v and w are called its generators. The generators and the ranks for upper and lower triangular parts may differ in general, and several other more general definitions exist. The inverse of an invertible lower semi-separable matrix of rank 1 is an upper Hessenberg matrix. Any Hermitian matrix is unitarily similar to a tridiagonal matrix, and the inverse of a tridiagonal matrix is again a semi-separable matrix. That explains why the literature on unitary and Hermitian eigenvalue problems and inverse eigenvalue problems deals with semi-separable (plus diagonal) matrices. The extra diagonal appears because these papers do not rely on the Möbius transform $\tilde\zeta(G)$ that we used above but use a less symmetric framework. On the other hand, when the symmetry with respect to the unit circle is lost, the poles can be anywhere in the complex plane except at infinity, which means that the important case of (Laurent) polynomials is excluded.
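Both structural facts mentioned here — the generator construction and the semi-separability of a tridiagonal inverse — are easy to check numerically (a sketch; the tolerance used in the rank test is our own assumption):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 8

# Rank-1 semi-separable matrix built from generators v and w.
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
w = rng.standard_normal(n) + 1j * rng.standard_normal(n)
S = np.triu(np.outer(v, w.conj())) + np.tril(np.outer(w, v.conj()), -1)

# Inverse of an irreducible tridiagonal matrix (made diagonally dominant
# so it is safely invertible).
T = (np.diag(rng.uniform(4.0, 6.0, n))
     + np.diag(rng.uniform(0.5, 1.5, n - 1), 1)
     + np.diag(rng.uniform(0.5, 1.5, n - 1), -1))
Tinv = np.linalg.inv(T)

def rank1_defect(block):
    # Ratio of second to first singular value: ~0 when numerical rank is 1.
    s = np.linalg.svd(block, compute_uv=False)
    return s[1] / s[0]
```

Any block taken entirely from the upper (or strictly lower) triangular part of either matrix has numerical rank 1.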
What we have developed in this paper is, however, flexible enough to also give up the symmetry with respect to the unit circle without excluding the point at infinity. This is illustrated in the next section. In the subsequent one we explain why the semi-separable-plus-diagonal algorithms are used in the literature.
12.1 Using the general recursion while dealing with infinity
Suppose γ = α; then our derivation given in Sections 8 and 9 shows that
$$\Phi(I - A^*z)\hat H^\alpha = \Phi(zI - A)$$
with $\hat H^\alpha$ an upper Hessenberg matrix. It is irreducible because the subdiagonal elements are proportional to $\sqrt{1 - |\lambda|^2}$. If we go through the derivation, then the same arguments used for the sequence α also apply when we do exactly the same steps using the recursion of Theorem 5.1 for the general sequence γ, where we assume for the moment that $\gamma_i = \infty$ does not appear. Then we again arrive at the above relation, except that all α's are replaced by γ's. Thus (assuming for simplicity that the whole sequence is regular, and recalling that all $|\gamma_k| \ne 1$)
$$\Phi(I - \Gamma^*z)\hat H^\gamma = \Phi(zI - \Gamma) \qquad \text{and} \qquad zI = (\hat H^\gamma + \Gamma)\big(I + \Gamma^*\hat H^\gamma\big)^{-1}, \tag{12.1}$$
where $\Gamma = \mathrm{diag}(\gamma_0, \gamma_1, \ldots)$ and $\eta_\gamma = (I - \Gamma^*\Gamma)^{1/2}$ in $\hat H^\gamma = \eta_\gamma^{-1}H^\gamma\eta_\gamma$, which is again upper Hessenberg.
Hessenberg. An important note is in order here. The previous notation is purely formal.
The expressions will involve quantities 1 − |λk|2 and 1 − |γk|2 that can be negative so that
their square roots, for example in the definition of ηγ is problematic. However, remember the
relation (5.1), which was the consequence of the fact that 1 − |λn|2 can only be negative if
(1− |γn|2)(1− |γn−1|2) < 0. Therefore bringing the factor ηγ and η−1
γ inside the G-factors will
give square roots of positive elements. Just note that[
$$\begin{bmatrix} \frac{1}{\sqrt{1-|\gamma_{n-1}|^2}} & 0 \\ 0 & \frac{1}{\sqrt{1-|\gamma_n|^2}} \end{bmatrix} \begin{bmatrix} -\lambda_n\eta_{n1} & \sqrt{1-|\lambda_n|^2} \\ \sqrt{1-|\lambda_n|^2} & \overline{\lambda_n}\eta_{n1} \end{bmatrix} \begin{bmatrix} \sqrt{1-|\gamma_{n-1}|^2} & 0 \\ 0 & \sqrt{1-|\gamma_n|^2} \end{bmatrix} = \begin{bmatrix} -\lambda_n\eta_{n1} & \sqrt{(1-|\lambda_n|^2)\frac{1-|\gamma_n|^2}{1-|\gamma_{n-1}|^2}} \\ \sqrt{(1-|\lambda_n|^2)\frac{1-|\gamma_{n-1}|^2}{1-|\gamma_n|^2}} & \overline{\lambda_n}\eta_{n1} \end{bmatrix}$$
and all square roots are taken of positive numbers. To keep the link with what was done for the α sequence, we used the previous notation, and we shall continue doing so in the sequel. In any case the representation $\hat H^\gamma$ in this setting is a Hessenberg matrix that can be computed in a proper way, and we do not need a more general snake-shaped matrix.
By doing this we lose the possibility to include $\gamma_k = \infty$. In our previous setting this was not a problem, because we reduced everything to the α parameters, so that $\gamma_k = \infty$ was easily dealt with by mapping it to $\alpha_k = 0$, and this fitted seamlessly in the factors $I - A^*z$ and $zI - A$. In the current general setting, however, a choice $\gamma_k = \infty$ requires the diagonal element $\varpi_k(z) = 1 - \overline{\gamma_k}z$ to be replaced by $-z$, and similarly $\varpi_k^*(z) = z - \gamma_k$ will become $-1$, which means that some of the entries in the I matrix of (12.1) have to be replaced by zero, so that the second relation does not hold anymore. To avoid this, we can use the strategy that was used before when switching between the formalism for the α and for the β sequence, except that we now use this strategy to switch between a finite and an infinite entry in the γ sequence.
Without going through all the details, the duality between α and β can be transferred to a duality between $\gamma = (\gamma_1, \gamma_2, \ldots) \subset \hat{\mathbb C} \setminus \mathbb T$ and the reciprocal sequence $\check\gamma = (\check\gamma_1, \check\gamma_2, \ldots)$ with $\check\gamma_k = 1/\overline{\gamma_k}$, $k = 1, 2, \ldots$. The ORF for the sequence γ we denote as $\phi_n$ and the ORF for the sequence $\check\gamma$ we denote as $\check\phi_n$. If $\nu = (\nu_1, \nu_2, \ldots)$ is a sequence that picks $\nu_k = \gamma_k$ if $\gamma_k$ is finite and $\nu_k = \check\gamma_k = 0$ if $\gamma_k = \infty$, then, in analogy with Theorem 3.9, with proper normalization, the ORFs for this ν sequence, which are denoted as $\phi_k^\nu$, are given by
$$\phi_n^\nu = \phi_n\check B_n \quad \text{if } \nu_n = \gamma_n \qquad \text{and} \qquad \phi_n^\nu = (\phi_n)^*\check B_n \quad \text{if } \nu_n = \check\gamma_n,$$
where
$$\check B_n = \prod_{\substack{k=1 \\ \nu_k = \check\gamma_k}}^{n} \zeta_{\nu_k}.$$
First note that as long as we are dealing with finite $\gamma_k$'s we can use the recurrence relation and write the analog of the relation given in Theorem 9.1, just replacing the α's by γ's (which by our convention means removing the superscripts):
$$\begin{bmatrix} \varpi_{n-1}^*\phi_{n-1} & \varpi_n\phi_n^* \end{bmatrix} = \begin{bmatrix} \varpi_{n-1}\phi_{n-1}^* & \varpi_n\phi_n \end{bmatrix} \hat G_n^\gamma \tag{12.2}$$
with
$$\hat G_n^\gamma = (N_n^\gamma)^{-1}\tilde G_n^\gamma N_n^\gamma, \qquad N_n^\gamma = \begin{bmatrix} \big(1 - |\gamma_{n-1}|^2\big)^{-1/2} & 0 \\ 0 & \big(1 - |\gamma_n|^2\big)^{-1/2} \end{bmatrix},$$
$$\tilde G_n^\gamma = \sigma_{n-1}\eta_{n1}\begin{bmatrix} -\lambda_n\eta_{n1} & \sqrt{1 - |\lambda_n|^2} \\ \sqrt{1 - |\lambda_n|^2} & \overline{\lambda_n}\eta_{n1} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & \sigma_n \end{bmatrix}.$$
Now, if some $\gamma_k = \infty$ is involved, we shall switch to the $\check\gamma$ sequence, which means that we avoid $\gamma_k = \infty$ and replace it by $\check\gamma_k = 0$. The factor $\check B_n$ will thus only pick up a factor $\zeta_{\nu_k} = z$ each time some $\gamma_k = \infty$, hence when $\nu_k = 0$:
$$\check B_n(z) = \prod_{\substack{j=1 \\ \gamma_j=\infty}}^{n} z = \prod_{j\in i_n} z = z^{|i_n|},$$
where $i_n = \{j : \gamma_j = \infty,\ 1 \le j \le n\}$. Since in the previous relations, as long as $\gamma_n \ne \infty$, we have $\check B_{n-1} = \check B_n$, it is no problem to write
$$\begin{bmatrix} \varpi_{n-1}^*\check B_{n-1}\phi_{n-1} & \varpi_n\check B_n\phi_n^* \end{bmatrix} = \begin{bmatrix} \varpi_{n-1}\check B_{n-1}\phi_{n-1}^* & \varpi_n\check B_n\phi_n \end{bmatrix} \hat G_n^\nu$$
with $\hat G_n^\nu = \hat G_n^\gamma$. If however $\gamma_n = \infty$, then $\varpi_n = -z$, and this z-factor we absorb in $\check B_n$; that is to say, $\varpi_n\check B_{n-1} = \varpi_n^*\check B_n$. Hence
$$\check B_{n-1}\begin{bmatrix} \varpi_{n-1}\phi_{n-1}^* & \varpi_n\phi_n \end{bmatrix}\hat G_n^\nu = \check B_{n-1}\begin{bmatrix} \varpi_{n-1}^*\phi_{n-1} & \varpi_n\phi_n^* \end{bmatrix} = \begin{bmatrix} \varpi_{n-1}^*\check B_{n-1}\phi_{n-1} & \varpi_n^*\check B_n\phi_n^* \end{bmatrix}. \tag{12.3}$$
Now $\hat G_n^\nu$ is like $\hat G_n^\gamma$ but with $\gamma_n = \infty$ replaced by 0. We can proceed exactly as in the case of the α-β duality, but now apply it to the finite-infinite duality. Thus, using $\nu_k = \gamma_k$ if $\gamma_k$ is finite and $\nu_k = 0$ when $\gamma_k = \infty$, and setting
$$\hat G = \eta_N^{-1}G\eta_N, \qquad \eta_N = (I - N^*N)^{1/2}, \qquad N = \mathrm{diag}(\nu_0, \nu_1, \nu_2, \ldots), \tag{12.4}$$
then the analog of (9.9) becomes (all the $\gamma_i$'s written explicitly as such are finite values; entries that are not finite are written as ∞)
$$\gamma = (\gamma_0, \gamma_1, \ldots, \gamma_{n-1} \,\|\, \infty, \ldots, \infty \,|\, \gamma_k \,\|\, \gamma_{k+1}, \ldots, \gamma_{m-1} \,\|\, \infty, \ldots, \infty \,|\, \gamma_\ell \,\|\, \gamma_{\ell+1}, \ldots),$$
$$\nu = (\gamma_0, \gamma_1, \ldots, \gamma_{n-1} \,\|\, 0, \ldots, 0 \,|\, \gamma_k \,\|\, \gamma_{k+1}, \ldots, \gamma_{m-1} \,\|\, 0, \ldots, 0 \,|\, \gamma_\ell \,\|\, \gamma_{\ell+1}, \ldots),$$
$$\hat G = \underbrace{\big(\hat G_1^\gamma \hat G_2^\gamma \cdots \hat G_{n-1}^\gamma\big)}_{\hat G_\gamma^1} \underbrace{\big(\hat G_k^\gamma \hat G_{k-1}^\nu \cdots \hat G_n^\nu\big)}_{\hat G_\infty^1} \underbrace{\big(\hat G_{k+1}^\gamma \cdots \hat G_{m-1}^\gamma\big)}_{\hat G_\gamma^2} \underbrace{\big(\hat G_\ell^\gamma \hat G_{\ell-1}^\nu \cdots \hat G_m^\nu\big)}_{\hat G_\infty^2} \big(\hat G_{\ell+1}^\gamma \cdots\big)$$
$$= \eta_N^{-1}\underbrace{\big(G_1^\gamma G_2^\gamma \cdots G_{n-1}^\gamma\big)}_{G_\gamma^1} \underbrace{\big(G_k^\gamma G_{k-1}^\nu \cdots G_n^\nu\big)}_{G_\infty^1} \underbrace{\big(G_{k+1}^\gamma \cdots G_{m-1}^\gamma\big)}_{G_\gamma^2} \underbrace{\big(G_\ell^\gamma G_{\ell-1}^\nu \cdots G_m^\nu\big)}_{G_\infty^2} \big(G_{\ell+1}^\gamma \cdots\big)\,\eta_N.$$
Thus in the γ blocks we multiply the successive factors $G_i^\gamma$ to the right, while in an ∞ block we multiply in reversed order, as we did in the α-β case.
Use (12.2) and multiply $\Phi^\nu(I - N^*z)$ from the right with $\hat G_\gamma^1$ to get (note $\phi_0^\nu = \phi_0^{\nu*} = 1$ and $\check B_{n-1} = 1$)
$$\big[\varpi_0^*\phi_0, \ldots, \varpi_{n-2}^*\phi_{n-2}, \varpi_{n-1}\phi_{n-1}^*\check B_{n-1} \,\big|\, \varpi_n\phi_n^*\check B_n, \ldots\big].$$
While multiplying this result with $\hat G_k^\gamma \hat G_{k-1}^\nu \cdots \hat G_{n+1}^\nu$, we make use of (12.2) and (12.3) to obtain
$$\big[\varpi_0^*\phi_0, \ldots, \varpi_{n-2}^*\phi_{n-2}, \varpi_{n-1}\phi_{n-1}^*\check B_{n-1} \,\big|\, \varpi_n^*\phi_n\check B_n, \varpi_{n+1}^*\phi_{n+1}, \ldots, \varpi_{k-1}^*\phi_{k-1} \,\big|\, \varpi_k\phi_k^*, \varpi_{k+1}\phi_{k+1}, \ldots\big]$$
and the remaining multiplication with $\hat G_n^\nu$ links the γ block to the ∞ block, so that after reintroducing the $\phi_i^\nu$ notation we get
$$\big[\varpi_0^*\phi_0^\nu, \ldots, \varpi_{n-1}^*\phi_{n-1}^\nu \,\big|\, \varpi_n^*\phi_n^\nu, \ldots, \varpi_{k-1}^*\phi_{k-1}^\nu \,\big|\, \varpi_k\phi_k^{\nu*}, \varpi_{k+1}\phi_{k+1}^\nu, \ldots\big].$$
The next block is again an γ block treated by the product Ĝ2
γ , and one may continue like this
to finally get
Φν(I −N ∗z)Ĝ = Φν(zI −N ). (12.5)
Note that to include γ_k = ∞ in the sequence, we had to give up the Hessenberg structure of H_γ in (12.1), because the G_k factors corresponding to infinite γ's are multiplied in reverse order. Thus we again end up with a snake-shaped matrix G. It looks like the matrix in Fig. 1, with upper Hessenberg blocks for the blocks with finite γ's and lower Hessenberg blocks for the sequences corresponding to the blocks of infinite γ's.
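The block structure described above can be illustrated numerically. The following sketch (an illustration, not part of the original exposition; the rotation angles are arbitrary sample values) builds elementary 2 × 2 unitary factors embedded in an identity matrix and checks that multiplying them in ascending index order gives an upper Hessenberg matrix, in descending order a lower Hessenberg matrix, and in mixed order a snake-shaped unitary matrix.

```python
import numpy as np

def rotation_factor(n, k, theta):
    """Identity matrix with an elementary 2x2 rotation block at rows/columns k, k+1."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[k:k + 2, k:k + 2] = [[c, -s], [s, c]]
    return G

n = 6
Gs = [rotation_factor(n, k, 0.3 + 0.1 * k) for k in range(n - 1)]

# Ascending index order gives an upper Hessenberg matrix ...
H_up = np.linalg.multi_dot(Gs)
# ... descending order gives a lower Hessenberg matrix ...
H_lo = np.linalg.multi_dot(Gs[::-1])
# ... and a mixed order (an ascending block followed by a descending block)
# gives a snake-shaped unitary matrix with blocks of both types.
snake = Gs[0] @ Gs[1] @ Gs[4] @ Gs[3] @ Gs[2]

print(np.allclose(np.tril(H_up, -2), 0), np.allclose(np.triu(H_lo, 2), 0))  # True True
```

Since factors with indices differing by at least two commute, only the relative order of adjacent factors determines which blocks bulge below or above the diagonal.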
This might seem a relatively complicated way of dealing with a pole at the origin, which was included more elegantly in the original setting, but this is how it is dealt with in papers such as [35] and others. The matrix obtained there is an upper Hessenberg matrix with blocks that bulge below the subdiagonal at those places where, in our structure of Fig. 1, we also have a lower Hessenberg block.
Note that whatever the shape of the matrix is, in the rational case one should analyse the spectrum of its Möbius transform as in (12.1), or analyse it as a pencil. This can be avoided at the expense of a somewhat more complicated matrix problem in the form of a semi-separable-plus-diagonal matrix.
12.2 Alternative: semi-separable-plus-diagonal matrices
If we start from the first relation in (12.1), in which Ĥ_γ is upper Hessenberg, we can see that the nth element on the right, i.e., ϖ*_n(z)φ_n(z), is written as a linear combination of the elements
\[
[\phi_0,\ \varpi_1\phi_1,\ \ldots,\ \varpi_{n+1}\phi_{n+1}]
= \left[1,\ p_1,\ \frac{p_2}{\pi_1},\ \frac{p_3}{\pi_2},\ \ldots,\ \frac{p_{n+1}}{\pi_n}\right],
\qquad \phi_k = \frac{p_k}{\pi_k},\quad p_k \in \mathcal P_k.
\]
These elements will generate all elements of the form $q_{n+1}/\pi_n$ with $q_{n+1}$ a polynomial of degree at most n + 1. It is clear that φ_n also belongs to that space. Hence there must exist an upper Hessenberg matrix H such that
\[
\Phi(I - \Gamma^* z)\,H = \Phi.
\]
A finite truncation then is
\[
\Phi_n(I_n - \Gamma_n^* z)\,H_n = \Phi_n + c_{n+1}[0,\ldots,0,\varpi_{n+1}\phi_{n+1}] \tag{12.6}
\]
with c_{n+1} a constant, I_n the identity matrix of size n + 1, and Γ_n = diag(γ_0, . . . , γ_n). If z is a zero of φ_{n+1}, then Φ_n(I_n − Γ*_n z)H_n = Φ_n, i.e., Φ_n(I_n − Γ*_n z) = Φ_n H_n^{-1}, and this can be rearranged as
\[
z\,\Phi_n = \Phi_n\big(I_n - H_n^{-1}\big)\check\Gamma_n
\]
with $\check\Gamma_n = (\Gamma_n^*)^{-1} = \operatorname{diag}(\check\gamma_0,\ldots,\check\gamma_n)$, where $\check\gamma_k = \gamma_{k*} = 1/\overline{\gamma_k}$. This clearly requires the restrictive condition that H_n and Γ_n are invertible; thus γ_k = 0 is excluded. It shows that z is an eigenvalue of the matrix (I_n − H_n^{-1})Γ̌_n and that the corresponding left eigenvector is [φ_0(z), . . . , φ_n(z)]. By a similar argument, if H_n is replaced by H^u_n, a unitary truncation of H, then the eigenvalue z will lie on T.
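In the polynomial special case (all poles at infinity, i.e., the classical Szegő setting) the unitary truncation can be made concrete. The sketch below, with hypothetical sample reflection coefficients |δ_k| < 1 (not taken from the paper), assembles a unitary upper Hessenberg matrix as a product of elementary 2 × 2 unitary factors and verifies that its eigenvalues lie on T.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

# Hypothetical sample reflection coefficients with |delta_k| < 1.
deltas = 0.9 * (rng.random(n - 1) + 1j * rng.random(n - 1)) / np.sqrt(2)

def factor(n, k, d):
    """Elementary unitary factor: identity with a 2x2 unitary block at (k, k+1)."""
    G = np.eye(n, dtype=complex)
    c = np.sqrt(1.0 - abs(d) ** 2)
    G[k:k + 2, k:k + 2] = [[d, c], [c, -np.conj(d)]]
    return G

# Ascending product of the elementary factors: a unitary upper Hessenberg matrix.
Hu = np.linalg.multi_dot([factor(n, k, d) for k, d in enumerate(deltas)])

eigs = np.linalg.eigvals(Hu)
print(np.allclose(np.abs(eigs), 1.0))  # True: all eigenvalues lie on T
```

Because each factor is unitary, the (truncated) product is unitary, so its spectrum sits on the unit circle regardless of the particular parameter values.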
These are the kinds of arguments used in, for example, [42, 43, 36] and others. They work with a slightly simplified formula because their ORF have denominators π*_n instead of our π_n. If poles at 0 and ∞ are excluded, this is simple to obtain. Suppose φ̌_n = p_n/π*_n, n = 0, 1, . . . ; then (12.6) becomes
\[
\check\Phi_n\big(zI_n - \check\Gamma_n\big)\check H_n = \check\Phi_n + \check c_{n+1}\big[0,\ldots,0,\varpi_{n+1}\check\phi_{n+1}\big].
\]
Thus if ž is a zero of φ̌_{n+1}, then
\[
\check z\,\check\Phi_n = \check\Phi_n\big(\check H_n^{-1} + \check\Gamma_n\big).
\]
This shows that ž is an eigenvalue of Ȟ_n^{-1} + Γ̌_n and the corresponding left eigenvector is Φ̌_n(ž) = [φ̌_0(ž), . . . , φ̌_n(ž)]. This Ȟ_n^{-1} + Γ̌_n is the semi-separable-plus-diagonal matrix that we mentioned before. This is the relation given in [43, Theorem 2.7], where it was obtained in the context of an inverse eigenvalue problem. See also [21, 42], which also rely on this structure.
The matrix Ȟ_n^{-1} is a semi-separable matrix, being the inverse of an unreduced Hessenberg matrix, and it is just as structured as the Hessenberg matrix itself. Indeed, if Ȟ_n can be written as a product of unitary G_k factors, except for the last one which is truncated, then the semi-separable Ȟ_n^{-1} is the product of the G_k^{-1} in reverse order. Obviously, this has the same computational complexity.
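The semiseparable structure of the inverse can be checked numerically. The following sketch (an illustration, not taken from the cited papers) inverts a random unreduced upper Hessenberg matrix and verifies the classical fact that, on and below the diagonal, the inverse coincides with a rank-one matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 7

# A random unreduced upper Hessenberg matrix: nonzero subdiagonal,
# diagonal shifted to keep it comfortably invertible.
H = np.triu(rng.standard_normal((n, n)), -1) + 3.0 * np.eye(n)
H[np.arange(1, n), np.arange(n - 1)] = 1.0 + rng.random(n - 1)

S = np.linalg.inv(H)

# Semiseparable structure: on and below the diagonal, S agrees with the
# rank-one matrix outer(u, v) built from its first column and last row.
u = S[:, 0]
v = S[-1, :] / S[-1, 0]
print(np.allclose(np.tril(S), np.tril(np.outer(u, v))))  # True
```

The generators u and v can be read off from the first column and the (scaled) last row of the inverse, which is why storing and applying such a matrix costs no more than the Hessenberg matrix itself.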
Since this approach only allows finite poles, the theory was later extended in [36], which basically uses the technique explained in the previous section to include ∞ in the algorithm. The Hessenberg matrix with bulges below the diagonal is then called an extended Hessenberg matrix, and the associated Krylov algorithms are called extended Krylov methods.
A selection of poles in an arbitrary order is implemented in this way to solve inverse and direct eigenvalue problems, QR factorization of unitary (Hessenberg) matrices, rational Krylov-type methods, rational Szegő quadrature, and rational approximation problems, as in [3, 22, 26, 35, 36, 37, 38]. We hope that future publications will illustrate that, with the approach of the first 10 sections of this paper, we have provided a more elegant framework to solve all these problems.
13 Conclusion
We have systematically developed the basics about orthogonal rational functions on the unit
circle whose poles can be freely chosen anywhere in the complex plane as long as they are not on
the unit circle T = {z ∈ C : |z| = 1}. The traditional cases of all poles outside the closed disk and
the so-called balanced situations where the poles alternate between inside and outside the disk
are included as special instances. The case where all the poles are inside the disk is somewhat
less manageable because it is not so easy to deal with the poles at infinity in this formalism, but
it has been included anyway. The link between the ORF with all poles outside, all poles inside,
and the general case with arbitrary location of the poles is clearly outlined. It is important to note that
previous attempts to consider the general situation were assuming that once some pole 1/γ is
chosen, then γ may not appear anywhere in the sequence of remaining poles, which would for
example exclude the balanced situation. We have shown how the classical Christoffel–Darboux
formula, the recurrence relation, and the relation to rational Szegő formulas can be generalized.
Finally we analyzed the matrix representation of the shift operator with respect to the general
basis as a matrix Möbius transform of a generalization of a snake-shape matrix, and how the
latter can be factorized as a product of elementary unitary 2× 2 blocks. It is shown that there
is no theoretical or computational advantage to compute rational Szegő quadrature formulas for
a general sequence of poles. It is however conceptually slightly simpler to consider the balanced
situation, i.e., the generalization of the CMV representation. In the last two sections we have
made a link with the linear algebra literature. In Section 11 we have given an illustration by
proving some properties of AMPD matrices that are directly inspired by the ORF recursion.
A relation with direct and inverse eigenvalue problems for unitary matrices and the related rational Krylov methods is briefly discussed in Section 12, illustrating that the approach given in this paper is more general and hence might be a more appropriate choice.
As a final note we remark that a very similar approach can be taken for ORF on the real line or a real interval, in which case the unitary Hessenberg matrices are replaced by tridiagonal Jacobi matrices and their generalizations when poles are introduced in arbitrary order, taken from anywhere in the complex plane excluding the real line or the interval.
Acknowledgements
We thank the anonymous referees for their careful reading of the manuscript and their sugges-
tions for improvement.
References
[1] Ammar G., Gragg W., Reichel L., Constructing a unitary Hessenberg matrix from spectral data, in Numeri-
cal Linear Algebra, Digital Signal Processing and Parallel Algorithms (Leuven, 1988), NATO Adv. Sci. Inst.
Ser. F Comput. Systems Sci., Vol. 70, Editors G. Golub, P. Van Dooren, Springer, Berlin, 1991, 385–395.
[2] Ammar G.S., He C.Y., On an inverse eigenvalue problem for unitary Hessenberg matrices, Linear Algebra
Appl. 218 (1995), 263–271.
[3] Aurentz J.L., Mach T., Vandebril R., Watkins D.S., Fast and stable unitary QR algorithm, Electron. Trans.
Numer. Anal. 44 (2015), 327–341.
[4] Beckermann B., Güttel S., Superlinear convergence of the rational Arnoldi method for the approximation
of matrix functions, Numer. Math. 121 (2012), 205–236.
[5] Beckermann B., Güttel S., Vandebril R., On the convergence of rational Ritz values, SIAM J. Matrix Anal.
Appl. 31 (2010), 1740–1774.
[6] Bultheel A., Cantero M.-J., A matricial computation of rational quadrature formulas on the unit circle,
Numer. Algorithms 52 (2009), 47–68.
[7] Bultheel A., Cantero M.-J., Cruz-Barroso R., Matrix methods for quadrature formulas on the unit circle.
A survey, J. Comput. Appl. Math. 284 (2015), 78–100.
[8] Bultheel A., Díaz-Mendoza C., González-Vera P., Orive R., On the convergence of certain Gauss-type
quadrature formulas for unbounded intervals, Math. Comp. 69 (2000), 721–747.
[9] Bultheel A., González-Vera P., Hendriksen E., Njåstad O., Orthogonal rational functions, Cambridge Mono-
graphs on Applied and Computational Mathematics, Vol. 5, Cambridge University Press, Cambridge, 1999.
[10] Bultheel A., González-Vera P., Hendriksen E., Njåstad O., Computation of rational Szegő–Lobatto quadra-
ture formulas, Appl. Numer. Math. 60 (2010), 1251–1263.
[11] Bultheel A., González-Vera P., Hendriksen E., Njåstad O., Rational quadrature formulas on the unit circle
with prescribed nodes and maximal domain of validity, IMA J. Numer. Anal. 30 (2010), 940–963.
[12] Cantero M.J., Moral L., Velázquez L., Minimal representations of unitary operators and orthogonal poly-
nomials on the unit circle, Linear Algebra Appl. 408 (2005), 40–65, math.CA/0405246.
[13] Chesnokov A., Deckers K., Van Barel M., A numerical solution of the constrained weighted energy problem,
J. Comput. Appl. Math. 235 (2010), 950–965.
[14] Chu M.T., Golub G.H., Structured inverse eigenvalue problems, Acta Numer. 11 (2002), 1–71.
[15] Cruz-Barroso R., Daruis L., González-Vera P., Njåstad O., Sequences of orthogonal Laurent polynomials,
bi-orthogonality and quadrature formulas on the unit circle, J. Comput. Appl. Math. 200 (2007), 424–440.
[16] Cruz-Barroso R., Delvaux S., Orthogonal Laurent polynomials on the unit circle and snake-shaped matrix
factorizations, J. Approx. Theory 161 (2009), 65–87, arXiv:0712.2738.
[17] Cruz-Barroso R., Díaz Mendoza C., Perdomo-Pío F., Szegő-type quadrature formulas, J. Math. Anal. Appl.
455 (2017), 592–605.
[18] Cruz-Barroso R., González-Vera P., A Christoffel–Darboux formula and a Favard’s theorem for orthogonal
Laurent polynomials on the unit circle, J. Comput. Appl. Math. 179 (2005), 157–173.
[19] de la Calle Ysern B., González-Vera P., Rational quadrature formulae on the unit circle with arbitrary poles,
Numer. Math. 107 (2007), 559–587.
[20] Díaz-Mendoza C., González-Vera P., Jiménez-Paiz M., Strong Stieltjes distributions and orthogonal Laurent
polynomials with applications to quadratures and Padé approximation, Math. Comp. 74 (2005), 1843–1870.
[21] Fasino D., Gemignani L., Direct and inverse eigenvalue problems for diagonal-plus-semiseparable matrices,
Numer. Algorithms 34 (2003), 313–324.
[22] Fasino D., Gemignani L., Structured eigenvalue problems for rational Gauss quadrature, Numer. Algorithms
45 (2007), 195–204.
[23] Fritzsche B., Kirstein B., Lasarow A., Orthogonal rational matrix-valued functions on the unit circle, Math.
Nachr. 278 (2005), 525–553.
[24] Fritzsche B., Kirstein B., Lasarow A., Orthogonal rational matrix-valued functions on the unit circle: re-
currence relations and a Favard-type theorem, Math. Nachr. 279 (2006), 513–542.
[25] Fritzsche B., Kirstein B., Lasarow A., Para-orthogonal rational matrix-valued functions on the unit circle,
Oper. Matrices 6 (2012), 631–679.
[26] Gemignani L., A unitary Hessenberg QR-based algorithm via semiseparable matrices, J. Comput. Appl.
Math. 184 (2005), 505–517.
[27] Golub G.H., Meurant G., Matrices, moments and quadrature with applications, Princeton Series in Applied
Mathematics, Princeton University Press, Princeton, NJ, 2010.
[28] Gragg W.B., Positive definite Toeplitz matrices, the Arnoldi process for isometric operators, and Gaussian
quadrature on the unit circle, J. Comput. Appl. Math. 46 (1993), 183–198.
[29] Güttel S., Rational Krylov approximation of matrix functions: numerical methods and optimal pole selection,
GAMM-Mitt. 36 (2013), 8–31.
[30] Güttel S., Knizhnerman L., A black-box rational Arnoldi variant for Cauchy–Stieltjes matrix functions, BIT
53 (2013), 595–616.
[31] Jones W.B., Njåstad O., Thron W.J., Two-point Padé expansions for a family of analytic functions, J. Com-
put. Appl. Math. 9 (1983), 105–123.
[32] Jones W.B., Thron W.J., Njåstad O., Orthogonal Laurent polynomials and the strong Hamburger moment
problem, J. Math. Anal. Appl. 98 (1984), 528–554.
[33] Knizhnerman L., Simoncini V., A new investigation of the extended Krylov subspace method for matrix
function evaluations, Numer. Linear Algebra Appl. 17 (2010), 615–638.
[34] Lasarow A., Aufbau einer Szegő-Theorie für rationale Matrixfunktionen, Ph.D. Thesis, Universität Leipzig,
Fak. Mathematik Informatik, 2000.
[35] Mach T., Van Barel M., Vandebril R., Inverse eigenvalue problems linked to rational Arnoldi, and rational,
(non)symmetric Lanczos, 2013, Tech. Rep. TW629, Dept. Computer Science, KU Leuven.
[36] Mach T., Van Barel M., Vandebril R., Inverse eigenvalue problems for extended Hessenberg and extended
tridiagonal matrices, J. Comput. Appl. Math. 272 (2014), 377–398.
[37] Mertens C., Short recurrence relations for (extended) Krylov subspaces, Ph.D. Thesis, Department of Com-
puter Science, Faculty of Engineering Science, KU Leuven, 2016, available at https://lirias.kuleuven.
be/handle/123456789/538706.
[38] Mertens C., Vandebril R., Short recurrences for computing extended Krylov bases for Hermitian and unitary
matrices, Numer. Math. 131 (2015), 303–328.
[39] Njåstad O., Santos-León J.C., Domain of validity of Szegő quadrature formulas, J. Comput. Appl. Math.
202 (2007), 440–449.
[40] Thron W.J., L-polynomials orthogonal on the unit circle, in Nonlinear Numerical Methods and Rational
Approximation (Wilrijk, 1987), Math. Appl., Vol. 43, Reidel, Dordrecht, 1988, 271–278.
[41] Van Barel M., Bultheel A., Discrete linearized least-squares rational approximation on the unit circle,
J. Comput. Appl. Math. 50 (1994), 545–563.
[42] Van Barel M., Fasino D., Gemignani L., Mastronardi N., Orthogonal rational functions and diagonal-plus-
semiseparable matrices, in Advanced Signal Processing Algorithms, Architectures, and Implementations XII,
Proceedings of SPIE , Vol. 4791, Editor F. Luk, SPIE, 2002, 162–170.
[43] Van Barel M., Fasino D., Gemignani L., Mastronardi N., Orthogonal rational functions and structured
matrices, SIAM J. Matrix Anal. Appl. 26 (2005), 810–829.
[44] Van Beeumen R., Meerbergen K., Michiels W., A rational Krylov method based on Hermite interpolation
for nonlinear eigenvalue problems, SIAM J. Sci. Comput. 35 (2013), A327–A350.
[45] Vandebril R., Van Barel M., Mastronardi N., Matrix computations and semiseparable matrices, Vol. 1,
Linear systems, Johns Hopkins University Press, Baltimore, MD, 2008.
[46] Vandebril R., Van Barel M., Mastronardi N., Matrix computations and semiseparable matrices, Vol. 2,
Eigenvalue and singular value methods, Johns Hopkins University Press, Baltimore, MD, 2008.
[47] Velázquez L., Spectral methods for orthogonal rational functions, J. Funct. Anal. 254 (2008), 954–986,
arXiv:0704.3456.