Теория информации / Cover T.M., Thomas J.A. Elements of Information Theory. 2006., 748p
.pdf636 INFORMATION THEORY AND PORTFOLIO THEORY
where (16.117) follows from (16.109) and (16.119) follows from (16.112). Consequently, we have
|
|
ˆ |
n |
) |
|
|
|
|
max min |
Sn |
(x |
|
V . |
(16.120) |
|||
S |
(xn) ≥ |
|||||||
ˆ |
xn |
n |
|
|||||
b |
n |
|
|
|
|
|
We have thus demonstrated a portfolio on the 2n possible sequences of
length that achieves wealth ˆn xn within a factor n of the wealth n S ( ) V
Sn (xn) achieved by the best constant rebalanced portfolio in hindsight. To complete the proof of the theorem, we show that this is the best possible, that is, that any nonanticipating portfolio bi (xi−1) cannot do better than a factor Vn in the worst case (i.e., for the worst choice of xn). To prove this, we construct a set of extremal stock market sequences and show that the performance of any nonanticipating portfolio strategy is bounded by Vn for at least one of these sequences, proving the worst-case bound.
n |
{ |
} |
n, we define the corresponding extremal stock mar- |
|||||||
For each j n |
|
|
1, |
2 |
||||||
ket vector x (j |
n |
) as |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|||
|
|
|
|
|
|
|
(1, 0)t |
if j |
1, |
|
|
|
|
|
xi |
(ji ) |
= |
(0, 1)t |
if ji = |
2, |
(16.121) |
|
|
|
|
|
|
|
i |
|
|
|
|
|
|
|
|
|
|
|
= |
|
|
Let e1 = (1, 0)t , e2 = (0, 1)t |
be standard basis vectors. Let |
|
||||||||
|
|
|
K = {x(j n) : j n {1, 2}n, xiji |
= eji } |
(16.122) |
be the set of extremal sequences. There are 2n such extremal sequences, and for each sequence at each time, there is only one stock that yields a nonzero return. The wealth invested in the other stock is lost. Therefore, the wealth at the end of n periods for extremal sequence xn(j n) is the product of the amounts invested in the stocks j1, j2, . . . , jn, [i.e., Sn(xn(j n)) = i bji = w(j n)]. Again, we can view this as an investment on sequences of length n, and given the 0 – 1 nature of the return, it is easy to see for xn K that
Sn(xn(j n)) = 1. |
(16.123) |
j n
For any extremal sequence xn(j n) K, the best constant rebalanced portfolio is
n |
n |
)) = |
|
n (j n) |
|
n (j n) |
|
t |
(16.124) |
b (x (j |
|
1 n |
, |
2 n |
, |
638 INFORMATION THEORY AND PORTFOLIO THEORY
where |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
w(jˆ |
i ) = |
n i |
n w(j n) |
(16.133) |
||||||||||||||||
|
|
|
|
|
|
|
|
|
j |
:j j |
|
|
|
|
|
|
|
|
|||
is the weight put on all sequences j n that start with j i , and |
|
||||||||||||||||||||
|
|
|
|
|
|
|
|
|
|
i−1 |
|
||||||||||
|
|
x(j i−1) = xkjk |
(16.134) |
||||||||||||||||||
|
|
|
|
|
|
|
|
|
|
k=1 |
|
||||||||||
is the return on those sequences as defined in (16.106). |
|
||||||||||||||||||||
Investigation of the asymptotics of Vn reveals [401, 496] that |
|
||||||||||||||||||||
|
|
|
|
|
|
m−1 |
|
|
|
|
|
|
|
|
|
||||||
Vn |
|
|
2 |
|
(m/2)/√ |
|
|
(16.135) |
|||||||||||||
|
|
π |
|||||||||||||||||||
n |
|||||||||||||||||||||
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|||||||
for m assets. In particular, for m = 2 assets, |
|
||||||||||||||||||||
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
||||||
|
|
|
Vn |
|
|
2 |
|
|
|
|
|
|
(16.136) |
||||||||
|
|
|
|
|
π n |
||||||||||||||||
|
|
|
|
|
|
|
|
|
|
|
|
||||||||||
and |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
|
|
|
|
|
|
≤ Vn |
|
≤ |
2 |
|
(16.137) |
||||||||
|
√ |
|
|
|
|
|
|
|
|
√ |
|
||||||||||
2 |
n + 1 |
|
|
n + 1 |
|||||||||||||||||
|
|
|
|
|
|
|
|
|
|
for all n [400]. Consequently, for m = 2 stocks, the causal portfolio strat-
ˆ i−1 ˆ n
egy bi (x ) given in (16.132) achieves wealth Sn(x ) such that
ˆ |
|
n |
) |
|
|
1 |
|
|
|
Sn |
(x |
≥ Vn ≥ |
√ |
|
|
(16.138) |
|||
S |
|
(xn) |
|
|
|
||||
n |
|
1 |
|||||||
|
|
|
2 |
+ |
|
||||
|
n |
|
|
|
|
|
|
|
for all market sequences xn.
16.7.2Horizon-Free Universal Portfolios
We describe the horizon-free strategy in terms of a weighting of different portfolio strategies. As described earlier, each constantly rebalanced portfolio b can be viewed as corresponding to a mutual fund that rebalances the m assets according to b. Initially, we distribute the wealth among these funds according to a distribution µ(b), where dµ(b) is the amount of wealth invested in portfolios in the neighborhood db of the constantly rebalanced portfolio b.
16.7 UNIVERSAL PORTFOLIOS |
641 |
We now apply this lemma when µ(b) is the Dirichlet( 12 ) distribution.
|
|
|
|
|
|
|
|
|
|
|
ˆ |
= |
|
|
Theorem 16.7.2 For the causal universal portfolio bi ( ), i |
1, 2, . . ., |
|||||||||||||
given in (16.141), with m = 2 stocks and dµ(b) the Dirichlet( |
1 |
, |
1 |
|||||||||||
2 |
2 ) distri- |
|||||||||||||
bution, we have |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
ˆ |
|
(x |
n |
) |
|
|
1 |
|
|
|
|
|
|
|
Sn |
|
≥ |
√ |
|
|
, |
|
|
|
||||
|
S |
|
(x n) |
|
|
|
|
|
|
|||||
|
n |
|
1 |
|
|
|
||||||||
|
|
|
|
|
2 |
+ |
|
|
|
|
||||
|
|
n |
|
|
|
|
|
|
|
|
|
|
|
for all n and all stock sequences x n.
Proof: As in the discussion preceding (16.112), we can show that the weight put by the best constant portfolio b on the sequence j n is
n |
bji |
|
|
k |
k |
n − k |
n−k |
|
2−nH (k/n), |
(16.156) |
|
= |
n |
|
= |
||||||
|
|
|
n |
|
|
|||||
i=1 |
|
|
|
|
|
|
|
|
|
|
where k is the number of indices where ji = 1. We can also explicitly calculate the integral in the numerator of (16.148) in Lemma 16.7.2 for the Dirichlet( 12 ) density, defined for m variables as
|
|
( m2 ) |
|
m |
|
||
|
|
|
b− 21 db, |
|
|||
dµ(b) = |
|
|
|
|
|
j |
(16.157) |
|
|
1 |
|
m |
|||
|
|
2 |
|
j =1 |
|
where (x) = 0∞ e−t t x−1 dt denotes the gamma function. For simplicity,
we consider |
the case of two stocks, in which case |
|
||||||
|
|
|
|
|
|
|
|
|
dµ(b) = |
1 |
|
√ |
1 |
db, |
0 ≤ b ≤ 1, |
(16.158) |
|
|
|
|
||||||
π |
|
|||||||
|
|
|
|
|
b(1 − b) |
|
|
|
where b is the fraction of wealth invested in stock 1. Now consider any sequence j n {1, 2}n, and consider the amount invested in that sequence,
|
|
n |
|
= bl (1 − b)n−l , |
|
|
||||||
|
b(j n) = bji |
|
(16.159) |
|||||||||
|
|
i=1 |
|
|
|
|
|
|
|
|
|
|
where l is the number of indices where ji = 1. Then |
|
|
||||||||||
|
|
= |
|
|
− |
|
π |
|
√ |
1 |
|
|
|
b(j n) dµ(b) |
|
bl (1 |
|
b)n−l |
1 |
|
|
db |
(16.160) |
||
|
|
|
|
|
||||||||
|
|
|
|
|
|
|
|
|
|
b(1 − b) |
|
|
644 INFORMATION THEORY AND PORTFOLIO THEORY
16.8 SHANNON–MCMILLAN–BREIMAN THEOREM (GENERAL AEP)
The AEP for ergodic processes has come to be known as the Shannon – McMillan – Breiman theorem. In Chapter 3 we proved the AEP for i.i.d. processes. In this section we offer a proof of the theorem for a general ergodic process. We prove the convergence of n1 log p(Xn) by sandwiching it between two ergodic sequences.
In a sense, an ergodic process is the most general dependent process for which the strong law of large numbers holds. For finite alphabet processes, ergodicity is equivalent to the convergence of the kth-order empirical distributions to their marginals for all k.
The technical definition requires some ideas from probability theory. To be precise, an ergodic source is defined on a probability space ( , B, P ), where B is a σ -algebra of subsets of and P is a probability measure. A random variable X is defined as a function X(ω), ω , on the probability space. We also have a transformation T : → , which plays the role of a time shift. We will say that the transformation is stationary if P (T A) = P (A) for all A B. The transformation is called ergodic if every set A such that T A = A, a.e., satisfies P (A) = 0 or 1. If T is stationary and ergodic, we say that the process defined by Xn(ω) = X(T nω) is stationary and ergodic. For a stationary ergodic source, Birkhoff’s ergodic theorem states that
n |
|
|
→ |
|
= |
|
|
1 |
n |
Xi (ω) |
|
EX |
|
X dP with probability 1. |
(16.172) |
|
|
|
i=1
Thus, the law of large numbers holds for ergodic processes. We wish to use the ergodic theorem to conclude that
n−1
− n1 log p(X0, X1, . . . , Xn−1) = − n1 log p(Xi |X0i−1)
i=0
|
→ n |
|
− |
log p(X |
n| |
0 |
|
||||||
|
|
|
lim E[ |
|
|
Xn−1)]. (16.173) |
|||||||
|
|
|
|
→∞ |
|
|
|
|
|
|
|
|
|
But the stochastic sequence p(X |
i | |
Xi−1) is |
not |
ergodic. However, |
the |
||||||||
|
|
|
0 |
|
|
|
Xi−1 ) are ergodic |
|
|||||
closely related quantities p(X |
i | |
Xi−1) and |
p(X |
and |
|||||||||
|
i−k |
|
|
|
i | |
−∞ |
|
have expectations easily identified as entropy rates. We plan to sandwich p(Xi |X0i−1) between these two more tractable processes.