Applied regression analysis / Praktic / 1 / Literature / gaussid
.pdfgaussian identities
sam roweis
(revised July 1999)
0.1multidimensional gaussian
a d-dimensional multidimensional gaussian (normal) density for x is:
N ( ; ) = (2 ) d=2j j 1=2 exp |
2 |
(x )T 1 |
(x ) |
(1) |
||||
|
|
|
1 |
|
|
|
||
it has entropy: |
|
|
|
|
|
|
|
|
S = |
1 |
log2 h(2 e)dj ji const bits |
|
(2) |
||||
|
|
|||||||
2 |
|
where is a symmetric postive semi-de nite covariance matrix and the (unfortunate) constant is the log of the units in which x is measured over the \natural units"
0.2linear functions of a normal vector
no matter how x is distributed,
|
|
E[Ax + y] = A(E[x]) + y |
(3a) |
|||||||
|
|
Covar[Ax + y] = A(Covar[x])AT |
(3b) |
|||||||
in particular this means that for normal distributed quantities: |
||||||||||
x s N ( ; ) ) (Ax + y) s N A + y; A AT |
(4a) |
|||||||||
x |
s N |
( ; ) |
) |
1=2(x |
|
) |
|
N |
(0; I) |
|
|
|
|
|
|
|
|
s |
|
|
|
x s N ( ; ) ) (x )T 1(x ) s n2 |
(4c) |
1
0.3marginal and conditional distributions
let the vector z = [xT yT ]T be normally distributed according to:
z = |
y s N |
b |
; |
CT |
B |
(5a) |
|
x |
a |
|
A |
C |
|
where C is the (non-symmetric) cross-covariance matrix between x and y which has as many rows as the size of x and as many columns as the size of y. then the marginal distributions are:
|
|
|
|
|
x s N (a; A) |
|
|
|
(5b) |
|||
|
|
|
|
|
y s N (b; B) |
|
|
|
(5c) |
|||
and the conditional distributions are: |
|
|
|
|
|
|||||||
xjy s N |
|
a + CB 1(y b); A CB 1CT |
|
(5d) |
||||||||
y |
x |
s |
|
b + C |
A |
(x a); B C |
A |
(5e) |
||||
j |
|
|
N |
|
|
|
|
|
|
|
|
|
0.4multiplication
the multiplication of two gaussian functions is another gaussian function (although no longer normalized). in particular,
N (a; A) N (b; B) / N (c; C) |
(6a) |
|||||
where |
|
|
|
|
|
|
C = |
|
A 1 + B 1 |
1 |
(6b) |
||
c = |
|
|
|
1b |
(6c) |
|
|
CA |
1a + CB |
amazingly, the normalization constant zc is Gaussian in either a or b: |
|
||||
zc = (2 ) d=2jCj+1=2jAj 1=2jBj 1=2 exp |
2(aT A 1a + bT B 1b cT C 1c) |
||||
|
|
1 |
|
||
|
|
|
|
|
(6d) |
zc(a) s |
(A 1CA 1) 1(A 1CB 1)b; (A 1CA 1) 1 |
(6e) |
|||
zc(b) s N |
(B 1CB 1) 1(B 1CA 1)a; (B 1CB 1) 1 |
(6f) |
|||
N |
|
|
|
|
|
2
0.5quadratic forms
the expectation of a quadratic form under a gaussian is another quadratic form (plus an annoying constant). in particular, if x is gaussian distributed with mean m and variance S then,
Z
(x )T 1(x )N (m; S) dx
x
= ( m)T 1( m) + Tr 1S |
(7a) |
if the original quadratic form has a linear function of x the result is still simple:
Z
(Wx )T 1(Wx )N (m; S) dx
x
= ( Wm)T 1( Wm) + Tr WT 1WS |
(7b) |
0.6convolution
the convolution of two gaussian functions is another gaussian function (although no longer normalized). in particular,
N (a; A) N (b; B) / N (a + b; A + B) |
(8) |
this is a direct consequence of the fact that the Fourier transform of a gaussian is another gaussian and that the multiplication of two gaussians is still gaussian.
0.7Fourier transform
the (inverse)Fourier transform of a gaussian function is another gaussian function (although no longer normalized). in particular,
|
F [N (a; A)] |
/ N jA 1a; A 1 |
(9a) |
||||
F |
1 |
[ |
N |
(b; B)] |
|
jB 1b; B 1 |
(9b) |
|
|
|
/ N |
|
|
p where j = 1
3
0.8constrained maximization
the maximum over x of the quadratic form:
T x 12xT A 1x subject to the J conditions cj(x) = 0 is given by:
A + AC ; = 4(CT AC)CT A where the jth column of C is @cj(x)=@x
(10a)
(10b)
4