Given two Gaussian distributions and the relationship between them, how do we find the conditional expectation?

Below, h and x are two Gaussian-distributed variables whose distribution functions are given, and x = Ph. How is the final conditional expectation E(h|x) obtained?

I tried to reason it out as follows, but could not get any further.


Thanks for the invite.

Again, I am passionate about this kind of technicality.

I do not redefine the variables in the question, but let me introduce a bit more notation. Note that $\mathbf{h}$ has one more dimension than $\mathbf{x}$: apart from the first component of $\mathbf{h}$, we have $x_i = \mu + \tilde{h}_i$. For later convenience I write

$$\mathbf{h} = \left[ \begin{array}{c} \mu \\ \tilde{\mathbf{h}} \end{array} \right],$$

where $\mu$ is the first component and $\tilde{\mathbf{h}}$ contains the rest. (Concretely, this means $\mathbf{P} = [\, \mathbf{1}_c \;\; I_n \,]$, with $\mathbf{1}_c$ the $n$-dimensional column of ones, and the calculation below takes $\Sigma_h = \operatorname{diag}(S_\mu, S_h, \dots, S_h)$, i.e. $\mu$ and the $\tilde{h}_i$ are independent with variances $S_\mu$ and $S_h$.)
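To make the setup concrete in code, here is a minimal numpy sketch; the values $n = 5$, $S_\mu = 2$, $S_h = 3$ are arbitrary placeholders of my own:

```python
import numpy as np

n, S_mu, S_h = 5, 2.0, 3.0  # arbitrary placeholder values

# P = [1_c | I_n]: applying it to h = (mu, h~) gives x_i = mu + h~_i
P = np.hstack([np.ones((n, 1)), np.eye(n)])

# Sigma_h = diag(S_mu, S_h, ..., S_h): mu and the h~_i are independent
Sigma_h = np.diag([S_mu] + [S_h] * n)

# sanity check on a concrete h
h = np.concatenate([[1.5], np.arange(n, dtype=float)])
assert np.allclose(P @ h, h[0] + h[1:])
```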

On the other hand, given the $\Sigma_x$ calculated above, one can easily show that its inverse is

$$\Sigma_x^{-1} = \left[ \begin{array}{ccc} \frac{1}{S_h} - \frac{S_\mu}{S_h} \frac{1}{n S_\mu + S_h} & -\frac{S_\mu}{S_h} \frac{1}{n S_\mu + S_h} & \dotsc \\ -\frac{S_\mu}{S_h} \frac{1}{n S_\mu + S_h} & \frac{1}{S_h} - \frac{S_\mu}{S_h} \frac{1}{n S_\mu + S_h} & \dotsc \\ \vdots & \vdots & \ddots \end{array} \right],$$

that is, $\Sigma_x^{-1} = \frac{1}{S_h} I_n - \frac{S_\mu}{S_h (n S_\mu + S_h)} \mathbf{1}_c \mathbf{1}_c^T$, which is just the Sherman–Morrison formula applied to $\Sigma_x = S_h I_n + S_\mu \mathbf{1}_c \mathbf{1}_c^T$.

(Note: I found this out by clumsily going through the calculation below, but verification is easy. And this is actually the main obstacle of this exercise.)
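If you would rather check the inverse (and the determinant derived further below) numerically than by hand, here is a minimal numpy sketch, again with arbitrary placeholder values:

```python
import numpy as np

n, S_mu, S_h = 5, 2.0, 3.0  # arbitrary placeholder values
ones = np.ones((n, 1))

# Sigma_x = P Sigma_h P^T = S_h I_n + S_mu 1_c 1_c^T
Sigma_x = S_h * np.eye(n) + S_mu * (ones @ ones.T)

# the closed-form inverse quoted above (Sherman–Morrison)
Sigma_x_inv = np.eye(n) / S_h - (S_mu / (S_h * (n * S_mu + S_h))) * (ones @ ones.T)
assert np.allclose(Sigma_x @ Sigma_x_inv, np.eye(n))

# the determinant formula derived further below
assert np.isclose(np.linalg.det(Sigma_x), S_mu * S_h**n * (1 / S_mu + n / S_h))
```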

First we define the joint probability density $P(\mathbf{x}, \mathbf{h})$. We know that $\mathbf{h}$ obeys a Gaussian distribution and that $\mathbf{x}$ is a linear superposition of the components of $\mathbf{h}$; the most natural way to describe this is with a Dirac delta function:

$$P(\mathbf{x}, \mathbf{h}) = \frac{1}{\sqrt{(2\pi)^{n+1} |\Sigma_h|}} \exp\left( -\frac{1}{2} \mathbf{h}^T \Sigma_h^{-1} \mathbf{h} \right) \delta^{(n)} (\mathbf{x} - \mathbf{P} \mathbf{h}).$$

As we know, the probability distribution of $\mathbf{x}$ has the form

$$P(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^n |\Sigma_x|}} \exp\left( -\frac{1}{2} \mathbf{x}^T \Sigma_x^{-1} \mathbf{x} \right).$$

However, we can also obtain $P(\mathbf{x})$ by integrating out $\mathbf{h}$:

$$\begin{aligned} P(\mathbf{x}) &= \int d^{n+1}\mathbf{h}\; P(\mathbf{x}, \mathbf{h}) \\ &= \frac{1}{\sqrt{(2\pi)^{n+1} |\Sigma_h|}} \int d\mu \int d^n \tilde{\mathbf{h}}\; \exp\left( -\frac{1}{2} \frac{\mu^2}{S_\mu} - \frac{1}{2} \tilde{\mathbf{h}}^T \Sigma_{\tilde{h}}^{-1} \tilde{\mathbf{h}} \right) \prod_{i=1}^n \delta(x_i - \mu - \tilde{h}_i) \\ &= \frac{1}{\sqrt{(2\pi)^{n+1} S_\mu S_h^n}} \int d\mu\; \exp\left[ -\frac{1}{2} \left( \frac{1}{S_\mu} + \frac{n}{S_h} \right) \left( \mu - \frac{1}{\frac{1}{S_\mu} + \frac{n}{S_h}} \sum_{i=1}^n \frac{x_i}{S_h} \right)^2 - \frac{1}{2} \mathbf{x}^T \Sigma_x^{-1} \mathbf{x} \right] \\ &= \frac{1}{\sqrt{(2\pi)^n S_\mu S_h^n \left( \frac{1}{S_\mu} + \frac{n}{S_h} \right)}} \exp\left( -\frac{1}{2} \mathbf{x}^T \Sigma_x^{-1} \mathbf{x} \right), \end{aligned}$$

from which you can read off the determinant of $\Sigma_x$: $|\Sigma_x| = S_\mu S_h^n \left( \frac{1}{S_\mu} + \frac{n}{S_h} \right)$.
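The only nontrivial step in the chain above is the completion of the square in $\mu$ once the delta functions have set $\tilde{h}_i = x_i - \mu$. Spelled out, with the shorthand $m$ (my own notation) for the stationary point:

$$-\frac{1}{2} \frac{\mu^2}{S_\mu} - \frac{1}{2} \sum_{i=1}^n \frac{(x_i - \mu)^2}{S_h} = -\frac{1}{2} \left( \frac{1}{S_\mu} + \frac{n}{S_h} \right) (\mu - m)^2 - \frac{1}{2} \mathbf{x}^T \Sigma_x^{-1} \mathbf{x}, \qquad m = \frac{S_\mu}{S_h + n S_\mu} \sum_{i=1}^n x_i,$$

where the $\mu$-independent leftovers collect into $-\frac{1}{2} \left[ \frac{1}{S_h} \mathbf{x}^T \mathbf{x} - \frac{S_\mu}{S_h (S_h + n S_\mu)} \left( \sum_i x_i \right)^2 \right]$, which is exactly $-\frac{1}{2} \mathbf{x}^T \Sigma_x^{-1} \mathbf{x}$ for the $\Sigma_x^{-1}$ quoted earlier.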

Then the conditional expectation can be calculated:

$$\begin{aligned} E(\mathbf{h} \mid \mathbf{x}) &= \int d^{n+1}\mathbf{h}\; \mathbf{h}\, P(\mathbf{h} \mid \mathbf{x}) = \int d^{n+1}\mathbf{h}\; \frac{\mathbf{h}\, P(\mathbf{x}, \mathbf{h})}{P(\mathbf{x})} \\ &= \frac{1}{\sqrt{2\pi\, |\Sigma_h| / |\Sigma_x|}} \int d\mu \int d^n \tilde{\mathbf{h}} \left[ \begin{array}{c} \mu \\ \tilde{\mathbf{h}} \end{array} \right] \exp\left( -\frac{1}{2} \frac{\mu^2}{S_\mu} - \frac{1}{2} \tilde{\mathbf{h}}^T \Sigma_{\tilde{h}}^{-1} \tilde{\mathbf{h}} + \frac{1}{2} \mathbf{x}^T \Sigma_x^{-1} \mathbf{x} \right) \prod_{i=1}^n \delta(x_i - \mu - \tilde{h}_i) \\ &= \sqrt{\frac{\frac{1}{S_\mu} + \frac{n}{S_h}}{2\pi}} \int d\mu \left[ \begin{array}{c} \mu \\ \mathbf{x} - \mu \mathbf{1}_c \end{array} \right] \exp\left( -\frac{1}{2} \frac{\mu^2}{S_\mu} - \frac{1}{2} (\mathbf{x} - \mu \mathbf{1}_c)^T \Sigma_{\tilde{h}}^{-1} (\mathbf{x} - \mu \mathbf{1}_c) + \frac{1}{2} \mathbf{x}^T \Sigma_x^{-1} \mathbf{x} \right). \end{aligned}$$

In the above we simply exploited the definition of conditional probability and integrated over $\tilde{\mathbf{h}}$ against the Dirac delta functions. There is a lot of algebra involved, which readers can diligently verify on their own; it is the same completion of the square as before. Continuing the calculation gives

$$\begin{aligned} E(\mathbf{h} \mid \mathbf{x}) &= \sqrt{\frac{\frac{1}{S_\mu} + \frac{n}{S_h}}{2\pi}} \int d\mu \left[ \begin{array}{c} \mu \\ \mathbf{x} - \mu \mathbf{1}_c \end{array} \right] \exp\left[ -\frac{1}{2} \left( \frac{1}{S_\mu} + \frac{n}{S_h} \right) \left( \mu - \frac{S_\mu}{S_h + n S_\mu} \sum_{i=1}^n x_i \right)^2 \right] \\ &= \left[ \begin{array}{c} \frac{S_\mu}{S_h + n S_\mu} \sum_{i=1}^n x_i \\ \mathbf{x} - \frac{S_\mu}{S_h + n S_\mu} \sum_{i=1}^n x_i\, \mathbf{1}_c \end{array} \right]. \end{aligned}$$

A careful bit of algebra shows that this is exactly equal to $\Sigma_h \mathbf{P}^T \Sigma_x^{-1} \mathbf{x}$.

I have skipped many details, but readers can verify this on their own.
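One such verification can be done numerically; here is a minimal numpy sketch of the final identity, with arbitrary placeholder values and a random test vector $\mathbf{x}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, S_mu, S_h = 5, 2.0, 3.0  # arbitrary placeholder values
ones = np.ones(n)

P = np.hstack([ones[:, None], np.eye(n)])  # x = P h with h = (mu, h~)
Sigma_h = np.diag([S_mu] + [S_h] * n)
Sigma_x = P @ Sigma_h @ P.T

x = rng.standard_normal(n)
m = S_mu / (S_h + n * S_mu) * x.sum()  # the coefficient in the component formula

componentwise = np.concatenate([[m], x - m * ones])  # [m; x - m 1_c]
matrix_form = Sigma_h @ P.T @ np.linalg.solve(Sigma_x, x)
assert np.allclose(componentwise, matrix_form)
```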

P.S.: For me the most difficult part was finding the inverse of $\Sigma_x$: I obtained its matrix elements by grinding through the integrals above, and then verified the result by multiplying the two matrices to check that the product is the identity. The mismatch between the dimensions of $\mathbf{x}$ and $\mathbf{h}$ does impose some inconvenience, but it is menial algebra rather than an intellectually challenging problem. If you want a taste without much algebraic labor, work in a small number of dimensions first and do the calculation in Mathematica to get a feeling for it.



First of all, let me plug The Matrix Cookbook. From now on, Mom never has to worry about my matrix derivations again!

http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/3274/pdf/imm3274.pdf

Treat the vectors $h$ and $x$ stacked together as a single random vector $(h, x)$ following a multivariate Gaussian distribution. By formula (353) in the cookbook,

we get $E(h|x) = \mathrm{Cov}(h, x) \cdot \mathrm{Cov}(x, x)^{-1} x$. (1)
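For reference, the formula being cited is the conditional mean of a partitioned Gaussian; the general statement also carries the means of $h$ and $x$, which drop out here since both are zero-mean:

$$\left[ \begin{array}{c} h \\ x \end{array} \right] \sim \mathcal{N}\left( 0, \left[ \begin{array}{cc} \Sigma_{hh} & \Sigma_{hx} \\ \Sigma_{xh} & \Sigma_{xx} \end{array} \right] \right) \quad \Longrightarrow \quad E(h \mid x) = \Sigma_{hx} \Sigma_{xx}^{-1} x.$$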

Then, by formula (314) in the cookbook, $\mathrm{Cov}(Ax, By) = A\, \mathrm{Cov}(x, y)\, B^T$,

so from $x = Ph$ we obtain

$$\mathrm{Cov}(h, x) = \mathrm{Cov}(h, Ph) = \mathrm{Cov}(h, h)\, P^T = \Sigma_h P^T,$$

$$\mathrm{Cov}(x, x) = \Sigma_x = \mathrm{Cov}(Ph, Ph) = P \Sigma_h P^T.$$

Substituting into (1) immediately gives the result: $E(h|x) = \Sigma_h P^T \Sigma_x^{-1} x = \Sigma_h P^T (P \Sigma_h P^T)^{-1} x$. (2)
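Formula (2) also has a useful variational reading: given $x$, it is the point minimizing $h^T \Sigma_h^{-1} h$ subject to $Ph = x$. Here is a minimal numpy sketch checking the two against each other; the random $P$, $\Sigma_h$, and $x$ are placeholders, with $P$ full row rank almost surely:

```python
import numpy as np

rng = np.random.default_rng(1)
n_x, n_h = 4, 6  # x shorter than h, so P has full row rank
P = rng.standard_normal((n_x, n_h))

A = rng.standard_normal((n_h, n_h))
Sigma_h = A @ A.T + n_h * np.eye(n_h)  # a generic SPD covariance
x = rng.standard_normal(n_x)

# formula (2): E(h|x) = Sigma_h P^T (P Sigma_h P^T)^{-1} x
h_map = Sigma_h @ P.T @ np.linalg.solve(P @ Sigma_h @ P.T, x)

# the same point from the KKT system of:  min h^T Sigma_h^{-1} h  s.t.  P h = x
KKT = np.block([[np.linalg.inv(Sigma_h), P.T],
                [P, np.zeros((n_x, n_x))]])
sol = np.linalg.solve(KKT, np.concatenate([np.zeros(n_h), x]))
assert np.allclose(h_map, sol[:n_h])
```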

Remark: note that $P$ here is a general real matrix whose shape is determined by the dimensions of $h$ and $x$, so $P$ is not necessarily invertible and need not even be square; the parenthesized factor in (2) therefore cannot be expanded any further. But what if $P$ is invertible? Then let's keep going:

$$E(h|x) = \Sigma_h P^T \Sigma_x^{-1} x = \Sigma_h P^T (P \Sigma_h P^T)^{-1} x = \Sigma_h P^T (P^T)^{-1} \Sigma_h^{-1} P^{-1} x = P^{-1} x.$$

So after going all the way around, $x = Ph \Rightarrow h = P^{-1} x$...

Looking at it from a Bayesian angle instead:

if $P$ is not square but has full row rank, the Moore–Penrose pseudoinverse gives $h = P^T (P P^T)^{-1} x$;

if it has full column rank, $h = (P^T P)^{-1} P^T x$.

This is the minimum-norm least-squares solution of the linear system $x = Ph$; it is also the maximum-likelihood estimate under the assumption that the random vector is multivariate Gaussian. For the asker's problem, $P$ has full row rank, and the full-row-rank form looks quite similar to (2). The difference is precisely the prior on $h$: (2) incorporates the prior distribution of $h$, correcting the maximum-likelihood estimate into the corresponding maximum a posteriori (MAP) estimate.
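To see the correction concretely: with a flat prior, $\Sigma_h \propto I$, formula (2) collapses to the minimum-norm solution. A quick numpy check, with a random full-row-rank $P$ and $x$ as placeholders:

```python
import numpy as np

rng = np.random.default_rng(2)
n_x, n_h = 4, 6
P = rng.standard_normal((n_x, n_h))  # full row rank almost surely
x = rng.standard_normal(n_x)

# minimum-norm least-squares solution for full row rank: P^T (P P^T)^{-1} x
h_mn = P.T @ np.linalg.solve(P @ P.T, x)
assert np.allclose(h_mn, np.linalg.pinv(P) @ x)

# with Sigma_h = I, the MAP formula (2) reduces to the same point
Sigma_h = np.eye(n_h)
h_map = Sigma_h @ P.T @ np.linalg.solve(P @ Sigma_h @ P.T, x)
assert np.allclose(h_map, h_mn)
```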

