I would like to implement the Multivariate Normal Distribution in the Torch library from scratch. My implementation is not giving me the same output as the distribution at torch.distributions.MultivariateNormal
. What part do I have wrong?
I tried implementing an equation of the Multivariate Normal Distribution I found on the internet but it doesn't match the output of the Torch MultivariateNormal distribution. I don't see an equation for it in the Torch Documentation.
My code
import torch µ = torch.tensor([[-10.5, 2.0], [-0.5, 2.0]]) cov = torch.tensor([[12.0, 8.0], [12.0, 40.0]]) pos_def_cov = torch.matmul(cov, cov.T) Σ = torch.linalg.cholesky(pos_def_cov) x = torch.randn([1]) d = torch.tensor(x.shape[0]) (1 / torch.sqrt(2 * torch.pi**d) * torch.abs(Σ)) * torch.exp(-0.5 * (x - µ).T * Σ**-1 * (x - µ))
Typically the value in the upper right corner of the matrix is zero.
tensor([[ 0.2842, 0.0000], [12.4068, 8.7792]])
The Torch distribution with the same matrices.
torch.distributions.MultivariateNormal(µ, pos_def_cov).sample()
The output doesn't have a constant zero value like my output does.
tensor([[-5.4596, 7.1297], [ 0.8562, -7.6340]])
This is the equation I believe I have implemented correctly from scratch in Torch above. I think my problem may have something to do with my Cholesky Decomposition and making the covariant a positive definite matrix, if this is a fine equation to use and I implemented it correctly.
I have looked at the source code of torch.distributions.MultivariateNormal
and I find it too abstract to get a foothold in.
Σ should be a symmetric matrix by definition. In your provided example, the following code is not correct.
Σ = torch.linalg.cholesky(pos_def_cov)
Moreover, the pdf should return a scalar but not a matrix. The following code is also wrong. You should not use torch.abs()
but torch.det()
(1 / torch.sqrt(2 * torch.pi**d) * torch.abs(Σ)) * torch.exp(-0.5 * (x - µ).T * Σ**-1 * (x - µ))
The problem is you are trying to compare a probability density function with a randomly generated sample.
A correct demo is the following code:
import torch µ = torch.tensor([-10.5,2.0]) cov = torch.tensor([[12.0, 8.0], [8.0, 40.0]]) x = torch.randn(2) d = torch.tensor(x.shape[0]) # your manual implementation of pdf torch.log(1 / torch.sqrt((2 * torch.pi)**d * torch.det(cov)) * torch.exp(-0.5 * torch.sum((x - µ) * torch.mv(torch.inverse(cov), (x - µ))))) # pdf from pytorch torch.distributions.MultivariateNormal(µ, cov).log_prob(x)
9ncG1vNJzZmirpJawrLvVnqmfpJ%2Bse6S7zGiorp2jqbawutJobmxrYmuFdn2OoaawZaekwq2wjLKmrmWnp7a1sYytn55lnaq5tbXVmqmimaSaeq%2B70aaYpWWUnsC1vsibrK2hn6N6p77OpmSsm6KWwaS0jKKlZqyfp7Cp