How would you write the Multivariate Normal Distribution from scratch in Torch?

June 19, 2024

I would like to implement the Multivariate Normal Distribution in the Torch library from scratch. My implementation is not giving me the same output as the distribution at torch.distributions.MultivariateNormal. What part do I have wrong?

I tried implementing an equation of the Multivariate Normal Distribution I found on the internet but it doesn't match the output of the Torch MultivariateNormal distribution. I don't see an equation for it in the Torch Documentation.

My code

import torch µ = torch.tensor([[-10.5, 2.0], [-0.5, 2.0]]) cov = torch.tensor([[12.0, 8.0], [12.0, 40.0]]) pos_def_cov = torch.matmul(cov, cov.T) Σ = torch.linalg.cholesky(pos_def_cov) x = torch.randn([1]) d = torch.tensor(x.shape[0]) (1 / torch.sqrt(2 * torch.pi**d) * torch.abs(Σ)) * torch.exp(-0.5 * (x - µ).T * Σ**-1 * (x - µ))

Typically the value in the upper right corner of the matrix is zero.

tensor([[ 0.2842, 0.0000], [12.4068, 8.7792]])

The Torch distribution with the same matrices.

torch.distributions.MultivariateNormal(µ, pos_def_cov).sample()

The output doesn't have a constant zero value like my output does.

tensor([[-5.4596, 7.1297], [ 0.8562, -7.6340]])

This is the equation I believe I have implemented correctly from scratch in Torch above. I think my problem may have something to do with my Cholesky Decomposition and making the covariant a positive definite matrix, if this is a fine equation to use and I implemented it correctly.

I have looked at the source code of torch.distributions.MultivariateNormal and I find it too abstract to get a foothold in.

1 Answer

Σ should be a symmetric matrix by definition. In your provided example, the following code is not correct.

Σ = torch.linalg.cholesky(pos_def_cov)

Moreover, the pdf should return a scalar but not a matrix. The following code is also wrong. You should not use torch.abs() but torch.det()

(1 / torch.sqrt(2 * torch.pi**d) * torch.abs(Σ)) * torch.exp(-0.5 * (x - µ).T * Σ**-1 * (x - µ))

The problem is you are trying to compare a probability density function with a randomly generated sample.

A correct demo is the following code:

import torch µ = torch.tensor([-10.5,2.0]) cov = torch.tensor([[12.0, 8.0], [8.0, 40.0]]) x = torch.randn(2) d = torch.tensor(x.shape[0]) # your manual implementation of pdf torch.log(1 / torch.sqrt((2 * torch.pi)**d * torch.det(cov)) * torch.exp(-0.5 * torch.sum((x - µ) * torch.mv(torch.inverse(cov), (x - µ))))) # pdf from pytorch torch.distributions.MultivariateNormal(µ, cov).log_prob(x)

ncG1vNJzZmirpJawrLvVnqmfpJ%2Bse6S7zGiorp2jqbawutJobmxrYmuFdn2OoaawZaekwq2wjLKmrmWnp7a1sYytn55lnaq5tbXVmqmimaSaeq%2B70aaYpWWUnsC1vsibrK2hn6N6p77OpmSsm6KWwaS0jKKlZqyfp7Cp