Gaussian Mixture Models

Gaussian mixture models are popular for clustering data. Generally speaking, they are continuous random variables with a special probability density, namely

\[\rho(x) = \sum_{i = 1}^{n} \frac{w_i}{\sqrt{2 \pi \sigma_i^2}} \exp \left( \frac{(x - \mu_i)^2}{2 \sigma_i^2} \right) \quad \text{with} \quad \sum_{i = 1}^n w_i = 1,\]

where the pairs of means and standard deviations $(\mu_i, \sigma_i)$, and the weights $w_i$ for all $i \in \{ 1, \dots, n \}$ are given. Let's consider a simple example.

using Plots
f(x,μ,σ) = 1 / sqrt(2*π*σ^2) * exp(-(x - μ)^2 / (2σ^2))
μs, σs, ws = [1., 1.7], [0.2, 0.3], [0.5, 0.5]
ρ(x) = sum(w*f(x, μ, σ) for (μ, σ, w) in zip(μs, σs, ws))
x = 0:0.01:3;
plot(x, ρ.(x))
xlabel!("x"); ylabel!("rho(x)");

This looks nice!

What are now the polynomials that are orthogonal relative to this specific density?

using PolyChaos
deg = 4
meas = Measure("my_GaussMixture", ρ, (-Inf,Inf), false, Dict(:μ=>μ, :σ=>σ)) # build measure
op = OrthoPoly("my_op", deg, meas; Nquad = 100, Nrec = 2*deg) # construct orthogonal polynomial
showbasis(op, digits=2) # in case you wondered
1
x - 1.35
x^2 - 2.84x + 1.82
x^3 - 4.36x^2 + 5.94x - 2.5
x^4 - 5.91x^3 + 12.35x^2 - 10.74x + 3.26

Let's add the quadrature rule and compute the square norms of the basis polynomials.

T2 = Tensor(2,op) # compute scalar products
[ T2.get([i,j]) for i in 0:deg, j in 0:deg ]
5×5 Matrix{Float64}:
 1.0  0.0     0.0     0.0        0.0
 0.0  0.1875  0.0     0.0        0.0
 0.0  0.0     0.0385  0.0        0.0
 0.0  0.0     0.0     0.0128086  0.0
 0.0  0.0     0.0     0.0        0.00485189

Great!