KL divergence visualiser and VAE reparameterization trick
KL divergence
Reparameterization trick
[Interactive controls: sliders for the mean μ (default 0.0) and std σ (default 1.0); readouts show KL(q ‖ p) and its two terms, μ²/2 and σ²/2 − log σ − ½.]
KL = ½ (μ² + σ² − log σ² − 1) = ½ (0 + 1 − 0 − 1) = 0
When q = p = N(0,1), KL = 0. Drag μ away from 0 or make σ very small or large to see the penalty grow.
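The readout above can be reproduced numerically. Below is a minimal sketch of the closed-form KL between q = N(μ, σ²) and the standard-normal prior p = N(0, 1); the function name is illustrative, not part of the visualiser.

```python
import numpy as np

def kl_gauss_vs_std_normal(mu, sigma):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)) = 1/2 (mu^2 + sigma^2 - log sigma^2 - 1)."""
    return 0.5 * (mu**2 + sigma**2 - np.log(sigma**2) - 1.0)

print(kl_gauss_vs_std_normal(0.0, 1.0))  # 0.0: q matches the prior exactly
print(kl_gauss_vs_std_normal(1.0, 1.0))  # 0.5: the mu^2 / 2 term alone
```

Note that the penalty is asymmetric in σ: both σ → 0 and σ → ∞ send the σ²/2 − log σ − ½ term to infinity, which is exactly what dragging the σ slider shows.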
[Interactive controls: sliders for the encoder outputs μ(x) (default 0.8) and σ(x) (default 0.8).]
z = μ(x) + σ(x) · ε, with ε ~ N(0,1). The sampling is moved outside the computation graph: ε carries all the randomness, while μ(x) and σ(x) are deterministic encoder outputs. This makes z differentiable with respect to the encoder parameters, so gradients can flow through the sampling step.
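The shift-and-scale step can be sketched in a few lines. This is a generic NumPy illustration of the trick, not the visualiser's code; the function name and the 0.8 values are taken from the slider defaults above.

```python
import numpy as np

def reparameterize(mu, sigma, rng):
    # Draw eps ~ N(0,1), then shift and scale: z = mu + sigma * eps.
    # z is a deterministic function of mu and sigma given eps,
    # so in an autodiff framework d z / d mu = 1 and d z / d sigma = eps.
    eps = rng.standard_normal(np.shape(mu))
    return mu + sigma * eps

rng = np.random.default_rng(0)
z = reparameterize(np.full(100_000, 0.8), 0.8, rng)
print(z.mean(), z.std())  # close to mu = 0.8 and sigma = 0.8
```

Sampling z directly from N(μ, σ²) would work for the forward pass but gives no gradient path back to μ and σ; rewriting the sample as μ + σ·ε is what makes the encoder trainable by backpropagation.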