Gaussian Function vs Gaussian PDF, and Understanding Normalization Constants

Gaussians show up everywhere: statistics, physics, machine learning, and even in population receptive field (pRF) modeling used in fMRI research.

But if you look closely at the formulas, the normalization factor seems to keep changing.

Sometimes it is

\[ \sqrt{\pi} \]

Sometimes it becomes

\[ \sigma\sqrt{2\pi} \]

And in two dimensions it turns into

\[ 2\pi\sigma^2 \]

Let’s unpack where these numbers come from, step by step.

1. Gaussian Function vs Gaussian PDF

Before diving into normalization constants, it helps to separate two related ideas.

Gaussian function (shape only):

\[ g(x)=\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \]

This gives the bell shape, but by itself it does not guarantee total area 1.

Gaussian PDF (probability density):

\[ f(x)=\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \]

The extra prefactor is the normalization term that makes \(\int_{-\infty}^{\infty} f(x)\,dx=1\).
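The distinction can be checked numerically. The snippet below is a minimal sketch using only the Python standard library; the helper names `gaussian`, `gaussian_pdf`, and `integrate`, and the values of `mu` and `sigma`, are illustrative choices rather than anything fixed by the formulas above.

```python
import math

def gaussian(x, mu=0.0, sigma=1.0):
    """Unnormalized bell shape g(x); its peak value is 1 at x = mu."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Normalized density f(x) = g(x) / (sigma * sqrt(2*pi))."""
    return gaussian(x, mu, sigma) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(func, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

mu, sigma = 1.5, 2.0                              # arbitrary example values
lo, hi = mu - 10.0 * sigma, mu + 10.0 * sigma     # tails beyond 10 sigma are negligible

area_g = integrate(lambda x: gaussian(x, mu, sigma), lo, hi)
area_f = integrate(lambda x: gaussian_pdf(x, mu, sigma), lo, hi)

print("area under g(x):", area_g, "vs sigma*sqrt(2*pi):", sigma * math.sqrt(2.0 * math.pi))
print("area under f(x):", area_f)
```

The area under the shape-only function should match \(\sigma\sqrt{2\pi}\), while the area under the PDF should come out as 1 up to numerical error.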

2. The Starting Point: The Pure Gaussian Integral

Everything begins with the basic Gaussian function

\[ f(x)=e^{-x^2} \]

The key integral is

\[ I=\int_{-\infty}^{\infty} e^{-x^2}\,dx \]

The integrand \(e^{-x^2}\) has no elementary antiderivative, so the integral cannot be evaluated by the usual route. The classical trick is to square it, which turns the problem of computing the area under a 1D curve into computing the volume under a 2D surface.

\[ I^2=\left(\int_{-\infty}^{\infty} e^{-x^2}\,dx\right)\left(\int_{-\infty}^{\infty} e^{-y^2}\,dy\right) \]

\[ I^2=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-(x^2+y^2)}\,dx\,dy \]

Now switch to polar coordinates: \(x^2+y^2=r^2\) and \(dx\,dy=r\,dr\,d\theta\).

\[ I^2=\int_{0}^{2\pi}\int_{0}^{\infty}e^{-r^2}r\,dr\,d\theta \]

\[ \int_0^{\infty} e^{-r^2}r\,dr=\frac{1}{2},\quad \int_0^{2\pi} d\theta=2\pi \]

\[ I^2=\pi\quad\Rightarrow\quad I=\sqrt{\pi} \]

This is the fundamental source of the \(\sqrt{\pi}\) constant in Gaussian normalization.
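As a sanity check on the squaring trick, the following sketch approximates both the 1D area and the 2D volume on a plain Cartesian grid. The grid limits (\(\pm 6\)) and the number of sample points are assumptions chosen so that the neglected tails are tiny.

```python
import math

# Midpoint sample points on [-6, 6]; outside this range exp(-x^2) is negligible.
n = 600
lo, hi = -6.0, 6.0
h = (hi - lo) / n
xs = [lo + (i + 0.5) * h for i in range(n)]

# 1D area under exp(-x^2).
I = h * sum(math.exp(-x * x) for x in xs)

# 2D volume under exp(-(x^2 + y^2)), summed cell by cell over the grid.
volume = h * h * sum(math.exp(-(x * x + y * y)) for x in xs for y in xs)

print("I      =", I, " sqrt(pi) =", math.sqrt(math.pi))
print("volume =", volume, " pi =", math.pi)
```

The 1D sum should approach \(\sqrt{\pi}\) and the 2D sum should approach \(\pi\), matching \(I^2=\pi\).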

Alternative approach: Geometric intuition

Independently, we can reason using thin circular rings.

Take a ring at radius \(r\): circumference is \(2\pi r\).
Give it thickness \(dr\): ring area is approximately \(2\pi r\,dr\).
At radius \(r\), Gaussian height is \(e^{-r^2}\).

So each ring contributes the slice

\[ e^{-r^2}(2\pi r\,dr) \]

Summing all rings from \(r=0\) to \(\infty\):

\[ I^2=\int_0^{\infty}2\pi r\,e^{-r^2}\,dr \]

This gives the same result \(I^2=\pi\), and the \(r\) term appears naturally from ring geometry.
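The ring picture translates almost directly into code. This is a rough sketch: the ring thickness `dr` and the cutoff radius `r_max` are arbitrary assumptions, and `ring_volume` is a hypothetical helper name.

```python
import math

def ring_volume(r, dr):
    """Volume of one thin ring: base area 2*pi*r*dr times height exp(-r^2)."""
    return (2.0 * math.pi * r * dr) * math.exp(-r * r)

dr = 0.001       # ring thickness
r_max = 8.0      # beyond this radius the height exp(-r^2) is negligible
n_rings = int(round(r_max / dr))

# Evaluate each ring at its mid-radius and add up the volumes.
total = sum(ring_volume((i + 0.5) * dr, dr) for i in range(n_rings))

print("sum of ring volumes:", total, " pi:", math.pi)
```

Shrinking `dr` brings the sum closer to \(\pi\), which is the discrete version of the integral above.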

The accompanying diagram shows a single ring element in orange at radius \(r\) with thickness \(dr\). The ring sits at the base (darker outline) and is shown lifted to the Gaussian height \(h = e^{-r^2}\) (orange outline). The volume of this thin ring is the product of its base area \(2\pi r \cdot dr\) and its height \(e^{-r^2}\). Integrating over all such rings from \(r = 0\) to \(\infty\) gives the total volume under the Gaussian surface.

3. From Pure Gaussian to Probability Distribution

For a probability density function (PDF), the total area under the curve must be 1.

The pure Gaussian \(e^{-x^2}\) has total area \(\sqrt{\pi}\), so it is not normalized as-is.

The standard 1D Gaussian with mean \(\mu\) and standard deviation \(\sigma\) is

\[ f(x)=\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \]

Why does \(\sigma\sqrt{2\pi}\) appear?

Scaling \(x\) by \(\sigma\) stretches the integral by a factor of \(\sigma\).
The \(2\sigma^2\) term in the exponent contributes an additional \(\sqrt{2}\) factor.
Together with \(\sqrt{\pi}\), this gives \(\sigma\sqrt{2\pi}\).
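The three steps above can be written as a single change of variables. Taking \(u=(x-\mu)/(\sigma\sqrt{2})\), so that \(dx=\sigma\sqrt{2}\,du\) (the symbol \(u\) is introduced here only for this derivation):

\[ \int_{-\infty}^{\infty}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)dx=\sigma\sqrt{2}\int_{-\infty}^{\infty}e^{-u^2}\,du=\sigma\sqrt{2}\cdot\sqrt{\pi}=\sigma\sqrt{2\pi} \]

The shift by \(\mu\) does not change the area; only the rescaling by \(\sigma\sqrt{2}\) does.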

So the prefactor \(1/(\sigma\sqrt{2\pi})\) is exactly what enforces

\[ \int_{-\infty}^{\infty} f(x)\,dx=1 \]

Interactive Visualization: How Normalization and Sigma Affect the Curve

Below, you can see how the normalization factor changes the height of the Gaussian. The blue curve is the unnormalized Gaussian function, and the red curve is the normalized PDF. Adjust the slider to change \(\sigma\) and observe how both curves scale.

Note: The unnormalized version uses \(\exp\left(-\frac{x^2}{2\sigma^2}\right)\), that is, the Gaussian function with \(\mu=0\). At \(x=0\) the exponent is zero, so the peak value is \(\exp(0)=1\) regardless of \(\sigma\). In the general form with \((x-\mu)^2\), the peak sits at \(x=\mu\) with the same height of 1.


Observation: As \(\sigma\) increases, the unnormalized curve spreads out, but its peak height does not change. The normalized PDF, however, remains a valid probability distribution (it always integrates to 1) and becomes wider and lower as \(\sigma\) increases. The normalization factor \(1/(\sigma\sqrt{2\pi})\) scales the height so that the total area stays at 1.
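The same observation can be reproduced without a plot by comparing peak heights for a few widths. This is a small sketch; the function names and the list of \(\sigma\) values are illustrative assumptions.

```python
import math

def unnormalized(x, sigma):
    """Gaussian function with mu = 0: exp(-x^2 / (2 sigma^2))."""
    return math.exp(-(x * x) / (2.0 * sigma ** 2))

def normalized(x, sigma):
    """Gaussian PDF with mu = 0."""
    return unnormalized(x, sigma) / (sigma * math.sqrt(2.0 * math.pi))

# Peak heights (values at x = 0) for several example widths.
peaks = {s: (unnormalized(0.0, s), normalized(0.0, s)) for s in (0.5, 1.0, 2.0, 4.0)}

for s, (peak_u, peak_n) in peaks.items():
    print(f"sigma={s}: unnormalized peak={peak_u:.4f}, PDF peak={peak_n:.4f}")
```

The unnormalized peak stays at 1 for every \(\sigma\), while the PDF peak equals \(1/(\sigma\sqrt{2\pi})\) and therefore halves each time \(\sigma\) doubles.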

4. Moving to Two Dimensions

In many applications (including pRF models), we use a 2D Gaussian:

\[ G(x,y)=\exp\left(-\frac{x^2+y^2}{2\sigma^2}\right) \]

To normalize it as a probability density over area, the total volume under the surface must be 1. So we first compute the volume under the unnormalized surface:

\[ \iint G(x,y)\,dx\,dy \]

Again in polar form:

\[ \int_0^{2\pi}\int_0^{\infty} e^{-r^2/(2\sigma^2)}r\,dr\,d\theta \]

This evaluates to \(2\pi\sigma^2\), so the normalized 2D Gaussian is

\[ G(x,y)=\frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2+y^2}{2\sigma^2}\right) \]
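For completeness, here is one way to evaluate the polar integral. Substituting \(u=r^2/(2\sigma^2)\), so that \(r\,dr=\sigma^2\,du\) (again, \(u\) is only a helper variable for this step):

\[ \int_0^{\infty}e^{-r^2/(2\sigma^2)}\,r\,dr=\sigma^2\int_0^{\infty}e^{-u}\,du=\sigma^2 \]

\[ \int_0^{2\pi}d\theta\cdot\sigma^2=2\pi\sigma^2=\left(\sigma\sqrt{2\pi}\right)^2 \]

The last equality shows that the 2D constant is simply the 1D constant applied once along \(x\) and once along \(y\).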

5. Difference of Gaussians (DoG)

In vision science and pRF modeling, center-surround receptive fields are often modeled with a Difference of Gaussians:

\[ DoG(x,y)=G_c(x,y)-\beta G_s(x,y) \]

with

\[ G_c=\exp\left(-\frac{x^2+y^2}{2\sigma_c^2}\right),\qquad G_s=\exp\left(-\frac{x^2+y^2}{2\sigma_s^2}\right) \]

\[ \sigma_s>\sigma_c \]

Here, \(\beta\) controls how strong the surround suppression is. In this form, the two Gaussians are used without explicit normalization factors, so the DoG profile is governed by relative widths and amplitude scaling.
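A minimal sketch of this unnormalized DoG in code might look as follows. The function names and the parameter values (`sigma_c`, `sigma_s`, `beta`) are made up for illustration and are not taken from any particular pRF package.

```python
import math

def gaussian_2d(x, y, sigma):
    """Unnormalized isotropic 2D Gaussian centered at the origin (peak value 1)."""
    return math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))

def dog(x, y, sigma_c, sigma_s, beta):
    """Difference of Gaussians: center minus beta times the wider surround."""
    if sigma_s <= sigma_c:
        raise ValueError("surround width sigma_s must exceed center width sigma_c")
    return gaussian_2d(x, y, sigma_c) - beta * gaussian_2d(x, y, sigma_s)

sigma_c, sigma_s, beta = 1.0, 3.0, 0.5   # illustrative values only

at_center = dog(0.0, 0.0, sigma_c, sigma_s, beta)    # equals 1 - beta
in_surround = dog(3.0, 0.0, sigma_c, sigma_s, beta)  # surround term dominates here

print("response at center:     ", at_center)
print("response in the surround:", in_surround)
```

Because both Gaussians peak at 1, the value at the center is \(1-\beta\), and farther out, where the narrow center Gaussian has decayed, the profile dips below zero. That negative lobe is the surround suppression.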

Takeaway

The normalization constants are not arbitrary; they reflect geometry and scaling.

\(\sqrt{\pi}\) comes from the pure Gaussian integral.
\(\sigma\sqrt{2\pi}\) appears in normalized 1D Gaussian PDFs.
\(2\pi\sigma^2\) appears in normalized 2D Gaussians.

Once you track variable scaling and dimensionality, the pattern becomes clear.