Inferential Statistics · Interactive

The Sampling Machine

A coffee chain wants to know the average spend per customer across all 40,000 of its customers. Asking everyone is impractical, so they take a sample instead. But how much should they trust the average they get from a sample? Let's find out by running the machine ourselves.

Act 1

Draw one sample

Watch the average jump around

Act 2

Draw hundreds

A shape appears on its own

Act 3

Change the sample size

Watch the spread react

The customers

These are all 40,000 customers' spends. Notice that the distribution is not a tidy bell: most people spend a little, a few spend a lot.

Spend per customer (CHF)

True average (we'll pretend we don't know this): —

This sample

Each draw picks 10 random customers and computes their average spend.
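For readers who want to reproduce a draw outside the page, here is a minimal sketch using only the Python standard library. The population is a hypothetical stand-in: the page does not publish its data, so the exponential shape and the 8 CHF scale are assumptions chosen only to give a lopsided distribution. The name `draw_sample_mean` is likewise made up for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Assumption: a right-skewed stand-in for the 40,000 customers' spends (CHF).
population = [rng.expovariate(1 / 8.0) for _ in range(40_000)]


def draw_sample_mean(population, n, rng):
    """Pick n distinct customers at random and return their average spend."""
    sample = rng.sample(population, n)
    return statistics.fmean(sample)


# Five separate draws of 10 customers each: the averages differ every time.
for _ in range(5):
    print(round(draw_sample_mean(population, 10, rng), 2))
```

Running it a few times with different seeds shows the same behaviour as the machine: no two draws agree exactly.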

This sample's average: —

Every average we've collected

Each draw drops one dot here — the average from that sample. This growing pile is the star of the show.

Samples drawn: 0

Average of the averages: —

Spread of the averages: —

Notice: the average jumps around with every sample. One sample alone could mislead you — the chain shouldn't bet the business on a single draw.

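The shape: even though the customers themselves are lopsided, the averages pile up into a roughly symmetric bell centred on the true average. That's the Central Limit Theorem at work, and you just made it happen. With only 10 customers per draw a little of the lopsidedness survives; it fades as the sample size grows.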
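The same effect can be checked numerically. The sketch below again assumes a hypothetical exponential population (not the page's actual data) and uses a hand-written `skewness` helper, where 0 means perfectly symmetric. It compares the lopsidedness of the raw spends with that of 2,000 sample averages.

```python
import random
import statistics

rng = random.Random(7)

# Assumption: right-skewed stand-in population, as the page's data is not published.
population = [rng.expovariate(1 / 8.0) for _ in range(40_000)]
true_mean = statistics.fmean(population)


def skewness(values):
    """Average cubed z-score: 0 for a symmetric shape, positive for a long right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


# Draw 2,000 samples of 10 customers and keep each sample's average.
sample_means = [statistics.fmean(rng.sample(population, 10)) for _ in range(2000)]

print("skew of customers:", round(skewness(population), 2))
print("skew of averages: ", round(skewness(sample_means), 2))
print("true average:     ", round(true_mean, 2))
print("avg of averages:  ", round(statistics.fmean(sample_means), 2))
```

The averages should come out far less skewed than the customers, and their own average should land close to the true one.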

Sample size — how many customers per draw?

n = 10

🔒 Unlocks after Act 2 — draw at least 100 samples first.

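The payoff: bigger samples make the bell narrower. The averages cluster tighter around the truth, so a bigger sample is a more trustworthy estimate. That width has a name: the standard error. It shrinks in proportion to the square root of the sample size, so quadrupling the sample roughly halves it.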
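A short simulation makes the narrowing concrete. As a sketch under the same assumption of a hypothetical exponential population, it measures the spread of the averages for three sample sizes and sets it beside the textbook prediction, the population standard deviation divided by the square root of n. The function name `spread_of_means` is illustrative.

```python
import math
import random
import statistics

rng = random.Random(1)

# Assumption: right-skewed stand-in population, not the page's actual data.
population = [rng.expovariate(1 / 8.0) for _ in range(40_000)]
sigma = statistics.pstdev(population)  # spread of individual customers


def spread_of_means(n, draws=1000):
    """Standard deviation of the averages from `draws` samples of size n."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]
    return statistics.stdev(means)


# Each step quadruples n, so the spread should roughly halve each time.
for n in (10, 40, 160):
    observed = spread_of_means(n)
    predicted = sigma / math.sqrt(n)
    print(f"n={n:>3}  observed={observed:.2f}  predicted={predicted:.2f}")
```

The observed and predicted columns will not match exactly, since the observed spread is itself estimated from a finite number of draws.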

✦ One question

To get a more reliable estimate of average spend, the coffee chain should:

A. Draw one really careful, slow sample

B. Draw a bigger sample

C. It doesn't matter: averages are averages

The Sampling Machine · built for teaching · Jan Erik Meidell