Inferential Statistics · Interactive

The Sampling Machine

A coffee chain wants to know the average spend per customer across all 40,000 of its customers. Asking everyone is impossible — so they take a sample. But how much should they trust the average they get from a sample? Let's find out by running the machine ourselves.

Act 1: Draw one sample. Watch the average jump around.
Act 2: Draw hundreds. A shape appears on its own.
Act 3: Change the sample size. Watch the spread react.

The customers

These are all 40,000 customers' spends. Notice it's not a tidy bell — most people spend a little, a few spend a lot.
Chart: spend per customer (CHF), with the true average marked (we'll pretend we don't know this).

This sample

Each draw picks 10 random customers and computes their average spend.
This sample's average:
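If you'd like to try a single draw away from the machine, here is a minimal sketch. The machine's real customer data isn't published here, so the lognormal population and its parameters are stand-in assumptions, and `draw_sample_mean` is a hypothetical helper name.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Assumption: a right-skewed lognormal stands in for the 40,000 customers.
# The parameters (2.0, 0.6) are made up for illustration.
population = [rng.lognormvariate(2.0, 0.6) for _ in range(40_000)]
true_mean = statistics.fmean(population)


def draw_sample_mean(customers, n, rng):
    """Pick n distinct customers at random and return their average spend."""
    return statistics.fmean(rng.sample(customers, n))


# Five separate draws of 10 customers each: the average moves every time.
draws = [draw_sample_mean(population, 10, rng) for _ in range(5)]
for i, avg in enumerate(draws, start=1):
    print(f"draw {i}: sample average = {avg:.2f} CHF")
print(f"true average (normally unknown) = {true_mean:.2f} CHF")
```

Each printed average comes from a different set of 10 customers, which is why no two of them agree.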

Every average we've collected

Each draw drops one dot here — the average from that sample. This growing pile is the star of the show.
Samples drawn: 0
Average of the averages
Spread of the averages
Notice: the average jumps around with every sample. One sample alone could mislead you — the chain shouldn't bet the business on a single draw.
The shape: even though the customers themselves are lopsided, the averages pile up into a roughly symmetric bell, centred right on the true average. That's the Central Limit Theorem at work, and you just made it happen.
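The pile of averages can be rebuilt in a few lines. This is a sketch under assumptions, not the machine's own code: the lognormal population is a made-up stand-in, and `skewness` is a hypothetical helper that scores lopsidedness (near 0 for a symmetric pile, positive for a long right tail).

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Assumption: a right-skewed lognormal stands in for the 40,000 customers.
population = [rng.lognormvariate(2.0, 0.6) for _ in range(40_000)]


def skewness(values):
    """Average cubed z-score: about 0 when symmetric, positive with a right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


# 1,000 draws of 10 customers each; keep only each draw's average.
sample_means = [statistics.fmean(rng.sample(population, 10)) for _ in range(1_000)]

true_mean = statistics.fmean(population)
mean_of_means = statistics.fmean(sample_means)
spread_of_means = statistics.stdev(sample_means)

print(f"true average:            {true_mean:.2f} CHF")
print(f"average of the averages: {mean_of_means:.2f} CHF")
print(f"spread of the averages:  {spread_of_means:.2f} CHF")
print(f"lopsidedness, customers: {skewness(population):.2f}")
print(f"lopsidedness, averages:  {skewness(sample_means):.2f}")
```

With only 10 customers per draw, some lopsidedness usually remains in the averages under this assumed population. It should be well below the customers' own, and it keeps fading as the sample size grows.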

Sample size — how many customers per draw?

🔒 Unlocks after Act 2 — draw at least 100 samples first.
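The payoff: bigger samples make the bell narrower. The averages cluster tighter around the truth, so a bigger sample is a more trustworthy estimate. The width of that bell has a name: the standard error. It shrinks with the square root of the sample size, so four times as many customers per draw makes the bell about half as wide.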
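To see the bell narrow in numbers, the sketch below compares the observed spread of the averages with the textbook standard error, sigma divided by the square root of n, for three sample sizes. As before, the lognormal population is an assumed stand-in and `spread_of_means` is a hypothetical helper name.

```python
import math
import random
import statistics

rng = random.Random(2024)  # fixed seed so the sketch is repeatable

# Assumption: a right-skewed lognormal stands in for the 40,000 customers.
population = [rng.lognormvariate(2.0, 0.6) for _ in range(40_000)]
sigma = statistics.pstdev(population)  # spread of individual spends


def spread_of_means(customers, n, draws, rng):
    """Standard deviation of the averages from `draws` samples of size n."""
    means = [statistics.fmean(rng.sample(customers, n)) for _ in range(draws)]
    return statistics.stdev(means)


results = {}
for n in (10, 40, 160):
    observed = spread_of_means(population, n, 2_000, rng)
    predicted = sigma / math.sqrt(n)  # textbook standard error of the mean
    results[n] = (observed, predicted)
    print(f"n = {n:3d}: observed spread = {observed:.2f}, sigma/sqrt(n) = {predicted:.2f}")
```

Each step multiplies the sample size by four, so the spread should roughly halve each time. In practice the chain would not know sigma and would estimate it from the sample's own standard deviation.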

✦ One question

To get a more reliable estimate of average spend, the coffee chain should:
A. Draw one really careful, slow sample
B. Draw a bigger sample
C. It doesn't matter, averages are averages
The Sampling Machine · built for teaching · Jan Erik Meidell