Confusing information provided by the mean and the median

Check also the Wikipedia description of this
problem.

The table below shows the calculated profit for two business concepts, each exposed to five market scenarios:

When we sort the collected data, the two concepts turn out to have the same median but different means:

Concept A has two values lying symmetrically around the median. But
the minimum lies 30 below the median and the maximum 40 above it,
which moves the "point of gravity" a little to the right. The mean
reveals this skew: it lies slightly above the median.

Concept B has a minimum lying 40 below the median, and a maximum
40 above. But this symmetry is disturbed by the two other
values. One of them is 30 below the mean, and the other 5 above.
This is a distribution that is skewed to the left, resulting in
a mean value lying below the median.

How can we use these data?

The median is resistant to extreme values. If Concept A had a calculated theoretical minimum of -150 instead of -10, the median would still be "in the middle of the road" and give a correct picture of the middle value (the 50th percentile). But does this mean that a minimum of -150 is of no interest to us? Not at all! This value has a probability of 1/5 (20%), and it would be foolish to ignore such a bad scenario!

However, the mean is now -6. This tells us that Concept A might
be a risky project when we take all the data into consideration.
Nevertheless, the sample also shows that we can expect negative
values in only 1/5 (20%) of the cases, which is good compared
to Concept B, where 2/5 (40%) of the values are negative. But
then Concept B has a positive mean...
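
The effect of that single extreme value can be checked in a few lines of Python. This is only a minimal sketch, not the YourSim calculation. The table is not reproduced here, so the two middle values of Concept A (10 and 30) are assumptions, chosen only because they lie symmetrically around the median. The minimum of -10, the median of 20, the maximum of 60 and the replacement minimum of -150 follow from the figures quoted in the text. The helper name summarize is hypothetical.

```python
from statistics import mean, median

# Concept A. The values 10 and 30 are ASSUMED for illustration: any pair
# lying symmetrically around the median of 20 gives the same mean and median.
concept_a = [-10, 10, 20, 30, 60]

# The same data with the minimum replaced by the extreme value -150.
concept_a_extreme = [-150 if x == min(concept_a) else x for x in concept_a]


def summarize(values):
    """Return (median, mean, share of negative values)."""
    negative_share = sum(1 for v in values if v < 0) / len(values)
    return median(values), mean(values), negative_share


print(summarize(concept_a))          # median 20, mean 22, 20% negative
print(summarize(concept_a_extreme))  # median 20, mean -6, 20% negative
```

The median and the share of negative scenarios do not change at all. Only the mean reacts to how bad the worst scenario is.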

The mean is affected by extremes and other irregularities. When
the tail of the distribution is long and heavy and tilts everything
towards one side, the mean moves to balance it: it is the
"point of gravity".
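
The "point of gravity" picture can be made concrete: the deviations of the values from their mean always sum to zero, so the mean is the point where the data balance. The sketch below checks this for Concept A with the extreme minimum of -150. The two middle values (10 and 30) are assumed for illustration because the table is not reproduced here, and the helper name balance is hypothetical.

```python
from statistics import mean, median


def balance(values, point):
    """Sum of signed deviations from `point`; zero means the data balance there."""
    return sum(v - point for v in values)


# Concept A with the extreme minimum of -150 (10 and 30 are ASSUMED values).
data = [-150, 10, 20, 30, 60]

print(balance(data, mean(data)))    # zero: the mean is the balance point
print(balance(data, median(data)))  # -130: the long left tail outweighs the right
```

Around the median the deviations do not cancel. The median only counts how many values lie on each side, not how far away they are.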

The mean and the median both give us useful pieces of
information, but it is necessary to understand what each of them says.
This is just an extreme example to illustrate what may cause the
mean and the median to differ. In random samples, data like those
above may occur. Usually, in simulations generated by YourSim,
the differences are smaller and less significant, but they
may still be confusing at times. If they don’t make sense at
all, the cause quite often lies in the data you put into the
program.
