Confusing information provided by the mean and the median

Check also the Wikipedia description of this problem.



The table below shows the calculated profit for two different business concepts, each exposed to five different market scenarios:

 

When we arrange the collected data, the two concepts reveal the same median value but different means:

 

Concept A has two values lying symmetrically around the median. But a minimum 30 below the median and a maximum 40 above it move the "point of gravity" a little to the right. The mean reveals this bias: it lies slightly above the median.
Concept B has a minimum lying 40 below the median and a maximum 40 above it. But this symmetry is disturbed by the two other values: one of them is 30 below the median, and the other only 5 above. This distribution is skewed to the left, which results in a mean lying below the median.
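The description of Concept A can be checked with a few lines of Python. This is only a sketch: the original table is not reproduced here, so the median of 20 is inferred from the stated minimum of -10 lying 30 below it, and the distance of the symmetric pair from the median (`offset`) is a hypothetical parameter. The function name `concept_a` is likewise made up for illustration.

```python
from statistics import mean, median

MEDIAN = 20  # inferred: the text gives a minimum of -10 lying 30 below the median

def concept_a(offset):
    """Five profits: minimum, lower pair value, median, upper pair value, maximum."""
    return [MEDIAN - 30, MEDIAN - offset, MEDIAN, MEDIAN + offset, MEDIAN + 40]

# The offset of the symmetric pair is an assumption, so try several values.
results = {d: (median(concept_a(d)), mean(concept_a(d))) for d in (5, 10, 20)}

for d, (med, avg) in results.items():
    print(f"offset {d:>2}: median={med}, mean={avg}")
```

Whatever offset is chosen, the symmetric pair cancels out. Only the unequal distances of the minimum and the maximum pull the mean away from the median.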
How can we use these data?

The median is resistant to extreme values. If Concept A had a calculated theoretical minimum of -150 instead of -10, the median would still be "in the middle of the road" and give a correct picture of the middle value (the 50th percentile). But does this mean that a minimum of -150 is of no significant interest to us? Not at all! This value has a probability of 1/5 (20%), and it would be foolish to ignore such a bad scenario!

 

However, the mean is now -6. This shows that Concept A might be a risky project when all the data are taken into consideration. Nevertheless, the sample also shows that negative values can be expected in only 1/5 (20%) of the cases, which compares well with Concept B, where 2/5 (40%) of the values are negative. But then Concept B has a positive mean of 22...
The mean is affected by extremes and other irregularities. When the tail of the distribution is long and heavy and tilts everything towards one side, the mean moves to balance it all; it is the "point of gravity".
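The effect of the extreme minimum on Concept A can be reproduced with a short Python sketch. The sample values are assumptions: only the minimum of -10, the replacement value of -150 and the resulting mean of -6 come from the text, while 10 and 30 are hypothetical placeholders for the symmetric pair around the median. The helper `summarize` is introduced here purely for illustration.

```python
from statistics import mean, median

# Hypothetical Concept A sample: 10 and 30 stand in for the symmetric pair,
# whose actual values are not given in the text.
concept_a = [-10, 10, 20, 30, 60]

# Swap the minimum for the extreme scenario discussed above.
concept_a_extreme = [-150 if x == min(concept_a) else x for x in concept_a]

def summarize(values):
    """Return (median, mean, share of negative values) for a sample."""
    negative_share = sum(v < 0 for v in values) / len(values)
    return median(values), mean(values), negative_share

before = summarize(concept_a)
after = summarize(concept_a_extreme)

print("before:", before)
print("after: ", after)
```

The median and the share of negative scenarios are the same in both runs. Only the mean reacts to the size of the worst case, which is why both figures are worth reading together.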

The mean and the median both give us useful pieces of information, but it is necessary to understand their messages. This is just an extreme example to illustrate what may cause the mean and the median to differ. In random samples, data like those above may occur. Usually, in simulations generated by YourSim, the differences are smaller and less significant, but they may still be confusing at times. If they don't make sense at all, the cause quite often lies in the data you put into the program.

