An Introduction to Data Analysis & Presentation

Prof. Timothy Shortell, Sociology, Brooklyn College

Measures of Central Tendency

Why calculate a measure of central tendency? Measures of central tendency are summary statistics. They express in a single number the most typical case in a distribution. There are three common measures: mode, median and mean.

The mode is the most frequently occurring score or category in a distribution. It is the only measure of central tendency appropriate with nominal data. You can identify a mode in an ungrouped frequency table by finding the attribute with the largest frequency (among the valid values). The mode in a grouped frequency table is the category with the largest frequency.
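As a minimal sketch of reading the mode off an ungrouped frequency table, the Python snippet below tallies a small set of nominal responses and picks the attribute with the largest frequency. The data and variable names are hypothetical, invented for illustration; they are not taken from the course data sets.

```python
from collections import Counter

# Hypothetical nominal data: favorite music genre reported by ten respondents.
genres = ["rock", "jazz", "rock", "hip hop", "rock",
          "jazz", "country", "hip hop", "rock", "jazz"]

# Build the ungrouped frequency table: attribute -> frequency.
frequency_table = Counter(genres)

# The mode is the attribute with the largest frequency.
mode_value, mode_count = frequency_table.most_common(1)[0]

print(frequency_table)
print("Mode:", mode_value, "with frequency", mode_count)
```

Here the mode is a category name, not a number, which is why it is usable with nominal data.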

The median is the middlemost score in a distribution. Exactly half of the scores fall above and half below the median. It is equivalent to the 50th percentile.

The median is the appropriate measure of central tendency for ordinal data. It can also be used with interval data, and it is the better choice for a skewed distribution because it is not affected by extreme scores.

The median can be identified in a frequency distribution by locating the attribute containing the 50th percentile. Begin at the top (0%) of the cumulative percent column, and go down until you reach the first value of at least 50. The same can be done with a grouped frequency distribution. (Using a mathematical procedure known as interpolation, you can locate the median within the category. For our purposes, however, it will be enough to locate the median category without further elaboration.)
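The procedure just described can be sketched in Python as follows. The frequency table is hypothetical (a five-point agreement scale with made-up counts), and the function name median_category is an illustrative choice. The sketch stops at the median category and does not attempt interpolation.

```python
# Hypothetical ordinal data: (attribute, frequency), listed in rank order.
frequency_table = [
    ("strongly disagree", 12),
    ("disagree", 18),
    ("neutral", 25),
    ("agree", 30),
    ("strongly agree", 15),
]

def median_category(table):
    """Return the first attribute whose cumulative percent reaches 50."""
    total = sum(freq for _, freq in table)
    cumulative = 0
    for attribute, freq in table:
        cumulative += freq
        cumulative_percent = 100 * cumulative / total
        if cumulative_percent >= 50:
            return attribute, cumulative_percent
    raise ValueError("frequency table is empty")

category, percent = median_category(frequency_table)
print(category, percent)
```

With these counts the cumulative percents run 12, 30, 55, 85, 100, so the median category is "neutral", the first row at or above 50.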

The mean is the arithmetic average of a distribution. This is what we mean when we refer colloquially to "the average." It is the sum of all scores divided by the number of scores in the distribution.
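In symbols, the verbal definition above can be written as follows. The notation is an assumption made here for illustration: X-bar for the mean, X_i for the i-th score, and N for the number of scores.

```latex
\[
\bar{X} = \frac{\sum_{i=1}^{N} X_i}{N}
\]
```

For example, the scores 2, 4 and 9 sum to 15, so their mean is 15 / 3 = 5.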

The mean is the most appropriate measure of central tendency for normally distributed interval data. Because calculating it requires adding and dividing scores, it cannot be used with categorical data. The mean is often a decimal number; although this results in an artificial figure when the data are integers, by convention, it is reported to one or two decimal places.

The mean is the most informative, in general, of the three measures, but can be used only with interval data. Because every score enters its calculation, the mean is the measure most sensitive to individual cases, and it is distorted by extreme scores. The mean is "pulled" in the direction of the skew, toward the tail of the distribution.
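A small Python sketch can make the effect of an extreme score concrete. The scores below are hypothetical, chosen so the arithmetic is easy to follow, and are not drawn from the course data sets. The functions mean and median come from Python's standard statistics module.

```python
from statistics import mean, median

# Hypothetical interval data: nine scores with no extreme values.
scores = [2, 3, 3, 4, 5, 5, 6, 7, 10]

# The same data with one extreme score added.
scores_with_outlier = scores + [95]

print("Without outlier:", mean(scores), median(scores))
print("With outlier:   ", mean(scores_with_outlier), median(scores_with_outlier))
```

Adding the single score of 95 moves the mean from 5 to 14, while the median stays at 5. The mean has been pulled toward the long tail; the median has not.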

We can get SPSS to crunch the numbers and give us the measures of central tendency. It is more important that we be able to interpret the results.

We are looking for the best description of what is most typical in a distribution. We must decide what measure is appropriate, given the level of measurement of our data and the shape of the distribution.
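The decision rules stated in this section can be summarized in a short Python sketch. The function name appropriate_measure and its parameters are hypothetical, and the function is only a restatement of the guidelines above, not a substitute for examining the distribution itself.

```python
def appropriate_measure(level, skewed=False):
    """Suggest a measure of central tendency following the rules stated above."""
    level = level.lower()
    if level == "nominal":
        # The mode is the only measure appropriate with nominal data.
        return "mode"
    if level == "ordinal":
        return "median"
    if level == "interval":
        # The mean is distorted by extreme scores, so prefer the median when skewed.
        return "median" if skewed else "mean"
    raise ValueError(f"unknown level of measurement: {level!r}")

print(appropriate_measure("nominal"))
print(appropriate_measure("interval"))
print(appropriate_measure("interval", skewed=True))
```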

You should be able to express in a single, well-constructed sentence what the measure of central tendency means.

Let's look at some examples.

First, the CD purchase data:

Another example, this one from the Correlates of War data, 1816-1992:

One more example, from the same data:

All materials on this site are copyright © 2001, by Professor Timothy Shortell, except those retained by their original owner. No infringement is intended or implied. All rights reserved. Please let me know if you link to this site or use these materials.