Descriptive Statistics and Measures of Central Tendency and Dispersion

© 1998 by Dr. Thomas W. MacFarland -- All Rights Reserved

Background:  Quite often when examining data and relationships
             between data, it is useful to offer a general
             view of the data.  Imagine an array of data,
             representing final examination test scores in a
             computer science education class:

             -- How many students sat for the test?

             -- What is the average test score and are
                there multiple definitions of the term
                "average"?

             -- Did most test scores come close to the
                average, or was there a wide degree of
                variance in test scores?

             -- What was the range of test scores, from
                the lowest test score to the highest test
                score?

             The following listing identifies a series of
             statistical measures of central tendency (closeness
             to the "average" score) and dispersion (spread or
             variance in the range of scores away from the average
             score) typically used in the social sciences:

             1.  Measures of central tendency or closeness
                 to the average score:

                 A.  Mode ...... most frequent score

                 B.  Median .... mid-point of an ordered array

                 C.  Mean ...... arithmetic average (Sum/N)

             In the "perfect" bell-shaped curve, all three
             measures of central tendency would be equivalent.
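
             As a rough illustration of the three definitions
             above, the following Python sketch computes each one
             for the 23 scores listed in Table 1.  The function
             name central_tendency is only illustrative, and the
             choice to report the smallest score when several tie
             for most frequent is an assumption.

```python
from collections import Counter
from statistics import mean, median

# Final examination scores from Table 1 (23 students).
scores = [89, 92, 73, 83, 56, 82, 77, 92, 100, 67, 71, 76,
          83, 86, 77, 49, 71, 84, 91, 88, 82, 77, 97]


def central_tendency(values):
    """Return (mode, median, mean) for a list of numeric scores."""
    counts = Counter(values)
    top = max(counts.values())
    # Several scores can tie for most frequent; as an assumption,
    # report the smallest of them.
    modes = sorted(v for v, c in counts.items() if c == top)
    return modes[0], median(values), mean(values)


mode_, median_, mean_ = central_tendency(scores)
print(f"Mode {mode_}  Median {median_}  Mean {mean_:.3f}")
```

             For these scores the sketch reports a mode of 77, a
             median of 82, and a mean of about 80.13.  The three
             values differ because the distribution of scores is
             not perfectly bell-shaped.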

             2.  Measures of dispersion, spread, or variance in
                 the range of scores away from the average
                 score:

                 A.  Variance ... the average of the squared
                                  deviations from the mean (for a
                                  sample, their sum divided by N - 1)

                 B.  SD ......... the standard deviation, or the
                                  square root of the variance

                 C.  Range ...... the spread from the lowest score
                                  to the highest score
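
             The dispersion measures above can be sketched in
             Python in the same way.  The function name dispersion
             is illustrative, and the sketch assumes the scores
             are treated as a sample, so the sum of squared
             deviations is divided by N - 1 rather than by N.

```python
from statistics import pvariance, stdev, variance

# Final examination scores from Table 1 (23 students).
scores = [89, 92, 73, 83, 56, 82, 77, 92, 100, 67, 71, 76,
          83, 86, 77, 49, 71, 84, 91, 88, 82, 77, 97]


def dispersion(values):
    """Return (sample variance, sample SD, range) for numeric scores."""
    return variance(values), stdev(values), max(values) - min(values)


var_, sd_, range_ = dispersion(scores)
print(f"Variance {var_:.3f}  SD {sd_:.3f}  Range {range_}")

# Dividing by N instead of N - 1 gives the smaller population variance.
print(f"Population variance {pvariance(scores):.3f}")
```

             For the scores in Table 1 this gives a variance of
             about 149.12, a standard deviation of about 12.21,
             and a range of 51.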

             It is common to present in summary statistics a
             listing of these descriptive statistics, to give
             the reader a general view of the data.  In our
             current example, you would typically identify:

             -- N or number of valid final examination test
                scores

             -- Average final examination test score

                -- Mode

                -- Median

                -- Mean

             -- Variance in final examination test scores

                -- SD (Std dev) or standard deviation

                -- Range of scores from minimum score to
                   highest test score

             This information gives a far more complete
             description of test results than merely stating
             that "the average test score was 80 out of 100."

Scenario:    A computing technology teacher administered a final
             examination at the end of a nine-week term.  In an 
             attempt to better understand the progress of her 
             students, she prepared a data file and then used 
             leading software products to examine final 
             examination outcomes.  Scores (potentially ranging
             from 000 to 100) for her 23 students are presented
             in Table 1:

             Table 1

             Scores for a Computing Technology Final Examination
             Student Number          Score
                   01                 089
                   02                 092
                   03                 073
                   04                 083
                   05                 056
                   06                 082
                   07                 077
                   08                 092
                   09                 100
                   10                 067
                   11                 071
                   12                 076
                   13                 083
                   14                 086
                   15                 077
                   16                 049
                   17                 071
                   18                 084
                   19                 091
                   20                 088
                   21                 082
                   22                 077
                   23                 097

Files:       1.  cent_tnd.doc

             2.  cent_tnd.dat

             3.  cent_tnd.r01

             4.  cent_tnd.o01

             5.  cent_tnd.con

             6.  cent_tnd.lis  

Command:     At the UNIX prompt (%), key:

             %spss -m < cent_tnd.r01 > cent_tnd.o01

             Contact your system administrator if you need
             to use another command to run SPSS-X in
             batch mode.  Of course, slight modifications
             may be necessary if you use SPSS on a PC.

Data:        Contents of cent_tnd.dat, with the student code in
             columns 20-21 and the score in columns 39-41:

                   01                 089
                   02                 092
                   03                 073
                   04                 083
                   05                 056
                   06                 082
                   07                 077
                   08                 092
                   09                 100
                   10                 067
                   11                 071
                   12                 076
                   13                 083
                   14                 086
                   15                 077
                   16                 049
                   17                 071
                   18                 084
                   19                 091
                   20                 088
                   21                 082
                   22                 077
                   23                 097

Commands:    Contents of the SPSS command file cent_tnd.r01:

SET WIDTH      = 80
SET LENGTH     = NONE
SET CASE       = UPLOW
SET HEADER     = NO
TITLE          = Descriptive Statistics and Central Tendency
COMMENT        = This file examines scores on a computing
                 technology final examination
DATA LIST FILE = 'cent_tnd.dat' FIXED
     / Stu_Code   20-21
       Score      39-41

Variable Labels
       Stu_Code   "Student Code"
     / Score      "Exam Score  "

FREQUENCIES VARIABLES = Score
     / STATISTICS     = All
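
             The DATA LIST command reads each record by column
             position rather than by delimiter.  A minimal Python
             sketch of the same idea follows.  The names
             parse_record, make_record, and read_scores are
             hypothetical, and make_record exists only to build
             sample records in the cent_tnd.dat layout.

```python
def parse_record(line):
    """Extract (student code, score) from one fixed-column record.

    DATA LIST columns are 1-based, so columns 20-21 and 39-41
    become the 0-based slices [19:21] and [38:41].
    """
    return int(line[19:21]), int(line[38:41])


def make_record(code, score):
    """Hypothetical helper: build a record in the cent_tnd.dat layout."""
    return f"{code:02d}".rjust(21) + f"{score:03d}".rjust(20)


def read_scores(path):
    """Read every non-blank record from a file such as cent_tnd.dat."""
    with open(path) as handle:
        return [parse_record(line) for line in handle if line.strip()]


# A few records in the same layout as the data file.
sample = [make_record(1, 89), make_record(9, 100), make_record(16, 49)]
for line in sample:
    print(parse_record(line))
```

             Slicing by position keeps the leading zeros in the
             data file (for example, 089) from mattering, because
             int() discards them.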

Output:      Contents of cent_tnd.o01:

   1  SET WIDTH      = 80
   2  SET LENGTH     = NONE
   3  SET CASE       = UPLOW
   4  SET HEADER     = NO
   5  TITLE          = Descriptive Statistics and Central Tendency
   6  COMMENT        = This file examines scores on a computing
   7                   technology final examination
   8  DATA LIST FILE = 'cent_tnd.dat' FIXED
   9       / Stu_Code   20-21
  10         Score      39-41

This command will read 1 records from cent_tnd.dat

Variable   Rec   Start     End         Format

STU_CODE     1      20      21         F2.0
SCORE        1      39      41         F3.0

  12  Variable Labels
  13         Stu_Code   "Student Code"
  14       / Score      "Exam Score  "
  18       / STATISTICS     = All

SCORE     Exam Score

                                                        Valid     Cum
Value Label                 Value  Frequency  Percent  Percent  Percent

                               49         1      4.3      4.3      4.3
                               56         1      4.3      4.3      8.7
                               67         1      4.3      4.3     13.0
                               71         2      8.7      8.7     21.7
                               73         1      4.3      4.3     26.1
                               76         1      4.3      4.3     30.4
                               77         3     13.0     13.0     43.5
                               82         2      8.7      8.7     52.2
                               83         2      8.7      8.7     60.9
                               84         1      4.3      4.3     65.2
                               86         1      4.3      4.3     69.6
                               88         1      4.3      4.3     73.9
                               89         1      4.3      4.3     78.3
                               91         1      4.3      4.3     82.6
                               92         2      8.7      8.7     91.3
                               97         1      4.3      4.3     95.7
                              100         1      4.3      4.3    100.0
                                     -------  -------  -------
                            Total        23    100.0    100.0

Mean         80.130      Std err       2.546      Median       82.000
Mode         77.000      Std dev      12.211      Variance    149.119
Kurtosis       .914      S E Kurt       .935      Skewness      -.813
S E Skew       .481      Range        51.000      Minimum      49.000
Maximum     100.000      Sum        1843.000

Valid cases      23      Missing cases      0
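
             Several statistics in this listing go beyond the
             mean and standard deviation.  The sketch below shows
             one common way to compute the standard error of the
             mean, skewness, and kurtosis in Python, along with
             the standard errors of the last two.  It assumes the
             usual sample-adjusted formulas.  The source does not
             state which formulas SPSS applies, although these
             reproduce the printed values to three decimal places
             for this data set.  The name shape_statistics is
             illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Final examination scores from Table 1 (23 students).
scores = [89, 92, 73, 83, 56, 82, 77, 92, 100, 67, 71, 76,
          83, 86, 77, 49, 71, 84, 91, 88, 82, 77, 97]


def shape_statistics(values):
    """Standard error, skewness, and kurtosis with sample adjustments."""
    n = len(values)
    m = mean(values)
    s = stdev(values)  # sample SD (divides by n - 1)
    sum_cubed = sum((x - m) ** 3 for x in values)
    sum_fourth = sum((x - m) ** 4 for x in values)

    std_err = s / sqrt(n)
    skew = n * sum_cubed / ((n - 1) * (n - 2) * s ** 3)
    kurt = (n * (n + 1) * sum_fourth
            / ((n - 1) * (n - 2) * (n - 3) * s ** 4)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    se_skew = sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = sqrt(4 * (n * n - 1) * se_skew ** 2 / ((n - 3) * (n + 5)))
    return {"std_err": std_err, "skewness": skew, "se_skew": se_skew,
            "kurtosis": kurt, "se_kurt": se_kurt}


stats = shape_statistics(scores)
for name, value in stats.items():
    print(f"{name:9s} {value:8.3f}")
```

             The negative skewness reflects the two low scores
             (49 and 56), which pull the mean below the median.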

Conclusion:  Descriptive statistics and measures of central
             tendency for final examination test scores:

             N   Mode  Median  Mean   SD    Range
             23  77    82      80.1   12.2  51 (49 to 100)

             Far greater detail (perhaps too much detail) on 
             descriptive statistics for final examination test
             scores can be found at the end of the output file.

             As you examine this section of the output file, be
             sure to notice that:

             -- N = 23, which is to say that there were 23
                students who had scores for this examination.

             -- Three separate values were provided for the
                "average" score:

                -- Mode (most frequent) was 77

                -- Median (mid-point of the array of all final
                   examination scores) was 82

                -- Mean (arithmetic average, or Sum of all
                   final examination scores / Number of final
                   examination scores) was 80.1

             -- Dispersion is summarized by two leading statistics:

                -- Standard Deviation (Std dev or SD,
                   representing dispersion of final examination
                   scores away from the mean) was 12.2

                -- Range in final examination scores was 51,
                   from a minimum score of 49 to a maximum
                   score of 100

             Each statistic is useful in our attempt to put
             outcomes in context.  Although it is very common
             to only see N, Mean, and SD presented in the
             literature, the other statistics presented above
             give a more complete picture of outcomes.
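
Addendum:    The same data file examined in a MINITAB session,
             with the session saved to cent_tnd.lis:

% minitab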
 MTB > outfile 'cent_tnd.lis'
 Collecting Minitab session in file: cent_tnd.lis
 MTB > # MINITAB addendum to cent_tnd.dat
 MTB > read 'cent_tnd.dat' c1 c2
 Entering data from file: cent_tnd.dat
      23 rows read.
 MTB > print c1
     1     2     3     4     5     6     7     8     9    10    11    12    13 
    14    15    16    17    18    19    20    21    22    23 
 MTB > print c2
     89     92     73     83     56     82     77     92    100     67     71 
     76     83     86     77     49     71     84     91     88     82     77 
     97 
 MTB > histogram c2
 Histogram of C2   N = 23
 Midpoint   Count
       50       1  *
       55       1  *
       60       0
       65       1  *
       70       2  **
       75       5  *****
       80       2  **
       85       4  ****
       90       5  *****
       95       1  *
      100       1  *
 MTB > stem-and-leaf c2
 Stem-and-leaf of C2        N  = 23
 Leaf Unit = 1.0
     1    4 9
     1    5 
     2    5 6
     2    6 
     3    6 7
     6    7 113
    10    7 6777
    (5)   8 22334
     8    8 689
     5    9 122
     2    9 7
     1   10 0
 MTB > dotplot c2
             .      .          .   : .  .:    ::. . .. .:    .  .
             50        60        70        80        90       100
 MTB > tally c2
       C2  COUNT
       49     1 
       56     1 
       67     1 
       71     2 
       73     1 
       76     1 
       77     3 
       82     2 
       83     2 
       84     1 
       86     1 
       88     1 
       89     1 
       91     1 
       92     2 
       97     1 
      100     1 
       N=    23 
 MTB > describe c2
                 N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
 C2             23    80.13    82.00    80.67    12.21     2.55
               MIN      MAX       Q1       Q3
 C2          49.00   100.00    73.00    89.00
 MTB > stop
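
             The describe output adds a trimmed mean (TRMEAN) and
             the quartiles (Q1 and Q3).  A Python sketch of both
             follows.  It assumes that TRMEAN drops the lowest and
             highest 5 percent of scores, with the count rounded
             to the nearest whole number, and that the quartiles
             sit at positions (N + 1)/4 and 3(N + 1)/4 in the
             sorted scores.  Neither rule is stated in the source,
             but both reproduce the values shown for this data
             set.  The name trimmed_mean is illustrative.

```python
from statistics import mean, quantiles

# Final examination scores from Table 1 (23 students).
scores = [89, 92, 73, 83, 56, 82, 77, 92, 100, 67, 71, 76,
          83, 86, 77, 49, 71, 84, 91, 88, 82, 77, 97]


def trimmed_mean(values, proportion=0.05):
    """Mean after dropping a share of the lowest and highest values.

    The number dropped from each end is the proportion of N,
    rounded to the nearest whole number (an assumed rule).
    """
    ordered = sorted(values)
    k = round(len(ordered) * proportion)
    return mean(ordered[k:len(ordered) - k])


# The default "exclusive" method places quartile i at position
# i * (N + 1) / 4 in the sorted data.
q1, q2, q3 = quantiles(scores, n=4)

print(f"TRMEAN {trimmed_mean(scores):.2f}")
print(f"Q1 {q1:.2f}  Median {q2:.2f}  Q3 {q3:.2f}")
```

             With 23 scores, one score is trimmed from each end
             (49 and 100), giving a trimmed mean of about 80.67,
             and the quartiles are 73, 82, and 89.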

Disclaimer:  All care was used to prepare the information in this 
tutorial.  Even so, the author does not and cannot guarantee the 
accuracy of this information.  The author disclaims any and all 
injury that may come about from the use of this tutorial.  As 
always, students and all others should check with their advisor(s) 
and/or other appropriate professionals for any and all assistance 
on research design, analysis, selected levels of significance, and 
interpretation of output file(s).

The author is entitled to exclusive distribution of this tutorial. 
Readers have permission to print this tutorial for individual use, 
provided that the copyright statement appears and that there is no 
redistribution of this tutorial without permission.

Prepared 980316
Revised  980914
end-of-file 'cent_tnd.ssi'

Please send comments or suggestions to Dr. Thomas W. MacFarland
