Mann-Whitney U-Test
© 1998 by Dr. Thomas W. MacFarland -- All Rights Reserved
************
mann_whi.doc
************
Background: The Mann-Whitney U test is often viewed as the
nonparametric equivalent of Student's t-test.
Like the parametric Student's t-test, the
nonparametric Mann-Whitney U test:
-- is used to determine if a difference exists
between two "groups," however you define
"group"
-- ideally depends on random assignment of
subjects to their respective groups
The major difference between the Mann-Whitney U
Test and Student's t-Test involves the concept of
normal distribution:
-- Mann-Whitney is a nonparametric test.
-- Normal distribution of data is not necessary
for use of this test.
There is a table of U values in many statistics
texts. If you use this table:
-- The column number should be the size (n) of the
larger sample.
-- The row number should be the size (n) of the smaller
sample.
-- Samples of equal size will use n by n for
determining the criterion U statistic.
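The table lookup described above can be sketched in Python. This is a minimal sketch, not a definitive procedure. It assumes the common table convention that the smaller of the two U values is the test statistic and that the null hypothesis is rejected when it is less than or equal to the criterion value; check the notes printed with your own table. The function names and the critical_u argument are illustrative only, and the criterion value itself must still be read from the table.

```python
def table_position(n1, n2):
    """Return (row, column) for the U table: row = smaller n, column = larger n."""
    return min(n1, n2), max(n1, n2)


def smaller_u(u1, n1, n2):
    """Return the smaller of U1 and U2, using the identity U1 + U2 = n1 * n2."""
    u2 = n1 * n2 - u1
    return min(u1, u2)


def u_rejects_null(u, critical_u):
    """Assumed table convention: reject when the smaller U is <= the criterion."""
    return u <= critical_u


# Two groups of 15 subjects each, as in the scenario in this tutorial.
row, column = table_position(15, 15)
print("row", row, "column", column)
print("U to compare with the table:", smaller_u(127, 15, 15))
```

Because U1 + U2 always equals n1 * n2, only one of the two U values needs to be computed by hand.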
If you use SPSS (or many of the other statistical
packages) for data analysis you may find it far more
convenient to use either z values or p values in the
output file to determine significance.
When using z values:
-- If the absolute value of the observed z does not
equal or exceed the critical z value of 1.96 (the
p <= .05 critical z value for a two-tailed test),
then you fail to reject the null hypothesis; there
is insufficient evidence of a difference between
groups.
-- If the absolute value of z equals or exceeds 1.96,
however, then you have evidence to reject the null
hypothesis.
Or, you may find it more convenient to observe
the printed p value.
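The link between a z value and its two-tailed p value can be sketched in Python using only the standard library. This is a minimal sketch, assuming the z statistic follows a standard normal distribution under the null hypothesis. The function names are illustrative; the value -0.6020 is the z reported in the SPSS output file shown later in this tutorial.

```python
import math


def two_tailed_p(z):
    """Two-tailed p value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def z_rejects_null(z, critical_z=1.96):
    """True when the absolute value of z equals or exceeds the critical z."""
    return abs(z) >= critical_z


for z in (-0.6020, 1.96, 2.58):
    print(f"z = {z:7.4f}  p = {two_tailed_p(z):.4f}  reject: {z_rejects_null(z)}")
```

Comparing the absolute value of z to 1.96 and comparing the two-tailed p value to .05 lead to the same decision.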
Scenario: This study examines whether there are any differences in
outcomes in a Pascal programming course between
students who received Computer Based Training,
as opposed to students who received traditional
lecture:
-- Differences (if indeed they exist) between the
two teaching formats will be measured by student
performance on a common final examination.
-- Based on prior experience with the test
instrument, it is suspected that outcomes are
not normally distributed (i.e., not a bell-shaped curve)
but are instead skewed to the right. Accordingly,
Student's t-Test is not the appropriate test for
difference between the two groups. Instead, this
study will be based on the use of the Mann-Whitney
U Test.
In this study random assignment was used to place the
30 students in Mr. Seeger's Pascal programming course
into one of two groups:
1. Students in group 1 received instruction
through the use of Computer Based Training
(CBT).
2. Students in group 2 received instruction
through the use of traditional lecture.
A summary of the study is presented in Table 1.
Table 1
Pascal Programming Course Final Examination
Scores: Breakouts by Computer Based Training and
Traditional Lecture
====================================================
                    Assigned Group
                    ==============
                    1 = CBT
Student Number      2 = Lecture       Exam Score
----------------------------------------------------
01                  1                 080
02                  1                 082
03                  1                 091
04                  1                 100
05                  1                 076
06                  1                 065
07                  1                 085
08                  1                 088
09                  1                 097
10                  1                 055
11                  1                 069
12                  1                 088
13                  1                 075
14                  1                 097
15                  1                 081
16                  2                 072
17                  2                 089
18                  2                 086
19                  2                 085
20                  2                 099
21                  2                 047
22                  2                 079
23                  2                 088
24                  2                 100
25                  2                 076
26                  2                 083
27                  2                 094
28                  2                 084
29                  2                 082
30                  2                 093
----------------------------------------------------
Ho: Null Hypothesis: There is no difference in final
examination test scores in a Pascal programming
course between students who received Computer
Based Training and students who received traditional
lecture (p <= .05).
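The hand calculation behind the test can be sketched in Python from the scores in Table 1. This is a minimal sketch using only the standard library, not the procedure SPSS or Minitab uses internally. Tied scores receive the average of the ranks they occupy. The variable and function names are illustrative. The results agree with the SPSS output file shown later in this tutorial (mean ranks 14.53 and 16.47, W = 218.0, U = 98.0).

```python
# Final examination scores from Table 1.
cbt = [80, 82, 91, 100, 76, 65, 85, 88, 97, 55, 69, 88, 75, 97, 81]
lecture = [72, 89, 86, 85, 99, 47, 79, 88, 100, 76, 83, 94, 84, 82, 93]


def midranks(values):
    """Map each distinct value to its rank, averaging ranks across ties."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Sorted positions i..j-1 (0-based) hold ranks i+1..j; average them.
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return ranks


ranks = midranks(cbt + lecture)
w_cbt = sum(ranks[x] for x in cbt)           # rank sum for group 1
w_lecture = sum(ranks[x] for x in lecture)   # rank sum for group 2
n1, n2 = len(cbt), len(lecture)
u_cbt = w_cbt - n1 * (n1 + 1) / 2            # U for group 1
u_lecture = n1 * n2 - u_cbt                  # U for group 2

print("mean ranks:", round(w_cbt / n1, 2), round(w_lecture / n2, 2))
print("W =", w_cbt, " U =", min(u_cbt, u_lecture))
```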
Files: 1. mann_whi.doc
2. mann_whi.dat
3. mann_whi.r01
4. mann_whi.o01
5. mann_whi.con
6. mann_whi.lis
Command: At the Unix prompt (%), key:
%spss -m < mann_whi.r01 > mann_whi.o01
************
mann_whi.dat
************
                   01              1            080
                   02              1            082
                   03              1            091
                   04              1            100
                   05              1            076
                   06              1            065
                   07              1            085
                   08              1            088
                   09              1            097
                   10              1            055
                   11              1            069
                   12              1            088
                   13              1            075
                   14              1            097
                   15              1            081
                   16              2            072
                   17              2            089
                   18              2            086
                   19              2            085
                   20              2            099
                   21              2            047
                   22              2            079
                   23              2            088
                   24              2            100
                   25              2            076
                   26              2            083
                   27              2            094
                   28              2            084
                   29              2            082
                   30              2            093
************
mann_whi.r01
************
SET WIDTH = 80
SET LENGTH = NONE
SET CASE = UPLOW
SET HEADER = NO
TITLE = Mann-Whitney U Test
COMMENT = This file examines if Computer Based Training
is as effective as traditional lecture
in a Pascal programming course. Differences
between the two teaching formats will be
measured by student performance on a common
final examination.
DATA LIST FILE = 'mann_whi.dat' FIXED
/ Stu_Code 20-21
Group 36
Score 49-51
Variable Labels
Stu_Code "Subject Code"
/ Group "Assigned Group: CBT or Traditional"
/ Score "Common Final Examination Score"
Value Labels
Group 1 'Computer Based Training'
2 'Traditional Lecture'
NPAR TESTS M-W = Score BY Group (1,2)
************
mann_whi.o01
************
1 SET WIDTH = 80
2 SET LENGTH = NONE
3 SET CASE = UPLOW
4 SET HEADER = NO
5 TITLE = Mann-Whitney U Test
6 COMMENT = This file examines if Computer Based Training
7 is as effective as traditional lecture
8 in a Pascal programming course. Differences
9 between the two teaching formats will be
10 measured by student performance on a common
11 final examination.
12 DATA LIST FILE = 'mann_whi.dat' FIXED
13 / Stu_Code 20-21
14 Group 36
15 Score 49-51
16
This command will read 1 records from mann_whi.dat
Variable Rec Start End Format
STU_CODE 1 20 21 F2.0
GROUP 1 36 36 F1.0
SCORE 1 49 51 F3.0
17 Variable Labels
18 Stu_Code "Subject Code"
19 / Group "Assigned Group: CBT or Traditional"
20 / Score "Common Final Examination Score"
21
22 Value Labels
23 Group 1 'Computer Based Training'
24 2 'Traditional Lecture'
25
26 NPAR TESTS M-W = Score BY Group (1,2)
***** Workspace allows for 18724 cases for NPAR tests *****
- - - - -  Mann-Whitney U - Wilcoxon Rank Sum W Test
     SCORE     Common Final Examination Score
  by GROUP     Assigned Group: CBT or Traditional
     Mean Rank    Cases
         14.53       15  GROUP = 1  Computer Based Train
         16.47       15  GROUP = 2  Traditional Lecture
                     --
                     30  Total
                               Exact           Corrected for ties
         U           W      2-Tailed P         Z      2-Tailed P
       98.0       218.0        .5668        -.6020       .5472
************
mann_whi.con
************
Outcome: Significance can of course be verified by comparing
the computed test statistic (i.e., U) to the
criterion (i.e., table) value.
It is often much easier, however, to use the output
file to verify interpretation of significance:
p = .5472 (2-tailed, corrected for ties)
By interpretation of the p (probability) value, it
is observed that p = .55, which exceeds the
significance level of .05 declared with the Null
Hypothesis.
There is insufficient evidence to reject the Null
Hypothesis. The data do not show a statistically
significant difference between the two training
groups in terms of final examination scores.
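The z and p values in the SPSS output can be checked with a short Python sketch. This is a sketch under stated assumptions, not a description of the SPSS implementation: it uses the usual large-sample normal approximation for U with a correction for tied scores and no continuity correction. The variable names are illustrative; U = 98.0 is taken from the output file above.

```python
import math
from collections import Counter

# Final examination scores from mann_whi.dat.
cbt = [80, 82, 91, 100, 76, 65, 85, 88, 97, 55, 69, 88, 75, 97, 81]
lecture = [72, 89, 86, 85, 99, 47, 79, 88, 100, 76, 83, 94, 84, 82, 93]

n1, n2 = len(cbt), len(lecture)
n = n1 + n2
u = 98.0  # U reported in the SPSS output file

# Tie term: sum of (t^3 - t) over each set of t tied scores.
# Scores that occur only once contribute zero.
tie_term = sum(t**3 - t for t in Counter(cbt + lecture).values())

mean_u = n1 * n2 / 2
var_u = (n1 * n2 / 12) * ((n + 1) - tie_term / (n * (n - 1)))
z = (u - mean_u) / math.sqrt(var_u)
p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p from the normal curve

print(f"z = {z:.4f}, two-tailed p = {p:.4f}")
```

Under these assumptions the sketch gives z near -.6020 and p near .5472, in line with the "Corrected for ties" columns of the output.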
************
mann_whi.lis
************
% minitab
MTB > outfile 'mann_whi.lis'
Collecting Minitab session in file: mann_whi.lis
MTB > # MINITAB addendum to mann_whi.dat
MTB > read 'mann_whi.dat' c1 c2 c3
Entering data from file: mann_whi.dat
30 rows read.
MTB > print c1 c2 c3
ROW C1 C2 C3
1 1 1 80
2 2 1 82
3 3 1 91
4 4 1 100
5 5 1 76
6 6 1 65
7 7 1 85
8 8 1 88
9 9 1 97
10 10 1 55
11 11 1 69
12 12 1 88
13 13 1 75
14 14 1 97
15 15 1 81
16 16 2 72
17 17 2 89
18 18 2 86
Continue? y
19 19 2 85
20 20 2 99
21 21 2 47
22 22 2 79
23 23 2 88
24 24 2 100
25 25 2 76
26 26 2 83
27 27 2 94
28 28 2 84
29 29 2 82
30 30 2 93
MTB > # I will now UNSTACK the data to get distinct
MTB > # groups.
MTB > unstack (c2-c3) into (c5-c6) (c8-c9);
SUBC> subscripts c2.
MTB > print c1 c2 c3 c4 c5 c6 c7 c8 c9
ROW C1 C2 C3 C5 C6 C8 C9
1 1 1 80 1 80 2 72
2 2 1 82 1 82 2 89
3 3 1 91 1 91 2 86
4 4 1 100 1 100 2 85
5 5 1 76 1 76 2 99
6 6 1 65 1 65 2 47
7 7 1 85 1 85 2 79
8 8 1 88 1 88 2 88
9 9 1 97 1 97 2 100
10 10 1 55 1 55 2 76
11 11 1 69 1 69 2 83
12 12 1 88 1 88 2 94
13 13 1 75 1 75 2 84
14 14 1 97 1 97 2 82
15 15 1 81 1 81 2 93
16 16 2 72
17 17 2 89
18 18 2 86
Continue? y
19 19 2 85
20 20 2 99
21 21 2 47
22 22 2 79
23 23 2 88
24 24 2 100
25 25 2 76
26 26 2 83
27 27 2 94
28 28 2 84
29 29 2 82
30 30 2 93
* NOTE * One or more variables are undefined.
MTB > mannwhitney c6 c9
Mann-Whitney Confidence Interval and Test
C6 N = 15 Median = 82.00
C9 N = 15 Median = 85.00
Point estimate for ETA1-ETA2 is -2.00
95.4 pct c.i. for ETA1-ETA2 is (-11.00,6.00)
W = 218.0
Test of ETA1 = ETA2 vs. ETA1 n.e. ETA2 is significant at 0.5614
The test is significant at 0.5611 (adjusted for ties)
Cannot reject at alpha = 0.05
MTB > stop
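The medians and the point estimate in the Minitab output can be checked with a short Python sketch. It assumes that the point estimate for ETA1-ETA2 is the median of all pairwise differences between the two groups (often called the Hodges-Lehmann estimate); this is a common description of the estimate, not a statement about Minitab's internal code. The variable names are illustrative.

```python
import statistics

# Final examination scores from mann_whi.dat (Minitab columns C6 and C9).
cbt = [80, 82, 91, 100, 76, 65, 85, 88, 97, 55, 69, 88, 75, 97, 81]
lecture = [72, 89, 86, 85, 99, 47, 79, 88, 100, 76, 83, 94, 84, 82, 93]

# All 15 x 15 = 225 pairwise differences (CBT score minus lecture score).
differences = [x - y for x in cbt for y in lecture]
point_estimate = statistics.median(differences)

print("medians:", statistics.median(cbt), statistics.median(lecture))
print("point estimate for ETA1-ETA2:", point_estimate)
```

The sketch reproduces the medians of 82 and 85 and the point estimate of -2 shown in the session above. Note that the point estimate is not simply the difference between the two medians (-3).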
--------------------------
Disclaimer: All care was used to prepare the information in this
tutorial. Even so, the author does not and cannot guarantee the
accuracy of this information. The author disclaims any and all
injury that may come about from the use of this tutorial. As
always, students and all others should check with their advisor(s)
and/or other appropriate professionals for any and all assistance
on research design, analysis, selected levels of significance, and
interpretation of output file(s).
The author is entitled to exclusive distribution of this tutorial.
Readers have permission to print this tutorial for individual use,
provided that the copyright statement appears and that there is no
redistribution of this tutorial without permission.
Prepared 980316
Revised 980914
end-of-file 'mann_whi.ssi'