Oneway Analysis of Variance (ANOVA)
© 1998 by Dr. Thomas W. MacFarland -- All Rights Reserved
************ one_anov.doc ************

Background:

A common statistical technique for determining if differences
exist between three or more "groups" is Oneway Analysis of
Variance (ANOVA), and the associated F test.

The F test and subsequent ANOVA methodology involve the
determination of differences for:

   1. one group with multiple (typically, three or more)
      variations, as well as

   2. one variable, compared to multiple groups.

When using Oneway ANOVA for three or more groups, an immediate
concern is how to interpret findings if the null hypothesis is
not accepted (consult an appropriate statistics text to review
why there are those who consider it more appropriate to declare
"The null hypothesis was not accepted" instead of "The null
hypothesis was rejected").

When only two groups are compared and the null hypothesis is not
accepted, you know that the statistically significant difference
(at the declared level of significance, or p level) is the
difference between Group #1 and Group #2.

What happens, however, if the null hypothesis is not accepted for
a Oneway ANOVA design involving three groups?

   -- Is the difference between Group A and Group B the reason
      for failure to accept the null hypothesis?

   -- Is the difference between Group A and Group C the reason
      for failure to accept the null hypothesis?

   -- Is the difference between Group B and Group C the reason
      for failure to accept the null hypothesis?

There are many techniques for making multiple comparisons among
the means of each group. The following mean comparison tests are
found in SPSS for the purpose of comparing differences between
means in a Oneway ANOVA design:

   1. LSD ............ Least-significant difference
   2. DUNCAN ......... Duncan's multiple range test
   3. SNK ............ Student-Newman-Keuls
   4. TUKEYB ......... Tukey's alternate procedure
   5. TUKEY .......... Honestly significant difference
   6. LSDMOD ......... Modified LSD
   7. SCHEFFE ........ Scheffe's test

Be sure to remember that Oneway ANOVA methodology, as opposed to
Student's t-test, can serve as a useful tool in the development
of processes for understanding "real-world" problems. Most
"real-world" problems are related to complex issues. Statistical
tests that can account for this complexity are needed if
meaningful decisions are to be effected.

Scenario:

This study examines if there are differences in final examination
test scores between four groups of students in a software
engineering course:

   -- The first group of students was taught by traditional
      lecture.

   -- The second group of students was taught by Computer Based
      Training.

   -- The third group of students was taught by the use of
      instructional videotapes.

   -- The last group of students was enrolled through independent
      study.

Students were all from a university senior-level software
engineering course who were assigned, through random selection,
to placement into one of four groups: instruction by traditional
lecture, instruction by CBT (Computer Based Training), instruction
by the use of instructional videotapes, and independent study.

Because the teacher was confident that final examination scores
represented interval data (i.e., the difference between "89" and
"90" is equal to the difference between "75" and "76"), a
parametric test was suitable, and Oneway ANOVA (Analysis of
Variance) was judged to be the appropriate test for this analysis
of summative differences in final examination scores between
three or more groups.

Final examination test scores are summarized in Table 1.

Table 1

Final Examination Test Scores in a Senior-Level Software
Engineering Course by Instructional Method: Traditional Lecture,
Computer Based Training, Instructional Videotape, and Independent
Study

====================================================
                         Instructional
                             Method
                         =============
                        1 = Lecture
                        2 = CBT
                        3 = Video
             Student    4 = Independent
              Number        Study (IDS)  Final Score
----------------------------------------------------
                   01             1              089
                   02             1              081
                   03             1              073
                   04             1              084
                   05             1              070
                   06             1              056
                   07             1              070
                   08             1              081
                   09             1              078
                   10             1              069
                   11             1              089
                   12             1              088
                   13             1              045
                   14             1              083
                   15             1              095
                   16             1              077
                   17             1              069
                   18             1              080
                   19             2              093
                   20             2              086
                   21             2              089
                   22             2              095
                   23             2              089
                   24             2              088
                   25             2              098
                   26             2              089
                   27             2              094
                   28             2              095
                   29             2              095
                   30             2              098
                   31             2              087
                   32             2              085
                   33             2              098
                   34             2              093
                   35             2              087
                   36             2              095
                   37             2              093
                   38             2              093
                   39             3              095
                   40             3              096
                   41             3              083
                   42             3              089
                   43             3              088
                   44             3              087
                   45             3              094
                   46             3              097
                   47             3              095
                   48             3              093
                   49             3              085
                   50             3              095
                   51             3              092
                   52             3              082
                   53             3              086
                   54             3              087
                   55             3              089
                   56             3              097
                   57             3              100
                   58             3              093
                   59             3              096
                   60             4              084
                   61             4              085
                   62             4              073
                   63             4              092
                   64             4              057
                   65             4              063
                   66             4              069
                   67             4              073
                   68             4              091
                   69             4              065
                   70             4              074
                   71             4              071
                   72             4              068
                   73             4              062
                   74             4              056
                   75             4              085
----------------------------------------------------

Note. Notice how the N (i.e., number of subjects or group
      members) for each instructional group does not have to be
      equal.

Ho: Null Hypothesis:

There is no difference in the final examination test scores of
students enrolled in a university senior-level software
engineering course after students were assigned, through random
selection, to placement into one of four groups: instruction by
traditional lecture, instruction by CBT (Computer Based Training),
instruction by the use of instructional videotapes, and
independent study (p <= .01).

Note. The p (i.e., probability or alpha level) value is declared
      as p <= .01 instead of the more liberal p <= .05.

Files:

   1. one_anov.doc
   2. one_anov.dat
   3. one_anov.r01
   4. one_anov.o01
   5. one_anov.con
   6. one_anov.lis

Command:

At the Unix prompt (%), key:

   %spss -m < one_anov.r01 > one_anov.o01

************ one_anov.dat ************

                   01             1              089
                   02             1              081
                   03             1              073
                   04             1              084
                   05             1              070
                   06             1              056
                   07             1              070
                   08             1              081
                   09             1              078
                   10             1              069
                   11             1              089
                   12             1              088
                   13             1              045
                   14             1              083
                   15             1              095
                   16             1              077
                   17             1              069
                   18             1              080
                   19             2              093
                   20             2              086
                   21             2              089
                   22             2              095
                   23             2              089
                   24             2              088
                   25             2              098
                   26             2              089
                   27             2              094
                   28             2              095
                   29             2              095
                   30             2              098
                   31             2              087
                   32             2              085
                   33             2              098
                   34             2              093
                   35             2              087
                   36             2              095
                   37             2              093
                   38             2              093
                   39             3              095
                   40             3              096
                   41             3              083
                   42             3              089
                   43             3              088
                   44             3              087
                   45             3              094
                   46             3              097
                   47             3              095
                   48             3              093
                   49             3              085
                   50             3              095
                   51             3              092
                   52             3              082
                   53             3              086
                   54             3              087
                   55             3              089
                   56             3              097
                   57             3              100
                   58             3              093
                   59             3              096
                   60             4              084
                   61             4              085
                   62             4              073
                   63             4              092
                   64             4              057
                   65             4              063
                   66             4              069
                   67             4              073
                   68             4              091
                   69             4              065
                   70             4              074
                   71             4              071
                   72             4              068
                   73             4              062
                   74             4              056
                   75             4              085

************ one_anov.r01 ************

SET WIDTH = 80
SET LENGTH = NONE
SET CASE = UPLOW
SET HEADER = NO
TITLE = Oneway Analysis of Variance (ONEWAY ANOVA)
COMMENT = This file examines if there are differences
          in final examination test scores between four
          groups of students in a software engineering
          course: the first group of students was taught
          by traditional lecture, the second group of
          students was taught by Computer Based Training,
          the third group of students was taught by the
          use of instructional videotapes, and the last
          group of students were enrolled through
          independent study.

          Students were all from a university senior-
          level software engineering course who were
          assigned, through random selection, to
          placement into one of four groups: instruction
          by traditional lecture, instruction by CBT
          (Computer Based Training), instruction by
          the use of instructional videotapes, and
          independent study.
          Because the teacher was
          confident that final examination scores
          represented interval data (i.e., the data are
          parametric, with the difference between "89"
          and "90" equal to the difference between "75"
          and "76"), Oneway ANOVA (Analysis of Variance)
          was correctly judged to be the appropriate test
          for this analysis of summative differences in
          final examination scores between three or more
          groups.
DATA LIST FILE = 'one_anov.dat' FIXED
   / Stu_Code 20-21
     Method   35
     Score    50-52

Variable Labels
     Stu_Code "Student Code"
   / Method   "Method: Lecture, CBT, Video, IDS"
   / Score    "Final Examination Score"

Value Labels
     Method 1 'Lecture: Traditional Lecture'
            2 'CBT: Computer-Based Training'
            3 'Video: Instructional Videotape'
            4 'IDS: Independent Study'

ONEWAY Score BY Method(1,4)
   / STATISTICS = ALL
   / RANGES = SCHEFFE (.01)
   / FORMAT = LABELS

************ one_anov.o01 ************

   1  SET WIDTH = 80
   2  SET LENGTH = NONE
   3  SET CASE = UPLOW
   4  SET HEADER = NO
   5  TITLE = Oneway Analysis of Variance (ONEWAY ANOVA)
   6  COMMENT = This file examines if there are differences
   7            in final examination test scores between four
   8            groups of students in a software engineering
   9            course: the first group of students was taught
  10            by traditional lecture, the second group of
  11            students was taught by Computer Based Training,
  12            the third group of students was taught by the
  13            use of instructional videotapes, and the last
  14            group of students were enrolled through
  15            independent study.
  16
  17            Students were all from a university senior-
  18            level software engineering course who were
  19            assigned, through random selection, to
  20            placement into one of four groups: instruction
  21            by traditional lecture, instruction by CBT
  22            (Computer Based Training), instruction by
  23            the use of instructional videotapes, and
  24            independent study.  Because the teacher was
  25            confident that final examination scores
  26            represented interval data (i.e., the data are
  27            parametric, with the difference between "89"
  28            and "90" equal to the difference between "75"
  29            and "76"), Oneway ANOVA (Analysis of Variance)
  30            was correctly judged to be the appropriate test
  31            for this analysis of summative differences in
  32            final examination scores between three or more
  33            groups.
  34  DATA LIST FILE = 'one_anov.dat' FIXED
  35     / Stu_Code 20-21
  36       Method   35
  37       Score    50-52
  38

This command will read 1 records from one_anov.dat

Variable   Rec   Start     End     Format

STU_CODE     1      20      21     F2.0
METHOD       1      35      35     F1.0
SCORE        1      50      52     F3.0

  39  Variable Labels
  40       Stu_Code "Student Code"
  41     / Method   "Method: Lecture, CBT, Video, IDS"
  42     / Score    "Final Examination Score"
  43
  44  Value Labels
  45       Method 1 'Lecture: Traditional Lecture'
  46              2 'CBT: Computer-Based Training'
  47              3 'Video: Instructional Videotape'
  48              4 'IDS: Independent Study'
  49
  50  ONEWAY Score BY Method(1,4)
  51     / STATISTICS = ALL
  52     / RANGES = SCHEFFE (.01)
  53     / FORMAT = LABELS

ONEWAY problem requires 504 bytes of memory.

- - - - -  O N E W A Y  - - - - -

      Variable  SCORE      Final Examination Score
   By Variable  METHOD     Method: Lecture, CBT, Video, IDS

                        Analysis of Variance

                             Sum of        Mean          F       F
     Source        D.F.     Squares      Squares       Ratio   Prob.

Between Groups       3     5372.3343    1790.7781    23.6178   .0000
Within Groups       71     5383.4524      75.8233
Total               74    10755.7867

                             Standard   Standard
Group      Count      Mean  Deviation      Error  95 Pct Conf Int for Mean

Lecture:      18   76.5000    12.2774     2.8938   70.3946  TO   82.6054
CBT: Com      20   92.0000     4.1675      .9319   90.0495  TO   93.9505
Video: I      21   91.3810     5.1037     1.1137   89.0578  TO   93.7041
IDS: Ind      16   73.0000    11.4601     2.8650   66.8934  TO   79.1066

Total         75   84.0533    12.0561     1.3921   81.2795  TO   86.8272

      Fixed Effects Model      8.7077     1.0055   82.0485  to   86.0582
      Random Effects Model                4.9191   68.3987  to   99.7080

Random Effects Model - estimate of between component variance   91.79

GROUP        MINIMUM     MAXIMUM

Lecture:     45.0000     95.0000
CBT: Com     85.0000     98.0000
Video: I     82.0000    100.0000
IDS: Ind     56.0000     92.0000

TOTAL        45.0000    100.0000

Levene Test for Homogeneity of Variances

   Statistic    df1    df2    2-tail Sig.
     6.5470       3     71       .001

- - - - -  O N E W A Y  - - - - -

      Variable  SCORE      Final Examination Score
   By Variable  METHOD     Method: Lecture, CBT, Video, IDS

Multiple Range Tests:  Scheffe test with significance level .01

The difference between two means is significant if
  MEAN(J)-MEAN(I)  >= 6.1572 * RANGE * SQRT(1/N(I) + 1/N(J))
  with the following value(s) for RANGE: 4.94

   (*) Indicates significant differences which are shown in the
       lower triangle

                              I L V C
                              D e i B
                              S c d T
                              : t e :
                                u o
                              I r : C
                              n e   o
                              d : I m
     Mean      METHOD

    73.0000    IDS: Ind
    76.5000    Lecture:
    91.3810    Video: I       * *
    92.0000    CBT: Com       * *

************ one_anov.con ************

Outcome:

   Computed F  = 23.6178
   Criterion F =  4.13 (alpha = .01, df = 3,60)

Note. Although df = 3,71, the df entries in a printed table of
      the F distribution increase in steps from df = 3,40
      (F = 4.31), to df = 3,60 (F = 4.13), to df = 3,120
      (F = 3.95), and the Criterion F decreases as df increases.
      Using the entry for df = 3,60, the nearest tabled df below
      71, gives a slightly conservative criterion. Occasionally,
      it is necessary to interpolate between tabled values when
      determining the Criterion F statistic.

   Computed F (23.62) > Criterion F (4.13)

Therefore, the null hypothesis is rejected.
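
The Computed F and Criterion F above can also be checked by hand.
The following is a minimal sketch in Python (which is not part of
the original tutorial) that recomputes the Oneway ANOVA table from
the scores listed in one_anov.dat and then interpolates a
Criterion F for df = 3,71 from the two nearest tabled values. The
function names oneway_anova and interpolate_critical_f are
hypothetical, and interpolating linearly in 1/df is an assumed
convention for reading F tables, so the interpolated value should
be treated as an approximation only.

```python
from statistics import mean

# Final examination scores from one_anov.dat, keyed by instructional method.
scores = {
    "Lecture": [89, 81, 73, 84, 70, 56, 70, 81, 78,
                69, 89, 88, 45, 83, 95, 77, 69, 80],
    "CBT": [93, 86, 89, 95, 89, 88, 98, 89, 94, 95,
            95, 98, 87, 85, 98, 93, 87, 95, 93, 93],
    "Video": [95, 96, 83, 89, 88, 87, 94, 97, 95, 93, 85,
              95, 92, 82, 86, 87, 89, 97, 100, 93, 96],
    "IDS": [84, 85, 73, 92, 57, 63, 69, 73,
            91, 65, 74, 71, 68, 62, 56, 85],
}


def oneway_anova(groups):
    """Return (df_between, df_within, ss_between, ss_within, f_ratio)."""
    all_scores = [x for g in groups.values() for x in g]
    grand_mean = mean(all_scores)
    # Between Groups: weighted squared distance of each group mean
    # from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2
                     for g in groups.values())
    # Within Groups: squared distance of each score from its group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return df_between, df_within, ss_between, ss_within, f_ratio


def interpolate_critical_f(df, df_low, f_low, df_high, f_high):
    """Interpolate a tabled F linearly in 1/df (assumed convention)."""
    weight = (1 / df_low - 1 / df) / (1 / df_low - 1 / df_high)
    return f_low + weight * (f_high - f_low)


df_b, df_w, ss_b, ss_w, f_ratio = oneway_anova(scores)
# Tabled values quoted in the Outcome note: F(3,60) = 4.13, F(3,120) = 3.95.
criterion_f = interpolate_critical_f(df_w, 60, 4.13, 120, 3.95)

print(f"df = {df_b},{df_w}")
print(f"Between SS = {ss_b:.4f}   Within SS = {ss_w:.4f}")
print(f"Computed F = {f_ratio:.4f}")
print(f"Interpolated Criterion F (alpha = .01) = {criterion_f:.2f}")
print("Reject Ho" if f_ratio > criterion_f else "Do not reject Ho")
```

If the scores are keyed correctly, the sums of squares and the F
ratio printed by this sketch should agree with the SPSS Analysis
of Variance table, and the interpolated Criterion F should fall
between the tabled values of 3.95 and 4.13.
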

That is to say, there are differences in final examination test
scores in a senior-level software engineering course, based on
instructional method (p <= .01).

The p value is another way to view differences in the final
examination test scores:

   -- The calculated p value is reported by SPSS as .0000 (i.e.,
      less than .00005).

   -- The declared p value is .01.

The calculated p value is less than the declared p value and
there is, accordingly, a difference in test scores.

Note. The Levene test in the SPSS output (statistic = 6.5470,
      p = .001) suggests that the group variances are not equal,
      so the homogeneity of variance assumption behind the F test
      should be reviewed when interpreting these results.

Conclusion:

Although you now know that differences exist, the F statistic
does not tell you where the difference(s) exist between
instructional methods. Instead, review the following section of
the SPSS output file:

   (*) Indicates significant differences which are shown in the
       lower triangle

                              I L V C
                              D e i B
                              S c d T
                              : t e :
                                u o
                              I r : C
                              n e   o
                              d : I m
     Mean      METHOD

    73.0000    IDS: Ind
    76.5000    Lecture:
    91.3810    Video: I       * *
    92.0000    CBT: Com       * *

Using traditional methodology, you could also visually present on
your own the mean comparisons among groups by using underscores,
as presented below:

   IDS       Lecture          Video     CBT
   73.00     76.50            91.38     92.00
   _____________________      __________________

Although it is not possible at this point to suggest "why"
differences exist, there is sufficient evidence from this
one-time study to suggest the following (Scheffe test, p <= .01):

   -- There is no statistically significant difference in final
      examination test scores between students who received
      instruction through IDS and Lecture.

   -- There is no statistically significant difference in final
      examination test scores between students who received
      instruction through Video and CBT.

   -- There is a statistically significant difference in final
      examination test scores between students who received
      instruction through CBT and either IDS or Lecture, with CBT
      students receiving a higher score.

   -- There is a statistically significant difference in final
      examination test scores between students who received
      instruction through Video and either IDS or Lecture, with
      Video students receiving a higher score.
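
The Scheffe comparisons summarized above can be reproduced from
the summary values in the SPSS output. The following is a minimal
sketch in Python (which is not part of the original tutorial)
that applies the rule printed by SPSS, MEAN(J)-MEAN(I) >= 6.1572 *
RANGE * SQRT(1/N(I) + 1/N(J)), to every pair of groups. It assumes
that the constant 6.1572 is the square root of half the Within
Groups mean square; this matches the output numerically
(sqrt(75.8233 / 2)) but is an inference from the numbers, not
something the output states. The name scheffe_pairs is
hypothetical.

```python
from itertools import combinations
from math import sqrt

# Summary values copied from the SPSS output: (count, mean) per group,
# listed in the same ascending-mean order as the SPSS matrix.
groups = {
    "IDS": (16, 73.0000),
    "Lecture": (18, 76.5000),
    "Video": (21, 91.3810),
    "CBT": (20, 92.0000),
}
MS_WITHIN = 75.8233  # Within Groups mean square from the ANOVA table
RANGE = 4.94         # RANGE value reported by SPSS for Scheffe at .01


def scheffe_pairs(groups, ms_within, range_value):
    """Yield (a, b, difference, critical difference, significant) per pair."""
    # Assumption: SPSS's leading constant equals sqrt(ms_within / 2).
    constant = sqrt(ms_within / 2)
    for (a, (n_a, m_a)), (b, (n_b, m_b)) in combinations(groups.items(), 2):
        diff = abs(m_b - m_a)
        critical = constant * range_value * sqrt(1 / n_a + 1 / n_b)
        yield a, b, diff, critical, diff >= critical


results = list(scheffe_pairs(groups, MS_WITHIN, RANGE))
for a, b, diff, critical, significant in results:
    flag = "*" if significant else " "
    print(f"{a:<8} vs {b:<8} diff = {diff:7.4f}  "
          f"critical = {critical:7.4f}  {flag}")

# Pairs flagged with (*), stored without regard to order.
significant_pairs = {frozenset((a, b)) for a, b, _, _, sig in results if sig}
```

The pairs flagged with an asterisk by this sketch should be the
same pairs marked (*) in the lower triangle of the SPSS matrix.
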

You will notice that these complex outcomes have a graphic
representation in MINITAB that is fairly easy to understand, as
opposed to the more complex graphical representation in SPSS.

************ one_anov.lis ************

% minitab

MTB > outfile 'one_anov.lis'
Collecting Minitab session in file: one_anov.lis
MTB > # MINITAB Addendum to 'one_anov.dat'
MTB > #
MTB > read 'one_anov.dat' c1 c2 c3
Entering data from file: one_anov.dat
     75 rows read.
MTB > print c1 c2 c3

 ROW    C1    C2    C3

   1     1     1    89
   2     2     1    81
   3     3     1    73
   4     4     1    84
   5     5     1    70
   6     6     1    56
   7     7     1    70
   8     8     1    81
   9     9     1    78
  10    10     1    69
  11    11     1    89
  12    12     1    88
  13    13     1    45
  14    14     1    83
  15    15     1    95
  16    16     1    77
  17    17     1    69
  18    18     1    80
Continue? y
  19    19     2    93
  20    20     2    86
  21    21     2    89
  22    22     2    95
  23    23     2    89
  24    24     2    88
  25    25     2    98
  26    26     2    89
  27    27     2    94
  28    28     2    95
  29    29     2    95
  30    30     2    98
  31    31     2    87
  32    32     2    85
  33    33     2    98
  34    34     2    93
  35    35     2    87
  36    36     2    95
  37    37     2    93
  38    38     2    93
  39    39     3    95
  40    40     3    96
  41    41     3    83
Continue? y
  42    42     3    89
  43    43     3    88
  44    44     3    87
  45    45     3    94
  46    46     3    97
  47    47     3    95
  48    48     3    93
  49    49     3    85
  50    50     3    95
  51    51     3    92
  52    52     3    82
  53    53     3    86
  54    54     3    87
  55    55     3    89
  56    56     3    97
  57    57     3   100
  58    58     3    93
  59    59     3    96
  60    60     4    84
  61    61     4    85
  62    62     4    73
  63    63     4    92
  64    64     4    57
Continue? y
  65    65     4    63
  66    66     4    69
  67    67     4    73
  68    68     4    91
  69    69     4    65
  70    70     4    74
  71    71     4    71
  72    72     4    68
  73    73     4    62
  74    74     4    56
  75    75     4    85

MTB > # I'll unstack the data in c3 and c2 and then use
MTB > # the two commands to effect the Oneway ANOVA calculations.
MTB > #
MTB > # If at all possible, stack and/or unstack data but
MTB > # never re-key data.
MTB > #
MTB > unstack (c2-c3) (c5-c6) (c7-c8) (c9-c10) (c11-c12);
SUBC> subscripts c2.
MTB > print c1-c12

 ROW    C1    C2    C3    C5    C6    C7    C8    C9   C10   C11   C12

   1     1     1    89     1    89     2    93     3    95     4    84
   2     2     1    81     1    81     2    86     3    96     4    85
   3     3     1    73     1    73     2    89     3    83     4    73
   4     4     1    84     1    84     2    95     3    89     4    92
   5     5     1    70     1    70     2    89     3    88     4    57
   6     6     1    56     1    56     2    88     3    87     4    63
   7     7     1    70     1    70     2    98     3    94     4    69
   8     8     1    81     1    81     2    89     3    97     4    73
   9     9     1    78     1    78     2    94     3    95     4    91
  10    10     1    69     1    69     2    95     3    93     4    65
  11    11     1    89     1    89     2    95     3    85     4    74
  12    12     1    88     1    88     2    98     3    95     4    71
  13    13     1    45     1    45     2    87     3    92     4    68
  14    14     1    83     1    83     2    85     3    82     4    62
  15    15     1    95     1    95     2    98     3    86     4    56
  16    16     1    77     1    77     2    93     3    87     4    85
  17    17     1    69     1    69     2    87     3    89
  18    18     1    80     1    80     2    95     3    97
Continue? y
  19    19     2    93                 2    93     3   100
  20    20     2    86                 2    93     3    93
  21    21     2    89                             3    96
  22    22     2    95
  23    23     2    89
  24    24     2    88
  25    25     2    98
  26    26     2    89
  27    27     2    94
  28    28     2    95
  29    29     2    95
  30    30     2    98
  31    31     2    87
  32    32     2    85
  33    33     2    98
  34    34     2    93
  35    35     2    87
  36    36     2    95
  37    37     2    93
  38    38     2    93
  39    39     3    95
  40    40     3    96
  41    41     3    83
Continue? y
  42    42     3    89
  43    43     3    88
  44    44     3    87
  45    45     3    94
  46    46     3    97
  47    47     3    95
  48    48     3    93
  49    49     3    85
  50    50     3    95
  51    51     3    92
  52    52     3    82
  53    53     3    86
  54    54     3    87
  55    55     3    89
  56    56     3    97
  57    57     3   100
  58    58     3    93
  59    59     3    96
  60    60     4    84
  61    61     4    85
  62    62     4    73
  63    63     4    92
  64    64     4    57
Continue? y
  65    65     4    63
  66    66     4    69
  67    67     4    73
  68    68     4    91
  69    69     4    65
  70    70     4    74
  71    71     4    71
  72    72     4    68
  73    73     4    62
  74    74     4    56
  75    75     4    85

* NOTE * One or more variables are undefined.

MTB > describe c6 c8 c10 c12

           N      MEAN    MEDIAN    TRMEAN     STDEV    SEMEAN
C6        18     76.50     79.00     77.31     12.28      2.89
C8        20    92.000    93.000    92.056     4.168     0.932
C10       21     91.38     93.00     91.42      5.10      1.11
C12       16     73.00     72.00     72.86     11.46      2.87

         MIN       MAX        Q1        Q3
C6     45.00     95.00     69.75     85.00
C8    85.000    98.000    88.250    95.000
C10    82.00    100.00     87.00     95.50
C12    56.00     92.00     63.50     84.75

MTB > histogram c6 c8 c10 c12

Histogram of C6   N = 18

Midpoint   Count
      45       1  *
      50       0
      55       1  *
      60       0
      65       0
      70       4  ****
      75       2  **
      80       4  ****
      85       2  **
      90       3  ***
      95       1  *
Continue? y

Histogram of C8   N = 20

Midpoint   Count
      85       1  *
      86       1  *
      87       2  **
      88       1  *
      89       3  ***
      90       0
      91       0
      92       0
      93       4  ****
      94       1  *
      95       4  ****
      96       0
      97       0
      98       3  ***
Continue? y

Histogram of C10   N = 21

Midpoint   Count
      82       1  *
      84       1  *
      86       2  **
      88       3  ***
      90       2  **
      92       1  *
      94       3  ***
      96       5  *****
      98       2  **
     100       1  *
Continue? y

Histogram of C12   N = 16

Midpoint   Count
      55       2  **
      60       1  *
      65       2  **
      70       3  ***
      75       3  ***
      80       0
      85       3  ***
      90       2  **

MTB > # I will now use the MINITAB command for STACKED data.
MTB > #
MTB > oneway c3 c2

ANALYSIS OF VARIANCE ON C3
SOURCE     DF        SS        MS        F        p
C2          3    5372.3    1790.8    23.62    0.000
ERROR      71    5383.5      75.8
TOTAL      74   10755.8
                                   INDIVIDUAL 95 PCT CI'S FOR MEAN
                                   BASED ON POOLED STDEV
 LEVEL      N      MEAN     STDEV  -----+---------+---------+---------+-
     1     18    76.500    12.277        (----*----)
     2     20    92.000     4.168                           (----*----)
     3     21    91.381     5.104                          (----*----)
     4     16    73.000    11.460   (----*-----)
                                   -----+---------+---------+---------+-
POOLED STDEV =    8.708               72.0      80.0      88.0      96.0

MTB > #
MTB > # And I will now use the MINITAB command for UNSTACKED data.
MTB > aovoneway c6 c8 c10 c12

ANALYSIS OF VARIANCE
SOURCE     DF        SS        MS        F        p
FACTOR      3    5372.3    1790.8    23.62    0.000
ERROR      71    5383.5      75.8
TOTAL      74   10755.8
                                   INDIVIDUAL 95 PCT CI'S FOR MEAN
                                   BASED ON POOLED STDEV
 LEVEL      N      MEAN     STDEV  -----+---------+---------+---------+-
    C6     18    76.500    12.277        (----*----)
    C8     20    92.000     4.168                           (----*----)
   C10     21    91.381     5.104                          (----*----)
   C12     16    73.000    11.460   (----*-----)
                                   -----+---------+---------+---------+-
POOLED STDEV =    8.708               72.0      80.0      88.0      96.0

MTB > #
MTB > # Although I'm keen on the use of SPSS, the graphic
MTB > # output with MINITAB on a Oneway ANOVA is very
MTB > # useful and easy to understand.
MTB > #
MTB > # Here, you can easily see that c6 (lecture) and c12
MTB > # (independent study) have overlapping confidence
MTB > # intervals for the mean score on the final
MTB > # examination. Equally, you can also see that c8 (CBT)
MTB > # and c10 (videotape instruction) have overlapping
MTB > # confidence intervals.
MTB > #
MTB > # Finally, you can also see that the confidence
MTB > # intervals for lecture and independent study mean
MTB > # scores have no overlap with the confidence intervals
MTB > # for CBT and videotape instruction mean scores.
MTB > stop

--------------------------

Disclaimer:

All care was used to prepare the information in this tutorial.
Even so, the author does not and cannot guarantee the accuracy of
this information. The author disclaims any and all injury that
may come about from the use of this tutorial. As always, students
and all others should check with their advisor(s) and/or other
appropriate professionals for any and all assistance on research
design, analysis, selected levels of significance, and
interpretation of output file(s).

The author is entitled to exclusive distribution of this
tutorial. Readers have permission to print this tutorial for
individual use, provided that the copyright statement appears and
that there is no redistribution of this tutorial without
permission.

Prepared 980316
Revised  980914

end-of-file 'one_anov.ssi'