January/February 2003 // Assessment
The Impact of Computer-Based Testing on Student Attitudes and Behavior
by Darrell L. Butler
Note: This article was originally published in The Technology Source (http://ts.mivu.org/) as: Darrell L. Butler, "The Impact of Computer-Based Testing on Student Attitudes and Behavior," The Technology Source, January/February 2003. Available online at http://ts.mivu.org/default.asp?show=article&id=1034. The article is reprinted here with permission of the publisher.

Large classes are relatively efficient pedagogical systems, but the probability of student success is much lower than in small classes. Osborne, Browne, Shapiro, and Wagor (2000) reported that in the United States, the combination of Ds, Fs, and Ws can be nearly 50% in large courses, and Robertson and Ober (1997) found a large negative correlation between class size and mean course grade in 647 sections of core curriculum courses at a Midwestern state university. Lindsay and Paton-Saltzberg (1987) also reported a strong negative correlation between class size and grades in Great Britain.

Over the past 70 years, researchers have consistently found that one way to improve student success is to increase the frequency of exams (Graham, 1999; Keys, 1934; Kika, McLaughlin, & Dixon, 1992; Pikunas & Mazzota, 1965; Turney, 1931). For example, Keys reported a 14% improvement when tests were given weekly instead of monthly; Pikunas and Mazzota found that performance was 10% higher when tests were given weekly rather than every 6 weeks; and Graham found a 4% increase merely from the addition of unannounced quizzes. However, when classes meet a set number of times, an instructor's decision to give more exams typically means that students have less time for learning activities during class meetings.

To solve this dilemma, Ball State University created a small, proctored, computer-based testing (PCBT) facility where students can take exams on-campus relatively frequently outside of class time. At the time of this study, the PCBT facility had 24 computers running a severely restricted version of Windows NT; one computer was equipped with adaptations for students with disabilities. Tests were available through InQsit using an Internet browser.

In theory, the PCBT facility seemed like a great pedagogical solution. Faculty members could increase the number of tests and quizzes and also gain some class time for learning activities. We were unsure, however, about the real impact of this option. Would using the PCBT facility actually help students perform better? How would students feel about being asked to take tests outside of class time? Would they see it as an infringement or as a stress reducer? Would moving from pencil and paper to computers have consequences we had not foreseen?

The PCBT Facility: Preliminary Assessment

Researchers have generally found that when computer-based testing (CBT) is similar in format to pencil and paper tests, it has little if any effect on test performance (Neuman & Baydoun, 1998; Bugbee, 1996). However, developments in hardware and software have made the CBT approach to testing and grading economical and practical (Barua, 1999; Halcomb et al., 1989; Zakrzewski & Bull, 1998), and CBT offers advantages over pencil and paper tests. For example, it permits

  • specialized kinds of test stimuli, such as speech and multimedia, that are much more difficult to provide with pencil and paper tests (Bennett et al., 1999);
  • the use of different testing approaches, such as adaptive testing (a minimal sketch of one such approach appears after this list); and
  • quick scoring of some kinds of test items, as well as rapid feedback to students and summaries to faculty.
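
To make the adaptive-testing idea in the list above concrete, here is a minimal sketch of one common approach: a simple up/down staircase that raises item difficulty after a correct answer and lowers it after an incorrect one. The item bank, function names, and scoring rule are hypothetical illustrations only and do not describe how InQsit or the PCBT facility actually selected or scored items.

```python
import random

# Hypothetical item bank: difficulty levels 1 (easiest) to 5 (hardest), each
# holding a list of question IDs. A real bank would store question text,
# answer choices, and a key.
ITEM_BANK = {level: [f"q{level}-{i}" for i in range(1, 11)] for level in range(1, 6)}

def run_adaptive_test(ask, num_items=10, start_level=3):
    """Administer a simple up/down adaptive test.

    `ask` is a callback that presents one item and returns True when the
    student answers correctly. Difficulty rises after a correct answer and
    falls after an incorrect one.
    """
    level = start_level
    history = []  # (item_id, level, correct) for later scoring and feedback
    for _ in range(num_items):
        item = random.choice(ITEM_BANK[level])
        correct = ask(item, level)
        history.append((item, level, correct))
        level = min(5, level + 1) if correct else max(1, level - 1)
    # Crude ability estimate: mean difficulty over the second half of the
    # test, after the staircase has had a chance to settle.
    tail = history[len(history) // 2:]
    estimate = sum(lvl for _, lvl, _ in tail) / len(tail)
    return estimate, history

if __name__ == "__main__":
    def simulated_student(item, level):
        # Answers correctly at difficulty 3 or below; guesses otherwise.
        return level <= 3 or random.random() < 0.25

    score, _ = run_adaptive_test(simulated_student)
    print(f"Estimated ability level: {score:.1f}")
```

Operational testing programs generally rely on item response theory rather than a raw staircase, but the loop above captures the advantage the list refers to: the test adapts to each student while remaining automatically scorable.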

For a preliminary assessment of the PCBT facility, I evaluated students in an undergraduate Introduction to Psychology course that made use of the facility. These students were given eight exams instead of the three or four exams given to students in previous semesters, when tests were taken in class using pencil and paper. Students who used the PCBT facility earned about 10% higher scores than students who took fewer, in-class tests. The former reported that they preferred both taking tests in the PCBT lab and taking more exams than the traditional three or four. Similarly, my colleague Michael O'Hara (2002) reported that students in an Introduction to Theater course performed about 15% better when they took many exams in the PCBT facility than when they took fewer exams in class, using pencil and paper. O'Hara also noted that students had favorable attitudes about taking exams and quizzes in the PCBT facility. However, in O'Hara's study and my own preliminary assessment, there were concerns about cheating; in both cases, a slightly higher percentage of students who used the PCBT lab reported that they had cheated.

I subsequently undertook a more thorough assessment of the impact of a small, on-campus PCBT facility. I was particularly interested in whether a large sample of students from many classes would have positive attitudes toward taking more exams than is traditional and toward the PCBT facility itself. In addition, I wanted to look more closely at the incidence and nature of cheating.

Methods

The participants in this study were 908 volunteers from 25 different classes at Ball State University. The sampling scheme was a bit complex. First, I made two requests of faculty members using the PCBT facility: that they participate in the study and that they identify courses similar to theirs that were not PCBT-affiliated. I then contacted the instructors of the latter courses and asked them to encourage their students to volunteer for the study. Although only a few of the faculty members gave students extra credit for volunteering, the percentage of students using PCBT who were offered incentives was very similar to the percentage of students not using PCBT who were offered incentives.

Participants' classes differed in two major ways important to the study: the number of exams varied from 2 to 14 (M = 6.9, SD = 3.3) and students either took tests in a proctored, computer-based facility outside of class time (N = 653) or took tests on paper in the classroom during class time (N = 255). Most exams were multiple choice, and in all classes exam performance was the primary basis of grades. None of the students had take-home or other non-proctored tests outside the classroom.

A 26-item online survey (Exhibit 1) was created with InQsit software and consisted of two sections. The first section concerned demographic data; the second section elicited feedback about exams—namely, participant attitudes about tests and the nature of their ethical or unethical behavior during tests. Using any computer connected to the Internet, volunteers accessed the Web site to participate in the survey anonymously.

Results

Attitudes about Grades and Learning. Significance tests indicated that a moderate number of tests—about five to eight, or an exam every 2 to 3 weeks—was associated with better overall attitudes about grades. As shown in Figure 1, for five to eight exams, attitudes were in the middle of the preference scale, indicating that students did not prefer either more or fewer exams. Responses at the low and high ends of the exam-frequency range were also consistent with an overall preference for moderate testing: the mean preference of students who took 2, 3, or 4 exams was greater than 3 (the scale midpoint), indicating that these students preferred more exams; conversely, the mean for students who took 14 exams was below the middle of the preference scale, indicating that these students wanted fewer exams. Grade-related attitudes were generally more favorable to testing in the PCBT environment than to testing in the traditional classroom. Similar trends were found for students' attitudes about learning: they preferred both a moderate number of exams (Figure 2) and testing in the computer-based facility.

Attitudes about Anxiety and Readiness. Students' responses to prompts about anxiety suggested that fewer tests may produce higher levels of anxiety. As indicated in Figure 3, students tested in the classroom generally reported more anxiety than students tested in the PCBT facility. This outcome may be attributable to the fact that faculty members who used in-class testing gave fewer exams. In contrast, the number of tests had no clear effect on students' attitudes about readiness (Figure 4); the same was true for test location.

Attitudes about Convenience and Control. Very frequent testing was associated with the lowest ratings of convenience, but even those scores were near the middle of the scale (Figure 5). Relatively infrequent testing was associated with the lowest ratings of control. The PCBT facility was associated with somewhat higher ratings of control than the classroom (Figure 6).

Cheating. For most questions related to cheating, both types of testing generated similar student responses. However, on the matter of whether students who had taken an exam could talk with students who had not, there were significant differences (Figure 7). Students tested in the PCBT facility were much more likely to talk generally about a test with students who had not yet taken it. Although the differences were small, students tested in the PCBT facility were also more likely to talk about specific questions and answers.

Discussion

Overall, a moderate number of tests was associated with better student attitudes. Student attitudes were generally more positive toward the proctored, computer-based testing facility than toward in-class, pencil and paper testing. Because this study did not involve careful manipulation of the independent variables, some of these attitude effects may have been due to the practices of faculty members or other unmeasured variables. However, I did confer with professors who were using the PCBT facility, and all believed that the same attitude shifts occurred within their classes when they moved from in-class testing to testing in the PCBT facility.

Students using the PCBT facility reported cheating to a greater degree than students who did not use the facility; this outcome was related to the fact that, within the PCBT facility, not all students took tests at the same time. Despite the proctored nature of the exams, students who had taken an exam had the opportunity to talk to classmates who had not. Most conversation among students dealt with the content of the test only in general terms. In response to open-ended survey questions, many students stated that they did not believe that such dialogue should be considered cheating—they believed that they were providing the same kind of information that the professor would provide. However, some students did talk specifically about questions and answers.

There are a number of ways that instructors can reduce the kinds of cheating described by these students, or at least reduce the impact of cheating on grades. One option is to increase the size of the facility and make all students take exams at the same time. This is not a cost-effective solution, however, and it undoubtedly would hurt students' perceptions of convenience and would sacrifice some of the other advantages of a smaller PCBT facility. Alternatively, faculty can try to reduce the impact of students' communication with one another. For example, professors can refrain from giving feedback on individual test items until all students have taken the exam. If students do not know the professor's answer key, they cannot share answers with certainty. Similarly, if test items are relatively complex, students may not be able to remember them well enough to share them with others.

Another option is to give similar but not identical exams to students. Professors can generate different exams, or at least some different items, for each student. Generating multiple exams may be very easy for classes that test computations (i.e., numbers can be different for every student). Otherwise, generating multiple exams is somewhat more work for faculty because creating a pool of comparable items generally takes longer than creating a single exam; moreover, there is always the danger of producing non-equivalent exams.
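
As a rough illustration of both suggestions, the sketch below seeds a random-number generator with a student identifier, draws a subset of questions from a pool of comparable items, and fills in fresh numbers for each computation item. All names and parameters here are invented for the example; the sketch does not reflect the software actually used at Ball State.

```python
import random

def make_computation_item(rng):
    """Build one arithmetic item whose numbers are unique to this student."""
    a, b = rng.randint(12, 99), rng.randint(12, 99)
    return {"prompt": f"Compute {a} x {b}.", "answer": a * b}

def make_exam(student_id, item_pool, pool_items=8, computation_items=2):
    """Assemble a per-student exam variant.

    Seeding the generator with the student ID makes the exam reproducible,
    so the instructor can regenerate the matching answer key at grading time.
    """
    rng = random.Random(student_id)
    exam = rng.sample(item_pool, pool_items)   # draw from a pool of comparable items
    exam += [make_computation_item(rng) for _ in range(computation_items)]
    rng.shuffle(exam)                          # vary item order as well
    return exam

# Example: a small pool of conceptually equivalent stems (answers omitted here).
POOL = [{"prompt": f"Concept question #{i}", "answer": None} for i in range(1, 21)]
exam_for_one_student = make_exam("student042", POOL)
```

The trade-off noted above still applies: the item pool must be large enough, and its items comparable enough, that the generated exam variants are equivalent in difficulty.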

Not all assessments must be graded. Angelo and Cross (1995) provide strong arguments for the use of ungraded assessments. It is unclear whether the results of the Ball State study are relevant to ungraded assessment. I suspect that since students do not study for most ungraded assessments, attitudes about the number of such assessments would differ from those found here for graded exams. Other comparisons of graded and ungraded assignments in relation to the PCBT facility are not obvious and require further research.

Summary

The results of this study suggest that a small, proctored, computer-based testing facility that students can use outside of class time offers a range of benefits to educators. Without sacrificing class time, faculty can give more exams than is typical and thus aid student performance. Professors also can use testing stimuli and strategies that are difficult to implement with pencil and paper tests. Students seem to have positive attitudes about PCBT, particularly when a moderate number of tests are required. However, the use of a small facility does increase the opportunity for students who have taken a test to communicate with those who have not. Faculty members should take steps to reduce the impact of these conversations on exam performance.

References

Angelo, T. A., & Cross, K. P. (1995). Classroom assessment techniques: A handbook for college teachers (2nd ed.). San Francisco: Jossey-Bass.

Barua, J. (1999, April). Computer-based testing on campus. Syllabus, 51-52.

Bennett, R. E., Goodman, M., Hessinger, J., Kahmn, H., Ligget, J., Marshall, G., et al. (1999). Using multimedia in large-scale computer-based testing programs. Computers in Human Behavior, 15, 283-294.

Bugbee, Jr., A. C. (1996). The equivalence of paper-and-pencil and computer-based testing. Journal of Research on Computing in Education, 28, 282-299.

Graham, R. B. (1999). Unannounced quizzes raise test scores selectively for mid-range students. Teaching of Psychology, 26, 271-273.

Halcomb, C. G., Chatfield, D. C., Stewart, B. E., Stokes, M. T., Cruse, B. H., & Weimer, J. (1989). A computer-based instructional management system for general psychology. Teaching of Psychology, 16, 149-151.

Keys, W. (1934). The influence on learning and retention of weekly as opposed to monthly tests. Journal of Educational Psychology, 25, 511-520.

Kika, F. M., McLaughlin, T. F., & Dixon, J. (1992). Effects of frequent testing of secondary algebra students. Journal of Educational Research, 85, 159-162.

Lindsay, R., & Paton-Saltzberg, R. (1987). Resource changes and academic performance at an English Polytechnic. Studies in Higher Education, 12(2), 213-227.

Neuman, G., & Baydoun, R. (1998). Computerization of pencil and paper tests: When are they equivalent? Applied Psychological Measurement, 22, 71-83.

O'Hara, M. (2002). Technology and theatre pedagogy: A call from the trenches. In A. Fliotsos & G. Medford (Eds.), Theatre pedagogy. Manuscript submitted for publication.

Osborne, R. E., Browne, W. F., Shapiro, S. J., & Wagor, W. F. (2000). Transforming introductory psychology: Trading ownership for student success. To Improve the Academy, 18, 128-146.

Pikunas, J., & Mazzota, E. (1965). The effects of weekly testing in the teaching of science. Science Education, 49, 373-376.

Robertson, T., & Ober, D. (1997). General studies assessment data—average grade earned and average section size. Ball State University Campus Report. Muncie, IN: Ball State University.

Turney, A. H. (1931). The effect of frequent short objective tests upon the achievement of college students in educational psychology. School and Society, 33, 760-762.

Whitley, B. (1998). Factors associated with cheating among college students: A review. Research in Higher Education, 39, 235-274.

Zakrzewski, S., & Bull, J. (1998). The mass implementation and evaluation of computer-based assessments. Assessment and Evaluation in Higher Education, 23, 141-152.
