The Technology Source Archives - Electronic Course Evaluation Is Not Necessarily the Solution

November/December 2000 // Letters to the Editor

Electronic Course Evaluation Is Not Necessarily the Solution

Note: This article was originally published in The Technology Source (http://ts.mivu.org/) as: Michael Theall "Electronic Course Evaluation Is Not Necessarily the Solution" The Technology Source, November/December 2000. Available online at http://ts.mivu.org/default.asp?show=article&id=1034. The article is reprinted here with permission of the publisher.

I agree with Keith Hmieleski and Matthew Champagne (2000, September/October) that the exploration of effective uses of technology is critical to higher education. I also agree that we must capitalize on the capabilities of new technologies to improve operational efficiency and day-to-day practice. However, the Hmieleski and Champagne article suggests two disturbing things: (a) that the authors are not familiar with the theory, literature, and reality of faculty evaluation and student ratings; and (b) that the authors have considered only one aspect of the process and thus are proposing an overly simple, unilateral, and potentially dangerous response to a complex problem.

Hmieleski and Champagne begin by noting that paper-based data collection is the current mode. This statement is based on a recent Rensselaer study (Hmieleski, 2000) and is essentially supported by prior studies of evaluation practice from Centra (1993), Traylor (1992), and Seldin (in a series of studies and books over the past 15 years, e.g., 1993, 1999). Though the focus of the early studies was not Web-based evaluation, the information they provide is similar to Hmieleski and Champagne's with respect to format, frequency of feedback, report latency, return rates, and faculty and student support. It seems fair to say that the student ratings process could be improved at many institutions around the country. One caution is in order, however. In citing all-too-prevalent examples of poor practice, the authors have set up a straw man that anyone can knock down. Their use of the term "autopsy method" is typically one-sided, creating the impression that every form of paper-based evaluation is inefficient or invalid. They do not note that many of the problems in day-to-day practice can be solved. Other evaluation literature of the past 25 years (Arreola, 2000; Braskamp & Ory, 1994; Centra, 1979; Doyle, 1975; Miller, 1987; Theall & Franklin, 1990a), if incorporated, would broaden the article's scope. Web-based systems cannot and will not, by themselves, solve the problems the authors describe. For example, the difficulty with their scenario about students doing evaluation after the final exam is that it has nothing to do with the appropriateness of paper-based evaluation. It is just bad practice that can be corrected without the need to move evaluation online.

The authors then discuss "three steps toward Web-based course evaluation": (a) converting from paper, (b) incorporating a feedback and refinement process, and (c) using currently available online technology. These are reasonable steps, but at this point, the authors make claims not supported by research or day-to-day practice. They say, for example, that the feedback and refinement process "removes obstacles to learning, improves student satisfaction, and rapidly improves course delivery." Evidence dating back to Cohen's meta-analysis (1980) and supported by more recent work (Brinko, 1991; Theall & Franklin, 1991; Brinko & Menges, 1997) has shown that the simple provision of data has little or no improvement effect on teaching, whereas data accompanied by support from knowledgeable instructional/faculty development staff can enhance teaching success. Hmieleski and Champagne make no mention of this important need, thus giving the impression that Web-based delivery of evaluation information will automatically improve teaching. In fact, research evidence argues that electronic collection and processing in itself will not provide such improvement.

Hmieleski and Champagne fail to distinguish between formative and summative uses of the evaluation data that will be collected, and, while I agree that their three steps can provide data useful for improvement, I also oppose the uses of these same data for summative purposes. While student comments are extremely valuable for improvement, there are many substantial reasons they should not be used for summative decision making (Franklin & Berman, 1998). The authors also suggest that benefits can be drawn from analyses of individual results rather than the average student response. Again, while this method can be a useful formative technique, omitting class average ratings of specific teaching behaviors or global items or factor scores ignores the potential of such items to inform the formative process and denies the absolute need for such data for summative purposes. Of course, a valid and reliable instrument, proper analysis, effective reporting of the data, and correct interpretation of reports are all critical to good evaluation. Using a Web-based system cannot guarantee this level of quality any more than a paper-based system can, and at the moment (given the vast amount of research on paper-based systems in face-to-face classes) Web-based evaluation may be less reliable and valid since its particular and unique aspects have not been thoroughly investigated.

In online and distance education courses, there is little question that Web-based evaluation may be a practical necessity, but we know very little about the dynamics and influence of Web-based and distance education vis-à-vis the evaluation of teaching in these contexts. Certainly, the roles, responsibilities, and tasks of teachers and learners in these contexts are different from those in on-campus classes. Putting a questionnaire on the Web to make its use more rapid or widespread may simply increase the occurrences of bad practice.

Cost is another question. The figures provided by Hmieleski and Champagne present the best case for electronic data processing and the worst case for paper-based systems. If we choose to disregard the large infrastructure costs associated with any electronic system, if we choose to let the entire process be done electronically, if we offer no improvement services, and if we assume that all the necessary tasks will be provided at no extra cost by existing staff, then it is possible to accept some of the figures in the article. But are we really so sure about these figures that we can promise that the total cost of processing 327 evaluations will be $18.75, or that the cost of a paper-based process for these same evaluations would be $568.60? Likewise, the data claiming that "analyzing 327 forms takes approximately 16 hours of labor" cry out for explanation because those of us with experience in operating valid and reliable paper-based systems know this statement is simply an incorrect generalization. It is another straw man based on a worst-possible-case set of assumptions.

The return rate issue is a serious one. The average return rate in well-run, paper-based systems presently exceeds the average return rate in electronic systems. The authors claim that comparing return rates is impossible due to "far too many alternative explanations." While I agree that such comparisons may not be terribly important or revealing, I can at least cite average return rates of over 80% in the systems that colleague Jennifer Franklin and I have operated. Citing Champagne et al. (1999) and Phipps & Merisotis (1999), the authors make important points about issues such as faculty support of the process, changes resulting from feedback, and validity influencing return rates. These are indeed important issues and they reflect my previous comments that the quality of the overall system is much more important than the particular data collection process used. Hmieleski and Champagne offer no evidence that electronic data processing will correct these problems.

The authors also fail to discuss the degree of control exercised over the context in which data collection takes place. True, in online systems, students can respond at their leisure, but the total absence of control over how and with whom the evaluations are completed flies in the face of good methodology. The previously cited literature includes many appropriate suggestions for eliciting the highest return rate and the highest quality data from in-class evaluations. There is no equivalent set of tested methods supporting quality control in online data collection.

But all this is almost moot. Putting student ratings systems online purely for supposed efficiency will do nothing to improve the poor state of evaluation practice. It will only allow bad information to be misinterpreted and misused more rapidly by those who presently do so in paper-based systems. It will not improve formative evaluation simply because it is faster. It will not reduce the mythologies surrounding evaluation. It will not create confidence that evaluation is reliable, valid, fair, or useful. It will remain no better than the system, the questionnaire, the people, and the policies that surround it. And it may create massive problems in the areas of confidentiality and privacy issues with resulting faculty resistance and hostility. The bottom line is that rushing into a faster way to do something poorly doesn't benefit anyone.

I have nothing against using technology to make the evaluation process better or more efficient. My colleague Jennifer Franklin and I have installed and operated evaluation/ratings systems at many locations using both paper and electronic methods of data collection. As early as 1985, the initial development of what has become our "TCE-Tools" process was based on using technology as part of a complex evaluation and improvement system (Theall & Franklin, 1990b) designed to provide the most reliable, timely, and acceptable process possible, and to serve both formative and summative purposes. But we predicated our system on a thorough understanding of all of the evaluation literature as well as an understanding of college teaching and learning, technology, data management, teaching improvement, institutional dynamics, and the day-to-day needs of the various stakeholders. Unfortunately, if Hmieleski and Champagne's article represents the totality of their investigation of a major and complex problem, then that investigation and their proposed solution are seriously inadequate.

References

Arreola, R. A. (2000). Developing a comprehensive faculty evaluation system (2nd ed.). Bolton, MA: Anker Publications.

Braskamp, L. A. & Ory, J. C. (1994). Assessing faculty work. San Francisco: Jossey-Bass.

Brinko, K. T. (1991). The interactions of teaching improvement. In M. Theall & J. Franklin (Eds.) Effective practices for improving teaching. New Directions for Teaching and Learning, 48. San Francisco: Jossey-Bass.

Brinko, K. T. & Menges, R. J. (1997). Practically speaking: A source book for instructional consultants in higher education. Stillwater, OK: New Forums Press.

Centra, J. A. (1979). Determining faculty effectiveness: Assessing teaching, research, and service for personnel decisions and improvement. San Francisco: Jossey-Bass.

Centra, J. A. (1993). Reflective faculty evaluation. San Francisco: Jossey-Bass.

Champagne, M. V., Wisher, R. A., Pawluk, J. L., & Curnow, C. K. (1999). An assessment of distance learning evaluations. Proceedings of the 15th annual conference on distance teaching and learning, 15, 85-90.

Cohen, P. A. (1980). Effectiveness of student ratings feedback for improving college instruction: A meta-analysis of findings. Research in Higher Education, 13(4), 321-341.

Doyle, K. O. (1975). Student evaluation of instruction. Lexington, MA: D. C. Heath.

Franklin, J. & Berman, E. (1998). Using student written comments in evaluating teaching. Instructional Evaluation and Faculty Development, 18(1). Retrieved 13 October 2000 from the World Wide Web: http://iaes.arizona.edu/PAGEFILES/ documents/facdev/iefd-18-1.html#Using Students' Written Comments in Evaluating Teaching

Hmieleski, K. H. (2000). Barriers to on-line evaluation. Troy, NY: Rensselaer Polytechnic Institute Interactive and Distance Education Assessment (IDEA) Laboratory. Retrieved from the World Wide Web: http://idea.psych.rpi.edu/evaluation/report.htm

Miller, R. I. (1987). Evaluating faculty for promotion and tenure. San Francisco: Jossey-Bass.

Phipps, R. & Merisotis, J. (1999). What's the difference?: A review of contemporary research on the effectiveness of distance learning in higher education. Institute for Higher Education Policy. Retrieved 13 October 2000 from the World Wide Web: http://www.ihep.com/difference.pdf

Seldin, P. (1999). Changing practices in faculty evaluation. Bolton, MA: Anker Publications.

Seldin, P. (1993, October). How colleges evaluate professors: 1983 versus 1993. AAHE Bulletin, 12, 6-8.

Theall, M. & Franklin, J. (Eds.) (1990a). Student ratings of instruction: Issues for improving practice. New Directions for Teaching and Learning, 43. San Francisco: Jossey-Bass.

Theall, M. & Franklin, J. (1990b). Student ratings in the context of complex evaluation systems. In M. Theall & J. Franklin (Eds.) Student ratings of instruction: Issues for improving practice. New Directions for Teaching and Learning, 43. San Francisco: Jossey-Bass.

Theall, M. & Franklin, J. (Eds.) (1991). Effective practices for improving teaching. New Directions for Teaching and Learning, 48. San Francisco: Jossey-Bass.

Traylor, C. (1992). A comparative analysis of selected criteria used in four-year colleges and universities to evaluate teaching, scholarship, service, and faculty overall performance. Dissertation Abstracts International, 53(5), 1422A.