Association of American Law Schools

Conference on New Ideas for Experienced Teachers

June 9–13, 2001
Calgary, Alberta, Canada


  THE PROBLEM WITH TRADITIONAL ASSESSMENT IN LAW SCHOOL

(From MUNRO, OUTCOMES ASSESSMENT FOR LAW SCHOOLS [2000])

Examination practices in law schools are so uniform that one can fairly generalize when describing them.1 The typical law school evaluates students by giving, at the end of the course, a single examination that purports to test the material covered during the semester.2 The primary form of examination is hypothetical essay questions and, less often, multiple-choice questions3 whose resolution focuses almost entirely on application of judicial doctrine.4 The blue book examinations impose severe time constraints5 averaging 2 to 4 hours per course6 or longer for take-home exams. Nickles found that faculty took five weeks to grade exams and return grades to students,7 and, while there were exceptions, faculty generally provided no "post mortem" in which feedback was given to the class or student.8 Anecdotal evidence suggests that students often find faculty unwilling to spend time with them to discuss individual exams. Faculty fail to establish explicit standards for performance in advance of the examination, and, if learning objectives exist for a course, they are not given to students.

Law school classes are generally graded, although some courses are pass/fail. In his 1977 study, Nickles found institutional objective criteria for grading to be generally nonexistent, ensuring a lack of consistency in grading.9 He attributed the absence of grading standards to the failure of law schools to define professional expectations for students, that is, student outcomes.10 There is no evidence today that would justify asserting that there has been any change in the past two decades. Instead, it appears from a 1996 survey that 84% of law schools have coped with the problem of lack of institutional objective grade criteria by artificially adopting some form of grade normalization or grade curve to control distribution of grades.11 Nevertheless, as part of the system, faculty rank students academically,12 and a host of benefits depend on the grades.13

Law school examinations may appropriately be characterized as summative, meaning that they measure student learning after the fact, but are seldom used as a diagnostic tool or instructional device for student learning during the course.14 Formative evaluation processes in which the students perform tasks, are evaluated, are provided feedback, and learn at the same time are rare in law school, possibly because of large class sizes. Notable exceptions exist, however, in clinical education, legal writing, professional skills simulations, and competitive moot court, client counseling, trial, and negotiations teams.

Exceptions aside, law school assessment of student learning not only relies on a narrow band of evaluation, but also makes restricted use of it. It is common in undergraduate education for students to perform a variety of tasks, each of which might in some way be evaluated. For example, students in a business course may take weekly quizzes, make a classroom presentation, work with a partner to formulate a business plan, write a report and take a final exam. An architecture student may submit multiple drawings, design and build models, make a classroom presentation and take a final exam. In those cases, the students' final grades in the course represent a synthesis of the multiple evaluations of their performance. By contrast, the law school final examination is the only formal evaluation opportunity; it is a "do or die" examination resulting in a course grade. Hence, the examination, a single means of evaluation, is so subsumed in the grade, a symbol, as to defy any separate purpose, such as a learning tool.15 Naturally, in light of the summative character of the process and the importance of grades, no provision exists for allowing a student to prepare and perform again to reach competency.16

The irony of legal education's choice of the blue book essay exam as its primary means of evaluation is that the instrument itself lacks a sound basis in educational or assessment principles. Legal educators who have subjected the essay or blue book exam to critical analysis during the last seventy-five years have roundly criticized it.17 After studying the examination process and philosophies of faculty at Columbia Law School in 1923,18 Professor Wood reported that "[i]n general, the method of deriving grades [was] characterized by extreme subjectivity"19 and said of the essay exam that "the evidence is so strong that we may almost say with finality that the traditional prose examination, singlehanded and alone, is inadequate for the requirements of modern educational administration."20

After Nickles' 1977 survey of evaluation techniques at every law school in the nation21 disclosed the single course essay exam to be the convention in the schools,22 he asserted that the "typical process of evaluation in our law schools is composed of procedures and techniques which have been discredited by research in education and psychology."23 He concluded that "legal education has paid insufficient attention to the problems and issues of student evaluation in American law schools."24

Janet Motley, in 1985, summed up the assessment of the blue book examination as the core of the evaluation tradition in law schools:

[L]aw school exams, as now administered and used, fail to serve an educational purpose and may be counter-productive to our educational goals, particularly the goal of teaching our students to learn from experience. Furthermore, the present system is guaranteed to create artificial categories and classes of students, which, in turn, stigmatize and dramatically affect their lives. All this, in the light of strong evidence indicating that the exam does not even do a satisfactory job at assessment, calls for a reevaluation of our perpetuation of tradition.25

Phillip C. Kissam, in his exhaustive 1989 analysis of the political and social context of blue book examinations, recommended that law schools make multiple changes in the nature and content of their examinations.26 Although he concluded that the blue book examination system tests for several complex attributes,27 he found that it was damaging to "effective and democratic legal education,"28 and that it promoted what he called "good paragraph thinking" and a rules/facts approach to law.29 He concluded that "the adverse effects of the current examination system may be unnecessary."30

Finally, Douglas Henderson, in the latest analysis of the law school essay exam, declares it "psychometrically unsound,"31 lacking in the precision and accuracy for the function it purports to perform,32 inconsistently scored, and unreliable.33 Unfortunately, the problems of this system are compounded because the exams are used to rank law students, a practice that Henderson finds deleterious and of little benefit,34 a view shared by proponents of quality management in law schools.35


  1. See Nickles, supra note 88. Nickles distributed a questionnaire on law school examinations to the deans and student bar association presidents at every American law school. Predictably, he was able to describe the "typical American law school evaluation process" after analysis of the results. While there has been significant change in some courses and at some schools, the author believes that Nickles’ observations of 1977 are, with exceptions, still valid today.
  2. Id. at 432.
  3. Steve Sheppard, An Informal History of How Law Schools Evaluate Students, with a Predictable Emphasis on Law School Final Exams, 65 UMKC L. Rev. 657 (1997).
  4. Kissam, supra note 277, at 439.
  5. Id. at 438.
  6. Downs & Levit, supra note 178, at 822.
  7. Id. at 426.
  8. Id. at 438; see Kissam, supra note 277, at 471.
  9. Id.
  10. Id. at 423.
  11. Downs & Levit, supra note 178, at 836.
  12. Barbara Glesner Fines, Competition and the Curve, 65 UMKC L. Rev. 879, 886 (1997).
  13. Id.
  14. Deborah Waire Post, Power and the Morality of Grading-A Case Study and a Few Critical Thoughts on Grade Normalization, 65 UMKC L. Rev. 777, 784 (1997).
  15. Nickles, supra note 81, at 413.
  16. Id. at 436.
  17. Kissam, supra note 277; Motley, supra note 277; Nickles, supra note 81; Ben D. Wood, supra note 85 (pts. 1-3).
  18. Harlan F. Stone, Foreword to Wood, supra note 85 (pt. 1).
  19. Wood, supra note 85 (pt. 1), at 224.
  20. Id. at 225-26. Wood also succinctly stated his basis for damning the essay exam:
    To many professors, to some even who have not been influenced by the large masses of evidence against the traditional subjectively scored examinations, the spectacle of a student trying to record an adequate sampling of his gains from a four-hour course of several months' duration in the English prose which he can produce in three hours under the conditions and circumstances of college examination week, and the correlative spectacle of the college professor passing judgment on that student on the sole basis of the product of those three hours of writing, seem, on a priori grounds alone, quite incompatible with current ideals of educational measurement and administration.
    Id. at 226 (emphasis in original).
  21. Nickles, supra note 81.
  22. Id. at 432.
  23. Id. at 412.
  24. Id.
  25. Motley, supra note 277, at 723-24.
  26. Kissam, supra note 277.
  27. Id. at 435. Kissam concedes that the blue book exam tests for "ability to internalize legal doctrine; . . . 'issue spotting' or . . . 'legal imagination'; and a 'legal productivity' [at drawing legal conclusions by applying rules to legal issues, and] . . . capacity for self-study and self-learning in diffuse, complex, and uncertain situations over sustained periods of time." Id.
  28. Id. at 436.
  29. Id. at 437.
  30. Id.
  31. Douglas A. Henderson, Uncivil Procedure: Ranking Law Students Among Their Peers, 27 U. Mich. J.L. Ref. 399, 407 (1994).
  32. Id.
  33. Id. at 409-11.
  34. Id. at 423-30; 411-18.
  35. Mixon & Otto, supra note 56, at 441.