Over the past several months, Joe Feldman, veteran educator and author of , and I have been discussing equitable grading. We’ve touched on everything from grade inflation to whether this approach can ever really yield higher standards to what it takes for schools to responsibly pursue equitable grading. Today, we talk about what the research says about equitable grading, and Joe delivers a brief history of grading practices.
–Rick
Rick: You’ve mentioned to me that there’s a mismatch between what the research on equitable grading says and the way the practice is regarded by its critics. Can you explain what you mean?
Joe: I get that “equity” has become a polarizing term of late, and I also recognize that there are probably bad or ineffective things happening under the banner of “equitable grading” that have little resemblance to the way that I’ve defined, and many others implement, those practices. But the work of more accurate and fair grading, whether you want to call it “equitable grading,” “standards-based grading,” or even “common-sense grading,” is about creating the conditions for deeper, more rigorous teaching and learning through clearer and more truthful reporting of student progress that doesn’t reward or punish students based on teacher biases or circumstances outside a student’s control.
Taking a critical view of how we traditionally grade can lead to profound and positive changes. The most convincing evidence is from teachers who share their experiences. Here’s one from Nick, a high school physics teacher, who told me, “I’ve told students that the homework, rather than being included in your grade, is your opportunity to practice and to see how well you understand things. Homework completion at first took a dip when I stopped counting it for points.”
But that’s not the end of the story. Before too long, Nick related, “They realized, ‘Oh, I want to get a good grade in this class. I need to understand the material,’ and then homework completion has shot up. It’s the opposite of what I feared would happen. Now they see that the purpose of homework is actually to learn the material.”
Rick: You’ve suggested that common grading practices should be regarded as the product of inertia more than evidence. Can you say more about what you mean?
Joe: Entrenched practices can persist despite compelling evidence for change. When it comes to grading, our long-held beliefs often diverge from the most recent evidence and real experiences of practitioners. This isn’t unique to education: Physicians are famously resistant when long-standing practices are upended by emerging research or new data, even by fellow physicians. One example is the adoption of handwashing in health-care settings. Ignaz Semmelweis, a Hungarian physician in the mid-19th century, discovered the importance of hand hygiene in preventing the spread of infectious diseases. Yet, despite Semmelweis’ findings and evidence, his ideas were initially met with skepticism and resistance from the medical community. It took several decades for handwashing to become widely accepted as a standard practice in health-care settings. I’m not going to claim that equitable grading is the same as handwashing, but I do think that equitable grading practices make grades less likely to be “infected” by teachers’ biases.
I have always approached this work as a dialogue, one where you and I approach it with mutual curiosity and openness. I’ve had many disagreements with skeptics that ended in our realizing that we are interested in the same goals for students, particularly those who have been historically underserved, and that we agree more than we disagree about the benefits of equitable grading once we are clear about what it is and what it isn’t. I recall a Fox News interview where I was paired with a teacher who spoke about her adamant disagreement with “equitable grading.” When she shared her concerns, and I responded with clarifications, rationale, and evidence, she shifted to arguing that the biggest problem is that districts aren’t training all teachers to implement improved grading!
Rick: OK, let’s switch gears. We’ve had a number of conversations about grading as practiced today. Given all that, I’m always curious how we got here. You’ve noted in passing that the contemporary grading system grew out of the Industrial Revolution. Can you spell out what you mean by that?
Joe: Our current grading practices were developed over a century ago and shaped by that era’s beliefs about teaching, learning, and human potential, many of which have since been debunked. In the early 20th century, academics and educators believed intelligence was fixed and distributed across the population along a bell curve, with a few people at the high and low ends and most in the middle. Following the lead of universities, K–12 schools used norm-referenced grading, in which a student’s grade signified their achievement relative to others’ in the course.
Our traditional approach to grading largely stems from the century-old beliefs that too many A’s indicate a weak, easy course and that fewer successful students indicate a rigorous course. That thinking flies in the face of what we now know about academic potential. Grades of A don’t have less value if more students achieve them. Equitable grading reinforces that the size of the bullseye doesn’t get smaller if more people hit the target. Rather, these practices reinforce the goal of great teachers, which is to get as many students as possible to hit the mark.
A second example is that during the Industrial Revolution, animal trials by John Watson and B.F. Skinner supported the belief that humans were most effectively motivated by extrinsic rewards and punishments. This belief underlies the traditional grading practice of using points to incentivize (or, some might say, control) student behaviors, such as coming on time to class or completing homework.
But, over the past few decades, research from and from has demonstrated that this belief has severe limitations, one of which is that extrinsic rewards and punishments often undermine creative thinking and effective problem-solving. And while some might argue that using points to change behavior prepares students for the professional world, there’s no evidence I’m aware of that supports this. For example, there’s no evidence that employees who come on time to meetings do so because their teachers subtracted points for lateness or that employees who are habitually tardy had teachers with more lenient grading policies.
I believe a primary reason Industrial Revolution-era grading persists is that a critical understanding of grading research and practice hasn’t been included in teacher education or certification. For generations, teachers have had little choice but to replicate how they were graded, and many teachers were successful in school and ostensibly weren’t harmed by traditional grading; the reasoning goes something like, “I did fine, so why change anything?” We find that when teachers think critically about this underdeveloped aspect of their practice, they see the urgency to shift their grading to match modern, research-based understandings of student motivation.
Rick: You’ve previously raised the issue of grade deflation, arguing we focus too much on grade inflation and not enough on deflation. I’m not sure what to make of the argument but would love to hear you explain a little more. Can you expand on what you have in mind?
Joe: Let’s start by clarifying what we mean by grade inflation. Grade inflation occurs when a student’s grade is higher than their actual understanding. When grades are inflated, that student, their parents, college-admissions officers, and others are told that the student is prepared for a certain level of academic challenge when they actually aren’t. This inaccurate grade can have significant consequences, such as requiring unanticipated remediation, which, in college, can make students less likely to graduate on time, if at all.
Grade inflation has received particular attention since the pandemic. Interestingly, research by of the Fordham Institute published in 2018, before the pandemic, found that grade inflation was worse in schools attended by higher-income students, while after the pandemic suggests that, more recently, there has been a disproportionate increase in the grade inflation of students of color and those from low-income families.
Grade deflation (and I have come to believe a more useful term might be “grade depression”) occurs when a teacher-assigned grade is lower than a student’s understanding of course content. Grade depression can be even more harmful than grade inflation. Unlike grade inflation, which opens doors to an opportunity a student is not prepared for, grade depression prevents students from pursuing opportunities, like advanced coursework or postsecondary opportunities, that they are fully prepared for.
We know that traditional grading practices can cause grade inflation and grade depression due to their reliance on the common practice of combining a student’s academic and nonacademic performance in their final grades. This practice renders grades inaccurate and unreliable. The student who doesn’t know the content particularly well but compensates for that weakness by following all class rules earns an inflated grade. On the other hand, the student who has an excellent understanding of the content but doesn’t adhere to all class rules receives a depressed grade. The student with an inflated grade is able to conceal the truth of their deficient academic understanding by pleasing the teacher, and the student with a depressed grade has their excellence hidden.
In a forthcoming paper by the Equitable Grading Project, my co-authors and I compare the teacher-assigned grades of secondary students from multiple states and districts with their corresponding standardized-test scores. The findings revealed a striking mismatch between grades and test scores. Of course, this could be caused by a host of reasons related to the weaknesses of standardized testing. However, we found that when teachers deviated from traditional methods of grading and used improved, more equitable grading practices, grade-test score consistency (i.e., the similarity between grades that teachers assign and test scores) increased, meaning that the use of those practices reduced both grade inflation and grade depression. These results match what we in 2018.
There’s a lot to excavate about the forces influencing grade inflation and grade depression, but we can be confident that equitable grading practices dampen these forces and make grades more accurate and fairer.