Corrected: An earlier version of this story misstated the name of the National Council on Teacher Quality.
Bolstered by new research and federal incentives, experts decided about a decade ago that better teacher evaluation was the path to better student achievement. A flood of states started toughening their teacher-evaluation systems, and many of them did it by incorporating student-test scores into educators’ ratings.
And while those policies are still in place in a majority of states, there are signs the tide is turning: Over the past two years, a handful of states have begun reversing mandates on using student-growth measures—and standardized-test scores, in particular—to gauge teacher quality.
Six states—Alaska, Arkansas, Kansas, Kentucky, North Carolina, and Oklahoma—have now dropped requirements that evaluations include student-growth measures and begun letting districts decide what elements to include in assessing teachers, according to analyses from the Education Commission of the States and the National Council on Teacher Quality.
Connecticut, Nevada, and Utah passed policies that require some evidence of student learning but prohibit using state standardized-test scores for that purpose. Florida kept student-growth measures but now lets districts choose how they’re calculated.
Those are all “signals [states] are backing away from the inclusion of student-growth or value-added measures,” said Stephanie Aragon, a policy analyst for ECS.
The changes are due, at least partly, to the increased flexibility that states have under the new federal education law, the Every Student Succeeds Act.
Some analysts, though, say it’s still unclear whether states will follow any sort of trend on teacher evaluation.
“It is playing out and will play out differently across the 50 states,” said Patrick McGuinn, an associate professor of political science and education at Drew University in New Jersey. “The forces at play here are pushing in a couple different directions around teacher evaluation.”
‘Took the Hook Too Quickly’
Teachers’ unions have generally fought the use of test scores in teacher evaluations, particularly when those evaluations lead to decisions about teachers’ tenure, pay, and dismissal.
The American Federation of Teachers and the National Education Association filed more than a dozen lawsuits nationwide between 2011 and 2015 related to teacher-evaluation systems. The AFT briefly championed the slogan “VAM is a sham,” referring to value-added measures, which use complicated algorithms to determine how much a teacher contributed to students’ academic growth over a year.
Arguments against using student-growth measures, as well as hang-ups encountered in implementing such systems, have been persuasive in some states.
“Even for those who ultimately believe in the value of measuring outcomes and even accountability systems somehow linked to outcomes, the eagerness to jump really quickly on value-added measures and test-based accountability was premature,” said Jeffrey Henig, a professor of political science and education at Teachers College, Columbia University. Among some state leaders there is “a recognition … that we took the hook too quickly and too eagerly here.”
At the same time, legislatures and state school boards that pushed for the inclusion of test scores may be reluctant to turn their backs on those policies too soon. In some places, the evaluation systems were just getting going when ESSA was passed in December 2015.
“I don’t think anyone [is] overly enthusiastic to undo something they’d worked four to five years to roll out,” said Michelle Exstrom, a program director at the National Conference of State Legislatures.
In weighing whether to change these evaluation systems, state leaders are likely considering the time, energy, and resources they’ve put into developing them. “There’s a lot of sunk costs there,” said McGuinn. “And that dynamic works to sustain these systems for a bit.”
Annual teacher evaluations were traditionally based on information from a single source: observations from principals.
Starting in 2009, a confluence of factors led to more than two dozen states stiffening their teacher-evaluation requirements. That year, TNTP (formerly the New Teacher Project) published a seminal report called “The Widget Effect,” which found that 99 percent of all teachers were being rated as “satisfactory.” Policymakers and education leaders began questioning the validity of evaluation systems that failed to distinguish among teachers.
The Obama administration began its Race to the Top program toward the end of that year. The competitive-grant program offered states financial incentives to include student-test data in their evaluation systems.
At about the same time, the Bill & Melinda Gates Foundation began pouring millions of dollars into studying teacher quality. (Education Week currently receives financial support from the Gates Foundation for coverage of continuous improvement strategies in education.) The foundation’s high-profile “Measures of Effective Teaching” study was among the largest randomized experiments of its kind, collecting data on 3,000 teachers across six large districts in order to compare different methods for gauging teacher performance.
And then there were the waivers. Starting in 2011, the U.S. Department of Education began offering states relief from some of the stringent requirements in what was then the main federal education law, the No Child Left Behind Act. (Among other provisions, the law mandated that all students perform at grade level in reading and math by 2014.) To get that flexibility—which most states ultimately did—states had to commit to linking student-achievement outcomes to their teacher-evaluation systems.
With all those incentives in place, the number of states using student-growth data in their evaluations skyrocketed, going from just 15 states in 2009 to 43 at the end of 2015, according to NCTQ.
Despite making those commitments, states actually implemented the policies at different speeds. Tennessee, for instance, was an early implementer of this kind of evaluation system—and hit roadblocks right away. Nevada, on the other hand, passed a law in 2011 requiring the use of student-test scores but had yet to start incorporating the data five years later.
ESSA Offers Reprieve
But with the 2015 passage of ESSA, states almost immediately got a reprieve.
The bipartisan law put teacher evaluation back in states’ hands—in essence renouncing the Obama administration’s push for strict test-based accountability.
While six states dropped requirements around using student growth in evaluations, they did so in different ways.
Some, like Arkansas and Kentucky, did so through state legislation. In fact, the National Conference of State Legislatures has tracked bills in 10 states that proposed such changes.
“We are just now starting to see the effects of ESSA on state legislation, and we don’t anticipate seeing the majority of it till next legislative session,” said Exstrom of NCSL.
A few other states, including Alaska, Connecticut, and North Carolina, passed policies backing away from student-growth requirements in teacher evaluations through their state boards of education.
Before ESSA, Kentucky required that student growth be a significant factor in teacher evaluations, and it didn’t allow teachers to receive a high final rating without a high rating on student scores. In 2017, the state passed a bill that allowed districts to decide whether to include student-achievement measures at all.
So far, though, “a lot of districts have chosen not to make huge changes to their systems,” said Robin Hebert, the director of the division of next-generation professionals at the Kentucky education department. That’s likely because they’re waiting for the specific state regulations to be released, which should happen in the spring.
The Kentucky Education Association, the state NEA affiliate, for its part, is pleased with the change.
“KEA has always felt that student growth should not be one of the multiple measures included in the teacher-evaluation system,” the group’s president, Stephanie Winkler, wrote in an email.
However, Elizabeth Ross, the managing director for state policy at NCTQ, which advocates for measuring teacher effectiveness through objective data like test scores, called the Kentucky change “a huge step backward for them.”
In Arizona, Maine, and New Mexico, the legislatures approved bills to undo or reduce the weight of the student-achievement requirement—only to have them vetoed by their governors.
New Mexico, well known to have the toughest evaluation system in the country, has been at the center of the debate. The state education chief recently reduced the student-growth measure from 50 percent of a teacher’s evaluation to 35 percent.
Several other states, including Indiana and Louisiana, have convened task forces to look at the issue.
Changes Ahead?
There’s another way to view the state policy changes so far, some analysts say: There could have been more.
“Some folks thought after ESSA, states would rush to undo this,” said Exstrom.
In another sense, the swing toward including student achievement in teacher evaluations really didn’t change much on the ground.
Just as before, nearly all teachers continued to get positive ratings, even in states that overhauled their evaluation systems, a recent study showed. (New Mexico, where nearly 1 in 4 teachers were rated “ineffective,” is an outlier.) That’s largely because principal observations still make up the bulk of the evaluations nationwide—and principals almost never give bad reviews.
“There was some thought that if you [toughen evaluation systems], a whole bunch of teachers out there will be able to demonstrate they aren’t able to do their job,” said Exstrom. “I think that just hasn’t come to light. ... It hasn’t caused the big traumatic effects some thought would happen.” In fact, many states are more concerned with teacher shortages than they are with evaluation policies, she said.
Even so, the fate of teacher evaluations in many states may be dependent on the 2018 election results.
“Probably the more Democratic victories you see in state legislatures and governorships, the more likely you are to see teacher-evaluation reforms rolled back to one extent or another,” said McGuinn. “What the electoral results are in states and at the national level in 2018—certainly, it matters.”