States are going to give standardized tests for this school year. But what's the best way to report scores from those tests in a way that's useful, highlights students with the greatest needs, and isn't fundamentally misleading?
An assessment expert thinks he can help, and he's shared his proposal with the nation's state schools chiefs. A big part of his approach involves thinking about the tests less like traditional exams and more like a census. Having data systems that have tracked individual students over the last five years is another major piece of it.
But the plan from Andrew Ho, a professor at the Harvard Graduate School of Education, doesn't have neat solutions for all the potential obstacles involved in testing this year. And of course, it might not win over many doubters whose concerns about the tests extend beyond his pitch.
The U.S. Department of Education announced Feb. 22 that it is not entertaining requests from states to cancel standardized tests for this school year mandated by the Every Student Succeeds Act, despite a push from some states to do so. Not all states have given up hope of nixing these tests or replacing them in some way. But with the pandemic's disruptions, many states will still confront significant challenges for testing.
One of the biggest will be how to report scores in a way that's meaningful but doesn't ignore remote learning, students who've largely or entirely vanished from school, testing opt-outs, and other issues. In short: There's a lot of doubt out there about putting much if any stock in these scores.
With these issues in mind, Ho has pitched his ideas about which figures states should report and how to get them from the standardized exams. Late last month, after the Biden Education Department's announcement, he presented his plan to a collaborative on testing at the Council of Chief State School Officers.
"We know the problems. The problems are substantial. They're massive," Ho said in an interview. "There is no way to interpret scores this year like it's business as usual without massive misinterpretations and technical flaws."
At the same time, Ho stressed, the problems shouldn't seem so daunting that nobody tries "to give technical solutions and to minimize those flaws."
Equity, trends, and matches are key
There are three main elements of Ho's proposal.
1. The first part is to report the percentage of students from this year's state testing who have comparable previous test scores. Indeed, he stressed that this match, and not the scores themselves, should be the first issue states focus on when reporting testing data. This would mean looking at which students took the tests two years ago, and seeing whether they took the tests for this academic year. Remember: Last year states canceled their tests en masse, so there aren't test scores from that time.
So, for example, states would report the percentage of students who, two years ago, took the tests in the 3rd grade and also took the tests in the 5th grade this year. This would require state data systems to track individual students.
Ho says that this amounts to conducting an educational census that would help states sort students into two important groups: one for which the state has comparable test score data, and another for which it doesn't.
"It instantly divides your attention into two deserving groups," he said, calling this piece of his idea the "match rate."
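Mechanically, the match rate is simple set arithmetic over student IDs. A minimal Python sketch, with invented IDs standing in for a state's longitudinal records, and taking the earlier tested cohort as the denominator (one plausible reading of the proposal):

```python
# Hypothetical sketch of the "match rate": the share of students tested two
# years ago who can be linked to a score this year. The IDs are invented;
# a real state would pull them from its longitudinal data system.

def match_rate(tested_2019: set, tested_2021: set) -> float:
    """Fraction of the 2019 cohort with a comparable 2021 score."""
    if not tested_2019:
        return 0.0
    return len(tested_2019 & tested_2021) / len(tested_2019)

# Five 3rd graders tested in 2019; three appear among this year's
# 5th-grade test-takers, so the match rate is 60%.
grade3_2019 = {"s01", "s02", "s03", "s04", "s05"}
grade5_2021 = {"s02", "s03", "s05", "s09"}

matched = grade3_2019 & grade5_2021    # group with comparable data
missing = grade3_2019 - grade5_2021    # the second "deserving group"
print(f"{match_rate(grade3_2019, grade5_2021):.0%}")  # prints "60%"
```

The same intersection that yields the rate also yields the two groups Ho describes: matched students, and students the system has lost track of.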
2. The second part would focus on the students for whom there is comparable test score data from two school years ago.
Ho proposes that for those students, states find their previous "academic peers." In other words, states would identify students from 2017 and 2019 who performed at similar levels on the exams. Then they would study how the 2017 group performed on 2019 tests, and how the 2019 group performed on 2021 tests. (This is why Ho's plan relies on states having longitudinal data systems and "stable" testing systems dating back to the 2016-17 school year.)
This method would help people determine the extent to which the pandemic has affected students' academic progress, compared to similar students from before COVID-19. Ho calls this a "fair trend" approach.
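As a rough illustration of the fair-trend logic (not Ho's actual statistical model), one could group the earlier cohort into score bands, treat each band's later average as the "academic peer" baseline, and compare this year's students against it. The banding rule, scores, and IDs below are all invented:

```python
from collections import defaultdict

# Hypothetical sketch of a "fair trend" comparison. Ho's memo leaves the
# exact peer-matching method to states; simple score bands stand in here.

def peer_baseline(start_scores: dict, end_scores: dict, band: int = 10) -> dict:
    """Mean follow-up score for each starting score band (the 'academic peers')."""
    buckets = defaultdict(list)
    for sid, score in start_scores.items():
        if sid in end_scores:
            buckets[score // band].append(end_scores[sid])
    return {b: sum(v) / len(v) for b, v in buckets.items()}

# The 2017 cohort and its 2019 outcomes define the pre-pandemic baseline...
baseline = peer_baseline({"a": 212, "b": 218, "c": 231},
                         {"a": 238, "b": 242, "c": 255})

# ...then a 2019 student's 2021 score is compared with peers who started
# at a similar level before the pandemic.
start_2019, score_2021 = 214, 229
expected = baseline[start_2019 // 10]   # peers who started in the 210s
print(f"gap vs. pre-pandemic peers: {score_2021 - expected:+.1f}")  # -11.0
```

The sign and size of that gap, aggregated across students, is what would indicate how far the pandemic pushed progress off its pre-COVID trend.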
3. The third part focuses on students who don't take the tests this year and whom the system has lost track of, or what Ho calls the "equity check."
For those students, Ho proposes looking at their scores from 2019, identifying their academic peers from 2017, and then looking at how that peer group scored on the 2019 tests.
Ho admitted that this third piece of his plan "requires the most guesswork," but could still tell a meaningful, descriptive story. Yet he also said this "equity check" would probably paint a best-case picture of where these missing students stand. Why? Because it "assumes academic learning rates for those who went missing from 2019 to 2021 are the same as those in 2017 to 2019," Ho wrote.
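In code terms, the equity check amounts to projecting a missing student's 2021 standing from the 2017-to-2019 growth of similar students, which is exactly why it is a best-case estimate. A hypothetical sketch, with an invented score-band baseline:

```python
# Hypothetical sketch of the "equity check" for students missing from 2021
# testing. A missing student's 2021 standing is *estimated* as whatever
# students with a similar 2017 score went on to earn in 2019. As Ho notes,
# this assumes pandemic-era learning rates matched pre-pandemic ones, so it
# likely overstates where missing students actually stand. Numbers invented.

def equity_check(score_2019: float, baseline: dict, band: int = 10) -> float:
    """Projected 2021 score for an untested student, from 2017->2019 peer growth."""
    return baseline[int(score_2019) // band]

# Baseline from the 2017 cohort: students scoring in the 210s in 2017
# averaged 240.0 on the 2019 test; students in the 220s averaged 248.0.
baseline_2017_to_2019 = {21: 240.0, 22: 248.0}

missing_student_2019_score = 217
print(equity_check(missing_student_2019_score, baseline_2017_to_2019))  # 240.0
```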
Concerns about the tests will persist
Ho's memo, which includes technical details about things like students who leave schools between academic years, does not directly address state tests for high schools. States are required to give exams in certain subjects once during the high school grades, although federal law doesn't mandate a particular grade for those exams.
In his memo, Ho advises states to "keep it simple" lest they "risk the public trust on what appears to be a black box." In the interview, he acknowledged that when it comes to thinking about the data, "It is better to be complicated than wrong."
The goal for this year, Ho stressed, should be to assess not just student needs, but where the pandemic has had the most dramatic effects: "What we don't know is who needs disproportionate support this year compared to previous years."
What about states that give tests in the fall? The Education Department told states they could administer 2020-21 tests outside the typical testing window, such as in the fall of this year. Yet Ho says he doesn't think his approach would work for such tests because there aren't past fall tests to use as an appropriate baseline. In addition, he said, things like extended learning opportunities (think summer school) that states might provide over the next several months, or, conversely, any sort of "summer slide," make fall testing a dubious proposition in general.
What about parent opt-out? The prospect of large numbers of parents keeping their kids home (or otherwise away) from the state tests has raised a host of concerns about who will be tested and how results from those tests could be misinterpreted and misleading. What if many high-achieving students take the tests, for example, but their counterparts don't?
Ho said that's the sort of distortion his first measure, the "match rate," is meant to counteract, because it emphasizes that this is an atypical year and would clearly demonstrate that in some cases large shares of students didn't take this year's test.
What about tests administered remotely? Ho acknowledged that remote administration of exams presents a "serious risk" to the comparability of scores. But he leaves it to ongoing or "post hoc" research to decide whether comparing in-person and remote test scores can be done fairly. And he says that in either case, the metrics he's proposing can still be useful.
Not surprisingly, opponents of testing for this year have a list of concerns not addressed by Ho's plan.
In an Education Week opinion piece from December, Lorrie Shepard, a distinguished professor at the University of Colorado's school of education, said that if officials considered all the downsides, from parent sentiment to logistical problems, they would cancel most if not all of the state tests.
In an interview, Shepard said that Ho's key suggestions for things like creating a "fair trend" are good ones from a statistician's point of view. But she said that's no guarantee states will handle the scores fairly.
"We are all aware that when things are posted publicly, it is usually the simplistic interpretation that most people rely on," Shepard said. "There will be misinterpretations and there will be misuse."
High opt-out rates still pose problems for Ho's approach, she noted, even though it tries to account for them.
Shepard also stressed that pre-pandemic data about other things, such as access to internet-connected devices, could help direct resources more efficiently than test scores.
Ideally, test scores will "serve a tertiary purpose," Ho said. Measures of students' physical and mental well-being, as well as things like local assessments, should also be important considerations for state leaders.
Still, "screaming from the rooftops with bad data" about tests would be not just unhelpful but damaging, Ho stressed.
"When would anybody trust you again?" Ho asked.