At some point during the year, most students in U.S. schools will take a benchmark test.
These tests, also called interim assessments, are designed to measure students' progress toward instructional goals and are given every few months. Districts can purchase off-the-shelf versions (MAP, i-Ready, and Star are common tools, for example) or create their own tests.
But even though the vast majority of schools use these assessments, teachers and principals have different perceptions of their effectiveness.
More than 80 percent of principals say that, for the most part, the tests they use align to state standards and the end-of-year state-administered tests used for school accountability.
But about a third of teachers say that these interim tests don't align with the curriculum and don't accurately measure what students have learned. And teachers who report this mismatch are more likely to say that it's difficult for them to figure out how the tests should inform their instruction.
These are the findings of a new RAND report, which analyzed nationally representative survey data from about 1,500 principals and over 6,000 math and English/language arts teachers.
"We know that many education leaders and school systems are working really hard to help kids recover from the effects of the pandemic," said Ashley Woo, an assistant policy researcher at RAND and the lead author on the paper. "Part of that work entails knowing where students are and how to adjust learning to meet students' needs."
The findings underscore an important feature of interim tests: Different assessments are designed to serve different purposes. While some are meant to measure how well students will do on a state summative test, others claim to show whether students have mastered skills and knowledge taught in the curriculum.
Understanding each test's intended purpose is crucial to using the data it produces well, assessment experts have stressed. That's challenging because the commercial marketplace for these tests is a bit of a black box: Not much is known about the technical properties or assumptions that undergird most of the popular off-the-shelf exams.
While this RAND survey measures educators鈥 perceptions of interim assessments, other recent initiatives have tried to offer external evaluations.
Last year, the nonprofit curriculum reviewer EdReports announced that it planned to release reviews of commercially available interim tests, evaluating them for technical quality and alignment to commonly used curricula.
But in May, the group said that the plan had been paused after several providers of the tests wouldn't commit to participation.
How schools use interim assessments
An overwhelming majority of schools, 99 percent, use some sort of interim assessment, the researchers found.
Most principals report using both commercially available tests and ones that are locally created (developed by states, districts, or schools). But elementary and middle schools are slightly more likely to use purchased assessments, while high schools are slightly more likely to use those that were locally created.
Using multiple types of tests was common, according to the principal survey. On average, schools administered three different benchmark assessments in both ELA and math, though the report notes that not all tests would necessarily be given to all students.
This could be because different types of tests are better suited to different purposes, Woo said.
"For instance, you might have something like an i-Ready or an NWEA MAP test, which is more standardized and allows you to compare student progress across different schools and districts because lots of different schools and districts are using these assessments," she said.
Others might measure whether students have learned the skills and knowledge teachers cover in class: think unit tests, term papers, or other major assignments. "Those kinds of assessments that are specifically designed to align with curriculum are the kinds of assessments that are giving teachers more real-time data," Woo said.
The report also tracked test use over time, including during the pandemic. The researchers found small but steady growth in schools鈥 use of commercial assessments between the 2018-19 and 2021-22 school years.
Use of locally created assessments fluctuated: it declined during the 2020-21 school year and then jumped back up in 2021-22. Some of this change could stem from pandemic-related shifts in instruction and assessment, Woo said. Even so, the overall trend in locally created test use across the study period was upward.
Where principals' and teachers' perspectives on interim tests differ
Principals generally thought that interim assessments could gauge whether students were meeting broad instructional goals. More than 80 percent of principals said that the interim tests they used aligned to state standards and end-of-year summative assessments.
The surveys asked teachers about a different metric: curriculum alignment. A majority, 64 percent, said that interim assessments did align with their curriculum. Still, about a third of teachers said the tests only partially aligned with the curriculum, or didn't align at all.
These numbers varied slightly across different groups. For example, ELA teachers were less likely than math teachers to say that their assessments were aligned. Teachers who created their own interim assessments, and teachers who spent more time in professional development analyzing test data, were both more likely to report curriculum alignment.
Many teachers find it difficult to figure out how interim test data should inform their instruction: only 37 percent said that was easy to do, another 37 percent said it was difficult, and about a quarter said it was neither.
It was especially difficult for teachers who felt that the interim assessments weren鈥檛 well-aligned with their curriculum: Among those teachers, just 14 percent said it was easy to address student needs identified by their benchmark tests.
In interviews that the researchers conducted with 45 teachers, several mentioned unclear messaging from leadership about how to use the data that these tests provided.
"This is probably the biggest, most frustrating thing of our district," said one elementary ELA teacher. "They have adopted so many different ways to benchmark our kids, different formats and different resources. … And all of these different tests are all different standards, and they don't align to our curriculum necessarily. … Looking at the data, it's sometimes hard for us as teachers to know which data [are] important, which do I need to use to show that my children are progressing."
Guidance from district and school leaders is necessary, Woo said. Leaders should know whether benchmark tests align to standards, year-end state tests, curriculum, or some combination, and then use that information to convey messages to teachers about how to use them, she said.
"We think it's important for state and local leaders to consider the slate of assessments … and the role that they expect each assessment to fulfill," said Woo.