It's a spring ritual: Every year in the U.S., millions of schoolchildren take standardized state tests meant to give a sense of how well their states, districts, schools, and even teachers are helping them learn.
Another sampling of students takes the National Assessment of Educational Progress, or NAEP, better known as the Nation's Report Card. Those results, released periodically, fill in the gaps to show how students in a particular state are performing relative to their peers in other states.
That's how accountability and assessment have worked in the United States at least since the advent of the No Child Left Behind Act back in 2002 and continuing with its replacement, the Every Student Succeeds Act of 2015.
And in fact, NAEP and Advanced Placement tests are prime components of the Quality Counts Achievement Index, which grades and ranks states in this politically fraught category.
The United States is unique among countries in subjecting students so often to standardized tests, but as testing experts note, the resulting deluge of data comes with significant trade-offs on exam quality. And despite a few innovations under ESSA, plenty of them also wonder whether the road not taken might have produced a more nuanced and useful, if less frequent, trove of information.
Testing every student every year is a costly prospect, said Marc Tucker, the president and CEO of the National Center on Education and the Economy, a research and policy organization in Washington. Tucker's research has focused on the policies and practices of the countries with the best education systems.
And the expense means that the tests are often lower quality than tests used in other countries, and a poor gauge of the higher-order critical thinking skills that students need in college, the workforce, and life, he added.
"We've made it virtually impossible to have the quality of tests that other nations that are far ahead of us are using to determine how well their own kids are doing," Tucker said. "So what we've done is to deprive ourselves of tests that will enable us to measure the things that are the most important about whether or not our kids are going to be ready for what's coming. That's a very poor trade. A very poor trade."
By contrast, very few of the highest-performing countries test students every year, Tucker said. And when they do test, they often use deeper assessments that include performance tasks or writing prompts, giving educators a richer understanding of what students know and are able to do.
Singapore, for example, outperforms the U.S. on international measures such as the Program for International Student Assessment, or PISA, in average reading, math, and science performance. It tests students only about three times in the course of their school careers: once at the end of elementary school, once in middle school, and once in high school, Tucker said.
Focus on Equity
But there are also advantages to the American system. One huge plus: Testing every student every year gives policymakers and educators a sense of how different demographic groups are doing relative to each other within the same school, said Randy Bennett, a research chair at the Educational Testing Service, or ETS, a testing company that administers the SAT and other assessments and has assisted with NAEP.
After the passage of the NCLB law, "we could know for the first time that a very good school was not performing so well when you looked at some of its demographic groups," he said. "If you care about equity, it's a strength" of the American system.
But Bennett, like Tucker, acknowledged that testing students frequently can lead to lower-quality assessments.
Catch up on how the nation and states fared on a broad range of K-12 categories, including school finance, as reported in this year's first installment of Quality Counts, published Jan. 17.
One "consequence of having to test all students is that you want to do it as efficiently [as possible], which takes you down the road of tests that don't necessarily reflect the kind of tasks or at least the kind of breadth and depth and processing that you would like to use in teaching," he said.
That's been a big frustration for Matthew Blomstedt, the commissioner of education in Nebraska. He understands the need for data and wants to hold schools accountable. But he's worried that concentrating on end-of-the-year test scores leaves a lot of other factors that could contribute to student achievement by the wayside.
"It's not that math and reading scores don't matter, because they do," Blomstedt said. "But we've put so much focus on those things and used them as the driving force of how we're making educational decisions in this country."
He said he's visited Native American tribes in Nebraska and offered up math and reading coaches, when what they really needed were counselors for their students.
Blomstedt is exploring how Nebraska can use data from interim assessments, which he said focus on a broader range of skills that better reflect what students are actually learning in class, to better inform accountability. That's something that's allowed under ESSA, but few states have taken advantage of it.
ESSA also gives states a chance to move beyond the kind of fill-in-the-bubble tests that have drawn sharp criticism from educators and some experts. Up to seven states can opt to participate in a pilot program allowing them to try out new kinds of tests in a handful of districts, with the goal of eventually taking the tests statewide.
But there are significant hurdles to participating, including showing that these new assessments are "comparable" to state tests. So far, only two states, Louisiana and New Hampshire, have applied for the pilot.
Previous efforts to improve the quality of state tests have had mixed results.
In 2010, the Obama administration funded the development of tests meant to provide a richer sense of student skills. The resulting PARCC and Smarter Balanced tests were still in use by 21 states during the 2016-17 school year, according to an Education Week review.
But many states have dropped those tests, particularly at the high school level, in part because of complaints from parents that they were too long or took up too much classroom time.
Picking a yardstick for student achievement from among the jumble of measures can be tricky and subjective, and can leave out factors that some would argue are crucial to a fully rounded picture of student achievement.
Indexing Achievement
The Education Week Research Center, for instance, considers 4th and 8th grade NAEP scores, graduation rates, Advanced Placement scores, and gaps between students in poverty and their peers in determining which states have the highest achievement.
The index looks at both current achievement and growth over time in those measures. In crafting the index, the research center put a premium on metrics that would be continually updated and are common across all 50 states.
Those indicators may be missing some nuances, but they have "value because they provide an apples-to-apples comparison," said Sterling Lloyd, the assistant director of the research center. And he said it "rewards states that have made progress on some important measures." That's especially key because it gives state policymakers a chance to evaluate their strengths and weaknesses against other states, Lloyd added.
The Achievement Index couldn't capture high school performance, other than AP test results, because 12th grade NAEP scores are not available at the state level. It also could not include things like real-world problem-solving skills, teamwork, and collaboration, because standardized tests don't measure them very well.
And a test score can鈥檛 tell policymakers and educators everything they need to know about whether a particular student is actually improving, said Leslie Rutkowski, an associate professor of inquiry methodology at Indiana University in Bloomington who specializes in educational measurement.
"As long as we are using it as one piece of evidence in a broader profile, then fine," she said. "But if that is the one thing that we're using to make decisions, I think that's very risky and prone to error. We're just not that good at testing."
She noted that some other countries' assessment systems carry high stakes for students in a way that tests in the U.S. don't.
In Germany, for example, students take a test at about age 18 to determine if they can pursue higher education or an apprenticeship program. By contrast, in the U.S. a student can perform poorly on college entrance exams such as the ACT or SAT and still get a four-year diploma, although that student's college choices may be more limited.
To be sure, ESSA requires states to look beyond test scores and graduation rates in gauging student achievement. More than 30 states have included chronic absenteeism, or some measure of attendance, in their accountability plans. And at least 35 have included some kind of "college and career readiness" indicator.
Academic Measures Still King
Nevertheless, academic measures, including test scores, graduation rates, and English-language proficiency, are still king under the law.
In the U.S., policymakers tend to think of student achievement and school quality as one and the same. If a school shows growth in test scores, or has high scores, it is rewarded. Schools where children are underperforming, or slipping, are targeted for extra help. But there are problems with linking school quality and student achievement, Rutkowski said.
"Those conversations are conflated," she said. "We can't reasonably attribute the performance of the school or assign the performance of a school to an individual. You can well have high-performing students in poor schools and low-performing students in great schools."
Other countries, such as the Netherlands, are more holistic in their approach to gauging school quality, said Daniel Koretz, a professor of education at Harvard University and the author of The Testing Charade: Pretending to Make Schools Better.
Dutch schools are allowed to choose their own standardized tests, although in practice most use the same exam, he said. But they are also subject to intensive inspections. (Some U.S. states, including Vermont, are trying to emulate that approach, as Education Week has reported.)
In the Netherlands, people "will often get test scores in the context of a [school] inspection report that has a lot more information," Koretz said. "They don't shy away from the fact that you need human judgment to have a complete evaluation of schools."