Amid battles over teacher quality and school restructuring, there's one thing everyone seems to want in the next version of the Elementary and Secondary Education Act: an accountability system that measures student growth.
Yet the results of the U.S. Department of Education's growth-model pilot program, whose final evaluation was released earlier this year, suggest lawmakers may have to do some heavy lifting to include growth in accountability. Not only do state growth models vary considerably, but they also play out in ways that can run counter to the aims of providing greater transparency and better accountability for all students, not just those "on the bubble," or just below the passing cutoffs on their state exams.
"It seems to me there is a serious disconnect between the rhetoric supporting growth models and the incentives and structures they end up creating," said Andrew D. Ho, an assistant professor at the Harvard Graduate School of Education and a co-author of the federal growth-model pilot program evaluation.
Daria Hall, the director of K-12 policy development for the Education Trust, a Washington research and advocacy group, and one of the original peer reviewers for state growth-model proposals, agreed that some of the rhetoric in favor of the concept has not been supported by the data.
"While there are certainly students who are not proficient but are making big learning gains, there's not nearly enough of them and not nearly as many as folks hoped or assumed that there were," Ms. Hall said, "and that's a real problem."
Department's Plan
More than half of states are already using or developing their own growth models, and incorporating growth into the next federal accountability system has become one of the most often-requested changes to the ESEA, whose current edition, the No Child Left Behind Act, was signed into law in 2002. Proposals to use growth models have the support of 15 national education organizations, including groups representing state schools chiefs, legislatures, governors, and school boards, as well as the National Education Association and the American Federation of Teachers.
[Chart: When the U.S. Department of Education undertook a pilot project to evaluate growth models for judging whether schools are meeting their performance targets under the federal No Child Left Behind law, most of the participating states saw gains. SOURCE: U.S. Department of Education, Office of Planning, Evaluation, and Policy Development]
Growth-based accountability is also a centerpiece of the Education Department's vision for the ESEA reauthorization. Secretary of Education Arne Duncan told House education committee members at a hearing last month: "[W]e mean a system of accountability based on individual student growth, one that recognizes and rewards success and holds us all accountable for the quality of education we provide to every single student in America."
"This is a sea change from the current law, which simply allows every state to set an arbitrary bar for proficiency and measures only whether students are above or below the bar," he added.
Growth models have gained popularity because, supporters say, they provide a more nuanced picture of how students are progressing academically and what schools contribute to their learning.
"Simply counting the percent proficient is not a very good way to evaluate a school," said Peter G. Goldschmidt, a senior researcher at the National Center for Research on Evaluation, Standards, and Student Testing at the University of California, Los Angeles, who has studied growth models. "You want to see how schools are facilitating learning, for which you need to look at individual kids."
Former Education Secretary Margaret Spellings started allowing states to experiment with growth models in 2005, via a pilot initially limited to 10 states. Each state had to tie growth to the existing annual proficiency targets for math and reading under NCLB, rather than setting different expectations for students based on their backgrounds or their schools' characteristics.
Ohio an Outlier
The Education Department evaluated only the states in the original pilot: Alaska, Arizona, Arkansas, Delaware, Florida, Iowa, North Carolina, Ohio, and Tennessee. Of those, only in Ohio did the growth model make a big difference in the number of schools that made adequate yearly progress: More than twice as many Ohio schools made AYP by showing their students grew academically as by increasing the percentages of students who hit proficiency targets in 2007-08. Evaluators found, however, that Ohio uses a much more inclusive definition of students' being on track than other states do. For the rest of the pilot states, growth alone accounted for a mere 4 percent of schools making AYP.
"It's not surprising then that the growth model didn't have much effect," Mr. Goldschmidt said. "There were a whole slew of adjustments … and when you took all of these, the growth model really became just another adjustment to how you count the percentage [of students] proficient."
Aside from Delaware, states do not use growth as a primary accountability measure for all students, the evaluation shows. Instead, schools are judged first on the basic proficiency status of each student group, and then through the use of those other "adjustments," such as a confidence interval to correct for changes in group size from year to year, or the federal law's "safe harbor" provision, which credits a school for cutting its share of non-proficient students by 10 percent or more from the previous year.
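As a rough illustration (not drawn from the evaluation itself), the safe-harbor test reduces to a simple relative-change check; the sketch below ignores the provision's secondary conditions, and the percentages are invented:

```python
# Back-of-the-envelope safe-harbor check: under NCLB, a student group
# qualifies if its percentage of non-proficient students drops by at
# least 10 percent (relative) from the prior year. Secondary conditions
# are omitted here, and the numbers are invented.

def meets_safe_harbor(pct_not_proficient_last: float,
                      pct_not_proficient_now: float) -> bool:
    """True if the non-proficient share fell by at least 10 percent."""
    return pct_not_proficient_now <= 0.9 * pct_not_proficient_last

print(meets_safe_harbor(40.0, 36.0))  # True: 40% -> 36% is a 10% cut
print(meets_safe_harbor(40.0, 38.0))  # False: only a 5% cut
```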
The pilot evaluation found more than 83 percent of schools made AYP by one of those standard measures, meaning growth didn't figure much into the accountability picture.
Moreover, growth models in most cases didn't hold schools accountable for high-achieving students found to be falling off track. Some states, like Colorado and Tennessee, require schools to base growth accountability on whether students are on track to meet their ultimate target, regardless of their current performance. Most states base growth accountability only on the students who are not on grade level now, which can obscure future problems, Mr. Goldschmidt said.
'Rube Goldberg' Models
Damian W. Betebenner, an architect of the Colorado model and an associate at the National Center for the Improvement of Educational Assessment, in Dover, N.H., agreed: "In my original forays into this, the [status] accountability system was often indistinguishable from the growth model. It could be a Rube Goldberg machine that kind of led you to the 'yes AYP/no AYP' decision, but there was nothing else in it."
Colorado's growth model, which has become a template for at least 15 other states, reports growth and status measures for all students, and Colorado is the only state that allows parents as well as educators to use its database to look at a student's predicted achievement over various time frames, to understand how quickly the student will need to advance.
"You have some students starting 500 miles away from the destination and expect them to get to the destination in two hours; we can calculate that out and know they aren't going to get to the destination by driving," Mr. Betebenner said. "So we need to look at other ways to get them to the destination, or we need to consider that it will take them more time to get there."
Yet the implementation differences among states may make it harder for policymakers to glean best practices for including growth measures in the next ESEA.
"Even if we all had the same growth model, we could very well end up in a case where one state sees vast differences [in the number of schools making progress] and other states do not," Mr. Ho said.
Growth models have become popular in spite of the pilot's lackluster results. The Education Department opened the pilot to all states in 2007, and by 2010, according to a report by the Council of Chief State School Officers, 17 states had implemented a growth model and another 13 were developing one.
Not Value-Added
While they can at times rely on the same testing and other data, state growth models differ from the "value added" models that have attracted attention as a tool for evaluating teachers.
State growth models have myriad permutations, but they fall into three basic categories, each illustrated with a brief sketch after this list:
• The trajectory model, used by Colorado and Tennessee, among other states, is what most people think of when they envision growth: It takes the gap between the student's base test score and the proficiency target, usually three or four years out, to calculate how much the student must progress each year. A student who isn't on grade level this year, but whose prior test scores show he or she will reach proficiency within the allowed time frame, would be considered on track.
• A few states, such as Delaware, use a transition matrix. Rather than using test-score gaps, it tracks how students move across a matrix of performance benchmarks, such as from the below-basic to the basic level.
• States like Ohio use a regression model, a statistical formula that predicts a student's likely achievement by comparing his or her test scores over several years with those of a representative cohort of students and then projecting the result out to the proficiency target. It can feel counterintuitive to teachers, Mr. Ho said, because it gives no weight to increasing test scores. "It basically says, if you have high scores now, but most of your scores are low, [the high score] is an anomaly," he said.
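To make the trajectory calculation in the first bullet concrete, here is a minimal sketch; the score scale, cut score, and time frame are hypothetical, and real state formulas vary:

```python
# Hypothetical trajectory-model check: split the gap to proficiency
# evenly across the years remaining, and call the student "on track"
# if this year's gain met that required pace. All numbers are invented.

def required_annual_gain(score: float, proficiency_cut: float,
                         years_remaining: int) -> float:
    """Evenly divide the remaining gap across the years allowed."""
    return (proficiency_cut - score) / years_remaining

def is_on_track(last_year: float, this_year: float,
                proficiency_cut: float, years_remaining: int) -> bool:
    """Compare the actual one-year gain with the required pace."""
    needed = required_annual_gain(last_year, proficiency_cut,
                                  years_remaining + 1)
    return (this_year - last_year) >= needed

# A student 60 points below a 400-point cut with three years to go
# must average 20 points of growth a year.
print(required_annual_gain(score=340, proficiency_cut=400,
                           years_remaining=3))            # 20.0
print(is_on_track(last_year=340, this_year=365,
                  proficiency_cut=400, years_remaining=2))  # True: gained 25
```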
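The transition-matrix approach in the second bullet likewise reduces to a lookup table. The level names and credit values below are invented for illustration, not Delaware's actual rubric:

```python
# Hypothetical transition-matrix credit: growth is defined by movement
# among performance levels, not by raw score gains. Values are invented.

# Rows: last year's level. Columns: this year's level.
# 1.0 = full growth credit, 0.5 = partial, 0.0 = none.
CREDIT = {
    "below basic": {"below basic": 0.0, "basic": 0.5, "proficient": 1.0, "advanced": 1.0},
    "basic":       {"below basic": 0.0, "basic": 0.0, "proficient": 1.0, "advanced": 1.0},
    "proficient":  {"below basic": 0.0, "basic": 0.0, "proficient": 1.0, "advanced": 1.0},
    "advanced":    {"below basic": 0.0, "basic": 0.0, "proficient": 0.0, "advanced": 1.0},
}

def growth_credit(last_level: str, this_level: str) -> float:
    """Look up how much growth credit a student's level change earns."""
    return CREDIT[last_level][this_level]

print(growth_credit("below basic", "basic"))  # 0.5: moved up one level
print(growth_credit("basic", "basic"))        # 0.0: no movement, no credit
```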
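And the regression, or projection, model in the third bullet amounts to fitting an earlier cohort's eventual scores against their prior-year scores and projecting the current student forward. A bare-bones ordinary-least-squares sketch, with fabricated cohort data, is below; note how a single high recent score barely moves the projection, which is the counterintuitive behavior Mr. Ho describes:

```python
# Hypothetical projection model: regress an earlier cohort's eventual
# (grade-7) scores on their grade-4 and grade-5 scores, then project a
# current student forward. Data are fabricated; real models use large
# statewide longitudinal datasets and more covariates.
import numpy as np

# Prior cohort: columns are grade-4 and grade-5 scores; y is grade 7.
X = np.array([[355, 362], [410, 418], [380, 377], [430, 445], [365, 372]])
y = np.array([370, 430, 385, 460, 378])

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def projected_grade7(grade4: float, grade5: float) -> float:
    """Project a current student's grade-7 score from grades 4 and 5."""
    return coef[0] + coef[1] * grade4 + coef[2] * grade5

# "On track" means the projection clears the grade-7 proficiency cut.
GRADE7_CUT = 400
projection = projected_grade7(grade4=360, grade5=390)
print(projection, projection >= GRADE7_CUT)
```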
Mr. Ho and other researchers have found the regression model, also called a projection model, to be the most accurate at predicting which students will actually be proficient at the end of the time frame.
Yet that model can send a grim message to struggling students. "It's even harder to demonstrate progress than [with] a status model," Mr. Ho said.
"You have to score really, really high, sometimes almost impossibly high, to demonstrate progress, so this is actually raising the standards on the lowest students."
In the end, Mr. Ho said, "none of [the growth models] has the perfect combination of transparency and incentives and rhetoric."