Top researchers studying new "value added" or "growth index" models for measuring a teacher's contribution to student achievement completely agree on only one thing: These methods should be used in staff-evaluation systems with more care than they have been so far.
That area of agreement emerged at an Aug. 9 meeting that drew together a who's who of a dozen of the nation's top researchers on value-added methods, from fields ranging from education to economics, to build, if not consensus, at least familiarity across a disparate research community studying value-added systems. The U.S. Department of Education's research agency, which organized the forum, last week released the proceedings of the meeting, as well as individual briefs from each of the experts.
"There's been a huge amount of research in this field in recent years, but it tends to be really siloed," John Q. Easton, the director of the Institute of Education Sciences, told members of the National Board for Education Sciences, IES's advisory group, during a briefing earlier this month. "People don't seem to read each other's work, and it's published in totally different journals. It was so typical to read somebody's study who was not citing all the others."
Pros and Cons
The federal Institute of Education Sciences recently convened a meeting of a dozen top researchers on the use of value-added methods to measure teacher effectiveness:
• DAMIAN W. BETEBENNER, senior associate, National Center for the Improvement of Educational Assessment, Dover, N.H.
• HENRY BRAUN, director, Center for the Study of Testing, Evaluation, and Education Policy, and professor of education and public policy, Boston College
• SEAN P. CORCORAN, associate professor of educational economics, Steinhardt School of Culture, Education, and Human Development, New York University
• LINDA DARLING-HAMMOND, professor of education and faculty co-director, Stanford Center for Opportunity Policy in Education, Stanford University
• JOHN N. FRIEDMAN, assistant professor of public policy, John F. Kennedy School of Government, Harvard University, and faculty research fellow, National Bureau of Economic Research, Cambridge, Mass.
• DANIEL GOLDHABER, director, Center for Education Data and Research, Seattle, and interdisciplinary arts and sciences professor, University of Washington Bothell
• ANDREW HO, assistant professor, Harvard Graduate School of Education
• THOMAS KANE, professor of education and economics, Harvard Graduate School of Education, and faculty director, Center for Education Policy Research, Cambridge, Mass.
• HELEN F. LADD, professor of economics and public policy, Duke University
• ROBERT C. PIANTA, dean, Curry School of Education, University of Virginia, and director of the university's Center for Advanced Study of Teaching and Learning
• JONAH E. ROCKOFF, associate professor of business, Columbia Graduate School of Business, and faculty research fellow, National Bureau of Economic Research
• JESSE ROTHSTEIN, professor of public policy and economics, University of California, Berkeley, and research associate, National Bureau of Economic Research
SOURCE: Institute of Education Sciences, U.S. Department of Education
Value-added methods, which attempt to measure teachers' performance based on their students' test scores, have gained support in the last decade, as studies by Stanford University economist Eric A. Hanushek and others found only inconclusive evidence of a link between a teacher's effectiveness and his or her degree credentials, the traditional basis for teacher pay. Massive federal support, in the form of the $290 million Teacher Incentive Fund and the $4 billion Race to the Top competition, has led to rapid growth in the number of states and districts adopting these methods in their teacher-evaluation systems.
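In a common formulation, a value-added estimate is the coefficient on a teacher indicator in a regression of students' current test scores on their prior-year scores. The sketch below illustrates that mechanic on simulated data; every number in it (class sizes, effect sizes, noise levels) is invented for the example and drawn from no real system:

```python
# A toy value-added estimate on simulated data: regress each student's
# current score on his or her prior-year score plus one dummy variable
# per teacher; the dummy coefficients are the "value added" estimates.
import numpy as np

rng = np.random.default_rng(0)
n_teachers, class_size = 20, 25                    # made-up sizes
true_effect = rng.normal(0.0, 0.2, n_teachers)     # hypothetical teacher effects

# Each teacher gets a class of students drawn from the same distribution,
# which behaves like random assignment.
teacher = np.repeat(np.arange(n_teachers), class_size)
prior = rng.normal(0.0, 1.0, teacher.size)         # prior-year score
score = 0.7 * prior + true_effect[teacher] + rng.normal(0.0, 0.5, teacher.size)

# Design matrix: the prior score plus one indicator column per teacher.
X = np.column_stack([prior,
                     (teacher[:, None] == np.arange(n_teachers)).astype(float)])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
value_added = beta[1:]                             # one estimate per teacher

# With students assigned at random, the estimates track the true effects.
print(np.corrcoef(true_effect, value_added)[0, 1])
```

Operational systems layer on more machinery, such as demographic controls, several years of prior scores, and shrinkage of noisy estimates toward the mean, but the underlying logic is the same.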
Advocates argue that value-added methods can be more objective than principal observations alone and, if done well, can flag areas in which a teacher needs to beef up instruction. Critics contend the scores can be computed only for teachers of mathematics and English/language arts in tested grades, leaving out both a large proportion of a district's teachers and any contribution a teacher makes to untested subjects or skills, be they science or self-control.
One influential study by Jesse Rothstein, a public policy and economics professor at the University of California, Berkeley, and a participant in the meeting, found that a standard value-added model was biased because it did not take into account that parents and principals often steer certain students to certain teachers, rather than assigning students at random.
"[Value-added measures] will deteriorate (will become less reliable and less closely tied to true effectiveness) if they are used for high-stakes individual decisions," Mr. Rothstein wrote in a brief for the meeting. "How much will teachers change their content coverage, neglect nontested subjects and topics, lobby for the right students, teach test-taking strategies, and cheat outright? ... We simply don't know."
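A small simulation (all numbers invented for illustration) makes the mechanism behind that bias concrete: when students are steered toward particular teachers according to a trait the model cannot see but that also drives score growth, the same regression as above credits the sorting to the teachers:

```python
# Rough sketch of bias from non-random assignment. An unobserved trait
# (say, parental support) raises score growth; when class rosters are
# sorted on that trait, the teacher estimates absorb it.
import numpy as np

rng = np.random.default_rng(1)
n_teachers, class_size = 20, 25
n = n_teachers * class_size
true_effect = rng.normal(0.0, 0.2, n_teachers)

prior = rng.normal(0.0, 1.0, n)
trait = rng.normal(0.0, 1.0, n)          # unobserved driver of growth

def estimate_va(teacher):
    score = (0.7 * prior + true_effect[teacher] + 0.5 * trait
             + rng.normal(0.0, 0.5, n))
    X = np.column_stack([prior,
                         (teacher[:, None] == np.arange(n_teachers)).astype(float)])
    beta, *_ = np.linalg.lstsq(X, score, rcond=None)
    return beta[1:]

# Random rosters vs. rosters sorted on the unobserved trait.
random_assign = rng.permutation(np.repeat(np.arange(n_teachers), class_size))
sorted_assign = np.empty(n, dtype=int)
sorted_assign[np.argsort(trait)] = np.repeat(np.arange(n_teachers), class_size)

for label, assign in (("random", random_assign), ("sorted", sorted_assign)):
    va = estimate_va(assign)
    print(label, round(float(np.corrcoef(true_effect, va)[0, 1]), 2))
```

Under random assignment the estimates correlate strongly with the true effects; under sorted assignment the correlation drops sharply, because much of each "estimate" is really the class's unobserved trait.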
Tools for Improvement
The Measures of Effective Teaching Project, funded by the Seattle-based Bill and Melinda Gates Foundation, is expected to release a report later this year on an experiment in which class rosters were randomly assigned to clusters of teachers by school, grade, and subject area. (Education Week receives support from the Gates Foundation for coverage of the education industry and K-12 innovation.) The results may help show how the selection bias Mr. Rothstein described arises and how it can be prevented, according to Thomas Kane, an education and economics professor at the Harvard Graduate School of Education and a meeting participant.
Mr. Kane and Andrew Ho, an assistant professor of education at Harvard, contended that district leaders should focus less on using value-added systems to rank teachers; Mr. Ho likened such rankings to hospital intake questionnaires that identify initial symptoms. "Medicine (and education) is not only about symptoms (and even less so about one-dimensional rankings of symptoms), but, far more critically, diagnosis and ultimately treatment," Mr. Ho said. "How can we use VAM results to improve teaching and the teacher corps?"
Education officials' tendency to average multiple measures or years of data into a single composite score worried many researchers.
From one year to the next, a teacher's rating under some of the value-added systems now in use can vary by 4 percent to 25 percent, according to Linda Darling-Hammond, an education professor and faculty co-director of the Stanford Center for Opportunity Policy in Education at Stanford University. She argued that researchers and policymakers must take into account the range of scores a state's tests can report when developing a value-added system. For example, a teacher of gifted students may not show up as very effective, because his or her students are already performing near the top of the test's ability to measure progress.
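The test-ceiling problem Ms. Darling-Hammond describes can be seen in a few lines of simulation. In the sketch below, the score cap, class sizes, and gain are made-up numbers: two equally effective teachers produce the same true gain, but the cap on reportable scores compresses the measured gain for the class that starts near the top of the scale:

```python
# Two classes receive the same true gain, but the test cannot report
# scores above a cap, so the high-scoring class shows less growth.
import numpy as np

rng = np.random.default_rng(2)
CEILING = 100.0                        # hypothetical maximum scale score
GAIN = 5.0                             # same true gain in both classrooms

classes = {
    "typical": rng.normal(70.0, 8.0, 25),   # prior scores, typical class
    "gifted": rng.normal(92.0, 8.0, 25),    # prior scores near the ceiling
}

for label, prior in classes.items():
    prior_obs = np.minimum(prior, CEILING)        # censored prior scores
    post_obs = np.minimum(prior + GAIN, CEILING)  # censored current scores
    print(label, round(float(np.mean(post_obs - prior_obs)), 1))
```

The typical class shows roughly the full gain, while the gifted class shows visibly less, even though the teaching was identical by construction.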
Mr. Kane countered that teachers have such a strong effect on student achievement that if value-added measures help identify teachers in the bottom 5 percent of performance and bring them up to the district average, a single year with one of those improved teachers could raise a student's lifetime earnings by an average of $52,000.
Common-Core Concerns
Many of the experts see both promise and peril in the rollout of the Common Core State Standards and their effect on existing and emerging teacher evaluation systems.
Researchers voiced concern that evaluation systems in most districts do not account for the time it will take even the most effective teachers to adapt to the standards' new areas of focus, particularly since the common core deliberately omits guidance on specific teaching strategies for meeting the new requirements.
For example, Henry Braun, the director of the Center for the Study of Testing, Evaluation, and Education Policy at Boston College and a consultant to the Partnership for Assessment of Readiness for College and Careers, or PARCC, one of the two consortia developing tests for the common core, has been wrestling with how to design an assessment that will likely end up being used for teacher evaluation as well. He worried that if the teacher-accountability "tail" wags the student-assessment "dog," the tests won't be designed to measure students' learning rather than teacher behavior.
Experts called on state policy leaders to consider how their states' tests will affect the validity of individual districts' evaluation systems.