Recent studies, including our own, have used data from the National Assessment of Educational Progress to examine achievement across public, charter, and private schools. Debates have ensued, focusing on what NAEP can't tell us about school effectiveness. Some claims have been accurate, some are questionable, and others are simply wrong. What the discussion needs are clear-eyed appraisals of NAEP's strengths and limitations. We offer here our own.
—Nip Rogers
In 2004, an American Federation of Teachers report based on 2003 NAEP results said that charter schools were trailing public schools in math and reading. In response, charter school advocates criticized the evidence in the report, correctly noting that research drawing on the Web-based 2003 data was incapable of simultaneously controlling for multiple factors, such as free-lunch eligibility and racial and ethnic status. This claim has been repeated regarding the recently released 2005 NAEP data. ("What NAEP Can't Tell Us About Charter School Effectiveness," Dec. 7, 2005.) Amid the debates about NAEP results, there has been little additional illumination on the issue of school sector and academic achievement.
We already know, for instance, that private schools outscore public schools on NAEP. But private schools also serve more-advantaged students, with characteristics that are associated with higher test scores. So the real question is whether differences in schools' scores are due to differences in their effectiveness or to differences in the types of students they educate.
New evidence sheds some light on this question. Researchers now have access to the raw 2003 data on more than 340,000 4th and 8th grade students. With these data, we recently examined mathematics achievement in public, charter, and various types of private schools, while accounting for multiple demographic differences in their student populations. ("NAEP Analysis Questions Private Schools' Edge," Feb. 1, 2006.) Our study found that after controlling for these differences between student populations, neither charter schools nor any type of private schools outscored public schools to any statistically significant degree. In fact, public school averages were often higher.
These are significant findings from a comprehensive data set, but we do not see them as the final word on an issue as complex as school achievement. Still, the results do offer some encouraging news for public schools, and they call for caution regarding policies based on common assumptions about the inherent inferiority of public schools.
Public school critics will challenge our findings by attacking the data from which they are drawn. So it's important to understand what NAEP can and cannot tell us.
When people speak of "NAEP data," they typically are referring to the main NAEP assessment (as opposed to the long-term-trend NAEP). The main NAEP includes not only achievement data, but also survey data from students, teachers, and administrators from thousands of schools nationwide. Sample sizes increased tenfold in 2003, when state-level samples (state NAEP) became part of the national sample. Perhaps the best way to discuss the strengths and limitations of the main NAEP is to examine claims that have been made about it.
• "NAEP is just a traditional standardized test; it doesn't really measure what students know."
The main NAEP is in fact an unusually rich standardized test involving a mixture of multiple-choice, short-answer, and extended-constructed-response items. Clearly, there are important things we want students to learn in school that go beyond what can be measured on a standardized test, but as large-scale measures go, NAEP is actually remarkably good.
• "NAEP data allow comparisons of only one variable at a time."
This claim reveals confusion about the two ways of analyzing NAEP data. The most accessible method is through a Web-based portal that allows users to make achievement comparisons using the most popular NAEP variables. It is true that only one variable could be analyzed in conjunction with school type on the old Web-based tool, but an upgrade last year has made this tool more flexible. What is important to understand, though, is that some months after the release of the Web-based data, the raw data become available to researchers who file security agreements with the government.
The raw data allow researchers to use appropriate statistical techniques to analyze dozens of student- and school-level variables simultaneously. Typically, when a new round of NAEP data is released, various interest groups make preliminary claims about the results based on a limited, Web-based analysis. There is necessarily a lag between these studies and more-advanced statistical studies of the data. For example, the 2005 data became available via the Web-based tool last autumn, but the corresponding raw data are not yet available.
• "NAEP data only let us see averages; we can't tell if there are big differences within groups."
Actually, NAEP analyses can tell us about variation within groups. This is obviously true with the raw data, but it is also true with the Web-based tool, which can calculate standard deviations for any subgroup designated.
• "NAEP gives only a snapshot of achievement; it's not longitudinal."
This is true, and is probably the most important limitation of NAEP. As cross-sectional data, NAEP does not track the performance of students over time.
In the case of our study, we showed that the achievement gaps favoring private schools disappear (and often are reversed) when the demographic differences between student populations are accounted for. However, it is important to note that we focused on gaps in achievement at a point in time, which are not the same as gaps in growth.
Longitudinal data sets have the advantage of allowing researchers to measure the academic growth of students as they go through particular schools. But those data sets are also imperfect; many students drop out or switch schools, for example. Consider that if a study loses a modest 15 percent of its participants per year, after four years only about half the original participants remain (disproportionately, those with stable families, a fact that can bias results). Longitudinal data sets also tend to be much smaller than the immense NAEP samples, making it difficult to examine nationally representative samples of a variety of school types.
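The attrition figure in that example compounds multiplicatively, which is why a seemingly modest annual loss cuts the sample roughly in half over four years. A quick check of the arithmetic (the 15 percent annual loss is the article's illustrative assumption, not a property of any particular study):

```python
# Illustrative attrition calculation: retaining 85% of participants each
# year leaves 0.85 ** 4 of the original sample after four years.
retention_per_year = 0.85  # article's assumed 15% annual loss
years = 4
remaining = retention_per_year ** years
print(f"Share of original sample remaining: {remaining:.1%}")  # about 52%
```

Compounding in this way, rather than subtracting 15 percent of the original sample each year, reflects that each year's loss comes out of the participants who are still enrolled.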
Although we cannot make causal claims from cross-sectional studies such as NAEP, with the raw data we can account for the most significant of the possible confounding variables that may explain differences in achievement between schools. This makes it less likely that longitudinal data would tell a different story.
If longitudinal data were to tell a different story, indicating that private or charter schools show greater gains than public schools, then what our NAEP results do tell us is that private and charter elementary school students must be starting out with lower achievement than demographically comparable students in public schools. In other words, parents would have to be systematically choosing to enroll their children in private or charter schools because their children are low-achieving. There probably are many cases where this occurs.
But there is also good reason to believe that many students would begin their private or charter school careers with higher achievement than demographically similar public school students. The fact that a child's family takes the time to select a school, and perhaps even pay tuition, suggests that the child has some initial advantage over public school counterparts. There simply are not measures to control for that type of difference. "Parental commitment to schooling" is not a NAEP variable.
That is why our results about private schools, in particular, are so surprising to many: because the admittedly rough demographic measures in NAEP more than wash out the inherent advantage that private school students would seem to have at the start.
NAEP-based studies do not tell us about students' academic achievement as they enter a school. NAEP cannot tell us what makes a parent choose a particular type of school, or why parents move their children from one school to another. Certainly, longitudinal and qualitative studies that shed further light on these issues are needed.
So what is NAEP good for? Although NAEP is a snapshot of school performance, it offers a large-scale, high-resolution portrait of U.S. academic outcomes. It is also a critically important tool for monitoring achievement inequities. For example, one thing that is clear from our NAEP analysis is that a child's demographic background is a much stronger predictor of achievement than the type of school attended.
NAEP can also tell us about changes in achievement over time for the nation, as well as for subgroups, particularly if those subgroups are stable (a problematic criterion in terms of tracking achievement in the rapidly growing pool of charter schools). Moreover, NAEP survey items allow analyses of a wide variety of factors, such as teachers' backgrounds, students' beliefs, and parents' participation in schooling (all factors we are further examining in our own research). Examinations of NAEP results can shed light on which of these factors do and do not correlate with achievement after demographic and other differences are considered, thereby pointing to important areas for further study.
In debates on school-sector effects, we have noticed a tendency for various interest groups to trumpet NAEP findings that help their cause, while attacking NAEP when the results are less favorable. Clearly, the question of how to help all students succeed academically is complex and needs to be addressed through a number of data sets and varying methodological approaches. Claims that there is only one way to understand this issue are dangerous, and will ultimately undercut our ability to identify the most promising educational practices.