U.S. performance in reading, math, and science has remained stagnant since 2009 as other nations have plowed ahead, according to new results from a prominent international assessment.
Nineteen countries and education systems scored higher than the United States in reading on the 2012 Program for International Student Assessment, or PISA, up from nine systems when the test was last administered in 2009. Germany and Poland, for instance, have seen steady gains on the reading assessment over time, and are now ahead of the United States.
In mathematics, 29 nations and other jurisdictions outperformed the United States by a statistically significant margin, up from 23 three years ago, the results released Tuesday show. The nations that eclipsed the U.S. average included not only traditional high fliers like South Korea and Singapore, but also Austria, the United Kingdom, and Vietnam.
In science, 22 education systems scored above the U.S. average, up from 18 in 2009.
“While we’re standing still, other countries are making progress,” said Jack Buckley, the commissioner of the National Center for Education Statistics, which issued the U.S. report on PISA.
The global assessment compares reading, math, and science “literacy” (knowledge and application of skills) among 15-year-olds internationally. For the first time, the report also includes separately reported results for public school students in three American states: Connecticut, Florida, and Massachusetts.
Massachusetts, long a top-performing U.S. state, demonstrated especially strong performance on the global stage: It scored better than the average for leading industrialized nations in all subjects.
Mitchell D. Chester, the education commissioner for Massachusetts, said the new PISA data “helped reinforce that our students are performing among some of the better-performing nations in the world, and it also made clear to me that we shouldn’t be complacent.”
Shanghai Tops Rankings
Among the 65 participating education systems, the highest performer in all three subjects was Shanghai, though the methodology around treating the Chinese city as a stand-alone system has raised eyebrows.
Overall, U.S. performance in reading and science was on par, as it was three years ago, with the average for the 34 industrialized nations in the Organization for Economic Cooperation and Development. And once again, U.S. scores were below the OECD average in math.
“It’s a policy question whether one should be OK with average,” Mr. Buckley said. “I’d be more willing to tolerate our position if I saw that we were improving.”
The United States continued to have its strongest showing in reading, though there was no measurable change from its 2009 scores. On the PISA scale of 1 to 1,000, the nation scored 498 in reading, statistically similar to the OECD average of 496 and well below Shanghai’s 570.
Massachusetts scored 527 in reading, outperforming all but three education systems. Connecticut came in just behind its neighbor state. Florida’s score was not statistically different from the U.S. average.
While Americans’ reading scores were flat, 10 education systems have surpassed the United States in the subject since 2009, including Ireland, Chinese Taipei (Taiwan), Poland, Estonia, the Netherlands, and Germany.
The 2012 reading results seem “particularly dramatic,” Mr. Buckley said, because several countries that were tied with the United States in 2009 made just enough improvement to statistically edge ahead.
Countries that have demonstrated a “strong multiyear history of sustained improvement” include Germany and Poland, he noted. “That’s a very different trajectory than the U.S. has observed,” Mr. Buckley said.
In math, the United States scored 481, measurably lower than the OECD average. Poland, Vietnam, Austria, Ireland, the United Kingdom, Latvia, and Luxembourg all overtook the United States by statistically significant margins in the math standings for 2012, while Norway dropped behind.
‘Room for Improvement’
In Massachusetts, about one in five students were “top performers” in math, scoring at levels 5 and 6 (on a scale with six levels of performance). The same proportion scored below level 2, or the “baseline proficiency” level. By comparison, more than half of Shanghai 15-year-olds scored at the top two levels in math and just 4 percent scored at the bottom level.
“One of the things that concerns me is the gap between our top and bottom performers,” said Mr. Chester of Massachusetts. “While our aggregate results are very strong, there’s much room for improvement in bringing up our scores in the bottom.”
In science, the average for U.S. students was statistically similar to the OECD average and not measurably different from the 2009 results. Massachusetts and Connecticut both scored higher than the United States as a whole, while Florida scored lower.
Some of the most-anticipated results among policymakers are those from Finland, which became a darling of the education policy world after posting strong results on PISA in 2003. Recent results on the Trends in International Mathematics and Science Study, or TIMSS, another international exam, have called Finland’s reputation into question. In math, for example, the performance of Finland’s 8th graders on the 2011 TIMSS was not measurably different from that of their counterparts in the United States, and trailed several U.S. states that participated.
On the 2012 PISA, Finland scored above the U.S. and OECD averages in all three subjects, but its raw scores were all down from 2009, with the biggest drop in math. Finland ranked sixth among OECD countries in math for 2012. Three years earlier it was among the top three math performers.
In discussing the results for Shanghai, the top performer on PISA, several experts offered the caveat that its results are not representative of China as a whole. Tom Loveless, a senior fellow at the Brookings Institution’s Brown Center on Education Policy, recently wrote in a blog post that “Shanghai has an economically and culturally elite population with systems in place to make sure that students who may perform poorly are not allowed into public schools.”
Twelve provinces in China took the 2012 PISA test, the OECD confirmed, but only the results from Shanghai, Hong Kong, and Macao were publicly released.
Mr. Loveless was especially critical of that action, and suggested in an interview that the OECD “cut a special deal” with the Chinese government, allowing for “cherry-picked” results. In 2011, a Chinese website leaked the average PISA scores from 2009 for all 12 participating provinces. According to those results, China scored measurably above the United States in math and science, but significantly below the U.S. average in reading.
Mr. Buckley of the NCES said that juxtaposing results in Shanghai and Massachusetts, a top-performing U.S. state by most measures, is “a better comparison than Shanghai to the U.S.” In all three subjects tested, Massachusetts’ scores fell far behind those of the Chinese city.
“The Shanghai results suggest that even better things are possible for Massachusetts,” said Mr. Chester.
The OECD report also seeks to offer some insights concerning the impact of poverty and other socioeconomic factors on student achievement. It finds that the extent to which socioeconomic status predicts student performance in the United States is similar to the average for OECD nations. But the report identifies some countries in which achievement is not as closely tied to such factors, including Hong Kong, Estonia, and Japan.
“The large differences between countries/economies in the extent to which socio-economic status influences learning outcomes suggests that it is possible to combine high performance with high levels of equity in education,” the report states.
Making Causal Inferences
In a webinar last month for the Washington-based Education Writers Association, Andreas Schleicher, the OECD’s deputy director for education and skills, said that the new PISA results contradicted the widely held belief, based in part on a 2010 McKinsey & Co. report on teacher recruitment, that high-performing countries draw their teachers from the top third of the nation’s academic pool.
“Actually, I should tell you that’s not something we can see through PISA,” he said. “In fact, many of the countries doing really well on PISA get pretty well the average graduate. But they are very good in developing that talent, retaining that talent.”
Mr. Schleicher explained that the highest-performing countries also place a high value on education, have “universal education standards,” and use “personalization in addressing diversity” rather than “tracking and streaming students early on.” Above all, though, he argued, the best systems are “very good in getting the most-talented teachers to the most-challenging classrooms. ... They prioritize the quality of teachers over the size of classes.”
The OECD on Tuesday also released a nearly 550-page addendum report, “What Makes Schools Successful?,” that aims to “share evidence of the best policies and practices and to offer our timely and targeted support.”
Many education experts warn, however, against making broad policy prescriptions based on PISA scores.
“These kinds of studies are really good at describing where we stand and maybe looking at trends,” said Mr. Buckley. “They’re not good at all at telling us why. The study design is not one that supports causal inference.”
Mark Schneider, a vice president of the Washington-based American Institutes for Research and a former NCES commissioner, said that, too often, stakeholders use PISA and other such tests “to confirm existing policy preferences,” which he calls a “serious problem.”
“People have their favorite policy prescriptions and plug PISA data into it,” he said. “It’s not clear to me what the logical foundation is for observing a sample of 15-year-olds and talking about preschool.”