Education Week

Special Report
Assessment

Essay-Grading Software Seen as Time-Saving Tool

By Caralee J. Adams | March 10, 2014 | 7 min read

Jeff Pence knows the best way for his 7th grade English students to improve their writing is to do more of it. But with 140 students, it would take him at least two weeks to grade a batch of their essays.

So the Canton, Ga., middle school teacher uses an online, automated essay-scoring program that allows students to get feedback on their writing before handing in their work.

"It doesn't tell them what to do, but it points out where issues may exist," said Mr. Pence, who says the program engages the students almost like a game.

With the technology, he has been able to assign an essay a week and individualize instruction efficiently. "I feel it's pretty accurate," Mr. Pence said. "Is it perfect? No. But when I reach that 67th essay, I'm not real accurate, either. As a team, we are pretty good."

With the push for students to become better writers and meet the new Common Core State Standards, teachers are eager for new tools to help out. Pearson, which is based in London and New York City, is one of several companies upgrading their technology in this space, often described as artificial intelligence, AI, or machine-reading. New assessments designed to test deeper learning and move beyond multiple-choice answers are also fueling demand for software to help automate the scoring of open-ended questions.

Critics contend the software doesn't do much more than count words and therefore can't replace human readers, so researchers are working hard to improve the algorithms and counter the naysayers.

While the technology has been developed primarily by companies in proprietary settings, there has been a new focus on improving it through open-source platforms. New players in the market, such as the startup venture LightSide and edX, the nonprofit enterprise started by Harvard University and the Massachusetts Institute of Technology, are openly sharing their research. Last year, the Hewlett Foundation sponsored an open-source competition to spur innovation in automated writing assessments that attracted commercial vendors and teams of scientists from around the world. (The Hewlett Foundation supports coverage of "deeper learning" issues in Education Week.)

"We are seeing a lot of collaboration among competitors and individuals," said Michelle Barrett, the director of research systems and analysis for CTB/McGraw-Hill, which produces an automated writing-assessment product for use in grades 3-12. "This unprecedented collaboration is encouraging a lot of discussion and transparency."

Mark D. Shermis, an education professor at the University of Akron, in Ohio, who supervised the Hewlett contest, said the meeting of top public and commercial researchers, along with input from a variety of fields, could help boost performance of the technology. The recommendation from the Hewlett trials is that the automated software be used as a "second reader" to monitor the human readers' performance or provide additional information about writing, Mr. Shermis said.

"The technology can't do everything, and nobody is claiming it can," he said. "But it is a technology that has a promising future."

'Hot Topic'

The first automated essay-scoring systems go back to the early 1970s, but there wasn't much progress made until the 1990s, with the advent of the Internet and the ability to store data on hard-disk drives, Mr. Shermis said. More recently, improvements have been made in the technology's ability to evaluate language, grammar, mechanics, and style; detect plagiarism; and provide quantitative and qualitative feedback.

The computer programs assign grades to writing samples, sometimes on a scale of 1 to 6, in a variety of areas, from word choice to organization. Some products give feedback to help students improve their writing; others can grade short answers for content. To save time and money, the technology can be used in various ways, on formative exercises or summative tests.
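For a sense of how an engine might combine surface features of a writing sample into a 1-to-6 holistic score, here is a deliberately simple sketch. The features, weights, and function names are invented for illustration and bear no relation to any commercial product:

```python
import re

def essay_features(text):
    """Extract simple surface features from an essay (toy example)."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "word_count": len(words),
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        # Ratio of distinct words to total words, a crude proxy for word choice.
        "vocab_diversity": len(set(w.lower() for w in words)) / max(len(words), 1),
    }

def toy_score(text):
    """Combine features into a 1-6 score using hand-picked weights."""
    f = essay_features(text)
    raw = (0.01 * f["word_count"]
           + 0.1 * f["avg_sentence_len"]
           + 2.0 * f["vocab_diversity"])
    return max(1, min(6, round(raw)))
```

Real engines weigh far richer linguistic evidence, but the overall shape, extract measurable features and map them onto a rubric scale, is the same.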

The Educational Testing Service first used its e-rater automated-scoring engine for a high-stakes exam in 1999, for the Graduate Management Admission Test, or GMAT, according to David Williamson, a senior research director for assessment innovation at the Princeton, N.J.-based company. It also uses the technology in its writing-evaluation service for grades 4-12.

Over the years, the capabilities have changed substantially, evolving from simple rule-based coding to more sophisticated software systems. Statistical techniques from computational linguistics, natural language processing, and machine learning have helped develop better ways of identifying certain patterns in writing.
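The statistical training this passage describes amounts to fitting a model to essays that humans have already scored. A minimal sketch of that idea using ordinary least squares; every number, feature, and the `predict` helper are made up for illustration and are not any vendor's actual model:

```python
import numpy as np

# Hypothetical training data: each row holds features of one essay
# (word count, spelling errors, distinct words); y holds the human
# readers' 1-6 scores. All values are invented for this sketch.
X = np.array([
    [120,  8,  70],
    [300,  3, 160],
    [450,  2, 220],
    [ 90, 12,  50],
    [380,  1, 200],
], dtype=float)
y = np.array([2, 4, 5, 1, 6], dtype=float)

# Add an intercept column, then fit weights by least squares --
# the "training" step over a corpus of hand-scored papers.
X1 = np.hstack([np.ones((len(X), 1)), X])
w, *_ = np.linalg.lstsq(X1, y, rcond=None)

def predict(word_count, spelling_errors, distinct_words):
    """Score a new essay's features with the fitted weights, clipped to 1-6."""
    raw = w @ np.array([1.0, word_count, spelling_errors, distinct_words])
    return float(np.clip(round(raw), 1, 6))
```

Modern engines replace the three hand-picked features with hundreds of linguistic measures and the linear fit with richer machine-learning models, but the pattern-from-scored-examples workflow is the same.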

But challenges remain in coming up with a universal definition of good writing, and in training a computer to understand nuances such as "voice."

In time, with larger sets of data, experts can identify more nuanced aspects of writing and improve the technology, said Mr. Williamson, who is encouraged by the new era of openness in the research.

"It's a hot topic," he said. "There are a lot of researchers in academia and industry looking into this, and that's a good thing."

High-Stakes Testing

In addition to using the technology to improve writing in the classroom, West Virginia employs automated software for its statewide annual reading language arts assessments for grades 3-11. The state has worked with CTB/McGraw-Hill to customize its product and train the engine, using thousands of papers it has collected, to score the students鈥 writing based on a specific prompt.

"We are confident the scoring is very accurate," said Sandra Foster, the lead coordinator of assessment and accountability in the West Virginia education office, who acknowledged facing skepticism from teachers initially. But many were won over, she said, after a comparability study showed that the pairing of a trained teacher with the scoring engine was more accurate than two trained teachers. Teacher training involved just a few hours on how to assess writing with the rubric. Plus, writing scores have gone up since the state implemented the technology.

Automated essay scoring is also used on the ACT Compass exams for community college placement, the new Pearson General Educational Development tests for a high school equivalency diploma, and other summative tests. But it has not yet been embraced by the College Board for the SAT or the rival ACT college-entrance exams.

The two consortia delivering the new assessments under the Common Core State Standards are reviewing machine-grading but have not committed to it.

Jeffrey Nellhaus, the director of policy, research, and design for the Partnership for Assessment of Readiness for College and Careers, or PARCC, wants to know if the technology will be a good fit with its assessment, and the consortium will be conducting a study based on writing from its first field test to see how the scoring engine performs.

Likewise, Tony Alpert, the chief operating officer for the Smarter Balanced Assessment Consortium, said his consortium will evaluate the technology carefully.

Open-Source Options

Elijah Mayfield, the owner of the new Pittsburgh-based company LightSide, said his data-driven approach to automated writing assessment sets it apart from other products on the market.

"What we are trying to do is build a system that, instead of correcting errors, finds the strongest and weakest sections of the writing and where to improve," he said. "It is acting more as a revisionist than a textbook."

The new software, which is available on an open-source platform, is being piloted this spring in districts in Pennsylvania and New York.

In higher education, edX has just introduced automated software to grade open-response questions for use by teachers and professors through its free online courses. "One of the challenges in the past was that the code and algorithms were not public. They were seen as black magic," said company President Anant Agarwal, noting the technology is in an experimental stage. "With edX, we put the code into open source where you can see how it is done to help us improve it."

Still, critics of essay-grading software, such as Les Perelman, want academic researchers to have broader access to vendors鈥 products to evaluate their merit. Now retired, the former director of the MIT Writing Across the Curriculum program has studied some of the devices and was able to get a high score from one with an essay of gibberish.

"My main concern is that it doesn't work," he said. While the technology has some limited use in grading short answers for content, it relies too much on counting words, Mr. Perelman contended, and reading an essay requires a deeper level of analysis best done by a human.
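The critique that word counting can stand in for judgment is easy to demonstrate with a hypothetical length-only scorer: any essay long enough hits the score ceiling regardless of meaning. This sketch is an illustration of the failure mode critics describe, not a depiction of any real product:

```python
def length_weighted_score(text, words_per_point=50, max_score=6):
    """Toy scorer that rewards length alone -- the failure mode critics describe."""
    n_words = len(text.split())
    return min(max_score, 1 + n_words // words_per_point)

# Meaningless prose, repeated until it is long: 8 words x 40 = 320 words.
gibberish = "The quantum of pedagogy refulgently obviates the paradigm. " * 40
```

Here `length_weighted_score(gibberish)` reaches the maximum score of 6, while a short, sensible answer scores the minimum, which mirrors how an essay of gibberish could fool a length-biased engine.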

"The real danger of this is that it can really dumb down education," he said. "It will make teachers teach students to write long, meaningless sentences and not care that much about actual content."
