“Accessible” or “universally designed” tests strive to measure content-area knowledge without throwing unneeded obstacles in the way of students with disabilities or students who are mastering English.
But at the same time, the tests can’t be so simple that general education students can skate through without having to demonstrate their competence.
Walking that line between accessibility and test validity is a challenge. But researchers at Vanderbilt hope to provide help through a checklist of questions they developed for test creators to ask themselves.
By following the field-tested guidelines, the researchers say, test creators can eliminate common problems that make tests less accessible.
This question from a 7th grade math test shows how the principles of universal design can be used to modify questions for students with disabilities, or students learning English, without losing the meaning of the test items.
SOURCE: Vanderbilt University
For example, many multiple-choice tests offer three wrong answers, or “distractors,” along with the correct answer. But having two distractors instead of three doesn’t seriously detract from the difficulty of the test, and is less likely to trip up students for reasons unrelated to their knowledge of the material, the researchers say.
Test creators also commonly use artistic elements, like pictures or cartoon images, in an attempt to enhance the look of the test. But students may think those stray illustrations hold some clue to the final answer. It’s better to use illustrations only when directly related to the answer, the checklist states.
“A lot of this stuff is somewhat intuitive,” said Peter A. Beddow, the senior author of the checklist, which has been named TAMI. “But when you’re taking these original items, you almost need permission to go as far as you need to go.”
Take, for instance, a question with a long poem that is designed to test a student鈥檚 vocabulary skills.
“Often, it really could be vastly shortened. The writing might not be as elegant, but the question is, does this item measure the construct it is supposed to measure?” said Mr. Beddow, who is a doctoral candidate at Vanderbilt University’s Peabody College of Education and Human Development in Nashville, Tenn.
Stephen N. Elliott, a professor of education at Peabody, and Ryan J. Kettler, a research assistant professor in the special education department, also worked on the development of the checklist. Their work was funded by a grant from the U.S. Department of Education as part of the Consortium for Alternate Assessment Validity and Experimental Studies.
NCLB Impact
The checklist is key, Mr. Elliott said. “People need a structure. This was a lot of inventions born out of necessity,” he said.
The testing requirements of the federal No Child Left Behind Act are a major driver of the need for accessible tests. The law’s regulations allow some students with disabilities to take different types of assessments than general education students. Two percent of all students, or about 20 percent of students with disabilities, can be counted as proficient when they take alternate assessments based on modified, but grade-level, academic standards. Those tests can have fewer questions, offer fewer choices in a multiple-choice section, and require a lower level of reading skill.
The researchers hope the Peabody test inventory, still in its early stages, can help test writers create assessments that meet those standards, Mr. Beddow said. The researchers developed their work by using universal design principles and “cognitive load theory,” which refers to how much information a person must hold in mind to perform a task.
If a test isn’t intended to assess a student’s memory, then cognitive load should be reduced, the Peabody researchers suggest, even if that means cutting a long poem down to a few stanzas. “Extraneous cognitive load can throw [students] off,” Mr. Beddow said.
But to make these changes, test creators sometimes have to set familiar concepts aside. People tend to use three distractors on a test because that’s the way it’s always been done, Mr. Elliott said. Some college professors may even use four or five, including such trip-up questions as “none of the above” or “all of the above,” and they pass those habits on to teachers-in-training, who may use those same types of questions in their own classroom tests.
“Complicated does not necessarily translate to a better test,” Mr. Elliott said.
The checklist could be useful for training test item writers or state content teams, said Scott Marion, the associate director for the Dover, N.H.-based Center for Assessment, a nonprofit organization that works with states and districts to improve their testing-and-accountability systems.
However, shortening test items could be contrary to another theory of test creation, which suggests that students do well with questions that include lengthy real-world examples.
“There’s got to be a balance,” Mr. Marion said. He also questions the researchers’ rating system for test items, and the suggestion that all of the checklisted items are of equal importance.
“If you’re trying to put numbers on things and treat them all as equal, I’m not sure that’s correct,” he said.
The Vanderbilt researchers have said that the inventory is in its early stages, and that they are seeking feedback from practitioners to make the checklist better.
The work at Vanderbilt is part of an active research effort surrounding the creation of accessible tests.
Jamal Abedi, a professor of education at the University of California, Davis and a researcher with the Los Angeles-based National Center for Research on Evaluation, Standards, and Student Testing, is part of a team developing accessible tests for reading comprehension.
Aiding Comprehension
Tests of reading comprehension offer a specific challenge, Mr. Abedi said, because changing the language of a text passage, for instance by using simpler vocabulary words, may actually change what the test is supposed to measure.
The researchers have found some changes that help all students, without the concern of making the test inappropriately easy, Mr. Abedi said. For example, Mr. Abedi and his colleagues have experimented with sprinkling questions throughout a long text passage, rather than leaving all of the questions at the end.
Using that method, students only need to read a few paragraphs at a time before answering questions related to that section. The only questions that would be left at the end of the passage are summary-type test items, Mr. Abedi said.
“This approach increased reliability without affecting comprehension,” said Mr. Abedi.
Other parts of the research group that Mr. Abedi is working with are exploring such options as allowing students to select the passage they want to read out of several choices. The rationale is that students will be less frustrated or distracted if they’re reading material that interests them.
Mr. Abedi said the goal of the team鈥檚 research is to develop tests that are appropriate for all students, not just those with disabilities or English language learners.
Though the work at Peabody has its genesis in the special education field, Mr. Beddow hopes that the checklist could be used to create better tests for all students.
“My hope is, as a field, we’re shifting our perspective in writing tests,” he said.