Most states’ NCLB tests are, sadly, essentially insensitive to instruction, that is, those tests are unable to detect the impact of improved instruction in a school or district even if such improvement is unarguably present. The chief cause for such instructional insensitivity stems directly from the test-construction procedures employed to create almost all NCLB tests. Those procedures turn out to make scores on NCLB tests more directly related to students’ socioeconomic status than to how well those students have been taught. Instructionally insensitive NCLB tests simply can’t distinguish between effective and ineffective instruction. (Emphasis added)
W. James Popham, UCLA, “AN AUTUMNAL MESSAGE: LET FLY THE AYP PIGEONS.”
These profiles emerge as an artifact of how items are selected. Test developers include in their respective proprietary item pools only those items shown to sort students in the same relative order in terms of their likeliness of getting an item correct. (In other words, ideally for each item in a given area, Student Q should always be more likely to get it right than Student S.) When high-stakes tests are then assembled using only the items that fit with these internal sorting profiles, the tests themselves also end up being remarkably robust in keeping students in the same relative order in terms of their overall scores (Student Q’s overall test score is very likely to be higher than S’s).
Using this approach, test scores will continue to predict other tests scores in ways that will remain remarkably insensitive to the quality of content-specific instruction. And just one of the unintended consequences of this insensitivity to instruction may be that those schools feeling the most pressure to improve test scores will resort to emphasizing test-taking skills, as opposed to meaningful academic content, as a compelling alternative strategy for attaining immediate, if short-lived, results. (Emphases added)
Walter M. Stroup, “What Bernie Madoff Can Teach Us About Accountability in Education.”
I came across the phrase “insensitive to instruction” a few times recently, and I think it captures one huge flaw in the reliance on standardized tests. By design, they do not measure learning; instead, they sort students along a bell (or other) curve. If all students learn something, no matter how important that something is, it will not be included on a standardized test because it doesn’t sort.
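To make the “it doesn’t sort” point concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any test publisher’s actual procedure: the response data are made up, the sorting statistic (correlation between an item and each student’s score on the remaining items) is one common way to express how well an item ranks students, and the 0.2 cutoff is an arbitrary value chosen for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # An item everyone gets right (or wrong) cannot rank anyone.
        return 0.0
    return cov / sqrt(var_x * var_y)


def item_discrimination(responses, item):
    """Correlate one item's 0/1 scores with students' scores on the other items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)


def select_items(responses, threshold=0.2):
    """Keep only items whose sorting power meets the (assumed) threshold."""
    n_items = len(responses[0])
    return [
        i for i in range(n_items)
        if item_discrimination(responses, i) >= threshold
    ]


# Hypothetical data: each row is a student, each column an item (1 = correct).
# Item 0 stands for something every student learned.
RESPONSES = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

if __name__ == "__main__":
    print(select_items(RESPONSES))
```

In this toy data set, the item every student answers correctly has no variance, so its sorting power is zero and the filter drops it, while the items that separate students are kept.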
This inescapable truth seems to be lost on President Obama, Sec. Arne Duncan and all those in Congress, state legislatures and local school districts who keep calling for more money to be spent on testing and data systems. Although there is potential for better testing, I fear that this will only expand the inappropriate uses of existing tests, tests that for the most part hinder real accountability through this “insensitivity to instruction” and harm education by wasting time and money on things that don’t help students succeed at anything but taking tests. Garbage in, garbage out.
For more, see:
Deborah Meier, “‘Data Informed,’ Not ‘Data Driven.’”
Diane Ravitch, “President Obama’s Agenda.”
John Thompson, “God Does Not Play Dice.”
And for a local angle:
Thomas J. Mertz