Teaching to the Wrong Tests
California students are thankfully done with the old Standardized Testing and Reporting (STAR) program, and for better or worse, are moving on to the Common Core-aligned Smarter Balanced assessments. The change results from the passage of AB 484 in this past legislative session. Certain education reform advocates have concerns about the lack of accountability that will result from abandoning one testing regimen before the other is fully operational. Those concerns assume the prior notion of “accountability” was working, which I’d suggest is not really the case; the old tests were so weak and the stakes so high that we did students and schools a disservice by spending time focused on those tests. (To be fair, no law dictated to schools or teachers that they should raise scores by teaching to the test and engaging in test prep, but it was a rather predictable outcome; policy makers at many levels have exacerbated those problems for years, even with the evidence available to suggest they should have instead aimed to mitigate the damage.) Teachers with experience in high-performing schools can tell you that high scores don’t depend on test preparation. For lower-performing schools under more intense pressure to raise scores, the best approach would be to enrich student learning in any number of ways – almost anything other than test preparation. But test preparation is mostly what I’ve heard about from colleagues around the state, with the worst example being whole high school courses dedicated to test prep.
For all the debate about what to do with test results – rank schools, evaluate teachers, “improve” teaching – I’ve found too little concern about the weak quality of the actual tests. As an English teacher, I’m more skilled at identifying problems with the language arts assessments, and so I won’t discuss the math tests. I’ve seen, and tried taking, the Smarter Balanced sample ELA test for 11th graders, and I do think it shows some modest improvements over the California test it will replace. I might review that test in a future post, but what follows here is based on my review of a test prep booklet for California’s now-defunct 4th grade ELA test. Anyone in a state still using these kinds of tests should take note of these flaws, as I hardly expect California’s tests were unique.
In the booklet I reviewed, one question asks what would be the best item to add to a “school report about Helen Keller.” There’s no information offered about Helen Keller. Some students might be confused: why are they being asked what to put in a report on a person they know nothing about? The apparent intent of the question – apparent to me as an adult and a teacher – is for students to consider the nature of a biographical report (without calling it biography, I guess). Students who might know who Helen Keller was might be tempted into a wrong answer because all the wrong answers are the ones that actually sound interesting and actually have something to do with Helen Keller; instead, they are supposed to infer that the “correct” answer is the one uniquely devoid of specifics, offering merely the idea that reports should focus on events and chronology.
Questions that involve vocabulary are often poorly designed. The main problem is that they encourage the use of context but can’t differentiate the reasons for correct or incorrect answers. One item asks students to pick a synonym for the word “select” as it’s used in cell phone instructions telling us to press a certain button to “select” a particular function. Options include “use” and “choose” – both are logical, if you don’t already know the word “select.” Another “analysis” item in the booklet uses the phrase “feather in her cap.” Three of the four choices would be logical in the sentence, if a student doesn’t already know the idiom. Supposedly, successful analysis of the idiom assumes a certain attitude towards winning a race, which really is more than word analysis: why is pride more appropriate than surprise or thankfulness? Wouldn’t you be surprised to find a feather in your cap? After all, you hardly ever see anyone with a feather in their cap, especially in elementary school. And, if you like feathers a lot, perhaps “thankful” is a tempting interpretation.
Here’s one so bad you have to see it to believe it:
Read this sentence.
She baked a very tasty casserole.
The word casserole is
A. a Spanish word meaning bread.
B. a Chinese word meaning platter.
C. a French word meaning small bowl.
D. an Italian word meaning ice cream.
So, first of all, this English test item is framed as an exercise in understanding a word that’s not even English, which is distracting enough, and ignores the fact that “casserole” is now part of English as well. But never mind that you’re supposed to choose a non-English answer; you can still use context, like your teacher taught you, and if you’re in fourth grade you probably know the rest of the words in the sentence. You know that you can’t bake platters, bowls, or ice cream, so of course, the answer is A, because people do bake bread! But if you speak Spanish, you know the word for “bread” isn’t “casserole.” (Hey, how about that? The cultural bias in a question helps Spanish-speaking students.) Or maybe you sometimes have tuna casserole for dinner, in which case you know it’s not bread, and it’s certainly not a small bowl, or ice cream, so the answer must be B, “platter.” Except, you think you’ve heard kids speaking Chinese and they never say words that sound like “casserole.” Or maybe you are Chinese, and know better. The “correct” answer is C, although in both contemporary French and English, that would be a poor definition of the word. Confused much?
A similar item asks about Southwestern homes made of “adobe” – no context, just a sentence, and four options from other languages. Again, if you speak Spanish, you’re in luck! And if you can picture an adobe structure, you’re in good shape. If you need to do what the test suggests in this section and engage in “Word Analysis” then you’re left to guess if these homes are made of wood or brick. If you’re a child who thinks houses are made of wood, and you lack the background knowledge of the Spanish influence in the Southwest, what is the right answer? “Italian word meaning wood” is every bit as logical as “Spanish word meaning brick.”
On to other skills! Students should know about bibliographies, but I’d never ask students to memorize bibliography formats – whether they’re in elementary grades, middle school, or high school. In high school, we teach students to use bibliography tools that provide accurate citations, properly organized and formatted automatically, and as a backup, teach them how to use format guidelines, not memorize them. After all, there are different styles that are correct for different contexts and purposes. And of course, in a multiple choice format, a question that appears to require knowing the proper format may often be answered by reasoning, rather than actual knowledge – so it’s a faulty assessment. Then there are questions about a thesaurus entry which, regardless of the standard they aim to assess, can be correctly answered through a variety of reasoning, inferences, and vocabulary knowledge that may or may not be related to thesaurus familiarity.
Another problem that comes up frequently concerns “Which is the best _______” types of questions, when they ask about something beyond grammar and conventions. First of all, students I’ve talked with about tests say they hate this type of question because it uses the language of opinion in a format that they associate with fact. Sure, we adults can (usually) set aside such reactions, but even then we sometimes have to read the minds of the exam writers a bit more than might be reasonable for all of our students. These types of questions can do double duty; as reading questions, they might have some minimal value, if we could have a discussion with students and learn the reasoning behind their answers. When this style of question purports to be about editing writing, it’s even worse; students don’t write in multiple choice, and they may have multiple ideas of how to improve the writing but not find their idea reflected among the options. I often coach students to avoid a writing problem they can’t solve. I’m sure most writers have had that experience, trying in vain to wrangle an unruly sentence into coherence, only to strike the whole thing and come back at the idea with an entirely new sentence structure and diction.
It’s not hard to spot and dissect these lousy test items, but it’s hard to understand why otherwise intelligent people with good motives think such tests should be the backbone of effective policies or instructional improvement.