You, Me, and AGT – Part 2
Last week, my classroom children took the California Standards Test. This is about as thrilling as walking the plank during Shark Week.
In February, John Deasy announced that he wants 30% of my evaluation based on these test scores. And then, later in the same week, came an email from my principal asking for my Stull Initial Planning Sheet.
Now, I don’t mind being evaluated and I like that being next door to my principal’s office means that he’s always in and out. I like that he was a fellow teacher right there with me a few years ago and that he’s very, very close to the challenges of teaching and learning. There’s trust.
I believe that measurement should be part of the job I do, and that I should be able to show some kind of proof that my children are learning. So I don’t get it when people get all upset about data. I am mystified when a colleague claims that teaching is an art that cannot possibly be measured and so we shouldn’t use any data at all in evaluation. It’s complicated. (Here’s a 14-minute video of Linda Darling-Hammond explaining all the factors that go into teacher effectiveness, just for a refresher: https://www.youtube.com/watch?v=oe76cUWIqBY. It turns out that student attendance is just as big a factor as a teacher’s “value-added” score.)
However, folks can’t go around declaring a jihad on testing, or teachers, or evaluation or anything at all, unless we’re prepared to offer an alternative.
I think there is one.
Darling-Hammond makes the point that the Common Core requires a whole different skill set and, obviously, you can never pinpoint the learning of a freshman who reads at 4th grade level using a 9th grade level test. You need something responsive that accounts for attendance, and individual learning styles, and skill levels. Something that even a classroom teacher’s grading system may not reflect. Something that shows engagement in reading and writing.
Maybe, something like this. Here’s a look at how I grade and analyze my students’ essays:
The capacity of a screen shot is limited, but you can tell that vertically, one student is green (83% mastery), five are yellowish (70% range of mastery), and the rest are orange (below 60% mastery). Of this last group, many have empty boxes signifying missing work that is still factored into their average. If you could see the numbers in detail, you’d be able to tell that some of these “problem” kids are actually doing all right. Wendy, for example, earned an 85% for her single essay. Averaged over three assignments, that comes out to 28%. Is she failing my language arts class? Yes. Can she pass the California High School Exit Exam? Very likely. A good number of my students fit this profile. Wendy’s scores on my own spreadsheet reflect her growing ability. And they show that if she’s motivated, Wendy can perform at grade level, as far as writing skill goes – which for me includes reading comprehension and critical analysis, and which is NOT measured on the multiple-choice, formulated-by-experts California Standards Test (CST).
What happens if we just take students’ proven abilities into account, rather than holding missing assignments against them? Take a look:
Wait a minute! Now all of a sudden it appears that most of my students aren’t in the danger zone after all, and neither am I. Vertically, students’ actual tested mastery is in the 70 percent range and several show green, which is 80+ percent.
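For readers who want to see the arithmetic behind the two spreadsheets, here is a minimal sketch with hypothetical scores (the student names and numbers beyond Wendy’s 85% are made up for illustration). It contrasts the standard gradebook average, where a missing assignment counts as a zero, with an average of only the work a student actually turned in:

```python
# Hypothetical gradebook: None marks a missing assignment.
# Wendy's 85% on one of three essays is from the post; the rest is invented.
scores = {"Wendy": [85, None, None], "Marco": [72, 68, None]}

def avg_missing_as_zero(marks):
    # Standard gradebook math: missing work counts as 0 and drags the average down.
    return sum(m if m is not None else 0 for m in marks) / len(marks)

def avg_completed_only(marks):
    # "Proven ability" math: grade only what the student actually turned in.
    done = [m for m in marks if m is not None]
    return sum(done) / len(done) if done else 0.0

print(round(avg_missing_as_zero(scores["Wendy"]), 1))  # 28.3 -> looks like failing
print(avg_completed_only(scores["Wendy"]))             # 85.0 -> grade-level work
```

Same student, same essay, two very different stories, depending entirely on how the blanks are treated.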
The spreadsheet part disaggregates students’ scores for two writing assignments. In December, we were in the 50% range of mastery across the board. In January, we’re in the 70% range on four out of five criteria. Improvement, right? That’s what I thought! But we can’t use it to measure how effective I am as an English teacher. We have to use raw test scores.
Here is a prediction of how the same students are likely to score on the CST:
This “forecast” is from a software program we use about three times a month in the classroom. It’s test prep, basically, that has interesting, non-fiction, leveled readings followed by some writing activities and multiple choice questions. The forecast itself is generated from a reading-level test that students take the first time they log on.
Notice the reading levels and Lexile scores of the students predicted to score “Basic” on the CST. Wendy reads at 6th grade level. If we are to put faith in the test that gave us this score, that in itself explains the prediction; it has nothing to do with her Language Arts teacher, your humble narrator. (On a separate note, the one student who had an 11.1% chance of scoring Advanced left us for a school in a nearby bedroom community. THAT’S a topic all by itself.)
The next questions are obvious ones – what are the pitfalls of my data tracking approach, and is there one that’s reliable and easy to use out there somewhere? Teachers? Let’s compare notes. Our evaluation agreement says a teacher can use “multiple measures” to show efficacy and I’d like some help over here… the sharks are circling and the quartermaster just gave me a shove from behind.