Next Steps for LAUSD Teacher Evaluations: You, Me, and AGT
[edit: see end of post]
An interesting thing happened at a Teach Plus event in Los Angeles last month. LAUSD Superintendent John Deasy engaged in a “conversation” with about 100 public and charter school teachers and teacher-leaders on the district’s new Teacher Evaluation System (TES). After Deasy’s opening remarks, moderator Celine Coggins, CEO of Teach Plus, asked the assembly how we felt about evaluation and what constituted an “effective teacher.” Three disparate comments followed:
One teacher declared that it is high time that variations in students’ school experiences and subsequent performance were reduced. The woman stated that teachers needed to “get over” their reluctance to identify what they were striving for in the classroom and for individual students, and then actually measure it. This, she said, would make teachers more effective by revealing what actually worked.
Another teacher responded by saying that creativity and imagination cannot possibly be measured, and that education was always going to have that element of art and mystery. In a later personal conversation, she did not clearly explain how she planned, graded and reflected on her students’ progress, but she did state that they all performed extremely well on their California Standards Tests.
A third teacher said we would measure what we thought to be important, and educators need to define what this is themselves. Dr. Deasy “respectfully disagreed” with her, saying we have “standards clearly defined” at every grade level, and asking why that wasn’t obvious.
Will our new, computerized Teacher Evaluation System be able to find any middle ground among the three viewpoints?
The Los Angeles Unified School District rolled out its technology-based teacher evaluation system in August. Seven hundred fifty “pioneer” teachers volunteered to test the new system, which features a newly negotiated set of teaching and learning standards. The new framework was devised with input from over 1,000 educators working in small groups. To begin the process, teachers grade themselves (from “ineffective” to “highly effective”) on 63 teaching standards, then fill in a lesson plan template, identifying which of the 63 standards their lesson addresses, and to what degree. Trained observers download the lesson, observe the teacher using it, and enter their own data. Everything but the observation itself is managed online.
Teachers at the September conversation showed a real willingness to reform their own approach to evaluation; not one spoke up to say they would not participate, or that they opposed the new system. Of course, many stated historical concerns: did we really expect that a new system would foster collaboration, when the current system supposedly depends on collaboration but doesn’t produce enough of it? What about principals who are not experts in the teacher’s content area? Are we really going to continue using standardized test scores, when there is so much evidence that many learning gains happen in ways the tests cannot measure?
October’s “pioneer teachers” session included a lesson on AGT – Academic Growth over Time. This measure, the facilitator explained, factored in the intangibles we worry about: students’ language abilities, past performance, poverty factors, etc. The baseline is three, a comparison number that stands in for zero on the AGT number line, so that scores can be compared without anyone landing in negative territory. If my students perform (that is, if I perform) below the district average, I’m not minus-one, I’m at “two.” Underneath my own performance number is a line stretching from my possible lowest score (one) to my possible highest score (five), called the “confidence interval.” In a perfect world, my own AGT would be above the district norm, showing my students had gained more than others, and the confidence interval would be very, very short, showing supreme confidence in that result.
In reality, while my AGT was slightly higher than the district average (good! good?) the confidence interval went from about 1.5 to about 4, because the results were based on twelve students. Which twelve? What happened to the rest?
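Why would twelve students produce such a wide interval? The district's actual AGT algorithm is not described here, so the following is only a rough sketch of the general statistical idea, not the district's method. It assumes a plain normal-approximation interval around an average, and the spread and estimate values are hypothetical numbers chosen for illustration. The point it shows is that the interval narrows with the square root of the number of students, so a small group yields a long line.

```python
import math

Z_95 = 1.96  # normal-approximation multiplier for a 95% interval


def ci_half_width(spread, n_students, z=Z_95):
    """Half the length of an approximate 95% interval around an average.

    `spread` is the standard deviation of individual student results.
    This is a textbook formula, not the district's AGT calculation.
    """
    if n_students <= 0:
        raise ValueError("n_students must be positive")
    return z * spread / math.sqrt(n_students)


def interval(estimate, spread, n_students):
    """Return (low, high) for the approximate interval around `estimate`."""
    half = ci_half_width(spread, n_students)
    return (estimate - half, estimate + half)


if __name__ == "__main__":
    # Hypothetical values for illustration only; not taken from any AGT report.
    HYPOTHETICAL_SPREAD = 2.0
    HYPOTHETICAL_ESTIMATE = 3.2
    for n in (12, 48, 150):
        low, high = interval(HYPOTHETICAL_ESTIMATE, HYPOTHETICAL_SPREAD, n)
        print(f"{n:>3} students: {low:.2f} to {high:.2f}")
```

Under these assumptions, quadrupling the number of students from 12 to 48 cuts the interval's length in half, which is why the question of what happened to the rest of the class matters so much.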
All right, we’re piloting a new system. We’re helping work out the glitches. That’s fine. And I understand that my AGT number has been run through an algorithm designed by experts, and that efforts have been made to ensure that it’s fair. I’m comforted knowing that gender, race, homelessness and special education status have been factored in. That takes some of the wind out of my protests, but I still have a question: when and where can I measure for the standards I am teaching, in a way that’s appropriate for these students, at this time?
We teach differently from one class to the next, from one year to the next, and we do it because we are paying attention to our students’ specific needs. I have asked my district, my administration, my network partner and my district’s foundation partner for a way to track and measure my own metrics, and though I finally found one possible solution in an online grading program, no one who returned my calls or emails could really help me. Now the question becomes: when can I be evaluated on my actual practice rather than student results from an annual high-stakes test? It’s a question I continue to ask and encourage my union to pursue. If you’re a teacher, you should too.
[Note: This blog post was originally posted as an ACT Guest Blog Post. Lisa has since become a regular contributor to InterACT, and her old “guest” posts have been modified to reflect her authorship, and have had the original introduction removed. – David Cohen]