Mixed Messages, or Meaningless Labels?
California education writers and citizens are wondering what to make of the contrasting stories told by state and federal accountability measures. In short, the state uses an Academic Performance Index that sets a level of satisfactory performance, along with growth targets based on prior performance. The federal system under No Child Left Behind (NCLB) relies on a measure called Adequate Yearly Progress. (Kathryn Baron has a more detailed explanation at Thoughts on Public Education). Somehow, schools that are doing quite well by state measures are turning out to be failures on the federal measure, which strikes some observers as a case of mixed messages – though it is hardly that. The fact is that “failure” and “program improvement” have lost any educationally descriptive meaning; they now describe only a relationship between some data and some legal provisions – though the consequences remain all too real for schools and districts tangled up in this failed policy.
The ultimate goal of NCLB is an exercise in fantasy: one hundred percent proficiency for all of America’s students and schools, all at the same time, and presumably never to drop again. Some people still cling to this ideal and insist we should keep fighting to reach that goal. Look, I expect city planners and engineers to construct roads and install traffic control measures with the goal of zero accidents on public roads, but I wouldn’t expect a city to fire those planners and engineers if the goal of zero accidents proves elusive.
If schools that are anything other than perfect are now failing, then the term “failing” no longer provides any useful information for talking about improving schools. Reading a recent edition of “Themes in the News” (produced by UCLA IDEA), I found the Sacramento Bee reporting that Sacramento County Superintendent of Schools David Gordon said that,
as more good schools fall into Program Improvement status, No Child Left Behind could lose its credibility with the public. “It doesn’t seem plausible,” Gordon said. “They will say it’s a darn good school and ask why it is in Program Improvement.”
I think it’s time for a new definition of failure, then. Here’s my eduspeak suggestion:
failure: in school evaluation – the condition of a school’s test results (overall or in a subgroup) falling below politicians’ goals
The Sacramento Bee article has more information about successful failing schools and districts in that region. Read carefully, and then pray for the federal government to come to our rescue and tell us how to avoid such failure in the future.
Meanwhile, I’ll highlight a few of our supposedly failing schools in my region. Stay tuned for a couple more examples in my next blog post. And let me preface the rest of this post by saying that I am hardly one to suggest that test scores should be the definitive measure of anything. I use them to support my arguments here only to highlight the problems of labeling these schools as failures even using the preferred measurement of governments and policy makers. Were I interested in a deeper examination of school quality and educational attainment, I would quickly bypass test scores and state ratings in favor of much more important and varied indicators.
Oxford Elementary School, Berkeley, CA
From Berkeley Patch: “In Berkeley, the only school labelled as in need of Program Improvement (PI) — Oxford Elementary — also scored 871 for API, the second highest score in the district.” (API scores max out at 1,000, with anything over 800 considered satisfactory for policy purposes.)
I looked at a little more data about Oxford Elementary, and found that on the state’s decile rankings, the school has a rank of 9 out of 10, and even when compared to similar schools rather than all schools, Oxford has a ranking of 8 out of 10. In other words, based on test scores, they’re doing better than at least eighty percent of the state’s schools and at least seventy percent of the schools that have similar characteristics. They tested only 197 students, and they weren’t the same 197 students as last year. The school met 15 out of 17 targets for school and subgroup performance, and the subgroups that did not meet targets consisted of 51 students and 105 students. No matter, slap a label on ’em: FAILURE.
William Hyde Middle School, Cupertino, CA
In this case, let’s first consider that the entire school district goes into Program Improvement due to test results at a portion of its schools. Cupertino Union School District is a suburban district in Silicon Valley, in the same city as the corporate headquarters of Apple. This district is home to the two highest-rated public elementary schools in the entire state of California, which has almost 5,600 public elementary schools. Naturally, the test scores at those schools have much to do with the wealth and educational levels in the students’ homes, but it strains credulity to think that one small district can produce the two highest-rated schools out of nearly 5,600 and simultaneously land on the wrong end of the success/failure binary. “I challenge anybody to come to Cupertino and say that we are a PI district,” Superintendent Phil Quon reportedly told the Mercury News.
So how bad are the scores at Hyde Middle School? Their API rank is in the 9th decile overall, but their similar-schools rank is a 1 out of 10. Sounds bad. How far off are they? A school in Poway, CA, has an API only 4 points higher and a similar-schools rank of 3, while another school in the Cambrian District (same county as Cupertino) has an API 9 points higher and a similar-schools rank of 4. And keep in mind, we’re talking about movement within the 80th to 90th percentile of overall school performance. It’s sort of like coming in last in the Olympic 100-meter final. Someone has to finish last, but the gap between first and last may not be so large, and that last-place finisher still lays legitimate claim to a place among the best.
Cabrillo Elementary School, Fremont Unified School District
Just one more quick example I dug up with a little research – let’s look at the numbers. In the past six years, Cabrillo has improved its API by almost 100 points (717 to 814), but this year it entered Program Improvement after meeting only sixteen of its seventeen AYP targets. The one subgroup where it came up short was Hispanic/Latino math scores. The school is quite diverse and has many demographic groups too small to count as numerically significant subgroups for AYP purposes, but I did find it interesting that Cabrillo did meet its targets for socioeconomically disadvantaged students and for English language learners. I guess the message to Cabrillo staff is that they must do a better job of teaching math to their Hispanic or Latino students who are not socioeconomically disadvantaged and who are fluent in English.
And how much better must they do? Are they dropping the ball here? Last year, the percent proficient in this subgroup was 53.5; this year, it was 52.8. For the number of students tested, that’s one fewer proficient student than the prior year. However, holding steady won’t cut it for NCLB. No matter how many changes occurred on Cabrillo’s campus – staff turnover, larger class sizes, or any other factors that affect schools – the scores must go up. If I read the “safe harbor” provisions correctly, Cabrillo needed to have 58.2% of this subgroup testing proficient in math this year to avoid Program Improvement. Based on Cabrillo’s student population, we’re talking about a difference of seven students.
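For anyone who wants to check my math, here is a minimal sketch of that safe-harbor arithmetic, assuming my reading of the provision is right (a subgroup makes safe harbor by cutting the prior year’s non-proficient percentage by ten percent) and assuming a hypothetical subgroup of about 130 tested students, a number I chose only because it is consistent with the seven-student gap described above.

```python
# A sketch of the NCLB "safe harbor" arithmetic, assuming the provision
# works as I read it: a subgroup makes safe harbor if its percentage of
# non-proficient students drops by at least 10% from the prior year.
# The subgroup size below is hypothetical, chosen only to be consistent
# with the roughly seven-student gap described in the post.

prior_pct_proficient = 53.5     # Hispanic/Latino math subgroup, last year
current_pct_proficient = 52.8   # same subgroup, this year
subgroup_size = 130             # hypothetical number of tested students

# Safe harbor target: cut last year's non-proficient share by 10 percent.
prior_pct_non_proficient = 100.0 - prior_pct_proficient         # 46.5
target_pct_proficient = 100.0 - 0.9 * prior_pct_non_proficient  # about 58.2

shortfall_points = target_pct_proficient - current_pct_proficient
students_short = round(shortfall_points / 100.0 * subgroup_size)

print(f"Safe harbor target: {target_pct_proficient:.1f}% proficient")
print(f"Shortfall: {shortfall_points:.1f} points, about {students_short} students")
```

Run with those numbers, the target comes out to about 58.2 percent proficient and the shortfall to about seven students, matching the figures above.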
I’m not arguing that those seven students and their math skills aren’t important. But it’s counterproductive to set unrealistic targets based on a comparison of only partially similar students in partially similar conditions, and then let those targets trigger huge policy ramifications for an entire school or district – all because seven students struggled on a math test.
And much farther north…
Though this blog tries to focus on California, here’s one more example – “Misusing Data” – from a blogging colleague in the state of Washington. Mark’s blog post begins:
I teach high school English. At our inservice meetings this past week, last spring’s HSPE scores were unveiled. Our 10th graders passed the reading HSPE at a rate of 91.7%, above the state average of 85.1%. Bolstering our pride even more, 75.3% of our 474 tested sophomores earned an L4 score, the highest bracket of scores. Out of all 474 students, only six scored L1 (“well below standard”). While we certainly still need to keep finding ways to support those kids who don’t yet have skills up to standard, those numbers are pretty good. Data doesn’t lie, right?
Something to celebrate, right?
Nope. The data, when read properly, actually proves that we failed. We failed miserably.
If you’ve read this far, I hope you’ll go on and read the rest of Mark’s post. But, spoiler alert: Mark’s school really failed only under the new eduspeak definition of the word. The real failure is No Child Left Behind.