
Mixed Messages, or Meaningless Labels?

September 5, 2011

California education writers and citizens are wondering what to make of the contrasting stories told by state and federal accountability measures.  In short, the state uses an Academic Performance Index that sets a level of satisfactory performance and also targets for growth that are based on prior performance.  The federal system under No Child Left Behind (NCLB) relies on a measure called Adequate Yearly Progress.  (Kathryn Baron has a more detailed explanation at Thoughts on Public Education.)  Somehow, schools that are doing quite well by state measures are turning out to be failures on the federal measure, which strikes some observers as a case of mixed messages – though it is hardly that.  The fact is that “failure” and “program improvement” have lost any educationally descriptive meaning, and now only describe a relationship between some data and some legal provisions – though the consequences remain all too real for schools and districts tangled up in this failed policy.

The ultimate goal of NCLB is an exercise in fantasy: one hundred percent proficiency for all of America’s students and schools, all at the same time, and presumably never to drop again.  Some people still cling to this ideal and insist we should keep fighting to reach that goal.  Look, I expect city planners and engineers to construct roads and install traffic control measures with the goal of zero accidents on public roads, but I wouldn’t expect a city to fire those planners and engineers if the goal of zero accidents proves elusive.

If schools that are anything other than perfect are now failing, then the term “failing” no longer provides any useful information for talking about improving schools.  Reading a recent edition of “Themes in the News” (produced by UCLA IDEA), I found the Sacramento Bee reporting that Superintendent David Gordon of Sacramento County Schools said that,

as more good schools fall into Program Improvement status, No Child Left Behind could lose its credibility with the public.  “It doesn’t seem plausible,” Gordon said. “They will say it’s a darn good school and ask why it is in Program Improvement.”

I think it’s time for a new definition of failure, then.  Here’s my eduspeak suggestion:

failure: in school evaluation – the condition of having a school’s test results (overall, or in a subgroup) fall below politicians’ goals

The Sacramento Bee article has more information about successful failing schools and districts in that region.  Read carefully, and then pray for the federal government to come to our rescue and tell us how to avoid such failure in the future.

Meanwhile, I’ll highlight a few of our supposedly failing schools in my region.  Stay tuned for a couple more examples in my next blog post.  And let me preface the rest of this post by saying that I am hardly one to suggest that test scores should be the definitive measure of anything.  I use them to support my arguments here only to highlight the problems of labeling these schools as failures even using the preferred measurement of governments and policy makers.  Were I interested in a deeper examination of school quality and educational attainment, I would quickly bypass test scores and state ratings in favor of much more important and varied indicators.

Oxford Elementary School, Berkeley, CA

From Berkeley Patch:  “In Berkeley, the only school labelled as in need of Program Improvement (PI) — Oxford Elementary — also scored 871 for API, the second highest score in the district.”  (API scores max out at 1,000 with anything over 800 considered satisfactory for policy purposes).

I looked at a little more data about Oxford Elementary, and found that on the state’s decile rankings, the school has a rank of 9 out of 10, and even when compared to similar schools rather than all schools, Oxford has a ranking of 8 out of 10.  In other words, based on test scores, they’re doing better than at least eighty percent of the state’s schools and at least seventy percent of the schools that have similar characteristics.  They tested only 197 students, and they weren’t the same 197 students as last year.  The school met 15 out of 17 targets for school and subgroup performance, and the subgroups that did not meet targets consisted of 51 students and 105 students.  No matter, slap a label on ’em: FAILURE.
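For readers who want the decile arithmetic spelled out, here is a minimal sketch. It assumes only that a decile rank of d places a school in the d-th tenth of the ranked list, so it outscores at least the (d - 1) tenths below it. The function name is my own, not anything from the state's reporting.

```python
def percent_outperformed_at_least(decile_rank: int) -> int:
    """Lower bound on the share of schools ranked below a given decile.

    Assumption: ranks run from 1 (lowest tenth) to 10 (highest tenth),
    so a school in decile d outscores at least the (d - 1) tenths below it.
    """
    if not 1 <= decile_rank <= 10:
        raise ValueError("decile rank must be between 1 and 10")
    return (decile_rank - 1) * 10


# Oxford Elementary's two ranks as quoted in this post.
statewide_floor = percent_outperformed_at_least(9)
similar_schools_floor = percent_outperformed_at_least(8)
print(statewide_floor, similar_schools_floor)
```

The two results line up with the "at least eighty percent" and "at least seventy percent" figures above.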

Warren Hyde Middle School, Cupertino, CA

In this case, let’s first consider that the entire school district goes into program improvement due to test results at a portion of the schools.  Cupertino Union School District is a suburban district in Silicon Valley, in the same city as the corporate headquarters of Apple.  This district is home to the two highest rated public elementary schools in the entire state of California.  We have almost 5,600 public elementary schools.  Naturally, the test scores at those schools have much to do with the wealth and educational levels in the homes of the students, but it strains credulity to think that one small district can produce the two highest rated schools out of nearly 5,600 and simultaneously fall on the wrong end of the binary of success/failure.  “I challenge anybody to come to Cupertino and say that we are a PI district,” Superintendent Phil Quon reportedly told the Mercury News.

So how bad are the scores at Hyde Middle School?  Their API ranking is in the 9th decile overall, but their similar school rank is a 1 out of 10.  Sounds bad.  How far off are they?  A school in Poway, CA has an API only 4 points higher, and a similar schools rank of 3, while another similar school in the Cambrian District (same county as Cupertino) has an API 9 points higher and a similar schools ranking of 4.  And keep in mind, we’re talking about movement in the 80th to 90th percentile of overall school performance already.  It’s sort of like coming in last place in the Olympic finals for the 100-meter race.  Someone has to finish last, but the gap between first and last may not be so large, and that last-place finisher still lays legitimate claim to a place among the best.

Cabrillo Elementary School, Fremont Unified School District

Just one more quick example I dug up with a little research – let’s look at the numbers.  In the past six years, Cabrillo has improved its API by almost 100 points (717 to 814), but this year they entered program improvement by meeting only sixteen out of seventeen targets for AYP.  The one subgroup test where they came up short was Hispanic/Latino math scores.  The school is quite diverse and has many demographic groups that are not statistically significant enough to be weighted for AYP purposes, but I did find it interesting that Cabrillo did meet targets for socioeconomically disadvantaged students and English language learners.  I guess the message to Cabrillo staff is that they must do a better job of teaching math to their Hispanic or Latino students who are not socioeconomically disadvantaged and fluent in English.

And how much better must they do?  Are they dropping the ball here?  Last year, the percent proficient in this subgroup was 53.5, while this year, it was 52.8.  For the number of students tested, that’s one fewer proficient student than the prior year.  However, holding steady won’t cut it for NCLB.  No matter how many changes occurred on Cabrillo’s campus – due to staff turnover, larger class sizes, or any other factors that affect schools – the scores must go up.  If I read the “safe harbor” provisions correctly, Cabrillo needed to have 58.2% of this subgroup testing proficient in math this year to avoid program improvement.  Based on Cabrillo’s student population, we’re talking about a difference of seven students.
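Here is a rough sketch of how that safe harbor figure can be reproduced. It assumes the rule as it is commonly described: a subgroup qualifies if its share of students scoring below proficient shrinks by 10 percent from the prior year. The function name and the default reduction parameter are my own illustration, not official terminology, and the real provision carries additional conditions that this ignores.

```python
def safe_harbor_target(prior_pct_proficient: float, reduction: float = 0.10) -> float:
    """Percent proficient needed this year to qualify for safe harbor.

    Assumption: the non-proficient share must shrink by `reduction`
    (10 percent by default) relative to the prior year.
    """
    prior_not_proficient = 100.0 - prior_pct_proficient
    allowed_not_proficient = prior_not_proficient * (1.0 - reduction)
    return 100.0 - allowed_not_proficient


# Figures quoted in this post for Cabrillo's Hispanic/Latino math subgroup.
target = safe_harbor_target(53.5)   # prior year: 53.5% proficient
gap = target - 52.8                 # this year: 52.8% proficient
print(f"target: {target:.2f}%  gap: {gap:.2f} points")
```

Under that assumption the target comes out at about 58.15 percent, which matches the 58.2 figure above once rounded, and leaves the subgroup a little over five percentage points short.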

I’m not arguing that those seven students and their math skills aren’t important.  But it’s counterproductive to create unrealistic targets for a comparison of only partially similar students in partially similar conditions, which will then set in motion huge policy ramifications for an entire school or district – because seven students struggled on a math test.

And much further north…

And though this blog tries to focus on California, here’s one more example – “Misusing Data” – from a blogging colleague in the state of Washington.  Mark’s blog post begins:

I teach high school English. At our inservice meetings this past week, last spring’s HSPE scores were unveiled. Our 10th graders passed the reading HSPE at a rate of 91.7%, above the state average of 85.1%. Bolstering our pride even more, 75.3% of our 474 tested sophomores earned an L4 score, the highest bracket of scores. Out of all 474 students, only six scored L1 (“well below standard”). While we certainly still need to keep finding ways to support those kids who don’t yet have skills up to standard, those numbers are pretty good. Data doesn’t lie, right?

Something to celebrate, right?

Nope. The data, when read properly, actually proves that we failed. We failed miserably.

If you’ve read this far, I hope you’ll go on and read the rest of Mark’s post.  But, spoiler alert: Mark’s school really only failed in the new eduspeak definition of the word.  The real failure is No Child Left Behind.

11 Comments
  1. Lea
    September 5, 2011 9:02 am

    Based on the estimates and the fact that our API went up several points, we thought we would freeze at year 1 of PI. Wrong. Because our white 6th graders’ math scores went down 0.2, we’re now in year 2. TWO TENTHS OF A POINT. I don’t even know how we have enough white students to count for a subgroup. It’s beyond frustrating and demoralizing. The good news is that several teachers have realized that there’s always going to be something for them to bonk us on the heads with, so they’re not going to hyper-focus on test scores this year and just focus on teaching children. That’s a win for kids in my book.

    • David B. Cohen
      September 5, 2011 8:59 pm

      Lea, definitely better to have the teachers focus on the kids rather than the test, and I think your colleagues are right about PI being unavoidable. NCLB is about to crumble under its own weight – the only question is what kind of damage it will do coming down, in terms of unnecessary school closures, or faulty policies foisted on us in pursuit of waivers from the USDOE. I wish you well.

  2. September 5, 2011 11:54 am

    Thanks for this post and for reaching a little north!

    What is also frustrating in Washington is that in the last three years there have been three completely different assessments in high school mathematics. Not only are we not measuring actual student growth (for individual students), but the instrument by which students are measured is an ever-moving target. When this year’s math scores were released, schools across the state performed generally better on the math assessment than in years past. I have mixed feelings about this: on one hand, it makes me think we’re getting closer to an assessment which reconciles what kids “ought to know” and what is realistic for kids at the present. On the other hand, I worry about next year for my math counterparts. Will this growth make people think the bar has been lowered…Will the assessment change yet again?

    The silver lining for me and my department, I suppose, is that after years of the highest reading and writing scores in the county, now missing AYP might attract a little attention from the district office: when math struggled through its various incarnations of assessment and was failing to meet AYP, mountains of resources were allocated–and their use was very teacher-directed. I’m hoping the same for my department. I’m trying to cling to the positive here, I guess.

    • David B. Cohen
      September 5, 2011 8:53 pm

      More resources sounds like a good thing, but you have to win the debate on which resources matter, which means shared goals and vision. If the district or state has reached the point where the test score is the goal, then the resources you get might not be those you’d ask for. Good luck, Mark!

      • September 6, 2011 3:48 pm

        Too true… but it is so rare nowadays that teachers are given both trust and resources. I am cautiously optimistic… We’ll have to see how the cards play.

  3. September 5, 2011 7:31 pm

    While I would agree that NCLB’s magic 100% proficiency by 2014 goal is no longer valid given the small number of schools in California that even approach 90%, it doesn’t mean that the API is the best measure of student achievement. I believe that the percentage of students who are proficient or above, or in other words the percentage of students who are at or above grade-level, is the best measure. Parents want to know if their child is at grade-level. It is certainly one that parents will understand, unlike the API.

    The API is a flawed measurement. For example, the magic 800 that is the goal for California schools is relatively easy to achieve. Contrary to what many parents believe, when a school is at 800 it doesn’t mean that all students are at grade-level. When a school reaches 800, only about 55% of its students are at grade-level.

    In your three examples of schools who you believe are unfairly marked as failing schools, only one of them really makes your case. Cabrillo Elementary is indeed a school that has shown dramatic improvement in recent years. It has tripled the percentage of students at grade-level in language arts and mathematics since 2002.

    Oxford Elementary on the other hand has a significant and consistent achievement gap for its African American students and students in poverty. If you’re part of the 1/3 of the school who are white, you’re likely to be at grade-level as about 90% of those students are proficient in both language arts and mathematics. However, if you’re poor or African American, only about 40% of you will be at grade-level in language arts and about 60% of you will be at grade-level in mathematics.

    Similarly, at Warren Hyde Middle, there are significant achievement gaps for students who are not white or Asian. In fact, scores for white and English Learner students have been declining in mathematics since 2009. Mathematics scores for other students have been stagnant since 2002.

    So, the answer isn’t to abandon the measurement of grade-level proficiency percentages in favor of the API. As I’ve shown you, the API masks these achievement gaps and allows both school districts and the California Department of Education to claim improvement in academic achievement where significant issues still exist that are adversely affecting student learning. We need to continue looking at this common sense measure of grade-level proficiency rates. We need to look at individual student growth year to year using a measure such as the student growth percentile model. We should abandon the API because it doesn’t provide the information we need to know to improve our schools. It serves the needs of the adults in education at the expense of our students.

    • David B. Cohen
      September 5, 2011 8:51 pm

      Dave, thank you so much for enriching the conversation. I tried to put in a disclaimer that I don’t think the use of test scores is really the proper way to judge a school at all, and noted that I was only using API to highlight the meaninglessness of AYP at this point. If you want to debate the merits of API (independent of AYP), I wouldn’t be much of a debate opponent because I’d agree with you about what’s lost or hidden underneath the API ranking. But it’s also interesting to note that among the schools I wrote about, you picked the one with the lowest API ranking as the most successful. We can agree that they deserve credit for improvement (again, big assumption is that test score changes indicate improvement), but I think it would be fruitless to argue about one school being better than another as a yes/no question. One of my main objections to all this testing and ranking and rating business is that so much is made of so little information, and so much more of importance is lost in the conversation. I wouldn’t care too much about it if not for the fact that incredibly important decisions are made based on those rankings and ratings.
      On the topic of the achievement gap, I think it behooves any school or district to pay attention to it and ensure that they constantly check their curriculum, school policies, instructional methods – anything that might contribute to that gap. However, the gap exists before kids enter school, and much of it is created and maintained by factors outside the school’s control. Of course race is not determinant here, but the trends are well-documented and not surprising. White students are more likely to have parents with higher income and education levels, and I can tell you from working in an affluent area that wealthy parents (of any background) are using their money in ways that enrich their children’s learning – every day, every weekend, and especially over the summer. To think schools can make up for that on their own is wishful thinking (to put it mildly). When so many social factors and institutions, not to mention personal and societal history, are responsible for the gap, it’s bizarre to think that one institution alone – a school – could make up for all the others.

