Faking the Grade

Last school year, the Texas Education Agency implemented a new “growth measure” purported to reward schools for improving student performance — even if they still fail state tests. The effect on state accountability ratings was immediate and dramatic: The number of campuses considered “exemplary” by the state doubled, to 2,158. But a new analysis shows the projections of future student success may be wrong as much as half the time.

By Brian Thevenot

July 9, 2010, 5 AM Central


Last school year, the Texas Education Agency implemented a new “growth measure” purported to reward schools for improving student performance — even if they still fail state tests. The effect on state accountability rankings was immediate and dramatic: The number of campuses considered “exemplary” by the state doubled, to 2,158.

Meanwhile, the formula drastically cut the number of “unacceptable” schools, which normally are in line for state-mandated overhauls and even closure. Without the so-called Texas Projection Measure, 603 schools would have fallen into that failing category this year. Under the formula applied, just 245 fell into that category.

The agency has said that the formula only rewards schools for failing students if they are improving at such a pace that they are expected to pass in subsequent years. The projection, the agency said, tested “92 percent accurate” in a test analysis of data that projected performance from 2008 to 2009, before the formula was implemented. But a new analysis of the same data shows the student projections that would actually affect school ratings were wrong as much as half the time, according to a document the agency released this week at the insistence of state Rep. Scott Hochberg, D-Houston. The data show the formula predicted incorrectly between 19 and 48 percent of the time. The highest error rate came in fifth grade math; the lowest in eighth grade reading. In fifth grade reading, 44 percent of students who were projected to pass actually failed. In eighth grade math, 38 percent. In 11th grade reading and math, 30 percent and 28 percent, respectively. The agency did not provide corresponding data for other tests in other subjects.

In another revelation, the formula can project passing grades for students who score well below passing grades. In one example, a student who passed math at the minimum level in fourth grade could then fail the reading test by a huge margin — getting nine questions right out of 48, when 35 correct answers are needed to pass — and his school would still be credited as if the student passed. In another case, a student who passed both math and reading tests at minimum levels could get every single question wrong on the writing test — literally, sign his name and go to sleep — and still be projected to pass the following year. Such examples were among many discussed at a recent hearing of an education subcommittee of the Texas House, at which Hochberg grilled Criss Cloudt, associate commissioner for accountability at the TEA. Schools, instead of being penalized for such failures, get credited as if those students were passing, which inflates hundreds of accountability rankings statewide.

What’s more, the formula’s predictions have nothing to do with the pace of academic improvement of the individual students in the first place, despite the TEA’s initial statements that they do and the implication of the name, “growth measure.” In fact, a student’s performance can actually be worsening from one year to the next, and his or her school will still be given credit for a passing projection, Cloudt conceded under questioning from Hochberg at the subcommittee hearing and in a Thursday interview with the Tribune.

Meanwhile, as hundreds more schools reported "recognized" or "exemplary" status to parents and taxpayers — based on failing scores elevated by the formula — the students themselves still had to deal with the consequences of the failure, including being held back in some grades.

Texas Education Commissioner Robert Scott wasn't immediately available for comment on Thursday. He also skipped the legislative hearing last week, at which Hochberg — who said he had worked with the TEA to schedule the hearing around Scott’s availability — was left to grill the commissioner’s subordinates about the formula. Late Thursday evening, however, Scott issued a letter to district administrators across Texas signaling he might overhaul or even eliminate the projection measure. In the letter, Scott says educators’ “hard work is being overshadowed by criticism of the use of TPM for state accountability purposes,” and that he is seeking input on “several options” to be implemented in 2011, including “suspension of the use of TPM for accountability purposes.” Among the other options: allowing districts to opt out of the measure, setting “performance floors” to ensure that the formula doesn’t apply to students who fail a particular test by wide margins, counting the projected passers as a “fraction of a passer,” and placing other limits on the extent and manner in which a district can apply the formula.

As far as Hochberg is concerned, the measure should be scrapped immediately. If nothing changes, the measure will be applied to the next set of school rankings, which are due out at the end of the month and will be based on 2009-2010 test performance. “The way they’ve used the TPM makes the accountability ratings meaningless. If a student could fail every year up to grade 11, and the schools can get credit for that student passing every year, then it’s a bad measure,” Hochberg said. “This really disrespects the work of educators, because if people don’t have faith in the results that are reported to them, parents and the public, then the legitimate good work is subject to question.”

The Texas Association of Business, along with other advocacy groups of various political stripes, has criticized the measure before — and found even more reason to do so upon learning its projections are often simply wrong. “We opposed the Texas Projection Measure when we thought it was valid,” said TAB President Bill Hammond. “If it turns out the projection doesn’t even work, that just makes our argument that much stronger. Based on what you’re telling me today, the state should place an immediate hold on all the ratings and redo them without the projection model.”

Cloudt said the state would release accuracy measures for the projection model along with the new accountability ratings for schools. She said she expected the projections would be more accurate because of a tweak in the formula that based projections on two years of student data instead of one. But she said that she “can’t confirm” that the projections from 2009 to 2010 would be any more accurate than those examined from 2008 to 2009, which were not used in accountability ratings.

Asking the wrong question

In response to previous criticism that the measure inflates school ratings, Scott has given assurances that the agency would dump the formula if its predictions didn’t pan out. But the analysis the agency released to Hochberg this week shows that state officials already had data that could have shown the formula’s predictions were likely to be wrong often, and they moved forward to implement the system anyway.

The TEA, however, chose to measure accuracy much differently. Until now, the agency has said that the TPM “classification accuracy percentages” range from 78 percent to 98 percent, with an average of 92 percent. Those figures are based on an analysis of projections made using student data from 2007-2008, the year before the policy was implemented. But those high accuracy percentages, listed in an agency report, were based on projections for all students in Texas — rather than only those students who failed but were projected to pass. Turns out, those projections — the only ones that matter in boosting the accountability ratings of their schools — are where almost all of the inaccuracies lie. Calculating accuracy rates based on all students effectively masks errors in the smaller group of failers-turned-passers. Essentially, when the agency went to study the projection accuracy, it asked the wrong question — and got an answer that made the formula appear far more accurate than it is.

In an interview Thursday, Cloudt stuck to the position that the agency’s original analysis is a better way to measure accuracy. “You have to look at the accuracy of the measure for every single student. You can’t just pick a measure that applies to the kids who fail and are projected to pass,” she said. “What you’re doing is really distorting the measure itself by only looking at students projected to meet the standard who didn’t meet it … The data show the measure is working.” At the same time, Cloudt acknowledged that students who fail and are projected to pass are the only ones who affect school rankings.

The reason the formula can accurately predict the performance of all students, taken as one huge group, is that the vast majority are either well above or well below the passing score, making their future performance easy to forecast — it will be the same, either passing or failing. There are only three performance levels on the state TAKS test: did not meet standard, met standard, and commended. So it’s only the performance of students in a narrow band just below the passing line that is difficult to predict.
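The arithmetic behind that masking effect can be sketched in a few lines of Python. The counts below are invented purely for illustration and are not TEA data; they are chosen so that the overall figure lands near the agency's reported 92 percent average while the one group that affects ratings (students who failed but were projected to pass) is right only about half the time. The group labels and the `accuracy` helper are likewise hypothetical.

```python
# Hypothetical counts for one grade and subject. NOT TEA data.
# Key: (result this year, projection for next year)
# Value: (number of students, number whose projection came true)
groups = {
    ("passed", "projected to pass"): (7800, 7600),
    ("passed", "projected to fail"): (200, 120),
    ("failed", "projected to fail"): (1000, 940),
    ("failed", "projected to pass"): (1000, 540),
}


def accuracy(rows):
    """Share of projections that came true across the given groups."""
    rows = list(rows)
    students = sum(count for count, _ in rows)
    correct = sum(hits for _, hits in rows)
    return correct / students


# The agency's approach: score the projection over every student.
overall = accuracy(groups.values())

# The approach in the new analysis: score only the students whose
# projection changes how their school is rated.
ratings_relevant = accuracy([groups[("failed", "projected to pass")]])

print(f"All students:              {overall:.0%} accurate")
print(f"Failed, projected to pass: {ratings_relevant:.0%} accurate")
```

With these made-up counts, the easy-to-predict majority dominates the overall figure, so a near coin-flip in the smaller group barely moves it.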

“Growth measure” doesn’t measure growth

It’s unclear where the push for a growth measure in state accountability started. Cloudt suggested the agency was reacting to a legislative mandate. TEA staff, she said, examined several options currently in use in other states and approved by the U.S. Department of Education and found the option they now use the most accurate. “It’s as accurate as it gets,” she said. Cloudt cited the state statute mandating the new measurement as Section 39.034 of the education code. It reads, “The commissioner shall determine a method by which the agency may measure annual growth in student achievement from one school year to the next.”

In January 2009, the TEA first got approval for the formula from the federal agency, then in the waning days of the administration of Secretary of Education Margaret Spellings, a George W. Bush appointee and a Texan. In announcing it, the TEA described the formula much in the way the legislation Cloudt cited reads: “Growth measures track individual student achievement on state tests from one year to the next to determine whether students are meeting annual goals,” the TEA news release read. It quoted Scott saying, “Our ability to use a growth measure for accountability purposes will help recognize the hard work being done in schools where students are making significant educational progress.”

The formula, however, actually does nothing of the kind, which prompted pointed questions from Hochberg at the recent legislative hearing. “This is a growth measure, right? That’s what it was designed to be, right?” Hochberg asked.

Cloudt stammered in response: “The [U.S. Department of Education] considers it a growth measure … We categorize it as a growth measure. It’s probably better characterized as a, um, um, as a measure of, um, um, um … Let me think of a better way to put it, um, um, probably better categorized as how, um, whether you’re on track to pass the test in a subsequent grade.” Cloudt then stressed that the projection was based on the performance of “hundreds and hundreds and hundreds of students … exactly like that student with exactly the same score, and they passed in a subsequent grade.”

But that’s not true, either, as Cloudt would acknowledge later in the meeting, after she was corrected by a representative from Pearson, the textbook-and-testing company that devised the formula for the state. In fact, the measure predicts future performance of failing students by multiplying test scores by a factor intended to mimic typical performance of all students in Texas, of all abilities — not the individual student’s own track record or that of similarly able students. The bottom line: Every student making a predefined, non-passing score in each subject, often a substantially lowered bar, is considered passing for the purposes of reporting their school’s performance to parents and taxpayers. Another way to think of it: handing out free touchdowns to any team that gets to the 20-yard line, or, in some cases, midfield; or in the case of the writing test example, while sitting on the bench.
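A rough sketch of that idea in Python follows. The article does not give the actual TPM coefficients or passing standard, so every number here (the weights, the intercept, the passing score and the student scores) is invented, and the one-year form shown is an assumption based on the description above. The point is only that a projection built from statewide weights applied to this year's scores cannot tell an improving student from a declining one.

```python
# All constants are hypothetical; the real TPM values are not given
# in the article.
PASSING_SCORE = 2100
STATEWIDE_WEIGHTS = {"math": 0.6, "reading": 0.4}
STATEWIDE_INTERCEPT = 150


def project_next_year(current_scores):
    """Project a future score from this year's scores alone.

    The same statewide weights are applied to every student, so the
    student's own earlier results never enter the calculation.
    """
    weighted = sum(
        weight * current_scores[subject]
        for subject, weight in STATEWIDE_WEIGHTS.items()
    )
    return STATEWIDE_INTERCEPT + weighted


# Two invented students with identical scores this year but opposite
# trends: one rose from last year, the other fell.
improving = {
    "last_year": {"math": 1900, "reading": 1800},
    "this_year": {"math": 2100, "reading": 1900},
}
declining = {
    "last_year": {"math": 2300, "reading": 2100},
    "this_year": {"math": 2100, "reading": 1900},
}

projection_improving = project_next_year(improving["this_year"])
projection_declining = project_next_year(declining["this_year"])

for label, projection in [
    ("improving", projection_improving),
    ("declining", projection_declining),
]:
    counted_as_passing = projection >= PASSING_SCORE
    print(label, round(projection), "counted as passing:", counted_as_passing)
```

In this sketch both students fail reading this year, both receive the same projection, and both would be counted as passing, whichever direction their scores are heading.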

