Testing Firm Hits Back Against Claims of Flaws
The state's test vendor has fired back at research from a University of Texas at Austin professor that calls into question the method it uses to develop Texas standardized exams, saying that his claims lack factual evidence.
Denny Way, the senior vice president for measurement services at Pearson, said the company welcomed a “very broad and open dialogue” about the role of standardized testing in evaluating students and schools. But he said that should take place based on well-founded research, not professor Walter Stroup’s “wild conclusions,” which he said would not stand up to the review of outside ...

Comments (13)
Mike Webster
“We design assessments to fit a very specific, clearly defined need. In the case of the accountability test, it is defined broadly to benchmark how schools and school districts in the state are doing from year to year,” said Everson, who has served as the vice president for research at the College Board, which produces the SAT and other admissions tests.
Standardized testing as a whole should be a measure of student progress, not a benchmark for schools/districts. Such benchmarking requires many more factors, including programming, college-going rates, career readiness rates, student discipline and attendance, culture, etc. This is yet another guise.
David Doss
After more than thirty years of working with educational assessments, I have to agree with Dr. Everson, who worked for the College Board, that statewide accountability exams are not sensitive to recent instruction. I have concluded that they are essentially measures of ability, and such measures do not change much from year to year. An examination of the average scale scores across grades on standardized achievement tests created using item response theory shows that they are very similar to growth in verbal and mathematical ability. The amount of growth is much greater at the early grades and then levels off at the upper grades. There is hardly any growth in the underlying scale scores at the higher grades.
From Dr. Everson’s remarks it is clear that testing companies understand that large gains are unlikely to occur from year to year and that policy makers, politicians, and the media are going to misuse the results, but that does not keep them from making large amounts of money from selling the tests they know will be misused. Bottom line: the tests are excellent measures of students’ developed ability, but they have very little use for the improvement of instruction. They provide meaningful results for understanding the intellectual status of individuals but are essentially useless for accountability purposes. They enrich testing companies and education reformers but do nothing for schools.
I suggest the Texas Tribune include this commentary by Diane Ravitch in Tribwire. There’s a lot of truth there. http://schoolsofthought.blogs.cnn.com/2012/08/09/my-view-rhee-is-wrong-and-misinformed/?hpt=hp_c2
ed researcher
There continues to be some confusion about whose work this actually is. In the sixth paragraph, it's referred to as "his" (Stroup's) research. It is not. It is his graduate student's dissertation. Stroup is not, as far as I can tell, a co-author on this dissertation.
It is highly unusual for an academic, whose promotion, reputation, and tenure decisions at an R1 university hinge on their ability to publish, to withhold submitting an article for publication that so provocatively challenges one of the most important advancements in test development/measurement in the past half-century. If this work was so important and could withstand scientific scrutiny, it would be a bombshell that would reverberate throughout the US, not just in Texas. This explanation absolutely does not hold water. I would also be curious about the timeline on this. Stroup claimed, in a 2009 presentation, that the results were being prepared for submission. Had STAAR been announced then? It also seems to raise some ethical issues about an academic being co-opted from publishing findings that would be potentially embarrassing to the state.
As far as I can tell, there's not a single psychometrician on Pham's dissertation. They all appear to be curriculum experts, with a few biologists, which is bizarre since the dissertation so heavily relies on IRT. This means that, from a department with a number of exceptional psychometricians, not one of them was on the dissertation committee, nor do the acknowledgments mention one even reviewing it.
Having read through most of the dissertation, the primary puzzle that is motivating this anti-STAAR/TAKS campaign (that the test was demonstrated to be "insensitive to instruction") is not tested in the dissertation. In fact, it was directly contradicted by Stroup and co-authors in evaluations that were funded by Texas Instruments. This is from the executive summary of an evaluation of the program in Richardson ISD referred to in the original article. If the test was completely "insensitive to instruction"--here, an intervention--how is it able to "close the gap" in achievement? Why, in the dissertation, does Pham then claim “These damages include affective issues with students who perform poorly, the maintenance of the “achievement” gap, and the sanctioning teachers for having the wrong profile of students in their classrooms.”?
"In RISD, the subset of MathForward students who failed the TAKS in the previous year gained more on the TAKS than similar students district-wide. This result was not due to chance, and suggests the MathForward students are “closing the gap” in achievement."
http://education.ti.com/sites/US/downloads/pdf/research_execsummary_RISD.pdf
Last, and unfortunately this is not clear from Pearson's statement or from this article, neither Stroup's nor his graduate student's identification strategy can separate "test taking ability" from "content knowledge" or "aptitude". Stroup claims that the dependence in test scores across years is a function of test design/IRT, and so kids are consistently ranked based not on their aptitude, but on their ability to "take tests." Their results show no such thing, and it would be quite a feat to do this, since one would have to design an experiment with two assessments in which one literally measures "test taking ability" but is unmoored from aptitude, and another that measures aptitude, but is unmoored from "test taking ability." Then, you would compare response profiles between these two assessments to determine whether the aptitude test performance is associated with "test taking ability". If he has in fact done this, then kudos, I'd love to see it.
Scooter Jonesy
Morgan, you were given a nice, juicy pitch to hit here and you basically watched it whiz right by you. Dr. Stroup's claim about not seeking "peer-review" of research that he bases his entire "suspicion" on is absolutely ridiculous. There is not a respectable field in academia that does not demand that research be peer-reviewed before it is referenced, especially a graduate student's dissertation, and even then, dissertation chapters are routinely submitted to academic journals where methodological and substantive experts can review them and deem them worthy of publication. Referencing a dissertation or non-peer-reviewed research is just not done - or at least not done by anyone who expects his work to be taken seriously. Aside from this, wouldn't you think it was important to look a little deeper into these accusations? How about looking at some previous work done by Dr. Stroup where he had absolutely no problem using TAKS as an outcome measure and received payment from Texas Instruments for such results? Why is he now raising these issues when they were perfectly fine when he was getting paid to use them? Or maybe why an individual heavily involved in teacher prep and certification programs (UTeach) might not appreciate the use of student assessments that can be (and are) used to gauge the effectiveness of teachers? Are these factors not relevant to this discussion? Do they not put some of his unsubstantiated comments in context? This is a very important issue and if you truly want to inform the public, please do so completely and provide us with the entire story, not just what they want you to tell us.
Shawn AndMichelle Wehmeyer via Texas Tribune on Facebook
Want to bet there will be no educators on Pearson's panel of "experts"? It is time Pearson realizes that the experts have already spoken and their voice is growing in number every day. Teachers and parents are speaking out against the testing industry and the damage that it has done to an entire generation of children and their education!
Adele Roberson
Look at number 2 on the list below and you will know why Republicans are so anxious for Texas Public Schools to fail.
Jul 24, 2012 ... The Center for Media and Democracy has EXPOSED over 800 "model" bills and resolutions secretly voted on by corporations and politicians ...
What is ALEC?
Privatizing Public Education ... (just think about all that taxpayer money)
Democracy, Voter Rights, and ...
Environment, Energy, and ...
Wisconsin Report
Guns, Prisons, Crime, and ...
Worker Rights and Consumer ...
Bills Affecting the Rights of ...
[ More results from www.alecexposed.org ]
T D
"Testing Firm Hits Back Against Claims of Flaws"?
Forget the article: you can see this in some of the comments above.
Ralph Moore
While I can't speak to the state tests (except that my daughter takes them), or to what they can or cannot measure, I do know that going to the press with findings that are not peer-reviewed is the equivalent of spiking a football on the 10-yard line. Stroup's rationale for not publishing is embarrassing. The professors in my doctorate program constantly pushed us to write everything with an eye to publication - and they wanted in on all of it due to the pressures they were under.
GS Crispus
So, how does one hold "education corporations" accountable? We can take away the University Professor's tenure, and slash his pay, but how do we hold the corporation accountable in this state?
Kim Blowers Midkiff via Texas Tribune on Facebook
Of course the Pearson rep is going to call Professor Stroup's research "wild conclusions!" How much money is Pearson taking in on these tests? 90-something million dollars?
Alice Taylor
“We design assessments to fit a very specific, clearly defined need. In the case of the accountability test, it is defined broadly to benchmark how schools and school districts in the state are doing from year to year,” said Everson, who has served as the vice president for research at the College Board, which produces the SAT and other admissions tests.
Because statewide accountability exams are created to compare school districts across the board, he said, they aren’t a good measure of how well specific instructional practices or curriculum programs are working within a single district.
“Typically they aren't that sensitive to instruction because the instruction varies from school district to school district,” he said.
If this is the case, then the tests have been misrepresented to everyone. The tests ARE absolutely used to rate student success and how well specific instructional practices work. If the tests were used as Pearson says they're supposed to be used, as a way to rate districts against each other, then why are kids being denied diplomas if they fail the tests? Why are teachers with low class scores being disciplined?
This entire statement is why teachers are howling so much when lawmakers propose that individual class test results should be used to rate the teacher's performance. There are many reasons why TAKS/STAAR-type tests are unfair as a teacher performance measurement, but when you have a test that's not designed to measure student performance and instructional effectiveness, but designed for another purpose altogether, used to make life-changing decisions, that's just instructional malpractice.
Teachers are not against being evaluated, just against being evaluated unfairly using models that are skewed for variables that are out of their control and with flawed instruments designed for other purposes.
hans5162@ix.netcom.com hans
As a parent, I think we need less testing and more instruction. I want my children to learn academic skills. 1/3 of the 180-day school year is devoted to different assessments. How much more could we be teaching our kids if we devoted that time to instruction in new skills? Everyone advocating for the testing has a financial interest in keeping the testing, from Pearson to Bill Hammond to Sandy Kress. They care nothing about whether my children are getting a good education in the public schools of Texas. They only care about the contracts, membership dues and political contributions generated by maintaining the status quo. Where are the demands for transparency and accountability in the outsourcing of critical government services? On that, they are mute, because there is no transparency.
Frank Smith
Does validity matter? Truthfully! Gov. Perry, Sandy Kress, and Republican legislators are being funded by Pearson either for lobbying or campaign funds to keep them in office. Testing has nothing to do with accountability. It's all about getting rich and making public schools look bad so the state can usher in vouchers and relieve the state's funding of public education as stipulated in the constitution. "Edubusiness" and venture capitalists are foaming at the mouth because privatization will be a financial boom for them. We will have consolidation of small rural districts; huge poor urban districts; a few large property-wealthy districts; for-profit charter schools; private schools; and vouchers to fund all of it. Those who can't afford good private and selective charter schools even with help from vouchers will be enrolled in poor-performing consolidated districts or poor-performing urban districts because they will not meet the transfer requirements of property-wealthy districts. Welcome to education segregation. I could be wrong. I hope so.