We can’t do right by kids if even white liberals like Tim Wise are confused about testing
January 15, 2016

by Catlin Goodrow

There’s a lot of b.s. out there in the world about standards and standardized testing, and it’s gotten to the point where many liberals take misinformation as truth.

Take, for example, what I heard anti-racist activist Tim Wise say recently on the Rock the Schools podcast.

He said:

Most of the tests we’re using to evaluate excellence under federal educational law and various state laws are these standardized exams which are essentially guaranteed by definition to have 50% of the test takers perform under 50%… which means that we’re using tests that essentially produce failure on purpose so as to then say whether schools are failing.

That sounds terrible, even immoral… except it’s not true.

A little background to battle the misinformation:

Wise was referring to what are known as norm-referenced tests. As a kid, if you took a test and got back a report that said you were in the 99th percentile or the 62nd percentile, you took a norm-referenced test. These types of tests take all the test takers and basically rank them against one another along a continuum from the 1st to the 99th percentile. It’s more complicated than that, but that’s the basic idea. As Wise points out, that would be a terrible way to evaluate schools, because some would always be failing.

Luckily, that’s not how schools are evaluated. The critical tests almost all kids take (including the new Common Core-aligned tests) are what are called criterion-referenced exams. Criterion-referenced exams are based on standards. If a standard says that a child should be able to identify important ideas in nonfiction, the test will ask a child to read some nonfiction and identify the important ideas. Wise pointed out that these are pretty good ways to assess what kids have learned.

Theoretically then, if all children were getting wonderful, standards-aligned instruction, every kid in the state could get all the answers right on such an exam. Rainbows would appear in the sky and unicorns would dance on the school lawns; it would be a beautiful day.
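For readers who like to see the difference in numbers, here is a minimal sketch in Python. The scores and the cut score of 55 are made up for illustration; they do not come from any real exam. It compares a norm-style measure (the share of kids below the group median) with a criterion-style measure (the share of kids at or above a fixed cut score) before and after every student improves by 20 points.

```python
from statistics import median

def share_below_median(scores):
    """Norm-style measure: fraction of test takers scoring below the group median."""
    m = median(scores)
    return sum(s < m for s in scores) / len(scores)

def proficiency_rate(scores, cut_score):
    """Criterion-style measure: fraction of test takers at or above a fixed cut score."""
    return sum(s >= cut_score for s in scores) / len(scores)

# Hypothetical raw scores for eight students (illustrative only).
before = [35, 42, 50, 58, 61, 67, 73, 80]
# The same students after everyone improves by 20 points.
after = [s + 20 for s in before]

CUT_SCORE = 55  # assumed proficiency bar, not taken from any real test

print(share_below_median(before), share_below_median(after))
print(proficiency_rate(before, CUT_SCORE), proficiency_rate(after, CUT_SCORE))
```

In this toy data, half the students sit below the median both before and after, no matter how much everyone improves, while the share meeting the fixed bar rises from 62.5% to 100%. That is the sense in which every kid could, in principle, pass a criterion-referenced exam.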

Of course, that’s not what’s happening. Many, many children are not being taught to read, write, and count. That’s why it matters how we talk about tests.

I’m not sure how Wise was misinformed about the type of tests students take most often. It’s not surprising, however, because Facebook, Twitter, and the dining room tables of middle-class liberals are exploding with falsehoods, big to small, about standardized testing and how it is destroying our educational system. It’s particularly insidious, because falsehoods and misinformation are infecting the discourse of people, like Wise, who want to further anti-racist work.

Many of these well-meaning liberals believe that if standardized tests were removed from schools, education would somehow start working for all children. I have three responses to this, informed by my years as a teacher and teacher educator and coach.

First: thoughtful educators use standardized tests to elevate and guide their instruction. They don’t let the existence of standardized tests turn their classrooms into test prep factories. Take one fifth grade teacher I coach. Her students were studying sensory language in poetry. The typical way to teach that standard might be to have kids find similes and metaphors in poems. If the teacher is creative, she might have the kids illustrate them. However, when we looked at the standardized (criterion-referenced) test, we found that what students were actually going to be asked to do was to analyze how sensory language impacted the author’s message. That’s a lot more interesting and rigorous than just finding some similes.

A week after looking at the tests with the teacher, I walked into the classroom and couldn’t find her. It turned out she was sitting at a table having a rich discussion with her students. Kids were reading Langston Hughes’s “Dreams” and analyzing how the images of barrenness and broken wings shaped Hughes’s overall theme. That’s the kind of instruction that any parent would want for their kids (I hope), and looking at test samples helped the teacher to make it happen.

Second: spreading misinformation about testing keeps us from having nuanced conversations that we really need to have. For example, how can we eliminate bias from tests? How can we ensure that we are not holding children of color to white norms, while still keeping teachers accountable for teaching kids to read, write, and count? Are there more authentic ways to assess students that would also be cost-effective for states? How can we best grow teachers so that they are not under-teaching students year after year? All these things need to be discussed, but the current rhetoric around tests keeps us from talking about these topics that involve a lot of intellectual complexity.

And third, and most important to me: Spreading misinformation about testing threatens one of the primary data points that can be used by parents, teachers, and lawyers to fight for the civil rights of children who have been under-taught. Every time I read one of those anti-test Facebook rants or Twitter threads, I get a sinking feeling in the pit of my stomach because the voices of middle-class, mostly white, liberals are rising above the needs of children of color. Every time someone opts their middle-class kid out of an exam, they are impacting the validity of data that could be used in a court case to prove that students’ civil rights are being violated in their schools. Every time someone spreads the lie that teachers can’t do their jobs because of standardized testing, they give credence to forces who don’t believe that teachers should be accountable at all.

When it comes to talking about testing, maybe we can all take a lesson from the posters that some teachers have in their classrooms. Is what we’re going to say truthful? Is it helpful? And is it necessary? If not, we might need to step back so we can have a debate that is grounded in truth, is intellectually rigorous, and is focused on the civil rights of children who have not been taught to their potential.


Catlin is leader of Teacher Development in Literacy for Promise Community Schools.


  1. navigio

    Yes and no.
    While I agree with everything else you say, especially teachers being smart enough to avoid their classrooms becoming testing factories and the danger of listening to voices that are not focused on the needs of all students, we have to remember that even criterion-referenced tests are scaled. The reason this is done is ostensibly to make year to year comparisons valid, but it’s worth noting that testing entities are not transparent about how this scaling is done. And, more importantly, it changes the distribution of scores.
    One of the ironies of school ‘improvement’ has been that, although everyone seems to have improved incrementally, that improvement has been greater for traditionally higher performing kids than lower performing ones. If results are scaled, who is to say that is not at least partially a result of that scaling process?
    Furthermore, when we talk about describing schools in terms of ‘the hundred lowest performing’ or ‘the lowest 5%’ or similar (as most of our laws do, ESSA included), we absolutely are ‘converting’ any potentially criterion-referenced result to a curve.
    And finally, criteria/standards have a shelf life. If we get to a point where we can no longer sufficiently distinguish differences, we simply change the criteria. Common Core tests are a perfect example. ‘Proficiency’ rates prior were somewhere around 60%. Now they’re in the 30s. That has effectively guaranteed that we will have ‘below 50% fodder’ for years to come.
    When you create a criterion-referenced assessment, you have to start somewhere. Where you generally start is based on a survey of what children can reasonably be expected to know. And how you map those criteria to performance bands is very much based on the notion of ‘reasonable’ (which is another way of saying normal). As a result, by definition, such a starting point is absolutely norm-referenced, even if subsequent years are manipulated to remain criterion-referenced (essentially criterion-referencing a normed standard).
    I do agree that the nature of our test development is very important, and something very few people understand. I am glad you cover it and attempt to keep people honest. But I think it is a mistake to dismiss such criticisms based solely on the claim that tests actually are criterion-referenced because that’s what they tell us. Thanks for your article.

  2. Catlin

    Thanks, @Navigio! I think your comment is indicative of the kind of more nuanced dialogue that we need to be having about standardized testing. I also agree that the intricacies of how tests are scaled are incredibly challenging for teachers to understand and make it really hard for them to predict how their kids are doing — which should be something that tests help us do. It does make sense to me, however, to move the proficiency bar at times, based on how kids are doing, changes in standards, etc. In ELA, which is my area, we’ve dramatically changed expectations so such changes should follow. I would love for that process to be much more transparent, though. In Texas, we’ve seen years of an incredibly low proficiency bar because we switched to a new more rigorous test, and are seeing exactly what you’ve named in terms of schools not even meeting that bar. As a coach, I try to help teachers shift their practice so that they are actually teaching to a higher bar, rather than simply trying to reach it by doing the same things they did before the shift.

  3. Eric Milou

    I will never be able to follow the rationale that a test, any test, is a civil rights issue. Maybe I’m just too naive to think that a 40 question test (like PARCC) can be such a cure. Moreover, the real problem with these tests is their complete lack of validity whatsoever. They are scaled (PARCC from 650 to 850) and the reason they are is to hide the raw scores, which are abysmal. For instance, I believe that it takes no more than 30 to 40% of the raw points to reach a level 4 (college and career ready) on a PARCC test. PARCC won’t confirm this because they would be embarrassed to admit it, but the state of Ohio has leaked some data at http://education.ohio.gov/getattachment/Topics/Testing/State-Test-Updates-for-2015_2016/September-2015/Performance-Level-Recommendations-for-Ohios-State/performancelevels.pdf.aspx.
    So given that students perform so poorly on the raw data, the scores are scaled to hide the truth and confuse educators and parents. If we truly want to stop spreading misinformation, then the real data, the real scores, and the real information must flow from the tests themselves.

    • Citizen Stewart

      You obviously oversimplify the position of civil rights activists who see the utility of data gathered through student assessments for equity purposes. Indeed, test scores have been used in court cases for everything from pushing for better state funding and integration plans to a greater focus on serving the needs of historically marginalized populations.

  4. Catlin

    Hi, Eric! Thank you for your comments. It seems like you are really saying that even larger numbers of kids than we would suspect are not college and career ready – and I believe that’s likely. As a high school valedictorian who often felt under-prepared for college myself, I agree.

    I admit, I live in Texas, where we don’t have Common Core, but as an ELA educator I’ve found the Common Core on point with what kids can and should do; so the fact that raw scores are “abysmal” doesn’t indict the tests or standards for me. It says that we need to do the work of pedagogical change. It’s slow work, and hard, but I’m extremely inspired by the educators I know who are undertaking it.

