Why Standardized Testing Flopped

In the fall of 2018, an article appeared in my news feed multiple times. Peter Greene, a contributor to Forbes magazine, posed the question “Is The Big Standardized Test A Big Standardized Flop?” in the title of his article. No educator (or parent, or higher education professional, or employer) is going to be surprised to read that the answer is “yes.”

I do not intend to repeat the many reasons why and how it failed (many educators and scholars have addressed this with eloquence and completeness). I do intend to describe why it was doomed to fail from the very beginning.

Readers may recall that the rationale for focusing so intensely on standardized tests, and on the associated data and accountability movements, was grounded in the argument, “the public should only pay for ‘what works’ in education, and we determine what works by measuring.” The approach was very familiar to those of us with a background in science.

The natural sciences are based on a simple approach to answering questions: conduct an experiment by:

Holding everything in the environment constant, except for one variable.
Changing that one variable for one group of “things.”
Measuring whatever “growth” or change that interests you.
Ascribing any changes to the one variable.

As an undergraduate botany student, I grew legumes hydroponically under grow lights. We mixed the nutrients that were added to the water, used the same substrate, randomly selected seeds, and grew all of our plants side by side under lights located in the same room. We controlled everything we could, but added trace amounts of heavy metals to the nutrients used to grow one group of plants; that was the treatment group. We attributed the reduced nodules (growths on the roots that help the plant capture nitrogen) in the treatment group to the effects of those metals.

By definition, experiments are designed to remove the environment as a factor in our observations and measurements. We reduce the environment to a single variable.

Science, and the objective measurements we make in science (the kind of measurements that advocates for standardized test-based data and accountability invoke), are possible only when the environment, and the variation we find in the environment, are removed from the observations. If something exists in a rich and variable environment, then it is very difficult to conduct experiments on it.

Students live in a rich and variable environment. We cannot possibly control all of the variables that affect how they attend, engage with, and learn in school. Few would even want to control all of those variables. Because we cannot reduce the environments of our students to control all relevant variables, we have no way of ascribing changes in student performance to instructional practices.
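
For readers who like to see the reasoning in concrete terms, here is a minimal Python sketch of the contrast between a controlled experiment and an uncontrolled setting. It is a toy simulation, not a model of any real classroom or test: the function names, the effect size of 1.0, and the two "environmental spread" values are all invented for illustration. Each run compares a treated group with a control group, and the only thing that changes between the two settings is how much uncontrolled variation surrounds each individual.

```python
import random
from statistics import mean


def observed_difference(true_effect, env_spread, n=30, seed=0):
    """Return (treated mean - control mean) for one simulated comparison.

    env_spread is a hypothetical stand-in for everything we do not control:
    the standard deviation of environmental influences on each outcome.
    """
    rng = random.Random(seed)
    control = [rng.gauss(0.0, env_spread) for _ in range(n)]
    treated = [true_effect + rng.gauss(0.0, env_spread) for _ in range(n)]
    return mean(treated) - mean(control)


def spread_of_estimates(true_effect, env_spread, trials=200):
    """Repeat the comparison and report the range of observed differences."""
    estimates = [
        observed_difference(true_effect, env_spread, seed=s) for s in range(trials)
    ]
    return min(estimates), max(estimates)


TRUE_EFFECT = 1.0  # invented value, for illustration only

# Small spread: a grow-room style setting where the environment is held constant.
lab_low, lab_high = spread_of_estimates(TRUE_EFFECT, env_spread=0.1)
# Large spread: a setting where the environment varies freely.
school_low, school_high = spread_of_estimates(TRUE_EFFECT, env_spread=5.0)

print(f"controlled setting:   {lab_low:.2f} to {lab_high:.2f}")
print(f"uncontrolled setting: {school_low:.2f} to {school_high:.2f}")
```

Under these assumptions, the observed differences in the controlled setting should sit close to the invented effect, while in the uncontrolled setting they should range widely enough that any single comparison says little about what caused it.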

There are many reasons we can criticize the approach of using standardized tests for school accountability. The constructs are ill-defined, the validity and reliability of the instruments are questionable, the data collection methods are dubious; students in education research courses who analyze the standardized test methods find many practices that violate the ethics of data collection.

We need not look at those details, however. Standardized test-based data and accountability is built upon an untenable foundation. It was bound to fail. It is unfortunate that a generation of students suffered while the rest of the world figured out what students and teachers have known all along.