Let’s Understand Tests

In the “data-driven” world today, school leaders are always in search of data that will support their decisions. In many cases… no… in all cases (at least I have yet to find any instances in which it isn’t), the “data” comes from a test. I’m not going to consider the potential problems with tests (including dubious ethics, validity, reliability, and false assumptions of objectivity), but I do want to discuss a simple reality that often leads to bad decisions being made with test data.

Consider this case: There is a condition, let’s call it “Macuserness.” Individuals with “Macuserness” can only use Macintosh computers, and the condition characterizes 10% of the population. That 10% of the population must have a Macintosh computer.

Let’s assume there is a test for “Macuserness” that is 90% accurate. This means that out of every 10 tests given, it labels the subject correctly nine times, whether or not the subject has “Macuserness.” This seems a pretty good test.

If you are purchasing and distributing new computers for 100 incoming students, you can predict you need 10 Macintosh computers as 10% of the population has “Macuserness.” We can imagine the incoming students will comprise those with “Macuserness” and those without:

| | “Macuserness” | “No Macuserness” |
|---|---|---|
| Incoming students | 10 | 90 |

The “data-driven” leader might reasonably decide to administer the test, which gives 90% accurate results and 10% inaccurate results. We can now divide the incoming students into four groups:

| | “Macuserness” | “No Macuserness” |
|---|---|---|
| 90% get accurate result | 9 | 81 |
| 10% get inaccurate result | 1 | 9 |
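The arithmetic behind these four groups can be sketched in a few lines of Python. This is only an illustration: the function name and dictionary keys are made up for this example, and it assumes (as the scenario implies) that the test is equally accurate for students with and without “Macuserness.”

```python
def split_by_test_result(population, prevalence_pct, accuracy_pct):
    """Split a population into four groups: has/lacks the condition,
    crossed with accurate/inaccurate test result.

    Assumption: the same accuracy applies to both groups.
    Integer percentages keep the arithmetic exact.
    """
    with_condition = population * prevalence_pct // 100
    without_condition = population - with_condition

    accurate_with = with_condition * accuracy_pct // 100
    accurate_without = without_condition * accuracy_pct // 100

    return {
        "accurate_with": accurate_with,
        "accurate_without": accurate_without,
        "inaccurate_with": with_condition - accurate_with,
        "inaccurate_without": without_condition - accurate_without,
    }


# 100 incoming students, 10% with "Macuserness", test 90% accurate
groups = split_by_test_result(100, 10, 90)
print(groups)
# {'accurate_with': 9, 'accurate_without': 81, 'inaccurate_with': 1, 'inaccurate_without': 9}
```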

Now this seems quite simple mathematically, but if we consider what those numbers mean, we see that our “data-driven” decision maker has not really resolved the problem.

| | “Macuserness” | “No Macuserness” |
|---|---|---|
| 90% get accurate result | 9 students accurately identified as needing a Mac | 81 students accurately identified as not needing a Mac |
| 10% get inaccurate result | 1 student who needs a Mac, but identified as not needing one | 9 students identified as needing a Mac, but who don’t really need one |

While the “data-driven” leader has reduced the size of the population of interest by 81% (the 81 students accurately identified as not needing a Mac), the remaining population is interesting. Of the 18 students identified as needing a Mac, half (9) really don’t need one, and one student will be given a PC even though they need a Mac!
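The “half” figure can be checked directly from the two rates, without counting students. The sketch below is a hypothetical helper, not part of any testing tool; it again assumes the test is equally accurate for both groups, and it uses exact fractions so no rounding creeps in.

```python
from fractions import Fraction


def share_of_flagged_who_need_mac(prevalence, accuracy):
    """Of everyone the test flags as needing a Mac, what share truly needs one?

    Assumption: `accuracy` applies equally to people with and
    without the condition.
    """
    # People with the condition who are correctly flagged
    true_positives = prevalence * accuracy
    # People without the condition who are wrongly flagged
    false_positives = (1 - prevalence) * (1 - accuracy)
    return true_positives / (true_positives + false_positives)


share = share_of_flagged_who_need_mac(Fraction(1, 10), Fraction(9, 10))
print(share)  # 1/2
```

Because only 10% of students have “Macuserness,” the 10% error rate applied to the much larger group without it produces as many wrong flags (9) as the 90% accuracy produces right ones (9).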

Testing results have been the focus of intense political conflict recently (it cannot be labeled as discussion or debate, as both of those are based on reason and accurate representation of facts). My intent is not to provide support for any “side” of any political conflict.

My intent is to point out a common misconception regarding testing and the interpretation of results. If I am trying to make any point beyond “here is some mathematics that may not be obvious,” it is that tests do not give a complete and reliable picture of our populations. If we truly want to make decisions based on data, we must use multiple sources and we must understand the results produced by those sources.