Can Statistics be Misleading?
There is an old adage that figures don’t lie, but liars know how to figure. In a sense this reflects people’s wariness of statistics. The same data can be made to appear misleading depending on how the statistician interprets it and which figures are brought to the fore as the key points of a statistical report.
For example, in grammar school, students now study measures of central tendency: the mean, median, and mode, along with the range. The mean is the sum of all the data values divided by the number of values. For example, one might add up a person’s test scores and divide by the number of tests to determine a grade. However, the mean can be affected by what is called an outlier, a number far outside the normal range of the data. This suggests that the mean may be a misleading way of assessing performance.
If a person scores perfectly on five tests and fails to take a sixth, thus earning a zero, the mean reflects this. If each test is worth 100 points, for example, the mean score is approximately 83%. However, this does not really suggest average performance in this case because of the outlier of zero.
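The arithmetic above can be sketched in a few lines, using hypothetical scores of 100 on five tests and a zero for the missed sixth:

```python
# Five perfect scores and one missed test: the single outlier (the zero)
# pulls the mean well below the performance on every test actually taken.
scores = [100, 100, 100, 100, 100, 0]  # hypothetical test scores

mean = sum(scores) / len(scores)
print(round(mean, 1))  # 83.3 -- nowhere near the 100 earned on each test taken
```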
Another measure of central tendency that may be used is the median, the middle number in a group of data arranged numerically (or the average of the two middle numbers when the count is even). If a statistician evaluates the median, this may not be representative of true average performance, or of whatever is being evaluated, because the median cannot account for a data range that can be enormous and thus can be misleading.
Evaluating central tendency by the mode merely means looking for the number that occurs most often in a set of data. So the test taker in the example has a mode of 100. Yet this does not reflect that the person failed to take one of the tests, which is misleading.
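Applying the median and mode to the same hypothetical six scores shows how both measures hide the missed test entirely:

```python
from statistics import median, mode

scores = [100, 100, 100, 100, 100, 0]  # same hypothetical test scores

# Sorted, the list is [0, 100, 100, 100, 100, 100]; with an even count,
# the median is the average of the two middle values, both 100 here.
print(median(scores))  # 100.0 -- the zero never registers

# The mode is simply the most frequent value.
print(mode(scores))    # 100 -- again, the missed test disappears
```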
Another way in which statistics can be misleading is the way in which questions are asked, in a survey perhaps, and the degree to which the survey is a representative sample of a community. If one surveys a group of high school students and asks, “How happy are you with your education on a scale of 1-5?” one may get very different answers depending upon whether the group is representative of the “average” student.
If one surveys a group of students who all get straight As and attend a fantastic, well-funded school, publishing such data as a representative sample is deliberately misleading. If one asks students of different schools with different grades, the survey is likely to be more representative and fairer. However, if one asks only students what they think of schools and then publishes the results as a representative sample of the general population, the answers will still be highly skewed.
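A minimal sketch of this sampling effect, using entirely hypothetical 1-5 satisfaction responses, shows how the choice of sample drives the published “average”:

```python
# Hypothetical responses to the same 1-5 satisfaction question.
well_funded_school = [5, 5, 4, 5, 4, 5]   # straight-A students at one well-funded school
mixed_schools      = [5, 3, 2, 4, 1, 3]   # students drawn from several different schools

biased_mean = sum(well_funded_school) / len(well_funded_school)
broader_mean = sum(mixed_schools) / len(mixed_schools)

# The biased sample reports a much rosier picture than the broader one.
print(round(biased_mean, 1), round(broader_mean, 1))  # 4.7 3.0
```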
Numbers can seem very concrete, and some people are misled by numbers simply because they seem to be fact and to have an indisputable value. Thus statistical data can often be used in a misleading fashion to wow people with numbers, and to make things in dispute seem more like fact. Reputable statisticians know that questions need to be worded neutrally, and also need to be asked of people who truly represent the population being studied.
However, numbers and statistics can be misleading because they do not represent the individual. They may show how people “in general” respond to an idea, to a product, or to a political candidate. They cannot show how a single person in all his or her infinitely variable qualities will feel.
Discussion Comments
@KoiwiGal - It's actually not that difficult to learn the basics of decent statistical analysis so you can tell whether or not a study is trying to mislead you. Most high schools teach kids how to run a good experiment, and those rules apply to statistics as well, along with a few others mentioned in the article.
@pastanaga - I actually think that the average person should be suspicious of statistics. All too often people will use them to say whatever it is that they want to say. Now, sometimes people go too far in refusing to believe what the numbers are telling them.
But if people were less ready to believe misleading advertising we wouldn't be having the current crisis where children are not being given their vaccinations.
Someone basically used statistics to link the rising rates of autism diagnosis (which is likely due to the fact that we've only just started seriously diagnosing it in the last few decades) with vaccinations even though there is no proof at all that vaccinations are linked to autism. And lots and lots of people believed this and then refused to believe the real studies showing that there is no link.
So, I think part of the reason people don't trust statistics has something to do with the fact that scientists will say that they have manipulated the data. Technically, that just means that they have made sure that the data is correct by removing any obvious errors and doing a few other things to correct it.
But the word manipulation does sound like they are moving the data around to try and come up with misleading graphs or summaries that say what they want them to say.
I don't know what the solution is here, because I don't think scientists and statisticians should have to change their terminology, but I can see why it would make the average person suspicious.