What Is the False Discovery Rate?

Mary McMahon

Last Modified Date: June 06, 2023

The false discovery rate (FDR) is the expected proportion of false positives among the results that a study reports as significant. It helps researchers judge how much of an apparently meaningful set of findings is likely to be spurious. Depending on the type of project, a fairly high false discovery rate can be tolerable, because the remaining findings are still valid and might be useful. Researchers usually present a statistical analysis of their findings and discuss the false discovery rate when they present their work.

This concept is related to the p-value, the probability of getting a result at least as extreme as the one observed if chance alone were at work. Small p-values suggest that a result is meaningful, because there is a low probability that it arose by chance. For example, if someone is pulling colored balls out of a bag that contains equal numbers of balls in three colors, that person would expect to pull roughly equal numbers of each color. If 20 balls are drawn and 10 of them are the same color, that is more than the six or seven expected. To find the p-value, the researcher could run a statistical analysis to determine how likely it is to draw at least 10 balls of the same color in a 20-ball draw.
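The ball example can be worked through with a short calculation. This is only a sketch under stated assumptions: each ball is put back after it is drawn (or the bag is very large), the three colors are equally common, and the color of interest is chosen before drawing. The function name `prob_at_least` is made up for illustration.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent draws,
    each succeeding with probability p (a binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: draws with replacement, so each draw has a 1-in-3 chance
# of being the chosen color.
p_value = prob_at_least(10, 20, 1 / 3)
print(f"{p_value:.3f}")  # 0.092
```

Under these assumptions the result is a one-sided p-value for a single, pre-selected color. Asking whether any of the three colors turns up 10 or more times is a different question and gives a larger probability.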

Controlling the FDR is more lenient than approaches that try to avoid even a single false positive across many tests. Rather than looking at the probability that any one result arose by chance, it examines the proportion of false positives that is likely to be found among all the results flagged as significant. A set of results that includes a fair number of false positives could still yield useful data. The researchers will need to identify and exclude the false positives, typically through follow-up testing, but the remaining information could be very important.
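A rough back-of-the-envelope calculation shows how false positives can make up a large share of the findings while the rest remain useful. Every number here is hypothetical: 1,000 tests, 10 percent of which involve a real effect, a significance cutoff of 0.05, and an 80 percent chance of detecting a real effect. The function name is also invented for this sketch, and it approximates the FDR as a ratio of expected counts.

```python
def expected_false_discovery_rate(num_tests, share_real, alpha, power):
    """Approximate the FDR as expected false positives divided by
    expected total positives."""
    num_real = num_tests * share_real
    num_null = num_tests - num_real
    false_positives = num_null * alpha  # no real effect, passes the cutoff by chance
    true_positives = num_real * power   # real effect, correctly detected
    total = false_positives + true_positives
    return false_positives / total if total else 0.0


# Hypothetical study: about 45 false positives and 80 true positives.
print(round(expected_false_discovery_rate(1000, 0.10, 0.05, 0.80), 2))  # 0.36
```

In this made-up scenario roughly a third of the flagged results would be false, yet the remaining two thirds would still point to real effects.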

Numerous calculations can be used to estimate or control the false discovery rate. If researchers find that this rate is likely to be high when they set up an experiment, they might make some adjustments to control for it. This could include changes to the study's methodology, such as getting a larger sample, which makes real effects easier to detect and so lowers the share of false positives among the findings. Meticulous study design is very important, because errors at this stage can undermine the whole experiment.
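One widely used calculation of this kind is the Benjamini-Hochberg procedure. The article does not name a specific method, so the sketch below is offered only as an example of what such a calculation can look like. The function name, the target rate `q`, and the sample p-values are all illustrative.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where a result is kept as a discovery.

    Step-up rule: sort the p-values, find the largest rank k such that
    p_(k) <= (k / m) * q, and keep the k smallest p-values.
    """
    m = len(p_values)
    if m == 0:
        return []
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_rank = rank
    keep = [False] * m
    for idx in order[:largest_rank]:
        keep[idx] = True
    return keep


# Illustrative p-values from ten hypothetical tests.
sample = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(sum(benjamini_hochberg(sample, q=0.05)))  # 2
```

With these sample values only the two smallest p-values survive, even though five of them fall below the usual 0.05 cutoff on their own. The procedure is simple enough to follow by hand for a short list.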

Computer programs to assist with false discovery rate calculations are available. It also is possible to perform them by hand. In the course of developing a study methodology, researchers might do some calculations to identify obvious flaws in the design before the experiment proceeds. This can help them find weak points and address them to make the experiment as strong and as useful as possible.

Mary McMahon

Ever since she began contributing to the site several years ago, Mary has embraced the exciting challenge of being an AllTheScience researcher and writer. Mary has a liberal arts degree from Goddard College and spends her free time reading, cooking, and exploring the great outdoors.

Discussion Comments

Mor

November 5, 2011

I just read an article which mentioned how the human mind isn't really set up to take probability into account.

It is able to grasp the concept, obviously, but that grasp doesn't translate very well emotionally, or with decision making.

So, even if you tell someone there is a "false discovery rate" when it comes to something like gambling, they might not believe you, and that's where superstitions about luck come into play.

lluviaporos

November 4, 2011

@indigomoth - I don't see why scientists should have to change their formal language in order to cater to the lowest common denominator in society. Most people aren't reading scientific papers. If anything, they are reading a magazine or an article which reports on the paper.

I agree that journalists should be careful to use "normal" language, or at least to carefully explain the terms which might be misunderstood.

But I think that if someone takes terms like "manipulating data" or "false discovery rate" in a scientific paper out of context, it is because they always intended to try and find a particular agenda in the paper, rather than because they were innocently confused.

indigomoth

November 3, 2011

Calculations related to the false discovery rate are often one of the things scientists are talking about when they say they "manipulated" the data. Unfortunately, this term is often taken out of a scientific context, so that it sounds like they are flat out changing the data to say what they want it to say.

It's one of the problems when scientists don't think about presenting results in a way that someone in the general public can clearly understand.