Calculate Fdr Using Pvalue – Calculator City

FDR Calculator: Calculate False Discovery Rate from P-values

An expert tool to control for multiple hypothesis testing using the Benjamini-Hochberg procedure.

P-values

Enter values separated by commas, spaces, or new lines. Values must be between 0 and 1.

Q-value (Desired FDR)

The desired False Discovery Rate level. A value of 0.05 means you accept that, on average, about 5% of the results declared significant may be false positives.

Total Number of Tests (m)

Optional. If left blank, this will be the count of the p-values you entered.

What is False Discovery Rate (FDR)?

The False Discovery Rate (FDR) is the expected proportion of false positives among all the results you declare significant, and controlling it is a standard way to correct for multiple comparisons. When you perform many hypothesis tests simultaneously (for example, testing thousands of genes in a genomic study), the probability of getting at least one “significant” result just by chance (a false positive) increases dramatically. If you use a standard p-value cutoff of 0.05, you’d expect about 5% of your tests on truly null hypotheses to be false positives. This is the multiple testing problem.
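To make the scale of the problem concrete, here is a minimal Python sketch. It assumes that every tested hypothesis is truly null and that the tests are independent; the function names are illustrative and are not part of the calculator.

```python
def expected_false_positives(num_null_tests, alpha=0.05):
    """Expected number of true-null tests that fall below the cutoff."""
    return num_null_tests * alpha


def prob_any_false_positive(num_null_tests, alpha=0.05):
    """Chance of at least one false positive, assuming independent tests."""
    return 1 - (1 - alpha) ** num_null_tests


for m in (1, 20, 1000):
    print(m, expected_false_positives(m), round(prob_any_false_positive(m), 4))
```

Under these assumptions, 20 null tests already give roughly a 64% chance of at least one false positive.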

FDR control, most famously through the Benjamini-Hochberg procedure, offers a practical solution. Instead of controlling the chance of making even one false positive (like the very strict Bonferroni correction), FDR controls the expected proportion of false positives among all the results you declare significant. For instance, if you set your FDR to 5% (or a Q-value of 0.05), you are accepting that out of all the discoveries you make, you expect about 5% of them to be false. This approach provides greater power to detect true effects compared to stricter methods.

This method is essential for researchers in fields like genomics, proteomics, neuroimaging, and any domain where large numbers of statistical tests are performed. To learn more about the theory, see our Statistical Significance Calculator guide.

The Benjamini-Hochberg (BH) Formula and Explanation

The Benjamini-Hochberg (BH) procedure is an algorithm for controlling the FDR using a list of p-values. It works by finding a p-value threshold that adapts to the data. Here’s how it’s done:

Collect and Sort: Gather all your p-values from m different tests. Sort them in ascending order, from smallest to largest: p_(1), p_(2), …, p_(m).
Assign Ranks: Give each sorted p-value a rank, i, from 1 to m.
Calculate Critical Value: For each p-value, calculate its Benjamini-Hochberg critical value: (i / m) * Q, where i is the rank, m is the total number of tests, and Q is your chosen FDR level (e.g., 0.05).
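Find the Threshold: Find the largest rank k whose p-value is less than or equal to its critical value: p_(k) ≤ (k / m) * Q. If no p-value meets its critical value, nothing is declared significant.
Declare Significance: All p-values from the original, unsorted list that are less than or equal to this threshold p_(k) are considered significant discoveries. This includes any p-value ranked below k that exceeded its own critical value.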
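The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator’s actual source code; the function name and return shape are assumptions made for the example.

```python
def benjamini_hochberg(pvalues, q=0.05, m=None):
    """Return (k, threshold, flags) for the BH step-up procedure.

    k is the largest rank whose sorted p-value is <= (rank / m) * q,
    threshold is that p-value (None if nothing passes), and flags marks
    which inputs, in their original order, are declared significant.
    """
    if m is None:
        m = len(pvalues)  # default: every test was supplied
    if m < len(pvalues):
        raise ValueError("m cannot be smaller than the number of p-values")

    ordered = sorted(pvalues)  # step 1: ascending order
    k = 0
    for rank, p in enumerate(ordered, start=1):  # step 2: assign ranks
        if p <= rank / m * q:  # step 3: compare with the critical value
            k = rank  # step 4: keep the largest passing rank

    if k == 0:
        return 0, None, [False] * len(pvalues)
    threshold = ordered[k - 1]
    flags = [p <= threshold for p in pvalues]  # step 5: declare significance
    return k, threshold, flags
```

Calling benjamini_hochberg(pvalues, q=0.05) returns the largest passing rank, the p-value threshold, and one True/False flag per input in its original order.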

Variables Used in FDR Calculation
Variable	Meaning	Unit	Typical Range
p	The p-value from an individual hypothesis test.	Unitless (Probability)	0 to 1
m	The total number of hypothesis tests being conducted.	Unitless (Count)	2 to 1,000,000+
i	The rank of a p-value when sorted in ascending order.	Unitless (Rank)	1 to m
Q	The desired False Discovery Rate, also known as the q-value cutoff.	Unitless (Proportion)	0.01 to 0.25

Practical Examples

Example 1: A Small Gene Set

Imagine a researcher tests 7 genes for differential expression and gets the following p-values: 0.002, 0.041, 0.15, 0.008, 0.5, 0.03, 0.01. They want to control the FDR at 5% (Q=0.05).

Inputs: P-values = [0.002, 0.041, 0.15, 0.008, 0.5, 0.03, 0.01], Q = 0.05, m = 7.
Process: The calculator sorts the p-values (0.002, 0.008, 0.01, 0.03, 0.041, 0.15, 0.5), calculates the (i/m)*Q critical value for each, and finds the highest-ranked p-value that is at or below its critical value. Here the p-value 0.041 (rank 5) is greater than its critical value (5/7 * 0.05 ≈ 0.0357), and 0.03 (rank 4) is also greater than its critical value (4/7 * 0.05 ≈ 0.0286). The p-value 0.01 (rank 3) is less than its critical value (3/7 * 0.05 ≈ 0.0214), so the largest passing rank is k = 3.
Results: The calculator identifies 3 significant genes, those with p-values 0.002, 0.008, and 0.01, with a p-value threshold of 0.01. For a more detailed breakdown, a p-value calculator can be useful.
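The arithmetic for this example can be checked with a short Python script. It is a sketch that recomputes each rank’s critical value for the seven p-values listed in the inputs; the variable names are illustrative.

```python
pvalues = [0.002, 0.041, 0.15, 0.008, 0.5, 0.03, 0.01]
q = 0.05
m = len(pvalues)

rows = []
for rank, p in enumerate(sorted(pvalues), start=1):
    critical = rank / m * q  # BH critical value for this rank
    rows.append((rank, p, critical, p <= critical))

# Largest rank whose p-value is at or below its critical value.
k = max((rank for rank, _, _, passed in rows if passed), default=0)

print("rank  p-value  critical  passes")
for rank, p, critical, passed in rows:
    print(f"{rank:>4}  {p:>7.3f}  {critical:>8.4f}  {passed}")
print("largest passing rank k =", k)
```

Ranks 1 to 3 pass and ranks 4 to 7 do not, so the script reports k = 3.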

Example 2: A Clinical Trial

A clinical trial tests 20 different secondary outcomes, with a stricter desired FDR of 1% (Q=0.01) to be conservative.

Inputs: 20 p-values, Q = 0.01, m = 20.
Process: The procedure is the same, but the critical value line (i/20)*0.01 is much lower, making it harder for a p-value to be declared significant.
Results: This will typically result in fewer significant findings than if a Q of 0.05 were used, and never more. This demonstrates the trade-off: a lower FDR level reduces the number of false positives but also reduces the power to detect true effects. An A/B Test Calculator might be used for primary outcomes, but FDR control is critical for these secondary analyses.
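To see how much lower the critical value line sits at Q = 0.01, the following Python sketch lists the BH critical values for m = 20 at both levels. No p-values are involved, since the example does not list them; the helper name is illustrative.

```python
def bh_critical_values(m, q):
    """Critical value (rank / m) * q for every rank from 1 to m."""
    return [rank / m * q for rank in range(1, m + 1)]


m = 20
loose = bh_critical_values(m, 0.05)
strict = bh_critical_values(m, 0.01)

print("rank  Q=0.05   Q=0.01")
for rank in (1, 5, 10, 20):
    print(f"{rank:>4}  {loose[rank - 1]:.4f}   {strict[rank - 1]:.4f}")
```

At every rank the Q = 0.01 critical value is one fifth of the Q = 0.05 value, so the smallest p-value must be at or below 0.0005 rather than 0.0025 to pass at rank 1.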

How to Use This FDR Calculator

Enter P-values: Paste your list of p-values into the text area. You can separate them with commas, spaces, or new lines. The tool will automatically parse them.
Set the Q-value: Choose your desired False Discovery Rate (FDR). 0.05 is a common choice, but you may choose 0.10 for more exploratory analysis or 0.01 for very conservative testing.
Specify Total Tests (Optional): The calculator automatically counts the number of p-values you entered to determine ‘m’. However, if your p-values are a subset of a larger number of tests, you should enter the true total number here.
Calculate and Interpret: Click “Calculate”. The main result shows how many of your tests are significant under the specified FDR. The results table provides a detailed breakdown, showing each p-value’s rank and whether it met the BH criteria. The chart provides a visual representation of which p-values (blue dots) fell below the critical value line (red line).
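The input handling described in these steps might look like the Python sketch below. It is an assumption about reasonable behavior, not the calculator’s actual parser; both function names are hypothetical.

```python
import re


def parse_pvalues(text):
    """Split on commas and whitespace, then validate each value."""
    tokens = [tok for tok in re.split(r"[,\s]+", text.strip()) if tok]
    values = []
    for tok in tokens:
        value = float(tok)  # raises ValueError for non-numeric input
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"p-value out of range [0, 1]: {tok}")
        values.append(value)
    return values


def resolve_total_tests(values, m_text=""):
    """Use the optional m field if filled in, else the number of p-values."""
    if not m_text.strip():
        return len(values)
    m = int(m_text)
    if m < len(values):
        raise ValueError("m cannot be smaller than the number of p-values")
    return m
```

For example, parse_pvalues("0.01, 0.2\n0.03  0.5") accepts the mixed separators, and resolve_total_tests(values, "100") covers the case where the pasted p-values are a subset of 100 tests.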

Key Factors That Affect FDR Calculation

Choice of Q-value: This is the most direct factor. A higher Q (e.g., 0.10) is less stringent and will lead to more discoveries, while a lower Q (e.g., 0.01) is more stringent and will lead to fewer discoveries.
Total Number of Tests (m): The FDR correction becomes more stringent as you perform more tests. Because m is the denominator of the critical value (i / m) * Q, every critical value shrinks as m grows, so a p-value at a given rank must be smaller to count as significant.
P-value Distribution: The actual values of your p-values matter. A dataset with many small p-values is more likely to yield significant results after FDR correction than a dataset where most p-values are large.
Independence of Tests: The original Benjamini-Hochberg procedure assumes that the tests are independent. While it’s robust to some forms of dependency, highly correlated tests can affect the FDR control.
Proportion of True Nulls: The power of the FDR procedure is highest when a large fraction of the tests represent true effects (i.e., have non-null hypotheses). If almost all of your hypotheses are truly null, it will be very difficult to find any significant results.
Data Quality: Low-quality data, small sample sizes, or high variance can lead to larger p-values overall, making it less likely any will be significant after correction. Using a sample size calculator beforehand is crucial.
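The first two factors can be explored with a small Python sketch. It reuses the seven p-values from Example 1 and counts BH discoveries while varying Q and m; the helper name is illustrative, and values of m above 7 simulate the case where those p-values are a subset of a larger set of tests.

```python
def count_bh_discoveries(pvalues, q, m):
    """Largest rank k with p_(k) <= (k / m) * q, or 0 if none pass."""
    k = 0
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= rank / m * q:
            k = rank
    return k


pvalues = [0.002, 0.041, 0.15, 0.008, 0.5, 0.03, 0.01]  # from Example 1

for q in (0.01, 0.05, 0.10):
    print("Q =", q, "->", count_bh_discoveries(pvalues, q=q, m=7))

for m in (7, 21, 70):
    print("m =", m, "->", count_bh_discoveries(pvalues, q=0.05, m=m))
```

With m fixed at 7, the counts are 0, 3, and 5 discoveries for Q = 0.01, 0.05, and 0.10. With Q fixed at 0.05, they fall from 3 to 1 to 0 as m grows from 7 to 21 to 70.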

Frequently Asked Questions (FAQ)

1. What is a good Q-value to use?
The most common Q-values are 0.05 and 0.10. A value of 0.05 is generally standard for publication. A value of 0.10 might be used in exploratory research where researchers are willing to accept a higher proportion of false discoveries. Values as high as 0.20 or 0.25 have been used in some fields like genetics, but this is less common.

2. What is the difference between a p-value and a q-value?
A p-value is the result from a single test, representing the probability of observing data at least as extreme as yours if the null hypothesis is true. A q-value (or FDR-adjusted p-value) is specific to multiple testing and represents the minimum FDR at which that test could be called significant. In short, p-values are for single tests; q-values are for multiple tests.

3. Why not just use Bonferroni correction?
The Bonferroni correction (dividing your alpha by the number of tests) controls the Family-Wise Error Rate (FWER), the probability of making even one false positive. This is often too strict for studies with thousands of tests (like genomics) and can lead to you missing many true discoveries (low power). FDR control is generally preferred in these scenarios.
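The difference in power can be seen in a short Python comparison. This sketch counts how many of the seven p-values from Example 1 survive each correction at a level of 0.05; the function names are illustrative.

```python
def bonferroni_count(pvalues, alpha=0.05):
    """Number of p-values at or below alpha / m."""
    m = len(pvalues)
    return sum(p <= alpha / m for p in pvalues)


def bh_count(pvalues, q=0.05):
    """Number of BH discoveries: largest rank k with p_(k) <= (k / m) * q."""
    m = len(pvalues)
    k = 0
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= rank / m * q:
            k = rank
    return k


pvalues = [0.002, 0.041, 0.15, 0.008, 0.5, 0.03, 0.01]  # from Example 1
print("Bonferroni:", bonferroni_count(pvalues))
print("Benjamini-Hochberg:", bh_count(pvalues))
```

On this data Bonferroni keeps only the smallest p-value (0.002 ≤ 0.05 / 7), while BH keeps three.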

4. Can I get a q-value from this calculator?
This calculator applies the Benjamini-Hochberg procedure to determine a single p-value threshold for significance. The adjusted p-values (often called q-values) for each individual test are shown in the results table for a comprehensive analysis. To understand how scores relate, a z-score calculator might be helpful.

5. What does it mean if no p-values are significant after correction?
This means that even your smallest p-value was not small enough to overcome the penalty for multiple testing at your chosen Q-level. It suggests there is not enough evidence in your data to declare any discoveries while controlling the false discovery rate.

6. Are the values from this tool unitless?
Yes. P-values, ranks, Q-values, and the resulting FDR are all unitless probabilities, counts, or ratios.

7. Why is my adjusted p-value larger than my original p-value?
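This is expected. The BH-adjusted p-value multiplies the original p-value by m / i (the total number of tests divided by the p-value’s rank), takes the smallest such product among that rank and all higher ranks so the adjusted values stay in order, and caps the result at 1. Because m / i is never less than 1, the adjusted p-value is always greater than or equal to the original. It reflects the “cost” of performing many tests. It is not a threshold; it is the smallest FDR level at which that test would be declared significant.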
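A common way to compute BH-adjusted p-values is sketched below in Python. It assumes the standard definition (scale each sorted p-value by m / rank, then take a running minimum from the largest rank downward, capped at 1); the function name is illustrative, and the calculator’s own implementation may differ.

```python
def bh_adjusted_pvalues(pvalues, m=None):
    """BH-adjusted p-values, returned in the same order as the input."""
    n = len(pvalues)
    if m is None:
        m = n
    # Indices of the inputs, ordered from smallest to largest p-value.
    order = sorted(range(n), key=lambda idx: pvalues[idx])

    adjusted = [0.0] * n
    running_min = 1.0  # also caps every adjusted value at 1
    for position in range(n - 1, -1, -1):  # largest rank first
        idx = order[position]
        rank = position + 1
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


pvalues = [0.002, 0.041, 0.15, 0.008, 0.5, 0.03, 0.01]  # from Example 1
for p, adj in zip(pvalues, bh_adjusted_pvalues(pvalues)):
    print(f"{p:.3f} -> {adj:.4f}")
```

A test is significant at level Q exactly when its adjusted value is at or below Q, and each adjusted value is at least as large as the p-value it came from.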

8. Can I use this calculator if my tests are not independent?
Often, yes. The original Benjamini-Hochberg (BH) procedure used here is proven to control the FDR when tests are independent and also under a common form of dependency known as “positive regression dependency,” which covers many real-world scenarios. For tests with other kinds of dependency, the Benjamini-Yekutieli procedure is a more conservative modification. For most applications, the BH procedure is considered appropriate.
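For comparison, here is a Python sketch of the Benjamini-Yekutieli variant. This page does not give its formula, so the sketch assumes the standard form, which divides each BH critical value by c(m) = 1 + 1/2 + … + 1/m. The function name is illustrative, and this calculator does not implement the variant.

```python
def benjamini_yekutieli_count(pvalues, q=0.05):
    """Discoveries under BY: largest k with p_(k) <= k / (m * c(m)) * q."""
    m = len(pvalues)
    c_m = sum(1 / j for j in range(1, m + 1))  # harmonic correction factor
    k = 0
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= rank / (m * c_m) * q:
            k = rank
    return k


pvalues = [0.002, 0.041, 0.15, 0.008, 0.5, 0.03, 0.01]  # from Example 1
print("BY discoveries:", benjamini_yekutieli_count(pvalues))
```

On the Example 1 data this stricter rule keeps one discovery at Q = 0.05, which illustrates the power given up in exchange for validity under any dependency structure.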

Related Tools and Internal Resources

Expand your statistical analysis with our other specialized tools:

Statistical Significance Calculator: Determine if the results of an experiment are statistically meaningful.
P-value Calculator: Calculate a p-value from a Z-score, t-score, or chi-square value.
A/B Test Calculator: Analyze the results of your A/B tests to make data-driven decisions.
Sample Size Calculator: Determine the minimum sample size needed for your study.
Confidence Interval Calculator: Calculate the confidence interval for a sample mean or proportion.
Z-Score Calculator: Find the Z-score for any data point in a normal distribution.