When P-values Hit the Floor: Handling Astronomically Small Statistics in QQ Plots
There is a number that sits quietly at the bottom of every R session, rarely visited:
.Machine$double.xmin
# [1] 2.225074e-308
For most statistical work, this floor is purely theoretical — something you would never expect to bump into. A p-value of 10⁻³⁰⁰ is not just statistically significant; it is so far beyond any reasonable significance threshold that one might wonder why it even matters whether the software handles it correctly.
And yet, here we are.
The QQ Plot: A Diagnostic Workhorse
The quantile-quantile (QQ) plot is one of the most important diagnostic tools in statistical genetics. Its premise is simple: if your test statistics follow their null distribution, plotting observed quantiles against expected quantiles should yield a straight diagonal line. Deviations from that line — particularly the upward sweep at the extreme right that geneticists call “inflation” — tell you something real is happening: true signal, population stratification, model misspecification, or some combination of all three.
For genome-wide association studies (GWAS), the QQ plot is as routine as checking residuals in a regression. You run your association tests across millions of genetic variants, collect the p-values, and plot them. A well-behaved null looks like a line. An inflated study looks like a ski jump.
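The construction itself takes only a few lines of base R. The sketch below is a toy illustration, not fastqq’s code: expected quantiles come from a uniform null via ppoints, observed quantiles from the sorted p-values, both on the −log₁₀ scale.

```r
# Toy QQ plot for p-values under the null (illustration, not fastqq's code).
set.seed(1)
p        <- runif(1e4)                     # null p-values are Uniform(0, 1)
observed <- -log10(sort(p))                # observed quantiles
expected <- -log10(ppoints(length(p)))     # expected quantiles under the null
plot(expected, observed,
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)")
abline(0, 1)                               # a well-behaved null hugs this diagonal
```

True signal or inflation would show up as observed points rising above the diagonal at the right-hand end.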
The problem is that modern GWAS has grown rather large.
The Scale Problem
Early GWAS tested hundreds of thousands of variants in a few thousand individuals. Today, biobank-scale studies test tens of millions of variants in hundreds of thousands — sometimes over a million — participants. The UK Biobank, FinnGen, the Estonian Biobank, deCODE genetics in Iceland: these are not incremental improvements in scale; they represent qualitative changes in statistical power.
This creates an immediate practical problem for plotting. How do you render a QQ plot for 100 million p-values?
The naive answer — hand them to your plotting library and let it draw a point for each one — runs into trouble quickly. The original qqman R package, the long-standing standard for GWAS visualization, takes over 13 minutes to render a single plot at that scale. If you are saving to a vector graphics format, the file stores every single point, and the file size becomes absurd. What should be a 5-second diagnostic check turns into a workflow bottleneck.
This was the original motivation for fastqq: a drop-in replacement for qqman::qq that exploits a key property of QQ plots. Since the points form a monotonically increasing sequence, many of them are so close together at the dense lower end of the distribution that they are visually indistinguishable. The package filters these redundant points before rendering, achieving an 80× speedup for 100 million points — reducing a 13-minute wait to under 10 seconds.
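The principle can be sketched naively (this is only an illustration of the idea; fastqq’s actual filtering is more careful and far faster): walk the sorted points once and drop any point that would land within a negligible distance of the last point kept.

```r
# Naive monotone thinning (illustration only, not fastqq's implementation).
# Because the points are sorted, one forward pass suffices: keep a point
# only if it moves at least `tol` from the previously kept point.
thin_qq <- function(x, y, tol = 1e-3) {
  keep <- logical(length(x))
  keep[1] <- TRUE
  last <- 1
  for (i in seq_along(x)[-1]) {
    if (abs(x[i] - x[last]) >= tol || abs(y[i] - y[last]) >= tol) {
      keep[i] <- TRUE
      last <- i
    }
  }
  keep
}

p    <- sort(runif(1e5))
keep <- thin_qq(-log10(ppoints(length(p))), -log10(p))
mean(keep)   # only a small fraction of the points needs to be drawn
```

At 10⁸ points an R-level loop like this would itself be a bottleneck; the point here is only that sorted, visually redundant points can be dropped without changing the rendered image.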
A New Problem Emerges: The Extreme Tail
The speed problem was solved. But with the arrival of meta-analyses — combining results across multiple large biobank cohorts — a different issue appeared, one that required confronting those theoretical limits of floating-point arithmetic head-on.
Consider what happens when you combine GWAS results across, say, ten cohorts totaling 2 million participants for a highly heritable trait with a common variant of large effect. The resulting p-values at the top hits are not just small in the conventional GWAS sense (10⁻⁸, the genome-wide significance threshold). They can be vanishingly, almost incomprehensibly small. Values of 10⁻¹⁰⁰, 10⁻²⁰⁰, even 10⁻⁵⁰⁰ are not fantasy — they are the mathematically correct p-values that emerge from combining sufficient evidence.
R’s double-precision floating point cannot represent these numbers. Anything below approximately 2.2×10⁻³⁰⁸ underflows silently to zero. The p-value is not flagged as missing or infinite; it simply becomes 0. And fastqq, like any software working on the raw p-value scale, had no idea what to do with a zero. The plotting machinery would either error out or silently drop the point.
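The floor is easy to demonstrate at the console; this is standard IEEE 754 double behavior, not anything specific to fastqq:

```r
1e-300 * 1e-10    # 1e-310: still representable as a subnormal double
10^-400           # below even the subnormal range: silently becomes 0
# [1] 0
(10^-400) == 0    # no NA, no warning -- just a hard zero
# [1] TRUE
```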
The most extreme signals in a meta-analysis — precisely the ones you most want to inspect visually — were disappearing from the diagnostic plot.
Three Solutions for Three Scenarios
The updated fastqq addresses this through three complementary additions, each targeting a different point in the analyst’s workflow.
1. Explicit Zero Handling in qq
The simplest case: you have raw p-values, some of which have underflowed to exactly zero in the software that produced them. The original association software may have reported a p-value of zero because its own internal computation hit the floor; the true value is small but finite.
For this, qq now accepts a zero_action parameter:
pvec_with_zeros <- c(runif(1e4), 0, 0)
qq(pvec_with_zeros, zero_action = 1e-300)
# Warning: 2 p-value(s) equal to zero replaced with 1e-300.
The substitution is opt-in — you have to explicitly choose the replacement value, which forces you to think about what it means. And the function warns you loudly when it makes the substitution, including how many points were affected. This is the right behavior: silent imputation of missing extremes would be far more dangerous than a warning.
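The semantics amount to something like the following sketch (a paraphrase of the documented behavior, not fastqq’s actual source):

```r
# Sketch of the zero_action semantics (not fastqq's source code).
replace_zeros <- function(p, zero_action) {
  n_zero <- sum(p == 0)
  if (n_zero > 0) {
    warning(sprintf("%d p-value(s) equal to zero replaced with %g.",
                    n_zero, zero_action))
    p[p == 0] <- zero_action
  }
  p
}

cleaned <- replace_zeros(c(0.5, 0, 1e-20, 0), zero_action = 1e-300)
```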
2. The qqlog Function: Pre-transformed Input
A more principled approach is to compute −log₁₀(p) before the values pass through R’s floating-point representation, using whatever extended-precision machinery is available to you, and then hand those transformed values directly to the plotting function.
# Compute -log10(p) in extended precision via Rmpfr or similar,
# then plot directly:
fastqq::qqlog(my_precomputed_neglog10_pvalues)
qqlog has exactly the same signature as qq, but accepts −log₁₀(p) as its input. Internally it uses the same fast filtering logic, so performance is unchanged. The correctness of the plot now depends entirely on how you computed the transformed values — which puts control where it belongs, with the analyst.
This also matters for software interoperability: some association testing tools (particularly those designed to handle extreme statistics) output −log₁₀(p) directly rather than raw p-values, precisely because the raw values would underflow. qqlog accepts that output without any further transformation.
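As one concrete route to such input (an assumption here: any extended-precision machinery works, and the Rmpfr package must be installed), a p-value far below the double floor can be parsed from a string, transformed, and only then collapsed back to a perfectly representable double:

```r
library(Rmpfr)  # extended-precision arithmetic via the MPFR library

# Parse from a string so the value never passes through a double.
p_tiny   <- mpfr("1e-500", precBits = 240)
neglog10 <- asNumeric(-log10(p_tiny))  # ~500: fits comfortably in a double

fastqq::qqlog(neglog10)                # assuming fastqq is installed
```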
3. The qqchisq1 Function: Log-space Conversion
The most elegant solution is the one that requires the least from the user: accept the test statistic directly, and perform the conversion to −log₁₀(p) internally using log-space arithmetic that never requires the intermediate p-value to be representable as a double.
fastqq::qqchisq1(my_chisq_statistics)
The mathematics here are straightforward but easy to overlook. R’s pchisq function accepts a log.p = TRUE argument that returns log(p) rather than p. From there, a single division recovers the quantity the QQ plot needs:
−log₁₀(p) = −log(p) / log(10)
This never computes p itself, only its logarithm, which remains a perfectly ordinary floating-point number even when the p-value it encodes would be 10⁻⁵⁰⁰. A χ² statistic of approximately 2303 (on 1 degree of freedom) corresponds to a p-value around 10⁻⁵⁰⁰, far beyond R’s double-precision floor. Naively, pchisq(2303, 1, lower.tail = FALSE) returns 0. Via log-space, it returns the correct −log₁₀(p) of approximately 502:
-pchisq(2303, 1, lower.tail = FALSE, log.p = TRUE) / log(10)
# [1] 501.87
qqchisq1 uses this approach internally, so the point for that variant shows up correctly at the top of the QQ plot — without any special handling from the user and without any loss of information.
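The same conversion is easy to apply by hand when the −log₁₀(p) values are needed for other purposes. The helper name below is hypothetical, but the pchisq call mirrors the log-space approach described above:

```r
# Hypothetical helper: chi-square (df = 1) statistics to -log10(p),
# computed entirely in log-space.
chisq1_to_neglog10 <- function(stats) {
  -pchisq(stats, df = 1, lower.tail = FALSE, log.p = TRUE) / log(10)
}

x <- c(1, 10, 100, 2303)   # from unremarkable to double-breaking
chisq1_to_neglog10(x)
# The last entry comes out near 501.87, even though the raw p-value
# pchisq(2303, 1, lower.tail = FALSE) underflows to 0.
```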
Why This Matters More Now Than It Did Five Years Ago
The shift toward mega-cohort meta-analysis in complex trait genetics is not slowing down. Efforts like the Global Biobank Meta-analysis Initiative are combining data across dozens of biobanks spanning millions of participants. For common, highly heritable phenotypes — height, BMI, educational attainment, blood pressure — the top signals from these analyses will routinely produce test statistics that break double-precision arithmetic.
This is not a crisis; it is a calibration issue. But it does mean that tooling built for the GWAS scale of ten years ago is increasingly inadequate for the GWAS scale of today. A QQ plot that silently drops the most extreme points is not merely aesthetically incomplete — it obscures exactly the region of the distribution that carries the most biological and statistical information.
The fix, as implemented in fastqq, turns out to be conceptually simple: work in log-space, never materialize the raw p-value, and let the analyst supply pre-transformed values when their pipeline already handles the arithmetic correctly. What made it interesting to implement was precisely the subtlety of the failure mode — silent zero underflow is easy to miss, easy to confuse with true zeros in the data, and consequential for the diagnostic plots that the entire community relies on.
Getting Started
fastqq is available on CRAN and GitHub:
# CRAN
install.packages("fastqq")
# Development version
devtools::install_github("gumeo/fastqq")
For analysts working with meta-analysis summary statistics where extreme p-values are expected, the recommended approach is qqchisq1 if you have χ² statistics available, or qqlog if your pipeline already produces −log₁₀(p). For standard GWAS p-values with occasional zeros from software underflow, qq(..., zero_action = 1e-300) handles the common case with appropriate warnings.
The full documentation, examples, and source code are at github.com/gumeo/fastqq.
fastqq is an open-source R package released under GPL-3.0.