Too many R packages: CRAN is inundated with submissions
Overwhelmed by Volume: The CRAN Submission Surge
CRAN remains the world's most accessible gateway to statistical expertise, and new packages are arriving faster than ever.
However, this raises a critical question: Is the R community actually gaining value from this expansion?
The Curator's Burden
Written by Joseph Rickert | Published June 12, 2026
For those following along via R-bloggers, you are likely aware that I curate a "Top 40" list of the most intriguing new R packages arriving on CRAN. My journey with this project has spanned several platforms:
- Initially at Revolution Analytics.
- Later via R Views (for RStudio and Posit).
- Currently hosted on R Works.
In the past, selecting these forty packages was a pleasant hobby taking about one day of work per month. It has since become a grueling "hamster-on-the-wheel" endeavor. Previously, I could feasibly review the landing pages of a hundred packages and personally test a handful of them. Now, the volume is simply too high.
Visualizing the Growth
The following ggplot2 code generates a visualization of the monthly volume of new submissions since I began my work at R Works:
library(tidyverse)  # attaches ggplot2, dplyr, and (since tidyverse 2.0.0) lubridate, which provides my()

file_path <- "new-cran-pkgs.csv"
if (!file.exists(file_path)) {
  stop("Please check the path: ", file_path)
}

# Read the first column as text and the second as numeric
raw_data <- read.csv(file_path, colClasses = c("character", "numeric"), stringsAsFactors = FALSE)

# Parse the month-year strings into dates and sort chronologically
plot_data <- raw_data %>%
  mutate(Date = my(Month)) %>%
  arrange(Date)

new_pkg <- ggplot(plot_data, aes(x = Date, y = Num_pkgs, group = 1)) +
  # linewidth replaces size for lines in ggplot2 3.4.0 and later
  geom_line(color = "#1f77b4", linewidth = 1) +
  geom_point(color = "#d62728", size = 1.2) +
  labs(
    title = "Monthly Volume of New CRAN Packages",
    x = "Date",
    y = "Number of Packages",
    caption = "Source: R Works monthly Top 40 posts"
  ) +
  theme(
    plot.title = element_text(face = "bold"),
    panel.grid.minor = element_blank(),
    axis.text.x = element_text(angle = 45, hjust = 1)
  )

new_pkg
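The plotting script above assumes a two-column file with a `Month` column that `my()` can parse and a numeric `Num_pkgs` column. As a rough sketch only, a small check like the one below could catch a malformed file before it reaches ggplot2. The helper name `check_pkg_counts()` is hypothetical and not part of the original post.

```r
library(lubridate)

# Hypothetical helper (not from the original post): validate the
# two-column layout that the plotting code assumes.
check_pkg_counts <- function(df) {
  required <- c("Month", "Num_pkgs")
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }

  # my() returns NA (with a warning) for strings it cannot parse
  parsed <- suppressWarnings(my(df$Month))
  if (anyNA(parsed)) {
    stop("Unparseable Month value(s): ",
         paste(df$Month[is.na(parsed)], collapse = ", "))
  }

  if (anyNA(df$Num_pkgs) || any(df$Num_pkgs < 0)) {
    stop("Num_pkgs must be non-missing and non-negative")
  }

  invisible(df)  # return the input unchanged so the check can sit in a pipe
}
```

Because the function returns its input invisibly, it could be called as `raw_data %>% check_pkg_counts() %>% mutate(Date = my(Month))` without otherwise changing the script.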
Why the Explosion?
The surge is most likely because packaging code and shipping it to CRAN has become almost effortless. This mirrors a broader trend in software development: creating and deploying apps is easier than ever.
John Burn-Murdoch recently highlighted this in the Financial Times, citing NBER research regarding the "Agentic AI" era. The data suggests a paradox: while the number of apps is exploding, their actual utility, measured by usage, reviews, or profit, is not keeping pace.
Evaluating the Contribution
We must apply the same scrutiny to R. Are these new packages truly advancing the field? To be a meaningful contribution, a package should ideally meet several of these criteria:
- Introduce novel statistical methodologies.
- Expand R's utility into untapped application domains.
- Provide high-performance, optimized code.
- Offer significant utility to the broader community.
From my perspective as an "engaged dilettante," the answer is often no. A staggering number of submissions lack the basic documentation required to explain their purpose.
The Documentation Gap (May Data)
Consider the statistics from May:
| Metric | Value |
|---|---|
| Total New Packages | 323 |
| Packages lacking README, Vignettes, and URLs | 40 |
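Mathematically, the proportion of "silent" packages is:
$$\frac{40}{323} \approx 0.124$$
That is roughly 12.4% of May's new packages, or about one in eight.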
Unless a package is purely infrastructure for another suite or has documentation available via an external journal, a package that fails to explain the what, why, and how is not a contribution.
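The "silent" criterion used in the May table (no README, no vignettes, and no URLs) can be sketched as a simple filter. This is only an illustration: the helper names and the column names `has_readme`, `has_vignette`, and `url` are assumptions, not the way the counts in this post were actually produced, and the sketch does not handle the infrastructure or external-journal exceptions noted above.

```r
# Hypothetical helper: flag packages that have no README, no vignette,
# and no URL. Assumed columns (not from the post):
#   has_readme, has_vignette : logical
#   url                      : character, NA or "" when absent
flag_silent <- function(pkgs) {
  no_url <- is.na(pkgs$url) | trimws(pkgs$url) == ""
  !pkgs$has_readme & !pkgs$has_vignette & no_url
}

# Share of silent packages among all new packages
silent_share <- function(n_silent, n_total) {
  stopifnot(n_total > 0, n_silent >= 0, n_silent <= n_total)
  n_silent / n_total
}

# May figures from the table above: 40 silent packages out of 323
round(100 * silent_share(40, 323), 1)
# 12.4
```

With a real metadata table, `sum(flag_silent(pkgs))` would give the numerator and `nrow(pkgs)` the denominator.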
"As a hamster on the wheel, I would be very pleased to hear what you have to say."
Join the Conversation:
If you have thoughts on this trend, please share them by leaving a comment on Issue #68 within the R Works GitHub repository.