The papers being cited fastest in 2025 are not discoveries

SciPy 1.0 was published in February 2020. In the twelve months ending October 2025 it was cited 1,731 times. That is a five-year-old methods paper picking up roughly five new citations per day, with the rate still climbing. Its acceleration over the prior year is +94 per month, the third highest in the entire top-300 leaderboard of papers ranked by recent citation slope. The two papers ahead of it are the ERA5 climate reanalysis data product and a 2018 statistics handbook on partial-least-squares structural equation modeling. None of the top three describe a finding.

SciPy is not alone. Run that leaderboard and the front of the pack looks like a software inventory. Array programming with NumPy (2020) is adding 144 citations a month. Astropy v2.0 (2018) is adding 80 a month and accelerating; its monthly rate roughly doubled between 2024 and 2025. The newer Astropy v5.0 paper (2022) is at 65 a month. Tidyverse, MEGA11, fastp, REDCap, AutoDock Vina, SAMtools, and eggNOG-mapper all sit in the top fifty by slope. Among the dozen single-paper “fields” that aggregate the highest citation heat in October 2025, the densest cluster is just scientific software and the data releases that feed it.

The dataset behind this is the Mattermark-of-Research index built off OpenAlex. We tracked monthly citations for the 300 most-cited papers globally over the trailing twelve months ending October 2025, fit a slope per paper, embedded each paper, and clustered the top papers into research “fields.” The field with the highest combined citation acceleration in late 2025, by a clean margin, is built around SciPy, NumPy, Astropy, MESA (the stellar evolution code), Gaia Data Release 3, and the Vera Rubin Observatory design paper. It contains ten papers. Together they pulled in 5,800-plus citations in the past 12 months, and the rate is still going up.

Two things are happening at once.

First, scientific software is now cited the way reagents used to be in a wet-lab Methods section. If your pipeline used SciPy, you cite the SciPy paper. NumPy is cited in nearly every quantitative paper that uses Python at all, which is most of them. The acceleration is a measure of how thoroughly Python ate research computing between 2020 and 2025.

Second, the astronomy crowd is in a data release cycle. Gaia DR3 came out in mid-2022 and is still adding 48 citations a month and accelerating. MESA, the open-source stellar evolution code that almost every theoretical paper on a star uses, occupies three slots in the cluster (2018, 2019, 2023 versions). The LSST reference design paper is climbing as the Vera Rubin Observatory comes online. Big surveys produce papers; the survey papers get cited; and the codes that turn the survey data into models get cited again.

The patent and clinical-trial overlay for this cluster is, predictably, empty. There are no useful US patent matches for “SciPy 1.0” because SciPy is not a patentable invention. There are no Phase 2 trials for “Astropy.” This is the rare case where a research front shows up loud in the citation index and silent everywhere else, and that asymmetry is itself the signal: the most-cited papers in modern science describe the measurement apparatus, not the measurement.

If you are an R&D leader using citation metrics to identify hot research fronts, the takeaway is uncomfortable. The top of the citation leaderboard is dominated by methods, infrastructure, and data releases, not findings. Of the top fifteen papers by recent citation slope, eleven are software papers or statistics handbooks (among them SciPy, NumPy, Astropy v2.0, Astropy v5.0, Tidyverse, MEGA11, Multivariate Data Analysis, PLS-SEM Using R, and “When to use and how to report PLS-SEM”), one is the ERA5 climate data product, two are qualitative-research methodology papers (Thematic Analysis, Reflecting on reflexive thematic analysis), and one is AlphaFold 3, which is itself cited the way a tool is cited, not the way a finding is. The leaderboard of citation acceleration is a leaderboard of dependency adoption.

The thing to watch over the next year is what happens when the first big Vera Rubin / LSST data products start to appear. If this pattern holds, the data-release paper and the analysis-pipeline paper will be in the top five almost overnight.

Method note. The signal here is monthly citation slope, computed from a paper-by-month citation matrix derived from OpenAlex. Anchor month is October 2025. We pulled the top 300 papers globally by trailing-12-month citations, fit a least-squares slope across those twelve monthly counts per paper, and computed acceleration as the difference between the trailing 12-month and prior 12-month rates. Clusters were assigned by embedding paper titles with BAAI/bge-base-en-v1.5 and applying MiniBatchKMeans (k=20). The cluster called out above is the one whose papers sum to the highest combined heat (slope × cites) in the index. The patent overlay was checked against 9.3M USPTO grants; no meaningful matches existed, which is itself the point.
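
For readers who want to reproduce the arithmetic, here is a minimal sketch of the three per-paper quantities named above: slope, acceleration, and heat. It is not the pipeline's actual code. The function names and the sample counts are hypothetical, "rate" is assumed to mean average citations per month over a 12-month window, and "cites" in the heat formula is assumed to mean the trailing-12-month total.

```python
from statistics import mean


def ls_slope(monthly_counts):
    """Ordinary least-squares slope of counts against month index 0..n-1.

    Units: citations per month, per month.
    """
    n = len(monthly_counts)
    xs = list(range(n))
    x_bar = mean(xs)
    y_bar = mean(monthly_counts)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, monthly_counts))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def acceleration(trailing_12, prior_12):
    """Trailing-window monthly rate minus prior-window monthly rate.

    Assumption: 'rate' is the mean monthly citation count in each window.
    """
    return mean(trailing_12) - mean(prior_12)


def heat(trailing_12):
    """Slope times citations, with 'cites' assumed to be the 12-month total."""
    return ls_slope(trailing_12) * sum(trailing_12)


# Synthetic counts for one hypothetical paper (not taken from the index).
prior = [100] * 12                         # flat at 100 citations a month
trailing = [100 + 5 * m for m in range(12)]  # climbing 5 a month

print("slope:", ls_slope(trailing))
print("acceleration:", acceleration(trailing, prior))
print("heat:", heat(trailing))
```

Summing heat over the papers assigned to each cluster and taking the largest total would then pick out the cluster described in the article, under the same assumptions.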

Signal Net
