Last refreshed: 2026-04-21
This page aggregates data on the usage of tools and build backends over time, using grep.app search, the source distributions of the top PyPI packages, and the PyPI Linehaul BigQuery data.
Linehaul: Each PyPI simple API access and
file download is logged to
BigQuery through
linehaul. The metadata includes package name/version, installer info,
subcommand, and CI status. The data has been reliably available since 2019,
with CI tracking added in 2024 (via CI=1). Poetry
doesn't report CI info. We use 7-day and 90-day rolling averages to
smooth daily fluctuations.
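As an illustration of the smoothing step, here is a minimal sketch of a rolling average over daily counts. It assumes a trailing window (the current day plus the preceding days) and leaves the first incomplete windows empty; whether the plots use a trailing or centered window is not stated here, and the function name `rolling_mean` is hypothetical.

```python
from collections import deque


def rolling_mean(values, window):
    """Trailing rolling mean over `values`.

    Returns a list of the same length; entries are None until
    `window` observations are available (an assumption of this sketch).
    """
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            # The oldest value is about to be evicted by append().
            total -= buf[0]
        buf.append(v)
        total += v
        out.append(total / window if len(buf) == window else None)
    return out


# Hypothetical daily download counts, smoothed with a 3-day window.
daily = [1, 2, 3, 4, 5]
smoothed = rolling_mean(daily, 3)
```

The same function would be called with `window=7` or `window=90` for the two averages mentioned above.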
Build backends: Analysis of the
top 15k PyPI packages
by downloads. Build backend info is extracted from
pyproject.toml and setup.py in source
distributions. The setup.py category tracks packages
without a [build-system] table. One of the 15,000 packages
could not be analyzed and was excluded.
grep.app: Uses search hits for different tools on grep.app, which indexes public code repositories. While smaller than GitHub, it accurately reports hit counts. According to a Vercel blog post, grep.app includes "1M+ pre-indexed repos". Search hits are captured daily.
Two figures were excluded from this page for lack of signal.
The intermediate aggregated data used to generate the plots.