The paper provides end-to-end DESI DR1 LSS catalog construction: target selection, redshift success cuts, veto masks, random catalogs, and systematic weights.
Briefing
This DESI Collaboration paper addresses a practical but central question for precision cosmology: how to construct and validate large-scale structure (LSS) galaxy and quasar catalogs from DESI Data Release 1 (DR1) such that the measured two-point clustering statistics (correlation functions and power spectra) can be reliably compared to theoretical models. The question matters because raw observed number densities are modulated by many non-cosmological effects—survey geometry, fiber assignment incompleteness, imaging systematics, and spectroscopic success variations—each of which can imprint spurious correlations or bias the inferred cosmological parameters if not modeled correctly.
The paper’s contribution is methodological and infrastructural. It defines the DESI DR1 target samples (BGS, LRG, ELG, QSO), specifies the spectroscopic redshift selection criteria used to define “good” tracers, constructs the corresponding random catalogs that encode the survey footprint, and develops a weighting and correction framework to remove or marginalize known observational systematics. It also describes the measurement pipelines for two-point statistics in both configuration space and Fourier space, including the role of window functions, normalization constraints, and integral-constraint effects. Finally, it validates the resulting “raw” two-point measurements by comparing DESI DR1 data to mock catalogs that include realistic DESI fiber assignment.
Methodologically, the study is a large-scale survey data processing and validation effort rather than a controlled experiment. The data source is DESI DR1, using observations through June 14, 2022. The sample construction begins with photometric target selection from Legacy Surveys DR9 imaging, split into North (BASS/MzLS) and South (DECam/DECaLS) regions, with WISE W1/W2 used across the sky. The paper then adds spectroscopic information from the DESI “iron” reduction pipeline (Redrock redshifts, plus QSO classifiers such as QuasarNET and an MgII afterburner). A special reprocessing substitution is applied for data taken on Dec. 12, 2021 due to a calibration bug, affecting 8 dark and 9 bright tiles.
The final clustering samples are defined by tracer-specific spectroscopic success criteria and redshift cuts. For BGS, success requires ZWARN = 0 and DELTACHI2 > 40, with redshift selection 0.1 < z < 0.4 and an absolute magnitude cut M_r < -21.5 (using Fastspecfit-based k-corrections and an evolution correction). This reduces the number of successful redshifts from 4,036,190 to 485,331 before the redshift cut, and then to 300,043 used in the final analysis. For LRG, success is ZWARN = 0 and DELTACHI2 > 15, with 0.4 < z < 1.1 split into three bins. For ELG, success is o2c > 0.9 (a criterion combining [OII] emission-line signal-to-noise and DELTACHI2), with 0.8 < z < 1.6 split into two bins. For QSO, success is "not rejected by the quasar catalog," using Redrock/MgII/QuasarNET identifications, with a broad 0.8 < z < 3.5 selection but primary clustering using 0.8 < z < 2.1.
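As a concrete illustration, the selections above reduce to boolean masks over Redrock-style catalog columns. The sketch below applies BGS-style criteria (ZWARN == 0, DELTACHI2 > 40, 0.1 < z < 0.4) to a toy array; the function name, column layout, and values are illustrative, not the paper's actual file schema.

```python
import numpy as np

def bgs_good_z(zwarn, deltachi2, z):
    """BGS-style 'good redshift' mask: ZWARN == 0, DELTACHI2 > 40,
    and 0.1 < z < 0.4 (names and layout are illustrative)."""
    return (zwarn == 0) & (deltachi2 > 40) & (z > 0.1) & (z < 0.4)

# toy catalog: one object passes, the others each fail one criterion
zwarn = np.array([0, 0, 4, 0])
dchi2 = np.array([100.0, 20.0, 500.0, 80.0])
z = np.array([0.25, 0.30, 0.20, 0.55])
mask = bgs_good_z(zwarn, dchi2, z)  # -> [True, False, False, False]
```

The other tracers follow the same pattern with their own thresholds (e.g., DELTACHI2 > 15 for LRG, the o2c criterion for ELG).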
The paper reports final sample sizes and key completeness metrics (Table 2). After vetoes and selection, the numbers of good redshifts are: BGS 300,043; LRG 2,138,627; ELG 2,432,072; and QSO 1,223,391 (with the primary QSO subset of 856,831 in 0.8 < z < 2.1). Spectroscopic success rates within the footprint are high for BGS (98.9%) and LRG (99.1%), moderate for ELG (72.7%), and lower for QSO (66.8%).
A major methodological component is the construction of random catalogs and the correction of selection-function variations. Randoms are generated at uniform angular density (2500 per deg² per random file) over the DESI footprint defined by "reachable" targets under good hardware conditions. The paper applies multiple veto masks: hardware vetoes (bad fibers, low template signal-to-noise thresholds TSNR2, and instrument flags), a priority veto (e.g., QSO and rare strong-lens candidates remove area from lower-priority samples), and imaging vetoes (bright object masks and Healpix-based cuts on imaging property tails). The paper quantifies footprint losses; for example, the hardware veto removes 3.1% of the dark-time footprint and 2.3% of the bright-time footprint. The priority veto removes 1666.7 deg² (20.3%) for LRG & ELG and 38.5 deg² (0.5%) for QSO.
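Uniform angular density over the sphere is the defining property of the pre-veto randoms. A minimal way to generate such points (before any footprint or veto masking, which the pipeline applies afterwards) is to draw RA uniformly and Dec via a uniform sin(Dec); this is a generic sketch, not the paper's random-generation code.

```python
import numpy as np

def uniform_sky_randoms(n, seed=0):
    """Points uniform per solid angle: RA uniform in [0, 360) deg,
    sin(Dec) uniform in [-1, 1] (avoids pile-up at the poles)."""
    rng = np.random.default_rng(seed)
    ra = rng.uniform(0.0, 360.0, n)
    dec = np.degrees(np.arcsin(rng.uniform(-1.0, 1.0, n)))
    return ra, dec

ra, dec = uniform_sky_randoms(100_000)
# sanity check: for a uniform sphere, half the points have |Dec| < 30 deg
frac = np.mean(np.abs(dec) < 30.0)
```

Sampling sin(Dec) rather than Dec itself is what makes the density per solid angle constant; sampling Dec uniformly would over-populate the poles.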
Fiber assignment incompleteness is treated by defining assignment completeness and decomposing it into weights based on tile/fiber competition. The fiducial completeness weight is w_comp = 1/FRACZ_TILELOCID, with additional handling of tile-group effects via the FRAC_TLOBS_TILES factor applied to randoms. The paper emphasizes that this fiducial approach is not strictly unbiased at small angular scales because of physical fiber collision constraints; therefore, DESI's default two-point analysis removes pairs with angular separation θ < 0.05 deg (the "θ-cut") and incorporates the resulting bias through window matrices.
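A schematic of the two ingredients just described: inverse-completeness weighting and the θ-cut on close pairs. The per-target fractions and coordinates below are made up for illustration, and the brute-force pair loop stands in for a real pair counter.

```python
import numpy as np

# Hypothetical per-target completeness fractions (FRACZ_TILELOCID-like):
# the fiducial weight is the inverse of the observed fraction of targets
# competing for the same fiber/tile combination.
fracz = np.array([1.0, 0.5, 0.25])
w_comp = 1.0 / fracz  # -> [1., 2., 4.]

def angsep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (spherical law of cosines)."""
    r1, d1, r2, d2 = map(np.radians, (ra1, dec1, ra2, dec2))
    c = np.sin(d1) * np.sin(d2) + np.cos(d1) * np.cos(d2) * np.cos(r1 - r2)
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))

def theta_cut_pairs(ra, dec, theta_min=0.05):
    """Index pairs kept by the theta-cut: separation >= theta_min deg.
    Brute force O(N^2); a sketch, not a production pair counter."""
    i, j = np.triu_indices(len(ra), k=1)
    sep = angsep_deg(ra[i], dec[i], ra[j], dec[j])
    keep = sep >= theta_min
    return i[keep], j[keep]

ra = np.array([10.0, 10.0, 10.0])
dec = np.array([0.0, 0.01, 1.0])
i, j = theta_cut_pairs(ra, dec)
# the (0, 1) pair at 0.01 deg is dropped; (0, 2) and (1, 2) survive
```

The same θ-cut must be applied consistently to data–data, data–random, and random–random pairs so that the estimator remains internally consistent.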
Imaging systematics are corrected using tracer-dependent regression methods. BGS uses linear regression with three maps (r-band depth, stellar density, HI column density). LRG uses linear regression with five maps in BASS/MzLS and one additional map in DECam regions depending on redshift bin. QSO uses a random-forest regression (Regressis). ELG uses a SYSNet neural-network regression. The paper validates these choices with null tests: it computes χ² statistics of normalized projected density versus imaging-property bins, comparing data to uncontaminated mocks. It reports large improvements in the null-test χ² after applying weights (e.g., for BGS, improvement close to a factor of 2 in DECam; for QSO, overall improvement greater than a factor of five in DECam). It also identifies residual concerns: for LRG, some residual trends in DECam are driven by CIB contamination of the Galactic extinction map, and for ELG the imaging systematic impact is the most severe.
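The linear-regression variant of this mitigation can be sketched as follows: inject a linear trend of observed density with two standardized "maps", fit it by least squares, and take the inverse of the fitted model as the per-pixel weight. The maps and coefficients are synthetic; the actual pipelines (linear, Regressis, SYSNet) operate on real Healpix property maps.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pix = 5000
depth = rng.normal(0.0, 1.0, n_pix)   # standardized "imaging depth" map
stars = rng.normal(0.0, 1.0, n_pix)   # standardized "stellar density" map

# inject a linear contamination into the per-pixel observed density
delta = rng.normal(0.0, 0.05, n_pix)                   # true fluctuations
obs = (1.0 + delta) * (1.0 + 0.10 * depth - 0.05 * stars)

# least-squares fit of observed density on the maps
A = np.column_stack([np.ones(n_pix), depth, stars])
coef, *_ = np.linalg.lstsq(A, obs, rcond=None)
w_sys = 1.0 / (A @ coef)     # inverse of the fitted model = weight
corrected = obs * w_sys      # contamination trend removed
```

Fitting recovers the injected coefficients (≈ 0.10 and −0.05), and weighting by the inverse model flattens the density–systematics relation, which is exactly what the null tests check.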
Spectroscopic systematics are handled via redshift-failure weights w_zfail, modeled as a function of TSNR2 (and, for some tracers, fiberflux and redshift). The paper reports that despite significant trends in success rates with observing conditions, the impact on two-point clustering is negligible for DR1 when using these weights.
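A minimal version of this success-rate weighting: bin a TSNR2-like quantity, measure the success fraction per bin, and up-weight each good redshift by the inverse of its bin's rate. By construction the weighted count of successes then recovers the parent sample size. Variable names and the functional form of the success trend are illustrative, not the paper's fitted model.

```python
import numpy as np

def zfail_weights(tsnr2, success, nbins=10):
    """Per-object weight 1/(success rate in the object's TSNR2 bin).
    Returned for every object; in practice only good redshifts carry it."""
    edges = np.quantile(tsnr2, np.linspace(0.0, 1.0, nbins + 1))
    idx = np.clip(np.searchsorted(edges, tsnr2, side="right") - 1, 0, nbins - 1)
    rate = np.array([success[idx == b].mean() for b in range(nbins)])
    return 1.0 / rate[idx]

rng = np.random.default_rng(2)
n = 20_000
tsnr2 = rng.uniform(50.0, 200.0, n)
p_success = 0.6 + 0.3 * (tsnr2 - 50.0) / 150.0   # success rises with TSNR2
success = rng.random(n) < p_success
w = zfail_weights(tsnr2, success)
# weighted successes reproduce the parent count by construction
```

This inverse-rate construction is why the correction leaves the mean density unchanged while removing the trend with observing conditions.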
Two-point statistics are measured using Landy–Szalay estimators for correlation functions and FKP-based estimators for power spectra, with multipoles ℓ = 0, 2, 4. The paper describes the use of multiple random catalogs to reduce noise, the role of the θ-cut, and the computation of window matrices that map theoretical predictions to binned measurements. For power spectra, it also details corrections for the radial integral constraint (RIC) and angular integral constraint (AIC) induced by imaging weights, using polynomial templates fit to differences measured in EZmocks and AbacusSummit cut-sky mocks.
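A toy one-dimensional Landy–Szalay estimator conveys the structure of the configuration-space measurement; the real pipeline counts weighted pairs in (s, μ) bins and projects onto Legendre multipoles, which this sketch omits.

```python
import numpy as np

def pair_counts(x1, x2, bins):
    """Histogram of all cross-pair separations (1-D toy)."""
    d = np.abs(x1[:, None] - x2[None, :]).ravel()
    return np.histogram(d, bins=bins)[0].astype(float)

def landy_szalay(data, rand, bins):
    """xi = (DD - 2 DR + RR) / RR with counts normalized by the total
    number of (ordered) pairs; a 1-D toy of the full estimator."""
    nd, nr = len(data), len(rand)
    dd = pair_counts(data, data, bins) / (nd * nd)
    dr = pair_counts(data, rand, bins) / (nd * nr)
    rr = pair_counts(rand, rand, bins) / (nr * nr)
    return (dd - 2.0 * dr + rr) / rr

rng = np.random.default_rng(3)
data = rng.uniform(0.0, 1.0, 1500)    # unclustered "galaxies"
rand = rng.uniform(0.0, 1.0, 3000)    # randoms over the same "footprint"
bins = np.linspace(0.05, 0.3, 6)
xi = landy_szalay(data, rand, bins)   # consistent with zero everywhere
```

With unclustered data and randoms over the same footprint, the estimator returns values consistent with zero; clustering in the data would show up as excess DD counts relative to RR.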
The key validation results compare DESI DR1 "raw" two-point measurements to the mean of 25 "altmtl" mocks (mocks with realistic fiber assignment and the same LSS pipeline). The paper's headline conclusion is that configuration- and Fourier-space two-point measurements are generally consistent to within about a 2% factor in the inferred real-space overdensity field. In the detailed comparisons (Table 9), BGS shows excellent configuration-space agreement, while Fourier-space agreement is somewhat worse but improves when restricting to lower maximum wavenumber. LRG configuration-space agreement is generally strong; the largest χ² values occur in Fourier space for the highest redshift bin, but improve when restricting to larger scales (e.g., the LRG3 monopole with a bias scaling of 0.99). ELG exhibits the most notable tension: the configuration-space comparison yields a low PTE, and Fourier-space mismatches are largest at low k, consistent with residual imaging systematics. QSO mismatches are strongly scale dependent in Fourier space, with improved agreement when restricting to lower k_max (e.g., the QSO monopole agreement improves substantially with a scaling factor of 0.9882).
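The comparison statistic behind Table 9 can be sketched as a χ² of the data vector against the mock mean using the mock sample covariance, together with a PTE. This toy estimates the PTE by Monte Carlo draws and omits refinements the real analysis would need, such as the Hartlap correction for the noisy inverse covariance.

```python
import numpy as np

def chi2_pte(data_vec, mock_vecs, n_draws=200_000, seed=0):
    """chi^2 of data against the mock mean with the mock sample
    covariance; PTE from Monte Carlo chi^2 draws (Hartlap-style
    corrections for the noisy inverse covariance are omitted)."""
    mean = mock_vecs.mean(axis=0)
    cov = np.cov(mock_vecs, rowvar=False)
    diff = data_vec - mean
    chi2 = float(diff @ np.linalg.solve(cov, diff))
    draws = np.random.default_rng(seed).chisquare(len(data_vec), n_draws)
    pte = float((draws > chi2).mean())
    return chi2, pte

rng = np.random.default_rng(4)
mocks = rng.normal(size=(500, 5))     # synthetic "mock" data vectors
data = np.zeros(5)                    # "data" close to the mock mean
chi2, pte = chi2_pte(data, mocks)     # small chi^2, PTE near 1
```

A low PTE, as reported for ELG, signals that the data vector is an unlikely draw from the mock distribution under the estimated covariance.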
Limitations are inherent to the methodology. The paper's validation uses a finite number of mocks (25 altmtl mocks for comparisons; 1000 EZmocks for covariance validation), and the mocks only approximately reproduce all aspects of the real data (notably fiber assignment realism and imaging systematics). It also acknowledges that the fiducial completeness weighting is not unbiased at small scales and therefore relies on the θ-cut and window-matrix modeling. For small-scale clustering, the paper points to an alternative catalog version (v1.5pip) using PIP (pairwise inverse probability) weights and angular up-weighting, which is computationally more demanding and noisier on large scales.
Practically, the results matter for anyone performing DESI DR1 clustering analyses—BAO, full-shape RSD, and primordial non-Gaussianity studies—because they provide the public LSS catalogs, the recommended weights, and the measurement/correction pipeline needed to avoid systematic biases. The paper also provides guidance on robustness testing: analyses should assess sensitivity to imaging systematic weights (especially for ELG), spectroscopic success modeling, and residual integral-constraint effects. The intended audience includes DESI collaboration analysts and external researchers using DESI DR1 LSS products, as well as method developers interested in survey systematics modeling.
Overall, the paper’s core contribution is an end-to-end, validated framework for turning DESI DR1 observations into clustering-ready catalogs and two-point measurements, with explicit quantification of sample sizes, completeness, vetoed footprint fractions, weighting schemes, and the degree of agreement between data and realistic simulations.
Cornell Notes
The paper constructs DESI DR1 LSS catalogs (galaxies and quasars) and the associated random catalogs, weights, and veto masks needed for unbiased two-point clustering measurements. It then validates the resulting configuration- and Fourier-space multipoles against realistic fiber-assignment mocks, finding generally consistent agreement at the 2% level in the inferred overdensity field, with the largest residual issues for ELG and at the lowest wavenumbers in Fourier space.
What is the paper’s central research question?
How to build and validate DESI DR1 galaxy/quasar LSS catalogs (including randoms, veto masks, and weights) so that measured two-point clustering statistics can be modeled accurately despite survey, instrumental, and astrophysical systematics.
What data source and time span are used?
DESI DR1 main survey observations processed with the “iron” spectroscopic reduction, using data through June 14, 2022.
How are the final clustering samples defined for each tracer?
By tracer-specific spectroscopic success criteria (e.g., BGS: ZWARN == 0 and DELTACHI2 > 40; LRG: ZWARN == 0 and DELTACHI2 > 15; ELG: o2c > 0.9; QSO: not rejected by quasar catalog) plus redshift cuts (BGS 0.1 < z < 0.4, LRG 0.4 < z < 1.1, ELG 0.8 < z < 1.6, QSO primary 0.8 < z < 2.1).
What are the final sample sizes used for clustering?
BGS 300,043; LRG 2,138,627; ELG 2,432,072; QSO 1,223,391 total, with 856,831 in the primary 0.8 < z < 2.1 range.
How does the paper correct for fiber assignment incompleteness?
It defines assignment completeness and uses fiducial completeness weights w_comp = 1/FRACZ_TILELOCID, with additional handling of tile-group effects via FRAC_TLOBS_TILES applied to randoms. It also applies a θ-cut (removing pairs with angular separation θ < 0.05 deg) to avoid small-scale biases.
What is the fiducial imaging-systematics mitigation strategy?
Tracer- and redshift-bin-dependent regressions using Healpix imaging property maps: BGS uses linear regression; LRG uses linear regression; QSO uses random-forest regression; ELG uses SYSNet. The fitted density–systematics relations are inverted to form weights w_sys.
How are spectroscopic systematics handled?
By modeling redshift success as a function of TSNR2 (and sometimes fiberflux and redshift) and applying redshift-failure weights w_zfail.
What two-point estimators are used and in what spaces?
Correlation function multipoles via the Landy–Szalay estimator (with Legendre projections) and power spectrum multipoles via an FKP-based estimator with FFTs.
What validation result do the authors emphasize?
Data and realistic fiber-assignment mocks are generally consistent to within about 2% in the inferred real-space overdensity field; detailed comparisons show the largest residual mismatches for ELG at low k and for the QSO Fourier-space quadrupole, improving when restricting to lower k_max.
Review Questions
Explain why the paper uses a θ-cut and window-matrix modeling rather than relying solely on completeness weights for fiber assignment.
Which imaging-systematics regression method is used for each tracer (BGS, LRG, ELG, QSO), and how are the resulting weights validated?
Describe the role of randoms in inducing the radial integral constraint (RIC) and how the paper corrects for it in Fourier space.
From Table 9, identify one tracer where configuration-space agreement is strong and one where Fourier-space agreement requires scale cuts; summarize the key behavior.
Key Points
1. The paper provides end-to-end DESI DR1 LSS catalog construction: target selection, redshift success cuts, veto masks, random catalogs, and systematic weights.
2. Fiber assignment incompleteness is corrected with completeness weights w_comp, but unbiased small-scale clustering requires a θ < 0.05 deg pair truncation and window-matrix modeling (fiducial approach).
3. Imaging systematics are mitigated with tracer-specific regressions (linear for BGS/LRG, SYSNet for ELG, random-forest for QSO) and validated via null tests against uncontaminated mocks.
4. Spectroscopic systematics are corrected using redshift-failure weights w_zfail modeled as functions of TSNR2 (and additional variables for some tracers).
5. Two-point statistics are measured as multipoles in both configuration and Fourier space, with explicit corrections for window effects, RIC, and AIC.
6. Validation against realistic "altmtl" mocks shows general agreement at the 2% level in inferred overdensity; residual tensions are largest for ELG at low k and for the QSO Fourier-space multipoles.