exoALMA. II. Data Calibration and Imaging Pipeline

Ryan A. Loomis, Stefano Facchini, M. Benisty, Pietro Curone, John D. Ilee, Gianni Cataldi, Hsi-Wei Yen, Richard Teague, C. Pinte, Jane Huang, +23 more

The Astrophysical Journal Letters·2025·Physics and Astronomy·29 citations

TL;DR

The pipeline is designed to prevent calibration and imaging artifacts from masquerading as subtle non-Keplerian kinematic deviations in protoplanetary disk channel maps.

Briefing

This paper, “exoALMA. II. Data Calibration and Imaging Pipeline” (Loomis et al. 2025), addresses a practical but scientifically central question: how can ALMA interferometric data be calibrated and imaged with sufficiently high fidelity that faint, spatially localized deviations from Keplerian rotation in protoplanetary disks are not artifacts of miscalibration, misalignment, or imaging choices? The exoALMA Large Program is designed to search for subtle kinematic signatures—potentially produced by embedded planets—using high angular and spectral resolution Band 7 observations of three molecular lines (12CO J=3–2, 13CO J=3–2, and CS J=7–6). Because the analysis is image-plane and channel-map based, the “noise floor” for kinematic features is set not only by thermal noise but also by calibration and imaging systematics that can mimic real velocity residuals. Thus, the pipeline must preserve image fidelity across multiple execution blocks (EBs), multiple array configurations (12-m compact/extended and, for some targets, ACA 7-m), and multiple spectral windows.

The study is not a statistical experiment with a conventional sample size; instead, it is a methodological paper describing the end-to-end processing of a large observational dataset. The exoALMA program targeted 15 protoplanetary disks, observed between October 2021 and May 2023, with 95 12-m execution blocks and 27 7-m execution blocks. The observations used four spectral windows (spws) in Band 7: three centered on the molecular lines with 15.3 kHz sampling (native velocity spacing 13.5 m s⁻¹ and effective velocity resolution of 26 m s⁻¹) and 3840 channels per spw, and a fourth wideband spw (1.875 GHz bandwidth) for deep continuum and self-calibration support. The program’s sensitivity goal was 3 K in a 0.1″ beam over a 150 m s⁻¹ channel at the representative 12CO frequency. For targets with large angular sizes (disk diameters in 12CO above 6″), ACA data were included for 7 sources to recover large-scale emission.
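
As a rough cross-check of the quoted velocity sampling, a channel width in frequency converts to velocity through Δv = cΔν/ν. The sketch below is illustrative only. The rest frequencies are rounded catalog values that the summary does not quote, the function names are hypothetical, and the factor of 2 between channel spacing and effective resolution assumes Hann smoothing with no online channel averaging.

```python
# Convert a channel width in frequency to a velocity width (radio convention).
C_M_PER_S = 299_792_458.0  # speed of light in m/s

# Rounded rest frequencies in GHz (assumed here; not quoted in the text above).
REST_FREQ_GHZ = {
    "12CO J=3-2": 345.796,
    "13CO J=3-2": 330.588,
    "CS J=7-6": 342.883,
}


def channel_width_m_per_s(chan_khz: float, rest_ghz: float) -> float:
    """Return dv = c * dnu / nu for a channel of width dnu at frequency nu."""
    return C_M_PER_S * (chan_khz * 1e3) / (rest_ghz * 1e9)


def effective_resolution_m_per_s(chan_khz: float, rest_ghz: float,
                                 hann_factor: float = 2.0) -> float:
    """Effective resolution, assuming the Hann window spans ~2 channels."""
    return hann_factor * channel_width_m_per_s(chan_khz, rest_ghz)


for line, nu in REST_FREQ_GHZ.items():
    dv = channel_width_m_per_s(15.3, nu)
    res = effective_resolution_m_per_s(15.3, nu)
    print(f"{line}: spacing {dv:.1f} m/s, effective resolution {res:.1f} m/s")
```

With these inputs the channel spacing comes out between roughly 13 and 14 m s⁻¹ depending on the line, in line with the values quoted above.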

Methodologically, the authors build on the ALMA pipeline calibration (CASA-based, using CASA v6.2.1.7) but explicitly add custom steps for (i) alignment between EBs, (ii) self-calibration to correct phase decoherence, (iii) flux rescaling when needed, and (iv) careful imaging with masking strategies designed to avoid biasing results toward Keplerian expectations. The pipeline begins with ALMA pipeline-calibrated measurement sets, followed by manual quality checks. They report that most EBs were fine, but some were semi-pass flagged for specific issues: in one case (LkCa 15) the phase calibrator for long-baseline observations was spatially resolved, so those data were excluded; in another (J1615-3255) maser frequency lock issues produced time-variable LO frequency instability with a standard deviation of 10 kHz. For the latter, they used per-scan centroiding of a match-filtered impulse response spectrum of 12CO to re-center frequencies, finding no structural imaging artifacts.

A key part of the methodology is diagnosing three classes of inter-EB problems: relative alignment errors, phase decoherence (decorrelation), and flux differences. The authors use continuum image comparisons and visibility-domain diagnostics (deprojected baseline amplitude trends and amplitude-vs-time “waterfalls”) to identify decoherence. They then implement a staged self-calibration workflow. First, they create pseudo-continuum measurement sets by flagging 15 km s⁻¹ around the systemic velocities for each line and averaging the remaining data into 250 MHz channels (chosen to avoid frequency smearing at Band 7). They also note that in a few disks, complex organic molecule emission may not have been flagged, but testing showed no notable impact on the self-calibration outcome.
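
The pseudo-continuum step can be pictured with a small numerical sketch. This is not the CASA workflow itself: it operates on a toy single spectrum rather than a measurement set, the function names are hypothetical, and it assumes the quoted 15 km s⁻¹ is a half-width on either side of the systemic velocity.

```python
def line_free_mask(velocities_kms, v_sys_kms, half_width_kms=15.0):
    """True for channels kept as continuum, False for channels flagged as line."""
    return [abs(v - v_sys_kms) > half_width_kms for v in velocities_kms]


def average_unflagged(values, keep, chans_per_bin):
    """Average the kept channels in consecutive bins; fully flagged bins are dropped."""
    averaged = []
    for start in range(0, len(values), chans_per_bin):
        stop = start + chans_per_bin
        kept = [x for x, k in zip(values[start:stop], keep[start:stop]) if k]
        if kept:
            averaged.append(sum(kept) / len(kept))
    return averaged


# Toy spectrum: flat continuum of 1.0 plus a line within 10 km/s of v_sys = 5 km/s.
velocities = [float(v) for v in range(-40, 51, 2)]  # km/s
spectrum = [1.0 + (3.0 if abs(v - 5.0) < 10.0 else 0.0) for v in velocities]

keep = line_free_mask(velocities, v_sys_kms=5.0)
pseudo_continuum = average_unflagged(spectrum, keep, chans_per_bin=8)
print(pseudo_continuum)
```

Because the flagged window is wider than the toy line, every averaged bin contains continuum only, which is the property the self-calibration model relies on.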

The self-calibration proceeds in multiple iterations and at multiple data-combination levels. For each individual EB, they perform one round of phase-only self-calibration using tclean with Briggs weighting (robust parameter 0.5), stopping CLEAN at a threshold of 6 times the RMS noise to avoid inserting spurious signal. They report that this step increased peak SNR in the resulting EB images by up to 50% relative to the original images. Next, they align EBs in the uv-plane rather than relying on image-plane Gaussian fits, because disk morphologies can be asymmetric and ring/gap dominated. Their uv-alignment grids visibilities onto a common uv-grid with natural weighting and minimizes the difference only over overlapping uv-cells, solving for positional shifts (phase shifts) while separating flux scaling into a later step. They emphasize that the order of operations matters when self-calibration is involved: misalignment can create artificial asymmetries that could be misinterpreted as non-Keplerian kinematic features. Empirically, they find the most robust order is EB-based self-calibration, then alignment, then iterative group self-calibration from shorter to longer baselines.
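
The idea behind uv-plane alignment is that a positional offset appears as a linear phase gradient across the uv-plane, independent of any flux scaling. The sketch below illustrates this with a swapped-in, simpler estimator: instead of minimizing the gridded visibility difference as the authors describe, it fits the phase slope of the cross product over shared uv-cells by weighted least squares. The function names, the toy Gaussian source, the sign convention, and the no-phase-wrap assumption are all illustrative.

```python
import cmath
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def fit_shift(grid_a, grid_b):
    """Estimate (dx, dy) in radians from the phase of B * conj(A) on shared cells.

    grid_a, grid_b: dict mapping (u, v) in wavelengths to a complex visibility.
    Assumes V_b = V_a * exp(2j*pi*(u*dx + v*dy)) with phases that do not wrap.
    """
    suu = suv = svv = sup = svp = 0.0
    for (u, v) in grid_a.keys() & grid_b.keys():  # overlapping uv-cells only
        cross = grid_b[(u, v)] * grid_a[(u, v)].conjugate()
        w = abs(cross)                              # amplitude weighting
        phi = cmath.phase(cross) / (2.0 * math.pi)  # phase in turns
        suu += w * u * u
        suv += w * u * v
        svv += w * v * v
        sup += w * u * phi
        svp += w * v * phi
    det = suu * svv - suv * suv
    return (svv * sup - suv * svp) / det, (suu * svp - suv * sup) / det


def toy_grid(r_min_kl, r_max_kl, shift_arcsec=(0.0, 0.0), flux_scale=1.0, cell=20e3):
    """Gridded visibilities of a 0.3 arcsec Gaussian within a uv-radius range."""
    sigma = 0.3 * ARCSEC
    sx, sy = (s * ARCSEC for s in shift_arcsec)
    grid = {}
    for i in range(-10, 11):
        for j in range(-10, 11):
            u, v = i * cell, j * cell
            if r_min_kl * 1e3 <= math.hypot(u, v) <= r_max_kl * 1e3:
                amp = flux_scale * math.exp(-2.0 * math.pi ** 2 * sigma ** 2 * (u * u + v * v))
                grid[(u, v)] = amp * cmath.exp(2j * math.pi * (u * sx + v * sy))
    return grid


compact = toy_grid(0, 120)
extended = toy_grid(40, 200, shift_arcsec=(0.02, -0.01), flux_scale=1.05)
dx, dy = fit_shift(compact, extended)
print(f"recovered shift: {dx / ARCSEC:.4f}, {dy / ARCSEC:.4f} arcsec")
```

The 5% flux mismatch in the toy data does not bias the recovered shift, which mirrors the paper's choice to solve for position first and defer flux scaling to a later step.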

After spatial alignment, they perform flux alignment. They diagnose flux offsets as either constant multiplicative scaling errors (absolute flux calibration differences) or baseline-dependent offsets indicative of residual phase decoherence. They apply a single flux correction when offsets exceed 4% and there is no evidence of decoherence; otherwise, they use a modified group self-calibration approach. In cases with decoherence, they first self-calibrate the ACA plus short-baseline (SB) data to reduce baseline-dependent decorrelation, then assess flux offsets and rescale any EB with offsets exceeding 4%, repeating the phase self-calibration after rescaling. They report that after this second iteration, flux offsets are verified to be all <4%.
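
The decision logic described here can be written down compactly. The following is a minimal sketch under stated assumptions: the flux ratio is taken as EB flux divided by reference flux, the decoherence flag is supplied by the user from the visibility diagnostics, and the function names and returned labels are hypothetical.

```python
def flux_alignment_action(flux_ratio, decoherence_suspected, threshold=0.04):
    """Pick a correction strategy for one EB.

    flux_ratio: EB flux divided by reference flux (assumed definition).
    decoherence_suspected: True if the offset looks baseline dependent.
    """
    if decoherence_suspected:
        # Fix the phases first; the flux offset is re-assessed afterwards.
        return "selfcal_then_reassess"
    if abs(flux_ratio - 1.0) > threshold:
        return "rescale"
    return "none"


def rescale_factor(flux_ratio):
    """Multiplicative factor that brings the EB onto the reference flux scale."""
    return 1.0 / flux_ratio


for ratio, decoherent in [(1.07, False), (1.02, False), (0.90, True)]:
    print(ratio, decoherent, flux_alignment_action(ratio, decoherent))
```

Checking for decoherence before applying any scaling matters because a baseline-dependent amplitude loss cannot be repaired by a single multiplicative factor.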

For group-level self-calibration, the pipeline progressively adds longer baselines, similar to DSHARP’s strategy. For sources with ACA, they start with two rounds of ACA phase self-calibration: the first on EB-long intervals combining scans but not spws/polarizations, and the second on 30 s intervals after combining spws. They then concatenate ACA and SB and run multiple phase self-calibration rounds with progressively shorter solution intervals (360 s, 120 s, 60 s, 30 s, 18 s), again CLEANing down to 6 times the RMS noise each time. They report dramatic peak SNR gains, with improvements of >300% in most cases during the ACA+SB phase self-calibration. They also report that decoherence improvements were observed down to 18 s solution intervals, and that amplitude-vs-time “waterfalls” were significantly reduced.
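
The ACA+SB schedule can be laid out as a simple plan that pairs each solution interval with a CLEAN stopping threshold. This sketch only builds a list of settings and does not call CASA. The RMS values are made-up placeholders, and the dictionary keys are hypothetical names chosen to resemble common self-calibration parameters.

```python
ACA_SB_SOLINTS_S = [360, 120, 60, 30, 18]  # solution intervals quoted in the text


def selfcal_rounds(rms_per_round_mjy, solints_s=ACA_SB_SOLINTS_S, nsigma=6.0):
    """Pair each solution interval with a CLEAN threshold of nsigma * RMS."""
    rounds = []
    for number, (solint, rms) in enumerate(zip(solints_s, rms_per_round_mjy), start=1):
        rounds.append({
            "round": number,
            "calmode": "p",  # phase-only
            "solint": f"{solint}s",
            "clean_threshold": f"{nsigma * rms:.3f}mJy",
        })
    return rounds


# Placeholder RMS values in mJy/beam, one per round (not from the paper).
plan = selfcal_rounds([0.080, 0.060, 0.045, 0.035, 0.030])
for step in plan:
    print(step)
```

In practice the RMS would be re-measured from the image made after each round, so the threshold tightens as the calibration improves.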

If flux offsets remain significant even after phase self-calibration (an example is shown for PDS 66), they revert to un-self-calibrated datasets, apply uniform flux rescaling for the offset EBs, and re-run the phase self-calibration. Finally, they add long-baseline (LB) data and perform phase self-calibration (with multiple rounds) followed by two rounds of amplitude+phase self-calibration. For amplitude solutions, they use a conservative SNR threshold of 5 and flag outliers with amplitude solutions <0.8 or >1.2. The CLEAN model for amplitude rounds is generated deeper (down to 1 times the RMS noise) to include low-level flux on relevant angular scales. They again report substantial peak SNR gains and improved noise structure in several disks.
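
The amplitude-solution screening can be expressed as a small filter. This is a sketch of the stated criteria only, not of how CASA stores or flags calibration tables. The function name, the tuple layout, and the antenna labels and values in the example are hypothetical.

```python
def screen_amplitude_solutions(solutions, min_snr=5.0, amp_lo=0.8, amp_hi=1.2):
    """Split gain solutions into kept and flagged antenna lists.

    solutions: iterable of (antenna, gain_amplitude, snr) tuples.
    A solution is flagged if its SNR is below min_snr or its amplitude
    falls outside the [amp_lo, amp_hi] range.
    """
    kept, flagged = [], []
    for antenna, amplitude, snr in solutions:
        if snr < min_snr or amplitude < amp_lo or amplitude > amp_hi:
            flagged.append(antenna)
        else:
            kept.append(antenna)
    return kept, flagged


# Made-up example solutions.
solutions = [
    ("ant1", 1.03, 22.0),  # within range, good SNR
    ("ant2", 0.74, 18.0),  # amplitude too low
    ("ant3", 1.25, 30.0),  # amplitude too high
    ("ant4", 0.98, 3.1),   # SNR too low
]
kept, flagged = screen_amplitude_solutions(solutions)
print("kept:", kept, "flagged:", flagged)
```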

Once the final self-calibration is complete, they produce “fiducial” measurement sets: a time-averaged continuum set (30 s binning) and line measurement sets for each spectral window (30 s binning), including continuum-subtracted versions. They apply continuum subtraction using uvcontsub with solint=1 and fitorder=1. The authors stress that the order of applying gain solutions, phase shifts, and flux alignments is non-commutative and must match the pseudo-continuum pipeline order.

The imaging section addresses two additional fidelity threats: non-Gaussian PSFs from combined array configurations and spectral artifacts that can map into spurious spatial distortions. They discuss three restoration strategies from the literature (re-weighting visibilities, JvM residual scaling, or using a larger restoring beam). For exoALMA, they avoid the JvM correction to prevent confusion in statistical significance when point-like features coexist with extended emission. They also avoid beam enlargement because it would blur small-scale kinematic features. Instead, they use Briggs robust weighting and moderate uv-tapering to produce PSFs “Gaussian enough” for kinematic analysis, and they record a PSF conditioning metric (defined as restoring beam power divided by main-lobe dirty beam power) in FITS headers. They caution that for some ACA-including images, this metric indicates beams that remain significantly non-Gaussian, so flux measurements in low-SNR regions should be treated carefully.
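
To make the PSF conditioning metric concrete, the sketch below evaluates a ratio of this kind on a toy, azimuthally symmetric dirty beam. Several things are assumptions for illustration: "power" is interpreted as the integrated beam response, the main lobe is taken to end at the first null, the restoring beam is a Gaussian matched to the dirty beam's half-maximum radius, and the toy beam shape and function names are invented.

```python
import math


def gaussian_beam(r, sigma=0.06):
    """Unit-peak Gaussian radial profile (r and sigma in arcsec)."""
    return math.exp(-0.5 * (r / sigma) ** 2)


def shelf_beam(r, core_sigma=0.06, shelf_sigma=0.40, shelf_frac=0.15, offset=0.01):
    """Toy dirty beam: narrow core plus broad shelf, minus a constant so a null exists."""
    core = (1.0 - shelf_frac + offset) * math.exp(-0.5 * (r / core_sigma) ** 2)
    shelf = shelf_frac * math.exp(-0.5 * (r / shelf_sigma) ** 2)
    return core + shelf - offset


def first_null(profile, dr=1e-3, r_max=5.0):
    """Radius where the profile first stops being positive (capped at r_max)."""
    r = 0.0
    while r < r_max and profile(r) > 0.0:
        r += dr
    return r


def radial_volume(profile, r_out, dr=1e-3):
    """Midpoint-rule integral of 2*pi*r*profile(r) from 0 to r_out."""
    n = int(round(r_out / dr))
    return sum(2.0 * math.pi * (i + 0.5) * dr * profile((i + 0.5) * dr) * dr
               for i in range(n))


def half_max_radius(profile, dr=1e-4):
    """Radius at which the profile drops to half of its unit peak."""
    r = 0.0
    while profile(r) > 0.5:
        r += dr
    return r


def psf_ratio(profile):
    """Restoring-beam volume divided by main-lobe dirty-beam volume."""
    sigma = half_max_radius(profile) / math.sqrt(2.0 * math.log(2.0))
    restoring_volume = 2.0 * math.pi * sigma ** 2  # unit-peak Gaussian volume
    return restoring_volume / radial_volume(profile, first_null(profile))


ratio_gaussian = psf_ratio(gaussian_beam)
ratio_shelf = psf_ratio(shelf_beam)
print(f"Gaussian beam: {ratio_gaussian:.2f}, beam with shelf: {ratio_shelf:.2f}")
```

A purely Gaussian dirty beam gives a ratio close to 1, while a beam with a broad shelf gives a much smaller value, which is the kind of departure the recorded metric is meant to expose.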

For line imaging, the pipeline uses fiducial measurement sets and produces channel maps with specified channel spacing (typically 100 m s⁻¹ for CO and 200 m s⁻¹ for CS) and target beam sizes (often 0.15″ × 0.15″ for fiducial cubes). A central methodological choice is the CLEAN masking strategy: because the science targets non-Keplerian morphologies, they do not use Keplerian masking. Instead, they generate CLEAN masks based on the observed emission morphology in each channel by convolving the previous-step CLEAN model with a relatively wide Gaussian kernel (0.5″–0.7″) and thresholding to create a binary mask. They then perform deep iterative CLEANing within the mask, producing multiple products at different CLEAN thresholds (e.g., 6, 5, 4, and 3 times the final RMS).
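
The masking step for a single channel can be sketched as a smooth-then-threshold operation on the CLEAN model. The code below is an illustration under assumptions: the kernel width is treated as a FWHM, the threshold is an arbitrary fraction of the smoothed peak (the summary does not state the level used), and the function names and toy model are hypothetical.

```python
import math


def gaussian_kernel_1d(sigma_pix):
    """Normalized 1D Gaussian kernel truncated at 3 sigma."""
    half = int(math.ceil(3.0 * sigma_pix))
    kernel = [math.exp(-0.5 * (i / sigma_pix) ** 2) for i in range(-half, half + 1)]
    total = sum(kernel)
    return [k / total for k in kernel]


def convolve_rows(image, kernel):
    """Convolve every row with the kernel, treating out-of-bounds pixels as zero."""
    half = len(kernel) // 2
    nx = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(nx):
            acc = 0.0
            for j, weight in enumerate(kernel):
                xx = x + j - half
                if 0 <= xx < nx:
                    acc += weight * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def smooth(image, sigma_pix):
    """Separable 2D Gaussian smoothing: rows first, then columns."""
    kernel = gaussian_kernel_1d(sigma_pix)
    return transpose(convolve_rows(transpose(convolve_rows(image, kernel)), kernel))


def mask_from_model(model, fwhm_arcsec, pix_arcsec, frac_of_peak=0.1):
    """Binary mask where the smoothed model exceeds a fraction of its peak."""
    sigma_pix = fwhm_arcsec / pix_arcsec / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    smoothed = smooth(model, sigma_pix)
    peak = max(max(row) for row in smoothed)
    if peak <= 0.0:
        return [[False] * len(row) for row in smoothed]
    return [[value >= frac_of_peak * peak for value in row] for row in smoothed]


# Toy channel: two CLEAN components on a 41 x 41 grid with 0.1 arcsec pixels.
model = [[0.0] * 41 for _ in range(41)]
model[20][20] = 1.0
model[20][28] = 0.5
mask = mask_from_model(model, fwhm_arcsec=0.6, pix_arcsec=0.1)
print(sum(sum(row) for row in mask), "pixels in mask")
```

Because the mask follows whatever the model contains in that channel, emission away from the expected Keplerian locus is included as long as CLEAN has already picked it up.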

The paper also evaluates spectral effects: ensuring adequate sampling relative to the narrow linewidth along a single line of sight, accounting for ALMA’s Hann-window spectral response and channel covariance, and considering spectral regridding from the topocentric frame into the source co-moving frame. They conclude that their spectral sampling is high enough to mitigate these effects for the spatio-kinematic scales relevant to embedded planets.

Although the paper’s primary deliverable is a pipeline rather than a set of astrophysical results, it provides concrete performance indicators for the produced images. For continuum images, Table 2 reports achieved RMS values and peak SNRs; for example, DM Tau has achieved RMS 26.6 μJy beam⁻¹ with peak SNR 128.6, while PDS 66 reaches peak SNR 1232.5 with RMS 26.0 μJy beam⁻¹. For fiducial line cubes (Table 3), the achieved RMS and peak SNR vary by source and molecule; for instance, MWC 758 CO 3–2 has RMS 4.39 mJy beam⁻¹ and peak SNR 37.73, while HD 135344B CO 3–2 has RMS 2.98 mJy beam⁻¹ and peak SNR 67.11. High-resolution cubes (Table 4) show that some sources have reduced SNR due to resolution demands; e.g., LkCa 15 CO 3–2 at high resolution has RMS 2.53 mJy beam⁻¹ and peak SNR 14.48.

Limitations are mainly methodological and acknowledged implicitly through conservative choices. The pipeline avoids aggressive amplitude self-calibration early and uses calonly modes to prevent flagging low-SNR data. It also notes that PSF non-Gaussianity can persist for some ACA-including images, requiring caution when measuring fluxes at high resolution and low SNR. Additionally, while the paper tests the impact of unflagged COM emission on self-calibration, it does not claim that all astrophysical complexity is fully captured; rather, it demonstrates that the calibration outcome is robust to at least that specific complication.

Practically, the results matter to anyone using exoALMA data for kinematic analyses of planet-disk interactions. The pipeline’s main contribution is a set of calibrated and aligned measurement sets and fiducial images designed to minimize spurious non-Keplerian artifacts. Researchers studying embedded planets, disk dynamics, or instabilities should care because the pipeline explicitly targets the dominant systematic risks for channel-map residual studies: phase decoherence, EB misalignment, flux scaling errors, and CLEAN/restoration artifacts. The authors also release the calibration scripts and QA figures via the exoALMA data release, enabling reproducibility and future improvements. In broader terms, the paper provides a template for high-fidelity interferometric processing when the scientific inference depends on subtle spatially localized deviations rather than on total flux or point-source detection.

Cornell Notes

This paper presents the exoALMA collaboration’s end-to-end calibration and imaging pipeline for high-fidelity Band 7 channel maps of protoplanetary disks. It details custom uv-plane alignment, staged self-calibration to correct phase decoherence, flux rescaling logic, and non-Keplerian-aware CLEAN masking to prevent calibration/imaging artifacts from masquerading as kinematic deviations.

What scientific problem drives the need for a specialized calibration and imaging pipeline?

The program searches for faint, localized deviations from Keplerian rotation in channel maps; any calibration or imaging artifact can set the effective noise floor and mimic real kinematic residuals.

What observational data and spectral setup does the pipeline target?

Band 7 observations of 12CO J=3–2, 13CO J=3–2, and CS J=7–6 with native velocity resolution 26 m s⁻¹ (CO) and 440 m s⁻¹ sampling (CS spw), plus a wideband continuum spw for self-calibration; 95 12-m EBs and 27 7-m EBs across 15 disks.

How does the pipeline handle phase decoherence found in some execution blocks?

It performs iterative phase-only self-calibration, first per-EB, then at the group level (ACA, then ACA+SB, then ACA+SB+LB), using progressively shorter solution intervals down to 18 s, and re-runs after flux rescaling when needed.

How are execution blocks aligned, and why not use image-plane Gaussian fitting?

They use a uv-plane alignment that minimizes gridded visibility differences over overlapping uv-cells; this avoids biases from asymmetric ring/gap morphologies where Gaussian fits can fail.

What is the role of flux alignment, and what threshold triggers rescaling?

Flux offsets are diagnosed as either constant scaling errors or baseline-dependent decoherence. If offsets exceed 4% (and no decoherence is present), a correction is applied; if decoherence is present, they rescale any EB with offsets >4% and repeat self-calibration, ending with offsets <4%.

How does the pipeline avoid CLEAN masking bias toward Keplerian structures?

It does not use Keplerian masking. Instead, it generates CLEAN masks from the observed emission morphology in each channel by convolving an initial CLEAN model with a wide Gaussian kernel (0.5″–0.7″) and thresholding.

What imaging strategy is chosen to manage non-Gaussian PSFs from combined configurations?

They avoid the JvM residual scaling approach and avoid enlarging the restoring beam, which would blur kinematic features; they instead use Briggs robust weighting and moderate uv-tapering to produce sufficiently Gaussian PSFs for kinematic analysis.

What evidence of improvement does the pipeline report during self-calibration?

Peak SNR improvements up to 50% after per-EB self-calibration, and improvements >300% in most cases during ACA+SB phase self-calibration; amplitude-vs-time “waterfalls” are significantly reduced.

What practical outputs does the pipeline deliver for downstream science?

Fiducial continuum and line measurement sets (including continuum-subtracted versions), plus fiducial continuum images and line cubes produced with recorded PSF conditioning metrics and QA products.

Review Questions

Explain why the order of operations (EB self-calibration, spatial alignment, flux scaling, group self-calibration) matters for avoiding artificial asymmetries.
Describe how uv-plane alignment differs from image-plane alignment and why asymmetries/rings motivate the uv approach.
What are the two distinct physical interpretations of “flux offsets” in this pipeline, and how does the pipeline decide which correction strategy to use?
How does the non-Keplerian-aware CLEAN masking procedure work, and what bias does it prevent compared with Keplerian masking?
Why does the paper avoid the JvM correction in the context of point-like features embedded in extended emission?

Key Points

1. The pipeline is designed to prevent calibration and imaging artifacts from masquerading as subtle non-Keplerian kinematic deviations in protoplanetary disk channel maps.
2. It uses staged self-calibration (per-EB phase-only, then group-level ACA, ACA+SB, and ACA+SB+LB) to correct phase decoherence, with progressively shorter solution intervals down to 18 s.
3. Execution blocks are aligned in the uv-plane by minimizing gridded visibility differences over overlapping uv-cells, avoiding biases from asymmetric ring/gap morphologies.
4. Flux alignment is handled after spatial alignment and self-calibration diagnostics, with a 4% threshold and a re-run strategy when baseline-dependent decoherence is present; final flux offsets are verified to be <4%.
5. Line imaging avoids Keplerian masking and instead builds CLEAN masks from observed channel-by-channel morphology using a wide convolution kernel (0.5″–0.7″).
6. To manage non-Gaussian PSFs from multi-configuration data, the pipeline avoids JvM residual scaling and instead uses Briggs robust weighting and moderate uv-tapering; a PSF conditioning metric is recorded in FITS headers.
7. The paper provides concrete achieved RMS and peak SNR values for continuum and line cubes, supporting the claim of high image fidelity for downstream kinematic analyses.

Highlights

“Phase decoherence was found in several datasets, which was corrected by an iterative self-calibration procedure.”

Peak SNR improvements after per-EB self-calibration were “up to 50% compared to the original images,” and during ACA+SB phase self-calibration “improvements > 300% in most cases.”

Flux rescaling logic: “if the flux differences were larger than 4%, a single correction was applied,” and after the second iteration “flux offsets are verified to be all <4%.”

Non-Keplerian-aware masking: “Since one of the primary goals of the exoALMA program is to detect and characterize emission that deviates from Keplerian rotation, the use of Keplerian masking during imaging is not appropriate.”

Imaging fidelity metric: the pipeline records a PSF conditioning metric (restoring beam power / main lobe dirty beam power) in FITS headers and cautions that some ACA-including images remain significantly non-Gaussian.

Topics

Radio interferometry
ALMA data calibration
Self-calibration and phase decoherence
Interferometric imaging and deconvolution
Protoplanetary disk kinematics
Planet-disk interaction signatures
High dynamic range imaging
Spectral regridding and spatio-spectral artifacts

Mentioned

ALMA
CASA (v6.2.1.7)
tclean
applycal
uvcontsub
Briggs weighting
Hogbom deconvolver
multiscale deconvolver
analysisUtils
emcee
numpy
scipy
matplotlib
astropy
Jupyter Notebook
numba
DSHARP and MAPS reduction utilities scripts
eddy (Teague 2019)
Ryan A. Loomis
Stefano Facchini
M. Benisty
Pietro Curone
John D. Ilee
Gianni Cataldi
Hsi-Wei Yen
Richard Teague
Christophe Pinte
Jane Huang
Himanshi Garg
Ryuta Orihara
Ian Czekala
Brianna Zawadzki
Sean M. Andrews
David J. Wilner
Jaehan Bae
Marcelo Barraza-Alfaro
Daniele Fasano
Mario Flock
Misato Fukagawa
Maria Galloway-Sprietsma
Andrés F. Izquierdo
Kazuhiro Kanagawa
Geoffroy Lesur
Cristiano Longarini
François Ménard
Daniel J. Price
Giovanni Rosotti
Jochen Stadler
Gaylor Wafflard-Fernandez
Lisa Wölfer
Tomohiro C. Yoshida
E. Fomalont (private communication referenced)
S. Casassus
M. Cárcamo
I. Czekala
S. Andrews
T. Hunter
E. Brogan
A. Kepley
G. Jorsater
G. van Moorsel
R. Leroy
R. Oberg
C. Pinte
R. Teague
ALMA - Atacama Large Millimeter/submillimeter Array
ACA - Atacama Compact Array
EB - Execution Block
MOUS - Member Observing Unit Set
CASA - Common Astronomy Software Applications
SNR - Signal-to-noise ratio
PSF - Point spread function
JvM - Jorsater & van Moorsel correction method
LAS - Largest Angular Scale
uv-plane - Spatial frequency plane of interferometric measurements
COM - Complex organic molecule(s)
CS - Carbon monosulfide (molecular line in this paper)
vLSR - Velocity in the Local Standard of Rest
tclean - CASA imaging/deconvolution task
uvcontsub - CASA continuum subtraction in the uv domain
Briggs robust - Imaging weighting parameter controlling trade-off between resolution and sensitivity