TDCOSMO XIII. Improved Hubble constant measurement from lensing time delays using spatially resolved stellar kinematics of the lens galaxy

Page created by Cynthia Newton
 
CONTINUE READING
FERMILAB-PUB-23-013-PPD
                                              Astronomy & Astrophysics manuscript no. ms                                                                                                                         ©ESO 2023
                                              January 9, 2023

                                                                                                                       TDCOSMO
                                                  XIII. Improved Hubble constant measurement from lensing time delays using
                                                              spatially resolved stellar kinematics of the lens galaxy
                                              Anowar J. Shajib,1, 2,? Pritom Mozumdar,3 Geoff C.-F. Chen,4 Tommaso Treu,4 Michele Cappellari,5 Shawn Knabel,4
                                                 Sherry H. Suyu,6, 7, 8 Vardha N. Bennert,9 Joshua A. Frieman,1, 2, 10 Dominique Sluse,11 Simon Birrer,12, 13, 14
                                                             Frederic Courbin,15 Christopher D. Fassnacht,3 Lizvette Villafaña,4 Peter R. Williams4
                                                                                                        (Affiliations can be found after the references)

                                                                                                           Received xxx, xxxx; accepted xxx, xxxx
arXiv:2301.02656v1 [astro-ph.CO] 6 Jan 2023

                                                                                                                             ABSTRACT

                                        Strong-lensing time delays enable measurement of the Hubble constant (H0 ) independently of other traditional methods. The main limitation to
                                        the precision of time-delay cosmography is mass-sheet degeneracy (MSD). Some of the previous TDCOSMO analyses broke the MSD by making
                                        standard assumptions about the mass density profile of the lens galaxy, reaching 2% precision from seven lenses. However, this approach could
                                        potentially bias the H0 measurement or underestimate the errors. In this work, for the first time, we break the MSD using spatially resolved
                                        kinematics of the lens galaxy in RXJ1131−1231 obtained from the Keck Cosmic Web Imager spectroscopy, in combination with previously
                                        published time delay and lens models derived from Hubble Space Telescope imaging. This approach allows us to robustly estimate H0 , effectively
                                        implementing a maximally flexible mass model. Following a blind analysis, we estimate the angular diameter distance to the lens galaxy Dd =
                                        865+85                                                 +472                       +7.3
                                            −81 Mpc and the time-delay distance D∆t = 2180−271 Mpc, giving H0 = 77.1−7.1 km s
                                                                                                                                   −1
                                                                                                                                      Mpc−1 – for a flat Λ cold dark matter cosmology. The
                                        error budget accounts for all uncertainties, including the MSD inherent to the lens mass profile and the line-of-sight effects, and those related to the
                                        mass–anisotropy degeneracy and projection effects. Our new measurement is in excellent agreement with those obtained in the past using standard
                                        simply parametrized mass profiles for this single system (H0 = 78.3+3.4
                                                                                                             −3.3 km s
                                                                                                                       −1
                                                                                                                          Mpc−1 ) and for seven lenses (H0 = 74.2+1.6
                                                                                                                                                                    −1.6 km s
                                                                                                                                                                              −1
                                                                                                                                                                                 Mpc−1 ), or for
                                                                                                                                                         +5.8
                                        seven lenses using single-aperture kinematics and the same maximally flexible models used by us (H0 = 73.3−5.8 km s Mpc ). This agreement
                                                                                                                                                                   −1      −1

                                        corroborates the methodology of time-delay cosmography.
                                        Key words. cosmology: distance scale – gravitational lensing: strong – Galaxy: kinematics and dynamics – Galaxies: elliptical and lenticular, cD
                                        – Galaxies: individual: RXJ1131−1231

                                        1. Introduction                                                                                     To determine whether this “Hubble tension” is due to sys-
                                                                                                                                        tematics or new physics, multiple independent methods to mea-
                                        The Hubble constant, H0 , the current value of the Universe’s ex-                               sure H0 are needed (e.g., Verde et al. 2019; Di Valentino et al.
                                        pansion rate, is a crucial cosmological parameter that also sets                                2021; Freedman 2021). The Carnegie–Chicago Hubble Project
                                        the extragalactic distance scale. Recently, tension has emerged                                 uses the tip of the red giant branch (TRGB) to calibrate the
                                        between early- and late-Universe estimates of H0 (e.g., Freed-                                  SNe Ia absolute brightness and measures H0 = 69.6 ± 1.9 km
                                        man 2021; Abdalla et al. 2022). The temperature and polarisa-                                   s−1 Mpc−1 (Freedman et al. 2019, 2020). This TRGB-calibrated
                                        tion fluctuations in the cosmic microwave background (CMB)                                      measurement is statistically consistent with both the SH0ES
                                        provide an estimate of the Hubble parameter at the last scatter-                                measurement and the CMB-based measurements. However, sev-
                                        ing surface H(z ≈ 1100), which can be extrapolated to the cur-                                  eral independent local probes strengthen the “H0 tension” by
                                        rent epoch using the Λ cold dark matter (ΛCDM) cosmology.                                       measuring values consistent with the SH0ES value. For exam-
                                        The CMB measurements from Planck give H0 = 67.4 ± 0.5 km                                        ple, the Megamaser Cosmology Project (MCP) estimates H0 =
                                        s−1 Mpc−1 (Planck Collaboration 2020) and H0 = 67.6 ± 1.1 km                                    73.9 ± 3.0 km s−1 Mpc−1 (Pesce et al. 2020), the surface bright-
                                        s−1 Mpc−1 (Aiola et al. 2020). In the local Universe, H0 can be                                 ness fluctuation (SBF) method measures H0 = 73.7 ± 0.7 ± 2.4
                                        estimated using the cosmic distance ladder, which uses luminos-                                 km s−1 Mpc−1 (Blakeslee et al. 2021), and the Tully–Fisher-
                                        ity distances of type Ia supernovae (SNe Ia) with their absolute                                relation-based method calibrated with Cepheids measures H0 =
                                        brightness calibrated using different classes of stars. The Super-                              75.1 ± 0.2 ± 0.3 km s−1 Mpc−1 (Kourkchi et al. 2020).
                                        nova H0 for the Equation of State of the dark energy (SH0ES)
                                        team uses Cepheids and parallax distances for this calibration,                                     Strong-lensing time delays provide an independent measure-
                                        and they find H0 = 73.04 ± 1.04 km s−1 Mpc−1 (Riess et al.                                      ment of H0 (Refsdal 1964; for an up-to-date review, see Bir-
                                        2022). This value is in 5σ tension with the Planck CMB-based                                    rer et al. 2022b; Treu et al. 2022, for a historical perspective,
                                        measurements. If this difference is not due to systematic errors                                see Treu & Marshall 2016). In strong lensing, a background
                                        in either of these measurements (e.g., Efstathiou 2021), then this                              source appears as multiple images due to the gravitational de-
                                        tension could point to new physics beyond the ΛCDM cosmo-                                       flection of photons by a massive foreground galaxy or galaxy
                                        logical model (e.g., Knox & Millea 2020).                                                       cluster. The photons that were emitted at the same time from
                                                                                                                                        the background source arrive in different images with a rela-
                                              ?
                                                  NFHP Einstein Fellow                                                                  tive time delay. This time delay carries cosmological informa-
                                                                                                                                                                                               Article number, page 1 of 21

                                        This manuscript has been authored by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy, Office of Science, Office of High Energy Physics.
A&A proofs: manuscript no. ms

tion through a combination of angular diameter distances in-         and non-time-delay lenses (Gomer et al. 2022). Such differ-
volved in the lensing system. This combination is referred to        ences could arise, for example, from evolutionary effects, as the
as the “time-delay distance", which is inversely proportional to     SLACS sample is at lower redshift than the TDCOSMO lenses
H0 (Refsdal 1964; Schneider et al. 1992; Suyu et al. 2010). The      (see, e.g., Sonnenfeld et al. 2015, for a discussion of the evolu-
Time-Delay COSMOgraphy (TDCOSMO) collaboration has an-               tion of mass density profiles of massive elliptical galaxies).
alyzed seven time-delay lenses to measure H0 with 2% error,              Spatially resolved velocity dispersion measurements of lens
H0 = 74.2±1.6 km s−1 Mpc−1 assuming a power-law or compos-           galaxies for systems with measured time delays are critical to
ite (i.e., stars and Navarro–Frenk–White (NFW) halo; Navarro         drastically improving the H0 precision, given the limited sam-
et al. 1996, 1997) mass profile for the lensing galaxies (Mil-       ple size of time-delay lenses (Shajib et al. 2018; Yıldırım et al.
lon et al. 2020b). The TDCOSMO collaboration encompasses             2021). The spatially resolved nature of the measured veloc-
the COSmological MOnitoring of GRAvItational Lenses (COS-            ity dispersion is especially powerful in simultaneously break-
MOGRAIL; Courbin et al. 2005; Millon et al. 2020a), the H0           ing the MSD and the mass-anisotropy degeneracy (Cappellari
Lenses in COSMOGRAIL’s Wellspring (H0LiCOW; Suyu et al.              2008; Barnabè et al. 2009, 2012; Collett et al. 2018; Shajib et al.
2010, 2013; Bonvin et al. 2017; Birrer et al. 2019; Rusu et al.      2018). Spatially resolved velocity dispersion measurements for
2020; Wong et al. 2020), the Strong-lensing High Angular Reso-       ∼40 time-delay lens galaxies will yield an independent .2%
lution Programme (SHARP; Chen et al. 2019), and the STRong-          H0 measurement without any mass profile assumption (Birrer
lensing Insights into the Dark Energy Survey (STRIDES; Treu          & Treu 2021). Additional constraints from velocity dispersion
et al. 2018; Shajib et al. 2020) collaborations.                     measurements of non-time-delay lens galaxies or magnification
    The simple parametric lens models, e.g., the power law,          information for standardizable lensed type Ia supernovae can
adopted in the TDCOSMO analyses are “industry standard” con-         further improve the uncertainty to . 1% (Birrer & Treu 2021;
sistent with non-lensing measurements. The TDCOSMO collab-           Birrer et al. 2022a).
oration has performed various systematic checks on the adopted           In this paper, we measure the spatially resolved velocity dis-
lens modeling procedure. These checks find potential system-         persion for the lens galaxy in the strongly lensed quasar system
atic biases to be lower than the acceptable limit (∼1%) from         RXJ1131−1231using the Keck Cosmic Web Imager (KCWI) in-
the choice of mass model parametrization (i.e., power law or         tegral field spectrograph on the W. M. Keck Observatory (Mor-
composite, Millon et al. 2020b), from ignoring dark substruc-        rissey et al. 2012, 2018). and constrain H0 without any mass
tures in the lens galaxy’s halo (Gilman et al. 2020), from ig-       profile assumption from this single time-delay lens system. This
noring disky or boxy-ness in the baryonic distribution (Van de       is the first application of spatially resolved velocity dispersion
Vyvere et al. 2022a), from using different lens modeling soft-       from a time-delay lens to measure H0 . This lens system was pre-
ware (Shajib et al. 2022a), and from ignoring potential isoden-      viously used to measure H0 by combining the observed imaging
sity twists and ellipticity gradients in the lens galaxy (Van de     data, single-aperture velocity dispersion, time delays, and anal-
Vyvere et al. 2022b). However, a significant source of potential     ysis of the line-of-sight environment (Suyu et al. 2013, 2014).
systematics could arise due to the relatively simple parametriza-    However, these previous studies assumed simple parametriza-
tion of the lens mass profile (Kochanek 2020). The well-known        tions for the mass profile, such as a power law or a combination
mass-sheet degeneracy (MSD) does not allow one to constrain          of the NFW profile and the stellar profile with constant mass-
the mass profile shape of the deflector galaxy from lens imaging     to-light, which is the industry standard in modeling of galaxy-
observables alone (Falco et al. 1985; Schneider & Sluse 2013,        scale lenses (Shajib et al. 2022b). Birrer et al. (2016) marginal-
2014). Non-lensing observables, such as the deflector galaxy’s       ized over the MSD effect for the system RXJ1131−1231 to con-
velocity dispersion or the source’s unlensed intrinsic brightness,   strain H0 using a single-aperture velocity dispersion measure-
are required to break the mass-sheet degeneracy and simultane-       ment. Here, we allow the maximal freedom in the MSD by in-
ously constrain H0 and the mass profile shape (Treu & Koop-          troducing one free parameter on top of the simply parametrized
mans 2002; Shajib et al. 2018; Yıldırım et al. 2020, 2021; Birrer    mass profile constrained by lens modeling, which is completely
et al. 2020, 2022a).                                                 degenerate with H0 .
                                                                         This paper is organized as follows. In Section 2, we describe
    The TDCOSMO collaboration has redesigned the experi-             the observational strategy and data reduction. In Section 3, we
ment to mitigate this systematic by relaxing the simple paramet-     describe the procedures to extract the spatially resolved kine-
ric assumptions in the mass profile and constraining the profile     matics map from the KCWI data. In Section 4, we briefly review
shape solely from stellar velocity dispersion measurements of        the lensing and dynamical formalisms and how we combine the
the lensing galaxies (Birrer et al. 2020). Relaxing the assump-      two to mitigate the MSD in our analysis. Then in Section 5, we
tion on the mass profile leads to an increase in the H0 uncer-       describe our dynamical models and present results. We infer the
tainty from 2 to 8% – which is dominated by the uncertainty of       cosmological parameters from our analysis in the Section 6. We
the measured velocity dispersion – giving H0 = 74.5+5.6
                                                     −6.1 km s
                                                                −1
                                                                     discuss our results in Section 7 and conclude the paper in Sec-
Mpc−1 . One approach to improving the precision is to incorpo-       tion 8.
rate prior information on the mass profile shape from the mea-           We performed the cosmological inference blindly in this pa-
sured velocity dispersions of a larger sample of external lenses     per. The measurement of velocity dispersion was not blinded.
without measured time delays. Assuming that the Sloan Lens           However, we blinded the cosmological and other model parame-
ACS (SLACS) survey’s lens galaxies are drawn from the same           ters directly related to cosmological parameters in the dynamical
population as the TDCOSMO lens galaxies and using their ve-          modeling. Before unblinding, this analysis went through an in-
locity dispersions to constrain the mass profile shape, the un-      ternal collaboration-wide review and code review. After all the
certainty on H0 improves to 5%, giving H0 = 67.4+4.1 −3.2 km s
                                                                −1
                                                                     coauthors had agreed that the necessary systematic checks were
     −1
Mpc (Birrer et al. 2020). Note that this estimate is statistically   satisfactorily performed, we froze the analysis and unblinded on
consistent within 1σ with the larger 8% H0 measurement above.        5 January 2023. All the sections in this paper except for the final
However, the shift could also arise from systematic differences,     discussion in Section 7 and summary in Section 8 were written
e.g., a difference between the parent populations of time-delay      before unblinding. After unblinding, we only made minor edits
Article number, page 2 of 21
A. J. Shajib et al.: H0 from spatially resolved kinematics of RXJ1131−1231

                                                                           system while marginalizing the MSD with a prior on the source
                                                                           size. These authors found the prior choice on the anisotropy in
                                                                           the dynamical modeling to be the dominant systematic in infer-
                                                                           ring H0 .

                                                                           2.2. KCWI Spectroscopy
                                                                           We obtained integral field unit (IFU) spectroscopy of
                                                                           RXJ1131−1231 on 16 May and 7 June 2021 with the KCWI in-
                                        S                                  strument on the Keck Observatory (Morrissey et al. 2012, 2018).
                        B
                                             D                             We chose KCWI with the small IFU slicer and the low-resolution
                                                                           blue grating (BL) with a field-of-view (FoV) of 800. 4 × 2000. 4. The
                       A            G                                      spectral resolution is R ≈ 3600, corresponding to an instrumental
                                                                           dispersion σinst ∼ 35 km s−1 . The reciprocal dispersion is 0.5 Å
                            C                                              per pixel. The observed wavelength range 3600–5600 Å covers
                                                                           the Ca H&K lines with wavelengths λλ3933, 3968 Å at the red-
                                                                           shift of the lens galaxy (zd = 0.295). We primarily use these lines
                                                                 N         to determine the stellar velocity dispersion. The redshifted 4304
                                                                           Å G-band is beyond the observed range, so it is not accessible
      1"                                                                   with the KCWI for the RXJ1131−1231 system.
                                                       E                        We aligned the FoV’s longer side with the North direction
                                                                           (i.e., PA = 0◦ ) and dithered the individual exposures by 900 along
                                                                           the North-South direction. As the extent of the RXJ1131−1231
Fig. 1. HST/ACS image of RXJ1131−1231 in the F814W band. The
                                                                           system is smaller than the FoV, each exposure contained the en-
four quasar images are labeled with A, B, C, and D. The central deflec-    tire lens system within the FoV. In different exposures, the lens
tor is marked with G, of which we are measuring the spatially resolved     system occupied the upper or lower portion of the FoV. Thus,
velocity dispersion. An arrow points to the nearby satellite S, which we   the sky in an exposure with the system occupying the upper por-
mask out in the velocity dispersion measurement.. The North and East       tion can be subtracted using another exposure with the system
directions and 1s̈cale are also illustrated.                               occupying the lower portion, and vice versa. We obtained six ex-
                                                                           posures with a total integration time of 10,560 s on 16 May and
                                                                           three with a total integration time of 5,400 s on 7 June. There-
for clarity and grammatical corrections in the previous sections           fore, the total exposure time is texp = 15, 960 s. The airmass
and added the unblinded numbers where relevant in the abstract,            ranged from 1.2 to 1.48 over the integrating period.
main text, and plots.

                                                                           2.3. Data Reduction
2. Observations and data reduction                                         We use the official Python-based data reduction pipeline1 (DRP)
In this section, we provide a brief description of the lens sys-           to reduce our data. The DRP converts the 2D raw data captured
tem RXJ1131−1231 (Section 2.1), the spectroscopic observation              on the detector into a 3D datacube. It performs geometry cor-
with KCWI (Section 2.2), and the data reduction procedure (Sec-            rection, differential atmospheric refraction correction, and wave-
tion 2.3).                                                                 length calibration and produces a final standard-star-calibrated
                                                                           3D datacube for each exposure. The calibration with the stan-
                                                                           dard star corrects for instrumental response and scales the data
2.1. Description of lens system                                            to flux units (Morrissey et al. 2018). We use the final output file
                                                                           with the suffix “_icubes” for further analysis.
The quadruply imaged quasar lens system RXJ1131−1231 was
                                                                                We stack the dithered datacubes through drizzling (Fruchter
discovered by Sluse et al. (2003). The deflector in this system
                                                                           & Hook 2002). Since the exposures are obtained on different
is an elliptical galaxy with redshift zd = 0.295, and the source
                                                                           dates, the world coordinate system information is not accurate
redshift is zs = 0.657 (Sluse et al. 2003). Due to its low red-
                                                                           enough to determine the relative positions of the dithered expo-
shifts, the system is relatively bright and large in angular size.
                                                                           sures. We follow Chen et al. (2021b) to determine the relative po-
The Einstein ring in this system contains intricate features, pro-
                                                                           sitions by simultaneously fitting the point spread function (PSF)
viding a wealth of information to constrain the lens mass model
                                                                           to the four quasar image positions. To perform the drizzling on
(see Figure 1). Due to its early discovery and information-rich
                                                                           the datacubes, we repurpose the drizzling routine of the DRP for
features, this system is one of the most studied lensed quasar
                                                                           OSIRIS, another IFU spectrograph on the Keck Observatory2 .
systems. The time delays for this system were measured by
                                                                           For the drizzling process, we set pixfrac = 0.7 as recom-
Tewes et al. (2013). Suyu et al. (2013, 2014) performed cosmo-
                                                                           mended to reduce correlated uncertainties between the drizzled
graphic analyses of this system. These authors combined simply
                                                                           pixels (Avila et al. 2015). We calculate the drizzled weight image
parametrized lens models based on the high-resolution imaging
                                                                           and ensure that the ratio of RMS/median < 0.2 in the region of
from the HST’s Advanced Camera for Surveys (ACS) instru-
                                                                           interest so that the trade-off is balanced between improving the
ment (HST-GO 9744; PI: Kochanek), the measured time delays,
single-aperture velocity dispersion, and external convergence es-          1
                                                                             developed by Luca Rizzi, Don Neill, Max Brodheim; https://
timate to infer H0 = 80.0+4.5
                          −4.7 km s
                                   −1
                                      Mpc−1 . However, such sim-           kcwi-drp.readthedocs.io/
ply parametrized lens models implicitly break the MSD. Birrer              2
                                                                             https://github.com/Keck-DataReductionPipelines/
et al. (2016) performed an independent mass modeling of this               OsirisDRP

                                                                                                                    Article number, page 3 of 21
A&A proofs: manuscript no. ms

image resolution and increasing the background noise (Gonzaga               to preserve the maximal spatial resolution and reduce the bias in
et al. 2012). The rectangular pixel size 000. 145700 × 000. 3395 of the     the lower-S /N region (Cappellari & Copin 2003). We elaborate
KCWI is kept the same in the drizzled output. We transform the              on these steps below.
datacube to have square pixels of size 000. 1457 × 000. 1457 through            To estimate the lens galaxy’s S /N in each spaxel, we first si-
resampling while conserving the total flux. We converted the pix-           multaneously fit the quasar and the lens galaxy in each spaxel to
els into square sizes for the convenience of Voronoi binning the            calculate the signal from each of them. We perform this fitting
spectra using the software vorbin as described in Section 3.2.              within the wavelength range 3400–4300 Å. As the four quasar
    We directly estimate the PSF from the observed data. We                 images surround the lens galaxy, each spaxel receives a differ-
produce a 2D image from the datacube by summing along the                   ent contribution from the quasar light. We take spectra at the
wavelength axis (see Figure 2). We create a model for this KCWI             central spaxel of image A as the quasar template, ignoring the
image using a high-resolution template from the HST imaging                 lens galaxy’s small contribution. Later in Section 3.3, we also
(Figure 1) that has a pixel size 000. 05 and PSF full width at half         choose the quasar template from images B and C to account for
maximum (FWHM) 000. 10. In the model, the template is con-                  the associated systematic uncertainty, i.e., the potential impact
volved with a Gaussian PSF with a free FWHM parameter, and                  of chromatic microlensing that may change the contrast between
the positioning of the template on the KCWI image grid is fitted            the line and the continuum (e.g., Sluse et al. 2007).
with two additional free parameters. By optimizing the model,                   We determine a single optimal template spectrum for the lens
we estimate that the PSF FWHM is 000. 96.                                   galaxy template. For this purpose, we binned the spectra from
                                                                            spaxels within a circular region of radius 000. 5 centered on the
                                                                            lens galaxy and fit it with pPXF using the 628 stellar templates
3. Kinematics maps                                                          from the XSL and the quasar template. We also include a Leg-
This section describes our procedure to obtain the final kine-              endre polynomial of degree 3 as a component in the fitting to ac-
matics map. We use the pPXF package3 to fit the spectra with                count for any residual gradient in the continuum. pPXF chooses
a library of stellar templates and extract the velocity dispersion          39 of the stellar templates and builds the optimal template by
(Cappellari 2017, 2022). In Section 3.1, we describe the stel-              taking a weighted linear combination of them. See Figure 3 for
lar templates used for the analysis. In Section 3.2, we present             the weighted distribution of spectral types of the full template li-
the measurement of the spatially-resolved kinematics map of the             brary and that of the 39 stars selected by pPXF. Among those
lens galaxy. In Section 3.3, we test the systematics of the velocity        stars in the XSL with stellar classes specified by the Simbad
dispersion measurement.                                                     database (Wenger et al. 2000), G-type stars are selected with
                                                                            the highest total weight, consistent with the fact that massive
                                                                            elliptical galaxy spectra are dominated by G and K-type stars.
3.1. Library of Stellar Templates                                           In the pPXF fitting procedure, the stellar templates are broad-
The popularly used template libraries Medium-resolution                     ened, corresponding to a freely varying velocity dispersion, but
Isaac Newton Telescope library of empirical spectra (MILES;                 the velocity dispersion does not broaden the quasar template.
Sánchez-Blázquez et al. 2006) and INDO-US templates (Valdes                     Once the optimal galaxy template is constructed, we use this
et al. 2004) are both too low resolution to fit our datasets. The           template and the quasar template to fit the spectrum of each
KCWI’s instrumental resolution of R ≈ 3600 leads to σinst ∼ 35              spaxel individually. We use this optimal template to fit the galaxy
km s−1 for a Gaussian line spread function (LSF)4 . MILES                   spectra in individual spaxels instead of the full template library
has a resolution of σtemplate ∼ 64 km s−1 (i.e., R ∼ 2000),                 to avoid large spurious fluctuations in the measured velocity dis-
and the INDO-US templates have an approximately constant-                   persion from spaxel to spaxel. We show the decomposition of
                                                                            the spectra from one example spaxel into different components
wavelength resolution of 1.2 Å, which corresponds to σtemplate =
                                                                            after fitting with pPXF in Figure 2. We calculate the signal of the
39 km s−1 over the Ca H&K wavelength range. Therefore, we                   lens galaxy’s spectrum in each spaxel by subtracting the mod-
choose the X-shooter Spectral Library (XSL), which contains                 eled quasar component from the observed spectra. The noise is
628 stars covering three segments, including UVB, Vis, and NIR              estimated by adding in quadrature the Poisson noise of the total
bands (Gonneau et al. 2020). As our data cover the rest-frame               signal and the background noise estimated from an
blue/UV range, we only use the UVB segment to fit the data,                                                                     √ empty patch
                                                                            of the sky. The noise values are multiplied by 2 to account
where its resolution is R ∼ 9700 and σtemplate ∼ 13 km s−1 .
                                                                            for the fact that the square pixels are created from the rectan-
                                                                            gular pixels about double the size through resampling. We esti-
3.2. Measuring the velocity dispersion                                      mate the S /N using the restframe wavelength range 3985–4085
                                                                            Å, slightly above the Ca H&K absorption lines in wavelength
We choose a cutout centred on the lens system with 600. 235 ×
                                                                            (see the purple shaded region in Figure 2).
600. 235 (43 pixels × 43 pixels) to initiate the analysis (see Fig-
                                                                                To perform Voronoi binning before the velocity dispersion
ure 2). We estimate the lens galaxy light’s signal-to-noise ra-
                                                                            measurement, we select the spaxels within a radius of 100. 55 from
tio (S /N) in each spatial pixel (hereafter, spaxel) within this ini-
                                                                            the lens galaxy center that avoid the brightest spaxels contain-
tial cutout. We then select a region with sufficient S /N from the
                                                                            ing images A, B, and C and the lensed arcs. We also exclude a
lens galaxy and relatively low quasar contamination for measur-
                                                                            circular region around image D with radius 000. 5. To avoid any po-
ing the velocity dispersion (the yellow contour in Figure 2’s left
                                                                            tential bias due to contamination from the satellite galaxy S, we
panel). We perform Voronoi binning within this selected region
                                                                            exclude the spaxel at its position (∆RA = 000. 09, ∆Dec = 000. 54
3
   https://pypi.org/project/ppxf/                                           from the galaxy center, Suyu et al. 2013). We also exclude pixels
4
   We quantitatively verified that the shape of the instrumental LSF is     with S /N < 1 Å−1 . In the end, the spaxels within the selected re-
Gaussian (cf. Figure 28 of Morrissey et al. 2018). Thus, the treatment of   gion have S /N > 1.4 Å−1 (see Figure 2 for the selected region).
the instrumental LSF in pPXF is self-consistent and avoids any system-
atic bias due to inconsistent definitions of the LSF’s FWHM (Robertson       5
                                                                               For reference, 100. 5 corresponds to 6.6 kpc at zd = 0.295 for a fiducial
2013).                                                                      flat ΛCDM cosmology with H0 = 70 km s−1 Mpc−1 and Ωm = 0.3.

Article number, page 4 of 21
A. J. Shajib et al.: H0 from spatially resolved kinematics of RXJ1131−1231

                     KCWI…data…of…RXJ1131
                                      1231
                                                           1200
                                                           1000                              0.05                                                Data

                                                                 Flux…(arbitrary…unit)
                                                           800                                                                                   Best…fit…model…(galaxy…+…quasar)
                                                                                             0.04                                                Best…fit…quasar…model
                                                           600
                                                                                                                                                 Data……quasar
                                                           400                               0.03
                                                                                                                                                 Residual…(
                                                                                                                                                          = …data…
                                                                                                                                                                 …model)
                                                           200                               0.02

                                                                                             0.01
                                                   N
                                                           0
                     1"                                                                      0.00
                                           E
                                                                                                        3400        3600             3800             4000             4200
                                                                                                                           Restframe…wavelength…(Å)

Fig. 2. Left: 2D representation (median-collapsed) of the 3D KCWI datacube for RXJ1131−1231. The yellow contour traces the region with
100. 5 radial extent from the center selected for stellar kinematic measurement. A circular region with 000. 5 radius around image D and the spaxel
containing the satellite S are excluded from this selected region. All the individual spaxels within this region have continuum S /N > 1.4 Å−1 for
the lens galaxy’s light within 3985–4085 Å (the purple shaded range in the right panel). Right: The spectra (grey) from an example pixel (grey
box in the left panel) and the estimate of the signal from the lens galaxy’s spectra (orange) after removing the contribution from the quasar light
(blue). The full model of the spectra is presented with the red line, and the model’s residual is plotted in emerald color. The vertical purple shaded
region marks where we compute the continuum S /N.

                                                                                                               3.3. Estimation of systematic uncertainty
                               All
                                                                                                               To estimate the systematic uncertainties in the velocity disper-
Normalized…density

                               Set…1
                     1.5                                                                                       sion measurement, we consider a range of plausible choices in
                               Set…2                                                                           the extraction procedure: the degrees of the additive Legendre
                               Set…3                                                                           polynomial used to correct the template continuum shape be-
                     1.0                                                                                       tween 2 to 4; the quasar template obtained from images A, B, and
                                                                                                               C; the fitted wavelength range chosen from 3300–4200 Å, 3350–
                                                                                                               4250 Å, and 3400–4300 Å; and three sets of template spectra
                     0.5                                                                                       used in the fitting. The first set of template spectra contains the
                                                                                                               complete XSL of 628 stars. The second set contains half of the
                                                                                                               entire sample that is randomly selected, and the third set con-
                     0.0                                                                                       tains the other half. The numbers of stars selected by pPXF in
                           O   B   A   F       G       K       M                         L      S   C     ~    the three sets are 39, 32, and 33, respectively. Sets 2 and 3 have
                                               Stellar…type                                                    15 and 17 stars, respectively, in common with Set 1. Figure 3
                                                                                                               shows the distribution of spectral types in all three sets and the
Fig. 3. Distribution of the stellar spectral types in the XSL according                                        entire library. We do not take the quasar template from image D
to the Simbad database. Unspecified stars are grouped in the ‘∼’ class.                                        as it is much fainter than the other images, and thus the galaxy
The dark grey color represents the full library of 628 stars. Set 1 (or-                                       contribution in the brightest spaxel on image D is non-negligible.
ange) refers to the 39 stars selected by pPXF out of the full library to                                       Taking a combination of all of these choices yields 81 different
construct an optimal template 1. Set 2 (blue) refers to 32 stars selected
from a random half of the full library and set 3 (emerald) refers to 33
                                                                                                               setups. We illustrate the shift in the extracted velocity dispersion
stars selected from the other half. Sets 2 and 3 have 15 and 17 stars, re-                                     maps for one change of setting at a time in Figures 6 and 7.
spectively, in common with Set 1. The alternating light grey and white                                             We estimate the variance-covariance matrix of the binned ve-
vertical regions divide the spectral classes for easier visualization.                                         locity dispersions from these 81 setups. To do this, we generate
                                                                                                               1,000 random realizations of the measured velocity dispersion
                                                                                                               map for each of the 81 setups using the corresponding statistical
We perform Voronoi binning using vorbin6 given the estimated                                                   uncertainty. We create the variance-covariance matrix from the
S /N values for each spaxel. In Figure 4, we show the 41 Voronoi                                               81,000 realizations combined from all the setups. In this way,
bins obtained by setting the target S /N ≈ 23 Å−1 for each bin.                                                the diagonal terms of the variance-covariance matrix encode
This target S /N was chosen so that the resultant S /N & 20 Å−1                                                the total variance from systematic and statistical uncertainties,
for each bin, which is standard practice (Figure 4, only bin 16                                                and the off-diagonal terms encode the systematic covariances.
has S /N ≈ 18 Å−1 ).                                                                                           For example, if all 81 setups hypothetically provided the same
     For each Voronoi bin, we measure the velocity dispersion by                                               velocity dispersion map and uncertainty, then the off-diagonal
fitting the binned spectra using pPXF using the optimal galaxy                                                 terms would be zero, and the diagonal terms would reflect only
template described above, the quasar template, and the additive                                                the statistical uncertainties. We show the systematic variance-
Legendre polynomial to model any slight gradient in the popula-                                                covariance relative to the statistical variance in Figure 8. The
tion. A few examples of pPXF fit of the binned spectra are shown                                               systematic variance is subdominant relative to the statistical vari-
in Figure 5.                                                                                                   ance (with a median of 0.47 of the ratio between systematic and
                                                                                                               statistical covariances along the diagonal) except for bins 29 and
 6
           https://pypi.org/project/vorbin/                                                                    31. These two bins are closest to quasar images A and C, and
                                                                                                                                                       Article number, page 5 of 21
A&A proofs: manuscript no. ms

                                              Map…of…
                                                   Voronoi…bins
                                                                                                                                        30
                                                                  40
                                                                                                                                        28
                                                                                                  39
                                             37             36             38
                                                                                                                                        26
                                                                      34                     35
                                                   33                              28
                                                             32                                                                         24

                                                                                                                             S/N…(Å )
                                                                          27                 26

                                                                                                                         1
                                                  30
                                                             25           21            22
                                   31
                                                        24        11                    5         15
                                                  23         10
                                                                      2
                                                                           1
                                                                                        4
                                                                                                            16
                                                                                                                                        22
                                                             9                 3                                 41
                                                                                              6

                                        29             20
                                                                  8
                                                                                   7                        14                          20
                                                                      12                 13
                                                                                                       17                               18
                                                   19
                                                                                                                                                    Voronoi…bins
                                                                          18
                                                                                                                                        16          Target…S/N

                                                                                                                                         0          5                                             10         15          20      25        30        35      40
                                                                                                                                                                                                                   Bin…number

Fig. 4. Left: Voronoi binning of the selected spaxels within 100. 5 from the galaxy center that avoid lensed arcs, quasar images, and the satellite
galaxy S. The different colors illustrate the regions for each Voronoi bin in a cartographic manner for easier visualization, with the bin number
specified within each bin. We perform the binning with a target S /N ≈ 23 Å−1 for each bin, which results in 41 bins in total. Right: Resultant S /N
for each Voronoi bin (red points).

                                             bin 1: σlos = 282 ± 7 km s−1                                                                                                        0.150                    bin 6: σlos = 284 ± 11 km s−1
                          0.06
 Flux (arbitrary unit)

                                                                                                                                                        Flux (arbitrary unit)

                                                                                                                                                                                 0.125

                                                                                                                                                                                 0.100
                          0.04
                                                                                                                                                                                 0.075

                                                                                                                                                                                 0.050
                          0.02
                                                                                                                                                                                 0.025

                          0.00                                                                                                                                                   0.000
                            3400                             3600                                 3800                4000                   4200                                   3400                          3600          3800        4000          4200
                                                                           Restframe wavelength (Å)                                                                                                                     Restframe wavelength (Å)

                                             bin 20: σlos = 274 ± 13 km s−1                                                                                                                       1.0     bin 37: σlos = 263 ± 16 km s−1
                                  0.3
          Flux (arbitrary unit)

                                                                                                                                                                          Flux (arbitrary unit)

                                                                                                                                                                                                  0.8
                                                                                                                                                                                                                                 Data
                                  0.2                                                                                                                                                             0.6
                                                                                                                                                                                                                                 Best fit model (galaxy + quasar)
                                                                                                                                                                                                                                 Best fit quasar model
                                                                                                                                                                                                  0.4
                                  0.1
                                                                                                                                                                                                  0.2

                                  0.0                                                                                                                                                             0.0
                                   3400                      3600                                 3800                4000                   4200                                                  3400           3600          3800        4000          4200
                                                                           Restframe wavelength (Å)                                                                                                                     Restframe wavelength (Å)

Fig. 5. pPXF fitting to the spectra from four examples of Voronoi bins. The bin number and the measured velocity dispersion for the corresponding
bin are specified in each panel. The grey line presents the full spectra, the red line traces the best-fit model, and the blue line shows the quasar
component in the best-fit model.

thus largely susceptible to the choice of quasar template (see                                                                                                    velocity of 182 km s−1 using the pafit7 software program (Kra-
Figures 6 and 7.)                                                                                                                                                 jnović et al. 2006) and subtract it from the mean velocity map.
                                                                                                                                                                  The systematic velocity is the result of a slight deviation in the
                                                                                                                                                                  true redshift from the fiducial value. The mean velocity map does
    We show the velocity dispersion and mean velocity maps av-
                                                                                                                                                                      7
eraged over the 81 setups in Figure 9. We estimate a systematic                                                                                                                        https://pypi.org/project/pafit/

Article number, page 6 of 21
A. J. Shajib et al.: H0 from spatially resolved kinematics of RXJ1131−1231

not show any significant evidence of ordered rotation above the       4.1.2. Description of the MST
systematic and statistical noise levels. Thus it is consistent with
the lens galaxy being a slow rotator. We use this systematic-         The MST is a mathematical transform of the convergence pro-
averaged velocity dispersion map and the variance-covariance          file that leaves invariant all the imaging observables, such as the
matrix estimated above when computing the likelihood function         image positions and the flux ratios (Falco et al. 1985; Schneider
for dynamical modeling in Section 5.                                  & Sluse 2014). This transform scales the convergence and the
    To test the impact of our choice for the Voronoi binning          unknown source position as
scheme, we adopt an alternative target S /N ≈ 28 Å−1 for each         κ → κ0 = λMST κ + (1 − λMST ),
bin, which results in 27 bins. We similarly produce another                                                                               (6)
set of 81 model setups in this binning scheme and produce             ς → ς0 = λMST ς.
the variance-covariance matrix for these binned velocity disper-      where λMST is the transformation parameter. The predicted time
sions. We test the systematic impact of this different binning        delay ∆t scales under the transform as
scheme on the cosmological measurement later in Section 5.2.
We show the difference in the extracted kinematics between the        ∆t → ∆t0 = λMST ∆t.                                                 (7)
two binning schemes in Figure 10.
                                                                      Then, the inferred time-delay distance D∆t and the Hubble con-
                                                                      stant H0 based on the observed time delays will change as
4. Overview of lens and dynamical modeling
                                                                              D∆t
This section reviews the theoretical formalism of lens and dy-        D0∆t =      ,
namical modeling.                                                            λMST                                                         (8)
                                                                      H00 = λMST H0 .
4.1. Lensing observables and modeling                                 However, the MST changes the predicted velocity dispersion,
We briefly review the strong lensing formalism in the context         thus measuring it breaks the MSD. Notably, the MST also
of time-delay cosmography in Section 4.1.1, describe the mass-        rescales the lensing magnifications. Thus, standardizable candles
sheet transform (MST) in Section 4.1.2, and explain the internal      can also be used to break the MSD (Bertin & Lombardi 2006;
and external components of the MST in Section 4.1.3.                  Birrer et al. 2022a) provided that microlensing and millilensing
                                                                      can be mitigated (e.g., Yahalomi et al. 2017; More et al. 2017;
                                                                      Foxley-Marrable et al. 2018).
4.1.1. Strong lensing formalism
In the thin lens approximation applicable in this case, lensing       4.1.3. Internal and external MST
observables are described using the surface mass density Σ(R)
projected from the 3D mass density distribution ρ(r) in the lens      We can express the “true” (i.e., physically present) lensing mass
galaxy. Formally, the lensing observables depend on the dimen-        distribution as
sionless convergence defined as                                       κtrue = κgal + κext ,                                               (9)
       Σ(θ)
κ(θ) ≡      ,                                                   (1)   where κgal is the mass distribution of the central lens galaxy
       Σcr
                                                                      (or galaxies) that is (are) considered in the lens modeling, and
which is the surface mass density normalized by the critical den-     κext is called the external convergence, which approximates the
sity                                                                  projected mass distribution of line-of-sight structures as a mass
                                                                      sheet. Since limθ→∞ κgal = 0 has to be satisfied, we find that
          c2 Ds                                                       limθ→∞ κtrue = κext , hence the interpretation of κext as the lensing
Σcr ≡             .                                             (2)
        4πGDd Dds                                                     mass far from (or, “external” to) the central deflector(s).
Here, c is the speed of light, G is the gravitational constant, Ds         All the lensing observables including imaging observables
is the angular diameter distance between the observer and the         result from κtrue . However, since only the central galaxies are
source, Dd is the angular diameter distance between the observer      usually considered in lens modeling with imaging observables,
and the lens galaxy, and Dds is the angular diameter distance         the lens model provides κmodel0
                                                                                                          with limθ→∞ κmodel
                                                                                                                         0
                                                                                                                                 = 0. This
between the lens galaxy and the source. The on-sky deflection         κmodel is an MST of κtrue for λMST = 1/(1 − κext ) as
                                                                        0

angle α(θ) relates to the convergence as                                         κgal + κext        1       κgal
                                                                      κmodel
                                                                       0
                                                                             =               +1−         =         .                     (10)
       1
κ(θ) = ∇ · α(θ).                                        (3)                       1 − κext       1 + κext 1 − κext
       2
                                                                      Lens mass models are usually described with simply
The time delay between two quasar images labeled A and B is           parametrized models, such as the power law or a combi-
given by                                                              nation of the NFW profile and the observed stellar distribution.
        D∆t (θA − ς)2 (θB − ς)2
           "                                      #                   In that case, the assumption of a simple parametric form
∆tAB =                −          − ψ(θA ) + ψ(θB ) ,    (4)           implicitly breaks the MSD. Therefore, the simply parametrized
         c       2         2                                          model κmodel can be expressed as another approximate MST of
where θA is the angular position of image A, ς is the source’s        the κmodel
                                                                           0
                                                                                 as
angular position, ψ(θ) is the lensing potential, and the time-delay
distance D∆t is defined as                                            κmodel
                                                                       0
                                                                             ≈ λint κmodel + (1 − λint )κs (θ),                          (11)

                  Dd Ds                                               where λint is called the internal MST parameter, and κs is a “vari-
D∆t ≡ (1 + zd )         .                                       (5)   able” mass sheet with limθ→∞ κs (θ) = 0 to ensure that both
                   Dds
                                                                                                                  Article number, page 7 of 21
A&A proofs: manuscript no. ms

 Range:…33504250…Å                       Range:…33004200…Å                     Polynomial…degree…2                       Polynomial…degree…4

                                20                                    20                                       20                           20

                                0                                     0                                        0                            0

                                    20                                    20                                       20                           20

 Stellar…template…set…2 Stellar…template…set…3                                        Quasar…B                                Quasar…C

                                20                                    20                                       20                           20

                                0                                     0                                        0                            0

                                    20                                    20                                       20                           20

Fig. 6. Absolute difference in km s−1 between the extracted velocity dispersion from two setups that differ by one setting. The baseline setup has
the range: 3400–4300 Å, polynomial degree: 3, stellar template set 1, and quasar template from image A. The different setting for each case is
specified at the top of each panel.

 Range:…33504250…Å                       Range:…33004200…Å                     Polynomial…degree…2                       Polynomial…degree…4

                                 2                                    2                                          2                             2

                                 0                                    0                                          0                             0

                                     2                                    2                                          2                             2

 Stellar…template…set…2 Stellar…template…set…3                                         Quasar…B                                 Quasar…C

                                 2                                    2                                          2                             2

                                 0                                    0                                          0                             0

                                     2                                    2                                          2                             2

Fig. 7. Same as Figure 6, but the difference is normalized by the statistical uncertainty of the baseline setup.

limθ→∞ κmodel
          0
               = 0 and limθ→∞ κmodel = 0 are satisfied. How-                   where θs  θE is a scale radius where the variable mass-sheet
ever, for Equation (11) to be an approximate MST, the variable                 smoothly transitions from 1 − λint to 0. This approximate MST
mass-sheet needs to satisfy κs (θ) ' 1 within the central region               converges to the pure MST in the limit θs → ∞. Thus, the actual
that lensing observables are sensitive to (θ . 2θE , Schneider &               mass distribution of the central deflector(s) relates to the mod-
Sluse 2013). This can be achieved with the formulation (Blum                   eled mass distribution as
et al. 2020)
                                                                               κgal ≈ (1 − κext ) [λint κmodel + (1 − λint )κs (θ)] .        (13)
                θs2                                                                The external convergence κext can be estimated by using rel-
κs (θ) =                    ,                                       (12)       ative number counts of line-of-sight galaxies near the central de-
           θ2   +     θs2
Article number, page 8 of 21
A. J. Shajib et al.: H0 from spatially resolved kinematics of RXJ1131−1231

                                                                    h                    i
                                                                                   2                                             1
                                                                                                                                                                       los /
                                                                        diag(          )
                                                                                                                         los …(km…s)
                                                               xy                  stat xy
                  Fractional…systematic…covariance:…2                                                                                                                          los
                                                                         stat, x

                                                                                                                                                    20                                2
              5
                                                                                                    0
                                                                                              10
             10                                                                                                                                     0                                 0

             15
Bin…number

                                                                                                                                                        20                                2
             20
                                                                                              0
                                                                                                              Fig. 10. Absolute (left) and uncertainty-normalized (right) difference in
             25                                                                                               the extracted velocity dispersion between two Voronoi binning schemes.
                                                                                                              The two binning schemes are obtained by setting the target S /N to 23
                                                                                                              Å−1 and 28 Å−1 for each bin. We take the case with target S /N ≈ 23 Å−1
             30                                                                                               for each bin as the baseline in our analysis.
                                                                                                        0
             35                                                                                   10
                                                                                                              flector(s) (e.g., Suyu et al. 2010; Greene et al. 2013; Rusu et al.
             40                                                                                               2017; Buckley-Geer et al. 2020), or by using weak lensing of
                          10                20               30                       40                      distant galaxies by the line-of-sight mass distribution (e.g., Ti-
                                    Bin…number                                                                hhonova et al. 2018). The measured velocity dispersion then
                                                                                                              constrains the internal MST parameter λint (Birrer et al. 2020;
Fig. 8. Illustration of the systematic covariance relative to the statistical                                 Yıldırım et al. 2021).
covariance. Σ is the variance-covariance matrix of the Voronoi-binned
velocity dispersions (with target S /N ≈ 23 Å−1 for each bin), σstat,x is
the statistical uncertainty in bin number x from our fiducial setup, and                                      4.2. Dynamical modeling
diag(σstat ) is a diagonal matrix. Note that we assume no covariance in
the statistical uncertainty from each setup for kinematic measurement.       In this section, we describe the Jeans anisotropic multi-
Thus the off-diagonal terms in the variance-covariance matrix purely         Gaussian-expansion (JAM) framework to model our dynamical
represent the systematic covariance. Most diagonal terms are < 1 (with
a median of 0.47), showing that the systematic variances are subdom-
                                                                             observable, which is the spatially resolved stellar velocity dis-
inant to the statistical variances except for bins 29 and 31. These bins     persion measured in Section 3. The orbital motions of the stars,
are close to images A and C. Thus, they are largely susceptible to the       i.e., the distribution function f (x, v) of position x and velocity v,
choice of the quasar template, as seen in Figures 6 and 7.                   in the galactic potential Φ is described by the steady-state col-
                                                                             lisionless Boltzmann equation (Binney & Tremaine 1987, Eq.
   Velocity…dispersion                       Velocity…dispersion…uncertainty 4-13b)
                                300
                                                                                                  …(km…s
                                     los …(km…s

                                                                                                              3
                                                                                                       )
                                               )

                                                                                                                       ∂f   ∂Φ ∂ f
                                                                                                                                   !
                                                                                              1
                                             1

                                                                                                              X
                                275                                                          40                     vi    −          = 0.                                             (14)
                                                                                                              i=1
                                                                                                                       ∂xi ∂xi ∂vi
                                250
                                                                                                        los

                                                                                             20               We assume an axisymmetric case (i.e., ∂Φ/∂φ = ∂ f /∂φ = 0
                                225                                                                           with φ being the polar angle in the spherical coordinate sys-
                                                                                                              tem), a spherically aligned velocity ellipsoid, and the anisotropy
                                                                                                              for each Gaussian component in the multi-Gaussian expan-
             Mean…velocity                          Mean…velocity…uncertainty                                 sion (MGE; Emsellem et al. 1994; Cappellari 2002) to be spa-
                                50                                                                            tially constant. Slow rotators such as the deflector galaxy in
                                         vmean …(km…s

                                                                                              vmean …(km…s
                                                    )

                                                                                                          )

                                                                                                              RXJ1131−1231 are in general expected to be weakly triaxial or
                                      1

                                                                                                        1

                                                                                             30               oblate but never flat and instead quite close to spherical in their
                                0                                                                             central parts (e.g., Cappellari 2016). For this reason, we expect
                                                                                                              the spherical alignment of the velocity ellipsoid of jamsph (Cap-
                                                                                                              pellari 2020) to provide a better approximation to the galaxy dy-
                                                                                             20
                                    50                                                                        namics than the cylindrical alignment jamcyl solution (Cappellari
                                                                                                              2008). Then, the above equation gives two Jeans equations in
Fig. 9. Maps of extracted velocity dispersion (top row) and mean ve-                                          spherical coordinates (Jeans 1922; Bacon et al. 1983; de Zeeuw
locity (bottom row) in Voronoi bins along with the corresponding un-                                          et al. 1996; Cappellari 2020)
certainties (right column). The Voronoi binning was tuned to achieve
S /N ≈ 23 Å−1 for each bin. The illustrated maps (left column) corre-                                                           
spond to the average values after combining 81 model setups, and the                                                    ∂ ζhv2r i        (1 + β)ζhv2r i − ζhv2φ i          ∂Φ
uncertainty maps correspond to the square root of the diagonal of the                                                                +                              = −ζ      ,
variance-covariance matrices. A systematic velocity of 182 km s−1 was                                                     ∂r                    r                        ∂r         (15)
subtracted from the mean velocity map.                                                                                  ∂ ζhv2r i        (1 − β)ζhv2r i − ζhv2φ i        ∂Φ
                                                                                                              (1 − β)                +                              = −ζ    ,
                                                                                                                           ∂θ                     tan θ                  ∂θ
                                                                                                                                                               Article number, page 9 of 21
A&A proofs: manuscript no. ms

where the following notations are used                                      software program mgefit9 . jampy deprojects the MGE compo-
             Z                                                              nents into an oblate or prolate spheroid with an inclination angle
ζhv p vq i ≡   v p vq f d3 v,                                               i (Cappellari 2002). The deprojected 3D mass density provides
                                                                            the 3D potential Φ for the kinematic computation. We also take
                                                                     (16)   the MGE of the surface brightness I(x, y) for deprojection to 3D
               hv2 i
        β ≡ 1 − θ2 .                                                        with the inclination angle i for the kinematic computation by
               hvr i                                                        jampy.
Here, β is the anisotropy parameter, and the velocity dispersion                 The combination of lens imaging observables and the stellar
ellipsoid is assumed to be spherically aligned, giving hvr vθ i = 0.        kinematics is sensitive to λint (1 − κext )Ds /Dds (Birrer et al. 2016;
     The line-of-sight second moment hv2los i is the integral given         Chen et al. 2021a). We apply a prior on κext using the estimated
by                                                                          κext distribution from Suyu et al. (2014) to help break the de-
                                                                            generacy in distributing the total MSD into external and internal
                   Z ∞                                                      components.
S hv2los i(x, y) =     dz ζhv2z i.                              (17)
                   −∞
                                                                            4.4. Bayesian framework
where S (x, y) is the surface density of the dynamical tracer.
Given that there is no evidence of significant ordered rotation             According to Bayes’ theorem, the posterior of the model param-
and the only significantly nonzero velocities are likely due to             eters Ξ = {ξmass , ξlight , Dmodel
                                                                                                         ∆t    , i, κext , λint , Dd , β} as
systematic errors (see Figure 9), we assume hvlos i = 0 and define
hv2los i = σ2los . The observed line-of-sight velocity dispersion is         p(Ξ | D) ∝ p(D | Ξ) p(Ξ),                                        (21)
given a luminosity-weighted integral as
                                                                            where p(D | Ξ) is the likelihood given data D and p(Ξ) is the
                             R
                                  dxdy Ihv2los i ⊗ PSF                      prior. In this study, the data D is the measured velocity disper-
h       i     h        i       ap                                           sions in Voronoi bins (Figure 9). The observational information
  σlos
    2
             = hvlos i
                   2
                            = R                        ,        (18)
         obs            obs          dxdy I  ⊗  PSF                         from the published time delays, lens models using HST imag-
                                  ap
                                                                            ing, and the line-of-sight effects (Tewes et al. 2013; Suyu et al.
where the symbol “⊗ PSF” denotes a convolution with the PSF.                2013, 2014) is incorporated by adopting those previous posteri-
In the equation above, we have chosen the surface brightness                ors as the prior on our model parameters. The likelihood of the
profile I(x, y) as a substitute for the surface density S (x, y) of         observed velocity dispersion vector σlos ≡ [σ1 , . . . , σNbin ], with
the dynamical tracer since the constant factor between surface              Nbin being the number of Voronoi bins, is given by
brightness and surface number density cancels out in this ex-                                  "                #
pression.                                                                                          1 T −1
                                                                            L(σlos | Ξ) ∝ exp − σlos Σ σlos ,                                 (22)
    We use the dynamical modeling software jampy8 to compute                                       2
the observed velocity dispersion by solving the Jeans equation
from Equation (15) for a given 3D potential Φ(r) and anisotropy             where Σ is the variance-covariance matrix. Specific priors used
profile β(r). Specifically, we use the jam_axi_proj() routine               in this Bayesian framework are given in Section 5. We ob-
with the keyword align=‘sph’. See Cappellari (2008, 2020)                   tain the posterior probability distribution function (PDF) of
for a detailed formalism in computing Equation (18) by jampy.               the model parameters using the Markov-chain Monte Carlo
                                                                            (MCMC) method using the affine-invariant ensemble sampler
                                                                            emcee (Goodman & Weare 2010; Foreman-Mackey et al. 2013).
4.3. Cosmological inference from combining dynamical and                    We ensure the MCMC chains’ convergence by running the
     lensing observables                                                    chains for &20 times the autocorrelation length after the chains
                                                                            have stabilized (Foreman-Mackey et al. 2013).
We parametrize the 3D potential Φ(r) using the lens model pa-
rameters ξmass and the internal MST parameter λint to conve-
niently use the lens model posterior from Suyu et al. (2013) as a           5. Dynamical models
mass model prior in the dynamical modeling. Thus from Equa-
tion (13), the surface mass density for our dynamical model is              We first describe our baseline dynamical model in Section 5.1
given by                                                                    and then perform various checks on systematics in Section 5.2.

Σ(θ) = Σcr (1 − κext ) [λint κmodel (θ) + (1 − λint )κs (θ)] .       (19)
                                                                            5.1. Baseline dynamical model
We include D∆t and Dd as free parameters in our model, which                This subsection describes the baseline settings in our dynamical
give the critical density Σcr as                                            model, namely the specific parametrization of the mass model
                                                                            (Section 5.1.1), the dynamical tracer profile (Section 5.1.2), the
         c2     D∆t       c2           Dmodel
                                         ∆t                                 probability of oblate or prolate axisymmetry (Section 5.1.3), the
Σcr =                   =                                 ,          (20)
        4πG (1 + zd )D2d 4πG (1 + zd )(1 − κext )λint D2d                   inclination angle (Section 5.1.4), and the choice of anisotropy
                                                                            profile (Section 5.1.5).
where Dmodel
        ∆t   is the time-delay distance predicted by the lens
mass model κmodel (θ) for the time delays observed by Tewes et al.
                                                                            5.1.1. Parametrization of the mass model
(2013).
   We approximate the surface mass density Σ(θ) with an MGE                 We adopt the power-law mass model as our baseline model. In
(Emsellem et al. 1994; Cappellari 2002; Shajib 2019) using the              this model, the mass profile is defined with Einstein radius θE ,
8                                                                            9
    https://pypi.org/project/jampy/                                              https://pypi.org/project/mgefit/

Article number, page 10 of 21
A. J. Shajib et al.: H0 from spatially resolved kinematics of RXJ1131−1231

logarithmic slope γ, projected axis ratio qm , and position angle      Table 1. Values of the light model parameters for the double Sérsic
ϕmass . The convergence profile κmodel in Equation (19) for the        model in our fitting of a large cutout and those from Suyu et al. (2013).
                                                                       The position angle ϕlight is defined as East of North.
power-law model is given by
                                                   γ−1                 Parameter              This analysis       Suyu et al. (2013)
                    3 − γ             θE
                                                 
 pl                                                                                            Sérsic profile 1
κmodel (θ1 , θ2 ) =
                                                      
                               q                          (23)
                      2                                                I0 (e−1 s−1 pixel−1 )    32.8 ±0.1              36.4 ±0.4
                                                         
                                   qm θ12 + θ22 /qm 
                                                                         θeff ( )
                                                                               00
                                                                                                2.437 ±0.005            2.49 ±0.01
Here, the coordinates (θ1 , θ2 ) are rotated by ϕmass from the (RA,      ns                      1.10 ±0.01             0.93 ±0.03
Dec) coordinate system. We adopt the lens model posterior from           ql                     0.865 ±0.001           0.878 ±0.004
Suyu et al. (2013) as a prior in our dynamical model. For sim-                                 Sérsic profile 2
plicity, we set the position angle ϕmass the same as the observed        I0 (e−1 s−1 pixel−1 )     441 ±7                 356 ±12
position angle of light ϕlight . We use the estimated κext distribu-
                                                                         θeff (00 )             0.300 ±0.003           0.362 ±0.009
tion for the power-law model as the prior (see Figure 3 of Suyu
et al. 2014).                                                            ns                      1.60 ±0.02             1.59 ±0.03
     We set θs = 1200 (' 7.5θE ) in the approximate mass-sheet κs        ql                     0.847 ±0.002           0.849 ±0.004
(Equation 12) so that the imaging constraints alone cannot dif-          ϕlight (◦ )             120.5 ±0.3             121.6 ±0.5
ferentiate the power-law mass profile and its approximate MST
from Equation (11). We obtain this lower limit by running the
jupyter notebook that produces Figure 3 of (Birrer et al. 2020).10     Following Suyu et al. (2013), we use the double Sérsic model to
However, we adjusted the fiducial lens model parameters in the         fit the light profile, which is a superposition of two concentric
notebook to match with those for RXJ1131−1231. We take a uni-          Sérsic profiles. The Sérsic profile is defined as
form prior for the internal MST parameter λint ∼ U(0.5, 1.13).                                                          1/n          
                                                                                                       q θ2 + θ2 /q  s
                                                                                                      q
The upper limit of 1.13 is set by the requirement that the trans-                             
                                                                                                             l 1
                                                                                                                                       
                                                                                                                     2 l
formed mass profile under the approximate MST must be mono-            I(θ1 , θ2 ) = I0 exp −bn 
                                                                                                                 
                                                                                                                              + bn  ,
                                                                                                                                         
                                                                                                                                             (24)
                                                                                                               θ
                                                                                                                          
tonic so that the MGE can approximate the transformed pro-                                               
                                                                                                                eff        
                                                                                                                                         
file sufficiently well (Shajib et al. 2019). Previous studies also
found similar or more restrictive upper limits for λint to satisfy     where I0 the amplitude, ql is the axis ratio, θeff is the effective
the physical requirement of non-negative density (Birrer et al.        radius, n s is the Sérsic index, and bn = 1.999n − 0.327 is a nor-
2020; Yıldırım et al. 2021).                                           malizing factor so that θeff becomes the half-light radius (Sérsic
     The appropriate number of MGE components for the mass             1968). The coordinates (θ1 , θ2 ) are rotated by ϕlight from the (Ra,
or light profile is automatically chosen by jampy with a maxi-         Dec) coordinate system.
mum of 20 components. We check that the MGE approximates                   We first mask circular regions at the quasar image positions
the input mass or light profile very well (with a maximum 1%           due to slightly saturated pixels producing significant residuals in
deviation at < 1000 and maximum 10% deviation between 1000 –           the subtracted cutout (see Figure 11). We then iteratively mask
5000 ). These deviations from the density profile have an oscilla-     the other pixels with significant residuals above statistical expec-
tory pattern due to the MGE approximation’s nature, except near        tations to effectively perform an outlier rejection while preserv-
the end of the fitted ranges. Thus the deviation in the integrated     ing the shape of a Gaussian tail. For each iteration of this pro-
mass profile often averages out in the line-of-sight integration up    cess, we take a discrepancy threshold, which we decrease from
to a very large radius. We perform the MGE fitting up to 10000 .       5σ to 2σ with step size 0.5σ across these iterations. We then
Thus, the large mismatch between the MGE approximation and             randomly mask a subset of the pixels with residuals more than
the original profile occurs largely outside the integration limit      the discrepancy level at the given iteration such that the number
∼ 7000 . The chosen number of maximum Gaussian components              of remaining pixels with such high residuals is statistically ex-
is not a dominant source of numerical error. Setting this maxi-        pected. The final masked area after the iterations is illustrated in
mum number to a very high value, such as 100, shifts the com-          Figure 11. We tabulate the best-fit light model parameters in Ta-
puted velocity dispersion by only < 0.5% within the observed           ble 1 and compare them with those from Suyu et al. (2013). The
region, which is insignificant compared to the 1% numerical sta-       circularized half-light radius for our best-fit model is θeff = 100. 91,
bility targeted by jampy.                                              which is slightly larger than the value θeff = 100. 85 from Suyu
                                                                       et al. (2013) based on the same imaging data but from a smaller
5.1.2. Dynamical tracer profile                                        cutout (illustrated in 11). We then take the MGE of the fitted
                                                                       double Sérsic profile as the light distribution I(x, y) in our dy-
We update the light profile fitting for the lens galaxy from Suyu      namical modeling. We propagate the uncertainties and covari-
et al. (2013) using a larger HST image cutout than that therein,       ances from the light profile fitting into the dynamical modeling.
which did not contain the full extent of the lens galaxy’s light       To do that, we sample from the multivariate normal distribution
profile (see Figure 11). The lensed arcs and quasar images are         corresponding to all the light model parameters for each call of
first subtracted from the cutout using the prediction of the best-     the likelihood function within the MCMC process and then take
fit lens model from Suyu et al. (2013). We use the software pack-      the MGE of the light profile given the sampled parameters.
age lenstronomy11 to fit the residual light distribution attributed
to the lens galaxy (Birrer & Amara 2018; Birrer et al. 2021).
                                                                       5.1.3. Oblate or prolate shape of the axisymmetry
10
   https://github.com/TDCOSMO/hierarchy_analysis_2020_
public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/                  The oblateness, prolateness, or triaxiality of a slow rotator
MST_impact/MST_pl_cored.ipynb                                          galaxy can, in principle, be constrained from the kinematic mis-
11
   https://github.com/lenstronomy/lenstronomy                          alignment angle ∆ϕkin ≡ ϕkin − ϕlight . However, we do not
                                                                                                                 Article number, page 11 of 21
You can also read