The SpECTRE Cauchy-characteristic evolution system for rapid, precise waveform extraction

Jordan Moxon,¹ Mark A. Scheel,¹ Saul A. Teukolsky,¹˒² Nils Deppe,¹ Nils Fischer,³ Francois Hébert,¹ Lawrence E. Kidder,² and William Throwe²

¹ Theoretical Astrophysics, Walter Burke Institute for Theoretical Physics, California Institute of Technology, Pasadena, CA 91125, USA.
² Cornell Center for Astrophysics and Planetary Science, Cornell University, Ithaca, New York 14853, USA.
³ Max Planck Institute for Gravitational Physics (Albert Einstein Institute), Am Mühlenberg 1, D-14476 Potsdam, Germany

arXiv:2110.08635v1 [gr-qc] 16 Oct 2021

We give full details regarding the new Cauchy-characteristic evolution (CCE) system in SpECTRE. The implementation is built to provide streamlined flexibility for either extracting waveforms during the process of a SpECTRE binary compact object simulation, or as a standalone module for extracting waveforms from worldtube data provided by another code base. Using our recently presented improved analytic formulation, the CCE system is free of pure-gauge logarithms that would spoil the spectral convergence of the scheme. It gracefully extracts all five Weyl scalars, in addition to the news and the strain. The SpECTRE CCE system makes significant improvements on previous implementations in modularity, ease of use, and speed of computation.

I. INTRODUCTION

Since the original gravitational wave detections by the LIGO-VIRGO collaborations [1, 2], sensitivities of ground-based detectors have continued to advance [3, 4]. A crucial requirement for the successful detection and parameter estimation of astrophysical gravitational-wave sources is the accurate modelling of potential gravitational wave signals. Gravitational wave modelling is required both to construct templates for extracting signals from instrumentation noise [5, 6] and for performing follow-up parameter estimation [7–11]. Currently, the precision of numerical relativity waveforms is sufficient to cause no significant bias in detections produced by the present generation of gravitational wave detectors [12].

As the technology of the current network of gravitational wave detectors (Advanced LIGO [13], VIRGO, and KAGRA [14]) continues to mature, next-generation ground-based interferometers (Cosmic Explorer [15] and Einstein Telescope [16]) are planned, and space-based gravitational wave detector projects (LISA [17], TianQin [18] and DECIGO [19]) move forward, the demand for high-precision waveform models for binary inspirals continues to grow. Recent investigations [12] have indicated that future ground-based gravitational wave detectors will have sufficient sensitivity that current numerical relativity waveforms are not precise enough to produce unbiased parameter recovery. Further, space-based gravitational wave detectors, such as LISA, will likely observe several sources simultaneously, and sufficiently precise modelling of each source will help make best use of the resulting data by improving the capability to distinguish overlapping signals.

An important ingredient to improved precision for numerical relativity waveforms is the refinement of waveform extraction methods. The process of waveform extraction refers to the calculation of the observable asymptotic waveform from a strong-field simulation of the Einstein field equations. Current strong-field numerical relativity simulation methods are ‘Cauchy’ methods [20–23]: initial data is generated for a desired configuration of the compact binary using an elliptic solve on a restricted region, and that spacelike hypersurface data is evolved in the timelike direction. One output of a Cauchy simulation is the metric and its derivatives as a function of time, evaluated on one or more spheres of finite distance from the binary, typically $\sim 100$–$1000M$ from the coalescence. Waveform extraction then uses the Cauchy worldtube metric and its derivatives to determine the observable asymptotic waveform that is directly applicable to data analysis efforts for gravitational wave interferometers.

The most widely used technique of waveform extraction is the method of extrapolation to large radii using several worldtubes of finite radius [24, 25]. For each waveform quantity of interest, such as the gravitational wave strain or one of the Weyl scalars, there is a clear power law asymptotic behavior in well-behaved gauges. The extrapolation method then fits for the leading behavior in $r^{-1}$ and obtains a reasonable approximation for the asymptotic waveform. The extrapolation method has been used to generate a great number of useful waveforms for gravitational wave data analysis [26–28]. However, the extrapolation method makes a number of simplifying assumptions regarding the choice of coordinates and behavior of the field equations far from the system that diminish the precision of the method.

In addition, there is good evidence [29] that there are large, low-frequency parts of gravitational waveforms (‘memory’ contributions) that are not well modeled by waveform extrapolation. These memory effects do not have significant impact on the frequency bands important for LIGO, but will likely be important for more sensitive detectors (such as the Einstein Telescope or Cosmic Explorer) or detectors sensitive to lower frequency bands (such as DECIGO or LISA).
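
As a rough illustration of the extrapolation technique described above (fitting finite-radius data in powers of $r^{-1}$ and reading off the limit $r \to \infty$), the following minimal sketch may help. It is not taken from any production extraction code: the function name `extrapolate_to_infinity`, the choice of a quadratic fit, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np

def extrapolate_to_infinity(radii, rh_values, order=2):
    """Fit r*h sampled at finite radii with a polynomial in 1/r.

    Returns the constant term of the fit, i.e. the value at 1/r = 0.
    (Hypothetical helper, for illustration only.)
    """
    radii = np.asarray(radii, dtype=float)
    # Rescale 1/r into (0, 1] so the least-squares fit is well conditioned.
    x = radii.min() / radii
    coeffs = np.polyfit(x, np.asarray(rh_values, dtype=float), order)
    # np.polyfit orders coefficients from highest power to lowest.
    return coeffs[-1]

# Synthetic data: an assumed asymptotic value plus 1/r and 1/r^2 corrections.
radii = np.array([100.0, 200.0, 400.0, 700.0, 1000.0])
true_asymptotic = 0.3
rh = true_asymptotic + 5.0 / radii - 120.0 / radii**2

estimate = extrapolate_to_infinity(radii, rh)
```

In this toy setting the data are exactly polynomial in $1/r$, so the fit recovers the asymptotic value to rounding error. Real finite-radius data carry gauge and coordinate effects that such a fit does not model, which is the limitation of extrapolation noted above.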
[Figure 1: schematic showing the Cauchy domain, the worldtube $\Gamma$, the CCE domain foliated by null hypersurfaces $\Sigma_u$ (labeled by retarded time $u$), and future null infinity $\mathscr{I}^+$.]

FIG. 1: A sketch of the Cauchy and Characteristic domains. The Cauchy system evolves Einstein’s equations on spacelike hypersurfaces, while the Characteristic system evolves Einstein’s equations on compactified null hypersurfaces $\Sigma_u$ that extend to $\mathscr{I}^+$. Boundary conditions for the Characteristic system are required on the worldtube $\Gamma$ and are provided there by the Cauchy system.

Cauchy-characteristic evolution¹ (CCE) [30–32] is an alternative waveform extraction method that uses metric data on a single worldtube $\Gamma$ to provide boundary conditions for a second full nonlinear field simulation along hypersurfaces generated by outgoing null geodesics. CCE avoids many of the assumptions made by other extraction methods, and instead computes the full solution to Einstein’s equations in a Bondi-Sachs coordinate system at $\mathscr{I}^+$, from which waveform quantities may be unambiguously derived. The CCE domain and salient hypersurfaces are illustrated in Fig. 1.

    ¹ The acronym CCE has also been used in the past to refer to “Cauchy-characteristic extraction”, which describes only the part of the computation moving from the Cauchy coordinates to a set of quantities that could separately be evolved on null characteristic curves. Most of our descriptions refer to the entire algorithm as a single part of the wave computation, so we refer to the combination of Cauchy-characteristic extraction and characteristic evolution as simply CCE.

There are two notable previous implementations of CCE. The original implementation, PITT Null [33, 34], is a part of the Einstein Toolkit, and demonstrated the feasibility of the CCE approach. Unfortunately, as it is a finite difference implementation, PITT Null struggles to achieve high precision and can be very costly to run [35]. The first spectral implementation of CCE is a module of the Spectral Einstein Code (SpEC). That implementation was first reported in [36], and has undergone a number of updates and refinements [37, 38], including recent work that assembled a number of valuable analytic tests that assisted in refining and optimizing the code [35].

In this paper, we present our new implementation of CCE in the SpECTRE [39] code base, which incorporates a number of improvements to the waveform extraction system. The SpECTRE CCE module implements a modified version of the evolution system in Bondi-Sachs coordinates [40] that is able to guarantee that no pure-gauge logarithms arise that spoil the spectral convergence of the scheme as the system evolves. Further, the SpECTRE CCE system is able to use formulation simplifications to implement the computation for all five Weyl scalars as suggested in [40]. We have also implemented numerical optimizations specific to the SpECTRE CCE system to ensure rapid and precise waveform extraction, and we have re-implemented and extended the collection of tests that was previously effective in testing and refining the SpEC implementation [35].

SpECTRE [39, 41] is a next-generation code base for which the aim is to construct scalable multi-physics simulations of astrophysical phenomena such as neutron star mergers, binary black hole coalescences, and core-collapse supernovae. It is the goal of the SpECTRE project to construct a highly precise astrophysical simulation framework that scales well to $\gtrsim 10^6$ cores. The core SpECTRE evolution system uses discontinuous Galerkin methods with a task-based parallelism model. The discontinuous Galerkin method has the ability to refine a domain by subdividing the computation into local calculations coupled by boundary fluxes. SpECTRE then uses the task-based parallelism framework, Charm++ [42–44], to schedule and run the resulting multitude of separate calculations, which ensures good scaling properties of the method.

The CCE system in SpECTRE enjoys some efficiency gain from sharing a common well-optimized infrastructure with the discontinuous Galerkin methods and makes modest use of the parallelization framework (see Sec. IV). However, the characteristic evolution itself is implemented as a single spectral domain that covers the entire asymptotic region from the worldtube $\Gamma$ out to $\mathscr{I}^+$. The smooth behavior of the metric away from the binary coalescence ensures exponential convergence of the monolithic spectral method. In principle, the CCE method could be applied to a subdivided asymptotic domain. However, the unusual features of the field equations for CCE (reviewed in Sec. II) would require special treatment to appropriately account for boundary information. Moreover, any subdivision of the angular direction would obscure the spherical shell geometry that permits efficient calculation of the angular degrees of freedom of the system via spin-weighted spherical harmonic (SWSH) methods.

It is important to note that the SpECTRE CCE module, like every part of SpECTRE, is a rapidly evolving open-source code base. The discussion in this paper represents as completely as possible the state of our efforts to optimize and refine the system at the time of publication. However, we will continue to make modifications and improvements, so we encourage the reader to explore the full code base at [45], and refer to the documentation at [46]. For up-to-date details on making use of the standalone SpECTRE CCE system, please see the documentation page [47].

We first describe the mathematical aspects of the evolution system, including the incorporation of formulation improvements from [40] in Sec. II. Next, we discuss some of the numerical methods that we have constructed for our new SpECTRE implementation to improve runtime and precision in Sec. III. We discuss how the SpECTRE CCE module fits into the wider task-based SpECTRE infrastructure in Sec. IV. Finally, we demonstrate the precision and accuracy of the code by applying the system to a collection of analytic test cases in Sec. V, and to a realistic use-case of extracting data from a binary black-hole evolution from SpEC in Sec. VI. We describe the major future improvements that we hope to make for the CCE system in Sec. VII.

II. THE EVOLUTION SYSTEM

The discussion of CCE and its numerical implementations relies closely on a number of coordinate systems. We use the following notation for coordinate variables and spacetime indices:

   • $x^{\alpha}$ : $\{u, r, \theta, \phi\}$ are generic Bondi-like coordinates. These are the coordinates determined by the first stage of local coordinate transformations at the worldtube first derived in [48].

   • $\hat{x}^{\hat{\alpha}}$ : $\{\hat{u}, \hat{r}, \hat{\theta}, \hat{\phi}\}$ are partially flat Bondi-like coordinates introduced in [40].

   • $\breve{x}^{\breve{\alpha}}$ : $\{\breve{u}, \breve{y}, \breve{\theta}, \breve{\phi}\}$ are numeric partially flat coordinates. These are the coordinates directly represented in the SpECTRE numeric implementation, and are related to the partially flat Bondi-like coordinates by

     $$\breve{u} = \hat{u}, \qquad \breve{y} = 1 - 2\hat{R}/\hat{r}, \qquad \breve{\theta} = \hat{\theta}, \qquad \breve{\phi} = \hat{\phi}, \tag{1a}$$

     where the worldtube hypersurface is determined by $\hat{r} = \hat{R}(\hat{u}, \hat{\theta}, \hat{\phi})$.

   • $\mathring{x}^{\mathring{\alpha}}$ : $\{\mathring{u}, \mathring{r}, \mathring{\theta}, \mathring{\phi}\}$ are the asymptotically flat ‘true’ Bondi-Sachs coordinates. These are the coordinates in which we’d like to determine the final waveform quantities.

We use Greek letters $\alpha, \beta, \gamma, \ldots$ to represent spacetime indices, uppercase roman letters $A, B, C, \ldots$ to represent spherical angular indices, and lowercase roman letters from the middle of the alphabet $i, j, k, \ldots$ to represent spatial indices.

When relevant, we similarly adorn the spin-weighted scalars and tensors that represent components of the metric to indicate the coordinates in which they are components of the Bondi-like metric. For instance, the $g_{\hat{r}\hat{u}}$ component of a partially flat Bondi-like metric is $-e^{2\hat{\beta}}$. Our notation conventions are consistent with our previous paper regarding the mathematics of the CCE system [40].

A. Spectral representation

The SpECTRE CCE system represents its null hypersurface data on the domain $I \times S^2$, where the real interval $I$ describes the domain $\breve{y} \in [-1, 1]$ for compactified radial coordinate

$$\breve{y} = 1 - \frac{2\hat{R}(\hat{u}, \hat{x}^{\hat{A}})}{\hat{r}}, \tag{2}$$

where $\hat{r}$ is the partially flat Bondi-like radial coordinate and $\hat{R}$ is the Bondi-like radius at the worldtube.

We use a pseudospectral representation for each physical variable on this domain, using Gauss-Lobatto points for the radial dependence, and libsharp [49, 50]-compatible collocation points for the angular dependence. The angular collocation points are chosen to be equiangular in the $\phi$ direction, and Gauss-Legendre points in $\cos\theta$.²

    ² It is of some numerical convenience that there are no points at the poles, where spherical polar coordinates are singular. However, care must still be taken to avoid unnecessary factors of $\sin\theta$ in quantities like derivative operators, as they give rise to greater numerical errors when points are merely close to the pole.

The choice of Gauss-Lobatto points for the radial dependence simplifies the CCE algorithm because it is convenient to specify boundary conditions for the radial integrals as simple boundary values.

The choice of angular collocation points enables fast SWSH transforms, so that libsharp routines can efficiently provide the angular harmonic coefficients ${}_s a_{\ell m}(\breve{y})$ for an arbitrary function $f(\breve{y}, \breve{\theta}, \breve{\phi})$ of spin weight $s$, defined by

$$f(\breve{y}, \breve{\theta}, \breve{\phi}) = \sum_{\ell m} {}_s a_{\ell m}(\breve{y})\, {}_s Y_{\ell m}(\breve{\theta}, \breve{\phi}). \tag{3}$$

Here ${}_s Y_{\ell m}(\breve{\theta}, \breve{\phi})$ are the SWSHs as defined in Eq. (C1). We then perform all angular calculus operations using the spin-weighted derivative operators $\breve{\eth}$ and $\bar{\breve{\eth}}$.
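
To make the radial part of this representation concrete, the sketch below builds a Gauss-Lobatto grid in the compactified coordinate of Eq. (2) and maps it back to radius. It is only an illustration under stated assumptions: the text does not say which Gauss-Lobatto family is used, so Legendre-Gauss-Lobatto nodes are assumed here, and the helper names `gauss_lobatto_points`, `compactify`, and `decompactify` are hypothetical rather than SpECTRE functions.

```python
import numpy as np
from numpy.polynomial import legendre

def gauss_lobatto_points(n):
    """Return n Legendre-Gauss-Lobatto nodes on [-1, 1] (assumed family).

    The nodes are the two endpoints plus the roots of P'_{n-1}.
    """
    interior = legendre.Legendre.basis(n - 1).deriv().roots()
    return np.concatenate(([-1.0], np.sort(interior.real), [1.0]))

def compactify(r, worldtube_radius):
    """y = 1 - 2R/r, as in Eq. (2), with R the worldtube radius."""
    return 1.0 - 2.0 * worldtube_radius / r

def decompactify(y, worldtube_radius):
    """Inverse map r = 2R/(1 - y); y = 1 corresponds to r = infinity."""
    with np.errstate(divide="ignore"):
        return 2.0 * worldtube_radius / (1.0 - y)

y = gauss_lobatto_points(8)
r = decompactify(y, worldtube_radius=100.0)
```

Because the grid includes both endpoints, the first node sits exactly on the worldtube ($\breve{y} = -1$, $\hat{r} = \hat{R}$) and the last sits at null infinity ($\breve{y} = 1$), which is what allows boundary conditions for the radial integrals to be imposed as simple boundary values.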

We use an angular dyad $\breve{q}^{\breve{A}}$:

$$\breve{q}^{\breve{A}} = \left(-1, \frac{-i}{\sin\breve{\theta}}\right). \tag{4}$$

Then, for any spin-weighted scalar quantity $\breve{v} = \breve{q}_1^{\breve{A}_1} \cdots \breve{q}_n^{\breve{A}_n} \breve{v}_{\breve{A}_1 \ldots \breve{A}_n}$, where each $\breve{q}_i$ may be either $\breve{q}$ or $\bar{\breve{q}}$, we define the spin-weighted derivative operators

$$\breve{\eth}\breve{v} = \breve{q}_1^{\breve{A}_1} \cdots \breve{q}_n^{\breve{A}_n}\, \breve{q}^{\breve{B}} \breve{D}_{\breve{B}} \breve{v}_{\breve{A}_1 \ldots \breve{A}_n}, \tag{5a}$$

$$\bar{\breve{\eth}}\breve{v} = \breve{q}_1^{\breve{A}_1} \cdots \breve{q}_n^{\breve{A}_n}\, \bar{\breve{q}}^{\breve{B}} \breve{D}_{\breve{B}} \breve{v}_{\breve{A}_1 \ldots \breve{A}_n}, \tag{5b}$$

where $\breve{D}_{\breve{A}}$ is the angular covariant derivative. All angular derivatives may be expressed in a combination of $\breve{\eth}$ and $\bar{\breve{\eth}}$ operators. We perform angular differentiation of an arbitrary function $f(\breve{y}, \breve{\theta}, \breve{\phi})$ of spin weight $s$ by transforming to SWSH modes on each concentric spherical slice of the domain represented by ${}_s a_{\ell m}(\breve{y})$, then applying the diagonal modal multipliers

$$\breve{\eth} f(\breve{y}, \breve{\theta}, \breve{\phi}) = \sum_{\ell m} \sqrt{(\ell - s)(\ell + s + 1)}\; {}_s a_{\ell m}(\breve{y})\; {}_{s+1} Y_{\ell m}(\breve{\theta}, \breve{\phi}), \tag{6a}$$

$$\bar{\breve{\eth}} f(\breve{y}, \breve{\theta}, \breve{\phi}) = \sum_{\ell m} -\sqrt{(\ell + s)(\ell - s + 1)}\; {}_s a_{\ell m}(\breve{y})\; {}_{s-1} Y_{\ell m}(\breve{\theta}, \breve{\phi}), \tag{6b}$$

and then performing an inverse transform.

In addition, it is occasionally valuable to apply the inverse of the angular derivative operators $\breve{\eth}$ and $\bar{\breve{\eth}}$. This can be performed by applying the inverse of the multiplicative factors in the modal representation (6), and is approximately as efficient to compute as the derivative.

B. Hierarchical evolution system

For evolution in the characteristic domain (see Fig. 1), we solve the Einstein field equations for the spin-weighted scalars that appear in the Bondi-Sachs form of the metric:

$$ds^2 = -\left(e^{2\beta}\frac{V}{r} - r^2 h_{AB} U^A U^B\right) du^2 - 2 e^{2\beta}\, du\, dr - 2 r^2 h_{AB} U^B\, du\, dx^A + r^2 h_{AB}\, dx^A dx^B. \tag{7}$$

The spin-weighted scalars that are used in the evolution system are then $J$, $\beta$, $Q$, $U$, $W$, and $H$, where

$$U \equiv U^A q_A, \tag{8a}$$

$$Q \equiv r^2 e^{-2\beta} q^A h_{AB}\, \partial_r U^B, \tag{8b}$$

$$r^2 W \equiv V - r, \tag{8c}$$

$$J \equiv \frac{1}{2} q^A q^B h_{AB}, \tag{8d}$$

$$K \equiv \frac{1}{2} q^A \bar{q}^B h_{AB}. \tag{8e}$$

In a Bondi-like metric, surfaces of constant $u$ are generated by outgoing null geodesics. The Bondi-Sachs metric further imposes asymptotic conditions on each component of the metric that we will not impose for all of our coordinate systems. The same form (7) holds in any Bondi-like coordinates, including the partially flat Bondi-like coordinates $\hat{x}^{\hat{\alpha}}$ and true Bondi-Sachs coordinates $\mathring{x}^{\mathring{\alpha}}$.

It is important to note that for numerical implementations, the system is usually not evolved in a true Bondi-Sachs coordinate system. For convenience of numerical calculation, most CCE implementations enforce gauge choices only at the worldtube boundary, and therefore do not ensure asymptotic flatness. The SpECTRE CCE implementation employs a somewhat different tactic, as the generic Bondi-like gauge is vulnerable to pure-gauge logarithmic dependence that spoils spectral convergence. Instead, we use the partially flat gauge introduced in [40], which ensures that the evolved coordinates are in the asymptotically inertial angular coordinates, while keeping the time coordinate choice fixed by the arbitrary Cauchy time coordinate.

In the Bondi-like coordinates, it is possible to choose a subset of the Einstein field equations that entirely determine the scalars $\{J, \beta, U, W\}$ and that form a computationally elegant, hierarchical set of differential equations. Represented in terms of the numerical Bondi-like coordinates $\{\breve{u}, \breve{y}, \breve{\theta}, \breve{\phi}\}$, the hierarchical differential equations take the form

$$\partial_{\breve{y}} \breve{\beta} = S_{\breve{\beta}}(\breve{J}), \tag{9a}$$

$$\partial_{\breve{y}}\left((1 - \breve{y})^2 \breve{Q}\right) = S_{\breve{Q}}(\breve{J}, \breve{\beta}), \tag{9b}$$

$$\partial_{\breve{y}} \breve{U} = S_{\breve{U}}(\breve{J}, \breve{\beta}, \breve{Q}), \tag{9c}$$

$$\partial_{\breve{y}}\left((1 - \breve{y})^2 \breve{W}\right) = S_{\breve{W}}(\breve{J}, \breve{\beta}, \breve{Q}, \breve{U}), \tag{9d}$$

$$\partial_{\breve{y}}\left((1 - \breve{y}) \breve{H}\right) + L_{\breve{H}}(\breve{J}, \breve{\beta}, \breve{Q}, \breve{U}, \breve{W})\, \breve{H} + L_{\bar{\breve{H}}}(\breve{J}, \breve{\beta}, \breve{Q}, \breve{U}, \breve{W})\, \bar{\breve{H}} = S_{\breve{H}}(\breve{J}, \breve{\beta}, \breve{Q}, \breve{U}, \breve{W}), \tag{9e}$$

$$\partial_{\breve{u}} \breve{J} = \breve{H}. \tag{9f}$$

The detailed definitions for the source functions $\breve{S}(\ldots)$ and the factors $L_{\breve{H}}$ in (9) can be found in Sec. IV of [40]. We emphasize that the only time derivative appearing in the core evolution system (9) is that of $\breve{J}$ (9f), so we have only the single complex field to evolve and all of the other equations are radial constraints within each null hypersurface.

The SpECTRE CCE system requires input data specified on two hypersurfaces: the worldtube $\Gamma$ and the initial hypersurface $\Sigma_{\breve{u}_0}$ (see Fig. 1). The worldtube surface data must provide sufficient information to set the boundary values for each of the radial differential equations in (9). Namely, we must specify $\breve{\beta}$, $\breve{U}$, $\breve{Q}$, $\breve{W}$, and $\breve{H}$ at the worldtube (see Sec. II C below). The worldtube data is typically specified by determining the full spacetime metric on a surface of constant coordinate radius in a Cauchy code, then performing multiple gauge transformations to adapt the boundary data to the appropriate

transformations for these scalars depend only
partially flat Bondi-like gauge.                                                 on angular Jacobians ∂Ă xB , and are described
   The initial hypersurface data requires specification                          in Sec. II D.
                                 ˘ In contrast to Cauchy
only of the single evolved field J.                                          (b) Evaluate the hypersurface equation for the
approaches to the Einstein field equations, the initial                          spin-weighted scalar I˘ using the radial inte-
data for CCE does not have a collection of constraints                           gration methods described in Sec. III B.
that form an elliptic differential equation. Instead, J˘
may be arbitrarily specified on the initial data surface,                3. Determine the time derivative of the angular coor-
constrained only by asymptotic flatness conditions. The                     dinates ∂ŭ xA (x̆) (see Sec. II D) using the asymp-
choice of “correct” initial data to best match the physi-                   totic value of U.
cal history of an inspiral system, however, remains very
                                                                         4. Transform U to the partially flat gauge Ŭ by sub-
difficult. We discuss our current heuristic methods for
                                                                            tracting its asymptotic value U0 ≡ U|I + .
fixing the initial hypersurface data in Sec. II E.
                                                                         5. For each spin weighted scalar I in {W, H}:
             C.    Gauge-corrected control flow                              (a) Transform I to partially flat gauge I˘ via the
                                                                                 angular coordinates xA (x̆Ă ) and their first
   The SpECTRE CCE system implements the partially                               derivatives ∂ŭ xA (x̆) – see Sec. II D.
flat gauge strategy discussed at length in [40]. The prac-                   (b) Evaluate the hypersurface equation for I.˘
tical impact of the method is that we must include the
evolved angular coordinates in the process of determin-                  6. For each output waveform                quantity     O    in
ing the Bondi-Sachs scalars for the radial hypersurface                     {h, N, Ψ4 , Ψ3 , Ψ2 , Ψ1 , Ψ0 }:
equations. Past implementations have performed the an-                       (a) Compute asymptotic value of O, and trans-
gular transformation at I + , which results in a simpler                         form to asymptotically inertial coordinate
algorithm, but also gives rise to undesirable pure-gauge
logarithmic dependence.                                                          time as described in App. B, using ů(x̆Ă ).
   In this discussion, we make use of the local Bondi-                   7. Step J˘ forward in time using ∂ŭ J˘ = H̆, step xA
Sachs-like coordinates x̂µ̂ on the worldtube that are de-                   using Eq. (12) below for ∂ŭ xA , and step ů using
termined by the standard procedure introduced in [30]                       Eq. (B1) below for ∂ŭ ů.
and reviewed in [35, 40]. This procedure obtains a unique
Bondi-Sachs-like coordinate system by generating a null               See Sec. III A for details regarding the calculation of the
hypersurface with geodesics outgoing with respect to the              angular Jacobian factors required for the gauge trans-
worldtube, and with time and angular coordinates chosen               formation and the practical methods used to evolve the
to match the Cauchy coordinates on the worldtube.                     angular coordinates.
   In the below discussion, we make use of an intermedi-
ate spin-weight 1 scalar
                                                                              D.    Worldtube data interpolation and
                           U = Ŭ + U0 ,                      (10)                        transformation
where U0 = U|I + is a radially-independent contribution
fixed by the worldtube boundary conditions. U obeys                     The collection of hypersurface equations (9) requires
the same radial differential equation as Ŭ , but possesses           data for each of the quantities {β̆, Q̆, Ŭ , W̆ , H̆} on a sin-
a constant asymptotic value that is used to determine the             gle spherical shell at each timestep. For β̆ and Ŭ , the
evolution of the angular coordinates.                                 worldtube data specifies the constant-in-y̆ part of the so-
   The computational procedure with the gauge transfor-               lution on the hypersurface, for Q̆ and W̆ , the worldtube
mation to partially flat coordinates is then:                         data fixes the ∝ (1 − y̆)2 part, and for H̆, the worldtube
     1. Perform the gauge transformation from the Cauchy              data fixes a combination of radial modes that includes
        gauge metric to the local Bondi-Sachs coordinates             the ∝ (1 − y̆) contribution.
        on the worldtube Γ, generated by geodesics with                 The worldtube data provided by a Cauchy simulation
        null vectors that are outgoing with respect to the            contains the spacetime metric, as well as its first radial
        worldtube surface.                                            and time derivatives. The procedure for transforming
                                                                      the data provided by the Cauchy evolution to boundary
     2. For each spin weighted scalar I in {β, Q, U }:                data for the hypersurface equations (9) is then, for each
         (a) Transform I to partially flat gauge I˘ (or U)            hypersurface time ŭ,
             via the angular coordinates xA (ŭ, x̆Ă ) 3 . All

                                                                        of the target collocation points in the source coordinate system.
                                                                        See Sec. III A for more details regarding our interpolation meth-
3   When performing spectral interpolation, we require the position     ods.

   1. Interpolate the worldtube data to the desired hypersurface time ŭ
   2. Perform the local transformation of the Cauchy worldtube metric and its derivatives to a Bondi-like gauge as described in [48]
   3. Perform angular transformation and interpolation from the generic Bondi-like gauge to the partially flat gauge used for the evolution quantities.

The worldtube data is usually generated by the Cauchy simulation at time steps that are suited to the strong-field calculations, but the characteristic system can usually take significantly larger time steps. Once the characteristic time stepping infrastructure has selected a desired time step, we interpolate the worldtube data at each angular collocation point to the target time for the next hypersurface. In SpECTRE, the interpolation is performed by selecting a number of time points as centered as possible on the target time, then performing a barycentric rational interpolation to the target time.

After the time interpolation of the worldtube data, we have the values of the spacetime metric and its radial and time derivatives on a single inner boundary of the CCE hypersurface of constant retarded time ŭ. We then compute the outgoing radial null vector $l^{\mu'}$ (denoting Cauchy coordinate quantities with a prime $'$), construct a radial null coordinate system using the affine parameter along null geodesics generated by $l^{\mu'}$, then normalize the radial coordinate to construct an areal radius r. Following these transformations, for which explicit formulas are given in [35, 40, 48], the spacetime metric g_αβ is of the form (7), but with no asymptotic flatness behavior imposed. During the transformation from the Cauchy coordinates to the Bondi-like coordinates, the angular and time coordinates remain fixed on the worldtube surface, so no alteration of the pseudospectral grid is necessary.

The final step for the worldtube computation is to perform a constant-in-r angular coordinate transformation to a set of angular coordinates x^A(x̆^Ă) for which the metric satisfies the asymptotic conditions:

$$
\begin{align}
\lim_{\breve y \to 1} \breve J &= 0, \tag{11a}\\
\lim_{\breve y \to 1} \breve U &= 0. \tag{11b}
\end{align}
$$

These conditions are satisfied if the angular coordinates obey the radially-independent evolution equation [40]

$$
\partial_{\breve u} x^A = -\mathcal{U}_0^{\breve A}\, \partial_{\breve A} x^A, \tag{12}
$$

where $\mathcal{U}_0^{\breve A} \breve q_{\breve A} \equiv \mathcal{U}_0$.

The angular transformations for the remaining spin-weighted scalars require the spin-weighted angular Jacobian factors

$$
\begin{align}
\breve a &= \breve q^{\breve A}\, \partial_{\breve A} x^B\, q_B, \tag{13a}\\
\breve b &= \bar{\breve q}^{\breve A}\, \partial_{\breve A} x^B\, q_B, \tag{13b}
\end{align}
$$

(13c)

and conformal factor

$$
\begin{align}
\breve\omega &= \frac{1}{2}\sqrt{\breve b\, \bar{\breve b} - \breve a\, \bar{\breve a}}, \tag{14a}\\
\partial_{\breve u} \breve\omega &= \frac{\breve\omega}{4}\left(\bar{\breve\eth}\, \mathcal{U}_0 + \breve\eth\, \bar{\mathcal{U}}_0\right) + \frac{1}{2}\left(\mathcal{U}_0\, \bar{\breve\eth} \breve\omega + \bar{\mathcal{U}}_0\, \breve\eth \breve\omega\right). \tag{14b}
\end{align}
$$

Given the angular coordinates determined by the time evolution of (12), we perform interpolation of each of the spin-weighted scalars {R, ∂_u R, J, U, ∂_r U, β, Q, W, H} to the new angular collocation points (more details for the numerical interpolation procedure are in Sec. III A), and perform the transformation of the spin-weighted scalars as

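$$
\begin{align}
\breve R &= \breve\omega R, \tag{15a}\\
\partial_{\breve u} \breve R &= \breve\omega\, \partial_u R + \partial_{\breve u} \breve\omega + \frac{\breve\omega}{2}\left(\mathcal{U}_0\, \bar{\breve\eth} R + \bar{\mathcal{U}}_0\, \breve\eth R\right), \tag{15b}\\
\breve J &= \frac{1}{4\breve\omega^2}\left(\bar{\breve b}^2 J + \breve a^2 \bar J + 2 \breve a\, \bar{\breve b}\, K\right), \tag{15c}\\
e^{2\breve\beta} &= \frac{e^{2\beta}}{\breve\omega}, \tag{15d}\\
\partial_{\breve y} \breve U &= \frac{\breve R}{\breve\omega^3 (1 - \breve y)^2}\left(\bar{\breve b}\, \partial_r U - \breve c\, \partial_r \bar U\right) + 4 \breve R\, \frac{e^{2\breve\beta}}{\breve\omega}\left[\bar{\breve\eth}\breve\omega\, \partial_{\breve y} \breve J - \breve\eth\breve\omega \left(\frac{\partial_{\breve y}(\breve J \bar{\breve J})}{2\breve K}\right)\right] \notag\\
&\quad + 2 \breve R\, \frac{e^{2\breve\beta}}{\breve\omega}\left(\breve J\, \bar{\breve\eth}\breve\omega - \breve K\, \breve\eth\breve\omega\right)\left[-1 + \partial_{\breve y} \bar{\breve J}\, \partial_{\breve y} \breve J - \left(\frac{\partial_{\breve y}(\breve J \bar{\breve J})}{2\breve K}\right)^2\right], \tag{15e}\\
\breve Q &= 2 \breve R\, e^{-2\breve\beta}\left(\breve K\, \partial_{\breve y} \breve U + \breve J\, \partial_{\breve y} \bar{\breve U}\right), \tag{15f}\\
\mathcal{U} &= \frac{1}{2\breve\omega}\left(\bar{\breve b}\, U - \breve c\, \bar U\right) - \frac{e^{2\breve\beta}(1 - \breve y)}{2 \breve R \breve\omega}\left(\breve K\, \breve\eth\breve\omega - \breve J\, \bar{\breve\eth}\breve\omega\right), \tag{15g}\\
\breve U &= \mathcal{U} - \mathcal{U}_0, \tag{15h}\\
\breve W &= W + \frac{(\breve\omega - 1)(1 - \breve y)}{2\breve R} + \frac{e^{2\breve\beta}(1 - \breve y)}{4 \breve R \breve\omega^2}\left[\breve J (\bar{\breve\eth}\breve\omega)^2 + \bar{\breve J} (\breve\eth\breve\omega)^2 - 2\breve K (\breve\eth\breve\omega)(\bar{\breve\eth}\breve\omega)\right] - \frac{2\, \partial_{\breve u}\breve\omega}{\breve\omega} - \frac{\breve U\, \bar{\breve\eth}\breve\omega + \bar{\breve U}\, \breve\eth\breve\omega}{\breve\omega}, \tag{15i}\\
\breve H &= \frac{1}{2}\left[\mathcal{U}_0\, \bar{\breve\eth} \breve J + \breve\eth(\bar{\mathcal{U}}_0 \breve J) - \breve J\, \breve\eth \bar{\mathcal{U}}_0\right] + \frac{\partial_{\breve u}\breve\omega - \frac{1}{2}\left(\mathcal{U}_0\, \bar{\breve\eth}\breve\omega + \bar{\mathcal{U}}_0\, \breve\eth\breve\omega\right)}{\breve\omega}\left(2\breve J - 2\, \partial_{\breve y} \breve J\right) - \breve J\, \bar{\breve\eth}\, \mathcal{U}_0 + \breve K\, \breve\eth\, \bar{\mathcal{U}}_0 \notag\\
&\quad + \frac{1}{4\breve\omega^2}\left(\bar{\breve b}^2 H + \breve a^2 \bar H + \bar{\breve b}\, \breve c\, \frac{H \bar J + J \bar H}{K}\right) + 2\, \frac{\partial_{\breve u} \breve R}{\breve R}\, \partial_{\breve y} \breve J, \tag{15j}
\end{align}
$$

where $K = \sqrt{1 + J \bar J}$ and $\breve K = \sqrt{1 + \breve J \bar{\breve J}}$. Finally, the quantities {β̆, Q̆, 𝒰, W̆, H̆} are used directly to determine the integration constants in the hypersurface equations (9). Note that in all of the equations (15h) onward, we have explicit dependence on 𝒰_0 or implicit dependence on 𝒰_0 via ∂_ŭ ω̆. This dependence necessitates finishing the hypersurface integration of 𝒰 to determine its asymptotic value before computing the remaining gauge-transformed quantities on the worldtube.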
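As a rough illustration of the pointwise algebra in Eqs. (14a), (15a), (15c), and (15d), the following NumPy sketch applies those four formulas at each angular collocation point. It is not the SpECTRE implementation: the function names and argument layout are assumptions made for this example, the inputs are assumed to have already been interpolated to the new angular collocation points, and the Jacobian factors ă and b̆ are taken as given rather than computed from the evolved angular coordinates.

```python
import numpy as np


def conformal_factor(a, b):
    """Eq. (14a): omega = (1/2) * sqrt(b * conj(b) - a * conj(a))."""
    return 0.5 * np.sqrt(np.abs(b) ** 2 - np.abs(a) ** 2)


def transform_worldtube_scalars(R, J, beta, a, b):
    """Apply Eqs. (15a), (15c), (15d) pointwise (hypothetical helper).

    R, beta : real arrays on the angular grid
    J, a, b : complex arrays on the same grid
    Returns (R_breve, J_breve, exp(2 * beta_breve)).
    """
    omega = conformal_factor(a, b)
    K = np.sqrt(1.0 + np.abs(J) ** 2)  # K = sqrt(1 + J * conj(J))
    R_breve = omega * R  # Eq. (15a)
    J_breve = (
        np.conj(b) ** 2 * J + a ** 2 * np.conj(J) + 2.0 * a * np.conj(b) * K
    ) / (4.0 * omega ** 2)  # Eq. (15c)
    exp_2beta_breve = np.exp(2.0 * beta) / omega  # Eq. (15d)
    return R_breve, J_breve, exp_2beta_breve
```

With ă = 0 and b̆ = 2 these formulas give ω̆ = 1 and leave R, J, and e^{2β} unchanged, which is a convenient consistency check; a pure phase on b̆ only rotates the phase of J.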
E. Initial data

In addition to the specification of the worldtube data at the interface to the Cauchy simulation, the characteristic system requires initial data at the first outgoing null hypersurface in the evolution (see Fig. 1). The initial data problem on this hypersurface is physically similar to the initial data problem for the Cauchy evolution: It is computationally prohibitive to directly construct the spacetime metric in the state that it would possess during the inspiral. Ideally, we would like the starting state of the simulation to be simply a snapshot of the state if we had been simulating the system for far longer.

The initial data problem in CCE has been investigated previously by [51], in which a linearized solution scheme was considered. The most important part of the initial data specification appears to be choosing the first hypersurface such that it is consistent with the boundary data at the same timestep. Without that constraint, previous authors [51], and empirical tests of our own code, indicate that spurious oscillations emerge that often last the full duration of the simulation.

[Figure 2 plot, titled "SpECTRE CCE initial data transient": two panels (strain h and curvature Ψ0) showing the mode coefficients Re Y22, Im Y22, Re Y20, and Im Y20 against simulation time (M), from 0 to 1400.]

FIG. 2: The initial data transient for an example CCE run using worldtube data obtained from a binary black hole simulation SXS:BBH:2096 from the SXS catalog. The dominant modes of the strain and Ψ0 display visually apparent drift during the first ∼ 2 orbits of the inspiral. The initial data transient contaminates the data for the early part of the simulation and leads to a BMS frame shift in the strain waveform. The frame shift can be seen visually from the fact that the Y22 mode does not oscillate about 0. The initial data method used for this demonstration is the cubic ansatz initial data described as method 1 below.
Computationally, the initial data freedom in CCE is much simpler than the Cauchy case [52, 53]. We may specify the Bondi-Sachs transverse-traceless angular scalar J̆ arbitrarily. Even when we take the practical constraint that J̆ must be consistent with the worldtube data at the first timestep, we still have almost arbitrary freedom in the specification of J, as it must be consistent with the worldtube data only up to an arbitrary angular coordinate transformation.⁴

⁴ In our evolution system, we track and perform an angular coordinate transformation at the worldtube regardless of initial data choice, so permitting this transformation on the initial hypersurface amounts only to setting nontrivial initial data for x^A(x̂^Â).

Current methods of choosing initial data for J do not represent a snapshot of a much longer simulation, and this gives rise to transients in the resulting strain outputs (see Fig. 2). These initial data transients are analogous to ‘junk radiation’ frequently found in Cauchy simulations, but are somewhat more frustrating for data analysis because the CCE initial data transients tend to have comparatively long timescales. We observe that the strain waveform tends to settle to a suitable state within a few orbits of the start of the simulation. However, when recovering high-fidelity waveforms from an expensive Cauchy simulation, every orbit of trustworthy worldtube data is precious, and it is disappointing to lose those first orbits of data to the initial data transient. It is a topic of ongoing work to develop methods of efficiently generating high-quality initial data for CCE to improve the initial data transient behavior (see Sec. VII A).

We currently support three methods for generating initial hypersurface data:

   1. Keep J̆ and ∂_y̆ J̆ consistent with the first timestep of the worldtube data. Use those quantities to fix the angularly dependent coefficients A and B in the cubic initial hypersurface ansatz:

$$
\breve J(\breve y, \breve\theta, \breve\phi) = A(\breve\theta, \breve\phi)(1 - \breve y) + B(\breve\theta, \breve\phi)(1 - \breve y)^3. \tag{16}
$$

      This is a similar initial data construction to [51], and is chosen to omit any (1 − y̆)² dependence, which guarantees that no pure-gauge logarithmic terms arise during the evolution [40].

   2. Set the Newman-Penrose quantity Ψ0 = 0 on the initial hypersurface. This amounts to enforcing a second-order nonlinear ordinary differential equation in y ≡ 1 − 2R/r for J, before constructing the coordinate transformation from x^α to x̆^ᾰ. After some simplification, the expression for Ψ0 in [40] may be used to show that the equation

$$
\partial_y^2 J = \frac{1}{16 K^2}\left(\bar J^2 (\partial_y J)^2 - 2(2 + J \bar J)\, \partial_y J\, \partial_y \bar J + J^2 (\partial_y \bar J)^2\right) \times \left(-4J - (1 - y)\, \partial_y J\right) \tag{17}
$$

      is equivalent to the condition Ψ0 = 0. The initial hypersurface data is generated by first using (17) to perform a radial ODE integration out to ℐ⁺, with boundary values of J and ∂_y J on the initial worldtube. However, the data so generated is not necessarily asymptotically flat, so an angular coordinate transformation is calculated to fix J̆|ℐ⁺ = 0. Encouragingly, fixing both (17) and the asymptotic flatness condition also constrains the (1 − y)² part of J to vanish, which is sufficient to prevent the emergence of pure-gauge logarithmic dependence during the evolution of J.

   3. Set J̆ = 0 along the entire initial hypersurface. In general, this choice will be inconsistent with the data specified on the worldtube J|_Γ, so it is necessary to construct an angular transformation x(x̆^Ă) such that J̆|_Γ = 0 following the transformation.

Methods 2 and 3 above require the ability to compute the angular coordinate transformation x^A(x̂^B̂) such that

$$
0 = \breve J = \frac{\bar{\breve b}^2 \breve J + \breve a^2 \bar{\breve J} + 2 \breve a\, \bar{\breve b}\, \breve K}{4\breve\omega^2} \tag{18}
$$

on some surface. Solving (18) in general would amount to an expensive high-dimensional root-find.

However, in our present application, practical solutions in the wave zone typically have a value of J̆ no greater than ∼ 5 × 10⁻³, and we should not expect to find a well-behaved angular coordinate transform otherwise. So, we take advantage of the small parameter in the equation to iteratively construct candidate angular coordinate systems that approach the condition (18). Our linearized iteration is based on the approximation

$$
\begin{align}
\breve a_{n+1} &= -\frac{1}{2} \frac{\breve J_n\, \breve\omega_n}{\bar{\breve b}_n\, \breve K_n}, \tag{19a}\\
\breve x^i_{n+1}(\breve x) &= \frac{1}{2}\, \breve\eth^{-1}_{n+1}\left(\breve a_{n+1}\, \breve\eth \breve x^i + \bar{\breve b}_{n+1}\, \bar{\breve\eth} \breve x^i\right), \tag{19b}
\end{align}
$$

for a collection of Cartesian coordinates x̆^i that are representative of the angular coordinate transformation (see Sec. III A).

We find that this procedure typically approaches roundoff in ∼ 10³ iterations. Despite the crude inefficiency of this approximation, the iterative solve needs to be conducted only once, so it represents only a small portion of the CCE execution time for the initial data methods that take advantage of it.

In practical investigations, it has been found that most frequently the simplest method of an inverse cubic ansatz (1. above) performs best in various measures of asymptotic data quality [54]. However, because the reasons for the difference in precision for different initial data schemes are not currently well understood, we believe it useful to include descriptions of all viable methods.

III. IMPLEMENTATION DETAILS AND NUMERICAL OPTIMIZATIONS

Much of the good performance of the SpECTRE CCE system is inherited from the shared SpECTRE infrastructure. In particular, the SpECTRE data structures offer easy interfaces to aggregated allocations (which limit

expensive allocation of memory), fast vector operations         source collocation values are transformed to spectral co-
through the interface with the open source Blaze library        efficients a_{ℓ,m}. The Clenshaw algorithm can be applied
[55], and rapid SWSH transforms via the open source             directly at each of the target points (θ, φ), to obtain
libsharp library. Further, we take advantage of per-            the values f(θ, φ). Note that the step of caching the
core caching mechanisms to avoid recomputing common             α_{ℓ,m}(θ, φ) and β_{ℓ,m}(θ, φ) is primarily useful for interpo-
numerical constants, such as spectral weights and collo-        lating multiple functions to the same grid; if only one
cation values.                                                  function is needed for each grid, there will be little gain
   However, in addition to establishing ambitious “best         in caching α and β, as they would each be evaluated only
practices” for the mechanical details of the software de-       once in a given recurrence chain.
velopment, we have implemented numerical optimiza-                 In Appendix C, we give full details of the specific re-
tions specialized to calculations in the CCE system. We         currence relations that can be used to efficiently calculate
give a brief explanation of the techniques we use to im-        the Clenshaw sum for SWSH, as well as additional recur-
prove performance of angular interpolation in Sec. III A,       rence relations that improve performance when moving
which is required to perform the gauge transformation           between the m modes. For the remaining discussion it
discussed in Sec. II D. In Sec. III B, we explain our meth-     is convenient to define a few auxiliary variables that are
ods for efficiently performing the hypersurface integrals       used in the formulas for the SWSH recurrence:
in our chosen Legendre-Gauss-Lobatto pseudospectral
representation.                                                                  a = |s + m|                            (24a)
                                                                                 b = |s − m|                            (24b)
                                                                                     (
                                                                                       0,     s ≥ −m
     A.    Angular interpolation techniques using                                λ=                                     (24c)
           spin-weighted Clenshaw algorithm                                            s + m, s < −m

                                                                  The step-by-step procedure for efficiently interpolating
   The Clenshaw recurrence algorithm is a fast method           a spin-weighted function represented as a series of spin-
of computing the sum over basis functions,                      weighted spherical harmonic coefficients to a set of target
                              N
                                                                collocation points (θi , φi ) is then:
                              X
                    f (x) =         an φn (x),          (20)       1. Assemble the lookup table of required (α`
                                                                                                                     (a,b)
                                                                                                                             (θ),
                              n=0                                      (a,b)
                                                                      β` , λm ):
provided the set of basis functions φn obeys a standard                (a) For each m ∈ [−`max , `max ] there is a pair
form of a three-term recurrence relation common to many                    (a, b) from (24) to be computed. Note that
polynomial bases. In particular, it is assumed that φn                       (a,b)
                                                                           α`      must be cached separately for each tar-
may be written as,                                                                           (a,b)
                                                                           get point, but β`       does not depend on the
          φn (x) = αn (x)φn−1 (x) + βn (x)φn−2 (x),     (21)               target coordinates.

for some set of easily computed αn and βn .                        2. For m ∈ [0, `max ]:
  The algorithm for computing the full sum f (x) [56] is               (a) If |s| ≥ |m|: Determine s Y|s|,m (θ, φ) from
then to compute the set of quantities yn for n ≥ 1, where                  direct evaluation of (C1) with (C3) and
yn is                                                                      s Y|s|+1,m (θ, φ) from (C10); Store s Y|s|,m (θ, φ)
                                                                           for recursion if |s| = |m|.
            yN +2 (x) =yN +1 (x) = 0                   (22a)
                                                                       (b) If |m| > |s|: Determine s Y|m|,m (θ, φ) from re-
               yn (x) =αn+1 (x)yn+1 (x)                                    currence (C9) and s Y|m|+1,m (θ, φ) from (C10).
                       + βn+2 (x)yn+2 (x) + an         (22b)               Store s Y|m|,m (θ, φ) for recursion.
                                                                       (c) Perform the Clenshaw algorithm to sum over
Once the last two quantities in the chain y1 (x) and y2 (x)                l ∈ [min(|s|, |m|), `max ], using the spectral
are determined, the final sum is obtained from the for-                                                            (a,b)
                                                                           coefficients a`m , the precomputed α`          and
mula                                                                         (a,b)
                                                                           β`      recurrence coefficients, and the first two
  f (x) = β2 (x)φ0 (x)y2 (x) + φ1 (x)y1 (x) + a0 φ0 (x). (23)              harmonics in the sequence computed from the
                                                                           previous step.
  We use the Clenshaw method for interpolating SWSH
                                                                   3. For m ∈ [−1, −`max ], repeat the substeps of step 2,
data to arbitrary points x on the sphere.             For
                                                                      but for the negative set of m’s.
spherical harmonics, it is successive values of ` that
have convenient three-term recurrence relations, so the            Although the procedure for interpolation is performed
lowest modes in the recursion are Y|m|,m (θ, φ) and             efficiently, there are a number of details of the imple-
Y|m|+1,m (θ, φ). The values of α`,m (θ, φ) and β`,m (θ, φ)      mentation of the angular coordinate transformation that
are cached for the target interpolation points, and the         must be handled carefully.
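To make the recurrence (20)-(23) concrete, the following is a minimal sketch of the generic Clenshaw sum, applied to ordinary Legendre polynomials rather than to the spin-weighted harmonics used in the CCE system. The function names `clenshaw` and `legendre_clenshaw` are illustrative choices, not SpECTRE interfaces, and the Legendre coefficients α_n(x) = (2n − 1)x/n, β_n = −(n − 1)/n are used only because they give a basis that is easy to check against NumPy.

```python
import numpy as np
from numpy.polynomial import legendre as L


def clenshaw(a, alpha, beta, phi0, phi1, x):
    """Evaluate sum_n a[n] * phi_n(x) using Eqs. (22)-(23).

    alpha(n, x) and beta(n, x) define the three-term recurrence (21);
    phi0(x) and phi1(x) are the two lowest basis functions.
    """
    N = len(a) - 1
    y1 = 0.0  # holds y_{n+1} at the top of each iteration
    y2 = 0.0  # holds y_{n+2} at the top of each iteration
    for n in range(N, 0, -1):
        # Eq. (22b): y_n = alpha_{n+1} y_{n+1} + beta_{n+2} y_{n+2} + a_n
        y1, y2 = alpha(n + 1, x) * y1 + beta(n + 2, x) * y2 + a[n], y1
    # After the loop y1 = y_1 and y2 = y_2; Eq. (23) gives the sum.
    return beta(2, x) * phi0(x) * y2 + phi1(x) * y1 + a[0] * phi0(x)


def legendre_clenshaw(a, x):
    """Clenshaw sum for a Legendre series (illustrative basis choice)."""
    x = np.asarray(x, dtype=float)
    return clenshaw(
        a,
        alpha=lambda n, x: (2 * n - 1) / n * x,  # from Bonnet's recursion
        beta=lambda n, x: -(n - 1) / n,
        phi0=lambda x: np.ones_like(x),          # P_0(x) = 1
        phi1=lambda x: x,                        # P_1(x) = x
        x=x,
    )


if __name__ == "__main__":
    coeffs = np.array([0.5, -1.0, 2.0, 0.25])
    points = np.linspace(-1.0, 1.0, 5)
    # Compare against NumPy's own Legendre series evaluation.
    print(np.allclose(legendre_clenshaw(coeffs, points), L.legval(points, coeffs)))
```

In this sketch the recurrence coefficients are recomputed on every call; caching them per target point, as described above, only pays off when several functions are interpolated to the same set of points.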

[FIG. 3 diagram: panel "Source frame" with "field values at source collocation"; panel
"Target frame" with "field values at target collocation".]

FIG. 3: An illustration of the interpolation reasoning for pseudospectral methods. The input
to the interpolation is the field values at the collocation points in the source frame, and
we wish to determine the field values for the same function at the collocation points in the
target frame, which will be at non-collocation points in the source frame coordinates.
Therefore, the interpolation seeks to calculate the field value at points x(x̂) in the source
frame, for all collocation points x̂ in the target frame.

   First, it is important to note the counterintuitive nature of the set of coordinate
functions we require for the interpolation. In both the source frame and the target frame,
we use a pseudospectral grid, evenly spaced in φ, and at Legendre-Gauss points in θ. When
interpolating, we require the location in the source frame coordinates of the target frame
collocation points. Therefore, when expressed as a function over collocation points, the
function that we use for interpolation is x^A(x̂^Â). We have found this feature of the
interpolation for pseudospectral methods easy to misremember, so we have included Fig. 3 to
assist in recalling the correct reasoning.
   Most of the quantities that we wish to interpolate have nonzero spin-weight, so do not
transform as scalars. Instead, their transformation involves factors of the spin-weighted
angular Jacobians (13). The tensor transformations for each of the relevant quantities at
the worldtube boundary are given in (15). For illustration, let us discuss the
transformation of the spin-weight 2 scalar J̆:

\[ \breve{J} = \frac{\bar{\breve{b}}^2 J + \breve{a}^2 \bar{J} + 2 \breve{a} \bar{\breve{b}} K}{4 \breve{\omega}} \tag{25} \]

It is important to note that at the start of the transformation procedure, we have the
values of J on the source grid x^A and the values of ă, b̆, and ω̆ on the target grid x̆^Ă
(the Jacobians are derivatives of x(x̆); see Fig. 3).
   The spin-weighted interpolation procedure can be performed only on quantities that are
representable by the SWSH basis. We can store non-representable quantities (including, e.g.
the angular coordinates themselves) on our chosen angular grid, but we cannot perform a SWSH
transform on such quantities, so we cannot interpolate them using pseudospectral methods
with any predictable accuracy. Inconveniently, we are burdened with a number of quantities
that are not representable on the SWSH basis. Immediately after interpolation, J(x^A(x̆^Ă))
is not representable on the basis corresponding to the new grid because the Jacobian factors
have not yet been applied. Similarly, the Jacobian factors ă and b̆ are not representable on
the SWSH basis whenever the angular transform is not trivial.
   Accordingly, for our example of J̆, we must apply the transformation operations in a
specific sequence:

   1. Interpolate J(x^A) and K(x^A) to J(x^A(x̆^Ă)) and K(x^A(x̆^Ă)).
   2. Multiply the result by the Jacobian factors that appear in (25).

   We meet a similar complication when manipulating the evolved angular coordinates
x^A(ŭ, x̆^Ă). The angular coordinates are not representable on the SWSH basis, yet we must
take angular derivatives of the angular coordinates to determine the Jacobian factors (13).
The method we use to evade the problems for the angular coordinate representation is to
introduce a unit sphere Cartesian representation of the angular coordinates:

\[ x_{\rm unit} = \sin\theta \cos\phi, \tag{26a} \]
\[ y_{\rm unit} = \sin\theta \sin\phi, \tag{26b} \]
\[ z_{\rm unit} = \cos\theta. \tag{26c} \]

The evolution equation for the unit sphere Cartesian representation is then derived from the
angular coordinate evolution equation (12):

\[ \partial_{\breve{u}} x^i_{\rm unit} = U_0^{\breve{A}} \partial_{\breve{A}} x^i_{\rm unit}
   = \frac{1}{2} \left( U_0\, \bar{\breve{\eth}} x^i_{\rm unit} + \bar{U}_0\, \breve{\eth} x^i_{\rm unit} \right). \tag{27} \]

   The main advantage of promoting the angular coordinates x^A(ŭ, x̆^Ă) to their unit sphere
Cartesian analogs is that the Cartesian coordinates x^i are spin-weight 0 and so we can
quickly and accurately evaluate their angular derivatives.
   The spin-weighted Jacobian factors (13) are then calculated as

\[ \breve{a} = \breve{\eth} x^i\, \partial_i x^A q_A, \tag{28a} \]
\[ \breve{b} = \bar{\breve{\eth}} x^i\, \partial_i x^A q_A, \tag{28b} \]

where the factors ∂_i x^A are the Cartesian-to-angular Jacobians in the source frame, so are
analytically computed

as

\[ \partial_x \theta = \cos[\phi(\hat{x}^{\hat{A}})] \cos[\theta(\hat{x}^{\hat{A}})], \tag{29a} \]
\[ \partial_x \phi = -\sin[\phi(\hat{x}^{\hat{A}})] / \sin[\theta(\hat{x}^{\hat{A}})], \tag{29b} \]
\[ \partial_y \theta = \cos[\theta(\hat{x}^{\hat{A}})] \sin[\phi(\hat{x}^{\hat{A}})] \tag{29c} \]
\[ \partial_y \phi = \cos[\phi(\hat{x}^{\hat{A}})] / \sin[\theta(\hat{x}^{\hat{A}})], \tag{29d} \]
\[ \partial_z \theta = -\sin[\theta(\hat{x}^{\hat{A}})], \tag{29e} \]
\[ \partial_z \phi = 0. \tag{29f} \]

     B. Rapid linear algebra methods for radial integration

   SpECTRE CCE uses a Legendre Gauss-Lobatto spectral representation for the radial
dependence of the spin-weighted scalars on its domain. The use of spectral methods allows
rapid integration of the radial differential equations of the hierarchical CCE system (9).
The numerical methods we employ in this section are not themselves new, but they have not
previously been applied to efficiently solving the CCE system of equations.
   Each of the angular derivatives that appears in the hierarchy of radial differential
equations is first evaluated by the procedure described around Eq. (6): perform a
spin-weighted spherical harmonic transform using libsharp, multiply by √((ℓ − s)(ℓ + s + 1))
in the modal basis for ð̆ and by −√((ℓ + s)(ℓ − s + 1)) for ð̄̆, and recover the nodal
representation of the derivative with an inverse spin-weighted transform. Using these nodal
values of the angular derivative terms, we may then directly compute each of the right-hand
sides of the radial differential equations over the nodal grid. Therefore, for each of the
radial differential equations, the problem reduces to a collection of radial ODE solves.
   The spectral representation in the radial direction allows the further simplification of
determining linear operators that correspond to indefinite integration. Given the function f
expressed in the modal representation

\[ f(\breve{y}) = \sum_n a_n P_n(\breve{y}), \tag{30} \]

we seek the integration matrix I such that

\[ \begin{aligned}
   \sum_n a_n \int^{\breve{y}} P_n(\breve{y}) &= \sum_n (I \cdot a)_n P_n(\breve{y}), \\
   \implies \sum_n a_n P_n(\breve{y}) &= \sum_n (I \cdot a)_n \partial_{\breve{y}} P_n(\breve{y}).
   \end{aligned} \tag{31} \]

The relevant identity for Legendre polynomials that we use to determine the integration
matrix I is

\[ P_n(\breve{y}) = \frac{1}{2n + 1} \frac{d}{d\breve{y}} \left[ P_{n+1}(\breve{y}) - P_{n-1}(\breve{y}) \right]. \tag{32} \]

By integrating both sides of this equation and applying the result to the modal
representation (30), we find the almost-tridiagonal indefinite integration matrix for the
spectral representation

\[ I = \begin{pmatrix}
   -1 & 1 & -1 & 1 & \cdots & (-1)^{n+1} \\
   -1 & 0 & -1/3 & 0 & \cdots & 0 \\
   0 & 1 & 0 & -1/5 & \cdots & 0 \\
   \vdots & \vdots & \ddots & \ddots & \ddots & \vdots \\
   0 & 0 & \cdots & 1/(2n - 1) & 0 & -1/(2n + 3)
   \end{pmatrix}. \tag{33} \]

Here the first row is chosen to zero the function at the innermost gridpoint (at y̆ = −1). It
is convenient to generate linear operators acting entirely on the nodal representation.
These are composed as M⁻¹ I M, where M is the linear operator that maps the nodal
representation to the modal representation. We may then add an integration constant freely
to the result of the indefinite integration operator in the nodal representation to satisfy
the boundary conditions.
   Two of the five equations (those that determine β̆ and Ŭ) take the simple form

\[ \partial_{\breve{y}} f = S_f. \tag{34} \]

The radial ODE solves for these cases are a straightforward application of the nodal
integration matrix M⁻¹ I M using (33). In the CCE system, the choice to zero the value at the
innermost boundary point ensures that we may impose the boundary conditions for the
worldtube quantities β̆|_Γ and Ŭ|_Γ by adding the appropriate boundary value to all points
along the radial rays for each angular point on the boundary.
   Two more of the radial differential equations (those that determine Q̆ and W̆) take the
form

\[ (1 - \breve{y}) \partial_{\breve{y}} f + 2 f = S_f. \tag{35} \]

This case requires more care than the original indefinite integral, but the full integration
matrix is still readily calculable for arbitrary Legendre order n.
   Considering again the modal representation (30), we wish to find the linear operator K
such that

\[ \sum_n a_n P_n(\breve{y}) = \sum_n (K \cdot a)_n \left[ (1 - \breve{y}) \partial_{\breve{y}} P_n(\breve{y}) + 2 P_n(\breve{y}) \right]. \tag{36} \]

The operator K is the inverse of the operator in Eq. (35).
   We will again make use of the integration matrix I (33). We also require the inverse of
the matrix C associated with multiplication by (1 − y̆):

\[ \sum_n (C \cdot a)_n P_n(\breve{y}) = \sum_n a_n (1 - \breve{y}) P_n(\breve{y}). \tag{37} \]

The matrix C is derived by algebraic manipulations of Bonnet's recursion formula for
Legendre polynomials

\[ \begin{aligned}
   (n + 1) P_{n+1} &= (2n + 1)\, \breve{y} P_n - n P_{n-1} \\
   \Rightarrow\ (1 - \breve{y}) P_n &= -\frac{n + 1}{2n + 1} P_{n+1} + P_n - \frac{n}{2n + 1} P_{n-1}
   \end{aligned} \tag{38} \]
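As a numerical cross-check of identities (32) and (38), the sketch below assembles modal operators for indefinite integration and for multiplication by (1 − y̆), and also forms the combination I · (C + 2I)⁻¹ that the derivation arrives at in (40). It is a sketch under stated assumptions rather than the SpECTRE implementation: the coefficient vector is ordered n = 0, ..., N, the integral is taken from y̆ = −1 so that the result vanishes at the inner boundary, and the entries are derived directly from (32) under that convention rather than transcribed from (33). The function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as L


def integration_matrix(N):
    """Modal operator for F(y) = int_{-1}^{y} f, truncated at degree N.

    Built from Eq. (32); convention assumed here: F(-1) = 0.
    """
    I = np.zeros((N + 1, N + 1))
    I[0, 0] = 1.0  # int_{-1}^{y} P_0 = P_0 + P_1
    for k in range(N + 1):
        if k >= 1:
            I[k, k - 1] = 1.0 / (2 * k - 1)   # P_{k-1} feeds P_k
        if k + 1 <= N:
            I[k, k + 1] = -1.0 / (2 * k + 3)  # P_{k+1} feeds P_k
    return I


def one_minus_y_matrix(N):
    """Modal operator for multiplication by (1 - y), from Eq. (38)."""
    C = np.eye(N + 1)
    for n in range(N + 1):
        if n + 1 <= N:
            C[n + 1, n] = -(n + 1) / (2 * n + 1)
        if n >= 1:
            C[n - 1, n] = -n / (2 * n + 1)
    return C


def solve_operator(N):
    """K = I (C + 2I)^{-1}: inverts (1 - y) d/dy + 2 with f(-1) = 0."""
    I = integration_matrix(N)
    C = one_minus_y_matrix(N)
    return I @ np.linalg.inv(C + 2.0 * I)


if __name__ == "__main__":
    N = 8
    g = np.zeros(N + 1)
    g[:N] = np.arange(1.0, N + 1)          # f' with degree <= N - 1
    f = L.legint(g[:N], lbnd=-1)           # reference: f(-1) = 0, f' = g
    S = (g - L.legmulx(g[:N])) + 2.0 * f   # S = (1 - y) f' + 2 f
    print(np.allclose(integration_matrix(N) @ g, f))
    print(np.allclose(solve_operator(N) @ S, f))
```

Because the operators depend only on N, they can be built once and reused; the homogeneous solution proportional to (1 − y̆)² is not fixed by this construction and must be supplied separately by the boundary condition.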

Therefore, composing the operations of C and I, we find

\[ \sum_n ((C + 2I) \cdot a)_n P_n(\breve{y}) = \sum_n (I \cdot a)_n \left[ (1 - \breve{y}) \partial_{\breve{y}} P_n + 2 P_n \right] \tag{39} \]

and

\[ K = I \cdot (C + 2I)^{-1} \tag{40} \]

To compute K in practice, we determine the values of C and I analytically, then perform a
single numerical inversion to finish the computation of (40). Boundary conditions then
determine the quadratic part of the solution, so are imposed by adding the appropriate
b(θ̆, φ̆)(1 − y̆)² contribution along each radial ray.
   Importantly, for both of the above types of the radial ODE solve, the integration matrix
in question is independent of the values of the fields. So, at the start of the simulation,
we precompute and store the necessary integration matrices, reducing each of the ODE solves
described above to a matrix-vector multiplication for each radial ray. In SpECTRE, these
matrix-vector product calculations are optimized via the vector intrinsic library
libxsmm [57].
   The final type of radial differential equation appears only in the equation that
determines H. This type is more complicated:

\[ (1 - \breve{y}) \partial_{\breve{y}} f + [1 + (1 - \breve{y}) L_G L_J] f + (1 - \breve{y}) \bar{L}_G L_J \bar{f} = S, \tag{41} \]

in which the L factors depend on the field quantities of the current hypersurface. In this
case, there is little hope of determining an elegant simplification using the modal basis.
In any case, there would be no opportunity for caching and reusing an integration matrix, as
the differential operator that acts on f depends on the other fields on the hypersurface.
So, for the integration of the H equation, we decompose the complex linear differential
equation into a real linear equation on vectors of length 2n:

\[ \left[ \begin{pmatrix} (1 - \breve{y}) \partial_{\breve{y}} + 1 & 0 \\ 0 & (1 - \breve{y}) \partial_{\breve{y}} + 1 \end{pmatrix}
   + (1 - \breve{y}) \begin{pmatrix} \mathrm{Re}(L_J)\mathrm{Re}(L_G) & \mathrm{Re}(L_J)\mathrm{Im}(L_G) \\ \mathrm{Im}(L_J)\mathrm{Re}(L_G) & \mathrm{Im}(L_J)\mathrm{Im}(L_G) \end{pmatrix} \right]
   \begin{pmatrix} \mathrm{Re}(f) \\ \mathrm{Im}(f) \end{pmatrix}
   = \begin{pmatrix} \mathrm{Re}(S) \\ \mathrm{Im}(S) \end{pmatrix}, \tag{42} \]

where the multiplication by (1 − y̆) and differentiation ∂_y̆ are understood to represent
linear operators on the Legendre Gauss-Lobatto nodal representation. We then solve (42) by
numerically computing the linear operator along each radial ray and performing an aggregated
linear solve via LAPACK. Boundary conditions are imposed as usual by setting the first row
of the operands Re(S) and Im(S) to the desired boundary value before the operation, and
adjusting the first and (n + 1)-th rows of the linear operator to be equivalent to the first
and (n + 1)-th rows of the identity matrix.

                  IV. PARALLELIZATION AND MODULARITY

   Because of the dependence of the gauge transformation at the inner boundary on the field
values at ℐ⁺ needed to establish an asymptotically flat gauge, the opportunities for
subdividing the CCE domain for parallelization purposes are limited. However, we are able to
take advantage of the task-based parallelism in SpECTRE to: a) parallelize independent
portions of the CCE information flow, and b) efficiently parallelize the CCE calculation
with a simultaneously running Cauchy simulation.

                       A. Component construction

   In SpECTRE, we refer to the separate units of the simulation that may be executed in
parallel via task-based parallelism as components. For instance, in the near-field region in
which the domain can be parallelized among several subregions of the domain, each portion of
the domain is associated with a component.
   For SpECTRE CCE, we use three components (in addition to components that are used for
the Cauchy evolution): one component for the characteristic evolution, another component
dedicated to providing boundary data on the worldtube, and a third component for writing
results to disk.
   Much of the efficiency and precision of the SpECTRE CCE system comes from the ability to
cover the entire asymptotic domain from the worldtube Γ to ℐ⁺ with a single spectral domain.
In principle, there may be opportunity to parallelize multiple radial shells of the
computation, but in practice our initial assessments indicated that there would be little
gain for the typical gravitational wave extraction scenario. First, there is a significant
constraint that comes from the asymptotic flatness condition: the gauge transformation
throughout the domain on a given hypersurface depends on the asymptotic value U|_{ℐ⁺} on the
same hypersurface, which forces a significant portion of the computation to serial
execution. Additionally, we have seen very rapid convergence in the number of radial points
used for the CCE system, so it is unlikely that subdividing the domain radially would offer
much additional gain for the typical use case.
   Therefore, the entire characteristic evolution system is assigned to a single component,
and represents the computational core of the algorithm. The evolution component is
responsible for

   • The angular gauge transformation and interpolation (via Clenshaw recurrence)

   • The calculation of the right-hand sides of the set of hierarchical equations (9)

   • The integration of each of the radial ODEs

   • The time interpolation and preparation of waveform data

[FIG. 4 diagram. Box labels: "Interpolation/gauge to Bondi-Sachs; One of:" with the options
"Worldtube from disk" / "Boundary calculation from stored metric", "DG interpolation" /
"GH boundary (parallel w/ Cauchy)", and "Analytic boundary, G_{μν} = 0"; "Hypersurface solve
and Evolution"; "Write waveform to disk".]

FIG. 4: Components of the CCE task-based parallelism system. The worldtube component (left) is modular and can be switched
out according to the desired source of worldtube data. We currently support reading worldtube data from disk, interpolating
worldtube data from a simultaneously running Generalized Harmonic system in SpECTRE, or computing analytic boundary
data from a known solution or approximation to the Einstein field equations.
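The three-component layout of Fig. 4 can be pictured with a toy producer/consumer pipeline. The sketch below is only an analogy written with Python threads and queues; it is not SpECTRE's parallel runtime, every name in it is hypothetical, and the arithmetic inside each stage is a placeholder for the real boundary calculation and hypersurface solve. Its one purpose is to show a compute stage that performs no I/O of its own while a separate observer stage owns the output.

```python
import queue
import threading

SENTINEL = None  # marks the end of the data stream


def worldtube_component(times, out_q):
    """Stand-in data source: emits (time, boundary_value) pairs."""
    for t in times:
        out_q.put((t, 2.0 * t))  # placeholder for worldtube boundary data
    out_q.put(SENTINEL)


def evolution_component(in_q, out_q):
    """Stand-in for the characteristic evolution: computation only, no I/O."""
    while (item := in_q.get()) is not SENTINEL:
        t, boundary = item
        out_q.put((t, boundary + 1.0))  # placeholder for the hypersurface solve
    out_q.put(SENTINEL)


def observer_component(in_q, sink):
    """Stand-in observer: the only stage that touches the output sink."""
    while (item := in_q.get()) is not SENTINEL:
        sink.append(item)


def run_pipeline(times):
    """Run the three stages concurrently and return the observed output."""
    boundary_q, waveform_q = queue.Queue(), queue.Queue()
    sink = []  # in-memory substitute for a file on disk
    stages = [
        threading.Thread(target=worldtube_component, args=(times, boundary_q)),
        threading.Thread(target=evolution_component, args=(boundary_q, waveform_q)),
        threading.Thread(target=observer_component, args=(waveform_q, sink)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return sink


if __name__ == "__main__":
    print(run_pipeline([0.0, 0.5, 1.0]))
```

Swapping `worldtube_component` for another function with the same signature mirrors the modularity described in the caption: the downstream stages do not need to know where the boundary data came from.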

The core evolution component performs no reads from or writes to the filesystem, which
ensures that the expensive part of the computation will not waste time waiting for
potentially slow disk operations.
   The second component used in CCE is the worldtube component. A worldtube component is
responsible for:

   • Collecting the Cauchy worldtube metric and its derivatives from an assigned data source

   • Interpolating the data to time steps appropriate to the CCE evolution system

   • Performing the transformation to the Bondi-Sachs-like coordinate system on the
     worldtube

The user has a choice of several different worldtube components, each of which corresponds
to a different source of the metric quantities on the worldtube. Worldtube components are
available that:

   • Read worldtube data directly from disk

   • Accept interpolated data from a simultaneously running Cauchy execution in SpECTRE

   • Calculate worldtube data from an analytically determined metric on the boundary

Our methods for reading from disk are currently optimized for easily reading worldtube data
written by SpEC, but our worldtube module should accept data from any code that can produce
the spacetime metric and its first derivatives decomposed into spherical harmonic modes.
   Finally, there is a generic observer component that handles the output of the waveform
data to disk. When CCE is simultaneously running with a Cauchy evolution, there will be
additional components running in parallel with the CCE components, such as components that
perform the Cauchy evolution, components that search for apparent horizons, and components
that write simulation data to disk. The division of the CCE pipeline into parallel
components is illustrated in Fig. 4.

          B. Independently stepped interface with Cauchy simulation

   Because the Cauchy-characteristic evolution system does not have much opportunity to
parallelize internally, we need to ensure that its serial execution is optimized. Our goal
is that when running simultaneously with the highly parallel discontinuous Galerkin system
used for the Generalized Harmonic evolution, the CCE system does not impose any significant
runtime penalty.
   An important contribution to the efficiency of the CCE system is that the solutions to
the Einstein field equations are smooth and slowly varying in time. As a result, the
spectral methods used in CCE converge rapidly, and the scales that we seek to resolve with
the time-stepper are primarily on orbital timescales. Therefore, we anticipate that the CCE
system should be able to take far larger timesteps than the Generalized Harmonic system
running in concert, and it will be important for the overall efficiency of the extraction
pipeline to adjust the time steps of the CCE evolution independently of the time