Workshop: Federated Infrastructures und cloud computing: Organisation and Preparation of BMBF Proposal - DESY Indico


                                           Federated Infrastructures und cloud
                                           computing: Organisation and Preparation of
                                           BMBF Proposal
                                             4 September 2020

KAT                                                                               Andreas Haungs, KIT

KIT – The Research University in the Helmholtz Association                        
Initiative for a (national / global) Analysis & Data Center in Astroparticle Physics

    Astroparticle Physics = Understanding the Multi-Messenger / Dark Universe

    [Figure: research fields of astroparticle physics: large-scale cosmic structure (fields and objects),
    gravitational waves, nuclear astrophysics, ultra-high energy cosmic rays, Galactic cosmic rays,
    gamma astronomy, neutrino astronomy, neutrino, search for Dark Matter annihilation,
    search for Dark Matter scattering]

    Astroparticle physics needs an experiment-overarching (computing) platform!
    (there is no CERN or FAIR or ESO)

2                                                                                            A. Haungs, September 2020
Analysis and Data Center in Astroparticle Physics

     Data availability:
    All researchers of the individual experiments or facilities require quick and easy access to the relevant data.
     Analysis:
    Fast access to the generally distributed data from measurements and simulations is required. Corresponding
    computing capacities should also be available.
     Simulations and methods development:
    Researchers need an environment for simulations and the development of new methods (machine learning).
     Real-time analysis center:
    The multi-messenger ansatz requires a framework to develop and apply methods for joint data stream analysis.
     Open access:
    It is necessary to make the scientific data available also to the interested public: public data for public money!
     Education in data science:
    Not only data analysis itself, but also the efficient use of central data and computing infrastructures requires
    special training.
     Data archive:
    The valuable scientific data and metadata must be preserved and remain interpretable for later use (data
    preservation).
Status Infrastructures in Astroparticle Physics
      • (Co-use of) Institutional resources (partly WLCG resources)
      • GridKa: Tier1-centre in the world wide LHC Computing Grid (e.g. Auger@GridKa)
      • Experiment-oriented resources (e.g. CTA@DESY)
      • Co-use of facility infrastructures (e.g. IceCube at DESY)
      • Moderate use of HPC cluster (Gauß Alliance)

      Research Data Management:
      • KCDC: KASCADE Cosmic ray Data Centre (data access)
      • VISPA: to analyze data (Learning Deep Learning)
      • GAVO (German Astrophysical Virtual Observatory)
      • CERN Open Data Portal (not yet used by APP)

Federated Infrastructures for Astroparticle Physics (…in Germany)
    •   Starting position
             o   more and more complex experiments and research facilities
             o   rapidly increasing digitization levels and therefore growing data volumes of the instruments
             o   sophisticated simulation and data analyses
             o   demand for combining data from different facilities (Multi-Messenger APP)
     → considerably growing needs of the scientific community for an efficient Information and
    Communication Technology (ICT) infrastructure
    •   A scientific (ICT) infrastructure for data-intensive research requires
             o   large Storage, fast Network, high Computing Power
    •   The future computing model for Astroparticle Physics will have many similarities and synergies with
        the HEP (HL-LHC) activities
    •   Such an ICT infrastructure for Astroparticle Physics should be seen in the context of broader
        national research data infrastructures and at the international level, e.g. in the context of European
        cloud initiatives
    • Such a common virtual ICT infrastructure should be connected to experiment-specific infrastructures
      and should foster the inclusion of commercial resources.

                       → this requires a dedicated (federated) infrastructure in Germany
Assessment of the demand for federated resources in computing of APP:
     -   Requests projected onto the WLCG system: the German share of the computing requests of the ErUM-Pro
         projects, in addition to the use of institutional resources.
     -   Projected requests for 2028: factor ~8 for CPU, ~5 for disk, ~10 for tape and ~20 for GPU (mainly due to

     -   Theory: The current needs are met by federal or state-operated supercomputer centers (Jülich SC, Leibniz Center Munich, HLRN,
         etc.); not clear if this is possible for 2028.

    Request in 2021
                        Auger  IceCube  CTA*    ET  KATRIN  GERDA/LEGEND  DARWIN  Multi-Messenger  Theory    Sum
    CPU [CPU-years]       500      500   500     0     500           n/a       0              100       0   2100
    GPU [GPU-years]        40      200     0     0       0           n/a       0               50       0    290
    Disk [PB]             0.8        1   0.5     0     n/a           n/a       0              0.2       0    2.5
    Tape [PB]               3        0     0     0     n/a           n/a       0                0       0      3

    Projected for 2028
                        Auger  IceCube   CTA    ET  KATRIN  GERDA/LEGEND  DARWIN  Multi-Messenger  Theory    Sum
    CPU [CPU-years]       800     2000  1000  5000     600           n/a    2500             1000    1000  13900
    GPU [GPU-years]        70      400     0  5000     400           n/a      13              500     300   6670
    Disk [PB]             1.5        2     3     2     n/a           n/a     1.9                2     0.2   12.6
    Tape [PB]               5       10    10     0     n/a           n/a     1.1                4       0     30
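
    The growth factors implied by the two tables can be recomputed directly from their totals columns. The
    following is a minimal sketch, not part of the original assessment: the dictionary names and the helper
    growth_factors are hypothetical, and the numbers are copied from the totals as printed above.

```python
# Totals copied from the last column of the two tables above (2021 request, 2028 projection).
request_2021 = {
    "CPU [CPU-years]": 2100,
    "GPU [GPU-years]": 290,
    "Disk [PB]": 2.5,
    "Tape [PB]": 3,
}
projected_2028 = {
    "CPU [CPU-years]": 13900,
    "GPU [GPU-years]": 6670,
    "Disk [PB]": 12.6,
    "Tape [PB]": 30,
}


def growth_factors(base, projected):
    """Return projected/base for every resource that appears in both tables."""
    return {name: projected[name] / base[name] for name in base if name in projected}


factors = growth_factors(request_2021, projected_2028)
for name, factor in factors.items():
    print(f"{name:<16} x{factor:.1f}")
```

    With the totals as printed, the ratios come out at about 6.6 (CPU), 23 (GPU), 5 (disk) and 10 (tape); the
    factors quoted in the text are rounded estimates.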

Draft of a text as result of this assessment:
    The demand for computing resources for astroparticle physics in Germany will increase considerably in the coming years. In 2020,
    the computing for the German flagship experiments (Auger, CTA, IceCube, ET, KATRIN, GERDA/LEGEND, DARWIN, Multi-Messenger,
    Theory) is mainly carried out via institutional or experiment-specific resources or, as in the case of theory, via federated
    supercomputer resources, and only to a small extent via the German WLCG network. An estimate of the 2021 requirements for the
    German fair share of the computing of the international experiments resulted in a total of 2,000 CPU-years, 300 GPU-years,
    2.5 PB of disk space and 3 PB of tape capacity, which are already largely covered by the WLCG (Tier-1 and Tier-2). A projection
    to the year 2028 showed an increase in demand by a factor of about 8 in CPU-years, 20 in GPU-years, 5 in disk space and 10 in
    tape capacity.
