The Pacific Research Platform: A High-Bandwidth, Global-Scale Private 'Cloud' Connected to Commercial Clouds
“The Pacific Research Platform: A High-Bandwidth, Global-Scale Private ‘Cloud’ Connected to Commercial Clouds”
Presentation to the UC Berkeley Cloud Computing MeetUp, May 26, 2020
Dr. Larry Smarr, Director, California Institute for Telecommunications and Information Technology
Harry E. Gruber Professor, Dept. of Computer Science and Engineering, Jacobs School of Engineering, UCSD
http://lsmarr.calit2.net
Before the PRP: ESnet’s Science DMZ Accelerates Science Research: DOE & NSF Partnering on Science Engagement and Technology Adoption
The Science DMZ, coined in 2010 by ESnet (DOE), is the basis of the PRP architecture and design. Its three elements are a friction-free network architecture, Data Transfer Nodes (DTN/FIONA) for performance, and perfSONAR monitoring.
The NSF Campus Cyberinfrastructure Program has made over 250 awards (2012-2018).
http://fasterdata.es.net/science-dmz/
Slide adapted from Inder Monga, ESnet
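For readers who want to reproduce the spirit of that perfSONAR-style monitoring between their own DTNs, the minimal sketch below runs a memory-to-memory iperf3 test and reports the achieved throughput. It assumes iperf3 is installed on both ends, that `iperf3 -s` is already listening on the remote host, and the hostname is a placeholder rather than an actual PRP endpoint.

```python
# Hypothetical throughput probe between two DTNs, in the spirit of the
# Science DMZ's perfSONAR monitoring. Assumes `iperf3 -s` is already
# running on the remote DTN; the hostname below is a placeholder.
import json
import subprocess

REMOTE_DTN = "dtn.example.edu"  # placeholder, not a real PRP endpoint

def measure_throughput(host: str, streams: int = 8, seconds: int = 10) -> float:
    """Run an iperf3 TCP test and return the received throughput in Gb/s."""
    result = subprocess.run(
        ["iperf3", "-c", host, "-P", str(streams), "-t", str(seconds), "-J"],
        capture_output=True, text=True, check=True,
    )
    report = json.loads(result.stdout)
    bits_per_second = report["end"]["sum_received"]["bits_per_second"]
    return bits_per_second / 1e9

if __name__ == "__main__":
    print(f"{measure_throughput(REMOTE_DTN):.1f} Gb/s to {REMOTE_DTN}")
```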
2015 Vision: The Pacific Research Platform Will Connect Science DMZs, Creating a Regional End-to-End Science-Driven Community Cyberinfrastructure
NSF CC*DNI Grant, $6.3M, 10/2015-10/2020 (now in Year 5)
PI: Larry Smarr, UC San Diego Calit2
Co-PIs:
• Camille Crittenden, UC Berkeley CITRIS
• Philip Papadopoulos, UCI
• Tom DeFanti, UC San Diego Calit2/QI
• Frank Wuerthwein, UCSD Physics and SDSC
Letters of Commitment from:
• 50 Researchers from 15 Campuses
• 32 IT/Network Organization Leaders
Source: John Hess, CENIC
PRP Links At-Risk Cultural Heritage and Archaeology Datasets at UCB, UCLA, UCM and UCSD with CAVEkiosks
48-Megapixel CAVEkiosks at the UCSD Library and the UCB CITRIS Tech Museum; a 24-Megapixel CAVEkiosk at the UCM Library
UC President Napolitano's Research Catalyst Award to UC San Diego (Tom Levy), UC Berkeley (Benjamin Porter), UC Merced (Nicola Lercari) and UCLA (Willeke Wendrich)
Terminating the Fiber Optics: Data Transfer Nodes (DTNs) as Flash I/O Network Appliances (FIONAs)
UCSD-designed FIONAs solved the disk-to-disk data transfer problem at near full speed on best-effort 10G, 40G and 100G networks.
Two FIONA DTNs at UC Santa Cruz: 40G & 100G
To add machine learning capability: up to 8 NVIDIA GPUs per 2U FIONA and up to 192 TB of rotating storage
FIONAs designed by UCSD’s Phil Papadopoulos, John Graham, Joe Keefe, and Tom DeFanti
2017-2020: NSF CHASE-CI Grant Adds a Machine Learning Layer Built on Top of the Pacific Research Platform
NSF grant for a high-speed “cloud” of 256 GPUs for 30 ML faculty and their students at 10 campuses for training AI algorithms on big data
Campuses: MSU, UCB, UCM, Stanford, UCSC, Caltech, UCI, UCR, UCSD, SDSU
2018-2021: Toward the National Research Platform (NRP): Using CENIC & Internet2 to Connect Quilt Regional R&E Networks
“Towards The NRP” 3-year grant funded by NSF ($2.5M, October 2018)
PI: Smarr (original PRP PI); Co-PIs: Altintas, Papadopoulos, Wuerthwein, Rosing, DeFanti
2018/2019: PRP Game Changer! Using Kubernetes to Orchestrate Containers Across the PRP
“Kubernetes is a way of stitching together a collection of machines into, basically, a big computer.” --Craig McLuckie, Google, now CEO and Founder of Heptio
“Everything at Google runs in a container.” --Joe Beda, Google
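The “one big computer” framing can be made concrete with the official Kubernetes Python client. The minimal sketch below assumes a valid kubeconfig for a cluster such as Nautilus and that GPUs are exposed through the standard `nvidia.com/gpu` device-plugin resource; it lists every node and totals the GPUs the cluster advertises.

```python
# A minimal sketch of treating a Kubernetes cluster as "one big computer":
# list every node and sum the GPUs it advertises. Assumes the official
# `kubernetes` Python client is installed and a valid kubeconfig is present.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
v1 = client.CoreV1Api()

total_gpus = 0
for node in v1.list_node().items:
    # Nodes running the NVIDIA device plugin advertise GPUs as "nvidia.com/gpu".
    gpus = int(node.status.capacity.get("nvidia.com/gpu", "0"))
    total_gpus += gpus
    print(f"{node.metadata.name}: {gpus} GPU(s)")

print(f"cluster total: {total_gpus} GPU(s)")
```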
PRP’s Nautilus Hypercluster Adopted Kubernetes to Orchestrate Software Containers, and Rook, Which Runs Inside Kubernetes, to Manage Distributed Storage
https://rook.io/
“Kubernetes with Rook/Ceph allows us to manage petabytes of distributed storage and GPUs for data science, while we measure and monitor network use.” --John Graham, Calit2/QI, UC San Diego
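As a rough illustration of how a user would carve out a slice of Rook/Ceph-managed storage, the sketch below creates a PersistentVolumeClaim with the Kubernetes Python client. The storage class name `rook-ceph-block` (a common Rook default) and the namespace are assumptions for illustration, not necessarily what Nautilus actually exposes.

```python
# Hypothetical request for Rook/Ceph-backed storage via a PersistentVolumeClaim.
# The storage class name and namespace are assumptions, not Nautilus specifics.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="example-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="rook-ceph-block",   # assumed Rook/Ceph block class
        resources=client.V1ResourceRequirements(requests={"storage": "100Gi"}),
    ),
)
v1.create_namespaced_persistent_volume_claim(namespace="my-namespace", body=pvc)
```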
PRP’s California Nautilus Hypercluster Connected by Use of the CENIC 100G Network
[Map of the 15-campus Nautilus cluster: 134 hosts, 4,360 CPU cores, 407 GPUs (~4,000 cores each), ~1.7 PB of storage. California sites with FIONAs and PRP disks on 10G-100G links include USD, UCLA, Caltech, USC, UCR, UCSB, CSUSB (a Minority Serving Institution), Calit2/UCI, UCSC, SDSC @ UCSD, NPS, Stanford, SDSU, UCSD, UCM, and UCSF; several sites host HPWREN nodes, and UCSD adds FPGAs plus 2 PB of BeeGFS storage.]
PRP/TNRP’s United States Nautilus Hypercluster Now Connects 4 More Regionals and 3 Internet2 Storage Sites
[Map of FIONA sites outside California: UWashington (40G, 192 TB), StarLight/UIC in Chicago (40G, 3 TB), NCAR-Wyoming (40G, 160 TB), U Hawaii (40G, 3 TB), and Internet2 FIONAs in Chicago, New York City, and Kansas City, reached via the CENIC/PW link.]
PRP Global Nautilus Hypercluster Is Rapidly Adding International Partners Beyond Our Original Partner in Amsterdam
Transoceanic nodes show distance is not a barrier to above 5 Gb/s disk-to-disk performance.
[Map of PRP’s current international partners: UvA (Netherlands), KISTI (Korea), U of Guam, Singapore, and U of Queensland (Australia).]
International GRP Workshop held 9/17-18/2019 at Calit2@UCSD
PRP’s Nautilus Forms a Powerful Multi-Application Distributed “Big Data” Storage and Machine-Learning Computer
Source: grafana.nautilus.optiputer.net on 1/27/2020
Collaboration on Distributed Machine Learning for Atmospheric Water in the West Between UC San Diego and UC Irvine
Complete workflow time: 19.2 days → 52 minutes, 532 times faster!
GPUs on SDSC’s Comet and Calit2 FIONAs at UC Irvine and UC San Diego, connected over the Pacific Research Platform (10-100 Gb/s)
Source: Scott Sellers, CW3E
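A quick check of the quoted speedup from the slide’s own numbers:

```python
# Quick check of the quoted speedup: 19.2 days down to 52 minutes.
baseline_minutes = 19.2 * 24 * 60       # 27,648 minutes
accelerated_minutes = 52
print(baseline_minutes / accelerated_minutes)   # ~531.7, i.e. ~532x faster
```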
UCB Science Engagement Workshop: Applying Advanced Astronomy AI to Microscopy Workflows Organized and Coordinated by UCB’s PRP Science Engagement Team
Co-Existence of Interactive and Non-Interactive Computing on PRP: NSF Large-Scale Observatories Asked to Utilize PRP Compute Resources
GPU simulations were needed to improve IceCube’s ice model
⇒ Results in significant improvement in pointing resolution for multi-messenger astrophysics
⇒ But IceCube did not have access to GPUs
Number of Requested PRP Nautilus GPUs for All Projects Has Gone Up 4X in 2019, Largely Driven by the Unplanned Access by NSF’s IceCube
https://grafana.nautilus.optiputer.net/d/fHSeM5Lmk/k8s-compute-resources-cluster-gpus?orgId=1&fullscreen&panelId=2&from=1546329600000&to=1577865599000
Multi-Messenger Astrophysics with IceCube Across All Available GPUs in the Cloud
• Integrate all GPUs available for sale worldwide into a single HTCondor pool
  – Use 28 regions across AWS, Azure, and Google Cloud for a burst of a couple of hours or so
  – Launch from PRP FIONAs
• IceCube submits their photon propagation workflow to this HTCondor pool
  – The input, jobs on the GPUs, and output are all part of a single globally distributed system
  – This demo used just the standard HTCondor tools (see the sketch below)
Run a GPU burst relevant in scale for future exascale HPC systems
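A minimal sketch of what a GPU-requesting submission with the standard HTCondor tools looks like, in the spirit of the burst described above. The executable name, resource requests, and queue count are placeholders, not IceCube’s actual photon-propagation workflow.

```python
# Hypothetical GPU job submission using standard HTCondor tools: write an
# ordinary submit description requesting one GPU per job, then hand it to
# condor_submit. Payload and sizes below are placeholders.
import pathlib
import subprocess

submit_description = """\
universe       = vanilla
executable     = propagate_photons.sh    # placeholder payload
request_gpus   = 1
request_cpus   = 1
request_memory = 4GB
output         = job_$(Cluster)_$(Process).out
error          = job_$(Cluster)_$(Process).err
log            = burst.log
queue 1000
"""

pathlib.Path("burst.sub").write_text(submit_description)
subprocess.run(["condor_submit", "burst.sub"], check=True)
```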
Science with 51,000 GPUs Achieved as Peak Performance
[Chart: each color is a different cloud region in the US, EU, or Asia; 28 regions in use; x-axis is time in minutes.]
Peaked at 51,500 GPUs, ~380 petaflops of FP32
Summary of stats at peak: 8 generations of NVIDIA GPUs used
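A back-of-the-envelope check of the peak numbers, dividing the quoted aggregate FP32 rate by the GPU count:

```python
# Back-of-the-envelope: average FP32 rate per GPU at the quoted peak.
peak_flops = 380e15      # ~380 petaflops of FP32
peak_gpus = 51_500
print(peak_flops / peak_gpus / 1e12)   # ~7.4 TFLOPS per GPU on average
```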
Engaging More Scientists: PRP Website http://ucsd-prp.gitlab.io/