Intersect360 Research White Paper: NEW AMD CPUs and GPUs CONTINUE MOMENTUM INTO HPC FUTURE

EXECUTIVE SUMMARY

The HPC industry is in the midst of an era of expansion. Intersect360 Research studies have shown the challenge that organizations face is the need to serve not only their ravenous technical computing applications, but also the adopted appetites of data science and machine learning. As this trend has progressed, it has led to a steady, corresponding shift in computational architecture for HPC.

Today, HPC relies on new concepts of scalability. Most new HPC deployments are heterogeneous; in addition to the CPUs that run the system, they are powered by complementary co-processors, usually in the form of GPUs. The fields of machine learning and data science, combined with the attendant computational power of GPUs, offer tremendous upside for the application of HPC to an ever-increasing array of endeavors.

Among all the HPC processing options to emerge in the past few years—and there have been many—the ones with the most momentum come from Advanced Micro Devices (AMD). AMD has a nearly monomaniacal focus on performance, and the company has impressively maintained consistent messaging company-wide, across generations of development. AMD has performed benchmark testing in which its EPYC™ processors outperform comparable Intel Xeon® parts on commonly used HPC applications. As a result of this dedicated effort, AMD perception and adoption are on the rise among HPC users.

The future of supercomputing is solidly heterogeneous, incorporating both CPUs and GPUs. AMD serves HPC not only with AMD EPYC CPUs, but also with AMD Instinct™ accelerators, and these are predestined to work well together. Upcoming generations of EPYC and Instinct will be integrated with AMD’s high-speed Infinity architecture, boosting inter-processor communication between CPU and GPU. AMD is addressing heterogeneous programming in two ways.
First, with coherent memory, programming is more straightforward, with fewer lines of code or custom calls. Second, AMD supports an open “ROCm™” (pronounced “rock ’em”) programming and software environment, making applications more portable to future generations of processors, regardless of where they come from. By bringing together new technologies and new workloads, AMD provides a compelling vision of scalability into the future of HPC.

© 2021 Intersect360 Research. White paper sponsored by AMD. Full neutrality statement at https://www.intersect360.com/features/neutrality-statement.
P.O. Box 60296 | Sunnyvale, CA 94088 | Tel. [888] 256-0124
www.Intersect360.com | info@Intersect360.com
MARKET DYNAMICS: THE NEW SCALABILITY

One basic characteristic of High Performance Computing (HPC) is that it doesn’t hold still. No amount of scientific discovery, product enhancement, or engineering achievement is final; it is merely the next evolutionary step in ongoing advancement. Every solution unveils the next question, and the tools of HPC must also improve and evolve to solve each new generation of challenges.

Due to the nature of this ongoing expansion, HPC has always included notions of scalability. Invariably, scalability was tied to notions of “more”: more processors, more computations, more bandwidth. But thanks to added complexity due to a confluence of trends, scalability in HPC is more complicated than ever, and new insights and discoveries are increasingly linked to combinations of factors.

The New Scalability: Applications

The HPC industry is in the midst of an era of expansion beyond its everyday, relentless pursuit of knowledge. The last decade has seen the introduction of successive, related uber-trends: the first, big data, which established data science and analytics as high-performance enterprise workloads; the second, artificial intelligence (AI), which popularized machine learning as an alternate, experiential approach to computing. Both are enabled by the creation and accessibility of vast amounts of data to complement raw computation.

As they evolve, data science and machine learning need not be independent from traditional HPC. These techniques can augment preexisting computational methods, unlocking new computing capabilities.
Applications in financial services, manufacturing, oil exploration, and bio-sciences are all data-intensive and well-suited to the incorporation of machine learning. Intersect360 Research studies have shown the strong majority of HPC-using organizations have incorporated data science or machine learning workloads into their environments. Note that this does not imply a corresponding tripling of budgets; far from it, although there has been an increase. The corresponding challenge that organizations face is the need to serve not only their ravenous technical computing applications, but also the adopted appetites of data science and machine learning. As this trend has progressed, it has led to a steady, corresponding shift in computational architecture for HPC.

The New Scalability: Architectures

As HPC applications have continued to evolve, the systems that power them have had to continue to evolve as well. Just as no application is complete for all time, no supercomputer has ever proved powerful enough that it could not eventually be saturated. Over time, HPC architectures have progressed through their own eras to achieve greater scalability for expanding application sets. Vector processors were replaced by scalar RISC processors. RISC gave way to x86 as single-system deployments were replaced by clusters.
For years, cluster deployments were largely homogeneous. They ran x86 processors in dual- socket, industry-standard servers, over some type of networking fabric. One of the advantages this hegemony presented was portability. Applications could be migrated from one cluster to the next without much care to vendor. Components like CPUs, memory, and network could be upgraded, but the model remained the same. But innovation isn’t innovation if someone else gets there first. Furthermore, new application requirements—such as those posed by AI—mean that one single processor or configuration isn’t always best for every workload. This set of circumstances has conspired to help establish a new computational norm in HPC: accelerated computing. Today, most new HPC deployments are heterogeneous; in addition to the CPUs that run the system, they are powered by complementary co-processors, usually in the form of GPUs. High-power GPUs—graphics processing units—have their roots in gaming and entertainment. The processing elements that power these high-end visual effects are strong engines for floating-point performance, and with the right programming tools, they can be harnessed to boost mathematical performance for scientific computing. As GPUs have evolved as accelerators, differences have begun to emerge in the needs between HPC and graphics, demanding further specialization. The tide of GPU computing was already rising in HPC before AI burst onto the scene. As it happened, GPUs were well suited to machine learning, for both training and inference. AI accelerated the adoption of GPUs for HPC, particularly for mixed-workload environments. 
And there has been a budgetary effect as well: Over 60% of HPC users are running machine learning as part of their environments, leading to frequent increases in budgets for high-performance workloads—in some cases more than doubling (see chart below).1

The New Scalability: Challenges

The fields of machine learning and data science, combined with the attendant computational power of GPUs, offer tremendous upside for the application of HPC to an ever-increasing array of endeavors. All that’s left to worry about is the software.

Application development and optimization are perpetual challenges in HPC. Not only does the developer need to create algorithms for simulating complex phenomena, but the resulting programs have to be run by vast arrays of independent processing elements. Single, massive problems must be decomposed into digestible computations that can be run quickly, with interim calculations compared and reassigned in step after step.

Heterogeneous computing—the use of multiple types of processors, such as CPUs together with GPUs—makes programming more complicated. The programmer needs to specify not only how to decompose a problem across multiple nodes, but also which portions of work to assign to each type of element.

In the past, proprietary programming languages have been a hindrance for custom co-processors. Recently, programming for GPUs has become more widespread, with more accessible languages and libraries; however, the most commonly used tools are still proprietary to particular brands of GPUs.

The net result is that HPC has, for the time being, returned to an era of specialization. Great innovations are possible, but this has come at the expense of portability; application advancements have been tied to particular vendors’ product lines. When the HPC landscape was dominated by Intel x86 processors with few options, this wasn’t a particular concern. Today, with a diversity of options between CPUs, GPUs, and other components, users are less certain in committing to one vendor only. Portability is still an issue, as software investment protection remains important in the continual updating of capabilities. Furthermore, as machine learning and data science beget new applications, there is an interest in securing a bridge to the future.

Effect on High-Performance Workload Budget Related to Incorporation of Machine Learning
Source: Intersect360 Research, 2021

INTERSECT360 RESEARCH ANALYSIS

An “EPYC” Comeback for AMD

Among all the HPC processing options to emerge in the past few years—and there have been many—the ones with the most momentum come from Advanced Micro Devices (AMD), a company on a major upward trend. AMD gained attention in HPC when it announced its first generation of EPYC x86 processors. With the second generation, AMD gained recognition, becoming the first microprocessor vendor to go to market with a 7-nanometer (7nm) manufacturing process.

Now, with the launch of third-generation AMD EPYC processors, AMD has continued its monomaniacal focus on winning the performance race. The AMD “Zen 3” core architecture at the heart of the EPYC 7003 CPU delivers a range of optimizations, resulting in up to 19% improvement in operations per cycle, according to AMD. This performance boost comes from a combination of enhancements over the “Zen 2” core architecture, such as increased bandwidth for both loads and stores, and improved latency for certain calculations. (See chart.)

AMD “Zen 3” Architecture Enhancements vs. “Zen 2”2
Source: AMD, 2021

The AMD EPYC 7003 enhancements go beyond the core architecture performance. The complete SOC (system on chip) has enhanced memory and cache functionality and additional security features, while maintaining socket compatibility with previous AMD EPYC versions. In particular, the L3 cache is integrated into a single, large 32MB reservoir, rather than two smaller ones, allowing full cache allocation to any single core that may need it, thereby benefiting applications with databases that may fit into the single, larger cache. In one other subtle improvement, the AMD Infinity Fabric™ clock is now synchronous with DRAM memory. This helps reduce latency in waiting for data, resulting in an improvement for memory-sensitive applications, which are common in HPC.

These enhancements add up to real-world results on applications at the heart of HPC.

1 Intersect360 Research, HPC User Budget Map survey data, 2020.
AMD has performed benchmark testing in which its EPYC processors outperform comparable Intel Xeon processors on commonly used HPC applications, with average improvements ranging from 43% on crash-test simulations up to 99% for computational fluid dynamics. (See chart.)

AMD EPYC 7003 Comparative Benchmark Results on HPC Applications3
2x AMD EPYC™ 75F3 (32 core) vs. 2x Intel® Xeon® Gold 6258R (28 core), Average Performance Across Representative Workloads
Source: AMD, 2021

As a result of this dedicated effort, AMD perception is on the rise among HPC users. In a 2016 Intersect360 Research study, 36% of HPC users said they had a favorable forward-looking impression of AMD CPUs. In a similar study in 2020, that percentage had risen to 78%—higher even than the percentage of users with a favorable future impression of Intel CPUs today (69%). (See chart below.)

And with AMD now shipping its third generation of EPYC CPUs, this positive sentiment is beginning to show up in real customer deployments. AMD was named as the processor vendor in only 5% of surveyed HPC systems in 2017 and 2018 combined.4 Today, 23% of HPC users say they have AMD EPYC processors in widespread use. An additional 47% are testing or using AMD EPYC at some level, giving AMD CPUs a presence in 70% of HPC sites, an astonishing transition from three years ago. (See charts below.)

Percent of HPC Users with Favorable Forward-Looking Impressions of CPUs5
Source: Intersect360 Research, 2021

Current Penetration of AMD CPUs Among Surveyed HPC Sites6
Source: Intersect360 Research, 2021

2 AMD claim, based on AMD internal testing as of February 1, 2021, average performance improvement at ISO-frequency on an AMD EPYC™ 72F3 (8C/8T, 3.7 GHz), compared to an AMD EPYC™ 7F32 (8C/8T, 3.7 GHz), per-core, single thread, using a select set of workloads including estimated SPECrate®2017_int_base, SPECrate®2017_fp_base, and representative server workloads.
3 AMD internal testing. All tests compare 2x EPYC™ 75F3 (32C) to 2x Intel® Xeon® Gold 6258R (28C) processors. WRF version 4.1.5 comparison based on testing completed on 2/17/2021 on an AMD reference platform compared to an Intel server on a production system. ANSYS® CFX® 2021.1 comparison based on testing as of 2/5/2021 measuring the time to run the Release 14.0 test case simulations (converted to jobs/day – higher is better). ANSYS® LS-DYNA® version 2021.1 comparison based on testing as of 2/5/2021 measuring the time to run neon, 3cars, PPT-short, odb10m-short, and car2car test case simulations (converted to jobs/day – higher is better); results were AMD 17,555 total seconds versus Intel 28,774 total seconds, for ~81.0% more per node or ~59% more per core performance advantage; the 3cars test case gain individually was ~126% more per node or ~98% more per core jobs/day performance. ESI Virtual Performance Solution (VPS, better known as PAM-CRASH®) version 2020.0 comparison based on testing as of 2/5/2021 measuring the neon test case simulation (converted to jobs/day – higher is better) for ~43% more per node or ~25% more per core jobs/day performance. Star-CCM+ 2020.3 comparison based on testing as of 2/5/2021 measuring the average seconds to complete 11 test cases and converted to jobs/day (higher is better); the KCS Marine Hull with No Rudder in Fine Waves test case individually was ~79% more per node or ~57% more per core performance. Results may vary.
4 Intersect360 Research HPC User Site Census surveys across 2017 and 2018, total proportion of systems for which AMD was identified as CPU provider, including half-system credit for systems in which AMD was a shared CPU provider.
5 Intersect360 Research data from multiple studies. 2016: Special study: “Processing Elements for HPC”; question, “Overall, how favorable is your forward-looking impression of each of the following, with respect to your HPC workloads? (1 = Completely unfavorable; 5 = completely favorable)”; scores are combined percentage 4 and 5 for AMD Opteron versus Intel Xeon. 2020: “Vendor Satisfaction and Loyalty in HPC”; question, “What is your impression of each of the following vendors' future prospects for HPC?” (five-point scale); scores are combined top-two responses (Very Impressed; Impressed) for AMD EPYC CPUs versus Intel Xeon CPUs.
6 Intersect360 Research HPC Technology Survey, 2021.
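The benchmark footnotes above report timings “converted to jobs/day – higher is better.” As a minimal sketch of that conversion (a plausible reading; AMD’s exact aggregation methodology is not specified here), a job that takes t seconds means a node completes 86,400 / t jobs per day, and the percentage advantage follows from the ratio of throughputs:

```python
# Sketch of the "converted to jobs/day" metric used in the benchmark
# footnotes. Assumption: one node runs one job at a time, so throughput
# is simply seconds-per-day divided by seconds-per-job.

SECONDS_PER_DAY = 86_400

def jobs_per_day(seconds_per_job: float) -> float:
    """Throughput in jobs/day for a job taking the given wall-clock time."""
    return SECONDS_PER_DAY / seconds_per_job

def relative_advantage(t_a: float, t_b: float) -> float:
    """Percent more jobs/day for system A (time t_a) over system B (time t_b)."""
    return (jobs_per_day(t_a) / jobs_per_day(t_b) - 1.0) * 100.0

# A system that finishes a job in half the time does twice as many jobs/day:
assert jobs_per_day(43_200) == 2 * jobs_per_day(86_400)
assert round(relative_advantage(50.0, 100.0)) == 100
```

This is why “higher is better” for jobs/day even though the underlying measurement is elapsed time, where lower is better.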
AMD’s momentum shows up in more than user surveys. The U.S. Department of Energy (DOE) has selected AMD as the processor vendor for the pending Frontier and El Capitan supercomputers. These will each be among the world’s earliest “Exascale”-class systems, with peak rated speeds of over an Exaflop: one quintillion calculations per second7 for scientific, 64-bit calculations. Frontier, when it arrives, is expected to be the first Exascale supercomputer in the U.S.,8 perhaps in the world, and El Capitan is projected to be the world’s first supercomputer at 2 Exaflops peak performance.9

Marrying CPU and GPU

It wasn’t the EPYC CPU alone that attracted the DOE to AMD for Frontier and El Capitan. As GPU-accelerated applications have become more common, few HPC users want to give up their acceleration. The future of supercomputing is solidly heterogeneous, incorporating both CPUs and GPUs for HPC, data science, and machine learning.

For the past ten years, Intel has been the dominant provider of CPUs, and the dominant provider of computational GPUs has been NVIDIA. While this situation has worked well enough thus far, the polarization it presents is worrisome. NVIDIA and Intel compete more than they cooperate, both in processing and in networking, and programming for one company’s solutions does not translate to the other’s.

Enter AMD, again. AMD serves HPC not only with its EPYC CPUs, but also with AMD Instinct GPUs, and these are predestined to work well together. In fact, the upcoming generations of EPYC and Instinct will be integrated with AMD’s high-speed Infinity architecture, boosting inter-processor communication between CPU and GPU.

Since they work in concert over a high-bandwidth, low-latency connection, AMD will be first to market with a feature not previously seen for heterogeneous architectures: coherent memory.
With a single, coherent memory space across an integrated chipset, programmers can assign individual calculations, subroutines, or loops without moving data. “Pass the pointer, not the data” has been a mantra for advocates of coherent memory; now this concept is applicable to heterogeneous computing as well.

AMD Instinct MI100 Accelerator

Reaching Exascale levels of performance doesn’t happen merely by ganging more elements together; the individual elements need to get faster as well. AMD has advanced the CPU components with its EPYC processor line, and in November 2020, in conjunction with the annual SC conference10 for the worldwide supercomputing community, AMD announced its latest GPU offering for HPC, the AMD Instinct MI100.

7 One quintillion = one billion billion = one million million million = 10^18 = 1,000,000,000,000,000,000.
8 https://www.hpcwire.com/2020/10/01/auroras-troubles-move-frontier-into-pole-exascale-position/.
9 https://www.amd.com/en/products/exascale-era.
10 http://supercomputing.org/index.php.
The MI100 brings AMD’s focus on HPC performance to its GPU line. AMD is promoting MI100 as the first GPU with over 10 Teraflops of performance, with 11.5 Teraflops of peak 64-bit performance.11

The MI100 is the first GPU based on the new AMD CDNA™ (Compute DNA) architecture, which separates GPU designs between those focused on computation (AMD CDNA) and those focused on graphics (AMD RDNA, for Radeon™ DNA). The AMD CDNA architecture offers a new core design with double the computational efficiency of previous AMD GPUs,12 along with “Matrix Core Technology” that targets acceleration for HPC and AI applications. The MI100 with AMD CDNA has 32GB of HBM2 memory, with over 1.2 Terabytes per second of memory throughput,13 and the second-generation Infinity architecture, with up to a 37% speed-up of GPU-to-GPU communication over the first generation.14

As much as it targets HPC as a primary market, MI100 does not ignore AI, which is an integrated part of HPC environments. The AMD CDNA architecture supports mixed-precision workloads, with FP32 and FP16 matrix, bfloat16, INT8, and INT4 for machine learning. AMD states MI100 is nearly seven times faster than its previous Radeon GPUs for FP16 workloads for AI.15

Programming for the Future

All the computing power in the world doesn’t help if you can’t program for it. AMD is addressing this in two ways. First, with coherent memory, programming is more straightforward, with fewer lines of code or custom calls. (See diagrams below.) This is useful for the ongoing development of new applications that take advantage of heterogeneous computing.
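To see why coherent memory means fewer lines of code, consider the two styles side by side. The following is a toy Python model, not AMD’s API: it simulates discrete device memory with a separate copied buffer, versus a shared buffer that both “processors” can touch directly.

```python
# Toy model of the two heterogeneous programming styles. This is NOT a real
# GPU API; a plain Python list stands in for device memory, to show why a
# coherent, shared address space removes the explicit copy calls.

def scale_with_explicit_copies(host_data, factor):
    """Discrete-memory style: copy in, compute on the device copy, copy out."""
    device_data = list(host_data)                    # host -> device copy
    device_data = [x * factor for x in device_data]  # "kernel" on device copy
    return list(device_data)                         # device -> host copy

def scale_with_coherent_memory(shared_data, factor):
    """Coherent-memory style: pass the pointer, not the data."""
    for i, x in enumerate(shared_data):              # "kernel" updates shared buffer
        shared_data[i] = x * factor
    return shared_data                               # same buffer throughout; no copies

data = [1.0, 2.0, 3.0]
assert scale_with_explicit_copies(data, 2.0) == [2.0, 4.0, 6.0]
assert scale_with_coherent_memory(data, 2.0) == [2.0, 4.0, 6.0]
```

Both versions compute the same result; the coherent version simply has no data-movement steps, which is the source of the simpler code (and the reduced transfer latency) described above.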
Second, AMD supports programming models that are open, rather than proprietary, making them more portable to future generations of processors, regardless of where they come from.

Programming Heterogeneous Computing with Coherent Memory
Source: AMD, 2020

AMD Commitment to Open-Source Development for Heterogeneous Computing
Source: AMD, 2020

To achieve this, AMD is putting forward its own open-development environment, AMD ROCm (pronounced “Rock ’em”). ROCm is the critical piece that will determine software performance and scalability in HPC environments. (See chart below.)

11 AMD claim: Calculations conducted by AMD Performance Labs as of September 18, 2020 for the AMD Instinct™ MI100 (32GB HBM2 PCIe® card) accelerator at 1,502 MHz peak boost engine clock resulted in 11.54 TFLOPS peak double precision (FP64), 46.1 TFLOPS peak single precision matrix (FP32), 23.1 TFLOPS peak single precision (FP32), 184.6 TFLOPS peak half precision (FP16) peak theoretical, floating-point performance. Published results on the NVIDIA Ampere A100 (40GB) GPU accelerator resulted in 9.7 TFLOPS peak double precision (FP64), 19.5 TFLOPS peak single precision (FP32), 78 TFLOPS peak half precision (FP16) theoretical, floating-point performance. Server manufacturers may vary configuration offerings yielding different results. MI100-03.
12 AMD claim: AMD Instinct™ MI100 accelerators provide 120 compute units and 7,680 stream cores in a 300W accelerator card. Radeon Instinct™ MI50 accelerators provide 60 compute units (CUs) and 3,840 stream cores in a 300W accelerator card. MI100-09.
13 AMD claim: Calculations by AMD Performance Labs as of Oct 5th, 2020 for the AMD Instinct™ MI100 accelerator designed with AMD CDNA 7nm FinFET process technology at 1,200 MHz peak memory clock resulted in 1.2288 TB/s peak theoretical memory bandwidth performance. The results calculated for the Radeon Instinct™ MI50 GPU designed with “Vega” 7nm FinFET process technology with 1,000 MHz peak memory clock resulted in 1.024 TB/s peak theoretical memory bandwidth performance. CDNA-04.
14 AMD claim: Calculations as of Sep 18th, 2020. AMD Instinct™ MI100 accelerators support PCIe® Gen4 providing up to 64 GB/s peak theoretical transport data bandwidth from CPU to GPU per card. AMD Instinct™ MI100 accelerators include three Infinity Fabric™ links providing up to 276 GB/s peak theoretical GPU-to-GPU or Peer-to-Peer (P2P) transport rate bandwidth performance per GPU card. Combined with PCIe Gen4 support, this provides an aggregate GPU card I/O peak bandwidth of up to 340 GB/s. Server manufacturers may vary configuration offerings yielding different results. MI100-06.
15 AMD claim: Calculations performed by AMD Performance Labs as of September 18, 2020 for the AMD Instinct™ MI100 accelerator at 1,502 MHz peak boost engine clock resulted in 184.57 TFLOPS peak theoretical half precision (FP16) and 46.14 TFLOPS peak theoretical single precision (FP32 Matrix) floating-point performance. The results calculated for the Radeon Instinct™ MI50 GPU at 1,725 MHz peak engine clock resulted in 26.5 TFLOPS peak theoretical half precision (FP16) and 13.25 TFLOPS peak theoretical single precision (FP32 Matrix) floating-point performance. Server manufacturers may vary configuration offerings yielding different results. MI100-04.
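One pillar of this open approach is converting existing CUDA source to HIP, AMD’s portable programming interface, whose runtime calls deliberately mirror CUDA’s (hipMalloc for cudaMalloc, hipMemcpy for cudaMemcpy, and so on). The toy translator below gives a flavor of that source-to-source substitution; AMD’s actual HIPify tools are far more thorough (one is compiler-based), and this sketch renames only a handful of common runtime calls.

```python
import re

# Toy illustration of CUDA-to-HIP renaming. The API names on both sides are
# real, but this simple substitution is only a sketch of what AMD's HIPify
# porting tools automate.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

def hipify(cuda_source: str) -> str:
    """Rewrite known CUDA runtime calls to their HIP equivalents."""
    # Match longer names first so prefixes do not shadow longer identifiers.
    pattern = re.compile("|".join(sorted(CUDA_TO_HIP, key=len, reverse=True)))
    return pattern.sub(lambda m: CUDA_TO_HIP[m.group(0)], cuda_source)

snippet = "cudaMalloc(&buf, n); cudaMemcpy(buf, src, n, cudaMemcpyHostToDevice);"
assert hipify(snippet) == "hipMalloc(&buf, n); hipMemcpy(buf, src, n, hipMemcpyHostToDevice);"
```

Because HIP mirrors the CUDA runtime so closely, most ports are mechanical renames like these, which is what makes the high automated-conversion rates AMD cites plausible.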
And what about applications that already exist? Many users have already spent years porting and optimizing applications with NVIDIA’s CUDA tools. Here AMD has a potentially critical solution: HIP (Heterogeneous-Compute Interface for Portability), a programming model that enables applications to run on both AMD and NVIDIA hardware with the same code base. HIP provides an abstraction layer to call the optimized code, compiled for each architecture. Through a pair of “HIPify” porting tools, AMD offers automated conversion of applications written to run only on NVIDIA CUDA to be compatible with open AMD HIP. AMD cites examples of 90% to 95% of code converting automatically, with no user intervention.16

Features of the AMD ROCm Software Environment
Source: AMD, 2021
◢ Complete set of libraries, tools and management APIs
◢ Open-source compiler for OpenMP and HIP
◢ Easy, automated tools to convert CUDA code to HIP with virtually no performance loss
◢ Scales from Workstation to Cloud to Exascale
◢ Open ecosystem for 3rd party development

Here the partnership with DOE will pay dividends for AMD as well. In a 2019 vendor profile of AMD, Intersect360 Research wrote: “AMD need not wait for the installation of Frontier to begin reaping the benefits of its affiliation with the DOE labs. … DOE researchers now have a vested interest in seeing their scientific research applications run well, at scale, on AMD architectures. Furthermore, the DOE is committed to open science; researchers will have an incentive to push any porting or optimization to the broader scientific community.”17

Bringing It All Together

HPC is continuing to expand and diversify in ways that are both exciting and frightening.
New workloads, new architectures, and new frontiers of scalability will lead to innovations and discoveries beyond yesterday’s conception. But with expanding possibility comes the added challenge of harnessing all that power.

AMD is back in the HPC game, and in many ways, AMD is back in the lead. Winning on price/performance involves innovation on two fronts: maximizing raw performance and optimizing efficiency. AMD is targeting both. AMD has both CPU and GPU, each aiming for HPC supremacy, over its own integrated fabric. Additionally, AMD has an open software environment that not only protects existing investments with automated conversion tools but also promotes open software development in the future. And AMD has a critical relationship with the DOE that will provide support and stability for a new generation of supercomputing.

Incorporating CPU, GPU, and software into a consolidated, open-community environment is something no other company has done yet. By bringing together new technologies and new workloads, AMD provides a compelling vision of scalability into the future of HPC.

For more information about AMD solutions for HPC, visit www.AMD.com/HPC.

16 https://www.admin-magazine.com/HPC/Articles/Porting-CUDA-to-HIP.
17 Intersect360 Research, Vendor Overview and Outlook: AMD in HPC, November 2019.

AMD, the AMD logo, EPYC, Infinity, AMD Instinct, ROCm, Radeon, AMD CDNA, AMD RDNA and combinations thereof are trademarks of Advanced Micro Devices, Inc.