SOFTWARE-DEFINED HARDWARE: THE NEW ERA OF COMPUTING INFRASTRUCTURE

Accenture Labs
Executive summary

As companies embrace digital transformation and intelligent process automation, demand for computing services continues to skyrocket. To keep up, infrastructure providers are continuously adding more general-purpose processors to their compute environments. However, energy consumption and the associated costs are quickly becoming bottlenecks and limiting their growth.

Adding to this complication is the fact that the computing speed-up experienced in the last five decades—driven by Moore’s law, shrinking transistors and higher-density chips—is slowing down. In just 10 years, when the size of a transistor reaches the size of an atom, it’s expected to come to a halt. Many researchers are working on the next generation of transistors, with new materials, designs and fabrication technologies; but these new transistors are at least a decade away from mass production, leaving a critical gap between the predicted end of Moore’s law and the readiness of new approaches to pick up where it leaves off.

Many infrastructure providers, such as Microsoft and Baidu, are embracing hardware accelerators and specialized computers to continue scaling their compute environments without the commensurate energy consumption of general-purpose processors. Among accelerators, General-Purpose Graphics Processing Units (GPGPUs) are the most common. Application-Specific Integrated Circuits (ASICs) are also becoming more popular, with Google creating its own Tensor Processing Unit to support AI applications. But the most disruptive hardware accelerators are powered by Field Programmable Gate Arrays (FPGAs).
With additional tools and frameworks such as the Open Computing Language (OpenCL), FPGAs allow traditional software developers to embed custom logic into the hardware—effectively creating their own custom hardware accelerators—hence the term “software-defined hardware.” Several cloud providers, such as Amazon and Microsoft, are already offering FPGA cloud services, dubbed FPGA-as-a-service.

In addition to accelerators, specialized computers such as adiabatic quantum computers (www.accenture.com/quantum) and neuromorphic computers, along with traditional supercomputers, are being used to solve specific computation problems. The future of the enterprise computing infrastructure will consist of a diverse set of computational hardware. But it’s critical to note that each of these approaches will only scale specific types of applications and functions, whether those are artificial intelligence (AI), data transformation, security, or other segments of collective enterprise needs. To create performant applications at the right cost, companies will need to orchestrate the right computing workload across a range of evolving computing hardware. 2 | Software-Defined Hardware
Implications

Companies must consider the implications for future infrastructure and software decisions in four key areas:

Growing demand for AI, data transformation and secured applications has companies increasingly turning to hardware accelerators for their unique ability to scale these applications and functions. As a result, hardware accelerators are playing larger roles in the overall computing infrastructure, to the point where they will become a standard component. Companies must therefore manage and share hardware accelerators like other first-class infrastructure resources.

Software-defined hardware is blurring the line between hardware and software. The growing interdependency between the two implies there will be more software that can only run on specific hardware. Thus, the process of selecting hardware and software—which has traditionally been decoupled—is becoming more complicated. The same applies to the dependencies between software and cloud providers: different cloud providers use different types of hardware accelerators, which affects the type of software they can accelerate. Companies must ensure that their cloud providers can address these dependencies and continue to scale their software as demand increases.

More fragmentation, or “balkanization,” of cloud providers will occur due to the increasing hardware-software dependency. Thus, the market will see more software and services that are only available on specific cloud infrastructure. Cloud providers will be quick to take advantage of this balkanization effect, building up ecosystems of partners offering high-performance software applications that work only within the provider’s infrastructure confines. This will become a major source of differentiation among competitors. Selecting the right cloud providers will become even more critical, as the choice may limit companies from using certain software.
Similarly, before building new software, companies should ensure their selected software components are aligned with and available from their cloud providers.

Services, processes and frameworks (which may include containerization technologies, microservices and APIs) that allow easy orchestration of the right workload on the right compute infrastructure—whether within the same cloud provider or across different cloud providers—will play a key role in the future compute infrastructure of many large enterprises. This has a direct impact on how these companies should design their new software.

The remainder of this document covers the research, analysis and market examples used to derive these findings. Further, it outlines the steps companies can take now to prepare for a future of software-defined hardware.
Computational power hits a plateau

The demand for computing power continues to rise. The 2017 revenue growth of infrastructure-as-a-service was 36.8 percent, and spending growth on IT infrastructure products was 15.3 percent year-over-year in the same timeframe.1 But Moore’s law, which drove the last five decades of chip development, is slowing and is expected to hit its limit in the next decade. Already, Intel has indicated that instead of doubling the number of transistors on a chip every 18 months, the timeframe will stretch to as much as 36 months. This reversal of a fundamental tenet of computing can be attributed to the challenges of manufacturing circuits and nanometer-scale transistors.

The latest Intel processor, Kaby Lake—first released in 2016—is built on a 14-nanometer (nm) process, which is smaller than a typical virus; the 10 nm Cannon Lake processor is expected to be released in 2019. Not to be outdone, Samsung has announced early production of a 4 nm chip with full release scheduled for 2020. As a reference, 4 nm is the size of a “quantum dot” of just seven atoms in a single silicon crystal. A single-atom transistor chip is within reach, but it will require the creation of new process and etching technologies for mass-scale production. Remarkably, such a transistor was first created in 2012, but it had to be operated at temperatures barely above absolute zero.

Figure 1 shows the relative cost of design for each generation of exceedingly smaller chips. The cost of designing has risen steadily due to the complexity involved, which only serves to extend the production timeline.

Figure 1: Increasing design cost for reduced chip component size (Source: Accenture analysis). [Chart plots design cost ($M), on a 0–800 scale, against chip manufacturing processes of 20, 16, 10, 7 and 5 nm.]
Even if the physics issues can be addressed, the cost of fabrication for atom-sized chips will be staggering. For instance, the cost to build a fabrication plant for 14 nm chips is more than $5 billion. For 5 nm chips, it is estimated to cost roughly $16 billion. This projection does not account for the additional complexity involved in changing to new materials (e.g., germanium), new structures (e.g., 3D stacking) and completely new fabrication processes (e.g., carbon nanotubes). All of these factors could seriously impact the commercial viability of the upcoming chips.

Hardware accelerators to the rescue

Infrastructure providers—and the companies that rely on their services—are increasingly turning to hardware accelerators to provide the necessary compute power and speed at scale. Traditional central processing units (CPUs) were designed to run a wide range of tasks. However, for specific or repetitive functions, especially those that can be executed in parallel, hardware accelerators can run the task more efficiently in terms of both time and power consumption. There is a range of hardware accelerators, the most common of which is the Graphics Processing Unit (GPU). Other common types include Application-Specific Integrated Circuits (ASICs) and Field Programmable Gate Arrays (FPGAs). Figure 2 provides a high-level comparison of the different types of accelerators.
Figure 2: Relative hierarchy of hardware accelerators (Source: Accenture analysis)

Central Processing Unit (CPU): designed for general-purpose applications. Relative performance: 1. Flexibility: general purpose. Market: agnostic. Ease of programming: widely available skills. Key players: Intel, AMD, ARM.

Graphics Processing Unit (GPU): designed for graphics-related computations. Relative performance: 100. Flexibility: special-purpose processor. Market: somewhat restricted. Ease of programming: requires specialized skills. Key players: NVIDIA, AMD, Intel.

Field Programmable Gate Array (FPGA): an array of programmable blocks with a programmable interconnect. Relative performance: 1,000. Flexibility: field (re)programmable. Market: somewhat restricted. Ease of programming: requires specialized skills. Key players: Xilinx, Intel (Altera), Actel.

Application-Specific Integrated Circuit (ASIC): custom designed for specific functionality. Relative performance: 10,000–100,000. Flexibility: application specific. Market: specific. Ease of programming: rigid; interface only. Key players: NEC, LSI, Samsung.

In the hierarchy of processor performance ranging from general-purpose CPUs to ASICs, there is a tradeoff between flexibility and efficiency, with efficiency increasing by orders of magnitude when any given application is implemented higher up in that hierarchy.

GPU applications expanded in the last two decades

GPUs, the most common hardware accelerator, continue to gain traction. The chip was first introduced to accelerate graphics-related tasks such as quickly rendering the shadow of an object. However, GPUs can also be used to speed up non-graphics tasks such as signal processing, virus pattern recognition and medical image recognition. Companies like NVIDIA have been capitalizing on this more general use of the GPU, also known as General-Purpose Computing on GPU (GPGPU). NVIDIA’s DGX-1 system, coupled with its Volta GPU, for example, is designed specifically to support AI applications such as deep learning training, inference and accelerated analytics all in one system.
This approach has tripled the revenue of the company’s datacenter segment to a record $409 million, up 186 percent year-over-year.2 Most cloud providers, like Amazon, Microsoft and Google, also allow their customers to tap into the power of GPGPU through GPU cloud offerings.
ASIC is performant but expensive to produce

An Application-Specific Integrated Circuit (ASIC) is a chip that is designed for a single, specific purpose. Because it is highly optimized for that function, it typically operates at a higher level of efficiency than its CPU or GPGPU counterparts. Google, for example, created its own ASIC-based accelerator, the Tensor Processing Unit (TPU), to speed up machine learning applications. In addition to being faster, ASICs typically consume less power. This is one of the reasons Microsoft created its own ASIC-based Holographic Processing Unit (HPU) to process data from the various sensors in its HoloLens units.

Given that an ASIC requires a large one-time, up-front investment for design and manufacture—sometimes in the millions of dollars—this type of chip targets high production volumes. Further, because ASIC functionality is fixed once manufactured, it is hard to quickly refine or update. As such, ASICs traditionally target functionality that is relatively stable. In recent years, demand for ASICs has grown due to their widespread use in smartphones and tablets to meet the need for bandwidth. A recent study estimates the global ASIC market will grow at a CAGR of 17.01 percent from 2017 to 2021.3

Democratization of custom hardware accelerators

Field Programmable Gate Arrays (FPGAs) have been in existence for decades. However, the original circuit design was bulky and hard to program and interface with, making it of limited use. Until recently, FPGAs were used primarily by engineers specializing in digital hardware design [i.e., using the VHSIC Hardware Description Language (VHDL) or Verilog]—mostly as a hardware prototyping tool.4 The FPGA differs from other hardware accelerators in that it has no specific functionality when manufactured. As a hardware vessel, it needs to be programmed.
But once the software logic is embedded into it, running algorithms at the hardware level can yield orders-of-magnitude performance improvements. Unlike ASIC-based accelerators, which may take months to design and manufacture, FPGA-based accelerators can be developed in a matter of weeks. And unlike the GPGPU, the FPGA’s functionality is not confined to graphics-related operations.
Perhaps the biggest strength of the FPGA, however, is its ability to be reprogrammed. Its functionality can be refined and upgraded on the fly—perfect for quickly evolving areas such as machine learning. Using FPGAs, Microsoft has reported—for a certain type of computing—up to a 150–200x improvement in data throughput and up to a 50-fold improvement in energy efficiency compared to a CPU, while lowering latency by about 75 percent. Other examples of companies using FPGAs as accelerators:

• China-based Baidu has adopted FPGAs to accelerate SQL processing.5
• Microsoft Azure uses FPGAs to route network traffic, and Office 365 uses FPGAs for encryption and compression.6
• Nervana Systems—recently acquired by Intel—has developed FPGAs for deep learning; another startup, DeePhi, is doing the same.7

As shown in Figure 3, market leaders like Intel, Microsoft and Amazon are at the forefront of FPGA adoption. Intel, through its Altera acquisition, and Amazon, through its FPGA-equipped Elastic Compute Cloud (EC2), have demonstrated their belief in the future of FPGA-driven hardware acceleration.

Figure 3: Market leaders upbeat about the future of FPGA (Source: Accenture analysis)

• In 2015, Intel bought Altera, a maker of FPGAs, for $16.7 billion (its largest acquisition to date).8
• Intel is baking FPGAs into its Xeon-based servers as CPU accelerators, as announced at the 2016 Open Compute Project Summit to help accelerate adoption.9
• Microsoft is now putting FPGAs on PCI Express networking cards in every new server it deploys in its data centers.11
• Running Microsoft Bing’s machine learning algorithms on FPGAs yielded 40–100x performance improvements.
• Microsoft has announced the availability of Brainwave, an FPGA-based system for ultra-low-latency deep learning on Azure.15
• In April 2017, Amazon released an FPGA-equipped EC2 offering called F1 (i.e., FPGA cloud) to allow its customers to create custom hardware accelerators.
• F1 comes with tools to develop, simulate, debug and compile hardware acceleration code, including an FPGA Developer AMI and a Hardware Developer Kit.10
“By 2020, a third of all servers inside all the major cloud computing companies will include FPGA.”
—DIANE BRYANT, Group President, Data Center, Intel

FPGA-accelerated tools are getting popular, too, with interesting use cases of FPGA-accelerated platforms and reference frameworks beginning to emerge. For example, Bigstream claims to offer 2–5x hyper-acceleration of Spark applications using FPGAs. Bigstream uses acceleration techniques like native compilation, vectorization, locality optimization and custom data connectors to provide faster time-to-insight at significantly lower cost.

Figure 4: Sample FPGA-accelerated tools

DRAGEN
• The DRAGEN engine is a software framework and constituent library of hardware accelerator blocks, implemented in FPGA.
• The platform addresses two key unmet needs in big data genomics: compute and storage.
• It offers a scalable, accelerated and cost-efficient analysis solution for genomics applications.

Ryft
• Ryft claims to outperform the fastest data analytics platforms by 200x or more.
• Ryft Cloud enables users to get fast, actionable insight from their cloud-based data 72x faster than is currently possible with commodity cloud infrastructure.
• The Ryft ONE accelerator makes data analytics fast and simple by combining heterogeneous FPGA/x86 compute, SSD-based storage, a library of analytics algorithms and an open API.

With more companies turning to FPGAs, prices have been declining as well: a developer board now starts at as low as $80 and a developer kit at $20,000.12 Further, more software capabilities (see OpenCL below) are being developed to allow people to create their own custom accelerators—hence the democratization of FPGA.13
The rise of a custom accelerator marketplace

An interesting aspect of an FPGA-based accelerator is that it has two independent components:

1. The FPGA hardware
2. The program to be deployed to the FPGA

The latter is a digital asset—just like music, movies and apps—which can easily be bought and sold. As such, it is ripe for an app store-like marketplace to monetize such programs online. Amazon and companies like Accelize have started capitalizing on this phenomenon. For instance, Accelize not only provides a marketplace to buy and sell programs for FPGA accelerators in the form of AFI files (a specific file format deployable to the AWS FPGA cloud), but also provides services like digital rights management for these files, along with the associated payment services that allow AFI developers to monetize their work. Accelize is also aggressively forming alliances to create an ecosystem of partners to hasten the development and use of FPGA accelerators.
OpenCL plays a big role in driving hardware accelerators

Open Computing Language (OpenCL) is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, FPGAs and other accelerators. Prior to OpenCL, programmers had to use vendor-provided toolkits (such as NVIDIA CUDA) to use an accelerator. Because the code was vendor-specific, it locked the software implementation to a specific vendor and limited the number of different accelerators an application could use at any particular time. OpenCL was originally developed by Apple in 2008 and is now managed by a large, industry-wide consortium—the Khronos Group—with 100+ members including IBM, Google, Amazon, Microsoft and Baidu. This cross-platform framework has been deployed in a wide range of applications, from machine learning, gaming and creative tools to scientific and medical applications. OpenCL is based on the C programming language and has been expanded to cover other languages, including Python and JavaScript.
Future of compute infrastructure

Given the approaching end of Moore’s law and the demand for compute solutions that scale across a variety of application needs, the future of compute infrastructure will consist of a diverse set of hardware. The movement toward hardware accelerators is also supplemented by active development of specialized computers to augment the general-purpose computers used today. For example, D-Wave’s quantum computers can be used to rapidly solve optimization problems (see www.accenture.com/quantum). IBM’s TrueNorth, a neuromorphic computer, is great for pattern recognition—a key capability for AI applications. While these specialized computers are not covered in this article, it is important to acknowledge their roles in the future of compute infrastructure.

Given their flexibility and pervasiveness, traditional general-purpose computers will evolve toward working as orchestrators, directing specialized computers and accelerators to do specific tasks, as well as covering areas beyond those specific tasks. Figure 5 shows how general-purpose hardware and specialized hardware will most likely fit together, and this structure is already emerging. For example, 1QBit (see www.accenture.com/quantum) provides an interface between traditional computers and quantum computers; in the process, the interface translates the business problem into a form that is recognizable by a quantum computer.

Figure 5: Diversity of compute infrastructure in the future (Source: Accenture analysis). [Diagram: applications run on general-purpose CPUs, which dispatch work through common hardware frameworks (e.g., OpenCL) and microservices (APIs) to specialized hardware accelerators (GPUs, FPGAs, ASICs and others) and specialized computers (quantum computers, neuromorphic computers, HPCs and others).]
Business implications and next steps

From managing supply chains in real time to predicting the evolution of cancerous cells in the human body, the world is undoubtedly moving toward computational heterogeneity to meet the computing demands of the next decade. Such changes will not happen overnight, but with recent developments in quantum computing and neuromorphic computing, the landscape is changing fast.
Historically, companies have treated hardware and software as largely independent entities, managed by disparate groups. But the future of computing will bring widespread and far-reaching change. With hardware accelerators, the layer of separation between hardware and software is blurring, creating a tighter coupling between the two—a coupling that demands changes throughout the organization.

Critically, this new coupling will also create an infrastructure divergence for cloud providers like Google, Microsoft and Amazon, as they become even more specialized in which types of hardware they choose to accelerate their services. Companies may need to work with more cloud providers in order to meet their software needs, as software becomes more tightly linked to specific hardware approaches. Conversely, if companies restrict themselves to a specific subset of infrastructure/cloud providers, they may limit their ability to use the best-of-breed software available in the future.

Companies choosing specialized hardware to improve performance must do so thoughtfully, as it increases the complexity of the software and the dependency on certain types of hardware, especially in the context of hybrid cloud. What’s more, as companies accept more complexity in their architectural solutions, they must also contend with the fact that the skills available to develop and maintain such environments become more limited.

Hardware accelerators have much to offer, and it’s no surprise that they are increasingly being deployed across the enterprise. But companies must carefully manage their architectures to ensure that the benefits from acceleration are commensurate with the complexity of managing and maintaining the system in the long term. General-purpose CPUs will continue to be the main workhorse for running general workloads, as well as for orchestrating and parceling out work to the specialized hardware.
Software layering, componentization and API approaches can be used to isolate the specialized workloads. It will be essential to revisit existing software development processes, tools and architecture to prepare for—and maximize—the evolution to software-defined hardware.
AUTHORS

PAUL DAUGHERTY
Chief Technology & Innovation Officer, Accenture

EDY LIONGOSARI
Chief Research Scientist, Accenture Labs

PRANAV KUDESIA
Tech Research Lead, Accenture Research

CONTRIBUTORS

COLIN PURI
TERESA TUNG
CARL DUKATZ
RENEE BYRNES
REFERENCES

1. https://www.forbes.com/sites/louiscolumbus/2017/04/29/roundup-of-cloud-computing-forecasts-2017/#acb92fb31e87
2. https://www.fool.com/investing/2017/05/17/nvidia-delivers-stunning-ai-growth-on-a-solid-gami.aspx
3. http://dailynewsks.com/2017/05/application-specific-ic-asic-market-shares-competitive-landscape-analysis-challenges-2021/
4. http://www.ni.com/white-paper/6983/en/
5. https://www.nextplatform.com/2016/08/24/baidu-takes-fpga-approach-accelerating-big-sql/
6. http://anonhq.com/microsoft-bets-future-reprogrammable-computer-chip/
7. https://www.nextplatform.com/2016/08/23/fpga-based-deep-learning-accelerators-take-asics/
8. http://www.eweek.com/servers/intel-begins-shipping-xeon-chips-with-fpga-accelerators
9. https://newsroom.intel.com/news/intel-eases-use-fpga-acceleration-combines-platforms-software-stack-ecosystem-solutions/
10. https://aws.amazon.com/blogs/aws/ec2-f1-instances-with-fpgas-now-generally-available/
11. http://anonhq.com/microsoft-bets-future-reprogrammable-computer-chip/
12. https://www.adafruit.com/category/69
13. http://www.barrons.com/articles/nvidia-amd-xilinx-to-benefit-from-rise-of-gpu-fpga-says-jefferies-1479222054
14. http://spectrum.ieee.org/semiconductors/design/the-death-of-moores-law-will-spur-innovation
15. https://techcrunch.com/2017/08/22/microsoft-brainwave-aims-to-accelerate-deep-learning-with-fpgas/

ABOUT ACCENTURE LABS

Accenture Labs incubates and prototypes new concepts through applied R&D projects that are expected to have a significant strategic impact on clients’ businesses. Our dedicated team of technologists and researchers works with leaders across the company to invest in, incubate and deliver breakthrough ideas and solutions that help our clients create new sources of business advantage. Accenture Labs is located in seven key research hubs around the world and collaborates extensively with Accenture’s network of nearly 400 innovation centers, studios and centers of excellence globally to deliver cutting-edge research, insights and solutions to clients where they operate and live. For more information, please visit www.accenture.com/labs

ABOUT ACCENTURE RESEARCH

Accenture Research identifies and anticipates game-changing business, market and technology trends through provocative thought leadership. Our 250 researchers partner with world-class organizations such as MIT and Singularity to discover innovative solutions for our clients.

ABOUT ACCENTURE

Accenture is a leading global professional services company, providing a broad range of services and solutions in strategy, consulting, digital, technology and operations. Combining unmatched experience and specialized skills across more than 40 industries and all business functions – underpinned by the world’s largest delivery network – Accenture works at the intersection of business and technology to help clients improve their performance and create sustainable value for their stakeholders. With 449,000 people serving clients in more than 120 countries, Accenture drives innovation to improve the way the world works and lives. Visit us at www.accenture.com.

Copyright © 2018 Accenture. All rights reserved. Accenture, its logo, and High Performance Delivered are trademarks of Accenture.