Personal Volunteer Computing - Computing Frontier 2019 Sardinia, Italy, May 1st 2019 - Erick Lavoie
Personal Volunteer Computing Erick Lavoie, Laurie Hendren McGill University Computing Frontier 2019 Sardinia, Italy, May 1st 2019
Motivation, Paradigms, Applicability, Future
This presentation covers four aspects: the motivation for introducing a new distributed computing paradigm, how it compares to existing ones, its applicability today, and possible future directions.
Pioneer Cycle: Technical Innovation → Faster and More Efficient Devices → New Applications and Services / More Profitable Operations → Industry Growth and More Devices Sold
The success of computing technologies has been fuelled by a virtuous cycle of: 1. technical innovation, which leads to 2. better machines, 3. which open up new applications and services and more profitable operations, 4. which grow industrial applications and sell more devices, generating resources for new innovation.
Billions Smartphones Sold (Source: Gartner) / Slowing of Moore’s Law / Personal Laptop Collection!
[Figures: bar chart of smartphones sold per year, 2014 to 2018, in billions; excerpt and plots from an article on the end of Moore’s law (switching energy and integration density over time), with data available at https://purl.stanford.edu/gc095kp2609 and https://nano.stanford.edu/cmos-technology-scaling-trend; photo of a personal laptop collection.]
This feedback loop has been so successful that, for example, enough smartphones have now been produced to provide one for every human on earth. But that has also led to electronic “waste” in the form of under-used or even discarded but still-working older devices: here is my personal collection of laptops and phones as a sample.
At the same time, the slowing of Moore’s law means that these older devices can be used longer, as newer generations bring smaller relative improvements.
Both phenomena open an opportunity for a new cycle, which we call the Seral Cycle.
Seral Cycle: Technical Innovation → Valorize “Old” Devices from Pioneering Cycle → New Applications and Services with Low Margins and Low Capital → Generate Surpluses/Donations
In this cycle, the innovation consists in extracting more value out of older devices produced by the current Pioneering Cycle. As the cost of the devices has already been borne for previous applications, they can be applied to newer applications with low margins and little capital. This could potentially generate surpluses that could fuel further innovation.
Seral Cycle: Technical Innovation → Valorize “Old” Devices from Pioneering Cycle → New Applications and Services with Low Margins and Low Capital → Generate Surpluses/Donations. Distributed Computing Paradigm(s)?
But our field lacks widespread paradigms that fit that cycle. Let’s review what I consider the three major existing paradigms of distributed computing.
Cloud Computing Paradigm: Target Users … Resource Providers …
First, Cloud Computing is based on a computing market, where the computing resources are provided by major private companies that both use the infrastructure for world-wide services and rent the same infrastructure to smaller companies, which has enabled some, like Uber and AirBnB, to grow into major platforms in a few years. However, Clouds are inaccessible to those with no financial means or instruments, and their economic management requires standardized homogeneous resources at scale. Therefore they cannot leverage existing heterogeneous personal devices.
Grid Computing Paradigm: Target Users, Resource Providers
Second, Grid Computing started with the vision of a computing utility that could integrate the computing resources of multiple participating organizations under standard interfaces, similar to the way the electric grid was developed. However, it survives today as a Scientific Utility for research groups that share common computing infrastructure, and it is funded by governments. In its current form, Grid resources are reserved for those with administrative permissions, typically employees of governmental organizations and students. They are therefore not available to the general public. Moreover, similar to Clouds, the Grid approach relies on mostly homogeneous and recent hardware.
Volunteer Computing Paradigm: Target Users, Resource Providers
By Bonvallite - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=23522958
By Sam Howzit - V For Vendetta, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=42480624
As a third paradigm, Volunteer Computing relies on participants pooling their resources to, for example, help research teams discover life in the Universe or design new drugs. BOINC, the most popular tool for Volunteer Computing, relies on anonymous participants on the Internet sharing computing cycles on their computers. However, those participants cannot be trusted, and the BOINC tools rely on dedicated hardware to reach hundreds of thousands of participants. We believe the complexity of the BOINC tools and the cost of acquiring dedicated hardware are two factors that limit the adoption of the paradigm beyond its current niche.
Personal Volunteer Computing Paradigm: Target Users, Resource Providers
Anyone with significant computing needs but limited resources!
By Bonvallite - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=23522958
By Chocolatechocolate128 - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=47546765
By Mamhe Adw oaa - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=77324736
We therefore propose Personal Volunteer Computing as a more personal approach based on simpler tools, leveraging the inherent trust between friends and family and focusing on smaller applications, to open the approach to anyone with significant computing needs but limited resources, financial or otherwise, to satisfy them.
Overview of Personal Volunteer Computing: Personal Projects, Personal Devices, Personal Social Network, Personal Tools
Said differently, Personal Volunteer Computing targets personal projects, uses all personal devices and those of the community of the project initiator, and provides personal tools rather than global platforms. We believe this combination of characteristics has not been the focus of much research in the last decades and could provide significant benefits to those who are not well served by today’s major paradigms.
Can Personal Volunteer Computing provide computing benefits given today’s devices?
http://192.168.1.53:5000 Tablet Desktop Laptop 1 2 3 f(x) = … Pando
To answer the question, we used Pando, a tool we built following that paradigm. It applies a function to every value of a stream but distributes the processing among a dynamic set of participating devices. Each device performs the processing in its browser.
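As a rough illustration of the idea described above, here is a minimal sketch of applying a function to every input while spreading the work over several workers and still returning results in input order. It is not Pando's actual implementation: the name `distributedMap`, the simulated devices, and the fixed (rather than dynamic) set of workers are all assumptions made for the example. Workers use the same callback style, `function (x, cb)`, as the processing function shown on the next slide.

```javascript
// Hypothetical sketch, not Pando's implementation.
// Each worker is a function (input, cb) that calls cb(err, result) when done.
function distributedMap (inputs, workers, done) {
  const results = new Array(inputs.length)
  let next = 0       // index of the next input to hand out
  let completed = 0  // number of results received so far
  let failed = false

  if (inputs.length === 0) return done(null, results)

  // Hand one input to a worker; faster workers come back for more, more often.
  function dispatch (worker) {
    if (failed || next >= inputs.length) return
    const index = next++
    worker(inputs[index], function (err, value) {
      if (failed) return
      if (err) { failed = true; return done(err) }
      results[index] = value // store by index to preserve input order
      completed++
      if (completed === inputs.length) return done(null, results)
      dispatch(worker)       // this worker is free again
    })
  }

  workers.forEach(dispatch)
}

// Two simulated devices with different (made-up) speeds.
function simulatedDevice (delayMs) {
  return function (x, cb) { setTimeout(function () { cb(null, x * x) }, delayMs) }
}

distributedMap([1, 2, 3, 4, 5], [simulatedDevice(5), simulatedDevice(20)],
  function (err, squares) {
    if (err) throw err
    console.log(squares) // results arrive out of order but are stored in input order
  })
```

Because each worker asks for a new input only when it has finished the previous one, a faster device naturally processes a larger share of the stream, which is the kind of per-device contribution measured in the experiments.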
Applications: Collatz, Crypto-Mining (Bitcoin Proof-of-Work), Random-Testing of Concurrent Interleavings of Pando Workers, Animation Rendering (raytracing), Image Processing, Machine Learning Agent Training
Figure 1. Rotation animation around a 3D scene.
Figure 2. JavaScript programming interface example for rendering with raytracing (excerpt; the lines defining render and zlib are not shown):
    module.exports['/pando/1.0.0'] = function (cameraPos, cb) {
      try {
        var pixels = render(parseFloat(cameraPos))
        cb(null, zlib.gzipSync(new Buffer(pixels)).toString('base64'))
      } catch (err) {
        cb(err)
      }
    }
We used the tool for six applications, including testing the correctness of the implementation of Pando itself, synthesizing animations, and training a machine learning agent. All are CPU-bound.
Devices: Novena (Linux ARM), MacBook Pro 2016, Asus Laptop (Windows Intel), iPhone SE (2016, not shown), MacBook Air 2011, iPhone 4S (2011)
We measured the collective computing contributions of my collection of laptops and phones, which includes a Linux ARM laptop, a Windows laptop, a MacBook Pro, and two phones.
[Stacked bar chart: relative contribution (%) of each device (MBPro 2016, iPhone SE, Asus Laptop, MBAir 2011, Novena, iPhone 4S) to the total throughput, for each of the six applications.]
We obtained the following results. This graph shows the relative contribution of each device to the total throughput, normalized to one hundred percent.
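The normalization used in this kind of chart can be sketched as follows. The helper name `relativeContributions` and the throughput values are made up for illustration; they are not the measurements from the talk.

```javascript
// Hypothetical helper: converts per-device throughputs into percentage shares
// of the total, so that the shares of all devices add up to 100.
function relativeContributions (throughputs) {
  const total = Object.values(throughputs).reduce((sum, t) => sum + t, 0)
  if (total <= 0) throw new Error('total throughput must be positive')
  const shares = {}
  for (const [device, t] of Object.entries(throughputs)) {
    shares[device] = (100 * t) / total
  }
  return shares
}

// Made-up throughputs (items processed per second), for illustration only.
const shares = relativeContributions({ laptopA: 30, laptopB: 15, phone: 5 })
console.log(shares)
```

Normalizing this way makes applications with very different absolute throughputs comparable on a single axis, at the cost of hiding those absolute values.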
First, we can see that the iPhone 4S makes an insignificant contribution, which illustrates that not all older devices are still useful.
But it also shows that all the other devices, some as old as 2011, can collectively provide as much computing power as a top-of-the-line laptop from 2016.
[Bar chart: single-core performance ratios, iPhone 4S / MBAir 2011 and iPhone SE / MBPro 2016, for each of the six applications.]
Interestingly, as I happened to have pairs of phone and laptop models from 2011 and 2016, I computed the performance ratio between a single core on each. This showed that the gap between the two is clearly closing. In the image processing case, the phone was significantly faster, but that was due to Safari performing optimizations that Firefox was not performing on the MacBook Pro. Using Safari on the MacBook Pro gave results similar to those for the other applications.
Picture of Experimental Setup
As a second experiment, I invited my colleagues to participate with their personal phones. That brought fun community dynamics to the experiment. It was also significantly easier to convince colleagues I interact with regularly to contribute than anonymous strangers on the Internet.
[Pie chart: relative contribution of each phone on the random testing application. Lenovo P2a42 2016: 7%; LG G6 H870 2017: 11%; Wileyfox Storm 2016: 5%; Xiaomi redmi note 6 pro: 12%; Honor: 5%; Samsung Galaxy S7: 13%; Zenfone 3: 4%; Samsung A3 2016: 4%; Zenfone 2: 2%; Huawei P10 lite 2017: 15%; Huawei P10 lite 2017: 1%; iPhone SE: 19%.]
Here is the breakdown of the relative contributions of the phones they brought, on the random testing application.
First, we can see that the iPhone SE contributed the most, although this could be due to performance scaling: it was plugged in while all the other devices were running on battery.
Power management also seems to explain the slowest device, one of the Huawei phones: its screen locked during the experiment and it probably went into power-saving mode.
Second, the performance difference between the fastest and slowest phones was significant: we observed a factor of 9 between the extremes.
[Bar chart: throughput on the Random-Testing application, MacBook Pro 2016 (2 cores) versus 12 Smartphones (1 core each).]
Nonetheless, they collectively outperformed the MacBook Pro 2016, showing that there is value in using a collection of smartphones for computations.
Where do we go from here?
Besides adding support for more applications and obtaining more second-hand devices, how could we extend the scope of Personal Volunteer Computing in the future?
Solar-powered Flashcrowd?
Picture of Experimental Setup
By Suntactics - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=26538312
Starting with a first example: as the data centre crowd has already realized, the energy source and energy intensity of computing devices have become significant concerns. What if we gave all participants inexpensive solar panels to power our computations from sunlight or other renewable energy? We could end up with solar-powered computing flashcrowds!
40k solar-powered CPU cores? ~= 2016’s Dell SuperComputing in South Africa!
Image Source: By Kondah - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=57730956
If we were to scale that up to entire stadiums of sports supporters, we could even reach the computing capabilities of a supercomputer installed in Africa in 2016!
Source: https://www.zdnet.com/article/dells-new-supercomputer-is-the-fastest-in-africa/
Asynchronous Intermittent Computations?
As a second example, what if we synchronized computing operations with the availability of energy? It could be beneficial for long-running asynchronous computations, perhaps some forms of indexing or machine learning training.
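One way to picture such energy-synchronized computation is a loop that only processes work while energy is available and remembers how far it got, so it can resume in the next period. The sketch below is purely illustrative: `runWhileEnergyAvailable`, the `state` object, and the simulated `sunnyPeriod` energy source are all hypothetical names, and a real system would query an actual battery or solar sensor instead.

```javascript
// Hypothetical sketch: run queued tasks only while energy is available,
// keeping a checkpoint so the computation can resume where it stopped.
function runWhileEnergyAvailable (tasks, state, hasEnergy) {
  while (state.nextTask < tasks.length && hasEnergy()) {
    state.results.push(tasks[state.nextTask]())
    state.nextTask++ // checkpoint: this task never needs to run again
  }
  return state.nextTask === tasks.length // true once everything is done
}

// Simulated energy source: enough power for `budget` tasks per sunny period.
function sunnyPeriod (budget) {
  let remaining = budget
  return () => remaining-- > 0
}

const tasks = [1, 2, 3, 4, 5].map(n => () => n * n)
const state = { nextTask: 0, results: [] }
let interruptions = 0
while (!runWhileEnergyAvailable(tasks, state, sunnyPeriod(2))) {
  interruptions++ // energy ran out; wait for the next sunny period
}
console.log(state.results, interruptions)
```

The design choice here is to checkpoint at task granularity, which suits computations made of many small independent items; long indivisible tasks would need finer-grained checkpoints.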
Seral Cycle: Technical Innovation → Valorize “Old” Devices from Pioneering Cycle → New Applications and Services with Low Margins and Low Capital → Generate Surpluses/Donations
Personal Volunteer Computing: Applications, Devices, Personal Social Network, Tool
Old Laptops ~= MacBook Pro + iPhone SE 2016; Smartphone / Laptop; 12 Smartphones > MacBook Pro 2016; Decent. Renewable Energy? Intermittent computations?
In summary, based on an innovation cycle that reuses older devices, I have proposed the Personal Volunteer Computing paradigm for distributed computing, which targets personal applications, personal devices, contributions from friends and family, and personal tools. Using Pando, built along these lines, we have observed several trends showing that old devices can provide significant computing power. Finally, we proposed extending the paradigm to take decentralized energy sources into account and to widen the scope of applications.
Questions?
 | Cloud | Grid | Volunteer | Personal Volunteer
Motivation | Lower infras. costs & risks | Lower infras. costs | Lower op. costs | Lowest op. & hard. costs
Paradigm | Market | Scientific Utility | Commons | Commons
Ress. Providers | Single Cie. | Mult. org. | General Public | Friends & Family
Target Users | Customers | Researchers | Researchers | General Public
Funders | Customers | Governments | Governments | General Public?
 | Edge/Gray | P2P | Volunteer | Personal Volunteer
Motivation | Lower op. costs | Reliability | Lower op. costs | Lowest op. & hard. costs
Paradigm | Market | - | Commons | Commons
Target Users | Customers | Researchers | Researchers | General Public
Ress. Providers | End users | - | General Public | Friends & Family
Trusted Parties | Platform op. | P2P Algo. | Tool op. | Friends & Family
Design | Global Platform | Global Platform | Persistent Tool | Transient Tool
Coordination | Centralized | Distributed | Centralized (disjoint) | Centralized (disjoint)
Coordinators | Dedicated Servers | All Devices | Dedicated Server | User Device
Personal Volunteer
Motivation | Lowest op. & hard. costs
Paradigm | Commons
Target Users | General Public
Ress. Providers | Friends & Family
Trusted Parties | Friends & Family
Design | Transient Tool
Coordination | Centralized (disjoint)
Coordinators | User Device
Personal Applications, Personal Devices, Personal Social Network, Personal Tool
Can that work today?