Cicada: Introducing Predictive Guarantees for Cloud Networks
Katrina LaCurts* , Jeffrey C. Mogul† , Hari Balakrishnan* , Yoshio Turner‡
*MIT CSAIL, †Google, Inc., ‡HP Labs

Abstract

In cloud-computing systems, network-bandwidth guarantees have been shown to improve predictability of application performance and cost [1, 28]. Most previous work on cloud-bandwidth guarantees has assumed that cloud tenants know what bandwidth guarantees they want [1, 17]. However, as we show in this work, application bandwidth demands can be complex and time-varying, and many tenants might lack sufficient information to request a guarantee that is well-matched to their needs, which can lead to over-provisioning (and thus reduced cost-efficiency) or under-provisioning (and thus poor user experience).

We analyze traffic traces gathered over six months from an HP Cloud Services datacenter, finding that application bandwidth consumption is both time-varying and spatially inhomogeneous. This variability makes it hard to predict requirements. To solve this problem, we develop a prediction algorithm usable by a cloud provider to suggest an appropriate bandwidth guarantee to a tenant. With tenant VM placement using these predictive guarantees, we find that the inter-rack network utilization in certain datacenter topologies can be more than doubled.

1 Introduction

This paper introduces predictive guarantees, a new abstraction for bandwidth guarantees in cloud networks. A provider of predictive guarantees observes traffic along the network paths between virtual machines (VMs), and uses those observations to predict the data rate requirements over a future time interval. The cloud provider can offer a tenant a guarantee based on this prediction.

Why should clouds offer bandwidth guarantees, and why should they use predictive guarantees in particular? Cloud computing infrastructures, especially public "Infrastructure-as-a-Service" (IaaS) clouds, such as those offered by Amazon, HP, Google, Microsoft, and others, are being used not just by small companies, but also by large enterprises. For distributed applications involving significant inter-node communication, such as in [1], [23], and [24], current cloud systems fail to offer even basic network performance guarantees; this inhibits cloud use by enterprises that must provide service-level agreements.

Others have proposed mechanisms to support cloud bandwidth guarantees (see §2); with few exceptions, these works assume that tenants know what guarantee they want, and require the tenants to explicitly specify bandwidth requirements, or to request a specific set of network resources [1, 17]. But do cloud customers really know their network requirements? Application bandwidth demands can be complex and time-varying, and not all application owners accurately know their bandwidth demands. A tenant's lack of accurate knowledge about its future bandwidth demands can lead to over- or under-provisioning.

For many (but not all) cloud applications, future bandwidth requirements are in fact predictable. In this paper, we use network traces from HP Cloud Services [11] to demonstrate that tenant bandwidth demands can be time-varying and spatially inhomogeneous, but can also be predicted, based on automated inference from their previous history.

We argue that predictive guarantees provide a better abstraction than prior approaches, for three reasons. First, the predictive guarantee abstraction is simpler for the tenant, because the provider automatically predicts a suitable guarantee and presents it to the tenant.

Second, the predictive guarantee abstraction supports time- and space-varying demands. Prior approaches typically offer bandwidth guarantees that are static in at least one of those respects, but these approaches do not capture general cloud applications. For example, we expect temporal variation for user-facing applications with diurnally-varying workloads, and spatial variation in VM-to-VM traffic for applications such as three-tier services.

Third, the predictive guarantee abstraction easily supports fine-grained guarantees. By generating guarantees automatically, rather than requiring the tenant to specify them, we can feasibly support a different guarantee on each VM-to-VM directed path, and for relatively short time intervals. Fine-grained guarantees are potentially more efficient than coarser-grained guarantees, because they allow the provider to pack more tenants into the same infrastructure.

Recent research has addressed these issues in part. For example, Oktopus [1] supports a limited form of spatial variation; Proteus [28] supports a limited form of temporal-variation prediction. But no prior work, to our knowledge, has offered a comprehensive framework for cloud customers and providers to agree on efficient network-bandwidth guarantees for applications with time-varying and space-varying traffic demands.

We place predictive guarantees in the concrete context of Cicada¹, a system that implements predictive guarantees. Cicada observes a tenant's past network usage to predict its future network usage, and acts on its predictions by offering predictive guarantees to the customer, as well as by placing (or migrating) VMs to increase utilization and improve load balancing in the provider's network. Cicada is therefore beneficial to both the cloud provider and cloud tenants. More details about Cicada are available in [15].

¹Thanks to Dave Levin, Cicada is an acronym: Cicada Infers Cloud Application Demands Automatically.
2 Related Work

Throughout, we use the following definitions. A provider refers to an entity offering a public cloud (IaaS) service. A customer is an entity paying for use of the public cloud. We distinguish between customer and tenant, as tenant has a specific meaning in OpenStack [20]: operationally, a collection of VMs that communicate with each other. A single customer may be responsible for multiple tenants. Finally, application refers to software run by a tenant; a tenant may run multiple applications at once.

Recent research has proposed various forms of cloud network guarantees. Oktopus supports a two-stage "virtual oversubscribed cluster" (VOC) model [1], intended to match a typical application pattern in which clusters of VMs require high intra-cluster bandwidth and lower inter-cluster bandwidth. VOC is a hierarchical generalization of the hose model [5]; the standard hose model, as used in MPLS, specifies for each node its total ingress and egress bandwidths. The finer-grained pipe model specifies bandwidth values between each pair of VMs. CloudMirror allows tenants to specify a tenant abstraction graph, which reflects the structure of the application itself [17]. Cicada supports any of these models, and unlike these systems, does not require the tenant to specify network demands up front.

The Proteus system [28] profiles specific MapReduce jobs at a fine time scale, to exploit the predictable phased behavior of these jobs. It supports a "temporally interleaved virtual cluster" model, in which multiple MapReduce jobs are scheduled so that their network-intensive phases do not interfere with each other. Proteus assumes uniform all-to-all hose-model bandwidth requirements during network-intensive phases, although each such phase can run at a different predicted bandwidth. Proteus (and other related work [13, 19]) does not generalize to a broad range of enterprise applications as Cicada does.

Hajjat et al. [9] describe a technique to decide which application components to place in a cloud datacenter, for hybrid enterprises where some components remain in a private datacenter. Their technique tries to minimize the traffic between the private and cloud datacenters, and hence recognizes that inter-component traffic demands are spatially non-uniform. In contrast to Cicada, they do not consider time-varying traffic nor how to predict it, and they focus primarily on the consequences of wide-area traffic, rather than intra-datacenter traffic.

Cicada does not focus on the problem of enforcing guarantees [8, 16, 22, 25] nor the tradeoff between guarantees and fairness [21]. Though these issues would arise for a provider using Cicada, they can be addressed with any of the above techniques.

3 The Design of Cicada

The goal of Cicada is to free tenants from choosing between under-provisioning for peak periods and over-paying for unused bandwidth. Predictive guarantees permit a provider and tenant to agree on a guarantee that varies in time and/or space. The tenant can get the network service it needs at a good price, while the provider can avoid allocating unneeded bandwidth and can amortize its infrastructure across more tenants.

3.1 An Overview of Cicada

Cicada has several components, corresponding to the steps in Figure 1. After determining whether to admit a tenant—taking CPU, memory, and network resources into account—and making an initial placement (steps 1 and 2), Cicada measures the tenant's traffic (step 3), and delivers a time series of traffic matrices to a logically centralized controller. The controller uses these measurements to predict future bandwidth requirements (step 4). In most cases, Cicada converts a bandwidth prediction into an offered guarantee for some future interval. Customers may choose to accept or reject Cicada's predictive guarantees (step 5). Because Cicada collects measurement data continually, it can make new predictions and offer new guarantees throughout the lifetime of the tenant.

Figure 1: Cicada's architecture. (1) The customer makes an initial request; (2) the provider places the tenant's VMs and sets initial rate limits, which hypervisors enforce while collecting measurements; (3) hypervisors continually send measurements to the Cicada controller; (4) Cicada predicts bandwidth for each tenant; (5a) the provider may offer the customer a guarantee based on Cicada's prediction; (5b) the provider may update the placement based on Cicada's prediction.

Cicada interacts with other aspects of the provider's infrastructure and control system. The provider needs to rate-limit the tenant's traffic to ensure that no tenant undermines the guarantees sold to other customers. We distinguish between guarantees and limits. If the provider's limit is larger than the corresponding guarantee, tenants can exploit best-effort bandwidth beyond their guarantees.

A provider may wish to place VMs based on their associated bandwidth guarantees, to improve network utilization. We describe a VM placement method in §6. The provider could also migrate VMs to increase the number of guarantees that the network can support [6].
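To make the control loop of Figure 1 concrete, the following minimal Python sketch shows how a controller might tie steps 3 through 5 together. The class and all of the interfaces it calls (predictor.predict, provider.cap, provider.offer, provider.set_rate_limits, provider.maybe_update_placement) are illustrative assumptions for exposition, not Cicada's actual API.

# Illustrative sketch of the Section 3.1 control loop (steps 3-5). All interfaces
# used here are assumed for exposition; they are not Cicada's implementation.
from collections import defaultdict

class ControllerSketch:
    def __init__(self, predictor, min_history_epochs=12):
        self.predictor = predictor                # e.g., the predictor of Section 5
        self.min_history_epochs = min_history_epochs  # Section 3.2: an hour or two of data
        self.history = defaultdict(list)          # tenant -> list of traffic matrices

    def ingest(self, tenant, traffic_matrix):
        # Step 3: hypervisors report a VM-to-VM traffic matrix for each epoch.
        self.history[tenant].append(traffic_matrix)

    def run_epoch(self, provider):
        for tenant, matrices in self.history.items():
            # Skip tenants without enough history to predict reliably.
            if len(matrices) < self.min_history_epochs:
                continue
            prediction = self.predictor.predict(matrices)        # step 4
            guarantee = provider.cap(tenant, prediction)         # provider-imposed caps
            if provider.offer(tenant, guarantee):                # step 5a: tenant may accept
                provider.set_rate_limits(tenant, guarantee)
            provider.maybe_update_placement(tenant, prediction)  # step 5b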
3.2 Assumptions

In order to make predictions, Cicada needs to gather sufficient data about a tenant's behavior. Based on our evaluation, Cicada may need at least an hour or two of data before it can offer useful predictions.

Any shared resource that provides guarantees must include an admission control mechanism, to avoid making infeasible guarantees. We assume that Cicada will incorporate network admission control using an existing mechanism [1, 12, 14]. We also assume that the cloud provider has a method to enforce guarantees, such as [21].

3.3 Measurement Collection

Cicada collects a time series of traffic matrices for each tenant, either passively, by collecting NetFlow or sFlow data at switches within the network, or by using an agent that runs in the hypervisor of each machine. We would like to base predictions on offered load, but when the provider imposes rate limits, we risk underestimating peak loads that exceed those limits, and thus under-predicting future traffic. We can detect when a VM's traffic is rate-limited via packet drops, so these underestimates can be detected, too, although not precisely quantified. Currently, Cicada does not account for this potential error.

3.4 Prediction Model

Some applications, such as backup or database ingestion, require bulk bandwidth, and need guarantees that the average bandwidth over a period of H hours will meet their needs (e.g., a guarantee that two VMs will be able to transfer 3.6 Gbit in an hour, but not necessarily at a constant rate of 1 Mbit/s). Other applications, such as user-facing systems, require guarantees for peak bandwidth over much shorter intervals (e.g., a guarantee that two VMs will be able to transfer at a rate of 1 Mbit/s for the next hour).

Cicada's predictions describe the maximum bandwidth expected during any averaging interval δ during a given time interval H. If δ = H, the prediction is for the bandwidth requirement averaged over H hours, but for an interactive application, δ might be just a few milliseconds.

Note that the predictive guarantees offered by Cicada are limited by any caps set by the provider; thus, a proposed guarantee might be lower than suggested by the prediction algorithm.

A predictive guarantee entails some risk of either under-provisioning or over-provisioning, and different tenants will have different tolerances for these risks, typically expressed as a percentile (e.g., the tenant wants sufficient bandwidth for 99.99% of the 10-second intervals). Cicada's prediction algorithm also recognizes conditions under which the prediction is unreliable, in which case Cicada does not propose a guarantee for a tenant.
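To make the roles of δ and H concrete, the small Python sketch below computes, from a per-epoch byte-count series for one VM pair, both the H-average rate and the peak δ-averaged rate within H. The function names and the list-based input format are assumptions for exposition, and the peak is computed at epoch granularity rather than at millisecond resolution.

# Illustrative only: computes the two quantities a Section 3.4 guarantee targets,
# the average rate over H and the peak delta-averaged rate within H.
def average_rate_bps(byte_counts, epoch_seconds):
    """Average rate (bits/s) over H = len(byte_counts) * epoch_seconds."""
    return 8 * sum(byte_counts) / (len(byte_counts) * epoch_seconds)

def peak_rate_bps(byte_counts, epoch_seconds, delta_seconds):
    """Maximum rate (bits/s) averaged over any delta-length window within H."""
    window = max(1, min(len(byte_counts), int(delta_seconds // epoch_seconds)))
    return max(
        8 * sum(byte_counts[i:i + window]) / (window * epoch_seconds)
        for i in range(len(byte_counts) - window + 1)
    )

# Example: five-minute epochs over one hour (H = 1 hour), with delta = 10 minutes.
bytes_per_epoch = [450e6, 30e6, 10e6, 900e6, 1200e6, 40e6,
                   20e6, 15e6, 10e6, 25e6, 30e6, 50e6]
print(average_rate_bps(bytes_per_epoch, 300))    # bulk guarantee (delta = H)
print(peak_rate_bps(bytes_per_epoch, 300, 600))  # peak guarantee (delta = 10 minutes)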
3.5 Recovering from Faulty Predictions

Cicada's prediction algorithm may make faulty predictions because of inherent limitations or insufficient prior information. Because Cicada continually collects measurements, it can detect when its current guarantee is inappropriate for the tenant's current network load.

When Cicada detects a faulty prediction, it can take one of many actions: stick to the existing guarantee, propose a new guarantee, upgrade the tenant to a higher, more expensive guarantee, etc. How and whether to upgrade guarantees, as well as what to do if Cicada over-predicts, is a pricing-policy decision, and outside our scope. We note, however, that typical Service Level Agreements (SLAs) include penalty clauses, in which the provider agrees to remit some or all of the customer's payment if the SLA is not met.

4 Measurement Results

We designed Cicada under the assumption that the traffic from cloud tenants is predictable, but in ways that are not captured by existing models. Before building Cicada, we collected data from HP Cloud Services, to analyze the spatial and temporal variability of its tenants' traffic.

4.1 Data

We have collected sFlow [27] data from HP Cloud Services, which we refer to as the HPCS dataset. Our dataset consists of about six months of samples from 71 Top-of-Rack (ToR) switches. Each ToR switch connects to either 48 servers via 1GbE NICs, or 16 servers via 10GbE NICs. In total, the data represents about 1360 servers, spanning several availability zones. This dataset differs from those in previous work—such as [2, 3, 7]—in that it captures VM-to-VM traffic patterns.

We aggregate the data over five-minute intervals, to produce datapoints of the form ⟨timestamp, source, destination, number of bytes transferred from source to destination⟩. We keep only the datapoints where both the source and destination are private IPs of virtual machines, and thus all our datapoints represent traffic within the datacenter. Making predictions about traffic traveling outside the datacenter or between datacenters is a challenging task, which we do not address in this work.

Under the agreement by which we obtained this data, we are unable to reveal information such as the total number of tenants, the number of VMs per tenant, or the growth rates for these values.

4.2 Spatial and Temporal Variability

To quantify spatial variability, we compare the observed tenants to a static, all-to-all tenant. This tenant is the easiest to make predictions for: every intra-tenant connection sends the same amount of data at a constant rate. Let F_ij be the fraction of this tenant's traffic sent from VM_i to VM_j. For this "ideal" all-to-all tenant, all of these values are equal; every VM sends the same amount of data to every other VM. As another example, consider a bimodal tenant with some pairs that never converse and others that converse equally. Let k = n/2. Each VM communicates with k VMs, and sends 1/k of its traffic to each; hence half of the F values are 1/k, and the other half are zero.

For each tenant in the HPCS dataset, we calculate the distribution of its F values, and plot the coefficient of variation in Figure 2. The median cov value is 1.732, which suggests nontrivial overall spatial variation (for contrast, the cov of our ideal tenant is zero, and the cov of our bimodal tenant is one²).

²For the bimodal tenant, µ = 1/(2k) and σ = √((1/n) Σ_{i=1}^{n} (1/(2k))²) = 1/(2k).

To quantify temporal variability, we first pick a time interval H. For each consecutive H-hour interval, we calculate the sum of the total number of bytes sent between each pair p of VMs, which gives us a distribution T_p of bandwidth totals. We then compute the coefficient of variation for this distribution, cov_p. The temporal variation for a tenant is the weighted sum of these values. The weight for cov_p is the bandwidth used by pair p, to reflect the notion that tenants where only one small flow changes over time are less temporally-variable than those where one large flow changes over time.

For each tenant in the HPCS dataset, we calculate its temporal variation value, and plot the CDF in Figure 2. As with spatial variation, most tenants have high temporal variability. This variability decreases as we increase the time interval H, but we see variability at all time scales.

Figure 2: Variation in the HPCS dataset. Left: CDF of spatial variation across tenants. Right: CDF of temporal variation, for time intervals H of 1, 4, and 24 hours.

These results indicate that a rigid model is insufficient for making traffic predictions. Given the high spatial cov values in Figure 2, a less strict but not entirely general model such as VOC [1] may not be sufficient either. Furthermore, the high temporal variation values indicate that static models cannot accurately represent the traffic patterns of the tenants in the HPCS dataset.

5 Cicada's Traffic Prediction Method

Much work has been done on predicting traffic matrices from noisy measurements such as link-load data [26, 29]. In these scenarios, prediction approaches such as Kalman filters and Hidden Markov Models—which try to estimate true values from noisy samples—are appropriate. Cicada, however, knows the exact traffic matrices observed in past epochs; its problem is to predict a future traffic matrix.

Cicada's prediction algorithm adapts Herbster and Warmuth's "tracking the best expert" idea [10], which has been successfully adapted before in wireless power-saving and energy reduction contexts [4, 18]. To predict the traffic matrix for epoch n+1, we use all previously observed traffic matrices, M_1, ..., M_n (below, we show that data from the distant past can be pruned away without affecting accuracy). In such a matrix, the entry in row i and column j specifies the number of bytes that VM i sent to VM j in the corresponding epoch (for predicting average bandwidth) or the maximum observed over a δ-length interval (for peak bandwidth).

The algorithm assigns each of the previous matrices a weight as part of a linear combination. The result of this combination is the prediction for M_{n+1}. Over time, these weights get updated; the higher the weight of a previous matrix, the more important that matrix is for prediction. In the HPCS dataset, we found that the 12 most recent hours of traffic are weighted heavily, and there is also a spike at 24 hours earlier. Weights are vanishingly small prior to this. Thus, at least in our dataset, one does not need weeks' worth of data to make reliable predictions.

We refer the reader to [15] for a full description of our prediction algorithm, including the method for updating the weights, as well as an evaluation against a VOC-style prediction algorithm [1]. To summarize those results, Cicada's prediction method decreases the median relative error by 90% compared to VOC when predicting either average or peak bandwidth. Cicada also makes predictions quickly. In the HPCS dataset, the mean prediction speed is under 10 msec in all but one case (under 25 msec in all cases).
6 Evaluation

Although Cicada's predictive guarantees allow cloud providers to decrease wasted bandwidth, it is not clear that these providers treat wasted intra-rack and inter-rack bandwidth equally. Inter-rack bandwidth may cost more, and even if intra-rack bandwidth is free, over-allocating network resources on one rack can prevent other tenants from being placed on the same rack.

To determine whether wasted bandwidth is intra- or inter-rack, we need a VM-placement algorithm. For VOC, we use the placement algorithm detailed in [1], which tries to place clusters on the smallest subtree that will contain them. For Cicada, we developed a greedy placement algorithm, detailed in Algorithm 1. This algorithm is similar in spirit to VOC's placement algorithm; Cicada's algorithm tries to place the most-used VM pairs on the highest-bandwidth paths, which in a typical datacenter corresponds to placing them on the same rack, and then the same subtree. However, since Cicada uses fine-grained, pipe-model predictions, it has the potential to allocate more flexibly; VMs that do not transfer data to one another need not be placed on the same subtree, even if they belong to the same tenant.

Algorithm 1 Cicada's VM placement algorithm
1: P = set of VM pairs, in descending order of their bandwidth predictions (GByte/hour)
2: for (src, dst) ∈ P do
3:   if resources aren't available to place (src, dst) then
4:     revert reservation
5:     return False
6:   A = all available paths
7:   if src or dst has already been placed then
8:     restrict A to paths including the appropriate endpoint
9:   Place (src, dst) on the path in A with the most available bandwidth
We compare Cicada's placement algorithm against VOC's on a simulated physical infrastructure with 71 racks with 16 servers each, 10 VM slots per server, and (10G/O_p) Gbps inter-rack links, where O_p is the physical oversubscription factor. For each algorithm, we select a random tenant, use the ground-truth data to determine this tenant's bandwidth needs for a random hour of its activity, and place its VMs. We repeat this process until 99% of the VM slots are filled. Using the ground-truth data allows us to compare the placement algorithms explicitly, without conflating this comparison with prediction errors. To get a sense of what would happen with more network-intensive tenants, we also evaluated scenarios where each VM-pair's relative bandwidth use was multiplied by a constant bandwidth factor (1×, 25×, or 250×).

Figure 3 shows how the available inter-rack bandwidth—what remains unallocated after the VMs are placed—varies with O_p, the physical oversubscription factor. In all cases, Cicada's placement algorithm leaves more inter-rack bandwidth available. When O_p is greater than two, both algorithms perform comparably, since this is a constrained environment with little bandwidth available overall. However, with lower oversubscription factors, Cicada's algorithm leaves more than twice as much bandwidth available, suggesting that it uses network resources more efficiently in this setting.

Figure 3: Inter-rack bandwidth available after placement, as a function of the oversubscription factor, for bandwidth factors of 1×, 25×, and 250×.

Over-provisioning reduces the value of our improved placement, but it does not remove the need for better guarantees. Even on an over-provisioned network, a tenant whose guarantee is too low for its needs may suffer if its VM placement is unlucky. A "full bisection bandwidth" network is only that under optimal routing; bad routing decisions or bad placement can still waste bandwidth.

7 Conclusion

This paper described the rationale for predictive guarantees in cloud networks, and the design of Cicada, a system that provides this abstraction to tenants. Cicada provides fine-grained, temporally- and spatially-varying guarantees without requiring the clients to specify their demands explicitly. We outlined a prediction algorithm where the prediction for a future epoch is a weighted linear combination of past observations, with the weights updated and learned online in an automatic way. Using traces from HP Cloud Services, we showed that the fine-grained structure of predictive guarantees can be used by a cloud provider to improve network utilization in certain datacenter topologies.

Acknowledgments

We are indebted to a variety of people at HP Labs and HP Cloud for help with data collection and for discussions of this work. In particular, we thank Sujata Banerjee, Ken Burden, Phil Day, Eric Hagedorn, Richard Kaufmann, Jack McCann, Gavin Pratt, Henrique Rodrigues, and Renato Santos. This work was supported in part by the National Science Foundation under grant IIS-1065219 as well as an NSF Graduate Research Fellowship.

References

[1] Ballani, H., Costa, P., Karagiannis, T., and Rowstron, A. Towards Predictable Datacenter Networks. In SIGCOMM (2011).
[2] Benson, T., Akella, A., and Maltz, D. A. Network Traffic Characteristics of Data Centers in the Wild. In IMC (2010).
[3] Benson, T., Anand, A., Akella, A., and Zhang, M. Understanding Data Center Traffic Characteristics. In WREN (2009).
[4] Deng, S., and Balakrishnan, H. Traffic-aware Techniques to Reduce 3G/LTE Wireless Energy Consumption. In CoNEXT (2012).
[5] Duffield, N. G., Goyal, P., Greenberg, A., Mishra, P., Ramakrishnan, K. K., and van der Merwe, J. E. A Flexible Model for Resource Management in Virtual Private Networks. In SIGCOMM (1999).
[6] Erickson, D., Heller, B., Yang, S., Chu, J., Ellithorpe, J., Whyte, S., Stuart, S., McKeown, N., Parulkar, G., and Rosenblum, M. Optimizing a Virtualized Data Center. In SIGCOMM Demos (2011).
[7] Greenberg, A., Hamilton, J. R., Jain, N., Kandula, S., Kim, C., Lahiri, P., Maltz, D. A., Patel, P., and Sengupta, S. VL2: A Scalable and Flexible Data Center Network. In SIGCOMM (2009).
[8] Guo, C., Lu, G., Wang, H. J., Yang, S., Kong, C., Sun, P., Wu, W., and Zhang, Y. SecondNet: A Data Center Network Virtualization Architecture with Bandwidth Guarantees. In CoNEXT (2010).
[9] Hajjat, M., Sun, X., Sung, Y.-W. E., Maltz, D., Rao, S., Sripanidkulchai, K., and Tawarmalani, M. Cloudward Bound: Planning for Beneficial Migration of Enterprise Applications to the Cloud. In SIGCOMM (2010).
[10] Herbster, M., and Warmuth, M. K. Tracking the Best Expert. Machine Learning 32 (1998).
[11] HP Cloud. http://hpcloud.com.
[12] Jamin, S., Shenker, S. J., and Danzig, P. B. Comparison of Measurement-based Admission Control Algorithms for Controlled-load Service. In Infocom (1997).
[13] Kim, G., Park, H., Yu, J., and Lee, W. Virtual Machines Placement for Network Isolation in Clouds. In RACS (2012).
[14] Knightly, E. W., and Shroff, N. B. Admission Control for Statistical QoS: Theory and Practice. IEEE Network 13, 2 (1999).
[15] LaCurts, K., Mogul, J. C., Balakrishnan, H., and Turner, Y. Cicada: Predictive Guarantees for Cloud Network Bandwidth. Tech. Rep. MIT-CSAIL-TR-004, Massachusetts Institute of Technology, 2014.
[16] Lam, T. V., Radhakrishnan, S., Pan, R., Vahdat, A., and Varghese, G. NetShare and Stochastic NetShare: Predictable Bandwidth Allocation for Data Centers. SIGCOMM CCR 42, 3 (2012).
[17] Lee, J., Lee, M., Popa, L., Turner, Y., Banerjee, S., Sharma, P., and Stephenson, B. CloudMirror: Application-Aware Bandwidth Reservations in the Cloud. In HotCloud (2013).
[18] Monteleoni, C., Balakrishnan, H., Feamster, N., and Jaakkola, T. Managing the 802.11 Energy/Performance Tradeoff with Machine Learning. Tech. Rep. MIT-LCS-TR-971, Massachusetts Institute of Technology, 2004.
[19] Niu, D., Feng, C., and Li, B. Pricing Cloud Bandwidth Reservations Under Demand Uncertainty. In Sigmetrics (2012).
[20] OpenStack Operations Guide - Managing Projects and Users. http://docs.openstack.org/trunk/openstack-ops/content/projects_users.html.
[21] Popa, L., Kumar, G., Chowdhury, M., Krishnamurthy, A., Ratnasamy, S., and Stoica, I. FairCloud: Sharing the Network in Cloud Computing. In SIGCOMM (2012).
[22] Raghavan, B., Vishwanath, K., Ramabhadran, S., Yocum, K., and Snoeren, A. C. Cloud Control with Distributed Rate Limiting. In SIGCOMM (2007).
[23] Rasmussen, A., Conley, M., Kapoor, R., Lam, V. T., Porter, G., and Vahdat, A. Themis: An I/O-Efficient MapReduce. In SOCC (2012).
[24] Rasmussen, A., Porter, G., Conley, M., Madhyastha, H., Mysore, R. N., Pucher, A., and Vahdat, A. TritonSort: A Balanced Large-Scale Sorting System. In NSDI (2011).
[25] Rodrigues, H., Santos, J. R., Turner, Y., Soares, P., and Guedes, D. Gatekeeper: Supporting Bandwidth Guarantees for Multi-tenant Datacenter Networks. In WIOV (2011).
[26] Roughan, M., Thorup, M., and Zhang, Y. Traffic Engineering with Estimated Traffic Matrices. In IMC (2003).
[27] sFlow. http://sflow.org.
[28] Xie, D., Ding, N., Hu, Y. C., and Kompella, R. The Only Constant is Change: Incorporating Time-Varying Network Reservations in Data Centers. In SIGCOMM (2012).
[29] Zhang, Y., Roughan, M., Duffield, N., and Greenberg, A. Fast Accurate Computation of Large-Scale IP Traffic Matrices from Link Loads. In Sigmetrics (2003).