Load-balancing EU-DataGrid Resource Brokers
Proc UK e-Science All Hands Meeting 2003, © EPSRC Sept 2003, ISBN 1-904425-11-9

William Lee, Steve McGough, Steven Newhouse, John Darlington
London e-Science Centre, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
Email: lesc-staff@doc.ic.ac.uk

Abstract

The European DataGrid (EDG)[17] project aims to provide a platform that satisfies the ever-growing demand for computation and storage across scientific disciplines. Its resource broker is the gateway to a managed set of compute elements and handles job submission and accounting. In this paper, we present an infrastructure layered on top of the resource brokers for load-balancing job submission requests. The resource brokers are abstracted as OGSI-compliant services to provide interoperability with heterogeneous submission clients. We demonstrate the use of Javaspace as an information distribution framework that lets different strategic elements co-operatively load-balance job submission requests.

1 Introduction

The European Data Grid (EDG)[17] is a platform to support intensive computational analysis of extremely large-scale distributed datasets across widely distributed scientific communities.

The Resource Broker (RB) in the EDG Work Package 1[1] is the gateway for submitting jobs to a managed set of compute elements. End-users wishing to submit jobs to the EDG platform have access to a collection of command-line and graphical tools. Application Programming Interfaces (APIs) for C++, Python and Java are available for custom clients. By hiding the network protocol details behind a well-defined set of programmable interfaces, the API allows the underlying protocols to be refined progressively without requiring clients to be rewritten.

The Resource Broker is based on the traditional client/server model[16]. The RB daemon listens on a well-known port and uses the Grid Security Infrastructure (GSI)[7] to ensure client authenticity. The daemon passes incoming job requests to a set of multi-threaded agents, which communicate with the EDG subsystems to submit and manage jobs. The stability and scalability of the resource broker are therefore essential for the adoption of the infrastructure.

We recognise the need to allow heterogeneous clients to interact with the resource broker in any imaginative way. Clients can range from end-user submission interfaces to autonomous agents managing large batch jobs on behalf of users. In order to free the clients from the programming-language-dependent API, we leverage the Open Grid Services Infrastructure (OGSI)[18] to provide a web service environment to support client interaction.

The objectives of the load-balancing infrastructure presented in this paper are to pool and encapsulate a set of resource brokers as co-operative OGSI services:

- Flexibly couple a dynamic set of Resource Brokers to form a submission pool.
- Provide a well-defined OGSI-compliant service interface for submitting and managing EDG jobs.
- Distribute the job management service across co-operative OGSI service containers to reduce load and avoid a single point of failure.

2 Background

2.1 The European Data Grid Resource Broker

The Resource Broker is a piece of middleware responsible for carrying out a set of tasks related to user job submission. These tasks include interacting with the Replica Catalog (RC) to resolve logical data set names, finding a preliminary set of sites for data transfer, and job submission and management by interacting with EDG sub-systems such as the Job Submission Service (JSS).

The resource broker is a daemon process that listens on a TCP/IP socket for client requests. Upon
client interaction, a thread is spawned to handle the client messages using a newly allocated port. The RB master daemon acts as the preliminary broker that assigns an agent thread to each individual client. The agent acts as the hub for carrying out tasks by communicating with the JSS, the RC and the local Job Registry (JR) database through the DBMS interface to ensure job state consistency and persistence of queued jobs.

The design suffers from the problems of the traditional client-server model.

- Clients address the RB they wish to use by host and port. The reliability and availability of the resource broker are therefore directly exposed to potential users.
- The Resource Broker is a single point of failure, and a well-known RB might attract a disproportionately high load.
- Under high load, available ports become scarce and the master daemon becomes the bottleneck of the resource broker. Scalability is limited by the machine architecture and operating system.

2.2 Open Grid Services Infrastructure

The Open Grid Services Infrastructure (OGSI)[18] has brought about a convergence of the grid and web services communities. It leverages commercially supported web services protocols to build a Grid infrastructure. OGSI adopts the general web service approach of describing the abstract interface and the implementation details of all Grid services using WSDL. The network binding and the messaging layer are interchangeable. This flexibility allows existing protocols and future standards to be described through a unified interface language.

The core contribution of the OGSI specification is the standardisation of a set of core service types that are essential for distributed computing. The service port types include Factory for creating new services, Notification-related port types for managing subscriptions and receiving notifications, etc. OGSI introduces the notion of a Grid Service Handle (GSH), which acts as a globally unique pointer for locating a service through a handle resolver. A GSH is resolved by a HandleResolver service into a Grid Service Reference (GSR). The GSR acts as the binding-specific network pointer to the service. In the case of a GSH resolvable to a GSR expressed as WSDL, the WSDL description would contain the network location, protocol and messaging characteristics of the service, which the client can access. By differentiating the service identity from the network details, OGSI decouples the client from the locality of the service and gives the service an opportunity to handle migration or fault recovery.

3 Load-balancing Resource Broker

The resource broker can be considered as a load-balancer for compute elements. It decides on a preliminary set of compute elements where the jobs can be launched based on the user requirements, such as priority, architecture and the current loads. In order to avoid overloading a particular resource broker instance, our approach is to layer a meta load-balancer on top of a collection of resource brokers. The meta load-balancer acts as the gateway for parallelising requests to resource brokers acting as backends. Several options have been considered:

- By emulating the resource broker master daemon, one can delegate requests to backend resource brokers using a simple round-robin strategy. The agent thread is spawned on the delegated resource broker, and the URL for the client to contact the agent thread is returned to the user through the load-balancing daemon. Later conversation with the agent thread is performed directly with the RB host. This scheme allows a dedicated machine to be used to delegate requests to backend resource brokers, reducing the load on the master daemon of each RB. However, it still presents a single point of failure. Since the network protocol is well encapsulated by the API, emulating the resource broker daemon demands reverse engineering the protocol, which is error-prone and vulnerable to later changes in the protocol.
- Hardware load-balancers have been widely used in client-server environments[6]. Instead of exposing backend resources through delegation, a hardware load-balancer hides the backend resources behind a single network identity. It provides an efficient means of forwarding network packets to resources based on their operational metrics and loads.
  The load-balancer will keep track of session information based on some rules, such as client IP address, protocol content, etc., to ensure that later conversational packets are routed to the same backend resource for consistency. However, this approach is mainly used in a cluster environment with a high-speed network within one organisation. It might not be suitable for multiple resource brokers in a geographically dispersed virtual organisation. Also, the lack of protocol transparency makes it difficult to establish session rules based on the packet content.

4 Load-balancing Architecture

The Load-balancing Architecture depicted in figure 1 is built on a clear separation between information distribution and decision-making. We believe load-balancing strategies are pluggable entities that are interchangeable based on the usage pattern as well as organisational policies.

The information distribution channel is termed the Information Tuplespace, which serves as the core of the infrastructure. Client interaction takes place in the OGSI and web-services layer. The OGSI services provide an entry point to the system. They introduce job submission request information into the Tuplespace in the Pull strategy, or optionally Push a request to the Resource Broker Agent. Each agent represents a Resource Broker instance. It strategically pulls requests from the Tuplespace and instructs the Resource Broker to submit the job through the RB Client API. The agent is also responsible for introducing a response tuple into the Tuplespace to acknowledge a submitted job. This provides a natural load-balancing scheme for any participating OGSI containers to provide the job management function.

4.1 Characteristics

The load-balancing architecture presented here exhibits the following characteristics:

- Interoperability - In order to increase adoption of the EDG platform, the functionality provided by the resource broker is encapsulated as OGSI-compliant services. Job submission and management capabilities are abstracted as two port types:

  1. JobSubmissionFactory is an extension of the OGSI Factory port type that is responsible for instantiating a new instance of the JobManager service for a given job submission request. The JobSubmissionFactory will use the Resource Broker client API to contact a Resource Broker selected by the load-balancing infrastructure. JobSubmissionFactory service instances are expected to be advertised and discovered by clients through the many web service discovery mechanisms, such as a UDDI directory[14], the GT3 Index Service[4], etc.

  2. JobManager is a service accessible only by the user who instantiated it through the JobSubmissionFactory. Each instance of JobManager represents a single job submission. It encapsulates the management API for querying status and cancelling the job. The state of the JobManager service is held by the Resource Broker that is managing the job. The JobManager instance is therefore free to migrate from container to container, as long as the resolver service resolves the GSH of the JobManager instance to a service running among a set of co-operative containers. The JobManager instance is a transient service. Its lifetime is determined by the status of the job it represents as well as the client's desire to terminate the job. When the job has reached the JOB-OUTPUT-TRANSFERED state, the JobManager instance will terminate after a configurable period to reclaim used resources. The JobManager port type extends the OGSI NotificationSource port type. Clients can register NotificationSink instances with the JobManager to receive important notifications, such as changes to the job state service data.

  Our implementation is based on the Globus Toolkit 3.0 (GT3)[3] core distribution. This is a Java-based reference implementation of the OGSI specification. We have enhanced the GT3 service container by allowing instances created by the Factory port
[Figure 1: The load-balancing architecture. Labels legible in the extracted figure: Request Tuple, Response Tuple.]
[...] number of operations (e.g. take, read, write, notify) for manipulating tuples persistently stored in the space. We have chosen to use the reference implementation of Javaspace[10] provided by the Java JINI[11] technology. The Javaspace technology offers several advantages:

- Javaspace is an API for a Java LINDA system. Multiple implementations are available with different operational characteristics, such as persistence method, security and performance. By developing against the API, interchanging implementations for scalability purposes would only result in minimal changes in deployment.
- Javaspace operations are transaction aware. By using the Java Transaction API (JTA)[9], the system can ensure ACID properties for job submission.
- The Javaspace implementation persists tuple states into long-term storage. Failure of the Javaspace node can be recovered from by replaying the persistence log. Javaspace clients refer to the tuplespace by name and are therefore unaffected by network migration.
- Most importantly, a tuple published to a Javaspace not only carries information presented as the state of a Java object. The Java object can also carry executable content, a capability inherited from the JINI technology. We have used this characteristic to implement the Job Courier Tuple that routes job requests to other spaces.

4.2.1 Job Request Tuple

A Job Request Tuple Req consists of

\[ \mathit{Req} = \{\mathit{GUID},\ \mathit{GSH},\ \mathit{UID},\ \mathit{JDL}\} \tag{1} \]

- GUID - A globally unique identifier generated per job request to identify a job submission through the system.
- GSH - The Grid Service Handle of the JobSubmissionFactory that handles the user request.
- UID - The credential of the user who submits the job. This is an org.ietf.jgss.GSSCredential object representing the GSI X.509 certificate of the user.
- JDL - The job description document specifying the user intent.

This tuple is generated by the JobSubmissionFactory when a createService request is received. The factory introduces this tuple to the space and waits for a Job Response Tuple Resp that satisfies the template tuple with the PUB field marked as true and matching GUID and UID (\ast marks a field the template leaves unconstrained):

\[ \mathit{Resp} = \{\mathit{GUID},\ \ast,\ \mathit{UID},\ \mathit{true},\ \ast\} \tag{2} \]

The JobSubmissionFactory will return the JobManager locator denoted by the MGSH field of the tuple to the client.

4.2.2 Job Response Tuple

A Job Response Tuple Resp denotes a successful submission through a Resource Broker Agent. The Agent that has taken a Job Request Tuple {GUID, GSH, UID, JDL} from the Tuplespace will insert a tuple {GUID, null, UID, false, JOBID} into the space to indicate that the job is being processed by the EDG sub-system.

\[ \mathit{Resp} = \{\mathit{GUID},\ \mathit{MGSH},\ \mathit{UID},\ \mathit{PUB},\ \mathit{JOBID}\} \tag{3} \]

- GUID - A globally unique identifier matching the GUID field of a previously published Job Request Tuple Req.
- MGSH - The Grid Service Handle of the JobManager that handles the management request. A null value indicates that the job is not yet managed by any JobManager service.
- UID - The credential of the user, matching the UID field of the request tuple Req.
- PUB - A boolean field indicating whether a JobManager service is bound to this job.
- JOBID - The Job ID returned by the EDG Resource Broker for establishing the management conversation.

Once the Job Response Tuple is introduced into the space, an available OGSI container will take the tuple off the space and instantiate a JobManager service instance in the local container. The JobManager will use the JOBID field to converse with the RB Client management API. Upon creation of the service, the tuple is written back to the tuplespace with the MGSH field set to the GSH of the JobManager instance and PUB set to true. The JobSubmissionFactory that initiated the request will now be unblocked, with the locator available to be returned to the client.
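The response-tuple hand-off described in sections 4.2.1 and 4.2.2 can be sketched with a small in-memory stand-in for the tuplespace. This is only an illustration under stated assumptions: the classes RespTuple, SketchSpace and ResponseFlowSketch, the string-valued credential and the handle format are all hypothetical and do not come from the paper's implementation, and the sketch does not use the JavaSpaces API. It borrows one convention from Javaspace template matching, namely that a null field in a template acts as a wildcard.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical response tuple (GUID, MGSH, UID, PUB, JOBID); a null field in a template is a wildcard. */
final class RespTuple {
    final String guid;   // identifier shared with the originating request tuple
    final String mgsh;   // handle of the bound JobManager, null until one exists
    final String uid;    // user credential, reduced to a plain string in this sketch
    final Boolean pub;   // true once a JobManager has been bound to the job
    final String jobId;  // job ID returned by the Resource Broker

    RespTuple(String guid, String mgsh, String uid, Boolean pub, String jobId) {
        this.guid = guid;
        this.mgsh = mgsh;
        this.uid = uid;
        this.pub = pub;
        this.jobId = jobId;
    }

    /** Treats this tuple as a template: every non-null field must equal the candidate's field. */
    boolean matches(RespTuple candidate) {
        return fieldMatches(guid, candidate.guid)
            && fieldMatches(mgsh, candidate.mgsh)
            && fieldMatches(uid, candidate.uid)
            && fieldMatches(pub, candidate.pub)
            && fieldMatches(jobId, candidate.jobId);
    }

    private static boolean fieldMatches(Object templateField, Object value) {
        return templateField == null || templateField.equals(value);
    }
}

/** Minimal in-memory stand-in for the tuplespace; it is not the JavaSpaces API. */
final class SketchSpace {
    private final List<RespTuple> tuples = new ArrayList<>();

    synchronized void write(RespTuple tuple) {
        tuples.add(tuple);
    }

    /** Removes and returns the first tuple matching the template, or null if none matches. */
    synchronized RespTuple takeIfExists(RespTuple template) {
        for (int i = 0; i < tuples.size(); i++) {
            if (template.matches(tuples.get(i))) {
                return tuples.remove(i);
            }
        }
        return null;
    }
}

final class ResponseFlowSketch {
    /** Walks one job through the agent, container and factory steps; returns the published MGSH. */
    static String run() {
        SketchSpace space = new SketchSpace();

        // Agent: the job was handed to the Resource Broker, no JobManager is bound yet.
        space.write(new RespTuple("guid-1", null, "alice", false, "job-42"));

        // Factory template (2): same GUID and UID, PUB = true, other fields wildcards.
        RespTuple factoryTemplate = new RespTuple("guid-1", null, "alice", true, null);
        if (space.takeIfExists(factoryTemplate) != null) {
            throw new IllegalStateException("unpublished tuple must not match the factory template");
        }

        // Container: take any unpublished response, bind a JobManager, write the tuple back.
        RespTuple pending = space.takeIfExists(new RespTuple(null, null, null, false, null));
        String handle = "gsh://container-a/JobManager/" + pending.jobId; // made-up handle format
        space.write(new RespTuple(pending.guid, handle, pending.uid, true, pending.jobId));

        // Factory: the template now matches and yields the JobManager locator.
        return space.takeIfExists(factoryTemplate).mgsh;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The sketch polls with a non-blocking take to stay short; a factory waiting on a real tuplespace would more plausibly issue a blocking take with a timeout, so that it is unblocked as soon as a container publishes the tuple.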
4.2.3 Job Courier Tuple

The Job Request and Response Tuples provide the ingredients for a Pull load-balancing strategy. An agent pulls an available request tuple from the space when it is free to do so. However, when all the participating agents are busy, tuples are left in the space until their leases expire and they are reclaimed. Moreover, although the Pull strategy provides natural load-balancing amongst all the EDG Resource Brokers, the take operation does not guarantee any ordering, so a Push strategy is needed to create multiple priority queues.

A Job Courier Tuple is an executable entity that represents a slot an agent offers for immediate execution. A JobSubmissionFactory takes a courier tuple from the space when a job needs to be pushed to an agent as quickly as possible. When the submit method is invoked on the courier tuple, the tuple invokes the resource broker agent by its internal means to execute the job. The JobSubmissionFactory can decide from the return value whether it will create the JobManager service locally or delegate it back to the tuplespace as discussed in the pull strategy. Once a JobSubmissionFactory has finished using the courier, it can choose to release the courier tuple back into the tuplespace for others to use, or retain it in a greedy fashion.

4.3 Strategic Elements

In the last section, we have demonstrated the components that enable the Push, Pull and hybrid strategies. The infrastructure provides three distinct points for inserting different strategies.

- JobSubmissionFactory - The JobSubmissionFactory is responsible for either inserting a request into the tuplespace for an agent to retrieve, or looking for courier tuples for executing high-priority jobs. A well-known JobSubmissionFactory can aggressively retain a large number of courier tuples in advance. By reserving these available slots, a factory sustaining a high rate of submission can push jobs to the agents as quickly as possible.
- Resource Broker Agent - The Resource Broker Agent implementation takes request tuples from the space and introduces a response tuple when a job is passed to the EDG sub-system. The rate of retrieval is a variable in the agent strategy. This is typically controlled by the rate of failed submissions to the underlying resource broker. Under high load, we observe connection refusal by the resource broker master daemon. By throttling the retrieval rate, the agent attempts to minimise the failure rate and adapt to it accordingly.
- JobManager - The instantiation of JobManager is controlled by a manager thread running inside each participating OGSI container. The manager takes response tuples from the space and creates JobManager instances in the local container. The decision to create new instances is governed by a configurable maximum count, as well as the sum of invocation activity on all the local JobManager instances.

5 Discussion and Future Work

The load-balancing framework presented can thus be considered as a set of generic job submission and management services. The port types are designed to have no reference to the European Data Grid platform. OGSI Service Data allows us to advertise information related to the underlying job submission system the service supports. The job description is passed to the JobSubmissionFactory through the creationParameters extensibility element. The service data description advertises the support of the EDG JDL.

The JobManager contains operations for querying job status. However, different underlying submission systems have varying models of job states, and generic tools need to understand state changes in order to take appropriate actions. One possible solution is to use the service data description to advertise the possible state transitions for a given model as a directed graph. State graphs described ontologically could then be used to infer a mapping between different models.

The load-balancing infrastructure can scale to use any number of JobSubmissionService instances running on dedicated machines. Hardware load-balancers can be used to cluster multiple containers. Dedicated containers can be configured to serve as hosting environments for JobManager instances. They sustain higher activity than JobSubmissionService. Although the multi-layered approach will increase the latency of a job request-response compared to the direct use of the Resource Broker Client API, clients are opened up to a dynamic collection of resource brokers that can handle their requests. It can potentially reduce the failure rate considerably because of the transparent retry
semantics. The potential bottleneck of the system is the scalability of the information tuplespace. Multiple open-source as well as commercial implementations of Javaspace are available with different performance characteristics. Further performance testing is essential for evaluating different combinations of strategy and implementation under different load patterns.

6 Conclusion

This paper presents a secure framework that abstracts the European Data Grid Resource Broker as a set of OGSI job submission and management services. The generic abstraction allows the infrastructure to be applied to a variety of job submission systems. Moreover, the OGSI-based service-oriented architecture opens a wide integration avenue to heterogeneous clients. Performance profiling of resource brokers under high submission rates has shown a degradation of throughput and a build-up of long submission queues[5]. By hiding the monolithic Resource Broker as a component of a distributed job submission system, load can be shared between resource broker instances. This paper has demonstrated the design and implementation of the information tuplespace and the strategic elements that make use of this information to distribute job requests. A LINDA-inspired tuplespace based on the Javaspace API is an ideal candidate for publishing job requests to a bulletin board where agents respond to and execute the requests based on their strategies. The three tuple types are the ingredients for a push-pull hybrid approach to load-balancing.

Acknowledgement - The work has been carried out as part of the EPSRC project on Effective Multi-User and Multi-Job Resource Utilisation (GR/R74505/01).

References

[1] The European Data Grid work package 1. http://server11.infn.it/workload-grid/.
[2] The Globus Resource Specification Language RSL v1.0. http://www.globus.org/gram/rsl_spec1.html.
[3] The Globus Toolkit 3.0. http://www-unix.globus.org/toolkit/download.html.
[4] GT3 Index Service overview. http://www.globus.org/ogsa/releases/final/docs/infosvcs/indexsvc_overview.html.
[5] G. Avellino et al. The first deployment of workload management services on the EU DataGrid testbed: feedback on design and implementation. In Computing in High Energy and Nuclear Physics (CHEP03), March 2003.
[6] F5. Big-IP hardware load-balancer. http://www.f5.com/f5products/bigip/.
[7] I. Foster, C. Kesselman, G. Tsudik, and S. Tuecke. A Security Architecture for Computational Grids. In ACM Conference on Computer and Communications Security, pages 83–92, 1998.
[8] D. Gelernter. Generative communication in Linda. In ACM Transactions on Programming Languages and Systems, No. 1, pages 80–112, January 1985.
[9] Sun Microsystems. Java Transaction API specification v.1.0.1. http://java.sun.com/products/jta/index.html.
[10] Sun Microsystems. Javaspace service specification. http://java.sun.com/products/jini/2.0/doc/specs/html/js-spec.html.
[11] Sun Microsystems. Jini(tm) Network Technology. http://java.sun.com/jini/.
[12] Sun Microsystems. Sun ONE Grid Engine software. http://wwws.sun.com/software/gridware/.
[13] F. Pacini. Job Description Language How-to. http://server11.infn.it/workload-grid/docs/DataGrid-01-TEN-0102-0_2-Document.pdf.
[14] UDDI Project. Universal Description, Discovery and Integration (UDDI), September 2002. Available at http://www.uddi.org.
[15] R. Raman, M. Livny, and M. Solomon. Matchmaking: Distributed resource management for high throughput computing. In Proceedings of the Seventh IEEE International Symposium on High Performance Distributed Computing, Chicago, IL, USA, July 1998.
[16] S. Monforte and S. Cavalieri. Resource Broker Architecture and API. http://www.infn.it/workload-grid/docs/20010613-RBArch-2.doc.
[17] The DataGrid Project. http://www.eu-datagrid.org/.
[18] S. Tuecke, K. Czajkowski, I. Foster, J. Frey, S. Graham, C. Kesselman, D. Snelling, and P. Vanderbilt. Open Grid Service Infrastructure (OGSI) v.1.0 Specification, February 2003.