SMash: Secure Component Model for Cross-Domain Mashups on Unmodified Browsers
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
WWW 2008 / Refereed Track: Security and Privacy - Web Client Security April 21-25, 2008 · Beijing, China SMash: Secure Component Model for Cross-Domain Mashups on Unmodified Browsers Frederik De Keukelaere, Sumeer Bhola, Michael Steiner, Suresh Chari, Sachiko Yoshihama {eb41704, sachikoy}@jp.ibm.com, {sbhola, msteiner, schari}@us.ibm.com IBM Tokyo Research Laboratory, Kanagawa, Japan; IBM T.J. Watson Research Center, New York, USA ABSTRACT security requirements, like consumer banking sites and en- Mashup applications mix and merge content (data and code) terprise applications. Since existing browser security models from multiple content providers in a user’s browser, to pro- were defined and developed without anticipating such appli- vide high-value web applications that can rival the user ex- cations, these applications are pushing the boundaries of perience provided by desktop applications. Current browser current browser security models. security models were not designed to support such appli- Since content in a mashup application can stem from mu- cations and they are therefore implemented with insecure tually untrusting providers, it is clear that they should be workarounds. In this paper, we present a secure component built on a sound security foundation protecting the interests model, where components are provided by different trust of the various involved parties such as the content providers domains, and can interact using a communication abstrac- and the end-user. For example, consider a mashup applica- tion that allows ease of specification of a security policy. tion scenario of a car portal where information from multiple We have developed an implementation of this model that car dealers, insurance companies and the user’s bank could works currently in all major browsers, and addresses chal- be combined and co-resident on the user’s browser. It is lenges of communication integrity and frame-phishing. An clear that, at a minimum, we want certain security require- evaluation of the performance of our implementation shows ments to be enforced, such as the dealer’s scripts not being that this approach is not just feasible but also practical. able to modify each others car prices, nor should they be able to spy on a user’s bank account information. Categories and Subject Descriptors: D.2.0 [General]: The traditional browser security model dictates that con- Protection mechanisms, D.2.11 [Software Architectures]: In- tent from different origins 2 cannot interact with each other, formation hiding, domain-specific while content from the same origin can interact without con- General Terms: Security, design. straints (reading and writing of each others’ state). This Keywords: Web 2.0, browser, mashup, component model, model does not support mashups, where controlled interac- phishing. tion is desirable. To overcome the no interaction restriction, mashup developers typically enable interaction by either (1) using a web application proxy server which fetches the con- 1. INTRODUCTION tent from different servers and serves it to the mashup, or Web applications increasingly rely on extensive scripting (2) by directly including code and data from different ori- on the client-side (browser) using readily available JavaScript gins (using tags). In both cases, it appears to libraries. One motivation for this is to enable a browser user the browser that the mashup originates from a single origin, experience comparable to that of desktop applications. The though it contains content from different trust domains3 , extensive use of scripting on the client side and programming enabling uncontrolled interaction. paradigms such as AJAX [14] has also led to the growth In this paper we examine the problem of building se- of applications, called mashups, which mix and merge con- cure componentized mashups. We propose an abstraction on tent1 coming from different content providers in the browser. which security policies can be specified, and an implementa- Mashups are now prevalent in a number of contexts, includ- tion that realizes this abstraction for unmodified browsers. ing news websites, which have integrity requirements, or web email, which handles confidential information. They are es- 1.1 Our Approach sential to an advertising supported business model, and for Our abstraction involves encapsulating content from dif- allowing user-generated content in Web 2.0 websites. The ferent trust domains as components running in a browser tremendous additional value that can be provided to users by 2 mixing and merging content implies that such applications Origin is a pair of hostname (DNS domain or IP address) will eventually be prevalent in contexts with stricter data and URI-scheme (protocol) from a given URL. The “same- origin policy” [17] usually includes also the port. However, 1 We use the term content to refer to active content, i.e., this is not done by all browsers, e.g., Internet Explorer 6 both data and JavaScript. ignores ports when comparing origins. 3 Clearly trust domain is not the same as DNS domain. How- Copyright is held by the International World Wide Web Conference Com- ever, designing a secure solution for current browsers in- mittee (IW3C2). Distribution of these papers is limited to classroom use, volves some mapping between the two. In the remainder of and personal use by others. the paper, the term domain used alone refers to DNS domain WWW 2008, April 21–25, 2008, Beijing, China. or IP address. ACM 978-1-60558-085-2/08/04. 535
WWW 2008 / Refereed Track: Security and Privacy - Web Client Security April 21-25, 2008 · Beijing, China window. Components can be loaded and unloaded dynam- using channels, will provide a key primitive for secure com- ically, as the application in the browser window evolves. ponentized mashups. The mashup provider does not need Components are wired together using communication chan- to proxy all the components so that they appear to be from nels, which are the only means for them to interact in the the same origin, nor resort to other unsafe practices, such browser. The channel abstraction and security policies as- as directly including scripts from different domains. It is sociated with channels are implemented by an event hub also in line with the general principle of least privilege [22]: that is part of the Trusted Computing Base (TCB) from unavoidable programming and configuration errors mandate the mashup provider’s perspective. Components are loaded decomposition of web applications into small isolated units from their own server (as opposed to being proxied or using even when they come from the same administrative domain. a element) and isolated from the mashup applica- Finally, it provides a general security mechanism to guard tion code. This has security advantages in (1) not requiring against cross-site scripting (XSS) attacks, as discussed next. the component to completely trust the mashup application, In the context of attacks on web applications, XSS at- and (2) making the component abstraction compatible with tacks [23] are getting significant attention. These affect a password anti-phishing mechanisms that use the component special case of a mashup, where a website insecurely com- (DNS) domain or the certificate of the SSL connection, like bines content generated by the website with content gener- Passpet [28] and Microsoft CardSpace [15]. ated by its users, which are different trust domains. XSS By creating a high level abstraction for cross-component attacks allow user-generated malicious content to not just interaction, we ensure that the different communication tech- read and write website-generated (trusted) content, but also nologies used in our implementation can be replaced by, gives it access to browser-cached credentials (e.g., cookies) or even combined with, other technologies as they become for that trusted content. The common XSS mitigation ap- available on the browser platforms. proach is to disallow users to generate content that contains We realize the component abstraction using the HTML JavaScript, using content filters, but this is both (1) difficult element (Section 16.5 of [19]), which was designed to implement, resulting in many attacks against incomplete as a container for loading sub-documents inside the main content filtering, and (2) limiting for user creativity. Us- document. The technical challenges that we address in the ing our component abstraction, user-generated code can be implementation are (1) enabling parent to child document safely placed in a separate, less-trusted component. communication links when the parent and child are from The paper is structured as follows. In Section 2 we present different trust domains, (2) ensuring integrity and confiden- our secure component model, followed by a discussion of se- tiality of information on these links, and (3) guarding against curity policies in Section 3. Then we describe our imple- frame-phishing, where a malicious component in a mashup mentation and how we addressed various attack vectors in can change which component is loaded in another part of the Section 4. We present a performance evaluation in Section 5, mashup. Our implementation is a JavaScript library which discuss related work in Section 6, and conclude in Section 7. is used by the mashup and component providers. It is avail- able as open-source in the OpenAjax Alliance [1] Source- 2. SECURE COMPONENT MODEL Forge project. We have tested our library on Internet Ex- In this section, we describe our secure component model plorer (IE), Firefox, Safari, and Opera.4 for mashups. As discussed in the introduction, the current There are several competing technologies under develop- browser security model either allows different content to ar- ment that attempt to address the problem of secure mashups. bitrarily interact if they are from the same origin, or disal- We briefly contrast them here and discuss them in more de- lows all interaction if they are from different origins. Thus, tail in related work (see Section 6). Many of these proposals it is clear that the current browser security model is in- require HTML and browser modifications, but the long time sufficient for mashup applications. What is desirable is a line of adoption by standards committees, browser vendors, new security model that allows content to be separated by and eventually by end-users, makes these nonviable for any- trust domain, with a carefully mediated interaction between one wanting to build secure mashups in the near term. In ad- such separated content. Furthermore, to make this accessi- dition, there are some proposals that work without browser ble from a programmer’s perspective, there must be a high modifications. These are either targeted at client-server level programming interface that allows creation of secure cross-domain communication for mashups, which is a very mashups, and makes this easy. different programming model, or web widget/gadget deploy- Both from a programmability and usable security perspec- ments. Our components have some conceptual similarity tive, it is essential to follow the concept of information hid- with web widgets, but need not have a user-interface. Also, ing [18] by limiting the exposure of internal design and state. web widget deployments offer a programming abstraction Thus, our model is analogous to using a message-passing in- only to widget developers, not mashup developers. All these teraction style instead of a shared-memory style. We first proposals typically, (1) are vulnerable to frame-phishing, (2) present the model, and then explicitly specify the security consider untrusted components but not untrusted mashup requirements. applications. 2.1 Model 1.2 Building Secure Mashups Figure 1 graphically represents, in an abstract manner, We believe that using our high-level abstraction of com- our proposed model for secure mashups. The model consists ponents which communicate through a mediated event hub, of components, with input/output ports, and an event hub, 4 More specifically, the tested browser versions are Internet with mediated communication channels. Explorer 7.0, Firefox 2.0.0.7, Safari 3.0.3, and Opera 9.24. The mashup application, provided by the mashup provider, There is no technical reason, though, that the library would consists of an event hub and one or more components, which not work on different versions or other browsers. can be provided by third-party component providers. Com- 536
WWW 2008 / Refereed Track: Security and Privacy - Web Client Security April 21-25, 2008 · Beijing, China mechanisms, like Microsoft CardSpace [15] and Pass- pet [28], that take into account the (DNS) domain of the component, or the certificate of the SSL connec- tion. This is important if the component needs to authenticate the user to perform its function; (3) It retains the possibility of using cookies, secured by the DNS domain, as is still common practise for session au- thentication. This also has the benefit of not needing to deploy an additional web proxy. 4. Inter-component communication is secure: Malicious Figure 1: Component Model components can not affect the integrity and confiden- tiality of communication between other components. Specifically, the model provides one-way authentica- ponents could be visible or invisible. Visible components tion of components to the mashup application. Mutual share the browser display in a manner determined by the authentication could be implemented with a slightly mashup application. A component contains (active) content more elaborated protocol. However, the context in from one trust domain. The components, and therefore also which we use our client-side library does not require it: the trust domains, are logically separated and can only com- that aspect is addressed with server-side access control municate with each other through the mediated channels for component loading (section 3.2). implemented by the event hub. A component may spec- ify input/output ports (respectively depicted in the figure 5. Component loading and unloading is completely un- by white and gray ellipses) that define the types of input der the control of the mashup application. A ma- and output that the component expects in order to function licious component cannot replace a peer component properly. The event hub wires the component ports to the in another part of the application without detection appropriate channels (depicted by the arrows in Figure 1). and appropriate (context-specific) remediation by the The event hub is a publish/subscribe system with many- mashup application. to-many channels on which messages are published and dis- 6. Channel access control is completely under the control tributed. This model differs from a traditional publish/sub- of the mashup application. scribe model by separating the namespace of the compo- nent ports from the namespace of the channels. This avoids Limiting Trust in Mashup Application Based on these clashes in the namespace of the component ports and en- security requirements, a mashup application cannot directly hances the re-usability of the components. In Figure 1, Com- attack the integrity or confidentiality of a component. How- ponent A can publish to Channel 1 and Channel 3 and is ever, a component has to trust the mashup application for subscribed to Channel 2 and Channel 3. The channels sup- the integrity of its user interface (excluding user authenti- port asynchronous pass-by-value communication. An asyn- cation, as mentioned above). This is reasonable from an chronous model, though typically considered harder to pro- end-to-end perspective since the end-user trusts the mashup gram to, is a natural one for applications running in the application for integrity of the overall user interface (UI), browser, since they have significant logic for asynchronous and there is a simple delegation of trust from the mashup event handling related to user-interface (UI) events. In ad- application to components for parts of the UI. Note that dition, an asynchronous model allows more implementation the mashup application cannot attack the confidentiality of choices, since application code running in a browser is single- information displayed by a component in its UI. threaded. A component needs to trust the mashup application for The model can be extended hierarchically, i.e., a compo- integrity and confidentiality of communication with its peer nent could contain sub-components, and an event hub which components. Removing this trust is expensive without in- mediates between these sub-components. built browser support. The cross-document messaging pro- posal in HTML 5 (section 6.2) addresses this. 2.2 Security Requirements We explicitly list the security requirements for the previ- 3. SECURITY POLICY ously presented model, when embodied in a browser. The component model described in the previous section 1. The DOM (Document Object Model) [26] tree of each provides component isolation and mediated inter-component component is completely isolated from other compo- communication. To describe which component interactions nents. That is, there is no reading and writing of DOM are permitted and which are forbidden, we need to define a elements across trust domains. security policy for the model. The policy specification can be derived from many sources 2. The JavaScript namespace of each component is com- of high level policies. This is especially true for enterprise- pletely isolated from other components. class mashups, where the security policy may be a com- bination of enterprise-level policy, department policy, and 3. Components can be loaded directly from the compo- end-user policy. Component providers may want to express nent provider. From a security perspective, this has how their components are to be integrated into a mashup the following advantages: (1) A component is pro- application. For example, a component provider may con- tected (isolated) even from the mashup application strain what data is exposed by the component to a mashup code; (2) It is compatible with password anti-phishing provider, and what the mashup provider may do with it. 537
WWW 2008 / Refereed Track: Security and Privacy - Web Client Security April 21-25, 2008 · Beijing, China On the other hand, mashup providers may not trust compo- 4. IMPLEMENTING THE MODEL nents equally, and may want to prevent certain components In this section, we discuss our design and implementa- from communicating with each other. In general, security tion of the secure component model. Our approach is to policies for componentized mashups can have many facets, leverage the current browser security model to isolate com- depending on which entity specifies the policy. ponents belonging to different trust domains, and then to We have identified two different levels at which policies enable interaction by sending-receiving on named compo- can be expressed: low level policies that express basic access nent ports. These get translated to send-receive on named control, and high level policies which express interactions shared channels. Our approach does not require any browser at a more semantic level. We consider basic access control modifications, hence adoption can be immediate. policies in the remainder of this section. We envision that In section 4.1 we give some background on browser ab- basic policy is derived by inputs from high level policies. stractions and their security, which is relevant to our solu- Next, we describe two complimentary aspects of access tion. Section 4.2 provides an overview of our solution, and control: component interaction in the browser, and compo- section 4.3 gives details on the key aspects of the solution. nent to server interaction. Both are necessary for end-to-end control on what accesses are permitted to whom. 4.1 Background We review a few concepts from HTML documents and 3.1 Component Access Control in the Browser their properties, including the “same-origin” security policy This controls who is allowed to (1) create and destroy of browsers. named components, (2) create and destroy named channels, An HTML document loaded into a browser is represented and (3) which components can publish or subscribe to events in an object model called the Document Object Model. A on a named channel. The policy is enforced by the event document has a domain property which is the hostname of hub. The subjects in the policy are named components or the server it was accessed from. The hostname could be a trust domains. numeric IP address or a DNS domain name. DNS names are We consider a special case of such a policy, where com- hierarchical, and browsers allow a document to relax its ori- ponent and channel creation (and destruction) can only be gin via the domain property. For instance, a document with done by the mashup application. This special case is in line domain foo.bar.baz.com can change it to bar.baz.com or with current practice, and has the advantage that it makes baz.com. A document has a window object with a location policy specification very straightforward. The policy is im- property that represents the URL of the document. It is plicitly specified by the mashup application code by, creating possible to change the location property of the document, channels, loading the different components, and then wiring by updating the window.location. This will cause a new them together by specifying the channels that connect to document to be loaded to and to replace the current docu- the input/output ports of each component. ment in that window. Another alternative is for the event hub to read security Documents can include frames using HTML and policy constraints from a configuration policy file. This de- tags where each frame is a document with domain couples the policy definition from the actual coding process and location attributes (and a window object). Frames can of the mashup application and enables scenarios in which include sub-frames, forming a hierarchy, which we will refer the mashup developer is not the one defining the policy for to as a frame hierarchy. For consistency of terminology, we the component interaction. will refer to even the top-level document in the frame hier- archy as a frame. The “same-origin” policy says that doc- 3.2 Component to Server Access Control uments from different domains that are in the same frame For non-public services exposed by web servers, user au- hierarchy cannot examine or alter each others internal state. thentication is typically required in order to perform the Code running in a frame can read/write the internal state of access. Components should not have to trust the mashup all documents from the same origin (that is, domain) that application to acquire user credentials. Following the princi- are part of the same frame hierarchy. For purposes of our ple of least privilege [22], the authentication should also be discussion, we will use the terms origin and domain inter- coupled with limited delegation logic, e.g., as in SecPAL [3]. changeably. While the details of our current solution are outside the Even if frames are from different domains, a frame can scope of this paper, we give a brief overview here. In addition typically write the location property of any frame in the to traditional user authentication, our access control model same frame hierarchy, regardless of origin. Such browsers authenticates a component and its associated (limited) ac- are labeled permissive browsers [12], and include Firefox, cess rights, as well as the mashup application it is loaded in. Safari, and some configurations of IE 6 and IE 7. However In particular, it makes sure that components are loaded in a code from one domain can never read the location property proper iframe of the component’s domain, not vulnerable to of a frame from another domain. a man-in-the-middle attack, and only after verifying that the requesting code has explicitly been granted the right to load 4.2 Solution Overview the component. This approach allows us to have a unified In this section, we give an overview of the key features and access control policy specification for expressing limited del- issues addressed in our solution, which include component egation within and across trust domains, and for both client- isolation, enabling and securing component-mashup commu- side mashups (mashup happens in browser) and server-side nication links, and protection against frame-phishing. mashups (mashup happens at server). Our solution can be Component isolation using iframes: In our solution, considered an extension of current authentication and dele- we load components which come from different trust do- gation protocols, such as Yahoo’s BBAuth [27] or Google’s mains into different iframes, i.e., they represent different AuthSub [9]. sub-frames. Even in the case where a single host is serv- 538
WWW 2008 / Refereed Track: Security and Privacy - Web Client Security April 21-25, 2008 · Beijing, China tigated the use of other inter-frame communication mecha- nisms, details of which are described in Section 5. Link Security: Permissive browsers allow complete nav- igation of the frame hierarchy. For example, in Figure 2, component B can get direct access to the window object of component A even though they both come from different origins. Note that it is not possible for B to read the lo- cation property of an iframe not in the same origin. This ensures confidentiality of the link between component A and the mashup. However, integrity is not guaranteed. To ensure integrity, we extend the fragment communica- Figure 2: Isolated components with component- tion protocol with an additional security token. The details mashup communication are discussed in the next section. Protection from frame-phishing: Since it is possible for a malicious component to write to the location property ing components from multiple trust domains, isolation is of another component’s iframe, it can navigate a component obtained by having the host use multiple DNS names, one away from its URL to another, possibly malicious, URL. We for each trust domain.5 However, given that components refer to this as frame-phishing, and is a common vulnerabil- can change their domain properties as described earlier, the ity of current sites that use iframes for isolating untrusted DNS domains used are such that a component cannot relax content. its domain property to attack another component. For in- Frame-phishing is dangerous for a number of reasons, (1) stance, a server foo.bar.com that serves components from it compromises integrity of a part of the application view trust domain t1 and t2 can create two DNS domains t1.foo. that was not meant to be controlled by the malicious com- bar.com and t2.foo.bar.com. Unless both components re- ponent, (2) it can be used to steal information entered into lax to the same super-domain no attacks are possible. the phished frame, much like the more common phishing of Component-mashup communication links: While iso- sites, which could include personally identifiable information lation can be achieved by placing components in different (PII) or passwords. Since the URLs loaded by iframes are frames, the challenge is to enable cross-domain inter-com- not visible in the browser’s URL bar, one cannot rely on the ponent communication, since it is not explicitly supported end-user to detect frame-phishing, and so it has to be done in browsers. Recently, an approach to communicate be- in an automated manner. tween iframes using the fragment identifier of the URL of Note that this attack is not only limited to changing the the iframe has been discovered [4]. The communication is location of a component’s iframe. It can be performed on based on the observation that even though the parent and the tunnel iframe, and even the main mashup document. child iframes have different origins, the parent can write to One could argue that such an attack on the mashup docu- the child’s location property. Note that if we only modify ment could be detected using the browser’s URL bar, but the fragment identifier of the location property the docu- (1) the URL bar has been shown to offer very weak phishing ment will not be reloaded and hence the document state can protection [6], and (2) the timing of the attack can be ar- be preserved because the fragment identifier was designed to bitrary, unlike conventional phishing attacks. For instance, be used for navigation inside a document. This technique a user that keeps a web application open in one of the tabs has been used in the Dojo JavaScript toolkit [7] to circum- of their browser could be attacked at an arbitrary point in vent the same-origin policy for client-server communication, the lifetime of the application, after the user has verified the and more recently in XDDE [13]. We reused this technique URL in the URL bar. to enable communication between components. We use a combination of event handlers, timeouts, and As mentioned previously, browsers vary in the constraints communication using the tunnel iframe, to detect such at- they place on navigating the frame hierarchy. Hence, one tacks. The details are discussed in the next section. needs to organize the iframes in a way that allows for frag- ment communication. Figure 2 illustrates the iframe config- 4.3 Solution Details uration used with three trust domains in the mashup. The Our implementation is the SMash JavaScript library used first trust domain, represented by origin www.mashup.com, by both the mashup application and component program- is that of the mashup application. It uses two components, mer. The library is structured as a layered communication both from the same server with DNS domain component. stack, where the peer layers in the component and mashup com, but from two different trust domains. To leverage the application communicate using the lower layers. This is iframe isolation, the components are loaded from different shown in Figure 3: the event hub layer, event communi- origins: component A is loaded into c1.component.com and cation layer and the fragment communication layer. component B into c2.component.com. In each of the compo- The event hub layer is aware of component ports and nents, there is a tunnel iframe loaded from www.mashup.com. channels wiring these ports. In the mashup application, this This tunnel can access the JavaScript context of the mashup layer is responsible for loading and unloading components, application since it is in the same origin as the mashup appli- creating and deleting channels, and wiring the ports of the cation. The tunnel and the component iframes communicate components to channels. In a component, this layer is aware cross-domain using fragment identifiers. We have also inves- of the component’s ports and handles sending and receiving events on these ports. The event communication layer is re- 5 HTTP virtual hosts and DNS aliases, e.g., wildcard re- sponsible for composing the messages which are used to mul- source records, allow easy realization of such co-residence. tiplex the multiple component ports on a single link. The 539
WWW 2008 / Refereed Track: Security and Privacy - Web Client Security April 21-25, 2008 · Beijing, China versa. To write the next segment, the component has to Channel 1 Event Hub wait till the previous segment has been read by the tunnel. 6 7 7 This is implemented by using a periodic timer at both the Event Comm Event Comm component and the tunnel, and polling for changes to one’s 5 8 8 own location property. As the component cannot read the Frag Comm Frag Comm location of the tunnel, it needs to get an explicit acknowl- 3 4 9 10 9 10 edgment when the tunnel has read the previous segment. Frag Comm Frag Comm This is accomplished using acknowledgment messages. 2 11 11 Link Integrity: As mentioned previously, link confiden- Event Comm Event Comm tiality is implicit in fragment communication. For link in- 1 12 12 tegrity, we have to guard against a malicious component Event Hub Event Hub modifying the location property of a peer component or Component A Component B the peer’s tunnel. We cannot prevent a malicious component from modify- Mashup Application ing location. However, it can only do so atomically, i.e., it has to overwrite the complete value of the current frag- Browser ment (overwriting the non-fragment part is discussed under frame-phishing, in the next section). Therefore we use the Figure 3: Layered communication stack approach of embedding a shared secret, in each fragment, to authenticate the sender of the fragment, and hence ensure its integrity. This shared secret is embedded in the clear, fragment communication layer is the only layer aware of the as the attacker cannot read the location. Note that this is use of fragments to communicate between the component sufficient to provide (one-way) authentication of the compo- and the mashup application. By substituting this layer, it is nent to the mashup application as required by Section 2.2. possible to employ another low-level communication mecha- We also need to securely distribute this shared secret be- nism, e.g., inter-iframe communication potentially built into tween the mashup application and the component. The future browsers (see Section 6). SMash library in the mashup application creates the secret, In the following subsections, we describe the API offered an unguessable random value. When creating the compo- to mashup and component programmers, provide some de- nent, it includes the secret in the fragment of the compo- tails on fragment communication and corresponding integrity nent URL. When the component creates the tunnel iframe protection, and describe how we detect frame-phishing. it passes the secret in the same manner. One of the reasons for doing the initial secret distribution to the component us- 4.3.1 API ing the fragment part of the URL is that RFC 2616 forbids The API offered by SMash is shown in Table 1. Most fragments in the header. So this secret will not of the API is self-explanatory. We draw, though, atten- be leaked by the component to remote parties when it loads tion to the component state lifecycle, shown with the get- data from them. ComponentState function. This state management is of- To ensure that the attacker cannot mount an attack by fered as a way for the programmers to deal with the asyn- deleting fragments, we use sequence numbering on consec- chrony of component loading/unloading, and the concur- utive fragments, so missing fragments will be detected and rency of wiring a loaded component. The state transition attributed to an attack. from loaded to wired is done by the mashup application, to indicate to the component that it can start publishing 4.3.3 Protection from Frame-Phishing events. The transition to startedCleanup is also done by the As described earlier, frame-phishing is a dangerous attack mashup application, when it intends to unload the compo- on mashups that use iframe isolation. We describe how we nent, while the transition to doneCleanup is typically done detect such attacks in SMash. by the component (subject to a timeout, to handle misbe- Since the mashup application cannot read the location having components). The remaining transitions are initi- property of its components, we cannot simply poll for changes ated inside SMash. To avoid polling of component state, to this value. The parent iframe could set onload and onun- the constructors of the hub (not shown) allow the mashup load event handlers on the child iframe. But browser imple- application and the component to register listener functions mentations vary and are unreliable in invoking these han- on the state transitions. dlers when a child event happens. We use a combination of onunload event handler on one’s own frame (which in our 4.3.2 Fragment Communication and Link Integrity experience is reliable), timeouts, and communication using As mentioned earlier, fragment communication is used to the tunnel iframe, to detect such attacks. implement the links between the mashup application and a There are three targets to consider with respect to such component. However, such communication is limited since attacks: (1) component, (2) tunnel, (3) mashup application. current browsers implement a limit on the URL length. In When the component is replaced by an attacker, the com- SMash, we work with a conservative limit of 4000 bytes, ponent’s onunload event handler is called. At this point, it which works well in all major browsers. Long messages have could try to notify the mashup application. However, since it to be split correspondingly into segments which, with pay- communicates with the mashup application asynchronously, load and protocol overhead, do not exceed that limit. using the tunnel, there is no guarantee that this communi- To send a segment to the tunnel, the component writes it cation will succeed before the unload completes. Instead, we to the fragment in the tunnel’s location property, and vice utilize the fact that replacing the component will also cause 540
WWW 2008 / Refereed Track: Security and Privacy - Web Client Security April 21-25, 2008 · Beijing, China Mashup API loadComponent(componentId,inPortNames,outPortNames) load a component with a given id createChannel(channelName) create a channel with the given name deleteChannel(channelName) delete a channel addReader(channelName, componentId, inPortName) wire an input port to a channel addWriter(channelName, componentId, outPortName) wire an outport port to a channel removeReader(channelName, componentId) unwire an input port removeWriter(channelName, componentId) unwire an output port broadcastOnChannel(channelName, message) event broadcast by mashup application getComponentState(componentId) get component state: start→loaded→wired →startedCleanup→doneCleanup→unloaded componentWired(componentId) transition to wired state startCleanupComponent(componentId) transition to startedCleanup state Component API getComponentState() get component state registerCallback(inPortName, callback) register function to handle event on input port publish(outPortName, message) send event on output port doneCleanupComponent() transition to doneCleanup state Table 1: API the onunload event on the tunnel to fire, as it is a child of ponents, to measure the scalability of the implementation the component iframe. The tunnel can communicate syn- (assuming all components are in different trust domains). chronously with the mashup application using a JavaScript Additionally, we vary a system parameter, the polling timer function call, and informs it of the unloading before it re- interval used for fragment communication, to see how it im- turns from the onunload event handler. Hence, we handle pacts communication throughput. The metrics used in the cases (1) and (2) in a unified manner. experiments are: Another possibility for the attacker is to replace the com- ponent before the tunnel iframe is loaded. This case is han- 1. Event Rate: This measures the maximum event rate dled by setting a timeout, in the mashup application, for suc- we can sustain, for small event payloads. This is typ- cessfully establishing initial communication with the tunnel. ical for many mashup applications where components If this timeout expires, an application specific error-handler exchange short events in response to user actions. is called, which could decide to unload and reload the com- ponent. 2. Data Throughput: This measures the maximum rate in Using the tunnel’s onunload event handler, though, poses kilobytes/s, and is important for data intensive mashup a challenge as the event is also called when the user nav- applications that want to transfer large volumes of igates away from the mashup application. To avoid false data between trust domains, inside the user’s browser. positives, we briefly delay the alert issuing in a timer. The timer will only fire and notify the application of a potential frame-phishing attack, if the user remains in the mashup 3. Component Load Latency: This measures the latency application. to load a component and setup the communication link Finally, we need to detect frame-phishing of the mashup between the component and the mashup application. document. We have found it hard to distinguish between the two cases of either a frame-phishing attack, or the user vol- Next, we describe our testbed, followed by the results for untarily navigating away from the document. In the interim, each of the above metrics. Finally, we briefly discuss using we use an alert box to notify the user about the change in an alternative inter-frame communication mechanism that the document URL. This has a negative impact on usability, uses Java applets, in the context of SMash. but in the absence of browser support to distinguish the two cases, it seems the only way to ensure protection of the user. 5.1 Testbed The authors acknowledge that providing the end-user with To do a fair comparison across the different browsers, all pop-up warnings, albeit being a best effort, is hardly a fail- tests were performed on exactly the same test platform. The safe solution. An alternative solution would be to stop all underlying test machine was an Intel Core 2 Duo T7300 @ interaction within the mashup application when an attack 2.0 GHz with 2 GB of RAM. This machine was running is detected. However, browser level fixes are the ideal solu- Windows XP with VMware player 2.0.1. The actual perfor- tion to these problems. Recently, there has been an in-depth mance evaluation was done on a VMware virtual machine study of the navigation behavior of browsers resulting in sev- emulating a single processor @ 2.0 GHz, 1 GB of RAM and eral improvements [2]. Until these fixes become widespread, running a clean fully patched Windows XP. Within this ma- our solution can be applied on current browsers. chine, we had Apache Tomcat 6.0.14 serving the data to the browsers. All of the tests were performed on Firefox 2.0.0.7, Internet Explorer 7.0 (IE), Safari 3.0.3, Opera 9.24. 5. PERFORMANCE EVALUATION To minimize measurement errors, all tests were performed In this section we describe a preliminary performance eval- multiple times until the standard deviation of the results uation of our implementation. The evaluation consists of became acceptably small. Due to space limitations, we only micro-benchmarks in which we vary the number of com- present the event rate and data throughput results for Fire- 541
WWW 2008 / Refereed Track: Security and Privacy - Web Client Security April 21-25, 2008 · Beijing, China Figure 4: Event rate (top) and data throughput (bottom) for Firefox and IE. fox and Internet Explorer, since they are the most popular messages from the mashup application to the components. browsers.6 7 Figure 4 gives a summary of the measured performance. Like the event rate experiments, the throughput increases 5.2 Event Rate linearly with the number of components and timer frequency, We used a small payload of 13 characters, representing a prior to the saturation point. At the saturation point, adding simple event name/value pair. Figure 4 (top) gives a sum- additional components or increasing the timer frequency re- mary of the measured performance as the number of com- sults in degraded performance. ponents is increased from 1 to 32 and the polling timer is varied from 10ms to 80ms. Both the X and Y axis are on a 5.4 Component Load Latency log scale. For evaluating the component loading performance, we For each browser we observed a saturation point. This measured per component loading latency while increasing point, mainly caused by CPU load, varies across browsers, the number of components being concurrently loaded. The timer intervals, and number of components. Prior to reach- overall loading latency is measured from the moment at ing this saturation point, there is a linear increase in aggre- which the mashup application creates the components until gate rate with increasing number of components and timer the moment at which the mashup application receives con- frequency. Once the saturation point is reached, and more firmation of a successful load of all components. This con- components are added, or the timer frequency is increased, firmation is communicated across domains using fragment performance loss due to over-saturation becomes apparent. communication. Figure 5 gives an overview of the measured An adaptive timer frequency, that detects and reacts to over- performance. saturation, may be able to avoid this behavior. Overall, We observe that load latency does not increase with an Firefox shows significantly better performance than IE. increase in number of components. In fact, the loading la- For many mashups the inter-component events are trig- tency per component initially decreases and and then starts gered by user actions, requiring only a low event rate. The to increase back towards its initial value. The latter can measured rates are sufficient for such applications. be explained by the additional system load due the concur- rent loading of many components. Note that caching was 5.3 Data Throughput disabled on all browsers except for Safari 9 . The data throughput was tested by transferring a total of 1 megabyte (MB) of data from the mashup application to the 5.5 Alternative Inter-Frame Communication components. Given the approximately 4 kilobyte (KB) limit As mentioned in section 4.3, it is possible to employ alter- of the fragment payload8 , this resulted in transferring 256 native inter-frame communication mechanisms in SMash, by 6 replacing a layer in the communication stack. In particular, Results for all browsers are available in the Appendix of alternatives such as Java applets, Adobe Flash, or Google the long version available from http://domino.research. ibm.com/comm/research_projects.nsf/pages/web_2. Gears, though lacking universal support in browsers, could 0_security.smash.html be used to increase throughput when available. In this ex- 7 For Opera, we observed some anomalies in system load, periment, we investigated the use of Java applets. that are probably due to a bug in the rendering engine when Our implementation employs different components that using fragments in invisible frames. 8 9 This limit was chosen based on the limit for IE, which had We were unable to disable caching on Safari due to the the smallest fragment length limit of the tested browsers. unavailability of cache configuration parameters. 542
WWW 2008 / Refereed Track: Security and Privacy - Web Client Security April 21-25, 2008 · Beijing, China munication within the browser. After some setup cost in- volving multiple iframes, this solution allows exchange of JavaScript objects of arbitrary length and hence scales very well in terms of bandwidth. Depending on the particular browser, though, the event rate will be bounded by the limits of polling similar to our approach. As a drawback, Subspace requires that all connections are setup in the very beginning, prohibiting dynamic on-demand loading of components. 10 More importantly and contrary to our approach, Subspace requires complete trust of the component providers in the mashup provider as their code is executed in DNS domains Figure 5: Component loading latency. controlled by the mashup provider. 6.1.2 Iframes with Server-Side Communication load applets from the same location, and use the static vari- An alternative to the above is to do server-side communi- ables of the applet to communicate in a secure manner with cation between client-side components that are represented the mashup application (ensuring that components cannot as iframes. Each client-side component has a correspond- interfere with each other’s communication). The LiveCon- ing server-side communication object. Communication be- nect feature in browsers is used to communicate between a tween component A and B would occur by component A component written in JavaScript and the low-level communi- sending a message to its server-side communication object cation layer written in Java. We use polling from JavaScript which would pass it to component B’s communication ob- to Java to detect the reception of a new message. This is ject, which would then pass it back to component B. This necessary since browsers enforce the same-origin policy, and solution requires asynchronous server to client communica- prevent a method call from Java to JavaScript from crossing tion, which could be implemented using emerging techniques origins. like the Bayeux protocol [21] and Cometd [24]. We did not Results: On both Firefox and IE there were stability prob- choose this approach since it needs a complicated connec- lems in loading large number of applets on the same page, so tion setup, and all component-component communication we limited the throughput tests to pages which had no more in the client needs to go through two (logical) servers, with than 8 applets, which implies no more than 8 components. increased server load and considerable latency. For Firefox and Opera we observed aggregate throughput of 16 MB/s, independent of the number of components. For IE, 6.1.3 JavaScript Rewriting or Restrictions we observed an aggregate throughput of 8 MB/s. Our con- An alternative to browser isolation mechanisms is to use clusion is that for browsers that do support Java applets and language analysis and rewriting. A combination of static LiveConnect, componentized mashup applications could use analysis, language restrictions and dynamic code rewriting this mechanism to get significantly higher throughput than could achieve isolation of JavaScript code [20, 29, 25, 16]. fragment communication. However, due to the complexity of rewriting self-modifying code in weakly documented and varied runtime environ- 6. RELATED WORK ments, it makes it hard to verify that the isolation achieved Related work can be largely divided into three groups: is complete. Furthermore, it requires a trusted server, lim- proposals working with unmodified browsers; extensions to iting the use to a single administrative domain. Finally, it HTML requiring browser modifications; solutions based on faces considerable performance and integration challenges browser plugins. In the following subsection, we will have a on client as well as server side. closer look at each of them. 6.2 HTML Extensions 6.1 Unmodified Browser There are a number of competing proposals to extend Alternatives for unmodified browsers fall into three cat- the HTML specifications with goals similar to our secure egories: (proxied) iframes with client-side communication, component model: the tag proposal [5], cross- (proxied) iframes with server-side communication, and server- document messaging in the HTML 5 working document of side induced JavaScript rewriting or restrictions. the Web Hypertext Application Technology Working Group (WHATWG) [10] and the element proposal [11]. 6.1.1 Iframes with Client-Side Communication All three proposals provide an isolated DOM container Closest to our work are XDDE [13] and the recently an- element – essentially variations of – which enables nounced Google PubSub [8]. Both use iframes and frag- cross-domain communication by offering a message passing ments for client-side communication similar to our approach. interface. Invoked in a controlled fashion and assisted by Neither of them, though, seems to address the integrity and additional meta-data about the caller, the receiving element frame-phishing issues raised and addressed in this paper. can then process such messages in a secure way. A second technique for client-side communication is Sub- While the granularity of isolation in HTML5 is based on space [12]. It exploits domain-relaxation of the domain DOM the same-origin policy, different s and s are attribute, mentioned in Section 4.1, to place a request to a isolated even when they are loaded from the same origin. foreign service safely in an iframe served from a one-time- This will simplify administration of servers hosting multi- use sub-domain of the mashup server. While their primary 10 The open-source implementation CrossSafe [30] relaxes this goal is secure cross-domain client-server communication, it requirement. However, it essentially requires all components could also be used for secure cross-domain component com- to trust each other, clearly unacceptable for many contexts. 543
WWW 2008 / Refereed Track: Security and Privacy - Web Client Security April 21-25, 2008 · Beijing, China ple components — not each of them has to be assigned a [3] M. Y. Becker, C. Fournet, and A. D. Gordon. SecPAL: Design separate DNS sub-domain — and encourage finer-granular and semantics of a decentralized authorization language. Technical Report MSR-TR-2006-120, Microsoft Research, Sept. components. However, it raises the question of how well 2006. it integrates with traditional authentication schemes, often [4] J. Burke. Cross domain frame communication with fragment tied to the same-origin policy due to cookies. identifiers. http://tagneto.blogspot.com/2006/06/ The key difference from a programming model perspec- cross-domain-frame-communication-with.html, June 2006. [5] D. Crockford. The tag. http://www.json.org/module. tive is that these proposals implement one port for each html, Oct. 2006. component, which multiplexes all communication, while we [6] R. Dhamija, J. Tygar, and M. Hearst. Why phishing works. In support multiple ports. In addition, we support (multi-cast) Conference on Human Factors in Computing Systems (CHI channels directly between components, and not just between 2006), 2006. [7] Dojo Foundation. Dojo javascript toolkit. http://www. a component and the parent. This allows us to define fine- dojotoolkit.org/. grained access control policies, which cannot be directly de- [8] Google. Gadget-to-gadget communication. http://www.google. fined using these proposals. However, as pointed out in Sec- com/apis/gadgets/pubsub.html. tion 4.3, these proposals could easily be integrated into our [9] Google. Google account authentication (AuthSub). http:// code.google.com/apis/accounts/AuthForWebApps.html. overall architecture by replacing the fragment communica- [10] I. Hickson (Editor). HTML 5. Technical report, Web Hypertext tion layer in Figure 3. Application Technology Working Group HTML 5, 2007. These proposals all require browser modifications and hence Working Draft, http://www.whatwg.org/specs/web-apps/ there will be a considerable time lag before they are adopted current-work. [11] J. Howell, C. Jackson, H. J. Wang, and X. Fan. MashupOS: by all end-users. Operating system abstractions for client mashups. In Proceedings of HotOS XI: The 11th Workshop on Hot Topics 6.3 Plugins in Operating Systems. USENIX, May 2007. [12] C. Jackson and H. Wang. Subspace: Secure cross-domain A third communication approach is to use browser plug- communication for web mashups. In 16th International ins to facilitate cross-domain communication. Good candi- Conference on the World-Wide Web, 2007. dates are Adobe Flash, Java Applets or Google Gears. An [13] G. Lee. Personal communication on XDDE. http://www. implementations based on these can be more efficient (see openspot.com, 2007. [14] B. McLaughlin. Mastering Ajax. IBM developerWorks, 2005 – Section 5.5). However, the required plugins might not be in- 2007. http://www-128.ibm.com/developerworks/views/web/ stalled in each browser, or not even available on a given plat- libraryview.jsp?search_by=Mastering+Ajax+Part. form, and are mostly based on proprietary technology. That [15] Microsoft. Windows cardspace. http://cardspace.netfx3.com, said, as discussed above for HTML extensions, such plugin- http://www.identityblog.com. [16] M. S. Miller, M. Samuel, B. Laurie, I. Awad, and M. Stay. based cross-domain communication could serve, where avail- Caja — safe active content in sanitized Javascript. http:// able, as an alternative provider in our lowest communication google-caja.googlecode.com/files/caja-spec-2007-10-11.pdf, layer with the fragment-based communication as a guaran- Oct. 2007. teed fallback. [17] Mozilla.org. The same origin policy. http://www.mozilla.org/ projects/security/components/same-origin.html. [18] D. Parnas. On the criteria to be used in decomposing systems 7. CONCLUSION into modules. Communications of the ACM, 15(12):1053–1058, Dec. 1972. This paper addresses the problem of securing mashup ap- [19] D. Raggett, H. Le Arnaud, and I. Jacobs (Editors). HyperText plications which mix active content from different trust do- Markup Language (HTML). W3C Recommendation 4.01, mains. As a solution, we propose a secure component model W3C, Dec, Dec. 1999. [20] C. Reis, J. Dunagan, H. J. Wang, O. Dubrovsky, and comprising a central event communication hub and governed S. Esmeir. BrowserShield: Vulnerability-driven filtering of communication channels which mediate the communication dynamic HTML. In Proceedings of the Sixth Symposium on between isolated components. We illustrated how such a Operating Systems Design and Implementation, Nov. 2006. model can be used to enforce basic access control policies [21] A. Russel, D. Davis, G. Wilkins, and M. Nesbitt. Bayeux protocol. Technical Report 1.0draft0, Dojo Foundation, 2007. which define the allowed interactions between components. [22] J. H. Saltzer and M. D. Schroeder. The protection of We described SMash, an implementation of this model on information in computer systems. Proceedings of the IEEE, current browsers, which can be used right away in build- 63(9):1278–1308, Sept. 1975. ing secure mashup applications. Our implementation de- [23] K. Spett. Cross-site scripting — are your web applications vulnerable? Technical report, SPI Dynamics, 2005. http://www. pends on iframes for isolation while bootstrapping a publish- spidynamics.com/whitepapers/SPIcross-sitescripting.pdf. subscribe model of communication using URL fragment iden- [24] Teknikill, Shadowcat Systems, and SitePen, Inc. Cometd. tifiers. Our programming model is intentionally general http://www.cometd.com/. enough that other communication techniques could be used [25] K. Vikram and M. Steiner. Mashup component isolation via server-side analysis and instrumentation. In Web 2.0 Security instead of URL fragments. SMash is resilient to attacks such & Privacy Workshop. IEEE Computer Society, Technical as channel spying, message forging, and frame-phishing. We Committee on Security and Privacy, 2007. have evaluated our implementation and find that it scales [26] World Wide Web Consortium. Document Object Model. http://www.w3.org/DOM/. well with increasing number of components in the mashup, [27] Yahoo! Browser-based authentication (BBAuth). http:// and has enough data throughput to be useful in a number of developer.yahoo.com/auth/. mashup application scenarios. Our implementation is avail- [28] K.-P. Yee and K. Sitaker. Passpet: Convenient password able as an open-source JavaScript library. management and phishing protection. In Symposium On Usable Privacy and Security, 2006. [29] D. Yu, A. Chander, N. Islam, and I. Serikov. JavaScript 8. REFERENCES instrumentation for browser security. In 34st ACM Symposium on Principles of Programming Languages (POPL), pages [1] OpenAjax Alliance Open Source Project. http:// 237–249, 2007. openajaxallianc.sourceforge.net. [30] K. Zyp. CrossSafe. http://code.google.com/p/crosssafe/. [2] A. Barth and C. Jackson. Protecting browsers from frame hijacking attacks. http://crypto.stanford.edu/frames/. 544
You can also read