A Personalized Recommender System for Pervasive Social Networks
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
A Personalized Recommender System for Pervasive Social Networks Valerio Arnaboldia , Mattia G. Campanaa,∗, Franca Delmastroa , Elena Paganib,a a IIT-CNR,Via G. Moruzzi 1, 56124, Pisa, Italy b Department of Computer Science, Università degli Studi di Milano, Via Comelico 39, 20135 Milano, Italy Abstract The current availability of interconnected portable devices, and the advent of the Web 2.0, raise the problem of supporting anywhere and anytime access to a huge amount of content, generated and shared by mobile users. On the one hand, arXiv:2205.15063v1 [cs.NI] 30 May 2022 users tend to be always connected for sharing experiences and conducting their social interactions with friends and acquaintances, through so-called Mobile Social Networks, further improving their social inclusion. On the other hand, the pervasiveness of communication infrastructures spreading data (cellular networks, direct device-to-device contacts, interactions with ambient devices as in the Internet-of-Things) makes compulsory the deployment of solutions able to filter off undesired information and to select what content should be addressed to which users, for both (i) better user experience, and (ii) resource saving of both devices and network. In this work, we propose a novel framework for pervasive social networks, called Pervasive PLIERS (p-PLIERS), able to discover and select, in a highly personalized way, contents of interest for single mobile users. p-PLIERS exploits the recently proposed PLIERS tag-based recommender system [2] as context a reasoning tool able to adapt recommendations to heterogeneous interest profiles of different users. p-PLIERS effectively operates also when limited knowledge about the network is maintained. It is implemented in a completely decentralized environment, in which new contents are continuously generated and diffused through the network, and it relies only on the exchange of single nodes’ knowledge during proximity contacts and through device-to-device communications. We evaluated p-PLIERS by simulating its behavior in three different scenarios: a big event (Expo 2015), a conference venue (ACM KDD’15), and a working day in the city of Helsinki. For each scenario, we used real or synthetic mobility traces and we extracted real datasets from Twitter interactions to characterize the generation and sharing of user contents. Keywords: pervasive content sharing, mobile social networks, opportunistic networks, personalized recommender systems 1. Introduction on device-to-device (D2D) wireless communications have been proposed in the literature, such as opportunistic net- The amount of data accessible through the Internet has works [7]. However, most of the classical mechanisms for dramatically increased over the last years. This trend has content identification and recommendation designed for been exacerbated by the advent of online social networks centralized infrastructures cannot be directly applied to (hereinafter OSNs) and by video-on-demand services like these scenarios, and ad-hoc solutions must be adopted. Netflix [3]. It is then expected to boom with the Internet This is the case of Mobile Social Networks (hereinafter of Things, where potentially every object in the physical MSNs), designed to further improve social interactions and world will create and share information over the network. inclusion through experience sharing based on opportunis- The availability of this data represents an important re- tic communications, and to this aim they need efficient and source that is revolutionizing our society, but it is also personalized mechanisms for useful content selection and posing some serious technological challenges in terms of distribution. maintenance, management, indexing and identification of The basic approaches to identify useful contents in oppor- contents. This affects especially mobile communications, tunistic networks are based on publish/subscribe mecha- where users want to be always connected and able to share nisms. Users have to explicitly define their interests by contents anywhere and anytime. To alleviate the burden subscribing to a fixed set of thematic channels and, when of data traffic on cellular networks, several solutions based they encounter other mobile users, they can ask them for contents related to the channels they are subscribed to ∗ Corresponding author (see for example the PodNet project [19]). Other solu- Email addresses: v.arnaboldi@iit.cnr.it (Valerio tions exploit context information (e.g., the history of phys- Arnaboldi), m.campana@iit.cnr.it (Mattia G. Campana), ical contacts between nodes, social information about the f.delmastro@iit.cnr.it (Franca Delmastro), pagani@di.unimi.it users or the presence of communities) to improve forward- (Elena Pagani) Preprint submitted to Elsevier May 31, 2022
ing and content dissemination to potentially interested homonyms, polysemies, and different users’ tagging be- users (ContentPlace [4], ProfileCast [15], SocialCast [8] havior make the reasoning process difficult to perform in and ICast [26]). They are generally defined as context- some cases, and undermine the use of simple tag match- aware forwarding or content dissemination protocols, and ing [14]. To overcome these limitations, a novel family context information generally includes information charac- of recommender systems, called Tag-based Recommender terizing the user’s behavior, her interests, generated and Systems [33], have been recently proposed for centralized shared contents and the surrounding environment. infrastructures, where global knowledge is available. To Recently researchers have investigated the possibility of us- understand whether a content is interesting for a user, ing recommender systems [16] to identify useful contents tag-based recommender systems do not only consider tags also in the opportunistic environment (mainly based on that are directly associated with contents, but also the re- content filtering and tag expansion) [9, 28, 29, 22]. Rec- lations existing between tags, trying to extract a semantic ommender systems perform better than publish/subscribe meaning from the contents. To the best of our knowledge, mechanisms by mainly relying on information about users’ currently there is no solution in the literature proposing past actions (e.g., past purchases in e-commerce [17, 20] the use of tag-based recommender systems in opportunistic or past visualizations of multimedia content in video-on- networks. We think that this could substantially change demand services [3, 1]). At the same time, they can be the way mobile users and services can access contents, es- included in the more general definition of context-aware pecially in Mobile Social Network scenarios. systems, in which context information is mainly focused We recently proposed a novel tag-based recommender sys- on content characterization and it is directly defined by tem, called PLIERS [2], which outperforms the state-of- users. In fact, through the use of Social Tagging Systems the-art tag-based recommender systems defined for online (STS), mobile users can directly define tags (i.e., labels) social networks and centralized solutions. In this paper, that describe contents from a semantic point of view. The we present a novel framework that exploits PLIERS prin- ensemble of tags generated by users in a certain online sys- ciples in a completely decentralized environment, relying tem is known in the literature as folksonomy [22, 25]. An only on the exchange of single nodes’ knowledge during important aspect of folksonomies is that, differently from proximity contacts and through D2D communications. We ontologies, no relationship between the terms is required called it Pervasive PLIERS (p-PLIERS). It represents a a priori (hierarchical or not). On the contrary, these rela- general framework for identifying useful and interesting tionships are automatically built thanks to the tags created contents in the mobile environment, on top of which sev- by the users and assigned to contents. Folksonomies have eral services can be built – from context-aware forward- the ability to quickly adapt to changes in the user’s vo- ing protocols, to content dissemination services (e.g., tar- cabulary and to represent highly personalized information geted advertising), to content sharing services, etc. We about the users’ interests. These aspects fit well mobile extensively evaluated the proposed solution through sim- users’ behavior, especially when they define, generate, and ulations considering three different application scenarios: share contents on-the-fly, while participating in an event (i) users attending Expo2015 during the World Food Day or living a particular experience. (one of the most crowded day of the entire event); (ii) users The richness of information contained in folksonomies can attending a scientific conference (ACM KDD 2015); (iii) be used to improve the identification of interesting con- users moving around the city of Helsinki during a working tents for each single user, and also to determine relation- day. For each scenario, we selected appropriate mobility ships and affinity between different users, depending on the traces, synthetic or real (when available) and we extracted content they generate and share. To this aim, it is essential real datasets from geo-localized Twitter interactions. The to define an algorithm able to efficiently detect interesting datasets would reflect the behavior of mobile users, gener- content starting from the local user preferences and a lim- ating and sharing contents related to a specific event or, ited knowledge of what is available on the network. This more in general, to their life in a European city. represents a context reasoning problem in a distributed The rest of the paper is organized as follows. In Section 2, and mobile environment, where each mobile device has a we present a summary of the use of recommender systems local representation of context information in the network in opportunistic networks for the identification of interest- (e.g., folksonomy in case of recommender systems). This ing contents. In Section 3, we describe PLIERS and we local knowledge is a view of the global knowledge of the compare it with existing tag-based recommender systems. information available from the whole network. Nodes can Then, in Section 4, we present p-PLIERS framework in enrich their local knowledge by sharing it with other nodes detail. In Section 5, we provide a general description of they physically encounter, through opportunistic commu- the experimental scenarios that we considered for the eval- nications. They can take decisions about which contents uation of the proposed solution. Specifically, in Section 6, to download or forward to other nodes using their local we present a comparison among existing recommender sys- knowledge. In this way, mobile users can identify interest- tems used in pervasive social networks, showing the ad- ing content locally, without accessing a centralized service. vantages of PLIERS. In Section 7, we present p-PLIERS However, since folksonomies are based on user-defined performance evaluation through simulations in the three tags, they also have some drawbacks. Synonyms, different scenarios. Finally, we discuss examples of ser- 2
vices that can benefit from the framework in Section 8, 19/24 i1 i1 and we conclude the paper in Section 9. 5/6 u1 u1 i2 5/24 i2 5/6 u 2 u2 2. Related Work i3 3/8 i3 1/3 u 1 3 u3 In the last few years, many researchers have used Rec- i1 i4 i4 5/8 ProbS 0 u ommender Systems for disseminating content in a mobile u1 0 4 u4 i2 i5 0 and pervasive setting. Most of the approaches in the liter- i5 u2 ature are based on popular Web Recommender Systems, i3 0 b) c) conveniently modified for mobile environments – e.g., by u3 2/3 1 i1 i1 reducing the computational complexity of the algorithms i4 1 u4 u1 u1 and limiting the amount of necessary memory. 0 HeatS i2 i2 1/2 i5 The most popular and widely implemented system is rep- 1/2 u u2 a) 2 1/3 resented by the Collaborative Filtering (CF) [13] approach. i3 i3 1/2 u The simplest implementation of CF makes recommenda- 3 u3 3/4 i4 i4 tions to a user based on items that other users with similar 0 u u4 interests liked in the past. The similarity in interests be- 4 i5 i5 0 tween two users is calculated by a similarity metric (e.g., d) e) cosine similarity or Pearson correlation) between their re- spective histories (i.e., the sets of items they liked in the Figure 1: Application of ProbS and HeatS with a bipartite graph.[34] past). This kind of CF is also known as “user-based CF ”, in contrast to the “item-based CF ” which models the pref- erence of a user for an item based on ratings of similar computational resources, these approaches take into ac- items by the same user. count just the relations existing between users and their Typically, CF-based systems operate on second-order ten- preferences, while they neglect the nature of the items sors (or matrices) that represent the relationships between shared in the network and do not exploit all the available users and items (or an item-item matrix for the item-based information from STS (i.e., the complete folksonomy). CF). For this reason, CF systems could suffer from scal- Tag-expansion [22] is the only solution proposed in the lit- ability problems depending on the size of the data struc- erature to perform folksonomy-based reasoning for content tures to be kept in memory and on the sparseness degree dissemination in an opportunistic environment. Using this of the matrix. Since the number of items in STS is typ- approach, each node builds a tag co-occurrence matrix to ically high and far beyond users’ ability to evaluate even identify the tags that are most frequently used in conjunc- a small fraction of them, the data representation in a CF tion with other tags (i.e., expanded tags), and it downloads matrix is often highly sparse. Recent research work (e.g., an item if it is tagged with one of these tags. One of the [9, 28, 29]) focused on the reduction of the complexity of main drawbacks is that a node could receive more items CF in order to identify useful content for mobile users in than those really interesting for it because the user may opportunistic networks and then optimize content dissem- not be interested in the topics represented by the expanded ination. For example, diffeRS [9] tries to classify users in tags. In addition, this approach can suffer from scalability two separate classes: mass-like minded and atypical users. problems depending on the dimension of the data struc- The former set is composed by users whose preferences tures to be kept in memory and on the sparseness of the are similar to the interests of other users encountered in matrix. Since the distribution of the popularity of tags the past (i.e., the community). For this kind of users, in STS generally follows a long tail distribution [18], the the recommendation is simply based on the average of the data representation in a tag co-occurrence matrix could be community’s preferences. For atypical users (i.e., users often highly sparse. whose preferences differ from those of the community), in- stead, a user-based CF approach is applied, computed only 2.1. Tag-based Recommender Systems among those other atypical users who similarly deviate To overcome the limitations of tag-expansion in oppor- from the community. Thereby, nodes exchange only the tunistic networks, we propose to use a new family of recom- contents that CF identifies as attractive to the local users. mender systems based on folksonomies: Tag-based Recom- MobHinter [28], instead, limits the CF computation to the mender Systems [33]. In the literature, many approaches most similar users according to a certain similarity mea- for Tag-based Recommender Systems have been proposed, sure, which takes into account different parameters (e.g., but – to the best of our knowledge – none of them has resources in common, similar behaviors, and so on). More- been so far employed in opportunistic networks. Among over, it proposes different strategies to limit the amount different possible tag-based solutions, diffusion-based [33] of information exchanged by nodes every time they meet. algorithms are the most promising ones for our reference However, even though there has been a complexity reduc- scenario. These algorithms try to overcome the complex- tion to allow CF to run on mobile devices with limited ity and scalability issues by using graphs as a natural way 3
u1 u3 u2 u4 u5 u6 u7 resource among the items connected to her (second diffu- sion step) and each item then receives the same portion of the resource. The final score of each item ij is given by the sum of the portions of resources that are assigned to it after Verdi Bach Piano Bocelli Concert Football Millwall Milan the two diffusion steps. The set of all the scores obtained in this way is called resource vector and it can be used to HeatS ProbS rank the items not directly linked to the target user. The (a) First recommendations to U1 higher the score obtained by an item, the greater could be the interest in it for the target user. The mechanism of HeatS is similar to ProbS, but it is based on opposite rules: each time a resource (or a portion of it) is redistributed, it u1 u3 u2 u4 u5 u6 U6 u7 is divided by the number of edges connected to the node towards which it is heading to. Figure 1 depicts the dif- fusion steps of the two algorithms, highlighting the dif- ferences in the two recommendations. Although they may Verdi Bach Piano Bocelli Concert Football Millwall Milan seem a good way to make recommendations, actually both HeatS ProbS of them are biased by the presence of extremely popular (b) First recommendations to U6 or non-popular items or tags, and they do not take into account the characteristics of the user’s interests. For this Figure 2: ProbS and HeatS suggestions. reason, they present strong limitations if applied to STS. Specifically, ProbS tends to recommend most popular con- tents, while HeatS tends to highlight those with minimal to represent folksonomies. In these cases, nodes of these popularity (i.e., with the smallest possible number of users graphs represent users, items and/or tags, while their links connected to them). Figure 2 depicts an example of the represent relationships among nodes. Since nodes are di- described behavior. If we consider a target user interested vided into three separate classes, folksonomies are usually in tags with low popularity (u1 in Figure 2a), HeatS will represented as tripartite graphs, or by separate bipartite suggests tags with low popularity (that, however, may be graphs for user-item or user-tag relationships in order to of limited interest for the target user from a semantic point further reduce the computational complexity. These ap- of view). On the other hand, ProbS tends to recommend proaches rely on the diffusion of fictitious resources within tags with high popularity, possibly semantically unrelated the folksonomy graph, starting from a node representing with the user’s interests. By contrast, if we consider a a target user (i.e., the target of the recommendation) and user with popular interests (u6 in Figure 2b), ProbS high- diffusing the resources by following links between nodes. lights the correct tags and, instead, HeatS still (wrongly) This permits to identify relevant items (or tags) as those recommends the less popular tags. nodes that are indirectly connected to the target user via To overcome these limitations, a ProbS+HeatS hybrid ap- other users with whom she shares one or more connec- proach (hereafter just Hybrid ) [34] has been recently pro- tions. The higher the number of links connecting an item posed in the literature. This algorithm calculates a linear i to the items of the target user, the higher the score that combination of the results of ProbS and HeatS with a pa- i will receive for the recommendation. In this way, the rameter λ governing the relative importance of one of the recommender system exploits the structure of the graph two original algorithms. However, the problem of Hybrid to identify content relevant for the user. In addition, this (and other recently proposed solutions [23, 21, 30, 24]), approach gives the opportunity to exploit additional hid- precisely lies in the use of parameters which can vary den information derived from graph-based analysis (e.g., greatly depending on the nature of the dataset, and that community detection [12]), which could be used to further are difficult to estimate in real situations. customize recommendations. PLIERS [2] solves the dilemma of the choice between pop- The oldest and most famous diffusion-based solutions are ular or non-popular items in the network in a more natu- represented by two algorithms: ProbS (also known as Mass ral way than the other diffusion-based algorithms, without Diffusion) [35, 31] and HeatS [32]. By applying ProbS to requiring any parameters to tune, and ensuring that the user-item bipartite graphs, a generic resource is assigned popularity of recommended items is always comparable to each item is directly linked to the target user ut of with the popularity of items already adopted by the users. the recommendation such that its value is 1 if an edge is PLIERS assumes that if the user is interested in general present between ut and is , and 0 otherwise. These items categories of items, with a high number of connected users represent the content that the user has created or down- (i.e., popular items), the recommendations will be general loaded in the past. The resource is then split (first diffu- as well. On the other hand, if the user is interested in less sion step) among the users directly connected to the item popular items, the recommendations will prefer items with and each user receives the same portion of the resource. less connected users. Subsequently, each user splits the portion of the received In addition, PLIERS assumes that a very popular item/tag 4
Table 1: Ranking for User 1 for PLIERS (PL), ProbS (Pr), HeatS 2 (He) and Hybrid (Hy). 1.5 Tag PL Pr He Hy orchestra (1) 1 12 1 1 Popularity debussy (1) 2 13 2 2 1 thepianoguys (1) 3 16 3 3 pavarotti (1) 4 19 4 4 0.5 nabucco (1) 5 20 5 5 millwall (22) 13 5 14 13 music (24) 15 4 15 14 0 sherlock (27) 16 3 16 15 sport (41) 19 2 19 18 t n i o n ch i al t rd in r ar an an ve be ic s Ba Ve oz os ss ho m Pi hu startrek (44) 22 1 22 19 M hu la R et Sc C Be Sc Figure 3: Tags popularity for User 1. 3.1. Notation Formally, a folksonomy can be represented with three node can semantically relate to a more “generic” topic com- sets: users U = {u1 , . . . , un }, items I = {i1 , . . . , im } and pared to a less popular item/tag that, instead, describes tags T = {t1 , . . . , tk }. Consequently, each binary relation a more “specific” topic. For example, any content related between them can be described using adjacency matrices, to the football club Millwall can be tagged with both tags AUI , AIT , AUT respectively for user-item, item-tag and “Millwall” and “Football” but the opposite is not always user-tag relations. If the user ul has collected the item is , true: all content concerning football will not always be we set aUI UI IT l,s = 1, otherwise al,s = 0. Similarly, we set as,q = tagged with “Millwall”. According to this assumption, IT 1 if is has been tagged with tq and as,q = 0 otherwise. we can therefore say that the tag “Football” refers to a Furthermore, connections between users and tags can be more generic topic than that referred by the tag “Mill- represented by an adjacency matrix AUT , where aUT l,q = 1 wall”. Users interested in the Millwall football club, but UT not connected to items tagged with “Football”, are clearly if ul owns items tagged with tq , and al,q = 0 otherwise. not interested in all the items tagged with the latter tag, as In the next subsection, we consider user-item bipartite these could contain information about other football clubs. graphs described by the AUI adjacency matrix. PLIERS leverages this assumption to improve the recom- mendations and to provide more personalized items to the 3.2. The algorithm users. We demonstrated that PLIERS outperforms all the PLIERS is inspired by ProbS and shares with it the same other diffusion-based approaches applied to Online Social two diffusion steps. In addition, PLIERS normalizes the Networks, both in terms of recommendation accuracy and value obtained by ProbS when comparing an item ij with personalization [2]. In this paper we present p-PLIERS, a one of the items is of the target user ut . This normalization framework able to merge the knowledge about users, inter- is performed by multiplying the recommendation score by ests and contents on single mobile devices and to evaluate the cardinality of the intersection between the set of users the relevance of the available contents for each single user connected to ij and the set of users connected to is , divided through PLIERS recommendations. To make the reader by k(ij ), which is the popularity of ij . In this way, items able to completely understand this new solution, we pro- with popularity similar to the popularity of the items of vide, in the next sections, an extended description of PLI- the target user, and that possibly share the same set of ERS notation and algorithm with respect to [2]. Then, in users, are preferred. Section 4, we present p-PLIERS. The final value of the item ij for the target user ut , in a graph with n users and m items, is then calculated by PLIERS with the following formula. 3. PLIERS: PopuLarity based ItEm Recom- mender System Xn X m al,j · al,s · at,s |Us ∩ Uj | In this section, we present PLIERS working principles on fjpl = j = 1, . . . , m, (1) l=1 s=1 k(ul ) · k(is ) k(ij ) a simple bipartite graph, and its comparison with ProbS, HeatS and Hybrid. Then, we describe in detail how PLI- Here, Uj is the set of users connected to the item ij , k(ij ) ERS can be applied to a tripartite graph, which is used is the popularity degree of the item ij (i.e., the number of by p-PLIERS to represent knowledge in opportunistic sce- connected users), and ax,y is an element of the AUI matrix. narios. The higher the value of fjpl , the more item ij is similar to the items already owned by ut . 5
Table 2: Ranking for User 3 for PLIERS (PL), ProbS (Pr), HeatS 50 (He) and Hybrid (Hy). 40 Tag PL Pr He Hy 3g (1) 26 26 5 18 Popularity 30 bag (1) 24 24 4 17 carling (1) 23 23 3 16 20 cpu (1) 22 22 2 15 cricket (1) 25 25 1 14 10 seriea (19) 1 1 6 1 android (21) 4 8 21 6 0 googleglass (21) 3 7 20 5 mobile (21) 2 6 19 4 on ic M rt nS n Pi iro M ano Sc ll S la Pa ell rt St var s W i s ia erlo r G k ot s St sca ll ar la N oog k ab le Bo cco lli ar ott eP Sh Inte La llwa E ba o ar no c Fo uy G re Sa ila ce 2C po ce C us a T u M crime (23) 12 4 23 7 i pop (23) 8 5 24 8 Th tv-series (34) 9 3 25 3 Figure 4: Tags popularity for User 3. sci-fi (37) 5 2 22 2 3.3. Working principles of PLIERS on a simple graph data) recommend similar tags for the first 12 positions, To describe in detail the mechanism of PLIERS, and to highlighting tags with a popularity degree similar to those prove its effectiveness, we manually built a synthetic user- already connected to the user, and more semantically re- tag bipartite graph consisting of 64 users, 62 tags and a lated with them (e.g., “Debussy”, “Orchestra”). On the total of 612 edges. The graph represents the relation be- contrary, ProbS presents wrong recommendations in the tween a set of users and several tags, which are character- first positions, by assigning a higher score to uncorrelated ized by a variable degree of popularity. A link between a tags (from a semantic point of view), which are charac- user and a tag indicates that the user owns that tag (i.e., terized with a popularity degree that deviates too much she has already downloaded a content labeled with that from that of the tags held by the user (e.g., “StarTrek”, tag). “Sport”). Figure 5 shows the structure of a synthetic bipartite graph, By contrast, User 3 is interested in both non-popular tags highlighting the characteristics of two opposite cases: the (such as “Pavarotti”, “2Cellos”, and “Nabucco”) and par- first user (User 1), connected just to non-popular tags ticularly popular (“StarTrek”, “Star Wars”, “Millwall”) or (more specific), and the second one (User 3), connected generic tags (“Sport”, “Music”), which increase the aver- to tags with, on average, higher popularity. To clarify the age popularity of her topics to the value of 15.3 (Figure 4). difference between popular and non-popular tags: the tag In this case, PLIERS adapts its recommendations to the “TV-Series” is far more popular than “SherlockHolmes” “generic” nature of the target’s interests, deviating from and, since the majority of users connected to the tag “Sher- HeatS and agreeing with the suggestions made by ProbS lockHolmes” are also connected to “TV-Series” (but the and Hybrid (see Table 2). opposite is not always true), semantically speaking, we can refer to “TV-Series” as a “superclass” of “Sherlock- Holmes”. In the following, we compare the differences in the recom- mendations lists (i.e., the resource vectors) generated by PLIERS, ProbS, HeatS and Hybrid for the two considered users. User 1 is interested in topics related to classical music (“Schubert”, “Verdi”, “Rossini”, etc.) that, in our graph, have a low popularity with mean equal to 1.22. Figure 3 depicts the popularity of each tag connected to User 1 in terms of number of users connected to that tag. Table 1 contains the results obtained from the execution of the four algorithms, considering User 1 as the target. We reported the top 10 tags common to the rankings of all the algorithms, together with their position in each rank- ing, so as to better compare the differences among the approaches. Next to each tag, its popularity is reported. Figure 5: Structure of the synthetic user-tag bipartite graph. PLIERS, HeatS and Hybrid (with λ = 0.5 – the value generally used when there is no a priori knowledge on the 6
Affinity Symmetrically, we define the similarity index of an item u1 u2 u3 ij with respect to the items already linked to the target user ut , as Xk X m az,j · az,s · at,s |Ts ∩ Tj | fjs = · , (3) Similarity z=1 s=1 ki (tz ) · kt (is ) kt (ij ) i1 i2 i3 i4 where Tj is the set of tags connected to the item ij , kt (ij ) is the number of tags with which the item ij was marked, ki (tz ) is the number of items connected to the tag tz , and ax,y is an element of the matrix AIT . We define the final score of an item ij as the linear com- t1 t2 t3 t4 t5 t6 bination of the two indices (2) and (3) as follows. fj = λ · fja + (1 − λ) · fjs , (4) Figure 6: Affinity and similarity indices in a tripartite folksonomy where λ ∈ [0, 1] is a tunable parameter of the algorithm graph. with which we can weigh the links between users and items and those between items and tags. 3.4. Extension to tripartite graphs 4. Pervasive PLIERS The three adjacency matrices introduced in Section 3.1 can be represented as a tripartite graph G = (U, I, T, E, F ) p-PLIERS implements a framework for the representation where U is the set of users in the folksonomy, I is the set and exchange of context information describing contents of items, T is the set of tags, E is the set of links between and users among nodes of an opportunistic network. It users and items, and F is the set of links between items exploits PLIERS for the evaluation of the collected knowl- and tags. If an edge el,s between the user-node ul and the edge and to provide personalized recommendations to the item-node is exists, we say that the user ul was interested users about available interesting contents. in the item is (i.e., she created or downloaded it in the To this aim, p-PLIERS supports the maintenance of a lo- past) and, in a completely analogous way, if the item-node cal representation of the knowledge about the users, items is is connected to the tag-nodes ti , . . . , tk , we mean that and tags in the network and their relations (i.e., a local the item is was tagged with ti , . . . , tk . knowledge graph - LKG) on each node of the network. Recommender systems based on tripartite rather than bi- This knowledge is built by merging the information about partite graphs exploit all the information available from contents created or downloaded by each local user, with social tagging systems, and this can lead to better results the local knowledge of other nodes gathered during phys- in terms of recommendation accuracy and precision, as we ical contacts. The nodes use this partial knowledge to show in Section 6. evaluate the relevance of items (content) carried by other As depicted in Figure 6, in a tripartite graph we can define nodes in proximity, with respect to the interests of their two characteristic values: local user. The evaluation is performed locally, without re- quiring centralized control and without the need of having Affinity index: is the score obtained by PLIERS when a global knowledge of the network (i.e., global knowledge applied on user-item links. This indicates the degree graph - GKG). We represent the LKG of each node as a tri- to which an item is close to the user’s interests and partite graph as described in the previous section. When preferences. a node creates a new item, it updates its LKG by inserting Similarity index: is the score obtained by PLIERS when the relation between the user entity that represents it in applied on tag-item links. This indicates the degree to which two items are similar in terms of the tags Algorithm 1 p-PLIERS: Content discovery and evalua- with which they are labeled. tion in opportunistic networks More formally, we define the affinity index of an item ij 1: procedure Encounter(node n) for a target user ut with the following formula: 2: Send my LKG to n 3: nLKG ← Receive n’s LKG Xn X m al,j · al,s · at,s |Us ∩ Uj | 4: Update my LKG with nLKG fja = · , (2) ki (ul ) · ku (is ) ku (ij ) 5: I ← new discovered items l=1 s=1 where Uj is the set of users connected to the item ij , ku (ij ) 6: for each item i ∈ I do is the popularity degree of the item ij (i.e., the number of 7: score(i) ← evaluate i with PLIERS connected users), ki (ul ) is the number of items connected 8: end for to the user ul , and ax,y is an element of the matrix AUI . 9: end procedure 7
1 1 500 450 400 0.1 350 Frequency 0.1 300 P(X > x) P(X > x) 250 0.01 200 150 0.01 100 0.001 50 po r 15 s tio lla po ay ti be a2 d de 5 nk ita n ex ge w is o ro 01 oo na tare ex ld ba llav 20 fo ne m at un im na tti un roh at m m g ze 0.001 0.0001 1 10 100 1 10 al Number of tweets per user Number of tags per tweet (a) (b) (c) Figure 7: Descriptive statistics of the Twitter dataset used for the WFD@Expo2015 scenario. (a) CCDF of the number of tweets per user, (b) CCDF of the number of tags per tweet, and (c) Frequency of use of the most used tags. the graph and the created item, and the relations between The static scenario allowed us to evaluate the accuracy the item and its tags. When a node encounters another of PLIERS using a standard evaluation method (i.e. link node in the network, the two nodes exchange their LKGs, prediction, as detailed in the following). For this scenario, and locally integrate them with the received information. we downloaded the tweets generated during World Food The basic operations of our solution are summarized in Day at Expo 2015 (WFD@Expo2015) in the urban area Algorithm 1. of Milan, and we built a tripartite graph composed by Our solution relies on the fact that, as we will demonstrate users, tweets, and their hashtags. This represents a realis- with simulations, using the local knowledge of each node tic folksonomy graph of online tagged contents related to is sufficient to correctly evaluate the relevance of contents, a popular event. and leads to recommendations that are compatible with For the dynamic scenarios, we considered different situa- the results that one would obtain with a global knowledge tions in which pervasive communication systems may be about all the information available in the whole network at used. Specifically, the scenarios are characterized by vari- the time of the evaluation. This supports the decentral- able number of people moving with different mobility pat- ization of recommendations, and allows us to efficiently terns generated by human mobility models or derived from apply recommender systems, and PLIERS in particular, real contact traces. Each scenario represents a specific to dynamic mobile environments. application use case: (i) a big event (WFD@Expo2015), (ii) a conference (ACM KDD’15), and (iii) an urban area (Helsinki city center). We performed a set of simulations 5. Experimental Evaluation: General Description based on these scenarios, where each person is expected As a first set of experiments, we evaluated the accuracy to use a mobile device able to communicate through D2D of PLIERS recommendations with respect to other recom- communications with other devices in proximity. In addi- mender systems typically used for content dissemination in tion, each device allows the users to generate and tag con- opportunistic networks. To do so, we considered a static tents over time, and uses p-PLIERS to identify potentially user-item-tag graph obtained from Twitter to evaluate the interesting contents created by others. We think that the accuracy of the recommendations given by PLIERS. Then, scenarios cover a significant set of cases where pervasive we thoroughly evaluated p-PLIERS in three different dy- and mobile systems may be applied, and represent thus namic opportunistic scenarios. In both cases (static and the basis for realistic experimental evaluations. dynamic), we used graphs derived from real online tagged contents obtained from Twitter, and we set the parameter 6. PLIERS Experimental Evaluation in a Static λ in equation 4 to 0.5. Scenario Table 3: Statistics of the Twitter dataset used for the simulations. We compared the performances of PLIERS with respect to the other recommender systems proposed in the litera- Statistic Value ture for opportunistic networks, namely user-based Col- N. of tweets 5,260 laborative Filtering and Tag Expansion. To assess the N. of users 2,946 accuracy of the recommendations, we performed a link N. of unique hashtags 3,292 prediction task on a tripartite graph. This is a standard Average n. of hashtags per tweet 2.12 way to evaluate and compare recommender systems [16]. Max n. of hashtags per tweet 13 In essence, the technique consists in randomly removing Min n. of hashtags per tweet 1 a small portion of links from a folksonomy graph, then verifying whether the recommendations generated on the 8
1200% 100% PLIERS vs Tag-Exp PLIERS vs CF PLIERS vs CF PLIERS vs Tag-Exp 1000% 80% 800% Precision Gain Recall Gain 600% 60% 40% 60% 40% 20% 20% 0% 0% 10 20 30 40 50 60 70 80 90 100 10 20 30 40 50 60 70 80 90 100 k k Figure 8: Precision gain obtained by PLIERS in a complete knowl- Figure 9: Recall gain obtained by PLIERS in a complete knowledge edge scenario. scenario. pruned graph coincide with the removed links. Intuitively, not be able to recover the removed links, reducing the sig- a good recommender system should be able to identify the nificance of the experimental results. The percentage of items that were originally connected to the removed links. links between users and items we deleted from the original graph is equal to 1.4%. We computed the performances 6.1. Dataset Description of each method using the measures of Precision (P) and We downloaded a dataset of 5,260 tweets generated during Recall (R) [16], defined as follows: World Food Day at Expo 2015 through Twitter Stream- 1 X 1 X 1 ing API. To do so, we applied a filter to Twitter API by P = , (5) indicating a set of hashtags (e.g., #Expo2015, #ExpoMi- |U | |T (u)| pos(t) u∈U t∈T (u) lan2015, and other possible combinations) and we queried only tweets generated in the area of Milan. We decided to 1 X |L(u) ∩ T (u)| R= , (6) validate PLIERS using tweets generated in a limited area |U | |T (u)| u∈U during a thematic event such as the WFD@Expo2015 since where U is the set of users for whom we have removed we think that the semantic relationships between the rel- links, L(u) is the recommendation list for the user u, T (u) ative folksonomy entities is significant, and we expect to is the set of links removed from the user u, and pos(t) is obtain meaningful and realistic recommendations. the position in which the removed item t appears in the The number of different users that generated the down- list of recommended items L(u). loaded tweets is 2,946. We removed the hashtags that we used to filter the data (e.g., #Expo2015, #ExpoMilan) 6.3. Results as they are contained in all the tweets, and they are thus not useful to describe the contents. In Table 3, we report Figures 8 and 9 depict respectively the Precision gain and the statistics of the dataset. Figure 7b and Figure 7a de- the Recall gain of PLIERS with respect to both CF and pict the CCDF of the number of tags per tweet and the Tag Expansion. The gains are expressed in percentage and number of tweets per user. It is worth noting that both identify the improvement obtained by PLIERS in terms of CCDFs show a long-tailed distribution, a typical result for precision and recall with respect to the other algorithms. social networks. In addition, in Figure 7c, we depict the The parameter k in the figures represents, for CF, the frequency of use of the 10 most used tags in the dataset. number of users similar to the target user to be considered in the calculation, and for Tag Expansion the k tags more 6.2. Link Prediction Task correlated with the tags of the target user to be considered. We removed 1 link from each user connected to at least Higher values of k should therefore lead to better recom- 5 items with popularity greater than 1, and then we ran mendations. It is worth noting that PLIERS outperforms PLIERS and the other systems on the updated graph to the two reference algorithms for both measures, receiving a generate the recommendation list for each user. Then, precision score up to eight times higher than that obtained we calculated the percentage of removed links that are in- by Tag Expansion and a score 40% higher than CF, even cluded in the recommendations of each algorithm (i.e., the in case of very high values of k. With regard to the Recall “recovered links”). We selected the links to be removed measure, PLIERS obtains a score around 50% higher than in this particular way in order to avoid the complete iso- that received by Tag Expansion and around 30% higher lation of the items connected to just a single user. In than that of CF. fact, in those cases, all the recommender systems would 9
These results indicate that PLIERS, being able to exploit knowledge about the associations between the contents in all the information contained in the folksonomy graph (i.e., the network and the nodes which created them, and the user-item-tag relationships), obtains better results than associations between the contents and the tags associated the state-of-the-art solutions used in opportunistic net- with them. works, which consider only partial user-item or item-tag The LKG of each node is initially empty. Every time a relationships. new item (tweet) i is created by node u, u, i, and each tag t associated with i are added to LKGu (if not present), as well as the link connecting u and i (eui ) and all the links 7. p-PLIERS Experimental Evaluation in Dy- connecting i and each of its tags t (fit ). namic Scenarios We also maintain GKG, the global tripartite graph that In [11], ElSherief et al. established theoretical limits on represents the complete knowledge of all the users, items, the performance of knowledge sharing in opportunistic so- and tags in the network at a certain time. cial networks. They calculated how many contacts are needed to ensure that nodes are able to well approximate 7.2. Measures the global knowledge (i.e., all the available information At each time step of the simulation (i.e., every minute), in the network) for different sharing policies in a scenario the simulator calculates the following measures: where the global knowledge is static and defined a priori. We performed an empirical evaluation similar to that car- a. Average similarity between the LKG of each node ried out in [11], but using more complex and realistic refer- and the GKG. ence scenarios, where the information is dynamically gen- b. Average similarity between the vector of recommen- erated over time by the nodes. Specifically, in our reference dations generated by PLIERS on the local and global scenarios, the users are characterized by different mobility graphs. and content generation patterns. To calculate the similarity between tripartite graphs, we first flattened the graphs, obtaining two adjacency lists L1 7.1. p-PLIERS Pervasive Simulator and L2 , and then calculated the Jaccard index J on these To simulate the dynamic scenarios, we implemented a lists: high-level simulator that emulates the execution of p- PLIERS on a set of mobile nodes that generate contents |L1 ∩ L2| J(L1 , L2) = (7) over time. The simulator requires in input (i) a set of |L1 ∪ L2| traces defining the physical contacts between nodes over In the same way, the similarity of two ranked recommen- time and (ii) a list of contents generated by the nodes and dation vectors R1 and R2 is calculated as J(R1 , R2 ). marked with a timestamp. Then, it calculates statistics Although the Jaccard index is a standard and widely used to evaluate p-PLIERS both in terms of its ability to ap- measure of similarity, it does not consider possible differ- proximate the global knowledge from a local perspective, ences in terms of position in the rankings given by the rec- and the accuracy of the given recommendations. The sim- ommendation vectors, but rather considers only the pres- ulation process proceeds in discrete steps, each of which ence or absence of recommended items in the two vectors. represents 1 minute of simulated time. For each contact To account for possible differences in terms of position of in the simulation steps, indicated by the timestamps in the same elements in the rankings produced either using the contact traces, the involved nodes establish a D2D local or global knowledge graphs, we also calculated a sim- communication between each other and they perform the ilarity measure based on Spearman’s Footrule [6] defined operations of Algorithm 1. as follows: Note that the simulator is at a higher abstraction level than other existing network simulators (e.g., generic net- P d(x, R1 , L2 ) work simulators such as ns-31 or OMNeT++2 , and simula- S(R1 , R2 ) = 1 − x∈R1 ∩R2 , (8) max(|R1 |, |R2 |) tors specific for opportunistic networks such as TheONE3 ). Thus, at this level, we do not consider network-related is- where sues, but we assume that, when two nodes are in contact, the communication channel between them was successfully ( |R1 (x) − R2 (x)| if x ∈ R1 ∩ R2 established. d(x, R1 , R2 ) = (9) To represent the knowledge about the available content in max(|R1 |, |R2 |) otherwise, the simulated mobile network, each node u maintains a local tripartite graph LKGu that represents its (partial) where R(x) is the position of object x in the ordered set R. S is similar to the Jaccard index, but, in addition, it penalizes possible differences in terms of rankings of the 1 https://www.nsnam.org/ elements in the resource vectors. 2 https://omnetpp.org/ In addition, for each simulation step, we calculated: 3 https://akeranen.github.io/the-one 10
... Figure 10: Map of Expo 2015 area with the position of five of the simulated communities. Note that the grid in the figure is only an example to show how we divided the area for the simulations, but it does not represent the real grid used. c. Number of contents generated by the nodes over presence of “communities”, i.e., each node has a higher time. probability to meet nodes within its own community than d. Average number of contacts between nodes over nodes belonging to different communities. This fits well time. with WFD@Expo2015 scenario, as the area was divided into several pavilions. We can reasonably assume that These measures are used to characterize the contact traces the mobility of people inside each pavilion was lower than and the contents used in the different scenarios. We an- the mobility of people moving between pavilions, and that, ticipate that the synthetic and the real traces that we consequently, the density of intra-community contacts was used show similar properties (e.g., the contact traces used higher than that of inter-community ones. We generated for the WFD@Expo2015 scenario show values compatible mobility traces through HCMM for 60 communities, ap- with those used in the conference scenario), thus support- proximately the number of pavilions at Expo (see Fig- ing the significance of the synthetic trace. ure 10 for a graphical representation of a map of the area We also calculated all the measures by considering that the of Expo and the communities used in HCMM). We set interests of nodes may be limited in time. To do so, we cal- a probability of inter-community contacts of 0.1 (this pa- culated the measures using only the most recent contents rameter is called “rewiring” probability in HCMM)4 . The generated in the network and considering only the infor- speed of nodes ranges between 0.01 m/s (almost steady mation about these contents in the folksonomy graphs. In nodes) and 1.86 m/s (nodes representing people walking the simulations, we considered different “expiry date” for relatively fast). the contents, i.e., 1, 2 or 3 hours. Using HCMM, we simulated the contacts between nodes for 13 hours, to cover the timespan of the 7.3. Scenario 1 - Big Event: World Food Day @Expo2015 WFD@Expo2015, from 10am (opening time) till 11pm As a first dynamic scenario for the evaluation of p- (closing time). We considered that each node has a trans- PLIERS, we considered a big event attended by a large mission range of 20 meters (the maximum distance to avoid number of people in a relatively large area. In this sce- that nodes in adjacent pavilions are constantly in contact nario, accessing the Internet from mobile devices may be with each other). When two nodes are in the transmission problematic and thus obtaining useful content from D2D range of each other, HCMM generates a contact between communications would provide an important source of in- them. We think that HCMM mobility model with these formation for the users. We considered the World Food settings can well approximate the real mobility of people Day at Expo 2015, organized on October 16, 2015. We as- during WFD@Expo2015. sumed that people attending the event were able to create tagged contents from their mobile devices, and that other 7.3.2. Content Generation attendees might have been interested in obtaining these To have a realistic representation of multimedia contents contents through D2D communications. generated during the WFD@Expo2015, we downloaded the tweets generated during the event, using the Twit- 7.3.1. Mobility Traces ter Streaming API. To be sure to obtain only information We simulated a set of nodes moving within an area of 300 x 2,000 meters, which coincides with the Expo area 4 We performed the same simulations described in the following in Milan. Each node represents a person equipped with a also with rewiring probabilities 0.2 and 0.3 (thus considering more mobile device. The mobility of nodes is simulated through dynamicity in the movements) and this resulted in a general, al- though slight, improvement due to a higher inter-community mo- the HCMM human mobility model [5]. The model gener- bility of the nodes. Since the contact traces generated by setting ates contacts between nodes according to a Pareto dis- a rewiring probability of 0.1 represent the worst case scenario, we tribution, as observed in real traces, and considers the present only the results for this parameter. 11
1 1 100 90 250 users 500 users 80 900 users 0.1 70 Frequency 60 0.1 50 P(X > x) P(X > x) 40 0.01 30 20 0.01 10 0.001 0 ita gm ood m 015 tio lla ay un to l po t ro 15 er 250 users 250 users pa ex lis ng na are v ld ze 20 at do ne lla 500 users 500 users f a2 na hu t de at 900 users 900 users ro 0.001 0.0001 be 1 10 100 1 10 al Number of tweets per user Number of hashtags per tweet (a) (b) (c) Figure 11: Descriptive statistics of the Twitter datasets used for the WFD@Expo2015 dynamic scenarios. (a) CCDF of the number of tweets per user, (b) CCDF of the number of tags per tweet, and (c) Frequency of use of the most used tags. 350 160000 250 agents 900 agents 250 agents 500 agents 500 agents 300 900 agents 140000 Number of contents Number of contacts 120000 250 100000 200 80000 150 60000 100 40000 20000 50 0 0 11 m 12 m am pm pm pm pm pm pm pm pm pm pm 11 m 12 m am pm pm pm pm pm pm pm pm 10 m pm a a a a p 10 1 2 3 4 5 6 7 8 9 10 10 1 2 3 4 5 6 7 8 9 Figure 12: Overall number of contacts between nodes during the Figure 13: Number of contents generated by nodes during the simu- simulation for the Expo 2015 scenario. lation for the Expo 2015 scenario. related to Expo, we filtered our requests in the same way number of tweets generated by users and the number of as done for the dataset collected for the evaluation in the hashtags per tweet respectively, for the samples extracted static scenario presented in Section 6. With respect to for the three settings (250, 500, and 900 nodes). The dis- the dataset used for the static scenario, we downloaded tribution of hashtags per tweet are approximately the same only tweets generated from 10am till 11pm, to cover the for the three samples: the majority of tweets have just one same span of time of the simulated contact traces. The or two hashtags associated with them, whereas just few obtained dataset contains 4,817 tweets generated by 2,660 tweets are linked to more hashtags. Similarly, the number users, and contains 3,008 unique hashtags. of users who generated a high number of tweets is low, We associated each node of the simulated contact traces when the majority of them has generated just one or two with a Twitter user. In this way, the tweets of the Twitter tweets. user associated with each node represent contents that the Figure 11c depicts the frequency of use for the 5 most node generated during the simulation. used tags for each sample. As expected, the most used tags are related to topics about the Expo2015 exhibition, 7.3.3. Simulation Settings the WFD and some special guests (e.g., “mattarella” refers We considered different settings for our simulations in or- to Sergio Mattarella, who is the actual President of Italy der to analyze the possible impact of different parame- and was already in charge during the Expo in 2015, and ters on the results. Specifically, we varied the number he was invited as a special speaker for the WFD event). of nodes, simulating 250, 500, and 900 nodes in the con- Figure 12 depicts the total number of contacts between sidered area. As the number of users that tweeted during nodes for each hour of simulation. It is worth noting that, WFD@Expo2015 was higher than the number of simulated for each setting, the number of contacts between nodes nodes (900 is the maximum number of nodes supported by is roughly constant for the entire simulation time, but it HCMM), we sampled, for each setting, the Twitter users greatly differs from one setting to another, providing thus with a uniform random sampling. significantly different cases in terms of opportunities of In order to give an idea of the dynamics of content gener- contact between nodes in the simulations. ation, Figures 11a and Figure 11b depict the CCDF of the Similarly, Figure 13 depicts the number of contents 12
1 1 1 Average Spearman similarity between Average Jaccard similarity between local and global resource vectors local and global resource vectors 0.8 0.8 0.8 Graphs similarity 0.6 0.6 0.6 0.4 0.4 0.4 0.2 0.2 0.2 250 agents 250 agents 250 agents 500 agents 500 agents 500 agents 900 agents 900 agents 900 agents 0 0 0 10am 1pm 4pm 8pm 11pm 10am 1pm 4pm 8pm 11pm 10am 1pm 4pm 8pm 11pm Time Time Time (a) (b) (c) Figure 14: Results for the WFD@Expo2015 scenario. (a) Average Jaccard similarity between local graphs of the agents and the global graph, for different number of agents. (b) Average Spearman and (c) Jaccard similarity between the PLIERS resource vectors obtained on the local graphs of the agents and those obtained from the global graph, for different number of agents. (tweets) generated by nodes for each hour of simulation. graphs converge to those on the global graph. The pres- The distribution is similar for the three settings: during ence of negative peaks in Figure 14b that are not visible in the lunch break (from 12am till 1pm) there is an increment Figure 14c highlights the higher sensitivity of the similarity in the generation of content, then it remain stable until the measure S compared to the Jaccard index. These nega- evening hours and decreases closer to the exposition clos- tive peaks indicate a slight decrease in the performances ing time. It is worth noting that no tweets containing the of PLIERS, which are nonetheless in the order of ∼15% selected hashtags have been generated between 10am and maximum, and the curve remains always above 0.6 for all 11am, probably due to the low number of visitors in the the settings, at least after 4 hours of simulated time. very first part of the event. Figure 25a depicts the average similarity between the lo- cal graphs of the agents and the global graph, where only 7.3.4. Results information generated not more than 1, 2 and 3 hours Figure 14a depicts the average Jaccard index between the (of simulated time) before the calculations is respectively local graphs of the agents and the global graph, for the considered. Note that the figure is related to the simula- different numbers of simulated agents. From the figure, tion with 900 agents. The differences in terms of average we can note that, after a certain time, the similarity be- similarity of the curves related to the simulations for 3 tween the local graphs and the global graph, on average, and 2 hours are similar to the results obtained without reaches a very high level for all the different settings con- temporal limitations. This tells us that, nodes can make sidered. After approximately two hours of simulated time, accurate recommendations even if they maintain limited the similarity reaches ∼80% for the cases with 900 and information about past history and purge the older infor- 500 agents, with slight variations, whereas ∼4 hours are mation from their memory. The case where only informa- needed to reach the same level of similarity for the case tion generated in the last hour is considered, the similarity with 250 agents. After that, the similarity remains quite is slightly lower, but it is still around 75%, which could be stable until the end of the simulation for all the settings. enough for some applications. This means that even with a small number of nodes (250) that generate items in a relatively large area (the whole 7.4. Scenario 2 - Conference: KDD 2015 area of Expo), the opportunities of contact are enough to As a second dynamic scenario, we considered a school cam- have a good approximation of the knowledge about items pus during a conference event, where people stay most and tags generated in the network, even after few hours. of the time within rooms, but they regularly gather at Interestingly enough, the differences between 250 and 900 breaks (e.g., coffee or lunch breaks). This is a typical sce- agents does not have a substantial impact on the curve. nario where pervasive and mobile communications could The similarity between the results obtained by PLIERS on improve content delivery services and user experience in local and global graphs are depicted in Figure 14b and 14c. general. We assume that the contents generated during the Specifically, the former figure depicts the average similar- conference can be shared by nodes through wireless com- ity S (derived from Spearman’s Footrule) for the recom- munication, and we assess the potential impact of PLIERS mendation vectors of the two cases, while the latter figure in the accuracy of recommendations using only local infor- depicts the average Jaccard index on the same recommen- mation obtained by mobile nodes from direct interactions dation vectors. From the figures, it is worth noting that, on with other nodes, similarly to the previous scenario. average, the differences between the results obtained from local and global graphs are quite small. In addition, the 7.4.1. Mobility Traces figures clearly indicate that the results obtained on local For this scenario, we used real contact traces representing the physical interactions of a group of students, professors, 13
You can also read