PHONEBOOK SEARCH ENGINE FOR MOBILE P2P SOCIAL NETWORKS
Balázs Bakos, Lóránt Farkas and Jukka K. Nurminen
Nokia Research Center, P.O. Box 392, H-1461 BUDAPEST, Hungary
{balazs.bakos, lorant.farkas}@nokia.com
Nokia Research Center, P.O. Box 407, FIN-00045 NOKIA GROUP, Finland
jukka.k.nurminen@nokia.com

ABSTRACT

Search engines generally lack the trust and personalization dimension that recommender systems need to answer questions like: “I need a reliable plumber close to my house”. In order to obtain personalization, social relevance, and a decent amount of privacy, the search database itself needs to be personalized. One possible dimension of personalization is the social neighborhood of the searcher.

One projection of a person’s social neighborhood is the set of phonebook entries in a mobile phone. In particular, the phonebook links represent a readily available infrastructure to create a peer-to-peer social network for socially relevant search.

In order to demonstrate the concept, our work introduces a novel search engine algorithm for this kind of social network. The concept has been implemented on Series 60 Symbian platforms over GSM using short message exchange. We also discuss privacy aspects and possible enhancements in the social dimension of the search.

KEY WORDS

peer-to-peer (P2P), time-to-live (TTL), search, ranking, smart phone

1. Introduction

Efficient information retrieval is increasingly important both for our professional and personal life. A clear indication of this is the success of search systems such as Google. While they are efficient tools for finding product and company information worldwide, they are not very useful in searching for small businesses in your neighbourhood, e.g. reliable plumbers or cozy, small restaurants.

There are several reasons for the shortcomings. First, many small businesses or craftsmen do not have web pages. Secondly, in choosing such a service the recommendations of previous customers are often more important than the actual products of the company. There are no big differences between the services offered by different plumbers, but the quality of the work can vary a lot. Thirdly, the need for such services often arises in an ex tempore fashion.

In this paper we investigate the simple question “I need a reliable plumber close to my house”. As the word “reliable” leaves a lot of room for interpretation, we consider the plumber reliable if your friends or acquaintances have used his services and have been satisfied. We are thus using the social network of a person as a source of recommendations.

Using mobile phones to solve this problem is promising. Most small businesses and craftsmen use mobile phones anyway, the increasing capabilities of mobile phones make such search applications possible, and the person searching for the service usually has a mobile phone at hand, even in urgent situations. Most importantly, however, mobile phones already have the social network defined.

The address book of the phone, and the phone numbers that it contains, link the phone users to each other, forming a highly interconnected network of mobile phones. The network has a very high average node degree, which gives it a very high potential to become the underlying networking layer for various applications. This is the approach we followed in the present work.

The rest of the paper is organized as follows: Section 2 presents an overview of the prior art. Section 3 analyses the role of the phone contacts in the application. Section 4 presents our solution. Section 5 discusses the solution from several aspects. Section 6 discusses possible future solutions, including some privacy and confidentiality aspects and centralized solutions that would reduce the traffic at the price of reduced privacy. Finally, Section 7 concludes our paper.

2. Previous Work

The worldwide web with its generic or more specific keyword search engines partially implements the same functionality as our approach. Searching for a good professional in a given area can be as simple as entering an appropriate combination of keywords in Google [1] or Yahoo [3], e.g. “plumber New York”. The outcomes will be more or less reliable and related to the plumber profession in the area of New York. Both Google and Yahoo additionally offer a yellow pages service containing names, addresses and phone numbers of
persons claiming to be plumbers, along with maps where they can be reached. The outcomes for the word “plumber” are actually quite spectacular.

However, if we turn now to other jobs/hobbies, like “fisher”, there will be a huge difference. It turns out that there are a large number of persons called Fisher. The system does not differentiate very well between names and professions: entering the word “fisher” in the yellow pages will result in many Fishers having various professions (e.g. attorney, doctor etc.). In addition, yellow pages need to be updated in order to reflect recent changes; we would like this to be done automatically, which is not entirely possible in these systems.

The search for places is also straightforward: yellow pages help us here, too. Entering the combination “restaurant New York” will return dozens of restaurants of various kinds in the area of New York. However, this lacks the quality dimension: we would probably also like to get a recommendation, more or less elaborate, perhaps something similar to what somebody could read in a Lonely Planet book: good cooking, low budget, absolutely to be tried.

[2] is a similar service for the UK. In addition to finding people, business search is also possible, and it is possible to navigate in a list of “most wanted” businesses. The search results are much more detailed than in the Google case.

[1], [2] and [3] all lack the recommendation dimension: it is not possible to rank the person or the service. In addition, a possible third dimension is also missing: it is not possible to obtain hits in the social neighborhood of the requestor. To be more specific, this relates to the ranking dimension in the sense that somebody would probably rely more on the opinion of a second person if this person were his friend or the friend of a friend, in other words, a socially relevant person.

Previous work in the area of social proximity is manifold. Applications range from intelligent learning systems based on user contact logging and analysis to systems that adapt themselves to the mood and generic context of the user; these all deal with various dimensions of social relevance. However, relatively few applications map social relevance to a certain goodness of transferred data or content. [4] and [5] treat reputation and trust models in peer-to-peer networks, which are to some extent related to our application. Reputation and trust of content sharers are evaluated and propagated based on a number of parameters: bandwidth, quality of content, variety of content, type of content, online/offline time ratio etc. This is fairly straightforward in content sharing. However, when a person or a place is ranked, the possible set of parameters depends on the query itself: a professional should be rated on his professional quality rather than on his general characteristics. Therefore a generic set of parameters cannot be formulated in the phonebook search application for all possible cases.

Finally, [6], [7] and [8] are examples of centralized databases of communities of various kinds: [6] and [7] are related to professions and jobs, while [8] is more related to personal networks of friends and acquaintances. These systems are focused on specific search areas and are centralized: they all presume that the user data is present in a central location and that the entered data closely matches the user himself and his social proximity. This has several drawbacks. Users must explicitly update and maintain their contacts in the system. Moreover, the scalability, security, and failure recovery problems of centralized systems apply.

3. Phonebook: Infrastructure versus Distributed Database

The phonebook of a smart phone contains a large amount of relevant personal data. We can enter here data related to availability such as address, mobile and landline number, fax, beeper, e-mail, URL. In addition to that we can also enter personal data such as the person’s birthday. Furthermore, the phone contacts database lets us define programmatically as many fields as we like (for details see e.g. the Symbian Contacts Model [10]).

Unfortunately the very appealing idea of using the phonebook not only as an infrastructure, but also as a distributed database, has some limitations. To illustrate, we compare in Table 1 certain aspects of a brief data mining of three sample phonebooks on smart phones in our attempt to find a person whose profession is “painter”:

Table 1. Phone contact statistics

                                                    Coll. 1   Coll. 2   Coll. 3
“Painter” present                                   x         -         -
Has painter contacts                                x         x         x
Percentage of contacts with reference to job/work   3%        1%        1.5%
“Local” phone numbers                               10%       0.5%      12%

As shown in Table 1, the phonebook, although fit for the purpose of a highly connected social network, doesn’t contain a large amount of useful information. For example, if somebody tried to look for meaningful information about persons (professionals: a painter, persons having a given hobby etc.) or places (restaurants, shops etc.), he would seldom find keywords in the phonebook indicating that a given contact is a painter or that a given name is a restaurant’s. The following problems were detected:

1. All three analyzed phonebooks contained at least one contact who is a painter, but the phonebook contained the word “painter” only in 2 cases out of 3. One possible explanation is that people generally store contacts under their names if they know them well enough, and keep their jobs in mind instead of storing the job in the contact list as well.

2. More generally speaking, usually only a small percentage of contacts have their job field filled in the
phonebook. In addition, sometimes the profession is stored under the name, so a full keyword search in all the text fields is needed in order to find the relevant contacts. In addition, in most languages some family names are in fact names of professions, possibly leading to false hits.

3. Phone numbers are not always shown correctly by some GSM networks. If the GSM number is local, some networks would display it in its local form, e.g. 0620… instead of +3620…. If the user stores the number from the phone logs, the number will not be usable from abroad. Also in this case a phonebook crawler would not know the fully qualified phone number. In addition, the user might fill the wrong field, e.g. the mobile number instead of the landline. In this case a short message would not even make sense. In fact, mobile numbers might be stored in landline fields and vice-versa; the label of the field does not guarantee the type of phone number, only the fact that the stored number has an adequate phone number format.

4. There is always the problem of users having different mother tongues. People generally use their mother tongue in their mobiles. This makes it very hard to find relevant data unless some kind of translation plugin is employed. This is especially true in multinational environments where people of different nationalities get socially close.

From these statements it is obvious that today the phonebook can be regarded more as an appropriate infrastructure than as a universal container of significant user-related data. Therefore we follow this second approach of using the phonebook only as a link container, not as a source of user-related details.

4. Our Solution

The phonebook is an “always on” resource. On the one hand, our contacts do not change their mobile number very often. On the other hand, the mobile phone of our friend might be momentarily switched off; however, when he switches it on, the query would be serviced. Phonebook search is the materialization of our attempt to implement a search engine on this infrastructure. In our implementation we have used short messaging (SMS) to transfer messages between phones. SMS is obviously not optimal for this use case, but it is widely available and supported in almost all phone models. In Sections 5 and 6 the various aspects of messaging technologies will be covered in more detail.

In its most basic form the application provides keyword search. If a user simply wants to find a professional, it is as simple as entering the profession in the query screen and pressing a button; results will then arrive on the screen of his phone. For more advanced use it is relevant that the search can be executed in two distinct phases. In the first phase the keyword search is executed; later, in the second phase, the user can request a recommendation of the hits, utilizing the social dimension of the search:

1. The name/availability of the persons having the persons matching the initial query in their phonebook. It is thus possible to contact those persons and discuss their experiences.
2. The ranks, given by these persons, as a first approximation of the quality of the person in the given area.

One of the key aspects of an efficient search is an adequately built profile. In previous sections we have already shown that simple keyword search can be problematic if there are similar items in other areas, as was the case with the word “fisher”. In our phonebook crawler the profile consists of three different types of data:

1. personal data
2. professional information
3. interest.

A sample profile and phonebook are shown in Table 2.

Table 2. Own profile and phonebook

Profile
Own phone        Name       E-mail         URL       Profession   Interests
36 4 45 5 6577   John Doe   John@doe.com   Doe.com   Trainer      tennis

Contacts
Phone            Group
36 4 45 5 6577   Search
36 2 576 7 6     Search
36 6 7467 6 47   Friends

As shown in Table 2, the profile currently contains a predefined basic set of fields that can be queried. In future scenarios this could be extended: some fields could take their value from a predefined set (e.g. marital status), others could be numerical (e.g. phone number, age) and yet others could be strings or sets of strings (e.g. personal interests, professional experience) in which the user is expected to enter as many keywords as he likes. It is important to notice that typically the person has very few entries in the profile table (typically just a single entry). This is the only data that the user has to explicitly update. The contacts table corresponds to the standard address book data of the phone (which the user is likely to maintain anyway). As the contacts table is only used to forward the queries to the linked persons, the attribute values, e.g. the name of the contact, do not have any effect on the search.

The person will be found based on the profile, using standard “SELECT * FROM profile WHERE …” type of SQL queries. The more fields of the profile are filled, the more likely he will be found during a search operation and the more areas he will be related to. For instance, the user can enter a large number of words in the field “Interests” and will be found accordingly by queries regarding each word. It is in his interest to do so.

It is to be noted that, as shown at the end of Section 3, the phone stores only the user profile of the owner. So it is stored now in a simple SQL database having only one record, that of the phone’s user. Incoming queries are simply matched with this record.
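The matching of an incoming query against the single stored profile record can be illustrated with a minimal sketch. It is not the Symbian implementation: the column names are taken from Table 2, while the schema, the helper names create_profile_db and match_query, the substring matching via LIKE, and the rejection of empty queries are all assumptions made for illustration.

```python
import sqlite3

# Field names follow Table 2; the schema itself is an assumption.
PROFILE_FIELDS = ("phone", "name", "email", "url", "profession", "interests")


def create_profile_db(profile):
    """Create an in-memory database holding the owner's single profile record."""
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f"{field} TEXT" for field in PROFILE_FIELDS)
    conn.execute(f"CREATE TABLE profile ({columns})")
    placeholders = ", ".join("?" for _ in PROFILE_FIELDS)
    conn.execute(
        f"INSERT INTO profile VALUES ({placeholders})",
        [profile.get(field, "") for field in PROFILE_FIELDS],
    )
    return conn


def match_query(conn, criteria):
    """Return the profile as a dict if every criterion matches, else None.

    criteria maps a field name to a keyword. Matching is a substring test
    using SQLite LIKE, which ignores case for ASCII letters.
    """
    unknown = set(criteria) - set(PROFILE_FIELDS)
    if unknown:
        raise ValueError(f"unknown profile fields: {sorted(unknown)}")
    if not criteria:
        return None  # assumption: empty queries are never answered
    where = " AND ".join(f"{field} LIKE ?" for field in criteria)
    params = [f"%{keyword}%" for keyword in criteria.values()]
    row = conn.execute(f"SELECT * FROM profile WHERE {where}", params).fetchone()
    return dict(zip(PROFILE_FIELDS, row)) if row else None


if __name__ == "__main__":
    db = create_profile_db({
        "phone": "36 4 45 5 6577", "name": "John Doe", "email": "John@doe.com",
        "url": "Doe.com", "profession": "Trainer", "interests": "tennis",
    })
    print(match_query(db, {"profession": "trainer"}))
    print(match_query(db, {"profession": "plumber"}))
```

Because field names are checked against a fixed list before they are placed in the SQL text and keywords are passed as bound parameters, a query received from another phone cannot inject SQL in this sketch.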
The search mechanism is shown in Fig. 1 in more detail. An example of a query could be: search for a person whose job is plumber and whose address contains the string Budapest. The user enters the desired query in his phone (step 1). In the next phase (step 2) the phone sends the query with the following parameters:

• to the predefined contacts (in the settings it could be set to “everybody”, “predefined group”, and “nobody”)
• with a preset time to live (TTL) field (in the settings this can be set), having as effect a larger or smaller propagation horizon in the social neighborhood

[Figure: Phone1 sends the query to Phone2 and Phone3, which forward it to Phone4, Phone5, Bill’s phone (plumber) and Bob’s phone (plumber). Steps: 1. Search: job = plumber; 2. Search sent to all/some persons in the phonebook (depending on search neighborhood); 3. Searches its own profile for matches; 4. Forwards to own contacts; 5. Returns matches; 6. Matches shown anonymously: Bill plumber (1 hit), Bob plumber (2 hit).]

Fig. 1. Search mechanism

The mobiles receiving the query check for a match with their profile (step 3). If there is a match, a query hit message is returned to the originator (step 5). In all cases, if the TTL is not expired (value = 1), it is decremented and the query is forwarded to the contacts of this phone (step 4). The query will reach all the phones within the range TTL of contacts of the query initiator and all query hits will be returned to the originator.

It is important to note that the returned query hit (step 5) does not reveal who the persons were that had the person matching the query as a contact, so the privacy of these people is not violated. It is also emphasized that this works automatically: no user intervention is required (on the other hand the user might set the application to reply and forward only if he explicitly chooses to reply and/or forward).

The replies are returned to the phone generating the query and the user is alerted about the incoming reply messages (step 6). In addition to that, not shown in Fig. 1, the reply messages are stored in a list of replies that can always be examined, as shown in Fig. 2a.

Fig. 2. a) List of replies b) Reply in detail

The list of replies does not show all the details of the reply message, only the phone number and the name of the person representing a match of the query. Depending on how much information this person had just disclosed in step 6, the list items can be further expanded to show all the received details of the match. It is a matter of settings how many of these details a reply message contains; this can depend on whether the asking person is …

As already stated, the usage can stop here. If the user wants to get more precise information/social relevance, he may continue with a rank request. This may have two distinct purposes: (i) to find out which people had a link to the found person and (ii) to find out their evaluation of him with respect to the given query. In this case the people who have the found match in their phonebook will be alerted and asked (i) whether they would like to reveal their identity to the requestor and (ii) to rank the person. If the user is not willing to respond then “no” answers are assumed. The ranking mechanism is shown in Fig. 3.

[Figure: Phone1 sends a rank request for Bob’s phone number through Phone2 and Phone3 to Phone4, Phone5 and Bill’s phone. Steps: 1. Rank request: <Bob’s phoneno>; 2. Rank forward; 3. Display rank request (“Phone 1 is asking info about Bob the painter? Reveal your identity? Rank Bob?”; can be turned always off by a preference selection); 4. Rank = 4 from <050-234678>, Rank = 5 from <051-987654>, Rank = 5 from <052-135792>; 5. Ranks shown.]

Fig. 3. Ranking mechanism

Fig. 4 shows two example screens when the user checks for the still unsolved rank requests (marked with a star).

Fig. 4. a) List of received rank requests; b) Detailed

The rank request is sent by the phone (step 1) after selecting the desired hit in the query hit list and pressing “Rate”. The rank request follows the same path and TTL patterns as the query itself. The mobiles in the path receive the rank request and alert the user of the rank request to be solved (step 3), but this only happens if the person to be ranked is found among the phone’s own contacts (e.g. phone 4 will display the rank request, but phone 2 won’t, since Bob is not his contact). This will be displayed in the same manner as the new SMS alert, using a sound alert and a modal dialog on the phone screen.
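The TTL-limited forwarding that both the query and the rank request rely on can be sketched as a small in-memory simulation. This is an illustration, not the SMS-based implementation: the function name propagate_query and the data layout are hypothetical, a TTL of n is assumed to mean that the message travels at most n hops from the originator, and the rule that each phone handles a given query only once is an added assumption that the text does not state.

```python
from collections import deque


def propagate_query(phonebooks, profiles, origin, keyword, ttl):
    """Simulate TTL-limited forwarding of a keyword query.

    phonebooks: phone -> list of contact phones (the network links)
    profiles:   phone -> dict holding that owner's own profile fields
    Returns hits as (phone, name) pairs in order of arrival. The forwarding
    path is not recorded, so intermediate contacts stay anonymous.
    """
    hits = []
    seen = {origin}  # assumption: a phone handles each query only once
    queue = deque((contact, ttl) for contact in phonebooks.get(origin, []))
    while queue:
        phone, remaining = queue.popleft()
        if phone in seen:
            continue
        seen.add(phone)
        profile = profiles.get(phone, {})
        # Step 3: each phone searches only its own profile.
        if any(keyword.lower() in str(value).lower() for value in profile.values()):
            hits.append((phone, profile.get("name", "")))  # step 5: return hit
        # Step 4: forward to own contacts while hops remain.
        if remaining > 1:
            for contact in phonebooks.get(phone, []):
                queue.append((contact, remaining - 1))
    return hits


if __name__ == "__main__":
    # Hypothetical topology loosely modelled on Fig. 1.
    phonebooks = {
        "phone1": ["phone2", "phone3"],
        "phone2": ["phone4", "bill"],
        "phone3": ["bill", "bob"],
    }
    profiles = {
        "bill": {"name": "Bill", "profession": "plumber"},
        "bob": {"name": "Bob", "profession": "plumber"},
    }
    print(propagate_query(phonebooks, profiles, "phone1", "plumber", ttl=2))
```

In this topology a TTL of 1 reaches only phone2 and phone3 and yields no hits, while a TTL of 2 also reaches Bill and Bob, which is the "propagation horizon" effect of the TTL setting.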
In all cases, if the TTL did not expire, the rank request is forwarded (step 2). Once the user checks the list of rank requests (new requests marked with a star, as shown in Fig. 4) and gives a rating, the rate solve message propagates to the querier (step 4) and is stored in the query hit in the “Rates” field, shown in Fig. 5.

Fig. 5. Contacts without and with received ranks

Additionally the application can also display the contact details of the persons who gave the ranks, so ultimately the user can call them and ask them personally what they think about the person matching the query.

5. Discussion

5.1. SIM and Costs

A mobile phonebook contains on average 50 to 100 entries (the number is not statistically valid: 10 samples were taken from colleagues and persons involved in academic research). In most cases the entries consist of names and phone numbers. In earlier mobile phone models these were stored on the SIM, which didn’t allow for entering other meaningful data about contacts.

In future scenarios technologies other than SMS will probably be employed. However, for the time being, it is recommended to use the “Selected group” setting instead of “Send to everybody”, since the cost of a short message is still not negligible today: multiplying it by the number of contacts, the cost of one search for the querier can be estimated as 50 to 100 times the cost of an SMS. The same applies to forwarded queries: even if somebody doesn’t query at all but forwards queries of others towards his own contacts, the costs generated for him by his contacts using the application can become excessively high.

5.2. Profile Exchange

As shown in Section 3, smart phones offer more enhanced possibilities: it is possible today to store a number of entries, among which numbers, addresses, some professional information and e-mail addresses. Sometimes it is also possible to use a field for additional comments, where text can be entered about a given entry. However, as shown in Section 3, the profile contained in the phonebook itself is not detailed enough: the users simply do not enter enough details about their contacts to build a full-featured distributed database. Two alternative solutions can be proposed, as follows.

One alternative is an incorporated profile exchange mechanism that would make it feasible to store the profile of a given contact in an expanded form in the phonebook or in a mirror database. The profile exchange could be triggered by certain events, for instance a call initiated towards this person or a short message sent to him. The advantage of this solution is that it can save one step in the path of the query: it is enough for the query to reach a person having the profile of the match, obtained e.g. through previous profile exchange. The main drawback is the need for the additional step of profile exchange and the need for an extended phonebook or mirror database, which requires additional storage capacity. A second drawback is the possibility of malicious contact data mining: as a reply to an empty query the replier would return each of his contacts. An application could protect the user from this by setting a lower limit on the number of significant characters in a query, or through some other method, but in any case it is a problem that has to be handled additionally.

The second alternative is the one presented in Section 4, in which the user stores only his own profile, e.g. in the form of a plain text file. In this case the phone contacts are used only as the networking layer. Hits are returned to the search originator if the query matches this profile, stored in the text file. The advantage of this solution is that it doesn’t require an extended phonebook or additional database mirror, along with the maintenance and updating tasks generated by the profile exchange events. The drawback is that it requires one more step: the contact of a person whose profile matches the query does not know whether the match will take place, so the additional hop from the contact to the person with the matching profile is needed. Weighing the advantages and drawbacks of the two solutions, it turned out that the second solution is better; therefore we incorporated this one in our application.

5.3. Speed

In case of TTL = 1, the results would mostly arrive within 10 seconds. This is the average round trip time measured in the networks of local operators. With a higher TTL, further results will arrive later as well. The exact time intervals also depend on the operator policy with respect to minimal time intervals accepted from a different operator towards its own subscriber. In the extreme case of switched-off mobiles some results could also arrive hours later. The essential point is that with basic short message exchange one would receive query hits within some seconds or at most tens of seconds, provided there are persons matching the query.

5.4. Startup Ramp and Usage Patterns

The application deployment starts to be useful if there are enough users with their application running. If the application does not run in a phone, the received messages would appear as normal short messages.
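The cost estimate of Section 5.1, including the cost that falls on users who only forward, can be made concrete with a rough back-of-the-envelope sketch. The function names are hypothetical, and the network-wide figure is only an upper bound under the stated assumptions: every phone runs the application, every phone has the same number of contacts, every phone forwards to all of them, and no two phonebooks overlap.

```python
def per_phone_sms(contacts_per_phone):
    """SMS sent by one phone for one query it originates or forwards."""
    return contacts_per_phone


def network_sms_upper_bound(contacts_per_phone, ttl):
    """Upper bound on SMS sent network-wide for one query.

    With d contacts per phone and no overlap, hop k of the propagation
    sends d**k messages, so the total is d + d**2 + ... + d**ttl.
    """
    return sum(contacts_per_phone ** hop for hop in range(1, ttl + 1))


def cost(sms_count, sms_price):
    """Total price for a number of messages; sms_price is an assumed tariff."""
    return sms_count * sms_price


if __name__ == "__main__":
    for contacts in (50, 100):
        print(contacts, per_phone_sms(contacts), network_sms_upper_bound(contacts, 2))
```

With 50 contacts, the querier sends 50 messages, which matches the "50 to 100 times the cost of an SMS" estimate, and each forwarding phone pays the same amount again. This is why restricting the search group matters more than the price of a single message.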
It is also to be noted that a person not interested in revealing his own details has not much use in running the application. It actually has drawbacks, since he knowingly forwards short messages on behalf of other users, generating unsolicited costs for himself. Further, the “free-rider” problem is also applicable: there could be users who search frequently but otherwise keep their application closed, and so do not collaborate in the forwarding. The application tries to avoid that to a certain extent by providing a unique value for the TTL and the search group, both for own queries and forwarded queries. It would then be impractical for a “free-rider” to keep on switching the TTL and the search group.

6. Future Work

6.1. Privacy

Our solution solves most of the privacy concerns typical of this kind of application. One exception is described in the following. If the messages are sent via SMS, as implemented in our application, each user can see the message originator and the content. This could lead to problems in cases when the searcher is not willing to disclose the subject of his query to anybody except the person matching the query. Possible use categories could include trade secrets, attorney-client privilege or medical non-disclosure agreements. A possible solution could be based on public-key cryptography. If user A sends a query encrypted with his public key, the other users encrypt their own profile fields using this key but are not able to actually decipher the query. The hit message is returned the same way as in the non-encrypted case.

6.2. Other Messaging Technologies

In the current generation of cellular networks an alternative to SMS could be our earlier work [10]. Cellphones connected via GPRS in various fashions would be a cheaper alternative. However, advanced network maintenance would be necessary because of the frequent disconnections met in GPRS, see [11] for details.

In the next generation of cellular networks presence information is stored and accessible by phones through IMS/SIP technologies, this probably being a cheaper solution and an enhanced version of the phonebook network. This would work as follows: the phonebook is extended to contain the SIP address of the contacts in addition to ordinary phone numbers. With that, UDP packets can be sent between the phones or, alternatively, TCP connections can be established between them. The messages would then be carried by IP instead of SMS, a much cheaper alternative at least in today’s scenario. But the whole concept of the application would remain unchanged.

6.3. Centralized solution

A more appealing alternative would be a centralized solution in which a central database would be used to store the messages or alternatively the phone link networks of individual users. This however raises even more the need for security, privacy and authentication.

It has to be emphasized that this “centralized” solution is different from the centralized solutions described in Section 2. Here the messages and/or the links between mobiles would be stored centrally. The search could then be executed in this centralized database and the number of message exchanges would be largely decreased. The distributed version could also function as a backup solution if the central database fails.

In the centralized case we have to decide exactly what will be stored in the database. Two options exist:

1. Store the phone numbers, profiles and phone link networks
2. Store only the query and rank request messages for the individual case.

Both solutions have potential advantages and drawbacks but are worth considering as enhancements.

7. Conclusions

In our work we present a phonebook crawler application that returns socially significant hits to queries about persons and places. The novelty is the communication infrastructure, which is the contact list. In order to obtain useful hits the users need to enter their own profiles. We put the idea into practice by implementing it on the Nokia 6600 platform and testing it over a couple of usage scenarios. Possible enhancements are suggested from the viewpoint of privacy and eventual partial or full centralization of the database of profiles.

References:

[1] http://www.google.com/help/features.html#wp, Google phonebook
[2] http://192.com, UK directory enquiry system
[3] http://people.yahoo.com, Yahoo! People search
[4] Y. Wang, J. Vassileva, Trust and reputation model in peer-to-peer networks, Proc. 3rd IEEE Int. Conf. on Peer-to-Peer Computing, Linköping, Sweden, 2003
[5] S. Marti, H. Garcia-Molina, Identity crisis: anonymity vs. reputation in P2P systems, Proc. 3rd IEEE Int. Conf. on Peer-to-Peer Computing, Linköping, Sweden, 2003
[6] US Patent Application 2003/45050, System and method for the provision of socially relevant recommendations
[7] Spoke, www.spoke.com
[8] LinkedIn, www.linkedin.com
[9] Friendster, www.friendster.com
[10] The Symbian Contacts model, http://www.symbian.com/developer/techlib/v70docs/sdl_v7.0/doc_source/reference/cpp/contactsmodel/index.html
[11] B. Bakos et al., Peer-to-peer Content Sharing in Wireless Networks, Proc. 15th Int. Symposium on Personal, Indoor and Mobile Communications, Barcelona, Spain, 2004