ICS 691: Trust and Social Capital on
    CouchSurfing and OkCupid
      BJ Peter DeLaCruz and Michael Claveria
Abstract
        Researchers in social computing traditionally evaluate online relationships under the same

assumptions that they use when evaluating offline relationships. At the same time, researchers

regard online relationships as a sphere separate from offline interaction. Any crossing between the two

occurs predominantly between friends, relatives, and acquaintances who are also members of those

online communities. However, some online social communities operate under the premise that people

will meet and form online relationships that translate into offline relationships through real-world

interactions. We examined two such online social communities, CouchSurfing and OkCupid, to

determine how online interactions affect potential offline interactions.

Introduction
        In many online social communities, members remain anonymous and interact solely through the

mediums of online communication without the intention of face-to-face encounters. However, with the

advent of Web 2.0 technologies, online participation produced more dynamic relationships between

users in online environments. In some online communities, face-to-face interactions between users are

essential for the success of the community. One example is CouchSurfing [1], an online community

where users agree to host or visit other people who live in different areas. A dating website like

OkCupid [2] is another example on which people typically sign up with the intention of meeting other

members in person. We address the following questions in this paper: Do online social communities

that encourage face-to-face interaction between members operate under different assumptions than

those online communities that do not? What are some factors that influence people to meet others

offline and also create a sense of trust between them?

Trust and Verification Systems on Online Communities
        From a research standpoint, trust has many definitions, drawn from fields ranging from business to

sociology, and the appropriate one depends on context. We examined trust under two definitions. Reed defines
trust as “the foundation of any relationship, and a relationship is exactly what you are looking to

establish with your users” [3]. This definition applies to our study because of the emphasis on

developing relationships with other users. This description of trust can also apply to two people in the

offline world or two users in an online community. A second definition of trust is one person’s “reliance

on the integrity” of another [4]. This definition relates to the communities we studied because

evaluating the reliability or integrity of user profiles is essential to forming relationships. For example, if

Bill read Joe’s profile on a social networking site like Facebook or MySpace and did not find the

personal information there reliable, Bill would most likely not trust Joe and thus would not “friend” him

on that website.

        Trust exists in the offline world and can easily be established between two people with the help

of one’s observations of another’s actions, gestures, words, and personality. However, this concept

cannot easily be translated to the online community because of the anonymity of the Internet. For

instance, if Joe helped Bill on MySpace but had never met him face-to-face, would Joe’s actions be enough

for Bill to trust Joe? More than likely, Bill would want to know more about the person from whom he is

getting help. Thus, it would be in Joe’s best interest to be as truthful as possible, or at least to manage

carefully what he writes, on his MySpace profile. Depending on the information on his profile, Joe’s

chances of being trusted by Bill could increase.

         Massa [5] discusses “trust statements,” which are opinions expressed by one user of another,

and gives examples of online communities that use them to help a user decide whether to trust

someone else online. Consider a scenario in which Bill follows Joe’s product reviews on websites like

Epinions and Amazon. If Bill found those reviews very valuable, he would more than likely consider

buying the next product that Joe reviews. In online communities such as CouchSurfing, this type of

system would especially be useful because “judgments entered about other users … are used to

personalize a specific user’s experience on the system” [5]. If Bill wanted to stay at Joe’s place but
had not yet contacted him on CouchSurfing, his decision to get in touch would be influenced not only

by what Joe wrote on his profile page but also by what other people wrote about him there.

        Not all online communities would benefit from having this particular type of verification system;

in fact, it could be detrimental to both the user and the online community. OkCupid is one example in

which a review system may not work. One reason is that the reviewers would be other users whom

one had dated. Dating is considered an exclusive practice, and people do not want to display

their history of past relationships because they consider it private information. Past transgressions may

reflect poorly on a user’s character, and displaying them publicly can be detrimental to building

relationships with others. In general, most users do not want reviews from other users who went on

failed dates because of the potential for negative reviews and the ineffectiveness of positive reviews.

The paradox of a positive review is that other users will wonder why the reviewer is not dating that

person. The problem with a negative review is the strong bias of an ex-girlfriend or ex-boyfriend. For

OkCupid, implementing a verification system comes at the risk of privacy issues. Members tend to keep

a user’s identity confidential even when complaining about a failed date with that user. However, there

are examples of users who post negative remarks about others whom they met in real life.
[Figure: A negative review of a user in the forum on OkCupid]

OkCupid does not advocate the creation of such reviews, but their existence remains largely

unregulated by moderators. The difficulty in establishing a system of reviews on OkCupid is that there is

little incentive and value to writing a positive review while a negative review has serious consequences

for the parties involved.

        Trust is the determining factor for evaluating whether or not users in social networking

communities should meet face-to-face after meeting each other online. Thus, these websites have

implemented mechanisms or ways that help a user build trust in the person with whom he or she is

communicating. On CouchSurfing, the References section is similar to the review system on e-marketing

websites. A user writes comments about his or her host (the person who let the user stay over at his or

her place) or guest (the person who stayed over at the user’s place). The user has the option of giving a

positive, neutral, or negative rating. Interestingly, we did not see any negative ratings for users on

CouchSurfing. One reason for the dearth of negative reviews is the lack of anonymity in the review
process. A review has the writer’s profile displayed next to it; thus, the writer of a negative review risks

repercussions from the reviewed user. This inevitably leads to biased reviews as users may feel inclined

to leave neutral ratings for their hosts or guests instead and also phrase their words in such a way so as

to not embarrass or humiliate them. A negative rating could forever tarnish a user’s reputation in this

online community. However, as Massa [5] pointed out, if these trust mechanisms are not enough for a user to

make a decision, the actual physical location (latitude and longitude) of the person can be verified by the

system. According to CouchSurfing,

        Verification [helps] our community stay safe. By confirming your name and address, you
        show other CouchSurfers that you are who you say you are. This simple gesture
        strengthens the trust system that allows CS to function. Your verified profile will help
        other CouchSurfers feel comfortable reaching out to you, and without that comfort, our
        mission of bringing people together across barriers could never be realized. Getting
        verified shows the community your commitment to the success of the project.

        On the other hand, on OkCupid, there is no verification system at all, so a user has to rely on the

information on a person’s profile and make the decision to either trust or not trust that person himself

or herself. In addition to the typical sections that one would find on a profile in an online dating website

(e.g., My Self-Summary, What I’m Doing with My Life, and I’m Really Good At), a user can upload

photographs of himself or herself, take tests, post to a forum, and answer questions that would help

increase the chances of him or her being matched with a potential date.

        Using a person’s answers to questions, the algorithm on OkCupid calculates how well he or she

matches with every other user in this online community and displays a match percentage to reflect it.

The algorithm also determines whether two users should become friends or stay away from each other

by displaying friend and enemy percentages, respectively. Altogether, the three percentages can be

used to help determine the trustworthiness of an individual. For example, a user may trust another who

has very high match and friend percentages because they share similar tastes; on the other hand, if the

person’s tastes are very dissimilar from the user’s, as indicated by a very high enemy percentage,
then the user would probably have to read the person’s profile more closely and work harder

to build trust in that person.

        Another algorithm used on OkCupid determines how frequently a user would respond to a

person’s message. A user could use this feature to determine how trustworthy that person is through

status messages such as “Replies often” and “Replies very selectively.” These messages are indicative of

a user’s selectivity and popularity. However, there is one caveat about this feature. We noticed that

although the last time some users logged in was a couple of years ago, their status messages still

displayed “Replies often,” which is misleading. We suspect that, instead of averaging the number of

replies over the period from registration up to the current date, the algorithm averages them over the

period from registration up to the user’s last login. Thus, unless one reads a user’s profile carefully (or

sorts the list of potential matches by Last Login), one may end up typing a long, detailed message and

sending it to a user who will most likely never read it, which is certainly frustrating for someone looking

for a romantic relationship.
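
The suspected discrepancy can be illustrated with a minimal sketch. It is not OkCupid's actual algorithm, which we could not inspect; the function name, the dates, and the reply count below are all hypothetical. The sketch only shows that averaging replies over a window that ends at the last login keeps the rate frozen at its old value, whereas a window that ends at the current date lets the rate decay for an inactive user.

```python
from datetime import date


def replies_per_week(reply_count: int, window_start: date, window_end: date) -> float:
    """Average replies per week over the window (a hypothetical metric)."""
    days = max((window_end - window_start).days, 1)  # guard against a zero-length window
    return reply_count * 7 / days


# Hypothetical user: registered, replied 52 times, then stopped logging in.
registered = date(2007, 1, 1)
last_login = date(2007, 7, 1)
today = date(2009, 7, 1)
reply_count = 52

# Window that ends at the last login: the rate never changes after the user leaves.
rate_until_last_login = replies_per_week(reply_count, registered, last_login)

# Window that ends today: the rate keeps shrinking while the user stays away.
rate_until_today = replies_per_week(reply_count, registered, today)

print(f"window ends at last login: {rate_until_last_login:.2f} replies/week")
print(f"window ends today:         {rate_until_today:.2f} replies/week")
```

Under the first window the inactive user would still look like a frequent replier; under the second, the same user's rate falls by roughly a factor of five.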

Social Capital on Online Communities
        According to Robert Putnam [6], social capital exists in two different forms: bridging and

bonding. Bonding capital occurs between people with similar traits, and these connections are more

emotional and more meaningful than those resulting from bridging capital. Bridging capital centers on

relationships with people from different backgrounds with more frequent but less meaningful

connections than those established by bonding capital. These two types of capital are interdependent,

and the downfall of one brings about the downfall of the other. One difficulty in determining the

difference between the two types of capital is that the definitions are subjective and bound by the

method through which “closeness” is measured. In an effort to distinguish and measure the different
types of social capital, Williams [7] created a matrix of social capital measures. His two-by-two matrix

divides social capital into four types: offline bridging, offline bonding, online bridging, and online

bonding. The creation of such a matrix implies that there are clear divisions between these quadrants.

However, our study of CouchSurfing and OkCupid suggests that these categories are not distinct entities

and that strengthening the social capital in one area often affects other areas in a similar fashion.

        Although Williams mentions that there is a gray area between bridging and bonding, he does

not allude to blurring between offline and online relationships. Thus, we argue that websites like

CouchSurfing and OkCupid transcend the distinction between offline and online relationships. Trust and

social capital both allow for a successful transition from an online relationship to an offline one and vice

versa. We will examine the mechanisms employed by the two aforementioned online communities

that, we believe, enable these transitions to take place.

CouchSurfing
        CouchSurfing users create bridging relationships through offline interactions organized by online

communication via the website. The premise of the website is that the majority of relationship-building

occurs when a user sleeps over at another person’s house or hosts that person at the user’s house.

These relationships fall under the category of bridging because hosting is meant to be temporary and

may occur with many different users from different locations. An individual user’s success rate in

establishing relationships on CouchSurfing depends on information expressed on his or her profile. The

online profile is the means through which a user will evaluate his or her compatibility with another user.

        Offline and online interactions that occur through CouchSurfing encourage the other type of

interaction in a variety of ways. Offline interaction feeds online interaction through references provided

by other users. These references are qualitative and thus give users more information about the

evaluated individual than numerical measurements would. However, such qualitative evaluations

make it difficult to compare two users directly.
The reverse also occurs: online interactions through posting events on forums, for example,

encourage offline interactions. CouchSurfing has a forum in which users can post information about

community events that encourage other members to get together for social gatherings in specific areas.

Since the primary purpose of this online community is to form relationships with people travelling to

other locations, such events bring together people who would not normally meet in person through

online communities that do not encourage face-to-face, offline interactions.

OkCupid
        OkCupid differs from CouchSurfing in that one of the main supported features of the site is

searching for romantic relationships. Both sites support offline encounters through features such as

listing contact information of other users. Unlike CouchSurfing, however, where one user’s network of

friends is visible on his or her profile, OkCupid does not have built-in ways of displaying social

connections or relationships on a user’s profile, and this lack of linking between profiles does not allow

users to obtain bridging capital and form online relationships with other people (besides those with

whom they are communicating through OkCupid’s message system).

        On the other hand, OkCupid encourages bonding capital through a system that measures

compatibility using a user’s responses to a databank of questions. Just by looking at a user’s profile, one

can determine the match, friend, and enemy percentages of that user. In addition, users can tell the

search algorithm to display profiles with the highest match percentage if they were looking for a date.

We believe that the algorithm supports bonding capital because it quickly matches users who have very

high compatibility according to their responses.

        OkCupid also has the ability to compare two users based on information from questions they

answered, and the algorithm used here makes definitive judgments on users. For example, the compare

feature allows a user to compare any two users based on personality traits as determined by their
responses. The ability to filter through people makes it easier for users to evaluate potential matches

and theoretically increases the chances of finding a desirable partner.

Methodology
        To measure trust and social capital on CouchSurfing, we examined one hundred random user

profiles. Our hypothesis is that the more pictures that a user posts of himself or herself on his or her

profile, the more positive ratings and friends that the user will have. In addition to a user’s pictures, we

also recorded the number of friends that he or she has and the number of people who vouched for,

stayed with, and hosted that user. We recorded these data because factors other than pictures could

also be responsible for building trust in a user.

        After perusing profiles on OkCupid, we were unable to find any sort of verification

system whose output we could compare against content that the user created or posted (for example,

pictures) in order to measure trust. Also, we could not view a user’s list of friends or see the

profiles of those whom he or she contacted, so we were unable to measure trust this way. As we

mentioned before, on dating websites like OkCupid, individual contact is regarded as private

information between two people. Unlike CouchSurfing, there are no vouchers, no recommendations,

and no easy way to verify contact between users other than forum participation. Nonetheless, we

collected information on users including age, gender, sexual orientation, relationship status (for

example, single or married), number of tests taken, number of forum posts, number of pictures, number

of questions answered, and response frequency. We looked for general patterns within the variables

that suggested that they affected user interaction within the OkCupid community.

Results and Data Analysis
        For the CouchSurfing dataset, we examined seven major variables: age, gender, number of

positive ratings, number of photos, and number of people who vouched for, stayed with, and hosted a

given user. We examined these variables so that we could create a model that best predicted the
number of people whom a user hosted and another model that best predicted the number of people

who hosted a user. We first plotted each variable against the number of users who hosted or were

hosted; t-tests then revealed that there were relationships, although they were not

particularly strong. After testing quadratic, exponential, and logarithmic alternatives for the

relationships between variables, we adopted a linear model.

[Figure: three scatter plots. “Couch Surfing Photos vs. Recommendations” (No.Pos_Ratings against No.Photos); “Couch Surf. Photos vs. Num. vouch” (No.People_who_Vouch against No.Photos); “Couch Surf Vouchers vs. Num. people who stay” (No.Stay_with_Person against No.People_who_Vouch). Caption: Example plots of relationships between variables]

        As expected, there were correlations between variables that indicated that they affected trust.

For example, more people tended to stay with someone who has many vouchers than with someone who
has only a few. Rather than examining each pair of variables, we created a model that predicted the number

of CouchSurfers who stayed with a user based on the significant variables.

        We used a linear model for predicting the number of users who stayed with a person because

individual plots between this variable and the other variables appeared linear for each case.

lm(formula = No.Stay_with_Person ~ No.Pos_Ratings + No.Photos +
    No.People_who_Vouch + Gender + Age)

Residuals:
     Min       1Q   Median       3Q      Max
-22.3910  -3.9675  -0.8579   2.3569  50.3580

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)         -11.32414    5.39354  -2.100 0.038445 *
No.Pos_Ratings        0.06359    0.04531   1.403 0.163800
No.Photos             0.22121    0.05738   3.855 0.000212 ***
No.People_who_Vouch  -0.20647    0.25997  -0.794 0.429082
GenderM               2.25199    1.94229   1.159 0.249210
Age                   0.39478    0.19029   2.075 0.040751 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.339 on 94 degrees of freedom
Multiple R-squared: 0.3654,     Adjusted R-squared: 0.3317
F-statistic: 10.83 on 5 and 94 DF, p-value: 3.056e-08

The number of photos and age were significant factors in the model, indicating that a user’s number of

photos and his or her age helped to predict how many other CouchSurfers stayed with that person.

Interestingly, other factors like gender and the numbers of vouchers and positive ratings do not seem to

be significant. However, the model’s R-squared value was only 0.3654, indicating that it explained

only about 37% of the variance in the data.
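
As a consistency check on the summary above, the adjusted R-squared and a t value can be recomputed from the printed figures. This is a sketch in Python rather than the R session that produced the output; the function name is our own, and the sample size of 100 is taken from the one hundred profiles described in the Methodology section (94 residual degrees of freedom plus five predictors and an intercept).

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Values copied from the model summary: 100 profiles, 5 predictors.
N_PROFILES = 100
N_PREDICTORS = 5
R_SQUARED = 0.3654

adj = adjusted_r_squared(R_SQUARED, N_PROFILES, N_PREDICTORS)

# A t value is the coefficient estimate divided by its standard error
# (shown here for the No.Photos row).
t_photos = 0.22121 / 0.05738

print(f"adjusted R^2: {adj:.4f}")
print(f"t value for No.Photos: {t_photos:.3f}")
```

Both values agree with the printed summary (0.3317 and 3.855) to within the rounding of the inputs.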
[Figure: two scatter plots. “# who stay with person vs. # of photos” (No.Stay_with_Person against No.Photos); “# who stay with person vs. age” (No.Stay_with_Person against Age)]

        Both age and the number of photos appear to be significant factors in the combined linear

model. Individual graphs reveal rather weak linear relationships. The relationship between the number of

photos and the number of people who stayed with a user fits a linear model reasonably well and fits a

quadratic model only slightly worse. Age on its own, however, does not appear to be a significant factor.

Call:
lm(formula = No.Stay_with_Person ~ No.Photos + Age)

Residuals:
     Min       1Q   Median       3Q      Max
-23.4743  -4.3592  -0.7885   2.7253  51.3576

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.96464    2.83465   2.104   0.0379 *
No.Photos    0.21592    0.03124   6.911 5.09e-10 ***
Age         -0.16442    0.09146  -1.798   0.0753 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.446 on 97 degrees of freedom
Multiple R-squared: 0.3302,     Adjusted R-squared: 0.3164
F-statistic: 23.91 on 2 and 97 DF, p-value: 3.612e-09

        In a model with only age and the number of photos as the predictor variables, the

latter remains the only significant variable in predicting the number of users who stayed with a person.
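
To make the reduced model concrete, its fitted equation can be evaluated for hypothetical profiles. The coefficients below are copied from the summary above; the function name and the two example profiles (10 and 60 photos, both aged 30) are illustrative assumptions, not users from our sample. Given the modest R-squared, such predictions are rough at best.

```python
# Coefficients copied from the reduced model summary (No.Photos + Age).
INTERCEPT = 5.96464
COEF_PHOTOS = 0.21592
COEF_AGE = -0.16442


def predicted_stays(n_photos: int, age: float) -> float:
    """Predicted No.Stay_with_Person from the fitted linear equation."""
    return INTERCEPT + COEF_PHOTOS * n_photos + COEF_AGE * age


# Two hypothetical 30-year-old users who differ only in their number of photos.
few_photos = predicted_stays(10, 30)
many_photos = predicted_stays(60, 30)

print(f"10 photos: {few_photos:.2f} predicted guests")
print(f"60 photos: {many_photos:.2f} predicted guests")
print(f"difference: {many_photos - few_photos:.2f}")
```

Holding age fixed, the predicted difference is simply the photo coefficient times the difference in photos (50 × 0.21592), which is why the number of photos dominates this model.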
We ran a similar analysis to find the best predictor variables of the number of times that a person was

hosted.

lm(formula = No.Hosted_person ~ No.Pos_Ratings + No.Photos +
    No.People_who_Vouch + Gender + Age)

Residuals:
    Min      1Q  Median      3Q     Max
-7.2762 -2.8102 -0.5626  1.7734 17.2913

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)          6.336465   2.379382   2.663  0.00911 **
No.Pos_Ratings      -0.004237   0.019990  -0.212  0.83258
No.Photos            0.003826   0.025315   0.151  0.88018
No.People_who_Vouch  0.311009   0.114686   2.712  0.00796 **
GenderM             -1.618380   0.856849  -1.889  0.06201 .
Age                 -0.060857   0.083946  -0.725  0.47028
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.12 on 94 degrees of freedom
Multiple R-squared: 0.3354,     Adjusted R-squared: 0.3001
F-statistic: 9.488 on 5 and 94 DF, p-value: 2.38e-07

According to this model, the most significant factor in predicting how often a person gets hosted by

other users is the number of people who vouched for that user. Gender is marginally significant

(p = 0.062), although not as strong as the number of vouchers. The other factors (age, number of photos,

and number of positive ratings) are not significant in the model.
[Figure: two plots. “Number hosted by gender” (boxplot for F and M); “# of people who hosted person vs. # of vouchers” (No.Hosted_person against No.People_who_Vouch)]

        The above boxplot suggests that females are hosted more often than males on CouchSurfing.

The individual plot comparing the number of vouchers to the number of people who hosted a person

indicates a positive correlation between vouching and hosting someone. T-tests on the individual

variables are also significant at the .05 level, indicating that there is a correlation between gender

and the number of hosts for a person, as well as vouchers and the number of hosts.

        For OkCupid, predictive models were not as helpful in displaying trust, but we found a few

interesting trends in the data. A quick look at forum posts indicates that most of the users whom we

sampled did not have any postings in the forum. On the other hand, there was a user who had more than

three hundred posts. The lack of posting in the forum for the vast majority of users indicates that

forums are not a primary tool for creating bridging relationships on OkCupid.

        A majority of users (88 out of 100) filled out each of the sections on the profile page. Another

interesting factor that we found was that response frequency differed greatly by gender. None of the

men whom we sampled fit into the category of very selective when responding to other users’

messages. Most men (around ninety percent of them) either responded on a frequent basis or were

never messaged. On the other hand, more than sixty percent of women were either selective or very

selective in their response rate.
[Figure: two plots. “Histogram of the Number of Forum Posts” (Frequency against NoForumPosts); “Response Frequency by Gender” (proportions of the response categories Very Selectively, Selectively, Often, and NCOW for F and M). Caption: Histogram of forum posts and a plot of response rate by gender for OkCupid]

Conclusion
        On both OkCupid and CouchSurfing, factors that are not immediately apparent influence how trust and

social capital are established. One significant variable in both online communities is gender. Our statistical data indicated

that females are hosted in CouchSurfing more often than males. Females are also more selective in

responding to messages than males on OkCupid. It appears that females are more trusted or more

desired in the CouchSurfing community when evaluating whether to host a particular user. However,

there is evidence that suggests that the built-in verification system on CouchSurfing helps to foster trust.

Vouchers were the strongest factor in determining whether to host a person. When evaluating a host,

the most important factors according to our study were the number of photos and age. This

observation suggests that visual verification plays a large role in establishing trust and promoting social

capital. The positive coefficient for the age variable suggests that older age correlates with more people

being hosted by a person. Perhaps age is associated with more responsibility, or maybe older people

are simply more likely to host other users.
Evaluating the transition between online and offline relationships in sites like CouchSurfing and

OkCupid is difficult because we only got a few glimpses of offline behavior. In CouchSurfing, the

vouchers and other forms of verification come from offline meetings and occurrences. For OkCupid,

there is not a clear way to determine how a user’s offline transgressions affect that user’s online

relationships. However, there are still discernible patterns in some variables that suggest that

establishing the right credentials in the community is essential for establishing trust and social capital

when engaging in offline interaction with other users.

References

[1] http://www.couchsurfing.com

[2] http://www.okcupid.com

[3] Reed, Martin. “The importance of trust.” 14 March 2007. Community Spark.

    http://www.communityspark.com/the-importance-of-trust/

[4] Dictionary.com, LLC. http://dictionary.reference.com/browse/trust

[5] Massa, Paolo (2006). A Survey of Trust Use and Modeling in Current Real Systems. Trust in E-services:

    Technologies, Practices and Challenges. Idea Group.

[6] Putnam, R. D. (2000). Bowling Alone: The Collapse and Revival of American Community. New York:

    Simon & Schuster.

[7] Williams, D. (2006). On and off the 'net: Scales for social capital in an online era. Journal of

    Computer-Mediated Communication, 11(2), article 11.