Social TV: Designing for Distributed, Sociable Television Viewing
Nicolas Ducheneaut¹, Robert J. Moore¹, Lora Oehlberg², James D. Thornton¹, Eric Nickell¹
¹ Palo Alto Research Center; ² UC Berkeley, Mechanical Engineering
[nicolas, jthornton, bobmoore, nickell]@parc.com

Abstract
Media research has shown that people enjoy watching television as a part of socializing in groups. However, many constraints in daily life limit the opportunities for doing so. The Social TV project builds on the increasing integration of television and computer technology to support sociable, computer-mediated group viewing experiences. In this paper, we describe the initial results from a series of studies illustrating how people interact in front of a television set. Based on these results, we propose guidelines as well as specific features to inform the design of future “social television” prototypes.

1. Introduction
Television has often been criticized as an isolating, anti-social experience. However, early ethnographic studies (Lull, 1990) showed that

“TV and other mass media, rarely mentioned as vital forces in the construction or maintenance of interpersonal relations, can now be seen to play central roles in the methods which families and other social units employ to interact normatively.”

And indeed, television viewing appears to be largely a social activity, often conducted in groups (Morrison, 2001). In fact, the worth of a particular television program is often gauged according to the amount of social interaction it generates (White, 1986). Television can foster multiple forms of sociability: direct (e.g. when chatting with friends and family during a “movie night” at home) or indirect (e.g. when discussing previously viewed programs with colleagues at the office water cooler), and both are equally worthy of attention. Previous research (Morrison, 2001) highlighted a similar distinction between the “internal” social functions of television viewing (when family members watch television together) and its “external” functions (e.g. television programs as topics of conversation at work or elsewhere; special events organized at home such as inviting friends over to watch the Super Bowl).

While there is now little doubt that watching television can be a “ticket to talk” (Sacks, 1992) encouraging interaction between groups of viewers, there is little research available on the exact practices surrounding sociable television viewing. Media research has focused essentially on the role that television plays in social groups like the family, examining for instance how programs are interpreted outside the viewing context and how this contributes (or not) to the maintenance of the family system (Alexander, 1990). This research tends to be based on surveys; ethnographic observations of group viewing like Lull’s (1990) remain rare. Audiences have been generally considered as “a category rather than a way of being” (Casey et al., 2002) – that is, past research has investigated how a given audience’s socio-demographic characteristics affect their choice of programs, rather than how they behave as active participants engaged in social activities while watching television.

As such, the actual mechanics of joint viewing (e.g. When do people talk during a show? About what? To what effect?) remain largely unexplored (an interesting exception can be found in Walker and Bellamy, 2001, a survey of remote control use during family viewing). This lack of data has important design implications: a better knowledge of joint viewing practices could help develop new technology to better support television-mediated sociability.

It might seem that practices around a technology as well-established as television are not amenable to redesign, but in fact both societal change and technological innovation have been affecting viewing, as well as entertainment in general, for a long time. Many characteristics of contemporary life in wealthy societies conspire against traditional joint television viewing, despite its social appeal. Urban sprawl, for instance, can make travelling to a friend’s house or a public space for a movie night inconvenient (Oldenburg, 1989); domestic isolation and scheduling constraints prevent gatherings (Putnam, 2000); and increasing mobility often separates family members (e.g. a child living away from his family to attend university). Sociability is becoming more and more distributed in this context
as technology enables diverse remote interactions. Online computer games, for instance, are displacing television among young viewers, in part because they offer an interactive, social, and location-independent experience (Subrahmanyam et al., 2002; Schwartz, 2004).

Technology is changing television practices also, through integration of computing capabilities as in the case of digital video recorders such as TiVo¹. Our Social TV project is about leveraging this kind of computing integration to remove the increasing barriers to sociable interaction around video content. In focusing on the direct form of sociable viewing we will be talking about design for distributed, shared television viewing. While this necessarily changes the television experience, our concern is to preserve the ‘natural’, familiar social atmosphere of watching television in a collocated group. This goal of enhancing social opportunity while fitting existing practice frames the central design challenge.

In this paper, we report on preliminary studies and design concepts that serve our goal. We begin with observation of what television viewers do while they are watching television in groups, both when they are physically co-present and when they are geographically distributed. Based on our observations, we then articulate in the second part of this paper a series of design guidelines and concepts for future Social TV prototypes.

2. Related Work
Surprisingly, very few systems have been developed to support social interactions of any kind among television viewers. Two projects are closest to ours. The first one is Chuah’s (2002) “reality instant messenger”, which provides both “buddy surfing” (an awareness that friends are watching the same television program) and an IM-based communication channel between viewers. The second related project is Alcatel’s Amigo TV (Coppens et al., 2005), which allows television viewers to share opinions and feelings with friends via an interactive broadband link. Amigo TV shares a similar design philosophy to ours but, as we shall see later in this paper, the range of features offered to users is quite different.

Another approach has been to transform television into an inhabited virtual world. In Benford et al. (1998), audience members control avatars in a 3D space and can interact with the performers of the show they are watching. The focus is on breaking down the barriers between audience members and performers, as opposed to facilitating group interaction while watching television.

3. Understanding TV-Mediated Sociability
In order to understand what can make group television viewing so enjoyable, we invited participants to watch various television shows together in a specially equipped room in our laboratory. Our experiments were meant to reproduce one prevalent form of group television viewing: the “viewing party” where, for instance, a group of friends get together to watch the latest episode of the show “Desperate Housewives.” It is important to recognize, however, that other forms of group television viewing are equally interesting and that they would probably affect viewing patterns differently. For instance, we plan to investigate “everyday viewing” (e.g. married couples watching television together in their own home, without planning) at a later date.

We began with three exploratory sessions where viewers were all located in the same room and could interact face-to-face. The smallest group size was 5 and the largest 8. Participants were between 20 and 50 years old, with more males than females (70% and 30% respectively). Most were colleagues but a few shared social activities outside of work. Participants watched a soccer game or one of two episodes of a documentary (the BBC’s “1900 House”), each lasting approximately 2 hours overall. We used two cameras to videotape both the participants’ behaviour and the content of the television show. The tapes were later reviewed and coded (Glaser, 1998) by two of the authors, with particular attention to the local organization of talk among the participants (Sacks et al., 1974). Our observations were complemented by a questionnaire asking participants about their group television viewing habits (e.g. how often they viewed TV in groups, the kinds of content watched, etc.), as well as exit-interviews after the viewing session to reflect on their experience.

In the second phase of our study we held six additional viewing sessions, this time separating the participants into two groups located in two different rooms. A social audio link was established between the rooms using two computers running Robust Audio Tool (RAT², see Figure 1). By relaying all audio between the two rooms, each group could hear what the other was saying at all times. While the audio quality was adequate, it was certainly degraded compared to co-present interactions.

¹ http://www.tivo.com
² http://www.netsys.com/mbone/software/rat/
However, the quality was good enough to allow our experiments to proceed.

Figure 1 – Equipment used during Phase 2

These later experiments were meant to simulate some of the conditions users might encounter while watching television in a distributed setting. We chose not to transmit video streams of the participants’ faces between the rooms, as this has already been tried elsewhere (Huijnen et al., 2004). Our focus was on the most basic aspect of sociability: conversation.

As before, we used cameras (this time, three of them) to videotape both the participants and the audience. We assembled and synchronized the three video feeds such that they could be reviewed and coded on a single screen (see Figure 2). We administered the same questionnaire after each session as was used in Phase 1.

Figure 2 – Merged video feeds used for coding our participants’ interactions

In the second phase, there were at least 2 and up to 6 participants in each room. The gender ratio was the same as before, and so was the age range. We again invited some of the participants from the first phase to allow them to compare and contrast the two conditions. Participants watched a wide range of content ranging from sporting events (the Olympics) to cartoons (The Simpsons) and reality-TV shows (The Real World).

3.1. Survey Results
The majority of the survey respondents reported that they routinely held viewing parties with their friends, and among the reasons given was that “TV [is] an excuse for sociality… [we watch] what[ever] TV is ‘big’ enough to justify having someone come over.” This confirmed both our intuitions and earlier research about the attractiveness of group television viewing.

We also asked people what kinds of television programs they watched in groups. Among the responses, the most popular genres included Animation, Sports Events, Documentaries, Action-Adventure, and Reality Television. It is clear from our participants’ comments that certain qualities in TV shows encourage sociability more than others. In particular, shows with bursty rhythms or redundant content (such as sporting events) provide plenty of pauses and opportunities for interaction. People-centered content (such as reality TV shows) provides audiences with many “conversational props” (Lull, 1990). Indeed, as one of our participants humorously mentioned during an exit interview, “It’s fun to make fun of people.” This parallels earlier research on the social psychology of entertainment. Stone (1981), for instance, noted that sports spectacles “are conversation-pieces, and conversations about them before, during, and after the event bring people together in an emotional rapport” (p. 222). The talk around the event itself may be interpreted as “a loosely structured game […] a form of nonutilitarian play” (Crabb and Goldstein, 1991, p. 367).

Poor quality movies were also often mentioned as a good way to foster social interaction. Our respondents did not only find it fun to talk about poor casting, acting, or effects: a consensus about a lack of important or relevant dialogue in a show seems to provide viewers with as many opportunities to comment as they would like. Shows such as “Mystery Science Theater 3000”³ have capitalized on this phenomenon. All the characteristics above encourage a kind of “vicarious audience play” (Sutton-Smith, 2001) that is central to sociable television viewing.

³ http://www.mst3kinfo.com/mstfaq/basics.html
As we were collecting survey results, it became clear that content selection would make a difference to the level of social interaction independent of other features of the experimental condition. This was indeed supported by our observations. For instance, the documentary we later used during the first phase of our experiments clearly generated less lively conversations than other genres (although interesting sociable exchanges occurred nonetheless, as we illustrate later in this paper). To make sure our participants had enough material and incentive to communicate, we therefore purposefully selected the most popular genres for most of our experiments.

3.2. Empirical Observations
The most striking finding to emerge from our observations was the surprising similarity in the nature and structure of the participants’ conversations across the two experimental conditions. We identified clear interaction rules that participants respected when they interacted during a TV show, whether or not they were collocated. These rules were never openly discussed by the participants. Instead, they seem to be part of a set of ingrained cultural practices dictating proper behaviour when watching television in groups.

3.2.1. Interactional Practices When Watching Television in Groups
Critics of television often point out that the nature of television programs encourages passivity (Casey et al., 2002). This passivity could be explained, at least in part, by the fact that the television set tends to dominate most channels of communication (both audio and visual) and, as such, it might not be conducive to interactive exchanges between audience members. However, our experiments revealed that television viewers are quite adept at communicating with each other during a show. To do so, they rely on a set of interactional practices that allow them to simultaneously socialize with each other around the TV and to follow the ongoing program with sufficient attention.
01  Voice: in nineteen hundred it must have been (0.5)
02         fantastic to be able to whiz down the roa:d
03         (0.4)
04  Voice: on your bicycle.=
05  Music: =[((gets louder))
06         [((P1 begins to take a drink))
07         (2.2)
08         (1.2) ((P2 turns toward P1 and P3))
09  P2:    °( musta been [ ) °
10  Voice: [HE::Y I’M FREE
11         (0.2)
12  P3:    °yeah°
13         (0.1)
14  P3:    [°yeah°]
15  P4:    [wha:t?]
16         (0.8)
17  P4:    [what ro:b?]
18  Music: [ Fade Out ]
19         (0.5)
20  P2:    well it really [mus]t have been ju[st=
21  Video: [FO ] [FI
22  Music: [FI
23  P2:    =liber[ating]
24  P4:    [Oh:: ] yeah::
25         (0.2)
26  Voice: but not everyone [(in the)] home can=
27  P4:    [ yeah:: ]
28  Voice: =enjoy their freedom...

Transcript 1 – The participants’ interactions are carefully timed to exploit gaps in the show
To illustrate this phenomenon, let us consider the transcript above. The excerpt is based on a three-minute long segment from one of our Phase 1 sessions, during which seven co-located participants were viewing “The 1900 House”, a historical documentary. The transcript is coded using the conventions of Conversation Analysis (Sacks et al., 1974), emphasizing the sequential organization of turn-taking in social exchanges. The excerpt features four of the seven participants, identified as P1-P4 (their talk appears in bold). FI and FO indicate fade-in and fade-out in the video stream. At this moment of the show, the main protagonist (coded as “Voice” and in italics) describes her joy at being able to ride a bicycle.

In the transcript, we can see how the viewers are finely attuned to the structure of the show – a show that none of them has watched before – by the way they insert their talk precisely in the gaps in dialogue and transitions between scenes. This is made possible by the fact that such gaps are more or less projectable, much like turns-at-talk (Sacks et al., 1974). There are several gaps in the TV dialogue, but they are not all appropriate places to talk. For example, the 0.5 second silence at the end of line 01 occurs at a place at which the narrator’s utterance is hearably incomplete. In contrast, the 0.4 second silence at line 03 is a possible completion point, although the actual completion point occurs at the end of line 04. Now at the end of line 04, there is a cinematic cue that projects an upcoming scene transition: the music gets louder (line 05). P1 orients to this projected transition nonverbally by beginning to take a drink at precisely this point.

However, it is not until after an additional 2.2 seconds of silence (line 07) in the TV dialogue that P2 turns toward P1 and P3 and makes a quiet comment about the show (line 09). Although this would seem an ideal place to talk, it actually overlaps slightly with the final utterance in the scene (line 10) that appears as a surprise. P2 then receives two quiet agreement tokens from P3 (lines 12 and 14), short responses that minimize the length of the conversational sequence. But in overlap with the second, P2 also receives a request for a repeat from P4 (lines 15 and 17) who is across the room and who apparently could not hear P2’s comment. P4 thus expands the sequence. As P2 repeats his comment louder and directs it to P4, the scene fades to black and then fades back in (line 21) right in the middle of his turn. P4 then marks her recognition of the repeat with “Oh::” in overlap with his turn and produces a minimal agreement token (line 24) just before the dialogue of the next scene begins (line 26). She then repeats the token in overlap with the TV dialogue (line 27) and the participants then refrain from speaking as the new scene opens. Thus, the participants work hard to fit their conversation into the transition between scenes.

Beyond the kind of exchanges illustrated above, we also observed that lulls in programming are used to help newcomers, who may have arrived late in the program, catch up on what has happened and what is currently going on. The composition of the audience is often fluid during group television viewing, with participants constantly leaving, arriving, or coming back. Our participants used gaps in the show to progressively bring their co-viewers up to speed.

Moreover, it is also important to note that, even when the participants had access to a remote control allowing them to pause or stop the show at any time, it was almost never used. Instead, the participants preferred to let the show continue at its own pace and relied on their aforementioned ability to predict gaps in order to communicate with each other. This confirms earlier research showing that group viewing creates a certain pressure “to just leave the clicker alone” (Walker and Bellamy, 2001, p. 83). Participants structure their interactions to avoid using technological control over the show’s pacing and structure.

Although viewers tend to try to fit their talk within gaps in the program, they are not constrained to do so and in fact sometimes choose not to. An example of such overlap is illustrated in the following case (Transcript 2, below). In this excerpt, from Phase 2 of our study (three viewers in room A and two in room B), the viewers watched The Real World, a reality-TV show.

In the first part of this excerpt (line 05), A2 makes a joke about the car that one of the Real World roommates is being driven to the airport in by her father. The joke is precisely timed to fit in a gap in the TV conversation, like the conversation in Transcript 1. However, because the joke is not placed near a scene transition, there is no gap for a recipient to respond without overlapping with the program. Thus, in overlap with the next utterance on the TV, B1 responds nonverbally with a smile (line 08) that A2, in the other room, cannot see; A3 responds with a short utterance that overlaps with the TV only minimally (line 09); and B2 responds with a longer joke (suggesting the backwardness of Alabama) that occurs entirely in overlap with the TV (line 11). Thus, without a ready gap for a response, the participants nonetheless choose to interrupt the TV dialogue to respond to A2.
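The three placements distinguished in this analysis (talk that fits in a gap, talk that overlaps the program only minimally, and talk produced entirely in overlap) can be made concrete with a small sketch. It is only an illustration, not part of the coding procedure used in the study: it assumes that each utterance and each stretch of TV dialogue has been given start and end times in seconds, and the names `overlap_seconds` and `classify_placement`, the timings, and the 25% cutoff for "minimal" overlap are all hypothetical choices.

```python
from typing import List, Tuple

# (start_s, end_s) in seconds from the start of an excerpt.
# All timings in this sketch are hypothetical, not taken from the transcripts.
Interval = Tuple[float, float]


def overlap_seconds(utterance: Interval, tv_dialogue: List[Interval]) -> float:
    """Return how long the utterance coincides with TV dialogue.

    Assumes the TV dialogue intervals do not overlap one another.
    """
    u_start, u_end = utterance
    total = 0.0
    for d_start, d_end in tv_dialogue:
        shared = min(u_end, d_end) - max(u_start, d_start)
        if shared > 0:
            total += shared
    return total


def classify_placement(
    utterance: Interval,
    tv_dialogue: List[Interval],
    minimal_share: float = 0.25,  # arbitrary illustrative cutoff, not from the study
) -> str:
    """Label where a viewer's utterance falls relative to the TV dialogue."""
    u_start, u_end = utterance
    duration = u_end - u_start
    if duration <= 0:
        raise ValueError("utterance must have a positive duration")
    share = overlap_seconds(utterance, tv_dialogue) / duration
    if share == 0:
        return "in gap"
    if share >= 1.0 - 1e-9:  # small tolerance for floating-point rounding
        return "entirely in overlap"
    if share <= minimal_share:
        return "minimal overlap"
    return "partial overlap"


# Hypothetical stretches of TV dialogue with two half-second gaps.
tv = [(0.0, 4.0), (4.5, 7.0), (7.5, 12.0)]
print(classify_placement((4.0, 4.5), tv))  # a short remark placed in the first gap
print(classify_placement((7.0, 7.6), tv))  # starts in the second gap, runs slightly over
print(classify_placement((8.0, 9.0), tv))  # spoken while the TV dialogue continues
```

A classification of this kind says nothing about why viewers choose one placement over another. It only shows that the categories used in the analysis can be derived from timing alone once a transcript has been time-aligned.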
01  TV:    =an if you jus' stay in your little neighbuhood there
02         an it looks tuh me like everythang would be:
03         (0.5)
04         work out gr[eat.
05  A2:    [they still make cars like that?
06         (0.1)
07  TV:    {what ti:me} do you think [would be good] ti:mes=
08         {B1 smiles } [ ]
09  A3:    [ not anymore ]
10  TV:    =f[or you tuh [ca:ll ]me eleven uh] clock?
11  B2:    [ in al[abama?] yeah. ]
12  A1:    [ hhuh ]
13         (0.3) ((B2 smiles))
14  TV:    .hh (0.1) i- (0.3) in the mornin' an eleven o'clock at night?
15         (1.2) ((scene transition))
16         (0.9) ((TV: voice over PA system))
17  TV:    tell me bah!
18         (0.9)
19  TV:    °hm not like ol[( ]
20  A3:    [i ain't ne{ver been on the air ee oh pla:ne.]
21  B1:    {NO WONDER SHE WANTED TO GO TO NEW]=
22         {A2 turns head toward A3 & smiles
23  TV:    )]}(oh my god)°
24  B1:    =YORK!]}
25         }

Transcript 2 – Competing for opportunities to comment

Then at line 15 a scene transition occurs, and the Real World roommate is shown at the airport saying goodbye to her aunt and her baby niece, to whom she says, “tell me bah!” or rather “bye” with a southern accent (line 17). Two of the participants then treat this as an opportunity to make additional jokes. In complete overlap with the TV dialogue, A3 makes a joke picking up the earlier backwardness theme, and mocks her southern accent (line 20). Just after A3 begins his joke, B1 comes in with another joke (lines 21 and 24) in overlap with A3 and the TV, suggesting perhaps that the girl’s family situation is driving her out of her small town to the big city. B1 not only produces her joke in overlap with A3’s but produces it with a loud voice (indicated by caps), to be heard over him. Thus we see the viewers doing a couple of things: prioritizing joking about the program over hearing parts of the program itself and competing for limited opportunities immediately following events in the program to comment on them.

In addition to talking around and over the TV program, viewers also used another practice for talking during a TV program in a group setting. In Phase 2, the viewers occasionally directed comments only to a subset of their fellow viewers, usually only to those in the same room, by producing them too quietly for those in the other room to hear. Transcript 3 offers an example of such a this-side comment that develops into an extended conversation.

Transcript 3 (below), also from The Real World, begins during a scene transition in which an image of the World Trade Center in New York briefly appears. After 2 seconds but still during the scene transition, A2 quietly comments that it is a “bizarre sight” post 9/11 (line 03). She produces the comment so quietly that only A1, who is sitting next to her on the couch, can hear it; it is not loud enough for the viewers in room B to hear. After 4 seconds, A1 expands the topic by quietly commenting on whether such a scene might get edited out (line 05), and A2 quietly produces an agreement token (line 08). At this point, the program is still transitioning to the next scene with a montage of images from the city that are slowly narrowing in on the Real World apartment. The TV dialogue has not yet resumed. After waiting another 2.1 seconds, A2 expands the topic further by mentioning that an image of the World Trade Center was edited out of another TV show (lines 10-12). After another 2.4 seconds, A1 adds a different example of a movie preview in which the World Trade Center was removed. As he is describing the scene (lines 19-28), the TV dialogue finally begins (line 23). He completes his
description (lines 24-28), which includes a hand gesture depicting the Twin Towers, in overlap with the TV dialogue, and then they drop the topic to listen to the program.

This entire conversation is produced by A1 and A2 at a low volume and thus only for their room. A1’s descriptive hand gesture further shows that he was designing his turns only for A2. Such this-side comments and conversations minimize disruption of the TV program by engaging only a subset of the viewing audience. Because the verbal part of A1’s description (lines 24 and 27) is spoken softly, it doesn’t disrupt the ability of the room B viewers to hear the show. In addition to minimizing disruption through the loudness of their talk, A1 and A2 also do so by placing it in a scene transition. They appear to expand the topic cautiously by pausing for somewhat long periods after each expansion (lines 04, 09 and 14) in anticipation of the resumption of TV dialogue.

01         ((TV: 3-second image of World Trade Center, New York))
02         (2.0)
03  A2:    °that's a bizarre sight°
04         (4.0)
05  A1:    °i wonder if they would cut {something like that out°}
06         { A2 turns to look at A1 }
07         (0.7)
08  A2:    °°yeahhh°°
09         (2.1)
10  A2:    °.hhh they {took it out of the sopranos.}°
11         {A2 turns to look at A1 }
12  A2:    °°didn't they. they took it from thee {uh::°°
13         {A2 turns back to TV
14         (2.4)
15  A1:    °i remember there was uh: °
16         (1.2)
17  A1:    °there was a preview for that spiderman movie°
18         (0.5) ((A1 & A2 look at each other))
19  A1:    {that had
20  A1:    {raises
21         (0.1)}
22  A1:    hands}
23  TV:    [{hello: / hi mo::m / how are you? /
24  A1:    [{°the twin towers on it}{and they- they removed it}
25  A1:    { traces Twin Towers }{ swirls hands around }
26  TV:    fi:ne, how are you: ]
27  A1:    {and did some other cg stuff°}]=
28  A1:    {R-hand chop & crosses hands }
29  TV:    oka::y
30         (0.2)
31  TV:    are you going to...

Transcript 3 – Use of body orientation and whispers during side comments

Thus, in the preceding transcripts we see viewers using a variety of practices for coordinating their talk with the dialogue in TV programs to avoid or reduce disruption: fitting their talk in the gaps, using minimal responses, and producing quiet comments for a subset of their fellow viewers. However, we saw also that viewers are by no means constrained to avoid disruptions and will at times talk over the TV dialogue, as well as over their fellow viewers, to compete for opportunities to make time-sensitive comments.

3.2.2. Typology of comments
While we described how the viewers in our experiment coordinate their talk with TV programs and each other in the previous section, in this section we focus on content and offer a typology of the different kinds of comments viewers make.
Type of Comment    Examples edited from data
Content-based      “I love the way Homer dances in this scene.”
Context-based      “Didn’t Conan O’Brien write for the Simpsons?”
Logistical         “Could you turn up the volume?”
Non-Sequitur       “Did you hear about the tornado in Germany?”
Phatic             “Whoa!”, laughter, gasps, groans, etc.

Table 1 – Typology of comments observed during group television viewing sessions

At a general level, we can characterize the comments exchanged by our participants using five broad types (Table 1):
• Content-based comments directly reference the content that is on or recently shown on the screen (as in the example in Transcript 1).
• Context-based comments are relevant to the show in its greater context, but perhaps not the specific episode or moment that is being viewed. Examples are references to the actors, past episodes, show trivia, etc.
• Non-sequitur comments are social exchanges such as asking about one’s family, or talking about events unrelated to the TV program. These are usually more common in groups who already have some social connection with each other. They often take the form of side conversations (usually whispered, or at least toned down) between two participants and rarely follow the structure of the show.
• Logistical comments are relevant to the television watching experience, but are independent of the programming. Tasks like changing channels, adjusting volume, etc. must be verbally communicated to the group so that whoever has control of the set can respond.
• Phatic responses are almost involuntary reactions from the audience like laughter, gasps, groans, “Whoa!”, etc. Nearly content-free (Schneider, 1998), phatic comments are not complicated interactions, but are vital to the social atmosphere in the room. Unlike other interactions which require turn-taking in order to make any sense, phatic responses have a “more the merrier” effect where additional “audience participation” adds to the sociable atmosphere. Phatic responses also do not require a pre-existing social connection to participate; both socially unfamiliar and socially gelled groups can share in a good laugh.

Our attempt at categorizing comments reveals that some exchanges are more disruptive to group television viewing than others. Non-sequiturs, in particular, can negatively affect a group’s experience since they often lead a small subset of participants to “talk over” the show about topics that may not be relevant to a majority of the group. In contrast, phatic responses do not affect the flow of the program and do not need to be so finely timed. They also scale well with the number of people: many participants can easily laugh (but not talk) at the same time.

To summarize, our observations reveal that interactions between television viewers are tightly interwoven with the structure of the show they are watching. In other words, the television program is both a resource and a constraint: it provides a rich conversational context that, to be properly exploited, requires the audience members to “read” the upcoming structure of the show in order to time their comments. Reaching a “flow experience” (Csikszentmihalyi, 1990) is important and viewers appeared reluctant to interrupt the show to communicate – the show itself has to be structured such that opportunities for communication exist. Moreover, only certain kinds of conversations contribute positively to sociability and private side-conversations can “derail” the audience from the show. The clearest contributors to sociability are phatic utterances and any kind of comment related to the show that elicits only a quick back-and-forth between a few of the audience members, such that the entire interaction sequence fits within the gaps provided by the television program.

3.2.3. Visual Interaction and Presence
During our observations, we also paid close attention to the body movements of our participants. One of the most surprising moments in our experiments was during a soccer game when a player almost scored a goal. Out of excitement, one member of the audience jumped out of his seat completely to react to the event, and crashed to the floor. Surprisingly, no one in the room shifted their focus from the TV to watch him. This illustrates another important feature of group television viewing, namely, that the interactions between the participants are visually peripheral: the televised content remains the visual focus. In most cases, it looks as if participants talk to the television itself instead of addressing any specific person by turning their head. The sense of audience presence is mostly conveyed via subtle cues in the room such as gestures perceived at one’s visual periphery, movement in the room, and environmental audio (sound of people shifting in their seats, putting things on a coffee table, etc.).
Side conversations are a very different matter, however (Transcript 3). They are most often held with nearby audience members, and so as to establish a more personal connection, the participants will turn their heads towards each other to establish some visual contact (although as the conversation tapers off, the visual connection between the members deteriorates and the focus returns to the television). As we mentioned earlier, side conversations can negatively affect a group’s experience but our observations of body movements show that they can be easily detected using the orientation of someone’s face. This suggests potential avenues to dynamically move sub-groups in and out of the main audience; we discuss these possibilities and others in the next section of this paper.

4. Designing for Sociability
We believe that the subtle aspects of group television viewing we have just described can powerfully shape the design of future social television technology. Moving from observations to design, we will now briefly describe some possible features that could be implemented to support and encourage sociable, distributed television viewing. We want to emphasize that, while we plan to implement and test some of these features, our main focus so far has been on observing group behaviour, not prototyping. We hope, however, that the design possibilities we describe might inspire future research in this domain.

Our experiments using two rooms show that groups can socialize remotely while watching TV using a simple, always-on audio channel. Indeed, as we mentioned earlier, there were no major differences in behaviour whether or not our participants were co-located. Therefore, this basic communication infrastructure provided us with a foundation to start from.

Currently, we envision Social TV as a communication module that could be added to a PVR (e.g. TiVo), another piece of audio/video equipment (e.g. a receiver), or maybe the television set itself. It would allow viewers to establish connections with as many of their remote friends as they wish (probably using a mechanism similar to Chuah’s (2002) “buddy surfing”), opening up a shared audio channel between these locations (using, for instance, voice-over-IP). Participants would communicate with each other simply by talking into a microphone that could be placed in the room (Figure 3) or, alternatively, on a small headset worn by each viewer. The main value of the technology, however, would be in the software available in each Social TV system. By processing each participant’s utterances and transmitting the appropriate mix of social audio content to all viewers, Social TV would act as a “clearing house” facilitating distributed television viewing.

Figure 3: A possible design for Social TV

Based on the data obtained from our experiments, we believe such software would be particularly useful if designed to:

• Support the proper timing of social interaction during group television viewing;
• Minimize disruptions in the television program’s flow;
• Isolate exchanges that are beneficial to the group from side conversations and non-sequitur;
• Allow viewers to move in and out of the audience smoothly;
• Avoid drawing viewers’ attention away from the television screen.

Following these guidelines, we are considering the features listed below for Social TV system prototypes. It is important to note that, while we aim at simulating a conventional group television experience for distributed viewers, some of the features we describe will unavoidably change this experience. We have tried to minimize such disruptions and the amount of new skills required from the television viewers as much as possible. Note also that the list is far from exhaustive and we believe there is much room for future work:

• A preview of the oncoming show structure (for instance, a curve moving up and down depending on the amount of dialogue in upcoming scenes). Audience members would be able to call up this preview to “glance ahead” and accurately time when to start and wrap up their comments, so as not to interrupt the program. This would be particularly useful since, in the absence of the subtle cues given off by co-located viewers (e.g. heads turning), the show’s structure becomes the
only reliable indicator of when to start and stop a conversation.

• Allow for variable-rate video when the Social TV system senses social interaction. When the audience keeps conversing close to the end of a break, our system could slow down and eventually pause the show if the conversation appears to continue after the show has resumed. This would prevent the ongoing television programming from stifling more developed social interaction. While this satisfies one design constraint (“encourage interactions”) it does, however, directly break another (“do not change the flow of the program”). User testing would allow us to explore whether or not the gain in sociability offsets the loss of continuity in the program. Note also that such a feature offers the intriguing possibility of adjusting the length of advertising breaks dynamically. Since ad breaks are often used for conversations that have the potential to last longer than the break itself, our system could automatically keep adding more 30-second ads until the conversation is over. This opens up the door to alternative revenue streams and business models for content providers and broadcasters.

• Automatically isolate private side conversations. When audience members are not co-located, we cannot rely on body orientation to detect who is talking to whom. Technology developed in our laboratory, however, leverages insights from the same conversation analytic techniques we used earlier (Transcript 1) to automatically detect and isolate multiple, independent conversations in a shared audio space (Aoki et al., 2003). We plan to use such technology to facilitate interactions with Social TV.

• “What did they say?”: Often people want to hear every word of content but miss parts of it. Asking others about what was just said disrupts the flow of the program. Instead, our system will offer the possibility to turn on (temporarily and on demand) “delayed closed captioning,” so that others can read what was just said without interrupting the flow of the content.

• “Catch-me-up”: Since breaks are often used for bringing a new member of the audience up to speed, we plan to offer a “Catch-me-up” button to facilitate the process. The Social TV system will automatically generate a one-minute visual synopsis of the show, displaying the scenes and moments that caused the biggest reaction from the participants (in a first approximation, we will use the general audio level in the room as an indicator of audience engagement with the show). New participants will be able to join the group, quickly (and silently) watching the synopsis without disturbing other viewers. This would allow them to gain some shared background with the other viewers, which they can later use to participate in social interactions.

Figure 4: An unobtrusive interface to convey the presence of other viewers

We have argued that Social TV should be as visually unobtrusive as possible, in order to keep the audience’s focus on the show. To convey a subtle sense of audience presence, we argue in favor of an interface building off of the visual scheme of the television show “Mystery Science Theater 3000.” It uses a movie theater metaphor to visually indicate the presence of other characters as they are watching movies. Similarly, the interface for the Social TV system could be a row of theater seats at the bottom of the television screen. When new members join in the Social TV session, their shadows would appear in the seats along the bottom of the screen. Whenever viewers want to judge how many other viewers there are, they can call up this visualization which remains briefly superimposed over the lower third of the television screen (Figure 4). In other words, the interface only appears when a “presence” event occurs, such as joining viewers, identifying speakers, or at the user’s explicit request. This is less distracting and uses far less screen space than full-bandwidth video showing each remote viewer, as in Huijnen et al. (2004). Of course, the theater interface does not convey as many social cues as full video, but Social TV compensates for this absence with the features described earlier.

Moreover, this interface could also be useful for selective sub-conversations. By selecting another user’s shadow, one could activate a more private, secondary audio channel where the two could exchange comments that the main group may find uninteresting or distracting.
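As an illustration only, the scene selection behind a “Catch-me-up” synopsis could be sketched as follows. The sketch assumes the show has already been split into scenes and that a mean room audio level is available for each of them. The names Scene, room_level and catch_me_up are hypothetical, and the greedy selection is one simple choice made for this example, not a method taken from our studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    """A hypothetical scene record; times are seconds from the start of the show."""
    start: float
    end: float
    room_level: float  # assumed mean audio level in the room during the scene

    @property
    def duration(self) -> float:
        return self.end - self.start


def catch_me_up(scenes, budget=60.0):
    """Greedily pick the scenes with the loudest room reaction that fit in `budget` seconds."""
    chosen = []
    remaining = budget
    # Consider the scenes with the strongest audience reaction first.
    for scene in sorted(scenes, key=lambda s: s.room_level, reverse=True):
        if scene.duration <= remaining:
            chosen.append(scene)
            remaining -= scene.duration
    # Replay the selected scenes in the order they appeared in the show.
    return sorted(chosen, key=lambda s: s.start)


if __name__ == "__main__":
    show = [
        Scene(0, 30, 0.2),
        Scene(30, 70, 0.9),
        Scene(70, 90, 0.5),
        Scene(90, 100, 0.7),
    ]
    for scene in catch_me_up(show):
        print(scene.start, scene.end, scene.room_level)
```

A greedy pass keeps the sketch short; it does not guarantee the best possible use of the one-minute budget, and the room audio level remains only a first approximation of audience engagement.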
5. Future Work

5.1. Technical improvements
Future work on the project will involve implementing the aforementioned design suggestions in a working system, and later evaluating their impacts and usability. To conduct further studies however, we first need to make sure that the quality and stability of the audio between non-collocated participants is enhanced. In our phase 2 studies, the quality of the audio transmitted between rooms left much to be desired, since it was plagued by echo and feedback issues that required significant time and effort to be minimized. Such time-consuming set-up was the cost of a quick and simple experimental configuration using off-the-shelf components but would be unacceptable in real homes. In fact the problem might be worse in the context of home deployments, since audio would be carried over the Internet and might suffer from greater latency than in our laboratory connection. Providing a pleasant social audio experience in the presence of loud and diverse program audio and latency is a challenge that requires attention to signal processing.

Beyond the transmission of audio between locations, we also plan to investigate the effects of providing a more specific directionality to the social audio so that it is easier for users to discern between the comments originating from remote viewers and the audio coming directly from their television. In our initial set-up both audio streams were sent to the same set of speakers and our users repeatedly pointed out that it could be quite confusing.

The issue of controls is also something we have yet to explore. We need to consider how our system would resolve control requests being issued from two (or more) distributed locations that receive a common video feed. One possibility is a virtual remote control that can only be used by a single “host” at a time. The host would control the flow of the show (e.g. pausing) while other controls (e.g. volume, brightness) would remain local. A marker on the host’s shadow (Figure 4) would indicate who is holding the remote, and other users could issue requests to hand it over to them. This would further imitate the co-located viewing situation. But unlike co-located viewing, any participant should be able to wrest the control back from any other viewer, in order to avoid some of the abusive behaviours described in previous research (Walker and Bellamy, 2001): “compulsive grazers” can be quite disruptive to a group’s experience, and the aforementioned research even describes users who claim to gain gratification from annoying others by controlling their viewing!

We hope that all of the above will eventually lead us to deploying our technology in more natural settings, as we discuss in our conclusion. We plan to provide families with Social TV systems so that we can see how daily television use is affected by the ability to share television viewing with remote friends and family members.

5.2. From synchronous to asynchronous use
In the introduction, we pointed out that television can foster two different kinds of sociability: direct (when viewers get together and watch the same show at the same time) and indirect (when viewers exchange comments after the show has already been seen by each, independently and at different times). While the former was the focus of this paper, the latter is rich in design possibilities and, in the course of our studies, some possibilities for asynchronous technologies emerged. In the interest of generating discussion in this space, we would like to briefly mention some possibilities offered by Social TV in the asynchronous realm.

It is possible to envision viewers running their Social TV device even when they are watching a show alone. The device would still record and track their reactions to the show but, instead of sending them to other viewers, it would store them for future use. As we mentioned earlier in our description of the “catch me up” feature, the reactions of the audience are rich in meaning and often indicate important moments in the show. Therefore, each individual viewer’s reaction could be used as a “tag” to mark important moments in a program. This could then be used by other viewers, at a different time, in the following ways (the list is, of course, not exhaustive):

• “My friends’ laugh track”: television shows often rely on edited laugh tracks to cue their audience about funny lines. However, these often feel highly artificial. Using the aforementioned tags, a solitary viewer could instead hear laughter only when his/her friends laughed when they watched the show earlier. The laugh track would sound much more natural and, moreover, it would be placed at the most appropriate moment: humor differs across viewers and it is probably better to rely on one’s friends’ reactions than the editing of the show’s producers.

• “My friends’ commentary”: commentary from movie directors and others is a popular feature on DVDs. Social TV suggests the additional possibility of recording a commentary about any show and sharing it with friends. A viewer would simply talk over the show and the voice stream would be recorded by the device. Each of his/her
friends, when viewing the same show at a later date, would have the option of turning on the commentary track he/she produced. In fact, this feature need not be limited to friends: viewers could offer their commentaries on the Web for anybody to download. This would emulate the popular “user reviews” offered on Web sites such as Amazon.com, but this time in the audio realm and with the added advantage of being synchronized to a video stream.

6. Conclusion
While counter-intuitive to many, watching television can be a very sociable activity. In this paper, we have explored how groups of television viewers interact with each other in front of the TV set. It became clear that television-mediated sociability is governed by a set of cultural practices and interaction rules, which apparently evolved such that joint viewers can simultaneously enjoy each other’s company and preserve the structure and pacing of the show they are watching together.

However, the pressures of daily life make joint television viewing increasingly difficult. To counter this trend, we suggested design avenues for a Social TV prototype inspired by our studies. We envision a system integrated into standard audio-video equipment (e.g. a TV set, a PVR) and allowing geographically-distributed viewers to communicate with each other using an open audio channel. Social TV would facilitate distributed, sociable television viewing by processing each participant’s comments and ensuring that they fit within the interaction rules we inferred from our empirical observations.

While we strongly believe in the value of this approach, it is important to note that our studies are not without limitations. First and foremost, our observations were conducted in a laboratory setting. While we tried as much as possible to emulate a real living room, the environment remains by nature somewhat artificial. Moreover, the participants were mostly recruited individually, without any attempt at reproducing the structure of social groups that most often watch TV together (e.g. groups of friends, family members). It could be that such prior relationships greatly affect interaction patterns in front of the television, but our data cannot shed light on this issue. More studies in “natural” environments (e.g. a participant’s own home) would certainly be enlightening.

The above could be most important when considering the form-factor of future Social TV systems and, in particular, the input devices they offer to their users. During our observations, participants most often “talked to the television” when exchanging comments with each other, and changes in body orientation were rare and most often connoted private side-conversations. This, however, could be an artefact of our laboratory setting. Studies of domestic life (e.g. Hughes et al., 2000) have shown that home dwellers can be very mobile, conducting a variety of activities while moving about the house (e.g. alternating between television watching and trips to the kitchen to watch over a cooking dish). Television might also not be the exclusive focus of attention, with other tasks and devices asking for simultaneous attention (e.g. taking care of a child and watching television only peripherally). If these were the dominant patterns of use, Social TV would need to find a way to “follow its users around the house”, so to speak. This would imply decoupling the device from static A/V equipment and integrating it instead with another, more mobile platform (e.g. a headset, as mentioned earlier, maybe connected to a mobile phone equipped with the Social TV software). This way, users would be able to communicate about an ongoing show whether or not they are facing the television.

It is clear much work remains to be done to redesign television and make sure it continues to play a role in the ways people socialize and interact with each other. We hope this paper will help stimulate future research in this domain.

Acknowledgements
The authors express their gratitude to the three anonymous reviewers who provided insightful comments and ideas that have been incorporated in the paper. We also thank our study participants for their time and enthusiasm.

References
Alexander, A. (1990). Television and family interaction. In J. Bryant (Ed.), Television and the American family (pp. 211-225). Hillsdale, NJ: Lawrence Erlbaum Associates.

Aoki, P.M., Romaine, M., Szymanski, M.H., Thornton, J.D., Wilson, D. and Woodruff, A. (2003). The Mad Hatter's Cocktail Party: A Mobile Social Audio Space Supporting Multiple Simultaneous Conversations. Proc. ACM SIGCHI Conf. on Human Factors in Computing Systems (CHI '03), Ft. Lauderdale, FL, Apr. 2003, 425-432.

Benford, S., Greenhalgh, C., Brown, C., Walker, G., Regan, T., Rea, P., Morphett, J., & Wyver, J. (1998). Experiments in inhabited TV. In Proceedings of CHI98 (pp. 289-290). New York, NY: ACM.
Casey, B., Casey, N., Calvert, B., French, L., & Lewis, J. (2002). Television studies: The key concepts. London: Routledge.

Chuah, M. (2002). Reality instant messenger. In Proceedings of the 2nd Workshop on Personalization in Future TV (TV02). Malaga, Spain.

Coppens, T., Handekyn, K. and Vanparijs, F. (2005). Amigo TV, Alcatel White Paper, available at http://tinyurl.com/9h68d

Crabb, P. B., & Goldstein, J. H. (1991). The social psychology of watching sports: From Ilium to living room. In J. Bryant & D. Zillmann (Eds.), Responding to the screen: reception and reaction processes (pp. 355-365). Hillsdale, NJ: Lawrence Erlbaum Associates.

Csikszentmihalyi, M. (1990). Flow. Harper Perennial.

Glaser, B. G. (1998). Doing grounded theory: issues and discussions. Mill Valley, CA: Sociology Press.

Hughes, J., O'Brien, J., Rodden, T., Rouncefield, M. and Viller, S. (2000). Patterns of home life: informing design for domestic environments, Personal Technologies, 4 (1): 25-38.

Huijnen, C., IJsselsteijn, W., Markopoulos, P., & de Ruyter, B. (2004). Social presence and group attraction: exploring the effects of awareness systems in the home. Cogn Tech Work, 6, 41-44.

Lull, J. (1990). Inside family viewing: Ethnographic research on television's audiences. London: Routledge.

Morrison, M. (2001). A look at mass and computer mediated technologies: understanding the roles of television and computers in the home. Journal of Broadcasting and Electronic Media, 45(1), 135-161.

Oldenburg, R. (1989). The great good place. New York, NY: Marlowe & Company.

Putnam, R. (2000). Bowling alone: the collapse and revival of American community. New York: Simon & Schuster.

Sacks, H. (1992). Lectures on Conversation. Oxford: Basil Blackwell.

Sacks, H., Schegloff, Emanuel A., and Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50, pp. 696-735.

Schneider, K. (1998). Small Talk: Analysing Phatic Discourse. Marburg: Hitzeroth.

Schwartz, J. (2004). Leisure pursuits of today’s young man. The New York Times, March 29, 2004. http://tinyurl.com/csztu

Stone, G. P. (1981). Sport as community representation. In G. Luschen & G. Sage (Eds.), Handbook of social science of sport (pp. 214-245). Champaign, IL: Stipes.

Subrahmanyam, K., Kraut, R., Greenfield, P., & Gross, E. (2002). New forms of electronic media. In D. G. Singer & J. L. Singer (Eds.), Handbook of Children and the Media (pp. 73-99). London: Sage.

Sutton-Smith, Brian. (2001). The ambiguity of play. Harvard University Press.

Walker, J. R., & Bellamy, R. V. (2001). Remote control devices and family viewing. In J. Bryant & J. A. Bryant (Eds.), Television and the American family, second edition (pp. 75-89). Mahwah, NJ: Lawrence Erlbaum Associates.

White, E. S. (1986). Interpersonal bias in television and interactive media. In G. Gumpert & R. Cathcart (Eds.), Inter/Media: Interpersonal Communication in a Media World (pp. 110-120). Oxford: Oxford University Press.