Blog Car Radio: Minimal attention interface based on ranking of local blog entries

Hiroshi Kori, Taro Tezuka and Katsumi Tanaka
Graduate School of Informatics, Kyoto University
Yoshida-Honmachi, Sakyo, Kyoto, 606-8501 Japan
{kori,tezuka,tanaka}@dl.kuis.kyoto-u.ac.jp

Abstract

The media content presented to a vehicle driver is mainly auditory, since visual content is distracting and viewing it increases the risk of an accident. Music and radio are thus commonly listened to while driving. However, these types of content rarely reflect regional characteristics and are therefore not well suited for tourists who want information about the region they are visiting. We have developed the Blog Car Radio system, which presents blog entries in auditory form using sonification (speech synthesis). Blog entries are obtained from blog search engines, selected by their distance from the vehicle's current location, and ranked by their suitability for sonification and their relevance to the user-specified category. By using Blog Car Radio, a driver can obtain local information with only a small amount of distraction. In this paper, we focus on the method for ranking text content by its suitability for sonification.

1 Introduction

Information systems are spreading to an increasing variety of environments. For each environment, there is an appropriate mode of content presentation. One important example is the driving environment, where the driver must pay close attention to the act of driving. Visual presentation is not appropriate in this environment. The user interface in the driving environment must require minimal attention from the driver[2]. At present, most drivers enjoy music and radio programs while driving, since they are less distracting than visual content. However, this content is not region specific, and therefore cannot satisfy drivers who are interested in the region they are driving in.

To meet this need, we have developed the Blog Car Radio system, which presents local blog (weblog) entries in an auditory manner. The Blog Car Radio system collects blog entries from existing blog search engines, selects geographically relevant entries, ranks them based on their suitability for sonification (speech synthesis) and their relevance to the user-selected category, and presents them by sound. Today, there is a vast number of blog entries that contain personal experiences and impressions with geographical locality. Blog entries also often have a conversational style, which makes them resemble radio shows. Tourists who are unfamiliar with the region would benefit greatly from our system, yet even for drivers who use the same routes every day, the system can provide different information each time, because new blog entries are posted on a regular basis.

Our system has four primary functions: (1) search by a query constructed from place names, (2) filtering of blog entries by geographical features, (3) classification of blog entries into a category, and (4) ranking of blog entries based on several features. Section 3 gives an overview of the system. Section 4 discusses the suitability for sonification. Section 5 explains the method for ranking blog entries. Section 6 indicates future work. Section 7 is the conclusion.

2 Related work

There are various services that unify blog entries with geographical information[3], but most of them require manual registration of a blog entry's location. In contrast, the Blog Car Radio searches blog entries based on place names in the target region. The method can be applied to any region, as long as a set of place names with their coordinates is available. A conventional GIS (geographical information system) can provide such information.

One important issue in implementing the Blog Car Radio system is the suitability of blog entries for sonification. There is much research on aural Web browsers for visually handicapped people[4]. The goal of those systems is to convert arbitrary Web content into speech that is as natural and easily understandable as possible. In contrast, we take the approach of ranking blog entries by their suitability for
sonification. When a large amount of content is available, our method turns out to be effective even if the speech synthesis technology is not perfect.

3 System overview

The aim of the Blog Car Radio is to present local blog entries in auditory format, similar to a conventional car radio, so that drivers can enjoy Web content without being distracted by a visual interface. As illustrated in Figure 1, the system has a client-server architecture in which each vehicle is a client. Each client regularly reports the vehicle's location to the server whenever transmission is possible, such as when stopping at a traffic intersection. In return, the server sends the most suitable blog entries to the client. The client presents the blog entries to the user by speech synthesis.

Figure 1. Blog Car Radio system overview

The Blog Car Radio is based on the concept of the minimal attention interface. The actions required from the user are reduced to as few as possible. One of the few actions that the user must perform is the selection of a category. This is similar to selecting a radio station on a conventional radio, except that each category corresponds to a certain topic, such as dining, sightseeing, or events.

The processing of the Blog Car Radio server consists of four phases:

Phase 1: Blog Search
Phase 2: Geographical Filtering
Phase 3: Categorization
Phase 4: Ranking

In the following subsections, we describe these phases in more detail.

3.1 Blog search

In the first phase of the Blog Car Radio, the location of the vehicle, obtained by a GPS (global positioning system) unit, is used to search for blog entries through an existing blog search engine. From a database storing place names and their coordinates, the k place names nearest to the vehicle's location are extracted. The disjunction of these place names is used as the search query for blog entries.

3.2 Geographical filtering

Based on the search result of phase 1, the system extracts place names that have strong locality. A blog entry often contains more than one place name, and some of them are near the vehicle while others are not. Such a situation is illustrated in Figure 2. For each blog entry $E$, the system calculates the average distance $G(E)$ between the place names it contains and the vehicle's location.

Figure 2. Spatial mapping of place names in blog entries

$$G(E) = \mathrm{avg}(\delta(l, p)) \tag{1}$$

Here $l$ is a place name that the entry $E$ refers to, $\delta(l, p)$ is the distance between $l$ and the current location $p$, and $\mathrm{avg}()$ denotes averaging. The system extracts a set number of blog entries with the lowest values of $G(E)$.

3.3 Categorization

From the result of phase 2, the system extracts the blog entries that contain terms relevant to the user-selected category. A dictionary or an ontology is used to obtain terms related to the category. We call these terms category terms. Each category class $c$ has a set of category terms. The system evaluates a blog entry by the ratio of category terms contained in the entry, which is the number of
category terms divided by the sum of terms in the entry $E$, as below.

$$R(c, E) = \frac{1}{m} \sum_{w \in W_E \cap W_c} tf(w) \tag{2}$$

Here $W_E$ is the set of terms in entry $E$, $W_c$ is the set of category terms of $c$, $tf(w)$ is the term frequency of the word $w$ in entry $E$, and $m$ is the total sum of the term frequencies over $W_E$. A blog entry $E$ is classified into the category $c$ whose $R(c, E)$ is the highest among the categories.

3.4 Ranking

In the ranking phase, blog entries in the search result are ranked based on (1) suitability for sonification and (2) relevance to the category. The following sections discuss the ranking mechanism in more detail.

4 Suitability for sonification

Table 1 compares the characteristics of visual and auditory presentations of text content, focusing on the limitations of auditory presentation. The table is partially based on the work by Kurohashi et al.[1]. It indicates that, depending on their textual features, some blog entries are easy to follow when read aloud, while others are hard to understand when presented aurally. The limited capability of auditory presentation must be considered in the sonification of blog entries.

Table 1. Comparison of presentation types

Type of presentation          Visual           Auditory
Scanning for relevant part    not difficult    difficult
Emotional symbols             comprehensible   confusing
Long sentences                understandable   confusing
Reference terms               understandable   confusing

4.1 Textual features

Blog entries often contain textual features that make them difficult to understand when presented aurally. For example, emotional symbols such as face marks are often found in blogs. Auditory presentation of such symbols will confuse the user. Kurohashi et al. discussed several of these features in transforming text content into speech[1]. The textual features considered in the Blog Car Radio system are as follows:

• Ratio of symbols: The number of symbols divided by the number of characters in the document. ($t_1$)

• Average sentence length: The average length of sentences in the document. ($t_2$)

• Ratio of pronouns: The number of pronouns divided by the number of terms in the document. ($t_3$)

For a blog entry $E$, we denote the suitability for sonification based on textual features by $T(E)$.

$$T(E) = t'_1(E) + t'_2(E) + t'_3(E) \tag{3}$$

Here $t'_i$ $(i = 1, 2, 3)$ are the standardized values of $t_i$ $(i = 1, 2, 3)$, respectively.

4.2 Keyword positions

One important difference between visual and auditory presentations of content is that in auditory presentation, skipping an irrelevant part is either difficult or impossible. There are often cases where only a part of a blog entry is relevant to the user's interest. In such a case, when the content is presented visually, the user can scan through the text until reaching the relevant part and start reading from there. On the other hand, if the content is presented aurally, the user has to wait or fast-forward until the relevant part is reached. This is stressful for the user, especially when the irrelevant part is very long. The length of the irrelevant part is an important factor for visual presentation as well, yet not as much as in the case of auditory presentation. This situation is illustrated in Figure 3.

Figure 3. Text features and sonification

If the keyword appears in the initial part, even though there is a chance that only the initial part of the entry is relevant, the whole document may be related to the user-specified category. This is especially significant in the case of place names, because of their special characteristics. Once a place name appears in the text, it is less likely to be repeated. It is background information, or context. Therefore, in the case of place names, there is a better chance that the whole text is relevant to the place if it appears in the initial part of the document. Since the Blog Car Radio presents
blog entries aurally, it places more weight on the fact that the relevant part appears early in the content.

The difference discussed here results from the presentation types. In ranking blog entries for auditory presentation, the Blog Car Radio system must place more weight on the fact that the relevant part appears initially in the content. Figure 4 illustrates the scoring model, where the keyword position and the score are indicated by the x-axis and the y-axis, respectively. When a keyword appears at the very beginning of the content, the user's stress for auditory presentation is assumed to be the same as in the case of visual presentation. On the other hand, if the keyword appears in a later part, the score is lower for auditory presentation than for visual presentation.

Figure 4. Scoring model by keyword position

The score $p$ of a keyword at position $x$ (the number of characters before the keyword appears) is defined as follows.

$$p(x) = \begin{cases} -\beta_v x & \text{(visual presentation)} \\ -\beta_a x & \text{(auditory presentation)} \end{cases} \tag{4}$$

$$0 < \beta_v < \beta_a \tag{5}$$

Here $\beta_v$ and $\beta_a$ indicate the weight on the keyword position for visual and auditory presentations, respectively. Keywords used in the Blog Car Radio system consist of two types: place names and category indicators. While the work by Kurohashi et al. targeted documents in specific written-language formats, we treat blog entries, which often have a conversational style. We did not consider the written-language formats. The score $P(E)$, which is based on the keyword positions in the entry $E$, is computed as below from the positions of the place names $l_i$ and those of the category terms $c_j$.

$$P(E) = \frac{1}{m + n} \left\{ \sum_{i=1}^{m} p(l_i) + \sum_{j=1}^{n} p(c_j) \right\} \tag{6}$$

Here $m$ and $n$ indicate the total number of place names and category terms contained in the entry, respectively.

5 Ranking of blog entries

In addition to their suitability for sonification, blog entries are evaluated using their relevance to the user-selected category. The blog entries with the highest final integrated scores are presented to the user.

5.1 Relevance to selected category

In addition to the extraction process, the relevance to the category is also used in the ranking phase. The number of category terms contained in the document is the simplest measure of the relevance to the category. We write $R(E)$ for $R(c, E)$, where $c$ is the category into which entry $E$ was classified.

5.2 Integrated score

The Blog Car Radio ranks the entries by an integrated score based on textual features, keyword position, and relevance to the user-selected category. The blog entries with the highest scores are presented to the user in descending order of score. The integrated score $S(E)$ for a blog entry $E$ is calculated by the following formula, where $\alpha$, $\beta$, $\gamma$ are arbitrary coefficients.

$$S(E) = \alpha T'(E) + \beta P'(E) + \gamma R'(E), \quad \alpha + \beta + \gamma = 1 \tag{7}$$

Here $T'(E)$, $P'(E)$, $R'(E)$ are the standardized values of $T(E)$, $P(E)$, $R(E)$, respectively. $\beta$ corresponds to that of Formula (4), because $\beta$ in Formula (7) also indicates the weight on the keyword position. In the case of visual presentation, the coefficients fulfill the conditions below.

$$\alpha_v = 0, \quad 0 < \beta_v \ll \gamma_v \tag{8}$$

In the case of auditory presentation, the coefficients fulfill the conditions below.

$$\alpha_a > 0, \quad \beta_a > 0, \quad \gamma_a > 0, \quad \beta_a > \beta_v \tag{9}$$

We also plan to obtain better coefficients for the scoring using training data, for example by employing an SVM.

6 Future work

Future work includes the enhancement of geographical filtering based on the trajectory record of the vehicle. This would allow us to optimize the timing at which the user listens to blog information about the location where he or she is driving.

Moreover, radio programs designed to satisfy listeners could be constructed. Simple sonification of a blog entry would provide little satisfaction. By creating programs that go beyond the blog article contents, we should be able to create entertaining programs with a spatial factor. For example, a function could be provided that recommends music appropriate to the user's environment. For instance, a user might enjoy different music while driving along the seaside than while driving through the mountains.

7 Conclusion

The Blog Car Radio system enables the user to listen to blog entries as if listening to radio programs, learning personal experiences and impressions related to the location of the vehicle, with minimal attention to the hardware. We proposed a method to rank blog entries by their suitability for our system. We used the relevance to the user-selected category and the suitability for sonification in the ranking.

References

[1] S. Kurohashi, T. Ohizumi, T. Shibata, N. Kaji, D. Kawahara, M. Okamoto, and T. Nishida. Media conversion of linguistic information for conversational knowledge process. Journal of Shakai-Gijutsu, 2:173–180, 2003.
[2] J. Pascoe, N. Ryan, and D. Morse. Issues in developing context-aware computing. First International Symposium on Handheld and Ubiquitous Computing (HUC 99), 1999.
[3] H. Uematsu, K. Numa, T. Tokunaga, I. Ohmukai, and H. Takeda. Ba-log: a proposal for the use of locational information in blog environment. The 6th Web and Ontology Workshop, 2004.
[4] M. Wynblatt, D. Benson, and A. Hsu. Browsing the world wide web in a non-visual environment. In ICAD 97 Proceedings, pages 135–138, November 1997.
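As a supplementary illustration of the geographical filtering and categorization phases (Sections 3.2 and 3.3), the following Python sketch shows one way the average distance G(E) of Equation (1) and the category ratio R(c, E) of Equation (2) might be computed. It is not the authors' implementation. The paper does not specify the distance function δ, so the great-circle (haversine) distance is an assumption, and the function names, the dictionary layout of an entry, and the tokenized term lists are all hypothetical.

```python
import math
from collections import Counter


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees.
    Assumption: the paper does not say which distance delta(l, p) is used."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def average_distance(place_coords, vehicle, distance=haversine_km):
    """G(E): mean distance between the places an entry mentions and the vehicle."""
    if not place_coords:
        raise ValueError("entry mentions no known place name")
    return sum(distance(l, vehicle) for l in place_coords) / len(place_coords)


def nearest_entries(entries, vehicle, n):
    """Keep the n entries with the lowest G(E).
    Hypothetical layout: each entry is a dict with a "places" coordinate list."""
    return sorted(entries, key=lambda e: average_distance(e["places"], vehicle))[:n]


def category_ratio(entry_terms, category_terms):
    """R(c, E): summed term frequency of category terms divided by m,
    the total term frequency of the entry."""
    tf = Counter(entry_terms)
    m = sum(tf.values())
    if m == 0:
        return 0.0
    return sum(count for w, count in tf.items() if w in category_terms) / m


def classify(entry_terms, categories):
    """Assign the entry to the category c with the highest R(c, E)."""
    return max(categories, key=lambda c: category_ratio(entry_terms, categories[c]))
```

In this sketch the distance function is passed as a parameter so that a road-network or planar distance could be substituted without changing the filtering logic.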
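The keyword-position score and the integrated score (Equations (4), (6) and (7)) can be sketched in the same spirit. This is a minimal illustration under stated assumptions, not the system's actual code: "standardized" is interpreted here as a z-score (zero mean, unit variance) over the candidate entries, the suitability values T(E) are taken as already computed, and all function and parameter names are hypothetical. The slope of Equation (4) is named `beta_pos` to keep it apart from the mixing weight `beta` of Equation (7).

```python
from statistics import mean, pstdev


def position_score(x, beta_pos):
    """p(x) = -beta_pos * x (Equation 4); x is the number of characters
    before the keyword. Use a larger beta_pos for auditory presentation."""
    return -beta_pos * x


def keyword_position_score(place_positions, category_positions, beta_pos):
    """P(E): mean of p(x) over the m place-name positions and the
    n category-term positions in the entry (Equation 6)."""
    positions = list(place_positions) + list(category_positions)
    if not positions:
        raise ValueError("entry contains no keyword")
    return sum(position_score(x, beta_pos) for x in positions) / len(positions)


def standardize(values):
    """z-scores over the candidate entries.
    Assumption: the paper does not define its standardization."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


def integrated_scores(T, P, R, alpha, beta, gamma):
    """S(E) = alpha*T'(E) + beta*P'(E) + gamma*R'(E), with
    alpha + beta + gamma = 1 (Equation 7). T, P, R hold one value per entry."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("coefficients must sum to 1")
    Ts, Ps, Rs = standardize(T), standardize(P), standardize(R)
    return [alpha * t + beta * p + gamma * r for t, p, r in zip(Ts, Ps, Rs)]


def rank(scores):
    """Indices of the entries in descending order of integrated score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Because P(E) is standardized before mixing, the magnitude of `beta_pos` does not affect the ranking in this sketch; the emphasis on early keywords for auditory presentation then comes from choosing a larger `beta` in the integrated score, in line with condition (9).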