Blog Car Radio: Minimal attention interface based on ranking of local blog entries

Blog Car Radio: Minimal attention interface
                                based on ranking of local blog entries

                                 Hiroshi Kori, Taro Tezuka and Katsumi Tanaka
                                Graduate School of Informatics, Kyoto University
                                Yoshida-Honmachi, Sakyo, Kyoto, 606-8501 Japan
                                   {kori,tezuka,tanaka}@dl.kuis.kyoto-u.ac.jp

                         Abstract                                entries from existing blog search engines, selects geograph-
                                                                 ically relevant entries, ranks them based on their suitability
   The media content presented to a vehicle driver is mainly     for sonification (speech synthesis) and relevance to the user-
auditory, since visual content is distracting and viewing it     selected category, and presents them as speech. Today, there
increases the risk of an accident. Music and radio are thus      are vast numbers of blog entries that contain personal ex-
commonly listened to while driving. However, these types of      periences and impressions tied to particular places. Blog
content rarely reflect regional characteristics and are there-   entries also often have a conversational style, which makes
fore not well suited for tourists who want to get information    them resemble radio shows. Tourists who are
about the region they are visiting. We have developed the        unfamiliar with the region would benefit greatly from our
Blog Car Radio system that presents blog entries in auditory     system, yet even for drivers who use the same routes every day,
style using sonification (speech synthesis). Blog entries are    the system can provide different information each time, be-
obtained from blog search engines, selected by distances         cause new blog entries are posted on a regular basis.
from the vehicle’s current location, and ranked based on            Our system has four primary functions: (1) searching with a
their suitability for sonification and relevance to the user-    query constructed from place names, (2) filtering blog entries
specified category. By using Blog Car Radio, a driver can        by geographical features, (3) classifying blog entries
obtain local information with only a small amount of dis-        into categories, and (4) ranking blog entries based
traction. In this paper, we focus on a method for ranking        on several features. Section 3 gives an overview of the system.
text content by its suitability for sonification.                In Section 4, we discuss suitability for sonification. Sec-
                                                                 tion 5 explains the method for ranking blog entries. Section
                                                                 6 outlines future work. Section 7 concludes the paper.
1 Introduction
                                                                 2 Related work
    Information systems are spreading to an increasing variety
of environments. For each environment, there is an appro-            There are various services that associate blog entries with
priate mode of content presentation. One important exam-         geographical information[3], but most of them require manual
ple is the driving environment, where the driver must pay        registration of each blog entry’s location. In contrast, the Blog
close attention to the act of driving. Visual presentation       Car Radio searches blog entries based on place names in
is not appropriate in this environment. The user interface       the target region. The method can be applied to any region,
in the driving environment must require minimal attention        as long as a set of place names with their coordinates is
from the driver[2]. At present, most drivers en-                 available. A conventional GIS (geographical information
joy music and radio programs while driving, since they are       system) can provide such information.
less distracting than visual content. However, such con-             One important issue in implementing the Blog Car Ra-
tent is not region-specific, and therefore cannot satisfy        dio system is the suitability of blog entries for sonification.
drivers who are interested in the region that they are driving   There has been much research on aural Web browsers for vi-
through.                                                         sually impaired people[4]. The goal of those systems
    To meet this need, we have developed the Blog Car Ra-        is to convert arbitrary Web content into speech that is as natural
dio system, which presents local blog (weblog) entries in an     and easily understandable as possible. In contrast, we take
auditory manner. The Blog Car Radio system collects blog         the approach of ranking blog entries by their suitability for
    
                                                                                                                            
                                                                                                                         
                                                                                                                               
                                                                                                                                                  
     Figure 1. Blog Car Radio system overview                                                  Figure 2. Spatial mapping of place names in
                                                                                                   blog entries

                                                                                               3.1    Blog search
sonification. When a large amount of content is avail-
able, our method is effective even if the speech                                                  In the first phase of the Blog Car Radio, the location data
synthesis technology is not perfect.                                                           of the vehicle, obtained by a GPS (global positioning sys-
                                                                                               tem) unit, is used to search for blog entries through an existing
                                                                                               blog search engine. From a database storing place names and their coor-
3 System overview
                                                                                               dinates, the k place names nearest to the vehicle’s
                                                                                               location are extracted. The disjunction of these place names is
   The aim of the Blog Car Radio is to present local blog                                      used as the search query for blog entries.
entries in auditory format, similar to a conventional car ra-
dio, so that drivers can enjoy Web contents without being                                      3.2    Geographical filtering
distracted by a visual interface. As illustrated in Figure 1,
the system has a client-server architecture in which each                                         Based on the search result in phase 1, the system extracts
vehicle is a client. Each client sends regular reports on the                                  place names that have strong locality. A blog entry often
vehicle’s location to the server, whenever transmission is                                     contains more than one place name, and some of them are
possible, such as when stopping at a traffic intersection. In                                  near the vehicle, while others are not. Such a situation is
return, the server sends the most suitable blog entries to the                                 described in Figure 2. The system calculates the average
clients. The client presents blog entries to the user by em-                                   distance G(E) between the contained place names and the ve-
ploying speech synthesis.                                                                      hicle’s location for each blog entry E.
   The Blog Car Radio is based on the concept of the min-
imal attention interface. The actions required of the user                                                     G(E) = avg( δ(l, p) )                                (1)
are kept to a minimum. One of the few ac-
tions that must be performed by the user is the selection of                                   Here l ranges over the place names that the entry E refers to, δ(l, p) is the
a category. This is similar to selecting a radio station on a                                  distance between l and the vehicle’s current location p, and avg() denotes
conventional radio, except that each category corresponds                                      the average. The system extracts a set number of blog entries
to a certain topic, such as dining, sightseeing, events, and                                   with the lowest values of G(E).
so on. The processing of the Blog Car Radio server consists
of four phases:                                                                                3.3    Categorization

    Phase 1:        Blog Search                                                                   From the result of phase 2, the system extracts
    Phase 2:        Geographical Filtering                                                     blog entries that contain terms relevant to the user-selected
    Phase 3:        Categorization                                                             category. A dictionary or an ontology is used for obtaining
    Phase 4:        Ranking                                                                    terms that are related to the category. We call these terms
                                                                                               category terms. Each category class c has a set of category
In the following subsections, we describe these phases in                                      terms. The system evaluates each blog entry by the ratio of cat-
more detail.                                                                                   egory terms contained in the entry, which is the number of

 Type of presentation            Visual            Auditory
 Scanning for relevant part      not difficult     difficult
 Emotional symbols               comprehensible    confusing
 Long sentences                  understandable    confusing
 Reference terms                 understandable    confusing

      Table 1. Comparison of presentation types

category terms divided by the sum of terms in the entry E
as below.

           R(c, E) = \frac{1}{m} \sum_{w \in W_E \cap W_c} tf(w)              (2)

tf(w) is the term frequency of the word w in entry E, and              Figure 3. Text features and sonification
m is the sum of the term frequencies over W_E. A blog
entry E is classified into the category c for which R(c, E) is
the highest among categories.                                      • Ratio of pronouns: The number of pronouns divided
                                                                     by the number of terms in the document. (t_3)
3.4    Ranking                                                   For a blog entry E, we denote the suitability for sonifica-
                                                                 tion based on textual features by T(E).
   In the ranking phase, blog entries in the search result are
ranked based on (1) suitability for sonification and (2) rel-               T(E) = t'_1(E) + t'_2(E) + t'_3(E)                    (3)
evance to the category. The following sections discuss the
ranking mechanism in more detail.                                t'_i (i = 1, 2, 3) are the standardized values of t_i (i = 1, 2, 3),
                                                                 respectively.
4 Suitability for sonification
                                                                 4.2    Keyword positions
   Table 1 compares characteristics of visual and auditory
presentations of text content, focusing on the limitations of        One important difference between visual and auditory
auditory presentation. The table is partially based on the       presentations of content is that in auditory presentation,
work by Kurohashi et al.[1]. It indicates that some blog         skipping irrelevant parts is either difficult or impossible.
entries are easy to follow when read aloud, yet others are       There are often cases where only a part of a blog entry is
hard to understand when presented aurally, depending on their    relevant to the user’s interest. In such a case, when the con-
textual features. The limited capability of auditory presenta-   tent is presented visually, the user can scan through the text
tion must be considered in the sonification process of blog      until reaching the relevant part, and start reading from
entries.                                                         there. On the other hand, if the content is presented au-
                                                                 rally, the user has to wait or fast-forward the content until
4.1    Textual features                                          it reaches the relevant part. This is stressful for the user, es-
                                                                 pecially when the irrelevant part is very long. The length of
   Blog entries often contain textual features that make them    the irrelevant part is an important factor for visual presenta-
difficult to understand when presented aurally.                  tion as well, yet not as much as in the case of auditory
For example, emotional symbols such as face marks are of-        presentation. This situation is illustrated in Figure 3.
ten found in blogs. Auditory presentation of such symbols            If the keyword appears in the initial part, then even though
will confuse the user. Kurohashi et al. discussed several of     there is a chance that only the initial part of the entry is
these features in transforming text content into speech[1].      relevant, the whole document may be related to the user-
The textual features considered in the Blog Car Radio system     specified category. This is especially significant in the case
are as follows:                                                  of place names, because of their special characteristics. Once
  • Ratio of symbols: The number of symbols divided by           a place name appears in the text, it is less likely to be
    the number of characters in the document. (t_1)              repeated. It serves as background information, or context. There-
                                                                 fore, in the case of place names, there is a better chance that the
  • Average sentence length: The average length of sen-          whole text is relevant to the place if it appears in the initial
    tences in the document. (t_2)                                part of the document. Since the Blog Car Radio presents
                                                                     5     Ranking of blog entries

     
     
                                                                        In addition to their suitability for sonification, blog en-
                                                                     tries are evaluated using the relevance to the user-selected
                                                                     category. The blog entries with the highest final integrated
                                                                     scores are presented to the user.

                                                                     5.1     Relevance to selected category

                                                                        In addition to the extraction process, the relevance to the
                                                                     category is also used in the ranking phase. The number of
                                                                     category terms contained in the document is the sim-
   Figure 4. Scoring model by keyword posi-                          plest measure of relevance to the category. We write
   tion                                                              R(E) for R(c, E), where c is the category into which entry
                                                                     E was classified.
blog entries aurally, it places more weight on the fact that
the relevant part appears early in the content.                      5.2     Integrated score
   The difference discussed here results from the presenta-
tion types. In ranking blog entries for auditory presentation,          The Blog Car Radio ranks the entries by the integrated
the Blog Car Radio system must place more weight on the              score based on textual features, keyword position, and rel-
fact that the relevant part appears initially in the content.        evance to the user-selected category. The blog entries with
Figure 4 illustrates the scoring model, where the keyword            the highest scores are presented to the user, in descending
position and the score are indicated by the x-axis and the y-        order of score. The integrated score S(E) for a blog en-
axis, respectively. When a keyword appears at the very               try E is calculated by the following formula, where α, β, γ
beginning of the content, the user’s stress for auditory pre-        are arbitrary coefficients.
sentation is assumed to be the same as in the case of visual
presentation. On the other hand, if the keyword appears in           S(E) = αT'(E) + βP'(E) + γR'(E),  α + β + γ = 1     (7)
a later part, the score is lower for auditory presentation
than for visual presentation. The score p of a keyword at            T'(E), P'(E), R'(E) are the standardized values of
textual position x (the number of characters before the key-         T(E), P(E), R(E), respectively. β corresponds to
word appears) is defined as follows.                                 that of Formula 4, because β in Formula 7 also indicates
    p(x) = \begin{cases}                                             the weight by the keyword position. In case of visual
             -\beta_v x & \text{(visual presentation)} \\            presentation, the coefficients fulfill conditions indicated
             -\beta_a x & \text{(auditory presentation)}             below.
           \end{cases}                                     (4)
                                                                                    \alpha_v = 0, \quad 0 < \beta_v \ll \gamma_v       (8)
                        0 < \beta_v < \beta_a              (5)
                                                                     In case of auditory presentation, the coefficients fulfill con-
β_v and β_a indicate the weight on the keyword position for vi-      ditions indicated below.
sual and auditory presentations, respectively. Keywords
used in the Blog Car Radio system consist of two types:              \alpha_a > 0, \quad \beta_a > 0, \quad \gamma_a > 0, \quad \beta_a > \beta_v   (9)
place names and category indicators. While the work by
Kurohashi et al. targeted documents in specific for-                 We also plan to obtain better coefficients for the scoring
mats in written language, we treat blog entries, which often         using training data, for example by employing an SVM.
have a conversational style. We did not consider the formats
in written language. The score P(E), which is based on the           6     Future work
keyword positions in the entry E, is defined below in terms of
the positions of the place names l_i and those of the category          The future work includes the enhancement of geograph-
terms c_j.                                                           ical filtering based on the trajectory record of the vehicle.
       P(E) = \frac{1}{m+n} \left\{ \sum_{i=1}^{m} p(l_i) + \sum_{j=1}^{n} p(c_j) \right\}                  (6)
                                                                     Thus we optimize the timing when the user can listen to the
                                                                     blog information about location where he is driving.
                                                                        Moreover, radio programs designed to satisfy lis-
                                                                     teners could be constructed. Simple sonification of a blog
m and n denote the total numbers of place names and                  entry would provide little satisfaction. By creating pro-
category terms contained in the entry, respectively.                 grams that go beyond the blog article contents, we should be
able to create entertaining programs with a spatial factor.
For example, a function could be provided that rec-
ommends music appropriate to the user’s environment:
a user might enjoy different music while driving
along the seaside than while driving through the moun-
tains.
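
   As an illustration only, the place-name search of Section 3.1 and the
geographical filtering of Section 3.2, on which the trajectory-based
enhancement would build, could be sketched in Python as follows. The
gazetteer entries and their approximate coordinates, the haversine distance
used for δ, the OR query syntax, and the substring matching of place names
are all assumptions of this sketch, not details given in the paper.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees.

    Assumed choice for the distance delta(l, p) in Formula 1.
    """
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(min(1.0, math.sqrt(h)))


def nearest_place_names(gazetteer, vehicle, k):
    """Section 3.1: the k place names closest to the vehicle's location."""
    return sorted(gazetteer, key=lambda name: haversine_km(gazetteer[name], vehicle))[:k]


def build_query(place_names):
    """Disjunction of place names; the quoted OR syntax is hypothetical."""
    return " OR ".join(f'"{name}"' for name in place_names)


def average_distance(entry_text, gazetteer, vehicle):
    """Formula 1: G(E), the mean distance from the vehicle to place names in E."""
    found = [name for name in gazetteer if name in entry_text]  # naive substring match
    if not found:
        return math.inf  # no known place name: treat the entry as maximally distant
    return sum(haversine_km(gazetteer[n], vehicle) for n in found) / len(found)


# Hypothetical gazetteer with approximate coordinates (lat, lon).
gazetteer = {
    "Kyoto Station": (34.9858, 135.7588),
    "Kinkaku-ji": (35.0394, 135.7292),
    "Osaka Station": (34.7025, 135.4959),
}
vehicle = (34.9876, 135.7590)

query = build_query(nearest_place_names(gazetteer, vehicle, k=2))
entries = ["Lunch near Kyoto Station was great.", "Shopping around Osaka Station."]
# Keep the entries with the lowest G(E), as in phase 2.
filtered = sorted(entries, key=lambda e: average_distance(e, gazetteer, vehicle))[:1]
```

Entries that mention no known place name receive an infinite G(E) here so
that they sort last; a real system might instead discard them.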

7 Conclusion

   The Blog Car Radio system enables the user to listen
to blog entries as if listening to radio programs, learning
personal experiences and impressions related to the location
of the vehicle, with minimal attention to the hardware. We
proposed a method to rank blog entries by their suitability
for our system, using the relevance to the user-selected
category and the suitability for sonification in the ranking.
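
   The ranking summarized above can be illustrated with a short Python
sketch of Formula 2 (category ratio), Formulas 4 and 6 (keyword-position
score), and Formula 7 (integrated score). The function names, the z-score
reading of "standardized values", the pre-tokenized word lists, and the
coefficient values are assumptions made for this example; the paper does
not specify them. T(E) is taken as a given input rather than computed.

```python
from statistics import mean, pstdev


def category_ratio(words, category_terms):
    """Formula 2: occurrences of category terms divided by all term occurrences."""
    if not words:
        return 0.0
    return sum(1 for w in words if w in category_terms) / len(words)


def position_score(place_positions, category_positions, beta_pos):
    """Formulas 4 and 6: mean of p(x) = -beta_pos * x over all keyword positions."""
    positions = list(place_positions) + list(category_positions)
    if not positions:
        return 0.0
    return mean([-beta_pos * x for x in positions])


def standardize(values):
    """Z-scores across the candidate entries (an assumed standardization)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def integrated_scores(T, P, R, alpha, beta, gamma):
    """Formula 7: S(E) = alpha*T'(E) + beta*P'(E) + gamma*R'(E)."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma must equal 1")
    Ts, Ps, Rs = standardize(T), standardize(P), standardize(R)
    return [alpha * t + beta * p + gamma * r for t, p, r in zip(Ts, Ps, Rs)]


# Hypothetical candidates: equal T(E), differing keyword positions and ratios.
T = [0.0, 0.0, 0.0]
P = [position_score([0], [10], 0.1),
     position_score([50], [60], 0.1),
     position_score([200], [300], 0.1)]
R = [category_ratio("ramen lunch near the station".split(), {"ramen", "lunch"}),
     category_ratio("a long walk then some ramen".split(), {"ramen", "lunch"}),
     category_ratio("nothing about food here".split(), {"ramen", "lunch"})]
S = integrated_scores(T, P, R, alpha=0.2, beta=0.5, gamma=0.3)
ranking = sorted(range(len(S)), key=lambda i: S[i], reverse=True)
```

With auditory-style coefficients (β relatively large), entries whose
keywords appear early are pushed toward the top of the ranking.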

References

[1] S. Kurohashi, T. Ohizumi, T. Shibata, N. Kaji, D. Kawahara,
    M. Okamoto, and T. Nishida. Media conversion of linguistic
    information for conversational knowledge process. Journal of
    Shakai-Gijutsu, 2:173–180, 2003.
[2] J. Pascoe, N. Ryan, and D. Morse. Issues in developing
    context-aware computing. First International Symposium on
    Handheld and Ubiquitous Computing (HUC 99), 1999.
[3] H. Uematsu, K. Numa, T. Tokunaga, I. Ohmukai, and
    H. Takeda. Ba-log: a proposal for the use of locational in-
    formation in blog environment. The 6th Web and Ontology
    Workshop, 2004.
[4] M. Wynblatt, D. Benson, and A. Hsu. Browsing the world
    wide web in a non-visual environment. In ICAD 97 Proceed-
    ings, pages 135–138, November 1997.