Semantic integration of TV data and services: A survey on challenges, and approaches
Web Intelligence and Agent Systems: An International Journal 1 (2010) 1–24, IOS Press

Bassem Makni a, Stefan Dietze a and John Domingue a
a Knowledge Media Institute, The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom
E-mail: {b.makni,s.dietze,j.b.domingue}@open.ac.uk

Abstract. In this paper, we survey the impact of the semantic Web and semantic Web services on enabling novel television features. These novel features include being Internet based, mobile, interactive, personalised, social and semantic. Many research efforts have contributed to extending different aspects of television delivery and consumption, with respect to content production, metadata handling, semantic enrichment and recommendation. They adhere to the semantic Web vision for two goals: the seamless integration of their data and the automation of their Web services interoperation and composition. Two main semantic Web services approaches are used, namely a top-down and a bottom-up approach. We study the contribution of different Semantic Web and Semantic Web Services-based approaches to enabling novel TV features.

Keywords: Next-generation TV, Semantic integration, TV data, TV services, WSMO, WSMO-lite, micro-WSMO, SAWSDL, semantic TV content annotation, social TV

1. Introduction

The television concept has evolved and ramified from its early form, the telegraph transmission of vision [38], to our contemporary perception of television. However, this contemporary perception is becoming increasingly vague, as today television content is scattered over broadcast streams, the Web, and private Internet Protocol television (IPTV) networks, and is accessible, besides the classical sofa TV, via diverse enhanced devices such as smart phones, tablets, Apple TV (http://www.apple.com/appletv/) and soon via Google TV (http://www.google.com/tv/) devices. This content scattering complicates the TV experience by increasing the time spent searching for relevant media. Thus, novel TV features, including personalisation and social network integration, are required to enhance the TV experience.

Since many parties, from different backgrounds and with different concerns, are involved in materialising the novel TV features, for instance content producers, broadcasters, social scientists and interaction designers, interoperability is a key issue. Chan and Zeng [22] attribute the proliferation of metadata schemas to differences in requirements during the design phase with respect to intended users, subject domain, project needs, etc. These differences are radical among the TV parties, and are reflected at two levels:

– TV data integration
– TV services interoperability

By TV data, we refer to the multimedia content and to the metadata describing this content. By TV services, we refer to the operations on the multimedia content and its metadata, such as consumption, publishing and retrieval.

The TV data integration challenge is to some extent similar to the integration of Web data, in terms of diversity and distribution. Thus, we conjecture that the lessons learnt from the Web of data movement are valuable for TV data integration, and we especially defend the efficacy of semantic integration of TV data.

1570-1263/10/$17.00 © 2010 – IOS Press and the authors. All rights reserved
B. Makni et al. / Semantic integration of TV data and services

Similar challenges are posed with respect to the integration of services. The Semantic Web services (SWS) research efforts have been motivated by the need for automation of Web services related tasks such as discovery, orchestration, mediation, and composition. The first intention was the full automation of one or more of these tasks by providing conceptual models that comprehensively delineate the semantics of the services, such as Web Ontology Language for Services (OWL-S) [71] and Web Services Modeling Ontology (WSMO) [94]. These efforts led to complex frameworks and tools for Semantic Web services annotation and brokering. Even with tools and communities built around these SWS approaches, their complexity hindered their large-scale uptake for the automation of services tasks by Service-Oriented Architecture (SOA) users. Moreover, the limited emergence of potential service automation scenarios, for instance based on services which allow the creation of more complex orchestrations or which provide analogous functionalities that could be exploited via SWS discovery, has called into question the actual potential for SWS automation.

Hence, efforts were made to redefine the SWS notion and the scope of its underlying technologies. More recently, a lightweight approach has been proposed with the aim of popularising SWS annotation, namely with the standardisation of Semantic Annotations for WSDL and XML Schema (SAWSDL) [65] to ensure interoperability. While taking advantage of the lessons learnt from more heavyweight approaches, the lightweight approaches offer a less costly way of annotating services. However, they lack support for more complex reasoning, and therefore provide only limited opportunities for the automation of services related tasks.

From our experience of using SWS to broker TV related services within the European project NoTube (http://www.notube.tv/), which explores television's future in the ubiquitous Web, two requirements for services management are prevalent: a) offering a lightweight means for service annotation and documentation, and b) enabling service brokerage through automating service discovery and orchestration [32].

The literature contains surveys on semantic integration [82,13,108,33], Semantic Web Services [93,35,86,73] and television evolution [2,12,63,41], but the impact of Semantic Web technologies on the television evolution has not yet been tackled as far as we know. This is our main motivation for writing this survey, as TV data is an interesting use case for Semantic Web technologies.

To clarify the specificity of TV related services, we start by introducing core features of next generation TV, implying different parties and service types (synchronous and asynchronous services, WS-* compliant services and Web APIs, etc.). Based on the requirements which arise from these aspects, we survey the main uses of Semantic Web (SW) technologies within the TV domain and the SWS approaches, which we classify as top-down (heavyweight) or bottom-up (lightweight), and their potential for supporting the above requirements. Finally, we foresee the upcoming research challenges in the SWS domain.

2. Next generation TV features

"New technology is transforming the TV industry", says Mark Thompson, BBC CEO, for the Observer [83]. We conjecture that the biggest catalyst of this transformation will be SW technologies.

In this section, we classify the new TV features; the following sections demonstrate the impact of SW technologies on materialising these features.

2.1. The integration of the Internet and television

The integration of the Internet and television is being realised in both directions: (a) by television services, like Video on Demand (VoD), becoming available over the Internet and (b) by television becoming connected to the Internet via connected TVs and connected set-top boxes. Full integration with a built-in Internet browser in the TV set-top box will be popularised by projects such as Google TV (http://www.google.com/tv/) and Apple TV (http://www.apple.com/appletv/). We focus on direction (a), which we designate Internet based television.

Simpson and Greenfield [98] enumerate four ramifications, namely IPTV, Internet Protocol Video on Demand (IPVoD), Internet TV and Internet video, and define criteria that classify them. We note that these ramifications cross the boundary from television to video, especially for Internet video and IPVoD. This is consistent with the Telecommunication Standardization Sector of the International Telecommunication Union
(ITU-T) (http://www.itu.int/ITU-T/index.html) definition of IPTV as "multimedia services such as television/video/audio/text/graphics/data delivered over IP based networks managed to provide the required level of QoS/QoE, security, interactivity, and reliability".

Thus, our survey covers television and digital video, as they share many research challenges for annotation and delivery.

2.1.1. Internet usefulness for TV experience

The two major advantages of Internet based television over broadcast television are (a) the built-in back channel and (b) the Internet Protocol (IP).

a) The back channel or return path carries the user's feedback to the broadcaster. This is compulsory for new TV features such as interactivity and personalisation.
b) While digital video is a precisely timed and continuous stream and IP networks carry a loosely timed collection of data fragmented into discrete packets, both technologies are coupled for the following reasons:
– the low cost of IP networking, owed to massive equipment production,
– IP standardisation,
– IP independence from the physical communication layer, which could be a wired, wireless, 3G, or 4G based network,
– IP ubiquity, i.e. support by our everyday devices, like mobiles, tablets, and game consoles, which allows the TV mobility feature.

2.1.2. IP Video classification criteria

The main criteria used to classify IP video delivery systems are:

Network type: By network type we refer to network openness. While IPTV uses private networks to deliver content to subscribed users, IPVoD, Internet TV and Internet video are delivered via public networks, typically the Internet.

Quality of Service: The Quality of Service (QoS) can be used to assign high priority to video packets so they are privileged by routers. However, this is meaningless in public networks such as the Internet, where each application can mark its packets as high priority, so managed video delivery QoS is used only in IPTV [98].

Multicasting support: Multicast addressing (http://en.wikipedia.org/wiki/Multicast) is a network technology for the delivery of information to a group of destinations simultaneously, using the most efficient strategy to deliver the messages over each link of the network only once and creating copies only when the links to the multiple destinations split. In the video delivery context, multicasting is crucial because of the amount of data that could congest the network when redundant packets flood the routers [45]. IPVoD and Internet video support only unicasting, to allow individual play functionalities such as pause and rewind, while Internet TV uses replicated unicasting, in which messages are sent one by one to each client [45], and IPTV supports multicasting.

Delivery methods: Two methods could be used for video delivery: either an HTTP based method called progressive download, or a streaming method via dedicated streaming protocols such as Real-time Transport Protocol (RTP) over User Datagram Protocol (UDP). The stateless nature of UDP is useful for real time video streaming because dropping packets is preferable to waiting for delayed packets.

Digital rights management: Broadly refers to a set of policies, techniques, and tools that guide the proper use of digital content [102]. Video is one of the main applications of Digital rights management (DRM), especially in IPTV and IPVoD.

Discussion

We excerpt the criteria chosen by Simpson and Greenfield [98] to classify IP video in Table 1.

Novel technologies are continuously changing the way we consume and produce television content. Thus, Internet video has been classified [98] and reclassified [99] during the last decade by refining the classification criteria to consider new technological advances. We foresee that the discussed classification will soon be made obsolete by initiatives like Hybrid Broadcast Broadband TV (HbbTV) (www.hbbtv.org) and Google TV.

The emergence of hybrid architectures, such as HbbTV, will address the insufficiency of the existing network infrastructure for delivering large video content, like High-definition television (HDTV) and 3D television (3D-TV).

Furthermore, Google TV predicts a full integration of the Internet and television by providing a built-in
Internet browser within the TV, and by allowing cross-searching over the Internet and television content.

Table 1. IP video classification

Criterion           IPTV             IPVoD                  Internet TV            Internet video
Network type        Private          Public                 Public                 Public
Quality of Service  Managed QoS      Unmanaged QoS          Unmanaged QoS          Unmanaged QoS
Multicast support   Multicasting     Unicasting             Replicated unicasting  Unicasting
Delivery method     RTP over UDP     Progressive download   HTTP streaming         HTTP streaming
Rights management   Strong with DRM  Strong often with DRM  Fairly strong          Weak or nonexistent

2.2. Interactive TV

Interactive Television (iTV) is an active watching experience engaging viewers in choices and actions. This aim of making television more dynamic and participatory is as old as television itself. However, the attempts of the last century to make TV more interactive were not followed by the expected uptake [54]. This was mainly due to the high costs [92] and the intrusive interfaces [11] that made interactivity cumbersome. Hence, Jensen [54] described iTV as vaporware (http://en.wikipedia.org/wiki/Vaporware), that is, an advertised product, often computer software, whose launch has not happened yet and might never happen. Recently, new pragmatic approaches to iTV, consisting of building on what exists and lowering the challenges, have given iTV new momentum.

2.2.1. Interactivity types

Gawlinski [43] considers the lack of an agreed framework for describing different types of interactivity to be one of the iTV difficulties. However, the following taxonomy of TV interactivity types from Curry [28] is the most widely agreed one:

Distribution interactivity refers to controlling the content delivery but not the content itself.

Information interactivity consists of choosing the delivered information, such as weather or local news.

Participation interactivity involves the viewer in actions and choices that bring dynamic content. A typical action is voting, and a possible choice is the camera angle during a soccer game.

2.2.2. Re-enabling the social aspect of TV

Initially, watching TV was a social activity where the whole family gathered around their sofa TV to watch the news or the night movie and talked about it with friends the following day. Later on, with the increasing use of the Personal video recorder (PVR), watching TV became an individual activity. That reduced the viewers' discussions about programs, even though, according to the water-cooler effect [31], these discussions could be more interesting and more entertaining for the viewers than the program itself.

Social TV is defined as the opportunity to interlink people and provide communication features to create connectedness via the TV [56,21,70,109]. We define it as an adaptation of the social principle "It's not what you know, it's who you know" [79] to the TV context: "It's not what you watch, it's what who you know watch". When we consider asynchronous communication, from the taxonomy of TV sociability [25], it becomes "It's not what you watch, it's what who you know watch or have watched".

The study of social interactive television is also not new [21], as Wellens [111] already stated in 1979 that "interactive television represents means of linking individuals together by providing each with an electronically mediated representation of the other's voice and visual presence". However, the lack of TV specific guidelines for interaction enforced the use of Human computer interaction (HCI) techniques [44], which resulted in a cumbersome social interaction that does not meet the expected seamless TV experience.

Recently, re-enabling the social aspect of TV has gained increasing interest, so much so that MIT Technology Review (http://www.technologyreview.com/tr10/) listed it among the ten most important emerging technologies of 2010. This interest is mainly owed to the new opportunities offered by virtual social networks and to the commercial potential of social TV.
We distinguish two means for social interactivity in the TV context:

1. Using ancillary devices: This exploits the media multitasking practice where the user simultaneously uses the Internet and mobile phones while watching TV. According to the Nielsen Three Screen Report [81] survey about Television, Internet and Mobile Usage in the US, simultaneous usage rose in the first quarter of 2010 by 35% to reach 60% of TV viewers. Comcast's Tunerfish (tunerfish.com) uses web and mobile interaction to allow friends from Twitter (twitter.com) and Facebook (facebook.com) to share feedback about TV shows. The ancillary device could also be an enhanced control such as a sensor-enhanced pillow [6].
2. Directly on the TV screen: The social interaction is displayed on top of the watched program, in the form of avatars [26,80] for example.

2.3. Personalised television

Similarly to the Web content expansion phenomenon, the TV evolution to Internet TV has been coupled with the scattering of TV and multimedia content, where the user struggles to find relevant content, which complicates leisure time. Hence the importance of the personalisation feature within next-generation TV.

The personalisation of TV could benefit from research advances in recommendation systems but requires prior adaptation to TV content and TV interaction. Ardissono et al. [3] enumerate the following challenges to enable personalized television:

Viewer Modelling: The acquisition, representation, and utilization of information about viewers, such as their characteristics (e.g., gender and age), preferences, interests, beliefs, and their viewing behaviour. This includes individual and group modelling.

Viewer Identification: The recognition of the TV viewer(s) to provide personalized services.

Program Processing: Implies program segmentation, summarisation, and indexing.

Program Representation and Reasoning: Modelling the programs' characteristics to measure similarities or dissimilarities between the different programs. Reasoning about programs can include a range of techniques, such as recommendation techniques based on collaborative filtering, for example.

Presentation Generation and Tailoring: The selection, organization, and customisation of television material based on viewer queries, processed programs, and viewer models.

Interaction Management: Adapting human computer interaction techniques to the TV context. The human-TV interaction should include mechanisms for attention and dialogue management.

Evaluation: Evaluation of the user's satisfaction with respect to speed and accuracy: the speed with which the system adapts to the user's preferences, and accuracy in terms of precision and recall of the recommended programs.

2.4. Next-generation TV challenges

Obviously, materialising the previously discussed TV features involves many parties, from different backgrounds and with different concerns:

1. Network experts to adapt the Internet infrastructure for multimedia and TV data delivery.
2. HCI specialists to define TV interactivity patterns and human-TV interaction.
3. Personalisation experts to propose TV specific recommendation systems.
4. Social scientists to build social networks around TV preferences.

Thus interoperability is a key issue for TV data integration.

Guenther and Radebaugh [47] define interoperability as "the ability of multiple systems with different hardware and software platforms, data structures, and interfaces to exchange data with minimal loss of content and functionality". Since any loss of TV data content or functionality implies degradation of reasoning capacities such as personalisation, the seamless interoperability of TV data is a high level requirement.

Chan and Zeng [22] define three levels of interoperability:

Schema level: when different schemas are used.

Record level: when the same schema is used with different semantic interpretations of the elements.

Repository level: when access to the data depends on the repository used, which hinders cross-collection searching.
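The three interoperability levels can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the two providers, their field names, the unit conventions and the helper functions are all hypothetical and do not come from any of the standards or systems surveyed here. It shows a field mapping (schema level), a unit conversion for an element whose interpretation differs (record level), and a single search function over several repositories (repository level).

```python
from dataclasses import dataclass

# Hypothetical records from two providers describing the same programme.
# Field names and units are invented for illustration only.
RECORD_A = {"title": "Evening News", "duration": 1800}        # duration in seconds
RECORD_B = {"programmeName": "Evening News", "length": 30}    # length in minutes

# Schema level: map each provider's field names onto shared names.
FIELD_MAP = {
    "provider_a": {"title": "title", "duration": "duration"},
    "provider_b": {"programmeName": "title", "length": "duration"},
}

# Record level: the duration element is interpreted differently (seconds
# versus minutes), so values are converted to a common unit.
UNIT_TO_SECONDS = {"provider_a": 1, "provider_b": 60}


@dataclass(frozen=True)
class Programme:
    title: str
    duration_s: int


def normalise(provider, record):
    """Translate one provider-specific record into the shared representation."""
    mapped = {FIELD_MAP[provider][key]: value for key, value in record.items()}
    return Programme(mapped["title"], mapped["duration"] * UNIT_TO_SECONDS[provider])


# Repository level: one access function hides which repository holds the data.
REPOSITORIES = {"provider_a": [RECORD_A], "provider_b": [RECORD_B]}


def search(title):
    """Return every programme with the given title, across all repositories."""
    hits = []
    for provider, records in REPOSITORIES.items():
        for record in records:
            programme = normalise(provider, record)
            if programme.title == title:
                hits.append(programme)
    return hits
```

In this sketch both records normalise to the same `Programme` value, so a cross-collection query such as `search("Evening News")` returns one hit per repository. A hand-written mapping of this kind does not scale across many schemas, which is the gap that shared ontologies are expected to fill.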
Table 2. SW and SWS impact on TV parties interoperability
Columns: Interoperability level; SW; SWS; Expected impact
Schema: X; TV data schema interoperability
Record: X; Agreed semantics
Repository: X; Unified way to access TV data

The three levels of interoperability issues are omnipresent within the different TV parties.

In the following sections, we try to answer the following questions:

– To what extent can semantic Web technologies enable the semantic integration of TV data?
– How can SWS enable TV services interoperability?
– From our experience in brokering TV services, what hinders the SWS uptake?

3. Semantic integration of TV data

Ziegler and Dittrich [114] define semantic integration as "the task of grouping, combining or completing data from different sources by taking into account explicit and precise data semantics in order to avoid that semantically incompatible data is structurally merged."

TV data is multimodal data composed of a) multimedia content, b) structured metadata descriptions of the multimedia content, and c) semi-structured metadata with free text descriptions of the programs, embedded in the Electronic program guide (EPG) for example. The semantic integration of TV data therefore benefits from research advances in each modality: a) semantic integration and retrieval of multimedia documents, b) multimedia metadata interoperability, and c) Natural Language Processing (NLP) and semantic enrichment research.

The main efforts in applying semantic Web technologies to TV data integration aim at reasoning-based personalisation of TV and at bringing the social aspect to TV. Thus the next-generation TV challenges are:

1. Adapting the advances of semantic multimedia retrieval to TV content.
2. Semantic integration of the different TV related metadata.
3. Semantic integration of semi-structured and structured TV data.
4. Reasoning-based personalisation of TV content.
5. Enabling Social TV.

We organise the following sections according to each challenge and discuss the impact of SWS on solving them.

3.1. Bridging the semantic gap

Effective management of multimedia assets, including content-based indexing and retrieval, imposes a deep understanding of the content at the semantic level [23]. That could be achieved either manually or automatically. On the one hand, manual semantic annotation of multimedia content suffers from subjectivity of descriptions, which hinders interoperability [58], and is far from being a scalable solution. On the other hand, the automatically extracted multimedia features are low-level perceptual features, far from the high-level semantic descriptions that match human cognition [58]. In order to improve Content-Based Multimedia Indexing and Retrieval (CBMIR) accuracy, research efforts have shifted from designing sophisticated low-level feature extraction algorithms to bridging this so-called semantic gap [67].

Kompatsiaris et al. [58] classify these efforts into the following categories: relevance feedback [96], knowledge based [112] and multimedia ontologies. Liu et al. [67] distinguish a specific category for Web image retrieval, namely the fusion of HTML text with visual content from Web images, which we include in multimodal fusion [9]. We focus on the multimedia ontologies category as it is the most relevant for the Semantic Web domain.

Supported by the proven effectiveness of systems with a limited context of application [112], the knowledge based approaches model the domain of application either explicitly or implicitly.

Explicit: Model based approaches use a priori domain-specific knowledge [1,103] for guiding low-level feature extraction, high-level descriptor derivation and symbolic inference [58]. Chang et al. [24] introduced the idea of semantic visual templates to link visual features to semantics, where each template represents a personalized view of concepts. Prior knowledge inspired by cinematic principles is also relevant in video classification [17]. Lighting level differentiates low-light horror movies from well-lit comedies [17], and motion speed separates fast action movies and sports from slow drama. Audio effects are also pertinent for automatically detecting horror movies [76].

Implicit: Uses machine learning techniques for discovering complex relationships and interdependencies between numerical image data and the perceptually higher level concepts [58].

Multimedia ontologies play a key role in modelling this knowledge in a shared formalisation that allows automatic binding of high-level concepts from the model to the extracted low-level features.

3.1.1. Multimedia ontologies

Kompatsiaris et al. [58] motivate the usage of multimedia ontologies in formalizing multimedia semantics, as they fulfil the following requirements:

1. Persistence: Modelling the evolution of multimedia semantics, such as the evolution of typical desk components, to allow usage in future applications.
2. Consistency: Precise and unambiguous semantic annotations are crucial for efficient reasoning about the multimedia content.
3. Context enabled: As multimedia objects exist in context, modelling this context information is beneficial for multimedia retrieval [39].

We add that

4. Syntactic annotations are liable to ambiguities and thus neither interoperable nor interpretable by machines.
5. Semantic annotations refer to knowledge formalised by an external ontology to help solve ambiguities via persistent and implicit annotations.
6. They are also operational annotations, as they are intended to be consumed and generated by software agents.

Naphade et al. [78] have modelled the large-scale concept ontology for multimedia (LSCOM) to enable automatic extraction of broadcast news video. Similar approaches to link low-level Moving Picture Experts Group-7 (MPEG-7) features to higher level concepts include [8,40,91,10]. Hauptmann et al. [50] have compared the accuracy of concept-based retrieval using the LSCOM ontology with that of text-based retrieval over a collection of news videos, and concluded that concept-based approaches are effective.

3.1.2. Linked multimedia

The success of the semantic Web vision has been limited to a small scope in enterprises and in virtual communities. This is essentially due to the limited scope of their domain knowledge, which eases its modelling, together with the immaturity of ontology merging algorithms. That led to isolated islets of semantic Web data. This observation motivated the Linked Open Data (LOD) proposition [14]. This new vision defines the Semantic Web as "a technology for sharing data, just as the hypertext Web is for sharing documents" [15]. The linked data movement boosted the adoption of the Semantic Web vision at a large scale by linking 25 billion Resource Description Framework (RDF) triples from 203 datasets (as of September 2010).

The need for this boost was also present within the semantic multimedia community, hence the idea of adapting the linked data principles to the multimedia context. Burger and Hausenblas [18] enumerate the following principles to interlink multimedia data:

1. Follow the LOD principles:
– Use URIs as names for things.
– Use HTTP URIs so that people can look up those names.
– When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
– Include links to other URIs, so that they can discover more things.
2. Consider the contextual aspect to represent the semantics of multimedia content.
3. Deploy legacy multimedia metadata formats.
4. As the need to refer to fragments of multimedia based on spatial and temporal parameters is fundamental [105], a mechanism to specify URIs for these fragments is mandatory.
5. Interlinking methods are essential in order to manually or (semi-)automatically interlink multimedia resources.

Discussion

The majority of the domains modelled by multimedia ontologies are relevant for TV content, such as:

– News [78,51,100,55]
– Sports [113,30]
– Movies [29,90]
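As an aside on the interlinking principles of Section 3.1.2, principle 4 (URIs for spatial and temporal fragments) and the LOD principle of linking to other URIs can be sketched in a few lines of Python. This is a hedged illustration, not a method taken from the surveyed work: the fragment syntax assumed here (`#t=start,end` and `#xywh=x,y,w,h`) follows the W3C Media Fragments URI notation, which the text above does not prescribe, and the video URI, the `depicts` predicate and the helper names are hypothetical.

```python
def temporal_fragment(media_uri, start_s, end_s):
    """Return a URI addressing the interval from start_s to end_s (in seconds)."""
    if not 0 <= start_s < end_s:
        raise ValueError("expected 0 <= start_s < end_s")
    return f"{media_uri}#t={start_s:g},{end_s:g}"


def spatial_fragment(media_uri, x, y, width, height):
    """Return a URI addressing a rectangular pixel region of the media."""
    return f"{media_uri}#xywh={x},{y},{width},{height}"


def triple(subject, predicate, obj):
    """Serialise one statement between three URIs in N-Triples style."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Hypothetical identifiers, used only to show the shape of the output.
VIDEO = "http://example.org/video/evening-news.mp4"
DEPICTS = "http://example.org/vocab#depicts"
TOPIC = "http://dbpedia.org/resource/Football"

clip = temporal_fragment(VIDEO, 10, 20)
statement = triple(clip, DEPICTS, TOPIC)
```

Here `clip` names only the interval between seconds 10 and 20 of the video, and `statement` links that fragment, rather than the whole file, to an external concept URI, so that annotations can target the part of a programme they actually describe.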
However, the existing approaches for semantic annotation of multimedia are offline [7] due to the time consuming phases of feature extraction and classification. That hinders their adoption for live program broadcasting and reveals the need for real-time semantic annotation of multimedia content.

3.2. TV-related Metadata standards

We mentioned that digital video and TV share many concerns such as content annotation, and thus we cover the multimedia annotation standards, namely MPEG-7 and the Society of Motion Picture and Television Engineers (SMPTE) metadata dictionary, and the TV specific TV-Anytime and Public Broadcasting Metadata Dictionary (PBCore).

MPEG-7: A multimedia content description interface standardised by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). The standard defines the MPEG-7 scope by addressing applications that can be stored (on-line or off-line) or streamed (e.g. broadcast, push models on the Internet), and can operate in both real-time and non real-time environments.

SMPTE: The SMPTE Metadata Dictionary (http://www.smpte-ra.org/mdd/index.html) [95] is a large list of structured metadata elements grouped in the following classes: Identification, Administration, Interpretation, Parametric, Process, Relational, Spatio-temporal, Organisationally Registered Metadata, and Experimental Metadata. Although it was originally designed to be encoded in the Key-Length-Value (KLV) data encoding standard, an Extensible Markup Language (XML) serialisation is available.

Standard Media Exchange Framework (SMEF): The BBC has defined SMEF (www.bbc.co.uk/guidelines/smef) to support and enable media asset management as an end-to-end process from commissioning to delivery to the home. The SMEF Data Model (SMEF-DM) provides a set of definitions for the information required in production, distribution, and management of media assets, expressed as a data dictionary and a set of Entity Relationship Diagrams.

PBCore: PBCore was created by the Corporation for Public Broadcasting (CPB) in the United States to provide a simple structure that its member stations and related communities can share. PBCore extends Dublin Core by adding a number of elements specific to audiovisual assets that fall into three groups:

1. Content: provides descriptive metadata elements.
2. Intellectual property: provides rights management metadata.
3. Instantiation: contains all technical metadata about the physical or digital representation of the asset, such as format, media type, duration etc.

TV-Anytime: The TV-Anytime forum (http://www.tv-anytime.org/) is a worldwide project involving vendors, broadcasters, telecommunications companies, and the consumer electronics industry, which has defined an extensive bundle of specifications for the use of local storage at home in a specialized "set-top box" or in the TV set [27].

Discussion

The coexistence of many metadata standards for TV is practically equivalent to the lack of standards, as users will again fall back on non-interoperable metadata schemas, which is inevitable within heterogeneous communities [22]. Hence the need for core multimedia ontologies to enable interoperability of multimedia metadata schemas [53].

Moreover, the large number of elements in the discussed metadata schemas reveals their complexity, and the semantics of these elements remain implicit. For example, very different syntactic variations may be used in multimedia descriptions with the same intended semantics, while remaining valid MPEG-7 descriptions, which causes serious interoperability issues for multimedia processing and exchange [104]. Hence the need for multimedia ontologies that unfold metadata semantics and improve their interoperability at the record level.

Multimedia ontologies for records interoperability: Mai Chan and Lei Zeng [68] define record level interoperability as the "efforts intended to integrate the metadata records through the mapping of the elements according to the semantic meanings of these elements".

The multimedia ontologies used for record level interoperability include:
Core Ontology for Multimedia (COMM) [4] was designed by re-engineering the MPEG-7 standard in order to discover multimedia patterns. Pattern recognition was based on two of the main patterns of the Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE), which are Descriptions & Situations (D&S) and the Ontology of Information Objects (OIO). The typical scenario “the decomposition of a media asset and the (semantic) annotation of its parts” reveals the two main functionalities of MPEG-7: decomposition and annotation. The decomposition consists of segmenting the multimedia content based on temporal, spatial or spatio-temporal descriptors. These segments are then annotated with the MPEG-7 feature descriptors. Following the D&S pattern, decomposition is a Situation (SegmentDecomposition) that satisfies a Description (SegmentationAlgorithm).

MPEG-7 Ontology. Hunter [52] reverse-engineered a core subset of the MPEG-7 specification to generate an RDF Schema (RDFS) ontology describing the semantics of MPEG-7 elements, before generating a Web Ontology Language (OWL) version¹⁸.

Multimedia ontologies for schemas interoperability. The Multimedia Metadata Ontology (M3O) is a follow-up initiative to COMM, also based on DOLCE but not restricted to MPEG-7, and thus capable of expressing all structural information of many multimedia metadata formats while preserving the abilities of COMM.

¹⁸ http://metadata.net/mpeg7/

3.3. Reasoning about TV data

The ultimate goal behind lifting TV data to the semantic level is to allow reasoning about TV data in order to facilitate personalisation, recommendation and social TV features. Besides the semantic annotation of TV data, the viewers and their context should be represented at the semantic level to allow automatic matching.

3.3.1. Viewer profiling

Since user profiling and personalisation is a well-established solution to the information overload problem [42], it stimulates viewer modelling activity to solve the similar problem of TV content overload. Gauch et al. [42] define user profiling as gathering and exploiting some information about users in order to be more effective. In ontology-based user profiling [74], the user profile is represented in terms of interesting concepts [46].

3.3.2. Personalised TV

The most used TV personalisation techniques are

– Content based: uses a metric to quantify the similarity between viewers’ profiles and programs based on their content description. Similarity estimation is time consuming [37] due to the amount of TV content.
– Collaborative filtering: recommends programs based on estimated similar profiles. Despite its effectiveness in many domains, collaborative filtering for TV programs suffers from many issues [85], such as a) the first-rater problem, as new programs are not rated enough to be recommended, b) the cold-start problem, where new users have not rated programs yet and no valid recommendation can be suggested, c) the sparsity problem, which consists in a lack of overlap between two random viewers if they have not rated the same programs. O’Sullivan et al. [85] focus on c) as they consider it the most stringent problem of collaborative filtering for TV program recommendation.
– Social filtering: similarity between profiles is based on their friendship in social networks, unlike the estimation used in collaborative filtering.

To overcome the shortcomings of each technique, TV recommender systems tend to use hybrid approaches [85,37,66].

Semantic personalisation of TV. By semantic personalisation, we refer both to the enrichment of viewers’ profiles with semantics, i.e. ontology-based viewer profiling, and to the semantic representation of the TV content.

In the SenSee framework [5], viewer profile and context are extended via ontologies describing time, geographical location and TV domain knowledge. The authors, Aroyo et al. [5], prove the advantage of ontology-based TV recommendation by drawing a quantitative comparison with a free text approach.

Avatar [37] recommendation is based on a hybrid approach using collaborative filtering and semantic similarity between the content and the user profile. The semantic similarity is calculated according to a dedicated TV domain ontology¹⁹.

The NoTube BeanCounter [106] aggregates a user’s activity from different social networks and uses the activity stream in TV program recommendation. The aggregation from different Web sources illustrates one of the most important advantages of semantic technologies; for instance, the NoTube BeanCounter aligns the data gathered from Last.fm with the BBC programmes ontology²⁰ in SKOS (Simple Knowledge Organization System) [75].

¹⁹ http://avatar.det.uvigo.es/index-i.html
²⁰ http://purl.org/ontology/po/

3.4. Discussion

From the user’s perspective, the semantic integration of TV data brings freedom of choice, allowing cross-collection searching over many Internet video and TV repositories while preserving a centralised profile adapted to each collection. That is materialised through the semantic interoperability between the different multimedia metadata in use. However, the content description is still generated manually or via crowdsourcing, due to the remaining semantic gap between the computed multimedia features and the content concepts. Thus the semantic integration of TV data is dependent on the accuracy of the content description.

4. Semantic integration of TV services

Towards collaboration, the different next-generation parties adhere to the SOA to expose their services and consume others’. Specifically, the NoTube SOA is based on Web services to take advantage of the well-established Web architecture as a medium for services communication.

4.1. The Web of services

Mainly two approaches are used to adapt the Web to a Web of services. The first approach proposes a stack of specifications to support the services requirements such as communication, selection, security etc., giving this approach the Big Web Services name [88]. The second approach, RESTful Web Services, claims that the success of the Web is due to the maturity and ease of use of its Representational State Transfer (REST) architecture and that the same architecture could lead to the programmable Web. From the RESTful perspective, the WS-* specifications do not complement each other but usually overlap and compete, which confuses Web Services designers. Moreover, building Remote Procedure Call (RPC) upon the Web is counter-intuitive and does not take advantage of the Web’s REST architecture. More objectively, Pautasso et al. [88] made a quantitative technical comparison based on architectural principles and decisions and conclude that REST is well suited for basic, ad-hoc integration scenarios and that WS-* is more flexible and addresses advanced quality of service requirements.

4.2. TV services

It became common practice throughout the last decade to expose all sorts of multimedia content and metadata stored in one particular repository through a set of Web APIs. The motivation for this practice is to allow the aggregation of the multimedia content and its tailoring to users’ requirements. This delegates the adaptation of multimedia consumption to third-party developers and exempts the multimedia providers from maintaining end-points for different platforms such as Android²¹ and iOS²².

To name but a few, major TV broadcasters such as the BBC expose their data via Web endpoints²³ and provide APIs to process these data, and the YouTube²⁴ data API²⁵ allows a program to search for videos, retrieve standard feeds, and see related content.

²¹ http://www.android.com/
²² http://www.apple.com/iphone/ios4/
²³ http://backstage.bbc.co.uk/data/Data
²⁴ YouTube.com
²⁵ http://code.google.com/apis/youtube/getting_started.html#data_api

However, these APIs are not standardised in terms of inputs, outputs, or invocation methods, and a specific client has to be designed for each API. That raises the repository level interoperability issue. In the following section, we present the major Semantic Web services approaches that tackle this issue generally and in the TV context specifically.

4.3. Semantic Web Services

Whatever the approach used, RESTful or WS-* Web Services, performing complex operations such as services selection, composition, or mediation requires human intervention. However, these operations should be automated, as the functionalities required by users are seldom achieved via one Web Service; hence the need for automated services composition and mediation in a large scale context. As with Web 1.0, software agents cannot reason about Web Services, as they see them as inputs and outputs without any awareness of whether, for example, the received message contains a ranked list of programs from a recommendation service or a program description from an EPG service. To automate reasoning about services, the software agents should process the services’ messages and operations at the semantic level via shared ontologies. This is the research field of SWS [72], which aims to solve at least one of the following challenges.

4.3.1. Discovery

Semantic Web Services discovery consists in retrieving the relevant services that achieve the requested functionalities. Discovery is mainly based on advertising service capabilities in a centralised or distributed registry, and matchmaking these capabilities with the requests.

4.3.2. Composition

We noted that complex functionalities are rarely achieved via one Web Service; many functionalities from different Web Services have to be combined to achieve complex requests. Most of the proposed approaches for automatic Web Services composition are inspired by research in cross-enterprise workflow and AI planning [93].

4.3.3. Mediation

Composing Web Services from different providers, which were not initially designed to cooperate, raises another challenge, namely Web Services mediation. Mediation aims to adapt services’ outputs to be consumed by the following service(s) in the composition chain.

4.3.4. Choreography

Web Services choreography handles the interaction between the services invoked in the composition, from a global point of view.

4.4. Preliminaries

In this section we define the notions required to compare the different SWS approaches.

4.4.1. Capabilities representation

There are two approaches to represent the capabilities of a Web Service. The first one is based on an extensive ontology of functions, where capabilities are advertised by binding to a class of homogeneous functions within a taxonomy of services, such as flight booking or transportation service. The second approach models the flow of state transformations; for instance, the flight booking service requests departure and arrival cities, departure and arrival dates and a credit card, and changes the state by decreasing the number of available seats and withdrawing the due amount of money.

4.4.2. Artificial intelligence planning

Stuart et al. [101] define artificial intelligence planning (AI planning) as “a kind of problem solving, where an agent uses its beliefs about available actions and their consequences, in order to identify a solution over an abstract set of possible plans.” This agent, the planner, accepts three inputs

1. a formalised description of the initial state of the world.
2. a formalised description of the agent’s goal (the desired behaviour).
3. a formalised description of the possible actions that can be performed (the domain theory)

and outputs a sequence of actions that, when executed in any world satisfying the initial state description, will achieve the goal [110].

4.4.3. Cross-organisational workflow

A workflow is an abstraction of a business process. It comprises a number of logic steps (known as tasks or activities), dependencies among tasks, routing rules, and participants [20]. When many organisations intervene in the business process, its abstraction is a cross-organisational workflow that could be assimilated to Web Services composition.

4.4.4. Abstract State Machines

The Church-Turing thesis states that any real-world computation can be translated into an equivalent computation involving a Turing machine. However, the number of steps required by the machine is not bounded, and even simple operations could be simulated by a large number of steps. Hence the importance of the Abstract State Machine (ASM), introduced by Gurevich [48], as every sequential [49] or parallel [16] algorithm is behaviourally equivalent to a corresponding ASM. The behaviour emulation is a set of transitions between abstract states where
– A state is a dictionary of (Name, Value) pairs, called the state signature.
– Transition rules define the evolution of the states (i.e. the changes of values).

4.5. Semantic Web Services approaches

We distinguish two approaches that aim to solve one or more of the semantic Web Services challenges, which are the top down and bottom up approaches.

4.5.1. Top down approaches

Top down approaches provide conceptual frameworks and languages to describe the semantics of Web Services before grounding these descriptions to the services.

The OWL-S approach. OWL-S aims to provide building blocks for encoding rich semantic service descriptions that build naturally upon OWL [71]. The OWL-S approach consists of an Upper Ontology for Services with three interrelated sub-ontologies (Fig. 1):

Profile ontology for describing the service functionalities in order to advertise the service and match it with the requests.

Process model ontology for behavioural description with the intention of service invocation, enactment, composition, monitoring and recovery.

Grounding ontology bonds the process model with detailed specifications of the service from the Web Services Description Language (WSDL).

Fig. 1. Top level of the service ontology

Due to this enrichment of expressiveness, OWL-S extends the WSDL operations to a more abstract construct, the “atomic process”, that extends inputs and outputs and introduces preconditions and effects (IOPE). The inputs and outputs are typed following the OWL typing system, which allows binding to a class hierarchy. Grounding these types to WSDL and vice-versa could be defined via Extensible Stylesheet Language Transformations (XSLT). Preconditions and effects are defined via rule languages such as the Rule Markup Language (RuleML) or the OWL Rules Language (ORL) to define the state respectively before and after the invocation of the service. We summarise how the OWL-S approach tries to solve the semantic Web Services challenges:

Services discovery with OWL-S. For capabilities description, OWL-S supports both the services taxonomy and the state transformation approaches. The services taxonomy is formalised by sub-classing the Service Profile, which is also used to define the state transformations. The capabilities are then matched with the request either by searching for a subsumption relation between the requested service class and a class from the services taxonomy, or by matching both the inputs and outputs of the request and the advertised services, also via subsumption relations. Fig. 2 illustrates the OWL-S discovery approach.

Services composition with OWL-S. OWL-S based composition falls naturally within AI planning, as OWL-S provides

1. the formalised description of the initial state of the world, in terms of preconditions.
2. the formalised description of the desired goal, in terms of the outputs and effects required by users.
3. the domain theory, by modelling the OWL-S atomic process as an action that transforms the inputs into outputs.

The next step is the choice of a suitable AI planning algorithm for Web Services composition. Oh et al. [84] draw a decision tree (Fig. 3) that guides this choice according to the scale of the available services and the complexity of composition. A simple composition involves a sequential AND operator, while a complex composition is expressed by AND, OR, XOR, NOT operators and by constraints.

The WSMO approach. The WSMO [94] approach reuses the main concepts identified in the Web Service Modeling Framework (WSMF) [36] to define Semantic Web Services:

Ontologies provide the terminology used by other WSMO elements to describe the relevant aspects of the domains of discourse.

Web services represent computational entities able to provide access to services that provide some value in a domain. The terminology defined by
the ontologies is used to describe the Web services capabilities, interfaces, and internal working.

Goals model the user view in the Web service usage process in terms of requested functionalities.

Mediators handle interoperability problems between different WSMO elements, at the data level to resolve mismatches between different used terminologies, at the protocol level to ensure communication between Web services and on process level when combining Web Services.

Fig. 2. Services discovery with OWL-S

Fig. 3. A decision tree of AI solutions for the Web Services composition problem [84]
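To make the relationship between the four WSMO top-level elements (Ontologies, Web services, Goals and Mediators) more concrete, the following is a minimal sketch in Python. It is not WSMO syntax and not the API of any WSMO tool: the class names, the `discover` function and all terms such as `tv:Recommendation` are hypothetical and chosen for illustration only. It assumes that a capability is reduced to a single ontology term and that a data-level mediator is a plain term-to-term mapping.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ontology:
    """Shared terminology (assumption: reduced to a set of term names)."""
    name: str
    terms: frozenset


@dataclass(frozen=True)
class WebService:
    """A service advertised by the ontology term naming its capability."""
    name: str
    capability: str


@dataclass(frozen=True)
class Goal:
    """The user view: the functionality that is requested."""
    requested: str


@dataclass
class DataMediator:
    """Data-level mediator: maps terms of one vocabulary onto another."""
    mapping: dict = field(default_factory=dict)

    def translate(self, term: str) -> str:
        # Unknown terms are passed through unchanged.
        return self.mapping.get(term, term)


def discover(goal, services, mediator, ontology):
    """Return the services whose capability equals the mediated goal term."""
    wanted = mediator.translate(goal.requested)
    if wanted not in ontology.terms:
        return []  # the goal cannot be expressed in the shared terminology
    return [s for s in services if s.capability == wanted]


# Hypothetical TV-domain example.
tv = Ontology("tv", frozenset({"tv:Recommendation", "tv:ProgrammeGuide"}))
services = [
    WebService("recommender", "tv:Recommendation"),
    WebService("epg", "tv:ProgrammeGuide"),
]
mediator = DataMediator({"viewer:SuggestProgrammes": "tv:Recommendation"})
matches = discover(Goal("viewer:SuggestProgrammes"), services, mediator, tv)
print([s.name for s in matches])
```

In this sketch the goal is phrased in the viewer's own vocabulary and only matches a service after the mediator has translated it, which is the terminology mismatch that data-level mediation is meant to resolve. Real WSMO capabilities are logical descriptions rather than single terms, so exact string matching stands in here for a much richer matchmaking step.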
Moreover, all these concepts can have non-functional properties.

Web Services mediation with WSMO. WSMO supports SWS mediation naturally via the Mediators concept. Indeed, WSMO defines four types of mediators in two categories (Fig. 4):

Refiners express the refinement relation between elements.
    OO-Mediators import ontologies and resolve possible representation mismatches between them.
    GG-Mediators express goals refinement and equivalence.

Bridges enable heterogeneous elements interoperation.
    WG-Mediators express total or partial fulfilment of desired goals by exposed Web Services.
    WW-Mediators deal with heterogeneity problems between Web Services that could appear during composition and orchestration tasks.

Fig. 4. WSMO taxonomy of mediators [87]

Web Services choreography with WSMO. WSMO choreography inherits the core principles of ASM: it is state-based, represents a state by a signature, and models state changes by transition rules [97]. The state signature is expressed in terms of instances of concepts or relations from a state ontology. A state change consists of a new instance or a new value for a relation attribute, which leads to the notion of evolving ontologies, by analogy with evolving algebra, the first name of ASM.

4.5.2. Bottom-up approaches

The bottom up approach tends to provide a more developer-friendly way to semantically annotate Web Services. Arguing that the top-down assumption, namely that the service engineer describes the semantics of a service before grounding these descriptions to the service, is counter-intuitive, the bottom-up approach builds incrementally upon existing Web services standards [65].

The SAWSDL approach. The SAWSDL recommendation [65] forms the first brick upon WSDL that enables semantic annotations of services by providing Model Reference and Schemas Mapping extensions [60], where

Model Reference is an extension attribute, sawsdl:modelReference, applicable to any WSDL or XML Schema element to point to one or more semantic concepts, in order to describe the meaning of data or to specify the function of a Web service operation.

Schemas Mapping consists of transforming data from an XML message format to a semantic model and vice-versa. The former transformation is called lifting and is expressed by the sawsdl:liftingSchemaMapping attribute; the latter is called lowering and is expressed by the sawsdl:loweringSchemaMapping attribute.

The WSMO-Lite approach. As SAWSDL in itself does not define the semantics of Web services, but offers means to link the WSDL descriptions to ontologies, the Web Services Modeling Ontology Lite (WSMO-Lite) approach [107] defines a service ontology that bonds semantics to the service description via SAWSDL attributes. These semantics are expressed by the following WSMO-Lite ontology concepts:

Ontology: a subclass of owl:Ontology that defines a container for a collection of assertions about the information model of a service.

ClassificationRoot defines a root class for a taxonomy of services functionalities.

NonFunctionalParameter allows the description of domain specific nonfunctional properties.

Axiom is sub-classed into Condition and Effect to form a service capability.

4.5.3. Semantic annotation of RESTful services

Since there is no agreed machine-processable description language for RESTful Web services [69], the MicroWSMO [59] approach for semantic RESTful services builds on top of hREST [62], a microformat [57] that enables the creation of machine-processable descriptions on top of existing HTML descriptions. MicroWSMO aims to provide a SAWSDL-like annotation of RESTful services that can be bound to the WSMO-Lite ontology in order to provide interoperability between RESTful and WSDL based services. Figure 5 illustrates this relative positioning.
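The SAWSDL annotation mechanism described in Section 4.5.2 can be illustrated with a small sketch that reads the sawsdl:modelReference and sawsdl:liftingSchemaMapping attributes from an XML Schema fragment using the Python standard library. The schema fragment, the element name `ProgrammeRequest` and all `example.org` URIs are hypothetical; only the SAWSDL namespace and attribute names come from the W3C recommendation. The sketch only extracts the annotations; it does not execute the referenced lifting mapping.

```python
import xml.etree.ElementTree as ET

SAWSDL_NS = "http://www.w3.org/ns/sawsdl"
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment: the element and the ontology URIs are made up.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
  <xs:element name="ProgrammeRequest"
      sawsdl:modelReference="http://example.org/tv#Programme http://example.org/tv#Request"
      sawsdl:liftingSchemaMapping="http://example.org/mappings/request2rdf.xslt"/>
</xs:schema>
"""


def annotations(schema_xml):
    """Collect the SAWSDL annotations of every xs:element in a schema."""
    root = ET.fromstring(schema_xml.strip())
    result = {}
    for el in root.iter(f"{{{XSD_NS}}}element"):
        # Both attributes hold whitespace-separated lists of URIs.
        refs = el.get(f"{{{SAWSDL_NS}}}modelReference", "").split()
        lifting = el.get(f"{{{SAWSDL_NS}}}liftingSchemaMapping", "").split()
        result[el.get("name")] = {"modelReference": refs, "lifting": lifting}
    return result


found = annotations(SCHEMA)
for name, ann in found.items():
    print(name, ann["modelReference"], ann["lifting"])
```

Because the annotations are ordinary XML attributes in a separate namespace, a tool that does not understand SAWSDL can still process the schema unchanged, which is what lets the bottom-up approach build incrementally on existing descriptions.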
Fig. 5. Relative positioning of WSMO-Lite and MicroWSMO [61]

4.6. Semantic TV services

Our experience of brokering TV-related services within the NoTube project offered us a fertile ground to apply, compare, and adapt different semantic Web services approaches [32]. We introduce the specific NoTube challenges, the different semantic TV services approaches, and their shortcomings.

4.6.1. NoTube use-case

In order to illustrate the challenges with respect to service-related tasks, we describe one of the main use cases driven by the TV broadcast industry partners within the NoTube project - namely, the requirement to provide personalized content and metadata delivery to users. Here, the basic feature is the matching of heterogeneous users’ profiles (e.g. including interests, preferences, and activity data) and user contexts (e.g. current location and viewing device) to filter and deliver TV content from a variety of sources. Addressing this particular use case in a service-oriented manner involves selecting and orchestrating numerous services that provide various functionality, for instance, to aggregate users’ topic interests based on their social networking activities, retrieve EPG data from various sources, and provide recommendations based on a dedicated algorithm. To support the highly service-oriented nature of the project, two major goals need to be supported: a) support of distributed developers with lightweight service annotations, and b) support of application automation with Semantic Web Service brokerage. In the early stage we focused on b) via a top down approach, namely, IRS-III [34]. However, the later need to distribute the annotation of services revealed the need for a lightweight approach via iServe [89].

4.6.2. The IRS-III framework

IRS-III is a semantic execution environment that adopts the WSMO approach; that is, a service description is expressed in terms of Goals, Mediators, Web Services and Ontologies. These are described in a formal representation language, for instance, OCML [77]. IRS-III supports capability-based invocation: the request is a goal to be achieved via the following intermediated operations [19]

1. discover potentially relevant Web services;
2. select the set of Web services which best fit the incoming request;
3. mediate any mismatches at the conceptual level;
4. invoke the selected Web services whilst adhering to any data, control flow and Web service invocation constraints.

Given that IRS-III directly aims at automating service execution related aspects, the interface covers choreography and orchestration descriptions. Choreography addresses the communication between the IRS-III broker and a Web service, and is described as so-called grounding. The IRS-III grounding mechanism supports REST-based, SOAP-based, and XML-RPC based services [64]. Grounding involves two processes referred to as lifting and lowering. Lowering involves transforming input parameters at the semantic level to data input to the service at the syntactic level. Lifting involves the opposite transformation, i.e. transforming the data output from the service at the syntactic level into an ontological object at the semantic level.

At the semantic level, the orchestration is represented by a workflow model expressed in OCML that describes the flow of control between Web services. The IRS-III orchestration model supports the main control-flow primitives of sequence, selection, and repetition.

4.6.3. The iServe Linked Services approach

iServe supports publishing service annotations as linked data - Linked Services - expressed in terms of a simple conceptual model that is suitable for both human and machine consumption and abstracts from existing heterogeneity around service kinds and annotation formalisms. In particular, iServe provides:

– Import of service annotations in a range of formalisms (e.g., SAWSDL, WSMO-Lite, MicroWSMO, OWL-S) covering both WSDL services and Web APIs;
– Means for publishing semantic annotations of services which are automatically assigned a resolvable HTTP URI;