TRENDS AND TECHNOLOGIES IN DIGITAL MEDIA - Fraunhofer Allianz Digital Media
CONTENT

PREFACE

VIRTUAL REALITY – TECHNOLOGIES THAT TAKE YOU THERE!
Enjoy VR
Real people in virtual 3D worlds
Television with an all-round view
Standardization on light field modalities

TECHNOLOGIES AND TOOLS FOR PRODUCTION AND IMMERSIVE MEDIA
OmniCam-360: first live transmission with the Berlin Philharmonics
MPEG-H: Delivering immersive audio to the world’s first terrestrial UHD-TV system
xHE-AAC: The codec of choice for streaming and Digital Radio
Post-production tools
MICO – multimodal media annotation and search

FRAUNHOFER TECHNOLOGY INSIDE – SOLUTIONS AND PRODUCTS

ABOUT US

PREFACE

Virtual Reality is one of the most appealing and promising new experiences when it comes to enjoying media content. The interactive selection of the view provides more than audio and picture – especially for all users who have grown up with Star Trek and computer games. Wearing VR glasses is no longer a barrier – what counts is the real-time view selection of video and sound content in the best quality, very close to reality. This immersive experience of real-life content makes VR a door-opener for new media business, in a way that is different from CGI-created content.

Advances in image sensor technology provide higher resolution, a wider color gamut, or higher sensitivity, which allows for better capture of real scenes. New display technologies enable better reproduction and smaller dimensions. 360 degree capturing combined with 3D audio takes the user into the scene. Computation of high data volumes is possible even in smartphones. By combining VR content with binaural rendering from object-based audio, the realism will be intensified. What is more, the viewer can navigate through the scene and interact with objects or content in the VR world.

With 360 degree video, the viewer can turn his or her head to the left and to the right, up and down. New production technologies, e.g. light field, even extend these three degrees of freedom to six and allow the viewer to move forward and backwards, or step to one side or the other, too. But for VR, there is still some way to go. Bottlenecks such as parallax-compensated image stitching are demanding challenges to solve. And the questions of efficient business models for VR, which include the foundation of platforms and video channels, have still not been answered satisfactorily. We will give you an overview of new technologies that take you to your VR experience.

Dr. Siegfried Fößel
VIRTUAL REALITY – TECHNOLOGIES THAT TAKE YOU THERE!

ENJOY VR

New technologies from Fraunhofer laboratories ensure that VR worlds can be enjoyed without limitation.

We need eyes and ears to perceive the world around us optimally. This also applies to virtual realities. We are only convinced by them when both the picture quality and sound are right.

Convincing all-round panorama

Researchers at Fraunhofer HHI have developed a technology to give us a realistic picture impression. At its core is the OmniCam-360: it records 360 degree panorama images, or “all-round” images, without parallax. This means that there are no kinks in the overall visuals that could make a violin bow or a hand “disappear”. With other recordings, this happens frequently wherever the images from two cameras overlap. “Instead of placing the various cameras back to back in a star-shaped arrangement, the OmniCam-360 makes use of ten cameras that are arranged vertically. With the aid of mirrors they seem as if they are all looking at the scene from exactly the same point,” says Christian Weissig, Project Manager at Fraunhofer HHI. “For the first time, automatic post-production is possible without manual correction.”

Thanks to their OmniCam-360, the researchers were now able to support live-streaming applications for the first time – to transmit a concert by the Berlin Philharmonic. The users could enjoy the concert either on a smart TV or a computer, or even put on VR glasses to be in the “live” audience. Either way, the viewers had a much better view than the actual concert attendees, as they were “standing” on the stage right behind the conductor.
The recordings made by the OmniCam-360 have a resolution that is ten times that of HD or, to be more precise, 10,000 x 2,000 pixels. As televisions and VR glasses are currently unable to display this resolution, the researchers are working on adapting the transmitted data to match the resolution of the end device in question. Furthermore, in the future they would like to transmit only the data that lies within the viewer’s point of view – the “region of interest”, as it is known.

Enveloping audio impressions

A convincing VR experience is not only based on what the eyes can see, as 50 percent of the experience depends on authentic audio reproduction. Flaws in the audio instantly ruin the illusion of an artificial world. However, if done properly, the 3D sound is authentic and allows users to enter the virtual world. The audio impressions lead users through the virtual environment, where they become a part of what is happening around them. For example, a loud noise can simply change the direction of a user’s gaze, allowing for entirely new types of storytelling.

Creating convincing audio content for virtual worlds can be challenging; however, the developers at Fraunhofer IIS are working on innovative solutions to deliver this to consumers. One answer is Fraunhofer Cingo, today’s leading solution for playback of enveloping audio for VR applications via headphones, which supports every audio format for surround and 3D sound. For a highly realistic listening experience, Cingo dynamically adapts the sound to the user’s head movements in real time. Fraunhofer Cingo is already being used in Samsung and LG’s VR headsets.

In addition to playback, proper recording, production, and transport of audio signals are critical to a successful VR experience. For this reason, Fraunhofer IIS developed technologies for the complete VR audio production chain. A specially optimized algorithm that can be applied to different types of cameras with either built-in or external microphones enables convincing 3D sound recordings. Additionally, post-production tools allow straightforward mixing and mastering of 3D sound, while the audio codecs HE-AAC and MPEG-H, co-developed at Fraunhofer IIS, efficiently transmit surround and 3D sound, e.g. in a live-streaming app.

Thanks to Fraunhofer’s audio and video technologies for VR, the next virtual concert will be a feast for the eyes and ears, where the users’ experience will be as if they were attending a live event in a concert hall.
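To make the “region of interest” idea concrete, here is a minimal sketch in Python. It is not the Fraunhofer HHI implementation: the function name, the field-of-view values, and the assumption that the panorama covers 180 degrees vertically are made up for illustration, and the simple crop ignores projection distortion. It only shows why transmitting the visible part of a 10,000 x 2,000 pixel panorama saves most of the bandwidth.

```python
import numpy as np

def region_of_interest(pano: np.ndarray, yaw_deg: float, pitch_deg: float,
                       fov_h_deg: float = 90.0, fov_v_deg: float = 60.0,
                       pano_v_coverage_deg: float = 180.0) -> np.ndarray:
    """Cut the part of a panorama that lies in the viewer's field of view.
    yaw_deg = 0 is the panorama centre; positive pitch looks up."""
    h, w = pano.shape[:2]
    px_per_deg_x = w / 360.0
    px_per_deg_y = h / pano_v_coverage_deg

    crop_w = int(fov_h_deg * px_per_deg_x)
    crop_h = int(fov_v_deg * px_per_deg_y)

    cx = int((yaw_deg % 360.0) * px_per_deg_x)
    cy = int(h / 2 - pitch_deg * px_per_deg_y)
    cy = max(crop_h // 2, min(h - crop_h // 2, cy))        # clamp vertically

    xs = np.arange(cx - crop_w // 2, cx + crop_w // 2) % w  # wrap around 360 degrees
    ys = np.arange(cy - crop_h // 2, cy + crop_h // 2)
    return pano[np.ix_(ys, xs)]

# A 90 x 60 degree view of a 10,000 x 2,000 panorama is roughly 2,500 x 666
# pixels -- less than a tenth of the full frame.
pano = np.zeros((2000, 10000, 3), dtype=np.uint8)
print(region_of_interest(pano, yaw_deg=45.0, pitch_deg=5.0).shape)
```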
The immersive media experience with Fraunhofer VR technologies: as if being part of the live audience
REAL PEOPLE IN VIRTUAL 3D WORLDS

A 3D recording system replaces the synthetic avatar: at the moment, avatars – virtual people – stumble quite unnaturally through virtual worlds. A new technology will make the movements of virtual humans appear smoother and more natural.

The background is deceptively realistic: trees and houses look like real life, and even the cars are not noticeably different from real vehicles. If you weren’t wearing VR glasses, you might imagine that you were in real life instead of virtual reality. Only the “humans” are still moving somewhat unnaturally, and the textures – the weave in the characters’ sweater material, their five o’clock shadow, or moles on their skin – also still have quite a way to go to appear natural. To copy how humans move, the creators of the avatars usually use motion capture: this tracking process takes a real person and places markers on their body; the movements of these markers are then recorded by a camera. Alternatively, the person can wear a special suit containing a large number of sensors that capture the movements of the arms, legs, and other body parts and forward the data directly to the PC. This information is then used to animate the avatar in question.

Fluid, lifelike movements

Researchers at the Fraunhofer Heinrich-Hertz-Institute HHI have now developed a more expedient technology. “We record the real person from all spatial directions using 20 to 30 cameras and then calculate a robust three-dimensional model,” explains Dr. Oliver Schreer, Group Leader at the Fraunhofer HHI.
“We can then add this model directly to the virtual world. We plan on accelerating capture and 3D modeling to allow us to integrate the person into the virtual world in real time. It is no longer necessary to animate the person any further.” The result is that the person moves as fluidly and naturally in the VR world as in the real world; the person’s motions are carried over one-to-one instead of having to be animated as before. The texture also looks real to the viewer. The viewer can observe the avatar from any point of view of his or her choice – it is, for example, possible to move 360 degrees around the avatar. This makes one’s virtual companion appear to be extremely true to life.

A 3D model makes it possible

The real trick is in the software that creates the 3D model. But let’s start at the beginning: the cameras – all of which are in pairs – are distributed at optimum positions around the room. Each pair of cameras captures three-dimensional information about the real person. Similar to our two eyes, these cameras receive information about how far each body part is from the camera in each recording. The software calculates this depth information for each camera pair up to 50 times a second, which is also the rate at which each camera pair takes pictures. The software then fuses the data from the individual camera pairs together. The result is a lifelike three-dimensional copy of the person, together with their movements, which can be rendered for any viewing angle. Mapping faults and undercuts are now a thing of the past. For users, this means that they can see areas that were previously hidden when only a single camera pair was used – hidden by the hand that the person is holding in front of their torso, for example. This three-dimensional model can be integrated directly into the VR world – the real person “hatches” from the camera set and into the artificial world with all their natural movements.
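As an illustration of the per-camera-pair depth step described above, the sketch below uses off-the-shelf OpenCV stereo matching rather than the Fraunhofer HHI algorithm, which is not described in detail here. The focal length and baseline values are invented; the only point is how a rectified camera pair yields a depth map that can later be fused with the other pairs.

```python
import cv2
import numpy as np

def depth_from_pair(left_gray: np.ndarray, right_gray: np.ndarray,
                    focal_px: float = 1100.0, baseline_m: float = 0.12) -> np.ndarray:
    """Return a per-pixel depth map (in metres) from one rectified stereo pair."""
    matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
    # OpenCV returns fixed-point disparities scaled by 16.
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
    depth = np.full_like(disparity, np.inf)
    valid = disparity > 0
    # Standard pinhole relation: depth = focal_length * baseline / disparity
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# In a full system this would run for every camera pair up to 50 times per
# second, and the resulting depth maps would be fused into one 3D model.
```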
TELEVISION WITH AN ALL-ROUND VIEW

In the future, TV viewers will be their own “cameramen”. The established standard HbbTV can be used for 360 degree TV transmission.

The helicopter’s blades whir as it soars over the mountain peaks; an eagle glides past the side window. How nice it would be to be able to turn your head and watch where the eagle goes. But, alas, you’re watching the scene as part of a TV documentary from the comfort of your own couch and you have to settle for the perspective that the cameraman chose. With 360 degree TV, it’s different: here, the end user can play director and decide which direction he or she would like to look. Dr. Stephan Steglich, Division Director at the Fraunhofer Institute for Open Communication Systems FOKUS, explains how it works.

Dr. Steglich, how should we imagine 360 degree television?
A special camera setup records the film in all spatial directions; the sound is also recorded spherically. For playback, we start by deciding on a specific viewing direction – the viewer’s attention should be principally focused on the direction in which the action is happening. From here, the viewer can use his or her remote control to move their point of view from side to side, up, back, or down. It’s no longer just one film; it’s several films in one.

This requires large quantities of data to be transmitted. What technological requirements need to be fulfilled to bring 360 degree TV to viewers?
One possible solution we have identified is the recognized standard HbbTV, which is short for hybrid broadcast broadband TV. TV stations usually use this standard in conjunction with an Internet connection in order to show their viewers additional content. We use HbbTV in order to move functions that television sets previously couldn’t perform onto the Internet – into a cloud, to be precise. So we make sure that the TV set only does what it’s able to. This means that if the viewer chooses a specific viewing angle, the calculation is performed in real time in the cloud. The television doesn’t even “notice” what it’s doing.

What does the user need to have in order to be able to benefit from this technology in the future?
Basically, all he or she needs is a modern television that is HbbTV-ready. We deliberately left out additional devices such as VR glasses in order to be able to reach as wide a public as possible. A test version of our 360 degree television is already being broadcast. Our technology is even of interest for mobile end devices such as smartphones: the more complexity I can move from the device into the cloud, the longer the device’s battery will last. The data volume is also reduced.
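The following hypothetical sketch shows the “thin TV, fat cloud” split Dr. Steglich describes – it is not the Fraunhofer FOKUS service, and the /viewport endpoint, the in-memory panorama, and the toy renderer are all invented. The HbbTV app on the TV would only send the chosen viewing direction; the cloud cuts that view out of the 360 degree material and returns an ordinary image or video the set can decode.

```python
from io import BytesIO
from flask import Flask, request, send_file
import numpy as np
from PIL import Image

app = Flask(__name__)
PANORAMA = np.zeros((2000, 10000, 3), dtype=np.uint8)  # stand-in for the live 360 degree frame

def render_view(yaw_deg: float) -> bytes:
    """Toy cloud renderer: cut a 90-degree-wide slice out of the panorama and
    encode it as JPEG, so the TV only has to show a plain image."""
    h, w = PANORAMA.shape[:2]
    cx = int((yaw_deg % 360) / 360 * w)
    xs = np.arange(cx - w // 8, cx + w // 8) % w   # 90 of the 360 degrees
    buf = BytesIO()
    Image.fromarray(PANORAMA[:, xs]).save(buf, format="JPEG")
    return buf.getvalue()

@app.route("/viewport")
def viewport():
    yaw = float(request.args.get("yaw", 0.0))      # chosen with the remote control
    return send_file(BytesIO(render_view(yaw)), mimetype="image/jpeg")
```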
STANDARDIZATION ON LIGHT FIELD MODALITIES

One of the most important achievements of mankind is the transfer of knowledge to future generations. This is done by several methods: verbally, by drawings, in writing and printing, or lately by photographs and video. All these methods are exchangeable in some way; however, it is a great deal more effort to describe a situation verbally than by a photograph. The phrase “A picture is worth a thousand words” became popular in the early 20th century and well describes the phenomenon that people can absorb knowledge through their visual senses much faster than by hearing and interpreting words. In the last hundred years, this fact was exploited by capturing images with cameras, in the form of a still image as a photograph, or in combination with sound as a motion picture or video. Technology in the 20th century allowed fast imaging of the scene on a flat planar sensor, either on a photosensitive emulsion or on an electronic sensor. Several attempts were made to increase the dimensions of the reproduction of the scene, e.g. for stereoscopic 3D, and also to obtain some depth information. But it is difficult to capture and reproduce the scene exactly as an individual perceives it with only two images.

A revolutionary approach to image production

With the advent of low-cost imagers, additional sensors for localization or depth acquisition, and new electronic display devices, true three-dimensional acquisition and reproduction of a scene are in sight. The approach is not only to capture and reproduce the two images seen by human eyes, but to capture the complete scene in multiple dimensions, to model the scene, and to render the images on demand, as in computer-generated imagery (CGI). This can concern the viewpoint, the viewing direction, or the focal point.

The theory behind this has been known since the end of the 19th or the beginning of the 20th century. It was developed by Faraday, Lippmann, and Gershun and is called light field or plenoptic theory. It states that, at every point in space, the light rays form a function based on position, direction, and intensity. Traditional cameras capture the light rays at a specific position, but bundle light rays from one or more directions through the lens and record them on a photosensitive sensor. The aperture of the lens defines the number of bundled rays.
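For reference, the “function based on position, direction, and intensity” described above is what the light field literature calls the plenoptic function. The form given below is the common five-dimensional parameterization and is standard notation rather than anything specific to the working groups listed later; adding wavelength and time gives the full seven-dimensional version.

```latex
% Plenoptic function: radiance L of the ray through the point (x, y, z)
% travelling in the direction given by the angles (\theta, \phi).
L = P(x,\, y,\, z,\, \theta,\, \phi)
% Full form with wavelength and time: P(x, y, z, \theta, \phi, \lambda, t)
```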
It is clear that we cannot capture the complete light field, as we cannot position a camera at every point in space without the cameras influencing each other, but it is possible to sample the light field in different ways. The simplest way is to use a circular camera array capturing 360 degrees, allowing Virtual Reality (VR) reproduction in VR glasses. By using different arrangements of cameras and special sensors and optics, other applications are possible, such as VR with translational movement, 3D modeling, or the generation of point clouds with the option of merging real scenes with CGI scenes. Today, the industry is investigating which technologies allow which applications. This relates to Virtual Reality, Augmented Reality (AR), point clouds, light field cameras, visual effects (VFX), depth-enriched images, and 3D models – in general, every process of mixing real scenes with computer-generated scenes.

Standardization of light/sound field technology

Within the standardization committees ISO/IEC JTC1 SC29 WG1 (called JPEG) and WG11 (called MPEG), several groups are working on these topics, discussing the possible modalities and correct representation forms. Below is a list of the current working groups.
– Joint Ad Hoc Group for Digital representations of Light/Sound Field for immersive media applications
This group was a joint ad hoc group between JPEG and MPEG to determine the state of the art, potential modalities, and applications. The summary technical report can be found as the JPEG Pleno Joint AhG Report on the JPEG website.

– JPEG Pleno
The Ad hoc Group JPEG Pleno targets a standard framework for the representation and exchange of new imaging modalities such as light field, point cloud, and holographic imaging. Several workshops on this topic have been held. At the ICME 2016 conference, a grand challenge on light field image compression was offered.

– JPEG Systems
The subgroup JPEG Systems investigates the backward-compatible extension of JPEG for integrating multiple images and depth maps, allowing depth-enriched images.

– MPEG Wave Field Audio
As well as images, audio experiences are also position dependent and are used for object localization by viewers. A group in MPEG investigates the use of wave field/sound field technology to support less restrictive viewing conditions, such as a flexible point of view and object-based sound representations. As in the image field, microphone arrays or microphones for individual sound objects are used to capture the sound field.

– MPEG Tools for VR
VR is a fast-progressing technology with strong market momentum. Several technologies are necessary to realize a VR workflow, such as image stitching, metadata enrichment, and the development of optimized video coding technologies. To avoid fragmentation in this market, the standardization of formats, 3D projection methods, metadata, and signaling is necessary.

– MPEG Point Cloud Compression
Images from cameras used in light field capturing are projections of light rays onto the sensor. A more universal representation is a point cloud. In this cloud, the position, light rays, and the Bidirectional Reflectance Distribution Function (BRDF) are defined for specific objects. Because of the limited number of data points during capturing, point clouds are sparsely distributed in most cases. The benefit of point clouds is the 3D representation, which allows new renderings from different viewpoints.

– MPEG Light-Field Coding
Light field enables many different special effects, like the “Matrix bullet effect”, refocusing, or depth-based editing. For next-generation cinematic movies in particular, 3D processing allows a much more immersive viewing experience. In combination with audio wave field processing, new workflows are possible. The focus in light field coding is the representation and coding of light fields captured by lenslet light-field cameras or camera arrays.

In the next few years, light field technology will become a highly interesting technology for improving the viewing experience. Today, this field is being developed very intensively. It is still unclear which representation formats are necessary and useful to allow optimized workflows.

The JPEG and MPEG committees will investigate and standardize new formats and methods for this new technology.

Overview by Dr. Siegfried Foessel, Fraunhofer Digital Media Alliance
TECHNOLOGIES AND TOOLS FOR PRODUCTION AND IMMERSIVE MEDIA

OMNICAM-360: FIRST LIVE TRANSMISSION WITH THE BERLIN PHILHARMONICS

Since the OmniCam system was first developed at the Fraunhofer Heinrich Hertz Institute in 2009, numerous video productions have been realized in cooperation with the Berlin Philharmonics – a reliable top-class partner. After various productions, the OmniCam-360 was installed to record the concert given on the occasion of the 25th anniversary of the fall of the Berlin Wall in 2014. In 2015, the documentary “Playing the Space” was released, in which the Philharmonics plays a leading role and which was also filmed with the panoramic camera system.

The compact OmniCam-360 from Fraunhofer HHI is equipped with ten HD cameras that are attached to a mirror system. The images from the individual cameras are corrected in real time and put together into a parallax-free UHD video panorama with a resolution of approx. 10,000 x 2,000 pixels. The OmniCam-360 measures about 50 x 50 centimeters and weighs around 15 kilos, so it can be used in many different settings.

The collaboration with the Berlin Philharmonics was taken to a new level when the OmniCam-360 was used for the first live transmission: the panoramic images were successfully streamed in UHD quality through the Philharmonic’s Digital Concert Hall platform and shown online. By wiping and/or zooming, viewers can focus on the part of the picture that interests them the most. In broadcasts of concerts this could be the section of the orchestra with their favorite instruments, or the conductor, or they could just let their gaze wander.
MPEG-H: DELIVERING IMMERSIVE AUDIO TO THE WORLD’S FIRST TERRESTRIAL UHD-TV SYSTEM

Ultra High Definition Television (UHDTV) is the next best thing to reality. UHDTV transports crystal-clear, high-resolution pictures that make you want to touch the screen to check that the race car will not drive through your living room. The 4K and 8K image resolution allows UHDTV to create the illusion of reality. In addition, it offers the extension to 3D video reproduction, High Dynamic Range Imaging (HDRI) to display image luminance in more detail, and High Frame Rate (HFR), which provides more fluent motion and better image quality by exceeding the frame rates typically used today.

But video reproduction is only one part of the equation for a convincing TV experience. Sound is the other central ingredient for authenticity. Fortunately, technological developments on the audio side are equally innovative: the next-generation audio standard MPEG-H 3D Audio takes three-dimensional sound from the real world into the living room. MPEG-H provides a full audio immersion experience for TV viewers, which creates the feeling of “being there” rather than just watching a TV program.

The MPEG-H Audio standard, mainly developed by Fraunhofer IIS, was selected as the A/342 Candidate Standard for the new television standard ATSC 3.0 in May 2016. Shortly after, the South Korean standards organization Telecommunications Technology Association (TTA) announced that the audio technology would be part of the Korean standard for terrestrial UHD TV. If governmental officials approve the standard in September, MPEG-H 3D Audio will provide an entirely new sound experience in the world’s first 4K terrestrial TV system.

If approved, South Korean consumers would be the first to experience interactive and immersive sound, the two main features of MPEG-H 3D Audio. The technology enables users to adjust the sound mix to their preferences. For example, consumers would have the ability to choose between different commentators in a sporting event or improve the intelligibility of dialogue by decreasing the ambience volume. Furthermore, by adding 3D audio components for additional height information, the codec delivers true audio immersion.

With the 2018 Winter Olympic Games in Pyeongchang, South Korea on the horizon, the new terrestrial TV system is expected to launch in February 2017, starting with the Seoul metropolitan area. According to Korean government representatives, it will then be expanded to cities near the Olympic Game venues. By 2021, the service will be available nationwide.

The MPEG-H 3D Audio codec is ready for use in broadcast systems, as demonstrated by Fraunhofer IIS in a series of events. Fraunhofer’s latest presentation of a complete ATSC 3.0 broadcast chain with MPEG-H Audio was at the KOBA Show in Seoul, South Korea in May 2016.

MPEG-H 3D Audio is already being integrated into professional broadcasting equipment. Most recently, the broadcast equipment manufacturers DS Broadcast and Kai Media announced the availability of MPEG-H 3D Audio in their latest 4K encoder products. More product announcements, including TV sets from major manufacturers for the Korean market, are expected this fall.
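The interactivity described above rests on object-based audio: each sound “object” is delivered separately with metadata, and the listener’s preferences are applied at playback. The toy sketch below illustrates only that idea – it is not the MPEG-H decoder API, and the object names, gains, and placeholder audio are invented.

```python
import numpy as np

def render_mix(objects: dict[str, np.ndarray], gains_db: dict[str, float]) -> np.ndarray:
    """Sum mono audio objects (same length, float samples) with per-object gains."""
    length = len(next(iter(objects.values())))
    mix = np.zeros(length, dtype=np.float32)
    for name, samples in objects.items():
        gain = 10.0 ** (gains_db.get(name, 0.0) / 20.0)   # dB -> linear
        mix += gain * samples
    return np.clip(mix, -1.0, 1.0)

sr = 48000
objects = {
    "commentator_en":   0.2 * np.random.randn(sr).astype(np.float32),  # placeholder audio
    "commentator_de":   0.2 * np.random.randn(sr).astype(np.float32),
    "stadium_ambience": 0.2 * np.random.randn(sr).astype(np.float32),
}
# Listener's choice: German commentary, ambience 9 dB quieter, English muted.
mix = render_mix(objects, {"commentator_de": 0.0,
                           "stadium_ambience": -9.0,
                           "commentator_en": -120.0})
```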
XHE-AAC: THE CODEC OF CHOICE FOR STREAMING AND DIGITAL RADIO

As featured by Radio World International on 13 June 2016

Today, mobile music streaming services are facing a series of challenges. Service providers are suffering due to the inability to serve the full range of potential customers because of real-world bandwidth limitations combined with high playout/CDN costs. In addition, today’s mobile user experience is far from perfect. Streaming services are eating into consumers’ monthly data allowances, and service reliability is far from what we have grown to appreciate from classic broadcast coverage.

However, there is an answer. Recently, the MPEG working group in ISO standardized and published the latest member of the highly successful AAC family of audio codecs, “xHE-AAC” (Extended High-Efficiency Advanced Audio Coding).

xHE-AAC combines and significantly improves the functionality of two formerly separate worlds: on the one hand, general-purpose audio codecs, in particular today’s de-facto standard MPEG HE-AAC; on the other hand, codecs for speech-only content at very low bit rates, represented by AMR-WB+.

xHE-AAC has the ability to provide good quality audio down to 8 kbps for mono and 16 kbps for stereo services, regardless of the type of content (e.g. talk radio, a music program, or a jingle).
Even in the mid bit rate range, xHE-AAC shows significant quality improvements over established codecs, while for higher bit rates it converges to the same transparent quality known from the other members of the AAC family.

Due to its universality, the global Digital Radio Mondiale (DRM) digital radio standard has adopted MPEG xHE-AAC as its primary audio codec. India is on the verge of becoming the world’s largest digital radio deployment. In February 2016, attendees at the Broadcast Engineering Society of India Convention in New Delhi witnessed the first DRM transmission based on xHE-AAC by India’s public broadcaster All India Radio. DRM-ready receiver chipsets and receivers support xHE-AAC out of the box, as do all providers of professional DRM encoders and multiplexers.

For audio streaming over IP to mobile devices, xHE-AAC promises a revolutionary impact on both user experience and business opportunities. For the first time, streaming service providers can now start targeting potential customers limited by affordable 2G contracts, who so far could not be served using existing audio codecs but who, in the case of India, represent roughly 90 percent of mobile users. Furthermore, service providers can benefit from a dramatic cut in the monthly costs of data distribution. Sample calculations for mid-size services show a saving potential of 75 percent of the CDN cost once mobile apps start requesting the low bit rate xHE-AAC streams instead of the mp3 and HE-AAC versions.

Consumers with 4G/LTE contracts around the world benefit from the increase in service reliability when leaving the well-covered city centers and falling back to 2G coverage, or when joining mass events with heavy mobile data usage.

In April, Via Licensing announced the start of the official xHE-AAC licensing program, which includes the functionality of the existing AAC program. This reflects the fact that xHE-AAC-capable audio decoders are downward compatible with all existing AAC-based content. The licensing program also makes it more convenient for manufacturers to mix xHE-AAC and AAC-only devices and apps.

Professional streaming encoders are already available from Telos Alliance and StreamS.
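A back-of-the-envelope check of the quoted 75 percent CDN saving: the article only states the 16 kbps xHE-AAC stereo operating point, so the 64 kbps legacy stream assumed below is an illustrative figure, not a number from the text.

```python
legacy_kbps = 64        # assumed HE-AAC/mp3 stream (illustrative)
xhe_aac_kbps = 16       # xHE-AAC stereo, as quoted in the text

def mb_per_hour(kbps: float) -> float:
    return kbps * 3600 / 8 / 1000   # kilobits per second -> megabytes per hour

saving = 1 - xhe_aac_kbps / legacy_kbps
print(f"{mb_per_hour(legacy_kbps):.0f} MB/h -> {mb_per_hour(xhe_aac_kbps):.0f} MB/h "
      f"({saving:.0%} less CDN traffic per listener-hour)")
# 29 MB/h -> 7 MB/h (75% less CDN traffic per listener-hour)
```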
POST-PRODUCTION TOOLS

IMF – a uniform exchange format for post-production

easyDCP, the post-production software from Fraunhofer IIS, has offered processing of IMPs (Interoperable Master Packages) in addition to DCP generation for quite a while now. The underlying Interoperable Master Format (IMF) has proven its interoperability successfully over and over at various Plugfest events and serves as a universal exchange format within movie production. Since its standardization by the SMPTE (Society of Motion Picture and Television Engineers), IMF has been establishing itself more and more in the media industry.

The advantages of the IMF are obvious: a uniform and internationally standardized post-production format guarantees seamless exchange of content in the highest picture quality; it is no longer necessary to support a lot of different exchange formats, and therefore the risk of quality losses due to unwanted decoding, image-conversion, and encoding steps is minimized. At the same time, the IMP serves as a master for the creation of distribution formats. In practice, it is not uncommon that up to 100 versions of different distribution formats need to be created from a single master and checked simultaneously. IMF allows for the automatic creation of videos with different technical parameters such as video compression format, spatial resolution of the target device, audio speaker set-up, or different national versions (subtitles, soundtrack in the local language, etc.). Using the IMF concept called Output Profile List (OPL), this work can be automated, since the OPL describes the conversion of the IMP into a specific distribution format (e.g. for iTunes).

Integrated quality check

The great number of largely automatically created videos means that a comprehensive and, indeed, automatic quality check of the transcoding process is desirable in order to save time and money. Usually the material is inspected for errors and artifacts that may arise during lossy transcoding or conversion into other color spaces or bit depths. The Fraunhofer researchers’ approach is a direct integration of various quality check modules into the transcoding processes. To this end, the scientists will present a version of the software that can be used both to create the IMP and, in particular, to check the generation of distribution formats from an IMP automatically.

At IBC 2016, the Fraunhofer Institutes IIS and IDMT will showcase software components that detect problematic sections of video and summarize them in a test report. In the first demonstration version, the test results will be shown as a simple “traffic light function” with red/amber/green marking. In future, the user will also receive a detailed quality report for those parts of the distribution formats that need further attention. The quality report is presented in a visual timeline directly integrated into the easyDCP software suite in order to establish an intuitive way to check any problematic scenes. The first fault modules for quality checking with predefined parameters have already been integrated into the overall easyDCP system.

For 2017, the IMF suite with transcoding and expanded checking functionality is planned to be rolled out to the first pilot users.
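Conceptually, an OPL-driven delivery run is a loop over target profiles with a quality check after each transcode. The sketch below illustrates only that idea: the profile entries, file paths, and the qc_check() placeholder are hypothetical, it uses plain ffmpeg rather than easyDCP (whose formats and APIs are not described here), and it is not a definitive implementation.

```python
import subprocess

IMP_MASTER = "master_imp/video.mxf"      # hypothetical path to the IMP essence

output_profiles = [   # simplified stand-in for an Output Profile List (OPL)
    {"name": "web_hd",    "codec": "libx264", "width": 1920, "audio": "aac"},
    {"name": "mobile_sd", "codec": "libx264", "width": 854,  "audio": "aac"},
]

def transcode(profile: dict) -> str:
    """Derive one distribution version from the master using ffmpeg."""
    out = f"{profile['name']}.mp4"
    subprocess.run([
        "ffmpeg", "-y", "-i", IMP_MASTER,
        "-c:v", profile["codec"], "-vf", f"scale={profile['width']}:-2",
        "-c:a", profile["audio"], out,
    ], check=True)
    return out

def qc_check(path: str) -> str:
    """Placeholder for the integrated quality check; a real module would look
    for transcoding artifacts or colour-space errors and return red/amber/green."""
    return "green"

for profile in output_profiles:
    rendition = transcode(profile)
    print(rendition, "->", qc_check(rendition))
```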
The Output Profile List (OPL) allows for the automatic creation of different distribution formats
MICO – MULTIMODAL MEDIA ANNOTATION AND SEARCH

In order to provide improved search and recommendation capabilities for ever-increasing amounts of content, cost-effective technologies for the automatic extraction of metadata from raw media objects are needed. One of the core challenges in this domain is the need to shift from using individual standalone extractors to multimodal and context-aware extractor workflows, which can provide substantially improved search functionalities and results.

One example: even to realize seemingly simple search functionalities to find video segments where a (specific) person talks about a specific topic, a news video provider needs these extractors for annotation:
– Speaker recognition
– Speech recognition
– Named entity recognition
– Temporal video segmentation
– Face detection and recognition

By multimodal fusion and reuse of results from extractors, annotation quality can be improved at the same time, e.g. by combining speaker and face recognition, or by combining speaker and speech recognition. In order to realize such potential, the following challenges need to be addressed:
– Cost-effective integration of heterogeneous extractors
– Flexible extractor workflow orchestration
– An extensible, common metadata model
– Availability of an expressive, media-aware query language
– Recommender systems using both annotations and collaborative filtering
Fraunhofer IDMT has been active in the domains of A/V extractor implementation, integration and orchestration, and recommendation for many years. We believe that multimodal, complex annotation and recommendation are key to future media systems, but this requires an integrated technology platform addressing all of the aforementioned aspects. Such a platform has now been developed by the EU R&D project MICO, with all core functionalities being provided as business-friendly OSS. Fraunhofer IDMT has provided significant contributions to the project and, together with other MICO partners, intends to further build on and extend the platform to realize the potential of multimodal annotation, search, and recommendation.
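As a toy illustration of the multimodal-fusion idea mentioned above (combining speaker and face recognition), the sketch below fuses the confidence scores of two extractors for one video segment. It is not the MICO platform API; the function, the weighting scheme, and the example scores are invented purely to show why two modalities together annotate more reliably than either alone.

```python
def fuse_identity(speaker_scores: dict[str, float],
                  face_scores: dict[str, float],
                  w_audio: float = 0.5) -> tuple[str, float]:
    """Late fusion: weighted average of two extractors' per-person confidences."""
    people = set(speaker_scores) | set(face_scores)
    fused = {p: w_audio * speaker_scores.get(p, 0.0)
                + (1 - w_audio) * face_scores.get(p, 0.0)
             for p in people}
    best = max(fused, key=fused.get)
    return best, fused[best]

# One video segment: the audio alone is ambiguous, the face tips the balance.
speaker = {"anchor_a": 0.55, "reporter_b": 0.45}
face = {"anchor_a": 0.90}
print(fuse_identity(speaker, face))   # ('anchor_a', 0.725)
```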
FRAUNHOFER TECHNOLOGY INSIDE – SOLUTIONS AND PRODUCTS

easyDCP – ONLY A FEW CLICKS TO YOUR DCP
To create a DCP (Digital Cinema Package) easily and in compliance with all requirements, or to test such packages before delivery, Fraunhofer IIS developed the easyDCP software suite. More information about easyDCP, which is available as a standalone tool-set as well as plug-ins for various post-production solutions, incl. BMD’s Resolve, can be found at www.iis.fraunhofer.de/easydcp

FRAUNHOFER CINGO®
Fraunhofer Cingo enables an immersive sound experience on mobile and virtual reality devices. For device manufacturers and service providers, Cingo is available as an optimized software implementation for all major PC and mobile platforms, including iOS and Android. Equipment manufacturers using Cingo are Samsung (Samsung Gear VR), LG (LG 360 VR) and Google (Nexus family).

LIGHTWEIGHT IMAGE CODING FOR VIDEO OVER IP AND MEDIA CONTRIBUTION
The Lici® codec ensures image-by-image, visually lossless transmission of high-resolution video with compression ratios of 1:2 to 1:6. It features extremely low latency and high throughput and requires little logic to implement. Lici® is used for Video over IP applications and for contribution in professional movie production. IP cores are available; licensing conditions can be sent on request. www.iis.fraunhofer.de/lici

MPEG-H
MPEG-H Audio provides interactive, immersive sound for TV and VR applications. It is available as a software implementation to chip manufacturers, broadcasters, and consumer electronics manufacturers. The first professional broadcast encoders that support MPEG-H Audio are DS Broadcast’s BGE9000 4K Ultra HD Encoder and the new 4K UHD Live Broadcast Encoder KME-U4K from Kai Media.
ABOUT US

FRAUNHOFER DIGITAL MEDIA ALLIANCE

As a one-stop competence center for digital media, we provide our customers with scientific know-how and the development of solutions that can be integrated into workflows and optimize process steps.

The members of the Digital Media Network are actively working in renowned organizations and bodies such as the International Standardization Organization (ISO), the ISDCF (Inter-Society Digital Cinema Forum), the SMPTE (Society of Motion Picture and Television Engineers), the FKTG (German Society for Broadcast and Motion Picture), and the EDCF (European Digital Cinema Forum).

The Fraunhofer Institutes in the Digital Media Alliance jointly offer innovative solutions and products for the transition to the digital movie and media world of tomorrow. The Institutes in the Alliance are available as renowned contacts and partners for all of the digital topics connected to digital media, digital movies, and standardization, as well as new cinematography, audio, and projection technologies, post-production, distribution, and archiving. The goal of the Fraunhofer Digital Media Alliance is to quickly and easily help find the right contacts, partners, and suitable technology.

The Fraunhofer Institute members are
– Digital Media Technology IDMT, Ilmenau
– Integrated Circuits IIS, Erlangen
– Telecommunications, Heinrich-Hertz-Institut HHI, Berlin
– Open Communication Systems FOKUS, Berlin

Publication Information

Fraunhofer Digital Media Alliance
c/o Fraunhofer Institute for Integrated Circuits IIS
Am Wolfsmantel 33
91058 Erlangen, Germany

Concept and Editor
Angela Raguse, Fraunhofer-Allianz Digital Media

Layout and production
Kathrin Brohasga, Kerstin Krinke, Paul Pulkert

Contact
Angela Raguse M.A.
Phone +49 9131 776-5105
alliance-dc@iis.fraunhofer.de
www.digitalmedia.fraunhofer.de

© Fraunhofer-Gesellschaft

Photo acknowledgements
Cover picture: Fraunhofer IIS/fotolia.com
Page 3: Fraunhofer IIS/Karoline Glasow
Page 4: Fraunhofer IIS/David Hartfiel
Page 8/9: Fraunhofer HHI
Page 11: Fraunhofer HHI
Page 15: Fraunhofer FOKUS
Page 20: Fraunhofer HHI/Berlin Philharmonic Orchestra/Monika Rittershaus
Page 23: Fraunhofer IIS/Frank Boxler/Valentin Schilling
Page 24: Fraunhofer IIS/Matthias Heyde
Page 27: iStock.com/sweetym
Page 29: Fraunhofer IIS/Fraunhofer IDMT
Page 30/31: Fraunhofer IIS
Page 33: CC0 Public Domain
Page 37: Fraunhofer IIS/Kurt Fuchs
Page 37: DS Broadcast