Technology Trends in Audio Engineering - A report by the AES Technical Council

Page created by Tommy Welch
 
CONTINUE READING
Technology Trends in Audio Engineering - A report by the AES Technical Council
TECHNOLOGY TRENDS

Technology Trends in
Audio Engineering
A report by the AES Technical Council

INTRODUCTION                                     http://www.aes.org/technical/ to learn          Council and its committees is to track
Technical Committees are centers of tech-        more about the activities of each com-          new, important research and technol-
nical expertise within the AES. Coordi-          mittee and to inquire about membership.         ogy trends in audio and report them to
nated by the AES Technical Council, these        Membership is open to all AES members           the Board of Governors and the Society’s
committees track trends in audio in order        as well as those with a professional inter-     membership. This information helps the
to recommend to the Society papers,              est in each field.                              governing bodies of the AES to focus on
workshops, tutorials, master classes, stan-         Technical Committee meetings and             items of high priority. Supplying this
dards, projects, publications, conferences,      informal discussions held during regular        information puts our technical expertise
and awards in their fields. The Technical        conventions serve to identify the most          to a greater use for the Society. In the fol-
Council serves the role of the CTO for           current and upcoming issues in the spe-         lowing pages you will find an edited com-
the society. Currently there are 20 such         cific technical domains concerning our          pilation of the reports recently provided
groups of specialists within the council.        Society. The TC meetings are open to all        by some of the Technical Committees.
Each consists of members from diverse            convention registrants. With the addition
backgrounds, countries, companies,               of an internet-based Virtual Office, com-                                 Francis Rumsey
and interests. The committees strive to          mittee members can conduct business at                      Chair, AES Technical Council
foster wide-ranging points of view and           any time and from any place in the world.         Bob Schulein, Jürgen Herre, Michael Kelly
approaches to technology. Please go to:             One of the functions of the Technical                                       Vice Chairs

                                                         AUDIO FOR
                                                    TELECOMMUNICATIONS
                                                               Antti Kelloniemi
                                                         Kalle Koivuniemi, Co-chairs

The trend in mobile telecommunications           to support multiple voice codecs for each       codecs.The audio recording capabilities of the
across the globe has been towards devices        mode depending on the network or service        devices have improved to offer high bitrate
that are the mobile window into a person’s       provider. In all cases the switching between    and 24 bit capture in some cases. There are
digital universe. In 2013 smartphones outsold    modes and performance of the system must        also many products out on the market that
feature phones globally for the first time on    be seamless, as past performance is no longer   allow multichannel audio recording via inter-
a unit basis. Several companies have focused     acceptable to the consumer. After 2013,         nal microphone arrays to complement the
efforts on creating inexpensive smartphones      mobile communication using VoIP and video       video recording on the devices. Use of stereo
for the emerging markets, where, in many         conferencing apps has increased in accelerat-   or spatial audio also for communication appli-
cases, these devices may be the only link        ing pace.                                       cations was increasingly studied in scientific
those consumers have to the internet and the        From a voice quality standpoint, the AMR     research during the first decade of 2000. The
rest of the world. This expansion has resulted   Wide Band codec is in widespread use glob-      first commercial applications have emerged,
in devices that have many more communica-        ally, and the 3GPP EVS (Enhanced Voice          and we can see this as one trend that may
tions bands and codecs than we have had in       Services) codec was finalized in 2014 and the   change the way we communicate, together
the past. It is not uncommon to find single      first commercial deployment took place in       with the rise of technology and applications
devices that will communicate via UMTS,          South Korea next year. Demands for IP-based     for augmented or virtual reality.
GSM, CDMA, LTE (VoLTE), and WiFi (VoIP).         communication also in poor networks have           Voice controlled user interfaces in the
Some of these modes of communication have        caused a rising interest in low bitrate voice   mobile device have evolved to Virtual Personal

1                                                                                                    J. Audio Eng. Soc., Vol. 65, No. 3, 2017 March
TECHNOLOGY TRENDS

Assistants that extend to other platforms as           Advancements in usability in all environ-     technology constantly monitors the content
well. Interaction with Natural Language is          ments have come in the form of noise adaptive    and the speaker to produce optimal loud-
also coming to chatbots that are used for           technology in voice call, content playback       ness or bandwidth, while maintaining safe
automating business processes like ordering         and voice control. Voice communication and       drive levels for the speaker. This technology
a pizza by means of a phone call. Communica-        voice recognition have benefitted from algo-     paired with new high voltage amplifiers work
tion with these virtual assistants sets different   rithms that adapt uplink noise reduction or      together to produce impressive audio quality
requirements for smart phones and VoIP sys-         algorithm parameters based on noise type         in many environments.
tems from what was common with human-to-            and level. This becomes even more important         On the content side of the business, the
human calls and new technologies are being          when voice control and new types of endpoint     trend has been from localized content to cloud
developed to raise the quality of these features.   devices emerge. Smart speakers or communi-       based content. This has allowed the con-
We can estimate that in the near future using       cation hubs for homes, conferencing phones       sumer to consume the same content across all
a mobile device for voice communication             for office environment are used from further     devices, which has posed challenges for small
between a user and a digital assistant or a bot     away compared to personal mobile devices,        power-conscious mobile devices.
will be as typical as communication between         so signal to noise ratio may be poor to start       The move from portable phones with a
humans. This communication will happen              with. Likewise downlink audio and content        few extra features to the mobile hub of the
both through the same signal paths as normal        playback have benefitted from noise adaptive     consumer’s digital world is reflected in the
voice communication and through dedicated           algorithms. Another new feature that assists     standards world, where the scope of standards
paths, using signal processing and coding           the usability in various environments has been   as well as their names have had to change to
optimized for voice recognition engines.            the use of smart amplifier technology. The       keep up.

                                              BROADCAST AND ONLINE DELIVERY
                                          David Bialik, Kimio Hamasaki, Matthieu Parmentier, Co-chairs
                                                            Jim Starzynski, Vice Chair

Using LKFS and LUFS: Trends to                      and scale” content loudness automatically,       member meeting at NAB in April, work com-
Effectively Managing Video Content                  with extreme accuracy and without audible        menced on the paper and consensus was
Audio Loudness                                      artifacts. Smooth content transitions for        achieved in late summer. AES TD1005: Audio
During the analog to digital television tran-       over the air television programming and          Guidelines for Over the Top Television and
sition, professionals and consumers alike           advertising were now possible, meeting the       Video Streaming — Preliminary Loudness
struggled with the unintentional loud-              expectations of content creators, broadcast-     Guidelines was formally released in less than
ness challenges that DTV’s extreme audio            ers, government and the home audience.           6 months at the AES Convention in Los
dynamic range brought to audiences. Pro-               In January of 2016, on the heels of a very    Angeles in late September.
gram-to-commercial loudness transition              successful release of AES TD1004: Recom-            Currently, AGOTTVS group members are
problems recognized by broadcasters, and            mendation for Loudness of Audio Streaming        working independently in a Coding, Meta-
shortly after by government, were effectively       and Network File Playback, the AES leader-       data, Loudness and DRC Subgroup, an Archi-
addressed and solved by the implementa-             ship and its members recognized the need         tecture and Devices Subgroup, and a Content
tion of the recommended practices of indus-         to bring this new trend in TV loudness man-      Creation Subgroup. All members meet every
try-wide efforts (ATSC’s A/85, EBU’s R 128          agement to the rapidly expanding platforms       other week during a 90 minute teleconfer-
and others) that documented new loudness            of early Over-The-Top-Television (OTT) and       ence. The group’s goal is to leverage the
measurement techniques.                             Online Video Distribution (OVD) that were        successful loudness techniques and already
A key ingredient to all solutions was the cre-      noticeably plagued by exactly the same loud-     in-place recommendations for over the air
ation of the cornerstone loudness measure-          ness issues as early DTV. In winter of 2016,     TV and bring them to OTT and OVD. Work
ment recommendation—ITU-R BS.1770 that              AES Audio Guidelines for Over the Top Tele-      being done on the behavior of current coding
established a globally recognized means to          vision and Video Streaming (AGOTTVS—a            technology and the variety of devices repro-
assign a numeric value to perceptual loud-          subcommittee of this Technical Committee)        ducing on-line content will help to fine tune
ness, based on a model of human hearing.            group was formed and currently has over 50       the new loudness guidelines.
   With these tools in hand, broadcasters           members representing Amazon, Apple, Goo-            AGOTTVS is well on its way to create an
around the world equipped their facilities          gle, Hulu, Netflix, manufacturers, major US      AES Comprehensive Loudness Recommen-
with new loudness metering and trained              and European TV networks, content creators       dation to serve as a reference to the online
their staff to mix using a new loudness             and many more.                                   content creation and distribution commu-
measurement paradigm based on identi-                  AGOTTVS first charge was to develop “Pre-     nity. The work has potential to become an
cally derived values (labeled LKFS by ATSC          liminary Guidelines” to fulfill the group’s      AES standard and is tentatively planned for
and LUFS by EBU) moving away from                   urgent need to bring DTV’s basic fundamen-       release at AES 143 in New York this fall.
obsolete VU and PPM approaches. This                tal LKFS-LUFS loudness practices to the             Interested individuals can contact broad-
development paved the way for file-based            emerging online distribution community.          cast@aes.org or Jim.Starzynski@
content transcoding devices to “measure             Conceived during an informal face-to-face        nbcuni.com, Chairman AES AGOTTVS.
2                                                                                                        J. Audio Eng. Soc., Vol. 65, No. 3, 2017 March
TECHNOLOGY TRENDS

                                                    CODING OF AUDIO SIGNALS
                                                  Jürgen Herre and Schuyler Quackenbush, Chairs

The AES Technical Committee on Coding               coding process and send parametric infor-      for this setup.
of Audio Signals is composed of experts             mation about them instead. On the decoder         The paradigm compatibility challenge:
in perceptual and lossless audio coding.            side, the transmitted waveform components      3D audio content can be produced in differ-
The topics considered by the committee              are used to re-create the missing spectral     ent paradigms, i.e. either as channel signals
include signal processing for audio data            regions using the parametric information.      intended for a specific loudspeaker setup, or
reduction and associated rendering. This            This allows full audio bandwidth even at low   as object signals intended for specific spatial
includes methods for reducing redun-                bitrates. Well-known examples of such tech-    playback coordinates and properties, or as
dancy and irrelevancy. An understand-               nologies are Spectral Bandwidth Replica-       Higher Order Ambisonics (HOA). Although
ing of auditory perception, models of the           tion (SBR), Harmonic Bandwidth Extension       the latter two are independent of particular
human auditory system, and subjective               (HBE) and Intelligent Gap Filling (IGF).       loudspeaker setups, they are very different
sound quality are key to achieving effec-              Both joint coding of channels and band-     signal representations. These different sig-
tive signal compression.                            width extension have recently also been        nal representations require different coding
   Audio Coding has undergone a continu-            implemented without a dedicated filterbank,    approaches in order to provide high signal
ous evolution since its commercial begin-           thus saving computational complexity and       compression while preserving spatial fidelity.
nings in the early nineties. Today, an audio        algorithmic delay.                                As a recent example, the MPEG-H 3D Audio
codec provides many more functionalities                                                           codec addresses these challenges by integrat-
beyond mere bitrate reduction while pre-            Convergence between speech and                 ing a core signal compression technology
serving sound quality. A number of exam-            audio coding                                   (USAC) that can support channels and objects
ples are given in the following:                    The latest generation of codecs often come     and HOA signals when equipped with prepro-
                                                    as combined audio and speech codec, i.e.       cessing to achieve additional compression.
Evolution of the coder structure                    a synthesis of advanced perception-based
In the nineties, a classic audio coder con-         audio coding technology and state-of-the-      Integration of coding and rendering
sisted of four building blocks: analysis fil-       art speech-production-based speech coding      Many modern codecs do not only perform
terbank, perceptual model, quantization             technology into a truly universal codec with   efficient compression and decoding of audio
and entropy coding of spectral coefficients,        optimum performance for both music and         signals but also include aspects of signal
and bitstream multiplexer. Later, more              speech signals. Examples include MPEG-D        rendering to output devices, such as head-
and more tools were added to enhance the            Unified Speech and Audio Coding (USAC)         phones and loudspeakers. Examples for
codec behavior for critical types of input          or 3GPP’s codec for Enhanced Voice Ser-        such efficient combined processing include:
signals, such as transient and tonal solo           vices (EVS) where the latter is also a com-       Binaural rendering for headphones: head-
instruments, and to improve coding of ste-          munication codec with suitably low delay.      phone listening is a presentation format
reo material. More recently, this structure                                                        that is especially important in view of the
was augmented by pre/post processors that           Coding of immersive audio programs             vast number of multimedia-enabled mobile
provided significantly enhanced perfor-             After stereo and surround sound, the           devices. Binaural presentation provides an
mance at low bitrates:                              next generation of formats that further        immersive experience via headphones. Inte-
   Joint coding of channels: firstly, spatial       increased spatial realism and listener         grated coding and binaural rendering at the
preprocessors perform a downmix of the              immersion is “3D Audio”, i.e. reproduc-        decoder comes at very low computational
codec input channels (two-channel ste-              tion including height (higher and possi-       cost and permits use of personalized HRTFs/
reo, 5.1 or more) into a reduced number of          bly lower) loudspeakers. Examples are the      BRIRs. Examples include MPEG-D MPEG
waveforms plus parametric side informa-             22.2 or 5.1+4 (i.e. four height loudspeakers   Surround and Spatial Audio Object Coding
tion describing the properties of the original      added on top of the regular 5.1) configura-    (SAOC) as well as MPEG-H 3D Audio.
spatial sound image. Thus, the bit rate can         tions. For embracing such formats, two key        Rendering for multiple loudspeakers: for
be significantly reduced. On the decoder            challenges needed to be solved:                loudspeaker presentation, an immersive lis-
side, the transmitted waveforms are upmixed            The loudspeaker setup / “format” com-       tener experience can be achieved by directly
again using the transmitted spatial side infor-     patibility challenge: 3D audio content may     rendering the decoded audio output to the tar-
mation to reproduce a perceptual replica of         be produced in many different loudspeaker      get multi-channel loudspeaker layout, which
the original sound image. Examples for such         layouts. Furthermore, reproduction in con-     can be attractive from computational complex-
spatial pre/postprocessors are Parametric           sumer homes will not happen with a large       ity perspective. Several recent audio codecs
Stereo (PS) and MPEG Surround.                      number of (e.g. 22) loudspeakers that may      offer the ability to render to a user selectable
   Bandwidth extension: secondly, to alle-          have been used for content production.         output layout, i.e. either one of the standard-
viate the burden of the core codec, vari-           Thus, a codec for 3D Audio needs to be able    ized loudspeaker setups (ranging from 5.1 to
ous pre/post processors were devised which          to reproduce 3D content on any available       22.2) or even to arbitrary setups. Examples
exclude spectral regions (predominantly             loudspeaker setup, adapting the content and    include MPEG-D Spatial Audio Object Coding
the high frequency range) from the regular          providing best possible listening experience   (SAOC) and MPEG-H 3D Audio.

J. Audio Eng. Soc., Vol. 65, No. 3, 2017 March                                                                                                  3
TECHNOLOGY TRENDS

Interactive sound modification                  dium sound for sports events, personal music       they also have the capability to signal to
capability                                      remixes). This can be achieved by using sound      the transmitting device the instantaneous
The ability to modifying a particular sound     objects and, for low bitrate applications,         channel capacity. Compressed audio for-
within a complex sound mixture at the time      MPEG-D Spatial Audio Object Coding (SAOC).         mats that support seamless switching to
of decoding is receiving increased attention.                                                      lower (or back to higher) bit rates permit
Most prominently, dialog enhancement of         Rate adaptive audio representation                 the system to deliver uninterrupted audio
coded sound can boost dialog for hearing        There is an ever-greater consumption of            under changing channel conditions. An
impaired listeners. Interactive sound modi-     streaming audio on mobile devices. While           example of such a system is the MPEG-3D
fication also enables creating personalizing    mobile channels invariably suffer from             Audio coding format in combination with
sound mixtures (e.g. commentator vs. sta-       variable transmission channel capacity,            the MPEG DASH delivery format.

                                                 HIGH RESOLUTION AUDIO
                                                   Vicki Melchior and Josh Reiss, Chairs

High resolution audio (HRA) is an estab-           Also related to DSD is DXD, a designation       filtering, higher resolutions are both the
lished mainstay in the professional and         for PCM at 352.8 kHz/24b championed by             outcome of, and drivers of, this search.
audiophile markets. Growth in the last sev-     Merging Technologies as an intermediate            There is at present an effort by manufac-
eral years focused on a number of new for-      stage in DSD processing. Single bit streams        turers of high quality converters to address
mats and on quality considerations for the      cannot easily be filtered or processed and so      shortcomings attributed to the up-sampling
internet delivery of downloads in both the      typically are converted to PCM at high sample      chips and multi-bit sigma delta modulator
new and older HRA formats, typically at         rates to enable production. DXD has been           chips used nearly universally in PCM DAC
increasingly high bit rates. Rapid growth       used as a direct recording format for release in   processing. Techniques include substitu-
in streaming now suggests that formats          DSD, as an intermediate when both recording        tion of FPGA or computer-based up-sam-
appropriate to more stringent bit rates         and releases are in DSD, and less often as a       pling for that found on chips, custom filter
will be useful, and certain of those are in     352.8 kHz LPCM release format.                     design including minimum phase designs,
development. The ability to stream can also        “Master Quality Authenticated” (MQA) is a       increase of processing bit depth to dou-
facilitate the introduction of HRA into the     proprietary LPCM codec introduced in 2014          ble precision floating point (64b) or above,
mainstream marketplace. The HRA Techni-         by Meridian Audio and MQA Ltd. It combines         and custom sigma delta modulation and
cal Committee sponsors workshops, tutori-       newly developed filtering and signal process-      decimation. Chip makers have developed
als, and discussions highlighting significant   ing with a hierarchical fold-down and bit          improved chips incorporating similar pro-
aspects of these developments for the AES       packing scheme, thus enabling the streaming        cessing upgrades, plus improved noise
community.                                      of high resolution quality files at data rates     shaping, jitter control, clocking and iso-
                                                comparable to CD while embedding a lower           lation. Such chips increasingly appear in
Recent HRA Formats                              bitrate version for conventional CD playback.      HRA-capable hardware.
Notable in the last several years has been      The high resolution data is not lossless in the        The theoretical and practical influence of
the emergence and uptake of DSD as an           conventional sense of exact bit retention but      filters on sound has long been debated and
independent encoding and distribution for-      rather incorporates bandwidth and dynamic          continues to be. There has been little formal
mat. DSD is the term used by Sony and           range processing reflective of the band limits     study, but a recent AES convention paper
Philips for the single bit stream output        of music and the dynamic range imposed by          reported audibility in double blind tests of
of a sigma delta converter that, together       microphone noise.                                  several down-sampling filters typical of those
with related processing, is used in stor-          Virtually all new professional and con-         in common CD usage (AES 137, Los Angeles;
age and transmission associated with the        sumer hardware and software supports at            Jackson, et.al.) Also worth mentioning in
production of SACDs. The original DSD           least a subset of the LPCM and DSD high res-       this regard is the filtering approach in the
oversampling rate of 64 Fs (64 x 44.1 kHz =     olution formats. The MQA codec is gaining          new MQA algorithm, which incorporates
2.8224 MHz) was expanded to include both        industry support and increasingly appears in       (proprietary) non-traditional sampling, fil-
128 Fs and 256 Fs. The main advantage of        newer portables, streamers and players, espe-      tering, and convolutional methods based
the higher rates is that the rise in shaped     cially audiophile gear, although a significant     on triangular sampling kernels. The design-
noise which occurs as a consequence of          backward-compatibility characteristic is the       ers claim, in AES and internet papers, that
dynamic range processing in sigma delta         playability of the embedded standard resolu-       significant sonic enhancements occur over
modulators can be pushed considerably fur-      tion file in the absence of a decoder.             conventional sampling and filtering, and are
ther out beyond the audio band (> 60 kHz),                                                         readily audible to mastering engineers.
and with less quantization noise remaining      Improved Converters, Filters, and
in the audio band, than is possible with        Signal Processing                                  Distribution
64 Fs. The DSD signal is said to sound          While audio designers have always sought           Distribution of HRA files is now primarily
cleaner and more transparent at the higher      to identify the sources of sonic deterio-          internet-based, although discs including
data rates.                                     ration brought about by processing and             Blu-ray, Pure Audio Blu-ray, SACD, and

4                                                                                                      J. Audio Eng. Soc., Vol. 65, No. 3, 2017 March
TECHNOLOGY TRENDS

DVD-A continue in various regions. HRA             are one-fourth the size of comparable PCM       the AES, and are presently co-sponsoring
downloads remain a growth sector. Spe-             files but are still large for general stream-   extensive promotions with major consumer
cialty websites ranging from large aggrega-        ing. Two developing formats are MQA, men-       electronics resellers (Best Buy in the U.S.).
tors to small labels and orchestras offer new      tioned earlier, and in some uses, MPEG-4        Their labeling system has been extended
work and remastered catalog at PCM reso-           SLS. The main limitation to MQA adoption        to HRA streaming services in the indus-
lutions from 192 kHz/24b to 44.1 kHz/16b,          as yet is the lack of enough MQA-encoded        try-wide expectation that streaming will
and DSD at 64 Fs and 128 Fs. Consumers             music. Warner Music Group is currently          soon become important.
and the industry generally have become             converting their catalog to MQA, and if
increasingly conscious of release quality,         other major labels follow there may be sig-     Academic Studies
leading some websites to require that older        nificant content in the near future.            A recent paper published by the Queen
data be sourced from the best available                                                            Mary University of London audio engineer-
masters while eschewing further transcod-          Initiative to Extend HRA to the Broad           ing group applied meta-analysis statistical
ing or other media transfers.                      Market                                          techniques to a collection of 20 previously
   Streaming is rapidly displacing down-           A joint initiative from the Digital Enter-      published perceptual studies of HRA. The
loads in the mass market and will certainly        tainment Group (DEG), Consumer Elec-            result showed a small but significant audi-
impact high end and pro audio, although            tronics Association (CEA), the Recording        bility of HRA in comparison to CD that
the extent remains uncertain. Network and          Academy, and the major labels to define         greatly increased when participants were
storage bandwidths have improved, allowing         and promote HRA continues. Their defini-        initially trained. The paper is notewor-
a number of services (e.g., Qobuz) to stream       tion and set of provenance designators for      thy both for the result, and for the use of
losslessly compressed HRA, but the high            HRA were released in 2014 and, while not        meta-analysis in this context. (J. Reiss, “A
bitrates of these files mitigate against their     universally liked, are in current use. The      Meta-Analysis of High Resolution Audio
general use. This has led to a search for via-     group initially sponsored demonstrations        Perceptual Evaluation”, JAES vol. 64, pp.
ble data-packing formats for HRA. DSD files        of HRA at many trade events, including          364–379, (June 2016))

                                                 LOUDSPEAKERS AND HEADPHONES
                                                   Steve Hutt, Chair; Juha Backman, Vice Chair

TC-LH is comprised of experts in loud-             accuracy are progressing with research to       Technology
speaker and headphone systems with skills          quantify ideal performance goals, align-        Micro Drivers. Portable devices drive a big
in transducer design, manufacturing, mea-          ing objective measurements with subjec-         demand for micro-drivers. Challenges lie
surement and proficiencies in audio sys-           tive preferences. Wireless connectivity is      in developing higher acoustic output with
tem integration, sound propagation and             increasing, driven in part by device design     greater low frequency and quality, but with
acoustics.                                         and convenience.                                miniaturization of drivers, enclosures and
                                                      Headphone Transducers. These are dom-        cost. Micro-driver topologies are dominated
Consumer Electronics                               inated by “full-range” moving coil designs.     by permanent magnet/moving-coil motors,
Streaming Devices. This is a growing mar-          Balanced armature devices and multi-way         though not necessarily with axisymmetric
ket segment but has recently become sub-           systems are becoming more prevalent.            voice-coils. MEMS (microelectromechan-
ject to disruption from “Smart Speakers”.          Alternative topologies are entering the mar-    ical systems), with integrated electronics
These are wireless systems with voice com-         ket such as planar and in-ear electrostatics,   are finding their way into devices and show
mand control and interface to service plat-        also available in multi-way configurations.     great promise.
forms as part of the “Internet of Things”             Virtual Reality. Systems are garnering          Neo Magnets. The attractiveness of neo
(IoT) connectivity network. By some                new opportunities requiring improved head-      magnets (Neodymium Iron Boron, NdFeB)
accounts, smart speakers are already among         phone performance, including requirements       lie in the size and weight ratio to magnetic
the highest selling types of loudspeakers.         for rendering spatial attributes accurately.    potential vs. ferrite or other magnets. Driven
So far, playback quality appears to be less           Hi-Resolution. Hi-Res audio growth is        by weight reduction and though more expen-
important than interactive capabilities.           robust. As loudspeakers are often referred      sive than ferrite, a trend to neo in automotive
   Home Theater. The trend for in-wall,            to as the “weak link in the audio chain”        and portable pro sound started in the 1990s.
invisible & low profile systems continues          and the Hi-Res effort includes extending        In late 2008 neo prices began to rise dramat-
and drives development of transducers with         audio bandwidth ever higher, loudspeak-         ically and by 2011 were near 10 times their
minimal depth. Sound Bars continue their           ers capable of Hi-Res playback will require     2008 price. Automotive companies and sup-
popularity, some with improved perfor-             careful consideration and development.          pliers absorbed the cost but pro sound has
mance over earlier generations.                    “Super-tweeters” and planar drivers have        begun to revert back to ferrite in non-flying
   Headphones. While headphone per-                been available for years with >60 kHz band-     systems. Additional demand sustaining high
formance was arguably less important to            width. Temporal requirements for Hi-Res         neo prices comes from emerging industries
some users than fashion, efforts for sonic         audio are under investigation.                  such as electric vehicles and wind gener-

J. Audio Eng. Soc., Vol. 65, No. 3, 2017 March                                                                                                 5
TECHNOLOGY TRENDS

ation. Currently, neo prices have fallen to      will trend with the integration of fully cou-     ducers onto a single horn that functions as
~3.5 times their pre-2008 norm but dyspro-       pled DSP/amplifier/loudspeaker systems.           a high output point source.
sium (used in neo for high heat applications)       Moving Magnet topology has been                   3D Sound in vehicles has evolved from
remains even higher. Neo continues to be         researched for years and is now used in a         spatial novelty to trend, incorporating over-
favored for high performance compression         potentially disruptive high output low-fre-       head sound. The mechanical challenge is to
drivers and tweeters.                            quency system utilizing closely coupled           develop transducers that fit shallow mount-
                                                 amplifier and DSP control circuitry.              ing depth in the headliner, yet have directiv-
Alternative Topology Transducers                    DSP Corrective Systems are evolving with       ity appropriate for 3D sound fields.
Air Motion Transformer (AMT) & Planar.           the benefit of compensating for deficiencies         Cinema. The trend to 3D sound contin-
AMT transducers are finding increased use        in transducers, and have great merit with         ues though the specification for loudspeaker
in studio monitors and pro systems. Planar       micro-drivers. Advancements in DSP solu-          directivity and power capacity is still evolving.
magnetic drivers are found in some high          tions are indeed admirable and useful, but           Zoned-audio has not prevailed in con-
end pro systems and vehicles.                    DSP on its own has not negated the effort to      sumer space but is gaining traction in
   MEMS. MEMS used in arrays are being           develop improved transducers.                     automotive sound. The goal is to create
touted for package, weight and potential for                                                       active and null zones so that each passen-
beam steering.                                   Directivity Control                               ger listens to discrete sound not audible to
   DML & BMR. The trend of Distributed           Pro Sound continues to trend directivity          other passengers. DSP is used to manage
Mode Loudspeakers (DML) systems popular-         control with proliferation of line array sys-     the propagation and phase of purpose built
ized in the 2000s has waned with the excep-      tems and low frequency cardioid arrays.           transducers with waveguides that exhibit
tion of a few niche applications. Balanced       Purpose designed transducers and wave-            specific directivity and in some cases “flat”
Mode Radiator (BMR) device improvements          guides continue to evolve along with sys-         depth to fit into headliners.
continue.                                        tem modularity. Line array systems utilize
   Electro-Magnetic Drivers using a field-       passive or automated mechanical adjust-           Standards
coil in place of permanent magnet, are           ments for vertical tilt and alignment. Hor-       AES2-2012 (Standard) [Methods of Mea-
subject to recent patent filings motivated       izontal coverage can be adjusted by manual        suring Drive Units] up for review in 2017.
by efforts to hedge neo price instability.       or automated adjustment of horn bound-            X223 (Information Document in prog-
Efficiency. Automotive and pro sound can         aries. A trend is evolving with advanced          ress) focuses on developing methods
benefit most, but improvements in loud-          adaptive electronics to manage beam steer-        to reliably repeat loudspeaker driver
speaker efficiency lags potential advantages,    ing. This allows reconfiguring a system’s         measurements in different locations.
partly because of the intellectual complexity    directivity for real time conformance to          AES-X241, “End-of-line testing for loud-
of loudspeaker voltage sensitivity vs. true      architectural and ambient conditions. An          speaker drivers” (standard in progress). Tar-
efficiency. Maximum efficiency solutions         alternative approach loads multiple trans-        get release May 2018.

                                                 NETWORK AUDIO SYSTEMS
                                                             Kevin Gross, Chair
                                                Richard Foss and Thomas Sporer, Vice Chairs

Professional AV synchronization                  and September 2013. Coordination between          259M. In 2007 SMPTE published ST 2022-6,
Professional AV media networking requires        the two standards development groups              which describes carriage of SDI signals over
a synchronization component. Early audio         resulted in a synchronization component in        an RTP/IP connection. Since SDI carries as
network technologies such as CobraNet and        AES67 based on IEEE 1588, which is compat-        many as 16 channels of digital audio multi-
Livewire used proprietary techniques for syn-    ible with ST 2059. The compatibility between      plexed with video, ST 2022-2 also defines a
chronization. In 2002, the IEEE 1588 preci-      the standards was perhaps not widely appreci-     type of audio networking.
sion synchronization standard was published.     ated until the AES published AES Standards           During NAB 2014 several broadcast vendors
Initial deployment was in industrial network-    Report R16 PTP parameters for AES67 and           demonstrated ST 2022-6 primarily in a point-
ing applications but it soon found applica-      SMPTE ST 2059-2 interoperability in May           to-point configuration. There was a clear need
tions in audio networking in AVB, Dante, and     2016. SMPTE has held interoperability test-       to define a more network-centric framework
Q-LAN.                                           ing events for manufacturers working with         that could help realize the full potential of
   Back in 2007 SMPTE recognized the need        ST 2059 and has included synchronization          IP video. The Video Services Forum initiated
for network synchronization for professional     interoperability testing with AES67 devices in    such a development effort in April, 2014.
video networking applications. A require-        recent events.                                       Two approaches were considered; the first
ments study produced a Request for Stan-                                                           was built on ST 2022-6 and thus moved the
dardization requirement document in August       Video IP networking                               legacy SDI approach onto a network. This
2009. Standards development ensued and           Video infrastructure in broadcast facilities is   effort produced a technical recommendation
SMPTE ST 2059 was published in March 2015.       primarily based on the synchronous digital        called TR-04. The second approach removed
AES67 was developed between October 2010         interface (SDI) standardized in SMPTE ST          the SDI legacy component and uses previous

6                                                                                                      J. Audio Eng. Soc., Vol. 65, No. 3, 2017 March
TECHNOLOGY TRENDS

work done in the Internet Engineering Task       document deals with a set of parameters that       At the time of this writing there are two
Force (IETF), RFC 4175 for flexible uncom-       describe how to transmit and receive audio      areas of new work under consideration. These
pressed digital video over RTP/IP. This effort   streams. These are to ensure the successful     are (i) dual streaming between devices; (ii) SIP
produced TR-03 in November 2015. The audio       decoding of audio based on the parameters       peering between broadcasters, which would
component under TR-03 is AES67 and syn-          received and negotiated when each specific      allow the transfer of audio between organi-
chronization is accomplished using SMPTE         connection is established. The specification    zations that currently use closed SIP systems
ST 2059 (see above). These media networking      has been implemented by a number of man-        simply through the sender making an appro-
techniques have been adopted by many of the      ufacturers and is in use by some broadcasting   priate call to the destination. Any work in
major players in the broadcast industry. A       organizations.                                  either of these two areas would be expected to
trade association, AIMS, was formed in April        The specification defined a new Session      make use of existing protocols and standards
2016 to promote the technology and SMPTE         Description Protocol (SDP) attribute named      to achieve these goals.
was charged with creating standards, expected    ebuacip, along with a number of attribute          AES67 includes a mode that allows interop-
to be called ST 2110, from the technical rec-    values. The attribute is now registered with    erability with the ACIP standard EBU TECH
ommendations.                                    IANA.                                           3326. The AES has held interoperability test-
                                                    The ACIP II group is now officially closed   ing events for manufacturers working with
Audio Contribution over IP (ACIP II)             but remains ready to be resurrected should      AES67, and recent ones have included testing
This EBU working group published EBU             the need arise for any particular topics that   with ACIP devices. The next AES67 interop-
Tech. 3368 Profiles in November 2014. This       might be within its scope.                      erability event is planned for February 2017.

J. Audio Eng. Soc., Vol. 65, No. 3, 2017 March                                                                                                 7
You can also read