Emergent User Interfaces – CS-E4200, Lecture 4: Sound and Auditory Interfaces, 4 Feb 2021 (MyCourses)
Emergent User Interfaces
CS-E4200

Lecture 4
Sound and auditory interfaces

4 Feb 2021
What is Sound?
Vibrations (pressure variations) in an elastic medium (such as
air), detected by a receiver (ears, a microphone, …)

A sound signal can be analysed with the Fourier transform and
characterized by its frequency spectrum

Two different types
  tonal sounds, consisting of sinusoidal components whose
  frequencies are in a harmonic (integer) relation
  noises – with variable spectra

Human-made sounds are often tonal (vowels, singing,
musical instruments); most natural sounds are not
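The Fourier analysis mentioned above can be made concrete with a small sketch (not from the lecture): a naive O(n²) discrete Fourier transform applied to a synthetic tonal signal with a fundamental and one harmonic. The sampling rate, signal, and function name are illustrative assumptions; real code would use an FFT library.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Naive discrete Fourier transform: magnitude of each frequency bin
    up to the Nyquist frequency (bin k corresponds to k * fs / n Hz)."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]

# A tonal signal: 50 Hz fundamental plus a weaker 2nd harmonic at 100 Hz,
# sampled at fs = 1000 Hz for 200 samples.
fs, n = 1000, 200
tone = [math.sin(2 * math.pi * 50 * t / fs)
        + 0.5 * math.sin(2 * math.pi * 100 * t / fs)
        for t in range(n)]
spectrum = dft_magnitudes(tone)

# The two strongest bins fall exactly at the component frequencies.
peaks = sorted(sorted(range(len(spectrum)), key=lambda k: spectrum[k])[-2:])
print([k * fs / n for k in peaks])  # → [50.0, 100.0]
```

Because both components complete a whole number of cycles in the analysis window, all the signal energy lands exactly in the 50 Hz and 100 Hz bins; a noise signal would instead spread energy across the whole spectrum.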

Sinusoidal Sound
Elementary component in sound analysis, rare in nature

Two key properties
  frequency, measured in Hertz (Hz)
     1 Hz = one cycle per second
     inverse of the cycle period ( f = 1/T )
  amplitude (pressure level, A)

[Figure: one cycle of a sine wave, marking the amplitude A and the period T]
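A minimal sketch of these two properties (the frequency, amplitude, and sampling-rate values below are arbitrary choices, not from the slides):

```python
import math

def sinusoid(freq_hz, amplitude, duration_s, fs=8000):
    """Sample a pure tone x(t) = A * sin(2*pi*f*t) at rate fs."""
    n = int(duration_s * fs)
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / fs)
            for t in range(n)]

# The period T is the inverse of the frequency: a 440 Hz tone
# repeats every 1/440 s, i.e. about 2.27 ms.
f = 440.0
T = 1.0 / f
x = sinusoid(f, 1.0, 0.01)       # 10 ms of signal = 80 samples at 8 kHz
print(round(T * 1000, 3))        # period in milliseconds
```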

Hearing Sounds
Humans can hear approx. 20 Hz – 20 kHz
  the upper limit falls with age to about 15 kHz

Pressure level (amplitude)
  measured relative to a "standard" sound pressure on a
  logarithmic scale, in decibels (dB)
  Examples:
     normal conversation 65 dB, chainsaw 120 dB
     levels above 85 dB should be avoided
  Note: this is not the same as "loudness", which is the
  subjectively perceived strength of the sound
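The logarithmic dB scale can be sketched in a few lines. The 20 µPa reference is the standard one for sound pressure level in air; the helper name is our own.

```python
import math

P0 = 20e-6  # standard reference sound pressure in air: 20 micropascals

def spl_db(pressure_pa):
    """Sound pressure level in decibels relative to the 20 µPa reference."""
    return 20 * math.log10(pressure_pa / P0)

# Doubling the pressure adds about 6 dB; tenfold pressure adds exactly 20 dB.
print(round(spl_db(2 * P0), 1))   # → 6.0
print(round(spl_db(10 * P0), 1))  # → 20.0
```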

Auditory Thresholds

Humans hear frequencies from 20 Hz to 22,000 Hz

Most everyday sounds fall between 40 and 80 dB
Tonal pitch and timbre
Non-sinusoidal but repeating waveform

  base frequency (F0) + harmonic overtones (n × F0)

  pitch = perceived base frequency

  timbre = sound "color", defined by the proportions of the spectral components

Loudness is not constant over time

  sound envelope
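These three ideas – harmonics at n × F0, timbre as harmonic proportions, and a sound envelope – can be sketched with simple additive synthesis. The linear decay envelope and the particular harmonic weights are illustrative assumptions, not from the lecture.

```python
import math

def harmonic_tone(f0, harmonic_amps, duration_s, fs=8000):
    """Additive tone: fundamental f0 plus overtones at n * f0.
    harmonic_amps[n-1] is the amplitude of the n-th harmonic, which
    determines the timbre; a linear decay acts as the envelope."""
    n_samples = int(duration_s * fs)
    out = []
    for t in range(n_samples):
        env = 1.0 - t / n_samples   # simple linearly decaying envelope
        s = sum(a * math.sin(2 * math.pi * (i + 1) * f0 * t / fs)
                for i, a in enumerate(harmonic_amps))
        out.append(env * s)
    return out

# Same pitch (220 Hz), two different timbres: a pure sine vs. a tone
# with strong odd harmonics.
pure = harmonic_tone(220, [1.0], 0.05)
reedy = harmonic_tone(220, [1.0, 0.0, 0.5, 0.0, 0.3], 0.05)
```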

Non-tonal sounds
Non-repeating waveform

a) Discrete spectral components with non-harmonic frequencies

      bells, resonating everyday objects

b) Continuous spectrum = noise

      random (unpredictable) signal

      noises differ from one another: the distribution of
      frequencies and of signal values matters

      white noise, pink/blue noise,
      popcorn noise, etc.
Voice and Speech
Formed by the vocal tract

Vowels (a, e, i, …) have a harmonic
spectrum characterized by formants
(spectral peaks at certain kHz frequencies)

Consonants
  harmonic: voiced consonants such as nasals (b, m, n, j, …)

  transients (k, p, t)

  continuous noise (f, s, …)

https://en.wikipedia.org/wiki/Articulation_(phonetics)
Hearing
Anatomy of the Ear

How the Ear Works

https://www.youtube.com/watch?v=pCCcFDoyBxM
How the Ear Works
The cochlea performs a kind of Fourier analysis
  different frequencies are distributed along the basilar
  membrane and sensed by different hair cells

What matters for sound perception is the instantaneous spectral
content over a short time window (ca. 5–50 ms)

[Figure: signal waveform and its sonogram]

Example: "cochlea"
   from http://www.neuroreille.com/promenade/english/sound/fsound.htm
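The 5–50 ms analysis window can be made concrete with a framing helper, the first step of any short-time (sonogram) analysis. The 25 ms frame length and 8 kHz sampling rate are illustrative choices.

```python
def frames(samples, fs=8000, frame_ms=25):
    """Split a signal into consecutive short analysis frames (here 25 ms),
    roughly the time scale over which the spectrum is analysed."""
    step = int(fs * frame_ms / 1000)
    return [samples[i:i + step]
            for i in range(0, len(samples) - step + 1, step)]

one_second = [0.0] * 8000
print(len(frames(one_second)))  # → 40 frames of 25 ms each
```

Computing a spectrum per frame, as in the DFT sketch earlier, yields the columns of a sonogram.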

Distance to Listener
Relationship between sound intensity and the distance to the listener

Inverse-square law
  The intensity varies inversely with the
  square of the distance from the source.
  So if the distance from the source is
  doubled (increased by a factor of 2),
  then the intensity is quartered
  (decreased by a factor of 4).
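The inverse-square law in one line (the function name is our own):

```python
def relative_intensity(distance_ratio):
    """Inverse-square law: intensity scales as 1 / r^2, so relative
    intensity after moving to distance_ratio times the distance."""
    return 1.0 / distance_ratio ** 2

print(relative_intensity(2))  # doubling the distance → intensity 0.25
print(relative_intensity(3))  # tripling the distance → one ninth
```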

Sound Localization
Humans have two ears
  this lets us localize sound in space

Sound can be localized
using 3 coordinates
  azimuth, elevation,
  distance

Sound Localization
Azimuth Cues
  Difference in the time at which sound reaches the two ears
    Interaural time difference (ITD)
  Difference in the sound intensity reaching the two ears
    Interaural level difference (ILD)

Elevation Cues
  Monaural cues derived from the pinna (ear shape)
    Head-related transfer function (HRTF)

Range Cues
  Differences in the sound relative to the range from the observer
  Head movements (otherwise ITD and ILD stay the same)
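The ITD can be sketched with a common far-field textbook approximation, ITD ≈ d·sin(θ)/c. This formula and the 18 cm head width are standard simplifications, not the course's model; real heads need the HRTF.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at room temperature
HEAD_WIDTH = 0.18       # assumed ear-to-ear distance in metres

def itd_seconds(azimuth_deg):
    """Far-field interaural time difference: ITD ≈ d * sin(theta) / c.
    0° = straight ahead, 90° = directly to one side."""
    return HEAD_WIDTH * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND

print(round(itd_seconds(90) * 1e6))  # source at the side: ≈ 525 µs
print(itd_seconds(0))                # source straight ahead: 0.0
```

Note that θ and 180° − θ give the same ITD, which is one reason for the front/back confusions discussed below.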
Sound Localization

https://www.youtube.com/watch?v=FIU1bNSlbxk
Sound Localization (Azimuth Cues)

                     Interaural Time Difference (ITD)

                     Interaural Level Difference (ILD)

HRTF (Elevation Cue)
The pinna and head shape affect frequency intensities

Sound intensities are measured with microphones in the ear and
compared to the intensities at the sound source
  The difference is the HRTF; it gives a clue to the sound source's location

Accuracy of Sound Localization
People can locate sound
  most accurately in front of them
     2–3° error in front of the head
  least accurately to the sides of and behind the head
     up to 20° error to the side of the head
     the largest errors occur at elevations above/below the head and behind it

Front/back confusion is an issue
  up to 10% of sounds presented in front are perceived as coming
  from behind, and vice versa (more with headphones)

Butean, A., Bălan, O., Negoi, I., Moldoveanu, F., & Moldoveanu, A. (2015). Comparative research on sound
localization accuracy in the free-field and virtual auditory displays. In Conference Proceedings of
eLearning and Software for Education (eLSE) (No. 01, pp. 540–548). Universitatea Nationala de Aparare Carol I.
Sound Synthesis
Abstract algorithms
  additive: sum a number of sinusoids (+ noise)
  subtractive: shape a white-noise spectrum with filters
  any computational signal generator: wavetable, FM, etc.

Physically based models – simulate vibrations in a material
  excitation: impact, friction, air flow, …
  resonances of the object
  damping in the material

Sound envelope
  design choice: stationary vs. variable amplitude/spectrum
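Additive synthesis was sketched earlier under tonal sounds; subtractive synthesis can be sketched as white noise shaped by a filter. The one-pole low-pass below is the simplest possible filter choice, an illustrative assumption rather than anything from the lecture.

```python
import random

def subtractive_noise(n_samples, alpha=0.1, seed=1):
    """Subtractive synthesis sketch: white noise shaped by a one-pole
    low-pass filter y[t] = alpha*x[t] + (1-alpha)*y[t-1].
    Smaller alpha removes more high frequencies ("darker" sound)."""
    rng = random.Random(seed)
    y, out = 0.0, []
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)        # white-noise sample
        y = alpha * x + (1 - alpha) * y   # low-pass filtering
        out.append(y)
    return out

dark = subtractive_noise(1000, alpha=0.05)   # heavily filtered
bright = subtractive_noise(1000, alpha=0.9)  # nearly unfiltered
```

With the same noise input, the heavily filtered version carries much less energy, since most of the white-noise spectrum has been removed.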

https://en.wikipedia.org/wiki/Category:Sound_synthesis_types
Sound Reproduction
Loudspeakers
  Single sound source at one point
  Stereo – two speakers; sound may move along the left–right axis
  Multichannel – more directional / spatial effects
     Surround sound in theaters, horizontal directionality (L–R, front–rear)
     General spatialization (next slide)

Headphones
  Normal stereo signal – L/R directionality
     sound may appear to come from inside the head
  Spatialized signal obtained by filtering with an HRTF (head-related transfer function)

Vector Base Amplitude Panning
An extension of the usual stereo panning: divide a virtual sound source's
signal among the three loudspeakers nearest to the source direction

Can be realized with any number of loudspeakers
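VBAP generalizes stereo amplitude panning to 3D; its two-speaker special case can be sketched as constant-power panning. The sine/cosine mapping below is one common stereo convention, not the VBAP formulation itself.

```python
import math

def constant_power_pan(position):
    """Two-speaker amplitude panning (the stereo special case of
    vector-base panning). position: -1.0 = full left, 0.0 = centre,
    +1.0 = full right. Returns (left_gain, right_gain)."""
    angle = (position + 1.0) * math.pi / 4.0   # map [-1, 1] -> [0, pi/2]
    return math.cos(angle), math.sin(angle)

l, r = constant_power_pan(0.0)
print(round(l, 4), round(r, 4))   # equal gains at the centre
print(round(l * l + r * r, 6))    # total power stays 1 at every position
```

Keeping the summed power constant avoids the loudness dip at the centre that naive linear crossfading produces.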

http://legacy.spa.aalto.fi/research/cat/vbap/
Sound in the user interface
Output: Auditory Display
  analogous to a visual display, but sound is a temporal medium

  sonification = representing information with sound

     continuous sound: system state / object property

     transients: events

Input: Sound Recognition
  analysing the sound structure

  interpreting it as events / input values

more info & some examples: http://www.icad.org/audio.php

Sonic Finder demo – https://vimeo.com/158610127

Example of mapping features to sounds to help navigation

https://sonification.de/handbook/chapters/chapter2/

https://dl.acm.org/doi/10.1145/1978942.1979357

https://www.youtube.com/watch?v=dplpCW-P77o

http://dnasonification.org

Auditory signals work without demanding visual attention

Example: signals at traffic crosswalks
  • https://www.brantfordexpositor.ca/2013/05/23/resident-fed-up-with-audible-crosswalk-beeping/wcm/696e8e66-8e0c-9cc6-88d5-f11a0780175c
  • https://nationalpost.com/news/canada/chirping-sound-at-intersections
  • http://www.apsguide.org/appendix_c_signal.cfm
Sound as input
What sounds to use?
  Active: voice, tactile (e.g. hand clapping), various instruments
  Ambient: environment noises, traffic, footsteps, etc.

Input device: microphone
  highly variable amplitude – auto-adjusting sound level
  ambient noise is often a problem

Analyzing the input signal: find patterns in…
  amplitude variation, envelope
  frequency content (spectrum)
  temporal structure (rhythm)

Interpret the found patterns as
  events
  continuous values
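The "amplitude variation / envelope → events" path above can be sketched as a crude clap detector: frame the signal, track the per-frame peak amplitude, and report frames where it first crosses a threshold. Frame length, threshold, and the test signal are arbitrary illustrative choices.

```python
def detect_onsets(samples, frame_len=64, threshold=0.5):
    """Crude event detector: report the frame index wherever the
    frame-wise peak amplitude first rises above a threshold
    (e.g. a hand clap against a quiet background)."""
    onsets, above = [], False
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        peak = max(abs(s) for s in samples[i:i + frame_len])
        if peak >= threshold and not above:
            onsets.append(i // frame_len)
        above = peak >= threshold
    return onsets

# Quiet signal with two short loud bursts, landing in frames 2 and 6.
sig = [0.0] * 512
for j in range(128, 160):
    sig[j] = 0.9
for j in range(384, 416):
    sig[j] = 0.8
print(detect_onsets(sig))  # → [2, 6]
```

A real microphone signal would also need the auto-adjusting level and ambient-noise handling mentioned above before such a threshold is reliable.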
Voice input
A complex problem, developed over decades

Available as software libraries or web services

Based on analyzing various features of the sound signal
(short-time spectra, envelope, etc.)

  word-based: the input is compared to prototype words; works in
  simple cases

  phoneme-based: more general, detecting phonemes and
  their combinations within words

Has to be tuned for different speakers

Practical issues
Don't limit your ideas to only the most obvious features
(amplitude and frequency)

Pattern recognition gets easier the fewer different options
there are to detect (e.g. just numbers or a few
commands vs. a full language vocabulary)

Sound is a temporal medium – detecting input takes
time (e.g. while driving: turning a physical wheel vs.
pronouncing left/right commands)

Demos
Audio libraries in Processing: Sound and Minim

  examples: Libraries/Sound/Analysis, …/IO,
  …/Soundfile/Keyboard

  Contributed Libraries/Minim/Basics/SynthesizeSound

Sonogram analysis

Theremin controlled with Arduino
  What is a theremin?
  https://www.youtube.com/watch?v=-QgTF8p-284

Next Steps
Get used to the different technologies

  Try out the examples demonstrated in the lectures (camera,
  sound, sensors)

  Arduino packages will be available next week at Aalto
    → check the Doodle form in MyCourses for suitable times

Start thinking about what technologies and/or application cases
would interest you
    → check the questionnaire in MyCourses

  Seek ideas on the net (links in the lecture slides; more will
  appear in MyCourses)
