Temporal Encoding is Required for Categorization, But Not Discrimination - Oxford Academic Journals

Cerebral Cortex, June 2021;31: 2886–2897

doi: 10.1093/cercor/bhaa396
Advance Access Publication Date: 12 January 2021
Original Article

Temporal Encoding is Required for Categorization, But Not Discrimination

Downloaded from https://academic.oup.com/cercor/article/31/6/2886/6082826 by guest on 31 December 2021
Justin D. Yao¹ and Dan H. Sanes¹,²,³,⁴
¹Center for Neural Science, New York University, New York, NY 10003, USA, ²Department of Psychology, New York University, New York, NY 10003, USA, ³Department of Biology, New York University, New York, NY 10003, USA and ⁴Neuroscience Institute, NYU Langone Medical Center, New York University, New York, NY 10016, USA

Address correspondence to Justin D. Yao, Center for Neural Science, 4 Washington Place, Room 621, New York, NY 10003, USA. Email: jdyao@nyu.edu.

Abstract
Core auditory cortex (AC) neurons encode slow fluctuations of acoustic stimuli with temporally patterned activity. However,
whether temporal encoding is necessary to explain auditory perceptual skills remains uncertain. Here, we recorded from
gerbil AC neurons while they discriminated between a 4-Hz amplitude modulation (AM) broadband noise and AM rates
>4 Hz. We found a proportion of neurons possessed neural thresholds based on spike pattern or spike count that were
better than the recorded session’s behavioral threshold, suggesting that spike count could provide sufficient information
for this perceptual task. A population decoder that relied on temporal information outperformed a decoder that relied on
spike count alone, but the spike count decoder still remained sufficient to explain average behavioral performance. This
leaves open the possibility that more demanding perceptual judgments require temporal information. Thus, we asked
whether accurate classification of different AM rates between 4 and 12 Hz required the information contained in AC
temporal discharge patterns. Indeed, accurate classification of these AM stimuli depended on the inclusion of temporal
information rather than spike count alone. Overall, our results compare two different representations of time-varying
acoustic features that can be accessed by downstream circuits required for perceptual judgments.

Key words: amplitude modulation, auditory cortex, auditory discrimination, rate code, temporal code

Introduction

Perceptual judgments depend on neural responses that are unique to individual sensory stimuli. The neural representation can be as simple as the total spike count, or it can take on a more complicated form as the temporal distribution of spikes (temporal code). For example, visual and somatosensory cortex neurons encode behaviorally relevant stimulus parameters with a spike count code that provides sufficient information to guide perceptual acuity (Tolhurst et al. 1983; Parker and Hawken 1985; Bradley et al. 1987; Britten et al. 1992; Hernández et al. 2000; Salinas et al. 2000; Luna et al. 2005). However, most natural sounds are composed of time-varying intensity fluctuations, from slow (∼1 Hz) to fast (>100 Hz), suggesting that a temporal pattern of activity may be required to perform fine perceptual judgments. For example, our ability to distinguish the pitch of a musical instrument must be encoded in the temporal domain, at least in the auditory brainstem (see Bidelman 2013 for review). Here, we ask whether temporal encoding by core auditory cortex (AC) neurons is necessary to explain behavioral acuity in animals that are discriminating sounds based on the modulation rate of time-varying intensity fluctuations.

Modulation of signal amplitude is a fundamental acoustic cue that is present in speech, nonhuman vocalizations, and many other natural sounds (Shannon et al. 1995; Wang 2000; Singh and Theunissen 2003; Zeng et al. 2005; Elliott and Theunissen 2009). Although neural responses to amplitude modulated (AM) sounds are well characterized (Joris et al. 2004; Malone et al. 2010), their relationship to perceptual judgments is less certain.

Published by Oxford University Press 2021.
This work is written by US Government employees and is in the public domain in the US.

For very fast AM rates, core AC neurons are unable to synchronize to the stimulus, and must encode these stimuli with a spike count code (Yao and Sanes 2018). At intermediate AM rates, described perceptually as “flutter” (Miller and Taylor 1948), AC neurons can provide a sufficient representation through either spike count or temporal codes (Joris et al. 2004; Bendor and Wang 2007). In fact, the discrimination of large differences between temporal fluctuation rates in the flutter range may rely on an AC neuron spike count code (Lemus et al. 2009). In contrast, the peak of the AM spectrum of speech is quite slow at ∼4 Hz (Ding et al. 2017). Thus, AC neuron temporal encoding could easily account for auditory discrimination of slow time-varying fluctuations.

Here, we ask whether temporal encoding is necessary to explain behavioral acuity in animals performing an AM rate discrimination task. We recorded from gerbil AC neurons telemetrically while they discriminated between a 4-Hz AM broadband noise and AM rates >4 Hz. We found that a proportion of AC units displayed spike count AM discrimination thresholds that were superior to behavioral thresholds, suggesting that spike count is sufficiently informative to explain perceptual acuity. Similarly, a population-level activity decoder based on spike count was sufficient to explain average behavioral AM discrimination, whereas a decoder with access to temporal discharge information outperformed the best overall behavioral performance. Finally, we show that temporal coding is likely required to support the accurate classification of AM rates. Overall, our results suggest that discrimination of time-varying acoustic features can be accomplished with a spike count code, but categorization of these same stimuli requires temporal spike pattern information.

Materials and Methods

Experimental Subjects

Three adult gerbils (Meriones unguiculatus, 2 males and 1 female) were weaned from commercial breeding pairs (Charles River), and housed on a 12 h light/dark cycle with free access to food and water unless otherwise noted. All procedures were approved by the Institutional Animal Care and Use Committee at New York University.

Method Details

Behavioral Apparatus

Adult gerbils were placed in a plastic test cage (0.25 × 0.25 × 0.4 m) within a sound-attenuating booth (IAC; internal dimensions: 2.2 × 2 × 2 m) and observed via a closed-circuit monitor. Acoustic stimuli were delivered from a calibrated free-field tweeter (DX25TG0504; Vifa) positioned 1 m directly above the test cage. Sound calibration measurements were made with a 1/4-inch free-field condenser recording microphone (Brüel and Kjaer) placed in the center of the cage. A pellet dispenser (Med Associates Inc.) was connected to a customized 3D printed food tray placed within the test cage, and a nose port was placed on the opposite side. Stimulus, delivery of food pellet rewards (20 mg), and behavioral data acquisition were controlled by a personal computer through custom MATLAB scripts (written by Dr Daniel Stolzberg: https://github.com/dstolz/epsych) and an RZ6 multifunction processor (Tucker-Davis Technologies).

Behavioral Training and Testing

Amplitude modulation (AM) rate discrimination was assessed with a positive reinforcement Go-Nogo appetitive conditioning paradigm, similar to that described previously (von Trapp et al. 2017). Briefly, gerbils were placed on controlled food access and trained to initiate a trial by placing their noses in a cylindrical port that interrupted an infrared beam. Animals were shaped to approach a food tray upon presentation of the “Go” signal (AM rate > 4 Hz), and received a reward (20-mg pellet) from a pellet dispenser (Med Associates Inc.). After learning to consistently initiate Go trials, animals were then trained to repoke upon presentation of the “Nogo” signal (AM rate = 4 Hz). Nogo trials (30% probability) were randomly interleaved with Go trials. During the initial training stage, both the Go and Nogo stimuli consisted of AM frozen broadband noise (25-dB rolloff at 3.5 and 20 kHz) with a modulation depth of 100%, presented at a sound level of 50-dB SPL. In addition, AM stimuli were preceded by a 200-ms onset ramp, followed by an unmodulated period of 200 ms, which then transitioned to AM noise for at least 1000 ms. This resulted in a total stimulus duration of at least 1400 ms.

Trials were scored as a Hit (correctly approaching the food tray during a Go trial), Miss (failing to approach the food tray and repoking during a Go trial), Correct Reject (CR; correctly repoking during a Nogo trial), or False Alarm (FA; incorrectly approaching the food tray on a Nogo trial). Psychometric thresholds were assessed by presenting Go trials across five different AM rates (4.5, 6, 8, 10, and 12 Hz), randomly interleaved with Nogo trials (4 Hz). The percentage of Hits was plotted as a function of AM rate and these psychometric functions were fit with a cumulative Gaussian using Bayesian inference from the open-source package psignifit 4 for MATLAB (Schütt et al. 2016). The fitted distribution of percent correct scores was then transformed to the signal detection metric, d′, by calculating the difference in z-scores of Hit rate versus FA rate (Green and Swets 1966). Hit and FA rates were constrained to floor (0.05) and ceiling (0.95) values to avoid d′ values that approach infinity. Psychometric threshold was defined as the AM rate at which d′ = 1. Only sessions during which the FA rate was ≤30% and the animal performed a minimum of 150 trials were used to track psychometric performance and auditory cortex physiology.

Neurophysiology

Electrophysiological procedures are identical to those of previous studies from our laboratory (Yao and Sanes 2018). Below, we provide a summary of the procedures.

Electrode Implantation

Animals underwent electrode implantation after they were fully trained and three psychometric functions had been obtained that met the criteria of FA rate ≤ 0.30 and maximum d′ ≥ 2. During implantation surgery, the animal was anesthetized with isoflurane/O2, secured on a stereotaxic device (Kopf), and a 16-channel silicon probe array (four shanks with recording sites arranged in a 600 × 600-μm grid; Neuronexus A4x4-4mm-200-200-1250-H16_21mm) was implanted in the left core auditory cortex. The array was fixed to a custom-made microdrive to allow for subsequent advancement across recording sessions, and angled at 25° in the mediolateral plane. Typically, we positioned the rostral-most shank of the array at 3.9 mm rostral and 4.6–4.8 mm lateral to lambda. A ground wire was inserted in the contralateral cortical hemisphere. Animals recovered for at least 1 week before being placed on controlled food access for psychometric testing. At the termination of each experiment, animals were deeply anesthetized with sodium pentobarbital (150 mg/kg) and electrolytic lesions were made through one contact site via passing current (7 mA, 5–10 s). Animals
were then perfused with phosphate-buffered saline and 4% paraformaldehyde. Brains were extracted, postfixed, sectioned on a vibratome (Leica), and stained for Nissl for offline verification that electrode tracks spanned core auditory cortex (Radtke-Schuller et al. 2016).

Data Acquisition

Neural recordings were made from awake, behaving animals during psychometric testing. Extracellular neural activity was acquired via a 15-channel wireless headstage and receiver (model W16, Triangle Biosystems). The analog signals were preamplified and digitized at a 24.414 kHz sampling rate (TB32; Tucker-Davis Technologies). The converted digital signals were then fed via fiber optic link to the RZ5 base station (TDT, Tucker-Davis Technologies) for filtering and processing.

For offline multiunit and single-unit analysis, signals were high-pass filtered (300 Hz), and common average referencing was applied to each individual channel (Ludwig et al. 2009). A spike extraction threshold was set to 4 SDs > noise floor, and an artifact rejection threshold was set to 20 SDs > noise floor. Candidate waveforms were then peak-aligned, hierarchically clustered, and sorted in principal component (PC) space using the MATLAB-based package UltraMegaSort 2000 (Fee et al. 1996; Hill et al. 2011). Well-isolated single units demonstrated a clear separation in PC space, and fewer than 10% refractory period violations. The majority of recording sites contained spikes from several unresolved units and were considered multiunits. Separate analyses of single- versus multiunit populations revealed no systematic differences between them, so the two populations were pooled for all reported analyses.

Neurometric Classifiers

We adopted spike count and pattern classifier analyses to further assess the cortical encoding of AM rates (Machens et al. 2003; Narayan et al. 2006; Wang et al. 2007; Billimoria et al. 2008; Schneider and Woolley 2010; von Trapp et al. 2016; Yao and Sanes 2018). The spike count metric used the overall spike count in response to each AM stimulus (1000 ms), whereas the spike pattern metric utilized Euclidean distance to quantify the dissimilarity between two spike trains in high-dimensional space (van Rossum 2001). Both spike count and spike pattern classifiers were decoded using a leave-one-out template-matching procedure. For each individual unit, test trials consisted of one randomly selected spike train from a Go trial at a particular AM rate (e.g., 4.5, 6, 8, 10, or 12 Hz), and one randomly selected spike train from a Nogo trial (4 Hz). Each Go and Nogo template was composed of all trials other than the test trials. The test trial was assigned to the Go or Nogo template based on the smallest difference in spike count (spike count classifier) or Euclidean distance (spike pattern classifier) between the test and mean of the template trials. For classifying spike patterns, the average discriminability across all units was maximized when spike times were binned at 10 ms (Fig. 3C,D). Thus, all reported spike pattern data were from spike trains binned at 10 ms. Test and template trials were selected randomly, and spike train classification was repeated 1000 times to minimize selection biases.

Classification of trial to template assignments was scored as follows: Go test trials were labeled as “Hits” or “Misses” if they were assigned to the Go or Nogo template, respectively. Likewise, Nogo test trials were labeled “False Alarms” or “Correct Rejections” if they were assigned to the Go or Nogo template, respectively (Fig. 3A). The percentages of Hit and FA scores were calculated across repetitions. The percentage of Hits was fit with a similar cumulative Gaussian as described in the psychometric analysis above. The fitted distribution of percent correct scores was then normalized (Z scored) and converted to a neural classifier-based d′. Neurometric thresholds for individual units were defined as the lowest AM rate that proved significantly different from the Nogo AM rate (4 Hz). Our procedural definition of significant neural AM rate discrimination was identical to that used for behavior, d′ ≥ 1. Thus, the neural threshold was defined as the AM rate at which the neurometric function crossed d′ = 1 (see Fig. 3B).

Population Coding

We used a previously employed linear classifier readout procedure (Yao and Sanes 2018) to assess AM rate discriminability across a population of AC single units. Specifically, a linear classifier was trained to decode responses from a proportion of trials to each stimulus set (e.g., “Go” and “Nogo”; Fig. 7A). Spike count responses from N neurons were counted across 1 ms bins to T trials of S stimuli and formed the population “response vector.” Since the number of trials was unequal across units, we randomly subsampled a proportion of trials (i.e., 14 trials) from each unit. Thirteen of the 14 trials were then randomly sampled (without replacement) across N neurons and averaged to reduce the response vector to length Nbin. To decode overall spike count responses, spike counts were first summed across the bins, which further reduced the length of the response vector and eliminated the temporal dimension. A support vector machine (SVM) procedure was used to fit a linear hyperplane to the data set (“training set”). Cross-validated classification performance was assessed on the remaining single trial (1 of the 14) by computing the number of times this test set was correctly classified and misclassified based on the linear hyperplane across 500 iterations, with newly drawn random train and test sets for each iteration. Performance metrics included the proportion of correctly classified Go trials (“Hits”) and misclassified Nogo trials (“False Alarms”). Similar to the psychometric and individual unit neurometric analyses, we converted population decoder performance metrics into d′ values. Decoding readout performance was assessed as a function of the number of single units (Fig. 7B–F). The SVM procedure was implemented in MATLAB using the “fitcsvm” and “predict” functions with the “KernelFunction” set to “linear.”

Experimental Design and Statistical Analysis

Each experiment was performed once with technical replication occurring for behavioral data only (i.e., each animal was tested psychometrically multiple times), and all measures were subject to biological replication. Statistical analyses and procedures were implemented in JMP 13.2.0 (SAS) or custom-written MATLAB scripts (MathWorks) that incorporated the MATLAB Statistics Toolbox. For normally distributed data (as assessed by the Lilliefors test), data are reported as mean ± SEM unless otherwise stated. When data were not normally distributed, the nonparametric Wilcoxon signed-rank test was used when appropriate.

Results

Psychometric Sensitivity to AM Rate

In order to simultaneously record from auditory cortex neurons during behavior, we first trained gerbils (n = 3) on a
Go-Nogo AM rate discrimination task. Figure 1A displays an example psychometric function for one test session from one animal. AM discrimination thresholds were taken as the lowest AM rate corresponding to d′ = 1 from the fitted psychometric function. Across our three animals, AM discrimination thresholds were similar (two-way mixed model ANOVA; F(2,58) = 0.38, P = 0.69) (Fig. 1B), and the average AM rate discrimination threshold across all animals and sessions was 4.87 ± 0.02 Hz (relative to the 4 Hz Nogo stimulus). For each animal, thresholds were not statistically correlated with session number (Spearman’s correlation; Gerbil 1, r = 0.41, P = 0.05; Gerbil 2, r = 0.27, P = 0.09; Gerbil 3, r = 0.07, P = 0.60). We also measured lapse rate, or the probability of a Miss on the easiest Go signals (i.e., 12-Hz trials). Lapse rate has been used as a proxy for task engagement and motivation, as unmotivated animals tend to miss easy Go trials. No between-animal differences were observed for FA rate (two-way mixed model ANOVA; F(2,58) = 3.2, P = 0.50), lapse rate (two-way mixed model ANOVA; F(2,58) = 1.17, P = 0.32), maximum d′ (two-way mixed model ANOVA; F(2,58) = 2.72, P = 0.07), or reported “Yes” (two-way mixed model ANOVA; F(2,58) = 0.06, P = 0.94) (Fig. 1C).

Figure 1. Behavioral performance on the AM rate discrimination task. (A) Exemplar fit psychometric function obtained from one gerbil during one session. Horizontal black dashed line indicates discrimination threshold relative to the 4 Hz Nogo signal (d′ = 1). (B) Individual (symbols) psychometric thresholds across session number. Each symbol type corresponds to each individual gerbil. Average psychometric threshold from all animals (horizontal bar) is plotted. (C) Distribution of FA rate, lapse rate, maximum d′, and report “Yes” are plotted for each animal.

Behavioral Performance More Closely Matches Neural Sensitivity Based on Temporal Spike Patterns

Recorded physiological data (Fig. 2A) were preprocessed to extract candidate waveforms for offline spike sorting procedures (see Materials and Methods). Principal component (PC) clustering (Fig. 2B) was used to further sort the extracted waveforms into clusters classified as single- or multiunits. Figure 2C displays example raster plots and corresponding poststimulus-time histograms (PSTHs) for one unit in response to each AM rate presented during task performance. We recorded from a total of 463 units (gerbil 1: 102, gerbil 2: 104, gerbil 3: 257), of which 98 (21%) were classified as single units.

Figure 2. Candidate waveform selection for neurometric analyses. (A) Raw waveform trace of evoked neural response to AM stimulus. Red line represents selection criteria of >4 SDs above the noise floor. (B) Principal component analysis plot where two waveform clusters (blue and orange) are separated. Raw waveforms and averages from two waveform clusters are displayed above. (C) Example rasters and PSTHs from one unit in response to each AM stimulus. Bin width: 10 ms.

To assess the neural sensitivity of each unit, we applied a template-matching classifier analysis that calculates performance for each unit based on the similarity of spike count and spike pattern to a template (see Materials and Methods; Fig. 3A). Neural sensitivity was quantified by a d′ metric that signifies the statistical difference between neural responses evoked by 4 Hz (Nogo signal) versus each Go signal (4.5–12 Hz). Figure 3B displays two neurometric functions, calculated by temporal spike pattern (green) and spike count (magenta) across 1000 ms stimulus duration, from one individual unit. For this example unit, d′ values were greater when calculated from the spike pattern metric compared to the spike count metric, suggesting spike pattern yields greater sensitivity.

To assess which template-matching classifier metric yielded overall greater sensitivity, we compared each unit’s best (i.e., maximum) spike pattern d′ with its corresponding best spike count d′. These metrics were calculated across the entire 1000 ms stimulus duration. Across our population of recorded units, best spike pattern d′ (mean ± SE: 1.54 ± 0.04) was significantly higher than best spike count d′ (mean ± SE: 0.94 ± 0.02) (two-tailed t-test; P < 0.0001, t = 15.6) (Fig. 3C). To further examine neural sensitivity between spike pattern and spike count metrics, we compared each unit’s “neural threshold” extracted from spike pattern and spike count neurometric functions (Fig. 3D). We found that 16% of units produced neural thresholds based on spike count, whereas 58% of units produced neural thresholds based on spike pattern. Of the units with neural thresholds from either spike pattern or spike count, spike pattern neural thresholds were significantly lower than spike count neural thresholds (Wilcoxon signed-rank test; P < 0.0001; spike pattern median threshold: 4.54 Hz; spike count median threshold: 10.2 Hz) (Fig. 3E). Together, these results indicate that the temporal spike pattern of cortical responses provides greater
Figure 3. Quantifying AM rate sensitivity with AC spike count and pattern. (A) Schematic of template-matching classification procedure. Spike count and spike pattern
classifiers were decoded using a leave-one-out template-matching procedure. For each unit, test trials consisted of one randomly selected spike train from a Go trial
at a particular AM rate (e.g., 4.5, 6, 8, 10, or 12 Hz), and one randomly selected spike train from a Nogo trial (4 Hz). Each template was composed of all other trials
other than the test trial. The test trial was assigned to the Go or Nogo template based on the smallest difference in spike count (spike count classifier) or Euclidean
distance (spike pattern classifier) between the test and mean of the template trials. This classification procedure was repeated 1000 times to minimize selection
biases. See Methods for details. (B) Exemplar fit neurometric function from one unit based on spike pattern (green) and spike count (magenta) classification across
1000-ms stimulus duration. Horizontal black dashed line indicates discrimination threshold relative to the 4-Hz Nogo signal (d = 1). Corresponding thresholds for each
classification metric are indicated by vertical dashed lines (spike pattern, green; spike count, magenta). (C) Scatter plot of best spike pattern d versus best spike count
d across all individual units (circles). Histograms plot the distribution of best spike pattern and spike count d . Inset: Average spike pattern best d (±SEM) as a function
of bin width for all units. (D) Scatter plot of neural thresholds based on spike pattern and spike count metrics. Histograms plot the distribution of neural thresholds
based on spike pattern and spike count. Inset: Average spike pattern neural threshold (±SEM) as a function of bin width for all units. (E) Cumulative distribution of
thresholds for each classification metric. Vertical gray bar represents the average behavioral threshold. See text for statistical details.
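
The leave-one-out template-matching procedure described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy binned spike trains, and the tie-breaking rule are all hypothetical, and the sketch holds out every trial once instead of drawing random Go/Nogo test pairs 1000 times. Each trial is a list of spike counts per time bin.

```python
import math

def pattern_distance(test, trials):
    """Spike pattern metric: Euclidean distance to the bin-by-bin mean template."""
    template = [sum(bins) / len(trials) for bins in zip(*trials)]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(test, template)))

def count_distance(test, trials):
    """Spike count metric: difference between total count and mean template count."""
    mean_count = sum(sum(t) for t in trials) / len(trials)
    return abs(sum(test) - mean_count)

def classify_trial(test, go_trials, nogo_trials, distance):
    """Assign the test trial to the nearer template (ties go to Nogo; an assumption)."""
    d_go = distance(test, go_trials)
    d_nogo = distance(test, nogo_trials)
    return "go" if d_go < d_nogo else "nogo"

def leave_one_out_accuracy(go_trials, nogo_trials, distance):
    """Hold out each trial in turn; templates use all remaining trials."""
    correct = 0
    for i, test in enumerate(go_trials):
        rest = go_trials[:i] + go_trials[i + 1:]
        correct += classify_trial(test, rest, nogo_trials, distance) == "go"
    for i, test in enumerate(nogo_trials):
        rest = nogo_trials[:i] + nogo_trials[i + 1:]
        correct += classify_trial(test, go_trials, rest, distance) == "nogo"
    return correct / (len(go_trials) + len(nogo_trials))

# Hypothetical toy data: every trial has 4 spikes, but Go and Nogo differ in timing.
go = [[2, 0, 2, 0], [2, 0, 2, 0], [2, 0, 1, 1]]
nogo = [[0, 2, 0, 2], [0, 2, 0, 2], [1, 1, 0, 2]]

pattern_acc = leave_one_out_accuracy(go, nogo, pattern_distance)
count_acc = leave_one_out_accuracy(go, nogo, count_distance)
```

In this toy case the two stimuli evoke identical spike counts, so only the pattern metric separates them; the count metric falls back on the tie rule and performs at chance.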

neural sensitivity than spike count, which may be utilized for stimulus-driven behavioral
performance.
    To examine whether temporal spike patterns or overall spike count evoked by the AM rates are
sufficient to explain behavioral performance, we quantified the relationship between neural and
behavioral thresholds by calculating neural/behavioral threshold ratios for each unit. Specifically,
each unit’s spike pattern and spike count neural threshold (Fig. 4A) is directly compared with its
corresponding behavioral threshold from the same session. The distribution of spike pattern neural
(neuralSP)/behavioral threshold and spike count (neuralSC)/behavioral threshold ratios for each
animal are shown in Figure 4B. Behavioral thresholds more closely matched spike pattern neural
thresholds (median neuralSP/behavioral threshold ratio: 0.97) than spike count neural thresholds
(median neuralSC/behavioral threshold ratio: 1.3). Overall, neuralSP/behavioral threshold ratios were
significantly lower than neuralSC/behavioral threshold ratios (Wilcoxon signed-rank test;
P < 0.0001). Spike pattern neural thresholds could be better than behavioral thresholds. This is
illustrated by the greater proportion of units with spike pattern neural thresholds ≤ behavioral
thresholds (neuralSP/behavioral threshold ratio ≤ 1: 0.60, 162/272 units) relative to spike count
neural thresholds ≤ behavioral thresholds (neuralSC/behavioral threshold ratio ≤ 1: 0.21, 17/80
units) (Fig. 4C).
    To examine whether greater neurometric sensitivity based on spike pattern relative to spike
count could be explained by the degree of overall synchrony of each unit’s responses
Figure 4. Behavioral acuity is matched by AC sensitivity. (A) Histogram of neural thresholds from spike pattern (green) and spike count (magenta) metrics. Vertical gray
bar represents average behavior threshold. (B) Relationship between AC activity and behavior quantified as the ratio between neural and behavior thresholds (NT/BT)
from the same recorded sessions. Vertical lines represent median ratio values. (C) Proportion of individual units with ratio values ≤1. Neural classification metrics
were calculated across 1000-ms stimulus duration. See text for statistical details.
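
The neural/behavioral threshold (NT/BT) ratio summarized in this figure is simple arithmetic, sketched below in Python. The threshold values are hypothetical and in arbitrary units, chosen only to show the computation; the function names are likewise assumptions. A ratio at or below 1 means the unit's neural threshold is at least as good as the behavioral threshold from the same session.

```python
from statistics import median

def threshold_ratios(neural, behavioral):
    """Per-unit ratio of neural threshold to same-session behavioral threshold."""
    return [n / b for n, b in zip(neural, behavioral)]

def proportion_at_or_below_one(ratios):
    """Fraction of units whose neural threshold matches or beats behavior."""
    return sum(r <= 1 for r in ratios) / len(ratios)

# Hypothetical thresholds for four units (arbitrary units).
neural_thresholds = [0.5, 1.0, 1.5, 3.0]
behavioral_thresholds = [1.0, 1.0, 1.0, 2.0]

ratios = threshold_ratios(neural_thresholds, behavioral_thresholds)
median_ratio = median(ratios)
proportion = proportion_at_or_below_one(ratios)
```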

to AM rates, we compared each unit’s best vector strength against its best spike pattern d′
(Fig. 5A) and best spike count d′ (Fig. 5B). Vector strength represents the strength of stimulus
synchrony and ranges from 0 (no synchrony) to 1 (all spikes occur at an identical phase) (Goldberg
and Brown 1969). We found that best spike pattern d′ possessed a significant positive correlation
with corresponding best vector strength (linear regression; R² = 0.30, P < 0.0001), whereas best
spike count d′ had a near-zero correlation with best vector strength (linear regression; R² = 0.01,
P > 0.05). This demonstrates that the synchronous patterns of neural responses evoked by the
presented AM rates are a strong factor driving the neurometric sensitivity.
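
Vector strength, as conventionally defined, treats each spike as a unit vector at its phase within the modulation cycle and takes the length of the mean vector. The Python sketch below illustrates that standard definition; the function name, the spike times, and the convention of returning 0 for an empty spike train are assumptions for the example, not details taken from the paper's Methods.

```python
import math

def vector_strength(spike_times_s, mod_freq_hz):
    """Length of the mean phase vector: 0 = no synchrony, 1 = identical phase."""
    if not spike_times_s:
        return 0.0  # convention assumed here for empty spike trains
    phases = [2 * math.pi * mod_freq_hz * t for t in spike_times_s]
    cos_sum = sum(math.cos(p) for p in phases)
    sin_sum = sum(math.sin(p) for p in phases)
    return math.hypot(cos_sum, sin_sum) / len(phases)

# Hypothetical spike times (seconds) for a 4-Hz modulation (250-ms cycle).
locked = [0.0, 0.25, 0.5, 0.75]          # one spike per cycle, same phase
spread = [0.0, 0.0625, 0.125, 0.1875]    # four evenly spaced phases in one cycle

vs_locked = vector_strength(locked, 4.0)
vs_spread = vector_strength(spread, 4.0)
```

The phase-locked train gives a value near 1 and the evenly spread train gives a value near 0, matching the two ends of the range described in the text.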

Figure 5. Greater neurometric sensitivity based on spike pattern relative to spike count could be
explained in part by the degree of overall synchrony of each unit’s responses to AM rates. (A)
Scatter plot of each individual unit’s best spike pattern d′ versus its corresponding best vector
strength value. (B) Scatter plot of each individual unit’s best spike count d′ versus its
corresponding best vector strength value. Black lines represent linear fits of the data. See text
for statistical details.

An Auditory Cortex Population Readout Reveals Complementary Codes for AM Rate Discrimination
Our current findings suggest that neural sensitivity based on stimulus-driven temporal spike
patterns for individual units correlates more closely to behavioral performance than neural
sensitivity based on overall spike count. To assess whether population-level encoding follows a
similar neural code that contributes to behavior, we constructed linear classifiers using support
vector machines (SVM) (see Materials and Methods). Briefly, Go versus Nogo AM rate discriminability
was calculated across our recorded single-unit population (n = 98) with a linear population readout
scheme. Our population linear classifiers were trained to decode responses from a proportion of
trials to each individual Go versus Nogo stimulus pair (Fig. 6A). Similar to the individual
unit-by-unit template-matching classifier scheme, the parameters of our linear classifier (i.e.,
comparing populations of each individual Go signal vs. the Nogo signal) were chosen because the
animal’s goal was to indicate and report whether the Go signal differed from the Nogo (4 Hz). To
decode population responses, spike trains from all neurons were organized across 1 ms bins
throughout the full stimulus duration (1000 ms) for all trials. Thus, the SVM was given access to
spiking information across the entire temporal domain in order to fit a linear hyperplane that best
segregated the training data set. Additionally, the SVM was given only overall spike count
information (i.e., spike counts were summed across all bins throughout the entire stimulus
duration) to fit its appropriate linear hyperplane. This reduced the length of the response vector
and eliminated the temporal dimension. Cross-validated classification performance was assessed
across 500 iterations and labeled as spike pattern and spike count readouts based on whether or not
information within the temporal domain was present for the SVM, respectively. Overall, population
decoding performance was assessed as a function of the number of units used in the linear
population readout by applying a resampling procedure to randomly select a subpopulation of cells
(5–98 at increasing increments of 5) across 250 iterations. During each iteration of the resampling
procedure, a new subpopulation of cells was randomly selected (without replacement) prior to the
decoding readout procedure. Thus, 250 groups of N cells from the entire population were randomly
drawn and 500 sets of trials were randomly drawn.
    Spike pattern and spike count decoding performance for each Go versus Nogo condition is plotted
as a function of the number of cells in Figure 6B–F. Across each stimulus condition, both spike
pattern and spike count decoders displayed greater d′ with increasing cell counts. However, the
spike pattern decoder outperforms the spike count decoder across all conditions (mean ± SEM d′
difference; 4 vs. 4.5 Hz: 1.89 ± 0.05; 4 vs. 6 Hz: 1.57 ± 0.05; 4 vs. 8 Hz: 1.45 ± 0.06; 4 vs.
10 Hz: 1.15 ± 0.07; 4 vs. 12 Hz: 1.10 ± 0.08). The spike pattern decoder reached maximum d′ for all
stimulus conditions at ≥45 cells, whereas spike count decoder performance never reached an
asymptote, suggesting that decoding performance could continue increasing with additional cells. To
compare population coding with behavioral performance, we examined spike pattern and spike count
decoder results relative to the overall best individual or average behavioral performance for each
stimulus condition. Across all stimulus conditions (4 vs. 4.5, 6, 8, 10, and 12 Hz), the spike
pattern decoder performed better than the average and best behavioral d′. The spike count decoder
reached average behavioral d′ and only performed better than the best behavioral d′ for the
near-threshold 4.5 Hz Go condition.

AM Rate Classification Relies on the Temporal Patterns of Cortical Responses
Currently, our results suggest that temporal and rate codes could serve as potential readouts for
AM rate discrimination. This complementary neural code scheme could be most appropriate to our
Go-Nogo AM rate discrimination task where a correct response could be determined based on the
difference in evoked spike count or temporal pattern responses between a Go and Nogo signal. Thus,
in order to distinguish the contribution of temporal versus spike count coding to AM rate
processing, we asked: what sound-driven behavior would rely only on the temporal patterns of
cortical responses? To address this, we predicted that an auditory classification task, where a
subject is required to classify an AM rate stimulus across a number of various AM rates, would
exclusively rely on the temporal patterns of cortical spikes for accurate behavioral performance.
To test this prediction, we performed a template-matching classifier analysis on our current neural
data set that calculates the classification accuracy for each unit based on the similarity of spike
pattern and spike count to different AM rate templates. This is similar to our Go versus Nogo
template-matching classifier analysis presented in the previous sections except test trials are
compared with each of the 6 AM rate signals. Figure 7A displays dot rasters and corresponding PSTHs
to each AM rate stimulus from one example unit. Classification performance based on temporal spike
pattern and spike count from this example unit is displayed by confusion matrix plots in
Figure 7B,C, respectively. For this example unit, AM rate classification based on temporal spike
pattern is near perfect (Fig. 7B), whereas AM rate classification based on spike count is poor
(Fig. 7C). This trend is evident across all units, with the grand mean confusion matrix based on
spike pattern displaying near perfect AM rate classification (Fig. 7D). AM rate classification
based on spike count remains poor (Fig. 7E).
    To quantify classification performance, we considered the unsigned error magnitude (mean
observed RMS error, “RMSE”) for each tested AM rate. Larger RMSE values signify greater error
magnitudes. Figure 7F plots the distribution of RMSE from all units for spike pattern and spike
count classification across each tested AM rate. RMSE values displayed a significant interaction
between neural classification metric (spike pattern and spike
Figure 6. AC population decoder analyses can explain behavioral performance. (A) Assessing population encoding by measuring discriminability with a linear
population readout. See Methods for details. (B–F) Average (±SD) population decoder performance between AM rate Nogo (4 Hz) versus Go (4.5, 6, 8, 10, and 12 Hz)
signals as a function of unit count. Green functions represent average readout performance from a population decoder with access to temporal discharge information.
Magenta functions represent average readout performance from a population decoder based on overall spike count. Solid horizontal lines represent the best behavioral
d′ from all animals and sessions. Dashed horizontal lines represent average behavioral d′ from all animals and sessions. Shaded region represents ±1 SD.
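
The difference between the two population readouts comes down to what each trial's response vector contains. The Python sketch below shows one plausible way to build those vectors and to draw a random subpopulation of units; it is an illustration under assumptions, not the authors' code. The function names and toy data are hypothetical, and the SVM fit itself is omitted (any linear SVM implementation, for instance scikit-learn's `LinearSVC`, could be trained on these vectors).

```python
import random

def pattern_vector(trial_by_unit):
    """Concatenate each unit's binned spike train, keeping temporal information."""
    return [count for unit_bins in trial_by_unit for count in unit_bins]

def count_vector(trial_by_unit):
    """Sum each unit's bins: one value per unit, temporal dimension removed."""
    return [sum(unit_bins) for unit_bins in trial_by_unit]

def draw_subpopulation(n_units_total, n_units, rng):
    """Pick unit indices at random, without replacement."""
    return sorted(rng.sample(range(n_units_total), n_units))

# Hypothetical single trial: 3 units x 4 time bins (the paper used 1-ms bins over 1000 ms).
trial = [[1, 0, 0, 1],
         [0, 2, 0, 0],
         [0, 0, 0, 0]]

temporal_features = pattern_vector(trial)   # length = units x bins
count_features = count_vector(trial)        # length = units

rng = random.Random(0)                      # fixed seed only for reproducibility
subset = draw_subpopulation(98, 5, rng)     # e.g., 5 of the 98 recorded units
```

Repeating the draw for each subpopulation size and iteration, then cross-validating a linear classifier on each feature type, would give curves of the kind plotted in panels B–F.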

count) and AM rate (two-way mixed model ANOVA; F(5,4620) = 107, P < 0.0001). Post hoc two-tailed
t-tests (Holm-Bonferroni-corrected) indicated RMSE values from spike pattern were significantly
lower than RMSE values from spike count across all tested AM rates (mean ± SEM; 4 Hz: spike
pattern = 0.17 ± 0.02, spike count = 2.60 ± 0.04, P < 0.0001, t = 62.1; 4.5 Hz: spike
pattern = 0.45 ± 0.02, spike count = 2.42 ± 0.03, P < 0.0001, t = 51.9; 6 Hz: spike
pattern = 0.57 ± 0.02, spike count = 2.22 ± 0.02, P < 0.0001, t = 51.3; 8 Hz: spike
pattern = 0.66 ± 0.03, spike count = 2.30 ± 0.02, P < 0.0001, t = 50.5; 10 Hz: spike
pattern = 0.95 ± 0.04, spike count = 2.85 ± 0.03, P < 0.0001, t = 30.9; 12 Hz: spike
pattern = 1.17 ± 0.05, spike count = 3.77 ± 0.05, P < 0.0001, t = 36.1). These results demonstrate
that temporal spike pattern is the dominant neural code for the classification of AM rates that
range from 4 to 12 Hz. Furthermore, we found that RMSE grew significantly larger with faster AM
rates (two-way mixed model ANOVA; F(1,924) = 13 245, P < 0.0001). This suggests that the temporal
spike pattern becomes a less reliable code at higher AM rates. To examine the degree to which
classification accuracy improves across increasing dimensions of the data, we compared average
classification RMSE for each AM rate based on temporal pattern as a function of bin width
(Fig. 7G). We found that RMSE significantly increases with increasing bin width (two-way mixed
model ANOVA; F(8,4158) = 2434, P < 0.0001). This suggests that spike pattern can be represented in
a more complex space than a simple spike count measure. Overall, accurate neural classification of
slow AM rates requires temporal spike pattern information.

Discussion
Understanding the relationship between perceptual judgments and the neural representation of
sensory stimuli remains challenging due to the breadth of candidate codes (Perkel and Bullock
1968). To address this question, we simultaneously measured the perceptual ability of gerbils to
discriminate between slow AM rates while recording stimulus-evoked responses from AC neurons. Our
primary goal was to determine whether temporal coding was necessary to explain behavioral acuity.
Here, we report that AC neuron spike count coding is sufficiently informative to explain the
gerbils’ behavioral AM discrimination thresholds. Since temporal coding far outstripped behavior,
we asked whether this information would be required to support a more demanding perceptual task.
In fact, our results

demonstrate temporal coding would be needed for accurate classification of slow AM rates. Below, we
discuss these findings within the context that distinct perceptual capabilities driven by
time-varying acoustic cues likely require separate cortical codes.

A Spike Count Code is Sufficient to Support AM Discrimination
The detection, discrimination, or categorization of envelope cues could be based on either of two
cardinal strategies: a spike count code or some type of temporal code. For auditory cortex, a spike
count code has been proposed to account for AM depth detection threshold, as well as improved
sensitivity as the AM depth increases (Liang et al. 2002; Johnson et al. 2012; Niwa et al. 2012,
2013, 2015; Rosen et al. 2012; von Trapp et al. 2016; Yao and Sanes 2018). In fact, a cortical
spike count code correlates closely with the perceptual acuity of detecting AM stimuli (Niwa et al.
2012, 2013, 2015; von Trapp et al. 2016; Caras and Sanes 2017; Yao and Sanes 2018), despite the
availability of a synchronized discharge pattern that also scales with modulation depth (Eggermont
1994; Middlebrooks 2008a, 2008b; Malone et al. 2010). AC neuron discharge rate can also vary across
a narrow range of modulation frequencies (Schreiner and Urbas 1986, 1988; Schulze and Langner 1997;
Liang et al. 2002). Thus, spike count coding could also support AM discrimination. For example, the
discrimination between temporal fluctuation rates within the flutter range (∼10–50 Hz) is plausibly
explained by an AC neuron spike count code (Lemus et al. 2009).
    Our results indicate that neural AM rate discrimination thresholds based on the overall spike
count are sufficient to account for behavioral thresholds obtained simultaneously during a
recording session (Fig. 4B). Furthermore, a population decoder based on spike count matched, but
did not exceed, the average behavioral performance (Fig. 7). In contrast, when neural thresholds
were based on spike pattern, a greater number of AC unit thresholds exceeded behavioral thresholds
(Fig. 4C). One possible explanation for this disparity is that, with additional training, animals
begin to use this temporal information and reach superior behavioral thresholds. In fact, the
single best psychometric sensitivity displayed during a single session was 4.57 Hz, nearly
identical to that predicted by a temporal coding strategy. Such a scenario could also help to
explain why animals reach exceptional perceptual performance following focused practice on a narrow
task (Recanzone et al. 1992, 1993; Crist et al. 2001; Schoups et al. 2001; Beitel et al. 2003; Bao
et al. 2004; Polley et al. 2006; Yan et al. 2014; Caras and Sanes 2017).
    Another possible explanation for the disparity between neural thresholds and behavioral acuity
is that the integration of sensory encoded information across areas downstream of sensory cortex
could accurately predict how well an animal performs on a given trial (Yao et al. 2020). This is
primarily the case as perceptual judgments emerge from the temporal integration of sensory inputs
downstream of primary sensory cortices (Fassihi et al. 2017). As sensory input ascends the cortical
pathway, the timescale over which neurons encode information increases. For example, neurons within
secondary auditory cortex encode and integrate acoustic information over longer durations than
neurons in primary auditory cortex (Boemio et al. 2005; Bendor and Wang 2007; Scott et al. 2011).
These longer integration times are suggested to correlate with perceptual attributes (DeWitt and
Rauschecker 2012; de Heer et al. 2017). Thus, even if a physical stimulus is encoded accurately
within sensory cortex, the lack or inappropriate integration of such sensory information across
downstream pathways could lead to poorer behavioral performance.

Figure 7. Accurate classification of AM rates requires temporal coding. (A) Example rasters and
PSTHs from one unit in response to each AM stimulus. (B) AM rate decoded with a temporal spike
pattern classifier from the spiking responses from one example unit. (C) AM rate decoded with a
spike count classifier from the spiking responses from one example unit. (D) Same as B except from
the average of all units. (E) Same as C except from the average of all units. (F) Distribution of
root-mean squared error (RMSE) of classification based on temporal spike pattern (green) and spike
count (magenta) for each AM rate. Vertical bars represent population averages. (G) Average
classification RMSE based on temporal spike pattern for each AM rate across bin widths. See text
for statistical details.
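
The RMSE measure reported for AM rate classification (Fig. 7F) can be illustrated with a short Python sketch. The text describes it only as an unsigned error magnitude, so the sketch assumes the error is the difference, in Hz, between the AM rate assigned by the classifier and the rate actually presented. The function name and the decoded values are hypothetical.

```python
import math

def rmse(decoded_rates_hz, presented_rate_hz):
    """Root-mean-square error between decoded and presented AM rate (assumed in Hz)."""
    squared = [(d - presented_rate_hz) ** 2 for d in decoded_rates_hz]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical classifier outputs for four test trials of a 4-Hz stimulus.
decoded_by_pattern = [4.0, 4.0, 4.0, 4.5]   # mostly correct, one near miss
decoded_by_count = [4.0, 8.0, 12.0, 6.0]    # scattered across the tested rates

rmse_pattern = rmse(decoded_by_pattern, 4.0)
rmse_count = rmse(decoded_by_count, 4.0)
```

Under this assumption, confusions between distant AM rates are penalized more than confusions between neighboring rates, which is why a classifier with scattered outputs yields a much larger RMSE.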

Classification Judgments Must Rely on a Temporal Code                  financial, or nonfinancial interest in the subject matter or mate-
                                                                       rials discussed in this manuscript. The authors declare no com-
Although a cortical spike count code is sufficient to explain
                                                                       peting interests.
the detection and discrimination of envelope cues, a temporal
code could be required for more demanding perceptual judg-
ments, such as a feature classification. Previous investigations       Funding
on the neural encoding principles of communication sounds
                                                                       National Institute on Deafness and Other Communication
offer evidence that auditory cortex processing and temporal
                                                                       Disorders at the National Institute of Health (grant numbers
coding underlie perception for complex time-varying acoustic
                                                                       K99DC018600 to J.D.Y.; R01DC011284 to D.H.S.).
cues such as speech and animal vocalizations. First, AC lesions
lead to severely impaired processing of communication sounds
(Heffner and Heffner 1986; Porter et al. 2011). Second, neuro-         References
physiological studies across species demonstrate that natural
                                                                       Bao S, Chang EF, Woods J, Merzenich MM. 2004. Temporal plastic-
vocalization sounds are highly represented by AC discharge

patterns (e.g., Wang et al. 1995; Narayan et al. 2006; Schnupp
et al. 2006; Billimoria et al. 2008; Engineer et al. 2008; Mesgarani
et al. 2008; Russ et al. 2008; Recanzone 2008; Walker et al. 2008;
Huetz et al. 2009; Schneider and Woolley 2013; see Gaucher et al.
2013 for review). Similarly, electrocorticography (ECoG) recordings
from human auditory cortex utilize high-dimensional algorithms
based on temporal signals to decode distinct features
of speech (Mesgarani et al. 2014; Moses et al. 2019; Oganian
and Chang 2019; Yi et al. 2019). Third, temporal coding of such
complex time-varying fluctuations of acoustic cues is correlated
with behavioral performance (Engineer et al. 2008; Schneider
and Woolley 2013). Thus, the spike-timing-based coding strategies
that sufficiently represent complex time-varying acoustic
stimuli could drive perceptual judgments.
    Although our animals performed a discrimination task in the
current study, we asked whether accurate AM rate classification
might require the temporally patterned responses of AC
neurons. We report that precise classification of AM rates in
the 4–12 Hz range could not be accomplished with an AC code
based on spike count alone. Rather, access to a temporal code
is required (Fig. 7). Thus, an important future direction would
be to simultaneously record neural and behavioral measures
underlying the classification of AM stimuli. At the level of the
auditory cortex, it might be the case that the candidate codes for
accurate classification could change. We predict that if behavioral
classification of slow AM rates requires a temporal code,
then behavioral classification accuracy will be high with very
few errors.
    Overall, our findings are consistent with previous sensory
encoding studies that suggest neural information represented
within primary auditory cortex carries complementary and multiplexed
spike count and spike pattern codes that are sufficient
for correct stimulus discrimination and classification (Malone
et al. 2015). Such complementary cortical codes may further
transform to an exclusive spike count code along the ascending
pathway (Yin et al. 2011; Zuo et al. 2015). Furthermore, our
current results build on previous findings that show spike count
coding is sufficient to explain perceptual function and provide
new evidence that the behavioral acuity of discriminating slow,
time-varying fluctuations of acoustic cues could be explained by
an AC spike count code.

Notes
We thank members of the Sanes laboratory for constructive
comments. Conflict of interest: The authors whose names are
listed immediately above certify that they have no affiliations
with or involvement in any organization or entity with any

    ity in the primary auditory cortex induced by operant perceptual
    learning. Nat Neurosci. 7(9):974–981. doi: 10.1038/nn1293.
Beitel RE, Schreiner CE, Cheung SW, Wang X, Merzenich
    MM. 2003. Reward-dependent plasticity in the primary
    auditory cortex of adult monkeys trained to discriminate
    temporally modulated signals. Proc Natl Acad Sci USA.
    100(19):11070–11075. doi: 10.1073/pnas.1334187100.
Bendor D, Wang X. 2007. Differential neural coding of acoustic
    flutter within primate auditory cortex. Nat Neurosci.
    10(6):763–771. doi: 10.1038/nn1888.
Bidelman GM. 2013. The role of the auditory brainstem in processing
    musically relevant pitch. Front Psychol. 4:264. doi:
    10.3389/fpsyg.2013.00264.
Billimoria CP, Kraus BJ, Narayan R, Maddox RK, Sen K. 2008.
    Invariance and sensitivity to intensity in neural discrimination
    of natural sounds. J Neurosci. 28(25):6304–6308. doi:
    10.1523/JNEUROSCI.0961-08.2008.
Boemio A, Fromm S, Braun A, Poeppel D. 2005. Hierarchical and
    asymmetric temporal sensitivity in human auditory cortices.
    Nat Neurosci. 8:389–395.
Bradley A, Skottun BC, Ohzawa I, Sclar G, Freeman RD. 1987.
    Visual orientation and spatial frequency discrimination: a
    comparison of single neurons and behavior. J Neurophysiol.
    57(3):755–772. doi: 10.1152/jn.1987.57.3.755.
Britten KH, Shadlen MN, Newsome WT, Movshon JA. 1992. The
    analysis of visual motion: a comparison of neuronal and
    psychophysical performance. J Neurosci. 12:4745–4765.
    https://www.ncbi.nlm.nih.gov/pubmed/1464765.
Caras ML, Sanes DH. 2017. Top-down modulation of sensory
    cortex gates perceptual learning. Proc Natl Acad Sci USA.
    114(37):9972–9977. doi: 10.1073/pnas.1712305114.
Crist RE, Li W, Gilbert CD. 2001. Learning to see: experience and
    attention in primary visual cortex. Nat Neurosci. 4(5):519–525.
    doi: 10.1038/87470.
de Heer WA, Huth AG, Griffiths TL, Gallant JL, Theunissen FE.
    2017. The hierarchical cortical organization of human speech
    processing. J Neurosci. 37:6539–6557.
DeWitt I, Rauschecker JP. 2012. Phoneme and word recognition
    in the auditory ventral stream. Proc Natl Acad Sci USA.
    109:E505–E514.
Ding N, Patel AD, Chen L, Butler H, Luo C, Poeppel D. 2017.
    Temporal modulations in speech and music. Neurosci Biobehav Rev.
    81(Pt B):181–187. doi: 10.1016/j.neubiorev.2017.02.011.
Eggermont JJ. 1994. Temporal modulation transfer functions
    for AM and FM stimuli in cat auditory cortex. Effects of
    carrier type, modulating waveform and intensity. Hear Res.
    74(1–2):51–66. doi: 10.1016/0378-5955(94)90175-9.
Elliott TM, Theunissen FE. 2009. The modulation transfer function
    for speech intelligibility. PLoS Comput Biol. 5(3):e1000302.
    doi: 10.1371/journal.pcbi.1000302.

Engineer CT, Perez CA, Chen YH, Carraway RS, Reed AC, Shetake
    JA, Jakkamsetti V, Chang KQ, Kilgard MP. 2008. Cortical
    activity patterns predict speech discrimination ability. Nat
    Neurosci. 11(5):603–608. doi: 10.1038/nn.2109.
Fassihi A, Akrami A, Pulecchi F, Schonfelder V, Diamond ME.
    2017. Transformation of perception from sensory to motor
    cortex. Curr Biol. 27:1585–1596.e6.
Fee MS, Mitra PP, Kleinfeld D. 1996. Automatic sorting of multiple
    unit neuronal signals in the presence of anisotropic and
    non-Gaussian variability. J Neurosci Methods. 69:175–188. doi:
    10.1016/S0165-0270(96)00050-7.
Gaucher Q, Huetz C, Gourévitch B, Laudanski J, Occelli F,
    Edeline JM. 2013. How do auditory cortex neurons represent
    communication sounds. Hear Res. 305:102–112. doi:
    10.1016/j.heares.2013.03.011.
Goldberg JM, Brown PB. 1969. Response of binaural neurons of
    dog superior olivary complex to dichotic tonal stimuli: some
    physiological mechanisms of sound localization. J Neurophysiol.
    32:613–636. doi: 10.1152/jn.1969.32.4.613.
Green DM, Swets JA. 1966. Signal detection theory and psychophysics.
    New York: Wiley.
Heffner HE, Heffner RS. 1986. Effect of unilateral and bilateral
    auditory cortex lesions on the discrimination of vocalizations
    by Japanese macaques. J Neurophysiol. 56(3):683–701. doi:
    10.1152/jn.1986.56.3.683.
Hernández A, Zainos A, Romo R. 2000. Neuronal correlates
    of sensory discrimination in the somatosensory
    cortex. Proc Natl Acad Sci USA. 97(11):6191–6196. doi:
    10.1073/pnas.120018597.
Hill DN, Mehta SB, Kleinfeld D. 2011. Quality metrics to
    accompany spike sorting of extracellular signals. J Neurosci.
    31(24):8699–8705. doi: 10.1523/JNEUROSCI.0971-11.2011.
Huetz C, Philibert B, Edeline JM. 2009. A spike-timing code for
    discriminating conspecific vocalizations in the thalamocortical
    system of anesthetized and awake Guinea pigs. J Neurosci.
    29(2):334–350. doi: 10.1523/JNEUROSCI.3269-08.2009.
Johnson JS, Yin P, O’Connor KN, Sutter ML. 2012. Ability of
    primary auditory cortical neurons to detect amplitude modulation
    with rate and temporal codes: neurometric analysis.
    J Neurophysiol. 107(12):3325–3341. doi: 10.1152/jn.00812.2011.
Joris PX, Schreiner CE, Rees A. 2004. Neural processing of
    amplitude-modulated sounds. Physiol Rev. 84(2):541–577. doi:
    10.1152/physrev.00029.2003.
Lemus L, Hernández A, Romo R. 2009. Neural codes for perceptual
    discrimination of acoustic flutter in the primate
    auditory cortex. Proc Natl Acad Sci USA. 106(23):9471–9476. doi:
    10.1073/pnas.0904066106.
Liang L, Lu T, Wang X. 2002. Neural representations of sinusoidal
    amplitude and frequency modulations in the primary auditory
    cortex of awake primates. J Neurophysiol. 87(5):2237–2261.
    doi: 10.1152/jn.2002.87.5.2237.
Ludwig KA, Miriani RM, Langhals NB, Joseph MD,
    Anderson DJ, Kipke DR. 2009. Using a common average
    reference to improve cortical neuron recordings from
    microelectrode arrays. J Neurophysiol. 101(3):1679–1689. doi:
    10.1152/jn.90989.2008.
Luna R, Hernández A, Brody CD, Romo R. 2005. Neural codes for
    perceptual discrimination in primary somatosensory cortex.
    Nat Neurosci. 8(9):1210–1219. doi: 10.1038/nn1513.
Machens CK, Schütze H, Franz A, Kolesnikova O, Stemmler MB,
    Ronacher B, Herz AV. 2003. Single auditory neurons rapidly
    discriminate conspecific communication signals. Nat Neurosci.
    6(4):341–342. doi: 10.1038/nn1036.
Malone BJ, Scott BH, Semple MN. 2010. Temporal codes for amplitude
    contrast in auditory cortex. J Neurosci. 30(2):767–784. doi:
    10.1523/JNEUROSCI.4170-09.2010.
Malone BJ, Scott BH, Semple MN. 2015. Diverse cortical codes for
    scene segmentation in primate auditory cortex. J Neurophysiol.
    113(7):2934–2952. doi: 10.1152/jn.01054.2014.
Mesgarani N, Cheung C, Johnson K, Chang EF. 2014. Phonetic
    feature encoding in human superior temporal gyrus. Science.
    343(6174):1006–1010. doi: 10.1126/science.1245994.
Mesgarani N, David SV, Fritz JB, Shamma SA. 2008. Phoneme
    representation and classification in primary auditory cortex.
    J Acoust Soc Am. 123(2):899–909. doi: 10.1121/1.2816572.
Middlebrooks JC. 2008a. Auditory cortex phase locking to
    amplitude-modulated cochlear implant pulse trains. J Neurophysiol.
    100(1):76–91. doi: 10.1152/jn.01109.2007.
Middlebrooks JC. 2008b. Cochlear-implant high pulse rate and
    narrow electrode configuration impair transmission of temporal
    information to the auditory cortex. J Neurophysiol.
    100(1):92–107. doi: 10.1152/jn.01114.2007.
Miller GA, Taylor WG. 1948. The perception of repeated bursts of
    noise. J Acoust Soc Am. 20:171.
Moses DA, Leonard MK, Makin JG, Chang EF. 2019. Real-time
    decoding of question-and-answer speech dialogue
    using human cortical activity. Nat Commun. 10(1):3096. doi:
    10.1038/s41467-019-10994-4.
Narayan R, Graña G, Sen K. 2006. Distinct time scales in cortical
    discrimination of natural sounds in songbirds. J Neurophysiol.
    96(1):252–258. doi: 10.1152/jn.01257.2005.
Niwa M, Johnson JS, O’Connor KN, Sutter ML. 2012. Active
    engagement improves primary auditory cortical neurons’
    ability to discriminate temporal modulation. J Neurosci.
    32(27):9323–9334. doi: 10.1523/JNEUROSCI.5832-11.2012.
Niwa M, Johnson JS, O’Connor KN, Sutter ML. 2013. Differences
    between primary auditory cortex and auditory belt
    related to encoding and choice for AM sounds. J Neurosci.
    33(19):8378–8395. doi: 10.1523/JNEUROSCI.2672-12.2013.
Niwa M, O’Connor KN, Engall E, Johnson JS, Sutter ML.
    2015. Hierarchical effects of task engagement on amplitude
    modulation encoding in auditory cortex. J Neurophysiol.
    113(1):307–327. doi: 10.1152/jn.00458.2013.
Oganian Y, Chang EF. 2019. A speech envelope landmark for
    syllable encoding in human superior temporal gyrus. Sci Adv.
    5(11):eaay6279. doi: 10.1126/sciadv.aay6279.
Parker A, Hawken M. 1985. Capabilities of monkey cortical cells
    in spatial-resolution tasks. J Opt Soc Am A. 2(7):1101–1114. doi:
    10.1364/josaa.2.001101.
Perkel DH, Bullock TH. 1968. Neural coding. Neurosci Res Program
    Bull. 6(3):221–348.
Polley DB, Steinberg EE, Merzenich MM. 2006. Perceptual
    learning directs auditory cortical map reorganization
    through top-down influences. J Neurosci. 26(18):4970–4982.
    doi: 10.1523/JNEUROSCI.3771-05.2006.
Porter BA, Rosenthal TR, Ranasinghe KG, Kilgard MP. 2011.
    Discrimination of brief speech sounds is impaired in rats with
    auditory cortex lesions. Behav Brain Res. 219(1):68–74. doi:
    10.1016/j.bbr.2010.12.015.
Radtke-Schuller S, Schuller G, Angenstein F, Grosser OS,
    Goldschmidt J, Budinger E. 2016. Brain atlas of the Mongolian
    gerbil (Meriones unguiculatus) in CT/MRI-aided stereotaxic
    coordinates. Brain Struct Funct. 221(Suppl 1):1–272. doi:
    10.1007/s00429-016-1259-0.
Recanzone GH. 2008. Representation of con-specific vocalizations
    in the core and belt areas of the auditory cortex in