Towards optimizing electrode configurations for silent speech recognition based on high-density surface electromyography

Page created by Beth Gibbs
 
CONTINUE READING
Towards optimizing electrode configurations for silent speech recognition based on high-density surface electromyography
Journal of Neural Engineering

PAPER • OPEN ACCESS

Towards optimizing electrode configurations for silent speech recognition
based on high-density surface electromyography
To cite this article: Mingxing Zhu et al 2021 J. Neural Eng. 18 016005

View the article online for updates and enhancements.

                               This content was downloaded from IP address 46.4.80.155 on 11/10/2021 at 02:41
Towards optimizing electrode configurations for silent speech recognition based on high-density surface electromyography
J. Neural Eng. 18 (2021) 016005                                                           https://doi.org/10.1088/1741-2552/abca14

                              Journal of Neural Engineering

                              PAPER

                              Towards optimizing electrode configurations for silent speech
OPEN ACCESS
                              recognition based on high-density surface electromyography
RECEIVED
22 July 2020                  Mingxing Zhu1,2,3, Haoshi Zhang1,2,3, Xiaochen Wang1,2, Xin Wang1,2, Zijian Yang1, Cheng Wang1,2,
REVISED                       Oluwarotimi Williams Samuel1, Shixiong Chen1 and Guanglin Li1
27 October 2020
                              1
ACCEPTED FOR PUBLICATION
                                  CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy
12 November 2020                  of Sciences, Shenzhen 518055, People’s Republic of China
                              2
                                  Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, People’s Republic of China
PUBLISHED                     3
25 January 2021                   The first two authors contributed equally to the work.
                              E-mail: sx.chen@siat.ac.cn and gl.li@siat.ac.cn
Original content from
this work may be used
                              Keywords: electrode placement optimization, high-density surface electromyography, sequential forward selection algorithm,
under the terms of the        silent speech recognition
Creative Commons
Attribution 4.0 licence.

Any further distribution
of this work must
                              Abstract
maintain attribution to       Objective. Silent speech recognition (SSR) based on surface electromyography (sEMG) is an
the author(s) and the title
of the work, journal          attractive non-acoustic modality of human-machine interfaces that convert the neuromuscular
citation and DOI.
                              electrophysiological signals into computer-readable textual messages. The speaking process
                              involves complex neuromuscular activities spanning a large area over the facial and neck muscles,
                              thus the locations of the sEMG electrodes considerably affected the performance of the SSR system.
                              However, most of the previous studies used only a quite limited number of electrodes that were
                              placed empirically without prior quantitative analysis, resulting in uncertainty and unreliability of
                              the SSR outcomes. Approach. In this study, the technique of high-density sEMG was proposed to
                              provide a full representation of the articulatory muscle activities so that the optimal electrode
                              configuration for SSR could be systemically explored. A total of 120 closely spaced electrodes were
                              placed on the facial and neck muscles to collect the high-density sEMG signals for classifying ten
                              digits (0–9) silently spoken in both English and Chinese. The sequential forward selection
                              algorithm was adopted to explore the optimal electrodes configurations. Main Results. The results
                              showed that the classification accuracy increased rapidly and became saturated quickly when the
                              number of selected electrodes increased from 1 to 120. Using only ten optimal electrodes could
                              achieve a classification accuracy of 86% for English and 94% for Chinese, whereas as many as 40
                              non-optimized electrodes were required to obtain comparable accuracies. Also, the optimally
                              selected electrodes seemed to be mostly distributed on the neck instead of the facial region, and
                              more electrodes were required for English recognition to achieve the same accuracy. Significance.
                              The findings of this study can provide useful guidelines about electrode placement for developing a
                              clinically feasible SSR system and implementing a promising approach of human-machine
                              interface, especially for patients with speaking difficulties.

                              1. Introduction                                                     communication disorders [1–7]. It is worth noting
                                                                                                  that a large number of the existing systems util-
                              Speaking, as a natural form of human communic-                      ize acoustic signals as input for its speech recogni-
                              ation via speech, is a tremendously important way                   tion task, which was widely reported in the literature
                              to engage in social interaction through transmit-                   [8–12]. However, the quality of the acoustic signals
                              ting information, expressing emotions, and convey-                  is easily affected by various factors such as environ-
                              ing intentions. In this regard, automatic speech recog-             mental noises and non-target human speeches, res-
                              nition systems have been developed and used in                      ulting in dramatic declines in recognition accuracies
                              numerous aspects of our current life very success-                  under complex conditions. For instance, in loud noisy
                              fully and widely to enable effective communication                  environments such as the airport, acoustic signals
                              in the context of human-machine interactions and                    of speeches are easily influenced by large-amplitude

                              © 2021 The Author(s). Published by IOP Publishing Ltd
Towards optimizing electrode configurations for silent speech recognition based on high-density surface electromyography
J. Neural Eng. 18 (2021) 016005                                                                           M Zhu et al

environmental noises, making it rather difficult to         of sEMG signals, and accomplished average classific-
obtain acoustic signals with a satisfactory signal          ation accuracies (CAs) of 94.5% and 89.4% in the
to noise ratio. Such degradation in the quality of          healthy and dysarthric volunteers, respectively, using
the acoustic signals would undoubtedly comprom-             a feed-forward neural network classifier with five-
ise the performance of the speech recognition sys-          fold cross-validation [25]. However, most of the stud-
tem. Moreover, in the circumstances where the acous-        ies used myoelectric information from only a few
tic signals cannot be transmitted (such as space and        electrodes for sEMG-based SSR. The placement of
underwater), acoustic signals may be unavailable to         the electrodes was also completely dependent on the
serve as the input for the speech recognition tasks.        knowledge of the individual researchers without prior
Besides, for certain categories of patients with speak-     quantitative analysis or benchmark standard.
ing difficulties (e.g. those with laryngectomee and              The speaking process involves complex neur-
dysphonia), it is impossible for them to properly           omuscular activities over a large group of facial and
use those acoustic-based recognition systems due to         neck muscles, spanning a relatively large area [26–28].
lack of speech input. Thus, these limitations of the        Thus, the use of a few sEMG electrodes that are placed
commonly used acoustic speech recognition systems           empirically may not cover enough regions of muscles
have prompted the need for alternative physiological        and provide essentially adequate information for SSR
information that could release the extensive depend-        in practical applications. Therefore, it is essential to
ency on acoustic signals and might be used for reliable     obtain full knowledge of the sEMG activities over a
speech recognition in silent voice circumstances.           large area covering all the articulatory muscles, and
     Towards addressing the limitations of acoustic         thereafter we could understand which regions con-
signals, surface electromyography (sEMG) that con-          tain the most important information towards a high
sists of electrophysiology information of muscles           accuracy speech recognition [29–33]. The technology
associated with speaking has been considered as an          of high-density sEMG (HD sEMG) uses a 2D closely-
alternative input for automatic speech recognition          spaced electrode array (instead of a few electrodes)
[13–16]. Compared with the acoustic signals, the            to acquire muscle activities over a relatively large
sEMG signals would not be affected by interfer-             portion of the skin surface and therefore provides
ences from the acoustic ambient noises, making it           greater representation of the behaviors of multiple
possible to achieve accurate recognition of human           muscles or muscle groups [34–37]. In the past dec-
speech even in noisy environments. Moreover, the            ades, the HD sEMG technique had been employed in
sEMG signals measured from the articulatory muscles         many studies to provide controlling input for human-
could facilitate the development of human-machine           machine interfaces, to decode motion intents for arti-
interaction speech interfaces for the rehabilitation of     ficial prosthesis, and to decompose motor units for
patients with speech disorders [17–20]. Interestingly,      motion analyses [38–41]. Moreover, the HD sEMG
the sEMG signals can be obtained seamlessly in both         technique has also been used as a useful tool to
silent and audible modes, making it an ideal candid-        dynamically visualize and evaluate muscular activities
ate to overcome the limitation of acoustic signals for      associated with swallowing and phonating in previous
speech recognition.                                         studies [42–45]. Therefore, the HD sEMG technique
     Due to its important role in the field, the approach   could be a great candidate to provide full information
of silent speech recognition (SSR) based on sEMG            about the articulatory muscles so that the perform-
has been reported by many studies in the past dec-          ance of the sEMG-based SSR could be improved.
ades. The first study in 1985 used three-electrode               However, the HD sEMG method requires cum-
channels located near the mouth to record the sEMG          bersome time-consuming preparation of a large
signals and to classify five Japanese vowels in real-       number of electrodes and could lead to patient dis-
time [21]. In another study, sEMG signals recor-            comfort as a consequence. Moreover, the processing
ded from two pairs of electrodes on the neck areas          of high-volume HD sEMG requires large compu-
below the chin were used to classify four sub-vocal         tation complexity and high power consumption,
Hindi phonemes [22]. Afterward, eight sEMG sensors          making it impossible to perform automatic speech
(four on the face, and four on the neck) were util-         recognition in real-time, especially in portable applic-
ized for alaryngeal speech recognition in patients          ations. Moreover, there is high redundancy in the HD
with surgical removal of the larynx and reported            sEMG recordings, and it may lead to over-fitting of
an averaged error rate of 10.3% for the full eight-         the data, leading to a decline of classification per-
sensor set, as well as a mean error rate of 13.6%           formance as a result. Thus, it is important to explore
when reducing the sensor set to four locations [23].        the least electrode number and optimal electrode
Then, the researchers achieved a 91.1% recognition          positions to provide adequate information for reli-
rate in 1200-phrase subvocal speech experiments             able speech recognition. However, the comprehens-
using 11 sEMG sensors placed on small articu-               ive exploration of the optimal electrode number and
lator muscles of the face and neck [24]. Addition-          positions in the context of SSR have rarely been
ally, a speech recognition system was developed for         investigated to the best of our knowledge. Moreover,
classifying nine Thai Syllables by using five channels      whether there is a significant difference in the optimal

                                               2
Towards optimizing electrode configurations for silent speech recognition based on high-density surface electromyography
J. Neural Eng. 18 (2021) 016005                                                                                  M Zhu et al

electrode configuration among different languages              Table 1. Speaking tasks of ten English and Chinese digits.
also remains unknown.
    The purpose of this study is to investigate the                      Zero
optimal number and positions of sEMG electrodes                          One
with the desired classification performance for auto-
                                                                         Two
matic SSR, using the approach of HD sEMG. A total
of 120 high-density electrodes were evenly placed over                   Three
the facial and neck muscles, and the HD sEMG sig-                        Four
nals were simultaneously collected when the subjects
                                                                         Five
were speaking ten numbers in both Chinese and Eng-
lish. Then the HD sEMG signals were systemically                          Six
analyzed to explore the contributions from differ-                      Seven
ent regions of muscles towards a satisfactory SSR so
                                                                         Eight
that the general guidelines for deciding the optimal
electrode number and positions could be established.                     Nine

This study could pave the way for the development
of a clinically feasible system for SSR, particularly for
patients with speaking difficulties.                        in English or Chinese in their normal manner, with
                                                            the exception that no sound or voice could be pro-
2. Methodology                                              duced. The subjects could practice the transition from
                                                            audible speech to silent speech until they were exper-
2.1. Subjects                                               ienced in silent speech production prior to the exper-
In this study, a total of 12 healthy volunteers (with       iments. A microphone was placed near the subject to
a mean age of 25.2 years), including seven males            monitor the real-time acoustic signals, and it will ask
and five females, participated in the experiments.          the subject to retry the silent speech task if the micro-
All subjects were native Chinese speakers with nor-         phone recorded any voices from the subject. Each
mal speaking/hearing abilities and at least 10 years        word was repeated 28 times with normal speed (1 s
of experience in English learning. Before the exper-        per word) in one trial, and a rest time of 3 s was set
iments, each subject was clearly explained with the         between two successive repetitions to avoid muscle
objective and procedures of the experiments. All the        fatigue.
subjects voluntarily provided written informed con-
sent, and the experiments were approved by the              2.3. HD sEMG recording
Institutional Review Board of Shenzhen Institutes of        The HD sEMG signals were collected via a mul-
Advanced Technology (#IRB ID: SIAT-IRB-170815-              tichannel sEMG recording system (REFA 120-model,
H0178). They also gave their permission to use              TMS International, the Netherlands), where a total of
their photos and data for scientific and educational        120 channels monopolar electrodes were utilized in
purposes.                                                   the current study. The HD sEMG signals were syn-
                                                            chronously recorded from all channels of surface elec-
2.2. Experimental procedures                                trodes on the facial and neck muscles, as presented in
A set of experiments involving various speaking tasks       figure 1.
were designed and performed by the recruited sub-               The 120-channel electrodes were evenly placed on
jects inside an electromagnetic-shielded room to            the articulatory muscles using miniaturized double-
ensure high-quality EMG signals recordings. The             sided tapes, with left-right symmetry. Each surface
experiments necessitated the subjects to speak with         electrode was approximately 10 mm in diameter,
silent speech mode in two languages, including Eng-         and an inter-electrode distance between two neigh-
lish and Chinese, and the corresponding HD sEMG             boring electrode centers was kept at 15 mm to
recordings were obtained to compare the articulat-          ensure proper coverage of the articulatory muscles
ory muscles contraction patterns of both languages.         associated with speaking activities and to minimize
The main goal to apply our methods to Chinese and           between-channel interactions during the data col-
English was to investigate the similarities and differ-     lection. Considering the mean size of the subjects’
ences in SSR in different languages. In this regard,        neck, a total of 80 electrodes were structured in
the subjects were first asked to maintain a quiet relax     a 5 × 16 grid array (from channel 1 to channel
state without speaking or moving their body parts for       80) and placed on the suprahyoid and infrahyoid
around 40 s, so that the baseline for the sEMG signals      muscles located in the front neck regions (figure
could be determined. Afterward, they were asked to          2(b)). Meanwhile, the rest 40 electrodes were placed
pronounce ten numbers (zero to nine: 0–9) with the          on the face muscles, with 20 electrodes (channel 81–
silent (subvocal) mode in English and Chinese lan-          100) on the left and the other 20 electrodes (chan-
guages, respectively (table 1). During the silent speech    nel 101–120) on the right (figure 2(a)). The speech-
tasks, the subjects were instructed to say the words        related artifacts that mainly originated from the

                                               3
Towards optimizing electrode configurations for silent speech recognition based on high-density surface electromyography
J. Neural Eng. 18 (2021) 016005                                                                                            M Zhu et al

                       Figure 1. Location of the high-density surface electrodes on the facial and neck muscles.

skin-electrode impedance variation during the silent                 pattern recognition due to their easy implementa-
speech production were minimized by the strong                       tion and low computation complexity [46, 47]. In this
electrode ring sticker with high-quality conductive                  study, four prevalent time-domain features, including
gel, so that stable contact between the electrodes and               mean absolute value (MAV), waveform length (WL),
the skin could be made even when the subject was                     zero crossing (ZC), and slope sign change (SSC), were
speaking.                                                            extracted from the preprocessed sEMG signals for
     In order to evaluate the contribution of different              word classification.
muscles, the electrodes were grouped in four different                   The MAV feature is an average of the absolute
ways (figure 3): the 40 electrodes on the face (F_40ch),             amplitude of the EMG signals within a windows seg-
the odd columns of electrodes on the neck (N_40ch),                  ment calculated by:
all the 80 channels on the neck (N_80ch), and all the
                                                                                                       1∑
                                                                                                           N
120 channels (All_120ch). Afterward, a reference elec-
                                                                                          MAV =           |xi |.                  (1)
trode was fixed on a fabric bracelet and placed on the                                                 N
                                                                                                          i=1
left wrist of the subject.
                                                                         The WL feature could provide a measurement of
                                                                     the complexity of the EMG signals and is defined as
2.4. Features extracting and word classification                     the cumulative length of the EMG waveform in the
In this study, All channels of sEMG signals were                     segment:
sampled at 2048 Hz and filtered with a fourth-
order band-pass Butterworth filter, with cut-off fre-                                             ∑
                                                                                                  i=1

quencies of 30–500 Hz to reduce the ECG noise                                           WL =            |xi+1 − xi |.             (2)
and other low-frequency baseline variations. After                                                N−1

that, a notch filter was used to attenuate the power-                    The ZC feature is defined as the times that sEMG
line interferences at 50 Hz and its harmonic fre-                    amplitude crosses zero amplitude and could be calcu-
quencies. Both automated methods during data col-                    lated by:
lection and visual approach during offline ana-
lyses were employed to exclude faulty trials so that                                            ∑
                                                                                                i=1

satisfactory data quality could be ensured in this                                     ZC =           sgn (−xi xi+1 ),            (3)
study.                                                                                          N−1

    The filtered sEMG data within one trial contain-                 where sgn(x) equals to 1 when x is not less than a
ing all the repetitions of a given word was sliced                   chosen threshold, and it equals to 0 otherwise to avoid
according to the instructed speech starting time of                  the interferences from low-level noises.
the subject (figure 4) to get rid of the non-EMG sig-                    The SSC feature represents the times that the slope
nals during the resting state. The sEMG signals when                 of the sEMG signals changes its sign, and it can be
actually pronouncing the words were preserved and                    mathematically expressed as:
concatenated to get the preprocessed sEMG. Then
the preprocessed sEMG signals were segmented into                                      ∑
                                                                                       i=1

a series of sliding windows with a length w of 400                           SSC =           |f [(xi − xi−1 ) × (xi − xi+1 )]|.   (4)
sampling points (about 200 ms) and an incremental                                      N−1

∆w of 200 sampling points (about 100 ms overlap-                          All the four proposed features were computed
ping), as shown in figure 4.                                         for each analysis window to serve as the training
    Features extracted from the sEMG signals can                     and testing data of the word recognition classifier
provide useful information embedded in the sig-                      for the SSR tasks. The linear discriminant analysis
nals for classifying the intended motions. There were                (LDA) algorithm that can provide robust classific-
                                                                     ation against various sEMG interferences with low
various features involving in the time, frequency,                   computation cost was employed to build the classi-
and time-frequency domain; among those, the time                     fier [31]. Typically, the LDA assumes that data of each
domain features were used the most frequently in                     class follow multivariate Gaussian distribution with

                                                      4
Towards optimizing electrode configurations for silent speech recognition based on high-density surface electromyography
J. Neural Eng. 18 (2021) 016005                                                                                             M Zhu et al

    Figure 2. The placement of 120-channel electrodes for recording the HD sEMG signals from the face (a) and neck (b) muscles
    associated with speaking activities.

    Figure 3. Four different ways to group the HD electrodes: (a) the 40 electrodes on the face (F_40ch), (b) the odd columns of
    electrodes on the neck (N_40ch), (c) all the 80 channels on the neck (N_80ch), and (d) all the 120 channels (All_120ch).

 homoscedastic covariance and the probability dens-                            above formula can be simplified as follows:
 ity function p of class i is defined as:
                             (        )d
  p(x|wk ) =
                  1              1
                                 √         e−1/2 (x − uk ) ′ C− 1                              gk ∗ (x) = WK x ′ + Bk               (7)
                       1/2                                    k (x − uk ) ,
               |Ck |             2π
                                                                         (5)

where d is the dimension of the vector x; uk and
                                                                                                    W k = uk C−
                                                                                                              k
                                                                                                                1
                                                                                                                                    (8)
 Ck represent the mean vector and the covariance
 matrix for the class i, respectively. According to the
 Bayesian classifier, the decision function of LDA can
                                                                                               1
 be expressed as:                                                                        Bk = − uk C−1 ′
                                                                                                    k uk − ln{P(wk )}.              (9)
                                                                                               2
                           1
   gk ∗ (x) = uk C−  1 ′        −1 ′
                   k x − uk Ck uk − ln{P(wk )},
                           2                                                       Then, a five-fold cross-validation technique was
                                                     (6)                       used to partition the matrix of extracted features and
 where ln{P (wk )} is the prior probability of class wk .                      the corresponding targets into training and testing
 The pooled covariance matrix Ck is calculated as the                          sets. These sets were subsequently fed into the LDA
 average covariance matrixes of all classes. As a res-                         classifier that eventually identified the speech pat-
 ult, uk and Ck are the only parameters in LDA, where                          terns inherent in the extracted features. The perform-
                                                                               ance of the classifier was assessed using the CA metric
 k = 1, 2, …, t, and t is the total number of classes.
                                                                               defined as [48]:
 Meanwhile, the LDA is determined by the size of the                                   Number of correctly classifyed samples
 comparison gk ∗ (x), so by removing the independent                            CA =                                          × 100 %.
                                                                                         Total number of testing samples
 terms and constant terms in the above formula, the                                                                                 (10)

                                                                 5
Towards optimizing electrode configurations for silent speech recognition based on high-density surface electromyography
J. Neural Eng. 18 (2021) 016005                                                                                         M Zhu et al

   Figure 4. Flowchart of creating sliding windows moving along the sEMG time waveform to extract features for speech pattern
   recognition, with the window width of w = 400 samples and the overlapping length of ∆w = 200 samples.

2.5. EMG channel selection                                         had already been selected in the (i–1)th iteration,
The spectrogram of the speech signals during the                   each channel (xj ) from the rest electrodes would
phonation task was derived by using the short-time                 be picked out and combined with the selected sets
Fourier transform (STFT) method to perform the                     {S_(i–1)ch} for the speech recognition in the ith iter-
time-frequency analysis. The duration of the STFT                  ation equation (11). This procedure will be repeated
analysis window was 20 ms and the overlap width                    until all the rest channels have been tested, and the
between two neighboring windows was 10 ms. The                     optimal channel x∗ with the highest CA would be
formula of the STFT method implemented in this                     selected during the ith iteration. Accordingly, the
study could be expressed as follows.                               sets {S_ (i−1) ch + x∗ } would be selected as the ith
     To obtain the minimum possible number of elec-                optimal channel sets {S_ich} as a result equation (12).
trode channels with satisfactory CAs, the sequential               The iteration process was repeated until i reached
forward selection (SFS) algorithm that is achieved                 its maximum value N (the maximum number of
by an iterative searching procedure was utilized.                  channels).
The SFS method, an iterative searching procedure,
firstly selected the best single-channel for classific-               CAs ({S_ (i−1) ch} + x∗ )
                                                                                            (                  )
ation and then added one more channel at a time                         =      max      CAs {S_ (i−1) ch} + xj                  (11)
                                                                             j∈{1,2,··· ,N−i}
that could reach the maximum CA combine with the
selected electrodes [49, 50]. If the optimal channel
sets {S_(i–1)ch} containing a total of (i–1) channels                            {S_ich} = {S_ (i−1) ch} + x∗ .                 (12)

                                                     6
Towards optimizing electrode configurations for silent speech recognition based on high-density surface electromyography
J. Neural Eng. 18 (2021) 016005                                                                        M Zhu et al

2.6. Statistical analyses                                 CA showed a similar variation pattern when the D-
The statistical analyses of one-way ANOVA were per-       group or S-group changed. Similar findings were
formed to examine whether different groups of elec-       also observed when comparing the D-groups and S-
trodes had significant effects on the CA of SSR for       groups. Furthermore, the averaged CA for Chinese
Chinese and English, respectively, so that the muscles    recognition was significantly higher than that of the
with major contribution could be identified. All the      English recognition for the same D-group or S-group,
statistical results were obtained by comparing the p-     together with slightly smaller standard deviations.
value with a confidence level of 0.05. The optimal
electrodes that could achieve maximum CA were also        3.2. Effects of the selected channel number
selected for each subject, and the results were ana-      The CA averaged across different digits as a function
lyzed across all subjects in order to provide a general   of channel number selected by the SFS algorithm was
guideline for practical electrode placements in sEMG-     shown in figure 6. As could be observed from the
based SSR.                                                English speech recognition in figure 6(a), the aver-
                                                          aged accuracy rapidly increased from 30.8% to 95%
3. Results                                                when the electrode number increased from 1 to 24.
                                                          However, the growth in the accuracy was rather lim-
3.1. Effects of electrode groups on classification        ited when the channel number further increased. It
performance                                               was noteworthy that the accuracy reached about 80%
In this section, the HD sEMG signals synchronously        when only seven optimal channels were selected by
recorded from the facial and neck muscles were used       the SFS algorithm, and it could achieve 90% for 14
to classify the silent speech associated with 11 pat-     channels. As a comparison, the Chinese speech recog-
terns (ten number 0–9, plus resting state) in Eng-        nition in figure 6(b) showed a faster increasing rate
lish and Chinese, respectively. In order to investig-     in the accuracy when the selected channel number
ate the effects of electrode configuration, the CAs       increased from 1 to 120, with 12 channels accomplish-
were systemically compared when using two differ-         ing an accuracy of 95%. It was also seen that an accur-
ent ways to group the HD electrodes: firstly, the elec-   acy of 85% could be achieved for merely five electrode
trodes were grouped based on the muscle distribu-         channels, and it could reach 90% when adding only
tion and correspondingly divided into four D-groups       two more channels.
(F_40ch, N_40ch, N_80ch, and All_120ch), as shown             To identify the major contributing muscles
in figure 2; secondly, the electrodes were selected by    towards SSR, the distribution of the optimal elec-
employing the SFS algorithm, and there were four          trodes selected by the SFS algorithm for increas-
different S-groups depending on how many elec-            ing accuracies was shown in figure 7 as a typical
trodes have been selected, namely S_10ch (10 elec-        example. A significant observation was that it gener-
trodes selected), S_20ch, S_30ch, and S_40ch. Then        ally required more electrodes for English recognition
the CAs in classifying 11 subvocal patterns were com-     than Chinese to achieve the same CA. For accuracy of
pared among different groups, and a typical example       80% in figure 7(a), only one out of the seven selected
was shown in figure 5, in which the scatter plots rep-    electrodes were distributed on the face for English
resented the CAs of a specific pattern. The means and     recognition, whereas none of the five selected elec-
standard deviations were also shown to compare the        trodes were located on the face region for Chinese
performances among different groups.                      recognition. For the accuracy of 85%, only one out of
    It could be observed from figure 5(a) that the        the ten and six electrodes was located on the face
averaged CA for English recognition consistently          for English and Chinese recognition, respectively
increased when the D-group changed from F_40ch            (figure 7(b)). Similarly, to achieve higher accuracies
to All_120ch, with the F_40ch having the lowest           (figures 7(c) and (d)), while the required electrodes
CA of 73.6%. Similarly, it also showed a monoton-         also increased, resulting in much fewer electrodes
ous increase for the averaged CA when the S-group         were distributed on the face compared with those on
changed from S_10ch to S_40ch, with the S_10ch            the neck, for both English and Chinese recognitions.
group having the lowest CA of 86.1%. When com-
paring between the D-groups and S-groups, it could        3.3. Comparison of optimized channel number
be seen that the S_10ch group had much higher CA          between language
than the F_40ch group, although it contained only ten     pt In this section, we further analyzed the optim-
electrodes. Meanwhile, the S_20ch group also showed       ally selected channels for each subject, and the
slightly larger averaged CA than the N_40ch group,        required channel numbers averaged across all the
though it had only half of the electrode number.          subjects for four increasing accuracies were shown
Moreover, the S_30ch group had similar averaged CA        in figure 8, with the orange color representing Eng-
as the N_80ch group, and the S_40ch group showed          lish digit recognition and blue color for Chinese. It
comparable CA as the All_120ch group, despite the         could be observed from the figure 8 that the mean
fact that the S-groups had much fewer electrodes.         required channel number consistently increased as
In figure 5(b) for Chinese recognition, the averaged      the target CA increased for both English and Chinese

                                             7
Towards optimizing electrode configurations for silent speech recognition based on high-density surface electromyography
J. Neural Eng. 18 (2021) 016005                                                                                             M Zhu et al

   Figure 5. Classification accuracies of silent speech recognition in English (a) and Chinese (b) for D-groups (F_40ch, N_40ch,
   N_80ch and All_120ch) and S-groups (S_10ch, S_20ch, S_30ch and S_40ch). Each scatter point represented the classification
   accuracy for one of the 11 patterns (ten number 0–9, plus resting state).

recognition tasks. When the accuracy increased from                  distribution similar to figure 7 could be obtained
80% to 95%, the mean optimal channel number                          individually. Then, occurring times of each of the 120
increased from 6.3 to 25.3 for English, whereas it                   electrodes were counted when accumulating the res-
was from 5.5 to 22.8 for Chinese. For a given accur-                 ults of all recruited subjects, and the overall distribu-
acy, the mean required channel number for Chinese                    tion of the optimally selected electrodes was shown
always seemed lower than that of English recognition.                in figure 9, with different colors representing differ-
The variation pattern of the selected channel num-                   ent occurring times above four.
ber averaged across subjects (figure 8) agreed with                      For English digit recognition in figure 9(a),
the general observations of the typical example in                   the optimally selected channels with the maximum
figure 7.                                                            occurrences (seven and six times) were consistently
3.4. Distribution of selected electrodes across                      distributed along the edge of the neck region, and
subjects                                                             none of them were observed on the face. For selec-
The selected channels to achieve an accuracy of 95%                  ted channels with five or four occurrences, there were
were analyzed for each subject, and an electrode                     also many more electrodes on the neck than on the

                                                       8
Towards optimizing electrode configurations for silent speech recognition based on high-density surface electromyography
J. Neural Eng. 18 (2021) 016005                                                                                               M Zhu et al

   Figure 6. Mean classification accuracy (averaged across different words) as a function of the optimal electrode number (from 1 to
   120) selected by the SFS algorithm in English (a) and Chinese (b) speech recognition tasks.

     Figure 7. Distribution of the selected electrodes for given classification accuracies of 80% (a), 85% (b), 90% (c), and 95% (d).

face. For the neck region, it seemed that there were                  (figure 9(b)) seemed more than that of English recog-
more optimal electrodes located on the right side than                nition (figure 9(a)).
the left side. Meanwhile, for the Chinese digit recog-
nition in figure 9(b), the channels with the highest                  4. Discussion
occurrences (eight and seven times) are also distrib-
uted around the edge of the neck, without any of                      The sEMG-based SSR technique is a viable alternat-
them on the face. The channels that appeared four                     ive way of communication in situations when acous-
times seemed to be dispersedly distributed on both                    tic signals are not available as input for automatic
the face and neck regions. Additionally, the selected                 speech recognition, especially for patients with speak-
channels occurring at least four times for Chinese                    ing difficulties. Importantly, the optimal electrode

                                                        9
J. Neural Eng. 18 (2021) 016005                                                                                              M Zhu et al

   Figure 8. Channel number averaged across all the recruited subjects of different classification accuracy for English and Chinese
   digit recognition.

locations and number are key factors to be con-                       the neck regions would be more likely to record the
sidered in such sEMG-based speech recognition sys-                    controlling patterns of the articulatory muscles, and
tems. However, the key factors are often arbitrar-                    therefore the neck electrodes contribute more toward
ily determined via the trial and error method or                      the SSR. This might be a finding that is intrinsic to
completely by experiences, leading to unsatisfactory                  the properties of silent speech that is less dependent
recognition accuracies that hinder the widespread                     on facial articulation. Another interesting finding of
application of such speech recognition systems. The                   this study is that the optimal electrodes within the
purpose of this study is to systemically investigate the              neck seemed to distribute along the edges, as indic-
optimal electrode positions and minimum electrode                     ated by the typical example in figure 7 and the aver-
number to provide guiding principles for electrode                    aged results in figure 9. A possible explanation is that
preparation of the sEMG-based SSR system.                             the articulatory muscles are located closer to the edges
    To achieve the goal, HD sEMG signals containing                   of the electrode array, and there are few articulat-
a rich set of electrophysiological information about                  ory muscles distributed in the center because of the
the muscles were simultaneously collected from 120                    prominent laryngeal. Another possibility might be
channels of surface electrodes located on the facial                  that the electrodes further away from the center might
and neck regions when the subjects silently pro-                      be less interfered with by the electrode movements
nounced ten digits in English and Chinese, respect-                   during the silent pronunciation. It is also possible
ively. As shown in figure 5, electrodes from dif-                     that the interference of vibrations from both sides of
ferent regions contributed quite differently towards                  the neck might make these electrodes informational
the speech classification performance of SSR. For                     redundant. Although there were many studies that
example, the CAs obtained when using electrodes on                    also proved the feasibility of using sEMG signals for
the neck were much higher than those on the face                      SSR [23–25, 46, 51], few of them evaluated the con-
for different languages, although with the same chan-                 tribution difference between the face and neck artic-
nel number. It indicates that the information from                    ulatory muscles or the muscular difference within the
the neck electrodes plays a more important role, and                  neck region, due to the lack of full representation of
therefore the muscles on the neck might have more                     muscular activities using HD sEMG signals. The out-
contributions during the silent speaking tasks. This                  comes of this study suggest that the sEMG electrodes
conclusion is further supported by figures 6 and 8,                   should mainly cover the neck muscles rather than the
in which the contributing electrodes mostly came                      facial muscles for satisfactory classification perform-
from the neck instead of the face when using the                      ance in practical applications of SSR.
SFS algorithm to select the optimal channels. Very                         Another important finding of this study was that
few optimal electrodes came from the facial region,                   the optimal electrodes selected by the SFS algorithm
regardless of the achieved CA or the language. One                    showed much better performance compared with the
explanation is that the quasi-periodic vibrations of                  cases when all the electrodes from the neck or facial
vocal cords during speaking are mainly controlled                     region were used. In figure 5, only 10 optimally selec-
by the neck muscles instead of the facial muscles.                    ted electrodes could achieve a CA close to 90%, and
There are more articulatory muscles distributed along                 it significantly outperformed the case when all the
the neck regions, and more muscles are activated or                   40 face electrodes were used. Moreover, the CA of
involved during speech production. Meanwhile, the                     only 30 optimal electrodes was comparable to that
activations of facial muscles are also affected by other              of all the 80 electrodes on the neck, while only 40
physiological functions such as expression control,                   optimal electrodes performed similarly as all the 120
while the neck muscles are more dedicated to speech                   electrodes. The results suggest that the selection and
and swallowing functions. Thus, the electrodes on                     optimization of the electrodes are of great importance

                                                      10
J. Neural Eng. 18 (2021) 016005                                                                                                M Zhu et al

   Figure 9. Distribution of the optimally selected electrodes that occurred at least four times when accumulating the results of all
   subjects in English (a) and Chinese (b) silent speech recognition.

for sMEG-based SSR. If there was no optimization                       In contrast, it is possible to achieve an accuracy as
and all the electrodes were placed on positions with                   high as 90% when using only 14 electrodes for Eng-
less importance (such as on the face in figure 5), for                 lish and even less (7 electrodes) for Chinese (figure
example, the CA would be as low as 73.6% even                          6). Meanwhile, it was also observed in figure 6 that
though the electrode number was as large as 40.                        CAs initially grew rapidly when the optimal channel

                                                       11
J. Neural Eng. 18 (2021) 016005                                                                        M Zhu et al

number selected by the SFS algorithm increase from        subjects are native Chinese speakers and they were
one, indicating adding more electrodes would be           more fluent in Chinese speaking tasks. In contrast,
rather beneficial when the electrode number is quite      there might be slightly larger differences in the sEMG
limited. This finding is in accordance with the previ-    signals when repeated the same English word in silent
ous study that reported rapidly increasing error rates    mode, since the foreign language tasks were not as
with a reduced number of sensors [23]. Also, adding       fluent and stable as their native language. There-
nearly 100 more electrodes to the optimally selec-        fore, it would require more information from more
ted electrodes (24 for English and 12 for Chinese)        electrodes to achieve the same CA due to the larger
with CA above 95% could only lead to an accur-            variation in the sEMG signals for English speaking
acy increase of less than 3%, indicating that there       tasks. Meanwhile, as could be observed from table
might be large redundancy in the information con-         1, the pronunciation of English digits usually has
tained in the extra 100 electrodes. In other words, a     more syllables than those in Chinese (such as zero
continual increase in the electrode channels does not     and seven). Although with the same number of syl-
necessarily translate to improved classification per-     lables, the pronunciation of English seemed more
formance. Moreover, a large number of electrodes          complex than Chinse due to the involvement of more
would lead to high computational complexity, over-        consonants (such as five and six). The increase in
fitting of redundant data, and an increase in the sys-    pronunciation complexity would need more muscles
tem cost. The electrode optimization could help to        to be involved during the speaking process, lead-
greatly reduce the redundancy by selecting optimal        ing to the observations that more electrodes were
positions where the activities of all different muscles   required for English recognition when compared with
could be recorded using a significantly reduced num-      Chinese (figures 7 and 8). Moreover, the subjects
ber of electrodes, which is also supported by the find-   recruited in this study are all native Chinese speak-
ings of related studies [16, 29, 30]. By employing        ers, and their influence in speaking Chinese might
electrode optimization, a system with a minimum           provide more stable sEMG signals that are benefi-
number of electrodes would be easier to operate and       cial for speech recognition. The distribution of the
more comfortable for the subjects, which may facil-       optimally selected electrodes also demonstrated slight
itate widespread applications of automatic speech         differences between English and Chinese (figure 9).
recognition. There are different algorithms to obtain     More electrodes with higher occurring times (greater
the optimal combination of electrode positions, such      than 6) were observed when the results were accu-
as independent components analysis [30], Genetic          mulated across subjects, indicating that there might
algorithm [31], Fisher–Markov selector [52], and the      be more in common for the activation patterns of
SFS approach [50]. In this study, the algorithm of        the articulatory muscles when the subjects spoke
SFS was used because it is easy to implement and          Chinese. It might also be explained by the different
shows great performance in various circumstances          proficiency in English among subjects since it is not
of data dimension reduction. The electrode optim-         their native language. These findings between English
ization proposed by this study might provide use-         and Chinese suggest that we should pay attention to
ful guidelines for sEMG-based SSR on how to use           the language differences when deciding the optimal
the least number of electrodes to achieve outstand-       electrode positions and number for best practices
ing CAs. However, due to the significant individual       of SSR.
differences in pronunciation habits, more subjects            Helping Patients who have undergone laryngec-
with various speaking styles should be recruited in       tomy or suffered from dysphonia is one of the final
future studies before more solid guidelines could be      goals for our proposed sEMG-based SSR techno-
established.                                              logy. However, this study is just the start of this
     In this study, differences in classification per-    research line and a lot of work still needs to be done
formance were also observed between English and           before the technology could be finally applied to these
Chinese. In figure 5, systemically higher averaged        patients. The purpose of this study is to investig-
CAs could be found for Chinese digit recognition          ate the contributions of different articulatory muscles
than English, for all the electrode configurations. In    in SSR and to explore the optimal electrode loca-
figure 6, the accuracy increased more rapidly when        tions towards satisfactory classification. Such purpose
the optimally selected electrode number increased         was achieved by employing normal healthy individu-
from 1 to 120. Moreover, consistently fewer electrodes    als with healthy articulation musculature and nor-
were required for Chinese recognition to achieve the      mal anatomy so that a general guideline could be
same accuracy as English tasks, as indicated in figures   established on how to place electrodes in patients
5–7. One reason for the language differences could        with speaking difficulties, especially when the num-
be that speaking different languages required differ-     ber of the available electrodes were limited. In future
ent articulation styles that may influence the contrac-   studies, patients with laryngectomy and dysphonia
tion patterns of the articulatory muscles. The slight     should also be included to confirm the claims and
superior performance of Chinese speech recognition        the system should be made to work in real-time with
may be attributed to the fact that all the recruited      online processing and closed-loop classifiers so that

                                             12
J. Neural Eng. 18 (2021) 016005                                                                              M Zhu et al

the proposed sEMG-based SSR technology could be          Oluwarotimi Williams Samuel 
more useful in real scenarios.                           https://orcid.org/0000-0003-1945-1402
                                                         Shixiong Chen  https://orcid.org/0000-0002-5868-
5. Conclusions                                           6952

In this study, the optimal number and positions          References
of electrodes for SSR in English and Chinese were
investigated based on HD sEMG recordings with 40          [1] De-la-calle-silos F and Stern R M 2017 Synchrony-based
                                                              feature extraction for robust automatic speech recognition
electrodes placed on the face and 80 on the front             IEEE Signal Process. Lett. 24 1158–62
neck. The findings suggested that the electrodes on       [2] Fukui M, Watanabe T and Kanazawa M 2018 Sound source
the neck region had significantly larger contributions        separation for plural passenger speech recognition in smart
than those on the face. The electrode groups optim-           mobility system IEEE Trans. Consum. Electron. 64 399–405
                                                          [3] Li J, Deng L, Gong Y and Haeb-Umbach R 2014 An overview
ally selected by the SFS algorithm outperformed the           of noise-robust automatic speech recognition IEEE/ACM
no-optimization group with much larger numbers of             Trans. Audio, Speech, Language Process. 22 745–77
electrodes, among which only ten optimal electrodes       [4] Shimada K, Bando Y, Mimura M, Itoyama K, Yoshii K and
could achieve a CA above 86%. Further investigation           Kawahara T 2019 Unsupervised speech enhancement based
                                                              on multichannel NMF-informed beamforming for
demonstrated that the CAs increased rapidly and then          noise-robust automatic speech recognition IEEE/ACM Trans.
became saturated when the optimal channel num-                Audio, Speech, Language Process. 27 960–71
ber increased from 1. Statistical results showed that     [5] Sainath T N, Weiss R J, Wilson K W, Li B, Narayanan A,
the optimally selected electrodes were mainly distrib-        Variani E, Bacchiani M, Shafran I, Senior A and Chin K 2017
                                                              Multichannel signal processing with deep neural networks
uted on the neck instead of the face region, and more         for automatic speech recognition IEEE/ACM Trans. Audio,
electrodes were required for English recognition to           Speech, Language Process. 25 965–79
achieve the same accuracy. The finding of this study      [6] Saksamudre S K, Shrishrimal P and Deshmukh R 2015 A
might provide useful information about the electrode          review on different approaches for speech recognition
                                                              system Int. J. Comput. Appl. 115 23–28
placement when using sEMG signals for SSR, which          [7] Enarvi S, Smit P, Virpioja S and Kurimo M 2017 Automatic
is of great importance in developing reliable speech          speech recognition with very large conversational finnish
recognition systems in clinical applications.                 and estonian vocabularies IEEE/ACM Trans. Audio, Speech,
                                                              Language Process. 25 2085–97
                                                          [8] Yoshioka T, Sehr A, Delcroix M, Kinoshita K, Maas R,
Acknowledgments                                               Nakatani T and Kellermann W 2012 Making machines
                                                              understand us in reverberant rooms: robustness against
                                                              reverberation for automatic speech recognition IEEE Signal
We would thank all the members in our research                Process. Mag. 29 114–26
laboratory at the Research Center for Neural Engin-       [9] Yu J, Markov K and Matsui T 2019 Articulatory and
eering, Institute of Advanced Integration Techno-             spectrum information fusion based on deep recurrent neural
                                                              networks IEEE/ACM Trans. Audio, Speech, Language Process.
logy, Shenzhen Institutes of Advanced Technology,
                                                              27 742–52
for their supports and assistance in conducting the      [10] Muhammad G 2015 Automatic speech recognition using
experiments and signal processing. The work also was          interlaced derivative pattern for cloud based healthcare
supported by Shenzhen Institute of Artificial Intelli-        system Cluster Comput. 18 795–802
                                                         [11] Ganapathy S 2017 Multivariate autoregressive spectrogram
gence and Robotics for Society.
                                                              modeling for noisy speech recognition IEEE Signal Process.
                                                              Lett. 24 1373–7
                                                         [12] Joy N M and Umesh S 2018 Improving acoustic models in
Funding                                                       torgo dysarthric speech database IEEE Trans. Neural Syst.
                                                              Rehabil. Eng. 26 637–45
This work was supported in part by the National          [13] Janke M and Diener L 2017 EMG-to-speech: direct
Natural Science Foundation of China (Grant                    generation of speech from facial electromyographic signals
                                                              IEEE/ACM Trans. Audio, Speech, Language Process.
Nos. 61771462, 61901464, and 81927804), Shen-                 25 2375–85
zhen Governmental Basic Research (Grant No.              [14] Khan M and Jahan M 2018 Classification of myoelectric
JCYJ20180507182241622), Science and Technology                signal for sub-vocal Hindi phoneme speech recognition
Planning Project of Shenzhen (Grant No.                       J. Intell. Fuzzy Syst. 35 5585–92
                                                         [15] Chau G and Kemper G 2015 One channel subvocal speech
GJHZ20190821160003734),      Shenzhen      Science            phrases recognition using cumulative residual entropy and
and Technology Development Fund (Grant No.                    support vector machines IEEE Lat. Am. Trans. 13 2135–43
JCYJ20170818163505850), Science and Technology           [16] Kubo T, Yoshida M, Hattori T and Ikeda K 2014 Towards
Program of Guangzhou (Grant No. 201803010093),                excluding redundancy in electrode grid for automatic speech
                                                              recognition based on surface EMG Neurocomputing 134
Science and Technology Planning Project of Guang-             15–19
dong Province (Grant No. 2019A050510033).                [17] Smith N R, Rivera L A, Dietrich M, Shyu C-R, Page M P and
                                                              DeSouza G N 2015 Detection of simulated vocal
ORCID iDs                                                     dysfunctions using complex sEMG patterns IEEE J. Biomed.
                                                              Health Inform. 20 787–801
                                                         [18] Yu S, Lee T and Ng M L 2016 Surface electromyographic
Mingxing Zhu  https://orcid.org/0000-0002-1204-              activity of extrinsic laryngeal muscles in Cantonese tone
118X                                                          production J. Signal Process. Syst. 82 287–94

                                            13
J. Neural Eng. 18 (2021) 016005                                                                                             M Zhu et al

[19] Stepp C E, Heaton J T, Braden M N, Jetté M E,                  [36] Cerone G L, Botter A and Gazzoni M 2019 A modular,
     Stadelman-Cohen T K and Hillman R E 2011 Comparison of               smart, and wearable system for high density sEMG detection
     neck tension palpation rating systems with surface                   IEEE Trans. Biomed. Eng. 66 3371–80
     electromyographic and acoustic measures in vocal                [37] Huang C, Chen X, Cao S and Zhang X 2016 Muscle-tendon
     hyperfunction J. Voice 25 67–75                                      units localization and activation level analysis based on
[20] Stepp C E, Hillman R E and Heaton J T 2010 Use of neck               high-density surface EMG array and NMF algorithm
     strap muscle intermuscular coherence as an indicator of              J. Neural. Eng. 13 066001
     vocal hyperfunction IEEE Trans. Neural Syst. Rehabil. Eng.      [38] Afsharipour B, Ullah K and Merletti R 2015 Amplitude
     18 329–35                                                            indicators and spatial aliasing in high density surface
[21] Sugie N and Tsunoda K 1985 A speech prosthesis employing             electromyography recordings Biomed. Signal Process. Control
     a speech synthesizer-vowel discrimination from perioral              22 170–9
     muscle activities and vowel production IEEE Trans. Biomed.      [39] Naik G R, Baker K G and Nguyen H T 2014 Dependence
     Eng. 7 485–90                                                        independence measure for posterior and anterior EMG
[22] Khan M and Jahan M 2016 Sub-vocal speech pattern                     sensors used in simple and complex finger flexion
     recognition of Hindi alphabet with surface                           movements: evaluation using SDICA IEEE J. Biomed. Health
     electromyography signal Perspect. Sci. 8 558–60                      Inform. 19 1689–96
[23] Meltzner G S, Heaton J T, Deng Y, De Luca G, Roy S H and        [40] Wang D, Zhang X, Gao X, Chen X and Zhou P 2016 Wavelet
     Kline J C 2017 Silent speech recognition as an alternative           packet feature assessment for high-density myoelectric
     communication device for persons with laryngectomy                   pattern recognition and channel selection toward stroke
     IEEE/ACM Trans. Audio, Speech, Language Process.                     rehabilitation Front. Neurol. 7 197–206
     25 2386–98                                                      [41] Bai D, Chen S and Yang J 2019 Upper arm motion
[24] Meltzner G S, Heaton J T, Deng Y, De Luca G, Roy S H and             high-density sEMG recognition optimization based on
     Kline J C 2018 Development of sEMG sensors and                       spatial and time-frequency domain features J. Healthc. Eng.
     algorithms for silent speech recognition J. Neural. Eng.             2019 1–17
     15 046031                                                       [42] Zhu M, Samuel O W, Yang Z, Lin W, Huang Z, Fang P, Tan J,
[25] Jong N S and Phukpattaranont P 2019 A speech recognition             Li P, Tong M C and Leung K K 2018 Using muscle synergy to
     system based on electromyography for the rehabilitation of           evaluate the neck muscular activities during normal
     dysarthric patients: a Thai syllable study Biocybern. Biomed.        swallowing 40th EMBC (Hawaii, USA) pp 2454–7
     Eng. 39 234–45                                                  [43] Zhu M, Yu B, Yang W, Jiang Y, Lu L, Huang Z, Chen S and
[26] Dewan K, Vahabzadeh-Hagh A, Soofer D and Chhetri D K                 Li G 2017 Evaluation of normal swallowing functions by
     2017 Neuromuscular compensation mechanisms in vocal                  using dynamic high-density surface electromyography maps
     fold paralysis and paresis Laryngoscope 127 1633–8                   Biomed. Eng. Online 16 133–50
[27] Yin J and Zhang Z 2014 Interaction between the                  [44] Zhu M, Liang F, Samuel O W, Chen S, Yang W, Lu L, Zou H,
     thyroarytenoid and lateral cricoarytenoid muscles in the             Li P and Li G 2017 A pilot study on the evaluation of normal
     control of vocal fold adduction and eigenfrequencies                 phonating function based on high-density sEMG
     J. Biomech. Eng. 136 111006                                          topographic maps 39th EMBC (Jeju Island Korea) pp 1030–3
[28] Chhetri D K and Neubauer J 2015 Differential roles for the      [45] Zhu M, Lu L, Yang Z, Wang X, Liu Z, Wei W, Chen F, Li P,
     thyroarytenoid and lateral cricoarytenoid muscles in                 Chen S and Li G 2018 Contraction patterns of neck muscles
     phonation Laryngoscope 125 2772–7                                    during phonating by high-density surface electromyography
[29] Hua J, Li G, Jiang D, Zhao H and Qi J 2019 An optimized              2018 IEEE CBS (Shenzhen China) pp 572–5
     selection method of channel numbers and electrode layouts       [46] Srisuwan N, Phukpattaranont P and Limsakul C 2018
     for hand motions recognition Int. J. Hum. Resour. Manag.             Comparison of feature evaluation criteria for speech
     16 1941006                                                           recognition based on electromyography Med. Biol. Eng.
[30] Naik G R, Al-Timemy A H and Nguyen H T 2015                          Comput. 56 1041–51
     Transradial amputee gesture classification using an             [47] Phinyomark A, Phukpattaranont P and Limsakul C 2012
     optimal number of sEMG sensors: an approach using ICA                Feature reduction and selection for EMG signal classification
     clustering IEEE Trans. Neural Syst. Rehabil. Eng.                    Expert Syst. Appl. 39 7420–31
     24 837–46                                                       [48] Samuel O W, Zhou H, Li X, Wang H, Zhang H, Sangaiah A K
[31] Wang Z, Fang Y, Li G and Liu H 2019 Facilitate sEMG-based            and Li G 2018 Pattern recognition of electromyography
     human-machine interaction through channel optimization               signals based on novel time domain features for amputees’
     Int. J. Hum. Resour. Manag. 16 1941001                               limb motion classification Comput. Electr. Eng. 67 646–55
[32] Clancy E A, Martinez-Luna C, Wartenberg M, Dai C and            [49] Geng Y, Zhang X, Zhang Y and Li G 2014 A novel channel
     Farrell T R 2017 Two degrees of freedom quasi-static                 selection method for multiple motion classification using
     EMG-force at the wrist using a minimum number of                     high-density electromyography Biomed. Eng. Online
     electrodes J. Electromyogr. Kinesiol. 34 24–36                       13 102
[33] Xu W, Li G, Ju Z and Liu H 2019 Surface EMG electrode           [50] Li X, Samuel O W, Zhang X, Wang H, Fang P and Li G 2017
     distribution for thumb motion classification based on                A motion-classification strategy based on sEMG-EEG signal
     wireless communication equipment Int. J. Wirel. Mob.                 combination for upper-limb amputees J. Neuroeng. Rehabil.
     Comput. 16 166–71                                                    14 1–13
[34] Kim M, Gu G, Cha K J, Kim D S and Chung W K 2018                [51] Wand M, Janke M and Schultz T 2014 Tackling speaking
     Wireless semg system with a microneedle-based high-density           mode varieties in EMG-based speech recognition IEEE
     electrode array on a flexible substrate Sensors                      Trans. Biomed. Eng. 61 2515–26
     18 92–93                                                        [52] Cheng Q, Zhou H and Cheng J 2010 The fisher-markov
[35] Ison M, Vujaklija I, Whitsell B, Farina D and Artemiadis P           selector: fast selecting maximally separable feature subset for
     2015 High-density electromyography and motor skill                   multiclass classification with applications to
     learning for robust long-term control of a 7-DoF robot arm           high-dimensional data IEEE Trans. Pattern Anal. Mach.
     IEEE Trans. Neural Syst. Rehabil. Eng. 24 424–33                     Intell. 33 1217–33

                                                      14
You can also read