Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review - arXiv
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
1 Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review Marjane Khodatars, Afshin Shoeibi, Navid Ghassemi, Mahboobeh Jafari, Ali Khadem, Delaram Sadeghi, Parisa Moridian, Sadiq Hussain, Roohallah Alizadehsani, Member, IEEE, Assef Zare, Abbas Khosravi, Member, IEEE, Saeid Nahavandi, Fellow, IEEE, U. Rajendra Acharya, Senior Member, IEEE, Michael Berk arXiv:2007.01285v3 [cs.LG] 26 Jul 2020 Abstract—Accurate diagnosis of Autism Spectrum Disorder I. I NTRODUCTION (ASD) is essential for its management and rehabilitation. Neu- roimaging techniques that are non-invasive are disease markers SD is a disorder of the nervous system that affects and may be leveraged to aid ASD diagnosis. Structural and func- tional neuroimaging techniques provide physicians substantial A the brain and results in difficulties in speech, social interaction and communication deficits, repetitive behaviors, information about the structure (anatomy and structural con- nectivity) and function (activity and functional connectivity) of and delays in motor abilities [1]. This disease can generally the brain. Due to the intricate structure and function of the brain, be distinguished with extant diagnostic protocols from the diagnosing ASD with neuroimaging data without exploiting artifi- age of three years onwards. Autism influences many parts cial intelligence (AI) techniques is extremely challenging. AI tech- of the brain. This disorder also involves a genetic influence niques comprise traditional machine learning (ML) approaches and deep learning (DL) techniques. Conventional ML methods via the gene interactions or polymorphisms [2], [3]. One in employ various feature extraction and classification techniques, 70 children worldwide is affected by autism. In 2018, the but in DL, the process of feature extraction and classification is prevalence of ASD was estimated to occur in 168 out of 10,000 accomplished intelligently and integrally. In this paper, studies children in the United States, one of the highest prevalence conducted with the aid of DL networks to distinguish ASD were rates worldwide. Autism is significantly more common in boys investigated. Rehabilitation tools provided by supporting ASD patients utilizing DL networks were also assessed. Finally, we than in girls. In the United States, about 3.63 percent of boys presented important challenges in the automated detection and aged 3 to 17 years have autism spectrum disorder, compared rehabilitation of ASD. with approximately 1.25 percent of girls [4]. Index Terms—Autism Spectrum Disorder, Diagnosis, Rehabil- Diagnosing ASD is difficult because there is no pathophys- itation, Deep Learning, Neuroimaging, Neuroscience. iological marker, relying instead just on psychological criteria [5]. Psychological tools can identify individual behaviors, M. Khodatars and D. Sadeghi are with the Dept. of Medical Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran. levels of social interaction, and consequently facilitate early A. Shoeibi and N. Ghassemi are with the Faculty of Electrical Engineering, diagnosis. Behavioral evaluations embrace various instruments Biomedical Data Acquisition Lab, K. N. Toosi University of Technology, and questionnaires to assist the physicians to specify the partic- Tehran, Iran, and the Computer Engineering Department, Ferdowsi University of Mashhad, Mashhad, Iran. ular type of delay in a child’s development, including clinical M. Jafari is with Electrical and Computer Engineering Faculty, Semnan observations, medical history, autism diagnosis instructions, University, Semnan, Iran. and growth and intelligence tests [6]. A. Khadem is with the Faculty of Electrical Engineering, K. N. Toosi University of Technology, Tehran, Iran. (Corresponding author: Ali Khadem, Several investigations for the diagnosis of ASD have re- email: alikhadem@kntu.ac.ir). cently been conducted on neuroimaging data (structural and P. Moridian is with the Faculty of Engineering, Science and Research functional). Branch, Islamic Azad University, Tehran, Iran. Sadiq Hussain is System Administrator at Dibrugarh University, Assam, Analyzing anatomy and structural connections of brain areas India, 786004. with structural neuroimaging is an essential tool for study- A. Zare is with Faculty of Electrical Engineering, Gonabad Branch, Islamic ing structural disorders of the brain in ASD. The principal Azad University, Gonabad, Iran. R. Alizadehsani, A. Khosravi and S. Nahavandi. are with the Institute tools for structural brain imaging are magnetic resonance for Intelligent Systems Research and Innovation (IISRI), Deakin University, imaging (MRI) techniques [7], [8], [9]. Cerebral anatomy is Victoria 3217, Australia. investigated by structrul MRI (sMRI) images and anatomical U. R. Acharya is with the Dept. of Electronics and Computer Engineering, Ngee Ann Polytechnic, Singapore 599489, Singapore, the Dept. of Biomedical connections are assesed by diffusion tensor imaging MRI Informatics and Medical Engineering, Asia University, Taichung, Taiwan, and (DTI-MR) [10]. Investigating the activity and functional con- the Dept. of Biomedical Engineering, School of Science and Technology, nections of brain areas using functional neuroimaging can Singapore University of Social Sciences, Singapore. M. Berk is with the Deakin University, IMPACT - the Institute for also be used for studying ASD. Brain functional diagnostic Mental and Physical Health and Clinical Translation, School of Medicine, tools are older approaches than structural methods for studying Barwon Health, Geelong, Australia, and the Orygen, The National Centre ASD. The most basic modality of functional neuroimaging is of Excellence in Youth Mental Health, Centre for Youth Mental Health, Florey Institute for Neuroscience and Mental Health and the Department of electroencephalography (EEG), which records the electrical Psychiatry, The University of Melbourne, Melbourne, Australia. activity of the brain from the scalp with a high temporal
2 resolution (in milliseconds order) [11]. Studies have shown that employing EEG signals to diagnose ASD have been useful [12], [13], [14]. Functional MRI (fMRI) is one of the most promising imaging modalities in functional brain disorders, used as task-based (T-fMRI) or restingstate (rs-fMRI) [15], [16]. fMRI-based techniques have a high spatial resolution (in the order of millimeters) but a low temporal resolution due to slow response of the hemodynamic system of the brain as well as fMRI imaging time constraints and is not ideal for recording the fast dynamics of brain activities. In addition, Fig. 1: Number of papers published every year for ASD these techniques have a high sensitivity to motion artifacts. diagnosis and rehabilitation. It should be stressed that in consonance with studies, three less prevalent modalities of electrocorticography (ECoG) [17], functional near-infrared spectroscopy (fNIRS) [18], and Mag- netoencephalography (MEG) [19] can also attain reasonable performance in ASD diagnosis. An appropriate approach is to utilize machine-learning techniques alongside functional and structural data to collaborate with physicians in the process of accurately assessing ASD. In the field of ASD, applying machine learning methods generally entail two categories of traditional methods [20] and DL methods [21]. As opposed to traditional methods, much less work has been done on DL methods to explore ASD or design rehabilitation tools. This study reviews ASD assesment methods and patients’ Fig. 2: Illustration of various types of DL methods. rehabilitation with DL networks. The outline of this paper is as follows. Section 2 is search strategy. Section 3 concisely presents the DL networks employed in the field of ASD. In reinforcement learning [28], and a variety of DL networks are section 4, existing computer-aided diagnosis systems (CADS) provided for each type. So far, most studies applied to identify that use brain functional and structural data are reviewed. ASD using DL have been based on supervised or unsupervised In section 5, DL-based rehabilitation tools for supporting approaches. Figure 2 illustrates generally employed types of ASD patients are introduced. Section 6 discusses the reviewed DL networks with supervised or unsupervised learning to papers. Section 7 reveals the challenges of ASD diagnosis study ASD. and rehabilitation with DL. Finally, the paper concludes and suggests future work in section 8. IV. CADS- BASED DEEP LEARNING TECHNIQUES FOR ASD DIAGNOSIS BY NEUROIMAGING DATA II. S EARCH S TRATEGY A traditional artificial intelligence (AI)-based CADS en- In this review, IEEE Xplore, ScienceDirect, SpringerLink, compasses several stages of data acquisition, data pre- ACM, as well as other conferences or journals were used to processing, feature extraction, and classification [29], [30], acquire papers on ASD diagnosis and rehabilitation using DL [31], [32]. In [33], [34], [35] existing traditional algorithms for methods. Further, the keywords ”ASD”, ”Autism Spectrum diagnosing ASD have been reviewed. In contrast to traditional Disorder” and ”Deep Learning” were used to select the papers. methods, in DL-based CADS, feature extraction, and classi- The papers are analyzed till June 03th, 2020 by the authors fication are performed intelligently within the model. Also, (AK, SN). Figure 1 depicts the number of considered papers due to the structure of DL networks, using large dataset to using DL methods for the automated detection and rehabilita- train DL networks and recognize intricate patterns in datasets tion of ASD each year. is incumbent. The components of DL-based CADS for ASD detection are shown in Figure 3. It can be noted from the III. D EEP L EARNING T ECHNIQUES FOR ASD D IAGNOSIS figure that, large and free databases are first introduced to AND R EHABILITATION diagnose ASD. In the second step, various types of pre- processing techniques are used on functional and structural Nowadays, DL algorithms are used in many areas of data to be scrutinized. Finally, the DL networks are applied medicine including structural and functional neuroimaging. on the preprocessed data. The application of DL in neural imaging ranges from brain MR image segmentation [22], to detection of brain lesions such as tumors [23], diagnosis of brain functional disorders A. Neuroimaging ASD Datasets such as ASD [24], and production of artificial structural or Datasets are the heart of any CADS development and the functional brain images [25]. Machine learning techniques capability of CADS depends primarily on the affluence of are categorized into three fundamental categories of learning: the input data. To diagnose ASD, several brain functional supervised learning [26], unsupervised learning [27], and and structural datasets are available. The most complete free
3 Fig. 3: Block diagram of CAD system using DL architecture for ASD detection. dataset available is ABIDE [36] dataset with two subsets: motion correction is to align all images to a reference image so ABIDE-I and ABIDE-II, which encompasses sMRI, rs-fMRI, that the coordinates and orientation of the voxels be identical and phenotypic data. ABIDE-I involves data from 17 inter- in all fMRI volumetric images [43], [44], [45]. national sites, yielding a total of 1112 datasets, including 539 S LICE T IMING C ORRECTION : The purpose of modifying individuals with ASD and 573 healthy individuals (ages 64-7). the slice time is to adjust the time series of the voxels so that In accordance with HIPAA guidelines and 1000 FCP / INDI all the voxels in each fMRI volume image have a common protocols, these data are anonymized. In contrast, ABIDE- reference time. Usually, the corresponding time of the first II contains data from 19 international sites, with a total of slice recorded in each fMRI volume image is selected as the 1114 datasets from 521 individuals with ASD and 593 healthy reference time [43], [44], [45]. individuals (ages 5-64). Also, preprocessed images of the I NTENSITY N ORMALIZATION : at this stage, the average ABIDE-I series called PCP [37] can be freely downloaded by intensity of fMRI signals are rescaled to compensate for global the researchers. The second recently released ASD diagnostic deviations within and between the recording sessions [43], database is called NDAR, which comprises various modalities, [44], [45]. and more information is provided in [38]. R EGISTRATION TO A STANDARD ATLAS : The human brain entails hundreds of cortical and subcortical areas with var- B. Preprocessing Techniques ious structures and functions, each of which is very time- Neuroimaging data (especially functional ones) is of rela- consuming and complex to study. To overcome the problem, tively complicated structure, and if not pre-processed properly, brain atlases are employed to partition brain images into a it may affect the final diagnosis. Preprocessing of this data confined number of ROIs, following which the mean time typically entails multiple common steps performed by different series of each ROI can be extracted [46]. ABIDE datasets use a software as standard. Indeed, occasionally prepared pipelines manifold of atlases, including Automated Anatomical Labeling are applied on the dataset to yield pre-processed data for future (AAL) [47], Eickhoff-Zilles (EZ) [48], Harvard-Oxford (HO) researches. In the following section, preprocessing steps are [49], Talaraich and Tournoux (TT) [50], Dosenbach 160 [51], briefly explained for fMRI data. Craddock 200 (CC200) [52] and Craddock 400 (CC400) [53] 1) Standard (Low-level) fMRI preprocessing steps: Low- and more information is provided in [54]. Table I provides level pre-processing of fMRI images normally has fixed num- complete information on preprocessing tools, atlases, and ber of steps applied on the data, and prepared toolboxes some other preprocessing information. are usually used to reduce execution time and yield better 2) Pipeline Methods: Pipelines present preprocessed im- accuracy. Some of these reputable toolboxes contain FMRIB ages of ABIDE databases. They embrace generic pre- software libraries (FSL) [39], BET [40], FreeSurfer [41], processing procedures. Employing pipelines, distinct methods and SPM [42]. Also, important and vital fMRI preprocess- can be compared with each other. In ABIDE datasets, pre- ing incorporates brain extraction, spatial smoothing, temporal processing is performed by four pipeline techniques: neu- filtering, motion correction, slice timing correction, intensity roimaging analysis kit (NIAK) [55], data processing assistant normalization, and registration to standard atlas, which are for rs- fMRI (DPARSF) [56], the configurable pipeline for the summarized as follows: analysis of connectomes (CPAC) [57], or connectome com- B RAIN EXTRACTION : the goal is to remove the skull and putation system (CCS) [58]. The preprocessing steps carried cerebellum from the fMRI image and maintain the brain tissue out by the various pipelines are comparatively analogous. The [43], [44], [45]. chief differences are in the particular algorithms for each S PATIAL SMOOTHING : involves averaging the adjacent vox- step, the software simulations, and the parameters applied. els signal. This process is persuasive on account of neighbor- Details of each pipeline technique are provided in [54]. Table I ing brain voxels being usually closely related in function and demonstrates the pipeline techniques used in autism detection blood supply [43], [44], [45]. exploiting DL. T EMPORAL FILTERING : the aim is to eliminate unwanted 3) High-level preprocessing Steps: High-level techniques components from the time series of voxels without impairing for pre-processing brain data are important, and using them ac- the signal of interest [43], [44], [45]. companying preliminary pre-processing methods can enhance R EALIGNMENT (M OTION C ORRECTION ): During the the accuracy of ASD recognition. These methods are applied fMRI test, people often move their heads. The objective of after the standard pre-processing of functional and structural
4 Fig. 4: Overall block diagram of a 2D-CNN used for ASD detection. Fig. 5: Overall block diagram of a 3D-CNN used for ASD detection. brain data. These include sliding window (SW) [24], data markedly lessened and make it feasible to train these networks augmentation (DA) [59], functional connectivity matrix (FCM) with smaller databases [21]. Figure 4 shows the block digram estimation [60], [61] and applying fast Fourier transformation of 2D-CNN used for ASD detection. (FFT) [62]. Furthermore, some of the researches utilized 3D-CNN feature extraction [63] techniques and some also use feature By transforming the data into three dimensions, the con- selection methods. Precise information on reviewed studies is volution network will also be altered to a three-dimensional indicated in detail in Table I. format (Figure 5). It should be noted that the manipulation of three dimensional CNN (3D-CNN) networks is less beneficial C. Deep Neural Networks than 1D-CNN and 2D-CNN networks for diverse reasons. Deep learning in various medical applications, including the First, the data required to train these networks must be much diagnosis of ASD, has become extremely popular in recent larger which conventionally such datasets are not utilizable and years. In this section of the paper, the types of Deep Learning methods such as pre-training, which are extensively exploited networks used in ASD detection are examined, which include in 2D networks, cannot be used here. Another reason is that CNN, RNN, AE, DBN, CNN-RNN, and CNN-AE models. with more complicated structure of networks, it becomes much 1) Convolutional Neural Networks (CNNs) : In this section, tougher to fix the number of layers, and network structure. the types of popular convolutional networks used in ASD The 3D activation map generated during the convolution of a diagnosis are surveyed. These networks involve 1D-CNN, 2D- 3D CNN is essential for analyzing data where volumetric or CNN, 3D-CNN models, and a variety of pre-trained networks temporal context is crucial. This ability to analyze a series of such as VGG. frames or images in context has led to the use of 3D CNNs as 1D AND 2D-CNN tools for action detection and evaluation of medical imaging. There are many spatial dependancies present in the data [65]. and it is difficult to extract these hidden signatures from the 2) Deep Belief Networks (DBNs): DBNs are not popular data. Convolution network uses a structure alike to convolu- today as they used to be, and have been substituted by new tion filters to extract these features properly and contribute models to perform various applications ( e.g., autoencoders for to the knowledge that features should be processed taking unsupervised learning, generative adversarial networks (GAN) into account spatial dependencies; so the number of network for generative modes [66], variational autoencoders (VAE) parameters are significantly reduced. The principal application [67]). However, disregarding the restricted use of these net- of these networks is in image processing and due to the works in this era, their influence on the advancement of neural two-dimensional (2D) image inputs, convolution layers form networks cannot be overlooked. The use of these networks 2D structures, which is why these networks are called 2D in this paper is related to the feature extraction without a convolutional neural network (2D-CNN). By using another supervisor or pre-training of networks. These networks serve type of data, one-dimensional signals, the convolution layers’ as unsupervised, consisting of several layers after the input structure also resembles the data structure [64]. In convo- layer, which are shown in Figure 6. The training of these lution networks, assuming that various data sections do not networks is done greedily and from bottom to top, in other require learning different filters, the number of parameters are words, each separate layer is trained and then the next layer is
5 Fig. 6: Overall block diagram of a DBN used for ASD detection. Fig. 8: Overall block diagram of an LSTM used for ASD detection. Fig. 7: Overall block diagram of an AE used for ASD detection. appended. After training, these networks are used as a feature extractor or the network weights are used as initial weights of a network for classification [21]. 3) Autoencoders (AEs): Autoencoders (AEs) are more than 30 years old, and have undergone dramatic changes over the years to enhance their performance. But the overall structure of these networks has remained the same [21].These networks Fig. 9: Overall block diagram of a CNN-RNN used for ASD consist of two parts: coder and decoder so that the first part detection. of the input leads to coding in the latent space, and the decoder part endeavors to convert the code into preliminary data (Figure 7). Autoencoders are a special type of feedforward most efforts have been made to enhance these two structures neural networks where the input is the same as the output. and make them resistant to challenges (e.g., GRU-D [68] is They compress the input into a lower-dimensional code and used to find the lost data). then reconstruct the output from this representation. The code 5) CNN-RNN: The initial idea in these networks is to utilize is a compact summary or compression of the input, also called convolution layers to amend the performance of RNNs so that the latent-space representation. Various methods have been the advantages of both networks can be used; CNN-RNN, proposed to block the data memorization by the network, on the one hand, can find temporal dependencies with the including sparse AE (SpAE) and denoising AE (DAE) [21]. aid of RNN, and on the other hand, it can discover spatial Trained properly, the coder part of an Autoencoder can be used dependencies in data with the help of convolution layers [69]. to extract features; creating an unsupervised feature extractor. These networks are highly beneficial for analyzing time series 4) Recurrent Neural Networks (RNNs): In convolution net- with more than one dimension (such as video) [70] but further works, a kind of spatial dependencies in the data is addressed. to the simpler matter, these networks also yield the analysis But interdependencies between data are not confined to this of three-dimensional data so that instead of a more complex model. For example in time-series, dependencies may be design of a 3D-CNN, a 2D-CNN with an RNN is occasionally highly distant from each other, on the other hand, the long- used. The superiority of this model is due to the feasibility term and variable length of these sequences results in that of employing pre-trained models. Figure 9 demonstrates the the ordinary networks do not perform well enough to process CNN-RNN model. these data. To overcome these problems, RNNs can be used. 6) CNN-AE: In the construction of these networks, the LSTM structures are proposed to extract long term and short principal aim and prerequisite have been to decrease the term dependencies in the data (Figure 8). Another well-known number of parameters. As shown before, changing merely the structure called GRU is developed after LSTM, and since then, network layers to convolution markedly lessens the number
6 Fig. 11: Block diagram of ios application for ASD rehabilita- tion. Fig. 10: Overall block diagram of a CNN-AE used for ASD detection. of parameters; combining AE with convolution structures also makes significant contribution. This helps to exploit higher dimensional data and extracts more information from the data Fig. 12: Cloud system design for ASD rehabilitation. without changing the size of the database. Similar structures, with or without some modifications, are widely deployed for image segmentation [71], and likewise unsupervised network B. Cloud Systems can be applied for network pre-training or feature extraction. Mohammadian et al. [74] proposed a new application of DL Figure 10 depicts the CNN-AE network used for ASD detec- to facilitate automatic stereotypical motor movement (SMM) tion. Tables I and II, provide the summary of papers published identification by applying multi-axis inertial measurement on detection and rehabilitation of ASD patients using DL, units (IMUs). They applied CNN to transform multi-sensor respectively. time series into feature space. An LSTM network was then combined with CNN to obtain the temporal patterns for SMM V. D EEP L EARNING T ECHNIQUES FOR ASD identification. Finally, they employed the classier selection R EHABILITATION voting approach to combine an ensemble of the best base learners. After various experiments, the superiority of their Rehabilitation tools are employed in multiple fields of proposed procedure over other base methods was proven. medicine and their main purpose is to help the patients to Figure 12 shows the real-time SMM detection system. First, recover after the treatment. Various and multiple rehabilitation IMUs, which are wearable sensors, are used for data collec- tools using DL algorithms have been presented. Rehabilitation tion; the data can then be analyzed locally or remotely (using tools are used to help ASD patients using mobile, computer Wi-Fi to transfer data to tablets, cell phones, medical center applications, robotic devices, cloud systems, and eye tracking, servers, etc.) to identify SMMs. If abnormal movements are which will be discussed below. Also, the summary of papers detected, an alarm will be sent to a therapist or parents. published on rehabilitation of ASD patients using DL algo- rithms are shown in Table II. C. Eye Tracking Wu et al. [75] proposed a model of DL saliency predic- A. Mobile and Software Applications tion for autistic children. They used DCN in their proposed paradigm, with a SM saliency map output. The fixation density Facial expressions are a key mode of non-verbal commu- map (FDM) was then processed by the single-side clipping nication in children with ASD and play a pivotal role in (SSC) to optimize the proposed loss function as a true label social interactions. Use of BCI systems provides insight into along with the SM saliency map. Finally, they exploited an the user’s inner-emotional state. Valles et al. [72] conducted autism eye-tracking dataset to test the model. Their proposed research focused on mobile software design to provide assis- model outperformed other base methods. Elbattah et al. [76] tance to children with ASD. They aimed to design a smart aimed to combine unsupervised clustring algorithms with deep iOS app based on facial images according to Figure 11. In learning to help ASD rehabilitation. The first step involved the this way, people’s faces at different angles and brightness visualization of the eye-tracking path, and the images captured are first photographed, and are turned into various emoji from this step were fed to an autoencoder to learn the features. so that the autistic child can express his/her feelings and Using autoencoder features, clustering models are developed emotions. In this group’s investigation [72], Kaggle’s (The using the K-Means algorithm. Their method performed better Facial Expression Recognition 2013) and KDEF (Kaggle’s than other state-of-art techniques. FER2013 and Karolinska Directed Emotional Faces) databases were used to train the VGG-16. In addition, the LEAP system was adapted to train the model at the University of Texas. The research provided the highest rate accuracy of 86.44%. In another similar study, they achieved an accuracy of 78.32% [73].
TABLE I: Summary of articles published using DL methods for neuroimaging-based ASD detection. Neuroimaging Number Image Preprocessing High level Inputs DNN Number of Performance Work Datasets Pipelines DNN Classifier K fold Modalities of Cases Atlas Toolbox Preprocessing DNN Toolbox Layers Criteria (%) T-fMRI 82 ASD Single Mean Channel Input Majority [24] Clinical Acquisition residual NA MNI152 FSL SW Single STD Channel Input NA 2CC3D 17 NA F1-Score = 89 48 HC Voting -fMRI Combined 2-Channel Input 82 ASD SVE, C-SVE, H-SVE, Monte Mean-Channel Sequence [77] Clinical Acquisition T-fMRI NA AAL NA NA 2CC3D 14 Sigmoid NA Acc = 97.32 48 HC Carlo Approximation STD-Channel HCP Dataset in T-fMRI 68 Subjects with Dictionary Learning [78] 7 Tasks and NA NA FSL Functional RNSs Maps NA 3D-CNN 8 Softmax NA Acc = 94.61 the HAFNI Project rs-fMRI and Sparse Coding 1 rs-fMRI Data 21 ASD [59] Clinical Acquisition T-fMRI NA AAL FSL DA ROIs Time-Series Keras LSTM 7 Sigmoid 10 Acc=69.8 19 HC T-fMRI 1711 ASD SPM Wavelet Transform and Ensemble AUROC=0.92 [79] Different Datasets rs-fMRI NA AAL FCM Keras 2D-CNN 14 Softmax NA 15903 HC SpeedyPP Different Techniques Ensemble Acc=85.19 Phenotypic Info Clinical 82 ASD SW Original fMRI Sequence [80] T-fMRI NA AAL Neurosynth NA 2CC3D 16 Sigmoid NA Acc= 87.1 Acquisition 48 HC Corrupting Strategy Mean-Channel Sequence 41 ASD Prediction Distribution std-channel sequence [80] ABIDE-I rs-fMRI NA AAL FSL NA 2CC3D 16 Sigmoid NA Acc= 85.3 54 HC Analysis Corrupt a ROI of Original Image ABIDE-I 379 ASD, 395 HC All Atlases Concatenating Voxel-Level Maps [81] rs-fMRI CPAC NA FCM NA 3D-CNN 7 Sigmoid 10 Acc= 73.3 ABIDE II 163 ASD, 230 HC ABIDE of Connectivity Fingerprints CC-200 Acc=70.1 505 ASD Masking [82] ABIDE-I rs-fMRI CPAC AAL NA FCM, DA PyTorch AE NA SLP 10 Sen=67.8 530 HC Correlations Dosenbach160 Spec=72.8 [83] ABIDE-I rs-fMRI 872 subjects CPAC HO Nilearn NA Raw Images NA G-CNNs 5 Softmax 10 Acc=70.86 BrainNetCNN Acc= 68.7 474 ASD Functional [84] ABIDE-I rs-fMRI CCS AAL NA FCM NA with Proposed 15 Softmax 5 Sen= 69.2 539 HC Connectomes Layers Spe= 68.3 SPM8 Pearson Correlation 13 ASD Qcut, NMI [85] ABIDE-I rs-fMRI NA AAL REST Coefficient Matrix NA DAE NA NA NA Acc=54.49 22 HC Statistic Matrix DPARSF NMI Matrix FSL Acc=100 11 ASD NA Convert NII Files Preprocessed [86] ABIDE rs-fMRI NA Caffe LeNet-5 Standard Softmax NA Sen=99.99 16 HC BET to PNG Images PNG Images Spec=100 55 ASD FCM, Feature Multiple Softmax [87] ABIDE-I rs-fMRI NIAK AAL NA Whole-Brain FCPs NA 4 5 Acc=86.36 55 HC Selection SAEs regression ABIDE-I 54 ASD Images with 95 68, 79 68, Keras Dimension Reduction Acc=72.73 62 HC with [62] ABIDE-II rs-fMRI NA NA SPM8 and 79 95 Dimensions, Around MCNNEs 9 Binary SR 10 Sen=71.2 156 ASD Theano FFT the x, y, and z Axes Spec=73.48 ABIDE-I + II 187 HC backend Creating Stochastic 542 ASD All Gray Matter Various [88] ABIDE rs-fMRI CPAC NA Parcellations by Poisson NA 3D-CNN 6 10 Acc=72 625 HC Atlases Mask Parcellations Methods Disk Sampling 465 ASD Edge Weights of Subjects’ [89] ABIDE-I rs-fMRI DPARSF AAL NA FCM Keras VAE 3 NA NA 507 HC Brain Graph NA 539 ASD Craddock Mean Time Courses [90] ABIDE-I rs-fMRI CCS Neurosynth DA Keras LSTM 5 Sigmoid 10 Acc=68.5 573 HC 200 from ROIs Slicetiming, Spatial rs-fMRI, 505 ASD Standardization, Smoothing, 4005-Dimensional Acc=61 [91] ABIDE Phenotypic NA CC200 DPABI NA SAE 3 Clustering NA 530 HC Filtering, Removing Eigenvector F1-Score=60.2 Info Covariates, FCM, AE-MKFC Independent Components Acc=87.21 42 ASD Time Courses of [92] ABIDE rs-fMRI NA NA FSL (Time Course, Power NA SAE 9 Softmax 21 Sen=89.49 42 HC Each Subject Spectrum and Spatial Map) Spec=83.73 NY site UM site fMRI ROI Time-Series, [93] ABIDE-I rs-fMRI CCS AAL Neurosynth DA Keras LSTM 6 Sigmoid 10 Acc=74.8 US site Functional Connectivity UC site HO Acc=73.2 408 ASD AAL 3 Different FCM+ [94] ABIDE-I rs-fMRI CPAC FSL NA Keras DANN 25 Sigmoid 10 Sen=74.5 401 HC Demographic Data CC200 Spec=71.7 Avg Acc= 67.1 DTL-NN Framework: At Least 60 Softmax Avg Sen=65.7 [95] ABIDE rs-fMRI CCS AAL FSL Offline Learning, Transfer FC Patterns NA SSAE 4 5 Subjects regression Avg spec=68.3 Learning FCM AUC=0.71 AAL FAST 993 ASD Schaefer-100 Mean Time-Series [96] ABIDE I+II rs-fMRI NA BET NA NA 1D-CNN 5 Softmax 10 Acc=68 1092 HC HO within Each ROI Schaefer-400 FAST 7
8 4 Deep Acc=87 529 ASD All Single Volume Glass Brain and Ensemble F1-score=86 [97] ABIDE-I rs-fMRI NA NA Keras 16 Sigmoid NA 573 HC Pipelines Image Generator Stat Map Images Classifier Recall=85.2 techniques Pre=86.8 [98] ABIDE-II rs-fMRI 303 ASD, 390 HC NA NA FSL NA 1D Time Series from Voxels NA 1D-CAE 14 NA NA Acc= 65.3 Thresholding Based [99] ABIDE rs-fMRI 40 ASD, 40 HC CCS NA NA WM, GM, CSF NA AlexNet Standard Softmax NA Acc=82.61 Segmentation Parcellated DA Using SMOTE Acc=82 Whole All Upper Triangle Part of ASD- [100] ABIDE rs-fMRI into 200 NA and Graph Network NA Proposed SLP NA Sen=79.1 Dataset Pipelines the Correlation Matrix DiagNet Regions Motifs, FCM Spec=83.3 Time Series Extraction from Different Regions, Auto- Acc=80 12 ASD [60] ABIDE-I rs-fMRI CPAC SCSC NA FCM PyTorch ASD- Proposed SVM 5 Sen=73 14 HC Connectivity Matrix, Network Spec=83 SMOTE Algorithm Acc=70.20 505 ASD [61] ABIDE-I rs-fMRI CPAC CC400 NA FCM FCM NA 2D-CNN 20 MLP 10 Sen=77.00 530 HC Spec=61.00 Acc=70 505 ASD 1D CNN [101] ABIDE rs-fMRI NA NA NA FCM 1D-Vector of FCM NA 7 Softmax NA Sen=74 530 HC -AE Spec=63 539 ASD Mean DSC=91.56 [102] ABIDE-I rs-fMRI NA NA FreeSurfer NA Single 3D Image Theano 3D-FCNN 13 Softmax 6 573 HC Mean MHD=14.05 Acc=93.59 501 ASD FCM, Converting to 1000 Features Selected Different [103] ABIDE-I rs-fMRI DPARSF AAL NA NA SSAE 3 Softmax Sen=92.52 553 HC 1D-Vector by the SVM-RFE Folds Spec=94.56 Online Dictionary Learning FSL Average Acc= 70.5 100 ASD and Sparse Representation 4D Matrix with 150 3D [104] ABIDE-I rs-fMRI NA NA Theano 3D-CNN 14 NA 10 Average Sen= 74 100 HC Techniques, Generating Network Overlap Maps FEAT Average Spec= 67 Spatial Overlap Patterns Population Graph Construction, rs-fMRI & 529 ASD Population Scikit [105] ABIDE-I CPAC HO FSL Feature Selection Strategies GCN 11 Softmax 10 Acc=80.0 Phenotypic Info 571 HC Graph -learn (RFE, PCA, MLP, AE) rs-fMRI & 403 ASD Mean Time-Series [106] ABIDE-I CCS CC200 NA DA Keras LSTM 6 Sigmoid 10 Acc=70.1 Phenotypic Info 468 HC from ROIs rs-fMRI & Two Acc=70 S-MRI & 505 ASD Craddock [107] ABIDE-I CPAC NA FCM 1D-Vector of FCM NA SdAE NA Softmax 10 Sen=74 Phenotypic 530 HC 200 + MLP Spec=63 Info rs-fMRI FSL 75 Qualified Brainsuite Extraction of Fetal Mean Time Series F1-score=84 [108] Clinical Acquisition Fetal NA NA PyTorch 3D-CNN 7 Sigmoid NA Subjects SPM8 Brain fMRI Data, SW of 3D fMRI Volumes AUC=91 BOLD fMRI CONN ABIDE-I rs-fMRI Segmentation, Average Acc=65.56 116 ASD Rs-fMRI + GM+WM [109] NA AAL SPM8 Mean Time Series Theano DBN 6 LR 10 Sen=84 69 HC Data Fusions ABIDE-II s-MRI of Each ROI Spec=32.96 FCM Vector Keras rs-fMRI 418 ASD All FCM, Features Anatomical Features Tensor- Different Various [110] IMPAC NA NA Combination of Both Flow 8 3 AUC= 80 497 HC atlases Extraction from S-MRI Networks Methods s-MRI Anatomical and Connectivity Caffe FCM Vector rs-fMRI AAL Label Fus- Ensemble ion Using Acc= 85.06 368 ASD CC200 FCM, of 5 Stacked [111] ABIDE-I CPAC Freesurfer 1D-Vector of FCM NA 31 the Average 10 Sen= 81 449 HC Fisher Score AEs and s-MRI of Softmax Spec= 89 Destrieux MLP Probabilities rs-fMRI 61ASD Data-Driven Landmark Discovery 50 Patches Extracted Multi- [112] NDAR NA NA NA NA 13 Softmax 10 Acc=76.24 s-MRI 215 HC Algorithm, Patch Extraction from 50 Landmarks Channel CNN FSL Each Acc=88.5 All 78 ASD Proposed PICA, Extraction PSDs of 34 34 SAE [63] NDAR NA NA PSVM NA Sen=85.1 Modalities 124 HC Atlas of PSD Components SAEs Has 2 BET Spec=90.4 Layers s-MRI In-House 11 [113] NDAR 60 ASD, 211 HC NA NA 3D Patches Extraction Patch Size 166416 NA DDUNET NA 5 NA Tools Blocks ABIDE-I 21 ASD, 21 HC FSL Segmentation, Shape CDF Values [114] NDAR/Pitt s-MRI 16 ASD, 16 HC NA NA NA SNCAE NA Softmax NA Acc=96.88 iBEAT Feature Extraction of Features NDAR/IBIS 10 ASD, 10 HC
Acc=90.39 78 ASD Construction of Individual [115] ABIDE-I s-MRI NA Destrieux FreeSurfer 3000 Top Features NA SAE 3 Softmax 10 Sen=84.37 104 HC Network, F-score Spec=95.88 HCP 1113 HC Desikan Normalization, Apply Tensor- [116] s-MRI 83 ASD NA FreeSurfer Preprocessed Images Flow DEA 3 NA 10 AUC=63.9 ABIDE-I Killia One-Hot Coding 105 HC Keras ABIDE 32 Slices Acc=84 [117] s-MRI 1112 Subjects NA NA SPM12 NA Along Each Axial, Keras DCNN 17 Sigmoid NA Sen=77 CombiRx Coronal, and Sagittal Spec=85 Coronal, Axial and Sagittal FastSur- [118] ABIDE-II s-MRI NA NA DKT FreeSurfer Segmentation PyTorch Proposed Softmax NA NA 2D Slices fer CNN HO Cortical and GABM Method, New 500 ASD Preprocessed [119] ABIDE-I s-MRI NA Subcortical FSL Chromosome Encoding NA 3D-CNN 11 Softmax 5 Acc=70 500 HC MRI Scans Structural Scheme Atlas Clinical Sparse Annotations, Acc=91.6 [120] s-MRI 48 HC NA NA FreeSurfer Image Patch Caffe 3D-CNN 18 Softmax NA Acquisition DA AUC=94.1 Brain- Filtering, Calculating Acc=74.54 270 ASD Netome Mean Time Series for Mean Time Series Data CNN- [121] ABIDE-I rs-fMRI CPAC NA NA 14 Sigmoid 5 Sen=63.46 305 HC Atlas ROIs Using BrainNetome Stacked Across ROIs GRU Spec=84.33 (BNA) Atlas (BNA), Normalization Transformation of Acc=95.7 Clinical 25 ASD 1D CNN- [122] fNIRS No No No the Time Series PM, GM, SM Keras NA Bagging NA Sen=97.1 Acquisition 22 HC LSTM to Three Variants Spec=94.3 SW Acc=92.2 Clinical 25 ASD [123] fNIRS No No No Converted into the 3D Tensor NA CGRNN 7 NA NA Sen=85.0 Acquisition 22 HC 3D Tensor Spec=99.4 FreeSurfer Different Various FSL [124] s-MRI NA NA Geometric DA 3D Cortical Mask Theano ConvNet U-Nets NA 8 NA Datasets Methods SPM12 VolBrain Performed an Automatic Quality Control, Visually Inspection, MM- 620 ASD 9 Temporal Summary Measures, Majority Acc=64 [125] ABIDE I+II rs-fMRI CPAC HO FSL Each Summary Measure NA ensemble 7 5 2085 HC Mean and STD of the Summary Voting F1-Score=66 (3D-CNN) Measures, Normalization, Occlusion of Brain Regions rs-fMRI 184 ASD 3D-CNN Acc=77 [126] ABIDE-I Phenotypic CPAC NA NA Down Sampling Raw 4D Volume NA 21 Softmax 5 110 HC C-LSTM F1-score=78 Information 264 ROIs AFNI FCM, rs-fMRI and 403 ASD Based Acc= 79.2 [127] ABIDE-I NA FSL Feature Extraction Normalized Features NA AE 7 DNN 10 s-MRI 468 HC Parcellation AUC= 82.4 MATLAB (Different Features) Scheme Acc=71 505 ASD K-Means [128] ABIDE-I rs-fMRI CPAC CC200 NA FCM 1D-Vector of FCM PyTorch CapsNet Standard 10 Sen=73 530 HC Clustering Spec=66 9
10 TABLE II: Summary of papers published on rehabilitation of ASD patients using DL algorithms. Type of Number Inputs DNN Number of Performance Work Datasets Preprocessing DNNs Classifier K fold Applications of Cases DNN Toolbox Layers Criteria (%) Caffe Acc=85 20 ASD HFM Construction, HFMs, Natural [129] OSIE — VGGNeT 50 Softmax 13 Sen=80 19 HC Filtering Normalizing, DA Scene Images TensorFlow Spec=89 70 [73] KDEF Facial Expression Recognition DA RGB Images (562762) Keras DCNN 44 Softmax NA Acc=78.32 Individuals Clinical Detecting Audio Regimes That Directly Noisemes Network Standard Synthetic [130] 33 ASD MFCC Spectrograms 32 Spectrograms NA DiarTK Diarization NA Acc=84.7 Acquisition Estimate ASD Severity Social Affect scores Network RF Network Kaggle’s Keras [72] FER2013 Facial Expression Recognition NA No 4848-Pixel Images (TensorFlow DCNN 44 Softmax NA Acc=86.44 KDEF Backend) Acc=57.90 14 ASD SalGAN Model, [131] SALICON ASD Classification Sequence of Image Patches NA SP-ASDNet 11 NA NA Rec=59.21 14 HC Feature Extraction Pre=56.26 5-channel Sub-Sequence Stacks [132] BigFaceX Facial Expression Recognition 196 Subjects SW, Merge in the Channel Dimension, DA Keras TimeConvNet PreTrain Nets Softmax NA Acc=97.9 within a Specific Time Window Different Interactive and Intelligent [133] Suitable Courseware for Children with ASD NA Different Inputs NA Different Nets NA NA NA NA Datasets Chatbot , NLP, Visual Aid Resizing, Frame Extraction, R-CNN VGG-16 K-NN Acc= 88.2 Estimating Visual Attention in Visual Inspection 5 Facial Landmarks - 36 Pre=83.3 [134] Camera Images 6 ASD and ID NA 10 Robot-Assisted Therapy Face Detection (ViolaJones), HOG Descriptors Sen=83.0 Cascaded CNNs MTCNN Nave Spec=87.3 Feature Extraction (HOG Descriptors) Architecture Sensor 6 ASD Majority [135] Automatic SMM detection Resampling, Filtering, SW Time-Series of Multiple Sensors Keras CNN-LSTM 13 NA NA Data 5 HC Voting Segmentation, Different [136] KOMAA Facial Expression Recognition 55 subjects Greedy Forward Feature Selection NA CNN 9 SVM NA Acc=96 Features, Z-scores Story-Telling 31 ASD Coherence Representation of [137] ASD Classification DA, ChineseWord2Vec 32-Dimensional Word Vector NA LSTM 1 NA Acc=92 Narrative Corpora 36 HC LSTM Forget Gate Ext-Dataset 136 ASD TLD Method, Accumulative Angle Histogram, Length Acc=92.6 [138] ASD Classification using Eye Tracking Keras LSTM 4 NA 10 Sen=91.9 (video dataset) 136 HC Histogram Computation Histogram and Fused Histogram, Spec=93.4 Predicting Visual Attention SIM=67.8 [139] MIT1003 300 Images NA Raw Images NA DCN 26 NA NA CC=76.9 of Children with ASD. AUC-J=83.4 Scan Path Data, 14 ASD Acc=55.13 [75] Including Location ASD Classification DA Methods Image, Data Points Pytorch ResNet18 Standard Softmax NA Sen=63.5 14 HC and Duration Spec=47.1 UCI Machine Acc=99.53 [140] Learning ASD Classification 704 Different Methods Preprocessed Data NA CNN 7 NA NA Sens=99.39 Repository Spec=100 Keras, Eye Tracking 29 ASD Visualization of Eye-Tracking Scanpaths K-Means Silhouette [76] ASD Classification 100*100 Image Scikit- AE 8 NA Scanpath 30 HC Scaling Down, PCA Clustering score=60 Learn Engagement Estimation Keras R-CNN + ICC=43.35 Cropped Face Images with [141] Video Data of Children with ASD During a 30 children NA CultureNet ResNet50+ Softmax NA CCC=43.18 (256 *256) TensorFlow Robot-Assisted Autism Therapy 5FC layers PC=45.17 Backend YouTube Avg Pre=73 Modeling Typical and Atypical Behaviors Sequences of Individual Frames openCV, [142] ASD 68 video Clips Different Methods DCNN NA DT 5 Avg Recall=75 in ASD Children at a Rate of 30 fps Caffe Dataset Avg Acc=71 Behavioral Data Extracted Segmentation, Upper Body tracking, Acc=88.46 Video 5 ASD 3 Movement Features with 68 [143] from Video Analysis of Laban Movement Analysis to NA CNN 10 Softmax NA Pre=89.12 Dataset 7 HC Facial Key-Points Child-Robot Interactions. Drive Weight, Different features Recall=88.53 Video Developing Automatic SMM Resampling, Filtering, SW, Time-Series of Multiple Deeppy [144] 6 ASD CNN 8 SVM NA F1-score=95 Dataset Detection Systems Data Balancing, Normalizing Accelerometer Sensors Library Acc=100 The Embedded Categorical Variables 513 ASD Cleaning Missing Values and Outliers, Spec=99 [145] ASD Screening Autism Screening are Concatenated with Numerical NA DENN 4 Sigmoid NA 189 HC Visualization, Identity Mapping Sen=100 Features as New Feature Vectors F1-score=99 ASD Screening Classification of Handling of Missing Values, Variable Acc=99.40 [146] — Reduction, Normalization, and Normalized Variables Keras DNN 7 Sigmoid NA Sen=97.89 Datasets Adults with ASD Label Encoding Spec=100
11 Fig. 13: Number of DL tools used for the diagnosis and Fig. 14: Number of of DL networks used for ASD detection rehabilitation of ASD patients in reviewed papers. and rehabilitation in the reviewed works. VI. D ISCUSSION In this study, we performed a comprehensive overview of the investigations conducted in the scope of ASD diagnostic CAD systems as well as DL-based rehabilitation tools for ASD patients. In the field of ASD diagnosis, numerous papers have been published using functional and structural neuroimaging data as well as rehabilitation tools, as summarized in Table III in the appendix. A variety of DL toolboxes have been proposed for implementing deep networks. In Tables I and II the types of DL toolboxes utilized for each study are depicted, and the total number of their usage is demonstrated in Figure 13. The Keras toolbox is used in the majority of the studies due to its simplicity. Keras offers a consistent high- level application programming interface (APIs) to build the models more straightforward, and by using powerful backends such as TensorFlow, its performance is sound. Additionally, Fig. 15: Number of various classification algorithms used for due to all pre-trained models and available codes on platforms the detection of ASD and rehabilitation in DL. such as GitHub, Keras is quite popular among researchers. Number of DL networks used for the ASD detection in the reviewed works is shown in Figure 14. Among the various DL architectures, CNN is found to be the most popular one and healthy) available in the public domain. Hence, researchers as it has achieved more promising results compared to other are not able to broaden their investigation to all sub-types deep methodologies. The autoencoder, as well as RNN, have of ASD. Two of the cheapest and most pragmatic functional yielded favorable results. It can be noted that in recent years, neuro-screening modalities for diagnosis of ASD are EEG, and the number of DL-based papers has increased exponentially fNIRS. But unfortunately the deficiency of freely available due to their sound performance and also the availability of datasets has resulted in little research in this area. Another vast and thorough datasets. obstacle is that multi-modality databases such as EEG-fMRI The number of various classification algorithms used in DL are not available to researchers to evaluate the effectiveness of networks are shown in Figure 15. One of the best and most incorporating information of different imaging modalities to widely used is the Softmax algorithm (Tables I and II). It is detect ASD. Although fMRI and sMRI data are ubiquitous in most popular since it is differentiable in the entire domain and the ABIDE dataset, the results of merging these structural and computationally less expensive. functional data for ASD diagnosis using DL have not yet been investigated. Another problem grappling the researchers is VII. C HALLENGES designing the DL-based rehabilitation systems with hardware Some of the most substantial challenges in ASD diagnosis resources. Nowadays, assistive tools such as Google Colab scope using DL-based techniques are addressed in this section, are available to researchers to improve the processing power; which comprise database and algorithmic problems. There are however, the problems still prevail when implementing these only two-class brain structural and functional datasets (ASD systems in real-world scenarios.
12 VIII. C ONCLUSION AND F UTURE W ORKS The receiver operating characteristic curve (ROC-curve) ASD is typically characterized by social disorders, com- depicts the performance of the proposed model at all clas- munication deficits, and stereotypical behaviors. Numerous sification thresholds. It is the graph of true positive rate vs. computer-aided diagnosis systems and rehabilitation tools have false positive rate (TPR vs. FPR). Equations for calculation of been developed to assist patients with autism disorders. In this TPR and FPR are presented below. survey, research on ASD diagnosis applying DL and func- TP TPR = (6) tional and structural neuroimaging data were first assessed. TP + FN The researchers have taken advantage of deep CNNs, RNNs, FP AEs, and CNN-RNN networks to improve the performance FPR = (7) FP + TN of their systems. Boosting the accuracy of the system, the capability of generalizing and adapting to differing data and A REA UNDER THE ROC C URVE (AUC) real-world challenges, as well as reducing the hardware power AUC presents the area under the ROC-curve from (0, 0) requirements to the extent that the final system can be utilized to (1, 1). It provides the aggregate measure of all possible by all are the principal challenges of these systems. To enhance classification thresholds. AUC has a range from 0 to 1. A the accuracy and performance of CADS for ASD detection in 100% wrong classification will have AUC value of 0.0, while the future, deep reinforcement networks (RL) or GANs can be a 100% correct classified version will have the AUC value exploited. Scarcity of data is always an aparamount problem in of 1.0. It has two folded advantages. One is that it is scale- the medical field that can be resolved relatively with the help of invariant, which implies how well the model is predicted rather these deep GANs. Also, as another direction for future works, than checking the absolute values. The second advantage is handcrafted features can be extracted from data and fed to that it is classification threshold-invariant as it will verify the DL networks in addition to raw data; this can help increasing performance of the model irrespective of the threshold being performance by adding the potential of traditional methods to selected. DL-based models. Many researchers have proposed various DL-based rehabil- A PPENDIX B itation tools to aid the ASD patients. Designing a reliable, Table III shows details of Deep Nets in all the papers accurate, and wearable low power consumption DL algorithm reviewed in this study. based device is the future tool for ASD patients. An achievable rehabilitation tool is to wear smart glasses to help the children ACKNOWLEDGMENT with ASD. These glasses with the built-in cameras will acquire MB is supported by a NHMRC Senior Principal Re- the images from the different directions of environment. Then search Fellowship (1059660 and 1156072). MB has received the DL algorithm processes these images and produces mean- Grant/Research Support from the NIH, Cooperative Research ingful images for the ASD children to better communicate Centre, Simons Autism Foundation, Cancer Council of Vic- with their surroundings. toria, Stanley Medical Research Foundation, Medical Benefits Fund, National Health and Medical Research Council, Medical A PPENDIX A Research Futures Fund, Beyond Blue, Rotary Health, A2 milk S TATISTICAL M ETRICS company, Meat and Livestock Board, Woolworths, Avant and This section demonstrates the equations for the calculation the Harry Windsor Foundation, has been a speaker for Astra of each evaluation metric. In these equations, True positive Zeneca, Lundbeck, Merck, Pfizer, and served as a consultant (TP) is the correct classification of the positive class, True to Allergan, Astra Zeneca, Bioadvantex, Bionomics, Collab- negative (TN) is the correct classification of the negative class, orative Medicinal Development, Lundbeck Merck, Pfizer and False positive (FP) is the incorrect prediction of the positives, Servier - all unrelated to this work. False negative (FN) is the incorrect prediction of the negatives. TP + TN Accuracy(Acc) = (1) TP + TN + FP + FN TN Specif icity(Spec) = (2) TN + FP TP Sensitivity(Sen) = (3) TP + FN TP P recision(P rec) = (4) TP + FP P rec ∗ Sens F 1 − Score = 2 ∗ (5) P rec + Sens R ECEIVER O PERATING C HARACTERISTIC C URVE (ROC- C URVE )
13 TABLE III: Details of Deep Nets. For ASD diagnosis and Rehabilitation. Work Network Details for Deep Networks Dropout Classifier Optimizer Loss Function 2 (rate=0.5) [24] 2CC3D CNN Layers (6) + Pooling Layers (4) + FC Layers (2) Sigmoid NA BCE 2 (rate=0.65) [77] 2CC3D CNN Layers (6) + Pooling Layers (4) + FC Layers (3) NA Sigmoid NA NA [78] 3D-CNN CNN Layers (2) + LReLU Actication + Pooling Layers (1) + FC Layers (1) 3 (rate=NA) Softmax SGD MNLL [59] LSTM LSTM Layers (1) + Pooling Layers (1) + FC Layers (3) 1 (rate=0.5) Sigmoid Adadelta MSE 2 (rate=0.3) [79] CNN CNN Layers (2) + ReLU Activation + BN Layers (4) + FC Layers (3) Softmax Adam NA 2 (rate=0.7) [80] 2CC3D CNN Layers (6) + Pooling Layers (4) + FC Layers (2) 2 (rate=NA) Sigmoid NA NA [81] CNN CNN Layers (2) + ELU Activation + Pooling Layers (2) + FC Layers (2) NA Sigmoid SGD NA MSE [82] AE Standard AE with Tanh Actication NA SLP NA BCE [83] G-CNN Proposed G-CNN with 3 Layer CNN (rate=0.3) Softmax Adam NA BrainNetCNN with Element-wise layer (1) + E2E layers (2) + E2N layer (1) + N2G layer (1) 5 (rate=0.5) [84] Softmax Adam Proposed Loss Function Proposed Layers + FC layers (3)+ Leaky ReLU Activation+ Tanh Activation 1 (rate= 0.6) [85] DAE Standard DAE NA NA NA Proposed Loss function [86] LeNet-5 Standard LeNet-5 Architecture NA Softmax NA NA [87] SAE SAE with LSF Activation NA Softmax LBFGS NA Adam [62] MCNNEs CNN Layers (3) + ReLU Activation + Pooling Layers (3) + FC Layers (1) 1 (rate=0.5) Binary SR BCE Adamax SGD BCE [88] 3D-CNN CNN Layers (2) + ELU Activation + Pooling Layers (2) + FC Layers (3) NA Sigmoid Adam MSD [89] VAE VAE with 3 Layers NA NA Adadelta Proposed Loss Function [90] LSTM LSTM Layers (1) + Pooling Layers (1) + FC Layers (1) 1(rate=0.5) Sigmoid Adadelta BCE [91] SAE SAE Layers (3) + Sigmoid Activation NA Clustering Proposed Opt. NA [92] SAE SAE Layers (8) + Sigmoid Activation NA SR L-BFGS MSE BCE [93] LSTM LSTM Layers (2) + Pooling Layers (1) + FC Layers (2) NA Sigmoid Adam MSE 3 MLP (1 Dropout Layer and 4 Dense Layers) + Self-Attention (3) + Fusion (3) [94] Multichannel DANN 1 (rate=NA) Sigmoid NA CE + Aggregation Layer + Dense Layer (1) + ReLU, ELU, Tanh Activations Scaled Conjugate [95] SSAE 3 SSAE Layers NA Softmax Proposed Loss Function Gradient Descent [96] 1D-CNN CNN Layers (1) + Pooling Layers (1) + FC Layers (1) (rate=0.2) Softmax Adam NA [97] CNN CNN Layers (6) + Pooling Layers (4) + BN Layers (2) + FC Layers (2) 1 (rate=0.25) Sigmoid Adam Propose Loss Function [98] 1D CAE-CNN Encoder (4 layers) + Decoder (4 layers) + CNN layers (2) + pooling layers (2) + FC layers (2) NA NA NA NA [99] AlexNet Standard AlexNet Architecture NA Softmax NA CE [100] ASD-DiagNet Proposed DiagNet NA SLP NA NA [60] Auto-ASD-Network Proposed Auto-ASD-Network NA SVM NA NLLF [61] CNN CNN layers (7) + Pooling layers (7) + FC layers (3) 1 (rate=0.25) MLP NA NA [101] 2 SdAE-CNN Proposed SDAE-CNN with 7 Layes CNN NA Softmax NA NA [102] 3D-FCNN CNN Layers (9) + PReLU Activation + FC Layers (3) NA Softmax SGD CE [103] SSAE 2 Layers SSAE NA Softmax NA NA [104] 3D-CNN CNN layers (7) + Pooling Layers (3) + FC Layers (2) + Log-Likelihood Activation 2 (rate=0.2) NA SGD MNLL GCN GCN with ReLU and Sigmoid Actication CE [105] (rate=0.3) Softmax NA AE SAE wth Tanh Activation MSE BCE [106] LSTM Proposed Deep Nework (rate=0.5) Sigmoid Adadelta MSE [107] 2 SdAE-MLP Proposed 2-SDAE-MLP Network NA Softmax NA MSE [108] 3D-CNN CNN Layers (2) + ReLU Activation + Pooling Layers (2) + FC Layers (2) NA Sigmoid SGD BCE [109] DBN DBN with 5 Hidden Layers NA LR NA NA [110] FeedFWD Dense layers (5) + LReLU activation 3 (rate= NA) NA Adam BCE Ensemble Averaging the Softmax [111] of 5 SAEs 5 [ AE (3) + MLP (2)] + Softmax 5 (rate=NA) NA NA Activation and MLPs Probabilities [112] Multi-Channel CNN CNN Layers (5) + ReLU Activation + Pooling Layers (2) + FC Layers (5) NA Softmax NA CE [63] 34 SAEs 34 [ SAE network (2)] NA PSVM L-BFGS NA [113] DDUNET Proposed DDUNET with 11 blocks and ReLU Activation (rate=0.1) NA SGD CE [114] SNCAE Proposed SNCAE Newtork NA Softmax NA NA [115] SpAE SpAE with 2 Networks NA Softmax NA MSE [116] DAE AE (3) + SELU Activation NA NA Adam Sum of MSE + 2 CE + CC [117] DCNN CNN Layers (6) + ReLU Activation + Pooling Layers (6) + FC Layers (4) NA Sigmoid Adam BCE [118] FastSurfer CNN Proposed FastSurfer CNN Network NA Softmax Adam Logistic & Dice Losses [119] 3D-CNN CNN Layers (3) + ReLU Activation + Pooling Layers (3) + FC Layers (2) 2 (rate=0.5) Softmax Adadelta CE [120] 3D-UNET DCNN Layers (7) + ReLU Activation + Pooling Layers (2) + BN Layers (6) 2 (rate=0.5) Softmax SGD weighted CE [121] CNN-GRU CNN Layers (4) + GRU Layers (2) + ReLU Activation + Pooling Layers (2) + FC Layers (5) NA Sigmoid Adam BCE [122] 1D CNN - LSTM Proposed 1D-CNN LSTM with ReLU Activation (rate=0.2) Softmax Adam CCE CNN layers (3) + ReLU Activation + Pooling Layers (1) [123] CGRNN 1 (rate=0.5) NA Adam BCE + GRU Layers (1) + Sigmoid Activation + FC Layer (1) [124] ConvNet Variation of the U-net Convolutional Architecture NA NA Adam Proposed Loss Function [125] 3D-CNN CNN Layers (2) + ELU Activations + Pooling Layers (2) + FC Layers (2) NA Sigmoid SGD BCE CNN Layers (8) + Conv-Bi LSTM Layers (2) + Sigmoid Activation (for LSTM) [126] 3DCNN C-LSTM 8 (rate=0.2) Softmax Adam CE + Pooling Layers (1) + FC Layers (1) [127] AE Proposed AE with 7 Layers NA DNN NA NA [128] CapsNets Standard Architecture NA K-Means Clustering Adam Proposed Loss Function [129] VGGNets + ASDNet CNN Layers (27) + ReLU Activation + Pooling Layers (10) + FC Layers (6) 6 (rate=0.5) Softmax SGD CE 7 (rate=0.25) [73] DCNN CNN Layers (7) + Activation+ Pooling Layers (13) + FC Layers (3) + BN Layers (10) Softmax SGD NA 3 (rate=0.5) Noisemes net [130] Standard Networks NA RF NA NA DiarTK Diarization net 7 (rate=0.25) [72] DCNN CNN Layers (7) + ELU Activation + Pooling Layers (13) + FC Layers (3) + BN Layers (10) Softmax SGD NA 3 (rate=0.5) [131] SP-ASDNet CNN Layers (2) + LSTM Layers (2) + Pooling Layers (3) + FC Layers (2) 2 (rate=NA) NA Adam BCE Convolutional Spatiotemporal Encoding Layer+ Backbone Convolutional [132] TimeConvNet NA Softmax Adam CCE Neural Network Architecture (Mini-Xception, ResNet20, MobileNetV2) [133] Different Networks Proposed Structure NA NA NA NA RCNN VGG-16 [134] NA KNN NA NA MTCNN Cascaded CNNs Architecture CNN Layers (3) + LSTM Layers (1) + ReLU Activation 1 (rate=0.5) [135] CNN-LSTM Softmax SGD NA + Pooling Layers (3) + FC Layers (3) 1 (rate=0.2) [136] CNN CNN Layers (4) + Pooling Layers (2) + FC Layers (2) NA Softmax NA NA [137] LSTM LSTM layer (1) NA Coherence Representation NA NA [138] LSTM LSTM Layers (3) + Sigmoid Activation + FC Layers (1) NA NA NA CE [139] DCN CNN Layers (17) + Pooling Layers (3) + Deconvolution Layers (3) + Learned Priors (3) NA NA NA Proposed Loss Function [75] Pretrained Resnet18 Standard ResNet-18 Architecture Standard Standard Adam BCE [140] CNN CNN Layers (2) + ReLU Activation + Pooling Layers (2) + FC Layers (2) 1 (rate=0.5) NA Adam BCE [76] AE AE with 8 Layers NA K-Means Clustering NA NA [141] CultureNet Faster R-CNN + Modified ResNet50 + 5FC Layers NA Softmax Adelta Proposed Loss Function [142] DCNN Proposed DCNN Architecture with Different Layers NA Decision Tree (DT) Manual Optimization NA [143] CNN CNN Layers (2) + ReLU Activation + FC Layers (3) 4 (rate=0.2) Softmax NA NA SA-B3D with CE [144] CNN Layers (5) + LSTM Layers (1) + Pooling Layers (4) + FC Layers (1) NA Sigmoid Adam LSTM Model Proposed Loss Function [74] CNN CNN Layers (3) + ReLU Activation + Pooling Layers (3) + FC Layers (1) NA SVM SGD NA [145] DENN Proposed DENN Architecture with ReLU Activation + FC Layers (2) NA Sigmoid mini-batch SGD CCE (rate =0.2) [146] DNN Propoded DNN with ReLU Activation + FC Layers (2) Sigmoid Adam BCE (rate =0.4)
You can also read