Article
Wood Defect Detection Based on Depth Extreme Learning Machine
Yutu Yang 1, Xiaolin Zhou 2, Ying Liu 1,*, Zhongkang Hu 3 and Fenglong Ding 1

1 College of Mechanical & Electronic Engineering, Nanjing Forestry University, Nanjing 210037, China; yangyutu@njfu.edu.cn (Y.Y.); dfl@njfu.edu.cn (F.D.)
2 School of Artificial Intelligence, Hezhou University, Hezhou 542899, China; zhouxiaolinzxl@hotmail.com
3 Nanjing Fujitsu Nanda Software Technology Co., Ltd., Nanjing 210012, China; huzk.fnst@cn.fujitsu.com
* Correspondence: liuying@njfu.edu.cn
 Received: 18 September 2020; Accepted: 22 October 2020; Published: 24 October 2020                   

 Abstract: The deep learning feature extraction method and extreme learning machine (ELM)
 classification method are combined to establish a depth extreme learning machine model for wood
 image defect detection. The convolution neural network (CNN) algorithm alone tends to provide
 inaccurate defect locations, incomplete defect contour and boundary information, and inaccurate
 recognition of defect types. The nonsubsampled shearlet transform (NSST) is used here to preprocess
 the wood images, which reduces the complexity and computation of the image processing. CNN is
 then applied to manage the deep algorithm design of the wood images. The simple linear iterative
 clustering algorithm is used to improve the initial model; the obtained image features are used as
 ELM classification inputs. ELM has a faster training speed and stronger generalization ability than
 comparable neural networks, but its random selection of input weights and thresholds degrades
 classification accuracy. A genetic algorithm is used here to optimize the initial parameters of the
 ELM to stabilize the network classification performance. The depth extreme learning machine can
 extract high-level abstract information from the data, does not require iterative adjustment of the
 network weights, has high calculation efficiency, and allows CNN to effectively extract the wood
 defect contour. The distributed input data feature is automatically expressed in layer form by deep
 learning pre-training. The wood defect recognition accuracy reached 96.72% in a test time of only
 187 ms.

 Keywords: wood defect; CNN; ELM; genetic algorithm; detection

1. Introduction
     Today’s wood products are manufactured under increasingly stringent requirements for surface
processing. In countries with rich forest resources, such as Sweden and Finland, the comprehensive
use rate of wood is as high as 90%. In sharp contrast, the comprehensive use rate of wood in China is
less than 60%, which causes a serious waste of resources. With China’s rapid economic development,
people increasingly pursue a high-quality life, which inevitably increases the demand for wood and
wood products; China’s consumption of solid wood panels, wood-based panels, paper, and cardboard
is among the highest in the world. The existing wood storage capacity and processing level make it
difficult to meet this rapidly growing demand. The shortage of wood supply and the low use rate have
limited the development of China’s wood industry. Therefore, it is necessary to comprehensively
inspect the processing quality of logs and boards to improve the use rate of wood and the quality of
wood products.
     The nondestructive testing of wood can quickly and accurately assess the physical properties
and growth defects of wood, and it allows wood testing to be automated. In recent years,
the combined application of computer technology with detection
and control theory has made great progress in the detection of wood defects. In the nondestructive
testing of wood surfaces, commonly used traditional methods include laser testing [1,2], ultrasonic
testing [3–5], acoustic emission technology [6,7], etc. Computer-aided techniques are a common
approach to surface processing, as they are efficient and have a generally high recognition rate [8,9].
Deep learning was first proposed by Hinton in 2006; in 2012, scholars adopted the AlexNet network,
based on deep learning, to achieve computer vision recognition accuracy of up to 84.7%. Deep learning
avoids the curse of dimensionality through layer-wise initialization and represents a revolutionary
development in the field of machine learning [10,11]. More and more scholars are applying deep
learning networks in the nondestructive testing of wood. He et al. [12] used a linear array CCD camera
to obtain wood surface images and proposed a hybrid fully convolutional neural network (Mix-FCN)
to recognize and locate wood defects; however, the network was very deep and required extensive
calculation. Hu [13] and Shi [14] used the Mask R-CNN algorithm for wood defect recognition,
but they combined multiple feature extraction methods, which resulted in a very complex model.
Current deep learning algorithms also still suffer from inaccurate defect location and incomplete
defect contour and boundary information in the wood defect detection process. To solve these
problems and effectively meet the wood testing needs of wood processing enterprises, we carried out
the research reported in this article.
     The innovations of this article are: (1) simple preprocessing of wood images using the
nonsubsampled shearlet transform (NSST), which reduces the complexity and computational cost
of image processing, as the input to the convolutional neural network; (2) application of a simple
linear iterative clustering (SLIC) algorithm to enhance and improve the convolutional neural network
so as to obtain a super-pixel image with a more complete boundary contour; (3) use of a genetic
algorithm to improve the extreme learning machine, which classifies the obtained image features.
Through the above methods, the accuracy of defect detection is improved and the recognition time is
shortened, establishing an innovative machine-vision-based wood testing technique.

2. Materials and Methods

2.1. Wood Defect Original Image Dataset
      According to their different processes and causes, solid wood board defects are divided into
biohazard defects, growth defects, and processing defects. Growth defects and biohazard defects
are natural defects, which have certain shape and structure characteristics and are also an important
basis for wood grade classification. Generally speaking, solid wood board growth defects and
biohazard defects can be divided into dead knots, live knots, worm holes, decay, etc. The original
data set used in the experiments in this article is derived from the wood sampling images in the 948
project of the State Forestry Administration (the introduction of the laser profile and color integrated
scanning technology for solid wood panels). When scanning to obtain wood images, the scanning
speed of the scanner is 170–5000 Hz, the Z-direction resolution is 0.055–0.200 mm, the X-direction
resolution is 0.2755–0.550 mm, and the color pixel resolution can reach 1 mm × 0.5 mm. The data set
includes 5000 defect maps of pine, fir, and ash. The bit depth of each image is 24, and the specified
size is 100 × 100 pixels. Some of the defect images are shown in Figure 1.

Figure 1. Common defects of solid wood such as dead-knot, live-knot and decay.
2.2. Optimized Convolution Neural Network
     This paper proposes an optimized algorithm which uses NSST to preprocess the images, followed
by the CNN to extract defect features from the wood images as a preliminary CNN model. The simple
linear iterative clustering (SLIC) super-pixel segmentation algorithm is used to analyze the wood
images by super-pixel clustering, which allows the defects in wood images and the local information
regarding defects and cracks to be efficiently located. The obtained information is fed back to the
initial model, which enhances the original CNN.
2.2.1. Structure and Characteristics of Convolution Neural Networks
     The CNN is an artificial neural network algorithm with a multi-layer trainable architecture [15].
It generally consists of an input layer, excitation layer, pool layer, convolution layer, and full
connection layer. CNNs have many advantages in terms of image processing applications. (1) Feature
extraction and classification can be combined into the same network structure and trained
synchronously, making the algorithm fully adaptive. (2) When the image size is larger, the deep
feature information can be extracted better. (3) Its unique network structure has strong adaptability
to local deformation, image rotation, image translation, and other changes in the input image. In this
study, each pixel in the wood image was convoluted and the defect feature was extracted by exploiting
these CNN characteristics. The CNN network skeleton used in this article is shown in Figure 2.

[Figure 2 layout: Input Image (3 × 100 × 100) → Conv 7 × 7, stride 1, padding 2 (64 × 98 × 98) → ReLU → Max pooling 3 × 3, stride 2 (64 × 49 × 49) → ReLU → Conv 7 × 7, stride 1, padding 2 (128 × 47 × 47) → ReLU → Average pooling 2 × 2, stride 2 (128 × 24 × 24) → ReLU → FC 1000 → Softmax classifier.]

Figure 2. CNN network skeleton.

2.2.2. Non-Subsampled Shearlet Transform (NSST)
     The NSST provides a sparse and optimal representation of signals and also has strong directional
sensitivity [16–18]. Therefore, using NSST to preprocess wood images preserves the defect features of
the wood images. Redundancy in the wood image information is reduced, as are the complexity and
computation of image processing with the deep learning method.

2.2.3. Simple Linear Iterative Clustering (SLIC)
      The CNN uses a matrix form to represent an image to be processed, so the spatial organization
relationship between pixels is not considered; this affects the image segmentation and obscures the
boundary of the defective region of the wood image. The SLIC algorithm can generate relatively
compact super-pixel image blocks after processing a gray or color image. The generated super-pixel
image is compact between pixels, and the edge contour of the image is clear. To this effect, SLIC extracts
a relatively accurate contour to supplement the feature contour. SLIC also requires relatively few
initial parameters; only the number of super-pixels needed to segment the image must be set.
The algorithm is simple in principle, has a small calculation range, and runs rapidly. By 2015,
its parallel execution speed had reached 250 FPS, making it the fastest super-pixel segmentation
method available [19].
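
As an illustration of how few parameters SLIC needs, the following minimal scikit-image sketch segments a wood image into super-pixels and draws their boundaries; the file names and the n_segments and compactness values are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from skimage import io, segmentation

# Load a wood image (file name is a hypothetical placeholder).
image = io.imread("wood_defect.png")[:, :, :3]

# Segment into super-pixels; the number of segments is the main parameter to set.
labels = segmentation.slic(image, n_segments=200, compactness=10, start_label=1)

# Overlay the super-pixel boundaries to inspect the recovered defect contour.
overlay = segmentation.mark_boundaries(image, labels)
io.imsave("slic_boundaries.png", (overlay * 255).astype(np.uint8))
```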

2.2.4. Feature Extraction
     The optimized CNN model proposed in this paper was designed for wood surface feature
extraction. Knots were used as example wood defects (Figure 3a) to test feature extraction via the
following operations. The input image (Figure 3a) was directly processed by the CNN algorithm to
obtain the image in Figure 3b, whose contour presented local irregularity and nonsmooth edges after
enlargement [20]. The SLIC algorithm was then used to process the input image (Figure 3a), yielding
Figure 3c, followed by longitudinal convolution (Figure 3d). The image shown in Figure 3h was
obtained after edge removal and fusion processing. The defect contour features of Figure 3h are
substantially clearer than those of Figure 3b because CNN segmentation of wood images, which
expresses pixels as a matrix without considering the spatial organization relationship between them,
degrades the final segmentation results. The SLIC algorithm instead extracts the wood defect boundary
and contour information from the original image and feeds it back to the initial segmentation results
of the CNN model.
     The above process reduces the redundancy of local image information as well as the complexity
and computation of the image processing. The pixel-level CNN model does not accurately reveal the
boundary of the defective region of the wood image, but instead indicates only its general position;
SLIC extracts a relatively accurate contour to supplement it and optimize the initial CNN model.
To this effect, the proposed SLIC-based method improves the defect feature extraction of wood images
over CNN alone.
     The input image (Figure 3a) was processed by the NSST algorithm to obtain the image shown in
Figure 3e, then Figure 3f was obtained using the SLIC algorithm. Vertical convolution was carried out to
obtain the image shown in Figure 3g. Figure 3i was obtained after edge removal and fusion processing.
Comparing Figure 3i to Figure 3h, the difference in wood defect contour extraction is not obvious;
however, using NSST to preprocess the image reduces environmental interference and training depth,
markedly decreasing the computation and complexity of the image processing.

[Figure 3 layout: the input image (a) processed by CNN alone gives (b); processed by SLIC (c) then CNN (d), with edge removal and fusion, gives (h); processed by NSST (e), SLIC (f), then CNN (g), with edge removal and fusion, gives (i).]

Figure 3. Contrast diagram of optimized CNN feature extraction effects.

     Based on the above analysis, this paper decided to use the wood image processing frame shown
in Figure 4 to obtain the wood defect feature map.

[Figure 4 flow: input wood image → NSST processing → feature extraction → input training data → initial training CNN and super-pixel cluster analysis → super-pixel enhanced CNN model → input testing data → feature image.]

Figure 4. Wood image processing frame.
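
As a road map, the frame in Figure 4 can be sketched as the composition below. nsst_preprocess and fuse_superpixel_contours are hypothetical placeholders standing in for the NSST step (Section 2.2.2) and the contour feedback step (Section 2.2.3), whose implementations are beyond this sketch.

```python
import numpy as np
from skimage import segmentation

def nsst_preprocess(image: np.ndarray) -> np.ndarray:
    # Hypothetical placeholder for the NSST preprocessing step.
    return image

def fuse_superpixel_contours(feature_map: np.ndarray, labels: np.ndarray) -> np.ndarray:
    # Hypothetical placeholder: feed SLIC boundary information back into the
    # initial CNN segmentation to sharpen the defect contour.
    return feature_map

def extract_feature_image(image: np.ndarray, cnn_model) -> np.ndarray:
    preprocessed = nsst_preprocess(image)                     # NSST processing
    labels = segmentation.slic(preprocessed, n_segments=200)  # super-pixel cluster analysis
    feature_map = cnn_model(preprocessed)                     # initial CNN output
    return fuse_superpixel_contours(feature_map, labels)      # super-pixel enhanced result
```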
2.3. Extreme Learning Machine (ELM)
 2.3.Extreme
2.3.  ExtremeLearning
                   LearningMachine
                              Machine(ELM)(ELM)
       Depth learning is commonly used in target recognition and defect detection applications due to
        Depthlearning
       Depth      learningisiscommonly
                                commonlyused     usedinintarget
                                                          targetrecognition
                                                                    recognitionand    anddefect
                                                                                            defectdetection
                                                                                                     detectionapplications
                                                                                                                   applicationsdue   dueto  to
its excellent     feature extraction capability.         The CNN        is highly time-consuming             due to the necessity          of
 its
its  excellent
    excellent      feature  extraction
                  feature extraction       capability.
                                          capability.     The
                                                         The    CNN      is highly     time-consuming         due    to  the   necessity    of
iterative    pre-training    and fine-tuning       stages;  theCNN      is highly
                                                                 hardware             time-consuming
                                                                                requirements       for more  due    to the necessity
                                                                                                                complex       engineering  of
 iterative pre-training
iterative      pre-training and  and fine-tuning
                                         fine-tuning stages;
                                                          stages; the the hardware
                                                                             hardware requirements
                                                                                             requirements for     for moremore complex
                                                                                                                                   complex
applications      are also high. The       deep CNN structure          has a large quantity         of adjustable free        parameters,
 engineering
engineering         applications     are  also   high.  The   deep     CNN      structure     has   a  large   quantity      of adjustable
which     makesapplications
                    its constructionare also
                                          highlyhigh.   The deep
                                                    flexible.  On the CNN otherstructure
                                                                                   hand, ithas     a large
                                                                                               lacks          quantity
                                                                                                       theoretical          of adjustable
                                                                                                                       guidance      and is
 freeparameters,
free   parameters,which  whichmakes
                                  makesits  itsconstruction
                                                constructionhighly
                                                                 highlyflexible.
                                                                            flexible.On  Onthetheother
                                                                                                   otherhand,
                                                                                                            hand,ititlacks
                                                                                                                        lackstheoretical
                                                                                                                                theoretical
overly reliant on experience, so its generalization performance is dubious. In this study, we integrated
 guidance and
guidance       and isis overly
                         overly reliant
                                   reliant onon experience,
                                                  experience, so  so its
                                                                      its generalization
                                                                           generalization performance
                                                                                                 performance isis dubious.
                                                                                                                        dubious. In  In this
                                                                                                                                         this
the ELM into       a depth extreme       learning    machine (Figure        5) to improve the training efficiency              of the deep
 study,
study,     we    integrated
          we integrated        the    ELM     into   a depth    extreme       learning      machine      (Figure    5)   to  improve      the
convolution        network. theThe ELM       into method
                                      proposed      a depth extracts
                                                               extremewood   learning
                                                                                    defectsmachine
                                                                                               by using (Figure    5) to improve
                                                                                                            an optimized        CNN and  the
 trainingefficiency
training     efficiencyof  ofthe
                              thedeep
                                    deepconvolution
                                           convolutionnetwork.
                                                            network.The   Theproposed
                                                                                 proposed methodextracts   extractswood wooddefects
                                                                                                                                 defects by
ELM classifier       to exploit   the excellent     feature extraction      ability of the method
                                                                                               deep network and           fast trainingby  of
 using
using     an
         an   optimized
             optimized      CNN
                            CNN       and
                                     and    ELM
                                           ELM      classifier
                                                   classifier   to
                                                               to   exploit
                                                                   exploit     the
                                                                              the    excellent
                                                                                    excellent     feature
                                                                                                 feature     extraction
                                                                                                            extraction       ability
                                                                                                                            ability   ofthe
                                                                                                                                     of   the
ELM simultaneously.
 deepnetwork
deep    network andfast    fasttraining
                                 trainingof  ofELM
                                                 ELMsimultaneously.
                                                        simultaneously.
       The ELMand     algorithm      differs from      traditional pre-feedback neural network training learning.
        TheELM
       The    ELMalgorithm
                      algorithmdiffers
                                     differsfrom
                                               fromtraditional
                                                      traditionalpre-feedback
                                                                     pre-feedback neuralnetwork     networktraining
                                                                                                                traininglearning.
                                                                                                                              learning.Its Its
Its hidden     layer does not need         to be iterated,    and input weightsneural      and hidden layer        node biases are set
 hidden
hidden      layer
           layer     does  not
                    does not needneed    to
                                        to bebe  iterated,
                                                iterated,    and   input
                                                            and input       weights
                                                                            weights       and
                                                                                         and    hidden
                                                                                               hidden      layer
                                                                                                          layer    node     biases   are   set
randomly       to minimize      training     error.  The output      weights      of the    hidden    layer   arenode      biases are
                                                                                                                   determined             set
                                                                                                                                     by the
 randomlyto
randomly        tominimize
                    minimizetraining
                                 trainingerror.
                                              error.The
                                                      Theoutput
                                                           outputweights
                                                                      weightsof    ofthe
                                                                                       thehidden
                                                                                            hiddenlayerlayerarearedetermined
                                                                                                                    determinedby     bythethe
algorithm [21–23]. The ultimate learning machine is based on the proved ordinary extreme theorem
 algorithm[21–23].
algorithm       [21–23].TheTheultimate
                                 ultimatelearning
                                              learningmachine
                                                          machineisisbasedbasedon   onthetheproved
                                                                                              provedordinary
                                                                                                         ordinaryextreme
                                                                                                                       extreme theorem
and interpolation        theorem,     under which when          the hidden layer          activation function       of a singletheorem
                                                                                                                                    hidden
 and
and    interpolation
      interpolation       theorem,
                         theorem,      under
                                      under     which
                                                which    when
                                                        when     the
                                                                the    hidden
                                                                     hidden       layer
                                                                                 layer     activation
                                                                                          activation     function
                                                                                                        function     of
                                                                                                                    of   aasingle
single hidden layer feedforward neural network is infinitely differentiable, its learning ability is independent of the hidden layer parameters and depends only on the current network structure. When the input weights and hidden layer node offsets are randomly assigned under an appropriate network structure, the ELM has universal approximation capability; that is, the network can approximate any continuous function even though those parameters are never tuned. Provided the hidden layer activation function is infinitely differentiable, the output weights of the network can be calculated via the least squares method. A network model that approximates the target function can thus be established, and the corresponding neural network functions such as classification, regression, and fitting can be realized.
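To make this concrete, the following is a minimal ELM training sketch in Python/NumPy under our own illustrative assumptions (the sigmoid activation, uniform weight initialization, Moore–Penrose pseudo-inverse solve, and toy data shapes are illustrative choices, not the paper's exact implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elm_train(X, Y, n_hidden, seed=0):
    """Train a single-hidden-layer ELM: input weights and biases are
    random; output weights are solved in closed form by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # random input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # random hidden biases
    H = sigmoid(X @ W + b)                                   # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ Y                             # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return sigmoid(X @ W + b) @ beta

# Toy usage: 200 samples, 10 features, 3 one-hot-encoded classes.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
Y = np.eye(3)[rng.integers(0, 3, size=200)]
W, b, beta = elm_train(X, Y, n_hidden=70)
predicted_class = elm_predict(X, W, b, beta).argmax(axis=1)
```

Because the only learned quantity is beta, training reduces to a single least-squares solve, which is why the ELM needs no iterative weight adjustment.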

                        Figure 5. Contrast diagram of optimized CNN feature extraction effects.

      This paper mainly centers on the classification function of ELM, which serves to select a relatively simple single hidden layer neural network as the classifier. The traditional neural network algorithm needs many iterations and parameters, learns slowly, has poor expansibility, and requires intensive manual intervention. The ELM used here requires no iterations, learns relatively fast, and generates its input weights and biases randomly, so no follow-up tuning of them is needed and relatively little manual intervention is required. On large sample databases, the recognition rate of ELM is better than that of the support vector machine (SVM). For these reasons, we use ELMs as classifiers to enhance recognition efficiency and performance [24,25].
      The ELM algorithm introduced above is the main classification method for wood defect feature recognition in this paper. However, in the ELM network structure, the input weights and the thresholds of the hidden layer nodes are assigned randomly. ELM structures with the same number of hidden layer neurons can therefore perform very differently, which makes the classification performance of the network unstable. The genetic algorithm (GA) simulates Darwinian evolutionary theory to optimize the initial weights and thresholds of the ELM by eliminating less-fit weights and thresholds.
      Figure 6a–c show the variation curves of the GA-ELM population fitness function under the Radbas, Hardlim, and Sigmoid excitation functions, respectively. Smaller fitness values indicate higher accuracy; the Sigmoid excitation function produced the best network performance.
      Figure 6. Variation curves of driving function population fitness: (a) Radbas driving function; (b) Hardlim driving function; (c) Sigmoid driving function.
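A compact sketch of how a GA can select the ELM input weights and biases is shown below (the population size, truncation selection, averaging crossover, Gaussian mutation, and error-rate fitness are our illustrative assumptions, not the paper's exact operators; the ELM output weights are still solved in closed form for each candidate):

```python
import numpy as np

def elm_fitness(genome, X, Y, n_features, n_hidden):
    """Decode a flat genome into ELM input weights and hidden biases,
    solve the output weights by least squares, and return the error
    rate (smaller fitness = higher accuracy, as in Figure 6)."""
    W = genome[: n_features * n_hidden].reshape(n_features, n_hidden)
    b = genome[n_features * n_hidden :]
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoid hidden layer
    beta = np.linalg.pinv(H) @ Y
    pred = (H @ beta).argmax(axis=1)
    return np.mean(pred != Y.argmax(axis=1))

def ga_elm(X, Y, n_hidden, pop_size=30, gens=50, mut=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    dim = n_features * n_hidden + n_hidden
    pop = rng.uniform(-1, 1, size=(pop_size, dim))
    for _ in range(gens):
        scores = np.array([elm_fitness(g, X, Y, n_features, n_hidden) for g in pop])
        elite = pop[np.argsort(scores)[: pop_size // 2]]   # keep the fitter half
        # Averaging crossover of random elite pairs, then Gaussian mutation.
        pairs = elite[rng.integers(0, len(elite), size=(pop_size - len(elite), 2))]
        children = pairs.mean(axis=1) + mut * rng.normal(size=(pop_size - len(elite), dim))
        pop = np.vstack([elite, children])
    scores = np.array([elm_fitness(g, X, Y, n_features, n_hidden) for g in pop])
    return pop[np.argmin(scores)]   # best input weights/biases found
```

In our experiments, this GA-selected initialization let the Sigmoid network reach 95.93% accuracy with only 70 hidden nodes (Table 1).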

     The classification accuracy of GA-ELM and ELM under different excitation functions is shown in Table 1. The classification accuracies under the Sigmoid and Radbas excitations were similar, while the Hardlim excitation function performed noticeably worse. The accuracy of both ELM and GA-ELM was highest when the Sigmoid function was used as the activation function: GA-ELM reached 95.93%, markedly better than the unoptimized ELM. The GA-optimized ELM network also required fewer hidden layer nodes while achieving higher test accuracy.

            Table 1. Classification accuracy of GA-ELM and ELM under different excitation functions.

                  Excitation         Classification Accuracy Rate (%)      Number of Hidden Nodes
                  Function             GA-ELM               ELM             GA-ELM            ELM
                   Sigmoid              95.93              93.85               70              90
                   Hardlim              93.98              91.37               90             170
                   Radbas               95.37              93.36              110             200
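For reference, the three excitation functions compared in Table 1 can be written as follows (a sketch; the zero-threshold step for Hardlim and the Gaussian form of Radbas follow common MATLAB-style conventions and may differ in detail from the original implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hardlim(x):
    # Hard-limit step: 1 where the activation is non-negative, else 0.
    return (x >= 0).astype(float)

def radbas(x):
    # Radial basis: Gaussian of the activation.
    return np.exp(-x ** 2)
```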

      In summary, an improved depth extreme learning machine was constructed in this study by
combining the optimized GA-ELM classifier with the optimized CNN feature extraction. It is referred
to from here on as “D-ELM”.

3. Experimentation

3.1. Experimental Parameters
     Table 2 shows the computer-related parameters and the software platform used by the experimental system, including the CPU model, main frequency, and memory size.

                                          Table 2. Experimental parameters.

                                     Items                              Type
                                      Server                  Intel(R) Xeon(R) CPU E5-4603 v2
                            Quantity of physical CPUs                     2
                             Main frequency of CPUs                   2.20 GHz
                                     Memory                             16 GB
                             Experimental platform        Anaconda 3.5, Microsoft Visual Studio 2017

3.2. Empirical Method and Result
      The specific experimental process is shown in Figure 7. First, we preprocessed 5000 original images via NSST and randomly selected 4000 images for training. Second, for each pixel in each image, the neighborhood subgraph was taken as the input of the CNN, yielding a total of 40,000,000 samples as the experimental training set for training the network model. The remaining 10,000,000 samples were used as test images to evaluate the algorithm. The features extracted from the test samples were input into the ELM network classifier, the number of hidden layer nodes of the extreme learning machine was set to 100, and the accuracy and stability of the feature extraction method were then statistically analyzed. We found that once the number of iterations exceeded 3500, the loss function settled around 0.2 and the convergence performance was acceptable.
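The per-pixel neighborhood sampling described above can be sketched as follows (the 32 × 32 window size and reflective border padding are our assumptions for illustration; the paper does not state them here):

```python
import numpy as np

def pixel_neighborhoods(image, win=32):
    """Yield the square neighborhood subgraph around each pixel,
    padding the borders by reflection so every pixel gets a full
    window (approximately centered for even window sizes)."""
    half = win // 2
    padded = np.pad(image, half, mode="reflect")
    h, w = image.shape
    for r in range(h):
        for c in range(w):
            yield padded[r : r + win, c : c + win]

# Usage: a single grayscale plate image produces h * w training samples.
img = np.random.rand(100, 100)
first_patch = next(pixel_neighborhoods(img))
assert first_patch.shape == (32, 32)
```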

                 Figure 7. Flowchart of wood defect feature extraction and classification process.
      Figure 8a shows the relationship between the training loss value and the number of iterations. Although the training loss value fluctuates a little during the iteration process, it shows a downward trend as a whole. When the iteration was completed, the training loss value was around 0.2. Figure 8b shows the accuracy curve. When the number of iterations reached 1500, the accuracy of the proposed algorithm reached 90%. Accuracy continued to increase as the iteration count increased, until reaching a maximum of about 98%.

      Figure 8. Relationship between loss function, accuracy, and iteration number: (a) relationship between loss function and iteration; (b) accuracy graph.

      Figure 9 shows our final recognition effect on the test set. We surround the identified wood defects with different colored rectangular borders.
                           Figure 9. Recognition results based on deep learning.
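As an illustration of this rendering step, a few lines of OpenCV suffice (a sketch; the defect classes and the BGR color map are hypothetical, and in our system the actual drawing is done in the C# interface described in Section 5):

```python
import cv2

# Hypothetical color map: one BGR color per defect class.
COLORS = {"live_knot": (0, 255, 0), "dead_knot": (0, 0, 255), "crack": (255, 0, 0)}

def draw_defects(image, detections):
    """Draw one labeled rectangle per detected defect.
    `detections` is a list of (x, y, width, height, defect_type) tuples."""
    for x, y, w, h, kind in detections:
        color = COLORS.get(kind, (255, 255, 255))
        cv2.rectangle(image, (x, y), (x + w, y + h), color, 2)
        cv2.putText(image, kind, (x, y - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)
    return image
```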

4. Discussion
      This paper proposes an ELM classifier based on a depth structure. Choosing the appropriate number of hidden nodes under the D-ELM structure provides enhanced stability and generalization ability in the network. To ensure accurate tests and prevent node redundancy, the number of hidden nodes was set to 100; at this setting, the test accuracy of D-ELM was maintained at a relatively stable value over repeated tests. The accuracy trends are shown in Figure 10. D-ELM significantly outperformed ELM, with smaller fluctuations in amplitude, robustness to the number of test iterations, and higher network stability.
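The stability comparison behind Figure 10 and Table 3 amounts to repeating the full train-and-test cycle and summarizing the resulting accuracy distribution; a minimal sketch follows (the `evaluate` callable, which stands in for one complete cycle returning a test accuracy, is hypothetical):

```python
import numpy as np

def stability_report(evaluate, n_runs=20):
    """Repeat a full train-and-test cycle and report the statistics
    used in Table 3: average, highest, lowest, and standard deviation
    of the accuracy rate."""
    acc = np.array([evaluate() for _ in range(n_runs)])
    return {"average": acc.mean(), "highest": acc.max(),
            "lowest": acc.min(), "std": acc.std()}
```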

                           Figure 10. Accuracy of D-ELM and ELM after multiple tests.

     Table 3 shows the results of our algorithm accuracy tests, as mentioned above. D-ELM has a higher average accuracy rate and a lower standard deviation than ELM; that is, the D-ELM network was both more accurate and more stable. As a result, the performance of the classifier was improved.
                                        Table 3. D-ELM versus ELM stability.

                          Average Accuracy     Highest Accuracy     Lowest Accuracy         Standard
         Algorithms
                              Rate (%)             Rate (%)             Rate (%)         Deviation (std)
           D-ELM               95.92                96.11                95.65               0.0967
            ELM                92.84                93.72                92.16               0.3220
     We added an SVM classifier to the experiment to further assess the depth extreme learning machine. Table 4 shows the accuracy and timing of D-ELM, ELM, and SVM training and testing on all samples, where D-ELM again has the highest accuracy in both training and testing. Although D-ELM has more network layers, its training time and test time are shorter than those of the other algorithms we tested, and its accuracy is much higher. The overall performance of D-ELM is better than that of ELM and SVM.
                      Table 4. Defect recognition of D-ELM, ELM, and SVM on wood images.

                                 Algorithm              D-ELM        ELM        SVM
                                  Time (s)               890         1879       2356
                        Train
                              Accuracy rate (%)         99.47       92.31      90.45
                                  Time (ms)              187         532        631
                        Test
                              Accuracy rate (%)         96.72       92.16      91.55
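Wall-clock figures like those in Table 4 can be collected with Python's `time.perf_counter` (a sketch around hypothetical `train` and `predict` callables):

```python
import time

def timed(fn, *args):
    """Run one call and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Usage with hypothetical callables:
# model, train_s = timed(train, X_train, Y_train)   # "Train, Time (s)"
# preds, test_s = timed(predict, model, X_test)     # "Test, Time (ms)" = test_s * 1000
```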
5. System Interface
      We constructed a network model and classification optimizer based on the proposed algorithm by integrating Anaconda 3.5 and TensorFlow. We then constructed a real wood plate defect identification system in the C# development language on the Microsoft Visual Studio 2017 platform. The system can identify defects in solid wood plate images and provide their position and size information. We also used a Microsoft SQL Server 2012 database to store the information before and after processing.
      Our experiment on deep network feature learning mainly involved the implementation of the network framework and the training model for wood recognition. The system is based on the network training model discussed in this paper; it can be used to detect defects in the wood image database one sheet at a time and to display the X and Y coordinates of the defects, as shown in Figure 11. On the left side, the scanned wood images are displayed with defects marked in green boxes. The top right side of the interface shows the coordinates of each defect, the cutting position of the plank, and the type of each defect. Below the table are the total number of defects identified by the machine and the recognition rate.

                        Figure 11. System interface diagram based on depth learning.

6. Conclusions
            The depth
      The depth     extremeextreme     learning
                                learning   machine machine
                                                       proposed proposed
                                                                     in thisinpaper
                                                                                  this paper      has reasonable
                                                                                        has reasonable               dimensions,
                                                                                                             dimensions,    effectively
      effectively manages heterogeneous data, and works within an acceptable run time. Our results
manages heterogeneous data, and works within an acceptable run time. Our results suggest that it
      suggest that it is a promising new solution to problems such as obtaining marking samples,
is a promising new solution to problems such as obtaining marking samples, constructing features,
      constructing features, and training. It has excellent feature extraction ability and fast training time.
and training.
      Based on It thehas   excellent
                        method           featurelearning,
                                   of machine      extraction Theability     and fast training
                                                                   NSST transform         is used totime.     Basedthe
                                                                                                        preprocess    on original
                                                                                                                          the method
of machine      learning,
      image (i.e.,  reduce itsThe    NSST transform
                                  complexity                  is used to
                                                 and dimensionality            preprocess
                                                                           while   minimizing  thetheoriginal   image (i.e.,
                                                                                                       down-sampling      processreduce
its complexity
      in CNN), then andSLICdimensionality         while minimizing
                               is applied to optimize       the CNN model     thetraining
                                                                                    down-sampling
                                                                                            process. This  process
                                                                                                             methodin     CNN), then
                                                                                                                       effectively
SLICreduces
       is applied     to optimizeofthe
                the redundancy              CNN
                                         local  imagemodel     training
                                                       information       andprocess.
                                                                               extracts This     method
                                                                                         relatively         effectively
                                                                                                       accurate           reduces the
                                                                                                                 supplementary
      feature  contours.     The    optimized     CNN    is  then   used    to  extract   wood
redundancy of local image information and extracts relatively accurate supplementary feature       image   features   and  secure
                                                                                                                               contours.
      corresponding      image    features.   The  feature   is input   to  the  ELM
The optimized CNN is then used to extract wood image features and secure corresponding  classifier   and  the parameters    of theimage
      related
features.   Theneural   network
                  feature           are optimized.
                             is input   to the ELM    The   GA is used
                                                        classifier    andtotheselect  the initial weight
                                                                                   parameters       of the threshold   of ELM
                                                                                                            related neural       to
                                                                                                                               network
      improve the prediction accuracy and stability of the network model. Finally, the image data to be
are optimized. The GA is used to select the initial weight threshold of ELM to improve the prediction
      tested is input to the well-trained network model and final test results are obtained.
accuracy and stability of the network model. Finally, the image data to be tested is input to the
We also compared the stability of the D-ELM and ELM network models. The standard deviation of D-ELM was only 0.0967, and its accuracy improved by about 3% over ELM; the D-ELM network was also more stable and less affected by the number of test samples than ELM. D-ELM reached an accuracy of up to 96.72% with a test time of only 187 ms, shorter than that of ELM or SVM. The D-ELM network model is therefore capable of highly accurate wood defect recognition within a very short training and detection time.
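A hedged sketch of how such a stability figure can be computed: retrain the model under several random seeds and report the mean and standard deviation of the test accuracy. Here train_and_score is a hypothetical stand-in for either the D-ELM or the plain ELM pipeline, and the run count is illustrative.

import numpy as np

def stability(train_and_score, n_runs=10):
    # train_and_score(seed) retrains the model from the given random seed and
    # returns its test accuracy; the spread across runs measures sensitivity
    # to the random initialization of the input weights and thresholds.
    accs = np.array([train_and_score(seed) for seed in range(n_runs)])
    return accs.mean(), accs.std()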
Author Contributions: Conceptualization, Y.Y. and Y.L.; methodology, Y.Y.; software, X.Z.; validation, Y.Y., Z.H. and F.D.; resources, X.Z. and Z.H.; writing—original draft preparation, Y.Y.; writing—review and editing, Y.L. and X.Z.; project administration, Y.L.; funding acquisition, Y.L. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by the 2019 Jiangsu Province Key Research and Development Plan by the Jiangsu Province Science and Technology under grant BE2019112, was funded by the Jiangsu Province International Science and Technology Cooperation Project under grant BZ2016028, and was supported by the 948 Import Program on the Internationally Advanced Forestry Science and Technology by the State Forestry Bureau under grant 2014-4-48.

Conflicts of Interest: The authors declare no conflict of interest.


Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional
affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).