Study of the Optimal Waveforms for Non-Destructive Spectral Analysis of Aqueous Solutions by Means of Audible Sound and Optimization Algorithms
Applied Sciences (https://www.mdpi.com/journal/applsci)

Article

Pilar García Díaz *, Manuel Utrilla Manso, Jesús Alpuente Hermosilla and Juan A. Martínez Rojas

Department of Signal Theory and Communications, Polytechnic School, University of Alcalá, 28871 Alcalá de Henares, Spain; manuel.utrilla@uah.es (M.U.M.); jesus.alpuente@uah.es (J.A.H.); juanan.martinez@uah.es (J.A.M.R.)
* Correspondence: pilar.garcia@uah.es; Tel.: +34-918-856-733

Abstract: Acoustic analysis of materials is a common non-destructive technique, but most efforts are focused on the ultrasonic range. In the audible range, such studies are generally devoted to audio engineering applications. Ultrasonic sound has evident advantages, but also severe limitations, like penetration depth and the use of coupling gels. We propose a biomimetic approach in the audible range to overcome some of these limitations. A total of 364 samples of water and fructose solutions with 28 concentrations between 0 g/L and 9 g/L have been analyzed inside an anechoic chamber using audible sound configurations. The spectral information from the scattered sound is used to identify and discriminate the concentration with the help of an improved grouping genetic algorithm that extracts a set of frequencies as a classifier. The fitness function of the optimization algorithm implements an extreme learning machine. The classifier obtained with this new technique is composed of only nine frequencies in the (3–15) kHz range. The results have been obtained over 20,000 independent random iterations, achieving an average classification accuracy of 98.65% for concentrations with a difference of ±0.01 g/L.

Keywords: acoustic chemical analysis; non-destructive analysis; feature extraction; automatic classification

Citation: García Díaz, P.; Utrilla Manso, M.; Alpuente Hermosilla, J.; Martínez Rojas, J.A. Study of the Optimal Waveforms for Non-Destructive Spectral Analysis of Aqueous Solutions by Means of Audible Sound and Optimization Algorithms. Appl. Sci. 2021, 11, 7301. https://doi.org/10.3390/app11167301

Academic Editor: Chiara Portesi

Received: 8 July 2021. Accepted: 6 August 2021. Published: 9 August 2021.

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Acoustic spectroscopy is one of the most promising techniques for nondestructive testing of many materials. This work shows that acoustic spectroscopy in the audible range is also well suited to the study of liquid solutions. No method can claim superiority, but sound-based sensing of liquids has several advantages over optical techniques and can be easily combined with other methods, such as electroacoustic measurements, as discussed in [1]. A review describing the advantages and limitations of acoustic spectroscopy, with a particular focus on pharmaceutical applications, can be found in [2].

However, most studies on this research topic are devoted to ultrasound techniques and devices, due to their higher energy and bandwidth compared with sounds in the audible range. This can be seen in the monographs devoted to this topic, like [3] and [4]. The latter is particularly interesting because a study by Contreras et al. [5], page 51, describes the ultrasonic measurement of different sugar concentrations with an accuracy of 0.2% in water volume for pure sugar solutions. They measured the velocity of ultrasound and the density in solutions of D-glucose, D-fructose, and sucrose at various concentrations (0–40% w/v) and temperatures (10–30 °C).

This conversion of acoustic data to sound velocity is the norm in most ultrasound studies of liquids. The calculation of sound velocities introduces some important problems and uncertainties due to the need for statistical or theoretical models and the existence of other processes, as is explained in several publications, for example the excellent review by Dzida et al. on the determination of the speed of sound in ionic liquids [6]. A description of a low-cost system for the measurement of sound velocity in liquids can be found in [7].

The use of ultrasound for the study of pure liquids and solutions is limited, compared to its application to colloids, suspensions, and emulsions, as reviewed in [8]. However, there have been remarkable advances in recent years, as can be seen, for example, in [9–12]. The acoustic research of aqueous electrolytes was performed by Pal and Roy in [13] using the Fourier spectrum pulse-echo technique, which is discussed in detail in [14].

The number of publications is too large for an exhaustive literature survey, so only a handful of representative examples are shown here. For a detailed review of ultrasound spectroscopy for particle size determination see [15]. In [16] Silva et al. studied polydisperse emulsions by means of acoustic spectroscopy within the frequency range of (6–14) MHz in order to measure the droplet size distribution of water-in-sunflower oil emulsions for a volume fraction range from 10 to 50%. They concluded that the methodology was suitable for polydisperse particle size characterization for moderate concentrations up to 20% and the results were in good agreement with those obtained by laser diffraction analysis. Another interesting application to food analysis can be seen in [17], where the mechanism of rehydration of milk protein concentrate powders is studied by means of broadband acoustic resonance dissolution spectroscopy. Moreover, ref. [18] describes the use of an ultrasonic pulse echo system for the characterization of vegetable oils.

Good reviews of high-resolution ultrasound spectroscopy can be found in [19,20]. All measurements are based on the previous determination of the speed of sound and attenuation in the samples. A number of advantages and applications of this technique are clearly described; for example, samples with very small volumes can be analyzed using different ranges of pressure and temperature.
As is explained in [19], at frequencies below 100 MHz, which is clearly the case of audible frequencies, the contribution of scattering to attenuation can be neglected for nano-sized dispersions or solutions. Thus, attenuation in this long-wavelength regime is determined by the thermal and the shear (visco-inertial) effects. In spite of this, we show that audible acoustic spectroscopy can achieve impressive accuracy in the determination of fructose concentration in water.

Another interesting application of ultrasound spectroscopy is the monitoring of biocatalysis in solutions and complex dispersions, even in real time, reviewed in detail by Buckin and Caras in [21]. The information that can be extracted from ultrasound data is impressive: substrate concentrations along the entire course of the reaction, time profile analysis of the degree of polymerization, reaction rate evolutions, kinetic mechanism evaluation, kinetic and equilibrium constant measurements, and real-time traceability of structural changes in the medium associated with chemical reactions, among others.

Finally, a fascinating application of audible acoustic measurements can be found in [22,23]. Both deal with the determination of Martian rock properties using the microphone of the recent NASA Perseverance rover. This microphone is used to record the sounds associated with the shots of the microcrater-forming laser-induced breakdown spectroscopy device.

Additionally, artificial intelligence (AI) algorithms have been incorporated into many engineering applications in recent years. They are integrated into research, providing a remarkable improvement in performance and efficiency. The use of these algorithms is enhanced by continuously increasing computing power and massive data collection. Although they do not always offer the optimal solution, they approach it with a very acceptable balance of cost and accuracy.
Moreover, in many applications there is no unique solution, but rather several solutions under conflicting criteria. Recent studies on the application of AI in different fields of engineering can be found in: computer engineering [24–27], electrical engineering [28,29], petroleum engineering [30], fluid mechanic engineering [31,32], energy engineering [33–36], and acoustic engineering [37].

In this work a direct application of audible acoustic spectroscopy to the determination of fructose concentrations in distilled water is presented. It is shown that no data conversion to speeds of sound is necessary, hence eliminating the source of some uncertainty, and most importantly, accuracies of the order of 1 part in 100,000 (0.001%) in weight can be achieved. The use of audible sound has some advantages over ultrasound, mainly the low cost of the measuring equipment and the noncontact nature of the measurements. In order to optimize the technique, the results from a series of different pulses and noises were previously compared and the best sound was selected for the final determination. This is a clear improvement over our previous technique based on resonant vibrations of the sample, which involved direct contact [38].

In this work, 364 samples of different concentrations of high purity fructose in distilled water were used for the study of the best pulse characteristics for acoustic chemical analysis. A constant volume of 150 mL for all samples was selected. The container was a simple cylindrical glass. A small anechoic chamber was used to place the samples and to make the sound recordings. The microphone was placed vertically over the surface of the liquid. The sound source was one earpiece placed parallel to the microphone over the liquid surface.

Different sound configurations were explored: chirp, square pulses, white noise, and maximum length sequence (MLS). In the end, MLS produced the best results in our preliminary studies and was selected for the final analysis. The samples were excited by these sounds during 30 s intervals and the reflected sound was recorded. These recordings were divided into 2-s samples whose spectra were calculated by means of the Praat program [39]. The resulting spectra were processed by means of a grouping genetic algorithm (GGA) taking a training set of 80% and a test set of 20%. This algorithm provided a classifier with more than 98.5% classification accuracy, even for concentrations with a difference of ±0.01 g/L.

2. Materials and Methods

The experimental system was composed of three main parts: the anechoic chamber, the sound system, and the samples. The liquid sample was placed inside a small handmade anechoic chamber of exterior dimensions (width, height, depth) 80 × 72 × 56, in centimeters. Its interior was isolated using 2-cm thick foam and a frequency-dependent absorbent pyramidal material of 4 cm in the base and 6 cm high. Thus, the interior volume of the chamber is 58 × 61 × 40 cm. A cylindrical glass with a volume of 200 mL filled with 150 mL of a water and fructose solution was placed at the center of the chamber. The glass mass was 123 g, with a diameter of 8 cm. Figure 1 shows the schematic diagram of the experimental setup.

Figure 1. Photograph of the experimental installation.

The proposed method uses differential measurements and the acoustic performance of the chamber and the environment is sufficient for this purpose.
Measurements of the chamber performance were made by means of a Brüel and Kjaer 2250 acoustic analyzer, resulting in 28.2 dBA of background noise and a mean reverberation time of 0.17 s. The frequency response of the chamber is represented in Figure 2.
Figure 2. Frequency response of the anechoic chamber.

The microphone used was a Sony ECM-TL3, an electret condenser microphone with an omnidirectional pattern, a frequency response range of 20 Hz–20 kHz, and a sensitivity of −35 dB. It was placed vertically 2.5 cm over the liquid surface, 1.5 cm from the center. In parallel, in a symmetric position with respect to the center of the glass, one Sony MDRXB50APB.CE7 earpiece was used as the sound source, with a frequency response range of (4–24) kHz, a sensitivity of 106 dB/mW, and an impedance of 40 ohms (1 kHz). The test signals were generated by a computer while the recordings were made by another computer and an external audio card.

The noise from the measuring system, the background sound, and the sound card is represented in Figure 3. A maximum level of 0.0271 was measured, against signal levels near 1 (full scale).

Figure 3. Recording of the sound card without signal.

The microphone was connected to a PC sound card MAudio Fast Track Ultra 8R. An amplification factor of 70% for the channel was used to avoid adding internal noise from the card.
The measurements were taken with a recording rate of 44.1 kHz by means of the free Audacity software [40]. Sound amplitude was kept below 70% of the maximum level in order to avoid saturation effects. Some test signals were generated by MATLAB [41] and Audacity software:

1. MLS signal: a signal generated by MATLAB, taking into account that the maximum length is 30 s. The amplitude is 60% of the full scale (FS);
2. White noise: a signal generated by Audacity, with an amplitude of 60% of the FS;
3. A set of chirp signals generated by Audacity, with a duration of 1 s each, from 150 Hz to 15 kHz;
4. Square pulses with a period of 250 ms and a 50% duty cycle.

Each audio recording had a duration of 30 s, more than enough to ensure the precision and stability of the measurements. Later analyses showed that the recordings were stable enough to allow their partition into several 2 s intervals in order to increase the number of recordings for the classification algorithm. Changes among different spectra from the same sample were so low that they were not measurable.

The experiments were performed with a set of 364 samples of water solutions with different concentrations of fructose (see Tables 1–3). The volume of each sample was 150 mL. Distilled water was used as solvent. Food grade pure fructose (>99%) was used for the liquid samples. The concentrations of fructose were from 0 to 9 g/L. A more detailed study was done between 2 g/L and 3 g/L in increments of ±0.1 g/L and between 2.01 g/L and 2.09 g/L in increments of ±0.01 g/L, in order to explore the performance of the system. The mass of fructose was measured by means of an analytical balance, a Homgeek TL-Series balance rated at 50 g/0.001 g (capacity/resolution).

Table 1. Number of samples and their composition used in the experiment. A total of 130 samples of distilled water with different concentrations of fructose, in the range of 0 g/L to 9 g/L, were analyzed.

Fructose concentration (g/L)    0    1    2    3    4    5    6    7    8    9
Number of samples              13   13   13   13   13   13   13   13   13   13
Total samples                 130

Table 2. Number of samples and their composition used in the experiment. A total of 117 samples of distilled water with different concentrations of fructose, in the range of 2.1 g/L to 2.9 g/L, were analyzed.

Fructose concentration (g/L)  2.1  2.2  2.3  2.4  2.5  2.6  2.7  2.8  2.9
Number of samples              13   13   13   13   13   13   13   13   13
Total samples                 117

Table 3. Number of samples and their composition used in the experiment. A total of 117 samples of distilled water with different concentrations of fructose, in the range of 2.01 g/L to 2.09 g/L, were analyzed.

Fructose concentration (g/L)  2.01  2.02  2.03  2.04  2.05  2.06  2.07  2.08  2.09
Number of samples               13    13    13    13    13    13    13    13    13
Total samples                  117

A set of 130 samples of water and fructose solutions with 10 concentrations between 0 g/L and 9 g/L, 117 samples with 9 concentrations between 2 g/L and 3 g/L, and 117 samples with 9 concentrations between 2.0 g/L and 2.1 g/L have been analyzed inside an anechoic chamber using audible sound configurations.

Samples were numbered and visual inspection was used in order to ensure that complete dilution was achieved and no bubbles were formed. Careful manipulation of the samples was done in order to avoid the formation of bubbles or wall drops. Each measurement took 30 s and they were taken in a consecutive way. Each measurement was divided into different 2-s intervals, after verifying that such time was more than enough for accurate and precise spectral information.

The input data of the classification algorithm are the spectra of the audio measurements of 2 s in duration. The experiment was carried out with one audio measurement of each of the 28 different concentrations. With 13 samples per concentration (Tables 1–3), that means a total of 28 × 13 = 364 audio samples. The power spectrum of every interval was computed using the default Praat 6.0.40 options, as can be seen in our previous work [38]. Similarly, a cepstral smoothing of 100 Hz was applied, followed by a decimation procedure in which the 65,537 spectral points were averaged to reduce their number to a reasonable size (655 points) without losing the main peak structure of the spectra. In summary, the classification algorithm processed 364 input data, each of them being a spectrum defined by 655 values in the frequency range (20 Hz–22.05 kHz).

3. Algorithm for Clustering Problem

The spectral response of the liquid samples to the vibrational stimulation of the MLS sounds was used as data input to a grouping genetic algorithm (GGA) to perform the classification of the liquid mixtures according to their fructose concentration. Since the nature of the samples was not affected, it is a non-destructive method. The GGA is itself a genetic algorithm (GA) explicitly modified for solving clustering problems. A brief description of GAs and GGAs is given in this section.

3.1. From Genetic Algorithm to Grouping Genetic Algorithm

GA is a bio-inspired algorithm based on the theory of evolution of species by natural selection. The individuals of a population fight against each other to gain the resources to survive. Each individual represents an encoded solution of the optimization problem.
It is therefore an evolutionary optimization algorithm. The optimization strategy is usually applied to solve problems where it is almost impossible to find the optimal solution and there are several solutions with opposing criteria. The objective is to find one or multiple solutions which are close enough to the optimal one, with a very acceptable balance between cost and accuracy. On the other hand, “evolutionary” means that the algorithm computes the solutions through successive generations, undergoing an evolutionary process that enhances an overall improvement in the fitness value of the majority. Individuals with better fitness values are likely to survive longer than individuals with worse fitness. Along successive generations, individuals will appear that are more fit than others and will progressively improve their fitness. Each generation of individuals undergoes changes through recombination, mutation, and selection functions. These functions allow the diversity of individuals and therefore the exploration of the solution space. The execution of the evolutionary algorithm is completed when it reaches a stop condition. The most popular stopping conditions are a maximum number of generations and population convergence. Population convergence is reached when there is no progress in improving the fitness of individuals over several consecutive generations. A more extensive introduction can be found in [42].

The GGA is a modification of the GA oriented to solve grouping and clustering problems [43–46]. The fundamental difference of a GGA versus a GA lies in the encoding of the solution and the use of search operators to manage this encoding. The encoding is key to ensuring high performance in the execution of the algorithm [47]. A solution in the GGA is composed of two sections: the assignment part and the grouping part. The grouping section labels all the groups involved in the solution. The assignment part associates each element to a single group.
The value stored in the assignment part is the group assigned to each element. The information about the grouping is in the content of the solution itself and in its length. The total length of the solution is the number of elements to be classified plus the number of groups considered in the solution.
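The two-part encoding described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the elements are taken to be the 655 spectral values, and the choice of four groups is arbitrary. The source does not give the actual data structures used by the authors.

```python
import random


def random_individual(n_elements, n_groups, rng):
    """Return (assignment, grouping) for a randomly initialised GGA individual."""
    grouping = list(range(n_groups))     # grouping part: one label per group
    # Assignment part: position i stores the group assigned to element i.
    assignment = [rng.choice(grouping) for _ in range(n_elements)]
    return assignment, grouping


def decode(assignment, grouping):
    """Map each group label to the list of element indices assigned to it."""
    groups = {g: [] for g in grouping}
    for element, g in enumerate(assignment):
        groups[g].append(element)
    return groups


rng = random.Random(0)
# 655 elements (spectral values); 4 groups is an arbitrary illustrative choice.
assignment, grouping = random_individual(n_elements=655, n_groups=4, rng=rng)
groups = decode(assignment, grouping)

# Total length = number of elements + number of groups.
print(len(assignment) + len(grouping))   # 659
```

Because every element appears exactly once in the assignment part, each element belongs to a single group, and the length of the grouping part can vary between individuals.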
3.2. The Fitness Function: The Extreme Learning Machine

The fitness function numerically characterizes the individual and allows the individuals of a population to be ranked from best to worst aptitude. The fitness function used in the GGA is the extreme learning machine (ELM). It is a relatively simple machine learning algorithm that generalizes a single hidden layer feedforward network (SLFN), used for regression, binary classification, and multi-classification [48–53]. The input layer takes the input values for a given set of features from the data. The feature set can include all the features of the data or a subset of them. The output layer provides a classification of the data according to the fixed feature set. The single intermediate layer is adjusted by the training of the network. After training, the classification accuracy of the ELM is calculated according to the defined feature set. ELM has demonstrated good performance with extremely high speed [54–56]. This last feature is fundamental for its integration in the GGA, since a very large number of ELM runs is executed during each generation of the evolutionary algorithm.

The fitness function is applied to an individual by using the ELM to calculate the classification accuracy of each of the groups considered in the solution. The best rate is assigned as the fitness value of the individual and the classification rates of the rest of the groups are then discarded. The best classification accuracy corresponds to the group of features that classifies the data with the best accuracy among all the groups considered in the solution. The rest of the groups are not relevant to the solution.

3.3. Metaheuristic GGA+ELM Algorithm Application for Spectral Analysis

As already mentioned, the spectral data have 655 values in the frequency range (20 Hz–22.05 kHz). Obviously, 655 features are far too many for classification purposes.
The target of the optimization algorithm is to reduce the number of features useful for classifying liquid samples. This means a wrapper feature selection [57] where the GGA maximizes the classification accuracy. The solution is composed of a collection of features varying in length and composition. The set of features extracted by the GGA among the 655 total will constitute the classifier applicable on the spectra of the liquid samples. The challenge is to not exceed more than 10 features and to achieve a classification accuracy of more than 95%. The training and testing data sets are disjoint sets randomly selected from the total of samples. The usual ratio is 80/20, with the training set having the largest number (80%) and the test data the remaining 20%. The population size usually used in the literature varies between 20 and 100 [58,59]. The pair composition of individuals for the crossover operation is randomized. This method has also provided good results in previous research [60,61]. The crossover operation generates a population increase of 50% (a single offspring from each pair), on which a 10% mutation is applied [47,59]. This percentage is higher than usual in genetic algorithms, with the purpose of quickly exploring multiple areas of the solution space. The survival population for the next generation is composed of the winners of pairwise tournaments among the total population. The matches are chosen randomly. The fitness function value of the fighters determines the winner of each tournament. As already described, an individual is coded as a set of groups, where each group is a collection of features that can be a valid classifier of the input data. Not all groups of an individual are useful for classification, but only those with better accuracy. Note that considering a specific individual, each feature of the 655 is only present in a single group. 
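One generation of the scheme described above (random pairing, one offspring per pair, mutation of offspring, pairwise tournaments) might look like the sketch below. Several details are assumptions rather than the authors' method: the 10% figure is read as a per-offspring mutation probability, tournaments are repeated on randomly drawn pairs until the original population size is restored, and the `fitness`, `crossover`, and `mutate` callables are placeholders demonstrated here on plain numbers instead of GGA individuals.

```python
import random


def next_generation(population, fitness, crossover, mutate, rng, p_mut=0.10):
    """Run one generation: random pairing, crossover, mutation, tournaments."""
    n = len(population)
    shuffled = population[:]
    rng.shuffle(shuffled)
    # Random pairing; a single offspring per pair grows the population by 50%.
    offspring = [crossover(a, b) for a, b in zip(shuffled[0::2], shuffled[1::2])]
    # Assumption: each offspring is mutated with probability p_mut.
    offspring = [mutate(c) if rng.random() < p_mut else c for c in offspring]
    pool = population + offspring
    # Assumption: random pairwise matches are repeated until n survivors remain.
    survivors = []
    while len(survivors) < n:
        a, b = rng.sample(pool, 2)
        survivors.append(a if fitness(a) >= fitness(b) else b)
    return survivors


G_MAX = 50                                   # maximum number of generations
rng = random.Random(0)
population = [rng.uniform(0.0, 10.0) for _ in range(20)]
for _ in range(G_MAX):
    population = next_generation(
        population,
        fitness=lambda x: -abs(x - 3.0),     # toy objective: get close to 3
        crossover=lambda a, b: (a + b) / 2.0,
        mutate=lambda x: x + rng.gauss(0.0, 0.5),
        rng=rng,
    )
print(len(population))                       # 20
```

In the actual method the individuals would be the two-part GGA encodings and the fitness would be the ELM classification accuracy; the toy objective only keeps the example short and runnable.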
In the GGA fitness function, the ELM algorithm is applied over each group of the individual to classify the testing set data from the knowledge of the training data. The group with the best classification accuracy is selected as a candidate classifier. The fitness of the individual takes the value of the classification accuracy of this highlighted group, which is the best accuracy obtained among all the groups of the individual.

The stopping condition employed in the optimization is the maximum number of generations. To ensure that high-quality solutions are found within a reasonable computation time, the maximum number of generations considered is Gmax = 50.
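The fitness evaluation described above can be illustrated with a small, self-contained ELM written in plain Python. This is a sketch under assumptions, not the authors' implementation: the sigmoid activation, the ridge term, the number of hidden neurons, the helper names, and the synthetic two-class data are all illustrative choices that do not come from the source.

```python
import math
import random


def solve(A, B):
    """Solve A X = B (A square) by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(a) + list(b) for a, b in zip(A, B)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [row[n:] for row in M]


def hidden(x, W, b):
    """Sigmoid activations of the random hidden layer for one sample."""
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(ws, x)) + bi)))
            for ws, bi in zip(W, b)]


def elm_accuracy(train, test, features, n_classes, n_hidden=12, ridge=1e-3, seed=0):
    """Train an ELM on the selected feature columns and return its test accuracy."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in features] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = [hidden([x[f] for f in features], W, b) for x, _ in train]
    # Output weights from ridge-regularised least squares:
    # (H^T H + ridge * I) beta = H^T T, with one-hot targets T.
    HtH = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    HtT = [[sum(h[i] for h, (_, y) in zip(H, train) if y == c)
            for c in range(n_classes)] for i in range(n_hidden)]
    beta = solve(HtH, HtT)
    hits = 0
    for x, y in test:
        h = hidden([x[f] for f in features], W, b)
        scores = [sum(h[i] * beta[i][c] for i in range(n_hidden))
                  for c in range(n_classes)]
        hits += scores.index(max(scores)) == y
    return hits / len(test)


def fitness(groups, train, test, n_classes):
    """Fitness of an individual: the best test accuracy among its feature groups."""
    accs = [elm_accuracy(train, test, g, n_classes) for g in groups]
    best = max(range(len(groups)), key=lambda k: accs[k])
    return accs[best], groups[best]


# Synthetic data (assumption): feature 0 carries the class, features 1-2 are noise.
data_rng = random.Random(1)
data = [([k % 2 + data_rng.gauss(0, 0.05), data_rng.random(), data_rng.random()], k % 2)
        for k in range(40)]
train, test = data[:32], data[32:]           # 80/20 split, as in the text
score, best_group = fitness([[0], [1, 2]], train, test, n_classes=2)
```

Only the hidden-to-output weights are fitted, by a single linear solve, which is what makes an ELM fast enough to be called for every group of every individual in every generation.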
Appl. Sci. 2021, 11, 7301

4. Results and Discussion

The spectral analysis was performed on a total of 364 spectra from 28 different fructose contents, with 13 samples of each concentration. The 28 concentrations have been grouped into three data tables with their respective fructose concentration increments: ±1 g/L in Table 1, ±0.1 g/L in Table 2, and ±0.01 g/L in Table 3. The algorithm was run on the three sample collections. One purpose of this work is to obtain a limited set of frequencies able to satisfactorily classify the samples according to their concentration. The main objective is to determine the degree of discrimination of the classifier on fructose concentration using this method. It is expected that the classification accuracy for the samples in Table 3 will be lower than for the samples in Table 1, as the concentration increment in Table 3 is much smaller than in Table 1. It is also desired to know whether the classification accuracy for concentrations with a difference of ±0.01 g/L is acceptable or not.

4.1. Acoustic Response Spectrum

Figure 4 shows the averaged spectra for each concentration from Table 1. The spectra are defined by the sound pressure level (dB/Hz) over the audible frequency range (20 Hz–22.05 kHz). The sound pressure level is normalized in all the curves in Figure 4, with values in the range from −1 to 1.
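As a rough illustration of how an averaged and normalized spectrum of this kind could be produced, the sketch below averages the spectra of one concentration and rescales the result into the range from −1 to 1. The text does not state which normalization was applied, so the linear min-max rescaling and the helper names `average_spectrum` and `normalize_to_unit_range` are assumptions made for illustration only.

```python
def average_spectrum(spectra):
    """Element-wise mean of equally long spectra (lists of floats)."""
    n = len(spectra)
    return [sum(values) / n for values in zip(*spectra)]


def normalize_to_unit_range(spectrum):
    """Linearly rescale so the minimum maps to -1 and the maximum to 1.

    Assumed normalization; a constant spectrum is mapped to all zeros.
    """
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        return [0.0] * len(spectrum)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in spectrum]
```

In this sketch, the 13 spectra measured for one concentration would be passed to `average_spectrum`, and the result to `normalize_to_unit_range`.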
Figure 4. Spectral information of the vibrational absorption bands of ten different concentrations of fructose in distilled water (from 0 g/L to 9 g/L). The curves represent average and normalized values of sound pressure level (dB/Hz) over the audible frequency range (20 Hz–22.05 kHz) for the samples from Table 1.

Each spectrum can be characterized by particular markers associated with the chemical composition of the mixture. The manual selection of a group of frequencies as a classifier of liquid mixtures according to their concentration is a tedious and complex task because of the large number of spectral lines (the algorithm handles the range of audible frequencies as 655 spectral lines).

The average spectra obtained with the samples from Tables 2 and 3 also show a high complexity. Among the mean spectra from Table 3, with increments of 0.01 g/L, some are relatively similar to each other, and it may be necessary to select more focused frequency ranges to discriminate one concentration from another.

The optimization algorithm was run separately for each sample collection (Tables 1–3), providing several solutions. Each solution was composed of a set of frequencies that classify the samples with high accuracy. Two of these solutions were then taken to compose a combined decision system. This combined classifier works as a unique and common classifier over all samples used. In the following lines, the performance of this classifier on samples with concentrations from Tables 1–3 is analyzed.

4.2. Feature Extraction

The GGA+ELM algorithm performs feature extraction by optimizing the classification accuracy of samples according to their fructose concentration. The genetic algorithm is fed with spectral information as shown in Figure 4. Each feature is a spectral line. The optimal solution, if it exists, is unknown. The algorithm delivers two of the best solutions found in the execution. Each solution is composed of a set of frequencies (feature extraction) that classify with high accuracy. Not all solutions have the same number of frequencies.

A 2.7 GHz Intel Core i7 processor was used. Five independent simulations were performed for each of the three sets of samples. The simulation time for each was approximately 1 h, so the total computation time was about 15 h. The specific parameter values of the GGA and ELM are summarized below.
• Maximum number of generations = 50;
• Training data size = 80%;
• Testing data size = 20%;
• Population size = 50 individuals;
• Mutation probability = 0.1;
• Number of neurons of ELM = 10 for Table 1 and 11 for Tables 2 and 3.

No information on which frequencies to test first was given to the algorithm. Table 4 lists the frequencies (kHz) of the independent classifiers. Classifier 1 consists of four frequencies in the range (8–15) kHz, and Classifier 2 selects five frequencies in the range (3–15) kHz. The two classifiers are combined into a single classification system. It is noteworthy that only nine features can characterize the 28 concentrations in Tables 1–3.

Table 4. Characteristics of the two classifiers provided by GGA+ELM to discriminate the fructose concentrations of Tables 1–3. The classifiers make a decision according to the value of average energy density at specific frequencies in the acoustic response spectrum.

Frequencies (kHz)    Classifier 1    Classifier 2
f1                   8.4             3.1
f2                   11.7            11.2
f3                   13.8            12.8
f4                   14.7            13.0
f5                   -               14.5

4.3. Discussion

A total of 20,000 random and independent iterations were run for the two independent classifiers and the combined classifier on random test sets. The results are reported in Table 5. For each set of concentrations (±1 g/L, ±0.1 g/L, and ±0.01 g/L), the average value and standard deviation of the classification accuracy are given.

Table 5. Performance of the two classifiers provided by GGA+ELM and the voting system classifier to discriminate the fructose concentrations referred to in Tables 1–3. The average values and standard deviation of the accuracy were estimated from 20,000 independent and random iterations.

                       0–9 g/L (±1 g/L)         2.0–3.0 g/L (±0.1 g/L)   2.00–2.10 g/L (±0.01 g/L)
Classifier             Average     Standard     Average     Standard     Average     Standard
                       Accuracy    Deviation    Accuracy    Deviation    Accuracy    Deviation
1                      99.71       0.0126       90.32       0.0704       98.65       0.0272
2                      97.60       0.0415       85.89       0.0727       80.78       0.0824
Combined classifier    99.82       0.0123       98.98       0.0266       98.65       0.0272

Overall, it is observed that the combined classifier is valid for all concentrations in Tables 1–3 (with a minimum average accuracy of 98.65% over the 20,000 iterations). The average classification accuracy decreases as the difference between sample concentrations becomes smaller: 99.82% at ±1 g/L (Table 1), 98.98% at ±0.1 g/L (Table 2), and 98.65% at ±0.01 g/L (Table 3). The standard deviation also increases in this direction: 0.0123 with ±1 g/L (Table 1), 0.0266 with ±0.1 g/L (Table 2), and 0.0272 with ±0.01 g/L (Table 3). This pattern meets the expected results: the difficulty of discrimination rises with higher class similarity. In the following lines, we elaborate on the results for each set of classes (fructose concentration), analyzing Tables 1–3 separately.
The classification of the samples of Table 1 (0–9 g/L) has very satisfactory results with the three classifiers: Classifiers 1 and 2 of Table 4 and their combined classifier. With the three classifiers, an average accuracy of more than 97% over 20,000 random iterations is obtained. It is very remarkable that Classifier 1 can characterize, with only four spectral lines, up to ten concentrations with an average accuracy of 99.71%. Combining the two decision-makers in a single classifier achieves an average accuracy of better than 99.8%.

The nine frequencies of the combined classifier are located in the range (3–15) kHz. These frequencies have been highlighted in Figure 5 on the spectral information of the vibrational absorption bands for each concentration of Table 1. Note that the combination of these spectral lines allows the ten classes to be differentiated. Not all frequencies are equally important in the classification operation; some frequencies are more decisive in the classification among several classes. There may be other sets of frequencies that classify the samples with similar accuracy. The optimization algorithm offers solutions close to the optimal solution, without ensuring that there is a unique solution.

Figure 5. Spectral information of vibrational absorption bands of 10 concentrations from 0 g/L to 9 g/L. The curves represent the average and normalized values of the sound pressure level (dB/Hz) over the audible frequencies range. In each spectrum, the frequencies used for both classifiers are emphasized.
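The repeated random evaluation behind Table 5 (average accuracy and standard deviation over many independent random test sets) might be organized as in the following sketch. The per-class (stratified) 80/20 split, the fixed seed, and the function names `stratified_split` and `repeated_accuracy` are assumptions for illustration; the text only states that the iterations are random and independent and that 20% of the data is used for testing.

```python
import random
import statistics


def stratified_split(samples, test_fraction, rng):
    """Randomly split (spectrum, label) pairs, class by class (assumed scheme)."""
    by_label = {}
    for sample in samples:
        by_label.setdefault(sample[1], []).append(sample)
    train, test = [], []
    for group in by_label.values():
        shuffled = group[:]
        rng.shuffle(shuffled)
        # At least one test sample per class.
        n_test = max(1, round(len(shuffled) * test_fraction))
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test


def repeated_accuracy(samples, score, iterations, test_fraction=0.2, seed=0):
    """Mean and standard deviation of `score(train, test)` over random splits.

    `score` is any function returning an accuracy as a fraction in [0, 1];
    the standard deviation is reported in the same unit.
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(iterations):
        train, test = stratified_split(samples, test_fraction, rng)
        accuracies.append(score(train, test))
    return statistics.mean(accuracies), statistics.pstdev(accuracies)
```

A call such as `repeated_accuracy(samples, score, iterations=20000)` would then mirror the number of iterations reported above, with `score` wrapping whichever classifier is being assessed.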
Figure 6 shows ten different patterns for the concentration classification of Table 1. Each pattern is constructed with the average normalized energy density value for each of the nine frequencies at a concentration from Table 1. The similarity between some patterns is interesting. For example, the curve characterizing the 1 g/L concentration is similar to the 5 g/L concentration pattern. The same happens with the 4 g/L and 9 g/L concentration patterns. This phenomenon has been reproduced in all the experiments performed, and we believe that it may be associated with the resonance of the container used.

Figure 6. Spectral information patterns for fructose concentrations from 0 g/L to 9 g/L, with the average and normalized value of the sound pressure level (dB/Hz) as a function of the frequencies (kHz) used by the combined classifier.
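A pattern of this kind can be sketched as the values of a mean spectrum read at the nine frequencies of Table 4, and the similarity between two patterns can be quantified with a distance. The sketch below assumes that the 655 spectral lines are uniformly spaced between 20 Hz and 22.05 kHz and uses the Euclidean distance as the similarity measure; both choices, and the helper names, are illustrative assumptions rather than details given in the text.

```python
import math

# Frequencies (kHz) of the two classifiers, taken from Table 4.
CLASSIFIER_1_KHZ = [8.4, 11.7, 13.8, 14.7]
CLASSIFIER_2_KHZ = [3.1, 11.2, 12.8, 13.0, 14.5]
COMBINED_KHZ = sorted(CLASSIFIER_1_KHZ + CLASSIFIER_2_KHZ)


def line_index(freq_khz, n_lines=655, f_min_khz=0.02, f_max_khz=22.05):
    """Index of the spectral line closest to `freq_khz`.

    ASSUMPTION: the lines are uniformly spaced over the audible range.
    """
    step = (f_max_khz - f_min_khz) / (n_lines - 1)
    idx = round((freq_khz - f_min_khz) / step)
    return min(max(idx, 0), n_lines - 1)


def pattern(mean_spectrum, freqs_khz=COMBINED_KHZ):
    """Values of a mean spectrum at the selected classifier frequencies."""
    n = len(mean_spectrum)
    return [mean_spectrum[line_index(f, n)] for f in freqs_khz]


def pattern_distance(p, q):
    """Euclidean distance between two patterns (smaller means more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
```

Under these assumptions, a small `pattern_distance` between two concentrations would correspond to the visually similar curves noted above, such as those of 1 g/L and 5 g/L.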
The application of the combined classifier on 143 samples from Table 2 (concentrations between 2.0 g/L and 3.0 g/L with increments of ±0.1 g/L) provided satisfactory results. As in the previous case, a sequence of 20,000 random and independent iterations was carried out. As shown in Table 5, Classifiers 1 and 2 obtained an average accuracy higher than 85%, whereas their combined classifier improves the average classification accuracy to 98.98% with a standard deviation of 0.0266. This result is very satisfactory, although worse than that obtained for 1 g/L concentrations. The greater the similarity among classes, the more complex the classification and the lower the precision.

Figure 7 presents the patterns for the concentrations between 2.1 g/L and 2.9 g/L. These have been generated from the average and normalized energy density values for the nine frequencies of the combined classifier. Very similar values are observed in general since the variation in concentration is only ±0.1 g/L. Extreme closeness is appreciated in some cases, such as the 2.3 g/L and 2.4 g/L concentrations, and also the 2.8 g/L and 2.9 g/L concentrations.
Figure 7. Spectral information patterns for fructose concentrations from 2.1 g/L to 2.9 g/L, with the average and normalized value of the sound pressure level (dB/Hz) as a function of the frequencies (kHz) used by the combined classifier.

For classes with a concentration difference of ±0.01 g/L in Table 3, the 143 samples were also analyzed by Classifiers 1 and 2 and the combined system. The last columns of Table 5 show the results obtained by each classifier over 20,000 random and independent iterations. Classifier 1, with four frequencies in the 8–15 kHz range and an average classification accuracy of 98.65%, was much more effective than Classifier 2, where the accuracy decreased to 80.78%. The combination of both classifiers did not improve the accuracy of Classifier 1, so the combined system considers only the decision of the first classifier, ignoring the decision of the second one. As discussed at the beginning of the section for the concentrations of Table 1, not all frequencies contribute equally to the classification. Given the high similarity of the concentrations of Table 3, the frequencies of Classifier 2 do not add new information to the decision process over the information of Classifier 1. As a result, all the samples were classified with an average accuracy of 98.65% and a standard deviation of 0.0272.

The allocation of the four frequencies of Classifier 1 in the middle band of the spectrum, between 8 and 15 kHz, is understandable: it is the band with high mean energy density values. Figure 8 shows the position of these frequencies in the mean spectra of the nine concentrations between 2.01 g/L and 2.09 g/L. Figure 9 shows the patterns generated by Classifier 1 from the average and normalized energy density for the four frequencies. As in the previous cases, there are very similar patterns, for example the 2.01 g/L and 2.02 g/L patterns. There is also close similarity between the 2.03 g/L, 2.05 g/L, and 2.09 g/L patterns.
Figure 8. Spectral information of vibrational absorption bands of ten concentrations from 0 g/L to 9 g/L. The curves represent the average and normalized values of the sound pressure level (dB/Hz) over the audible frequencies range. In each spectrum, the frequencies used for both classifiers are emphasized.

Figure 9. Spectral information patterns for fructose concentrations from 2.01 g/L to 2.09 g/L, with the average and normalized value of the sound pressure level (dB/Hz) as a function of the frequencies (kHz) used by Classifier 1.
5. Conclusions

We have described a new non-invasive, low-cost method based on the spectral analysis of audible scattered sounds to determine the concentration of liquid mixtures according to their chemical composition. The spectral information was analyzed by a metaheuristic algorithm. ELM was integrated to implement the fitness function of the optimization algorithm GGA and extract a reduced set of frequencies as a classifier. The acoustical response spectrum of the samples to MLS sounds was used after a previous comparison with other sounds, like chirps, square pulses, and white noise. It was sufficient to examine the spectrum response at a few frequencies instead of analyzing the whole range of audible frequencies.

The experiments were carried out with 364 measurements from 28 samples of distilled water and fructose mixtures (150 mL) with a fructose concentration varying between 0 and 9 g/L. The 28 concentrations were grouped into three sets with increasing difficulty: ten concentrations between 0 g/L and 9 g/L with ±1 g/L increments, nine concentrations between 2.1 g/L and 2.9 g/L with ±0.1 g/L increments, and nine between 2.01 g/L and 2.09 g/L with ±0.01 g/L increments.

This work has allowed us to reduce the problem to a set of only nine frequencies in the (3–15) kHz range able to classify samples with concentrations of any of the three sets described. In the most complex case, the proposed classifier was able to discriminate fructose concentrations with variations of ±0.01 g/L with an average accuracy of 98.65%. The higher the concentration difference, the better the classification accuracy. For samples with increments of ±0.1 g/L, the average accuracy is 98.98%, and when the concentration increments are ±1 g/L, the average accuracy rises to 99.82%.

The optimization algorithm returned different solutions with similar performance. The solution to the problem was not unique.
It is important to note that changes in the number of classes are likely to change the set of frequencies selected in the solutions. Each frequency had a different weight in the classification process.

Future research will focus on analyzing other chemicals and more complex mixtures, and on improving the accuracy of this sensing method.

Author Contributions: Conceptualization, J.A.M.R.; methodology, J.A.M.R., P.G.D., M.U.M. and J.A.H.; software, P.G.D. and M.U.M.; validation, J.A.M.R. and J.A.H.; formal analysis, J.A.M.R., P.G.D. and J.A.H.; investigation, J.A.M.R., P.G.D., M.U.M. and J.A.H.; resources, M.U.M. and J.A.M.R.; data curation, P.G.D. and M.U.M.; writing—original draft preparation, J.A.M.R., P.G.D., M.U.M. and J.A.H.; writing—review and editing, J.A.M.R., P.G.D., M.U.M. and J.A.H.; visualization, P.G.D. and M.U.M.; supervision, J.A.M.R. All authors have read and agreed to the published version of the manuscript.

Funding: No external funding sources were used for this research.

Institutional Review Board Statement: The study does not include research on humans or animals.

Informed Consent Statement: Not applicable.

Data Availability Statement: Data sharing not applicable.

Conflicts of Interest: The authors declare no conflict of interest.