Optimizing Machine Learning Parameters for Classifying the Sweetness of Pineapple Aroma Using Electronic Nose - inass

Page created by Edgar Hart

Food & Drink

English

Like
Share
Embed
Fullscreen
Slides
Download HTML
Download PDF
Abuse

←

→

Page content transcription

If your browser does not render page correctly, please read the page content below

Optimizing Machine Learning Parameters for Classifying the Sweetness of Pineapple Aroma Using Electronic Nose - inass

Received: March 1, 2020. Revised: May 22, 2020. 122

 Optimizing Machine Learning Parameters for Classifying the Sweetness of
 Pineapple Aroma Using Electronic Nose

 Mhd Arief Hasan1 Riyanarto Sarno2* Shoffi Izza Sabilla2

 1
 Informatics Study Program, Universitas Lancang Kuning, Pekanbaru 28265, Indonesia
 2
 Department of Informatics, Faculty of Intelligent Electrical and Informatics Technology,
 Institut Teknologi Sepuluh Nopember (ITS), Surabaya 60111, Indonesia
 * Corresponding author’s Email: riyanarto@if.its.ac.id

 Abstract: Electronic nose (e-nose) has been widely used to distinguish various scents in food. The output of e-nose
 is a signal that can be identified, compared, and analyzed. However, many researchers use e-nose without using
 standardization tools, therefore e-nose is still often questioned for its validity. This paper proposes an electronic nose
 (e-nose) to classify the sweetness of pineapples. The standard sweetness levels are measured by using a Brix meter
 as a standardization tool. The e-nose consists of a series of gas sensors MQ Series which are connected to the
 Arduino micro-controller. The sweetness levels measured by the Brix meter are then ordered into low, medium, high
 sweetness groups. These sweetness groups are used as label ground-truth for the e-nose. Signal processing PCA and
 mother wavelet is employed to reduce noise from the e-nose signals. The signal processing methods obtain optimal
 parameters to find the characteristics of each signal. Machine learning methods were successfully carried out with
 optimized parameters for the classification of three levels of sweetness of pineapple. The best accuracy is 82% using
 KNN with 3 neighbors.
 Keywords: E-nose, Classification, Pineapple, Statistical parameters, PCA, Threshold, Wavelet.

 array with each with a different sensitivity. Then,
1. Introduction the electrical signals in the form of sensor responses
 form specific patterns for the type of scent that is
 An electronic nose (e-nose) has been widely
 captured. To be able to identify or classify the
used to distinguish various scents in food. The
 pattern of a sample object, it cannot be done just by
working principle of e-nose is to mimic the function
 looking directly at the pattern produced
of the human nose, in which there are several
 quantitatively, further analysis is needed using a
receptors to identify an aroma. These receptors will
 pattern recognition machine. However, many
be replaced by sensors on the electronic nose. The
 researchers use e-nose without using standardization
electronic nose consists of several sets of gas
 tools, therefore e-nose is still often questioned for its
sensors with different selectivity, a signal collection
 validity.
unit, and a series of pattern recognition software. E-
 Pineapple, or ananas (Ananas comosus (L.)
nose has been widely used for research in the field
 Merr.) It is a type of tropical plant originating from
of food [1] and fruits [2].
 Brazil, Bolivia, and Paraguay. This plant belongs to
 The output of e-nose is a signal that can be
 the Bromeliaceae family. Smell the pineapple scent
identified, compared, and analyzed. The e-nose,
 is one of the techniques to distinguish young and
which consists of an array of non-selective chemical
 mature pineapple. A ripe pineapple is a pineapple
gas sensors, functions to capture and convert odors
 whose bottom has a distinctive aroma of ripe, sweet,
into electrical signals or sensor responses. The
 and fruity fruit. Whereas if a pineapple has an aroma
generated electrical signals are overlapping aromas
 like acid or something fermented like the smell of
forming compounds that are captured by the sensor
International Journal of Intelligent Engineering and Systems, Vol.13, No.5, 2020 DOI: 10.22266/ijies2020.1031.12

Received: March 1, 2020. Revised: May 22, 2020. 123

vinegar, then this pineapple is classified as young [2]. Many studies have used E-nose to measure fruit
pineapple. At present, to determine the level of fruit maturity using E-nose. In the research [3], E-nose is
sweetness, people still rely on trained people. used to calculate the levels of OXIDE in ripe and
Therefore this study proposes to classify pineapples young fruit. This study concluded that the resulting
using E-Nose based on the sweetness level. OXIDE levels could affect the young and ripe of
 For the standard sweetness level is measured fruit. Electronic noses can also distinguish odor
using a Brix meter as a standardization tool. Brix differences between different fruit varieties such as
Meter is a tool used to determine sugar levels in apples [4]. The use of E-Nose allows complex
food or fruit. This study uses a Brix meter to aroma changes in the fruit to be monitored [5]. It is
determine the sweetness level of each pineapple allowing a fresh pineapple to be very easily
sample to be tested using E-nose. perishable (rotten) as it affects the results of the e-
 The sweetness level measured by the Brix meter nose data that is produced. Another story in the
is then divided into 3 groups, namely low, medium, research of [6] calculates the quality level of ripe
and high sweetness. These groups are used as olive oil. European regulations govern the
ground-truth labels for e-nose. Signal processing parameters and methodology for determining the
using Principal Component Analysis (PCA) and quality of olive oil, and this quality mainly depends
wavelets aims to reduce the noise from signals on the condition of the olives after harvest. They
generated by e-nose. PCA is a technique used to succeeded in classifying with three methods of
simplify data, by transforming data linearly to form Naive Bayes (NB), Partial Least Squares
a new coordinate system with maximum variance. Discriminant Analysis (PLS-DA) and Multilayer
Principal component analysis can be used to reduce Perceptron (MLP) neural network. We determined
the dimensions of data without significantly the sweetness level of the pineapple with several
reducing the characteristics of the data. Meanwhile, sweetness levels measured using a Brix meter.
Wavelet is one tool that can be used to analyze non- Previous studies have also carried out the
stationary signals. Wavelet analysis can be used to classification process using E-nose for Java cocoa
show temporary behavior on a signal. This Wavelet fruit [7]. The best predictive classification process is
Transform Method can be used to filter data or obtained by the E-nose - MLP- ANN procedure.
improve the quality of data. The signal processing However, this research only still discusses the odor
method is calculated to obtain the optimal that occurs. Not yet discussed the integration of the
parameters and find the characteristics of each content of the fruit that affects the quality of the fruit
signal. itself. This research has not selected the best sensor
 The next process, the use of learning machines in the feature selection process using E-Nose [8].
for the classification of three levels of sweetness Instead, our research team tried to classify the
pineapple. There are four methods used in this pineapple we shared in the sweetness level process
machine learning process, namely Support Vector using E-nose and Brix Meter. Of course, this
Machine (SVM), k-nearest neighbor (KNN), research is very useful for pineapple processing
Multilayer Perceptron (MLP), and Random Forest. companies, where they require to choose the best
Of the four Machine Learning methods, the highest pineapple in large quantities (container). We begin
accuracy value with optimal parameters will be used this basic research for broader development, namely
as the success level of this study. the detection of pineapples in industrial capacity
 later.
2. E-Nose to determine the level of fruit
 sweetness. 3. Research methodology
 E-Nose technology began to be created and 3.1 Materials
developed in the early 1990s. A few years later,
Benady and Simon in 1995 began to explore E-Nose In this research, the E-Nose used is an MQ
research for the use of electronic noses to examine Series. There are ten types of gas sensors used.
the use of E-Nose in the level of fruit maturity Then These sensors are used to apprehend the difference
three years later Maul Et Early (1998) started to from the aroma of pineapple. In communicating the
create a product for commercialization, also called data, a Universal Serial Bus (USB) port is connected
the e-NOSE 4000 projects from Neotronics to the Arduino microcontroller to transfer the data
Scientific Inc., USA, which consists of 12 into a computer. The collection of gas sensors is
conductive polymers. This electronic nose has inside a sample chamber of transparent glass. In Fig.
proven useful in predicting in a non-destructive way

International Journal of Intelligent Engineering and Systems, Vol.13, No.5, 2020 DOI: 10.22266/ijies2020.1031.12

Received: March 1, 2020. Revised: May 22, 2020. 124

1, it is explained how the mechanism of the E-Nose Table 1. Sensors used and their gas compounds
tool system works. The pineapple was purchased Sensors Gas Types
from the same fruit store and was ripe at the same MQ 2 CH4, Hydrogen (H2), Carbon
age as a sample. The pineapple was cut with the size Monoxide (CO), LPG, Alcohol,
of a dice and the weight of each sample of pineapple Propane, Smoke
 MQ 4 CH4, Natural Gas, Alcohol, Smoke
(dice) used was the same, specifically 100 grams.
 MQ 6 H2, Alcohol, LPG, Cooking Fumes
 The steps in sampling the data were as follows:
 MQ 9 Coal Gas, CO, Liquefied Gas
 1. e-nose was turned on and warmed for 5 MQ 135 NH3, alcohol, NOx, smoke,
 minutes; benzene, and CO2
 2. before the sample was put into the sample room. MQ 136 H2S
 The sample of pineapple was measured by MQ 137 CO, Ammonia
 sugar content using Brix meter; MQ 138 phenyl methane, ethanol, acetone,
 3. the sample was put into the sample chamber; and methanol
 4. the process of taking data took place where the DHT – 22 Temperature and Humidity
 USB port from the sample room was inserted
 into the computer. The data collection process others. For fruit plants, refractometers help to
 lasted for approximately 10-15 minutes; determine when is the right time to harvest and
 5. the sample room was cleaned of existing classifying fruit based on the level of sweetness.
 samples; this was managed to avoid the This refractometer has a measurement range of: 0-
 occurrence of residuals from gases that occur 32% Brix, Resolution: 0.2%, Working Temperature
 from previous data collection. This process average in 10 ~ 30º C, Dimensions: 20.5 x 8 x 5.5
 took approximately 5 minutes. Therefore the cm or 8 x 3.5 x 2.3 inch. How to use this tool is by
 sample chamber was released entirely from the using a pipette to take the liquid to be measured.
 smell of the existing gas. Open the prism cover, and place 2-3 drops on the
 surface of the prism, then close it again. Point the
3.2 E-nose system prism to the light source, and observe from the side
 of the glass binoculars. Then Adjust the knob;
 The E-nose system developed in this study is
 therefore, the size is visible, its value is read based
shown in Fig. 1. Consisted of three components,
 on the boundary between the blue and white areas.
which are data acquisition and transmission unit;
 We use this to measure the sugar content in
sensor arrays and space units; power and gas supply
 pineapples.
unit. E-nose systems were always equipped with
sensors that used gas sensors that were sensitive to 3.4 Stages of research
aromatic compounds, nitrogen oxides, hydrogen,
alkanes, methane, sulfur compounds, alcohol [9]– This research used several experimental
[11]. According to volatile compounds emitted by techniques to produce the best quality data.
peaches in ripening procedures [12], volatile Following Fig. 2 which explains the research flow
metabolites were released during the growth of are explained as follows:
microorganisms [13], [14], and sensor array 1. pineapple data samples are measured using a
selection in related literature [9]–[11], our E-nose Brix meter. Then proceed with the collection of
sensor arrays were built with various sensors of E-Nose data for 15 minutes through Arduino
eight semiconductor metal oxide sensors. with the output format in the form of Comma
 Table 1 explains the use of the number of Separated Value (CSV). Resulting in E-Nose
sensors used on this e-nose. There were nine sensors data of 518 Data Lines;
that we use as described. The Gas Type column is 2. then a threshold is conducted to measure the
the type of gas produced by the sensor. However, in sweetness of the existing pineapple. So that there
this study, we did not discuss what type of sensors are 3 classes of pineapple (Low, Medium, and
and what gases would dominate the data collection High);
of pineapple using E-nose, as previous studies have 3. because the data generated in the calculation of
done [15, 16]. pineapple time is not the same. So we decided to
 take lines 113 to 300 with the aim that data
3.3 Brix meter processing can be optimized;
 4. then we reduce the noise that occurs in the data
 Brix Refractometer is used to measure sugar
 using the Mother wavelet method by using
levels in fruits, fruit juices, coffee, soft drinks, and

International Journal of Intelligent Engineering and Systems, Vol.13, No.5, 2020 DOI: 10.22266/ijies2020.1031.12

Received: March 1, 2020. Revised: May 22, 2020. 125

 Figure. 1 E-nose system

 several functions namely Sym5, db1, rbior1,1 Thus, the class interval length ( ) is obtained.
 using levels 1-10;
 ℎ
5. then we calculate the statistical parameters from = (2)
 the previous step (step 4). So we get 5400
 features from these statistical parameters;
6. Then we did a feature selection using Pearson So as the following intervals and number of data are
 Correlation and Chi-Square; obtained, the result of the interval level shown in
7. After obtaining the features then enter the Table 2.
 Principal Component Analysis (PCA) process. Table 2 explains the range of sweetness levels
 To simplify data, by transforming data linearly obtained after being calculated using a Brix meter.
 so that a new coordinate system with a maximum Then proceed with the threshold process. The
 variant is formed. The PCA process is intended interval is calculated using the manual Brix meter
 to determine the best components. The tool. In this range, there is no comma value, because
 experiment continued with classification. This the needle in the Brix meter is right in each row of
 classification process uses four methods namely numbers shown. Therefore, this study does not take
 SVM, KNN, MLP, and Random Forest; into account the fraction of the range of sweetness
8. So from the 4 experimental methods obtained the of pineapple.
 best accuracy value. Obtained accuracy with The data generated is in the form of CSV format
 optimal parameters. with the amount following the explanation in Table
 1. Each CSV produces 518 lines. Because during
3.4.1. Determination of the brix meter threshold value the data collection process there is a difference in
 the time difference between each turn on e-nose. We
 In the initial stage, the value of the limit range of decided to retrieve data from row 113 to row 300.
the sweetness of the pineapple is made. This This is intended to get maximum results. Because
calculation is done by determining the threshold these lines are lines of data retrieval directly from
value contained in the Brix meter. Brix meter value sensors that are on the E-Nose System tool
with a maximum value of 32%. This range produces described in Fig. 1.
three classes, mainly sweet, half sweet, and less
sweet. The calculation uses the following formula. If Table 2. Pineapple sweetness interval level
based on the interval formula, the interval values are Class Sweetness Value Number of
calculated wherein: Interval data
 1 0 15 High 28 Data
 =7

International Journal of Intelligent Engineering and Systems, Vol.13, No.5, 2020 DOI: 10.22266/ijies2020.1031.12

Received: March 1, 2020. Revised: May 22, 2020. 126

 Parameter Selection
 Data Treshold Selection
 Statistics Feature
 Feature
 (Pearson
 Cut Of Data Reduice Noive Parameter Correlation,
 (Line 113 to 300) (Wavelet) Statistic Chi Square)

 Best Accuracy Classification PCA
 End With Optimal For 3 Classes
 Parameter

 Automatically to
 0

Received: March 1, 2020. Revised: May 22, 2020. 127

 Table 3. The Statistical Parameter function used the appearance of the feature and the appearance of
Function Description the category, which then each term value is ranked
 H5 The h5py package is a Pythonic interface to from the highest. Chi-Square test in statistics is
 the HDF5 binary data format. HDF5 allows applied to test the independence of two events.
 users to store large amounts of numerical ( − )2
 data, and easily manipulate that data from 2 = ∑ (6)
 
 NumPy. where,
 N25 The binomial opportunity data distribution O : Observation value
 for N equals 25 E : Expected value (desire)
 N75 The binomial opportunity data distribution
 for N equals 75
 = ( − 1)( − 1) (7)
 N95 The binomial opportunity data distribution
 for N equals 95
 where :
 Median Used to find the middle value of data b : number of rows
 Mean The average value obtained from the sum of k : number of columns
 all values from each data, then divided by the
 amount of data available
 3.4.5. Machine learning

3.4.4. Feature selection using pearson correlation and One of the main topics in data mining or
chi-square machine learning is classification. Classification is
 an activity to group data consisting of classes then
 Feature selection is an essential part of labeled and targeted. The purpose of this algorithm
optimizing the performance of classification is to solve the problem then create a classification
methods. The primary purpose of feature selection is category that is included in the process of supervised
to reduce complexity, increase accuracy, and choose learning. The purpose of supervised learning is that
optimal features from a data feature set [20]. label data or targets play a role as a 'supervisor' or
Research conducted by Aytug Onan and Serdar 'teacher' who oversees the learning process in
Korukoglu discusses the comparison of several achieving a certain level of accuracy or precision.
feature selection methods, the feature selection This classification process uses python because it
method using Pearson Correlation Coefficient can has full features and has a platform that can be used
improve accuracy and has the advantage of being both for research and for building production
practical and fast for a massive number of features systems. This study uses four classification methods
[21]. Following is the Pearson Correlation as follows:
Coefficient equation shown in Eq. (5). a. SVM (Support Vector Machine) SVM is a
 machine learning method that carries out
 (∑ )− (∑ )( ) activities based on the rules of Structural Risk
 = (5)
 √[ ∑ 2 −(∑ )2] √[ ∑ 2 −(∑ )2] Minimization (SRM). SVM has a target to find
 the best hyperplane. The goal is to separate the
where: two classes in the input space. The basic
 : Pearson’s Product Moment Correlation principle of SVM is a linear classifier and
 Coefficient subsequently developed to work on non-linear
 : Number of data pairs of and problems. SVM tries to find the best
∑ : Total number of variables hyperplane in the input space. The basic
∑ : Total number of variables principle of SVM is a linear classifier and
∑ 2 : The square of the total number of subsequently developed to work on non-linear
 variables problems. SVM has two parameters, which are
∑ 2 C and gamma. This research creates an array
 : The square of the total number of
 variables for these parameters, as follows:
 ‘model_C’ : [0.0001; 0.001; 0.01; 0.1; 1; 10;
 Chi-Square feature selection uses statistical 100]
theory to test the independence of a term with its ‘model_gamma’ : [0.0001; 0.001; 0.01; 0.1; 1;
category [20]. One of the purposes of using this 10; 100]
feature selection is to eliminate disruptive features
in the classification. In the Chi-Square feature b. KNN (k-nearest neighbor). KNN classifies
selection based on statistical theory, two of them are objects based on learning data that is the closest

International Journal of Intelligent Engineering and Systems, Vol.13, No.5, 2020 DOI: 10.22266/ijies2020.1031.12

Received: March 1, 2020. Revised: May 22, 2020. 128

 distance to the object. KNN classifies the pineapple. Data processing methods are used to
 projected learning data in multiple dimensions. determine its classification using the PCA method.
 This space is divided into sections that This technique is widely used to reduce the number
 represent learning data criteria. Each learning of dimensions in a data set as it only uses the
 data is represented as points c in many- components that most contribute to tasks such as
 dimensional spaces. The functions of KNN in classification or regression in Machine Learning.
 Python are used as follows: The introduction of this pattern is expected to
 ‘model_n_neighbor’ : [1; 2; 3; 4; 5; 6; 7; 8; 9;
 simplify testing the difference in the sweetness level
 10] of pineapple based on the aroma of pineapple.
 ‘model_weights’ : [distance; uniform]
 3.4.7. Accuracy
c. MLP Multilayer Perceptron (MLP) MLP is Accuracy is one of the metrics for evaluating
 one type of classification method with an classification models. Informally, accuracy is the
 artificial neural network algorithm that adopts prediction fraction of our correct model. Formally,
 the workings of neural networks in living accuracy has the following definition :.
 things. This algorithm is known to be reliable
 because of the learning process that can be Accuracy
 carried out in a directed direction. Learning ℎ 
 this algorithm is done by updating the back =
 ℎ 
 weight (backpropagation). Determination of
 the optimal weight will produce the right
 classification results. MLP consists of a simple For binary classification, accuracy can also be
 system of interconnecting networks. The MLP calculated in terms of positive and negative as
 functions that we use in Python are as follows : follows:
 ‘model_activation’:[relu; logistic; tanh; relu] + 
 ‘model_hidden_layer_sizes’:[500] =
 + + + 
 ‘model_max_iter’:[10000]

d. Random Forest. This technique generates Where,
 many decision tree classifications, then make = True Positive
 decisions based on several decision trees that = True Negative
 have been made. The advantage of using = False Positive
 random forest is being able to classify data that = False Negative
 has incomplete attributes and can be used for
 classification and regression. The Random 4. Result and discussion
 Forest function used in python is as follows :
 As described in Fig. 2, researchers conducted
 ‘model_n_estimator’:[50; 100; 500; 1000] several models of machine learning testing
 experiments to determine the best results for data
3.4.6. Principal component analysis accuracy. In discussing these results we present the
 experiments that we have carried out and what the
 Principal Component Analysis (PCA) is a accuracy values obtained from these experiments.
method used to reduce the amount of data when a The experiments we use include processing data by
correlation occurs [22–24]. The aim is to find the classification, wavelets, PCA, Selection Feature.
essential part of which is a linear combination with The following is a complete explanation of the
an origin variable that explains each sensor. PCA results of the experiment and the results obtained for
projects a data matrix that initially has a high accuracy.
dimension to the lowest dimension (3 dimensions or
two dimensions) without losing the necessary 4.1 Data-classifier experiments
information. The correlation among samples can be
visualized by a plot of each significant component The first experiment conducted by us was to
[1, 25–30]. process pineapple data from E-Nose by using the
 In this research, researchers want to find out the SVM classification method. By using python we do
response pattern of a series of 10 gas sensors SVM Classification so that the result of Parameter C
(electronic nose) toward the odor pattern on is 100, The Gamma model is 0.001 with an accuracy

International Journal of Intelligent Engineering and Systems, Vol.13, No.5, 2020 DOI: 10.22266/ijies2020.1031.12

Received: March 1, 2020. Revised: May 22, 2020. 129

of 0.770. This accuracy value does not meet the reduce_dim n_components equal to 10 with
ideal classification value above 0.8. Therefore we do an accuracy value of 0.689.
other clarification methods. c. Classification using MLP obtained parameter
 activation model is equal to relu,
4.2 Data-wavelet-classifier experiments model__hidden_layer_sizes is equal to 500,
 max_iter model is equal to 10000,
 In the second experiment is to process E-Nose reduce_dim n_components is equal to 20 with
data from pineapple earlier by denoising the signal an accuracy value of 0.672.
obtained using wavelet. As explained in the previous d. Classification using Random Forest obtained
experimental method in Fig. 2. Wavelet used is the parameter model n_estimators equal to 500
mother wavelet Sym 5, db1, record 1.1 with levels with an accuracy value of 0.770.
1-10. After Denoising, classification was continued From the results of this test, it was proven that
with 4 methods namely SVM, KNN, MLP, Random the PCA process was able to improve the accuracy
Forest. In this experiment obtained better of classification testing. The best value obtained
classification parameters than previous experiments. from the results of the classification using this SVM
The parameter results obtained are as follows: method is 0.803.
a. Classification using SVM obtained values
 obtained parameter C is 10, Gamma Model is 4.4 Data-wavelet-selection feature-classifier
 0.0001 with an accuracy of 0.787. experiments
b. Classification using KNN obtained value
 obtained results 'model n_neighbors' is 6, In the fourth test, we replaced the third test,
 'model__weights' is ‘distance' with an replacing the PCA process with feature selection.
 accuracy of 0.656. Feature selection is useful for removing data that
c. Classification using MLP obtained value does not correlate So that it will improve the
 obtained results ‘activation model’ is ‘relu’, accuracy of the data in making predictions. After the
 ‘model__hidden_layer_sizes’ is 500, max_iter wavelet data is obtained then a Feature selection is
 model is 1000 with accuracy is 0.705. done using Python. After the wavelet data is
d. Classification using Random Forest obtained obtained then a Feature selection is done using
 the value of the 'n_estimators model is 1000 Python:
 with an accuracy of 0.738. a. The classification using the SVM method
 From this second step of experimentation, we shows that the parameter value of model C is
have not to get maximum results. The accuracy equal to 10, the gamma model is equal to
value obtained is still far lower than the results of 0.0001 with an accuracy value of 0.803.
the previous stage. b. Classification using the KNN Method
 obtained parameters with the n_neighbors
4.3 Data-wavelet-PCA-classifier experiments model is 6, model__weights is 'distance' with
 an accuracy value equal to 0.770
 Then we proceed with the next experiment. c. Classification with MLP is obtained The
Before doing the classification, we performed the activation model parameter is relu
PCA process, after the wavelet was obtained from model__hidden_layer_sizes is equal to 500,
the pineapple data E-Nose signal. The PCA process max_iter model is equal to 10000 with an
aims to carry out statistical processes on data to accuracy value of 0.705.
simplify existing data, By way of transforming data d. Random Forest classification obtained the
linearly so that a new coordinate system is formed results with the parameter model n_estimators
with maximum variance. After conducting this is 500 with an accuracy value of 0.770.
experiment which ended with the classification
process, obtained better results with the following 4.5 Data-wavelet-selection feature-PCA-
details, classifier experiments
a. Classification using SVM model C is equal to
 1, the gamma model is equal to 0.001, In this final test, we performed all stages from
 reduce_dim n_components is equal to 20 with steps 1 to four. Where after the data is processed
 an accuracy value of 0.803. with wavelet then performed Feature Selection the
b. Classification using KNN obtained Parameter PCA process. Finally, the classification process is
 model ‘n_neighbors’ equals 1, carried out. The results of this fifth test turned out to
 ‘model__weights’ equals distance, produce the best results from the previous test, the
 results of the test classification are as follows:
International Journal of Intelligent Engineering and Systems, Vol.13, No.5, 2020 DOI: 10.22266/ijies2020.1031.12

Received: March 1, 2020. Revised: May 22, 2020. 130

 Table 3. Experimental results with a few steps classification experiments
 Experiment Classifier Optimal parameter Accuracy
 Data – Classifier SVM With Prarameter {'model_C': 100, 'model_gamma': 0.001} 0.770
 SVM With Prarameter {'model C': 10, 'model gamma': 0.0001} 0.787
 With Prarameter {'model n_neighbors': 6,
 KNN 0.656
 'model__weights': 'distance'}
 Data – Wavelet – With Prarameter {'model activation': 'relu',
 Classifier MLP 'model__hidden_layer_sizes': 500, 'model 0.705
 max_iter': 10000}
 RandomForest With Prarameter {'model n_estimators': 1000} 0.738
 With Prarameter {'model C': 1, 'model gamma': 0.001,
 SVM 0.803
 'reduce_dim n_components': 20}
 With Prarameter {'model n_neighbors': 1,
 KNN 'model__weights': 'distance', 0.688
 Data – Wavelet– 'reduce_dim n_components': 10}
 PCA – Classifier With Prarameter {'model activation': 'relu',
 MLP 'model__hidden_layer_sizes': 500, 'model max_iter': 10000, 0.672
 'reduce_dim n_components': 20}
 With Prarameter {'model n_estimators': 50, 'reduce_dim
 RandomForest 0.787
 n_components': 10}
 SVM With Prarameter {'model C': 10, 'model gamma': 0.0001} 0.803
 With Prarameter {'model n_neighbors': 6,
 KNN 0.770
 Data – Wavelet– 'model__weights': 'distance'}
Selection Feature – With Prarameter {'model activation': 'relu',
 Classifier MLP 'model__hidden_layer_sizes': 500, 0.705
 'model max_iter': 10000}
 RandomForest With Prarameter {'model n_estimators': 500} 0.770
 With Prarameter {'model_ _activation': 'relu', 'model
 MLP hidden_layer_sizes': 500, 'model max_iter': 10000, 0.738
 'reduce_dim n_components': 10}
 With Prarameter {'model n_neighbors': 3,
 Data - Wavelet –
 KNN 'model__weights': 'uniform', 0.820
Selection Feature –
 'reduce_dim n_components': 10}
 PCA – Classifier
 With Prarameter {'model C': 1, 'model gamma': 0.001,
 SVM 0.803
 'reduce_dim n_components': 10}
 With Prarameter {'model n_estimators': 100, 'reduce_dim
 RandomForest 0.787
 n_components': 30}

a. KNN classification with Parameter that makes KNN as the best Machine Learning in
 n_neighbors model equal to 3, model the classification process [31]. Overall the results of
 _weights equal to uniform, reduce_dim this test are conveyed in Table 3. Experimental
 n_components equal to 10 with an accuracy results with several steps of experimentation and
 of 0.820. classification.
b. Classification SVM resulted for C and gamma
 parameters are 1 and 0.001 respectively. The 5. Conclusion
 reduction dimension using 10 n_component
 with accuracy is 0.803. From the results of several experiments in this
c. Random Forest Classification with Parameter study, we have managed to find the best accuracy in
 model n_estimators is 100, reduce_sum processing pineapple data using E-nose and Brix
 n_components is 30 with accuracy results is meters.We have processed the E-Nose data into
 0.787. three classes which are divided based on the
 The best accuracy is obtained by KNN threshold of the sweetness of pineapple. The best
Classification with an accuracy value of 0.820, It accuracy results we got with a value of 0.820. The
also provides the same thing as previous research accuracy results obtained by using the classification
International Journal of Intelligent Engineering and Systems, Vol.13, No.5, 2020 DOI: 10.22266/ijies2020.1031.12

Received: March 1, 2020. Revised: May 22, 2020. 131

using K-Nearest neighbor (KNN). In this study, e-Nose”, Mater. Today Proc., Vol. 5, No. 3, pp.
KNN is the best method for obtaining accuracy 9866–9870, 2018.
results from machine learning. These results are [4] D. Zhu, X. Ren, L. Wei, X. Cao, Y. Ge, H. Liu,
preceded by a wavelet process and then a selection and J. Li “Collaborative analysis on difference
feature. of apple fruits flavour using electronic nose and
 electronic tongue”, Sci. Hortic. (Amsterdam).,
 In the next research, we will do a regression
 Vol. 260, No. August 2019, p. 108879, 2020.
process from the E-Nose and Brixmeter data that we
 [5] L. Torri, N. Sinelli, and S. Limbo, “Shelf life
have found. This is intended to be able to see the
 evaluation of fresh-cut pineapple by using an
effect between two or more variables. The variable
 electronic nose”, Postharvest Biol. Technol.,
relationship in question is functional which is
 Vol. 56, No. 3, pp. 239–245, 2010.
realized in the form of a mathematical model. Then
 [6] D. M. Martínez Gila, J. Gámez García, A.
later from the results of this regression we will get a
 Bellincontro, F. Mencarelli, and J. Gómez
machine learning pattern linking historical data and
 Ortega, “Fast tool based on electronic nose to
labels or outputs that are interrelated, not alone.
 predict olive fruit quality after harvest,”
 Postharvest Biol. Technol., Vol. 160, Vo.
Conflicts of Interest
 November 2019, p. 111058, 2020.
 The authors declare that there is no conflict of [7] S. N. Hidayat, A. Rusman, T. Julian, K.
interest. Triyana, A. C. A. Veloso, and A. M. Peres,
 “Electronic nose coupled with linear and
Author Contributions nonlinear supervised learning methods for rapid
 discriminating quality grades of superior java
 RS is supervision that has the idea to classify the
 cocoa beans”, International Journal of
sweetness of pineapple using E-nose. He gave an
 Intelligent Engineering and Systems, Vol. 12,
idea to take Pineapple sample data using E-Nose and
 No. 6, pp. 167–176, 2019.
provide a threshold on the sweetness level of
 [8] S. Sabilla, R. Sarno, and K. Triyana,
pineapple using Brix Meter. SIS gives a role to E-
 “Optimizing Threshold using Pearson
Nose development, she also calculating the
 Correlation for Selecting Features of Electronic
threshold and classification process E-Nose and
 Nose Signals”, International Journal of
Brixmeter data using Python. MAH gives a role to
 Intelligent Engineering and Systems, Vol. 12,
the completion of this article so that this article can
 No. 6, pp. 81–90, 2019.
be written accordingly with the problem of this
 [9] A. Hernández Gómez, J. Wang, G. Hu, and A.
research.
 García Pereira, “Discrimination of storage
 shelf-life for mandarin by electronic nose
Acknowledgments technique”, LWT - Food Sci. Technol., Vol. 40,
 This research was partially funded by the No. 4, pp. 681–689, 2007.
Indonesian Ministry of Research, Technology and [10] H. Zhang and J. Wang, “Evaluation of peach
Higher Education of the Republic of Indonesia quality attribute using an electronic nose”,
under the PMDSU Program managed by Institut Sensors Mater., Vol. 21, No. 8, pp. 419–431,
Teknologi Sepuluh Nopember (ITS), under the 2009.
WCU Program managed by Institut Teknologi [11] S. Benedetti, S. Buratti, A. Spinardi, S.
Bandung (ITB). Mannino, and I. Mignani, “Electronic nose as a
 non-destructive tool to characterise peach
References cultivars and to monitor their ripening stage
 during shelf-life”, Postharvest Biol. Technol.,
[1] A. Loutfi, S. Coradeschi, G. K. Mani, P.
 Vol. 47, No. 2, pp. 181–188, 2008.
 Shankar, and J. B. B. Rayappan, “Electronic
 [12] N. Narain, T. C. ‐. Hseih, and C. E. Johnson,
 noses for food quality: A review”, J. Food Eng.,
 “Dynamic Headspace Concentration and Gas
 Vol. 144, pp. 103–111, 2015.
 Chromatography of Volatile Flavor
[2] J. Brezmes and E. Llobet, Electronic Noses for
 Components in Peach”, J. Food Sci., Vol. 55,
 Monitoring the Quality of Fruit. Elsevier Inc.,
 No. 5, pp. 1303–1307, 1990.
 2016.
 [13] R. Article, A. Sani, A. A. Aliero, and S. E.
[3] M. Mallick, S. Minhaz Hossain, and J. Das,
 Yakubu, “Volatile metabolites profiling to
 “Graphene Oxide Based Fruit Ripeness Sensing
 discriminate diseases of tomato fruits
 inoculated with three toxigenic fungal
International Journal of Intelligent Engineering and Systems, Vol.13, No.5, 2020 DOI: 10.22266/ijies2020.1031.12

Received: March 1, 2020. Revised: May 22, 2020. 132

 pathogens”, Mol. Plant Pathol., Vol. 2, No. 3, Food Meas. Charact., Vol. 12, No. 4, pp.
 pp. 14–22, 2011. 2385–2393, 2018.
[14] A. D. Ibrahim, A. Sani, S. B. Manga, A. A. [25] C. Li, P. Heinemann, and R. Sherry, “Neural
 Aliero, R. U. Joseph, S. E. Yakubu, and H. network and Bayesian network fusion models
 Ibafidon, “Microorganisms Associated with to fuse electronic nose and surface acoustic
 Volatile Metabolites Production in Soft Rot wave sensor data for apple defect detection”,
 Disease of Sweet Pepper Fruits (Tattase)”, Int. Sensors Actuators, B Chem., Vol. 125, No. 1,
 J. Biotechnol. Biochem., Vol. 7, No. 6, pp. 25– pp. 301–310, 2007.
 35, 2011. [26] B. Wang, S. Xu, and D. W. Sun, “Application
[15] S. I. Sabilla, R. Sarno, and J. Siswantoro, of the electronic nose to the identification of
 “Estimating Gas Concentration using Artificial different milk flavorings”, Food Res. Int., Vol.
 Neural Network for Electronic Nose”, Procedia 43, No. 1, pp. 255–262, 2010.
 Comput. Sci., Vol. 124, pp. 181–188, 2017. [27] Y. B. Che Man, A. Rohman, and T. S. T.
[16] D. R. Wijaya, R. Sarno, and E. Zulaika, Mansor, “Differentiation of lard from other
 “Sensor array optimization for mobile edible fats and oils by means of Fourier
 electronic nose: Wavelet transform and filter transform infrared spectroscopy and
 based feature selection approach”, Int. Rev. chemometrics”, JAOCS, J. Am. Oil Chem. Soc.,
 Comput. Softw., Vol. 11, No. 8, pp. 659–671, Vol. 88, No. 2, pp. 187–192, 2011.
 2016. [28] Z. Haddia, A. Amaria,b, A. Ould. Alia, N. El
[17] R. Patil, “Noise Reduction using Wavelet Baric, H. Barhoumid, A. Maarefd, N. Jaffrezic-
 Transform and Singular Vector Renaulte, and B. Bouchikhia, “Discrimination
 Decomposition”, Procedia Comput. Sci., Vol. and identification of geographical origin virgin
 54, pp. 849–853, 2015. olive oil by an e-nose based on MOS sensors
[18] L. Sharma and N. Mehta, “Data Mining and pattern recognition techniques”, Procedia
 Techniques: A Tool For Knowledge Eng., Vol. 25, pp. 1137–1140, 2011.
 Management System In Agriculture”, Int. J. Sci. [29] M. Nurjuliana, Y. B. Che Man, and D. Mat
 Technol. Res., Vol. 1, No. 5, pp. 67–73, 2012. Hashim, “Analysis of lard’s aroma by an
[19] M. Sushama, G. Tulasi Ram Das, and A. Jaya electronic nose for rapid Halal authentication”,
 Laxmi, “Detection of high-impedance faults in JAOCS, J. Am. Oil Chem. Soc., Vol. 88, No. 1,
 transmission lines using Wavelet Transform”, J. pp. 75–82, 2011.
 Eng. Appl. Sci., Vol. 4, No. 3, pp. 6–12, 2009. [30] R. Upadhyay, S. Sehwag, and H. N. Mishra,
[20] A. Sharma and S. Dey, “Performance Electronic nose guided determination of frying
 Investigation of Feature Selection Methods and disposal time of sunflower oil using fuzzy logic
 Sentiment Lexicons for Sentiment Analysis”, analysis, Vol. 221. 2017.
 Int. J. Comput. Appl., No. June, pp. 15–20, [31] D. R. Wijaya, R. Sarno, E. Zulaika, and S. I.
 2012. Sabila, “Development of mobile electronic nose
[21] M. Shardlow, “An Analysis of Feature for beef quality monitoring”, Procedia Comput.
 Selection Techniques”, Univ. Manchester, No. Sci., Vol. 124, pp. 728–735, 2017.
 1, pp. 1–7, 2016.
[22] I. Tazi, K. Triyana, and D. Siswanta, “A novel
 Arduino Mega 2560 microcontroller-based
 electronic tongue for dairy product
 classification”, AIP Conf. Proc., Vol. 1755, No.
 July, 2016.
[23] I. Tazi, A. Choiriyah, D. Siswanta, and K.
 Triyana, “Detection of taste change of bovine
 and goat milk in room ambient using electronic
 tongue”, Indones. J. Chem., Vol. 17, No. 3, pp.
 422–430, 2017.
[24] I. Tazi, K. Triyana, D. Siswanta, A. C. A.
 Veloso, A. M. Peres, and L. G. Dias, “Dairy
 products discrimination according to the milk
 type using an electrochemical multisensor
 device coupled with chemometric tools”, J.

International Journal of Intelligent Engineering and Systems, Vol.13, No.5, 2020 DOI: 10.22266/ijies2020.1031.12

You can also read