Computer Vision-Based Wildfire Smoke Detection Using UAVs

Hindawi
Mathematical Problems in Engineering
Volume 2021, Article ID 9977939, 9 pages
https://doi.org/10.1155/2021/9977939

Research Article

Ehab Ur Rahman,1 Muhammad Asghar Khan,2 Fahad Algarni,3 Yihong Zhang,1 M. Irfan Uddin,4 Insaf Ullah,2 and Hafiz Ishfaq Ahmad5
1 College of Information Science & Technology, Donghua University, Shanghai, China
2 Hamdard Institute of Engineering & Technology, Islamabad 44000, Pakistan
3 College of Computing and Information Technology, University of Bisha, Bisha, Saudi Arabia
4 Institute of Computing, Kohat University of Science and Technology, Kohat, Pakistan
5 Faculty of Engineering, Universiti Teknologi Malaysia, Johor, Malaysia

Correspondence should be addressed to M. Irfan Uddin; irfanuddin@kust.edu.pk

Received 21 March 2021; Revised 14 April 2021; Accepted 20 April 2021; Published 29 April 2021

Academic Editor: Dr. Dilbag Singh

Copyright © 2021 Ehab Ur Rahman et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
This paper presents a new methodology based on texture and color for the detection and monitoring of different sources of forest fire smoke using unmanned aerial vehicles (UAVs). A novel dataset has been gathered, comprising thin smoke and dense smoke generated from dry leaves on the forest floor, which are a source of forest fire ignition. A classification task has been carried out by training a feature extractor to check the feasibility of the proposed dataset. A meta-architecture is then trained on top of the feature extractor to check the viability of the dataset for smoke detection and tracking. Results have been obtained by applying the proposed methodology to forest fire smoke images, smoke videos taken by a static camera, and real-time UAV footage. A microaverage F1-score of 0.865 has been achieved on different test videos, and an F1-score of 0.870 has been achieved on real UAV footage of wildfire smoke. The structural similarity index has been used to illustrate, with examples, some of the difficulties encountered in smoke detection.

1. Introduction

Wildfire is a colossal threat to human and wildlife ecosystems. Statistics show that wildfires in Northern California in the United States caused more than 40 deaths and about 50 missing individuals in 2015 [1–3]. There were major wildfire outbreaks in several countries around the world in 2019, which was seen as the most unfortunate year for such incidents. Moreover, 3500 square miles of the Amazon rainforest have been burnt down by wildfires. A forest fire recently caused 89 fatalities in Australia and burned 3500 homes. Such incidents make it highly important to detect wildfires accurately and early, before they turn into chaos. Traditional methods of wildfire detection, which are mainly based on human observation from watchtowers, are inefficient. The inefficiency is primarily due to the spatiotemporal limitations of such observation. Unmanned aerial vehicles (UAVs) have been extensively used as monitoring tools for the last couple of years [4–8]. High-definition, lightweight cameras can generate aerial photographs with specific location information when mounted on UAVs equipped with global positioning systems (GPSs) [9]. Besides, a well-organized swarm of UAVs can accomplish a complex task cost-effectively.
Image recognition has attained state-of-the-art performance using deep convolutional neural networks (DCNNs); their architecture and learning scheme lead to an effective extractor of sophisticated, high-level features that are highly robust to input transformations [10]. However, applications of deep learning and computer vision techniques to wildfire smoke detection are scarce. Besides, the limitations and difficulties of such techniques are not widely discussed. Video-based fire detection methods can be categorized into two classes, i.e., flame detection and smoke detection. Since the smoke generated by forest fires is observable before


the flames, video smoke detection receives more attention for early fire alarms in forest fire protection engineering. Traditional video smoke detection methods mainly emphasize the combination of static and dynamic features. The typical features of smoke include color, texture, motion orientation, and so on [11]. These characteristics can achieve good performance on specific image datasets [12]. However, due to the poor robustness of the algorithms, performance tends to be unfavorable on different image datasets, and those approaches can barely remove sophisticated interference in real engineering applications.
Currently, object detection has made a lot of progress due to the use of genetic algorithms (GA), particle swarm optimization (PSO), artificial neural networks (ANN), and DCNNs [13–18]. Modern object detectors founded on these networks, such as Faster R-CNN [19], SSD [20], and YOLOv3 [21], are now robust enough to be deployed in consumer products (e.g., Google Photos and Pinterest Visual Search), and some, such as those built on MobileNet, are fast enough to run on mobile devices. Most of these object detectors are deployed in different applications [22]. However, it can be challenging for practitioners to select which architecture is most appropriate for their application. Standard accuracy metrics, such as mean average precision (mAP), do not tell the whole story; for practical deployment of computer vision systems, running time and memory usage are also important. For example, mobile devices in many cases need a small memory footprint, and self-driving cars need real-time execution. SSD achieves a good trade-off between speed and precision. SSD runs a convolutional network on the input image a single time and computes a feature map; a small 3 × 3 convolutional kernel is then applied to this feature map to predict the bounding boxes and class probabilities. SSD also uses anchor boxes of various aspect ratios, similar to Faster R-CNN, and learns offsets relative to these anchors rather than learning the boxes directly. To handle scale, SSD predicts bounding boxes after multiple convolutional layers. Since each convolutional layer operates at a different scale, SSD can detect objects of varying scales [23].
There are several ways of checking whether two images are identical or near-identical, such as the structural similarity index measure (SSIM), mean square error (MSE), normalized color histograms, and locality-sensitive hashing. These methods have various benefits over one another. SSIM tries to model the change in the image's structural details and is therefore more robust, capable of revealing changes in the image structure rather than just the perceived change [24].

1.1. Contributions. In this work, we present a dataset that groups images from different sources, including thin and dense smoke images with different colors and textures, taken from different scenarios such as wildfires and other emergency conditions such as building fires and fires from an explosion. State-of-the-art SSD Inception-V2 models are trained, and their parameters such as dropout, batch normalization, and learning rate are tuned to choose the best model for real-time fire detection in videos. Results are compared on several wildfire smoke videos taken by a UAV and on smoke images with different kinds of backgrounds and lighting conditions. Also, some of the limitations and difficulties found during this research are discussed.

1.2. Organization of the Paper. The other sections of the paper are organized as follows. Section 2 discusses the different datasets, followed by the training of the object detector in Section 3. Section 4 presents and discusses the results, and Section 5 concludes the paper.

2. Material and Methods

2.1. Real Smoke Training Images. The dataset used to train the model consists of 14096 images of real smoke, comprising both thin and dense smoke. The smoke is generated in different scenarios. One set of smoke images is generated from burning dry leaves and small bushes, which are among the fuels that ignite forest fires. These images are shown in Figures 1(a) and 1(b) as dense and thin smoke, respectively. Another set consists of smoke images taken in different light conditions, such as yellow and white light, which affect smoke color and texture, as shown in Figures 1(c) and 1(d). These images have been taken from [11], which also includes images with smoke added to a forest background. Examples of these images are presented in Figures 1(e) and 1(f). A third set consists of smoke images taken from various open sources on the Internet that present real-time emergencies such as an apartment or a vehicle on fire. Such images are shown in Figures 1(g) and 1(h).

2.2. Test Images. The proposed methodology and object detector are evaluated on test smoke images to check the generalization ability of the trained object detection model. These images are a collection of smoke images taken from a UAV with a 12 MP camera [11]; some are taken with phone cameras, specifically the 12 MP rear camera of an iPhone 6s, while the rest are selected from open-source datasets available on the Internet. Figure 2 illustrates such images.

2.3. Test Videos. The performance of various object detection models is also tested with static camera videos as well as forest smoke videos taken in real time by a UAV. The focus of this work is on choosing an object detector for real-time object detection that has a better trade-off between speed and precision. Table 1 displays the technical specifications of the UAV, along with the UAV image (see Figure 3) [25], which can be used to capture the smoke images.

3. Training and Detection

DCNNs presently dominate computer vision tasks, in which region-based object detection methods are state-of-the-art. These methods have different advantages, such as removing the tedious work of manual feature extraction.


Figure 1: Training images taken from different sources, panels (a)–(h).

Figure 2: Test images from different sources.

The network learns patterns from the images without needing any preprocessing. Currently, several different feature extractor architectures are available, and selecting one for a specific application is not a trivial matter. Huang [26] presents a comparison of speed versus accuracy, and Figure 4 illustrates such speed versus accuracy trade-offs between the current state-of-the-art object detection models. Speed and accuracy are both of keen importance for real-time smoke detection. From Figure 4, it is clear that single shot detectors (SSDs) achieve a better trade-off between speed and accuracy. Also, SSD provides good precision in detecting objects of different sizes as compared with the Faster R-CNN architecture.
In this work, a pretrained feature extractor has been trained by transfer learning with a custom classifier with two fully connected layers and a final log softmax layer to classify the proposed dataset into two classes: one class of smoke, including both dense and thin smoke images, and another class of fire images. The classification task was aimed at checking the feasibility of the proposed dataset. The promising results are presented in the results section of this paper, along with some examples. SSD as a meta-


architecture and Inception-V2 as a feature extractor have been chosen as more suitable for real-time smoke detection because they offer a better speed versus accuracy trade-off, as shown in Figure 4, taken from the study by Huang [26].

Table 1: Technical parameters of the UAV.
Technical parameter          Value
Camera lens                  FOV 81.9°, 25 mm
Take-off weight              300 g
Video resolution             FHD: 1920 × 1080, 30p
Max hovering time            15 minutes
Max flight speed             31 mph
Endurance                    16 minutes
Positioning system           GPS/GLONASS
Obstacle sensing range       11–16 ft (0.2–5 m)
Operating temperature range  32° to 104°F
Transmitter power (EIRP)     2.4 GHz
Max transmission distance    100 m (distance), 50 m (height)
Dimensions                   143 × 143 × 55 mm
Operating frequency          2.412–2.462 GHz; 5.745–5.825 GHz
Camera sensor                1/2.3" CMOS; effective pixels: 12 MP
ISO range                    Video: 100–3200; photo: 100–1600
Electronic shutter speed     2–1/8000 s
Max video bitrate            24 Mbps
Photo format                 JPEG
Video format                 MP4
Controllable range           Pitch: −85° to 0°
Stabilization                2-axis mechanical (pitch, roll)
Velocity range               36 kph at 2 m above ground
Altitude range               0–8 m
Operating range              0–30 m

In the proposed work, images taken from different sources are collected together to increase the richness of the training data, primarily focusing on images of thin smoke, as it serves as an alarm before a fire starts, and there is an immense need to detect the smoke at this starting stage to prevent the ignition and spreading of the wildfire. Before testing the proposed model, we trained it on more than 14000 training image samples comprising both dense and thin smoke, with smoke in different light, color, and texture as well as against different backgrounds. For validation of the model, more than 3100 images were used. This approach was aimed at improving the generalization ability of the smoke detection model. We then tested the model using wildfire smoke images taken from a drone, from a mobile phone camera, and from open Internet sources, which present real-time scenarios of both forest fires and day-to-day emergencies, so that the proposed approach can be used in any kind of situation in future projects.
Test videos taken from static cameras of both thin and dense smoke with different backgrounds, under different lighting conditions, and from different distances are used for inference. Also, real footage of wildfire and smoke taken by a drone is tested. Results are presented in Section 4 for analysis, along with some discussion of the limitations and difficulties found in this study of wildfire smoke detection. The overall workflow of the experiments is presented in Figure 5.
First, the feature extractor Inception-V2, pretrained on the COCO dataset, is trained by transfer learning on the proposed dataset with the object detector SSD to make predictions of smoke. The trained model is then tested with different test videos and images to check the feasibility of the model.
Classification results of the wildfire smoke classifier model, trained on the dataset comprising both smoke and fire images, are presented in Table 2. The metrics used for evaluation are presented in the studies by Bashir and Porikli [27] and Forman and Scholz [28]. They are true positives (TP), false positives (FP), true negatives (TN), false negatives (FN), and false alarm rate (FAR), along with detection rate (recall), precision, and F-score. Such metrics are currently used in the computer vision community for evaluating object detection models. Another metric, used for comparing the structural similarity of two images, is SSIM [24]. The formulas of the metrics are as follows:

$$\text{recall (Re)} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \tag{1}$$

$$\text{precision (Pr)} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \tag{2}$$

$$F\text{-score} = \frac{2 \cdot (\mathrm{Pr} \cdot \mathrm{Re})}{\mathrm{Pr} + \mathrm{Re}}, \tag{3}$$

$$\text{false alarm rate (FAR)} = \frac{\mathrm{FP}}{\mathrm{TP} + \mathrm{FP}}, \tag{4}$$

$$\text{accuracy} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}, \tag{5}$$

$$\text{tracker detection rate (TDR)} = \frac{\mathrm{TP}}{\mathrm{TG}}, \tag{6}$$

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}. \tag{7}$$

Equation (7) compares two windows, i.e., small subsamples rather than the whole image, which makes the measure sensitive to changes in the structure of the image. The parameters of equation (7) are the (x, y) location of the N × N window in each image, the means µx and µy of the pixel intensities in the x and y windows, the variances σx² and σy² of those intensities, and the covariance σxy; c1 and c2 are small constants that stabilize the division [24].

4. Research Findings and Discussion

As introduced in Section 3, we trained a classifier on the smoke image dataset along with another set of fire images. Some of the classification results are shown in Figure 6 along with the top-k class probabilities. Figure 6(a) presents a test image taken from the thin smoke dataset, Figure 6(b) presents a test image taken from the dense smoke images, and Figure 6(c) presents a test image from the fire images.
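The top-k class probabilities reported with the classification results can be recovered from the log softmax output of the classifier. The following is a minimal sketch in plain Python, not the authors' implementation: the function names `log_softmax` and `top_k` and the example scores are hypothetical and chosen only for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw class scores."""
    largest = max(logits)
    log_sum = largest + math.log(sum(math.exp(v - largest) for v in logits))
    return [v - log_sum for v in logits]


def top_k(logits, class_names, k=2):
    """Return the k most probable (class name, probability) pairs."""
    # Exponentiating the log-softmax output gives ordinary probabilities.
    probs = [math.exp(lp) for lp in log_softmax(logits)]
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical raw scores of the final layer for one test image.
example_logits = [2.3, -0.4]
ranking = top_k(example_logits, ["smoke", "fire"])
print(ranking)
```

With two classes, the two returned probabilities sum to one, which is what a bar chart of class probabilities per test image displays.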


Figure 3: Main on-board components of the UAV (quadcopter) [25]. Labeled parts: battery compartment, nut, nose direction, aircraft nose mark, propeller, motor, direction LEDs (red and green), LED indicator, adhesive tape, receiver antenna, landing gear, camera mounting frame, and compass.

Figure 4: Speed vs. accuracy trade-off. Overall mAP (vertical axis, 10–40) is plotted against GPU time (horizontal axis, 0–1000) for the feature extractors Inception ResNet V2, Inception V2, Inception V3, MobileNet, ResNet 101, and VGG.

Figure 5: Diagram of the proposed methodology for smoke detection. Diagram labels: input images (images and video frames from the UAV camera), CNN Inception-V2, transfer learning, MS-COCO dataset, single-shot detector, training, validation, testing, wildfire smoke, results, alarming system, and human observation.


Table 2: Two-class classification results of the classifier.
Test set  Samples  TP  FP  TN  FN  FAR  Recall  Precision  F-score
Smoke     20       18  2   0   0   0.1  1.00    0.90       0.947
Fire      20       18  0   2   0   0    1.00    1.00       1.00
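As a quick check of how the table entries follow from the raw counts, the sketch below computes recall, precision, F-score, and FAR from TP, FP, TN, and FN. The function name `detection_metrics` is hypothetical, and accuracy is computed under the common definition (TP + TN) divided by all samples, which is an assumption of this sketch.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute recall, precision, F-score, FAR, and accuracy from raw counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if (precision + recall) else 0.0)
    # False alarm rate as used in this paper: FP over all positive detections.
    far = fp / (tp + fp) if (tp + fp) else 0.0
    total = tp + fp + tn + fn
    # Assumed definition: fraction of all samples that were handled correctly.
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "recall": recall,
        "precision": precision,
        "f_score": f_score,
        "far": far,
        "accuracy": accuracy,
    }


# Counts of the "Smoke" row of Table 2.
smoke = detection_metrics(tp=18, fp=2, tn=0, fn=0)
print({name: round(value, 3) for name, value in smoke.items()})
```

For the Smoke row this gives a recall of 1.00, a precision of 0.90, an F-score of about 0.947, and a FAR of 0.1, in line with the tabulated values.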

Figure 6: Classification results of the feature extractor with high class probabilities. Panels (a) and (b) are test images classified as smoke and panel (c) is a test image classified as fire; each panel shows the image (pixel axes 0–200) above a bar chart of the smoke and fire class probabilities (0.0–1.0).

Table 2 presents the results of classifying the dataset into two classes, i.e., fire and smoke. The results show that the feature extractor generalizes well and learns different smoke patterns, i.e., thin, dense, white, and so on. This small experiment was aimed at demonstrating the viability of using the dataset to train the object detector for real-time smoke detection.
The trained wildfire smoke detector model, i.e., SSD Inception-V2, has been tested with images taken from a UAV, shown in Figure 7(a) along with the detection scores and bounding boxes. Figure 7(b) presents frames captured from videos of thin smoke and dense smoke generated by burning dry leaves and shrubs. These videos were recorded with a mobile phone camera. Frames captured at different instants from real-time UAV footage of wildfire and smoke are presented in Figure 7(c) with the respective bounding boxes. Figure 7(e) presents a frame comprising both dense and thin smoke along with the detections.
The test results are presented in Table 3. Video 2, a dense smoke video, has the highest F-score of all because of its rich texture, shape, and color. The lowest F-score is observed on the thin smoke video samples because of the light color and weak features captured by the camera; still, the model achieves an accuracy of 60% and an F-score of 0.747 on Video 5 (Table 3), showing the feasibility of the dataset on thin smoke as well. Another good result comes from the drone footage sample, the scenario that is the main focus of this study, which achieved 83.2% accuracy and an F-score of 0.870, supporting the suitability of this approach. The results in Table 3 show that this approach is feasible to implement in real-time applications.
Figure 8 presents the mean average recall (mAR) and mean average precision (mAP) of each test dataset evaluated with the wildfire smoke detection model. These metrics are widely used for evaluating the overall generalization ability of object detection models. The figure indicates that our smoke detection model generalizes well to different datasets comprising distinctive image sets with different properties, such as thin smoke, dense smoke, and images of smoke taken from different angles and distances.
Some limitations were observed while acquiring the results. Figure 9(a) shows smoke due to wildfire in a frame captured from real footage. The texture and color of this image are nearly identical to the texture and color of the cloud shown in Figure 9(b). Such a resemblance makes it difficult for the object detector to differentiate between them. The structural similarity between the two images has been calculated


Figure 7: Sample results taken from different test videos along with detection boxes, panels (a)–(e).

Table 3: Results of real smoke video samples.
Test set                      Smoke samples  Nonsmoke samples  TP    FP   TN    FN  FAR    Accuracy  Detection rate/recall  Precision  F1-score
Images                        74             30                50    21   32    1   0.296  0.766     0.980                  0.704      0.818
Video 1 (dense smoke near)    816            0                 599   216  0     1   0.265  0.734     0.998                  0.735      0.847
Video 2 (dense smoke far)     307            0                 301   6    0     0   0.019  0.980     1.000                  0.980      0.989
Video 3 (thin + dense smoke)  313            0                 245   68   0     0   0.217  0.782     1.000                  0.783      0.878
Video 4 (drone footage)       3983           1479              3099  884  1447  32  0.221  0.832     0.989                  0.778      0.870
Video 5 (thin smoke)          308            0                 184   124  0     0   0.402  0.600     1.000                  0.597      0.747
Video 6 (thin smoke)          289            0                 185   104  0     0   0.359  0.640     1.000                  0.640      0.780
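The microaverage F1-score quoted in the abstract can be related to the counts in Table 3 with a short calculation. The sketch below assumes that microaveraging pools TP, FP, and FN over the six test videos before computing precision and recall; the paper does not spell out the pooling, so this is an assumption, and the names `VIDEO_COUNTS` and `micro_f1` are hypothetical.

```python
# (TP, FP, FN) for each test video, copied from Table 3.
VIDEO_COUNTS = {
    "Video 1 (dense smoke near)": (599, 216, 1),
    "Video 2 (dense smoke far)": (301, 6, 0),
    "Video 3 (thin + dense smoke)": (245, 68, 0),
    "Video 4 (drone footage)": (3099, 884, 32),
    "Video 5 (thin smoke)": (184, 124, 0),
    "Video 6 (thin smoke)": (185, 104, 0),
}


def micro_f1(counts):
    """Micro-averaged F1: pool the counts first, then compute a single F1."""
    counts = list(counts)
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


score = micro_f1(VIDEO_COUNTS.values())
print(round(score, 3))
```

Under this pooling assumption the score comes to about 0.865, which agrees with the microaverage reported in the abstract.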

Figure 8: Mean average recall (mAR) and mean average precision (mAP) for different test datasets. Each panel is a bar chart (vertical axis 0–1.2) over the test sets: images, the two dense smoke videos, the thin + dense smoke video, the drone footage, and thin smoke videos 1 and 2.

using the structural similarity index measure (SSIM). The SSIM value calculated for these images shows that they have a 63% structural similarity. Both of them have been detected as smoke by the trained object detector. The confidence score and bounding box are shown in Figure 9(c). In addition, Figure 9(d) shows a video of fire taken at night, which was also tested with the trained object detector. Though the smoke detector does detect smoke in some of the frames, the overall accuracy is unsatisfactory. This is meant to give readers an intuition for such difficulties, which may be addressed in future research by designing new approaches.
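To make the SSIM comparison concrete, the following sketch evaluates the SSIM formula of equation (7) on a single pair of windows given as flat lists of grayscale intensities. It is a simplified illustration rather than the authors' pipeline: the function name `ssim_window` is hypothetical, population statistics are used instead of the weighted local statistics of [24], and the constants follow the common choice c1 = (0.01 L)² and c2 = (0.03 L)² for a dynamic range L.

```python
def ssim_window(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM between two equally sized windows of pixel intensities."""
    if len(x) != len(y) or not x:
        raise ValueError("windows must be non-empty and of equal size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Population variance and covariance (a simplification of the original
    # formulation, which uses weighted local statistics).
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 2 x 2 windows flattened to lists.
patch = [10, 200, 10, 200]
inverted = [200, 10, 200, 10]
print(ssim_window(patch, patch), ssim_window(patch, inverted))
```

Identical windows score 1, while structurally opposite windows score far lower; a full-image SSIM is usually obtained by averaging this quantity over many sliding windows.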


Figure 9: Smoke vs. cloud similarity comparison with SSIM value, panels (a)–(d).

5. Conclusion

In this paper, SSD Inception-V2 was found to be a viable detector of wildfire smoke in videos taken by UAVs, in terms of both accuracy and speed. Different smoke image datasets, one generated by a synthetic process and another consisting of real smoke images, are used to train the model. The work presents a solution for detecting thin and dense smoke in videos taken by UAVs, whereas previous methods were limited to images or static camera videos. The test results are promising for extending the solution to real-time drone surveillance. F1-scores of 0.780 and 0.747 have been achieved on the test videos of thin smoke (Table 3), surpassing the previous literature. Limitations and difficulties found in the study are discussed along with an example that uses the structural similarity index as a quantifying parameter. In the future, the proposed solution can be extended to detect smoke in real-time UAV footage under different light and weather conditions, along with designing a fire alarm. The performance of the model on thin smoke can be further improved by enriching the thin smoke image dataset, mainly with images taken by UAVs in different weather and light conditions.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] A. D. Syphard, V. C. Radeloff, N. S. Keuler et al., “Predicting spatial patterns of fire on a southern California landscape,” International Journal of Wildland Fire, vol. 17, no. 5, pp. 602–613, 2008.
[2] N. Nauslar, J. Abatzoglou, and P. Marsh, “The 2017 north Bay and southern California fires: a case study,” Fire, vol. 1, no. 1, p. 18, 2018.
[3] J. T. Abatzoglou and A. P. Williams, “Impact of anthropogenic climate change on wildfire across western US forests,” Proceedings of the National Academy of Sciences, vol. 113, no. 42, pp. 11770–11775, 2016.
[4] I. Colomina and P. Molina, “Unmanned aerial systems for photogrammetry and remote sensing: a review,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 92, pp. 79–97, 2014.
[5] S. S. V. Vijayakumar, C. S. Kumar, V. Priya, L. Ravi, and V. Subramaniyaswamy, “Unmanned aerial vehicle (UAV) based forest fire detection and monitoring for reducing false alarms in forest-fires,” Computer Communications, vol. 149, pp. 1–16, 2020.
[6] F. Noor, M. A. Khan, A. Al-Zahrani, I. Ullah, and K. A. Al-Dhlan, “A review on communications perspective of flying ad-hoc networks: key enabling wireless technologies, applications, challenges and open research topics,” Drones, vol. 4, no. 65, pp. 1–14, 2020.
[7] B. Alzahrani, O. S. Oubbati, A. Bernawi, A. Atiquzzaman, and D. Alghazzawi, “UAV assistance paradigm: state-of-the-art in applications and challenges,” Journal of Network and Computer Applications, vol. 166, no. 102706, pp. 1–44, 2020.
[8] M. A. Khan, I. M. Qureshi, and F. A. Khanzada, “Hybrid communication scheme for efficient and low-cost deployment of future flying ad-hoc network (FANET),” Drones, vol. 3, no. 16, pp. 1–20, 2019.


[9] X. Li, Y. Zhao, J. Zhang, and Y. Dong, “A hybrid PSO algorithm based flight path optimization for multiple agricultural UAVs,” in Proceedings of 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI), San Jose, CA, USA, November 2016.
[10] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, June 2014.
[11] Q.-x. Zhang, G.-h. Lin, Y.-m. Zhang, G. Xu, and J.-j. Wang, “Wildland forest fire smoke detection based on faster R-CNN using synthetic smoke images,” Procedia Engineering, vol. 211, pp. 441–446, 2018.
[12] E. U. Rahman, Y. Zhang, S. Ahmad, H. I. Ahmad, and S. Jobaer, “Autonomous vision-based primary distribution systems porcelain insulators inspection using UAVs,” Sensors, vol. 21, no. 3, p. 974, 2021.
[13] Z. Ali and T. Mahmood, “Complex neutrosophic generalised dice similarity measures and their application to decision making,” CAAI Transactions on Intelligence Technology, vol. 5, no. 2, pp. 78–87, 2020.
[14] T. Sangeetha and G. M. Amalanathan, “Outlier detection in neutrosophic sets by using rough entropy based weighted density method,” CAAI Transactions on Intelligence Technology, vol. 5, no. 2, 2020.
[15] C. Zhu, W. Yan, X. Cai, S. Liu, T. H. Li, and G. Li, “Neural saliency algorithm guide bi-directional visual perception style transfer,” CAAI Transactions on Intelligence Technology, vol. 5, no. 1, pp. 1–8, 2020.
[16] C.-F. J. Kuo, J.-M. Liu, M. L. Umar, W.-L. Lan, C.-Y. Huang, and S.-S. Syu, “The photovoltaic-thermal system parameter optimization design and practical verification,” Energy Conversion and Management, vol. 180, pp. 358–371, 2019.
[17] M. Safa, M. Ahmadi, J. Mehrmashadi et al., “Selection of the most influential parameters on vectorial crystal growth of highly oriented vertically aligned carbon nanotubes by adaptive neuro-fuzzy technique,” International Journal of Hydromechatronics, vol. 3, no. 3, p. 238, 2020.
[18] B. R. Murlidhar, R. K. Sinha, E. T. Mohamad, R. Sonkar, and M. Khorami, “The effects of particle swarm optimisation and genetic algorithm on ANN results in predicting pile bearing capacity,” International Journal of Hydromechatronics, vol. 3, no. 1, p. 69, 2020.
[19] S. Ren, K. He, R. Girshick, and J. Sun, “Faster r-cnn: towards real-time object detection with region proposal networks,” in Proceedings of Advances in Neural Information Processing Systems, Montreal, Canada, December 2015.
[20] W. Liu, D. Anguelov, D. Erhan et al., “Ssd: single shot multibox detector,” in Proceedings of European Conference on Computer Vision, Amsterdam, Netherlands, October 2016.
[21] H. Shah, YOLO Vs. SSD: Choice of a Precise Object Detection Method, https://technostacks.com/blog/yolo-vs-ssd, 2020.
[22] E. U. Rahman, Y. Zhang, S. Ahmad, H. I. Ahmad, and S. Jobaer, “Autonomous vision-based primary distribution systems porcelain insulators inspection using UAVs,” Engineering, 2020, preprint.
[23] H. Shah, What is the Main Difference Between Yolo and Ssd?, Technostacks Infotech Pvt. Ltd., Ahmedabad, India, 2018, https://technostacks.com/blog/yolo-vs-ssd/.
[24] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[25] “General thoughts on putting together your own quadcopter”, Available online: https://unleashthebot.com/best-drone-kits/, 2021.
[26] J. Huang, “Speed/accuracy trade-offs for modern convolutional object detectors,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, June 2017.
[27] F. Bashir and F. Porikli, “Performance evaluation of object detection and tracking systems,” in Proceedings 9th IEEE International Workshop on PETS, New York, NY, USA, June 2006.
[28] G. Forman and M. Scholz, “Apples-to-apples in cross-validation studies,” ACM SIGKDD Explorations Newsletter, vol. 12, no. 1, pp. 49–57, 2010.