Optimization of Fatigue Detection Method under Altitude Changes in Plateau Region Based on MTCNN - IOPscience
IOP Conference Series: Earth and Environmental Science
PAPER • OPEN ACCESS
Optimization of Fatigue Detection Method under Altitude Changes in Plateau Region Based on MTCNN
To cite this article: Sipeng Han et al 2021 IOP Conf. Ser.: Earth Environ. Sci. 692 042016
EMCEME 2020, IOP Publishing. IOP Conf. Series: Earth and Environmental Science 692 (2021) 042016, doi:10.1088/1755-1315/692/4/042016

Optimization of Fatigue Detection Method under Altitude Changes in Plateau Region Based on MTCNN

Sipeng Han1, Jingyang Tan1, Qianzhi Jiao1, Bo Tang1,*, Yi Luo2 and Xuguang Yang1
1 Electric Engineering College, Tibet Agriculture and Animal Husbandry University, Nyingchi, China
2 Electric Engineering College, Agriculture and Animal Husbandry College, Chengdu, China
*Corresponding author: tangbo@xza.edu.cn

Abstract. Fatigue driving is a major cause of traffic accidents, and research on fatigue driving detection algorithms is of great significance for improving road safety. This paper proposes an image processing method based on an optimized MTCNN detection model. Median filter denoising is performed before the P-Net stage to improve the detection rate of faces at night; the ASM algorithm is then used to detect the facial feature points; finally, the PERCLOS principle is used to analyze the driving fatigue state. The experimental results show that the method has a high detection rate, can be applied to fatigue detection at different altitudes, and has strong practicability.

Keywords: MTCNN, Plateau, Driving fatigue, Fatigue detection, Face recognition, Convolutional neural network

1. Introduction
With the rapid development of China's economy, both the quality of people's lives and the level of consumption have improved greatly. The number of cars in China is constantly increasing, and safe driving has become a problem for China and for the world. Fatigue driving is now a serious hidden danger behind traffic accidents. It is therefore necessary to detect the driver's fatigue driving behavior and to issue reminders and interventions according to the level of fatigue.
In plateau areas, as the altitude changes continuously, some drivers suffer from altitude sickness, and some become fatigued but mistakenly attribute the fatigue to altitude sickness. Fatigue detection is therefore of great significance for the safety of drivers' lives and property.

Driving fatigue is usually a combination of physical fatigue and mental fatigue. It mainly refers to inaccurate vision and slow response caused by excessive driving time or insufficient rest, or to the mental disturbance and reduced operating ability that appear after driving for a long time. When the driver enters a state of fatigue, judgment usually declines and responses slow down. If the driver continues to drive, incorrect operations or incorrect adjustments are likely to occur, leading to traffic accidents. In the case of mild fatigue, the driver operates with delays or operates poorly. As fatigue increases, the driver may even make operational mistakes.

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. Published under licence by IOP Publishing Ltd.
After entering a severe fatigue state, the driver operates unconsciously or falls asleep for short periods. In severe cases, control of the vehicle is lost while the vehicle continues to move forward [1].

Traditional fatigue detection methods mainly include methods based on the driver's physiological characteristics, methods based on vehicle behavior characteristics, and methods based on the driver's facial features.

The detection methods based on the physiological characteristics of the driver mainly use the electroencephalogram (EEG), electrocardiogram (ECG), electromyogram (EMG), electro-oculogram (EOG), pulse signal, etc. While the driver is driving, a dynamic EEG scanner, a dynamic electrocardiograph and an electromyograph record the corresponding signals, which are compared with physiological fatigue indicators to determine whether the driver is fatigued. Fan Wang et al. [2] proposed collecting the driver's ECG and eye movement data to assess the degree of fatigue, using the eye-movement results to enhance a driving fatigue estimation method based on heart rate variability (HRV). Chen Jichi [3] collects the driver's EEG signals in real time, performs wavelet packet decomposition and reconstruction, and extracts the various rhythm signals. The phase lag index between the leads is then calculated to construct the connection matrix, from which the brain network characteristics of each rhythm are extracted. However, the sensors that detect physiological signals are in contact with the body. This is the main disadvantage of the physiological feature detection method: it seriously affects the comfort of the driver and may interfere with normal driving operation.
The detection method based on vehicle behavior characteristics collects and analyzes information about the vehicle itself while it is being driven in order to determine whether the driver is fatigued. This information mainly includes the vehicle speed, lane deviation and the steering wheel rotation range. The American company Electronic Safety Product developed the steering wheel monitoring device S.A.M [4], which detects the driver's fatigue state mainly by monitoring the rotation of the steering wheel. The European Union developed the "AWAKE" system [5], which monitors the driver's sight direction, lane tracking, etc. in real time and then uses multi-channel fusion technology to monitor driver fatigue and issue warning signals. This type of method is non-invasive, but the driver's driving habits affect the accuracy of detection.

The method based on the driver's facial features detects fatigue mainly from the frequency of head drooping, body tilt, increased blinking, yawning and similar phenomena. Jianju Xing [6] applied a convolutional neural network to face recognition and improved the pupil location algorithm, effectively overcoming the large amount of calculation required by the traditional algorithm. The fatigue recognition rate of this algorithm is 87.5%. Zhong Wang [7] uses the MTCNN model to detect face images and then processes the face images in three steps (grayscale processing, binarization and human eye detection), but this method is not applicable in high-altitude areas combined with dark nights. Weihuang Liu [8] proposed a driver fatigue detection algorithm based on a dual-stream network model of multiple facial features, which achieved an accuracy of 97.06% on the National Tsing Hua University Driver Drowsiness Detection (NTHU-DDD) dataset.
The traditional fatigue detection methods are not applicable under changing altitude and in the dark of night; moreover, they do not show strong robustness to external conditions such as partial occlusion and changes in lighting. This article uses a fatigue driving detection algorithm based on an optimized MTCNN, selects representative parameters of fatigue driving such as the degree of eye closure, the degree of mouth opening and the head posture, and finally analyzes the driver's fatigue state using the PERCLOS principle.

2. METHOD
First, the driver's facial image is acquired through the camera, and the improved MTCNN model is used to detect the face in the image. A median filter is applied for denoising before P-Net detection to suppress the excessive noise of night scenes and to improve the detection accuracy. Next, the key points of the face are detected in the collected images, the eyes and mouth are located, and their aspect ratios are computed. Finally, the PERCLOS principle is used to analyze and judge the fatigue state of the driver.
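The denoising step and the multi-scale input can be illustrated with a minimal sketch. It assumes a single-channel (grayscale) image stored as a NumPy array. The 3×3 window, the minimum face size of 20 pixels and the scale factor 0.709 are illustrative assumptions commonly seen in MTCNN implementations; they are not values stated in this paper. The function names `median_filter` and `pyramid_scales` are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter(gray, ksize=3):
    """Replace each pixel by the median of its ksize x ksize neighbourhood."""
    if ksize < 1 or ksize % 2 == 0:
        raise ValueError("ksize must be a positive odd integer")
    pad = ksize // 2
    # Repeat edge pixels so the output keeps the input shape.
    padded = np.pad(gray, pad, mode="edge")
    windows = sliding_window_view(padded, (ksize, ksize))
    return np.median(windows, axis=(-2, -1)).astype(gray.dtype)


def pyramid_scales(height, width, min_face_size=20, factor=0.709, net_input=12):
    """Scales at which the image is resized before the 12x12 P-Net input.

    min_face_size and factor are assumed values, not taken from the paper.
    """
    scale = net_input / min_face_size
    min_side = min(height, width) * scale
    scales = []
    # Stop once the shorter side would be smaller than the P-Net input.
    while min_side >= net_input:
        scales.append(scale)
        scale *= factor
        min_side *= factor
    return scales


# Usage: remove an isolated bright pixel (salt noise) from a dark frame.
frame = np.zeros((5, 5), dtype=np.uint8)
frame[2, 2] = 255
denoised = median_filter(frame)
scales = pyramid_scales(120, 160)
```

A median filter suits night-time frames because isolated sensor noise is removed without averaging it into neighbouring pixels, so edges around the eyes and mouth are preserved better than with a mean filter.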
Figure 1. The proposed method framework. Blocks: camera face image acquisition; improved MTCNN face detection; face key point detection; characteristics of eye fatigue and of mouth fatigue; PERCLOS determination; fatigue early warning.

2.1. Face Detection
Face detection uses Multi-task Cascaded Convolutional Networks (MTCNN) [9], a deep learning algorithm that jointly detects face bounding boxes and face key points. It is one of the best-performing algorithms for face detection and face key point positioning. To avoid excessive noise in night-time detection, median filtering is performed before face detection starts. The image is then rescaled to different sizes to build an image pyramid, which is input to three cascaded networks: P-Net, R-Net and O-Net [10], as shown in Figure 2.

Figure 2. Structure diagram of the three cascaded networks.
(a) P-Net: input size 12×12×3; feature maps 5×5×10, 3×3×16 and 1×1×32; outputs: face classification (1×1×2), bounding box regression (1×1×4), facial landmark localization (1×1×10).
(b) R-Net: input size 24×24×3; feature maps 11×11×28, 4×4×48 and 3×3×64; fully connected layer of size 128; outputs: face classification (2), bounding box regression (4), facial landmark localization (10).
(c) O-Net: input size 48×48×3; feature maps 23×23×32, 10×10×64, 4×4×64 and 3×3×128; fully connected layer of size 256; outputs: face classification (2), bounding box regression (4), facial landmark localization (10).

For face classification, the cross-entropy loss function is used directly, while bounding box regression and key point localization use a Euclidean loss. Finally, the losses of these three parts are multiplied by their own weights to form the final total loss function.
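The weighted combination can be sketched in a few lines of NumPy. This is an illustrative sketch, not the training code used in this paper: it uses the standard binary cross-entropy for classification and a squared Euclidean distance for the two regression tasks. The task weights follow those reported for MTCNN in [9] (1, 0.5, 0.5 for P-Net and R-Net; 1, 0.5, 1 for O-Net). The names `TASK_WEIGHTS`, `cross_entropy`, `squared_l2` and `total_loss` are hypothetical.

```python
import numpy as np

# Task weights per network, as reported for MTCNN in [9].
TASK_WEIGHTS = {
    "P-Net": {"det": 1.0, "box": 0.5, "landmark": 0.5},
    "R-Net": {"det": 1.0, "box": 0.5, "landmark": 0.5},
    "O-Net": {"det": 1.0, "box": 0.5, "landmark": 1.0},
}


def cross_entropy(p, y, eps=1e-12):
    """Standard binary cross-entropy per sample (p: face probability, y: 0/1)."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


def squared_l2(pred, target):
    """Squared Euclidean distance per sample, summed over the last axis."""
    return np.sum((pred - target) ** 2, axis=-1)


def total_loss(task_losses, indicators, weights):
    """Sum over tasks of weight * sum over samples of indicator * loss.

    indicators[task][i] is 1 if sample i contributes to that task, else 0.
    """
    total = 0.0
    for task, alpha in weights.items():
        total += alpha * float(np.sum(indicators[task] * task_losses[task]))
    return total


# Usage with two toy samples and pre-computed per-sample losses.
losses = {
    "det": np.array([1.0, 2.0]),
    "box": np.array([4.0, 0.0]),
    "landmark": np.array([2.0, 2.0]),
}
used = {
    "det": np.array([1, 1]),
    "box": np.array([1, 0]),
    "landmark": np.array([0, 1]),
}
loss_pnet = total_loss(losses, used, TASK_WEIGHTS["P-Net"])
loss_onet = total_loss(losses, used, TASK_WEIGHTS["O-Net"])
```

The indicator arrays let a sample without landmark annotations still contribute to the classification loss, which is how a single network can be trained on mixed data.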
Face classification loss function:

$$L_i^{det} = -\left( y_i^{det} \log(p_i) + (1 - y_i^{det}) \log(1 - p_i) \right) \tag{1}$$

where $p_i$ is the probability predicted by the network that sample $x_i$ is a face, and $y_i^{det} \in \{0, 1\}$ is the real category label. If $x_i$ is a face, then $y_i^{det} = 1$ and $L_i^{det} = -\log(p_i)$. Conversely, if $x_i$ is a non-face, then $y_i^{det} = 0$ and $L_i^{det} = -\log(1 - p_i)$.

The loss function of the regression box is the squared Euclidean distance between the predicted box $\hat{y}_i^{box}$ and the real box $y_i^{box}$, each described by four coordinates:

$$L_i^{box} = \left\| \hat{y}_i^{box} - y_i^{box} \right\|_2^2$$

The face key point detection is similar to the boundary regression task. The key point coordinates of the face can be obtained by minimizing the Euclidean distance, see formula (2):

$$L_i^{landmark} = \left\| \hat{y}_i^{landmark} - y_i^{landmark} \right\|_2^2 \tag{2}$$

where $\hat{y}_i^{landmark} \in \mathbb{R}^{10}$ and $y_i^{landmark} \in \mathbb{R}^{10}$ are the predicted and real key points of the face, which mainly include the eyes, the nose and the two corners of the mouth.

Total loss function:

$$\min \sum_{i=1}^{n} \sum_{j \in \{det,\, box,\, landmark\}} \alpha_j \, \beta_i^j \, L_i^j \tag{3}$$

where $n$ is the number of training samples, $\alpha_j$ is the weight of each task, $\beta_i^j \in \{0, 1\}$ is the sample type indicator, which marks whether sample $i$ is used for task $j$, and $L_i^j$ is the cross-entropy loss function or the Euclidean loss function. In P-Net and R-Net, the weights are $\alpha_{det} = 1$, $\alpha_{box} = 0.5$ and $\alpha_{landmark} = 0.5$. To achieve high-precision face key point positioning, O-Net uses $\alpha_{det} = 1$, $\alpha_{box} = 0.5$ and $\alpha_{landmark} = 1$.

2.2. Human eye and mouth feature point detection

Figure 3. Face detection and key point marking. (a) Face detection: image, grayscale image, brightness balance, binarization, hair removal, labelled image and gray histogram. (b) Marking of key points.

The Active Shape Model (ASM) has good robustness [11]. It uses several feature points to describe the shape of the target. For targets of the same class with similar shapes, an average shape is obtained through statistical analysis, and the specific position of each point is then adjusted, yielding a vector that describes the specific shape of the object. This process alternates between the face shape model and the local gray scale model, and the output shape gradually converges. The facial feature points obtained by the MTCNN algorithm accurately give the edge positions of the eyes and mouth. The result is shown in Figure 3.

When the eyes open and close, the relative positions of the points P1 to P6 differ greatly, especially the vertical coordinates of the pairs P2/P6 and P3/P5. The current state of the eye can therefore be extracted clearly from the change in the relative distances between the 6 feature points. The camera extracts the facial contour shape vector of the driver in each frame, and the eye aspect ratio (EAR) is calculated from specific points of the shape vector for blink detection:

$$EAR = \frac{\left\| P_2 - P_6 \right\| + \left\| P_3 - P_5 \right\|}{2 \left\| P_1 - P_4 \right\|} \tag{4}$$

where P1 to P6 are the two-dimensional coordinate vectors of the eye feature points, the numerator measures the vertical distance between the upper eyelid and the lower eyelid, and the denominator measures the horizontal distance between the corners of the eye. To eliminate the interference caused by different sitting postures of the driver, this paper calculates the EAR of the left eye and of the right eye and uses their average as the final EAR value. The calculation of the aspect ratio of the mouth is similar to that of the eyes.
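The eye aspect ratio of formula (4) and the averaging over both eyes can be sketched as follows. This is a minimal illustration that assumes each eye is given as six (x, y) points in the order P1 to P6; the function names `eye_aspect_ratio` and `mean_ear` and the toy coordinates are hypothetical.

```python
import numpy as np


def eye_aspect_ratio(eye):
    """EAR of one eye from six (x, y) points ordered P1..P6, as in formula (4)."""
    p1, p2, p3, p4, p5, p6 = np.asarray(eye, dtype=float)
    # Two vertical eyelid distances over twice the corner-to-corner width.
    vertical = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horizontal = np.linalg.norm(p1 - p4)
    return vertical / (2.0 * horizontal)


def mean_ear(left_eye, right_eye):
    """Average of the left and right EAR, used as the final EAR value."""
    return 0.5 * (eye_aspect_ratio(left_eye) + eye_aspect_ratio(right_eye))


# Usage with toy coordinates: an open eye and a nearly closed eye.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
closed_eye = [(0, 0), (1, 0.1), (2, 0.1), (3, 0), (2, -0.1), (1, -0.1)]
ear_open = eye_aspect_ratio(open_eye)
ear_closed = eye_aspect_ratio(closed_eye)
ear_mixed = mean_ear(open_eye, closed_eye)
```

Because the ratio divides by the eye width, it is largely insensitive to the distance between the driver and the camera, and it drops towards zero as the eyelids close.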
The mouth aspect ratio (MAR) is calculated as follows:

$$MAR = \frac{\left\| M_2 - M_8 \right\| + \left\| M_3 - M_7 \right\| + \left\| M_4 - M_6 \right\|}{3 \left\| M_1 - M_5 \right\|} \tag{5}$$

where M1 to M8 are the two-dimensional coordinate vectors of the inner contour points of the mouth shape, the numerator measures the vertical distance between the upper lip and the lower lip, and the denominator measures the horizontal distance between the corners of the mouth. To eliminate the error caused by the driver's posture, the vertical distances of the three pairs of feature points are averaged to give the MAR value.

2.3. Verify the effectiveness of the algorithm
To verify the effectiveness of the algorithm, this paper establishes a fatigue detection data set and uses a camera to collect multiple test videos, including fatigue and normal videos of 6 testers at different altitudes and under different lighting conditions. The optimized MTCNN model is first used for face detection, followed by image processing. The data set of the MTCNN face detection model comes from multiple public face data sets. We select face images with different postures, expressions and lighting according to a set ratio to form a new data set, which is used to verify the accuracy with which the eye and mouth feature points are recognized. The experiment uses the TensorFlow deep learning framework to implement the network.

Table 1. Test results of this method.
ID   Altitude (m)   Number of faces   Number identified   Accuracy (%)
1    3000           36                33                  91.7
2    3000           42                40                  95.2
3    4102           23                21                  91.3
4    4102           53                49                  92.5
5    1265           28                26                  92.9
6    1265           38                35                  92.1
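As a quick arithmetic check of Table 1, each accuracy is the number of identified faces divided by the number of faces. The sketch below recomputes the per-test values and a pooled value from the counts listed in the table; the variable and function names are illustrative and are not part of the original experiment code.

```python
# (ID, altitude in m, number of faces, number identified), copied from Table 1.
rows = [
    (1, 3000, 36, 33),
    (2, 3000, 42, 40),
    (3, 4102, 23, 21),
    (4, 4102, 53, 49),
    (5, 1265, 28, 26),
    (6, 1265, 38, 35),
]


def accuracy(faces, identified):
    """Accuracy in percent."""
    return 100.0 * identified / faces


per_test = [accuracy(faces, found) for _, _, faces, found in rows]

# Pooled accuracy: all identified faces over all faces.
total_faces = sum(faces for _, _, faces, _ in rows)
total_found = sum(found for _, _, _, found in rows)
pooled = accuracy(total_faces, total_found)

for (test_id, altitude, _, _), acc in zip(rows, per_test):
    print(f"test {test_id} at {altitude} m: {acc:.1f}%")
print(f"pooled: {total_found}/{total_faces} = {pooled:.1f}%")
```

With the listed counts the pooled accuracy is 204/220, about 92.7%.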
It can be seen from Table 1 that the recognition accuracy of the method proposed in this paper is above 91% in every test, and the average accuracy of the algorithm is close to 93%. This further illustrates the robustness of the proposed detection method and supports the feasibility and validity of the fatigue driving detection system for judging driver fatigue.

3. DRIVER FATIGUE STATE JUDGMENT BASED ON PERCLOS
Judging the driver's mental state with PERCLOS [12] (Percentage of Eyelid Closure Over the Pupil Over Time) is currently a recognized standard in research on driver fatigue detection systems. The PERCLOS value is the percentage of time the eyes are closed within a certain period. The longer the eyes are closed in that period, the greater the PERCLOS value, and the more serious the driver's fatigue. PERCLOS currently has three standards: P70, P80 and EM. Under the P70 and P80 standards, the eye is judged to be closed when the eyelid covers at least 70% or 80% of the pupil area respectively, and the proportion of time that the eye is closed within a period is calculated. Under the EM standard, the eye is judged to be closed when the eyelid covers half of the pupil area.

$$P = \frac{t_3 - t_2}{t_4 - t_1} \times 100\% \tag{6}$$

In the formula, P represents the ratio of the number of frames in which the eyes are closed within a certain period of time to the total number of frames of the collected video stream.
In formula (6), $t_1$ is the time at which the eye, closing from fully open, is 20% closed; $t_2$ is the time at which it is 80% closed; $t_3$ is the time at which the eye, after closing fully, has reopened to 20% open; and $t_4$ is the time at which it has reopened to 80% open. The image data of each frame collected by the system can be divided into two states: eyes open and eyes closed. The algorithm proposed in this paper judges not only the state of each frame but also whether the driver is currently tired. In this paper, if P is greater than or equal to 90%, the driver is judged to be in a severe fatigue state; if P is between 60% and 90%, the driver is judged to be in a mild fatigue state; and if P is less than 60%, the driver is judged to be in a normal state.

4. FATIGUE THRESHOLD SELECTION AND TESTING
Testers suited to each altitude were selected for fatigue testing. This article selects 10 testers and collects real fatigue videos at night, at 12 a.m. (the first 5 at an altitude of 4000 m and the last 5 at an altitude of 3000 m). The fatigue detection counts of each video are calculated and recorded in Table 2.

Table 2. Algorithm test.
Tester number   Fatigue detection times   Correct detection times   Detection accuracy (%)
1               154                       145                       94.15
2               145                       136                       93.79
3               165                       155                       93.93
4               126                       118                       94.75
5               115                       109                       94.78
6               86                        81                        94.18
7               103                       97                        94.17
8               65                        61                        93.85
9               95                        89                        93.68
10              73                        63                        93.15
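The frame-based PERCLOS ratio and the fatigue thresholds of Section 3, which underlie the detections counted in these tests, can be sketched as follows. This is a minimal illustration assuming that each frame has already been labelled as eyes closed or eyes open. The function names are hypothetical; the 60% and 90% thresholds are the ones stated in this paper.

```python
def perclos_from_frames(closed_flags):
    """Percentage of frames in the window in which the eyes are closed."""
    flags = list(closed_flags)
    if not flags:
        raise ValueError("at least one frame is required")
    return 100.0 * sum(1 for closed in flags if closed) / len(flags)


def perclos_from_times(t1, t2, t3, t4):
    """Formula (6): (t3 - t2) / (t4 - t1) as a percentage."""
    return 100.0 * (t3 - t2) / (t4 - t1)


def fatigue_level(p):
    """Map a PERCLOS value in percent to the three states used in the paper."""
    if p >= 90.0:
        return "severe fatigue"
    if p >= 60.0:
        return "mild fatigue"
    return "normal"


# Usage: a window of four frames with the eyes closed in three of them.
window = [True, True, True, False]
p_frames = perclos_from_frames(window)
state_frames = fatigue_level(p_frames)

# Usage: one closure event described by its four time stamps (arbitrary units).
p_times = perclos_from_times(1.0, 2.0, 4.0, 5.0)
state_times = fatigue_level(p_times)
```

In a running system the frame-based version would typically be evaluated over a sliding window of recent frames, so that the fatigue state is updated continuously rather than once per video.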
Experimental results show that the algorithm can also be applied in environments at different altitudes, maintaining an accuracy rate of about 93%.

5. CONCLUSION
In this paper, an optimized face detection algorithm based on the MTCNN model is proposed, and ASM is used to detect the eye and mouth feature points. Finally, the algorithm is tested and verified on multiple night-time videos using PERCLOS. Experimental results show that the algorithm is effective, with an accuracy of about 93%.

Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant No. 51667017) and the undergraduate innovation experiment project of Tibet College of Agriculture and Animal Husbandry.

References
[1] Sheng Yingchao. Research and Implementation of Fatigue Driving Detection System Based on Eye Features. Diss. 2019.
[2] Wang, Fan, et al. "Estimating Driving Fatigue at a Plateau Area with Frequent and Rapid Altitude Change." Sensors 19.22 (2019): 4982.
[3] Chen Jichi, Wang Hong. A Study on Drowsy Driving State Based on EEG Signals [J]. Automotive Engineering, 2018, 40 (05): 515-520.
[4] Vankayalapati, H. D., K. R. Anne, and K. Kyamakya. Extraction of Visual and Acoustic Features of the Driver for Monitoring Driver Ergonomics Applied to Extended Driver Assistance Systems. Data and Mobility. Springer Berlin Heidelberg, 2010.
[5] Takei, Y., and Y. Furukawa. "Estimate of driver's fatigue through steering motion." Systems, Man and Cybernetics, 2005 IEEE International Conference on. IEEE, 2006.
[6] Xing, Jianju, et al. "Application of Face Recognition Based on CNN in Fatigue Driving Detection." the 2019 International Conference 2019.
[7] Wang, Zhong, P. Shi, and C. Wu. "A Fatigue Driving Detection Method based on Deep Learning and Image Processing." Journal of Physics: Conference Series 1575.1 (2020): 012035 (6pp).
[8] Liu, Weihuang, et al. "Convolutional Two-Stream Network Using Multi-Facial Feature Fusion for Driver Fatigue Detection." Future Internet 11.5 (2019): 115.
[9] Zhang K, Zhang Z, Li Z, et al. Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks [J]. IEEE Signal Processing Letters, 2016, 23 (10): 1499-1503.
[10] Li Qingchen. Design of Fatigue Driving Detection System Based on Facial Feature. Diss. 2019.
[11] Islam, Rafiul, et al. "Genome-wide analysis of SARS-CoV-2 virus strains circulating worldwide implicates heterogeneity." Scientific Reports 10.1 (2020).
[12] Thropp, Jennifer E., J. F. V. Scallon, and P. Buza. "PERCLOS as an Indicator of Slow-Onset Hypoxia in Aviation." Aerospace Medicine & Human Performance 89.8 (2018): 700.