TensorRT Optimizations for Embedded Facial Recognition - Alexey Kadeishvili, CTO, Vocord - GTC On-Demand
Vocord Company: Main Facts
■ Developer of video surveillance and video analytics systems since 1999
■ Deep expertise in facial recognition
■ Top-rated in NIST and Megaface face recognition tests
■ NVIDIA Metropolis program member
Our customers and partners
www.vocord.com
Notable figures
■ 250+ projects for public and private sectors
■ 140 million faces in the enrollment database of a single project
■ 200,000 cameras managed by VOCORD video analysis software
■ 350,000 API requests per month to the VOCORD FaceMatica cloud
■ Geography: Europe, Middle East, SE Asia, East Asia, Latin America, Oceania
Face recognition products
■ VOCORD FaceControl: “Faces in the crowd” FR system
■ VOCORD FaceMatica: face recognition engine in a cloud
■ Face Recognition SDK: face recognition engine SDK
■ VOCORD NanoFace: NVIDIA Jetson-based embedded face recognition
■ VOCORD NetCam: new generation face recognition camera
■ VOCORD FaceControl 3D: free flow 3D facial recognition solution
All products support NVIDIA GPU
Main Factors Impacting Facial Recognition
■ Inbound image quality
■ Enrollment DB quality: something beyond control
■ Recognition engine: already works as in the Marvel movies
VOCORD Facial Recognition Engine
■ TOP in Megaface Face Scrub Open Challenge 2015-2018, with accuracy 91.76%
■ TOP in NIST Face Recognition Vendor Test 2016-2018: TPR at FPR 10^-4 = 98.7%, TPR at FPR 10^-6 = 96.6%
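The "TPR at FPR" figures quoted above follow a standard verification metric: pick the score threshold at which impostor pairs are accepted no more often than the target false positive rate, then measure how many genuine pairs still pass. The sketch below is a minimal illustration under stated assumptions, not the NIST protocol or VOCORD's evaluation code. The function name `tpr_at_fpr` and the toy scores are hypothetical.

```python
import math


def tpr_at_fpr(genuine_scores, impostor_scores, target_fpr):
    """Return (threshold, tpr) with the FPR held at or below target_fpr.

    A pair is accepted when its similarity score is strictly greater
    than the threshold.
    """
    impostors = sorted(impostor_scores, reverse=True)
    # How many impostor pairs may be accepted without exceeding the target.
    allowed = math.floor(target_fpr * len(impostors) + 1e-9)
    if allowed >= len(impostors):
        threshold = float("-inf")
    else:
        # At most `allowed` impostor scores are strictly above this value.
        threshold = impostors[allowed]
    accepted = sum(1 for s in genuine_scores if s > threshold)
    return threshold, accepted / len(genuine_scores)


# Toy similarity scores (hypothetical, for illustration only).
GENUINE = [0.9, 0.8, 0.75, 0.6, 0.4]
IMPOSTOR = [0.7, 0.5, 0.3, 0.2, 0.1, 0.05, 0.3, 0.2, 0.1, 0.0]

threshold, tpr = tpr_at_fpr(GENUINE, IMPOSTOR, target_fpr=0.1)
print(f"threshold={threshold}, TPR={tpr:.2f}")
```

With only ten impostor scores, the smallest non-zero FPR that can be measured is 0.1. Reporting TPR at FPR 10^-6 requires at least a million impostor comparisons.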
Pose Invariance
[Figure: curves plotted against FAR (1.E-07 to 1.E+00) for faces grouped by pose angle. Surviving legend entries: Group 3 (30-45˚), Group 4 (45-60˚), Group 5 (> 60˚); legend fragments also mention the enrollment DB at 60˚ and > 60˚.]
Image Resolution Impact
[Figure: True Identification Rate (face identification probability, at FAR = 10^-4) vs. pixels between the eyes (L), for L from 12 to 72 pix.]
■ Recommended minimum: L = 24 pix
■ Optimal resolution: L = 48 pix
How to improve recognition?
■ The quality of acquired face images: point of growth
■ Enrollment DB quality: something beyond control
■ Recognition engine: already works as in the Marvel movies
Different types of test datasets (source: NIST FRVT Report 2017 10 03)
“Controlled” dataset: Algorithm A, Algorithm B (source: NIST FRVT Report 2017 10 03)
“Uncontrolled” dataset: Algorithm A, Algorithm B (source: NIST FRVT Report 2017 10 03)
Controlled vs. Uncontrolled (FRR log scale)
[Figure: FRR vs. FAR (1.E-07 to 1.E-02) for Algorithm A and Algorithm B, each in a controlled and an uncontrolled environment.]
Controlled vs. Uncontrolled (linear scale)
[Figure: the same four FRR vs. FAR curves on a linear FRR axis.]
Hit the bottom: Images from IP camera
The Advantages of Edge Video Analysis
■ Face recognition onboard
■ No compression artifacts: the image is taken directly from the sensor
■ Dynamic Region of Interest for every intelligent algorithm
■ Algorithm adjustment for a particular camera setup
[Image: VOCORD NetCam.AI edge video analytics camera]
Video Enhancement Onboard
Dynamic ROI enhances the quality of the image in the face area
[Images: backlight, no enhancement; 12 bit image with static ROI; 12 bit image with dynamic ROI]
VOCORD NetCam.AI HW Features
■ High quality sensor
■ Automated lens control
■ NVIDIA Jetson TX1 GPU
VOCORD NetCam.AI Tech Specs
Camera specs
■ Resolution: 3-5 Mpix
■ Temperature range: -25°C to +50°C
■ Ingress Protection: IP 67
■ Dimensions: 20x71x150 mm
■ Power consumption: 15 W
Built-in facial recognition engine specs
■ Min face resolution for face recognition: 12 pixels between the eyes
■ Number of faces detected in one frame: up to 25
■ Latency of biometric template extraction: up to 150 ms per face
■ Face recognition performance: up to 32 faces/s
■ Inference framework: TensorRT
Performance on Different Platforms
[Figure: bar chart (axis 0 to 35) comparing NVIDIA Jetson TX1, Intel Movidius, and Qualcomm Snapdragon 820 on “Shallow”, “Medium”, and “Deep” CNNs. Bar labels that survive extraction: 32, 19, 12, 9, 6, 4, 2.2, 1.4, 0.9.]
Higher FPS Improves Accuracy
[Figure: FRR vs. FAR (1.E-07 to 1.E-02) for “Deep”, “Medium”, and “Shallow” CNNs, plotted for a single face and for a track (multiple faces).]
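The slide compares recognition from a single face against recognition from a track of several faces of the same person, but it does not say how the track is combined. One common approach, shown here purely as an assumption, is to average the per-frame templates and compare the fused template with the enrolled one. The helper names `l2_normalize`, `fuse_track`, and `cosine`, and the two-dimensional toy templates, are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def fuse_track(templates):
    """Average the per-frame templates of one track, then renormalize."""
    dim = len(templates[0])
    mean = [sum(t[i] for t in templates) / len(templates) for i in range(dim)]
    return l2_normalize(mean)


def cosine(a, b):
    """Cosine similarity of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy example: the enrolled template and two noisy frames of the same person.
enrolled = [1.0, 0.0]
frames = [l2_normalize([0.8, 0.6]), l2_normalize([0.8, -0.6])]

single_score = cosine(frames[0], enrolled)
track_score = cosine(fuse_track(frames), enrolled)
print(f"single frame: {single_score:.2f}, fused track: {track_score:.2f}")
```

In this toy case the per-frame noise cancels, so the fused template scores higher than either single frame. A higher frame rate puts more frames into each track, which is one plausible reason for the gain the slide reports.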
TensorRT vs. MXNet Performance
[Figure: bar chart of FPS (axis 0 to 35) for MXNet and TensorRT on “Shallow”, “Medium”, and “Deep” CNNs. Bar labels that survive extraction: 32, 19, 18, 12, 10, 6.]
Platform: NVIDIA Jetson TX1
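The chart's bar labels can be turned into a per-network speedup. The pairing used below is an assumption read from the extracted labels, not stated in the text: TensorRT at 32, 19, and 12 FPS (matching the "up to 32 faces/s" in the tech specs) and MXNet at 18, 10, and 6 FPS. The dictionary layout and the function name `speedups` are hypothetical.

```python
# Assumed pairing of the bar labels (FPS on NVIDIA Jetson TX1).
FPS = {
    "Shallow": {"MXNet": 18, "TensorRT": 32},
    "Medium": {"MXNet": 10, "TensorRT": 19},
    "Deep": {"MXNet": 6, "TensorRT": 12},
}


def speedups(fps_table):
    """Return the TensorRT / MXNet throughput ratio for each network."""
    return {
        name: round(row["TensorRT"] / row["MXNet"], 2)
        for name, row in fps_table.items()
    }


for name, ratio in speedups(FPS).items():
    print(f"{name} CNN: {ratio}x")
```

Under that pairing, TensorRT runs roughly 1.8 to 2 times faster than MXNet on the same hardware, with the larger gain on the deeper network.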
WHAT’S THE PROFIT?
Face recognition systems architectures
Edge analytics system with VOCORD NetCam.AI cameras VS “Traditional” server architecture approach with regular IP-cameras
■ Edge analytics system: one archive server; LAN, Wi-Fi; 95% of processing happens on the cameras
■ “Traditional” server architecture: data center with many expensive rack servers; LAN; 95% of processing happens in the data center
Cost-Efficiency: 100 High Loaded Cameras
Edge computing with VOCORD NetCam.AI
■ Cameras: USD 2,000 x 100 = USD 200,000
■ Server for matching and archive: USD 10,000
■ CAPEX: USD 210,000
■ Maintenance costs: power supply (800 W), bandwidth (2 Gbps), rack space
■ OPEX: USD 2,000 per year
VS
“Traditional” server architecture with IP cameras
■ Cameras: USD 500 x 100 = USD 50,000
■ Detection servers: 2 servers, 4xCPU 32 cores each, USD 60,000
■ Template extraction servers: 4 servers, 2 GPU Tesla P40 each, USD 120,000
■ Server for matching and archive: USD 10,000
■ CAPEX: USD 240,000
■ Maintenance costs: power supply (7-8 kW), bandwidth (2 Gbps), rack space
■ OPEX: USD 30,000 per year
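The CAPEX and OPEX figures on this slide can be combined into a total cost over several years. The sketch below only redoes the slide's arithmetic. The five-year horizon and the names `total_cost`, `EDGE`, and `TRADITIONAL` are assumptions chosen for illustration.

```python
def total_cost(capex, opex_per_year, years):
    """Total cost of ownership in USD: up-front cost plus yearly running cost."""
    return capex + opex_per_year * years


# Figures taken from the slide, in USD.
EDGE = {
    "capex": 2_000 * 100 + 10_000,  # cameras + matching/archive server
    "opex": 2_000,
}
TRADITIONAL = {
    # cameras + detection servers + template extraction servers + archive
    "capex": 500 * 100 + 60_000 + 120_000 + 10_000,
    "opex": 30_000,
}

YEARS = 5  # assumed planning horizon, not from the slide
edge_total = total_cost(EDGE["capex"], EDGE["opex"], YEARS)
trad_total = total_cost(TRADITIONAL["capex"], TRADITIONAL["opex"], YEARS)
print(f"Edge: USD {edge_total:,}")
print(f"Traditional: USD {trad_total:,}")
print(f"Difference: USD {trad_total - edge_total:,}")
```

The edge setup starts USD 30,000 cheaper in CAPEX, and the gap widens by USD 28,000 for each year of operation.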
WHAT’S NEXT?
• Uploading various video analytics algorithms
• Highly customized algorithms
• Interacting cameras as a part of IoT
• 3D vision
Open Platform: Easy Algorithm Uploading
■ Facial recognition
■ Behavioral analysis
■ License plate recognition
■ Vehicle types
■ Emergency cases
■ Lost and found objects
Camera-Dependent Algorithm Customization
Step 1. The camera collects images and uploads them to the server
Step 2. The neural network is retrained on the server using new images
Step 3. Customized, light-weight neural network is uploaded back to the camera
Customization to restricted data
[Figure: two panels, “Unrestricted data” and “Restricted data”, each plotting FRR (0 to 0.04) vs. FAR for a “Deep” neural network and a “Shallow” neural network.]
■ Deeper DNNs provide better performance on unrestricted data
■ On restricted data, the difference between deep and shallow networks is negligible
Intercamera Tracking
[Figure: a person followed between NetCam.AI #1 and NetCam.AI #2, with the labels Face, Bag, and Jeans.]
Obtaining 3D Models
■ Building a 3D object from synchronous snapshots from multiple cameras
■ Feature preprocessing for conjugate points search
Thank you for your attention! Questions? E-mail: sales@vocord.com Website: www.vocord.com