Facial Manipulation Detection Based on the Color Distribution Analysis in Edge Region
Dong-Keon Kim (SungKyunKwan University, kdk1996@skku.edu), Donghee Kim (Hippo T&C, ym.dhkim@skku.edu), Kwangsu Kim (SungKyunKwan University, kim.kwangsu@skku.edu)

arXiv:2102.01381v1 [cs.CV] 2 Feb 2021

Abstract

In this work, we present a generalized and robust facial manipulation detection method based on color distribution analysis of the region perpendicular to edges in a manipulated image. Most contemporary facial manipulation methods involve a pixel correction procedure that reduces the awkwardness of pixel value differences along the facial boundary of a synthesized image. Because of this procedure, there are distinctive differences along the facial boundary between a face-manipulated image and an unforged natural image. In addition, a forged image should show distinctive and unnatural features in the gap between the distributions at the facial boundary and at background edge regions, because manipulation tends to damage the natural effect of lighting. We design a neural network that detects face-manipulated images from these distinctive features of the facial boundary and the background edges. Our extensive experiments show that our method outperforms other existing face manipulation detection methods at detecting synthesized face images across various datasets, regardless of whether a dataset participated in training.

Figure 1. Summarized steps of the face manipulation procedure. A generated image is created from a source face image by a generative model. To paste the generated face into the target image realistically, post-processing is applied after the manipulation.

1. Introduction

With the rapid development of AI-based generative models [16, 29, 11], facial manipulation methods have also improved at generating realistic fake face images and changing one person's facial identity to another's. As face synthesizing algorithms become more and more sophisticated, abuse cases and social issues caused by forged images have emerged. Realistically synthesized videos made for malicious purposes already threaten the truthfulness of public media and individual privacy [37], e.g., fake news about politicians and celebrity pornography. For these reasons, effectively detecting facially manipulated images is a crucial issue in computer vision.
Existing works either focus only on features that appear in the synthesized face images of datasets generated by specific manipulation algorithms [23, 2, 38], or focus on designing artificial neural network architectures [25, 1, 31] that process the whole image rather than paying attention to the face manipulation procedure. These methods work well on the specific datasets that were analyzed before the detectors were designed, but they do not work well when classifying synthesized videos without prior information. Focusing on specific manipulation algorithms or on whole-image features is inefficient in the real-world scenario, where we must determine the authenticity of an arbitrary image without any prior information about the source data or the manipulation procedure.

Most face synthesizing algorithms today commonly involve facial boundary post-processing after the face is transferred from one image to another, in order to reduce the awkward pixel distribution differences between the original image and the altered face image [19], as shown in Figure 1. Recent studies [18, 19, 33] pay attention to these shared procedures of face manipulation algorithms when detecting synthesized images, with generalization to various other datasets in mind. The post-processing applied by a face retouching application or a generative model damages the unique fingerprint of an image and creates a pixel distribution difference along the facial boundary between a post-processed manipulated image and an unforged natural image.
Our work pays attention to extracting pixel value difference features along the boundary area to detect face-manipulated images. Referring to existing edge detector algorithms [27, 3], we gather a set of lines perpendicular to the facial boundary and compute the corresponding pixel differences along those lines. If a facial image is altered and then post-processed to smooth the facial boundary, the pixel values near the face will differ only slightly, whereas an unforged natural image shows relatively large pixel differences near the face because the pixels on the facial boundary were never modified.

Additionally, we gather pixel differences along lines perpendicular to background boundaries. To contrast them with the features extracted from the facial boundary, the pixel difference values in the direction perpendicular to the background boundary are obtained in a similar way to the facial feature extraction procedure. The gap between the extracted background boundary features and the facial boundary features differs significantly between an unforged natural image and a face-manipulated image.

To classify the gathered features with high performance, we use a one-dimensional convolutional neural network, which is commonly used for extracting distinctive features from sequential data [12], because the gathered features are pixel difference values ordered along the direction perpendicular to the face and background boundaries.
Rather than developing a complex and heavy classification model, we developed a lightweight model that classifies the extracted features with very good performance. We show that our method, together with the suggested classification model, outperforms other existing manipulation detectors in terms of training stability and universality across various datasets.

The contribution of this paper to the field of face manipulation detection is its outstanding performance and its generalization across datasets and generation algorithms. Our proposed method uses pixel difference values as features and additionally uses information about the background of the image, so it is not biased toward specific source data or specific generation algorithms and achieves remarkably high detection accuracy regardless of the face manipulation dataset.

2. Related Work

In this section, we briefly introduce previous image manipulation detection techniques, from early studies of image forgery detection to the latest works related to our proposed method.

AI-generated image classification. In early research on detecting AI-generated images, existing studies [39, 22, 24, 17] showed that there is a difference in color distribution between real images and entirely generated fake images produced by AI-based generative models. Generative models have the limitation that a fully generated image carries its own artificial imprints: an AI-based generative model cannot perfectly reproduce the original fingerprints of an authentic natural image, which arise from intrinsic hardware properties such as the surface of the camera lens and the direction of the light entering the camera. However, detecting manipulated images in which only the face region is altered must be approached somewhat differently from determining the authenticity of entirely generated images.

AI-based face manipulation detection. With rapidly progressing generative models and synthesizing techniques, there has been much malicious usage, such as realistically synthesizing one person's face onto another's body. In response to the resulting threats to privacy and trust, many studies [8, 21, 28] have tried to detect face-manipulated images. For instance, Matern et al. [23] present a detection method based on visual artifacts, especially around the eyes and nose. Yang et al. [38] suggested detecting face-manipulated images through inconsistencies of the head pose. While these works focus on awkward features of synthesized images, more recent works [25, 1, 31, 40] detect manipulation with well-designed state-of-the-art neural networks such as recurrent neural networks and capsule networks. All of these works mainly focus on features appearing in the synthesized image or design their model architecture around the synthesized images of specific datasets. They demonstrate results only for specific algorithms on limited source datasets, and verification with extensive experiments covering the real-world scenario of detecting unseen data without prior information has failed.

Generalized face manipulation detection. As the need for methodologies that transfer better to real situations increases, the latest studies [4, 8] have begun to focus on the characteristics of the face synthesis process rather than of the synthesized result. In particular, Li et al. [19] pay attention to a procedure common to face manipulation methods, post-processing with boundary smoothing. Face X-ray [18], the most recent generalized detection method, focuses on the blending procedure common to various face manipulation algorithms and shows generalization to unseen datasets for real-world scenarios. It performs well on the FaceForensics++ dataset, whose sub-datasets cover different manipulation types but share source data, while experiencing a performance drop on entirely different datasets. Our study derives higher performance and more generality by extracting not only facial features but also the innate fingerprints of background edges and comparing them with the facial features.
3. Methods

In this section, we explain the motivation of our method, the reason for gathering boundary pixel features, and the detailed boundary feature extraction process.

Motivation from Edge Detector Algorithm. We build on existing edge detector algorithms to define boundaries. The edge found by an edge detector is the set of boundary points where the change of intensity in the gray-scaled image is large [14]. The Canny edge detector [3], the most widely used edge detector, first computes first-order derivatives in the X and Y directions of the given image by convolution with the Sobel masks [32] before detecting edges. The gradient G, which indicates where pixel values change rapidly, is obtained from these first-derivative values in the X and Y directions, and its direction is perpendicular to the boundary appearing in the image. We focus on this process of finding boundaries through the gradient. Since the distribution of pixel values in the face region differs from that of the non-face regions, the pixel values change sharply in the direction perpendicular to the facial boundary.
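To make this direction convention concrete, the minimal sketch below computes the Sobel-based gradient that the Canny pipeline relies on; the function name and kernel size are our own illustrative choices, not part of the paper.

```python
import cv2
import numpy as np

def gradient_magnitude_and_direction(gray: np.ndarray):
    """Gradient of a gray-scaled image: large magnitude marks a boundary,
    and the gradient direction points perpendicular to that boundary."""
    # First-order derivatives in the X and Y directions via 3x3 Sobel masks,
    # as done at the start of the Canny edge detection pipeline.
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    magnitude = np.hypot(gx, gy)       # |G|: where pixel values change rapidly
    direction = np.arctan2(gy, gx)     # orientation of G, i.e. across the edge
    return magnitude, direction
```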
Artificial Imprints Harming Natural Fingerprints. Early research in image forensics [10] mentions that an unforged natural image has a unique fingerprint generated by the image processing procedure and by hardware characteristics such as the camera lens. Recent studies [39, 22] show that forged images created by generative models such as GANs instead carry unique imprints of the synthesizing procedure. Regardless of the type of generative model, the face replacement process and the post-processing that reduces pixel value differences between the source and target images damage the natural image fingerprints across the facial boundary. For this reason, there is an evident difference between a synthesized facial boundary and an unforged natural image in terms of the pixel value distribution. In addition, we emphasize the difference between the unmanipulated image and the forged image by contrasting the original fingerprint that appears on the background, which is not involved in the facial synthesis process, with the fingerprint on the face. Taking all of this into account, we can detect face-manipulated images from boundary pixel features of both the entire image and the facial area.

Two-way feature extraction. We suggest a two-way feature extraction method that captures two aspects: the post-processing procedure shared by face manipulation technologies, and the boundary characteristics of the whole image. One is facial boundary feature extraction, which reflects the pixel-level distributions produced by the post-processing procedure, while the other is background boundary feature extraction, which captures the pixel-level distribution along boundaries in the unforged region of the image.

3.1. Facial Boundary Feature Extraction

Given a frame image I extracted from the video, we detect the face area with the frontal face detector in dlib [13]; we assume that only a single face appears in the video. We then extract 68 feature points of the detected face with the 68-point face landmark shape predictor in dlib. Among the detected landmarks, the 17 points representing the outer boundary of the face are selected and used to define the facial boundary in the image. The line segment Pi,i+1 perpendicular to the facial boundary is defined by the perpendicular direction at the midpoint of two neighboring boundary landmark points Li and Li+1.

Each perpendicular line segment Pi,i+1 is a parametric equation in the parameter t, which takes 20 integer values from -10 to 9, running from the inside to the outside of the face. The 17 facial boundary landmarks yield 16 perpendicular line segments, and each Pi,i+1 provides 20 coordinates at equal intervals along the direction perpendicular to the boundary line.

To make the perpendicular line segments cover a consistent region of the face, we tune the line segment size with a scale factor that is inversely proportional to the facial area. With this scale factor, the perpendicular line segment Pi,i+1 passes through a comparable range of the facial boundary regardless of the face size in the image.

To build window-shaped coordinate features, we generate a set of horizontal line segments Vi,i+1 that are parallel to the facial boundary line and pass through the t-th point of the gathered line segment Pi,i+1. The horizontal line segments are also parametric equations, in a parameter u that takes 20 integer values from -10 to 9. The set of 20 horizontal lines thus specifies a window of 20×20 coordinates. Figure 2 illustrates an example of an extracted edge window.

Figure 2. Illustration of an edge window. The points are spaced equally along the line segments. The center point of the window is the midpoint Mi,i+1 of the two neighboring landmark points Li and Li+1.
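As a rough illustration of this window construction, the sketch below uses dlib's frontal face detector and 68-point shape predictor to build the 16 window coordinate grids; the predictor file path, the scale-factor constant, and the inside/outside sign convention are assumptions on our part, not values given in the paper.

```python
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # assumed model path

def facial_edge_windows(image: np.ndarray, half_size: int = 10):
    """Return 16 coordinate grids of shape (20, 20, 2), one per boundary segment."""
    face = detector(image, 1)[0]          # the paper assumes a single face per video
    shape = predictor(image, face)
    # 17 jawline landmarks describing the outer facial boundary.
    outline = np.array([(shape.part(i).x, shape.part(i).y) for i in range(17)], dtype=float)

    # Scale factor inversely proportional to the facial area (constant is illustrative).
    scale = 40000.0 / float(face.width() * face.height())

    windows = []
    for i in range(16):
        L_i, L_next = outline[i], outline[i + 1]
        mid = (L_i + L_next) / 2.0                      # midpoint M_{i,i+1}
        tangent = (L_next - L_i) / np.linalg.norm(L_next - L_i)
        normal = np.array([-tangent[1], tangent[0]])    # perpendicular to the boundary line

        # t runs across the boundary (assumed inside -> outside), u runs parallel to it.
        t = np.arange(-half_size, half_size)[:, None, None]   # -10 .. 9
        u = np.arange(-half_size, half_size)[None, :, None]
        coords = mid + scale * (t * normal + u * tangent)     # (20, 20, 2) window coordinates
        windows.append(coords)
    return windows
```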
We call these 400 coordinates the edge window Wi, the set of horizontal lines Vi,i+1 indexed by the two parameters t and u. A total of 16 windows are created per face image from the 17 facial boundary landmarks. The pipeline for generating the windows from one facial image is summarized in Figure 3.

Figure 3. Facial boundary window generation pipeline. (a) Capture an image I from the video. (b) Obtain the facial boundary line from the face landmark points detected in (a). (c) Construct the perpendicular line segment Pi,i+1 from neighboring landmark points. (d) Create the set of horizontal lines Vi,i+1 perpendicular to the segment Pi,i+1 at equal intervals.

To measure how the RGB pixel values in an edge window Wi change from the inside to the outside of the face, we compute the absolute differences of the RGB pixel values along the parameter t, and then average the difference values over the 20 horizontal lines to obtain a statistical description of the boundary region. The extracted 16×19×3 color difference features are the facial boundary features: 16 is the number of edge windows W in one facial image, 19 is the number of differentiated values along the parameter t, and 3 stands for the RGB channels.

3.2. Background Boundary Feature Extraction

Before extracting features at background edge points, we build a background edge image EI from the frame image I. To remove background noise and keep only prominent boundaries, we blur the image I with a Gaussian filter. A rough edge image EI', which still includes edge points on the face, is then generated with the Canny edge detector.

The facial edge points in EI' must be excluded in order to extract the natural fingerprint of the background edges. To remove them, we consider a convex image CI covering the face contour: a binary image of the same size as the original image I whose pixels are 0 inside the convex hull and 1 outside. To ensure that every edge appearing on the face is excluded, an enlarged-face convex image CI' is created by blurring CI with a Gaussian filter and rounding every value in the blurred CI down to 0 except for the pixels whose value is 1. Element-wise multiplication of CI' and EI' then yields the edge image EI, which excludes facial edge points. The pipeline for creating the edge image EI is shown in Figure 4.

Figure 4. Background edge image generation procedure.

To build background boundary features from the edge points in the background edge image EI, we randomly sample 10% of all edge points in EI to gather statistical information along the background edges. Using the sampled edge points, background windows are extracted in the same way as in the facial boundary feature extraction, with Li and Li+1 replaced by the two edge points nearest to each sampled point. The background edge window generation procedure is summarized in Figure 5.

While the edge windows on the facial boundary have a specified direction, from the inside to the outside of the face, the direction of the parameter t in a background edge window cannot be specified beyond being perpendicular to the boundary line. For this reason, we fold the background boundary feature and unfold it symmetrically about T = 10 to remove directional information from the edge window; e.g., all feature values at T = 1 and T = 19 are averaged along that axis and replaced by the averaged value. The mirrored background boundary color difference features have shape N×19×3, where N is the number of sampled edge points in the background edge image EI and 19 and 3 are as described in Section 3.1.
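The background edge image construction can be sketched with OpenCV as below; the blur kernel sizes, Canny thresholds, and the near-1 cutoff used to enlarge the face mask are illustrative assumptions rather than the paper's exact settings, and `face_outline` is the (17, 2) jawline landmark array from the facial step.

```python
import cv2
import numpy as np

def background_edge_points(image: np.ndarray, face_outline: np.ndarray, sample_ratio: float = 0.10):
    """Sample background edge coordinates from E_I, excluding facial edges."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)       # suppress background noise
    rough_edges = cv2.Canny(blurred, 100, 200)        # E'_I: still contains facial edges

    # Binary convex image C_I: 0 inside the face hull, 1 outside.
    mask = np.ones_like(gray, dtype=np.uint8)
    hull = cv2.convexHull(face_outline.astype(np.int32))
    cv2.fillConvexPoly(mask, hull, 0)

    # "Big-face" convex image C'_I: blur C_I and keep only pixels that stayed (almost) 1,
    # which pushes the excluded region slightly beyond the face contour.
    blurred_mask = cv2.GaussianBlur(mask.astype(np.float32), (31, 31), 0)
    big_face_mask = (blurred_mask > 0.999).astype(np.uint8)

    background_edges = rough_edges * big_face_mask     # E_I: facial edge points removed
    ys, xs = np.nonzero(background_edges)
    points = np.stack([xs, ys], axis=1)
    if len(points) == 0:
        return points

    # Randomly keep 10% of the edge points for the statistical background features.
    keep = np.random.choice(len(points), size=max(1, int(sample_ratio * len(points))), replace=False)
    return points[keep]
```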
Figure 5. Background boundary feature extraction pipeline. (a) Capture an image I from the video. (b) Make a background edge image EI without the facial boundary. (c) Sample edge points from the edge image EI. (d) Make vertical and horizontal lines in the same way as the facial line segments were extracted.

4. Analysis with Extracted Features

In this section, we briefly analyze how the features extracted from an image behave according to the authenticity of the image, prior to the experiments.

Datasets. To demonstrate the generalization ability of our method, we analyze various up-to-date face manipulation datasets built with diverse generation algorithms. We adopt several datasets: FaceForensics++ [30], Celeb-DF [20], DFD [6] and DFDC [7]. All of these datasets contain genuine videos and face-manipulated videos derived from them with reduced pixel-value awkwardness. We randomly choose 750 genuine videos and 750 manipulated videos from each of FaceForensics++, DFD, and DFDC. Since the Celeb-DF dataset has fewer than 750 genuine videos, we choose 563 genuine videos and 750 manipulated videos from Celeb-DF.

To analyze statistically how the color features obtained by our method change with the parameter t, we plot their distributions. Figure 6 shows the distributions of the difference values along the t-direction for the pre-processed facial features and background features in three datasets: Celeb-DF, FaceForensics++ and DFDC. Note that the boundary edge points are at t = 9. We observe a distribution difference between the facial features derived from face-manipulated images and those extracted from real images; the difference is significant at points close to t = 9, which corresponds to the facial boundary line. Since a generated face image may have an unusual pixel distribution in the facial boundary region because of the post-processing that corrects color value differences along the synthesized parts, this difference in color distribution is inevitable.

The distributions in these three datasets also have in common that the gap between facial features and background features is relatively larger for unforged images than for face-manipulated images. This observation shows that using the facial information and the background information together can help address the problem of generality.
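For readers who want to reproduce this kind of Figure 6-style analysis, a small plotting sketch is given below; the aggregation (averaging over frames, windows and RGB channels) is our own simplification of the plotted statistics.

```python
import matplotlib.pyplot as plt
import numpy as np

def plot_t_profiles(real_feats: np.ndarray, fake_feats: np.ndarray, title: str) -> None:
    """Plot mean color-difference profiles along t for real vs. manipulated frames.

    real_feats / fake_feats: facial boundary features of shape (num_frames, 16, 19, 3).
    """
    t = np.arange(1, 20)
    for feats, label in [(real_feats, "real"), (fake_feats, "manipulated")]:
        profile = feats.mean(axis=(0, 1, 3))          # average over frames, windows, RGB -> (19,)
        plt.plot(t, profile, label=label)
    plt.axvline(9, linestyle="--", color="gray")      # position of the facial boundary (t = 9)
    plt.xlabel("t (inside of face -> outside of face)")
    plt.ylabel("mean color difference")
    plt.title(title)
    plt.legend()
    plt.show()
```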
5. Experiment

In this section, we introduce our experimental setup and implementation details and report extensive experiments on overall performance, universality, and comparison with other contemporary works. For a conclusive evaluation of detection performance, we mainly train on the Celeb-DF dataset, on which other works find it difficult to achieve remarkable detection performance [36], and additionally use the FaceForensics++, DFD and DFDC datasets. When testing, we cross-validate across all of these datasets to demonstrate the universality of our method.

Additional preprocessing for the experiment. We want to capture the difference between the color-difference distributions of the facial edge region and of the background edge region. We compute the difference values with Eq. (1):

D_{I,n} = F_{I,n} − B̂_I    (1)

where F_{I,n}, B̂_I and D_{I,n} denote the facial boundary feature of the n-th window, the averaged background boundary feature, and the difference between F_{I,n} and B̂_I, respectively. The final feature derived from the background edge region is D_{I,n}, with shape 16×19×3. We then flatten the facial features extracted in Section 3.1 from shape 16×19×3 to a 304-dimensional linear feature with 3 RGB channels, and the background features extracted in Section 3.2 are flattened in the same way. The input to the classification model is thus a 6-channel, 304-dimensional linear feature consisting of 3 channels of flattened face features and 3 channels of flattened background features.
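A minimal sketch of this preprocessing step is shown below; the exact array layout (in particular, using the difference features D as the three "background" channels) is our reading of the text and should be treated as an assumption.

```python
import numpy as np

def assemble_classifier_input(face_feat: np.ndarray, bg_feat: np.ndarray) -> np.ndarray:
    """Build the 6-channel, 304-length classifier input for one frame.

    face_feat: (16, 19, 3) facial boundary color-difference features (Section 3.1).
    bg_feat:   (N, 19, 3) mirrored background boundary features (Section 3.2).
    """
    bg_mean = bg_feat.mean(axis=0, keepdims=True)          # averaged background feature B̂_I, (1, 19, 3)
    diff = face_feat - bg_mean                              # Eq. (1): D_{I,n} = F_{I,n} - B̂_I, (16, 19, 3)

    # Flatten 16 windows x 19 steps into a 304-long sequence, keeping RGB as channels.
    face_seq = face_feat.transpose(2, 0, 1).reshape(3, 16 * 19)   # (3, 304)
    bg_seq = diff.transpose(2, 0, 1).reshape(3, 16 * 19)          # (3, 304), background-referenced channels

    return np.concatenate([face_seq, bg_seq], axis=0)             # (6, 304)
```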
Figure 6. Color difference distributions of the extracted features for three different datasets.

Classification Model. To further highlight the distinctive features separating unforged and face-manipulated images, we adopt a 1-dimensional convolutional neural network (1D-CNN), which is widely used in natural language processing for extracting features from sequential inputs [12]. This differs from existing manipulation detection models [30, 7] that process image features with 2-dimensional convolutional neural networks (2D-CNN); our extracted color difference features are sequential data along the parameter t. While advanced face manipulation classification models are mostly designed to be complex and heavy in order to extract significant information from whole images, we construct a lightweight and simple model to process our pre-processed color difference features. The total number of epochs is set to 1000 and the batch size to 20. The learning rate is set to 0.00075 with the Adam optimizer [15] and is multiplied by 0.5 every 200 epochs for decay. Cross-entropy is used as the loss function.
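The paper does not spell out the layer configuration of its lightweight 1D-CNN, so the architecture below is a hypothetical sketch; only the training hyperparameters (Adam, learning rate 0.00075 halved every 200 epochs, batch size 20, 1000 epochs, cross-entropy loss) follow the text, and `train_loader` is assumed to yield (6, 304) feature batches with binary labels.

```python
import torch
import torch.nn as nn

class BoundaryFeature1DCNN(nn.Module):
    """Hypothetical lightweight 1D-CNN over the 6-channel, 304-length boundary features."""
    def __init__(self, in_channels: int = 6, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(64, 64, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:    # x: (batch, 6, 304)
        return self.classifier(self.features(x).squeeze(-1))

model = BoundaryFeature1DCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.00075)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.5)  # halve LR every 200 epochs
criterion = nn.CrossEntropyLoss()

for epoch in range(1000):                     # total number of epochs from the paper
    for features, labels in train_loader:     # assumed DataLoader with batch size 20
        optimizer.zero_grad()
        loss = criterion(model(features), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```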
5.1. Detection Performance with a Specific Dataset

We first experiment with a single dataset used for both training and testing. To evaluate our results, we report the results of several existing methods [23, 38, 19, 40, 25] that were validated on all four state-of-the-art datasets (FaceForensics++, DFD, DFDC and Celeb-DF), cited from their original papers, alongside ours. Table 1 summarizes the performance compared with the five referenced manipulation detection methods in terms of AUC, the area under the Receiver Operating Characteristic curve. We observe that our results outperform those of the other papers, especially on the Celeb-DF dataset (95.2%), on which the performance of other advanced manipulation detection methods drops drastically.

Table 1. Performance (AUC) compared with recent works.
Study                FF++ / DFD   DFDC   Celeb-DF
Matern et al. [23]   78.0         66.2   55.1
Yang et al. [38]     47.3         55.9   54.6
Li et al. [19]       93.0         75.5   64.6
Zhou et al. [40]     70.1         61.4   53.8
Nguyen et al. [25]   96.6         53.3   57.5
Ours                 99.8         97.9   95.2

5.2. Cross-Validation with Various Datasets

We design a cross-validation experiment to test the robustness of our approach in real-world scenarios. We train our model on one dataset and test it on the other datasets, simulating the detection of manipulated images from unseen datasets. Table 3 shows the cross-validation performance of our approach in terms of AUC. We observe that a model trained on Celeb-DF features obtains more generalized results than models trained on the other datasets. This inconsistency across training sets may occur because the features extracted from the Celeb-DF dataset show a relatively less distinctive difference between unforged and manipulated images than the other three datasets; a model trained on a dataset with less prominent features, such as Celeb-DF, performs well when verifying datasets with more prominent features. Nevertheless, every experiment with DFDC or Celeb-DF as the test dataset outperforms all other referenced face manipulation detection methods.

Table 3. Cross-validation performance (AUC).
Training set   FF++   DFD    DFDC   Celeb-DF
FF++           99.8   95.1   89.5   87.7
DFD            95.9   99.8   88.5   87.6
DFDC           98.5   93.0   97.9   88.3
Celeb-DF       99.8   99.5   97.1   95.2
5.3. Comparison with Latest Works

For a more precise evaluation of universality, we compare our results with the latest works [30, 19, 18, 4, 8, 26] in terms of generalization ability. We first compare the best performance of our method and Face X-ray [18] when training with one sub-dataset of FaceForensics++ [30] and testing with the other datasets. There are four sub-datasets in FaceForensics++: DeepFake [5], Face2Face [35], FaceSwap [9] and NeuralTextures [34]. We cite the two compared results directly from [18]: the XceptionNet results reported in [30] and the Face X-ray results. Table 2 summarizes the AUC results of these existing contemporary works and ours. Although we cannot reproduce exactly the same BI dataset, which was additionally generated from FaceForensics++ with the method described in [18], our framework outperforms the other two approaches in terms of best AUC.

Table 2. Generalization ability (AUC) of the Xception network, Face X-ray and ours on several datasets. BI denotes the additional dataset generated from FaceForensics++ by applying the method of Face X-ray [18].
Study             Training set   FF++    DFD    DFDC    Celeb-DF
Xception [30]     FF++           -       87.8   54.3    40.9
Face X-ray [18]   FF++ and BI    98.52   95.4   80.92   80.58
Ours              FF++           99.8    95.1   89.5    87.7

We also present the AUC obtained when verifying with a single dataset, compared with Face Warping Artifacts (FWA) [19] and Face X-ray [18], in Table 4. Our approach shows better results in both detection performance on a specific dataset and universality across face synthesizing algorithms.

Table 4. Performance (AUC) comparison with FWA and Face X-ray.
Study             FF++/DF   Celeb-DF
FWA [19]          79.2      53.8
Face X-ray [18]   99.2      74.8
Ours              99.8      95.2

Lastly, we compare with three other recent works that suggest different approaches to universality: learning innate representations of manipulated images [4, 8] and performing detection and localization of synthesized regions simultaneously [26]. Table 5 shows the generalization ability results compared with these three works [4, 8, 26].

Table 5. Performance (AUC) comparison with three other recent works, training with the Face2Face dataset.
Study                Face2Face   FaceSwap
Du et al. [8]        90.9        63.2
Nguyen et al. [26]   92.8        54.1
Riess et al. [4]     94.5        72.6
Ours                 99.8        95.5
We observe that our approach shows better performance in terms of universality across different types of synthesizing algorithms than other studies, which do not build on the manipulation procedure that the synthesizing algorithms have in common.

6. Conclusion

We present a face manipulation detection method that leverages facial boundary features corresponding to the color features modified by the boundary post-processing common to generation procedures. In addition to the facial features, background features corresponding to unchanged regions containing the image's innate fingerprints are used to detect face-synthesized images. We show the importance of using both facial edge information and background edge information, with detection performance better than other existing advanced face manipulation detection methods. Cross-validation with state-of-the-art datasets demonstrates that our approach not only performs remarkably on a specific dataset but can also be applied in real-world situations.

References

[1] Darius Afchar et al. "MesoNet: a compact facial video forgery detection network". In: 2018 IEEE International Workshop on Information Forensics and Security (WIFS). IEEE, 2018, pp. 1–7.
[2] Shruti Agarwal et al. "Protecting World Leaders Against Deep Fakes". In: CVPR Workshops. 2019, pp. 38–45.
[3] John Canny. "A computational approach to edge detection". In: IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (1986), pp. 679–698.
[4] Davide Cozzolino et al. "ForensicTransfer: Weakly-supervised domain adaptation for forgery detection". In: arXiv preprint arXiv:1812.02510 (2018).
[5] Deepfakes. deepfakes/faceswap. URL: http://www.github.com/deepfakes/faceswap.
[6] DFD. Contributing Data to Deepfake Detection Research. Sept. 2019. URL: https://ai.googleblog.com/2019/09/contributing-data-to-deepfake-detection.html.
[7] Brian Dolhansky et al. The Deepfake Detection Challenge (DFDC) Preview Dataset. 2019. arXiv: 1910.08854 [cs.CV].
[8] Mengnan Du et al. "Towards generalizable forgery detection with locality-aware autoencoder". In: arXiv preprint arXiv:1909.05999 (2019).
[9] Faceswap. MarekKowalski/FaceSwap. URL: http://www.github.com/MarekKowalski/FaceSwap.
[10] Hany Farid. "A survey of image forgery detection". In: IEEE Signal Processing Magazine 2.26 (2009), pp. 16–25.
[11] Ian Goodfellow et al. "Generative Adversarial Nets". In: Advances in Neural Information Processing Systems. Ed. by Z. Ghahramani et al. Vol. 27. Curran Associates, Inc., 2014, pp. 2672–2680. URL: https://proceedings.neurips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf.
[12] Yoon Kim. "Convolutional neural networks for sentence classification". In: arXiv preprint arXiv:1408.5882 (2014).
[13] Davis E. King. "Dlib-ml: A machine learning toolkit". In: The Journal of Machine Learning Research 10 (2009), pp. 1755–1758.
[14] Paul H. King. "Digital image processing and analysis: Human and computer applications with CVIPtools (Umbaugh, S.; 2011) [book reviews]". In: IEEE Pulse 3.4 (2012), pp. 84–85.
[15] Diederik P. Kingma and Jimmy Ba. "Adam: A method for stochastic optimization". In: arXiv preprint arXiv:1412.6980 (2014).
[16] Diederik P. Kingma and Max Welling. Auto-Encoding Variational Bayes. 2014. arXiv: 1312.6114 [stat.ML].
[17] Haodong Li et al. "Detection of deep network generated images using disparities in color components". In: arXiv preprint arXiv:1808.07276 (2018).
[18] L. Li et al. "Face X-Ray for More General Face Forgery Detection". In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020, pp. 5000–5009. DOI: 10.1109/CVPR42600.2020.00505.
[19] Yuezun Li and Siwei Lyu. "Exposing deepfake videos by detecting face warping artifacts". In: arXiv preprint arXiv:1811.00656 (2018).
[20] Yuezun Li et al. Celeb-DF: A Large-scale Challenging Dataset for DeepFake Forensics. 2020. arXiv: 1909.12962 [cs.CR].
[21] Francesco Marra et al. "Detection of GAN-generated fake images over social networks". In: 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR). IEEE, 2018, pp. 384–389.
[22] Francesco Marra et al. "Do GANs leave artificial fingerprints?" In: 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR). IEEE, 2019, pp. 506–511.
[23] Falko Matern, Christian Riess, and Marc Stamminger. "Exploiting visual artifacts to expose deepfakes and face manipulations". In: 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW). IEEE, 2019, pp. 83–92.
[24] Scott McCloskey and Michael Albright. "Detecting GAN-generated imagery using saturation cues". In: 2019 IEEE International Conference on Image Processing (ICIP). IEEE, 2019, pp. 4584–4588.
[25] H. H. Nguyen, J. Yamagishi, and I. Echizen. "Capsule-forensics: Using Capsule Networks to Detect Forged Images and Videos". In: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2019, pp. 2307–2311. DOI: 10.1109/ICASSP.2019.8682602.
[26] Huy H. Nguyen et al. "Multi-task learning for detecting and segmenting manipulated facial images and videos". In: arXiv preprint arXiv:1906.06876 (2019).
[27] Judith M. S. Prewitt. "Object enhancement and extraction". In: Picture Processing and Psychopictorics 10.1 (1970), pp. 15–19.
[28] Weize Quan et al. "Distinguishing between natural and computer-generated images using convolutional neural networks". In: IEEE Transactions on Information Forensics and Security 13.11 (2018), pp. 2772–2787.
[29] Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wierstra. "Stochastic Backpropagation and Approximate Inference in Deep Generative Models". In: ed. by Eric P. Xing and Tony Jebara. Vol. 32. Proceedings of Machine Learning Research 2. Beijing, China: PMLR, 22–24 Jun 2014, pp. 1278–1286. URL: http://proceedings.mlr.press/v32/rezende14.html.
[30] Andreas Rossler et al. "FaceForensics++: Learning to Detect Manipulated Facial Images". In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Oct. 2019.
[31] Ekraam Sabir et al. "Recurrent convolutional strategies for face manipulation detection in videos". In: Interfaces (GUI) 3.1 (2019).
[32] Irwin Sobel and Gary Feldman. "A 3x3 isotropic gradient operator for image processing". In: a talk at the Stanford Artificial Project (1968), pp. 271–272.
[33] Joel Stehouwer et al. "On the detection of digital face manipulation". In: arXiv preprint arXiv:1910.01717 (2019).
[34] Justus Thies, Michael Zollhöfer, and Matthias Nießner. "Deferred Neural Rendering: Image Synthesis Using Neural Textures". In: ACM Transactions on Graphics 38.4 (July 2019). ISSN: 0730-0301. DOI: 10.1145/3306346.3323035. URL: https://doi.org/10.1145/3306346.3323035.
[35] Justus Thies et al. "Face2Face: Real-Time Face Capture and Reenactment of RGB Videos". In: Communications of the ACM 62.1 (Dec. 2018), pp. 96–104. ISSN: 0001-0782. DOI: 10.1145/3292039. URL: https://doi.org/10.1145/3292039.
[36] Ruben Tolosana et al. "Deepfakes and beyond: A survey of face manipulation and fake detection". In: arXiv preprint arXiv:2001.00179 (2020).
[37] Cristian Vaccari and Andrew Chadwick. "Deepfakes and disinformation: exploring the impact of synthetic political video on deception, uncertainty, and trust in news". In: Social Media + Society 6.1 (2020), p. 2056305120903408.
[38] Xin Yang, Yuezun Li, and Siwei Lyu. "Exposing deep fakes using inconsistent head poses". In: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2019, pp. 8261–8265.
[39] Ning Yu, Larry S. Davis, and Mario Fritz. "Attributing fake images to GANs: Learning and analyzing GAN fingerprints". In: Proceedings of the IEEE International Conference on Computer Vision. 2019, pp. 7556–7566.
[40] Peng Zhou et al. "Two-stream neural networks for tampered face detection". In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, 2017, pp. 1831–1839.