A Machine Learning Based Approach for Deepfake Detection in Social Media Through Key Video Frame Extraction

DOI: 10.1007/s42979-021-00495-x Publication Date: 2021-02-18T00:19:16Z
ABSTRACT
In the last few years, with the advent of deepfake videos, image forgery has become a serious threat. In a deepfake video, a person’s face, emotion, or speech is replaced by someone else’s face or by a different emotion or speech, using deep learning technology. These videos are often so sophisticated that traces of manipulation are difficult to detect. They can have a heavy social, political, and emotional impact on individuals, as well as on society. Social media are the most common and serious targets, as they are vulnerable platforms that can be exploited to blackmail or defame a person. There are some existing works on detecting deepfake videos, but very few attempts have been made for videos in social media. The first step in preempting such misleading deepfake videos on social media is to detect them. Our paper presents a novel neural network-based method to detect fake videos. We applied a key video frame extraction technique to reduce the computation in detecting deepfake videos. A model, consisting of a convolutional neural network (CNN) and a classifier network, is proposed along with the algorithm. The Xception net has been chosen over two other structures, InceptionV3 and ResNet50, for pairing with our classifier. Our model is a visual artifact-based detection technique. The feature vectors from the CNN module are used as the input of the subsequent classifier network for classifying the video. We used the FaceForensics++ and Deepfake Detection Challenge datasets to reach the best model. Our model detects highly compressed deepfake videos in social media with very high accuracy and lowered computational requirements. We achieved 98.5% accuracy with the FaceForensics++ dataset and 92.33% accuracy with a combined dataset of FaceForensics++ and Deepfake Detection Challenge. Any autoencoder-generated video can be detected by our model. Our method has detected almost all fake videos if they possess more than one key video frame. The accuracy reported here is for detecting fake videos when the number of key video frames is one. The simplicity of the method will help people to check the authenticity of a video. Our work is focused on, but not limited to, addressing the social and economic issues caused by fake videos in social media. In this paper, we achieve high accuracy without training the model with an enormous amount of data. The key video frame extraction method reduces the computations significantly compared to existing works.
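The abstract does not spell out how key video frames are identified, so the following is only a minimal sketch of the general idea: a small subset of frames is kept, and only those frames would be passed on to the CNN and classifier. The selection rule here (mean absolute pixel difference against the last kept frame), the threshold value, the toy frame representation, and the function names `mean_abs_diff` and `select_key_frames` are all illustrative assumptions, not the authors' method.

```python
from typing import List, Sequence

# Hypothetical toy representation: a frame is a flattened list of grayscale pixels.
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(frames: List[Frame], threshold: float = 20.0) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame.

    The first frame is always kept. The threshold is an illustrative value,
    not one taken from the paper.
    """
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept


# Toy "video": six 4-pixel frames with one abrupt scene change at index 3.
video = [
    [10, 10, 10, 10],
    [11, 10, 9, 10],
    [12, 11, 10, 10],
    [200, 190, 210, 205],
    [201, 191, 209, 204],
    [199, 192, 211, 206],
]

key_indices = select_key_frames(video)
# Only this fraction of frames would need to go through the (expensive) CNN.
fraction_kept = len(key_indices) / len(video)
print(key_indices, fraction_kept)
```

In this toy example only two of the six frames are kept, which illustrates where the computational saving comes from: the detector runs on the key frames rather than on every frame of the video.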