# Deepfake Detection Model
This repository contains a deepfake detection model built using a combination of a pre-trained Xception network and an LSTM layer. The model is designed to classify videos as either "Real" or "Fake" by analyzing sequences of facial frames extracted from the video.
## Model Architecture
The model architecture consists of the following components:
- **Input Layer**: Takes a sequence of `TIME_STEPS` frames, each resized to `299x299` pixels with 3 color channels. The input shape is `(batch_size, TIME_STEPS, HEIGHT, WIDTH, 3)`.
- **TimeDistributed Xception**: A pre-trained Xception network (trained on ImageNet) is applied to each frame independently using a `TimeDistributed` wrapper. `include_top` is set to `False` and `pooling` is set to `'avg'`, so the Xception network acts as a per-frame feature extractor. This produces a sequence of feature vectors, one for each frame.
- **LSTM Layer**: The sequence of feature vectors from the `TimeDistributed` Xception layer is fed into an LSTM (Long Short-Term Memory) layer with `256` hidden units. The LSTM layer learns temporal dependencies between frames, which is crucial for deepfake detection.
- **Dropout Layer**: A `Dropout` layer with a rate of `0.5` is applied after the LSTM layer to prevent overfitting.
- **Output Layer**: A `Dense` layer with `2` units and a `softmax` activation function outputs the probabilities for the two classes: "Real" and "Fake".
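This architecture is constructed by the `build_model` function used in the loading step below. The exact implementation lives in the repository; the following is a minimal Keras sketch of the layers described above, with `TIME_STEPS`, `HEIGHT`, and `WIDTH` taken from the Parameters section:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import Xception

TIME_STEPS = 30
HEIGHT, WIDTH = 299, 299

def build_model(lstm_hidden_size=256, dropout_rate=0.5, num_classes=2):
    # Illustrative sketch of the architecture described above; the
    # repository's actual `build_model` may differ in minor details.
    # Per-frame feature extractor: ImageNet-pretrained Xception without its
    # classification head, global-average-pooled to one vector per frame.
    backbone = Xception(weights='imagenet', include_top=False, pooling='avg',
                        input_shape=(HEIGHT, WIDTH, 3))

    inputs = layers.Input(shape=(TIME_STEPS, HEIGHT, WIDTH, 3))
    x = layers.TimeDistributed(backbone)(inputs)   # (batch, TIME_STEPS, features)
    x = layers.LSTM(lstm_hidden_size)(x)           # temporal modelling across frames
    x = layers.Dropout(dropout_rate)(x)
    outputs = layers.Dense(num_classes, activation='softmax')(x)
    return models.Model(inputs, outputs)
```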
## How to Use

### 1. Setup

Clone the repository and install the required libraries:

```bash
pip install tensorflow opencv-python numpy mtcnn Pillow
```
### 2. Model Loading

The model weights are loaded from `COMBINED_best_Phase1.keras`. Ensure this file is accessible at the specified `model_path`.

```python
model_path = '/content/drive/MyDrive/Dataset DDM/FINAL models/COMBINED_best_Phase1.keras'
model = build_model()  # Architecture defined in the `build_model` function
model.load_weights(model_path)
```
### 3. Face Extraction and Preprocessing

The `extract_faces_from_video` function processes a given video file:
- It uses MTCNN (Multi-task Cascaded Convolutional Networks) for robust face detection in each frame.
- It samples `TIME_STEPS` frames from the video.
- For each sampled frame, it detects the primary face, extracts it, and resizes it to `299x299` pixels.
- The extracted face images are then preprocessed using `preprocess_input` from `tensorflow.keras.applications.xception`, which scales pixel values to the range expected by the Xception model.
- If no face is detected in a frame, a black image of the same dimensions is used as a placeholder.
- The function ensures that exactly `TIME_STEPS` frames are returned, padding with the last available frame or black images if necessary.
```python
from mtcnn import MTCNN
import cv2
import numpy as np
from PIL import Image
from tensorflow.keras.applications.xception import preprocess_input

def extract_faces_from_video(video_path, num_frames=30):
    # ... (function implementation as provided in prediction.ipynb)
    pass

video_path = '/content/drive/MyDrive/Dataset DDM/FF++/manipulated_sequences/FaceShifter/raw/videos/724_725.mp4'
video_array = extract_faces_from_video(video_path, num_frames=TIME_STEPS)
```
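The notebook's exact implementation is not reproduced here. As a self-contained starting point, the sketch below follows the steps listed above (even frame sampling, primary-face cropping, black-frame placeholders, padding, Xception preprocessing) and adds a leading batch dimension so the result can be passed directly to `model.predict`. Details such as the frame-sampling strategy are assumptions and may differ from the original function.

```python
import cv2
import numpy as np
from mtcnn import MTCNN
from tensorflow.keras.applications.xception import preprocess_input

def extract_faces_from_video(video_path, num_frames=30, size=(299, 299)):
    # Illustrative sketch, not the notebook's code.
    detector = MTCNN()
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Sample num_frames indices evenly across the whole video.
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)

    faces = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        detections = detector.detect_faces(rgb)
        if detections:
            # Treat the highest-confidence detection as the primary face.
            x, y, w, h = max(detections, key=lambda d: d['confidence'])['box']
            x, y = max(x, 0), max(y, 0)
            face = rgb[y:y + h, x:x + w]
        else:
            face = np.zeros((size[1], size[0], 3), dtype=np.uint8)
        if face.size == 0:  # guard against degenerate detection boxes
            face = np.zeros((size[1], size[0], 3), dtype=np.uint8)
        faces.append(cv2.resize(face, size))
    cap.release()

    # Pad with the last available frame (or black frames) to exactly num_frames.
    while len(faces) < num_frames:
        faces.append(faces[-1] if faces else np.zeros((size[1], size[0], 3), dtype=np.uint8))

    # Scale pixel values to the range expected by Xception and add a batch dimension.
    frames = preprocess_input(np.array(faces, dtype=np.float32))
    return np.expand_dims(frames, axis=0)  # (1, num_frames, height, width, 3)
```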
### 4. Prediction

Once the `video_array` (preprocessed frames) is ready, you can make a prediction using the loaded model:
```python
predictions = model.predict(video_array)
predicted_class = np.argmax(predictions, axis=1)[0]
probabilities = predictions[0]

class_names = ['Real', 'Fake']
print(f"Predicted Class: {class_names[predicted_class]}")
print(f"Class Probabilities: Real: {probabilities[0]:.4f}, Fake: {probabilities[1]:.4f}")
```
## Parameters

- `TIME_STEPS`: Number of frames to extract from each video (default: `30`).
- `HEIGHT`, `WIDTH`: Dimensions to which each extracted face image is resized (default: `299, 299`).
- `lstm_hidden_size`: Number of hidden units in the LSTM layer (default: `256`).
- `dropout_rate`: Dropout rate applied after the LSTM layer (default: `0.5`).
- `num_classes`: Number of output classes (default: `2`, for "Real" and "Fake").
## Development Environment

The provided code is written in Python and uses `tensorflow` (Keras API), `opencv-python`, `numpy`, `mtcnn`, and `Pillow`. It is designed to be run in an environment with these libraries installed. The paths suggest it was developed using Google Drive, likely within a Colab environment.