---
license: mit # Or apache-2.0, gpl-3.0, etc. Choose the license that applies to your project.
tags:
- deepfake-detection
- video-classification
- computer-vision
- xception
- lstm
model-index:
- name: Deepfake Detection Model
  results:
  - task:
      type: video-classification
      name: Video Classification
    dataset:
      name: Your_Dataset_Name # Replace with the actual dataset you trained on (e.g., FaceForensics++, Celeb-DF)
      type: image-folder
      split: validation # Or test, or train
    metrics:
    - type: accuracy
      value: 0.95 # Replace with your model's actual accuracy
      name: Accuracy
    - type: f1 # Add other relevant metrics such as precision and recall as needed
      value: 0.94 # Replace with your model's actual F1 score
      name: F1 Score
---

# Deepfake Detection Model

This repository contains a deepfake detection model built from a pre-trained Xception network combined with an LSTM layer. The model classifies videos as either "Real" or "Fake" by analyzing sequences of facial frames extracted from the video.

### Model Architecture

The model consists of the following components (the full definition appears in the `build_model` function below):

1. **Input**: Accepts a sequence of `TIME_STEPS` frames, each resized to `299x299` pixels.
2. **Feature Extraction**: A **TimeDistributed Xception network** processes each frame independently, extracting a feature vector per frame.
3. **Temporal Learning**: An **LSTM layer** with `256` units learns temporal dependencies between the extracted frame features.
4. **Regularization**: A **Dropout layer** (rate `0.5`) reduces overfitting.
5. **Output**: A **Dense layer** with `softmax` activation predicts probabilities for the "Real" and "Fake" classes.

### How to Use

#### 1. Setup

Clone the repository and install the required libraries:

```bash
pip install tensorflow opencv-python numpy mtcnn Pillow
```

#### 2. Model Loading

The model weights are loaded from `COMBINED_best_Phase1.keras`. Ensure this file is accessible at the specified `model_path`.

```python
model_path = '/content/drive/MyDrive/Dataset DDM/FINAL models/COMBINED_best_Phase1.keras'

model = build_model()  # Architecture defined in the `build_model` function below
model.load_weights(model_path)
```

The `build_model` function defines the architecture as follows:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Global parameters for the model input shape (define these before calling build_model)
TIME_STEPS = 30
HEIGHT = 299
WIDTH = 299

def build_model(lstm_hidden_size=256, num_classes=2, dropout_rate=0.5):
    # Input shape: (batch_size, TIME_STEPS, HEIGHT, WIDTH, 3)
    inputs = layers.Input(shape=(TIME_STEPS, HEIGHT, WIDTH, 3))

    # Xception backbone with global average pooling: one 2048-d feature
    # vector per frame
    base_model = keras.applications.Xception(weights='imagenet', include_top=False, pooling='avg')
    # For inference, trainable does not need to be set; if you plan to
    # retrain, set it accordingly:
    # base_model.trainable = False

    # Apply the backbone to every frame in the sequence
    x = layers.TimeDistributed(base_model)(inputs)  # x shape: (batch_size, TIME_STEPS, 2048)

    # The LSTM learns temporal dependencies across the frame features
    x = layers.LSTM(lstm_hidden_size)(x)
    x = layers.Dropout(dropout_rate)(x)
    outputs = layers.Dense(num_classes, activation='softmax')(x)

    model = keras.Model(inputs, outputs)
    return model
```
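After loading the weights, it is worth sanity-checking that the restored model matches the shapes described above. This check is not part of the original card; it is just a quick way to confirm the architecture was built as described:

```python
# Optional sanity check: the model should expect sequences of TIME_STEPS
# frames at 299x299 with 3 channels, and output two class probabilities.
print(model.input_shape)   # Expected: (None, 30, 299, 299, 3)
print(model.output_shape)  # Expected: (None, 2)
```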
#### 3. Extract Faces

Use the `extract_faces_from_video` function to get preprocessed face frames from your video. This function handles face detection (using MTCNN), resizing, and Xception preprocessing. A hedged example implementation is sketched at the end of this card.

```python
from mtcnn import MTCNN
import cv2
import numpy as np
from PIL import Image
from tensorflow.keras.applications.xception import preprocess_input

def extract_faces_from_video(video_path, num_frames=30):
    # ... (function implementation to extract and preprocess faces)
    pass

# Ensure TIME_STEPS is defined, as it is used by extract_faces_from_video
# TIME_STEPS = 30

video_path = 'path/to/your/video.mp4'  # Replace with your video
video_array = extract_faces_from_video(video_path, num_frames=TIME_STEPS)
```

#### 4. Prediction

Once `video_array` (the preprocessed frames, shaped with a leading batch dimension) is ready, make a prediction with the loaded model:

```python
predictions = model.predict(video_array)
predicted_class = np.argmax(predictions, axis=1)[0]
probabilities = predictions[0]

class_names = ['Real', 'Fake']
print(f"Predicted Class: {class_names[predicted_class]}")
print(f"Class Probabilities: Real: {probabilities[0]:.4f}, Fake: {probabilities[1]:.4f}")
```
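The body of `extract_faces_from_video` is elided above. For readers who want a starting point, here is one possible implementation consistent with the pipeline this card describes (MTCNN detection, resizing to `299x299`, Xception `preprocess_input`, and a leading batch dimension). It is a minimal sketch under those assumptions, not the function the model was trained with; frame sampling and face-crop handling in particular may differ from the original preprocessing.

```python
# Hypothetical sketch of extract_faces_from_video -- NOT the original
# implementation, which is not included in this card.
from mtcnn import MTCNN
import cv2
import numpy as np
from tensorflow.keras.applications.xception import preprocess_input

HEIGHT, WIDTH = 299, 299  # Matches the model's expected frame size

def extract_faces_from_video(video_path, num_frames=30):
    detector = MTCNN()
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Sample num_frames frame indices evenly across the video
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)

    faces = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        detections = detector.detect_faces(rgb)
        if detections:
            # Take the most confident detection and crop the face region
            x, y, w, h = max(detections, key=lambda d: d['confidence'])['box']
            x, y = max(x, 0), max(y, 0)
            crop = rgb[y:y + h, x:x + w]
        else:
            crop = rgb  # Fallback: use the full frame if no face is found
        crop = cv2.resize(crop, (WIDTH, HEIGHT))
        faces.append(crop.astype(np.float32))
    cap.release()

    if not faces:
        return None
    # Pad by repeating the last face if the video yielded too few frames
    while len(faces) < num_frames:
        faces.append(faces[-1])

    frames = preprocess_input(np.stack(faces))  # Scale to [-1, 1] as Xception expects
    return np.expand_dims(frames, axis=0)       # Shape: (1, num_frames, HEIGHT, WIDTH, 3)
```

With this sketch, the returned array has shape `(1, num_frames, 299, 299, 3)`, so it can be passed to `model.predict` directly in step 4.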
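Putting the steps together, scoring a batch of videos could look like the following. The `videos/*.mp4` glob is a hypothetical example path, and `extract_faces_from_video` is assumed to return `None` when no frames could be extracted, as in the sketch above:

```python
import glob

class_names = ['Real', 'Fake']

for path in glob.glob('videos/*.mp4'):  # Hypothetical folder of videos to score
    video_array = extract_faces_from_video(path, num_frames=TIME_STEPS)
    if video_array is None:
        print(f"{path}: no usable frames extracted")
        continue
    probs = model.predict(video_array, verbose=0)[0]
    label = class_names[int(np.argmax(probs))]
    print(f"{path}: {label} (Real={probs[0]:.4f}, Fake={probs[1]:.4f})")
```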