Redgerd committed on
Commit 9a0ca21 · verified · 1 Parent(s): 6d773c5

Update README.md

Files changed (1)
  1. README.md +73 -32
README.md CHANGED
@@ -1,20 +1,41 @@
- ## Deepfake Detection Model
-
  This repository contains a deepfake detection model built using a combination of a pre-trained Xception network and an LSTM layer. The model is designed to classify videos as either "Real" or "Fake" by analyzing sequences of facial frames extracted from the video.

  ### Model Architecture

  The model architecture consists of the following components:

- 1. **Input Layer**: Takes a sequence of `TIME_STEPS` frames, each resized to `299x299` pixels with 3 color channels. The input shape is `(batch_size, TIME_STEPS, HEIGHT, WIDTH, 3)`.
-
- 2. **TimeDistributed Xception**: A pre-trained Xception network (trained on ImageNet) is applied to each frame independently using a `TimeDistributed` wrapper. With `include_top=False` and `pooling='avg'`, the Xception network serves as a per-frame feature extractor, producing one feature vector per frame.
-
- 3. **LSTM Layer**: The sequence of feature vectors from the `TimeDistributed` Xception layer is fed into an LSTM (Long Short-Term Memory) layer with `256` hidden units. The LSTM learns temporal dependencies between frames, which is crucial for deepfake detection.
-
- 4. **Dropout Layer**: A `Dropout` layer with a rate of `0.5` is applied after the LSTM layer to prevent overfitting.
-
- 5. **Output Layer**: A `Dense` layer with `2` units and a `softmax` activation function outputs the probabilities for the two classes: "Real" and "Fake".

  ### How to Use

@@ -36,19 +57,47 @@ model = build_model() # Architecture defined in the `build_model` function
  model.load_weights(model_path)
  ```

- #### 3\. Face Extraction and Preprocessing
-
- The `extract_faces_from_video` function processes a given video file:
-
- * It uses MTCNN (Multi-task Cascaded Convolutional Networks) for robust face detection in each frame.
- * It samples `TIME_STEPS` frames from the video.
- * For each sampled frame, it detects the primary face, extracts it, and resizes it to `299x299` pixels.
- * The extracted face images are then preprocessed using `preprocess_input` from `tensorflow.keras.applications.xception`, which scales pixel values to the range expected by the Xception model.
- * If no face is detected in a frame, a black image of the same dimensions is used as a placeholder.
- * The function ensures that exactly `TIME_STEPS` frames are returned, padding with the last available frame or black images if necessary.
-
- <!-- end list -->

  ```python
  from mtcnn import MTCNN
  import cv2
@@ -57,10 +106,12 @@ from PIL import Image
  from tensorflow.keras.applications.xception import preprocess_input

  def extract_faces_from_video(video_path, num_frames=30):
-     # ... (function implementation as provided in prediction.ipynb)
      pass

- video_path = '/content/drive/MyDrive/Dataset DDM/FF++/manipulated_sequences/FaceShifter/raw/videos/724_725.mp4'
  video_array = extract_faces_from_video(video_path, num_frames=TIME_STEPS)
  ```

@@ -78,14 +129,4 @@ print(f"Predicted Class: {class_names[predicted_class]}")
  print(f"Class Probabilities: Real: {probabilities[0]:.4f}, Fake: {probabilities[1]:.4f}")
  ```

- ### Parameters
-
- * `TIME_STEPS`: Number of frames to extract from each video (default: `30`).
- * `HEIGHT`, `WIDTH`: Dimensions to which each extracted face image is resized (default: `299, 299`).
- * `lstm_hidden_size`: Number of hidden units in the LSTM layer (default: `256`).
- * `dropout_rate`: Dropout rate applied after the LSTM layer (default: `0.5`).
- * `num_classes`: Number of output classes (default: `2` for "Real" and "Fake").
-
- ### Development Environment
-
- The provided code is written in Python and utilizes `tensorflow` (Keras API), `opencv-python`, `numpy`, `mtcnn`, and `Pillow`. It is designed to be run in an environment with these libraries installed. The paths suggest it was developed using Google Drive, potentially within a Colab environment.

+ ---
+ license: mit # Or apache-2.0, gpl-3.0, etc. Choose the license that applies to your project.
+ tags:
+ - deepfake-detection
+ - video-classification
+ - computer-vision
+ - xception
+ - lstm
+ model-index:
+ - name: Deepfake Detection Model
+   results:
+   - task:
+       type: video-classification
+       name: Video Classification
+     dataset:
+       name: Your_Dataset_Name # Replace with the actual dataset you trained on (e.g., FaceForensics++, Celeb-DF)
+       type: image-folder
+       split: validation # Or test, or train
+     metrics:
+     - type: accuracy
+       value: 0.95 # Replace with your model's actual accuracy
+       name: Accuracy
+     - type: f1 # Add other relevant metrics like F1-score, precision, recall
+       value: 0.94 # Replace with your model's actual F1 score
+       name: F1 Score
+ ---
+ # Deepfake Detection Model
  This repository contains a deepfake detection model built using a combination of a pre-trained Xception network and an LSTM layer. The model is designed to classify videos as either "Real" or "Fake" by analyzing sequences of facial frames extracted from the video.

  ### Model Architecture

  The model architecture consists of the following components:

+ 1. **Input**: Accepts a sequence of `TIME_STEPS` frames, each resized to `299x299` pixels.
+ 2. **Feature Extraction**: A **TimeDistributed Xception network** processes each frame, extracting key features.
+ 3. **Temporal Learning**: An **LSTM layer** with `256` units learns temporal dependencies between these extracted frame features.
+ 4. **Regularization**: A **Dropout layer** (`0.5` rate) prevents overfitting.
+ 5. **Output**: A **Dense layer** with `softmax` activation predicts probabilities for "Real" and "Fake" classes.
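For readers tracing dimensions, the tensor shapes through one forward pass work out as follows (a sketch assuming the defaults `TIME_STEPS=30` and `HEIGHT=WIDTH=299`; Xception's average-pooled feature vector is 2048-dimensional):

```python
# Shape of a batch as it flows through the model:
# Input                                     -> (batch, 30, 299, 299, 3)
# TimeDistributed(Xception, pooling='avg')  -> (batch, 30, 2048)
# LSTM(256)                                 -> (batch, 256)
# Dropout(0.5)                              -> (batch, 256)
# Dense(2, softmax)                         -> (batch, 2)
```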
 
 
  ### How to Use

 
  model.load_weights(model_path)
  ```

+ #### 3\. Model Definition
+
+ The `build_model` function defines the architecture:
+
+ ```python
+ import tensorflow as tf
+ from tensorflow import keras
+ from tensorflow.keras import layers
+
+ # Global parameters for model input shape (ensure these are defined before calling build_model)
+ # Example:
+ # TIME_STEPS = 30
+ # HEIGHT = 299
+ # WIDTH = 299
+
+ def build_model(lstm_hidden_size=256, num_classes=2, dropout_rate=0.5):
+     # Input shape: (batch_size, TIME_STEPS, HEIGHT, WIDTH, 3)
+     inputs = layers.Input(shape=(TIME_STEPS, HEIGHT, WIDTH, 3))
+
+     # Pre-trained base model, applied to each frame via a TimeDistributed wrapper
+     base_model = keras.applications.Xception(weights='imagenet', include_top=False, pooling='avg')
+     # For inference there is no need to set trainable; if you plan to retrain, set it accordingly
+     # base_model.trainable = False
+
+     # Apply the TimeDistributed wrapper
+     x = layers.TimeDistributed(base_model)(inputs)
+     # x shape: (batch_size, TIME_STEPS, 2048)
+
+     # LSTM layer
+     x = layers.LSTM(lstm_hidden_size)(x)
+
+     x = layers.Dropout(dropout_rate)(x)
+     outputs = layers.Dense(num_classes, activation='softmax')(x)
+
+     model = keras.Model(inputs, outputs)
+     return model
+ ```
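As a quick sanity check, the model can be assembled and inspected like this (a sketch; the global values must match the configuration the weights were trained with):

```python
# Assumed globals; must match the training configuration.
TIME_STEPS, HEIGHT, WIDTH = 30, 299, 299

model = build_model(lstm_hidden_size=256, num_classes=2, dropout_rate=0.5)
model.summary()  # the final Dense layer should report output shape (None, 2)
```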
+
+ #### 4\. Extract Faces
+
+ Use the `extract_faces_from_video` function to get preprocessed face frames from your video. This function handles face detection (using MTCNN), resizing, and preprocessing.
  ```python
  from mtcnn import MTCNN
  import cv2

  from tensorflow.keras.applications.xception import preprocess_input

  def extract_faces_from_video(video_path, num_frames=30):
+     # ... (function implementation to extract and preprocess faces)
      pass

+ # Ensure TIME_STEPS is defined, as it's used by extract_faces_from_video
+ # TIME_STEPS = 30
+ video_path = 'path/to/your/video.mp4' # Replace with your video
  video_array = extract_faces_from_video(video_path, num_frames=TIME_STEPS)
  ```
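Since the function body is left as a stub, here is a minimal sketch of an implementation matching the behavior described in this README (MTCNN detection of the primary face, `299x299` resizing, Xception preprocessing, black-frame placeholders, and padding to exactly `num_frames`); the evenly spaced frame sampling is an assumption:

```python
import cv2
import numpy as np
from mtcnn import MTCNN
from tensorflow.keras.applications.xception import preprocess_input

HEIGHT, WIDTH = 299, 299

def extract_faces_from_video(video_path, num_frames=30):
    detector = MTCNN()
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Sample num_frames indices evenly across the video (assumed strategy).
    indices = np.linspace(0, max(total - 1, 0), num_frames, dtype=int)

    faces = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        detections = detector.detect_faces(rgb)
        if detections:
            # Take the highest-confidence ("primary") face.
            x, y, w, h = max(detections, key=lambda d: d['confidence'])['box']
            x, y = max(x, 0), max(y, 0)
            face = cv2.resize(rgb[y:y + h, x:x + w], (WIDTH, HEIGHT))
        else:
            # No face detected: use a black placeholder image.
            face = np.zeros((HEIGHT, WIDTH, 3), dtype=np.uint8)
        faces.append(face.astype(np.float32))
    cap.release()

    # Pad with the last available frame (or black images) to exactly num_frames.
    while len(faces) < num_frames:
        faces.append(faces[-1] if faces else np.zeros((HEIGHT, WIDTH, 3), dtype=np.float32))

    # Scale pixel values to the range expected by Xception ([-1, 1]).
    return preprocess_input(np.stack(faces))
```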
  print(f"Class Probabilities: Real: {probabilities[0]:.4f}, Fake: {probabilities[1]:.4f}")
  ```
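The elided portion of this snippet produces `probabilities` and `predicted_class`; hypothetical glue code for it, assuming `model` and `video_array` from the steps above and a ("Real", "Fake") class order, would be:

```python
import numpy as np

class_names = ['Real', 'Fake']  # assumed class order

# Add a batch dimension: (TIME_STEPS, H, W, 3) -> (1, TIME_STEPS, H, W, 3)
batch = np.expand_dims(video_array, axis=0)
probabilities = model.predict(batch)[0]       # softmax output, shape (num_classes,)
predicted_class = int(np.argmax(probabilities))
```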

+ <!-- end list -->