ankitkushwaha90 committed · Commit 964a3d3 · verified · 1 Parent(s): 049f424

Create all-types-of-transformers.md

Perfect ✅ Here’s a ready-to-use table of Hugging Face Transformers models for different data types, with the exact model IDs you can load directly.

## 🔹 Hugging Face Transformer Models by Data Type

| Data Type | Model Name | Hugging Face ID |
| ----------------------------------- | --------------------------- | --------------------------------------------- |
| **Text (NLP)** | BERT | `bert-base-uncased` |
| | GPT-2 | `gpt2` |
| | T5 | `t5-small` |
| | DistilBERT | `distilbert-base-uncased` |
| **Image (Vision)** | Vision Transformer (ViT) | `google/vit-base-patch16-224` |
| | DeiT (Data-efficient ViT) | `facebook/deit-base-distilled-patch16-224` |
| | BEiT | `microsoft/beit-base-patch16-224-pt22k-ft22k` |
| | Swin Transformer | `microsoft/swin-tiny-patch4-window7-224` |
| **Audio / Speech** | Wav2Vec2 (ASR) | `facebook/wav2vec2-base-960h` |
| | Whisper (ASR + translation) | `openai/whisper-small` |
| | HuBERT (speech features) | `facebook/hubert-base-ls960` |
| | SpeechT5 (TTS + STT) | `microsoft/speecht5_asr` |
| **Video** | TimeSformer | `facebook/timesformer-base-finetuned-k400` |
| | VideoMAE | `MCG-NJU/videomae-base` |
| **Multimodal (Text + Image/Video)** | CLIP | `openai/clip-vit-base-patch32` |
| | BLIP (captioning, VQA) | `Salesforce/blip-image-captioning-base` |
| | ViLT (vision-language) | `dandelin/vilt-b32-finetuned-coco` |

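For programmatic lookup, the table above can be mirrored in a plain Python dict. This is a minimal sketch: the `MODELS_BY_TYPE` name and structure are my own illustration, not a `transformers` API, but the model IDs are copied verbatim from the table.

```python
# Model IDs from the table above, keyed by data type.
# The dict name and layout are illustrative, not part of transformers.
MODELS_BY_TYPE = {
    "text": ["bert-base-uncased", "gpt2", "t5-small", "distilbert-base-uncased"],
    "image": [
        "google/vit-base-patch16-224",
        "facebook/deit-base-distilled-patch16-224",
        "microsoft/beit-base-patch16-224-pt22k-ft22k",
        "microsoft/swin-tiny-patch4-window7-224",
    ],
    "audio": [
        "facebook/wav2vec2-base-960h",
        "openai/whisper-small",
        "facebook/hubert-base-ls960",
        "microsoft/speecht5_asr",
    ],
    "video": ["facebook/timesformer-base-finetuned-k400", "MCG-NJU/videomae-base"],
    "multimodal": [
        "openai/clip-vit-base-patch32",
        "Salesforce/blip-image-captioning-base",
        "dandelin/vilt-b32-finetuned-coco",
    ],
}

print(MODELS_BY_TYPE["audio"][1])  # → openai/whisper-small
```

Any ID in these lists can be passed straight to `from_pretrained`.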
## 🔹 Example Usage (generic)
```python
from transformers import AutoModel, AutoProcessor

# Example: load a Vision Transformer checkpoint from the table above
model_id = "google/vit-base-patch16-224"
model = AutoModel.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

print("Loaded:", model_id)
```

✅ This table should save you time: just copy the model ID and plug it into `AutoModel` / `AutoProcessor` / `AutoTokenizer`, depending on the task.
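As a sketch of that "depending on the task" step, here is one possible mapping from data type to the Auto class names you would typically reach for. The `auto_classes_for` helper is my own illustration, not a `transformers` API, and the pairings are common defaults rather than hard rules.

```python
# Illustrative helper (not a transformers API): which Auto classes
# typically pair with each data type from the table.
def auto_classes_for(data_type: str) -> tuple[str, str]:
    mapping = {
        "text": ("AutoTokenizer", "AutoModel"),
        "image": ("AutoProcessor", "AutoModel"),
        "audio": ("AutoProcessor", "AutoModel"),
        "video": ("AutoProcessor", "AutoModel"),
        "multimodal": ("AutoProcessor", "AutoModel"),
    }
    if data_type not in mapping:
        raise ValueError(f"Unknown data type: {data_type!r}")
    return mapping[data_type]

print(auto_classes_for("text"))  # → ('AutoTokenizer', 'AutoModel')
```

For some checkpoints (e.g. Whisper or BLIP), a task-specific class such as a conditional-generation model is the better fit, so treat this mapping as a starting point.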

👉 Do you want me to also write a full script that auto-detects the data type (text, image, audio, video) and loads the right Hugging Face model? That way, you can just pass "text", "image", "audio", or "video" and it works automatically.