rasbt committed on
Commit 8a3862b · verified · 1 Parent(s): 0410d2d

Update README.md

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -23,6 +23,20 @@ The model weights included here are PyTorch state dicts converted from the offic
 
  To avoid duplication and ease maintenance, this repository only contains the model weights; the self-contained source code can be found [here](https://github.com/rasbt/LLMs-from-scratch/blob/main/pkg/llms_from_scratch/qwen3.py). Instructions on how to use the code are provided below.
 
+
+
+ # Qwen3 from-scratch code
+
+ The standalone notebooks in this folder contain from-scratch code in a linear fashion:
+
+ 1. [standalone-qwen3.ipynb](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/11_qwen3/standalone-qwen3.ipynb): The dense Qwen3 model without bells and whistles
+ 2. [standalone-qwen3-plus-kvcache.ipynb](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/11_qwen3/standalone-qwen3-plus-kvcache.ipynb): Same as above but with KV cache for better inference efficiency
+ 3. [standalone-qwen3-moe.ipynb](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/11_qwen3/standalone-qwen3-moe.ipynb): Like the first notebook, but for the Mixture-of-Experts (MoE) variant
+ 4. [standalone-qwen3-moe-plus-kvcache.ipynb](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/11_qwen3/standalone-qwen3-moe-plus-kvcache.ipynb): Same as above but with KV cache for better inference efficiency
+
+ Alternatively, I also organized the code into a Python package (including unit tests and CI), which you can run as described below.
+
+
 
 
  ### Using Qwen3 0.6B via the `llms-from-scratch` package