rasbt/qwen3-from-scratch

rasbt committed on Sep 6

Commit 8a3862b · verified · 1 Parent(s): 0410d2d

Update README.md

Files changed (1)

README.md +14 -0

@@ -23,6 +23,20 @@ The model weights included here are PyTorch state dicts converted from the offic
 To avoid duplication and ease maintenance, this repository only contains the model weights; the self-contained source code can be found [here](https://github.com/rasbt/LLMs-from-scratch/blob/main/pkg/llms_from_scratch/qwen3.py). Instructions on how to use the code are provided below.
+&nbsp;
+# Qwen3 from-scratch code
+The standalone notebooks in this folder contain the from-scratch code in a linear fashion:
+1. [standalone-qwen3.ipynb](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/11_qwen3/standalone-qwen3.ipynb): The dense Qwen3 model without bells and whistles
+2. [standalone-qwen3-plus-kvcache.ipynb](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/11_qwen3/standalone-qwen3-plus-kvcache.ipynb): Same as above, but with a KV cache for more efficient inference
+3. [standalone-qwen3-moe.ipynb](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/11_qwen3/standalone-qwen3-moe.ipynb): Like the first notebook, but for the Mixture-of-Experts (MoE) variant
+4. [standalone-qwen3-moe-plus-kvcache.ipynb](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/11_qwen3/standalone-qwen3-moe-plus-kvcache.ipynb): Same as above, but with a KV cache for more efficient inference
+Alternatively, I also organized the code into a Python package (including unit tests and CI), which you can run as described below.
 &nbsp;
 ### Using Qwen3 0.6B via the `llms-from-scratch` package