---
license: mit
---

# Scaffold-and-Fill Diffusion (SF-Diff): A Hybrid Architecture for Accelerated Language Model Inference

**Author:** Hilal Limo (Self-Taught Independent Researcher, Age 15)

**[➡️ Click here to read the full paper: SF-Diff_Paper.pdf](SF-Diff_Paper.pdf)**

---

## Abstract

Autoregressive transformer models, the dominant architecture for modern Large Language Models (LLMs), are fundamentally constrained by high inference latency due to their sequential generation process. In this paper, I propose Scaffold-and-Fill Diffusion (SF-Diff), a novel hybrid architecture designed to significantly accelerate text generation by deconstructing the task into two parallelizable stages. The core hypothesis is that natural language can be separated into a semantic "scaffolding" of keywords and a grammatical "filler" of structural words. SF-Diff first utilizes a non-autoregressive diffusion model to generate the complete semantic scaffold—a sequence of keyword vector embeddings—in a fixed number of highly parallelizable steps. Subsequently, a lightweight autoregressive transformer decoder performs a "grammatical infilling" task, weaving the structural words around the pre-generated semantic core. This approach aims to combine the holistic, parallel generation strengths of diffusion models with the grammatical precision of transformers, offering a substantial reduction in inference latency while maintaining high-quality, coherent output.
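To make the scaffold/filler decomposition concrete, here is a minimal toy sketch of the idea. This is **not** the paper's implementation: the hand-written stopword list stands in for the learned notion of "grammatical filler," and the two plain functions stand in for the diffusion scaffold generator and the autoregressive infill decoder. It only illustrates the data flow between the two stages.

```python
# Toy illustration of the two-stage SF-Diff idea (assumed decomposition,
# not the paper's actual models).

# Stand-in for the learned distinction between semantic keywords and
# grammatical "filler" words.
FILLER_WORDS = {"the", "a", "an", "of", "to", "in", "is", "are", "and"}


def extract_scaffold(sentence: str) -> list[str]:
    """Stage 1 stand-in: keep only the semantic 'scaffold' keywords.

    In SF-Diff this role is played by a non-autoregressive diffusion
    model that emits the whole keyword sequence in parallel.
    """
    return [w for w in sentence.lower().split() if w not in FILLER_WORDS]


def grammatical_infill(scaffold: list[str], template: list[str]) -> str:
    """Stage 2 stand-in: weave structural words around the scaffold.

    `template` marks scaffold slots with None; in SF-Diff a lightweight
    autoregressive decoder would predict the filler tokens instead of
    taking them from a fixed template.
    """
    slots = iter(scaffold)
    return " ".join(next(slots) if tok is None else tok for tok in template)


scaffold = extract_scaffold("The model generates the scaffold in parallel")
print(scaffold)
# ['model', 'generates', 'scaffold', 'parallel']
print(grammatical_infill(scaffold, ["the", None, None, "the", None, "in", None]))
# the model generates the scaffold in parallel
```

The point of the sketch is the interface, not the models: stage 1 produces a short, order-preserving keyword sequence, and stage 2 only has to solve the easier, local problem of restoring grammatical glue around it.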
---

## Citation

If you find this work interesting, please consider citing the paper:

```bibtex
@misc{limo2025sfdiff,
  author       = {Hilal Limo},
  title        = {Scaffold-and-Fill Diffusion (SF-Diff): A Hybrid Architecture for Accelerated Language Model Inference},
  year         = {2025},
  publisher    = {Hugging Face},
  journal      = {Hugging Face Hub},
  howpublished = {\url{https://huggingface.co/TimesLast/SF-Diff}}
}
```