A high-quality Vietnamese pretraining dataset for LLMs
models: 0 (none public yet)
datasets: 17
group2sealion/vnu-hard-clean • Viewer • Updated • 29.8k • 19
group2sealion/web_science_extract • Viewer • Updated • 11.6k • 20
group2sealion/qwen-gen-vnu • Viewer • Updated • 856 • 15
group2sealion/vnu_crawl • Viewer • Updated • 42.2k • 18
group2sealion/15mil_milestone • Viewer • Updated • 2.43M • 5
group2sealion/sft_eval • Viewer • Updated • 223 • 1
group2sealion/4mil_milestone • Viewer • Updated • 2.53M • 30
group2sealion/11mil_last • Viewer • Updated • 1.85M • 18
group2sealion/8mil_last • Viewer • Updated • 1.85M • 18
group2sealion/last_result • Viewer • Updated • 1.82M • 2