Afrizal Hasbi Azizy
afrizalha
AI & ML interests
LLMs for the low-resource languages of Indonesia.
Centhini
Massive Javanese language pre-training corpus
Kancil
Llama 3 8B base model fine-tuned on an Indonesian instruction dataset.
Indonesian Language(s) Dataset (Original)
Contains an original instruction dataset for the languages of Indonesia.
Bakpia
Open models instruction-tuned to respond in Javanese! Available in 0.5B, 1.5B, and 9B parameter sizes.