Text Generation
Transformers
Safetensors
Czech
llama
text-generation-inference
Commit e212d02 (verified), committed by mfajcik
1 Parent(s): 873bfff

Update README.md

Files changed (1)
  1. README.md +34 -0
@@ -76,5 +76,39 @@ with torch.autocast('cuda', dtype=torch.bfloat16):
  do_sample=True,
  use_cache=True))
  ```
+ # Training Data
+ We release most of our training corpus (95.79%) as the [BUT-Large Czech Collection](https://huggingface.co/datasets/BUT-FIT/but_lcc).
 
 
+ # Our Release Plan
+ | Stage | Description | Date |
+ |---------------|----------------|----------------|
+ | 1 | 'Best' model + training data | 13.03.2024 |
+ | 2 | All checkpoints + training code | |
+ | 3 | __Benczechmark__, a collection of Czech datasets for few-shot LLM evaluation. **Get in touch if you want to contribute!** | |
+ | 4 | Preprint publication | |
+
+ ## Getting in Touch
+ For further questions, email `[email protected]`.
+
+ # Disclaimer
+ This is a probabilistic model, so its outputs are stochastic. The authors are not responsible for the model outputs. Use at your own risk.
+
+ # Acknowledgement
+ This work was supported by the NAKI III program of the Ministry of Culture of the Czech Republic, project semANT,
+ "Sémantický průzkumník textového kulturního dědictví" (Semantic Explorer of Textual Cultural Heritage), grant no. `DH23P03OVV060`, and
+ by the Ministry of Education, Youth and Sports of the Czech Republic through e-INFRA CZ (ID: `90254`).
+
+ # Citation
+ ```bibtex
+ @article{benczechmark,
+ author = {Martin Fajčík and Martin Dočekal and Jan Doležal and Karel Beneš and Michal Hradiš},
+ title = {BenCzechMark: Machine Language Understanding Benchmark for Czech Language},
+ journal = {arXiv preprint arXiv:insert-arxiv-number-here},
+ year = {2024},
+ month = {March},
+ eprint = {insert-arxiv-number-here},
+ archivePrefix = {arXiv},
+ primaryClass = {cs.CL},
+ }
+ ```