---

library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen3-0.6B-Base
tags:
- generated_from_trainer
model-index:
- name: MNLP_M2_mcqa_model_rational_math_en_cn
  results: []
---


<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# MNLP_M2_mcqa_model_rational_math_en_cn



This model is a fine-tuned version of [Qwen/Qwen3-0.6B-Base](https://huggingface.co/Qwen/Qwen3-0.6B-Base) on an unknown dataset.

At the end of training (step 7998), it achieves the following result on the evaluation set:

- Loss: 0.8507



## Model description



More information needed



## Intended uses & limitations



More information needed



## Training and evaluation data



More information needed



## Training procedure



### Training hyperparameters



The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 3
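
As a rough illustration of how these values relate, the sketch below recomputes the total train batch size and the learning rate implied by a linear schedule with warmup. It assumes a single training device (the card does not state the device count), takes the total step count of 7998 from the training results table, and mirrors the usual linear warmup/decay rule rather than calling the Trainer itself. The function names are hypothetical helpers, not part of any library.

```python
import math

# Values copied from the hyperparameter list and the training results table.
PER_DEVICE_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 8
BASE_LR = 2e-5
WARMUP_RATIO = 0.05
TOTAL_STEPS = 7998  # last optimizer step reported in the results table


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def warmup_steps(total_steps, ratio):
    """Number of warmup steps implied by a warmup ratio, rounded up."""
    return math.ceil(total_steps * ratio)


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR, ratio=WARMUP_RATIO):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    n_warmup = warmup_steps(total_steps, ratio)
    if step < n_warmup:
        return base_lr * step / max(1, n_warmup)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - n_warmup))


print(effective_batch_size(PER_DEVICE_BATCH_SIZE, GRAD_ACCUM_STEPS))  # 8
print(warmup_steps(TOTAL_STEPS, WARMUP_RATIO))  # 400
```

Under these assumptions the learning rate peaks at 2e-05 around step 400 and then decays linearly to zero at step 7998.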



### Training results



| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.1705        | 1.0    | 2667 | 0.8624          |
| 1.0313        | 2.0    | 5334 | 0.8483          |
| 1.0575        | 2.9992 | 7998 | 0.8507          |
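
To make the table easier to interpret, the following sketch picks the row with the lowest validation loss and converts a loss to perplexity. The perplexity conversion assumes the reported loss is a mean per-token cross-entropy in nats, which is typical for causal language model fine-tuning but is not confirmed by this card. The helper names are hypothetical.

```python
import math

# (epoch, validation_loss) pairs copied from the training results table.
EVAL_LOG = [(1.0, 0.8624), (2.0, 0.8483), (2.9992, 0.8507)]


def best_checkpoint(log):
    """Return the (epoch, loss) pair with the lowest validation loss."""
    return min(log, key=lambda row: row[1])


def perplexity(loss):
    """Perplexity, assuming the loss is mean token-level cross-entropy in nats."""
    return math.exp(loss)


epoch, loss = best_checkpoint(EVAL_LOG)
print(f"lowest validation loss: {loss} at epoch {epoch}")
print(f"perplexity at final step: {perplexity(0.8507):.3f}")
```

On these numbers the lowest validation loss is at epoch 2.0 (0.8483), slightly below the final value of 0.8507.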





### Framework versions



- Transformers 4.51.3
- PyTorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0