l3-unc/qwen2.5-7b_edited_object_counting_v2
	---
	library_name: transformers
	license: mit
	datasets:
	- l3-unc/CausalDiagnosticity
	language:
	- en
	base_model:
	- Qwen/Qwen2.5-7B
	---

	# Model Card for qwen2.5-7b_edited_object_counting_v2

	This model is derived from [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B) and has been edited with MEMIT (Mass-Editing Memory in a Transformer) for the `object_counting` task of the [Causal Diagnosticity](https://huggingface.co/datasets/l3-unc/CausalDiagnosticity) dataset.
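As a minimal sketch of how the edited checkpoint might be loaded with `transformers`, the snippet below uses the standard `AutoTokenizer` and `AutoModelForCausalLM` classes. It assumes the repository ships tokenizer files alongside the weights; the helper names `load_edited_model` and `complete` are hypothetical and not part of the released code.

```python
REPO_ID = "l3-unc/qwen2.5-7b_edited_object_counting_v2"


def load_edited_model(repo_id: str = REPO_ID):
    """Download and return (tokenizer, model) for the edited checkpoint."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # torch_dtype="auto" keeps the dtype stored in the checkpoint.
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
    return tokenizer, model


def complete(prompt: str, tokenizer, model, max_new_tokens: int = 16) -> str:
    """Greedy-decode a short continuation and return only the new text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Drop the prompt tokens so only the generated continuation is decoded.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `tokenizer, model = load_edited_model()` followed by `complete(prompt, tokenizer, model)` would then query the edited model; the prompt itself should come from the dataset rather than from this sketch.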

	## Versioning

	- `_v1` → The edits insert new knowledge taken from `target_1` in the `related_edits` field of each dataset item.
	- `_v2` → The edits insert new knowledge taken from `target_2` in the `related_edits` field of each dataset item (this repository).
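The suffix convention above can be sketched as a small lookup from a repository name to the `related_edits` target field it was edited with. The function names are hypothetical, and the sketch assumes the version suffix is always the last underscore-separated part of the repository name.

```python
# Mapping taken from the versioning list: suffix -> field inside `related_edits`.
TARGET_FIELD = {"v1": "target_1", "v2": "target_2"}


def version_suffix(repo_id: str) -> str:
    """Return the trailing version tag of a repository name, e.g. 'v2'."""
    return repo_id.rsplit("_", 1)[-1]


def target_field_for(repo_id: str) -> str:
    """Return the `related_edits` field that the given checkpoint was edited with."""
    suffix = version_suffix(repo_id)
    if suffix not in TARGET_FIELD:
        raise ValueError(f"Unknown version suffix {suffix!r} in {repo_id!r}")
    return TARGET_FIELD[suffix]
```

For this repository, `target_field_for("l3-unc/qwen2.5-7b_edited_object_counting_v2")` returns `"target_2"`.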

	---

	## MEMIT Hyperparameters

	```yaml
	alg_name: "MEMIT"
	layers: [4, 5, 6, 7, 8]
	clamp_norm_factor: 4
	layer_selection: "all"
	fact_token: "subject_last"
	v_num_grad_steps: 25
	v_lr: 5e-1
	v_loss_layer: 27
	v_weight_decay: 1e-3
	kl_factor: 0.0625
	mom2_adjustment: true
	mom2_update_weight: 15000
	rewrite_module_tmp: "model.layers.{}.mlp.down_proj"
	layer_module_tmp: "model.layers.{}"
	mlp_module_tmp: "model.layers.{}.mlp"
	attn_module_tmp: "model.layers.{}.self_attn"
	ln_f_module: "model.norm"
	lm_head_module: "lm_head"
	mom2_dataset: "wikipedia"
	mom2_n_samples: 100000
	mom2_dtype: "float32"
	model_parallel: false
	```
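To make the template fields concrete, the sketch below expands `rewrite_module_tmp` over `layers` to list the modules, and their `.weight` parameters, that the configuration targets. The values are copied from the YAML above into a plain dict so that no YAML parser is needed; the helper names are hypothetical.

```python
# Subset of the hyperparameters above, copied into a plain dict.
hparams = {
    "layers": [4, 5, 6, 7, 8],
    "rewrite_module_tmp": "model.layers.{}.mlp.down_proj",
    "layer_module_tmp": "model.layers.{}",
    "v_loss_layer": 27,
}


def edited_module_names(hp: dict) -> list[str]:
    """Expand the rewrite template once per edited layer."""
    return [hp["rewrite_module_tmp"].format(layer) for layer in hp["layers"]]


def edited_weight_names(hp: dict) -> list[str]:
    """Parameter names of the MLP down-projection weights targeted by the edit."""
    return [f"{name}.weight" for name in edited_module_names(hp)]


for name in edited_weight_names(hparams):
    print(name)
```

With a loaded model, these names could be looked up via `dict(model.named_parameters())` to compare the edited weights against those of the base model.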

	## Additional Resources

	For more information about the dataset, editing details, and the associated paper, see:

	- 📄 [Paper](https://arxiv.org/abs/2502.18848)
	- 📊 [Dataset](https://huggingface.co/datasets/l3-unc/CausalDiagnosticity)
	- 💻 [Code](https://github.com/KeremZaman/CausalDiagnosticity)