add AIBOM

Dear model owner(s),
We are a group of researchers investigating the usefulness of sharing AIBOMs (Artificial Intelligence Bills of Materials) to document AI models. An AIBOM is a machine-readable, structured list of the components (e.g., datasets and models) behind an AI model, intended to enhance transparency in AI-model supply chains.

To this end, we identified popular models on HuggingFace and, based on your model card (and some configuration information available on HuggingFace), generated your AIBOM according to the CycloneDX (v1.6) standard (see https://cyclonedx.org/docs/1.6/json/). AIBOMs are generated as JSON files using the following open-source tool: https://github.com/MSR4SBOM/ALOHA (technical details are available in the research paper: https://github.com/MSR4SBOM/ALOHA/blob/main/ALOHA.pdf).

The JSON file in this pull request is your AIBOM (see https://github.com/MSR4SBOM/ALOHA/blob/main/documentation.json for details on its structure).
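For anyone who wants to sanity-check the file before merging, a minimal Python sketch along the following lines may help. It only verifies a few top-level fields that appear in this AIBOM; it is not a CycloneDX schema validation, and the function name `check_aibom` and the trimmed sample are illustrative assumptions rather than part of ALOHA.

```python
import json

# Trimmed copy of the AIBOM in this pull request (only a few fields kept).
SAMPLE_AIBOM = """
{
    "bomFormat": "CycloneDX",
    "specVersion": "1.6",
    "version": 1,
    "metadata": {
        "component": {
            "type": "machine-learning-model",
            "name": "Qwen/Qwen2.5-7B-Instruct"
        }
    }
}
"""


def check_aibom(bom: dict) -> list[str]:
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    if bom.get("bomFormat") != "CycloneDX":
        problems.append("bomFormat is not 'CycloneDX'")
    if bom.get("specVersion") != "1.6":
        problems.append("specVersion is not '1.6'")
    component = bom.get("metadata", {}).get("component", {})
    if component.get("type") != "machine-learning-model":
        problems.append("metadata.component.type is not 'machine-learning-model'")
    if not component.get("name"):
        problems.append("metadata.component.name is missing")
    return problems


problems = check_aibom(json.loads(SAMPLE_AIBOM))
print(problems or "basic structure looks fine")
```

To check the actual file, replace `json.loads(SAMPLE_AIBOM)` with the result of `json.load` on the opened `Qwen_Qwen2.5-7B-Instruct.json`.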

The submitted AIBOM reflects the model information available when it was generated. As the model evolves, it can easily be regenerated with the AIBOM generator tool mentioned above.

We hope you will consider this pull request, which contains an AIBOM of your AI model. We would also appreciate your opinion on the usefulness (or not) of AIBOMs; you can share it through a 3-minute anonymous survey: https://forms.gle/WGffSQD5dLoWttEe7.

Thanks in advance, and regards,
Riccardo D’Avino, Fatima Ahmed, Sabato Nocera, Simone Romano, Giuseppe Scanniello (University of Salerno, Italy),
Massimiliano Di Penta (University of Sannio, Italy),
The MSR4SBOM team

Files changed (1)

Qwen_Qwen2.5-7B-Instruct.json +69 -0

Qwen_Qwen2.5-7B-Instruct.json ADDED

	@@ -0,0 +1,69 @@

+{
+    "bomFormat": "CycloneDX",
+    "specVersion": "1.6",
+    "serialNumber": "urn:uuid:4ca2b6f8-6145-4d53-ae76-dc5fc7f694bb",
+    "version": 1,
+    "metadata": {
+        "timestamp": "2025-06-05T09:37:31.466856+00:00",
+        "component": {
+            "type": "machine-learning-model",
+            "bom-ref": "Qwen/Qwen2.5-7B-Instruct-0aa1dcca-35d6-540c-bb64-eb00402ee564",
+            "name": "Qwen/Qwen2.5-7B-Instruct",
+            "externalReferences": [
+                {
+                    "url": "https://huggingface.co/Qwen/Qwen2.5-7B-Instruct",
+                    "type": "documentation"
+                }
+            ],
+            "modelCard": {
+                "modelParameters": {
+                    "task": "text-generation",
+                    "architectureFamily": "qwen2",
+                    "modelArchitecture": "Qwen2ForCausalLM"
+                },
+                "properties": [
+                    {
+                        "name": "library_name",
+                        "value": "transformers"
+                    },
+                    {
+                        "name": "base_model",
+                        "value": "Qwen/Qwen2.5-7B"
+                    }
+                ]
+            },
+            "authors": [
+                {
+                    "name": "Qwen"
+                }
+            ],
+            "licenses": [
+                {
+                    "license": {
+                        "id": "Apache-2.0",
+                        "url": "https://spdx.org/licenses/Apache-2.0.html"
+                    }
+                }
+            ],
+            "description": "Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:- Significantly **more knowledge** and has greatly improved capabilities in **coding** and **mathematics**, thanks to our specialized expert models in these domains.- Significant improvements in **instruction following**, **generating long texts** (over 8K tokens), **understanding structured data** (e.g, tables), and **generating structured outputs** especially JSON. **More resilient to the diversity of system prompts**, enhancing role-play implementation and condition-setting for chatbots.- **Long-context Support** up to 128K tokens and can generate up to 8K tokens.- **Multilingual support** for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.**This repo contains the instruction-tuned 7B Qwen2.5 model**, which has the following features:- Type: Causal Language Models- Training Stage: Pretraining & Post-training- Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias- Number of Parameters: 7.61B- Number of Paramaters (Non-Embedding): 6.53B- Number of Layers: 28- Number of Attention Heads (GQA): 28 for Q and 4 for KV- Context Length: Full 131,072 tokens and generation 8192 tokens- Please refer to [this section](#processing-long-texts) for detailed instructions on how to deploy Qwen2.5 for handling long texts.For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5/), [GitHub](https://github.com/QwenLM/Qwen2.5), and [Documentation](https://qwen.readthedocs.io/en/latest/).",
+            "tags": [
+                "transformers",
+                "safetensors",
+                "qwen2",
+                "text-generation",
+                "chat",
+                "conversational",
+                "en",
+                "arxiv:2309.00071",
+                "arxiv:2407.10671",
+                "base_model:Qwen/Qwen2.5-7B",
+                "base_model:finetune:Qwen/Qwen2.5-7B",
+                "license:apache-2.0",
+                "autotrain_compatible",
+                "text-generation-inference",
+                "endpoints_compatible",
+                "region:us"
+            ]
+        }
+    }
+}
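As an illustration of how the supply-chain fields in this file might be consumed, the sketch below pulls the base model, license identifiers, and arXiv references out of the `metadata.component` object. The helper name `summarize_component` and the trimmed sample are assumptions made for this example; the field names are the ones used in the file above.

```python
# Trimmed copy of metadata.component from the AIBOM above.
SAMPLE_COMPONENT = {
    "name": "Qwen/Qwen2.5-7B-Instruct",
    "modelCard": {
        "properties": [
            {"name": "library_name", "value": "transformers"},
            {"name": "base_model", "value": "Qwen/Qwen2.5-7B"},
        ]
    },
    "licenses": [{"license": {"id": "Apache-2.0"}}],
    "tags": ["transformers", "arxiv:2309.00071", "arxiv:2407.10671"],
}


def summarize_component(component: dict) -> dict:
    """Hypothetical helper: collect the fields most relevant to provenance."""
    # Turn the list of {"name": ..., "value": ...} entries into a lookup table.
    properties = component.get("modelCard", {}).get("properties", [])
    props = {p["name"]: p["value"] for p in properties}
    # Keep only license entries that carry an SPDX-style "id".
    licenses = [
        entry["license"]["id"]
        for entry in component.get("licenses", [])
        if "id" in entry.get("license", {})
    ]
    # Tags of the form "arxiv:<id>" point to related papers.
    papers = [
        tag.split(":", 1)[1]
        for tag in component.get("tags", [])
        if tag.startswith("arxiv:")
    ]
    return {
        "name": component.get("name"),
        "base_model": props.get("base_model"),
        "licenses": licenses,
        "papers": papers,
    }


summary = summarize_component(SAMPLE_COMPONENT)
print(summary)
```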