---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-7B
pipeline_tag: text-generation
tags:
- not-for-all-audiences
language:
- en
library_name: transformers
---

## Model Description

This model was created by analyzing and selecting the optimal layers from other Qwen2.5-7B models based on their dimensional utilization efficiency, as measured by the Normalized Effective Rank (NER). NER is computed as follows (a PyTorch sketch follows the steps):

Singular Value Decomposition:
- Input: weight matrix A ∈ R^(m×n) # m = number of output features, n = number of input features
- Compute singular values σᵢ, where σᵢ ≥ 0 # σᵢ represents the importance of each dimension
- Keep only values above a numerical threshold (> 1e-12) # removes numerical noise from the computation

Distribution Normalization:
- Sum all singular values: S = Σσᵢ # S acts as the normalization factor
- Create a probability distribution: pᵢ = σᵢ/S # converts singular values to probabilities summing to 1

Entropy Calculation:
- Compute the Shannon entropy: H = -Σ(pᵢ * log₂(pᵢ)) # measures the information content of the distribution
- Compute the maximum possible entropy: H_max = log₂(n), where n is the number of singular values # maximum entropy occurs when all dimensions contribute equally

Normalization:
- Final NER score = H/H_max # normalizes the score to the [0, 1] range
- Results in a value between 0 and 1 # 0 = single-dimension dominance, 1 = perfectly uniform dimensional utilization
- Higher scores indicate more uniform dimensional utilization
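
The steps above map directly onto a few lines of PyTorch. The following is a minimal sketch of the NER formula for a single weight matrix, not the exact implementation in ner_merge.py; the function name and the handling of the degenerate one-value case are assumptions:

```python
import math
import torch

def normalized_effective_rank(weight: torch.Tensor) -> float:
    """NER of a 2-D weight matrix: Shannon entropy of the normalized
    singular-value distribution, divided by the maximum possible entropy."""
    sigma = torch.linalg.svdvals(weight.float())  # singular values only
    sigma = sigma[sigma > 1e-12]                  # drop numerical noise
    n = sigma.numel()
    if n <= 1:
        return 0.0                                # single dominant dimension -> no entropy
    p = sigma / sigma.sum()                       # probability distribution, sums to 1
    entropy = -(p * torch.log2(p)).sum()          # Shannon entropy H
    return (entropy / math.log2(n)).item()        # H / H_max, in [0, 1]
```

Applied per layer, this yields a score in [0, 1] that can be compared directly across models, regardless of layer size.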

## Creating the Composite Model

Code here: https://huggingface.co/jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0/blob/main/ner_merge.py

Layer Analysis:
- Download the base and fine-tuned models from the Hugging Face Hub
- Calculate the Normalized Effective Rank (NER) for each layer within each model (a sketch follows this list)
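
A minimal sketch of that scoring pass, reusing the normalized_effective_rank helper above. Whether ner_merge.py scores every 2-D tensor or whole transformer blocks is a detail of the actual script, and score_model_layers is a hypothetical name:

```python
import torch
from transformers import AutoModelForCausalLM

def score_model_layers(model_name: str) -> dict[str, float]:
    """Download a model from the Hub and compute NER for every 2-D weight."""
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float32)
    with torch.no_grad():
        return {
            name: normalized_effective_rank(param)
            for name, param in model.state_dict().items()
            if param.ndim == 2  # NER is only defined for matrices
        }
```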

Layer Selection:
- Identify the layer structure common to all models
- For each layer, select the model whose version of that layer has the highest NER score (sketched below)
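
The selection step might look like the following; the shape of the score dictionary is assumed for illustration, not taken from ner_merge.py:

```python
def select_best_layers(ner_scores: dict[str, dict[str, float]]) -> dict[str, str]:
    """Map each layer name to the model whose copy of that layer scores highest.
    `ner_scores` maps model_name -> {layer_name: ner_score}."""
    # Only layers present in every model are eligible for the composite
    common_layers = set.intersection(*(set(scores) for scores in ner_scores.values()))
    return {
        layer: max(ner_scores, key=lambda model: ner_scores[model][layer])
        for layer in sorted(common_layers)
    }
```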

Model Composition:
- Incrementally build the composite model, taking each layer from the model with the highest NER in the pool (see the sketch below)
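
Composition then reduces to swapping the winning layers into a copy of the base weights. Again a hypothetical sketch under the same assumed data structures:

```python
import torch

def build_composite_state(
    base: dict[str, torch.Tensor],               # base model state_dict
    states: dict[str, dict[str, torch.Tensor]],  # model_name -> state_dict
    best: dict[str, str],                        # layer_name -> winning model_name
) -> dict[str, torch.Tensor]:
    """Start from the base weights, then swap in each winning layer."""
    composite = dict(base)  # shallow copy; tensors are shared, not cloned
    for layer, model_name in best.items():
        composite[layer] = states[model_name][layer]
    return composite
```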

Output Generation:
- Save merge reports documenting each layer's source model
- Copy the config and tokenizer files from the base model
- Save the composite model with complete weights # model ready to use (loading example below)
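
Once saved, the composite model loads like any other transformers checkpoint, using the output_dir from the config below:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("./merged_model/", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("./merged_model/")

inputs = tokenizer("Explain effective rank in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```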

Config file:

base_model: "Qwen/Qwen2.5-7B"

fine_tuned_models: # uncomment the models you want to merge
#- "Qwen/Qwen2.5-7B"
#- "Qwen/Qwen2.5-7B-Instruct"
#- "FourOhFour/Vapor_v2_7B"
#- "Goekdeniz-Guelmez/Josiefied-Qwen2.5-7B-Instruct-abliterated-v2"
#- "happzy2633/qwen2.5-7b-ins-v3"
#- "huihui-ai/Qwen2.5-7B-Instruct-abliterated-v2"
#- "HumanLLMs/Humanish-Qwen2.5-7B-Instruct"
#- "Orion-zhen/Qwen2.5-7B-Instruct-Uncensored"
#- "Orion-zhen/Meissa-Qwen2.5-7B-Instruct"
#- "jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0"
#- "rombodawg/Rombos-LLM-V2.5-Qwen-7b"
#- "Cran-May/T.E-8.1"
#- "thomas-yanxin/XinYuan-Qwen2.5-7B-0917"
#- "beomi/Qwen2.5-7B-Instruct-kowiki-qa"
#- "Orion-zhen/Qwen2.5-7B-Gutenberg-KTO"
#- "fblgit/cybertron-v4-qw7B-MGS"
#- "nguyentd/FinancialAdvice-Qwen2.5-7B"
#- "Qwen/Qwen2.5-Coder-7B-Instruct"
#- "Qwen/Qwen2.5-Math-7B-Instruct"
#- "Qwen/Qwen2.5-Coder-7B"
#- "Qwen/Qwen2.5-Math-7B"
#- "WhiteRabbitNeo/WhiteRabbitNeo-2.5-Qwen-2.5-Coder-7B"
#- "edgerunner-ai/EdgeRunner-Command-Nested"
#- "katanemo/Arch-Function-7B"

models_dir: "./input_models/"
output_dir: "./merged_model/"
metric_dir: "./metrics/"