Qwen2.5-7B-nerd-uncensored-v1.0 / README.md

jeffmeloy

Update README.md

245a9a0 verified 11 months ago


license: apache-2.0
base_model:
  - Qwen/Qwen2.5-7B
pipeline_tag: text-generation
tags:
  - not-for-all-audiences
language:
  - en

Model Description

This model was created by analyzing the layers of several models and selecting, layer by layer, the version with the best dimensional utilization efficiency. The process follows these steps:

Layer Analysis

- Downloads base and fine-tuned models from Hugging Face Hub
- Calculates Normalized Effective Rank (NER) for each layer
- NER measures how effectively each layer utilizes its dimensions through entropy analysis of singular value distributions
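The NER score used in this step can be sketched in a few lines of NumPy, following the calculation spelled out later in this card. This is a minimal illustration, not the author's implementation: the function name `normalized_effective_rank`, the use of NumPy, the choice to count only singular values that survive the threshold, and the 0.0 return for degenerate inputs are all assumptions.

```python
import numpy as np


def normalized_effective_rank(weight: np.ndarray, threshold: float = 1e-12) -> float:
    """Shannon entropy of the normalized singular values, divided by its maximum."""
    matrix = np.atleast_2d(np.asarray(weight, dtype=np.float64))
    sigma = np.linalg.svd(matrix, compute_uv=False)  # singular values, all >= 0
    sigma = sigma[sigma > threshold]                 # drop numerically zero values
    if sigma.size < 2:
        # Assumption: with 0 or 1 surviving values H_max is 0, so report 0.0.
        return 0.0
    p = sigma / sigma.sum()                          # probability distribution p_i
    entropy = -np.sum(p * np.log2(p))                # H = -sum(p_i * log2(p_i))
    max_entropy = np.log2(sigma.size)                # assumption: count after filtering
    return float(entropy / max_entropy)


# Identity matrix: all singular values equal, so the score is close to 1.0.
uniform_score = normalized_effective_rank(np.eye(4))
# Singular values 3 and 1: less uniform, so the score drops to about 0.811.
skewed_score = normalized_effective_rank(np.diag([3.0, 1.0]))
```

Scores near 1 mean the singular values are spread evenly; scores near 0 mean a few directions dominate the matrix.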

Layer Selection

- Identifies common layer structures across models
- Ranks layers based on their NER scores
- Selects highest-performing layers from each model
- Creates a mapping of optimal layer sources
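The selection step above can be pictured as a small dictionary operation. The sketch below assumes the NER scores are already available as a nested dict of `model name -> layer name -> score`; the function name `select_layer_sources`, the model names, and the layer names are hypothetical and only serve the illustration.

```python
from typing import Dict


def select_layer_sources(ner_scores: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Map each layer present in every model to the model with the highest NER."""
    if not ner_scores:
        return {}
    # Only layers that exist in all models are candidates for swapping.
    common_layers = set.intersection(*(set(s) for s in ner_scores.values()))
    return {
        # On a tie, max() keeps the model listed first in ner_scores.
        layer: max(ner_scores, key=lambda model: ner_scores[model][layer])
        for layer in sorted(common_layers)
    }


# Hypothetical scores for a base model and one fine-tune.
scores = {
    "base": {"layers.0": 0.71, "layers.1": 0.64, "lm_head": 0.50},
    "finetune_a": {"layers.0": 0.69, "layers.1": 0.70},
}
sources = select_layer_sources(scores)
```

Here `lm_head` is left out of the mapping because it does not appear in every model, which mirrors the "common layer structures" restriction.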

Model Composition

- Creates a new model starting from the base architecture
- Systematically replaces layers with their highest-performing counterparts
- Preserves model architecture while optimizing layer-wise performance
- Maintains compatibility with original tokenizer and configuration
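A rough sketch of the composition step, assuming each model's weights are available as a flat mapping from layer name to tensor (as in a state dict). Plain Python lists stand in for tensors here, and the function name `compose_state_dict` and all model and layer names are hypothetical.

```python
from typing import Any, Dict


def compose_state_dict(
    base_weights: Dict[str, Any],
    donor_weights: Dict[str, Dict[str, Any]],
    layer_sources: Dict[str, str],
    base_name: str = "base",
) -> Dict[str, Any]:
    """Start from the base weights and swap in layers from the selected donors."""
    composite = dict(base_weights)  # shallow copy keeps every base key
    for layer, source in layer_sources.items():
        if layer not in composite:
            raise KeyError(f"{layer} is not part of the base architecture")
        if source != base_name:
            composite[layer] = donor_weights[source][layer]
    return composite


# Stand-in weights: lists instead of real tensors.
base = {"layers.0": [1.0, 1.0], "layers.1": [2.0, 2.0], "lm_head": [3.0]}
donors = {"finetune_a": {"layers.0": [9.0, 9.0], "layers.1": [8.0, 8.0]}}
merged = compose_state_dict(base, donors, {"layers.0": "base", "layers.1": "finetune_a"})
```

Because the composite starts as a copy of the base mapping, its key set never changes, which is one simple way to keep the architecture and configuration compatible with the base model.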

Output Generation

- Saves the composite model with complete weights and configuration
- Generates detailed merge reports documenting layer sources
- Copies necessary tokenizer files from base model
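The card does not specify what a merge report contains. As one possible shape, the sketch below serializes the layer-to-source mapping and a per-source layer count as JSON. The report fields, the function name `build_merge_report`, and the example names are assumptions made for illustration only.

```python
import json
from collections import Counter
from typing import Dict


def build_merge_report(layer_sources: Dict[str, str]) -> str:
    """Return a JSON report of which model supplied each layer (hypothetical format)."""
    report = {
        "layer_sources": layer_sources,
        "layers_per_source": dict(Counter(layer_sources.values())),
    }
    return json.dumps(report, indent=2, sort_keys=True)


# Hypothetical mapping produced by a layer selection step.
report_text = build_merge_report(
    {"layers.0": "base", "layers.1": "finetune_a", "layers.2": "finetune_a"}
)
```

Writing `report_text` next to the saved weights would leave a record of where each layer came from.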

NER measures how effectively a neural network layer utilizes its available dimensions through entropy analysis of its singular value distribution. The calculation proceeds as follows:

Singular Value Decomposition
- Input: Weight matrix A ∈ R^(m×n)
- Compute singular values σᵢ where σᵢ ≥ 0
- Keep only values above the numerical threshold (> 1e-12)
Distribution Normalization
- Sum all singular values: S = Σσᵢ
- Create probability distribution: pᵢ = σᵢ/S
Entropy Calculation
- Compute Shannon entropy: H = -Σ(pᵢ * log₂(pᵢ))
- Calculate maximum possible entropy: H_max = log₂(k), where k is the number of singular values (at most min(m, n))
Normalization
- Final NER score = H/H_max
- Results in a value between 0 and 1
- Higher scores indicate more uniform dimen