Mex Ivanov (MexIvanov)
3 followers · 12 following
AI & ML interests
NLP, Coding, Quantum Computing and more.
Recent Activity
reacted to RakshitAralimatti's post with 🔥 · 16 days ago
I built something crazy you've never seen before. Please check it out: https://huggingface.co/blog/RakshitAralimatti/streaming-data-rag. A real-time Streaming Data to RAG system that listens to live radio, transcribes it on the fly, and lets you query across TIME. Not just "what was discussed", but "what happened in the last 10 minutes on channel 0?" or "at 9 AM, what was the breaking news?" This is RAG that understands temporal context.
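As a rough illustration of the "query across TIME" idea, here is a minimal sketch of time-scoped retrieval over streamed transcript chunks. All names (`TranscriptChunk`, `TemporalStore`, `window`) are hypothetical and not taken from the linked blog post; the actual system may work differently.

```python
# Hypothetical sketch: time-scoped retrieval over streamed transcripts.
# Names and structure are illustrative, not from the linked blog post.
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TranscriptChunk:
    channel: int
    timestamp: datetime  # when this piece of audio was transcribed
    text: str


class TemporalStore:
    """Stores transcript chunks and answers time-scoped queries."""

    def __init__(self) -> None:
        self.chunks: list[TranscriptChunk] = []

    def add(self, chunk: TranscriptChunk) -> None:
        self.chunks.append(chunk)

    def window(self, channel: int, minutes: int) -> list[TranscriptChunk]:
        """Everything said on `channel` in the last `minutes` minutes."""
        cutoff = datetime.now() - timedelta(minutes=minutes)
        return [c for c in self.chunks
                if c.channel == channel and c.timestamp >= cutoff]


# "What happened in the last 10 minutes on channel 0?" becomes a time
# filter first, then ordinary RAG (embed + rank) over the survivors.
store = TemporalStore()
store.add(TranscriptChunk(0, datetime.now(), "Breaking news at the top of the hour..."))
recent = store.window(channel=0, minutes=10)
context = " ".join(c.text for c in recent)  # feed this to the LLM as context
print(context)
```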
reacted to samerzaher80's post with 👍 · 28 days ago
AetherMind_SRL: How I beat 7B models on MMLU with 184M params and a $300 GPU

I'm Sameer, a solo researcher from Iraq working on a single RTX 3050 8GB laptop. Today I'm releasing AetherMind_SRL, a 184M-parameter NLI model trained only on NLI tasks (SNLI, MNLI, ANLI, and a small clinical Alzheimer's dataset). It was never fine-tuned on, and never even shown, a single MMLU question during training. Yet here are the zero-shot MMLU (57 subjects) results:

| Model | Params | MMLU Zero-Shot | Training Data |
| --- | --- | --- | --- |
| AetherMind_SRL (me) | 184M | 36.05% | Only NLI (SNLI/MNLI/ANLI + ADNI) |
| DeBERTa-v3-base | 278M | ~30.8% | General pre-training |
| BERT-large | 340M | 27–30% | General pre-training |
| LLaMA-1 7B | 7B | 34–35% | Massive text corpus |
| LLaMA-2 7B | 7B | ~45% | Bigger + better data |

Yes: my 184M model beats every classic 300–400M model and the original 7-billion-parameter LLaMA-1, all while running at 300+ samples/sec on a $300 laptop GPU.

How did this happen? I built a standardized self-improvement loop called AetherMind Self-Reflective Learning (SRL) v1.0:
1. Train normally on NLI.
2. Let the model predict on hard adversarial data (ANLI).
3. Log every mistake and low-confidence case.
4. Build a balanced "SMART" buffer (60% errors + 40% correct anchors).
5. Fine-tune with a tiny LR and an error-weighted loss.
6. Repeat until stable.

That's it. No external knowledge, no MMLU data, no cluster. Just pure reasoning transfer from entailment/contradiction patterns to real-world knowledge.

Try it yourself:

```python
from transformers import pipeline
import torch

# Load the released model as a text-classification (NLI) pipeline,
# on GPU if one is available, otherwise on CPU.
nli_pipeline = pipeline(
    "text-classification",
    model="samerzaher80/AetherMind_SRL",
    device=0 if torch.cuda.is_available() else -1,
)

# DEFINE YOUR TEST HERE
premise = "Patient shows progressive memory decline."
hypothesis = "Patient shows progressive memory decline."

input_text = f"{premise} [SEP] {hypothesis}"
result = nli_pipeline(input_text)[0]
print(f"Prediction: {result['label']}")
print(f"Confidence: {result['score']:.4f}")
```

Model: https://huggingface.co/samerzaher80/AetherMind_SRL
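The 60/40 "SMART" buffer in step 4 is the heart of the loop, so here is a minimal sketch of how such a buffer could be assembled. The 60% errors / 40% anchors split comes from the post; the record format, the 0.6 confidence cutoff, and the sampling strategy are assumptions, not the author's actual implementation.

```python
# Hypothetical sketch of the "SMART" replay buffer from the SRL loop:
# 60% errors / low-confidence cases, 40% confidently-correct anchors.
# Record format and the 0.6 cutoff are assumptions, not from the post.
import random

CONF_THRESHOLD = 0.6  # assumed cutoff for "low confidence"


def build_smart_buffer(predictions, buffer_size=1000, seed=0):
    """predictions: dicts with 'correct' (bool) and 'confidence' (float)."""
    rng = random.Random(seed)
    hard = [p for p in predictions
            if not p["correct"] or p["confidence"] < CONF_THRESHOLD]
    anchors = [p for p in predictions
               if p["correct"] and p["confidence"] >= CONF_THRESHOLD]

    n_hard = min(int(buffer_size * 0.6), len(hard))     # 60% hard cases
    n_anchor = min(buffer_size - n_hard, len(anchors))  # 40% anchors

    buffer = rng.sample(hard, n_hard) + rng.sample(anchors, n_anchor)
    rng.shuffle(buffer)
    return buffer


# The buffer would then be replayed with a tiny learning rate and an
# error-weighted loss, and the predict -> log -> rebuild cycle repeated.
preds = [
    {"correct": True,  "confidence": 0.92},
    {"correct": False, "confidence": 0.41},
    {"correct": True,  "confidence": 0.55},  # low confidence -> "hard"
    {"correct": False, "confidence": 0.80},
]
print(build_smart_buffer(preds, buffer_size=4))
```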
liked a model · about 2 months ago: MiniMaxAI/MiniMax-M2
Organizations
None yet
models (6) • Sort: Recently updated
MexIvanov/MistRAG-7B-ruen-v1-merged • Text Generation • 7B • Updated Nov 25, 2024 • 10
MexIvanov/MistRAG-7B-ruen-v1 • Text Generation • Updated Nov 25, 2024
MexIvanov/MistRAG-7B-ruen-v1-gguf • Text Generation • 7B • Updated Nov 25, 2024 • 73 • 1
MexIvanov/zephyr-python-ru • Text Generation • Updated Nov 11, 2024 • 2
MexIvanov/zephyr-python-ru-merged • Text Generation • 7B • Updated Nov 11, 2024 • 69 • 2
MexIvanov/zephyr-python-ru-gguf • Text Generation • 7B • Updated Nov 11, 2024 • 81 • 5
datasets (4) • Sort: Recently updated
MexIvanov/RAG-v1-ruen • Viewer • Updated Nov 11, 2024 • 51.4k • 29 • 2
MexIvanov/image-gen-vector-consistency • Viewer • Updated Aug 30, 2024 • 184 • 395
MexIvanov/CodeExercise-Python-27k-ru • Viewer • Updated Dec 19, 2023 • 27.2k • 96 • 3
MexIvanov/Vezora-Tested-22k-Python-Alpaca-ru • Viewer • Updated Dec 19, 2023 • 22.6k • 62 • 2