Collections

Discover the best community collections!

Collections including paper arxiv:1910.01108

Sentiment/Emotion Analysis

MilaNLProc/feel-it-italian-sentiment

Text Classification • Updated Aug 15, 2022 • 33.4k • 20
cardiffnlp/twitter-roberta-base-sentiment-latest

Text Classification • Updated 21 days ago • 3.13M • 705
bhadresh-savani/distilbert-base-uncased-emotion

Text Classification • 0.1B • Updated Aug 14, 2024 • 454k • 153
FacebookAI/xlm-roberta-large

Fill-Mask • 0.6B • Updated Feb 19, 2024 • 4.19M • 460

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 17

A collection of arXiv papers from Chip Huyen's AI Engineering organized by chapter and ordered by when each appears in the book.

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

Paper • 2211.04325 • Published Oct 26, 2022 • 1
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 20
On the Opportunities and Risks of Foundation Models

Paper • 2108.07258 • Published Aug 16, 2021 • 1
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

Paper • 2204.07705 • Published Apr 16, 2022 • 2

chat-models-candidates

nikravan/glm-4vq

Document Question Answering • 7B • Updated Jun 16, 2024 • 79 • 35
deepseek-ai/deepseek-coder-33b-instruct

Text Generation • 33B • Updated Mar 7, 2024 • 16.2k • 538
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Paper • 2401.14196 • Published Jan 25, 2024 • 66
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 17

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 17
distilbert/distilbert-base-uncased-finetuned-sst-2-english

Text Classification • 0.1B • Updated Dec 19, 2023 • 2.86M • 815
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design

Paper • 2401.14112 • Published Jan 25, 2024 • 21
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation

Paper • 2401.04092 • Published Jan 8, 2024 • 22

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2 • 87
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 109
BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published Apr 16 • 74
FAST: Efficient Action Tokenization for Vision-Language-Action Models

Paper • 2501.09747 • Published Jan 16 • 25

on device modules low resources

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 17

Anycoder

Space • Running • 2.6k • Generate modern HTML designs from existing code
Qwen2.5 Coder Artifacts

Space • Runtime error • 274 • Generate application code with Qwen2.5-Coder-32B
QwQ-32B-Preview

Space • Running • 923 • QwQ-32B-Preview
Open LLM Leaderboard

Space • Running on CPU Upgrade • 13.5k • Track, rank and evaluate open LLMs and chatbots

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 81
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 20
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 17
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 16

language-models

Mistral 7B

Paper • 2310.06825 • Published Oct 10, 2023 • 52
BloombergGPT: A Large Language Model for Finance

Paper • 2303.17564 • Published Mar 30, 2023 • 26
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 20
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 17
