Model Card for Sentence Type Classification
This model is fine-tuned to classify Korean financial sentences into four categories: Predictive, Inferential, Factual, and Conversational. It is built on jhgan/ko-sroberta-multitask, a Korean RoBERTa-based sentence embedding model specialized for Korean NLP tasks.
Model Details
Model Description
- Developed by: Kwon Cho
- Shared by: kwoncho
- Model type: RoBERTa-based transformer (fine-tuned for sequence classification)
- Language(s): Korean (한국어)
- License: Apache 2.0 (from base model)
- Finetuned from model: jhgan/ko-sroberta-multitask
This model was fine-tuned for multi-class classification using supervised learning with Hugging Face Transformers and PyTorch.
Model Sources
- Repository: [More Information Needed]
- Demo: [More Information Needed]
Uses
Direct Use
The model can be used to classify Korean financial sentences into one of the following categories (a minimal inference sketch follows the list):
- Predictive (예측형)
- Inferential (추론형)
- Factual (사실형)
- Conversational (대화형)
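As a sketch of direct use, the checkpoint can be loaded through the Transformers text-classification pipeline. The example sentence below and the exact label strings returned (which come from the checkpoint's id2label config) are illustrative assumptions, not taken from this card.
from transformers import pipeline

# Load the fine-tuned checkpoint as a text-classification pipeline.
classifier = pipeline("text-classification", model="kwoncho/sentence_type_classification")

# Example: "The company announced its second-quarter results yesterday." (expected type: Factual)
result = classifier("이 회사는 어제 2분기 실적을 발표했다.")
print(result)  # a list of {"label": ..., "score": ...} dicts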
Training Data
- Dataset Name: 문장 유형(추론, 예측 등) 판단 데이터 (Sentence Type Classification Data: Inference, Prediction, etc.)
- Source: AIHub ([More Information Needed])
This dataset labels Korean financial sentences with one of the following four types:
- 예측형 (Predictive)
- 추론형 (Inferential)
- 사실형 (Factual)
- 대화형 (Conversational)
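Below is a hypothetical sketch of how supervised fine-tuning on such data could be reproduced with Hugging Face Transformers and PyTorch. The label-to-id order, the toy sentences, and all hyperparameters are illustrative assumptions; the actual preprocessing, splits, and training configuration used for this checkpoint are not documented here.
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

# Assumed label mapping; the index order used for the released checkpoint is not documented.
label2id = {"사실형": 0, "추론형": 1, "예측형": 2, "대화형": 3}
id2label = {v: k for k, v in label2id.items()}

tokenizer = AutoTokenizer.from_pretrained("jhgan/ko-sroberta-multitask")
model = AutoModelForSequenceClassification.from_pretrained(
    "jhgan/ko-sroberta-multitask",
    num_labels=4,
    label2id=label2id,
    id2label=id2label,
)

# Toy rows standing in for the AIHub sentence-type dataset.
train_ds = Dataset.from_dict({
    "text": [
        "금리가 인상될 것으로 예상된다.",       # "Interest rates are expected to rise." -> Predictive
        "오늘 코스피는 2,500선에서 마감했다.",  # "The KOSPI closed around the 2,500 level today." -> Factual
    ],
    "label": [label2id["예측형"], label2id["사실형"]],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True)

train_ds = train_ds.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sentence_type_classification", num_train_epochs=3),
    train_dataset=train_ds,
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()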
Out-of-Scope Use
- Not suitable for general-purpose Korean sentence classification outside financial or economic contexts.
- May not perform well on informal or highly colloquial text.
Bias, Risks, and Limitations
- The model may carry biases present in the training dataset.
- Misclassifications could have downstream implications if used for investment recommendations or financial analysis without verification.
Recommendations
Use this model in conjunction with human oversight, especially for high-stakes or production-level applications.
How to Get Started with the Model
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("kwoncho/sentence_type_classification")
model = AutoModelForSequenceClassification.from_pretrained("kwoncho/sentence_type_classification")
# "This stock may decline in the short term." (a Predictive-style sentence)
text = "해당 종목은 단기적으로 하락할 가능성이 있습니다."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
predicted_id = outputs.logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_id])
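If per-class probabilities are needed rather than only the top label, a softmax can be applied over the logits (this continues from the outputs variable in the snippet above):
import torch.nn.functional as F

probs = F.softmax(outputs.logits, dim=-1)  # shape (1, 4): one probability per sentence type
print(probs)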