Update README.md
README.md (changed)
@@ -46,8 +46,8 @@ We applied:
### Phase 4: Reinforcement Learning

We trained the model using reinforcement learning:
- Used dataset: [dgslibisey/MuSiQue](https://huggingface.co/datasets/dgslibisey/MuSiQue) (a quick download sketch follows below)
- Incorporated our in-house search database (containing Wiki data, Fineweb data, and ArXiv data)
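
For reference, the MuSiQue data can be pulled locally with the Hugging Face CLI. This is only a convenience sketch (it assumes `huggingface_hub` is installed and uses an example target directory), not part of the training pipeline itself:

```bash
# Download the MuSiQue dataset used in the RL phase (target directory is just an example)
pip install -U huggingface_hub
huggingface-cli download dgslibisey/MuSiQue --repo-type dataset --local-dir ./musique
```
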
## Performance
@@ -81,6 +81,7 @@ II-Search-4B is designed for:
## Usage

To deploy and interact with the II-Search-4B model effectively, follow these options:

1. Serve the model using vLLM or SGLang

Use the following command to serve the model with vLLM (adjust parameters as needed for your hardware setup):

```bash
vllm serve Intelligent-Internet/II-Search-4B --served-model-name II-Search-4B --tensor-parallel-size 8 --enable-reasoning --reasoning-parser deepseek_r1 --rope-scaling '{"rope_type":"yarn","factor":1.5,"original_max_position_embeddings":98304}' --max-model-len 131072
```

This configuration enables distributed tensor parallelism across 8 GPUs, reasoning capabilities, custom RoPE scaling for extended context, and a maximum context length of 131,072 tokens.
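
Once the server is running, it exposes an OpenAI-compatible API (by default on http://localhost:8000). As a quick smoke test you can query it with curl; this is a minimal sketch, and the prompt, sampling parameters, and host/port are only examples:

```bash
# Send a single chat request to the vLLM OpenAI-compatible endpoint
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "II-Search-4B",
        "messages": [{"role": "user", "content": "Who wrote The Master and Margarita?"}],
        "temperature": 0.6,
        "max_tokens": 1024
      }'
```
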
2. Integrate web_search and web_visit tools
Equip the served model with web_search and web_visit tools to enable internet-aware functionality. Alternatively, use middleware such as MCP for tool integration; see this example repository: https://github.com/hoanganhpham1006/mcp-server-template.
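
If you wire the tools directly into an OpenAI-style client instead of going through MCP, the request can declare web_search and web_visit as function tools. The schemas below are illustrative assumptions (your search backend defines the real parameters), and your client is responsible for executing any tool calls the model emits and feeding the results back:

```bash
# Declare web_search and web_visit as OpenAI-style function tools (illustrative schemas)
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "II-Search-4B",
        "messages": [{"role": "user", "content": "What did the most recent IPCC report conclude about sea level rise?"}],
        "tools": [
          {"type": "function", "function": {"name": "web_search",
            "description": "Search the web and return result snippets",
            "parameters": {"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]}}},
          {"type": "function", "function": {"name": "web_visit",
            "description": "Fetch and return the readable content of a URL",
            "parameters": {"type": "object", "properties": {"url": {"type": "string"}}, "required": ["url"]}}}
        ]
      }'
```
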
## Host on macOS with MLX for local use