misc(readme): add example queries
README.md
CHANGED
@@ -23,6 +23,39 @@ which you can query using the `OpenAi` Libraries or directly through `cURL` for
|
|
23 |
| /api/v1/audio/transcriptions | Transcription endpoint to interact with the model |
|
24 |
| /docs | Visual documentation |
|
25 |
|
26 |
## Specifications
|
27 |
|
28 |
| spec | value | description |
|
@@ -33,3 +66,6 @@ which you can query using the `OpenAi` Libraries or directly through `cURL` for
|
|
33 |
| KV cache data type | `float8` (e4m3) | Key-Value cache is stored on the GPU using `float8` (`float8_e4m3`) precision to save space |
|
34 |
| PyTorch Compile | ✅ | Enable the use of `torch.compile` to further optimize model's execution with more optimizations |
|
35 |
| CUDA Graphs | ✅ | Enable the use of so called "[CUDA Graphs](https://developer.nvidia.com/blog/cuda-graphs/)" to reduce overhead executing GPU computations |
|
|
23 |
| /api/v1/audio/transcriptions | Transcription endpoint to interact with the model |
|
24 |
| /docs | Visual documentation |
|
25 |
|
26 |
+
## Getting started
|
27 |
+
|
28 |
+
- **Getting text output from audio file**
|
29 |
+
|
30 |
+
```bash
|
31 |
+
curl http://localhost:8000/api/v1/audio/transcriptions \
|
32 |
+
--request POST \
|
33 |
+
--header 'Content-Type: multipart/form-data' \
|
34 |
+
-F "file=@</path/to/audio/file>" \
|
35 |
+
-F "response_format=text"
|
36 |
+
```
|
37 |
+
|
38 |
+
- **Getting JSON output from audio file**
|
39 |
+
|
40 |
+
```bash
|
41 |
+
curl http://localhost:8000/api/v1/audio/transcriptions \
|
42 |
+
--request POST \
|
43 |
+
--header 'Content-Type: multipart/form-data' \
|
44 |
+
-F "file=@</path/to/audio/file>" \
|
45 |
+
-F "response_format=json"
|
46 |
+
```
|
47 |
+
|
48 |
+
- **Getting segmented JSON output from audio file**
|
49 |
+
|
50 |
+
```bash
|
51 |
+
curl http://localhost:8000/api/v1/audio/transcriptions \
|
52 |
+
--request POST \
|
53 |
+
--header 'Content-Type: multipart/form-data' \
|
54 |
+
-F "file=@</path/to/audio/file>" \
|
55 |
+
-F "response_format=verbose_json"
|
56 |
+
```
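
The three requests above differ only in the `response_format` form field, so they can be wrapped in one small shell helper. This is a sketch rather than part of the repository: the function name `transcribe` and its `json` fallback are hypothetical choices, and it assumes the server is listening on `localhost:8000` as in the examples.

```bash
# Hypothetical helper: transcribe <audio-file> [text|json|verbose_json]
transcribe() {
  local audio_file="$1"
  # Assumption: fall back to "json" when no format is given (chosen for this sketch).
  local response_format="${2:-json}"

  # -F builds the multipart/form-data body; each field is passed as name=value.
  # --fail makes curl exit non-zero on HTTP errors instead of printing the error body.
  curl --silent --show-error --fail \
    http://localhost:8000/api/v1/audio/transcriptions \
    --request POST \
    -F "file=@${audio_file}" \
    -F "response_format=${response_format}"
}
```

With the function sourced into the current shell, `transcribe </path/to/audio/file> text` would send the same request as the first example, and `verbose_json` the segmented one.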
|
57 |
+
|
58 |
+
|
59 |
## Specifications
|
60 |
|
61 |
| spec | value | description |
|
|
|
66 |
| KV cache data type | `float8` (e4m3) | Key-Value cache is stored on the GPU using `float8` (`float8_e4m3`) precision to save space |
|
67 |
| PyTorch Compile | ✅ | Enable the use of `torch.compile` to further optimize model's execution with more optimizations |
|
68 |
| CUDA Graphs | ✅ | Enable the use of so called "[CUDA Graphs](https://developer.nvidia.com/blog/cuda-graphs/)" to reduce overhead executing GPU computations |
|
69 |
+
|
70 |
+
|
71 |
+
|