ZhifengKong committed on
Commit 4a08e7e · 1 Parent(s): 41d6708

update readme

Files changed (1)
  1. README.md +11 -0
README.md CHANGED
@@ -20,6 +20,17 @@ This repo contains the PyTorch implementation of [Audio Flamingo Sound-CoT Techn
 
  - Audio Flamingo 2 Sound-CoT shows strong reasoning abilities on several sound reasoning benchmarks, despite being small (3B) and trained exclusively on public datasets.
 
+ ## Usage
+
+ The inference script is almost the same as that of [Audio Flamingo 2](https://github.com/NVIDIA/audio-flamingo/tree/audio_flamingo_2/inference_HF_pretrained). The only difference is that a special prompt (```Output the answer with <SUMMARY>, <CAPTION>, <REASONING>, and <CONCLUSION> tags.```) is appended to the input question. For instance, in Audio Flamingo 2, the input is
+ ```
+ Based on the given audio, identify the source of the church bells. Choose the correct option from the following options:\n(A) Church\n(B) School\n(C) Clock Tower\n(D) Fire Station.
+ ```
+ In Audio Flamingo 2 Sound-CoT, the input is
+ ```
+ Based on the given audio, identify the source of the church bells. Choose the correct option from the following options:\n(A) Church\n(B) School\n(C) Clock Tower\n(D) Fire Station. Output the answer with <SUMMARY>, <CAPTION>, <REASONING>, and <CONCLUSION> tags.
+ ```
+
  ## License
 
  - The code in this repo is under MIT license.
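
The Usage section added in this commit amounts to appending a fixed suffix to each question. A minimal Python sketch of that step follows. The helper names `build_sound_cot_prompt` and `parse_tagged_output` are hypothetical and not part of the repo, and the parser assumes the model wraps each section in matching opening and closing tags (for example `<SUMMARY>...</SUMMARY>`), which the README does not state.

```python
import re

# Prompt suffix quoted from the Usage section of the README.
SOUND_COT_SUFFIX = (
    "Output the answer with <SUMMARY>, <CAPTION>, <REASONING>, "
    "and <CONCLUSION> tags."
)

# Tag names listed in the suffix above.
TAGS = ("SUMMARY", "CAPTION", "REASONING", "CONCLUSION")


def build_sound_cot_prompt(question: str) -> str:
    """Append the Sound-CoT suffix to a question (hypothetical helper)."""
    return f"{question.rstrip()} {SOUND_COT_SUFFIX}"


def parse_tagged_output(text: str) -> dict:
    """Collect tag contents from a model response.

    ASSUMPTION: each section is emitted as <TAG>...</TAG>. Tags that are
    missing from the response are simply left out of the result.
    """
    sections = {}
    for tag in TAGS:
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        if match:
            sections[tag] = match.group(1).strip()
    return sections
```

The string returned by `build_sound_cot_prompt` would be passed to the Audio Flamingo 2 inference script in place of the plain question. If the actual output format differs from the assumed tag pairs, only `parse_tagged_output` needs to change.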