Hervé Bredin committed on
Commit · ad19897
1 Parent(s): db94671
doc: improve README
- README.md +15 -20
- reproducible_research/report.pdf +0 -3
README.md
CHANGED
@@ -21,32 +21,15 @@ inference: false
|
|
21 |
|
22 |

|
23 |
|
24 |
-
Model from *[End-to-end speaker segmentation for overlap-aware resegmentation](
|
25 |
|
26 |
-
|
27 |
-
@inproceedings{Bredin2020,
|
28 |
-
Title = {{pyannote.audio: neural building blocks for speaker diarization}},
|
29 |
-
Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
|
30 |
-
Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
|
31 |
-
Address = {Barcelona, Spain},
|
32 |
-
Month = {May},
|
33 |
-
Year = {2020},
|
34 |
-
}
|
35 |
-
```
|
36 |
|
37 |
## Support
|
38 |
|
39 |
For commercial enquiries and scientific consulting, please contact [me](mailto:[email protected]).
|
40 |
For [technical questions](https://github.com/pyannote/pyannote-audio/discussions) and [bug reports](https://github.com/pyannote/pyannote-audio/issues), please check the [pyannote.audio](https://github.com/pyannote/pyannote-audio) GitHub repository.
|
41 |
|
42 |
-
## Requirements
|
43 |
-
|
44 |
-
This model relies on `pyannote.audio` 2.0 (which is still in development as of April 2nd, 2021):
|
45 |
-
|
46 |
-
```bash
|
47 |
-
$ pip install https://github.com/pyannote/pyannote-audio/archive/develop.zip
|
48 |
-
```
|
49 |
-
|
50 |
## Basic usage
|
51 |
|
52 |
```python
|
@@ -110,7 +93,7 @@ resegmented_baseline = pipeline({"audio": "audio.wav", "baseline": baseline})
|
|
110 |
## Reproducible research
|
111 |
|
112 |
In order to reproduce the results of the paper ["End-to-end speaker segmentation for overlap-aware resegmentation
|
113 |
-
"](
|
114 |
|
115 |
Voice activity detection | `onset` | `offset` | `min_duration_on` | `min_duration_off`
|
116 |
----------------|---------|----------|-------------------|-------------------
|
@@ -132,3 +115,15 @@ VoxConverse | 0.537 | 0.724 | 0.410 | 0.563
|
|
132 |
|
133 |
Expected outputs (and VBx baseline) are also provided in the `/reproducible_research` sub-directories.
|
134 |
|
21 |
|
22 |

|
23 |
|
24 |
+
Model from *[End-to-end speaker segmentation for overlap-aware resegmentation](http://arxiv.org/abs/2104.04045)*, by Hervé Bredin and Antoine Laurent.
|
25 |
|
26 |
+
Relies on pyannote.audio 2.0 currently in development: see [installation instructions](https://github.com/pyannote/pyannote-audio/tree/develop#installation).
|
27 |
|
28 |
## Support
|
29 |
|
30 |
For commercial enquiries and scientific consulting, please contact [me](mailto:[email protected]).
|
31 |
For [technical questions](https://github.com/pyannote/pyannote-audio/discussions) and [bug reports](https://github.com/pyannote/pyannote-audio/issues), please check the [pyannote.audio](https://github.com/pyannote/pyannote-audio) GitHub repository.
|
32 |
|
33 |
## Basic usage
|
34 |
|
35 |
```python
|
|
|
93 |
## Reproducible research
|
94 |
|
95 |
In order to reproduce the results of the paper ["End-to-end speaker segmentation for overlap-aware resegmentation
|
96 |
+
"](https://arxiv.org/abs/2104.04045), use the following hyper-parameters:
|
97 |
|
98 |
Voice activity detection | `onset` | `offset` | `min_duration_on` | `min_duration_off`
|
99 |
----------------|---------|----------|-------------------|-------------------
|
|
|
115 |
|
116 |
Expected outputs (and VBx baseline) are also provided in the `/reproducible_research` sub-directories.
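
The README does not spell out what the `onset`, `offset`, `min_duration_on`, and `min_duration_off` hyper-parameters do. The sketch below shows one common reading, assumed here rather than taken from the source: hysteresis thresholding of per-frame scores, followed by filling short gaps and dropping short regions. The function name `binarize`, the `frame_duration` argument, and the order of the two clean-up steps are illustrative assumptions, not the `pyannote.audio` API.

```python
from typing import List, Sequence, Tuple


def binarize(
    scores: Sequence[float],
    frame_duration: float,
    onset: float = 0.5,
    offset: float = 0.5,
    min_duration_on: float = 0.0,
    min_duration_off: float = 0.0,
) -> List[Tuple[float, float]]:
    """Turn per-frame scores into (start, end) regions, in seconds.

    Hypothetical helper: an assumed interpretation of the four
    hyper-parameters, not the library's implementation.
    """
    regions: List[Tuple[float, float]] = []
    active = False
    start = 0.0

    # Hysteresis: switch on above `onset`, switch off below `offset`.
    for i, score in enumerate(scores):
        t = i * frame_duration
        if not active and score > onset:
            active, start = True, t
        elif active and score < offset:
            active = False
            regions.append((start, t))
    if active:
        regions.append((start, len(scores) * frame_duration))

    # Merge regions separated by a gap shorter than `min_duration_off`.
    merged: List[Tuple[float, float]] = []
    for s, e in regions:
        if merged and s - merged[-1][1] < min_duration_off:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))

    # Drop regions shorter than `min_duration_on`.
    return [(s, e) for s, e in merged if e - s >= min_duration_on]


# Toy scores, one frame per second (made-up values for illustration).
toy_scores = [0.1, 0.8, 0.6, 0.3, 0.8, 0.8, 0.1, 0.1, 0.1, 0.9, 0.1]
raw_regions = binarize(toy_scores, frame_duration=1.0, onset=0.7, offset=0.4)
clean_regions = binarize(
    toy_scores,
    frame_duration=1.0,
    onset=0.7,
    offset=0.4,
    min_duration_on=2.0,
    min_duration_off=1.5,
)
```

Using a higher `onset` than `offset` keeps a region open while the score hovers between the two thresholds, which avoids rapid on/off flicker around a single cut-off.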
|
117 |
|
118 |
+
## Citation
|
119 |
+
|
120 |
+
```bibtex
|
121 |
+
@inproceedings{Bredin2020,
|
122 |
+
Title = {{pyannote.audio: neural building blocks for speaker diarization}},
|
123 |
+
Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
|
124 |
+
Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
|
125 |
+
Address = {Barcelona, Spain},
|
126 |
+
Month = {May},
|
127 |
+
Year = {2020},
|
128 |
+
}
|
129 |
+
```
|
reproducible_research/report.pdf
DELETED
@@ -1,3 +0,0 @@
|
|
1 |
-
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:b5561d558589f17d8afa16ed0bd82664a8fc682a405e1d9b9f966afc83fac063
|
3 |
-
size 375907
|
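
The three removed lines of `reproducible_research/report.pdf` are a Git LFS pointer, not the PDF itself: a spec version, an object ID (hash algorithm and digest), and the size in bytes. As a minimal sketch, assuming the pointer is available as plain text, it can be parsed like this (`parse_lfs_pointer` is a hypothetical helper written for this example, not part of any Git LFS tooling):

```python
from typing import Dict, Union


def parse_lfs_pointer(text: str) -> Dict[str, Union[str, int]]:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields: Dict[str, str] = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value

    # The oid field has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from the deleted file shown in the diff above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:b5561d558589f17d8afa16ed0bd82664a8fc682a405e1d9b9f966afc83fac063
size 375907
"""
pointer_info = parse_lfs_pointer(pointer_text)
```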