Hervé Bredin committed

Commit · ad19897
Parent(s): db94671

doc: improve README

Files changed:
- README.md (+15 −20)
- reproducible_research/report.pdf (+0 −3)
README.md CHANGED

@@ -21,32 +21,15 @@ inference: false
 
 [image]
 
-Model from *[End-to-end speaker segmentation for overlap-aware resegmentation](
+Model from *[End-to-end speaker segmentation for overlap-aware resegmentation](http://arxiv.org/abs/2104.04045)*, by Hervé Bredin and Antoine Laurent.
 
-```bibtex
-@inproceedings{Bredin2020,
-   Title = {{pyannote.audio: neural building blocks for speaker diarization}},
-   Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
-   Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
-   Address = {Barcelona, Spain},
-   Month = {May},
-   Year = {2020},
-}
-```
+Relies on pyannote.audio 2.0 currently in development: see [installation instructions](https://github.com/pyannote/pyannote-audio/tree/develop#installation).
 
 ## Support
 
 For commercial enquiries and scientific consulting, please contact [me](mailto:[email protected]).
 For [technical questions](https://github.com/pyannote/pyannote-audio/discussions) and [bug reports](https://github.com/pyannote/pyannote-audio/issues), please check [pyannote.audio](https://github.com/pyannote/pyannote-audio) Github repository.
 
-## Requirements
-
-This model relies on `pyannote.audio` 2.0 (which is still in development as of April 2nd, 2021):
-
-```bash
-$ pip install https://github.com/pyannote/pyannote-audio/archive/develop.zip
-```
-
 ## Basic usage
 
 ```python

@@ -110,7 +93,7 @@ resegmented_baseline = pipeline({"audio": "audio.wav", "baseline": baseline})
 ## Reproducible research
 
 In order to reproduce the results of the paper ["End-to-end speaker segmentation for overlap-aware resegmentation
-"](
+"](https://arxiv.org/abs/2104.04045), use the following hyper-parameters:
 
 Voice activity detection | `onset` | `offset` | `min_duration_on` | `min_duration_off`
 ----------------|---------|----------|-------------------|-------------------

@@ -132,3 +115,15 @@ VoxConverse | 0.537 | 0.724 | 0.410 | 0.563
 
 Expected outputs (and VBx baseline) are also provided in the `/reproducible_research` sub-directories.
 
+## Citation
+
+```bibtex
+@inproceedings{Bredin2020,
+   Title = {{pyannote.audio: neural building blocks for speaker diarization}},
+   Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
+   Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
+   Address = {Barcelona, Spain},
+   Month = {May},
+   Year = {2020},
+}
+```
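For context, the `onset`, `offset`, `min_duration_on`, and `min_duration_off` hyper-parameters tuned in the diff's table are the usual knobs of hysteresis binarization of frame-level speech scores. The sketch below illustrates that idea only; it is not pyannote.audio's actual implementation, and the `binarize` helper name is hypothetical:

```python
# Illustrative sketch of hysteresis binarization (NOT pyannote.audio's code):
#   onset/offset: a region starts when the score rises above `onset`
#     and ends when it falls below `offset`;
#   min_duration_off: fill non-speech gaps shorter than this (seconds);
#   min_duration_on: then drop speech regions shorter than this.

def binarize(scores, step, onset, offset, min_duration_on=0.0, min_duration_off=0.0):
    regions, start = [], None
    for i, score in enumerate(scores):
        t = i * step  # timestamp of frame i (seconds)
        if start is None and score > onset:
            start = t
        elif start is not None and score < offset:
            regions.append((start, t))
            start = None
    if start is not None:
        regions.append((start, len(scores) * step))
    # fill short gaps, then drop short regions
    merged = []
    for r in regions:
        if merged and r[0] - merged[-1][1] < min_duration_off:
            merged[-1] = (merged[-1][0], r[1])
        else:
            merged.append(r)
    return [r for r in merged if r[1] - r[0] >= min_duration_on]

scores = [0.1, 0.9, 0.8, 0.3, 0.1, 0.7, 0.9, 0.2, 0.1, 0.6]
# gaps (0.2 s) are shorter than min_duration_off, so the three raw
# regions merge into roughly a single (0.1, 1.0) region
print(binarize(scores, 0.1, onset=0.5, offset=0.4,
               min_duration_on=0.15, min_duration_off=0.25))
```

Raising `offset` above `onset` would make regions end earlier; the asymmetry between the two thresholds is what gives the hysteresis its stability against score jitter.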
reproducible_research/report.pdf DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:b5561d558589f17d8afa16ed0bd82664a8fc682a405e1d9b9f966afc83fac063
-size 375907