EmoBox is a multilingual multi-corpus speech emotion recognition (SER) toolkit designed to streamline research in this field. It is accompanied by a curated benchmark for both intra-corpus and cross-corpus evaluation settings.
EmoBox consists of a toolkit for easily conducting experiments on different datasets and a benchmark for tracking advances in speech emotion recognition research.
Based on EmoBox, we present the intra-corpus SER results of 10 pre-trained speech models on 32 emotion datasets in 14 languages, and the cross-corpus SER results on 4 datasets with fully balanced test sets. To the best of our knowledge, this is the largest SER benchmark in terms of both language coverage and number of datasets. We hope that our toolkit and benchmark can facilitate SER research in the community.
@inproceedings{ma2024emobox,
title={EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark},
author={Ziyang Ma and Mingjie Chen and Hezhao Zhang and Zhisheng Zheng and Wenxi Chen and Xiquan Li and Jiaxin Ye and Xie Chen and Thomas Hain},
booktitle={Proc. INTERSPEECH},
year={2024}
}