This repository contains tools for benchmarking ASR systems using BIGOS corpora.
The BIGOS (Benchmark Intended Grouping of Open Speech) corpora collect and unify publicly available ASR speech datasets.
Currently, the Polish language is supported.
The BIGOS family of corpora is available on the Hugging Face platform:
Both the BIGOS V2 and PELCRA for BIGOS corpora are intended for evaluating community-provided ASR systems as part of the 2024 PolEval challenge.
Evaluation results on BIGOS V1 are available in the paper
A Hugging Face leaderboard for the systematic evaluation of publicly available ASR systems for Polish is under construction.