Nexdata-AI/1535-Hours-Mixed-Speech-with-Chinese-and-English-Data-by-Mobile-Phone

1535-Hours-Mixed-Speech-with-Chinese-and-English-Data-by-Mobile-Phone

Description

The data was recorded by 3,972 native Chinese speakers whose accents cover seven major dialect areas. The recorded text is a mixture of Chinese and English sentences, covering general scenes and human-computer interaction scenes. The content is varied and the transcription is accurate. The dataset can be used to improve the performance of speech recognition systems on Chinese-English mixed read speech.

For more details, please refer to the link: https://www.nexdata.ai/datasets/939?source=Github

Format

16kHz, 16bit, uncompressed wav, mono channel
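
As a minimal sketch of how a downloaded file could be checked against the format stated above, the snippet below reads a WAV header with Python's standard `wave` module. The function name `check_wav_format` and the constant names are illustrative assumptions, not part of the dataset or any Nexdata tooling.

```python
import wave

# Expected values taken from the Format line above.
EXPECTED_RATE_HZ = 16000          # 16kHz
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16bit
EXPECTED_CHANNELS = 1             # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV header and the stated format.

    Hypothetical helper for illustration; an empty list means the header
    matches 16 kHz, 16-bit, mono, uncompressed PCM.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
        # The wave module reports "NONE" for uncompressed PCM data.
        if wav.getcomptype() != "NONE":
            problems.append(f"compression type is {wav.getcomptype()}")
    return problems
```

Calling `check_wav_format("some_file.wav")` on a conforming file should return an empty list; the file name here is a placeholder, since the dataset's actual directory layout is not described in this README.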

Recording environment

quiet indoor environment, without echo

Recording content (read speech)

general category; human-machine interaction category

Demographics

3,972 speakers in total: 43% male and 57% female. By age group, 68% of speakers are 12-25, 31% are 26-45, and 1% are 46-60.
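
The percentages above can be turned into approximate headcounts with a short calculation. This is a rough sketch only: the published percentages are presumably rounded, so the resulting counts are estimates rather than figures from the dataset documentation, and the helper name `approx_count` is an assumption made for illustration.

```python
TOTAL_SPEAKERS = 3972  # from the Demographics line above


def approx_count(percent, total=TOTAL_SPEAKERS):
    """Estimate a headcount from a (presumably rounded) percentage."""
    return round(total * percent / 100)


# Percentages as stated in the Demographics line.
by_gender = {"male": approx_count(43), "female": approx_count(57)}
by_age = {
    "12-25": approx_count(68),
    "26-45": approx_count(31),
    "46-60": approx_count(1),
}

print(by_gender)
print(by_age)
```

Under these assumptions the estimates come to roughly 1,708 male and 2,264 female speakers, and roughly 2,701, 1,231, and 40 speakers in the three age groups.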

Device

Android mobile phone; iPhone

Language

Mandarin; English

Application scenarios

speech recognition; voiceprint recognition

Licensing Information

Commercial License