The data was recorded by 3,972 native Chinese speakers whose accents cover seven major dialect areas. The recorded text is a mixture of Chinese and English sentences, covering general scenes and human-computer interaction scenes. The content is varied and the transcriptions are accurate. The dataset can be used to improve the performance of speech recognition systems on Chinese-English code-mixed read speech.
For more details, please refer to the link: https://www.nexdata.ai/datasets/939?source=Github
16kHz, 16bit, uncompressed wav, mono channel
quiet indoor environment, without echo
general category; human-machine interaction category
3,972 speakers in total, 43% male and 57% female; 68% of speakers are aged 12-25, 31% are aged 26-45, and 1% are aged 46-60
Android mobile phone; iPhone
Mandarin; English
speech recognition; voiceprint recognition
Commercial License
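
The audio format listed above (16 kHz, 16-bit, mono, uncompressed WAV) can be checked on a downloaded file before it goes into a training pipeline. The following is a minimal sketch using only Python's standard `wave` module. The function names `matches_spec` and `make_silence` are hypothetical helpers introduced for illustration and are not part of the dataset or any Nexdata tooling. The demo clips are generated in memory because the dataset's file layout is not described here.

```python
import io
import wave

# Expected values taken from the format line of this description.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1            # mono


def matches_spec(wav_file):
    """Return True if the WAV header says 16 kHz, 16-bit, mono, uncompressed.

    wav_file may be a file path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wf:
        return (
            wf.getframerate() == EXPECTED_RATE_HZ
            and wf.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wf.getnchannels() == EXPECTED_CHANNELS
            and wf.getcomptype() == "NONE"
        )


def make_silence(rate_hz, channels, seconds=0.1):
    """Build an in-memory 16-bit WAV of silence (demo input only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)
        wf.setframerate(rate_hz)
        wf.writeframes(b"\x00\x00" * int(rate_hz * seconds) * channels)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(matches_spec(make_silence(16000, 1)))  # clip in the stated format
    print(matches_spec(make_silence(44100, 2)))  # clip in a different format
```

To check a real recording, pass its path instead, for example `matches_spec("some_clip.wav")`, where the file name is a placeholder. The check reads only the header, so it does not confirm the recording environment or transcription quality.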