Mandarin Chinese Multi-Stream Speech Dataset – 294 Speakers, 203 Hours

paralanguage speech dataset
Mandarin speech synthesis corpus
Chinese speech synthesis dataset
spontaneous dialogue speech synthesis
annotated speech synthesis dataset
dialogue speech synthesis dataset
multi-stream speech synthesis dataset
Chinese paralanguage dataset
spontaneous dialogue dataset
multi-stream speech corpus

This Mandarin Chinese speech synthesis dataset contains 203 hours of audio from 294 speakers, gender-balanced (144 females and 150 males) and aged 18 to 60. Speakers record free-form dialogues on given topics, and within each conversation each speaker's audio is stored in a separate WAV file. Professional linguists have annotated 16 types of paralanguage, along with text transcriptions, timestamps, and other information, to support research and development in speech synthesis and paralanguage.

Paid Datasets
This is a paid dataset for commercial use, research, and other purposes. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Format: 48 kHz, 24-bit, WAV, mono
Recording condition: Recording studio
Content category: Spontaneous dialogue on given topics
Speaker: 294 people (non-professional voice actors) in total, gender-balanced (144 females and 150 males), 18 to 60 years old
Features of annotation: 16 kinds of paralanguage annotation; text transcription; speaker ID; special symbols
Recording device: Microphone
Language: Mandarin Chinese
Country: China (CHN)
Language (Region) Code: zh-CN
Accuracy: Character accuracy rate of 99%
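
The audio format listed in the specifications can be checked against a delivered file before any processing. The following is a minimal sketch using Python's standard-library `wave` module; the function name `check_wav_format` and the constant names are illustrative and not part of the dataset. It assumes plain PCM WAV files. Files with other header variants may need a third-party reader, depending on the Python version.

```python
import io
import wave

# Expected values taken from the "Format" entry of the specifications.
EXPECTED_RATE_HZ = 48_000
EXPECTED_SAMPLE_WIDTH_BYTES = 3  # 24 bit
EXPECTED_CHANNELS = 1  # mono


def check_wav_format(source):
    """Return a list of mismatches against the spec (empty if the file conforms).

    `source` may be a file path or a binary file-like object.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

Because each speaker in a conversation is stored in a separate file, a check like this would be run once per speaker file rather than once per conversation.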
Sample
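  • Audio

    有的有的,<P>它那种枪战类型的游戏<M/>呢</M>,考的就是肌肉的反应能力和思维的敏捷能力。
    (English, annotation tags omitted: Yes, yes. That kind of shooter game tests your muscle reflexes and mental agility.)

  • Audio

    那<D/>你</D>如<D/>果</D>要介绍<P>是比方有朋友找你,你会推荐他去吃这个<M/>吗</M>?
    (English, annotation tags omitted: So if you were recommending it, say a friend came to you, would you recommend that they go and eat this?)

  • Audio

    <V>他现在已经透了一些花絮出来了,我看见<R/>抖音抖音</R>上面已经有了。
    (English, annotation tags omitted: They have already leaked some behind-the-scenes clips; I saw they are already up on Douyin, Douyin.)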
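
The sample transcripts above mix transcribed text with inline annotation tags in three shapes: `<P>`, `<M/>`, and `</M>`. The following is a minimal Python sketch for separating the two, based only on the tag shapes visible in these samples. The page does not define what each label (P, M, D, V, R) means, so the code treats labels as opaque. The function names `strip_tags` and `count_tag_labels` are illustrative, and the full dataset may use tags this pattern does not cover.

```python
import re
from collections import Counter

# Matches the three tag shapes seen in the samples: <P>, <M/>, </M>.
# Group 1 is "/" for a closing tag; group 2 is the label.
TAG_PATTERN = re.compile(r"<(/?)([A-Z]+)(/?)>")


def strip_tags(line):
    """Remove annotation tags, keeping only the transcribed text."""
    return TAG_PATTERN.sub("", line)


def count_tag_labels(line):
    """Count how often each label occurs, skipping closing tags like </M>."""
    return Counter(
        match.group(2)
        for match in TAG_PATTERN.finditer(line)
        if not match.group(1)
    )
```

Stripping the tags yields plain text for a text front end, while the label counts give a quick view of which annotations a line carries.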
