
Mandarin Chinese Multi-Stream Speech Dataset – 294 Speakers, 203 Hours

paralanguage speech dataset, Mandarin speech synthesis corpus, Chinese speech synthesis dataset, spontaneous dialogue speech synthesis, annotated speech synthesis dataset, dialogue speech synthesis dataset, multi-stream speech synthesis dataset, Chinese paralanguage dataset, spontaneous dialogue dataset, multi-stream speech corpus

This Mandarin Chinese speech synthesis dataset contains 203 hours of audio from 294 speakers (144 female, 150 male) aged 18 to 60. Speakers record free-form dialogues on given topics, and within each conversation every speaker's audio is stored in a separate WAV file. Professional linguists have annotated 16 types of paralanguage along with text transcriptions, timestamps, and other information, to support research and development in speech synthesis and paralanguage.

This is a paid dataset for commercial use, research, and other purposes. Licensed, ready-made datasets help jump-start AI projects.

Specifications

Format: 48 kHz, 24-bit, WAV, mono
Recording condition: Recording studio
Content category: Spontaneous dialogue on given topics
Speaker: 294 people (non-professional voice actors) in total, gender balanced (144 females and 150 males), 18 to 60 years old
Features of annotation: 16 kinds of paralanguage annotation; text transcription; speaker ID; special symbols
Recording device: Microphone
Language: Mandarin Chinese
Country: China (CHN)
Language (Region) Code: zh-CN
Accuracy: Character Accuracy Rate 99%
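
The format row above (48 kHz, 24-bit, mono WAV) can be checked against a delivered file before training. The following is a minimal sketch using Python's standard `wave` module; the function names and the `EXPECTED` dictionary are hypothetical helpers, not part of any Nexdata tooling. Some 24-bit files use an extended WAV header that older versions of the `wave` module may reject, in which case a third-party reader would be needed.

```python
import wave

# Expected header values, taken from the Format row of the specifications.
# 24-bit PCM means 3 bytes per sample.
EXPECTED = {"sample_rate": 48000, "sample_width_bytes": 3, "channels": 1}


def describe_wav(path):
    """Read basic header fields from a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        frame_rate = wf.getframerate()
        return {
            "sample_rate": frame_rate,
            "sample_width_bytes": wf.getsampwidth(),
            "channels": wf.getnchannels(),
            "duration_s": wf.getnframes() / frame_rate,
        }


def matches_spec(info, expected=EXPECTED):
    """Return True when every expected header field matches."""
    return all(info[key] == value for key, value in expected.items())
```

Summing `duration_s` over all files of one speaker gives a quick way to compare delivered audio against the advertised 203 hours.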

Sample


有的有的，<P>它那种枪战类型的游戏<M/>呢</M>，考的就是肌肉的反应能力和思维的敏捷能力。
那<D/>你</D>如<D/>果</D>要介绍<P>是比方有朋友找你，你会推荐他去吃这个<M/>吗</M>？
<V>他现在已经透了一些花絮出来了，我看见<R/>抖音抖音</R>上面已经有了。
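
The sample transcripts mix plain text with inline paralanguage markers. Judging only from these three lines, there appear to be two shapes: a point marker such as `<P>` or `<V>`, and a span marker such as `<M/>呢</M>` that wraps a stretch of text. The sketch below assumes that grammar (single uppercase letter, `<X/>` opening a span and `</X>` closing it); the page does not define the tag set for this dataset, so the letters are treated as opaque labels and `parse_transcript` is a hypothetical helper.

```python
import re

# Assumed grammar, inferred from the sample lines only:
#   span marker:  <X/>text</X>
#   point marker: <X>
TAG_RE = re.compile(r"<([A-Z])/>(.*?)</\1>|<([A-Z])>")


def parse_transcript(line):
    """Return (plain_text, events) for one annotated transcript line.

    events is a list of (label, covered_text) tuples in reading order;
    covered_text is empty for point markers.
    """
    events = []

    def replace(match):
        if match.group(1) is not None:
            # Span marker: keep the wrapped text, record what it covers.
            events.append((match.group(1), match.group(2)))
            return match.group(2)
        # Point marker: record it and drop it from the text.
        events.append((match.group(3), ""))
        return ""

    return TAG_RE.sub(replace, line), events
```

Separating the clean text from the marker list keeps the transcript usable for a standard TTS front end while preserving the paralanguage labels for models that condition on them.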

Recommended Datasets


40 People - Multi-level Control Multi-emotional Paralanguage Annotated Speech Synthesis Corpus

Recorded by native professional voice actors and actresses. The recordings cover multi-level control, multi-emotional, single-emotional, single-tone, emotional shift, and paralanguage content. Professional phoneticians took part in the annotation. The corpus is designed for speech synthesis research and development.

TTS Multi-level Control Multi-emotional Paralanguage emotional shift

Chinese Expressive Narration Speech Synthesis Dataset – 4 Speakers

Chinese Expressive Narration Speech Synthesis Dataset, recorded by 4 professional character voice actors. Given book-based content, the speakers read in a highly expressive narration style suitable for audiobook-like TTS generation. This dataset supports expressive TTS, storytelling voice models, audiobook synthesis, and emotion-rich speech generation.

Chinese speech synthesis dataset Mandarin speech dataset expressive narration TTS dataset Chinese expressive speech dataset narration speech corpus Chinese audiobook dataset character voice speech dataset

Chinese Emotional Speech Dataset – 4 Speakers, Multi-Style Voices

This is a Chinese speech synthesis dataset recorded by 4 professional character-voice actors, covering multiple speaking styles (e.g. authoritative female boss, straightforward prince, nimble maid, kind elderly woman) and emotions including disdain, anger, happiness, concern, surprise, gasp of fear, cold snort (disdain), sympathy, laughter, inner thoughts, seriousness, disgust, puzzlement, sadness, and neutrality. The dataset is ideal for building expressive text-to-speech (TTS), voice acting, character-based narration, emotion-aware speech generation, and related AI voice applications.

Chinese speech synthesis dataset Mandarin speech dataset narration speech corpus Chinese expressive speech dataset character voice speech dataset Chinese TTS dataset

100 Speakers Chinese Speech Synthesis Dataset & Multi-Emotion

This dataset is recorded by 100 professional Chinese voice actors. It not only includes sentences rich in modal particles that align with daily expression habits, but also encompasses free conversation data on given topics. Each speaker’s audio is stored in a separate track. All recordings are annotated by professional phoneticians with text, timestamps, and prosody details, meeting the precise requirements for speech synthesis, emotion recognition, and prosody modeling research.

Chinese emotional speech data Chinese conversational speech corpus Chinese natural conversation dataset Chinese prosody dataset

2 Speakers – Korean TTS Dataset with Native Accent

This dataset contains recordings from 2 native Korean speakers with authentic accents. It covers news and general colloquial text with balanced phoneme coverage. Professional phoneticians took part in the annotation. It is suited to research and development in text-to-speech, Korean speech synthesis, and AI voice applications.

Korean speech dataset Korean TTS dataset Korean speech synthesis corpus Korean voice dataset for AI Korean accent speech corpus Korean text-to-speech dataset Korean speech recordings for TTS

14 Hours Taiwan Mandarin TTS Dataset – Multi-Style Voices

This dataset contains 14 hours of Taiwan Mandarin recordings from 4 professional voice actors in 7 speaking styles: criminal subordinate, rough man, little girl, kind grandma, businessman, grandfather, and non-commissioned officer. Professional phoneticians took part in the annotation. It is ideal for text-to-speech (TTS), expressive voice generation, virtual avatars, and AI speech synthesis applications.

Taiwan Mandarin speech dataset Taiwan Mandarin voice dataset Taiwan Mandarin speech corpus for AI Mandarin accent dataset Taiwan Mandarin TTS dataset

40 People - Multi-style Average Tone Speech Synthesis Corpus-Customer Service

Recorded by professional character voice actors. The 40 styles are: Beijing dialect, film commentary, Hua Fei, documentary commentary, food commentary, novel commentary, middle-aged and young magnetic man, Dufei, Sun Wukong, warm male voice, TVB female voice, Guangxi Cousin, Lin Daiyu, Nezha, Ruilai, Shen Gongbao, Taiyi Zhenren, gentle Peach, Xu Zhisheng, Jay Chou, Henan uncle, Bubble Sound, Rong Nannu, Tianjin Young man, Innocent child, righteous male voice, roaring guy, Si Lang, Sea Yaksha, saleswoman's voice, SpongeBob, Tang Monk, duck voice, sunny male voice, eunuch from the Eastern Depot, news anchor's voice, well-behaved child's voice, sweet and soft child's voice, male audiobook narrator, female audiobook narrator.

Synthesis Corpus TTS Mandarin Chinese Multi-style

2 People - Chinese Natural Conversation Speech Synthesis Corpus

Recorded by native Chinese speakers in a natural conversation style, with balanced phonemes and tones. Professional phoneticians took part in the annotation and annotated secondary language with the following codes: Inhalation: V; Pause: P; Hesitation: T; Mouth clicking: M; Drawl: D; Cough: C; Laughter: L; Stutter repetition: R; Inversion: I; Modal particle: S (modal particles include "ah", "oh", "wow", "right?", "what?", "well", etc.). The corpus is designed for speech synthesis research and development.

Natural conversation Secondary language TTS


Copyright © 2023 NEXDATA TECHNOLOGY INC
