8 Hours – Cantonese Speech Dataset for TTS (Hong Kong)

Cantonese speech dataset

Hong Kong Cantonese speech corpus

Cantonese text-to-speech dataset

Cantonese voice dataset for AI

native Cantonese speech recordings

Cantonese TTS dataset

Hong Kong accent speech dataset

This dataset features recordings from 4 native Hong Kong Cantonese speakers. The corpus contains educational, game, and general colloquial content. The phoneme coverage is balanced. Professional phoneticians participate in the annotation. It precisely matches the research and development needs of speech synthesis.

This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.

Recommended Datasets

4 People - Chinese High-expressivity Narration Average Tone Speech Synthesis Corpus

The 4 People - Chinese High-expressivity Narration Average Tone Speech Synthesis Corpus is recorded by professional voice actors. Given a book, each speaker reads it in a highly expressive narration style.

High-expressivity Narration TTS Chinese

5 People - Multi-style And Multi-emotional Average Tone Speech Synthesis Corpus

The 5 People - Multi-style And Multi-emotional Average Tone Speech Synthesis Corpus is recorded by professional voice actors. It covers four styles: the capable female boss, the straightforward prince, the nimble maid, and the kind elderly lady. Emotions include disdain, anger, happiness, concern, surprise, gasp of fear, cold snort (disdain), sympathy, laughter, inner thoughts, seriousness, disgust, puzzlement, sadness, and neutrality.

Synthesis Corpus TTS Mandarin Chinese Multi-style Multi-emotional

100 Speakers Chinese Speech Synthesis Dataset & Multi-Emotion

This dataset is recorded by 100 professional Chinese voice actors. It not only includes sentences rich in modal particles that align with daily expression habits, but also encompasses free conversation data on given topics. Each speaker’s audio is stored in a separate track. All recordings are annotated by professional phoneticians with text, timestamps, and prosody details, meeting the precise requirements for speech synthesis, emotion recognition, and prosody modeling research.

Chinese emotional speech data, Chinese conversational speech corpus, Chinese natural conversation dataset, Chinese prosody dataset

Mandarin Chinese Multi-Stream Speech Dataset – 294 Speakers, 203 Hours

This Mandarin Chinese speech synthesis dataset features 294 speakers and 203 hours of audio in total. It is gender balanced, with 144 female and 150 male speakers aged 18 to 60. Each speaker records free-form dialogues based on given topics, and in each conversation every speaker's audio is stored in a separate WAV file. Professional linguists have annotated 16 types of paralanguage, along with text transcriptions, timestamps, and other information, to accurately match the research and development needs of speech synthesis and paralanguage research.

paralanguage speech dataset, Mandarin speech synthesis corpus, Chinese speech synthesis dataset, spontaneous dialogue speech synthesis, annotated speech synthesis dataset, dialogue speech synthesis dataset, multi-stream speech synthesis dataset, Chinese paralanguage dataset, spontaneous dialogue dataset, multi-stream speech corpus

Mandarin Chinese Speech Synthesis Dataset – 370 Speakers, 200 Hours

This dataset contains 200 hours of natural conversation audio recorded by 370 native Chinese speakers. Professional phoneticians annotated 14 kinds of paralanguage, full transcriptions, and speaker metadata. It precisely matches the research and development needs of speech synthesis, dialogue TTS, and natural language modeling research.

Chinese paralanguage dataset, spontaneous dialogue dataset, Chinese conversational speech corpus, Mandarin speech synthesis corpus, Chinese speech synthesis dataset

Cantonese TTS Dataset – 4 Native Speakers, 20+ Hours

This Cantonese speech synthesis corpus includes recordings from 4 native speakers from Guangdong. The corpus contains educational, game, and general colloquial content. The phoneme coverage is balanced. Professional phoneticians participate in the annotation. It precisely matches the research and development needs of speech synthesis.

Cantonese audio dataset, Cantonese TTS dataset, native Cantonese speech recordings, Cantonese voice dataset for AI, Cantonese speech dataset

2 Speakers – Korean TTS Dataset with Native Accent

This dataset contains recordings from 2 native Korean speakers with authentic accents. It includes news and general colloquial content, and the phoneme coverage is balanced. Professional phoneticians participate in the annotation. It precisely matches research and development needs in text-to-speech, Korean speech synthesis, and AI voice applications.

Korean speech dataset, Korean TTS dataset, Korean speech synthesis corpus, Korean voice dataset for AI, Korean accent speech corpus, Korean text-to-speech dataset, Korean speech recordings for TTS

14 Hours Taiwan Mandarin TTS Dataset – Multi-Style Voices

This dataset contains 14 hours of Taiwan Mandarin recordings from 4 professional voice actors in 7 speaking styles: criminal subordinate, rough man, little girl, kind grandma, businessman, grandfather, and non-commissioned officer. Professional phoneticians participate in the annotation. It is ideal for text-to-speech (TTS), expressive voice generation, virtual avatars, and AI speech synthesis applications.

Taiwan Mandarin speech dataset, Taiwan Mandarin voice dataset, Taiwan Mandarin speech corpus for AI, Mandarin accent dataset, Taiwan Mandarin TTS dataset