
Cantonese Speech Data

From: Nexdata Date: 2023-10-20

Cantonese, a major dialect of the Chinese language, is widely spoken in regions such as Hong Kong, Macau, and parts of southern China. As the global demand for speech recognition technology continues to grow, there is increasing interest in the development of Cantonese speech recognition systems. This article explores the advancements in Cantonese speech recognition, its significance, challenges, and potential applications.


Cantonese is a tonal language known for its complex pronunciation and intonation. The ability to accurately transcribe and understand Cantonese speech has numerous implications:


Accessibility: Cantonese speech recognition technology improves accessibility for Cantonese speakers with visual or motor impairments. It enables them to interact with digital devices and content more effectively.


Multilingual Communication: Cantonese is a vital language for business and cultural exchange in the global market. Speech recognition can facilitate communication between Cantonese speakers and those who speak other languages.


Cultural Preservation: Cantonese is not only a means of communication but also an integral part of the cultural heritage of its speakers. Preserving and promoting the language is essential, and speech recognition can play a role in this endeavor.


Challenges in Cantonese Speech Recognition


1. Tonal Complexity

Cantonese is a tonal language, and the meaning of a word can change based on its tone. Accurately capturing and distinguishing these tonal nuances remains a significant challenge.
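As a small illustration of how tone alone changes meaning, the sketch below uses Jyutping romanization, in which a trailing digit from 1 to 6 marks the tone. The six example words share the syllable "si" and differ only in tone. The names SI_EXAMPLES and split_jyutping are hypothetical, introduced here for illustration only; they are not part of any Nexdata dataset or tool.

```python
import re
from typing import Tuple

# Six Cantonese words that share the base syllable "si" and differ only in tone.
# Keys are Jyutping syllables; the trailing digit is the tone number.
# (Illustrative table, assumed for this example.)
SI_EXAMPLES = {
    "si1": ("詩", "poem"),
    "si2": ("史", "history"),
    "si3": ("試", "to try"),
    "si4": ("時", "time"),
    "si5": ("市", "market"),
    "si6": ("事", "matter"),
}

# A lowercase base syllable followed by a single tone digit 1-6.
_JYUTPING = re.compile(r"^([a-z]+)([1-6])$")


def split_jyutping(syllable: str) -> Tuple[str, int]:
    """Split a Jyutping syllable such as 'si3' into its base and tone: ('si', 3)."""
    match = _JYUTPING.match(syllable)
    if match is None:
        raise ValueError(f"not a Jyutping syllable with a tone digit: {syllable!r}")
    return match.group(1), int(match.group(2))


if __name__ == "__main__":
    for syllable, (character, gloss) in SI_EXAMPLES.items():
        base, tone = split_jyutping(syllable)
        print(f"{character}  base={base}  tone={tone}  meaning={gloss}")
```

A recognizer that discards the tone digit would map all six words to the same string "si", which is why tone has to be modeled explicitly rather than treated as noise.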


2. Dialectal Variations

Cantonese can vary significantly across regions, making it challenging for speech recognition systems to understand the various sub-dialects and accents.


3. Limited Resources

Despite growing interest, Cantonese speech recognition research still lags behind more widely spoken languages. The limited availability of resources and research hinders progress.


Nexdata Cantonese Speech Data

1,652 Hours - Cantonese Dialect Speech Data by Mobile Phone

This dataset contains speech from 4,888 speakers from Guangdong Province, recorded in quiet indoor environments. The recorded content covers 500,000 commonly used spoken sentences, including high-frequency words in weico and everyday expressions. The average number of repetitions is 1.5 and the average sentence length is 12.5 words. The recording devices are mainstream Android phones and iPhones.


607 Hours - Cantonese Conversational Speech Data by Mobile Phone and Voice Recorder

The 607-hour Cantonese Conversational Speech Data involves 995 native speakers. Speakers chose a few familiar topics from a given list and held conversations on them, which keeps the dialogues fluent and natural. The recording devices are various mobile phones and professional audio recorders. The audio format is 16 kHz, 16-bit, uncompressed WAV, and all the speech data was recorded in quiet indoor environments. All the speech audio was manually transcribed. The start and end times of each valid sentence, the speaker identity, and other attributes are also annotated. The sentence accuracy rate is ≥ 95%.
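When working with audio delivered in this format, it can be useful to confirm that each file really is 16 kHz, 16-bit, uncompressed WAV before training on it. The following is a minimal sketch using Python's standard wave module. The function name check_wav_format and the file name in the usage note are hypothetical, and the sketch makes no assumption about channel count, since the description above does not state it.

```python
import wave

EXPECTED_RATE_HZ = 16_000        # 16 kHz sample rate
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bit = 2 bytes per sample


def check_wav_format(path):
    """Return (ok, info) for a WAV file.

    ok is True when the file is uncompressed PCM at 16 kHz with 16-bit samples.
    info reports the sample rate, bit depth, channel count, and duration in seconds.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        frames = wav.getnframes()
        compression = wav.getcomptype()  # "NONE" for uncompressed PCM

    info = {
        "sample_rate_hz": rate,
        "bit_depth": width * 8,
        "channels": channels,
        "duration_s": frames / rate if rate else 0.0,
    }
    ok = (
        rate == EXPECTED_RATE_HZ
        and width == EXPECTED_SAMPLE_WIDTH_BYTES
        and compression == "NONE"
    )
    return ok, info
```

For example, check_wav_format("sample.wav") returns a pass/fail flag together with the measured properties, so files that do not match can be logged or resampled rather than silently mixed into a training set.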


38 People - Hong Kong Cantonese Average Tone Speech Synthesis Corpus

The 38 People - Hong Kong Cantonese Average Tone Speech Synthesis Corpus is recorded by native Hong Kong speakers. Professional phoneticians take part in the annotation. The corpus is designed to match the research and development needs of speech synthesis.