From: Nexdata Date: 2024-08-15
With the rapid development of AI technology, datasets have become a core factor in improving the performance of intelligent systems. The variety and accuracy of datasets determine how well AI models learn and perform. When training an intelligent system, large amounts of real-world data are indispensable. Collecting and labeling data scientifically helps AI models produce accurate results in real applications, reduces the rate of misjudgment, and improves user experience and system efficiency.
From map navigation, voice assistants, and news reading to smart customer service, call centers, and public announcements, text-to-speech (TTS) applications are everywhere in daily life.
Apart from text-to-speech, the research scope of speech synthesis technology also includes singing synthesis, whisper synthesis, dialect synthesis, animal sound synthesis, and more. At present, speech synthesis technology has been successfully applied in many fields.
Unlike traditional TTS broadcast synthesis, personalized TTS applications are becoming more and more popular. Drawing on extensive experience in speech and text data annotation, Nexdata provides high-quality, multi-scenario, and multi-category speech synthesis data solutions.
100 People — Chinese Mandarin Average Tone Speech Synthesis Corpus, General
The corpus is recorded by native Chinese speakers. It covers news, dialogue, audiobooks, poetry, advertising, news broadcasting, and entertainment, and the phonemes and tones are balanced. The word accuracy rate is not less than 99.9%, the phoneme accuracy rate is not less than 99%, and the prosodic accuracy rate is not less than 98%.
19.46 Hours — American English Speech Synthesis Corpus-Female
The corpus is recorded by native American English speakers, with an authentic accent and a sweet voice. The phoneme coverage is balanced. The word accuracy rate is not less than 99%, the phoneme accuracy rate is not less than 98%, and the prosodic accuracy rate is not less than 98%.
10 Hours — Chinese Mandarin Synthesis Corpus-Female, Customer Service
The corpus is recorded by native Chinese speakers, with a lively and friendly voice. The phoneme coverage is balanced. The word accuracy rate is not less than 99.8%, the phoneme accuracy rate is not less than 98%, and the syllable boundary accuracy is not less than 98%.
6.78 Hours — Chinese Mandarin Speech Synthesis Corpus-Female Imitating Children
The corpus is recorded by native Chinese speakers, with an authentic accent and a sweet voice. The phoneme coverage is balanced. The word accuracy rate is not less than 99%.
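The accuracy figures quoted for each corpus are minimum thresholds. As a rough illustration of how a buyer might spot-check a delivered sample against such a threshold, here is a minimal Python sketch. It assumes the accuracy rate is simply the share of checked units (words, phonemes, or prosodic marks) that contain no error; the article does not define the metric, so this definition, the function names, and the sample counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccuracySpec:
    """A minimum accuracy requirement, e.g. 'word accuracy >= 99.9%'."""
    name: str
    min_rate: float  # minimum acceptable accuracy as a fraction (0.999 = 99.9%)


def accuracy_rate(total_units: int, error_units: int) -> float:
    """Share of checked units without errors.

    Assumption: accuracy = (total - errors) / total. The source article
    does not state how its accuracy rates are computed.
    """
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    if not 0 <= error_units <= total_units:
        raise ValueError("error_units must be between 0 and total_units")
    return (total_units - error_units) / total_units


def meets_spec(total_units: int, error_units: int, spec: AccuracySpec) -> bool:
    """True if the measured accuracy reaches the spec's minimum rate."""
    return accuracy_rate(total_units, error_units) >= spec.min_rate


# Hypothetical spot check: 10,000 sampled words against a 99.9% word-accuracy floor.
WORD_SPEC = AccuracySpec(name="word accuracy", min_rate=0.999)

if __name__ == "__main__":
    for errors in (8, 12):
        ok = meets_spec(10_000, errors, WORD_SPEC)
        print(f"{errors} errors in 10,000 words -> meets {WORD_SPEC.name} spec: {ok}")
```

With these made-up counts, 8 errors in 10,000 words passes the 99.9% floor and 12 errors does not. A real acceptance check would follow whatever sampling and scoring procedure is agreed with the data provider.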
With the rapid development of speech synthesis technology, the speech generated by TTS will become more and more natural and vivid. We firmly believe that technological progress will continue to break through existing obstacles and bring more convenience to our daily lives.
If you need data services, please feel free to contact us: info@nexdata.ai
Faced with growing demand for data, companies and researchers need to constantly explore new data collection and annotation methods. Only by continuously improving data quality can AI technology keep up with fast-changing market demands. As data-driven intelligence develops ever faster, we have reason to look forward to a more efficient, intelligent, and secure future.