42 People - Chinese Mandarin Multi-emotional Synthesis Corpus
Chinese
Emotional
Multi-emotional
tts
Synthesis
Corpus
42 People - Chinese Mandarin Multi-emotional Synthesis Corpus. It is recorded by native Chinese speakers of different ages and genders. The text covers seven emotions, and the syllables, phonemes and tones are balanced. Professional phoneticians participate in the annotation. It closely matches the research and development needs of speech synthesis.
This is a paid dataset for commercial use, research purposes and more. Licensed, ready-made datasets help jump-start AI projects.
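The sample transcripts published for this corpus follow each word group with a prosodic boundary tag (`#1`, `#3`, `#4`) and end with tone-numbered pinyin. The sketch below shows one way such a line could be split into its parts. It is not official corpus tooling: the function names, the rule that pinyin begins at the first lowercase ASCII letter, and the one-syllable-per-character check are all assumptions made for illustration. The meaning of the numeric levels is not documented on this page, so the code only records them.

```python
import re

BOUNDARY_TAG = re.compile(r"#(\d)")           # e.g. "#1", "#3", "#4"
PINYIN_SYLLABLE = re.compile(r"[a-z]+[1-5]")  # e.g. "jing3", "de5"
NON_HANZI = re.compile(r"[^\u4e00-\u9fff]")   # punctuation, spaces, etc.


def parse_annotation(line):
    """Return (chunks, syllables) for one annotation line.

    chunks is a list of (hanzi, boundary_level) pairs; boundary_level is
    None for trailing text that has no tag after it.
    """
    # Assumption: the pinyin part starts at the first lowercase ASCII letter.
    first_letter = re.search(r"[a-z]", line)
    split_at = first_letter.start() if first_letter else len(line)
    tagged_text, pinyin_text = line[:split_at], line[split_at:]

    chunks = []
    position = 0
    for tag in BOUNDARY_TAG.finditer(tagged_text):
        # Each tag closes the text before it; drop punctuation from that text.
        hanzi = NON_HANZI.sub("", tagged_text[position:tag.start()])
        if hanzi:
            chunks.append((hanzi, int(tag.group(1))))
        position = tag.end()
    leftover = NON_HANZI.sub("", tagged_text[position:])
    if leftover:
        chunks.append((leftover, None))

    return chunks, PINYIN_SYLLABLE.findall(pinyin_text)


def is_aligned(chunks, syllables):
    """Assumed sanity check: one pinyin syllable per Chinese character."""
    return sum(len(hanzi) for hanzi, _ in chunks) == len(syllables)


# Transcript of sample file 500034.wav from the product page.
sample = (
    "警察局#1是#1你们家#1开的吗#3?这么#1嚣张#4!"
    "jing3 cha2 ju2 shi4 ni3 men5 jia1 kai1 de5 ma5 zhe4 me5 xiao1 zhang1"
)
chunks, syllables = parse_annotation(sample)
print(chunks)
print(syllables)
print(is_aligned(chunks, syllables))
```

The alignment check is only a rough filter. Lines containing erhua or non-Chinese tokens could legitimately fail it, so a mismatch is a prompt for manual review rather than proof of an annotation error.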
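Specifications
Format
48,000 Hz, 24-bit, uncompressed WAV, mono channel. The neutral-emotion speaker data in the category1 directory is selected from the “100 People - Chinese Mandarin Average Tone Speech Synthesis Corpus, General” and has a bit depth of 16 bits; all other data is unique to this project and has a bit depth of 24 bits.
Recording environment
professional recording studio
Recording content
seven emotions (neutral, happiness, anger, sadness, surprise, fear, disgust)
Speaker
42 persons, different age groups and genders
Device
microphone
Language
Mandarin
Annotation
word and pinyin transcription, prosodic boundary annotation
Application scenarios
speech synthesis
The amount of data
140 minutes per person, 20 minutes per emotion
Samples
500034.wav
警察局#1是#1你们家#1开的吗#3?这么#1嚣张#4!
jing3 cha2 ju2 shi4 ni3 men5 jia1 kai1 de5 ma5 zhe4 me5 xiao1 zhang1
400036.wav
我结婚#1两年#1被打#1三次#3,我害怕#4。
wo3 jie2 hun1 liang3 nian2 bei4 da3 san1 ci4 wo3 hai4 pa4
100304.wav
天空#1很高#3,风#1很清澈#3,从头#1到脚#3都#1快乐#4。
tian1 kong1 hen3 gao1 feng1 hen3 qing1 che4 cong2 tou2 dao4 jiao3 dou1 kuai4 le4
300001.wav
我#1真心的#1付出#3却#1不是#1你#1要的#1幸福#4。
wo3 zhen1 xin1 de5 fu4 chu1 que4 bu2 shi4 ni3 yao4 de5 xing4 fu2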
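Because the corpus mixes 16-bit files (category1) with 24-bit files, it can help to verify each file's header before training. The following is a minimal sketch using Python's standard `wave` module. The helper names are hypothetical, and the rule that neutral data sits under a path component named exactly `category1` is an assumption drawn from the Format description, not a documented layout. The demo writes a silent file so the check can run without corpus data.

```python
import os
import tempfile
import wave
from pathlib import PurePath

EXPECTED_RATE = 48_000   # Hz, from the Format specification
EXPECTED_CHANNELS = 1    # mono


def expected_bit_depth(path):
    """16 bits for files under a 'category1' directory, 24 bits otherwise."""
    # Assumption: the directory name appears verbatim as a path component.
    return 16 if "category1" in PurePath(path).parts else 24


def describe_wav(path):
    """Read the header fields of a PCM WAV file."""
    # Note: files with WAVE_FORMAT_EXTENSIBLE headers may not open on
    # Python versions older than 3.12.
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,
            "seconds": wav.getnframes() / rate,
        }


def matches_spec(path):
    """True if the file has the rate, channel count and bit depth stated above."""
    info = describe_wav(path)
    return (
        info["sample_rate"] == EXPECTED_RATE
        and info["channels"] == EXPECTED_CHANNELS
        and info["bit_depth"] == expected_bit_depth(path)
    )


def write_silence(path, rate, sample_width, seconds=1):
    """Demo helper: write a silent mono PCM file of the given length."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)   # bytes per sample: 2 = 16-bit, 3 = 24-bit
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (sample_width * rate * seconds))


with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.wav")
    write_silence(demo, rate=48_000, sample_width=3)
    print(describe_wav(demo), matches_spec(demo))
```

Summing the `seconds` field per speaker and emotion would also give a quick comparison against the stated 20 minutes per emotion.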
20 People - Chinese Mandarin Multi-emotional Synthesis Corpus
20 People - Chinese Mandarin Multi-emotional Synthesis Corpus. It is recorded by native Chinese speakers of different ages and genders. The texts for the seven emotions are all taken from novels, and the syllables, phonemes and tones are balanced. Professional phoneticians participate in the annotation. It closely matches the research and development needs of speech synthesis.
Chinese, Emotional, Multi-emotional, tts, Synthesis, Corpus
38 People - Hong Kong Cantonese Average Tone Speech Synthesis Corpus
38 People - Hong Kong Cantonese Average Tone Speech Synthesis Corpus. It is recorded by native speakers from Hong Kong. Professional phoneticians participate in the annotation. It closely matches the research and development needs of speech synthesis.
Synthesis Corpus, TTS, Female, General, Male, english
12.6 Hours Chinese Mandarin Speech Synthesis Corpus - Male, Audiobook
12.6 Hours Chinese Mandarin Speech Synthesis Corpus - Male, Audiobook. It is recorded by native Chinese speakers with a magnetic voice. The phoneme coverage is balanced. Professional phoneticians participate in the annotation. It closely matches the research and development needs of speech synthesis.
12.6 Hours - Chinese Mandarin Synthesis Corpus-Female, Customer Service, Conversational Speech. It is recorded by native Chinese speakers with a sweet voice. Professional phoneticians participate in the annotation. It closely matches the research and development needs of speech synthesis.
tts, conversational speech, female, customer service
10.4 Hours - Japanese Synthesis Corpus-Female
10.4 Hours - Japanese Synthesis Corpus-Female. It is recorded by a native Japanese speaker with an authentic accent. The phoneme coverage is balanced. Professional phoneticians participate in the annotation. It closely matches the research and development needs of speech synthesis.
tts, japan, female
20 Hours - American English Speech Synthesis Corpus-Male
Male American English audio data. It is recorded by native American English speakers with an authentic accent. The phoneme coverage is balanced. Professional phoneticians participate in the annotation. It closely matches the research and development needs of speech synthesis.
TTS, American English, Male
103 Chinese Mandarin Songs in Acapella - Female
103 Chinese Mandarin Songs in Acapella - Female. It is recorded by a professional Chinese singer with a sweet voice. Professional phoneticians participate in the annotation. It closely matches the research and development needs of song synthesis.
TTS, Mandarin, Female, Synthesis Corpus
10.1 Hours - Chinese Mandarin Synthesis Corpus-Female, Customer Service
10.1 Hours - Chinese Mandarin Synthesis Corpus-Female, Customer Service. It is recorded by native Chinese speakers with a lively and friendly voice. The phoneme coverage is balanced. Professional phoneticians participate in the annotation. It closely matches the research and development needs of speech synthesis.