Format: Modal particle: 48 kHz, 24-bit, WAV, mono; Natural conversation: 48 kHz, 24-bit, WAV, stereo (each speaker's speech on its own track)
Recording condition: Recording studio
Recording content: 1. Read texts containing modal particles in a natural way; 2. Hold a natural conversation on a given topic
Features of annotation: Transcription text
Device: Microphone
Speaker: 100 professional voice actors
Language: Chinese
Application scenarios: Speech synthesis
Tags: Chinese, Multi-emotional, Modal particle, Natural Conversation, Speech Synthesis, TTS
Sample recordings (file name: transcript, with English translation):
G00001_O1_M_0005.wav: 刚刚看到一只超可爱的小狗啊! (I just saw a super cute puppy!)
G00001_O2_F_0001.wav: 今天天气真好啊! (The weather is really nice today!)
demo1.wav: 对,呃那你先说一下咱们一起去的第一个城市武汉吧? (Right, uh, so why don't you start with Wuhan, the first city we went to together?)
demo4.wav: 嗯,那你说说吧,你说说你对武汉还有什么印象? (Mm, go on then, tell me what else you remember about Wuhan?)
demo5.wav: 嗯,我觉得他特别热心,他还给咱们建议说,嗯要避雷这个什么,那个怎么样,推荐去哪个,然后去推荐吃什么好吃的,然后他们就特别贴心的说这些,哎非常的感激,现在想一下觉得好温暖。 (Mm, I think he was really warm-hearted. He gave us advice, like what to avoid, how this or that was, where he would recommend going, and what good food to try. They were so thoughtful in telling us all this. I'm really grateful, and thinking back on it now feels so heartwarming.)
100 Speakers Chinese Speech Synthesis Dataset & Multi-Emotion
Chinese emotional speech data
Chinese conversational speech corpus
Chinese natural conversation dataset
Chinese prosody dataset
This dataset was recorded by 100 professional Chinese voice actors. It includes sentences rich in modal particles that reflect everyday speech, as well as free conversations on given topics. In the conversation recordings, each speaker's audio is stored on a separate track. All recordings are annotated by professional phoneticians with text, timestamps, and prosody details, supporting speech synthesis, emotion recognition, and prosody modeling research.
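Because each speaker in a conversation recording sits on a separate track of a stereo file, a common first step is to split the file into one mono file per speaker. The following is a minimal sketch using Python's standard `wave` module, assuming the files are plain PCM WAV at 48 kHz and 24-bit as listed. The function names, the output naming scheme, and the mapping of channel order to speakers are illustrative assumptions, not part of the dataset's documentation.

```python
import wave

EXPECTED_RATE = 48_000  # Hz, as stated in the listing
EXPECTED_WIDTH = 3      # bytes per sample (24-bit), as stated in the listing


def split_channels(frames: bytes, sample_width: int, channels: int) -> list[bytes]:
    """De-interleave raw PCM frames into one byte string per channel."""
    frame_size = sample_width * channels
    usable = len(frames) - len(frames) % frame_size  # ignore a trailing partial frame
    tracks = [bytearray() for _ in range(channels)]
    for start in range(0, usable, frame_size):
        for ch in range(channels):
            offset = start + ch * sample_width
            tracks[ch] += frames[offset:offset + sample_width]
    return [bytes(track) for track in tracks]


def split_conversation(path: str, out_prefix: str) -> list[str]:
    """Write one mono WAV per channel and return the output paths.

    Assumption: channel 1 and channel 2 each hold one speaker; which
    speaker is on which channel is not stated in the listing.
    """
    with wave.open(path, "rb") as src:
        params = src.getparams()
        if params.framerate != EXPECTED_RATE or params.sampwidth != EXPECTED_WIDTH:
            raise ValueError(
                f"unexpected format: {params.framerate} Hz, "
                f"{params.sampwidth * 8}-bit"
            )
        frames = src.readframes(params.nframes)

    out_paths = []
    tracks = split_channels(frames, params.sampwidth, params.nchannels)
    for index, track in enumerate(tracks, start=1):
        out_path = f"{out_prefix}_speaker{index}.wav"  # hypothetical naming scheme
        with wave.open(out_path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(params.sampwidth)
            dst.setframerate(params.framerate)
            dst.writeframes(track)
        out_paths.append(out_path)
    return out_paths
```

For example, `split_conversation("conversation.wav", "conversation")` would write `conversation_speaker1.wav` and `conversation_speaker2.wav`, where `conversation.wav` is a placeholder file name. The format check raises an error rather than resampling, so any file that deviates from the listed specification is surfaced early.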
This is a paid dataset for commercial use, research, and other purposes. Licensed, ready-made datasets help jump-start AI projects.