1,002 Hours - Kunming Dialect (China) Scripted Monologue Smartphone Speech Dataset
2,284 native speakers of the Kunming dialect with authentic accents. The recording text covers commonly used, interactive, in-car, and home-furnishing content. Native Kunming speakers checked the transcriptions for greater accuracy. The data can be applied to most Android and iOS mobile phones.
Key information: 2,284 people; 1,002 hours; multiple age groups, multiple categories
Tags: Chinese, Dialect, Kunming, Reading, Scripted Monologue
Sample transcriptions:
我想知道虹桥正荣府怎么走
他冲澡去了,你等下来找他。
想听英文歌
哪个放的啵?太臭了。
十二生肖电影下载[N]
Kunming Dialect (China) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, local expressions, human-machine interaction, smart home commands, numbers, and other domains. Transcribed with text content and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers (2,284 native Kunming speakers), which helps improve model performance on real-world, complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.
This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Format: 16 kHz, 16-bit, uncompressed WAV, mono channel
Recording condition: Low background noise (indoor), without echo
Content category: Generic domain; human-machine interaction; smart home command and control; numbers; local expressions
Recording device: Android smartphone, iPhone
Speaker: 2,284 people; 40% male and 60% female; 80% aged 16-25; speakers are from Kunming or the surrounding areas
Country: China (CHN)
Language: Kunming dialect
Features of annotation: Transcription text; special identifiers, noise
Accuracy rate: Sentence Accuracy Rate (SAR) 95% (noise symbols and other identifiers are excluded)
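As an illustration of how the format specification above might be checked on a delivered file, here is a minimal Python sketch using the standard-library `wave` module. The function name `check_wav_format` and the constant names are hypothetical and not part of the dataset or any vendor tooling; the expected values are taken from the "Format" entry.

```python
import wave

# Expected values taken from the "Format" specification above.
EXPECTED_SAMPLE_RATE = 16000  # 16 kHz
EXPECTED_SAMPLE_WIDTH = 2     # 16 bit = 2 bytes per sample
EXPECTED_CHANNELS = 1         # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the listed format.

    The wave module only reads uncompressed PCM, so a file that opens
    without raising wave.Error is already uncompressed.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_SAMPLE_RATE:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

Calling `check_wav_format("sample.wav")` (a placeholder file name) returns an empty list when the file matches the listed format and a short description of each mismatch otherwise.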
German (Germany) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers, and other domains. Transcribed with text content and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers (3,442 native German speakers in total), which helps improve model performance on real-world, complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.
Keywords: German audio data captured by mobile phone; German audio collection; German audio data
Italian (Italy) Scripted Monologue Smartphone speech dataset, collected from monologues based on given prompts, covering the oral, human-machine interaction, smart home and in-car command, numbers, and news domains. Transcribed with text content. The dataset was collected from an extensive and geographically diverse pool of speakers (3,109 native speakers), which helps improve model performance on real-world, complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.
Keywords: Italian voice data; mobile phone voice data; voice acquisition data
Japanese (Japan) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers, and other domains. Transcribed with text content and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers (1,245 speakers in total), which helps improve model performance on real-world, complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.
Hindi (India) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers, news, and other domains. Transcribed with text content and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers (1,425 native Indian speakers), which helps improve model performance on real-world, complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.
Keywords: Hindi voice data collected by mobile phone; Hindi voice collection; Hindi data
Changsha Dialect (China) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, local expressions, human-machine interaction, smart home and in-car commands, numbers, and other domains. Transcribed with text content and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers (2,301 native Changsha speakers), which helps improve model performance on real-world, complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.
Keywords: Changsha dialect voice data; Changsha dialect data; Changsha dialect voice; dialect recognition; reading voice
Wuhan Dialect (China) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, local expressions, human-machine interaction, smart home commands, numbers, and other domains. Transcribed with text content and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers (2,291 native Wuhan speakers), which helps improve model performance on real-world, complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.
Keywords: Wuhan dialect audio data captured by mobile phone; Wuhan dialect data; dialect audio data
English (India) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers, and other domains. Transcribed with text content and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers (2,100 native Indian speakers), which helps improve model performance on real-world, complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.
Keywords: Indian English audio data captured by mobile phone; Indian English collection; Indian English data
Japanese (Japan) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain. Transcribed with text content and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers (1,006 native Japanese speakers), which helps improve model performance on real-world, complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.
Keywords: Japanese data; Japanese audio data; basic recognition; Japanese reading audio