290 Hours - Korean (Korea) Spontaneous Dialogue Smartphone speech dataset
Korean (Korea) Spontaneous Dialogue Smartphone speech dataset, collected from dialogues on given topics covering 20+ domains. Transcribed with text content, speaker ID, gender, age and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (442 native speakers), which helps model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage; all our datasets are GDPR, CCPA, and PIPL compliant.
Conversational, Korean
This is a paid dataset for commercial use, research purposes and more. Licensed ready-made datasets help jump-start AI projects.
Specifications
Format
16kHz, 16 bit, wav, mono channel;
Content category
Dialogue based on given topics;
Recording condition
Low background noise (indoor);
Recording device
Android smartphone, iPhone;
Speaker
442 native speakers in total, 43% male and 57% female;
Country
Korea(KOR);
Language(Region) Code
ko-KR;
Language
Korean;
Features of annotation
Transcription text, timestamp, speaker ID, gender, PII redacted.
Accuracy Rate
Sentence Accuracy Rate (SAR) 98%
Sample transcriptions
cel_O1_0043.wav: 어 맞아요 전여빈 배우도 너무 좋고 천우희 정말 좋아해요 (Oh, that's right, I really like the actor Jeon Yeo-been too, and I really like Chun Woo-hee.)
cel_O1_0021.wav: 그친구 이름이 되게 흔했는데 (That friend's name was really common, though.)
cel_O2_0053.wav: 예를들면 강하늘 강하늘 배우도 되게 좋아하는데 (For example, Kang Ha-neul, I really like the actor Kang Ha-neul too.)
cel_O2_0054.wav: 그분이 되게 제가 좋아하는 작품이랑 안좋아하는 작품을 되게 거의 번갈아가면서 많이 하셨어요. (That person has done a lot of works, almost alternating between ones I like and ones I don't like.)
cel_O2_0049.wav: 무거운 연기도하고 현실연기도 하고 되게 다 잘하시고 소화를 일단 너무 잘 하시는것 같아요. (They do heavy acting and realistic acting, do it all really well, and really seem to pull off their roles.)
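The listed audio format (16 kHz, 16 bit, wav, mono channel) can be checked on a delivered file before it enters a training pipeline. The following is a minimal sketch using Python's standard-library `wave` module; the function name `check_wav_format` and the file path in the usage note are illustrative assumptions, not part of the vendor's tooling.

```python
import wave

# Expected values taken from the "Format" line of the specifications.
EXPECTED_SAMPLE_RATE = 16_000  # 16 kHz
EXPECTED_SAMPLE_WIDTH = 2      # 16 bit = 2 bytes per sample
EXPECTED_CHANNELS = 1          # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the listed format.

    An empty list means the file matches on sample rate, bit depth and channels.
    (Hypothetical helper for illustration only.)
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_SAMPLE_RATE:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

Calling `check_wav_format("sample.wav")` on a placeholder path returns an empty list when the file matches, so a loader could skip or log any file for which the list is non-empty.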
French (France) Spontaneous Dialogue Telephony speech dataset, collected from dialogues on given topics covering 20+ domains. Transcribed with text content, speaker ID, gender, age and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (964 native speakers), which helps model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage; all our datasets are GDPR, CCPA, and PIPL compliant.
Italian (Italy) Spontaneous Dialogue Telephony speech dataset, collected from dialogues on given topics covering 20+ domains. Transcribed with text content, speaker ID, gender, age and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (676 native speakers), which helps model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage; all our datasets are GDPR, CCPA, and PIPL compliant.
Thai (Thailand) Spontaneous Dialogue Telephony speech dataset, collected from dialogues on given topics covering 20+ domains. Transcribed with text content, speaker ID, gender, age and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (1,986 native speakers), which helps model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage; all our datasets are GDPR, CCPA, and PIPL compliant.
Portuguese (Brazil) Spontaneous Dialogue Smartphone speech dataset, collected from dialogues on given topics covering 20+ domains. Transcribed with text content, speaker ID, gender, age and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (142 native speakers), which helps model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage; all our datasets are GDPR, CCPA, and PIPL compliant.
Conversational speech, Portuguese asr data, russian asr dataset, Brazilian Portuguese
Russian (Russia) Spontaneous Dialogue Smartphone speech dataset, collected from dialogues on given topics covering 20+ domains. Transcribed with text content, speaker ID, gender, age and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (134 native speakers), which helps model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage; all our datasets are GDPR, CCPA, and PIPL compliant.
Conversational speech, Russian asr data, russian asr dataset, russia
Burmese (Myanmar) Spontaneous Dialogue Smartphone speech dataset, collected from dialogues on given topics covering 20+ domains. Transcribed with text content, speaker ID, gender, age and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (134 native speakers), which helps model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage; all our datasets are GDPR, CCPA, and PIPL compliant.
Conversational speech, Burmese asr data, Burmese asr dataset
Hindi (India) Spontaneous Dialogue Telephony speech dataset, collected from dialogues on given topics covering 20+ domains. Transcribed with text content, speaker ID, gender, age and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (1,004 native speakers), which helps model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage; all our datasets are GDPR, CCPA, and PIPL compliant.
Hindi, Conversational Speech, phone
Arabic (UAE) Real-world Casual Conversation and Monologue speech dataset, covering categories such as Interview, Speech and Variety, and mirroring real-world interactions. Transcribed with text content, speaker ID, gender, and other attributes. The dataset was collected from a large and geographically diverse pool of speakers, which helps model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage; all our datasets are GDPR, CCPA, and PIPL compliant.
Arabic, UAE, Colloquial video data, Arabic Conversation speech data