341 Hours - Malay (Malaysia) Scripted Monologue Smartphone speech dataset
The Malay (Malaysia) Scripted Monologue Smartphone speech dataset covers several domains, including chat, interactions, in-home, in-car, numbers and more, mirroring real-world interactions. The recordings are transcribed with text content and other attributes. The data was collected from a large, geographically diverse pool of speakers, which helps improve model performance on real and complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage. All of our datasets comply with GDPR, CCPA, and PIPL.
This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Format
16 kHz, 16-bit, WAV, mono channel
Recording environment
quiet indoor environment, without echo
Recording content
Chat, interactive, in-car, in-home, numbers
Country
Malaysia (MYS)
Language
Malay
Accuracy
Word Accuracy Rate (WAR) 98% (punctuation and non-speech annotations are subjective, so they are excluded from the accuracy statistics)
Device
Android phone, iPhone
Speaker
300 Malaysians in total, including 134 males and 166 females
Language(Region) Code
ms-MY
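
As a rough illustration of the Format specification above, the following sketch uses Python's standard `wave` module to check that a WAV file is 16 kHz, 16-bit, and mono. The function name `check_wav_format` is hypothetical and not part of any tooling shipped with the dataset; the dataset's file layout is not described here, so the path passed in is an assumption.

```python
import wave

# Expected values taken from the "Format" specification.
EXPECTED_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the stated format.

    An empty list means the file header matches 16 kHz, 16-bit, mono.
    (Hypothetical helper, for illustration only.)
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {8 * wav.getsampwidth()} bits")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

Calling `check_wav_format("sample.wav")` on a purchased or sample file (the file name is a placeholder) should return an empty list if the header matches the specification. The check reads only the header fields, not the audio content.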