794 Hours Mexican Spanish Conversational Speech Dataset for ASR & Voice AI
Format: 16kHz, 16 bit, wav, mono channel
Recording condition: Low background noise
Country: Mexico (MEX), etc.
Language (Region) Code: es-MX, etc.
Language: Spanish (Mexico), etc.
Features of annotation: Transcription text, timestamp, speaker ID, gender, noise
Accuracy Rate: Word Accuracy Rate (WAR) 98%
Tags: Mexico, Spanish, Casual Conversation, ASR
Sample transcriptions:
500002_3.wav: "Pero aquí estamos d-, de vuelta y más emocionados que nunca de sacar este nuevo episodio. Farid, ¿cómo estás, güey?"
500115_3.wav: "pero en general prácticamente todos los antipsicóticos tienen este riesgo, así es que siempre hay que tener esa precaución."
500172_3.wav: "que telescopio me compro, que telescopio me recomiendan, cual es la mejor marca [N]"
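The listing gives the audio format as 16kHz, 16 bit, mono WAV. A buyer may want to confirm that delivered files match it before training. The following is a minimal sketch using only Python's standard `wave` module. The function names and the local file path in the usage note are hypothetical and are not part of the vendor's tooling.

```python
import wave

# Expected values taken from the listing's "Format" field.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bit = 2 bytes per sample
EXPECTED_CHANNELS = 1            # mono


def matches_listed_format(path):
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )


def duration_seconds(path):
    """Return the audio duration in seconds (frame count / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

For example, `matches_listed_format("samples/500002_3.wav")` could be run on a locally downloaded sample (the path here is an assumption). Summing `duration_seconds` over all files gives a rough check against the advertised 794 hours.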
Mexican Spanish speech dataset
Latin American Spanish speech dataset
mexican spanish dataset
spanish asr dataset
latin american speech corpus
This dataset contains 794 hours of Mexican Spanish conversational and monologue speech collected from real-world scenarios. It includes accurate transcriptions, speaker IDs, gender labels, and additional metadata. The recordings come from speakers with diverse geographic and personal backgrounds, which helps models trained on the data perform better on complex real-world tasks. The dataset has undergone quality validation by multiple AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and use. All of our datasets are GDPR, CCPA, and PIPL compliant.
This is a paid dataset for commercial use, research, and other purposes. Licensed, ready-made datasets help jump-start AI projects.