1,002 Hours - Kunming Dialect(China) Scripted Monologue Smartphone speech dataset

Kunming dialect audio data

Kunming Dialect(China) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, local expressions, human-machine interaction, smart home commands, numbers and other domains. Transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse group of speakers (2,284 native Kunming speakers), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Paid Datasets
This is a paid dataset for commercial use, research and other purposes. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Format: 16 kHz, 16-bit, uncompressed WAV, mono channel
Recording condition: Low background noise (indoor), no echo
Content category: Generic domain; human-machine interaction; smart home command and control; numbers; local expressions
Recording device: Android smartphone, iPhone
Speaker: 2,284 people; 40% male and 60% female; 80% aged 16-25; speakers are from Kunming or the surrounding areas
Country: China (CHN)
Language: Kunming dialect
Features of annotation: Transcription text; special identifiers, noise
Accuracy rate: Sentence Accuracy Rate (SAR) 95% (noise symbols and other identifiers are excluded)
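As a rough illustration of the format listed above, the following sketch checks whether a WAV file is 16 kHz, 16-bit, mono and uncompressed. It uses only Python's standard `wave` module; the function name `check_wav_format` and the constant names are hypothetical and are not part of the dataset or of any tooling shipped with it.

```python
import wave

# Expected values taken from the Format entry of the specification.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bit
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the listed format."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
        if wav.getcomptype() != "NONE":
            problems.append(f"compression type is {wav.getcomptype()}")
    return problems
```

An empty list means the file matches the listed format; any other result names the fields that differ.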
Sample
  • Audio

    我想知道虹桥正荣府怎么走

  • Audio

    他冲澡去了,你等下来找他。

  • Audio

    想听英文歌

  • Audio

    哪个放的啵?太臭了。

  • Audio

    十二生肖电影下载[N]
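The last sample carries a bracketed identifier, `[N]`, and the listed accuracy rate excludes noise symbols and other identifiers. The page does not say how SAR is computed, so the sketch below is only one assumed reading: identifiers are taken to be bracketed tags, and a sentence counts as accurate when its transcript matches the reference exactly once those tags are removed. The names `strip_identifiers` and `sentence_accuracy_rate` are hypothetical.

```python
import re

# Assumption: special identifiers are bracketed tags such as [N].
TAG_PATTERN = re.compile(r"\[[^\[\]]*\]")


def strip_identifiers(transcript):
    """Remove bracketed identifiers and any leading or trailing whitespace."""
    return TAG_PATTERN.sub("", transcript).strip()


def sentence_accuracy_rate(references, hypotheses):
    """Share of sentences that match exactly once identifiers are removed."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    if not references:
        raise ValueError("at least one sentence is required")
    matches = sum(
        strip_identifiers(ref) == strip_identifiers(hyp)
        for ref, hyp in zip(references, hypotheses)
    )
    return matches / len(references)
```

For instance, `strip_identifiers("十二生肖电影下载[N]")` leaves only the spoken text, so a transcript that differs from its reference solely by a noise tag still counts as a match.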

Recommended Datasets
849 Hours - Mandarin Chinese(China) Human-Machine Interaction Scripted Monologue Smartphone speech dataset

Mandarin Chinese(China) Human-Machine Interaction Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering commonly used sentences, smart home commands, intelligent assistants, wake words, numbers and other domains. Transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse group of speakers (998 people in total), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Mandarin home interaction mobile phone language audio data far field home collected audio data subset command words wake-up words
1,002 Hours - Russian(Russia) Scripted Monologue Smartphone speech dataset

Russian(Russia) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers and other domains. Transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse group of speakers (1,960 people in total), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Russian video data
762 Hours - Spanish(Latin America) Scripted Monologue Smartphone speech dataset

Spanish(Latin America) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers, news and other domains. Transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse group of speakers (1,630 people in total, including Mexicans, Colombians and others), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Spanish audio data
1,044 Hours - Portuguese(Brazil) Scripted Monologue Smartphone speech dataset

Portuguese(Brazil) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers, news and other domains. Transcribed with text content, speaker ID, timestamp and other attributes. The dataset was collected from a large, geographically diverse group of speakers (2,038 people in total), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Portuguese collection Portuguese data Portuguese identification asr Speech to text text to speech
986 Hours - Portuguese(Europe) Scripted Monologue Smartphone speech dataset

Portuguese(Europe) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers, news and other domains. Transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse group of speakers (2,109 people in total), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Portuguese data mobile phone voice data voice data
769 Hours - French(France) Scripted Monologue Smartphone speech dataset

French(France) Scripted Monologue Smartphone speech dataset, collected from monologues based on given prompts, covering the general and human-machine interaction categories. Transcribed with text content. The dataset was collected from a large, geographically diverse group of speakers (1,623 native speakers), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

French mobile phones collect voice data French collection French voice data and developmental voice recognition data
435 Hours - Spanish(Spain) Scripted Monologue Smartphone speech dataset

Spanish(Spain) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers, news and other domains. Transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse group of speakers (989 people in total), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Spanish data Spanish pronunciation Spanish acquisition Spanish identification data
831 Hours - English(the United Kingdom) Scripted Monologue Smartphone speech dataset

English(the United Kingdom) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers and other domains. Transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse group of speakers (1,651 British people in total), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Mobile Telephony British English Speech Data