849 Hours - Mandarin Chinese(China) Human-Machine Interaction Scripted Monologue Smartphone speech dataset

Mandarin home interaction mobile phone language audio data
far field home collected audio data subset
command words
wake-up words

The Mandarin Chinese (China) Human-Machine Interaction Scripted Monologue Smartphone speech dataset was collected from monologues read from given scripts, covering commonly used sentences, smart home commands, intelligent assistant, wake words, numbers and other domains. It is transcribed with text content and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (998 people in total), which helps improve model performance in real and complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Paid Datasets
This is a paid dataset for commercial use, research purposes and more. Licensed ready-made datasets help jump-start AI projects.
Specifications
Format
48kHz, 16bit, uncompressed wav, mono channel;
Recording condition
Low background noise(indoor);
Content category
Commonly used sentences; smart home commands; intelligent assistant; wake words; numbers;
Recording device
Android Smartphone;
Speaker
998 people; 46% male and 54% female; around 800 utterances per speaker;
Country
China(CHN);
Language(Region) Code
zh-CN;
Language
Mandarin Chinese;
Features of annotation
Transcription text;
Accuracy Rate
Sentence Accuracy Rate (SAR) 98%
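
As a rough illustration of how the format row above could be checked on delivery, the sketch below reads a WAV header with Python's standard `wave` module and compares it with the stated 48kHz, 16bit, mono, uncompressed format. The function names (`matches_spec`, `duration_seconds`, `raw_size_bytes`) are hypothetical and not part of any tooling shipped with the dataset; the file layout of the delivery is not described on this page, so the caller supplies the path.

```python
import wave

# Values taken from the "Format" row of the specifications.
EXPECTED_RATE_HZ = 48_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bit
EXPECTED_CHANNELS = 1            # mono


def matches_spec(path):
    """Return True if the WAV header matches the stated format."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
            and wav.getcomptype() == "NONE"  # "NONE" means uncompressed PCM
        )


def duration_seconds(path):
    """Return the length of one recording in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def raw_size_bytes(hours):
    """Estimate raw PCM size for a given number of hours at the stated format.

    Assumption: WAV headers and transcript files are ignored.
    """
    bytes_per_second = (
        EXPECTED_RATE_HZ * EXPECTED_SAMPLE_WIDTH_BYTES * EXPECTED_CHANNELS
    )
    return round(hours * 3600 * bytes_per_second)
```

Under these assumptions, `raw_size_bytes(849)` gives 293,414,400,000 bytes, so the audio alone would take roughly 293 GB before headers and transcripts are counted.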
Sample
  • Audio

    你好小星

  • Audio

    祖密锁更新周期是什么

  • Audio

    延时关闭

  • Audio

    王博文的专辑有情总被无情伤销量

  • Audio

    两万七千零三十七日元

Recommended Datasets
1,002 Hours - Russian(Russia) Scripted Monologue Smartphone speech dataset

The Russian (Russia) Scripted Monologue Smartphone speech dataset was collected from monologues read from given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers and other domains. It is transcribed with text content and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (1,960 people in total), which helps improve model performance in real and complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Russian video data
762 Hours - Spanish(Latin America) Scripted Monologue Smartphone speech dataset

The Spanish (Latin America) Scripted Monologue Smartphone speech dataset was collected from monologues read from given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers, news and other domains. It is transcribed with text content and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (1,630 people in total, including Mexicans, Colombians, etc.), which helps improve model performance in real and complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Spanish audio data
1,044 Hours - Portuguese(Brazil) Scripted Monologue Smartphone speech dataset

The Portuguese (Brazil) Scripted Monologue Smartphone speech dataset was collected from monologues read from given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers, news and other domains. It is transcribed with text content, speaker ID, timestamp and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (2,038 people in total), which helps improve model performance in real and complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Portuguese collection Portuguese data Portuguese identification asr Speech to text text to speech
986 Hours - Portuguese(Europe) Scripted Monologue Smartphone speech dataset

The Portuguese (Europe) Scripted Monologue Smartphone speech dataset was collected from monologues read from given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers, news and other domains. It is transcribed with text content and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (2,109 people in total), which helps improve model performance in real and complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Portuguese data mobile phone voice data voice data
769 Hours - French(France) Scripted Monologue Smartphone speech dataset

The French (France) Scripted Monologue Smartphone speech dataset was collected from monologues read from given prompts, covering the general and human-machine interaction categories. It is transcribed with text content. The dataset was collected from a large and geographically diverse pool of speakers (1,623 native speakers), which helps improve model performance in real and complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

French mobile phones collect voice data French collection French voice data and developmental voice recognition data
435 Hours - Spanish(Spain) Scripted Monologue Smartphone speech dataset

The Spanish (Spain) Scripted Monologue Smartphone speech dataset was collected from monologues read from given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers, news and other domains. It is transcribed with text content and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (989 people in total), which helps improve model performance in real and complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Spanish data Spanish pronunciation Spanish acquisition Spanish identification data
831 Hours - English(the United Kingdom) Scripted Monologue Smartphone speech dataset

The English (the United Kingdom) Scripted Monologue Smartphone speech dataset was collected from monologues read from given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers and other domains. It is transcribed with text content and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (1,651 British people in total), which helps improve model performance in real and complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Mobile Telephony British English Speech Data
1796.7 Hours - German(Germany) Scripted Monologue Smartphone speech dataset

The German (Germany) Scripted Monologue Smartphone speech dataset was collected from monologues read from given scripts, covering the generic domain, human-machine interaction, smart home and in-car commands, numbers and other domains. It is transcribed with text content and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (3,442 native German speakers in total), which helps improve model performance in real and complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

German audio data captured by mobile phone German audio collection German audio data