1,044 Hours - Portuguese(Brazil) Scripted Monologue Smartphone speech dataset

The Portuguese(Brazil) Scripted Monologue Smartphone speech dataset consists of monologues read from given scripts, covering the generic domain, human-machine interaction, smart home commands, in-car commands, numbers, news and other domains. Recordings are transcribed with text content, speaker ID, timestamp and other attributes. The dataset was collected from a large, geographically diverse pool of speakers (2,038 people in total), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage and usage. All our datasets comply with GDPR, CCPA and PIPL.

Paid Datasets
This is a paid dataset for commercial use, research purposes and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Format
16kHz, 16bit, uncompressed wav, mono channel;
Recording condition
Low background noise(indoor), without echo;
Content category
Generic domain; news; human-machine interaction; smart home command and control; in-car command and control; numbers
Recording device
Android Smartphone, iPhone;
Speaker
2,038 speakers in total, 47% male and 53% female; 47% of speakers are aged 16-25, 48% are aged 26-45, and 5% are aged 46-64;
Country
Brazil(BRA);
Language(Region) Code
pt-BR;
Language
Portuguese;
Features of annotation
Transcription text, speaker ID, timestamp;
Accuracy Rate
Sentence Accuracy Rate (SAR) 95%
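
The audio format listed above (16kHz, 16bit, mono, uncompressed wav) can be checked on delivered files before training. The following is a minimal sketch using Python's standard `wave` module. The function names `check_wav_format` and `make_silent_wav` are hypothetical helpers introduced for illustration and are not part of any tooling shipped with the dataset.

```python
import io
import wave

# Values taken from the "Format" row of the specifications:
# 16 kHz sample rate, 16-bit samples (2 bytes), mono.
EXPECTED = {"sample_rate": 16000, "sample_width_bytes": 2, "channels": 1}


def check_wav_format(source):
    """Return a list of mismatches between a WAV file and the listed format.

    `source` may be a file path or a binary file-like object.
    An empty list means the header matches the specification.
    """
    with wave.open(source, "rb") as wav:
        actual = {
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
        }
    return [
        f"{key}: expected {EXPECTED[key]}, got {actual[key]}"
        for key in EXPECTED
        if actual[key] != EXPECTED[key]
    ]


def make_silent_wav(sample_rate=16000, sample_width=2, channels=1, frames=160):
    """Build a short silent WAV in memory (hypothetical test helper)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * frames * sample_width * channels)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(check_wav_format(make_silent_wav()))                  # matching file
    print(check_wav_format(make_silent_wav(sample_rate=8000)))  # wrong rate
```

In practice the same function could be called on each file path in a delivery. The `wave` module only reads uncompressed PCM, so a compressed file would raise `wave.Error` instead of returning a mismatch list.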
Sample
  • Audio

    Porque Douradoquara é tão famoso para os viajantes

  • Audio

    Ao chegar sentou-se na cama abaixo de pôsteres de Dirk Nowitzki e Porzingis

  • Audio

    quatrocentos e quarenta e um mil ducentos e trinta e dois reais

  • Audio

    Na comunicação ela cita artigos das leis russas que apontam para punição quanto à humilhação ou insulto.

  • Audio

    Joy nós estamos casados há vinte anos.

Recommended Datasets
1,260 Hours - Italian(Italy) Scripted Monologue Smartphone speech dataset

The Italian(Italy) Scripted Monologue Smartphone speech dataset consists of monologues read from given prompts, covering spoken language, human-machine interaction, smart home commands, in-car commands, numbers and news. Recordings are transcribed with text content. The dataset was collected from a large, geographically diverse pool of speakers (3,109 native speakers), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage and usage. All our datasets comply with GDPR, CCPA and PIPL.

474 Hours - Japanese(Japan) Scripted Monologue Smartphone speech dataset

The Japanese(Japan) Scripted Monologue Smartphone speech dataset consists of monologues read from given scripts, covering the generic domain, human-machine interaction, smart home commands, in-car commands, numbers and other domains. Recordings are transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse pool of speakers (1,245 speakers in total), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage and usage. All our datasets comply with GDPR, CCPA and PIPL.

759 Hours - Hindi(India) Scripted Monologue Smartphone speech dataset

The Hindi(India) Scripted Monologue Smartphone speech dataset consists of monologues read from given scripts, covering the generic domain, human-machine interaction, smart home commands, in-car commands, numbers, news and other domains. Recordings are transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse pool of speakers (1,425 native speakers from India), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage and usage. All our datasets comply with GDPR, CCPA and PIPL.

997 Hours - Changsha Dialect(China) Scripted Monologue Smartphone speech dataset

The Changsha Dialect(China) Scripted Monologue Smartphone speech dataset consists of monologues read from given scripts, covering the generic domain, local expressions, human-machine interaction, smart home commands, in-car commands, numbers and other domains. Recordings are transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse pool of speakers (2,301 native Changsha speakers), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage and usage. All our datasets comply with GDPR, CCPA and PIPL.

1,002 Hours - Kunming Dialect(China) Scripted Monologue Smartphone speech dataset

The Kunming Dialect(China) Scripted Monologue Smartphone speech dataset consists of monologues read from given scripts, covering the generic domain, local expressions, human-machine interaction, smart home commands, numbers and other domains. Recordings are transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse pool of speakers (2,284 native Kunming speakers), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage and usage. All our datasets comply with GDPR, CCPA and PIPL.

997 Hours - Wuhan Dialect(China) Scripted Monologue Smartphone speech dataset

The Wuhan Dialect(China) Scripted Monologue Smartphone speech dataset consists of monologues read from given scripts, covering the generic domain, local expressions, human-machine interaction, smart home commands, numbers and other domains. Recordings are transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse pool of speakers (2,291 native Wuhan speakers), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage and usage. All our datasets comply with GDPR, CCPA and PIPL.

1,012 Hours - English(India) Scripted Monologue Smartphone speech dataset

The English(India) Scripted Monologue Smartphone speech dataset consists of monologues read from given scripts, covering the generic domain, human-machine interaction, smart home commands, in-car commands, numbers and other domains. Recordings are transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse pool of speakers (2,100 native speakers from India), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage and usage. All our datasets comply with GDPR, CCPA and PIPL.

261 Hours - Japanese(Japan) Scripted Monologue Smartphone speech dataset

The Japanese(Japan) Scripted Monologue Smartphone speech dataset consists of monologues read from given scripts, covering the generic domain. Recordings are transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse pool of speakers (1,006 native Japanese speakers), which helps improve model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage and usage. All our datasets comply with GDPR, CCPA and PIPL.
