478 Hours - Spanish(Spain) Spontaneous Dialogue Smartphone speech dataset

spanish
phone
Conversational Speech

Spanish (Spain) spontaneous dialogue smartphone speech dataset, collected from dialogues on given topics and covering 20+ domains. Transcripts include text content, speaker ID, gender, age, and other attributes. The data was collected from a large, geographically diverse pool of speakers (596 native speakers), which helps improve model performance on real and complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Paid Datasets
This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Format: 16 kHz, 16-bit, WAV, mono channel
Content category: Dialogue based on given topics
Recording condition: Low background noise (indoor)
Recording device: Android smartphone, iPhone
Speaker: 596 native speakers in total, 48% male and 52% female
Country: Spain (ESP)
Language (region) code: es-ES
Language: Spanish
Features of annotation: Transcription text, timestamp, speaker ID, gender, PII redacted
Accuracy rate: Word Accuracy Rate (WAR) 98%
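
The audio format listed in the specifications (16 kHz, 16-bit, mono WAV) can be checked on a delivered file before it goes into a training pipeline. The following is a minimal sketch using Python's standard `wave` module. The helper names `read_wav_format` and `matches_spec` are hypothetical and are not part of any tooling shipped with the dataset, and the sketch assumes the files are plain PCM WAV.

```python
import wave

# Expected values taken from the Format entry in the specifications.
# 16-bit samples are 2 bytes wide; "mono" means a single channel.
EXPECTED = {"sample_rate_hz": 16000, "sample_width_bytes": 2, "channels": 1}


def read_wav_format(path):
    """Return the basic PCM parameters of a WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        return {
            "sample_rate_hz": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
        }


def matches_spec(path, expected=EXPECTED):
    """Return True if the file has exactly the expected format (hypothetical helper)."""
    return read_wav_format(path) == expected
```

Calling `matches_spec("clip.wav")` on each file (the file name here is only a placeholder) gives a quick way to flag recordings that would need resampling or channel conversion.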
Sample
  • o mm así en concreto. ¿a ti qué te gustaría ver?, ¿o qué géneros te gustan?
  • me eh en especial la comedia. mm no sé si hay alguna peli que me apetezca ver.
  • pues no sé, la verdad, mis géneros favoritos son los las películas de suspense.
  • por ejemplo, los thrillers y todo eso. todo lo que tenga un misterio
  • y sea interesante.
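
The specifications quote a Word Accuracy Rate (WAR) of 98% for the transcripts. The page does not define the metric, so the sketch below assumes the common definition, WAR = 1 - WER, where WER is the word-level edit distance between a reference and a hypothesis transcript divided by the number of reference words. The function names are hypothetical, and tokenization is a plain whitespace split, which may differ from how the vendor scores transcripts.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy_rate(reference, hypothesis):
    """Return 1 - WER, assuming whitespace tokenization (an assumption)."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 1.0 - edit_distance(ref_words, hyp_words) / len(ref_words)
```

Under this definition, a hypothesis with one wrong word out of four reference words scores 0.75, and a hypothesis with many inserted words can score below zero.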

Recommended Datasets
147 Hours - Filipino Conversational Speech Data by Telephone

Filipino (the Philippines) spontaneous dialogue telephony speech dataset, collected from dialogues on given topics. Transcripts include text content, timestamp, speaker ID, gender, and other attributes. The data was collected from a large, geographically diverse pool of speakers (264 native speakers), which helps improve model performance on real and complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Filipino Conversational Telephone
513 Hours – Japanese Conversational Speech Data by Telephone

The 513 Hours - Japanese Conversational Speech dataset consists of natural conversations collected by telephone from more than 800 native speakers, with a balanced gender ratio. Speakers chose a few familiar topics from a given list before starting their conversations, to keep the dialogues fluent and natural. The recording device is a telephony recording system. The audio format is 8 kHz, 8-bit, uncompressed WAV, and all speech data was recorded in quiet indoor environments. All audio was manually transcribed with text content, the start and end time of each effective sentence, and speaker identification. The sentence accuracy rate is ≥ 95%.

Japanese natural conversation speech data Japanese natural conversation speech Japanese natural conversation data Japanese conversation speech data
977 Hours - Vietnamese Spontaneous Dialogue Telephony speech dataset

Vietnamese spontaneous dialogue telephony speech dataset, collected from dialogues on given topics. Transcripts include text content, timestamp, speaker ID, gender, and other attributes. The data was collected from a large, geographically diverse pool of speakers (more than 1200 native speakers), which helps improve model performance on real and complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Conversational speech Vietnamese asr data Vietnamese
204 Hours - English(Philippine) Spontaneous Dialogue Smartphone speech dataset

English (Philippine) spontaneous dialogue smartphone speech dataset, collected from dialogues on given topics. Transcripts include text content, timestamp, speaker ID, gender, and other attributes. The data was collected from a large, geographically diverse pool of speakers (around 400 native speakers), which helps improve model performance on real and complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

English Philippine Spontaneous Dialogue
34 Hours - Hindi(India) Children Real-world Casual Conversation and Monologue speech dataset

Hindi (India) children's real-world casual conversation and monologue speech dataset, covering self-media, conversation, live, lecture, variety show, and other generic domains, and mirroring real-world interactions. Transcripts include text content, speaker ID, gender, age, accent, and other attributes. The data was collected from a large, geographically diverse pool of speakers (children aged 12 and younger), which helps improve model performance on real and complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Hindi Casual Conversation Monologue Asr Children
300 People - Mandarin Chinese and English Bilingual Spontaneous Monologue Smartphone speech dataset

Mandarin Chinese and English bilingual spontaneous monologue smartphone speech dataset, collected from dialogues on given topics and covering the generic domain. The data was collected from a large, geographically diverse pool of speakers (300 people in total, ages 18 to 65), which helps improve model performance on real and complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Unscripted monologue Natural Speech Mandarin English Bilingual
88 Hours - Spanish(Mexico) Spontaneous Dialogue Telephony speech dataset

Spanish (Mexico) spontaneous dialogue telephony speech dataset, collected from dialogues on given topics. Transcripts include text content, timestamp, speaker ID, gender, and other attributes. The data was collected from a large, geographically diverse pool of speakers (122 native speakers), which helps improve model performance on real and complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

audio data dataset conversational asr data spanish mexican telephone
500 Hours - Portuguese(Brazil) Real-world Casual Conversation and Monologue speech dataset

Portuguese (Brazil) real-world casual conversation and monologue speech dataset, covering self-media, conversation, live, and other generic domains, and mirroring real-world interactions. Transcripts include text content, speaker ID, gender, and other attributes. The data was collected from a large, geographically diverse pool of speakers, which helps improve model performance on real and complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All our datasets comply with GDPR, CCPA, and PIPL.

Brazilian Portuguese Spontaneous Speech text annotation