INTERSPEECH 2025 MLC-SLM Challenge Dataset

Challenge
interspeech
mlc-slm
Conversational

The INTERSPEECH 2025 MLC-SLM Challenge Dataset, curated by Datatang, is derived from fifteen proprietary conversational speech corpora. Annotated with high accuracy, it targets two problems: multilingual automatic speech recognition (ASR) and long-context comprehension. The recordings capture real-world conversational phenomena, including spontaneous interruptions and speaker overlaps, across 11 languages (1,500 hours in total), making them suitable training material for ASR systems intended for real-world use. All data collection and processing comply with international privacy regulations, including GDPR, CCPA, and PIPL, with protocols that ensure participant anonymity and ethical data use throughout the lifecycle.

Paid Datasets
This is a paid dataset for commercial use, research, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Format
16 kHz, 16-bit, uncompressed WAV, mono channel
Recording Environment
Quiet indoor environment, without echo
Recording Content
Dozens of topics are specified, and the speakers hold a dialogue on those topics while being recorded
Annotation
Transcription text, speaker identification, and gender
Device
Android mobile phone, iPhone
Language
American English / British English / Filipino English / Australian English / Indian English / French / German / Italian / Japanese / Korean / Portuguese (Europe) / Russian / Spanish (Spain) / Thai / Vietnamese
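The format listed above (16 kHz, 16-bit, mono, uncompressed WAV) can be checked on any received file before it enters a training pipeline. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_wav_format` and the constant names are illustrative assumptions, not part of any Datatang tooling.

```python
import wave

# Values taken from the "Format" specification; names are hypothetical.
EXPECTED_SAMPLE_RATE = 16_000  # 16 kHz
EXPECTED_SAMPLE_WIDTH = 2      # 16-bit samples = 2 bytes each
EXPECTED_CHANNELS = 1          # mono


def check_wav_format(path):
    """Return a list of human-readable mismatches; empty if the file conforms."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_SAMPLE_RATE:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bits")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
        # The wave module reports "NONE" for uncompressed PCM data.
        if wav.getcomptype() != "NONE":
            problems.append(f"compression type is {wav.getcomptype()}")
    return problems
```

Calling `check_wav_format("sample.wav")` (with a hypothetical file name) returns an empty list when the file matches the listed format, and a list of mismatch descriptions otherwise.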
Sample
  • Audio

    one direction is the first thing like in the mind

  • Audio

    Parce que j'ai plus l'ancien, j'en ai que celui-là dorénavant.

  • Audio

    D'accord très bien l'autre, je vais l'effacer alors.

  • Audio

    조금 이제 날씨도 더워지는데 덜 답답하구

  • Audio

    이천치십 년이랑 이천이십일 년 진짜 학교 못 간게
