1464 Hours Large-Scale Canadian French Speech Dataset for AI Training

canadian french speech dataset
canadian french asr dataset
french dialogue dataset
french speech dataset
french canadian speech dataset

This dataset contains 1,464 hours of Canadian French conversational and monologue speech collected from real-world scenarios, including user-generated content, daily conversations, variety shows, and other general domains. It includes transcriptions, speaker IDs, gender, and additional metadata. The speech was collected from speakers with diverse geographical and background profiles, which is intended to improve model performance on complex real-world tasks. The dataset has undergone quality validation by multiple AI companies. We strictly adhere to data protection regulations and privacy standards to protect user privacy and legal rights throughout data collection, storage, and use. All of our datasets comply with GDPR, CCPA, and PIPL.

Paid Datasets
This is a paid dataset for commercial use, research purposes, and more. Licensed ready-made datasets help jump-start AI projects.
Specifications
Format
16kHz, 16 bit, wav, mono channel
Content category
Interviews, self-media, variety shows, etc.
Recording environment
Low background noise
Country
Canada (CAN)
Language (Region) Code
fr-CA
Language
French
Features of annotation
Transcription text, timestamp, speaker ID, gender, noise
Accuracy
Word Accuracy Rate (WAR) 98% (tags, gender, speaker ID, accent, and topic are excluded from the accuracy statistics because they are subjective)
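
As a minimal sketch of how a buyer might confirm that a delivered file matches the Format row above (16kHz, 16 bit, wav, mono channel), the following uses only Python's standard wave module. The function name check_wav_format and the constant names are hypothetical and are not part of any tooling shipped with the dataset.

```python
import wave

# Expected values taken from the Format row above.
EXPECTED_SAMPLE_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bit
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the listed format.

    Hypothetical helper: an empty list means the file matches the listing.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_SAMPLE_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {8 * wav.getsampwidth()} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

Running the check over every file before training surfaces resampled or stereo files early, since a mismatch in sample rate or channel count usually has to be fixed before feature extraction.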
Sample
  • Audio

    Nous sommes le trois août deux mille onze dans la ville de Victoria, en Colombie-Britannique.

  • Audio

    Il y a deux hommes, deux jeunes hommes sur le plancher, couchés.

  • Audio

    On faisait référence beaucoup à la série Selling Sunset. [N]
