261 Hours Japanese Speech Dataset – Native Speakers & Labeled Audio for AI Training

japanese speech dataset
japanese ASR dataset
speech to text dataset japanese
japanese mobile speech dataset
labeled japanese speech data

This dataset contains 261 hours of Japanese speech recorded on smartphones, consisting of scripted monologues collected from mobile devices. It covers general-domain speech and reflects real-world mobile usage scenarios. All audio samples are fully transcribed and include structured text content and metadata. The recordings come from 1,006 native Japanese speakers across diverse geographic regions, which is intended to improve model performance on real, complex tasks. The data quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and use; all of our datasets comply with GDPR, CCPA, and PIPL.

Paid Datasets
This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Format: 16 kHz, 16-bit, uncompressed WAV, mono channel
Recording condition: Low background noise (indoor), without echo
Content category: Generic domain
Recording device: Android smartphone, iPhone
Speaker: 1,006 people from Japan; 44% male and 56% female
Country: Japan (JPN)
Language (Region) Code: ja-JP
Language: Japanese
Features of annotation: Transcription text
Accuracy Rate: Sentence Accuracy Rate (SAR) 95%
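
The audio format listed in the specifications (16 kHz, 16-bit, mono, uncompressed WAV) can be checked on a delivered file before it is fed into a training pipeline. The following is a minimal sketch using Python's standard `wave` module; the function name `wav_format_mismatches` and the `EXPECTED` dictionary are illustrative assumptions, not part of any tooling shipped with the dataset.

```python
import sys
import wave

# Expected values taken from the specifications above:
# 16 kHz sample rate, 16-bit (2-byte) samples, mono.
EXPECTED = {"sample_rate_hz": 16000, "sample_width_bytes": 2, "channels": 1}


def wav_format_mismatches(path):
    """Return a list of mismatch descriptions; empty if the file matches.

    The standard `wave` module only reads uncompressed PCM WAV files and
    raises `wave.Error` for anything else, so a file that opens here is
    uncompressed.
    """
    with wave.open(path, "rb") as wav:
        actual = {
            "sample_rate_hz": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
        }
    return [
        f"{key}: expected {want}, got {actual[key]}"
        for key, want in EXPECTED.items()
        if actual[key] != want
    ]


if __name__ == "__main__":
    # Usage: python check_wav.py <file.wav> [<file.wav> ...]
    for wav_path in sys.argv[1:]:
        problems = wav_format_mismatches(wav_path)
        print(wav_path, "OK" if not problems else "; ".join(problems))
```

Returning a list of mismatches rather than a single boolean makes it easier to see which property of a file deviates when a batch is checked.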
Sample
  • Audio

    一説によると、彼の母カリュケー

  • Audio

    一般に地理検と呼ばれる

  • Audio

    これまでの成長実績および今後の成長見込

  • Audio

    大神宮前駅跡に建つモニュメント

  • Audio

    リターン オブ ザ インベーダー
