Japanese Q&A Dataset from OKWAVE – 8.4M Questions

Japanese Q&A dataset
OKWAVE forum data
Japanese language corpus
Japanese dialogue dataset
ChatGPT Japanese fine-tuning
user-generated content
question answer dataset

This dataset is collected from the Japanese OKWAVE Q&A platform and includes large-scale parsed and processed text data suitable for LLM training and Japanese natural language understanding. It contains structured fields such as questions, answers, categories, timestamps, user metadata, and supplementary explanations. As of April 2025, the dataset includes 8.4 million questions with 2.3 billion words, 27 million answers totaling 7.6 billion words, 15.5 million thank-you messages (1.7 billion words), and 2.1 million supplementary replies (360 million words). Continuously updated and rich in user-generated content, this dataset is ideal for building Japanese conversational AI, ChatGPT fine-tuning, question answering systems, text summarization, and semantic parsing models. All data complies with relevant data usage and privacy regulations.

Paid Datasets
This is a paid dataset for commercial use, research, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Content: OKWAVE Q&A text data. The platform authorization and copyright are clear.
Data Size: The data is continuously updated. As of the end of April 2025: 8.4 million questions (2.3 billion words); 27 million answers (7.6 billion words); 15.5 million thank-you messages, i.e. the gratitude expressed by the questioner to the responder (1.7 billion words); 2.1 million supplementary explanations (360 million words).
Data Fields: question, answer, category, create_datetime, user, etc.
Storage Format: JSON
Language: Japanese
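As a rough illustration of how records with the listed data fields might be consumed, the sketch below parses JSON records and counts them per category. The field names (question, answer, category, create_datetime, user) come from the specifications; everything else is an assumption: the one-record-per-line layout, the flat structure, the timestamp format, and the helper names iter_records and count_by_category are hypothetical, and the sample records are made-up placeholders rather than real dataset content.

```python
import json
from collections import Counter
from typing import Iterable, Iterator

# Field names taken from the dataset specifications. The flat, one-object-per-line
# layout is an assumption; the real files may be nested or structured differently.
REQUIRED_FIELDS = ("question", "answer", "category", "create_datetime", "user")


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSON line, skipping records that lack a listed field."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if all(field in record for field in REQUIRED_FIELDS):
            yield record


def count_by_category(records: Iterable[dict]) -> Counter:
    """Count records per value of the 'category' field."""
    return Counter(record["category"] for record in records)


# Made-up placeholder records for illustration only (not taken from the dataset).
sample_lines = [
    '{"question": "Q1", "answer": "A1", "category": "travel", '
    '"create_datetime": "2025-04-01T09:00:00", "user": "u1"}',
    '{"question": "Q2", "answer": "A2", "category": "travel", '
    '"create_datetime": "2025-04-02T10:30:00", "user": "u2"}',
    '{"question": "Q3", "category": "cooking"}',  # incomplete, so it is skipped
]

counts = count_by_category(iter_records(sample_lines))
print(counts)
```

Streaming line by line keeps memory use flat, which matters at the scale described above; if the delivered files turn out to be single large JSON arrays instead, an incremental parser would be the analogous choice.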