32M Science QA Dataset – Answers & Parsing for LLMs

This dataset contains about 32 million structured science questions covering mathematics, physics, chemistry, and biology across primary, middle, high school, and university levels. Each question entry includes a title, an answer, a solution explanation (the "parse" field), a question type, a subject category, and the corresponding grade level. The dataset is designed to support AI training tasks such as large language model development, subject-specific knowledge enhancement, machine reading comprehension, and question-answering systems. It provides a rich resource for educational NLP applications and has been validated for quality and completeness. All data complies with global data protection standards including GDPR, CCPA, and PIPL.

Paid Datasets
This is a paid dataset for commercial use, research, and other purposes. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Content: Text of science subject questions
Data size: About 32 million questions
Data fields: Title, answer, parse, subject, grade, question type
Subject categories: Science subjects at primary school, middle school, high school, and university level
Format: JSONL
Language: Chinese
Data processing: Subjects, questions, parses, and answers were analyzed; formulas and table formats were converted; and the content was cleaned
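
To make the format concrete, here is a minimal Python sketch of reading JSONL records and checking that each one carries the six listed fields. The key names (for example `question_type`) are assumptions inferred from the field list; the actual keys in the delivered files may differ. The sample lines are made up for illustration and are not taken from the dataset.

```python
import json
from typing import Iterable, Iterator

# Hypothetical key names, inferred from the field list above; the real keys may differ.
EXPECTED_FIELDS = ("title", "answer", "parse", "subject", "grade", "question_type")


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def missing_fields(record: dict) -> list:
    """Return the expected fields that are absent or empty in a record."""
    return [name for name in EXPECTED_FIELDS if not record.get(name)]


# Made-up sample lines for illustration only; not taken from the dataset.
sample_lines = [
    json.dumps({
        "title": "What is 2 + 3?",
        "answer": "5",
        "parse": "Adding 2 and 3 gives 5.",
        "subject": "mathematics",
        "grade": "primary school",
        "question_type": "short answer",
    }),
    "",  # blank lines are skipped
    json.dumps({"title": "Incomplete example", "answer": "n/a"}),
]

records = list(iter_records(sample_lines))
for record in records:
    print(record["title"], "-> missing:", missing_fields(record))
```

Because `iter_records` accepts any iterable of strings, an open file handle (opened with `encoding="utf-8"`, since the content is Chinese) can be passed in place of `sample_lines`, so a large file is processed line by line rather than loaded whole.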