120K Multimodal QA Dataset – Visual & Text Reasoning
multimodal dataset
VQA dataset
multimodal QA data
reasoning dataset for AI
image-text QA dataset
domain-specific AI training data
chart reasoning dataset
LLM multimodal training data
This dataset includes 120,000 multimodal question-answer pairs across six major academic disciplines, including medicine, engineering, art, science, and more. Each QA pair combines textual and visual content, such as charts, diagrams, blueprints, and artworks, crafted to test logical reasoning, cross-modal understanding, and domain-specific knowledge. All questions have been reviewed by subject-matter experts to ensure academic quality and accuracy. Ideal for training multimodal large language models (MLLMs), visual question answering (VQA) systems, and AI applications requiring deep contextual reasoning, this dataset supports fine-tuning tasks such as knowledge grounding, visual-text alignment, and decision-making. All data complies with GDPR, CCPA, and PIPL regulations, ensuring ethical use and privacy protection.
This is a paid dataset licensed for commercial use, research, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Data size: 120,000 questions
Image resolution: short side ≥ 500 pixels
Subject categories: arts, business, science, medicine, humanities and social sciences, engineering
QA length: question length ≥ 10 Chinese characters; answer and analysis length ≥ 40 characters
Collection equipment: mobile phone, scanner
Language: Chinese
Diversity: multiple disciplines, multiple image types, multiple question types
Data format: .jpg, .png, .json
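Since the annotations ship as JSON alongside .jpg/.png images, a loader can validate records against the QA-length constraints above. The field names (`question`, `image`, `subject`, `answer`) and the sample record below are assumptions for illustration only; the dataset's actual schema may differ.

```python
import json

# Hypothetical record layout -- the real schema is not published on this page.
sample = """
{
  "question": "图中细胞的结构名称是什么？请说明其主要功能。",
  "image": "sample_001.jpg",
  "subject": "science",
  "answer": "该结构为线粒体，是细胞进行有氧呼吸的主要场所，为细胞的各项生命活动提供能量。线粒体具有双层膜结构，内膜向内折叠形成嵴，扩大了反应面积。"
}
"""

def load_qa(raw: str) -> dict:
    """Parse one QA record and check the specification's length constraints."""
    rec = json.loads(raw)
    # Spec: question >= 10 Chinese characters; answer/analysis >= 40 characters.
    if len(rec["question"]) < 10:
        raise ValueError("question shorter than the 10-character minimum")
    if len(rec["answer"]) < 40:
        raise ValueError("answer shorter than the 40-character minimum")
    return rec

rec = load_qa(sample)
print(rec["subject"])  # prints "science"
```

In practice one would iterate over the annotation files, resolve each record's `image` path against the image directory, and drop or flag records that fail these checks before fine-tuning.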