50,538 Questions – Test Paper VQA Data
This OCR dataset contains 50,538 exam question images covering multiple subjects, question types, and collection devices (mobile phones and scanners). The text is transcribed, and formulas and tables are transcribed in LaTeX format. The dataset can be used for tasks such as intelligent exam paper marking and homework tutoring. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and use. All our datasets comply with GDPR, CCPA, and PIPL.
This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Data size
50,538 questions
Image resolution
total pixels ≥ 300,000
Subject areas
primary, middle, and high school, university, vocational education, etc.
Question types
multiple-choice (single and multiple selection), fill-in-the-blank, short answer, problem-solving, and questions/answers with illustrations
Collection devices
scanner, mobile phone
Diversity
various subjects and question types
Annotation
quadrilateral bounding boxes and transcription for question stems, options, answers, and illustrations
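As a rough illustration of two of the specifications above, the sketch below checks the "total pixels ≥ 300,000" threshold and converts a quadrilateral bounding box into an axis-aligned box. The annotation schema is not documented on this page, so the point representation (four `(x, y)` tuples in order) and the function names `meets_resolution` and `quad_to_bbox` are assumptions made for illustration, not the vendor's format.

```python
from typing import List, Tuple

Point = Tuple[float, float]

# Threshold taken from the "Image resolution" specification.
MIN_TOTAL_PIXELS = 300_000


def meets_resolution(width: int, height: int,
                     min_pixels: int = MIN_TOTAL_PIXELS) -> bool:
    """Return True when width * height reaches the total-pixel threshold."""
    return width * height >= min_pixels


def quad_to_bbox(quad: List[Point]) -> Tuple[float, float, float, float]:
    """Convert a quadrilateral (assumed: four (x, y) points) into an
    axis-aligned box (x_min, y_min, x_max, y_max)."""
    if len(quad) != 4:
        raise ValueError("a quadrilateral needs exactly four points")
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    # Hypothetical image size and question-stem box, for illustration only.
    print(meets_resolution(640, 480))
    print(quad_to_bbox([(10, 20), (110, 25), (105, 80), (8, 75)]))
```

A quadrilateral can follow skewed or rotated text in phone photos more closely than a rectangle, so the axis-aligned conversion is only useful when a downstream tool requires rectangular crops.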