102,735 Images - Slide VQA Dataset
This dataset contains 102,735 slide images, each with a corresponding description and QA annotations in a Markdown file. The images fall into four types: structure chart, graph, flow chart, and figure. The dataset can be used for document intelligence tasks.
This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Data content
102,735 slide images (RGB, with clearly legible content), with content descriptions and QAs in an annotation document file
Diversity
The slide images fall into four types: structure chart, graph, flow chart, and figure
Label content
Description and QAs of the slide content
File format
The image data is JPG/PNG, and the annotation documents are in Markdown
Language
The main text of each slide image is Chinese or English, and the annotation document is written in the same language as the slide image
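
Given the file formats listed above (JPG/PNG images, Markdown annotations), a first step for many users would be pairing each image with its annotation file. The following is a minimal sketch only: the page does not describe the delivered directory layout, so it assumes that each image and its annotation share a file stem (for example `x.png` and `x.md`). The function name `pair_images_with_annotations` is hypothetical and not part of any official tooling.

```python
from pathlib import Path

# Assumption: images use these extensions, matching the "JPG/PNG" file format.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def pair_images_with_annotations(root):
    """Return (image_path, annotation_path) pairs that share a file stem.

    Assumption (not confirmed by the dataset page): an image and its
    Markdown annotation have the same name apart from the extension.
    """
    root = Path(root)
    # Index every Markdown file under root by its stem.
    annotations = {p.stem: p for p in root.rglob("*.md")}
    pairs = []
    for image in sorted(root.rglob("*")):
        # Keep only image files that have a matching annotation.
        if image.suffix.lower() in IMAGE_SUFFIXES and image.stem in annotations:
            pairs.append((image, annotations[image.stem]))
    return pairs
```

Images without a matching Markdown file are skipped rather than reported, and if two annotation files in different subdirectories share a stem, only one of them is kept. Both behaviors would need adjusting once the actual layout is known.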