[{"@type":"PropertyValue","name":"Data size","value":"1,400 images, each image has a json file and a metadata file"},{"@type":"PropertyValue","name":"Collection environment","value":"cafes, convenience stores"},{"@type":"PropertyValue","name":"Race distribution","value":"Asians"},{"@type":"PropertyValue","name":"Collection diversity","value":"multiple scenes, multiple human actions"},{"@type":"PropertyValue","name":"Data formats","value":"the image format is .jpg"},{"@type":"PropertyValue","name":"Language","value":"English"},{"@type":"PropertyValue","name":"JSON annotation content","value":"person ID, gender, age, behavior, behavior description, whether blocked, person rectangle"},{"@type":"PropertyValue","name":"Metadata annotation content","value":"shooting date, location, camera height, position matching degree"},{"@type":"PropertyValue","name":"Image resolution","value":"resolution ≥ 1080p"},{"@type":"PropertyValue","name":"Annotation","value":"bounding boxes closely fitting the person edges are correct. Both bounding box accuracy and label accuracy should be no less than 97%"}]
{"id":1648,"datatype":"1","titleimg":"https://www.nexdata.ai/shujutang/static/image/index/datatang_tuxiang_default.webp","type1":"226","type1str":null,"type2":"254","type2str":null,"dataname":"1,400 Human Action Image Dataset with 5,937 Boxes Annotate Data","datazy":[{"title":"Data size","content":"1,400 images, each image has a json file and a metadata file"},{"title":"Collection environment","content":"cafes, convenience stores"},{"title":"Race distribution","content":"Asians"},{"title":"Collection diversity","content":"multiple scenes, multiple human actions"},{"title":"Data formats","content":"the image format is .jpg"},{"title":"Language","content":"English"},{"title":"JSON annotation content","content":"person ID, gender, age, behavior, behavior description, whether blocked, person rectangle"},{"title":"Metadata annotation content","content":"shooting date, location, camera height, position matching degree"},{"title":"Image resolution","content":"resolution ≥ 1080p"},{"title":"Annotation","content":"bounding boxes closely fitting the person edges are correct. Both bounding box accuracy and label accuracy should be no less than 97%"}],"datatag":"AIGC,VLM,MLLV,image-text","technologydoc":null,"downurl":null,"datainfo":null,"standard":null,"dataylurl":null,"flag":null,"publishtime":null,"createby":null,"createtime":null,"ext1":null,"samplestoreloc":null,"hosturl":null,"datasize":null,"industryPlan":null,"keyInformation":null,"samplePresentation":[],"officialSummary":"1,400 Human Action Image Dataset with 5,937 Boxes Annotate Data collected a variety of scenes and human activities. Each person in the image is annotated with detailed descriptions.This data can provide a rich resource for large multi-modal models. It has been validated by multiple AI companies and proves beneficial for achieving outstanding performance in real-world applications. 
Throughout the process of Dataset collection, storage, and usage, we have consistently adhered to dataset protection and privacy regulations to ensure the preservation of user privacy and legal rights. All Dataset comply with regulations such as GDPR, CCPA, PIPL, and other applicable laws.","dataexampl":null,"datakeyword":["human action dataset","human activity image dataset","action recognition images","annotated human activity dataset","human image captioning dataset","multi-modal human dataset","human action detection data","VLA dataset"],"isDelete":null,"ids":null,"idsList":null,"datasetCode":null,"productStatus":null,"tagTypeEn":"","tagTypeZh":null,"website":null,"samplePresentationList":null,"datazyList":null,"keyInformationList":null,"dataexamplList":null,"bgimg":null,"datazyScriptList":null,"datakeywordListString":null,"sourceShowPage":"llm","BGimg":"","voiceBg":["/shujutang/static/image/comm/audio_bg.webp","/shujutang/static/image/comm/audio_bg2.webp","/shujutang/static/image/comm/audio_bg3.webp","/shujutang/static/image/comm/audio_bg4.webp","/shujutang/static/image/comm/audio_bg5.webp"]}
1,400 Human Action Image Dataset with 5,937 Annotated Boxes
human action dataset
human activity image dataset
action recognition images
annotated human activity dataset
human image captioning dataset
multi-modal human dataset
human action detection data
VLA dataset
The 1,400 Human Action Image Dataset with 5,937 Annotated Boxes covers a variety of scenes and human activities. Each person in every image is annotated with a detailed description. The data provides a rich resource for large multi-modal models; it has been validated by multiple AI companies and has proved beneficial for achieving strong performance in real-world applications. Throughout dataset collection, storage, and usage, we have consistently adhered to data protection and privacy regulations to safeguard user privacy and legal rights. All data complies with GDPR, CCPA, PIPL, and other applicable laws.
This is a paid dataset for commercial use, research, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Data size
1,400 images; each image has a JSON file and a metadata file
Collection environment
cafes, convenience stores
Race distribution
Asians
Collection diversity
multiple scenes, multiple human actions
Data formats
images in .jpg format
Language
English
JSON annotation content
person ID, gender, age, behavior, behavior description, whether occluded, person bounding box
Metadata annotation content
shooting date, location, camera height, position matching degree
Image resolution
resolution ≥ 1080p
Annotation
bounding boxes tightly fit the person edges; both bounding box accuracy and label accuracy are no less than 97%
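The per-person JSON annotations described above could be consumed roughly as sketched below. The exact schema is not published on this page, so every key name used here (person_id, gender, behavior, occluded, bbox, and the [x1, y1, x2, y2] box convention) is an illustrative assumption, not the dataset's actual format.

```python
import json

# Hypothetical annotation record mirroring the listed JSON annotation
# content; real key names and box conventions may differ.
sample = json.loads("""
{
  "persons": [
    {
      "person_id": "p001",
      "gender": "female",
      "age": 25,
      "behavior": "drinking coffee",
      "behavior_description": "seated at a table, lifting a cup",
      "occluded": false,
      "bbox": [320, 180, 510, 860]
    }
  ]
}
""")

def iter_boxes(record, min_side=16):
    """Yield (person_id, (x1, y1, x2, y2)) for plausible person boxes."""
    for person in record["persons"]:
        x1, y1, x2, y2 = person["bbox"]
        # Skip degenerate boxes; the spec says boxes tightly fit person edges.
        if x2 - x1 >= min_side and y2 - y1 >= min_side:
            yield person["person_id"], (x1, y1, x2, y2)

boxes = dict(iter_boxes(sample))
```

A loader like this would typically pair each image's JSON file with its metadata file (shooting date, location, camera height) when building training samples.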