202 People - Multi-angle Lip Multimodal Video Data
Multi-angle
lip multimodal
indoor natural light scenes
indoor fluorescent lamp scenes
13 shooting angles
Mandarin Chinese
general field
202 People - Multi-angle Lip Multimodal Video Data. The collection environments include indoor natural light scenes and indoor fluorescent lamp scenes. The recording device is a cellphone. The diversity includes multiple scenes, different ages, and 13 shooting angles. The language is Mandarin Chinese. The recording content is general field, unlimited content. The data can be used for research on multimodal learning algorithms in the speech and image fields.
This is a paid dataset for commercial use, research purposes, and more. Licensed ready-made datasets help jump-start AI projects.
Specifications
Data size
202 people; for each person, audio and video data were collected from 13 different angles, plus 1 txt document
People distribution
race distribution: Asian (Indonesia), gender distribution: 89 males, 113 females, age distribution: 165 people aged 18-30, 32 people aged 31-45, and 5 people aged 46-60
Collecting environment
indoor natural light scenes, indoor fluorescent lamp scenes
Data diversity
including multiple scenes, different ages, different shooting angles
Device
cellphone, the resolution is 1,920*1,080
Collecting angle
audio and video data were collected simultaneously from 13 different angles: front face, 3 left-side-face angles, 3 right-side-face angles, looking down, looking up, left side face down, right side face down, left side face up, and right side face up
Recording content
general field, unlimited content
Language
Mandarin Chinese; each video is longer than 20 seconds
Data format
the video data format is .mp4; the audio sampling rate is at least 16 kHz, 16-bit; the frame rate is 25-30 fps
Accuracy rate
the word accuracy rate is more than 95%
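The specifications above can be turned into a simple conformance check for delivered clips. The following is a minimal sketch, not the vendor's tooling: the `ClipInfo` fields and the `spec_violations` helper are hypothetical names, and the metadata is assumed to have been read beforehand with a probing tool of your choice. Accepting either portrait or landscape orientation for the 1,920*1,080 resolution is also an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClipInfo:
    """Hypothetical per-clip metadata, filled in by whatever probe tool you use."""
    container: str             # file extension without the dot, e.g. "mp4"
    width: int                 # pixels
    height: int                # pixels
    fps: float                 # video frame rate
    audio_sample_rate_hz: int  # audio sampling rate in Hz
    audio_bit_depth: int       # bits per audio sample
    duration_s: float          # clip length in seconds


def spec_violations(clip: ClipInfo) -> List[str]:
    """Return the listed specifications that the clip fails (empty if none)."""
    problems = []
    if clip.container.lower() != "mp4":
        problems.append("video format is not .mp4")
    # Assumption: either orientation of 1,920*1,080 is acceptable.
    if {clip.width, clip.height} != {1920, 1080}:
        problems.append("resolution is not 1,920*1,080")
    if not 25 <= clip.fps <= 30:
        problems.append("frame rate is outside 25-30 fps")
    if clip.audio_sample_rate_hz < 16_000:
        problems.append("audio sampling rate is below 16 kHz")
    if clip.audio_bit_depth != 16:
        problems.append("audio is not 16-bit")
    if clip.duration_s <= 20:
        problems.append("video is not longer than 20 seconds")
    return problems


# Sanity check: the listed gender and age distributions each sum to 202 people.
NUM_PEOPLE = 202
assert 89 + 113 == NUM_PEOPLE
assert 165 + 32 + 5 == NUM_PEOPLE
```

A clip that meets every listed value yields an empty list, so the function can be used as a filter when ingesting the data.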
Sample
Recommended Dataset
180,717 Images - Sign Language Gestures Recognition Data
180,717 Images - Sign Language Gestures Recognition Data. The data diversity includes multiple scenes, 41 static gestures, 95 dynamic gestures, multiple photographic angles, and multiple light conditions. In terms of data annotation, 21 landmarks, gesture types, and gesture attributes were annotated. This dataset can be used for tasks such as gesture recognition and sign language translation.
Sign language gestures, multiple scenes, static gestures, dynamic gestures, multiple photographic angles, multiple light conditions, 21 landmarks, gesture type, gesture attributes
49,945 Images Human Costume & Apparel Accessory Segmentation Data
49,945 Images Human Costume & Apparel Accessory Segmentation Data. The gender distribution includes female and male, the race distribution is Asian, Caucasian and black race, and the age distribution covers teenagers, young people and middle-aged people. The data diversity includes multiple scenes, multiple light conditions, multiple types of costume (upper garment, lower garment, and shoes), and multiple apparel accessories (bag, glasses, accessories, etc.). In terms of annotation, semantic segmentation of 47 object categories (including background, costume and apparel accessory) was adopted. The dataset can be used for tasks such as human costume & apparel accessory segmentation and fashion recommendation.
Human costume & apparel accessory segmentation, multiple scenes, multiple light conditions, multiple types of costume, multiple apparel accessories
28,565 People Multi-race 7 Expressions Recognition Data
28,565 People Multi-race 7 Expressions Recognition Data. The data includes males and females. The age distribution ranges from children to the elderly, with young and middle-aged people in the majority. For each person, 7 images were collected. The data diversity includes different facial postures, different expressions, different light conditions and different scenes. The data can be used for tasks such as face expression recognition.
different expressions, different light conditions, different scenes, face expression recognition
558,870 Videos - 50 Types of Dynamic Gesture Recognition Data
558,870 Videos - 50 Types of Dynamic Gesture Recognition Data. The collecting scenes of this dataset include indoor scenes and outdoor scenes (natural scenery, street view, square, etc.). The data covers males and females. The age distribution ranges from teenagers to seniors. The data diversity includes multiple scenes, 50 types of dynamic gestures, 5 photographic angles, multiple light conditions, and different photographic distances. This data can be used for dynamic gesture recognition in smart homes, audio equipment and on-board systems.
Vehicle dynamic gesture data, home gesture data, gesture recognition data, 21 key point gesture image data, static gesture data, dynamic gesture data, key point dataset, key point annotation, gesture key point dataset
1,056 People Living_Face & Anti-Spoofing Data
1,056 People Living_Face & Anti-Spoofing Data. The collection scenes include indoor and outdoor scenes. The data includes males and females. The age distribution ranges from juveniles to the elderly, with young and middle-aged people in the majority. The data includes multiple postures, multiple expressions, and multiple anti-spoofing samples. The data can be used for tasks such as face payment, remote ID authentication, and face unlocking on mobile phones.
Living_face & Anti-Spoofing data, face, multiple races, multiple postures, multiple expressions, multiple scenes, multiple anti-spoofing samples, multiple age groups
87,871 Images of 106 Facial Landmarks Annotation Data (complicated scenes)
87,871 Images of 106 Facial Landmarks Annotation Data (complicated scenes). This dataset includes yellow race, black race, white race and Indian people. To make it more challenging, the data includes multiple scenes, multiple poses, different ages, different light conditions and complicated expressions. This data can be used for tasks such as face detection and face recognition.
50,356 Images - Human Body Segmentation and 18 Landmarks Data
50,356 Images - Human Body Segmentation and 18 Landmarks Data. The data diversity includes multiple scenes, ages, races, poses, and appendages. In terms of annotation, we adopted segmentation annotations on the human body and appendages. In addition, 18 landmarks were annotated for each human body. The data can be used for tasks such as human body segmentation and human behavior recognition.
Human body segmentation, Landmark, Multiple races, Multiple postures, Multiple scenes
314,178 Images 18_Gestures Recognition Data
314,178 Images 18_Gestures Recognition Data. The data diversity includes multiple scenes, 18 gestures, 5 shooting angles, multiple ages and multiple light conditions. For annotation, 21 gesture landmarks (each landmark includes a visible or invisible attribute), gesture type and gesture attributes were annotated. This data can be used for tasks such as gesture recognition and human-machine interaction.