Vietnamese OCR Dataset with Annotations and Transcriptions (4,995 Images)
Vietnamese OCR dataset
Vietnamese text recognition dataset
Vietnamese OCR images
Vietnamese OCR training data
Vietnamese text detection dataset
This dataset contains 4,995 Vietnamese OCR images with annotations and text transcriptions: 258 natural scene images, 2,553 Internet images, and 2,184 document images. Text is annotated at both the line level and the column level, in each case with quadrilateral bounding boxes and text transcriptions. The data can be used for tasks such as Vietnamese text recognition across multiple scenes.
This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Data size
4,995 OCR images in total: 258 natural scene images, 2,553 Internet images, and 2,184 document images
Collecting environment
natural scenes (plaques, packaging instructions, small advertisements, menus, posters, etc.), Internet images (magazine covers, comic covers, etc.), and document images (text documents, etc.)
Data diversity
multiple scenes, multiple angles, and different lighting conditions
Device
cellphone
Shooting angles
upward (looking-up) angle, eye-level angle
Format
images are in .jpg format; annotation files are in .json format
Annotation content
line-level quadrilateral bounding box annotation and transcription of the text; column-level quadrilateral bounding box annotation and transcription of the text
Accuracy
a bounding box annotation is considered qualified when the error of each vertex of the quadrilateral is within 10 pixels; the accuracy of bounding boxes is not less than 97%; the text transcription accuracy is not less than 97%
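
The accuracy criterion above can be illustrated with a minimal Python sketch. It is not the vendor's acceptance procedure: the listing does not describe the .json schema or how vertex error is measured, so the sketch assumes a quadrilateral is given as four (x, y) vertices in a consistent order and that error is the Euclidean distance to a reference vertex. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]

# The 10-pixel bound comes from the accuracy specification.
TOLERANCE_PX = 10.0


def max_vertex_error(annotated: Sequence[Point], reference: Sequence[Point]) -> float:
    """Largest distance between corresponding vertices of two quadrilaterals.

    Assumption: both boxes list their four vertices in the same order, and
    error is measured as Euclidean distance (the listing does not say).
    """
    if len(annotated) != 4 or len(reference) != 4:
        raise ValueError("a quadrilateral needs exactly four vertices")
    return max(math.dist(a, r) for a, r in zip(annotated, reference))


def is_qualified(annotated: Sequence[Point], reference: Sequence[Point],
                 tolerance: float = TOLERANCE_PX) -> bool:
    """True when every vertex lies within the tolerance of its reference."""
    return max_vertex_error(annotated, reference) <= tolerance


def box_accuracy(pairs: Sequence[Tuple[Sequence[Point], Sequence[Point]]],
                 tolerance: float = TOLERANCE_PX) -> float:
    """Fraction of (annotated, reference) box pairs that are qualified."""
    if not pairs:
        raise ValueError("no boxes to evaluate")
    qualified = sum(is_qualified(a, r, tolerance) for a, r in pairs)
    return qualified / len(pairs)


# Made-up example boxes, for illustration only.
reference_box = [(0.0, 0.0), (100.0, 0.0), (100.0, 30.0), (0.0, 30.0)]
close_box = [(3.0, 4.0), (100.0, 0.0), (100.0, 30.0), (0.0, 30.0)]
far_box = [(0.0, 0.0), (112.0, 0.0), (100.0, 30.0), (0.0, 30.0)]
```

Under these assumptions, a delivered batch would meet the stated bounding box requirement when `box_accuracy` over its audited boxes is at least 0.97.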