English OCR Dataset in Natural Scenes – 71,535 Images
English OCR dataset
scene text dataset
street sign text dataset
outdoor OCR dataset
image text dataset
text detection dataset
text recognition dataset
This dataset contains 71,535 English natural-scene images collected from real environments in the UK and the United States. The data covers multiple scenes, photographic angles, and lighting conditions. Annotations consist of line-level, word-level, and character-level rectangular or quadrilateral bounding boxes, together with text transcriptions. The dataset is suitable for English OCR, scene text detection, and text recognition research in real-world environments.
This is a paid dataset for commercial use, research purposes, and more. Licensed ready-made datasets help jump-start AI projects.
Specifications
Data size
71,535 images, each with an annotation file
Collecting environment
onsite collection in Britain and the United States, including shop plaques, posters, road signs, reminders, warnings, packing instructions, menus, building signs, etc.
Data diversity
multiple scenes, multiple photographic angles, multiple lighting conditions
Device
cellphone, camera, tablet
Photographic angle
looking up, looking down, eye level
Data format
images are in .jpg format; annotation files are in .json format
Annotation content
line-level, word-level and character-level quadrilateral bounding box annotation and transcription
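The specifications above say that each .jpg image has a .json annotation file containing quadrilateral boxes and transcriptions, but the page does not document the JSON schema. The following is a minimal sketch of how such a file might be read, under stated assumptions: the keys `lines`, `words`, `text`, and `points`, the sample content, and the convention that an annotation shares its image's file stem are all hypothetical and should be replaced with the layout found in the delivered data.

```python
import json
from pathlib import Path

# Hypothetical annotation layout; the real schema is not documented on this page.
# Each quadrilateral is assumed to be four [x, y] corner points in pixel coordinates.
SAMPLE_ANNOTATION = """
{
  "lines": [
    {
      "text": "NO PARKING",
      "points": [[10, 20], [210, 24], [208, 70], [8, 66]],
      "words": [
        {"text": "NO", "points": [[10, 20], [60, 21], [59, 67], [8, 66]]},
        {"text": "PARKING", "points": [[75, 21], [210, 24], [208, 70], [74, 67]]}
      ]
    }
  ]
}
"""


def annotation_path(image_path):
    """Assumption: the annotation sits next to the image and shares its stem."""
    return Path(image_path).with_suffix(".json")


def quad_to_rect(points):
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) enclosing a quadrilateral."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def iter_words(annotation):
    """Yield (transcription, quadrilateral) for every word-level entry."""
    for line in annotation.get("lines", []):
        for word in line.get("words", []):
            yield word["text"], word["points"]


annotation = json.loads(SAMPLE_ANNOTATION)
words = list(iter_words(annotation))
for text, quad in words:
    print(text, quad_to_rect(quad))
```

Converting a quadrilateral to its enclosing rectangle is useful when a detector expects axis-aligned boxes; the original four points should be kept when training on rotated or perspective-distorted text.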