500,000 Images – Multilingual OCR Dataset in 21 Languages
multilingual OCR dataset
scene text recognition data
document OCR dataset
electronic screen OCR data
OCR dataset 21 languages
AI OCR training data
text recognition dataset
This dataset covers 21 languages, with 20,000 to 25,000 images per language. The data includes natural scenes, document photograph scenes, and electronic scenes. The collection is diverse, covering multiple data types, various shooting angles, and multiple languages. Annotation consists of quadrilateral or polygonal boxes at the row (column) level, with content transcription at the same level. This dataset can be used for multilingual optical character recognition (OCR) and text detection tasks.
This is a paid dataset for commercial use, research purposes, and more. Licensed ready-made datasets help jump-start AI projects.
Specifications
Data size
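500,000 images; each language has between 20,000 and 25,000 images
Language distribution
German, French, Portuguese, Italian, Spanish, Indonesian, Russian, Japanese, Korean, Vietnamese, Polish, Czech, Turkish, Filipino, Dutch, Hindi, Malay, Kazakh, Slovak, Romanian, Uzbek
Collection environment
(1) Document photograph scenes: books, newspapers, various types of cards, receipts, etc. (2) Natural scenes: posters, warning signs, road signs, food packaging, billboards, bus stops, signs, etc. (3) Electronic scenes: screenshots from mobile phones, computer screenshots, electronic documents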
Document photograph scenes
books, newspapers, various types of cards, receipts, etc.
Natural scenes
posters, warning signs, road signs, food packaging, billboards, bus stops, signs, etc.
Electronic scenes
screenshots from mobile phones, computer screenshots, electronic documents
Diversity of collection
multiple data types, various shooting angles, multiple languages
Collection equipment
cellphone, computer
Data format
images are in .jpg and other common formats; annotation files are in .json
Annotation content
quadrilateral or polygonal annotation at the row (column) level, transcription of content at the row (column) level
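As an illustration of what a row-level annotation might look like once loaded, here is a minimal Python sketch. The listing does not publish the JSON schema, so the field names (`image`, `language`, `lines`, `points`, `text`), the sample record, and the helper names `bounding_box` and `load_lines` are assumptions made for this example only; the real annotation files may be organized differently.

```python
import json

# Hypothetical annotation record; the field names below are assumptions,
# not the dataset's documented schema.
SAMPLE = """
{
  "image": "sample_0001.jpg",
  "language": "German",
  "lines": [
    {"points": [[10, 20], [210, 22], [209, 58], [9, 56]], "text": "Ausgang"},
    {"points": [[12, 70], [120, 68], [180, 74], [179, 104], [118, 99], [11, 101]],
     "text": "Bitte nicht rauchen"}
  ]
}
"""


def bounding_box(points):
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) of a polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def load_lines(raw):
    """Parse one record into (text, polygon, bounding box) tuples."""
    record = json.loads(raw)
    result = []
    for line in record["lines"]:
        points = [tuple(p) for p in line["points"]]
        if len(points) < 4:
            raise ValueError("expected a quadrilateral or larger polygon")
        result.append((line["text"], points, bounding_box(points)))
    return result


lines = load_lines(SAMPLE)
for text, points, box in lines:
    print(f"{text!r}: {len(points)} vertices, bbox={box}")
```

The first line in the sample is a quadrilateral and the second a six-vertex polygon, matching the two annotation shapes the listing mentions.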
Accuracy rate
the accuracy of the row-level detection boxes is no less than 97%; a box is considered correctly labeled if it is correctly arranged by row and deviates from the text edges by no more than 5 pixels. The transcription accuracy at the row and character levels is no less than 97%
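The 5-pixel criterion can be sketched as a simple check. The listing does not say how deviation from the edges is measured, so this example assumes it is the largest per-axis offset between each annotated vertex and the matching vertex of a reference outline, with vertices listed in the same order. `within_tolerance` and `box_accuracy` are hypothetical helper names, and the coordinates are made up for illustration.

```python
def within_tolerance(annotated, reference, max_dev=5):
    """True if every annotated vertex lies within max_dev pixels (per axis)
    of the matching reference vertex. Assumes identical vertex ordering."""
    if len(annotated) != len(reference):
        return False
    return all(
        abs(ax - rx) <= max_dev and abs(ay - ry) <= max_dev
        for (ax, ay), (rx, ry) in zip(annotated, reference)
    )


def box_accuracy(pairs, max_dev=5):
    """Fraction of (annotated, reference) pairs that pass the tolerance check."""
    if not pairs:
        raise ValueError("no boxes to evaluate")
    passed = sum(within_tolerance(a, r, max_dev) for a, r in pairs)
    return passed / len(pairs)


reference = [(10, 20), (210, 20), (210, 58), (10, 58)]
good = [(12, 18), (207, 22), (211, 60), (9, 55)]   # every offset <= 5 px
bad = [(10, 20), (210, 20), (210, 66), (10, 58)]   # one vertex is 8 px off

accuracy = box_accuracy([(good, reference), (bad, reference)])
print(accuracy)  # 0.5
```

Under this assumption, a batch would meet the stated threshold when `box_accuracy(...) >= 0.97`.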