9,574 Images – Multilingual Handwriting OCR Dataset (8 Languages)

handwriting OCR dataset
handwritten text recognition data
multi-language handwriting OCR data
OCR training data
polygon-annotated handwriting dataset

This dataset contains 9,574 handwriting images across 8 languages, including English, Spanish, Portuguese and more. The data covers multiple collecting scenes, different text carriers and different photographic angles (looking up, eye-level, looking down). Each text line is annotated with a quadrilateral polygon and its transcription. The dataset can be used for training and evaluating OCR models, handwriting recognition systems, and multilingual text extraction tasks in AI and computer vision.
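As an illustration of how line-level quadrilateral annotations with transcriptions might be consumed, here is a minimal sketch. The JSON layout and the field names (`image`, `lines`, `points`, `text`) are hypothetical, since the actual annotation schema is not described on this page; adapt them to the delivered files.

```python
import json

# Hypothetical annotation record: the field names below are assumptions,
# not the dataset's documented schema.
SAMPLE = """
{"image": "example.jpg",
 "lines": [
   {"points": [[10, 20], [210, 24], [208, 60], [8, 56]], "text": "Hello world"},
   {"points": [[12, 70], [180, 72], [179, 104], [11, 101]], "text": "Hola mundo"}
 ]}
"""


def enclosing_box(points):
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) around a quadrilateral."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def load_lines(raw):
    """Parse one annotation record into (enclosing_box, transcription) pairs."""
    record = json.loads(raw)
    result = []
    for line in record["lines"]:
        points = line["points"]
        if len(points) != 4:
            raise ValueError("expected a quadrilateral with 4 vertices")
        result.append((enclosing_box(points), line["text"]))
    return result


lines = load_lines(SAMPLE)
for box, text in lines:
    print(box, text)
```

The enclosing box is only a convenience for cropping a text line before recognition; the four original vertices should be kept when the exact line outline matters.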

Paid Datasets
This is a paid dataset for commercial use, research purposes and more. Licensed ready-made datasets help jump-start AI projects.
Specifications
Data size: 9,574 images, 243,240 bounding boxes
Language distribution: English, Spanish, Portuguese, French, German, Japanese, Italian and Dutch
Collecting environment: blackboards, whiteboards, green boards
Device: cellphone
Photographic angle: eye-level, looking down, looking up
Data format: images are in .jpg and other common image formats; annotation files are in .json
Annotation content: line-level quadrilateral (polygon) bounding boxes and transcriptions of the text
Accuracy rate: a bounding box is qualified when the error of each vertex of the quadrilateral is within 5 pixels; the accuracy of bounding boxes is not less than 95%, and the text transcription accuracy is not less than 95%
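The 5-pixel vertex criterion above can be sketched as a simple check against a reference annotation. This is a minimal illustration, not the vendor's quality-control procedure: it assumes the vertex error is measured as Euclidean distance and that vertices are listed in the same order in both quadrilaterals, and the function names are hypothetical.

```python
import math

TOLERANCE_PX = 5.0  # vertex error bound stated in the specification


def is_qualified(annotated, reference, tol=TOLERANCE_PX):
    """True when every annotated vertex lies within `tol` pixels of its reference vertex.

    Assumes Euclidean distance and identical vertex ordering in both quadrilaterals.
    """
    if len(annotated) != 4 or len(reference) != 4:
        raise ValueError("both quadrilaterals must have exactly 4 vertices")
    return all(math.dist(a, r) <= tol for a, r in zip(annotated, reference))


def box_accuracy(pairs, tol=TOLERANCE_PX):
    """Fraction of (annotated, reference) pairs that are qualified."""
    if not pairs:
        raise ValueError("no boxes to evaluate")
    qualified = sum(is_qualified(a, r, tol) for a, r in pairs)
    return qualified / len(pairs)


reference = [(0, 0), (100, 0), (100, 30), (0, 30)]
good = [(3, 4), (100, 0), (98, 31), (0, 30)]  # largest vertex error is exactly 5 px
bad = [(6, 0), (100, 0), (100, 30), (0, 30)]  # first vertex is 6 px off

accuracy = box_accuracy([(good, reference), (bad, reference)])
```

Under this reading, a batch meets the stated bounding-box requirement when `box_accuracy` returns at least 0.95.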