High-Quality Training Datasets

Boost the performance of your AI models with our high-quality, ready-to-use training datasets.

3D High-Fidelity Synthetic Data - DMS

3D High-Fidelity Synthetic Data - DMS includes sensor output data, such as camera images, videos, and point clouds, synthesized through 3D scene modeling with high similarity to the real world. The annotations include camera parameters, object classification/detection/segmentation, temporal/illumination/weather metadata, and human poses (head/eye/arm/leg positions and orientations). This dataset is applicable to environmental modeling and data synthesis in autonomous driving and robotics.
synthetic data ADS DMS

Japanese OKWAVE Q&A platform Text Parsing and Processing Data

Japanese OKWAVE Q&A Platform Text Parsing and Processing Data. The data contains question, answer, category, create_datetime, user, and other fields, and is continuously updated. As of the end of April 25, it included 8.4 million questions (2.3 billion words); 27 million answers (7.6 billion words); 15.5 million thanks messages, in which the questioner expresses gratitude to the responder (1.7 billion words); and 2.1 million supplementary explanations (360 million words). This dataset can be used for tasks such as LLM training, for example ChatGPT-style models.
Q&A Text Japanese

2,499,771 Boxes 7,262 Images Human Facial Skin Defects Data

2,499,771 Boxes 7,262 Images Human Facial Skin Defects Dataset. The data includes the following seven types of facial skin defects: acne, moles, scars, herpes (sores), speckles, freckles, and others. This data can be used for tasks such as skin defect detection.
Skin defects detection

10,000 Sets-Digital Chart Q&A Data

10,000 Sets-Digital Chart Q&A Data, covering categories such as line charts, bar charts, pie charts, scatter plots, composite types, and tables. Each image has two rounds of Q&A, one for numerical reading and the other for numerical calculation.
Digital Chart QA

30 Million High-quality Video Data

This dataset comprises 30 million high-quality videos. The resources are diverse in type, featuring high resolution and clarity, excellent color accuracy, and rich detail. All materials have been legally obtained through authorized channels, with clear indications of copyright ownership and usage authorization scope. The entire collection provides commercial-grade usage rights and has been granted permission for scientific research use, ensuring clear and traceable intellectual property attribution. The vast, high-quality resources offer robust support for a wide range of applications, including research in the field of computer vision, training of image recognition algorithms, and sourcing materials for creative design, thereby facilitating efficient progress in related areas.
video 4K

80 Million Vector Image Data

This dataset comprises 80 million vector images. The resources are diverse in type, with excellent color accuracy and rich detail. All materials have been legally obtained through authorized channels, with clear indications of copyright ownership and usage authorization scope. The entire collection provides commercial-grade usage rights and has been granted permission for scientific research use, ensuring clear and traceable intellectual property attribution. The vast, high-quality image resources offer robust support for a wide range of applications, including research in the field of computer vision, training of image recognition algorithms, and sourcing materials for creative design, thereby facilitating efficient progress in related areas.
image vector

200 Million High-quality Image Data

This image database contains 200 million high-quality images that have undergone professional review. The resources are diverse in type, featuring high resolution and clarity, excellent color accuracy, and rich detail. All materials have been legally obtained through authorized channels, with clear indications of copyright ownership and usage authorization scope. The entire collection provides commercial-grade usage rights and has been granted permission for scientific research use, ensuring clear and traceable intellectual property attribution. The vast and high-quality image resources offer robust support for a wide range of applications, including research in the field of computer vision, training of image recognition algorithms, and sourcing materials for creative design, thereby facilitating efficient progress in related areas.
image 4K

50,000 Sets - Image Editing Data

50,000 Sets - Image Editing Data. The types of editing include target removal, target addition, target modification, and target replacement. The editing targets cover scenes such as people, animals, products, plants, and landscapes. In terms of annotation, according to the editing instructions, the targets that need to be edited in the image are cropped and annotated for removal/addition/modification/replacement. The data can be used for tasks such as image synthesis, data augmentation, and virtual scene generation.
Image Editing

500,000 Images - Natural Scenes and Documents OCR Data

The dataset consists of 500,000 images for multi-country natural scenes and document OCR, including 20 languages such as Traditional Chinese, Japanese, Korean, Indonesian, Malay, Thai, Vietnamese, Polish, etc. The diversity includes various natural scenarios and multiple shooting angles. This set of data can be used for multi-language OCR tasks.
Natural scenes Documents OCR

30,000 Images - Natural Scenes OCR Data in Southeast Asian Languages

30,000 natural scene OCR images for minority languages in Southeast Asia, including Khmer (Cambodia), Lao, and Burmese. The collection is diverse, covering a variety of natural scenes and shooting angles. This set of data can be used for Southeast Asian language OCR tasks.
OCR Southeast Asian Languages Natural Scenes

100,000 Sets of ICONS Image Caption Data

100,000 Sets of ICONS Image Caption Data. The data includes two major categories of icons, namely 3D Style Icons and Vector Illustration Icons, totaling 17 subcategories. In terms of annotation, the icon descriptions are in Chinese, with a description length of about 30 characters. The data can be used for tasks such as graphic recognition and interface interaction.
ICONS Image caption

6.9 million - Chinese Multi-disciplinary Questions Text Parsing And Processing Data

6.9 million - Chinese Multi-disciplinary Questions Text Parsing And Processing Data, covering multiple disciplines at the primary school, middle school, high school, and university levels. Each question contains the fields title, answer, parse, type, subject, and grade. The dataset can be used for large model subject knowledge enhancement tasks.
Chinese multi-disciplinary Questions LLM Text