300M Image-Caption Pairs – Large-Scale Vision-Language Dataset for AI Training

image-caption dataset
image-text pairs
vision-language data
generative AI training dataset
multimodal AI dataset
image description data
LLM vision data
AI image-text alignment
high-quality image data

The 300 Million Pairs of High-Quality Image-Caption Dataset is a large-scale collection of photographic and vector images paired with textual descriptions. The complete image library comprises nearly 300 million images, of which a curated subset of approximately 100 million high-quality image-caption pairs is available for generative AI and vision-language model training. All images are authentic, legally licensed works created by professional photographers. Captions are predominantly English, with a small Chinese portion. The images cover diverse scenes, objects, and compositions, making the dataset suitable for image captioning, visual question answering (VQA), image-text retrieval, and multimodal foundation model pretraining. The dataset supports large-scale LLM and VLM applications and complies with global data privacy and copyright regulations, including GDPR, CCPA, and PIPL.

Paid Datasets
This is a paid dataset for commercial use, research, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Data size
Nearly 300 million images in the complete library (photographic and vector), each paired with a textual description. The curated set available for generative AI training (photographic and vector images, excluding editorial/news images) comprises approximately 100 million image-caption pairs.
Data formats
Image formats: .jpg, .png, .svg; Description format: .txt
Data content
Original copyrighted image works officially released by creators, accompanying descriptions authored by content creators.
Data types
Photographic images and vector illustrations covering diverse scene categories.
Data resolution
4K and above
Description languages
Predominantly English, with a small Chinese portion.
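
The specifications above list image files (.jpg, .png, .svg) and description files (.txt) but do not describe the delivery layout. As a minimal sketch, assuming each description file sits next to its image and shares the same file stem (a hypothetical convention, not confirmed by this listing), image-caption pairs could be collected like this. The function name `pair_images_with_captions` is illustrative only.

```python
from pathlib import Path

# Image formats named in the specifications.
IMAGE_SUFFIXES = {".jpg", ".png", ".svg"}


def pair_images_with_captions(root):
    """Yield (image_path, caption_text) for each image that has a caption.

    Assumption (not from the listing): the caption for `name.jpg` is stored
    in `name.txt` in the same directory.
    """
    root = Path(root)
    for image_path in sorted(root.rglob("*")):
        if not image_path.is_file():
            continue
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption_path = image_path.with_suffix(".txt")
        if not caption_path.is_file():
            continue  # image without a description file: skip it
        caption = caption_path.read_text(encoding="utf-8").strip()
        if caption:  # ignore empty description files
            yield image_path, caption
```

Iterating lazily with a generator avoids holding the whole index in memory, which matters at the scale described here. If the actual delivery uses a manifest file or a different naming scheme, only the lookup of `caption_path` would need to change.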