High-quality image-caption dataset: 300 million pairs

multimodal
image
description

300 million images, each paired with a description. All are original image works published by photographers. The vast majority of descriptions are in English; a small number are in Chinese.

Paid Datasets
This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Data size
300 million images, each paired with a textual description. The complete image library (photographic and vector images) totals nearly 300 million. The subset available for generative AI training (curated photographic and vector images, excluding editorial/news images) comprises approximately 100 million.
Data formats
Image formats: .jpg, .png, .svg; Description format: .txt
Data content
Original copyrighted image works officially released by their creators, with accompanying descriptions authored by the content creators.
Data types
Photographic images and vector illustrations covering diverse scene categories.
Data resolution
4K and above
Description languages
Predominantly English, with a small Chinese portion.
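
Given the formats listed in the specifications (.jpg, .png, and .svg images with .txt descriptions), a buyer might pair images with their descriptions roughly as sketched below. This is only an illustration: the listing does not state how files are laid out, so the sketch assumes each description shares its image's base name (for example `0001.jpg` and `0001.txt`), and the function name `iter_image_caption_pairs` is hypothetical.

```python
from pathlib import Path
from typing import Iterator, Tuple

# Image extensions listed under "Data formats".
IMAGE_EXTENSIONS = {".jpg", ".png", ".svg"}


def iter_image_caption_pairs(root: Path) -> Iterator[Tuple[Path, str]]:
    """Yield (image_path, caption) for each image with a sibling .txt file.

    Assumption (not stated in the listing): a description file shares the
    image's base name, e.g. 0001.jpg is described by 0001.txt.
    """
    for image_path in sorted(root.rglob("*")):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        caption_path = image_path.with_suffix(".txt")
        if not caption_path.is_file():
            continue  # skip images that have no description file
        caption = caption_path.read_text(encoding="utf-8").strip()
        yield image_path, caption
```

Reading lazily through a generator keeps memory use flat, which matters at the scale described here. If the delivered layout differs (for instance a single manifest file), only the lookup of `caption_path` would need to change.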