Large Language Model content safety considerations text data

Content safety

Text

LLM

Large Language Model content safety considerations text data, about 570,000 sets in total. This dataset can be used for tasks such as LLM training and ChatGPT-style applications.

This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.

Specifications

Data content: Large Language Model content safety considerations text data
Data size: About 570,000 sets of question-and-answer data, covering 31 categories of CAC plus other new categories
Collecting type: 41 major categories
Collecting method: Written by professional annotators
Storage format: Excel
Language: Chinese
