Large Language Model content safety considerations text data

LLM
Large Language Model
Large Model
chatgpt data

Large Language Model content safety considerations text data, about 500,000 entries in total. This dataset can be used for tasks such as training LLMs and ChatGPT-style models.

Paid Datasets
This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Data content: Large Language Model content safety considerations text data
Data size: About 500,000 sets of question-and-answer data: 400,000 sensitive instructions covering 31 categories regulated by the Cyberspace Administration of China (CAC), plus an additional 100,000 harsh-language entries
Collection type: 32 major categories
Collection method: Written by professional annotators
Storage format: Excel
Language: Chinese