75 Dictionaries of Different Chinese Fields

Chinese domain dictionary data

text data

NLU data

Entity Identification data

75 Chinese domain dictionaries covering a wide range of content, including data for a certain year. Each line of a data file contains a term and its pinyin, and the terms are sorted alphabetically. The data set can be used for tasks such as natural language understanding and knowledge base construction.
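As a rough illustration of the one-term-per-line layout described above, the following Python sketch loads such lines into a term-to-pinyin lookup. The listing does not state the field separator or the file encoding, so the tab separator is an assumption, the function name `parse_dictionary_lines` is hypothetical, and the two sample entries are made up for illustration rather than taken from the data set.

```python
from typing import Dict, Iterable


def parse_dictionary_lines(lines: Iterable[str], sep: str = "\t") -> Dict[str, str]:
    """Map each term to its pinyin.

    Assumes one "term<sep>pinyin" pair per line; the tab separator is a
    guess, since the listing does not specify the delimiter.
    """
    entries: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        term, found, pinyin = line.partition(sep)
        if not found:
            continue  # skip lines that do not contain the separator
        entries[term.strip()] = pinyin.strip()
    return entries


# Made-up sample lines, for illustration only.
sample = ["阿司匹林\ta si pi lin\n", "\n", "白细胞\tbai xi bao\n"]
lexicon = parse_dictionary_lines(sample)
print(lexicon["白细胞"])  # prints "bai xi bao"
```

For a real file, the same function could be fed an open file handle, for example `open(path, encoding="utf-8")`, provided the encoding and separator match the delivered data.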

This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.

Recommended Datasets

8,178 Chinese Social Comments Events Annotation Data

8,178 annotated Chinese social comments. The content covers hot news from 2013. Each piece of news contains one or more events and is annotated with time, theme, cause, procedure, and result. The data is stored in XML and can be used for natural language understanding.

Social comment event annotation data event annotation comment annotation data event annotation data

10,000 Chinese News Events Annotation Data

10,000 annotated Chinese news events. The content covers hot news from 2013. Each piece of news contains one or more events, and each event is annotated. The data is stored in XML and can be used for natural language understanding.

Chinese news corpus annotation corpus annotation news corpus corpus data

84,516 Sentences - English Intention Annotation Data in Interactive Scenes

84,516 English sentences from interactive scenes, annotated with intent classes as well as slot and slot value information. The intent domains include music, weather, date, schedule, home equipment, etc. The data is intended for intent recognition research and related fields.

english intent annotation data interactive intent annotation data intent recognition nlp intent recognition data NLU data

47,811 Sentences - Intention Annotation Data in Interactive Scenes

47,811 single sentences of intent-type text data, annotated with intent classes as well as slot and slot value information. The intent domains include music, weather, date, schedule, home equipment, etc. The data is intended for intent recognition research and related fields.

intent annotation data interactive intent annotation data intent recognition nlp intent recognition data NLU data

28,237 Intent-type single sentence annotation data

28,237 single sentences of intent-type text data, manually written and annotated with intent classes as well as slot and slot value information. The intent domains include music, weather, date, schedule, home equipment, etc. The data is intended for intent recognition research and related fields.

intent annotation data interactive intent annotation data intent recognition nlp intent recognition data NLU data

13 Modules – Entity Name Single-sentence Annotation Data

More than 15,000 pieces of data in 13 modules, collected from different scenes and annotated with entity name and entity type. The content is rich and the data accuracy is high.

entity annotation associated entities textual data annotation entity type annotation entity name annotation

687,694 Open Domain Intention Annotation Data

Annotations for 687,694 user-generated sentences from mobile phone scenes, covering to-do, location, and schedule scenes. The data set can be used for natural language understanding tasks.

open domain data intent annotation data textual data annotation SMS text data nlu data Intention understanding data

82 Million Cantonese Script Data

82 million pieces of Cantonese text data, collected from Cantonese script text. The data set can be used for natural language understanding, knowledge base construction, and other tasks.

Cantonese script data Cantonese textual data Cantonese text data collection dialogue text data