Datasets

The dataset contains 774,491 ancient Chinese couplets.

The dataset contains 232,670 quatrains.

We create a new large-scale Ancient-Modern Chinese parallel corpus.

Preprocessed dataset containing 240k QA examples.

This dataset contains 2,445,164 sentences.

The dataset contains 586,538 sentences from user comments.

Preprocessed dataset containing 1.06 million medical QA examples.