# c3

**Repository Path**: Leon_02/c3

## Basic Information

- **Project Name**: c3
- **Description**: Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-12-30
- **Last Updated**: 2020-12-30

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

C3
=====

Overview
--------

This repository maintains **C3**, the first free-form multiple-**C**hoice **C**hinese machine reading **C**omprehension dataset.

* Paper: https://arxiv.org/abs/1904.09679

```
@article{sun2019investigating,
  title={Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension},
  author={Sun, Kai and Yu, Dian and Yu, Dong and Cardie, Claire},
  journal={Transactions of the Association for Computational Linguistics},
  year={2020},
  url={https://arxiv.org/abs/1904.09679v3}
}
```

Files in this repository:

* ```license.txt```: the license of C3.
* ```data/c3-{m,d}-{train,dev,test}.json```: the dataset files, where m and d represent "**m**ixed-genre" and "**d**ialogue", respectively. The data format is as follows.

```
[
  [
    [
      document 1
    ],
    [
      {
        "question": document 1 / question 1,
        "choice": [
          document 1 / question 1 / answer option 1,
          document 1 / question 1 / answer option 2,
          ...
        ],
        "answer": document 1 / question 1 / correct answer option
      },
      {
        "question": document 1 / question 2,
        "choice": [
          document 1 / question 2 / answer option 1,
          document 1 / question 2 / answer option 2,
          ...
        ],
        "answer": document 1 / question 2 / correct answer option
      },
      ...
    ],
    document 1 / id
  ],
  [
    [
      document 2
    ],
    [
      {
        "question": document 2 / question 1,
        "choice": [
          document 2 / question 1 / answer option 1,
          document 2 / question 1 / answer option 2,
          ...
        ],
        "answer": document 2 / question 1 / correct answer option
      },
      {
        "question": document 2 / question 2,
        "choice": [
          document 2 / question 2 / answer option 1,
          document 2 / question 2 / answer option 2,
          ...
        ],
        "answer": document 2 / question 2 / correct answer option
      },
      ...
    ],
    document 2 / id
  ],
  ...
]
```

* ```annotation/c3-{m,d}-{dev,test}.txt```: question type annotations. Each file contains 150 annotated instances. We adopt the following abbreviations:

| Category | Abbreviation | Question Type |
|---|---|---|
| Matching | m | Matching |
| Prior knowledge | l | Linguistic |
| | s | Domain-specific |
| | c-a | Arithmetic |
| | c-o | Connotation |
| | c-e | Cause-effect |
| | c-i | Implication |
| | c-p | Part-whole |
| | c-d | Precondition |
| | c-h | Scenario |
| | c-n | Other |
| Supporting sentences | 0 | Single sentence |
| | 1 | Multiple sentences |
| | 2 | Independent |
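
As a minimal sketch of how the data format described for ```data/c3-{m,d}-{train,dev,test}.json``` might be consumed, the Python snippet below flattens each `[document, questions, id]` entry into one record per question. The names `Example`, `iter_examples`, and `load_c3` are hypothetical helpers for illustration and are not part of this repository. The sketch assumes that each document is a list of strings (joined here with newlines) and that `answer` repeats the text of the correct option verbatim, so its position can be recovered with `list.index`.

```python
import json
from typing import Any, Iterator, List, NamedTuple


class Example(NamedTuple):
    """One question paired with its document (hypothetical container)."""
    doc_id: Any
    document: str
    question: str
    choices: List[str]
    answer: str
    answer_index: int


def iter_examples(entries: list) -> Iterator[Example]:
    """Yield one Example per question from [document, questions, id] entries."""
    for sentences, questions, doc_id in entries:
        # Assumption: the document is a list of strings; join them for convenience.
        document = "\n".join(sentences)
        for q in questions:
            choices = list(q["choice"])
            answer = q["answer"]
            # Assumption: the answer string appears verbatim among the choices.
            yield Example(doc_id, document, q["question"],
                          choices, answer, choices.index(answer))


def load_c3(path: str) -> List[Example]:
    """Read one dataset file and flatten it into a list of Examples."""
    with open(path, encoding="utf-8") as f:
        return list(iter_examples(json.load(f)))
```

Under these assumptions, a call such as `load_c3("data/c3-m-train.json")` would return a flat list in which `example.answer_index` can serve as the classification label for a multiple-choice model.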