# cMedQA

**Repository Path**: songting/cMedQA

## Basic Information

- **Project Name**: cMedQA
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-01-10
- **Last Updated**: 2020-12-18

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# cMedQA v1.0
This is cMedQA v1.0, a dataset for Chinese community medical question answering. It is available for non-commercial research. We will update and expand the dataset from time to time. To protect privacy, the data is anonymized and contains no personal information.

# Update

The newest version of cMedQA is v2.0, available at [cMedQA2](https://github.com/zhangsheng93/cMedQA2).


# Overview

| Dataset | #Questions | #Answers | Avg. #words per question | Avg. #words per answer | Avg. #characters per question | Avg. #characters per answer |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: |
|Train|50,000|94,134|97|169|120|212|
|Dev|2,000|3,774|94|172|117|216|
|Test|2,000|3,835|96|168|119|211|
|Total|54,000|101,743|96|169|119|212|
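
The split sizes in the table can be cross-checked with a few lines of Python. This is only a sketch: the counts are copied from the table, and the `splits` dictionary and `answers_per_question` helper are names made up for this illustration. It confirms that the per-split counts add up to the Total row and shows how many answers each question has on average.

```python
# Split sizes copied from the overview table.
splits = {
    "Train": {"questions": 50_000, "answers": 94_134},
    "Dev": {"questions": 2_000, "answers": 3_774},
    "Test": {"questions": 2_000, "answers": 3_835},
}

# Sum the per-split counts to compare against the Total row.
total_questions = sum(s["questions"] for s in splits.values())
total_answers = sum(s["answers"] for s in splits.values())


def answers_per_question(split):
    """Average number of answers attached to one question in a split."""
    return split["answers"] / split["questions"]


for name, split in splits.items():
    print(f"{name}: {answers_per_question(split):.2f} answers per question")
print(f"Total: {total_questions} questions, {total_answers} answers")
```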

* **questions.csv**: all questions and their content.
* **answers.csv**: all answers and their content.
* **train_candidates.txt**, **dev_candidates.txt**, **test_candidates.txt**: the training, development, and test splits, respectively.
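
A minimal loading sketch for the files listed above, using only the Python standard library. It rests on assumptions this README does not confirm: that every file is UTF-8 encoded, comma-delimited, and starts with a header row (including the `*_candidates.txt` files). The function names `read_table` and `load_cmedqa` are hypothetical helpers, not part of the dataset. Column names are read from each header instead of being hard-coded, since they are not documented here.

```python
import csv
from pathlib import Path


def read_table(path, delimiter=","):
    """Read a delimited text file with a header row into a list of dicts.

    Assumption: the file is UTF-8 and its first line names the columns.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


def load_cmedqa(data_dir):
    """Load the five files named in this README from data_dir."""
    data_dir = Path(data_dir)
    data = {
        "questions": read_table(data_dir / "questions.csv"),
        "answers": read_table(data_dir / "answers.csv"),
    }
    # Assumption: the split files share the same delimited layout.
    for split in ("train", "dev", "test"):
        data[split] = read_table(data_dir / f"{split}_candidates.txt")
    return data
```

Calling `load_cmedqa("path/to/cMedQA")` and inspecting the keys of the first row in each list is a quick way to see the actual column names before writing any further processing. If a file turns out to use another delimiter or encoding, adjust `read_table` accordingly.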

# Paper
**Chinese Medical Question Answer Matching Using End-to-End Character-Level Multi-Scale CNNs** [link to the paper](http://www.mdpi.com/2076-3417/7/8/767)

Please cite our paper if you use the dataset.
```
@article{zhang2017chinese,
  title={Chinese Medical Question Answer Matching Using End-to-End Character-Level Multi-Scale CNNs},
  author={Zhang, Sheng and Zhang, Xin and Wang, Hui and Cheng, Jiajun and Li, Pei and Ding, Zhaoyun},
  journal={Applied Sciences},
  volume={7},
  number={8},
  pages={767},
  year={2017},
  publisher={Multidisciplinary Digital Publishing Institute}
}
```