diff --git a/assignment-1/submission/18307130116/README.md b/assignment-1/submission/18307130116/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..142f441ff2c2994c3d62c3cbb861fd7c637837f8
--- /dev/null
+++ b/assignment-1/submission/18307130116/README.md
@@ -0,0 +1,235 @@
+# KNN Classifier
+
+[toc]
+
+## Dependencies
+
+`numpy`
+
+`matplotlib`
+
+## Function Overview
+
+### KNN
+
+**`fit(self, train_data, train_label)`**
+
+`train_data`: the training points
+
+`train_label`: the training labels
+
+**Summary:** `fit` holds out 10% of the training set to search for the K that maximizes accuracy. If the training set has fewer than 10 points, it defaults to `K = 1`; otherwise it picks the K in 1 to 10 with the highest held-out accuracy and uses it at prediction time.
+
+---
+
+**`predict(self, test_data)`**
+
+**Summary:** predicts labels with the K learned in the previous step.
+
+### Experiment and Helper Functions
+
+**`distance(point1, point2, method="Euclid")`**
+
+`point1` and `point2`: the two points whose distance is computed
+
+`method`: the distance metric; the default is Euclidean distance, and `Manhattan` selects Manhattan distance
+
+**Summary:** the function first normalizes its inputs to [m, 1] vectors, then computes the distance between the two points under the chosen metric.
+
+---
+
+**`dis(dis_label)`**
+
+**Summary:** the `key` function for `sort`; extracts the distance from a (distance, label) pair.
+
+---
+
+**`nearest_k_label_max(point, point_arr, label_arr, k)`**
+
+`point`: the target point whose k nearest neighbors are sought
+
+`point_arr`: the existing point set
+
+`label_arr`: the labels of the existing points
+
+`k`: the number of nearest neighbors to consider
+
+**Summary:** computes the distance from the target point to every point in the set, finds the K nearest points, and returns the most frequent `label` among them.
+
+---
+
+**`data_generate_and_save(class_num, mean_list, cov_list, num_list, save_path = "")`**
+
+`class_num`: the number of classes
+
+`mean_list`: the mean of each class's Gaussian distribution
+
+`cov_list`: the covariance matrix of each class
+
+`num_list`: num_list[i] is the number of points in class i
+
+`save_path`: where to store the generated points, defaulting to `data.npy` in the current directory; the path must end with a slash
+
+**Summary:** calls `numpy.random.multivariate_normal` to generate the requested number of points, shuffles them, splits 80% into training data and 20% into test data, and saves them as the tuple `((train_data, train_label), (test_data, test_label))`.
+
+---
+
+**`data_load(path = "")`**
+
+`path`: where to load the points from, defaulting to `data.npy` in the current directory; the path must end with a slash
+
+**Summary:** the point set must have been saved as the tuple `((train_data, train_label), (test_data, test_label))`.
+
+---
+
+**`visualize(data, label, class_num = 1, test_data=[])`**
+
+*Visualization currently supports two dimensions only; for higher-dimensional points, only the first two dimensions are plotted.*
+
+`data`: the training point coordinates
+
+`label`: the labels of the training points
+
+`class_num`: the total number of classes, default 1
+
+`test_data`: the test point coordinates
+
+**Summary:** draws a scatter plot of the point set, with each class automatically shown in a distinct color; test points are drawn with a "+" marker.
+
+## Experiments
+
+First, we generated three classes of points, 100 per class.
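The `fit`/`predict` behaviour documented above can be illustrated with a small, self-contained sketch. This is not the submitted `source.py`: it uses only the standard library, and the class and method names merely mirror the interface described in this README.

```python
import math
from collections import Counter

class KNN:
    """Stdlib-only sketch of the fit/predict interface described above."""

    def fit(self, train_data, train_label):
        # Hold out 10% as a dev split; fall back to k = 1 when it is empty.
        self.train_data, self.train_label = train_data, train_label
        n = len(train_data)
        dev_n = n // 10
        if dev_n == 0:
            self.k = 1
            return
        dev_d, dev_l = train_data[:dev_n], train_label[:dev_n]
        tr_d, tr_l = train_data[dev_n:], train_label[dev_n:]

        def dev_accuracy(k):
            hits = sum(self._vote(p, tr_d, tr_l, k) == l for p, l in zip(dev_d, dev_l))
            return hits / dev_n

        # max() keeps the first (smallest) k among ties, matching the
        # "prefer the smaller k at equal accuracy" rule used below.
        self.k = max(range(1, min(10, n - dev_n) + 1), key=dev_accuracy)

    def _vote(self, point, data, labels, k):
        # Majority label among the k nearest neighbors (Euclidean distance).
        nearest = sorted(zip(data, labels), key=lambda dl: math.dist(point, dl[0]))[:k]
        return Counter(l for _, l in nearest).most_common(1)[0][0]

    def predict(self, test_data):
        return [self._vote(p, self.train_data, self.train_label, self.k) for p in test_data]
```

The dev split is only taken when at least 10 training points are available, matching the fallback to `K = 1` described above.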
+
+The corresponding means and covariance matrices are listed below.
+
+|         | Mean    | Covariance matrix |
+| ------- | ------- | ----------------- |
+| class 1 | (1, 2)  | [[10, 0], [0, 2]] |
+| class 2 | (4, 5)  | [[7, 3], [15, 1]] |
+| class 3 | (-2, 6) | [[0, 1], [1, 2]]  |
+
+We measured the accuracy for k = 1 to 10, shown below.
+
+![k1](img/k1.png)
+
+Preferring the smaller value at equal accuracy, we chose k = 5, reaching a prediction accuracy of 83.3%. The corresponding data are visualized below.
+
+![Figure_1](img/Figure_1.png)
+
+### Experiment 1: reducing overlap between point sets
+
+The figure above shows that the three colored point sets are largely separated, but some overlap remains. We conjecture that the overlapping region degrades KNN, and verify this below by changing the means and covariances.
+
+First, we change the covariances to
+
+|         | Mean    | Covariance matrix |
+| ------- | ------- | ----------------- |
+| class 1 | (1, 2)  | [[1, 0], [0, 1]]  |
+| class 2 | (4, 5)  | [[1, 0], [0, 1]]  |
+| class 3 | (-2, 6) | [[1, 0], [0, 1]]  |
+
+The accuracy-vs-K curve and the point distribution are shown below.
+
+![Figure_2_1](img/Figure_2_1.png) ![Figure_2_2](img/Figure_2_2.png)
+
+With K = 3, KNN accuracy rises to 96.7%, as expected.
+
+Similarly, we change the means so that the Gaussians are as far apart as possible.
+
+|         | Mean      | Covariance matrix |
+| ------- | --------- | ----------------- |
+| class 1 | (-10, 2)  | [[10, 0], [0, 2]] |
+| class 2 | (4, 5)    | [[7, 3], [15, 1]] |
+| class 3 | (-2, -16) | [[0, 1], [1, 2]]  |
+
+The curves are shown below; accuracy reaches its maximum of 1.0 already at K = 1.
+
+![Figure_2_3](img/Figure_2_3.png) ![Figure_2_4](img/Figure_2_4.png)
+
+#### Conclusion
+
+This experiment clearly shows how the point distribution affects KNN accuracy: when the classes overlap little, accuracy improves markedly.
+
+### Experiment 2: choice of distance metric
+
+The experiments above used Euclidean distance. Here we switch to Manhattan distance and examine its effect.
+
+When the point sets are well separated, Manhattan and Euclidean distance give similar accuracy, so those results are omitted. For heavily overlapping point sets, we generated several datasets from the following distribution.
+
+|         | Mean   | Covariance matrix |
+| ------- | ------ | ----------------- |
+| class 1 | (1, 4) | [[10, 0], [0, 2]] |
+| class 2 | (2, 5) | [[7, 3], [15, 1]] |
+| class 3 | (2, 6) | [[0, 1], [1, 2]]  |
+
+The chosen k and the accuracy (acc) are listed below.
+
+| Euclidean distance | Manhattan distance |
+| ------------------ | ------------------ |
+| k = 3, acc = 0.7   | k = 3, acc = 0.683 |
+| k = 1, acc = 0.53  | k = 1, acc = 0.483 |
+| k = 7, acc = 0.63  | k = 8, acc = 0.567 |
+
+Overall, when the point sets overlap heavily, Euclidean distance outperforms Manhattan distance. We conjecture that for Gaussian-generated points, Euclidean distance penalizes a large gap in a single dimension more strongly than Manhattan distance, which matches how the points are generated and better fits the local probability density, yielding higher accuracy.
+
+#### Conclusion
+
+When the point sets are well separated, Manhattan and Euclidean distance perform similarly; when they overlap heavily, Euclidean distance outperforms Manhattan distance.
+
+### Experiment 3: number of points
+
+For the following distribution
+
+|         | Mean    | Covariance matrix |
+| ------- | ------- | ----------------- |
+| class 1 | (1, 4)  | [[10, 0], [0, 2]] |
+| class 2 | (2, -3) | [[7, 3], [15, 1]] |
+| class 3 | (2, 5)  | [[0, 1], [1, 2]]  |
+
+we generated four groups with per-class counts [100, 100, 100], [100, 10, 100], [100, 50,
200], and [200, 200, 200], repeating each group several times to avoid chance error.
+
+The results are listed below.
+
+|      | [100, 100, 100] | [100, 10, 100] | [100, 50, 200] | [200, 200, 200] |
+| ---- | --------------- | -------------- | -------------- | --------------- |
+| 1    | 0.867           | 0.809          | 0.886          | 0.875           |
+| 2    | 0.800           | 0.809          | 0.843          | 0.825           |
+| 3    | 0.867           | 0.809          | 0.857          | 0.9             |
+| 4    | 0.917           | 0.761          | 0.886          | 0.792           |
+| mean | 0.862           | 0.797          | 0.868          | 0.848           |
+
+#### Conclusion
+
+As the number of points grows, the overlap area grows and accuracy drops accordingly. When one class has far fewer points than the others, accuracy is noticeably affected; when the gap is extreme, the task partly degenerates into an (N-1)-class problem, which can actually raise the measured accuracy.
+
+### Experiment 4: feature scales
+
+When the dimensions have mismatched scales, for example an (age, wealth) pair, Euclidean distance in this space effectively degenerates into the distance along the largest-scale dimension. To examine this effect, we generated the following data.
+
+|         | Mean     | Covariance matrix     |
+| ------- | -------- | --------------------- |
+| class 1 | (1, 400) | [[10, 0], [0, 20000]] |
+| class 2 | (2, 300) | [[7, 0], [0, 10000]]  |
+| class 3 | (2, 300) | [[1, 0], [0, 10000]]  |
+
+The k curve and the point distribution of one run are shown below; the accuracy averaged over several runs is 0.399.
+
+![Figure_5_1](img/Figure_5_1.png) ![Figure_5_2](img/Figure_5_2.png)
+
+For comparison, we proportionally scale the second dimension down by a factor of 100:
+
+|         | Mean   | Covariance matrix |
+| ------- | ------ | ----------------- |
+| class 1 | (1, 4) | [[10, 0], [0, 2]] |
+| class 2 | (2, 3) | [[7, 0], [15, 1]] |
+| class 3 | (2, 3) | [[1, 0], [0, 1]]  |
+
+The corresponding k curve and point set are visualized below.
+
+![Figure_6_1](img/Figure_6_1.png) ![Figure_6_2](img/Figure_6_2.png)
+
+The accuracy averaged over several runs is 0.539.
+
+#### Conclusion
+
+Scale normalization strongly affects accuracy: after proportionally rescaling the dimensions, accuracy improves substantially. Given the earlier results on point distributions, however, we conjecture that normalization matters little when the point sets themselves are well separated.
\ No newline at end of file
diff --git a/assignment-1/submission/18307130116/img/Figure_1.png b/assignment-1/submission/18307130116/img/Figure_1.png
new file mode 100644
index 0000000000000000000000000000000000000000..b840aa5b2862be15a71968435433efc147086318
Binary files /dev/null and b/assignment-1/submission/18307130116/img/Figure_1.png differ
diff --git a/assignment-1/submission/18307130116/img/Figure_2_1.png b/assignment-1/submission/18307130116/img/Figure_2_1.png
new file mode 100644
index 0000000000000000000000000000000000000000..5e2b73e556a36aa5db294e9c2c42fc039728279d
Binary files /dev/null and b/assignment-1/submission/18307130116/img/Figure_2_1.png differ
diff --git a/assignment-1/submission/18307130116/img/Figure_2_2.png
b/assignment-1/submission/18307130116/img/Figure_2_2.png
new file mode 100644
index 0000000000000000000000000000000000000000..3c6ec2fa9693474116ae15a76359f69b442d99b1
Binary files /dev/null and b/assignment-1/submission/18307130116/img/Figure_2_2.png differ
diff --git a/assignment-1/submission/18307130116/img/Figure_2_3.png b/assignment-1/submission/18307130116/img/Figure_2_3.png
new file mode 100644
index 0000000000000000000000000000000000000000..a893f35d277af8c818a69f49cee5e2bbe06c2367
Binary files /dev/null and b/assignment-1/submission/18307130116/img/Figure_2_3.png differ
diff --git a/assignment-1/submission/18307130116/img/Figure_2_4.png b/assignment-1/submission/18307130116/img/Figure_2_4.png
new file mode 100644
index 0000000000000000000000000000000000000000..34e3cb5e2c15ae4104a1f12fbd9ef62af24cb03e
Binary files /dev/null and b/assignment-1/submission/18307130116/img/Figure_2_4.png differ
diff --git a/assignment-1/submission/18307130116/img/Figure_5_1.png b/assignment-1/submission/18307130116/img/Figure_5_1.png
new file mode 100644
index 0000000000000000000000000000000000000000..09921dca1bbeebae81d5b0f71eafe9ab0f0ce75a
Binary files /dev/null and b/assignment-1/submission/18307130116/img/Figure_5_1.png differ
diff --git a/assignment-1/submission/18307130116/img/Figure_5_2.png b/assignment-1/submission/18307130116/img/Figure_5_2.png
new file mode 100644
index 0000000000000000000000000000000000000000..18ed90b7cd1ec5f2c91a863393b21b655b040eb6
Binary files /dev/null and b/assignment-1/submission/18307130116/img/Figure_5_2.png differ
diff --git a/assignment-1/submission/18307130116/img/Figure_6_1.png b/assignment-1/submission/18307130116/img/Figure_6_1.png
new file mode 100644
index 0000000000000000000000000000000000000000..6fdc07c00f7cfdcead4f8cf98880ce1cd76f9526
Binary files /dev/null and b/assignment-1/submission/18307130116/img/Figure_6_1.png differ
diff --git a/assignment-1/submission/18307130116/img/Figure_6_2.png
b/assignment-1/submission/18307130116/img/Figure_6_2.png
new file mode 100644
index 0000000000000000000000000000000000000000..72685efbfd9bc42f675811e5f92bf88c6bbc3851
Binary files /dev/null and b/assignment-1/submission/18307130116/img/Figure_6_2.png differ
diff --git a/assignment-1/submission/18307130116/img/k1.png b/assignment-1/submission/18307130116/img/k1.png
new file mode 100644
index 0000000000000000000000000000000000000000..8a81a8e624428a86d14851ca1a9848cf11c61be0
Binary files /dev/null and b/assignment-1/submission/18307130116/img/k1.png differ
diff --git a/assignment-1/submission/18307130116/source.py b/assignment-1/submission/18307130116/source.py
new file mode 100644
index 0000000000000000000000000000000000000000..4daa13c95a45ed7371bb33f20bdd2f4d821894ae
--- /dev/null
+++ b/assignment-1/submission/18307130116/source.py
@@ -0,0 +1,154 @@
+import numpy as np
+import matplotlib.pyplot as plt
+import matplotlib.cm as cm
+
+def distance(point1, point2, method="Euclid"):
+    """
+    Suppose the dimension of the points is m * 1.
+    """
+    # Normalize both inputs to column vectors of shape (m, 1).
+    if point1.ndim == 1:
+        point1 = np.expand_dims(point1, axis=1)
+    if point2.ndim == 1:
+        point2 = np.expand_dims(point2, axis=1)
+    if point1.shape[0] == 1:
+        point1 = point1.reshape(-1, 1)
+    if point2.shape[0] == 1:
+        point2 = point2.reshape(-1, 1)
+    dimension_num = point1.shape[0]
+    result = 0
+    if method == "Euclid":
+        if dimension_num != point1.size:
+            print("error")
+            return -1
+        for iter in range(dimension_num):
+            result += (point1[iter, 0] - point2[iter, 0]) ** 2
+        return pow(result, 0.5)
+    if method == "Manhattan":
+        if dimension_num != point1.size:
+            print("error")
+            return -1
+        for iter in range(dimension_num):
+            result += abs(point1[iter, 0] - point2[iter, 0])
+        return result
+
+def dis(dis_label):
+    # Sort key: extract the distance from a (distance, label) pair.
+    return dis_label[0]
+
+def nearest_k_label_max(point, point_arr, label_arr, k):
+    distance_arr = []
+    for iter in range(len(point_arr)):
+        distance_arr.append((distance(point, point_arr[iter]), label_arr[iter]))
+    distance_arr.sort(key=dis)
+    result = []
+    for iter in range(k):
+        result.append(distance_arr[iter][1])
+    # Return the most frequent label among the k nearest points.
+    return max(result, key=result.count)
+
+class KNN:
+
+    def __init__(self):
+        pass
+
+    def fit(self, train_data, train_label):
+        num = train_data.shape[0]
+        dimension_num = train_data.shape[1]
+        self.train_data = train_data
+        self.train_label = train_label
+        # Hold out 10% of the training set as a development split.
+        dev_num = int(num * 0.1)
+        dev_data = train_data[:dev_num]
+        dev_label = train_label[:dev_num]
+        train_data = train_data[dev_num:]
+        train_label = train_label[dev_num:]
+        correct_count_max = 0
+        k_max = 1  # fall back to k = 1 if no candidate improves on zero accuracy
+        accu = []
+        if dev_num == 0:
+            print("points number too few, so we choose k = 1")
+            self.k = 1
+            return
+
+        # Find the best k in 1..10 (capped by the reduced training-set size).
+        for iter in range(1, min(num - dev_num, 10) + 1):
+            correct_count = 0
+            for j in range(len(dev_data)):
+                predict_label = nearest_k_label_max(dev_data[j], train_data, train_label, iter)
+                if predict_label == dev_label[j]:
+                    correct_count += 1
+            if correct_count > correct_count_max:
+                correct_count_max = correct_count
+                k_max = iter
+            accu.append(correct_count / dev_num)
+        x = range(1, min(num - dev_num, 10) + 1)
+        # This part is only for the experiments, so it is commented out for the auto test.
+        # plt.plot(x, accu)
+        # plt.show()
+        self.k = k_max
+        print("choose k=", k_max)
+
+    def predict(self, test_data):
+        result = []
+        for iter in range(len(test_data)):
+            result.append(nearest_k_label_max(test_data[iter, :], self.train_data, self.train_label, self.k))
+        return np.array(result)
+
+# Helper utilities.
+def data_generate_and_save(class_num, mean_list, cov_list, num_list, save_path=""):
+    """
+    class_num: the number of classes
+    mean_list: mean_list[i] stands for the mean of class[i]
+    cov_list: similar to mean_list, stands for the covariance
+    num_list: similar to mean_list, stands for the number of points in class[i]
+    save_path: the data storage path, ending with a slash
+    """
+    data = np.random.multivariate_normal(mean_list[0], cov_list[0], (num_list[0],))
+    label = np.zeros((num_list[0],), dtype=int)
+    total = num_list[0]
+
+    for iter in range(1, class_num):
+        temp = np.random.multivariate_normal(mean_list[iter], cov_list[iter], (num_list[iter],))
+        label_temp = np.ones((num_list[iter],), dtype=int) * iter
+        data = np.concatenate([data, temp])
+        label = np.concatenate([label, label_temp])
+        total += num_list[iter]
+
+    # Shuffle, then split 80% train / 20% test.
+    idx = np.arange(total)
+    np.random.shuffle(idx)
+    data = data[idx]
+    label = label[idx]
+    train_num = int(total * 0.8)
+    train_data = data[:train_num]
+    test_data = data[train_num:]
+    train_label = label[:train_num]
+    test_label = label[train_num:]
+    np.save(save_path + "data.npy", ((train_data, train_label), (test_data, test_label)))
+
+def data_load(path=""):
+    (train_data, train_label), (test_data, test_label) = np.load(path + "data.npy", allow_pickle=True)
+    return (train_data, train_label), (test_data, test_label)
+
+def visualize(data, label, class_num=1, test_data=[]):
+    # Only the first two dimensions are plotted.
+    data_x = {}
+    data_y = {}
+    for iter in range(class_num):
+        data_x[iter] = []
+        data_y[iter] = []
+    for iter in range(len(label)):
+        data_x[label[iter]].append(data[iter, 0])
+        data_y[label[iter]].append(data[iter, 1])
+    colors = cm.rainbow(np.linspace(0, 1, class_num))
+
+    for class_idx, c in zip(range(class_num), colors):
+        plt.scatter(data_x[class_idx], data_y[class_idx], color=c)
+    if len(test_data) != 0:
+        plt.scatter(test_data[:, 0], test_data[:, 1], marker='+')
+    plt.show()
+
+# Experiment entry point.
+if __name__ == "__main__":
+    mean_list = [(1, 4), (2, 3), (2, 3)]
+    cov_list = [np.array([[10, 0], [0, 2]]), np.array([[7, 0], [0, 1]]), np.array([[1, 0], [0, 1]])]
+    num_list = [200, 200, 200]
+    save_path = ""
+    data_generate_and_save(3, mean_list, cov_list, num_list, save_path)
+    # (train_data, train_label), (test_data, test_label) = data_load()
+    # visualize(train_data, train_label, 3)
\ No newline at end of file
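As an appendix to the metric comparison in Experiment 2 of the README above, the Euclidean-vs-Manhattan behaviour of a plain majority-vote KNN can be reproduced in miniature. This sketch is standard-library only and uses illustrative isotropic Gaussians rather than the report's covariance matrices; `knn_predict`, `accuracy`, and `cloud` are names invented here, not part of the submission.

```python
import math
import random
from collections import Counter

def knn_predict(point, data, labels, k, method="Euclid"):
    # Same two metrics as the submission's distance(): Euclidean and Manhattan.
    if method == "Euclid":
        dist = lambda a, b: math.dist(a, b)
    else:
        dist = lambda a, b: sum(abs(x - y) for x, y in zip(a, b))
    nearest = sorted(zip(data, labels), key=lambda dl: dist(point, dl[0]))[:k]
    return Counter(l for _, l in nearest).most_common(1)[0][0]

def accuracy(train, train_l, test, test_l, k, method):
    hits = sum(knn_predict(p, train, train_l, k, method) == l
               for p, l in zip(test, test_l))
    return hits / len(test)

random.seed(0)

def cloud(mx, my, n):
    # Isotropic unit-variance Gaussian cloud around (mx, my).
    return [(random.gauss(mx, 1.0), random.gauss(my, 1.0)) for _ in range(n)]

data = cloud(0, 0, 100) + cloud(3, 3, 100)
labels = [0] * 100 + [1] * 100
pairs = list(zip(data, labels))
random.shuffle(pairs)
train, test = pairs[:160], pairs[160:]  # 80/20 split, as in the report
tr_d, tr_l = zip(*train)
te_d, te_l = zip(*test)

acc_euclid = accuracy(tr_d, tr_l, te_d, te_l, 5, "Euclid")
acc_manhattan = accuracy(tr_d, tr_l, te_d, te_l, 5, "Manhattan")
print(acc_euclid, acc_manhattan)
```

With well-separated clusters like these, both metrics should score similarly high, consistent with the report's observation that the metric choice matters mainly when the classes overlap.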