Tony_幽城 / py-101667-word2vec-textcnn-lstm

This repository declares no open-source license file (LICENSE); check the project description and its upstream code dependencies before use.
README

dev

  1. pip install -r requirements.txt
  2. Under the data directory there are two folders, negativeReviews and positiveReviews; each contains the txt samples.
  3. In clean_txt.py, uncomment the two method calls build_data and clean_data.
  4. Run clean_txt.py first.
  5. Then run train2.py.
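The folder layout in step 2 maps directly to labels. A minimal stdlib-only sketch of how such a directory could be read into labeled samples (the folder names come from the steps above; the function name `load_reviews` is hypothetical, not the project's actual code):

```python
import os

def load_reviews(data_dir):
    """Read txt samples from negativeReviews/ and positiveReviews/ under data_dir."""
    texts, labels = [], []
    # folder order fixes the label convention: 0 = negative, 1 = positive
    for label, folder in enumerate(["negativeReviews", "positiveReviews"]):
        path = os.path.join(data_dir, folder)
        for name in sorted(os.listdir(path)):
            if name.endswith(".txt"):
                with open(os.path.join(path, name), encoding="utf-8") as f:
                    texts.append(f.read())
                labels.append(label)
    return texts, labels
```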

Ref

https://www.cnblogs.com/dogecheng/p/11565530.html

LSTM

Epoch 1/8
400/400 [==============================] - 176s 428ms/step - loss: 0.6866 - accuracy: 0.5608 - val_loss: 0.5265 - val_accuracy: 0.7392
Epoch 2/8
400/400 [==============================] - 170s 425ms/step - loss: 0.5349 - accuracy: 0.7314 - val_loss: 0.4586 - val_accuracy: 0.7758
Epoch 3/8
400/400 [==============================] - 175s 438ms/step - loss: 0.4637 - accuracy: 0.7794 - val_loss: 0.4380 - val_accuracy: 0.7920
Epoch 4/8
400/400 [==============================] - 167s 416ms/step - loss: 0.4249 - accuracy: 0.8045 - val_loss: 0.4187 - val_accuracy: 0.8094
Epoch 5/8
400/400 [==============================] - 167s 417ms/step - loss: 0.3900 - accuracy: 0.8219 - val_loss: 0.4200 - val_accuracy: 0.8104
Epoch 6/8
400/400 [==============================] - 164s 410ms/step - loss: 0.3585 - accuracy: 0.8381 - val_loss: 0.4446 - val_accuracy: 0.8024
Epoch 7/8
400/400 [==============================] - 159s 399ms/step - loss: 0.3287 - accuracy: 0.8581 - val_loss: 0.4148 - val_accuracy: 0.8128
Epoch 8/8
400/400 [==============================] - 159s 396ms/step - loss: 0.3042 - accuracy: 0.8698 - val_loss: 0.4672 - val_accuracy: 0.8118
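The log above comes from the LSTM classifier. As a reminder of what one LSTM time step computes, here is a pure-Python, single-unit toy cell (scalar weights in a dict; this is an illustration of the recurrence, not the project's Keras model):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W):
    """One step of a 1-unit LSTM. W holds scalar weights/biases."""
    f = sigmoid(W["wf_x"] * x + W["wf_h"] * h_prev + W["bf"])   # forget gate
    i = sigmoid(W["wi_x"] * x + W["wi_h"] * h_prev + W["bi"])   # input gate
    g = math.tanh(W["wg_x"] * x + W["wg_h"] * h_prev + W["bg"]) # candidate state
    o = sigmoid(W["wo_x"] * x + W["wo_h"] * h_prev + W["bo"])   # output gate
    c = f * c_prev + i * g        # new cell state
    h = o * math.tanh(c)          # new hidden state
    return h, c
```

The gating is why the LSTM run above improves steadily but slowly: each step mixes old cell state (via f) with new evidence (via i and g).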

CNN, simplified single-layer

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 100, 128)          4291968   
_________________________________________________________________
conv1d (Conv1D)              (None, 96, 128)           82048     
_________________________________________________________________
global_max_pooling1d (Global (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 64)                8256      
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 65        
=================================================================
Total params: 4,382,337
Trainable params: 90,369
Non-trainable params: 4,291,968
_________________________________________________________________
history_cnn = model_cnn.fit(
    x=X_train,
    y=Y_train,
    validation_data=(X_test, Y_test),
    batch_size=50,
    epochs=8
)
Epoch 1/8
400/400 [==============================] - 13s 32ms/step - loss: 0.6334 - accuracy: 0.6299 - val_loss: 0.6930 - val_accuracy: 0.6230
Epoch 2/8
400/400 [==============================] - 12s 30ms/step - loss: 0.4844 - accuracy: 0.7604 - val_loss: 0.5782 - val_accuracy: 0.7108
Epoch 3/8
400/400 [==============================] - 13s 33ms/step - loss: 0.4090 - accuracy: 0.8151 - val_loss: 0.4994 - val_accuracy: 0.7568
Epoch 4/8
400/400 [==============================] - 14s 35ms/step - loss: 0.3438 - accuracy: 0.8516 - val_loss: 0.5430 - val_accuracy: 0.7540
Epoch 5/8
400/400 [==============================] - 13s 31ms/step - loss: 0.2844 - accuracy: 0.8819 - val_loss: 0.5320 - val_accuracy: 0.7632
Epoch 6/8
400/400 [==============================] - 13s 32ms/step - loss: 0.2306 - accuracy: 0.9045 - val_loss: 0.5851 - val_accuracy: 0.7618
Epoch 7/8
400/400 [==============================] - 12s 31ms/step - loss: 0.1518 - accuracy: 0.9435 - val_loss: 0.6787 - val_accuracy: 0.7572
Epoch 8/8
400/400 [==============================] - 14s 35ms/step - loss: 0.1197 - accuracy: 0.9592 - val_loss: 0.7426 - val_accuracy: 0.7536
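The summary above omits the Conv1D kernel size, but the output length shrinking from 100 to 96 implies kernel_size = 5 with 'valid' padding (100 - 5 + 1 = 96). The parameter counts can then be checked by hand (the vocabulary-size inference on the last line is an assumption):

```python
# Conv1D params = kernel_size * input_channels * filters + filters (bias)
conv1d_params = 5 * 128 * 128 + 128   # matches the 82048 in the summary
dense_2_params = 128 * 64 + 64        # matches 8256
dense_3_params = 64 * 1 + 1           # matches 65
# frozen embedding: rows * 128 = 4291968, so presumably vocab size + 1 rows
embedding_rows = 4291968 // 128
```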

CNN, standard parallel (multi-kernel)

Results still in progress.

Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_2 (InputLayer)            [(None, 100)]        0                                            
__________________________________________________________________________________________________
embedding_4 (Embedding)         (None, 100, 128)     4292096     input_2[0][0]                    
__________________________________________________________________________________________________
conv1d_6 (Conv1D)               (None, 100, 128)     49280       embedding_4[0][0]                
__________________________________________________________________________________________________
conv1d_7 (Conv1D)               (None, 100, 128)     65664       embedding_4[0][0]                
__________________________________________________________________________________________________
conv1d_8 (Conv1D)               (None, 100, 128)     82048       embedding_4[0][0]                
__________________________________________________________________________________________________
max_pooling1d_3 (MaxPooling1D)  (None, 1, 128)       0           conv1d_6[0][0]                   
__________________________________________________________________________________________________
max_pooling1d_4 (MaxPooling1D)  (None, 1, 128)       0           conv1d_7[0][0]                   
__________________________________________________________________________________________________
max_pooling1d_5 (MaxPooling1D)  (None, 1, 128)       0           conv1d_8[0][0]                   
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 1, 384)       0           max_pooling1d_3[0][0]            
                                                                 max_pooling1d_4[0][0]            
                                                                 max_pooling1d_5[0][0]            
__________________________________________________________________________________________________
flatten_1 (Flatten)             (None, 384)          0           concatenate_1[0][0]              
__________________________________________________________________________________________________
dropout_3 (Dropout)             (None, 384)          0           flatten_1[0][0]                  
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 3)            1155        dropout_3[0][0]                  
==================================================================================================
Total params: 4,490,243
Trainable params: 4,490,243
Non-trainable params: 0
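The three parallel Conv1D branches above (kernel widths 3/4/5, judging by the 49280/65664/82048 parameter counts) followed by max pooling and concatenation are the core TextCNN idea: convolve the same embedded sequence with several kernel widths, keep the strongest response per branch, then concatenate. A single-channel toy sketch of that idea (plain lists, not the Keras model above):

```python
def conv1d_same(seq, kernel):
    """1-channel Conv1D with 'same' padding over a list of floats."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + seq + [0.0] * (k - 1 - pad)
    return [sum(kernel[j] * padded[t + j] for j in range(k))
            for t in range(len(seq))]

def textcnn_features(seq, kernels):
    """One max-over-time feature per kernel, concatenated -- mirrors the
    parallel Conv1D -> MaxPooling1D -> Concatenate branches above."""
    return [max(conv1d_same(seq, kernel)) for kernel in kernels]
```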

CNN, stacked serial layers

Not yet implemented.


from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Conv1D, MaxPooling1D, Flatten,
                                     Dropout, BatchNormalization, Dense)

model = Sequential()
# Embedding layer maps each word index to its pre-trained word vector
model.add(Embedding(len(vocab) + 1, 500, input_length=INPUT_SIZE,
                    weights=[embeding_matrix], trainable=True))
model.add(Conv1D(256, 5, padding='same'))
model.add(MaxPooling1D(3, 3, padding='same'))
model.add(Conv1D(128, 5, padding='same'))
model.add(MaxPooling1D(3, 3, padding='same'))
model.add(Conv1D(64, 3, padding='same'))
model.add(Flatten())
model.add(Dropout(0.1))
model.add(BatchNormalization())  # batch normalization layer
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(NUM_CLASS, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print('Train...')
model.fit(X_train, y_train,
          batch_size=32,
          epochs=10,
          validation_data=(X_test, y_test))
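Assuming INPUT_SIZE = 100 as in the summaries above, the two MaxPooling1D(3, 3, padding='same') layers shrink the sequence length as follows, which fixes the feature count Flatten would emit:

```python
import math

length = 100  # assumed INPUT_SIZE, matching the (None, 100) shapes above
for pool_size in (3, 3):  # the two MaxPooling1D(3, 3, padding='same') layers
    length = math.ceil(length / pool_size)  # 'same' padding rounds up

flat_features = length * 64  # the last Conv1D has 64 filters
# sequence length: 100 -> 34 -> 12, so Flatten outputs 12 * 64 = 768 features
```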

https://gitee.com/Tony36051/py-101667-word2vec-textcnn-lstm.git