# name_classification

**Repository Path**: lucasliu71/name_classification

## Basic Information

- **Project Name**: name_classification
- **Description**: Name classification prediction project
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-05-27
- **Last Updated**: 2025-06-27

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Name Classifier Training & Prediction

## Install dependencies

```bash
pip install -r requirements.txt
```

## Model selection

The project uses three models: RNN, LSTM, and GRU.

### Data format

```
Abl Czech
Adsit Czech
Ajdrna Czech
Alt Czech
Antonowitsch Czech
Antonowitz Czech
Bacon Czech
Ballalatak Czech
Ballaltick Czech
Bartonova Czech
Bastl Czech
Baroch Czech
...
```

### Define the character set

The character set is `ascii_letters` from Python's `string` module plus a space and the punctuation `.,;'`, which may appear in English names: `abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ .,;'`, 57 characters in total. Each name is converted to a one-hot encoding over these characters.

Countries used as labels: Arabic, Chinese, Czech, Dutch, English, French, German, Greek, Irish, Italian, Japanese, Korean, Polish, Portuguese, Russian, Scottish, Spanish, Vietnamese

### Build the RNN model

```python
from torch import Tensor, zeros
from torch.nn import Linear, LogSoftmax, Module, RNN


class RNNRebuild(Module):
    def __init__(self, input_size: int, hidden_size: int, output_size: int,
                 num_layers: int = 1, batch_first: bool = False):
        """
        RNN model for name classification, built on PyTorch's RNN layer

        Args:
            input_size: The number of expected features in the input
            hidden_size: The number of features in the hidden state
            output_size: The number of output features
            num_layers: The number of recurrent layers
            batch_first: If True, tensors are laid out as (batch, seq, feature);
                forward() assumes the default False
        """
        super(RNNRebuild, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.num_layers = num_layers
        self.batch_first = batch_first
        self.rnn = RNN(
            input_size=self.input_size,
            hidden_size=self.hidden_size,
            num_layers=self.num_layers,
            batch_first=self.batch_first
        ).to('cuda')
        self.linear = Linear(self.hidden_size, self.output_size).to('cuda')
        self.softmax = LogSoftmax(dim=-1).to('cuda')

    def forward(self, inputs: Tensor, hidden: Tensor) -> tuple[Tensor, Tensor]:
        # (seq_len, input_size) -> (seq_len, 1, input_size): a batch of one name
        inputs = inputs.unsqueeze(1)
        rr, hn = self.rnn(inputs, hidden)
        # Classify from the output at the last time step
        tmp_rr = rr[-1]
        tmp_rr = self.linear(tmp_rr)
        return self.softmax(tmp_rr), hn

    def init_hidden(self) -> Tensor:
        return zeros(self.num_layers, 1, self.hidden_size).to('cuda')
```

### Build the LSTM model

```python
from torch import Tensor, zeros
from torch.nn import Linear, LogSoftmax, LSTM, Module


class LSTMRebuild(Module):
    def __init__(self, input_size: int, hidden_size: int, output_size: int,
                 num_layers: int = 1, batch_first: bool = False):
        """
        LSTM model for name classification, built on PyTorch's LSTM layer

        Args:
            input_size: The number of expected features in the input
            hidden_size: The number of features in the hidden state
            output_size: The number of output features
            num_layers: The number of recurrent layers
            batch_first: If True, tensors are laid out as (batch, seq, feature);
                forward() assumes the default False
        """
        super(LSTMRebuild, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.num_layers = num_layers
        self.batch_first = batch_first
        self.lstm = LSTM(
            input_size=self.input_size,
            hidden_size=self.hidden_size,
            num_layers=self.num_layers,
            batch_first=self.batch_first
        ).to('cuda')
        self.linear = Linear(self.hidden_size, self.output_size).to('cuda')
        self.softmax = LogSoftmax(dim=-1).to('cuda')

    def forward(self, inputs: Tensor, hidden: Tensor, c: Tensor) -> tuple[Tensor, Tensor, Tensor]:
        # (seq_len, input_size) -> (seq_len, 1, input_size): a batch of one name
        inputs = inputs.unsqueeze(1)
        rr, (hn, cn) = self.lstm(inputs, (hidden, c))
        # Classify from the output at the last time step
        tmp_rr = rr[-1]
        tmp_rr = self.linear(tmp_rr)
        return self.softmax(tmp_rr), hn, cn

    def init_hidden(self) -> tuple[Tensor, Tensor]:
        # The LSTM needs both a hidden state and a cell state
        hidden = zeros(self.num_layers, 1, self.hidden_size).to('cuda')
        c = zeros(self.num_layers, 1, self.hidden_size).to('cuda')
        return hidden, c
```

### Build the GRU model

```python
from torch import Tensor, zeros
from torch.nn import GRU, Linear, LogSoftmax, Module


class GRURebuild(Module):
    def __init__(self, input_size: int, hidden_size: int, output_size: int,
                 num_layers: int = 1, batch_first: bool = False):
        """
        GRU model for name classification, built on PyTorch's GRU layer

        Args:
            input_size: The number of expected features in the input
            hidden_size: The number of features in the hidden state
            output_size: The number of output features
            num_layers: The number of recurrent layers
            batch_first: If True, tensors are laid out as (batch, seq, feature);
                forward() assumes the default False
        """
        super(GRURebuild, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.num_layers = num_layers
        self.batch_first = batch_first
        self.gru = GRU(
            input_size=self.input_size,
            hidden_size=self.hidden_size,
            num_layers=self.num_layers,
            batch_first=self.batch_first
        ).to('cuda')
        self.linear = Linear(self.hidden_size, self.output_size).to('cuda')
        self.softmax = LogSoftmax(dim=-1).to('cuda')

    def forward(self, inputs: Tensor, hidden: Tensor) -> tuple[Tensor, Tensor]:
        # (seq_len, input_size) -> (seq_len, 1, input_size): a batch of one name
        inputs = inputs.unsqueeze(1)
        rr, hn = self.gru(inputs, hidden)
        # Classify from the output at the last time step
        tmp_rr = rr[-1]
        tmp_rr = self.linear(tmp_rr)
        return self.softmax(tmp_rr), hn

    def init_hidden(self) -> Tensor:
        return zeros(self.num_layers, 1, self.hidden_size).to('cuda')
```

### Define the Name Classification Dataset

```python
from torch import Tensor, long, tensor, zeros
from torch.utils.data import Dataset

# LETTERS (the 57-character set) and COUNTRIES (the label list) are the
# module-level constants described above.


class NameClassDataset(Dataset):
    def __init__(self, names: list[str], countries: list[str]):
        """
        Initialize the dataset with a list of names and a list of countries

        Args:
            names: A list of names
            countries: A list of countries
        """
        super(NameClassDataset, self).__init__()
        self.names = names
        self.countries = countries
        self.num_names = len(self.names)

    def __len__(self) -> int:
        return self.num_names

    def __getitem__(self, idx: int) -> tuple[Tensor, Tensor]:
        # Clamp the index into the valid range
        idx = min(max(idx, 0), self.num_names - 1)
        name = self.names[idx]
        country = self.countries[idx]
        # One row per character, one column per entry in LETTERS
        tensor_name = zeros(len(name), len(LETTERS)).to('cuda')
        tensor_country = tensor(COUNTRIES.index(country), dtype=long).to('cuda')
        for pos, letter in enumerate(name):
            # Note: find() returns -1 for a character outside LETTERS,
            # which sets the last column instead of raising an error
            tensor_name[pos][LETTERS.find(letter)] = 1
        return tensor_name, tensor_country
```

## Model training

![rnn_lstm_gru_train](assets/rnn_lstm_gru_train.png)

Training used two learning rates, `0.001` and `0.0001`, with 10 rounds of training each.

Learning rate = `0.001`:

* RNN accuracy: `0.6999311701081613`, training time 457.87 s
* LSTM accuracy: `0.865774788241156`, training time 422.13 s
* GRU accuracy: `0.8664374688589935`, training time 523.64 s

Learning rate = `0.0001`:

* RNN accuracy: `0.7715752741774676`, training time 569.19 s
* LSTM accuracy: `0.7445839561534628`, training time 428.09 s
* GRU accuracy: `0.7696811160936722`, training time 387.49 s

## Comparison of loss, accuracy, and time

### Loss comparison

> [!NOTE]
>
> At a learning rate of `0.001`, the RNN model's loss stopped decreasing once it reached about 1.01.
>
> At a learning rate of `0.0001`, the loss of the RNN, LSTM, and GRU models all decreased steadily.

### Accuracy comparison

> [!NOTE]
>
> At a learning rate of `0.001`, the LSTM and GRU models reached higher training accuracy than the RNN model.
>
> At a learning rate of `0.0001`, the RNN and GRU models reached higher training accuracy than the LSTM model.

### Time comparison

> [!NOTE]
>
> At a learning rate of `0.001`, the GRU model took longer to train than the RNN and LSTM models, and the LSTM model was the fastest.
>
> At a learning rate of `0.0001`, the RNN model took longer to train than the LSTM and GRU models, and the GRU model was the fastest.

## Model prediction

Prediction result figures: `model_predict_0.001` (learning rate `0.001`) and `model_predict_0.0001` (learning rate `0.0001`).
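
The README describes the 57-character set, the one-hot encoding of names, and models that end in `LogSoftmax`, but it does not show the prediction code itself. The sketch below illustrates, in plain Python without PyTorch, how a name could be encoded and how a vector of log-probabilities could be turned into ranked country guesses. The helper names `one_hot_name` and `top_k_countries` are hypothetical and not taken from the repository, and the order of `COUNTRIES` is assumed to follow the list given in this README.

```python
import math
from string import ascii_letters

# Character set and label list as described in this README.
# The label order is an assumption; the repository may order them differently.
LETTERS = ascii_letters + " .,;'"
COUNTRIES = [
    "Arabic", "Chinese", "Czech", "Dutch", "English", "French",
    "German", "Greek", "Irish", "Italian", "Japanese", "Korean",
    "Polish", "Portuguese", "Russian", "Scottish", "Spanish", "Vietnamese",
]


def one_hot_name(name: str) -> list[list[int]]:
    """Encode a name as a len(name) x len(LETTERS) one-hot matrix (hypothetical helper)."""
    rows = []
    for letter in name:
        index = LETTERS.find(letter)
        if index < 0:
            # Reject characters outside the character set instead of
            # silently writing to the last column.
            raise ValueError(f"unsupported character: {letter!r}")
        row = [0] * len(LETTERS)
        row[index] = 1
        rows.append(row)
    return rows


def top_k_countries(log_probs: list[float], k: int = 3) -> list[tuple[str, float]]:
    """Rank LogSoftmax outputs and return the k best (country, probability) pairs (hypothetical helper)."""
    ranked = sorted(range(len(log_probs)), key=lambda i: log_probs[i], reverse=True)
    # exp() converts a log-probability back to a probability
    return [(COUNTRIES[i], math.exp(log_probs[i])) for i in ranked[:k]]


encoded = one_hot_name("Abl")
print(len(encoded), len(encoded[0]))  # 3 57
```

With a trained model, the last-step output of `forward()` converted to a Python list (for example via `.squeeze(0).tolist()`) could be passed to `top_k_countries` to list the most likely countries for a name.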