# name_classification
**Repository Path**: lucasliu71/name_classification
## Basic Information
- **Project Name**: name_classification
- **Description**: Name classification prediction project
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-05-27
- **Last Updated**: 2025-06-27
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# Name Classifier: Training & Prediction
## Installing Dependencies
```bash
pip install -r requirements.txt
```
## Model Selection
Three recurrent architectures are compared: RNN, LSTM, and GRU.
### Data Format
Each line of the dataset pairs a surname with its country of origin:
```
Abl Czech
Adsit Czech
Ajdrna Czech
Alt Czech
Antonowitsch Czech
Antonowitz Czech
Bacon Czech
Ballalatak Czech
Ballaltick Czech
Bartonova Czech
Bastl Czech
Baroch Czech
...
```
### Defining the Character Set
The vocabulary is Python's `string.ascii_letters` plus the space and the punctuation `.,;'` that may appear in romanized names:
`abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ .,;'`, 57 characters in total. Each name is one-hot encoded over this vocabulary.
The 18 target countries are: Arabic, Chinese, Czech, Dutch, English, French, German, Greek, Irish, Italian, Japanese, Korean, Polish, Portuguese, Russian, Scottish, Spanish, Vietnamese.
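As a minimal sketch of this encoding (pure Python, no PyTorch; the constant name `LETTERS` mirrors the one used in the project's code below), a name maps to a matrix with one one-hot row per character:

```python
import string

# The 57-character vocabulary described above
LETTERS = string.ascii_letters + " .,;'"


def one_hot(name: str) -> list[list[int]]:
    """Encode a name as a list of one-hot rows, one per character."""
    rows = []
    for ch in name:
        row = [0] * len(LETTERS)
        row[LETTERS.index(ch)] = 1  # exactly one 1 per row
        rows.append(row)
    return rows


matrix = one_hot("Abl")  # 3 rows of length 57
assert len(LETTERS) == 57
```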
### RNN Model
```python
from torch import Tensor, zeros
from torch.nn import Module, RNN, Linear, LogSoftmax


class RNNRebuild(Module):
    def __init__(self, input_size: int, hidden_size: int, output_size: int,
                 num_layers: int = 1, batch_first: bool = False):
        """
        RNN model for name classification, built on top of
        the RNN module from PyTorch
        Args:
            input_size: The number of expected features in the input
            hidden_size: The number of features in the hidden state
            output_size: The number of output classes
            num_layers: The number of recurrent layers
            batch_first: If True, inputs are expected as (batch, seq, feature)
        """
        super(RNNRebuild, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.num_layers = num_layers
        self.batch_first = batch_first
        self.rnn = RNN(
            input_size=self.input_size,
            hidden_size=self.hidden_size,
            num_layers=self.num_layers,
            batch_first=self.batch_first
        ).to('cuda')
        self.linear = Linear(self.hidden_size, self.output_size).to('cuda')
        self.softmax = LogSoftmax(dim=-1).to('cuda')

    def forward(self, inputs: Tensor,
                hidden: Tensor) -> tuple[Tensor, Tensor]:
        # (seq_len, input_size) -> (seq_len, 1, input_size): add a batch dim
        inputs = inputs.unsqueeze(1)
        rr, hn = self.rnn(inputs, hidden)
        tmp_rr = rr[-1]  # output at the last time step
        tmp_rr = self.linear(tmp_rr)
        return self.softmax(tmp_rr), hn

    def init_hidden(self) -> Tensor:
        return zeros(self.num_layers, 1, self.hidden_size).to('cuda')
```
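A minimal CPU sketch of the same forward pass using PyTorch's built-in `RNN` (the 57 input and 18 output dimensions come from the text above; the hidden size of 128 is an illustrative choice, and the `.to('cuda')` calls are dropped so the sketch runs anywhere):

```python
import torch
from torch import nn

n_letters, n_hidden, n_classes = 57, 128, 18
rnn = nn.RNN(input_size=n_letters, hidden_size=n_hidden, num_layers=1)
linear = nn.Linear(n_hidden, n_classes)

name = torch.zeros(5, n_letters)      # one-hot matrix for a 5-letter name
hidden = torch.zeros(1, 1, n_hidden)  # (num_layers, batch=1, hidden_size)

out, hn = rnn(name.unsqueeze(1), hidden)  # out: (seq_len, 1, hidden_size)
log_probs = torch.log_softmax(linear(out[-1]), dim=-1)  # (1, n_classes)
```

Taking `out[-1]` keeps only the last time step's output, so the classifier sees a summary of the whole name.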
### LSTM Model
```python
from torch import Tensor, zeros
from torch.nn import Module, LSTM, Linear, LogSoftmax


class LSTMRebuild(Module):
    def __init__(self, input_size: int, hidden_size: int, output_size: int,
                 num_layers: int = 1, batch_first: bool = False):
        """
        LSTM model for name classification, built on top of
        the LSTM module from PyTorch
        Args:
            input_size: The number of expected features in the input
            hidden_size: The number of features in the hidden state
            output_size: The number of output classes
            num_layers: The number of recurrent layers
            batch_first: If True, inputs are expected as (batch, seq, feature)
        """
        super(LSTMRebuild, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.num_layers = num_layers
        self.batch_first = batch_first
        self.lstm = LSTM(
            input_size=self.input_size,
            hidden_size=self.hidden_size,
            num_layers=self.num_layers,
            batch_first=self.batch_first
        ).to('cuda')
        self.linear = Linear(self.hidden_size, self.output_size).to('cuda')
        self.softmax = LogSoftmax(dim=-1).to('cuda')

    def forward(self, inputs: Tensor, hidden: Tensor,
                c: Tensor) -> tuple[Tensor, Tensor, Tensor]:
        # (seq_len, input_size) -> (seq_len, 1, input_size): add a batch dim
        inputs = inputs.unsqueeze(1)
        rr, (hn, cn) = self.lstm(inputs, (hidden, c))
        tmp_rr = rr[-1]  # output at the last time step
        tmp_rr = self.linear(tmp_rr)
        return self.softmax(tmp_rr), hn, cn

    def init_hidden(self) -> tuple[Tensor, Tensor]:
        hidden = zeros(self.num_layers, 1, self.hidden_size).to('cuda')
        c = zeros(self.num_layers, 1, self.hidden_size).to('cuda')
        return hidden, c
```
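The only interface difference from the RNN version is the extra cell state, which travels alongside the hidden state as a `(h, c)` tuple. A CPU sketch with the same illustrative dimensions:

```python
import torch
from torch import nn

lstm = nn.LSTM(input_size=57, hidden_size=128, num_layers=1)

name = torch.zeros(5, 57).unsqueeze(1)  # (seq_len, batch=1, input_size)
h0 = torch.zeros(1, 1, 128)             # initial hidden state
c0 = torch.zeros(1, 1, 128)             # initial cell state

out, (hn, cn) = lstm(name, (h0, c0))    # both states are returned
```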
### GRU Model
```python
from torch import Tensor, zeros
from torch.nn import Module, GRU, Linear, LogSoftmax


class GRURebuild(Module):
    def __init__(self, input_size: int, hidden_size: int, output_size: int,
                 num_layers: int = 1, batch_first: bool = False):
        """
        GRU model for name classification, built on top of
        the GRU module from PyTorch
        Args:
            input_size: The number of expected features in the input
            hidden_size: The number of features in the hidden state
            output_size: The number of output classes
            num_layers: The number of recurrent layers
            batch_first: If True, inputs are expected as (batch, seq, feature)
        """
        super(GRURebuild, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.num_layers = num_layers
        self.batch_first = batch_first
        self.gru = GRU(
            input_size=self.input_size,
            hidden_size=self.hidden_size,
            num_layers=self.num_layers,
            batch_first=self.batch_first
        ).to('cuda')
        self.linear = Linear(self.hidden_size, self.output_size).to('cuda')
        self.softmax = LogSoftmax(dim=-1).to('cuda')

    def forward(self, inputs: Tensor, hidden: Tensor) -> tuple[Tensor, Tensor]:
        # (seq_len, input_size) -> (seq_len, 1, input_size): add a batch dim
        inputs = inputs.unsqueeze(1)
        rr, hn = self.gru(inputs, hidden)
        tmp_rr = rr[-1]  # output at the last time step
        tmp_rr = self.linear(tmp_rr)
        return self.softmax(tmp_rr), hn

    def init_hidden(self) -> Tensor:
        return zeros(self.num_layers, 1, self.hidden_size).to('cuda')
```
### Name Classification Dataset
```python
from torch import Tensor, long, tensor, zeros
from torch.utils.data import Dataset

# LETTERS (the 57-character vocabulary) and COUNTRIES (the 18 class
# names) are module-level constants defined elsewhere in the project.


class NameClassDataset(Dataset):
    def __init__(self, names: list[str], countries: list[str]):
        """
        Initialize the dataset with a list
        of names and a list of countries
        Args:
            names: A list of names
            countries: A list of countries
        """
        super(NameClassDataset, self).__init__()
        self.names = names
        self.countries = countries
        self.num_names = len(self.names)

    def __len__(self) -> int:
        return self.num_names

    def __getitem__(self, idx: int) -> tuple[Tensor, Tensor]:
        idx = min(max(idx, 0), self.num_names - 1)  # clamp out-of-range indices
        name = self.names[idx]
        country = self.countries[idx]
        tensor_name = zeros(len(name), len(LETTERS)).to('cuda')
        tensor_country = tensor(COUNTRIES.index(country), dtype=long).to('cuda')
        for l, letter in enumerate(name):
            tensor_name[l][LETTERS.find(letter)] = 1  # one-hot encode each letter
        return tensor_name, tensor_country
```
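A hedged usage sketch of the same pattern (CPU-only, with an illustrative two-country subset of `COUNTRIES` so it is self-contained): because names vary in length, the dataset pairs naturally with a `DataLoader` at `batch_size=1`:

```python
import string

import torch
from torch.utils.data import Dataset, DataLoader

LETTERS = string.ascii_letters + " .,;'"
COUNTRIES = ['Czech', 'English']  # illustrative subset of the 18 classes


class TinyNameDataset(Dataset):
    """Same encoding as NameClassDataset, kept on CPU for illustration."""

    def __init__(self, names, countries):
        self.names, self.countries = names, countries

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        x = torch.zeros(len(name), len(LETTERS))
        for i, ch in enumerate(name):
            x[i][LETTERS.find(ch)] = 1
        y = torch.tensor(COUNTRIES.index(self.countries[idx]), dtype=torch.long)
        return x, y


loader = DataLoader(TinyNameDataset(['Abl', 'Bacon'], ['Czech', 'Czech']),
                    batch_size=1, shuffle=True)
for x, y in loader:
    pass  # x: (1, name_len, 57), y: (1,)
```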
## Training

Training used learning rates of `0.001` and `0.0001`, with each model trained for 10 epochs.
Learning rate `0.001`:
* RNN accuracy: `0.6999311701081613`, 457.87 s
* LSTM accuracy: `0.865774788241156`, 422.13 s
* GRU accuracy: `0.8664374688589935`, 523.64 s
Learning rate `0.0001`:
* RNN accuracy: `0.7715752741774676`, 569.19 s
* LSTM accuracy: `0.7445839561534628`, 428.09 s
* GRU accuracy: `0.7696811160936722`, 387.49 s
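A sketch of a single training step under this setup (per-name iteration; `NLLLoss` is the natural pairing for a `LogSoftmax` output, though the optimizer choice and CPU placement here are illustrative assumptions, not the project's exact loop):

```python
import torch
from torch import nn

n_letters, n_hidden, n_classes = 57, 128, 18
rnn = nn.RNN(n_letters, n_hidden, num_layers=1)
linear = nn.Linear(n_hidden, n_classes)
criterion = nn.NLLLoss()  # expects log-probabilities, as LogSoftmax produces
optimizer = torch.optim.SGD(
    list(rnn.parameters()) + list(linear.parameters()), lr=0.001)


def train_step(name_tensor, target):
    """One optimization step for a single (name, country) pair."""
    optimizer.zero_grad()
    hidden = torch.zeros(1, 1, n_hidden)
    out, _ = rnn(name_tensor.unsqueeze(1), hidden)
    log_probs = torch.log_softmax(linear(out[-1]), dim=-1)  # (1, n_classes)
    loss = criterion(log_probs, target.unsqueeze(0))        # target: () -> (1,)
    loss.backward()
    optimizer.step()
    return loss.item()


# A dummy 4-letter name (one-hot rows) labeled as class 2
name = torch.zeros(4, n_letters)
for i in range(4):
    name[i, i] = 1
loss = train_step(name, torch.tensor(2))
```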
## Loss, Accuracy, and Training-Time Comparison
### Loss Comparison
> [!NOTE]
>
> At learning rate `0.001`, the RNN's loss stops decreasing once it reaches about 1.01
>
> At learning rate `0.0001`, the RNN, LSTM, and GRU losses all decrease steadily
### Accuracy Comparison
> [!NOTE]
>
> At learning rate `0.001`, the LSTM and GRU models reach higher training accuracy than the RNN
>
> At learning rate `0.0001`, the RNN and GRU models reach higher training accuracy than the LSTM
### Training-Time Comparison
> [!NOTE]
>
> At learning rate `0.001`, the GRU takes longer than the RNN and LSTM, and the LSTM trains fastest
>
> At learning rate `0.0001`, the RNN takes longer than the LSTM and GRU, and the GRU trains fastest
## Prediction
