# ModuleKG
**Repository Path**: sun__ye/module-kg
## Basic Information
- **Project Name**: ModuleKG
- **Description**: An entity-relation network for PyTorch models; statistical analysis of the hierarchical structure of model construction
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2022-05-28
- **Last Updated**: 2022-08-15
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# Module Entity-Relation Network
- model:
  - stages[]
    - block[]
      - op: 10+ kinds, e.g. conv, relu, mm
1. Should we provide a partitioning of an existing graph, similar to sgates?
2. Which (social/model network) analysis capabilities should we provide? Demos, practices.
3. Can this really represent model networks? Many models use a small set of fixed ops, but the ops are organized into blocks.
## Design
To describe how modules are composed, we adopt a property-graph architecture: nodes represent entities and edges represent relations, and both nodes and edges come in multiple types. To distinguish these types, each type is characterized by a property set: nodes with the same (or similar) property sets are of the same node type, and edges with the same (or similar) property sets are of the same edge type.
| **Type Name** | **Node/Edge** | **Description** | **Property Set** |
| ------------ | ------------- | --------------- | ---------------- |
| Module | node | A model, or a module composing a model (model-stage-block-op), corresponding to torch.nn.Module | (info) name, op type, various hyperparameters<br>(training) accuracy, convergence speed, usage count (statistics)<br>(computed) parameter count, model size |
| contains | edge | Directed from Module to Module: which submodules it contains; may point to the same entity | order |
| isA | edge | Directed from Module to Module: represents inheritance between modules | |
| DataSet | node | A dataset | cluster path, size, description |
| Job/Training | node | A single training run | accuracy, convergence speed, run count |
| User | node | A user | |
In addition, there are relations User -> Job, Job -> DataSet, and Job -> Module.
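As a hypothetical illustration of this schema (the class names and fields below are assumptions for exposition, not the project's API):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    type: str                          # "Module", "DataSet", "Job", "User"
    props: dict = field(default_factory=dict)

@dataclass
class Edge:
    type: str                          # "contains", "isA", ...
    src: Node
    dst: Node
    props: dict = field(default_factory=dict)

# Example: a Bottleneck block contains a ReLU at position 2.
relu = Node("Module", {"class": "torch.nn.modules.activation.ReLU", "inplace": True})
block = Node("Module", {"class": "torchvision.models.resnet.Bottleneck"})
edge = Edge("contains", block, relu, {"order": 2})
```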
Tasks:
- Model hierarchy: locate subtrees through Module nodes to reconstruct a model's tree structure
- Statistics and ranking: analyze Module usage statistically, so we can know which ops are used most recently and most frequently
- Importance: count Module usage, build a Module "social network", and rank Module (op) importance by node centrality (see the sketch after this list)
- Module inheritance: many modules are essentially identical except for changed parameters, so inheritance relations exist between modules, represented by isA
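For the importance task, a minimal networkx sketch of centrality-based ranking; the edges below are purely illustrative, not real KG data:

```python
import networkx as nx

# A toy "contains" graph; in practice the edges come from the real KG.
G = nx.DiGraph()
G.add_edge("resnet152", "Bottleneck304")
G.add_edge("Bottleneck304", "Conv2d578")
G.add_edge("Bottleneck304", "ReLU854")
G.add_edge("vgg16", "ReLU854")

# Rank modules by degree centrality as a first proxy for importance.
for name, c in sorted(nx.degree_centrality(G).items(), key=lambda kv: -kv[1]):
    print(name, round(c, 3))
```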
Possible future tasks:
- Accuracy prediction: use a GNN over the existing node connectivity to predict model properties (e.g., accuracy)
- Model optimization and generation: given a dataset and a target defined on it, generate a Module that accomplishes the task
## Hashing Modules
https://stackoverflow.com/questions/2909106/whats-a-correct-and-good-way-to-implement-hash
https://stackoverflow.com/questions/5884066/hashing-a-dictionary
To decide whether two Modules are identical, comparing their properties directly is costly, so we compute a unique hash per Module instead: **hash the module's own serialized properties, then add each submodule's hash left-shifted by its index idx** (this scheme avoids collisions over a very large range). A node is named by its plain class name plus the last 3 digits of its hash code.
A problem arose here: hashing the same module in two separate runs gives different results! The cause is the randomization built into Python's hash algorithm:
https://www.cnblogs.com/liangmingshen/p/13207765.html
We need a hash that is reproducible and consistent across processes, so we abandon the built-in hash function and use hashlib's MD5 digest; with it, ReLU hashes to ...854 in every situation.
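A minimal sketch of this reproducible scheme (the attribute selection and serialization details are assumptions, not the project's exact code):

```python
import hashlib
import json
import torch.nn as nn

def module_hash(m: nn.Module) -> int:
    """Reproducible, cross-process hash: MD5 of the module's own serialized
    attributes, plus each child's hash left-shifted by the child's index."""
    # Serialize the module's own (non-child, non-private) attributes deterministically.
    attrs = {"class": f"{type(m).__module__}.{type(m).__name__}"}
    attrs.update({k: str(v) for k, v in m.__dict__.items()
                  if not k.startswith("_") and not isinstance(v, nn.Module)})
    digest = hashlib.md5(json.dumps(attrs, sort_keys=True).encode()).hexdigest()
    h = int(digest, 16)
    for idx, child in enumerate(m.children()):
        h += module_hash(child) << idx
    return h

def node_name(m: nn.Module) -> str:
    # Plain class name + last 3 digits of the hash code.
    return type(m).__name__ + str(module_hash(m))[-3:]

# Stable across runs and processes (exact digits depend on serialization details).
print(node_name(nn.ReLU(inplace=True)))
```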
## Single-Model KG Visualization and Statistics
For resnet152 there are module_count 423 modules in total, of which unique_count 50 are distinct.
An edge is drawn between each Module and its sub-Modules to represent contains, as shown below (to visualize: copy part of the draw_cs output in output/module.log into https://csacademy.com/app/graph_editor/).

Another visualization method uses a different force-directed layout and yields a prettier relation graph (to visualize: copy part of the draw_graphgvis output in output/module.log into https://dreampuf.github.io/GraphvizOnline/).

By deduplicating during construction and counting at the same time, we can tally how often each hierarchical structure is used. Taking the top 10 modules as an example:
```log
BatchNorm2d328 76 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "256", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
ReLU676 51 {"class": "torch.nn.modules.activation.ReLU", "type": "torch.nn.modules.activation.ReLU", "training": "True", "inplace": "True"}
BatchNorm2d913 37 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "1024", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
Conv2d659 36 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "256", "out_channels": "1024", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
Conv2d587 35 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "1024", "out_channels": "256", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
Conv2d850 35 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "256", "out_channels": "256", "kernel_size": "(3, 3)", "stride": "(1, 1)", "padding": "(1, 1)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(1, 1, 1, 1)"}
Bottleneck968 35 {"class": "torchvision.models.resnet.Bottleneck", "type": "torchvision.models.resnet.Bottleneck", "training": "True", "stride": "1"}
BatchNorm2d977 16 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "128", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
BatchNorm2d453 15 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "512", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
Conv2d765 8 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "128", "out_channels": "512", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
```
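A sketch of the deduplicated counting, again reusing node_name from the hashing sketch (the project's actual bookkeeping may differ):

```python
from collections import Counter
from torchvision.models import resnet152

def count_modules(root) -> Counter:
    """Count how many times each deduplicated module occurs in the tree."""
    counts = Counter()
    def walk(m):
        counts[node_name(m)] += 1
        for child in m.children():
            walk(child)
    walk(root)
    return counts

counts = count_modules(resnet152())
print("module_count", sum(counts.values()))  # cf. module_count 423 above
print("unique_count", len(counts))           # cf. unique_count 50 above
for name, n in counts.most_common(10):
    print(name, n)
```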
Since the hashing scheme is identical and reproducible across models, we can run several models and render them in the same graph to see whether two models overlap; we therefore tried the models in torchvision.
## Multi-Model KG Visualization and Statistics
See draw.py; 11 models were used in total:
```python
from torchvision.models import (
    resnet18, resnet34, resnet101, resnet152, googlenet,
    vgg11_bn, vgg13_bn, vgg16, vgg19, squeezenet1_0, squeezenet1_1,
)

models = [
    resnet18(),
    resnet34(),
    resnet101(),
    resnet152(),
    googlenet(),
    vgg11_bn(),
    vgg13_bn(),
    vgg16(),
    vgg19(),
    squeezenet1_0(),
    squeezenet1_1(),
]
# module_count 1444
# unique_count 274
```
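A sketch of how the merged statistics could be produced, reusing count_modules from the single-model section (an assumption about draw.py's internals, not its actual code):

```python
from collections import Counter

total = Counter()
for model in models:
    total.update(count_modules(model))   # shared hashes merge across models

print("module_count", sum(total.values()))  # cf. module_count 1444 above
print("unique_count", len(total))           # cf. unique_count 274 above
```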

The figure shows that many nodes are shared across multiple models; ReLU in particular is identical in every model.
Likewise, we can tally the usage counts of the hierarchical structures; the top 10 modules, for example:
```log
ReLU854 218 {"class": "torch.nn.modules.activation.ReLU", "type": "torch.nn.modules.activation.ReLU", "training": "True", "inplace": "True"}
BatchNorm2d187 148 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "256", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
Conv2d578 78 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "256", "out_channels": "256", "kernel_size": "(3, 3)", "stride": "(1, 1)", "padding": "(1, 1)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(1, 1, 1, 1)"}
BatchNorm2d255 61 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "1024", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
Conv2d455 59 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "256", "out_channels": "1024", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
Conv2d878 57 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "1024", "out_channels": "256", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
Bottleneck304 57 {"class": "torchvision.models.resnet.Bottleneck", "type": "torchvision.models.resnet.Bottleneck", "training": "True", "stride": "1"}
BatchNorm2d848 46 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "512", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
BatchNorm2d630 41 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "128", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
Conv2d361 30 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "512", "out_channels": "512", "kernel_size": "(3, 3)", "stride": "(1, 1)", "padding": "(1, 1)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(1, 1, 1, 1)"}
```
As the figure below shows, the distribution on linear axes has a strong logarithmic trend:

On log-log axes it is close to linear, indicating that Module usage counts roughly follow a power-law distribution:

## Module Analysis by Level