# ModuleKG
**Repository Path**: sun__ye/module-kg
## Basic Information
- **Project Name**: ModuleKG
- **Description**: An entity-relation network for PyTorch models; statistical analysis of the hierarchical structure of model construction
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2022-05-28
- **Last Updated**: 2022-08-15
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# Module Entity-Relation Network
- model:
  - stages[]
    - block[]
      - op: 10+ kinds, e.g. conv, relu, mm
1. Should we provide a partitioning of an existing graph, similar to sgates?
2. Which (social/model network) analysis capabilities should we provide? Demos, practices.
3. Can this really represent model networks? Many models use a small set of fixed ops, but the ops are organized into blocks.
## Design
To describe how modules are composed, we adopt a property-graph architecture: nodes represent entities and edges represent relations, and both nodes and edges come in multiple types. To distinguish these types, each type is characterized by a property set: nodes with the same (or similar) property sets are of the same node type, and edges with the same (or similar) property sets are of the same edge type.
| **Type Name** | **Node/Edge** | **Description** | **Property Set** |
| ------------ | ------------- | --------------- | ---------------- |
| Module | node | A model, or a module composing a model (model-stage-block-op), corresponding to torch.nn.Module | (info) name, op type, various hyperparameters<br>(training) accuracy, convergence speed, usage count (statistics)<br>(computed) parameter count, model size |
| contains | edge | Directed from Module to Module: which submodules it contains; may point to the same entity | order |
| isA | edge | Directed from Module to Module: represents inheritance between modules | |
| DataSet | node | A dataset | cluster path, size, description |
| Job/Training | node | A single training run | accuracy, convergence speed, run count |
| User | node | A user | |
In addition, there are relations User -> Job, Job -> DataSet, and Job -> Module.
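As a hypothetical illustration of this schema (the class names and fields below are assumptions for exposition, not the project's API):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    type: str                          # "Module", "DataSet", "Job", "User"
    props: dict = field(default_factory=dict)

@dataclass
class Edge:
    type: str                          # "contains", "isA", ...
    src: Node
    dst: Node
    props: dict = field(default_factory=dict)

# Example: a Bottleneck block contains a ReLU at position 2.
relu = Node("Module", {"class": "torch.nn.modules.activation.ReLU", "inplace": True})
block = Node("Module", {"class": "torchvision.models.resnet.Bottleneck"})
edge = Edge("contains", block, relu, {"order": 2})
```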
Tasks:
- Model hierarchy: locate subtrees through Module nodes to reconstruct a model's tree structure
- Statistics and ranking: analyze Module usage statistically, so we can know which ops are used most recently and most frequently
- Importance: count Module usage, build a Module "social network", and rank Module (op) importance by node centrality (see the sketch after this list)
- Module inheritance: many modules are essentially identical except for changed parameters, so inheritance relations exist between modules, represented by isA
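For the importance task, a minimal networkx sketch of centrality-based ranking; the edges below are purely illustrative, not real KG data:

```python
import networkx as nx

# A toy "contains" graph; in practice the edges come from the real KG.
G = nx.DiGraph()
G.add_edge("resnet152", "Bottleneck304")
G.add_edge("Bottleneck304", "Conv2d578")
G.add_edge("Bottleneck304", "ReLU854")
G.add_edge("vgg16", "ReLU854")

# Rank modules by degree centrality as a first proxy for importance.
for name, c in sorted(nx.degree_centrality(G).items(), key=lambda kv: -kv[1]):
    print(name, round(c, 3))
```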
Possible future tasks:
- Accuracy prediction: use a GNN over the existing node connectivity to predict model properties (e.g., accuracy)
- Model optimization and generation: given a dataset and a target defined on it, generate a Module that accomplishes the task
## Hashing Modules
https://stackoverflow.com/questions/2909106/whats-a-correct-and-good-way-to-implement-hash
https://stackoverflow.com/questions/5884066/hashing-a-dictionary
To decide whether two Modules are identical, comparing their properties directly is costly, so we compute a unique hash per Module instead: **hash the module's own serialized properties, then add each submodule's hash left-shifted by its index idx** (this scheme avoids collisions over a very large range). A node is named by its plain class name plus the last 3 digits of its hash code.
A problem arose here: hashing the same module in two separate runs gives different results! The cause is the randomization built into Python's hash algorithm:
https://www.cnblogs.com/liangmingshen/p/13207765.html
We need a hash that is reproducible and consistent across processes, so we abandon the built-in hash function and use hashlib's MD5 digest; with it, ReLU hashes to ...854 in every situation.
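A minimal sketch of this reproducible scheme (the attribute selection and serialization details are assumptions, not the project's exact code):

```python
import hashlib
import json
import torch.nn as nn

def module_hash(m: nn.Module) -> int:
    """Reproducible, cross-process hash: MD5 of the module's own serialized
    attributes, plus each child's hash left-shifted by the child's index."""
    # Serialize the module's own (non-child, non-private) attributes deterministically.
    attrs = {"class": f"{type(m).__module__}.{type(m).__name__}"}
    attrs.update({k: str(v) for k, v in m.__dict__.items()
                  if not k.startswith("_") and not isinstance(v, nn.Module)})
    digest = hashlib.md5(json.dumps(attrs, sort_keys=True).encode()).hexdigest()
    h = int(digest, 16)
    for idx, child in enumerate(m.children()):
        h += module_hash(child) << idx
    return h

def node_name(m: nn.Module) -> str:
    # Plain class name + last 3 digits of the hash code.
    return type(m).__name__ + str(module_hash(m))[-3:]

# Stable across runs and processes (exact digits depend on serialization details).
print(node_name(nn.ReLU(inplace=True)))
```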
## Single-Model KG Visualization and Statistics
For resnet152 there are module_count 423 modules in total, of which unique_count 50 are distinct.
An edge is drawn between each Module and its sub-Modules to represent contains, as shown below (to visualize: copy part of the draw_cs output in output/module.log into https://csacademy.com/app/graph_editor/).

Another visualization method uses a different force-directed layout and yields a prettier relation graph (to visualize: copy part of the draw_graphgvis output in output/module.log into https://dreampuf.github.io/GraphvizOnline/).

By deduplicating during construction and counting at the same time, we can tally how often each hierarchical structure is used. Taking the top 10 modules as an example:
```log
BatchNorm2d328 76 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "256", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
ReLU676 51 {"class": "torch.nn.modules.activation.ReLU", "type": "torch.nn.modules.activation.ReLU", "training": "True", "inplace": "True"}
BatchNorm2d913 37 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "1024", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
Conv2d659 36 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "256", "out_channels": "1024", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
Conv2d587 35 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "1024", "out_channels": "256", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
Conv2d850 35 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "256", "out_channels": "256", "kernel_size": "(3, 3)", "stride": "(1, 1)", "padding": "(1, 1)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(1, 1, 1, 1)"}
Bottleneck968 35 {"class": "torchvision.models.resnet.Bottleneck", "type": "torchvision.models.resnet.Bottleneck", "training": "True", "stride": "1"}
BatchNorm2d977 16 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "128", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
BatchNorm2d453 15 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "512", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
Conv2d765 8 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "128", "out_channels": "512", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
```
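A sketch of the deduplicated counting, again reusing node_name from the hashing sketch (the project's actual bookkeeping may differ):

```python
from collections import Counter
from torchvision.models import resnet152

def count_modules(root) -> Counter:
    """Count how many times each deduplicated module occurs in the tree."""
    counts = Counter()
    def walk(m):
        counts[node_name(m)] += 1
        for child in m.children():
            walk(child)
    walk(root)
    return counts

counts = count_modules(resnet152())
print("module_count", sum(counts.values()))  # cf. module_count 423 above
print("unique_count", len(counts))           # cf. unique_count 50 above
for name, n in counts.most_common(10):
    print(name, n)
```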
Since the hashing scheme is identical and reproducible across models, we can run several models and render them in the same graph to see whether two models overlap; we therefore tried the models in torchvision.
## Multi-Model KG Visualization and Statistics
See draw.py; 11 models were used in total:
```python
from torchvision.models import (
    resnet18, resnet34, resnet101, resnet152, googlenet,
    vgg11_bn, vgg13_bn, vgg16, vgg19, squeezenet1_0, squeezenet1_1,
)

models = [
    resnet18(),
    resnet34(),
    resnet101(),
    resnet152(),
    googlenet(),
    vgg11_bn(),
    vgg13_bn(),
    vgg16(),
    vgg19(),
    squeezenet1_0(),
    squeezenet1_1(),
]
# module_count 1444
# unique_count 274
```
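A sketch of how the merged statistics could be produced, reusing count_modules from the single-model section (an assumption about draw.py's internals, not its actual code):

```python
from collections import Counter

total = Counter()
for model in models:
    total.update(count_modules(model))   # shared hashes merge across models

print("module_count", sum(total.values()))  # cf. module_count 1444 above
print("unique_count", len(total))           # cf. unique_count 274 above
```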

The figure shows that many nodes are shared across multiple models; ReLU in particular is identical in every model.
Likewise, we can tally the usage counts of the hierarchical structures; the top 10 modules, for example:
```log
ReLU854 218 {"class": "torch.nn.modules.activation.ReLU", "type": "torch.nn.modules.activation.ReLU", "training": "True", "inplace": "True"}
BatchNorm2d187 148 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "256", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
Conv2d578 78 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "256", "out_channels": "256", "kernel_size": "(3, 3)", "stride": "(1, 1)", "padding": "(1, 1)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(1, 1, 1, 1)"}
BatchNorm2d255 61 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "1024", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
Conv2d455 59 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "256", "out_channels": "1024", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
Conv2d878 57 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "1024", "out_channels": "256", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
Bottleneck304 57 {"class": "torchvision.models.resnet.Bottleneck", "type": "torchvision.models.resnet.Bottleneck", "training": "True", "stride": "1"}
BatchNorm2d848 46 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "512", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
BatchNorm2d630 41 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "128", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
Conv2d361 30 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "512", "out_channels": "512", "kernel_size": "(3, 3)", "stride": "(1, 1)", "padding": "(1, 1)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(1, 1, 1, 1)"}
```
As the figure below shows, the distribution on linear axes has a strong logarithmic trend:

On log-log axes it is close to linear, indicating that Module usage counts roughly follow a power-law distribution:

## Module Analysis by Level