# ModuleKG

**Repository Path**: sun__ye/module-kg

## Basic Information

- **Project Name**: ModuleKG
- **Description**: An entity-relation network of PyTorch models, for statistical analysis of the hierarchical structure of model construction
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2022-05-28
- **Last Updated**: 2022-08-15

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Module entity-relation network

- model:
  - stages[]
    - block[]
      - op: 10+ kinds, e.g. conv, relu, mm

1. Should partitioning of an existing graph be provided, similar to the stages above?
2. Which (social/model network) analysis capabilities should be offered? Demos, best practices?
3. Can this really represent model networks? Many models use a small set of fixed ops, but organize those ops into blocks.

## Concept

To describe how modules compose one another, we adopt a property-graph architecture: vertices represent entities and edges represent relations, and both vertices and edges come in several types. A type is identified by its attribute set, i.e. vertices with the same (or similar) attribute set belong to the same vertex type, and likewise for edges.

| **Type** | **Node/Edge** | **Description** | **Attribute set** |
| ------------ | ------------------- | ------------------------------------------------------------ | -------------------------------- |
| Module | node | A model, or a module that composes a model (model → stage → block → op); corresponds to torch.nn.Module | (info) name, op type, hyperparameters; (training) accuracy, convergence speed, usage count (statistics); (computed) parameter count, model size |
| contains | edge | Directed, Module → Module: which submodules a module contains; may point to the same entity | order |
| isA | edge | Directed, Module → Module: inheritance relation between modules | |
| DataSet | node | A dataset | cluster path, size, description |
| Job/Training | node | One training run | accuracy, convergence speed, run count |
| User | node | A user | |

In addition there are the relations User → Job, Job → DataSet, and Job → Module (a code sketch of this schema appears after the task lists below).

Tasks:

- Model hierarchy: find the subtree under a Module node to recover the model's tree structure.
- Statistics and ranking: analyze Module usage statistically, so that we know which ops have been used most often recently.
- Importance: collect Module usage, build a Module "social network", and rank Module (op) importance by node centrality.
- Module inheritance: many modules are essentially identical except for changed parameters, so an inheritance relation exists between modules, expressed by isA.

Possible future tasks:

- Accuracy prediction: use a GNN over the connectivity of existing nodes to predict model attributes (such as accuracy).
- Model optimization and generation: given a dataset and an objective on it, generate a Module that solves the task.
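As a concrete sketch of the schema above, here is a minimal, hypothetical example using networkx. The node names, attribute values, and the `submits`/`uses`/`trains` edge labels are invented for illustration; the repository does not necessarily use networkx.

```python
import networkx as nx

# Property graph: both nodes and edges carry a type plus an attribute set.
g = nx.MultiDiGraph()

# Module nodes -- the attribute set is what identifies the type.
g.add_node("Bottleneck968", kind="Module",
           info={"class": "torchvision.models.resnet.Bottleneck", "stride": 1})
g.add_node("ReLU854", kind="Module",
           info={"class": "torch.nn.modules.activation.ReLU", "inplace": True})

# contains: directed Module -> Module, with an order attribute.
g.add_edge("Bottleneck968", "ReLU854", kind="contains", order=2)

# Other entity types and the User -> Job -> DataSet/Module relations.
g.add_node("imagenet", kind="DataSet", path="/data/imagenet", size="150GB")
g.add_node("job-001", kind="Job", accuracy=0.78)
g.add_node("alice", kind="User")
g.add_edge("alice", "job-001", kind="submits")         # User -> Job
g.add_edge("job-001", "imagenet", kind="uses")         # Job -> DataSet
g.add_edge("job-001", "Bottleneck968", kind="trains")  # Job -> Module

# Task sketches: recover the model tree by following contains edges,
children = [v for _, v, d in g.out_edges("Bottleneck968", data=True)
            if d["kind"] == "contains"]
# and estimate Module (op) importance via node centrality.
centrality = nx.degree_centrality(g)
print(children, centrality["ReLU854"])
```

The tasks in the lists above then map to standard graph queries: subtree traversal for hierarchy recovery, and centrality measures for importance ranking.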
## Hashing Modules

https://stackoverflow.com/questions/2909106/whats-a-correct-and-good-way-to-implement-hash
https://stackoverflow.com/questions/5884066/hashing-a-dictionary

To decide whether two Modules are identical, comparing their attributes directly is expensive. Instead, we compute a unique hash per Module: **hash the serialization of the module's own attributes, then shift each child module's hash left by its index and add them all together** (this scheme avoids collisions over a very large range). A node is named by its class name plus the last 3 digits of the hash code.

One problem came up here: hashing the same module in two different runs gives different results! The cause is that Python's built-in hash algorithm involves per-process randomization:

https://www.cnblogs.com/liangmingshen/p/13207765.html

We need a hash that is reproducible and consistent across processes, so we drop the built-in hash function and use hashlib's md5 digest instead. Computed this way, ReLU is always 854, in every setting.
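A minimal sketch of such a reproducible hash, assuming the scheme described above (serialize the module's own attributes, md5 the serialization, then fold in each child's hash shifted left by its index). The `stable_hash`/`node_name` helpers and the attribute filtering are illustrative choices, so the exact digits (e.g. 854 for ReLU) need not match the repository's output:

```python
import hashlib
import json

import torch.nn as nn

def stable_hash(module: nn.Module) -> int:
    """Reproducible, cross-process hash of a module's attribute set."""
    # Serialize the module's own (public) attributes deterministically.
    attrs = {k: repr(v) for k, v in vars(module).items()
             if not k.startswith("_")}
    attrs["class"] = type(module).__module__ + "." + type(module).__name__
    digest = hashlib.md5(json.dumps(attrs, sort_keys=True).encode()).hexdigest()
    h = int(digest, 16)
    # Fold in each child's hash, shifted left by its index, then sum.
    for idx, child in enumerate(module.children()):
        h += stable_hash(child) << idx
    return h

def node_name(module: nn.Module) -> str:
    # Node naming scheme: class name + last 3 digits of the hash code.
    return type(module).__name__ + str(stable_hash(module))[-3:]

# Unlike the built-in hash(), this yields the same value in every run/process.
print(node_name(nn.ReLU(inplace=True)))
```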
## Single-model KG visualization and statistics

For resnet152 there are module_count 423 modules in total, of which unique_count 50 are distinct.

Draw an edge from each Module to each of its sub-Modules to represent contains, as in the figure (to visualize: copy a portion of the draw_cs output in output/module.log into https://csacademy.com/app/graph_editor/).

![](https://n.sunie.top:9000/gallery/g1121/202205281533731.png)

Another visualization method uses a different force layout and draws a prettier relation graph (to visualize: copy a portion of the draw_graphgvis output in output/module.log into https://dreampuf.github.io/GraphvizOnline/).

![](https://n.sunie.top:9000/gallery/g1121/202205281558404.svg)

By deduplicating during construction while keeping counts, we can tally how often each hierarchical structure is used. Taking the top 10 modules as an example:

```log
BatchNorm2d328 76 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "256", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
ReLU676 51 {"class": "torch.nn.modules.activation.ReLU", "type": "torch.nn.modules.activation.ReLU", "training": "True", "inplace": "True"}
BatchNorm2d913 37 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "1024", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
Conv2d659 36 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "256", "out_channels": "1024", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
Conv2d587 35 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "1024", "out_channels": "256", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
Conv2d850 35 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "256", "out_channels": "256", "kernel_size": "(3, 3)", "stride": "(1, 1)", "padding": "(1, 1)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(1, 1, 1, 1)"}
Bottleneck968 35 {"class": "torchvision.models.resnet.Bottleneck", "type": "torchvision.models.resnet.Bottleneck", "training": "True", "stride": "1"}
BatchNorm2d977 16 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "128", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
BatchNorm2d453 15 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "512", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"}
Conv2d765 8 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "128", "out_channels": "512", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"}
```
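A sketch of how these deduplicated counts could be produced, reusing the hypothetical `node_name` helper from the hashing sketch above (the actual implementation lives in draw.py; exact names and counts depend on the hashing details):

```python
from collections import Counter

from torchvision.models import resnet152

def build_graph(module, nodes, edges, counts):
    """Recursively add contains edges, deduplicating nodes by stable hash."""
    name = node_name(module)  # from the hashing sketch above
    counts[name] += 1
    nodes.add(name)
    for child in module.children():
        edges.add((name, node_name(child)))  # a contains edge
        build_graph(child, nodes, edges, counts)

nodes, edges, counts = set(), set(), Counter()
build_graph(resnet152(), nodes, edges, counts)

print("module_count", sum(counts.values()))  # expected 423 for resnet152
print("unique_count", len(nodes))            # expected 50
for name, n in counts.most_common(10):       # the top-10 table above
    print(name, n)
```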
"training": "True", "num_features": "128", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"} BatchNorm2d453 15 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "512", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"} Conv2d765 8 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "128", "out_channels": "512", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"} ``` 由于不同模型的hash方式是一致的,具有可复现性,可以通过这种方式运行多个模型并展示在同一张图上,查看两个模型是否有交集,因此尝试torchvision中所有模型 ## 多个模型KG可视化及统计 见draw.py,一共使用了11个模型,如下 ```python models = [ resnet18(), resnet34(), resnet101(), resnet152(), googlenet(), vgg11_bn(), vgg13_bn(), vgg16(), vgg19(), squeezenet1_0(), squeezenet1_1() ] module_count 1444 unique_count 274 ``` ![](https://n.sunie.top:9000/gallery/g1121/202205281624934.svg) 从图中可以看出,存在大量节点被多个模型共用,尤其是ReLU,在任何模型中都是一致的。 同样可以统计出不同层级结构的使用次数,前10个模块为例: ```log ReLU854 218 {"class": "torch.nn.modules.activation.ReLU", "type": "torch.nn.modules.activation.ReLU", "training": "True", "inplace": "True"} BatchNorm2d187 148 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "256", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"} Conv2d578 78 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "256", "out_channels": "256", "kernel_size": "(3, 3)", "stride": "(1, 1)", "padding": "(1, 1)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(1, 1, 1, 1)"} BatchNorm2d255 61 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "1024", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"} Conv2d455 59 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "256", "out_channels": "1024", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"} Conv2d878 57 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "1024", "out_channels": "256", "kernel_size": "(1, 1)", "stride": "(1, 1)", "padding": "(0, 0)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(0, 0, 0, 0)"} Bottleneck304 57 {"class": "torchvision.models.resnet.Bottleneck", "type": "torchvision.models.resnet.Bottleneck", "training": "True", "stride": "1"} BatchNorm2d848 46 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "512", "eps": "1e-05", "momentum": "0.1", "affine": "True", "track_running_stats": "True"} BatchNorm2d630 41 {"class": "torch.nn.modules.batchnorm.BatchNorm2d", "type": "torch.nn.modules.batchnorm.BatchNorm2d", "training": "True", "num_features": "128", "eps": "1e-05", "momentum": "0.1", 
"affine": "True", "track_running_stats": "True"} Conv2d361 30 {"class": "torch.nn.modules.conv.Conv2d", "type": "torch.nn.modules.conv.Conv2d", "training": "True", "in_channels": "512", "out_channels": "512", "kernel_size": "(3, 3)", "stride": "(1, 1)", "padding": "(1, 1)", "dilation": "(1, 1)", "output_padding": "(0, 0)", "groups": "1", "padding_mode": "zeros", "_reversed_padding_repeated_twice": "(1, 1, 1, 1)"} ``` 如下图,普通分布具有很强的对数趋势: ![](./output/draw1.png) 在双对数坐标下,接近线性,说明Module数量呈现接近幂律分布 ![](./output/draw.png) ## Module分level分析