# Multilingual Knowledge Graph Completion with Self-Supervised Adaptive Graph Alignment (SS-AGA)

SS-AGA is a multilingual knowledge graph completion framework that transfers knowledge among multiple KG sources based on limited seed entity alignment. See our ACL 2022 paper [“**Multilingual Knowledge Graph Completion with Self-Supervised Adaptive Graph Alignment**”](https://arxiv.org/pdf/2203.14987.pdf) for more details.

This implementation of SS-AGA is based on the [PyTorch Geometric](https://github.com/rusty1s/pytorch_geometric) API. Our code is built upon the GitHub repository [**KEnS**](https://github.com/stasl0217/KEnS), and we thank the authors for making it public.

## Data

**DBP-5L**: A public dataset from https://github.com/stasl0217/KEnS.

**E-PKG**: A new industrial multilingual e-commerce product KG dataset.

Data format: each dataset contains the following files and folders:

- entity: Folder that contains the entity list for each KG.
- kg: Folder that contains the KG triple list (head_entity_index, relation_index, tail_entity_index) for each KG.
- seed_alignlinks: Folder that contains the seed entity alignment pair list between two KGs.
- relation.txt: File that contains the relations shared across all KGs.
- entity_embeddings.npy: NumPy file of mBERT embeddings for each entity from all KGs, with shape [Num_entity_all, 768]. We use **BERT-Base, Multilingual Cased** from https://github.com/google-research/bert/blob/master/multilingual.md to generate it.
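
The data layout above can be read with a few lines of Python. The following is a minimal sketch under assumptions this README does not state: that triple and alignment files hold one row of whitespace-separated integer indices per line. The helper names (`read_int_rows`, `load_triples`, `load_seed_alignments`, `load_entity_embeddings`) are hypothetical and are not part of the SS-AGA code; check the actual files before relying on it.

```python
from pathlib import Path
from typing import List, Tuple


def read_int_rows(path: Path, width: int) -> List[Tuple[int, ...]]:
    """Parse a text file with one row of integer indices per line.

    ASSUMPTION: fields are separated by whitespace (tabs or spaces).
    The README does not specify the delimiter.
    """
    rows = []
    with open(path, encoding="utf-8") as handle:
        for line_no, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:  # skip blank lines
                continue
            if len(fields) != width:
                raise ValueError(
                    f"{path}:{line_no}: expected {width} fields, got {len(fields)}"
                )
            rows.append(tuple(int(field) for field in fields))
    return rows


def load_triples(path) -> List[Tuple[int, ...]]:
    """Read (head_entity_index, relation_index, tail_entity_index) rows."""
    return read_int_rows(Path(path), 3)


def load_seed_alignments(path) -> List[Tuple[int, ...]]:
    """Read seed alignment pairs.

    ASSUMPTION: each row is a pair of integer entity indices, one per KG.
    """
    return read_int_rows(Path(path), 2)


def load_entity_embeddings(path, num_entities_all: int):
    """Load entity_embeddings.npy and check the documented shape."""
    # Imported here so the parsing helpers above work without NumPy installed.
    import numpy as np

    embeddings = np.load(path)
    if embeddings.shape != (num_entities_all, 768):
        raise ValueError(f"unexpected embedding shape {embeddings.shape}")
    return embeddings
```

Validating the column count and the embedding shape up front makes a delimiter or indexing mismatch fail loudly at load time rather than surface later as a confusing error during training.
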

You can download entity_embeddings.npy for the DBP-5L dataset from [here](https://drive.google.com/file/d/1-R_2lqS5AQtWqLZXC45SrfkK5XETREe5/view?usp=sharing) and for the E-PKG dataset from [here](https://drive.google.com/file/d/1sO0YQRkr93JLq2S3OP_WNgxMsSr6Yjgw/view?usp=sharing). To run the code, create the folders "dataset/dbp5l" and "dataset/epkg" and download the two datasets into them respectively.

## Setup

To run the code, you need the following dependencies:

- [Python 3.6.10](https://www.python.org/)
- [PyTorch 1.10.0](https://pytorch.org/)
- [pytorch_geometric 2.0.4](https://pytorch-geometric.readthedocs.io/)
- torch-cluster==1.6.0
- torch-scatter==2.0.9
- torch-sparse==0.6.13
- [numpy 1.16.1](https://numpy.org/)

## Usage

Execute the following script to train the model on the target Japanese KG:

```bash
python run_model.py --target_language ja --use_default
```

Key options of this script:

- `--target_language`: The target KG on which to conduct the KG completion task.
- `--num_hop`: Number of hops for sampling neighbors for each node.
- `--preserved_ratio`: The ratio of alignment links to preserve when learning alignment embeddings. The rest serve as masked alignments, which the model is asked to recover.
- `--generation_freq`: The number of epochs between rounds of new pair generation.
- `--use_default`: Use the preset hyperparameter combinations.

The details of other optional hyperparameters can be found in run_model.py.

## Citation

Please consider citing the following paper when using our code for your application.

```bibtex
@inproceedings{SS-AGA,
  title={Multilingual Knowledge Graph Completion with Self-Supervised Adaptive Graph Alignment},
  author={Zijie Huang and Zheng Li and Haoming Jiang and Tianyu Cao and Hanqing Lu and Bing Yin and Karthik Subbian and Yizhou Sun and Wei Wang},
  booktitle={Annual Meeting of the Association for Computational Linguistics (ACL)},
  year={2022}
}
```

## License

This project is licensed under the Apache-2.0 License and CDLA-Permissive 2.0 license.