-- CODE USED FOR JOINTLY EMBEDDING KNOWLEDGE GRAPHS AND LOGICAL RULES --


OUTLINE:

  1. Introduction
  2. Preprocessing
  3. Training
  4. Testing
  5. How to cite
  6. Contact


  1. INTRODUCTION

The code in the folder KALE/ is used for jointly embedding knowledge graphs and logical rules. Below we describe how to run the experiments for the link prediction task.


  2. PREPROCESSING

To run the experiments, you need to preprocess the datasets in the folder datasets/ following the steps below:

(1) Change the data form: call the program ConvertDataForm.java in the folder src/basic/dataProcess to convert the original string-form data into digital (integer ID) form, producing the files “train/valid/test.txt” in the folder datasets/.

(2) Propositionalize logical rules: call the program GroundAllRules.java in the folder src/basic/dataProcess to ground the logical rules in the folder datasets/, producing the file “groundings.txt”.
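For orientation, the conversion in step (1) is essentially a string-to-integer ID mapping over entities and relations. The following is a minimal sketch of that idea only (the class name, the tab-separated triple layout, and the ID assignment order are assumptions for illustration, not necessarily what ConvertDataForm.java does):

  import java.io.*;
  import java.util.*;

  // Illustration of step (1): map string-form triples (head <TAB> relation <TAB> tail)
  // to integer IDs, writing one integer triple per line.
  public class ToyConvertDataForm {
      static int id(Map<String, Integer> ids, String name) {
          Integer v = ids.get(name);
          if (v == null) { v = ids.size(); ids.put(name, v); }
          return v;
      }

      public static void main(String[] args) throws IOException {
          Map<String, Integer> entityIds = new HashMap<>();
          Map<String, Integer> relationIds = new HashMap<>();
          try (BufferedReader in = new BufferedReader(new FileReader(args[0]));
               PrintWriter out = new PrintWriter(new FileWriter(args[1]))) {
              String line;
              while ((line = in.readLine()) != null) {
                  String[] t = line.trim().split("\t");
                  out.println(id(entityIds, t[0]) + "\t" + id(relationIds, t[1]) + "\t" + id(entityIds, t[2]));
              }
          }
          System.out.println(entityIds.size() + " entities, " + relationIds.size() + " relations");
      }
  }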


  3. TRAINING

To train a model, follow the steps below:

(1) Export KALEProgram.java in the folder src/kale/joint as a runnable JAR file, e.g. named KALE.jar.

(2) Call KALE.jar with parameters, for example:

java -jar KALE.jar -train datasets\wn18\train.txt -valid datasets\wn18\valid.txt -test datasets\wn18\test.txt -rule datasets\wn18\groundings.txt -m 18 -n 40943 -w 0.1 -k 50 -d 0.2 -ge 0.1 -gr 0.1 -# 1000 -skip 50

You can also change the parameters when running KALE.jar:

  • train: the path of training triples
  • valid: the path of validation triples
  • test: the path of testing triples
  • rule: the path of grounded rules
  • m: number of relations
  • n: number of entities
  • k: embedding dimensionality (default 20)
  • d: value of the margin (default 0.1)
  • ge: initial learning rate of matrix E for AdaGrad (default 0.01)
  • gr: initial learning rate of matrix R for AdaGrad (default 0.01)
  • #: number of iterations (default 1000)
  • skip: number of skipped iterations (default 50)

The program will train a model with the input parameters, and output the following files:

  • result.log: the specification document
  • MatrixE.best: learned entity embeddings
  • MatrixR.best: learned relation embeddings

Please note that in this new implementation, KALE is optimized in a new mode: SGD with AdaGrad and gradient normalization.
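For readers unfamiliar with this optimization mode, the sketch below shows the general shape of one AdaGrad step with gradient normalization on a single embedding vector. It is an illustration of the technique only; the method name, the unit-norm normalization, and the epsilon constant are assumptions and do not reproduce the exact update in KALEProgram.java:

  // One SGD/AdaGrad update with gradient normalization for a single embedding vector.
  public class AdaGradStep {
      public static void update(double[] emb, double[] grad, double[] sumSqGrad, double learningRate) {
          // Gradient normalization: rescale the gradient to unit L2 norm.
          double norm = 0.0;
          for (double g : grad) norm += g * g;
          norm = Math.sqrt(norm);
          if (norm > 0) {
              for (int i = 0; i < grad.length; i++) grad[i] /= norm;
          }
          // AdaGrad: the per-dimension step size shrinks with the accumulated squared gradients.
          for (int i = 0; i < emb.length; i++) {
              sumSqGrad[i] += grad[i] * grad[i];
              emb[i] -= learningRate * grad[i] / (Math.sqrt(sumSqGrad[i]) + 1e-8);
          }
      }
  }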


  4. TESTING

To evaluate on the test datasets, you need to call the program Eval_LinkPrediction.java in the folder src/test.

You can also change the input parameters when running it (an example invocation is sketched after the list below):

  • iEntities: number of entities
  • iRelations: number of relations
  • iFactors: embedding dimensionality
  • fnMatrixE: the file path of entity embeddings
  • fnMatrixR: the file path of relation embeddings
  • fnTrainTriples: the file path of training data (digital form)
  • fnValidTriples: the file path of validation data (digital form)
  • fnTestTriples: the file path of testing data (digital form)
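
As an illustration, after exporting Eval_LinkPrediction.java as a runnable JAR (say, Eval.jar), an evaluation run on WN18 might look as follows; the flag names and their order here are assumptions based on the parameter list above, not a documented command line:

java -jar Eval.jar -iEntities 40943 -iRelations 18 -iFactors 50 -fnMatrixE MatrixE.best -fnMatrixR MatrixR.best -fnTrainTriples datasets\wn18\train.txt -fnValidTriples datasets\wn18\valid.txt -fnTestTriples datasets\wn18\test.txt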

The program will evaluate on the testing data, and report the MRR, MED, and Hits@N metrics (in both the raw and filtered settings).
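For reference, MRR and Hits@N are computed from the rank of each test triple's correct answer among all candidate completions; a minimal sketch of the standard definitions follows (an illustration only, not the code in Eval_LinkPrediction.java):

  // MRR is the mean of 1/rank; Hits@N is the fraction of test triples ranked in the top N.
  public class RankMetrics {
      public static double mrr(int[] ranks) {
          double sum = 0.0;
          for (int r : ranks) sum += 1.0 / r;
          return sum / ranks.length;
      }

      public static double hitsAtN(int[] ranks, int n) {
          int hits = 0;
          for (int r : ranks) if (r <= n) hits++;
          return (double) hits / ranks.length;
      }
  }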


  5. HOW TO CITE

When using this code or data, please cite the original paper:

@inproceedings{guo2016:KALE,
  title = {Jointly Embedding Knowledge Graphs and Logical Rules},
  author = {Shu Guo and Quan Wang and Lihong Wang and Bin Wang and Li Guo},
  booktitle = {Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing},
  year = {2016},
  pages = {192--202}
}


  6. CONTACT

For all remarks or questions, please contact Quan Wang: wangquan (at) iie (dot) ac (dot) cn.
