A Python package for graph kernels, graph edit distances and the graph pre-image problem.
gfortran) and BLAS/LAPACK (e.g.
$ pip install graphkit-learn
$ git clone https://github.com/jajupmochi/graphkit-learn.git
$ cd graphkit-learn/
$ python setup.py install
A series of tests can be run to check if the library works correctly:
$ pip install -U pip pytest codecov coverage pytest-cov
$ pytest -v --cov-config=.coveragerc --cov-report term --cov=gklearn gklearn/tests/
See the notebooks directory for more demos:
notebooks directory includes test code for graph kernels based on linear patterns;
notebooks/tests directory includes code that tests some libraries and functions;
notebooks/utils directory includes some useful tools, such as a Gram matrix checker and a function to get properties of datasets;
notebooks/else directory includes other code that we used for experiments.
The docs of the library can be found here.
GEDLIB is an easily extensible C++ library for (suboptimally) computing the graph edit distance between attributed graphs. A Python interface for
GEDLIB is integrated in this library, based on
The multiprocessing.Pool module is used to parallelize the computation of all kernels as well as the model selection.
This library uses the multiprocessing.Pool.imap_unordered function for parallelization, which may not run correctly under Windows. For now, Windows users may need to comment out the parallel code and uncomment the code below it, which runs serially. We will consider adding a parameter to switch between serial and parallel computation as needed.
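The serial/parallel switch mentioned above could look roughly like the following sketch. It is not the library's actual code: `toy_kernel`, `_kernel_on_pair`, `gram_matrix` and the `parallel` and `n_jobs` parameters are hypothetical names chosen for illustration, and each "graph" is reduced to a plain list of node labels. Only `multiprocessing.Pool` and its `imap_unordered` method come from the text above.

```python
import itertools
from multiprocessing import Pool


def toy_kernel(g1, g2):
    """Hypothetical stand-in kernel: number of distinct labels two graphs share."""
    return len(set(g1) & set(g2))


def _kernel_on_pair(task):
    # imap_unordered yields results in arbitrary order, so each result
    # carries the indices of the Gram matrix entry it belongs to.
    i, j, g1, g2 = task
    return i, j, toy_kernel(g1, g2)


def gram_matrix(graphs, parallel=True, n_jobs=2):
    """Fill a symmetric Gram matrix, either with a process pool or serially."""
    n = len(graphs)
    gram = [[0] * n for _ in range(n)]
    # Only the upper triangle (including the diagonal) needs to be computed.
    tasks = [(i, j, graphs[i], graphs[j])
             for i, j in itertools.combinations_with_replacement(range(n), 2)]
    if parallel:
        with Pool(processes=n_jobs) as pool:
            results = list(pool.imap_unordered(_kernel_on_pair, tasks, chunksize=4))
    else:
        # Serial fallback, e.g. for platforms where the pool misbehaves.
        results = [_kernel_on_pair(task) for task in tasks]
    for i, j, value in results:
        gram[i][j] = gram[j][i] = value
    return gram


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this section when the
    # module is re-imported, which matters on platforms that spawn processes.
    graphs = [["C", "O"], ["C", "N", "O"], ["N"]]
    print(gram_matrix(graphs, parallel=False))
    print(gram_matrix(graphs, parallel=True))
```

Both calls fill the same matrix; the flag only changes how the pairwise values are computed.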
Some modules (such as
OpenBLAS to perform parallel computation by default, which causes conflicts with other parallelization modules such as
multiprocessing.Pool, greatly increasing the computing time. By setting its thread count to 1,
OpenBLAS is forced to use a single thread/CPU, which avoids the conflicts. For now, this has to be done manually. Under Linux, type this command in a terminal before running the code:
$ export OPENBLAS_NUM_THREADS=1
Or add export OPENBLAS_NUM_THREADS=1 at the end of your ~/.bashrc file, then run
$ source ~/.bashrc
to make this effective permanently.
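The same variable can also be set from inside a Python script instead of the shell. The sketch below assumes that OpenBLAS reads OPENBLAS_NUM_THREADS when it is first loaded, so the variable has to be set before any library that links against OpenBLAS is imported. The helper name `limit_openblas_threads` is hypothetical and not part of this library.

```python
import os


def limit_openblas_threads(n_threads=1):
    """Set OPENBLAS_NUM_THREADS for this process and any child processes."""
    os.environ["OPENBLAS_NUM_THREADS"] = str(n_threads)
    return os.environ["OPENBLAS_NUM_THREADS"]


# Call this at the very top of the script, before importing any package
# that may load OpenBLAS; setting it afterwards is assumed to have no effect.
limit_openblas_threads(1)
```

Worker processes started later by multiprocessing.Pool inherit the environment of the parent process, so they see the same setting.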
Check this paper for a detailed description of graph kernels and experimental results:
Linlin Jia, Benoit Gaüzère, and Paul Honeine. Graph Kernels Based on Linear Patterns: Theoretical and Experimental Comparisons. working paper or preprint, March 2019. URL https://hal-normandie-univ.archives-ouvertes.fr/hal-02053946.
A comparison of the performance of graph kernels on benchmark datasets can be found here.
Fork the library and open a pull request! Make your own contribution to the community!
This research was supported by CSC (China Scholarship Council) and the French national research agency (ANR) under the grant APi (ANR-18-CE23-0014). The authors would like to thank the CRIANN (Le Centre Régional Informatique et d’Applications Numériques de Normandie) for providing computational resources.