Likelihood-Lab / Local Parameterization



Research on Convolutional Neural Networks Based on Geometric Description

Introduction

Convolutional neural networks (CNNs) are now widely used in image recognition, where they achieve very high accuracy. However, the interpretability problem of neural networks remains unsolved: we still cannot explain how the weights and biases of each CNN layer extract the so-called image features. We therefore start from the essence of the image and extract its geometric features, aiming to reduce this lack of interpretability before classifying the image.

Why Local Parameterization

Inspired by 'Image super-resolution using gradient profile prior' (CVPR 2008). In a sufficiently small local area, an edge is translation invariant; that is, within the local area the contour can be approximately fitted by a straight line or a curve.

LP-CNN (Local Parameterization CNN)

To extract geometric features, we first need to find the orientation of each local image window. The local template initially has four candidate directions, and the orientation is defined as the direction, among those four, along which the absolute difference is smallest. We then extract the style function's coefficients, position information, minimum and maximum values, and so on. Finally, the information inside the local window is represented by a vector.
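The orientation rule above can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration rather than the repository's code: the four directions are taken to be 0, 45, 90 and 135 degrees, the "absolute difference" is taken to be the mean absolute difference between neighbouring pixels along a direction, and the names `DIRECTIONS`, `directional_difference`, `local_orientation` and `describe_window` are hypothetical. The style function's coefficients are left out because the README does not define them.

```python
# Hypothetical sketch of orientation selection for one local window.
# Directions, scoring rule and all names are assumptions, not the project's code.

# (row step, column step) for each candidate angle, in image coordinates.
DIRECTIONS = {0: (0, 1), 45: (-1, 1), 90: (1, 0), 135: (1, 1)}


def directional_difference(window, step):
    """Mean absolute difference between neighbouring pixels along `step`."""
    dr, dc = step
    rows, cols = len(window), len(window[0])
    total, count = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                total += abs(window[r][c] - window[r2][c2])
                count += 1
    return total / count if count else 0.0


def local_orientation(window):
    """Return the angle along which the window changes least."""
    scores = {angle: directional_difference(window, step)
              for angle, step in DIRECTIONS.items()}
    return min(scores, key=scores.get)


def describe_window(window, row, col):
    """Summarise a window as [orientation, min, max, row, col]."""
    flat = [value for line in window for value in line]
    return [local_orientation(window), min(flat), max(flat), row, col]


# A vertical edge: intensity is constant when moving along the 90-degree direction.
vertical_edge = [[0, 0, 1, 1] for _ in range(4)]
# A horizontal edge: intensity is constant when moving along the 0-degree direction.
horizontal_edge = [[0] * 4, [0] * 4, [1] * 4, [1] * 4]

print(local_orientation(vertical_edge))
print(local_orientation(horizontal_edge))
print(describe_window(vertical_edge, 2, 3))
```

Choosing the direction of least change means the selected angle runs along the edge, which is the direction in which the translational invariance described earlier holds.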

The approach rests on some mathematical justification, starting with the rationality of the local parameterization. Here the edge is described by a quadratic function with H(x, y) = 0. By the implicit function theorem, y can locally be expressed as a function of x, and x as a function of y, which indicates that a linear implicit function can fit the edge when the local area is small. In addition, the second partial derivative of the edge in the orthogonal direction is bounded, so we use the Taylor expansion to estimate the error. As long as the filter size is controlled well, the error of the local parameterization can be made very small.
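One way to write this argument out is sketched below. It is a reconstruction under stated assumptions, not the authors' proof: the symbols f, K and r are introduced here for illustration, the edge is assumed to pass through (x_0, y_0) with a non-vanishing partial derivative H_y at that point, and the bounded second derivative is read as a bound on f''.

```latex
% Assumptions: H(x_0, y_0) = 0 and H_y(x_0, y_0) \neq 0.
% The implicit function theorem then gives a function f with y = f(x) near x_0:
\[
  H\bigl(x, f(x)\bigr) = 0, \qquad
  f'(x_0) = -\frac{H_x(x_0, y_0)}{H_y(x_0, y_0)} .
\]
% Taylor expansion with Lagrange remainder, for some \xi between x_0 and x:
\[
  f(x) = f(x_0) + f'(x_0)\,(x - x_0) + \tfrac{1}{2}\, f''(\xi)\,(x - x_0)^2 .
\]
% If |f''| \le K inside the window and |x - x_0| \le r, the linear fit satisfies
\[
  \bigl| f(x) - f(x_0) - f'(x_0)\,(x - x_0) \bigr| \le \frac{K\, r^2}{2} .
\]
```

Under these assumptions the error shrinks quadratically with the window half-width r, which is consistent with the remark that a well-chosen filter size keeps the error small.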

Finally, the whole image is represented as a matrix of shape M×N, which is then classified with a CNN.
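A rough sketch of how per-window vectors might be stacked into such a matrix follows. The README does not say what M and N stand for, so this example assumes M is the number of windows and N the length of each descriptor. The non-overlapping tiling, the placeholder descriptor and the names `describe_window` and `image_to_feature_matrix` are all hypothetical.

```python
# Hypothetical sketch: stack one descriptor per local window into an M x N matrix.
# The tiling scheme and the descriptor contents are assumptions for illustration.


def describe_window(window, row, col):
    """Placeholder descriptor: [min, max, row, col] of a window."""
    flat = [value for line in window for value in line]
    return [min(flat), max(flat), row, col]


def image_to_feature_matrix(image, size):
    """Tile `image` into non-overlapping size x size windows and return
    a list of M descriptors, each of length N."""
    rows, cols = len(image), len(image[0])
    matrix = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            window = [line[c:c + size] for line in image[r:r + size]]
            matrix.append(describe_window(window, r, c))
    return matrix


# A 4 x 4 toy image whose pixel values are 0..15 in row-major order.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
features = image_to_feature_matrix(image, 2)

for row in features:
    print(row)
```

The resulting matrix would then be the input to the classifier in place of the raw pixels.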

Result

Version 1.0 of the model (with four templates) converges more slowly than a traditional CNN and is 1%-3% less accurate. A t-SNE embedding computed at this stage confirmed that its results were indeed inferior to those of a traditional CNN. Version 2.0 added more template directions (12 templates); its accuracy was higher than that of the CNN and it converged faster. To test robustness, we added a gradient attack and found that our model was more robust, which suggests that it extracts the correct geometric features of the image. Note, however, that our model performs well on MNIST and EMNIST but poorly on complex color images such as the CIFAR10 dataset. Our conclusion is that edge information alone is not enough to recognize an image; texture information is also needed. However, there is no reasonable definition of texture features, so they were not extracted in the experiment.
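The README does not say which gradient attack was used. As an illustration of the general idea only, the sketch below applies the fast gradient sign method (FGSM), a common gradient-based attack, to a tiny logistic-regression model. The model, its weights, the input and the names `predict` and `fgsm` are hypothetical and unrelated to the experiments reported above.

```python
# Hypothetical sketch of a gradient attack (FGSM) on a toy logistic-regression model.
# This is not the attack or the model used in the project.
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of class 1 under the logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    """Return -1, 0 or 1 according to the sign of `value`."""
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, label, epsilon):
    """Move every input by epsilon in the direction that increases the
    cross-entropy loss; here dL/dx_i = (p - label) * w_i."""
    p = predict(weights, bias, x)
    gradient = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]


weights, bias = [2.0, -1.0], 0.0
x, label = [0.5, 0.2], 1
x_adv = fgsm(weights, bias, x, label, epsilon=0.3)

print(predict(weights, bias, x) > 0.5)      # clean input is classified as class 1
print(predict(weights, bias, x_adv) > 0.5)  # perturbed input is no longer class 1
```

A robustness comparison of the kind described above would measure how much accuracy each model loses as epsilon grows.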

Future Work

So far we have completed only one step, a simple local parameterization. Open issues remain:

1. The direct interconnection between edges is ignored.
2. Is there a higher level of secondary semantics, such as curves, right angles, or arcs?
3. Is it possible to use a graph model connected with texture information?

Contributors

• Tanli Zuo
• Xiaohang Liang
• Jiahang Cao

Institutions

• Likelihood Lab
• Sun Yat-sen University

Acknowledgement

We would like to express our sincere appreciation to Maxwell Liu from ShiningMidas Private Fund and Xingyu Fu from Sun Yat-sen University for their generous guidance throughout the project. We are also grateful to Tanli Zuo from Sun Yat-sen University for his assistance all the way. Without their support, we could not have accomplished such a challenging task.
