# iNaturalist

**Repository Path**: tutu123456_admin/iNaturalist

## Basic Information

- **Project Name**: iNaturalist
- **Description**: iNaturalist2018, long-tail distribution, imbalance, fine-grained dataset classification and detection
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 0
- **Created**: 2018-08-21
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

## Introduction

This code trains and tests a model for anti-spoofing (liveness detection) in face recognition. We load data with DataFlow (ZMQ-based parallel loading), define the model (ResNet-18), and launch the training process with tensorpack. See `git log` for the code version history.

This repo is organized as follows:

```
antispoof/
|->experiments
|  |->*.py
|->train_log
|  |->antispoof-resnet-d18
|  |->final_model
|->data
|  |->train
|  |  |->*.txt
|  |->test
|  |  |->*.txt
|  |->*.pkl
```

## Main Results

1. We train on the ArcSoft anti-spoofing training set, which includes `600k` training images, and test on a `1400k`-image test set; both are described in the `data` folder.
2. Extra experiments are coming soon if needed.

| Model Name-global_step | train_accu | test_accu | test_accu_live | test_accu_paper | test_accu_screen |
|------------------------|------------|-----------|----------------|-----------------|------------------|
| res18-49204            | 0.9966     | 0.9920    | 0.9930         | 0.9949          | 0.9900           |

## Requirements

1. Python 2.7 or 3. We recommend using Anaconda as it already includes many common packages.
2. [tensorpack](https://github.com/tensorpack/tensorpack#toc5)
3. Some additional packages (`easydict`, `cv2`, ...)

## Prepare Data, Testing, Training

### Prepare data

Data should be organized as follows:

```
|->data
|  |->train
|  |  |->*.txt
|  |->test
|  |  |->*.txt
|  |->*.pkl
```

Notes:

1. Train and test image paths and labels are stored in the `*.txt` files, one `path label` pair per line, e.g.:
```
F:/dataset/live/00000.jpg 0
```
2.
generate `imgdict.pkl` with:
```
python dataset.py
```

### Training

Use `train.py` to run training. For convenience, all arguments can be specified in `config.py` and used as defaults for the command-line args. Training with `batch_size=256` on `4` GPUs takes only about `8` minutes per epoch. e.g.:

```
cd experiments/
python train.py --gpu 4,5,6,7 --d 18 --mode resnet
```

### Testing

Use `test.py` to run testing; it saves the test-set results in `pred_record.pkl`. Testing with `batch_size=1` on `1` GPU takes about `15` hours per epoch (the HDD is slow). e.g.:

```
cd experiments/
python test.py --load ../../train_log/final_model/res18-49204
```

## Other important processes

### DataFlow test

DataFlow is a library for building Python iterators for efficient data loading. We use multiprocess loading to read data in parallel, which can load about `1k~1.5k` images per second. e.g.:

```
python dpflow.py
```

### Dump model for inference

`dump-model-params.py` can be used to remove unnecessary variables from a checkpoint. It takes a metagraph file and saves only the variables that the model needs at inference time. It can dump the model to a var-name: value dict saved in npz format. e.g.:

```
python dump-model-params.py --meta ../../train_log/antispoof-resnet-d18/graph-0607-102551.meta ../../train_log/antispoof-resnet-d18/model-49204 ../../train_log/final_model/res18-49204
```

### Test-set result statistics

`static_testset.py` can be used to load the test-result pickle file and compute specific statistics, such as misclassified images and per-class test accuracy. We can also filter our dataset by descending softmax value; it will generate `new*.csv` as a new dataset. e.g.:

```
python static_testset.py
```

## Features

This repo is designed to be `fast` and `simple` for research. For example, we build a fast data provider for training.

## Contact

```
HuZhikun (胡志坤)
Tel: (+86)18401705037
Email: hzk16@mails.tsinghua.edu.cn
```
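## Appendix: annotation format and per-class statistics sketch

As a closing illustration of the `path label` annotation format described in "Prepare data" and the kind of per-class accuracy statistics computed over test results, here is a minimal Python sketch. The function names and helper structure are assumptions for illustration only, not the repo's actual `static_testset.py` code:

```python
# Minimal sketch (illustrative, not the repo's actual code):
# parse the `path label` annotation lines and compute overall
# and per-class accuracy from parallel label/prediction lists.
from collections import defaultdict


def parse_label_file(lines):
    """Parse lines like 'F:/dataset/live/00000.jpg 0' into (path, label) pairs."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last space so paths containing spaces survive intact.
        path, label = line.rsplit(" ", 1)
        samples.append((path, int(label)))
    return samples


def per_class_accuracy(labels, preds):
    """Return (overall_accuracy, {class: accuracy}) for parallel lists."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p in zip(labels, preds):
        total[y] += 1
        correct[y] += int(y == p)
    overall = sum(correct.values()) / sum(total.values())
    return overall, {c: correct[c] / total[c] for c in sorted(total)}


if __name__ == "__main__":
    lines = ["F:/dataset/live/00000.jpg 0", "F:/dataset/paper/00001.jpg 1"]
    samples = parse_label_file(lines)
    labels = [label for _, label in samples]
    overall, by_class = per_class_accuracy(labels, [0, 0])
    print(overall, by_class)  # -> 0.5 {0: 1.0, 1: 0.0}
```

A real version would read the `*.txt` files from `data/train` or `data/test` and the predictions from `pred_record.pkl`; the toy inputs above just demonstrate the format.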