# few-shot

**Repository Path**: zc0617/few-shot

## Basic Information

- **Project Name**: few-shot
- **Description**: few-shot-learning
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-09-16
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Few-shot learning

The aim of this repository is to contain clean, readable and tested code to reproduce few-shot learning research.

This project is written in Python 3.6 and PyTorch and assumes you have a GPU.

See these Medium articles for more information:

1. [Theory and concepts](https://towardsdatascience.com/advances-in-few-shot-learning-a-guided-tour-36bc10a68b77)
2. [Discussion of implementation details](https://towardsdatascience.com/advances-in-few-shot-learning-reproducing-results-in-pytorch-aba70dee541d)

# Setup

### Requirements

Listed in `requirements.txt`. Install with `pip install -r requirements.txt`, preferably in a virtualenv.

### Data

Edit the `DATA_PATH` variable in `config.py` to the location where you store the Omniglot and miniImageNet datasets.

After acquiring the data and running the setup scripts your folder structure should look like

```
DATA_PATH/
    Omniglot/
        images_background/
        images_evaluation/
    miniImageNet/
        images_background/
        images_evaluation/
```

**Omniglot** dataset. Download from https://github.com/brendenlake/omniglot/tree/master/python, place the extracted files into `DATA_PATH/Omniglot_Raw` and run `scripts/prepare_omniglot.py`.

**miniImageNet** dataset. Download files from https://drive.google.com/file/d/0B3Irx3uQNoBMQ1FlNXJsZUdYWEE/view, place them in `data/miniImageNet/images` and run `scripts/prepare_mini_imagenet.py`.

### Tests (optional)

After adding the datasets, run `pytest` in the root directory to run all tests.

# Results

The file `experiments/experiments.txt` contains the hyperparameters I used to obtain the results given below.

### Prototypical Networks

![Prototypical Networks](https://github.com/oscarknagg/few-shot/blob/master/assets/proto_nets_diagram.png)

Run `experiments/proto_nets.py` to reproduce results from [Prototypical Networks for Few-shot Learning](https://arxiv.org/pdf/1703.05175.pdf) (Snell et al.).

**Arguments**
- dataset: {'omniglot', 'miniImageNet'}. Whether to use the Omniglot or miniImageNet dataset
- distance: {'l2', 'cosine'}. Which distance metric to use
- n-train: Support samples per class for training tasks
- n-test: Support samples per class for validation tasks
- k-train: Number of classes in training tasks
- k-test: Number of classes in validation tasks
- q-train: Query samples per class for training tasks
- q-test: Query samples per class for validation tasks

|                  | Omniglot |     |      |      |
|------------------|----------|-----|------|------|
| **k-way**        | **5**    |**5**|**20**|**20**|
| **n-shot**       | **1**    |**5**|**1** |**5** |
| Published        | 98.8     |99.7 |96.0  |98.9  |
| This Repo        | 98.2     |99.4 |95.8  |98.6  |

|                  | miniImageNet |     |
|------------------|--------------|-----|
| **k-way**        | **5**        |**5**|
| **n-shot**       | **1**        |**5**|
| Published        | 49.4         |68.2 |
| This Repo        | 48.0         |66.2 |

### Matching Networks

A differentiable nearest-neighbours classifier.

![Matching Networks](https://github.com/oscarknagg/few-shot/blob/master/assets/matching_nets_diagram.png)

Run `experiments/matching_nets.py` to reproduce results from [Matching Networks for One Shot Learning](https://arxiv.org/pdf/1606.04080.pdf) (Vinyals et al.).
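The "differentiable nearest neighbours" idea boils down to: embed support and query images, score each query against every support example with a distance metric, and use a softmax over those scores as attention weights onto the support labels. The following is a minimal sketch of that classification rule, assuming pre-computed embeddings; the function name `matching_net_predictions` and the toy tensors are illustrative, not this repository's actual API.

```python
import torch
import torch.nn.functional as F

def matching_net_predictions(support, support_labels, queries, k, distance='l2'):
    """Attention-based nearest-neighbour classification (sketch).

    support:        [k * n, d] embeddings of the support set
    support_labels: [k * n] integer class labels in [0, k)
    queries:        [q, d] embeddings of the query set
    Returns:        [q, k] class probabilities for each query.
    """
    if distance == 'l2':
        # Negative squared Euclidean distance as the similarity score
        similarities = -torch.cdist(queries, support) ** 2
    else:
        # Cosine similarity
        similarities = F.normalize(queries, dim=1) @ F.normalize(support, dim=1).T

    # Softmax over the support set gives attention weights per query
    attention = similarities.softmax(dim=1)                      # [q, k * n]
    one_hot = F.one_hot(support_labels, num_classes=k).float()   # [k * n, k]
    return attention @ one_hot                                   # [q, k]

# Toy usage: a 5-way, 1-shot task with random 64-d embeddings
support = torch.randn(5, 64)
labels = torch.arange(5)
queries = torch.randn(15, 64)
probs = matching_net_predictions(support, labels, queries, k=5)
print(probs.shape)  # torch.Size([15, 5])
```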
**Arguments**
- dataset: {'omniglot', 'miniImageNet'}. Whether to use the Omniglot or miniImageNet dataset
- distance: {'l2', 'cosine'}. Which distance metric to use
- n-train: Support samples per class for training tasks
- n-test: Support samples per class for validation tasks
- k-train: Number of classes in training tasks
- k-test: Number of classes in validation tasks
- q-train: Query samples per class for training tasks
- q-test: Query samples per class for validation tasks
- fce: Whether (True) or not (False) to use full context embeddings (FCE)
- lstm-layers: Number of LSTM layers to use in the support set FCE
- unrolling-steps: Number of unrolling steps to use when calculating the FCE of the query sample

I had trouble reproducing the results of this paper using the cosine distance metric, as I found convergence to be slow and final performance dependent on the random initialisation. However, I was able to reproduce (and slightly exceed) the results of this paper using the l2 distance metric.

|                     | Omniglot |     |      |      |
|---------------------|----------|-----|------|------|
| **k-way**           | **5**    |**5**|**20**|**20**|
| **n-shot**          | **1**    |**5**|**1** |**5** |
| Published (cosine)  | 98.1     |98.9 |93.8  |98.5  |
| This Repo (cosine)  | 92.0     |93.2 |75.6  |77.8  |
| This Repo (l2)      | 98.3     |99.8 |92.8  |97.8  |

|                        | miniImageNet |     |
|------------------------|--------------|-----|
| **k-way**              | **5**        |**5**|
| **n-shot**             | **1**        |**5**|
| Published (cosine, FCE)| 44.2         |57.0 |
| This Repo (cosine, FCE)| 42.8         |53.6 |
| This Repo (l2)         | 46.0         |58.4 |

### Model-Agnostic Meta-Learning (MAML)

![MAML](https://github.com/oscarknagg/few-shot/blob/master/assets/maml_diagram.png)

I used max pooling instead of strided convolutions in order to be consistent with the other papers. The miniImageNet experiments using 2nd order MAML took me over a day to run.

Run `experiments/maml.py` to reproduce results from [Model-Agnostic Meta-Learning](https://arxiv.org/pdf/1703.03400.pdf) (Finn et al.).

**Arguments**
- dataset: {'omniglot', 'miniImageNet'}. Whether to use the Omniglot or miniImageNet dataset
- distance: {'l2', 'cosine'}. Which distance metric to use
- n: Support samples per class for few-shot tasks
- k: Number of classes in training tasks
- q: Query samples per class for training tasks
- inner-train-steps: Number of inner-loop updates to perform on training tasks
- inner-val-steps: Number of inner-loop updates to perform on validation tasks
- inner-lr: Learning rate to use for inner-loop updates
- meta-lr: Learning rate to use when updating the meta-learner weights
- meta-batch-size: Number of tasks per meta-batch
- order: Whether to use 1st or 2nd order MAML
- epochs: Number of training epochs
- epoch-len: Meta-batches per epoch
- eval-batches: Number of meta-batches to use when evaluating the model after each epoch

NB: For MAML, n, k and q are fixed between train and test. You may need to adjust meta-batch-size to fit your GPU. 2nd order MAML uses a _lot_ more memory.

|                  | Omniglot |     |      |      |
|------------------|----------|-----|------|------|
| **k-way**        | **5**    |**5**|**20**|**20**|
| **n-shot**       | **1**    |**5**|**1** |**5** |
| Published        | 98.7     |99.9 |95.8  |98.9  |
| This Repo (1)    | 95.5     |99.5 |92.2  |97.7  |
| This Repo (2)    | 98.1     |99.8 |91.6  |95.9  |

|                  | miniImageNet |     |
|------------------|--------------|-----|
| **k-way**        | **5**        |**5**|
| **n-shot**       | **1**        |**5**|
| Published        | 48.1         |63.2 |
| This Repo (1)    | 46.4         |63.3 |
| This Repo (2)    | 47.5         |64.7 |

The number in brackets indicates 1st or 2nd order MAML.
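For reference, the part of MAML that the `inner-train-steps`, `inner-lr` and `order` arguments control is the nested optimisation: each task gets a few gradient steps from the shared initialisation, and the loss after those steps is backpropagated to the initialisation itself. The sketch below shows one meta-update on a single toy task; the single linear layer, random data and hyperparameter values are simplifying assumptions for illustration, not the model or code in `experiments/maml.py`.

```python
import torch
import torch.nn.functional as F

# Toy 'model': one linear layer kept as an explicit parameter list,
# so inner-loop updates can be written out functionally.
weights = [torch.randn(5, 64, requires_grad=True), torch.zeros(5, requires_grad=True)]

def forward(x, params):
    w, b = params
    return F.linear(x, w, b)

def inner_loop(x_support, y_support, params, inner_lr=0.4, steps=1, order=2):
    """Adapt the shared initialisation to one task and return the fast weights."""
    fast = params
    for _ in range(steps):
        loss = F.cross_entropy(forward(x_support, fast), y_support)
        # create_graph=True keeps the inner-loop graph for 2nd order MAML;
        # 1st order MAML discards it, which is cheaper and uses less memory.
        grads = torch.autograd.grad(loss, fast, create_graph=(order == 2))
        fast = [p - inner_lr * g for p, g in zip(fast, grads)]
    return fast

meta_optimiser = torch.optim.Adam(weights, lr=1e-3)

# One meta-update on a single toy 5-way task (random data for illustration)
x_s, y_s = torch.randn(5, 64), torch.arange(5)              # support set
x_q, y_q = torch.randn(15, 64), torch.arange(5).repeat(3)   # query set

fast_weights = inner_loop(x_s, y_s, weights, order=2)
meta_loss = F.cross_entropy(forward(x_q, fast_weights), y_q)

meta_optimiser.zero_grad()
meta_loss.backward()   # gradients flow back through the inner-loop updates
meta_optimiser.step()
```

In a full run this meta-update would be averaged over `meta-batch-size` tasks before calling `meta_optimiser.step()`, which is why that argument is the main lever for fitting 2nd order MAML into GPU memory.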