# Adversarial Attacking on Multi-modal Learning

**Repository Path**: cse-iis/adversarial-attacking-on-multi-modal-learning

## Basic Information

- **Project Name**: Adversarial Attacking on Multi-modal Learning
- **Description**: The source code of sparse adversarial example attacking on multi-modal learning
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-01-01
- **Last Updated**: 2023-05-08

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Sparse Adversarial Examples Attacking on Multi-modal Learning

## Introduction
This code generates adversarial examples for video captioning. The model is based on [S2VT](https://github.com/xiadingZ/video-caption.pytorch).

For example:
![Example captions](ReadmeImage/adv_example_1.png)
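
To make the word "sparse" concrete, here is a minimal sketch of one common reading: only a few selected frames of a video are perturbed, and the change to each pixel is bounded. This is an illustration under stated assumptions, not the method implemented in this repository. The function name `apply_sparse_perturbation`, the frame-level selection, the `epsilon` bound, and the plain-list video representation are all hypothetical; a real attack would operate on tensors and optimize the perturbation against the captioning model.

```python
def apply_sparse_perturbation(frames, perturbation, frame_indices, epsilon=8 / 255):
    """Perturb only the selected frames of a video (hypothetical helper).

    frames:        list of frames, each a flat list of pixel values in [0, 1]
    perturbation:  same shape as frames, the raw (unbounded) perturbation
    frame_indices: indices of the frames that are allowed to change
    epsilon:       assumed per-pixel bound on the perturbation magnitude
    """
    selected = set(frame_indices)
    adversarial = []
    for t, (frame, delta) in enumerate(zip(frames, perturbation)):
        if t not in selected:
            # Frames outside the selection are copied unchanged (sparsity).
            adversarial.append(list(frame))
            continue
        perturbed = []
        for pixel, d in zip(frame, delta):
            bounded = max(-epsilon, min(epsilon, d))  # clip the perturbation
            perturbed.append(min(1.0, max(0.0, pixel + bounded)))  # keep a valid pixel
        adversarial.append(perturbed)
    return adversarial


# Toy video: 3 frames of 2 pixels each; only frame 1 may be modified.
video = [[0.5, 0.5], [0.2, 0.9], [0.0, 1.0]]
noise = [[1.0, -1.0], [0.5, 0.5], [0.3, -0.3]]
adv_video = apply_sparse_perturbation(video, noise, frame_indices=[1], epsilon=0.1)
changed = [t for t in range(len(video)) if adv_video[t] != video[t]]
print("frames changed:", changed)
```

Restricting the perturbation to a small set of frames keeps most of the video identical to the original, which is the general idea behind a sparse attack on video input.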

## Architecture
The architecture of the attack is shown below:
![Architecture](ReadmeImage/adv_example_2.png)

## Prerequisites
The Python packages needed for this code are the same as those listed at https://github.com/xiadingZ/video-caption.pytorch

## Getting Started

Clone this repository:
```
git clone https://gitee.com/cse-iis/adversarial-attacking-on-multi-modal-learning.git
```

Clone the model:
```
git clone https://github.com/xiadingZ/video-caption.pytorch
```

Download the MSVD dataset from https://www.mediafire.com/folder/h14iarbs62e7p/shared


### Run the attack

To quickly try the video captioning attack, run the following command:

```
python main.py <path-to-the-video> \
  --recover_opt <path-to-the-opt-file> \
  --saved_model <path-to-the-model-file>
```
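
If you want to launch the attack from a script, for instance to loop over several videos, the command can be assembled in Python. The sketch below uses only the positional video argument and the `--recover_opt` and `--saved_model` flags shown in this README; the helper names `build_attack_command` and `run_attack` and the file names in the example are hypothetical placeholders, and the sketch assumes it is run from the directory that contains `main.py`.

```python
import subprocess
import sys


def build_attack_command(video_path, opt_path, model_path, script="main.py"):
    """Return the argument list for the attack command (hypothetical helper)."""
    return [
        sys.executable,  # use the same Python interpreter as the caller
        script,
        video_path,
        "--recover_opt", opt_path,
        "--saved_model", model_path,
    ]


def run_attack(video_path, opt_path, model_path):
    """Run the attack and raise CalledProcessError if main.py exits with an error."""
    return subprocess.run(
        build_attack_command(video_path, opt_path, model_path), check=True
    )


# Placeholder paths for illustration only; replace them with your own files.
command = build_attack_command("video.avi", "opt_info.json", "model.pth")
print(" ".join(command[1:]))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.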

### Evaluation
TBD