# Adversarial Attacking on Multi-modal Learning

**Repository Path**: cse-iis/adversarial-attacking-on-multi-modal-learning

## Basic Information

- **Project Name**: Adversarial Attacking on Multi-modal Learning
- **Description**: Source code for sparse adversarial example attacks on multi-modal learning
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-01-01
- **Last Updated**: 2023-05-08

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Sparse Adversarial Examples Attacking on Multi-modal Learning

## Introduction

This code generates adversarial examples for video captioning. The target model is based on [S2VT](https://github.com/xiadingZ/video-caption.pytorch).

For example:

![Example captions](ReadmeImage/adv_example_1.png)

## Architecture

The architecture of the attack is shown below:

![Architecture](ReadmeImage/adv_example_2.png)

## Prerequisites

The required Python packages are the same as those listed for the captioning model: https://github.com/xiadingZ/video-caption.pytorch

## Getting Started

Clone this repository:

```
git clone https://gitee.com/cse-iis/adversarial-attacking-on-multi-modal-learning.git
```

Clone the captioning model:

```
git clone https://github.com/xiadingZ/video-caption.pytorch
```

Download the MSVD dataset from https://www.mediafire.com/folder/h14iarbs62e7p/shared

### Run the attack

To quickly try the video captioning attack, run the following command:

```
python main.py --recover_opt --saved_model
```

### Eval on the code

TBD
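
## Sketch: a sparse perturbation step

To give a rough idea of what "sparse" means for an adversarial example, the snippet below perturbs only a small number of pixel values in a video clip. It is a minimal sketch under stated assumptions, not the method implemented in this repository: the function name `sparse_perturbation`, the parameters `k` and `eps`, the top-k selection rule, and the random stand-in for the loss gradient are all hypothetical. A real attack would take the gradient from the captioning model's loss.

```python
import numpy as np


def sparse_perturbation(video, grad, k, eps):
    """Perturb only the k entries of `video` with the largest |grad|.

    Hypothetical helper for illustration; not taken from this repository.
    `video` and `grad` have the same shape; pixel values are assumed in [0, 1].
    """
    flat_grad = grad.ravel()
    # Indices of the k largest-magnitude gradient entries (unordered).
    top_idx = np.argpartition(np.abs(flat_grad), -k)[-k:]
    # Signed step of size eps on the selected entries, zero elsewhere.
    delta = np.zeros_like(flat_grad)
    delta[top_idx] = eps * np.sign(flat_grad[top_idx])
    # Keep the result a valid image by clipping to the assumed pixel range.
    return np.clip(video + delta.reshape(video.shape), 0.0, 1.0)


rng = np.random.default_rng(0)
video = rng.uniform(0.2, 0.8, size=(4, 8, 8, 3))  # 4 frames of 8x8 RGB
grad = rng.normal(size=video.shape)  # stand-in for a real loss gradient
adv = sparse_perturbation(video, grad, k=50, eps=0.05)
changed = np.count_nonzero(adv != video)
print(f"{changed} of {video.size} values changed")
```

Selecting entries by gradient magnitude is only one way to enforce sparsity; masks over whole frames or regions, or a norm penalty in the attack objective, are alternatives.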