diff --git a/README.md b/README.md index 3b459965bc1f581e4ad8d73a3d5216d935d33cf8..e2ff30370705820f8b874e18001ae704a5809092 100644 --- a/README.md +++ b/README.md @@ -338,19 +338,22 @@ DeepSparkHub甄选上百个应用算法和模型,覆盖AI和通用计算各领 ### Multimodal -| Model | Framework | Dataset | IXUCA SDK | -|---------------------------------------------------------------------------------|-----------|----------------|-------| -| [BLIP](multimodal/vision-language_model/blip/pytorch) | PyTorch | COCO | 3.1.1 | -| [CLIP](multimodal/contrastive_learning/clip/pytorch) | PyTorch | CIFAR100 | 2.2.0 | -| [ControlNet](multimodal/diffusion_model/controlnet) | PyTorch | Fill50K | 3.1.0 | -| [DDPM](multimodal/diffusion_model/ddpm) | PyTorch | CIFAR-10 | 3.1.0 | -| [LLaVA 1.5](multimodal/vision-language_model/llava-1.5/pytorch) | PyTorch | LLaVA-Pretrain | 4.1.1 | -| [L-Verse](multimodal/vision-language_model/l-verse/pytorch) | PyTorch | ImageNet | 2.2.0 | -| [Stable Diffusion 1.4](multimodal/diffusion_model/stable-diffusion-1.4/pytorch) | PyTorch | pokemon-images | 4.1.1 | -| [Stable Diffusion 1.5](multimodal/diffusion_model/stable-diffusion-1.5/pytorch) | PyTorch | pokemon-images | 4.1.1 | -| [Stable Diffusion 2.1](multimodal/diffusion_model/stable-diffusion-2.1/pytorch) | PyTorch | pokemon-images | 4.1.1 | -| [Stable Diffusion 3](multimodal/diffusion_model/stable-diffusion-3/pytorch) | PyTorch | dog-example | 4.1.1 | -| [Stable Diffusion XL](multimodal/diffusion_model/stable-diffusion-xl/pytorch) | PyTorch | pokemon-images | 4.1.1 | +| Model | Framework | Dataset | IXUCA SDK | +|---------------------------------------------------------------------------------------------|-----------|----------------|-----------| +| [BLIP](multimodal/vision-language_model/blip/pytorch) | PyTorch | COCO | 3.1.1 | +| [CLIP](multimodal/contrastive_learning/clip/pytorch) | PyTorch | CIFAR100 | 2.2.0 | +| [ControlNet](multimodal/diffusion_model/controlnet) | PyTorch | Fill50K | 3.1.0 | +| 
[DDPM](multimodal/diffusion_model/ddpm) | PyTorch | CIFAR-10 | 3.1.0 | +| [LLaVA 1.5](multimodal/vision-language_model/llava-1.5/pytorch) | PyTorch | LLaVA-Pretrain | 4.1.1 | +| [L-Verse](multimodal/vision-language_model/l-verse/pytorch) | PyTorch | ImageNet | 2.2.0 | +| [MoE-LLaVA-Phi2-2.7B](multimodal/vision-language_model/moe-llava-phi2-2.7b/pytorch/) | PyTorch | MoE-LLaVA | 4.2.0 | +| [MoE-LLaVA-Qwen-1.8B](multimodal/vision-language_model/moe-llava-qwen-1.8b/pytorch) | PyTorch | MoE-LLaVA | 4.2.0 | +| [MoE-LLaVA-StableLM-1.6B](multimodal/vision-language_model/moe-llava-stablelm-1.6b/pytorch) | PyTorch | MoE-LLaVA | 4.2.0 | +| [Stable Diffusion 1.4](multimodal/diffusion_model/stable-diffusion-1.4/pytorch) | PyTorch | pokemon-images | 4.1.1 | +| [Stable Diffusion 1.5](multimodal/diffusion_model/stable-diffusion-1.5/pytorch) | PyTorch | pokemon-images | 4.1.1 | +| [Stable Diffusion 2.1](multimodal/diffusion_model/stable-diffusion-2.1/pytorch) | PyTorch | pokemon-images | 4.1.1 | +| [Stable Diffusion 3](multimodal/diffusion_model/stable-diffusion-3/pytorch) | PyTorch | dog-example | 4.1.1 | +| [Stable Diffusion XL](multimodal/diffusion_model/stable-diffusion-xl/pytorch) | PyTorch | pokemon-images | 4.1.1 | ### NLP (Natural Language Processing) diff --git a/multimodal/vision-language_model/MoE-LLaVA/docs/CUSTOM.md b/multimodal/vision-language_model/MoE-LLaVA/docs/CUSTOM.md deleted file mode 100644 index fbc8a62f53d2fbd91cba34eb6788dd534da53593..0000000000000000000000000000000000000000 --- a/multimodal/vision-language_model/MoE-LLaVA/docs/CUSTOM.md +++ /dev/null @@ -1,214 +0,0 @@ - - -- The most **IMPORTANT** thing, make sure you understand the behavior of the tokenizer. -- We provide some samples on how different tokenizer behaviors should be changed. -- At the end it describes how to convert LLaVA style models to the MoE architecture. 
- -## Don't have special tokens, but can add special tokens - -For those tokenizers that don't have special tokens, but can add special tokens, such as QWenTokenizer or PhiTokenizer. You need to add special tokens. - -### QWenTokenizer - -#### Tokenizer - -Insert the following code after initializing the tokenizer [here](): -```python -tokenizer.add_special_tokens({ - 'unk_token': '<|extra_0|>', - 'eos_token': '<|endoftext|>' -}) -``` - -#### `preprocess_qwen` function - -Copy the `preprocess_qwen` function from the `preprocess_v1` function and modify the following: -``` -round_len = len(tokenizer_image_token(rou, tokenizer)) + 1 # for eos_token -instruction_len = len(tokenizer_image_token(parts[0], tokenizer)) - 1 # instruction_len is before the answer -``` - -Defining the use of `preprocess_qwen` in the `preprocess` function [here](). -``` -if conversation_lib.default_conversation.version.startswith("qwen"): # for qwen - return preprocess_qwen(sources, tokenizer, has_image=has_image) -``` - -#### `conv_qwen` conversation template - -Add a new conversation template such as `conv_qwen` [here](), replacing `sep2` with `eos_token`, and modify the value of `version`. - -```python -conv_qwen = Conversation( - system="A chat between a curious user and an artificial intelligence assistant. " - "The assistant gives helpful, detailed, and polite answers to the user's questions.", - roles=("USER", "ASSISTANT"), - version="qwen", # replace - messages=(), - offset=0, - sep_style=SeparatorStyle.TWO, - sep=" ", - sep2="<|endoftext|>", # replace with eos_token -) -``` - -Don't forget to register the newly defined conversation template [here](). - -```python -conv_templates = { - ... - "qwen": conv_qwen, # the key is "qwen" - ... -} -``` - -Remember the key for the registered dialogue conversation, such as `qwen`. And modify the `--version qwen` in the commands for Stage 2 and Stage 3. 
**DO NOT need to modify the `--version plain` in Stage 1.** - -### PhiTokenizer - -#### Tokenizer - -Insert the following code after initializing the tokenizer [here](): -```python -tokenizer.add_special_tokens({ - 'unk_token': '<|extra_0|>', -# 'eos_token': '<|endoftext|>' Not needed because it already exists. -}) -``` - -#### `preprocess_phi` function - -Copy the `preprocess_phi` function from the `preprocess_v1` function and modify the following: -``` -round_len = len(tokenizer_image_token(rou, tokenizer)) + 1 # for eos_token -instruction_len = len(tokenizer_image_token(parts[0], tokenizer)) - 1 # instruction_len is before the answer -``` - -Defining the use of `preprocess_phi` in the `preprocess` function [here](). -``` -if conversation_lib.default_conversation.version.startswith("phi"): # for phi - return preprocess_phi(sources, tokenizer, has_image=has_image) -``` - -#### `conv_phi` conversation template - -Add a new conversation template such as `conv_phi` [here](), replacing `sep2` with `eos_token`, and modify the value of `version`. - -```python -conv_phi = Conversation( - system="A chat between a curious user and an artificial intelligence assistant. " - "The assistant gives helpful, detailed, and polite answers to the user's questions.", - roles=("USER", "ASSISTANT"), - version="phi", # replace - messages=(), - offset=0, - sep_style=SeparatorStyle.TWO, - sep=" ", - sep2="<|endoftext|>", # replace with eos_token -) -``` - -Don't forget to register the newly defined conversation template [here](). - -```python -conv_templates = { - ... - "phi": conv_phi, # the key is "phi" - ... -} -``` - -Remember the key for the registered dialogue conversation, such as `phi`. And modify the `--version phi` in the commands for Stage 2 and Stage 3. **DO NOT need to modify the `--version plain` in Stage 1.** - - -## CAN NOT add special tokens - -### StableLMTokenizer - -#### Tokenizer - -For those tokenizers that can **not** add special tokens, such as `StableLMTokenizer`. 
- -First find all the special tokens of the tokenizer. - -``` -tokenizer.special_tokens ->>> {'<|endoftext|>': 100257, '<|fim_prefix|>': 100258, '<|fim_middle|>': 100259, '<|fim_suffix|>': 100260, '<|fim_pad|>': 100261, '': 100262, '': 100263, '': 100264, '': 100265, '': 100266, '': 100267, '': 100268, '': 100269, '': 100270, '': 100271, '': 100272, '': 100273, '': 100274, '': 100275, '<|endofprompt|>': 100276, '<|im_start|>': 100277, '<|im_end|>': 100278, '<|pause|>': 100279, '<|reg0|>': 100280, '<|reg1|>': 100281, '<|reg2|>': 100282, '<|reg3|>': 100283, '<|reg4|>': 100284, '<|reg5|>': 100285, '<|reg6|>': 100286, '<|reg7|>': 100287, '<|extra0|>': 100288} -``` - -Choosing a less important token, e.g., `<|reg0|>`. You need to make sure the tokenizer has `unk_token` [here](). - -``` -tokenizer.unk_token = '<|reg0|>' -``` - -#### `preprocess_stablelm` function - -Copy the `preprocess_stablelm` function from the `preprocess_v1` function and modify the following: -``` -total_len = int(target.ne(tokenizer.pad_token_id).sum()) + conversation.count(conv.sep2) # pad_token_id == eos_token_id -... -round_len = len(tokenizer_image_token(rou, tokenizer)) + 1 # for eos_token -instruction_len = len(tokenizer_image_token(parts[0], tokenizer)) - 1 # instruction_len is before the answer -``` - -Defining the use of `preprocess_stablelm` in the `preprocess` function [here](). -``` -if conversation_lib.default_conversation.version.startswith("stablelm"): # for stablelm - return preprocess_stablelm(sources, tokenizer, has_image=has_image) -``` - -#### `conv_stablelm` conversation template - -Add a new conversation template such as `conv_stablelm` [here](), replacing `sep2` with `eos_token`, and modify the value of `version`. - -```python -conv_stablelm = Conversation( - system="A chat between a curious user and an artificial intelligence assistant. 
" - "The assistant gives helpful, detailed, and polite answers to the user's questions.", - roles=("USER", "ASSISTANT"), - version="stablelm", # replace - messages=(), - offset=0, - sep_style=SeparatorStyle.TWO, - sep=" ", - sep2="<|endoftext|>", # replace with eos_token -) -``` - -Don't forget to register the newly defined conversation template [here](). - -```python -conv_templates = { - ... - "stablelm": conv_stablelm, # the key is "stablelm" - ... -} -``` - -Remember the key for the registered dialogue conversation, such as `stablelm`. And modify the `--version stablelm` in the commands for Stage 2 and Stage 3. **DO NOT need to modify the `--version plain` in Stage 1.** - -## The behavior of the tokenizer is consistent with `LlamaTokenizer` - -### LlamaTokenizer - -If the behavior of your tokenizer is consistent with `LlamaTokenizer`. You can just use the already defined conversation template. Beware of the differences brought about by different transformers versions, **we strongly recommend using `LlamaTokenizer` on version 4.31.0**. - -For example, for the `LlamaTokenizer`, `bos_token` is ``, `eos_token` is ``, and `unk_token` is ``. -When the tokenizer encodes one sentence, the resulting output should include the `bos_token_id`. In following example, the `bos_token_id` is 1. - - -```python -tokenizer = LlamaTokenizer.from_pretrained("lmsys/vicuna-7b-v1.5", cache_dir='cache_dir') -tokenizer(['This is first sentence', 'Test'], return_tensors='pt', padding=True) -# Output: {'input_ids': tensor([[ 1, 910, 338, 937, 10541], -# [ 1, 4321, 0, 0, 0]]), -# 'attention_mask': tensor([[1, 1, 1, 1, 1], -# [1, 1, 0, 0, 0]])} -``` -Passing the `--version v1` in the commands for Stage 2 and Stage 3. 
**DO NOT need to modify the `--version plain` in Stage 1.** - -## Converting models to MoE architectures - -Refer to [llava_stablelm_moe.py](moellava/model/language_model/llava_stablelm_moe.py), [llava_qwen_moe.py](moellava/model/language_model/llava_llama_moe.py), [llava_phi_moe.py](moellava/model/language_model/llava_phi_moe.py), [llava_mistral_moe.py](moellava/model/language_model/llava_mistral_moe.py) and [llava_llama_moe.py](moellava/model/language_model/llava_llama_moe.py) - diff --git a/multimodal/vision-language_model/MoE-LLaVA/docs/EVAL.md b/multimodal/vision-language_model/MoE-LLaVA/docs/EVAL.md deleted file mode 100644 index a02ad02548222ebdcd61cf566a939167a8f7dea8..0000000000000000000000000000000000000000 --- a/multimodal/vision-language_model/MoE-LLaVA/docs/EVAL.md +++ /dev/null @@ -1,271 +0,0 @@ -## Data preparation - -- Following LLaVA's instructions. **You MUST first download [eval.zip](https://drive.google.com/file/d/1atZSBBrAX54yYpxtVVW33zFvcnaHeFPy/view?usp=sharing)**. -- It contains custom annotations, scripts, and the prediction files with LLaVA v1.5. Extract to `eval`. This also provides a general structure for all datasets. - -After downloading all of them, organize the data as follows in `eval`. 
- -```Shell -eval -├── gqa -│   ├── answers -│   ├── data -│   └── llava_gqa_testdev_balanced.jsonl -├── llava-bench-in-the-wild -│   ├── answers -│   ├── answers_gpt4.jsonl -│   ├── bard_0718.jsonl -│   ├── bing_chat_0629.jsonl -│   ├── context.jsonl -│   ├── images -│   ├── questions.jsonl -│   ├── README.md -│   └── reviews -├── mmbench -│   ├── answers -│   ├── answers_upload -│   ├── mmbench_dev_20230712.tsv -│   └── mmbench_dev_en_20231003.tsv -├── MME -│   ├── answers -│   ├── convert_answer_to_mme.py -│   └── llava_mme.jsonl -├── mm-vet -│   ├── answers -│   ├── bard_set.json -│   ├── convert_answers.py -│   ├── images -│   ├── llava-mm-vet.jsonl -│   ├── mm-vet.json -│   └── results -├── pope -│   ├── answers -│   ├── coco -│   ├── llava_pope_test.jsonl -│   └── val2014 -├── scienceqa -│   ├── answers -│   ├── images -│   ├── llava_test_CQM-A.json -│   ├── pid_splits.json -│   └── problems.json -├── seed_bench -│   ├── answers -│   ├── answers_upload -│   ├── extract_video_frames.py -│   └── llava-seed-bench.jsonl -├── textvqa -│   ├── answers -│   ├── llava_textvqa_val_v051_ocr.jsonl -│   ├── TextVQA_0.5.1_val.json -│   └── train_images -├── vizwiz -│   ├── answers -│   ├── answers_upload -│   ├── llava_test.jsonl -│   ├── test -│   ├── test.json -│   ├── train.json -│   └── val.json -└── vqav2 - ├── answers - ├── answers_upload - ├── llava_vqav2_mscoco_test2015.jsonl - ├── llava_vqav2_mscoco_test-dev2015.jsonl - └── test2015 -``` - - -## Validating -Our image validation code comes from LLaVA, thanks for their contribution! - -You can refer to the official repository for validation, but we also provide [off-the-shelf](scripts/v1/eval) scripts. - - -### VQAv2 - -1. Download [`test2015`](http://images.cocodataset.org/zips/test2015.zip) and put it under `eval/vqav2`. -2. Multi-GPU inference. 
- -**LLaVA-based** model -```Shell -CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash scripts/v1/eval/llava/vqav2.sh -``` -**MoE-based** model -```Shell -bash scripts/v1/eval/moe_llava/vqav2.sh -``` - -3. Submit the results to the [evaluation server](https://eval.ai/web/challenges/challenge-page/830/my-submission): `eval/vqav2/answers_upload`. - -### GQA - -1. Download the data following the official instructions [here](https://cs.stanford.edu/people/dorarad/gqa/download.html) and put under `eval/gqa/data`. -2. Multi-GPU inference - -**LLaVA-based** model -```Shell -CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash scripts/v1/eval/llava/gqa.sh -``` -**MoE-based** model -```Shell -bash scripts/v1/eval/moe_llava/gqa.sh -``` - -### VisWiz - -1. Download [`test.json`](https://vizwiz.cs.colorado.edu/VizWiz_final/vqa_data/Annotations.zip) and extract [`test.zip`](https://vizwiz.cs.colorado.edu/VizWiz_final/images/test.zip) to `test`. Put them under `eval/vizwiz`. -2. Single-GPU inference. - -**LLaVA-based** model -```Shell -CUDA_VISIBLE_DEVICES=0 bash scripts/v1/eval/moe_llava/vizwiz.sh -``` -**MoE-based** model -```Shell -bash scripts/v1/eval/moe_llava/vizwiz.sh -``` - -3. Submit the results to the [evaluation server](https://eval.ai/web/challenges/challenge-page/1911/my-submission): `eval/vizwiz/answers_upload`. - -### ScienceQA - -1. Under `eval/scienceqa`, download `images`, `pid_splits.json`, `problems.json` from the `data/scienceqa` folder of the ScienceQA [repo](https://github.com/lupantech/ScienceQA). -2. Single-GPU inference and evaluate. - -**LLaVA-based** model -```Shell -CUDA_VISIBLE_DEVICES=0 bash scripts/v1/eval/moe_llava/sqa.sh -``` -**MoE-based** model -```Shell -bash scripts/v1/eval/moe_llava/sqa.sh -``` - - -### TextVQA - -1. Download [`TextVQA_0.5.1_val.json`](https://dl.fbaipublicfiles.com/textvqa/data/TextVQA_0.5.1_val.json) and [images](https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip) and extract to `eval/textvqa`. -2. 
Single-GPU inference and evaluate. - -**LLaVA-based** model -```Shell -CUDA_VISIBLE_DEVICES=0 bash scripts/v1/eval/moe_llava/textvqa.sh -``` -**MoE-based** model -```Shell -bash scripts/v1/eval/moe_llava/textvqa.sh -``` - - -### POPE - -1. Download `coco` from [POPE](https://github.com/AoiDragon/POPE/tree/e3e39262c85a6a83f26cf5094022a782cb0df58d/output/coco) and put under `eval/pope`. -2. Single-GPU inference and evaluate. - -**LLaVA-based** model -```Shell -CUDA_VISIBLE_DEVICES=0 bash scripts/v1/eval/moe_llava/pope.sh -``` -**MoE-based** model -```Shell -bash scripts/v1/eval/moe_llava/pope.sh -``` - -### MME -1. Download the data following the official instructions [here](https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models/tree/Evaluation). -2. Downloaded images to `MME_Benchmark_release_version`. -3. Put the official `eval_tool` and `MME_Benchmark_release_version` under `eval/MME`. -4. Single-GPU inference and evaluate. - -**LLaVA-based** model -```Shell -CUDA_VISIBLE_DEVICES=0 bash scripts/v1/eval/llava/mme.sh -``` -**MoE-based** model -```Shell -bash scripts/v1/eval/moe_llava/mme.sh -``` - -### MMBench - -1. Download [`mmbench_dev_20230712.tsv`](https://download.openmmlab.com/mmclassification/datasets/mmbench/mmbench_dev_20230712.tsv) and put under `eval/mmbench`. -2. Single-GPU inference. - -**LLaVA-based** model -```Shell -CUDA_VISIBLE_DEVICES=0 bash scripts/v1/eval/llava/mmbench.sh -``` -**MoE-based** model -```Shell -bash scripts/v1/eval/moe_llava/mmbench.sh -``` - -3. Submit the results to the [evaluation server](https://opencompass.org.cn/leaderboard-multimodal): `eval/mmbench/answers_upload/mmbench_dev_20230712`. - - -### MMBench-CN - -1. Download [`mmbench_dev_cn_20231003.tsv`](https://download.openmmlab.com/mmclassification/datasets/mmbench/mmbench_dev_cn_20231003.tsv) and put under `eval/mmbench`. -2. Single-GPU inference. 
- -**LLaVA-based** model -```Shell -CUDA_VISIBLE_DEVICES=0 bash scripts/v1/eval/llava/mmbench_cn.sh -``` -**MoE-based** model -```Shell -bash scripts/v1/eval/moe_llava/mmbench_cn.sh -``` - -3. Submit the results to the [evaluation server](https://opencompass.org.cn/leaderboard-multimodal): `eval/mmbench/answers_upload/mmbench_dev_cn_20231003`. - - -### SEED-Bench - -1. Following the official [instructions](https://github.com/AILab-CVC/SEED-Bench/blob/main/DATASET.md) to download the images and the videos. Put images under `eval/seed_bench/SEED-Bench-image`. -2. Extract the video frame in the middle from the downloaded videos, and put them under `eval/seed_bench/SEED-Bench-video-image`. -3. Multiple-GPU inference and evaluate. - -**LLaVA-based** model -```Shell -CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash scripts/v1/eval/llava/seed.sh -``` -**MoE-based** model -```Shell -bash scripts/v1/eval/moe_llava/seed.sh -``` - -4. Optionally, submit the results to the leaderboard: `eval/seed_bench/answers_upload` using the official jupyter notebook. - - - -### LLaVA-Bench-in-the-Wild - -1. Extract contents of [`llava-bench-in-the-wild`](https://huggingface.co/datasets/liuhaotian/llava-bench-in-the-wild) to `eval/llava-bench-in-the-wild`. -2. Single-GPU inference and evaluate. - -**LLaVA-based** model -```Shell -CUDA_VISIBLE_DEVICES=0 bash scripts/v1/eval/moe_llava/llavabench.sh -``` -**MoE-based** model -```Shell -bash scripts/v1/eval/moe_llava/llavabench.sh -``` - - -### MM-Vet - -1. Extract [`mm-vet.zip`](https://github.com/yuweihao/MM-Vet/releases/download/v1/mm-vet.zip) to `eval/mmvet`. -2. Single-GPU inference. 
- -**LLaVA-based** model -```Shell -CUDA_VISIBLE_DEVICES=0 bash scripts/v1/eval/moe_llava/mmvet.sh -``` -**MoE-based** model -```Shell -bash scripts/v1/eval/moe_llava/mmvet.sh -``` - diff --git a/multimodal/vision-language_model/MoE-LLaVA/docs/LORA.md b/multimodal/vision-language_model/MoE-LLaVA/docs/LORA.md deleted file mode 100644 index 80c947713def36cfb410a69f1b669945828ea1f8..0000000000000000000000000000000000000000 --- a/multimodal/vision-language_model/MoE-LLaVA/docs/LORA.md +++ /dev/null @@ -1,24 +0,0 @@ -## Training for LoRA tuning models -Coming soon... - -## Evaluation for LoRA tuning models - -You can evaluate the model directly after LoRA tuning as [EVAL.md](../docs/EVAL.md). - -Or you can evaluate it after merging weights as follows. - -### Optional - -You can use `script/merge_moe_lora_weights.py` to merge the LoRA weights. - -```Shell -deepspeed --include localhost:0 script/merge_lora_weights.py \ - --model-path checkpoints/moellava-phi-moe-lora \ - --save-model-path checkpoints/moellava-phi-moe-merge -``` - -> [!Warning] -> 🚨 Please do not have `lora` in `--save-model-path` and `lora` should in `--model-path`. - - -Then evaluate `checkpoints/llavaphi-moe-merge` as [EVAL.md](../docs/EVAL.md) diff --git a/multimodal/vision-language_model/MoE-LLaVA/docs/TRAIN.md b/multimodal/vision-language_model/MoE-LLaVA/docs/TRAIN.md deleted file mode 100644 index a1af3f1bd771883af5b4cb77044378c99b4289d5..0000000000000000000000000000000000000000 --- a/multimodal/vision-language_model/MoE-LLaVA/docs/TRAIN.md +++ /dev/null @@ -1,69 +0,0 @@ -## Data preparation - -- The LLaVA-PT is from [LLaVA](https://github.com/haotian-liu/LLaVA). -- The Hybird-FT is from [SViT](https://github.com/BAAI-DCAI/Visual-Instruction-Tuning), [LVIS](https://github.com/X2FD/LVIS-INSTRUCT4V), [LRV](https://github.com/FuxiaoLiu/LRV-Instruction), [MIMIC-IT](https://github.com/Luodian/Otter). -- The LLaVA-FT is from [LLaVA](https://github.com/haotian-liu/LLaVA). 
-- Download the training annotations. You can download from [Baidu Disk](https://pan.baidu.com/s/1rwub9o0T3_7ZHbPZzCiLZw?pwd=0yhi), [Google Disk](https://drive.google.com/file/d/13YxtVowfhUIpGOCODhKFstoRBvogF4od/view?usp=sharing), [Peking University Disk](https://disk.pku.edu.cn/link/AA10683317FB824FB9B2427A6B268EAADB) or [Hugging Face](https://huggingface.co/datasets/LanguageBind/MoE-LLaVA/tree/main/train_json) - - -We also provide the processed data as follows. The link is to BaiDu Disk. -
-
-| Data group | Usage | Link |
-|------------|---------|-----------------------------------------------|
-| LLaVA-PT | Stage 1 | LLaVA 1.5-558k |
-| Hybird-FT | Stage 2 | SViT-157k, LVIS-220k, LRV-331k, MIMIC-IT-256k |
-| LLaVA-FT | Stage 3 | LLaVA 1.5-mix-665k |
-
- -**For those who can not easily access to BaiDu Disk**, you can download data from [Hugging Face](https://huggingface.co/datasets/LanguageBind/MoE-LLaVA). - -After downloading all of them, organize the data as follows in ```IMAGE_FOLDER```. - -```Shell -IMAGE_FOLDER -├── llava_image -├── llava_image_tune -├── lvis_tune -├── lrv_tune -├── svit_tune -└── mimicit_tune - └── LA -``` - - -## Training -Specify your `IMAGE_FOLDER` and `JSON_FOLDER` according to the data preparation. - -For training on 384 resolution, we use `google/siglip-so400m-patch14-384` as `image_tower`. Notably, if you pass the `--image_tower google/siglip-so400m-patch14-384`, you should upgrade the version of transformers to 4.37.0. - -### Qwen -- Stage 1 pretraining script: [pretrain.sh](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main/scripts/v1/qwen/pretrain.sh). -- Stage 2 tuning script: [finetune.sh](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main/scripts/v1/qwen/finetune.sh). -- Stage 3 moe-tuning script: [finetune_moe.sh](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main/scripts/v1/qwen/finetune_moe.sh). - -### Phi2 -- Stage 1 pretraining script: [pretrain.sh](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main/scripts/v1/phi2/pretrain.sh). -- Stage 2 tuning script: [finetune.sh](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main/scripts/v1/phi2/finetune.sh). -- Stage 3 moe-tuning script: [finetune_moe.sh](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main/scripts/v1/phi2/finetune_moe.sh). - -### StableLM -- Stage 1 pretraining script: [pretrain.sh](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main/scripts/v1/stablelm/pretrain.sh). -- Stage 2 tuning script: [finetune.sh](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main/scripts/v1/stablelm/finetune.sh). -- Stage 3 moe-tuning script: [finetune_moe.sh](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main/scripts/v1/stablelm/finetune_moe.sh). 
- -### OpenChat - - - -- Stage 1 pretraining script: [pretrain.sh](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main/scripts/v1/openchat/pretrain.sh). -- Stage 2 tuning script: [finetune.sh](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main/scripts/v1/openchat/finetune.sh). -- Stage 3 moe-tuning script: [finetune_moe.sh](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main/scripts/v1/openchat/finetune_moe.sh). diff --git a/multimodal/vision-language_model/MoE-LLaVA/docs/VISUALIZATION.md b/multimodal/vision-language_model/MoE-LLaVA/docs/VISUALIZATION.md deleted file mode 100644 index 8a70bff50f1fb9dd5e71541dbcec08f06b4028f7..0000000000000000000000000000000000000000 --- a/multimodal/vision-language_model/MoE-LLaVA/docs/VISUALIZATION.md +++ /dev/null @@ -1,52 +0,0 @@ -## Visualization - -Please note that this tutorial is **for MoE models only**. - -### Getting expert logits - -For visualization, the first step is to get the logits of the experts. GQA and VQAv2 are not currently supported as they generally require multi-GPUs to run. Please change to single GPU if needed. - -In [EVAL.md](https://github.com/PKU-YuanGroup/MoE-LLaVA/blob/main/docs/EVAL.md) we describe how to perform validation. Then, for example, we just need to add `--return_gating_logit "phi_sciqa"` to get the expert logits on ScienceQA benchmark. - -```Bash -cd ~/MoE-LLaVA -CKPT_NAME="MoE-LLaVA-Phi2-2.7B-4e" -CKPT="checkpoints/${CKPT_NAME}" -EVAL="eval" -HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 deepspeed --include localhost:0 moellava/eval/model_vqa_science.py \ - --model-path ${CKPT} \ - --question-file ${EVAL}/scienceqa/llava_test_CQM-A.json \ - --image-folder ${EVAL}/scienceqa/images/test \ - --answers-file ${EVAL}/scienceqa/answers/${CKPT_NAME}.jsonl \ - --single-pred-prompt \ - --temperature 0 \ - --conv-mode phi \ - --return_gating_logit "phi_sciqa" # add this command -``` - -Then, you will get ``phi_sciqa.pt``. Now you can try the other benchmarks through `--return_gating_logit`. 
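As a rough illustration of what can be done with expert logits once they are collected, the sketch below computes how routed tokens are spread across experts. It is a toy under stated assumptions, not the repository's visualization code: the layout of `phi_sciqa.pt` is not described here, so the example uses plain Python lists with one logit per expert per token, and both `expert_load` and the `top_k=2` routing choice are hypothetical.

```python
from collections import Counter


def expert_load(gating_logits, top_k=2):
    """Return the fraction of routed token slots assigned to each expert.

    gating_logits: list of per-token lists, one logit per expert
    (an assumed layout, not the documented format of the saved file).
    """
    num_experts = len(gating_logits[0])
    counts = Counter()
    for token_logits in gating_logits:
        # Softmax preserves ordering, so ranking raw logits picks the same experts.
        ranked = sorted(range(num_experts), key=lambda i: token_logits[i], reverse=True)
        for expert in ranked[:top_k]:
            counts[expert] += 1
    total = sum(counts.values())
    return [counts[i] / total for i in range(num_experts)]


# Three toy tokens routed over four experts.
toy_logits = [
    [2.0, 1.0, 0.1, -1.0],
    [0.0, 3.0, 2.5, -2.0],
    [1.5, -0.5, 0.2, 1.4],
]
print(expert_load(toy_logits))
```

A roughly uniform result suggests balanced routing, while a skewed one means a few experts receive most tokens.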
- -### Distribution of expert loadings - -``` -python moellava/vis/vis1.py --input phi_sciqa.pt -``` - -![image](https://github.com/PKU-YuanGroup/MoE-LLaVA/assets/62638829/0a908801-b24a-4e0d-9537-1383c20ea36e) - -### Distribution of modalities across different experts - -``` -python moellava/vis/vis2.py --input phi_sciqa.pt -``` - -![image](https://github.com/PKU-YuanGroup/MoE-LLaVA/assets/62638829/f1e686ef-ecd5-4b21-a096-fa93c3ef4ae2) - -### Activated pathways - -``` -pip install mplsoccer -python moellava/vis/vis3.py --input phi_sciqa.pt -``` - -![image](https://github.com/PKU-YuanGroup/MoE-LLaVA/assets/62638829/7f952f7d-2f2d-47d3-80d5-ca733e422aaa) diff --git a/multimodal/vision-language_model/MoE-LLaVA/README_phi2.md b/multimodal/vision-language_model/moe-llava-phi2-2.7b/pytorch/README.md similarity index 49% rename from multimodal/vision-language_model/MoE-LLaVA/README_phi2.md rename to multimodal/vision-language_model/moe-llava-phi2-2.7b/pytorch/README.md index 00cac7b01b7a6ac1284e58f56bd99cfbd54bc07a..0a98afec10f36396d6a6d4cfcc3d6bb512d5361d 100644 --- a/multimodal/vision-language_model/MoE-LLaVA/README_phi2.md +++ b/multimodal/vision-language_model/moe-llava-phi2-2.7b/pytorch/README.md @@ -1,25 +1,29 @@ -# MoE-LLaVA-phi-2.7b -## Model description +# MoE-LLaVA-Phi2-2.7B -MoE-LLaVA: Mixture of Experts for Large Vision-Language Models, the Language Models is phi-2.7b +## Model Description +MoE-LLaVA is a cutting-edge vision-language model that combines the Mixture of Experts (MoE) architecture with the +phi-2.7b language model. It excels in multimodal tasks by efficiently processing and integrating visual and textual +information. The model leverages expert networks to specialize in different aspects of vision-language understanding, +enabling more accurate and context-aware responses. MoE-LLaVA is particularly effective in applications requiring +complex reasoning across visual and linguistic domains, such as image captioning and visual question answering. 
-## Prepare +## Model Preparation -### Install requirements +### Prepare Resources + +Go to MoE-LLaVA toolbox. ```bash +cd /toolbox/MoE-LLaVA +``` -cd MoE-LLaVA -pip install --upgrade pip # enable PEP 660 support -pip3 install -e . -pip3 install --upgrade pydantic +Dataset and weights need to link to current path in "MoE-LLaVA/" + +Download from the [file server](http://files.deepspark.org.cn:880/deepspark) + +The dataset path is as follows: -``` -### load data and weights -数据集和权重需要链接到当前目录 MoE-LLaVA 里 -[数据集地址](http://files.deepspark.org.cn:880/deepspark/) -格式如下: ```bash MoE-LLaVA/ ├── gitattributes @@ -29,8 +33,11 @@ MoE-LLaVA/ ├── README.md └── train_json ``` -[权重-clip-vit-large-patch14-336](http://files.deepspark.org.cn:880/deepspark/openai/) -格式如下: + +Get [clip-vit-large-patch14-336](http://files.deepspark.org.cn:880/deepspark/openai/). + +The weights path is as follows: + ```bash openai/ └── clip-vit-large-patch14-336 @@ -46,8 +53,11 @@ openai/ ├── tokenizer.json └── vocab.json ``` -[权重-phi-2.7b](http://files.deepspark.org.cn:880/deepspark/phi-2) -格式如下: + +Get [phi-2.7b](http://files.deepspark.org.cn:880/deepspark/phi-2) + +The weights path is as follows: + ```bash phi-2/ ├── added_tokens.json @@ -67,12 +77,24 @@ phi-2/ ├── tokenizer_config.json ├── tokenizer.json └── vocab.json +``` + +### Install Dependencies +```bash +cd MoE-LLaVA +pip install --upgrade pip # enable PEP 660 support +pip3 install -e . 
+pip3 install --upgrade pydantic ``` +## Model Training -## Train ```bash cd scripts/v1/phi2 bash pretrain.sh ``` + +## References + +- [MoE-LLaVA](https://github.com/PKU-YuanGroup/MoE-LLaVA) diff --git a/multimodal/vision-language_model/MoE-LLaVA/README_qwen.md b/multimodal/vision-language_model/moe-llava-qwen-1.8b/pytorch/README.md similarity index 53% rename from multimodal/vision-language_model/MoE-LLaVA/README_qwen.md rename to multimodal/vision-language_model/moe-llava-qwen-1.8b/pytorch/README.md index c9725b959dca6a1d8312b465299f828c5d1b44aa..bf216ff0382fc357bc43d1629857c4602fb41b58 100644 --- a/multimodal/vision-language_model/MoE-LLaVA/README_qwen.md +++ b/multimodal/vision-language_model/moe-llava-qwen-1.8b/pytorch/README.md @@ -1,26 +1,30 @@ -# MoE-LLaVA-Qwen-1_8B -## Model description +# MoE-LLaVA-Qwen-1.8B -MoE-LLaVA: Mixture of Experts for Large Vision-Language Models, the Language Models is Qwen-1_8B +## Model Description +MoE-LLaVA is a cutting-edge vision-language model that combines the Mixture of Experts (MoE) architecture with the +Qwen-1_8B language model. It excels in multimodal tasks by efficiently processing and integrating visual and textual +information. The model leverages expert networks to specialize in different aspects of vision-language understanding, +enabling more accurate and context-aware responses. MoE-LLaVA is particularly effective in applications requiring +complex reasoning across visual and linguistic domains, such as image captioning and visual question answering. -## Prepare +## Model Preparation -### Install requirements +### Prepare Resources + +Go to MoE-LLaVA toolbox. ```bash +cd /toolbox/MoE-LLaVA +``` -cd MoE-LLaVA -pip install --upgrade pip # enable PEP 660 support -pip3 install -e .
-pip3 install --upgrade pydantic +Dataset and weights need to link to current path in "MoE-LLaVA/" + +Download from the [file server](http://files.deepspark.org.cn:880/deepspark) + +The dataset path is as follows: -``` -### load data and weights -数据集和权重需要链接到当前目录 MoE-LLaVA 里 -[数据集地址](http://files.deepspark.org.cn:880/deepspark/) -格式如下: ```bash MoE-LLaVA/ ├── gitattributes @@ -30,8 +34,11 @@ MoE-LLaVA/ ├── README.md └── train_json ``` -[权重-clip-vit-large-patch14-336](http://files.deepspark.org.cn:880/deepspark/openai/) -格式如下: + +Get [clip-vit-large-patch14-336](http://files.deepspark.org.cn:880/deepspark/openai/). + +The weights path is as follows: + ```bash openai/ └── clip-vit-large-patch14-336 @@ -47,8 +54,11 @@ openai/ ├── tokenizer.json └── vocab.json ``` -[权重-Qwen-1_8B](http://files.deepspark.org.cn:880/deepspark/Qwen-1_8B) -格式如下: + +Get [Qwen-1_8B](http://files.deepspark.org.cn:880/deepspark/Qwen-1_8B) + +The weights path is as follows: + ```bash Qwen-1_8B/ ├── assets @@ -76,12 +86,22 @@ Qwen-1_8B/ └── tokenizer_config.json ``` +### Install Dependencies + +```bash +cd MoE-LLaVA +pip install --upgrade pip # enable PEP 660 support +pip3 install -e . 
+pip3 install --upgrade pydantic +``` + +## Model Training -## Train ```bash cd scripts/v1/qwen bash pretrain.sh ``` +## References - +- [MoE-LLaVA](https://github.com/PKU-YuanGroup/MoE-LLaVA) diff --git a/multimodal/vision-language_model/MoE-LLaVA/README_stablelm.md b/multimodal/vision-language_model/moe-llava-stablelm-1.6b/pytorch/README.md similarity index 34% rename from multimodal/vision-language_model/MoE-LLaVA/README_stablelm.md rename to multimodal/vision-language_model/moe-llava-stablelm-1.6b/pytorch/README.md index 7f12257286ac4bf7f1fc10c8ef8854363fa52f95..4c34f8dcfc5c1876a32861073ac39902898902fd 100644 --- a/multimodal/vision-language_model/MoE-LLaVA/README_stablelm.md +++ b/multimodal/vision-language_model/moe-llava-stablelm-1.6b/pytorch/README.md @@ -1,24 +1,29 @@ -# MoE-LLaVA-stablelm-2-1_6b -## Model description +# MoE-LLaVA-StableLM-1.6B -MoE-LLaVA: Mixture of Experts for Large Vision-Language Models, the Language Models is stablelm-2-1_6b +## Model Description -## Prepare +MoE-LLaVA is a cutting-edge vision-language model that combines the Mixture of Experts (MoE) architecture with the +StableLM-1.6B language model. It excels in multimodal tasks by efficiently processing and integrating visual and textual +information. The model leverages expert networks to specialize in different aspects of vision-language understanding, +enabling more accurate and context-aware responses. MoE-LLaVA is particularly effective in applications requiring +complex reasoning across visual and linguistic domains, such as image captioning and visual question answering. -### Install requirements +## Model Preparation -```bash +### Prepare Resources -cd MoE-LLaVA -pip install --upgrade pip # enable PEP 660 support -pip3 install -e . -pip3 install --upgrade pydantic +Go to MoE-LLaVA toolbox. 
+```bash +cd /toolbox/MoE-LLaVA ``` -### load data and weights -数据集和权重需要链接到当前目录 MoE-LLaVA 里 -[数据集地址](http://files.deepspark.org.cn:880/deepspark/) -格式如下: + +Dataset and weights need to link to current path in "MoE-LLaVA/" + +Download from the [file server](http://files.deepspark.org.cn:880/deepspark) + +The dataset path is as follows: + ```bash MoE-LLaVA/ ├── gitattributes @@ -28,8 +33,11 @@ MoE-LLaVA/ ├── README.md └── train_json ``` -[权重-clip-vit-large-patch14-336](http://files.deepspark.org.cn:880/deepspark/openai/) -格式如下: + +Get [clip-vit-large-patch14-336](http://files.deepspark.org.cn:880/deepspark/openai/). + +The weights path is as follows: + ```bash openai/ └── clip-vit-large-patch14-336 @@ -45,8 +53,42 @@ openai/ ├── tokenizer.json └── vocab.json ``` -[权重-stablelm-2-1_6b](http://files.deepspark.org.cn:880/deepspark/stablelm-2-1_6b) -格式如下: + +Get [Qwen-1_8B](http://files.deepspark.org.cn:880/deepspark/Qwen-1_8B) + +The weights path is as follows: + +```bash +Qwen-1_8B/ +├── assets +│   ├── logo.jpg +│   ├── qwen_tokenizer.png +│   ├── tokenizer.png +│   └── wechat.png +├── cache_autogptq_cuda_256.cpp +├── cache_autogptq_cuda_kernel_256.cu +├── config.json +├── configuration_qwen.py +├── cpp_kernels.py +├── generation_config.json +├── gitattributes +├── LICENSE +├── model-00001-of-00002.safetensors +├── model-00002-of-00002.safetensors +├── modeling_qwen.py +├── model.safetensors.index.json +├── NOTICE +├── qwen_generation_utils.py +├── qwen.tiktoken +├── README.md +├── tokenization_qwen.py +└── tokenizer_config.json +``` + +Get [stablelm-2-1_6b](http://files.deepspark.org.cn:880/deepspark/stablelm-2-1_6b) + +The weights path is as follows: + ```bash stablelm-2-1_6b/ ├── config.json @@ -64,9 +106,24 @@ stablelm-2-1_6b/ └── vocab.json ``` +### Install Dependencies + +```bash + +cd MoE-LLaVA +pip install --upgrade pip # enable PEP 660 support +pip3 install -e . 
+pip3 install --upgrade pydantic + +``` + +## Model Training -## Train ```bash cd scripts/v1/stablelm-2-1_6b bash pretrain.sh ``` + +## References + +- [MoE-LLaVA](https://github.com/PKU-YuanGroup/MoE-LLaVA) diff --git a/multimodal/vision-language_model/MoE-LLaVA/assets/image.jpg b/toolbox/MoE-LLaVA/assets/image.jpg similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/assets/image.jpg rename to toolbox/MoE-LLaVA/assets/image.jpg diff --git a/multimodal/vision-language_model/MoE-LLaVA/assets/intro.jpg b/toolbox/MoE-LLaVA/assets/intro.jpg similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/assets/intro.jpg rename to toolbox/MoE-LLaVA/assets/intro.jpg diff --git a/multimodal/vision-language_model/MoE-LLaVA/assets/intro0.jpg b/toolbox/MoE-LLaVA/assets/intro0.jpg similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/assets/intro0.jpg rename to toolbox/MoE-LLaVA/assets/intro0.jpg diff --git a/multimodal/vision-language_model/MoE-LLaVA/assets/logo.png b/toolbox/MoE-LLaVA/assets/logo.png similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/assets/logo.png rename to toolbox/MoE-LLaVA/assets/logo.png diff --git a/multimodal/vision-language_model/MoE-LLaVA/assets/modelscope_logo.png b/toolbox/MoE-LLaVA/assets/modelscope_logo.png similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/assets/modelscope_logo.png rename to toolbox/MoE-LLaVA/assets/modelscope_logo.png diff --git a/multimodal/vision-language_model/MoE-LLaVA/cog.yaml b/toolbox/MoE-LLaVA/cog.yaml similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/cog.yaml rename to toolbox/MoE-LLaVA/cog.yaml diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/__init__.py b/toolbox/MoE-LLaVA/moellava/__init__.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/__init__.py rename to toolbox/MoE-LLaVA/moellava/__init__.py diff --git 
a/multimodal/vision-language_model/MoE-LLaVA/moellava/constants.py b/toolbox/MoE-LLaVA/moellava/constants.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/constants.py rename to toolbox/MoE-LLaVA/moellava/constants.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/conversation.py b/toolbox/MoE-LLaVA/moellava/conversation.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/conversation.py rename to toolbox/MoE-LLaVA/moellava/conversation.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_gpt_mmvet.py b/toolbox/MoE-LLaVA/moellava/eval/eval_gpt_mmvet.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_gpt_mmvet.py rename to toolbox/MoE-LLaVA/moellava/eval/eval_gpt_mmvet.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_gpt_review.py b/toolbox/MoE-LLaVA/moellava/eval/eval_gpt_review.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_gpt_review.py rename to toolbox/MoE-LLaVA/moellava/eval/eval_gpt_review.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_gpt_review_bench.py b/toolbox/MoE-LLaVA/moellava/eval/eval_gpt_review_bench.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_gpt_review_bench.py rename to toolbox/MoE-LLaVA/moellava/eval/eval_gpt_review_bench.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_gpt_review_visual.py b/toolbox/MoE-LLaVA/moellava/eval/eval_gpt_review_visual.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_gpt_review_visual.py rename to toolbox/MoE-LLaVA/moellava/eval/eval_gpt_review_visual.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_gqa.py b/toolbox/MoE-LLaVA/moellava/eval/eval_gqa.py similarity index 100% rename 
from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_gqa.py rename to toolbox/MoE-LLaVA/moellava/eval/eval_gqa.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_mmlu.py b/toolbox/MoE-LLaVA/moellava/eval/eval_mmlu.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_mmlu.py rename to toolbox/MoE-LLaVA/moellava/eval/eval_mmlu.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_pope.py b/toolbox/MoE-LLaVA/moellava/eval/eval_pope.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_pope.py rename to toolbox/MoE-LLaVA/moellava/eval/eval_pope.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_science_qa.py b/toolbox/MoE-LLaVA/moellava/eval/eval_science_qa.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_science_qa.py rename to toolbox/MoE-LLaVA/moellava/eval/eval_science_qa.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_science_qa_gpt4.py b/toolbox/MoE-LLaVA/moellava/eval/eval_science_qa_gpt4.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_science_qa_gpt4.py rename to toolbox/MoE-LLaVA/moellava/eval/eval_science_qa_gpt4.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_science_qa_gpt4_requery.py b/toolbox/MoE-LLaVA/moellava/eval/eval_science_qa_gpt4_requery.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_science_qa_gpt4_requery.py rename to toolbox/MoE-LLaVA/moellava/eval/eval_science_qa_gpt4_requery.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_textvqa.py b/toolbox/MoE-LLaVA/moellava/eval/eval_textvqa.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/eval_textvqa.py rename to 
toolbox/MoE-LLaVA/moellava/eval/eval_textvqa.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/generate_webpage_data_from_table.py b/toolbox/MoE-LLaVA/moellava/eval/generate_webpage_data_from_table.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/generate_webpage_data_from_table.py rename to toolbox/MoE-LLaVA/moellava/eval/generate_webpage_data_from_table.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/m4c_evaluator.py b/toolbox/MoE-LLaVA/moellava/eval/m4c_evaluator.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/m4c_evaluator.py rename to toolbox/MoE-LLaVA/moellava/eval/m4c_evaluator.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/README.txt b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/README.txt similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/README.txt rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/README.txt diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/abstract_algebra_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/abstract_algebra_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/abstract_algebra_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/abstract_algebra_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/anatomy_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/anatomy_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/anatomy_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/anatomy_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/astronomy_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/astronomy_dev.csv similarity index 100% 
rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/astronomy_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/astronomy_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/business_ethics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/business_ethics_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/business_ethics_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/business_ethics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/clinical_knowledge_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/clinical_knowledge_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/clinical_knowledge_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/clinical_knowledge_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_biology_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_biology_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_biology_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_biology_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_chemistry_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_chemistry_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_chemistry_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_chemistry_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_computer_science_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_computer_science_dev.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_computer_science_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_computer_science_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_mathematics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_mathematics_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_mathematics_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_mathematics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_medicine_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_medicine_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_medicine_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_medicine_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_physics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_physics_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_physics_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/college_physics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/computer_security_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/computer_security_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/computer_security_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/computer_security_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/conceptual_physics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/conceptual_physics_dev.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/conceptual_physics_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/conceptual_physics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/econometrics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/econometrics_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/econometrics_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/econometrics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/electrical_engineering_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/electrical_engineering_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/electrical_engineering_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/electrical_engineering_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/elementary_mathematics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/elementary_mathematics_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/elementary_mathematics_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/elementary_mathematics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/formal_logic_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/formal_logic_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/formal_logic_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/formal_logic_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/global_facts_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/global_facts_dev.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/global_facts_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/global_facts_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_biology_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_biology_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_biology_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_biology_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_chemistry_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_chemistry_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_chemistry_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_chemistry_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_computer_science_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_computer_science_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_computer_science_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_computer_science_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_european_history_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_european_history_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_european_history_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_european_history_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_geography_dev.csv 
b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_geography_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_geography_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_geography_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_government_and_politics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_government_and_politics_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_government_and_politics_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_government_and_politics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_macroeconomics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_macroeconomics_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_macroeconomics_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_macroeconomics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_mathematics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_mathematics_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_mathematics_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_mathematics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_microeconomics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_microeconomics_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_microeconomics_dev.csv rename to 
toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_microeconomics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_physics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_physics_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_physics_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_physics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_psychology_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_psychology_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_psychology_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_psychology_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_statistics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_statistics_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_statistics_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_statistics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_us_history_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_us_history_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_us_history_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_us_history_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_world_history_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_world_history_dev.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_world_history_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/high_school_world_history_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/human_aging_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/human_aging_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/human_aging_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/human_aging_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/human_sexuality_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/human_sexuality_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/human_sexuality_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/human_sexuality_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/international_law_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/international_law_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/international_law_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/international_law_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/jurisprudence_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/jurisprudence_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/jurisprudence_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/jurisprudence_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/logical_fallacies_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/logical_fallacies_dev.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/logical_fallacies_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/logical_fallacies_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/machine_learning_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/machine_learning_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/machine_learning_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/machine_learning_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/management_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/management_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/management_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/management_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/marketing_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/marketing_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/marketing_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/marketing_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/medical_genetics_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/medical_genetics_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/medical_genetics_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/medical_genetics_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/miscellaneous_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/miscellaneous_dev.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/miscellaneous_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/miscellaneous_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/moral_disputes_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/moral_disputes_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/moral_disputes_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/moral_disputes_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/moral_scenarios_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/moral_scenarios_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/moral_scenarios_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/moral_scenarios_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/nutrition_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/nutrition_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/nutrition_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/nutrition_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/philosophy_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/philosophy_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/philosophy_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/philosophy_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/prehistory_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/prehistory_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/prehistory_dev.csv rename to 
toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/prehistory_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_accounting_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_accounting_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_accounting_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_accounting_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_law_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_law_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_law_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_law_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_medicine_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_medicine_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_medicine_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_medicine_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_psychology_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_psychology_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_psychology_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/professional_psychology_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/public_relations_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/public_relations_dev.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/public_relations_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/public_relations_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/security_studies_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/security_studies_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/security_studies_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/security_studies_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/sociology_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/sociology_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/sociology_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/sociology_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/us_foreign_policy_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/us_foreign_policy_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/us_foreign_policy_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/us_foreign_policy_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/virology_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/virology_dev.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/virology_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/virology_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/world_religions_dev.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/world_religions_dev.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/dev/world_religions_dev.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/dev/world_religions_dev.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/possibly_contaminated_urls.txt b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/possibly_contaminated_urls.txt similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/possibly_contaminated_urls.txt rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/possibly_contaminated_urls.txt diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/abstract_algebra_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/abstract_algebra_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/abstract_algebra_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/abstract_algebra_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/anatomy_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/anatomy_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/anatomy_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/anatomy_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/astronomy_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/astronomy_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/astronomy_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/astronomy_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/business_ethics_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/business_ethics_test.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/business_ethics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/business_ethics_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/clinical_knowledge_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/clinical_knowledge_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/clinical_knowledge_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/clinical_knowledge_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/college_biology_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/college_biology_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/college_biology_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/college_biology_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/college_chemistry_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/college_chemistry_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/college_chemistry_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/college_chemistry_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/college_computer_science_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/college_computer_science_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/college_computer_science_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/college_computer_science_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/college_mathematics_test.csv 
b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/college_mathematics_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/college_mathematics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/college_mathematics_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/college_medicine_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/college_medicine_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/college_medicine_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/college_medicine_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/college_physics_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/college_physics_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/college_physics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/college_physics_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/computer_security_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/computer_security_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/computer_security_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/computer_security_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/conceptual_physics_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/conceptual_physics_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/conceptual_physics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/conceptual_physics_test.csv diff --git 
a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/econometrics_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/econometrics_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/econometrics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/econometrics_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/electrical_engineering_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/electrical_engineering_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/electrical_engineering_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/electrical_engineering_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/elementary_mathematics_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/elementary_mathematics_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/elementary_mathematics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/elementary_mathematics_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/formal_logic_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/formal_logic_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/formal_logic_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/formal_logic_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/global_facts_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/global_facts_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/global_facts_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/global_facts_test.csv diff 
--git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_biology_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_biology_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_biology_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_biology_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_chemistry_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_chemistry_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_chemistry_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_chemistry_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_computer_science_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_computer_science_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_computer_science_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_computer_science_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_european_history_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_european_history_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_european_history_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_european_history_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_geography_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_geography_test.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_geography_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_geography_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_government_and_politics_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_government_and_politics_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_government_and_politics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_government_and_politics_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_macroeconomics_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_macroeconomics_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_macroeconomics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_macroeconomics_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_mathematics_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_mathematics_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_mathematics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_mathematics_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_microeconomics_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_microeconomics_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_microeconomics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_microeconomics_test.csv diff --git 
a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_physics_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_physics_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_physics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_physics_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_psychology_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_psychology_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_psychology_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_psychology_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_statistics_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_statistics_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_statistics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_statistics_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_us_history_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_us_history_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_us_history_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_us_history_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_world_history_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_world_history_test.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_world_history_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/high_school_world_history_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/human_aging_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/human_aging_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/human_aging_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/human_aging_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/human_sexuality_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/human_sexuality_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/human_sexuality_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/human_sexuality_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/international_law_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/international_law_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/international_law_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/international_law_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/jurisprudence_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/jurisprudence_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/jurisprudence_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/jurisprudence_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/logical_fallacies_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/logical_fallacies_test.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/logical_fallacies_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/logical_fallacies_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/machine_learning_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/machine_learning_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/machine_learning_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/machine_learning_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/management_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/management_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/management_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/management_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/marketing_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/marketing_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/marketing_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/marketing_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/medical_genetics_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/medical_genetics_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/medical_genetics_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/medical_genetics_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/miscellaneous_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/miscellaneous_test.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/miscellaneous_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/miscellaneous_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/moral_disputes_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/moral_disputes_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/moral_disputes_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/moral_disputes_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/moral_scenarios_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/moral_scenarios_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/moral_scenarios_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/moral_scenarios_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/nutrition_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/nutrition_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/nutrition_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/nutrition_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/philosophy_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/philosophy_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/philosophy_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/philosophy_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/prehistory_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/prehistory_test.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/prehistory_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/prehistory_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_accounting_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_accounting_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_accounting_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_accounting_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_law_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_law_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_law_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_law_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_medicine_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_medicine_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_medicine_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_medicine_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_psychology_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_psychology_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_psychology_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/professional_psychology_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/public_relations_test.csv 
b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/public_relations_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/public_relations_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/public_relations_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/security_studies_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/security_studies_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/security_studies_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/security_studies_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/sociology_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/sociology_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/sociology_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/sociology_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/us_foreign_policy_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/us_foreign_policy_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/us_foreign_policy_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/us_foreign_policy_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/virology_test.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/virology_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/virology_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/virology_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/world_religions_test.csv 
b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/world_religions_test.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/test/world_religions_test.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/test/world_religions_test.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/abstract_algebra_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/abstract_algebra_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/abstract_algebra_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/abstract_algebra_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/anatomy_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/anatomy_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/anatomy_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/anatomy_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/astronomy_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/astronomy_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/astronomy_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/astronomy_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/business_ethics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/business_ethics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/business_ethics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/business_ethics_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/clinical_knowledge_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/clinical_knowledge_val.csv similarity index 100% 
rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/clinical_knowledge_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/clinical_knowledge_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/college_biology_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/college_biology_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/college_biology_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/college_biology_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/college_chemistry_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/college_chemistry_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/college_chemistry_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/college_chemistry_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/college_computer_science_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/college_computer_science_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/college_computer_science_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/college_computer_science_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/college_mathematics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/college_mathematics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/college_mathematics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/college_mathematics_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/college_medicine_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/college_medicine_val.csv 
similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/college_medicine_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/college_medicine_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/college_physics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/college_physics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/college_physics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/college_physics_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/computer_security_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/computer_security_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/computer_security_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/computer_security_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/conceptual_physics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/conceptual_physics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/conceptual_physics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/conceptual_physics_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/econometrics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/econometrics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/econometrics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/econometrics_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/electrical_engineering_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/electrical_engineering_val.csv similarity index 100% rename 
from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/electrical_engineering_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/electrical_engineering_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/elementary_mathematics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/elementary_mathematics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/elementary_mathematics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/elementary_mathematics_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/formal_logic_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/formal_logic_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/formal_logic_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/formal_logic_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/global_facts_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/global_facts_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/global_facts_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/global_facts_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_biology_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_biology_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_biology_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_biology_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_chemistry_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_chemistry_val.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_chemistry_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_chemistry_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_computer_science_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_computer_science_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_computer_science_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_computer_science_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_european_history_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_european_history_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_european_history_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_european_history_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_geography_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_geography_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_geography_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_geography_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_government_and_politics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_government_and_politics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_government_and_politics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_government_and_politics_val.csv diff --git 
a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_macroeconomics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_macroeconomics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_macroeconomics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_macroeconomics_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_mathematics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_mathematics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_mathematics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_mathematics_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_microeconomics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_microeconomics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_microeconomics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_microeconomics_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_physics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_physics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_physics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_physics_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_psychology_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_psychology_val.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_psychology_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_psychology_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_statistics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_statistics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_statistics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_statistics_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_us_history_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_us_history_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_us_history_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_us_history_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_world_history_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_world_history_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_world_history_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/high_school_world_history_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/human_aging_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/human_aging_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/human_aging_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/human_aging_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/human_sexuality_val.csv 
b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/human_sexuality_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/human_sexuality_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/human_sexuality_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/international_law_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/international_law_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/international_law_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/international_law_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/jurisprudence_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/jurisprudence_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/jurisprudence_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/jurisprudence_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/logical_fallacies_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/logical_fallacies_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/logical_fallacies_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/logical_fallacies_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/machine_learning_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/machine_learning_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/machine_learning_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/machine_learning_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/management_val.csv 
b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/management_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/management_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/management_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/marketing_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/marketing_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/marketing_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/marketing_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/medical_genetics_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/medical_genetics_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/medical_genetics_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/medical_genetics_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/miscellaneous_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/miscellaneous_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/miscellaneous_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/miscellaneous_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/moral_disputes_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/moral_disputes_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/moral_disputes_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/moral_disputes_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/moral_scenarios_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/moral_scenarios_val.csv similarity index 100% rename 
from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/moral_scenarios_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/moral_scenarios_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/nutrition_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/nutrition_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/nutrition_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/nutrition_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/philosophy_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/philosophy_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/philosophy_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/philosophy_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/prehistory_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/prehistory_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/prehistory_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/prehistory_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_accounting_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_accounting_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_accounting_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_accounting_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_law_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_law_val.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_law_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_law_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_medicine_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_medicine_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_medicine_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_medicine_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_psychology_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_psychology_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_psychology_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/professional_psychology_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/public_relations_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/public_relations_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/public_relations_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/public_relations_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/security_studies_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/security_studies_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/security_studies_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/security_studies_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/sociology_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/sociology_val.csv similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/sociology_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/sociology_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/us_foreign_policy_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/us_foreign_policy_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/us_foreign_policy_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/us_foreign_policy_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/virology_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/virology_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/virology_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/virology_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/world_religions_val.csv b/toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/world_religions_val.csv similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/mmlu_data/val/world_religions_val.csv rename to toolbox/MoE-LLaVA/moellava/eval/mmlu_data/val/world_religions_val.csv diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/model_qa.py b/toolbox/MoE-LLaVA/moellava/eval/model_qa.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/model_qa.py rename to toolbox/MoE-LLaVA/moellava/eval/model_qa.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/model_vqa.py b/toolbox/MoE-LLaVA/moellava/eval/model_vqa.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/model_vqa.py rename to toolbox/MoE-LLaVA/moellava/eval/model_vqa.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/model_vqa_loader.py 
b/toolbox/MoE-LLaVA/moellava/eval/model_vqa_loader.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/model_vqa_loader.py rename to toolbox/MoE-LLaVA/moellava/eval/model_vqa_loader.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/model_vqa_mmbench.py b/toolbox/MoE-LLaVA/moellava/eval/model_vqa_mmbench.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/model_vqa_mmbench.py rename to toolbox/MoE-LLaVA/moellava/eval/model_vqa_mmbench.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/model_vqa_qbench.py b/toolbox/MoE-LLaVA/moellava/eval/model_vqa_qbench.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/model_vqa_qbench.py rename to toolbox/MoE-LLaVA/moellava/eval/model_vqa_qbench.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/model_vqa_science.py b/toolbox/MoE-LLaVA/moellava/eval/model_vqa_science.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/model_vqa_science.py rename to toolbox/MoE-LLaVA/moellava/eval/model_vqa_science.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/qa_baseline_gpt35.py b/toolbox/MoE-LLaVA/moellava/eval/qa_baseline_gpt35.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/qa_baseline_gpt35.py rename to toolbox/MoE-LLaVA/moellava/eval/qa_baseline_gpt35.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/run_llava.py b/toolbox/MoE-LLaVA/moellava/eval/run_llava.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/run_llava.py rename to toolbox/MoE-LLaVA/moellava/eval/run_llava.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/summarize_gpt_review.py b/toolbox/MoE-LLaVA/moellava/eval/summarize_gpt_review.py similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/summarize_gpt_review.py rename to toolbox/MoE-LLaVA/moellava/eval/summarize_gpt_review.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/answer/answer_alpaca-13b.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/answer/answer_alpaca-13b.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/answer/answer_alpaca-13b.jsonl rename to toolbox/MoE-LLaVA/moellava/eval/table/answer/answer_alpaca-13b.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/answer/answer_bard.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/answer/answer_bard.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/answer/answer_bard.jsonl rename to toolbox/MoE-LLaVA/moellava/eval/table/answer/answer_bard.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/answer/answer_gpt35.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/answer/answer_gpt35.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/answer/answer_gpt35.jsonl rename to toolbox/MoE-LLaVA/moellava/eval/table/answer/answer_gpt35.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/answer/answer_llama-13b.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/answer/answer_llama-13b.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/answer/answer_llama-13b.jsonl rename to toolbox/MoE-LLaVA/moellava/eval/table/answer/answer_llama-13b.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/answer/answer_vicuna-13b.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/answer/answer_vicuna-13b.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/answer/answer_vicuna-13b.jsonl rename to 
toolbox/MoE-LLaVA/moellava/eval/table/answer/answer_vicuna-13b.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/caps_boxes_coco2014_val_80.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/caps_boxes_coco2014_val_80.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/caps_boxes_coco2014_val_80.jsonl rename to toolbox/MoE-LLaVA/moellava/eval/table/caps_boxes_coco2014_val_80.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/model.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/model.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/model.jsonl rename to toolbox/MoE-LLaVA/moellava/eval/table/model.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/prompt.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/prompt.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/prompt.jsonl rename to toolbox/MoE-LLaVA/moellava/eval/table/prompt.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/question.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/question.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/question.jsonl rename to toolbox/MoE-LLaVA/moellava/eval/table/question.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/results/test_sqa_llava_13b_v0.json b/toolbox/MoE-LLaVA/moellava/eval/table/results/test_sqa_llava_13b_v0.json similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/results/test_sqa_llava_13b_v0.json rename to toolbox/MoE-LLaVA/moellava/eval/table/results/test_sqa_llava_13b_v0.json diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/results/test_sqa_llava_lcs_558k_sqa_12e_vicuna_v1_3_13b.json 
b/toolbox/MoE-LLaVA/moellava/eval/table/results/test_sqa_llava_lcs_558k_sqa_12e_vicuna_v1_3_13b.json similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/results/test_sqa_llava_lcs_558k_sqa_12e_vicuna_v1_3_13b.json rename to toolbox/MoE-LLaVA/moellava/eval/table/results/test_sqa_llava_lcs_558k_sqa_12e_vicuna_v1_3_13b.json diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/review/review_alpaca-13b_vicuna-13b.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/review/review_alpaca-13b_vicuna-13b.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/review/review_alpaca-13b_vicuna-13b.jsonl rename to toolbox/MoE-LLaVA/moellava/eval/table/review/review_alpaca-13b_vicuna-13b.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/review/review_bard_vicuna-13b.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/review/review_bard_vicuna-13b.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/review/review_bard_vicuna-13b.jsonl rename to toolbox/MoE-LLaVA/moellava/eval/table/review/review_bard_vicuna-13b.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/review/review_gpt35_vicuna-13b.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/review/review_gpt35_vicuna-13b.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/review/review_gpt35_vicuna-13b.jsonl rename to toolbox/MoE-LLaVA/moellava/eval/table/review/review_gpt35_vicuna-13b.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/review/review_llama-13b_vicuna-13b.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/review/review_llama-13b_vicuna-13b.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/review/review_llama-13b_vicuna-13b.jsonl rename to 
toolbox/MoE-LLaVA/moellava/eval/table/review/review_llama-13b_vicuna-13b.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/reviewer.jsonl b/toolbox/MoE-LLaVA/moellava/eval/table/reviewer.jsonl similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/reviewer.jsonl rename to toolbox/MoE-LLaVA/moellava/eval/table/reviewer.jsonl diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/rule.json b/toolbox/MoE-LLaVA/moellava/eval/table/rule.json similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/table/rule.json rename to toolbox/MoE-LLaVA/moellava/eval/table/rule.json diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/figures/alpaca.png b/toolbox/MoE-LLaVA/moellava/eval/webpage/figures/alpaca.png similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/figures/alpaca.png rename to toolbox/MoE-LLaVA/moellava/eval/webpage/figures/alpaca.png diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/figures/bard.jpg b/toolbox/MoE-LLaVA/moellava/eval/webpage/figures/bard.jpg similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/figures/bard.jpg rename to toolbox/MoE-LLaVA/moellava/eval/webpage/figures/bard.jpg diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/figures/chatgpt.svg b/toolbox/MoE-LLaVA/moellava/eval/webpage/figures/chatgpt.svg similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/figures/chatgpt.svg rename to toolbox/MoE-LLaVA/moellava/eval/webpage/figures/chatgpt.svg diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/figures/llama.jpg b/toolbox/MoE-LLaVA/moellava/eval/webpage/figures/llama.jpg similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/figures/llama.jpg rename to toolbox/MoE-LLaVA/moellava/eval/webpage/figures/llama.jpg diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/figures/swords_FILL0_wght300_GRAD0_opsz48.svg b/toolbox/MoE-LLaVA/moellava/eval/webpage/figures/swords_FILL0_wght300_GRAD0_opsz48.svg similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/figures/swords_FILL0_wght300_GRAD0_opsz48.svg rename to toolbox/MoE-LLaVA/moellava/eval/webpage/figures/swords_FILL0_wght300_GRAD0_opsz48.svg diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/figures/vicuna.jpeg b/toolbox/MoE-LLaVA/moellava/eval/webpage/figures/vicuna.jpeg similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/figures/vicuna.jpeg rename to toolbox/MoE-LLaVA/moellava/eval/webpage/figures/vicuna.jpeg diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/index.html b/toolbox/MoE-LLaVA/moellava/eval/webpage/index.html similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/index.html rename to toolbox/MoE-LLaVA/moellava/eval/webpage/index.html diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/script.js b/toolbox/MoE-LLaVA/moellava/eval/webpage/script.js similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/script.js rename to toolbox/MoE-LLaVA/moellava/eval/webpage/script.js diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/styles.css b/toolbox/MoE-LLaVA/moellava/eval/webpage/styles.css similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/eval/webpage/styles.css rename to toolbox/MoE-LLaVA/moellava/eval/webpage/styles.css diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/mm_utils.py 
b/toolbox/MoE-LLaVA/moellava/mm_utils.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/mm_utils.py rename to toolbox/MoE-LLaVA/moellava/mm_utils.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/__init__.py b/toolbox/MoE-LLaVA/moellava/model/__init__.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/__init__.py rename to toolbox/MoE-LLaVA/moellava/model/__init__.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/apply_delta.py b/toolbox/MoE-LLaVA/moellava/model/apply_delta.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/apply_delta.py rename to toolbox/MoE-LLaVA/moellava/model/apply_delta.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/builder.py b/toolbox/MoE-LLaVA/moellava/model/builder.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/builder.py rename to toolbox/MoE-LLaVA/moellava/model/builder.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/consolidate.py b/toolbox/MoE-LLaVA/moellava/model/consolidate.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/consolidate.py rename to toolbox/MoE-LLaVA/moellava/model/consolidate.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_llama.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_llama.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_llama.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_llama.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_llama_moe.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_llama_moe.py similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_llama_moe.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_llama_moe.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_minicpm.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_minicpm.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_minicpm.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_minicpm.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_minicpm_moe.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_minicpm_moe.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_minicpm_moe.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_minicpm_moe.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_mistral.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_mistral.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_mistral.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_mistral.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_mistral_moe.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_mistral_moe.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_mistral_moe.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_mistral_moe.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_mpt.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_mpt.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_mpt.py rename to 
toolbox/MoE-LLaVA/moellava/model/language_model/llava_mpt.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_phi.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_phi.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_phi.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_phi.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_phi_moe.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_phi_moe.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_phi_moe.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_phi_moe.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_qwen.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_qwen.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_qwen.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_qwen.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_qwen1_5.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_qwen1_5.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_qwen1_5.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_qwen1_5.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_qwen1_5_moe.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_qwen1_5_moe.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_qwen1_5_moe.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_qwen1_5_moe.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_qwen_moe.py 
b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_qwen_moe.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_qwen_moe.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_qwen_moe.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_stablelm.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_stablelm.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_stablelm.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_stablelm.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_stablelm_moe.py b/toolbox/MoE-LLaVA/moellava/model/language_model/llava_stablelm_moe.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/llava_stablelm_moe.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/llava_stablelm_moe.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/minicpm/configuration_minicpm.py b/toolbox/MoE-LLaVA/moellava/model/language_model/minicpm/configuration_minicpm.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/minicpm/configuration_minicpm.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/minicpm/configuration_minicpm.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/minicpm/modeling_minicpm.py b/toolbox/MoE-LLaVA/moellava/model/language_model/minicpm/modeling_minicpm.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/minicpm/modeling_minicpm.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/minicpm/modeling_minicpm.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/adapt_tokenizer.py 
b/toolbox/MoE-LLaVA/moellava/model/language_model/mpt/adapt_tokenizer.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/adapt_tokenizer.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/mpt/adapt_tokenizer.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/attention.py b/toolbox/MoE-LLaVA/moellava/model/language_model/mpt/attention.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/attention.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/mpt/attention.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/blocks.py b/toolbox/MoE-LLaVA/moellava/model/language_model/mpt/blocks.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/blocks.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/mpt/blocks.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/configuration_mpt.py b/toolbox/MoE-LLaVA/moellava/model/language_model/mpt/configuration_mpt.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/configuration_mpt.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/mpt/configuration_mpt.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/custom_embedding.py b/toolbox/MoE-LLaVA/moellava/model/language_model/mpt/custom_embedding.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/custom_embedding.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/mpt/custom_embedding.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/flash_attn_triton.py b/toolbox/MoE-LLaVA/moellava/model/language_model/mpt/flash_attn_triton.py 
similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/flash_attn_triton.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/mpt/flash_attn_triton.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/hf_prefixlm_converter.py b/toolbox/MoE-LLaVA/moellava/model/language_model/mpt/hf_prefixlm_converter.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/hf_prefixlm_converter.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/mpt/hf_prefixlm_converter.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/meta_init_context.py b/toolbox/MoE-LLaVA/moellava/model/language_model/mpt/meta_init_context.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/meta_init_context.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/mpt/meta_init_context.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/modeling_mpt.py b/toolbox/MoE-LLaVA/moellava/model/language_model/mpt/modeling_mpt.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/modeling_mpt.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/mpt/modeling_mpt.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/norm.py b/toolbox/MoE-LLaVA/moellava/model/language_model/mpt/norm.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/norm.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/mpt/norm.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/param_init_fns.py b/toolbox/MoE-LLaVA/moellava/model/language_model/mpt/param_init_fns.py similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/mpt/param_init_fns.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/mpt/param_init_fns.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/phi/configuration_phi.py b/toolbox/MoE-LLaVA/moellava/model/language_model/phi/configuration_phi.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/phi/configuration_phi.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/phi/configuration_phi.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/phi/modeling_phi.py b/toolbox/MoE-LLaVA/moellava/model/language_model/phi/modeling_phi.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/phi/modeling_phi.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/phi/modeling_phi.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/qwen/configuration_qwen.py b/toolbox/MoE-LLaVA/moellava/model/language_model/qwen/configuration_qwen.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/qwen/configuration_qwen.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/qwen/configuration_qwen.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/qwen/cpp_kernels.py b/toolbox/MoE-LLaVA/moellava/model/language_model/qwen/cpp_kernels.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/qwen/cpp_kernels.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/qwen/cpp_kernels.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/qwen/modeling_qwen.py b/toolbox/MoE-LLaVA/moellava/model/language_model/qwen/modeling_qwen.py similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/qwen/modeling_qwen.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/qwen/modeling_qwen.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/qwen/qwen_generation_utils.py b/toolbox/MoE-LLaVA/moellava/model/language_model/qwen/qwen_generation_utils.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/qwen/qwen_generation_utils.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/qwen/qwen_generation_utils.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/qwen/tokenization_qwen.py b/toolbox/MoE-LLaVA/moellava/model/language_model/qwen/tokenization_qwen.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/qwen/tokenization_qwen.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/qwen/tokenization_qwen.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/stablelm/configuration_stablelm_epoch.py b/toolbox/MoE-LLaVA/moellava/model/language_model/stablelm/configuration_stablelm_epoch.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/stablelm/configuration_stablelm_epoch.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/stablelm/configuration_stablelm_epoch.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/stablelm/modeling_stablelm_epoch.py b/toolbox/MoE-LLaVA/moellava/model/language_model/stablelm/modeling_stablelm_epoch.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/stablelm/modeling_stablelm_epoch.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/stablelm/modeling_stablelm_epoch.py diff --git 
a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/stablelm/tokenization_arcade100k.py b/toolbox/MoE-LLaVA/moellava/model/language_model/stablelm/tokenization_arcade100k.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/language_model/stablelm/tokenization_arcade100k.py rename to toolbox/MoE-LLaVA/moellava/model/language_model/stablelm/tokenization_arcade100k.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/llava_arch.py b/toolbox/MoE-LLaVA/moellava/model/llava_arch.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/llava_arch.py rename to toolbox/MoE-LLaVA/moellava/model/llava_arch.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/make_delta.py b/toolbox/MoE-LLaVA/moellava/model/make_delta.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/make_delta.py rename to toolbox/MoE-LLaVA/moellava/model/make_delta.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_encoder/builder.py b/toolbox/MoE-LLaVA/moellava/model/multimodal_encoder/builder.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_encoder/builder.py rename to toolbox/MoE-LLaVA/moellava/model/multimodal_encoder/builder.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_encoder/clip_encoder.py b/toolbox/MoE-LLaVA/moellava/model/multimodal_encoder/clip_encoder.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_encoder/clip_encoder.py rename to toolbox/MoE-LLaVA/moellava/model/multimodal_encoder/clip_encoder.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_encoder/siglip_encoder.py b/toolbox/MoE-LLaVA/moellava/model/multimodal_encoder/siglip_encoder.py similarity index 100% rename from 
multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_encoder/siglip_encoder.py rename to toolbox/MoE-LLaVA/moellava/model/multimodal_encoder/siglip_encoder.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_projector/builder.py b/toolbox/MoE-LLaVA/moellava/model/multimodal_projector/builder.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_projector/builder.py rename to toolbox/MoE-LLaVA/moellava/model/multimodal_projector/builder.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_projector/pool_block.py b/toolbox/MoE-LLaVA/moellava/model/multimodal_projector/pool_block.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_projector/pool_block.py rename to toolbox/MoE-LLaVA/moellava/model/multimodal_projector/pool_block.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_projector/qformer.py b/toolbox/MoE-LLaVA/moellava/model/multimodal_projector/qformer.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_projector/qformer.py rename to toolbox/MoE-LLaVA/moellava/model/multimodal_projector/qformer.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_projector/simple_block.py b/toolbox/MoE-LLaVA/moellava/model/multimodal_projector/simple_block.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/multimodal_projector/simple_block.py rename to toolbox/MoE-LLaVA/moellava/model/multimodal_projector/simple_block.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/model/utils.py b/toolbox/MoE-LLaVA/moellava/model/utils.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/model/utils.py rename to toolbox/MoE-LLaVA/moellava/model/utils.py diff --git 
a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/__init__.py b/toolbox/MoE-LLaVA/moellava/serve/__init__.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/__init__.py rename to toolbox/MoE-LLaVA/moellava/serve/__init__.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/cli.py b/toolbox/MoE-LLaVA/moellava/serve/cli.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/cli.py rename to toolbox/MoE-LLaVA/moellava/serve/cli.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/cli_multi.py b/toolbox/MoE-LLaVA/moellava/serve/cli_multi.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/cli_multi.py rename to toolbox/MoE-LLaVA/moellava/serve/cli_multi.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/controller.py b/toolbox/MoE-LLaVA/moellava/serve/controller.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/controller.py rename to toolbox/MoE-LLaVA/moellava/serve/controller.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/examples/desert.jpg b/toolbox/MoE-LLaVA/moellava/serve/examples/desert.jpg similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/examples/desert.jpg rename to toolbox/MoE-LLaVA/moellava/serve/examples/desert.jpg diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/examples/extreme_ironing.jpg b/toolbox/MoE-LLaVA/moellava/serve/examples/extreme_ironing.jpg similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/examples/extreme_ironing.jpg rename to toolbox/MoE-LLaVA/moellava/serve/examples/extreme_ironing.jpg diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/examples/waterview.jpg b/toolbox/MoE-LLaVA/moellava/serve/examples/waterview.jpg similarity index 100% rename 
from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/examples/waterview.jpg rename to toolbox/MoE-LLaVA/moellava/serve/examples/waterview.jpg diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/gradio_utils.py b/toolbox/MoE-LLaVA/moellava/serve/gradio_utils.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/gradio_utils.py rename to toolbox/MoE-LLaVA/moellava/serve/gradio_utils.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/gradio_web_server.py b/toolbox/MoE-LLaVA/moellava/serve/gradio_web_server.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/gradio_web_server.py rename to toolbox/MoE-LLaVA/moellava/serve/gradio_web_server.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/model_worker.py b/toolbox/MoE-LLaVA/moellava/serve/model_worker.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/model_worker.py rename to toolbox/MoE-LLaVA/moellava/serve/model_worker.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/register_worker.py b/toolbox/MoE-LLaVA/moellava/serve/register_worker.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/register_worker.py rename to toolbox/MoE-LLaVA/moellava/serve/register_worker.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/test_message.py b/toolbox/MoE-LLaVA/moellava/serve/test_message.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/test_message.py rename to toolbox/MoE-LLaVA/moellava/serve/test_message.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/serve/utils.py b/toolbox/MoE-LLaVA/moellava/serve/utils.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/serve/utils.py rename to toolbox/MoE-LLaVA/moellava/serve/utils.py diff --git 
a/multimodal/vision-language_model/MoE-LLaVA/moellava/train/llama_flash_attn_monkey_patch.py b/toolbox/MoE-LLaVA/moellava/train/llama_flash_attn_monkey_patch.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/train/llama_flash_attn_monkey_patch.py rename to toolbox/MoE-LLaVA/moellava/train/llama_flash_attn_monkey_patch.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/train/llama_xformers_attn_monkey_patch.py b/toolbox/MoE-LLaVA/moellava/train/llama_xformers_attn_monkey_patch.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/train/llama_xformers_attn_monkey_patch.py rename to toolbox/MoE-LLaVA/moellava/train/llama_xformers_attn_monkey_patch.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/train/llava_trainer.py b/toolbox/MoE-LLaVA/moellava/train/llava_trainer.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/train/llava_trainer.py rename to toolbox/MoE-LLaVA/moellava/train/llava_trainer.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/train/train.py b/toolbox/MoE-LLaVA/moellava/train/train.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/train/train.py rename to toolbox/MoE-LLaVA/moellava/train/train.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/train/train_mem.py b/toolbox/MoE-LLaVA/moellava/train/train_mem.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/train/train_mem.py rename to toolbox/MoE-LLaVA/moellava/train/train_mem.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/train/train_xformers.py b/toolbox/MoE-LLaVA/moellava/train/train_xformers.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/train/train_xformers.py rename to toolbox/MoE-LLaVA/moellava/train/train_xformers.py diff --git 
a/multimodal/vision-language_model/MoE-LLaVA/moellava/utils.py b/toolbox/MoE-LLaVA/moellava/utils.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/utils.py rename to toolbox/MoE-LLaVA/moellava/utils.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/vis/vis1.py b/toolbox/MoE-LLaVA/moellava/vis/vis1.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/vis/vis1.py rename to toolbox/MoE-LLaVA/moellava/vis/vis1.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/vis/vis2.py b/toolbox/MoE-LLaVA/moellava/vis/vis2.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/vis/vis2.py rename to toolbox/MoE-LLaVA/moellava/vis/vis2.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/moellava/vis/vis3.py b/toolbox/MoE-LLaVA/moellava/vis/vis3.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/moellava/vis/vis3.py rename to toolbox/MoE-LLaVA/moellava/vis/vis3.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/predict.py b/toolbox/MoE-LLaVA/predict.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/predict.py rename to toolbox/MoE-LLaVA/predict.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/pyproject.toml b/toolbox/MoE-LLaVA/pyproject.toml similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/pyproject.toml rename to toolbox/MoE-LLaVA/pyproject.toml diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/.msc b/toolbox/MoE-LLaVA/scripts/.msc similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/.msc rename to toolbox/MoE-LLaVA/scripts/.msc diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/README.md b/toolbox/MoE-LLaVA/scripts/README.md similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/README.md rename to toolbox/MoE-LLaVA/scripts/README.md diff --git 
a/multimodal/vision-language_model/MoE-LLaVA/scripts/convert_gqa_for_eval.py b/toolbox/MoE-LLaVA/scripts/convert_gqa_for_eval.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/convert_gqa_for_eval.py rename to toolbox/MoE-LLaVA/scripts/convert_gqa_for_eval.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/convert_mmbench_for_submission.py b/toolbox/MoE-LLaVA/scripts/convert_mmbench_for_submission.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/convert_mmbench_for_submission.py rename to toolbox/MoE-LLaVA/scripts/convert_mmbench_for_submission.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/convert_mmvet_for_eval.py b/toolbox/MoE-LLaVA/scripts/convert_mmvet_for_eval.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/convert_mmvet_for_eval.py rename to toolbox/MoE-LLaVA/scripts/convert_mmvet_for_eval.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/convert_seed_for_submission.py b/toolbox/MoE-LLaVA/scripts/convert_seed_for_submission.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/convert_seed_for_submission.py rename to toolbox/MoE-LLaVA/scripts/convert_seed_for_submission.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/convert_sqa_to_llava.py b/toolbox/MoE-LLaVA/scripts/convert_sqa_to_llava.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/convert_sqa_to_llava.py rename to toolbox/MoE-LLaVA/scripts/convert_sqa_to_llava.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/convert_sqa_to_llava_base_prompt.py b/toolbox/MoE-LLaVA/scripts/convert_sqa_to_llava_base_prompt.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/convert_sqa_to_llava_base_prompt.py rename to toolbox/MoE-LLaVA/scripts/convert_sqa_to_llava_base_prompt.py diff --git 
a/multimodal/vision-language_model/MoE-LLaVA/scripts/convert_vizwiz_for_submission.py b/toolbox/MoE-LLaVA/scripts/convert_vizwiz_for_submission.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/convert_vizwiz_for_submission.py rename to toolbox/MoE-LLaVA/scripts/convert_vizwiz_for_submission.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/convert_vqav2_for_submission.py b/toolbox/MoE-LLaVA/scripts/convert_vqav2_for_submission.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/convert_vqav2_for_submission.py rename to toolbox/MoE-LLaVA/scripts/convert_vqav2_for_submission.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/extract_mm_projector.py b/toolbox/MoE-LLaVA/scripts/extract_mm_projector.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/extract_mm_projector.py rename to toolbox/MoE-LLaVA/scripts/extract_mm_projector.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/finetune.sh b/toolbox/MoE-LLaVA/scripts/finetune.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/finetune.sh rename to toolbox/MoE-LLaVA/scripts/finetune.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/finetune_full_schedule.sh b/toolbox/MoE-LLaVA/scripts/finetune_full_schedule.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/finetune_full_schedule.sh rename to toolbox/MoE-LLaVA/scripts/finetune_full_schedule.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/finetune_lora.sh b/toolbox/MoE-LLaVA/scripts/finetune_lora.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/finetune_lora.sh rename to toolbox/MoE-LLaVA/scripts/finetune_lora.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/finetune_qlora.sh b/toolbox/MoE-LLaVA/scripts/finetune_qlora.sh similarity index 100% rename 
from multimodal/vision-language_model/MoE-LLaVA/scripts/finetune_qlora.sh rename to toolbox/MoE-LLaVA/scripts/finetune_qlora.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/finetune_sqa.sh b/toolbox/MoE-LLaVA/scripts/finetune_sqa.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/finetune_sqa.sh rename to toolbox/MoE-LLaVA/scripts/finetune_sqa.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/merge_lora_weights.py b/toolbox/MoE-LLaVA/scripts/merge_lora_weights.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/merge_lora_weights.py rename to toolbox/MoE-LLaVA/scripts/merge_lora_weights.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/merge_moe_lora_weights.py b/toolbox/MoE-LLaVA/scripts/merge_moe_lora_weights.py similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/merge_moe_lora_weights.py rename to toolbox/MoE-LLaVA/scripts/merge_moe_lora_weights.py diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/pretrain.sh b/toolbox/MoE-LLaVA/scripts/pretrain.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/pretrain.sh rename to toolbox/MoE-LLaVA/scripts/pretrain.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/pretrain_xformers.sh b/toolbox/MoE-LLaVA/scripts/pretrain_xformers.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/pretrain_xformers.sh rename to toolbox/MoE-LLaVA/scripts/pretrain_xformers.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/sqa_eval_batch.sh b/toolbox/MoE-LLaVA/scripts/sqa_eval_batch.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/sqa_eval_batch.sh rename to toolbox/MoE-LLaVA/scripts/sqa_eval_batch.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/sqa_eval_gather.sh b/toolbox/MoE-LLaVA/scripts/sqa_eval_gather.sh 
similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/sqa_eval_gather.sh rename to toolbox/MoE-LLaVA/scripts/sqa_eval_gather.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/gqa.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/llava/gqa.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/gqa.sh rename to toolbox/MoE-LLaVA/scripts/v1/eval/llava/gqa.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/llavabench.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/llava/llavabench.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/llavabench.sh rename to toolbox/MoE-LLaVA/scripts/v1/eval/llava/llavabench.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/mmbench.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/llava/mmbench.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/mmbench.sh rename to toolbox/MoE-LLaVA/scripts/v1/eval/llava/mmbench.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/mmbench_cn.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/llava/mmbench_cn.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/mmbench_cn.sh rename to toolbox/MoE-LLaVA/scripts/v1/eval/llava/mmbench_cn.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/mme.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/llava/mme.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/mme.sh rename to toolbox/MoE-LLaVA/scripts/v1/eval/llava/mme.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/mmvet.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/llava/mmvet.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/mmvet.sh rename to 
toolbox/MoE-LLaVA/scripts/v1/eval/llava/mmvet.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/pope.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/llava/pope.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/pope.sh rename to toolbox/MoE-LLaVA/scripts/v1/eval/llava/pope.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/seed.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/llava/seed.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/seed.sh rename to toolbox/MoE-LLaVA/scripts/v1/eval/llava/seed.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/sqa.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/llava/sqa.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/sqa.sh rename to toolbox/MoE-LLaVA/scripts/v1/eval/llava/sqa.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/textvqa.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/llava/textvqa.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/textvqa.sh rename to toolbox/MoE-LLaVA/scripts/v1/eval/llava/textvqa.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/vizwiz.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/llava/vizwiz.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/vizwiz.sh rename to toolbox/MoE-LLaVA/scripts/v1/eval/llava/vizwiz.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/vqav2.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/llava/vqav2.sh similarity index 100% rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/llava/vqav2.sh rename to toolbox/MoE-LLaVA/scripts/v1/eval/llava/vqav2.sh diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/gqa.sh 
b/toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/gqa.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/gqa.sh
rename to toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/gqa.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/llavabench.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/llavabench.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/llavabench.sh
rename to toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/llavabench.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/mmbench.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/mmbench.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/mmbench.sh
rename to toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/mmbench.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/mmbench_cn.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/mmbench_cn.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/mmbench_cn.sh
rename to toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/mmbench_cn.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/mme.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/mme.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/mme.sh
rename to toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/mme.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/mmvet.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/mmvet.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/mmvet.sh
rename to toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/mmvet.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/pope.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/pope.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/pope.sh
rename to toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/pope.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/seed.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/seed.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/seed.sh
rename to toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/seed.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/sqa.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/sqa.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/sqa.sh
rename to toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/sqa.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/textvqa.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/textvqa.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/textvqa.sh
rename to toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/textvqa.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/vizwiz.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/vizwiz.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/vizwiz.sh
rename to toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/vizwiz.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/vqav2.sh b/toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/vqav2.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/eval/moe_llava/vqav2.sh
rename to toolbox/MoE-LLaVA/scripts/v1/eval/moe_llava/vqav2.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/openchat/finetune.sh b/toolbox/MoE-LLaVA/scripts/v1/openchat/finetune.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/openchat/finetune.sh
rename to toolbox/MoE-LLaVA/scripts/v1/openchat/finetune.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/openchat/finetune_moe.sh b/toolbox/MoE-LLaVA/scripts/v1/openchat/finetune_moe.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/openchat/finetune_moe.sh
rename to toolbox/MoE-LLaVA/scripts/v1/openchat/finetune_moe.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/openchat/pretrain.sh b/toolbox/MoE-LLaVA/scripts/v1/openchat/pretrain.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/openchat/pretrain.sh
rename to toolbox/MoE-LLaVA/scripts/v1/openchat/pretrain.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/phi2/finetune.sh b/toolbox/MoE-LLaVA/scripts/v1/phi2/finetune.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/phi2/finetune.sh
rename to toolbox/MoE-LLaVA/scripts/v1/phi2/finetune.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/phi2/finetune_moe.sh b/toolbox/MoE-LLaVA/scripts/v1/phi2/finetune_moe.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/phi2/finetune_moe.sh
rename to toolbox/MoE-LLaVA/scripts/v1/phi2/finetune_moe.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/phi2/pretrain.sh b/toolbox/MoE-LLaVA/scripts/v1/phi2/pretrain.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/phi2/pretrain.sh
rename to toolbox/MoE-LLaVA/scripts/v1/phi2/pretrain.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/qwen/finetune.sh b/toolbox/MoE-LLaVA/scripts/v1/qwen/finetune.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/qwen/finetune.sh
rename to toolbox/MoE-LLaVA/scripts/v1/qwen/finetune.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/qwen/finetune_moe.sh b/toolbox/MoE-LLaVA/scripts/v1/qwen/finetune_moe.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/qwen/finetune_moe.sh
rename to toolbox/MoE-LLaVA/scripts/v1/qwen/finetune_moe.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/qwen/pretrain.sh b/toolbox/MoE-LLaVA/scripts/v1/qwen/pretrain.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/qwen/pretrain.sh
rename to toolbox/MoE-LLaVA/scripts/v1/qwen/pretrain.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/stablelm/finetune.sh b/toolbox/MoE-LLaVA/scripts/v1/stablelm/finetune.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/stablelm/finetune.sh
rename to toolbox/MoE-LLaVA/scripts/v1/stablelm/finetune.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/stablelm/finetune_moe.sh b/toolbox/MoE-LLaVA/scripts/v1/stablelm/finetune_moe.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/stablelm/finetune_moe.sh
rename to toolbox/MoE-LLaVA/scripts/v1/stablelm/finetune_moe.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/v1/stablelm/pretrain.sh b/toolbox/MoE-LLaVA/scripts/v1/stablelm/pretrain.sh
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/v1/stablelm/pretrain.sh
rename to toolbox/MoE-LLaVA/scripts/v1/stablelm/pretrain.sh
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/zero2.json b/toolbox/MoE-LLaVA/scripts/zero2.json
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/zero2.json
rename to toolbox/MoE-LLaVA/scripts/zero2.json
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/zero2_offload.json b/toolbox/MoE-LLaVA/scripts/zero2_offload.json
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/zero2_offload.json
rename to toolbox/MoE-LLaVA/scripts/zero2_offload.json
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/zero3.json b/toolbox/MoE-LLaVA/scripts/zero3.json
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/zero3.json
rename to toolbox/MoE-LLaVA/scripts/zero3.json
diff --git a/multimodal/vision-language_model/MoE-LLaVA/scripts/zero3_offload.json b/toolbox/MoE-LLaVA/scripts/zero3_offload.json
similarity index 100%
rename from multimodal/vision-language_model/MoE-LLaVA/scripts/zero3_offload.json
rename to toolbox/MoE-LLaVA/scripts/zero3_offload.json