# VisCodex

**Repository Path**: 910024445/VisCodex

## Basic Information

- **Project Name**: VisCodex
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-12-19
- **Last Updated**: 2025-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models

[arXiv:2508.09945](https://arxiv.org/abs/2508.09945) [Dataset on Hugging Face](https://huggingface.co/datasets/lingjie23/MultimodalCodingDataset)

This repository contains the code and data for the paper **"VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models"**.

The **code** will be released soon; please stay tuned.

The **MCD dataset**, developed for our research, is now available on [🤗 Multimodal Coding Dataset (MCD)](https://huggingface.co/datasets/lingjie23/MultimodalCodingDataset).

---

## 📌 Overview

VisCodex is a unified multimodal framework that merges **vision-language models** with **code-specialized LLMs** using a **task vector-based model merging** strategy. It delivers **state-of-the-art multimodal code generation** capabilities, enabling models to understand complex visual contexts and produce **syntactically correct, functionally accurate code**.
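
Since the official code has not been released yet, the following is only a minimal sketch of what task vector-based merging generally looks like, not the authors' implementation. It assumes that both fine-tuned models derive from the same base model with identical parameter names and shapes, and that the two task vectors are combined with a single interpolation weight `lam`. The names `task_vector`, `merge`, `lam`, and the toy parameter `"w"` are hypothetical and chosen for illustration.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def task_vector(finetuned: Params, base: Params) -> Params:
    """Return finetuned - base, element-wise, for every parameter in base."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def merge(base: Params, vlm: Params, coder: Params, lam: float = 0.5) -> Params:
    """Add a weighted sum of two task vectors back onto the shared base.

    lam weights the vision-language task vector and (1 - lam) weights the
    coding task vector. This weighting scheme is an assumption of the sketch.
    """
    tau_vlm = task_vector(vlm, base)
    tau_code = task_vector(coder, base)
    return {
        name: [
            b + lam * tv + (1.0 - lam) * tc
            for b, tv, tc in zip(base[name], tau_vlm[name], tau_code[name])
        ]
        for name in base
    }


# Tiny worked example with made-up numbers.
base = {"w": [1.0, 2.0]}
vlm = {"w": [2.0, 2.0]}    # shifted the first weight by +1.0
coder = {"w": [1.0, 4.0]}  # shifted the second weight by +2.0

merged = merge(base, vlm, coder, lam=0.5)
print(merged)  # {'w': [1.5, 3.0]}
```

In a real setting the same arithmetic would be applied tensor by tensor to the language-model weights of the two checkpoints, with components that exist in only one model (such as a vision encoder) handled separately.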