# BokehDiff: Neural Lens Blur with One-Step Diffusion

![45 Teaser](banner/45-teaser.jpg)
![13 Teaser](banner/13-teaser.jpg)
![0 Teaser](banner/0-teaser.jpg)

- A physics-inspired self-attention (PISA) module design that aligns with the image formation process, incorporating a depth-dependent circle of confusion constraint and self-occlusion effects.
- A one-step inference scheme to exploit the diffusion prior, without introducing additional noise.
- A scalable paired data synthesis scheme, combining AIGC photorealistic foregrounds with transparency and conventional all-in-focus background images, balancing authenticity and scene diversity.

[[Paper](https://drive.google.com/file/d/1CYNf7-HTmwdLbW4rjnDUqg6o62MLN-TL/view?usp=sharing)]

The dataset synthesis is now performed on-the-fly: it only takes foreground images (with transparency) and background images as input, and the images with lens blur are generated in `dataset.py` in parallel with training.

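The lens blur in each synthesized pair depends on scene depth through the circle of confusion. As a rough illustration, the sketch below evaluates the textbook thin-lens circle of confusion diameter. The function name and its parameters are illustrative assumptions, and this is not necessarily the formula used in `dataset.py` or in the PISA module.

```python
def circle_of_confusion(subject_dist, focus_dist, focal_length, f_number):
    """Thin-lens circle-of-confusion diameter (same length unit as the inputs).

    Hypothetical helper for illustration only; not taken from the BokehDiff code.
    """
    if focus_dist <= focal_length:
        raise ValueError("focus distance must exceed the focal length")
    if subject_dist <= 0:
        raise ValueError("subject distance must be positive")
    aperture = focal_length / f_number  # aperture (entrance pupil) diameter
    magnification = focal_length / (focus_dist - focal_length)
    # Blur grows with the relative offset of the subject from the focal plane.
    return aperture * magnification * abs(subject_dist - focus_dist) / subject_dist


if __name__ == "__main__":
    # Example values (assumed): 50 mm lens at f/2, focused at 2 m; distances in mm.
    for dist in (1000, 2000, 4000, 8000):
        coc = circle_of_confusion(dist, 2000, 50, 2.0)
        print(f"subject at {dist} mm -> CoC diameter {coc:.4f} mm")
```

A subject on the focal plane gets a diameter of zero, and the blur grows as the subject moves away from it in either direction, which is the depth dependence the first bullet refers to.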
## Quick start

To set up the environment, run the following commands:

```bash
conda create -n bokehdiff python=3.10 peft transformers kornia pillow scikit-image piq lpips accelerate safetensors cupy xformers -c pytorch -c nvidia -c conda-forge
conda activate bokehdiff
pip install diffusers==0.32.1
pip install uv
uv pip install torch torchvision
```

After setting up the environment, we use [Depth-Anything-V2](https://github.com/DepthAnything/Depth-Anything-V2) and [BiRefNet](https://github.com/ZhengPeng7/BiRefNet) to prepare the data:

```bash
python prepare_data.py
```

The example data folder should now contain the depth and salient mask predictions.

### Inference

```bash
python inference_hf.py --test_data_dir "test_data/input/*" --output_dir bokehdiff_test --enable_xformers_memory_efficient_attention --data_id demo --K 20
```

The script above renders the prepared data with a bokeh strength of 20 and saves the results to `bokehdiff_test/demo/`.

### Training

Training requires foreground data with transparency, which is used to synthesize images with lens blur effects on-the-fly. I'll provide more details about this part when I have more spare time. 😢

If you already have some data in hand, you can place the foreground (PNG files with transparency) and background (ordinary all-in-focus images) in two folders, `/fg/` and `/bg/`.

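Before launching a long training run, it can help to confirm that the foreground PNGs really carry an alpha channel. The following is a minimal sketch, not part of the repository: `png_has_alpha` and `check_layout` are hypothetical helpers, and the assumption that `fg` and `bg` sit directly under a single data root is mine.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_has_alpha(path):
    """Return True if the PNG header declares an alpha channel.

    Only color types 4 (gray+alpha) and 6 (RGBA) are detected; palette PNGs
    that store transparency in a tRNS chunk are reported as opaque.
    """
    with open(path, "rb") as f:
        header = f.read(26)
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return False
    return header[25] in (4, 6)  # byte 25 of the file is the IHDR color type


def check_layout(data_root):
    """Count images under <data_root>/fg and <data_root>/bg (assumed layout)."""
    root = Path(data_root)
    fg_dir, bg_dir = root / "fg", root / "bg"
    for folder in (fg_dir, bg_dir):
        if not folder.is_dir():
            raise FileNotFoundError(f"missing folder: {folder}")
    foregrounds = sorted(fg_dir.glob("*.png"))
    backgrounds = sorted(p for p in bg_dir.iterdir() if p.is_file())
    opaque = [p.name for p in foregrounds if not png_has_alpha(p)]
    return {
        "foregrounds": len(foregrounds),
        "backgrounds": len(backgrounds),
        "opaque_foregrounds": opaque,  # PNGs without a declared alpha channel
    }
```

Calling `check_layout("path/to/data")` returns a small report. Any file listed under `opaque_foregrounds` has no declared alpha channel and is worth inspecting before it is used as a foreground.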
Pass your training data directory to `--train_data_dir` when running the training script (the value is left blank in the command below):

```bash
mkdir logs_bokehdiff
python train_lora_otf.py --train_data_dir \
    --pretrained_model_name_or_path SG161222/RealVisXL_V5.0 \
    --train_batch_size 1 --output_dir logs_bokehdiff \
    --mixed_precision no --opt_vae 1 \
    --max_train_steps 120000 --enable_xformers_memory_efficient_attention \
    --learning_rate 5e-5 --lr_scheduler cosine --lr_num_cycles 1 \
    --lr_warmup_steps 20 --resolution 512 \
    --lpips --edge --lambda_lpips 5 --checkpointing_steps 60000 \
    --gan_loss_type multilevel_sigmoid_s --cv_type convnext \
    --lambda_gan 0.1 --gan_step 30000
```

## Citation

If you find our work useful for your research, please cite our paper as:

```
@inproceedings{zhu2025bokehdiff,
  title     = {BokehDiff: Neural Lens Blur with One-Step Diffusion},
  author    = {Zhu, Chengxuan and Fan, Qingnan and Zhang, Qi and Chen, Jinwei and Zhang, Huaqi and Xu, Chao and Shi, Boxin},
  booktitle = {IEEE International Conference on Computer Vision},
  year      = {2025}
}
```

Feel free to [contact me](https://freebutuselesssoul.github.io/) if you're also interested in the possibility of combining AIGC with photography.