# e3_diffusion_for_molecules **Repository Path**: ahlih_admin/e3_diffusion_for_molecules ## Basic Information - **Project Name**: e3_diffusion_for_molecules - **Description**: No description available - **Primary Language**: Unknown - **License**: MIT - **Default Branch**: main - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2025-07-23 - **Last Updated**: 2025-07-23 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # EDM: E(3) Equivariant Diffusion Model for Molecule Generation in 3D. Official code release for the paper Equivariant Diffusion for Molecule Generation in 3D. **If** you want to set-up a rdkit environment, it may be easiest to install conda and run: ``conda create -c conda-forge -n my-rdkit-env rdkit`` and then install the other required packages from there. The code should still run without rdkit installed though. ### Training the EDM: ```python main_qm9.py --n_epochs 3000 --exp_name edm_qm9 --n_stability_samples 1000 --diffusion_noise_schedule polynomial_2 --diffusion_noise_precision 1e-5 --diffusion_steps 1000 --diffusion_loss_type l2 --batch_size 64 --nf 256 --n_layers 9 --lr 1e-4 --normalize_factors [1,4,10] --test_epochs 20 --ema_decay 0.9999``` A visualization of what happens during training: ### After training To analyze the sample quality of molecules ```python eval_analyze.py --model_path outputs/edm_qm9 --n_samples 10_000``` To visualize some molecules ```python eval_sample.py --model_path outputs/edm_qm9 --n_samples 10_000``` ### For GEOM-Drugs First follow the intructions at data/geom/README.md to set up the data. Training ```python main_geom_drugs.py --n_epochs 3000 --exp_name edm_geom_drugs --n_stability_samples 500 --diffusion_noise_schedule polynomial_2 --diffusion_steps 1000 --diffusion_noise_precision 1e-5 --diffusion_loss_type l2 --batch_size 64 --nf 256 --n_layers 4 --lr 1e-4 --normalize_factors [1,4,10] --test_epochs 1 --ema_decay 0.9999 --normalization_factor 1 --model egnn_dynamics --visualize_every_batch 10000``` Analyze ```python eval_analyze.py --model_path outputs/edm_geom_drugs --n_samples 10_000``` Sample ```python eval_sample.py --model_path outputs/edm_geom_drugs``` Small note: The GPUs we used for these experiment were pretty large. If the memory does not fit, try running at a smaller size. The main reason is that the EGNN runs with fully connected message passing, which becomes very memory intensive. ### For Conditional Generation #### Train a Conditional EDM ```python main_qm9.py --exp_name exp_cond_alpha --model egnn_dynamics --lr 1e-4 --nf 192 --n_layers 9 --save_model True --diffusion_steps 1000 --sin_embedding False --n_epochs 3000 --n_stability_samples 500 --diffusion_noise_schedule polynomial_2 --diffusion_noise_precision 1e-5 --dequantization deterministic --include_charges False --diffusion_loss_type l2 --batch_size 64 --normalize_factors [1,8,1] --conditioning alpha --dataset qm9_second_half``` The argument `--conditioning alpha` can be set to any of the following properties: `alpha`, `gap`, `homo`, `lumo`, `mu` `Cv`. The same applies to the following commands that also depend on alpha. #### Generate samples for different property values ```python eval_conditional_qm9.py --generators_path outputs/exp_cond_alpha --property alpha --n_sweeps 10 --task qualitative``` You can set `--generators_path` arguments to `outputs/exp_35_conditional_nf192_9l_alpha` to use our pre-trained model on alpha. #### Train a property classifier network ```cd qm9/property_prediction``` ```python main_qm9_prop.py --num_workers 2 --lr 5e-4 --property alpha --exp_name exp_class_alpha --model_name egnn``` Additionally, you can change the argument `--model_name egnn` by `--model_name numnodes` to train a classifier baseline that classifies only based on the number of nodes. #### Evaluate the property classifier on EDM Evaluate the trained property classifier on the samples generated by the trained EDM model ```python eval_conditional_qm9.py --generators_path outputs/exp_cond_alpha --classifiers_path qm9/property_prediction/outputs/exp_class_alpha --property alpha --iterations 100 --batch_size 100 --task edm``` To use a pre-trained generator and classifier model for alpha you can use the following arguments: `--generators_path outputs/exp_35_conditional_nf192_9l_alpha` and `--classifiers_path qm9/property_prediction/outputs/exp_class_alpha_pretrained`