# DP-RTF-Learning
A Python implementation of “**Learning Deep Direct-Path Relative Transfer Function for Binaural Sound Source Localization**”, IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2021.
+ **Contributions**
  - A DP-RTF learning framework is designed that embeds the sensor signals into a low-dimensional localization feature space, disentangling the localization cues from other factors such as source signals, noise, and reverberation.
    - **A novel DP-RTF learning network**
    - **Leveraging monaural speech enhancement to improve the robustness of DP-RTF estimation**
    - **Generalization to unseen binaural configurations**
  - The DP-RTF-learning-based localization method makes full use of the spatial and spectral cues, and is shown to outperform several other methods on both simulated and real-world data in noisy and reverberant environments.
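
For readers new to the feature itself: a relative transfer function is the ratio of the acoustic transfer functions of two channels, and the direct-path version keeps only the direct propagation path. The sketch below computes such a ratio from a pair of toy impulse responses and reads off the per-frequency level and phase differences. It is a minimal illustration under stated assumptions, not the network from the paper: the function names, the choice of the left channel as reference, the FFT size, and the toy responses are all hypothetical, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def dft(x, n_fft):
    """Naive DFT of a real sequence zero-padded to n_fft; returns bins 0..n_fft//2."""
    return [
        sum(v * cmath.exp(-2j * math.pi * k * n / n_fft) for n, v in enumerate(x))
        for k in range(n_fft // 2 + 1)
    ]


def dp_rtf(hrir_left, hrir_right, n_fft=8, eps=1e-12):
    """Ratio of direct-path transfer functions, right over left (assumed reference)."""
    h_left = dft(hrir_left, n_fft)
    h_right = dft(hrir_right, n_fft)
    # Guard against division by a (near-)zero reference bin.
    return [r / (l if abs(l) > eps else eps) for l, r in zip(h_left, h_right)]


def ild_db(rtf):
    """Interaural level difference per frequency bin, in dB."""
    return [20 * math.log10(max(abs(v), 1e-12)) for v in rtf]


def ipd_rad(rtf):
    """Interaural phase difference per frequency bin, wrapped to (-pi, pi]."""
    return [cmath.phase(v) for v in rtf]


# Toy direct-path responses (hypothetical, not CIPIC data):
# the right ear receives the sound 2 samples later and at half the amplitude.
toy_left = [1.0, 0.0, 0.0, 0.0]
toy_right = [0.0, 0.0, 0.5, 0.0]

rtf = dp_rtf(toy_left, toy_right)
for k, (level, phase) in enumerate(zip(ild_db(rtf), ipd_rad(rtf))):
    print(f"bin {k}: ILD = {level:6.2f} dB, IPD = {phase:+.3f} rad")
```

In this toy case the level difference is the same in every bin (the attenuation is frequency independent), while the phase difference grows with frequency until it wraps, which is why phase cues become ambiguous at higher frequencies.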
## Datasets
+ **Head-related impulse responses (HRIRs)**: from the CIPIC database
+ **Binaural room impulse responses (BRIRs)**: generated with the Roomsim toolbox
+ **TIMIT dataset**
+ **Diffuse noise**: generated with the arbitrary noise field generator, using noise signals from the NOISEX-92 database
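
As a rough picture of how data of this kind is usually combined, a noisy binaural recording can be simulated by convolving a source signal with the left and right BRIRs and adding noise at a chosen signal-to-noise ratio. The following is a minimal sketch under that assumption, not the repository's `common.getData` code: all function names and toy values are hypothetical, and a single noise gain is shared by both channels so that the inter-channel relation of the noise is left unchanged.

```python
import math


def convolve(signal, impulse_response):
    """Full linear convolution of two 1-D sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def power(x):
    """Mean power of a sequence."""
    return sum(v * v for v in x) / len(x)


def simulate_binaural(source, brir_left, brir_right, noise_left, noise_right, snr_db):
    """Convolve the source with each BRIR, then add noise at the requested SNR."""
    clean_l = convolve(source, brir_left)
    clean_r = convolve(source, brir_right)
    if len(clean_l) != len(clean_r):
        raise ValueError("both BRIRs must have the same length")
    n = len(clean_l)
    if len(noise_left) < n or len(noise_right) < n:
        raise ValueError("noise must be at least as long as the reverberant signal")
    noise_l, noise_r = noise_left[:n], noise_right[:n]
    # One gain for both channels, derived from the total signal and noise power.
    clean_power = power(clean_l) + power(clean_r)
    noise_power = power(noise_l) + power(noise_r)
    gain = math.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    left = [c + gain * v for c, v in zip(clean_l, noise_l)]
    right = [c + gain * v for c, v in zip(clean_r, noise_r)]
    return left, right


# Toy inputs (hypothetical values, far shorter than real speech or BRIRs).
toy_source = [1.0, -0.5, 0.25]
toy_brir_left = [1.0, 0.0, 0.3]
toy_brir_right = [0.0, 0.8, 0.2]
toy_noise_left = [0.1, -0.2, 0.05, 0.3, -0.1]
toy_noise_right = [-0.05, 0.15, 0.2, -0.25, 0.1]

left, right = simulate_binaural(
    toy_source, toy_brir_left, toy_brir_right,
    toy_noise_left, toy_noise_right, snr_db=10.0,
)
print(len(left), len(right))
```

With real data the nested-loop convolution would normally be replaced by an FFT-based routine; it is written out here only so the sketch runs without third-party packages.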
## Quick start
+ **Preparation**
- Add a soft link to the "common" folder inside the "DPRTF" folder
```
ln -s [original path] [target path]
```
- Generate the lists of source signals and BRIRs, direct-path relative transfer functions (DP-RTFs), room acoustic settings, and sensor signals for the training, validation, and test stages.
```
python -m common.getData --stage [*] --data [*]
```
+ **Training**
```
python run.py --gpu-id [*]
```
+ **Test**
```
python run.py --gpu-id [*] --test
```
+ **Pretrained models**
- exp/00000000/model_12.pth: trained with fixed data
- exp/00000001/model_52.pth: trained with random data (generated on-the-fly)
## Citation
If you find our work useful in your research, please consider citing:
```
@article{yang2021dprtf,
Author = "Bing Yang and Hong Liu and Xiaofei Li",
Title = "Learning deep direct-path relative transfer function for binaural sound source localization",
Journal = "{IEEE/ACM} Transactions on Audio, Speech, and Language Processing (TASLP)",
Volume = {29},
Pages = {3491-3503},
Year = {2021}}
```
```
@InProceedings{yang2021dprtf1,
author = "Bing Yang and Xiaofei Li and Hong Liu",
title = "Supervised direct-path relative transfer function learning for binaural sound source localization",
booktitle = "Proceedings of {IEEE} International Conference on Acoustics, Speech and Signal Processing (ICASSP)",
year = "2021",
pages = "825-829"}
```
## Licence
MIT