# ranx
**Repository Path**: flame-yuan-tian/ranx
## Basic Information
- **Project Name**: ranx
- **Description**: randx是一个提供排序算法评估的Lib,可用于评估和比较信息检索和推荐系统。
也提供集中融合算法和归一化策略(normalization strategies),并提供融合优化方法(排序算法训练)
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-09-12
- **Last Updated**: 2024-09-12
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
## ⚡️ Introduction
[ranx](https://github.com/AmenRa/ranx) ([raŋks]) is a library of fast ranking evaluation metrics implemented in [Python](https://en.wikipedia.org/wiki/Python_(programming_language)), leveraging [Numba](https://github.com/numba/numba) for high-speed [vector operations](https://en.wikipedia.org/wiki/Automatic_vectorization) and [automatic parallelization](https://en.wikipedia.org/wiki/Automatic_parallelization).
It offers a user-friendly interface to evaluate and compare [Information Retrieval](https://en.wikipedia.org/wiki/Information_retrieval) and [Recommender Systems](https://en.wikipedia.org/wiki/Recommender_system).
[ranx](https://github.com/AmenRa/ranx) allows you to perform statistical tests and export [LaTeX](https://en.wikipedia.org/wiki/LaTeX) tables for your scientific publications.
Moreover, [ranx](https://github.com/AmenRa/ranx) provides several [fusion algorithms](https://amenra.github.io/ranx/fusion) and [normalization strategies](https://amenra.github.io/ranx/normalization), and an automatic [fusion optimization](https://amenra.github.io/ranx/fusion/#optimize-fusion) functionality.
[ranx](https://github.com/AmenRa/ranx) also have a companion repository of pre-computed runs to facilitated model comparisons called [ranxhub](https://amenra.github.io/ranxhub).
On [ranxhub](https://amenra.github.io/ranxhub), you can download and share pre-computed runs for Information Retrieval datasets, such as [MSMARCO Passage Ranking](https://arxiv.org/abs/1611.09268).
[ranx](https://github.com/AmenRa/ranx) was featured in [ECIR 2022](https://ecir2022.org), [CIKM 2022](https://www.cikm2022.org), and [SIGIR 2023](https://sigir.org/sigir2023).
If you use [ranx](https://github.com/AmenRa/ranx) to evaluate results or conducting experiments involving fusion for your scientific publication, please consider citing it: [evaluation bibtex](https://dblp.org/rec/conf/ecir/Bassani22.html?view=bibtex), [fusion bibtex](https://dblp.org/rec/conf/cikm/BassaniR22.html?view=bibtex), [ranxhub bibtex](https://dblp.org/rec/conf/sigir/Bassani23.html?view=bibtex).
NB: [ranx](https://github.com/AmenRa/ranx) is not suited for evaluating classifiers. Please, refer to the [FAQ](https://amenra.github.io/ranx/faq) for further details.
For a quick overview, follow the [Usage](#-usage) section.
For a in-depth overview, follow the [Examples](#-examples) section.
## ✨ Features
### Metrics
* [Hits](https://amenra.github.io/ranx/metrics/#hits)
* [Hit Rate](https://amenra.github.io/ranx/metrics/#hit-rate-success)
* [Precision](https://amenra.github.io/ranx/metrics/#precision)
* [Recall](https://amenra.github.io/ranx/metrics/#recall)
* [F1](https://amenra.github.io/ranx/metrics/#f1)
* [r-Precision](https://amenra.github.io/ranx/metrics/#r-precision)
* [Bpref](https://amenra.github.io/ranx/metrics/#bpref)
* [Rank-biased Precision (RBP)](https://amenra.github.io/ranx/metrics/#rank-biased-precision)
* [Mean Reciprocal Rank (MRR)](https://amenra.github.io/ranx/metrics/#mean-reciprocal-rank)
* [Mean Average Precision (MAP)](https://amenra.github.io/ranx/metrics/#mean-average-precision)
* [Discounted Cumulative Gain (DCG)](https://amenra.github.io/ranx/metrics/#dcg)
* [Normalized Discounted Cumulative Gain (NDCG)](https://amenra.github.io/ranx/metrics/#ndcg)
The metrics have been tested against [TREC Eval](https://github.com/usnistgov/trec_eval) for correctness.
### Statistical Tests
* [Paired Student's t-Test](https://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/t_test.htm) (default)
* [Fisher's Randomization Test](https://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/fishrand.htm)
* [Tukey's HSD Test](https://www.itl.nist.gov/div898/handbook/prc/section4/prc471.htm)
Please, refer to [Smucker et al.](https://dl.acm.org/doi/10.1145/1321440.1321528), [Carterette](https://dl.acm.org/doi/10.1145/2094072.2094076), and [Fuhr](http://www.sigir.org/wp-content/uploads/2018/01/p032.pdf) for additional information on statistical tests for Information Retrieval.
### Off-the-shelf Qrels
You can load qrels from [ir-datasets](https://ir-datasets.com) as simply as:
```python
qrels = Qrels.from_ir_datasets("msmarco-document/dev")
```
A full list of the available qrels is provided [here](https://ir-datasets.com).
### Off-the-shelf Runs
You can load runs from [ranxhub](https://amenra.github.io/ranxhub/) as simply as:
```python
run = Run.from_ranxhub("run-id")
```
A full list of the available runs is provided [here](https://amenra.github.io/ranxhub//browse).
### Fusion Algorithms
| **Name** | **Name** | **Name** | **Name** | **Name** |
| -------------------------------------------------------- | ---------------------------------------------------------- | ----------------------------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------------------------ |
| [CombMIN](https://amenra.github.io/ranx/fusion/#combmin) | [CombMNZ](https://amenra.github.io/ranx/fusion/#combmnz) | [RRF](https://amenra.github.io/ranx/fusion/#reciprocal-rank-fusion-rrf) | [MAPFuse](https://amenra.github.io/ranx/fusion/#mapfuse) | [BordaFuse](https://amenra.github.io/ranx/fusion/#bordafuse) |
| [CombMED](https://amenra.github.io/ranx/fusion/#combmed) | [CombGMNZ](https://amenra.github.io/ranx/fusion/#combgmnz) | [RBC](https://amenra.github.io/ranx/fusion/#rank-biased-centroids-rbc) | [PosFuse](https://amenra.github.io/ranx/fusion/#posfuse) | [Weighted BordaFuse](https://amenra.github.io/ranx/fusion/#weighted-bordafuse) |
| [CombANZ](https://amenra.github.io/ranx/fusion/#combanz) | [ISR](https://amenra.github.io/ranx/fusion/#isr) | [WMNZ](https://amenra.github.io/ranx/fusion/#wmnz) | [ProbFuse](https://amenra.github.io/ranx/fusion/#probfuse) | [Condorcet](https://amenra.github.io/ranx/fusion/#condorcet) |
| [CombMAX](https://amenra.github.io/ranx/fusion/#combmax) | [Log_ISR](https://amenra.github.io/ranx/fusion/#log_isr) | [Mixed](https://amenra.github.io/ranx/fusion/#mixed) | [SegFuse](https://amenra.github.io/ranx/fusion/#segfuse) | [Weighted Condorcet](https://amenra.github.io/ranx/fusion/#weighted-condorcet) |
| [CombSUM](https://amenra.github.io/ranx/fusion/#combsum) | [LogN_ISR](https://amenra.github.io/ranx/fusion/#logn_isr) | [BayesFuse](https://amenra.github.io/ranx/fusion/#bayesfuse) | [SlideFuse](https://amenra.github.io/ranx/fusion/#slidefuse) | [Weighted Sum](https://amenra.github.io/ranx/fusion/#wighted-sum) |
Please, refer to the [documentation](https://amenra.github.io/ranx/fusion) for further details.
### Normalization Strategies
* [Min-Max Norm](https://amenra.github.io/ranx/normalization/#min-max-norm)
* [Min-Max Inverted Norm](https://amenra.github.io/ranx/normalization/#min-max-inverted-norm)
* [Max Norm](https://amenra.github.io/ranx/normalization/#sum-norm)
* [Sum Norm](https://amenra.github.io/ranx/normalization/#rank-norm)
* [ZMUV Norm](https://amenra.github.io/ranx/normalization/#max-norm)
* [Rank Norm](https://amenra.github.io/ranx/normalization/#zmuv-norm)
* [Borda Norm](https://amenra.github.io/ranx/normalization/#borda-norm)
Please, refer to the [documentation](https://amenra.github.io/ranx/fusion) for further details.
## 🔌 Requirements
```bash
python>=3.8
```
As of `v.0.3.5`, [ranx](https://github.com/AmenRa/ranx) requires `python>=3.8`.
## 💾 Installation
```bash
pip install ranx
```
## 💡 Usage
### Create Qrels and Run
```python
from ranx import Qrels, Run
qrels_dict = { "q_1": { "d_12": 5, "d_25": 3 },
"q_2": { "d_11": 6, "d_22": 1 } }
run_dict = { "q_1": { "d_12": 0.9, "d_23": 0.8, "d_25": 0.7,
"d_36": 0.6, "d_32": 0.5, "d_35": 0.4 },
"q_2": { "d_12": 0.9, "d_11": 0.8, "d_25": 0.7,
"d_36": 0.6, "d_22": 0.5, "d_35": 0.4 } }
qrels = Qrels(qrels_dict)
run = Run(run_dict)
```
### Evaluate
```python
from ranx import evaluate
# Compute score for a single metric
evaluate(qrels, run, "ndcg@5")
>>> 0.7861
# Compute scores for multiple metrics at once
evaluate(qrels, run, ["map@5", "mrr"])
>>> {"map@5": 0.6416, "mrr": 0.75}
```
### Compare
```python
from ranx import compare
# Compare different runs and perform Two-sided Paired Student's t-Test
report = compare(
qrels=qrels,
runs=[run_1, run_2, run_3, run_4, run_5],
metrics=["map@100", "mrr@100", "ndcg@10"],
max_p=0.01 # P-value threshold
)
```
Output:
```python
print(report)
```
```
# Model MAP@100 MRR@100 NDCG@10
--- ------- -------- -------- ---------
a model_1 0.320ᵇ 0.320ᵇ 0.368ᵇᶜ
b model_2 0.233 0.234 0.239
c model_3 0.308ᵇ 0.309ᵇ 0.330ᵇ
d model_4 0.366ᵃᵇᶜ 0.367ᵃᵇᶜ 0.408ᵃᵇᶜ
e model_5 0.405ᵃᵇᶜᵈ 0.406ᵃᵇᶜᵈ 0.451ᵃᵇᶜᵈ
```
### Fusion
```python
from ranx import fuse, optimize_fusion
best_params = optimize_fusion(
qrels=train_qrels,
runs=[train_run_1, train_run_2, train_run_3],
norm="min-max", # The norm. to apply before fusion
method="wsum", # The fusion algorithm to use (Weighted Sum)
metric="ndcg@100", # The metric to maximize
)
combined_test_run = fuse(
runs=[test_run_1, test_run_2, test_run_3],
norm="min-max",
method="wsum",
params=best_params,
)
```
## 📖 Examples
| Name | Link |
| ---------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Overview | [](https://colab.research.google.com/github/AmenRa/ranx/blob/master/notebooks/1_overview.ipynb) |
| Qrels and Run | [](https://colab.research.google.com/github/AmenRa/ranx/blob/master/notebooks/2_qrels_and_run.ipynb) |
| Evaluation | [](https://colab.research.google.com/github/AmenRa/ranx/blob/master/notebooks/3_evaluation.ipynb) |
| Comparison and Report | [](https://colab.research.google.com/github/AmenRa/ranx/blob/master/notebooks/4_comparison_and_report.ipynb) |
| Fusion | [](https://colab.research.google.com/github/AmenRa/ranx/blob/master/notebooks/5_fusion.ipynb) |
| Plot | [](https://colab.research.google.com/github/AmenRa/ranx/blob/master/notebooks/7_plot.ipynb) |
| Share your runs with [ranxhub](https://amenra.github.io/ranxhub) | [](https://colab.research.google.com/github/AmenRa/ranx/blob/master/notebooks/6_ranxhub.ipynb) |
## 📚 Documentation
Browse the [documentation](https://amenra.github.io/ranx) for more details and examples.
## 🎓 Citation
If you use [ranx](https://github.com/AmenRa/ranx) to evaluate results for your scientific publication, please consider citing our [ECIR 2022](https://ecir2022.org) paper:
BibTeX
```bibtex
@inproceedings{ranx,
author = {Elias Bassani},
title = {ranx: {A} Blazing-Fast Python Library for Ranking Evaluation and Comparison},
booktitle = {{ECIR} {(2)}},
series = {Lecture Notes in Computer Science},
volume = {13186},
pages = {259--264},
publisher = {Springer},
year = {2022},
doi = {10.1007/978-3-030-99739-7\_30}
}
```
If you use the fusion functionalities provided by [ranx](https://github.com/AmenRa/ranx) for conducting the experiments of your scientific publication, please consider citing our [CIKM 2022](https://www.cikm2022.org) paper:
BibTeX
```bibtex
@inproceedings{ranx.fuse,
author = {Elias Bassani and
Luca Romelli},
title = {ranx.fuse: {A} Python Library for Metasearch},
booktitle = {{CIKM}},
pages = {4808--4812},
publisher = {{ACM}},
year = {2022},
doi = {10.1145/3511808.3557207}
}
```
If you use pre-computed runs from [ranxhub](https://amenra.github.io/ranxhub) to make comparison for your scientific publication, please consider citing our [SIGIR 2023](https://sigir.org/sigir2023) paper:
BibTeX
```bibtex
@inproceedings{ranxhub,
author = {Elias Bassani},
title = {ranxhub: An Online Repository for Information Retrieval Runs},
booktitle = {{SIGIR}},
pages = {3210--3214},
publisher = {{ACM}},
year = {2023},
doi = {10.1145/3539618.3591823}
}
```
## 🎁 Feature Requests
Would you like to see other features implemented? Please, open a [feature request](https://github.com/AmenRa/ranx/issues/new?assignees=&labels=enhancement&template=feature_request.md&title=%5BFeature+Request%5D+title).
## 🤘 Want to contribute?
Would you like to contribute? Please, drop me an [e-mail](mailto:elias.bssn@gmail.com?subject=[GitHub]%20ranx).
## 📄 License
[ranx](https://github.com/AmenRa/ranx) is an open-sourced software licensed under the [MIT license](LICENSE).