# SAAI

This repository contains the implementation of the SAAI score proposed in the paper:

> Ferdinand Rewicki, Joachim Denzler and Julia Niebling: "Anomalous Agreement: How to find the Ideal Number of Anomaly Classes in Correlated, Multivariate Time Series Data", AAAI Workshop on AI for Time Series Analysis (AI4TS), 2025

Furthermore, it contains the code to reproduce the results.

## Abstract

Detecting and classifying abnormal system states is critical for condition monitoring, but supervised methods often fall short due to the rarity of anomalies and the lack of labeled data. Therefore, clustering is often used to group similar abnormal behavior. However, evaluating cluster quality without ground truth is challenging, as existing measures such as the Silhouette Score (SSC) only evaluate the cohesion and separation of clusters and ignore possible prior knowledge about the data. To address this challenge, we introduce the Synchronized Anomaly Agreement Index (SAAI), which exploits the synchronicity of anomalies across multivariate time series to assess cluster quality. We demonstrate the effectiveness of SAAI by showing that maximizing SAAI improves accuracy on the task of finding the true number of anomaly classes K in correlated time series by more than 0.23 compared to SSC and by more than 0.32 compared to X-Means. We also show that clusters obtained by maximizing SAAI are easier to interpret than those obtained by maximizing SSC.

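To make the model-selection task from the abstract concrete, the following is a minimal sketch of choosing K by maximizing an internal cluster-quality score. It uses the Silhouette Score baseline mentioned above, not the SAAI, whose definition is given in the paper and is not reproduced here. The function `silhouette_score`, the toy `points`, and the `candidate_labelings` are hypothetical illustrations and are not part of this repository.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    per_point = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            per_point.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = mean(dist(points[i], points[j]) for j in own)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        per_point.append((b - a) / denom if denom > 0 else 0.0)
    return mean(per_point)


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Hypothetical candidate clusterings, keyed by the number of clusters K.
candidate_labelings = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 2, 1, 1, 1],
}

# Score every candidate and keep the K with the highest score.
score_by_k = {
    k: silhouette_score(points, labels)
    for k, labels in candidate_labelings.items()
}
best_k = max(score_by_k, key=score_by_k.get)
print(best_k, score_by_k)
```

The same selection loop works for any internal score: only the scoring function changes, which is where a measure that uses prior knowledge about the data would be plugged in.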
## Dependencies

* python >=3.9,<3.12 (tested with python==3.11)

## Installation

Set up a Python virtual environment (optional but recommended).

```bash
python -m venv venv
source venv/bin/activate
```

To reproduce the results from the aforementioned paper, run:

```bash
pip install .[experiments]
```

If you only want to use the SAAI, you can omit the soft dependencies:

```bash
pip install .
```

## Usage

### Start mlflow server

```bash
mkdir -p out/mlflow && mlflow server \
  --host 0.0.0.0 -p 5000 \
  --backend-store-uri sqlite:///out/mlflow/mlflow.sqlite \
  --default-artifact-root $(pwd)/out/mlflow/artifacts/ \
  --gunicorn-opts "--timeout 180"
```

### Run Experiments

#### Synthetic ICS

Increase K

```bash
python saai/experiments/synthetic_ics.py --multirun \
  data_generator.n_timeseries=50 \
  data_generator=random \
  data_generator.n_classes=2,3,4,5,6 \
  data_generator.r_sync=[0.5,1] \
  data_generator.n_anomalies=10 \
  gp.tags.run_id=increase_k \
  gp.k=[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]
```

Decrease r_sync

```bash
python saai/experiments/synthetic_ics.py --multirun \
  data_generator.n_timeseries=50 \
  data_generator=random \
  data_generator.n_classes=4 \
  data_generator.r_sync="range(0,1.0,0.1)" \
  data_generator.n_anomalies=10 \
  gp.tags.run_id=increase_rsync \
  gp.k=[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]
```

Increase D

```bash
python saai/experiments/synthetic_ics.py --multirun \
  data_generator.n_timeseries=50 \
  data_generator=random \
  data_generator.n_classes=4 \
  data_generator.r_sync=[0.5,1] \
  data_generator.n_anomalies=10 \
  data_generator.dim="range(2,11,1)" \
  gp.tags.run_id=increase_d \
  gp.k=[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]
```

Shifting Lags

```bash
python saai/experiments/synthetic_ics.py --multirun \
  data_generator.n_timeseries=50 \
  data_generator=random \
  data_generator.n_classes=4 \
  data_generator.r_sync=[0.5,1] \
  data_generator.n_anomalies=10 \
  data_generator.dim=2 \
  data_generator.lags="{1:-720},{1:-660},{1:-600},{1:-540},{1:-480},{1:-420},{1:-360},{1:-300},{1:-240},{1:-180},{1:-120},{1:-60},{1:0},{1:60},{1:120},{1:180},{1:240},{1:300},{1:360},{1:420},{1:480},{1:540},{1:600},{1:660},{1:720}" \
  gp.tags.run_id=shifting_correlation \
  gp.k=[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]
```

#### EDEN ISS 2020

```bash
python saai/experiments/edeniss2020.py --multirun gp.k=2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
```

### Evaluate Experiments & Create plots

The evaluation of the experimental results can be found in the following notebook:

```bash
jupyter lab jupyter/evaluate_results.ipynb
```

Citation
--------

If you use this software, please cite using the metadata from the `CITATION.cff` file. To cite the publication, please use:

```
@inproceedings{Rewicki2025Anomalous,
  author = {Rewicki, Ferdinand and Denzler, Joachim and Niebling, Julia},
  title = {Anomalous Agreement: How to find the Ideal Number of Anomaly Classes in Correlated, Multivariate Time Series Data},
  booktitle = {AAAI Workshop on AI for Time-series (AI4TS)},
  year = {2025},
  note = {Accepted for presentation},
  address = {Philadelphia, USA},
  url = {https://arxiv.org/abs/2501.07172},
}
```

License
-------

Please refer to the `LICENSE.md` file for further information about how the content is licensed.