# VectorDBBench (VDBBench): A Benchmark Tool for VectorDB

[![version](https://img.shields.io/pypi/v/vectordb-bench.svg?color=blue)](https://pypi.org/project/vectordb-bench/)
[![Downloads](https://pepy.tech/badge/vectordb-bench)](https://pepy.tech/project/vectordb-bench)

## What is VDBBench

VDBBench is not just a collection of benchmark results for mainstream vector databases and cloud services; it is a tool for running the ultimate performance and cost-effectiveness comparison yourself. Designed with ease of use in mind, VDBBench helps users, even non-professionals, reproduce results or test new systems, making the hunt for the optimal choice among a plethora of cloud services and open-source vector databases a breeze.

Understanding the importance of user experience, we provide an intuitive visual interface. This empowers users both to initiate benchmarks with ease and to view comparative result reports, so benchmark results can be reproduced effortlessly. To add more relevance and practicality, we also provide cost-effectiveness reports, particularly for cloud services, which makes the benchmarking process more realistic and applicable.

Closely mimicking real-world production environments, we've set up diverse testing scenarios including insertion, searching, and filtered searching.
To provide you with credible and reliable data, we've included public datasets from actual production scenarios, such as [SIFT](http://corpus-texmex.irisa.fr/), [GIST](http://corpus-texmex.irisa.fr/), [Cohere](https://huggingface.co/datasets/Cohere/wikipedia-22-12/tree/main/en), and a dataset generated by OpenAI from an open-source [raw dataset](https://huggingface.co/datasets/allenai/c4). It's fascinating to discover how a relatively unknown open-source database might excel in certain circumstances!

Prepare to delve into the world of VDBBench, and let it guide you in uncovering your perfect vector database match.

VDBBench is sponsored by Zilliz, the leading open-source vector database company behind Milvus. Choose smarter with VDBBench - start your free test on [Zilliz Cloud](https://zilliz.com/) today!

**Leaderboard:** https://zilliz.com/benchmark

## Quick Start

### Prerequisites

```shell
python >= 3.11
```

### Install

**Install vectordb-bench with only PyMilvus**

```shell
pip install vectordb-bench
```

**Install all database clients**

```shell
pip install 'vectordb-bench[all]'
```

**Install a specific database client**

```shell
pip install 'vectordb-bench[pinecone]'
```

All supported database clients:

| Optional database client | Install command |
|--------------------------|-----------------|
| pymilvus, zilliz_cloud (*default*) | `pip install vectordb-bench` |
| all (*client requirements might conflict with each other*) | `pip install vectordb-bench[all]` |
| qdrant | `pip install vectordb-bench[qdrant]` |
| pinecone | `pip install vectordb-bench[pinecone]` |
| weaviate | `pip install vectordb-bench[weaviate]` |
| elastic, aliyun_elasticsearch | `pip install vectordb-bench[elastic]` |
| pgvector, pgvectorscale, pgdiskann, alloydb | `pip install vectordb-bench[pgvector]` |
| pgvecto.rs | `pip install vectordb-bench[pgvecto_rs]` |
| redis | `pip install vectordb-bench[redis]` |
| memorydb | `pip install vectordb-bench[memorydb]` |
| chromadb | `pip install vectordb-bench[chromadb]` |
| cockroachdb | `pip install vectordb-bench[cockroachdb]` |
| awsopensearch | `pip install vectordb-bench[opensearch]` |
| aliyun_opensearch | `pip install vectordb-bench[aliyun_opensearch]` |
| mongodb | `pip install vectordb-bench[mongodb]` |
| tidb | `pip install vectordb-bench[tidb]` |
| vespa | `pip install vectordb-bench[vespa]` |
| oceanbase | `pip install vectordb-bench[oceanbase]` |
| hologres | `pip install vectordb-bench[hologres]` |
| tencent_es | `pip install vectordb-bench[tencent_es]` |
| alisql | `pip install 'vectordb-bench[alisql]'` |
| doris | `pip install vectordb-bench[doris]` |

### Run

```shell
init_bench
```

OR:

### Run from the command line

```shell
vectordbbench [OPTIONS] COMMAND [ARGS]...
```

To list the clients that are runnable via the command line, execute `vectordbbench --help`:

```text
$ vectordbbench --help
Usage: vectordbbench [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.
Commands:
  pgvectorhnsw
  pgvectorivfflat
  test
  weaviate
```

To list the options for each command, execute `vectordbbench [command] --help`:

```text
$ vectordbbench pgvectorhnsw --help
Usage: vectordbbench pgvectorhnsw [OPTIONS]

Options:
  --config-file PATH              Read configuration from yaml file
  --drop-old / --skip-drop-old    Drop old or skip [default: drop-old]
  --load / --skip-load            Load or skip [default: load]
  --search-serial / --skip-search-serial
                                  Search serial or skip
                                  [default: search-serial]
  --search-concurrent / --skip-search-concurrent
                                  Search concurrent or skip
                                  [default: search-concurrent]
  --case-type [CapacityDim128|CapacityDim960|Performance768D100M|Performance768D10M|Performance768D1M|Performance768D10M1P|Performance768D1M1P|Performance768D10M99P|Performance768D1M99P|Performance1536D500K|Performance1536D5M|Performance1536D500K1P|Performance1536D5M1P|Performance1536D500K99P|Performance1536D5M99P|Performance1536D50K]
                                  Case type
  --db-label TEXT                 Db label, default: date in ISO format
                                  [default: 2024-05-20T20:26:31.113290]
  --dry-run                       Print just the configuration and exit
                                  without running the tasks
  --k INTEGER                     K value for number of nearest neighbors to
                                  search [default: 100]
  --concurrency-duration INTEGER  Adjusts the duration in seconds of each
                                  concurrency search [default: 30]
  --num-concurrency TEXT          Comma-separated list of concurrency values
                                  to test during concurrent search
                                  [default: 1,10,20]
  --concurrency-timeout INTEGER   Timeout (in seconds) to wait for a
                                  concurrency slot before failing. Set to a
                                  negative value to wait indefinitely.
                                  [default: 3600]
  --user-name TEXT                Db username [required]
  --password TEXT                 Db password [required]
  --host TEXT                     Db host [required]
  --db-name TEXT                  Db name [required]
  --maintenance-work-mem TEXT     Sets the maximum memory to be used for
                                  maintenance operations (index creation).
                                  Can be entered as a string with a unit like
                                  '64GB' or as an integer number of KB. This
                                  will set the parameters:
                                  max_parallel_maintenance_workers,
                                  max_parallel_workers &
                                  table(parallel_workers)
  --max-parallel-workers INTEGER  Sets the maximum number of parallel
                                  processes per maintenance operation (index
                                  creation)
  --m INTEGER                     hnsw m
  --ef-construction INTEGER       hnsw ef-construction
  --ef-search INTEGER             hnsw ef-search
  --quantization-type [none|bit|halfvec]
                                  quantization type for vectors (in index)
  --table-quantization-type [none|bit|halfvec]
                                  quantization type for vectors (in table).
                                  If equal to bit, the parameter
                                  quantization_type will be set to bit too.
  --custom-case-name TEXT         Custom case name i.e.
                                  PerformanceCase1536D50K
  --custom-case-description TEXT  Custom case description
  --custom-case-load-timeout INTEGER
                                  Custom case load timeout [default: 36000]
  --custom-case-optimize-timeout INTEGER
                                  Custom case optimize timeout
                                  [default: 36000]
  --custom-dataset-name TEXT      Dataset name i.e. OpenAI
  --custom-dataset-dir TEXT       Dataset directory i.e. openai_medium_500k
  --custom-dataset-size INTEGER   Dataset size i.e. 500000
  --custom-dataset-dim INTEGER    Dataset dimension
  --custom-dataset-metric-type TEXT
                                  Dataset distance metric [default: COSINE]
  --custom-dataset-file-count INTEGER
                                  Dataset file count
  --custom-dataset-use-shuffled / --skip-custom-dataset-use-shuffled
                                  Use shuffled custom dataset or skip
                                  [default: custom-dataset-use-shuffled]
  --custom-dataset-with-gt / --skip-custom-dataset-with-gt
                                  Custom dataset with ground truth or skip
                                  [default: custom-dataset-with-gt]
  --help                          Show this message and exit.
```

### Run awsopensearch from the command line

```shell
vectordbbench awsopensearch --db-label awsopensearch \
  --m 16 --ef-construction 256 \
  --host search-vector-db-prod-h4f6m4of6x7yp2rz7gdmots7w4.us-west-2.es.amazonaws.com --port 443 \
  --user vector --password '' \
  --case-type Performance1536D5M --number-of-indexing-clients 10 \
  --skip-load --num-concurrency 75
```

To list the options for awsopensearch, execute `vectordbbench awsopensearch --help`:

```text
$ vectordbbench awsopensearch --help
Usage: vectordbbench awsopensearch [OPTIONS]

Options:
  # Sharding and Replication
  --number-of-shards INTEGER      Number of primary shards for the index
  --number-of-replicas INTEGER    Number of replica copies for each primary
                                  shard

  # Indexing Performance
  --index-thread-qty INTEGER      Thread count for native engine indexing
  --index-thread-qty-during-force-merge INTEGER
                                  Thread count during force merge operations
  --number-of-indexing-clients INTEGER
                                  Number of concurrent indexing clients

  # Index Management
  --number-of-segments INTEGER    Target number of segments after merging
  --refresh-interval TEXT         How often to make new data available for
                                  search
  --force-merge-enabled BOOLEAN   Whether to perform force merge operation
  --flush-threshold-size TEXT     Size threshold for flushing the transaction
                                  log
  --engine TEXT                   Type of engine to use; valid values: faiss,
                                  lucene, s3vector

  # Memory Management
  --cb-threshold TEXT             k-NN memory circuit breaker threshold
  --ondisk                        On-disk mode with binary quantization (32x
                                  compression)
  --oversample-factor             Controls the degree of oversampling applied
                                  during search to improve recall
                                  (default 1.0)

  # Quantization Type
  --quantization-type TEXT        Which type of quantization to use; valid
                                  values: fp32, fp16, bq
  --help                          Show this message and exit.
```

### Run Elastic Cloud from the command line

Elastic Cloud supports multiple index types: HNSW, HNSW_INT8, HNSW_INT4, and HNSW_BBQ.
**Example: Run HNSW index test**

```shell
vectordbbench elasticcloudhnsw --db-label elastic-cloud-test \
  --cloud-id --password '' \
  --m 16 --ef-construction 100 --num-candidates 100 \
  --case-type Performance768D1M --number-of-shards 1 \
  --number-of-replicas 0 --refresh-interval 30s
```

**Example: Run HNSW_INT8 index test**

```shell
vectordbbench elasticcloudhnswint8 --db-label elastic-cloud-int8 \
  --cloud-id --password '' \
  --m 16 --ef-construction 200 --num-candidates 200 \
  --case-type Performance1536D50K --element-type float
```

**Example: Run HNSW_INT4 index test**

```shell
vectordbbench elasticcloudhnswint4 --db-label elastic-cloud-int4 \
  --cloud-id --password '' \
  --m 16 --ef-construction 200 --num-candidates 200 \
  --case-type Performance768D10M --use-rescore --oversample-ratio 2.0
```

**Example: Run HNSW_BBQ index test**

```shell
vectordbbench elasticcloudhnswbbq --db-label elastic-cloud-bbq \
  --cloud-id --password '' \
  --m 16 --ef-construction 200 --num-candidates 200 \
  --case-type Performance1536D5M --use-routing --use-force-merge
```

**Example: Run Label Filter Performance test**

```shell
vectordbbench elasticcloudhnsw --db-label elastic-cloud-label-filter \
  --cloud-id --password '' \
  --case-type LabelFilterPerformanceCase \
  --dataset-with-size-type "Medium OpenAI (1536dim, 500K)" \
  --label-percentage 0.001 \
  --m 16 --ef-construction 128 --num-candidates 100 \
  --num-concurrency 1,5 --number-of-shards 1
```

To list all options for Elastic Cloud, execute `vectordbbench elasticcloudhnsw --help`.
The following are Elastic Cloud-specific command-line options:

```text
$ vectordbbench elasticcloudhnsw --help
Usage: vectordbbench elasticcloudhnsw [OPTIONS]

Options:
  # Connection
  --cloud-id TEXT               Elastic Cloud ID [required]
  --password TEXT               Elastic Cloud password [required]

  # HNSW Index Parameters
  --m INTEGER                   HNSW M parameter [default: 16]
  --ef-construction INTEGER     HNSW efConstruction parameter [default: 100]
  --num-candidates INTEGER      Number of candidates for search
                                [default: 100]
  --element-type [float|byte]   Element type for vectors (float: 4 bytes,
                                byte: 1 byte) [default: float]

  # Index Configuration
  --number-of-shards INTEGER    Number of shards [default: 1]
  --number-of-replicas INTEGER  Number of replicas [default: 0]
  --refresh-interval TEXT       Index refresh interval [default: 30s]
  --merge-max-thread-count INTEGER
                                Maximum thread count for merge [default: 8]
  --use-force-merge BOOLEAN     Whether to use force merge [default: True]
  --use-routing BOOLEAN         Whether to use routing [default: False]
  --use-rescore BOOLEAN         Whether to use rescore [default: False]
  --oversample-ratio FLOAT      Oversample ratio for rescore [default: 2.0]

  # Common Options
  --case-type [CapacityDim128|CapacityDim960|Performance768D100M|...]
                                Case type
  --db-label TEXT               Db label, default: date in ISO format
  --k INTEGER                   K value for number of nearest neighbors to
                                search [default: 100]
  --num-concurrency TEXT        Comma-separated list of concurrency values
                                [default: 1,5,10,20,30,40,60,80]
  --help                        Show this message and exit.
```

### Run OceanBase from the command line

Execute tests for the index types HNSW, HNSW_SQ, or HNSW_BQ:

```shell
vectordbbench oceanbasehnsw --host xxx --port xxx --user root@mysql_tenant --database test \
  --m 16 --ef-construction 200 --case-type Performance1536D50K \
  --index-type HNSW --ef-search 100
```

To list the options for OceanBase, execute `vectordbbench oceanbasehnsw --help`. The following are some OceanBase-specific command-line options:
```text
$ vectordbbench oceanbasehnsw --help
Usage: vectordbbench oceanbasehnsw [OPTIONS]

Options:
  [...]
  --host TEXT                OceanBase host
  --user TEXT                OceanBase username [required]
  --password TEXT            OceanBase database password
  --database TEXT            Database name [required]
  --port INTEGER             OceanBase port [required]
  --m INTEGER                hnsw m [required]
  --ef-construction INTEGER  hnsw ef-construction [required]
  --ef-search INTEGER        hnsw ef-search [required]
  --index-type [HNSW|HNSW_SQ|HNSW_BQ]
                             Type of index to use. Supported values: HNSW,
                             HNSW_SQ, HNSW_BQ [required]
  --help                     Show this message and exit.
```

Execute tests for the index types IVF_FLAT, IVF_SQ8, or IVF_PQ:

```shell
vectordbbench oceanbaseivf --host xxx --port xxx --user root@mysql_tenant --database test \
  --nlist 1000 --sample_per_nlist 256 --case-type Performance768D1M \
  --index-type IVF_FLAT --ivf_nprobes 100
```

To list the options for OceanBase IVF, execute `vectordbbench oceanbaseivf --help`. The following are some OceanBase-specific command-line options:

```text
$ vectordbbench oceanbaseivf --help
Usage: vectordbbench oceanbaseivf [OPTIONS]

Options:
  [...]
  --host TEXT                 OceanBase host
  --user TEXT                 OceanBase username [required]
  --password TEXT             OceanBase database password
  --database TEXT             Database name [required]
  --port INTEGER              OceanBase port [required]
  --index-type [IVF_FLAT|IVF_SQ8|IVF_PQ]
                              Type of index to use. Supported values:
                              IVF_FLAT, IVF_SQ8, IVF_PQ [required]
  --nlist INTEGER             Number of cluster centers [required]
  --sample_per_nlist INTEGER  Cluster centers are computed from a total
                              sample of sample_per_nlist * nlist vectors
                              [required]
  --ivf_nprobes TEXT          How many cluster centers to search during the
                              query [required]
  --m INTEGER                 The number of sub-vectors each data vector is
                              divided into for IVF_PQ
  --help                      Show this message and exit.
```

### Run Hologres from the command line

It is recommended to use the following command for installation.
```shell
pip install 'vectordb-bench[hologres]' 'psycopg[binary]' pgvector
```

Execute tests for the HGraph index type:

```shell
NUM_PER_BATCH=10000 vectordbbench hologreshgraph --host Hologres_Endpoint --port 80 \
  --user ACCESS_ID --password ACCESS_KEY --database DATABASE_NAME \
  --m 64 --ef-construction 400 --case-type Performance768D10M \
  --index-type HGraph --ef-search 400 --k 10 --num-concurrency 1,60,70,75,80,90,95,100,110,120
```

To list the options for Hologres, execute `vectordbbench hologreshgraph --help`. The following are some Hologres-specific command-line options:

```text
$ vectordbbench hologreshgraph --help
Usage: vectordbbench hologreshgraph [OPTIONS]

Options:
  [...]
  --host TEXT                Hologres host
  --user TEXT                Hologres username [required]
  --password TEXT            Hologres database password
  --database TEXT            Hologres database name [required]
  --port INTEGER             Hologres port [required]
  --m INTEGER                hnsw m [required]
  --ef-construction INTEGER  hnsw ef-construction [required]
  --ef-search INTEGER        hnsw ef-search [required]
  --index-type [HGraph]      Type of index to use. Supported values: HGraph
                             [required]
  --help                     Show this message and exit.
```

### Run Doris from the command line

Doris supports an ANN index of type HNSW starting from version 4.0.x.

```shell
NUM_PER_BATCH=1000000 vectordbbench doris --http-port=8030 --port=9030 --db-name=vector_test --case-type=Performance768D1M --stream-load-rows-per-batch=500000
```

Use the `--session-var` flag if you want to test Doris with customized session variables.
For example:

```shell
NUM_PER_BATCH=1000000 vectordbbench doris --http-port=8030 --port=9030 --db-name=vector_test --case-type=Performance768D1M --stream-load-rows-per-batch=500000 --session-var enable_profile=True
```

More options:

```text
  --m INTEGER                     hnsw m
  --ef-construction INTEGER       hnsw ef-construction
  --username TEXT                 Username [default: root; required]
  --password TEXT                 Password [default: ""]
  --host TEXT                     Db host [default: 127.0.0.1; required]
  --port INTEGER                  Query port [default: 9030; required]
  --http-port INTEGER             HTTP port [default: 8030; required]
  --db-name TEXT                  Db name [default: test; required]
  --ssl / --no-ssl                Enable or disable SSL; for Doris Serverless,
                                  SSL must be enabled [default: no-ssl]
  --index-prop TEXT               Extra index PROPERTY as key=value
                                  (repeatable)
  --session-var TEXT              Session variable key=value applied to each
                                  SQL session (repeatable)
  --stream-load-rows-per-batch INTEGER
                                  Rows per single stream load request;
                                  default uses NUM_PER_BATCH
  --no-index                      Create table without ANN index
```

#### Using a configuration file

The vectordbbench command can optionally read some or all of its options from a YAML-formatted configuration file. By default, configuration files are expected to be in `vectordb_bench/config-files/`; this can be overridden by setting the environment variable `CONFIG_LOCAL_DIR` or by passing the full path to the file.
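For instance, both ways of pointing the tool at a configuration file might look like the following sketch (the file name `my_conf.yaml` and the directory `~/my-vdb-configs` are hypothetical placeholders):

```shell
# Reference a file in the default search location, vectordb_bench/config-files/:
vectordbbench pgvectorhnsw --config-file my_conf.yaml

# Or keep configs elsewhere, either via CONFIG_LOCAL_DIR or a full path:
CONFIG_LOCAL_DIR=~/my-vdb-configs vectordbbench pgvectorhnsw --config-file my_conf.yaml
vectordbbench pgvectorhnsw --config-file ~/my-vdb-configs/my_conf.yaml
```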
The required format is:

```yaml
commandname:
  parameter_name: parameter_value
  parameter_name: parameter_value
```

Example:

```yaml
pgvectorhnsw:
  db_label: pgConfigTest
  user_name: vectordbbench
  password: vectordbbench
  db_name: vectordbbench
  host: localhost
  m: 16
  ef_construction: 128
  ef_search: 128
milvushnsw:
  skip_search_serial: True
  case_type: Performance1536D50K
  uri: http://localhost:19530
  m: 16
  ef_construction: 128
  ef_search: 128
  drop_old: False
  load: False
elasticcloudhnsw:
  db_label: elastic-cloud-hnsw
  cloud_id:
  password:
  case_type: Performance768D1M
  m: 16
  ef_construction: 100
  num_candidates: 100
  number_of_shards: 1
  number_of_replicas: 0
  refresh_interval: 30s
  element_type: float
```

> Notes:
> - Options passed on the command line override the configuration file
> - Parameter names use underscores (`_`), not hyphens (`-`)
> - For `LabelFilterPerformanceCase` and `NewIntFilterPerformanceCase`, you must specify `dataset_with_size_type` in addition to `case_type`

#### Using a batch configuration file

The vectordbbench command can read a batch configuration file to run all the test cases listed in a YAML-formatted configuration file. By default, configuration files are expected to be in `vectordb_bench/config-files/`; this can be overridden by setting the environment variable `CONFIG_LOCAL_DIR` or by passing the full path to the file.
The required format is:

```yaml
commandname:
  - parameter_name: parameter_value
    another_parameter_name: parameter_value
```

Example:

```yaml
pgvectorhnsw:
  - db_label: pgConfigTest
    user_name: vectordbbench
    password: vectordbbench
    db_name: vectordbbench
    host: localhost
    m: 16
    ef_construction: 128
    ef_search: 128
milvushnsw:
  - skip_search_serial: True
    case_type: Performance1536D50K
    uri: http://localhost:19530
    m: 16
    ef_construction: 128
    ef_search: 128
    drop_old: False
    load: False
elasticcloudhnsw:
  - db_label: elastic-cloud-hnsw-test-1
    cloud_id:
    password:
    case_type: Performance768D1M
    m: 16
    ef_construction: 100
    num_candidates: 100
  - db_label: elastic-cloud-label-filter-0.1
    cloud_id:
    password:
    case_type: LabelFilterPerformanceCase
    dataset_with_size_type: "Medium OpenAI (1536dim, 500K)"
    label_percentage: 0.001
    m: 16
    ef_construction: 128
    num_candidates: 100
    num_concurrency: "1,5"
```

> Notes:
> - Options can only be passed through the configuration file
> - Parameter names use underscores (`_`), not hyphens (`-`)
> - For `LabelFilterPerformanceCase` and `NewIntFilterPerformanceCase`, you must specify `dataset_with_size_type` in addition to `case_type`

To run with a batch configuration file:

```shell
vectordbbench batchcli --batch-config-file
```

## Leaderboard

### Introduction

To facilitate the presentation of test results and provide a comprehensive performance analysis report, we offer a [leaderboard page](https://zilliz.com/benchmark). It allows you to choose among QPS, QP$, and latency metrics, and provides a comprehensive assessment of a system's performance based on the test results of various cases and a set of scoring mechanisms (introduced below). On the leaderboard, you can select the systems and models to be compared and filter out cases you do not want to consider. Comprehensive scores are always ranked from best to worst, and the specific test results of each query are presented in the list below them.

### Scoring Rules

1. For each case, select a base value and score each system based on relative values.
   - For QPS and QP$, we use the highest value as the reference, denoted `base_QPS` or `base_QP$`; the score of each system is `(QPS/base_QPS) * 100` or `(QP$/base_QP$) * 100`.
   - For Latency, we use the lowest value as the reference, denoted `base_Latency`; the score of each system is `(base_Latency + 10ms)/(Latency + 10ms) * 100`.

   We want to give equal weight to different cases, and not let a case with high absolute result values become the sole driver of the overall score; therefore, when scoring different systems within each case, we use relative values. For Latency, we add 10 ms to both the numerator and the denominator so that if every system performs particularly well in a case, the advantage is not infinitely magnified as latency tends to 0.

2. Systems that fail or time out in a particular case receive a score based on a value worse than the worst result by a factor of two: half the lowest value for QPS or QP$, and twice the maximum value for Latency.

3. For each system, we take the geometric mean of its scores across all cases as its comprehensive score for a given metric.

## Build on your own

### Install requirements

```shell
pip install -e '.[test]'
pip install -e '.[pinecone]'
```

### Run test server

```shell
python -m vectordb_bench
```

OR:

```shell
init_bench
```

OR:

If you are using a [dev container](https://code.visualstudio.com/docs/devcontainers/containers), create the following dataset directory first:

```shell
# Mount local ~/vectordb_bench/dataset to the container's /tmp/vectordb_bench/dataset.
# If you are not comfortable with the path name, feel free to change it in devcontainer.json
mkdir -p ~/vectordb_bench/dataset
```

After reopening the repository in the container, run `python -m vectordb_bench` in the container's bash.

### Check coding styles

```shell
make lint
```

To fix coding styles automatically:

```shell
make format
```

## How does it work?
### Result Page

![image](https://github.com/zilliztech/VectorDBBench/assets/105927039/8a981327-c1c6-4796-8a85-c86154cb5472)

This is the main page of VDBBench. It displays the standard benchmark results we provide, along with the results of all tests performed by users themselves. Results from multiple tests can be selected and compared simultaneously.

The standard benchmark results displayed here cover all 15 cases that we currently support for 6 of our clients (Milvus, Zilliz Cloud, Elasticsearch, Qdrant Cloud, Weaviate Cloud, and PgVector). However, because some systems cannot complete all the tests due to issues like out-of-memory (OOM) errors or timeouts, not all clients are included in every case.

All standard benchmark results are generated by a client running on an 8-core, 32 GB host located in the same region as the server being tested. The client host is equipped with an `Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz` processor, and the servers for the open-source systems tested in our benchmarks also run on hosts with the same type of processor.

### Run Test Page

1. First, select the systems to be tested; multiple selections are allowed. Once selected, corresponding forms pop up to gather the information needed to use the chosen databases. The db_label is used to differentiate different instances of the same system; we recommend filling in the host size or instance type here (as we do in our standard results).
2. Next, select the test cases you want to perform. You can select multiple cases at once, and a form to collect the corresponding parameters will appear.
3. Finally, provide a task label to distinguish different test results. Using the same label for different tests will overwrite the previous results.

Currently, only one task can run at a time.
![image](vectordb_bench/fig/run_test_select_db.png)
![image](vectordb_bench/fig/run_test_select_case.png)
![image](vectordb_bench/fig/run_test_submit.png)

## Module

### Code Structure

![image](https://github.com/zilliztech/VectorDBBench/assets/105927039/8c06512e-5419-4381-b084-9c93aed59639)

### Client

Our client module is designed with flexibility and extensibility in mind, aiming to integrate APIs from different systems seamlessly. It currently supports Milvus, Zilliz Cloud, Elasticsearch, Pinecone, Qdrant Cloud, Weaviate Cloud, PgVector, Redis, Chroma, CockroachDB, and more. Stay tuned for additional options, as we are continually working on extending our reach to other systems.

### Benchmark Cases

We've developed a set of comprehensive benchmark cases to test the various capabilities of vector databases, each designed to give you a different piece of the puzzle. These cases are categorized into four main types:

#### Capacity Case

- **Large Dim:** Tests the database's loading capacity by inserting large-dimension vectors (GIST 100K vectors, 960 dimensions) until it is fully loaded. The final number of inserted vectors is reported.
- **Small Dim:** Similar to the Large Dim case but uses small-dimension vectors (SIFT 500K vectors, 128 dimensions).

#### Search Performance Case

- **XLarge Dataset:** Measures search performance with a massive dataset (LAION 100M vectors, 768 dimensions) at varying levels of parallelism. The results include index building time, recall, latency, and maximum QPS.
- **Large Dataset:** Similar to the XLarge Dataset case, but with slightly smaller datasets (10M-1024dim, 10M-768dim, 5M-1536dim).
- **Medium Dataset:** A case using medium datasets (1M-1024dim, 1M-768dim, 500K-1536dim).
- **Small Dataset:** For development (100K-768dim, 50K-1536dim).

#### Filtering Search Performance Case

- **Int-Filter Cases:** Evaluates search performance with an int-based filter expression (e.g., "id >= 2,000").
- **Label-Filter Cases:** Evaluates search performance with label-based filter expressions (e.g., "color == 'red'"). The test includes randomly generated labels to simulate real-world filtering scenarios.

#### Streaming Cases

- **Insertion-Under-Load Case:** Evaluates search performance while maintaining a constant insertion workload. VDBBench applies a steady stream of insert requests at a fixed rate to simulate real-world scenarios where search operations must perform reliably under continuous data ingestion.

Each case provides an in-depth examination of a vector database's abilities, giving you a comprehensive view of its performance.

#### Custom Dataset for Performance Case

Through the `/custom` page, users can define their own performance case using local datasets. After saving, the corresponding case can be selected from the `/run_test` page to perform the test.

![image](vectordb_bench/fig/custom_dataset.png)
![image](vectordb_bench/fig/custom_case_run_test.png)

We have strict requirements for the dataset format; please follow them.

- `Folder Path` - The path to the folder containing all the files. Please ensure that all files in the folder are in `Parquet` format.
  - Vectors data files: The file must be named `train.parquet` and should have two columns: `id` as an incrementing `int` and `emb` as an array of `float32`.
  - Query test vectors: The file must be named `test.parquet` and should have two columns: `id` as an incrementing `int` and `emb` as an array of `float32`.
    - We recommend limiting the number of test query vectors to around 1,000. When conducting concurrent query tests, VDBBench creates a large number of processes. To minimize additional communication overhead during testing, we prepare a complete set of test queries for each process, allowing them to run independently.
      However, this means that as the number of concurrent processes increases, the number of copied query vectors also increases significantly, which can place substantial pressure on memory resources.
  - Ground truth file: The file must be named `neighbors.parquet` and should have two columns: `id` corresponding to the query vectors and `neighbors_id` as an array of `int`.
- `Train File Count` - If the vector file is too large, you can split it into multiple files. The naming format for the split files should be `train-[index]-of-[file_count].parquet`. For example, `train-01-of-10.parquet` represents the second file (0-indexed) among 10 split files.
- `Use Shuffled Data` - If you check this option, the vector data files need to be modified. VDBBench will load the data labeled with `shuffle`, for example, `shuffle_train.parquet` instead of `train.parquet` and `shuffle_train-04-of-10.parquet` instead of `train-04-of-10.parquet`. The `id` column in the shuffled data can be in any order.

## Goals

The goals of this benchmark are:

### Reproducibility & Usability

One of the primary goals of VDBBench is to enable users to reproduce benchmark results swiftly and easily, or to test their own customized scenarios. We believe that lowering the barriers to entry for conducting these tests will deepen the community's understanding and improvement of vector databases. We aim to create an environment where any user, regardless of technical expertise, can quickly set up and run benchmarks, and view and analyze results in an intuitive manner.

### Representability & Realism

VDBBench aims to provide a comprehensive, multi-faceted testing environment that accurately represents the complexity of vector databases. By moving beyond a simple speed test for algorithms, we hope to contribute to a better understanding of vector databases in real-world scenarios.
By incorporating as many complex scenarios as possible, including a variety of test cases and datasets, we aim to reflect realistic conditions and offer results of practical significance to our community. Our goal is to deliver benchmarking results that can drive tangible improvements in the development and usage of vector databases.

## Contribution

### General Guidelines

1. Fork the repository and create a new branch for your changes.
2. Adhere to coding conventions and formatting guidelines.
3. Use clear commit messages to document the purpose of your changes.

### Adding New Clients

**Step 1: Creating New Client Files**

1. Navigate to the vectordb_bench/backend/clients directory.
2. Create a new folder for your client, for example, "new_client".
3. Inside the "new_client" folder, create two files: new_client.py and config.py.

**Step 2: Implement new_client.py and config.py**

1. Open new_client.py and define the NewClient class, which should inherit from the VectorDB abstract class defined in clients/api.py. The VectorDB class serves as the API for benchmarking, and all DB clients must implement this abstract class.

Example implementation in new_client.py:

```python
from ..api import VectorDB


class NewClient(VectorDB):
    # Implement the abstract methods defined in the VectorDB class
    # ...
```

2. Open config.py and implement the DBConfig and optional DBCaseConfig classes.
   1. The DBConfig class provides the information necessary to establish a connection with the database. It is recommended to use the pydantic.SecretStr data type to handle sensitive data such as tokens, URIs, or passwords.
   2. The DBCaseConfig class is optional and allows for providing case-specific configurations for the database. If not provided, it defaults to EmptyDBCaseConfig.
Example implementation in `config.py`:

```python
from pydantic import SecretStr

from ..api import DBConfig, DBCaseConfig


class NewDBConfig(DBConfig):
    # Implement the required configuration fields for the database connection
    token: SecretStr
    uri: str


class NewDBCaseConfig(DBCaseConfig):
    # Implement optional case-specific configuration fields
    ...
```

**Step 3: Importing the DB Client and Updating Initialization**

In this final step, you will import your DB client into `clients/__init__.py` and update the initialization process.

1. Open `clients/__init__.py` and import your `NewClient` from `new_client.py`.
2. Add your `NewClient` to the `DB` enum.
3. Update the `db2client` dictionary by adding an entry for your `NewClient`.

Example implementation in `clients/__init__.py`:

```python
# clients/__init__.py

# Add NewClient to the DB enum
class DB(Enum):
    ...
    NewClient = "NewClient"

    @property
    def init_cls(self) -> Type[VectorDB]:
        ...
        if self == DB.NewClient:
            from .new_client.new_client import NewClient
            return NewClient
        ...

    @property
    def config_cls(self) -> Type[DBConfig]:
        ...
        if self == DB.NewClient:
            from .new_client.config import NewClientConfig
            return NewClientConfig
        ...

    def case_config_cls(self, ...):
        ...
        if self == DB.NewClient:
            from .new_client.config import NewClientCaseConfig
            return NewClientCaseConfig
```

**Step 4: Implement new_client/cli.py and vectordb_bench/cli/vectordbbench.py**

In this (optional, but encouraged) step, you will enable the test to be run from the command line.

1. Navigate to the `vectordb_bench/backend/clients/"client"` directory.
2. Inside the "client" folder, create a `cli.py` file.
Using zilliz as an example `cli.py`:

```python
import os
from typing import Annotated, Unpack

import click
from pydantic import SecretStr

from vectordb_bench.backend.clients import DB
from vectordb_bench.cli.cli import (
    CommonTypedDict,
    cli,
    click_parameter_decorators_from_typed_dict,
    run,
)


class ZillizTypedDict(CommonTypedDict):
    uri: Annotated[
        str, click.option("--uri", type=str, help="uri connection string", required=True)
    ]
    user_name: Annotated[
        str, click.option("--user-name", type=str, help="Db username", required=True)
    ]
    password: Annotated[
        str,
        click.option(
            "--password",
            type=str,
            help="Zilliz password",
            default=lambda: os.environ.get("ZILLIZ_PASSWORD", ""),
            show_default="$ZILLIZ_PASSWORD",
        ),
    ]
    level: Annotated[
        str,
        click.option("--level", type=str, help="Zilliz index level", required=False),
    ]


@cli.command()
@click_parameter_decorators_from_typed_dict(ZillizTypedDict)
def ZillizAutoIndex(**parameters: Unpack[ZillizTypedDict]):
    from .config import ZillizCloudConfig, AutoIndexConfig

    run(
        db=DB.ZillizCloud,
        db_config=ZillizCloudConfig(
            db_label=parameters["db_label"],
            uri=SecretStr(parameters["uri"]),
            user=parameters["user_name"],
            password=SecretStr(parameters["password"]),
        ),
        db_case_config=AutoIndexConfig(
            params={parameters["level"]},
        ),
        **parameters,
    )
```

3. Update the cli by adding:
   1. Database-specific options as an Annotated TypedDict; see `ZillizTypedDict` above.
   2. Index-configuration-specific options as an Annotated TypedDict (example: `vectordb_bench/backend/clients/pgvector/cli.py`).
      1. This may not be needed if there is only one index config.
      2. Repeat for each index configuration, nesting them where possible.
   3. An index-config-specific function for each index type; see `ZillizAutoIndex` above. The function name, in lowercase, will be the command name passed to the `vectordbbench` command.
4. Update `db_config` and `db_case_config` to match the client's requirements.
5. Continue to add new functions for each index config.
6. Import the client cli module and command into `vectordb_bench/cli/vectordbbench.py` (for databases with multiple commands (index configs), this only needs to be done for one command).
7. Import the `get_custom_case_config` function from `vectordb_bench/cli/cli.py` and use it to add a new key `custom_case` to the `parameters` variable within the command.

> cli modules with multiple index configs:
> - pgvector: `vectordb_bench/backend/clients/pgvector/cli.py`
> - milvus: `vectordb_bench/backend/clients/milvus/cli.py`

That's it! You have successfully added a new DB client to the vectordb_bench project.

## Rules

### Installation

The system under test can be installed in any form that achieves optimal performance, including but not limited to binary deployment, Docker, and cloud services.

### Fine-Tuning

For the system under test, we use the default server-side configuration to maintain the authenticity and representativeness of our results. For the client, we welcome any parameter tuning to obtain better results.

### Incomplete Results

Some databases may not be able to complete all test cases due to issues such as out-of-memory (OOM) errors, crashes, or timeouts. In these scenarios, we clearly state the occurrence in the test results.

### Mistake Or Misrepresentation

We strive for accuracy in learning and supporting various vector databases, yet there might be oversights or misapplications. For any such occurrences, feel free to [raise an issue](https://github.com/zilliztech/VectorDBBench/issues/new) or make amendments on our GitHub page.

## Timeout

To ensure that our benchmark reflects the reality of a production environment while remaining practical to run, we have implemented timeouts, based on our experience, for the various tests.

**1. Capacity Case:** For the Capacity Case, we have assigned an overall timeout.

**2. Other Cases:** For other cases, we have set two timeouts:

- **Data Loading Timeout:** This timeout filters out systems that are too slow at inserting data, ensuring that we only consider systems able to cope with the demands of a real-world production environment within a reasonable time frame.
- **Optimization Preparation Timeout:** This timeout guards against excessive optimization strategies that might work for benchmarks but fail to deliver in real production environments.

By doing this, we ensure that the systems we consider are not only suitable for testing environments but also applicable and efficient in production scenarios. This multi-tiered timeout approach makes our benchmark more representative of actual production environments and helps us identify systems that can truly perform in real-world scenarios.
| Case | Data Size | Timeout Type | Value |
|------|-----------|--------------|-------|
| Capacity Case | N/A | Loading timeout | 24 hours |
| Other Cases | 1M vectors, 768 dimensions<br>500K vectors, 1536 dimensions | Loading timeout<br>Optimization timeout | 2.5 hours<br>15 mins |
| Other Cases | 10M vectors, 768 dimensions<br>5M vectors, 1536 dimensions | Loading timeout<br>Optimization timeout | 25 hours<br>2.5 hours |
| Other Cases | 100M vectors, 768 dimensions | Loading timeout<br>Optimization timeout | 250 hours<br>25 hours |
**Note:** Some data points in the standard benchmark results violate these timeouts; they are kept for now for reference and will be removed in the future.