# Subconv

**Repository Path**: hlobin/subconv

## Basic Information

- **Project Name**: Subconv
- **Description**: 子集卷积
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-11-05
- **Last Updated**: 2025-06-12

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# DPconv: Super-Polynomially Faster Join Ordering

## Build

```
cd src/
mkdir -p build
cd build
cmake ..
make
```

## Clique Benchmark

Generating Fig. 6 and Fig. 8: To obtain the running times for clique queries, we will use `./clique_bench` (this will run 5 rounds of query graphs of 3 relations up to $n$).

In the experiments, we used the largest join cardinality $W = 10^8$, as follows:

```
./clique_bench 24 100000000
```

All results (even those of individual rounds) are stored in the `benchs` directory, labeled `normal-clique`. The average result is stored in a file with the suffix `-final`.

As regards the actual figures:
* Fig. 6 uses `minmax_dpsub_time` and `minmax_dpconv_instant_boosted_time`.
* Fig. 8 uses `hybrid_capped_dpsub_time`, `simple_capped_dpsub_time`, and `minplus_dpsub_time`.
![输入图片说明](mdImageimage.png)
![输入图片说明](fig8.png.png)
## JOB & CEB

To obtain Fig. 5, we can use the `./bench` binary. We provide the query graphs with the true cardinalities of JOB and CEB directly in `queries`. The list of pairs represents (bitset, cardinality).
![输入图片说明](image.png)
Since the previous clique benchmark does not use `DPccp`, due to its overhead to generate the ccp's, let's activate the mode to have it.

To this end, change the macro in `BenchmarkRunner.cpp` to the following configuration:

```
#define CLIQUE_BENCHMARK 0
#define CAPPED_COUT_BENCHMARK 1
```

And run `make` again:

```
make
```

Then,

```
./bench ../../queries/job/
```

generates the necessary runtimes in `benchs`, prefixed by `cap-cout`. Similarly, for CEB run the following (might take a couple of minutes due to the large number of queries):

```
./bench ../../queries/ceb-imdb-full/
```

You can also obtain the statistics about the $C_{out}, C_{max}$, and $C_{cap}$ values in `cout_cmax_ratios.csv`.

## Algorithms

All algorithms are in `src/algorithms`.

1. `DPconv`: Its simple and fast algorithm (Sec. 6) is in `DPconv.cpp`.
2. `DPsub`: All variants of `DPsub` are in `DPsub.cpp`.
3. `DPccp`: All variants of `DPccp` are in `DPccp.cpp`.
4. `SubsetConvolution`: The heart of our improved layered dynamic programs is `include/SubsetConvolution.hpp`.

In particular, `BooleanFSC` implements all our optimizations (Sec. 5.2, Sec. 5.3) and reduces the copy between the DP-table and the actual subset convolution.

In `BoostedBooleanFSC`, we reduce the number of steps in the last step.