# subsequence_matching

**Repository Path**: su_wen_chang/subsequence_matching

## Basic Information

- **Project Name**: subsequence_matching
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MulanPSL-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2022-03-10
- **Last Updated**: 2022-03-26

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

The uploaded zip file is a CLion project. The code can be built with the "cmake" and "make" commands according to "CMakeLists.txt".

Compiling environment:

- Linux compiler version: gcc version 5.4.0
- Windows compiler version: MinGW-W64-builds-4.3.5

Build index:

0. Open the file constants_and_headers.h and check the following variables. Make sure you have permission to write files under these two locations:

       const char* const INDEX_FOLDER=R"(./index_binary/)";
       const char* const BINARY_SEQUENCE_PATH= R"(./index_binary/concat.binary)";

   BINARY_SEQUENCE_PATH is the path where the binary sequence is saved. INDEX_FOLDER is the path where the index is saved.
1. Clear the above two locations before building a new index. If you suspect a problem with the index, delete all the files in both locations and rebuild the index.
2. Check scripts/generateData.cpp. The segment lengths are set to {40,80,160,320} by default. To change them, modify both scripts/generateData.cpp and INDEX_LENGTH in constants_and_headers.h.
3. Run "make generate_data" in the terminal.
4. Run the resulting binary file. It prompts "Input the location of txt file:". Input the PATH_OF_INPUT_FILE, e.g. "./data10000000.txt", and wait until the indexes for all lengths have been built.

Run subsequence matching:

0. Make sure the index has been built before you start.
1. Write a script of queries; see "in.txt" for an example. Each line in the script is a query.
   Each query has the following fields, in order, separated by spaces:

   - the type of query, which is one of "ED", "DTW", "CNED" and "CNDTW"
   - the start position of the query sequence in the indexed series. To use series from other sources, modify "run_.cpp": find "auto Q2=fetchData(start_pos,length,BINARY_SEQUENCE_PATH);" and replace Q2 with the queries you need.
   - length of the query
   - epsilon
   - alpha (only for CNED and CNDTW)
   - beta (only for CNED and CNDTW)

   Example in "in.txt":

       ED 123 40 3
       DTW 3 100 1.1
       CNED 1 300 3 1.1 11
       CNDTW 1 200 3 1.5 10

2. Run "make run_" in the terminal.
3. Input the PATH_OF_SCRIPT in the terminal to run the tests according to the script.

Note: The index building function is not well optimized and could be implemented more efficiently. You are welcome to report bugs by email to 19B903014@stu.hit.edu.cn.
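
To illustrate the query script format described above, here is a minimal C++ sketch that parses one script line into its fields. It is not taken from the repository: the `Query` struct, its member names, and the `parseQueryLine` function are hypothetical, and the sketch assumes fields are separated by whitespace as in the "in.txt" example.

```cpp
#include <sstream>
#include <string>

// Hypothetical container for one line of the query script. The member names
// are illustrative and are not taken from the repository's sources.
struct Query {
    std::string type;     // "ED", "DTW", "CNED" or "CNDTW"
    long start_pos = 0;   // start position of the query in the indexed series
    int length = 0;       // length of the query
    double epsilon = 0.0;
    double alpha = 0.0;   // only read for CNED and CNDTW
    double beta = 0.0;    // only read for CNED and CNDTW
};

// Parses one whitespace-separated line such as "CNED 1 300 3 1.1 11".
// Returns false (leaving `out` untouched) if the query type is unknown
// or a required field is missing or not numeric.
inline bool parseQueryLine(const std::string& line, Query& out) {
    std::istringstream in(line);
    Query q;
    if (!(in >> q.type >> q.start_pos >> q.length >> q.epsilon)) {
        return false;
    }
    const bool needs_alpha_beta = (q.type == "CNED" || q.type == "CNDTW");
    if (!needs_alpha_beta && q.type != "ED" && q.type != "DTW") {
        return false;
    }
    if (needs_alpha_beta && !(in >> q.alpha >> q.beta)) {
        return false;
    }
    out = q;
    return true;
}
```

A caller could read the script with `std::getline` and pass each line to `parseQueryLine`, skipping or reporting lines for which it returns false. How the repository's own "run_.cpp" reads the script may differ.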