# Traffic Analysis

**Repository Path**: xxdxxdxxd/traffic-analysis

## Basic Information

- **Project Name**: Traffic Analysis
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 1
- **Created**: 2023-11-11
- **Last Updated**: 2024-10-22

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Tasks & Models & Datasets

> This file discusses the appropriate tasks, models and datasets to be utilized for robustness certification.

## Website fingerprinting

### Representative models

![whiteboard_exported_image](README.assets/whiteboard_exported_image.png)

[1] Automated Website Fingerprinting through Deep Learning

[2] Deep Fingerprinting: Undermining Website Fingerprinting Defenses with Deep Learning

[3] Var-CNN: A Data-Efficient Website Fingerprinting Attack Based on Deep Learning

[4] Tik-tok: The utility of packet timing in website fingerprinting attacks

[5] Subverting Website Fingerprinting Defenses with Robust Traffic Representation

[6] Triplet Fingerprinting: More Practical and Portable Website Fingerprinting with N-shot Learning

[7] An Input-Agnostic Hierarchical Deep Learning Framework for Traffic Fingerprinting

[8] Effective Attacks and Provable Defenses for Website Fingerprinting

[9] Website Fingerprinting at Internet Scale

[10] k-fingerprinting: A Robust Scalable Website Fingerprinting Technique

[11] High Precision Open-World Website Fingerprinting

### Detailed analysis

#### 1. k-NN

https://www.cs.sfu.ca/~taowang/wf/index.html

- **Influence factors:** length (?), timestamp, direction.
- **Feature set:** 1,000 to 4,000 extracted features (general features, unique packet lengths, packet ordering, concentration of outgoing packets, bursts, and initial packet lengths).
- **Algorithm:** k-Nearest Neighbour classifier.
- **Dataset**
  - WANG14: Tor cell traces.
    - 100 monitored websites (index pages only) with 90 instances each.
    - 9,000 unmonitored websites with 1 instance each.
- **Data type:** (Tor cell level) sequence of . For Tor, length is set to 1.

#### 2. CUMUL

https://www.informatik.tu-cottbus.de/~andriy/zwiebelfreunde/

- **Influence factors:** length, timestamp, direction.
- **Feature set:** n = 100 interpolants of the cumulative packet length curve.
- **Algorithm:** libSVM.
- **Dataset**
  - ALEXA100: re-recorded version of WANG13.
    - 100 monitored websites with 100 instances each.
  - RND-WWW: random samples of web pages visited by typical users.
    - 1,125 foreground webpages (from 712 websites).
    - 118,884 background webpages (from 34,580 websites).
  - TOR-Exit: pages that are actually accessed through the Tor network.
    - 211,148 unique webpages (each website is represented by fewer than 2,000 webpages).
  - WEBSITES: websites and their subpages.
    - 20 websites.
    - For each website, 1 index page (90 instances) + 50 subpages (15 instances each).
- **Data type:** (TLS or TCP level) sequence of <±length>.

#### 3. k-FP

https://github.com/jhayes14/k-FP

- **Influence factors:** length (?), timestamp, direction.
- **Feature set:** 150 extracted features (packet numbers, ordering, concentrations, etc.).
- **Algorithm:** Random Forest.
- **Dataset:** WANG14 (the same as for k-NN), along with a self-collected dataset gathered in a similar way.
- **Data type:** (TCP/IP packet level) sequence of . For Tor, length is set to 1.

#### 4. openWF

https://github.com/literaltao/openwf

This work proposes a precision optimization (PO) approach and tests it on 7 models (including k-NN, CUMUL, k-FP, DF).

- **Dataset:** Wang20000
  - 100 monitored pages (200 instances each) + 80,000 unmonitored pages (1 instance each).
- **General data type:** (Tor cell, TLS, or TCP level) sequence of