# ST-AWFD

**Repository Path**: mirrors_STMicroelectronics/ST-AWFD

## Basic Information

- **Project Name**: ST-AWFD
- **Description**: ST Dataset for Automatic Wafer Fault Detection
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-05-01
- **Last Updated**: 2025-11-16

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

```
Wafer D1 & D2 Datasets
```

# Introduction

This is the README of the Wafer D1 and Wafer D2 datasets. If you use these datasets, please consider citing the following paper:

Furnari G, Vattiato F, Allegra D, Milotta FLM, Orofino A, Rizzo R, De Palo RA, Stanco F. An Ensembled Anomaly Detector for Wafer Fault Detection. Sensors. 2021; 21(16):5465. https://doi.org/10.3390/s21165465

# Datasets info

Both datasets come from the semiconductor industry and contain time series made of a variable number of time samples.

Both datasets have 5 reference columns: MaterialID, StepID, duration_ms, target and is_test.

The samples are grouped by MaterialID, which represents the production lot.

The production process is divided into steps, and the number of steps differs between the two datasets. A step is either mandatory or optional.
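As an illustration of the grouping described above, the following minimal sketch collects the time samples of each MaterialID into its own series. It assumes the rows are available as dictionaries keyed by the reference column names (for instance after reading a CSV export with `csv.DictReader` and converting the values to numbers). The sample rows, the string MaterialIDs and the helper name `group_by_material` are hypothetical and are not part of the datasets.

```python
from collections import defaultdict

# Synthetic rows standing in for the real data. Only the reference column
# names come from this README; every value below is made up.
rows = [
    {"MaterialID": "A", "StepID": 2, "duration_ms": 0.0, "target": 0, "is_test": 0},
    {"MaterialID": "A", "StepID": 2, "duration_ms": 0.5, "target": 0, "is_test": 0},
    {"MaterialID": "A", "StepID": 4, "duration_ms": 1.0, "target": 0, "is_test": 0},
    {"MaterialID": "B", "StepID": 2, "duration_ms": 0.0, "target": 1, "is_test": 1},
    {"MaterialID": "B", "StepID": 4, "duration_ms": 1.0, "target": 1, "is_test": 1},
]


def group_by_material(rows):
    """Collect the time samples of each MaterialID, ordered by duration_ms."""
    series = defaultdict(list)
    for row in rows:
        series[row["MaterialID"]].append(row)
    for samples in series.values():
        samples.sort(key=lambda r: r["duration_ms"])
    return dict(series)


series = group_by_material(rows)

# Each MaterialID can have a different number of time samples.
lengths = {material: len(samples) for material, samples in series.items()}
print(lengths)
```

Because the series have different lengths, a model that expects fixed-size input would need some form of resampling or padding; that choice is left to the user.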
# Reference columns

- MaterialID: identifies the production lot.
- StepID: identifies the production step.
- duration_ms: the normalized time elapsed since the first time sample of the mandatory steps. The first time sample in a mandatory step has duration_ms equal to 0, and the last time sample in a mandatory step has duration_ms equal to 1. Time samples in optional steps have duration_ms greater than 1 or less than 0.
- target: a boolean value that indicates whether the MaterialID is abnormal: 1 for an abnormal MaterialID, 0 for a normal MaterialID.
- is_test: a boolean value that indicates whether the time sample was used for training or for testing in our work: 1 means the time sample was used for testing, 0 means it was used in the training phase.

## Wafer D1 StepID

The Wafer D1 dataset has 7 steps: 5 mandatory steps with IDs 2, 4, 5, 6 and 7, and 2 optional steps with IDs -1 and -2.

## Wafer D1 MaterialID

Wafer D1 has 5105 MaterialIDs.

## Wafer D2 StepID

The Wafer D2 dataset has 2 mandatory steps and no optional steps.

## Wafer D2 MaterialID

Wafer D2 has 1157 MaterialIDs.

# Features columns

Both datasets have feature columns; the number of features differs between the two datasets.

All features have been normalized with a z-scaler (z-score standardization).

## Wafer D1 Features columns

Wafer D1 has 15 feature columns.

## Wafer D2 Features columns

Wafer D2 has 20 feature columns.

# Datasets shape

## Wafer D1 shape

Wafer D1 has 602108 rows and 20 columns (5 reference columns plus 15 feature columns).

## Wafer D2 shape

Wafer D2 has 126795 rows and 25 columns (5 reference columns plus 20 feature columns).
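The reference columns can be used to slice the data as sketched below: `is_test` separates training from test samples, the `duration_ms` range separates mandatory from optional steps, and `target` gives one label per MaterialID. This is a sketch under stated assumptions, not the authors' pipeline. It assumes the label column is spelled `target` in lowercase, as listed under Datasets info, and that rows are dictionaries with numeric values. The sample rows and the helper names are hypothetical.

```python
REFERENCE_COLUMNS = ["MaterialID", "StepID", "duration_ms", "target", "is_test"]

# Synthetic rows; every value is made up for illustration only.
rows = [
    {"MaterialID": "A", "StepID": -1, "duration_ms": -0.2, "target": 0, "is_test": 0},
    {"MaterialID": "A", "StepID": 2, "duration_ms": 0.0, "target": 0, "is_test": 0},
    {"MaterialID": "A", "StepID": 7, "duration_ms": 1.0, "target": 0, "is_test": 0},
    {"MaterialID": "B", "StepID": 2, "duration_ms": 0.0, "target": 1, "is_test": 1},
    {"MaterialID": "B", "StepID": 7, "duration_ms": 1.0, "target": 1, "is_test": 1},
    {"MaterialID": "B", "StepID": -2, "duration_ms": 1.3, "target": 1, "is_test": 1},
]


def is_mandatory(row):
    """Mandatory-step samples have duration_ms within [0, 1]."""
    return 0.0 <= row["duration_ms"] <= 1.0


def split_train_test(rows):
    """Split time samples on the is_test flag (0 = training, 1 = test)."""
    train = [r for r in rows if r["is_test"] == 0]
    test = [r for r in rows if r["is_test"] == 1]
    return train, test


def labels_by_material(rows):
    """Map each MaterialID to its target (1 = abnormal, 0 = normal)."""
    return {r["MaterialID"]: r["target"] for r in rows}


train, test = split_train_test(rows)
mandatory = [r for r in rows if is_mandatory(r)]
optional = [r for r in rows if not is_mandatory(r)]
labels = labels_by_material(rows)

# Column counts stated in this README: reference columns plus feature columns.
d1_columns = len(REFERENCE_COLUMNS) + 15  # Wafer D1
d2_columns = len(REFERENCE_COLUMNS) + 20  # Wafer D2
print(len(train), len(test), len(mandatory), len(optional), labels)
```

Filtering on `duration_ms` rather than on StepID keeps the same code usable for both datasets, since the mandatory step IDs differ between Wafer D1 and Wafer D2.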