# kbd-audio

**Repository Path**: hongwing/kbd-audio

## Basic Information

- **Project Name**: kbd-audio
- **Description**: Tools for capturing and analysing keyboard input paired with microphone capture ⌨️
- **Primary Language**: C++
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 0
- **Created**: 2019-08-03
- **Last Updated**: 2024-07-24

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# kbd-audio

[![Build Status](https://travis-ci.org/ggerganov/kbd-audio.svg?branch=master)](https://travis-ci.org/ggerganov/kbd-audio?branch=master)

## Description

This is a collection of command-line and GUI tools for capturing and analyzing audio data.

### Keytap

The most interesting tool is called **keytap** - it can guess pressed keyboard keys only by analyzing the audio captured from the computer's microphone.

Check this blog post for more details:

[Keytap: description and some random thoughts](https://ggerganov.github.io/jekyll/update/2018/11/30/keytap-description-and-thoughts.html)

[Video: short demo of Keytap in action](https://www.youtube.com/watch?v=2OjzI9m7W10)

### Keytap2

The **keytap2** tool is another interesting tool for recovering text from audio. It does not require training data - instead it uses statistical information about the frequencies of the letters and n-grams in the English language.
The tool is still in development, but you can see a short demonstration here:

[Video: Keytap2 - recovering text from typing sound (7:50)](https://www.youtube.com/watch?v=Y8nWkdWl7Pg)

[CTF: can you guess the text being typed?](https://ggerganov.github.io/keytap-challenge/)

## Build instructions

Dependencies:

- **SDL2** - used to capture audio and to open GUI windows [libsdl](https://www.libsdl.org)

      [Ubuntu]
      $ sudo apt install libsdl2-dev

      [Mac OS with brew]
      $ brew install sdl2

- **FFTW3** *(optional)* - some of the helper tools perform Fourier transformations [fftw](http://www.fftw.org)

**Linux and Mac OS**

    git clone https://github.com/ggerganov/kbd-audio
    cd kbd-audio
    git submodule update --init
    mkdir build && cd build
    cmake ..
    make

**Windows**

    (todo, PRs welcome)

## Tools

Short summary of the available tools. If the status of a tool is not **stable**, expect problems and suboptimal results.

| Name | Type | Status |
| --- | --- | --- |
| **record** | text | **stable** |
| **record-full** | text | **stable** |
| **play** | text | **stable** |
| **play-full** | text | **stable** |
| **view-gui** | gui | **stable** |
| **view-full-gui** | gui | **stable** |
| **keytap** | text | **stable** |
| **keytap-gui** | gui | **stable** |
| **keytap2** | text | development |
| **keytap2-gui** | gui | development |
| - | *extra* | - |
| **guess_qp** | text | experiment |
| **guess_qp2** | text | experiment |
| **key_detector** | text | experiment |
| **scale** | text | experiment |
| **subreak** | text | experiment |
| **key_average_gui** | gui | experiment |

## Tool details

* **record-full**

  Record audio to a raw binary file on disk

      ./record-full output.kbd [-cN]

---

* **play-full**

  Play back a recording captured via the **record-full** tool

      ./play-full input.kbd [-pN]

---

* **record**

  Record audio only while typing.
  Useful for collecting training data for **keytap**

      ./record output.kbd [-cN]

---

* **play**

  Play back a recording created via the **record** tool

      ./play input.kbd [-pN]

---

* **keytap**

  Detect pressed keys via microphone audio capture in real-time. Uses training data captured via the **record** tool.

      ./keytap input0.kbd [input1.kbd] [input2.kbd] ... [-cN] [-pF] [-tF]

---

* **keytap-gui**

  Detect pressed keys via microphone audio capture in real-time. Uses training data captured via the **record** tool. GUI version.

      ./keytap-gui input0.kbd [input1.kbd] [input2.kbd] ... [-cN]

  [**Live demo** *(WebAssembly threads required)*](https://ggerganov.github.io/jekyll/update/2018/11/24/keytap.html)

  ![keytap-gui](https://i.imgur.com/FXa60Pr.gif)

---

* **keytap2-gui** *(work in progress)*

  Detect pressed keys via microphone audio capture. Uses statistical information (n-gram frequencies) about the language. **No training data is required**. The *'recording.kbd'* input file has to be generated via the **record-full** tool and contains the audio data that will be analyzed. The *'n-gram.txt'* file has to contain n-gram probabilities for the corresponding language.

      ./keytap2-gui recording.kbd n-gram.txt

  ![keytap2-gui](https://i.imgur.com/LRnTkPA.jpg)

---

* **view-full-gui**

  Visualize waveforms recorded with the **record-full** tool. Can also play back the audio data.

      ./view-full-gui input.kbd

  ![view-full-gui](https://i.imgur.com/scjTaXw.png)

---

* **view-gui**

  Visualize training data recorded with the **record** tool. Can also play back the audio data.

      ./view-gui input.kbd

  ![view-gui](https://i.imgur.com/2binGaZ.png)

---

## Feedback

Any feedback about the performance of the tools is highly appreciated. Please drop a comment [here](https://github.com/ggerganov/kbd-audio/issues/3).