# vid2frame

**Repository Path**: givemeyasoo/vid2frame

## Basic Information

- **Project Name**: vid2frame
- **Description**: An easy-to-use tool to extract frames from video and store into database.
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-09-17
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# vid2frame
An easy-to-use tool to extract frames from video and store into database.
Basically, this is a python wrapper of ffmpeg which addtionally stores the frames into database.

## Why this tool
* Extracting frames from large video datasets (usually 10k ~ 100k, hundreds of GBs on disk) is tedious, automate it.
* Storing millions of frames on disk makes subsequent processing SLOW.
* Common mistakes I once made:
    * Decode all frames (using scikit-video) and store them into a **LARGE** .npy file, nice way to blow up the disk.
    * Extract all frames using ffmpeg and write to disk. Takes **foreeeeever** to move or delete.
    * Extract JPEG frames using ffmpeg but ignores the JPEG **quality**. For deep learning and computer vision, a good quality of images (JPEG quality around 95) is required. 

* Good practice in my opinion:
    * Add `-qscale:v 2` to [ffmpeg](https://stackoverflow.com/questions/10225403/how-can-i-extract-a-good-quality-jpeg-image-from-an-h264-video-file-with-ffmpeg) command.
    * Store extracted frames into a database, [LMDB](https://lmdb.readthedocs.io/en/release/) or [HDF5](http://docs.h5py.org/en/stable/).
    * (Optional) Use [Tensorpack dataflow](https://tensorpack.readthedocs.io/modules/dataflow.html) to accelerate reading from the database.
    * Suggestions are welcome.

## Usage
### 1. Split video dataset into multiple (if necessary) splits with `split_video_dataset.py`
```
usage: split_video_dataset.py [-h] vid_dir num_splits split_file

positional arguments:
  vid_dir     the video directory
  num_splits  the number of splits
  split_file  the split stored as pickle file

optional arguments:
  -h, --help  show this help message and exit
```
#### Sample usage
Run: `python split_video_dataset.py ./sample_videos 2 split-sample.pkl`

Which outputs split info after completion:
```
Number of videos found: 2
Number of unique videos: 2
split-0 : 1
split-1 : 1
Joined splits: 2
```

#### Notes
* Video files are identified with extensions, currently recognizing `['.mp4', '.avi', '.flv', '.mkv', '.webm', '.mov']`.

* Videos with the same name (without extension) are considered duplicates. Only one of them will be processed.


### 2. Extract frames for videos in a specific split using `vid2frame.py`
```
usage: vid2frame.py [-h] [-a] [-s SHORT] [-H HEIGHT] [-W WIDTH] [-k SKIP]
                    [-n NUM_FRAME]
                    split_file split frame_db db_type

positional arguments:
  split_file            the pickled split file
  split                 the split to use, e.g. split-0
  frame_db              the database to store extracted frames, either LMDB or
                        HDF5
  db_type               type of the database, LMDB or HDF5

optional arguments:
  -h, --help            show this help message and exit
  -a, --asis            do not resize frames
  -s SHORT, --short SHORT
                        keep the aspect ration and scale the shorter side to s
  -H HEIGHT, --height HEIGHT
                        the resized height
  -W WIDTH, --width WIDTH
                        the resized width
  -k SKIP, --skip SKIP  only store frames with (ID-1) mod skip==0, frame ID
                        starts from 1
  -n NUM_FRAME, --num_frame NUM_FRAME
                        uniformly sample n frames, this will override --skip
  -r INTERVAL, --interval INTERVAL
                        extract one frame every r seconds
```
#### Notes
* The frames will be stored as strings of their binary content, i.e. they are NOT decoded. Both LMDB and HDF5 are key-value storage, the keys are in the format of `video_name/frame_id` (assuming there are no two videos with the same name).
* The frames are in JPEG format, with JPEG quality ~95. Note the `-qscale:v 2` option in `vid2frame.py`. This is **important** for subsequent deep learning tasks.
* The database to use is either LMDB or HDF5, choose one according to:
    * Reading from HDF5 is convenient, if you do not plan to use [Tensorpack](https://tensorpack.readthedocs.io/_modules/tensorpack/dataflow/format.html#HDF5Data), which does not support HDF5 well currently, always choose HDF5.
    * LMDB integrates better with [Tensorpack](https://tensorpack.readthedocs.io/modules/dataflow.html#tensorpack.dataflow.LMDBData), but reading from it is less flexible (though much much faster than HDF5).
* Resizing options (exclusive):
    1. Do not resize (--asis)
    2. Resize the shorter edge and keep aspect ratio (the longer edge adapts) (--short)
    3. Resize to specific height & width (--height --width)
* Sampling options (exclusive):
    1. Keep one of frame every `k` frames (default 1, i.e. keep every frame) (--skip)
    2. Uniformly sample `n` frames (--num_frame). For example: If there are 10 frames, --skip=2 will sample frames 1,3,5,7,9 and --num_frame=4 will sample frames 1,4,7,10.
    3. Sample one frame every `r` seconds (--interval) or 1/r FPS. For r==1, its 1 FPS, and r==2, its 0.5 FPS.
    
#### Sample usage
* Extract frame of videos in split-0 generated above:

`python vid2frame.py split-sample.pkl split-0 frames-0.hdf5 HDF5 --short=240`

The output would be:
```
['split-0', 'split-1'] using split-0
100%|█████████████████████████████| 1/1 [00:02<00:00,  2.05s/it]
```
You can also process the other split simultaneously, for large video datasets, 6~8-split is recommended for a server with 40 CPUs:

`python vid2frame.py split-sample.pkl split-1 frames-1.hdf5 HDF5 --short=240`

Note that the output databases for different splits should not be the same in case concurrent write is no supported.

More samples:

`python vid2frame.py split-sample.pkl split-0 frames-0.lmdb LMDB --asis`

`python vid2frame.py split-sample.pkl split-0 frames-0.lmdb LMDB -H 240 -W 360`

### 3. (Optional) Test reading from database using `test_read_db.py`
`test_read_db.py` provides sample code to iterate, read and decode frames in LMDB/HDF5 databases, it also checks for broken images. 
#### Note
* Opening images from string buffer: `img = Image.open(StringIO(v))`
* Reading string from HDF5 db: `s = np.asarray(db_vid[fid]).tostring()`

#### Sample usage
`python test_read_db.py frames-1.lmdb` or `python test_read_db.py frames-0.hdf5`

The script outputs the number of frames in the database, the size of the last image and time to iterate over whole database.

## Dependencies
* Python 2.7
* FFmpeg: Install on [Ubuntu](https://tecadmin.net/install-ffmpeg-on-linux/). Other [platforms](https://www.google.com/).
* Python libraries: `pip install -r requirements.txt`, 


## Common issues
* `RuntimeError: Unable to create link (name already exists)`

   This is caused by writing duplicate frames to a non-empty HDF5 database.