# python_segy

**Repository Path**: lgcgithub/python_segy

## Basic Information

- **Project Name**: python_segy
- **Description**: Generate sample data from .segy seismic data for deep learning based on pytorch
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-02-06
- **Last Updated**: 2021-02-06

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Seismic data sample generation

Based on PyTorch.

## Introduction

- This code generates sample data from .segy seismic data for deep learning with PyTorch.
- It can be used for denoising or interpolation of seismic data.
- This code is modified from [KaiZhang](https://github.com/cszn/DnCNN/tree/master/TrainingCodes/dncnn_pytorch).

## Prerequisites

- Python 3 with dependencies: scipy, numpy, h5py, glob, [pytorch](https://github.com/pytorch/pytorch) and [segyio](https://github.com/equinor/segyio)

## Datasets

- Your own **.segy** or **.sgy** seismic data, or **.segy**/**.sgy** data downloaded online with the code we provide.
- The model we provide is trained on the [Model94_shots](http://s3.amazonaws.com/open.source.geoscience/open_data/bpmodel94/Model94_shots.segy.gz) and [7m_shots_0201_0329](http://s3.amazonaws.com/open.source.geoscience/open_data/bpstatics94/7m_shots_0201_0329.segy.gz) datasets (mode: DNCNN).

## Generating training data

```python
import numpy as np
import torch

from get_patch import *
from gain import *

data_dir = 'data/train'  # path to your .segy files

# generate patches from the original data
train_data = datagenerator(data_dir, patch_size=(128, 128), stride=(32, 32),
                           train_data_num=float('inf'), download=False, datasets=[],
                           aug_times=0, scales=[1], verbose=True, jump=1, agc=True)
train_data = train_data.astype(np.float64)
xs = torch.from_numpy(train_data.transpose((0, 3, 1, 2)))

# add noise
DDataset = DenoisingDataset(xs, 25)

'''
# random downsampling, rate: the sampling rate
DDataset = DownsamplingDataset(xs, rate=0.7, regular=False)

# regular downsampling, rate: the sampling interval
DDataset = DownsamplingDataset(xs, rate=2, regular=True)
'''
```
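As a rough sketch (not part of this repository), the resulting dataset can then be wrapped in a standard PyTorch `DataLoader` for training. The batch size and the assumption that `DenoisingDataset` yields (noisy, clean) pairs follow the DnCNN training code this project is modified from:

```python
from torch.utils.data import DataLoader

# Assumed: DDataset yields (noisy_patch, clean_patch) pairs, as in the DnCNN code.
loader = DataLoader(dataset=DDataset, batch_size=128, shuffle=True, drop_last=True)

for noisy, clean in loader:
    # feed `noisy` to the network and compare its output against `clean`
    pass
```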
Parameters in **datagenerator**:

- `data_dir`: the path where the .segy file is located, or where the downloaded file will be placed
- `patch_size`: the size of each patch
- `stride`: the step size used to slide over the data when extracting patches
- `train_data_num`: int or `float('inf')`; default `float('inf')`, meaning all of the data is used to generate patches. If you only need 3000 patches, set `train_data_num=3000`.
- `download` (bool): whether to download the dataset from the internet. We provide 7 online datasets, in this order:
  1. http://s3.amazonaws.com/open.source.geoscience/open_data/bpmodel94/Model94_shots.segy.gz
  2. http://s3.amazonaws.com/open.source.geoscience/open_data/bpstatics94/7m_shots_0201_0329.segy.gz
  3. https://s3.amazonaws.com/open.source.geoscience/open_data/bp2.5d1997/1997_2.5D_shots.segy.gz
  4. http://s3.amazonaws.com/open.source.geoscience/open_data/bpvelanal2004/shots0001_0200.segy.gz
  5. http://s3.amazonaws.com/open.source.geoscience/open_data/bptti2007/Anisotropic_FD_Model_Shots_part1.sgy.gz
  6. https://s3.amazonaws.com/open.source.geoscience/open_data/hessvti/timodel_shot_data_II_shot001-320.segy.gz
  7. http://s3.amazonaws.com/open.source.geoscience/open_data/Mobil_Avo_Viking_Graben_Line_12/seismic.segy
- `datasets` (int): how many of the provided datasets to download when `download=True`; e.g. `datasets=2` means the first two datasets in the list above will be downloaded.
- `aug_times` (int): the number of augmentation passes, used to increase the diversity of the samples. One operation is chosen per pass, e.g. flip up-down, or rotate 90 degrees and flip up-down.
- `scales` (list): the ratios by which the data is scaled; default `[1]`, i.e. no scaling.
- `verbose` (bool): whether to print the progress of patch generation.
- `jump` (int): default 1, meaning shots are read one by one; when `jump>=2`, shots are read at that interval instead, e.g. `jump=3` uses shots 1, 4, 7, ...
- `agc` (bool): whether to apply AGC (normalize each trace by amplitude) to the data.
- **Note**: the parameter `jump` is only available when every shot has the same dimensions. We provide a small .segy file in `data/test` for testing the `datagenerator` function, or you can simply run `python get_patch.py` to test it and visualize some of the generated patches, like:

![](https://wx4.sinaimg.cn/mw1024/006ceorLly1g32061cqx0j315p0l61kx.jpg)
![](https://wx2.sinaimg.cn/mw1024/006ceorLly1g320610w9mj315t0l7nk8.jpg)

## Training

```
python main_train_denoise.py --data_dir data/train
python main_train_inter.py --data_dir data/train
```

(Note: we assume you have put the .segy files in the `data/train` folder. If not, please use `--download True --datasets 2` (2 means you want to use 2 datasets from the default library). Sometimes the network is unstable and the datasets cannot be downloaded; we provide a Baidu Yun link for some of the datasets here: link: https://pan.baidu.com/s/1VuRC40rugaoD2-hRzC1cbQ, code: x0nq.)

## Test

```
python main_test_denoise.py --data_dir data/test --sigma 50
```

![](https://wx3.sinaimg.cn/mw1024/006ceorLly1g31rqu5c7zj316y0bvng0.jpg)

```
python main_test_inter.py --data_dir data/test --rate 2
```

![](https://wx4.sinaimg.cn/mw1024/006ceorLly1g31rqtq162j316w0c7aqa.jpg)
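For intuition about the `--sigma` and `--rate` options above, the sketch below (written for this README, not taken from the repository's code) shows how additive Gaussian noise at a given sigma and regular/random trace downsampling at a given rate are typically constructed. The function names and the sigma/255 scaling (the DnCNN image convention) are assumptions:

```python
import numpy as np

def add_gaussian_noise(patch, sigma):
    # Additive white Gaussian noise; the sigma/255 scaling follows the DnCNN
    # image convention and may differ from the scaling used in this repository.
    return patch + np.random.normal(0.0, sigma / 255.0, patch.shape)

def trace_mask(n_traces, rate, regular=True):
    # 0/1 mask over traces: keep every `rate`-th trace (regular) or a random
    # fraction `rate` of the traces (irregular).
    mask = np.zeros(n_traces)
    if regular:
        mask[::int(rate)] = 1                      # rate=2 keeps traces 0, 2, 4, ...
    else:
        mask[np.random.rand(n_traces) < rate] = 1  # rate=0.7 keeps ~70% of traces
    return mask

patch = np.random.randn(128, 128)            # stand-in for a clean (time, trace) patch
noisy = add_gaussian_noise(patch, sigma=50)  # what --sigma 50 corresponds to
mask = trace_mask(patch.shape[1], rate=2)    # what --rate 2 corresponds to
decimated = patch * mask[np.newaxis, :]      # zero out the missing traces
```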