# yolov5

**Repository Path**: linClubs/yolov5

## Basic Information

- **Project Name**: yolov5
- **Description**: YOLOv5 object detection, based on v6.2. The `master` branch supports training and detection; the `only_detect` branch is detection-only; the `yolov5_ros` branch wraps detection for ROS; the `detect` branch is the most minimal detection project; the `det_seg` branch contains training and inference code for joint detection and segmentation; the `yolov5-7` branch adds attention mechanisms and lightweight variants.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2022-10-17
- **Last Updated**: 2023-09-20

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# 1 Environment setup

Install the GPU driver beforehand and create a conda virtual environment, then install the dependencies:

~~~bash
pip install -r requirements.txt
~~~

---

# 2 Training (train)

## 2.1 Prepare the dataset (datasets)

+ Annotation

For object detection, labelImg is sufficient; for segmentation, annotate with labelme. [Reference](https://gitee.com/linClubs/yolov5/blob/det_seg/custom_data/readme.md)

+ Install labelImg

~~~
tensorflow-gpu==2.4.0
labelme==3.16.7
opencv-python==4.6.0.66
~~~

+ Launch

~~~sh
labelImg
~~~

+ Annotate

YOLOv5 can train on three label formats: VOC, COCO, and YOLO. We use the YOLO format here, so no format conversion is needed.

Put the dataset images under ./datasets/cat_dog/train/images and ./datasets/cat_dog/val/images. Keep train and val separate (you can also split them by hand after annotating), and create a labels directory under train and val to hold the label files.

After launching labelImg, set four things:

1. Open Dir: the dataset image directory (mine is ./datasets/cat_dog/train/images)
2. Change Save Dir: the directory where the generated labels are saved (mine is ./datasets/cat_dog/train/labels); annotate val the same way, replacing train with val
3. In the View menu (title bar), enable Auto Save mode
4. Switch the label format to YOLO
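In the YOLO format, each image gets a same-named `.txt` file under `labels/` with one line per object: `class x_center y_center width height`, where all four coordinates are normalized by the image size. As a minimal sketch (the helper name and example values here are illustrative, not part of the repo), the conversion from a pixel-space corner box, as drawn in the tool, to a YOLO label line looks like this:

```python
def voc_to_yolo(box, img_w, img_h):
    """Convert an (xmin, ymin, xmax, ymax) pixel box to the normalized
    (x_center, y_center, width, height) tuple used by YOLO label files."""
    xmin, ymin, xmax, ymax = box
    xc = (xmin + xmax) / 2 / img_w
    yc = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return xc, yc, w, h

# One line of a labels/*.txt file for class 0 (dog), illustrative box values:
line = "0 " + " ".join(f"{v:.6f}" for v in voc_to_yolo((100, 200, 300, 400), 640, 480))
# line == "0 0.312500 0.625000 0.312500 0.416667"
```

labelImg writes these files for you when the label format is switched to YOLO; the sketch is only to show what ends up in `labels/`.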
Common shortcuts: W to draw a box, D for the next image, A for the previous image, Ctrl+S to save (unnecessary if Auto Save is enabled).

![](./data/images/1.jpg)

## 2.2 Adjust the default parameters in train.py

~~~python
def parse_opt(known=False):
    parser = argparse.ArgumentParser()
    # pretrained weights
    parser.add_argument('--weights', type=str, default=ROOT / 'weights/yolov5n.pt', help='initial weights path')
    # model architecture config; set the class count in the yaml file under models/
    parser.add_argument('--cfg', type=str, default='models/yolov5n.yaml', help='model.yaml path')
    # dataset paths, class count, and label names
    parser.add_argument('--data', type=str, default=ROOT / 'data/data.yaml', help='dataset.yaml path')
    # other model hyperparameters
    parser.add_argument('--hyp', type=str, default=ROOT / 'data/hyps/hyp.scratch-low.yaml', help='hyperparameters path')
    # training epochs
    parser.add_argument('--epochs', type=int, default=10)
    # images per batch
    parser.add_argument('--batch-size', type=int, default=2, help='total batch size for all GPUs, -1 for autobatch')
    # training image size
    parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='train, val image size (pixels)')
    # whether to use the GPU
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    # on dataloader worker errors, reduce this
    parser.add_argument('--workers', type=int, default=2, help='max dataloader workers (per RANK in DDP mode)')
~~~

**Common training errors: if CUDA runs out of memory, reduce `--batch-size`; if DDP multiprocessing fails, reduce `--workers`.**

+ Pretrained weights `--weights`

Download the official yolov5 weight file (yolov5n.pt here), put it under ./weights/, and set the path: `default=ROOT / 'weights/yolov5n.pt'`

+ Model architecture config `--cfg`

Must match the pretrained weights; it lives under models/ (mine is `yolov5n.yaml`). Set the `nc` value inside the yaml to the number of classes you detect.

+ Dataset paths, classes, and labels `--data`

My dataset layout:

~~~sh
datasets/cat_dog/train/images
datasets/cat_dog/train/labels
datasets/cat_dog/val/images
datasets/cat_dog/val/labels
~~~

Edit data/data.yaml to match that layout:

~~~yaml
path: ./datasets        # dataset root dir
train: cat_dog/train    # training set
val: cat_dog/val        # validation set
test:                   # optional, may be omitted
nc: 2                   # number of classes
names: ['dog', 'cat']   # labels, in the same order used when annotating
~~~

+ Other hyperparameters `--hyp`

The default `data/hyps/hyp.scratch-low.yaml` is fine.

+ Everything else can stay at its default; the four options above are the main ones to adjust.

## 2.3 Train

Activate the virtual environment and run train.py:

~~~bash
python train.py
~~~

The initial parameters can also be passed on the command line:

~~~bash
python train.py --weights weights/yolov5s.pt --cfg models/yolov5s.yaml --data data/data.yaml --epochs 10 --batch-size 4 --img-size 640 --device 1
~~~

## 2.4 Results

The results are saved to `runs/train/exp/weights/last.pt`

---

# 3 Detection (detect)

Adjust the default parameters in detect.py ([parameter reference](https://blog.csdn.net/IT_charge/article/details/119180151)):

+ `--weights` the trained weights, e.g. `weights/yolov5n.pt`
+ `--source` input data; 0 means the webcam
+ `--data` optional, may be omitted
+ `--view-img` whether to display the results
+ `--conf-thres` confidence threshold
+ `--iou-thres` IoU threshold
+ `--agnostic-nms` enable class-agnostic non-maximum suppression

~~~python
def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'weights/yolov5n.pt', help='model path(s)')
    parser.add_argument('--source', type=str, default=ROOT / '../datasets/1.mp4', help='file/dir/URL/glob, 0 for webcam')
    parser.add_argument('--data', type=str, default='', help='(optional) dataset.yaml path')
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
    parser.add_argument('--conf-thres', type=float, default=0.25, help='confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold')
    parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
    parser.add_argument('--device', default='cpu', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--view-img', action='store_true', default=1, help='show results')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--save-crop', action='store_true', help='save cropped prediction boxes')
    parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
    parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --classes 0, or --classes 0 2 3')
    parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--visualize', action='store_true', help='visualize features')
    parser.add_argument('--update', action='store_true', help='update all models')
    parser.add_argument('--project', default=ROOT / 'runs/detect', help='save results to project/name')
    parser.add_argument('--name', default='exp', help='save results to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--line-thickness', default=3, type=int, help='bounding box thickness (pixels)')
    parser.add_argument('--hide-labels', default=False, action='store_true', help='hide labels')
    parser.add_argument('--hide-conf', default=False, action='store_true', help='hide confidences')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
~~~

---

# 4 Wrapping detection in a class that takes images and returns results

## 4.0 Imports

~~~python
import random
import numpy as np
import torch
import time

from models.common import DetectMultiBackend
from utils.dataloaders import MyLoadImages
from utils.general import check_img_size, cv2, non_max_suppression, scale_coords
from utils.plots import Annotator, colors
~~~

## 4.1 Add a custom image-loading class

Add the following MyLoadImages class to utils/dataloaders.py:

~~~python
class MyLoadImages:
    def __init__(self, path, img_size=640, stride=32):
        for img in path:
            if type(img) != np.ndarray or len(img.shape) != 3:
                raise TypeError('there is an object in source that is not a picture read by cv2')
        self.img_size = img_size
        self.stride = stride
        self.files = path
        self.nf = len(path)
        self.mode = 'image'

    def __iter__(self):
        self.count = 0
        return self

    def __next__(self):
        if self.count == self.nf:
            raise StopIteration
        img0s = self.files[self.count]
        self.count += 1
        img = letterbox(img0s, self.img_size, stride=self.stride)[0]
        img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
        img = np.ascontiguousarray(img)
        return img, img0s
~~~

## 4.2 The detection wrapper, detect_api.py

1. Wrap the initialization parameters in a simulation_opt class

~~~python
class simulation_opt:
    def __init__(self, weights, img_size=640, conf_thres=0.25, iou_thres=0.45,
                 device=0, half=False,   # FP16 (half=True) is only supported on CUDA
                 # device='cpu', half=False,
                 classes=None, agnostic_nms=False, augment=False, visualize=False,
                 max_det=1000, line_thickness=2, dnn=False):
        self.weights = weights
        self.device = device
        self.half = half
        self.img_size = img_size
        self.conf_thres = conf_thres
        self.iou_thres = iou_thres
        self.max_det = max_det
        self.classes = classes
        self.agnostic_nms = agnostic_nms
        self.augment = augment
        self.visualize = visualize
        self.line_thickness = line_thickness
        self.dnn = dnn
~~~

2. The detectapi class that runs detection

+ First, the class initializes its parameters: `self.opt = simulation_opt(weights=weights, img_size=img_size)`
+ Second, when loading images for detection it uses `dataset = MyLoadImages(source, img_size=self.imgsz, stride=self.stride)`
+ The rest of the code follows the same logic as detect.py, with the result-saving steps removed

~~~python
class detectapi:
    def __init__(self, weights, img_size=640):
        # the constructor does the necessary setup: initialize parameters, load the model
        self.opt = simulation_opt(weights=weights, img_size=img_size)
        weights, imgsz = self.opt.weights, self.opt.img_size

        # initialization
        self.device = torch.device("cpu") if self.opt.device == 'cpu' else torch.device("cuda")

        # load the model
        self.model = DetectMultiBackend(weights, device=self.device, dnn=self.opt.dnn, fp16=self.opt.half)
        self.stride, self.names, self.pt = self.model.stride, self.model.names, self.model.pt
        self.model = self.model.to(self.device)
        self.model = self.model.half() if self.opt.half else self.model.float()  # FP16 only on CUDA
        self.imgsz = check_img_size(imgsz, s=self.stride)
        self.color = [[random.randint(0, 255) for _ in range(3)] for _ in self.names]

    def detect(self, source):
        # custom image loading, since the input is a video stream or an image
        dataset = MyLoadImages(source, img_size=self.imgsz, stride=self.stride)
        for im, im0s in dataset:
            im = im.astype(np.float32) / 255  # uint8 to float16/32
            im = torch.tensor(im).to(self.device)  # np.array to tensor
            im = im.half() if self.opt.half else im.float()
            if len(im.shape) == 3:
                im = im[None]  # add batch dimension
            t1 = time.time()

            # inference
            pred = self.model(im, augment=self.opt.augment, visualize=False)
            # NMS, non-maximum suppression
            pred = non_max_suppression(pred, self.opt.conf_thres, self.opt.iou_thres, self.opt.classes,
                                       self.opt.agnostic_nms, max_det=self.opt.max_det)
            t2 = time.time()

            im0 = im0s.copy()
            # annotator, for drawing boxes (optional)
            annotator = Annotator(im0, line_width=self.opt.line_thickness, example=str(self.names))
            result_text = []
            # det holds one image's detections: n * (x, y, x, y, confidence, class)
            for i, det in enumerate(pred):
                if len(det):
                    # rescale boxes back to the original image size
                    det[:, :4] = scale_coords(im.shape[2:], det[:, :4], im0.shape).round()
                    for *xyxy, conf, cls in reversed(det):
                        c = int(cls)  # integer class
                        label = f'{self.names[int(cls)]} {conf:.2f}'
                        line = (int(cls.item()), [int(_.item()) for _ in xyxy], conf.item())
                        result_text.append(line)
                        # draw the box; the annotated image is annotator.result()
                        annotator.box_label(xyxy, label, color=colors(c, True))
            im0 = annotator.result()
            print(f'Done. ({t2 - t1:.3f}s)')
            return im0
~~~

3. Calling the detectapi class

+ Create the detector: `det = detectapi(weights="weights/yolov5n.pt")`
+ Detect: `img_out = det.detect([img])`; the image must be wrapped in a list `[]`
+ Notes:
  1. The CPU cannot run half precision; FP16 (half=True) requires CUDA
     + CUDA: device=0, half=True
     + CPU: device='cpu', half=False
  2. Before running, set the source and weights paths in the main function

~~~python
def main():
    # checks the environment / prints the parameters: mainly whether the packages
    # in requirements.txt are installed, with the configured parameters shown in color
    cap = cv2.VideoCapture(r'D:/code/yolov5_det/1.MP4')  # source
    det = detectapi(weights="weights/yolov5n.pt")        # weights
    while True:
        rec, img = cap.read()
        img = cv2.resize(img, (640, 480))
        img_out = det.detect([img])
        cv2.imshow('src', img_out)
        if cv2.waitKey(1) == ord('q'):
            break

if __name__ == "__main__":
    main()
~~~
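On the two thresholds used throughout the detection code above: `conf_thres` drops low-confidence boxes, and `iou_thres` controls how aggressively NMS suppresses boxes that overlap a higher-scoring one. YOLOv5's `non_max_suppression` does this internally (via torchvision), but a minimal standalone sketch of the underlying IoU computation, for corner-format boxes, is:

```python
def iou(a, b):
    """IoU of two axis-aligned boxes in (x1, y1, x2, y2) corner format."""
    # intersection rectangle (empty if the boxes don't overlap)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# two 100x100 boxes overlapping by half their width: IoU = 5000 / 15000 = 1/3
print(iou((0, 0, 100, 100), (50, 0, 150, 100)))
```

With the default `iou_thres=0.45`, the lower-scoring of these two boxes would survive NMS (1/3 < 0.45); raising the threshold keeps more overlapping boxes, lowering it merges more of them.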