# yolov5

**Repository Path**: linClubs/yolov5

## Basic Information

- **Project Name**: yolov5
- **Description**: YOLOv5 object detection, based on v6.2. The `master` branch supports training and detection; the `only_detect` branch is detection-only; the `yolov5_ros` branch wraps detection for ROS; the `detect` branch is the most minimal detection project; the `det_seg` branch contains training and inference code for joint detection and segmentation; the `yolov5-7` branch adds attention mechanisms and lightweight variants.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2022-10-17
- **Last Updated**: 2023-09-20

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# 1 Environment setup

Install the GPU driver beforehand and create a conda virtual environment, then install the dependencies:

~~~bash
pip install -r requirements.txt
~~~

---

# 2 Training (train)

## 2.1 Prepare the dataset (datasets)

+ Annotation

For object detection, labelImg is sufficient; for segmentation, annotate with labelme. [Reference](https://gitee.com/linClubs/yolov5/blob/det_seg/custom_data/readme.md)

+ Install labelImg

~~~
tensorflow-gpu==2.4.0
labelme==3.16.7
opencv-python==4.6.0.66
~~~

+ Launch

~~~sh
labelImg
~~~

+ Annotate

YOLOv5 can train on three label formats: VOC, COCO, and YOLO. We use the YOLO format here, so no format conversion is needed.

Put the dataset images under ./datasets/cat_dog/train/images and ./datasets/cat_dog/val/images. Keep train and val separate (you can also split them by hand after annotating), and create a labels directory under train and val to hold the label files.

After launching labelImg, set four things:

1. Open Dir: the dataset image directory (mine is ./datasets/cat_dog/train/images)
2. Change Save Dir: the directory where the generated labels are saved (mine is ./datasets/cat_dog/train/labels); annotate val the same way, replacing train with val
3. In the View menu (title bar), enable Auto Save mode
4. Switch the label format to YOLO
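In the YOLO format, each image gets a same-named `.txt` file under `labels/` with one line per object: `class x_center y_center width height`, where all four coordinates are normalized by the image size. As a minimal sketch (the helper name and example values here are illustrative, not part of the repo), the conversion from a pixel-space corner box, as drawn in the tool, to a YOLO label line looks like this:

```python
def voc_to_yolo(box, img_w, img_h):
    """Convert an (xmin, ymin, xmax, ymax) pixel box to the normalized
    (x_center, y_center, width, height) tuple used by YOLO label files."""
    xmin, ymin, xmax, ymax = box
    xc = (xmin + xmax) / 2 / img_w
    yc = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return xc, yc, w, h

# One line of a labels/*.txt file for class 0 (dog), illustrative box values:
line = "0 " + " ".join(f"{v:.6f}" for v in voc_to_yolo((100, 200, 300, 400), 640, 480))
# line == "0 0.312500 0.625000 0.312500 0.416667"
```

labelImg writes these files for you when the label format is switched to YOLO; the sketch is only to show what ends up in `labels/`.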
Common shortcuts: W to draw a box, D for the next image, A for the previous image, Ctrl+S to save (unnecessary if Auto Save is enabled).

![](./data/images/1.jpg)

## 2.2 Adjust the default parameters in train.py

~~~python
def parse_opt(known=False):
    parser = argparse.ArgumentParser()
    # pretrained weights
    parser.add_argument('--weights', type=str, default=ROOT / 'weights/yolov5n.pt', help='initial weights path')
    # model architecture config; set the class count in the yaml file under models/
    parser.add_argument('--cfg', type=str, default='models/yolov5n.yaml', help='model.yaml path')
    # dataset paths, class count, and label names
    parser.add_argument('--data', type=str, default=ROOT / 'data/data.yaml', help='dataset.yaml path')
    # other model hyperparameters
    parser.add_argument('--hyp', type=str, default=ROOT / 'data/hyps/hyp.scratch-low.yaml', help='hyperparameters path')
    # training epochs
    parser.add_argument('--epochs', type=int, default=10)
    # images per batch
    parser.add_argument('--batch-size', type=int, default=2, help='total batch size for all GPUs, -1 for autobatch')
    # training image size
    parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='train, val image size (pixels)')
    # whether to use the GPU
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    # on dataloader worker errors, reduce this
    parser.add_argument('--workers', type=int, default=2, help='max dataloader workers (per RANK in DDP mode)')
~~~

**Common training errors: if CUDA runs out of memory, reduce `--batch-size`; if DDP multiprocessing fails, reduce `--workers`.**

+ Pretrained weights `--weights`

Download the official yolov5 weight file (yolov5n.pt here), put it under ./weights/, and set the path: `default=ROOT / 'weights/yolov5n.pt'`

+ Model architecture config `--cfg`

Must match the pretrained weights; it lives under models/ (mine is `yolov5n.yaml`). Set the `nc` value inside the yaml to the number of classes you detect.

+ Dataset paths, classes, and labels `--data`

My dataset layout:

~~~sh
datasets/cat_dog/train/images
datasets/cat_dog/train/labels
datasets/cat_dog/val/images
datasets/cat_dog/val/labels
~~~

Edit data/data.yaml to match that layout:

~~~yaml
path: ./datasets        # dataset root dir
train: cat_dog/train    # training set
val: cat_dog/val        # validation set
test:                   # optional, may be omitted
nc: 2                   # number of classes
names: ['dog', 'cat']   # labels, in the same order used when annotating
~~~

+ Other hyperparameters `--hyp`

The default `data/hyps/hyp.scratch-low.yaml` is fine.

+ Everything else can stay at its default; the four options above are the main ones to adjust.

## 2.3 Train

Activate the virtual environment and run train.py:

~~~bash
python train.py
~~~

The initial parameters can also be passed on the command line:

~~~bash
python train.py --weights weights/yolov5s.pt --cfg models/yolov5s.yaml --data data/data.yaml --epochs 10 --batch-size 4 --img-size 640 --device 1
~~~

## 2.4 Results

The results are saved to `runs/train/exp/weights/last.pt`

---

# 3 Detection (detect)

Adjust the default parameters in detect.py ([parameter reference](https://blog.csdn.net/IT_charge/article/details/119180151)):

+ `--weights` the trained weights, e.g. `weights/yolov5n.pt`
+ `--source` input data; 0 means the webcam
+ `--data` optional, may be omitted
+ `--view-img` whether to display the results
+ `--conf-thres` confidence threshold
+ `--iou-thres` IoU threshold
+ `--agnostic-nms` enable class-agnostic non-maximum suppression

~~~python
def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'weights/yolov5n.pt', help='model path(s)')
    parser.add_argument('--source', type=str, default=ROOT / '../datasets/1.mp4', help='file/dir/URL/glob, 0 for webcam')
    parser.add_argument('--data', type=str, default='', help='(optional) dataset.yaml path')
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
    parser.add_argument('--conf-thres', type=float, default=0.25, help='confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold')
    parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
    parser.add_argument('--device', default='cpu', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--view-img', action='store_true', default=1, help='show results')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--save-crop', action='store_true', help='save cropped prediction boxes')
    parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
    parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --classes 0, or --classes 0 2 3')
    parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--visualize', action='store_true', help='visualize features')
    parser.add_argument('--update', action='store_true', help='update all models')
    parser.add_argument('--project', default=ROOT / 'runs/detect', help='save results to project/name')
    parser.add_argument('--name', default='exp', help='save results to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--line-thickness', default=3, type=int, help='bounding box thickness (pixels)')
    parser.add_argument('--hide-labels', default=False, action='store_true', help='hide labels')
    parser.add_argument('--hide-conf', default=False, action='store_true', help='hide confidences')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
~~~

---

# 4 Wrapping detection in a class that takes images and returns results

## 4.0 Imports

~~~python
import random
import numpy as np
import torch
import time

from models.common import DetectMultiBackend
from utils.dataloaders import MyLoadImages
from utils.general import check_img_size, cv2, non_max_suppression, scale_coords
from utils.plots import Annotator, colors
~~~

## 4.1 Add a custom image-loading class

Add the following MyLoadImages class to utils/dataloaders.py:

~~~python
class MyLoadImages:
    def __init__(self, path, img_size=640, stride=32):
        for img in path:
            if type(img) != np.ndarray or len(img.shape) != 3:
                raise TypeError('there is an object in source that is not a picture read by cv2')
        self.img_size = img_size
        self.stride = stride
        self.files = path
        self.nf = len(path)
        self.mode = 'image'

    def __iter__(self):
        self.count = 0
        return self

    def __next__(self):
        if self.count == self.nf:
            raise StopIteration
        img0s = self.files[self.count]
        self.count += 1
        img = letterbox(img0s, self.img_size, stride=self.stride)[0]
        img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
        img = np.ascontiguousarray(img)
        return img, img0s
~~~

## 4.2 The detection wrapper, detect_api.py

1. Wrap the initialization parameters in a simulation_opt class

~~~python
class simulation_opt:
    def __init__(self, weights, img_size=640, conf_thres=0.25, iou_thres=0.45,
                 device=0, half=False,   # FP16 (half=True) is only supported on CUDA
                 # device='cpu', half=False,
                 classes=None, agnostic_nms=False, augment=False, visualize=False,
                 max_det=1000, line_thickness=2, dnn=False):
        self.weights = weights
        self.device = device
        self.half = half
        self.img_size = img_size
        self.conf_thres = conf_thres
        self.iou_thres = iou_thres
        self.max_det = max_det
        self.classes = classes
        self.agnostic_nms = agnostic_nms
        self.augment = augment
        self.visualize = visualize
        self.line_thickness = line_thickness
        self.dnn = dnn
~~~

2. The detectapi class that runs detection

+ First, the class initializes its parameters: `self.opt = simulation_opt(weights=weights, img_size=img_size)`
+ Second, when loading images for detection it uses `dataset = MyLoadImages(source, img_size=self.imgsz, stride=self.stride)`
+ The rest of the code follows the same logic as detect.py, with the result-saving steps removed

~~~python
class detectapi:
    def __init__(self, weights, img_size=640):
        # the constructor does the necessary setup: initialize parameters, load the model
        self.opt = simulation_opt(weights=weights, img_size=img_size)
        weights, imgsz = self.opt.weights, self.opt.img_size

        # initialization
        self.device = torch.device("cpu") if self.opt.device == 'cpu' else torch.device("cuda")

        # load the model
        self.model = DetectMultiBackend(weights, device=self.device, dnn=self.opt.dnn, fp16=self.opt.half)
        self.stride, self.names, self.pt = self.model.stride, self.model.names, self.model.pt
        self.model = self.model.to(self.device)
        self.model = self.model.half() if self.opt.half else self.model.float()  # FP16 only on CUDA
        self.imgsz = check_img_size(imgsz, s=self.stride)
        self.color = [[random.randint(0, 255) for _ in range(3)] for _ in self.names]

    def detect(self, source):
        # custom image loading, since the input is a video stream or an image
        dataset = MyLoadImages(source, img_size=self.imgsz, stride=self.stride)
        for im, im0s in dataset:
            im = im.astype(np.float32) / 255  # uint8 to float16/32
            im = torch.tensor(im).to(self.device)  # np.array to tensor
            im = im.half() if self.opt.half else im.float()
            if len(im.shape) == 3:
                im = im[None]  # add batch dimension
            t1 = time.time()

            # inference
            pred = self.model(im, augment=self.opt.augment, visualize=False)
            # NMS, non-maximum suppression
            pred = non_max_suppression(pred, self.opt.conf_thres, self.opt.iou_thres, self.opt.classes,
                                       self.opt.agnostic_nms, max_det=self.opt.max_det)
            t2 = time.time()

            im0 = im0s.copy()
            # annotator, for drawing boxes (optional)
            annotator = Annotator(im0, line_width=self.opt.line_thickness, example=str(self.names))
            result_text = []
            # det holds one image's detections: n * (x, y, x, y, confidence, class)
            for i, det in enumerate(pred):
                if len(det):
                    # rescale boxes back to the original image size
                    det[:, :4] = scale_coords(im.shape[2:], det[:, :4], im0.shape).round()
                    for *xyxy, conf, cls in reversed(det):
                        c = int(cls)  # integer class
                        label = f'{self.names[int(cls)]} {conf:.2f}'
                        line = (int(cls.item()), [int(_.item()) for _ in xyxy], conf.item())
                        result_text.append(line)
                        # draw the box; the annotated image is annotator.result()
                        annotator.box_label(xyxy, label, color=colors(c, True))
            im0 = annotator.result()
            print(f'Done. ({t2 - t1:.3f}s)')
            return im0
~~~

3. Calling the detectapi class

+ Create the detector: `det = detectapi(weights="weights/yolov5n.pt")`
+ Detect: `img_out = det.detect([img])`; the image must be wrapped in a list `[]`
+ Notes:
  1. The CPU cannot run half precision; FP16 (half=True) requires CUDA
     + CUDA: device=0, half=True
     + CPU: device='cpu', half=False
  2. Before running, set the source and weights paths in the main function

~~~python
def main():
    # checks the environment / prints the parameters: mainly whether the packages
    # in requirements.txt are installed, with the configured parameters shown in color
    cap = cv2.VideoCapture(r'D:/code/yolov5_det/1.MP4')  # source
    det = detectapi(weights="weights/yolov5n.pt")        # weights
    while True:
        rec, img = cap.read()
        img = cv2.resize(img, (640, 480))
        img_out = det.detect([img])
        cv2.imshow('src', img_out)
        if cv2.waitKey(1) == ord('q'):
            break

if __name__ == "__main__":
    main()
~~~
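On the two thresholds used throughout the detection code above: `conf_thres` drops low-confidence boxes, and `iou_thres` controls how aggressively NMS suppresses boxes that overlap a higher-scoring one. YOLOv5's `non_max_suppression` does this internally (via torchvision), but a minimal standalone sketch of the underlying IoU computation, for corner-format boxes, is:

```python
def iou(a, b):
    """IoU of two axis-aligned boxes in (x1, y1, x2, y2) corner format."""
    # intersection rectangle (empty if the boxes don't overlap)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# two 100x100 boxes overlapping by half their width: IoU = 5000 / 15000 = 1/3
print(iou((0, 0, 100, 100), (50, 0, 150, 100)))
```

With the default `iou_thres=0.45`, the lower-scoring of these two boxes would survive NMS (1/3 < 0.45); raising the threshold keeps more overlapping boxes, lowering it merges more of them.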