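For undergraduates, unless you are at an above-average 985 university or one of the few schools with a strong CS program, I would not recommend majoring in artificial intelligence if you have no plans for graduate study. Students with a guaranteed graduate place may get exposure to this kind of work by helping out in a lab in exchange for a recommendation letter. This blog post is a hands-on YOLOv5 project. YOLOv5 is the most representative YOLO version, and even YOLOv10, released by Tsinghua in the first half of 2024, does not overshadow it. I will not explain the version history and algorithm changes from YOLOv1 to YOLOv5 here; this post only walks through a practical project.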

• Miniconda
• Download: Index of /anaconda/miniconda/ | Tsinghua Open Source Mirror
• PyPI mirror in China:
• pypi | Mirror usage help | Tsinghua Open Source Mirror
• PyTorch
• Official site: PyTorch
• YOLOv5
• GitHub: ultralytics/yolov5: YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite (github.com)

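You do not strictly need Miniconda; Anaconda works too, since it is the more complete toolkit. I had already installed Anaconda for OpenCV, so I did not switch to the lightweight Miniconda. If you do use Miniconda, pick the stable py38 22.11.1-1 release.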

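If your C: drive has room, install it there, and make sure the install path contains no Chinese characters. During setup, tick the option that adds it to PATH. Then, in the conda cmd prompt, run conda create -n yolov5 python=3.8. PS: with a newer Python version, some packages will fail to install.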

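Installing PyTorch

If you do not have an Nvidia GPU, just choose the CPU version.

If you do have an Nvidia GPU, search Windows for "NVIDIA Control Panel" and click System Information in the lower-left corner. It shows the following:

This indicates CUDA 12.2 146

Pick a reasonably recent version that suits your GPU. If you get errors here, check whether a proxy (VPN) is turned on, and turn it off if so. For lightweight training you do not need to download CUDA, and your GPU is probably not much good anyway.

Follow the steps below; if your output matches the screenshot, the install is OK.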

https://i-blog.csdnimg.cn/direct/debbcce075aa45309bcc040d59ea575d.png

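YOLOv5 on GitHub

Open the Releases page on the right of the repository, download the source, then open requirements.txt.

Change the pins to numpy==1.20.3 and Pillow==8.3.0, and comment out the torch and torchvision lines.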
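If you would rather not edit the file by hand, the same change can be scripted. This is only a rough sketch: patch_requirements is a name I made up, and the sample lines are invented stand-ins for whatever your copy of requirements.txt actually contains.

```python
import re

# Pins taken from the step above; package names are compared in lower case.
PINS = {"numpy": "numpy==1.20.3", "pillow": "Pillow==8.3.0"}
COMMENT_OUT = {"torch", "torchvision"}


def patch_requirements(text):
    """Return requirements text with numpy/Pillow pinned and torch lines commented out."""
    patched = []
    for line in text.splitlines():
        # The package name is the leading run of name characters on a non-comment line.
        match = re.match(r"\s*([A-Za-z0-9_.\-]+)", line)
        is_comment = line.lstrip().startswith("#")
        name = match.group(1).lower() if match and not is_comment else None
        if name in PINS:
            patched.append(PINS[name])
        elif name in COMMENT_OUT:
            patched.append("# " + line)
        else:
            patched.append(line)
    return "\n".join(patched) + "\n"


# Invented sample lines, only to show the effect.
sample = "numpy>=1.18.5\nPillow>=7.1.2\ntorch>=1.7.0\ntorchvision>=0.8.1\ntqdm>=4.64.0\n"
print(patch_requirements(sample))
```

To apply it to the real file, read requirements.txt, pass the text through patch_requirements, and write the result back (keep a backup first).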

Open cmd in the yolov5 folder, run conda activate yolov5, and then pip install -r requirements.txt.

Run python detect.py. It prints some basic information and, as the prompts indicate, downloads yolov5s.pt.

Running detection

weights: the trained model file

https://i-blog.csdnimg.cn/direct/75310ea48f594cff9354667cd1d76dce.png

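In cmd, run python detect.py --weights yolov5s.pt

The other weight files can be downloaded in advance and placed in the yolo folder.

source: what to run detection on, such as an image, a folder, the screen, or a webcam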

python detect.py --weights yolov5s.pt --source data/images/bus.jpg

The result appears. Next, real-time detection on the screen:

python detect.py --weights yolov5s.pt --source screen

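conf-thres: the confidence threshold. Detections scoring below it are discarded, so raising it leaves fewer boxes.

iou-thres: works the other way round. It is the IoU threshold used by non-maximum suppression, so raising it lets more overlapping boxes survive and lowering it removes more of them.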
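To see how the two thresholds interact, here is a small pure-Python sketch of confidence filtering followed by greedy NMS. It is a simplified, class-agnostic illustration and not YOLOv5's own implementation; iou and filter_detections are hypothetical helper names, and the boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf_thres=0.25, iou_thres=0.45):
    """dets is a list of (box, score); returns what survives thresholding and NMS."""
    # Step 1: drop everything below the confidence threshold.
    dets = [d for d in dets if d[1] >= conf_thres]
    # Step 2: greedy NMS, visiting the highest score first.
    dets.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in dets:
        # Keep the box only if it does not overlap a kept box too much.
        if all(iou(box, k[0]) <= iou_thres for k in kept):
            kept.append((box, score))
    return kept


# Made-up detections: two heavily overlapping boxes, one separate, one low-score.
dets = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8),
        ((20, 20, 30, 30), 0.6), ((40, 40, 50, 50), 0.1)]
print(len(filter_detections(dets)))                 # 2
print(len(filter_detections(dets, iou_thres=0.7)))  # 3
```

The first two boxes overlap with an IoU of about 0.68, so the weaker one is suppressed at the default 0.45 but survives at 0.7. The last box never gets that far because its score is below conf-thres.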

https://i-blog.csdnimg.cn/direct/8cc07f18749d4d9abeef2ff1c0d3f623.png

There are plenty of other parameters you can adjust.

Using Jupyter as an interface

Create a new notebook, hub_detect.ipynb

import torch

# Model  
model = torch.hub.load("./", "yolov5s", source="local")

# Images
img = "./data/images/zidane.jpg"

# Inference
results = model(img)

# Results
results.show()

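Building a custom dataset

Image recognition

Tutorial: a very detailed guide to training a YOLOv5 model from scratch (CSDN blog)

Just follow the steps in that tutorial.

Dynamic recognition in video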

https://i-blog.csdnimg.cn/direct/0dcedb29700342ecbcf00f7fcb156f04.png

https://i-blog.csdnimg.cn/direct/eb7e76240a4942ddb09c6b08974c2981.png

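In the project folder, create a datasets directory containing the video and an .ipynb notebook that uses OpenCV to extract frames from the video. Put the extracted frames in the images folder.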
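The frame extraction can be sketched roughly as follows. The function names, the step of 30 frames, and naming each file after its frame index are my own assumptions for illustration, not what the original notebook necessarily did. It needs the opencv-python package.

```python
from pathlib import Path


def is_kept(index, step):
    """True for frame 0, step, 2*step, ..."""
    return index % step == 0


def extract_frames(video_path, out_dir, step=30):
    """Save every `step`-th frame as <frame index>.jpg and return how many were saved."""
    import cv2  # imported lazily so is_kept can be used without OpenCV installed

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(str(video_path))
    saved = 0
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of the video, or the file could not be read
            break
        if is_kept(index, step):
            cv2.imwrite(str(out_dir / f"{index}.jpg"), frame)
            saved += 1
        index += 1
    cap.release()
    return saved
```

In a notebook cell you would call something like extract_frames("datasets/BVN.MP4", "datasets/images", step=30). A larger step gives fewer, more varied frames to label.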

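In the terminal, run labelimg. Click Open Dir and select images, then under Change Save Dir select the labels folder at the same level.

Click the PascalVOC button so that it switches to YOLO, save, then open the View menu and turn on Auto Save mode.

Right-clicking and choosing Create RectBox kept crashing the program. Does the method in this issue fix it? The crash comes from a mismatch between Python 3.8 and 3.11, so a reinstall is needed: Unable to draw annotations on Windows · Issue #811 · HumanSignal/labelImg (github.com)

I switched to labelme instead: pip install labelme in the same way, follow the same procedure, and label the two classes daitu and mingren.

labelme saves its annotations as JSON files, which differ from the txt files YOLO expects.

JSON-to-txt conversion (untested)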

import glob
import json
import os
import os.path as osp


def labelme2yolov2Seg(jsonfilePath="", resultDirPath="", classList=("YiBiao", "ZhiZhen")):
    """
    Convert a dataset annotated with labelme into the label format used by
    YOLOv5 7.0 segmentation: one line per object, "class_id x1 y1 x2 y2 ...",
    with every coordinate divided by the image size.
    :param jsonfilePath: folder containing the *.json files produced by labelme
    :param resultDirPath: folder in which the converted *.txt files are saved
    :param classList: class labels of the dataset; the index is the class ID
    :return: None
    """
    classList = list(classList)

    # 0. Create the output folder if it does not exist yet
    os.makedirs(resultDirPath, exist_ok=True)

    # 1. Collect every labelme JSON file in the folder
    jsonfileList = glob.glob(osp.join(jsonfilePath, "*.json"))
    print(jsonfileList)  # show which files will be converted

    # 2. Convert the JSON files one by one
    for jsonfile in jsonfileList:
        # 3. Load the JSON file
        with open(jsonfile, "r", encoding="utf-8") as f:
            file_in = json.load(f)

        # 4. Read every annotated object recorded in the file
        shapes = file_in["shapes"]

        # 5. Create a txt file named after the image to hold the labels
        txtName = osp.splitext(osp.basename(jsonfile))[0] + ".txt"
        with open(osp.join(resultDirPath, txtName), "w") as file_handle:
            # 6. Go through the outline of every object in shapes
            for shape in shapes:
                # 7. Look up the class ID of the label in classList
                values = [str(classList.index(shape["label"]))]

                # 8. Scale every outline point by the image size: x/width, y/height
                for point in shape["points"]:
                    values.append(str(point[0] / file_in["imageWidth"]))
                    values.append(str(point[1] / file_in["imageHeight"]))

                # 9. One object per line
                file_handle.write(" ".join(values) + "\n")
        # 10. Both files are closed automatically when their with blocks end


if __name__ == "__main__":
    jsonfilePath = "E:\\yolo\\yolov5-master\\datasets\\labelme\\json"  # folder with the JSON files to convert
    resultDirPath = "E:\\yolo\\yolov5-master\\datasets\\labelme\\txt"  # folder for the generated txt files
    labelme2yolov2Seg(jsonfilePath=jsonfilePath, resultDirPath=resultDirPath, classList=["YiBiao", "ZhiZhen"])
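One thing to watch: the converter above writes every outline point, which is the segmentation label layout. For plain detection, a YOLO label line is "class x_center y_center width height", all normalized to 0..1. If the labelme annotations are rectangles (two corner points each), a conversion along these lines may be closer to what train.py wants. This is a sketch with hypothetical helper names and a made-up shape, and like the script above it has not been tested on the real dataset.

```python
def rectangle_to_yolo(points, image_width, image_height):
    """Convert two opposite corners [[x1, y1], [x2, y2]] in pixels to
    normalized (x_center, y_center, width, height)."""
    (xa, ya), (xb, yb) = points
    x_min, x_max = min(xa, xb), max(xa, xb)
    y_min, y_max = min(ya, yb), max(ya, yb)
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return x_center, y_center, width, height


def shape_to_line(shape, image_width, image_height, class_list):
    """Build one YOLO detection label line from a labelme rectangle shape."""
    class_id = class_list.index(shape["label"])
    box = rectangle_to_yolo(shape["points"], image_width, image_height)
    return " ".join([str(class_id)] + [f"{v:.6f}" for v in box])


# Made-up rectangle on a 640x480 image, using the two class names from this post.
shape = {"label": "mingren", "shape_type": "rectangle", "points": [[100, 50], [300, 250]]}
print(shape_to_line(shape, 640, 480, ["daitu", "mingren"]))
# 1 0.312500 0.312500 0.312500 0.416667
```

Taking the min and max of the corners means it does not matter which corner was clicked first when drawing the box.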

Deep Learning (10): Roboflow in detail: dataset annotation, training and download (CSDN blog)

YOLOv5 model training

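·images: holds the images
    ·train: training-set images
    ·val: validation-set images
·labels: holds the labels
    ·train: training-set label files, whose names must match the training-set image names one to one
    ·val: validation-set label files, whose names must match the validation-set image names one to one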
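Sorting the files into that layout by hand gets tedious, so here is one way it might be scripted. It is a sketch under my own assumptions: split_dataset is a made-up name, the 80/20 split is arbitrary, and images without a matching txt label are simply skipped.

```python
import random
import shutil
from pathlib import Path


def split_dataset(image_dir, label_dir, out_root, val_ratio=0.2, seed=0):
    """Copy image/label pairs into images/{train,val} and labels/{train,val} under out_root."""
    image_dir, label_dir, out_root = Path(image_dir), Path(label_dir), Path(out_root)
    # Keep only images that have a label file with the same stem.
    images = sorted(
        p for p in image_dir.iterdir()
        if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
        and (label_dir / (p.stem + ".txt")).exists()
    )
    random.Random(seed).shuffle(images)  # fixed seed so the split is repeatable
    n_val = int(len(images) * val_ratio)
    counts = {}
    for subset, subset_images in (("val", images[:n_val]), ("train", images[n_val:])):
        (out_root / "images" / subset).mkdir(parents=True, exist_ok=True)
        (out_root / "labels" / subset).mkdir(parents=True, exist_ok=True)
        for img in subset_images:
            label = label_dir / (img.stem + ".txt")
            shutil.copy(img, out_root / "images" / subset / img.name)
            shutil.copy(label, out_root / "labels" / subset / label.name)
        counts[subset] = len(subset_images)
    return counts
```

A call such as split_dataset("raw/images", "raw/labels", "datasets") (the raw folder names are placeholders) returns the number of pairs copied into each subset. Because image and label are copied together, the file names stay matched one to one.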

In the data folder, make a copy of coco128.yaml in the same location and rename it bvn.yaml.

Change everything after line 11 to the following:

path: ./datasets  # dataset root dir
train: images/train  # train images (relative to 'path') 128 images
val: images/val  # val images (relative to 'path') 128 images
test:  # test images (optional)

# Classes
names:
  0: daitu
  1: mingren

In train.py, line 439:

parser.add_argument('--data', type=str, default=ROOT / 'data/bvn.yaml', help='dataset.yaml path')

Run train.py.

It failed with an error; screenshot below.

https://i-blog.csdnimg.cn/direct/2e6fdc11a1c34e789c6fb777279aba33.png

I asked ChatGPT, and the fix is just a small code addition.

https://i-blog.csdnimg.cn/direct/4e32bda049de4cd099c18b822849b033.png
Add the following at the top of train.py, before the other imports:

import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'
# The two lines above are only a temporary workaround

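Common problems

The Arial.ttf font file cannot be downloaded
·Download it manually and put it in the right place; on Windows the directory is ~/AppData/Roaming/Ultralytics

The paging file is too small for this operation to complete
·Set workers in the training parameters to 1
·Increase the virtual memory: give the drive where the environment is installed a larger paging file

'Upsample' object has no attribute 'recompute_scale_factor'
·Caused by a PyTorch version that is too new; you can downgrade, and 1.8.2 currently does not raise the error
·If you would rather not downgrade, you can edit the PyTorch source: open the upsampling.py named in the error and remove the recompute_scale_factor argument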

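Training succeeded! (Meaning the process exited with code 0.)

The weights folder contains best.pt and last.pt, which are the best-performing checkpoint and the one from the final epoch. There is also a file with a very long name, which is a log.

In the PyCharm terminal, run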

tensorboard --logdir runs

and then open the page it serves on port 6006.

https://i-blog.csdnimg.cn/direct/3a6c92149fc34bfebef8ad13e627958a.png

https://i-blog.csdnimg.cn/direct/b2a0a1c05eca42e286ccf4efc6de3a17.png

labels.png also holds some statistics about the labels, while results.csv and results.png gather the training metrics in one place.
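If you want the numbers rather than the plots, results.csv can be read with the standard library. The sketch below assumes the header contains "epoch" and "metrics/mAP_0.5" padded with spaces, which is how my copy looked; column names can differ between YOLOv5 versions, so check your own file. best_epoch is a made-up helper, and the sample rows are invented.

```python
import csv
import io


def best_epoch(csv_text, metric="metrics/mAP_0.5"):
    """Return (epoch, value) for the row with the highest value in the given column."""
    reader = csv.reader(io.StringIO(csv_text))
    # Column names are padded with spaces, so strip them before looking them up.
    header = [name.strip() for name in next(reader)]
    epoch_col, metric_col = header.index("epoch"), header.index(metric)
    best = None
    for row in reader:
        if not row:  # skip blank lines
            continue
        epoch, value = int(float(row[epoch_col])), float(row[metric_col])
        if best is None or value > best[1]:
            best = (epoch, value)
    return best


# Invented rows that mimic the padded layout of results.csv.
sample = (
    "   epoch,   metrics/precision,   metrics/mAP_0.5\n"
    "       0,                0.41,              0.31\n"
    "       1,                0.63,              0.58\n"
    "       2,                0.60,              0.55\n"
)
print(best_epoch(sample))  # (1, 0.58)
```

For a real run, pass in the text of the results.csv inside your experiment folder, for example the one under runs/train/exp3.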

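In cmd, run python detect.py --weights runs/train/exp3/weights/best.pt --source datasets/BVN.MP4 --view-img

The result is actually passable, but it tells me the annotations need to be cleaner, otherwise the detections tend to overlap. Sob.

Building a GUI with PySide6

In the terminal, run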

pip install pyside6
then
where python

https://i-blog.csdnimg.cn/direct/4f438da08d2d451ca3ee62d6a1a912f1.png

D:\anaconda\Lib\site-packages\PySide6

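In Qt Designer, choose Main Window and click Create. Drag in two Labels, adjust their size and position, and add Push Button widgets for the buttons.

In the property panel on the right, set alignment to horizontally centered and tick scaledContents.

Save the file under a name that is easy to remember (the code below expects the labels to be named input and output, and the buttons det_image and det_video).

Right-click the .ui file and choose Compile Qt UI (uic) to generate the corresponding .py file.

Create base_ui.py, which imports the generated ui_main_window.py, with the following content: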

import cv2
import sys
import torch
from PySide6.QtWidgets import QMainWindow, QApplication, QFileDialog
from PySide6.QtGui import QPixmap, QImage
from PySide6.QtCore import QTimer

from ui_main_window import Ui_MainWindow


def convert2QImage(img):
    height, width, channel = img.shape
    return QImage(img, width, height, width * channel, QImage.Format_RGB888)


class MainWindow(QMainWindow, Ui_MainWindow):
    def __init__(self):
        super(MainWindow, self).__init__()
        self.setupUi(self)
        self.model = torch.hub.load("./", "custom", path="runs/train/exp3/weights/best.pt", source="local")
        self.timer = QTimer()
        self.timer.setInterval(1)
        self.video = None
        self.bind_slots()

    def image_pred(self, file_path):
        results = self.model(file_path)
        image = results.render()[0]
        return convert2QImage(image)

    def open_image(self):
        print("Detect image clicked!")
        self.timer.stop()
        # Documented Qt filter form: patterns separated by spaces inside parentheses
        file_path = QFileDialog.getOpenFileName(self, dir="./datasets/images/train", filter="Images (*.jpg *.png *.jpeg)")
        if file_path[0]:
            file_path = file_path[0]
            qimage = self.image_pred(file_path)
            self.input.setPixmap(QPixmap(file_path))
            self.output.setPixmap(QPixmap.fromImage(qimage))

    def video_pred(self):
        ret, frame = self.video.read()
        if not ret:
            self.timer.stop()
        else:
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            self.input.setPixmap(QPixmap.fromImage(convert2QImage(frame)))
            results = self.model(frame)
            image = results.render()[0]
            self.output.setPixmap(QPixmap.fromImage(convert2QImage(image)))

    def open_video(self):
        print("Detect video clicked!")
        file_path = QFileDialog.getOpenFileName(self, dir="./datasets", filter="*.mp4")
        if file_path[0]:
            file_path = file_path[0]
            self.video = cv2.VideoCapture(file_path)
            self.timer.start()

    def bind_slots(self):
        self.det_image.clicked.connect(self.open_image)
        self.det_video.clicked.connect(self.open_video)
        self.timer.timeout.connect(self.video_pred)


if __name__ == "__main__":
    app = QApplication(sys.argv)

    window = MainWindow()
    window.show()

    app.exec()

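This is essentially PyQt, so anyone with a C++ Qt background should find it familiar. For some reason, at first the window would only display and nothing in it worked.

It turned out that one file was missing.

Building a demo web GUI with Gradio

·Gradio is an open-source Python library for building machine learning demos and web applications
·It ships with a rich set of components and already handles the front-end/back-end interaction, so no extra code is needed for that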

pip install gradio

Write gradio_demo.py:

import torch
import gradio as gr

model = torch.hub.load("./", "custom", path="runs/train/exp3/weights/best.pt", source="local")

title = "YOLOv5 demo project built with Gradio"

desc = "This is a YOLOv5 demo project built with Gradio: very simple and very convenient!"

base_conf, base_iou = 0.25, 0.45

def det_image(img, conf_thres, iou_thres):
    model.conf = conf_thres
    model.iou = iou_thres
    return model(img).render()[0]

gr.Interface(inputs=["image", gr.Slider(minimum=0, maximum=1, value=base_conf), gr.Slider(minimum=0, maximum=1, value=base_iou)],
             outputs=["image"],
             fn=det_image,
             title=title,
             description=desc,
             live=True,
             examples=[["./datasets/images/train/30.jpg", base_conf, base_iou], ["./datasets/images/train/60.jpg", 0.3, base_iou]]).launch(share=True)