Implementing Object Detection with MXNet: YoloV3 (with partial source code and models)


Table of Contents

  • Preface
  • History and Significance of Object Detection
  • I. Dataset Preparation
    • 1. Installing the annotation tool
    • 2. Preparing the data folders
    • 3. Annotating the data
    • 4. Understanding the XML label file
  • II. Network Architecture
  • III. Code Implementation
    • 0. Project layout
    • 1. Imports
    • 2. Configuring GPU/CPU
    • 3. Data loader
    • 4. Building the model
      • YoloV3-tiny
      • YoloV3
    • 5. Training the model
      • 1. Learning-rate schedule
      • 2. Optimizer
      • 3. Loss metrics
      • 4. Training loop
    • 6. Prediction
  • IV. Main Entry Points
    • YoloV3-tiny
    • YoloV3
  • V. Training Results


Preface

  This article explains how to implement object detection with the MXNet deep-learning framework; the models implemented are YoloV3 and YoloV3-tiny.

Environment:
      python 3.8
      mxnet 1.7.0
      cuda 10.1
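
A hedged setup sketch: with CUDA 10.1 the matching MXNet wheel is normally the cu101 build (verify the exact wheel name and version available for your platform before installing):

pip install mxnet-cu101==1.7.0
# CPU-only alternative (wheel naming varies by platform): pip install mxnet==1.7.0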


History and Significance of Object Detection

  Image classification tells us roughly what kinds of objects an image contains, but not where each object is located or any further details about it. In application scenarios such as license-plate recognition, traffic-violation detection, face recognition, and motion capture, plain image classification can no longer fully meet our needs.

  This is where another major task in computer vision comes in: object detection and recognition. A classic example from traditional machine learning is using HOG (Histogram of Oriented Gradients) features to build a "filter" for each object class. A HOG filter captures an object's edge and contour information; sliding this filter over different positions of different images and checking where the response magnitude exceeds a threshold tells us where the filter matches an object well, which completes the detection.
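
To make the sliding-filter idea tangible, here is a minimal sketch using OpenCV's built-in HOG pedestrian detector (not part of this project; the image path is hypothetical):

import cv2

# OpenCV ships a HOG descriptor with a pretrained linear SVM for pedestrians
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

img = cv2.imread('street.jpg')  # hypothetical test image
# slide the HOG "filter" over the image at multiple scales;
# boxes are returned where the SVM response exceeds its threshold
boxes, weights = hog.detectMultiScale(img, winStride=(8, 8), scale=1.05)
for (x, y, w, h) in boxes:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)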


I. Dataset Preparation

  I used the pill images from the HALCON example dataset: the first 100 images were taken for annotation and the remaining 300 for testing; of the 100 annotated images, 90 went into the training set and 10 into the validation set.

1. Installing the annotation tool

pip install labelimg

Open a cmd window and run labelimg; the annotation tool shown below will appear:
[figure: labelimg main window]

2. Preparing the data folders

First create three folders, as shown:
[figure: folder layout]
DataImage: the 100 images to be annotated
DataLabel: an empty folder that will hold the annotation files generated by labelimg
test: the remaining 300 images, which need no annotation
The contents of DataImage and test look like this (DataImage shown as the example):
[figure: DataImage contents]

3. Annotating the data

  First set the image directory and the label output directory in labelimg, as shown:
[figure: labelimg path settings]
  Remember three shortcuts: w starts a new box, a goes to the previous image, d goes to the next image. These three keys are all you need in this tool.
  To annotate, press w to enter box-drawing mode, draw a box on the image, and enter a label (the class the box belongs to); that completes one object. An image may contain several boxes and several classes, but be consistent, never ambiguous: if a kind of object is annotated in one image, the same kind of object must be annotated wherever it appears in other images, and each object must keep a single label; do not mark an object as class A in one image and as class B in the next. The final result looks like this:
[figure: annotated image in labelimg]
When annotation is finished you will see the label files in DataLabel, in XML (PASCAL VOC) format:
[figure: DataLabel contents]

4. Understanding the XML label file

[figure: example XML label file]
The XML label file looks like the figure above; we only use the object elements, which we parse as shown below.
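
A minimal, self-contained sketch of that parsing step (the file name pill_001.xml is hypothetical; the tags follow the PASCAL VOC layout that labelimg writes):

import xml.etree.ElementTree as ET

root = ET.parse('pill_001.xml').getroot()  # hypothetical label file
for obj in root.iter('object'):
    name = obj.find('name').text             # class label typed in labelimg
    box = obj.find('bndbox')                 # pixel coordinates of the box
    xmin, ymin = int(box.find('xmin').text), int(box.find('ymin').text)
    xmax, ymax = int(box.find('xmax').text), int(box.find('ymax').text)
    print(name, (xmin, ymin, xmax, ymax))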


II. Network Architecture

Paper: https://arxiv.org/pdf/1804.02767.pdf
Network structure:
[figure: YoloV3 network structure]
  Our YoloV3 detector uses DarkNet53 as the backbone, so we build the detector around that structure. As the diagram shows, DarkNet53 downsamples the input image five times; YoloV3 takes the feature maps after the 3rd, 4th, and 5th downsampling steps and, working from the deepest layer upward, applies blocks of convolution, batch normalization, and activation. Each deeper branch is also upsampled and fused (concatenated) with the next shallower feature map, and each branch produces one detection output.
  Anchor boxes (priors) are then applied to these outputs for detection. For example, the 13×13×75 output is really 13×13×(20+4+1)×3: each of the 13×13 grid cells carries 3 anchors, and each anchor predicts 4 box coordinates, 1 objectness score, and 20 class probabilities (VOC's 20 classes).
  Finally, non-maximum suppression (NMS) removes overlapping boxes and yields the final objects.
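
The per-scale channel count generalizes to num_anchors × (num_classes + 5); a small sanity check in Python:

# channels per YOLO output = anchors_per_scale * (classes + 4 box offsets + 1 objectness)
def yolo_channels(num_classes, anchors_per_scale=3):
    return anchors_per_scale * (num_classes + 4 + 1)

print(yolo_channels(20))  # 75 -> the 13x13x75 head for VOC's 20 classes
print(yolo_channels(1))   # 18 -> e.g. a single-class pill detector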


III. Code Implementation

0. Project layout

[figure: project directory tree]
core: loss computation and other core routines
data: data-loading functions and classes
nets: the backbone networks and the standard YoloV3 structure
utils: data-preprocessing helpers
Ctu_YoloV3.py / Ctu_YoloV3-tiny.py: the YoloV3 training and inference classes; the main entry points of the project


1. Imports

import os, time, warnings, json, sys, cv2, colorsys, copy
sys.path.append('.')
from PIL import Image,ImageFont,ImageDraw
import numpy as np
import mxnet as mx
from mxnet import nd
from mxnet import gluon
from mxnet import autograd
from data.data_loader import VOCDetection, VOC07MApMetric, RandomTransformDataLoader
from data.data_mixup import MixupDetection
from nets.yolo import yolo3_tiny_darknet
from data.batchify_fn import Tuple, Stack, Pad
from data.data_transform import YOLO3DefaultTrainTransform,YOLO3DefaultValTransform
from core.lr_scheduler import LRScheduler, LRSequential  # LRSequential is used in the training section below

2. Configuring GPU/CPU

self.ctx = [mx.gpu(int(i)) for i in USEGPU.split(',') if i.strip()]  # e.g. USEGPU='0,1' -> [gpu(0), gpu(1)]
self.ctx = self.ctx if self.ctx else [mx.cpu()]  # fall back to CPU when no GPU id is given
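
As a standalone sanity check of this parsing (make_ctx is a hypothetical wrapper around the two lines above):

import mxnet as mx

def make_ctx(USEGPU):
    ctx = [mx.gpu(int(i)) for i in USEGPU.split(',') if i.strip()]
    return ctx if ctx else [mx.cpu()]

print(make_ctx('0,1'))  # [gpu(0), gpu(1)]
print(make_ctx(''))     # [cpu(0)] -- an empty string means CPU-only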

3. Data loader

The dataset class below feeds the training pipeline; the actual training iterators are built on top of it later.

import os
import numpy as np
import mxnet as mx
import xml.etree.ElementTree as ET
from mxnet.gluon.data import dataset

class VOCDetection(dataset.Dataset):
    def CreateDataList(self,IMGDir,XMLDir):
        ImgList = os.listdir(IMGDir)
        XmlList = os.listdir(XMLDir)
        classes = []
        dataList=[]
        for each_jpg in ImgList:
            each_xml = os.path.splitext(each_jpg)[0] + '.xml'  # robust to dots in file names
            if each_xml in XmlList:
                dataList.append([os.path.join(IMGDir,each_jpg),os.path.join(XMLDir,each_xml)])
                with open(os.path.join(XMLDir,each_xml), "r", encoding="utf-8") as in_file:
                    tree = ET.parse(in_file)
                    root = tree.getroot()
                    for obj in root.iter('object'):
                        cls = obj.find('name').text.strip().lower()  # normalize to match _load_label below
                        if cls not in classes:
                            classes.append(cls)
        return dataList,classes

    def __init__(self, ImageDir, XMLDir,transform=None):
        self.datalist,self.classes_names = self.CreateDataList(ImageDir,XMLDir)
        self._transform = transform
        self.index_map = dict(zip(self.classes_names, range(len(self.classes_names))))
        # self._label_cache = self._preload_labels()

    @property
    def classes(self):
        return self.classes_names

    def __len__(self):
        return len(self.datalist)

    def __getitem__(self, idx):
        img_path = self.datalist[idx][0]
        # label = self._label_cache[idx] if self._label_cache else self._load_label(idx)
        label = self._load_label(idx)
        img = mx.image.imread(img_path, 1)
        if self._transform is not None:
            return self._transform(img, label)
        return img, label.copy()

    def _preload_labels(self):
        return [self._load_label(idx) for idx in range(len(self))]

    def _load_label(self, idx):
        anno_path = self.datalist[idx][1]
        root = ET.parse(anno_path).getroot()
        size = root.find('size')
        width = float(size.find('width').text)
        height = float(size.find('height').text)
        label = []
        for obj in root.iter('object'):
            try:
                difficult = int(obj.find('difficult').text)
            except (ValueError, AttributeError):  # tag missing or non-numeric
                difficult = 0
            cls_name = obj.find('name').text.strip().lower()
            if cls_name not in self.classes:
                continue
            cls_id = self.index_map[cls_name]
            xml_box = obj.find('bndbox')
            xmin = (float(xml_box.find('xmin').text) - 1)
            ymin = (float(xml_box.find('ymin').text) - 1)
            xmax = (float(xml_box.find('xmax').text) - 1)
            ymax = (float(xml_box.find('ymax').text) - 1)
            try:
                self._validate_label(xmin, ymin, xmax, ymax, width, height)
                label.append([xmin, ymin, xmax, ymax, cls_id, difficult])
            except AssertionError:
                pass  # skip boxes with invalid coordinates
        return np.array(label)

    def _validate_label(self, xmin, ymin, xmax, ymax, width, height):
        assert 0 <= xmin < width, "xmin must in [0, {}), given {}".format(width, xmin)
        assert 0 <= ymin < height, "ymin must in [0, {}), given {}".format(height, ymin)
        assert xmin < xmax <= width, "xmax must in (xmin, {}], given {}".format(width, xmax)
        assert ymin < ymax <= height, "ymax must in (ymin, {}], given {}".format(height, ymax)
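
A short usage sketch (the directory paths are hypothetical):

train_set = VOCDetection(ImageDir='DataSet/DataImage', XMLDir='DataSet/DataLabel')
print(len(train_set), train_set.classes)   # e.g. 90 ['pill']
img, label = train_set[0]
# img: HxWx3 mx.nd.NDArray; label: (num_objects, 6) = xmin, ymin, xmax, ymax, cls_id, difficult
print(img.shape, label.shape)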


4. Building the model

This project implements both YoloV3 and YoloV3-tiny.

YoloV3-tiny


class TinyYOLOV3(gluon.HybridBlock):
    def __init__(self, stages, channels, anchors, strides, classes, alloc_size=(128, 128), nms_thresh=0.45, nms_topk=400, post_nms=100, ignore_iou_thresh=0.7, norm_layer=BatchNorm, norm_kwargs=None, **kwargs):
        super(TinyYOLOV3, self).__init__(**kwargs)
        self._classes = classes
        self.nms_thresh = nms_thresh
        self.nms_topk = nms_topk
        self.post_nms = post_nms
        self._ignore_iou_thresh = ignore_iou_thresh
        self._target_generator = YOLOV3TargetMerger(len(classes), ignore_iou_thresh)

        self._loss = YOLOV3Loss()
        with self.name_scope():
            self.stages = nn.HybridSequential()
            self.transitions = nn.HybridSequential()
            self.yolo_blocks = nn.HybridSequential()
            self.yolo_outputs = nn.HybridSequential()
            for i, stage, channel, anchor, stride in zip(
                    range(len(stages)), stages, channels, anchors[::-1], strides[::-1]):
                self.stages.add(stage)
                block = YOLODetectionBlockV3(channel, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
                self.yolo_blocks.add(block)
                output = YOLOOutputV3(i, len(classes), anchor, stride, alloc_size=alloc_size)
                self.yolo_outputs.add(output)
                if i > 0:
                    self.transitions.add(_conv2d(channel, 1, 0, 1, norm_layer=norm_layer, norm_kwargs=norm_kwargs))

    @property
    def classes(self):
        return self._classes

    def hybrid_forward(self, F, x, *args):
        all_box_centers = []
        all_box_scales = []
        all_objectness = []
        all_class_pred = []
        all_anchors = []
        all_offsets = []
        all_feat_maps = []
        all_detections = []
        routes = []
        for stage, block, output in zip(self.stages, self.yolo_blocks, self.yolo_outputs):
            x = stage(x)
            routes.append(x)
            
        for i, block, output in zip(range(len(routes)), self.yolo_blocks, self.yolo_outputs):
            x, tip = block(x)
            if autograd.is_training():
                dets, box_centers, box_scales, objness, class_pred, anchors, offsets = output(tip)
                all_box_centers.append(box_centers.reshape((0, -3, -1)))
                all_box_scales.append(box_scales.reshape((0, -3, -1)))
                all_objectness.append(objness.reshape((0, -3, -1)))
                all_class_pred.append(class_pred.reshape((0, -3, -1)))
                all_anchors.append(anchors)
                all_offsets.append(offsets)
                fake_featmap = F.zeros_like(tip.slice_axis(
                    axis=0, begin=0, end=1).slice_axis(axis=1, begin=0, end=1))
                all_feat_maps.append(fake_featmap)
            else:
                dets = output(tip)
            all_detections.append(dets)
            if i >= len(routes) - 1:
                break
            # add transition layers
            x = self.transitions[i](x)
            upsample = _upsample(x, stride=2)
            route_now = routes[::-1][i + 1]
            x = F.concat(F.slice_like(upsample, route_now * 0, axes=(2, 3)), route_now, dim=1)
        
        if autograd.is_training():
            if autograd.is_recording():
                box_preds = F.concat(*all_detections, dim=1)
                all_preds = [F.concat(*p, dim=1) for p in [
                    all_objectness, all_box_centers, all_box_scales, all_class_pred]]
                all_targets = self._target_generator(box_preds, *args)
                return self._loss(*(all_preds + all_targets))

            return (F.concat(*all_detections, dim=1), all_anchors, all_offsets, all_feat_maps,
                    F.concat(*all_box_centers, dim=1), F.concat(*all_box_scales, dim=1),
                    F.concat(*all_objectness, dim=1), F.concat(*all_class_pred, dim=1))

        result = F.concat(*all_detections, dim=1)
        if 0 < self.nms_thresh < 1:
            result = F.contrib.box_nms(result, overlap_thresh=self.nms_thresh, valid_thresh=0.01, topk=self.nms_topk, id_index=0, score_index=1, coord_start=2, force_suppress=False)
            if self.post_nms > 0:
                result = result.slice_axis(axis=1, begin=0, end=self.post_nms)
        ids = result.slice_axis(axis=-1, begin=0, end=1)
        scores = result.slice_axis(axis=-1, begin=1, end=2)
        bboxes = result.slice_axis(axis=-1, begin=2, end=6)

        return ids, scores, bboxes

    def set_nms(self, nms_thresh=0.45, nms_topk=400, post_nms=100):
        self._clear_cached_op()
        self.nms_thresh = nms_thresh
        self.nms_topk = nms_topk
        self.post_nms = post_nms
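
The forward pass above calls two helpers, _conv2d and _upsample, that are not shown in this excerpt. For reference, gluoncv's yolo3.py implements them roughly as follows (a sketch, assuming the same Conv-BN-LeakyReLU cell):

def _conv2d(channel, kernel, padding, stride, norm_layer=BatchNorm, norm_kwargs=None):
    # a Conv -> BatchNorm -> LeakyReLU(0.1) cell
    cell = nn.HybridSequential(prefix='')
    cell.add(nn.Conv2D(channel, kernel_size=kernel, strides=stride, padding=padding, use_bias=False))
    cell.add(norm_layer(epsilon=1e-5, momentum=0.9, **({} if norm_kwargs is None else norm_kwargs)))
    cell.add(nn.LeakyReLU(0.1))
    return cell

def _upsample(x, stride=2):
    # nearest-neighbor upsampling by repeating pixels along H and W
    return x.repeat(axis=-1, repeats=stride).repeat(axis=-2, repeats=stride)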

YoloV3


class YOLOV3(gluon.HybridBlock):
    def __init__(self, stages, channels, anchors, strides, classes, alloc_size=(128, 128), nms_thresh=0.45, nms_topk=400, post_nms=100, pos_iou_thresh=1.0, ignore_iou_thresh=0.7, norm_layer=BatchNorm, norm_kwargs=None, **kwargs):
        super(YOLOV3, self).__init__(**kwargs)
        self._classes = classes
        self.nms_thresh = nms_thresh
        self.nms_topk = nms_topk
        self.post_nms = post_nms
        self._pos_iou_thresh = pos_iou_thresh
        self._ignore_iou_thresh = ignore_iou_thresh
        if pos_iou_thresh >= 1:
            self._target_generator = YOLOV3TargetMerger(len(classes), ignore_iou_thresh)
        else:
            raise NotImplementedError("pos_iou_thresh({}) < 1.0 is not implemented!".format(pos_iou_thresh))
        self._loss = YOLOV3Loss()
        with self.name_scope():
            self.stages = nn.HybridSequential()
            self.transitions = nn.HybridSequential()
            self.yolo_blocks = nn.HybridSequential()
            self.yolo_outputs = nn.HybridSequential()
            # note that anchors and strides should be used in reverse order
            for i, stage, channel, anchor, stride in zip(range(len(stages)), stages, channels, anchors[::-1], strides[::-1]):
                self.stages.add(stage)
                block = YOLODetectionBlockV3(channel, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
                self.yolo_blocks.add(block)
                output = YOLOOutputV3(i, len(classes), anchor, stride, alloc_size=alloc_size)
                self.yolo_outputs.add(output)
                if i > 0:
                    self.transitions.add(_conv2d(channel, 1, 0, 1, norm_layer=norm_layer, norm_kwargs=norm_kwargs))

    @property
    def num_class(self):
        # _num_class is never assigned in __init__; derive the count from the class list
        return len(self._classes)

    @property
    def classes(self):
        return self._classes

    def hybrid_forward(self, F, x, *args):
        if len(args) != 0 and not autograd.is_training():
            raise TypeError('YOLOV3 inference only needs one input.')

        all_box_centers = []
        all_box_scales = []
        all_objectness = []
        all_class_pred = []
        all_anchors = []
        all_offsets = []
        all_feat_maps = []
        all_detections = []
        routes = []
        for stage, block, output in zip(self.stages, self.yolo_blocks, self.yolo_outputs):
            x = stage(x)
            routes.append(x)

        # the YOLO output layers are used in reverse order, i.e., from very deep layers to shallow
        for i, block, output in zip(range(len(routes)), self.yolo_blocks, self.yolo_outputs):
            x, tip = block(x)
            if autograd.is_training():
                dets, box_centers, box_scales, objness, class_pred, anchors, offsets = output(tip)
                all_box_centers.append(box_centers.reshape((0, -3, -1)))
                all_box_scales.append(box_scales.reshape((0, -3, -1)))
                all_objectness.append(objness.reshape((0, -3, -1)))
                all_class_pred.append(class_pred.reshape((0, -3, -1)))
                all_anchors.append(anchors)
                all_offsets.append(offsets)
                # here we use a fake featmap to reduce memory consumption; only shape[2, 3] is used
                fake_featmap = F.zeros_like(tip.slice_axis(
                    axis=0, begin=0, end=1).slice_axis(axis=1, begin=0, end=1))
                all_feat_maps.append(fake_featmap)
            else:
                dets = output(tip)
            all_detections.append(dets)
            if i >= len(routes) - 1:
                break
            # add transition layers
            x = self.transitions[i](x)
            # upsample feature map reverse to shallow layers
            upsample = _upsample(x, stride=2)
            route_now = routes[::-1][i + 1]
            x = F.concat(F.slice_like(upsample, route_now * 0, axes=(2, 3)), route_now, dim=1)

        if autograd.is_training():
            # during training, the network behaves differently since we don't need detection results
            if autograd.is_recording():
                # generate losses and return them directly
                box_preds = F.concat(*all_detections, dim=1)
                all_preds = [F.concat(*p, dim=1) for p in [
                    all_objectness, all_box_centers, all_box_scales, all_class_pred]]
                all_targets = self._target_generator(box_preds, *args)
                return self._loss(*(all_preds + all_targets))

            # return raw predictions, this is only used in DataLoader transform function.
            return (F.concat(*all_detections, dim=1), all_anchors, all_offsets, all_feat_maps,
                    F.concat(*all_box_centers, dim=1), F.concat(*all_box_scales, dim=1),
                    F.concat(*all_objectness, dim=1), F.concat(*all_class_pred, dim=1))

        # concat all detection results from different stages
        result = F.concat(*all_detections, dim=1)
        # apply nms per class
        if 0 < self.nms_thresh < 1:
            result = F.contrib.box_nms(
                result, overlap_thresh=self.nms_thresh, valid_thresh=0.01,
                topk=self.nms_topk, id_index=0, score_index=1, coord_start=2, force_suppress=False)
            if self.post_nms > 0:
                result = result.slice_axis(axis=1, begin=0, end=self.post_nms)
        ids = result.slice_axis(axis=-1, begin=0, end=1)
        scores = result.slice_axis(axis=-1, begin=1, end=2)
        bboxes = result.slice_axis(axis=-1, begin=2, end=None)
        return ids, scores, bboxes

    def set_nms(self, nms_thresh=0.45, nms_topk=400, post_nms=100):
        """Set non-maximum suppression parameters.
        Parameters
        ----------
        nms_thresh : float, default is 0.45.
            Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
        nms_topk : int, default is 400
            Apply NMS to top k detection results, use -1 to disable so that every Detection
             result is used in NMS.
        post_nms : int, default is 100
            Only return top `post_nms` detection results, the rest is discarded. The number is
            based on COCO dataset which has maximum 100 objects per image. You can adjust this
            number if expecting more objects. You can use -1 to return all detections.
        Returns
        -------
        None
        """
        self._clear_cached_op()
        self.nms_thresh = nms_thresh
        self.nms_topk = nms_topk
        self.post_nms = post_nms

    def reset_class(self, classes, reuse_weights=None):
        """Reset class categories and class predictors.
        Parameters
        ----------
        classes : iterable of str
            The new categories. ['apple', 'orange'] for example.
        reuse_weights : dict
            A {new_integer : old_integer} or mapping dict or {new_name : old_name} mapping dict,
            or a list of [name0, name1,...] if class names don't change.
            This allows the new predictor to reuse the
            previously trained weights specified.

        Example
        -------
        >>> net = gluoncv.model_zoo.get_model('yolo3_darknet53_voc', pretrained=True)
        >>> # use direct name to name mapping to reuse weights
        >>> net.reset_class(classes=['person'], reuse_weights={'person':'person'})
        >>> # or use integer mapping, person is the 14th category in VOC
        >>> net.reset_class(classes=['person'], reuse_weights={0:14})
        >>> # you can even mix them
        >>> net.reset_class(classes=['person'], reuse_weights={'person':14})
        >>> # or use a list of strings if the class names don't change
        >>> net.reset_class(classes=['person'], reuse_weights=['person'])

        """
        self._clear_cached_op()
        old_classes = self._classes
        self._classes = classes
        if self._pos_iou_thresh >= 1:
            self._target_generator = YOLOV3TargetMerger(len(classes), self._ignore_iou_thresh)
        if isinstance(reuse_weights, (dict, list)):
            if isinstance(reuse_weights, dict):
                # trying to replace str with indices
                new_keys = []
                new_vals = []
                for k, v in reuse_weights.items():
                    if isinstance(v, str):
                        try:
                            new_vals.append(old_classes.index(v))  # raise ValueError if not found
                        except ValueError:
                            raise ValueError(
                                "{} not found in old class names {}".format(v, old_classes))
                    else:
                        if v < 0 or v >= len(old_classes):
                            raise ValueError(
                                "Index {} out of bounds for old class names".format(v))
                        new_vals.append(v)
                    if isinstance(k, str):
                        try:
                            new_keys.append(self.classes.index(k))  # raise ValueError if not found
                        except ValueError:
                            raise ValueError(
                                "{} not found in new class names {}".format(k, self.classes))
                    else:
                        if k < 0 or k >= len(self.classes):
                            raise ValueError(
                                "Index {} out of bounds for new class names".format(k))
                        new_keys.append(k)
                reuse_weights = dict(zip(new_keys, new_vals))
            else:
                new_map = {}
                for x in reuse_weights:
                    try:
                        new_idx = self._classes.index(x)
                        old_idx = old_classes.index(x)
                        new_map[new_idx] = old_idx
                    except ValueError:
                        warnings.warn("{} not found in old: {} or new class names: {}".format(
                            x, old_classes, self._classes))
                reuse_weights = new_map

        for outputs in self.yolo_outputs:
            outputs.reset_class(classes, reuse_weights=reuse_weights)

5. Training the model

1. Learning-rate schedule

lr_steps = sorted(int(ls) for ls in lr_decay_epoch.split(',') if ls.strip())
lr_decay_epoch = lr_steps  # epochs at which the LR is multiplied by lr_decay

lr_scheduler = LRSequential([
    # linear warmup from 0 to the base LR (here over 0 epochs, i.e. disabled)
    LRScheduler('linear', base_lr=0, target_lr=learning_rate,
                nepochs=0, iters_per_epoch=self.num_samples // self.batch_size),
    # main schedule: step/cosine/poly decay according to lr_mode
    LRScheduler(lr_mode, base_lr=learning_rate,
                nepochs=TrainNum,
                iters_per_epoch=self.num_samples // self.batch_size,
                step_epoch=lr_decay_epoch,
                step_factor=lr_decay, power=2),
])
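
To make the schedule concrete, here is a plain-Python sketch of what 'step' mode produces with the parameters used in the main entry (learning_rate=0.001, lr_decay=0.9, decay epochs 50/100/150/200); it re-implements only the step rule, not the project's LRScheduler:

learning_rate, lr_decay = 0.001, 0.9
lr_decay_epoch = [50, 100, 150, 200]

def lr_at(epoch):
    lr = learning_rate
    for e in lr_decay_epoch:
        if epoch >= e:
            lr *= lr_decay  # multiply by the decay factor at each step epoch
    return lr

for epoch in (0, 50, 100, 150):
    print(epoch, round(lr_at(epoch), 6))  # 0.001, 0.0009, 0.00081, 0.000729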

2. Optimizer

if optim == 1:    # SGD with momentum and weight decay
    trainer = gluon.Trainer(self.model.collect_params(), 'sgd', {'learning_rate': learning_rate, 'wd': 0.0005, 'momentum': 0.9, 'lr_scheduler': lr_scheduler})
elif optim == 2:  # AdaGrad
    trainer = gluon.Trainer(self.model.collect_params(), 'adagrad', {'learning_rate': learning_rate, 'lr_scheduler': lr_scheduler})
else:             # any other value (including optim=0, used in the main entry) selects Adam
    trainer = gluon.Trainer(self.model.collect_params(), 'adam', {'learning_rate': learning_rate, 'lr_scheduler': lr_scheduler})

3. Loss metrics

The four metrics below keep running averages of YoloV3's four loss terms: objectness, box-center, box-scale, and classification loss.

obj_metrics = mx.metric.Loss('ObjLoss')
center_metrics = mx.metric.Loss('BoxCenterLoss')
scale_metrics = mx.metric.Loss('BoxScaleLoss')
cls_metrics = mx.metric.Loss('ClassLoss')

4. Training loop

btic = time.time()  # batch timer; assumed to be initialized just before the loop in the full source
for i, batch in enumerate(self.train_loader):
    data = gluon.utils.split_and_load(batch[0], ctx_list=self.ctx, batch_axis=0)
    # objectness, center_targets, scale_targets, weights, class_targets
    fixed_targets = [gluon.utils.split_and_load(batch[it], ctx_list=self.ctx, batch_axis=0) for it in range(1, 6)]
    gt_boxes = gluon.utils.split_and_load(batch[6], ctx_list=self.ctx, batch_axis=0)
    sum_losses = []
    obj_losses = []
    center_losses = []
    scale_losses = []
    cls_losses = []
    with autograd.record():
        for ix, x in enumerate(data):
            obj_loss, center_loss, scale_loss, cls_loss = self.model(x, gt_boxes[ix], *[ft[ix] for ft in fixed_targets])
            sum_losses.append(obj_loss + center_loss + scale_loss + cls_loss)
            obj_losses.append(obj_loss)
            center_losses.append(center_loss)
            scale_losses.append(scale_loss)
            cls_losses.append(cls_loss)
        if self.ampFlag:
            with amp.scale_loss(sum_losses, trainer) as scaled_loss:
                autograd.backward(scaled_loss)
        else:
            autograd.backward(sum_losses)
    trainer.step(self.batch_size)

    obj_metrics.update(0, obj_losses)
    center_metrics.update(0, center_losses)
    scale_metrics.update(0, scale_losses)
    cls_metrics.update(0, cls_losses)

    name1, loss1 = obj_metrics.get()
    name2, loss2 = center_metrics.get()
    name3, loss3 = scale_metrics.get()
    name4, loss4 = cls_metrics.get()
    print('[Epoch {}][Batch {}], LR: {:.2E}, Speed: {:.3f} samples/sec, {}={:.3f}, {}={:.3f}, {}={:.3f}, {}={:.3f}'.format(
        epoch, i, trainer.learning_rate, self.batch_size/(time.time()-btic), name1, loss1, name2, loss2, name3, loss3, name4, loss4))
    btic = time.time()
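
Validation is not shown above. Below is a hedged sketch of a VOC07 mAP evaluation pass; it assumes the project's VOC07MApMetric mirrors gluoncv's (same update() signature) and that a self.val_loader yields (data, label) batches like the train loader:

metric = VOC07MApMetric(iou_thresh=0.5, class_names=self.classes_names)
for batch in self.val_loader:
    data = gluon.utils.split_and_load(batch[0], ctx_list=self.ctx, batch_axis=0)
    label = gluon.utils.split_and_load(batch[1], ctx_list=self.ctx, batch_axis=0)
    for x, y in zip(data, label):
        ids, scores, bboxes = self.model(x)
        gt_bboxes = y.slice_axis(axis=-1, begin=0, end=4)
        gt_ids = y.slice_axis(axis=-1, begin=4, end=5)
        gt_difficults = y.slice_axis(axis=-1, begin=5, end=6) if y.shape[-1] > 5 else None
        metric.update(bboxes, ids, scores, gt_bboxes, gt_ids, gt_difficults)
names, values = metric.get()
print(dict(zip(names, values)))  # per-class AP plus overall mAP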

6. Prediction

def predict(self,image,confidence=0.5, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    start_time = time.time()
    origin_img = copy.deepcopy(image)
    base_imageSize = origin_img.shape
    image = cv2.cvtColor(image,cv2.COLOR_RGB2BGR)  # swap channel order before resizing and normalization
    image = cv2.resize(image,(self.image_size,self.image_size))
    img = nd.array(image)
    img = mx.nd.image.to_tensor(img)
    img = mx.nd.image.normalize(img, mean=mean, std=std)

    x = img.expand_dims(0)
    x = x.as_in_context(self.ctx[0])
    labels, scores, bboxes = [xx[0].asnumpy() for xx in self.model(x)]

    origin_img_pillow = self.cv2_pillow(origin_img)
    font = ImageFont.truetype(font='./model_data/simhei.ttf', size=np.floor(3e-2 * np.shape(origin_img_pillow)[1] + 0.5).astype('int32'))
    thickness = max((np.shape(origin_img_pillow)[0] + np.shape(origin_img_pillow)[1]) // self.image_size, 1)

    imgbox = []
    for i, bbox in enumerate(bboxes):
        # skip low-confidence detections and padded entries (label < 0)
        if (scores is not None and scores.flat[i] < confidence) or (labels is not None and labels.flat[i] < 0):
            continue
        cls_id = int(labels.flat[i]) if labels is not None else -1

        xmin, ymin, xmax, ymax = [int(x) for x in bbox]
        xmin = int(xmin / self.image_size * base_imageSize[1])
        xmax = int(xmax / self.image_size * base_imageSize[1])
        ymin = int(ymin / self.image_size * base_imageSize[0])
        ymax = int(ymax / self.image_size * base_imageSize[0])

        # print(xmin, ymin, xmax, ymax, self.classes_names[cls_id])
        class_name = self.classes_names[cls_id]
        score = '{:d}%'.format(int(scores.flat[i] * 100)) if scores is not None else ''
        imgbox.append([(xmin, ymin, xmax, ymax), cls_id, self.classes_names[cls_id], score])
        top, left, bottom, right = ymin, xmin, ymax, xmax

        label = '{}-{}'.format(class_name, score)
        draw = ImageDraw.Draw(origin_img_pillow)
        label_size = draw.textsize(label, font)
        label = label.encode('utf-8')

        if top - label_size[1] >= 0:
            text_origin = np.array([left, top - label_size[1]])
        else:
            text_origin = np.array([left, top + 1])

        for t in range(thickness):  # draw nested rectangles for a thicker outline; avoid shadowing the detection index i
            draw.rectangle([left + t, top + t, right - t, bottom - t], outline=self.colors[cls_id])
        draw.rectangle([tuple(text_origin), tuple(text_origin + label_size)], fill=self.colors[cls_id])
        draw.text(text_origin, str(label,'UTF-8'), fill=(0, 0, 0), font=font)
        del draw

    result_value = {
        "image_result": self.pillow_cv2(origin_img_pillow),
        "bbox": imgbox,
        "time": (time.time() - start_time) * 1000
    }

    return result_value

IV. Main Entry Points

YoloV3-tiny


if __name__ == '__main__':
    # ctu = Ctu_YoloV3_Tiny(USEGPU='0', image_size=416)
    # ctu.InitModel(DataDir=r'D:/Ctu/Ctu_Project_DL/DataSet/DataSet_Detection_YaoPian',batch_size=4,num_workers=0, Pre_Model = None, mixup=False,label_smooth=False)
    # ctu.train(TrainNum=200, learning_rate=0.001, lr_decay_epoch='50,100,150,200', lr_decay=0.9, ModelPath='./Model',optim=0)

    ctu = Ctu_YoloV3_Tiny(USEGPU='0')
    ctu.LoadModel(r'./Model_yoloV3_tiny')
    cv2.namedWindow("result", 0)
    cv2.resizeWindow("result", 640, 480)
    index = 0
    for root, dirs, files in os.walk(r'D:/Ctu/Ctu_Project_DL/DataSet/DataSet_Detection_YaoPian/test'):
        for f in files:
            img_cv = ctu.read_image(os.path.join(root, f))
            if img_cv is None:
                continue
            res = ctu.predict(img_cv, 0.7)
            for each in res['bbox']:
                print(each)
            print("Elapsed: " + str(res['time']) + ' ms')
            # cv2.imwrite(str(index + 1)+'.bmp',res['image_result'])
            cv2.imshow("result", res['image_result'])
            cv2.waitKey()
            # index +=1

YoloV3

if __name__ == '__main__':
    ctu = Ctu_YoloV3(USEGPU='0', image_size=416, ampFlag = False)
    ctu.InitModel(DataDir=r'D:/Ctu/Ctu_Project_DL/DataSet/DataSet_Detection_YaoPian',batch_size=1,num_workers=0,backbone='mobilenetv1_1.0',Pre_Model = './Model_yoloV3_mobilenetv1_1.0/best_model.dat',mixup=False,label_smooth=False)
    ctu.train(TrainNum=200, learning_rate=0.001, lr_decay_epoch='50,100,150,200', lr_decay=0.9, ModelPath='./Model',optim=0)

    # ctu = Ctu_YoloV3(USEGPU='0')
    # ctu.LoadModel(r'./Model_yoloV3_mobilenetv1_1.0')
    # cv2.namedWindow("result", 0)
    # cv2.resizeWindow("result", 640, 480)
    # index = 0
    # for root, dirs, files in os.walk(r'D:/Ctu/Ctu_Project_DL/DataSet/DataSet_Detection_YaoPian/DataImage'):
    #     for f in files:
    #         img_cv = ctu.read_image(os.path.join(root, f))
    #         if img_cv is None:
    #             continue
    #         res = ctu.predict(img_cv, 0.7)
    #         for each in res['bbox']:
    #             print(each)
    #         print("Elapsed: " + str(res['time']) + ' ms')
    #         # cv2.imwrite(str(index + 1)+'.bmp',res['image_result'])
    #         cv2.imshow("result", res['image_result'])
    #         cv2.waitKey()
    #         # index +=1

V. Training Results

Note: I did not keep the trained models for this project, so training results will be added later.
