Object Detection with MXNet: Faster RCNN (with partial source code and model)


Table of Contents

  • Preface
  • History and Significance of Object Detection
  • I. Dataset Preparation
    • 1. Installing the annotation tool
    • 2. Preparing the dataset
    • 3. Annotating the data
    • 4. Understanding the xml label files
  • II. Network Architecture
  • III. Code Implementation
    • 0. Project directory structure
    • 1. Imports
    • 2. Configuring the GPU/CPU environment
    • 3. Data loader
    • 4. Model construction
    • 5. Model training
      • 1. Learning-rate schedule
      • 2. Optimizer
      • 3. Losses
      • 4. Training loop
    • 6. Model prediction
  • IV. Main Entry Point
  • V. Training Results


Preface

  This article describes an object detection implementation based on the mxnet deep learning framework; the model implemented is Faster RCNN.

Environment:
      python 3.8
      mxnet 1.7.0
      cuda 10.1
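
A matching install could look like this (an assumption: the CUDA 10.1 build of mxnet 1.7.0 is published on PyPI as mxnet-cu101):

pip install mxnet-cu101==1.7.0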


History and Significance of Object Detection

  Image classification tells us roughly what kinds of objects an image contains, but not where they are located or any further details about them. In application scenarios such as license-plate recognition, traffic-violation detection, face recognition, and motion capture, classification alone cannot fully meet our needs.

  This is where another major task in computer vision comes in: object detection and recognition. In traditional machine learning, a classic approach uses HOG (Histogram of Oriented Gradients) features to build a "filter" for each object class. A HOG filter captures an object's edge and contour information; sliding it over different positions of different images and thresholding the response magnitude tells us where the filter matches an object well, which completes the detection.


I. Dataset Preparation

  I used the pill images from the halcon example datasets: the first 100 images are annotated and the remaining 300 serve as the test set; of the 100 annotated images, 90 form the training set and 10 the validation set.
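
A minimal sketch of such a 90/10 split (an assumption for illustration; the repo performs its own split inside the training class):

import os, random

files = sorted(os.listdir('DataImage'))   # the 100 annotated images
random.seed(0)                            # make the split reproducible
random.shuffle(files)
train_files, val_files = files[:90], files[90:]
print(len(train_files), 'train /', len(val_files), 'val')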

1. Installing the annotation tool

pip install labelimg

Open cmd and run labelimg; the annotation tool will start up.

2. Preparing the dataset

First create three folders:

DataImage: the 100 images to be annotated
DataLabel: an empty folder for the label files that labelimg will generate
test: the remaining 300 images, which need no annotation

DataImage and test simply contain the image files directly.

3. Annotating the data

  First set the image directory and the label output directory in labelimg.
  Remember three shortcuts: w starts a new box, a goes to the previous image, d goes to the next image; these three keys are all this tool needs.
  To annotate, press w to enter box-drawing mode, draw a box on the image, and enter the label (the class the box belongs to); that completes one object. An image may contain multiple boxes and multiple classes, but be consistent: if a kind of object is annotated in one image, it must also be annotated whenever it appears in other images, and a given object type must always receive the same label (do not mark an object as class A in one image and class B in the next).
  When annotation is finished, the label files appear in DataLabel in Pascal VOC xml format.

4. Understanding the xml label files

An xml label file stores the image metadata plus one object element per annotated box; the object elements (class name and bounding box) are the only part we need to parse.
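
A minimal parsing sketch (assuming standard labelimg/Pascal VOC xml; the file name is hypothetical):

import xml.etree.ElementTree as ET

root = ET.parse('DataLabel/sample.xml').getroot()
for obj in root.iter('object'):
    name = obj.find('name').text                      # class label typed in labelimg
    box = obj.find('bndbox')
    xmin, ymin, xmax, ymax = (int(float(box.find(k).text))
                              for k in ('xmin', 'ymin', 'xmax', 'ymax'))
    print(name, xmin, ymin, xmax, ymax)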


II. Network Architecture

Paper: https://arxiv.org/pdf/1506.01497.pdf

Faster RCNN consists of four main parts (a schematic sketch follows the list):

  • Conv layers. As a CNN-based detector, Faster RCNN first uses a stack of conv+relu+pooling layers to extract feature maps from the image. These feature maps are shared by the subsequent RPN and the fully connected layers.
  • Region Proposal Network. The RPN generates region proposals: a softmax classifies each anchor as positive or negative, and bounding-box regression refines the anchors into accurate proposals.
  • RoI Pooling. This layer takes the shared feature maps together with the proposals and extracts a fixed-size feature map for each proposal, which is then fed to the fully connected layers for classification.
  • Classification. The proposal feature maps are used to predict each proposal's class, and a second bounding-box regression produces the final, precise detection boxes.
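
The overall data flow, in schematic form (a sketch only; these function names are placeholders, not the repo's API):

def faster_rcnn_forward(image):
    feat = backbone(image)                    # 1. Conv layers: shared feature maps
    proposals = rpn(feat)                     # 2. RPN: anchor classification + box regression
    roi_feat = roi_align(feat, proposals)     # 3. RoI pooling: fixed-size per-proposal features
    cls_scores, box_deltas = head(roi_feat)   # 4. Classification + second box regression
    return decode(proposals, cls_scores, box_deltas)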

III. Code Implementation

0. Project directory structure

core: loss computation and other core numeric routines
data: data-loading functions and classes
nets: the backbone networks and the standard Faster RCNN structure
utils: data pre-processing helpers
Ctu_FasterRCNN.py: the Faster RCNN training and inference classes; this is the main entry point of the project


1. Imports

import os, time, sys, json, colorsys, cv2, copy
sys.path.append('.')
from mxnet import nd
import numpy as np
import mxnet as mx
from PIL import Image, ImageDraw, ImageFont
from mxnet import gluon
from mxnet.contrib import amp
from data.data_loader import VOCDetection,VOC07MApMetric,MixupDetection
from nets.faster_rcnn.predefined_models import faster_rcnn_resnet18_v1b_voc,faster_rcnn_resnet50_v1b_voc, faster_rcnn_resnet101_v1d_voc
from data.batchify_fn import Tuple,Append,FasterRCNNTrainBatchify
from data.sampler import SplitSortedBucketSampler
from data.data_transform import FasterRCNNDefaultTrainTransform, FasterRCNNDefaultValTransform
from core.loss import RPNAccMetric, RPNL1LossMetric, RCNNAccMetric, RCNNL1LossMetric
from nets.faster_rcnn.data_parallel import ForwardBackwardTask
from utils.parallel import Parallel
from nets.nn.bbox import BBoxClipToImage

2. Configuring the GPU/CPU environment

self.ctx = [mx.gpu(int(i)) for i in USEGPU.split(',') if i.strip()]
self.ctx = self.ctx if self.ctx else [mx.cpu()]
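
For example, USEGPU='0,1' yields [mx.gpu(0), mx.gpu(1)], while an empty string falls back to [mx.cpu()].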

3. Data loader

This dataset class is the foundation; the training and validation iterators are built on top of it later.

import xml.etree.ElementTree as ET       # needed below; missing from the import block above
from mxnet.gluon.data import dataset

class VOCDetection(dataset.Dataset):
    def CreateDataList(self,IMGDir,XMLDir):
        ImgList = os.listdir(IMGDir)
        XmlList = os.listdir(XMLDir)
        classes = []
        dataList=[]
        for each_jpg in ImgList:
            each_xml = each_jpg.split('.')[0] + '.xml'
            if each_xml in XmlList:
                dataList.append([os.path.join(IMGDir,each_jpg),os.path.join(XMLDir,each_xml)])
                with open(os.path.join(XMLDir,each_xml), "r", encoding="utf-8") as in_file:
                    tree = ET.parse(in_file)
                    root = tree.getroot()
                    for obj in root.iter('object'):
                        cls = obj.find('name').text.strip().lower()  # normalize as _load_label does
                        if cls not in classes:
                            classes.append(cls)
        return dataList,classes

    def __init__(self, ImageDir, XMLDir,transform=None):
        self.datalist,self.classes_names = self.CreateDataList(ImageDir,XMLDir)
        self._transform = transform
        self.index_map = dict(zip(self.classes_names, range(len(self.classes_names))))
        # self._label_cache = self._preload_labels()

    @property
    def classes(self):
        return self.classes_names

    def __len__(self):
        return len(self.datalist)

    def __getitem__(self, idx):
        img_path = self.datalist[idx][0]
        # label = self._label_cache[idx] if self._label_cache else self._load_label(idx)
        label = self._load_label(idx)
        img = mx.image.imread(img_path, 1)
        if self._transform is not None:
            return self._transform(img, label)
        return img, label.copy()

    def _preload_labels(self):
        return [self._load_label(idx) for idx in range(len(self))]

    def _load_label(self, idx):
        anno_path = self.datalist[idx][1]
        root = ET.parse(anno_path).getroot()
        size = root.find('size')
        width = float(size.find('width').text)
        height = float(size.find('height').text)
        label = []
        for obj in root.iter('object'):
            try:
                difficult = int(obj.find('difficult').text)
            except (ValueError, AttributeError):
                difficult = 0  # tag missing or not an integer
            cls_name = obj.find('name').text.strip().lower()
            if cls_name not in self.classes:
                continue
            cls_id = self.index_map[cls_name]
            xml_box = obj.find('bndbox')
            xmin = (float(xml_box.find('xmin').text) - 1)
            ymin = (float(xml_box.find('ymin').text) - 1)
            xmax = (float(xml_box.find('xmax').text) - 1)
            ymax = (float(xml_box.find('ymax').text) - 1)
            try:
                self._validate_label(xmin, ymin, xmax, ymax, width, height)
                label.append([xmin, ymin, xmax, ymax, cls_id, difficult])
            except AssertionError:
                pass  # skip boxes that fall outside the image bounds
        return np.array(label)

    def _validate_label(self, xmin, ymin, xmax, ymax, width, height):
        assert 0 <= xmin < width, "xmin must in [0, {}), given {}".format(width, xmin)
        assert 0 <= ymin < height, "ymin must in [0, {}), given {}".format(height, ymin)
        assert xmin < xmax <= width, "xmax must in (xmin, {}], given {}".format(width, xmax)
        assert ymin < ymax <= height, "ymax must in (ymin, {}], given {}".format(height, ymax)
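
Usage sketch (assuming the DataImage/DataLabel folders from section I):

train_dataset = VOCDetection(ImageDir='DataImage', XMLDir='DataLabel')
print(len(train_dataset), 'samples, classes:', train_dataset.classes)
img, label = train_dataset[0]     # img: HxWx3 mx.nd.array; label: Nx6 numpy array
print(img.shape, label.shape)     # label columns: xmin, ymin, xmax, ymax, cls_id, difficult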


4. Model construction

This project uses ResNet as the backbone. The factory below builds the ResNet50 variant; the ResNet18 and ResNet101 variants from the imports are built the same way.

def faster_rcnn_resnet50_v1b_voc(classes,small_size=600,max_size = 1000, **kwargs):
    base_network = resnet50_v1b(classes, dilated=False,  use_global_stats=True, **kwargs)
    features = nn.HybridSequential()
    top_features = nn.HybridSequential()
    for layer in ['conv1', 'bn1', 'relu', 'maxpool', 'layer1', 'layer2', 'layer3']:
        features.add(getattr(base_network, layer))
    for layer in ['layer4']:
        top_features.add(getattr(base_network, layer))
    train_patterns = '|'.join(['.*dense', '.*rpn', '.*down(2|3|4)_conv', '.*layers(2|3|4)_conv'])
    return get_faster_rcnn(
        features=features, top_features=top_features, classes=classes,
        short=small_size, max_size=max_size, train_patterns=train_patterns,
        nms_thresh=0.3, nms_topk=400, post_nms=100,
        roi_mode='align', roi_size=(14, 14), strides=16, clip=None,
        rpn_channel=1024, base_size=16, scales=(2, 4, 8, 16, 32),
        ratios=(0.5, 1, 2), alloc_size=(128, 128), rpn_nms_thresh=0.7,
        rpn_train_pre_nms=12000, rpn_train_post_nms=2000,
        rpn_test_pre_nms=6000, rpn_test_post_nms=300, rpn_min_size=16,
        num_sample=128, pos_iou_thresh=0.5, pos_ratio=0.25, max_num_gt=100,
        **kwargs)

from mxnet import autograd  # used by FasterRCNN.hybrid_forward below
class FasterRCNN(RCNN):
    def __init__(self, features, top_features, classes, box_features=None,
                 short=600, max_size=1000, min_stage=4, max_stage=4, train_patterns=None,
                 nms_thresh=0.3, nms_topk=400, post_nms=100, roi_mode='align', roi_size=(14, 14), strides=16,
                 clip=None, rpn_channel=1024, base_size=16, scales=(8, 16, 32),
                 ratios=(0.5, 1, 2), alloc_size=(128, 128), rpn_nms_thresh=0.7,
                 rpn_train_pre_nms=12000, rpn_train_post_nms=2000, rpn_test_pre_nms=6000,
                 rpn_test_post_nms=300, rpn_min_size=16, per_device_batch_size=1, num_sample=128,
                 pos_iou_thresh=0.5, pos_ratio=0.25, max_num_gt=300, additional_output=False,
                 force_nms=False, minimal_opset=False, **kwargs):
        super(FasterRCNN, self).__init__(
            features=features, top_features=top_features, classes=classes,
            box_features=box_features, short=short, max_size=max_size,
            train_patterns=train_patterns, nms_thresh=nms_thresh, nms_topk=nms_topk, post_nms=post_nms,
            roi_mode=roi_mode, roi_size=roi_size, strides=strides, clip=clip, force_nms=force_nms,
            minimal_opset=minimal_opset, **kwargs)
        if max_stage - min_stage > 1 and isinstance(strides, (int, float)):
            raise ValueError('Multi level detected but strides is of a single number:', strides)

        if rpn_train_post_nms > rpn_train_pre_nms:
            rpn_train_post_nms = rpn_train_pre_nms
        if rpn_test_post_nms > rpn_test_pre_nms:
            rpn_test_post_nms = rpn_test_pre_nms

        self.ashape = alloc_size[0]
        self._min_stage = min_stage
        self._max_stage = max_stage
        self.num_stages = max_stage - min_stage + 1
        if self.num_stages > 1:
            assert len(scales) == len(strides) == self.num_stages, "The num_stages (%d) must match number of scales (%d) and strides (%d)" % (self.num_stages, len(scales), len(strides))
        self._batch_size = per_device_batch_size
        self._num_sample = num_sample
        self._rpn_test_post_nms = rpn_test_post_nms
        if minimal_opset:
            self._target_generator = None
        else:
            self._target_generator = lambda: RCNNTargetGenerator(self.num_class, int(num_sample * pos_ratio), self._batch_size)

        self._additional_output = additional_output
        with self.name_scope():
            self.rpn = RPN(
                channels=rpn_channel, strides=strides, base_size=base_size,
                scales=scales, ratios=ratios, alloc_size=alloc_size,
                clip=clip, nms_thresh=rpn_nms_thresh, train_pre_nms=rpn_train_pre_nms,
                train_post_nms=rpn_train_post_nms, test_pre_nms=rpn_test_pre_nms,
                test_post_nms=rpn_test_post_nms, min_size=rpn_min_size,
                multi_level=self.num_stages > 1, per_level_nms=False,
                minimal_opset=minimal_opset)
            self.sampler = RCNNTargetSampler(num_image=self._batch_size, num_proposal=rpn_train_post_nms, num_sample=num_sample, pos_iou_thresh=pos_iou_thresh, pos_ratio=pos_ratio, max_num_gt=max_num_gt)

    @property
    def target_generator(self):
        if self._target_generator is None:
            raise ValueError("`minimal_opset` enabled, target generator is not available")
        if not isinstance(self._target_generator, mx.gluon.Block):
            self._target_generator = self._target_generator()
            self._target_generator.initialize()
        return self._target_generator

    def reset_class(self, classes, reuse_weights=None):
        super(FasterRCNN, self).reset_class(classes, reuse_weights)
        self._target_generator = lambda: RCNNTargetGenerator(self.num_class, self.sampler._max_pos, self._batch_size)

    def _pyramid_roi_feats(self, F, features, rpn_rois, roi_size, strides, roi_mode='align', roi_canonical_scale=224.0, sampling_ratio=2, eps=1e-6):
        max_stage = self._max_stage
        if self._max_stage > 5:
            max_stage = self._max_stage - 1
        _, x1, y1, x2, y2 = F.split(rpn_rois, axis=-1, num_outputs=5)
        h = y2 - y1 + 1
        w = x2 - x1 + 1
        roi_level = F.floor(4 + F.log2(F.sqrt(w * h) / roi_canonical_scale + eps))
        roi_level = F.squeeze(F.clip(roi_level, self._min_stage, max_stage))

        pooled_roi_feats = []
        for i, l in enumerate(range(self._min_stage, max_stage + 1)):
            if roi_mode == 'pool':
                pooled_feature = F.ROIPooling(features[i], rpn_rois, roi_size, 1. / strides[i])
                pooled_feature = F.where(roi_level == l, pooled_feature, F.zeros_like(pooled_feature))
            elif roi_mode == 'align':
                if 'box_encode' in F.contrib.__dict__ and 'box_decode' in F.contrib.__dict__:
                    masked_rpn_rois = F.where(roi_level == l, rpn_rois, F.ones_like(rpn_rois) * -1.)
                    pooled_feature = F.contrib.ROIAlign(features[i], masked_rpn_rois, roi_size, 1. / strides[i], sample_ratio=sampling_ratio)
                else:
                    pooled_feature = F.contrib.ROIAlign(features[i], rpn_rois, roi_size, 1. / strides[i], sample_ratio=sampling_ratio)
                    pooled_feature = F.where(roi_level == l, pooled_feature, F.zeros_like(pooled_feature))
            else:
                raise ValueError("Invalid roi mode: {}".format(roi_mode))
            pooled_roi_feats.append(pooled_feature)
        pooled_roi_feats = F.ElementWiseSum(*pooled_roi_feats)
        return pooled_roi_feats

    def hybrid_forward(self, F, x, gt_box=None, gt_label=None):
        def _split(x, axis, num_outputs, squeeze_axis):
            x = F.split(x, axis=axis, num_outputs=num_outputs, squeeze_axis=squeeze_axis)
            if isinstance(x, list):
                return x
            else:
                return [x]

        feat = self.features(x)
        if not isinstance(feat, (list, tuple)):
            feat = [feat]

        if autograd.is_training():
            # training: keep raw RPN scores/boxes and anchors for the RPN loss,
            # then sub-sample the proposals for the RCNN head
            rpn_score, rpn_box, raw_rpn_score, raw_rpn_box, anchors = self.rpn(F.zeros_like(x), *feat)
            rpn_box, samples, matches = self.sampler(rpn_box, rpn_score, gt_box)
        else:
            # inference: only the NMS-filtered proposal boxes are needed
            _, rpn_box = self.rpn(F.zeros_like(x), *feat)

        num_roi = self._num_sample if autograd.is_training() else self._rpn_test_post_nms
        batch_size = self._batch_size if autograd.is_training() else 1
        with autograd.pause():
            roi_batchid = F.arange(0, batch_size)
            roi_batchid = F.repeat(roi_batchid, num_roi)
            rpn_roi = F.concat(*[roi_batchid.reshape((-1, 1)), rpn_box.reshape((-1, 4))], dim=-1)
            rpn_roi = F.stop_gradient(rpn_roi)

        if self.num_stages > 1:
            pooled_feat = self._pyramid_roi_feats(F, feat, rpn_roi, self._roi_size, self._strides, roi_mode=self._roi_mode)
        else:
            if self._roi_mode == 'pool':
                pooled_feat = F.ROIPooling(feat[0], rpn_roi, self._roi_size, 1. / self._strides)
            elif self._roi_mode == 'align':
                pooled_feat = F.contrib.ROIAlign(feat[0], rpn_roi, self._roi_size, 1. / self._strides, sample_ratio=2)
            else:
                raise ValueError("Invalid roi mode: {}".format(self._roi_mode))

        if self.top_features is not None:
            top_feat = self.top_features(pooled_feat)
        else:
            top_feat = pooled_feat
        if self.box_features is None:
            box_feat = F.contrib.AdaptiveAvgPooling2D(top_feat, output_size=1)
        else:
            box_feat = self.box_features(top_feat)
        cls_pred = self.class_predictor(box_feat)
        cls_pred = cls_pred.reshape((batch_size, num_roi, self.num_class + 1))

        if autograd.is_training():
            cls_targets, box_targets, box_masks, indices = self.target_generator(rpn_box, samples, matches, gt_label, gt_box)
            box_feat = F.reshape(box_feat.expand_dims(0), (batch_size, -1, 0))
            box_pred = self.box_predictor(F.concat(
                *[F.take(F.slice_axis(box_feat, axis=0, begin=i, end=i + 1).squeeze(), F.slice_axis(indices, axis=0, begin=i, end=i + 1).squeeze())
                  for i in range(batch_size)], dim=0))
            box_pred = box_pred.reshape((batch_size, -1, self.num_class, 4))
            if self._additional_output:
                return (cls_pred, box_pred, rpn_box, samples, matches, raw_rpn_score, raw_rpn_box, anchors, cls_targets, box_targets, box_masks, top_feat, indices)
            return (cls_pred, box_pred, rpn_box, samples, matches, raw_rpn_score, raw_rpn_box, anchors, cls_targets, box_targets, box_masks, indices)

        box_pred = self.box_predictor(box_feat)
        box_pred = box_pred.reshape((batch_size, num_roi, self.num_class, 4))
        cls_ids, scores = self.cls_decoder(F.softmax(cls_pred, axis=-1))
        cls_ids = cls_ids.transpose((0, 2, 1)).reshape((0, 0, 0, 1))
        scores = scores.transpose((0, 2, 1)).reshape((0, 0, 0, 1))
        box_pred = box_pred.transpose((0, 2, 1, 3))

        rpn_boxes = _split(rpn_box, axis=0, num_outputs=batch_size, squeeze_axis=False)
        cls_ids = _split(cls_ids, axis=0, num_outputs=batch_size, squeeze_axis=True)
        scores = _split(scores, axis=0, num_outputs=batch_size, squeeze_axis=True)
        box_preds = _split(box_pred, axis=0, num_outputs=batch_size, squeeze_axis=True)

        results = []
        for rpn_box, cls_id, score, box_pred in zip(rpn_boxes, cls_ids, scores, box_preds):
            bbox = self.box_decoder(box_pred, rpn_box)
            res = F.concat(*[cls_id, score, bbox], dim=-1)
            if self.force_nms:
                res = res.reshape((1, -1, 0))
            res = F.contrib.box_nms(
                res, overlap_thresh=self.nms_thresh, topk=self.nms_topk, valid_thresh=0.0001,
                id_index=0, score_index=1, coord_start=2, force_suppress=self.force_nms)
            res = res.reshape((-3, 0))
            results.append(res)

        result = F.stack(*results, axis=0)
        ids = F.slice_axis(result, axis=-1, begin=0, end=1)
        scores = F.slice_axis(result, axis=-1, begin=1, end=2)
        bboxes = F.slice_axis(result, axis=-1, begin=2, end=6)
        if self._additional_output:
            return ids, scores, bboxes, feat
        return ids, scores, bboxes
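
A quick smoke test of the network (a sketch; it assumes the repo's nets package is importable and uses the factory shown above, and the 'pill' class name is made up):

net = faster_rcnn_resnet50_v1b_voc(classes=['pill'])
net.initialize()
x = mx.nd.zeros((1, 3, 600, 800))        # dummy image batch
ids, scores, bboxes = net(x)             # inference branch of hybrid_forward
print(ids.shape, scores.shape, bboxes.shape)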



5. Model training

1. Learning-rate schedule

lr_steps = sorted([int(ls) for ls in lr_decay_epoch.split(',') if ls.strip()])
lr_decay_epoch = [e for e in lr_steps]

lr_scheduler = LRSequential([
    LRScheduler('linear', base_lr=0, target_lr=learning_rate,
                nepochs=0, iters_per_epoch=self.num_samples // self.batch_size),
    LRScheduler(lr_mode, base_lr=learning_rate,
                nepochs=TrainNum,
                iters_per_epoch=self.num_samples // self.batch_size,
                step_epoch=lr_decay_epoch,
                step_factor=lr_decay, power=2),
])
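
The first entry is a linear warmup (with nepochs=0 it is effectively a no-op here) and the second applies the decay given by lr_mode. To inspect the resulting schedule (a sketch; it assumes gluoncv-style schedulers, which are callable with the global iteration index):

for it in (0, 1000, 10000):
    print(it, lr_scheduler(it))   # learning rate at that iteration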

2. Optimizer

if optim == 1:
    # SGD with momentum and weight decay, the usual choice for Faster RCNN
    trainer = gluon.Trainer(self.model.collect_params(), 'sgd', {'learning_rate': learning_rate, 'wd': 0.0005, 'momentum': 0.9, 'lr_scheduler': lr_scheduler})
elif optim == 2:
    trainer = gluon.Trainer(self.model.collect_params(), 'adagrad', {'learning_rate': learning_rate, 'lr_scheduler': lr_scheduler})
else:
    trainer = gluon.Trainer(self.model.collect_params(), 'adam', {'learning_rate': learning_rate, 'lr_scheduler': lr_scheduler})

3. Losses

rpn_cls_loss = mx.gluon.loss.SigmoidBinaryCrossEntropyLoss(from_sigmoid=False)
rpn_box_loss = mx.gluon.loss.HuberLoss(rho=1. / 9.)
rcnn_cls_loss = mx.gluon.loss.SoftmaxCrossEntropyLoss()
rcnn_box_loss = mx.gluon.loss.HuberLoss(rho=1.)
metrics = [mx.metric.Loss('RPN_Conf'),
           mx.metric.Loss('RPN_SmoothL1'),
           mx.metric.Loss('RCNN_CrossEntropy'),
           mx.metric.Loss('RCNN_SmoothL1'), ]
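
The two HuberLoss instances are the smooth-L1 terms from the paper; rho=1/9 makes the RPN box loss switch from the quadratic to the linear regime earlier. A toy check on dummy data (a sketch using the definitions above):

pred = mx.nd.array([[0.5, -0.05]])
target = mx.nd.zeros((1, 2))
print(rpn_box_loss(pred, target))    # transition at |error| = 1/9
print(rcnn_box_loss(pred, target))   # transition at |error| = 1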

4. Training loop

This is the inner loop over one epoch; executor, rcnn_task, metrics2, and btic are created earlier in the training routine.
for i, batch in enumerate(self.train_loader):
    batch = self.split_and_load(batch)
    metric_losses = [[] for _ in metrics]
    add_losses = [[] for _ in metrics2]
    if executor is not None:
        for data in zip(*batch):
            executor.put(data)
    for j in range(len(self.ctx)):
        if executor is not None:
            result = executor.get()
        else:
            result = rcnn_task.forward_backward(list(zip(*batch))[0])

        for k in range(len(metric_losses)):
            metric_losses[k].append(result[k])
        for k in range(len(add_losses)):
            add_losses[k].append(result[len(metric_losses) + k])
    trainer.step(self.batch_size)

    for metric, record in zip(metrics, metric_losses):
        metric.update(0, record)
    for metric, records in zip(metrics2, add_losses):
        for pred in records:
            metric.update(pred[0], pred[1])

    msg = ','.join(['{}={:.3f}'.format(*metric.get()) for metric in metrics + metrics2])
    print('[Epoch {}][Batch {}], Speed: {:.3f} samples/sec, {}'.format(epoch, i, self.batch_size / (time.time() - btic), msg))
    btic = time.time()
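
For reference, the forward/backward task and parallel executor consumed by this loop are typically created like this (a sketch based on the section 1 imports; the exact constructor arguments in this repo may differ):

rcnn_task = ForwardBackwardTask(self.model, trainer, rpn_cls_loss, rpn_box_loss,
                                rcnn_cls_loss, rcnn_box_loss, mix_ratio=1.0,
                                amp_enabled=None)
executor = Parallel(1, rcnn_task) if len(self.ctx) > 1 else None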

6. Model prediction

    def predict(self, image, confidence=0.5, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
        start_time = time.time()
        origin_img = copy.deepcopy(image)
        base_imageSize = origin_img.shape
        image = cv2.cvtColor(image,cv2.COLOR_RGB2BGR)
        image = cv2.resize(image,(self.min_size,self.min_size))
        img = nd.array(image)
        img = mx.nd.image.to_tensor(img)
        img = mx.nd.image.normalize(img, mean=mean, std=std)

        x = img.expand_dims(0)
        x = x.as_in_context(self.ctx[0])
        labels, scores, bboxes = [xx[0].asnumpy() for xx in self.model(x)]

        origin_img_pillow = self.cv2_pillow(origin_img)
        font = ImageFont.truetype(font='./model_data/simhei.ttf', size=np.floor(3e-2 * np.shape(origin_img_pillow)[1] + 0.5).astype('int32'))
        thickness = max((np.shape(origin_img_pillow)[0] + np.shape(origin_img_pillow)[1]) // self.min_size, 1)

        imgbox = []
        for i, bbox in enumerate(bboxes):
            if (scores is not None and scores.flat[i] < confidence) or (labels is not None and labels.flat[i] < 0):
                continue
            cls_id = int(labels.flat[i]) if labels is not None else -1

            xmin, ymin, xmax, ymax = [int(x) for x in bbox]
            xmin = int(xmin / self.min_size * base_imageSize[1])
            xmax = int(xmax / self.min_size * base_imageSize[1])
            ymin = int(ymin / self.min_size * base_imageSize[0])
            ymax = int(ymax / self.min_size * base_imageSize[0])

            # print(xmin, ymin, xmax, ymax, self.classes_names[cls_id])
            class_name = self.classes_names[cls_id]
            score = '{:d}%'.format(int(scores.flat[i] * 100)) if scores is not None else ''
            imgbox.append([(xmin, ymin, xmax, ymax), cls_id, self.classes_names[cls_id], score])
            top, left, bottom, right = ymin, xmin, ymax, xmax

            label = '{}-{}'.format(class_name, score)
            draw = ImageDraw.Draw(origin_img_pillow)
            label_size = draw.textsize(label, font)
            label = label.encode('utf-8')

            if top - label_size[1] >= 0:
                text_origin = np.array([left, top - label_size[1]])
            else:
                text_origin = np.array([left, top + 1])

            for t in range(thickness):  # draw a thicker border without shadowing the outer index i
                draw.rectangle([left + t, top + t, right - t, bottom - t], outline=self.colors[cls_id])
            draw.rectangle([tuple(text_origin), tuple(text_origin + label_size)], fill=self.colors[cls_id])
            draw.text(text_origin, str(label,'UTF-8'), fill=(0, 0, 0), font=font)
            del draw

        result_value = {
            "image_result": self.pillow_cv2(origin_img_pillow),
            "bbox": imgbox,
            "time": (time.time() - start_time) * 1000
        }
        return result_value

IV. Main Entry Point

if __name__ == '__main__':
    # ctu = Ctu_FasterRcnn(USEGPU='0',ampFlag=True,mixupFlag=False,min_size = 416,max_size=416)
    # ctu.InitModel(DataDir=r'D:/Ctu/Ctu_Project_DL/DataSet/DataSet_Detection_YaoPian',batch_size=1,num_workers = 0,model_name='faster_rcnn_resnet18_v1b_voc',pre_Model='./Model_faster_rcnn_resnet18_v1b_voc/best_model.dat')
    # ctu.train(TrainNum=300,learning_rate=0.0001,lr_decay_epoch='30,70,150,200',ModelPath='./Model',lr_decay=0.9,disable_hybridization=False)

    ctu = Ctu_FasterRcnn(USEGPU='0',ampFlag=True)
    ctu.LoadModel(r'./Model_faster_rcnn_resnet50_v1b_voc')
    cv2.namedWindow("result", 0)
    cv2.resizeWindow("result", 640, 480)
    index = 0
    for root, dirs, files in os.walk(r'D:/Ctu/Ctu_Project_DL/DataSet/DataSet_Detection_Color/test'):
        for f in files:
            img_cv = ctu.read_image(os.path.join(root, f))
            if img_cv is None:
                continue
            res = ctu.predict(img_cv, 0.7)
            for each in res['bbox']:
                print(each)
            print("耗时:" + str(res['time']) + ' ms')
            # cv2.imwrite(str(index + 1)+'.bmp',res['image_result'])
            cv2.imshow("result", res['image_result'])
            cv2.waitKey()
            # index +=1

V. Training Results

Note: the trained model was not kept, so training results will be supplied later.
