MMFewshot框架少样本目标检测配置学习(二)

教程 0:MMFEWSHOT 检测概述

在 MMFewShot 中,有三个用于获取数据的重要组件:
Datasets:ann_cfg从少数镜头设置中加载注释并过滤图像和注释。
Dataset Wrappers:确定采样逻辑,例如根据查询图像采样支持图像。
Dataloader Wrappers:封装来自多个数据集的数据。

总而言之,我们目前支持 4 种不同的数据流进行训练:
fine-tune based:与常规检测相同。
query aware:它将返回来自同一数据集的查询数据和支持数据。
n way k shot:它将首先从单独的数据集中采样查询数据(常规)和支持数据(N way k shot),然后用数据加载器包装器将它们封装起来。
two branch:它将首先从单独的数据集中采样主数据(常规)和辅助数据(常规),然后用数据加载器包装器将它们封装起来。

检测:
regular testing:与定期检测相同。
testing for query-support based detector:在测试之前会有一个模型初始化步骤,它是由QuerySupportEvalHook实现的。更多的实现细节可以参考mmfew .detection.core. evaluate .eval_hooks
在这里插入图片描述

教程 1:了解配置

我们在配置系统中采用了模块化和继承设计,方便进行各种实验。如果你想检查配置文件,你可以运行查看完整的配置。mmfewshot 的检测部分建立在 mmdet 之上,因此强烈建议学习mmdet的基础知识。python tools/misc/print_config.py /PATH/TO/CONFIG

1、 配置文件命名约定

我们遵循以下风格来命名配置文件。建议贡献者也遵循相同的风格。

{model}_[model setting]_{backbone}_{neck}_[norm setting]_[misc]_[gpu x batch_per_gpu]_{dataset}_{data setting}

{xxx}是必填字段,也是[yyy]可选的。
{model}:模型类型如faster_rcnn、、mask_rcnn等等。
[model setting]:针对某些模型的特定设置,例如contrastive-loss等fsce
{backbone}:主干类型,如r50(ResNet-50)、x101(ResNeXt-101)。
{neck}:颈型类似fpn, c4。
[norm_setting]:bn除非另有说明,否则使用(批量标准化),其他标准化层类型可以是gn(组标准化)、syncbn(同步批量标准化)。gn-head表示GN 仅适用于头部/颈部,而gn-all表示 GN 适用于整个模型,例如骨干、颈部、头部。
[misc]:模型的其他设置/插件,例如dconv, gcb, attention, albu, mstrain.
[gpu x batch_per_gpu]默认使用GPU 和每个 GPU 的样本数:8xb2 is used by default.
{dataset}:数据集,dataset like coco, voc-split1, voc-split2 and voc-split3.
{data setting}:like base-training or 1shot-fine-tuning.

2、在配置中使用中间变量

配置文件中使用了一些中间变量,例如数据集中的train_pipeline/ test_pipeline。值得注意的是,在修改子配置中的中间变量时,用户需要再次将中间变量传递到相应的字段中。例如,我们想使用多尺度策略来训练 Mask R-CNN。train_pipeline/test_pipeline是我们想要修改的中间变量。

_base_ = './faster_rcnn_r50_caffe_fpn.py'
img_norm_cfg = dict(mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [dict(type='LoadImageFromFile'),dict(type='LoadAnnotations', with_bbox=True, with_mask=True),dict(type='Resize',img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),(1333, 768), (1333, 800)],multiscale_mode="value",keep_ratio=True),dict(type='RandomFlip', flip_ratio=0.5),dict(type='Normalize', **img_norm_cfg),dict(type='Pad', size_divisor=32),dict(type='DefaultFormatBundle'),dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
test_pipeline = [dict(type='LoadImageFromFile'),dict(type='MultiScaleFlipAug',img_scale=(1333, 800),flip=False,transforms=[dict(type='Resize', keep_ratio=True),dict(type='RandomFlip'),dict(type='Normalize', **img_norm_cfg),dict(type='Pad', size_divisor=32),dict(type='ImageToTensor', keys=['img']),dict(type='Collect', keys=['img']),])
]
data = dict(train=dict(pipeline=train_pipeline),val=dict(pipeline=test_pipeline),test=dict(pipeline=test_pipeline))

我们首先定义新的train_pipeline/test_pipeline并将它们传递给data。
Similarly, if we would like to switch from SyncBN to BN or MMSyncBN, we need to substitute every norm_cfg in the config.

_base_ = './faster_rcnn_r50_caffe_fpn.py'
norm_cfg = dict(type='BN', requires_grad=True)
model = dict(backbone=dict(norm_cfg=norm_cfg),neck=dict(norm_cfg=norm_cfg),...)

3、TFA 示例

为了帮助用户对现代分类系统中的完整配置和模块有一个基本的了解,我们对 coco 10 镜头微调设置中的 TFA 配置做了简要的评论,如下所示。有关更详细的用法和每个模块对应的替代方案,请参阅 API 文档。

train_pipeline = [  # Training pipeline# First pipeline to load images from file pathdict(type='LoadImageFromFile'),# Second pipeline to load annotations for current imagedict(type='LoadAnnotations', with_bbox=True),# Augmentation pipeline that resize the images and their annotationsdict(type='Resize',# The multiple scales of imageimg_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),(1333, 768), (1333, 800)],# whether to keep the ratio between height and widthkeep_ratio=True,# the scales will be sampled from img_scalemultiscale_mode='value'),# RandomFlip config, flip_ratio: the ratio or probability to flipdict(type='RandomFlip', flip_ratio=0.5),# Image normalization config to normalize the input imagesdict(type='Normalize',# Mean values used to in pre-trained backbone modelsmean=[103.53, 116.28, 123.675],# Standard variance used to in pre-trained backbone modelsstd=[1.0, 1.0, 1.0],# The channel orders of image used in pre-trained backbone modelsto_rgb=False),# Padding config, size_divisor: the number the padded images should be divisibledict(type='Pad', size_divisor=32),# Default format bundle to gather data in the pipelinedict(type='DefaultFormatBundle'),# Pipeline that decides which keys in the data should be passed to the detectordict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [  # test pipeline# First pipeline to load images from file pathdict(type='LoadImageFromFile'),# An encapsulation that encapsulates the testing augmentationsdict(type='MultiScaleFlipAug',# Decides the largest scale for testing, used for the Resize pipelineimg_scale=(1333, 800),flip=False,  # Whether to flip images during testingtransforms=[# Use resize augmentationdict(type='Resize', keep_ratio=True),# Augmentation pipeline that flip the images and their annotationsdict(type='RandomFlip'),# Augmentation pipeline that normalize the input imagesdict(type='Normalize',# Mean values used in pre-trained backbone modelsmean=[103.53, 116.28, 123.675],# Standard variance used in pre-trained backbone modelsstd=[1.0, 1.0, 1.0],# The channel orders of image used in pre-trained backbone modelsto_rgb=False),# Padding config, size_divisor: the number the padded images should be divisibledict(type='Pad', size_divisor=32),# Default format bundle to gather data in the pipelinedict(type='ImageToTensor', keys=['img']),# Pipeline that decides which keys in the data should be passed to the detectordict(type='Collect', keys=['img'])])
]
data = dict(# Batch size of a single GPUsamples_per_gpu=2,# Worker to pre-fetch data for each single GPUworkers_per_gpu=2,train=dict(  # Train dataset configsave_dataset=False,  # whether to save data_information into json file# the pre-defined few shot setting are saved in `FewShotCocoDefaultDataset`type='FewShotCocoDefaultDataset',ann_cfg=[dict(method='TFA', setting='10SHOT')],  # pre-defined few shot settingimg_prefix='data/coco/',  # prefix of imagesnum_novel_shots=10,  # the max number of instances for novel classesnum_base_shots=10,  # the max number of instances for base classespipeline=train_pipeline,  # training pipelineclasses='ALL_CLASSES',  # pre-defined classes split saved in dataset# whether to split the annotation (each image only contains one instance)instance_wise=False),val=dict(  # Validation dataset configtype='FewShotCocoDataset',  # type of datasetann_cfg=[dict(type='ann_file',  # type of ann_file# path to ann_fileann_file='data/few_shot_ann/coco/annotations/val.json')],# prefix of imageimg_prefix='data/coco/',pipeline=test_pipeline,  # testing pipelineclasses='ALL_CLASSES'),test=dict(  # Testing dataset configtype='FewShotCocoDataset',  # type of datasetann_cfg=[dict(type='ann_file',  # type of ann_file# path to ann_fileann_file='data/few_shot_ann/coco/annotations/val.json')],# prefix of imageimg_prefix='data/coco/',pipeline=test_pipeline,  # testing pipelinetest_mode=True,  # indicate in test modeclasses='ALL_CLASSES'))  # pre-defined classes split saved in dataset
# The config to build the evaluation hook, refer to
# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/evaluation/eval_hooks.py#L7
# for more details.
evaluation = dict(interval=80000,  # Evaluation intervalmetric='bbox',  # Metrics used during evaluationclasswise=True,  # whether to show result of each class# eval results in pre-defined split of classesclass_splits=['BASE_CLASSES', 'NOVEL_CLASSES'])
# Config used to build optimizer, support all the optimizers
# in PyTorch whose arguments are also the same as those in PyTorch
optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0001)
# Config used to build the optimizer hook, refer to
# https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/optimizer.py#L8
# for implementation details. Most of the methods do not use gradient clip.
optimizer_config = dict(grad_clip=None)
# Learning rate scheduler config used to register LrUpdater hook
lr_config = dict(# The policy of scheduler, also support CosineAnnealing, Cyclic, etc.# Refer to details of supported LrUpdater from# https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py#L9.policy='step',# The warmup policy, also support `exp` and `constant`.warmup='linear',# The number of iterations for warmupwarmup_iters=10,# The ratio of the starting learning rate used for warmupwarmup_ratio=0.001,# Steps to decay the learning ratestep=[144000])
# Type of runner to use (i.e. IterBasedRunner or EpochBasedRunner)
runner = dict(type='IterBasedRunner', max_iters=160000)
model = dict(  # The config of backbonetype='TFA',  # The name of detectorbackbone=dict(type='ResNet',  # The name of detector# The depth of backbone, usually it is 50 or 101 for ResNet and ResNext backbones.depth=101,num_stages=4,  # Number of stages of the backbone.# The index of output feature maps produced in each stagesout_indices=(0, 1, 2, 3),# The weights from stages 1 to 4 are frozenfrozen_stages=4,# The config of normalization layers.norm_cfg=dict(type='BN', requires_grad=False),# Whether to freeze the statistics in BNnorm_eval=True,# The style of backbone, 'pytorch' means that stride 2 layers are in 3x3 conv,# 'caffe' means stride 2 layers are in 1x1 convs.style='caffe'),neck=dict(# The neck of detector is FPN. For more details, please refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/necks/fpn.py#L10type='FPN',# The input channels, this is consistent with the output channels of backbonein_channels=[256, 512, 1024, 2048],# The output channels of each level of the pyramid feature mapout_channels=256,# The number of output scalesnum_outs=5,# the initialization of specific layer. For more details, please refer to# https://mmdetection.readthedocs.io/en/latest/tutorials/init_cfg.htmlinit_cfg=[# initialize lateral_convs layer with Caffe2Xavierdict(type='Caffe2Xavier',override=dict(type='Caffe2Xavier', name='lateral_convs')),# initialize fpn_convs layer with Caffe2Xavierdict(type='Caffe2Xavier',override=dict(type='Caffe2Xavier', name='fpn_convs'))]),rpn_head=dict(# The type of RPN head is 'RPNHead'. For more details, please refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/dense_heads/rpn_head.py#L12type='RPNHead',# The input channels of each input feature map,# this is consistent with the output channels of neckin_channels=256,# Feature channels of convolutional layers in the head.feat_channels=256,anchor_generator=dict(  # The config of anchor generator# Most of methods use AnchorGenerator, For more details, please refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/anchor/anchor_generator.py#L10type='AnchorGenerator',# Basic scale of the anchor, the area of the anchor in one position# of a feature map will be scale * base_sizesscales=[8],# The ratio between height and width.ratios=[0.5, 1.0, 2.0],# The strides of the anchor generator. This is consistent with the FPN# feature strides. The strides will be taken as base_sizes if base_sizes is not set.strides=[4, 8, 16, 32, 64]),bbox_coder=dict(  # Config of box coder to encode and decode the boxes during training and testing# Type of box coder. 'DeltaXYWHBBoxCoder' is applied for most of methods. For more details refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/coder/delta_xywh_bbox_coder.py#L9type='DeltaXYWHBBoxCoder',# The target means used to encode and decode boxestarget_means=[0.0, 0.0, 0.0, 0.0],# The standard variance used to encode and decode boxestarget_stds=[1.0, 1.0, 1.0, 1.0]),# Config of loss function for the classification branchloss_cls=dict(# Type of loss for classification branch.type='CrossEntropyLoss',# RPN usually perform two-class classification,# so it usually uses sigmoid function.use_sigmoid=True,# Loss weight of the classification branch.loss_weight=1.0),# Config of loss function for the regression branch.loss_bbox=dict(# Type of loss, we also support many IoU Losses and smooth L1-loss. For implementation refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/losses/smooth_l1_loss.py#L56type='L1Loss',# Loss weight of the regression branch.loss_weight=1.0)),roi_head=dict(# Type of the RoI head, for more details refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/standard_roi_head.py#L10type='StandardRoIHead',# RoI feature extractor for bbox regression.bbox_roi_extractor=dict(# Type of the RoI feature extractor. For more details refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/roi_extractors/single_level.py#L10type='SingleRoIExtractor',roi_layer=dict(  # Config of RoI Layer# Type of RoI Layer, for more details refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/ops/roi_align/roi_align.py#L79type='RoIAlign',output_size=7,  # The output size of feature maps.# Sampling ratio when extracting the RoI features.# 0 means adaptive ratio.sampling_ratio=0),# output channels of the extracted feature.out_channels=256,# Strides of multi-scale feature maps. It should be consistent to the architecture of the backbone.featmap_strides=[4, 8, 16, 32]),bbox_head=dict(  # Config of box head in the RoIHead.# Type of the bbox head, for more details refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/bbox_heads/convfc_bbox_head.py#L177type='CosineSimBBoxHead',# Input channels for bbox head. This is consistent with the out_channels in roi_extractorin_channels=256,# Output feature channels of FC layers.fc_out_channels=1024,roi_feat_size=7,  # Size of RoI featuresnum_classes=80,  # Number of classes for classificationbbox_coder=dict(  # Box coder used in the second stage.# Type of box coder. 'DeltaXYWHBBoxCoder' is applied for most of methods.type='DeltaXYWHBBoxCoder',# Means used to encode and decode boxtarget_means=[0.0, 0.0, 0.0, 0.0],# Standard variance for encoding and decoding. It is smaller since# the boxes are more accurate. [0.1, 0.1, 0.2, 0.2] is a conventional setting.target_stds=[0.1, 0.1, 0.2, 0.2]),reg_class_agnostic=False,  # Whether the regression is class agnostic.loss_cls=dict(  # Config of loss function for the classification branch# Type of loss for classification branch, we also support FocalLoss etc.type='CrossEntropyLoss',use_sigmoid=False,  # Whether to use sigmoid.loss_weight=1.0),  # Loss weight of the classification branch.loss_bbox=dict(  # Config of loss function for the regression branch.# Type of loss, we also support many IoU Losses and smooth L1-loss, etc.type='L1Loss',# Loss weight of the regression branch.loss_weight=1.0),# the initialization of specific layer. For more details, please refer to# https://mmdetection.readthedocs.io/en/latest/tutorials/init_cfg.htmlinit_cfg=[# initialize shared_fcs layer with Caffe2Xavierdict(type='Caffe2Xavier',override=dict(type='Caffe2Xavier', name='shared_fcs')),# initialize fc_cls layer with Normaldict(type='Normal',override=dict(type='Normal', name='fc_cls', std=0.01)),# initialize fc_cls layer with Normaldict(type='Normal',override=dict(type='Normal', name='fc_reg', std=0.001))],# number of shared fc layersnum_shared_fcs=2)),train_cfg=dict(rpn=dict(  # Training config of rpnassigner=dict(  # Config of assigner# Type of assigner. For more details, please refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/assigners/max_iou_assigner.py#L10type='MaxIoUAssigner',pos_iou_thr=0.7,  # IoU >= threshold 0.7 will be taken as positive samplesneg_iou_thr=0.3,  # IoU < threshold 0.3 will be taken as negative samplesmin_pos_iou=0.3,  # The minimal IoU threshold to take boxes as positive samples# Whether to match the boxes under low quality (see API doc for more details).match_low_quality=True,ignore_iof_thr=-1),  # IoF threshold for ignoring bboxessampler=dict(  # Config of positive/negative sampler# Type of sampler. For more details, please refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/samplers/random_sampler.py#L8type='RandomSampler',num=256,  # Number of samplespos_fraction=0.5,  # The ratio of positive samples in the total samples.# The upper bound of negative samples based on the number of positive samples.neg_pos_ub=-1,# Whether add GT as proposals after sampling.add_gt_as_proposals=False),# The border allowed after padding for valid anchors.allowed_border=-1,# The weight of positive samples during training.pos_weight=-1,debug=False),  # Whether to set the debug moderpn_proposal=dict(  # The config to generate proposals during trainingnms_pre=2000,  # The number of boxes before NMSmax_per_img=1000,  # The number of boxes to be kept after NMS.nms=dict(  # Config of NMStype='nms',  # Type of NMSiou_threshold=0.7),  # NMS thresholdmin_bbox_size=0),  # The allowed minimal box sizercnn=dict(  # The config for the roi heads.assigner=dict(  # Config of assigner for second stage, this is different for that in rpn# Type of assigner, MaxIoUAssigner is used for all roi_heads for now. For more details, please refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/assigners/max_iou_assigner.py#L10 for more details.type='MaxIoUAssigner',pos_iou_thr=0.5,  # IoU >= threshold 0.5 will be taken as positive samplesneg_iou_thr=0.5,  # IoU < threshold 0.5 will be taken as negative samplesmin_pos_iou=0.5,  # The minimal IoU threshold to take boxes as positive samples# Whether to match the boxes under low quality (see API doc for more details).match_low_quality=False,ignore_iof_thr=-1),  # IoF threshold for ignoring bboxessampler=dict(# Type of sampler, PseudoSampler and other samplers are also supported. For more details, please refer to# https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/samplers/random_sampler.py#L8type='RandomSampler',num=512,  # Number of samplespos_fraction=0.25,  # The ratio of positive samples in the total samples.# The upper bound of negative samples based on the number of positive samples.neg_pos_ub=-1,# Whether add GT as proposals after sampling.add_gt_as_proposals=True),# The weight of positive samples during training.pos_weight=-1,# Whether to set the debug modedebug=False)),test_cfg=dict(  # Config for testing hyperparameters for rpn and rcnnrpn=dict(  # The config to generate proposals during testing# The number of boxes before NMSnms_pre=1000,# The number of boxes to be kept after NMS.max_per_img=1000,# Config of NMSnms=dict(type='nms', iou_threshold=0.7),# The allowed minimal box sizemin_bbox_size=0),rcnn=dict(  # The config for the roi heads.score_thr=0.05,  # Threshold to filter out boxes# Config of NMS in the second stagenms=dict(type='nms', iou_threshold=0.5),# Max number of detections of each imagemax_per_img=100)),# parameters with the prefix listed in frozen_parameters will be frozenfrozen_parameters=['backbone', 'neck', 'rpn_head', 'roi_head.bbox_head.shared_fcs'])
# Config to set the checkpoint hook, Refer to
# https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/checkpoint.py for implementation.
checkpoint_config = dict(interval=80000)
# The logger used to record the training process.
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]  # cumstomize hook
dist_params = dict(backend='nccl')  # parameters to setup distributed training, the port can also be set.
log_level = 'INFO'  # the output level of the log.
# use base training model as model initialization.
load_from = 'work_dirs/tfa_r101_fpn_coco_base-training/base_model_random_init_bbox_head.pth'
# workflow for runner. [('train', 1)] means there is only one workflow and the workflow named 'train' is executed once.
workflow = [('train', 1)]
use_infinite_sampler = True  # whether to use infinite sampler
seed = 0  # random seed

教程 2:添加新数据集

1、通过重新组织数据来自定义数据集

1、 Load annotations from file

ann_file与 mmdet 中用于加载单个数据集的配置不同,我们使用ann_cfg来支持复杂的少量镜头设置。
从数据集的常规加载注释ann_file。

ann_cfg = [dict(type='ann_file', ann_file='path/to/ann_file'), ...]

在这里插入图片描述
对于FewShotVOCDataset,我们还支持从ann_file中加载特定的类在ann_classes。

dict(type='ann_file', ann_file='path/to/ann_file', ann_classes=['dog', 'cat'])

loading annotation from a json file saved by a dataset.

ann_cfg = [dict(type='saved_dataset', ann_file='path/to/ann_file'), ...]

要保存数据集,我们可以save_dataset=True在配置文件中设置,数据集将保存为${WORK_DIR}/{TIMESTAMP}_saved_data.json

dataset=dict(type='FewShotVOCDataset', save_dataset=True, ...)

在这里插入图片描述

2、从预定义基准加载注释

要加载这些预定义的数据分割,需要将数据集的类型设置为 FewShotVOCDefaultDataset或FewShotCocoDefaultDataset。我们为每种方法提供每个重现检查点的数据分割。在配置文件中,我们可以使用method和setting来确定要加载哪个数据分割。

dataset = dict(type='FewShotVOCDefaultDataset',ann_cfg=[dict(method='TFA', setting='SPLIT1_1SHOT')]
)

在这里插入图片描述

3、在运行时从另一个数据集加载注释

In few shot setting, we can use FewShotVOCCopyDataset or FewShotCocoCopyDataset to copy a dataset from other dataset during runtime for some special cases, such as copying online random sampled support set for model initialization before evaluation. It needs user to modify code in mmfewshot.detection.apis. More details can refer to mmfewshot/detection/apis/train.py. Here is an example of config:
在少样本设置中,我们可以使用FewShotVOCCopyDataset或FewShotCocoCopyDataset在运行时从其他数据集复制数据集以用于某些特殊情况。例如:在评估之前复制在线随机采样支持集进行模型初始化。它需要用户修改 mmfewshot.detection.apis中的代码。更多详细信息可以参考 mmfewshot/detection/apis/train.py。以下是配置示例:

dataset = dict(type='FewShotVOCCopyDataset',ann_cfg=[dict(data_infos=FewShotVOCDataset.data_infos)])
4、使用预定义的类别划分

The predefined class splits are supported in datasets. For VOC, we support [ALL_CLASSES_SPLIT1,ALL_CLASSES_SPLIT2, ALL_CLASSES_SPLIT3, NOVEL_CLASSES_SPLIT1, NOVEL_CLASSES_SPLIT2, NOVEL_CLASSES_SPLIT3, BASE_CLASSES_SPLIT1, BASE_CLASSES_SPLIT2, BASE_CLASSES_SPLIT3]. For COCO, we support [ALL_CLASSES, NOVEL_CLASSES, BASE_CLASSES]

data = dict(train=dict(type='FewShotVOCDataset', classes='ALL_CLASSES_SPLIT1'),val=dict(type='FewShotVOCDataset', classes='ALL_CLASSES_SPLIT1'),test=dict(type='FewShotVOCDataset', classes='ALL_CLASSES_SPLIT1'))

另外,类别划分可用于报告不同类别划分的评估结果。以下是配置示例:

evaluation = dict(class_splits=['BASE_CLASSES_SPLIT1', 'NOVEL_CLASSES_SPLIT1'])

在这里插入图片描述

5、自定义注释数量

对于 FewShotDataset,我们支持两种方式来过滤多余的注释。
ann_shot_filter:使用字典指定类,以及加载注释文件时对应的最大实例数。例如,我们只想要 10 个 dog 实例和 5 个 person 实例,而其他类的实例保持不变:

dataset=dict(type='FewShotVOCDataset',ann_shot_filter=dict(dog=10, person=5),...)

num_novel_shots和num_base_shots:使用预定义的类拆分来指示相应的最大实例数。例如,我们只希望每个新类有 1 个实例,基类有 3 个实例:

dataset=dict(type='FewShotVOCDataset',num_novel_shots=1,num_base_shots=2,...)
6、自定义注释的组织

我们还支持将注释拆分为实例,即每个图像只有一个实例,并且图像可以重复。

dataset=dict(type='FewShotVOCDataset',instance_wise=True,...)
7、自定义管道

为了支持单个数据集中的不同管道,我们可以使用multi_pipelines。在配置文件中,multi_pipelines使用键的名称来指示特定的管道。以下是配置示例

multi_pipelines = dict(query=[dict(type='LoadImageFromFile'),dict(type='LoadAnnotations', with_bbox=True),dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),dict(type='RandomFlip', flip_ratio=0.5),dict(type='Normalize', **img_norm_cfg),dict(type='DefaultFormatBundle'),dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])],support=[dict(type='LoadImageFromFile'),dict(type='LoadAnnotations', with_bbox=True),dict(type='Normalize', **img_norm_cfg),dict(type='GenerateMask', target_size=(224, 224)),dict(type='RandomFlip', flip_ratio=0.0),dict(type='DefaultFormatBundle'),dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])])
train=dict(type='NWayKShotDataset',dataset=dict(type='FewShotCocoDataset',...multi_pipelines=train_multi_pipelines))

使用时multi_pipelines,我们需要指定管道名称来 prepare_train_img获取图像。例如

dataset.prepare_train_img(self, idx, 'query')

2、自定义新的数据集包装器

1、创建新的数据集包装器(dataset wrapper)

我们可以在 mmfewshot/detection/datasets/dataset_wrappers.py 中创建一个新的数据集包装器来定制采样逻辑。目前有:
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述

class MyDatasetWrapper:def __init__(self, dataset, support_dataset=None, args_a, args_b, ...):# query_dataset and support_dataset can use same datasetself.query_dataset = datasetself.support_dataset = support_datasetif support_dataset is None:self.support_dataset = dataset...def __getitem__(self, idx):...query_data = self.query_dataset.prepare_train_img(idx, 'query')# customize sampling logicsupport_idxes = ...support_data = [self.support_dataset.prepare_train_img(idx, 'support')for idx in support_idxes]return {'query_data': query_data, 'support_data': support_data}
2、更新数据集生成器

我们需要在 mmfewshot/detection/datasets/builder.py 中添加构建代码来定制数据集包装器。

def build_dataset(cfg, default_args=None):if isinstance(cfg, (list, tuple)):dataset = ConcatDataset([build_dataset(c, default_args) for c in cfg])...elif cfg['type'] == 'MyDatasetWrapper':dataset = MyDatasetWrapper(build_dataset(cfg['dataset'], default_args),build_dataset(cfg['support_dataset'], default_args) if cfg.get('support_dataset', False) else None,# pass customize argumentsargs_a=cfg['args_a'],args_b=cfg['args_b'],...)else:dataset = build_from_cfg(cfg, DATASETS, default_args)return dataset
3、更新数据加载器构建器

当自定义数据集包装器返回 Tensor 列表时,我们需要在 mmfewshot/detection/datasets/builder.py 中添加 dataloader 的构建代码。我们可以用它multi_pipeline_collate_fn来处理这种情况。

def build_dataset(cfg, default_args=None):...if isinstance(dataset, MyDatasetWrapper):from mmfewshot.utils import multi_pipeline_collate_fn# `multi_pipeline_collate_fn` are designed to handle# the data with list[list[DataContainer]]data_loader = DataLoader(dataset,batch_size=batch_size,sampler=sampler,num_workers=num_workers,collate_fn=partial(multi_pipeline_collate_fn, samples_per_gpu=samples_per_gpu),pin_memory=False,worker_init_fn=init_fn,**kwargs)...
4、更新模型中的参数

forward函数中的参数名称需要与自定义数据集包装器一致。

class MyDetector(BaseDetector):...def forward(self, query_data, support_data, ...):
5、在配置中使用自定义数据集包装器

然后在配置中,MyDatasetWrapper你可以按如下方式修改配置,

dataset_A_train = dict(type='MyDatasetWrapper',args_a=None,args_b=None,dataset=dict(  # This is the original config of Dataset_Atype='Dataset_A',...multi_pipelines=train_multi_pipelines),support_dataset=None)

3、自定义数据加载器包装器

1、创建新的数据加载器包装器
我们可以在 mmfewshot/detection/datasets/dataloader_wrappers.py 中创建一个新的数据集包装器来定制采样逻辑。
class MyDataloader:def __init__(self, query_data_loader, support_data_loader):self.dataset = query_data_loader.datasetself.sampler = query_data_loader.samplerself.query_data_loader = query_data_loaderself.support_data_loader = support_data_loaderdef __iter__(self):self.query_iter = iter(self.query_data_loader)self.support_iter = iter(self.support_data_loader)return selfdef __next__(self):query_data = self.query_iter.next()support_data = self.support_iter.next()return {'query_data': query_data, 'support_data': support_data}def __len__(self) -> int:return len(self.query_data_loader)
2、更新数据加载器构建器

我们需要在 mmfewshot/detection/datasets/builder.py 中为我们的自定义数据集包装器添加构建代码。

def build_dataloader(dataset, ...):if isinstance(dataset, MyDataset):...query_data_loader = DataLoader(...)support_data_loader = DataLoader(...)# wrap two dataloaders with dataloader wrapperdata_loader = MyDataloader(query_data_loader=query_data_loader,support_data_loader=support_data_loader)return dataset

教程 3:自定义模型

我们基本上将模型组件分为 5 种类型,与 mmdet 相同。
主干:通常是一个FCN网络,用于提取特征图,例如ResNet,MobileNet。
颈部:脊柱和头部之间的部分,例如 FPN、PAFPN。
head:用于特定任务的组件,例如bbox预测和mask预测。
roi extractor:从特征图中提取RoI特征的部分,例如RoI Align。
loss:头部中用于计算损失的组件,例如FocalLoss、L1Loss、GHMLoss。

开发新组件

添加新的骨干

教程 4:自定义运行时设置

1、自定义优化设置

自定义Pytorch支持的优化器

我们已经支持使用 PyTorch 实现的所有优化器,只需要修改optimizer配置文件的字段即可。比如你想使用ADAM(注意性能会下降很多),可以这样修改。

optimizer = dict(type='Adam', lr=0.0003, weight_decay=0.0001)
定制自行实现的优化器
1. 定义新的优化器

定制的优化器可以定义如下。
假设您要添加一个名为MyOptimizer 的优化器,其参数为a、b和c。您需要创建一个名为mmfewshot/detection/core/optimizer 的新目录。然后在文件中实现新的优化器,例如mmfewshot/detection/core/optimizer/my_optimizer.py:

from .registry import OPTIMIZERS
from torch.optim import Optimizer@OPTIMIZERS.register_module()
class MyOptimizer(Optimizer):def __init__(self, a, b, c)
2. 将优化器添加到注册表

修改一下mmfewshot/detection/core/optimizer/init.py就可以导入了。
应该导入新定义的模块,mmfewshot/detection/core/optimizer/init.py以便注册表找到新模块并添加它:

from .my_optimizer import MyOptimizer

custom_imports在配置中使用手动导入

custom_imports = dict(imports=['mmfewshot.detection.core.optimizer.my_optimizer'], allow_failed_imports=False)

模块mmfewshot.detection.core.optimizer.my_optimizer将在程序开始时导入,然后自动注册类。请注意,只应导入 MyOptimizer包含该类的包。不能直接导MyOptimizermmfewshot.detection.core.optimizer.my_optimizer.MyOptimizer

3. 在配置文件中指定优化器

然后你就可以MyOptimizer在optimizer配置文件的字段中使用了。在配置中,优化器由字段定义,optimizer如下所示:

optimizer = dict(type='MyOptimizer', a=a_value, b=b_value, c=c_value)
定义优化器构造函数

某些模型可能有一些针对优​​化的参数设置,例如 BatchNorm 层的权重衰减。用户可以通过自定义优化器构造函数来进行这些细粒度的参数调整。

from mmcv.utils import build_from_cfgfrom mmcv.runner.optimizer import OPTIMIZER_BUILDERS, OPTIMIZERS
from mmfewshot.utils import get_root_logger
from .my_optimizer import MyOptimizer@OPTIMIZER_BUILDERS.register_module()
class MyOptimizerConstructor(object):def __init__(self, optimizer_cfg, paramwise_cfg=None):def __call__(self, model):return my_optimizer

默认的优化器构造函数在这里实现,它也可以作为新优化器构造函数的模板。

其他设置

优化器未实现的技巧应通过优化器构造函数(例如,设置参数学习率)或钩子来实现。我们列出了一些可以稳定训练或加速训练的常见设置。欢迎创建 PR、issue 以获取更多设置。
使用梯度剪辑来稳定训练:有些模型需要使用梯度剪辑来剪辑梯度以稳定训练过程。示例如下:

optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))

使用动量调度加速模型收敛:我们支持动量调度器根据学习率修改模型的动量,从而使模型以更快的方式收敛。动量调度器通常与 LR 调度器一起使用,例如,在 3D 检测中使用以下配置来加速收敛。有关更多详细信息,请参考CyclicLrUpdater和CyclicMomentumUpdater的实现。

lr_config = dict(policy='cyclic',target_ratio=(10, 1e-4),cyclic_times=1,step_ratio_up=0.4,
)
momentum_config = dict(policy='cyclic',target_ratio=(0.85 / 0.95, 1),cyclic_times=1,step_ratio_up=0.4,
)

2、定制训练计划

默认情况下,我们使用Step学习率with 1x schedule,这在 MMCV 中调用StepLRHook。我们支持许多其他学习率计划,例如CosineAnnealing and Poly schedule。以下是一些示例:
Poly schedule:

lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)

ConsineAnnealing schedule

lr_config = dict(policy='CosineAnnealing',warmup='linear',warmup_iters=1000,warmup_ratio=1.0 / 10,min_lr_ratio=1e-5)

3、修改默认运行时挂钩

有一些常见的钩子没有通过custom_hooks注册,它们是
log_config
checkpoint_config
evaluation
lr_config
optimizer_config
momentum_config

在这些钩子中,只有 logger 钩子具有VERY_LOW优先级,其他钩子的优先级为NORMAL。上面的教程已经介绍了如何修改optimizer_config、momentum_config和lr_config。这里我们揭示了如何使用log_config、checkpoint_config和evaluation。

检查点配置

MMCV 运行器将使用CheckpointHook来初始化checkpoint_config。
在这里插入图片描述

checkpoint_config = dict(interval=1)

用户可以设置max_keep_ckpts只保存少量检查点,或者决定是否存储优化器的状态字典save_optimizer。

日志配置

logger hook封装log_config了多个 并允许设置间隔。现在 MMCV 支持WandbLoggerHook、MlflowLoggerHook和TensorboardLoggerHook。

log_config = dict(interval=50,hooks=[dict(type='TextLoggerHook'),dict(type='TensorboardLoggerHook')])
评估配置

EvalHook的配置evaluation将用于初始化。除了密钥之外interval,其他参数(例如)metric将传递给dataset.evaluate()

evaluation = dict(interval=1, metric='bbox')

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.rhkb.cn/news/378340.html

如若内容造成侵权/违法违规/事实不符,请联系长河编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

<Rust>egui部件学习:如何在窗口及部件显示中文字符?

前言 本专栏是关于Rust的GUI库egui的部件讲解及应用实例分析&#xff0c;主要讲解egui的源代码、部件属性、如何应用。 环境配置 系统&#xff1a;windows 平台&#xff1a;visual studio code 语言&#xff1a;rust 库&#xff1a;egui、eframe 概述 本文是本专栏的第一篇博…

2024-07-16 Unity插件 Odin Inspector5 —— Conditional Attributes

文章目录 1 说明2 条件特性2.1 DisableIf / EnableIf2.2 DisableIn / EnableIn / ShowIn / HideIn2.3 DisableInEditorMode / HideInEditorMode2.4 DisableInInlineEditors / ShowInInlineEditors2.5 DisableInPlayMode / HideInPlayMode2.6 ShowIf / HideIf2.7 ShowIfGroup / …

鸿蒙架构之AOP

零、主要内容 AOP 简介ArkTs AOP 实现原理 JS 原型链AOP实现原理 AOP的应用场景 统计类&#xff1a; 方法调用次数统计、方法时长统计防御式编程&#xff1a;参数校验代理模式实现 AOP的注意事项 一、AOP简介 对于Android、Java Web 开发者来说&#xff0c; AOP编程思想并不…

java智慧工地云平台源码,基于BIM+AI智能大数据中心和云平台的智慧工地监管平台

智慧工地云平台源码&#xff0c;智慧工地系统源码&#xff0c;工程管理系统APP源码&#xff0c; “智慧工地”基于BIM&#xff08;建筑信息模型&#xff09;AI&#xff08;人工智能&#xff09;智能大数据中心和云平台&#xff0c;围绕建筑工程项目全生命周期&#xff0c;集成安…

Linux下如何安装配置Graylog日志管理工具

Graylog是一个开源的日志管理工具&#xff0c;可以帮助我们收集、存储和分析大量的日志数据。它提供了强大的搜索、过滤和可视化功能&#xff0c;可以帮助我们轻松地监控系统和应用程序的运行情况。 在Linux系统下安装和配置Graylog主要包括以下几个步骤&#xff1a; 准备安装…

Scratch编程乐园:108课打造小小编程大师

《Scratch少儿趣味编程108例&#xff08;全视频微课版&#xff09;》以Scratch 3.6版本为基础&#xff0c;通过108个案例详细介绍了运用Scratch软件制作动画、游戏等趣味作品的方法&#xff0c;充分培养孩子的想象力和创造力。本书共分为9章&#xff0c;第1章概述Scratch下载、…

ArrayList.subList的踩坑

需求描述&#xff1a;跳过list中的第一个元素&#xff0c;获取list中的其他元素 原始代码如下&#xff1a; List<FddxxEnterpriseVerify> companyList fddxxEnterpriseVerifyMapper.selectList(companyQueryWrapper);log.info("获取多个法大大公司数据量为&#…

OMS 2.0至3.0升级项目成功案例:红袖女装

近日&#xff0c;巨益科技成功助力女装品牌红袖完成OMS 3.0升级&#xff0c;并顺利通过项目验收。此次升级通过优化系统架构、提高数据处理能力和实现多系统集成&#xff0c;红袖品牌显著提升了订单处理速度、库存管理精度和客户满意度&#xff0c;实现了运营效率和服务质量的全…

基于Python+Flask+SQLite的豆瓣电影可视化系统

FlaskMySQLEcharts 基于PythonFlaskSQLite的豆瓣电影可视化系统 Echarts 不支持登录注册&#xff0c;并且信息存储在数据库中 不含爬虫代码&#xff0c;或爬虫代码已失效 简介 基于PythonFlaskMySQL的豆瓣电影可视化系统&#xff0c;采用Echart构建图表&#xff0c;支持自定…

python 算法题之,统计不存在的值的累加和

s list(map(int, input().split())) k int(input()) s.sort() print(s)if s:m 0 # 统计找到的不存在的数的个数res 0 # 累值t 1 # 当前数i 0 # 列表中当前下标while True:if i < len(s) and s[i] t: # 如果当前数存在i 1else: # 当前数不存在res (res t) % …

第九课:服务器发布(静态nat配置)

一个要用到静态NAT的场景&#xff0c;当内网有一台服务器server1&#xff0c;假如一个用户在外网&#xff0c;从外网访问内网怎么访问呢&#xff0c;无法访问&#xff0c;这是因为外网没办法直接访问内网&#xff0c;这时候需要给服务器做一个静态NAT。 静态NAT指的是给服务器…

cpp 强制转换

一、static_cast static_cast 是 C 中的一个类型转换操作符&#xff0c;用于在类的层次结构中进行安全的向上转换&#xff08;从派生类到基类&#xff09;或进行不需要运行时类型检查的转换。它主要用于基本数据类型之间的转换、对象指针或引用的向上转换&#xff08;即从派生…

AI聊天可能涉黄?用户该如何对待AI聊天

AI伴侣是生成式大模型发展的产物&#xff0c;它是一个聊天机器人&#xff0c;能够随叫随到&#xff0c;提供情绪价值&#xff0c;还能发腿照和腹肌照。它可以是对现实或小说中某个人物的角色扮演&#xff0c;也可以是凭空创造出来的一个形象&#xff0c;总之不是真人。但因为接…

【学习】美国虚拟信用卡申请流程

WildCard 官方网址&#xff1a;https://bewildcard.com/i/PEACEFUL &#xff08;使用邀请码“PEACEFUL”可以享受开卡88 折优惠&#xff0c;注册时提示填写邀请码就可以填写&#xff09;

某服务商云服务器使用体验

雨云 雨云云服务商以其免费MC面板服务器、抗DDoS攻击服务和免费CDN服务三大核心优势&#xff0c;成为了众多企业和个人站长的首选云服务提供商。官网&#xff1a;https://app.rainyun.com 莫名的好感&#x1f603; 登录环节 在阿里&#xff0c;腾讯的官网二次进入时大多时…

深度学习驱动智能超材料设计与应用

在深度学习与超材料融合的背景下&#xff0c;不仅提高了设计的效率和质量&#xff0c;还为实现定制化和精准化的治疗提供了可能&#xff0c;展现了在材料科学领域的巨大潜力。深度学习可以帮助实现超材料结构参数的优化、电磁响应的预测、拓扑结构的自动设计、相位的预测及结构…

Hive 函数

分类 Hive 的函数分为两大类&#xff1a;内置函数&#xff08;Built-in-Functions&#xff09;、用户自定义函数&#xff08;User-Defined-Functions&#xff09;&#xff1b;内置函数可分为&#xff1a;数值类型函数、日期类型函数、字符串类型函数、集合函数等&#xff1b;用…

Redis-基础概念

目录 概念 Redis是什么 Redis 和 MySQL 的区别&#xff1f; Redis单线程有什么极端场景的瓶颈 Redis为什么快? 为什么Redis是单线程? Redis是单线程还是多线程 Redis为什么选择单线程做核心处理 Redis6.0之后引入了多线程&#xff0c;你知道为什么吗? 瓶颈是内存和I…

php相关

php相关 ​ 借鉴了小迪安全以及各位大佬的博客&#xff0c;如果一切顺利&#xff0c;会不定期更新。 如果感觉不妥&#xff0c;可以私信删除。 默认有php基础。 文章目录 php相关1. php 缺陷函数1. 与2. MD53. intval()4. preg_match() 2. php特性1. php字符串解析特性2. 杂…

用switch实现多分支选择结构

一 例子引入 #include<stdio.h> int main&#xff08;) {char grade&#xff1b;scanf("%c"&#xff0c;&grade);printf("Your score:");switch (grade){case A: printf("85~100\n"); break;case B: printf("70~84\n");br…