1. The COCO128 dataset
We use the recently popular YOLOv8 as our example. First, a recap of the installation steps from earlier:
%pip install ultralytics
import ultralytics
ultralytics.checks()
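The dataset used for training here is COCO128.
COCO128 is a small tutorial dataset made up of the first 128 images of COCO train2017.
The coco128.yaml file that ships with YOLO defines:
1) an optional download command/URL for automatic download,
2) the path to the training image directory (or to a *.txt file listing the training images),
3) the same for the validation images,
4) the number of classes,
5) the list of class names: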
# download command/URL (optional)
download: https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../coco128/images/train2017/
val: ../coco128/images/train2017/
# number of classes
nc: 80
# class names
names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light','fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow','elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee','skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard','tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple','sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch','potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']
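A quick sanity check for a config like this is that `nc` equals the number of entries in `names` and that both `train` and `val` are set. The sketch below assumes the YAML has already been loaded into a plain dict (for example with PyYAML's `yaml.safe_load`). The function name `check_dataset_config` and the shortened three-class config are made up for illustration and are not part of Ultralytics.

```python
def check_dataset_config(cfg: dict) -> list[str]:
    """Return a list of problems found in a YOLO-style dataset config dict."""
    problems = []
    # train/val may be a directory, a *.txt file, or a list of directories
    for key in ("train", "val"):
        if not cfg.get(key):
            problems.append(f"missing '{key}' entry")
    names = cfg.get("names", [])
    nc = cfg.get("nc")
    if nc != len(names):
        problems.append(f"nc={nc} but {len(names)} class names listed")
    if len(set(names)) != len(names):
        problems.append("duplicate class names")
    return problems


# Hypothetical, shortened config (the real coco128.yaml lists 80 names)
example = {
    "train": "../coco128/images/train2017/",
    "val": "../coco128/images/train2017/",
    "nc": 3,
    "names": ["person", "bicycle", "car"],
}
print(check_dataset_config(example))  # []
```

Note that in coco128.yaml `train` and `val` point at the same directory, so the validation metrics reported later are computed on the training images themselves.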
2. Training
!yolo train model=yolov8n.pt data=coco128.yaml epochs=10 imgsz=640
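The same run can also be started from Python rather than the CLI. This is a minimal sketch using the `YOLO` class from the `ultralytics` package. The wrapper function `train_coco128` is a name invented here, and the import is deferred so the definition can be loaded even where `ultralytics` is not installed.

```python
def train_coco128(weights: str = "yolov8n.pt", epochs: int = 10, imgsz: int = 640):
    """Fine-tune a YOLOv8 model on COCO128 via the Python API."""
    # Imported lazily so this definition loads without ultralytics installed
    from ultralytics import YOLO

    model = YOLO(weights)  # loads the pretrained weights (downloads them if missing)
    # Same settings as the CLI call: dataset config, epochs and image size
    return model.train(data="coco128.yaml", epochs=epochs, imgsz=imgsz)
```

Calling `train_coco128()` in a notebook cell should behave like the command above. What the call returns depends on the installed ultralytics version, so treat the return value as unspecified.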
The training output is:
layer  from          n  params   module                                arguments
0      -1            1  464      ultralytics.nn.modules.conv.Conv      [3, 16, 3, 2]
1      -1            1  4672     ultralytics.nn.modules.conv.Conv      [16, 32, 3, 2]
2      -1            1  7360     ultralytics.nn.modules.block.C2f      [32, 32, 1, True]
3      -1            1  18560    ultralytics.nn.modules.conv.Conv      [32, 64, 3, 2]
4      -1            2  49664    ultralytics.nn.modules.block.C2f      [64, 64, 2, True]
5      -1            1  73984    ultralytics.nn.modules.conv.Conv      [64, 128, 3, 2]
6      -1            2  197632   ultralytics.nn.modules.block.C2f      [128, 128, 2, True]
7      -1            1  295424   ultralytics.nn.modules.conv.Conv      [128, 256, 3, 2]
8      -1            1  460288   ultralytics.nn.modules.block.C2f      [256, 256, 1, True]
9      -1            1  164608   ultralytics.nn.modules.block.SPPF     [256, 256, 5]
10     -1            1  0        torch.nn.modules.upsampling.Upsample  [None, 2, 'nearest']
11     [-1, 6]       1  0        ultralytics.nn.modules.conv.Concat    [1]
12     -1            1  148224   ultralytics.nn.modules.block.C2f      [384, 128, 1]
13     -1            1  0        torch.nn.modules.upsampling.Upsample  [None, 2, 'nearest']
14     [-1, 4]       1  0        ultralytics.nn.modules.conv.Concat    [1]
15     -1            1  37248    ultralytics.nn.modules.block.C2f      [192, 64, 1]
16     -1            1  36992    ultralytics.nn.modules.conv.Conv      [64, 64, 3, 2]
17     [-1, 12]      1  0        ultralytics.nn.modules.conv.Concat    [1]
18     -1            1  123648   ultralytics.nn.modules.block.C2f      [192, 128, 1]
19     -1            1  147712   ultralytics.nn.modules.conv.Conv      [128, 128, 3, 2]
20     [-1, 9]       1  0        ultralytics.nn.modules.conv.Concat    [1]
21     -1            1  493056   ultralytics.nn.modules.block.C2f      [384, 256, 1]
22     [15, 18, 21]  1  897664   ultralytics.nn.modules.head.Detect    [80, [64, 128, 256]]
Model summary: 225 layers, 3157200 parameters, 3157184 gradients
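The per-layer counts can be checked against the reported total with a little arithmetic. The sketch below assumes each plain Conv row is a bias-free k x k convolution followed by a BatchNorm with a learned weight and bias per output channel. The table's numbers agree with that assumption. The helper name `conv_bn_params` is invented for this example.

```python
def conv_bn_params(c_in: int, c_out: int, k: int) -> int:
    """Parameters of a bias-free k x k conv plus a BatchNorm (weight + bias)."""
    return c_in * c_out * k * k + 2 * c_out


# (c_in, c_out, kernel, reported params) for the plain Conv rows of the table
conv_rows = [
    (3, 16, 3, 464),
    (16, 32, 3, 4672),
    (32, 64, 3, 18560),
    (64, 128, 3, 73984),
    (128, 256, 3, 295424),
    (64, 64, 3, 36992),
    (128, 128, 3, 147712),
]
for c_in, c_out, k, reported in conv_rows:
    assert conv_bn_params(c_in, c_out, k) == reported

# "params" column for layers 0..22, copied from the table
layer_params = [
    464, 4672, 7360, 18560, 49664, 73984, 197632, 295424, 460288, 164608,
    0, 0, 148224, 0, 0, 37248, 36992, 0, 123648, 147712, 0, 493056, 897664,
]
total = sum(layer_params)
print(total)  # 3157200
```

The sum matches the 3157200 parameters in the summary line. The summary also lists 16 fewer gradients than parameters, which means 16 parameters are frozen. The table by itself does not show which ones.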
Transferred 355/355 items from pretrained weights
TensorBoard: Start with 'tensorboard --logdir runs/detect/train', view at http://localhost:6006/
AMP: running Automatic Mixed Precision (AMP) checks with YOLOv8n...
AMP: checks passed ✅
train: Scanning /kaggle/working/datasets/coco128/labels/train2017.cache... 126 i
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
val: Scanning /kaggle/working/datasets/coco128/labels/train2017.cache... 126 ima
Plotting labels to runs/detect/train/labels.jpg...
optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 2 dataloader workers
Logging results to runs/detect/train
Starting training for 10 epochs...
Closing dataloader mosaic
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
Per-epoch training losses, followed by the validation metrics on all classes (128 images, 929 instances):
Epoch  GPU_mem  box_loss  cls_loss  dfl_loss  Instances  Size  Box(P  R      mAP50  mAP50-95)
1/10   2.61G    1.153     1.398     1.192     81         640   0.688  0.506  0.61   0.446
2/10   2.56G    1.142     1.345     1.202     121        640   0.678  0.525  0.63   0.456
3/10   2.57G    1.147     1.25      1.175     108        640   0.656  0.548  0.64   0.466
4/10   2.57G    1.149     1.287     1.177     116        640   0.684  0.568  0.654  0.482
5/10   2.57G    1.169     1.233     1.207     68         640   0.664  0.586  0.668  0.491
6/10   2.57G    1.139     1.231     1.177     95         640   0.66   0.613  0.677  0.5
7/10   2.57G    1.134     1.211     1.181     115        640   0.649  0.631  0.683  0.504
8/10   2.57G    1.114     1.194     1.178     71         640   0.664  0.634  0.69   0.513
9/10   2.57G    1.117     1.127     1.148     142        640   0.624  0.671  0.697  0.52
10/10  2.57G    1.085     1.133     1.172     104        640   0.631  0.676  0.704  0.522
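Over these ten epochs precision drifts down while recall climbs, so a single combined number makes the trend easier to read. The sketch below computes the F1 score (the harmonic mean of precision and recall) for each epoch from the values in the log. F1 is not printed by the training run itself; it is derived here only for illustration.

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# (precision, recall) on "all" classes after each epoch, copied from the log
pr_by_epoch = [
    (0.688, 0.506), (0.678, 0.525), (0.656, 0.548), (0.684, 0.568),
    (0.664, 0.586), (0.660, 0.613), (0.649, 0.631), (0.664, 0.634),
    (0.624, 0.671), (0.631, 0.676),
]
for epoch, (p, r) in enumerate(pr_by_epoch, start=1):
    print(f"epoch {epoch:2d}: P={p:.3f} R={r:.3f} F1={f1(p, r):.3f}")
```

The F1 value is higher at epoch 10 than at epoch 1, so the recall gain outweighs the precision loss in this run.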
10 epochs completed in 0.018 hours.
Optimizer stripped from runs/detect/train/weights/last.pt, 6.5MB
Optimizer stripped from runs/detect/train/weights/best.pt, 6.5MB
Validating runs/detect/train/weights/best.pt...
Ultralytics YOLOv8.0.128 🚀 Python-3.10.10 torch-2.0.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)
Model summary (fused): 168 layers, 3151904 parameters, 0 gradients
Class           Images  Instances  Box(P   R       mAP50   mAP50-95)
all             128     929        0.629   0.677   0.704   0.523
person          128     254        0.763   0.721   0.778   0.569
bicycle         128     6          0.765   0.333   0.391   0.321
car             128     46         0.487   0.217   0.322   0.192
motorcycle      128     5          0.613   0.8     0.906   0.732
airplane        128     6          0.842   1       0.972   0.809
bus             128     7          0.832   0.714   0.712   0.61
train           128     3          0.52    1       0.995   0.858
truck           128     12         0.597   0.5     0.547   0.373
boat            128     6          0.526   0.167   0.448   0.328
traffic light   128     14         0.471   0.214   0.184   0.145
stop sign       128     2          0.671   1       0.995   0.647
bench           128     9          0.675   0.695   0.72    0.489
bird            128     16         0.936   0.921   0.961   0.67
cat             128     4          0.818   1       0.995   0.772
dog             128     9          0.68    0.889   0.908   0.722
horse           128     2          0.441   1       0.828   0.497
elephant        128     17         0.742   0.848   0.933   0.71
bear            128     1          0.461   1       0.995   0.995
zebra           128     4          0.85    1       0.995   0.972
giraffe         128     9          0.824   1       0.995   0.772
backpack        128     6          0.596   0.333   0.394   0.257
umbrella        128     18         0.564   0.722   0.681   0.429
handbag         128     19         0.635   0.185   0.326   0.178
tie             128     7          0.671   0.714   0.758   0.522
suitcase        128     4          0.687   1       0.945   0.603
frisbee         128     5          0.52    0.8     0.799   0.689
skis            128     1          0.694   1       0.995   0.497
snowboard       128     7          0.499   0.714   0.732   0.589
sports ball     128     6          0.747   0.494   0.573   0.342
kite            128     10         0.539   0.5     0.504   0.181
baseball bat    128     4          0.595   0.5     0.509   0.253
baseball glove  128     7          0.808   0.429   0.431   0.318
skateboard      128     5          0.493   0.6     0.609   0.465
tennis racket   128     7          0.451   0.286   0.446   0.274
bottle          128     18         0.4     0.389   0.365   0.257
wine glass      128     16         0.597   0.557   0.675   0.366
cup             128     36         0.586   0.389   0.465   0.338
fork            128     6          0.582   0.167   0.306   0.234
knife           128     16         0.621   0.625   0.669   0.405
spoon           128     22         0.525   0.364   0.41    0.227
bowl            128     28         0.657   0.714   0.719   0.584
banana          128     1          0.319   1       0.497   0.0622
sandwich        128     2          0.812   1       0.995   0.995
orange          128     4          0.784   1       0.895   0.594
broccoli        128     11         0.431   0.273   0.339   0.26
carrot          128     24         0.553   0.833   0.801   0.504
hot dog         128     2          0.474   1       0.995   0.946
pizza           128     5          0.736   1       0.995   0.882
donut           128     14         0.574   1       0.929   0.85
cake            128     4          0.769   1       0.995   0.89
chair           128     35         0.503   0.571   0.542   0.307
couch           128     6          0.526   0.667   0.805   0.612
potted plant    128     14         0.479   0.786   0.784   0.545
bed             128     3          0.714   1       0.995   0.83
dining table    128     13         0.451   0.615   0.552   0.437
toilet          128     2          1       0.942   0.995   0.946
tv              128     2          0.622   1       0.995   0.846
laptop          128     3          1       0.452   0.863   0.738
mouse           128     2          1       0       0.0459  0.00459
remote          128     8          0.736   0.5     0.62    0.527
cell phone      128     8          0.0541  0.027   0.0731  0.043
microwave       128     3          0.773   0.667   0.913   0.807
oven            128     5          0.442   0.483   0.433   0.336
sink            128     6          0.378   0.167   0.336   0.231
refrigerator    128     5          0.662   0.786   0.778   0.616
book            128     29         0.47    0.336   0.402   0.23
clock           128     9          0.76    0.778   0.884   0.762
vase            128     2          0.428   1       0.828   0.745
scissors        128     1          0.911   1       0.995   0.256
teddy bear      128     21         0.551   0.667   0.805   0.515
toothbrush      128     5          0.768   1       0.995   0.65
Speed: 3.4ms preprocess, 1.9ms inference, 0.0ms loss, 2.4ms postprocess per image
Results saved to runs/detect/train
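The "Speed:" line of the training output can be turned into a rough per-image latency and throughput figure. This is plain arithmetic on the logged numbers. It describes this particular validation pass on the Tesla P100 named in the log, not a general benchmark, and real end-to-end throughput also depends on batch size and data loading.

```python
# Per-image timings in milliseconds, copied from the "Speed:" line
timings_ms = {"preprocess": 3.4, "inference": 1.9, "loss": 0.0, "postprocess": 2.4}

total_ms = sum(timings_ms.values())
images_per_second = 1000.0 / total_ms

print(f"total: {total_ms:.1f} ms per image")         # total: 7.7 ms per image
print(f"throughput: {images_per_second:.0f} img/s")  # throughput: 130 img/s
```

Inference accounts for only 1.9 ms of the 7.7 ms. Preprocessing and postprocessing together take more time than the network forward pass in this run.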
3. Validation
!yolo val model=yolov8n.pt data=coco128.yaml
The output is:
Class           Images  Instances  Box(P   R       mAP50   mAP50-95)
all             128     929        0.64    0.537   0.605   0.446
person          128     254        0.797   0.677   0.764   0.538
bicycle         128     6          0.514   0.333   0.315   0.264
car             128     46         0.813   0.217   0.273   0.168
motorcycle      128     5          0.687   0.887   0.898   0.685
airplane        128     6          0.82    0.833   0.927   0.675
bus             128     7          0.491   0.714   0.728   0.671
train           128     3          0.534   0.667   0.706   0.604
truck           128     12         1       0.332   0.473   0.297
boat            128     6          0.226   0.167   0.316   0.134
traffic light   128     14         0.734   0.2     0.202   0.139
stop sign       128     2          1       0.992   0.995   0.701
bench           128     9          0.839   0.582   0.62    0.365
bird            128     16         0.921   0.728   0.864   0.51
cat             128     4          0.875   1       0.995   0.791
dog             128     9          0.603   0.889   0.785   0.585
horse           128     2          0.597   1       0.995   0.518
elephant        128     17         0.849   0.765   0.9     0.679
bear            128     1          0.593   1       0.995   0.995
zebra           128     4          0.848   1       0.995   0.965
giraffe         128     9          0.72    1       0.951   0.722
backpack        128     6          0.589   0.333   0.376   0.232
umbrella        128     18         0.804   0.5     0.643   0.414
handbag         128     19         0.424   0.0526  0.165   0.0889
tie             128     7          0.804   0.714   0.674   0.476
suitcase        128     4          0.635   0.883   0.745   0.534
frisbee         128     5          0.675   0.8     0.759   0.688
skis            128     1          0.567   1       0.995   0.497
snowboard       128     7          0.742   0.714   0.747   0.5
sports ball     128     6          0.716   0.433   0.485   0.278
kite            128     10         0.817   0.45    0.569   0.184
baseball bat    128     4          0.551   0.25    0.353   0.175
baseball glove  128     7          0.624   0.429   0.429   0.293
skateboard      128     5          0.846   0.6     0.6     0.41
tennis racket   128     7          0.726   0.387   0.487   0.33
bottle          128     18         0.448   0.389   0.376   0.208
wine glass      128     16         0.743   0.362   0.584   0.333
cup             128     36         0.58    0.278   0.404   0.29
fork            128     6          0.527   0.167   0.246   0.184
knife           128     16         0.564   0.5     0.59    0.36
spoon           128     22         0.597   0.182   0.328   0.19
bowl            128     28         0.648   0.643   0.618   0.491
banana          128     1          0       0       0.124   0.0379
sandwich        128     2          0.249   0.5     0.308   0.308
orange          128     4          1       0.31    0.995   0.623
broccoli        128     11         0.374   0.182   0.249   0.203
carrot          128     24         0.648   0.458   0.572   0.362
hot dog         128     2          0.351   0.553   0.745   0.721
pizza           128     5          0.644   1       0.995   0.843
donut           128     14         0.657   1       0.94    0.864
cake            128     4          0.618   1       0.945   0.845
chair           128     35         0.506   0.514   0.442   0.239
couch           128     6          0.463   0.5     0.706   0.555
potted plant    128     14         0.65    0.643   0.711   0.472
bed             128     3          0.698   0.667   0.789   0.625
dining table    128     13         0.432   0.615   0.485   0.366
toilet          128     2          0.615   0.5     0.695   0.676
tv              128     2          0.373   0.62    0.745   0.696
laptop          128     3          1       0       0.451   0.361
mouse           128     2          1       0       0.0625  0.00625
remote          128     8          0.843   0.5     0.605   0.529
cell phone      128     8          0       0       0.0549  0.0393
microwave       128     3          0.435   0.667   0.806   0.718
oven            128     5          0.412   0.4     0.339   0.27
sink            128     6          0.35    0.167   0.182   0.129
refrigerator    128     5          0.589   0.4     0.604   0.452
book            128     29         0.629   0.103   0.346   0.178
clock           128     9          0.788   0.83    0.875   0.74
vase            128     2          0.376   1       0.828   0.795
scissors        128     1          1       0       0.249   0.0746
teddy bear      128     21         0.877   0.333   0.591   0.394
toothbrush      128     5          0.743   0.6     0.638   0.374
Speed: 1.0ms preprocess, 8.5ms inference, 0.0ms loss, 1.6ms postprocess per image
Results saved to runs/detect/val
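The validation command in this section loads yolov8n.pt, the pretrained weights, rather than the fine-tuned runs/detect/train/weights/best.pt. Its "all" row therefore serves as a baseline for the validation pass that ran at the end of training. The sketch below compares the two rows using the numbers from the logs. The variable names are chosen here for illustration.

```python
# "all" row (P, R, mAP50, mAP50-95), copied from the two validation tables
pretrained = {"P": 0.640, "R": 0.537, "mAP50": 0.605, "mAP50-95": 0.446}
fine_tuned = {"P": 0.629, "R": 0.677, "mAP50": 0.704, "mAP50-95": 0.523}

# Absolute change per metric after 10 epochs of fine-tuning
deltas = {k: round(fine_tuned[k] - pretrained[k], 3) for k in pretrained}
for name, delta in deltas.items():
    print(f"{name:9s} {pretrained[name]:.3f} -> {fine_tuned[name]:.3f} ({delta:+.3f})")
```

Recall, mAP50 and mAP50-95 all rise while precision dips slightly. Because coco128.yaml uses the same images for train and val, these gains are measured on images the model was trained on and say little about generalization. To score the fine-tuned weights directly, pass `model=runs/detect/train/weights/best.pt` to `yolo val`.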
The visualized results are: