kaggle视频行为分析1st and Future - Player Contact Detection

这次比赛的目标是检测美式橄榄球NFL比赛中球员经历的外部接触。您将使用视频和球员追踪数据来识别发生接触的时刻,以帮助提高球员的安全。两种接触,一种是人与人的,另一种是人与地面,不包括脚底和地面的,跟我之前做的这个是同一个主办方举行的

kaggle视频追踪NFL Health & Safety - Helmet Assignment-CSDN博客

之前做的是视频追踪,用的deepsort,这一场比赛用的2.5DCNN。

EDA部分

eda可以参考这一个notebook,用的fasteda,挺方便的

NFL Player Contact Detection EDA 🏈 | Kaggle

视频数据在test和train文件夹里面,还提供了这一个train_baseline_helmets.csv,是由上一次比赛的冠军方案产生的,是我之前做的视频追踪,train_player_tracking.csv 的频率是10HZ,视频是59.94HZ,之后要进行转换,snap 事件也就是比赛开始发生在视频的第五秒

train_labels.csv

  • step: A number representing each each timestep for each play, starting at 0 at the moment of the play starting, and incrementing by 1 every 0.1 seconds.
  • 之前说的比赛第5秒开始,一个step是0.1秒

接触发生以10HZ记录

[train/test]_player_tracking.csv

  • datetime: timestamp at 10 Hz.

[train/test]_video_metadata.csv

be used to sync with player tracking data.和视频是同步的

训练部分

我自己租卡跑,20多个小时,10个epoch,我上传到kaggle,链接如下

track_weight | Kaggle

额外要用的一个数据集如下,我用的的4090显卡20核跑的,你要自己训练的话要自己修改一下

timm-0.6.9 | Kaggle

导入包

import os
import sys
import glob
import numpy as np
import pandas as pd
import random
import math
import gc
import cv2
from tqdm import tqdm
import time
from functools import lru_cache
import torch
from torch import nn
from torch.nn import functional as F
from torch.utils.data import Dataset, DataLoader
from torch.cuda.amp import autocast, GradScaler
import timm
import albumentations as A
from albumentations.pytorch import ToTensorV2
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from timm.scheduler import CosineLRScheduler
sys.path.append('../input/timm-0-6-9/pytorch-image-models-master')

配置

CFG = {'seed': 42,'model': 'convnext_small.fb_in1k','img_size': 256,'epochs': 10,'train_bs': 48, 'valid_bs': 32,'lr': 1e-3, 'weight_decay': 1e-6,'num_workers': 20,'max_grad_norm' : 1000,'epochs_warmup' : 3.0
}

我用的convnext,这个网络是原本的cnn根据vit模型去反复修改的,有兴趣自己去找论文看,但论文也就是在那反复调

设置种子和device

def seed_everything(seed):random.seed(seed)os.environ['PYTHONHASHSEED'] = str(seed)np.random.seed(seed)torch.manual_seed(seed)torch.cuda.manual_seed(seed)torch.cuda.manual_seed_all(seed)torch.backends.cudnn.deterministic = Truetorch.backends.cudnn.benchmark = Falseseed_everything(CFG['seed'])
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

添加一些额外的列和读取数据

def expand_contact_id(df):"""Splits out contact_id into seperate columns."""df["game_play"] = df["contact_id"].str[:12]df["step"] = df["contact_id"].str.split("_").str[-3].astype("int")df["nfl_player_id_1"] = df["contact_id"].str.split("_").str[-2]df["nfl_player_id_2"] = df["contact_id"].str.split("_").str[-1]return df
labels = expand_contact_id(pd.read_csv("../input/nfl-player-contact-detection/train_labels.csv"))
train_tracking = pd.read_csv("../input/nfl-player-contact-detection/train_player_tracking.csv")
train_helmets = pd.read_csv("../input/nfl-player-contact-detection/train_baseline_helmets.csv")
train_video_metadata = pd.read_csv("../input/nfl-player-contact-detection/train_video_metadata.csv")

将视频数据转化为图像数据

import subprocess
from tqdm import tqdm# 假设 train_helmets 是一个包含视频文件名的 DataFrame
for video in tqdm(train_helmets.video.unique()):if 'Endzone2' not in video:# 输入视频路径input_path = f'/openbayes/home/train/{video}'# 输出帧路径output_path = f'/openbayes/train/frames/{video}_%04d.jpg'# 构建 ffmpeg 命令command = ['ffmpeg','-i', input_path,  # 输入视频文件'-q:v', '5',       # 设置输出图像质量'-f', 'image2',    # 输出为图像序列output_path,       # 输出图像路径'-hide_banner',    # 隐藏 ffmpeg 的 banner 信息'-loglevel', 'error'  # 只显示错误日志]# 执行命令subprocess.run(command, check=True)

可以自己修改那里的质量,在kaggle上不能训练,要你自己租卡才跑的动

创建一些特征

def create_features(df, tr_tracking, merge_col="step", use_cols=["x_position", "y_position"]):output_cols = []df_combo = (df.astype({"nfl_player_id_1": "str"}).merge(tr_tracking.astype({"nfl_player_id": "str"})[["game_play", merge_col, "nfl_player_id",] + use_cols],left_on=["game_play", merge_col, "nfl_player_id_1"],right_on=["game_play", merge_col, "nfl_player_id"],how="left",).rename(columns={c: c+"_1" for c in use_cols}).drop("nfl_player_id", axis=1).merge(tr_tracking.astype({"nfl_player_id": "str"})[["game_play", merge_col, "nfl_player_id"] + use_cols],left_on=["game_play", merge_col, "nfl_player_id_2"],right_on=["game_play", merge_col, "nfl_player_id"],how="left",).drop("nfl_player_id", axis=1).rename(columns={c: c+"_2" for c in use_cols}).sort_values(["game_play", merge_col, "nfl_player_id_1", "nfl_player_id_2"]).reset_index(drop=True))output_cols += [c+"_1" for c in use_cols]output_cols += [c+"_2" for c in use_cols]if ("x_position" in use_cols) & ("y_position" in use_cols):index = df_combo['x_position_2'].notnull()distance_arr = np.full(len(index), np.nan)tmp_distance_arr = np.sqrt(np.square(df_combo.loc[index, "x_position_1"] - df_combo.loc[index, "x_position_2"])+ np.square(df_combo.loc[index, "y_position_1"]- df_combo.loc[index, "y_position_2"]))distance_arr[index] = tmp_distance_arrdf_combo['distance'] = distance_arroutput_cols += ["distance"]df_combo['G_flug'] = (df_combo['nfl_player_id_2']=="G")output_cols += ["G_flug"]return df_combo, output_colsuse_cols = ['x_position', 'y_position', 'speed', 'distance','direction', 'orientation', 'acceleration', 'sa'
]train, feature_cols = create_features(labels, train_tracking, use_cols=use_cols)

label和train_tracking进行合并,这里的feature_cols后面训练要用到

和视频的频率进行同步,过滤一部分数据

train_filtered = train.query('not distance>2').reset_index(drop=True)
train_filtered['frame'] = (train_filtered['step']/10*59.94+5*59.94).astype('int')+1
train_filtered.head()

视频频率是59.94,而数据集是10,这里将距离过大的pair去除

数据增强

train_aug = A.Compose([A.HorizontalFlip(p=0.5),A.ShiftScaleRotate(p=0.5),A.RandomBrightnessContrast(brightness_limit=(-0.1, 0.1), contrast_limit=(-0.1, 0.1), p=0.5),A.Normalize(mean=[0.], std=[1.]),ToTensorV2()
])valid_aug = A.Compose([A.Normalize(mean=[0.], std=[1.]),ToTensorV2()
])

创建字典

video2helmets = {}
train_helmets_new = train_helmets.set_index('video')
for video in tqdm(train_helmets.video.unique()):video2helmets[video] = train_helmets_new.loc[video].reset_index(drop=True)
video2frames = {}for game_play in tqdm(train_video_metadata.game_play.unique()):for view in ['Endzone', 'Sideline']:video = game_play + f'_{view}.mp4'video2frames[video] = max(list(map(lambda x:int(x.split('_')[-1].split('.')[0]), \glob.glob(f'../train/frames/{video}*'))))

取出视频对应的检测数据和每个视频的最大帧数,检测数据后面用来截取图像用的,最大帧数确保抽取的帧不超过这个范围

数据集

class MyDataset(Dataset):def __init__(self, df, aug=train_aug, mode='train'):self.df = dfself.frame = df.frame.valuesself.feature = df[feature_cols].fillna(-1).valuesself.players = df[['nfl_player_id_1','nfl_player_id_2']].valuesself.game_play = df.game_play.valuesself.aug = augself.mode = modedef __len__(self):return len(self.df)# @lru_cache(1024)# def read_img(self, path):#     return cv2.imread(path, 0)def __getitem__(self, idx):   window = 24frame = self.frame[idx]if self.mode == 'train':frame = frame + random.randint(-6, 6)players = []for p in self.players[idx]:if p == 'G':players.append(p)else:players.append(int(p))imgs = []for view in ['Endzone', 'Sideline']:video = self.game_play[idx] + f'_{view}.mp4'tmp = video2helmets[video]
#             tmp = tmp.query('@frame-@window<=frame<=@frame+@window')tmp[tmp['frame'].between(frame-window, frame+window)]tmp = tmp[tmp.nfl_player_id.isin(players)]#.sort_values(['nfl_player_id', 'frame'])tmp_frames = tmp.frame.valuestmp = tmp.groupby('frame')[['left','width','top','height']].mean()
#0.002sbboxes = []for f in range(frame-window, frame+window+1, 1):if f in tmp_frames:x, w, y, h = tmp.loc[f][['left','width','top','height']]bboxes.append([x, w, y, h])else:bboxes.append([np.nan, np.nan, np.nan, np.nan])bboxes = pd.DataFrame(bboxes).interpolate(limit_direction='both').valuesbboxes = bboxes[::4]if bboxes.sum() > 0:flag = 1else:flag = 0
#0.03sfor i, f in enumerate(range(frame-window, frame+window+1, 4)):img_new = np.zeros((256, 256), dtype=np.float32)if flag == 1 and f <= video2frames[video]:img = cv2.imread(f'../train/frames/{video}_{f:04d}.jpg', 0)x, w, y, h = bboxes[i]img = img[int(y+h/2)-128:int(y+h/2)+128,int(x+w/2)-128:int(x+w/2)+128].copy()img_new[:img.shape[0], :img.shape[1]] = imgimgs.append(img_new)
#0.06sfeature = np.float32(self.feature[idx])img = np.array(imgs).transpose(1, 2, 0)    img = self.aug(image=img)["image"]label = np.float32(self.df.contact.values[idx])return img, feature, label

模型

class Model(nn.Module):def __init__(self):super(Model, self).__init__()self.backbone = timm.create_model(CFG['model'], pretrained=True, num_classes=500, in_chans=13)self.mlp = nn.Sequential(nn.Linear(18, 64),nn.LayerNorm(64),nn.ReLU(),nn.Dropout(0.2),)self.fc = nn.Linear(64+500*2, 1)def forward(self, img, feature):b, c, h, w = img.shapeimg = img.reshape(b*2, c//2, h, w)img = self.backbone(img).reshape(b, -1)feature = self.mlp(feature)y = self.fc(torch.cat([img, feature], dim=1))return y

这里len(feature_cols)是18,所以mlp输入是18,在上面

            for i, f in enumerate(range(frame-window, frame+window+1, 4)):img_new = np.zeros((256, 256), dtype=np.float32)if flag == 1 and f <= video2frames[video]:img = cv2.imread(f'/openbayes/train/frames/{video}_{f:04d}.jpg', 0)x, w, y, h = bboxes[i]img = img[int(y+h/2)-128:int(y+h/2)+128,int(x+w/2)-128:int(x+w/2)+128].copy()img_new[:img.shape[0], :img.shape[1]] = imgimgs.append(img_new)

进行了抽帧,每个视角抽了13帧,两个视角,总计26帧,所以输入通道26,跟之前的比赛一样,也是提供两个视角

for view in ['Endzone', 'Sideline']:

损失函数

model = Model()
model.to(device)
model.train()
import torch.nn as nn
criterion = nn.BCEWithLogitsLoss()

这里用的交叉熵

评估指标

def evaluate(model, loader_val, *, compute_score=True, pbar=None):"""Predict and compute loss and score"""tb = time.time()in_training = model.trainingmodel.eval()loss_sum = 0.0n_sum = 0y_all = []y_pred_all = []if pbar is not None:pbar = tqdm(desc='Predict', nrows=78, total=pbar)total= len(loader_val)for ibatch,(img, feature, label) in tqdm(enumerate(loader_val),total = total):# img, feature, label = [x.to(device) for x in batch]img = img.to(device)feature = feature.to(device)n = label.size(0)label = label.to(device)with torch.no_grad():y_pred = model(img, feature)loss = criterion(y_pred.view(-1), label)n_sum += nloss_sum += n * loss.item()if pbar is not None:pbar.update(len(img))del loss, img, labelgc.collect()loss_val = loss_sum / n_sumret = {'loss': loss_val,'time': time.time() - tb}model.train(in_training) gc.collect()return ret

载入数据,设置学习率计划和优化器

train_set,valid_set = train_test_split(train_filtered,test_size=0.05, random_state=42,stratify = train_filtered['contact'])
train_set = MyDataset(train_set, train_aug, 'train')
train_loader = DataLoader(train_set, batch_size=CFG['train_bs'], shuffle=True, num_workers=12, pin_memory=True,drop_last=True)
valid_set = MyDataset(valid_set, valid_aug, 'test')
valid_loader = DataLoader(valid_set, batch_size=CFG['valid_bs'], shuffle=False, num_workers=12, pin_memory=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=CFG['lr'], weight_decay=CFG['weight_decay'])
nbatch = len(train_loader)
warmup = CFG['epochs_warmup'] * nbatch
nsteps = CFG['epochs'] * nbatch 
scheduler = CosineLRScheduler(optimizer,warmup_t=warmup, warmup_lr_init=0.0, warmup_prefix=True,t_initial=(nsteps - warmup), lr_min=1e-6)    

开始训练,这里保存整个模型

for iepoch in range(CFG['epochs']):print('Epoch:', iepoch+1)loss_sum = 0.0n_sum = 0total = len(train_loader)# Trainfor ibatch,(img, feature, label) in tqdm(enumerate(train_loader),total = total):img = img.to(device)feature = feature.to(device)n = label.size(0)label = label.to(device)optimizer.zero_grad()y_pred = model(img, feature).squeeze(-1)loss = criterion(y_pred, label)loss_train = loss.item()loss_sum += n * loss_trainn_sum += nloss.backward()grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(),CFG['max_grad_norm'])optimizer.step()scheduler.step(iepoch * nbatch + ibatch + 1)val = evaluate(model, valid_loader)time_val += val['time']loss_train = loss_sum / n_sumdt = (time.time() - tb) / 60print('Epoch: %d Train Loss: %.4f Test Loss: %.4f Time: %.2f min' %(iepoch + 1, loss_train, val['loss'],dt))if val['loss'] < best_loss:best_loss = val['loss']# Save modelofilename = '/openbayes/home/best_model.pt'torch.save(model, ofilename)print(ofilename, 'written')del valgc.collect()dt = time.time() - tb
print(' %.2f min total, %.2f min val' % (dt / 60, time_val / 60))
gc.collect()

只保留权重可能会出现一些bug,保留整个模型比较稳妥

推理部分

这里我用TTA的版本

导入包

import os
import sys
sys.path.append('/kaggle/input/timm-0-6-9/pytorch-image-models-master')
import glob
import numpy as np
import pandas as pd
import random
import math
import gc
import cv2
from tqdm import tqdm
import time
from functools import lru_cache
import torch
from torch import nn
from torch.nn import functional as F
from torch.utils.data import Dataset, DataLoader
from torch.cuda.amp import autocast, GradScaler
import timm
import albumentations as A
from albumentations.pytorch import ToTensorV2
import matplotlib.pyplot as plt
from sklearn.metrics import matthews_corrcoef

数据处理

这里基本和前面一样,我全部放一起了

CFG = {'seed': 42,'model': 'convnext_small.fb_in1k','img_size': 256,'epochs': 10,'train_bs': 100, 'valid_bs': 64,'lr': 1e-3, 'weight_decay': 1e-6,'num_workers': 4
}
def seed_everything(seed):random.seed(seed)os.environ['PYTHONHASHSEED'] = str(seed)np.random.seed(seed)torch.manual_seed(seed)torch.cuda.manual_seed(seed)torch.cuda.manual_seed_all(seed)torch.backends.cudnn.deterministic = Truetorch.backends.cudnn.benchmark = Falseseed_everything(CFG['seed'])
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
def expand_contact_id(df):"""Splits out contact_id into seperate columns."""df["game_play"] = df["contact_id"].str[:12]df["step"] = df["contact_id"].str.split("_").str[-3].astype("int")df["nfl_player_id_1"] = df["contact_id"].str.split("_").str[-2]df["nfl_player_id_2"] = df["contact_id"].str.split("_").str[-1]return dflabels = expand_contact_id(pd.read_csv("/kaggle/input/nfl-player-contact-detection/sample_submission.csv"))test_tracking = pd.read_csv("/kaggle/input/nfl-player-contact-detection/test_player_tracking.csv")test_helmets = pd.read_csv("/kaggle/input/nfl-player-contact-detection/test_baseline_helmets.csv")test_video_metadata = pd.read_csv("/kaggle/input/nfl-player-contact-detection/test_video_metadata.csv")
!mkdir -p ../work/framesfor video in tqdm(test_helmets.video.unique()):if 'Endzone2' not in video:!ffmpeg -i /kaggle/input/nfl-player-contact-detection/test/{video} -q:v 2 -f image2 /kaggle/work/frames/{video}_%04d.jpg -hide_banner -loglevel error
def create_features(df, tr_tracking, merge_col="step", use_cols=["x_position", "y_position"]):output_cols = []df_combo = (df.astype({"nfl_player_id_1": "str"}).merge(tr_tracking.astype({"nfl_player_id": "str"})[["game_play", merge_col, "nfl_player_id",] + use_cols],left_on=["game_play", merge_col, "nfl_player_id_1"],right_on=["game_play", merge_col, "nfl_player_id"],how="left",).rename(columns={c: c+"_1" for c in use_cols}).drop("nfl_player_id", axis=1).merge(tr_tracking.astype({"nfl_player_id": "str"})[["game_play", merge_col, "nfl_player_id"] + use_cols],left_on=["game_play", merge_col, "nfl_player_id_2"],right_on=["game_play", merge_col, "nfl_player_id"],how="left",).drop("nfl_player_id", axis=1).rename(columns={c: c+"_2" for c in use_cols}).sort_values(["game_play", merge_col, "nfl_player_id_1", "nfl_player_id_2"]).reset_index(drop=True))output_cols += [c+"_1" for c in use_cols]output_cols += [c+"_2" for c in use_cols]if ("x_position" in use_cols) & ("y_position" in use_cols):index = df_combo['x_position_2'].notnull()distance_arr = np.full(len(index), np.nan)tmp_distance_arr = np.sqrt(np.square(df_combo.loc[index, "x_position_1"] - df_combo.loc[index, "x_position_2"])+ np.square(df_combo.loc[index, "y_position_1"]- df_combo.loc[index, "y_position_2"]))distance_arr[index] = tmp_distance_arrdf_combo['distance'] = distance_arroutput_cols += ["distance"]df_combo['G_flug'] = (df_combo['nfl_player_id_2']=="G")output_cols += ["G_flug"]return df_combo, output_colsuse_cols = ['x_position', 'y_position', 'speed', 'distance','direction', 'orientation', 'acceleration', 'sa'
]test, feature_cols = create_features(labels, test_tracking, use_cols=use_cols)
test
test_filtered = test.query('not distance>2').reset_index(drop=True)
test_filtered['frame'] = (test_filtered['step']/10*59.94+5*59.94).astype('int')+1
test_filtered
del test, labels, test_tracking
gc.collect()
train_aug = A.Compose([A.HorizontalFlip(p=0.5),A.ShiftScaleRotate(p=0.5),A.RandomBrightnessContrast(brightness_limit=(-0.1, 0.1), contrast_limit=(-0.1, 0.1), p=0.5),A.Normalize(mean=[0.], std=[1.]),ToTensorV2()
])valid_aug = A.Compose([A.Normalize(mean=[0.], std=[1.]),ToTensorV2()
])
video2helmets = {}
test_helmets_new = test_helmets.set_index('video')
for video in tqdm(test_helmets.video.unique()):video2helmets[video] = test_helmets_new.loc[video].reset_index(drop=True)del test_helmets, test_helmets_new
gc.collect()
video2frames = {}for game_play in tqdm(test_video_metadata.game_play.unique()):for view in ['Endzone', 'Sideline']:video = game_play + f'_{view}.mp4'video2frames[video] = max(list(map(lambda x:int(x.split('_')[-1].split('.')[0]), \glob.glob(f'/kaggle/work/frames/{video}*'))))
class MyDataset(Dataset):def __init__(self, df, aug=valid_aug, mode='train'):self.df = dfself.frame = df.frame.valuesself.feature = df[feature_cols].fillna(-1).valuesself.players = df[['nfl_player_id_1','nfl_player_id_2']].valuesself.game_play = df.game_play.valuesself.aug = augself.mode = modedef __len__(self):return len(self.df)# @lru_cache(1024)# def read_img(self, path):#     return cv2.imread(path, 0)def __getitem__(self, idx):   window = 24frame = self.frame[idx]if self.mode == 'train':frame = frame + random.randint(-6, 6)players = []for p in self.players[idx]:if p == 'G':players.append(p)else:players.append(int(p))imgs = []for view in ['Endzone', 'Sideline']:video = self.game_play[idx] + f'_{view}.mp4'tmp = video2helmets[video]
#             tmp = tmp.query('@frame-@window<=frame<=@frame+@window')tmp[tmp['frame'].between(frame-window, frame+window)]tmp = tmp[tmp.nfl_player_id.isin(players)]#.sort_values(['nfl_player_id', 'frame'])tmp_frames = tmp.frame.valuestmp = tmp.groupby('frame')[['left','width','top','height']].mean()
#0.002sbboxes = []for f in range(frame-window, frame+window+1, 1):if f in tmp_frames:x, w, y, h = tmp.loc[f][['left','width','top','height']]bboxes.append([x, w, y, h])else:bboxes.append([np.nan, np.nan, np.nan, np.nan])bboxes = pd.DataFrame(bboxes).interpolate(limit_direction='both').valuesbboxes = bboxes[::4]if bboxes.sum() > 0:flag = 1else:flag = 0
#0.03sfor i, f in enumerate(range(frame-window, frame+window+1, 4)):img_new = np.zeros((256, 256), dtype=np.float32)if flag == 1 and f <= video2frames[video]:img = cv2.imread(f'/kaggle/work/frames/{video}_{f:04d}.jpg', 0)x, w, y, h = bboxes[i]img = img[int(y+h/2)-128:int(y+h/2)+128,int(x+w/2)-128:int(x+w/2)+128].copy()img_new[:img.shape[0], :img.shape[1]] = imgimgs.append(img_new)
#0.06sfeature = np.float32(self.feature[idx])img = np.array(imgs).transpose(1, 2, 0)    img = self.aug(image=img)["image"]label = np.float32(self.df.contact.values[idx])return img, feature, label

查看截取出来的图片

img, feature, label = MyDataset(test_filtered, valid_aug, 'test')[0]
plt.imshow(img.permute(1,2,0)[:,:,7])
plt.show()
img.shape, feature, label

进行推理

class Model(nn.Module):def __init__(self):super(Model, self).__init__()self.backbone = timm.create_model(CFG['model'], pretrained=False, num_classes=500, in_chans=13)self.mlp = nn.Sequential(nn.Linear(18, 64),nn.LayerNorm(64),nn.ReLU(),nn.Dropout(0.2),# nn.Linear(64, 64),# nn.LayerNorm(64),# nn.ReLU(),# nn.Dropout(0.2))self.fc = nn.Linear(64+500*2, 1)def forward(self, img, feature):b, c, h, w = img.shapeimg = img.reshape(b*2, c//2, h, w)img = self.backbone(img).reshape(b, -1)feature = self.mlp(feature)y = self.fc(torch.cat([img, feature], dim=1))return y
test_set = MyDataset(test_filtered, valid_aug, 'test')
test_loader = DataLoader(test_set, batch_size=CFG['valid_bs'], shuffle=False, num_workers=CFG['num_workers'], pin_memory=True)model = Model().to(device)
model = torch.load('/kaggle/input/track-weight/best_model.pt')model.eval()y_pred = []
with torch.no_grad():tk = tqdm(test_loader, total=len(test_loader))for step, batch in enumerate(tk):if(step % 4 != 3):img, feature, label = [x.to(device) for x in batch]output1 = model(img, feature).squeeze(-1)output2 = model(img.flip(-1), feature).squeeze(-1)y_pred.extend(0.2*(output1.sigmoid().cpu().numpy()) + 0.8*(output2.sigmoid().cpu().numpy()))else:img, feature, label = [x.to(device) for x in batch]output = model(img.flip(-1), feature).squeeze(-1)y_pred.extend(output.sigmoid().cpu().numpy())    y_pred = np.array(y_pred)

这里用了翻转,tta算是一种隐式模型集成

提交

th = 0.29test_filtered['contact'] = (y_pred >= th).astype('int')sub = pd.read_csv('/kaggle/input/nfl-player-contact-detection/sample_submission.csv')sub = sub.drop("contact", axis=1).merge(test_filtered[['contact_id', 'contact']], how='left', on='contact_id')
sub['contact'] = sub['contact'].fillna(0).astype('int')sub[["contact_id", "contact"]].to_csv("submission.csv", index=False)sub.head()

推理代码链接和成绩

infer_code | Kaggle

修改版本

之前的,效果不是很好,我还是换成resnet50进行训练,结果如下,链接和权重如下

infer_code | Kaggle

best_weight | Kaggle

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.rhkb.cn/news/12286.html

如若内容造成侵权/违法违规/事实不符,请联系长河编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

Chapter 6 -Fine-tuning for classification

Chapter 6 -Fine-tuning for classification 本章内容涵盖 引入不同的LLM微调方法准备用于文本分类的数据集修改预训练的 LLM 进行微调微调 LLM 以识别垃圾邮件评估微调LLM分类器的准确性使用微调的 LLM 对新数据进行分类 现在&#xff0c;我们将通过在大语言模型上对特定目标任…

【从零开始的LeetCode-算法】922. 按奇偶排序数组 II

给定一个非负整数数组 nums&#xff0c; nums 中一半整数是 奇数 &#xff0c;一半整数是 偶数 。 对数组进行排序&#xff0c;以便当 nums[i] 为奇数时&#xff0c;i 也是 奇数 &#xff1b;当 nums[i] 为偶数时&#xff0c; i 也是 偶数 。 你可以返回 任何满足上述条件的…

python 小游戏:扫雷

目录 1. 前言 2. 准备工作 3. 生成雷区 4. 鼠标点击扫雷 5. 胜利 or 失败 6. 游戏效果展示 7. 完整代码 1. 前言 本文使用 Pygame 实现的简化版扫雷游戏。 如上图所示&#xff0c;游戏包括基本的扫雷功能&#xff1a;生成雷区、左键点击扫雷、右键标记地雷、显示数字提示…

安全策略实验报告

1.实验拓扑图 2.实验需求 vlan2属于办公区&#xff0c;vlan3生产区 办公区pc在工作日时间可以正常访问OAserver&#xff0c;i其他时间不允许 办公区pc可以在任意时间访问Web server 生产区pc可以在任意时间访问OA server但不能访问web server 特例&#xff1a;生产区pc可以…

力扣73矩阵置零

给定一个 m x n 的矩阵&#xff0c;如果一个元素为 0 &#xff0c;则将其所在行和列的所有元素都设为 0 。请使用 原地 算法。 输入&#xff1a;matrix [[1,1,1],[1,0,1],[1,1,1]] 输出&#xff1a;[[1,0,1],[0,0,0],[1,0,1]] 输入&#xff1a;matrix [[0,1,2,0],[3,4,5,2],[…

蓝桥杯C语言组:暴力破解

基于C语言的暴力破解方法详解 暴力破解是一种通过穷举所有可能的解来找到正确答案的算法思想。在C语言中&#xff0c;暴力破解通常用于解决那些问题规模较小、解的范围有限的问题。虽然暴力破解的效率通常较低&#xff0c;但它是一种简单直接的方法&#xff0c;适用于一些简单…

【自然语言处理(NLP)】生成词向量:GloVe(Global Vectors for Word Representation)原理及应用

文章目录 介绍GloVe 介绍核心思想共现矩阵1. 共现矩阵的定义2. 共现概率矩阵的定义3. 共现概率矩阵的意义4. 共现概率矩阵的构建步骤5. 共现概率矩阵的应用6. 示例7. 优缺点优点缺点 **总结** 目标函数训练过程使用预训练的GloVe词向量 优点应用总结 个人主页&#xff1a;道友老…

介绍一下Mybatis的Executor执行器

Executor执行器是用来执行我们的具体的SQL操作的 有三种基本的Executor执行器&#xff1a; SimpleExecutor简单执行器 每执行一次update或select&#xff0c;就创建一个Statement对象&#xff0c;用完立刻关闭Statement对象 ReuseExecutor可重用执行器 可重复利用Statement…

Autosar-以太网是怎么运行的?(Davinci配置部分)

写在前面&#xff1a; 入行一段时间了&#xff0c;基于个人理解整理一些东西&#xff0c;如有错误&#xff0c;欢迎各位大佬评论区指正&#xff01;&#xff01;&#xff01; 目录 1.Autosar ETH通讯软件架构 2.Ethernet MCAL配置 2.1配置对应Pin属性 2.2配置TXD引脚 2.3配…

【基于SprintBoot+Mybatis+Mysql】电脑商城项目之用户登录

&#x1f9f8;安清h&#xff1a;个人主页 &#x1f3a5;个人专栏&#xff1a;【Spring篇】【计算机网络】【Mybatis篇】 &#x1f6a6;作者简介&#xff1a;一个有趣爱睡觉的intp&#xff0c;期待和更多人分享自己所学知识的真诚大学生。 目录 &#x1f3af;1.登录-持久层 &…

VSCode设置内容字体大小

1、打开VSCode软件&#xff0c;点击左下角的“图标”&#xff0c;选择“Setting”。 在命令面板中的Font Size处选择适合自己的字体大小。 2、对比Font Size值为14与20下的字体大小。

企业商业秘密百问百答之三十八【商务保密协议签订】

《企业商业秘密百问百答》是由天禾律所陈军律师团队精心编撰的成果&#xff0c;汇集了该团队律师在处理商业秘密相关的刑事和民事案件中的丰富经验。近年来&#xff0c;这份资料已通过线上和线下的方式向全国近千家企业进行了广泛宣讲&#xff0c;并获得了积极的社会反响。 其…

C++11中的bind

官方文档对于bind接口的概述解释&#xff1a;Bind function arguments 在C11中&#xff0c;std::bind 是一个非常有用的工具&#xff0c;用于将函数、成员函数或函数对象与特定的参数绑定在一起&#xff0c;生成一个新的可调用对象。std::bind 可以用于部分应用函数参数、改变…

Qt网络相关

“ 所有生而孤独的人&#xff0c;葆有的天真 ” 为了⽀持跨平台, QT对⽹络编程的 API 也进⾏了重新封装。本章会上手一套基于QT的网络通信编写。 UDP Socket 在使用Qt进行网络编程前&#xff0c;需要在Qt项目中的.pro文件里添加对应的网络模块( network ). QT core gui net…

会计学基础

【拯救者】会计学基础速成&#xff08;期末 复试 升本均可用&#xff09; ©无忌教育 重点: 适用课本: 会计基础 会计基础是指会计工作的基本原则和方法&#xff0c;它努力为会计核算提供一个共同的基础&#xff0c;以便各种组织在会计核算上得到一致的结果。会计基础主要…

我们信仰AI?从神明到人工智能——信任的进化

信任的进化&#xff1a; 信任是我们最宝贵的资产。而现在&#xff0c;它正像黑色星期五促销的廉价平板电视一样&#xff0c;被一点点拆解。在过去&#xff0c;世界很简单&#xff1a;人们相信晚间新闻、那些满是灰尘书籍的教授&#xff0c;或者手持病历、眉头紧锁的医生。而如…

《DeepSeek R1:7b 写一个python程序调用摄像头获取视频并显示》

C:\Users\Administrator>ollama run deepseek-r1:7b hello Hello! How can I assist you today? &#x1f60a; 写一个python程序调用摄像头获取视频并显示 好&#xff0c;我需要帮用户写一个Python程序&#xff0c;它能够使用摄像头获取视频&#xff0c;并在屏幕上显示出…

Linux网络 | 进入数据链路层,学习相关协议与概念

前言&#xff1a;本节内容进入博主讲解的网络层级中的最后一层&#xff1a;数据链路层。 首先博主还是会线代友友们认识一下数据链路层的报文。 然后会带大家重新理解一些概念&#xff0c;比如局域网交换机等等。然后就是ARP协议。 讲完这些&#xff0c; 本节任务就算结束。 那…

Python 科学计算

&#x1f9d1; 博主简介&#xff1a;CSDN博客专家&#xff0c;历代文学网&#xff08;PC端可以访问&#xff1a;https://literature.sinhy.com/#/literature?__c1000&#xff0c;移动端可微信小程序搜索“历代文学”&#xff09;总架构师&#xff0c;15年工作经验&#xff0c;…

18.[前端开发]Day18-王者荣耀项目实战(一)

01-06 项目实战 1 代码规范 2 CSS编写顺序 3 组件化开发思想 组件化开发思路 项目整体思路 – 各个击破 07_(掌握)王者荣耀-top-整体布局完成 完整代码 01_page_top1.html <!DOCTYPE html> <html lang"en"> <head><meta charset"UTF-8…