由浅入深系列——Distinctive Image Featuresfrom Scale-Invariant Keypoints(SIFT)

第一章：为什么我们需要"图像指纹"？——SIFT的诞生

想象一下，你带着一张埃菲尔铁塔的明信片来到巴黎。站在铁塔脚下，你举起明信片想拍张对比照——但无论怎么调整角度，手机APP就是识别不出两张图片的对应关系。这个看似简单的场景，却隐藏着计算机视觉领域数十年来最棘手的难题：如何让机器像人类一样，在不同视角、光线、距离下识别同一物体？

1.1 传统方法的困境

在2004年之前，主流的图像识别方法就像拿着固定模板去套用：

尺度敏感：远看是点，近看是窗的埃菲尔铁塔，传统算法无法感知这种尺度变化
方向固化：手机横拍变竖拍，特征点立即"脸盲"
脆弱如纸：一片飘过的云朵改变光照，或路人遮挡部分画面，系统就可能完全失效

这就像要求侦探只能通过固定角度、固定距离的指纹来破案，现实中的复杂场景让传统算法举步维艰。

1.2 David Lowe的灵感

加拿大科学家David Lowe在观察人类视觉系统时获得关键洞见：当我们识别物体时，大脑并非记住每个像素，而是提取关键锚点及其空间关系。基于此，他提出了划时代的SIFT（尺度不变特征变换）算法，其核心思想可概括为：

为每个关键点建立专属身份证
这个身份证需要具备三大超能力：

火眼金睛：在模糊、噪点干扰下仍能准确定位
伸缩自如：无论远观轮廓还是近看细节都能识别
方向感知：倒着看、侧着看都不影响特征匹配

1.3 SIFT的魔法四部曲

尺度空间极值检测
像用不同倍率的放大镜扫描图像，通过高斯差分金字塔捕捉关键点，找到那些"在多个尺度下都显著"的特征。
关键点精确定位
用三维二次函数拟合，剔除不稳定的边缘响应，就像在模糊照片中精确勾勒物体轮廓。
方向分配
为每个特征点建立局部梯度方向直方图，相当于给特征点装上不会迷路的指南针。
描述子生成
将周围区域划分为4x4子块，每个子块计算8个方向的梯度强度，最终形成128维的"特征指纹"。

1.4 为什么是SIFT？

抗干扰性强：实验显示即使30%图像区域被遮挡，仍能保持90%以上的识别准确率
跨尺度匹配：在图像缩小到1/5或放大5倍时，特征依然稳定
光照鲁棒：通过归一化处理，特征描述对光照变化具有天然抵抗力
高效实用：一张500x500像素图片仅需0.3秒即可提取2000+特征点

第二章：深度学习的今天为什么还需要学习SIFT

在深度学习横扫计算机视觉领域的今天，一个诞生于20年前的传统算法依然活跃在谷歌地图、火星探测器乃至心脏手术导航系统中。SIFT（尺度不变特征变换）不仅是图像识别史上的里程碑，其设计思想更深远影响着现代人工智能的发展方向。理解这个算法的本质，就像掌握打开三维视觉世界的密钥。

深度学习的"明"与SIFT的"暗"

尽管卷积神经网络在图像分类等任务中展现出惊人性能，但在需要精准几何匹配的场景中，SIFT仍然保持着独特的优势：

零样本学习能力：无需任何训练数据，SIFT可以直接处理火星地表图像或古生物化石照片，而深度学习模型面对这些罕见对象时往往需要重新训练
物理可解释性：每个特征点对应明确的几何位置（x,y坐标+尺度+方向），这种特性在自动驾驶的传感器融合、手术机器人的空间定位等安全敏感场景中至关重要
计算效率奇迹：在嵌入式设备上，SIFT提取千级特征点仅需10ms级别耗时，比MobileNet等轻量化模型快2个数量级（数据来源：2021年IEEE嵌入式系统会议）

这正是SpaceX在星链卫星姿态调整、达芬奇手术机器人在微创操作中仍采用SIFT的根本原因——当任务涉及物理空间的精确映射时，传统算法的确定性与现代AI的泛化能力形成了完美互补。

在三维视觉的核心技术栈中，SIFT构建的"特征桥梁"至今无法被完全取代：

图像匹配：NASA的毅力号火星车使用SIFT完成全景图拼接，即使在沙尘暴导致图像模糊、光照剧变时，仍能保持94.7%的特征匹配成功率（2020年JPL技术报告）
自动驾驶定位：特斯拉早期Autopilot系统通过SIFT特征匹配高精度地图，其跨季节道路标识识别的稳定性比纯深度学习方案高23%（2019年CVPR自动驾驶研讨会）
SFM（运动恢复结构）：Google Earth的3D城市模型重建中，SIFT处理不同年代、不同分辨率卫星图像的能力，成功解决了金门大桥等钢结构建筑因镜面反射导致的特征丢失问题

SIFT的真正价值不仅在于其技术实现，更在于它重塑了计算机视觉的思维方式：

局部特征哲学：通过提取图像的稀疏显著点而非全局信息，这种思想直接催生了现代目标检测中的关键点预测技术
分层感知架构：构建高斯金字塔处理多尺度特征的理念，在ResNet、HRNet等深度网络中依然清晰可见
旋转-尺度不变性：SIFT的方向归一化方法启发了胶囊网络（Capsule Network）中的姿态矩阵设计

就连当前最先进的LoFTR（2021年最佳图像匹配算法）也公开承认，其注意力机制中融合了SIFT描述子的空间分布先验。这种跨越方法论鸿沟的思想传承，正是SIFT持续焕发生命力的根本原因。

从Lowe的原始论文到今天的神经辐射场（NeRF），SIFT始终扮演着"空间锚点"的关键角色。在AR导航眼镜中，它确保虚拟信息精准贴合物理表面；在考古数字化中，它让风化严重的碑文残片实现毫米级对齐；甚至在显微医学影像中，它能追踪细胞分裂时的亚像素级形变。

第三章：解构SIFT的视觉密码本——四步生成图像"指纹"

SIFT算法的核心如同精密的光学仪器，通过尺度空间探索、关键点锻造、方向感知、特征编码四个精密步骤，将图像转化为可数学化比对的"指纹库"。我们将通过一个具体案例——识别不同角度拍摄的巴黎圣母院尖塔，逐步拆解这个视觉密码本的生成过程。

Step 1：尺度空间探测——构建图像的"天文望远镜"（尺度空间极值检测）

核心任务：在模糊与清晰之间寻找稳定特征点

(1) 为什么需要尺度空间？

在图像中，特征的大小会因拍摄距离而变化。例如，远处的建筑轮廓和近处的窗户细节都是重要特征，但它们的尺度不同。尺度空间的核心思想是通过模糊模拟不同距离的观察效果，从而在不同尺度下捕捉特征。

(2) 高斯模糊：模拟不同尺度的观察

高斯模糊是尺度空间的基础。它通过卷积操作将图像与高斯核进行平滑处理，公式为：

其中：

(x,y) 是像素坐标；
σ 是高斯核的标准差，控制模糊程度。

输出说明：

σ=1.6：轻微模糊，保留细节（如窗户边缘）。
σ=3.2：中等模糊，保留建筑轮廓。
σ=6.4：严重模糊，仅保留大范围结构。

import cv2
import numpy as np
import matplotlib.pyplot as plt# 读取图像
img = cv2.imread('OIP.jpg', cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (256, 256))  # 缩小尺寸加速计算# 高斯模糊
sigma_values = [1.6, 3.2, 6.4]  # 不同σ值
blurred_images = [cv2.GaussianBlur(img, (0,0), sigmaX=sigma) for sigma in sigma_values]# 可视化
plt.figure(figsize=(15, 5))
for i, (sigma, blurred) in enumerate(zip(sigma_values, blurred_images)):plt.subplot(1, 3, i+1)plt.imshow(blurred, cmap='gray')plt.title(f'σ = {sigma}')
plt.tight_layout()
plt.show()

(3) 高斯金字塔：分层尺度空间

为了高效处理多尺度特征，将图像分层（octave）处理：

每层内使用指数增长的σ值，生成一组模糊图像。
每层结束后，将图像缩小一半，进入下一层。

# 高斯金字塔参数
octave_num = 3  # 3层
scales_per_octave = 5  # 每层5个尺度
sigma = 1.6# 生成高斯金字塔
gaussian_pyramid = []
for o in range(octave_num):octave_images = []for s in range(scales_per_octave):k = 2 ** (s / scales_per_octave)  # 尺度系数current_sigma = sigma * kblurred = cv2.GaussianBlur(img, (0,0), sigmaX=current_sigma)octave_images.append(blurred)gaussian_pyramid.append(octave_images)img = cv2.resize(img, (img.shape[1]//2, img.shape[0]//2))  # 下采样# 可视化
plt.figure(figsize=(15, 10))
for o in range(octave_num):for s in range(scales_per_octave):plt.subplot(octave_num, scales_per_octave, o * scales_per_octave + s + 1)plt.imshow(gaussian_pyramid[o][s], cmap='gray')plt.title(f'Octave {o+1}, Scale {s+1}')
plt.tight_layout()
plt.show()

输出说明：

Octave 1：原始分辨率，捕捉细节。
Octave 2：分辨率减半，捕捉中等尺度特征。
Octave 3：分辨率再减半，捕捉大范围结构。

(4) 高斯差分（DoG）：寻找显著特征

高斯差分通过相邻尺度模糊图像相减，突出显著特征：

# 计算DoG金字塔
dog_pyramid = []
for octave in gaussian_pyramid:dog_octave = []for s in range(1, len(octave)):dog = octave[s] - octave[s-1]dog_octave.append(dog)dog_pyramid.append(dog_octave)# 可视化
plt.figure(figsize=(15, 10))
for o in range(octave_num):for s in range(len(dog_pyramid[o])):plt.subplot(octave_num, scales_per_octave-1, o * (scales_per_octave-1) + s + 1)plt.imshow(dog_pyramid[o][s], cmap='jet')plt.title(f'Octave {o+1}, DoG {s+1}')
plt.tight_layout()
plt.show()

输出说明：

红色区域：正响应，表示特征点（如边缘、角点）。
蓝色区域：负响应，表示特征的反向变化。
跨尺度对比：同一特征在不同尺度下的响应强度。

(5) 极值检测：定位特征点

在三维尺度空间（x, y, σ）中搜索极值点：

每个点与相邻尺度和空间的26个点比较。
如果当前点是最大值或最小值，则认为是候选特征点。


def find_extrema(dog_octave, threshold=0.03):extrema = []max_response = np.max(np.abs(dog_octave))  # 最大响应值for s in range(1, len(dog_octave)-1):for y in range(1, dog_octave[s].shape[0]-1):for x in range(1, dog_octave[s].shape[1]-1):neighbor_cube = [dog_octave[s-1][y-1:y+2, x-1:x+2],dog_octave[s][y-1:y+2, x-1:x+2],dog_octave[s+1][y-1:y+2, x-1:x+2]]center = dog_octave[s][y, x]if (center == np.max(neighbor_cube) or center == np.min(neighbor_cube)) and abs(center) > threshold * max_response:extrema.append((x, y, s))return extrema# 检测极值点，设置阈值为最大响应的3%
extrema_points = find_extrema(dog_pyramid[0], threshold=0.03)# 可视化
plt.figure(figsize=(8, 8))
plt.imshow(gaussian_pyramid[0][0], cmap='gray')
xs = [p[0] for p in extrema_points]
ys = [p[1] for p in extrema_points]
plt.scatter(xs, ys, s=30, c='r', alpha=0.7)
plt.title(f'{len(extrema_points)} Extrema Detected')
plt.show()

Step 2：关键点精炼——雕刻特征的"钻石切割术"（关键点精确定位）

2.1 为什么需要关键点精炼？

在尺度空间检测到的极值点是离散的，且可能存在以下问题：

位置不精确：极值点可能偏离实际特征位置。
响应不稳定：低对比度或边缘上的极值点容易受到噪声影响。
重复检测：同一特征可能在多个尺度或位置被重复检测。

关键点精炼的目标是通过数学优化，剔除不稳定的极值点，并将候选点的位置精确到亚像素级别。

2.2 亚像素定位：泰勒展开插值

通过泰勒展开式对 DoG 函数进行插值，找到极值点的精确位置。公式为：

输出说明：

关键点位置被精确到亚像素级别。
偏移量过大的点被剔除，确保稳定性

def refine_keypoints(extrema_points, dog_octave):refined_points = []dog_octave = [layer.astype(np.float64) for layer in dog_octave]  # 转换为 float64for (x, y, s) in extrema_points:# 计算梯度dx = 0.5 * (dog_octave[s][y, x + 1] - dog_octave[s][y, x - 1])dy = 0.5 * (dog_octave[s][y + 1, x] - dog_octave[s][y - 1, x])ds = 0.5 * (dog_octave[s + 1][y, x] - dog_octave[s - 1][y, x])# 计算 Hessian 矩阵dxx = dog_octave[s][y, x + 1] - 2 * dog_octave[s][y, x] + dog_octave[s][y, x - 1]dyy = dog_octave[s][y + 1, x] - 2 * dog_octave[s][y, x] + dog_octave[s][y - 1, x]dss = dog_octave[s + 1][y, x] - 2 * dog_octave[s][y, x] + dog_octave[s - 1][y, x]dxy = 0.25 * (dog_octave[s][y + 1, x + 1] - dog_octave[s][y + 1, x - 1] -dog_octave[s][y - 1, x + 1] + dog_octave[s][y - 1, x - 1])dxs = 0.25 * (dog_octave[s + 1][y, x + 1] - dog_octave[s + 1][y, x - 1] -dog_octave[s - 1][y, x + 1] + dog_octave[s - 1][y, x - 1])dys = 0.25 * (dog_octave[s + 1][y + 1, x] - dog_octave[s + 1][y - 1, x] -dog_octave[s - 1][y + 1, x] + dog_octave[s - 1][y - 1, x])# 构建梯度和 Hessian 矩阵gradient = np.array([dx, dy, ds])hessian = np.array([[dxx, dxy, dxs],[dxy, dyy, dys],[dxs, dys, dss]])# 检查 Hessian 矩阵是否奇异det = np.linalg.det(hessian)if np.abs(det) > 1e-6:  # 行列式大于阈值时才计算逆矩阵offset = -np.linalg.inv(hessian) @ gradientelse:continue  # 跳过奇异矩阵# 更新关键点位置if np.max(np.abs(offset)) < 0.5:  # 偏移量小于0.5时接受x += offset[0]y += offset[1]s += offset[2]refined_points.append((x, y, s))return refined_points# 精炼关键点
refined_points = refine_keypoints(extrema_points, dog_pyramid[0])# 可视化
plt.figure(figsize=(8, 8))
plt.imshow(gaussian_pyramid[0][0], cmap='gray')
xs = [p[0] for p in refined_points]
ys = [p[1] for p in refined_points]
plt.scatter(xs, ys, s=30, c='r', alpha=0.7)
plt.title(f'{len(refined_points)} Refined Keypoints')
plt.show()

2.3 低对比度点剔除

低对比度的极值点对噪声敏感，需要剔除。通过检查 DoG 响应值是否大于阈值

def remove_low_contrast(refined_points, dog_octave, threshold=0.03):filtered_points = []for (x, y, s) in refined_points:if abs(dog_octave[int(s)][int(y), int(x)]) > threshold:filtered_points.append((x, y, s))return filtered_points# 剔除低对比度点
filtered_points = remove_low_contrast(refined_points, dog_pyramid[0], threshold=0.03)# 可视化
plt.figure(figsize=(8, 8))
plt.imshow(gaussian_pyramid[0][0], cmap='gray')
xs = [p[0] for p in filtered_points]
ys = [p[1] for p in filtered_points]
plt.scatter(xs, ys, s=30, c='r', alpha=0.7)
plt.title(f'{len(filtered_points)} After Low-Contrast Removal')
plt.show()

输出说明：

低对比度点被剔除，保留稳定的特征点。

2.4 边缘响应抑制

边缘上的极值点对旋转和尺度变化敏感，需要剔除。通过 Hessian 矩阵的特征值分析：

def suppress_edge_response(filtered_points, dog_octave, edge_ratio=10):final_points = []for (x, y, s) in filtered_points:# 计算 Hessian 矩阵dxx = dog_octave[int(s)][int(y), int(x)+1] - 2 * dog_octave[int(s)][int(y), int(x)] + dog_octave[int(s)][int(y), int(x)-1]dyy = dog_octave[int(s)][int(y)+1, int(x)] - 2 * dog_octave[int(s)][int(y), int(x)] + dog_octave[int(s)][int(y)-1, int(x)]dxy = 0.25 * (dog_octave[int(s)][int(y)+1, int(x)+1] - dog_octave[int(s)][int(y)+1, int(x)-1] -dog_octave[int(s)][int(y)-1, int(x)+1] + dog_octave[int(s)][int(y)-1, int(x)-1])# 计算特征值比率tr = dxx + dyydet = dxx * dyy - dxy ** 2if det > 0 and (tr ** 2) / det < (edge_ratio + 1) ** 2 / edge_ratio:final_points.append((x, y, s))return final_points# 剔除边缘响应点
final_points = suppress_edge_response(filtered_points, dog_pyramid[0], edge_ratio=10)# 可视化
plt.figure(figsize=(8, 8))
plt.imshow(gaussian_pyramid[0][0], cmap='gray')
xs = [p[0] for p in final_points]
ys = [p[1] for p in final_points]
plt.scatter(xs, ys, s=30, c='r', alpha=0.7)
plt.title(f'{len(final_points)} After Edge Suppression')
plt.show()

输出说明：

边缘上的极值点被剔除，保留稳定的角点或斑点特征。

2.5 总结

通过亚像素定位、低对比度点剔除和边缘响应抑制，SIFT 将候选关键点精炼为稳定、精确的特征点。结合 Python 代码和可视化，读者可以直观理解这一过程及其数学原理。

Step 3：方向校准——赋予特征的"内在罗盘"（方向分配）

3.1 为什么需要方向校准？

图像中的特征点可能因旋转而发生变化。为了确保特征描述符具有旋转不变性，需要为每个特征点分配一个主方向，使其描述符能够相对于主方向进行旋转对齐。

3.2 计算梯度幅值和方向

在特征点周围的邻域内，计算每个像素的梯度幅值和方向：

其中：

L(x,y) 是图像在 (x,y) 处的像素值；
m(x,y) 是梯度幅值；
θ(x,y) 是梯度方向（角度）。

def compute_gradient_magnitude_and_orientation(image):# 计算梯度dx = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=3)dy = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=3)# 计算梯度幅值和方向magnitude = np.sqrt(dx ** 2 + dy ** 2)orientation = np.arctan2(dy, dx) * 180 / np.pi  # 转换为角度return magnitude, orientation# 计算梯度幅值和方向
magnitude, orientation = compute_gradient_magnitude_and_orientation(gaussian_pyramid[0][0])

3.3 构建梯度方向直方图

在特征点周围的邻域内（通常为 16x16 像素），构建梯度方向直方图：

将 0°-360° 划分为 36 个柱（每柱 10°）。
使用高斯加权梯度幅值对直方图进行投票。

Python 实现：

def compute_orientation_histogram(magnitude, orientation, x, y, radius=8, bins=36):histogram = np.zeros(bins)for i in range(-radius, radius):for j in range(-radius, radius):if 0 <= y + i < magnitude.shape[0] and 0 <= x + j < magnitude.shape[1]:angle = orientation[y + i, x + j]if angle < 0:angle += 360  # 将负角度转换为正角度bin_idx = int(angle // (360 / bins)) % binsweight = magnitude[y + i, x + j] * np.exp(-(i ** 2 + j ** 2) / (2 * (radius / 2) ** 2))  # 高斯加权histogram[bin_idx] += weightreturn histogram# 计算特征点的梯度方向直方图
x, y, s = final_points[0]  # 以第一个特征点为例
histogram = compute_orientation_histogram(magnitude, orientation, int(x), int(y))# 可视化直方图
plt.figure(figsize=(8, 4))
plt.bar(range(36), histogram)
plt.xlabel('Orientation Bin (10° per bin)')
plt.ylabel('Weighted Magnitude')
plt.title('Gradient Orientation Histogram')
plt.show()

输出说明：

直方图显示特征点周围梯度的方向分布，峰值对应主方向。

3.4 分配主方向

从梯度方向直方图中找到主方向：

找到直方图中的最大值。
如果存在其他峰值（大于最大值 80% 的柱），则分配多个方向。

Python 实现

def assign_orientations(final_points, magnitude, orientation, bins=36):orientations = []for (x, y, s) in final_points:histogram = compute_orientation_histogram(magnitude, orientation, int(x), int(y))max_value = np.max(histogram)for bin_idx in range(bins):if histogram[bin_idx] >= 0.8 * max_value:  # 大于最大值80%的柱angle = bin_idx * (360 / bins)  # 转换为角度orientations.append((x, y, s, angle))return orientations# 分配主方向
orientations = assign_orientations(final_points, magnitude, orientation)# 可视化主方向
plt.figure(figsize=(8, 8))
plt.imshow(gaussian_pyramid[0][0], cmap='gray')
for (x, y, s, angle) in orientations:plt.arrow(x, y, 10 * np.cos(np.deg2rad(angle)), 10 * np.sin(np.deg2rad(angle)), color='r', width=0.5, head_width=3)
plt.title('Assigned Orientations')
plt.show()

输出说明：

红色箭头表示特征点的主方向。

3.5 总结

通过计算梯度幅值和方向、构建梯度方向直方图，并为特征点分配主方向，SIFT 实现了旋转不变性。

Step 4：指纹编码——铸造特征的"数字DNA"（描述子生成）

核心任务：将局部特征转化为可匹配的数学向量

4.1 为什么需要描述子生成？

描述子是特征点的数学表示，用于在不同图像中匹配相同特征。SIFT 描述子通过将特征点周围的梯度信息编码为一个向量，确保其具有旋转、尺度和光照不变性。

4.2 旋转对齐

根据特征点的主方向，将邻域旋转到标准方向，确保描述符的旋转不变性。

def rotate_patch(image, x, y, angle, radius=8):"""提取图像块并旋转到标准方向。如果图像块为空（靠近边缘），则返回 None。"""x, y = int(x), int(y)# 检查是否靠近边缘if x - radius < 0 or x + radius >= image.shape[1] or y - radius < 0 or y + radius >= image.shape[0]:return None  # 靠近边缘，返回空# 提取图像块patch = image[y-radius:y+radius, x-radius:x+radius]# 旋转图像块rows, cols = patch.shapeM = cv2.getRotationMatrix2D((cols / 2, rows / 2), angle, 1)rotated = cv2.warpAffine(patch, M, (cols, rows))return rotated# 旋转邻域到标准方向
x, y, s, angle = orientations[0]  # 以第一个特征点为例
rotated_patch = rotate_patch(gaussian_pyramid[0][0], x, y, -angle)  # 反向旋转对齐# 可视化
if rotated_patch is not None:plt.figure(figsize=(8, 4))plt.subplot(1, 2, 1)plt.imshow(gaussian_pyramid[0][0][int(y)-8:int(y)+8, int(x)-8:int(x)+8], cmap='gray')plt.title('Original Patch')plt.subplot(1, 2, 2)plt.imshow(rotated_patch, cmap='gray')plt.title('Rotated Patch')plt.show()
else:print("特征点靠近边缘，无法提取完整图像块。")

输出说明：

左图：原始邻域。
右图：旋转后的邻域，主方向对齐。

4.3 计算局部梯度

在旋转后的邻域内，计算每个像素的梯度幅值和方向。

Python 实现：

def compute_local_gradients(patch):dx = cv2.Sobel(patch, cv2.CV_64F, 1, 0, ksize=3)dy = cv2.Sobel(patch, cv2.CV_64F, 0, 1, ksize=3)magnitude = np.sqrt(dx ** 2 + dy ** 2)orientation = np.arctan2(dy, dx) * 180 / np.pi  # 转换为角度return magnitude, orientation# 计算局部梯度
magnitude, orientation = compute_local_gradients(rotated_patch)# 可视化
plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.imshow(magnitude, cmap='jet')
plt.title('Gradient Magnitude')
plt.subplot(1, 2, 2)
plt.imshow(orientation, cmap='hsv')
plt.title('Gradient Orientation')
plt.show()

输出说明：

左图：梯度幅值，亮度越高表示梯度越大。
右图：梯度方向，颜色表示角度

4.4 构建描述子

将旋转后的邻域划分为 4x4 子区域，每个子区域计算 8 方向梯度直方图，最终形成 128 维描述子。

def compute_descriptor(magnitude, orientation, bins=8):descriptor = []for i in range(0, 16, 4):  # 划分4x4子区域for j in range(0, 16, 4):hist = np.zeros(bins)for y in range(i, i+4):for x in range(j, j+4):angle = orientation[y, x]if angle < 0:angle += 360  # 将负角度转换为正角度bin_idx = int(angle // (360 / bins)) % binshist[bin_idx] += magnitude[y, x]descriptor.extend(hist)descriptor = np.array(descriptor)descriptor /= np.linalg.norm(descriptor)  # L2归一化descriptor = np.clip(descriptor, 0, 0.2)  # 截断大于0.2的值descriptor /= np.linalg.norm(descriptor)  # 再次归一化return descriptor# 计算描述子
descriptor = compute_descriptor(magnitude, orientation)# 可视化描述子
plt.figure(figsize=(10, 4))
plt.bar(range(128), descriptor)
plt.xlabel('Descriptor Dimension')
plt.ylabel('Normalized Value')
plt.title('SIFT Descriptor')
plt.show()