Parsing the model.safetensors.index.json File of the Gemma2 2B Model

When working with the Gemma2 2B model, or other large pretrained models, the model.safetensors.index.json file serves as an index: it tells us how the model is structured, how its parameters are stored, and how to load the actual weights. This post walks through the file's contents and purpose.
The files downloaded locally are shown below:

[Figure: screenshot of the downloaded model files]


1. File Structure Overview

The model.safetensors.index.json file has two key parts:

  1. Metadata: the total size of the model.
  2. Weight map: the mapping from model parameters to the files that actually store them.

Example content:

{"metadata": {"total_size": 10457367552},"weight_map": {"model.embed_tokens.weight": "model-00001-of-00003.safetensors","model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors"}
}

2. Parsing the Metadata

total_size

  • Purpose: the total size of all model parameter files, in bytes.
  • Example: 10457367552 bytes, roughly 10.46 GB.
  • Why it matters:
    1. It helps users estimate storage requirements.
    2. It lets you check that a download is complete by comparing against the expected size (a sketch follows this list).
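As an illustration, here is a minimal sketch that sums the shard sizes on disk and compares them with total_size; the local directory name gemma-2-2b is an assumption. Note that total_size counts tensor data, while each .safetensors file also carries a small header, so the on-disk total may differ slightly — treat this as a sanity check rather than an exact equality.

import json
import os

# Local model directory -- an assumption for illustration
model_dir = "gemma-2-2b"

with open(os.path.join(model_dir, "model.safetensors.index.json")) as f:
    index = json.load(f)

# Collect the unique shard files referenced by the weight map
shard_files = set(index["weight_map"].values())

# Sum the on-disk sizes of all shards
actual_size = sum(
    os.path.getsize(os.path.join(model_dir, name)) for name in shard_files
)

expected_size = index["metadata"]["total_size"]
print(f"expected: {expected_size}, on disk: {actual_size}")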

3. Parsing the Weight Map

weight_map

  • Purpose:
    Maps each of the model's parameters to a specific .safetensors file.
  • Format:
    • Key: the parameter's name, indicating where the weight sits in the model.
    • Value: the .safetensors file that stores that weight.
  • Example entries:
    • model.embed_tokens.weight: the embedding layer's weights are stored in model-00001-of-00003.safetensors.
    • model.layers.0.mlp.up_proj.weight: the up-projection matrix of layer 0's MLP lives in model-00001-of-00003.safetensors.
    • model.layers.10.mlp.down_proj.weight: the down-projection matrix of layer 10's MLP lives in model-00002-of-00003.safetensors.

Uses

  1. Distributed storage: a large model is split into several smaller files that are easier to manage and load.
  2. Incremental updates: parts of the model can be updated without rewriting the whole thing.
  3. Dynamic loading: only the parts of the model that are needed get loaded (see the sketch after this list).
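A few lines of Python are enough to answer questions like "which file holds a given tensor" or "how many tensors live in each shard" — a minimal sketch, assuming the index file is in the current working directory:

import json
from collections import defaultdict

with open("model.safetensors.index.json") as f:
    weight_map = json.load(f)["weight_map"]

# Which file holds a given tensor?
print(weight_map["model.embed_tokens.weight"])  # model-00001-of-00003.safetensors

# Group tensor names by the shard that stores them
by_shard = defaultdict(list)
for name, shard in weight_map.items():
    by_shard[shard].append(name)

for shard, names in sorted(by_shard.items()):
    print(f"{shard}: {len(names)} tensors")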

4. The Sharding Mechanism

Why shard?

  1. Storage limits: a single oversized file can exceed file-system limits.
  2. Loading efficiency: shards can be loaded on demand, making better use of memory.
  3. Distributed training: multiple GPUs or nodes can process different parameter shards in parallel.

How are shards located?

  • File naming convention: model-<index>-of-<total>.safetensors
    • model-00001-of-00003.safetensors is the 1st of 3 shards.
  • The index file guarantees a one-to-one correspondence between parameter names and file names (a small parsing sketch follows).
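The naming convention is regular enough to parse mechanically. A minimal sketch — the regular expression here is my own illustration, not part of any library:

import re

# Matches model-<index>-of-<total>.safetensors
pattern = re.compile(r"model-(\d{5})-of-(\d{5})\.safetensors")

m = pattern.fullmatch("model-00001-of-00003.safetensors")
index, total = int(m.group(1)), int(m.group(2))
print(index, total)  # 1 3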

5. A Brief Look at the Safetensors Format

Advantages

  1. Safety: prevents malicious code injection, so weight files can be loaded safely.
  2. Efficiency: a binary storage format that supports fast reads and writes.
  3. Cross-platform compatibility: works in both CPU and GPU environments.

Loading example

from safetensors.torch import load_file

# Load a specific shard
weights = load_file("model-00001-of-00003.safetensors")
print(weights.keys())
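Note that load_file reads every tensor in the shard into memory at once. When only a handful of tensors are needed, the safetensors library also provides safe_open, which reads individual tensors lazily — a minimal sketch:

from safetensors import safe_open

# Open the shard without loading all of its tensors
with safe_open("model-00001-of-00003.safetensors", framework="pt", device="cpu") as f:
    print(f.keys())  # names of the tensors stored in this shard
    emb = f.get_tensor("model.embed_tokens.weight")  # load just one tensor
    print(emb.shape)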

6. Practical Use Cases

1. The model loading process

  1. Read the shard information from model.safetensors.index.json.
  2. Load only the shards you need onto the GPU, reducing memory use.
  3. Merge the loaded parameters to reconstruct the full model (a sketch follows this list).
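Putting the pieces together, a hand-rolled loader takes only a few lines. This is a sketch, assuming all shards sit next to the index file in a local directory named gemma-2-2b (an assumption; in practice, transformers' from_pretrained handles all of this for you):

import json
import os
from safetensors.torch import load_file

model_dir = "gemma-2-2b"  # assumed local directory

with open(os.path.join(model_dir, "model.safetensors.index.json")) as f:
    index = json.load(f)

# Load each shard once and merge everything into a single state dict
state_dict = {}
for shard in sorted(set(index["weight_map"].values())):
    state_dict.update(load_file(os.path.join(model_dir, shard)))

print(len(state_dict))  # number of tensors recovered
# model.load_state_dict(state_dict)  # apply to an instantiated model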

2. File consistency checks

  • Use total_size to verify that the total size of the downloaded files is correct, ensuring data integrity (see the sketch in Section 2).

3. Fine-tuning parameters

  • You can load only the weights of the specific layers you want to fine-tune, avoiding unnecessary parameters (see the sketch below).
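For example, to work on layer 0 alone, the index tells us which shard holds each tensor, and safe_open pulls just those tensors. A sketch, assuming the index and shards are in the current directory:

import json
from collections import defaultdict
from safetensors import safe_open

with open("model.safetensors.index.json") as f:
    weight_map = json.load(f)["weight_map"]

# Pick out the tensors belonging to layer 0
wanted = {k: v for k, v in weight_map.items() if k.startswith("model.layers.0.")}

# Group by shard so each file is opened only once
by_shard = defaultdict(list)
for name, shard in wanted.items():
    by_shard[shard].append(name)

tensors = {}
for shard, names in by_shard.items():
    with safe_open(shard, framework="pt", device="cpu") as f:
        for name in names:
            tensors[name] = f.get_tensor(name)

print(sorted(tensors))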

7. 总结

model.safetensors.index.json 文件是大型模型权重管理的重要工具,尤其适合 Gemma2 2B 这样的多层神经网络。通过解析该文件,可以了解模型的存储布局、参数分片策略以及如何高效加载和管理模型权重。

关键要点

  1. 元数据部分提供总大小信息,便于存储规划和完整性检查。
  2. 权重映射部分详细记录模型参数与存储文件的对应关系,支持灵活加载。
  3. Safetensors 格式提高了加载速度和安全性,适合大规模模型的分布式部署。

希望这篇博客能帮助您更好地理解 model.safetensors.index.json 文件的作用和实现原理,助力您的模型开发和部署工作!

Postscript

Completed in Shanghai at 13:45 on December 30, 2024, with assistance from the GPT-4o model.

Appendix

Below is the complete model.safetensors.index.json file of the Gemma2 2B model (pretty-printed here for readability):

{"metadata": {"total_size": 10457367552},"weight_map": {"model.embed_tokens.weight": "model-00001-of-00003.safetensors","model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.0.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.0.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.0.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.0.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.0.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.0.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.0.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.0.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.0.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.1.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.1.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.1.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.1.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.10.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.10.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.10.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.10.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.10.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.10.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.10.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.10.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.10.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.10.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.11.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.11.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.11.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.11.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.self_attn.v_proj.weight": 
"model-00002-of-00003.safetensors","model.layers.12.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.12.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.12.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.12.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.12.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.13.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.13.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.13.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.13.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.14.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.14.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.14.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.14.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.15.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.15.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.15.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.15.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.self_attn.v_proj.weight": 
"model-00002-of-00003.safetensors","model.layers.16.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.16.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.16.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.16.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.16.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.17.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.17.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.17.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.17.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.18.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.18.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.18.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.18.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.19.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.19.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.19.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.19.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.self_attn.v_proj.weight": 
"model-00002-of-00003.safetensors","model.layers.2.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.2.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.2.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.2.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.2.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.2.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.2.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.2.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.2.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.2.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.2.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.20.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.20.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.20.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.20.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.20.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.20.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.20.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.20.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.20.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.20.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.20.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.21.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.21.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.21.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.21.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.22.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.22.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.22.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.22.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.self_attn.v_proj.weight": 
"model-00002-of-00003.safetensors","model.layers.23.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.23.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.23.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.23.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.23.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.24.input_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.24.mlp.down_proj.weight": "model-00003-of-00003.safetensors","model.layers.24.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.24.mlp.up_proj.weight": "model-00003-of-00003.safetensors","model.layers.24.post_attention_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.24.post_feedforward_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.24.pre_feedforward_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.24.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.24.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.24.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.24.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.25.input_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.25.mlp.down_proj.weight": "model-00003-of-00003.safetensors","model.layers.25.mlp.gate_proj.weight": "model-00003-of-00003.safetensors","model.layers.25.mlp.up_proj.weight": "model-00003-of-00003.safetensors","model.layers.25.post_attention_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.25.post_feedforward_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.25.pre_feedforward_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.25.self_attn.k_proj.weight": "model-00003-of-00003.safetensors","model.layers.25.self_attn.o_proj.weight": "model-00003-of-00003.safetensors","model.layers.25.self_attn.q_proj.weight": "model-00003-of-00003.safetensors","model.layers.25.self_attn.v_proj.weight": "model-00003-of-00003.safetensors","model.layers.3.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.3.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.3.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.3.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.3.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.3.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.3.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.3.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.3.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.3.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.3.self_attn.v_proj.weight": 
"model-00001-of-00003.safetensors","model.layers.4.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.4.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.4.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.4.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.4.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.4.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.4.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.4.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.4.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.4.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.4.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.5.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.5.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.5.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.5.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.6.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.6.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.6.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.6.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.7.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.7.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.7.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.7.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.8.input_layernorm.weight": 
"model-00002-of-00003.safetensors","model.layers.8.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.8.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.8.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.8.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.8.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.8.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.8.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.8.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.8.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.8.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.9.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.9.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.9.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.9.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.9.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.9.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.9.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.9.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.9.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.9.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.9.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.norm.weight": "model-00003-of-00003.safetensors"}
}

1.在家目录下创建目录文件&#xff0c;dir a.dir下创建dir1和dir2 b.把当前目录下的所有文件拷贝到dir1中&#xff0c; c.把当前目录下的所有脚本文件拷贝到dir2中 d.把dir2打包并压缩为dir2.tar.xz e.再把dir2.tar.xz移动到dir1中 f.解压dir1中的压缩包 g.使用tree工具&#x…