AssertionError: weight model.layers.0.self_attn.q_proj.weight does not exist

Running the AWQ-quantized Qwen2.5-7B-Instruct (Tongyi Qianwen 2.5) on an Ascend NPU fails with the error above, which is puzzling at first:

AssertionError: weight model.layers.0.self_attn.q_proj.weight does not exist

Model files:

https://www.modelscope.cn/models/Qwen/Qwen2.5-7B-Instruct-AWQ/files

The cause: AWQ checkpoints are not supported on the NPU; Huawei's MindIE stack uses its own quantization method:

https://www.hiascend.com/document/detail/zh/mindie/10RC3/mindiellm/llmdev/mindie_llm0281.html
Testing the conversion:
root@dev-8242526b-01f2-4a54-b89d-f6d9c57c692d-qjhpf:/usr/local/Ascend/llm_model# python examples/models/qwen/convert_quant_weights.py --model_path /home/apulis-dev/teamdata/Qwen2.5-7B-Instruct-AWQ --save_directory /home/apulis-dev/teamdata/Qwen2.5-7B --w_bit 8 --a_bit 8 --disable_level L5 --device_type npu --calib_file /usr/local/Ascend/llm_model/examples/convert/model_slim/boolq.jsonl


2024-11-08 11:12:59,068 [INFO] [pid: 776716] env.py-55: {'use_ascend': True, 'max_memory_gb': None, 'reserved_memory_gb': 3, 'skip_warmup': False, 'visible_devices': '0', 'use_host_chooser': True, 'bind_cpu': True}
[W compiler_depend.ts:623] Warning: expandable_segments currently defaults to false. You can enable this feature by `export PYTORCH_NPU_ALLOC_CONF = expandable_segments:True`. (function operator())
Traceback (most recent call last):
  File "/usr/local/Ascend/llm_model/examples/models/qwen/convert_quant_weights.py", line 52, in <module>
    quant_weight_generator = Quantifier(args.model_path, quant_config, anti_outlier_config, args.device_type)
  File "/usr/local/Ascend/llm_model/examples/convert/model_slim/quantifier.py", line 79, in __init__
    self.model = AutoModelForCausalLM.from_pretrained(
  File "/usr/local/python3.10.2/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 561, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/python3.10.2/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3014, in from_pretrained
    config.quantization_config = AutoHfQuantizer.merge_quantization_configs(
  File "/usr/local/python3.10.2/lib/python3.10/site-packages/transformers/quantizers/auto.py", line 145, in merge_quantization_configs
    quantization_config = AutoQuantizationConfig.from_dict(quantization_config)
  File "/usr/local/python3.10.2/lib/python3.10/site-packages/transformers/quantizers/auto.py", line 75, in from_dict
    return target_cls.from_dict(quantization_config_dict)
  File "/usr/local/python3.10.2/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 90, in from_dict
    config = cls(**config_dict)
  File "/usr/local/python3.10.2/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 655, in __init__
    self.post_init()
  File "/usr/local/python3.10.2/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 662, in post_init
    raise ValueError("AWQ is only available on GPU")
ValueError: AWQ is only available on GPU
[ERROR] 2024-11-08-11:13:06 (PID:776716, Device:0, RankID:-1) ERR99999 UNKNOWN application exception
root@dev-8242526b-01f2-4a54-b89d-f6d9c57c692d-qjhpf:/usr/local/Ascend/llm_model# 
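The traceback shows the run never reaches the weights: `AwqConfig.post_init` in transformers rejects any environment without a CUDA GPU as soon as it sees `"quant_method": "awq"` in the config. The gist of that check can be mimicked with a stdlib-only sketch (`check_awq_backend` is a hypothetical stand-in, not the real transformers function):

```python
# Hypothetical sketch of the kind of guard transformers' AwqConfig.post_init
# applies: AWQ kernels are CUDA-only, so an AWQ checkpoint is rejected on an
# NPU host before any weight file is opened.
def check_awq_backend(cuda_available: bool) -> None:
    if not cuda_available:
        raise ValueError("AWQ is only available on GPU")

try:
    check_awq_backend(cuda_available=False)  # an NPU host has no CUDA device
except ValueError as e:
    print(e)
```

This is why the fix below works: once the `quantization_config` block is gone from config.json, transformers never instantiates the AWQ quantizer at all.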

Original config.json content:
  "quantization_config": {
    "bits": 4,
    "group_size": 128,
    "modules_to_not_convert": null,
    "quant_method": "awq",
    "version": "gemm",
    "zero_point": true
  },
Remove that block and replace it with:  "quantize": "w8a8",
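The edit can also be scripted instead of done by hand. A minimal sketch (the `patch_config` helper name is mine; the path in the comment comes from the run log above):

```python
import json

def patch_config(path: str) -> None:
    """Drop the Hugging Face AWQ quantization_config block and mark the
    model for MindIE-style w8a8 quantization instead."""
    with open(path, "r", encoding="utf-8") as f:
        cfg = json.load(f)
    cfg.pop("quantization_config", None)  # remove the AWQ block if present
    cfg["quantize"] = "w8a8"              # marker read by the MindIE toolchain
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cfg, f, indent=2, ensure_ascii=False)

# Example (path from the log above):
# patch_config("/home/apulis-dev/teamdata/Qwen2.5-7B-Instruct-AWQ/config.json")
```

After patching, `config.json` should look like the listing below, with no `quantization_config` key left.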

root@dev-8242526b-01f2-4a54-b89d-f6d9c57c692d-qjhpf:/home/apulis-dev/teamdata# more Qwen2.5-7B-Instruct-AWQ/config.json
{
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "quantize":"w8a8",
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 131072,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.41.1",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 152064
}


root@dev-8242526b-01f2-4a54-b89d-f6d9c57c692d-qjhpf:/usr/local/Ascend/llm_model# python examples/models/qwen/convert_quant_weights.py --model_path /home/apulis-dev/teamdata/Qwen2.5-7B-Instruct-AWQ --save_directory /home/apulis-dev/teamdata/Qwen2.5-7B --w_bit 8 --a_bit 8 --disable_level L5 --device_type npu --calib_file /usr/local/Ascend/llm_model/examples/convert/model_slim/boolq.jsonl

2024-11-08 11:19:51,266 [INFO] [pid: 781496] env.py-55: {'use_ascend': True, 'max_memory_gb': None, 'reserved_memory_gb': 3, 'skip_warmup': False, 'visible_devices': '0', 'use_host_chooser': True, 'bind_cpu': True}
[W compiler_depend.ts:623] Warning: expandable_segments currently defaults to false. You can enable this feature by `export PYTORCH_NPU_ALLOC_CONF = expandable_segments:True`. (function operator())
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.20s/it]
Some weights of the model checkpoint at /home/apulis-dev/teamdata/Qwen2.5-7B-Instruct-AWQ were not used when initializing Qwen2ForCausalLM: ['model.layers.0.mlp.down_proj.qweight', 'model.layers.0.mlp.down_proj.qzeros', 'model.layers.0.mlp.down_proj.scales', 'model.layers.0.mlp.gate_proj.qweight', 'model.layers.0.mlp.gate_proj.qzeros', 'model.layers.0.mlp.gate_proj.scales', 'model.layers.0.mlp.up_proj.qweight', 'model.layers.0.mlp.up_proj.qzeros', 'model.layers.0.mlp.up_proj.scales', 'model.layers.0.self_attn.k_proj.qweight', 'model.layers.0.self_attn.k_proj.qzeros', 'model.layers.0.self_attn.k_proj.scales', 'model.layers.0.self_attn.o_proj.qweight', 'model.layers.0.self_attn.o_proj.qzeros', 'model.layers.0.self_attn.o_proj.scales', 'model.layers.0.self_attn.q_proj.qweight', 'model.layers.0.self_attn.q_proj.qzeros', 'model.layers.0.self_attn.q_proj.scales', 'model.layers.0.self_attn.v_proj.qweight', 'model.layers.0.self_attn.v_proj.qzeros', 'model.layers.0.self_attn.v_proj.scales', 'model.layers.1.mlp.down_proj.qweight', 'model.layers.1.mlp.down_proj.qzeros', 'model.layers.1.mlp.down_proj.scales', 'model.layers.1.mlp.gate_proj.qweight', 'model.layers.1.mlp.gate_proj.qzeros', 'model.layers.1.mlp.gate_proj.scales', 'model.layers.1.mlp.up_proj.qweight', 'model.layers.1.mlp.up_proj.qzeros', 'model.layers.1.mlp.up_proj.scales', 'model.layers.1.self_attn.k_proj.qweight', 'model.layers.1.self_attn.k_proj.qzeros', 'model.layers.1.self_attn.k_proj.scales', 'model.layers.1.self_attn.o_proj.qweight', 'model.layers.1.self_attn.o_proj.qzeros', 'model.layers.1.self_attn.o_proj.scales', 'model.layers.1.self_attn.q_proj.qweight', 'model.layers.1.self_attn.q_proj.qzeros', 'model.layers.1.self_attn.q_proj.scales', 'model.layers.1.self_attn.v_proj.qweight', 'model.layers.1.self_attn.v_proj.qzeros', 'model.layers.1.self_attn.v_proj.scales', 'model.layers.10.mlp.down_proj.qweight', 'model.layers.10.mlp.down_proj.qzeros', 'model.layers.10.mlp.down_proj.scales', 
'model.layers.10.mlp.gate_proj.qweight', 'model.layers.10.mlp.gate_proj.qzeros', 'model.layers.10.mlp.gate_proj.scales', 'model.layers.10.mlp.up_proj.qweight', 'model.layers.10.mlp.up_proj.qzeros', 'model.layers.10.mlp.up_proj.scales', 'model.layers.10.self_attn.k_proj.qweight', 'model.layers.10.self_attn.k_proj.qzeros', 'model.layers.10.self_attn.k_proj.scales', 'model.layers.10.self_attn.o_proj.qweight', 'model.layers.10.self_attn.o_proj.qzeros', 'model.layers.10.self_attn.o_proj.scales', 'model.layers.10.self_attn.q_proj.qweight', 'model.layers.10.self_attn.q_proj.qzeros', 'model.layers.10.self_attn.q_proj.scales', 'model.layers.10.self_attn.v_proj.qweight', 'model.layers.10.self_attn.v_proj.qzeros', 'model.layers.10.self_attn.v_proj.scales', 'model.layers.11.mlp.down_proj.qweight', 'model.layers.11.mlp.down_proj.qzeros', 'model.layers.11.mlp.down_proj.scales', 'model.layers.11.mlp.gate_proj.qweight', 'model.layers.11.mlp.gate_proj.qzeros', 'model.layers.11.mlp.gate_proj.scales', 'model.layers.11.mlp.up_proj.qweight', 'model.layers.11.mlp.up_proj.qzeros', 'model.layers.11.mlp.up_proj.scales', 'model.layers.11.self_attn.k_proj.qweight', 'model.layers.11.self_attn.k_proj.qzeros', 'model.layers.11.self_attn.k_proj.scales', 'model.layers.11.self_attn.o_proj.qweight', 'model.layers.11.self_attn.o_proj.qzeros', 'model.layers.11.self_attn.o_proj.scales', 'model.layers.11.self_attn.q_proj.qweight', 'model.layers.11.self_attn.q_proj.qzeros', 'model.layers.11.self_attn.q_proj.scales', 'model.layers.11.self_attn.v_proj.qweight', 'model.layers.11.self_attn.v_proj.qzeros', 'model.layers.11.self_attn.v_proj.scales', 'model.layers.12.mlp.down_proj.qweight', 'model.layers.12.mlp.down_proj.qzeros', 'model.layers.12.mlp.down_proj.scales', 'model.layers.12.mlp.gate_proj.qweight', 'model.layers.12.mlp.gate_proj.qzeros', 'model.layers.12.mlp.gate_proj.scales', 'model.layers.12.mlp.up_proj.qweight', 'model.layers.12.mlp.up_proj.qzeros', 'model.layers.12.mlp.up_proj.scales', 
'model.layers.12.self_attn.k_proj.qweight', 'model.layers.12.self_attn.k_proj.qzeros', 'model.layers.12.self_attn.k_proj.scales', 'model.layers.12.self_attn.o_proj.qweight', 'model.layers.12.self_attn.o_proj.qzeros', 'model.layers.12.self_attn.o_proj.scales', 'model.layers.12.self_attn.q_proj.qweight', 'model.layers.12.self_attn.q_proj.qzeros', 'model.layers.12.self_attn.q_proj.scales', 'model.layers.12.self_attn.v_proj.qweight', 'model.layers.12.self_attn.v_proj.qzeros', 'model.layers.12.self_attn.v_proj.scales', 'model.layers.13.mlp.down_proj.qweight', 'model.layers.13.mlp.down_proj.qzeros', 'model.layers.13.mlp.down_proj.scales', 'model.layers.13.mlp.gate_proj.qweight', 'model.layers.13.mlp.gate_proj.qzeros', 'model.layers.13.mlp.gate_proj.scales', 'model.layers.13.mlp.up_proj.qweight', 'model.layers.13.mlp.up_proj.qzeros', 'model.layers.13.mlp.up_proj.scales', 'model.layers.13.self_attn.k_proj.qweight', 'model.layers.13.self_attn.k_proj.qzeros', 'model.layers.13.self_attn.k_proj.scales', 'model.layers.13.self_attn.o_proj.qweight', 'model.layers.13.self_attn.o_proj.qzeros', 'model.layers.13.self_attn.o_proj.scales', 'model.layers.13.self_attn.q_proj.qweight', 'model.layers.13.self_attn.q_proj.qzeros', 'model.layers.13.self_attn.q_proj.scales', 'model.layers.13.self_attn.v_proj.qweight', 'model.layers.13.self_attn.v_proj.qzeros', 'model.layers.13.self_attn.v_proj.scales', 'model.layers.14.mlp.down_proj.qweight', 'model.layers.14.mlp.down_proj.qzeros', 'model.layers.14.mlp.down_proj.scales', 'model.layers.14.mlp.gate_proj.qweight', 'model.layers.14.mlp.gate_proj.qzeros', 'model.layers.14.mlp.gate_proj.scales', 'model.layers.14.mlp.up_proj.qweight', 'model.layers.14.mlp.up_proj.qzeros', 'model.layers.14.mlp.up_proj.scales', 'model.layers.14.self_attn.k_proj.qweight', 'model.layers.14.self_attn.k_proj.qzeros', 'model.layers.14.self_attn.k_proj.scales', 'model.layers.14.self_attn.o_proj.qweight', 'model.layers.14.self_attn.o_proj.qzeros', 
'model.layers.14.self_attn.o_proj.scales', 'model.layers.14.self_attn.q_proj.qweight', 'model.layers.14.self_attn.q_proj.qzeros', 'model.layers.14.self_attn.q_proj.scales', 'model.layers.14.self_attn.v_proj.qweight', 'model.layers.14.self_attn.v_proj.qzeros', 'model.layers.14.self_attn.v_proj.scales', 'model.layers.15.mlp.down_proj.qweight', 'model.layers.15.mlp.down_proj.qzeros', 'model.layers.15.mlp.down_proj.scales', 'model.layers.15.mlp.gate_proj.qweight', 'model.layers.15.mlp.gate_proj.qzeros', 'model.layers.15.mlp.gate_proj.scales', 'model.layers.15.mlp.up_proj.qweight', 'model.layers.15.mlp.up_proj.qzeros', 'model.layers.15.mlp.up_proj.scales', 'model.layers.15.self_attn.k_proj.qweight', 'model.layers.15.self_attn.k_proj.qzeros', 'model.layers.15.self_attn.k_proj.scales', 'model.layers.15.self_attn.o_proj.qweight', 'model.layers.15.self_attn.o_proj.qzeros', 'model.layers.15.self_attn.o_proj.scales', 'model.layers.15.self_attn.q_proj.qweight', 'model.layers.15.self_attn.q_proj.qzeros', 'model.layers.15.self_attn.q_proj.scales', 'model.layers.15.self_attn.v_proj.qweight', 'model.layers.15.self_attn.v_proj.qzeros', 'model.layers.15.self_attn.v_proj.scales', 'model.layers.16.mlp.down_proj.qweight', 'model.layers.16.mlp.down_proj.qzeros', 'model.layers.16.mlp.down_proj.scales', 'model.layers.16.mlp.gate_proj.qweight', 'model.layers.16.mlp.gate_proj.qzeros', 'model.layers.16.mlp.gate_proj.scales', 'model.layers.16.mlp.up_proj.qweight', 'model.layers.16.mlp.up_proj.qzeros', 'model.layers.16.mlp.up_proj.scales', 'model.layers.16.self_attn.k_proj.qweight', 'model.layers.16.self_attn.k_proj.qzeros', 'model.layers.16.self_attn.k_proj.scales', 'model.layers.16.self_attn.o_proj.qweight', 'model.layers.16.self_attn.o_proj.qzeros', 'model.layers.16.self_attn.o_proj.scales', 'model.layers.16.self_attn.q_proj.qweight', 'model.layers.16.self_attn.q_proj.qzeros', 'model.layers.16.self_attn.q_proj.scales', 'model.layers.16.self_attn.v_proj.qweight', 
'model.layers.16.self_attn.v_proj.qzeros', 'model.layers.16.self_attn.v_proj.scales', 'model.layers.17.mlp.down_proj.qweight', 'model.layers.17.mlp.down_proj.qzeros', 'model.layers.17.mlp.down_proj.scales', 'model.layers.17.mlp.gate_proj.qweight', 'model.layers.17.mlp.gate_proj.qzeros', 'model.layers.17.mlp.gate_proj.scales', 'model.layers.17.mlp.up_proj.qweight', 'model.layers.17.mlp.up_proj.qzeros', 'model.layers.17.mlp.up_proj.scales', 'model.layers.17.self_attn.k_proj.qweight', 'model.layers.17.self_attn.k_proj.qzeros', 'model.layers.17.self_attn.k_proj.scales', 'model.layers.17.self_attn.o_proj.qweight', 'model.layers.17.self_attn.o_proj.qzeros', 'model.layers.17.self_attn.o_proj.scales', 'model.layers.17.self_attn.q_proj.qweight', 'model.layers.17.self_attn.q_proj.qzeros', 'model.layers.17.self_attn.q_proj.scales', 'model.layers.17.self_attn.v_proj.qweight', 'model.layers.17.self_attn.v_proj.qzeros', 'model.layers.17.self_attn.v_proj.scales', 'model.layers.18.mlp.down_proj.qweight', 'model.layers.18.mlp.down_proj.qzeros', 'model.layers.18.mlp.down_proj.scales', 'model.layers.18.mlp.gate_proj.qweight', 'model.layers.18.mlp.gate_proj.qzeros', 'model.layers.18.mlp.gate_proj.scales', 'model.layers.18.mlp.up_proj.qweight', 'model.layers.18.mlp.up_proj.qzeros', 'model.layers.18.mlp.up_proj.scales', 'model.layers.18.self_attn.k_proj.qweight', 'model.layers.18.self_attn.k_proj.qzeros', 'model.layers.18.self_attn.k_proj.scales', 'model.layers.18.self_attn.o_proj.qweight', 'model.layers.18.self_attn.o_proj.qzeros', 'model.layers.18.self_attn.o_proj.scales', 'model.layers.18.self_attn.q_proj.qweight', 'model.layers.18.self_attn.q_proj.qzeros', 'model.layers.18.self_attn.q_proj.scales', 'model.layers.18.self_attn.v_proj.qweight', 'model.layers.18.self_attn.v_proj.qzeros', 'model.layers.18.self_attn.v_proj.scales', 'model.layers.19.mlp.down_proj.qweight', 'model.layers.19.mlp.down_proj.qzeros', 'model.layers.19.mlp.down_proj.scales', 
'model.layers.19.mlp.gate_proj.qweight', 'model.layers.19.mlp.gate_proj.qzeros', 'model.layers.19.mlp.gate_proj.scales', 'model.layers.19.mlp.up_proj.qweight', 'model.layers.19.mlp.up_proj.qzeros', 'model.layers.19.mlp.up_proj.scales', 'model.layers.19.self_attn.k_proj.qweight', 'model.layers.19.self_attn.k_proj.qzeros', 'model.layers.19.self_attn.k_proj.scales', 'model.layers.19.self_attn.o_proj.qweight', 'model.layers.19.self_attn.o_proj.qzeros', 'model.layers.19.self_attn.o_proj.scales', 'model.layers.19.self_attn.q_proj.qweight', 'model.layers.19.self_attn.q_proj.qzeros', 'model.layers.19.self_attn.q_proj.scales', 'model.layers.19.self_attn.v_proj.qweight', 'model.layers.19.self_attn.v_proj.qzeros', 'model.layers.19.self_attn.v_proj.scales', 'model.layers.2.mlp.down_proj.qweight', 'model.layers.2.mlp.down_proj.qzeros', 'model.layers.2.mlp.down_proj.scales', 'model.layers.2.mlp.gate_proj.qweight', 'model.layers.2.mlp.gate_proj.qzeros', 'model.layers.2.mlp.gate_proj.scales', 'model.layers.2.mlp.up_proj.qweight', 'model.layers.2.mlp.up_proj.qzeros', 'model.layers.2.mlp.up_proj.scales', 'model.layers.2.self_attn.k_proj.qweight', 'model.layers.2.self_attn.k_proj.qzeros', 'model.layers.2.self_attn.k_proj.scales', 'model.layers.2.self_attn.o_proj.qweight', 'model.layers.2.self_attn.o_proj.qzeros', 'model.layers.2.self_attn.o_proj.scales', 'model.layers.2.self_attn.q_proj.qweight', 'model.layers.2.self_attn.q_proj.qzeros', 'model.layers.2.self_attn.q_proj.scales', 'model.layers.2.self_attn.v_proj.qweight', 'model.layers.2.self_attn.v_proj.qzeros', 'model.layers.2.self_attn.v_proj.scales', 'model.layers.20.mlp.down_proj.qweight', 'model.layers.20.mlp.down_proj.qzeros', 'model.layers.20.mlp.down_proj.scales', 'model.layers.20.mlp.gate_proj.qweight', 'model.layers.20.mlp.gate_proj.qzeros', 'model.layers.20.mlp.gate_proj.scales', 'model.layers.20.mlp.up_proj.qweight', 'model.layers.20.mlp.up_proj.qzeros', 'model.layers.20.mlp.up_proj.scales', 
'model.layers.20.self_attn.k_proj.qweight', 'model.layers.20.self_attn.k_proj.qzeros', 'model.layers.20.self_attn.k_proj.scales', 'model.layers.20.self_attn.o_proj.qweight', 'model.layers.20.self_attn.o_proj.qzeros', 'model.layers.20.self_attn.o_proj.scales', 'model.layers.20.self_attn.q_proj.qweight', 'model.layers.20.self_attn.q_proj.qzeros', 'model.layers.20.self_attn.q_proj.scales', 'model.layers.20.self_attn.v_proj.qweight', 'model.layers.20.self_attn.v_proj.qzeros', 'model.layers.20.self_attn.v_proj.scales', 'model.layers.21.mlp.down_proj.qweight', 'model.layers.21.mlp.down_proj.qzeros', 'model.layers.21.mlp.down_proj.scales', 'model.layers.21.mlp.gate_proj.qweight', 'model.layers.21.mlp.gate_proj.qzeros', 'model.layers.21.mlp.gate_proj.scales', 'model.layers.21.mlp.up_proj.qweight', 'model.layers.21.mlp.up_proj.qzeros', 'model.layers.21.mlp.up_proj.scales', 'model.layers.21.self_attn.k_proj.qweight', 'model.layers.21.self_attn.k_proj.qzeros', 'model.layers.21.self_attn.k_proj.scales', 'model.layers.21.self_attn.o_proj.qweight', 'model.layers.21.self_attn.o_proj.qzeros', 'model.layers.21.self_attn.o_proj.scales', 'model.layers.21.self_attn.q_proj.qweight', 'model.layers.21.self_attn.q_proj.qzeros', 'model.layers.21.self_attn.q_proj.scales', 'model.layers.21.self_attn.v_proj.qweight', 'model.layers.21.self_attn.v_proj.qzeros', 'model.layers.21.self_attn.v_proj.scales', 'model.layers.22.mlp.down_proj.qweight', 'model.layers.22.mlp.down_proj.qzeros', 'model.layers.22.mlp.down_proj.scales', 'model.layers.22.mlp.gate_proj.qweight', 'model.layers.22.mlp.gate_proj.qzeros', 'model.layers.22.mlp.gate_proj.scales', 'model.layers.22.mlp.up_proj.qweight', 'model.layers.22.mlp.up_proj.qzeros', 'model.layers.22.mlp.up_proj.scales', 'model.layers.22.self_attn.k_proj.qweight', 'model.layers.22.self_attn.k_proj.qzeros', 'model.layers.22.self_attn.k_proj.scales', 'model.layers.22.self_attn.o_proj.qweight', 'model.layers.22.self_attn.o_proj.qzeros', 
'model.layers.22.self_attn.o_proj.scales', 'model.layers.22.self_attn.q_proj.qweight', 'model.layers.22.self_attn.q_proj.qzeros', 'model.layers.22.self_attn.q_proj.scales', 'model.layers.22.self_attn.v_proj.qweight', 'model.layers.22.self_attn.v_proj.qzeros', 'model.layers.22.self_attn.v_proj.scales', 'model.layers.23.mlp.down_proj.qweight', 'model.layers.23.mlp.down_proj.qzeros', 'model.layers.23.mlp.down_proj.scales', 'model.layers.23.mlp.gate_proj.qweight', 'model.layers.23.mlp.gate_proj.qzeros', 'model.layers.23.mlp.gate_proj.scales', 'model.layers.23.mlp.up_proj.qweight', 'model.layers.23.mlp.up_proj.qzeros', 'model.layers.23.mlp.up_proj.scales', 'model.layers.23.self_attn.k_proj.qweight', 'model.layers.23.self_attn.k_proj.qzeros', 'model.layers.23.self_attn.k_proj.scales', 'model.layers.23.self_attn.o_proj.qweight', 'model.layers.23.self_attn.o_proj.qzeros', 'model.layers.23.self_attn.o_proj.scales', 'model.layers.23.self_attn.q_proj.qweight', 'model.layers.23.self_attn.q_proj.qzeros', 'model.layers.23.self_attn.q_proj.scales', 'model.layers.23.self_attn.v_proj.qweight', 'model.layers.23.self_attn.v_proj.qzeros', 'model.layers.23.self_attn.v_proj.scales', 'model.layers.24.mlp.down_proj.qweight', 'model.layers.24.mlp.down_proj.qzeros', 'model.layers.24.mlp.down_proj.scales', 'model.layers.24.mlp.gate_proj.qweight', 'model.layers.24.mlp.gate_proj.qzeros', 'model.layers.24.mlp.gate_proj.scales', 'model.layers.24.mlp.up_proj.qweight', 'model.layers.24.mlp.up_proj.qzeros', 'model.layers.24.mlp.up_proj.scales', 'model.layers.24.self_attn.k_proj.qweight', 'model.layers.24.self_attn.k_proj.qzeros', 'model.layers.24.self_attn.k_proj.scales', 'model.layers.24.self_attn.o_proj.qweight', 'model.layers.24.self_attn.o_proj.qzeros', 'model.layers.24.self_attn.o_proj.scales', 'model.layers.24.self_attn.q_proj.qweight', 'model.layers.24.self_attn.q_proj.qzeros', 'model.layers.24.self_attn.q_proj.scales', 'model.layers.24.self_attn.v_proj.qweight', 
'model.layers.24.self_attn.v_proj.qzeros', 'model.layers.24.self_attn.v_proj.scales', 'model.layers.25.mlp.down_proj.qweight', 'model.layers.25.mlp.down_proj.qzeros', 'model.layers.25.mlp.down_proj.scales', 'model.layers.25.mlp.gate_proj.qweight', 'model.layers.25.mlp.gate_proj.qzeros', 'model.layers.25.mlp.gate_proj.scales', 'model.layers.25.mlp.up_proj.qweight', 'model.layers.25.mlp.up_proj.qzeros', 'model.layers.25.mlp.up_proj.scales', 'model.layers.25.self_attn.k_proj.qweight', 'model.layers.25.self_attn.k_proj.qzeros', 'model.layers.25.self_attn.k_proj.scales', 'model.layers.25.self_attn.o_proj.qweight', 'model.layers.25.self_attn.o_proj.qzeros', 'model.layers.25.self_attn.o_proj.scales', 'model.layers.25.self_attn.q_proj.qweight', 'model.layers.25.self_attn.q_proj.qzeros', 'model.layers.25.self_attn.q_proj.scales', 'model.layers.25.self_attn.v_proj.qweight', 'model.layers.25.self_attn.v_proj.qzeros', 'model.layers.25.self_attn.v_proj.scales', 'model.layers.26.mlp.down_proj.qweight', 'model.layers.26.mlp.down_proj.qzeros', 'model.layers.26.mlp.down_proj.scales', 'model.layers.26.mlp.gate_proj.qweight', 'model.layers.26.mlp.gate_proj.qzeros', 'model.layers.26.mlp.gate_proj.scales', 'model.layers.26.mlp.up_proj.qweight', 'model.layers.26.mlp.up_proj.qzeros', 'model.layers.26.mlp.up_proj.scales', 'model.layers.26.self_attn.k_proj.qweight', 'model.layers.26.self_attn.k_proj.qzeros', 'model.layers.26.self_attn.k_proj.scales', 'model.layers.26.self_attn.o_proj.qweight', 'model.layers.26.self_attn.o_proj.qzeros', 'model.layers.26.self_attn.o_proj.scales', 'model.layers.26.self_attn.q_proj.qweight', 'model.layers.26.self_attn.q_proj.qzeros', 'model.layers.26.self_attn.q_proj.scales', 'model.layers.26.self_attn.v_proj.qweight', 'model.layers.26.self_attn.v_proj.qzeros', 'model.layers.26.self_attn.v_proj.scales', 'model.layers.27.mlp.down_proj.qweight', 'model.layers.27.mlp.down_proj.qzeros', 'model.layers.27.mlp.down_proj.scales', 
'model.layers.27.mlp.gate_proj.qweight', 'model.layers.27.mlp.gate_proj.qzeros', 'model.layers.27.mlp.gate_proj.scales', 'model.layers.27.mlp.up_proj.qweight', 'model.layers.27.mlp.up_proj.qzeros', 'model.layers.27.mlp.up_proj.scales', 'model.layers.27.self_attn.k_proj.qweight', 'model.layers.27.self_attn.k_proj.qzeros', 'model.layers.27.self_attn.k_proj.scales', 'model.layers.27.self_attn.o_proj.qweight', 'model.layers.27.self_attn.o_proj.qzeros', 'model.layers.27.self_attn.o_proj.scales', 'model.layers.27.self_attn.q_proj.qweight', 'model.layers.27.self_attn.q_proj.qzeros', 'model.layers.27.self_attn.q_proj.scales', 'model.layers.27.self_attn.v_proj.qweight', 'model.layers.27.self_attn.v_proj.qzeros', 'model.layers.27.self_attn.v_proj.scales', 'model.layers.3.mlp.down_proj.qweight', 'model.layers.3.mlp.down_proj.qzeros', 'model.layers.3.mlp.down_proj.scales', 'model.layers.3.mlp.gate_proj.qweight', 'model.layers.3.mlp.gate_proj.qzeros', 'model.layers.3.mlp.gate_proj.scales', 'model.layers.3.mlp.up_proj.qweight', 'model.layers.3.mlp.up_proj.qzeros', 'model.layers.3.mlp.up_proj.scales', 'model.layers.3.self_attn.k_proj.qweight', 'model.layers.3.self_attn.k_proj.qzeros', 'model.layers.3.self_attn.k_proj.scales', 'model.layers.3.self_attn.o_proj.qweight', 'model.layers.3.self_attn.o_proj.qzeros', 'model.layers.3.self_attn.o_proj.scales', 'model.layers.3.self_attn.q_proj.qweight', 'model.layers.3.self_attn.q_proj.qzeros', 'model.layers.3.self_attn.q_proj.scales', 'model.layers.3.self_attn.v_proj.qweight', 'model.layers.3.self_attn.v_proj.qzeros', 'model.layers.3.self_attn.v_proj.scales', 'model.layers.4.mlp.down_proj.qweight', 'model.layers.4.mlp.down_proj.qzeros', 'model.layers.4.mlp.down_proj.scales', 'model.layers.4.mlp.gate_proj.qweight', 'model.layers.4.mlp.gate_proj.qzeros', 'model.layers.4.mlp.gate_proj.scales', 'model.layers.4.mlp.up_proj.qweight', 'model.layers.4.mlp.up_proj.qzeros', 'model.layers.4.mlp.up_proj.scales', 
'model.layers.4.self_attn.k_proj.qweight', 'model.layers.4.self_attn.k_proj.qzeros', 'model.layers.4.self_attn.k_proj.scales', 'model.layers.4.self_attn.o_proj.qweight', 'model.layers.4.self_attn.o_proj.qzeros', 'model.layers.4.self_attn.o_proj.scales', 'model.layers.4.self_attn.q_proj.qweight', 'model.layers.4.self_attn.q_proj.qzeros', 'model.layers.4.self_attn.q_proj.scales', 'model.layers.4.self_attn.v_proj.qweight', 'model.layers.4.self_attn.v_proj.qzeros', 'model.layers.4.self_attn.v_proj.scales', 'model.layers.5.mlp.down_proj.qweight', 'model.layers.5.mlp.down_proj.qzeros', 'model.layers.5.mlp.down_proj.scales', 'model.layers.5.mlp.gate_proj.qweight', 'model.layers.5.mlp.gate_proj.qzeros', 'model.layers.5.mlp.gate_proj.scales', 'model.layers.5.mlp.up_proj.qweight', 'model.layers.5.mlp.up_proj.qzeros', 'model.layers.5.mlp.up_proj.scales', 'model.layers.5.self_attn.k_proj.qweight', 'model.layers.5.self_attn.k_proj.qzeros', 'model.layers.5.self_attn.k_proj.scales', 'model.layers.5.self_attn.o_proj.qweight', 'model.layers.5.self_attn.o_proj.qzeros', 'model.layers.5.self_attn.o_proj.scales', 'model.layers.5.self_attn.q_proj.qweight', 'model.layers.5.self_attn.q_proj.qzeros', 'model.layers.5.self_attn.q_proj.scales', 'model.layers.5.self_attn.v_proj.qweight', 'model.layers.5.self_attn.v_proj.qzeros', 'model.layers.5.self_attn.v_proj.scales', 'model.layers.6.mlp.down_proj.qweight', 'model.layers.6.mlp.down_proj.qzeros', 'model.layers.6.mlp.down_proj.scales', 'model.layers.6.mlp.gate_proj.qweight', 'model.layers.6.mlp.gate_proj.qzeros', 'model.layers.6.mlp.gate_proj.scales', 'model.layers.6.mlp.up_proj.qweight', 'model.layers.6.mlp.up_proj.qzeros', 'model.layers.6.mlp.up_proj.scales', 'model.layers.6.self_attn.k_proj.qweight', 'model.layers.6.self_attn.k_proj.qzeros', 'model.layers.6.self_attn.k_proj.scales', 'model.layers.6.self_attn.o_proj.qweight', 'model.layers.6.self_attn.o_proj.qzeros', 'model.layers.6.self_attn.o_proj.scales', 
'model.layers.6.self_attn.q_proj.qweight', 'model.layers.6.self_attn.q_proj.qzeros', 'model.layers.6.self_attn.q_proj.scales', 'model.layers.6.self_attn.v_proj.qweight', 'model.layers.6.self_attn.v_proj.qzeros', 'model.layers.6.self_attn.v_proj.scales', 'model.layers.7.mlp.down_proj.qweight', 'model.layers.7.mlp.down_proj.qzeros', 'model.layers.7.mlp.down_proj.scales', 'model.layers.7.mlp.gate_proj.qweight', 'model.layers.7.mlp.gate_proj.qzeros', 'model.layers.7.mlp.gate_proj.scales', 'model.layers.7.mlp.up_proj.qweight', 'model.layers.7.mlp.up_proj.qzeros', 'model.layers.7.mlp.up_proj.scales', 'model.layers.7.self_attn.k_proj.qweight', 'model.layers.7.self_attn.k_proj.qzeros', 'model.layers.7.self_attn.k_proj.scales', 'model.layers.7.self_attn.o_proj.qweight', 'model.layers.7.self_attn.o_proj.qzeros', 'model.layers.7.self_attn.o_proj.scales', 'model.layers.7.self_attn.q_proj.qweight', 'model.layers.7.self_attn.q_proj.qzeros', 'model.layers.7.self_attn.q_proj.scales', 'model.layers.7.self_attn.v_proj.qweight', 'model.layers.7.self_attn.v_proj.qzeros', 'model.layers.7.self_attn.v_proj.scales', 'model.layers.8.mlp.down_proj.qweight', 'model.layers.8.mlp.down_proj.qzeros', 'model.layers.8.mlp.down_proj.scales', 'model.layers.8.mlp.gate_proj.qweight', 'model.layers.8.mlp.gate_proj.qzeros', 'model.layers.8.mlp.gate_proj.scales', 'model.layers.8.mlp.up_proj.qweight', 'model.layers.8.mlp.up_proj.qzeros', 'model.layers.8.mlp.up_proj.scales', 'model.layers.8.self_attn.k_proj.qweight', 'model.layers.8.self_attn.k_proj.qzeros', 'model.layers.8.self_attn.k_proj.scales', 'model.layers.8.self_attn.o_proj.qweight', 'model.layers.8.self_attn.o_proj.qzeros', 'model.layers.8.self_attn.o_proj.scales', 'model.layers.8.self_attn.q_proj.qweight', 'model.layers.8.self_attn.q_proj.qzeros', 'model.layers.8.self_attn.q_proj.scales', 'model.layers.8.self_attn.v_proj.qweight', 'model.layers.8.self_attn.v_proj.qzeros', 'model.layers.8.self_attn.v_proj.scales', 
'model.layers.9.mlp.down_proj.qweight', 'model.layers.9.mlp.down_proj.qzeros', 'model.layers.9.mlp.down_proj.scales', 'model.layers.9.mlp.gate_proj.qweight', 'model.layers.9.mlp.gate_proj.qzeros', 'model.layers.9.mlp.gate_proj.scales', 'model.layers.9.mlp.up_proj.qweight', 'model.layers.9.mlp.up_proj.qzeros', 'model.layers.9.mlp.up_proj.scales', 'model.layers.9.self_attn.k_proj.qweight', 'model.layers.9.self_attn.k_proj.qzeros', 'model.layers.9.self_attn.k_proj.scales', 'model.layers.9.self_attn.o_proj.qweight', 'model.layers.9.self_attn.o_proj.qzeros', 'model.layers.9.self_attn.o_proj.scales', 'model.layers.9.self_attn.q_proj.qweight', 'model.layers.9.self_attn.q_proj.qzeros', 'model.layers.9.self_attn.q_proj.scales', 'model.layers.9.self_attn.v_proj.qweight', 'model.layers.9.self_attn.v_proj.qzeros', 'model.layers.9.self_attn.v_proj.scales']
- This IS expected if you are initializing Qwen2ForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Qwen2ForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of Qwen2ForCausalLM were not initialized from the model checkpoint at /home/apulis-dev/teamdata/Qwen2.5-7B-Instruct-AWQ and are newly initialized: ['model.layers.0.mlp.down_proj.weight', 'model.layers.0.mlp.gate_proj.weight', 'model.layers.0.mlp.up_proj.weight', 'model.layers.0.self_attn.k_proj.weight', 'model.layers.0.self_attn.o_proj.weight', 'model.layers.0.self_attn.q_proj.weight', 'model.layers.0.self_attn.v_proj.weight', 'model.layers.1.mlp.down_proj.weight', 'model.layers.1.mlp.gate_proj.weight', 'model.layers.1.mlp.up_proj.weight', 'model.layers.1.self_attn.k_proj.weight', 'model.layers.1.self_attn.o_proj.weight', 'model.layers.1.self_attn.q_proj.weight', 'model.layers.1.self_attn.v_proj.weight', 'model.layers.10.mlp.down_proj.weight', 'model.layers.10.mlp.gate_proj.weight', 'model.layers.10.mlp.up_proj.weight', 'model.layers.10.self_attn.k_proj.weight', 'model.layers.10.self_attn.o_proj.weight', 'model.layers.10.self_attn.q_proj.weight', 'model.layers.10.self_attn.v_proj.weight', 'model.layers.11.mlp.down_proj.weight', 'model.layers.11.mlp.gate_proj.weight', 'model.layers.11.mlp.up_proj.weight', 'model.layers.11.self_attn.k_proj.weight', 'model.layers.11.self_attn.o_proj.weight', 'model.layers.11.self_attn.q_proj.weight', 'model.layers.11.self_attn.v_proj.weight', 'model.layers.12.mlp.down_proj.weight', 'model.layers.12.mlp.gate_proj.weight', 'model.layers.12.mlp.up_proj.weight', 'model.layers.12.self_attn.k_proj.weight', 'model.layers.12.self_attn.o_proj.weight', 'model.layers.12.self_attn.q_proj.weight', 'model.layers.12.self_attn.v_proj.weight', 'model.layers.13.mlp.down_proj.weight', 'model.layers.13.mlp.gate_proj.weight', 'model.layers.13.mlp.up_proj.weight', 'model.layers.13.self_attn.k_proj.weight', 'model.layers.13.self_attn.o_proj.weight', 'model.layers.13.self_attn.q_proj.weight', 'model.layers.13.self_attn.v_proj.weight', 'model.layers.14.mlp.down_proj.weight', 'model.layers.14.mlp.gate_proj.weight', 
'model.layers.14.mlp.up_proj.weight', 'model.layers.14.self_attn.k_proj.weight', 'model.layers.14.self_attn.o_proj.weight', 'model.layers.14.self_attn.q_proj.weight', 'model.layers.14.self_attn.v_proj.weight', 'model.layers.15.mlp.down_proj.weight', 'model.layers.15.mlp.gate_proj.weight', 'model.layers.15.mlp.up_proj.weight', 'model.layers.15.self_attn.k_proj.weight', 'model.layers.15.self_attn.o_proj.weight', 'model.layers.15.self_attn.q_proj.weight', 'model.layers.15.self_attn.v_proj.weight', 'model.layers.16.mlp.down_proj.weight', 'model.layers.16.mlp.gate_proj.weight', 'model.layers.16.mlp.up_proj.weight', 'model.layers.16.self_attn.k_proj.weight', 'model.layers.16.self_attn.o_proj.weight', 'model.layers.16.self_attn.q_proj.weight', 'model.layers.16.self_attn.v_proj.weight', 'model.layers.17.mlp.down_proj.weight', 'model.layers.17.mlp.gate_proj.weight', 'model.layers.17.mlp.up_proj.weight', 'model.layers.17.self_attn.k_proj.weight', 'model.layers.17.self_attn.o_proj.weight', 'model.layers.17.self_attn.q_proj.weight', 'model.layers.17.self_attn.v_proj.weight', 'model.layers.18.mlp.down_proj.weight', 'model.layers.18.mlp.gate_proj.weight', 'model.layers.18.mlp.up_proj.weight', 'model.layers.18.self_attn.k_proj.weight', 'model.layers.18.self_attn.o_proj.weight', 'model.layers.18.self_attn.q_proj.weight', 'model.layers.18.self_attn.v_proj.weight', 'model.layers.19.mlp.down_proj.weight', 'model.layers.19.mlp.gate_proj.weight', 'model.layers.19.mlp.up_proj.weight', 'model.layers.19.self_attn.k_proj.weight', 'model.layers.19.self_attn.o_proj.weight', 'model.layers.19.self_attn.q_proj.weight', 'model.layers.19.self_attn.v_proj.weight', 'model.layers.2.mlp.down_proj.weight', 'model.layers.2.mlp.gate_proj.weight', 'model.layers.2.mlp.up_proj.weight', 'model.layers.2.self_attn.k_proj.weight', 'model.layers.2.self_attn.o_proj.weight', 'model.layers.2.self_attn.q_proj.weight', 'model.layers.2.self_attn.v_proj.weight', 'model.layers.20.mlp.down_proj.weight', 
'model.layers.20.mlp.gate_proj.weight', 'model.layers.20.mlp.up_proj.weight', 'model.layers.20.self_attn.k_proj.weight', 'model.layers.20.self_attn.o_proj.weight', 'model.layers.20.self_attn.q_proj.weight', 'model.layers.20.self_attn.v_proj.weight', 'model.layers.21.mlp.down_proj.weight', 'model.layers.21.mlp.gate_proj.weight', 'model.layers.21.mlp.up_proj.weight', 'model.layers.21.self_attn.k_proj.weight', 'model.layers.21.self_attn.o_proj.weight', 'model.layers.21.self_attn.q_proj.weight', 'model.layers.21.self_attn.v_proj.weight', 'model.layers.22.mlp.down_proj.weight', 'model.layers.22.mlp.gate_proj.weight', 'model.layers.22.mlp.up_proj.weight', 'model.layers.22.self_attn.k_proj.weight', 'model.layers.22.self_attn.o_proj.weight', 'model.layers.22.self_attn.q_proj.weight', 'model.layers.22.self_attn.v_proj.weight', 'model.layers.23.mlp.down_proj.weight', 'model.layers.23.mlp.gate_proj.weight', 'model.layers.23.mlp.up_proj.weight', 'model.layers.23.self_attn.k_proj.weight', 'model.layers.23.self_attn.o_proj.weight', 'model.layers.23.self_attn.q_proj.weight', 'model.layers.23.self_attn.v_proj.weight', 'model.layers.24.mlp.down_proj.weight', 'model.layers.24.mlp.gate_proj.weight', 'model.layers.24.mlp.up_proj.weight', 'model.layers.24.self_attn.k_proj.weight', 'model.layers.24.self_attn.o_proj.weight', 'model.layers.24.self_attn.q_proj.weight', 'model.layers.24.self_attn.v_proj.weight', 'model.layers.25.mlp.down_proj.weight', 'model.layers.25.mlp.gate_proj.weight', 'model.layers.25.mlp.up_proj.weight', 'model.layers.25.self_attn.k_proj.weight', 'model.layers.25.self_attn.o_proj.weight', 'model.layers.25.self_attn.q_proj.weight', 'model.layers.25.self_attn.v_proj.weight', 'model.layers.26.mlp.down_proj.weight', 'model.layers.26.mlp.gate_proj.weight', 'model.layers.26.mlp.up_proj.weight', 'model.layers.26.self_attn.k_proj.weight', 'model.layers.26.self_attn.o_proj.weight', 'model.layers.26.self_attn.q_proj.weight', 'model.layers.26.self_attn.v_proj.weight', 
'model.layers.27.mlp.down_proj.weight', 'model.layers.27.mlp.gate_proj.weight', 'model.layers.27.mlp.up_proj.weight', 'model.layers.27.self_attn.k_proj.weight', 'model.layers.27.self_attn.o_proj.weight', 'model.layers.27.self_attn.q_proj.weight', 'model.layers.27.self_attn.v_proj.weight', 'model.layers.3.mlp.down_proj.weight', 'model.layers.3.mlp.gate_proj.weight', 'model.layers.3.mlp.up_proj.weight', 'model.layers.3.self_attn.k_proj.weight', 'model.layers.3.self_attn.o_proj.weight', 'model.layers.3.self_attn.q_proj.weight', 'model.layers.3.self_attn.v_proj.weight', 'model.layers.4.mlp.down_proj.weight', 'model.layers.4.mlp.gate_proj.weight', 'model.layers.4.mlp.up_proj.weight', 'model.layers.4.self_attn.k_proj.weight', 'model.layers.4.self_attn.o_proj.weight', 'model.layers.4.self_attn.q_proj.weight', 'model.layers.4.self_attn.v_proj.weight', 'model.layers.5.mlp.down_proj.weight', 'model.layers.5.mlp.gate_proj.weight', 'model.layers.5.mlp.up_proj.weight', 'model.layers.5.self_attn.k_proj.weight', 'model.layers.5.self_attn.o_proj.weight', 'model.layers.5.self_attn.q_proj.weight', 'model.layers.5.self_attn.v_proj.weight', 'model.layers.6.mlp.down_proj.weight', 'model.layers.6.mlp.gate_proj.weight', 'model.layers.6.mlp.up_proj.weight', 'model.layers.6.self_attn.k_proj.weight', 'model.layers.6.self_attn.o_proj.weight', 'model.layers.6.self_attn.q_proj.weight', 'model.layers.6.self_attn.v_proj.weight', 'model.layers.7.mlp.down_proj.weight', 'model.layers.7.mlp.gate_proj.weight', 'model.layers.7.mlp.up_proj.weight', 'model.layers.7.self_attn.k_proj.weight', 'model.layers.7.self_attn.o_proj.weight', 'model.layers.7.self_attn.q_proj.weight', 'model.layers.7.self_attn.v_proj.weight', 'model.layers.8.mlp.down_proj.weight', 'model.layers.8.mlp.gate_proj.weight', 'model.layers.8.mlp.up_proj.weight', 'model.layers.8.self_attn.k_proj.weight', 'model.layers.8.self_attn.o_proj.weight', 'model.layers.8.self_attn.q_proj.weight', 'model.layers.8.self_attn.v_proj.weight', 
'model.layers.9.mlp.down_proj.weight', 'model.layers.9.mlp.gate_proj.weight', 'model.layers.9.mlp.up_proj.weight', 'model.layers.9.self_attn.k_proj.weight', 'model.layers.9.self_attn.o_proj.weight', 'model.layers.9.self_attn.q_proj.weight', 'model.layers.9.self_attn.v_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
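The long "newly initialized" list above is actually expected here: an AWQ checkpoint stores each quantized Linear as packed `qweight`/`qzeros`/`scales` tensors (the usual AutoAWQ layout), not as a dense `weight` tensor. When the quantization config is ignored and the checkpoint is loaded into a plain `Qwen2ForCausalLM`, the expected `*.weight` keys simply don't exist, so transformers re-initializes them — and the later assertion `weight model.layers.0.self_attn.q_proj.weight does not exist` is the same mismatch surfacing at load time. A minimal sketch of this key mismatch (the key names are illustrative, not read from the real file):

```python
# AWQ checkpoints pack each Linear into qweight/qzeros/scales instead of
# a dense `weight` tensor, so a plain model class finds no matching key.

def missing_dense_weights(checkpoint_keys, expected_keys):
    """Return expected dense-weight keys that are absent from the checkpoint."""
    present = set(checkpoint_keys)
    return [k for k in expected_keys if k not in present]

# Illustrative keys following the common AutoAWQ naming convention:
ckpt = [
    "model.layers.0.self_attn.q_proj.qweight",
    "model.layers.0.self_attn.q_proj.qzeros",
    "model.layers.0.self_attn.q_proj.scales",
]
expected = ["model.layers.0.self_attn.q_proj.weight"]

print(missing_dense_weights(ckpt, expected))
# → ['model.layers.0.self_attn.q_proj.weight']  (re-initialized by transformers)
```

This is why the fix is to start from the original FP16 checkpoint and quantize with msmodelslim, rather than feeding the AWQ checkpoint to the NPU stack.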
2024-11-08 11:34:06,314 - msmodelslim-logger - INFO - Automatic disable last layer name: lm_head
feature process: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:02<00:00,  2.48it/s]
('Warning: torch.save with "_use_new_zipfile_serialization = False" is not recommended for npu tensor, which may bring unexpected errors and hopefully set "_use_new_zipfile_serialization = True"', 'if it is necessary to use this, please convert the npu tensor to cpu tensor for saving')
2024-11-08 11:34:09,205 - msmodelslim-logger - INFO - The number of nn.Linear and nn.Conv2d is 197.
2024-11-08 11:34:09,206 - msmodelslim-logger - INFO - Automatic disabled layer names are:
model.layers.0.self_attn.k_proj
model.layers.0.self_attn.q_proj
model.layers.0.self_attn.v_proj
model.layers.26.self_attn.k_proj
model.layers.26.self_attn.q_proj
2024-11-08 11:34:09,206 - msmodelslim-logger - INFO - roll back:lm_head
model.layers.0.mlp.down_proj
model.layers.0.self_attn.k_proj
model.layers.0.self_attn.q_proj
model.layers.0.self_attn.v_proj
model.layers.1.mlp.down_proj
model.layers.10.mlp.down_proj
model.layers.11.mlp.down_proj
model.layers.12.mlp.down_proj
model.layers.13.mlp.down_proj
model.layers.14.mlp.down_proj
model.layers.15.mlp.down_proj
model.layers.16.mlp.down_proj
model.layers.17.mlp.down_proj
model.layers.18.mlp.down_proj
model.layers.19.mlp.down_proj
model.layers.2.mlp.down_proj
model.layers.20.mlp.down_proj
model.layers.21.mlp.down_proj
model.layers.22.mlp.down_proj
model.layers.23.mlp.down_proj
model.layers.24.mlp.down_proj
model.layers.25.mlp.down_proj
model.layers.26.mlp.down_proj
model.layers.26.self_attn.k_proj
model.layers.26.self_attn.q_proj
model.layers.27.mlp.down_proj
model.layers.3.mlp.down_proj
model.layers.4.mlp.down_proj
model.layers.5.mlp.down_proj
model.layers.6.mlp.down_proj
model.layers.7.mlp.down_proj
model.layers.8.mlp.down_proj
model.layers.9.mlp.down_proj
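The rollback list above shows what `--disable_level L5` did: msmodelslim automatically keeps the most quantization-sensitive layers in floating point — here every `mlp.down_proj` (layers 0-27), a few attention projections in layers 0 and 26, plus `lm_head`. A small stdlib sketch that just summarizes the logged names by module type (parsing the log, not reproducing msmodelslim internals):

```python
from collections import Counter

# Rolled-back layer names copied from the log above, grouped by module type
# to see which layers --disable_level L5 kept in float.
rolled_back = [
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.26.self_attn.k_proj",
    "model.layers.26.self_attn.q_proj",
] + [f"model.layers.{i}.mlp.down_proj" for i in range(28)]

by_type = Counter(name.rsplit(".", 1)[-1] for name in rolled_back)
print(by_type)
# down_proj dominates: 28 of the 33 rolled-back layers
```

Keeping `down_proj` in float is a common pattern in W8A8 schemes, since its input (the SwiGLU activation) tends to have the widest outliers.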
2024-11-08 11:34:09,569 - msmodelslim-logger - INFO - use min-max observer:model.layers.0.self_attn.o_proj.quant_input, range_parm:8.5546875
2024-11-08 11:34:09,570 - msmodelslim-logger - INFO - use min-max observer:model.layers.0.mlp.gate_proj.quant_input, range_parm:5.7109375
2024-11-08 11:34:09,570 - msmodelslim-logger - INFO - use min-max observer:model.layers.0.mlp.up_proj.quant_input, range_parm:5.7109375
2024-11-08 11:34:09,570 - msmodelslim-logger - INFO - use min-max observer:model.layers.1.self_attn.q_proj.quant_input, range_parm:11.125
2024-11-08 11:34:09,570 - msmodelslim-logger - INFO - use min-max observer:model.layers.1.self_attn.k_proj.quant_input, range_parm:11.125
2024-11-08 11:34:09,571 - msmodelslim-logger - INFO - use min-max observer:model.layers.1.self_attn.v_proj.quant_input, range_parm:11.125
2024-11-08 11:34:09,571 - msmodelslim-logger - INFO - use min-max observer:model.layers.1.self_attn.o_proj.quant_input, range_parm:5.3515625
2024-11-08 11:34:09,571 - msmodelslim-logger - INFO - use min-max observer:model.layers.1.mlp.gate_proj.quant_input, range_parm:7.20703125
2024-11-08 11:34:09,571 - msmodelslim-logger - INFO - use min-max observer:model.layers.1.mlp.up_proj.quant_input, range_parm:7.20703125
2024-11-08 11:34:09,572 - msmodelslim-logger - INFO - use min-max observer:model.layers.2.self_attn.q_proj.quant_input, range_parm:7.8046875
2024-11-08 11:34:09,572 - msmodelslim-logger - INFO - use min-max observer:model.layers.2.self_attn.k_proj.quant_input, range_parm:7.8046875
2024-11-08 11:34:09,572 - msmodelslim-logger - INFO - use min-max observer:model.layers.2.self_attn.v_proj.quant_input, range_parm:7.8046875
2024-11-08 11:34:09,572 - msmodelslim-logger - INFO - use min-max observer:model.layers.2.self_attn.o_proj.quant_input, range_parm:8.21875
2024-11-08 11:34:09,573 - msmodelslim-logger - INFO - use min-max observer:model.layers.2.mlp.gate_proj.quant_input, range_parm:7.140625
2024-11-08 11:34:09,573 - msmodelslim-logger - INFO - use min-max observer:model.layers.2.mlp.up_proj.quant_input, range_parm:7.140625
2024-11-08 11:34:09,573 - msmodelslim-logger - INFO - use min-max observer:model.layers.3.self_attn.q_proj.quant_input, range_parm:8.6484375
2024-11-08 11:34:09,573 - msmodelslim-logger - INFO - use min-max observer:model.layers.3.self_attn.k_proj.quant_input, range_parm:8.6484375
2024-11-08 11:34:09,573 - msmodelslim-logger - INFO - use min-max observer:model.layers.3.self_attn.v_proj.quant_input, range_parm:8.6484375
2024-11-08 11:34:09,574 - msmodelslim-logger - INFO - use min-max observer:model.layers.3.self_attn.o_proj.quant_input, range_parm:4.59375
2024-11-08 11:34:09,574 - msmodelslim-logger - INFO - use min-max observer:model.layers.3.mlp.gate_proj.quant_input, range_parm:12.3671875
2024-11-08 11:34:09,574 - msmodelslim-logger - INFO - use min-max observer:model.layers.3.mlp.up_proj.quant_input, range_parm:12.3671875
2024-11-08 11:34:09,574 - msmodelslim-logger - INFO - use min-max observer:model.layers.4.self_attn.q_proj.quant_input, range_parm:9.7734375
2024-11-08 11:34:09,575 - msmodelslim-logger - INFO - use min-max observer:model.layers.4.self_attn.k_proj.quant_input, range_parm:9.7734375
2024-11-08 11:34:09,575 - msmodelslim-logger - INFO - use min-max observer:model.layers.4.self_attn.v_proj.quant_input, range_parm:9.7734375
2024-11-08 11:34:09,575 - msmodelslim-logger - INFO - use min-max observer:model.layers.4.self_attn.o_proj.quant_input, range_parm:4.63671875
2024-11-08 11:34:09,575 - msmodelslim-logger - INFO - use min-max observer:model.layers.4.mlp.gate_proj.quant_input, range_parm:5.859375
2024-11-08 11:34:09,575 - msmodelslim-logger - INFO - use min-max observer:model.layers.4.mlp.up_proj.quant_input, range_parm:5.859375
2024-11-08 11:34:09,576 - msmodelslim-logger - INFO - use min-max observer:model.layers.5.self_attn.q_proj.quant_input, range_parm:7.984375
2024-11-08 11:34:09,576 - msmodelslim-logger - INFO - use min-max observer:model.layers.5.self_attn.k_proj.quant_input, range_parm:7.984375
2024-11-08 11:34:09,576 - msmodelslim-logger - INFO - use min-max observer:model.layers.5.self_attn.v_proj.quant_input, range_parm:7.984375
2024-11-08 11:34:09,576 - msmodelslim-logger - INFO - use min-max observer:model.layers.5.self_attn.o_proj.quant_input, range_parm:4.60546875
2024-11-08 11:34:09,577 - msmodelslim-logger - INFO - use min-max observer:model.layers.5.mlp.gate_proj.quant_input, range_parm:8.8671875
2024-11-08 11:34:09,577 - msmodelslim-logger - INFO - use min-max observer:model.layers.5.mlp.up_proj.quant_input, range_parm:8.8671875
2024-11-08 11:34:09,577 - msmodelslim-logger - INFO - use min-max observer:model.layers.6.self_attn.q_proj.quant_input, range_parm:9.1484375
2024-11-08 11:34:09,577 - msmodelslim-logger - INFO - use min-max observer:model.layers.6.self_attn.k_proj.quant_input, range_parm:9.1484375
2024-11-08 11:34:09,577 - msmodelslim-logger - INFO - use min-max observer:model.layers.6.self_attn.v_proj.quant_input, range_parm:9.1484375
2024-11-08 11:34:09,578 - msmodelslim-logger - INFO - use min-max observer:model.layers.6.self_attn.o_proj.quant_input, range_parm:4.4453125
2024-11-08 11:34:09,578 - msmodelslim-logger - INFO - use min-max observer:model.layers.6.mlp.gate_proj.quant_input, range_parm:6.97265625
2024-11-08 11:34:09,578 - msmodelslim-logger - INFO - use min-max observer:model.layers.6.mlp.up_proj.quant_input, range_parm:6.97265625
2024-11-08 11:34:09,578 - msmodelslim-logger - INFO - use min-max observer:model.layers.7.self_attn.q_proj.quant_input, range_parm:7.83203125
2024-11-08 11:34:09,579 - msmodelslim-logger - INFO - use min-max observer:model.layers.7.self_attn.k_proj.quant_input, range_parm:7.83203125
2024-11-08 11:34:09,579 - msmodelslim-logger - INFO - use min-max observer:model.layers.7.self_attn.v_proj.quant_input, range_parm:7.83203125
2024-11-08 11:34:09,579 - msmodelslim-logger - INFO - use min-max observer:model.layers.7.self_attn.o_proj.quant_input, range_parm:5.5390625
2024-11-08 11:34:09,579 - msmodelslim-logger - INFO - use min-max observer:model.layers.7.mlp.gate_proj.quant_input, range_parm:7.8984375
2024-11-08 11:34:09,580 - msmodelslim-logger - INFO - use min-max observer:model.layers.7.mlp.up_proj.quant_input, range_parm:7.8984375
2024-11-08 11:34:09,580 - msmodelslim-logger - INFO - use min-max observer:model.layers.8.self_attn.q_proj.quant_input, range_parm:8.71875
2024-11-08 11:34:09,580 - msmodelslim-logger - INFO - use min-max observer:model.layers.8.self_attn.k_proj.quant_input, range_parm:8.71875
2024-11-08 11:34:09,580 - msmodelslim-logger - INFO - use min-max observer:model.layers.8.self_attn.v_proj.quant_input, range_parm:8.71875
2024-11-08 11:34:09,580 - msmodelslim-logger - INFO - use min-max observer:model.layers.8.self_attn.o_proj.quant_input, range_parm:5.9140625
2024-11-08 11:34:09,581 - msmodelslim-logger - INFO - use min-max observer:model.layers.8.mlp.gate_proj.quant_input, range_parm:9.109375
2024-11-08 11:34:09,581 - msmodelslim-logger - INFO - use min-max observer:model.layers.8.mlp.up_proj.quant_input, range_parm:9.109375
2024-11-08 11:34:09,581 - msmodelslim-logger - INFO - use min-max observer:model.layers.9.self_attn.q_proj.quant_input, range_parm:6.45703125
2024-11-08 11:34:09,581 - msmodelslim-logger - INFO - use min-max observer:model.layers.9.self_attn.k_proj.quant_input, range_parm:6.45703125
2024-11-08 11:34:09,582 - msmodelslim-logger - INFO - use min-max observer:model.layers.9.self_attn.v_proj.quant_input, range_parm:6.45703125
2024-11-08 11:34:09,582 - msmodelslim-logger - INFO - use min-max observer:model.layers.9.self_attn.o_proj.quant_input, range_parm:4.76171875
2024-11-08 11:34:09,582 - msmodelslim-logger - INFO - use min-max observer:model.layers.9.mlp.gate_proj.quant_input, range_parm:6.05859375
2024-11-08 11:34:09,582 - msmodelslim-logger - INFO - use min-max observer:model.layers.9.mlp.up_proj.quant_input, range_parm:6.05859375
2024-11-08 11:34:09,582 - msmodelslim-logger - INFO - use min-max observer:model.layers.10.self_attn.q_proj.quant_input, range_parm:7.23046875
2024-11-08 11:34:09,583 - msmodelslim-logger - INFO - use min-max observer:model.layers.10.self_attn.k_proj.quant_input, range_parm:7.23046875
2024-11-08 11:34:09,583 - msmodelslim-logger - INFO - use min-max observer:model.layers.10.self_attn.v_proj.quant_input, range_parm:7.23046875
2024-11-08 11:34:09,583 - msmodelslim-logger - INFO - use min-max observer:model.layers.10.self_attn.o_proj.quant_input, range_parm:4.83203125
2024-11-08 11:34:09,583 - msmodelslim-logger - INFO - use min-max observer:model.layers.10.mlp.gate_proj.quant_input, range_parm:5.9921875
2024-11-08 11:34:09,584 - msmodelslim-logger - INFO - use min-max observer:model.layers.10.mlp.up_proj.quant_input, range_parm:5.9921875
2024-11-08 11:34:09,584 - msmodelslim-logger - INFO - use min-max observer:model.layers.11.self_attn.q_proj.quant_input, range_parm:9.0859375
2024-11-08 11:34:09,584 - msmodelslim-logger - INFO - use min-max observer:model.layers.11.self_attn.k_proj.quant_input, range_parm:9.0859375
2024-11-08 11:34:09,584 - msmodelslim-logger - INFO - use min-max observer:model.layers.11.self_attn.v_proj.quant_input, range_parm:9.0859375
2024-11-08 11:34:09,584 - msmodelslim-logger - INFO - use min-max observer:model.layers.11.self_attn.o_proj.quant_input, range_parm:4.83984375
2024-11-08 11:34:09,585 - msmodelslim-logger - INFO - use min-max observer:model.layers.11.mlp.gate_proj.quant_input, range_parm:6.1875
2024-11-08 11:34:09,585 - msmodelslim-logger - INFO - use min-max observer:model.layers.11.mlp.up_proj.quant_input, range_parm:6.1875
2024-11-08 11:34:09,585 - msmodelslim-logger - INFO - use min-max observer:model.layers.12.self_attn.q_proj.quant_input, range_parm:8.546875
2024-11-08 11:34:09,585 - msmodelslim-logger - INFO - use min-max observer:model.layers.12.self_attn.k_proj.quant_input, range_parm:8.546875
2024-11-08 11:34:09,586 - msmodelslim-logger - INFO - use min-max observer:model.layers.12.self_attn.v_proj.quant_input, range_parm:8.546875
2024-11-08 11:34:09,586 - msmodelslim-logger - INFO - use min-max observer:model.layers.12.self_attn.o_proj.quant_input, range_parm:5.30859375
2024-11-08 11:34:09,586 - msmodelslim-logger - INFO - use min-max observer:model.layers.12.mlp.gate_proj.quant_input, range_parm:6.6484375
2024-11-08 11:34:09,586 - msmodelslim-logger - INFO - use min-max observer:model.layers.12.mlp.up_proj.quant_input, range_parm:6.6484375
2024-11-08 11:34:09,586 - msmodelslim-logger - INFO - use min-max observer:model.layers.13.self_attn.q_proj.quant_input, range_parm:8.5390625
2024-11-08 11:34:09,587 - msmodelslim-logger - INFO - use min-max observer:model.layers.13.self_attn.k_proj.quant_input, range_parm:8.5390625
2024-11-08 11:34:09,587 - msmodelslim-logger - INFO - use min-max observer:model.layers.13.self_attn.v_proj.quant_input, range_parm:8.5390625
2024-11-08 11:34:09,587 - msmodelslim-logger - INFO - use min-max observer:model.layers.13.self_attn.o_proj.quant_input, range_parm:4.99609375
2024-11-08 11:34:09,587 - msmodelslim-logger - INFO - use min-max observer:model.layers.13.mlp.gate_proj.quant_input, range_parm:7.12890625
2024-11-08 11:34:09,588 - msmodelslim-logger - INFO - use min-max observer:model.layers.13.mlp.up_proj.quant_input, range_parm:7.12890625
2024-11-08 11:34:09,588 - msmodelslim-logger - INFO - use min-max observer:model.layers.14.self_attn.q_proj.quant_input, range_parm:8.2109375
2024-11-08 11:34:09,588 - msmodelslim-logger - INFO - use min-max observer:model.layers.14.self_attn.k_proj.quant_input, range_parm:8.2109375
2024-11-08 11:34:09,588 - msmodelslim-logger - INFO - use min-max observer:model.layers.14.self_attn.v_proj.quant_input, range_parm:8.2109375
2024-11-08 11:34:09,588 - msmodelslim-logger - INFO - use min-max observer:model.layers.14.self_attn.o_proj.quant_input, range_parm:5.5
2024-11-08 11:34:09,589 - msmodelslim-logger - INFO - use min-max observer:model.layers.14.mlp.gate_proj.quant_input, range_parm:6.6328125
2024-11-08 11:34:09,589 - msmodelslim-logger - INFO - use min-max observer:model.layers.14.mlp.up_proj.quant_input, range_parm:6.6328125
2024-11-08 11:34:09,589 - msmodelslim-logger - INFO - use min-max observer:model.layers.15.self_attn.q_proj.quant_input, range_parm:8.265625
2024-11-08 11:34:09,589 - msmodelslim-logger - INFO - use min-max observer:model.layers.15.self_attn.k_proj.quant_input, range_parm:8.265625
2024-11-08 11:34:09,590 - msmodelslim-logger - INFO - use min-max observer:model.layers.15.self_attn.v_proj.quant_input, range_parm:8.265625
2024-11-08 11:34:09,590 - msmodelslim-logger - INFO - use min-max observer:model.layers.15.self_attn.o_proj.quant_input, range_parm:4.41796875
2024-11-08 11:34:09,590 - msmodelslim-logger - INFO - use min-max observer:model.layers.15.mlp.gate_proj.quant_input, range_parm:6.67578125
2024-11-08 11:34:09,590 - msmodelslim-logger - INFO - use min-max observer:model.layers.15.mlp.up_proj.quant_input, range_parm:6.67578125
2024-11-08 11:34:09,590 - msmodelslim-logger - INFO - use min-max observer:model.layers.16.self_attn.q_proj.quant_input, range_parm:9.1171875
2024-11-08 11:34:09,591 - msmodelslim-logger - INFO - use min-max observer:model.layers.16.self_attn.k_proj.quant_input, range_parm:9.1171875
2024-11-08 11:34:09,591 - msmodelslim-logger - INFO - use min-max observer:model.layers.16.self_attn.v_proj.quant_input, range_parm:9.1171875
2024-11-08 11:34:09,591 - msmodelslim-logger - INFO - use min-max observer:model.layers.16.self_attn.o_proj.quant_input, range_parm:4.7890625
2024-11-08 11:34:09,591 - msmodelslim-logger - INFO - use min-max observer:model.layers.16.mlp.gate_proj.quant_input, range_parm:7.5703125
2024-11-08 11:34:09,592 - msmodelslim-logger - INFO - use min-max observer:model.layers.16.mlp.up_proj.quant_input, range_parm:7.5703125
2024-11-08 11:34:09,592 - msmodelslim-logger - INFO - use min-max observer:model.layers.17.self_attn.q_proj.quant_input, range_parm:8.0390625
2024-11-08 11:34:09,592 - msmodelslim-logger - INFO - use min-max observer:model.layers.17.self_attn.k_proj.quant_input, range_parm:8.0390625
2024-11-08 11:34:09,592 - msmodelslim-logger - INFO - use min-max observer:model.layers.17.self_attn.v_proj.quant_input, range_parm:8.0390625
2024-11-08 11:34:09,592 - msmodelslim-logger - INFO - use min-max observer:model.layers.17.self_attn.o_proj.quant_input, range_parm:4.171875
2024-11-08 11:34:09,593 - msmodelslim-logger - INFO - use min-max observer:model.layers.17.mlp.gate_proj.quant_input, range_parm:7.6796875
2024-11-08 11:34:09,593 - msmodelslim-logger - INFO - use min-max observer:model.layers.17.mlp.up_proj.quant_input, range_parm:7.6796875
2024-11-08 11:34:09,593 - msmodelslim-logger - INFO - use min-max observer:model.layers.18.self_attn.q_proj.quant_input, range_parm:8.921875
2024-11-08 11:34:09,594 - msmodelslim-logger - INFO - use min-max observer:model.layers.18.self_attn.k_proj.quant_input, range_parm:8.921875
2024-11-08 11:34:09,594 - msmodelslim-logger - INFO - use min-max observer:model.layers.18.self_attn.v_proj.quant_input, range_parm:8.921875
2024-11-08 11:34:09,594 - msmodelslim-logger - INFO - use min-max observer:model.layers.18.self_attn.o_proj.quant_input, range_parm:4.69921875
2024-11-08 11:34:09,594 - msmodelslim-logger - INFO - use min-max observer:model.layers.18.mlp.gate_proj.quant_input, range_parm:9.03125
2024-11-08 11:34:09,594 - msmodelslim-logger - INFO - use min-max observer:model.layers.18.mlp.up_proj.quant_input, range_parm:9.03125
2024-11-08 11:34:09,595 - msmodelslim-logger - INFO - use min-max observer:model.layers.19.self_attn.q_proj.quant_input, range_parm:7.77734375
2024-11-08 11:34:09,595 - msmodelslim-logger - INFO - use min-max observer:model.layers.19.self_attn.k_proj.quant_input, range_parm:7.77734375
2024-11-08 11:34:09,595 - msmodelslim-logger - INFO - use min-max observer:model.layers.19.self_attn.v_proj.quant_input, range_parm:7.77734375
2024-11-08 11:34:09,595 - msmodelslim-logger - INFO - use min-max observer:model.layers.19.self_attn.o_proj.quant_input, range_parm:5.05078125
2024-11-08 11:34:09,596 - msmodelslim-logger - INFO - use min-max observer:model.layers.19.mlp.gate_proj.quant_input, range_parm:9.1484375
2024-11-08 11:34:09,596 - msmodelslim-logger - INFO - use min-max observer:model.layers.19.mlp.up_proj.quant_input, range_parm:9.1484375
2024-11-08 11:34:09,596 - msmodelslim-logger - INFO - use min-max observer:model.layers.20.self_attn.q_proj.quant_input, range_parm:8.3125
2024-11-08 11:34:09,596 - msmodelslim-logger - INFO - use min-max observer:model.layers.20.self_attn.k_proj.quant_input, range_parm:8.3125
2024-11-08 11:34:09,596 - msmodelslim-logger - INFO - use min-max observer:model.layers.20.self_attn.v_proj.quant_input, range_parm:8.3125
2024-11-08 11:34:09,597 - msmodelslim-logger - INFO - use min-max observer:model.layers.20.self_attn.o_proj.quant_input, range_parm:5.09375
2024-11-08 11:34:09,597 - msmodelslim-logger - INFO - use min-max observer:model.layers.20.mlp.gate_proj.quant_input, range_parm:9.546875
2024-11-08 11:34:09,597 - msmodelslim-logger - INFO - use min-max observer:model.layers.20.mlp.up_proj.quant_input, range_parm:9.546875
2024-11-08 11:34:09,597 - msmodelslim-logger - INFO - use min-max observer:model.layers.21.self_attn.q_proj.quant_input, range_parm:8.34375
2024-11-08 11:34:09,598 - msmodelslim-logger - INFO - use min-max observer:model.layers.21.self_attn.k_proj.quant_input, range_parm:8.34375
2024-11-08 11:34:09,598 - msmodelslim-logger - INFO - use min-max observer:model.layers.21.self_attn.v_proj.quant_input, range_parm:8.34375
2024-11-08 11:34:09,598 - msmodelslim-logger - INFO - use min-max observer:model.layers.21.self_attn.o_proj.quant_input, range_parm:5.03125
2024-11-08 11:34:09,598 - msmodelslim-logger - INFO - use min-max observer:model.layers.21.mlp.gate_proj.quant_input, range_parm:9.5234375
2024-11-08 11:34:09,598 - msmodelslim-logger - INFO - use min-max observer:model.layers.21.mlp.up_proj.quant_input, range_parm:9.5234375
2024-11-08 11:34:09,599 - msmodelslim-logger - INFO - use min-max observer:model.layers.22.self_attn.q_proj.quant_input, range_parm:6.23046875
2024-11-08 11:34:09,599 - msmodelslim-logger - INFO - use min-max observer:model.layers.22.self_attn.k_proj.quant_input, range_parm:6.23046875
2024-11-08 11:34:09,599 - msmodelslim-logger - INFO - use min-max observer:model.layers.22.self_attn.v_proj.quant_input, range_parm:6.23046875
2024-11-08 11:34:09,599 - msmodelslim-logger - INFO - use min-max observer:model.layers.22.self_attn.o_proj.quant_input, range_parm:4.953125
2024-11-08 11:34:09,600 - msmodelslim-logger - INFO - use min-max observer:model.layers.22.mlp.gate_proj.quant_input, range_parm:8.0
2024-11-08 11:34:09,600 - msmodelslim-logger - INFO - use min-max observer:model.layers.22.mlp.up_proj.quant_input, range_parm:8.0
2024-11-08 11:34:09,600 - msmodelslim-logger - INFO - use min-max observer:model.layers.23.self_attn.q_proj.quant_input, range_parm:7.2109375
2024-11-08 11:34:09,600 - msmodelslim-logger - INFO - use min-max observer:model.layers.23.self_attn.k_proj.quant_input, range_parm:7.2109375
2024-11-08 11:34:09,600 - msmodelslim-logger - INFO - use min-max observer:model.layers.23.self_attn.v_proj.quant_input, range_parm:7.2109375
2024-11-08 11:34:09,601 - msmodelslim-logger - INFO - use min-max observer:model.layers.23.self_attn.o_proj.quant_input, range_parm:6.640625
2024-11-08 11:34:09,601 - msmodelslim-logger - INFO - use min-max observer:model.layers.23.mlp.gate_proj.quant_input, range_parm:6.37109375
2024-11-08 11:34:09,601 - msmodelslim-logger - INFO - use min-max observer:model.layers.23.mlp.up_proj.quant_input, range_parm:6.37109375
2024-11-08 11:34:09,601 - msmodelslim-logger - INFO - use min-max observer:model.layers.24.self_attn.q_proj.quant_input, range_parm:8.4921875
2024-11-08 11:34:09,602 - msmodelslim-logger - INFO - use min-max observer:model.layers.24.self_attn.k_proj.quant_input, range_parm:8.4921875
2024-11-08 11:34:09,602 - msmodelslim-logger - INFO - use min-max observer:model.layers.24.self_attn.v_proj.quant_input, range_parm:8.4921875
2024-11-08 11:34:09,602 - msmodelslim-logger - INFO - use min-max observer:model.layers.24.self_attn.o_proj.quant_input, range_parm:8.6953125
2024-11-08 11:34:09,602 - msmodelslim-logger - INFO - use min-max observer:model.layers.24.mlp.gate_proj.quant_input, range_parm:6.83984375
2024-11-08 11:34:09,602 - msmodelslim-logger - INFO - use min-max observer:model.layers.24.mlp.up_proj.quant_input, range_parm:6.83984375
2024-11-08 11:34:09,603 - msmodelslim-logger - INFO - use min-max observer:model.layers.25.self_attn.q_proj.quant_input, range_parm:6.421875
2024-11-08 11:34:09,603 - msmodelslim-logger - INFO - use min-max observer:model.layers.25.self_attn.k_proj.quant_input, range_parm:6.421875
2024-11-08 11:34:09,603 - msmodelslim-logger - INFO - use min-max observer:model.layers.25.self_attn.v_proj.quant_input, range_parm:6.421875
2024-11-08 11:34:09,603 - msmodelslim-logger - INFO - use min-max observer:model.layers.25.self_attn.o_proj.quant_input, range_parm:9.4609375
2024-11-08 11:34:09,604 - msmodelslim-logger - INFO - use min-max observer:model.layers.25.mlp.gate_proj.quant_input, range_parm:6.30078125
2024-11-08 11:34:09,604 - msmodelslim-logger - INFO - use min-max observer:model.layers.25.mlp.up_proj.quant_input, range_parm:6.30078125
2024-11-08 11:34:09,604 - msmodelslim-logger - INFO - use min-max observer:model.layers.26.self_attn.v_proj.quant_input, range_parm:13.421875
2024-11-08 11:34:09,604 - msmodelslim-logger - INFO - use min-max observer:model.layers.26.self_attn.o_proj.quant_input, range_parm:7.1796875
2024-11-08 11:34:09,605 - msmodelslim-logger - INFO - use min-max observer:model.layers.26.mlp.gate_proj.quant_input, range_parm:7.89453125
2024-11-08 11:34:09,605 - msmodelslim-logger - INFO - use min-max observer:model.layers.26.mlp.up_proj.quant_input, range_parm:7.89453125
2024-11-08 11:34:09,605 - msmodelslim-logger - INFO - use min-max observer:model.layers.27.self_attn.q_proj.quant_input, range_parm:13.2890625
2024-11-08 11:34:09,605 - msmodelslim-logger - INFO - use min-max observer:model.layers.27.self_attn.k_proj.quant_input, range_parm:13.2890625
2024-11-08 11:34:09,605 - msmodelslim-logger - INFO - use min-max observer:model.layers.27.self_attn.v_proj.quant_input, range_parm:13.2890625
2024-11-08 11:34:09,606 - msmodelslim-logger - INFO - use min-max observer:model.layers.27.self_attn.o_proj.quant_input, range_parm:5.80078125
2024-11-08 11:34:09,606 - msmodelslim-logger - INFO - use min-max observer:model.layers.27.mlp.gate_proj.quant_input, range_parm:6.515625
2024-11-08 11:34:09,606 - msmodelslim-logger - INFO - use min-max observer:model.layers.27.mlp.up_proj.quant_input, range_parm:6.515625
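Each "min-max observer … range_parm" line records the calibrated absolute-max of that layer's quantized input. My reading of these logs (a sketch of standard symmetric per-tensor INT8, not msmodelslim's exact code) is that `range_parm` maps to the quantization scale as `range_parm / 127`:

```python
def int8_quantize(x, range_parm):
    """Symmetric per-tensor INT8: the observed range maps to code 127."""
    scale = range_parm / 127.0
    q = round(x / scale)
    return max(-128, min(127, q))  # clamp to the INT8 value range

# A value at the observed range saturates exactly to 127:
print(int8_quantize(8.5546875, 8.5546875))   # → 127
# Anything beyond the calibrated range is clipped:
print(int8_quantize(-10.0, 8.5546875))       # → -128
```

This is also why outlier layers (note layer 26's `v_proj` at 13.42 vs. ~5-9 elsewhere) get rolled back: a large range means a coarse scale for all the small values.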
2024-11-08 11:34:09,606 - msmodelslim-logger - INFO - Wrap Quantizer success!
2024-11-08 11:34:09,606 - msmodelslim-logger - INFO - Calibration start!
  0%|          | 0/50 [00:00<?, ?it/s]
2024-11-08 11:34:09,633 - msmodelslim-logger - INFO - layer:model.layers.0.self_attn.o_proj.quant_input, range: 8.5546875, Automatically set the drop rate:1.0
2024-11-08 11:34:09,658 - msmodelslim-logger - INFO - layer:model.layers.0.mlp.gate_proj.quant_input, range: 5.7109375, Automatically set the drop rate:1.0
2024-11-08 11:34:09,686 - msmodelslim-logger - INFO - layer:model.layers.0.mlp.up_proj.quant_input, range: 5.7109375, Automatically set the drop rate:1.0
2024-11-08 11:34:09,703 - msmodelslim-logger - INFO - layer:model.layers.1.self_attn.q_proj.quant_input, range: 11.125, Automatically set the drop rate:1.0
2024-11-08 11:34:09,707 - msmodelslim-logger - INFO - layer:model.layers.1.self_attn.k_proj.quant_input, range: 11.125, Automatically set the drop rate:1.0
2024-11-08 11:34:09,709 - msmodelslim-logger - INFO - layer:model.layers.1.self_attn.v_proj.quant_input, range: 11.125, Automatically set the drop rate:1.0
2024-11-08 11:34:09,717 - msmodelslim-logger - INFO - layer:model.layers.1.self_attn.o_proj.quant_input, range: 5.3515625, Automatically set the drop rate:1.0
2024-11-08 11:34:09,742 - msmodelslim-logger - INFO - layer:model.layers.1.mlp.gate_proj.quant_input, range: 7.20703125, Automatically set the drop rate:1.0
2024-11-08 11:34:09,770 - msmodelslim-logger - INFO - layer:model.layers.1.mlp.up_proj.quant_input, range: 7.20703125, Automatically set the drop rate:1.0
2024-11-08 11:34:09,784 - msmodelslim-logger - INFO - layer:model.layers.2.self_attn.q_proj.quant_input, range: 7.8046875, Automatically set the drop rate:1.0
2024-11-08 11:34:09,787 - msmodelslim-logger - INFO - layer:model.layers.2.self_attn.k_proj.quant_input, range: 7.8046875, Automatically set the drop rate:1.0
2024-11-08 11:34:09,790 - msmodelslim-logger - INFO - layer:model.layers.2.self_attn.v_proj.quant_input, range: 7.8046875, Automatically set the drop rate:1.0
2024-11-08 11:34:09,797 - msmodelslim-logger - INFO - layer:model.layers.2.self_attn.o_proj.quant_input, range: 8.21875, Automatically set the drop rate:1.0
2024-11-08 11:34:09,822 - msmodelslim-logger - INFO - layer:model.layers.2.mlp.gate_proj.quant_input, range: 7.140625, Automatically set the drop rate:1.0
2024-11-08 11:34:09,850 - msmodelslim-logger - INFO - layer:model.layers.2.mlp.up_proj.quant_input, range: 7.140625, Automatically set the drop rate:1.0
2024-11-08 11:34:09,864 - msmodelslim-logger - INFO - layer:model.layers.3.self_attn.q_proj.quant_input, range: 8.6484375, Automatically set the drop rate:1.0
2024-11-08 11:34:09,868 - msmodelslim-logger - INFO - layer:model.layers.3.self_attn.k_proj.quant_input, range: 8.6484375, Automatically set the drop rate:1.0
2024-11-08 11:34:09,870 - msmodelslim-logger - INFO - layer:model.layers.3.self_attn.v_proj.quant_input, range: 8.6484375, Automatically set the drop rate:1.0
2024-11-08 11:34:09,878 - msmodelslim-logger - INFO - layer:model.layers.3.self_attn.o_proj.quant_input, range: 4.59375, Automatically set the drop rate:1.0
2024-11-08 11:34:09,903 - msmodelslim-logger - INFO - layer:model.layers.3.mlp.gate_proj.quant_input, range: 12.3671875, Automatically set the drop rate:1.0
2024-11-08 11:34:09,931 - msmodelslim-logger - INFO - layer:model.layers.3.mlp.up_proj.quant_input, range: 12.3671875, Automatically set the drop rate:1.0
2024-11-08 11:34:09,949 - msmodelslim-logger - INFO - layer:model.layers.4.self_attn.q_proj.quant_input, range: 9.7734375, Automatically set the drop rate:1.0
2024-11-08 11:34:09,953 - msmodelslim-logger - INFO - layer:model.layers.4.self_attn.k_proj.quant_input, range: 9.7734375, Automatically set the drop rate:1.0
2024-11-08 11:34:09,956 - msmodelslim-logger - INFO - layer:model.layers.4.self_attn.v_proj.quant_input, range: 9.7734375, Automatically set the drop rate:1.0
2024-11-08 11:34:09,963 - msmodelslim-logger - INFO - layer:model.layers.4.self_attn.o_proj.quant_input, range: 4.63671875, Automatically set the drop rate:1.0
2024-11-08 11:34:09,988 - msmodelslim-logger - INFO - layer:model.layers.4.mlp.gate_proj.quant_input, range: 5.859375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,017 - msmodelslim-logger - INFO - layer:model.layers.4.mlp.up_proj.quant_input, range: 5.859375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,031 - msmodelslim-logger - INFO - layer:model.layers.5.self_attn.q_proj.quant_input, range: 7.984375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,034 - msmodelslim-logger - INFO - layer:model.layers.5.self_attn.k_proj.quant_input, range: 7.984375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,037 - msmodelslim-logger - INFO - layer:model.layers.5.self_attn.v_proj.quant_input, range: 7.984375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,044 - msmodelslim-logger - INFO - layer:model.layers.5.self_attn.o_proj.quant_input, range: 4.60546875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,070 - msmodelslim-logger - INFO - layer:model.layers.5.mlp.gate_proj.quant_input, range: 8.8671875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,098 - msmodelslim-logger - INFO - layer:model.layers.5.mlp.up_proj.quant_input, range: 8.8671875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,112 - msmodelslim-logger - INFO - layer:model.layers.6.self_attn.q_proj.quant_input, range: 9.1484375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,115 - msmodelslim-logger - INFO - layer:model.layers.6.self_attn.k_proj.quant_input, range: 9.1484375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,118 - msmodelslim-logger - INFO - layer:model.layers.6.self_attn.v_proj.quant_input, range: 9.1484375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,126 - msmodelslim-logger - INFO - layer:model.layers.6.self_attn.o_proj.quant_input, range: 4.4453125, Automatically set the drop rate:1.0
2024-11-08 11:34:10,151 - msmodelslim-logger - INFO - layer:model.layers.6.mlp.gate_proj.quant_input, range: 6.97265625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,179 - msmodelslim-logger - INFO - layer:model.layers.6.mlp.up_proj.quant_input, range: 6.97265625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,193 - msmodelslim-logger - INFO - layer:model.layers.7.self_attn.q_proj.quant_input, range: 7.83203125, Automatically set the drop rate:1.0
2024-11-08 11:34:10,197 - msmodelslim-logger - INFO - layer:model.layers.7.self_attn.k_proj.quant_input, range: 7.83203125, Automatically set the drop rate:1.0
2024-11-08 11:34:10,199 - msmodelslim-logger - INFO - layer:model.layers.7.self_attn.v_proj.quant_input, range: 7.83203125, Automatically set the drop rate:1.0
2024-11-08 11:34:10,207 - msmodelslim-logger - INFO - layer:model.layers.7.self_attn.o_proj.quant_input, range: 5.5390625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,232 - msmodelslim-logger - INFO - layer:model.layers.7.mlp.gate_proj.quant_input, range: 7.8984375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,260 - msmodelslim-logger - INFO - layer:model.layers.7.mlp.up_proj.quant_input, range: 7.8984375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,274 - msmodelslim-logger - INFO - layer:model.layers.8.self_attn.q_proj.quant_input, range: 8.71875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,277 - msmodelslim-logger - INFO - layer:model.layers.8.self_attn.k_proj.quant_input, range: 8.71875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,280 - msmodelslim-logger - INFO - layer:model.layers.8.self_attn.v_proj.quant_input, range: 8.71875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,288 - msmodelslim-logger - INFO - layer:model.layers.8.self_attn.o_proj.quant_input, range: 5.9140625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,313 - msmodelslim-logger - INFO - layer:model.layers.8.mlp.gate_proj.quant_input, range: 9.109375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,341 - msmodelslim-logger - INFO - layer:model.layers.8.mlp.up_proj.quant_input, range: 9.109375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,355 - msmodelslim-logger - INFO - layer:model.layers.9.self_attn.q_proj.quant_input, range: 6.45703125, Automatically set the drop rate:1.0
2024-11-08 11:34:10,359 - msmodelslim-logger - INFO - layer:model.layers.9.self_attn.k_proj.quant_input, range: 6.45703125, Automatically set the drop rate:1.0
2024-11-08 11:34:10,362 - msmodelslim-logger - INFO - layer:model.layers.9.self_attn.v_proj.quant_input, range: 6.45703125, Automatically set the drop rate:1.0
2024-11-08 11:34:10,370 - msmodelslim-logger - INFO - layer:model.layers.9.self_attn.o_proj.quant_input, range: 4.76171875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,396 - msmodelslim-logger - INFO - layer:model.layers.9.mlp.gate_proj.quant_input, range: 6.05859375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,424 - msmodelslim-logger - INFO - layer:model.layers.9.mlp.up_proj.quant_input, range: 6.05859375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,438 - msmodelslim-logger - INFO - layer:model.layers.10.self_attn.q_proj.quant_input, range: 7.23046875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,442 - msmodelslim-logger - INFO - layer:model.layers.10.self_attn.k_proj.quant_input, range: 7.23046875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,445 - msmodelslim-logger - INFO - layer:model.layers.10.self_attn.v_proj.quant_input, range: 7.23046875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,453 - msmodelslim-logger - INFO - layer:model.layers.10.self_attn.o_proj.quant_input, range: 4.83203125, Automatically set the drop rate:1.0
2024-11-08 11:34:10,478 - msmodelslim-logger - INFO - layer:model.layers.10.mlp.gate_proj.quant_input, range: 5.9921875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,507 - msmodelslim-logger - INFO - layer:model.layers.10.mlp.up_proj.quant_input, range: 5.9921875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,521 - msmodelslim-logger - INFO - layer:model.layers.11.self_attn.q_proj.quant_input, range: 9.0859375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,525 - msmodelslim-logger - INFO - layer:model.layers.11.self_attn.k_proj.quant_input, range: 9.0859375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,528 - msmodelslim-logger - INFO - layer:model.layers.11.self_attn.v_proj.quant_input, range: 9.0859375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,536 - msmodelslim-logger - INFO - layer:model.layers.11.self_attn.o_proj.quant_input, range: 4.83984375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,561 - msmodelslim-logger - INFO - layer:model.layers.11.mlp.gate_proj.quant_input, range: 6.1875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,590 - msmodelslim-logger - INFO - layer:model.layers.11.mlp.up_proj.quant_input, range: 6.1875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,604 - msmodelslim-logger - INFO - layer:model.layers.12.self_attn.q_proj.quant_input, range: 8.546875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,607 - msmodelslim-logger - INFO - layer:model.layers.12.self_attn.k_proj.quant_input, range: 8.546875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,610 - msmodelslim-logger - INFO - layer:model.layers.12.self_attn.v_proj.quant_input, range: 8.546875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,619 - msmodelslim-logger - INFO - layer:model.layers.12.self_attn.o_proj.quant_input, range: 5.30859375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,644 - msmodelslim-logger - INFO - layer:model.layers.12.mlp.gate_proj.quant_input, range: 6.6484375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,672 - msmodelslim-logger - INFO - layer:model.layers.12.mlp.up_proj.quant_input, range: 6.6484375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,686 - msmodelslim-logger - INFO - layer:model.layers.13.self_attn.q_proj.quant_input, range: 8.5390625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,690 - msmodelslim-logger - INFO - layer:model.layers.13.self_attn.k_proj.quant_input, range: 8.5390625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,693 - msmodelslim-logger - INFO - layer:model.layers.13.self_attn.v_proj.quant_input, range: 8.5390625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,701 - msmodelslim-logger - INFO - layer:model.layers.13.self_attn.o_proj.quant_input, range: 4.99609375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,727 - msmodelslim-logger - INFO - layer:model.layers.13.mlp.gate_proj.quant_input, range: 7.12890625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,755 - msmodelslim-logger - INFO - layer:model.layers.13.mlp.up_proj.quant_input, range: 7.12890625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,769 - msmodelslim-logger - INFO - layer:model.layers.14.self_attn.q_proj.quant_input, range: 8.2109375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,773 - msmodelslim-logger - INFO - layer:model.layers.14.self_attn.k_proj.quant_input, range: 8.2109375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,776 - msmodelslim-logger - INFO - layer:model.layers.14.self_attn.v_proj.quant_input, range: 8.2109375, Automatically set the drop rate:1.0
2024-11-08 11:34:10,784 - msmodelslim-logger - INFO - layer:model.layers.14.self_attn.o_proj.quant_input, range: 5.5, Automatically set the drop rate:1.0
2024-11-08 11:34:10,809 - msmodelslim-logger - INFO - layer:model.layers.14.mlp.gate_proj.quant_input, range: 6.6328125, Automatically set the drop rate:1.0
2024-11-08 11:34:10,838 - msmodelslim-logger - INFO - layer:model.layers.14.mlp.up_proj.quant_input, range: 6.6328125, Automatically set the drop rate:1.0
2024-11-08 11:34:10,852 - msmodelslim-logger - INFO - layer:model.layers.15.self_attn.q_proj.quant_input, range: 8.265625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,855 - msmodelslim-logger - INFO - layer:model.layers.15.self_attn.k_proj.quant_input, range: 8.265625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,858 - msmodelslim-logger - INFO - layer:model.layers.15.self_attn.v_proj.quant_input, range: 8.265625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,866 - msmodelslim-logger - INFO - layer:model.layers.15.self_attn.o_proj.quant_input, range: 4.41796875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,891 - msmodelslim-logger - INFO - layer:model.layers.15.mlp.gate_proj.quant_input, range: 6.67578125, Automatically set the drop rate:1.0
2024-11-08 11:34:10,919 - msmodelslim-logger - INFO - layer:model.layers.15.mlp.up_proj.quant_input, range: 6.67578125, Automatically set the drop rate:1.0
2024-11-08 11:34:10,934 - msmodelslim-logger - INFO - layer:model.layers.16.self_attn.q_proj.quant_input, range: 9.1171875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,938 - msmodelslim-logger - INFO - layer:model.layers.16.self_attn.k_proj.quant_input, range: 9.1171875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,941 - msmodelslim-logger - INFO - layer:model.layers.16.self_attn.v_proj.quant_input, range: 9.1171875, Automatically set the drop rate:1.0
2024-11-08 11:34:10,949 - msmodelslim-logger - INFO - layer:model.layers.16.self_attn.o_proj.quant_input, range: 4.7890625, Automatically set the drop rate:1.0
2024-11-08 11:34:10,974 - msmodelslim-logger - INFO - layer:model.layers.16.mlp.gate_proj.quant_input, range: 7.5703125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,003 - msmodelslim-logger - INFO - layer:model.layers.16.mlp.up_proj.quant_input, range: 7.5703125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,017 - msmodelslim-logger - INFO - layer:model.layers.17.self_attn.q_proj.quant_input, range: 8.0390625, Automatically set the drop rate:1.0
2024-11-08 11:34:11,020 - msmodelslim-logger - INFO - layer:model.layers.17.self_attn.k_proj.quant_input, range: 8.0390625, Automatically set the drop rate:1.0
2024-11-08 11:34:11,023 - msmodelslim-logger - INFO - layer:model.layers.17.self_attn.v_proj.quant_input, range: 8.0390625, Automatically set the drop rate:1.0
2024-11-08 11:34:11,031 - msmodelslim-logger - INFO - layer:model.layers.17.self_attn.o_proj.quant_input, range: 4.171875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,056 - msmodelslim-logger - INFO - layer:model.layers.17.mlp.gate_proj.quant_input, range: 7.6796875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,085 - msmodelslim-logger - INFO - layer:model.layers.17.mlp.up_proj.quant_input, range: 7.6796875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,099 - msmodelslim-logger - INFO - layer:model.layers.18.self_attn.q_proj.quant_input, range: 8.921875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,102 - msmodelslim-logger - INFO - layer:model.layers.18.self_attn.k_proj.quant_input, range: 8.921875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,105 - msmodelslim-logger - INFO - layer:model.layers.18.self_attn.v_proj.quant_input, range: 8.921875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,113 - msmodelslim-logger - INFO - layer:model.layers.18.self_attn.o_proj.quant_input, range: 4.69921875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,138 - msmodelslim-logger - INFO - layer:model.layers.18.mlp.gate_proj.quant_input, range: 9.03125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,166 - msmodelslim-logger - INFO - layer:model.layers.18.mlp.up_proj.quant_input, range: 9.03125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,180 - msmodelslim-logger - INFO - layer:model.layers.19.self_attn.q_proj.quant_input, range: 7.77734375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,184 - msmodelslim-logger - INFO - layer:model.layers.19.self_attn.k_proj.quant_input, range: 7.77734375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,187 - msmodelslim-logger - INFO - layer:model.layers.19.self_attn.v_proj.quant_input, range: 7.77734375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,195 - msmodelslim-logger - INFO - layer:model.layers.19.self_attn.o_proj.quant_input, range: 5.05078125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,220 - msmodelslim-logger - INFO - layer:model.layers.19.mlp.gate_proj.quant_input, range: 9.1484375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,248 - msmodelslim-logger - INFO - layer:model.layers.19.mlp.up_proj.quant_input, range: 9.1484375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,262 - msmodelslim-logger - INFO - layer:model.layers.20.self_attn.q_proj.quant_input, range: 8.3125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,266 - msmodelslim-logger - INFO - layer:model.layers.20.self_attn.k_proj.quant_input, range: 8.3125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,268 - msmodelslim-logger - INFO - layer:model.layers.20.self_attn.v_proj.quant_input, range: 8.3125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,276 - msmodelslim-logger - INFO - layer:model.layers.20.self_attn.o_proj.quant_input, range: 5.09375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,302 - msmodelslim-logger - INFO - layer:model.layers.20.mlp.gate_proj.quant_input, range: 9.546875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,330 - msmodelslim-logger - INFO - layer:model.layers.20.mlp.up_proj.quant_input, range: 9.546875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,344 - msmodelslim-logger - INFO - layer:model.layers.21.self_attn.q_proj.quant_input, range: 8.34375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,348 - msmodelslim-logger - INFO - layer:model.layers.21.self_attn.k_proj.quant_input, range: 8.34375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,351 - msmodelslim-logger - INFO - layer:model.layers.21.self_attn.v_proj.quant_input, range: 8.34375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,359 - msmodelslim-logger - INFO - layer:model.layers.21.self_attn.o_proj.quant_input, range: 5.03125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,385 - msmodelslim-logger - INFO - layer:model.layers.21.mlp.gate_proj.quant_input, range: 9.5234375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,413 - msmodelslim-logger - INFO - layer:model.layers.21.mlp.up_proj.quant_input, range: 9.5234375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,427 - msmodelslim-logger - INFO - layer:model.layers.22.self_attn.q_proj.quant_input, range: 6.23046875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,431 - msmodelslim-logger - INFO - layer:model.layers.22.self_attn.k_proj.quant_input, range: 6.23046875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,434 - msmodelslim-logger - INFO - layer:model.layers.22.self_attn.v_proj.quant_input, range: 6.23046875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,442 - msmodelslim-logger - INFO - layer:model.layers.22.self_attn.o_proj.quant_input, range: 4.953125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,467 - msmodelslim-logger - INFO - layer:model.layers.22.mlp.gate_proj.quant_input, range: 8.0, Automatically set the drop rate:1.0
2024-11-08 11:34:11,496 - msmodelslim-logger - INFO - layer:model.layers.22.mlp.up_proj.quant_input, range: 8.0, Automatically set the drop rate:1.0
2024-11-08 11:34:11,510 - msmodelslim-logger - INFO - layer:model.layers.23.self_attn.q_proj.quant_input, range: 7.2109375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,513 - msmodelslim-logger - INFO - layer:model.layers.23.self_attn.k_proj.quant_input, range: 7.2109375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,517 - msmodelslim-logger - INFO - layer:model.layers.23.self_attn.v_proj.quant_input, range: 7.2109375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,525 - msmodelslim-logger - INFO - layer:model.layers.23.self_attn.o_proj.quant_input, range: 6.640625, Automatically set the drop rate:1.0
2024-11-08 11:34:11,550 - msmodelslim-logger - INFO - layer:model.layers.23.mlp.gate_proj.quant_input, range: 6.37109375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,578 - msmodelslim-logger - INFO - layer:model.layers.23.mlp.up_proj.quant_input, range: 6.37109375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,592 - msmodelslim-logger - INFO - layer:model.layers.24.self_attn.q_proj.quant_input, range: 8.4921875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,595 - msmodelslim-logger - INFO - layer:model.layers.24.self_attn.k_proj.quant_input, range: 8.4921875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,598 - msmodelslim-logger - INFO - layer:model.layers.24.self_attn.v_proj.quant_input, range: 8.4921875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,607 - msmodelslim-logger - INFO - layer:model.layers.24.self_attn.o_proj.quant_input, range: 8.6953125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,632 - msmodelslim-logger - INFO - layer:model.layers.24.mlp.gate_proj.quant_input, range: 6.83984375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,660 - msmodelslim-logger - INFO - layer:model.layers.24.mlp.up_proj.quant_input, range: 6.83984375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,674 - msmodelslim-logger - INFO - layer:model.layers.25.self_attn.q_proj.quant_input, range: 6.421875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,677 - msmodelslim-logger - INFO - layer:model.layers.25.self_attn.k_proj.quant_input, range: 6.421875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,680 - msmodelslim-logger - INFO - layer:model.layers.25.self_attn.v_proj.quant_input, range: 6.421875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,688 - msmodelslim-logger - INFO - layer:model.layers.25.self_attn.o_proj.quant_input, range: 9.4609375, Automatically set the drop rate:1.0
2024-11-08 11:34:11,713 - msmodelslim-logger - INFO - layer:model.layers.25.mlp.gate_proj.quant_input, range: 6.30078125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,741 - msmodelslim-logger - INFO - layer:model.layers.25.mlp.up_proj.quant_input, range: 6.30078125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,752 - msmodelslim-logger - INFO - layer:model.layers.26.self_attn.v_proj.quant_input, range: 13.421875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,760 - msmodelslim-logger - INFO - layer:model.layers.26.self_attn.o_proj.quant_input, range: 7.1796875, Automatically set the drop rate:1.0
2024-11-08 11:34:11,786 - msmodelslim-logger - INFO - layer:model.layers.26.mlp.gate_proj.quant_input, range: 7.89453125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,814 - msmodelslim-logger - INFO - layer:model.layers.26.mlp.up_proj.quant_input, range: 7.89453125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,828 - msmodelslim-logger - INFO - layer:model.layers.27.self_attn.q_proj.quant_input, range: 13.2890625, Automatically set the drop rate:1.0
2024-11-08 11:34:11,831 - msmodelslim-logger - INFO - layer:model.layers.27.self_attn.k_proj.quant_input, range: 13.2890625, Automatically set the drop rate:1.0
2024-11-08 11:34:11,834 - msmodelslim-logger - INFO - layer:model.layers.27.self_attn.v_proj.quant_input, range: 13.2890625, Automatically set the drop rate:1.0
2024-11-08 11:34:11,842 - msmodelslim-logger - INFO - layer:model.layers.27.self_attn.o_proj.quant_input, range: 5.80078125, Automatically set the drop rate:1.0
2024-11-08 11:34:11,867 - msmodelslim-logger - INFO - layer:model.layers.27.mlp.gate_proj.quant_input, range: 6.515625, Automatically set the drop rate:1.0
2024-11-08 11:34:11,895 - msmodelslim-logger - INFO - layer:model.layers.27.mlp.up_proj.quant_input, range: 6.515625, Automatically set the drop rate:1.0
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [01:03<00:00,  1.28s/it]
2024-11-08 11:35:13,446 - msmodelslim-logger - INFO - Calibration end!
2024-11-08 11:35:13,475 - msmodelslim-logger - INFO - write directory exists, write file to directory /home/apulis-dev/teamdata/Qwen2.5-7B
2024-11-08 11:35:50,283 - msmodelslim-logger - INFO - invalid safetensors_name, set safetensors_name to default quant_model_weight_w8a8.safetensors
2024-11-08 11:35:50,284 - msmodelslim-logger - INFO - invalid json_name, set json_name to default quant_model_description_w8a8.json
Traceback (most recent call last):
  File "/usr/local/Ascend/llm_model/examples/models/qwen/convert_quant_weights.py", line 59, in <module>
    quant_weight_generator.convert(tokenized_data, args.save_directory, args.disable_level)
  File "/usr/local/Ascend/llm_model/examples/convert/model_slim/quantifier.py", line 114, in convert
    calibrator.save(save_path, save_type=["safe_tensor"])
  File "/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/msmodelslim/pytorch/llm_ptq/llm_ptq_tools/quant_tools.py", line 390, in save
    self.save_safetensor(output_path, safetensors_name, json_name)
  File "/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/msmodelslim/pytorch/llm_ptq/llm_ptq_tools/quant_tools.py", line 408, in save_safetensor
    self.set_fp_safetensor(ori_model_state_dict_name, safetensor_weight, ori_model_state_dict)
  File "/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/msmodelslim/pytorch/llm_ptq/llm_ptq_tools/quant_tools.py", line 424, in set_fp_safetensor
    safetensor_weight[ori_model_state_dict_name] = ori_model_state_dict.clone()
RuntimeError: NPU out of memory. Tried to allocate 130.00 MiB (NPU 0; 21.02 GiB total capacity; 18.99 GiB already allocated; 18.99 GiB current active; 818.35 MiB free; 19.01 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.
[ERROR] 2024-11-08-11:35:51 (PID:781496, Device:0, RankID:-1) ERR99999 UNKNOWN application exception
root@dev-8242526b-01f2-4a54-b89d-f6d9c57c692d-qjhpf:/usr/local/Ascend/llm_model# 


The save step failed because of insufficient NPU memory, but the workflow itself is correct: calibration ran to completion, and only the final safetensors write ran out of device memory.
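Two mitigation ideas follow from the error message itself. They are untested sketches for this environment, not verified fixes: first, cap the allocator's block splitting via `PYTORCH_NPU_ALLOC_CONF` (the OOM message explicitly suggests `max_split_size_mb` to reduce fragmentation); second, rerun `convert_quant_weights.py` with `--device_type cpu` so the save happens from host memory — the script clearly accepts a `--device_type` flag, but whether `cpu` is a supported value needs to be checked against the script's argument parser.

```shell
# Sketch only -- not verified on this NPU setup.

# 1) Cap allocator block splitting, as the OOM message suggests, then retry:
export PYTORCH_NPU_ALLOC_CONF=max_split_size_mb:128
echo "$PYTORCH_NPU_ALLOC_CONF"

# 2) (Hypothetical) retry the conversion on CPU to sidestep NPU memory limits;
#    confirm that "cpu" is a valid --device_type value before relying on this:
# python examples/models/qwen/convert_quant_weights.py \
#     --model_path /home/apulis-dev/teamdata/Qwen2.5-7B-Instruct-AWQ \
#     --save_directory /home/apulis-dev/teamdata/Qwen2.5-7B \
#     --w_bit 8 --a_bit 8 --disable_level L5 --device_type cpu \
#     --calib_file /usr/local/Ascend/llm_model/examples/convert/model_slim/boolq.jsonl
```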

For follow-up, see "Using ATB Models standalone" under the model-inference workflow section of the MindIE LLM Development Guide (MindIE 1.0.RC3 documentation, Ascend Community).
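The traceback pinpoints the allocation: `set_fp_safetensor` in `quant_tools.py` calls `ori_model_state_dict.clone()`, which makes the copy on the NPU. A generic PyTorch pattern for avoiding device-side copies at save time — a sketch of the idea only, not a patch verified against msmodelslim — is to move each tensor to host memory before collecting it for safetensors serialization:

```python
import torch

def collect_for_save(state_dict):
    """Copy tensors to host memory so serialization allocates nothing on the device."""
    out = {}
    for name, tensor in state_dict.items():
        # .detach().cpu() produces a host-side copy instead of a device-side clone,
        # so the save step needs no additional NPU memory.
        out[name] = tensor.detach().cpu().contiguous()
    return out

# Usage sketch with a small CPU tensor standing in for a real weight:
sd = {"w": torch.randn(4, 4)}
host_sd = collect_for_save(sd)
print(host_sd["w"].device)  # cpu
```

Whether this can be applied cleanly inside `msmodelslim`'s `save_safetensor` path is an open question; the function and variable names above are illustrative, not taken from that library.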
