[AI Era] Installing and Using LLaMA-Factory, a Visual Model Training Tool

Contents

Installation
Training
Usage

Installation

Official repository: https://github.com/hiyouga/LLaMA-Factory

Create a virtual environment

conda create -n llama-factory python=3.10
conda activate llama-factory

Install

git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"

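Verify

After installation, run llamafactory-cli version to quickly check that the installation succeeded.

If the command prints the LLaMA-Factory welcome banner with its version number, the installation succeeded.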
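The same check can be scripted from Python. The sketch below assumes the package is installed under the distribution name llamafactory and exposes the llamafactory-cli entry point. check_install is a hypothetical helper written for this article, not part of LLaMA-Factory.

```python
import shutil
from importlib import metadata


def check_install(dist_name="llamafactory", cli_name="llamafactory-cli"):
    """Report the installed package version and the CLI location, if any."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # the package is not installed in this environment
    # shutil.which returns None when the executable is not on PATH
    return {"version": version, "cli_path": shutil.which(cli_name)}


if __name__ == "__main__":
    print(check_install())
```

If either value comes back as None, the most likely cause is that the conda environment is not activated in the current shell.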

Start the web UI

nohup llamafactory-cli webui > output.log 2>&1 &

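Once it has started, open the web UI in a browser at the address printed in output.log (Gradio listens on port 7860 by default).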
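Because the server runs in the background under nohup, the address only shows up in output.log. A minimal sketch for pulling it out of the log follows. It simply looks for any http(s) URL, so it does not depend on the exact wording of the startup message. find_urls and urls_from_log are hypothetical helper names.

```python
import re
from pathlib import Path

# Matches http:// or https:// followed by anything up to whitespace or a quote.
URL_PATTERN = re.compile(r"https?://[^\s'\"]+")


def find_urls(log_text):
    """Return every http(s) URL found in the text, in order, without duplicates."""
    seen = []
    for url in URL_PATTERN.findall(log_text):
        if url not in seen:
            seen.append(url)
    return seen


def urls_from_log(path="output.log"):
    """Read the log file if it exists and extract the URLs from it."""
    log_file = Path(path)
    if not log_file.exists():
        return []
    return find_urls(log_file.read_text(encoding="utf-8", errors="replace"))


if __name__ == "__main__":
    print(urls_from_log())
```

When the server runs on a remote machine, replace the host part of the URL with that machine's address or forward the port over SSH.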

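Training

Set a few parameters on the page.

Model path: either a Hugging Face model ID or a local path works. Most other parameters can stay at their defaults.

A custom dataset only appears on the page after it has been registered in the dataset configuration file (dataset_info.json in the dataset directory, data in this setup).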
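As an illustration, the sketch below registers an Alpaca-style JSON file in dataset_info.json. The entry layout (file_name plus a columns mapping) follows the format described in the LLaMA-Factory data README. The dataset name my_dataset, the file name my_dataset.json, and the helper register_alpaca_dataset are hypothetical.

```python
import json
from pathlib import Path


def register_alpaca_dataset(info_path, name, file_name):
    """Add (or overwrite) one Alpaca-style dataset entry in dataset_info.json."""
    info_file = Path(info_path)
    if info_file.exists():
        info = json.loads(info_file.read_text(encoding="utf-8"))
    else:
        info = {}
    info[name] = {
        "file_name": file_name,
        # Map the column roles to the keys used inside the data file.
        "columns": {"prompt": "instruction", "query": "input", "response": "output"},
    }
    info_file.write_text(
        json.dumps(info, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return info[name]


if __name__ == "__main__":
    # Hypothetical names; run from the LLaMA-Factory checkout, where data/ exists.
    if Path("data").is_dir():
        print(register_alpaca_dataset("data/dataset_info.json", "my_dataset", "my_dataset.json"))
```

The data file itself is expected to sit next to dataset_info.json and to contain a list of records with instruction, input and output keys.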

Once everything is configured, click "Preview command" to display the training command:

llamafactory-cli train \
    --stage sft \
    --do_train True \
    --model_name_or_path /mnt/largeroom/llm/model/DeepSeek-R1-Distill-Qwen-1.5B \
    --preprocessing_num_workers 16 \
    --finetuning_type lora \
    --template deepseek3 \
    --flash_attn auto \
    --dataset_dir data \
    --dataset alpaca_zh_demo \
    --cutoff_len 2048 \
    --learning_rate 5e-05 \
    --num_train_epochs 3.0 \
    --max_samples 1000000 \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 4 \
    --packing False \
    --report_to none \
    --output_dir saves/DeepSeek-R1-1.5B-Distill/lora/train_lora_02 \
    --bf16 True \
    --plot_loss True \
    --trust_remote_code True \
    --ddp_timeout 180000000 \
    --include_num_input_tokens_seen True \
    --optim adamw_torch \
    --lora_rank 16 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target all
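The LoRA flags at the end of the command are easier to read with a little arithmetic. In standard LoRA the low-rank update is scaled by lora_alpha / lora_rank, and each adapted linear layer gains rank × (d_in + d_out) trainable parameters. The sketch below works this out for the values in the command. The 1536 × 1536 layer shape is only an illustrative assumption, not a figure from the article.

```python
def lora_scaling(lora_alpha, lora_rank):
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return lora_alpha / lora_rank


def lora_extra_params(d_in, d_out, lora_rank):
    """Trainable parameters added to one linear layer.

    Matrix A has shape (rank, d_in) and matrix B has shape (d_out, rank).
    """
    return lora_rank * (d_in + d_out)


if __name__ == "__main__":
    print(lora_scaling(16, 16))           # --lora_alpha 16, --lora_rank 16
    added = lora_extra_params(1536, 1536, 16)
    full = 1536 * 1536                    # weights in the frozen layer itself
    print(added, full, added / full)
```

With rank and alpha both set to 16 the update is applied unscaled, and the adapter for such a layer is roughly 2% of the size of the frozen weight matrix.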

During training, all of my GPUs were in use.
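Using every GPU changes the effective batch size, because each device processes its own per-device batch. A minimal sketch of the calculation, with the batch size and accumulation steps taken from the command above. The GPU count of 4 is a hypothetical value, since the article does not state how many cards were used.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices):
    """Number of samples that contribute to one optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


if __name__ == "__main__":
    # --per_device_train_batch_size 4, --gradient_accumulation_steps 8
    print(effective_batch_size(4, 8, 1))  # single GPU
    print(effective_batch_size(4, 8, 4))  # hypothetical 4-GPU machine
```

If the effective batch size grows a lot after adding GPUs, lowering gradient_accumulation_steps is one way to keep it close to the single-GPU setting.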

When training finishes, the loss curve is displayed.
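Raw training loss is noisy, so loss curves are usually read after smoothing. The sketch below shows one common approach, an exponential moving average. It is a generic illustration and not necessarily the exact smoothing LLaMA-Factory applies to its plot. The loss values are made up.

```python
def ema_smooth(values, weight=0.9):
    """Exponential moving average; a higher weight gives a smoother curve."""
    smoothed = []
    last = None
    for value in values:
        # The first point is kept as-is; later points blend with the running average.
        last = value if last is None else weight * last + (1 - weight) * value
        smoothed.append(last)
    return smoothed


if __name__ == "__main__":
    losses = [2.4, 2.1, 2.3, 1.9, 1.8, 1.95, 1.7]  # made-up values
    for raw, smooth in zip(losses, ema_smooth(losses)):
        print(f"{raw:.2f} -> {smooth:.3f}")
```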

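The fine-tuned weights can be found in the output directory.

Usage

If the trained weights are stored separately from the base model, like the adapter weights in the output directory above, the model can be loaded by setting the checkpoint path.

vLLM must be installed beforehand: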

pip install vllm==0.7.2

If the installation is slow or the network connection fails, use -i to specify a mirror:

pip install vllm==0.7.2 -i https://pypi.tuna.tsinghua.edu.cn/simple/

Commonly used mirrors in China:

Alibaba Cloud: https://mirrors.aliyun.com/pypi/simple/
Douban: https://pypi.douban.com/simple/
Tsinghua University: https://pypi.tuna.tsinghua.edu.cn/simple/
University of Science and Technology of China: https://pypi.mirrors.ustc.edu.cn/simple/

You can also merge the fine-tuned weights into the base model by running llamafactory-cli export merge_config.yaml, with merge_config.yaml containing:

### model
model_name_or_path: /mnt/largeroom/llm/model/DeepSeek-R1-Distill-Qwen-1.5B
adapter_name_or_path: /mnt/largeroom/zhurunhua/LLaMA-Factory/saves/DeepSeek-R1-1.5B-Distill/lora/train_lora_02
template: deepseek3
finetuning_type: lora

### export
export_dir: /mnt/largeroom/llm/model/deepseek-r1-1.5b-peft
export_size: 2
export_device: cpu
export_legacy_format: false
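When several checkpoints need to be merged, writing this file by hand for each one gets tedious. The sketch below renders the same configuration from a few parameters. render_merge_config is a hypothetical helper, the keys and default values are copied from the file above, and the paths in the example call are placeholders.

```python
def render_merge_config(base_model, adapter_dir, export_dir, template="deepseek3"):
    """Build the text of a merge_config.yaml with the same keys as the file above."""
    lines = [
        "### model",
        f"model_name_or_path: {base_model}",
        f"adapter_name_or_path: {adapter_dir}",
        f"template: {template}",
        "finetuning_type: lora",
        "",
        "### export",
        f"export_dir: {export_dir}",
        "export_size: 2",
        "export_device: cpu",
        "export_legacy_format: false",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder paths; substitute your own base model, adapter and output directories.
    print(render_merge_config("/path/to/base-model", "/path/to/adapter", "/path/to/merged"))
```

The values are written unquoted, so paths containing characters that are special in YAML (such as a colon followed by a space, or a leading #) would need quoting before use.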