DeepSeek-R1: Model Deployment and Application in Practice

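Large language models (LLMs) have become a core driver in many fields, and DeepSeek-R1 is one of the models that has drawn the most attention for its performance on natural language processing tasks. This article walks through deploying DeepSeek-R1, the details of the code, and a few key points to watch when using it.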

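I. Introduction to the DeepSeek-R1 Model

DeepSeek-R1 is a causal language model built on the Transformer architecture. During pre-training it learns from a large volume of text, which gives it the ability to understand and generate natural language. Its strengths lie in long-text handling, semantic understanding, and generation quality, which makes it a good fit for tasks such as dialogue systems and text generation.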

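II. Model Deployment and Code Implementation

(1) Environment preparation

Before deploying DeepSeek-R1, make sure the environment meets the following requirements. An AutoDL server instance is one way to get such an environment:

Hardware: an NVIDIA GPU, to accelerate model inference.
Software: PyTorch, the transformers library, and their dependencies.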
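
The requirements above can be checked before loading any model weights. The following is a minimal sketch, not part of the original deployment script; the helper name check_packages is hypothetical, and it only tests whether the packages can be found, not whether their versions are compatible.

```python
import importlib.util


def check_packages(names):
    """Return {package_name: is_installed} without importing the packages."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    status = check_packages(["torch", "transformers"])
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'MISSING'}")

    # Only import torch if it is actually installed, then ask it about the GPU.
    if status["torch"]:
        import torch

        print("CUDA available:", torch.cuda.is_available())
```

If either package is reported missing, or CUDA is not available, the script in the following sections will fail at import time or at the .cuda() call.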

(2) Code walkthrough

Importing the required libraries

import os
import json
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
import unicodedata
from typing import List

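torch and transformers handle model loading and inference, unicodedata is used to clean user input, and typing supplies the type hints. os and json are imported for file and data handling but are not used in the snippets shown here.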

Generation function

@torch.inference_mode()
def generate(
    model: AutoModelForCausalLM,
    input_ids: torch.Tensor,
    attention_mask: torch.Tensor,
    max_new_tokens: int,
    temperature: float = 1.0,
) -> List[int]:
    # Sample a continuation; the returned sequence starts with the prompt tokens.
    outputs = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        max_new_tokens=max_new_tokens,
        temperature=temperature,
        eos_token_id=model.config.eos_token_id,
        pad_token_id=model.config.eos_token_id,  # reuse EOS as the padding token
        do_sample=True,
        top_k=50,
        top_p=0.95,
    )
    return outputs[0].tolist()

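The generate function produces a reply for the given input. The attention_mask tells the model which positions hold real tokens and which are padding, so that it attends only to the right positions. Because do_sample=True, the output is sampled rather than chosen greedily; temperature, top_k, and top_p control how random and how diverse the sampled text is.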
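
To make the effect of temperature, top_k, and top_p concrete, here is a simplified pure-Python sketch of how the three settings narrow down the next-token distribution. It is an illustration only, not the transformers implementation; the function names and the toy logits are made up for this example.

```python
import math
import random


def sampling_distribution(logits, temperature=1.0, top_k=50, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature: divide the logits before softmax; <1 sharpens, >1 flattens.
    scaled = [x / temperature for x in logits]
    # Top-k: keep only the k highest-scoring tokens, best first.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = scaled[order[0]]
    weights = [math.exp(scaled[i] - peak) for i in order]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i, p in zip(order, probs):
        kept.append((i, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalise so the kept probabilities sum to 1.
    norm = sum(p for _, p in kept)
    return {i: p / norm for i, p in kept}


def sample(logits, rng=random, **kwargs):
    """Draw one token index from the filtered distribution."""
    dist = sampling_distribution(logits, **kwargs)
    tokens, probs = zip(*dist.items())
    return rng.choices(tokens, weights=probs, k=1)[0]


toy_logits = [4.0, 3.0, 1.0, 0.5, -2.0]
print(sampling_distribution(toy_logits, temperature=0.7, top_k=3, top_p=0.95))
print(sample(toy_logits, temperature=0.7, top_k=3, top_p=0.95))
```

Lowering the temperature concentrates probability on the top token, while smaller top_k or top_p values discard more of the low-probability tail before sampling.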

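Input cleaning function

def clean_input(user_input):
    # Drop every character whose Unicode category starts with "C"
    # (control, format, and other non-printing characters).
    user_input = "".join(
        c for c in user_input if not unicodedata.category(c).startswith("C")
    )
    return user_input.strip()

The clean_input function sanitizes user input: it removes invisible characters (including control characters such as tabs and newlines) and trims leading and trailing whitespace, so the text passed to the model is clean.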

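Message content cleaning function

def clean_message_content(content):
    # Treat missing or non-string content as empty.
    if not content or not isinstance(content, str):
        return ""
    return content.strip()

This function cleans the content of a message and filters out invalid entries: anything that is empty or not a string becomes an empty string.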

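Prompt building function

def build_prompt(messages, max_history=3):
    template = (
        "The following is a conversation with an AI assistant. "
        "The assistant is helpful, knowledgeable, and polite:\n"
    )
    # Only the most recent max_history messages go into the prompt.
    for msg in messages[-max_history:]:
        content = clean_message_content(msg["content"])
        if not content:
            continue
        template += f"{msg['role'].capitalize()}: {content}\n"
    template += "Assistant: "
    return template.strip()

The build_prompt function assembles the prompt that is fed to the model. Limiting the conversation history to the last max_history messages keeps the prompt short, which reduces computation and speeds up the model's response.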

(3) Model loading and interaction

if __name__ == "__main__":
    print("Initializing DeepSeek-R1 Service...")

    ckpt_path = "/root/DeepSeek-R1"
    config_path = "/root/DeepSeek-R1/config.json"  # not used below
    tokenizer = AutoTokenizer.from_pretrained(ckpt_path)
    model = AutoModelForCausalLM.from_pretrained(
        ckpt_path,
        torch_dtype=torch.bfloat16,
    ).cuda()

    messages = []
    max_new_tokens = 150
    temperature = 0.7

    while True:
        user_input = clean_input(input("You: "))
        if not user_input:
            print("Invalid input. Please type something meaningful!")
            continue
        if user_input.lower() in ["exit", "quit"]:
            print("Exiting conversation. Goodbye!")
            break

        messages.append({"role": "user", "content": user_input})
        messages = messages[-10:]  # cap the stored history

        prompt = build_prompt(messages)
        if not isinstance(prompt, str) or len(prompt.strip()) == 0:
            print("Error: Prompt is empty or invalid. Skipping this turn.")
            continue

        tokenized_prompt = tokenizer(
            prompt, return_tensors="pt", truncation=True, padding=True
        )
        input_ids = tokenized_prompt["input_ids"].to("cuda")
        attention_mask = tokenized_prompt["attention_mask"].to("cuda")

        completion_tokens = generate(
            model, input_ids, attention_mask, max_new_tokens, temperature
        )
        # Skip the echoed prompt tokens, then cut off any extra "User:" turn
        # the model may have continued writing.
        completion = tokenizer.decode(
            completion_tokens[len(input_ids[0]):],
            skip_special_tokens=True,
        ).split("User:")[0].strip()

        print(f"Assistant: {completion}")
        messages.append({"role": "assistant", "content": completion})

The main program first loads the model and tokenizer, then enters a loop that waits for user input. Each input is cleaned, turned into a prompt, and passed to the model for inference. The generated tokens are decoded, shown to the user, and appended to the conversation history.
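
The decoding step in the loop relies on two details: for a decoder-only model the generated sequence begins with the prompt tokens, and the model may keep writing a new "User:" turn after its answer. The sketch below isolates that post-processing with a toy stand-in for the tokenizer; extract_reply, toy_decode, and the five-word vocabulary are hypothetical names used only for illustration.

```python
def extract_reply(output_ids, prompt_len, decode, stop_marker="User:"):
    """Drop the echoed prompt tokens, decode the rest, and cut at the stop marker."""
    new_ids = output_ids[prompt_len:]  # the output is prompt + completion
    text = decode(new_ids)
    return text.split(stop_marker)[0].strip()  # discard any invented next turn


# Toy stand-in for a tokenizer: one "token" per word (illustration only).
vocab = ["Assistant:", "Hello", "there!", "User:", "Thanks"]


def toy_decode(ids):
    return " ".join(vocab[i] for i in ids)


prompt_ids = [0]                         # "Assistant:"
output_ids = prompt_ids + [1, 2, 3, 4]   # prompt echoed, then the continuation
reply = extract_reply(output_ids, len(prompt_ids), toy_decode)
print(reply)
```

Without the slice, the reply would start with the whole prompt; without the split, the stored history would also contain the user turn the model made up.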

config.json:

architectures: ["Qwen2ForCausalLM"]
attention_dropout: 0
bos_token_id: 151643
eos_token_id: 151643
hidden_act: "silu"
hidden_size: 1536
initializer_range: 0.02
intermediate_size: 8960
max_position_embeddings: 131072
max_window_layers: 21
model_type: "qwen2"
num_attention_heads: 12
num_hidden_layers: 28
num_key_value_heads: 2
rms_norm_eps: 0.000001
rope_theta: 10000
sliding_window: 4096
tie_word_embeddings: false
torch_dtype: "bfloat16"
transformers_version: "4.44.0"
use_cache: true
use_mrope: false
use_sliding_window: false
vocab_size: 151936
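
The architecture values listed above are enough to estimate the size of the checkpoint, which helps when choosing a GPU. The following is a rough sketch that assumes the usual Qwen2 layer layout (bias terms on the query, key, and value projections only, a gated MLP with three weight matrices, and RMSNorm weights); the function name qwen2_param_count is hypothetical.

```python
def qwen2_param_count(cfg):
    """Estimate the parameter count from config values (assumed Qwen2 layout)."""
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]  # grouped-query attention

    # Attention: q (with bias), k and v (with bias), output projection (no bias).
    attn = (h * h + h) + 2 * (h * kv_dim + kv_dim) + h * h
    # Gated MLP: gate, up, and down projections, no bias.
    mlp = 3 * h * cfg["intermediate_size"]
    # Two RMSNorm weight vectors per layer.
    norms = 2 * h
    per_layer = attn + mlp + norms

    embed = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else embed
    final_norm = h
    return cfg["num_hidden_layers"] * per_layer + embed + lm_head + final_norm


cfg = {
    "hidden_size": 1536,
    "intermediate_size": 8960,
    "num_attention_heads": 12,
    "num_key_value_heads": 2,
    "num_hidden_layers": 28,
    "vocab_size": 151936,
    "tie_word_embeddings": False,
}

total = qwen2_param_count(cfg)
print(f"parameters: {total:,}")
print(f"bfloat16 weights: {total * 2 / 1024**3:.2f} GiB")
```

Under these assumptions the total comes to roughly 1.78 billion parameters, so the bfloat16 weights alone need a little over 3 GiB of GPU memory, before activations and the KV cache are counted.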

generation_config.json:

_from_model_config: true
bos_token_id: 151646
eos_token_id: 151643
do_sample: true
temperature: 0.6
top_p: 0.95
transformers_version: "4.39.3"

tokenizer_config.json:

add_bos_token: true
add_eos_token: false
clean_up_tokenization_spaces: false
legacy: true
model_max_length: 16384
sp_model_kwargs: {}
unk_token: null
tokenizer_class: "LlamaTokenizerFast"
chat_template: "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% set ns = namespace(is_first=false, is_tool=false, is_output_first=true, system_prompt='') %}{%- for message in messages %}{%- if message['role'] == 'system' %}{% set ns.system_prompt = message['content'] %}{%- endif %}{%- endfor %}{{bos_token}}{{ns.system_prompt}}{%- for message in messages %}{%- if message['role'] == 'user' %}{%- set ns.is_tool = false -%}{{'<｜User｜>' + message['content']}}{%- endif %}{%- if message['role'] == 'assistant' and message['content'] is none %}{%- set ns.is_tool = false -%}{%- for tool in message['tool_calls']%}{%- if not ns.is_first %}{{'<｜Assistant｜><｜tool▁calls▁begin｜><｜tool▁call▁begin｜>' + tool['type'] + '<｜tool▁sep｜>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<｜tool▁call▁end｜>'}}{%- set ns.is_first = true -%}{%- else %}{{'\n' + '<｜tool▁call▁begin｜>' + tool['type'] + '<｜tool▁sep｜>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<｜tool▁call▁end｜>'}}{{'<｜tool▁calls▁end｜><｜end▁of▁sentence｜>'}}{%- endif %}{%- endfor %}{%- endif %}{%- if message['role'] == 'assistant' and message['content'] is not none %}{%- if ns.is_tool %}{{'<｜tool▁outputs▁end｜>' + message['content'] + '<｜end▁of▁sentence｜>'}}{%- set ns.is_tool = false -%}{%- else %}{% set content = message['content'] %}{% if '</think>' in content %}{% set content = content.split('</think>')[-1] %}{% endif %}{{'<｜Assistant｜>' + content + '<｜end▁of▁sentence｜>'}}{%- endif %}{%- endif %}{%- if message['role'] == 'tool' %}{%- set ns.is_tool = true -%}{%- if ns.is_output_first %}{{'<｜tool▁outputs▁begin｜><｜tool▁output▁begin｜>' + message['content'] + '<｜tool▁output▁end｜>'}}{%- set ns.is_output_first = false %}{%- else %}{{'\n<｜tool▁output▁begin｜>' + message['content'] + '<｜tool▁output▁end｜>'}}{%- endif %}{%- endif %}{%- endfor -%}{% if ns.is_tool %}{{'<｜tool▁outputs▁end｜>'}}{% endif %}{% if add_generation_prompt and not ns.is_tool %}{{'<｜Assistant｜>'}}{% endif %}"
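
The chat template above is dense, so here is a plain-Python sketch of what it produces for ordinary conversations without tool calls. It is an approximation for reading purposes, not a replacement for the template; the function name render_chat is hypothetical, and the default bos_token value is an assumption, since the template only refers to it by name.

```python
def render_chat(messages, add_generation_prompt=True,
                bos_token="<｜begin▁of▁sentence｜>"):  # assumed BOS string
    """Approximate the chat template for system/user/assistant messages only."""
    # The last system message, if any, goes right after the BOS token.
    system = ""
    for m in messages:
        if m["role"] == "system":
            system = m["content"]
    out = bos_token + system

    for m in messages:
        if m["role"] == "user":
            out += "<｜User｜>" + m["content"]
        elif m["role"] == "assistant":
            # Earlier reasoning is dropped: only the text after </think> is kept.
            content = m["content"].split("</think>")[-1]
            out += "<｜Assistant｜>" + content + "<｜end▁of▁sentence｜>"

    # The trailing tag asks the model to start its next reply.
    if add_generation_prompt:
        out += "<｜Assistant｜>"
    return out


history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "<think>greeting</think>Hello!"},
    {"role": "user", "content": "What is 2 + 2?"},
]
print(render_chat(history))
```

In a real script the same result would normally come from tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True), which renders the template stored with the tokenizer. Note that this format differs from the "User: ... Assistant: ..." prompt built by build_prompt earlier in the article.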

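IV. Summary and Outlook

The steps above deploy DeepSeek-R1 and implement a simple dialogue system around it. In practice there is room to improve performance further, for example by tuning the generation hyperparameters or moving to faster hardware. Combining the model with more domain knowledge and data can also make it more useful on specific tasks. As large language model technology continues to develop, DeepSeek-R1 can be expected to find use in more fields and to contribute further advances in natural language processing.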

Appendix:
https://www.codewithgpu.com/i/deepseek-ai/DeepSeek-R1/DeepSeek-R1/1624/4
