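Tools in Dify

Tools in Dify fall into two groups: built-in tools (hard-coded) and third-party tools (OpenAPI Swagger/ChatGPT Plugin). Tools can be used by both Workflows and Agents. A Workflow can also be published as a tool, so one Workflow can call another Workflow as a tool.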

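I. Dify Built-in Tools

The following uses Google as the example. On the frontend you only need to enter a SerpApi API key, so the focus here is on the backend implementation.

Source location: dify-0.6.9/api/core/tools/provider/builtin/google

1. Prepare the tool provider YAML

Source location: dify-0.6.9/api/core/tools/provider/builtin/google/google.yaml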

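identity:  # basic information about the tool provider
  author: Dify  # author
  name: google  # provider name; must be unique and must not clash with any other provider
  label:  # label shown on the frontend
    en_US: Google  # English label
    zh_Hans: Google  # Simplified Chinese label
    pt_BR: Google  # Portuguese label
  description:  # description shown on the frontend
    en_US: Google  # English description
    zh_Hans: GoogleSearch  # Simplified Chinese description
    pt_BR: Google  # Portuguese description
  icon: icon.svg  # icon file name; the file goes in the _assets directory of the current module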

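2. Prepare the provider credentials

Source location: dify-0.6.9/api/core/tools/provider/builtin/google/google.yaml

The Google tool relies on the API provided by SerpApi, and SerpApi requires an API key. In other words, the tool needs a credential before it can be used, which is why the frontend asks for a SerpApi API key.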

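credentials_for_provider:  # credential fields
  serpapi_api_key:  # unique identifier of the credential field
    type: secret-input  # type of the credential field
    required: true  # whether the field is required
    label:  # label shown on the frontend
      en_US: SerpApi API key  # English label
      zh_Hans: SerpApi API key  # Simplified Chinese label
      pt_BR: SerpApi API key  # Portuguese label
    placeholder:  # placeholder shown on the frontend
      en_US: Please input your SerpApi API key  # English placeholder
      zh_Hans: 请输入你的 SerpApi API key  # Simplified Chinese placeholder
      pt_BR: Please input your SerpApi API key  # Portuguese placeholder
    help:  # help text for the credential field
      en_US: Get your SerpApi API key from SerpApi  # English help text
      zh_Hans: 从 SerpApi 获取您的 SerpApi API key  # Simplified Chinese help text
      pt_BR: Get your SerpApi API key from SerpApi  # Portuguese help text
    url: https://serpapi.com/manage-api-key  # help link for the credential field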

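type: the credential field type. Three types are currently supported: secret-input, text-input, and select, which correspond to a password input box, a text input box, and a drop-down list. With secret-input, the frontend hides the input and the backend encrypts it.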

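3. Prepare the tool YAML

Source location: dify-0.6.9/api/core/tools/provider/builtin/google/tools/google_search.yaml

A provider can have several tools. Each tool is described by its own YAML file, which holds the tool's basic information, parameters, output, and so on.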

identity:  # basic information about the tool
  name: google_search  # unique name of the tool
  author: Dify  # author of the tool
  label:  # tool label shown on the frontend
    en_US: GoogleSearch  # English label
    zh_Hans: 谷歌搜索  # Simplified Chinese label
    pt_BR: GoogleSearch  # Portuguese label
description:  # description of the tool
  human:  # human-readable description
    en_US: A tool for performing a Google SERP search and extracting snippets and webpages.Input should be a search query.
    zh_Hans: 一个用于执行 Google SERP 搜索并提取片段和网页的工具。输入应该是一个搜索查询。
    pt_BR: A tool for performing a Google SERP search and extracting snippets and webpages.Input should be a search query.
  # Description passed to the LLM. To help the LLM understand and use the tool,
  # put as much detail about the tool here as possible.
  llm: A tool for performing a Google SERP search and extracting snippets and webpages.Input should be a search query.
parameters:  # parameter list
  - name: query  # parameter name
    type: string  # parameter type
    required: true  # whether the parameter is required
    label:  # parameter label
      en_US: Query string  # English label
      zh_Hans: 查询语句  # Simplified Chinese label
      pt_BR: Query string  # Portuguese label
    human_description:  # parameter description shown on the frontend
      en_US: used for searching  # English description
      zh_Hans: 用于搜索网页内容  # Simplified Chinese description
      pt_BR: used for searching  # Portuguese description
    # Description passed to the LLM. As above, put as much detail about the
    # parameter here as possible so that the LLM can understand it.
    llm_description: key words for searching
    form: llm  # form type; llm means the Agent infers this parameter itself and the frontend does not show it
  - name: result_type  # parameter name
    type: select  # parameter type
    required: true  # whether the parameter is required
    options:  # options for the parameter
      - value: text
        label:
          en_US: text
          zh_Hans: 文本
          pt_BR: texto
      - value: link
        label:
          en_US: link
          zh_Hans: 链接
          pt_BR: link
    default: link  # the default value is link
    label:
      en_US: Result type
      zh_Hans: 结果类型
      pt_BR: Result type
    human_description:
      en_US: used for selecting the result type, text or link
      zh_Hans: 用于选择结果类型，使用文本还是链接进行展示
      pt_BR: used for selecting the result type, text or link
    form: form  # form type; form means the user fills in this parameter on the frontend before the conversation starts

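identity: required. It holds the tool's basic information, including name, author, label, and description.
parameters: the parameter list.
- name: the parameter name. It must be unique and must not clash with other parameters.
- type: the parameter type. Four types are currently supported: string, number, boolean, and select, which correspond to a string, a number, a boolean, and a drop-down list.
- required: whether the parameter is required.
  - In llm mode, a required parameter must be inferred by the Agent.
  - In form mode, a required parameter must be filled in by the user on the frontend before the conversation starts.
- options: the parameter options.
  - In llm mode, Dify passes all options to the LLM, which can reason over them.
  - In form mode, when type is select, the frontend displays these options.
- default: the default value.
- label: the parameter label, shown on the frontend.
- human_description: the description shown on the frontend; supports multiple languages.
- llm_description: the description passed to the LLM. To help the LLM understand the parameter, put as much detail about it here as possible.
- form: the form type. Two types are currently supported, llm and form, which correspond to inference by the Agent and input on the frontend.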
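The difference between the llm and form modes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the PARAMETERS list is a trimmed-down, hand-written copy of the two parameters from google_search.yaml, and split_by_form and check_form_values are hypothetical helpers, not functions from the Dify code base.

```python
from typing import Any

# Hypothetical, trimmed-down copy of the two parameters in google_search.yaml.
PARAMETERS: list[dict[str, Any]] = [
    {"name": "query", "type": "string", "required": True, "form": "llm"},
    {"name": "result_type", "type": "select", "required": True, "form": "form",
     "options": ["text", "link"], "default": "link"},
]


def split_by_form(parameters: list[dict[str, Any]]) -> tuple[list[str], list[str]]:
    """Return (names the Agent must infer, names the user fills in on the frontend)."""
    llm_names = [p["name"] for p in parameters if p["form"] == "llm"]
    form_names = [p["name"] for p in parameters if p["form"] == "form"]
    return llm_names, form_names


def check_form_values(parameters: list[dict[str, Any]],
                      values: dict[str, Any]) -> dict[str, Any]:
    """Apply defaults and validate the values supplied for form parameters."""
    resolved: dict[str, Any] = {}
    for p in parameters:
        if p["form"] != "form":
            continue  # llm parameters are supplied by the Agent, not the user
        value = values.get(p["name"], p.get("default"))
        if value is None:
            if p["required"]:
                raise ValueError(f"missing required parameter: {p['name']}")
            continue
        if p["type"] == "select" and value not in p["options"]:
            raise ValueError(f"invalid option for {p['name']}: {value}")
        resolved[p["name"]] = value
    return resolved
```

Here query ends up in the list the Agent has to infer, while result_type is resolved from user input and falls back to its default when the user leaves it empty.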

4. Prepare the tool code

Source location: dify-0.6.9/api/core/tools/provider/builtin/google/tools/google_search.py

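from typing import Any, Union

from core.tools.entities.tool_entities import ToolInvokeMessage
from core.tools.tool.builtin_tool import BuiltinTool

# SerpAPI is a helper class that wraps the SerpApi HTTP call (not shown here).


class GoogleSearchTool(BuiltinTool):
    def _invoke(self,
                user_id: str,  # user ID
                tool_parameters: dict[str, Any],  # tool parameters
                ) -> Union[ToolInvokeMessage, list[ToolInvokeMessage]]:  # tool invocation message(s)
        """
        invoke tools
        """
        query = tool_parameters['query']  # the search query
        result_type = tool_parameters['result_type']  # the result type
        api_key = self.runtime.credentials['serpapi_api_key']  # the API key
        result = SerpAPI(api_key).run(query, result_type=result_type)  # run the query
        if result_type == 'text':  # the result type is text
            return self.create_text_message(text=result)  # return a text message
        return self.create_link_message(link=result)  # return a link message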

5. Prepare the provider code

Source location: dify-0.6.9/api/core/tools/provider/builtin/google/google.py

google.py defines the provider class for this provider. Its main job is to validate the credentials entered on the frontend, for example by invoking google_search once with a test query and raising a credential validation error if the call fails.
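One common way for a provider to check credentials is to run one of its tools once with a test query and turn any failure into a validation error. The sketch below shows that pattern only. FakeSearchTool, CredentialValidationError, and validate_credentials are hypothetical stand-ins written for this example, not Dify's actual classes, and the stub tool never calls SerpApi.

```python
from typing import Any


class CredentialValidationError(Exception):
    """Hypothetical stand-in for the error raised on invalid credentials."""


class FakeSearchTool:
    """Hypothetical stand-in for a tool that needs an API key to run."""

    def __init__(self, credentials: dict[str, Any]) -> None:
        self.credentials = credentials

    def invoke(self, tool_parameters: dict[str, Any]) -> str:
        api_key = self.credentials.get("serpapi_api_key")
        if not api_key:
            raise ValueError("serpapi_api_key is missing")
        # A real tool would call the search API here; this stub echoes the query.
        return f"results for {tool_parameters['query']}"


def validate_credentials(credentials: dict[str, Any]) -> None:
    """Run one test query and convert any failure into a validation error."""
    try:
        FakeSearchTool(credentials).invoke({"query": "test", "result_type": "link"})
    except Exception as e:
        raise CredentialValidationError(str(e)) from e
```

The design choice here is that validation reuses the tool's own invocation path, so a key that passes validation is a key the tool can actually use.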

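II. Returning Messages from the Tool Interface

1. Message types

Source location: dify-0.6.9/api/core/tools/tool/tool.py

Dify supports several message types, including text, link, image, and file BLOB. The following interfaces return the different message types to the LLM and the user.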

def create_image_message(self, image: str, save_as: str = '') -> ToolInvokeMessage:
    """
    create an image message
    :param image: the url of the image
    :return: the image message
    """
    return ToolInvokeMessage(type=ToolInvokeMessage.MessageType.IMAGE,
                             message=image,
                             save_as=save_as)

def create_file_var_message(self, file_var: FileVar) -> ToolInvokeMessage:
    return ToolInvokeMessage(type=ToolInvokeMessage.MessageType.FILE_VAR,
                             message='',
                             meta={'file_var': file_var},
                             save_as='')

def create_link_message(self, link: str, save_as: str = '') -> ToolInvokeMessage:
    """
    create a link message
    :param link: the url of the link
    :return: the link message
    """
    return ToolInvokeMessage(type=ToolInvokeMessage.MessageType.LINK,
                             message=link,
                             save_as=save_as)

def create_text_message(self, text: str, save_as: str = '') -> ToolInvokeMessage:
    """
    create a text message
    :param text: the text
    :return: the text message
    """
    return ToolInvokeMessage(type=ToolInvokeMessage.MessageType.TEXT,
                             message=text,
                             save_as=save_as)

def create_blob_message(self, blob: bytes, meta: dict = None, save_as: str = '') -> ToolInvokeMessage:
    """
    create a blob message
    :param blob: the blob
    :return: the blob message
    """
    return ToolInvokeMessage(type=ToolInvokeMessage.MessageType.BLOB,
                             message=blob,
                             meta=meta,
                             save_as=save_as)

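To return the raw data of a file, such as an image, audio, video, PPT, Word, or Excel file, use a file BLOB.

blob: the raw file data, of type bytes.
meta: the file metadata. If you know the file type, pass a mime_type; otherwise Dify uses octet/stream as the default type. For example: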

# b64decode decodes a Base64-encoded string back into the original bytes
self.create_blob_message(blob=b64decode(image.b64_json), meta={ 'mime_type': 'image/png' }, save_as=self.VARIABLE_KEY.IMAGE.value)

self.create_blob_message(blob=response.content, meta={'mime_type': 'image/svg+xml'})

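application/octet-stream is the generic MIME type for binary data. An "octet" is an 8-bit byte and "stream" refers to a stream of data. The type is normally used for unknown binary data: when a file is downloaded or uploaded and the server or client cannot determine its specific type, it may fall back to application/octet-stream. For example, when you download an .exe or .zip file, the Content-Type header of the HTTP response may be set to application/octet-stream.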
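The fallback idea can be tried out with the Python standard library alone. The sketch below guesses a MIME type from a file name and falls back to the generic binary type when nothing is known. build_blob_meta and DEFAULT_MIME_TYPE are names invented for this example; the resulting dict merely has the same shape as the meta argument shown above.

```python
import mimetypes
from base64 import b64decode, b64encode

# Generic binary fallback used when the file type cannot be determined.
DEFAULT_MIME_TYPE = "application/octet-stream"


def build_blob_meta(filename: str) -> dict[str, str]:
    """Guess the MIME type from the file name, falling back to generic binary."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return {"mime_type": mime_type or DEFAULT_MIME_TYPE}


# Base64 text (as an image API might return it) decodes back to the raw bytes.
raw = b"\x89PNG\r\n"
encoded = b64encode(raw).decode("ascii")
blob = b64decode(encoded)
```

A file name with a known extension such as logo.png yields a specific type, while a name without any extension yields the fallback.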

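2. Summarization and web scraping

Two other commonly used helpers are a text summarization tool and a web scraping tool:

Source location: dify-0.6.9/api/core/tools/tool/builtin_tool.py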

def summary(self, user_id: str, content: str) -> str:
    max_tokens = self.get_max_tokens()

    if self.get_prompt_tokens(prompt_messages=[
        UserPromptMessage(content=content)
    ]) < max_tokens * 0.6:
        return content

    def get_prompt_tokens(content: str) -> int:
        return self.get_prompt_tokens(prompt_messages=[
            SystemPromptMessage(content=_SUMMARY_PROMPT),
            UserPromptMessage(content=content)
        ])

    def summarize(content: str) -> str:
        summary = self.invoke_model(user_id=user_id, prompt_messages=[
            SystemPromptMessage(content=_SUMMARY_PROMPT),
            UserPromptMessage(content=content)
        ], stop=[])

        return summary.message.content

    lines = content.split('\n')
    new_lines = []
    # split long line into multiple lines
    for i in range(len(lines)):
        line = lines[i]
        if not line.strip():
            continue
        if len(line) < max_tokens * 0.5:
            new_lines.append(line)
        elif get_prompt_tokens(line) > max_tokens * 0.7:
            while get_prompt_tokens(line) > max_tokens * 0.7:
                new_lines.append(line[:int(max_tokens * 0.5)])
                line = line[int(max_tokens * 0.5):]
            new_lines.append(line)
        else:
            new_lines.append(line)

    # merge lines into messages with max tokens
    messages: list[str] = []
    for i in new_lines:
        if len(messages) == 0:
            messages.append(i)
        else:
            if len(messages[-1]) + len(i) < max_tokens * 0.5:
                messages[-1] += i
            if get_prompt_tokens(messages[-1] + i) > max_tokens * 0.7:
                messages.append(i)
            else:
                messages[-1] += i

    summaries = []
    for i in range(len(messages)):
        message = messages[i]
        summary = summarize(message)
        summaries.append(summary)

    result = '\n'.join(summaries)

    if self.get_prompt_tokens(prompt_messages=[
        UserPromptMessage(content=result)
    ]) > max_tokens * 0.7:
        return self.summary(user_id=user_id, content=result)

    return result

def get_url(self, url: str, user_agent: str = None) -> str:
    """
    get url
    """
    return get_url(url, user_agent=user_agent)
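The overall shape of summary is: return short text unchanged, otherwise pack lines into chunks that fit a token budget, summarize each chunk, and recurse if the joined summaries are still too long. The sketch below is a simplified, self-contained version of that idea and not a copy of Dify's logic. count_tokens (one token per word) and first_three_words are hypothetical stubs standing in for the model's tokenizer and the LLM call, and unlike the original it does not split a single over-long line.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    """Hypothetical token counter: one token per whitespace-separated word."""
    return len(text.split())


def summarize_long_text(content: str, max_tokens: int,
                        summarize: Callable[[str], str]) -> str:
    # Short enough already: return the text unchanged.
    if count_tokens(content) < max_tokens * 0.6:
        return content

    # Greedily pack non-empty lines into chunks that stay within the budget.
    budget = max_tokens * 0.7
    chunks: list[str] = []
    for line in content.split("\n"):
        if not line.strip():
            continue
        if chunks and count_tokens(chunks[-1] + " " + line) <= budget:
            chunks[-1] += " " + line
        else:
            chunks.append(line)

    # Summarize each chunk on its own and join the partial summaries.
    result = "\n".join(summarize(chunk) for chunk in chunks)

    # Recurse only while the result is still too long and is actually shrinking.
    if count_tokens(result) > budget and len(result) < len(content):
        return summarize_long_text(result, max_tokens, summarize)
    return result


def first_three_words(text: str) -> str:
    """Hypothetical summarizer stub: keep only the first three words."""
    return " ".join(text.split()[:3])
```

The extra "is it shrinking" check is an addition of this sketch: it guarantees termination even if the stub summarizer fails to shorten its input.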

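3. Variable pool

Put simply, the variable pool stores the variables, files, and other data produced while tools run, so that other tools can use them during the same run. The following uses DallE3 and Vectorizer.AI to show how the variable pool is used.

DallE3 is an image generation tool that creates images from text. Here DallE3 generates a logo for a coffee shop.
Vectorizer.AI is a vector conversion tool that turns images into vector graphics. Here it converts the PNG icon generated by DallE3 into a vector image that a designer can actually use.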

# Message returned by DallE
self.create_blob_message(blob=b64decode(image.b64_json), meta={ 'mime_type': 'image/png' }, save_as=self.VARIABLE_KEY.IMAGE.value)

# Fetch the image previously generated by DallE from the variable pool
image_binary = self.get_variable_file(self.VARIABLE_KEY.IMAGE)
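The hand-off between the two tools can be modelled as a small keyed store that one tool writes to and the next one reads from. The sketch below is a toy model under stated assumptions: VariablePool, VariableKey, generate_logo, and vectorize_logo are all hypothetical names, the image bytes are fake, and Dify's real pool is more involved than an in-memory dict.

```python
from enum import Enum
from typing import Optional


class VariableKey(Enum):
    """Hypothetical keys under which tools store their outputs."""
    IMAGE = "image"


class VariablePool:
    """Hypothetical in-memory pool shared by the tools of one run."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    def save_file(self, key: VariableKey, blob: bytes) -> None:
        self._files[key.value] = blob

    def get_file(self, key: VariableKey) -> Optional[bytes]:
        return self._files.get(key.value)


def generate_logo(pool: VariablePool) -> None:
    """Stand-in for the image tool: store fake PNG bytes under the IMAGE key."""
    pool.save_file(VariableKey.IMAGE, b"\x89PNG fake image data")


def vectorize_logo(pool: VariablePool) -> str:
    """Stand-in for the vectorizer: read the image saved by the earlier tool."""
    image = pool.get_file(VariableKey.IMAGE)
    if image is None:
        raise ValueError("no image in the variable pool; run the image tool first")
    return f"vector graphic traced from {len(image)} bytes"
```

Calling vectorize_logo before generate_logo fails, which mirrors the ordering dependency between the two tools.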

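III. Dify Third-party Tools

Custom tools can be created from the OpenAPI Swagger and ChatGPT Plugin specifications. The OpenAPI schema can be pasted in directly or imported from a URL. Tools currently support two authentication methods: no authentication and API Key.

1. Weather (JSON)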

{
  "openapi": "3.1.0",
  "info": {
    "title": "Get weather data",
    "description": "Retrieves current weather data for a location.",
    "version": "v1.0.0"
  },
  "servers": [
    {
      "url": "https://weather.example.com"
    }
  ],
  "paths": {
    "/location": {
      "get": {
        "description": "Get temperature for a specific location",
        "operationId": "GetCurrentWeather",
        "parameters": [
          {
            "name": "location",
            "in": "query",
            "description": "The city and state to retrieve the weather for",
            "required": true,
            "schema": {
              "type": "string"
            }
          }
        ],
        "deprecated": false
      }
    }
  },
  "components": {
    "schemas": {}
  }
}
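To see what a tool importer can read out of such a schema, the sketch below walks the paths object and lists one entry per operation. It illustrates the structure of an OpenAPI document only and is not Dify's importer. list_operations is a hypothetical helper, and WEATHER_SCHEMA is a trimmed Python copy of the weather schema with the fields this example does not need left out.

```python
from typing import Any

# Trimmed copy of the weather schema; only the fields used below are kept.
WEATHER_SCHEMA: dict[str, Any] = {
    "openapi": "3.1.0",
    "servers": [{"url": "https://weather.example.com"}],
    "paths": {
        "/location": {
            "get": {
                "operationId": "GetCurrentWeather",
                "parameters": [
                    {"name": "location", "in": "query", "required": True,
                     "schema": {"type": "string"}},
                ],
            }
        }
    },
}

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(schema: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one entry per operation: name, HTTP method, URL, parameter names."""
    servers = schema.get("servers") or [{"url": ""}]
    base_url = servers[0]["url"]
    operations = []
    for path, item in schema.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            operations.append({
                "name": op.get("operationId"),
                "method": method.upper(),
                "url": base_url + path,
                "parameters": [p["name"] for p in op.get("parameters", [])],
            })
    return operations
```

Each entry corresponds to one callable tool: the operationId gives it a name, and the parameter list tells the caller which arguments to supply.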

2. Pet store (YAML)

# Taken from https://github.com/OAI/OpenAPI-Specification/blob/main/examples/v3.0/petstore.yaml
openapi: "3.0.0"
info:
  version: 1.0.0
  title: Swagger Petstore
  license:
    name: MIT
servers:
  - url: https://petstore.swagger.io/v1
paths:
  /pets:
    get:
      summary: List all pets
      operationId: listPets
      tags:
        - pets
      parameters:
        - name: limit
          in: query
          description: How many items to return at one time (max 100)
          required: false
          schema:
            type: integer
            maximum: 100
            format: int32
      responses:
        '200':
          description: A paged array of pets
          headers:
            x-next:
              description: A link to the next page of responses
              schema:
                type: string
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Pets"
        default:
          description: unexpected error
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Error"
    post:
      summary: Create a pet
      operationId: createPets
      tags:
        - pets
      responses:
        '201':
          description: Null response
        default:
          description: unexpected error
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Error"
  /pets/{petId}:
    get:
      summary: Info for a specific pet
      operationId: showPetById
      tags:
        - pets
      parameters:
        - name: petId
          in: path
          required: true
          description: The id of the pet to retrieve
          schema:
            type: string
      responses:
        '200':
          description: Expected response to a valid request
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Pet"
        default:
          description: unexpected error
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Error"
components:
  schemas:
    Pet:
      type: object
      required:
        - id
        - name
      properties:
        id:
          type: integer
          format: int64
        name:
          type: string
        tag:
          type: string
    Pets:
      type: array
      maxItems: 100
      items:
        $ref: "#/components/schemas/Pet"
    Error:
      type: object
      required:
        - code
        - message
      properties:
        code:
          type: integer
          format: int32
        message:
          type: string

3. Blank template (JSON)

{
  "openapi": "3.1.0",
  "info": {
    "title": "Untitled",
    "description": "Your OpenAPI specification",
    "version": "v1.0.0"
  },
  "servers": [
    {
      "url": ""
    }
  ],
  "paths": {},
  "components": {
    "schemas": {}
  }
}

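IV. Cloudflare Workers

A function-calling tool can be deployed to Cloudflare Workers and described with an OpenAPI schema. Cloudflare Workers is a Cloudflare service that runs JavaScript functions on its edge network. Put simply, the repository below is a Cloudflare Worker for creating tools for Dify applications.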

# Clone the code
git clone https://github.com/crazywoola/dify-tools-worker
cd dify-tools-worker

# Development mode
cp .wrangler.toml.example .wrangler.toml
npm install
npm run dev
# You will get a url like this: http://localhost:8787

# Deployment mode
npm run deploy
# You will get a url like this: https://difytoolsworker.yourname.workers.dev

Enter the URL and import the tool from it, as shown below: