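Preface
ChatGPT's obvious limitations today are that it cannot acquire new knowledge and has no way to interact with the outside world. Plugins are meant to solve exactly this problem.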
chatgpt-retrieval-plugin
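Basic information about the plugin
This is a recently open-sourced plugin, and it contains a few key points.
The first is the definition of a plugin.
I won't go through it sentence by sentence. The gist is that a plugin extends ChatGPT so that it can use external data and call external services.
It then lists the three parts a plugin needs:
- An API
- A definition of that API
- A JSON file that describes the plugin
The core logic between a plugin and ChatGPT
This diagram shows the core interaction pattern between a plugin and ChatGPT.
The key point to explain is how ChatGPT interacts with a tool.
This is where the three elements required of a plugin come in: the API, the API documentation, and the plugin description.
We focus on the plugin description and the API documentation.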
ai-plugin.json
{
  "schema_version": "v1",
  "name_for_model": "retrieval",
  "name_for_human": "Retrieval Plugin",
  "description_for_model": "Plugin for searching through the user's documents (such as files, emails, and more) to find answers to questions and retrieve relevant information. Use it whenever a user asks something that might be found in their personal information.",
  "description_for_human": "Search through your documents.",
  "auth": {
    "type": "user_http",
    "authorization_type": "bearer"
  },
  "api": {
    "type": "openapi",
    "url": "https://your-app-url.com/.well-known/openapi.yaml",
    "has_user_authentication": false
  },
  "logo_url": "https://your-app-url.com/.well-known/logo.png",
  "contact_email": "hello@contact.com",
  "legal_info_url": "hello@legal.com"
}
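To make the split between model-facing and human-facing fields concrete, here is a minimal Python sketch that parses a trimmed copy of the manifest above. The helper name split_manifest is hypothetical and only for illustration; it is not part of the plugin repository.

```python
import json

# Trimmed copy of the manifest above; only the fields used below are kept.
MANIFEST_JSON = """
{
  "schema_version": "v1",
  "name_for_model": "retrieval",
  "name_for_human": "Retrieval Plugin",
  "description_for_model": "Plugin for searching through the user's documents (such as files, emails, and more) to find answers to questions and retrieve relevant information. Use it whenever a user asks something that might be found in their personal information.",
  "description_for_human": "Search through your documents.",
  "api": {"type": "openapi", "url": "https://your-app-url.com/.well-known/openapi.yaml"}
}
"""


def split_manifest(manifest: dict) -> tuple:
    """Separate the fields aimed at the model from the fields aimed at people."""
    for_model = {k: v for k, v in manifest.items() if k.endswith("_for_model")}
    for_human = {k: v for k, v in manifest.items() if k.endswith("_for_human")}
    return for_model, for_human


manifest = json.loads(MANIFEST_JSON)
for_model, for_human = split_manifest(manifest)

print(for_model["name_for_model"])   # the name the model sees
print(for_human["name_for_human"])   # the name shown to users
print(manifest["api"]["url"])        # where the API definition lives
```

The point of the split is that name_for_model and description_for_model are written for the model to read, while the *_for_human fields are for display.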
openapi.yaml
openapi: 3.0.2
info:
  title: Retrieval Plugin API
  description: A retrieval API for querying and filtering documents based on natural language queries and metadata
  version: 1.0.0
servers:
  - url: https://your-app-url.com
paths:
  /query:
    post:
      summary: Query
      description: Accepts search query objects array each with query and optional filter. Break down complex questions into sub-questions. Refine results by criteria, e.g. time / source, don't do this often. Split queries if ResponseTooLargeError occurs.
      operationId: query_query_post
      requestBody:
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/QueryRequest"
        required: true
      responses:
        "200":
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/QueryResponse"
        "422":
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/HTTPValidationError"
      security:
        - HTTPBearer: []
components:
  schemas:
    DocumentChunkMetadata:
      title: DocumentChunkMetadata
      type: object
      properties:
        source:
          $ref: "#/components/schemas/Source"
        source_id:
          title: Source Id
          type: string
        url:
          title: Url
          type: string
        created_at:
          title: Created At
          type: string
        author:
          title: Author
          type: string
        document_id:
          title: Document Id
          type: string
    DocumentChunkWithScore:
      title: DocumentChunkWithScore
      required:
        - text
        - metadata
        - score
      type: object
      properties:
        id:
          title: Id
          type: string
        text:
          title: Text
          type: string
        metadata:
          $ref: "#/components/schemas/DocumentChunkMetadata"
        embedding:
          title: Embedding
          type: array
          items:
            type: number
        score:
          title: Score
          type: number
    DocumentMetadataFilter:
      title: DocumentMetadataFilter
      type: object
      properties:
        document_id:
          title: Document Id
          type: string
        source:
          $ref: "#/components/schemas/Source"
        source_id:
          title: Source Id
          type: string
        author:
          title: Author
          type: string
        start_date:
          title: Start Date
          type: string
        end_date:
          title: End Date
          type: string
    HTTPValidationError:
      title: HTTPValidationError
      type: object
      properties:
        detail:
          title: Detail
          type: array
          items:
            $ref: "#/components/schemas/ValidationError"
    Query:
      title: Query
      required:
        - query
      type: object
      properties:
        query:
          title: Query
          type: string
        filter:
          $ref: "#/components/schemas/DocumentMetadataFilter"
        top_k:
          title: Top K
          type: integer
          default: 3
    QueryRequest:
      title: QueryRequest
      required:
        - queries
      type: object
      properties:
        queries:
          title: Queries
          type: array
          items:
            $ref: "#/components/schemas/Query"
    QueryResponse:
      title: QueryResponse
      required:
        - results
      type: object
      properties:
        results:
          title: Results
          type: array
          items:
            $ref: "#/components/schemas/QueryResult"
    QueryResult:
      title: QueryResult
      required:
        - query
        - results
      type: object
      properties:
        query:
          title: Query
          type: string
        results:
          title: Results
          type: array
          items:
            $ref: "#/components/schemas/DocumentChunkWithScore"
    Source:
      title: Source
      enum:
        - email
        - file
        - chat
      type: string
      description: An enumeration.
    ValidationError:
      title: ValidationError
      required:
        - loc
        - msg
        - type
      type: object
      properties:
        loc:
          title: Location
          type: array
          items:
            anyOf:
              - type: string
              - type: integer
        msg:
          title: Message
          type: string
        type:
          title: Error Type
          type: string
  securitySchemes:
    HTTPBearer:
      type: http
      scheme: bearer
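As a rough illustration of what the QueryRequest schema asks for, the sketch below builds a request body in Python. The helper names make_query and make_query_request and the sample questions are assumptions made for this example; the field names, the Source enum values and the top_k default of 3 are taken from the spec above.

```python
import json

# Values copied from the spec above: the Source enum and the filter properties.
ALLOWED_SOURCES = {"email", "file", "chat"}
FILTER_FIELDS = {"document_id", "source", "source_id", "author", "start_date", "end_date"}


def make_query(query: str, top_k: int = 3, **filter_fields: str) -> dict:
    """Build one Query object; only `query` is required by the schema."""
    unknown = set(filter_fields) - FILTER_FIELDS
    if unknown:
        raise ValueError(f"unknown filter fields: {sorted(unknown)}")
    source = filter_fields.get("source")
    if source is not None and source not in ALLOWED_SOURCES:
        raise ValueError(f"source must be one of {sorted(ALLOWED_SOURCES)}")
    item = {"query": query, "top_k": top_k}
    if filter_fields:
        item["filter"] = dict(filter_fields)
    return item


def make_query_request(*queries: dict) -> str:
    """Wrap Query objects in a QueryRequest and serialise it to a JSON body."""
    return json.dumps({"queries": list(queries)})


# Hypothetical questions, one with a metadata filter and one without.
body = make_query_request(
    make_query("What did Alice say about the budget?", source="email", author="Alice"),
    make_query("meeting notes", top_k=5),
)
print(body)
```

Sending several Query objects in one request matches the hint in the endpoint description to break complex questions into sub-questions.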
Both files are fairly long, so I'll pull out the key parts.
In the plugin description:
"name_for_model": "retrieval",
"description_for_model": "Plugin for searching through the user's documents (such as files, emails, and more) to find answers to questions and retrieve relevant information. Use it whenever a user asks something that might be found in their personal information.",
In the API definition:
/query:
  post:
    summary: Query
    description: Accepts search query objects array each with query and optional filter. Break down complex questions into sub-questions. Refine results by criteria, e.g. time / source, don't do this often. Split queries if ResponseTooLargeError occurs.
    operationId: query_query_post
These two passages are what ChatGPT uses to decide whether to call the endpoint. In other words, both of them are passed into the prompt.
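How ChatGPT actually lays out its prompt is not public, so the following Python sketch is only a guess at the general idea: the model-facing description from the manifest and the endpoint description from the API definition are pasted into the text the model reads. The function build_tool_prompt, the prompt wording and the sample user message are all hypothetical.

```python
def build_tool_prompt(manifest: dict, operations: list, user_message: str) -> str:
    """Assemble a prompt that exposes one plugin to the model (hypothetical layout)."""
    lines = [
        "You can call the following tool when it helps answer the user.",
        f"Tool name: {manifest['name_for_model']}",
        f"When to use it: {manifest['description_for_model']}",
        "Operations:",
    ]
    for op in operations:
        lines.append(f"- {op['method']} {op['path']}: {op['description']}")
    lines.append("")
    lines.append(f"User: {user_message}")
    return "\n".join(lines)


# The two excerpts highlighted above, copied from the manifest and the API definition.
manifest = {
    "name_for_model": "retrieval",
    "description_for_model": (
        "Plugin for searching through the user's documents (such as files, emails, "
        "and more) to find answers to questions and retrieve relevant information. "
        "Use it whenever a user asks something that might be found in their "
        "personal information."
    ),
}
operations = [
    {
        "method": "POST",
        "path": "/query",
        "description": (
            "Accepts search query objects array each with query and optional filter. "
            "Break down complex questions into sub-questions."
        ),
    }
]

prompt = build_tool_prompt(manifest, operations, "When is my flight next week?")
print(prompt)
```

With this text in context, deciding whether to call the plugin becomes an ordinary language task: the model compares the user message against "When to use it".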
As for how to actually make the call, that knowledge of the interface comes from openapi.yaml, and for GPT this is really easy.
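To show which details of the call come from openapi.yaml, here is a minimal Python sketch that builds (but does not send) the HTTP request. The server URL, the /query path, the POST method and the bearer scheme are read off the spec above; the function name build_query_call, the placeholder token and the sample query are assumptions for illustration.

```python
import json
import urllib.request


def build_query_call(server_url: str, bearer_token: str, queries: list) -> urllib.request.Request:
    """Build the POST /query request described by the spec, without sending it."""
    body = json.dumps({"queries": queries}).encode("utf-8")
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/query",   # servers[0].url + path
        data=body,                               # requestBody: QueryRequest as JSON
        method="POST",                           # paths./query.post
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {bearer_token}",  # securitySchemes.HTTPBearer
        },
    )


request = build_query_call(
    "https://your-app-url.com",
    "PLACEHOLDER_TOKEN",  # hypothetical; a real deployment supplies its own token
    [{"query": "meeting notes", "top_k": 3}],
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would perform the call. That step is left out here because https://your-app-url.com is only the placeholder host from the spec.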
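Summary
This logic is essentially the same as LangChain's agent module. For now there is only one core difference: the plugin appears to work only with ChatGPT, which means we have to log in to the ChatGPT website and use the plugin through conversation. LangChain is different. It is a tool, and with it you can build a website or tool that is independent of ChatGPT.
If I were LangChain, I would start building an Agent or tool that connects to plugins right away.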