Ollama API Documentation (Chinese Edition)

API

Endpoints

  • Generate a completion
  • Generate a chat completion
  • Create a Model
  • List Local Models
  • Show Model Information
  • Copy a Model
  • Delete a Model
  • Pull a Model
  • Push a Model
  • Generate Embeddings
  • List Running Models

Conventions

Model names

Model names follow a model:tag format, where model can have an optional namespace such as example/model. Some examples are orca-mini:3b-q4_1 and llama3:70b. The tag identifies a specific version; it is optional and defaults to latest when omitted.
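
The naming convention can be illustrated with a small parser. This is a minimal sketch, not part of the Ollama API: the function name parse_model_name is hypothetical, and it assumes the name contains no registry host with a port (which would add another colon).

```python
def parse_model_name(name: str) -> dict:
    """Split 'namespace/model:tag' into parts; the tag defaults to 'latest'."""
    # Split off the tag at the last colon, if there is one.
    base, sep, tag = name.rpartition(":")
    if not sep:
        base, tag = name, "latest"
    # An optional namespace precedes the model, separated by a slash.
    namespace, slash, model = base.rpartition("/")
    return {"namespace": namespace if slash else None, "model": model, "tag": tag}


print(parse_model_name("orca-mini:3b-q4_1"))
print(parse_model_name("example/model"))
```

The second call shows the default: no tag is given, so the parsed tag is latest.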

Durations

All durations are returned in nanoseconds.

Streaming responses

Certain endpoints stream responses as a series of JSON objects. Streaming can be disabled by providing {"stream": false} for these endpoints.
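
A client typically consumes such a stream one object at a time. The sketch below assumes each JSON object arrives on its own line (newline-delimited JSON); the helper name iter_stream_objects and the sample lines are illustrative, shaped like the streamed objects shown in this document.

```python
import json
from typing import Iterable, Iterator


def iter_stream_objects(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per non-empty line of a streamed response."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Two sample lines standing in for a real response body.
sample = [
    '{"model": "llama3.2", "response": "The", "done": false}',
    '{"model": "llama3.2", "response": "", "done": true}',
]
chunks = list(iter_stream_objects(sample))
print(chunks[-1]["done"])
```

In a real client the lines would come from the HTTP response body rather than a list; the loop itself stays the same.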

Generate a completion

POST /api/generate

Generate a response for a given prompt with a provided model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.

Parameters

  • model: (required) the model name
  • prompt: the prompt to generate a response for
  • suffix: the text after the model response
  • images: (optional) a list of base64-encoded images (for multimodal models such as llava)

Advanced parameters (optional):

  • format: the format to return a response in. Currently the only accepted value is json
  • options: additional model parameters listed in the documentation for the Modelfile, such as temperature
  • system: the system message to use (overrides what is defined in the Modelfile)
  • template: the prompt template to use (overrides what is defined in the Modelfile)
  • context: the context parameter returned from a previous request to /api/generate; this can be used to keep a short conversational memory
  • stream: if false, the response will be returned as a single response object rather than a stream of objects
  • raw: if true, no formatting will be applied to the prompt. You may choose to use the raw parameter if you are specifying a full templated prompt in your request to the API
  • keep_alive: controls how long the model will stay loaded in memory following the request (default: 5m)

JSON mode

Enable JSON mode by setting the format parameter to json. This will structure the response as a valid JSON object. See the JSON mode example below.

Important

It's important to instruct the model to use JSON in the prompt. Otherwise, the model may generate large amounts of whitespace.
Examples
Generate request (Streaming)
Request
curl http://localhost:11434/api/generate -d '{"model": "llama3.2","prompt": "Why is the sky blue?"
}'
Response

A stream of JSON objects is returned:

{"model": "llama3.2","created_at": "2023-08-04T08:52:19.385406455-07:00","response": "The","done": false
}

The final response in the stream also includes additional data about the generation:

  • total_duration: total time in nanoseconds spent generating the response, including model loading and prompt evaluation
  • load_duration: time spent in nanoseconds loading the model
  • prompt_eval_count: number of tokens in the prompt
  • prompt_eval_duration: time spent in nanoseconds evaluating the prompt
  • eval_count: number of tokens in the response
  • eval_duration: time in nanoseconds spent generating the response
  • context: an encoding of the conversation used in this response; this can be sent in the next request to keep a conversational memory
  • response: empty if the response was streamed; if not streamed, this will contain the full response

To calculate how fast the response is generated in tokens per second (token/s), divide eval_count by eval_duration and multiply by 10^9.

{"model": "llama3.2","created_at": "2023-08-04T19:22:45.499127Z","response": "","done": true,"context": [1, 2, 3],"total_duration": 10706818083,"load_duration": 6338219291,"prompt_eval_count": 26,"prompt_eval_duration": 130079000,"eval_count": 259,"eval_duration": 4232710000
}
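
The tokens-per-second rule can be checked against the final response shown above. This is a minimal sketch; the function name tokens_per_second is illustrative, and only the two fields it reads are taken from the example.

```python
def tokens_per_second(final: dict) -> float:
    """Divide eval_count by eval_duration (nanoseconds) and scale to seconds."""
    return final["eval_count"] / final["eval_duration"] * 1e9


# Values copied from the final streamed object in the example above.
final = {"eval_count": 259, "eval_duration": 4232710000}
rate = tokens_per_second(final)
print(f"{rate:.1f} token/s")
```

For these values the rate comes out at roughly 61 tokens per second.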
Request (No streaming)
Request

A response can be received in one reply when streaming is off.

curl http://localhost:11434/api/generate -d '{"model": "llama3.2","prompt": "Why is the sky blue?","stream": false
}'
Response

If stream is set to false, the response will be a single JSON object:

{"model": "llama3.2","created_at": "2023-08-04T19:22:45.499127Z","response": "The sky is blue because it is the color of the sky.","done": true,"context": [1, 2, 3],"total_duration": 5043500667,"load_duration": 5025959,"prompt_eval_count": 26,"prompt_eval_duration": 325953000,"eval_count": 290,"eval_duration": 4709213000
}
Request (with suffix)
Request
curl http://localhost:11434/api/generate -d '{"model": "codellama:code","prompt": "def compute_gcd(a, b):","suffix": "    return result","options": {"temperature": 0},"stream": false
}'
Response
{"model": "codellama:code","created_at": "2024-07-22T20:47:51.147561Z","response": "\n  if a == 0:\n    return b\n  else:\n    return compute_gcd(b % a, a)\n\ndef compute_lcm(a, b):\n  result = (a * b) / compute_gcd(a, b)\n","done": true,"done_reason": "stop","context": [...],"total_duration": 1162761250,"load_duration": 6683708,"prompt_eval_count": 17,"prompt_eval_duration": 201222000,"eval_count": 63,"eval_duration": 953997000
}
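
Because the suffix is the text that follows the model response, a client can rebuild the complete text by joining prompt, response, and suffix. The sketch below assumes exactly that ordering; assemble_completion is a hypothetical helper, and the response string is a shortened stand-in rather than real model output.

```python
def assemble_completion(prompt: str, response: str, suffix: str) -> str:
    """Place the generated text between the prompt and the suffix."""
    return prompt + response + suffix


prompt = "def compute_gcd(a, b):"
suffix = "    return result"
# Hypothetical stand-in for the "response" field of a reply.
response = "\n    result = a if b == 0 else compute_gcd(b, a % b)\n"
full = assemble_completion(prompt, response, suffix)
print(full)
```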
Request (JSON mode)

Important

When format is set to json, the output will always be a well-formed JSON object. It's important to also instruct the model to respond in JSON.

Request
curl http://localhost:11434/api/generate -d '{"model": "llama3.2","prompt": "What color is the sky at different times of the day? Respond using JSON","format": "json","stream": false
}'
Response
{"model": "llama3.2","created_at": "2023-11-09T21:07:55.186497Z","response": "{\n\"morning\": {\n\"color\": \"blue\"\n},\n\"noon\": {\n\"color\": \"blue-gray\"\n},\n\"afternoon\": {\n\"color\": \"warm gray\"\n},\n\"evening\": {\n\"color\": \"orange\"\n}\n}\n","done": true,"context": [1, 2, 3],"total_duration": 4648158584,"load_duration": 4071084,"prompt_eval_count": 36,"prompt_eval_duration": 439038000,"eval_count": 180,"eval_duration": 4196918000
}

The value of response will be a string containing JSON similar to:

{"morning": {"color": "blue"},"noon": {"color": "blue-gray"},"afternoon": {"color": "warm gray"},"evening": {"color": "orange"}
}
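
Since the response field is a string that itself contains JSON, a client has to decode twice: once for the API envelope and once for the model output. A minimal sketch, using an abbreviated body modeled on the example above:

```python
import json

# Abbreviated response body: "response" is itself a JSON-encoded string.
body = '{"model": "llama3.2", "response": "{\\"morning\\": {\\"color\\": \\"blue\\"}}", "done": true}'

outer = json.loads(body)               # first pass: the API envelope
inner = json.loads(outer["response"])  # second pass: the model's JSON output
print(inner["morning"]["color"])
```

Wrapping the second json.loads call in error handling may be worthwhile in practice, in case the decoded string is not the structure the caller expects.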
Request (with images)

To submit images to multimodal models such as llava or bakllava, provide a list of base64-encoded images:

Request
curl http://localhost:11434/api/generate -d '{"model": "llava","prompt":"What is in this picture?","stream": false,"images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUj
VUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM
3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
}'
Response
{"model": "llava","created_at": "2023-11-03T15:36:02.583064Z","response": "A happy cartoon character, which is cute and cheerful.","done": true,"context": [1, 2, 3],"total_duration": 2938432250,"load_duration": 2559292,"prompt_eval_count": 1,"prompt_eval_duration": 2195557000,"eval_count": 44,"eval_duration": 736432000
}
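
The entries in the images list are produced by base64-encoding the raw image bytes. The sketch below shows one way to build such a request body; encode_image is a hypothetical helper, and the eight-byte PNG signature stands in for a real image file.

```python
import base64


def encode_image(data: bytes) -> str:
    """Return image bytes as a base64 string suitable for the images list."""
    return base64.b64encode(data).decode("ascii")


# The PNG file signature is used here in place of real image data.
png_signature = b"\x89PNG\r\n\x1a\n"
payload = {
    "model": "llava",
    "prompt": "What is in this picture?",
    "stream": False,
    "images": [encode_image(png_signature)],
}
print(payload["images"][0])
```

The encoded signature begins with iVBORw0KGgo, the same prefix as the image string in the request above, which is a quick way to recognize base64-encoded PNG data.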
Request (Raw Mode)

In some cases, you may wish to bypass the templating system and provide a full prompt. In this case, you can use the raw parameter to disable templating. Also note that raw mode will not return a context.

Request
curl http://localhost:11434/api/generate -d '{"model": "mistral","prompt": "[INST] why is the sky blue? [/INST]","raw": true,"stream": false
}'
Request (Reproducible outputs)

For reproducible outputs, set seed to a number:

Request
curl http://localhost:11434/api/generate -d '{"model": "mistral","prompt": "Why is the sky blue?","options": {"seed": 123}
}'
Response
{"model": "mistral","created_at": "2023-11-03T15:36:02.583064Z","response": " The sky appears blue because of a phenomenon called Rayleigh scattering.","done": true,"total_duration": 8493852375,"load_duration": 6589624375,"prompt_eval_count": 14,"prompt_eval_duration": 119039000,"eval_count": 110,"eval_duration": 1779061000
}
Generate request (With options)

If you want to set custom options for the model at runtime rather than in the Modelfile, you can do so with the options parameter. This example sets every available option, but you can set any of them individually and omit the ones you do not want to override.

Request
curl http://localhost:11434/api/generate -d '{"model": "llama3.2","prompt": "Why is the sky blue?","stream": false,"options": {"num_keep": 5,"seed": 42,"num_predict": 100,"top_k": 20,"top_p": 0.9,"min_p": 0.0,"tfs_z": 0.5,"typical_p": 0.7,"repeat_last_n": 33,"temperature": 0.8,"repeat_penalty": 1.2,"presence_penalty": 1.5,"frequency_penalty": 1.0,"mirostat": 1,"mirostat_tau": 0.8,"mirostat_eta": 0.6,"penalize_newline": true,"stop": ["\n", "user:"],"numa": false,"num_ctx": 1024,"num_batch": 2,"num_gpu": 1,"main_gpu": 0,"low_vram": false,"f16_kv": true,"vocab_only": false,"use_mmap": true,"use_mlock": false,"num_thread": 8}
}'
Response
{"model": "llama3.2","created_at": "2023-08-04T19:22:45.499127Z","response": "The sky is blue because it is the color of the sky.","done": true,"context": [1, 2, 3],"total_duration": 4935886791,"load_duration": 534986708,"prompt_eval_count": 26,"prompt_eval_duration": 107345000,"eval_count": 237,"eval_duration": 4289432000
}
Load a model

If an empty prompt is provided (or the prompt is omitted, as in this example), the model will be loaded into memory.

Request
curl http://localhost:11434/api/generate -d '{"model": "llama3.2"
}'
Response

A single JSON object is returned:

{"model": "llama3.2","created_at": "2023-12-18T19:52:07.071755Z","response": "","done": true
}
Unload a model

If an empty prompt is provided and the keep_alive parameter is set to 0, the model will be unloaded from memory.

Request
curl http://localhost:11434/api/generate -d '{"model": "llama3.2","keep_alive": 0
}'
Response

A single JSON object is returned:

{"model": "llama3.2","created_at": "2024-09-12T03:54:03.516566Z","response": "","done": true,"done_reason": "unload"
}

Generate a chat completion

POST /api/chat

Generate the next message in a chat with a provided model. This is a streaming endpoint, so there will be a series of responses. Streaming can be disabled using "stream": false. The final response object will include statistics and additional data from the request.

Parameters

  • model: (required) the model name
  • messages: the messages of the chat; this can be used to keep a chat memory
  • tools: tools for the model to use if supported. Requires stream to be set to false

The message object has the following fields:

  • role: the role of the message, either system, user, assistant, or tool
  • content: the content of the message
  • images (optional): a list of images to include in the message (for multimodal models such as llava)
  • tool_calls (optional): a list of tools the model wants to use

Advanced parameters (optional):

  • format: the format to return a response in. Currently the only accepted value is json
  • options: additional model parameters listed in the documentation for the Modelfile, such as temperature
  • stream: if false, the response will be returned as a single response object rather than a stream of objects
  • keep_alive: controls how long the model will stay loaded in memory following the request (default: 5m)

Examples
Chat Request (Streaming)
Request

Send a chat message with a streaming response.
curl http://localhost:11434/api/chat -d '{"model": "llama3.2","messages": [{"role": "user","content": "why is the sky blue?"}]
}'
Response

A stream of JSON objects is returned:

{"model": "llama3.2","created_at": "2023-08-04T08:52:19.385406455-07:00","message": {"role": "assistant","content": "The","images": null},"done": false
}

Final response:

{"model": "llama3.2","created_at": "2023-08-04T19:22:45.499127Z","done": true,"total_duration": 4883583458,"load_duration": 1334875,"prompt_eval_count": 26,"prompt_eval_duration": 342546000,"eval_count": 282,"eval_duration": 4535599000
}
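
To reconstruct the full assistant reply from a streamed chat, a client can concatenate the content fragments and keep the final object for its statistics. A minimal sketch, assuming one JSON object per line; collect_chat_stream is a hypothetical helper and the sample lines are abbreviated versions of the objects above.

```python
import json


def collect_chat_stream(lines):
    """Join streamed message fragments and return (text, final_object)."""
    parts, final = [], None
    for line in lines:
        if not line.strip():
            continue
        obj = json.loads(line)
        # Intermediate objects carry a message fragment; the last has done=true.
        parts.append(obj.get("message", {}).get("content", ""))
        if obj.get("done"):
            final = obj
    return "".join(parts), final


sample = [
    '{"message": {"role": "assistant", "content": "The"}, "done": false}',
    '{"message": {"role": "assistant", "content": " sky"}, "done": false}',
    '{"done": true, "eval_count": 282}',
]
text, final = collect_chat_stream(sample)
print(text, final["eval_count"])
```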
Chat request (No streaming)
Request
curl http://localhost:11434/api/chat -d '{"model": "llama3.2","messages": [{"role": "user","content": "why is the sky blue?"}],"stream": false
}'
Response
{"model": "llama3.2","created_at": "2023-12-12T14:13:43.416799Z","message": {"role": "assistant","content": "Hello! How are you today?"},"done": true,"total_duration": 5191566416,"load_duration": 2154458,"prompt_eval_count": 26,"prompt_eval_duration": 383809000,"eval_count": 298,"eval_duration": 4799921000
}
Chat request (With History)

Send a chat message with a conversation history. You can use this same approach to start the conversation using multi-shot or chain-of-thought prompting.

Request
curl http://localhost:11434/api/chat -d '{"model": "llama3.2","messages": [{"role": "user","content": "why is the sky blue?"},{"role": "assistant","content": "due to rayleigh scattering."},{"role": "user","content": "how is that different than mie scattering?"}]
}'
Response

A stream of JSON objects is returned:

{"model": "llama3.2","created_at": "2023-08-04T08:52:19.385406455-07:00","message": {"role": "assistant","content": "The"},"done": false
}

Final response:

{"model": "llama3.2","created_at": "2023-08-04T19:22:45.499127Z","done": true,"total_duration": 8113331500,"load_duration": 6396458,"prompt_eval_count": 61,"prompt_eval_duration": 398801000,"eval_count": 468,"eval_duration": 7701267000
}
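
Keeping a chat memory means resending the earlier messages together with the new one. The sketch below builds the messages list used in the request above; append_turn is a hypothetical helper, not part of the API.

```python
def append_turn(messages: list, assistant_reply: dict, next_user_text: str) -> list:
    """Return a new history with the assistant reply and next user message added."""
    return messages + [assistant_reply, {"role": "user", "content": next_user_text}]


history = [{"role": "user", "content": "why is the sky blue?"}]
reply = {"role": "assistant", "content": "due to rayleigh scattering."}
history = append_turn(history, reply, "how is that different than mie scattering?")

# The request body carries the whole history, not just the latest message.
request_body = {"model": "llama3.2", "messages": history}
print([m["role"] for m in history])
```

Returning a new list rather than mutating the old one keeps earlier snapshots of the conversation intact, which can help when retrying a request.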
Chat request (with images)
Request

Send a chat message with images. The images should be provided as an array, with the individual images encoded in Base64.

curl http://localhost:11434/api/chat -d '{"model": "llava","messages": [{"role": "user","content": "what is in this image?","images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deM
XcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2
RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]}]
}'
Response
{"model": "llava","created_at": "2023-12-13T22:42:50.203334Z","message": {"role": "assistant","content": " The image features a cute, little pig with an angry facial expression. It's wearing a heart on its shirt and is waving in the air. This scene appears to be part of a drawing or sketching project.","images": null},"done": true,"total_duration": 1668506709,"load_duration": 1986209,"prompt_eval_count": 26,"prompt_eval_duration": 359682000,"eval_count": 83,"eval_duration": 1303285000
}
Chat request (Reproducible outputs)
Request
curl http://localhost:11434/api/chat -d '{"model": "llama3.2","messages": [{"role": "user","content": "Hello!"}],"options": {"seed": 101,"temperature": 0}
}'
Response
{"model": "llama3.2","created_at": "2023-12-12T14:13:43.416799Z","message": {"role": "assistant","content": "Hello! How are you today?"},"done": true,"total_duration": 5191566416,"load_duration": 2154458,"prompt_eval_count": 26,"prompt_eval_duration": 383809000,"eval_count": 298,"eval_duration": 4799921000
}
Chat request (with tools)
Request
curl http://localhost:11434/api/chat -d '{"model": "llama3.2","messages": [{"role": "user","content": "What is the weather today in Paris?"}],"stream": false,"tools": [{"type": "function","function": {"name": "get_current_weather","description": "Get the current weather for a location","parameters": {"type": "object","properties": {"location": {"type": "string","description": "The location to get the weather for, e.g. San Francisco, CA"},"format": {"type": "string","description": "The format to return the weather in, e.g. celsius or fahrenheit","enum": ["celsius", "fahrenheit"]}},"required": ["location", "format"]}}}]
}'
Response
{"model": "llama3.2","created_at": "2024-07-22T20:33:28.123648Z","message": {"role": "assistant","content": "","tool_calls": [{"function": {"name": "get_current_weather","arguments": {"format": "celsius","location": "Paris, FR"}}}]},"done_reason": "stop","done": true,"total_duration": 885095291,"load_duration": 3753500,"prompt_eval_count": 122,"prompt_eval_duration": 328493000,"eval_count": 33,"eval_duration": 552222000
}
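
The model only names the tool it wants to use; running it is up to the client. The sketch below dispatches the tool_calls from the response above to a local function. Everything here is illustrative: get_current_weather is a hypothetical local implementation, and sending the results back as messages with the tool role in a follow-up request is an assumption based on the role list for the message object.

```python
def get_current_weather(location: str, format: str) -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"Weather for {location} in {format} is not available in this sketch"


# Map tool names from the request's tools list to local callables.
TOOLS = {"get_current_weather": get_current_weather}


def run_tool_calls(message: dict) -> list:
    """Call each requested tool and return the results as 'tool' role messages."""
    results = []
    for call in message.get("tool_calls", []):
        fn = call["function"]
        output = TOOLS[fn["name"]](**fn["arguments"])
        results.append({"role": "tool", "content": output})
    return results


message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {"function": {"name": "get_current_weather",
                      "arguments": {"format": "celsius", "location": "Paris, FR"}}}
    ],
}
tool_messages = run_tool_calls(message)
print(tool_messages[0]["content"])
```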
Load a model

If the messages array is empty, the model will be loaded into memory.

Request
curl http://localhost:11434/api/chat -d '{"model": "llama3.2","messages": []
}'
Response
{"model": "llama3.2","created_at":"2024-09-12T21:17:29.110811Z","message": {"role": "assistant","content": ""},"done_reason": "load","done": true
}
Unload a model

If the messages array is empty and the keep_alive parameter is set to 0, the model will be unloaded from memory.

Request
curl http://localhost:11434/api/chat -d '{"model": "llama3.2","messages": [],"keep_alive": 0
}'
Response

A single JSON object is returned:

{"model": "llama3.2","created_at":"2024-09-12T21:33:17.547535Z","message": {"role": "assistant","content": ""},"done_reason": "unload","done": true
}

Create a Model

POST /api/create

Create a model from a Modelfile. It is recommended to set modelfile to the content of the Modelfile rather than only setting path; this is a requirement for remote creation. For remote creation, any file blobs referenced by fields such as FROM and ADAPTER must also be created explicitly on the server using Create a Blob, and the field's value must be set to the path indicated in the response.

Parameters

  • name: name of the model to create
  • modelfile (optional): contents of the Modelfile
  • stream (optional): if false, the response will be returned as a single response object rather than a stream of objects
  • path (optional): path to the Modelfile

Examples
Create a new model

Create a new model from a Modelfile.

Request
curl http://localhost:11434/api/create -d '{"name": "mario","modelfile": "FROM llama3\nSYSTEM You are mario from Super Mario Bros."
}'
Response

A stream of JSON objects is returned. Notice that the final JSON object shows "status": "success".

{"status":"reading model metadata"}
{"status":"creating system layer"}
{"status":"using already created layer sha256:22f7f8ef5f4c791c1b03d7eb414399294764d7cc82c7e94aa81a1feb80a983a2"}
{"status":"using already created layer sha256:8c17c2ebb0ea011be9981cc3922db8ca8fa61e828c5d3f44cb6ae342bf80460b"}
{"status":"using already created layer sha256:7c23fb36d80141c4ab8cdbb61ee4790102ebd2bf7aeff414453177d4f2110e5d"}
{"status":"using already created layer sha256:2e0493f67d0c8c9c68a8aeacdf6a38a2151cb3c4c1d42accf296e19810527988"}
{"status":"using already created layer sha256:2759286baa875dc22de5394b4a925701b1896a7e3f8e53275c36f75a877a82c9"}
{"status":"writing layer sha256:df30045fe90f0d750db82a058109cecd6d4de9c90a3d75b19c09e5f64580bb42"}
{"status":"writing layer sha256:f18a68eb09bf925bb1b669490407c1b1251c5db98dc4d3d81f3088498ea55690"}
{"status":"writing manifest"}
{"status":"success"}
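The status stream above is newline-delimited JSON, so a client can parse it line by line and treat a final "success" as completion. The following is a minimal sketch under that assumption; the helper names build_create_payload and final_status are hypothetical, and the HTTP transport itself is left out.

```python
import json

def build_create_payload(name, modelfile, stream=True):
    """Serialize the JSON body for POST /api/create (hypothetical helper)."""
    return json.dumps({"name": name, "modelfile": modelfile, "stream": stream})

def final_status(lines):
    """Parse newline-delimited JSON status objects and return the last status seen."""
    status = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines, if any
        status = json.loads(line).get("status", status)
    return status

# Sample lines taken from the response shown above.
sample = [
    '{"status":"reading model metadata"}',
    '{"status":"writing manifest"}',
    '{"status":"success"}',
]
created = final_status(sample) == "success"
```

Checking only the last status keeps the sketch short; a real client would probably also surface intermediate statuses to the user.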

Check if a Blob Exists
检查 Blob 是否存在

HEAD /api/blobs/:digest

Ensures that the file blob used for a FROM or ADAPTER field exists on the server. This is checking your Ollama server and not Ollama.ai.
确保服务器上存在用于 FROM 或 ADAPTER 字段的文件 blob。这是检查您的 Ollama 服务器,而不是 Ollama.ai。

Query Parameters 查询参数
  • digest: the SHA256 digest of the blob
    digest:blob 的 SHA256 摘要
Examples 例子
Request 请求
curl -I http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2
Response 响应

Return 200 OK if the blob exists, 404 Not Found if it does not.
如果 blob 存在,则返回 200 OK,如果不存在,则返回 404 Not Found。

Create a Blob 创建 Blob

POST /api/blobs/:digest

Create a blob from a file on the server. Returns the server file path.
从服务器上的文件创建 Blob。返回服务器文件路径。

Query Parameters 查询参数
  • digest: the expected SHA256 digest of the file
    digest:文件的预期 SHA256 摘要
Examples 例子
Request 请求
curl -T model.bin -X POST http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2
Response 响应

Return 201 Created if the blob was successfully created, 400 Bad Request if the digest used is not expected.
如果 blob 创建成功,则返回 201 Created ,如果使用的摘要不是预期的,则返回 400 Bad Request。
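Both blob endpoints take the file's SHA256 digest in the URL, prefixed with sha256:. A small sketch of computing that digest locally before uploading; the function names are hypothetical, and the base URL is the default address used in the examples.

```python
import hashlib

def blob_digest_from_chunks(chunks):
    """Return 'sha256:<hex>' for an iterable of byte chunks."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
    return "sha256:" + h.hexdigest()

def blob_digest(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large model files need not fit in memory."""
    with open(path, "rb") as f:
        return blob_digest_from_chunks(iter(lambda: f.read(chunk_size), b""))

def blob_url(digest, base="http://localhost:11434"):
    """Build the URL used by HEAD and POST /api/blobs/:digest."""
    return f"{base}/api/blobs/{digest}"
```

For example, blob_url(blob_digest("model.bin")) would give the URL to pass to the curl commands above, assuming model.bin exists locally.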

List Local Models 列出本地模型

GET /api/tags

List models that are available locally.
列出本地可用的模型。

Examples 例子
Request 请求
curl http://localhost:11434/api/tags
Response 响应

A single JSON object will be returned.
将返回单个 JSON 对象。

{"models": [{"name": "codellama:13b","modified_at": "2023-11-04T14:56:49.277302595-07:00","size": 7365960935,"digest": "9f438cb9cd581fc025612d27f7c1a6669ff83a8bb0ed86c94fcf4c5440555697","details": {"format": "gguf","family": "llama","families": null,"parameter_size": "13B","quantization_level": "Q4_0"}},{"name": "llama3:latest","modified_at": "2023-12-07T09:32:18.757212583-08:00","size": 3825819519,"digest": "fe938a131f40e6f6d40083c9f0f430a515233eb2edaa6d72eb85c50d64f2300e","details": {"format": "gguf","family": "llama","families": null,"parameter_size": "7B","quantization_level": "Q4_0"}}]
}
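The size field in the response above is in bytes. A short sketch of turning the response body into a readable summary; summarize_models is a hypothetical helper, and the sample keeps only the two fields it reads.

```python
import json

def summarize_models(body):
    """Return (name, size in GiB rounded to 2 places) for each listed model."""
    data = json.loads(body)
    return [(m["name"], round(m["size"] / 2**30, 2)) for m in data.get("models", [])]

# Trimmed copy of the response shown above.
sample = (
    '{"models": [{"name": "codellama:13b", "size": 7365960935},'
    ' {"name": "llama3:latest", "size": 3825819519}]}'
)
for name, gib in summarize_models(sample):
    print(f"{name}: {gib} GiB")
```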

Show Model Information 显示模型信息

POST /api/show

Show information about a model, including details, modelfile, template, parameters, license, and system prompt.
显示有关模型的信息,包括详细信息、模型文件、模板、参数、许可证和系统提示。

Parameters 参数

  • name: name of the model to show
    name:要显示的模型的名称
  • verbose: (optional) if set to true, returns full data for verbose response fields
    verbose:(可选)如果设置为 true,则返回详细响应字段的完整数据
Examples 例子
Request 请求
curl http://localhost:11434/api/show -d '{"name": "llama3.2"
}'
Response 响应
{
  "modelfile": "# Modelfile generated by \"ollama show\"\n# To build a new Modelfile based on this one, replace the FROM line with:\n# FROM llava:latest\n\nFROM /Users/matt/.ollama/models/blobs/sha256:200765e1283640ffbd013184bf496e261032fa75b99498a9613be4e94d63ad52\nTEMPLATE \"\"\"{{ .System }}\nUSER: {{ .Prompt }}\nASSISTANT: \"\"\"\nPARAMETER num_ctx 4096\nPARAMETER stop \"\u003c/s\u003e\"\nPARAMETER stop \"USER:\"\nPARAMETER stop \"ASSISTANT:\"",
  "parameters": "num_keep                       24\nstop                           \"<|start_header_id|>\"\nstop                           \"<|end_header_id|>\"\nstop                           \"<|eot_id|>\"",
  "template": "{{ if .System }}<|start_header_id|>system<|end_header_id|>\n\n{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>\n\n{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ .Response }}<|eot_id|>",
  "details": {
    "parent_model": "",
    "format": "gguf",
    "family": "llama",
    "families": ["llama"],
    "parameter_size": "8.0B",
    "quantization_level": "Q4_0"
  },
  "model_info": {
    "general.architecture": "llama",
    "general.file_type": 2,
    "general.parameter_count": 8030261248,
    "general.quantization_version": 2,
    "llama.attention.head_count": 32,
    "llama.attention.head_count_kv": 8,
    "llama.attention.layer_norm_rms_epsilon": 0.00001,
    "llama.block_count": 32,
    "llama.context_length": 8192,
    "llama.embedding_length": 4096,
    "llama.feed_forward_length": 14336,
    "llama.rope.dimension_count": 128,
    "llama.rope.freq_base": 500000,
    "llama.vocab_size": 128256,
    "tokenizer.ggml.bos_token_id": 128000,
    "tokenizer.ggml.eos_token_id": 128009,
    "tokenizer.ggml.merges": [],            // populates if `verbose=true`
    "tokenizer.ggml.model": "gpt2",
    "tokenizer.ggml.pre": "llama-bpe",
    "tokenizer.ggml.token_type": [],        // populates if `verbose=true`
    "tokenizer.ggml.tokens": []             // populates if `verbose=true`
  }
}
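The parameters field in the response above is plain text rather than structured JSON: one "name value" pair per line, with names such as stop repeated. A minimal sketch of turning it into a dictionary, assuming that layout holds; parse_parameters is a hypothetical helper, not part of the API.

```python
from collections import defaultdict

def parse_parameters(text):
    """Map each parameter name to the list of values given for it."""
    params = defaultdict(list)
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            key, value = parts
            params[key].append(value.strip().strip('"'))
    return dict(params)

# Abbreviated form of the "parameters" string from the response.
sample = 'num_keep     24\nstop     "<|start_header_id|>"\nstop     "<|eot_id|>"'
parsed = parse_parameters(sample)
```

Values are kept as strings; converting numeric ones such as num_keep is left to the caller.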

Copy a Model 复制模型

POST /api/copy

Copy a model. Creates a model with another name from an existing model.
复制模型。从现有模型创建具有其他名称的模型。

Examples 例子
Request 请求
curl http://localhost:11434/api/copy -d '{"source": "llama3.2","destination": "llama3-backup"
}'
Response 响应

Returns a 200 OK if successful, or a 404 Not Found if the source model doesn't exist.
如果成功,则返回 200 OK,如果源模型不存在,则返回 404 Not Found。

Delete a Model 删除模型

DELETE /api/delete

Delete a model and its data.
删除模型及其数据。

Parameters 参数

  • name: model name to delete
    name:要删除的模型名称
Examples 例子
Request 请求
curl -X DELETE http://localhost:11434/api/delete -d '{"name": "llama3:13b"
}'
Response 响应

Returns a 200 OK if successful, 404 Not Found if the model to be deleted doesn't exist.
如果成功,则返回 200 OK,如果要删除的模型不存在,则返回 404 Not Found。
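Copy and delete report their outcome only through the HTTP status code. The sketch below builds the two requests with the Python standard library without sending them; build_request and describe_status are hypothetical helpers, and the base URL is the default address used in the examples.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434"  # default address used in the examples

def build_request(method, path, payload):
    """Create an unsent JSON request for the given endpoint."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )

def describe_status(code):
    """Translate the documented status codes into short messages."""
    return {200: "ok", 404: "model not found"}.get(code, f"unexpected status {code}")

copy_req = build_request(
    "POST", "/api/copy", {"source": "llama3.2", "destination": "llama3-backup"}
)
delete_req = build_request("DELETE", "/api/delete", {"name": "llama3:13b"})

# To send: urllib.request.urlopen(delete_req). A 404 raises
# urllib.error.HTTPError, whose .code attribute holds the status.
```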

Pull a Model 拉取模型

POST /api/pull

Download a model from the ollama library. Cancelled pulls are resumed from where they left off, and multiple calls will share the same download progress.
从 ollama 库下载模型。取消的拉取将从上次中断的位置继续,并且多个调用将共享相同的下载进度。

Parameters 参数

  • name: name of the model to pull
    name:要拉取的模型名称
  • insecure: (optional) allow insecure connections to the library. Only use this if you are pulling from your own library during development.
    insecure:(可选)允许与库建立不安全的连接。仅在开发过程中从自己的库中拉取时才使用此项。
  • stream: (optional) if false the response will be returned as a single response object, rather than a stream of objects
    stream:(可选)如果为 false,则响应将作为单个响应对象返回,而不是对象流
Examples 例子
Request 请求
curl http://localhost:11434/api/pull -d '{"name": "llama3.2"
}'
Response 响应

If stream is not specified, or set to true, a stream of JSON objects is returned:
如果未指定 stream 或设置为 true,则返回 JSON 对象流:

The first object is the manifest:
第一个对象是清单:

{"status": "pulling manifest"
}

Then there is a series of downloading responses. Until any of the downloads has completed, the completed key may not be included. The number of files to be downloaded depends on the number of layers specified in the manifest.
然后是一系列的下载响应。在任何一个下载完成之前,响应中可能不包含 completed 键。要下载的文件数取决于清单中指定的层数。

{"status": "downloading digestname","digest": "digestname","total": 2142590208,"completed": 241970
}

After all the files are downloaded, the final responses are:
下载所有文件后,最终响应为:

{"status": "verifying sha256 digest"
}
{"status": "writing manifest"
}
{"status": "removing any unused layers"
}
{"status": "success"
}

If stream is set to false, then the response is a single JSON object:
如果 stream 设置为 false,则响应是单个 JSON 对象:

{"status": "success"
}
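When streaming is enabled, download progress can be derived from the total and completed fields of each object. A minimal sketch, assuming the stream has already been split into lines; pull_progress is a hypothetical helper, and a missing completed key is treated as zero.

```python
import json

def pull_progress(lines):
    """Yield (status, percent or None) for each streamed pull object."""
    for line in lines:
        if not line.strip():
            continue
        obj = json.loads(line)
        total = obj.get("total")
        completed = obj.get("completed", 0)  # may be absent early in a download
        percent = round(100 * completed / total, 1) if total else None
        yield obj["status"], percent

# Objects taken from the responses shown above.
sample = [
    '{"status": "pulling manifest"}',
    '{"status": "downloading digestname","digest": "digestname","total": 2142590208,"completed": 241970}',
    '{"status": "success"}',
]
for status, percent in pull_progress(sample):
    print(status if percent is None else f"{status}: {percent}%")
```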

Push a Model 推送模型

POST /api/push

Upload a model to a model library. Requires registering for ollama.ai and adding a public key first.
将模型上传到模型库。需要先注册 ollama.ai 并添加公钥。

Parameters 参数

  • name: name of the model to push in the form of <namespace>/<model>:<tag>
    name:要推送的模型名称,格式为 <namespace>/<model>:<tag>
  • insecure: (optional) allow insecure connections to the library. Only use this if you are pushing to your library during development.
    insecure:(可选)允许与库建立不安全的连接。仅在开发期间推送到自己的库时使用此选项。
  • stream: (optional) if false the response will be returned as a single response object, rather than a stream of objects
    stream:(可选)如果为 false,则响应将作为单个响应对象返回,而不是对象流
Examples 例子
Request 请求
curl http://localhost:11434/api/push -d '{"name": "mattw/pygmalion:latest"
}'
Response 响应

If stream is not specified, or set to true, a stream of JSON objects is returned:
如果未指定 stream 或设置为 true,则返回 JSON 对象流:

{ "status": "retrieving manifest" }

and then: 然后:

{"status": "starting upload","digest": "sha256:bc07c81de745696fdf5afca05e065818a8149fb0c77266fb584d9b2cba3711ab","total": 1928429856
}

Then there is a series of uploading responses:
然后是一系列上传响应:

{"status": "starting upload","digest": "sha256:bc07c81de745696fdf5afca05e065818a8149fb0c77266fb584d9b2cba3711ab","total": 1928429856
}

Finally, when the upload is complete:
最后,上传完成后:

{"status":"pushing manifest"}
{"status":"success"}

If stream is set to false, then the response is a single JSON object:
如果 stream 设置为 false,则响应是单个 JSON 对象:

{ "status": "success" }
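Push expects the name in the form <namespace>/<model>:<tag>, so it can help to split a name before sending it. A small sketch, assuming names contain no registry host or port (which would add extra colons and slashes); split_model_name is a hypothetical helper.

```python
def split_model_name(name):
    """Split 'namespace/model:tag' into (namespace, model, tag).

    The namespace is None when absent, and the tag defaults to 'latest',
    matching the naming convention described earlier in this document.
    """
    rest, _, tag = name.partition(":")
    namespace, _, model = rest.rpartition("/")
    return namespace or None, model, tag or "latest"

namespace, model, tag = split_model_name("mattw/pygmalion:latest")
can_push = namespace is not None  # push requires a namespace
```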

Generate Embeddings 生成嵌入

POST /api/embed

Generate embeddings from a model
从模型生成嵌入

Parameters 参数

  • model: name of model to generate embeddings from
    model:用于生成嵌入的模型的名称
  • input: text or list of text to generate embeddings for
    input:要为其生成嵌入的文本或文本列表

Advanced parameters: 高级参数:

  • truncate: truncates the end of each input to fit within the context length. Returns an error if set to false and the context length is exceeded. Defaults to true
    truncate:截断每个输入的结尾以适合上下文长度。如果为 false 且超出上下文长度,则返回错误。默认为 true
  • options: additional model parameters listed in the documentation for the Modelfile such as temperature
    options:模型文件文档中列出的其他模型参数,例如温度
  • keep_alive: controls how long the model will stay loaded into memory following the request (default: 5m)
    keep_alive:控制模型在请求后保持加载在内存中的时间(默认值:5m)
Examples 例子
Request 请求
curl http://localhost:11434/api/embed -d '{"model": "all-minilm","input": "Why is the sky blue?"
}'
Response 响应
{"model": "all-minilm","embeddings": [[0.010071029, -0.0017594862, 0.05007221, 0.04692972, 0.054916814,0.008599704, 0.105441414, -0.025878139, 0.12958129, 0.031952348]],"total_duration": 14143917,"load_duration": 1019500,"prompt_eval_count": 8
}
Request (multiple inputs) 请求(多个输入)
curl http://localhost:11434/api/embed -d '{"model": "all-minilm","input": ["Why is the sky blue?", "Why is the grass green?"]
}'
Response 响应
{"model": "all-minilm","embeddings": [[0.010071029, -0.0017594862, 0.05007221, 0.04692972, 0.054916814,0.008599704, 0.105441414, -0.025878139, 0.12958129, 0.031952348],[-0.0098027075, 0.06042469, 0.025257962, -0.006364387, 0.07272725,0.017194884, 0.09032035, -0.051705178, 0.09951512, 0.09072481]]
}
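Embeddings are usually compared with cosine similarity. The sketch below shows one way to do that with the standard library only; it is an illustration rather than part of the API, and the vectors in the responses above are truncated, so real vectors will be longer.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero vector has no direction")
    return dot / (norm_a * norm_b)

def compare_first_two(response):
    """Compare the first two vectors of a parsed /api/embed response."""
    first, second = response["embeddings"][:2]
    return cosine_similarity(first, second)
```

Values close to 1 indicate similar inputs, and values near 0 indicate unrelated ones.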

List Running Models 列出正在运行的模型

GET /api/ps

List models that are currently loaded into memory.
列出当前加载到内存中的模型。

Examples 例子
Request 请求
curl http://localhost:11434/api/ps
Response 响应

A single JSON object will be returned.
将返回单个 JSON 对象。

{"models": [{"name": "mistral:latest","model": "mistral:latest","size": 5137025024,"digest": "2ae6f6dd7a3dd734790bbbf58b8909a606e0e7e97e94b7604e0aa7ae4490e6d8","details": {"parent_model": "","format": "gguf","family": "llama","families": ["llama"],"parameter_size": "7.2B","quantization_level": "Q4_0"},"expires_at": "2024-06-04T14:38:31.83753-07:00","size_vram": 5137025024}]
}
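Each entry in the response above reports both size and size_vram. Assuming size_vram is the part of the model held in GPU memory (an interpretation of the field name, not something stated above), their ratio shows how much of a loaded model sits on the GPU. vram_fraction is a hypothetical helper.

```python
def vram_fraction(model):
    """Return size_vram / size for one entry of the /api/ps response."""
    size = model.get("size", 0)
    if not size:
        return 0.0
    return model.get("size_vram", 0) / size

# Entry trimmed from the response shown above.
entry = {"name": "mistral:latest", "size": 5137025024, "size_vram": 5137025024}
fully_on_gpu = vram_fraction(entry) == 1.0
```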

Generate Embedding 生成嵌入

Note: this endpoint has been superseded by /api/embed
注意:此端点已被 /api/embed 取代

POST /api/embeddings

Generate embeddings from a model
从模型生成嵌入

Parameters 参数

  • model: name of model to generate embeddings from
    model:用于生成嵌入的模型的名称
  • prompt: text to generate embeddings for
    prompt:为其生成嵌入的文本

Advanced parameters: 高级参数:

  • options: additional model parameters listed in the documentation for the Modelfile such as temperature
    options:模型文件文档中列出的其他模型参数,例如温度
  • keep_alive: controls how long the model will stay loaded into memory following the request (default: 5m)
    keep_alive:控制模型在请求后保持加载在内存中的时间(默认值:5m)
Examples 例子
Request 请求
curl http://localhost:11434/api/embeddings -d '{"model": "all-minilm","prompt": "Here is an article about llamas..."
}'
Response 响应
{"embedding": [0.5670403838157654, 0.009260174818336964, 0.23178744316101074, -0.2916173040866852, -0.8924556970596313,0.8785552978515625, -0.34576427936553955, 0.5742510557174683, -0.04222835972905159, -0.137906014919281]
}
