Question: "OpenAI API: How to count the number of tokens before an API request"
Background:
I would like to count the tokens of my OpenAI API request in R before sending it (model gpt-3.5-turbo). Since the OpenAI API has rate limits, this seems important to me.
Example:
The function I use to send requests:
ask_chatgpt <- function(prompt) {
  response <- POST(
    url = "https://api.openai.com/v1/chat/completions",
    add_headers(Authorization = paste("Bearer", api_key)),
    content_type_json(),
    encode = "json",
    body = list(
      model = "gpt-3.5-turbo",
      messages = list(list(role = "user", content = prompt))
    )
  )
  str_trim(content(response)$choices[[1]]$message$content)
}
Example
api_key <- "your_openai_api_key"

library(httr)
library(tidyverse)

# Calls the ChatGPT API with the given prompt and returns the answer
ask_chatgpt <- function(prompt) {
  response <- POST(
    url = "https://api.openai.com/v1/chat/completions",
    add_headers(Authorization = paste("Bearer", api_key)),
    content_type_json(),
    encode = "json",
    body = list(
      model = "gpt-3.5-turbo",
      messages = list(list(role = "user", content = prompt))
    )
  )
  str_trim(content(response)$choices[[1]]$message$content)
}

prompt <- "how do I count the token in R for gpt-3.45-turbo?"
ask_chatgpt(prompt)
#> [1] "As an AI language model, I am not sure what you mean by \"count the token in R for gpt-3.5-turbo.\" Please provide more context or clarification so that I can better understand your question and provide an appropriate answer."
I would like to calculate/estimate how many tokens prompt will need with gpt-3.5-turbo. There is a similar question for gpt-3 and Python, where the tiktoken library is recommended. However, I could not find a similar library in R.
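For a very rough pre-check without any tokenizer at all, OpenAI's documentation cites about 4 characters per token for typical English text. A minimal sketch of that heuristic (estimate_tokens is a made-up name, and the figure is only an approximation, not the model's real count):

# Crude heuristic from OpenAI's docs: ~4 characters per token for
# typical English text. Useful only as a rough pre-check.
estimate_tokens <- function(text) {
  ceiling(nchar(text) / 4)
}

estimate_tokens("how do I count the token in R for gpt-3.45-turbo?")
#> [1] 13

Compare this with the exact tiktoken count of 19 in the answer below: the heuristic can be noticeably off.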
OpenAI also recommends tiktoken (for Python) or the gpt-3-encoder package for JavaScript.
Solution:
OpenAI has their own tokenizer, so you probably won't be able to reproduce it in R. Instead, I would just recommend calling their Python tiktoken library via the reticulate package.
First, install the tiktoken package via the command line using:
pip install tiktoken
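Alternatively, if you would rather stay inside R, reticulate can install the package into its own Python environment (an optional sketch; it assumes reticulate is already set up with a working Python):

# Optional alternative to the shell command above: install tiktoken
# into reticulate's Python environment from within R
reticulate::py_install("tiktoken")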
Then, in R:
library(reticulate)

# Import the Python tiktoken module and look up the tokenizer
# that gpt-3.5-turbo uses
tiktoken <- import("tiktoken")
encoding <- tiktoken$encoding_for_model("gpt-3.5-turbo")

# Encode the prompt and count the resulting tokens
prompt <- "how do I count the token in R for gpt-3.45-turbo?"
length(encoding$encode(prompt))
# [1] 19
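To make this reusable, you could wrap the encoder in a small helper and gate each request on the count. A minimal sketch building on the objects above (count_tokens is a hypothetical name; also note that the chat API adds a few tokens of per-message overhead on top of the raw encoding, so treat this count as a slight underestimate):

library(reticulate)

tiktoken <- import("tiktoken")
encoding <- tiktoken$encoding_for_model("gpt-3.5-turbo")

# Hypothetical helper: exact token count for a string, using the
# same tiktoken encoding as above
count_tokens <- function(text) {
  length(encoding$encode(text))
}

# Pre-flight check before calling ask_chatgpt() from the question.
# gpt-3.5-turbo's 4096-token context window is shared between prompt
# and completion, so leave room for the answer.
prompt <- "how do I count the token in R for gpt-3.45-turbo?"
if (count_tokens(prompt) < 4096) {
  ask_chatgpt(prompt)
}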