Question: "OpenAI API: How to count the number of tokens before an API request"
Background:
I would like to count the tokens of my OpenAI API request in R before sending it (model gpt-3.5-turbo). Since the OpenAI API has rate limits, this seems important to me.
Example:
The function I use to send requests:
ask_chatgpt <- function(prompt) {
  response <- POST(
    url = "https://api.openai.com/v1/chat/completions",
    add_headers(Authorization = paste("Bearer", api_key)),
    content_type_json(),
    encode = "json",
    body = list(
      model = "gpt-3.5-turbo",
      messages = list(list(role = "user", content = prompt))
    )
  )
  str_trim(content(response)$choices[[1]]$message$content)
}
Example
api_key <- "your_openai_api_key"

library(httr)
library(tidyverse)

# Calls the ChatGPT API with the given prompt and returns the answer
ask_chatgpt <- function(prompt) {
  response <- POST(
    url = "https://api.openai.com/v1/chat/completions",
    add_headers(Authorization = paste("Bearer", api_key)),
    content_type_json(),
    encode = "json",
    body = list(
      model = "gpt-3.5-turbo",
      messages = list(list(role = "user", content = prompt))
    )
  )
  str_trim(content(response)$choices[[1]]$message$content)
}

prompt <- "how do I count the token in R for gpt-3.45-turbo?"
ask_chatgpt(prompt)
#> [1] "As an AI language model, I am not sure what you mean by \"count the token in R for gpt-3.5-turbo.\" Please provide more context or clarification so that I can better understand your question and provide an appropriate answer."
I would like to calculate/estimate how many tokens prompt will need with gpt-3.5-turbo. There is a similar question for gpt-3 and Python, where the tiktoken library is recommended. However, I could not find a similar library in R.
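For a very rough pre-check without any tokenizer at all, OpenAI's documentation cites about 4 characters per token for typical English text. A minimal sketch of that heuristic (estimate_tokens is a made-up name, and the figure is only an approximation, not the model's real count):

# Crude heuristic from OpenAI's docs: ~4 characters per token for
# typical English text. Useful only as a rough pre-check.
estimate_tokens <- function(text) {
  ceiling(nchar(text) / 4)
}

estimate_tokens("how do I count the token in R for gpt-3.45-turbo?")
#> [1] 13

Compare this with the exact tiktoken count of 19 in the answer below: the heuristic can be noticeably off.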
OpenAI also recommends tiktoken (for Python) or the gpt-3-encoder package for JavaScript.
Solution:
OpenAI has their own tokenizer, so you probably won't be able to reproduce it in R. Instead, I would just recommend calling their Python tiktoken library via the reticulate package.
First, install the tiktoken package via the command line using:
pip install tiktoken
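Alternatively, if you would rather stay inside R, reticulate can install the package into its own Python environment (an optional sketch; it assumes reticulate is already set up with a working Python):

# Optional alternative to the shell command above: install tiktoken
# into reticulate's Python environment from within R
reticulate::py_install("tiktoken")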
Then, in R:
library(reticulate)

# Import the Python tiktoken module and look up the tokenizer
# that gpt-3.5-turbo uses
tiktoken <- import("tiktoken")
encoding <- tiktoken$encoding_for_model("gpt-3.5-turbo")

# Encode the prompt and count the resulting tokens
prompt <- "how do I count the token in R for gpt-3.45-turbo?"
length(encoding$encode(prompt))
# [1] 19
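To make this reusable, you could wrap the encoder in a small helper and gate each request on the count. A minimal sketch building on the objects above (count_tokens is a hypothetical name; also note that the chat API adds a few tokens of per-message overhead on top of the raw encoding, so treat this count as a slight underestimate):

library(reticulate)

tiktoken <- import("tiktoken")
encoding <- tiktoken$encoding_for_model("gpt-3.5-turbo")

# Hypothetical helper: exact token count for a string, using the
# same tiktoken encoding as above
count_tokens <- function(text) {
  length(encoding$encode(text))
}

# Pre-flight check before calling ask_chatgpt() from the question.
# gpt-3.5-turbo's 4096-token context window is shared between prompt
# and completion, so leave room for the answer.
prompt <- "how do I count the token in R for gpt-3.45-turbo?"
if (count_tokens(prompt) < 4096) {
  ask_chatgpt(prompt)
}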