当ChatGPT接管咖啡吧后,会发生什么呢?让我们来看看吧!
众所周知,打工人不能失去咖啡。然而,不论是点外卖还是自己做,都需要耗费不少的时间。如果,只要我说一句:
“我要喝咖啡!”
几分钟后就能有一杯咖啡送到我面前,那该是多么美好的事情啊。这样的事情,也许只在梦里有吧。
等等,最近火热的ChatGPT不是号称会引领第四次科技革命吗?那能不能通过使用ChatGPT让我们美梦成真呢?
话不多说,直接请出ChatGPT。
为了让ChatGPT能够接管我们的咖啡机和机器人,我们在Prompt中对他进行了角色的设置:
你是一个出色的机器人任务翻译员。根据指令,你可以将其拆分为一系列机器人动作,并生成可直接执行的 Python 程序
不愧是ChatGPT
我们先让ChatGPT从简单的事情做起,简单任务完成不了怎么完成复杂任务呢?
首先使用了以下指令:
“送餐机器人前往咖啡机”
“送餐机器人去到3号桌”
实验结果十分振奋人心,在测试中它没有出现生成错误程序的情况。
下面是针对“送餐机器人前往咖啡机”任务所生成的程序和执行效果:
import requests
import time
dispatch_api = http://localhost:5000/dispatch
# Dispatch the delivery robot to the table
print("Dispatching the delivery robot to table 3...")
response = requests.get(dispatch_api, params={"target_name": "咖啡机"})
time.sleep(1) # Wait for 1 second for the robot to complete the action
▲送餐机器人前往咖啡机
不愧是ChatGPT,能力就是强。
接下来只要把简单的程序组合成复杂的程序,不就可以完成更复杂的任务了吗?这对ChatGPT来说,想必也是小事一桩吧。
巧妇难为无米之炊
有了前面成功的经历,我变得大胆,想要一步到位,所以直接下发了一个复杂任务:制作并递送咖啡。
结果,ChatGPT生成的程序根本不能完成任务。
下面是针对“制作并递送咖啡”所生成的程序和执行效果:
import requests
import time
make_coffee_api = "http://localhost:5000/make_coffee"
get_cup_api = "http://localhost:5000/get_cup"
put_cup_api = "http://localhost:5000/put_cup"
open_hatch_api = "http://localhost:5000/open_hatch"
close_hatch_api = "http://localhost:5000/close_hatch"
dispatch_api = "http://localhost:5000/dispatch"
# Define the target location for the delivery robot
table_3 = "3号桌"# Move the delivery robot to the table 3
print("Moving the delivery robot to table 3...")
response = requests.get(move_robot_api, params={"target_name": table_3})
time.sleep(1) # Wait for 1 second for the robot to complete the action# Make a cup of American coffee
print("Making a cup of American coffee...")
response = requests.get(make_coffee_api, params={"coffee_type": "USA"})
time.sleep(1) # Wait for 1 second for the robot to complete the action# Get a cup from the coffee machine
print("Getting a cup from the coffee machine...")
response = requests.get(get_cup_api)
time.sleep(1) # Wait for 1 second for the robot to complete the action# Open the delivery compartment door
print("Opening the delivery compartment door...")
response = requests.get(open_hatch_api)
time.sleep(1) # Wait for 1 second for the robot to complete the action# Put the cup into the delivery compartment
print("Putting the cup into the delivery compartment...")
response = requests.get(put_cup_api)
time.sleep(1) # Wait for 1 second for the robot to complete the action# Close the delivery compartment door
print("Closing the delivery compartment door...")
response = requests.get(close_hatch_api)
time.sleep(1) # Wait for 1 second for the robot to complete the action
▲运行错误程序后,机器人没有带咖啡过来
可以看到,机器人确实走到了我旁边,可是里面却空空如也。回头一看,咖啡还在机械臂手里呢!
到底是哪里出了问题呢?
仔细分析它给出的代码,如果送餐机器人一开始就已经拿到了咖啡,那么,上面的程序不也可以完成任务吗?
想到这里,我意识到我需要告诉ChatGPT更多的信息,不能只是简单地告诉他任务内容,却不提供完成任务所必须的信息。巧妇都难为无米之炊,咱也不能为难ChatGPT呀。
紧接着,我在prompt中加入了机器人的状态和环境的信息。结果,正确!
下面是针对“制作并递送咖啡”所生成的程序(执行效果见最后):
import requests
import time
make_coffee_api = "http://localhost:5000/make_coffee"
get_cup_api = "http://localhost:5000/get_cup"
put_cup_api = "http://localhost:5000/put_cup"
open_hatch_api = "http://localhost:5000/open_hatch"
close_hatch_api = "http://localhost:5000/close_hatch"
dispatch_api = "http://localhost:5000/dispatch"
# First, dispatch the delivery robot to the coffee machineprint("Dispatching the delivery robot to the coffee machine...")response = requests.get(dispatch_api, params={"target_name": "咖啡机"})time.sleep(1) # Wait for 1 second for the robot to complete the action# # Then, make a cup of coffeeprint("Making a cup of coffee...")response = requests.get(make_coffee_api, params={"coffee_type": "USA"})time.sleep(1) # Wait for 1 second for the robot to complete the action# Then, open the #delivery compartment door
print("Opening the delivery compartment door...")
response = requests.get(open_hatch_api)
time.sleep(1) # Wait for 1 second for the robot to complete the action# Next, use the robotic arm at the coffee machine to get the cup filled with coffee
print("Getting the cup filled with coffee...")
response = requests.get(get_cup_api)
time.sleep(1) # Wait for 1 second for the robot to complete the action# Next, put the cup inside the compartment using the robotic arm at the coffee machine
print("Putting the cup inside the delivery robot's compartment...")
response = requests.get(put_cup_api)
time.sleep(1) # Wait for 1 second for the robot to complete the action# Finally, close the delivery compartment door
print("Closing the delivery compartment door...")
response = requests.get(close_hatch_api)
time.sleep(1) # Wait for 1 second for the robot to complete the action# Dispatch the delivery robot to the table
print("Dispatching the delivery robot to table 3...")
response = requests.get(dispatch_api, params={"target_name": "3号桌"})
time.sleep(1) # Wait for 1 second for the robot to complete the action
美梦成真
既然可以生成正确的程序,那么就来实际体验下吧。但当我随口说出喝咖啡的要求时,得到的回答让人大跌眼镜:
输入:你好,点两杯美式咖啡
输出:抱歉,我是一个文本AI助手,无法为您提供实际的咖啡服务。但是,如果您需要我为您生成相应的机器人动作序列,我可以帮您完成。请问您需要这样的帮助吗?
输入:我渴了,来杯咖啡吧?
输出:抱歉,我是一个文本AI助手,无法为您提供实际的咖啡服务。但是,如果您需要我为您生成相应的机器人动作序列,我可以帮您完成。请问您需要这样的帮助吗?
“我是一个文本AI助手”,莫非是忘记了上下文?
但输入之前的指令,依旧可以输出正确的程序,而且它还在强调它可以帮我写程序,这与我赋予它的角色是一致的。
也许,问题不在ChatGPT,而在我?
是因为我给它的设定太过死板,导致它失去了自然交互的能力?也许不应该设定为“机器人程序翻译员”,应该设置一个更有人情味的角色。随后,我们将其扮演的角色更新为一个咖啡吧的服务员。
更换角色后的结果如下(相比于前面的任务,增加了互动的功能):
输入:我渴了,来杯咖啡吧
输出:当然可以!请问您想要一杯美式咖啡还是意式咖啡呢?
输入:来杯美式咖啡
输出:好的,一杯美式咖啡。请您稍等一下,我马上为您准备。
import requests
import time
make_coffee_api = "http://localhost:5000/make_coffee"
get_cup_api = "http://localhost:5000/get_cup"
put_cup_api = "http://localhost:5000/put_cup"
open_hatch_api = "http://localhost:5000/open_hatch"
close_hatch_api = "http://localhost:5000/close_hatch"
dispatch_api = "http://localhost:5000/dispatch"
# First, dispatch the delivery robot to the coffee machineprint("Dispatching the delivery robot to the coffee machine...")response = requests.get(dispatch_api, params={"target_name": "咖啡机"})time.sleep(1) # Wait for 1 second for the robot to complete the action# # Then, make a cup of coffeeprint("Making a cup of coffee...")response = requests.get(make_coffee_api, params={"coffee_type": "USA"})time.sleep(1) # Wait for 1 second for the robot to complete the action# Then, open the #delivery compartment door
print("Opening the delivery compartment door...")
response = requests.get(open_hatch_api)
time.sleep(1) # Wait for 1 second for the robot to complete the action# Next, use the robotic arm at the coffee machine to get the cup filled with coffee
print("Getting the cup filled with coffee...")
response = requests.get(get_cup_api)
time.sleep(1) # Wait for 1 second for the robot to complete the action# Next, put the cup inside the compartment using the robotic arm at the coffee machine
print("Putting the cup inside the delivery robot's compartment...")
response = requests.get(put_cup_api)
time.sleep(1) # Wait for 1 second for the robot to complete the action# Finally, close the delivery compartment door
print("Closing the delivery compartment door...")
response = requests.get(close_hatch_api)
time.sleep(1) # Wait for 1 second for the robot to complete the action# Dispatch the delivery robot to the table
print("Dispatching the delivery robot to table 3...")
response = requests.get(dispatch_api, params={"target_name": "3号桌"})
time.sleep(1) # Wait for 1 second for the robot to complete the action
下面是程序运行后的效果:
▲制作咖啡
▲取到咖啡
▲递送咖啡
可以看出,改进设定的角色后,ChatGPT具有了交互能力,可以与人进行友好的交谈,也可以生成代码,执行任务。
现在,只要我说一句:来杯咖啡。几分钟后就真的能有一杯咖啡送到我面前,实在是太棒啦!
最后总结一下在这个过程中暴露出的一些问题和收获:
1:不恰当的prompt会削弱ChatGPT本身的理解能力
2:角色设置中需要包含所有需要完成的任务和要求,如果任务描述不充分,就会导致ChatGPT表现不佳,甚至丧失原本的能力。
3:prompt中加入机器人状态信息,有助于ChatGPT生成正确的序列
同时也发现了一些亟待解决的事情:
比如ChatGPT生成程序的正确性验证是一个很有挑战性的问题,目前采取的是通过人去核验,但真实场景下用户是不具备判断能力的。因此需要一套自动化的系统来确保程序的正确性。还有与环境的交互问题,需要让ChatGPT自主感知设备状态以及环境信息,这样才能更好地根据不同情况生成不同的程序。以及当对话轮次变多后,prompt的膨胀问题等等。这些都是之后需要研究的问题。
END
致谢
感谢上海市级科技重大专项“人工智能前沿基础理论与关键技术—自主智能无人系统”的资助。
参考文献:
[1]. ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application
[2]. ChatGPT for Robotics: Design Principles and Model Abilities
复旦大学软件工程实验室
作者丨郝竞渊
排版丨刘俊伟
审核丨董震