Carbon-based life forms are forever looking for ways to be lazy,
and lately even describing a task to ChatGPT precisely enough to get code out of it has felt exhausting.
Then I finally caught the magic wand:
the wildly popular AutoGPT from GitHub now has a web version.
It automatically calls ChatGPT (3.5):
you type natural language, and it decomposes the task and outputs the results in stages.
Use it to write code, revise code,
or drop in a snippet and have it explain what the code does.
It feels like taking off.
The AutoGPT website
Everything below uses the web version of AutoGPT.
URL: https://app.cognosys.ai/
It's free; you only need to register with an email address (Chinese mail providers work, no VPN required).
A worked example
As an example, let's scrape ten years of daily precious-metal prices from a public website:
in the text box on the left, type a very rough instruction in natural language.
Entering the command
Don't ask why I'm not just using ChatGPT directly: crafting precise commands and prompts is mentally taxing too.
Also, if you'd rather not type Chinese, you can have Notion (or something similar) translate it into English with AI.
In the top-left corner there's an option to plug in your own OpenAI API key. You can skip it; by default the site makes the calls itself (the downside is that longer answers sometimes come back incomplete).
Then click start and watch the magic happen.
Watching the magic
The middle window starts thinking
and just keeps running.
I folded my arms
and watched……
(my work here was done)
Automatic task decomposition
This step is the core of AutoGPT.
Completing the tasks
Completing “1. Develop a web scraper using Python to extract daily palladium prices from the website https://www.lppm.com/data from 2013 to present.”
Sure, I can help you with that task. To extract daily palladium prices from the website https://www.lppm.com/data from 2013 to present, we can use Python’s Beautiful Soup library which is a popular web scraping library.
Here’s the code that will fetch the daily palladium prices from the website and store it in a CSV file:
import requests
from bs4 import BeautifulSoup
import csv

# URL of the website to be scraped
url = "https://www.lppm.com/data"

# Send a GET request to the website and get the response
response = requests.get(url)

# Parse the HTML content using Beautiful Soup
soup = BeautifulSoup(response.content, 'html.parser')

# Get the table containing the palladium prices
table = soup.find('table', attrs={'id': 'datatable'})

# Get the table rows
rows = table.find_all('tr')

# Create a list to hold the data
data = []

# Loop through each row and get the data
for row in rows:
    cols = row.find_all('td')
    cols = [col.text.strip() for col in cols]
    data.append(cols)

# Write the data to a CSV file
with open('palladium_prices.csv',
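The generated snippet was cut off partway through the CSV-writing step. A plausible completion of just that step, sketched here with hypothetical sample rows standing in for what the scraper would have collected:

```python
import csv

# Hypothetical sample rows in place of the scraped data
data = [['Date', 'Price'], ['02-Jan-2013', '708.00'], ['03-Jan-2013', '710.25']]

# Write the rows to a CSV file, one row per list
with open('palladium_prices.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)
```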
Completing “2. Clean and format the extracted data to ensure consistency and accuracy.”
Sure, I can help you with that.
After fetching the data from the website, we can use Python’s pandas library to clean and format the data. Here are the steps I suggest:
First, we need to import the pandas library and read the csv file that we created while fetching the data. We can use the read_csv() method for this.
import pandas as pd

df = pd.read_csv('palladium_data.csv')
Next, we need to check the data types of each column and ensure that they are consistent. We can use the dtypes attribute to check the data types.
print(df.dtypes)
This will give us the data types of each column. We can then convert any columns that are not in the correct data type to the correct one.
We also need to check for any missing or null values in the data. We can use the isnull() method to check for any null values.
print(df.isnull().sum())
If there are any null values, we can either drop the rows or fill them with appropriate values.
We can also format the date column to ensure consistency. We can use the to_datetime()
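The explanation above was truncated at `to_datetime()`. As a minimal sketch of what that date-formatting step usually looks like (column names and the date format here are assumptions):

```python
import pandas as pd

# Toy frame with dates as strings, as they come out of a scraper
df = pd.DataFrame({'Date': ['02-Jan-13', '03-Jan-13'], 'Price': [708.0, 710.25]})

# Parse the date strings into proper datetime objects for consistency
df['Date'] = pd.to_datetime(df['Date'], format='%d-%b-%y')
print(df['Date'].dt.year.tolist())  # [2013, 2013]
```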
Completing “3. Write a function to export the cleaned data to a CSV file for easy analysis and storage.”
Sure, I can help you with that. Here’s a Python function that exports the cleaned data to a CSV file:
import pandas as pd

def export_to_csv(data):
    df = pd.DataFrame(data, columns=['Date', 'Price'])
    df.to_csv('palladium_prices.csv', index=False)
This function takes in the cleaned data as a list of tuples, where each tuple contains the date and price of the palladium. It then converts this data into a Pandas DataFrame and exports it to a CSV file named “palladium_prices.csv” in the current working directory.
You can call this function at the end of your data cleaning process to export the cleaned data to a CSV file for easy analysis and storage.
How thoughtful: every step comes with both an explanation and a solution, just in case you don't follow, haha.
Going off-script
In theory the task was finished at this point,
but it didn't stop.
It started thinking again,
and after mulling it over, began adding scenes of its own:
first cleaning the data and handling missing values for me,
then grouping the daily data by year (all things you'd need for further analysis anyway: real foresight).
This devoted contractor is seriously impressive.
Then came another round of output for the extra tasks.
Completing “Create a function to handle missing or null values in the data by either dropping the rows or filling them with appropriate values.”
Sure, I can create a Python function to handle missing or null values in the palladium daily prices data fetched from the website https://www.lppm.com/data. Here is the code:
import pandas as pd

def handle_missing_values(data):
    # Drop rows with missing or null values
    data.dropna(inplace=True)
    # Fill missing or null values with appropriate values
    data.fillna(method='ffill', inplace=True)
    # Save the data to a CSV file
    data.to_csv('palladium_daily_prices.csv', index=False)
    return data
Not content with just code, it keeps explaining in case you don't follow:
The handle_missing_values function takes in the data variable, which should be a Pandas DataFrame containing the palladium daily prices data fetched from the website. The function first drops any rows with missing or null values using the dropna method. It then fills any remaining missing or null values with the previous valid value using the fillna method with the ffill parameter. Finally, the function saves the cleaned data to a CSV file named palladium_daily_prices.csv using the to_csv method with the index=False parameter.
Please note that this function assumes that the data variable contains (sadly this part ran too long and wasn't fully output)
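One quibble with the generated function: it drops all rows with nulls first and then forward-fills, so the fill step never has anything left to do. A minimal sketch that picks a single strategy (forward-fill, keeping every row), shown on toy data:

```python
import pandas as pd
import numpy as np

def handle_missing_values(data):
    # Forward-fill each gap from the previous valid observation
    return data.ffill()

# Toy frame with one missing price
df = pd.DataFrame({'Price': [700.0, np.nan, 710.0]})
print(handle_missing_values(df)['Price'].tolist())  # [700.0, 700.0, 710.0]
```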
Next, grouping by year with the groupby() function.
Completing “Use the groupby() method to group the data by year and calculate the average price of palladium for each year.”
Sure, I can help you with that. Here’s a Python code that can fetch palladium daily prices from the website https://www.lppm.com/data, filter the data from 2013 to now, group the data by year, and calculate the average price of palladium for each year:
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Fetch the data from the website
url = "https://www.lppm.com/data"
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')

# Find the table with the palladium daily prices
table = soup.find('table', {'id': 'tblPalladium'})

# Convert the table to a pandas dataframe
df = pd.read_html(str(table))[0]

# Filter the data from 2013 to now
df['Date'] = pd.to_datetime(df['Date'])
df = df[df['Date'].dt.year >= 2013]

# Group the data by year and calculate the average price of palladium for each year
df_grouped = df.groupby(df['Date'].dt.year)['Price'].mean()

# Save the data to a csv file
df_grouped.to_csv('palladium_prices.csv')
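The groupby step is easy to sanity-check on a toy frame (the dates and prices below are made up):

```python
import pandas as pd

df = pd.DataFrame({
    'Date': pd.to_datetime(['2013-01-02', '2013-06-03', '2014-01-02']),
    'Price': [700.0, 720.0, 800.0],
})

# Group by calendar year and average the price within each year
yearly_mean = df.groupby(df['Date'].dt.year)['Price'].mean()
print(yearly_mean.loc[2013])  # 710.0
```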
Finally, did it decide my task was too simple without a visualization?
Here comes a chart.
Completing “Create a visualization of the palladium prices over time using a line chart in Python’s Matplotlib library.”
Sure, I can help you with that. Here is the Python code to fetch the palladium daily prices from the website https://www.lppm.com/data from 2013-now and create a CSV file to record the data:
import requests
from bs4 import BeautifulSoup
import csv

url = "https://www.lppm.com/data"
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
table = soup.find_all('table')[0]
rows = table.find_all('tr')
data = []
for row in rows:
    cols = row.find_all('td')
    cols = [col.text.strip() for col in cols]
    data.append(cols)

with open('palladium_prices.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)
This code will fetch the data from the website and write it to a CSV file named palladium_prices.csv.
To create a line chart of the palladium prices over time, we can use Python’s Matplotlib library. Here is the code to do that:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('palladium_prices.csv')
df['Date
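The chart code was cut off right after `df['Date`. A self-contained sketch of what the finished line chart might look like, using made-up data and saving to a file instead of popping a window (a guess at the intent, not the original output):

```python
import pandas as pd
import matplotlib
matplotlib.use('Agg')  # render without a display
import matplotlib.pyplot as plt

# Made-up prices standing in for the scraped CSV
df = pd.DataFrame({
    'Date': pd.to_datetime(['2013-01-02', '2013-01-03', '2013-01-04']),
    'Price': [708.0, 710.25, 705.5],
})

# Line chart of price over time
plt.plot(df['Date'], df['Price'])
plt.title('Palladium Prices Over Time')
plt.xlabel('Date')
plt.ylabel('Price')
plt.savefig('palladium_line.png')
```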
Just as I was wondering why it still hadn't finished,
it went off-script again.
It actually started handling outliers!!!!
Completing “Create a function to handle outliers in the data by either dropping them or replacing them with appropriate values.”
Sure, I can help you with that. Here is a Python code that fetches palladium daily prices from the website https://www.lppm.com/data from 2013-now and creates a CSV file to record the data.
(Honestly, that was hardly necessary.)
import pandas as pd
import requests
from bs4 import BeautifulSoup

def fetch_palladium_prices():
    url = 'https://www.lppm.com/data'
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, 'html.parser')
    table = soup.find_all('table')[0]
    df = pd.read_html(str(table))[0]
    df.columns = df.columns.droplevel()
    df = df.rename(columns={'Date': 'date', 'Palladium AM': 'palladium_am', 'Palladium PM': 'palladium_pm'})
    df['date'] = pd.to_datetime(df['date'], format='%d-%b-%y')
    df.to_csv('
Then it called the describe function to give a statistical summary of the data.
Completing “Use the describe() method to get a summary of the statistical measures of the data.”
Sure, here’s a Python code snippet that fetches daily palladium prices from the LPPM website and creates a CSV file to record the data. This code uses the pandas library to read the HTML table from the website and convert it into a DataFrame, which can then be used to compute the summary statistics using the describe() method.
import pandas as pd

# Define the URL and read the HTML table
url = 'https://www.lppm.com/data'
dfs = pd.read_html(url)

# Select the table containing the palladium prices
df = dfs[3]

# Rename the columns and set the date as the index
df.columns = ['Date', 'USD', 'EUR', 'GBP', 'JPY']
df['Date'] = pd.to_datetime(df['Date'], format='%d-%b-%y')
df.set_index('Date', inplace=True)

# Save the data to a CSV file
df.to_csv('palladium_prices.csv')

# Compute the summary statistics using the describe() method
summary = df.describe()
print(summary)
This code should fetch the palladium prices from the LPPM website, create a CSV file called palladium_prices.csv to record the data
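For reference, `describe()` returns count, mean, standard deviation, min, quartiles, and max for each numeric column; a quick toy check:

```python
import pandas as pd

# Tiny series to show what describe() reports
s = pd.Series([1.0, 2.0, 3.0, 4.0])
summary = s.describe()
print(summary['count'], summary['mean'])  # 4.0 2.5
```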
At the very end, just as I was deciding to hit stop if it didn't finish
(the audience had lost patience with the performance),
it started plotting distributions.
Completing “Create a histogram of the palladium prices to visualize the distribution of the data.”
Sure, I can help you with that. Here’s the code that can fetch the palladium daily prices from the website and create a CSV file to record the data:
import requests
from bs4 import BeautifulSoup
import csv

url = 'https://www.lppm.com/data'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
table = soup.find('table', {'class': 'table-data'})
rows = table.find_all('tr')
data = []
for row in rows:
    cols = row.find_all('td')
    cols = [col.text.strip() for col in cols]
    data.append(cols)

with open('palladium_prices.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(data)
To create a histogram of the palladium prices, you can add the following code to the end of the above code:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('palladium_prices.csv')
prices = df['Price']
plt.hist(prices, bins=20)
plt.title('Histogram of Palladium Prices')
plt.xlabel('Price')
plt.ylabel('Frequency')
plt.show()
After finishing all that work it apparently still had energy to spare, and kept churning out more like it was running on adrenaline:
computing correlations between the target's price and other economic indicators, drawing scatter plots……
After this whole routine, that's a midterm project essentially done, right?
(Makes me wonder how many finance term papers it trained on, haha.)
……
I'll never use any of it, but since it wrote it all anyway,
the code is pasted below.
Handling outliers:
Completing “Create a Python function to handle outliers in the palladium daily prices data by either dropping them or replacing them with appropriate values.”
Sure, I can help you with that. Here’s a Python function that fetches the palladium daily prices from the website you mentioned and handles the outliers by dropping them:
import pandas as pd
import requests
from bs4 import BeautifulSoup

def fetch_palladium_data():
    url = 'https://www.lppm.com/data'
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    table = soup.find('table')
    df = pd.read_html(str(table))[0]
    df.columns = df.columns.droplevel()
    df = df.rename(columns={'Unnamed: 0': 'Date'})
    df['Date'] = pd.to_datetime(df['Date'])
    df = df.set_index('Date')
    # Handling outliers
    df = df.drop(df[df['Palladium'] < 500].index)
    # Writing data to a CSV file
    df.to_csv('palladium_daily_prices.csv')
This function fetches the data from the website, cleans it, drops outliers with a value less than 500, and writes the clean data to a CSV file named palladium_daily_prices.csv. You can call this function as follows:
fetch_palladium
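Dropping everything below a hard-coded 500 is fairly arbitrary; a common alternative is the 1.5 × IQR rule. A small sketch on toy data (the series below is made up):

```python
import pandas as pd

def drop_outliers_iqr(s):
    # Keep only values within 1.5 * IQR of the quartiles
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s[(s >= q1 - 1.5 * iqr) & (s <= q3 + 1.5 * iqr)]

prices = pd.Series([700.0, 705.0, 710.0, 715.0, 5000.0])
print(drop_outliers_iqr(prices).tolist())  # [700.0, 705.0, 710.0, 715.0]
```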
Correlation analysis
Completing “Use the corr() method in pandas to calculate the correlation between the palladium prices and other economic factors, such as inflation or stock prices.”
Sure, I can help you with that. Here’s a Python code that fetches daily palladium prices from the website https://www.lppm.com/data and calculates the correlation between palladium prices and inflation:
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Fetch data from website
url = 'https://www.lppm.com/data'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
table = soup.find('table')
data = []

# Extract palladium prices
for row in table.find_all('tr'):
    cols = row.find_all('td')
    if len(cols) == 4:
        date = cols[0].text.strip()
        price = float(cols[1].text.strip().replace(',', ''))
        data.append({'Date': date, 'Palladium Price': price})

# Convert data to pandas dataframe
df = pd.DataFrame(data)

# Fetch inflation data
inflation_url = 'https://www.statbureau.org/en/united-states/inflation-tables'
inflation_data = pd.read_html(inflation_url)[0]

# Remove unnecessary columns and rename columns
inflation_data = inflation_data[['Year', 'Annual Average']]
inflation_data.columns =
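The inflation part was cut off, but the correlation step itself is a single call. On toy, perfectly linear data (both columns made up), `corr()` returns 1.0:

```python
import pandas as pd

df = pd.DataFrame({
    'palladium': [700.0, 710.0, 720.0, 730.0],  # hypothetical prices
    'inflation': [1.5, 1.6, 1.7, 1.8],          # hypothetical annual inflation
})

# Pearson correlation between the two series
r = df['palladium'].corr(df['inflation'])
print(round(r, 6))  # 1.0
```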
Plotting the correlations
Completing “Create a Python function to plot the palladium prices against other economic factors using a scatter plot in Matplotlib.”
Sure, here’s a Python function that fetches the daily palladium prices from the LPPM website and plots it against other economic factors using a scatter plot in Matplotlib:
import pandas as pd
import matplotlib.pyplot as plt
import requests
from bs4 import BeautifulSoup

def plot_palladium_prices():
    # Fetch the data from the LPPM website
    url = 'https://www.lppm.com/data'
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'html.parser')
    table = soup.find('table', {'class': 'table table-striped table-bordered table-hover dataTable no-footer'})

    # Convert the data to a pandas dataframe
    data = []
    for tr in table.find_all('tr'):
        row = []
        for td in tr.find_all('td'):
            row.append(td.text.strip())
        data.append(row)
    df = pd.DataFrame(data[1:], columns=data[0])
    df = df.iloc[::-1].reset_index(drop=True)
    df['Date'] = pd.to_datetime(df['Date'], format='%d-%b-%y')
    df['Palladium'] = pd.to_numeric(df['Palladium'].str.replace(',', ''))
    # Plot the data
Finally!!!! It thought for a moment,
and the task was at last complete.
It felt like when the boss wants to end a meeting, one employee keeps rambling on, then suddenly stops, and everyone breathes a sigh of relief.
I have to say: GPT, domesticated by AutoGPT, may only be version 3.5, but it is seriously dedicated.
Outrageously dedicated.
When will it be able to move all this output straight onto CSDN for me?
Finally, let's review its convoluted task line.
Remember what my original instruction was?
Even I had forgotten……
One last time, the magical website:
https://app.cognosys.ai/
This post covered natural language in, code out.
Next post: code in, natural language plus code out,
to see how it can be used to optimize code (the code runs, but the loops are so inefficient it hangs the machine).