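Many of us have dreamed of two abilities: going back to the past and foreseeing the future. Reversing time is mostly something that happens in fiction. Foreseeing the future, on the other hand, is something we are actively working toward: we hope to predict more detail, further ahead, and more accurately. In the fast-moving stock market, for example, we might use NLP to gauge market sentiment, apply machine learning to forecast the overall trend, or mine large datasets for hidden relationships between stocks, all in the hope of arriving at a sound investment strategy. All of this requires large amounts of data to experiment with. Here I describe an efficient way to obtain historical stock trading data, so that everyone can get the data quickly and run their own experiments.
First, here are the codes used below and the stock exchanges they stand for: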
Code | Stock Exchange |
---|---|
SHA | Shanghai Stock Exchange |
SHE | Shenzhen Stock Exchange |
HKG | Hong Kong Stock Exchange |
LON | London Stock Exchange |
NASDAQ | NASDAQ Stock Exchange |
NYSE | New York Stock Exchange |
AMEX | American Stock Exchange |
ASX | Australian Securities Exchange |
BVMF | Bovespa Stock Exchange |
CVE | Toronto TSX Ventures Stock Exchange |
TSE | Toronto Stock Exchange |
KSE | Korea Stock Exchange |
NSE | National Stock Exchange of India |
NZE | New Zealand Stock Exchange |
SGX | Singapore Exchange |
STO | NASDAQ OMX Stockholm |
TPE | Taiwan Stock Exchange |
TYO | Tokyo Stock Exchange |
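Now download StockData. Once that is done, enter the directory and list its contents:
It contains a folder named symbol and three Python files: data.py, run.py and stock.py. Let's try running run.py:
As shown above, fetching the trading data of any stock exchange takes a single command. For example, the following fetches the full trading history of eight stocks on the Shenzhen Stock Exchange (SHE) and stores it in the directory SHE_8:
The downloaded files are all saved in .csv format, and each is named after the code of the corresponding stock. For example, the 000001 in 000001.csv is the code of Ping An Bank on the Shenzhen Stock Exchange. Running $ cat SHE_8/000001.csv | less shows that the trading records obtained for Ping An Bank start in 1991: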
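Once a file such as SHE_8/000001.csv exists, the same check (when do the records start?) can be done from Python instead of less. The following is only a minimal sketch in Python 3: it assumes the file has a header row with a `Date` column holding ISO dates (YYYY-MM-DD), which this article does not confirm, and the helper names `read_rows` and `date_range` as well as the sample values are made up for illustration.

```python
import csv
import io


def read_rows(fileobj):
    """Parse CSV text with a header row into a list of dicts keyed by column name."""
    return list(csv.DictReader(fileobj))


def date_range(rows, date_column="Date"):
    """Return (earliest, latest) date strings; ISO dates sort correctly as text."""
    dates = sorted(row[date_column] for row in rows)
    return dates[0], dates[-1]


# Hypothetical sample in the assumed layout; real downloaded files may differ.
sample = io.StringIO(
    "Date,Open,High,Low,Close,Volume\n"
    "2000-01-04,10.0,11.0,9.5,10.5,1000\n"
    "2000-01-03,9.0,10.0,8.5,9.5,1500\n"
)
rows = read_rows(sample)
first, last = date_range(rows)
print(first, last)
```

For a real file, pass an open file handle instead of the sample, for example `with open("SHE_8/000001.csv", newline="") as fh: rows = read_rows(fh)`.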
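Now let's try downloading the trading data of every stock on an exchange, for example the London Stock Exchange (LON), into the directory LON_ALL, by running $ python run.py LON_ALL LON:
Entering the directory LON_ALL, we can see that data for 4663 stocks was downloaded from the London Stock Exchange (LON):
A brief note on the implementation: symbol holds the stock codes of each exchange, data.py does the actual downloading, stock.py iterates over the stocks of a given exchange, and run.py handles the command-line arguments. The code of stock.py is as follows: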
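Since every stock is saved as one .csv file, the number of downloaded stocks is simply the number of .csv files in the target directory. A small sketch of that count in Python 3 follows; `count_csv_files` is a hypothetical helper, and the demo runs against a temporary directory rather than a real download.

```python
import os
import tempfile


def count_csv_files(directory):
    """Count the entries in `directory` whose name ends with .csv."""
    return sum(1 for name in os.listdir(directory) if name.lower().endswith(".csv"))


# Demo on a throwaway directory with two .csv files and one unrelated file.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("AAA.csv", "BBB.csv", "notes.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    demo_count = count_csv_files(demo_dir)

print(demo_count)
```

Pointing it at the download directory, as in `count_csv_files("LON_ALL")`, would give the figure reported for that run.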
#stock.py
import json
import os
from data import DataReader

exchanges = {'SHE':'symbol/SHE.txt','AMEX':'symbol/AMEX.txt','NASDAQ':'symbol/NASDAQ.txt','NYSE':'symbol/NYSE.txt',
             'SHA':'symbol/SHA.txt','KSE':'symbol/KSE.txt','TPE':'symbol/TPE.txt','LON':'symbol/LON.txt',
             'SGX':'symbol/SGX.txt','TYO':'symbol/TYO.txt','TSE':'symbol/TSE.txt','CNSX':'symbol/CNSX.txt',
             'CVE':'symbol/CVE.txt','NZE':'symbol/NZE.txt','ASX':'symbol/ASX.txt','BVMF':'symbol/BVMF.txt',
             'HKG':'symbol/HKG.txt','NSE':'symbol/NSE.txt','BOM':'symbol/BOM.txt','STO':'symbol/STO.txt'}

def stock_data(path='stock_file',exchange='SHE',num = 10000):
    com_num = 1
    fh = open(exchanges[exchange])
    lines = fh.readlines()
    fh.close()
    if not os.path.isdir(path):
        os.mkdir(path)
        print '\nCreate a folder: '+path+'\n'
    for line in lines:
        if com_num > num:
            break
        sym = json.loads(line)[0]
        tmp = sym
        if exchange=='SHE':
            sym += '.SZ'
        elif exchange=='SHA':
            sym += '.SS'
        elif exchange=='TPE':
            sym += '.TW'
        elif exchange=='KSE':
            sym += '.KS'
        elif exchange=='LON':
            sym += '.L'
        elif exchange=='SGX':
            sym += '.SI'
        elif exchange=='NZE':
            sym += '.NZ'
        elif exchange=='ASX':
            sym += '.AX'
        elif exchange=='HKG':
            sym += '.HK'
        elif exchange=='BVMF':
            sym += '.SA'
        elif exchange=='STO':
            sym += '.ST'
        elif exchange=='TYO':
            pass
        elif exchange in ['TSE','CNSX','CVE']:
            sym += '.TO'
        elif exchange in ['NSE','BOM']:
            sym += '.BO'
        elif exchange in ['AMEX','NASDAQ','NYSE']:
            pass
        try:
            data = DataReader(sym, 'yahoo', start='5/20/1900').to_csv()
        except:
            print str(com_num)+': '+'\033[0;31mNot available\033[0m '+tmp+'.csv'+'\n'
            continue
        fh = open(path+'/'+tmp+'.csv','w')
        fh.write(data)
        fh.close()
        print str(com_num)+': \033[0;32mDownloaded\033[0m '+tmp+'.csv'+'\n'
        com_num += 1
    print '\033[0;33mCongratulations! Downloaded '+str(com_num-1)+' files!\033[0m\n'
To sum up, the single command that downloads all the stock data is:
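The core of stock.py is the mapping from an exchange code to the suffix appended to each ticker before the data is requested. As an alternative way to read that logic, the if/elif chain can be written as a lookup table. This is a sketch in Python 3 (the listing itself uses Python 2 print statements), not the project's actual code: the suffixes are copied from the listing, `SUFFIXES` and `yahoo_symbol` are hypothetical names, and the only thing assumed about a symbol-file line is what the listing relies on, namely that it is a JSON array whose first element is the ticker.

```python
import json

# Suffixes copied from the if/elif chain in stock.py.
# Exchanges whose tickers are used unchanged map to an empty string.
SUFFIXES = {
    'SHE': '.SZ', 'SHA': '.SS', 'TPE': '.TW', 'KSE': '.KS',
    'LON': '.L', 'SGX': '.SI', 'NZE': '.NZ', 'ASX': '.AX',
    'HKG': '.HK', 'BVMF': '.SA', 'STO': '.ST', 'TYO': '',
    'TSE': '.TO', 'CNSX': '.TO', 'CVE': '.TO',
    'NSE': '.BO', 'BOM': '.BO',
    'AMEX': '', 'NASDAQ': '', 'NYSE': '',
}


def yahoo_symbol(symbol_line, exchange):
    """Turn one symbol-file line into the suffixed symbol used for the download."""
    ticker = json.loads(symbol_line)[0]  # first element of the JSON array
    return ticker + SUFFIXES[exchange]


# Hypothetical symbol-file lines, for illustration only.
print(yahoo_symbol('["000001"]', 'SHE'))
print(yahoo_symbol('["AAPL"]', 'NASDAQ'))
```

A table like this keeps the suffix rules in one place, so supporting another exchange would mean adding one entry rather than another elif branch.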
$ python run.py folder_path trading_market_code
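The source of run.py is not shown in this article, so the following is only a guess at how such an entry point could parse the two arguments of the command above, written in Python 3 with argparse. The optional third argument `num` is an assumption based on the `num` parameter of `stock_data`, and `parse_args` is a hypothetical name.

```python
import argparse


def parse_args(argv):
    """Parse: folder_path exchange [num]."""
    parser = argparse.ArgumentParser(
        description="Download historical trading data for one exchange.")
    parser.add_argument("folder_path", help="directory that receives the .csv files")
    parser.add_argument("exchange", help="exchange code, e.g. SHE or LON")
    # Assumed optional limit; defaults to stock_data's own default of 10000.
    parser.add_argument("num", nargs="?", type=int, default=10000,
                        help="maximum number of stocks to download")
    return parser.parse_args(argv)


args = parse_args(["LON_ALL", "LON"])
print(args.folder_path, args.exchange, args.num)
```

In a real script, the parsed values would then be handed to `stock_data(args.folder_path, args.exchange, args.num)` from stock.py.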