1. Prerequisites
Linux system
Python (preferably Python 2)
JDK 1.8 or later
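The prerequisites above can be checked with a small script before going further. The following is a minimal sketch, not part of DataX: the function names are hypothetical, and the helper itself is assumed to run under Python 3.7 or later (DataX is still launched with python2). It only checks that `python2` and `java` are on the PATH and that the Java version banner reports 1.8 or later.

```python
import re
import shutil
import subprocess


def parse_java_version(banner):
    """Extract (major, minor) from the quoted version in `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return (major, minor)


def java_is_new_enough(banner):
    """True for 1.8 and later, including the newer scheme (9, 11, 17, ...)."""
    version = parse_java_version(banner)
    return version is not None and version >= (1, 8)


def check_prerequisites():
    """Return a dict describing which required tools were found on PATH."""
    report = {
        "python2": shutil.which("python2") is not None,
        "java": shutil.which("java") is not None,
        "java_ok": False,
    }
    if report["java"]:
        # `java -version` writes its banner to stderr, not stdout.
        result = subprocess.run(["java", "-version"],
                                capture_output=True, text=True)
        report["java_ok"] = java_is_new_enough(result.stderr + result.stdout)
    return report


if __name__ == "__main__":
    print(check_prerequisites())
```

If `python2` shows up as False, the installation step in the next section is still needed.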
2. Install Python 2
-- Update the package index
sudo apt update
-- Install Python 2
sudo apt install python2
-- Check the Python version
python2 --version
3. Download DataX
Download DataX on Linux:
wget http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
Extract the archive:
tar -zxvf datax.tar.gz
4. Add a DataX job (a DataX data migration task)
Source database (reader): SQL Server
Target database (writer): MongoDB
The contents of SqlServerToMongodb.json are as follows:
{
  "job": {
    "setting": {
      "speed": {
        "channel": 5
      }
    },
    "content": [
      {
        "reader": {
          "name": "sqlserverreader",
          "parameter": {
            "username": "your_username",
            "password": "your_password",
            "column": ["id", "version", "created", "modified", "code", "name"],
            "splitPk": "pk",
            "where": "",
            "connection": [
              {
                "table": ["EMPLOYEE"],
                "jdbcUrl": ["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=TEST"]
              }
            ]
          }
        },
        "writer": {
          "name": "mongodbwriter",
          "parameter": {
            "address": ["127.0.0.1:27017"],
            "userName": "datax",
            "userPassword": "datax",
            "dbName": "TEST",
            "collectionName": "employee",
            "column": [
              {"name": "id", "type": "string"},
              {"name": "version", "type": "int"},
              {"name": "created", "type": "date"},
              {"name": "modified", "type": "date"},
              {"name": "code", "type": "string"},
              {"name": "name", "type": "string"}
            ]
          }
        }
      }
    ]
  }
}
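When a table has many columns, keeping the reader's column list and the writer's column list in the same order by hand is error-prone. As far as I understand, DataX passes values from reader to writer by position, so a mismatch would write values under the wrong field names. The following is a minimal sketch that generates the job from a single list of (column name, MongoDB type) pairs. `build_job` and `COLUMNS` are hypothetical names, the credentials are placeholders, and the optional `splitPk` and `where` settings are left out for brevity.

```python
import json


def build_job(columns, table, jdbc_url, collection, channel=5):
    """Build a DataX job dict from (column name, MongoDB type) pairs.

    Both column lists come from the same input, so their order always matches.
    """
    reader = {
        "name": "sqlserverreader",
        "parameter": {
            "username": "your_username",  # placeholder, replace before use
            "password": "your_password",  # placeholder, replace before use
            "column": [name for name, _ in columns],
            "connection": [{"table": [table], "jdbcUrl": [jdbc_url]}],
        },
    }
    writer = {
        "name": "mongodbwriter",
        "parameter": {
            "address": ["127.0.0.1:27017"],
            "userName": "datax",
            "userPassword": "datax",
            "dbName": "TEST",
            "collectionName": collection,
            "column": [{"name": name, "type": mongo_type}
                       for name, mongo_type in columns],
        },
    }
    return {
        "job": {
            "setting": {"speed": {"channel": channel}},
            "content": [{"reader": reader, "writer": writer}],
        }
    }


COLUMNS = [("id", "string"), ("version", "int"), ("created", "date"),
           ("modified", "date"), ("code", "string"), ("name", "string")]

job = build_job(COLUMNS, "EMPLOYEE",
                "jdbc:sqlserver://127.0.0.1:1433;DatabaseName=TEST",
                "employee")

if __name__ == "__main__":
    # Redirect the output into job/SqlServerToMongodb.json to use it.
    print(json.dumps(job, indent=2, ensure_ascii=False))
```

Generating the file also guarantees syntactically valid JSON, which is easy to break when editing a long single-line job definition by hand.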
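Field | Description
channel | Number of concurrent DataX channels (how many threads the job is split across)
For the other parameters, see the references below.
This article uses SQL Server to MongoDB as its example; for other databases, see the corresponding reader and writer documentation in the GitHub repository.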
References:
GitHub - alibaba/DataX: DataX is the open-source version of Alibaba Cloud DataWorks Data Integration.
https://github.com/alibaba/DataX/blob/master/sqlserverreader/doc/sqlserverreader.md
https://github.com/alibaba/DataX/blob/master/mongodbwriter/doc/mongodbwriter.md
5. Start the job
Change into the bin directory.
Run the command (adjust the paths to wherever you placed DataX and the job file):
python2 datax.py ../job/SqlServerToMongodb.json
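For scheduled or repeated runs, the same command can be wrapped in a short script so that it does not depend on the current directory. This is a minimal sketch with hypothetical helper names (`build_command`, `run_job`). It assumes DataX was extracted to a directory containing `bin/datax.py`, and that `datax.py` returns a nonzero exit code when the job fails, which is worth verifying on your installation.

```python
import subprocess
from pathlib import Path


def build_command(datax_home, job_file, python_bin="python2"):
    """Return the argument list for `python2 <datax_home>/bin/datax.py <job_file>`."""
    launcher = Path(datax_home) / "bin" / "datax.py"
    return [python_bin, str(launcher), str(Path(job_file))]


def run_job(datax_home, job_file):
    """Run the job, streaming DataX output to the console, and return the exit code."""
    completed = subprocess.run(build_command(datax_home, job_file))
    return completed.returncode
```

For example, `run_job("/opt/datax", "/opt/datax/job/SqlServerToMongodb.json")` would launch the job from this article if DataX were extracted to `/opt/datax`; that path is only an illustration.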