Download DataX
https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202308/datax.tar.gz
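After downloading, extract the archive to a local directory, change into its bin directory, and run a sync job:
$ cd {YOUR_DATAX_HOME}/bin
$ python datax.py {YOUR_JOB.json}
Requirements: Python and JDK 1.8. Maven 3 is only needed if you build DataX from source instead of using the prebuilt package.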
Step 1: Create the job configuration file (JSON format)
Template:
#stream2stream.json
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {
                        "sliceRecordCount": 10,
                        "column": [
                            {
                                "type": "long",
                                "value": "10"
                            },
                            {
                                "type": "string",
                                "value": "hello,你好,世界-DataX"
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {
                        "encoding": "UTF-8",
                        "print": true
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": 5
            }
        }
    }
}
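Before handing a job file to DataX, it can help to confirm that it has the basic shape of the template: a top-level "job" object, a non-empty "content" list, and a named reader and writer in each entry. The sketch below is only a quick sanity check written for this post; the function name check_job is made up here, and it does not reproduce whatever validation DataX performs itself.

```python
import json


def check_job(job):
    """Return a list of problems found; an empty list means the basic shape looks fine.

    This only checks the structure used by the stream2stream template,
    not plugin-specific parameters.
    """
    body = job.get("job")
    if not isinstance(body, dict):
        return ["top-level 'job' object is missing"]

    problems = []
    content = body.get("content")
    if not isinstance(content, list) or not content:
        problems.append("'job.content' must be a non-empty list")
        content = []

    for i, item in enumerate(content):
        for side in ("reader", "writer"):
            plugin = item.get(side) if isinstance(item, dict) else None
            if not isinstance(plugin, dict) or not plugin.get("name"):
                problems.append("content[%d].%s.name is missing" % (i, side))
    return problems


def check_job_file(path):
    """Load a job file as UTF-8 JSON and run the same checks on it."""
    with open(path, encoding="utf-8") as fh:
        return check_job(json.load(fh))
```

Calling check_job_file("stream2stream.json") on the template should return an empty list; a file that fails to parse raises the usual json error before any check runs.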
Start the job
$ cd {YOUR_DATAX_DIR_BIN}
$ python datax.py ./stream2stream.json
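If you want to launch the same command from a script (for example from a scheduler), a thin wrapper around subprocess is enough. This is a minimal sketch: build_command and run_job are names invented for this post, and datax_home is assumed to be the directory you extracted the package into, so that bin/datax.py sits underneath it.

```python
import os
import subprocess
import sys


def build_command(datax_home, job_file, python_exe=sys.executable):
    """Build the argument list equivalent to: python bin/datax.py job_file."""
    launcher = os.path.join(datax_home, "bin", "datax.py")
    return [python_exe, launcher, job_file]


def run_job(datax_home, job_file):
    """Run the job and return the launcher's exit code unchanged.

    Treating a non-zero code as failure is an assumption of this sketch.
    """
    completed = subprocess.run(build_command(datax_home, job_file))
    return completed.returncode
```

Passing the arguments as a list avoids shell quoting problems when the DataX directory or the job path contains spaces.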
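For any other reader or writer, find its directory in the file list of the GitHub repository,
open the resources folder inside it, and start from the JSON template provided there.
If you cannot reach GitHub, that is fine too: the plugin folder inside the package you downloaded contains the same templates.
It really is that simple.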
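To see which templates your local copy ships with, you can search the extracted directory. The sketch below assumes each plugin keeps its template in a file named plugin_job_template.json inside its own folder; that file name and the function find_templates are assumptions of this example, so adjust the filename argument if your package is laid out differently.

```python
from pathlib import Path


def find_templates(datax_home, filename="plugin_job_template.json"):
    """Map plugin folder name -> template path for every matching file found.

    The default filename is an assumption about the package layout.
    """
    root = Path(datax_home)
    return {path.parent.name: path for path in sorted(root.rglob(filename))}
```

For example, find_templates("{YOUR_DATAX_HOME}") would return a dictionary keyed by names such as mysqlreader, provided files with that name exist under the directory.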
Example
MySQL read and write example. The preSql value below is a placeholder; replace it with a real SQL statement or remove the key before running the job.
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "123123",
                        "column": ["*"],
                        "splitPk": "ID",
                        "where": "ID <= 1888",
                        "connection": [
                            {
                                "jdbcUrl": ["jdbc:mysql://192.168.1.1:3306/xxx?useUnicode=true&characterEncoding=utf8"],
                                "table": ["t_member"]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "column": ["*"],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:mysql://192.168.1.2:3306/xxx?useUnicode=true&characterEncoding=utf8",
                                "table": ["t_xxx"]
                            }
                        ],
                        "password": "123123",
                        "preSql": ["SQL to run before the write starts, for example a statement that clears the target table"],
                        "session": ["set session sql_mode='ANSI'"],
                        "username": "root",
                        "writeMode": "insert"
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": "5"
            }
        }
    }
}
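If you copy jobs like this between tables often, generating the JSON from a few parameters keeps the structure consistent and keeps the password out of the file you check in. The following is a rough sketch, not an official helper: the function name mysql_to_mysql_job is invented here, it reuses one username and password for both sides as the example does, and it only covers the keys shown in the example.

```python
import json


def mysql_to_mysql_job(src_url, src_table, dst_url, dst_table,
                       username, password, channel=5,
                       where=None, split_pk=None, pre_sql=None):
    """Build a mysqlreader -> mysqlwriter job dict shaped like the example."""
    reader_param = {
        "username": username,
        "password": password,
        "column": ["*"],
        # In the example the reader's jdbcUrl is a list ...
        "connection": [{"jdbcUrl": [src_url], "table": [src_table]}],
    }
    if where:
        reader_param["where"] = where
    if split_pk:
        reader_param["splitPk"] = split_pk

    writer_param = {
        "username": username,
        "password": password,
        "column": ["*"],
        "writeMode": "insert",
        # ... while the writer's jdbcUrl is a plain string.
        "connection": [{"jdbcUrl": dst_url, "table": [dst_table]}],
    }
    if pre_sql:
        writer_param["preSql"] = list(pre_sql)

    return {
        "job": {
            "content": [{
                "reader": {"name": "mysqlreader", "parameter": reader_param},
                "writer": {"name": "mysqlwriter", "parameter": writer_param},
            }],
            "setting": {"speed": {"channel": channel}},
        }
    }


def write_job(path, job):
    """Save the job as UTF-8 JSON so datax.py can read it."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(job, fh, ensure_ascii=False, indent=4)
```

A typical call would read the password from an environment variable of your choosing, build the job with mysql_to_mysql_job(...), and pass the result to write_job("mysql2mysql.json", job) before starting DataX on that file.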