1. Deploy the MinIO environment
docker pull minio/minio
Mount mapping between host and container
Host path | Container path |
---|---|
/data/minio/data | /data |
/data/minio/config | /root/.minio |
Start the container:
docker run -p 9000:9000 -p 9090:9090 --name minio \
-d --restart=always \
-e "MINIO_ACCESS_KEY=admin" \
-e "MINIO_SECRET_KEY=admin123456" \
-v /data/minio/data:/data \
-v /data/minio/config:/root/.minio \
minio/minio \
server /data --console-address ":9090"
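Before moving on, it can help to confirm that the container is actually serving requests. The sketch below probes MinIO's liveness endpoint (`/minio/health/live`) on the API port mapped above. The helper names `minio_health_url` and `minio_is_live`, and the default host `localhost`, are assumptions made for illustration and are not part of the original setup.

```python
import urllib.error
import urllib.request


def minio_health_url(host: str = "localhost", port: int = 9000) -> str:
    """Build the liveness URL for a MinIO server (host and port are assumptions)."""
    return f"http://{host}:{port}/minio/health/live"


def minio_is_live(url: str, timeout: float = 3.0) -> bool:
    """Return True if the server answers the liveness probe with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP status.
        return False


# Example usage (run on the Docker host once the container is up):
#   print(minio_is_live(minio_health_url()))
```

If the probe fails, `docker logs minio` is usually the quickest way to see why the server did not start.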
2. Prepare the StarRocks environment
Follow the StarRocks documentation on Docker deployment: "Deploy StarRocks with Docker" (deploy_with_docker, StarRocks Docs).
3. Hands-on: querying files on MinIO and full-database backup
Generate a Parquet file with Python:
xiuchenggong@xiuchengdeMacBook-Pro ~ % python3
Python 3.9.10 (main, Jan 15 2022, 11:48:04)
[Clang 13.0.0 (clang-1300.0.29.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pf = pd.read_csv("/Users/xiuchenggong/test.csv")
>>> pf.to_parquet("/Users/xiuchenggong/test.parquet",engine="pyarrow")
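The contents of test.csv are not shown here. Judging from the query result in section 3.1, it presumably holds two columns, `name` and `id`. The sketch below rebuilds such a file under that assumption and converts it the same way as the session above. The function names `sample_csv_text` and `csv_to_parquet` are hypothetical, and the conversion step needs `pandas` and `pyarrow` installed.

```python
import csv
import io


def sample_csv_text() -> str:
    """Return CSV text with the two rows seen in the section 3.1 query result."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["name", "id"])  # assumed header, inferred from the table schema
    writer.writerows([("gongxiucheng", 1), ("gongzixi", 2)])
    return buf.getvalue()


def csv_to_parquet(csv_text: str, parquet_path: str) -> None:
    """Parse CSV text and write it as Parquet using the pyarrow engine."""
    import pandas as pd  # imported lazily so sample_csv_text works without pandas

    frame = pd.read_csv(io.StringIO(csv_text))
    frame.to_parquet(parquet_path, engine="pyarrow")


# Example usage:
#   csv_to_parquet(sample_csv_text(), "test.parquet")
```

The resulting file still has to be uploaded to the `starrocks` bucket (for example through the MinIO console on port 9090) before it can be queried.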
3.1 Query Parquet data stored on MinIO (Parquet and ORC formats are supported):
StarRocks > CREATE EXTERNAL TABLE table_1
    -> (
    ->     name string,
    ->     id int
    -> )
    -> ENGINE=file
    -> PROPERTIES
    -> (
    ->     "path" = "s3a://starrocks/test.parquet",
    ->     "format" = "parquet",
    ->     "aws.s3.enable_ssl" = "false",
    ->     "aws.s3.enable_path_style_access" = "true",
    ->     "aws.s3.endpoint" = "172.17.0.3:9000",
    ->     "aws.s3.access_key" = "0OnU8H9YwTNTJUBC2r7F",
    ->     "aws.s3.secret_key" = "vFQ3fIcs90woUS4200L0BYfxelE86iF6cI4vVzYC"
    -> );
Query OK, 0 rows affected (0.009 sec)
StarRocks > show tables;
+-------------------+
| Tables_in_test_db |
+-------------------+
| table_1           |
| test1             |
| test2             |
+-------------------+
3 rows in set (0.003 sec)
StarRocks > select * from table_1;
+--------------+------+
| name         | id   |
+--------------+------+
| gongxiucheng |    1 |
| gongzixi     |    2 |
+--------------+------+
2 rows in set (0.073 sec)
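The same S3 properties are needed for every file external table, so it can be convenient to keep them in one place and render the statement from them. The following is a minimal sketch under that assumption. `render_properties` and `file_table_ddl` are hypothetical helpers, not StarRocks APIs, and they do no quoting or escaping of the values passed in.

```python
from __future__ import annotations


def render_properties(props: dict[str, str]) -> str:
    """Render a dict as a PROPERTIES (...) clause, one key per line."""
    lines = [f'    "{key}" = "{value}"' for key, value in props.items()]
    return "PROPERTIES\n(\n" + ",\n".join(lines) + "\n)"


def file_table_ddl(table: str, columns: dict[str, str], props: dict[str, str]) -> str:
    """Build a CREATE EXTERNAL TABLE ... ENGINE=file statement as text."""
    cols = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns.items())
    return (
        f"CREATE EXTERNAL TABLE {table}\n(\n{cols}\n)\n"
        f"ENGINE=file\n{render_properties(props)};"
    )


# Example usage, mirroring the statement above (keys would come from a
# secret store or environment variables rather than being hard-coded):
#   print(file_table_ddl(
#       "table_1",
#       {"name": "string", "id": "int"},
#       {"path": "s3a://starrocks/test.parquet", "format": "parquet"},
#   ))
```

Generating the text this way keeps the access key and secret key out of saved SQL scripts, which is worth considering because the keys above appear in plain text.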
3.2 Full backup to MinIO (external tables cannot be backed up)
Create a repository:
StarRocks > create repository starrocks_backup_01
    -> with broker
    -> on location "s3a://starrocks"
    -> properties(
    ->     "aws.s3.enable_ssl" = "false",
    ->     "aws.s3.enable_path_style_access" = "true",
    ->     "aws.s3.access_key" = "0OnU8H9YwTNTJUBC2r7F",
    ->     "aws.s3.secret_key" = "vFQ3fIcs90woUS4200L0BYfxelE86iF6cI4vVzYC",
    ->     "aws.s3.endpoint" = "172.17.0.3:9000"
    -> )
    -> ;
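Start the backup. The external table table_1 is dropped first because external tables cannot be backed up: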
StarRocks > drop table table_1;
Query OK, 0 rows affected (0.010 sec)
StarRocks > backup snapshot test_db.snapshot_minio to starrocks_backup_01 properties("type"="full");
Query OK, 0 rows affected (0.024 sec)
StarRocks > show backup\G;
*************************** 1. row ***************************
               JobId: 11047
        SnapshotName: snapshot_minio
              DbName: test_db
               State: SAVE_META
          BackupObjs: [test_db.test1], [test_db.test2]
          CreateTime: 2023-09-05 01:58:42
SnapshotFinishedTime: 2023-09-05 01:58:48
  UploadFinishedTime: 2023-09-05 01:58:54
        FinishedTime: NULL
     UnfinishedTasks:
            Progress:
          TaskErrMsg:
              Status: [OK]
             Timeout: 86400
1 row in set (0.003 sec)
ERROR: No query specified
StarRocks > show backup\G;
*************************** 1. row ***************************
               JobId: 11047
        SnapshotName: snapshot_minio
              DbName: test_db
               State: FINISHED
          BackupObjs: [test_db.test1], [test_db.test2]
          CreateTime: 2023-09-05 01:58:42
SnapshotFinishedTime: 2023-09-05 01:58:48
  UploadFinishedTime: 2023-09-05 01:58:54
        FinishedTime: 2023-09-05 01:59:00
     UnfinishedTasks:
            Progress:
          TaskErrMsg:
              Status: [OK]
             Timeout: 86400
1 row in set (0.004 sec)
ERROR: No query specified
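The backup runs asynchronously, which is why `show backup` is issued twice above until State changes from SAVE_META to FINISHED. That manual re-check can be sketched as a small polling loop. `wait_for_backup` and its `get_state` callback are hypothetical names. The callback is assumed to run `SHOW BACKUP` through whatever MySQL-protocol client is available and return the State column. Treating CANCELLED as the failure state is likewise an assumption to verify against the StarRocks documentation.

```python
import time
from typing import Callable

# FINISHED appears in the output above; CANCELLED is an assumed failure state.
TERMINAL_STATES = {"FINISHED", "CANCELLED"}


def wait_for_backup(
    get_state: Callable[[], str],
    interval: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until the backup reaches a terminal state."""
    for _ in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        sleep(interval)  # still in an intermediate state such as SAVE_META
    raise TimeoutError(f"backup did not finish after {max_polls} polls")


# Example usage with a stand-in for the real SHOW BACKUP query:
#   states = iter(["SAVE_META", "FINISHED"])
#   print(wait_for_backup(lambda: next(states), interval=0))
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real waiting.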
Checking the files on MinIO confirms that the backup succeeded.