09-03 Tuesday: Ansible deployment and node management

Time | Version | Modified by | Description
2024-09-03 10:08:58 | V0.1 | 宋全恒 | Created the document

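Introduction

First, find a jump host that can reach every machine. We then build the environment around Ansible so that a command can be run on all nodes in one step. The main tasks are to mount the NAS server on all 10 nodes and to add our Harbor registry to each of them.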

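About Ansible

Ansible (ansible/ansible at v2.17.3) is an automation tool that manages multiple nodes and handles tasks such as command execution, automatic mounting, and file copying. It is well suited to cluster management.

Commonly used modules are shown below:

[Figure: commonly used Ansible modules (image-20240903153745533)]

Information on the 10 GPU nodes

[Figure: GPU information for the 10 nodes (image-20240903143640134)]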

Batch execution

for ip in $(seq 64 73); do ssh root@10.107.204.$ip "systemctl restart docker"; done
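
The loop above gives no summary of which nodes failed. A minimal sketch of the same idea with per-host status reporting follows. The function name run_on_nodes, the DRY_RUN switch, and the 5-second connect timeout are illustrative assumptions, not part of the original notes. With DRY_RUN=1 (the default in this sketch) it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: run one command on a contiguous range of hosts.
# Arguments: <base IP> <first octet> <last octet> <command>
run_on_nodes() {
    base=$1; start=$2; end=$3; cmd=$4
    failed=0
    for i in $(seq "$start" "$end"); do
        host="$base.$i"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run: show the command instead of executing it.
            echo "ssh root@$host \"$cmd\""
        elif ssh -o ConnectTimeout=5 "root@$host" "$cmd"; then
            echo "ok: $host"
        else
            echo "FAILED: $host" >&2
            failed=$((failed + 1))
        fi
    done
    # The exit status is non-zero if any host failed.
    [ "$failed" -eq 0 ]
}

# Dry run over the ten GPU nodes used in this article.
run_on_nodes 10.107.204 64 73 "systemctl restart docker"
```

Setting DRY_RUN=0 in the environment would make the sketch execute the command over SSH for real.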

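Result

After setup, a conda virtual environment named ansible was created on server 42 under the yuzailiang user. Activating that environment allows batch operations on the GPU nodes.

Deployment steps

Create the conda environment and install Ansible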
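
The commands for this step are not recorded in the notes. A minimal sketch of what it might look like follows, assuming conda is already on PATH, that the environment is named ansible as stated above, and that Python 3.10 is acceptable on the control node. The function name create_ansible_env and the version pins are assumptions. The community ansible package is used here rather than ansible-core alone because it also bundles collections such as ansible.posix, which provides the mount module.

```shell
#!/bin/sh
# Hypothetical helper; nothing is installed until the function is called.
create_ansible_env() {
    # Assumed environment name and Python version.
    conda create -y -n ansible python=3.10 || return 1
    # "conda run" avoids needing "conda activate" in a non-interactive shell.
    # Ansible 10.x is the community release built on ansible-core 2.17.
    conda run -n ansible python -m pip install "ansible>=10,<11" || return 1
    conda run -n ansible ansible --version
}

# Usage (interactive shell):
#   create_ansible_env
#   conda activate ansible
```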

(ansible) yuzailiang@ubuntu:~$ cat update_harbor.yml 
---
- name: Update Docker daemon configuration and ensure valid JSON
  hosts: gpus
  become: yes
  tasks:
    - name: Install Python if not installed
      ansible.builtin.package:
        name: python3
        state: present

    - name: Ensure /etc/docker/daemon.json exists
      ansible.builtin.file:
        path: /etc/docker/daemon.json
        state: touch

    - name: Read existing daemon.json
      ansible.builtin.slurp:
        path: /etc/docker/daemon.json
      register: daemon_json_content

    - name: Decode JSON
      ansible.builtin.set_fact:
        daemon_json: "{{ daemon_json_content['content'] | b64decode | from_json }}"

    - name: Ensure insecure-registries contains the new registry
      ansible.builtin.set_fact:
        updated_daemon_json: >-
          {{
            daemon_json | combine({
              'insecure-registries': (daemon_json['insecure-registries'] | default([])) + ['10.200.88.53']
            })
          }}

    - name: Write updated daemon.json
      ansible.builtin.copy:
        dest: /etc/docker/daemon.json
        content: "{{ updated_daemon_json | to_nice_json }}"
        backup: yes
        mode: '0644'

    - name: Validate JSON syntax
      ansible.builtin.command:
        cmd: 'python3 -m json.tool /etc/docker/daemon.json'
      register: validation_result
      failed_when: validation_result.rc != 0
      ignore_errors: yes

    - name: Print validation result
      ansible.builtin.debug:
        msg: "JSON validation result: {{ validation_result.stdout }}"

    - name: Restart Docker service
      ansible.builtin.service:
        name: docker
        state: restarted

    - name: Log in to Docker registry
      ansible.builtin.command:
        cmd: docker login 10.200.88.53 --username dros_admin --password 'Dros@zjgxn&07101604'
      ignore_errors: yes

Configure Ansible

Create the inventory

[operator]
10.107.204.64

[framework]
10.107.204.65

[model]
10.107.204.66
10.107.204.67
10.107.204.68
10.107.204.69

[compile]
10.107.204.70

[abstract]
10.107.204.71

[communication]
10.107.204.72
10.107.204.73

# New group that includes all the groups
[gpus:children]
operator
framework
model
compile
abstract
communication

We can go a step further and give these IPs aliases to make them easier to work with.

(ansible) yuzailiang@ubuntu:~$ sudo vim /etc/ansible/hosts
[operator]
10.107.204.64

[framework]
10.107.204.65

[model]
10.107.204.66
10.107.204.67
10.107.204.68
10.107.204.69

[compile]
10.107.204.70

[hardware]
10.107.204.71

[communication]
10.107.204.72
10.107.204.73

# New group that includes all the groups
[gpus:children]
operator
framework
model
compile
hardware
communication

# Aliases for all nodes
[gpus]
gpu1 ansible_host=10.107.204.64
gpu2 ansible_host=10.107.204.65
gpu3 ansible_host=10.107.204.66
gpu4 ansible_host=10.107.204.67
gpu5 ansible_host=10.107.204.68
gpu6 ansible_host=10.107.204.69
gpu7 ansible_host=10.107.204.70
gpu8 ansible_host=10.107.204.71
gpu9 ansible_host=10.107.204.72
gpu10 ansible_host=10.107.204.73
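
The alias lines follow a regular pattern, so they can be generated rather than typed by hand. A minimal sketch follows; gen_aliases is a hypothetical helper name, and the gpuN numbering starting at 1 simply mirrors the listing above.

```shell
#!/bin/sh
# Hypothetical helper: print "gpuN ansible_host=<ip>" lines for a
# contiguous IP range, numbering the aliases from 1.
# Arguments: <base IP> <first octet> <last octet>
gen_aliases() {
    base=$1; start=$2; end=$3
    n=1
    for i in $(seq "$start" "$end"); do
        echo "gpu$n ansible_host=$base.$i"
        n=$((n + 1))
    done
}

# Print the ten alias lines for 10.107.204.64 through 10.107.204.73.
gen_aliases 10.107.204 64 73
```

The output can be reviewed and then pasted under the [gpus] header in the inventory file.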

Copy the public key for passwordless login

(ansible) yuzailiang@ubuntu:~/Shell$ bash copy_pub.sh 
Copying public key to root@10.107.204.64...
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/yuzailiang/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh -o 'StrictHostKeyChecking=no' 'root@10.107.204.64'"
and check to make sure that only the key(s) you wanted were added.

Successfully copied public key to 10.107.204.64
Copying public key to root@10.107.204.65...
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/yuzailiang/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh -o 'StrictHostKeyChecking=no' 'root@10.107.204.65'"
and check to make sure that only the key(s) you wanted were added.

Successfully copied public key to 10.107.204.65
Copying public key to root@10.107.204.66...
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/yuzailiang/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/usr/bin/ssh-copy-id: WARNING: All keys were skipped because they already exist on the remote system.
(if you think this is a mistake, you may want to use -f option)

Successfully copied public key to 10.107.204.66
Copying public key to root@10.107.204.67...
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/yuzailiang/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh -o 'StrictHostKeyChecking=no' 'root@10.107.204.67'"
and check to make sure that only the key(s) you wanted were added.

Successfully copied public key to 10.107.204.67
Copying public key to root@10.107.204.68...
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/yuzailiang/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh -o 'StrictHostKeyChecking=no' 'root@10.107.204.68'"
and check to make sure that only the key(s) you wanted were added.

Successfully copied public key to 10.107.204.68
Copying public key to root@10.107.204.69...
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/yuzailiang/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh -o 'StrictHostKeyChecking=no' 'root@10.107.204.69'"
and check to make sure that only the key(s) you wanted were added.

Successfully copied public key to 10.107.204.69
Copying public key to root@10.107.204.70...
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/yuzailiang/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh -o 'StrictHostKeyChecking=no' 'root@10.107.204.70'"
and check to make sure that only the key(s) you wanted were added.

Successfully copied public key to 10.107.204.70
Copying public key to root@10.107.204.71...
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/yuzailiang/.ssh/id_rsa.pub"

The script can be improved further so that it is reusable:

(ansible) yuzailiang@ubuntu:~/Shell$ cat copy_pub.sh 
#!/bin/bash

# Check arguments
if [ $# -ne 3 ]; then
    echo "Usage: $0 <base IP> <start IP> <end IP>"
    echo "Example: $0 10.107.204 72 73"
    exit 1
fi

# Read arguments
BASE_IP="$1."
START_IP=$2
END_IP=$3

# SSH user
USER="root"

# SSH password
PASSWORD="qsgctys@05980"

# Public key path
PUB_KEY_PATH="$HOME/.ssh/id_rsa.pub"

# Check that sshpass is installed
if ! command -v sshpass &> /dev/null; then
    echo "sshpass is not installed; please install it first."
    exit 1
fi

# Check that the public key exists
if [ ! -f "$PUB_KEY_PATH" ]; then
    echo "SSH public key not found; generate one or specify the correct path."
    exit 1
fi

# Loop over the IP range and copy the public key
for i in $(seq $START_IP $END_IP); do
    FULL_IP="$BASE_IP$i"
    echo "Copying public key to $USER@$FULL_IP..."

    # Use sshpass to supply the password and copy the public key
    sshpass -p "$PASSWORD" ssh-copy-id -i "$PUB_KEY_PATH" -o StrictHostKeyChecking=no "$USER@$FULL_IP"

    if [ $? -eq 0 ]; then
        echo "Successfully copied public key to $FULL_IP"
    else
        echo "Could not connect to $FULL_IP, skipping..."
    fi
done

echo "All operations complete."
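
The script above stores the root password in the file and does not check that the range arguments are numeric. A minimal sketch of one way to address both follows. The helper names valid_octet and copy_key are hypothetical; sshpass -e, which reads the password from the SSHPASS environment variable, is a standard sshpass option.

```shell
#!/bin/sh
# Hypothetical variant: validate the range arguments and take the password
# from the SSHPASS environment variable instead of hard-coding it.

# Succeeds only if $1 consists of digits and is at most 255.
valid_octet() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -le 255 ]
}

# Copy the public key to one host; "sshpass -e" reads the password
# from $SSHPASS, so it never appears in the script or in the process list.
copy_key() {
    sshpass -e ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" \
        -o StrictHostKeyChecking=no "root@$1"
}

# Usage:
#   export SSHPASS='...'
#   valid_octet 72 && copy_key 10.107.204.72
```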

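Configure the remote user in /etc/ansible/ansible.cfg

The local user is yuzailiang, while the user on the remote machines is root, so the private key has to be associated with the remote user. The configuration is: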

(ansible) yuzailiang@ubuntu:~/Shell$ sudo cat /etc/ansible/ansible.cfg 
[defaults]
remote_user = root
private_key_file = ~/.ssh/id_rsa
interpreter_python = auto

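The last line, interpreter_python = auto, was added with the aim of suppressing the interpreter-discovery warning. Note, however, that auto is the default mode and still prints the warning, as the playbook output in this article shows; the auto_silent value is the one that suppresses it.

To use the Ansible environment, therefore, log in to server 42 as the yuzailiang user and activate the ansible environment; the node groups can then be operated on directly.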
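
On the interpreter_python setting: auto is Ansible's default discovery mode and still prints the discovery warning, while auto_silent is the documented variant that does not. A minimal sketch of a project-local ansible.cfg using it follows. The helper name write_ansible_cfg and the scratch directory are assumptions; the inventory path, user, and key path are the ones used in this article.

```shell
#!/bin/sh
# Hypothetical helper: write a project-local ansible.cfg into directory $1.
# Ansible picks up ./ansible.cfg when commands are run from that directory.
write_ansible_cfg() {
    cat > "$1/ansible.cfg" <<'EOF'
[defaults]
inventory = /etc/ansible/hosts
remote_user = root
private_key_file = ~/.ssh/id_rsa
interpreter_python = auto_silent
EOF
}

# Example: write the file into a scratch directory and show it.
demo_dir=$(mktemp -d)
write_ansible_cfg "$demo_dir"
cat "$demo_dir/ansible.cfg"
```

Keeping the file next to the playbooks avoids needing sudo to edit /etc/ansible/ansible.cfg.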

Edit /etc/hosts with a playbook

Create the playbook file

(ansible) yuzailiang@ubuntu:~$ cat update_hosts.yml 
---
- name: Ensure /etc/hosts contains NAS entry
  hosts: gpus  # target group name
  become: yes  # escalate privileges to edit /etc/hosts
  tasks:
    - name: Check if /etc/hosts contains NAS entry
      ansible.builtin.lineinfile:
        path: /etc/hosts
        line: "10.15.35.70 NAS"
        state: present
        backup: yes  # optional: back up the file
      tags: hosts
(ansible) yuzailiang@ubuntu:~$ ansible-playbook  update_hosts.yml -l model

PLAY [Ensure /etc/hosts contains NAS entry] ************************************************************

TASK [Gathering Facts] ************************************************************
[WARNING]: Platform linux on host 10.107.204.67 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.67]
[WARNING]: Platform linux on host 10.107.204.69 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.69]
[WARNING]: Platform linux on host 10.107.204.68 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.68]
[WARNING]: Platform linux on host 10.107.204.66 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.66]

TASK [Check if /etc/hosts contains NAS entry] ************************************************************
changed: [10.107.204.67]
changed: [10.107.204.66]
changed: [10.107.204.68]
changed: [10.107.204.69]

PLAY RECAP ************************************************************
10.107.204.66              : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.67              : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.68              : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.69              : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
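
The changed=1 in the recap reflects that the line was missing and had to be added; a second run would report ok instead. A minimal local sketch of that "add only if absent" behaviour follows. ensure_line is a hypothetical helper that only approximates lineinfile with state=present, and the demo works on a scratch file rather than the real /etc/hosts.

```shell
#!/bin/sh
# Hypothetical local analogue of lineinfile with state=present:
# append the line only if no identical line already exists.
ensure_line() {
    file=$1; line=$2
    if grep -qxF -- "$line" "$file" 2>/dev/null; then
        echo "ok"
    else
        printf '%s\n' "$line" >> "$file"
        echo "changed"
    fi
}

# Demo on a scratch file.
hosts_copy=$(mktemp)
ensure_line "$hosts_copy" "10.15.35.70 NAS"   # prints: changed
ensure_line "$hosts_copy" "10.15.35.70 NAS"   # prints: ok
```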

Mount the NAS

Create the playbook ensure_mounts.yml

(ansible) yuzailiang@ubuntu:~$ cat ensure_mounts.yml 
---
- name: Ensure directories and mounts are configured
  hosts: all  # or a specific group, such as 'gpus'
  become: yes  # escalate privileges to create directories, edit /etc/fstab, and mount
  tasks:
    - name: Ensure directories exist
      ansible.builtin.file:
        path: "{{ item }}"
        state: directory
        mode: '0755'
      loop:
        - /mnt/nas_v1
        - /mnt/nas_v2
        - /mnt/self-define

    - name: Ensure fstab contains necessary entries
      ansible.builtin.lineinfile:
        path: /etc/fstab
        line: "{{ item }}"
        state: present
        backup: yes  # optional: back up the file
      loop:
        - "nas:/volume1/1 /mnt/nas_v1 nfs defaults 0 0"
        - "nas:/volume1/1/self-define /mnt/self-define nfs defaults 0 0"
        - "nas:/volume2/2 /mnt/nas_v2 nfs defaults 0 0"

    - name: Ensure all filesystems are mounted
      ansible.builtin.mount:
        path: "{{ item.path }}"
        src: "{{ item.src }}"
        fstype: "{{ item.fstype }}"
        opts: "{{ item.opts }}"
        state: mounted
      loop:
        - { path: "/mnt/nas_v1", src: "nas:/volume1/1", fstype: "nfs", opts: "defaults" }
        - { path: "/mnt/self-define", src: "nas:/volume1/1/self-define", fstype: "nfs", opts: "defaults" }
        - { path: "/mnt/nas_v2", src: "nas:/volume2/2", fstype: "nfs", opts: "defaults" }

Run the playbook

Run the playbook above to create the directories, update /etc/fstab, and perform the mounts.

(ansible) yuzailiang@ubuntu:~$ ansible-playbook ensure_mounts.yml -l gpus

PLAY [Ensure directories and mounts are configured] ************************************************************

TASK [Gathering Facts] ************************************************************
[WARNING]: Platform linux on host 10.107.204.67 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.67]
[WARNING]: Platform linux on host 10.107.204.68 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.68]
[WARNING]: Platform linux on host 10.107.204.64 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.64]
[WARNING]: Platform linux on host 10.107.204.65 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.65]
[WARNING]: Platform linux on host 10.107.204.66 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.66]
[WARNING]: Platform linux on host 10.107.204.69 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.69]
[WARNING]: Platform linux on host 10.107.204.72 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.72]
[WARNING]: Platform linux on host 10.107.204.70 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.70]
[WARNING]: Platform linux on host 10.107.204.73 is using the discovered Python interpreter at /usr/bin/python3.8, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [10.107.204.73]
fatal: [10.107.204.71]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 10.107.204.71 port 22: Connection timed out", "unreachable": true}

TASK [Ensure directories exist] ************************************************************
ok: [10.107.204.68] => (item=/mnt/nas_v1)
ok: [10.107.204.65] => (item=/mnt/nas_v1)
changed: [10.107.204.66] => (item=/mnt/nas_v1)
ok: [10.107.204.64] => (item=/mnt/nas_v1)
ok: [10.107.204.67] => (item=/mnt/nas_v1)
ok: [10.107.204.68] => (item=/mnt/nas_v2)
ok: [10.107.204.65] => (item=/mnt/nas_v2)
changed: [10.107.204.66] => (item=/mnt/nas_v2)
ok: [10.107.204.67] => (item=/mnt/nas_v2)
ok: [10.107.204.64] => (item=/mnt/nas_v2)
ok: [10.107.204.68] => (item=/mnt/self-define)
ok: [10.107.204.65] => (item=/mnt/self-define)
ok: [10.107.204.64] => (item=/mnt/self-define)
ok: [10.107.204.67] => (item=/mnt/self-define)
ok: [10.107.204.66] => (item=/mnt/self-define)
ok: [10.107.204.69] => (item=/mnt/nas_v1)
ok: [10.107.204.70] => (item=/mnt/nas_v1)
ok: [10.107.204.73] => (item=/mnt/nas_v1)
ok: [10.107.204.72] => (item=/mnt/nas_v1)
ok: [10.107.204.69] => (item=/mnt/nas_v2)
ok: [10.107.204.70] => (item=/mnt/nas_v2)
ok: [10.107.204.72] => (item=/mnt/nas_v2)
ok: [10.107.204.73] => (item=/mnt/nas_v2)
ok: [10.107.204.69] => (item=/mnt/self-define)
ok: [10.107.204.70] => (item=/mnt/self-define)
ok: [10.107.204.72] => (item=/mnt/self-define)
ok: [10.107.204.73] => (item=/mnt/self-define)

TASK [Ensure fstab contains necessary entries] ************************************************************
ok: [10.107.204.64] => (item=nas:/volume1/1 /mnt/nas_v1 nfs defaults 0 0)
ok: [10.107.204.67] => (item=nas:/volume1/1 /mnt/nas_v1 nfs defaults 0 0)
ok: [10.107.204.68] => (item=nas:/volume1/1 /mnt/nas_v1 nfs defaults 0 0)
ok: [10.107.204.66] => (item=nas:/volume1/1 /mnt/nas_v1 nfs defaults 0 0)
ok: [10.107.204.65] => (item=nas:/volume1/1 /mnt/nas_v1 nfs defaults 0 0)
ok: [10.107.204.64] => (item=nas:/volume1/1/self-define /mnt/self-define nfs defaults 0 0)
ok: [10.107.204.67] => (item=nas:/volume1/1/self-define /mnt/self-define nfs defaults 0 0)
ok: [10.107.204.66] => (item=nas:/volume1/1/self-define /mnt/self-define nfs defaults 0 0)
ok: [10.107.204.65] => (item=nas:/volume1/1/self-define /mnt/self-define nfs defaults 0 0)
ok: [10.107.204.68] => (item=nas:/volume1/1/self-define /mnt/self-define nfs defaults 0 0)
ok: [10.107.204.64] => (item=nas:/volume2/2 /mnt/nas_v2 nfs defaults 0 0)
ok: [10.107.204.67] => (item=nas:/volume2/2 /mnt/nas_v2 nfs defaults 0 0)
ok: [10.107.204.65] => (item=nas:/volume2/2 /mnt/nas_v2 nfs defaults 0 0)
ok: [10.107.204.66] => (item=nas:/volume2/2 /mnt/nas_v2 nfs defaults 0 0)
ok: [10.107.204.68] => (item=nas:/volume2/2 /mnt/nas_v2 nfs defaults 0 0)
ok: [10.107.204.69] => (item=nas:/volume1/1 /mnt/nas_v1 nfs defaults 0 0)
ok: [10.107.204.70] => (item=nas:/volume1/1 /mnt/nas_v1 nfs defaults 0 0)
ok: [10.107.204.73] => (item=nas:/volume1/1 /mnt/nas_v1 nfs defaults 0 0)
ok: [10.107.204.72] => (item=nas:/volume1/1 /mnt/nas_v1 nfs defaults 0 0)
ok: [10.107.204.69] => (item=nas:/volume1/1/self-define /mnt/self-define nfs defaults 0 0)
ok: [10.107.204.70] => (item=nas:/volume1/1/self-define /mnt/self-define nfs defaults 0 0)
ok: [10.107.204.72] => (item=nas:/volume1/1/self-define /mnt/self-define nfs defaults 0 0)
ok: [10.107.204.73] => (item=nas:/volume1/1/self-define /mnt/self-define nfs defaults 0 0)
ok: [10.107.204.69] => (item=nas:/volume2/2 /mnt/nas_v2 nfs defaults 0 0)
ok: [10.107.204.70] => (item=nas:/volume2/2 /mnt/nas_v2 nfs defaults 0 0)
ok: [10.107.204.72] => (item=nas:/volume2/2 /mnt/nas_v2 nfs defaults 0 0)
ok: [10.107.204.73] => (item=nas:/volume2/2 /mnt/nas_v2 nfs defaults 0 0)

TASK [Ensure all filesystems are mounted] ************************************************************
ok: [10.107.204.66] => (item={'path': '/mnt/nas_v1', 'src': 'nas:/volume1/1', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.65] => (item={'path': '/mnt/nas_v1', 'src': 'nas:/volume1/1', 'fstype': 'nfs', 'opts': 'defaults'})
ok: [10.107.204.66] => (item={'path': '/mnt/self-define', 'src': 'nas:/volume1/1/self-define', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.67] => (item={'path': '/mnt/nas_v1', 'src': 'nas:/volume1/1', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.64] => (item={'path': '/mnt/nas_v1', 'src': 'nas:/volume1/1', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.68] => (item={'path': '/mnt/nas_v1', 'src': 'nas:/volume1/1', 'fstype': 'nfs', 'opts': 'defaults'})
ok: [10.107.204.66] => (item={'path': '/mnt/nas_v2', 'src': 'nas:/volume2/2', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.65] => (item={'path': '/mnt/self-define', 'src': 'nas:/volume1/1/self-define', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.67] => (item={'path': '/mnt/self-define', 'src': 'nas:/volume1/1/self-define', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.64] => (item={'path': '/mnt/self-define', 'src': 'nas:/volume1/1/self-define', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.68] => (item={'path': '/mnt/self-define', 'src': 'nas:/volume1/1/self-define', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.65] => (item={'path': '/mnt/nas_v2', 'src': 'nas:/volume2/2', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.67] => (item={'path': '/mnt/nas_v2', 'src': 'nas:/volume2/2', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.64] => (item={'path': '/mnt/nas_v2', 'src': 'nas:/volume2/2', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.69] => (item={'path': '/mnt/nas_v1', 'src': 'nas:/volume1/1', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.68] => (item={'path': '/mnt/nas_v2', 'src': 'nas:/volume2/2', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.69] => (item={'path': '/mnt/self-define', 'src': 'nas:/volume1/1/self-define', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.70] => (item={'path': '/mnt/nas_v1', 'src': 'nas:/volume1/1', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.72] => (item={'path': '/mnt/nas_v1', 'src': 'nas:/volume1/1', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.73] => (item={'path': '/mnt/nas_v1', 'src': 'nas:/volume1/1', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.69] => (item={'path': '/mnt/nas_v2', 'src': 'nas:/volume2/2', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.70] => (item={'path': '/mnt/self-define', 'src': 'nas:/volume1/1/self-define', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.72] => (item={'path': '/mnt/self-define', 'src': 'nas:/volume1/1/self-define', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.73] => (item={'path': '/mnt/self-define', 'src': 'nas:/volume1/1/self-define', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.70] => (item={'path': '/mnt/nas_v2', 'src': 'nas:/volume2/2', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.72] => (item={'path': '/mnt/nas_v2', 'src': 'nas:/volume2/2', 'fstype': 'nfs', 'opts': 'defaults'})
changed: [10.107.204.73] => (item={'path': '/mnt/nas_v2', 'src': 'nas:/volume2/2', 'fstype': 'nfs', 'opts': 'defaults'})

PLAY RECAP ************************************************************
10.107.204.64              : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.65              : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.66              : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.67              : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.68              : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.69              : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.70              : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.71              : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.72              : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.73              : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0  
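
After a run like the one above, it can be useful to confirm on a node that every NFS entry in /etc/fstab is actually mounted. A minimal sketch follows, assuming the standard fstab column order (source, mount point, type) and the util-linux mountpoint command; the helper names nfs_mount_points and check_nfs_mounts are hypothetical.

```shell
#!/bin/sh
# Hypothetical check: list the NFS mount points declared in an fstab-format
# file ($1, default /etc/fstab), skipping comment lines.
nfs_mount_points() {
    awk '$1 !~ /^#/ && $3 == "nfs" { print $2 }' "${1:-/etc/fstab}"
}

# Report whether each declared NFS mount point is currently mounted.
check_nfs_mounts() {
    for mp in $(nfs_mount_points "$1"); do
        if mountpoint -q "$mp"; then
            echo "mounted: $mp"
        else
            echo "NOT mounted: $mp"
        fi
    done
}

# Usage on a node:
#   check_nfs_mounts
```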

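Harbor configuration

If /etc/docker/daemon.json on some nodes already contains an "insecure-registries" entry and you want to add a new registry address without removing the existing ones, the update must not overwrite the existing configuration. Ansible's blockinfile module adds new content while preserving the rest of a file, and the playbook below uses it. Note, however, that blockinfile wraps the inserted text in marker lines (here beginning with #) and JSON has no comment syntax, so the resulting file does not pass the json.tool validation. The update_harbor.yml playbook shown earlier, which reads, merges, and rewrites the JSON, is the one that was actually run.

Create the playbook

---
- name: Update Docker daemon configuration and login to repository if needed
  hosts: gpus
  become: yes
  tasks:
    - name: Install python3
      ansible.builtin.package:
        name: python3
        state: present

    - name: Ensure /etc/docker/daemon.json exists
      ansible.builtin.file:
        path: /etc/docker/daemon.json
        state: touch

    - name: Add new registry to /etc/docker/daemon.json
      ansible.builtin.blockinfile:
        path: /etc/docker/daemon.json
        block: |
          {
            "insecure-registries": ["10.200.88.53"]
          }
        marker: "# {mark} ANSIBLE MANAGED BLOCK"
        create: yes
        backup: yes
        mode: '0644'
        validate: 'python3 -m json.tool %s > /dev/null'

    - name: Restart Docker service
      ansible.builtin.service:
        name: docker
        state: restarted

    - name: Check if Docker is already logged in
      # shell rather than command: the pipe requires a shell
      ansible.builtin.shell:
        cmd: docker info | grep "Username:"
      register: docker_login_status
      ignore_errors: yes

    - name: Log in to Docker registry if not already logged in
      ansible.builtin.command:
        cmd: docker login 10.200.88.53 --username dros_admin --password 'Dros@zjgxn&07101604'
      when: docker_login_status.rc != 0
      ignore_errors: yes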

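Playbook walkthrough

The steps below describe the tasks in update_harbor.yml, the playbook that was run:

Ensure Python is installed

  • Make sure Python 3 is installed on the node, because json.tool requires it.

Ensure /etc/docker/daemon.json exists

  • Make sure the file exists, even if it is empty.

Read the existing daemon.json

  • Use the slurp module to read the current contents of the JSON file.

Decode the JSON

  • Decode the base64-encoded content that was read and convert it to a JSON object.

Ensure the new registry address is included

  • Update the JSON object so that insecure-registries contains the new registry address.

Write the updated daemon.json

  • Write the updated JSON to /etc/docker/daemon.json and keep a backup.

Validate the JSON syntax

  • Check that the JSON file is syntactically valid.

Restart the Docker service

  • Restart Docker so that it picks up the new configuration.

Log in to the Docker registry directly

  • Attempt to log in to the Docker registry; a failed login does not interrupt the playbook.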
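
In update_harbor.yml the validation step runs after the file has been written and its failure is ignored, so Docker would still be restarted with a broken file. A minimal sketch of the opposite order follows: validate a candidate file first and install it only if it parses. It assumes python3 is available, as the playbook also does; the helper name install_if_valid and the paths in the usage line are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy a candidate daemon.json into place only if it
# parses as JSON. Arguments: <candidate file> <target file>
install_if_valid() {
    candidate=$1; target=$2
    if python3 -m json.tool "$candidate" > /dev/null 2>&1; then
        cp "$candidate" "$target"
        echo "installed: $target"
    else
        echo "invalid JSON, left unchanged: $target" >&2
        return 1
    fi
}

# Usage:
#   install_if_valid /tmp/daemon.json.new /etc/docker/daemon.json
```

A file that contains blockinfile marker lines such as "# BEGIN ANSIBLE MANAGED BLOCK" would be rejected by this check, because JSON does not allow comments.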

Run the playbook

(ansible) yuzailiang@ubuntu:~$ ansible-playbook update_harbor.yml

TASK [Restart Docker service] ************************************************************
changed: [10.107.204.65]
changed: [10.107.204.66]
changed: [10.107.204.64]
changed: [10.107.204.68]
changed: [10.107.204.67]
changed: [10.107.204.72]
changed: [10.107.204.69]
changed: [10.107.204.70]
changed: [10.107.204.73]

TASK [Log in to Docker registry] ************************************************************
changed: [10.107.204.64]
changed: [10.107.204.67]
changed: [10.107.204.66]
changed: [10.107.204.65]
changed: [10.107.204.68]
changed: [10.107.204.69]
changed: [10.107.204.72]
changed: [10.107.204.70]
changed: [10.107.204.73]

PLAY RECAP ************************************************************
10.107.204.64              : ok=11   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.65              : ok=11   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.66              : ok=11   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.67              : ok=11   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.68              : ok=11   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.69              : ok=11   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.70              : ok=11   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.71              : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.72              : ok=11   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
10.107.204.73              : ok=11   changed=5    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

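Summary

This article used Ansible, through the ansible and ansible-playbook commands, to set up and operate an automated cluster-management environment. Automating the Harbor registry configuration, the NAS mounts, and the /etc/hosts update makes the same change on every GPU node in one step. Ansible is a very capable tool for this kind of automated management.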
