Deployment environment
Huawei Cloud Flexus application server
Operating system: CentOS 7.6
openGauss version: openGauss 5.1.0 (Preview)
Reference documents
Official installation guide:
https://docs.opengauss.org/zh/docs/5.1.0/docs/InstallationGuide/%E4%BA%86%E8%A7%A3%E5%AE%89%E8%A3%85%E6%B5%81%E7%A8%8B.html
Other troubleshooting articles:
https://blog.csdn.net/YINZHE__/article/details/131347291
https://www.modb.pro/db/650751
Installing openGauss
Install dependencies
yum update -y
yum install -y bzip2 libaio-devel flex bison ncurses-devel glibc-devel libxml2-devel patch redhat-lsb-core unzip gcc gcc-c++ perl openssl-devel libffi-devel libtool zlib-devel readline-devel expect
Check whether python3 is installed
python3 -V
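The check above can also be scripted. The following is a minimal sketch: version_at_least is a hypothetical helper, and the 3.6 minimum is an assumption based on the Python 3.6.x requirement usually listed for CentOS, so confirm it against the official installation guide.

```shell
# Succeed if version "$1" (e.g. "3.6.8") is at least MAJOR.MINOR "$2" (e.g. "3.6").
version_at_least() {
    have_major=${1%%.*}
    rest=${1#*.}
    have_minor=${rest%%.*}
    want_major=${2%%.*}
    want_minor=${2#*.}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Report the installed python3 version, if any (3.6 is an assumed minimum).
if command -v python3 >/dev/null 2>&1; then
    ver=$(python3 -V 2>&1 | awk '{print $2}')
    if version_at_least "$ver" 3.6; then
        echo "python3 $ver: OK"
    else
        echo "python3 $ver: older than 3.6"
    fi
else
    echo "python3 not found"
fi
```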
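Prepare the configuration file
The file is later passed to gs_preinstall and gs_install as /opt/software/openGauss/cluster_config.xml. Replace node1_hostname and 192.168.0.1 with the hostname and IP address of your own server.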
<?xml version="1.0" encoding="UTF-8"?>
<ROOT>
    <!-- Overall openGauss information -->
    <CLUSTER>
        <!-- Database name -->
        <PARAM name="clusterName" value="dbCluster" />
        <!-- Database node name (hostname) -->
        <PARAM name="nodeNames" value="node1_hostname" />
        <!-- Database installation directory -->
        <PARAM name="gaussdbAppPath" value="/opt/huawei/install/app" />
        <!-- Log directory -->
        <PARAM name="gaussdbLogPath" value="/var/log/omm" />
        <!-- Temporary file directory -->
        <PARAM name="tmpMppdbPath" value="/opt/huawei/tmp" />
        <!-- Database tool directory -->
        <PARAM name="gaussdbToolPath" value="/opt/huawei/install/om" />
        <!-- Database core file directory -->
        <PARAM name="corePath" value="/opt/huawei/corefile" />
        <!-- Node IPs, in the same order as the node name list -->
        <PARAM name="backIp1s" value="192.168.0.1"/>
    </CLUSTER>
    <!-- Deployment information for each server -->
    <DEVICELIST>
        <!-- Deployment information for node 1 -->
        <DEVICE sn="node1_hostname">
            <!-- Hostname of node 1 -->
            <PARAM name="name" value="node1_hostname"/>
            <!-- AZ of node 1 and its AZ priority -->
            <PARAM name="azName" value="AZ1"/>
            <PARAM name="azPriority" value="1"/>
            <!-- IP of node 1. If the server has only one usable NIC, set backIp1 and sshIp1 to the same IP -->
            <PARAM name="backIp1" value="192.168.0.1"/>
            <PARAM name="sshIp1" value="192.168.0.1"/>
            <!-- dbnode -->
            <PARAM name="dataNum" value="1"/>
            <PARAM name="dataPortBase" value="15400"/>
            <PARAM name="dataNode1" value="/opt/huawei/install/data/dn"/>
            <PARAM name="dataNode1_syncNum" value="0"/>
        </DEVICE>
    </DEVICELIST>
</ROOT>
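Because nodeNames has to be the machine's real hostname, a quick sanity check before running gs_preinstall can catch a leftover placeholder. This is a sketch only: xml_param is a hypothetical helper that reads one PARAM value with sed, and the file path is the one used later in this article.

```shell
# Print the value of the first <PARAM name="$2" value="..."/> found in file "$1".
xml_param() {
    sed -n 's/.*<PARAM name="'"$2"'" value="\([^"]*\)".*/\1/p' "$1" | head -n 1
}

cfg=/opt/software/openGauss/cluster_config.xml
if [ -f "$cfg" ]; then
    node=$(xml_param "$cfg" nodeNames)
    if [ "$node" = "$(hostname)" ]; then
        echo "nodeNames matches the hostname: $node"
    else
        echo "nodeNames is '$node' but the hostname is '$(hostname)'"
    fi
    echo "backIp1 is set to: $(xml_param "$cfg" backIp1)"
fi
```

The same helper can read dataPortBase or dataNode1 when a later step needs the port or the data directory.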
Create the directories for the files and set permissions
mkdir -p /opt/software/openGauss
mkdir -p /opt/huawei
chmod 755 -R /opt/software
chmod 755 -R /opt/huawei
Initialize the installation environment
cd /opt/software/openGauss
tar -zxvf openGauss-x.x.x-openEuler-64bit-all.tar.gz
tar -zxvf openGauss-x.x.x-openEuler-64bit-om.tar.gz
cd /opt/software/openGauss/script
./gs_preinstall -U omm -G dbgrp -X /opt/software/openGauss/cluster_config.xml
gs_preinstall fails with an error and prompts us to run the following command:
./gs_checkos -i A -h hcss-ecs-3fd4 --detail
to check the operating system. The result:
[root@hcss-ecs-3fd4 script]# ./gs_checkos -i A -h hcss-ecs-3fd4 --detail
Checking items:
    A1. [ OS version status ] : Normal
        [hcss-ecs-3fd4]
        centos_7.6.1810_64bit
    A2. [ Kernel version status ] : Normal
        The names about all kernel versions are same. The value is "3.10.0-1160.119.1.el7.x86_64".
    A3. [ Unicode status ] : Normal
        The values of all unicode are same. The value is "LANG=en_US.UTF-8".
    A4. [ Time zone status ] : Normal
        The informations about all timezones are same. The value is "+0800".
    A5. [ Swap memory status ] : Normal
        The value about swap memory is correct.
    A6. [ System control parameters status ] : Normal
        All values about system control parameters are correct.
    A7. [ File system configuration status ] : Normal
        Both soft nofile and hard nofile are correct.
    A8. [ Disk configuration status ] : Normal
        The value about XFS mount parameters is correct.
    A9. [ Pre-read block size status ] : Normal
        The value about Logical block size is correct.
    A10.[ IO scheduler status ] : Normal
        The value of IO scheduler is correct.
    A11.[ Network card configuration status ] : Warning
        [hcss-ecs-3fd4]
        BondMode Null
        Warning reason: Failed to obtain the network card speed value. Commands for obtain the network card speed: /sbin/ethtool lo | grep 'Speed:'. Error:
    A12.[ Time consistency status ] : Warning
        [hcss-ecs-3fd4]
        The NTPD not detected on machine and local time is "2025-03-20 08:37:33" .
    A13.[ Firewall service status ] : Normal
        The firewall service is stopped.
    A14.[ THP service status ] : Normal
        The THP service is stopped.
Total numbers:14. Abnormal numbers:0. Warning numbers:2.
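Any Abnormal item must be fixed before continuing (the two Warning items above did not block this installation). For fixes, see: https://blog.csdn.net/YINZHE__/article/details/131347291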
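To make the "no Abnormal items" condition easy to check in a script, the summary line can be parsed. This is a small sketch: abnormal_count is a hypothetical helper, and it assumes the summary keeps the "Abnormal numbers:N." wording shown above.

```shell
# Read gs_checkos output on stdin and print the number of Abnormal items
# taken from the "Total numbers:... Abnormal numbers:N. ..." summary line.
abnormal_count() {
    sed -n 's/.*Abnormal numbers:\([0-9][0-9]*\)\..*/\1/p'
}

# Example with the summary line from this article:
echo 'Total numbers:14. Abnormal numbers:0. Warning numbers:2.' | abnormal_count
```

In practice the input would come from the real command, for example ./gs_checkos -i A -h hcss-ecs-3fd4 --detail | abnormal_count, and any result other than 0 means something still needs fixing.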
Start the installation
su - omm
gs_install -X /opt/software/openGauss/cluster_config.xml
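A password has to be set during installation.
The password must meet the following complexity requirements:
1. It contains at least 8 and at most 16 characters.
2. It must not be the same as the user name, the current password (ALTER), or the current password reversed.
3. It contains at least three of the following four character types: uppercase letters (A-Z), lowercase letters (a-z), digits, and non-alphanumeric characters (limited to ~!@#$%^&*()-_=+|[{}];:,<.>/?).
Output during installation: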
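The rules above can be approximated with a short pre-check before typing the password into the installer. This is a simplified sketch, not openGauss's own validation: check_password is a hypothetical helper, it treats any non-alphanumeric character as a special character instead of enforcing the restricted list, and it does not cover the "current password" part of rule 2, which only applies to ALTER.

```shell
# Succeed if password "$2" for user "$1" passes a simplified version of the rules:
# 8-16 characters, different from the user name, at least 3 of 4 character classes.
# The body runs in a subshell so the LC_ALL change does not leak to the caller.
check_password() (
    LC_ALL=C    # byte-wise matching, so [A-Z] means ASCII uppercase only
    user=$1 pw=$2 classes=0
    [ "${#pw}" -ge 8 ] && [ "${#pw}" -le 16 ] || exit 1
    [ "$pw" != "$user" ] || exit 1
    case $pw in *[A-Z]*) classes=$((classes + 1)) ;; esac
    case $pw in *[a-z]*) classes=$((classes + 1)) ;; esac
    case $pw in *[0-9]*) classes=$((classes + 1)) ;; esac
    case $pw in *[!A-Za-z0-9]*) classes=$((classes + 1)) ;; esac
    [ "$classes" -ge 3 ]
)

# The password used later in this article has uppercase letters, digits, and "@".
if check_password test 'OUYE@123'; then echo "accepted"; else echo "rejected"; fi
```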
Parsing the configuration file.
Successfully checked gs_uninstall on every node.
Check preinstall on every node.
Successfully checked preinstall on every node.
Creating the backup directory.
Successfully created the backup directory.
begin deploy..
Installing the cluster.
begin prepare Install Cluster..
Checking the installation environment on all nodes.
begin install Cluster..
Installing applications on all nodes.
Successfully installed APP.
begin init Instance..
encrypt cipher and rand files for database.
Please enter password for database:
Please repeat for database:
begin to create CA cert files
The sslcert will be generated in /opt/huawei/install/app/share/sslcert/om
NO cm_server instance, no need to create CA for CM.
Non-dss_ssl_enable, no need to create CA for DSS
Cluster installation is completed.
Configuring.
Deleting instances from all nodes.
Successfully deleted instances from all nodes.
Checking node configuration on all nodes.
Initializing instances on all nodes.
Updating instance configuration on all nodes.
Check consistence of memCheck and coresCheck on database nodes.
Configuring pg_hba on all nodes.
Configuration is completed.
Using omm:dbgrp to install database.
Using installation program path : /opt/huawei/install/app_b5a8d5b0
$GAUSSHOME points to /opt/huawei/install/app_b5a8d5b0, no need to create symbolic link.
Traceback (most recent call last):
  File "/opt/huawei/install/om/script/local/Install.py", line 812, in <module>
    functionDict[g_opts.action]()
  File "/opt/huawei/install/om/script/local/Install.py", line 743, in startCluster
    dn.start(self.time_out)
  File "/opt/huawei/install/om/script/local/../gspylib/component/Kernel/Kernel.py", line 106, in start
    "failure details." + "\n" + output)
Exception: [GAUSS-51607] : Failed to start instance. Error: Please check the gs_ctl log for failure details.
[2025-03-20 09:09:00.686][6902][][gs_ctl]: gs_ctl started,datadir is /opt/huawei/install/data/dn
[2025-03-20 09:09:00.717][6902][][gs_ctl]: waiting for server to start...
.
0 LOG: [Alarm Module]can not read GAUSS_WARNING_TYPE env.
0 LOG: [Alarm Module]Host Name: hcss-ecs-3fd4
0 LOG: [Alarm Module]Host IP: hcss-ecs-3fd4. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>
0 LOG: [Alarm Module]Cluster Name: dbCluster
0 LOG: [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58
0 WARNING: failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING: failed to parse feature control file: gaussdb.version.
0 WARNING: Failed to load the product control file, so gaussdb cannot distinguish product version.
0 LOG: bbox_dump_path is set to /opt/huawei/corefile/
2025-03-20 09:09:00.783 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: base_page_saved_interval is 400, ori is 400.
2025-03-20 09:09:00.783 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 DB010 0 [REDO] LOG: Recovery parallelism, cpu count = 2, max = 4, actual = 2
2025-03-20 09:09:00.783 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 DB010 0 [REDO] LOG: ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
gaussdb.state does not exist, and skipt setting since it is optional.
2025-03-20 09:09:00.810 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: [Alarm Module]can not read GAUSS_WARNING_TYPE env.
2025-03-20 09:09:00.810 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: [Alarm Module]Host Name: hcss-ecs-3fd4
2025-03-20 09:09:00.810 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: [Alarm Module]Host IP: hcss-ecs-3fd4. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>
2025-03-20 09:09:00.810 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: [Alarm Module]Cluster Name: dbCluster
2025-03-20 09:09:00.810 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58
2025-03-20 09:09:00.815 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: loaded library "security_plugin"
2025-03-20 09:09:00.818 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 01000 0 [BACKEND] WARNING: could not create any HA TCP/IP sockets
2025-03-20 09:09:00.818 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 01000 0 [BACKEND] WARNING: could not create any HA TCP/IP sockets
2025-03-20 09:09:00.822 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: InitNuma numaNodeNum: 1 numa_distribute_mode: none inheritThreadPool: 0.
2025-03-20 09:09:00.822 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 01000 0 [BACKEND] WARNING: Failed to initialize the memory protect for g_instance.attr.attr_storage.cstore_buffers (1024 Mbytes) or shared memory (3538 Mbytes) is larger.
2025-03-20 09:09:00.823 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 42809 0 [BACKEND] FATAL: could not create shared memory segment: Cannot allocate memory
2025-03-20 09:09:00.823 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 42809 0 [BACKEND] DETAIL: Failed system call was shmget(key=15400001, size=3710847304, 03600).
2025-03-20 09:09:00.823 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 42809 0 [BACKEND] HINT: This error usually means that openGauss's request for a shared memory segment exceeded available memory or swap space, or exceeded your kernel's SHMALL parameter. You can either reduce the request size or reconfigure the kernel with larger SHMALL. To reduce the request size (currently 3710847304 bytes), reduce openGauss's shared memory usage, perhaps by reducing shared_buffers.
The openGauss documentation contains more information about shared memory configuration.
2025-03-20 09:09:00.826 67db6aac.1 [unknown] 140006029562368 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: FiniNuma allocIndex: 0.
[2025-03-20 09:09:01.719][6902][][gs_ctl]: waitpid 6905 failed, exitstatus is 256, ret is 2
[2025-03-20 09:09:01.719][6902][][gs_ctl]: stopped waiting
[2025-03-20 09:09:01.719][6902][][gs_ctl]: could not start server
Examine the log output.
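The output shows that the installation itself succeeded but the instance could not start. The key error message is: Failed to start instance. Error: Please check the gs_ctl log for failure details.
The cause is that the virtual machine has too little memory: the gs_ctl log reports the FATAL error "could not create shared memory segment" for a shmget request of 3710847304 bytes. See: https://www.modb.pro/db/650751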
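To see how far the request is from what the machine can offer, the byte count from the log can be converted to MiB and compared with the memory reported by the kernel. This is a sketch: bytes_to_mib is a hypothetical helper, and reading /proc/meminfo assumes a Linux host.

```shell
# Convert a byte count to whole MiB (integer division).
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Size taken from the shmget() line in the gs_ctl log.
requested=3710847304
echo "openGauss requested $(bytes_to_mib "$requested") MiB of shared memory"

# Show total and available memory for comparison (values in /proc/meminfo are in kB).
if [ -r /proc/meminfo ]; then
    awk '/^MemTotal:|^MemAvailable:/ {printf "%s %d MiB\n", $1, $2 / 1024}' /proc/meminfo
fi
```

The converted value, 3538 MiB, matches the "shared memory (3538 Mbytes)" figure in the WARNING line of the same log.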
Adjust the following parameters in /opt/huawei/install/data/dn/postgresql.conf (the data directory configured as dataNode1):
shared_buffers = 128MB
cstore_buffers = 128MB #min 16MB
ssl = off
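If you prefer not to edit the file by hand, the three settings can be applied with a small script. This is a sketch under assumptions: set_conf is a hypothetical helper, it relies on GNU sed (sed -i -E, as shipped with CentOS), and the value must not contain the characters | or &.

```shell
# Set "key = value" in a postgresql.conf-style file: rewrite an existing
# assignment (commented out or not), or append a new line if there is none.
set_conf() {
    file=$1 key=$2 value=$3
    if grep -Eq "^[[:space:]]*#?[[:space:]]*$key[[:space:]]*=" "$file"; then
        sed -i -E "s|^[[:space:]]*#?[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage, run as omm against the data directory from cluster_config.xml:
#   conf=/opt/huawei/install/data/dn/postgresql.conf
#   cp "$conf" "$conf.bak"
#   set_conf "$conf" shared_buffers 128MB
#   set_conf "$conf" cstore_buffers 128MB
#   set_conf "$conf" ssl off
```

Keeping a backup copy first, as in the usage comment, makes it easy to return to the generated defaults.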
Start the database again:
[omm@node111 ~]$ gs_om -t start
Starting cluster.
Configure remote connections for Navicat
Modify the configuration
Edit postgresql.conf
vim /opt/huawei/install/data/dn/postgresql.conf
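Change lines 68 and 110 (the line numbers may differ in other versions) as follows, so that the server listens on all network interfaces and stores passwords as MD5, which matches the md5 method used in pg_hba.conf: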
listen_addresses = '*'
password_encryption_type = 0
Edit pg_hba.conf
vim /opt/huawei/install/data/dn/pg_hba.conf
Append the following line:
host all all 0.0.0.0/0 md5
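When this step is scripted rather than done in vim, it helps to avoid adding the same rule twice on repeated runs. This is a minimal sketch: append_once is a hypothetical helper that appends a line only if an identical line is not already in the file.

```shell
# Append "$2" to file "$1" unless an identical line is already present.
append_once() {
    file=$1 line=$2
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage, run as omm:
#   append_once /opt/huawei/install/data/dn/pg_hba.conf 'host all all 0.0.0.0/0 md5'
```

Note that 0.0.0.0/0 admits clients from any IPv4 address; a narrower CIDR range can be used in the same position if only certain machines need access.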
Restart the database
su - omm
gs_om -t restart
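After the restart, one way to confirm that the new listen_addresses setting took effect is to look for the database port among the listening TCP sockets. This is a sketch: port_listening is a hypothetical helper that parses the output of ss -ltn, and 15400 is the dataPortBase from cluster_config.xml.

```shell
# Read `ss -ltn` output on stdin and succeed if some socket listens on port "$1".
port_listening() {
    awk -v p=":$1" '$1 == "LISTEN" && $4 ~ (p "$") { found = 1 } END { exit !found }'
}

if command -v ss >/dev/null 2>&1; then
    if ss -ltn | port_listening 15400; then
        echo "port 15400 is listening"
    else
        echo "port 15400 is not listening"
    fi
fi
```

A local address such as 0.0.0.0:15400 or *:15400 in the ss output means the port is open on all interfaces, while 127.0.0.1:15400 alone would mean remote clients still cannot connect.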
Create a user for remote connections
gsql -d postgres -p 15400
openGauss=# create user test with password "OUYE@123";
NOTICE: The encrypted password contains MD5 ciphertext, which is not secure.
CREATE ROLE
openGauss=# GRANT ALL PRIVILEGES to test;
ALTER ROLE
openGauss=# create database db_tpcc owner test;
CREATE DATABASE
openGauss=# \q
[omm@hcss-ecs-3fd4 root]$ gsql -d db_tpcc -p 15400 -U test
Password for user test: