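[Title]: Upgrading 3.x (with CM) to 5.0.1 (with CM): some time after the upgrade is committed, the CM cluster primary and standby switch over
[Test type: tool function] [Test version: 5.0.1] [Upgrade] Upgrading 3.x (with CM) to 5.0.1 (with CM): some time after the upgrade is committed, the CM cluster primary and standby switch over
[OS and hardware information] (query commands: cat /etc/system-release, uname -a):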
CentOS Linux release 7.6.1810 (Core)
Linux kwepwebenv07954 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
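[Test environment] (standalone / 1 primary, x standbys, x cascade standbys):
One primary, two standbys
[Function under test]:
Upgrade
[Test type]:
Functional test
[Database version] (query command: gaussdb -V):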
3.0.0:gsql (openGauss 3.0.0 build 02c14696) compiled at 2022-04-01 18:12:34 commit 0 last mr
3.0.2:gsql (openGauss 3.0.2 build 74914a8d) compiled at 2022-11-17 17:02:30 commit 0 last mr
3.0.3:gsql (openGauss 3.0.3 build 46134f73) compiled at 2023-01-10 22:42:07 commit 0 last mr
3.0.5:gsql (openGauss 3.0.5 build b54d05de) compiled at 2023-09-14 19:23:00 commit 0 last mr
3.1.0:gsql (openGauss 3.1.0 build 4e931f9a) compiled at 2022-09-29 14:19:24 commit 0 last mr
5.0.1:gsql (openGauss 5.0.1 build f766addf) compiled at 2023-10-07 18:07:51 commit 0 last mr
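[Preconditions]:
[Steps] (please describe the steps in detail):
Upgrade paths that reproduce the problem:
3.0.2cm -- 5.0.1cm: in-place upgrade, rollback, then upgrade and commit
3.0.3cm -- 5.0.1cm: in-place upgrade; in-place upgrade, rollback, then upgrade and commit; in-place upgrade, forced rollback, then upgrade and commit
3.0.5ncm/cm -- 5.0.1cm: in-place upgrade
3.1.0cm -- 5.0.1cm: in-place upgrade
3.0.0cm -- 3.1.0cm -- 5.0.1cm: in-place upgrade
[Expected output]:
The upgrade succeeds and the continuous TPC-C run stays normal
[Actual output]:
Some time after the upgrade is committed, the CM cluster primary and standby switch over
[Root cause analysis]:
[Log information] (please attach log files, screenshots, coredump information):
[Test code]: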
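A switchover can happen in two phases: during the upgrade, and at commit time. This issue covers the switchover at commit time; a switchover was also observed in the upgrade scenario, for reasons not yet known.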
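To tell whether the CM primary actually moved between two checks, the `gs_om -t status --all` output captured before and after a step can be compared. The following is only a minimal sketch, assuming the flattened `key : value` layout seen in the logs of this issue; the function names `cm_server_roles`, `cm_primary`, and `cm_primary_switched` are hypothetical helpers, not part of any openGauss tool.

```python
from typing import Dict, Optional


def cm_server_roles(status_text: str) -> Dict[str, str]:
    """Map node_name -> CMServer instance_state.

    Assumes the "key : value" line layout shown in this issue's logs.
    """
    roles: Dict[str, str] = {}
    node_name: Optional[str] = None
    in_cm_server = False
    for raw in status_text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # separator lines and anything without a colon
        key, value = key.strip(), value.strip()
        if key == "node_name":
            node_name = value
        elif key == "type":
            # Only the instance_state that follows a CMServer type line counts.
            in_cm_server = value == "CMServer"
        elif key == "instance_state" and in_cm_server and node_name:
            roles[node_name] = value
            in_cm_server = False
    return roles


def cm_primary(status_text: str) -> Optional[str]:
    """Return the node whose CMServer is Primary, or None if not exactly one."""
    primaries = [n for n, s in cm_server_roles(status_text).items() if s == "Primary"]
    return primaries[0] if len(primaries) == 1 else None


def cm_primary_switched(before: str, after: str) -> bool:
    """True when both snapshots have one CM primary and it is a different node."""
    old, new = cm_primary(before), cm_primary(after)
    return old is not None and new is not None and old != new
```

Feeding it the status text captured right after the upgrade and again some time after the commit would flag a CM switchover without reading the whole dump by eye. Datanode and Fenced UDF entries are ignored by design.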
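Acceptance date: 2023/11/29
Acceptance versions: gsql (openGauss 3.0.2 build 74914a8d) compiled at 2022-11-17 17:02:30 commit 0 last mr
gsql (openGauss 5.0.1 build d855a18f) compiled at 2023-11-28 19:13:55 commit 0 last mr
Acceptance result: passed
Acceptance scenario: in-place upgrade from 3.0.2 with CM to 5.0.1 with CM, then rollback, then upgrade again and commit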
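The acceptance scenario boils down to four `gs_upgradectl` invocations, all of which appear in the logs below. As a rough sketch only, the sequence could be generated like this; the helper name `scenario_commands` is hypothetical, and the sketch only builds the command strings, it does not run them.

```python
import shlex
from typing import List

# Order of gs_upgradectl actions in this scenario: upgrade, roll back,
# upgrade again, then commit. The action names are taken from the logs.
SCENARIO_ACTIONS = ("auto-upgrade", "auto-rollback", "auto-upgrade", "commit-upgrade")


def scenario_commands(xml_path: str) -> List[str]:
    """Build the gs_upgradectl command line for each step of the scenario."""
    quoted = shlex.quote(xml_path)
    return [f"gs_upgradectl -t {action} -X {quoted}" for action in SCENARIO_ACTIONS]


if __name__ == "__main__":
    for command in scenario_commands("/opt/upgrade_1129/1129/pkg/upgrade_1129.xml"):
        print(command)
```

In the logs each command is run as the cluster user after sourcing its environment file, with a version check and `gs_om -t status --all` between steps.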
Before the upgrade
[2023-11-29 16:40:03 INFO UpgradeScene 19 20512] Pre-upgrade check, round 1
[2023-11-29 16:40:03 INFO UpgradeScene 19 20512] Pre-upgrade preparation and checks
[2023-11-29 16:40:03 INFO UpgradeScene 19 20512] 1. Get the pre-upgrade version
[2023-11-29 16:40:03 INFO UpgradeScene 19 20512] Start executing: su - upgrade_1129 <<EOF
source /home/upgrade_1129/gaussdb.bashrc
gs_ssh -c "gsql -V"
EOF
[2023-11-29 16:40:06 INFO UpgradeScene 19 20512] Success: Successfully execute command on all nodes.
Output:
[SUCCESS] kwepwebenv06293:
gsql (openGauss 3.0.2 build 74914a8d) compiled at 2022-11-17 17:02:30 commit 0 last mr
[SUCCESS] kwepwebenv07953:
gsql (openGauss 3.0.2 build 74914a8d) compiled at 2022-11-17 17:02:30 commit 0 last mr
[SUCCESS] kwepwebenv07954:
gsql (openGauss 3.0.2 build 74914a8d) compiled at 2022-11-17 17:02:30 commit 0 last mr
In-place upgrade
[2023-11-29 16:43:43 INFO UpgradeScene 19 20512] Start executing: su - upgrade_1129 <<EOF
source /home/upgrade_1129/gaussdb.bashrc && gs_upgradectl -t auto-upgrade -X /opt/upgrade_1129/1129/pkg/upgrade_1129.xml
EOF
[2023-11-29 16:52:21 INFO UpgradeScene 19 20512] Success: Static configuration matched with old static configuration files.
Performing inplace rollback.
Rollback succeeded.
Checking upgrade environment.
Successfully checked upgrade environment.
Wait for the cluster status normal or degrade.
Start check CMS parameter.
Old cluster version number less than 92574.
Start to do health check.
Successfully checked cluster status.
Backing up current application and configurations.
Successfully backed up current application and configurations.
Stop cluster with gs_om successfully.
Backing up cluster configuration.
Successfully backup hotpatch config file.
Successfully backed up cluster configuration.
Installing new binary.
Restoring cluster configuration.
Successfully restored cluster configuration.
Stop cluster with gs_om successfully.
Modifying the socket path.
copy certs from /data/upgrade_1129/cluster/app_74914a8d to /data/upgrade_1129/cluster/app_d855a18f.
Successfully copy certs from /data/upgrade_1129/cluster/app_74914a8d to /data/upgrade_1129/cluster/app_d855a18f.
Stop cluster with gs_om successfully.
Switch symbolic link to new binary directory.
Successfully switch symbolic link to new binary directory.
Stop cluster with gs_om successfully.
Waiting for the cluster status to become normal.
.
The cluster status is normal.
Start to do health check.
Successfully checked cluster status.
Upgrade main process has been finished, user can do some check now.
Once the check done, please execute following command to commit upgrade:
gs_upgradectl -t commit-upgrade -X /opt/upgrade_1129/1129/pkg/upgrade_1129.xml
Last login: Wed Nov 29 16:43:42 CST 2023
[2023-11-29 16:52:21 INFO UpgradeScene 19 20512] Post-upgrade verification
[2023-11-29 16:52:21 INFO UpgradeScene 19 20512] 1. Version verification
[2023-11-29 16:52:21 INFO UpgradeScene 19 20512] Start executing: su - upgrade_1129 <<EOF
source /home/upgrade_1129/gaussdb.bashrc
gs_ssh -c "gsql -V"
EOF
[2023-11-29 16:52:24 INFO UpgradeScene 19 20512] Success: Successfully execute command on all nodes.
Output:
[SUCCESS] kwepwebenv06293:
gsql (openGauss 5.0.1 build d855a18f) compiled at 2023-11-28 19:13:55 commit 0 last mr
[SUCCESS] kwepwebenv07953:
gsql (openGauss 5.0.1 build d855a18f) compiled at 2023-11-28 19:13:55 commit 0 last mr
[SUCCESS] kwepwebenv07954:
gsql (openGauss 5.0.1 build d855a18f) compiled at 2023-11-28 19:13:55 commit 0 last mr
[2023-11-29 16:52:28 INFO UpgradeScene 19 20512] 3. Check database status
[2023-11-29 16:52:28 INFO UpgradeScene 19 20512] Start executing: su - upgrade_1129 <<EOF
source /home/upgrade_1129/gaussdb.bashrc
gs_om -t status --all
EOF
[2023-11-29 16:52:29 INFO UpgradeScene 19 20512] Success: -----------------------------------------------------------------------
cluster_state : Normal
redistributing : No
balanced : Yes
-----------------------------------------------------------------------
node : 1
node_name : kwepwebenv06293
node : 1
instance_id : 1
node_ip : xx.xx.xx.1
data_path : /data/upgrade_1129/cluster/cm/cm_server
type : CMServer
instance_state : Down
node : 1
instance_id : 6001
node_ip : xx.xx.xx.1
data_path : /data/upgrade_1129/cluster/dn1
type : Datanode
instance_state : Primary
static_connections : 2
HA_state : Normal
reason : Normal
standby_node :
standby_data_path :
standby_node :
standby_data_path :
standby_state : Standby
sender_sent_location : 0/B1A77670
sender_write_location : 0/B1A77670
sender_flush_location : 0/B1A77670
sender_replay_location : 0/B1A77670
receiver_received_location: 0/B1A77670
receiver_write_location : 0/B1A77670
receiver_flush_location : 0/B1A77670
receiver_replay_location : 0/B1A77670
sync_state : Quorum
secondary_state : Unknown
sender_sent_location : 0/0
sender_write_location : 0/0
sender_flush_location : 0/0
sender_replay_location : 0/0
receiver_received_location: 0/0
receiver_write_location : 0/0
receiver_flush_location : 0/0
receiver_replay_location : 0/0
sync_state : Unknown
node : 1
node_name : kwepwebenv06293
node : 1
instance_id : 1
node_ip : xx.xx.xx.1
data_path : /data/upgrade_1129/cluster/cm/cm_server
type : CMServer
instance_state : Down
node : 1
node_ip : xx.xx.xx.1
type : Fenced UDF
state : Normal
-----------------------------------------------------------------------
node : 2
node_name : kwepwebenv07953
node : 2
instance_id : 2
node_ip : xx.xx.xx.2
data_path : /data/upgrade_1129/cluster/cm/cm_server
type : CMServer
instance_state : Standby
node : 2
instance_id : 6002
node_ip : xx.xx.xx.2
data_path : /data/upgrade_1129/cluster/dn1
type : Datanode
instance_state : Standby
dcf_role : FOLLOWER
static_connections : 2
HA_state : Normal
reason : Normal
sender_sent_location : 0/B1A77670
sender_write_location : 0/B1A77670
sender_flush_location : 0/B1A77670
sender_replay_location : 0/B1A77670
receiver_received_location: 0/B1A77670
receiver_write_location : 0/B1A77670
receiver_flush_location : 0/B1A77670
receiver_replay_location : 0/B1A77670
sync_state : Async
node : 2
node_name : kwepwebenv07953
node : 2
instance_id : 2
node_ip : xx.xx.xx.2
data_path : /data/upgrade_1129/cluster/cm/cm_server
type : CMServer
instance_state : Standby
node : 2
node_ip : xx.xx.xx.2
type : Fenced UDF
state : Normal
-----------------------------------------------------------------------
node : 3
node_name : kwepwebenv07954
node : 3
instance_id : 3
node_ip : xx.xx.xx.3
data_path : /data/upgrade_1129/cluster/cm/cm_server
type : CMServer
instance_state : Primary
node : 3
instance_id : 6003
node_ip : xx.xx.xx.3
data_path : /data/upgrade_1129/cluster/dn1
type : Datanode
instance_state : Standby
dcf_role : FOLLOWER
static_connections : 2
HA_state : Normal
reason : Normal
sender_sent_location : 0/B1A77670
sender_write_location : 0/B1A77670
sender_flush_location : 0/B1A77670
sender_replay_location : 0/B1A77670
receiver_received_location: 0/B1A77670
receiver_write_location : 0/B1A77670
receiver_flush_location : 0/B1A77670
receiver_replay_location : 0/B1A77670
sync_state : Async
node : 3
node_name : kwepwebenv07954
node : 3
instance_id : 3
node_ip : xx.xx.xx.3
data_path : /data/upgrade_1129/cluster/cm/cm_server
type : CMServer
instance_state : Primary
node : 3
node_ip : xx.xx.xx.3
type : Fenced UDF
state : Normal
Rollback
[2023-11-29 16:52:29 INFO UpgradeScene 19 20512] Start rollback
[2023-11-29 16:52:29 INFO UpgradeScene 19 20512] Roll back the upgraded version
[2023-11-29 16:52:29 INFO UpgradeScene 19 20512] Start executing: su - upgrade_1129 <<EOF
source /home/upgrade_1129/gaussdb.bashrc
gs_upgradectl -t auto-rollback -X /opt/upgrade_1129/1129/pkg/upgrade_1129.xml
EOF
[2023-11-29 16:56:10 INFO UpgradeScene 19 20512] Success: Static configuration matched with old static configuration files.
Performing inplace rollback.
Checking static configuration files.
Successfully checked static configuration files.
Restoring cluster configuration.
Successfully rollback hotpatch config file.
Successfully restored cluster configuration.
Start roll back CM instance.
Switch symbolic link to old binary directory.
Successfully switch symbolic link to old binary directory.
Stop cluster with gs_om successfully.
Restoring application and configurations.
Successfully restored application and configuration.
Restoring cluster configuration.
Successfully rollback hotpatch config file.
Successfully restored cluster configuration.
Clean up backup catalog files.
Start check CMS parameter.
Old cluster version number less than 92574.
Successfully cleaned new install path.
Rollback succeeded.
Last login: Wed Nov 29 16:52:28 CST 2023
[2023-11-29 16:56:10 INFO UpgradeScene 19 20512] Start executing: su - upgrade_1129 -c "source /home/upgrade_1129/gaussdb.bashrc; gaussdb -V"
[2023-11-29 16:56:10 INFO UpgradeScene 19 20512] Success: gaussdb (openGauss 3.0.2 build 74914a8d) compiled at 2022-11-17 17:02:30 commit 0 last mr
Upgrade again
[2023-11-29 16:59:20 INFO UpgradeScene 19 20512] Start executing: su - upgrade_1129 <<EOF
source /home/upgrade_1129/gaussdb.bashrc && gs_upgradectl -t auto-upgrade -X /opt/upgrade_1129/1129/pkg/upgrade_1129.xml
EOF
[2023-11-29 17:06:21 INFO UpgradeScene 19 20512] Success: Static configuration matched with old static configuration files.
Performing inplace rollback.
Rollback succeeded.
Checking upgrade environment.
Successfully checked upgrade environment.
Wait for the cluster status normal or degrade.
Start check CMS parameter.
Old cluster version number less than 92574.
Start to do health check.
Successfully checked cluster status.
Backing up current application and configurations.
Successfully backed up current application and configurations.
Stop cluster with gs_om successfully.
Backing up cluster configuration.
Successfully backup hotpatch config file.
Successfully backed up cluster configuration.
Installing new binary.
Restoring cluster configuration.
Successfully restored cluster configuration.
Stop cluster with gs_om successfully.
Modifying the socket path.
Successfully modified socket path.
copy certs from /data/upgrade_1129/cluster/app_74914a8d to /data/upgrade_1129/cluster/app_d855a18f.
Successfully copy certs from /data/upgrade_1129/cluster/app_74914a8d to /data/upgrade_1129/cluster/app_d855a18f.
Stop cluster with gs_om successfully.
Switch symbolic link to new binary directory.
Successfully switch symbolic link to new binary directory.
Stop cluster with gs_om successfully.
Waiting for the cluster status to become normal.
.
The cluster status is normal.
Start to do health check.
Successfully checked cluster status.
Upgrade main process has been finished, user can do some check now.
Once the check done, please execute following command to commit upgrade:
gs_upgradectl -t commit-upgrade -X /opt/upgrade_1129/1129/pkg/upgrade_1129.xml
[2023-11-29 17:06:21 INFO UpgradeScene 19 20512] Post-upgrade verification
[2023-11-29 17:06:21 INFO UpgradeScene 19 20512] 1. Version verification
[2023-11-29 17:06:21 INFO UpgradeScene 19 20512] Start executing: su - upgrade_1129 <<EOF
source /home/upgrade_1129/gaussdb.bashrc
gs_ssh -c "gsql -V"
EOF
[2023-11-29 17:06:23 INFO UpgradeScene 19 20512] Success: Successfully execute command on all nodes.
Output:
[SUCCESS] kwepwebenv06293:
gsql (openGauss 5.0.1 build d855a18f) compiled at 2023-11-28 19:13:55 commit 0 last mr
[SUCCESS] kwepwebenv07953:
gsql (openGauss 5.0.1 build d855a18f) compiled at 2023-11-28 19:13:55 commit 0 last mr
[SUCCESS] kwepwebenv07954:
gsql (openGauss 5.0.1 build d855a18f) compiled at 2023-11-28 19:13:55 commit 0 last mr
[2023-11-29 17:06:28 INFO UpgradeScene 19 20512] 3. Check database status
[2023-11-29 17:06:28 INFO UpgradeScene 19 20512] Start executing: su - upgrade_1129 <<EOF
source /home/upgrade_1129/gaussdb.bashrc
gs_om -t status --all
EOF
[2023-11-29 17:06:29 INFO UpgradeScene 19 20512] Success: -----------------------------------------------------------------------
cluster_state : Normal
redistributing : No
balanced : Yes
-----------------------------------------------------------------------
node : 1
node_name : kwepwebenv06293
node : 1
instance_id : 1
node_ip : xx.xx.xx.1
data_path : /data/upgrade_1129/cluster/cm/cm_server
type : CMServer
instance_state : Standby
node : 1
instance_id : 6001
node_ip : xx.xx.xx.1
data_path : /data/upgrade_1129/cluster/dn1
type : Datanode
instance_state : Primary
Commit
[2023-11-29 17:06:29 INFO UpgradeScene 19 20512] Commit the upgrade
[2023-11-29 17:06:29 INFO UpgradeScene 19 20512] Start executing: su - upgrade_1129 <<EOF
source /home/upgrade_1129/gaussdb.bashrc
gs_upgradectl -t commit-upgrade -X /opt/upgrade_1129/1129/pkg/upgrade_1129.xml
EOF
[2023-11-29 17:07:46 INFO UpgradeScene 19 20512] Success: NOTICE: Start to commit binary upgrade.
Start to check whether can be committed.
Can be committed.
Start to set commit flag.
Set commit flag succeeded.
Start to do operations that cannot be rollback.
Wait for the cluster status normal or degrade.
Cancel the upgrade status succeeded.
Start to clean temp files for upgrade.
Start check CMS parameter.
Old cluster version number less than 92574.
Clean up backup catalog files.
Successfully cleaned old install path.
Stop cluster with gs_om successfully.
Clean temp files for upgrade succeeded.
NOTICE: Commit binary upgrade succeeded.
Last login: Wed Nov 29 17:06:28 CST 2023
[2023-11-29 17:07:46 INFO UpgradeScene 19 20512] Version verification
[2023-11-29 17:07:46 INFO UpgradeScene 19 20512] Start executing: su - upgrade_1129 <<EOF
source /home/upgrade_1129/gaussdb.bashrc
gs_ssh -c "gaussdb -V"
EOF
[2023-11-29 17:07:55 INFO UpgradeScene 19 20512] Success: Successfully execute command on all nodes.
Output:
[SUCCESS] kwepwebenv06293:
gaussdb (openGauss 5.0.1 build d855a18f) compiled at 2023-11-28 19:13:55 commit 0 last mr
[2023-11-29 17:08:02 INFO UpgradeScene 19 20512] 3. Check database status
[2023-11-29 17:08:02 INFO UpgradeScene 19 20512] Start executing: su - upgrade_1129 <<EOF
source /home/upgrade_1129/gaussdb.bashrc
gs_om -t status --all
EOF
[2023-11-29 17:08:03 INFO UpgradeScene 19 20512] Success: -----------------------------------------------------------------------
cluster_state : Normal
redistributing : No
balanced : Yes
-----------------------------------------------------------------------
node : 1
node_name : kwepwebenv06293
node : 1
instance_id : 1
node_ip : xx.xx.xx.1
data_path : /data/upgrade_1129/cluster/cm/cm_server
type : CMServer
instance_state : Standby
node : 1
instance_id : 6001
node_ip : xx.xx.xx.1
data_path : /data/upgrade_1129/cluster/dn1
type : Datanode
instance_state : Primary