openGauss / openGauss-server

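[Test type: tool functionality] [Test version: 5.0.1] [Upgrade] In-place upgrade from 2.0.1 to 5.0.1: the database fails to start when the upgrade is committed, and a core dump is produced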

Status: Accepted
Type: Defect
Created: 2023-11-30 19:48

[Title/description]: In-place upgrade from 2.0.1 to 5.0.1: the database fails to start when the upgrade is committed, and a core dump is produced

[Operating system and hardware information] (query commands: cat /etc/system-release, uname -a):
[upgrade_1130@kwepwebenv07953 bin]$ cat /etc/system-release
CentOS Linux release 7.6.1810 (Core)
[upgrade_1130@kwepwebenv07953 bin]$ uname -a
Linux kwepwebenv07953 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
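[Test environment] (standalone / 1 primary, x standbys, x cascaded standbys):
One primary, two standbys
[Feature under test]:
Upgrade
[Test type]:
Functional
[Database version] (query command: gaussdb -V):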
gsql (openGauss 2.0.1 build d97c0e8a) compiled at 2021-06-02 19:37:17 commit 0 last mr
gsql (openGauss 5.0.1 build d855a18f) compiled at 2023-11-28 19:13:55 commit 0 last mr

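[Preconditions]:
Not applicable
[Steps] (please give detailed steps):
1. Configure the direct upgrade path and versions in CI, then run the job
   Upgrade path: 2.0.1 to 5.0.1, in-place upgrade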

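[Expected output]:
The upgrade succeeds
[Actual output]:
The database fails to start when the upgrade is committed, and a core dump is produced
[Screenshot attached in the original issue]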
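[Cause analysis]:

  1. Root cause of this issue
  2. How the problem was deduced
  3. Other causes that could produce similar symptoms
  4. Whether a temporary workaround exists
  5. Solution
  6. Estimated time to fix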

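[Log information] (please attach log files, screenshots, and coredump information):

Error output when committing the upgrade:

[2023-11-30 17:25:23 INFO UpgradeScene 19 1877] Commit upgrade
[2023-11-30 17:25:23 INFO UpgradeScene 19 1877] Executing: su - upgrade_1130 <<EOF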
source /home/upgrade_1130/gaussdb.bashrc
gs_upgradectl -t commit-upgrade -X /opt/upgrade_1130/1130/pkg/upgrade_1130.xml
EOF
[2023-11-30 17:26:11 INFO UpgradeScene 19 1877] Success: NOTICE: Start to commit binary upgrade.
Start to check whether can be committed.
Can be committed.
Start to set commit flag.
Set commit flag succeeded.
Start to do operations that cannot be rollback.
Cancel the upgrade status succeeded.
Start to clean temp files for upgrade.
Clean up backup catalog files.
Successfully cleaned old install path.
[FAILURE] kwepwebenv06293:
[GAUSS-51607] : Failed to start instance. Error: Please check the gs_ctl log for failure details.
[2023-11-30 17:26:04.269][2134][][gs_ctl]: gs_ctl started,datadir is /data/upgrade_1130/cluster/dn1
[2023-11-30 17:26:04.332][2134][][gs_ctl]: waiting for server to start...
.0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

0 LOG:  [Alarm Module]Host Name: kwepwebenv06293

0 LOG:  [Alarm Module]Host IP: kwepwebenv06293. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>

0 LOG:  [Alarm Module]Cluster Name: upgrade_1130

0 LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58

0 WARNING:  failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING:  failed to parse feature control file: gaussdb.version.
0 WARNING:  Failed to load the product control file, so gaussdb cannot distinguish product version.
0 LOG:  bbox_dump_path is set to /core/corefile/
2023-11-30 17:26:04.460 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 DB010  0 [REDO] LOG:  Recovery parallelism, cpu count = 8, max = 4, actual = 4
2023-11-30 17:26:04.460 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 DB010  0 [REDO] LOG:  ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
2023-11-30 17:26:04.473 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

2023-11-30 17:26:04.473 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Host Name: kwepwebenv06293

2023-11-30 17:26:04.473 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Host IP: 10.247.86.215

2023-11-30 17:26:04.473 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Cluster Name: upgrade_1130

2023-11-30 17:26:04.474 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58

2023-11-30 17:26:04.479 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  loaded library "security_plugin"
2023-11-30 17:26:04.481 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 01000  0 [BACKEND] WARNING:  could not create any HA TCP/IP sockets
2023-11-30 17:26:04.486 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  InitNuma numaNodeNum: 1 numa_distribute_mode: none inheritThreadPool: 0.
2023-11-30 17:26:04.486 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 01000  0 [BACKEND] WARNING:  Failed to initialize the memory protect for g_instance.attr.attr_storage.cstore_buffers (1024 Mbytes) or shared memory (4475 Mbytes) is larger.
2023-11-30 17:26:04.607 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 00000  0 [CACHE] LOG:  set data cache  size(805306368)
2023-11-30 17:26:05.262 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 00000  0 [SEGMENT_PAGE] LOG:  Segment-page constants: DF_MAP_SIZE: 8156, DF_MAP_BIT_CNT: 65248, DF_MAP_GROUP_EXTENTS: 4175872, IPBLOCK_SIZE: 8168, EXTENTS_PER_IPBLOCK: 1021, IPBLOCK_GROUP_SIZE: 4090, BMT_HEADER_LEVEL0_TOTAL_PAGES: 8323072, BktMapEntryNumberPerBlock: 2038, BktMapBlockNumber: 25, BktBitMaxMapCnt: 512
2023-11-30 17:26:05.358 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  gaussdb: fsync file "/data/upgrade_1130/cluster/dn1/gaussdb.state.temp" success
2023-11-30 17:26:05.358 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  create gaussdb state file success: db state(STARTING_STATE), server mode(Primary), connection index(1)
2023-11-30 17:26:05.395 6568552c.1 [unknown] 139779193811264 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  max_safe_fds = 977, usable_fds = 1000, already_open = 13
bbox_dump_path is set to /core/corefile/
.[2023-11-30 17:26:11.594][2134][][gs_ctl]:  gaussDB state is Coredump

[2023-11-30 17:26:11.594][2134][][gs_ctl]: stopped waiting
[2023-11-30 17:26:11.594][2134][][gs_ctl]: could not start server
Examine the log output.
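
When triaging output like the above, it can help to pull out only the lines that signal a failed start. The following is a minimal sketch, assuming the commit output has been saved as plain text. The function names and the marker list are illustrative (the markers are copied from the log above and are not exhaustive), and none of this is part of the openGauss tooling.

```python
import re

# Failure markers copied from the log excerpt above; illustrative, not exhaustive.
FAILURE_MARKERS = (
    "[GAUSS-51607]",
    "gaussDB state is Coredump",
    "could not start server",
)

# Matches the "[timestamp][pid][][gs_ctl]:" prefix as it appears in this log,
# including the optional leading progress dot.
GS_CTL_PREFIX = re.compile(
    r"^\.?\[(?P<ts>[\d\- :.]+)\]\[(?P<pid>\d+)\]\[\]\[gs_ctl\]:\s*(?P<msg>.*)$"
)


def find_startup_failures(log_text):
    """Return (line_number, line) pairs for lines containing a failure marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in FAILURE_MARKERS):
            hits.append((number, line.strip()))
    return hits


def gs_ctl_messages(log_text):
    """Return (timestamp, message) pairs for lines carrying the gs_ctl prefix."""
    messages = []
    for line in log_text.splitlines():
        match = GS_CTL_PREFIX.match(line.strip())
        if match:
            messages.append((match.group("ts"), match.group("msg").strip()))
    return messages
```

Run against the excerpt above, `find_startup_failures` would flag the GAUSS-51607 line, the Coredump state line, and the final "could not start server" line.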

Primary node stack trace:
[Two screenshots attached in the original issue]

[Test code]:
Not applicable

Comments (4)

lixin created the defect


lixin set the assignee to 胡正超
lixin added collaborator 周斌
lixin set the associated project to openGauss 5.0.0 community
lixin set the priority to Major
lixin set the associated branch to master

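Judging from the error, the double-write file is missing, so opening it fails. I suggest reviewing the OM tool's in-place upgrade flow to check whether copying the double-write file was left out.
Could the upgrade owner please handle this? @周斌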
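
The check suggested in this comment (whether the double-write files survived the in-place upgrade) could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the file names `pg_dw` and `pg_dw_single` are an assumption about the data directory layout and should be verified against the openGauss version in use, and `missing_double_write_files` is a hypothetical helper, not an OM tool function.

```python
from pathlib import Path

# ASSUMPTION: candidate double-write file names in the data directory.
# Verify these against the target openGauss version before relying on them.
ASSUMED_DW_FILES = ("pg_dw", "pg_dw_single")


def missing_double_write_files(data_dir, expected=ASSUMED_DW_FILES):
    """Return the expected file names that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]
```

For the failing node, the data directory reported by gs_ctl was /data/upgrade_1130/cluster/dn1, so a call such as `missing_double_write_files("/data/upgrade_1130/cluster/dn1")` before and after the upgrade would show whether a file went missing along the way.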

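胡正超 added collaborator 胡正超
胡正超 changed the assignee from 胡正超 to 周斌
胡正超 removed collaborator 周斌
胡正超 removed collaborator 胡正超
周斌 added collaborator 周斌
周斌 changed the assignee from 周斌 to 薛蒙恩
周斌 added collaborator liuheng
周斌 edited the remarks
薛蒙恩 changed the task status from To do to Confirmed
薛蒙恩 changed the task status from Confirmed to Fixing
薛蒙恩 changed the task status from Fixing to Completed
薛蒙恩 edited the remarks
薛蒙恩 changed the task status from Completed to Pending regression
jiexiao1413 changed the task status from Pending regression to Testing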

Acceptance date: 2023/12/19
Acceptance versions: gsql (openGauss 2.0.1 build d97c0e8a) compiled at 2021-06-02 19:37:17 commit 0 last mr
gsql (openGauss 5.0.1 build 33b035fd) compiled at 2023-12-15 20:19:06 commit 0 last mr
Acceptance result: Passed
Version before upgrade

[2023-12-18 14:54:12 INFO UpgradeScene 19 19455] 1. Get the pre-upgrade version
[2023-12-18 14:54:12 INFO UpgradeScene 19 19455] Executing: su - upgrade_1218 <<EOF
source /home/upgrade_1218/gaussdb.bashrc
gs_ssh -c "gsql -V"
EOF
[2023-12-18 14:54:14 INFO UpgradeScene 19 19455] Success: Successfully execute command on all nodes.

Output:
[SUCCESS] kwemhisprc10436:
gsql (openGauss 2.0.1 build d97c0e8a) compiled at 2021-06-02 19:37:17 commit 0 last mr

In-place upgrade

[2023-12-18 14:57:52 INFO UpgradeScene 19 19455] Executing: su - upgrade_1218 <<EOF
source /home/upgrade_1218/gaussdb.bashrc && gs_upgradectl -t auto-upgrade -X /opt/upgrade_1218/1218/pkg/upgrade_1218.xml
EOF
[2023-12-18 15:04:13 INFO UpgradeScene 19 19455] Success: Static configuration matched with old static configuration files.
Performing inplace rollback.
Rollback succeeded.
Checking upgrade environment.
Successfully checked upgrade environment.
Successfully started cluster.
Start to do health check.
Successfully checked cluster status.
Backing up current application and configurations.
Successfully backed up current application and configurations.
Backing up cluster configuration.
Successfully backup hotpatch config file.
Successfully backed up cluster configuration.
Installing new binary.
Restoring cluster configuration.
Successfully restored cluster configuration.
Successfully started cluster.
Start check CMS parameter.
Modifying the socket path.
Successfully modified socket path.
Successfully started cluster.
copy certs from /data/upgrade_1218/cluster/app_d97c0e8a to /data/upgrade_1218/cluster/app_33b035fd.
Successfully copy certs from /data/upgrade_1218/cluster/app_d97c0e8a to /data/upgrade_1218/cluster/app_33b035fd.
Switch symbolic link to new binary directory.
Successfully switch symbolic link to new binary directory.
Successfully started cluster.
Successfully started cluster.
Waiting for the cluster status to become normal.
.
The cluster status is normal.
Create checkpoint before switching.
Start to do health check.
Successfully checked cluster status.
Upgrade main process has been finished, user can do some check now.
Once the check done, please execute following command to commit upgrade:

    gs_upgradectl -t commit-upgrade -X /opt/upgrade_1218/1218/pkg/upgrade_1218.xml

Last login: Mon Dec 18 14:57:47 CST 2023
[2023-12-18 15:04:13 INFO UpgradeScene 19 19455] Post-upgrade verification
[2023-12-18 15:04:13 INFO UpgradeScene 19 19455] 1. Version verification
[2023-12-18 15:04:13 INFO UpgradeScene 19 19455] Executing: su - upgrade_1218 <<EOF
source /home/upgrade_1218/gaussdb.bashrc
gs_ssh -c "gsql -V"
EOF
[2023-12-18 15:04:16 INFO UpgradeScene 19 19455] Success: Successfully execute command on all nodes.

Output:
[SUCCESS] kwemhisprc10436:
gsql (openGauss 5.0.1 build 33b035fd) compiled at 2023-12-15 20:19:06 commit 0 last mr

Rollback

[2023-12-18 15:04:25 INFO UpgradeScene 19 19455] Start rollback
[2023-12-18 15:04:25 INFO UpgradeScene 19 19455] Roll back the upgraded version
[2023-12-18 15:04:25 INFO UpgradeScene 19 19455] Executing: su - upgrade_1218 <<EOF
source /home/upgrade_1218/gaussdb.bashrc
gs_upgradectl -t auto-rollback -X /opt/upgrade_1218/1218/pkg/upgrade_1218.xml
EOF
[2023-12-18 15:06:43 INFO UpgradeScene 19 19455] Success: Static configuration matched with old static configuration files.
Performing inplace rollback.
Checking static configuration files.
Successfully checked static configuration files.
Successfully started cluster.
Restoring cluster configuration.
Successfully rollback hotpatch config file.
Successfully restored cluster configuration.
Start roll back CM instance.
Switch symbolic link to old binary directory.
Successfully switch symbolic link to old binary directory.
Successfully started cluster.
Restoring application and configurations.
Successfully restored application and configuration.
Restoring cluster configuration.
Successfully rollback hotpatch config file.
Successfully restored cluster configuration.
Clean up backup catalog files.
Successfully started cluster.
Successfully cleaned new install path.
Rollback succeeded.
Last login: Mon Dec 18 15:04:23 CST 2023
[2023-12-18 15:06:43 INFO UpgradeScene 19 19455] Executing: su - upgrade_1218 -c "source /home/upgrade_1218/gaussdb.bashrc; gaussdb -V"
[2023-12-18 15:06:43 INFO UpgradeScene 19 19455] Success: gaussdb (openGauss 2.0.1 build d97c0e8a) compiled at 2021-06-02 19:37:17 commit 0 last mr

In-place upgrade

[2023-12-18 15:09:59 INFO UpgradeScene 19 19455] Executing: su - upgrade_1218 <<EOF
source /home/upgrade_1218/gaussdb.bashrc && gs_upgradectl -t auto-upgrade -X /opt/upgrade_1218/1218/pkg/upgrade_1218.xml
EOF
[2023-12-18 15:15:17 INFO UpgradeScene 19 19455] Success: Static configuration matched with old static configuration files.
Performing inplace rollback.
Rollback succeeded.
Checking upgrade environment.
Successfully checked upgrade environment.
Successfully started cluster.
Start to do health check.
Successfully checked cluster status.
Backing up current application and configurations.
Successfully backed up current application and configurations.
Backing up cluster configuration.
Successfully backup hotpatch config file.
Successfully backed up cluster configuration.
Installing new binary.
Restoring cluster configuration.
Successfully restored cluster configuration.
Successfully started cluster.
Start check CMS parameter.
Modifying the socket path.
Successfully modified socket path.
Successfully started cluster.
copy certs from /data/upgrade_1218/cluster/app_d97c0e8a to /data/upgrade_1218/cluster/app_33b035fd.
Successfully copy certs from /data/upgrade_1218/cluster/app_d97c0e8a to /data/upgrade_1218/cluster/app_33b035fd.
Switch symbolic link to new binary directory.
Successfully switch symbolic link to new binary directory.
Successfully started cluster.
Successfully started cluster.
Waiting for the cluster status to become normal.
.
The cluster status is normal.
Create checkpoint before switching.
Start to do health check.
Successfully checked cluster status.
Upgrade main process has been finished, user can do some check now.
Once the check done, please execute following command to commit upgrade:

    gs_upgradectl -t commit-upgrade -X /opt/upgrade_1218/1218/pkg/upgrade_1218.xml
[2023-12-18 15:15:17 INFO UpgradeScene 19 19455] Post-upgrade verification
[2023-12-18 15:15:17 INFO UpgradeScene 19 19455] 1. Version verification
[2023-12-18 15:15:17 INFO UpgradeScene 19 19455] Executing: su - upgrade_1218 <<EOF
source /home/upgrade_1218/gaussdb.bashrc
gs_ssh -c "gsql -V"
EOF
[2023-12-18 15:15:20 INFO UpgradeScene 19 19455] Success: Successfully execute command on all nodes.

Output:
[SUCCESS] kwemhisprc10436:
gsql (openGauss 5.0.1 build 33b035fd) compiled at 2023-12-15 20:19:06 commit 0 last mr

Commit upgrade

[2023-12-18 15:15:28 INFO UpgradeScene 19 19455] Finally, commit the upgrade
[2023-12-18 15:15:28 INFO UpgradeScene 19 19455] Commit upgrade
[2023-12-18 15:15:28 INFO UpgradeScene 19 19455] Executing: su - upgrade_1218 <<EOF
source /home/upgrade_1218/gaussdb.bashrc
gs_upgradectl -t commit-upgrade -X /opt/upgrade_1218/1218/pkg/upgrade_1218.xml
EOF
[2023-12-18 15:16:15 INFO UpgradeScene 19 19455] Success: NOTICE: Start to commit binary upgrade.
Start to check whether can be committed.
Can be committed.
Start to set commit flag.
Set commit flag succeeded.
Start to do operations that cannot be rollback.
Cancel the upgrade status succeeded.
Start to clean temp files for upgrade.
Clean up backup catalog files.
Successfully cleaned old install path.
Successfully started cluster.
Clean temp files for upgrade succeeded.
NOTICE: Commit binary upgrade succeeded.
Last login: Mon Dec 18 15:15:26 CST 2023

[2023-12-18 15:16:15 INFO UpgradeScene 19 19455] Version verification
[2023-12-18 15:16:15 INFO UpgradeScene 19 19455] Executing: su - upgrade_1218 <<EOF
source /home/upgrade_1218/gaussdb.bashrc
gs_ssh -c "gaussdb -V"
EOF
[2023-12-18 15:16:17 INFO UpgradeScene 19 19455] Success: Successfully execute command on all nodes.

Output:
[SUCCESS] kwemhisprc10436:
gaussdb (openGauss 5.0.1 build 33b035fd) compiled at 2023-12-15 20:19:06 commit 0 last mr
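
The version checks in this acceptance run compare banners printed by `gsql -V` and `gaussdb -V`. A minimal sketch of doing that comparison programmatically is shown below. The regular expression is inferred only from the banner lines quoted in this issue, and `parse_version_banner` and `upgraded` are hypothetical helper names, not part of any openGauss tool.

```python
import re

# Pattern inferred from the banners quoted in this issue, for example:
# "gsql (openGauss 5.0.1 build 33b035fd) compiled at 2023-12-15 20:19:06 commit 0 last mr"
VERSION_RE = re.compile(
    r"^(?P<tool>\w+) \(openGauss (?P<version>\d+(?:\.\d+)*) build (?P<build>[0-9a-f]+)\) "
    r"compiled at (?P<compiled>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)


def parse_version_banner(line):
    """Return tool, version, build, and compile time as a dict, or None if unrecognised."""
    match = VERSION_RE.match(line.strip())
    return match.groupdict() if match else None


def _as_tuple(version):
    """Turn '5.0.1' into (5, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def upgraded(before, after):
    """Return True if the version in `after` is numerically newer than in `before`."""
    old, new = parse_version_banner(before), parse_version_banner(after)
    if old is None or new is None:
        raise ValueError("unrecognised version banner")
    return _as_tuple(new["version"]) > _as_tuple(old["version"])
```

Comparing the build hash as well as the version number matters here, because the failing and the accepted runs both report 5.0.1 but with different builds (d855a18f and 33b035fd).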

lixin changed the task status from Testing to Accepted
