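[Title]:
[Test type: SQL function / storage function / interface function / tool function / performance / concurrency / long-term stress stability / fault injection / security / documentation / coding standards] [Test version: x.x.x] Problem description
[OS and hardware information] (query commands: cat /etc/system-release, uname -a):
[Test environment] (standalone / 1 primary, x standbys, x cascaded standbys):
Standalone
[Feature under test]:
Backup
[Test type]:
[Database version] (query command: gaussdb -V):
master
[Preconditions]:
[Steps] (please describe the steps in detail):
[Expected output]:
[Actual output]:
[Root cause analysis]: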
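!4494: gs_probackup supports PITR
This PR also sets backupEndRequired in the pg_start_backup scenario. Because pg_stop_backup was never executed, no backup_end xlog record exists, so startup fails. After reverting this PR, the database starts successfully.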
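The root cause above can be sketched as a small model. This is a simplified illustration, not openGauss code: `backup_end_required` and `can_finish_recovery` are hypothetical helpers, and the `with_regression` flag is an assumption used only to switch between the behavior introduced by !4494 and the behavior after reverting it.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not an openGauss function): decides from the
// BACKUP METHOD value in backup_label whether recovery must see a
// backup-end xlog record before it may finish.
inline bool backup_end_required(const std::string& backup_method, bool with_regression)
{
    if (backup_method == "streamed") {
        return true;  // streamed backups are expected to end with a backup-end record
    }
    if (with_regression && backup_method == "pg_start_backup") {
        return true;  // the extra branch described in the root cause analysis
    }
    return false;
}

// Hypothetical helper: recovery may finish only if no backup-end record
// is required, or if one was actually found in the xlog.
inline bool can_finish_recovery(bool end_required, bool has_backup_end_record)
{
    return !end_required || has_backup_end_record;
}
```

In this model, a crash after pg_start_backup without pg_stop_backup corresponds to `has_backup_end_record == false`, so `can_finish_recovery(backup_end_required("pg_start_backup", true), false)` is false, while the same call with `with_regression == false` is true.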
[Log information] (please attach log files, screenshots, and coredump information):
[Test code]:
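When I verified this yesterday, my branch code may not have been set up correctly; this issue does appear to reproduce every time. As for the fix, I am considering following PG's check logic and removing the check on backupStartPoint, roughly as follows.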
/*
 * Have we reached the point where our base backup was completed?
 */
if (!XLogRecPtrIsInvalid(backupEndPoint) &&
    backupEndPoint <= lastReplayedEndRecPtr)
{
    elog(DEBUG1, "end of backup reached");
    /*
     * We have reached the end of base backup, as indicated by pg_control.
     * Update the control file accordingly.
     */
    ReachedEndOfBackup(lastReplayedEndRecPtr, lastReplayedTLI);
    backupStartPoint = InvalidXLogRecPtr;
    backupEndPoint = InvalidXLogRecPtr;
    backupEndRequired = false;
}
However, I am also concerned that deleting this code could have other side effects, so the patch I am currently considering would look like this:
Subject: [PATCH] test
---
Index: src/gausskernel/storage/access/transam/xlog.cpp
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/gausskernel/storage/access/transam/xlog.cpp b/src/gausskernel/storage/access/transam/xlog.cpp
--- a/src/gausskernel/storage/access/transam/xlog.cpp (revision 45f6613aec7194e0028e792b9062ef9aabab2c3d)
+++ b/src/gausskernel/storage/access/transam/xlog.cpp (date 1701238723240)
@@ -11319,9 +11319,11 @@
return;
/*
* Have we reached the point where our base backup was completed?
+ * backupStartPoint may clear when XLOG_BACKUP_END redo
*/
if (!XLogRecPtrIsInvalid(t_thrd.shemem_ptr_cxt.ControlFile->backupEndPoint) &&
- !XLogRecPtrIsInvalid(t_thrd.shemem_ptr_cxt.ControlFile->backupStartPoint) &&
+ (!t_thrd.shemem_ptr_cxt.ControlFile->backupEndRequired ||
+ !XLogRecPtrIsInvalid(t_thrd.shemem_ptr_cxt.ControlFile->backupStartPoint)) &&
XLByteLE(t_thrd.shemem_ptr_cxt.ControlFile->backupEndPoint, lastReplayedEndRecPtr)) {
/*
* We have reached the end of base backup, as indicated by pg_control.
@@ -14147,8 +14149,7 @@
rc = memcpy_s(&startpoint, sizeof(startpoint), XLogRecGetData(record), sizeof(startpoint));
securec_check(rc, "", "");
- if (XLByteEQ(t_thrd.shemem_ptr_cxt.ControlFile->backupStartPoint, startpoint) &&
- t_thrd.shemem_ptr_cxt.ControlFile->backupEndRequired) {
+ if (XLByteEQ(t_thrd.shemem_ptr_cxt.ControlFile->backupStartPoint, startpoint)) {
/*
* We have reached the end of base backup, the point where
* pg_stop_backup() was done. The data on disk is now consistent.
@@ -16739,8 +16740,6 @@
if (fscanf_s(lfp, "BACKUP METHOD: %19s\n", backuptype, sizeof(backuptype)) == 1) {
if (strcmp(backuptype, "streamed") == 0) {
*backupEndRequired = true;
- } else if (strcmp(backuptype, "pg_start_backup") == 0) {
- *backupEndRequired = true;
}
}
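For this check, adding this branch should keep the impact to a minimum.
The reason it was introduced: a full build produces an XLOG_BACKUP_END record, and this record carries the start_point marker. A subsequent incremental build has the same start_point, and the logs it brings over contain this XLOG_BACKUP_END record, so replay satisfies this condition and processes it. Once backupStartPoint has been cleared, the later check_recovery_consistency function can no longer call backup_cut_xlog_file to reset backupEndPoint, because backupStartPoint is now 0, so the value is left behind in the controlfile.
That is why the check in problem 2 was introduced.
From what I see in PG's implementation, problem 1 has no backupStartPoint check, so under normal circumstances it could simply be removed. To reduce the impact of removing it, a check on backupEndRequired was added, bringing the condition from problem 2 into this check.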
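The change to the consistency check can be modeled in isolation. The following is a minimal sketch under stated assumptions: `XLogRecPtr` is reduced to a plain integer with 0 meaning invalid, `ControlState` is a hypothetical subset of the control file, and `reached_end_old` / `reached_end_new` are illustrative names for the condition before and after the proposed patch, not functions in xlog.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the real type; 0 is treated as "invalid".
using XLogRecPtr = std::uint64_t;

// Hypothetical subset of the control file fields used by the check.
struct ControlState {
    XLogRecPtr backupStartPoint;
    XLogRecPtr backupEndPoint;
    bool backupEndRequired;
};

// Condition before the patch: both points must be valid.
inline bool reached_end_old(const ControlState& c, XLogRecPtr lastReplayed)
{
    return c.backupEndPoint != 0 &&
           c.backupStartPoint != 0 &&
           c.backupEndPoint <= lastReplayed;
}

// Condition after the patch: backupStartPoint only matters while a
// backup-end record is still required.
inline bool reached_end_new(const ControlState& c, XLogRecPtr lastReplayed)
{
    return c.backupEndPoint != 0 &&
           (!c.backupEndRequired || c.backupStartPoint != 0) &&
           c.backupEndPoint <= lastReplayed;
}
```

With `backupStartPoint` already cleared, `backupEndPoint` still set, and `backupEndRequired` false, the old condition never fires and the end point stays in the control file, whereas the new condition fires once replay passes the end point. When `backupEndRequired` is true, both conditions behave the same.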
OK, I'll give it a try.
Verified version:
openGauss=# select version();
version
--------------------------------------------------------------------------------------------------------------------------------------------------------
(openGauss 5.1.1 build 01b191f0) compiled at 2023-12-07 15:17:01 commit 0 last mr on aarch64-unknown-linux-gnu, compiled by g++ (GCC) 10.3.1, 64-bit
(1 row)
openGauss=# select pg_start_backup('test');
pg_start_backup
-----------------
0/3000028
(1 row)
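Kill the database
Database status is abnormal
Residual files remain in the directory
Restart the database: the database starts normally
Acceptance result: passed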
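Acceptance version:
Acceptance screenshots:
1. select pg_start_backup('test');
2. Open another window and kill the process
3. Database status is abnormal
4. Residual files remain in the directory
5. Restart the database; the database starts normally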