Sign in
Sign up
Explore
Enterprise
Education
Search
Help
Terms of use
About Us
Explore
Enterprise
Education
Gitee Premium
Gitee AI
AI teammates
Sign in
Sign up
Fetch the repository succeeded.
Open Source
>
Database Related
>
Database Service
&&
Donate
Please sign in before you donate.
Cancel
Sign in
Scan WeChat QR to Pay
Cancel
Complete
Prompt
Switch to Alipay.
OK
Cancel
Watch
Unwatch
Watching
Releases Only
Ignoring
443
Star
1.5K
Fork
1.8K
openGauss
/
openGauss-server
Code
Issues
977
Pull Requests
166
Wiki
Insights
Pipelines
Service
Quality Analysis
Jenkins for Gitee
Tencent CloudBase
Tencent Cloud Serverless
悬镜安全
Aliyun SAE
Codeblitz
SBOM
Don’t show this again
Update failed. Please try again later!
Remove this flag
Content Risk Flag
This task is identified by
as the content contains sensitive information such as code security bugs, privacy leaks, etc., so it is only accessible to contributors of this repository.
主节点发送的同步请求Req消息被丢弃并且主节点发送宕机,重启之后整个集群没有主节点
Canceled
#I8MGB4
Bug
huanghj78
Opened this issue
2023-12-07 11:16
<!-- #请认真填写以下信息,否则可能由于无法定位,导致issue无法解决而被取消 --> 【标题描述】:丢弃主节点发送的同步请求Req,并在持续丢弃期间重启主节点,即使后续恢复正常不再丢弃Req,整个集群也没有选举出主节点 【测试类型:SQL功能/存储功能/接口功能/工具功能/性能/并发/压力长稳/故障注入/安全/资料/编码规范】【测试版本:x.x.x】 问题描述 【操作系统和硬件信息】(查询命令: cat /etc/system-release, uname -a): openEuler release 22.03 LTS Linux 3adf3c2c795d 4.15.0-194-generic #205-Ubuntu SMP Fri Sep 16 19:49:27 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux 【测试环境】(单机/1主x备x级联备): 1主2备 【被测功能】: 【测试类型】: 【数据库版本】(查询命令: gaussdb -V): gaussdb (openGauss 5.0.0 build ) compiled at 2023-04-06 15:35:14 commit 0 last mr debug 【预置条件】: 【操作步骤】(请填写详细的操作步骤): 插桩DCF源码rep_leader.c的函数rep_appendlog_node static status_t rep_appendlog_node(uint32 stream_id, uint32 node_id, dcf_role_t default_role, uint64 last_log_index, bool8* node_exists_log) { uint64 old_next_index = (uint64)cm_atomic_get((atomic_t*)&NEXT_INDEX); uint64 log_begin = old_next_index == CM_INVALID_INDEX_ID ? 1 : old_next_index; log_begin = MAX(log_begin, stg_first_index(stream_id)); uint64 log_count = rep_calu_log_count(stream_id, node_id, default_role, log_begin, last_log_index); *node_exists_log = (log_count > 0); /* Logs are sent even if log_count==0. Periodically sending empty logs ensures that lost packets are retransmitted */ mec_message_t pack; CM_RETURN_IFERR(mec_alloc_pack(&pack, MEC_CMD_APPEND_LOG_RPC_REQ, g_cur_node_id, node_id, stream_id)); uint64 pre_log_index = log_begin == CM_INVALID_INDEX_ID ? CM_INVALID_INDEX_ID : log_begin - 1; rep_apendlog_req_t appendlog_req; rep_init_appendlog_head(stream_id, &appendlog_req, pre_log_index, last_log_index); CM_RETURN_IFERR_EX(rep_encode_appendlog_head(&pack, &appendlog_req), mec_release_pack(&pack)); uint32 log_count_pos = mec_get_write_pos(&pack) - sizeof(uint64); uint64 j = 0; uint32 total_size = 0; for (uint64 index = log_begin; j < log_count; index++, j++) { log_entry_t* entry = stg_get_entry(stream_id, index); if (entry == NULL) { break; } total_size += (sizeof(rep_log_t) + ENTRY_SIZE(entry)); if (total_size > MESSAGE_BUFFER_SIZE && j > 0) { LOG_DEBUG_INF("[REP]total_size[%u] is enough, send size[%u]. log_count[%llu], j[%llu]", total_size, (uint32)(total_size - (sizeof(rep_log_t) + ENTRY_SIZE(entry))), log_count, j); stg_entry_dec_ref(entry); break; } status_t ret = rep_encode_one_log(&pack, log_count_pos, j + 1, entry); stg_entry_dec_ref(entry); if (ret != CM_SUCCESS) { mec_release_pack(&pack); LOG_DEBUG_ERR("[REP]encode_one_log fail, index=%llu, j=%llu", index, j); return CM_ERROR; } ps_record1(PS_PACK, index); } appendlog_req.log_count = j; // 在此插桩,伪代码如下 // 重启前插桩 static int cnt = 0; if(appendlog_req.log_count > 0){ cnt++; if cnt == 6 { kill $LEADR_PID } return CM_SUCCESS; } // 重启后插桩 static int cnt = 0; if(appendlog_req.log_count > 0 && cnt < 10){ cnt++; return CM_SUCCESS; } CM_RETURN_IFERR_EX(mec_send_data(&pack), mec_release_pack(&pack)); LOG_DEBUG_INF("[REP]rep send succeed: " REP_APPEND_REQ_FMT, REP_APPEND_REQ_VAL(&pack, &appendlog_req, log_begin)); if (APPEND_MODE == APPEND_NORMAL_MODE) { (void)cm_atomic_cas((atomic_t*)&NEXT_INDEX, old_next_index, log_begin + j); LOG_DEBUG_INF("[REP]set next_index to %llu,stream_id=%u,node_id=%u", NEXT_INDEX, stream_id, node_id); } mec_release_pack(&pack); return CM_SUCCESS; } 【预期输出】: 原主节点宕机之后集群发生选举,选举出新的主节点或原主节点重启之后,发生选举 【实际输出】: 原主节点宕机之后集群没有发生选举,且原主节点重启之后,仍没有发生选举 【原因分析】: 1. 这个问题的根因 2. 问题推断过程 3. 还有哪些原因可能造成类似现象 4. 该问题是否有临时规避措施 5. 问题解决方案 6. 预计修复问题时间 【日志信息】(请附上日志文件、截图、coredump信息): 初始状态:    重启后状态:    从节点日志: 2023-12-07 10:55:22.799 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: start create thread! 2023-12-07 10:55:22.799 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: create thread end! 2023-12-07 10:55:22.813 [unknown] [unknown] localhost 140542188910144 0[0:0#0] 0 [BACKEND] LOG: reaper backend started. 2023-12-07 10:55:22.813 [unknown] [unknown] localhost 140542218335808 0[0:0#0] 0 [BACKEND] LOG: [Alarm Module]alarm checker started. 2023-12-07 10:55:22.818 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [mpfl_ulink_file]: unlink global/max_page_flush_lsn sucessfully! ret:4294967295 2023-12-07 10:55:22.818 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: StartupXLOG: biggest_lsn_in_page is set to FFFFFFFF/FFFFFFFF, enable_update_max_page_flush_lsn:0 2023-12-07 10:55:22.818 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: database system timeline: 17 2023-12-07 10:55:22.818 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: database system was interrupted while in recovery at log time 2023-12-07 10:36:32 CST 2023-12-07 10:55:22.818 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. 2023-12-07 10:55:22.819 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: entering standby mode 2023-12-07 10:55:22.819 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Init of double write for ext finished. 2023-12-07 10:55:22.819 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Double Write init 2023-12-07 10:55:22.821 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid batch meta file info: dw_file_num [1], dw_file_size [256] MB, dw_version [92568] 2023-12-07 10:55:22.829 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid batch meta file info: dw_file_num [1], dw_file_size [256] MB, dw_version [92568] 2023-12-07 10:55:22.841 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid batch file header: id 0, file_head[dwn 1, start 15022] 2023-12-07 10:55:22.848 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid single file header: id 0, file_head[dwn 48, start 0] 2023-12-07 10:55:22.855 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid single file header: id 0, file_head[dwn 48, start 0] 2023-12-07 10:55:22.860 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW batch flush file recovery start. 2023-12-07 10:55:22.928 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW recovery state: "Empty", file start page[dwn 1, start 15022], now access page 0, current [page_id 15022, dwn 1, checksum verify res is 1, page_num orig 0, page_num fixed 0] 2023-12-07 10:55:22.928 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW recovery state: "Finish", file start page[dwn 1, start 15022], now access page 0, current [page_id 15022, dwn 1, checksum verify res is 1, page_num orig 0, page_num fixed 0] 2023-12-07 10:55:22.928 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW batch flush file recovery finish. 2023-12-07 10:55:22.928 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush file recovery start. 2023-12-07 10:55:22.951 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: [single flush] recovery, [first version] dw page is new, break this recovery: buf_tag[rel 0/0/0 blk 0 fork 0], compress: 0 2023-12-07 10:55:22.981 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: [second version] DW single flush file valid item num is 0. 2023-12-07 10:55:22.981 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush file recovery finish. 2023-12-07 10:55:22.987 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush finish recovery, reset the file head[dwn 49, start 0]. 2023-12-07 10:55:22.992 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush finish recovery [second version], reset the file head[dwn 49, start 0]. 2023-12-07 10:55:23.090 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: redo record is at 0/B49D690; shutdown FALSE 2023-12-07 10:55:23.090 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: GTM-Free Mode: start local next csn from checkpoint 1021, next xid 21256 2023-12-07 10:55:23.090 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: Startup: write global globalRecycleXid 20899 from checkpoint 20899 2023-12-07 10:55:23.090 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: startup csnlog without extend at xid:21255, pageno:20 2023-12-07 10:55:23.090 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: ===Not in the streaming DR switchover=== 2023-12-07 10:55:23.090 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: StartupXLOG PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:0/B49D710, oldRedo:0/B49D690, newCkpLoc:0/B49D710, newRedo:0/B49D690, preCkpLoc:0/B49D5F0 2023-12-07 10:55:23.107 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: [Pgstat] remove statfiles in pg_stat_tmp/pgstat.stat, global/pgstat.stat 2023-12-07 10:55:23.110 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: redo minRecoveryPoint at 0/B49D7B0; backupStartPoint at 0/0; backupEndRequired FALSE 2023-12-07 10:55:23.110 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: redo starts at 0/B49D690 2023-12-07 10:55:23.111 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: set knl_g_set_redo_finish_status to false when starting redo 2023-12-07 10:55:23.113 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: sigusr1_handler create WalWriterAuxiliary(140541941450304) after local recovery is done. pmState:2, ServerMode:3 2023-12-07 10:55:23.113 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:23.131 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] WARNING: could not fork new process for connection due to PMstate PM_RECOVERY 2023-12-07 10:55:23.131 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:23.136 [unknown] [unknown] localhost 140542073103936 0[0:0#0] 0 [BACKEND] LOG: invalidate buffer bgwriter started 2023-12-07 10:55:23.137 [unknown] [unknown] localhost 140542027494976 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 0 2023-12-07 10:55:23.138 [unknown] [unknown] localhost 140542006523456 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 1 2023-12-07 10:55:23.138 [unknown] [unknown] localhost 140542052660800 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 2 2023-12-07 10:55:23.139 [unknown] [unknown] localhost 140541941450304 0[0:0#0] 0 [BACKEND] LOG: walwriterauxiliary started 2023-12-07 10:55:23.139 [unknown] [unknown] localhost 140541983454784 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 3 2023-12-07 10:55:23.141 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: checkpointer started, CheckPointTimeout is 60 2023-12-07 10:55:23.142 [unknown] [unknown] localhost 140541964514880 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 4 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: XLogPageRead IsRecoveryDone is set true,ReadRecPtr:0/B49D710, EndRecPtr:0/B49D7B0 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats: the basic statistic during redo are as follows : redo_start_ptr:00000000/0B49D690, redo_start_time:755232922819634, min_recovery_point:00000000/0B49D7B0, read_ptr:00000000/0B49D710, last_replayed_read_Ptr:00000000/0B49D7B0, speed:0 KB/s 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_XLOG: the event io statistic during redo are as follows : total_duration:57, counter:11 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_WRITE_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_PROCESS_PENDING: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_APPLY: the event io statistic during redo are as follows : total_durat ion:0, counter:0 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: XLogPageRead set redo finish status,ReadRecPtr:0/B49D710, EndRecPtr:0/B49D7B0 2023-12-07 10:55:24.118 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:24.119 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: set lsn after recovery done in gaussdb state file 2023-12-07 10:55:24.119 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:24.133 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] WARNING: could not fork new process for connection due to PMstate PM_RECOVERY 2023-12-07 10:55:24.133 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:24.164 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: request xlog stream at 0/B49D7B0. 2023-12-07 10:55:24.164 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: startup shut down walreceiver. 2023-12-07 10:55:24.164 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: wal receiver try to connect to host=172.17.0.3 port=31001 localhost=172.17.0.2 localport=21001 index 1 . 2023-12-07 10:55:24.165 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:24.181 [unknown] [unknown] localhost 140541799953984 0[0:0#0] 0 [BACKEND] LOG: walrcvwriter thread started 2023-12-07 10:55:24.181 [unknown] [unknown] localhost 140541783172672 0[0:0#0] 0 [BACKEND] LOG: walreceiver thread started 2023-12-07 10:55:24.242 [unknown] [unknown] localhost 140541783172672 0[0:0#0] 0 [DCF] LOG: DCF data version, applied index and min applied index read from /home/opengauss/opengauss/data/paxosindex is 1, 80 and 0. 2023-12-07 10:55:24.242 [unknown] [unknown] localhost 140541783172672 0[0:0#0] 0 [BACKEND] LOG: Before start DCF module, node_id = 1, dcf_config = [{"stream_id":1,"node_id":1,"ip":"172.17.0.2","port":21000,"role":"LEADER"},{"stream_id":1,"node_id":2,"ip":"172.17.0.3","port":21000,"role":"FOLLOWER"}, {"stream_id":1,"node_id":3,"ip":"172.17.0.4","port":21000,"role":"FOLLOWER"}] 2023-12-07 10:55:24.715 [unknown] [unknown] localhost 140541783172672 0[0:0#0] 0 [BACKEND] LOG: Start DCF module success. 2023-12-07 10:55:24.715 [unknown] [unknown] localhost 140541783172672 0[0:0#0] 0 [BACKEND] LOG: set knl_g_set_redo_finish_status to false when connecting to the primary 2023-12-07 10:55:25.133 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] WARNING: could not fork new process for connection due to PMstate PM_RECOVERY 2023-12-07 10:55:25.133 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: consistent recovery state reached at 0/B49D7B0 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats: the basic statistic during redo are as follows : redo_start_ptr:00000000/0B49D690, redo_start_time:755232922819634, min_recovery_point:00000000/0B49D7B0, read_ptr:00000000/0B49D710, last_replayed_read_Ptr:00000000/0B49D7B0, speed:0 KB/s 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_XLOG: the event io statistic during redo are as follows : total_duration:57, counter:11 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_WRITE_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_PROCESS_PENDING: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_APPLY: the event io statistic during redo are as follows : total_durat ion:0, counter:0 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_unlink_stats_file: unlink redo.state sucessfully!, ret:-1 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: send signal to be hot standby at 0/B49D7B0 2023-12-07 10:55:25.172 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:25.173 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: update gaussdb state file: db state(NORMAL_STATE), server mode(Standby) 2023-12-07 10:55:25.173 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: database system is ready to accept read only connections 2023-12-07 10:55:25.173 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: auditor process 0 started, pid=140541279336000 2023-12-07 10:55:25.181 [unknown] [unknown] localhost 140541307582016 0[0:0#0] 0 [BACKEND] LOG: [Pgstat] statfile global/pgstat.stat is missing, using empty dbhash. 2023-12-07 10:55:25.192 [unknown] [unknown] localhost 140541258823232 0[0:0#0] 0 [BACKEND] LOG: CfsShrinker started 2023-12-07 10:55:25.194 [unknown] [unknown] localhost 140541279336000 0[0:0#0] 0 [BACKEND] LOG: audit_process_cxt_init enter 2023-12-07 10:55:25.194 [unknown] [unknown] localhost 140541279336000 0[0:0#0] 0 [BACKEND] LOG: audit_process_cxt_init pipe init successfully for pipe : 0 file descriptor: 72 2023-12-07 10:55:25.216 opengauss postgres localhost 140541238965824 0[0:0#0] 0 [BACKEND] LOG: statement flush thread start 2023-12-07 10:55:25.235 opengauss postgres localhost 140541238965824 0[0:0#0] 0 [BACKEND] LOG: Mem-file chain of standby_statement_history_fast init done and online. 2023-12-07 10:55:25.251 opengauss postgres localhost 140541238965824 0[0:0#0] 0 [BACKEND] LOG: Mem-file chain of standby_statement_history_slow init done and online. 2023-12-07 10:55:25.251 [unknown] [unknown] localhost 140541279336000 0[0:0#0] 0 [BACKEND] LOG: audit_process_cxt_init success 2023-12-07 10:55:32.685 [unknown] [unknown] localhost 140541352142400 0[0:0#0] 0 [BACKEND] LOG: Enter ElectionCbFunc: stream_id = 1, new_leader = 2. 2023-12-07 10:55:32.685 [unknown] [unknown] localhost 140541352142400 0[0:0#0] 0 [BACKEND] LOG: Exit ElectionCbFunc. 2023-12-07 10:55:34.698 [unknown] [unknown] localhost 140541339629120 0[0:0#0] 0 [BACKEND] LOG: DCF follower send build reason query to leader 2 successfully. 2023/12/07 10:55:34 error in setting up connection to 172.17.0.1:8080 due to dial tcp 172.17.0.1:8080: connect: connection refused dial tcp 172.17.0.1:8080: connect: connection refused 2023-12-07 10:55:34.698 [unknown] [unknown] localhost 140541108418112 0[0:0#0] 0 [BACKEND] LOG: DCF follower received reply from leader 2 successfully, crc 1858586281. 2023-12-07 10:55:34.702 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:34.702 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: update gaussdb state file: build reason(Normal), db state(Normal), server mode(Standby), current connect index(1) 2023-12-07 10:56:23.142 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: CreateRestartPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:0/B49D710, oldRedo:0/B49D690, newCkpLoc:0/B49DCB0, newRedo:0/B49DC30, preCkpLoc:0/B49D710 2023-12-07 10:56:23.143 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: CreateRestartPoint, need flush 0 pages 2023-12-07 10:56:23.408 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 1, start 15022], total_pages 0 2023-12-07 10:56:23.428 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: keep all the xlog segments, because current segno = 11, less than wal_keep_segments = 16 2023-12-07 10:56:23.428 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 000000010000000000000000 2023-12-07 11:00:24.330 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: CreateRestartPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:0/B49E028, oldRedo:0/B49DF90, newCkpL oc:0/B49E148, newRedo:0/B49E0C8, preCkpLoc:0/B49DEF0 2023-12-07 11:00:24.330 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: CreateRestartPoint, need flush 0 pages 2023-12-07 11:00:24.540 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 1, start 15022], total_pages 0 2023-12-07 11:00:24.565 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: keep all the xlog segments, because current segno = 11, less than wal_keep_segments = 16 2023-12-07 11:00:24.565 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 000000010000000000000000 2023-12-07 11:00:30.249 [unknown] [unknown] localhost 140541352142400 0[0:0#0] 0 [BACKEND] LOG: Enter ElectionCbFunc: stream_id = 1, new_leader = 3. 2023-12-07 11:00:30.249 [unknown] [unknown] localhost 140541352142400 0[0:0#0] 0 [BACKEND] LOG: Exit ElectionCbFunc. 2023-12-07 11:01:24.617 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: skipping restartpoint, already performed at 0/B49E0C8 2023-12-07 11:01:39.845 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: skipping restartpoint, already performed at 0/B49E0C8 2023-12-07 11:01:54.859 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: skipping restartpoint, already performed at 0/B49E0C8 主节点日志: 2023-12-07 10:55:30.596 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: start create thread! 2023-12-07 10:55:30.596 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: create thread end! 2023-12-07 10:55:30.609 [unknown] [unknown] localhost 140121470920256 0[0:0#0] 0 [BACKEND] LOG: [Alarm Module]alarm checker started. 2023-12-07 10:55:30.609 [unknown] [unknown] localhost 140121454138944 0[0:0#0] 0 [BACKEND] LOG: reaper backend started. 2023-12-07 10:55:30.614 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [mpfl_ulink_file]: unlink global/max_page_flush_lsn sucessfully! ret:4294967295 2023-12-07 10:55:30.615 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: StartupXLOG: biggest_lsn_in_page is set to FFFFFFFF/FFFFFFFF, enable_update_max_page_flush_lsn:0 2023-12-07 10:55:30.615 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: database system timeline: 19 2023-12-07 10:55:30.615 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: database system was interrupted while in recovery at log time 2023-12-07 10:47:20 CST 2023-12-07 10:55:30.615 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. 2023-12-07 10:55:30.616 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: entering standby mode 2023-12-07 10:55:30.616 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Init of double write for ext finished. 2023-12-07 10:55:30.616 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Double Write init 2023-12-07 10:55:30.618 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid batch meta file info: dw_file_num [1], dw_file_size [256] MB, dw_version [92568] 2023-12-07 10:55:30.837 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid batch meta file info: dw_file_num [1], dw_file_size [256] MB, dw_version [92568] 2023-12-07 10:55:30.849 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid batch file header: id 0, file_head[dwn 1, start 31397] 2023-12-07 10:55:30.855 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid single file header: id 0, file_head[dwn 17, start 0] 2023-12-07 10:55:30.860 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid single file header: id 0, file_head[dwn 17, start 0] 2023-12-07 10:55:30.865 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW batch flush file recovery start. 2023-12-07 10:55:30.939 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW recovery state: "Empty", file start page[dwn 1, start 31397], now access page 0, current [page_id 31397, dwn 1, checksum verify res is 1, page_num orig 0, page_num fixed 0] 2023-12-07 10:55:30.939 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW recovery state: "Finish", file start page[dwn 1, start 31397], now access page 0, current [page_id 31397, dwn 1, checksum verify res is 1, page_num orig 0, page_num fixed 0] 2023-12-07 10:55:30.939 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW batch flush file recovery finish. 2023-12-07 10:55:30.939 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush file recovery start. 2023-12-07 10:55:30.940 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: [single flush] recovery, [first version] dw page is new, break this recovery: buf_tag[rel 0/0/0 blk 0 fork 0], compress: 0 2023-12-07 10:55:30.959 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: [second version] DW single flush file valid item num is 0. 2023-12-07 10:55:30.959 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush file recovery finish. 2023-12-07 10:55:30.963 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush finish recovery, reset the file head[dwn 18, start 0]. 2023-12-07 10:55:30.966 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] WARNING: could not fork new process for connection due to PMstate PM_STARTUP 2023-12-07 10:55:30.966 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:30.968 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush finish recovery [second version], reset the file head[dwn 18, start 0]. 2023-12-07 10:55:31.066 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: redo record is at 0/B49DB10; shutdown FALSE 2023-12-07 10:55:31.066 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: GTM-Free Mode: start local next csn from checkpoint 1021, next xid 21966 2023-12-07 10:55:31.066 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: Startup: write global globalRecycleXid 20899 from checkpoint 20899 2023-12-07 10:55:31.066 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup csnlog without extend at xid:21965, pageno:21 2023-12-07 10:55:31.067 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: ===Not in the streaming DR switchover=== 2023-12-07 10:55:31.067 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: StartupXLOG PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:0/B49DB90, oldRedo:0/B49DB10, newCkpLoc:0/B49DB90, newRedo:0/B49DB10, preCkpLoc:0/B49DA70 2023-12-07 10:55:31.076 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: [Pgstat] remove statfiles in pg_stat_tmp/pgstat.stat, global/pgstat.stat 2023-12-07 10:55:31.078 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: redo minRecoveryPoint at 0/B49DC30; backupStartPoint at 0/0; backupEndRequired FALSE 2023-12-07 10:55:31.079 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: redo starts at 0/B49DB10 2023-12-07 10:55:31.079 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: set knl_g_set_redo_finish_status to false when starting redo 2023-12-07 10:55:31.084 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: sigusr1_handler create WalWriterAuxiliary(140121206617664) after local recovery is done. pmState:2, ServerMode:3 2023-12-07 10:55:31.084 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:31.105 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: checkpointer started, CheckPointTimeout is 60 2023-12-07 10:55:31.106 [unknown] [unknown] localhost 140121317832256 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 0 2023-12-07 10:55:31.108 [unknown] [unknown] localhost 140121271694912 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 1 2023-12-07 10:55:31.108 [unknown] [unknown] localhost 140121225557568 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 2 2023-12-07 10:55:31.109 [unknown] [unknown] localhost 140121248626240 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 3 2023-12-07 10:55:31.109 [unknown] [unknown] localhost 140121338275392 0[0:0#0] 0 [BACKEND] LOG: invalidate buffer bgwriter started 2023-12-07 10:55:31.111 [unknown] [unknown] localhost 140121292666432 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 4 2023-12-07 10:55:31.111 [unknown] [unknown] localhost 140121206617664 0[0:0#0] 0 [BACKEND] LOG: walwriterauxiliary started 2023-12-07 10:55:31.968 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] WARNING: could not fork new process for connection due to PMstate PM_RECOVERY 2023-12-07 10:55:31.968 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:32.080 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: XLogPageRead IsRecoveryDone is set true,ReadRecPtr:0/B49DB90, EndRecPtr:0/B49DC30 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats: the basic statistic during redo are as follows : redo_start_ptr:00000000/0B49DB10, redo_start_time:755232930616454, min_recovery_point:00000000/0B49DC30, read_ptr:00000000/0B49DB90, last_replayed_read_Ptr:00000000/0B49DC30, speed:0 KB/s 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_XLOG: the event io statistic during redo are as follows : total_duration:71, counter:11 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_WRITE_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_PROCESS_PENDING: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_APPLY: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: XLogPageRead set redo finish status,ReadRecPtr:0/B49DB90, EndRecPtr:0/B49DC30 2023-12-07 10:55:32.088 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:32.088 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: set lsn after recovery done in gaussdb state file 2023-12-07 10:55:32.088 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:32.131 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: request xlog stream at 0/B49DC30. 2023-12-07 10:55:32.131 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down walreceiver. 2023-12-07 10:55:32.131 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: wal receiver try to connect to host=172.17.0.2 port=21001 localhost=172.17.0.3 localport=31001 index 1 . 2023-12-07 10:55:32.132 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:32.147 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: walreceiver thread started 2023-12-07 10:55:32.147 [unknown] [unknown] localhost 140121063552576 0[0:0#0] 0 [BACKEND] LOG: walrcvwriter thread started 2023-12-07 10:55:32.198 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [DCF] LOG: DCF data version, applied index and min applied index read from /home/opengauss/opengauss/data/paxosindex is 1, 107667 and 113850. 2023-12-07 10:55:32.198 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: Before start DCF module, node_id = 2, dcf_config = [{"stream_id":1,"node_id":1,"ip":"172.17.0.2","port":21000,"role":"LEADER"},{"stream_id":1,"node_id":2,"ip":"172.17.0.3","port":21000,"role":"FOLLOWER"}, {"stream_id":1,"node_id":3,"ip":"172.17.0.4","port":21000,"role":"FOLLOWER"}] 2023-12-07 10:55:32.667 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: Start DCF module success. 2023-12-07 10:55:32.675 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: set knl_g_set_redo_finish_status to false when connecting to the primary 2023/12/07 10:55:32 error in setting up connection to 172.17.0.1:8080 due to dial tcp 172.17.0.1:8080: connect: connection refused dial tcp 172.17.0.1:8080: connect: connection refused 2023/12/07 10:55:32 error in setting up connection to 172.17.0.1:8080 due to dial tcp 172.17.0.1:8080: connect: connection refused dial tcp 172.17.0.1:8080: connect: connection refused 2023-12-07 10:55:32.969 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] WARNING: could not fork new process for connection due to PMstate PM_RECOVERY 2023-12-07 10:55:32.969 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: consistent recovery state reached at 0/B49DC30 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats: the basic statistic during redo are as follows : redo_start_ptr:00000000/0B49DB10, redo_start_time:755232930616454, min_recovery_point:00000000/0B49DC30, read_ptr:00000000/0B49DB90, last_replayed_read_Ptr:00000000/0B49DC30, speed:0 KB/s 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_XLOG: the event io statistic during redo are as follows : total_duration:71, counter:11 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_WRITE_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_PROCESS_PENDING: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_APPLY: the event io statistic during redo are as follows : total_durat ion:0, counter:0 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_unlink_stats_file: unlink redo.state sucessfully!, ret:-1 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: send signal to be hot standby at 0/B49DC30 2023-12-07 10:55:33.138 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:33.138 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: update gaussdb state file: db state(NORMAL_STATE), server mode(Standby) 2023-12-07 10:55:33.138 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: database system is ready to accept read only connections 2023-12-07 10:55:33.138 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: auditor process 0 started, pid=140120466912832 2023-12-07 10:55:33.147 [unknown] [unknown] localhost 140120495748672 0[0:0#0] 0 [BACKEND] LOG: [Pgstat] statfile global/pgstat.stat is missing, using empty dbhash. 2023-12-07 10:55:33.155 [unknown] [unknown] localhost 140120466912832 0[0:0#0] 0 [BACKEND] LOG: audit_process_cxt_init enter 2023-12-07 10:55:33.156 [unknown] [unknown] localhost 140120466912832 0[0:0#0] 0 [BACKEND] LOG: audit_process_cxt_init pipe init successfully for pipe : 0 file descriptor: 80 2023-12-07 10:55:33.160 [unknown] [unknown] localhost 140120450131520 0[0:0#0] 0 [BACKEND] LOG: CfsShrinker started 2023-12-07 10:55:33.184 opengauss postgres localhost 140120430736960 0[0:0#0] 0 [BACKEND] LOG: statement flush thread start 2023-12-07 10:55:33.203 opengauss postgres localhost 140120430736960 0[0:0#0] 0 [BACKEND] LOG: Mem-file chain of standby_statement_history_fast init done and online. 2023-12-07 10:55:33.220 opengauss postgres localhost 140120430736960 0[0:0#0] 0 [BACKEND] LOG: Mem-file chain of standby_statement_history_slow init done and online. 2023-12-07 10:55:33.784 [unknown] [unknown] localhost 140120558655040 0[0:0#0] 0 [BACKEND] LOG: Enter PromoteCallbackFunc, isDcfShmemInited = 1 2023-12-07 10:55:33.784 [unknown] [unknown] localhost 140120558655040 0[0:0#0] 0 [BACKEND] LOG: Instance to do failover. 2023-12-07 10:55:33.784 [unknown] [unknown] localhost 140120558655040 0[0:0#0] 0 [BACKEND] LOG: Exit PromoteCallbackFunc, isDcfShmemInited = 1 2023-12-07 10:55:33.784 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: Begin to wait read xlog from DCF! 2023-12-07 10:55:33.784 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: All DCF log has been applied! 2023-12-07 10:55:33.809 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: received failover notification 2023-12-07 10:55:33.809 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down walreceiver. 2023-12-07 10:55:33.817 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] FATAL: terminating walreceiver process due to administrator command 2023-12-07 10:55:33.817 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] BACKTRACELOG: tid[231999]'s backtrace: /home/opengauss/openGauss/install/bin/gaussdb() [0x1068c7d] /home/opengauss/openGauss/install/bin/gaussdb(_Z9errfinishiz+0x2fd) [0x10609e2] /home/opengauss/openGauss/install/bin/gaussdb(_Z23ProcessWalRcvInterruptsv+0xa1) [0x1ffc9f1] /home/opengauss/openGauss/install/bin/gaussdb(_Z18WalRcvrProcessDataPlPb+0xa9) [0x1ffd02f] /home/opengauss/openGauss/install/bin/gaussdb(_Z15WalReceiverMainv+0x1950) [0x1ffec6a] /home/opengauss/openGauss/install/bin/gaussdb(_Z26GaussDbAuxiliaryThreadMainIL15knl_thread_role41EEiP14knl_thread_arg+0x24d) [0x17c5b6e] /home/opengauss/openGauss/install/bin/gaussdb(_Z17GaussDbThreadMainIL15knl_thread_role41EEiP14knl_thread_arg+0x4af) [0x17bbd5d] /home/opengauss/openGauss/install/bin/gaussdb() [0x17ab596] /home/opengauss/openGauss/install/bin/gaussdb() [0x2253f9c] /usr/lib64/libc.so.6(+0x8b47a) [0x7f71a15a747a] /usr/lib64/libc.so.6(clone+0x44) [0x7f71a1629364] Use addr2line to get pretty function name and line 2023-12-07 10:55:33.817 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: waiting walrcvwriter: 140121063552576 terminate 2023-12-07 10:55:33.817 [unknown] [unknown] localhost 140121063552576 0[0:0#0] 0 [BACKEND] LOG: walrcvwriter thread shut down 2023-12-07 10:55:33.828 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: set local_redo_finish_status to false in WalRcvDie 2023-12-07 10:55:33.852 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: walreceiver thread shut down 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down datareceiver. 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: RecPtr(0/B49DC50),receivedUpto(0/B49DC30) 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: Secondary Standby has synchronized all the xlog and replication data, standby promote to primary 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down walreceiver. 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down datareceiver. 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: read xlog page failed at 0/B49DC30 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: set knl_g_set_redo_finish_status to true when redo done 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: redo done at 0/B49DB90, end at 0/B49DC30 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [PR]: Recoverying elapsed: 2831170 us, redoTotalBytes:288,EndRecPtr:189389872, redoStartPtr:189389584,speed:0 MB/s, totalTime:2831170 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_unlink_stats_file: unlink redo.state sucessfully!, ret:-1 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats: the basic statistic during redo are as follows : redo_start_ptr:00000000/0B49DB10, redo_start_time:755232930616454, min_recovery_point:00000000/0B49DC30, read_ptr:00000000/0B49DB90, last_replayed_read_Ptr:00000000/0B49DC30, speed:0 KB/s 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_XLOG: the event io statistic during redo are as follows : total_duration:71, counter:11 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_WRITE_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_PROCESS_PENDING: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_APPLY: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down walreceiver. 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down datareceiver. 2023-12-07 10:55:33.915 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:34.118 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: set gaussdb state file: db state(PROMOTING_STATE), server mode(Standby) 2023-12-07 10:55:34.121 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: archive recovery complete 2023-12-07 10:55:34.126 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: btree_xlog_cleanup is over, it takes time:0 microseconds 2023-12-07 10:55:34.126 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: UBTreeXlogCleanup is over, it takes time:1 microseconds 2023-12-07 10:55:34.127 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: skipping restartpoint, already performed at 0/B49DB10 2023-12-07 10:55:34.144 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DCF] LOG: Write dcfData version 1, apply index 107667, min apply index 113850 and crc 2602742273 into "/home/opengauss/opengauss/data/paxosindex.backup" 2023-12-07 10:55:34.148 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DCF] LOG: Write dcfData version 1, apply index 107667, min apply index 113850 and crc 2602742273 into "/home/opengauss/opengauss/data/paxosindex" 2023-12-07 10:55:34.148 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: redo done, nextXid: 22321, startupMaxXid: 22321, recentLocalXmin: 21966, recentGlobalXmin: 21966, PendingPreparedXacts: 0, NextCommitSeqNo: 1021, cutoff_csn_min: 0. 2023-12-07 10:55:34.165 [postmaster][reaper][140126007985600] LOG: startupprocess exit 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.188 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: autovacuum launcher started 2023-12-07 10:55:34.192 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: walwriter started 2023-12-07 10:55:34.194 [unknown] [unknown] localhost 140120305952320 0[0:0#0] 0 [BACKEND] LOG: global stats collector started 2023-12-07 10:55:34.195 [unknown] [unknown] localhost 140120322733632 0[0:0#0] 0 [BACKEND] LOG: undo launcher started 2023-12-07 10:55:34.196 [unknown] [unknown] localhost 140121063552576 0[0:0#0] 0 [BACKEND] LOG: job scheduler started 2023-12-07 10:55:34.199 [unknown] [unknown] localhost 140120262964800 0[0:0#0] 0 [TIMECAPSULE] LOG: txnsnapcapturer started 2023-12-07 10:55:34.213 opengauss postgres localhost 140120214730304 0[0:0#0] 0 [BACKEND] LOG: ASP thread start 2023-12-07 10:55:34.213 opengauss postgres localhost 140120239896128 0[0:0#0] 0 [BACKEND] LOG: instrumention percentile started 2023-12-07 10:55:34.309 opengauss postgres localhost 140120067929664 0[0:0#0] 0 [BACKEND] LOG: process wlm thread starting up. 2023-12-07 10:55:34.312 opengauss postgres localhost 140120067929664 0[0:0#0] 0 [BACKEND] LOG: build user data finished 2023-12-07 10:55:34.312 opengauss postgres localhost 140120028083776 0[0:0#0] 0 [BACKEND] LOG: WLMarbiter thread is starting up. 2023-12-07 10:55:34.313 opengauss postgres localhost 140120048989760 0[0:0#0] 0 [BACKEND] LOG: WLMmonitor thread is starting up. 2023-12-07 10:55:34.313 opengauss postgres localhost 140120048989760 0[0:0#0] 0 [BACKEND] LOG: cannot find the mount device of data directory /home/opengauss/opengauss/data, io_priority is not availab le. 2023-12-07 10:55:34.479 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:34.480 [postmaster][reaper][140126007985600] LOG: update gaussdb state file: db state(NORMAL_STATE), server mode(Primary) 2023-12-07 10:55:34.480 [postmaster][reaper][140126007985600] LOG: database system is ready to accept connections 2023-12-07 10:55:34.493 [unknown] [unknown] localhost 140119914772032 0[0:0#0] 0 [UNDO] LOG: [UndoRecycleMain:549]undo recycle started. 2023-12-07 10:55:34.678 [unknown] [unknown] localhost 140120614762048 0[0:0#0] 0 [BACKEND] LOG: DCF leader send reply to follower 3 successfully, crc 1858586281. 2023-12-07 10:55:34.698 [unknown] [unknown] localhost 140120614762048 0[0:0#0] 0 [BACKEND] LOG: DCF leader send reply to follower 1 successfully, crc 1858586281. 2023-12-07 10:55:44.512 [unknown] [unknown] localhost 140119914772032 0[0:0#0] 0 [UNDO] LOG: [UndoRecycleMain:643]sleep 10s, ensure the snapcapturer can give the undorecyclemain a valid recycleXmin. 2023-12-07 10:55:49.140 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: database first startup and recovery finish,so do checkpointer 2023-12-07 10:55:49.359 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 1, start 31397], total_pages 0 2023-12-07 10:55:49.369 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: keep all the xlog segments, because current segno = 11, less than wal_keep_segments = 16 2023-12-07 10:55:49.369 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] WARNING: replicationSlotMinLSN is InvalidXLogRecPtr!!! 2023-12-07 10:55:49.369 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] WARNING: replicationSlotMaxLSN is InvalidXLogRecPtr!!! 2023-12-07 10:55:49.369 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] WARNING: The number of walsender 0 is less than the number of valid replconninfo 2. 2023-12-07 10:55:49.382 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: CreateCheckPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:0/B49DB90, oldRedo:0/B49DB10, newCkpLoc:0/B49DCB0, newRedo:0/B49DC30, preCkpLoc:0/B49DB90 2023-12-07 10:55:49.383 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: will update control file (create checkpoint), shutdown:0 2023-12-07 10:55:49.395 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [DCF] LOG: Write dcfData version 1, apply index 107668, min apply index 113863 and crc 4155646614 into "/home/opengauss/opengauss/data/paxosindex.backup" 2023-12-07 10:55:49.399 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [DCF] LOG: Write dcfData version 1, apply index 107668, min apply index 113863 and crc 4155646614 into "/home/opengauss/opengauss/data/paxosindex" 2023-12-07 10:55:49.401 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 000000010000000000000000 【测试代码】:
<!-- #请认真填写以下信息,否则可能由于无法定位,导致issue无法解决而被取消 --> 【标题描述】:丢弃主节点发送的同步请求Req,并在持续丢弃期间重启主节点,即使后续恢复正常不再丢弃Req,整个集群也没有选举出主节点 【测试类型:SQL功能/存储功能/接口功能/工具功能/性能/并发/压力长稳/故障注入/安全/资料/编码规范】【测试版本:x.x.x】 问题描述 【操作系统和硬件信息】(查询命令: cat /etc/system-release, uname -a): openEuler release 22.03 LTS Linux 3adf3c2c795d 4.15.0-194-generic #205-Ubuntu SMP Fri Sep 16 19:49:27 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux 【测试环境】(单机/1主x备x级联备): 1主2备 【被测功能】: 【测试类型】: 【数据库版本】(查询命令: gaussdb -V): gaussdb (openGauss 5.0.0 build ) compiled at 2023-04-06 15:35:14 commit 0 last mr debug 【预置条件】: 【操作步骤】(请填写详细的操作步骤): 插桩DCF源码rep_leader.c的函数rep_appendlog_node static status_t rep_appendlog_node(uint32 stream_id, uint32 node_id, dcf_role_t default_role, uint64 last_log_index, bool8* node_exists_log) { uint64 old_next_index = (uint64)cm_atomic_get((atomic_t*)&NEXT_INDEX); uint64 log_begin = old_next_index == CM_INVALID_INDEX_ID ? 1 : old_next_index; log_begin = MAX(log_begin, stg_first_index(stream_id)); uint64 log_count = rep_calu_log_count(stream_id, node_id, default_role, log_begin, last_log_index); *node_exists_log = (log_count > 0); /* Logs are sent even if log_count==0. Periodically sending empty logs ensures that lost packets are retransmitted */ mec_message_t pack; CM_RETURN_IFERR(mec_alloc_pack(&pack, MEC_CMD_APPEND_LOG_RPC_REQ, g_cur_node_id, node_id, stream_id)); uint64 pre_log_index = log_begin == CM_INVALID_INDEX_ID ? CM_INVALID_INDEX_ID : log_begin - 1; rep_apendlog_req_t appendlog_req; rep_init_appendlog_head(stream_id, &appendlog_req, pre_log_index, last_log_index); CM_RETURN_IFERR_EX(rep_encode_appendlog_head(&pack, &appendlog_req), mec_release_pack(&pack)); uint32 log_count_pos = mec_get_write_pos(&pack) - sizeof(uint64); uint64 j = 0; uint32 total_size = 0; for (uint64 index = log_begin; j < log_count; index++, j++) { log_entry_t* entry = stg_get_entry(stream_id, index); if (entry == NULL) { break; } total_size += (sizeof(rep_log_t) + ENTRY_SIZE(entry)); if (total_size > MESSAGE_BUFFER_SIZE && j > 0) { LOG_DEBUG_INF("[REP]total_size[%u] is enough, send size[%u]. log_count[%llu], j[%llu]", total_size, (uint32)(total_size - (sizeof(rep_log_t) + ENTRY_SIZE(entry))), log_count, j); stg_entry_dec_ref(entry); break; } status_t ret = rep_encode_one_log(&pack, log_count_pos, j + 1, entry); stg_entry_dec_ref(entry); if (ret != CM_SUCCESS) { mec_release_pack(&pack); LOG_DEBUG_ERR("[REP]encode_one_log fail, index=%llu, j=%llu", index, j); return CM_ERROR; } ps_record1(PS_PACK, index); } appendlog_req.log_count = j; // 在此插桩,伪代码如下 // 重启前插桩 static int cnt = 0; if(appendlog_req.log_count > 0){ cnt++; if cnt == 6 { kill $LEADR_PID } return CM_SUCCESS; } // 重启后插桩 static int cnt = 0; if(appendlog_req.log_count > 0 && cnt < 10){ cnt++; return CM_SUCCESS; } CM_RETURN_IFERR_EX(mec_send_data(&pack), mec_release_pack(&pack)); LOG_DEBUG_INF("[REP]rep send succeed: " REP_APPEND_REQ_FMT, REP_APPEND_REQ_VAL(&pack, &appendlog_req, log_begin)); if (APPEND_MODE == APPEND_NORMAL_MODE) { (void)cm_atomic_cas((atomic_t*)&NEXT_INDEX, old_next_index, log_begin + j); LOG_DEBUG_INF("[REP]set next_index to %llu,stream_id=%u,node_id=%u", NEXT_INDEX, stream_id, node_id); } mec_release_pack(&pack); return CM_SUCCESS; } 【预期输出】: 原主节点宕机之后集群发生选举,选举出新的主节点或原主节点重启之后,发生选举 【实际输出】: 原主节点宕机之后集群没有发生选举,且原主节点重启之后,仍没有发生选举 【原因分析】: 1. 这个问题的根因 2. 问题推断过程 3. 还有哪些原因可能造成类似现象 4. 该问题是否有临时规避措施 5. 问题解决方案 6. 预计修复问题时间 【日志信息】(请附上日志文件、截图、coredump信息): 初始状态:    重启后状态:    从节点日志: 2023-12-07 10:55:22.799 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: start create thread! 2023-12-07 10:55:22.799 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: create thread end! 2023-12-07 10:55:22.813 [unknown] [unknown] localhost 140542188910144 0[0:0#0] 0 [BACKEND] LOG: reaper backend started. 2023-12-07 10:55:22.813 [unknown] [unknown] localhost 140542218335808 0[0:0#0] 0 [BACKEND] LOG: [Alarm Module]alarm checker started. 2023-12-07 10:55:22.818 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [mpfl_ulink_file]: unlink global/max_page_flush_lsn sucessfully! ret:4294967295 2023-12-07 10:55:22.818 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: StartupXLOG: biggest_lsn_in_page is set to FFFFFFFF/FFFFFFFF, enable_update_max_page_flush_lsn:0 2023-12-07 10:55:22.818 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: database system timeline: 17 2023-12-07 10:55:22.818 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: database system was interrupted while in recovery at log time 2023-12-07 10:36:32 CST 2023-12-07 10:55:22.818 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. 2023-12-07 10:55:22.819 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: entering standby mode 2023-12-07 10:55:22.819 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Init of double write for ext finished. 2023-12-07 10:55:22.819 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Double Write init 2023-12-07 10:55:22.821 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid batch meta file info: dw_file_num [1], dw_file_size [256] MB, dw_version [92568] 2023-12-07 10:55:22.829 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid batch meta file info: dw_file_num [1], dw_file_size [256] MB, dw_version [92568] 2023-12-07 10:55:22.841 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid batch file header: id 0, file_head[dwn 1, start 15022] 2023-12-07 10:55:22.848 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid single file header: id 0, file_head[dwn 48, start 0] 2023-12-07 10:55:22.855 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid single file header: id 0, file_head[dwn 48, start 0] 2023-12-07 10:55:22.860 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW batch flush file recovery start. 2023-12-07 10:55:22.928 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW recovery state: "Empty", file start page[dwn 1, start 15022], now access page 0, current [page_id 15022, dwn 1, checksum verify res is 1, page_num orig 0, page_num fixed 0] 2023-12-07 10:55:22.928 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW recovery state: "Finish", file start page[dwn 1, start 15022], now access page 0, current [page_id 15022, dwn 1, checksum verify res is 1, page_num orig 0, page_num fixed 0] 2023-12-07 10:55:22.928 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW batch flush file recovery finish. 2023-12-07 10:55:22.928 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush file recovery start. 2023-12-07 10:55:22.951 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: [single flush] recovery, [first version] dw page is new, break this recovery: buf_tag[rel 0/0/0 blk 0 fork 0], compress: 0 2023-12-07 10:55:22.981 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: [second version] DW single flush file valid item num is 0. 2023-12-07 10:55:22.981 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush file recovery finish. 2023-12-07 10:55:22.987 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush finish recovery, reset the file head[dwn 49, start 0]. 2023-12-07 10:55:22.992 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush finish recovery [second version], reset the file head[dwn 49, start 0]. 2023-12-07 10:55:23.090 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: redo record is at 0/B49D690; shutdown FALSE 2023-12-07 10:55:23.090 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: GTM-Free Mode: start local next csn from checkpoint 1021, next xid 21256 2023-12-07 10:55:23.090 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: Startup: write global globalRecycleXid 20899 from checkpoint 20899 2023-12-07 10:55:23.090 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: startup csnlog without extend at xid:21255, pageno:20 2023-12-07 10:55:23.090 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: ===Not in the streaming DR switchover=== 2023-12-07 10:55:23.090 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: StartupXLOG PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:0/B49D710, oldRedo:0/B49D690, newCkpLoc:0/B49D710, newRedo:0/B49D690, preCkpLoc:0/B49D5F0 2023-12-07 10:55:23.107 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: [Pgstat] remove statfiles in pg_stat_tmp/pgstat.stat, global/pgstat.stat 2023-12-07 10:55:23.110 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: redo minRecoveryPoint at 0/B49D7B0; backupStartPoint at 0/0; backupEndRequired FALSE 2023-12-07 10:55:23.110 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: redo starts at 0/B49D690 2023-12-07 10:55:23.111 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: set knl_g_set_redo_finish_status to false when starting redo 2023-12-07 10:55:23.113 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: sigusr1_handler create WalWriterAuxiliary(140541941450304) after local recovery is done. pmState:2, ServerMode:3 2023-12-07 10:55:23.113 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:23.131 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] WARNING: could not fork new process for connection due to PMstate PM_RECOVERY 2023-12-07 10:55:23.131 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:23.136 [unknown] [unknown] localhost 140542073103936 0[0:0#0] 0 [BACKEND] LOG: invalidate buffer bgwriter started 2023-12-07 10:55:23.137 [unknown] [unknown] localhost 140542027494976 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 0 2023-12-07 10:55:23.138 [unknown] [unknown] localhost 140542006523456 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 1 2023-12-07 10:55:23.138 [unknown] [unknown] localhost 140542052660800 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 2 2023-12-07 10:55:23.139 [unknown] [unknown] localhost 140541941450304 0[0:0#0] 0 [BACKEND] LOG: walwriterauxiliary started 2023-12-07 10:55:23.139 [unknown] [unknown] localhost 140541983454784 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 3 2023-12-07 10:55:23.141 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: checkpointer started, CheckPointTimeout is 60 2023-12-07 10:55:23.142 [unknown] [unknown] localhost 140541964514880 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 4 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: XLogPageRead IsRecoveryDone is set true,ReadRecPtr:0/B49D710, EndRecPtr:0/B49D7B0 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats: the basic statistic during redo are as follows : redo_start_ptr:00000000/0B49D690, redo_start_time:755232922819634, min_recovery_point:00000000/0B49D7B0, read_ptr:00000000/0B49D710, last_replayed_read_Ptr:00000000/0B49D7B0, speed:0 KB/s 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_XLOG: the event io statistic during redo are as follows : total_duration:57, counter:11 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_WRITE_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_PROCESS_PENDING: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_APPLY: the event io statistic during redo are as follows : total_durat ion:0, counter:0 2023-12-07 10:55:24.114 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: XLogPageRead set redo finish status,ReadRecPtr:0/B49D710, EndRecPtr:0/B49D7B0 2023-12-07 10:55:24.118 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:24.119 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: set lsn after recovery done in gaussdb state file 2023-12-07 10:55:24.119 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:24.133 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] WARNING: could not fork new process for connection due to PMstate PM_RECOVERY 2023-12-07 10:55:24.133 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:24.164 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: request xlog stream at 0/B49D7B0. 2023-12-07 10:55:24.164 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: startup shut down walreceiver. 2023-12-07 10:55:24.164 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: wal receiver try to connect to host=172.17.0.3 port=31001 localhost=172.17.0.2 localport=21001 index 1 . 2023-12-07 10:55:24.165 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:24.181 [unknown] [unknown] localhost 140541799953984 0[0:0#0] 0 [BACKEND] LOG: walrcvwriter thread started 2023-12-07 10:55:24.181 [unknown] [unknown] localhost 140541783172672 0[0:0#0] 0 [BACKEND] LOG: walreceiver thread started 2023-12-07 10:55:24.242 [unknown] [unknown] localhost 140541783172672 0[0:0#0] 0 [DCF] LOG: DCF data version, applied index and min applied index read from /home/opengauss/opengauss/data/paxosindex is 1, 80 and 0. 2023-12-07 10:55:24.242 [unknown] [unknown] localhost 140541783172672 0[0:0#0] 0 [BACKEND] LOG: Before start DCF module, node_id = 1, dcf_config = [{"stream_id":1,"node_id":1,"ip":"172.17.0.2","port":21000,"role":"LEADER"},{"stream_id":1,"node_id":2,"ip":"172.17.0.3","port":21000,"role":"FOLLOWER"}, {"stream_id":1,"node_id":3,"ip":"172.17.0.4","port":21000,"role":"FOLLOWER"}] 2023-12-07 10:55:24.715 [unknown] [unknown] localhost 140541783172672 0[0:0#0] 0 [BACKEND] LOG: Start DCF module success. 2023-12-07 10:55:24.715 [unknown] [unknown] localhost 140541783172672 0[0:0#0] 0 [BACKEND] LOG: set knl_g_set_redo_finish_status to false when connecting to the primary 2023-12-07 10:55:25.133 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] WARNING: could not fork new process for connection due to PMstate PM_RECOVERY 2023-12-07 10:55:25.133 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: consistent recovery state reached at 0/B49D7B0 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats: the basic statistic during redo are as follows : redo_start_ptr:00000000/0B49D690, redo_start_time:755232922819634, min_recovery_point:00000000/0B49D7B0, read_ptr:00000000/0B49D710, last_replayed_read_Ptr:00000000/0B49D7B0, speed:0 KB/s 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_XLOG: the event io statistic during redo are as follows : total_duration:57, counter:11 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_WRITE_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_PROCESS_PENDING: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_APPLY: the event io statistic during redo are as follows : total_durat ion:0, counter:0 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_unlink_stats_file: unlink redo.state sucessfully!, ret:-1 2023-12-07 10:55:25.166 [unknown] [unknown] localhost 140542159091264 0[0:0#0] 0 [BACKEND] LOG: send signal to be hot standby at 0/B49D7B0 2023-12-07 10:55:25.172 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:25.173 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: update gaussdb state file: db state(NORMAL_STATE), server mode(Standby) 2023-12-07 10:55:25.173 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: database system is ready to accept read only connections 2023-12-07 10:55:25.173 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: auditor process 0 started, pid=140541279336000 2023-12-07 10:55:25.181 [unknown] [unknown] localhost 140541307582016 0[0:0#0] 0 [BACKEND] LOG: [Pgstat] statfile global/pgstat.stat is missing, using empty dbhash. 2023-12-07 10:55:25.192 [unknown] [unknown] localhost 140541258823232 0[0:0#0] 0 [BACKEND] LOG: CfsShrinker started 2023-12-07 10:55:25.194 [unknown] [unknown] localhost 140541279336000 0[0:0#0] 0 [BACKEND] LOG: audit_process_cxt_init enter 2023-12-07 10:55:25.194 [unknown] [unknown] localhost 140541279336000 0[0:0#0] 0 [BACKEND] LOG: audit_process_cxt_init pipe init successfully for pipe : 0 file descriptor: 72 2023-12-07 10:55:25.216 opengauss postgres localhost 140541238965824 0[0:0#0] 0 [BACKEND] LOG: statement flush thread start 2023-12-07 10:55:25.235 opengauss postgres localhost 140541238965824 0[0:0#0] 0 [BACKEND] LOG: Mem-file chain of standby_statement_history_fast init done and online. 2023-12-07 10:55:25.251 opengauss postgres localhost 140541238965824 0[0:0#0] 0 [BACKEND] LOG: Mem-file chain of standby_statement_history_slow init done and online. 2023-12-07 10:55:25.251 [unknown] [unknown] localhost 140541279336000 0[0:0#0] 0 [BACKEND] LOG: audit_process_cxt_init success 2023-12-07 10:55:32.685 [unknown] [unknown] localhost 140541352142400 0[0:0#0] 0 [BACKEND] LOG: Enter ElectionCbFunc: stream_id = 1, new_leader = 2. 2023-12-07 10:55:32.685 [unknown] [unknown] localhost 140541352142400 0[0:0#0] 0 [BACKEND] LOG: Exit ElectionCbFunc. 2023-12-07 10:55:34.698 [unknown] [unknown] localhost 140541339629120 0[0:0#0] 0 [BACKEND] LOG: DCF follower send build reason query to leader 2 successfully. 2023/12/07 10:55:34 error in setting up connection to 172.17.0.1:8080 due to dial tcp 172.17.0.1:8080: connect: connection refused dial tcp 172.17.0.1:8080: connect: connection refused 2023-12-07 10:55:34.698 [unknown] [unknown] localhost 140541108418112 0[0:0#0] 0 [BACKEND] LOG: DCF follower received reply from leader 2 successfully, crc 1858586281. 2023-12-07 10:55:34.702 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:34.702 [unknown] [unknown] localhost 140546745337280 0[0:0#0] 0 [BACKEND] LOG: update gaussdb state file: build reason(Normal), db state(Normal), server mode(Standby), current connect index(1) 2023-12-07 10:56:23.142 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: CreateRestartPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:0/B49D710, oldRedo:0/B49D690, newCkpLoc:0/B49DCB0, newRedo:0/B49DC30, preCkpLoc:0/B49D710 2023-12-07 10:56:23.143 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: CreateRestartPoint, need flush 0 pages 2023-12-07 10:56:23.408 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 1, start 15022], total_pages 0 2023-12-07 10:56:23.428 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: keep all the xlog segments, because current segno = 11, less than wal_keep_segments = 16 2023-12-07 10:56:23.428 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 000000010000000000000000 2023-12-07 11:00:24.330 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: CreateRestartPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:0/B49E028, oldRedo:0/B49DF90, newCkpL oc:0/B49E148, newRedo:0/B49E0C8, preCkpLoc:0/B49DEF0 2023-12-07 11:00:24.330 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: CreateRestartPoint, need flush 0 pages 2023-12-07 11:00:24.540 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 1, start 15022], total_pages 0 2023-12-07 11:00:24.565 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: keep all the xlog segments, because current segno = 11, less than wal_keep_segments = 16 2023-12-07 11:00:24.565 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 000000010000000000000000 2023-12-07 11:00:30.249 [unknown] [unknown] localhost 140541352142400 0[0:0#0] 0 [BACKEND] LOG: Enter ElectionCbFunc: stream_id = 1, new_leader = 3. 2023-12-07 11:00:30.249 [unknown] [unknown] localhost 140541352142400 0[0:0#0] 0 [BACKEND] LOG: Exit ElectionCbFunc. 2023-12-07 11:01:24.617 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: skipping restartpoint, already performed at 0/B49E0C8 2023-12-07 11:01:39.845 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: skipping restartpoint, already performed at 0/B49E0C8 2023-12-07 11:01:54.859 [unknown] [unknown] localhost 140542089885248 0[0:0#0] 0 [BACKEND] LOG: skipping restartpoint, already performed at 0/B49E0C8 主节点日志: 2023-12-07 10:55:30.596 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: start create thread! 2023-12-07 10:55:30.596 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: create thread end! 2023-12-07 10:55:30.609 [unknown] [unknown] localhost 140121470920256 0[0:0#0] 0 [BACKEND] LOG: [Alarm Module]alarm checker started. 2023-12-07 10:55:30.609 [unknown] [unknown] localhost 140121454138944 0[0:0#0] 0 [BACKEND] LOG: reaper backend started. 2023-12-07 10:55:30.614 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [mpfl_ulink_file]: unlink global/max_page_flush_lsn sucessfully! ret:4294967295 2023-12-07 10:55:30.615 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: StartupXLOG: biggest_lsn_in_page is set to FFFFFFFF/FFFFFFFF, enable_update_max_page_flush_lsn:0 2023-12-07 10:55:30.615 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: database system timeline: 19 2023-12-07 10:55:30.615 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: database system was interrupted while in recovery at log time 2023-12-07 10:47:20 CST 2023-12-07 10:55:30.615 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. 2023-12-07 10:55:30.616 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: entering standby mode 2023-12-07 10:55:30.616 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Init of double write for ext finished. 2023-12-07 10:55:30.616 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Double Write init 2023-12-07 10:55:30.618 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid batch meta file info: dw_file_num [1], dw_file_size [256] MB, dw_version [92568] 2023-12-07 10:55:30.837 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid batch meta file info: dw_file_num [1], dw_file_size [256] MB, dw_version [92568] 2023-12-07 10:55:30.849 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid batch file header: id 0, file_head[dwn 1, start 31397] 2023-12-07 10:55:30.855 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid single file header: id 0, file_head[dwn 17, start 0] 2023-12-07 10:55:30.860 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: Found a valid single file header: id 0, file_head[dwn 17, start 0] 2023-12-07 10:55:30.865 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW batch flush file recovery start. 2023-12-07 10:55:30.939 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW recovery state: "Empty", file start page[dwn 1, start 31397], now access page 0, current [page_id 31397, dwn 1, checksum verify res is 1, page_num orig 0, page_num fixed 0] 2023-12-07 10:55:30.939 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW recovery state: "Finish", file start page[dwn 1, start 31397], now access page 0, current [page_id 31397, dwn 1, checksum verify res is 1, page_num orig 0, page_num fixed 0] 2023-12-07 10:55:30.939 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW batch flush file recovery finish. 2023-12-07 10:55:30.939 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush file recovery start. 2023-12-07 10:55:30.940 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: [single flush] recovery, [first version] dw page is new, break this recovery: buf_tag[rel 0/0/0 blk 0 fork 0], compress: 0 2023-12-07 10:55:30.959 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: [second version] DW single flush file valid item num is 0. 2023-12-07 10:55:30.959 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush file recovery finish. 2023-12-07 10:55:30.963 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush finish recovery, reset the file head[dwn 18, start 0]. 2023-12-07 10:55:30.966 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] WARNING: could not fork new process for connection due to PMstate PM_STARTUP 2023-12-07 10:55:30.966 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:30.968 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DBL_WRT] LOG: DW single flush finish recovery [second version], reset the file head[dwn 18, start 0]. 2023-12-07 10:55:31.066 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: redo record is at 0/B49DB10; shutdown FALSE 2023-12-07 10:55:31.066 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: GTM-Free Mode: start local next csn from checkpoint 1021, next xid 21966 2023-12-07 10:55:31.066 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: Startup: write global globalRecycleXid 20899 from checkpoint 20899 2023-12-07 10:55:31.066 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup csnlog without extend at xid:21965, pageno:21 2023-12-07 10:55:31.067 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: ===Not in the streaming DR switchover=== 2023-12-07 10:55:31.067 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: StartupXLOG PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:0/B49DB90, oldRedo:0/B49DB10, newCkpLoc:0/B49DB90, newRedo:0/B49DB10, preCkpLoc:0/B49DA70 2023-12-07 10:55:31.076 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: [Pgstat] remove statfiles in pg_stat_tmp/pgstat.stat, global/pgstat.stat 2023-12-07 10:55:31.078 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: redo minRecoveryPoint at 0/B49DC30; backupStartPoint at 0/0; backupEndRequired FALSE 2023-12-07 10:55:31.079 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: redo starts at 0/B49DB10 2023-12-07 10:55:31.079 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: set knl_g_set_redo_finish_status to false when starting redo 2023-12-07 10:55:31.084 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: sigusr1_handler create WalWriterAuxiliary(140121206617664) after local recovery is done. pmState:2, ServerMode:3 2023-12-07 10:55:31.084 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:31.105 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: checkpointer started, CheckPointTimeout is 60 2023-12-07 10:55:31.106 [unknown] [unknown] localhost 140121317832256 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 0 2023-12-07 10:55:31.108 [unknown] [unknown] localhost 140121271694912 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 1 2023-12-07 10:55:31.108 [unknown] [unknown] localhost 140121225557568 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 2 2023-12-07 10:55:31.109 [unknown] [unknown] localhost 140121248626240 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 3 2023-12-07 10:55:31.109 [unknown] [unknown] localhost 140121338275392 0[0:0#0] 0 [BACKEND] LOG: invalidate buffer bgwriter started 2023-12-07 10:55:31.111 [unknown] [unknown] localhost 140121292666432 0[0:0#0] 0 [INCRE_CKPT] LOG: pagewriter started, thread id is 4 2023-12-07 10:55:31.111 [unknown] [unknown] localhost 140121206617664 0[0:0#0] 0 [BACKEND] LOG: walwriterauxiliary started 2023-12-07 10:55:31.968 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] WARNING: could not fork new process for connection due to PMstate PM_RECOVERY 2023-12-07 10:55:31.968 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:32.080 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: XLogPageRead IsRecoveryDone is set true,ReadRecPtr:0/B49DB90, EndRecPtr:0/B49DC30 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats: the basic statistic during redo are as follows : redo_start_ptr:00000000/0B49DB10, redo_start_time:755232930616454, min_recovery_point:00000000/0B49DC30, read_ptr:00000000/0B49DB90, last_replayed_read_Ptr:00000000/0B49DC30, speed:0 KB/s 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_XLOG: the event io statistic during redo are as follows : total_duration:71, counter:11 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_WRITE_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_PROCESS_PENDING: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_APPLY: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:32.081 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: XLogPageRead set redo finish status,ReadRecPtr:0/B49DB90, EndRecPtr:0/B49DC30 2023-12-07 10:55:32.088 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:32.088 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: set lsn after recovery done in gaussdb state file 2023-12-07 10:55:32.088 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:32.131 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: request xlog stream at 0/B49DC30. 2023-12-07 10:55:32.131 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down walreceiver. 2023-12-07 10:55:32.131 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: wal receiver try to connect to host=172.17.0.2 port=21001 localhost=172.17.0.3 localport=31001 index 1 . 2023-12-07 10:55:32.132 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:32.147 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: walreceiver thread started 2023-12-07 10:55:32.147 [unknown] [unknown] localhost 140121063552576 0[0:0#0] 0 [BACKEND] LOG: walrcvwriter thread started 2023-12-07 10:55:32.198 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [DCF] LOG: DCF data version, applied index and min applied index read from /home/opengauss/opengauss/data/paxosindex is 1, 107667 and 113850. 2023-12-07 10:55:32.198 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: Before start DCF module, node_id = 2, dcf_config = [{"stream_id":1,"node_id":1,"ip":"172.17.0.2","port":21000,"role":"LEADER"},{"stream_id":1,"node_id":2,"ip":"172.17.0.3","port":21000,"role":"FOLLOWER"}, {"stream_id":1,"node_id":3,"ip":"172.17.0.4","port":21000,"role":"FOLLOWER"}] 2023-12-07 10:55:32.667 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: Start DCF module success. 2023-12-07 10:55:32.675 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: set knl_g_set_redo_finish_status to false when connecting to the primary 2023/12/07 10:55:32 error in setting up connection to 172.17.0.1:8080 due to dial tcp 172.17.0.1:8080: connect: connection refused dial tcp 172.17.0.1:8080: connect: connection refused 2023/12/07 10:55:32 error in setting up connection to 172.17.0.1:8080 due to dial tcp 172.17.0.1:8080: connect: connection refused dial tcp 172.17.0.1:8080: connect: connection refused 2023-12-07 10:55:32.969 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] WARNING: could not fork new process for connection due to PMstate PM_RECOVERY 2023-12-07 10:55:32.969 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: not ready to start snapshot capturer. 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: consistent recovery state reached at 0/B49DC30 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats: the basic statistic during redo are as follows : redo_start_ptr:00000000/0B49DB10, redo_start_time:755232930616454, min_recovery_point:00000000/0B49DC30, read_ptr:00000000/0B49DB90, last_replayed_read_Ptr:00000000/0B49DC30, speed:0 KB/s 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_XLOG: the event io statistic during redo are as follows : total_duration:71, counter:11 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_WRITE_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_PROCESS_PENDING: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_APPLY: the event io statistic during redo are as follows : total_durat ion:0, counter:0 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_unlink_stats_file: unlink redo.state sucessfully!, ret:-1 2023-12-07 10:55:33.132 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: send signal to be hot standby at 0/B49DC30 2023-12-07 10:55:33.138 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:33.138 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: update gaussdb state file: db state(NORMAL_STATE), server mode(Standby) 2023-12-07 10:55:33.138 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: database system is ready to accept read only connections 2023-12-07 10:55:33.138 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: auditor process 0 started, pid=140120466912832 2023-12-07 10:55:33.147 [unknown] [unknown] localhost 140120495748672 0[0:0#0] 0 [BACKEND] LOG: [Pgstat] statfile global/pgstat.stat is missing, using empty dbhash. 2023-12-07 10:55:33.155 [unknown] [unknown] localhost 140120466912832 0[0:0#0] 0 [BACKEND] LOG: audit_process_cxt_init enter 2023-12-07 10:55:33.156 [unknown] [unknown] localhost 140120466912832 0[0:0#0] 0 [BACKEND] LOG: audit_process_cxt_init pipe init successfully for pipe : 0 file descriptor: 80 2023-12-07 10:55:33.160 [unknown] [unknown] localhost 140120450131520 0[0:0#0] 0 [BACKEND] LOG: CfsShrinker started 2023-12-07 10:55:33.184 opengauss postgres localhost 140120430736960 0[0:0#0] 0 [BACKEND] LOG: statement flush thread start 2023-12-07 10:55:33.203 opengauss postgres localhost 140120430736960 0[0:0#0] 0 [BACKEND] LOG: Mem-file chain of standby_statement_history_fast init done and online. 2023-12-07 10:55:33.220 opengauss postgres localhost 140120430736960 0[0:0#0] 0 [BACKEND] LOG: Mem-file chain of standby_statement_history_slow init done and online. 2023-12-07 10:55:33.784 [unknown] [unknown] localhost 140120558655040 0[0:0#0] 0 [BACKEND] LOG: Enter PromoteCallbackFunc, isDcfShmemInited = 1 2023-12-07 10:55:33.784 [unknown] [unknown] localhost 140120558655040 0[0:0#0] 0 [BACKEND] LOG: Instance to do failover. 2023-12-07 10:55:33.784 [unknown] [unknown] localhost 140120558655040 0[0:0#0] 0 [BACKEND] LOG: Exit PromoteCallbackFunc, isDcfShmemInited = 1 2023-12-07 10:55:33.784 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: Begin to wait read xlog from DCF! 2023-12-07 10:55:33.784 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: All DCF log has been applied! 2023-12-07 10:55:33.809 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: received failover notification 2023-12-07 10:55:33.809 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down walreceiver. 2023-12-07 10:55:33.817 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] FATAL: terminating walreceiver process due to administrator command 2023-12-07 10:55:33.817 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] BACKTRACELOG: tid[231999]'s backtrace: /home/opengauss/openGauss/install/bin/gaussdb() [0x1068c7d] /home/opengauss/openGauss/install/bin/gaussdb(_Z9errfinishiz+0x2fd) [0x10609e2] /home/opengauss/openGauss/install/bin/gaussdb(_Z23ProcessWalRcvInterruptsv+0xa1) [0x1ffc9f1] /home/opengauss/openGauss/install/bin/gaussdb(_Z18WalRcvrProcessDataPlPb+0xa9) [0x1ffd02f] /home/opengauss/openGauss/install/bin/gaussdb(_Z15WalReceiverMainv+0x1950) [0x1ffec6a] /home/opengauss/openGauss/install/bin/gaussdb(_Z26GaussDbAuxiliaryThreadMainIL15knl_thread_role41EEiP14knl_thread_arg+0x24d) [0x17c5b6e] /home/opengauss/openGauss/install/bin/gaussdb(_Z17GaussDbThreadMainIL15knl_thread_role41EEiP14knl_thread_arg+0x4af) [0x17bbd5d] /home/opengauss/openGauss/install/bin/gaussdb() [0x17ab596] /home/opengauss/openGauss/install/bin/gaussdb() [0x2253f9c] /usr/lib64/libc.so.6(+0x8b47a) [0x7f71a15a747a] /usr/lib64/libc.so.6(clone+0x44) [0x7f71a1629364] Use addr2line to get pretty function name and line 2023-12-07 10:55:33.817 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: waiting walrcvwriter: 140121063552576 terminate 2023-12-07 10:55:33.817 [unknown] [unknown] localhost 140121063552576 0[0:0#0] 0 [BACKEND] LOG: walrcvwriter thread shut down 2023-12-07 10:55:33.828 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: set local_redo_finish_status to false in WalRcvDie 2023-12-07 10:55:33.852 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: walreceiver thread shut down 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down datareceiver. 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: RecPtr(0/B49DC50),receivedUpto(0/B49DC30) 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: Secondary Standby has synchronized all the xlog and replication data, standby promote to primary 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down walreceiver. 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down datareceiver. 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: read xlog page failed at 0/B49DC30 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: set knl_g_set_redo_finish_status to true when redo done 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: redo done at 0/B49DB90, end at 0/B49DC30 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [PR]: Recoverying elapsed: 2831170 us, redoTotalBytes:288,EndRecPtr:189389872, redoStartPtr:189389584,speed:0 MB/s, totalTime:2831170 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_unlink_stats_file: unlink redo.state sucessfully!, ret:-1 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats: the basic statistic during redo are as follows : redo_start_ptr:00000000/0B49DB10, redo_start_time:755232930616454, min_recovery_point:00000000/0B49DC30, read_ptr:00000000/0B49DB90, last_replayed_read_Ptr:00000000/0B49DC30, speed:0 KB/s 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_XLOG: the event io statistic during redo are as follows : total_duration:71, counter:11 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_READ_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_WRITE_DATA: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_PROCESS_PENDING: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: [REDO_STATS]redo_dump_all_stats WAIT_APPLY: the event io statistic during redo are as follows : total_duration:0, counter:0 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down walreceiver. 2023-12-07 10:55:33.910 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: startup shut down datareceiver. 2023-12-07 10:55:33.915 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:34.118 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: set gaussdb state file: db state(PROMOTING_STATE), server mode(Standby) 2023-12-07 10:55:34.121 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: archive recovery complete 2023-12-07 10:55:34.126 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: btree_xlog_cleanup is over, it takes time:0 microseconds 2023-12-07 10:55:34.126 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [REDO] LOG: UBTreeXlogCleanup is over, it takes time:1 microseconds 2023-12-07 10:55:34.127 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: skipping restartpoint, already performed at 0/B49DB10 2023-12-07 10:55:34.144 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DCF] LOG: Write dcfData version 1, apply index 107667, min apply index 113850 and crc 2602742273 into "/home/opengauss/opengauss/data/paxosindex.backup" 2023-12-07 10:55:34.148 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [DCF] LOG: Write dcfData version 1, apply index 107667, min apply index 113850 and crc 2602742273 into "/home/opengauss/opengauss/data/paxosindex" 2023-12-07 10:55:34.148 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: redo done, nextXid: 22321, startupMaxXid: 22321, recentLocalXmin: 21966, recentGlobalXmin: 21966, PendingPreparedXacts: 0, NextCommitSeqNo: 1021, cutoff_csn_min: 0. 2023-12-07 10:55:34.165 [postmaster][reaper][140126007985600] LOG: startupprocess exit 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.165 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: Archive slot change to active when DBstatus Promoting 2023-12-07 10:55:34.188 [unknown] [unknown] localhost 140121046771264 0[0:0#0] 0 [BACKEND] LOG: autovacuum launcher started 2023-12-07 10:55:34.192 [unknown] [unknown] localhost 140121424262720 0[0:0#0] 0 [BACKEND] LOG: walwriter started 2023-12-07 10:55:34.194 [unknown] [unknown] localhost 140120305952320 0[0:0#0] 0 [BACKEND] LOG: global stats collector started 2023-12-07 10:55:34.195 [unknown] [unknown] localhost 140120322733632 0[0:0#0] 0 [BACKEND] LOG: undo launcher started 2023-12-07 10:55:34.196 [unknown] [unknown] localhost 140121063552576 0[0:0#0] 0 [BACKEND] LOG: job scheduler started 2023-12-07 10:55:34.199 [unknown] [unknown] localhost 140120262964800 0[0:0#0] 0 [TIMECAPSULE] LOG: txnsnapcapturer started 2023-12-07 10:55:34.213 opengauss postgres localhost 140120214730304 0[0:0#0] 0 [BACKEND] LOG: ASP thread start 2023-12-07 10:55:34.213 opengauss postgres localhost 140120239896128 0[0:0#0] 0 [BACKEND] LOG: instrumention percentile started 2023-12-07 10:55:34.309 opengauss postgres localhost 140120067929664 0[0:0#0] 0 [BACKEND] LOG: process wlm thread starting up. 2023-12-07 10:55:34.312 opengauss postgres localhost 140120067929664 0[0:0#0] 0 [BACKEND] LOG: build user data finished 2023-12-07 10:55:34.312 opengauss postgres localhost 140120028083776 0[0:0#0] 0 [BACKEND] LOG: WLMarbiter thread is starting up. 2023-12-07 10:55:34.313 opengauss postgres localhost 140120048989760 0[0:0#0] 0 [BACKEND] LOG: WLMmonitor thread is starting up. 2023-12-07 10:55:34.313 opengauss postgres localhost 140120048989760 0[0:0#0] 0 [BACKEND] LOG: cannot find the mount device of data directory /home/opengauss/opengauss/data, io_priority is not availab le. 2023-12-07 10:55:34.479 [unknown] [unknown] localhost 140126007985600 0[0:0#0] 0 [BACKEND] LOG: gaussdb: fsync file "/home/opengauss/opengauss/data/gaussdb.state.temp" success 2023-12-07 10:55:34.480 [postmaster][reaper][140126007985600] LOG: update gaussdb state file: db state(NORMAL_STATE), server mode(Primary) 2023-12-07 10:55:34.480 [postmaster][reaper][140126007985600] LOG: database system is ready to accept connections 2023-12-07 10:55:34.493 [unknown] [unknown] localhost 140119914772032 0[0:0#0] 0 [UNDO] LOG: [UndoRecycleMain:549]undo recycle started. 2023-12-07 10:55:34.678 [unknown] [unknown] localhost 140120614762048 0[0:0#0] 0 [BACKEND] LOG: DCF leader send reply to follower 3 successfully, crc 1858586281. 2023-12-07 10:55:34.698 [unknown] [unknown] localhost 140120614762048 0[0:0#0] 0 [BACKEND] LOG: DCF leader send reply to follower 1 successfully, crc 1858586281. 2023-12-07 10:55:44.512 [unknown] [unknown] localhost 140119914772032 0[0:0#0] 0 [UNDO] LOG: [UndoRecycleMain:643]sleep 10s, ensure the snapcapturer can give the undorecyclemain a valid recycleXmin. 2023-12-07 10:55:49.140 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: database first startup and recovery finish,so do checkpointer 2023-12-07 10:55:49.359 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 1, start 31397], total_pages 0 2023-12-07 10:55:49.369 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: keep all the xlog segments, because current segno = 11, less than wal_keep_segments = 16 2023-12-07 10:55:49.369 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] WARNING: replicationSlotMinLSN is InvalidXLogRecPtr!!! 2023-12-07 10:55:49.369 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] WARNING: replicationSlotMaxLSN is InvalidXLogRecPtr!!! 2023-12-07 10:55:49.369 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] WARNING: The number of walsender 0 is less than the number of valid replconninfo 2. 2023-12-07 10:55:49.382 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: CreateCheckPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:0/B49DB90, oldRedo:0/B49DB10, newCkpLoc:0/B49DCB0, newRedo:0/B49DC30, preCkpLoc:0/B49DB90 2023-12-07 10:55:49.383 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: will update control file (create checkpoint), shutdown:0 2023-12-07 10:55:49.395 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [DCF] LOG: Write dcfData version 1, apply index 107668, min apply index 113863 and crc 4155646614 into "/home/opengauss/opengauss/data/paxosindex.backup" 2023-12-07 10:55:49.399 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [DCF] LOG: Write dcfData version 1, apply index 107668, min apply index 113863 and crc 4155646614 into "/home/opengauss/opengauss/data/paxosindex" 2023-12-07 10:55:49.401 [unknown] [unknown] localhost 140121355056704 0[0:0#0] 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 000000010000000000000000 【测试代码】:
Comments (
3
)
Sign in
to comment
Status
Canceled
Backlog
Confirmed
已答复
Canceled
挂起
Fixing
Done
待回归
测试中
Accepted
Assignees
Not set
刘展峰
liuzhanfeng2
Assignee
Collaborator
+Assign
+Mention
陈栋
chendong76
Assignee
Collaborator
+Assign
+Mention
Labels
Not set
Projects
Unprojected
Unprojected
Milestones
No related milestones
No related milestones
Pull Requests
None yet
None yet
Successfully merging a pull request will close this issue.
Branches
No related branch
Branches (20)
Tags (30)
master
6.0.0
5.0.0
datavec_poc
tp_poc
7.0.0-RC2
7.0.0-RC1
master_bak08271930
bugfix_0725
3.0.0
iud_dev
dev_board
5.1.0
kms
2.0.0
3.1.0
2.1.0
1.1.0
1.0.1
1.0.0
v7.0.0-RC2
v6.0.2
v7.0.0-RC1
v6.0.1
v3.0.6
v6.0.0
v3.0.5B009
v5.0.3
v5.0.2
v6.0.0-RC1
v3.0.5
v5.0.1
v5.1.0
5.1.0
v5.0.0
v3.0.3
v3.1.1
v3.0.2
v3.1.0
v2.0.5
v3.0.1
v2.0.4
v2.0.3
v3.0.0
v2.1.0
v2.0.1
v2.0.0
v1.1.0
v1.0.1
v1.0.0
Planed to start   -   Planed to end
-
Top level
Not Top
Top Level: High
Top Level: Medium
Top Level: Low
Priority
Not specified
Serious
Main
Secondary
Unimportant
Duration
(hours)
参与者(2)
C++
1
https://gitee.com/opengauss/openGauss-server.git
git@gitee.com:opengauss/openGauss-server.git
opengauss
openGauss-server
openGauss-server
Going to Help Center
Search
Git 命令在线学习
如何在 Gitee 导入 GitHub 仓库
Git 仓库基础操作
企业版和社区版功能对比
SSH 公钥设置
如何处理代码冲突
仓库体积过大,如何减小?
如何找回被删除的仓库数据
Gitee 产品配额说明
GitHub仓库快速导入Gitee及同步更新
什么是 Release(发行版)
将 PHP 项目自动发布到 packagist.org
Repository Report
Back to the top
Login prompt
This operation requires login to the code cloud account. Please log in before operating.
Go to login
No account. Register