openGauss / openGauss-server

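[Business scope: open source] [Test type: logical replication] [Test activity: community] [Test version: 2.0.1] [Feature name: logical replication] [Environment: bare metal] Under heavy load, concurrent decoding of replication slots on the standby fails with connection errors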

Status: Completed
Type: Defect
Created: 2021-09-13 16:51

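[Hardware type/version]: RedFlag Asianux release 7.6.1911
[Database version]: gsql ((openGauss 2.0.1 build 24cdd19c)compiled at 2021-09-07 20:41:31 commit 0 last mr)
[Test environment]: one primary and four standbys (two synchronous, two asynchronous)
[Feature under test]: logical replication
[Test type]: reliability test
[Test steps]: 1: Create 30 logical replication slots
2: Load 20 warehouses of data with tpcc
3: Run tpcc for 20 minutes with 10 concurrent terminals
4: Open 30 sessions on the standby; each session starts decoding on its own logical replication slot, with a different slot per session
5: Observe the decoding on each session's replication slot
[Expected result]: every replication slot decodes normally
[Actual result]: connection-failure errors are reported; the default value of wal_sender_timeout is rather small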

Comments (8)

wang_xun934809 created the defect

Hey @wang_xun934809, Welcome to openGauss Community.
All of the projects in openGauss Community are maintained by @opengauss-bot.
That means the developers can comment below every pull request or issue to trigger Bot Commands.
Please follow instructions at https://gitee.com/opengauss/community/blob/master/contributors/command.en.md to find the details.

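wang_xun934809 modified the title
wang_xun934809 modified the description
wang_xun934809 modified the title
zhangxubo added the sig/storageengine label
zhangxubo set the assignee to wanglei
opengauss-bot changed the assignee from wanglei to pengjiong
wanglei set the priority to Major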
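@wang_xun934809 A question about this step:

4: Open 30 sessions on the standby; each session starts decoding on its own logical replication slot, with a different slot per session

Which command was used to open the sessions here? Was it a self-written application using START_REPLICATION over the streaming replication protocol, or was the decoded output fetched by running the SQL function pg_logical_slot_get_changes? Also, please paste the database log that mentions wal_sender_timeout.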

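After talking with the reporter, some additional information:

  1. Step 3 (running TPCC) and step 4 (logical decoding) were carried out at the same time
  2. The logical decoding tool used was pg_recvlogical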
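
For reference, step 4 with pg_recvlogical could be driven by something like the sketch below. It only builds the 30 command lines and prints them; it does not run anything. The slot names (`slot1` to `slot30`), host, port, and database name are placeholders, and the option spellings (`-h`, `-p`, `-d`, `-S`, `--start`, `-f -`, `-s`) are how I understand the tool's usage, so please check them against `pg_recvlogical --help` on the version under test.

```python
import shlex


def recvlogical_commands(n_slots, dbname, host, port, status_interval=None):
    """Build one pg_recvlogical command line per slot (nothing is executed).

    Slot names slot1..slotN are hypothetical; use the names of the slots
    that were actually created in step 1.
    """
    commands = []
    for i in range(1, n_slots + 1):
        argv = [
            "pg_recvlogical",
            "-h", host,
            "-p", str(port),
            "-d", dbname,
            "-S", f"slot{i}",   # a different slot for every session
            "--start",
            "-f", "-",          # write decoded changes to stdout
        ]
        if status_interval is not None:
            # -s sets the status (heartbeat) interval in seconds
            argv += ["-s", str(status_interval)]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    # Placeholder connection details, not taken from the original report.
    for argv in recvlogical_commands(30, "benchmarksql", "127.0.0.1", 5432):
        print(shlex.join(argv))
```

Each printed line would then be started in its own session on the standby.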
wang_xun934809 modified the title
opengauss-bot changed the assignee from pengjiong to 仲夏十三
pengjiong changed the assignee from 仲夏十三 to unassigned
pengjiong set the assignee to pengjiong
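If wal_sender_timeout is left at its default (6 s) and pg_recvlogical is run without the -s option to set the heartbeat interval (default 10 s), the problem reproduces without any load or concurrency at all.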
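
The mismatch described above can be written down as a small check. This is a deliberately simplified model, not the server's actual logic: it assumes the sender drops a connection once it has heard nothing from the client for wal_sender_timeout, and that the client only speaks up once per status interval. It ignores any keepalive requests the server may send. The function name is made up for illustration.

```python
def at_risk_of_sender_timeout(wal_sender_timeout_ms, status_interval_s):
    """Return True if the client's heartbeat interval is not shorter than
    the sender timeout, so an otherwise idle connection could be dropped
    before the next heartbeat arrives (simplified model, see above).
    """
    return status_interval_s * 1000 >= wal_sender_timeout_ms


if __name__ == "__main__":
    # Defaults quoted in this issue: 6 s timeout, 10 s heartbeat interval.
    print(at_risk_of_sender_timeout(6000, 10))
    # A shorter interval passed via -s, e.g. 3 s (an example value).
    print(at_risk_of_sender_timeout(6000, 3))
```

Under this model, on a build without the fix, passing a -s value below the timeout would be one way to avoid the mismatch; I have not verified that against the original environment.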
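After fixing the problem, with wal_sender_timeout at 6 s and pg_recvlogical run without -s (default 10 s), I followed the reproduction steps in this issue and decoding no longer failed. However, since the issue is quite old and many of the original parameter settings are unknown, I cannot be sure this matches the scenario in the issue. The configuration used for the local test is as follows.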
tpcc:

warehouses=20
loadWorkers=10

terminals=10
runTxnsPerTerminal=0
runMins=30
limitTxnsPerMin=300

terminalWarehouseFixed=true

newOrderWeight=45
paymentWeight=43
orderStatusWeight=4
deliveryWeight=4
stockLevelWeight=4

openGauss-server:

 (GaussDB Kernel V500R001C20 build 051c208a) compiled at 2022-09-16 16:04:49 commit 0 last mr   on aarch64-unknown-linux-gnu, compiled by g++ (GCC) 7.3.0, 64-bit

postgres=# select name,setting,unit from pg_settings where name like '%wal%';
             name             |  setting  | unit 
------------------------------+-----------+------
 max_wal_senders              | 35        | 
 wal_block_size               | 8192      | 
 wal_buffers                  | 2048      | 8kB
 wal_file_init_num            | 10        | 
 wal_keep_segments            | 16        | 
 wal_level                    | logical   | 
 wal_log_hints                | on        | 
 wal_receiver_buffer_size     | 65536     | kB
 wal_receiver_connect_retries | 1         | 
 wal_receiver_connect_timeout | 2         | s
 wal_receiver_status_interval | 5         | s
 wal_receiver_timeout         | 6000      | ms
 wal_segment_size             | 2048      | 8kB
 wal_sender_timeout           | 6000      | ms
 wal_sync_method              | fdatasync | 
 wal_writer_cpu               | -1        | 
 wal_writer_delay             | 200       | ms
 walsender_max_send_size      | 8192      | kB

30 concurrent pg_recvlogical instances, each on a different replication slot:

postgres=# select count(*) from pg_stat_activity where application_name='pg_recvlogical';
 count 
-------
    30
(1 row)

postgres=#  select * from pg_replication_slots ;
 slot_name |     plugin     | slot_type | datoid |   database   | active | xmin | catalog_xmin | restart_lsn | dummy_standby 
-----------+----------------+-----------+--------+--------------+--------+------+--------------+-------------+---------------
 slot1     | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot2     | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot3     | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot4     | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot5     | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot6     | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot7     | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot8     | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot9     | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot0     | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot11    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot12    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot13    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot14    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot15    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot16    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot17    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot18    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot19    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot10    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot21    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot22    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot23    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot24    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot25    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot26    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot27    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot28    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot29    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
 slot20    | mppdb_decoding | logical   |  16405 | benchmarksql | t      |      |        45998 | 0/A9F60170  | f
(30 rows)

tpcc results:

11:10:36,352 [Thread-7] INFO   jTPCC : Term-00,
11:10:36,352 [Thread-7] INFO   jTPCC : Term-00,
11:10:36,353 [Thread-7] INFO   jTPCC : Term-00, Measured tpmC (NewOrders) = 133.7
11:10:36,353 [Thread-7] INFO   jTPCC : Term-00, Measured tpmTOTAL = 299.96
11:10:36,353 [Thread-7] INFO   jTPCC : Term-00, Session Start     = 2022-09-17 10:40:34
11:10:36,353 [Thread-7] INFO   jTPCC : Term-00, Session End       = 2022-09-17 11:10:36
11:10:36,353 [Thread-7] INFO   jTPCC : Term-00, Transaction Count = 9009
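
As a sanity check on the numbers in this log, the overall rate can be recomputed from the transaction count and the session start and end times. The sketch below uses only the values printed above. The timestamps have one-second resolution, so the result can differ slightly from the logged tpmTOTAL.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def transactions_per_minute(count, session_start, session_end):
    """Average transactions per minute over the whole session."""
    start = datetime.strptime(session_start, TIME_FORMAT)
    end = datetime.strptime(session_end, TIME_FORMAT)
    elapsed_seconds = (end - start).total_seconds()
    return count * 60 / elapsed_seconds


if __name__ == "__main__":
    rate = transactions_per_minute(
        9009, "2022-09-17 10:40:34", "2022-09-17 11:10:36"
    )
    print(f"{rate:.2f} transactions per minute")
```

The rate comes out just under 300 per minute, which lines up with limitTxnsPerMin=300 in the tpcc configuration, so this run was capped by that limit.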
pengjiong changed the task status from To Do to Fixing
pengjiong changed the task status from Fixing to Completed via opengauss/openGauss-server Pull Request !2178

Verified: every replication slot decodes normally.
tpcc:
(screenshot not preserved)

opengauss:
gaussdb (openGauss 3.1.0 build a31f86e5) compiled at 2022-10-18 20:10:11 commit 0 last mr

tpccdb=# select * from pg_get_replication_slots();
slot_name | plugin | slot_type | datoid | active | xmin | catalog_xmin | restart_lsn | dummy_standby | confirmed_flush
-------------+----------------+-----------+--------+--------+------+--------------+-------------+---------------+-----------------
test_slot28 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
dn_6002 | | physical | 0 | t | | | 0/AAC00150 | f |
dn_6003 | | physical | 0 | t | | | 0/AAC00150 | f |
test_slot30 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
dn_6004 | | physical | 0 | t | | | 0/AAC00150 | f |
test_slot12 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot15 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot3 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot2 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot10 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot22 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
dn_6005 | | physical | 0 | t | | | 0/AAC00150 | f |
test_slot5 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot19 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot18 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot1 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot14 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot27 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot11 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot24 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot23 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot16 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot4 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot17 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot6 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot20 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot25 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot7 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot13 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot8 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot9 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot26 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot29 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
test_slot21 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
(34 rows)

tpccdb=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn | dummy_standby | confirmed_flush
-------------+----------------+-----------+--------+----------+--------+------+--------------+-------------+---------------+-----------------
test_slot28 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
dn_6002 | | physical | 0 | | t | | | 0/AABFFEF8 | f |
dn_6003 | | physical | 0 | | t | | | 0/AABFFEF8 | f |
test_slot30 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
dn_6004 | | physical | 0 | | t | | | 0/AABFFEF8 | f |
test_slot12 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot15 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot3 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot2 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot10 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot22 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
dn_6005 | | physical | 0 | | t | | | 0/AABFFEF8 | f |
test_slot5 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot19 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot18 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot1 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot14 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot27 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot11 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot24 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot23 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot16 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot4 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot17 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot6 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot20 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot25 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot7 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot13 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot8 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot9 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot26 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot29 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
test_slot21 | mppdb_decoding | logical | 15705 | postgres | t | | 58912 | 0/AABFFEA8 | f | 0/AABFFEF8
(34 rows)
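
Counting the active logical slots by eye in listings like the two above is error-prone, so here is a minimal sketch that does it from pasted psql output. It assumes the pipe-delimited layout shown above and looks columns up by header name, so it should work with or without the `database` column. The function name and the three-row sample are only for illustration.

```python
def count_active_logical_slots(psql_output):
    """Count rows whose slot_type is 'logical' and whose active flag is 't'."""
    # Only the header and data rows contain '|'; the dashed separator and
    # the "(N rows)" footer do not.
    rows = [line for line in psql_output.splitlines() if "|" in line]
    header = [cell.strip() for cell in rows[0].split("|")]
    type_idx = header.index("slot_type")
    active_idx = header.index("active")
    count = 0
    for line in rows[1:]:
        cells = [cell.strip() for cell in line.split("|")]
        if cells[type_idx] == "logical" and cells[active_idx] == "t":
            count += 1
    return count


if __name__ == "__main__":
    sample = """\
slot_name | plugin | slot_type | datoid | active | xmin | catalog_xmin | restart_lsn | dummy_standby | confirmed_flush
-------------+----------------+-----------+--------+--------+------+--------------+-------------+---------------+-----------------
test_slot28 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
dn_6002 | | physical | 0 | t | | | 0/AAC00150 | f |
test_slot30 | mppdb_decoding | logical | 15705 | t | | 58912 | 0/AAC00060 | f | 0/AAC00150
(3 rows)
"""
    print(count_active_logical_slots(sample))
```

On the full 34-row listings, the expected count is 30 if all test slots are decoding, since the remaining 4 rows are physical slots.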

tpcc results:
(screenshot not preserved)
