一主一备模式下,进行简单测试,一段时间后显示:fail to allocate a zone
通过提供的工具收集日志,结果是死锁导致的:
2022-08-03 14:14:59.865 62ea1252.5025 postgres 139780237289216 PostgreSQL JDBC Driver 1538516 dn_6001_6002 40P01 3377699720531497 [BACKEND] ERROR: deadlock detected
2022-08-03 14:14:59.865 62ea1252.5025 postgres 139780237289216 PostgreSQL JDBC Driver 1538516 dn_6001_6002 40P01 3377699720531497 [BACKEND] DETAIL: Process 139780237289216 waits for ShareLock on transaction 1538556; blocked by process 139780876728064.
Process 139780876728064 waits for AccessExclusiveLock on tuple (19808,36) of (relation 16390, bucket 0) of database 15484; blocked by process 139779890865920.
Process 139779890865920 waits for ShareLock on transaction 1538516; blocked by process 139780237289216.
Process 139780237289216: delete from users where user_name = $1
线程代码:
public class DelectTest extends Thread {
String sql = "delete from users where user_name = ?;" +
"delete from users where user_name = ?;" +
"delete from users where user_name = ?";
JDBCUtil jdbc = new JDBCUtil();
public void run(){
Long startTime = System.currentTimeMillis();
Long endTime = null;
boolean flag = true;
while(flag){
try {
DelectTest.sleep(800);
PreparedStatement statement = jdbc.getStatement(sql);
statement.setString(1,"zhangsan");
statement.setString(2,"zhaoliu");
statement.setString(3,"longqi");
int i = statement.executeUpdate();
System.out.println("成功地删除了"+ i +"条数据");
//MyLogger.info("成功地删除了"+ i +"条数据");
endTime = System.currentTimeMillis();
if(endTime - startTime > 86400000*3){
flag = false;
}
} catch (SQLException throwables) {
throwables.printStackTrace();
//MyLogger.warning(throwables.getMessage());
} catch (InterruptedException e) {
e.printStackTrace();
} finally {
jdbc.closeSource();
}
}
}
}
之后将测试关闭程序,只要对数据库进行写操作,就会显示fail to allocate a zone,需要重启数据库才能恢复。请问这是为什么呢?怎么解决
Hey @黄佳, Welcome to openGauss Community.
All of the projects in openGauss Community are maintained by @opengauss-bot.
That means the developers can comment below every pull request or issue to trigger Bot Commands.
Please follow instructions at Here to find the details.
此处可能存在不合适展示的内容,页面不予展示。您可通过相关编辑功能自查并修改。
如您确认内容无涉及 不当用语 / 纯广告导流 / 暴力 / 低俗色情 / 侵权 / 盗版 / 虚假 / 无价值内容或违法国家有关法律法规的内容,可点击提交进行申诉,我们将尽快为您处理。
已将sql语句进行优化,避免了死锁,但是仍然会出现:failed to allocate a zone。收集到的日志如下:
clean statement thread start 2022-08-07 00:33:03.957 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [SLRU] LOG: remove old segments(<2048) under pg_csnlog 2022-08-07 00:33:03.957 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: truncate CSN log oldestXact 2595255, next xid 2595271 2022-08-07 00:33:04.165 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 193, start 3654], total_pages 0 2022-08-07 00:33:04.173 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: slotname: dn_6002, dummy: 0, restartlsn: 1/7F6C3FD8 2022-08-07 00:33:04.174 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: CreateCheckPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:1/7F593818, oldRedo:1/7F5930A8, newCkpLoc:1/7F6C4070, newRedo:1/7F62C0B0, preCkpLoc:1/7F4FD190 2022-08-07 00:33:04.174 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: will update control file (create checkpoint), shutdown:0 2022-08-07 00:33:04.175 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 00000001000000010000006D 2022-08-07 00:33:31.729 62ee97db.5054 postgres 140242626868992 Clean Statement thread 0 dn_6001_6002 00000 0 [BACKEND] LOG: clean statement thread start 2022-08-07 00:34:03.439 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 193, start 4139], total_pages 0 2022-08-07 00:34:03.447 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: slotname: dn_6002, dummy: 0, restartlsn: 1/7F759F90 2022-08-07 00:34:03.448 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: CreateCheckPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:1/7F6C4070, oldRedo:1/7F62C0B0, newCkpLoc:1/7F75A028, newRedo:1/7F6C37E8, preCkpLoc:1/7F593818 2022-08-07 00:34:03.448 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: will update control file (create checkpoint), shutdown:0 2022-08-07 00:34:03.449 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 00000001000000010000006D 2022-08-07 00:34:32.492 62ee9818.5053 postgres 140242626868992 Clean Statement thread 0 dn_6001_6002 00000 0 [BACKEND] LOG: clean statement thread start 2022-08-07 00:34:50.017 62ec74f9.6135 [unknown] 140243039483648 dn_6001 0 dn_6001_6002 01000 0 [BACKEND] WARNING: audit file total space(1074649551 B) exceed guc parameter(audit_space_limit: 1048576 KB) 2022-08-07 00:34:50.017 62ec74f9.6135 [unknown] 140243039483648 dn_6001 0 dn_6001_6002 01000 0 [BACKEND] WARNING: /var/log/omm/omm/pg_audit/dn_6001/997_adt 2022-08-07 00:35:03.674 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 193, start 4693], total_pages 0 2022-08-07 00:35:03.681 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: slotname: dn_6002, dummy: 0, restartlsn: 1/7F7F27B0 2022-08-07 00:35:03.682 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: CreateCheckPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:1/7F75A028, oldRedo:1/7F6C37E8, newCkpLoc:1/7F7F2830, newRedo:1/7F759F90, preCkpLoc:1/7F6C4070 2022-08-07 00:35:03.682 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: will update control file (create checkpoint), shutdown:0 2022-08-07 00:35:03.683 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 00000001000000010000006D 2022-08-07 00:35:28.075 62ee9850.6123 [unknown] 140242626868992 dn_6001 0 dn_6001_6002 00000 0 [AUTOVAC] LOG: report autovac startup on database 15484 to stats collector 2022-08-07 00:35:28.082 62ee9850.6123 [unknown] 140242626868992 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: start autovacuum on database "postgres" 2022-08-07 00:35:33.243 62ee9855.5054 postgres 140242626868992 Clean Statement thread 0 dn_6001_6002 00000 0 [BACKEND] LOG: clean statement thread start 2022-08-07 00:36:03.896 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 193, start 5051], total_pages 0 2022-08-07 00:36:03.904 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: slotname: dn_6002, dummy: 0, restartlsn: 1/7F88B758 2022-08-07 00:36:03.905 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: CreateCheckPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:1/7F7F2830, oldRedo:1/7F759F90, newCkpLoc:1/7F88B7D8, newRedo:1/7F88A868, preCkpLoc:1/7F75A028 2022-08-07 00:36:03.905 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: will update control file (create checkpoint), shutdown:0 2022-08-07 00:36:03.906 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 00000001000000010000006D 2022-08-07 00:36:34.043 62ee9892.5053 postgres 140242576525056 Clean Statement thread 0 dn_6001_6002 00000 0 [BACKEND] LOG: clean statement thread start 2022-08-07 00:37:34.842 62ee98ce.5057 postgres 140242610087680 Clean Statement thread 0 dn_6001_6002 00000 0 [BACKEND] LOG: clean statement thread start 2022-08-07 00:38:04.188 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 193, start 6138], total_pages 0 2022-08-07 00:38:04.196 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: slotname: dn_6002, dummy: 0, restartlsn: 1/7F9BBB20 2022-08-07 00:38:04.197 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: CreateCheckPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:1/7F88B7D8, oldRedo:1/7F88A868, newCkpLoc:1/7F9BBBA0, newRedo:1/7F923228, preCkpLoc:1/7F7F2830 2022-08-07 00:38:04.197 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: will update control file (create checkpoint), shutdown:0 2022-08-07 00:38:04.198 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 00000001000000010000006D 2022-08-07 00:38:35.611 62ee990b.5055 postgres 140242576525056 Clean Statement thread 0 dn_6001_6002 00000 0 [BACKEND] LOG: clean statement thread start 2022-08-07 00:39:03.486 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 193, start 6711], total_pages 0 2022-08-07 00:39:03.507 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: slotname: dn_6002, dummy: 0, restartlsn: 1/7FA506B0 2022-08-07 00:39:03.517 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: CreateCheckPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:1/7F9BBBA0, oldRedo:1/7F923228, newCkpLoc:1/7FA50730, newRedo:1/7F9BB930, preCkpLoc:1/7F88B7D8 2022-08-07 00:39:03.518 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: will update control file (create checkpoint), shutdown:0 2022-08-07 00:39:03.537 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 00000001000000010000006D 2022-08-07 00:39:36.489 62ee9948.5057 postgres 140242559743744 Clean Statement thread 0 dn_6001_6002 00000 0 [BACKEND] LOG: clean statement thread start 2022-08-07 00:40:03.798 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [DBL_WRT] LOG: [batch flush] DW truncate end: file_head[dwn 193, start 7292], total_pages 0 2022-08-07 00:40:03.806 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: slotname: dn_6002, dummy: 0, restartlsn: 1/7FAE98C0 2022-08-07 00:40:03.807 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: CreateCheckPoint PrintCkpXctlControlFile: [checkPoint] oldCkpLoc:1/7FA50730, oldRedo:1/7F9BB930, newCkpLoc:1/7FAE9940, newRedo:1/7FAE98C0, preCkpLoc:1/7F9BBBA0 2022-08-07 00:40:03.807 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: will update control file (create checkpoint), shutdown:0 2022-08-07 00:40:03.808 62ec74f9.6125 [unknown] 140243736393472 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: attempting to remove WAL segments older than log file 00000001000000010000006D 2022-08-07 00:40:15.377 62ec74ff.5063 [unknown] 140242984888064 dn_6002 0 dn_6001_6002 00000 0 [BACKEND] LOG: the config file has no change 2022-08-07 00:40:27.622 62ee997b.6123 [unknown] 140242576525056 dn_6001 0 dn_6001_6002 00000 0 [AUTOVAC] LOG: report autovac startup on database 1 to stats collector 2022-08-07 00:40:27.629 62ee997b.6123 [unknown] 140242576525056 dn_6001 0 dn_6001_6002 00000 0 [BACKEND] LOG: start autovacuum on database "template1" 2022-08-07 00:40:37.314 62ee9985.5053 postgres 140242576525056 Clean Statement thread 0 dn_6001_6002 00000 0 [BACKEND] LOG: clean statement thread start 2022-08-07 00:41:30.921 62ee99ba.5053 postgres 140242576525056 PostgreSQL JDBC Driver 0 dn_6001_6002 42809 3940649676307841 [BACKEND] ERROR: [AllocateZonesBeforXid:856]failed to allocate a zone 2022-08-07 00:41:30.921 62ee99ba.5053 postgres 140242576525056 PostgreSQL JDBC Driver 0 dn_6001_6002 42809 3940649676307841 [BACKEND] STATEMENT: insert into users(user_name,rowversion) values($1,now()),($2,now()),($3,now()),($4,now()),($5,now()) 2022-08-07 00:41:31.205 62ee99bb.5055 postgres 140242593306368 PostgreSQL JDBC Driver 0 dn_6001_6002 42809 3940649676307850 [BACKEND] ERROR: [AllocateZonesBeforXid:856]failed to allocate a zone
请问如何解决?
@黄佳 是否开启ustore? 参考一下此issue:#I5KOXV:长时间运行会因为“failed to allocate a zone”报错,无法进行任何写操作
实不相瞒,我在不同的插入的同时,也在不同的查询和删除数据。只是当时报错是插入线程之间进入死循环,我就只提了插入。
配置文件:
<?xml version="1.0" encoding="UTF-8"?>
<ROOT>
<!-- openGauss整体信息 -->
<CLUSTER>
<!-- 数据库名称 -->
<PARAM name="clusterName" value="Cluster_template" />
<!-- 数据库节点名称(hostname) -->
<PARAM name="nodeNames" value="gauss-db1,gauss-db2" />
<!-- 数据库安装目录-->
<PARAM name="gaussdbAppPath" value="/opt/huawei/install/app" />
<!-- 日志目录-->
<PARAM name="gaussdbLogPath" value="/var/log/omm" />
<!-- 临时文件目录-->
<PARAM name="tmpMppdbPath" value="/opt/huawei/tmp"/>
<!-- 数据库工具目录-->
<PARAM name="gaussdbToolPath" value="/opt/huawei/install/om" />
<!-- 数据库core文件目录-->
<PARAM name="corePath" value="/opt/huawei/corefile"/>
<!-- 节点IP,与数据库节点名称列表一一对应 -->
<PARAM name="backIp1s" value="****,****"/>
</CLUSTER>
<!-- 每台服务器上的节点部署信息 -->
<DEVICELIST>
<!-- 节点1上的部署信息 -->
<DEVICE sn="gauss-db1">
<!-- 节点1的主机名称 -->
<PARAM name="name" value="gauss-db1"/>
<!-- 节点1所在的AZ及AZ优先级 -->
<PARAM name="azName" value="AZ1"/>
<PARAM name="azPriority" value="1"/>
<!-- 节点1的IP,如果服务器只有一个网卡可用,将backIP1和sshIP1配置成同一个IP -->
<PARAM name="backIp1" value="****"/>
<PARAM name="sshIp1" value="****"/>
<!--dn-->
<PARAM name="dataNum" value="1"/>
<PARAM name="dataPortBase" value="5432"/>
<PARAM name="dataNode1" value="/opt/huawei/install/data/dn,gauss-db2,/opt/huawei/install/data/dn"/>
<PARAM name="dataNode1_syncNum" value="0"/>
</DEVICE>
<!-- 节点2上的节点部署信息,其中“name”的值配置为主机名称 -->
<DEVICE sn="gauss-db2">
<!-- 节点2的主机名称 -->
<PARAM name="name" value="gauss-db2"/>
<!-- 节点2所在的AZ及AZ优先级 -->
<PARAM name="azName" value="AZ1"/>
<PARAM name="azPriority" value="1"/>
<!-- 节点2的IP,如果服务器只有一个网卡可用,将backIP1和sshIP1配置成同一个IP -->
<PARAM name="backIp1" value="****"/>
<PARAM name="sshIp1" value="****"/>
</DEVICE>
</DEVICELIST>
</ROOT>
===============================================================
===============================================================
数据库之前测试就一个很普通的表
@黄佳 问题是否还能复现? issue:#I5KOXV:长时间运行会因为“failed to allocate a zone”报错,无法进行任何写操作 已经合入。
如果还有问题请帮忙提供数据库下postgresql.conf配置文件
同时提供连接一下连接数据库的参数(如果含连接池,带连接池参数),使用时有多少并发?
换了存储引擎之后就好了,之前并发量其实不高,就是io频率优点高,加上后面稍微优化了一下sql就没问题了。
非问题关闭
登录 后才可以发表评论