diff --git "a/Test_Result/openGauss_6.0.0_release/\345\206\205\346\240\270\345\234\272\346\231\257\345\214\226/openGauss920B\346\234\272\345\231\250\346\200\247\350\203\275\346\265\213\350\257\225\346\212\245\345\221\212.md" "b/Test_Result/openGauss_6.0.0_release/\345\206\205\346\240\270\345\234\272\346\231\257\345\214\226/openGauss\346\225\260\346\215\256\345\272\223\351\262\262\351\271\217920 V200\346\234\272\345\231\250\346\200\247\350\203\275\346\265\213\350\257\225\346\212\245\345\221\212.md" similarity index 76% rename from "Test_Result/openGauss_6.0.0_release/\345\206\205\346\240\270\345\234\272\346\231\257\345\214\226/openGauss920B\346\234\272\345\231\250\346\200\247\350\203\275\346\265\213\350\257\225\346\212\245\345\221\212.md" rename to "Test_Result/openGauss_6.0.0_release/\345\206\205\346\240\270\345\234\272\346\231\257\345\214\226/openGauss\346\225\260\346\215\256\345\272\223\351\262\262\351\271\217920 V200\346\234\272\345\231\250\346\200\247\350\203\275\346\265\213\350\257\225\346\212\245\345\221\212.md" index 454e9ff5b336d98755be0abf5a7fbcddc5505353..dbccd5beb94cb6bcac613b0f08f0962138ca51a6 100644 --- "a/Test_Result/openGauss_6.0.0_release/\345\206\205\346\240\270\345\234\272\346\231\257\345\214\226/openGauss920B\346\234\272\345\231\250\346\200\247\350\203\275\346\265\213\350\257\225\346\212\245\345\221\212.md" +++ "b/Test_Result/openGauss_6.0.0_release/\345\206\205\346\240\270\345\234\272\346\231\257\345\214\226/openGauss\346\225\260\346\215\256\345\272\223\351\262\262\351\271\217920 V200\346\234\272\345\231\250\346\200\247\350\203\275\346\265\213\350\257\225\346\212\245\345\221\212.md" @@ -5,17 +5,17 @@ 修订记录 -| 日期 | 修订版本 | 修改描述 | 作者 | -| -------- | -------- | ------------------------------------- | ------- | -| 2024.9.5 | 1.0 | openGauss920B机器性能测试报告初稿完成 | liutong | +| 日期 | 修订版本 | 修改描述 | 作者 | +| -------- | -------- | --------------------------------------------------- | ------- | +| 2024.9.5 | 1.0 | openGauss数据库鲲鹏920 V200机器性能测试报告初稿完成 | liutong | 
[TOC] -**Keywords 关键词**:分区表、性能、tpmC、920B +**Keywords 关键词**:分区表、性能、tpmC、鲲鹏920 V200 -**Abstract 摘要**:本文对openGauss 6.0.0 RC1版本和6.0.0版本在openGauss920B机器的2p单机1h场景下的tpcc性能测试的情况进行详细说明,并给出最终测试结论。 +**Abstract 摘要**:本文对openGauss 6.0.0 RC1版本和6.0.0版本在鲲鹏920 V200机器的2p单机1h场景下的tpcc性能测试的情况进行详细说明,并给出最终测试结论。 **缩略语清单: ** @@ -26,7 +26,7 @@ # 1 概述 -本特性为了验证openGauss 6.0.0 RC1版本和6.0.0版本可以在鲲鹏920B机器上达到tpmC 230w的性能标准。 +本特性为了验证openGauss 6.0.0 RC1版本和6.0.0版本可以在鲲鹏920 V200机器上达到tpmC 230w的性能标准。 # 2 测试版本说明 @@ -47,14 +47,14 @@ ## 3.1 测试结论总结 -openGauss920B机器性能测试共计执行2个用例,主要覆盖了6.0.0及6.0.0 RC1版本的tpcc性能测试,性能均达到标准230w,共发现0个问题单,整体质量良好。 +openGauss数据库鲲鹏920 V200机器性能测试共计执行2个用例,主要覆盖了6.0.0及6.0.0 RC1版本的tpcc性能测试,性能均达到标准230w,共发现0个问题单,整体质量良好。 | 测试活动 | 活动评价 | | ------------------------------------------------------------ | ------------------------------------------------------------ | | 安装6.0.0 RC1版本数据库,根据调优文档对测试环境进行参数调优,之后对数据库进行1000仓861并发的tpcc性能测试 | 1000仓861并发下tpmC 性能最终达到236.45w,达到标准230w,符合预期 | | 安装6.0.0版本数据库,根据调优文档对测试环境进行参数调优,之后对数据库进行1000仓861并发的tpcc性能测试 | 1000仓861并发下tpmC 性能最终达到239.33w,达到标准230w,符合预期 | -openGauss 6.0.0及6.0.0 RC1版本在openGauss920B机器的2p单机1h场景下完成了tpcc性能测试,回归测试结果正常。 +openGauss 6.0.0及6.0.0 RC1版本在鲲鹏920 V200机器的2p单机1h场景下完成了tpcc性能测试,回归测试结果正常。 ## 3.2 约束说明 @@ -76,9 +76,9 @@ openGauss 6.0.0及6.0.0 RC1版本在openGauss920B机器的2p单机1h场景下完 ### 4.1.1 新需求质量评价 -| 特性 | 特性价值评估 | 应用说明及关键约束假设依赖 | 关键遗留事项如缺陷等 | 测试整体覆盖情况 | 特性质量评估 | 主要风险 | -| -------------------------------------------- | ------------------------------------------------------------ | -------------------------- | -------------------- | ----------------------------------------------------------- | -------------------------- | -------- | -| openGauss 920B机器2p单机1h场景下tpcc性能测试 | 证明openGauss可以在920B机器上跑到230w,相当于标准测试环境的4p单机场景的性能值 | 详见3.2章节描述 | 无 | 覆盖openGauss的6.0.0版本和6.0.0 RC1版本性能测试以及资料测试 | | 无 | +| 特性 | 特性价值评估 | 应用说明及关键约束假设依赖 | 关键遗留事项如缺陷等 | 测试整体覆盖情况 | 特性质量评估 | 主要风险 | +| --------------------------------------------------------- | 
------------------------------------------------------------ | -------------------------- | -------------------- | ----------------------------------------------------------- | -------------------------- | -------- | +| openGauss数据库鲲鹏920 V200机器2p单机1h场景下tpcc性能测试 | 证明openGauss可以在鲲鹏920 V200机器上跑到230w,相当于标准测试环境的4p单机场景的性能值 | 详见3.2章节描述 | 无 | 覆盖openGauss的6.0.0版本和6.0.0 RC1版本性能测试以及资料测试 | | 无 | *特性质量评估说明*: @@ -142,7 +142,7 @@ openGauss 6.0.0及6.0.0 RC1版本在openGauss920B机器的2p单机1h场景下完 ## 5.1 覆盖率分析 -该需求覆盖了覆盖openGauss的6.0.0版本和6.0.0 RC1版本在920B机器的2p单机1h场景下的性能测试以及资料测试。 +该需求覆盖了覆盖openGauss的6.0.0版本和6.0.0 RC1版本在鲲鹏920 V200机器的2p单机1h场景下的性能测试以及资料测试。 ## 5.2 缺陷统计和分析 @@ -163,9 +163,9 @@ openGauss 6.0.0及6.0.0 RC1版本在openGauss920B机器的2p单机1h场景下完 ## 6.1 测试策略回顾 -| 编号 | 特性 | 验证策略 | 是否按照测试策略执行 | -| ---- | -------------------------------------------- | ------------------------------------------------------------ | -------------------- | -| 1 | openGauss 920B机器2p单机1h场景下tpcc性能测试 | 在openGauss920B机器的2p单机1h场景下进行6.0.0版本和6.0.0 RC1版本的tpcc性能测试,消除可能的版本影响 | YES | +| 编号 | 特性 | 验证策略 | 是否按照测试策略执行 | +| ---- | --------------------------------------------------------- | ------------------------------------------------------------ | -------------------- | +| 1 | openGauss数据库鲲鹏920 V200机器2p单机1h场景下tpcc性能测试 | 对openGauss数据库在鲲鹏920 V200机器的2p单机1h场景下进行6.0.0版本和6.0.0 RC1版本的tpcc性能测试,消除可能的版本影响 | YES | ## 6.2 测试设计评估 diff --git "a/Test_Result/openGauss_6.0.0_release/\345\267\245\345\205\267\351\223\276/AGE\346\217\222\344\273\266\346\200\247\350\203\275\344\274\230\345\214\226\346\265\213\350\257\225\346\212\245\345\221\212.md" "b/Test_Result/openGauss_6.0.0_release/\345\267\245\345\205\267\351\223\276/AGE\346\217\222\344\273\266\346\200\247\350\203\275\344\274\230\345\214\226\346\265\213\350\257\225\346\212\245\345\221\212.md" index 5192ac6d9519053ceb4ac6b6318d6f51d04fe21c..ee549cfcdbb6f287250d6c528e86ddbd148c2112 100644 --- 
"a/Test_Result/openGauss_6.0.0_release/\345\267\245\345\205\267\351\223\276/AGE\346\217\222\344\273\266\346\200\247\350\203\275\344\274\230\345\214\226\346\265\213\350\257\225\346\212\245\345\221\212.md" +++ "b/Test_Result/openGauss_6.0.0_release/\345\267\245\345\205\267\351\223\276/AGE\346\217\222\344\273\266\346\200\247\350\203\275\344\274\230\345\214\226\346\265\213\350\257\225\346\212\245\345\221\212.md" @@ -23,10 +23,30 @@ openGauss、Apache-AGE、PostgreSQL | ------ | -------- | -------- | | 无 | | | +功能验收:
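+1) The official AGE test cases pass;
+ Passed. See the test report on adapting AGE to openGauss.
+2) TSBS performance results exceed those of AGE on the latest PostgreSQL version;
+ Passed. This document is the performance test report.
+3) The test cases developed in this project cover all external interfaces and pass;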
+| 交付件 | 类型 | 验收标准 | 验收结论 | +| :------: | :-------: | :-------: | :-------: | +| 交付需求中所有的代码改动 | 代码 | 代码交付 | 验收通过 | +| 测试用例集成到流水线 | 测试用例交付 | 测试用例代码合入 | 验收通过 | +| 第二阶段性能优化修改点说明文档 | 文档 | 修改点说明文档 | 验收通过 | +| 设计文档 | 文档 | 设计文档交付 | 验收通过 | +| 测试方案 | 文档 | 测试方案交付 | 验收通过 | +| 测试报告 | 文档 | 测试报告交付 | 验收通过 | +| check-in报告 | 文档 | check-in报告交付 | 验收通过 | +| 资料说明 | 文档 | 资料文档交付 | 验收通过 | + # 1 特性概述 运行图数据库测试套件,评估openGauss上AGE插件的性能,识别待改进点,突破相应算法和关键技术,目标是性能超越基于主流PostgreSQL版本(PG11和PG16)的AGE。 +## 1.1 AGE原理说明 +AGE的cypher语法树会最终转换为SQL的语法树,AGE对查询的优化层和执行层并不做修改。 + # 2 特性测试信息 本节描述被测对象的版本信息、环境信息 | 版本信息 | 测试起始时间 | 测试结束时间 | @@ -137,8 +157,9 @@ x86机器执行三次,取平均值:
First-run query statement comparison
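![First-run query statement comparison](images/AGE_x86%E6%9E%B6%E6%9E%84%E7%AC%AC%E4%B8%80%E6%AC%A1%E6%9F%A5%E8%AF%A2%E8%AF%AD%E5%8F%A5%E6%AF%94%E8%BE%83.png)
+**Note: blank bars correspond to query types CQ13 and CQ14, which AGE does not support**
-- Analysis of first-run query results on CentOS x86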
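+- Analysis of first-run query results on CentOS x86
> - SF0.1 dataset: openGauss overall query time is about 60.49% better than PG
> - SF1 dataset: openGauss overall query time is about 83.36% better than PG
> - SF10 dataset: openGauss overall query time is about 90.24% better than PG
@@ -162,9 +183,10 @@ SQ7类型查询随着数据集增大,openGauss数据库的表现越差,不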
Second-run query statement comparison
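![Second-run query statement comparison](images/AGE_x86%E6%9E%B6%E6%9E%84%E7%AC%AC%E4%BA%8C%E6%AC%A1%E6%9F%A5%E8%AF%A2%E8%AF%AD%E5%8F%A5%E6%AF%94%E8%BE%83.png)
+**Note: blank bars correspond to query types CQ13 and CQ14, which AGE does not support**
-- Analysis of second-run query results on CentOS x86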
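+- Analysis of second-run query results on CentOS x86
> - SF0.1 dataset: openGauss overall query time is about 1.34% better than PG
> - SF1 dataset: openGauss overall query time is about 48.71% better than PG
> - SF10 dataset: openGauss overall query time is about 75.61% better than PG
@@ -202,6 +224,8 @@ arm机器执行三次,取平均值:
![First-run query statement comparison](images/AGE_x86%E6%9E%B6%E6%9E%84%E7%AC%AC%E4%B8%80%E6%AC%A1%E6%9F%A5%E8%AF%A2%E8%AF%AD%E5%8F%A5%E6%AF%94%E8%BE%83.png)
+**Note: blank bars correspond to query types CQ13 and CQ14, which AGE does not support**
+
- Analysis of first-run query results on openEuler aarch64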
> - SF0.1 dataset: openGauss overall query time is about 56.54% better than PG
> - SF1 dataset: openGauss overall query time is about 82.86% better than PG
@@ -229,7 +253,9 @@ SQ7类型查询随着数据集增大,openGauss数据库的表现越差,不
![Second-run query statement comparison](images/AGE_x86%E6%9E%B6%E6%9E%84%E7%AC%AC%E4%BA%8C%E6%AC%A1%E6%9F%A5%E8%AF%A2%E8%AF%AD%E5%8F%A5%E6%AF%94%E8%BE%83.png)
-- Analysis of second-run query results on openEuler aarch64
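+**Note: blank bars correspond to query types CQ13 and CQ14, which AGE does not support**
+
+- Analysis of second-run query results on openEuler aarch64
> - SF0.1 dataset: openGauss overall query time is about 5.3% better than PG
> - SF1 dataset: openGauss overall query time is about 99.96% better than PG
> - SF10 dataset: openGauss overall query time is about 77.39% better than PG
@@ -254,17 +280,649 @@ SQ4类型查询SF0.1数据集openGauss性能劣化约1倍;SF1数据集openGaus
SQ5 queries: on the SF0.1 dataset openGauss performance degrades by about 1x; on SF1 by about 47%; on SF10 by about 13.4x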
SQ7 queries: on the SF0.1 dataset openGauss performance degrades by about 29%; on SF1 by about 1.9x; on SF10 by about 3.1x
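
The report does not spell out how the degradation figures above are derived. As a minimal sketch, assuming "degradation" means the extra time openGauss takes expressed as a fraction of the PG time, the calculation could look like the following. The helper names `runtime_ms` and `relative_slowdown` are hypothetical. The sample strings are the closing timing lines of the two SQ5 `EXPLAIN ANALYZE` plans quoted in section 4.2.1.2 of this report, not the three-run averages behind the figures above.

```python
import re

# openGauss ends EXPLAIN ANALYZE output with "Total runtime: X ms";
# PostgreSQL ends it with "Execution Time: X ms".
RUNTIME_RE = re.compile(
    r"(?:Total runtime|Execution Time):\s*([0-9]+(?:\.[0-9]+)?)\s*ms"
)


def runtime_ms(plan_text: str) -> float:
    """Return the total runtime in milliseconds reported by a plan."""
    match = RUNTIME_RE.search(plan_text)
    if match is None:
        raise ValueError("no runtime line found in plan text")
    return float(match.group(1))


def relative_slowdown(opengauss_ms: float, pg_ms: float) -> float:
    """Assumed definition: extra openGauss time as a fraction of PG time."""
    return (opengauss_ms - pg_ms) / pg_ms


# Closing lines of the SQ5 plans quoted in section 4.2.1.2 (single runs).
opengauss_plan = "Total runtime: 23.658 ms"
pg_plan = "Planning Time: 0.939 ms\nExecution Time: 12.732 ms"

og_ms = runtime_ms(opengauss_plan)
pg_ms = runtime_ms(pg_plan)
print(f"openGauss {og_ms} ms, PG {pg_ms} ms, "
      f"slowdown {relative_slowdown(og_ms, pg_ms):.1%}")
```

Under this definition a value of 1.0 corresponds to "about 1x" degradation, that is, openGauss taking twice as long as PG. If the report's figures use a different convention (for example the plain ratio of the two times), only `relative_slowdown` needs to change.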
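-### 4.2.1 Performance Degradation Analysis
-> - Strengths
->Section 4.2 shows that openGauss+AGE outperforms PostgreSQL+AGE on variable-length path queries.
->In the query performance tests, openGauss+AGE is more efficient than PostgreSQL+AGE on 70.59% of the queries.
-
-- Execution plan examples
-![Query comparison](images/%E5%A4%8D%E6%9D%82%E6%9F%A5%E8%AF%A25%E5%88%86%E6%9E%90.png)
-Even when the execution plans are identical, openGauss+AGE may still be slower than PostgreSQL+AGE; the main causes are probably the parallelism parameters and differences in the efficiency of the basic operators.
-![DDL](images/pg%E7%9A%84DDL%E8%AF%AD%E5%8F%A5.png)
-When AGE creates functions, the PG side takes one extra parameter that openGauss does not support.
-
-- Supporting evidence
-![Flame graph](images/%E6%80%A7%E8%83%BD%E7%81%AB%E7%84%B0%E5%9B%BE.png)
-The graph shows that the AGE functions all run inside the openGauss kernel functions ExecScan and ExecMakeTableFunctionResult; the underlying scan still uses the openGauss storage engine, so the performance bottleneck is in the storage layer.
\ No newline at end of file
+### 4.2.1 Performance Difference Analysis
+- Strengths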
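+Section 4.2 shows that openGauss+AGE outperforms PostgreSQL+AGE on variable-length path queries.
+In total query time across all query types, openGauss outperforms PG.
+In the query performance tests, openGauss+AGE is more efficient than PostgreSQL+AGE on 70.59% of the queries.
+
+- Summary of the queries where openGauss is slower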
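+**SQ5 query** The slower performance is mainly because openGauss and PostgreSQL generate different query plans; PG uses a parallel plan by default. When openGauss chooses the same plan, it outperforms PostgreSQL. This also indicates that, for this query's plan, PostgreSQL's parallel mechanism is not the root cause of the efficiency difference.
+**SQ7 query** PostgreSQL uses a parallel plan by default. After parallel plans are forced on in openGauss, the test shows that with query_dop = 8 the openGauss query efficiency improves by 2x, which is what it takes to reach the same order of magnitude as PostgreSQL (parallelism of 2). For SQ7, therefore, openGauss is slower than PostgreSQL not only because the execution plans differ; the system's parallel operators also have a considerable effect on query efficiency.
+**CQ8 query** openGauss and PostgreSQL generate different query plans; when openGauss chooses the same plan, it outperforms PostgreSQL.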
+**CQ5查询** openGauss的查询效率较PostgreSQL差,问题的原因主要是查询计划的不同。在并行开启的条件下,甚至产生了副作用,对于AGE插件来说,openGauss对并行的开启条件较PostgreSQL来说更为苛刻。因为AGE在将Cypher转化为SQL语句后,查询会包含大量的子查询,子计划,和函数的嵌套调用。 + +#### 4.2.1.1 背景 + +- AGE的cypher语法树会最终转换为SQL的语法树,AGE对查询的优化层和执行层并不做修改。因此一些最终只能转化为数据库本身的算子的查询语句,其执行效率完全取决于数据库内核的优化器和执行器。 +- 对于变长路径的模式匹配,如match (n:person)-[e:knows *1..2 ]->(m:person),查询n通过1-2跳的边与m有knows关系。AGE在PG中,使用函数算子执行,需要将图数据从磁盘加载到内存中,再进行查询,有任何增删改的事务、或者新的连接都会导致下次查询时重新加载到内存。并且该算子一直在修复bug和漏洞。 +![算子](images/AGE%E7%AE%97%E5%AD%90.png) +AGE在openGauss适配后,变长路径查询使用算子优化,不需要重复的内存加载,查询效率有所提升,在有事务和新连接建立时,查询效率不会产生较大的波动,同时内存占用也更小,因此对于大规模的数据集及事务型查询方面更占优势。 +- 鉴于第一次查询时内存加载的原因,openGauss性能优势明显,为了提供一个更全面的性能评价,所以我们补充了在连接不断开,没有任何事务的情况下,立即进行第二次查询的性能测试对比。 + +下面对测试用例第一次查询中的执行计划的主要差异点进行说明,该计划 是在ldbc1数据集上进行的测试输出。 + +#### 4.2.1.2 SQ5查询语句(短查询5) + +``` +load 'age'; +set search_path= ag_catalog; +select * from ag_catalog.cypher('ldbc', $$ explain analyze MATCH (m:post {id: 24697} )-[:hascreatorpost]->(p:person) +RETURN +    p.id AS personId, +    p.firstName AS firstName, +    p.lastName AS lastName +        $$ )  AS (personId agtype,firstName agtype, +     lastName agtype); +``` +- openGauss 执行计划 + +``` +Hash Join (cost=726.37..12262.88 rows=1049 width=347) (actual time=20.157..20.177 rows=1 loops=1) + Hash Cond: (_age_default_alias_0.end_id = p.id) + -> Nested Loop (cost=32.13..11530.62 rows=1049 width=8) (actual time=0.725..0.735 rows=1 loops=1) + -> Bitmap Heap Scan on post m (cost=32.13..3664.90 rows=1049 width=8) (actual time=0.682..0.685 rows=1 loops=1) + Recheck Cond: (properties @> agtype_build_map('id'::text, '24697'::agtype)) + Heap Blocks: exact=1 + -> Bitmap Index Scan on graph_post_idx_prop (cost=0.00..31.87 rows=1049 width=0) (actual time=0.607..0.607 rows=1 loops=1) + Index Cond: (properties @> agtype_build_map('id'::text, '24697'::agtype)) + -> Index Scan using hascreatorpost_start on hascreatorpost _age_default_alias_0 (cost=0.00..7.49 rows=1 width=16) (actual time=0.031..0.035 rows=1 loops=1) + Index Cond: 
(start_id = m.id) + -> Hash (cost=569.66..569.66 rows=9966 width=347) (actual time=18.635..18.635 rows=9966 loops=1) + Buckets: 32768 Batches: 1 Memory Usage: 3945kB + -> Seq Scan on person p (cost=0.00..569.66 rows=9966 width=347) (actual time=0.008..5.055 rows=9966 loops=1) +Total runtime: 23.658 ms +``` +- Postgres 执行计划 + +``` +Gather (cost=1053.33..9762.93 rows=1049 width=96) (actual time=2.017..12.665 rows=1 loops=1) + Workers Planned: 1 + Workers Launched: 1 + -> Nested Loop (cost=53.33..8658.03 rows=617 width=96) (actual time=0.432..0.437 rows=0 loops=2) + -> Nested Loop (cost=53.04..8450.08 rows=617 width=8) (actual time=0.391..0.393 rows=0 loops=2) + -> Parallel Bitmap Heap Scan on post m (cost=52.62..3689.78 rows=617 width=8) (actual time=0.379..0.381 rows=0 loops=2) + Recheck Cond: (properties @> '{"id": "24697"}'::agtype) + Heap Blocks: exact=1 + -> Bitmap Index Scan on graph_post_idx_prop (cost=0.00..52.35 rows=1049 width=0) (actual time=0.177..0.178 rows=1 loops=1) + Index Cond: (properties @> '{"id": "24697"}'::agtype) + -> Index Scan using hascreatorpost_start on hascreatorpost _age_default_alias_0 (cost=0.43..7.71 rows=1 width=16) (actual time=0.017..0.017 rows=1 loops=1) + Index Cond: (start_id = m.id) + -> Index Scan using graph_person_idx on person p (cost=0.29..0.30 rows=1 width=372) (actual time=0.017..0.019 rows=1 loops=1) + Index Cond: (id = _age_default_alias_0.end_id) +Planning Time: 0.939 ms +Execution Time: 12.732 ms +``` +- __分析__ +pg中存在并行 Parallel 和Gather 相关的算子,并且查询计划不完全相同,因此查询效率比openGauss快 + +Pg查询计划 +``` +-> Parallel Bitmap Heap Scan on post m (cost=52.62..3689.78 rows=617 width=8) (actual time=0.379..0.381 rows=0 loops=2) +Gather (cost=1053.33..9762.93 rows=1049 width=96) (actual time=2.017..12.665 rows=1 loops=1) + Workers Planned: 1 + Workers Launched: 1 +``` +openGauss中相同位置的查询计划: +``` +Bitmap Heap Scan on post m (cost=32.13..3664.90 rows=1049 width=8) (actual time=0.682..0.685 rows=1 loops=1) +Nested Loop (cost=32.13..11530.62 
rows=1049 width=8) (actual time=0.725..0.735 rows=1 loops=1) +``` +当关闭seq_scan时,openGauss会获得比postgres更优的性能 +``` +Nested Loop (cost=32.13..17290.98 rows=1049 width=347) (actual time=0.737..0.757 rows=1 loops=1) + Output: agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(p.id, _label_name(21513::oid, p.id), p.properties), '"id"'::agtype]), agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(p.id, _label_name(21513::oid, p.id), p.properties), '"firstName"'::agtype]), agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(p.id, _label_name(21513::oid, p.id), p.properties), '"lastName"'::agtype]) + -> Nested Loop (cost=32.13..11530.62 rows=1049 width=8) (actual time=0.437..0.441 rows=1 loops=1) + Output: _age_default_alias_0.end_id + -> Bitmap Heap Scan on ldbc.post m (cost=32.13..3664.90 rows=1049 width=8) (actual time=0.415..0.417 rows=1 loops=1) + Output: m.id, m.properties + Recheck Cond: (m.properties @> agtype_build_map('id'::text, '24697'::agtype)) + Heap Blocks: exact=1 + -> Bitmap Index Scan on graph_post_idx_prop (cost=0.00..31.87 rows=1049 width=0) (actual time=0.373..0.373 rows=1 loops=1) + Index Cond: (m.properties @> agtype_build_map('id'::text, '24697'::agtype)) + -> Index Scan using hascreatorpost_start on ldbc.hascreatorpost _age_default_alias_0 (cost=0.00..7.49 rows=1 width=16) (actual time=0.016..0.017 rows=1 loops=1) + Output: _age_default_alias_0.id, _age_default_alias_0.start_id, _age_default_alias_0.end_id, _age_default_alias_0.properties + Index Cond: (_age_default_alias_0.start_id = m.id) + -> Index Scan using unique_graph_person_idx on ldbc.person p (cost=0.00..5.46 rows=1 width=347) (actual time=0.008..0.010 rows=1 loops=1) + Output: p.id, p.properties + Index Cond: (p.id = _age_default_alias_0.end_id) +Total runtime: 1.427 ms +``` +- **总结** +SQ5性能差的原因主要在于openGauss和postgres生成的查询计划不同,当openGauss也选择相同查询计划时,性能会优于postgres。同时也说明该查询对应查询计划的并行机制不是影响其效率的根本原因。 + +#### 4.2.1.3 SQ7查询语句(短查询7) +``` +load 'age'; +set search_path= 
ag_catalog; +select * from ag_catalog.cypher('ldbc', $$ explain analyze MATCH (m:comment)<-[:replyofcomment]-(c:comment)-[:hascreatorcomment]->(p:person) where m.id=57459 + OPTIONAL MATCH (m)-[:hascreatorcomment]->(a:person)-[r:knows]-(p) + RETURN c.id AS commentId, + c.content AS commentContent, + c.creationDate AS commentCreationDate, + p.id AS replyAuthorId, + p.firstName AS replyAuthorFirstName, + p.lastName AS replyAuthorLastName, + CASE + WHEN r IS NULL THEN false + ELSE true + END AS replyAuthorKnowsOriginalMessageAuthor + ORDER BY c.creationDate DESC, p.id + $$ ) AS (commentId agtype,commentContent agtype, + commentCreationDate agtype,replyAuthorId agtype,replyAuthorFirstName agtype,replyAuthorLastName agtype,replyAuthorKnowsOriginalMessageAuthor agtype); +``` +- openGauss 执行计划 +``` +Sort (cost=18024.27..18024.27 rows=1 width=583) (actual time=1004.353..1004.354 rows=2 loops=1) + Sort Key: (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(c.id, _label_name(17433::oid, c.id), c.properties), '"creationDate"'::agtype])) DESC, (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(p.id, _label_name(17433::oid, p.id), p.properties), '"id"'::agtype])) + Sort Method: quicksort Memory: 25kB + -> Nested Loop Left Join (cost=190.29..18024.26 rows=1 width=583) (actual time=234.028..1003.736 rows=2 loops=1) + -> Nested Loop (cost=8.28..17826.13 rows=1 width=755) (actual time=163.892..932.459 rows=2 loops=1) + -> Nested Loop (cost=8.28..17818.84 rows=1 width=416) (actual time=163.401..931.124 rows=2 loops=1) + -> Nested Loop (cost=8.28..17811.43 rows=1 width=228) (actual time=135.114..872.744 rows=2 loops=1) + -> Hash Join (cost=8.28..17789.19 rows=3 width=220) (actual time=92.768..793.254 rows=2 loops=1) + Hash Cond: (_age_default_alias_0.end_id = m.id) + -> Seq Scan on replyofcomment _age_default_alias_0 (cost=0.00..15913.98 rows=744898 width=24) (actual time=11.135..518.025 rows=744898 loops=1) + -> Hash (cost=8.27..8.27 rows=1 width=204) (actual 
time=39.032..39.032 rows=1 loops=1) + Buckets: 32768 Batches: 1 Memory Usage: 257kB + -> Index Scan using unique_comment_idx on comment m (cost=0.00..8.27 rows=1 width=204) (actual time=38.987..38.993 rows=1 loops=1) + Index Cond: (agtype_access_operator(VARIADIC ARRAY[properties, '"id"'::agtype]) = '57459'::agtype) + -> Index Scan using hascreatorcomment_start on hascreatorcomment _age_default_alias_1 (cost=0.00..7.40 rows=1 width=24) (actual time=78.973..78.981 rows=2 loops=2) + Index Cond: (start_id = _age_default_alias_0.start_id) + Filter: _ag_enforce_edge_uniqueness(_age_default_alias_0.id, id) + -> Index Scan using unique_graph_comment_idx on comment c (cost=0.00..7.40 rows=1 width=204) (actual time=58.327..58.333 rows=2 loops=2) + Index Cond: (id = _age_default_alias_0.start_id) + -> Index Scan using unique_graph_person_idx on person p (cost=0.00..7.29 rows=1 width=347) (actual time=1.278..1.287 rows=2 loops=2) + Index Cond: (id = _age_default_alias_1.end_id) + -> Nested Loop (cost=182.01..198.07 rows=1 width=64) (actual time=70.323..70.323 rows=0 loops=2) + Join Filter: _ag_enforce_edge_uniqueness(_age_default_alias_0.id, r.id) + -> Nested Loop (cost=0.01..12.02 rows=1 width=24) (actual time=17.649..17.668 rows=2 loops=2) + -> Index Scan using hascreatorcomment_start on hascreatorcomment _age_default_alias_0 (cost=0.01..8.28 rows=1 width=24) (actual time=0.060..0.067 rows=2 loops=2) + Index Cond: (start_id = (age_id(_agtype_build_vertex(m.id, _label_name(17433::oid, m.id), m.properties)))::graphid) + -> Index Only Scan using unique_graph_person_idx on person a (cost=0.00..3.74 rows=1 width=8) (actual time=17.011..17.017 rows=2 loops=2) + Index Cond: (id = _age_default_alias_0.end_id) + Heap Fetches: 0 + -> Bitmap Heap Scan on knows r (cost=182.00..186.04 rows=1 width=56) (actual time=52.576..52.576 rows=0 loops=2) + Recheck Cond: (((start_id = a.id) AND (end_id = (age_id(_agtype_build_vertex(p.id, _label_name(17433::oid, p.id), p.properties)))::graphid)) 
OR ((end_id = a.id) AND (start_id = (age_id(_agtype_build_vertex(p.id, _label_name(17433::oid, p.id), p.properties)))::graphid))) + -> BitmapOr (cost=182.00..182.00 rows=1 width=0) (actual time=52.559..52.559 rows=0 loops=2) + -> BitmapAnd (cost=90.87..90.87 rows=1 width=0) (actual time=52.049..52.049 rows=0 loops=2) + -> Bitmap Index Scan on knows_start (cost=0.00..3.99 rows=26 width=0) (actual time=15.134..15.134 rows=28 loops=2) + Index Cond: (start_id = a.id) + -> Bitmap Index Scan on knows_end (cost=0.00..3.99 rows=25 width=0) (actual time=36.522..36.522 rows=17 loops=2) + Index Cond: (end_id = (age_id(_agtype_build_vertex(p.id, _label_name(17433::oid, p.id), p.properties)))::graphid) + -> BitmapAnd (cost=90.87..90.87 rows=1 width=0) (actual time=0.494..0.494 rows=0 loops=2) + -> Bitmap Index Scan on knows_end (cost=0.00..3.98 rows=25 width=0) (actual time=0.483..0.483 rows=0 loops=2) + Index Cond: (end_id = a.id) + -> Bitmap Index Scan on knows_start (cost=0.00..4.00 rows=26 width=0) (Actual time: never executed) + Index Cond: (start_id = (age_id(_agtype_build_vertex(p.id, _label_name(17433::oid, p.id), p.properties)))::graphid) +Total runtime: 1008.908 ms +``` +- Postgres 执行计划 +``` +Sort (cost=16919.85..16919.85 rows=1 width=224) (actual time=62.686..63.515 rows=2 loops=1) + Sort Key: (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(c.id, _label_name('17886'::oid, c.id), c.properties), '"creationDate"'::agtype])) DESC, (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(p.id, _label_name('17886'::oid, p.id), p.properties), '"id"'::agtype])) + Sort Method: quicksort Memory: 25kB + -> Nested Loop (cost=1179.54..16919.84 rows=1 width=224) (actual time=6.783..63.497 rows=2 loops=1) + Join Filter: (c.id = _age_default_alias_0.start_id) + -> Nested Loop Left Join (cost=1179.11..16919.20 rows=1 width=420) (actual time=6.670..63.343 rows=2 loops=1) + -> Nested Loop (cost=1009.17..16737.14 rows=1 width=609) (actual time=6.525..63.154 rows=2 
loops=1) + -> Nested Loop (cost=1008.88..16736.83 rows=1 width=245) (actual time=6.512..63.133 rows=2 loops=1) + -> Gather (cost=1008.46..16736.21 rows=1 width=237) (actual time=6.477..63.083 rows=2 loops=1) + Workers Planned: 2 + Workers Launched: 2 + -> Hash Join (cost=8.46..15736.11 rows=1 width=237) (actual time=32.703..51.274 rows=1 loops=3) + Hash Cond: (_age_default_alias_0.end_id = m.id) + -> Parallel Seq Scan on replyofcomment _age_default_alias_0 (cost=0.00..14563.74 rows=310374 width=24) (actual time=0.056..33.169 rows=248299 loops=3) + -> Hash (cost=8.45..8.45 rows=1 width=221) (actual time=0.135..0.136 rows=1 loops=3) + Buckets: 1024 Batches: 1 Memory Usage: 9kB + -> Index Scan using comment_idx on comment m (cost=0.43..8.45 rows=1 width=221) (actual time=0.127..0.129 rows=1 loops=3) + Index Cond: (agtype_access_operator(VARIADIC ARRAY[properties, '"id"'::agtype]) = '"57459"'::agtype) + -> Index Scan using hascreatorcomment_start on hascreatorcomment _age_default_alias_1 (cost=0.43..0.61 rows=1 width=24) (actual time=0.020..0.021 rows=1 loops=2) + Index Cond: (start_id = _age_default_alias_0.start_id) + Filter: _ag_enforce_edge_uniqueness(_age_default_alias_0.id, id) + -> Index Scan using graph_person_idx on person p (cost=0.29..0.30 rows=1 width=372) (actual time=0.007..0.008 rows=1 loops=2) + Index Cond: (id = _age_default_alias_1.end_id) + -> Nested Loop (cost=169.94..182.05 rows=1 width=64) (actual time=0.089..0.092 rows=0 loops=2) + Join Filter: _ag_enforce_edge_uniqueness(_age_default_alias_2.id, r.id) + -> Nested Loop (cost=0.72..8.77 rows=1 width=24) (actual time=0.039..0.041 rows=1 loops=2) + -> Index Scan using hascreatorcomment_start on hascreatorcomment _age_default_alias_2 (cost=0.44..8.46 rows=1 width=24) (actual time=0.006..0.006 rows=1 loops=2) + Index Cond: (start_id = (age_id(_agtype_build_vertex(m.id, _label_name('17886'::oid, m.id), m.properties)))::graphid) + -> Index Only Scan using graph_person_idx on person a (cost=0.29..0.30 
rows=1 width=8) (actual time=0.019..0.020 rows=1 loops=2) + Index Cond: (id = _age_default_alias_2.end_id) + Heap Fetches: 0 + -> Bitmap Heap Scan on knows r (cost=169.22..173.27 rows=1 width=56) (actual time=0.044..0.046 rows=0 loops=2) + Recheck Cond: (((start_id = a.id) AND (end_id = (age_id(_agtype_build_vertex(p.id, _label_name('17886'::oid, p.id), p.properties)))::graphid)) OR ((end_id = a.id) AND (start_id = (age_id(_agtype_build_vertex(p.id, _label_name('17886'::oid, p.id), p.properties)))::graphid))) + -> BitmapOr (cost=169.22..169.22 rows=1 width=0) (actual time=0.042..0.043 rows=0 loops=2) + -> BitmapAnd (cost=84.49..84.49 rows=1 width=0) (actual time=0.035..0.035 rows=0 loops=2) + -> Bitmap Index Scan on knows_start (cost=0.00..0.80 rows=26 width=0) (actual time=0.009..0.009 rows=14 loops=2) + Index Cond: (start_id = a.id) + -> Bitmap Index Scan on knows_end (cost=0.00..0.80 rows=25 width=0) (actual time=0.016..0.016 rows=8 loops=2) + Index Cond: (end_id = (age_id(_agtype_build_vertex(p.id, _label_name('17886'::oid, p.id), p.properties)))::graphid) + -> BitmapAnd (cost=84.49..84.49 rows=1 width=0) (actual time=0.005..0.006 rows=0 loops=2) + -> Bitmap Index Scan on knows_end (cost=0.00..0.79 rows=25 width=0) (actual time=0.004..0.005 rows=0 loops=2) + Index Cond: (end_id = a.id) + -> Bitmap Index Scan on knows_start (cost=0.00..0.81 rows=26 width=0) (never executed) + Index Cond: (start_id = (age_id(_agtype_build_vertex(p.id, _label_name('17886'::oid, p.id), p.properties)))::graphid) + -> Index Scan using graph_comment_idx on comment c (cost=0.43..0.58 rows=1 width=221) (actual time=0.012..0.013 rows=1 loops=2) + Index Cond: (id = _age_default_alias_1.start_id) +Planning Time: 4.221 ms +Execution Time: 63.697 ms +``` +- **分析** +openGauss中 Seq Scan 执行时间是postgres的Parallel Seq Scan的2倍左右,因此openGauss的效率会比PG低。 +当强制开启并行(修改该查询涉及到的函数均为 immutable) ,并设置当前会话 query_dop =4或者8 时,性能达到最优,性能会有所提升,但是依然较pg差约23% + +Pg执行计划 +``` + -> Nested Loop (cost=1008.88..16736.83 rows=1 
width=245) (actual time=6.512..63.133 rows=2 loops=1) + -> Gather (cost=1008.46..16736.21 rows=1 width=237) (actual time=6.477..63.083 rows=2 loops=1) + + Parallel Seq Scan on replyofcomment _age_default_alias_0 (cost=0.00..14563.74 rows=310374 width=24) (actual time=0.056..33.169 rows=248299 loops=3) +``` +对应的openGauss查询计划 +``` + Seq Scan on replyofcomment _age_default_alias_0 (cost=0.00..15913.98 rows=744898 width=24) (actual time=11.135..518.025 rows=744898 loops=1) +``` + +- **总结** +当query_dop =8时,openGauss的查询效率提高了2倍,才可以与postgres(并行为2)达到同数量级。因此对于SQ7 查询,openGauss慢于pg的原因在不仅在于执行计划的不同,系统并行算子对查询效率也有较大的影响 + +#### 4.2.1.4 CQ8查询语句(长查询8) +``` +select * from ag_catalog.cypher('ldbc', $$ +   MATCH (start:person {id:  16})<-[:hascreatorpost]-(:post)<-[:replyofpost]-(comment:comment)-[:hascreatorcomment]->(person:person) +RETURN +    person.id AS personId, +    person.firstName AS personFirstName, +    person.lastName AS personLastName, +    comment.creationDate AS commentCreationDate, +    comment.id AS commentId, +    comment.content AS commentContent +ORDER BY +    commentCreationDate DESC, +                                commentId ASC +LIMIT 20  $$ )  AS (personId agtype, +     personFirstName agtype ,personLastName agtype ,commentCreationDate agtype, +     commentId agtype ,commentContent agtype) +``` +- openGauss 执行计划 +``` +Limit (cost=45811.14..45811.19 rows=20 width=551) (actual time=350.082..350.085 rows=20 loops=1) + -> Sort (cost=45811.14..45812.36 rows=486 width=551) (actual time=350.058..350.060 rows=20 loops=1) + Sort Key: (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(comment.id, _label_name(17433::oid, comment.id), comment.properties), '"creationDate"'::agtype])) DESC, (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(comment.id, _label_name(17433::oid, comment.id), comment.properties), '"id"'::agtype])) + Sort Method: top-N heapsort Memory: 34kB + -> Hash Join (cost=8583.78..45798.21 rows=486 width=551) (actual time=11.583..347.297 
rows=65 loops=1) + Hash Cond: (_age_default_alias_3.end_id = person.id) + -> Nested Loop (cost=7889.55..45075.42 rows=486 width=212) (actual time=0.824..332.801 rows=65 loops=1) + -> Nested Loop (cost=7889.55..41534.61 rows=486 width=24) (actual time=0.807..332.218 rows=65 loops=1) + Join Filter: _ag_enforce_edge_uniqueness(_age_default_alias_0.id, _age_default_alias_2.id, _age_default_alias_3.id) + -> Hash Join (cost=7889.55..30903.02 rows=1459 width=24) (actual time=0.739..330.516 rows=65 loops=1) + Hash Cond: (_age_default_alias_2.end_id = _age_default_alias_0.start_id) + -> Seq Scan on replyofpost _age_default_alias_2 (cost=0.00..19564.73 rows=915773 width=24) (actual time=0.002..139.105 rows=915773 loops=1) + -> Hash (cost=7868.66..7868.66 rows=1671 width=24) (actual time=0.422..0.422 rows=17 loops=1) + Buckets: 32768 Batches: 1 Memory Usage: 257kB + -> Nested Loop (cost=16.08..7868.66 rows=1671 width=24) (actual time=0.293..0.409 rows=17 loops=1) + -> Nested Loop (cost=16.08..1561.67 rows=1671 width=16) (actual time=0.274..0.293 rows=17 loops=1) + -> Bitmap Heap Scan on person start (cost=16.08..51.85 rows=10 width=8) (actual time=0.251..0.252 rows=1 loops=1) + Recheck Cond: (properties @> agtype_build_map('id'::text, '16'::agtype)) + Heap Blocks: exact=1 + -> Bitmap Index Scan on graph_person_idx_prop (cost=0.00..16.08 rows=10 width=0) (actual time=0.217..0.217 rows=1 loops=1) + Index Cond: (properties @> agtype_build_map('id'::text, '16'::agtype)) + -> Index Scan using hascreatorpost_end on hascreatorpost _age_default_alias_0 (cost=0.00..149.31 rows=167 width=24) (actual time=0.014..0.024 rows=17 loops=1) + Index Cond: (end_id = start.id) + -> Index Only Scan using unique_graph_post_idx on post _age_default_alias_1 (cost=0.00..3.76 rows=1 width=8) (actual time=0.093..0.102 rows=17 loops=17) + Index Cond: (id = _age_default_alias_0.start_id) + Heap Fetches: 0 + -> Index Scan using hascreatorcomment_start on hascreatorcomment _age_default_alias_3 
(cost=0.00..7.27 rows=1 width=24) (actual time=0.452..0.486 rows=65 loops=65) + Index Cond: (start_id = _age_default_alias_2.start_id) + -> Index Scan using unique_graph_comment_idx on comment (cost=0.00..7.28 rows=1 width=204) (actual time=0.438..0.489 rows=65 loops=65) + Index Cond: (id = _age_default_alias_2.start_id) + -> Hash (cost=569.66..569.66 rows=9966 width=347) (actual time=10.299..10.299 rows=9966 loops=1) + Buckets: 32768 Batches: 1 Memory Usage: 3945kB + -> Seq Scan on person (cost=0.00..569.66 rows=9966 width=347) (actual time=0.008..2.673 rows=9966 loops=1) +Total runtime: 351.206 ms +``` +- Postgres 执行计划 +``` +Limit (cost=3404.43..3404.48 rows=20 width=192) (actual time=19.801..19.810 rows=20 loops=1) + -> Sort (cost=3404.43..3405.19 rows=306 width=192) (actual time=19.799..19.805 rows=20 loops=1) + Sort Key: (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(comment.id, _label_name('17886'::oid, comment.id), comment.properties), '"creationDate"'::agtype])) DESC, (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(comment.id, _label_name('17886'::oid, comment.id), comment.properties), '"id"'::agtype])) + Sort Method: top-N heapsort Memory: 30kB + -> Nested Loop (cost=32.51..3396.29 rows=306 width=192) (actual time=15.470..19.429 rows=65 loops=1) + -> Nested Loop (cost=32.23..3286.50 rows=306 width=229) (actual time=15.342..16.945 rows=65 loops=1) + Join Filter: (_age_default_alias_2.start_id = comment.id) + -> Nested Loop (cost=31.80..3104.80 rows=306 width=24) (actual time=15.304..16.421 rows=65 loops=1) + -> Nested Loop (cost=31.37..2962.21 rows=306 width=40) (actual time=15.270..16.088 rows=65 loops=1) + Join Filter: _ag_enforce_edge_uniqueness(_age_default_alias_0.id, _age_default_alias_2.id, _age_default_alias_3.id) + -> Nested Loop (cost=30.94..2420.92 rows=919 width=40) (actual time=15.208..15.405 rows=65 loops=1) + -> Nested Loop (cost=30.52..1732.98 rows=1053 width=16) (actual time=15.154..15.170 rows=17 loops=1) + -> 
Bitmap Heap Scan on person start (cost=30.09..65.99 rows=10 width=8) (actual time=15.116..15.118 rows=1 loops=1) + Recheck Cond: (properties @> '{"id": "16"}'::agtype) + Heap Blocks: exact=1 + -> Bitmap Index Scan on graph_person_idx_prop (cost=0.00..30.09 rows=10 width=0) (actual time=15.101..15.101 rows=1 loops=1) + Index Cond: (properties @> '{"id": "16"}'::agtype) + -> Index Scan using hascreatorpost_end on hascreatorpost _age_default_alias_0 (cost=0.43..165.03 rows=167 width=24) (actual time=0.033..0.042 rows=17 loops=1) + Index Cond: (end_id = start.id) + -> Index Scan using replyofcomment_end on replyofpost _age_default_alias_2 (cost=0.42..0.59 rows=6 width=24) (actual time=0.010..0.012 rows=4 loops=17) + Index Cond: (end_id = _age_default_alias_0.start_id) + -> Index Scan using hascreatorcomment_start on hascreatorcomment _age_default_alias_3 (cost=0.43..0.58 rows=1 width=24) (actual time=0.007..0.007 rows=1 loops=65) + Index Cond: (start_id = _age_default_alias_2.start_id) + -> Index Only Scan using graph_post_idx on post _age_default_alias_1 (cost=0.43..0.46 rows=1 width=8) (actual time=0.004..0.004 rows=1 loops=65) + Index Cond: (id = _age_default_alias_0.start_id) + Heap Fetches: 0 + -> Index Scan using graph_comment_idx on comment (cost=0.43..0.58 rows=1 width=221) (actual time=0.007..0.007 rows=1 loops=65) + Index Cond: (id = _age_default_alias_3.start_id) + -> Index Scan using graph_person_idx on person (cost=0.29..0.30 rows=1 width=372) (actual time=0.005..0.005 rows=1 loops=65) + Index Cond: (id = _age_default_alias_3.end_id) +Planning Time: 43.884 ms +Execution Time: 20.053 ms +``` +- **分析** + +此查询的查询计划二者不一致,再拥有相同索引结构的情况下,pg全部选择了index scan +在关闭seq scan的情况下,openGauss和pg有相同的执行效率 +``` +Limit (cost=60337.99..60338.04 rows=20 width=551) (actual time=7.583..7.587 rows=20 loops=1) + -> Sort (cost=60337.99..60339.21 rows=486 width=551) (actual time=7.476..7.476 rows=20 loops=1) + Sort Key: (agtype_access_operator(VARIADIC 
ARRAY[_agtype_build_vertex(comment.id, _label_name(17433::oid, comment.id), comment.properties), '"creationDate"'::agtype])) DESC, (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(comment.id, _label_name(17433::oid, comment.id), comment.properties), '"id"'::agtype])) + Sort Method: top-N heapsort Memory: 34kB + -> Nested Loop (cost=16.08..60325.06 rows=486 width=551) (actual time=0.551..7.196 rows=65 loops=1) + -> Nested Loop (cost=16.08..56790.61 rows=486 width=212) (actual time=0.376..2.772 rows=65 loops=1) + -> Nested Loop (cost=16.08..53249.80 rows=486 width=24) (actual time=0.360..2.183 rows=65 loops=1) + Join Filter: _ag_enforce_edge_uniqueness(_age_default_alias_0.id, _age_default_alias_2.id, _age_default_alias_3.id) + -> Nested Loop (cost=16.08..42618.20 rows=1459 width=24) (actual time=0.292..0.632 rows=65 loops=1) + -> Nested Loop (cost=16.08..7868.66 rows=1671 width=24) (actual time=0.277..0.431 rows=17 loops=1) + -> Nested Loop (cost=16.08..1561.67 rows=1671 width=16) (actual time=0.259..0.283 rows=17 loops=1) + -> Bitmap Heap Scan on person start (cost=16.08..51.85 rows=10 width=8) (actual time=0.236..0.237 rows=1 loops=1) + Recheck Cond: (properties @> agtype_build_map('id'::text, '16'::agtype)) + Heap Blocks: exact=1 + -> Bitmap Index Scan on graph_person_idx_prop (cost=0.00..16.08 rows=10 width=0) (actual time=0.198..0.198 rows=1 loops=1) + Index Cond: (properties @> agtype_build_map('id'::text, '16'::agtype)) + -> Index Scan using hascreatorpost_end on hascreatorpost _age_default_alias_0 (cost=0.00..149.31 rows=167 width=24) (actual time=0.017..0.031 rows=17 loops=1) + Index Cond: (end_id = start.id) + -> Index Only Scan using unique_graph_post_idx on post _age_default_alias_1 (cost=0.00..3.76 rows=1 width=8) (actual time=0.113..0.119 rows=17 loops=17) + Index Cond: (id = _age_default_alias_0.start_id) + Heap Fetches: 0 + -> Index Scan using replyofcomment_end on replyofpost _age_default_alias_2 (cost=0.00..20.74 rows=6 width=24) (actual 
time=0.119..0.160 rows=65 loops=17)
+ Index Cond: (end_id = _age_default_alias_0.start_id)
+ -> Index Scan using hascreatorcomment_start on hascreatorcomment _age_default_alias_3 (cost=0.00..7.27 rows=1 width=24) (actual time=0.447..0.487 rows=65 loops=65)
+ Index Cond: (start_id = _age_default_alias_2.start_id)
+ -> Index Scan using unique_graph_comment_idx on comment (cost=0.00..7.28 rows=1 width=204) (actual time=0.462..0.506 rows=65 loops=65)
+ Index Cond: (id = _age_default_alias_2.start_id)
+ -> Index Scan using unique_graph_person_idx on person (cost=0.00..7.22 rows=1 width=347) (actual time=0.351..0.409 rows=65 loops=65)
+ Index Cond: (id = _age_default_alias_3.end_id)
+Total runtime: 8.763 ms
+```
+
+- **Summary**
+SQ5 is slower on openGauss mainly because openGauss and PostgreSQL generate different query plans. When openGauss uses the same plan, it outperforms PostgreSQL.
+
+#### 4.2.1.5 CQ5 query (long query 5)
+openGauss query plan
+```
+Subquery Scan on _ (cost=32590.10..32590.60 rows=20 width=40) (actual time=11181.375..11181.397 rows=20 loops=1)
+ -> Limit (cost=32590.10..32590.35 rows=20 width=40) (actual time=11181.306..11181.321 rows=20 loops=1)
+ -> Subquery Scan on _age_default_alias_previous_cypher_clause (cost=32590.10..32591.35 rows=100 width=40) (actual time=11181.269..11181.280 rows=20 loops=1)
+ -> Sort (cost=32590.10..32590.35 rows=100 width=72) (actual time=11181.247..11181.248 rows=20 loops=1)
+ Sort Key: (count((_agtype_build_vertex(post.id, _label_name(16956::oid, post.id), post.properties)))) DESC, (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(forum.id, _label_name(16956::oid, forum.id), forum.properties), '"id"'::agtype]))
+ Sort Method: quicksort Memory: 2498kB
+ -> HashAggregate (cost=32583.28..32585.78 rows=100 width=146) (actual time=11102.829..11106.029 rows=12882 loops=1)
+ Group By Key: agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(forum.id, _label_name(16956::oid, forum.id), forum.properties), '"id"'::agtype]), agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(forum.id, _label_name(16956::oid,
forum.id), forum.properties), '"title"'::agtype]) + -> Hash Right Join (cost=16270.66..32572.66 rows=1416 width=138) (actual time=4056.424..10843.923 rows=140315 loops=1) + Hash Cond: ((_age_default_alias_0.end_id = (age_id((_agtype_build_vertex(friend.id, _label_name(16956::oid, friend.id), friend.properties))))::graphid) AND (_age_default_alias_1.start_id = (age_id(_agtype_build_vertex(forum.id, _label_name(16956::oid, forum.id), forum.properties)))::graphid)) + -> Hash Join (cost=11272.42..26808.11 rows=49668 width=48) (actual time=1499.527..2019.708 rows=148994 loops=1) + Hash Cond: (post.id = _age_default_alias_0.start_id) + -> Append (cost=0.00..13753.39 rows=342839 width=40) (actual time=0.088..417.172 rows=342838 loops=1) + -> Seq Scan on _ag_label_vertex post (cost=0.00..1.01 rows=1 width=40) (actual time=0.008..0.008 rows=0 loops=1) + -> Seq Scan on forum post (cost=0.00..409.28 rows=15028 width=40) (actual time=0.064..14.722 rows=15028 loops=1) + -> Seq Scan on post (cost=0.00..6521.05 rows=149005 width=40) (actual time=0.095..182.043 rows=149005 loops=1) + -> Seq Scan on comment post (cost=0.00..5938.81 rows=151681 width=40) (actual time=0.089..165.053 rows=151681 loops=1) + -> Seq Scan on organisation post (cost=0.00..279.54 rows=7954 width=40) (actual time=0.076..8.489 rows=7954 loops=1) + -> Seq Scan on person post (cost=0.00..89.62 rows=1562 width=40) (actual time=0.071..1.946 rows=1562 loops=1) + -> Seq Scan on place post (cost=0.00..42.59 rows=1459 width=40) (actual time=0.052..1.570 rows=1459 loops=1) + -> Seq Scan on tag post (cost=0.00..468.79 rows=16079 width=40) (actual time=0.121..16.979 rows=16079 loops=1) + -> Seq Scan on tagclass post (cost=0.00..2.70 rows=70 width=40) (actual time=0.056..0.125 rows=70 loops=1) + -> Hash (cost=10651.61..10651.61 rows=49665 width=32) (actual time=1478.605..1478.605 rows=148994 loops=1) + Buckets: 262144 Batches: 1 Memory Usage: 11361kB + -> Hash Join (cost=5046.36..10651.61 rows=49665 width=32) (actual 
time=61.127..1432.983 rows=148994 loops=1) + Hash Cond: (_age_default_alias_0.start_id = _age_default_alias_1.end_id) + Join Filter: _ag_enforce_edge_uniqueness(_age_default_alias_0.id, _age_default_alias_1.id) + -> Seq Scan on hascreatorpost _age_default_alias_0 (cost=0.00..3184.05 rows=149005 width=24) (actual time=0.009..23.657 rows=149005 loops=1) + -> Hash (cost=3183.94..3183.94 rows=148994 width=24) (actual time=59.451..59.451 rows=148994 loops=1) + Buckets: 262144 Batches: 1 Memory Usage: 10197kB + -> Seq Scan on containerof _age_default_alias_1 (cost=0.00..3183.94 rows=148994 width=24) (actual time=0.065..30.929 rows=148994 loops=1) + -> Hash (cost=4977.00..4977.00 rows=1416 width=138) (actual time=2489.691..2489.691 rows=139655 loops=1) + Buckets: 131072 (originally 131072) Batches: 2 (originally 1) Memory Usage: 64513kB + -> Nested Loop (cost=106.86..4977.00 rows=1416 width=138) (actual time=41.796..996.177 rows=139655 loops=1) + -> Nested Loop (cost=106.86..4463.09 rows=1416 width=40) (actual time=41.755..525.901 rows=139655 loops=1) + -> HashAggregate (cost=101.32..101.47 rows=10 width=348) (actual time=41.533..42.137 rows=736 loops=1) + Group By Key: _agtype_build_vertex(friend.id, _label_name(16956::oid, friend.id), friend.properties) + -> Nested Loop (cost=0.00..101.29 rows=10 width=348) (actual time=0.615..34.736 rows=1142 loops=1) + Join Filter: (age_id(_agtype_build_vertex(person.id, _label_name(16956::oid, person.id), person.properties)) <> age_id(_agtype_build_vertex(friend.id, _label_name(16956::oid, friend.id), friend.properties))) + -> Nested Loop (cost=0.00..18.27 rows=10 width=356) (actual time=0.357..11.099 rows=1142 loops=1) + -> Index Scan using unique_person_idx on person (cost=0.00..8.27 rows=1 width=348) (actual time=0.090..0.091 rows=1 loops=1) + Index Cond: (agtype_access_operator(VARIADIC ARRAY[properties, '"id"'::agtype]) = '16'::agtype) + -> Cypher VLE (cost=0.00..0.00 rows=1000 width=8) (actual time=0.258..10.814 rows=1142 
loops=1)
+ -> Values Scan on "*VALUES*" (cost=0.00..0.02 rows=1 width=80) (actual time=0.015..0.015 rows=1 loops=1)
+ -> Index Scan using unique_graph_person_idx on person friend (cost=0.00..8.27 rows=1 width=348) (actual time=3.171..3.726 rows=1142 loops=1142)
+ Index Cond: (id = _age_default_alias_0.end_id)
+ -> Bitmap Heap Scan on hasmember membership (cost=5.54..434.49 rows=166 width=16) (actual time=45.005..457.666 rows=139655 loops=736)
+ Recheck Cond: (end_id = (age_id((_agtype_build_vertex(friend.id, _label_name(16956::oid, friend.id), friend.properties))))::graphid)
+ Filter: (agtype_access_operator(VARIADIC ARRAY[properties, '"creationDate"'::agtype]) > '1310817800000'::agtype)
+ Rows Removed by Filter: 24745
+ Heap Blocks: exact=93144
+ -> Bitmap Index Scan on hasmember_end (cost=0.00..5.50 rows=166 width=0) (actual time=28.450..28.450 rows=164400 loops=736)
+ Index Cond: (end_id = (age_id((_agtype_build_vertex(friend.id, _label_name(16956::oid, friend.id), friend.properties))))::graphid)
+ -> Index Scan using unique_graph_forum_idx on forum (cost=0.00..0.35 rows=1 width=106) (actual time=328.240..385.513 rows=139655 loops=139655)
+ Index Cond: (id = membership.start_id)
+Total runtime: 11215.172 ms
+```
+Postgres query plan
+```
+Limit (cost=157441.66..157441.91 rows=20 width=64) (actual time=3660.882..3660.912 rows=20 loops=1)
+ -> Subquery Scan on _age_default_alias_previous_cypher_clause (cost=157441.66..157629.51 rows=15028 width=64) (actual time=3660.881..3660.909 rows=20 loops=1)
+ -> Sort (cost=157441.66..157479.23 rows=15028 width=96) (actual time=3660.877..3660.901 rows=20 loops=1)
+ Sort Key: ((count((_agtype_build_vertex(post.id, _label_name('16973'::oid, post.id), post.properties))))::agtype) DESC, (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(forum.id, _label_name('16973'::oid, forum.id), forum.properties), '"id"'::agtype]))
+ Sort Method: top-N heapsort Memory: 29kB
+ -> HashAggregate (cost=155985.80..156399.07 rows=15028 width=96)
(actual time=3653.301..3656.888 rows=12882 loops=1) + Group Key: agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(forum.id, _label_name('16973'::oid, forum.id), forum.properties), '"id"'::agtype]), agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(forum.id, _label_name('16973'::oid, forum.id), forum.properties), '"title"'::agtype]) + Batches: 1 Memory Usage: 2833kB + -> Merge Left Join (cost=143334.08..154356.01 rows=217305 width=96) (actual time=3117.960..3556.860 rows=140315 loops=1) + Merge Cond: ((((age_id((_agtype_build_vertex(friend.id, _label_name('16973'::oid, friend.id), friend.properties))))::graphid) = _age_default_alias_1.end_id) AND (((age_id(_agtype_build_vertex(forum.id, _label_name('16973'::oid, forum.id), forum.properties)))::graphid) = _age_default_alias_2.start_id)) + -> Sort (cost=93941.06..94484.32 rows=217305 width=164) (actual time=2513.341..2557.232 rows=139655 loops=1) + Sort Key: ((age_id((_agtype_build_vertex(friend.id, _label_name('16973'::oid, friend.id), friend.properties))))::graphid), ((age_id(_agtype_build_vertex(forum.id, _label_name('16973'::oid, forum.id), forum.properties)))::graphid) + Sort Method: external merge Disk: 85656kB + -> Hash Join (cost=28023.82..56848.66 rows=217305 width=164) (actual time=1714.127..2255.347 rows=139655 loops=1) + Hash Cond: (membership.start_id = forum.id) + -> Nested Loop (cost=27378.69..53215.59 rows=217305 width=40) (actual time=1703.337..1906.464 rows=139655 loops=1) + -> HashAggregate (cost=27378.27..27401.70 rows=1562 width=32) (actual time=1703.224..1703.610 rows=736 loops=1) + Group Key: _agtype_build_vertex(friend.id, _label_name('16973'::oid, friend.id), friend.properties) + Batches: 1 Memory Usage: 577kB + -> Nested Loop (cost=0.29..26083.11 rows=518063 width=32) (actual time=1433.384..1700.253 rows=1142 loops=1) + Join Filter: age_match_vle_terminal_edge(person.id, friend.id, _age_default_alias_0.edges) + Rows Removed by Join Filter: 1781520 + -> Nested Loop 
(cost=0.28..145.87 rows=1554 width=382) (actual time=0.239..5.775 rows=1561 loops=1) + Join Filter: (age_id(_agtype_build_vertex(person.id, _label_name('16973'::oid, person.id), person.properties)) <> age_id(_agtype_build_vertex(friend.id, _label_name('16973'::oid, friend.id), friend.properties))) + Rows Removed by Join Filter: 1 + -> Index Scan using person_idx on person (cost=0.28..8.29 rows=1 width=374) (actual time=0.075..0.085 rows=1 loops=1) + Index Cond: (agtype_access_operator(VARIADIC ARRAY[properties, '"id"'::agtype]) = '"16"'::agtype) + -> Seq Scan on person friend (cost=0.00..94.62 rows=1562 width=374) (actual time=0.003..0.427 rows=1562 loops=1) + -> Memoize (cost=0.01..10.02 rows=1000 width=32) (actual time=0.913..0.943 rows=1142 loops=1561) + Cache Key: person.id + Cache Mode: binary + Hits: 1560 Misses: 1 Evictions: 0 Overflows: 0 Memory Usage: 113kB + -> Function Scan on age_vle _age_default_alias_0 (cost=0.01..10.01 rows=1000 width=32) (actual time=1425.750..1425.794 rows=1142 loops=1) + -> Index Scan using hasmember_end on hasmember membership (cost=0.42..15.14 rows=139 width=16) (actual time=0.008..0.262 rows=190 loops=736) + Index Cond: (end_id = (age_id((_agtype_build_vertex(friend.id, _label_name('16973'::oid, friend.id), friend.properties))))::graphid) + Filter: (agtype_access_operator(VARIADIC ARRAY[properties, '"creationDate"'::agtype]) > '"1310817800000"'::agtype) + Rows Removed by Filter: 34 + -> Hash (cost=457.28..457.28 rows=15028 width=132) (actual time=10.523..10.526 rows=15028 loops=1) + Buckets: 16384 Batches: 1 Memory Usage: 2550kB + -> Seq Scan on forum (cost=0.00..457.28 rows=15028 width=132) (actual time=0.029..5.011 rows=15028 loops=1) + -> Materialize (cost=49393.02..49964.38 rows=114271 width=48) (actual time=601.796..651.115 rows=148981 loops=1) + -> Sort (cost=49393.02..49678.70 rows=114271 width=48) (actual time=601.789..636.105 rows=148981 loops=1) + Sort Key: _age_default_alias_1.end_id, _age_default_alias_2.start_id + 
Sort Method: external merge Disk: 52592kB + -> Hash Join (cost=15966.42..36275.56 rows=114271 width=48) (actual time=221.575..503.998 rows=148994 loops=1) + Hash Cond: (post.id = _age_default_alias_1.start_id) + -> Append (cost=0.00..17880.77 rows=342839 width=40) (actual time=0.034..175.292 rows=342838 loops=1) + -> Seq Scan on _ag_label_vertex post_1 (cost=0.00..0.01 rows=1 width=40) (actual time=0.013..0.013 rows=0 loops=1) + -> Seq Scan on forum post_2 (cost=0.00..532.42 rows=15028 width=40) (actual time=0.019..6.913 rows=15028 loops=1) + -> Seq Scan on post post_3 (cost=0.00..7525.07 rows=149005 width=40) (actual time=0.067..71.564 rows=149005 loops=1) + -> Seq Scan on comment post_4 (cost=0.00..7032.22 rows=151681 width=40) (actual time=0.050..70.052 rows=151681 loops=1) + -> Seq Scan on organisation post_5 (cost=0.00..334.31 rows=7954 width=40) (actual time=0.042..3.847 rows=7954 loops=1) + -> Seq Scan on person post_6 (cost=0.00..102.43 rows=1562 width=40) (actual time=0.012..0.726 rows=1562 loops=1) + -> Seq Scan on place post_7 (cost=0.00..52.88 rows=1459 width=40) (actual time=0.032..0.680 rows=1459 loops=1) + -> Seq Scan on tag post_8 (cost=0.00..584.18 rows=16079 width=40) (actual time=0.023..7.267 rows=16079 loops=1) + -> Seq Scan on tagclass post_9 (cost=0.00..3.05 rows=70 width=40) (actual time=0.026..0.056 rows=70 loops=1) + -> Hash (cost=15345.61..15345.61 rows=49665 width=32) (actual time=209.223..209.225 rows=148994 loops=1) + Buckets: 131072 (originally 65536) Batches: 2 (originally 1) Memory Usage: 7254kB + -> Hash Join (cost=6519.36..15345.61 rows=49665 width=32) (actual time=38.582..184.785 rows=148994 loops=1) + Hash Cond: (_age_default_alias_1.start_id = _age_default_alias_2.end_id) + Join Filter: _ag_enforce_edge_uniqueness(_age_default_alias_1.id, _age_default_alias_2.id) + -> Seq Scan on hascreatorpost _age_default_alias_1 (cost=0.00..3783.05 rows=149005 width=24) (actual time=0.015..20.816 rows=149005 loops=1) + -> Hash 
(cost=3782.94..3782.94 rows=148994 width=24) (actual time=37.995..37.996 rows=148994 loops=1)
+ Buckets: 131072 Batches: 2 Memory Usage: 6475kB
+ -> Seq Scan on containerof _age_default_alias_2 (cost=0.00..3782.94 rows=148994 width=24) (actual time=0.010..19.820 rows=148994 loops=1)
+Planning Time: 9.414 ms
+Execution Time: 3685.651 ms
+```
+- **Analysis**
+1. The two engines generate different final query plans.
+2. Even where the plans are similar, openGauss executes less efficiently than PostgreSQL, for example in the Append operator.
+PostgreSQL Append operator
+```
+ -> Append (cost=0.00..17880.77 rows=342839 width=40) (actual time=0.034..175.292 rows=342838 loops=1)
+ -> Seq Scan on _ag_label_vertex post_1 (cost=0.00..0.01 rows=1 width=40) (actual time=0.013..0.013 rows=0 loops=1)
+ -> Seq Scan on forum post_2 (cost=0.00..532.42 rows=15028 width=40) (actual time=0.019..6.913 rows=15028 loops=1)
+ -> Seq Scan on post post_3 (cost=0.00..7525.07 rows=149005 width=40) (actual time=0.067..71.564 rows=149005 loops=1)
+ -> Seq Scan on comment post_4 (cost=0.00..7032.22 rows=151681 width=40) (actual time=0.050..70.052 rows=151681 loops=1)
+ -> Seq Scan on organisation post_5 (cost=0.00..334.31 rows=7954 width=40) (actual time=0.042..3.847 rows=7954 loops=1)
+ -> Seq Scan on person post_6 (cost=0.00..102.43 rows=1562 width=40) (actual time=0.012..0.726 rows=1562 loops=1)
+ -> Seq Scan on place post_7 (cost=0.00..52.88 rows=1459 width=40) (actual time=0.032..0.680 rows=1459 loops=1)
+ -> Seq Scan on tag post_8 (cost=0.00..584.18 rows=16079 width=40) (actual time=0.023..7.267 rows=16079 loops=1)
+ -> Seq Scan on tagclass post_9 (cost=0.00..3.05 rows=70 width=40) (actual time=0.026..0.056 rows=70 loops=1)
+```
+openGauss Append operator
+```
+ -> Append (cost=0.00..13753.39 rows=342839 width=40) (actual time=0.088..417.172 rows=342838 loops=1)
+ -> Seq Scan on _ag_label_vertex post (cost=0.00..1.01 rows=1 width=40) (actual time=0.008..0.008 rows=0 loops=1)
+ -> Seq Scan on forum post (cost=0.00..409.28 rows=15028 width=40) (actual time=0.064..14.722 rows=15028 loops=1)
+ -> Seq Scan on post (cost=0.00..6521.05
rows=149005 width=40) (actual time=0.095..182.043 rows=149005 loops=1)
+ -> Seq Scan on comment post (cost=0.00..5938.81 rows=151681 width=40) (actual time=0.089..165.053 rows=151681 loops=1)
+ -> Seq Scan on organisation post (cost=0.00..279.54 rows=7954 width=40) (actual time=0.076..8.489 rows=7954 loops=1)
+ -> Seq Scan on person post (cost=0.00..89.62 rows=1562 width=40) (actual time=0.071..1.946 rows=1562 loops=1)
+ -> Seq Scan on place post (cost=0.00..42.59 rows=1459 width=40) (actual time=0.052..1.570 rows=1459 loops=1)
+ -> Seq Scan on tag post (cost=0.00..468.79 rows=16079 width=40) (actual time=0.121..16.979 rows=16079 loops=1)
+ -> Seq Scan on tagclass post (cost=0.00..2.70 rows=70 width=40) (actual time=0.056..0.125 rows=70 loops=1)
+```
+For the same Append operator, openGauss takes 2.38 times as long as PostgreSQL (417.172 ms vs. 175.292 ms).
+When parallelism is enabled on openGauss with set query_dop = 2 or 4, the query plan changes and the new plan takes about 6 times as long as the original (67525.572 ms vs. 11215.172 ms). Although query_dop is set, the plan contains no parallel-related keywords, which indicates that openGauss and PostgreSQL enable parallel plans under different conditions.
+```
+Subquery Scan on _ (cost=10364.60..10365.10 rows=20 width=40) (actual time=67503.164..67503.184 rows=20 loops=1)
+ -> Limit (cost=10364.60..10364.85 rows=20 width=40) (actual time=67503.111..67503.121 rows=20 loops=1)
+ -> Subquery Scan on _age_default_alias_previous_cypher_clause (cost=10364.60..10365.85 rows=100 width=40) (actual time=67503.077..67503.084 rows=20 loops=1)
+ -> Sort (cost=10364.60..10364.85 rows=100 width=72) (actual time=67503.069..67503.071 rows=20 loops=1)
+ Sort Key: (count((_agtype_build_vertex(post.id, _label_name(27627::oid, post.id), post.properties)))) DESC, (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(forum.id, _label_name(27627::oid, forum.id), forum.properties), '"id"'::agtype]))
+ Sort Method: quicksort Memory: 2498kB
+ -> HashAggregate (cost=10357.78..10360.28 rows=100 width=146) (actual time=67424.239..67428.529 rows=12882 loops=1)
+ Group By Key: agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(forum.id, _label_name(27627::oid, forum.id), forum.properties), '"id"'::agtype]),
agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(forum.id, _label_name(27627::oid, forum.id), forum.properties), '"title"'::agtype]) + -> Nested Loop Left Join (cost=44.85..10347.25 rows=1404 width=138) (actual time=41.714..67142.471 rows=140315 loops=1) + Join Filter: (_age_default_alias_0.end_id = (age_id((_agtype_build_vertex(friend.id, _label_name(27627::oid, friend.id), friend.properties))))::graphid) + Rows Removed by Join Filter: 1812103 + -> Nested Loop (cost=44.84..1280.32 rows=1404 width=138) (actual time=41.202..1052.677 rows=139655 loops=1) + -> Nested Loop (cost=44.84..1142.37 rows=1404 width=40) (actual time=41.194..555.749 rows=139655 loops=1) + -> HashAggregate (cost=39.29..39.44 rows=10 width=348) (actual time=41.113..42.067 rows=736 loops=1) + Group By Key: _agtype_build_vertex(friend.id, _label_name(27627::oid, friend.id), friend.properties) + -> Nested Loop (cost=0.00..39.27 rows=10 width=348) (actual time=0.487..34.000 rows=1142 loops=1) + Join Filter: (age_id(_agtype_build_vertex(person.id, _label_name(27627::oid, person.id), person.properties)) <> age_id(_agtype_build_vertex(friend.id, _label_name(27627::oid, friend.id), friend.properties))) + -> Nested Loop (cost=0.00..18.27 rows=10 width=356) (actual time=0.296..9.160 rows=1142 loops=1) + -> Index Scan using unique_person_idx on person (cost=0.00..8.27 rows=1 width=348) (actual time=0.025..0.027 rows=1 loops=1) + Index Cond: (agtype_access_operator(VARIADIC ARRAY[properties, '"id"'::agtype]) = '16'::agtype) + -> Cypher VLE (cost=0.00..0.00 rows=1000 width=8) (actual time=0.264..8.885 rows=1142 loops=1) + -> Values Scan on "*VALUES*" (cost=0.00..0.02 rows=1 width=80) (actual time=0.081..0.081 rows=1 loops=1) + -> Index Scan using unique_graph_person_idx on person friend (cost=0.00..8.27 rows=1 width=348) (actual time=3.082..3.498 rows=1142 loops=1142) + Index Cond: (id = _age_default_alias_0.end_id) + -> Bitmap Heap Scan on hasmember membership (cost=5.54..434.49 rows=166 width=16) 
(actual time=43.916..484.544 rows=139655 loops=736) + Recheck Cond: (end_id = (age_id((_agtype_build_vertex(friend.id, _label_name(27627::oid, friend.id), friend.properties))))::graphid) + Filter: (agtype_access_operator(VARIADIC ARRAY[properties, '"creationDate"'::agtype]) > '1310817800000'::agtype) + Rows Removed by Filter: 24745 + Heap Blocks: exact=93144 + -> Bitmap Index Scan on hasmember_end (cost=0.00..5.50 rows=166 width=0) (actual time=25.883..25.883 rows=164400 loops=736) + Index Cond: (end_id = (age_id((_agtype_build_vertex(friend.id, _label_name(27627::oid, friend.id), friend.properties))))::graphid) + -> Index Scan using unique_graph_forum_idx on forum (cost=0.00..0.35 rows=1 width=106) (actual time=361.209..413.912 rows=139655 loops=139655) + Index Cond: (id = membership.start_id) + -> Nested Loop (cost=0.01..25.14 rows=9 width=48) (actual time=4018.728..56044.069 rows=1814125 loops=139655) + -> Nested Loop (cost=0.01..10.34 rows=4 width=32) (actual time=2937.485..24575.239 rows=1814125 loops=139655) + -> Index Scan using containerof_start on containerof _age_default_alias_1 (cost=0.01..2.56 rows=12 width=24) (actual time=514.919..1101.550 rows=1814125 loops=139655) + Index Cond: (start_id = (age_id(_agtype_build_vertex(forum.id, _label_name(27627::oid, forum.id), forum.properties)))::graphid) + -> Index Scan using hascreatorpost_start on hascreatorpost _age_default_alias_0 (cost=0.00..0.64 rows=1 width=24) (actual time=21097.130..21651.517 rows=1814125 loops=1814125) + Index Cond: (start_id = _age_default_alias_1.end_id) + Filter: _ag_enforce_edge_uniqueness(id, _age_default_alias_1.id) + -> Append (cost=0.00..3.61 rows=9 width=40) (actual time=11989.076..29901.632 rows=1814125 loops=1814125) + -> Index Scan using _ag_label_vertex_pkey on _ag_label_vertex post (cost=0.00..0.27 rows=1 width=40) (actual time=1745.978..1745.978 rows=0 loops=1814125) + Index Cond: (id = _age_default_alias_0.start_id) + -> Index Scan using unique_graph_forum_idx on forum 
post (cost=0.00..0.35 rows=1 width=40) (actual time=2392.051..2392.051 rows=0 loops=1814125)
+ Index Cond: (id = _age_default_alias_0.start_id)
+ -> Index Scan using unique_graph_post_idx on post (cost=0.00..0.72 rows=1 width=40) (actual time=6567.833..7180.503 rows=1814125 loops=1814125)
+ Index Cond: (id = _age_default_alias_0.start_id)
+ -> Index Scan using unique_graph_comment_idx on comment post (cost=0.00..0.71 rows=1 width=40) (actual time=3847.310..3847.310 rows=0 loops=1814125)
+ Index Cond: (id = _age_default_alias_0.start_id)
+ -> Index Scan using unique_graph_organisation_idx on organisation post (cost=0.00..0.34 rows=1 width=40) (actual time=2469.414..2469.414 rows=0 loops=1814125)
+ Index Cond: (id = _age_default_alias_0.start_id)
+ -> Index Scan using unique_graph_person_idx on person post (cost=0.00..0.33 rows=1 width=40) (actual time=2339.788..2339.788 rows=0 loops=1814125)
+ Index Cond: (id = _age_default_alias_0.start_id)
+ -> Index Scan using unique_graph_place_idx on place post (cost=0.00..0.27 rows=1 width=40) (actual time=2321.210..2321.210 rows=0 loops=1814125)
+ Index Cond: (id = _age_default_alias_0.start_id)
+ -> Index Scan using unique_graph_tag_idx on tag post (cost=0.00..0.36 rows=1 width=40) (actual time=2480.319..2480.319 rows=0 loops=1814125)
+ Index Cond: (id = _age_default_alias_0.start_id)
+ -> Index Scan using unique_graph_tagclass_idx on tagclass post (cost=0.00..0.27 rows=1 width=40) (actual time=1492.267..1492.267 rows=0 loops=1814125)
+ Index Cond: (id = _age_default_alias_0.start_id)
+Total runtime: 67525.572 ms
+```
+
+- **Summary**
+According to the analysis, on the ldbc0.1 dataset CQ5 runs less efficiently on openGauss than on PostgreSQL, mainly because the query plans differ. Enabling parallelism even made the query slower. For the AGE extension, openGauss's conditions for enabling parallelism are stricter than PostgreSQL's, because once AGE translates Cypher into SQL, the query contains many subqueries, subplans, and nested function calls.
+
+### 4.2.2 Supporting evidence
+#### 4.2.2.1 Basic parallelism test of openGauss and PG
+##### 4.2.2.1.1 Test description
+The test environment is the same as the one used for acceptance testing.
+1. Data preparation
+Create a two-column test table and insert 10 million rows of random data.
+```
+create table test (id int,en varchar(200));
+insert into test
values(generate_series(1,10000000),md5(random()::text));
+```
+2. Enable parallelism in openGauss
+In the session, set query_dop to 1, 2, 4, 8, and 16 in turn and run the query each time.
+```
+explain analyze select count(*) from public.test;
+```
+3. Enable parallelism in PG
+To match the openGauss parallel configuration, set the following parameters in postgresql.conf:
+```
+max_worker_processes = 16 # (change requires restart)
+max_parallel_workers_per_gather = 16 # limited by max_parallel_workers
+max_parallel_maintenance_workers = 16 # limited by max_parallel_workers
+max_parallel_workers = 16 # number of max_worker_processes that
+```
+In the session, set the table's parallel_workers to 1, 2, 4, 8, and 16 in turn:
+```
+alter table test set (parallel_workers =1);
+```
+Then run the query for each setting:
+```
+explain analyze select count(*) from public.test;
+```
+##### 4.2.2.1.2 Test results
+Execution time of the statement on openGauss and PostgreSQL at each degree of parallelism
+| Degree of parallelism | openGauss (ms) | PostgreSQL (ms) |
+| :----: | :----: | :----: |
+| 1 | 3347 | 540 |
+| 2 | 1001 | 359 |
+| 4 | 542 | 238 |
+| 8 | 298 | 154 |
+| 16| 193 | 97 |
+
+In identical test environments and at the same degree of parallelism, the basic Seq Scan and Aggregate operators are slower on openGauss than on PostgreSQL. Therefore, for AGE, once Cypher has been translated into SQL, openGauss is unlikely to outperform PostgreSQL unless it has a better query plan or some operator in the plan runs faster on openGauss than on PostgreSQL.
diff --git "a/Test_Result/openGauss_6.0.0_release/\345\267\245\345\205\267\351\223\276/images/AGE\347\256\227\345\255\220.png" "b/Test_Result/openGauss_6.0.0_release/\345\267\245\345\205\267\351\223\276/images/AGE\347\256\227\345\255\220.png"
new file mode 100644
index 0000000000000000000000000000000000000000..cd4b6681768d349e65fafdcdbc9c143d1a47637d
Binary files /dev/null and "b/Test_Result/openGauss_6.0.0_release/\345\267\245\345\205\267\351\223\276/images/AGE\347\256\227\345\255\220.png" differ