openLooKeng / hetu-core

[1.4.0RC4][task recovery] When the CTAS SQL statement is executed and the query is restored using snapshots, some values are changed.

Status: Tested
Type: Defect
Created: 2021-10-12 18:51

Software Environment:

  • openLooKeng version (source or binary):
    openLooKeng 1.4.0 RC4
  • OS platform & distribution (e.g., Linux Ubuntu 16.04):
    CentOS 7.6
  • Java version:
    JDK 1.8

Cluster: 1 coordinator (CN) + 5 workers
Dataset: TPC-DS 100 GB

Describe the current behavior

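Occasionally, after the CTAS query is restored from a snapshot, some values in the created table differ from the expected results.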

Describe the expected behavior

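The data in the created table is consistent with the expected results, even when the query is restored from a snapshot.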

Steps to reproduce the issue

  1. Enable snapshots for the session:
    set session snapshot_enabled=true;

  2. Run the CTAS statement:

create table task_recovery_tpcds_automate.task_Q08(s_store_name,_col1) as
select s_store_name
,sum(ss_net_profit)
from store_sales
,date_dim
,store,
(select ca_zip
from (
SELECT substr(ca_zip,1,5) ca_zip
FROM customer_address
WHERE substr(ca_zip,1,5) IN (
'43758','76357','20728','59309','19777','27690',
'23681','52275','64367','24674','79465',
'52936','53936','91889','89248','70394',
'66020','56289','45541','29900','99055',
'47395','16654','26748','74456','31039',
'77674','87076','92273','31667','20150',
'84426','75885','61588','57973','29487',
'95008','65615','24339','84923','38463',
'13811','44227','18570','40389','14584',
'33007','61590','47363','57853','43499',
'90755','47141','14392','33991','77031',
'22854','20127','10624','15730','75295',
'98460','17059','26953','82996','17095',
'53227','34618','86978','33613','12541',
'63977','53929','55459','11516','85350',
'99888','23506','10569','66837','50031',
'28282','83901','98554','54828','14616',
'12743','42473','95507','30542','12883',
'95097','61307','32530','37753','53116',
'10989','87430','22114','68848','21246',
'68327','28446','85870','11697','30541',
'22933','70727','17570','55311','73355',
'16347','61573','81229','95480','92091',
'52603','51232','62666','12173','31993',
'98202','78325','46798','63259','34167',
'50435','56182','29390','51732','88435',
'10366','46637','69283','18218','33324',
'24139','16122','53142','16832','98386',
'41451','85109','32534','83953','76537',
'60857','59939','22271','38788','26296',
'59937','14272','98651','38185','16322',
'13735','56321','81398','36035','36512',
'96290','40596','22748','77965','28512',
'15540','20574','72340','81870','31905',
'18121','26282','30345','38703','74274',
'71129','23244','68810','10106','55461',
'25528','71474','37071','21552','81846',
'64930','13233','11694','17829','43790',
'60379','11482','22714','40977','73320',
'13928','78952','92802','66663','95765',
'86101','19813','90867','81258','93891',
'32755','21548','36452','50931','95773',
'57046','14736','30562','44667','80519',
'99886','97296','38505','29732','38693',
'83898','88032','64442','25944','39303',
'70781','92448','64252','89641','88070',
'38159','27654','72120','41689','37122',
'63776','90416','28479','14787','18038',
'39783','50062','28010','13042','86777',
'32380','80664','33558','43641','14627',
'68858','57733','53458','73016','76141',
'42375','12248','38778','50092','80825',
'58934','12145','78407','57009','52782',
'72140','35635','63926','35282','29292',
'30149','33576','95945','48303','56310',
'32214','69726','48249','91163','57311',
'12361','20491','13551','61620','59648',
'44466','53607','18410','99090','37973',
'17986','80713','95948','35103','51799',
'54707','52269','86117','44909','15530',
'28999','80844','62823','46487','15144',
'51445','81050','34943','45141','28541',
'12414','56922','50548','16422','16780',
'53104','60629','24405','61768','48257',
'92852','27390','24411','17776','81487',
'34848','45773','64188','24209','55276',
'11379','33956','46173','67361','32337',
'82112','73196','38461','43987','17980',
'65414','12247','42107','15326','73018',
'59993','85526','50231','60176','23889',
'88012','27859','44921','50915','21742',
'21272','64763','78761','62002','18502',
'42208','49675','69413','46013','67034',
'52739','94050','76249','25105','67299',
'77588','50637','14333','39372','98030',
'79792','12014','56236','61057','51347',
'87879','71564','48478','33078','23325',
'25526','52855','27570','78396','18695',
'24397','76087','35195','97232','29136',
'15812','18408','40746','78749')
intersect
select ca_zip
from (SELECT substr(ca_zip,1,5) ca_zip,count(*) cnt
FROM customer_address, customer
WHERE ca_address_sk = c_current_addr_sk and
c_preferred_cust_flag='Y'
group by ca_zip
having count(*) > 10)A1)A2) V1
where ss_store_sk = s_store_sk
and ss_sold_date_sk = d_date_sk
and d_qoy = 1 and d_year = 2000
and (substr(s_zip,1,2) = substr(V1.ca_zip,1,2))
group by s_store_name
order by s_store_name;
  3. When the message "Finished capturing snapshot n for task" appears in the coordinator's server.log, stop the openLooKeng (olk) service on one worker.
  4. Query the created table:
select count(1) from task_recovery_tpcds_automate.task_Q08;
select * from task_recovery_tpcds_automate.task_Q08 order by s_store_name;
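Step 3 depends on catching the snapshot message at the right moment. A minimal Python sketch of how that check could be automated follows. Only the phrase "Finished capturing snapshot n for task" comes from this issue; the assumption that n is a decimal number, the rest of the log line layout, and the helper names (`snapshot_id`, `follow`, `wait_for_snapshot`) are illustrative and not part of openLooKeng.

```python
import re
import time
from typing import Iterable, Iterator, Optional

# Only the quoted phrase is taken from the issue; treating "n" as a decimal
# number and ignoring the rest of the line are assumptions.
SNAPSHOT_DONE = re.compile(r"Finished capturing snapshot (\d+) for task")


def snapshot_id(line: str) -> Optional[int]:
    """Return the snapshot number if the line reports a finished capture, else None."""
    match = SNAPSHOT_DONE.search(line)
    return int(match.group(1)) if match else None


def follow(path: str, poll_seconds: float = 0.5) -> Iterator[str]:
    """Yield lines appended to a file after it is opened, similar to `tail -f`."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        handle.seek(0, 2)  # jump to the current end of the file
        while True:
            line = handle.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)


def wait_for_snapshot(lines: Iterable[str]) -> Optional[int]:
    """Consume lines until a finished-capture message appears and return its number."""
    for line in lines:
        found = snapshot_id(line)
        if found is not None:
            return found
    return None
```

A test harness could call `wait_for_snapshot(follow(log_path))`, where `log_path` points at the coordinator's server.log, and stop the worker as soon as the call returns.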

Related log/screenshots

The results are occasionally inconsistent. The attached screenshot (not reproduced here) shows the expected results on the left and the actual results on the right.
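The side-by-side comparison can also be done mechanically. The sketch below is one possible way to diff two exported result sets as multisets of rows; it assumes each row has been read as a `(s_store_name, _col1)` pair of strings, and the sample values in the usage note are made up for illustration, not taken from this issue.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Row = Tuple[str, str]  # (s_store_name, _col1 rendered as text)


def diff_rows(expected: Iterable[Row], actual: Iterable[Row]) -> Tuple[List[Row], List[Row]]:
    """Return (rows missing from actual, rows unexpected in actual), each sorted.

    Rows are compared as multisets, so duplicated rows are counted.
    """
    expected_counts = Counter(expected)
    actual_counts = Counter(actual)
    missing = sorted((expected_counts - actual_counts).elements())
    unexpected = sorted((actual_counts - expected_counts).elements())
    return missing, unexpected
```

For example, with the hypothetical inputs `expected = [("able", "100.00"), ("ation", "200.50")]` and `actual = [("able", "100.00"), ("ation", "199.50")]`, the call `diff_rows(expected, actual)` reports `("ation", "200.50")` as missing and `("ation", "199.50")` as unexpected. Two empty lists mean the table contents match.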

Special notes for this issue

Attachments

Comments (4)

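yumei created the Defect
yumei set the assignee to jessica-surya
yumei set the linked project to Proj-openLooKeng
yumei set the linked repository to openLooKeng/hetu-core
yumei set the planned start date to 2021-10-12
yumei set the planned due date to 2021-11-30
yumei set the milestone to 20211230-1.5.0
yumei set the priority to Serious
yumei added the label bug
yumei added the label m/ReliableExecution
yumei added the label r/RC4
yumei added the label 1.4.0
yumei uploaded the attachment QA-996 log and etc configuration.tar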

Please retest and attach debug logs from the following branch: https://gitee.com/jessica-surya/hetu-core/tree/task-recovery-debug

Tested 40 times on 1.5.0 RC1; the problem did not reproduce.
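Because the defect is occasional, 40 clean runs limit how often it can still occur but do not rule it out. As a rough sketch, assuming each run reproduces the defect independently with the same probability p (an assumption, not something measured in this issue), the largest p still consistent with n clean runs at a given confidence solves (1 - p)^n = 1 - confidence. The function name below is hypothetical.

```python
def failure_rate_upper_bound(clean_runs: int, confidence: float = 0.95) -> float:
    """Largest per-run failure probability still consistent with `clean_runs` passes.

    Solves (1 - p) ** clean_runs = 1 - confidence for p, assuming independent
    runs with a constant failure probability.
    """
    if clean_runs <= 0:
        raise ValueError("clean_runs must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / clean_runs)
```

Under these assumptions, `failure_rate_upper_bound(40)` is about 0.07, so a per-run failure rate of up to roughly 7% cannot be excluded at 95% confidence by the test runs alone.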

@jessica-surya Please analyze the code to check whether the problem has been rectified, and post the analysis results.

Neither PR !1250 nor PR !1253 contains changes directly related to this issue. However, the inconsistency could have been caused by an underlying memory-related problem; now that memory usage is handled more carefully, fewer inconsistent-result errors are occurring.

!1250: Fix snapshot scheduling bug causing queries to hang
!1253: Track memory usage in SingleInputSnapshotState

jessica-surya changed the task status from Todo to ToTest
yumei changed the task status from ToTest to Tested
