[openEuler-kernel-6.1.0] [arm] LTP test cases memcg_max_usage_in_byte and memcg_usage_in_bytes fail

Status: Accepted
Type: Bug
Created: 2023-01-31 11:31

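[Environment]
Software:
1) OS version and branch: openEuler-22.03-LTS-SP1
2) Kernel: 6.1.0-1.0.0.1.oe2203sp1.aarch64
3) Version of the component where the problem was found: ltp-20220930
[Steps to reproduce]
Detailed steps:
1. Build ltp-20220930: make autotools; ./configure; make; make install
2. Run the test cases:
./runltp -s memcg_max_usage_in_byte; ./runltp -s memcg_usage_in_bytes
Frequency: always reproducible
[Expected result]
The test cases pass.
[Actual result]
The test cases fail with the errors shown in the screenshots below.
[Attachments]
[Image]
[Image]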

Comments (6)

hanson_fang created the bug

Hi hanson_fang, welcome to the openEuler Community.
I'm the Bot here serving you. You can find the instructions on how to interact with me at Here.
If you have any questions, please contact the SIG: Kernel, and any of the maintainers: @YangYingliang , @成坚 (CHENG Jian) , @jiaoff , @zhengzengkai , @刘勇强 , @wangxiongfeng , @朱科潜 , @WangShaoBo , @lujialin , @wuxu_buque , @Xu Kuohai , @冷嘲啊 , @Lingmingqiang , @yuzenghui , @juntian , @OSSIM , @陈结松 , @whoisxxx , @koulihong , @刘恺 , @hanjun-guo , @woqidaideshi , @Chiqijun , @Kefeng , @ThunderTown , @AlexGuo , @kylin-mayukun , @Zheng Zucheng , @柳歆 , @Jackie Liu , @zhujianwei001 , @郑振鹏 , @SuperSix173 , @colyli , @Zhang Yi , @htforge , @Qiuuuuu , @Yuehaibing , @xiehaocheng , @guzitao , @CTC-Xibo.Wang , @zhanghongchen , @chen wei , @Jason Zeng , @苟浩 , @DuanqiangWen , @georgeguo , @毛泓博 , @AllenShi , @zhangjialin , @Xie XiuQi

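openeuler-ci-bot added the sig/Kernel label
hanson_fang set the assignee to Xie XiuQi
hanson_fang set the planned due date to 2023-02-01
hanson_fang set the planned start date to 2023-01-31
hanson_fang set the priority to Major
hanson_fang edited the title
hanson_fang edited the title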

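This issue is caused by commit 1813e51eece0ad6f4aacaeb738e7cced46feb470, which changed MEMCG_CHARGE_BATCH from 32U to 64U and made the LTP test cases fail. It should also be reproducible on every hardware platform, including x86.

Specifically, in mm/memcontrol.c: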
static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
			unsigned int nr_pages)
{
	unsigned int batch = max(MEMCG_CHARGE_BATCH, nr_pages);
	int nr_retries = MAX_RECLAIM_RETRIES;
	struct mem_cgroup *mem_over_limit;
	struct page_counter *counter;
	unsigned long nr_reclaimed;
	bool passed_oom = false;
	unsigned int reclaim_options = MEMCG_RECLAIM_MAY_SWAP;
	bool drained = false;
	bool raised_max_event = false;
	unsigned long pflags;

retry:
	if (consume_stock(memcg, nr_pages))
		return 0;

	if (!do_memsw_account() ||
	    page_counter_try_charge(&memcg->memsw, batch, &counter)) {
		if (page_counter_try_charge(&memcg->memory, batch, &counter))
			goto done_restock;
		if (do_memsw_account())
			page_counter_uncharge(&memcg->memsw, batch);
		mem_over_limit = mem_cgroup_from_counter(counter, memory);
	} else {
		mem_over_limit = mem_cgroup_from_counter(counter, memsw);
		reclaim_options &= ~MEMCG_RECLAIM_MAY_SWAP;
	}

	if (batch > nr_pages) {
		batch = nr_pages;
		goto retry;
	}

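batch = max(MEMCG_CHARGE_BATCH, nr_pages); sets batch to the larger of MEMCG_CHARGE_BATCH and the number of pages being charged.
If the batched charge fails and the number of pages actually being charged is smaller than MEMCG_CHARGE_BATCH, the charge is retried with the actual page count, which is what these lines do:
	if (batch > nr_pages) {
		batch = nr_pages;
		goto retry;
	}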

The LTP test case is designed around this logic; see the next comment for details.
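The effect of batched charging on the reported usage can be illustrated with a toy model. The sketch below is a hypothetical single-CPU simplification, not the kernel's code: it assumes pages are charged one at a time, that every charge succeeds, and that the unused part of each batch is kept as a stock for later charges (which is what the consume_stock() call in the excerpt suggests). The name charged_pages is made up for illustration, and the workload size of 1027 pages is simply the lower bound of the expected range in the failing log.

```shell
#!/bin/sh
# Toy model (assumptions: one CPU, one page charged at a time, no limit hit).
# When the stock is empty, the counter is charged a whole batch and the
# surplus is kept as stock for the following single-page charges.
charged_pages() {
    want=$1     # pages the workload actually uses
    batch=$2    # stand-in for MEMCG_CHARGE_BATCH
    counter=0   # pages accounted in the counter
    stock=0     # pre-charged pages not yet consumed
    i=0
    while [ "$i" -lt "$want" ]; do
        if [ "$stock" -eq 0 ]; then
            counter=$((counter + batch))
            stock=$batch
        fi
        stock=$((stock - 1))
        i=$((i + 1))
    done
    echo "$counter"
}

echo "batch 32: $(charged_pages 1027 32) pages accounted"
echo "batch 64: $(charged_pages 1027 64) pages accounted"
```

In this model the counter can run ahead of the real usage by up to batch - 1 pages, so doubling the batch doubles the possible overshoot. The real kernel keeps per-CPU stocks and can drain them, so actual numbers may differ.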

memcg_max_usage_in_bytes_test 1 TINFO: timeout per run is 0h 5m 0s
memcg_max_usage_in_bytes_test 1 TINFO: set /sys/fs/cgroup/memory/memory.use_hierarchy to 0 failed
memcg_max_usage_in_bytes_test 1 TINFO: Test memory.max_usage_in_bytes
memcg_max_usage_in_bytes_test 1 TINFO: Running memcg_process --mmap-anon -s 4194304
memcg_max_usage_in_bytes_test 1 TINFO: Warming up pid: 239096
memcg_max_usage_in_bytes_test 1 TINFO: Process is still here after warm up: 239096
memcg_max_usage_in_bytes_test 1 TFAIL: memory.max_usage_in_bytes is 4456448, 4206592-4341760 as expected

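Converting these sizes to 4 KiB pages:
4194304/4k = 1024 pages
memcg_max_usage_in_bytes_test 1 TFAIL: memory.max_usage_in_bytes is 4456448, 4206592-4341760 as expected
4206592/4k = 1027 pages
4341760/4k = 1060 pages
4341760/4k - 4206592/4k = 33 pages

4456448/4k = 1088 pages
4456448/4k - 4206592/4k = 61 pages > 32 pages

The reported maximum usage is therefore 61 pages above the lower bound, more than the 32 pages the test case allows for. That 32-page figure in the test case actually comes from the kernel's MEMCG_CHARGE_BATCH 32U. In 6.1.0 MEMCG_CHARGE_BATCH was changed to 64 pages, so the test case should change accordingly.

The kernel logic here is correct; the corresponding LTP test case needs to be updated.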
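The page arithmetic above can be reproduced with shell arithmetic. This is a minimal sketch that assumes a 4 KiB page size, as in the comment; the helper name bytes_to_pages is made up for illustration, and the byte values are copied from the failing log.

```shell
#!/bin/sh
PAGE_SIZE=4096   # assumption: 4 KiB pages

# Convert a byte count to whole pages.
bytes_to_pages() {
    echo $(( $1 / PAGE_SIZE ))
}

allocated=$(bytes_to_pages 4194304)   # size passed to memcg_process
low=$(bytes_to_pages 4206592)         # lower bound of the expected range
high=$(bytes_to_pages 4341760)        # upper bound of the expected range
actual=$(bytes_to_pages 4456448)      # reported memory.max_usage_in_bytes

echo "allocated=$allocated low=$low high=$high actual=$actual"
echo "accepted window: $((high - low)) pages"
echo "observed overshoot: $((actual - low)) pages"
```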

# Post 4.16 kernel updates stat in batch (> 32 pages) every time
#PAGESIZES=$(($PAGESIZE * 33))
PAGESIZES=$(($PAGESIZE * 65))

# On recent Linux kernels (at least v5.4) updating stats happens in batches
# (PAGESIZES) and also might depend on workload and number of CPUs. The kernel
# caches the data and does not prioritize stats precision. This is especially
# visible for max_usage_in_bytes where it usually exceeds
# actual memory allocation.
# When checking for usage_in_bytes and max_usage_in_bytes accept also higher values
# from given range:
MEM_USAGE_RANGE=$((PAGESIZES))

After changing PAGESIZES=$(($PAGESIZE * 33)) to PAGESIZES=$(($PAGESIZE * 65)) in the LTP file ./testcases/kernel/controllers/memcg/functional/memcg_lib.sh and rerunning on 6.1.0, the accepted range for memory.max_usage_in_bytes changes from 4206592-4341760 to 4206592-4472832, and the test passes.
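The change in the accepted range follows directly from the multiplier. The sketch below assumes 4 KiB pages and reuses the lower bound and the reported value from the failing run (the value seen in the passing run is not given in this thread); upper_bound and check are hypothetical helper names, not LTP functions.

```shell
#!/bin/sh
PAGESIZE=4096     # assumption: 4 KiB pages
LOW=4206592       # lower bound reported by the test
ACTUAL=4456448    # memory.max_usage_in_bytes from the failing run

# Upper bound of the accepted range for a given page multiplier.
upper_bound() {
    echo $(( LOW + PAGESIZE * $1 ))
}

# Print PASS if ACTUAL lies within [LOW, upper bound], FAIL otherwise.
check() {
    up=$(upper_bound "$1")
    if [ "$ACTUAL" -ge "$LOW" ] && [ "$ACTUAL" -le "$up" ]; then
        echo "multiplier $1: $LOW-$up PASS"
    else
        echo "multiplier $1: $LOW-$up FAIL"
    fi
}

check 33
check 65
```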

[Image]

LTP mainline has already been updated, but the fix is not yet included in ltp-20220930.
[Image]

[root@localhost ltp]# git show 2aaff45db7
commit 2aaff45db7960ce8e46e39fad8ae95a3f5db6cba
Author: Zhao Gongyi <zhaogongyi@huawei.com>
Date:   Wed Dec 7 16:37:09 2022 +0800

    memcg_lib.sh: Update 'PAGESIZES' for 6.1 kernel

    Post 6.1 kernel updates stat in batch (> 64 pages) every time
    since commit 1813e51eece0ad6f4aacaeb738e7cced46feb470.

    Update 'PAGESIZES' for 6.1 kernel, otherwise the testcase
    memcg_max_usage_in_bytes_test.sh will fail and report:
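Since the batch size depends on the kernel version, a test library could pick the multiplier at run time instead of hard-coding it. The sketch below is a hypothetical illustration and not the content of the commit above: batch_multiplier is a made-up helper, and it assumes that kernels from 6.1 onward use a 64-page batch and all older ones a 32-page batch (kernels before 4.16 are not handled separately).

```shell
#!/bin/sh
# Hypothetical helper: pick the page multiplier from a kernel release string.
# Assumption: kernels >= 6.1 batch 64 pages, older kernels batch 32 pages.
batch_multiplier() {
    major=${1%%.*}           # text before the first dot
    rest=${1#*.}             # text after the first dot
    minor=${rest%%[!0-9]*}   # leading digits of the remainder
    if [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -ge 1 ]; }; then
        echo 65
    else
        echo 33
    fi
}

PAGESIZE=4096   # assumption: 4 KiB pages
RELEASE="6.1.0-1.0.0.1.oe2203sp1.aarch64"   # kernel from this issue
PAGESIZES=$(( PAGESIZE * $(batch_multiplier "$RELEASE") ))
echo "PAGESIZES=$PAGESIZES"
```

On a live system RELEASE could be taken from `uname -r` instead of being set by hand.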
zhangjialin changed the task status from To Do to Done
zhangjialin added the issue_invalid label
hanson_fang changed the task status from Done to Accepted
