openEuler / A-Tune

Optimal parameters are not applied after atune-adm tuning completes

Status: Completed
Type: Task
Created: 2023-03-20 11:26

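[Description] System parameter tuning was run for a Spark TPC-DS workload on a physical machine. Of the optimal parameter values found by the search, only some were applied; vm.dirty_ratio and vm.min_free_kbytes did not take effect.
[Environment]

  • Hardware: Kunpeng 920 server

  • OS: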

NAME="openEuler"
VERSION="20.03 (LTS-SP3)"
ID="openEuler"
VERSION_ID="20.03"
PRETTY_NAME="openEuler 20.03 (LTS-SP3)"
ANSI_COLOR="0;31"
  • Kernel:
Linux host79 4.19.90-2112.8.0.0131.oe1.aarch64 #1 SMP Fri Dec 31 19:53:20 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
  • atune version:
atune.aarch64                                           1.0.0-6.oe1                               
atune-client.aarch64                                    1.0.0-6.oe1                               
atune-collector.aarch64                                 1.1.0-1.oe1                               
atune-db.aarch64                                        1.0.0-6.oe1                               
atune-engine.aarch64                                    1.0.0-6.oe1                               
atune.src                                               1.0.0-1.oe1                               
atune-collector.src                                     1.1.0-1.oe1                              

[Steps to reproduce]

  1. Run the Spark TPC-DS workload.
  2. Run atune-adm tuning --project os_spark_sql --detail tuning_params_os4spark_client.yaml to tune the parameters (tuning_params_os4spark_client.yaml is attached below).

[Expected result]

Run result

After tuning:

  • vm.dirty_ratio=95
  • vm.min_free_kbytes=839680

[Actual result]

vm.dirty_ratio and vm.min_free_kbytes both still have their pre-tuning values.
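
One way to make the expected/actual comparison repeatable is to read the live values back from /proc/sys and diff them against the values reported by the tuner. The following is a minimal sketch, not part of A-Tune: the helper names `read_sysctl` and `find_mismatches` are hypothetical, and it assumes simple sysctl names whose dots map directly to path separators under /proc/sys.

```python
from pathlib import Path
from typing import Callable, Dict, Tuple


def read_sysctl(name: str) -> str:
    """Read a sysctl value from /proc/sys (assumes dots map to path separators)."""
    return Path("/proc/sys", *name.split(".")).read_text().strip()


def find_mismatches(
    expected: Dict[str, str],
    reader: Callable[[str], str] = read_sysctl,
) -> Dict[str, Tuple[str, str]]:
    """Return {name: (expected, actual)} for every parameter whose live value differs."""
    mismatches = {}
    for name, want in expected.items():
        got = reader(name)
        if got != str(want):
            mismatches[name] = (str(want), got)
    return mismatches
```

Calling `find_mismatches({"vm.dirty_ratio": "95", "vm.min_free_kbytes": "839680"})` right after tuning finishes would list any parameter that does not hold the reported optimum; an empty result means both values are in effect.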

[Attachment]

project: "os_spark_sql"
maxiterations: 100
startworkload: ''
stopworkload: ''
object:
- name: transparent_hugepage.defrag
  info:
    desc: Enabling or Disabling Transparent Hugepages
    get: cat /sys/kernel/mm/transparent_hugepage/defrag | sed -n 's/.*\[\(.*\)\].*/\1/p'
    set: echo $value > /sys/kernel/mm/transparent_hugepage/defrag
    needrestart: 'false'
    type: discrete
    options:
    - never
    dtype: string
- name: transparent_hugepage.enabled
  info:
    desc: Enabling or Disabling Transparent Hugepages
    get: cat /sys/kernel/mm/transparent_hugepage/enabled | sed -n 's/.*\[\(.*\)\].*/\1/p'
    set: echo $value > /sys/kernel/mm/transparent_hugepage/enabled
    needrestart: 'false'
    type: discrete
    options:
    - never
    dtype: string
- name: vm.dirty_expire_centisecs
  info:
    desc: Expiration time of dirty data. When the flusher thread of the kernel is
      woken up after the expiration time, dirty data is written back to the disk.
      The unit is 1% second.
    get: sysctl -n vm.dirty_expire_centisecs
    set: sysctl -w vm.dirty_expire_centisecs=$value
    needrestart: 'false'
    type: discrete
    scope:
    - 1000
    - 5000
    step: 500
    items: null
    dtype: int
- name: vm.dirty_writeback_centisecs
  info:
    desc: Sets the interval for waking up the flusher kernel thread. This thread is
      used to write dirty pages back to the disk. The unit is 1% second.
    get: sysctl -n vm.dirty_writeback_centisecs
    set: sysctl -w vm.dirty_writeback_centisecs=$value
    needrestart: 'false'
    type: discrete
    scope:
    - 100
    - 1000
    step: 100
    items: null
    dtype: int    
- name: vm.min_free_kbytes
  info:
    desc: Size of memory reserved in each memory area, in KB.
    get: sysctl -n vm.min_free_kbytes
    set: sysctl -w vm.min_free_kbytes=$value
    needrestart: 'false'
    type: discrete
    scope:
    - 10240
    - 1024000
    step: 10240
    items: null
    dtype: int
- name: vm.dirty_ratio
  info:
    desc: The percentage of dirty data in the memory cannot exceed this value.
    get: sysctl -n vm.dirty_ratio
    set: sysctl -w vm.dirty_ratio=$value
    needrestart: 'false'
    type: discrete
    scope:
    - 0
    - 100
    step: 1
    items: null
    dtype: int   
- name: vm.dirty_background_ratio
  info:
    desc: When the percentage of dirty pages reaches dirty_background_ratio, the write
      function wakes up the flusher thread of the kernel to write back dirty page
      data until the percentage is less than the value of dirty_background_ratio.
    get: sysctl -n vm.dirty_background_ratio
    set: sysctl -w vm.dirty_background_ratio=$value
    needrestart: 'false'
    type: discrete
    scope:
    - 0
    - 100
    step: 1
    items: null
    dtype: int
- name: kernel.sched_min_granularity_ns
  info:
    desc: Minimum running time of a process on the CPU. During this time, the kernel
      does not proactively select other processes for scheduling (in nanoseconds).
    get: sysctl -n kernel.sched_min_granularity_ns
    set: sysctl -w kernel.sched_min_granularity_ns=$value
    needrestart: 'false'
    type: discrete
    scope:
    - 1000000
    - 100000000
    step: 1000000
    items: null
    dtype: int        
- name: kernel.sched_wakeup_granularity_ns
  info:
    desc: This variable indicates the base of the minimum time that a process should
      run after it is woken up. The smaller the base, the higher the probability of
      preemption.
    get: sysctl -n kernel.sched_wakeup_granularity_ns
    set: sysctl -w kernel.sched_wakeup_granularity_ns=$value
    needrestart: 'false'
    type: discrete
    scope:
    - 1000000
    - 100000000
    step: 1000000
    items: null
    dtype: int     
- name: kernel.sched_autogroup_enabled
  info:
    desc: 'When enabled, the kernel creates task groups to optimize desktop program
      scheduling. 0: disabled 1: enabled'
    get: sysctl -n kernel.sched_autogroup_enabled
    set: sysctl -w kernel.sched_autogroup_enabled=$value
    needrestart: 'false'
    type: discrete
    options:
    - '0'
    - '1'
    dtype: string
- name: kernel.numa_balancing
  info:
    desc: Specifies whether to enable NUMA automatic balancing.
    get: sysctl -n kernel.numa_balancing
    set: sysctl -w kernel.numa_balancing=$value
    needrestart: 'false'
    type: discrete
    options:
    - '0'
    - '1'
    dtype: string                
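
Before suspecting the apply step, it can help to rule out that the reported optimum lies outside the configured search space. The sketch below checks whether a value falls on the grid described by a parameter's scope and step. It assumes, without confirmation from the A-Tune source, that candidates start at the lower bound of scope and advance in increments of step; the function name `in_search_grid` is hypothetical.

```python
def in_search_grid(value: int, low: int, high: int, step: int) -> bool:
    """True if value is one of low, low + step, low + 2*step, ... up to high.

    Assumption: the grid starts at the lower bound of `scope`.
    """
    return low <= value <= high and (value - low) % step == 0


# Expected values from this report, checked against the scope/step in the attachment.
dirty_ratio_ok = in_search_grid(95, 0, 100, 1)
min_free_kbytes_ok = in_search_grid(839680, 10240, 1024000, 10240)
```

Under that assumption both expected values are valid candidates (839680 is 10240 × 82), so the configuration itself does not explain why they were not applied.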

Comments (2)

akzing created the defect

Hi akzing, welcome to the openEuler Community.
I'm the Bot here serving you. You can find the instructions on how to interact with me at Here.
If you have any questions, please contact the SIG: A-Tune, and any of the maintainers: @谢志鹏 , @Monday , @gaoruoshu , @HuBin95

openeuler-ci-bot added the sig/A-Tune label
akzing modified the description

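1. A-Tune may have failed to apply the parameter values.
2. Another service may have modified these parameters at the same time.
We are currently trying to reproduce the issue.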
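
The two hypotheses above could be told apart by sampling a parameter repeatedly while tuning finishes. This is a rough sketch under stated assumptions, not a maintainer-provided procedure: `sample_values` and `classify` are hypothetical helpers, `reader` is any zero-argument callable returning the current value (for example one that reads /proc/sys/vm/dirty_ratio), and short-lived writes between samples can be missed.

```python
import time
from typing import Callable, List


def sample_values(
    reader: Callable[[], str],
    checks: int,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Read the value `checks` times, pausing `interval_s` seconds between reads."""
    samples = []
    for i in range(checks):
        samples.append(reader())
        if i < checks - 1:
            sleep(interval_s)
    return samples


def classify(samples: List[str], expected: str) -> str:
    """Summarize what the samples suggest about the expected value."""
    if expected not in samples:
        return "never_applied"          # consistent with hypothesis 1
    if samples[-1] != expected:
        return "applied_then_changed"   # consistent with hypothesis 2
    return "applied"
```

A result of "never_applied" points toward the set command not running or failing, while "applied_then_changed" suggests that something else wrote the parameter afterwards. Neither outcome is conclusive on its own.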

gaoruoshu changed the task type from Defect to Task
gaoruoshu changed the task status from To Do to Completed
