bugzilla: https://bugzilla.openanolis.cn/show_bug.cgi?id=4201
This optimization is inspired by [1].
When the on-demand routine is triggered, the user daemon fetches data
from the remote and writes it to the backing file with DIRECT IO to
avoid double page caching. When the on-demand routine finishes, the
process that requested the IO reads the data from the backing file
(also with DIRECT IO). This adds read latency, because every on-demand
read pays for two DIRECT IO operations.
To optimize this, make the user daemon write to the backing file with
buffered IO. Make the process requesting the IO read from the page
cache of the backing file first, if the data is there, and otherwise
fall back to DIRECT IO.
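The cache-first read with a DIRECT IO fallback can be illustrated with a
user-space analogue. This is only a sketch and not the code of this patch,
which changes the in-kernel cachefiles read path. The helper names
read_cache_first() and open_direct() are hypothetical, and the sketch
assumes Linux with glibc 2.26 or later for preadv2() and RWF_NOWAIT.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: open the backing file for DIRECT IO reads. */
int open_direct(const char *path)
{
    return open(path, O_RDONLY | O_DIRECT);
}

/*
 * Hypothetical helper: read `len` bytes at offset `off`, preferring data
 * that is already in the page cache of the backing file.
 *
 * buffered_fd: descriptor opened without O_DIRECT
 * direct_fd:   descriptor opened with O_DIRECT (see open_direct()); buf,
 *              len and off must then meet the O_DIRECT alignment rules
 */
ssize_t read_cache_first(int buffered_fd, int direct_fd,
                         void *buf, size_t len, off_t off)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    ssize_t n;

    assert(buf != NULL && len > 0);

    /*
     * RWF_NOWAIT returns only data that is immediately available, i.e.
     * already cached, and fails with EAGAIN instead of waiting for disk IO.
     */
    n = preadv2(buffered_fd, &iov, 1, off, RWF_NOWAIT);
    if (n == (ssize_t)len)
        return n;                       /* full page cache hit */

    /* Miss, partial hit, or RWF_NOWAIT unsupported: use DIRECT IO. */
    return pread(direct_fd, buf, len, off);
}
```

In this sketch a partial cache hit is treated as a miss and the whole range
is re-read through the fallback descriptor, which keeps the logic short at
the cost of some redundant IO.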
With the optimization, the IOPS of the fio job [2] increases from 13.8K
to 35.2K. The time taken to compile the Linux source code in a
container [3] drops from 2m18.448s to 2m14.972s.
[1] https://github.com/dragonflyoss/image-service/blob/master/contrib/kernel-patches/0001-cachefiles-optimize-on-demand-IO-path-with-buffer-IO.patch
[2] fio -name=test -ioengine=psync -filename=/mnt/testfile -direct=0 -bs=4K -rw=randread -time_based -runtime=5s
[3] nerdctl run --rm --net=host --snapshotter=nydus eci-nydus-registry.cn-hangzhou.cr.aliyuncs.com/v6/bldlinux:v0.1-rafs-v6-lz4