CXL-SHM: Non-blocking Partial Failure Resilient Memory Management System

Recent hardware technologies have greatly improved the efficiency of distributed shared memory (DSM). However, the difficulty of distributed memory management remains a major obstacle to the democratization of DSM, especially when partial failures of participating clients (due to crashed processes or machines) must be tolerated.

Therefore, we present CXL-SHM, an automatic distributed memory management system based on reference counting. Reference counts in CXL-SHM are maintained with a special era-based non-blocking algorithm. As a result, there is no global blocking, and there are no memory leaks, double frees, or wild pointers, even if some participating clients fail unexpectedly without releasing the memory references they hold. We evaluated our system on real CXL hardware with both micro-benchmarks and end-to-end applications; the results demonstrate the efficiency of CXL-SHM and the simplicity and flexibility of using it to build efficient distributed applications.
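To make the reference-counting idea concrete, here is a minimal sketch of a non-blocking reference count built on compare-and-swap. It is an illustration only and is NOT the era-based algorithm used by CXL-SHM: the names RefCounted, try_acquire, and release are hypothetical, and the `freed` flag merely stands in for returning memory to an allocator. Note that with this plain scheme a client that crashes while holding a reference never calls release, so the object leaks; that is the partial-failure gap CXL-SHM is designed to close.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical illustration: a plain CAS-based reference count.
// This is not the CXL-SHM era-based algorithm.
struct RefCounted {
    std::atomic<std::uint32_t> count{1};  // the creator holds the first reference
    bool freed = false;                   // stand-in for handing memory back to an allocator
};

// Try to take an additional reference without blocking.
// Returns false if the count has already dropped to zero, so a released
// object is never resurrected (which would create a wild pointer).
inline bool try_acquire(RefCounted &obj) {
    std::uint32_t cur = obj.count.load(std::memory_order_relaxed);
    while (cur != 0) {
        // On failure, compare_exchange_weak reloads `cur` with the current value.
        if (obj.count.compare_exchange_weak(cur, cur + 1,
                                            std::memory_order_acquire,
                                            std::memory_order_relaxed)) {
            return true;
        }
    }
    return false;
}

// Drop one reference. Returns true if this call released the last one.
inline bool release(RefCounted &obj) {
    std::uint32_t prev = obj.count.fetch_sub(1, std::memory_order_acq_rel);
    assert(prev > 0);  // releasing more often than acquiring would be a double free
    if (prev == 1) {
        obj.freed = true;
        return true;
    }
    return false;
}
```

Because exactly one caller observes the transition from 1 to 0, the object is freed once, and because try_acquire refuses to increment from zero, no new reference can appear after that point.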

Requirements

Hardware Environment

Recommended Simulation Platform: a general Linux server (a physical machine or a bare-metal cloud server). To make our results easy to reproduce, the DRAM of a separate NUMA socket can be used to emulate remote CXL hardware. Using remote NUMA memory to simulate CXL latency follows previous work (Pond [^1] and TPP [^2]), and our preliminary evaluations indicate that remote CXL memory and cross-NUMA access perform similarly.
Original CXL Platform: an Intel Linux server with a Sapphire Rapids CPU and an FPGA device (Intel Agilex I/Y series) with R-Tile [^5]. The CXL device is configured in devdax mode.
(Optional) RDMA Platform: used to reproduce the baseline results in Figure 6. This platform should be equipped with a Mellanox/Nvidia 50GBps ConnectX-5 RDMA NIC.
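When using the NUMA-based simulation platform, it can help to confirm that cross-socket memory really is slower than local memory on your machine. The following is a minimal pointer-chasing sketch, not part of CXL-SHM; the function names build_cycle and chase_latency_ns are hypothetical. It walks a random cycle so that every load depends on the previous one, which exposes memory latency rather than bandwidth.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Build a random single-cycle permutation: next[i] is the slot visited after slot i.
std::vector<std::size_t> build_cycle(std::size_t n_slots, std::uint64_t seed) {
    assert(n_slots > 0);
    std::vector<std::size_t> order(n_slots);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::mt19937_64 rng(seed);
    std::shuffle(order.begin(), order.end(), rng);
    std::vector<std::size_t> next(n_slots);
    for (std::size_t i = 0; i < n_slots; ++i) {
        next[order[i]] = order[(i + 1) % n_slots];  // link consecutive entries into one cycle
    }
    return next;
}

// Average nanoseconds per dependent load while walking the cycle from slot 0.
double chase_latency_ns(const std::vector<std::size_t> &next, std::size_t steps) {
    assert(!next.empty() && steps > 0);
    std::size_t idx = 0;
    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t s = 0; s < steps; ++s) {
        idx = next[idx];  // each load depends on the previous result
    }
    auto t1 = std::chrono::steady_clock::now();
    volatile std::size_t sink = idx;  // keep the loop from being optimized away
    (void)sink;
    std::chrono::duration<double, std::nano> elapsed = t1 - t0;
    return elapsed.count() / static_cast<double>(steps);
}
```

To compare local and remote latency, wrap these functions in a small main that prints the result, use an array much larger than the last-level cache, and run the binary twice under numactl, for example with `numactl --cpunodebind=0 --membind=0` and then `numactl --cpunodebind=0 --membind=1`. The node numbers are assumptions and depend on your machine's topology.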

Software Environment

Linux Kernel >= 5.10.134
OS version >= CentOS 7
CMake >= 3.5
Jemalloc: jemalloc = 5.2.1-2.1.al8, jemalloc-devel = 5.2.1-2.1.al8
GCC with C++11 support
Main Memory >= 32GB

Installation

CXL-SHM is a C++ library. It can be installed in two ways.

Compiling Installation

You can install CXL-SHM by compiling the project from source:

cd cxl-shm
mkdir build
cd build
cmake ..
make -j
make install

RPM Installation

TODO

Usage

Here is a simple C++ example. You can compile it with g++ test.cpp -o test -lcxlmalloc -latomic

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <errno.h>
#include <thread>
#include <unistd.h>
#include <future>

#include <cxlmalloc.h>
#include <cxlmalloc_internal.h>
#include <sys/ipc.h>
#include <sys/shm.h>

size_t length;
int shm_id;

void consumer(uint64_t queue_offset, std::promise<uint64_t> &offset)
{
    sleep(3);
    cxl_shm shm = cxl_shm(length, shm_id);
    shm.thread_init();
    void* start = shm.get_start();
    CXLRef r1 = shm.cxl_unwrap(queue_offset);
    offset.set_value(r1.get_tbr()->pptr);
}

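int main()
{
    using namespace std;
    // Create (or open) a 256 MiB System V shared memory segment with key 100.
    length = (ZU(1) << 28);
    shm_id = shmget(100, length, IPC_CREAT|0664);
    assert(shm_id != -1);

    // Construct the CXL-SHM handle over the segment and initialize this thread.
    cxl_shm shm = cxl_shm(length, shm_id);
    shm.thread_init();
    // The example expects a non-zero thread ID after initialization.
    int result = (shm.get_thread_id() != 0);
    assert(result);

    // Allocate 32 bytes and check that the returned reference is usable.
    CXLRef ref = shm.cxl_malloc(32, 0);
    result = (ref.get_tbr() != NULL && ref.get_addr() != NULL);
    assert(result);

    // Mark the segment for removal once every process has detached.
    shmctl(shm_id, IPC_RMID, NULL);

    return 0;
}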
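The example above defines consumer but never runs it. Here is a minimal sketch of how a function with that signature could be driven from another thread and its result collected through the std::promise parameter. It uses only the standard library: fake_consumer and run_consumer are hypothetical stand-ins, and the arithmetic inside fake_consumer is a placeholder for the real cxl_unwrap / get_tbr calls.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <future>
#include <thread>

// Hypothetical stand-in with the same signature as consumer().
// The real function would unwrap the CXL reference; here we only derive a value.
void fake_consumer(std::uint64_t queue_offset, std::promise<std::uint64_t> &offset) {
    offset.set_value(queue_offset + 8);  // placeholder computation, not CXL-SHM behavior
}

// Run the consumer on its own thread and wait for the value it publishes.
std::uint64_t run_consumer(std::uint64_t queue_offset) {
    std::promise<std::uint64_t> offset;
    std::future<std::uint64_t> result = offset.get_future();
    // std::ref is required because std::thread copies its arguments by default.
    std::thread worker(fake_consumer, queue_offset, std::ref(offset));
    std::uint64_t value = result.get();  // blocks until set_value is called
    worker.join();
    return value;
}
```

Passing the promise by reference keeps its lifetime in the caller, so the caller must join the thread before the promise goes out of scope, as run_consumer does. Depending on the toolchain, std::thread may also require compiling with -pthread.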

[^1]: Li, Huaicheng, et al. Pond: CXL-based memory pooling systems for cloud platforms. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2. 2023.
[^2]: Maruf, Hasan Al, et al. TPP: Transparent page placement for CXL-enabled tiered-memory. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3. 2023.
