# DCache

**Repository Path**: zhangxiao2000/DCache

## Basic Information

- **Project Name**: DCache
- **Description**: DCache: A Distributed Cache Mechanism for HDFS Based on RDMA.
- **Primary Language**: Java
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 2
- **Forks**: 0
- **Created**: 2020-08-27
- **Last Updated**: 2022-08-02

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Introduction

DCache is a distributed cache mechanism for HDFS based on RDMA. It optimizes the HDFS read and write paths. The design is described in detail in our **paper**. The code is based on **Hadoop 2.9.0**.

Please cite our paper if you find this work useful.

X Zhang, Liu B, Gou Z, et al. DCache: A Distributed Cache Mechanism for HDFS based on RDMA[C]// 2020 IEEE 22nd International Conference on High Performance Computing and Communications IEEE, 2020.

# Compile

## Dependency

1. Java
2. Maven
3. protobuf
4. [JUCX](https://github.com/openucx/ucx/tree/master/bindings/java)

The [JUCX jar](https://gitee.com/zhangxiao2000/DCache/tree/master/hadoop-2.9.0-src/hadoop-hdfs-project/hadoop-hdfs/src/main/lib) is already included in this project.

You can build the project with Maven in the same way as the community version of Hadoop.

```shell
mvn package -Pdist,native -DskipTests -Dtar -Dmaven.test.skip=true
```

# Deployment

We modified only the HDFS read and write code, so only two jar files change: **hadoop-hdfs-2.9.0.jar** and **hadoop-hdfs-client-2.9.0.jar**. You can compile them from source or download them from **Release**. The full Hadoop distribution can be downloaded from [Apache](https://archive.apache.org/dist/hadoop/common/hadoop-2.9.0/).

Assume `${HADOOP_SRC}` is the root directory of the Hadoop source code and `${HADOOP_HOME}` is the root directory of the Hadoop installation. You can then replace the jar files with the following commands.

```shell
cp ${HADOOP_SRC}/hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-2.9.0.jar ${HADOOP_HOME}/share/hadoop/hdfs/
cp ${HADOOP_SRC}/hadoop-hdfs-project/hadoop-hdfs-client/target/hadoop-hdfs-client-2.9.0.jar ${HADOOP_HOME}/share/hadoop/hdfs/
cp ${HADOOP_SRC}/hadoop-hdfs-project/hadoop-hdfs-client/target/hadoop-hdfs-client-2.9.0.jar ${HADOOP_HOME}/share/hadoop/hdfs/lib
cp ${HADOOP_SRC}/hadoop-hdfs-project/hadoop-hdfs/src/main/lib/jucx-1.7.0.jar ${HADOOP_HOME}/share/hadoop/hdfs/lib
```

If you want to start the CacheNode with the Hadoop commands, you also need to replace these files.

```shell
cp deploy/hdfs ${HADOOP_HOME}/bin/
cp deploy/hadoop-daemon.sh ${HADOOP_HOME}/sbin/
cp deploy/hdfs-site.xml ${HADOOP_HOME}/etc/hadoop/
```

You can enable or disable DCache by setting the **dfs.client.isSupportRDMA** parameter in `${HADOOP_HOME}/etc/hadoop/hdfs-site.xml`.

# Usage

```shell
# Start the CacheNode (it can only be stopped by killing the process)
hadoop-daemon.sh start cachenode
# Cache a file
hdfs cacheadmin -addDirective -path -rdma
# Uncache a file
hdfs cacheadmin -removeDirective -path -rdma
```
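
The jar replacement described under Deployment overwrites the stock Hadoop jars in place. If you would like to be able to roll back, a small wrapper around `cp` can keep a copy of each original jar first. The following is only a sketch: `install_jar` and the `.orig` suffix are hypothetical names chosen for illustration and are not part of DCache or Hadoop.

```shell
#!/bin/sh
# Hypothetical helper (not part of DCache): copy a jar into a target directory,
# keeping a one-time backup of any jar of the same name that is already there.
install_jar() {
    src="$1"        # jar to install, e.g. a freshly built hadoop-hdfs-2.9.0.jar
    dest_dir="$2"   # directory that receives it
    name=$(basename "$src")

    # Stop early if the build output is missing.
    if [ ! -f "$src" ]; then
        echo "missing jar: $src" >&2
        return 1
    fi

    # Back up only once, so running the helper again does not overwrite
    # the saved stock jar with an already replaced one.
    if [ -f "$dest_dir/$name" ] && [ ! -f "$dest_dir/$name.orig" ]; then
        cp "$dest_dir/$name" "$dest_dir/$name.orig"
    fi

    cp "$src" "$dest_dir/"
}
```

With this helper loaded, each `cp` line from the Deployment section could be written as, for example, `install_jar ${HADOOP_SRC}/hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-2.9.0.jar ${HADOOP_HOME}/share/hadoop/hdfs`. To roll back, copy the `.orig` file over the replaced jar.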