# ZeroFS

**Repository Path**: joelive/ZeroFS

## Basic Information
- **Project Name**: ZeroFS
- **Description**: ZeroFS makes S3 storage feel like a real filesystem
- **Primary Language**: Unknown
- **License**: AGPL-3.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-09-26
- **Last Updated**: 2025-10-18

## Categories & Tags
**Categories**: Uncategorized
**Tags**: None

## README
### ZFS on S3 via NBD
ZeroFS provides NBD block devices that ZFS can use directly - no intermediate filesystem needed. Here's ZFS running on S3 storage:
### Ubuntu Running on ZeroFS
Watch Ubuntu boot from ZeroFS:
### Self-Hosting ZeroFS
ZeroFS can self-host! Here's a demo showing Rust's toolchain building ZeroFS while running on ZeroFS:
## Architecture
```mermaid
graph TB
    subgraph "Client Layer"
        NFS[NFS Client]
        P9[9P Client]
        NBD[NBD Client]
    end

    subgraph "ZeroFS Core"
        NFSD[NFS Server]
        P9D[9P Server]
        NBDD[NBD Server]
        VFS[Virtual Filesystem]
        ENC[Encryption Manager]
        CACHE[Cache Manager]

        NFSD --> VFS
        P9D --> VFS
        NBDD --> VFS
        VFS --> ENC
        ENC --> CACHE
    end

    subgraph "Storage Backend"
        SLATE[SlateDB]
        LSM[LSM Tree]
        S3[S3 Object Store]

        CACHE --> SLATE
        SLATE --> LSM
        LSM --> S3
    end

    NFS --> NFSD
    P9 --> P9D
    NBD --> NBDD
```
## Quick Start
### Installation
#### Download Binary (Recommended)
Download pre-built binaries from the [releases page](https://github.com/Barre/ZeroFS/releases). These binaries are optimized with Profile-Guided Optimization (PGO) for best performance.
#### Via Cargo
```bash
cargo install zerofs
```
#### Via Docker
```bash
docker pull ghcr.io/barre/zerofs:latest
```
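The container needs your configuration file and the server ports it should expose. A hypothetical invocation is sketched below; the in-container config path and the assumption that the image's entrypoint is the `zerofs` binary are not confirmed by this README, so adjust to match the actual image and your `zerofs.toml`:
```bash
# Sketch only: assumes the image entrypoint is the zerofs binary and that
# /etc/zerofs/zerofs.toml is a readable config path inside the container.
docker run -d \
  -v "$(pwd)/zerofs.toml:/etc/zerofs/zerofs.toml:ro" \
  -e ZEROFS_PASSWORD \
  -e AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY \
  -p 2049:2049 -p 5564:5564 -p 10809:10809 \
  ghcr.io/barre/zerofs:latest run -c /etc/zerofs/zerofs.toml
```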
### Getting Started
```bash
# Generate a configuration file
zerofs init
# Edit the configuration with your S3 credentials
$EDITOR zerofs.toml
# Run ZeroFS
zerofs run -c zerofs.toml
```
## Configuration
ZeroFS uses a TOML configuration file that supports environment variable substitution. This makes it easy to manage secrets and customize paths.
### Creating a Configuration
Generate a default configuration file:
```bash
zerofs init # Creates zerofs.toml
```
The configuration file has sections for:
- **Cache** - Local cache settings for performance
- **Storage** - S3/Azure/local backend configuration and encryption
- **Servers** - Enable/disable NFS, 9P, and NBD servers
- **Cloud credentials** - AWS or Azure authentication
### Example Configuration
```toml
[cache]
dir = "${HOME}/.cache/zerofs"
disk_size_gb = 10.0
memory_size_gb = 1.0 # Optional, defaults to 0.25

[storage]
url = "s3://my-bucket/zerofs-data"
encryption_password = "${ZEROFS_PASSWORD}"

[servers.nfs]
addresses = ["127.0.0.1:2049"] # Can specify multiple addresses

[servers.ninep]
addresses = ["127.0.0.1:5564"]
unix_socket = "/tmp/zerofs.9p.sock" # Optional

[servers.nbd]
addresses = ["127.0.0.1:10809"]
unix_socket = "/tmp/zerofs.nbd.sock" # Optional

[aws]
access_key_id = "${AWS_ACCESS_KEY_ID}"
secret_access_key = "${AWS_SECRET_ACCESS_KEY}"
# endpoint = "https://s3.us-east-1.amazonaws.com" # For S3-compatible services
# default_region = "us-east-1"
# allow_http = "true" # For non-HTTPS endpoints

# [azure]
# storage_account_name = "${AZURE_STORAGE_ACCOUNT_NAME}"
# storage_account_key = "${AZURE_STORAGE_ACCOUNT_KEY}"
```
### Environment Variable Substitution
The configuration supports `${VAR}` syntax for environment variables. This is useful for:
- Keeping secrets out of configuration files
- Using different settings per environment
- Sharing configurations across systems
All referenced environment variables must be set when running ZeroFS.
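For example, with the configuration above referencing `${ZEROFS_PASSWORD}` and the AWS credentials, you might export them before starting ZeroFS (the values below are placeholders, not real credentials):
```bash
# Placeholder values; substitute your real secrets or load them from a secret manager.
export ZEROFS_PASSWORD='a-strong-passphrase'
export AWS_ACCESS_KEY_ID='AKIA-example'
export AWS_SECRET_ACCESS_KEY='example-secret'

# Every variable referenced in zerofs.toml must be set before this runs
zerofs run -c zerofs.toml
```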
### Storage Backends
ZeroFS supports multiple storage backends through the `url` field in `[storage]`:
#### Amazon S3
```toml
[storage]
url = "s3://my-bucket/path"
[aws]
access_key_id = "${AWS_ACCESS_KEY_ID}"
secret_access_key = "${AWS_SECRET_ACCESS_KEY}"
# endpoint = "https://s3.us-east-1.amazonaws.com" # For S3-compatible services
# default_region = "us-east-1"
# allow_http = "true" # For non-HTTPS endpoints (e.g., MinIO)
```
#### Microsoft Azure
```toml
[storage]
url = "azure://container/path"
[azure]
storage_account_name = "${AZURE_STORAGE_ACCOUNT_NAME}"
storage_account_key = "${AZURE_STORAGE_ACCOUNT_KEY}"
```
#### Local Filesystem
```toml
[storage]
url = "file:///path/to/storage"
# No additional configuration needed
```
### Server Configuration
You can enable or disable individual servers by including or commenting out their sections:
```toml
# To disable a server, comment out or remove its entire section
[servers.nfs]
addresses = ["0.0.0.0:2049"] # Bind to all IPv4 interfaces
# addresses = ["[::]:2049"] # Bind to all IPv6 interfaces
# addresses = ["127.0.0.1:2049", "[::1]:2049"] # Dual-stack localhost
[servers.ninep]
addresses = ["127.0.0.1:5564"]
unix_socket = "/tmp/zerofs.9p.sock" # Optional: adds Unix socket support
[servers.nbd]
addresses = ["127.0.0.1:10809"]
unix_socket = "/tmp/zerofs.nbd.sock" # Optional: adds Unix socket support
```
### Encryption
Encryption is always enabled in ZeroFS. All file data is encrypted using ChaCha20-Poly1305 authenticated encryption with lz4 compression. Configure your password in the configuration file:
```toml
[storage]
url = "s3://my-bucket/data"
encryption_password = "${ZEROFS_PASSWORD}" # Or use a literal password (not recommended)
```
#### Password Management
On first run, ZeroFS generates a 256-bit data encryption key (DEK) and encrypts it with a key derived from your password using Argon2id. The encrypted key is stored in the database, so you need the same password for subsequent runs.
To change your password:
```bash
# Change the encryption password (reads new password from stdin)
echo "new-secure-password" | zerofs change-password -c zerofs.toml
# Or read from a file
zerofs change-password -c zerofs.toml < new-password.txt
```
After changing the password, update your configuration file or environment variable to use the new password for future runs.
#### What's Encrypted vs What's Not
**Encrypted:**
- All file contents (in 256K chunks)
- File metadata values (permissions, timestamps, etc.)
**Not Encrypted:**
- Key structure (inode IDs, directory entry names)
- Database structure (LSM tree levels, bloom filters)
This design is intentional. Encrypting keys would severely impact performance as LSM trees need to compare and sort keys during compaction. The key structure reveals filesystem hierarchy but not file contents.
This is fine for most use cases, but if you need to hide directory structure and filenames, you can layer a filename-encrypting filesystem such as gocryptfs on top of ZeroFS.
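For example, gocryptfs can be stacked on top of a ZeroFS mount so that filenames stored in S3 are also encrypted. The paths below are illustrative and assume ZeroFS is already mounted at /mnt/9p:
```bash
# Illustrative paths; assumes ZeroFS is mounted at /mnt/9p via 9P or NFS.
mkdir -p /mnt/9p/encrypted /mnt/secure

# One-time initialization of the encrypted directory (prompts for a password)
gocryptfs -init /mnt/9p/encrypted

# Mount the decrypted view; files and names written under /mnt/secure are
# stored encrypted inside /mnt/9p/encrypted, and therefore in S3.
gocryptfs /mnt/9p/encrypted /mnt/secure
```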
## Mounting the Filesystem
### 9P (Recommended for better performance and FSYNC POSIX semantics)
9P provides better performance and more accurate POSIX semantics, especially for fsync/commit operations. When fsync is called on 9P, ZeroFS receives a proper signal to flush data to stable storage, ensuring strong durability guarantees.
**Note on durability:** With NFS, ZeroFS reports writes as "stable" to the client even though they are actually unstable (buffered in memory/cache). This is done to avoid performance degradation, as otherwise each write would translate to an fsync-like operation (COMMIT in NFS terms). During testing, we expected clients to call COMMIT on FSYNC, but tested clients (macOS and Linux) don't follow this pattern. If you require strong durability guarantees, 9P is strongly recommended over NFS.
#### TCP Mount (default)
```bash
mount -t 9p -o trans=tcp,port=5564,version=9p2000.L,msize=1048576,cache=mmap,access=user 127.0.0.1 /mnt/9p
```
#### Unix Socket Mount (lower latency for local access)
For improved performance when mounting locally, you can use Unix domain sockets which eliminate TCP/IP stack overhead:
```bash
# Configure Unix socket in zerofs.toml
# [servers.ninep]
# unix_socket = "/tmp/zerofs.9p.sock"
# Mount using Unix socket
mount -t 9p -o trans=unix,version=9p2000.L,msize=1048576,cache=mmap,access=user /tmp/zerofs.9p.sock /mnt/9p
```
Unix sockets avoid the network stack entirely, making them ideal for local mounts where the client and ZeroFS run on the same machine.
### NFS
#### macOS
```bash
mount -t nfs -o async,nolocks,rsize=1048576,wsize=1048576,tcp,port=2049,mountport=2049,hard 127.0.0.1:/ mnt
```
#### Linux
```bash
mount -t nfs -o async,nolock,rsize=1048576,wsize=1048576,tcp,port=2049,mountport=2049,hard 127.0.0.1:/ /mnt
```
## NBD Configuration and Usage
In addition to file-level access, ZeroFS provides raw block devices through NBD with full TRIM/discard support:
```bash
# Configure NBD in zerofs.toml
# [servers.nbd]
# addresses = ["127.0.0.1:10809"]
# unix_socket = "/tmp/zerofs.nbd.sock" # Optional
# Start ZeroFS
zerofs run -c zerofs.toml
# Mount ZeroFS via NFS or 9P to manage devices
mount -t nfs 127.0.0.1:/ /mnt/zerofs
# or
mount -t 9p -o trans=tcp,port=5564 127.0.0.1 /mnt/zerofs
# Create NBD devices dynamically
mkdir -p /mnt/zerofs/.nbd
truncate -s 1G /mnt/zerofs/.nbd/device1
truncate -s 2G /mnt/zerofs/.nbd/device2
truncate -s 5G /mnt/zerofs/.nbd/device3
# Connect via TCP with optimal settings (high timeout, multiple connections)
nbd-client 127.0.0.1 10809 /dev/nbd0 -N device1 -persist -timeout 600 -connections 4
nbd-client 127.0.0.1 10809 /dev/nbd1 -N device2 -persist -timeout 600 -connections 4
# Or connect via Unix socket (better local performance)
nbd-client -unix /tmp/zerofs.nbd.sock /dev/nbd2 -N device3 -persist -timeout 600 -connections 4
# Use the block devices
mkfs.ext4 /dev/nbd0
mount /dev/nbd0 /mnt/block
# Or create a ZFS pool
zpool create mypool /dev/nbd0 /dev/nbd1 /dev/nbd2
```
### TRIM/Discard Support
ZeroFS NBD devices support TRIM operations, which actually delete the corresponding chunks from the LSM-tree database backed by S3:
```bash
# Manual TRIM
fstrim /mnt/block
# Enable automatic discard (for filesystems)
mount -o discard /dev/nbd0 /mnt/block
# ZFS automatic TRIM
zpool set autotrim=on mypool
zpool trim mypool
```
When blocks are trimmed, ZeroFS removes the corresponding chunks from its LSM-tree, which eventually frees space in S3 storage through compaction. This reduces storage costs for any filesystem or application that issues TRIM commands.
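If you run ZFS on top of the NBD devices, you can watch TRIM progress per device with standard tooling (assuming the `mypool` pool from the example above):
```bash
# -t adds per-device TRIM status to the pool report
zpool status -t mypool
```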
### NBD Device Management
NBD devices are managed as regular files in the `.nbd` directory:
```bash
# List devices
ls -lh /mnt/zerofs/.nbd/
# Create a new device
truncate -s 10G /mnt/zerofs/.nbd/my-device
# Remove a device (must disconnect NBD client first)
nbd-client -d /dev/nbd0
rm /mnt/zerofs/.nbd/my-device
```
Devices are discovered dynamically by the NBD server - no restart needed! You can read/write these files directly through NFS/9P, or access them as block devices through NBD.
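Because each device is just a file, you can also copy or archive it through the file interface, for example to take an offline image of a device. The paths below are illustrative:
```bash
# Disconnect the NBD client first so the image is consistent
nbd-client -d /dev/nbd0

# Copy the device's backing file to a local image through the NFS/9P mount
dd if=/mnt/zerofs/.nbd/device1 of=/backup/device1.img bs=1M status=progress
```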
## Geo-Distributed Storage with ZFS
Since ZeroFS makes S3 regions look like local block devices, you can create globally distributed ZFS pools by running multiple ZeroFS instances across different regions:
```bash
# Machine 1 - US East (10.0.1.5)
# zerofs-us-east.toml:
# [storage]
# url = "s3://my-bucket/us-east-db"
# encryption_password = "${SHARED_KEY}"
# [servers.nbd]
# addresses = ["0.0.0.0:10809"]
# [aws]
# default_region = "us-east-1"
zerofs run -c zerofs-us-east.toml
# Create device via mount (from same or another machine)
mount -t nfs 10.0.1.5:/ /mnt/zerofs
truncate -s 100G /mnt/zerofs/.nbd/storage
umount /mnt/zerofs
# Machine 2 - EU West (10.0.2.5)
# Similar config with url = "s3://my-bucket/eu-west-db" and default_region = "eu-west-1"
zerofs run -c zerofs-eu-west.toml
# Machine 3 - Asia Pacific (10.0.3.5)
# Similar config with url = "s3://my-bucket/asia-db" and default_region = "ap-southeast-1"
zerofs run -c zerofs-asia.toml
# From a client machine, connect to all three NBD devices with optimal settings
nbd-client 10.0.1.5 10809 /dev/nbd0 -N storage -persist -timeout 600 -connections 8
nbd-client 10.0.2.5 10809 /dev/nbd1 -N storage -persist -timeout 600 -connections 8
nbd-client 10.0.3.5 10809 /dev/nbd2 -N storage -persist -timeout 600 -connections 8
# Create a mirrored pool across continents using raw block devices
zpool create global-pool mirror /dev/nbd0 /dev/nbd1 /dev/nbd2
```
**Result**: Your ZFS pool now spans three continents with automatic:
- **Disaster recovery** - If any region goes down, your data remains available
- **Geographic redundancy** - Data is simultaneously stored in multiple regions
- **Infinite scalability** - Add more regions by spinning up additional ZeroFS instances
This turns expensive geo-distributed storage infrastructure into a few simple commands.
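Standard ZFS tooling works on the resulting pool, so you can verify the mirror and exercise a simulated region outage. The commands below are a sketch assuming the `global-pool` layout above:
```bash
# Check that all three regional devices are ONLINE
zpool status global-pool

# Simulate losing the EU region; the pool stays available on the remaining mirrors
zpool offline global-pool /dev/nbd1

# Bring it back and let ZFS resilver the writes it missed
zpool online global-pool /dev/nbd1
zpool status global-pool
```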
## Tiered Storage with ZFS L2ARC
Since ZeroFS makes S3 behave like a regular block device, you can use ZFS's L2ARC to create automatic storage tiering:
```bash
# Create your S3-backed pool
zpool create mypool /dev/nbd0 /dev/nbd1
# Add local NVMe as L2ARC cache
zpool add mypool cache /dev/nvme0n1
# Check your setup
zpool iostat -v mypool
```
With this setup, ZFS automatically manages data placement across storage tiers:
1. NVMe L2ARC: frequently accessed data
2. ZeroFS caches: sub-millisecond latency for warm data
3. S3 backend: everything else
The tiering is transparent to applications. A PostgreSQL database sees consistent performance for hot data while storing years of historical data at S3 prices. No manual archival processes or capacity planning emergencies.
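If you do put PostgreSQL on such a pool, a dedicated dataset with an 8K recordsize (matching PostgreSQL's page size) is a common starting point. The dataset name and mountpoint below are assumptions, not part of ZeroFS:
```bash
# Hypothetical dataset layout for a PostgreSQL data directory on the mypool pool
zfs create -o recordsize=8k -o compression=lz4 mypool/postgres
zfs set mountpoint=/var/lib/postgresql mypool/postgres
```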
## PostgreSQL Performance
Here's pgbench running on PostgreSQL with ZeroFS + L2ARC as the storage backend:
### Read/Write Performance
```
postgres@ubuntu-16gb-fsn1-1:/root$ pgbench -c 50 -j 15 -t 100000 example
pgbench (16.9 (Ubuntu 16.9-0ubuntu0.24.04.1))
starting vacuum...end.
transaction type: