LLaMA Box is an LM inference server (pure API, without frontend assets) based on llama.cpp and stable-diffusion.cpp.