![](samples/collage-triangles.jpg)

# Barney - A Multi-GPU (and optionally, Multi-Node) Implementation of the ANARI Rendering API

Build Status: [![Windows](https://github.com/NVIDIA/barney/actions/workflows/Windows.yml/badge.svg)](https://github.com/NVIDIA/barney/actions/workflows/Windows.yml) [![Ubuntu](https://github.com/NVIDIA/barney/actions/workflows/Ubuntu.yml/badge.svg)](https://github.com/NVIDIA/barney/actions/workflows/Ubuntu.yml)

DISCLAIMER: Though Barney has by now reached a stage where it can be expected to be reasonably stable and complete, it is still under active development. If you run into bugs, missing features, or simply broken/outdated documentation, please report them at https://github.com/NVIDIA/barney

# What is Barney?

Barney is a renderer that implements the ANARI Cross-Platform Rendering API (https://www.khronos.org/anari/), primarily for NVIDIA OptiX- and CUDA-capable GPUs.

### Multi-GPU and Multi-Node Parallel Rendering

Barney is highly scalable, and can be used for local, single-GPU rendering (as I do on my laptop on a daily basis), for parallel rendering on multi-GPU nodes, and even for MPI-based data-replicated and/or data-parallel rendering:

- Single-GPU usage: For single-GPU usage Barney works like any other ANARI device; multiple GPUs or MPI are *not* required to run Barney (a minimal sketch follows at the end of this section).

- Multi-GPU: Barney can also make use of more than one GPU. It can either be used in *explicit* multi-GPU mode (where the ANARI app explicitly creates different devices for different GPUs, and then "tethers" those using a specific ANARI extension we have introduced for this purpose), or in *automatic* multi-GPU mode, where Barney will simply grab all available GPUs and split the work across them.

- Multi-Node: For cluster, cloud, or HPC environments Barney also supports MPI-parallel rendering (if built with MPI support), in which case an MPI-parallel application can use Barney across multiple GPUs and/or nodes.

### Data Parallel and/or Data Replicated Rendering

Barney supports both *data parallel* and *data replicated* rendering:

In fully data-replicated rendering, each GPU (and/or each node) gets the exact same copy of all the scene content, and different GPUs render different portions of the final image (i.e., this should make rendering the *same* content *faster*).

In fully data-parallel (also sometimes called "distributed") rendering, the scene to be rendered is "distributed" across the different GPUs/nodes, so different GPUs get different parts of what is logically a single model; barney then makes sure that each GPU "sees" all content during rendering. In this mode Barney will not get faster by adding more GPUs, but it can render models much larger than what a single GPU could have rendered.

Barney also supports some intermediate modes where, for example, different nodes work data-parallel, but all GPUs on a given node work data-replicated, etc.
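To make the single-GPU bullet above concrete, here is a minimal sketch of what "works like any other ANARI device" means in practice. This is an illustrative sketch, not barney documentation: it uses only the standard ANARI C API plus the `"barney"` library and `"default"` device names described in the BANARI section further down; `statusFunc` is a hypothetical stub, and all scene setup and error handling are omitted.

``` c++
#include <anari/anari.h>
#include <cstdio>

// hypothetical minimal status callback; a real app will want to filter
// by severity and route messages somewhere useful
static void statusFunc(const void * /*userData*/,
                       ANARIDevice /*device*/,
                       ANARIObject /*source*/,
                       ANARIDataType /*sourceType*/,
                       ANARIStatusSeverity /*severity*/,
                       ANARIStatusCode /*code*/,
                       const char *message)
{
  fprintf(stderr, "[banari] %s\n", message);
}

int main()
{
  // load the barney ANARI device library; in *automatic* multi-GPU mode
  // this exact same code suffices, and barney grabs all GPUs it can see
  ANARILibrary lib = anariLoadLibrary("barney", statusFunc, nullptr);
  ANARIDevice dev = anariNewDevice(lib, "default");
  anariCommitParameters(dev, dev);

  // ...create camera, world, renderer, and frame exactly as with any
  // other ANARI device...

  anariRelease(dev, dev);
  anariUnloadLibrary(lib);
  return 0;
}
```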
### Primarily Focussed on Sci-Vis Content

Barney is primarily intended for the type and size of data one would encounter in scientific visualization (sci-vis), when used from tools such as, for example, ParaView. Barney supports all the typical geometric types required by such applications (triangle meshes, spheres, cylinders, curves, etc), and also supports the typical scalar field/volume data types such as structured volumes, as well as block-structured AMR and unstructured data (as far as these are currently supported by ANARI).

### Path Tracer

Though clearly focussed on sci-vis, Barney is still a pretty capable ray/path tracer in its own right, and will, if scene and material data are properly set up, also do HDRI environment map lighting, indirect illumination, glossy and specular reflection/refraction, depth-of-field cameras, point, directional, and to some degree area lights, volumetric scattering, etc. Barney will clearly not achieve the kind of correctness or realism that pure global illumination renderers like Mitsuba or PBRT can achieve, but it is still expected to behave creditably on typical non-sci-vis rendering content.

# Building and Running

Barney is not a stand-alone "renderer" or "vis-tool"; it is a library with an API, and needs other applications to build against it. As such, it is never "run" on its own; it always needs to be run from another application (e.g., `hayStack`, at https://github.com/ingowald/hayStack), or from any application that supports the ANARI API (see https://www.khronos.org/anari/).

## Dependencies for building Barney

Barney is primarily intended for interactive (multi-)GPU rendering, but can also be built in a non-GPU configuration. Similarly, one of barney's most important features is MPI-based data-parallel rendering, but it can absolutely also be built---and used---without MPI. As such, the dependencies depend on what exactly needs to get built.

One way or another, barney requires:

- `cmake`, for building
- a C++-20 compliant C++ compiler (gcc on Linux, Visual Studio on Windows, clang on Mac)

For CUDA/OptiX acceleration, it also requires:

- `CUDA`, version 12 and up.
- `OWL` (https://github.com/NVIDIA/owl). Note that OWL gets pulled in as a git submodule; there is no need to get and install it externally.
- `OptiX`, as part of OWL. See the documentation in OWL (https://github.com/NVIDIA/owl) for where to get it, and how best to install it so that OWL can easily find it.

For MPI-based data-parallel rendering:

- Building requires a working MPI install. *Running* barney requires a CUDA-aware MPI; for *building* this should not matter. We typically develop under---and test with---OpenMPI 4.1.6 or 5.0, but users have reported success with other MPI flavors such as MPICH.

## Building Barney

Barney builds with cmake in the usual way, but requires a pre-built and installed `ANARI-SDK` from https://github.com/KhronosGroup/ANARI-SDK. As of the time of this writing, you need ANARI SDK version 0.15 (or the `next_release` branch).

First, build and install the ANARI SDK (https://github.com/KhronosGroup/ANARI-SDK):

``` bash
cd ANARI-SDK
mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=<install-dir>
cmake --build . [ --config Release ]
cmake --install . [ --config Release ]
```

Then, build barney, using the same install dir:

``` bash
cd barney
mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=<install-dir> [options]
cmake --build . [ --config Release ]
cmake --install . [ --config Release ]
```

By default Barney builds without MPI support; to enable it, add `-DBARNEY_MPI=ON` to the cmake config command.
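Once barney is built and installed, a quick way to verify that the device actually loads is a tiny stand-alone ANARI program. The following is only a sketch, under a couple of assumptions: the ANARI-SDK from the install prefix above is on your compiler/linker search paths, and `anariGetDeviceSubtypes` returns a NULL-terminated list of subtype names as in the current SDK.

``` c++
#include <anari/anari.h>
#include <cstdio>

int main()
{
  // load the installed barney device library (use "barney_mpi" to check
  // the MPI build); no status callback, to keep the sketch minimal
  ANARILibrary lib = anariLoadLibrary("barney", nullptr, nullptr);
  if (!lib) {
    fprintf(stderr, "could not load the 'barney' ANARI library\n");
    return 1;
  }

  // list the device subtypes this library exposes; expect "default"
  const char **subtypes = anariGetDeviceSubtypes(lib);
  for (const char **s = subtypes; s && *s; ++s)
    printf("found barney device subtype: %s\n", *s);

  anariUnloadLibrary(lib);
  return 0;
}
```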
# Examples of Supported Geometry and Volume Types

## Triangle Meshes (including Instances and Color- and Opacity-Textures)

Example: the PBRT landscape in `miniScene` (https://github.com/ingowald/miniScene) format, the 'embree headlight', and the TSDViewer 'TestORB'; all rendered with HDRI env-map lighting.

![](samples/collage-triangles.jpg)

## Non-Triangular Surface Types for Sci-Vis

Barney supports most (all?) of the ANARI non-triangular geometry types. Here: various examples with capsules, spheres, cylinders, cones, and curves.

![](samples/collage-usergeom.jpg)

## Volume Data

Structured volume data (`float`, `uint8`, and `uint16` are supported), where any volume can be distributed across different ranks by each rank holding a different portion of that volume. `Barney` being intended for sci-vis, every volume can have its own transfer function. Here: 'chest' and 'kingsnake' (both regular structured data, with different input scalar types), and, on the right, 'scivis2011', an unstructured-mesh volume inside a semi-transparent triangular surface.

![](samples/collage-volumes.jpg)

# ANARI / BARNARI

Though `barney` is not *limited to* ANARI (it is its own library, with its own API), it will also, by default, build a (by now reasonably complete!) `ANARI` "device" that exposes most of barney's functionality to applications using the ANARI API. If enabled in the cmake build (it is on by default)---and properly installed via `make install` or `cmake --install`---this builds an ANARI device that any ANARI app can load under the name `"barney"`. If barney is built with MPI support for MPI-based data-parallel ray tracing, it will also build an ANARI `"barney_mpi"` device.

Note: To distinguish between the (general) ANARI *API* and the specific barney-based implementation of this API, we typically refer to this implementation as the `(B)ANARI` device, or simply as `banari`.

## Building BANARI:

- dependencies: `libgtk-3-dev`
- you need to get, build, *and install* the ANARI-SDK: https://github.com/KhronosGroup/ANARI-SDK. Note that the SDK *must* be installed for barney to properly find it.
- build barney as described above. `BUILD_ANARI` should be on by default, so unless explicitly disabled this will also build the banari device.

## Using BANARI:

The barney devices should be easily usable by any existing ANARI application, by simply loading the `"barney"` device. For those apps that respect the `ANARI_LIBRARY` environment variable convention, you should also be able to just set `ANARI_LIBRARY=barney` and have the app load the `"default"` device. For data-parallel rendering across multiple collaborating ranks, use `"barney_mpi"` instead.

Also see https://arxiv.org/abs/2407.00179 for the conventions on how to properly use data-parallel ANARI (which barney implements).
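As a rough illustration of the `"barney_mpi"` path, the sketch below shows the expected shape of a data-parallel setup. It assumes (and this is an assumption, not documented behavior) that every rank loads the device, issues the same sequence of ANARI calls, and that the device coordinates over `MPI_COMM_WORLD`; the authoritative conventions are those described in the paper linked above.

``` c++
#include <anari/anari.h>
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  // every rank loads the MPI-parallel barney device; assumption: the
  // device coordinates internally across MPI_COMM_WORLD
  ANARILibrary lib = anariLoadLibrary("barney_mpi", nullptr, nullptr);
  ANARIDevice dev = anariNewDevice(lib, "default");
  anariCommitParameters(dev, dev);

  // data-parallel usage: each rank specifies only its local part of the
  // scene, but all ranks issue the same sequence of API calls (e.g.,
  // every rank must participate in rendering each frame)
  printf("rank %i: barney_mpi device created\n", rank);

  anariRelease(dev, dev);
  anariUnloadLibrary(lib);
  MPI_Finalize();
  return 0;
}
```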
# Version History

## v0.10

- major updates to MPI performance
- support for multi-GPU data-parallel ANARI using device tethering
- more cuda-like kernel launches across both embree and CPU backends
- updates as required for ANARI 0.15
- support for CUDA 13
- performance fixes for threading on the embree backend
- various fixes throughout

## v0.9.2, 0.9.4, and 0.9.6

- various stability fixes and bug fixes, in particular relating to materials, path tracing, and lighting, as well as to multi-device rendering
- closed various gaps wrt the ANARI spec (missing formats, unsupported parameters, etc)

## v0.9.0

- major rework that allows an 'rtcore' abstraction and multiple backends
- support for both optix and embree (CPU-only) backends
- completely reworked cmake build system (and in particular how things get linked)
- reworked install/export/import system, exporting both `barney` and `barney_mpi`; barney should always be used through `find_package(barney)` and then linked as the import target(s) `barney::barney` and (if found) `barney::barney_mpi`
- anari device split into `anari_library_barney` and `anari_library_barney_mpi`
- version used for the pynari backend, which works on all of Linux/Windows/Mac, and on both CPU and GPU

Known limitations/issues:

- AMR support is currently disabled
- umesh support only supports the cubql sampler

## v0.8.0

- updated to the latest ANARI 0.12 SDK
- changed CUDA arch detection to use 'native' by default, but allowing an override on the command line
- updated CUDA arch detection in owl and cubql so as to allow barney to tell them which arch to use (so it matches across all projects)
- reworked path tracing (and in particular the MIS code) to (mostly) pass a furnace test