A minimal init system (PID 1) for ephemeral NVIDIA GPU-enabled VMs running under Kata Containers. NVRC sets up GPU drivers, configures hardware, spawns NVIDIA management daemons, and hands off to kata-agent for container orchestration.
Fail Fast, Fail Hard: NVRC is designed for ephemeral confidential VMs where any configuration failure should immediately terminate the VM. There are no recovery mechanisms—if GPU initialization fails, the VM powers off. This "panic-on-failure" approach ensures:
- Security: No undefined states in confidential computing environments
- Simplicity: No complex error recovery logic to audit
- Clarity: If it's running, it's configured correctly
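The panic-on-failure behavior can be sketched as a custom panic hook. This is a minimal illustration, not NVRC's actual implementation: the power-off action is injected as a closure so the sketch can run without privileges, whereas a real PID 1 would issue `libc::reboot(libc::RB_POWER_OFF)` at that point.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Install a panic hook that logs the panic and then powers off the VM.
/// The power-off action is injected here for testability; a real PID 1
/// would call libc::reboot(libc::RB_POWER_OFF) instead.
fn install_poweroff_hook(power_off: Arc<dyn Fn() + Send + Sync>) {
    panic::set_hook(Box::new(move |info| {
        eprintln!("fatal: {info}; powering off VM");
        power_off();
    }));
}

fn main() {
    let fired = Arc::new(AtomicBool::new(false));
    let flag = fired.clone();
    install_poweroff_hook(Arc::new(move || flag.store(true, Ordering::SeqCst)));

    // Simulate a configuration failure; the hook runs before unwinding.
    let _ = panic::catch_unwind(|| panic!("GPU initialization failed"));
    assert!(fired.load(Ordering::SeqCst));
}
```

Because the hook fires on any panic, every `expect`/`panic!` in the init path becomes a hard VM shutdown, which is exactly the fail-fast contract described above.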
```mermaid
flowchart TD
    Start([NVRC starts as PID 1]) --> PanicHook[Set panic hook<br/>power off VM on panic]
    PanicHook --> MountFS[Mount filesystems<br/>/proc /dev /sys /run /tmp]
    MountFS --> LoopbackUp[Bring up loopback interface]
    LoopbackUp --> InitKernlog[Initialize kernel logging]
    InitKernlog --> PollSyslogOnce[Poll syslog once]
    PollSyslogOnce --> ParseKernel[Parse kernel parameters<br/>/proc/cmdline]
    ParseKernel --> DetectMode[Detect mode]
    DetectMode --> ModeSelect{Mode?}
    ModeSelect -->|gpu default| GPUMode[GPU Mode]
    ModeSelect -->|cpu| CPUMode[CPU Mode]
    ModeSelect -->|nvswitch-nvl4| NVL4Mode[ServiceVM NVL4<br/>H100/H200/H800]
    ModeSelect -->|nvswitch-nvl5| NVL5Mode[ServiceVM NVL5<br/>B100/B200/B300]
    GPUMode --> GPUSteps[• Load nvidia.ko nvidia-uvm<br/>• Start nvidia-persistenced<br/>• nvidia-smi: lmc lgc pl srs<br/>• nv-hostengine dcgm-exporter<br/>• Generate CDI spec<br/>• Health checks]
    CPUMode --> CPUSteps[• Skip GPU initialization]
    NVL4Mode --> NVL4Steps[• Load nvidia.ko<br/>• Start fabric-mgr greedy<br/>• Health checks]
    NVL5Mode --> NVL5Steps[• Load ib_umad mlx5_ib<br/>• Detect CX7 port GUID<br/>• Start nvlsm<br/>• Start fabric-mgr symmetric<br/>• Health checks]
    GPUSteps --> Lockdown
    CPUSteps --> Lockdown
    NVL4Steps --> Lockdown
    NVL5Steps --> Lockdown
    Lockdown[Disable kernel module loading<br/>security lockdown]
    Lockdown --> ForkAgent[Fork kata-agent<br/>handoff control to guest agent]
    ForkAgent --> PollSyslog[Poll syslog forever<br/>keep PID 1 alive]
    style Start fill:#e1f5ff
    style PollSyslog fill:#e1f5ff
    style GPUMode fill:#c8e6c9
    style CPUMode fill:#fff9c4
    style NVL4Mode fill:#ffccbc
    style NVL5Mode fill:#ffccbc
```
NVRC is configured entirely via kernel command-line parameters (no config files). This is critical for minimal init environments where userspace configuration doesn't exist yet.
| Parameter | Values | Default | Description |
|---|---|---|---|
| `nvrc.mode` | `gpu`, `cpu`, `nvswitch-nvl4`, `nvswitch-nvl5` | `gpu` | Operation mode: `cpu` for CPU-only, `nvswitch-nvl4` for H100/H200/H800 service VMs, `nvswitch-nvl5` for B100/B200/B300 service VMs. |
| `nvrc.log` | `off`, `error`, `warn`, `info`, `debug`, `trace` | `off` | Log verbosity level. Also enables `/proc/sys/kernel/printk_devkmsg`. |
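Kernel parameters are whitespace-separated `key=value` tokens read from `/proc/cmdline`. A minimal parsing sketch (the function name is illustrative, not NVRC's API):

```rust
use std::collections::HashMap;

/// Split a kernel command line into key/value pairs, keeping only
/// NVRC-relevant keys (prefix `nvrc.`). Flags without '=' map to "".
fn parse_cmdline(cmdline: &str) -> HashMap<String, String> {
    cmdline
        .split_whitespace()
        .filter(|tok| tok.starts_with("nvrc."))
        .map(|tok| match tok.split_once('=') {
            Some((k, v)) => (k.to_string(), v.to_string()),
            None => (tok.to_string(), String::new()),
        })
        .collect()
}

fn main() {
    // In the real init this string would come from /proc/cmdline.
    let params = parse_cmdline("console=ttyS0 nvrc.mode=gpu nvrc.log=debug");
    assert_eq!(params.get("nvrc.mode").map(String::as_str), Some("gpu"));
    assert_eq!(params.get("nvrc.log").map(String::as_str), Some("debug"));
    assert!(!params.contains_key("console"));
}
```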
| Parameter | Values | Default | Description |
|---|---|---|---|
| `nvrc.smi.lgc` | `<MHz>` | - | Lock GPU core clocks to a fixed frequency. Eliminates thermal throttling for consistent performance. |
| `nvrc.smi.lmc` | `<MHz>` | - | Lock memory clocks to a fixed frequency. Used alongside `lgc` for fully deterministic GPU behavior. |
| `nvrc.smi.pl` | `<Watts>` | - | Set the GPU power limit. Lower values reduce heat and power draw; higher values allow peak performance. |
| `nvrc.smi.srs` | `enabled`, `disabled` | - | Secure Randomization Seed for GPU memory (passed to `nvidia-smi`). |
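The clock and power parameters correspond to `nvidia-smi`'s `-lgc`, `-lmc`, and `-pl` flags. A sketch of how an argument list might be assembled from them (the builder function is illustrative, not NVRC's code; the `srs` flag is omitted here):

```rust
/// Build nvidia-smi arguments from optional tuning parameters.
/// -lgc locks graphics clocks, -lmc locks memory clocks, -pl sets the
/// power limit; a flag is only emitted for a parameter that was set.
fn smi_args(lgc: Option<u32>, lmc: Option<u32>, pl: Option<u32>) -> Vec<String> {
    let mut args = Vec::new();
    if let Some(mhz) = lgc {
        args.extend(["-lgc".to_string(), mhz.to_string()]);
    }
    if let Some(mhz) = lmc {
        args.extend(["-lmc".to_string(), mhz.to_string()]);
    }
    if let Some(watts) = pl {
        args.extend(["-pl".to_string(), watts.to_string()]);
    }
    args
}

fn main() {
    // nvrc.smi.lgc=1500 nvrc.smi.pl=300 -> nvidia-smi -lgc 1500 -pl 300
    let args = smi_args(Some(1500), None, Some(300));
    assert_eq!(args, ["-lgc", "1500", "-pl", "300"]);
}
```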
| Parameter | Values | Default | Description |
|---|---|---|---|
| `nvrc.uvm.persistence.mode` | `on`/`off`, `true`/`false`, `1`/`0`, `yes`/`no` | `true` | UVM persistence mode keeps unified-memory state across CUDA context teardowns. |
| `nvrc.dcgm` | `on`/`off`, `true`/`false`, `1`/`0`, `yes`/`no` | `false` | Enable DCGM (Data Center GPU Manager) for telemetry and health monitoring. |
| `nvrc.fm.mode` | `0`, `1` | - | Fabric Manager mode: `0` = bare metal, `1` = service VM (shared NVSwitch). Set automatically in the NVSwitch modes. |
| `nvrc.fm.rail.policy` | `greedy`, `symmetric` | `greedy` | Partition rail policy. `symmetric` is required for Confidential Computing on Blackwell. |
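The boolean parameters accept several spellings (`on`/`off`, `true`/`false`, `1`/`0`, `yes`/`no`). A small sketch of tolerant parsing, assuming case-insensitive matching:

```rust
/// Parse the boolean spellings accepted by NVRC-style parameters.
/// Returns None for unrecognized input so the caller can fail fast.
fn parse_bool(value: &str) -> Option<bool> {
    match value.to_ascii_lowercase().as_str() {
        "on" | "true" | "1" | "yes" => Some(true),
        "off" | "false" | "0" | "no" => Some(false),
        _ => None,
    }
}

fn main() {
    assert_eq!(parse_bool("on"), Some(true));
    assert_eq!(parse_bool("NO"), Some(false));
    // Fail-fast philosophy: an unknown value is an error, not a default.
    assert_eq!(parse_bool("maybe"), None);
}
```

Returning `Option` rather than defaulting matches the fail-fast design: a typo in a boolean parameter should abort initialization, not silently pick a value.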
Minimal GPU setup (defaults):

```
nvrc.mode=gpu
```

CPU-only mode:

```
nvrc.mode=cpu
```

NVSwitch NVL4 mode (Service VM for HGX H100/H200/H800 - NVLink 4.0):

```
nvrc.mode=nvswitch-nvl4
```

NVSwitch NVL5 mode (Service VM for HGX B100/B200/B300 - NVLink 5.0):

```
nvrc.mode=nvswitch-nvl5
```

GPU with locked clocks for benchmarking:

```
nvrc.mode=gpu nvrc.smi.lgc=1500 nvrc.smi.lmc=5001 nvrc.smi.pl=300
```

GPU with DCGM monitoring:

```
nvrc.mode=gpu nvrc.dcgm=on nvrc.log=info
```

Multi-GPU with NVLink:

```
nvrc.mode=gpu nvrc.fm.mode=0 nvrc.log=debug
```
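The `nvrc.mode` strings above map naturally onto an enum. A sketch of such a mapping (the type and variant names are illustrative, not NVRC's internals):

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
enum Mode {
    Gpu,
    Cpu,
    NvSwitchNvl4,
    NvSwitchNvl5,
}

impl FromStr for Mode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "gpu" => Ok(Mode::Gpu),
            "cpu" => Ok(Mode::Cpu),
            "nvswitch-nvl4" => Ok(Mode::NvSwitchNvl4),
            "nvswitch-nvl5" => Ok(Mode::NvSwitchNvl5),
            other => Err(format!("unknown nvrc.mode: {other}")),
        }
    }
}

fn main() {
    assert_eq!("nvswitch-nvl4".parse::<Mode>(), Ok(Mode::NvSwitchNvl4));
    assert!("tpu".parse::<Mode>().is_err());
    // A missing nvrc.mode falls back to the documented default, gpu.
    let mode = None::<&str>.map_or(Ok(Mode::Gpu), |s: &str| s.parse::<Mode>());
    assert_eq!(mode, Ok(Mode::Gpu));
}
```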
NVRC is compiled as a statically-linked musl binary for minimal dependencies:
```bash
# x86_64
cargo build --release --target x86_64-unknown-linux-musl

# aarch64
cargo build --release --target aarch64-unknown-linux-musl
```

Build configuration in `.cargo/config.toml` enables aggressive size optimization and static linking.
```bash
# Unit tests (requires root for some tests)
cargo test

# Coverage (requires llvm-cov and root)
cargo llvm-cov --all-features --workspace

# Fuzzing
cargo +nightly fuzz run kernel_params

# Static analysis
cargo clippy --all-features -- -D warnings
cargo audit
cargo deny check
```

NVRC operates with a defense-in-depth security model appropriate for confidential computing:
- Minimal Attack Surface: 7 direct dependencies, statically linked
- Fail-Fast: Panic hook powers off VM on any panic (no undefined states)
- Read-Only Root: Filesystem becomes read-only after initialization
- Module Lockdown: Kernel module loading disabled after GPU setup
- OOM Protection: kata-agent protected with OOM score adjustment (-997)
- Static Linking: No dynamic library dependencies to compromise
- SLSA L3: Build provenance and Sigstore artifact signing
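Two of these hardening steps reduce to single writes into procfs: module lockdown writes `1` to `/proc/sys/kernel/modules_disabled`, and the OOM adjustment writes `-997` to the agent's `/proc/<pid>/oom_score_adj`. A sketch with the target path injected so it can be exercised outside a privileged VM (the helper name is illustrative):

```rust
use std::fs;
use std::path::Path;

/// Write a value into a procfs-style control file. In NVRC's setting,
/// writing "1" to /proc/sys/kernel/modules_disabled irreversibly disables
/// module loading, and writing "-997" to /proc/<pid>/oom_score_adj shields
/// kata-agent from the OOM killer.
fn write_ctl(path: &Path, value: &str) -> std::io::Result<()> {
    fs::write(path, value)
}

fn main() -> std::io::Result<()> {
    // Demonstrate against a temp file instead of the live procfs entries.
    let dir = std::env::temp_dir().join("nvrc-demo");
    fs::create_dir_all(&dir)?;
    let ctl = dir.join("modules_disabled");
    write_ctl(&ctl, "1")?;
    assert_eq!(fs::read_to_string(&ctl)?, "1");
    Ok(())
}
```

Because `modules_disabled` is a one-way switch, performing this write after GPU setup but before forking kata-agent guarantees no further code can be loaded into the kernel for the VM's lifetime.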
In traditional long-running systems, recovering from errors is valuable. In ephemeral confidential VMs:
- VM lifetime is seconds/minutes: Restarting is faster than debugging partial failures
- Confidential computing requires integrity: Undefined states could leak secrets
- Orchestrator handles retries: Kubernetes/Kata will reschedule the pod
- Simpler audit surface: No complex recovery logic to verify
If the VM powers off immediately, check kernel logs for panic messages. Common causes:

- Missing NVIDIA drivers in the container image
- Invalid kernel parameters (check `/proc/cmdline`)
- Daemon startup failures (check logs with `nvrc.log=debug`)

If the GPU is not visible:

- Verify `nvrc.mode=gpu` (the default, but check explicitly)
- Check that the GPU is passed through to the VM
- Ensure the NVIDIA kernel modules are present
- Verify that CDI spec generation succeeded

If daemons fail to start:

- Enable debug logging: `nvrc.log=debug`
- Check that the daemon binaries exist in the container image
- Verify that configuration files are present (`/etc/dcgm-exporter/`, `/usr/share/nvidia/nvswitch/`)
See CONTRIBUTING.md for DCO sign-off requirements.
See VERIFY.md for instructions on verifying release artifacts with Sigstore.
Apache-2.0 - Copyright (c) NVIDIA CORPORATION