/ architecture

The Madness Stack.

Physics-bound infrastructure, layer by layer.

We do not optimize software; we eliminate it. Traditional clouds are built on general-purpose abstractions stacked twelve deep. Avahana collapses that stack using unikernel-like OS principles, kernel-bypass networking, and control-plane-as-a-service primitives — delivering infrastructure that runs at the speed of the underlying wire and silicon.

01—Foundations

Three core pillars.

Zero-Copy Networking

eBPF and XDP process packets at the NIC driver level, bypassing the heavy Linux TCP/IP stack entirely.

Hard Multi-Tenancy

We reject namespaces as the sole isolation boundary. On bare metal, every tenant gets a microVM (Kata / Cloud Hypervisor) that boots in under 100 ms.

The Hollow Fleet

A centralized control-plane factory manages distributed, immutable worker nodes via persistent reverse tunnels.

Layer by layer.

Twelve layers, each rejecting a legacy abstraction in favor of a primitive that respects the hardware.

The Substrate

→ Talos Linux

An immutable, API-driven OS that boots in RAM. No SSH. No console. No package manager.

The physics

General-purpose Linux is technical debt. Talos is a <80MB bootloader for Kubernetes.
System extensions are immutable overlays applied at boot time for specialized hardware (GPUs, NICs).
We manage 10,000 nodes through the gRPC API, eliminating configuration drift entirely.

Why not Ubuntu / RHEL

Configuration drift, SSH-based ops, manual patching. Doesn't scale operationally.

Operational gain

Repurpose a Web Node into a GPU Node by updating its MachineConfig through the API and rebooting.
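To make the repurposing step concrete, here is a minimal sketch of a declarative config patch. The dictionary layout, the `role` label, and the `example-gpu-driver` extension name are hypothetical placeholders, not the real Talos MachineConfig schema; the point is only that the node's role is data that gets merged, not commands that get run.

```python
import copy


def merge_patch(config: dict, patch: dict) -> dict:
    """Return a new config with `patch` merged in; nested dicts merge recursively."""
    merged = copy.deepcopy(config)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_patch(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale by the patch value.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical, simplified config for a web node (not the real Talos schema).
web_node = {
    "machine": {
        "labels": {"role": "web"},
        "extensions": [],
    },
}

# Hypothetical patch that describes a GPU node.
gpu_patch = {
    "machine": {
        "labels": {"role": "gpu"},
        "extensions": ["example-gpu-driver"],
    },
}

gpu_node = merge_patch(web_node, gpu_patch)
```

The original `web_node` is left untouched, which mirrors the immutable model: a new desired config replaces the old one at reboot instead of being edited in place.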

The Factory

→ Kamaji (Control Plane as a Service)

Tenant control planes run as lightweight pods sharing a hyper-optimized multi-tenant etcd backend.

The physics

Stateless API servers. Kubernetes control planes spin up as pods in <5 seconds.
Shared etcd: thousands of tenants on a single NVMe-backed datastore, separated by cryptographic keys.
100% Kubernetes API compatibility — every Helm chart, operator, and tool works out of the box.
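One way to picture many tenants sharing a single datastore is per-tenant key prefixing. The sketch below is an in-memory illustration under that assumption; `SharedStore`, `TenantView`, and the `/tenants/<id>/` prefix are hypothetical names, and the cryptographic credentials that would gate access to each prefix are deliberately left out.

```python
class SharedStore:
    """In-memory stand-in for a shared key-value datastore such as etcd."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def range(self, prefix: str) -> dict[str, str]:
        """Return every key/value pair whose key starts with `prefix`."""
        return {k: v for k, v in self._data.items() if k.startswith(prefix)}


class TenantView:
    """Scopes every read and write to one tenant's key prefix."""

    def __init__(self, store: SharedStore, tenant_id: str) -> None:
        self._store = store
        # The trailing slash keeps tenant "a" from matching tenant "ab".
        self._prefix = f"/tenants/{tenant_id}/"

    def put(self, key: str, value: str) -> None:
        self._store.put(self._prefix + key, value)

    def list(self) -> dict[str, str]:
        found = self._store.range(self._prefix)
        return {k[len(self._prefix):]: v for k, v in found.items()}


store = SharedStore()
tenant_a = TenantView(store, "a")
tenant_ab = TenantView(store, "ab")
tenant_a.put("pods/web", "running")
tenant_ab.put("pods/web", "pending")
```

Both tenants write the same logical key, yet each view only ever sees its own copy.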

Why not Custom Rust control plane

We prioritize ecosystem compatibility over reinvention. Standard kube-apiserver wins.

Operational gain

1,000+ control planes per bare metal node.
Provisioning a tenant cluster is just starting a pod.

Virtualization

→ Polymorphic Isolation Engine

Adaptive runtime: microVMs on metal, hardened containers on cloud, Wasm for edge functions.

The physics

On bare metal: Cloud Hypervisor (Rust) via Kata. 100% native speed, hardware-level isolation.
On public cloud: hardened native containers wrapped in user namespaces and policed by Tetragon eBPF. 99% native speed, provider-grade enforcement.
For high-density logic: WebAssembly (WasmEdge). Millisecond startup for AI inference and serverless.
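The three cases above amount to a small decision rule. A minimal sketch, assuming the only inputs are the platform and whether the workload is a Wasm module; the enum names and the `select_runtime` function are illustrative, not part of any published Avahana API.

```python
from enum import Enum


class Platform(Enum):
    BARE_METAL = "bare_metal"
    PUBLIC_CLOUD = "public_cloud"


class Runtime(Enum):
    MICROVM = "kata-cloud-hypervisor"
    HARDENED_CONTAINER = "container-userns-tetragon"
    WASM = "wasmedge"


def select_runtime(platform: Platform, is_wasm_module: bool = False) -> Runtime:
    """Pick an isolation runtime following the three rules described above."""
    if is_wasm_module:
        # High-density logic runs as WebAssembly regardless of platform.
        return Runtime.WASM
    if platform is Platform.BARE_METAL:
        # Hardware virtualization is available, so use a microVM.
        return Runtime.MICROVM
    # On cloud VMs, avoid nested virtualization and harden a container instead.
    return Runtime.HARDENED_CONTAINER
```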

Why not Nested KVM on cloud VMs

VM-in-VM can cost up to 50% of native performance. Unacceptable.

The Fabric

→ Cilium (eBPF) + Gateway API

Adaptive networking: BGP on metal for line-rate routing, accelerated overlays in the cloud.

The physics

On Avahana metal: BGP advertises pod IPs to top-of-rack switches. Zero encapsulation overhead.
On hybrid/cloud: Geneve encapsulation with eBPF host routing — bypasses iptables to minimize overlay penalty.
Cilium ClusterMesh: a single flat IP space across regions, secured with WireGuard.
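The difference between native routing and an overlay can be put in bytes. The sketch below assumes an IPv4 underlay with no IP options and a Geneve header carrying no options (Geneve options are variable-length, so real overhead may be higher); the function names are illustrative.

```python
# Per-packet bytes added by Geneve encapsulation over an IPv4 underlay.
OUTER_IPV4 = 20      # outer IPv4 header, no IP options
OUTER_UDP = 8        # outer UDP header
GENEVE_BASE = 8      # fixed Geneve header, before any options
INNER_ETHERNET = 14  # encapsulated inner Ethernet header


def inner_mtu(underlay_mtu: int, geneve_options: int = 0) -> int:
    """Largest inner IP packet that fits in one underlay packet."""
    overhead = OUTER_IPV4 + OUTER_UDP + GENEVE_BASE + geneve_options + INNER_ETHERNET
    return underlay_mtu - overhead


def overlay_overhead_fraction(underlay_mtu: int, geneve_options: int = 0) -> float:
    """Share of each full-size underlay packet spent on encapsulation."""
    return 1 - inner_mtu(underlay_mtu, geneve_options) / underlay_mtu


standard = inner_mtu(1500)   # 1450 bytes left for the inner packet
jumbo = inner_mtu(9000)      # 8950 bytes left with jumbo frames
```

With BGP-advertised pod IPs there is no outer header at all, so the inner packet keeps the full underlay MTU.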

Why not Calico / kube-proxy

iptables-based forwarding caps throughput and visibility.

Operational gain

Hubble provides packet-level visibility — DNS, HTTP, latency — without instrumenting application code.
Identity-aware L3–L7 network policies enforced in Cilium's datapath rather than in application code.

Storage & State

→ LINSTOR + OpenEBS LocalPV

Dual-engine NVMe plane: replicated DRBD for stateful pets, raw passthrough for cattle.

The physics

Tier 1: LINSTOR / DRBD network replication. Interrupt-driven (vs SPDK's CPU-bound polling). <0.2ms overhead.
Tier 0: OpenEBS LocalPV — direct NVMe passthrough for AI training, Neon-style databases, line-rate IOPS.

Why not Pure SPDK

Burns 100% CPU polling idle drives. Wasteful on small nodes.

Operational gain

Standard (Replicated) and Turbo (Local NVMe) storage classes, automatically.
~20% compute cost saved on small nodes vs polling-based stacks.
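As a rough way to see why polling hurts most on small nodes, consider the share of a node's cores that a polling storage stack keeps busy. This is a back-of-the-envelope sketch, assuming the stack pins a fixed number of cores at 100%; the one-core default and the node sizes are illustrative assumptions, and the ~20% figure above is not derived from this code.

```python
def polling_cost_fraction(total_cores: int, polling_cores: int = 1) -> float:
    """Fraction of a node's cores permanently consumed by busy-polling."""
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    if not 0 <= polling_cores <= total_cores:
        raise ValueError("polling_cores must be between 0 and total_cores")
    return polling_cores / total_cores


small_node = polling_cost_fraction(5)    # one of five cores is lost to polling
large_node = polling_cost_fraction(64)   # the same core is a small slice here
```

Under these assumptions the fixed polling cost shrinks as nodes grow, which is why an interrupt-driven path matters most at the small end.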

The Edge

→ Pingora (Rust)

Programmable, self-hosted edge proxy with auth, billing, and WAF compiled into the binary.

The physics

Mode A — Global Acceleration: Anycast IPs announced via BGP at our PoPs; Pingora routes through ClusterMesh / WireGuard to the workload.
Mode B — Local Ingress: VIPs announced via ARP/BGP inside customer LANs; provides hardware-load-balancer behavior with no external dependencies.

Why not Nginx + sidecars + external CDN

Sidecar latency, fragmented logic, vendor dependency.

Operational gain

10k+ new connections per second, with zero dropped connections during upgrades.

Management Plane

→ Go controllers + ConnectRPC + Zitadel

Translates human intent to infrastructure specs. Stateless, horizontally scalable.

The physics

Business state in PostgreSQL (users, organizations, billing, audit).
Infrastructure state in etcd. The management plane (the Cortex) never touches a server directly. It updates the manifest, and Operators converge reality.
OpenMeter ingests telemetry to compute usage in real time.
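The "update the manifest, let Operators converge" pattern can be sketched as a reconcile step that compares desired and observed state. This is a simplified illustration, not Avahana's controller code: state is modeled as a mapping from workload name to replica count, and `plan` and `reconcile` are hypothetical names.

```python
def plan(desired: dict[str, int], observed: dict[str, int]) -> list[tuple[str, str, int]]:
    """Return the actions needed to move `observed` toward `desired`."""
    actions = []
    for name, want in sorted(desired.items()):
        have = observed.get(name, 0)
        if have < want:
            actions.append(("scale_up", name, want - have))
        elif have > want:
            actions.append(("scale_down", name, have - want))
    # Anything running that the manifest no longer mentions gets removed.
    for name in sorted(observed.keys() - desired.keys()):
        actions.append(("delete", name, observed[name]))
    return actions


def reconcile(desired: dict[str, int], observed: dict[str, int]) -> dict[str, int]:
    """Apply the plan to a copy of `observed` and return the new state."""
    result = dict(observed)
    for action, name, count in plan(desired, observed):
        if action == "scale_up":
            result[name] = result.get(name, 0) + count
        elif action == "scale_down":
            result[name] -= count
        else:
            del result[name]
    return result


manifest = {"web": 3, "gpu": 1}   # what the management plane wrote
cluster = {"web": 1, "old": 2}    # what is actually running
converged = reconcile(manifest, cluster)
```

Running the plan again on the converged state yields no actions, which is the property that makes the loop safe to repeat.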

Supply Chain

→ BuildKit (Kata-isolated) + Dragonfly P2P

Hostile builds in microVMs; image distribution accelerated via peer-to-peer at the edge.

The physics

Rootless BuildKit wrapped in Kata microVMs. Cloud Native Buildpacks auto-detect languages — no Dockerfile required.
Dragonfly: a single supernode pulls a 10GB image once, then streams chunks to neighbors over the LAN. ~95% WAN bandwidth saved on edge clusters.
Cosign signs every image, and admission control on the Talos-based clusters rejects unsigned images.
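The WAN saving from peer-to-peer distribution follows from simple arithmetic. A minimal sketch, assuming every node would otherwise pull the full image over the WAN and that P2P needs exactly one WAN pull per cluster; the 20-node cluster size is an illustrative assumption, not a figure from the text above.

```python
def wan_savings(nodes: int, image_gb: float, wan_pulls: int = 1) -> float:
    """Fraction of WAN traffic avoided when only `wan_pulls` nodes fetch remotely."""
    if nodes <= 0 or image_gb <= 0:
        raise ValueError("nodes and image_gb must be positive")
    if not 1 <= wan_pulls <= nodes:
        raise ValueError("wan_pulls must be between 1 and nodes")
    direct_gb = nodes * image_gb    # every node pulls over the WAN
    p2p_gb = wan_pulls * image_gb   # the rest is served over the LAN
    return 1 - p2p_gb / direct_gb


# Hypothetical edge cluster: 20 nodes, one 10 GB image.
saving = wan_savings(nodes=20, image_gb=10)
```

Under these assumptions the saving is 1 - 1/nodes, so it approaches 100% as the cluster grows and falls to zero for a single node.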

Operational gain

Aggressive NVMe-backed build caching enables Git-to-Production in seconds.

The Interface — Telemetry

→ Vector (Rust) + ClickHouse

Logs, metrics, and traces unified into one stream. Petabyte-scale, sub-second queries.

The physics

Vector runs on every node. No heavy Java or Ruby agents.
ClickHouse handles petabyte-scale telemetry with sub-second query latency.
Live-tail debugging works even for air-gapped nodes.

API Contract

→ ConnectRPC (Protobuf)

The entire platform surface area defined in Protobuf. Type-safe clients for Go and TypeScript.

The physics

Serves the Connect, gRPC, and gRPC-Web protocols over HTTP/1.1 and HTTP/2. No browser proxy needed.
Backend, CLI, and Web Console are guaranteed in sync — generated from one source of truth.

L10

The CLI

→ avactl — Go + Cobra + ConnectRPC

A single static binary for Linux, macOS, and Windows. Primary tool for super-admins and CI/CD.

The physics

Shares business-logic libraries with the Backend, reducing duplication by ~40%.

L11

The Web Consoles

→ Next.js (App Router) + shadcn/ui

Two consoles from a shared component library: User Console for customers, Admin Console for operators.

The physics

Server-side rendering for instant page loads.
Connects directly to ConnectRPC (Layer 9) and ClickHouse (Layer 8) for real-time data.

Performance & SLA targets.

What this stack is designed to deliver. Every claim links back to a layer above.

Category	Metric	Industry standard	Avahana target	Technical enabler
Provisioning	Control Plane Creation	5–15 min	<15 sec	Kamaji (pod-based)
Compute	VM Cold Start	30–120 sec	<200 ms	Cloud Hypervisor
Compute	Cloud VM Overhead	20–50%	<1%	Tetragon eBPF enforcement
Compute	Wasm Cold Start	—	<5 ms	WasmEdge
Network	Network Overhead	50 ms+	<5 ms	eBPF / XDP bypass
Storage	IOPS Performance	Throttled	Line rate	OpenEBS LocalPV / NVMe Gen5
Edge	Global Routing	DNS propagation	Anycast (<1 sec)	BGP + Pingora
Supply Chain	Build to Deploy	5–10 min	<60 sec	BuildKit + Dragonfly P2P
Observability	Telemetry Latency	1–5 min	<5 sec	Vector + ClickHouse
Operations	Admin : Node Ratio	1 : 100	1 : 5,000	Talos (immutable OS)

Want to run on this stack?

The stack is real. The product is in active build. Get an invite when the beta opens.