
Firecracker CTL

Rust Axum REST API for managing Firecracker microVMs. Provides VM lifecycle management (create, poll, destroy) with per-VM timeout enforcement and in-memory state tracking via DashMap.
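
The in-memory side of this can be pictured as a DashMap keyed by VM id. A minimal sketch, with hypothetical type and field names (the real service's structs may differ):

```rust
use std::time::{Duration, Instant};
use dashmap::DashMap;

// Hypothetical status type; the real service may track more states.
enum VmStatus {
    Running,
    Completed { exit_code: i32, stdout: String, stderr: String },
    Failed(String),
}

struct VmEntry {
    status: VmStatus,
    started_at: Instant,
    timeout: Duration, // per-VM deadline enforced by a background task
}

// Shared across Axum handlers; DashMap allows concurrent access
// without a global lock.
type VmRegistry = DashMap<String, VmEntry>;

fn is_expired(entry: &VmEntry) -> bool {
    entry.started_at.elapsed() > entry.timeout
}
```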

| Method | Path | Description |
| --- | --- | --- |
| POST | /vm/create | Create and start a microVM |
| GET | /vm/{vm_id} | Get VM status |
| GET | /vm/{vm_id}/result | Get stdout/stderr/exit_code after completion |
| DELETE | /vm/{vm_id} | Force-terminate a running VM |
| GET | /vm | List all VMs |
| GET | /health | Service health check |
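
A sketch of how these routes could be wired in Axum (assuming 0.8-style `{param}` path syntax); handler bodies are stubs and all names are illustrative:

```rust
use axum::{routing::{get, post}, Router};

// Stub handlers standing in for the real ones.
async fn create_vm() -> &'static str { "create" }
async fn vm_status() -> &'static str { "status" }
async fn vm_result() -> &'static str { "result" }
async fn destroy_vm() -> &'static str { "destroy" }
async fn list_vms() -> &'static str { "list" }
async fn health() -> &'static str { "ok" }

fn router() -> Router {
    Router::new()
        .route("/vm/create", post(create_vm))
        .route("/vm/{vm_id}", get(vm_status).delete(destroy_vm))
        .route("/vm/{vm_id}/result", get(vm_result))
        .route("/vm", get(list_vms))
        .route("/health", get(health))
}
```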
Code injection flow for ephemeral VMs:

  1. User code arrives via env.CODE in the create request
  2. Code is written to a raw block file (512-byte padded; sketched after this list)
  3. Block file attached as second Firecracker drive (/dev/vdb)
  4. Entrypoint passed via boot_args (fc_entrypoint=/usr/bin/python3)
  5. VM init script reads code from /dev/vdb, writes to /tmp/code, execs entrypoint
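
A sketch of step 2: padding the blob to a whole number of 512-byte sectors so Firecracker can attach it as a raw drive. How the guest init script recovers the exact length (trailing-zero trim or a length header) is not specified here:

```rust
use std::fs;
use std::io;

const SECTOR: usize = 512;

// Pad user code to a 512-byte boundary and write the raw file that
// Firecracker attaches as the second drive (/dev/vdb in the guest).
fn write_code_drive(path: &str, code: &[u8]) -> io::Result<()> {
    let padded_len = code.len().div_ceil(SECTOR) * SECTOR;
    let mut buf = vec![0u8; padded_len];
    buf[..code.len()].copy_from_slice(code);
    fs::write(path, &buf)
}
```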

Pre-built ext4 root filesystems, generated in-cluster by an ArgoCD PostSync hook:

| Image | Size | Contents |
| --- | --- | --- |
| alpine-minimal | 32 MB | Alpine + busybox |
| alpine-python | 128 MB | Alpine + Python 3.12 |
| alpine-node | 128 MB | Alpine + Node.js |

All manifests in apps/kube/firecracker/manifests/:

  • Deployment — firecracker-ctl with /dev/kvm device plugin, kvm=true node selector
  • Service — ClusterIP on port 9001
  • PVC — 2Gi Longhorn volume for rootfs + vmlinux kernel
  • NetworkPolicy — Ingress from edge-runtime + dashboard proxy only
  • KEDA ScaledObject — minReplicas=1, cron scales to 2 during peak hours
  • Rootfs Init Job — ArgoCD PostSync hook, builds ext4 images in-cluster

Firecracker runs in two parallel deployments in the same namespace, chosen by trust level:

| Deployment | Network | Access | Use Case |
| --- | --- | --- | --- |
| firecracker-ctl | None (MMDS only) | Edge functions + dashboard | Ephemeral, untrusted code execution |
| firecracker-ctl-net | Gluetun/WireGuard VPN | Dashboard only (staff) | Persistent endpoints with outbound internet |

firecracker-ctl-net shares its network namespace with a Gluetun sidecar — all VM traffic exits through a WireGuard tunnel, so user code in VMs never sees the raw host network. This is the only ecosystem that hosts persistent endpoints.

Persistent endpoints are long-lived VMs that expose HTTP servers, routed by name from the public edge. Think Fly.io Machines or Cloud Run — deploy once and the VM stays up until explicitly stopped. They live in the networked ecosystem only.

```
Browser → kbve.com/fc/{name}/*
  → kbve-gateway (path prefix /fc/)
  → kbve-service:4321 (axum-kbve)
  → /fc/{name}/{*path} handler (staff auth)
  → firecracker-ctl-net.firecracker.svc:9001/proxy/{name}/{*path}
  → VM tap IP:{http_port}/{*path}
  → response
```

Two hops of reverse proxy: axum-kbve authenticates + scopes to staff, firecracker-ctl-net does name→VM lookup and forwards over TAP.
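
A sketch of the second hop, assuming a reqwest-based forwarder and a hypothetical lookup helper; the real Phase 2d handler also preserves method/headers and streams the body, which this GET-only sketch does not:

```rust
use axum::{
    body::Body,
    extract::Path,
    http::StatusCode,
    response::Response,
};

// Hypothetical lookup into the name -> (tap_ip, http_port) map.
fn lookup(name: &str) -> Option<(String, u16)> {
    todo!("DashMap lookup")
}

async fn proxy(
    Path((name, path)): Path<(String, String)>,
) -> Result<Response, StatusCode> {
    let (ip, port) = lookup(&name).ok_or(StatusCode::NOT_FOUND)?;
    let upstream = format!("http://{ip}:{port}/{path}");
    let resp = reqwest::get(&upstream).await.map_err(|_| StatusCode::BAD_GATEWAY)?;
    let status = StatusCode::from_u16(resp.status().as_u16())
        .unwrap_or(StatusCode::BAD_GATEWAY);
    let bytes = resp.bytes().await.map_err(|_| StatusCode::BAD_GATEWAY)?;
    Ok(Response::builder().status(status).body(Body::from(bytes)).unwrap())
}
```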

| Method | Path | Description |
| --- | --- | --- |
| POST | /fc/deploy | Deploy a persistent VM with a unique name + http_port |
| GET | /fc/list | List all persistent endpoints with status |
| GET | /fc/{name} | Get a single endpoint’s metadata + health |
| DELETE | /fc/{name} | Stop and destroy a persistent endpoint |
| ANY | /proxy/{name}/{*path} | Reverse-proxy HTTP into the named VM |

Deploy payload includes rootfs image, entrypoint, optional code blob (raw drive), env vars (via MMDS), http_port, vcpu/memory. No timeout field — persistent VMs live until DELETE.
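
Deserialized, that payload could look like the following serde struct; field names are inferred from the description above, not confirmed:

```rust
use serde::Deserialize;
use std::collections::HashMap;

// Hypothetical request body for POST /fc/deploy.
#[derive(Deserialize)]
struct DeployRequest {
    name: String,                         // unique routing name
    rootfs: String,                       // e.g. "alpine-python-web"
    entrypoint: String,
    code: Option<String>,                 // optional blob, attached as a raw drive
    env: Option<HashMap<String, String>>, // delivered to the guest via MMDS
    http_port: u16,                       // guest port for health check + proxy
    vcpu: Option<u8>,
    memory_mb: Option<u32>,
    // No timeout field: persistent VMs live until DELETE /fc/{name}.
}
```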

Persistent endpoints require TAP networking (deferred in Phase 1 for ephemeral VMs — MMDS was enough for outbound-only workloads). Inbound HTTP to a specific guest IP:port is not reachable via MMDS.

  • Each persistent VM gets a /dev/net/tun TAP device inside the firecracker-ctl-net pod netns
  • A private /30 per VM out of 172.18.0.0/16 (host .1, guest .2; see the sketch after this list)
  • firecracker-ctl maintains name → tap_ip:http_port map for the proxy
  • Guest gets the IP via kernel ip= boot args; no DHCP needed
  • Gluetun’s iptables rules already allow the pod netns internal subnets; VM egress flows out the WireGuard tunnel unchanged
  • Deploy → allocate TAP + IP, boot VM, wait for GET http://{tap_ip}:{http_port}/health (configurable path), mark healthy
  • Unhealthy → mark degraded; proxy still forwards; restart is manual (no auto-restart in phase 1)
  • Delete → SIGTERM the VMM process, release TAP + IP, remove from map
  • Pod restart → all persistent endpoints lost (in-memory state only). Phase 2 adds on-disk persistence + automatic redeploy.
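
The /30 arithmetic behind the allocation bullets above is small enough to sketch: a /16 contains 2^14 = 16384 /30 subnets of 4 addresses each, matching Phase 2a's slot count. Names here are illustrative, not the real Ipv4Pool API:

```rust
use std::net::Ipv4Addr;

const BASE: u32 = u32::from_be_bytes([172, 18, 0, 0]); // 172.18.0.0/16
const SLOTS: u32 = 1 << 14; // 16384 /30 subnets fit in one /16

// For slot n, return (host_ip, guest_ip): the host side takes .1 and the
// guest takes .2 within that /30 (.0 and .3 are network/broadcast).
fn slot_ips(n: u32) -> Option<(Ipv4Addr, Ipv4Addr)> {
    if n >= SLOTS {
        return None;
    }
    let net = BASE + n * 4; // each /30 spans 4 consecutive addresses
    Some((Ipv4Addr::from(net + 1), Ipv4Addr::from(net + 2)))
}
```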

A persistent-VM rootfs must:

  • Boot with ip=...::::eth0:off kernel args (kernel-side config, no DHCP; sketched after this list)
  • Run an HTTP server on the declared http_port as PID 1 (or via a wrapper that re-execs entrypoint)
  • Respond to GET /health with 200 when ready
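
A sketch of assembling those boot args, using the kernel's ip=client:server:gateway:netmask:hostname:device:autoconf format. The literal above elides the addresses, so the /30 netmask and gateway shown here are one plausible expansion, not the confirmed string:

```rust
use std::net::Ipv4Addr;

// Kernel-side network config for a /30: guest IP as client, host IP as
// gateway, no DHCP (autoconf=off). fc_entrypoint mirrors the ephemeral flow.
fn boot_args(guest: Ipv4Addr, host: Ipv4Addr, entrypoint: &str) -> String {
    format!("ip={guest}::{host}:255.255.255.252::eth0:off fc_entrypoint={entrypoint}")
}
```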

New rootfs images planned: alpine-python-web (Alpine + Python + uvicorn + a tiny FastAPI shim) and alpine-node-web (Alpine + Node + express shim). Tracked as alpine-python / alpine-node derivatives in the rootfs init job.

  • Path prefix /fc/* is staff-gated at axum-kbve (same permission as /dashboard/* routes)
  • VMs cannot reach each other — Firecracker jailer isolates per-VM netns; TAP devices sit on a bridge with inter-guest traffic dropped
  • VMs cannot reach cluster services — NetworkPolicy already denies cluster egress from the pod; Gluetun’s firewall drops anything not going through the tunnel
  • Outbound is metered by the VPN provider, not KBVE — no per-endpoint quotas in phase 1

Rollout is planned in phases:

  • Phase 1 — docs + routing skeleton: /fc/* gateway entry, axum-kbve handler, firecracker-ctl 501 stubs
  • Phase 2a — IP allocator (Ipv4Pool, IpAllocation): pure logic, thread-safe, carves /30 subnets from 172.18.0.0/16 (16384 slots). 13 unit tests.
  • Phase 2b — TAP device manager (TapManager): shells out to ip + iptables to create TAPs and install pod-level NAT/FORWARD rules. Idempotent init. Command assembly factored into pure builders with 13 more unit tests (the pattern is sketched after this phase list).
  • Phase 2c — lifecycle registry: /fc/deploy allocates IP + creates TAP + registers in DashMap; /fc/list / /fc/{name} / DELETE /fc/{name} wired end-to-end; endpoints land in pending status. VM process spawn deferred to 2e. FC_PERSISTENT_ENDPOINTS_ENABLED env gate ensures only the -net deployment carries persistent state.
  • Phase 2d — HTTP forwarder: /proxy/{name}/{*path} reverse-proxies to http://{guest_ip}:{http_port}/{path} preserving method/headers/body; streams response.
  • Phase 2e — VM lifecycle: actually spawn the Firecracker VMM with the TAP attached and ip= kernel boot args; transition pending → starting → healthy. Health-check loop marks degraded on failure.
  • Phase 3 — web-server rootfs images (alpine-python-web, alpine-node-web) + example deployments
  • Phase 4 — dashboard UI: deploy form, endpoint list, health badges, logs viewer
  • Phase 5 — on-disk state + auto-redeploy on pod restart
  • Phase 6 — per-endpoint quotas + metrics dashboard
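
To make Phase 2b's "pure builders" concrete: assemble argv vectors separately from execution so unit tests can assert on exact arguments without root or a real netns. A hedged sketch; the real TapManager's flags are not confirmed here:

```rust
// Build, don't run: callers pass these argv vectors to std::process::Command.
fn tap_create_cmd(tap: &str) -> Vec<String> {
    vec!["ip".into(), "tuntap".into(), "add".into(), tap.into(), "mode".into(), "tap".into()]
}

fn tap_addr_cmd(tap: &str, host_ip: &str) -> Vec<String> {
    vec!["ip".into(), "addr".into(), "add".into(), format!("{host_ip}/30"), "dev".into(), tap.into()]
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn builds_tap_create() {
        assert_eq!(tap_create_cmd("tap0"), ["ip", "tuntap", "add", "tap0", "mode", "tap"]);
    }
}
```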