
Firecracker CTL

Rust Axum REST API for managing Firecracker microVMs. Provides VM lifecycle management (create, poll, destroy) with per-VM timeout enforcement and in-memory state tracking via DashMap.
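
The in-memory side of this can be pictured as a DashMap keyed by VM id. A minimal sketch, with hypothetical type and field names (the real service's structs may differ):

```rust
use std::time::{Duration, Instant};
use dashmap::DashMap;

// Hypothetical status type; the real service may track more states.
enum VmStatus {
    Running,
    Completed { exit_code: i32, stdout: String, stderr: String },
    Failed(String),
}

struct VmEntry {
    status: VmStatus,
    started_at: Instant,
    timeout: Duration, // per-VM deadline enforced by a background task
}

// Shared across Axum handlers; DashMap allows concurrent access
// without a global lock.
type VmRegistry = DashMap<String, VmEntry>;

fn is_expired(entry: &VmEntry) -> bool {
    entry.started_at.elapsed() > entry.timeout
}
```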

| Method | Path | Description |
| --- | --- | --- |
| POST | /vm/create | Create and start a microVM |
| GET | /vm/{vm_id} | Get VM status |
| GET | /vm/{vm_id}/result | Get stdout/stderr/exit_code after completion |
| DELETE | /vm/{vm_id} | Force-terminate a running VM |
| GET | /vm | List all VMs |
| GET | /health | Service health check |
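
A sketch of how these routes could be wired in Axum (assuming 0.8-style `{param}` path syntax); handler bodies are stubs and all names are illustrative:

```rust
use axum::{routing::{get, post}, Router};

// Stub handlers standing in for the real ones.
async fn create_vm() -> &'static str { "create" }
async fn vm_status() -> &'static str { "status" }
async fn vm_result() -> &'static str { "result" }
async fn destroy_vm() -> &'static str { "destroy" }
async fn list_vms() -> &'static str { "list" }
async fn health() -> &'static str { "ok" }

fn router() -> Router {
    Router::new()
        .route("/vm/create", post(create_vm))
        .route("/vm/{vm_id}", get(vm_status).delete(destroy_vm))
        .route("/vm/{vm_id}/result", get(vm_result))
        .route("/vm", get(list_vms))
        .route("/health", get(health))
}
```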
Code injection flow for ephemeral VMs:

  1. User code arrives via env.CODE in the create request
  2. Code is written to a raw block file (512-byte padded; sketched after this list)
  3. Block file attached as second Firecracker drive (/dev/vdb)
  4. Entrypoint passed via boot_args (fc_entrypoint=/usr/bin/python3)
  5. VM init script reads code from /dev/vdb, writes to /tmp/code, execs entrypoint
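
A sketch of step 2: padding the blob to a whole number of 512-byte sectors so Firecracker can attach it as a raw drive. How the guest init script recovers the exact length (trailing-zero trim or a length header) is not specified here:

```rust
use std::fs;
use std::io;

const SECTOR: usize = 512;

// Pad user code to a 512-byte boundary and write the raw file that
// Firecracker attaches as the second drive (/dev/vdb in the guest).
fn write_code_drive(path: &str, code: &[u8]) -> io::Result<()> {
    let padded_len = code.len().div_ceil(SECTOR) * SECTOR;
    let mut buf = vec![0u8; padded_len];
    buf[..code.len()].copy_from_slice(code);
    fs::write(path, &buf)
}
```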

Pre-built ext4 root filesystems, generated in-cluster by an ArgoCD PostSync hook:

| Image | Size | Contents |
| --- | --- | --- |
| alpine-minimal | 32 MB | Alpine + busybox |
| alpine-python | 128 MB | Alpine + Python 3.12 |
| alpine-node | 128 MB | Alpine + Node.js |

All manifests in apps/kube/firecracker/manifests/:

  • Deployment — firecracker-ctl with /dev/kvm device plugin, kvm=true node selector
  • Service — ClusterIP on port 9001
  • PVC — 2Gi Longhorn volume for rootfs + vmlinux kernel
  • NetworkPolicy — Ingress from edge-runtime + dashboard proxy only
  • KEDA ScaledObject — minReplicas=1, cron scales to 2 during peak hours
  • Rootfs Init Job — ArgoCD PostSync hook, builds ext4 images in-cluster

Firecracker runs in two parallel deployments in the same namespace, chosen by trust level:

| Deployment | Network | Access | Use Case |
| --- | --- | --- | --- |
| firecracker-ctl | None (MMDS only) | Edge functions + dashboard | Ephemeral, untrusted code execution |
| firecracker-ctl-net | Gluetun/WireGuard VPN | Dashboard only (staff) | Persistent endpoints with outbound internet |

firecracker-ctl-net shares its network namespace with a Gluetun sidecar — all VM traffic exits through a WireGuard tunnel, so user code in VMs never sees the raw host network. This is the only ecosystem that hosts persistent endpoints.

Persistent endpoints are long-lived VMs that expose HTTP servers, routed by name from the public edge. Think Fly.io Machines or Cloud Run — deploy once and the VM stays up until explicitly stopped. They live in the networked ecosystem only.

```
Browser → kbve.com/fc/{name}/*
  → kbve-gateway (path prefix /fc/)
  → kbve-service:4321 (axum-kbve)
  → /fc/{name}/{*path} handler (staff auth)
  → firecracker-ctl-net.firecracker.svc:9001/proxy/{name}/{*path}
  → VM tap IP:{http_port}/{*path}
  → response
```

Two hops of reverse proxy: axum-kbve authenticates + scopes to staff, firecracker-ctl-net does name→VM lookup and forwards over TAP.
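
A sketch of the second hop, assuming a reqwest-based forwarder and a hypothetical lookup helper; the real Phase 2d handler also preserves method/headers and streams the body, which this GET-only sketch does not:

```rust
use axum::{
    body::Body,
    extract::Path,
    http::StatusCode,
    response::Response,
};

// Hypothetical lookup into the name -> (tap_ip, http_port) map.
fn lookup(name: &str) -> Option<(String, u16)> {
    todo!("DashMap lookup")
}

async fn proxy(
    Path((name, path)): Path<(String, String)>,
) -> Result<Response, StatusCode> {
    let (ip, port) = lookup(&name).ok_or(StatusCode::NOT_FOUND)?;
    let upstream = format!("http://{ip}:{port}/{path}");
    let resp = reqwest::get(&upstream).await.map_err(|_| StatusCode::BAD_GATEWAY)?;
    let status = StatusCode::from_u16(resp.status().as_u16())
        .unwrap_or(StatusCode::BAD_GATEWAY);
    let bytes = resp.bytes().await.map_err(|_| StatusCode::BAD_GATEWAY)?;
    Ok(Response::builder().status(status).body(Body::from(bytes)).unwrap())
}
```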

| Method | Path | Description |
| --- | --- | --- |
| POST | /fc/deploy | Deploy a persistent VM with a unique name + http_port |
| GET | /fc/list | List all persistent endpoints with status |
| GET | /fc/{name} | Get a single endpoint’s metadata + health |
| DELETE | /fc/{name} | Stop and destroy a persistent endpoint |
| ANY | /proxy/{name}/{*path} | Reverse-proxy HTTP into the named VM |

Deploy payload includes rootfs image, entrypoint, optional code blob (raw drive), env vars (via MMDS), http_port, vcpu/memory. No timeout field — persistent VMs live until DELETE.
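
Deserialized, that payload could look like the following serde struct; field names are inferred from the description above, not confirmed:

```rust
use serde::Deserialize;
use std::collections::HashMap;

// Hypothetical request body for POST /fc/deploy.
#[derive(Deserialize)]
struct DeployRequest {
    name: String,                         // unique routing name
    rootfs: String,                       // e.g. "alpine-python-web"
    entrypoint: String,
    code: Option<String>,                 // optional blob, attached as a raw drive
    env: Option<HashMap<String, String>>, // delivered to the guest via MMDS
    http_port: u16,                       // guest port for health check + proxy
    vcpu: Option<u8>,
    memory_mb: Option<u32>,
    // No timeout field: persistent VMs live until DELETE /fc/{name}.
}
```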

Persistent endpoints require TAP networking (deferred in Phase 1 for ephemeral VMs — MMDS was enough for outbound-only workloads). Inbound HTTP to a specific guest IP:port is not reachable via MMDS.

  • Each persistent VM gets a /dev/net/tun TAP device inside the firecracker-ctl-net pod netns
  • A private /30 per VM out of 172.18.0.0/16 (host .1, guest .2; see the sketch after this list)
  • firecracker-ctl maintains name → tap_ip:http_port map for the proxy
  • Guest gets the IP via kernel ip= boot args; no DHCP needed
  • Gluetun’s iptables rules already allow the pod netns internal subnets; VM egress flows out the WireGuard tunnel unchanged
  • Deploy → allocate TAP + IP, boot VM, wait for GET http://{tap_ip}:{http_port}/health (configurable path), mark healthy
  • Unhealthy → mark degraded; proxy still forwards; restart is manual (no auto-restart in phase 1)
  • Delete → SIGTERM the VMM process, release TAP + IP, remove from map
  • Pod restart → all persistent endpoints lost (in-memory state only). Phase 2 adds on-disk persistence + automatic redeploy.
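
The /30 arithmetic behind the allocation bullets above is small enough to sketch: a /16 contains 2^14 = 16384 /30 subnets of 4 addresses each, matching Phase 2a's slot count. Names here are illustrative, not the real Ipv4Pool API:

```rust
use std::net::Ipv4Addr;

const BASE: u32 = u32::from_be_bytes([172, 18, 0, 0]); // 172.18.0.0/16
const SLOTS: u32 = 1 << 14; // 16384 /30 subnets fit in one /16

// For slot n, return (host_ip, guest_ip): the host side takes .1 and the
// guest takes .2 within that /30 (.0 and .3 are network/broadcast).
fn slot_ips(n: u32) -> Option<(Ipv4Addr, Ipv4Addr)> {
    if n >= SLOTS {
        return None;
    }
    let net = BASE + n * 4; // each /30 spans 4 consecutive addresses
    Some((Ipv4Addr::from(net + 1), Ipv4Addr::from(net + 2)))
}
```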

A persistent-VM rootfs must:

  • Boot with ip=...::::eth0:off kernel args (kernel-side config, no DHCP; sketched after this list)
  • Run an HTTP server on the declared http_port as PID 1 (or via a wrapper that re-execs entrypoint)
  • Respond to GET /health with 200 when ready
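
A sketch of assembling those boot args, using the kernel's ip=client:server:gateway:netmask:hostname:device:autoconf format. The literal above elides the addresses, so the /30 netmask and gateway shown here are one plausible expansion, not the confirmed string:

```rust
use std::net::Ipv4Addr;

// Kernel-side network config for a /30: guest IP as client, host IP as
// gateway, no DHCP (autoconf=off). fc_entrypoint mirrors the ephemeral flow.
fn boot_args(guest: Ipv4Addr, host: Ipv4Addr, entrypoint: &str) -> String {
    format!("ip={guest}::{host}:255.255.255.252::eth0:off fc_entrypoint={entrypoint}")
}
```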

New rootfs images planned: alpine-python-web (Alpine + Python + uvicorn + a tiny FastAPI shim) and alpine-node-web (Alpine + Node + express shim). Tracked as alpine-python / alpine-node derivatives in the rootfs init job.

  • Path prefix /fc/* is staff-gated at axum-kbve (same permission as /dashboard/* routes)
  • VMs cannot reach each other — Firecracker jailer isolates per-VM netns; TAP devices sit on a bridge with inter-guest traffic dropped
  • VMs cannot reach cluster services — NetworkPolicy already denies cluster egress from the pod; Gluetun’s firewall drops anything not going through the tunnel
  • Outbound is metered by the VPN provider, not KBVE — no per-endpoint quotas in phase 1

Rollout is planned in phases:

  • Phase 1 — docs + routing skeleton: /fc/* gateway entry, axum-kbve handler, firecracker-ctl 501 stubs
  • Phase 2a — IP allocator (Ipv4Pool, IpAllocation): pure logic, thread-safe, carves /30 subnets from 172.18.0.0/16 (16384 slots). 13 unit tests.
  • Phase 2b — TAP device manager (TapManager): shells out to ip + iptables to create TAPs and install pod-level NAT/FORWARD rules. Idempotent init. Command assembly factored into pure builders with 13 more unit tests (the pattern is sketched after this phase list).
  • Phase 2c — lifecycle registry: /fc/deploy allocates IP + creates TAP + registers in DashMap; /fc/list / /fc/{name} / DELETE /fc/{name} wired end-to-end; endpoints land in pending status. VM process spawn deferred to 2e. FC_PERSISTENT_ENDPOINTS_ENABLED env gate ensures only the -net deployment carries persistent state.
  • Phase 2d — HTTP forwarder: /proxy/{name}/{*path} reverse-proxies to http://{guest_ip}:{http_port}/{path} preserving method/headers/body; streams response.
  • Phase 2e — VM lifecycle: actually spawn the Firecracker VMM with the TAP attached and ip= kernel boot args; transition pending → starting → healthy. Health-check loop marks degraded on failure.
  • Phase 3 — web-server rootfs images (alpine-python-web, alpine-node-web) + example deployments
  • Phase 4 — dashboard UI: deploy form, endpoint list, health badges, logs viewer
  • Phase 5 — on-disk state + auto-redeploy on pod restart
  • Phase 6 — per-endpoint quotas + metrics dashboard
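
To make Phase 2b's "pure builders" concrete: assemble argv vectors separately from execution so unit tests can assert on exact arguments without root or a real netns. A hedged sketch; the real TapManager's flags are not confirmed here:

```rust
// Build, don't run: callers pass these argv vectors to std::process::Command.
fn tap_create_cmd(tap: &str) -> Vec<String> {
    vec!["ip".into(), "tuntap".into(), "add".into(), tap.into(), "mode".into(), "tap".into()]
}

fn tap_addr_cmd(tap: &str, host_ip: &str) -> Vec<String> {
    vec!["ip".into(), "addr".into(), "add".into(), format!("{host_ip}/30"), "dev".into(), tap.into()]
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn builds_tap_create() {
        assert_eq!(tap_create_cmd("tap0"), ["ip", "tuntap", "add", "tap0", "mode", "tap"]);
    }
}
```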