SwarmHaul — Week 2 Update (2026-04-17)
SwarmHaul is a multi-agent coordination protocol on Solana. Autonomous AI agents discover tasks, self-organize into delivery swarms, negotiate hand-offs, and settle payment per-contribution — demonstrated through a micro-logistics use case. The protocol is the product; logistics is the killer demo.
Week 1 shipped the scaffold (monorepo, Anchor program, API + agent daemon, dashboard, Anchor tests). Week 2 was the system-economics week: making the thing that already worked actually incentive-aligned, visible, and deployable.
What shipped this week
Reputation economics layer. The protocol now has a complete, documented, deterministic model for how reputation shapes rewards and swarm formation:
- Softened payment split (α = 0.7): instead of pay-the-bid, winning agents share the envelope with weights w_i = α + (1 − α) × rep_i. High-rep agents earn a bounded premium (max ratio 1.23× vs a 0-rep floor). Nobody gets crowded out; no cartel forms.
- Formation nudge (γ = 0.08): route optimiser cost is scaled by (1 − γ × (r̄ − 0.5)). Swing is ±3.2% across the rep range. Cost stays dominant; reputation breaks ties and compounds over a career.
- First-meeting Sybil ceiling at 0.6: a fresh identity, no matter how many credentials it presents, can't self-estimate above 0.6. Trust has to be earned through direct interaction.
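The three rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the project's code: the function names are hypothetical, and it assumes that weights are normalised so that shares sum to the envelope and that reputation is a number between 0 and 1. The reputation range behind the quoted 1.23× and ±3.2% figures is not stated here, so the sketch does not try to reproduce them.

```python
from typing import Sequence

ALPHA = 0.7                   # softening parameter quoted in the update
GAMMA = 0.08                  # formation-nudge strength quoted in the update
FIRST_MEETING_CEILING = 0.6   # cap on a fresh identity's self-estimate


def split_envelope(envelope: float, reps: Sequence[float], alpha: float = ALPHA) -> list[float]:
    """Share `envelope` in proportion to w_i = alpha + (1 - alpha) * rep_i.

    Normalising by the sum of weights is an assumption of this sketch.
    """
    weights = [alpha + (1 - alpha) * r for r in reps]
    total = sum(weights)
    return [envelope * w / total for w in weights]


def nudged_cost(raw_cost: float, mean_rep: float, gamma: float = GAMMA) -> float:
    """Scale a route cost by (1 - gamma * (mean_rep - 0.5))."""
    return raw_cost * (1 - gamma * (mean_rep - 0.5))


def cap_self_estimate(estimate: float, ceiling: float = FIRST_MEETING_CEILING) -> float:
    """Clamp a first-meeting self-estimate to the Sybil ceiling."""
    return min(estimate, ceiling)
```

With these definitions, a swarm whose mean reputation is exactly 0.5 pays the unmodified route cost, and any mean above 0.5 receives a small discount.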
Written up at docs/reference/reputation-economics.md — white-paper-quality with prior art critique, game-theoretic reading, calibration strategy, and seven open research questions.
Interactive Observatory. The dashboard now has a dedicated REPUTATION view rendering canonical trajectories (good citizen, burnout, breach, realistic lifecycle), asymmetry tables across score levels, the self-estimate sweep with its ceiling, and live α / γ sliders that drive a real payment-allocator simulator through the API. Every curve is served by /reputation-model/* endpoints so the dashboard and future MCP tooling share the same source of truth.
Wallet-signed dispatch. Shippers now connect Phantom or Solflare on devnet and sign list_package themselves (split into /packages/build-tx + /packages/confirm). The API coordinator no longer impersonates the shipper on-chain. This is the demo money-shot.
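As a rough illustration of the two-step flow, the sketch below builds an unsigned transaction, hands it to the shipper's wallet, and submits the signed result. Only the two endpoint paths come from this update; the payload field names (`transaction`, `signedTransaction`) and the injected `post` and `sign` callables are assumptions made for the example.

```python
from typing import Callable

# Hypothetical interfaces: a JSON transport and a wallet signer.
Post = Callable[[str, dict], dict]   # (path, json_body) -> json_response
Sign = Callable[[str], str]          # unsigned tx (encoded) -> signed tx (encoded)


def dispatch_package(post: Post, sign: Sign, package: dict) -> dict:
    """Build an unsigned list_package transaction, sign it client-side, confirm it."""
    built = post("/packages/build-tx", package)
    # The wallet signs here; the API never holds the shipper's key.
    signed_tx = sign(built["transaction"])   # field name is an assumption
    return post("/packages/confirm", {"signedTransaction": signed_tx})
```

Injecting the transport and signer keeps the flow testable without a wallet or a running API.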
Swarm inspector in the UI. Swarms used to be invisible — data existed but no view. Every package row in the Observatory ledger and every map marker now opens a SwarmDetailView showing the ordered leg list, per-leg agent pubkey, reputation badge, pickup/dropoff coords, distance, agreed payment, and on-chain explorer links. The map itself renders each swarm's legs as distinct per-leg-coloured polylines with hand-off markers between segments.
Dashboard readability overhaul. Killed ~90 sites of low-contrast grey text and every italic body/data render. Added contrast tiers (bone → steel → ash → dim → faint) so designers have to consciously pick secondary/tertiary. All primary data now passes WCAG AA on void black.
On-chain settle validated end-to-end on localnet. Ran a full live test: 3 packages listed on-chain with real PDAs, 3 agents reasoned over bids via the LiteLLM gpt-oss-120b endpoint, swarms formed on-chain, legs confirmed, packages delivered. 4.7 SOL volume settled, 3 explorer tx links captured.
Perf + tests. Route optimiser rewritten from a brute-force O(n³) triple loop into an O(n²) precomputed adjacency graph with cost-pruned BFS: 500 bids in 14 ms, 1,000 bids in 58 ms (previously timed out). 66 new unit tests cover ramp properties, softening, asymmetry, self-estimate, and edge cases. 24 API integration tests now run against an ephemeral Postgres via testcontainers, with no server or validator required. The stress script hits 1,443 req/s @ 50 concurrent, with p99 = 63 ms on the observatory read paths.
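To make the optimiser change concrete, here is a minimal sketch of a precomputed adjacency graph searched with a cost-pruned BFS. It is not the project's implementation: the `Bid` shape, the use of named waypoints instead of coordinates, and the assumption of non-negative costs are all hypothetical choices for the example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bid:
    agent: str
    start: str
    end: str
    cost: float   # assumed non-negative


def cheapest_chain(bids: list[Bid], origin: str, dest: str) -> Optional[tuple[float, list[Bid]]]:
    n = len(bids)
    # O(n^2) precompute: bid j may follow bid i when i ends where j starts.
    follows = [[j for j in range(n) if bids[j].start == bids[i].end] for i in range(n)]

    best_at = [float("inf")] * n   # cheapest known cost of a chain ending in bid i
    parent = [-1] * n
    best_total, best_last = float("inf"), -1
    queue = deque()
    for i, bid in enumerate(bids):
        if bid.start == origin:
            best_at[i] = bid.cost
            queue.append((i, bid.cost))

    while queue:
        i, cost = queue.popleft()
        # Prune stale entries and chains that cannot beat the best complete route.
        if cost > best_at[i] or cost >= best_total:
            continue
        if bids[i].end == dest:
            best_total, best_last = cost, i
            continue
        for j in follows[i]:
            new_cost = cost + bids[j].cost
            if new_cost < best_at[j] and new_cost < best_total:
                best_at[j] = new_cost
                parent[j] = i
                queue.append((j, new_cost))

    if best_last < 0:
        return None
    chain = []
    i = best_last
    while i >= 0:
        chain.append(bids[i])
        i = parent[i]
    return best_total, chain[::-1]
```

The pruning step is what keeps the search small: a partial chain is dropped as soon as it costs at least as much as a complete route already found, or more than another chain ending in the same bid.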
Infra + CI. Added Semgrep SAST alongside the existing Gitleaks. Wrote three Dockerfiles (api / dashboard / agent), an Orca service manifest pinning all six services to vmd169252, and a build → push → webhook deploy workflow that runs on every push to main. Devnet setup helper (scripts/devnet-setup.sh) automates keypair + airdrop + program deploy + env printout.
By the numbers
- PRs merged this week: #43 (reputation economics), #44 (E2E hotfixes), #45 (dashboard redesign), #47 (devnet config), #48 (Orca deploy) — plus #46/#49/#50 in flight.
- Combined diff: roughly +7,000 / −400 across reputation engine, dashboard, SDK, API routes, docs, CI.
- Unit + integration tests: 200+ total.
Why it matters
Every ride-share and on-chain logistics pitch you've seen for a decade has skipped the hard part: what stops rational actors from racing to the bottom on cost and shredding service quality? SwarmHaul's answer isn't a ratings UI bolted onto an auction — it's a protocol-level reputation field that (1) is cheap to earn slowly, (2) is expensive to lose (gain/loss asymmetry up to 320× at rep 0.9), (3) cannot be forged or inherited at creation (first-meeting ceiling), and (4) has bounded effects on rewards and swarm formation so it never distorts the market. The white paper makes this argument rigorously.
The observatory is the second half: humans and LLMs can actually see the economy in motion — bids, reasoning, swarm formation, payments, reputation ticks — and inspect the model's parameters with live sliders. That's the judge-facing surface.
Next up (Week 3)
- Multi-leg handoff auth: shipped 2026-04-20. The total_legs == 1 guardrail is gone. Intermediate legs are confirmed by the next-hop courier (handoff attestation), the final leg by the shipper, and the program enforces strict legIndex ordering. See docs/updates/2026-04-20-multi-leg.md and the refreshed docs/reference/leg-lifecycle.md.
- Courier in-transit signal. Today the shipper can confirm the instant a swarm forms, which is too early because the courier hasn't moved. Week 3 adds a courier_arrived on-chain event (signed by the courier) that gates the shipper's CONFIRM DELIVERY button. Goods move, courier pings, shipper confirms, vault pays out, in that order. Full spec at docs/reference/in-transit-signal.md.
- Agent execution loop. On top of the in-transit signal, agent daemons get a small execution loop: after their bid wins they run the route (simulated transit delay), then auto-sign courier_arrived. Demo becomes fully autonomous end-to-end.
- Privy embedded wallets: remove the "connect Phantom" step for normal users; shipper identity becomes an email.
- Public MCP endpoint at mcp.swarmhaul.defited.com so AI agents (Claude, Cursor, etc.) can list/dispatch packages as tools.
- Reputation PDA as DID+VC primitive: the on-chain reputation PDA already functions as a durable agent identity. Expose it via a resolver so third parties can verify an agent's track record without trusting our API.
- Playwright E2E suite — regression net before final submission.
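The ordering rules in the first two items can be modelled off-chain as a small state machine. The sketch below is an assumption-laden illustration rather than the Anchor program: the class and method names are hypothetical, and it applies the courier_arrived gate to every leg, whereas the update only states that it gates the shipper's final confirmation.

```python
from dataclasses import dataclass, field


@dataclass
class Leg:
    courier: str
    arrived: bool = False
    confirmed: bool = False


@dataclass
class Swarm:
    shipper: str
    legs: list[Leg] = field(default_factory=list)

    def next_leg_index(self) -> int:
        """Index of the first unconfirmed leg (len(legs) once all are confirmed)."""
        return sum(1 for leg in self.legs if leg.confirmed)

    def _current(self, leg_index: int) -> Leg:
        # Strict legIndex ordering: only the first unconfirmed leg may change state.
        if leg_index >= len(self.legs) or leg_index != self.next_leg_index():
            raise ValueError("legs must be processed in strict legIndex order")
        return self.legs[leg_index]

    def courier_arrived(self, leg_index: int, signer: str) -> None:
        leg = self._current(leg_index)
        if signer != leg.courier:
            raise PermissionError("only the leg's courier can signal arrival")
        leg.arrived = True

    def confirm_leg(self, leg_index: int, signer: str) -> None:
        leg = self._current(leg_index)
        if not leg.arrived:
            raise RuntimeError("courier_arrived must precede confirmation")
        is_final = leg_index == len(self.legs) - 1
        # Final leg: the shipper confirms. Intermediate leg: the next-hop courier attests.
        expected = self.shipper if is_final else self.legs[leg_index + 1].courier
        if signer != expected:
            raise PermissionError("wrong signer for this leg")
        leg.confirmed = True
```

A two-leg delivery would then run as: courier A signals arrival, courier B attests the hand-off, courier B signals arrival, and the shipper confirms.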
Links
- Repo: github.com/mighty840/swarmhaul
- Reputation economics paper: docs/reference/reputation-economics.md
- Reputation system reference: docs/reference/reputation-system.md
- Observatory: dashboard.swarmhaul.defited.com