# SynapServe — The HTTP Server Built for the AI Agent Era

> Zero-allocation web infrastructure for AI agents. Built in Rust with io_uring. When every millisecond costs tokens, SynapServe delivers.

## Company

- Name: SynapServe
- Website: https://synapserve.io
- Contact: hello@synapserve.io
- Investment: invest@synapserve.io
- Founded: 2025
- Location: Europe
- Stage: Seeking seed investment
- Status: Working product (HTTP/1.1 parser, io_uring I/O layer, server framework, TLS 1.3 with kTLS, reverse proxy with upstream load balancing — built and benchmarked)

---

## The Problem

The internet wasn't built for machines talking to machines. When an AI agent receives a task, it can send thousands of HTTP requests in a single second — hitting APIs, scraping data, coordinating with other agents. To today's web servers, this looks indistinguishable from a DDoS attack.

Nginx, Apache, and even modern Rust frameworks like Hyper weren't designed for this. They allocate memory on every request, box every async future, and rate-limit based on human browsing patterns. The result: dropped connections, false-positive blocks, and wasted compute.
### Key Statistics

- 51% of all internet traffic is now automated / bot traffic (Source: [Imperva Bad Bot Report 2025](https://www.imperva.com/blog/2025-imperva-bad-bot-report-how-ai-is-supercharging-the-bot-threat/))
- 1,300% growth in AI agent HTTP traffic between January and August 2025 (Source: [HUMAN Security](https://www.humansecurity.com/learn/blog/ai-agent-statistics-agentic-commerce/))
- 87% of agent page views are product pages (Source: [HUMAN Security](https://www.humansecurity.com/learn/blog/ai-agent-statistics-agentic-commerce/))
- 39,000 requests per minute from a single fetcher bot instance (Source: [Fastly Threat Insights Report 2025](https://www.fastly.com/threat-insights))
- 42% of heap consumed by per-connection I/O buffer allocations (Source: [hyper #1790](https://github.com/hyperium/hyper/issues/1790))

### What Happens Inside Today's HTTP Servers (Per Request)

Every single request triggers this allocation cascade:

1. `BytesMut::reserve()` — heap allocation for the read buffer
2. `String::from(method)` — heap allocation for the HTTP method
3. `Uri::from_parts()` — heap allocation for the URI
4. `HeaderMap::new()` — heap allocation for headers
5. `Box::pin(future)` — heap allocation per async task
6. `Arc::new(body)` — heap allocation + atomic reference counting
7. `tower::Layer::call()` — heap allocation per middleware layer

Result under sustained load: memory grows to ~1.1GB at 16K concurrent connections and never returns (see actix-web [#1946](https://github.com/actix/actix-web/issues/1946), [#1780](https://github.com/actix/actix-web/issues/1780)), p99.9 latency spikes, and legitimate AI agents get mistakenly rate-limited.

---

## The Solution

SynapServe is HTTP infrastructure redesigned from byte zero for AI agents. It is not a patch on existing servers — it's a ground-up rethinking of how HTTP should work when the majority of clients are autonomous software agents.

### Three Pillars

#### 1. Zero-Allocation Hot Path

Every parsed request lives on the stack.
Span-based parsing replaces string copies. io_uring provided buffers mean the kernel manages memory — not your allocator. The allocator pressure that causes latency spikes in conventional servers doesn't exist.

- 0 allocations per request (verified by a counting allocator)

#### 2. Agent-Native Protocols

Built-in support for:

- IETF Web Bot Auth (Signature-Agent headers)
- Anthropic MCP Streamable HTTP (Model Context Protocol)
- Google A2A discovery (Agent-to-Agent)
- SSE token streaming for LLM inference

Agents aren't second-class citizens — they're the primary design target.

#### 3. Predictable Tail Latency

Thread-per-core architecture with io_uring eliminates cross-thread contention. No `Arc`, no `Mutex`, no `Send` bounds. Each core owns its connections, buffers, and allocator.

- <1ms p99.9 latency at 80% of max throughput

---

## Technology Deep Dive

### Architecture (Request Flow)

1. Incoming request (AI agent, 39K req/min)
2. io_uring multishot accept — 1 syscall for N connections
3. Kernel provides buffer from ring — zero-copy
4. TLS 1.3 handshake (if HTTPS listener) — rustls in userspace, then kTLS offload to kernel
   - Session keys installed via `setsockopt(SOL_TLS)` — kernel handles encrypt/decrypt
   - SEND_ZC and SPLICE remain zero-copy through the TLS layer
   - TLS 1.3 early data support (0-RTT)
   - Parallel HTTP + HTTPS listeners on the same server instance
5. synapserve-http-parser: zero-allocation span parser
   - `Span { off: u16, len: u16 }` on the stack — 4-byte views, no copies
   - `[Header; 64]` stack array — 640 bytes, no heap
   - SIMD-accelerated scanning — AVX2/SSE4.2/NEON with runtime detection
6. `handler(&req, &mut writer)` — direct buffer write
   - Agent identification via Signature-Agent / User-Agent
   - Adaptive backpressure using Little's Law (L = λW)
7. Reverse proxy path (if `proxy_pass` configured)
   - Upstream connection pool: keepalive reuse, per-worker, zero locks
   - Load balancing: weighted round-robin / least_conn / ip_hash
   - Health tracking: max_fails, fail_timeout, effective_weight recovery
   - Retry on next upstream with tried-peer bitmap
   - Splice relay: upstream fd → pipe → client fd (zero-copy)
8. io_uring SEND_ZC + SPLICE — zero-copy kernel to socket
   - Headers + body in one flight, no userspace copy

### Reverse Proxy & Upstream Load Balancing

SynapServe includes a production-grade reverse proxy built on the same zero-allocation, io_uring architecture:

- **Upstream Connection Pooling**: Per-worker keepalive pools with LRU reuse. Connections are returned after validating age, request count, and error state. Eliminates connect() latency for repeat requests. Zero locks — each worker owns its pool (`thread_local`).
- **Load Balancing**: Weighted smooth round-robin (nginx-compatible algorithm), least connections (weight-normalized cross-multiply comparison), and IP hash. Per-server weight configuration with smooth distribution (no bursting).
- **Peer Health Tracking**: Automatic failure detection with configurable max_fails and fail_timeout. Failed servers are skipped during selection and probed after the timeout expires. Effective-weight recovery provides gradual ramp-up after failures.
- **Retry on Next Upstream**: Configurable retry conditions (error, timeout, HTTP 500/502/503/504/403/404/429). A tried-peer u64 bitmap prevents re-selection of failed servers. Non-idempotent method safety (POST/PUT not retried after the request was sent). Configurable max tries and timeout.
- **Backup Servers**: Two-tier peer selection — primary servers are tried first; backup servers activate only when all primaries are down or at their connection limit.
- **Zero-Copy Splice Relay**: Large response bodies relayed via kernel splice (upstream fd → pipe → client fd), bypassing userspace entirely.
- **Request Body Forwarding**: POST/PUT request bodies forwarded to the upstream with Content-Length preservation.
- **DNS Re-resolution**: Background resolver thread with TTL-based cache. Upstream addresses are re-resolved automatically without blocking the event loop.
- **TCP Optimizations**: The MSG_MORE flag coalesces small writes into larger TCP segments during header/body assembly.
- **Max Connections Enforcement**: Per-server connection limits prevent backend overload. Returns 503 when all servers are at capacity.

### Technology Stack

- Rust: Memory safety without garbage collection. The borrow checker enforces zero-copy at compile time. No runtime, no VM, no overhead.
- io_uring: Linux kernel ≥6.1. Multishot accept, provided buffer rings, zero-copy send. One submission queue per core — minimal syscall overhead.
- kTLS: TLS 1.3 handshake via rustls in userspace, then session keys installed in the kernel via `setsockopt(SOL_TLS)`. The kernel handles encrypt/decrypt transparently — SEND_ZC and SPLICE remain zero-copy through the TLS layer. Supports parallel HTTP + HTTPS listeners and TLS 1.3 early data (0-RTT).
- Thread-per-core: Shared-nothing architecture. Each core owns its connections, buffers, and allocator. No locks, no contention, no false sharing. Linear scaling.
- SIMD parsing: AVX2/SSE4.2/NEON-accelerated header scanning with runtime detection. 16-32 bytes checked per cycle. Combined with span output — parse results never leave registers.

### Performance (Benchmarked)

#### HTTP Parser Benchmarks (synapserve-http-parser vs httparse 1.10, head-to-head, Criterion, single core, x86_64 AVX2)

Compared against httparse (Rust, used by hyper/axum/actix-web). Same machine, same inputs, same Criterion config. Intel Core i7-8550U, Linux x86_64, `-C target-cpu=native`.

Key difference: httparse only tokenizes. synapserve-http-parser additionally extracts content_length/chunked/keep_alive and builds an O(1) known-header index during the same pass — and is still faster.
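To make the span approach concrete, here is a minimal sketch of span-based request-line parsing. The `Span` struct mirrors the shape described above; `parse_request_line` is an illustrative simplification, not SynapServe's actual (SIMD-accelerated) parser:

```rust
// Span-based parsing sketch: the parser records (offset, length) views into
// the request buffer instead of copying bytes to the heap. `Span` and
// `parse_request_line` are illustrative, not SynapServe's real API.

#[derive(Clone, Copy, Debug, PartialEq)]
struct Span {
    off: u16, // byte offset into the original buffer
    len: u16, // length of the view
}

impl Span {
    /// Materialize the view by reslicing the original buffer — no copy.
    fn slice<'a>(&self, buf: &'a [u8]) -> &'a [u8] {
        &buf[self.off as usize..self.off as usize + self.len as usize]
    }
}

/// Parse "METHOD SP URI SP VERSION\r\n" into three spans without allocating.
fn parse_request_line(buf: &[u8]) -> Option<(Span, Span, Span)> {
    let line_end = buf.windows(2).position(|w| w == b"\r\n")?;
    let line = &buf[..line_end];
    let sp1 = line.iter().position(|&b| b == b' ')?;
    let sp2 = sp1 + 1 + line[sp1 + 1..].iter().position(|&b| b == b' ')?;
    let span = |off: usize, len: usize| Span { off: off as u16, len: len as u16 };
    Some((
        span(0, sp1),                          // method
        span(sp1 + 1, sp2 - sp1 - 1),          // uri
        span(sp2 + 1, line_end - sp2 - 1),     // version
    ))
}

fn main() {
    let req = b"GET /v1/agents HTTP/1.1\r\nHost: example.com\r\n\r\n";
    let (method, uri, version) = parse_request_line(req).unwrap();
    // All three results are 4-byte stack structs pointing into `req`.
    assert_eq!(method.slice(req), b"GET");
    assert_eq!(uri.slice(req), b"/v1/agents");
    assert_eq!(version.slice(req), b"HTTP/1.1");
}
```

The payoff is that nothing escapes the input buffer: resetting the parser between requests means zeroing a few small structs rather than freeing heap objects.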
**Raw tokenization (synapserve does more work, is still faster):**

| Request Size | synapserve-http-parser | httparse 1.10 | Ratio |
|--------------|------------------------|---------------|-------|
| Small (35B, 1 hdr) | 42 ns / 803 MiB/s | 52 ns / 641 MiB/s | synapserve 1.25x faster |
| Medium (368B, 9 hdrs) | 200 ns / 1.71 GiB/s | 230 ns / 1.49 GiB/s | synapserve 1.15x faster |
| Large (733B, 20 hdrs) | 420 ns / 1.63 GiB/s | 466 ns / 1.47 GiB/s | synapserve 1.11x faster |

**With semantic extraction (apples-to-apples — httparse + content_length/chunked/keep_alive extraction):**

| Request Size | synapserve-http-parser | httparse + extract | Ratio |
|--------------|------------------------|--------------------|-------|
| Medium (368B) | 220 ns / 4.5M req/s | 304 ns / 3.3M req/s | synapserve 1.38x faster |
| Large (733B) | 413 ns / 2.4M req/s | 604 ns / 1.7M req/s | synapserve 1.46x faster |

**Header access (O(1) vs O(n)):**

| Header | synapserve (O(1)) | httparse (O(n) scan) | Speedup |
|--------|-------------------|----------------------|---------|
| Content-Type (pos 2/9) | 0.65 ns | 20.5 ns | 32x |
| Content-Length (pos 9/9) | 0.59 ns | 22.0 ns | 37x |
| X-Request-Id (pos 6/9) | 0.67 ns | 23.1 ns | 34x |

Summary: synapserve is faster at all request sizes (1.11-1.25x) despite doing more work per parse. With equal semantic extraction, synapserve is 1.38-1.46x faster on requests and 1.38-1.40x faster on responses. Header lookup is 32-37x faster (O(1) vs O(n)). Custom AVX2/SSE4.2 SIMD scanning with runtime detection on x86_64, NEON on ARM64. Parse throughput: 1.71 GiB/s on realistic agent requests, single core. Per-request reset: 29 bytes (synapserve) vs 2,048 bytes (httparse).
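The O(1) header lookups measured above come from indexing well-known headers once during the parse pass, so each later access is a single array read rather than a scan. A minimal sketch of the idea (the names `KnownHeader` and `HeaderIndex` are hypothetical simplifications):

```rust
// O(1) known-header index sketch: well-known headers get fixed slots filled
// during parsing; lookup is one array read. Illustrative only — not
// SynapServe's actual data structure.

#[derive(Clone, Copy)]
enum KnownHeader {
    ContentType = 0,
    ContentLength = 1,
    XRequestId = 2,
}

const KNOWN: usize = 3;

/// Each slot holds a (offset, length) value span into the request buffer.
struct HeaderIndex {
    slots: [Option<(u16, u16)>; KNOWN],
}

impl HeaderIndex {
    fn new() -> Self {
        Self { slots: [None; KNOWN] }
    }

    /// Called once per header during the parse pass (case-insensitive match).
    fn record(&mut self, name: &[u8], off: u16, len: u16) {
        let slot = if name.eq_ignore_ascii_case(b"content-type") {
            KnownHeader::ContentType as usize
        } else if name.eq_ignore_ascii_case(b"content-length") {
            KnownHeader::ContentLength as usize
        } else if name.eq_ignore_ascii_case(b"x-request-id") {
            KnownHeader::XRequestId as usize
        } else {
            return; // unknown headers stay in the ordinary header array
        };
        self.slots[slot] = Some((off, len));
    }

    /// O(1) lookup: a single bounds-checked array read, no scanning.
    fn get(&self, h: KnownHeader) -> Option<(u16, u16)> {
        self.slots[h as usize]
    }
}

fn main() {
    let mut idx = HeaderIndex::new();
    // Offsets here are made up for illustration.
    idx.record(b"Content-Length", 120, 3);
    idx.record(b"X-Custom", 140, 5); // not a known header: ignored
    assert_eq!(idx.get(KnownHeader::ContentLength), Some((120, 3)));
    assert_eq!(idx.get(KnownHeader::ContentType), None);
}
```

Because the index is filled during the pass the parser makes anyway, the common "read Content-Length after parsing" path costs a load instead of an O(n) name comparison loop.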
Response writer (zero-allocation, direct buffer write):

- Minimal 200 OK: 3.9 ns
- SSE event headers: 33.5 ns
- JSON API (6 headers): 41.3 ns

#### Static File Serving Benchmarks (wrk, 256 connections, 60s, 8 workers)

| File Size | synapserve | nginx | caddy |
|-----------|------------|-------|-------|
| small.json (118B) | 205,682 req/s (554μs) | 114,902 req/s (1.78ms) | 43,730 req/s (4.74ms) |
| medium.json (4,005B) | 178,541 req/s (644μs) | 109,958 req/s (1.88ms) | 38,398 req/s (5.25ms) |
| large.json (24,203B) | 104,362 req/s (1.13ms) | 93,530 req/s (1.81ms) | 36,774 req/s (5.29ms) |

Summary: 79% faster than nginx on small files, 62% on medium. RSS under load: 14.5MB (8 workers, 256 connections). Test hardware: Intel Core i7-8550U @ 1.80GHz, 4C/8T, Linux 6.17.0.

#### Targets

| Metric | Value | Condition |
|--------|-------|-----------|
| P99.9 tail latency | <1ms | At 80% max load |
| Heap allocations | 0 | Per request on hot path |
| Memory stability | ±5% | Over 30 minutes at 10K connections |

### Why Not Existing Servers?

#### nginx

Written in C. Manual memory management. [6 memory-corruption CVEs in 2024 alone](https://nginx.org/en/security_advisories.html) — including use-after-free ([CVE-2024-24990](https://nvd.nist.gov/vuln/detail/CVE-2024-24990)) and buffer overwrite ([CVE-2024-32760](https://nvd.nist.gov/vuln/detail/CVE-2024-32760)) in HTTP/3. No native agent protocol support.

#### Hyper / Axum

Great libraries, wrong architecture. A boxed future (`Pin<Box<dyn Future>>`) per connection ([tower #753](https://github.com/tower-rs/tower/issues/753)). 42% of heap traced to per-connection I/O buffers ([hyper #1790](https://github.com/hyperium/hyper/issues/1790)).

#### actix-web

Memory grows to ~1.1GB under 16K connections and never returns ([#1946](https://github.com/actix/actix-web/issues/1946), [#1780](https://github.com/actix/actix-web/issues/1780)). The actor system adds overhead for the simple request-response patterns that agents produce.
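The adaptive backpressure mentioned in the request-flow section is based on Little's Law, L = λW: arrival rate times average latency gives the number of requests that must be in flight. A worked sketch with illustrative numbers and a hypothetical admission rule (not SynapServe's actual policy):

```rust
// Little's Law (L = λW) backpressure sketch: if in-flight work exceeds the
// concurrency the server can sustain at its latency target, shed load.
// All thresholds and function names are illustrative.

/// Concurrency implied by Little's Law: L = λ (req/s) × W (seconds).
fn implied_concurrency(arrival_rate: f64, avg_latency_s: f64) -> f64 {
    arrival_rate * avg_latency_s
}

/// Admit a new request only while in-flight work stays under both the
/// Little's-Law bound and a hard per-worker cap.
fn admit(in_flight: u32, arrival_rate: f64, target_latency_s: f64, max_concurrency: u32) -> bool {
    let l = implied_concurrency(arrival_rate, target_latency_s);
    (in_flight as f64) < l.min(max_concurrency as f64)
}

fn main() {
    // 200,000 req/s at a 1 ms latency target implies ~200 requests in flight.
    let l = implied_concurrency(200_000.0, 0.001);
    assert!((l - 200.0).abs() < 1e-9);

    // With 150 in flight there is headroom, so new work is admitted...
    assert!(admit(150, 200_000.0, 0.001, 512));
    // ...but at 250 in flight the implied concurrency is exceeded: shed load.
    assert!(!admit(250, 200_000.0, 0.001, 512));
}
```

The useful property is that the bound adapts on its own: if observed latency rises, the same arrival rate implies more in-flight work, so admission tightens before queues grow unbounded.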
---

## Market Opportunity

The AI infrastructure market is projected at $200B+ by 2030. Every major cloud provider, every AI startup, and every enterprise deploying agents needs HTTP infrastructure that understands agent traffic. The web server market hasn't seen a paradigm shift since nginx replaced Apache.

### Target Markets

1. API Gateway for Agent Traffic: Replace nginx/Envoy at the edge. Production-grade reverse proxy with keepalive pooling, weighted load balancing, health tracking, and automatic failover. Native agent auth, intelligent rate limiting, and protocol-aware routing for MCP and A2A.
2. LLM Inference Serving: Zero-copy SSE streaming for token delivery. When you serve millions of streaming responses, every byte copy costs real money.
3. Agent-to-Agent Infrastructure: As agents coordinate via MCP and A2A, they need infrastructure that treats machine-to-machine traffic as first-class, not suspicious.

### Why Now — Three Tectonic Shifts

1. AI agents are the new browsers: 51% of web traffic is now automated ([Imperva 2025](https://www.imperva.com/blog/2025-imperva-bad-bot-report-how-ai-is-supercharging-the-bot-threat/)). Every AI company (OpenAI, Anthropic, Google, Meta) is shipping agents that make HTTP calls.
2. Protocols are being standardized: IETF Web Bot Auth, Anthropic's MCP, Google's A2A — the standards for agent HTTP communication are being written right now. First-mover advantage is real.
3. io_uring has matured: Linux kernel 6.1+ provides multishot accept, provided buffer rings, and zero-copy send. The OS is finally ready.
---

## Roadmap

| Timeline | Milestone | Status |
|----------|-----------|--------|
| Done | Zero-allocation HTTP/1.1 parser (span-based, SIMD, chunked decoding) | Completed |
| Done | io_uring I/O layer (multishot accept, provided buffer rings, zero-copy send) | Completed |
| Done | HTTP/1.1 server framework (handler trait, radix-tree router, virtual hosts, static file serving with ETag/Range/Brotli) | Completed |
| Done | Reverse proxy & upstream load balancing (keepalive pooling, weighted round-robin/least-conn/IP hash, peer health tracking, retry with next-upstream failover, backup servers, zero-copy splice relay, DNS re-resolution) | Completed |
| Done | TLS 1.3 with kernel TLS (rustls handshake + kTLS offload, parallel HTTP/HTTPS listeners, early data support) | Completed |
| Q2 2026 | HTTP/2 + SSE streaming (bounded flow control, native SSE for LLM token streaming) | In progress |
| Q3 2026 | Agent protocols (native MCP Streamable HTTP, Google A2A discovery, IETF Signature-Agent verification) | Planned |
| Q4 2026 | HTTP/3 (QUIC via s2n-quic) + SynapServe Cloud managed agent gateway | Planned |

---

## Investment

The last paradigm shift in web servers (nginx replacing Apache) created a $50B+ market. SynapServe is built to do the same for the agent web.
- Stage: Seeking seed investment (pre-seed / seed round)
- Product: Working HTTP/1.1 server, TLS 1.3 with kTLS, reverse proxy with load balancing, io_uring I/O layer (built and benchmarked)
- Use of funds: HTTP/2-3, agent protocols, cloud offering, first enterprise partnerships
- Timeline: 18 months of development and first production deployments

### Contact

- Investment: invest@synapserve.io
- General: hello@synapserve.io
- Website: https://synapserve.io
- Blog: https://synapserve.io/posts/
- HTTP Parser Performance Deep Dive: https://synapserve.io/posts/http-parser-performance/
- Investor Deck: https://synapserve.io/SynapServe_Investor_Deck.pptx
- One-Pager (PDF): https://synapserve.io/SynapServe_OnePager.pdf

We respond to every serious inquiry within 24 hours.