# SynapServe — The HTTP Server Built for the AI Agent Era

> Zero-allocation web infrastructure for AI agents. Built in Rust with io_uring. When every millisecond costs tokens, SynapServe delivers.

## Company

- Name: SynapServe
- Website: https://synapserve.io
- Contact: hello@synapserve.io
- Investment: invest@synapserve.io
- Founded: 2025
- Location: Europe
- Stage: Seeking seed investment
- Status: Working product (HTTP/1.1 parser, io_uring I/O layer, server framework, TLS 1.3 with kTLS, reverse proxy with upstream load balancing — built and benchmarked)

---

## The Problem

The internet wasn't built for machines talking to machines. When an AI agent receives a task, it can send thousands of HTTP requests in a single second — hitting APIs, scraping data, coordinating with other agents. To today's web servers, this looks indistinguishable from a DDoS attack.

Nginx, Apache, and even modern Rust frameworks like Hyper weren't designed for this. They allocate memory on every request, box every async future, and rate-limit based on human browsing patterns. The result: dropped connections, false-positive blocks, and wasted compute.
### Key Statistics

- 51% of all internet traffic is now automated / bot traffic (Source: [Imperva Bad Bot Report 2025](https://www.imperva.com/blog/2025-imperva-bad-bot-report-how-ai-is-supercharging-the-bot-threat/))
- 1,300% growth in AI agent HTTP traffic between January and August 2025 (Source: [HUMAN Security](https://www.humansecurity.com/learn/blog/ai-agent-statistics-agentic-commerce/))
- 87% of agent page views are product pages (Source: [HUMAN Security](https://www.humansecurity.com/learn/blog/ai-agent-statistics-agentic-commerce/))
- 39,000 requests per minute from a single fetcher bot instance (Source: [Fastly Threat Insights Report 2025](https://www.fastly.com/threat-insights))
- 42% of heap consumed by per-connection I/O buffer allocations (Source: [hyper #1790](https://github.com/hyperium/hyper/issues/1790))

### What Happens Inside Today's HTTP Servers (Per Request)

Every single request triggers this allocation cascade:

1. `BytesMut::reserve()` — heap allocation for the read buffer
2. `String::from(method)` — heap allocation for the HTTP method
3. `Uri::from_parts()` — heap allocation for the URI
4. `HeaderMap::new()` — heap allocation for headers
5. `Box::pin(future)` — heap allocation per async task
6. `Arc::new(body)` — heap allocation + atomic reference counting
7. `tower::Layer::call()` — heap allocation per middleware layer

Result under sustained load: memory grows to ~1.1GB at 16K concurrent connections and never returns (see actix-web [#1946](https://github.com/actix/actix-web/issues/1946), [#1780](https://github.com/actix/actix-web/issues/1780)), p99.9 latency spikes, and legitimate AI agents get mistakenly rate-limited.

---

## The Solution

SynapServe is HTTP infrastructure redesigned from byte zero for AI agents. It is not a patch on existing servers — it's a ground-up rethinking of how HTTP should work when the majority of clients are autonomous software agents.

### Three Pillars

#### 1. Zero-Allocation Hot Path

Every parsed request lives on the stack.
Span-based parsing replaces string copies. io_uring provided buffers mean the kernel manages memory — not your allocator. The allocator pressure that causes latency spikes in conventional servers doesn't exist.

- 0 allocations per request (verified by a counting allocator)

#### 2. Agent-Native Protocols

Built-in support for:

- IETF Web Bot Auth (Signature-Agent headers)
- Anthropic MCP Streamable HTTP (Model Context Protocol)
- Google A2A discovery (Agent-to-Agent)
- SSE token streaming for LLM inference

Agents aren't second-class citizens — they're the primary design target.

#### 3. Predictable Tail Latency

Thread-per-core architecture with io_uring eliminates cross-thread contention. No `Arc`, no `Mutex`, no `Send` bounds. Each core owns its connections, buffers, and allocator.

- <1ms p99.9 latency at 80% of max throughput

---

## Technology Deep Dive

### Architecture (Request Flow)

1. Incoming request (AI agent, 39K req/min)
2. io_uring multishot accept — 1 syscall for N connections
3. Kernel provides buffer from ring — zero-copy
4. TLS 1.3 handshake (if HTTPS listener) — rustls in userspace, then kTLS offload to kernel
   - Session keys installed via `setsockopt(SOL_TLS)` — kernel handles encrypt/decrypt
   - SEND_ZC and SPLICE remain zero-copy through the TLS layer
   - TLS 1.3 early data support (0-RTT)
   - Parallel HTTP + HTTPS listeners on the same server instance
5. synapserve-http-parser: zero-allocation span parser
   - `Span { off: u16, len: u16 }` on the stack — 4-byte views, no copies
   - `[Header; 64]` stack array — 640 bytes, no heap
   - SIMD-accelerated scanning — AVX2/SSE4.2/NEON with runtime detection
6. `handler(&req, &mut writer)` — direct buffer write
   - Agent identification via Signature-Agent / User-Agent
   - Adaptive backpressure using Little's Law (L = λW)
7. Reverse proxy path (if `proxy_pass` configured)
   - Upstream connection pool: keepalive reuse, per-worker, zero locks
   - Load balancing: weighted round-robin / least_conn / ip_hash
   - Health tracking: max_fails, fail_timeout, effective_weight recovery
   - Retry on next upstream with tried-peer bitmap
   - Splice relay: upstream fd → pipe → client fd (zero-copy)
8. io_uring SEND_ZC + SPLICE — zero-copy kernel to socket
   - Headers + body in one flight, no userspace copy

### Reverse Proxy & Upstream Load Balancing

SynapServe includes a production-grade reverse proxy built on the same zero-allocation, io_uring architecture:

- **Upstream Connection Pooling**: Per-worker keepalive pools with LRU reuse. Connections are returned after validating age, request count, and error state. Eliminates connect() latency for repeat requests. Zero locks — each worker owns its pool (`thread_local`).
- **Load Balancing**: Weighted smooth round-robin (nginx-compatible algorithm), least connections (weight-normalized cross-multiply comparison), and IP hash. Per-server weight configuration with smooth distribution (no bursting).
- **Peer Health Tracking**: Automatic failure detection with configurable max_fails and fail_timeout. Failed servers are skipped during selection and probed after the timeout expires. Effective-weight recovery provides gradual ramp-up after failures.
- **Retry on Next Upstream**: Configurable retry conditions (error, timeout, HTTP 500/502/503/504/403/404/429). A tried-peer u64 bitmap prevents re-selection of failed servers. Non-idempotent method safety (POST/PUT not retried after the request was sent). Configurable max tries and timeout.
- **Backup Servers**: Two-tier peer selection — primary servers are tried first; backup servers activate only when all primaries are down or at their connection limit.
- **Zero-Copy Splice Relay**: Large response bodies relayed via kernel splice (upstream fd → pipe → client fd), bypassing userspace entirely.
- **Request Body Forwarding**: POST/PUT request bodies forwarded to the upstream with Content-Length preservation.
- **DNS Re-resolution**: Background resolver thread with TTL-based cache. Upstream addresses are re-resolved automatically without blocking the event loop.
- **TCP Optimizations**: The MSG_MORE flag coalesces small writes into larger TCP segments during header/body assembly.
- **Max Connections Enforcement**: Per-server connection limits prevent backend overload. Returns 503 when all servers are at capacity.

### Technology Stack

- Rust: Memory safety without garbage collection. The borrow checker enforces zero-copy at compile time. No runtime, no VM, no overhead.
- io_uring: Linux kernel ≥6.1. Multishot accept, provided buffer rings, zero-copy send. One submission queue per core — minimal syscall overhead.
- kTLS: TLS 1.3 handshake via rustls in userspace, then session keys installed in the kernel via `setsockopt(SOL_TLS)`. The kernel handles encrypt/decrypt transparently — SEND_ZC and SPLICE remain zero-copy through the TLS layer. Supports parallel HTTP + HTTPS listeners and TLS 1.3 early data (0-RTT).
- Thread-per-core: Shared-nothing architecture. Each core owns its connections, buffers, and allocator. No locks, no contention, no false sharing. Linear scaling.
- SIMD parsing: AVX2/SSE4.2/NEON-accelerated header scanning with runtime detection. 16-32 bytes checked per cycle. Combined with span output — parse results never leave registers.

### Performance (Benchmarked)

#### HTTP Parser Benchmarks (synapserve-http-parser vs httparse 1.10, head-to-head, Criterion, single core, x86_64 AVX2)

Compared against httparse (Rust, used by hyper/axum/actix-web). Same machine, same inputs, same Criterion config. Intel Core i7-8550U, Linux x86_64, `-C target-cpu=native`.

Key difference: httparse only tokenizes. synapserve-http-parser additionally extracts content_length/chunked/keep_alive and builds an O(1) known-header index during the same pass — and is still faster.
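To make the span approach concrete, here is a minimal sketch of span-based request-line parsing. The `Span` struct mirrors the shape described above; `parse_request_line` is an illustrative simplification, not SynapServe's actual (SIMD-accelerated) parser:

```rust
// Span-based parsing sketch: the parser records (offset, length) views into
// the request buffer instead of copying bytes to the heap. `Span` and
// `parse_request_line` are illustrative, not SynapServe's real API.

#[derive(Clone, Copy, Debug, PartialEq)]
struct Span {
    off: u16, // byte offset into the original buffer
    len: u16, // length of the view
}

impl Span {
    /// Materialize the view by reslicing the original buffer — no copy.
    fn slice<'a>(&self, buf: &'a [u8]) -> &'a [u8] {
        &buf[self.off as usize..self.off as usize + self.len as usize]
    }
}

/// Parse "METHOD SP URI SP VERSION\r\n" into three spans without allocating.
fn parse_request_line(buf: &[u8]) -> Option<(Span, Span, Span)> {
    let line_end = buf.windows(2).position(|w| w == b"\r\n")?;
    let line = &buf[..line_end];
    let sp1 = line.iter().position(|&b| b == b' ')?;
    let sp2 = sp1 + 1 + line[sp1 + 1..].iter().position(|&b| b == b' ')?;
    let span = |off: usize, len: usize| Span { off: off as u16, len: len as u16 };
    Some((
        span(0, sp1),                          // method
        span(sp1 + 1, sp2 - sp1 - 1),          // uri
        span(sp2 + 1, line_end - sp2 - 1),     // version
    ))
}

fn main() {
    let req = b"GET /v1/agents HTTP/1.1\r\nHost: example.com\r\n\r\n";
    let (method, uri, version) = parse_request_line(req).unwrap();
    // All three results are 4-byte stack structs pointing into `req`.
    assert_eq!(method.slice(req), b"GET");
    assert_eq!(uri.slice(req), b"/v1/agents");
    assert_eq!(version.slice(req), b"HTTP/1.1");
}
```

The payoff is that nothing escapes the input buffer: resetting the parser between requests means zeroing a few small structs rather than freeing heap objects.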
**Raw tokenization (synapserve does more work, is still faster):**

| Request Size | synapserve-http-parser | httparse 1.10 | Ratio |
|--------------|------------------------|---------------|-------|
| Small (35B, 1 hdr) | 42 ns / 803 MiB/s | 52 ns / 641 MiB/s | synapserve 1.25x faster |
| Medium (368B, 9 hdrs) | 200 ns / 1.71 GiB/s | 230 ns / 1.49 GiB/s | synapserve 1.15x faster |
| Large (733B, 20 hdrs) | 420 ns / 1.63 GiB/s | 466 ns / 1.47 GiB/s | synapserve 1.11x faster |

**With semantic extraction (apples-to-apples — httparse + content_length/chunked/keep_alive extraction):**

| Request Size | synapserve-http-parser | httparse + extract | Ratio |
|--------------|------------------------|--------------------|-------|
| Medium (368B) | 220 ns / 4.5M req/s | 304 ns / 3.3M req/s | synapserve 1.38x faster |
| Large (733B) | 413 ns / 2.4M req/s | 604 ns / 1.7M req/s | synapserve 1.46x faster |

**Header access (O(1) vs O(n)):**

| Header | synapserve (O(1)) | httparse (O(n) scan) | Speedup |
|--------|-------------------|----------------------|---------|
| Content-Type (pos 2/9) | 0.65 ns | 20.5 ns | 32x |
| Content-Length (pos 9/9) | 0.59 ns | 22.0 ns | 37x |
| X-Request-Id (pos 6/9) | 0.67 ns | 23.1 ns | 34x |

Summary: synapserve is faster at all request sizes (1.11-1.25x) despite doing more work per parse. With equal semantic extraction, synapserve is 1.38-1.46x faster on requests and 1.38-1.40x faster on responses. Header lookup is 32-37x faster (O(1) vs O(n)). Custom AVX2/SSE4.2 SIMD scanning with runtime detection on x86_64, NEON on ARM64. Parse throughput: 1.71 GiB/s on realistic agent requests, single core. Per-request reset: 29 bytes (synapserve) vs 2,048 bytes (httparse).
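The O(1) header lookups measured above come from indexing well-known headers once during the parse pass, so each later access is a single array read rather than a scan. A minimal sketch of the idea (the names `KnownHeader` and `HeaderIndex` are hypothetical simplifications):

```rust
// O(1) known-header index sketch: well-known headers get fixed slots filled
// during parsing; lookup is one array read. Illustrative only — not
// SynapServe's actual data structure.

#[derive(Clone, Copy)]
enum KnownHeader {
    ContentType = 0,
    ContentLength = 1,
    XRequestId = 2,
}

const KNOWN: usize = 3;

/// Each slot holds a (offset, length) value span into the request buffer.
struct HeaderIndex {
    slots: [Option<(u16, u16)>; KNOWN],
}

impl HeaderIndex {
    fn new() -> Self {
        Self { slots: [None; KNOWN] }
    }

    /// Called once per header during the parse pass (case-insensitive match).
    fn record(&mut self, name: &[u8], off: u16, len: u16) {
        let slot = if name.eq_ignore_ascii_case(b"content-type") {
            KnownHeader::ContentType as usize
        } else if name.eq_ignore_ascii_case(b"content-length") {
            KnownHeader::ContentLength as usize
        } else if name.eq_ignore_ascii_case(b"x-request-id") {
            KnownHeader::XRequestId as usize
        } else {
            return; // unknown headers stay in the ordinary header array
        };
        self.slots[slot] = Some((off, len));
    }

    /// O(1) lookup: a single bounds-checked array read, no scanning.
    fn get(&self, h: KnownHeader) -> Option<(u16, u16)> {
        self.slots[h as usize]
    }
}

fn main() {
    let mut idx = HeaderIndex::new();
    // Offsets here are made up for illustration.
    idx.record(b"Content-Length", 120, 3);
    idx.record(b"X-Custom", 140, 5); // not a known header: ignored
    assert_eq!(idx.get(KnownHeader::ContentLength), Some((120, 3)));
    assert_eq!(idx.get(KnownHeader::ContentType), None);
}
```

Because the index is filled during the pass the parser makes anyway, the common "read Content-Length after parsing" path costs a load instead of an O(n) name comparison loop.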
Response writer (zero-allocation, direct buffer write):

- Minimal 200 OK: 3.9 ns
- SSE event headers: 33.5 ns
- JSON API (6 headers): 41.3 ns

#### Static File Serving Benchmarks (wrk, 256 connections, 60s, 8 workers)

| File Size | synapserve | nginx | caddy |
|-----------|------------|-------|-------|
| small.json (118B) | 205,682 req/s (554μs) | 114,902 req/s (1.78ms) | 43,730 req/s (4.74ms) |
| medium.json (4,005B) | 178,541 req/s (644μs) | 109,958 req/s (1.88ms) | 38,398 req/s (5.25ms) |
| large.json (24,203B) | 104,362 req/s (1.13ms) | 93,530 req/s (1.81ms) | 36,774 req/s (5.29ms) |

Summary: 79% faster than nginx on small files, 62% on medium. RSS under load: 14.5MB (8 workers, 256 connections). Test hardware: Intel Core i7-8550U @ 1.80GHz, 4C/8T, Linux 6.17.0.

#### Targets

| Metric | Value | Condition |
|--------|-------|-----------|
| P99.9 tail latency | <1ms | At 80% max load |
| Heap allocations | 0 | Per request on hot path |
| Memory stability | ±5% | Over 30 minutes at 10K connections |

### Why Not Existing Servers?

#### nginx

Written in C. Manual memory management. [6 memory-corruption CVEs in 2024 alone](https://nginx.org/en/security_advisories.html) — including use-after-free ([CVE-2024-24990](https://nvd.nist.gov/vuln/detail/CVE-2024-24990)) and buffer overwrite ([CVE-2024-32760](https://nvd.nist.gov/vuln/detail/CVE-2024-32760)) in HTTP/3. No native agent protocol support.

#### Hyper / Axum

Great libraries, wrong architecture. A boxed future (`Pin<Box<dyn Future>>`) per connection ([tower #753](https://github.com/tower-rs/tower/issues/753)). 42% of heap traced to per-connection I/O buffers ([hyper #1790](https://github.com/hyperium/hyper/issues/1790)).

#### actix-web

Memory grows to ~1.1GB under 16K connections and never returns ([#1946](https://github.com/actix/actix-web/issues/1946), [#1780](https://github.com/actix/actix-web/issues/1780)). The actor system adds overhead for the simple request-response patterns that agents produce.
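The adaptive backpressure mentioned in the request-flow section is based on Little's Law, L = λW: arrival rate times average latency gives the number of requests that must be in flight. A worked sketch with illustrative numbers and a hypothetical admission rule (not SynapServe's actual policy):

```rust
// Little's Law (L = λW) backpressure sketch: if in-flight work exceeds the
// concurrency the server can sustain at its latency target, shed load.
// All thresholds and function names are illustrative.

/// Concurrency implied by Little's Law: L = λ (req/s) × W (seconds).
fn implied_concurrency(arrival_rate: f64, avg_latency_s: f64) -> f64 {
    arrival_rate * avg_latency_s
}

/// Admit a new request only while in-flight work stays under both the
/// Little's-Law bound and a hard per-worker cap.
fn admit(in_flight: u32, arrival_rate: f64, target_latency_s: f64, max_concurrency: u32) -> bool {
    let l = implied_concurrency(arrival_rate, target_latency_s);
    (in_flight as f64) < l.min(max_concurrency as f64)
}

fn main() {
    // 200,000 req/s at a 1 ms latency target implies ~200 requests in flight.
    let l = implied_concurrency(200_000.0, 0.001);
    assert!((l - 200.0).abs() < 1e-9);

    // With 150 in flight there is headroom, so new work is admitted...
    assert!(admit(150, 200_000.0, 0.001, 512));
    // ...but at 250 in flight the implied concurrency is exceeded: shed load.
    assert!(!admit(250, 200_000.0, 0.001, 512));
}
```

The useful property is that the bound adapts on its own: if observed latency rises, the same arrival rate implies more in-flight work, so admission tightens before queues grow unbounded.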
---

## Market Opportunity

The AI infrastructure market is projected at $200B+ by 2030. Every major cloud provider, every AI startup, and every enterprise deploying agents needs HTTP infrastructure that understands agent traffic. The web server market hasn't seen a paradigm shift since nginx replaced Apache.

### Target Markets

1. API Gateway for Agent Traffic: Replace nginx/Envoy at the edge. Production-grade reverse proxy with keepalive pooling, weighted load balancing, health tracking, and automatic failover. Native agent auth, intelligent rate limiting, and protocol-aware routing for MCP and A2A.
2. LLM Inference Serving: Zero-copy SSE streaming for token delivery. When you serve millions of streaming responses, every byte copy costs real money.
3. Agent-to-Agent Infrastructure: As agents coordinate via MCP and A2A, they need infrastructure that treats machine-to-machine traffic as first-class, not suspicious.

### Why Now — Three Tectonic Shifts

1. AI agents are the new browsers: 51% of web traffic is now automated ([Imperva 2025](https://www.imperva.com/blog/2025-imperva-bad-bot-report-how-ai-is-supercharging-the-bot-threat/)). Every AI company (OpenAI, Anthropic, Google, Meta) is shipping agents that make HTTP calls.
2. Protocols are being standardized: IETF Web Bot Auth, Anthropic's MCP, Google's A2A — the standards for agent HTTP communication are being written right now. First-mover advantage is real.
3. io_uring has matured: Linux kernel 6.1+ provides multishot accept, provided buffer rings, and zero-copy send. The OS is finally ready.
---

## Roadmap

| Timeline | Milestone | Status |
|----------|-----------|--------|
| Done | Zero-allocation HTTP/1.1 parser (span-based, SIMD, chunked decoding) | Completed |
| Done | io_uring I/O layer (multishot accept, provided buffer rings, zero-copy send) | Completed |
| Done | HTTP/1.1 server framework (handler trait, radix-tree router, virtual hosts, static file serving with ETag/Range/Brotli) | Completed |
| Done | Reverse proxy & upstream load balancing (keepalive pooling, weighted round-robin/least-conn/IP hash, peer health tracking, retry with next-upstream failover, backup servers, zero-copy splice relay, DNS re-resolution) | Completed |
| Done | TLS 1.3 with kernel TLS (rustls handshake + kTLS offload, parallel HTTP/HTTPS listeners, early data support) | Completed |
| Q2 2026 | HTTP/2 + SSE streaming (bounded flow control, native SSE for LLM token streaming) | In progress |
| Q3 2026 | Agent protocols (native MCP Streamable HTTP, Google A2A discovery, IETF Signature-Agent verification) | Planned |
| Q4 2026 | HTTP/3 (QUIC via s2n-quic) + SynapServe Cloud managed agent gateway | Planned |

---

## Investment

The last paradigm shift in web servers (nginx replacing Apache) created a $50B+ market. SynapServe is built to do the same for the agent web.
- Stage: Seeking seed investment (pre-seed / seed round)
- Product: Working HTTP/1.1 server, TLS 1.3 with kTLS, reverse proxy with load balancing, io_uring I/O layer (built and benchmarked)
- Use of funds: HTTP/2-3, agent protocols, cloud offering, first enterprise partnerships
- Timeline: 18 months of development and first production deployments

### Contact

- Investment: invest@synapserve.io
- General: hello@synapserve.io
- Website: https://synapserve.io
- Blog: https://synapserve.io/posts/
- HTTP Parser Performance Deep Dive: https://synapserve.io/posts/http-parser-performance/
- Investor Deck: https://synapserve.io/SynapServe_Investor_Deck.pptx
- One-Pager (PDF): https://synapserve.io/SynapServe_OnePager.pdf

We respond to every serious inquiry within 24 hours.