Decentralized AI Infrastructure — Open Beta

Every Browser Is an AI Supercomputer

Mesh transforms standard web browsers and native apps into interoperable nodes within a resilient, globally distributed P2P compute grid — featuring shard-based model distribution, heterogeneous execution orchestration across WebGPU / WebNN / Wasm, and trustless cryptographic validation. No cloud API keys. No centralized GPU fleet. Just the web.

12 B+
Params Executed On-Edge
WebRTC P2P
Shard Transport Layer
zkML
Trustless Validation
0 APIs
Cloud Dependencies

Infrastructure Built for the
Impossible Constraints of the Browser

Mesh doesn't compromise the browser — it engineers around every limitation. Each subsystem is a radical rethinking of what's achievable at the edge.

Shard-Based Execution

Infinite Memory via Sharding

Transformer models exceeding available VRAM are partitioned into pipeline stages and streamed across multiple browser tabs running in parallel — each tab owns a contiguous slice of the model graph. Inter-tab latency is amortized across prefill and decode phases, sidestepping local GPU memory ceilings entirely.

// Tab 0 → Layers 0–11 (embed + first half)
// Tab 1 → Layers 12–23 (second half + LM head)
BroadcastChannel("shard-grid") → activations
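The two-tab split above generalises to any tab count. A minimal sketch of the contiguous-slice partition, with invented names (`partitionLayers`, `Stage`) that are illustrative rather than part of the Mesh API:

```typescript
// Hypothetical sketch: split an N-layer transformer into contiguous
// pipeline stages, one stage per tab. Early tabs absorb any remainder
// so every layer is owned by exactly one tab.
interface Stage {
  tab: number;
  firstLayer: number; // inclusive
  lastLayer: number;  // inclusive
}

function partitionLayers(numLayers: number, numTabs: number): Stage[] {
  const base = Math.floor(numLayers / numTabs);
  const extra = numLayers % numTabs;
  const stages: Stage[] = [];
  let next = 0;
  for (let tab = 0; tab < numTabs; tab++) {
    const size = base + (tab < extra ? 1 : 0);
    stages.push({ tab, firstLayer: next, lastLayer: next + size - 1 });
    next += size;
  }
  return stages;
}
```

For a 24-layer model across two tabs this reproduces the split in the snippet: tab 0 owns layers 0–11, tab 1 owns layers 12–23.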
Model Swarming

P2P Model Swarming

Multi-gigabyte model weights are distributed over torrent-style WebRTC data channels. Nodes that have already downloaded shards become seeders, dramatically accelerating warm-up time across the fleet. All weights are cached in the Origin Private File System (OPFS), so they survive page reloads and browser restarts — and, where the browser grants persistent storage, eviction under storage pressure.

OPFS → persistent model cache
WebRTC DataChannel → shard transport
Seeder ratio ↑ → warm-up latency ↓
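Torrent-style swarms typically fetch the rarest piece first, so scarce shards replicate before their seeders go offline. A sketch of that selection policy (the function name and peer map shape are assumptions, not the actual Mesh scheduler):

```typescript
// Illustrative rarest-first shard selection: among the shards we still
// need, pick the one held by the fewest seeders.
function rarestFirst(
  needed: Set<string>,
  seederShards: Map<string, string[]> // peer id -> shard ids it seeds
): string | undefined {
  const counts = new Map<string, number>();
  for (const shards of seederShards.values()) {
    for (const id of shards) {
      if (needed.has(id)) counts.set(id, (counts.get(id) ?? 0) + 1);
    }
  }
  let best: string | undefined;
  let bestCount = Infinity;
  for (const [id, count] of counts) {
    if (count < bestCount) { best = id; bestCount = count; }
  }
  return best; // undefined when no connected seeder has a needed shard
}
```

The real transport would then open a WebRTC DataChannel to one of that shard's seeders and write the received bytes into the OPFS cache.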
Unified Runtime

Heterogeneous Backends

A single unified scheduler dispatches tensor ops to the fastest backend available on each device: WebGPU for discrete and integrated GPUs, WebNN for NPU-accelerated silicon, and WebAssembly SIMD as the universal fallback. One graph on any hardware, with no specialised builds required.

Backend priority: WebGPU → WebNN → Wasm
Runtime detection → optimal kernel dispatch
INT8 / FP16 quantisation per backend caps
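The priority chain reduces to a small selection function. In this sketch the capability flags are passed in explicitly so the logic is testable outside a browser; in a real page they would come from feature detection (`navigator.gpu`, `navigator.ml`, `WebAssembly.validate`). The names here are illustrative:

```typescript
// Sketch of backend selection under the priority order above.
type Backend = "webgpu" | "webnn" | "wasm";

interface Capabilities {
  webgpu: boolean;   // navigator.gpu present and adapter acquired
  webnn: boolean;    // navigator.ml present
  wasmSimd: boolean; // SIMD-enabled Wasm validated
}

function selectBackend(caps: Capabilities): Backend {
  if (caps.webgpu) return "webgpu"; // discrete / integrated GPUs
  if (caps.webnn) return "webnn";   // NPU-backed devices
  if (caps.wasmSimd) return "wasm"; // universal fallback
  throw new Error("no supported compute backend");
}
```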
OOM Recovery

Unkillable Compute

Out-of-memory crashes are an expected system event, not an error. When a shard worker tab is killed by the browser's memory pressure heuristic, the orchestrator transparently spawns a replacement in a hidden background tab, reloads the cached weights from OPFS, and re-syncs pipeline state — all before the calling code sees a timeout.

OOM detected → respawn hidden tab
OPFS cache → instant weight reload
Pipeline state sync → <500ms recovery
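Treating worker death as a retriable event rather than a fatal error can be sketched as a small supervision loop. `runShard` and `maxRespawns` are invented names; the real orchestrator additionally re-syncs pipeline state across the surviving tabs:

```typescript
// Minimal respawn loop: if the shard worker dies (e.g. killed under
// memory pressure), spawn a replacement and replay from cached weights.
async function withRespawn<T>(
  runShard: () => Promise<T>, // spawn worker, load weights from cache, run
  maxRespawns = 3
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRespawns; attempt++) {
    try {
      return await runShard();
    } catch (err) {
      lastError = err; // worker tab killed: respawn and retry
    }
  }
  throw lastError;
}
```

Because the weights reload from OPFS rather than the network, each respawn costs only cache-read plus state-sync time, which is what makes a sub-second recovery target plausible.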

Your Data Never Leaves
Your Browser

Mesh's privacy model is structural, not policy-based. Cryptographic tiling ensures no single node in the grid can ever reconstruct a user's input payload — even while participating in its inference.

Cryptographic Tiling

Input tensors are split into cryptographically independent fragments before distribution. Each participating node receives only the slice required for its shard; reconstructing the full input would require collusion among a quorum of nodes.

Zero-Knowledge Proof Validation (zkML)

Every inference result is accompanied by a compact zkML proof that attests correct execution without revealing the input or intermediate activations. Validators verify the proof, not the payload — enabling trustless compute markets without sacrificing privacy.

No Persistent Data Exfiltration

Shard workers are ephemeral sandboxed contexts. Activation tensors in transit exist only in GPU memory buffers — never written to disk, never logged, never accessible outside the isolated execution context.

Contribute Compute,
Earn Karma

Mesh's Proof-of-Useful-Work protocol rewards every measurable contribution to grid health — from raw inference throughput to model seeding and uptime.

Karma Score

A deterministic, on-chain reputation score that accumulates from verifiable grid contributions. Higher karma unlocks priority job routing, validator eligibility, and governance weight.

Contribution breakdown: Inference 82% · Seeding 64% · Uptime 91% · Redundancy 55% · Validation 73%
Proof-of-Useful-Work

Every inference job produces a cryptographic execution receipt. Nodes submit receipts to the validator pool; confirmed receipts trigger karma accrual proportional to shard complexity and latency SLA adherence.
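One plausible shape for "proportional to shard complexity and latency SLA adherence" is a base reward scaled by an SLA factor. The constants, the linear decay curve, and all names below are invented for illustration; the protocol's actual accrual rule is not specified on this page:

```typescript
// Hypothetical karma accrual for one confirmed execution receipt.
interface Receipt {
  shardFlops: number; // complexity of the executed shard
  latencyMs: number;  // observed end-to-end latency
  slaMs: number;      // latency target for the job
}

function karmaForReceipt(r: Receipt): number {
  const base = r.shardFlops / 1e9; // assumed: 1 karma per GFLOP executed
  const slaFactor = r.latencyMs <= r.slaMs
    ? 1.0 // met the SLA: full accrual
    : Math.max(0, 1 - (r.latencyMs - r.slaMs) / r.slaMs); // linear decay past target
  return base * slaFactor;
}
```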

Redundant Execution Bonuses

Critical inference paths require 2-of-3 redundant execution for consensus. Nodes that volunteer for redundant slots earn multiplied karma — incentivising a self-healing grid without coordinator overhead.
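2-of-3 agreement reduces to a majority check over the three result digests. A sketch, comparing outputs by digest string (the function name and representation are assumptions):

```typescript
// 2-of-3 result consensus: accept a result when at least two of the
// three redundant executions produced matching output digests.
function consensus2of3(digests: [string, string, string]): string | null {
  const [a, b, c] = digests;
  if (a === b || a === c) return a;
  if (b === c) return b;
  return null; // no quorum: re-dispatch the job
}
```

A null result signals a faulty or dishonest node; since any single bad node is outvoted by the other two, the grid self-heals without a central coordinator adjudicating disputes.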

Model Seeder Rewards

Nodes that seed model shards to new grid participants receive continuous karma accrual proportional to bytes transferred and verified. Seeding is the network's distribution layer — it is a first-class contribution.

Uptime & Fault-Tolerance Staking

Long-running nodes with high uptime scores are eligible to stake karma as collateral for validator roles — earning a share of protocol revenue in exchange for maintaining latency and reliability SLAs.

The Grid Is Live.
Your Browser Can Join Today.

No installation. No sign-up required. Open a tab and become a node.

Join the Test Grid
Architecture Paper