INIT-06
Planned
Launch: 2026-Q2
Owner: Edge
Edge Inference
Deploy small models on devices for low-latency, offline-capable inference.
/ Overview
Edge Inference will deploy small, optimized models on devices and edge nodes, delivering low-latency, offline-capable inference with optional sync to the cloud. The roadmap is planned for 2026.
/ The_Problem
Cloud inference adds round-trip latency and depends on connectivity. For real-time and offline use cases, models need to run on the device itself. Edge Inference will provide one pipeline: train in the cloud, then deploy to phones and edge nodes with quantization and optional cloud sync.
/ Train_once,_run_everywhere
Export to ONNX, quantize for size and speed, and deploy to iOS, Android, or Linux edge nodes. Optional sync to the cloud for logging and model updates without moving inference back to the server.
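To make the quantization step concrete, here is a minimal sketch of symmetric per-tensor int8 quantization, the general technique behind shrinking model weights for edge deployment. This is illustrative only: the actual Edge Inference runtime and its APIs are planned, not built, and the function names here (`quantize_int8`, `dequantize`) are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats onto the int8 range [-127, 127].

    Stores one float scale per tensor; each weight becomes a small integer,
    cutting storage to roughly a quarter of float32.
    """
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard against an all-zero tensor
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights at inference time."""
    return [q * scale for q in quantized]

# Round-trip a toy weight tensor: small rounding error, 4x smaller storage.
weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

In practice an exporter would apply this (or finer-grained per-channel variants) to each weight tensor of the ONNX graph, trading a small accuracy loss for model size and on-device speed.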
“Edge Inference is our path to sub-50ms inference and offline-first AI in the product.”
Planned
2026
/ Roadmap
P1 · Runtime design · ONNX, quantization
P2 · Device SDK · iOS, Android, Linux