base compute is an AI inference lab. We build the runtimes and infrastructure that make powerful AI run on-device - fast, private, and at near-zero marginal cost.
Already shipped
baseRT is tuned to get the most out of Apple Silicon for local LLMs, so you get the throughput that makes on-device inference practical at scale. It's the performance baseline we extend across our edge stack.
Open source. [Chart: inference throughput (tok/s) on Apple M4 Pro]
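For context, the throughput number is generated tokens per wall-clock second. Below is a minimal, runtime-agnostic sketch of how such a figure is measured - the generate() stub stands in for any local runtime call and is not baseRT's actual API:

    import time

    def generate(prompt: str, max_tokens: int) -> list[str]:
        # Stand-in for any local runtime's generate call (llama.cpp, MLX, ...).
        # Sleeps to simulate decode latency; swap in a real runtime to benchmark.
        time.sleep(0.01 * max_tokens)
        return ["token"] * max_tokens

    def tokens_per_second(prompt: str, max_tokens: int = 256) -> float:
        """Generated tokens per wall-clock second for a single request."""
        start = time.perf_counter()
        tokens = generate(prompt, max_tokens)
        elapsed = time.perf_counter() - start
        return len(tokens) / elapsed

    print(f"{tokens_per_second('Hello'):.1f} tok/s")  # ~100 tok/s with the stub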
The cost gap closes when AI runs on hardware organisations already own - not in a data centre they don't control. Local inference has no usage meter; at scale, it approaches zero marginal cost.
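To make "near-zero marginal cost" concrete, here is a back-of-the-envelope comparison. Every figure below is an illustrative assumption, not a quoted price or a measured result:

    # All figures are illustrative assumptions, not quoted prices.
    CLOUD_PRICE_PER_M_TOKENS = 10.0    # $ per million tokens, hypothetical API rate
    HARDWARE_COST = 2000.0             # $ one-off, for hardware the org buys or owns
    TOKENS_PER_MONTH = 50_000_000      # hypothetical monthly volume across an org
    MONTHS = 36                        # amortisation window

    cloud_total = CLOUD_PRICE_PER_M_TOKENS * (TOKENS_PER_MONTH / 1e6) * MONTHS
    local_total = HARDWARE_COST        # electricity ignored for simplicity

    total_tokens = TOKENS_PER_MONTH * MONTHS
    print(f"cloud: ${cloud_total:,.0f} (${cloud_total / total_tokens * 1e6:.2f}/M tokens)")
    print(f"local: ${local_total:,.0f} (${local_total / total_tokens * 1e6:.2f}/M tokens)")
    # The cloud bill scales linearly with usage; the local cost per token
    # falls toward zero as the fixed hardware cost is amortised.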
Cloud-only AI is unsustainable on privacy, latency, connectivity and cost. The industry has to move to the edge. The question is who builds the infrastructure that gets it there.
What changed
From pro laptops to low-power edge devices, AI-ready hardware is already mainstream.
__% of common chatbot queries can be answered correctly by local models.
Stanford University research, 2026
The capability gap between open and closed models has been bridged - DeepSeek R1, Qwen 3, and Llama 4 are the proof.
The problem
llama.cpp, MLX, Ollama, ExecuTorch, ONNX Runtime
These tools are strong for local experiments, but they stop at raw inference and leave major production gaps.
Performance is also unresolved: current runtimes still leave significant throughput on the table and are rarely tuned to the silicon they actually run on.
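To make "raw inference" concrete: this is roughly where today's tools stop. The call below uses Ollama's documented /api/generate endpoint against a local server; the model name is just an example:

    import json
    import urllib.request

    # One-off generation against a local Ollama server (default port 11434).
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({
            "model": "llama3",
            "prompt": "Summarise our on-call runbook.",
            "stream": False,
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["response"])
    # What's missing is everything around this call: which devices run which
    # model version, who asked what, how updates roll out, what gets audited.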
The hardware is ready. The models are ready. The infrastructure to deploy them in production is not.
What we're building
Natively optimised for each hardware target - Apple Silicon, Nvidia Jetson, commodity x86. We extract what the defaults leave on the table.
Deploy, update and manage local AI across an entire organisation - from a handful of devices to thousands.
Secure, auditable delivery of models to any device or on-premises environment - nothing transits third-party infrastructure.
Usage visibility with the audit trails enterprises and governments actually require - built into the architecture from day one.
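A sketch of how these pieces could come together in a single fleet deployment policy. This is a hypothetical illustration - every field name below is invented, not baseRT's actual configuration format:

    # Hypothetical fleet deployment manifest, expressed as a Python dict.
    # None of these fields come from an actual product; they illustrate the
    # shape of the problem: rollout, provenance, and auditability together.
    deployment = {
        "model": {
            "name": "qwen3-8b-instruct",
            "version": "2025.06",
            "checksum": "sha256:...",    # verified before the model ever loads
        },
        "targets": {
            "groups": ["field-laptops", "branch-servers"],
            "hardware": ["apple-silicon", "jetson-orin", "x86-avx512"],
        },
        "rollout": {
            "strategy": "staged",        # canary a few devices, then widen
            "canary_fraction": 0.05,
            "rollback_on": ["crash", "latency_regression"],
        },
        "audit": {
            "log_prompts": False,        # usage metadata only, not content
            "retention_days": 365,
            "export": "on-prem-siem",
        },
    }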
Working on edge AI or regulated infrastructure, or just want to follow what we're doing? We'd like to hear from you.
Contact