Krish Puvvada · Austin · Building at webAI

I build AI products at the layers where they have to work in production — retrieval, inference, and the on‑device systems behind them.

Product lead at webAI, working on AI that runs inside the customer’s perimeter — on the device in your hand, with no cloud round‑trip. Built for environments where data can’t leave the network, by policy or by physics. Previously: Microsoft, Meta, Bank of America.

April 2026 · Austin

Now

Leading product for Frontline Intelligence, our on‑device retrieval system, which runs enterprise‑scale corpora entirely within the customer’s environment, on the hardware they already own. The interesting work sits at the intersection of model selection, index design, and memory budgeting: making a search corpus that would normally need a server’s worth of RAM fit into the budget of a tablet, while holding latency low enough that the experience feels like the cloud.

Also: wrote our company‑wide product vision, Sovereign AI: The Intelligence Layer, and the RAG evaluation methodology we use for customer rollouts.

Sovereignty isn’t a feature you bolt on. It’s the assumption that shapes every design decision you make. Either foundational, or theater.

From “Sovereignty Is Architecture, Not a Feature”

Work

Selected projects

Frontline Intelligence

2025 → now
webAI · Product Lead

On‑device retrieval on tablets and laptops, built with our AI engineering team. Architecture decisions sit at the intersection of model selection, index design, and memory budgeting — fitting enterprise‑scale corpora into consumer hardware while holding cloud‑grade latency. Co‑developed with launch customers in aviation and other regulated industries, where data egress is impossible by policy or by physics.

Aurora — capacity forecasting at hyperscale

2023 → 2025
Microsoft · AI Product Manager

Capacity planning for the $4.5B Azure Energy portfolio. We built LLM and time‑series models that forecast data‑center load across 15+ global regions. The work unlocked $20M in build savings by surfacing risk‑managed overallocation (capacity we already had but couldn’t see) and reduced emergency provisioning events by ~60% while holding five‑nines (99.999%) availability.

ML platform unification

2021 → 2023
Meta · Product TPM, AI & ML Platforms

Led the consolidation of Meta’s feature engineering platform across the Integrity org (30+ enforcement teams). Built shared ML representations for 5 billion entities and deprecated a heavy legacy platform. Cut model development time by 20% and saved roughly $60M per quarter in human review capacity. Team of 15 ML engineers. Most of the value came from removing infrastructure, not adding it.

Writing

Selected essays

Long‑form on AI strategy, on‑device intelligence, and how AI products actually ship. Published in webAI’s internal Collective; available on request.

2025.12 · Mission & Culture

Sovereignty Is Architecture, Not a Feature

Every cloud vendor will slap “sovereign” on their product. But sovereignty isn’t a checkbox you bolt on — it’s the assumption that shapes every design decision you make. Either foundational, or theater.

2026.02 · Learnings

The Model Is Substrate

The model is just substrate — a commodity layer like electricity. If your product only gets better when the model maker ships a new checkpoint, you don’t have a product, you have a wrapper. The product is the Harness.

2026.03 · Tech Insights

The Capability Conveyor

Open‑source models lag the frontier by three to six months. Most see a constraint. I see a crystal ball: design for what frontier models do today, ship by the time local catches up.

2026.01 · Learnings

The Sledgehammer Fallacy

The future of enterprise AI isn’t trillion‑parameter god‑models running in someone else’s data center. It’s domain‑specific specialists, fine‑tuned and locally deployed — with orchestration as the real moat.

Shorter takes — on LinkedIn

Background

Career timeline

Education

MS, Computer Science (Data Science) — NC State University.
BE, Computer Science — BITS Pilani.

Certifications

AWS Solutions Architect Associate · AWS Certified Developer Pro · Azure Administrator

Contact

Let’s talk

Open to conversations on on‑device AI, retrieval at the edge, or anything I’ve written about above. Email is the fastest way to reach me.