Co-authored with Stream

I went on a walk last evening and drafted this using Stream. Phone in my pocket, ring on my finger, Stream loaded with my context. I kept walking, kept thinking, without stopping.
When we started Sandbar, I thought we were building devices that connect your voice to a chatbot. Over time, I realized we were building something different: a conversational interface, a fundamentally different way of interacting with technology throughout daily life.
Every morning I wake up excited to work on three things.
1/ Keeping up with the speed of thinking
Voice interfaces live or die by latency. Not just signal processing and AI latency—end-to-end intent-to-insight latency. From the moment your finger taps the ring, you're racing through hardware, firmware, Bluetooth, mobile, socket, backend, models, and retrieval—with heavy parallel and async optimizations at every layer. Any one of those can break the feeling of fluency. For example, we don't use a wake gesture or wake word because either one can break your flow. Getting this right isn't a software problem or a hardware problem—it's a full-stack discipline that requires caring deeply about every layer simultaneously.
2/ Designing a conversational interface
Most voice interfaces are just screen interfaces with a microphone bolted on. But having rich interactions through conversation while your phone stays in your pocket requires making every turn meaningful and useful, without common primitives like scanning, skipping, or re-reading. That forces a completely different design discipline. The system needs deep memory to know what's already been established, and a UX that allows for fast back-and-forth—ideation, clarification, refinement—without losing the thread. The system can't just respond to what you asked; it has to model what you need next. When to answer directly, when to ask a clarifying question, when to compress and when to expand. We think about this as a set of new design primitives, not ported ones.
3/ An ambient interface that acts
The moments I find most valuable aren't the big queries—they're the small ones that happen in motion. A thought while walking that connects to a project. An idea while washing dishes. The problem is the gap between time of capture and time of use—the thought happens in one context, the work happens in another. But what if that gap didn't have to exist? A casual insight about your codebase, captured mid-dishwash, could trigger a full refactoring—agents picking up the thread with no gap, no context switching, no sitting down to start.
Most apps today are already adopting MCP or chat-based approaches, and the monolithic visual app is quietly being replaced by agents. The hard problem isn't the agents. It's everything around them: context, coordination, and knowing which one to call.
What we're building towards is a layer that sits underneath all of it. Because Stream already knows your context, your preferences, and your day, spinning up a personalized agent isn't a setup task; it's a single conversation. And orchestrating multiple agents doesn't require remembering names or switching interfaces. You just talk. Stream figures out who should pick up the thread.
One persistent model of you. Not an assistant you open—a layer that follows your day and acts on it.
∽
We're living in a world where technology is being fundamentally reshaped by a new computing paradigm built on language. I feel lucky to work on new problems at the edge. If this kind of work excites you, I'd love to talk. We're hiring across machine learning, software, mobile, product, and more.
- Kirak, cofounder & CTO, co-authored with Stream