Musings

6 entries

Half-formed thoughts and working notes from the bench — applied AI, real-time infrastructure, and the craft of building across cultures.

2026.05.18 BUILDING

Cutting voice-agent latency below 300ms at the edge

A field-tested latency budget for real-time voice agents: where the 300ms goes, which hops you can delete outright, and how to make a slow LLM feel immediate by streaming the first phoneme before the sentence is done.

1 MIN

2026.05.04 THINKING

Designing for two reading directions without a redesign

A layout that respects more than one cultural reading order is not a translation pass — it is a constraint you carry from the first wireframe.

1 MIN

2026.04.21 BUILDING

Streaming structured output from LLMs with backpressure

Parsing half-formed JSON as it arrives, without letting a fast model overrun a slow client. A small state machine that has earned its keep.

1 MIN

2026.04.02 THINKING

A seal, a signature, and the shape of a good API

A personal seal is a contract pressed into a single mark. The best API surfaces aim for the same: small, deliberate, impossible to forge by accident.

1 MIN

2026.03.15 LEARNING

Postgres as a search engine: how far can you push it?

Before reaching for a dedicated search cluster, I ran a real workload against Postgres full-text, trigram, and vector search. It went further than expected.

1 MIN

2026.03.01 LEARNING

Edge caching strategies for personalized feeds

Personalization and caching are supposed to be enemies. With a layered key strategy at the edge, they can be made to cooperate.

1 MIN