Christopher Kosubinsky

#ai

  • Before It's Too Late to Ask

    March 25, 2026
    #ai #ai-safety

    Learning about AI progress and safety — come along.

  • Routing Inference Requests

    March 23, 2026
    #ai #distributed-systems

    Building a cache-aware router, pointing it at real GPUs, and measuring the tradeoff between latency and throughput.

  • Inside the GPU Server

    March 21, 2026
    #ai #distributed-systems

    Weight streaming, KV caches, and why inference routing is a different problem.

  • Designing an Inference Gateway

    February 18, 2026
    #ai #distributed-systems

    Tracing a request through the system that sits between you and the LLM.

© 2026 Christopher Kosubinsky. All rights reserved.