AI Operations

I work with AI tools daily — not as a researcher, but as a practitioner who deploys, optimizes, and automates with them. I bring the same infrastructure discipline to AI operations that I apply to servers and clusters: measure everything, automate what repeats, and cut waste ruthlessly.

What I Do

LLM Token Optimization

Large language models are expensive to run. Every conversation round-trips the full context — a 10-turn session on a 7,000-line codebase can burn nearly a million tokens, most of them redundant.
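The back-of-envelope math behind that number, assuming roughly 13 tokens per line of code (an assumption; real tokenization varies with language and code density):

```python
# Cost model for re-sending the full context on every turn.
lines = 7_000
tokens_per_line = 13              # rough assumption; varies by language
context = lines * tokens_per_line # ~91k input tokens per request
turns = 10
total_input = turns * context     # the context is resent each turn
print(f"{total_input:,} input tokens")  # 910,000 (nearly a million)
```

The per-turn context barely changes, which is exactly why most of those tokens are redundant.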

I use Matryoshka RLM to cut that by up to 80%. Instead of feeding entire documents into the LLM context, Matryoshka treats them as external datasets queried through a declarative language. Results bind to server-side variables — the model receives compact pointers, not full content. The savings are real and measurable.
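The pattern can be sketched in a few lines of Python. This is illustrative only: the class and method names below are hypothetical, not Matryoshka RLM's actual query language. The idea is that documents stay server-side, queries return compact pointers, and full text is dereferenced only when actually needed.

```python
class DocStore:
    """Server-side document store: the model sees handles, not content."""

    def __init__(self):
        self._docs = {}

    def load(self, name, text):
        """Register a document under a variable name, kept server-side."""
        self._docs[name] = text.splitlines()

    def query(self, name, needle):
        """Return a compact pointer (matching line numbers), a few tokens
        instead of thousands."""
        hits = [i for i, ln in enumerate(self._docs[name], 1) if needle in ln]
        return {"doc": name, "lines": hits}

    def fetch(self, pointer, pad=2):
        """Dereference a pointer into a small excerpt with `pad` lines of
        surrounding context, only when the model needs the actual text."""
        lines = self._docs[pointer["doc"]]
        out = []
        for i in pointer["lines"]:
            lo, hi = max(0, i - 1 - pad), min(len(lines), i + pad)
            out.extend(lines[lo:hi])
        return "\n".join(out)
```

The design point is that the expensive object (the document) never enters the model's context window; only the cheap object (the pointer) does.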

This matters for any organization running LLM workloads at scale. I can set it up, benchmark it against your current usage, and show you the numbers.

AI-Assisted Workflows

Daily Claude Code usage for infrastructure management, code review, documentation, and automation. Not toy prompts — production work: deploying Kubernetes services, debugging mail delivery, writing monitoring alert rules, managing DNS zones.

I've built and archived over 1,000 AI conversations across Claude, DeepSeek, and Mistral. That volume teaches you what works and what doesn't — how to prompt effectively, when to use which model, and where AI falls short.

Speech-to-Text Pipeline

Automated voicemail transcription using whisper.cpp running locally on private infrastructure. IMAP polling retrieves audio, whisper.cpp transcribes it, and ntfy delivers push notifications with the transcript. No cloud APIs, no per-minute billing, no data leaving the server.

Multi-Model Access

CLI-based access to 26+ models via OpenRouter, with shortcuts for quick queries, data analysis via piped stdin, and model selection by task. Different models for different jobs — not every question needs the most expensive answer.
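A minimal stdin-aware client for this looks like the following, built against OpenRouter's OpenAI-compatible chat completions endpoint. The model slug in the usage comment is just an example, and `OPENROUTER_API_KEY` must be set in the environment:

```python
import json
import os
import sys
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"  # OpenAI-compatible

def build_messages(prompt, piped=""):
    """Fold piped stdin (logs, CSVs, diffs) into a single user message."""
    return [{"role": "user", "content": f"{prompt}\n\n{piped}".strip()}]

def ask(model, prompt, piped=""):
    """One-shot completion against the chosen model."""
    body = json.dumps({"model": model,
                       "messages": build_messages(prompt, piped)}).encode()
    req = urllib.request.Request(
        API_URL, data=body,
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
                 "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__" and len(sys.argv) >= 3:
    # Usage:  df -h | python ask.py mistralai/mistral-small "fullest filesystem?"
    piped = "" if sys.stdin.isatty() else sys.stdin.read()
    print(ask(sys.argv[1], sys.argv[2], piped))
```

Because the endpoint takes the model as a request parameter, routing a cheap query to a cheap model is a one-argument change rather than a different tool.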

What I Bring to a Team

Most people who understand AI can't deploy infrastructure. Most people who manage infrastructure haven't used AI tools in production. I do both, every day.

  • Practitioner first — I use these tools to do real work, not demos
  • Cost-conscious — I've optimized my own token spend and can do the same for yours
  • Infrastructure-native — I deploy AI tools on infrastructure I build and maintain
  • Can train others — seven years of teaching means I can bring your team up to speed, but I'd rather be building

Tools & Platforms

  • Claude Code
  • Claude API
  • Matryoshka RLM
  • whisper.cpp
  • OpenRouter
  • MCP Servers
  • Prompt Engineering
  • RLHF Evaluation
  • AI Cost Optimization
  • Multi-model Workflows
  • Playwright (AI data export)
  • tree-sitter