AI Operations

I work with AI tools daily — not as a researcher, but as a practitioner who deploys, optimizes, and automates with them. I bring the same infrastructure discipline to AI operations that I apply to servers and clusters: measure everything, automate what repeats, and cut waste ruthlessly.

What I Do

LLM Token Optimization

Large language models are expensive to run. Every conversation round-trips the full context — a 10-turn session on a 7,000-line codebase can burn nearly a million tokens, most of them redundant.
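The back-of-envelope math behind that number, assuming roughly 13 tokens per line of code (an assumption; real tokenization varies with language and code density):

```python
# Cost model for re-sending the full context on every turn.
lines = 7_000
tokens_per_line = 13              # rough assumption; varies by language
context = lines * tokens_per_line # ~91k input tokens per request
turns = 10
total_input = turns * context     # the context is resent each turn
print(f"{total_input:,} input tokens")  # 910,000 (nearly a million)
```

The per-turn context barely changes, which is exactly why most of those tokens are redundant.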

I use Matryoshka RLM to cut that by up to 80%. Instead of feeding entire documents into the LLM context, Matryoshka treats them as external datasets queried through a declarative language. Results bind to server-side variables — the model receives compact pointers, not full content. The savings are real and measurable.
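The pattern can be sketched in a few lines of Python. This is illustrative only: the class and method names below are hypothetical, not Matryoshka RLM's actual query language. The idea is that documents stay server-side, queries return compact pointers, and full text is dereferenced only when actually needed.

```python
class DocStore:
    """Server-side document store: the model sees handles, not content."""

    def __init__(self):
        self._docs = {}

    def load(self, name, text):
        """Register a document under a variable name, kept server-side."""
        self._docs[name] = text.splitlines()

    def query(self, name, needle):
        """Return a compact pointer (matching line numbers), a few tokens
        instead of thousands."""
        hits = [i for i, ln in enumerate(self._docs[name], 1) if needle in ln]
        return {"doc": name, "lines": hits}

    def fetch(self, pointer, pad=2):
        """Dereference a pointer into a small excerpt with `pad` lines of
        surrounding context, only when the model needs the actual text."""
        lines = self._docs[pointer["doc"]]
        out = []
        for i in pointer["lines"]:
            lo, hi = max(0, i - 1 - pad), min(len(lines), i + pad)
            out.extend(lines[lo:hi])
        return "\n".join(out)
```

The design point is that the expensive object (the document) never enters the model's context window; only the cheap object (the pointer) does.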

This matters for any organization running LLM workloads at scale. I can set it up, benchmark it against your current usage, and show you the numbers.

AI-Assisted Workflows

Daily Claude Code usage for infrastructure management, code review, documentation, and automation. Not toy prompts — production work: deploying Kubernetes services, debugging mail delivery, writing monitoring alert rules, managing DNS zones.

I've built and archived over 1,000 AI conversations across Claude, DeepSeek, and Mistral. That volume teaches you what works and what doesn't — how to prompt effectively, when to use which model, and where AI falls short.

Speech-to-Text Pipeline

Automated voicemail transcription using whisper.cpp running locally on private infrastructure. IMAP polling retrieves audio, whisper.cpp transcribes it, and ntfy delivers push notifications with the transcript. No cloud APIs, no per-minute billing, no data leaving the server.

Multi-Model Access

CLI-based access to 26+ models via OpenRouter, with shortcuts for quick queries, data analysis via piped stdin, and model selection by task. Different models for different jobs — not every question needs the most expensive answer.
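A minimal stdin-aware client for this looks like the following, built against OpenRouter's OpenAI-compatible chat completions endpoint. The model slug in the usage comment is just an example, and `OPENROUTER_API_KEY` must be set in the environment:

```python
import json
import os
import sys
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"  # OpenAI-compatible

def build_messages(prompt, piped=""):
    """Fold piped stdin (logs, CSVs, diffs) into a single user message."""
    return [{"role": "user", "content": f"{prompt}\n\n{piped}".strip()}]

def ask(model, prompt, piped=""):
    """One-shot completion against the chosen model."""
    body = json.dumps({"model": model,
                       "messages": build_messages(prompt, piped)}).encode()
    req = urllib.request.Request(
        API_URL, data=body,
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
                 "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__" and len(sys.argv) >= 3:
    # Usage:  df -h | python ask.py mistralai/mistral-small "fullest filesystem?"
    piped = "" if sys.stdin.isatty() else sys.stdin.read()
    print(ask(sys.argv[1], sys.argv[2], piped))
```

Because the endpoint takes the model as a request parameter, routing a cheap query to a cheap model is a one-argument change rather than a different tool.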

What I Bring to a Team

Most people who understand AI can't deploy infrastructure. Most people who manage infrastructure haven't used AI tools in production. I do both, every day.

  • Practitioner first — I use these tools to do real work, not demos
  • Cost-conscious — I've optimized my own token spend and can do the same for yours
  • Infrastructure-native — I deploy AI tools on infrastructure I build and maintain
  • Can train others — seven years of teaching means I can bring your team up to speed, but I'd rather be building

Tools & Platforms

  • Claude Code
  • Claude API
  • Matryoshka RLM
  • whisper.cpp
  • OpenRouter
  • MCP Servers
  • Prompt Engineering
  • RLHF Evaluation
  • AI Cost Optimization
  • Multi-model Workflows
  • Playwright (AI data export)
  • tree-sitter