Privacy is becoming a luxury in the AI world. If you’re tired of sending your data to the cloud every time you ask a question, running a model locally is the answer. Today, we’re looking at Qwen 3.5 9B—a powerhouse model from Alibaba—and how to get it running on your own machine using Ollama. Whether you’re a […]
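Once Ollama is running, talking to a local model is just an HTTP POST to its REST API on the default port 11434. A minimal sketch in Python (the model tag `qwen3.5:9b` is a guess for illustration; check `ollama list` for the exact tag on your install):

```python
import json
from urllib import request

# Default endpoint for a local Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt, model="qwen3.5:9b"):
    # NOTE: "qwen3.5:9b" is an assumed tag name, not verified here.
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a token stream
    }
    return json.dumps(payload).encode("utf-8")

def ask(prompt):
    # Sends the prompt to the local server; nothing leaves your machine.
    req = request.Request(
        OLLAMA_URL,
        data=build_request(prompt),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The `stream: false` flag trades responsiveness for simplicity: you get the whole completion back in one JSON body rather than parsing a stream of partial chunks.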


“Shock! Shock!” That’s how Professor Donald Knuth—the man who literally wrote the bible of computer science, The Art of Computer Programming—opened his latest paper. He wasn’t reacting to a scandal or a bug. He was reacting to a mathematical breakthrough. Specifically, a breakthrough made by an AI. For weeks, Knuth had been wrestling with an open […]


A format designed for bloggers in 2004 now sits at the center of how AI systems read, write, and think. If you work with LLMs at all, you’ve probably noticed something: Markdown is everywhere. Ask Claude a question, you get Markdown back. Ask GPT-4, same thing. Feed a […]


LLMs are stateless. Agents aren’t. Here’s what sits in between. LLMs are stateless by design. Each API call is independent — the model has no mechanism to remember what happened in a previous request. But somehow, the agents built on top of these models maintain context across long conversations, recall user […]
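The gap the teaser describes can be shown in a few lines. In this toy sketch, `fake_model` is a stand-in for an LLM API (not any real library): it has no memory across calls, so the agent layer must hold the conversation history itself and resend all of it every time.

```python
def fake_model(messages):
    # Stateless stand-in for an LLM endpoint: it sees only the
    # messages passed into THIS call, nothing from earlier calls.
    seen = sum(1 for m in messages if m["role"] == "user")
    return {"role": "assistant", "content": f"I can see {seen} user message(s)."}

class Agent:
    def __init__(self):
        self.history = []  # the state lives here, not in the model

    def send(self, text):
        self.history.append({"role": "user", "content": text})
        reply = fake_model(self.history)  # full history on every call
        self.history.append(reply)
        return reply["content"]
```

Calling `send` twice makes the point: the second reply reports two user messages only because the agent replayed the first one, not because the model remembered it.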


If you’ve ever tried to train a machine learning model, or just wondered why your computer fans start screaming when you open too many Chrome tabs, you’ve probably run into the alphabet soup of processors: CPU, GPU, and TPU. They all “process” things, but they do it in fundamentally different ways. Choosing the […]
