2026W04
A few interesting articles I read over the past few days
This is the output of an automated process: every Sunday, a script retrieves the articles I've saved and read, uses AI to expand my quick notes into something more coherent, and publishes the result. This post is one of those automated write-ups.
- Automatic Programming — The distinction antirez makes here finally gave me language for something I’ve been fumbling with. It’s not about whether you use AI to write code—it’s about whether you’re steering or just prompting and hoping. His Redis example hit hard: the value wasn’t in technical novelty but in the contained vision. That maps to what I see working well versus the codebases that feel like they emerged from a chatbot fever dream.
- Email triage with an embedding-based classifier — This outperformed a fine-tuned GPT by 11 percentage points while being dramatically faster. The separation of concerns makes sense: embeddings handle “understand the email” while logistic regression handles “what does this user care about.” People keep defaulting to LLMs when something simpler would work better. Worth remembering that the expensive part doesn’t need to run every time. (I sketched the two-stage split at the end of this post.)
- Efficient String Compression for Modern Database Systems — The insight that compression is primarily about query performance, not storage, reframes the whole tradeoff. Getting data to fit in L1 cache (~1 ns access) versus RAM fundamentally changes what operations cost. FSST’s approach of building a symbol table from sample data feels like the kind of clever-but-not-too-clever technique that actually ships. (There’s a toy version of the symbol-table idea at the end of this post.)
- I made my own git — “Git is just a content-addressable file store” is one of those realizations that makes everything else click. What stuck with me is that parsing was harder than the actual version control logic. We treat Git like it’s complicated, but the core idea is almost trivial—it’s the interface that makes it feel like a black box. (Blob storage fits in a few lines of Python; see the sketch at the end of this post.)
- Online, Asynchronous Schema Change in F1 — The intermediate states approach is elegant: you can’t jump from no-index to index safely, but you can chain compatible transitions. Delete-only and write-only states let nodes migrate without corrupting data. This feels like the kind of solution that’s obvious after you see it but probably took years to figure out. Makes me think about what other distributed systems problems have similar chain-of-compatibility solutions. (I wrote out the state chain as a tiny example at the end of this post.)
- Why Senior Engineers Let Bad Projects Fail — “Being right and being effective are different” cuts through so much noise. The credibility-as-currency framing explains behavior I’ve seen but couldn’t articulate. You don’t get credit for disasters you prevent, only for the battles you pick and win. Still processing whether this is pragmatic wisdom or just resignation to broken systems.
- Slop is Everywhere For Those With Eyes to See — The 90-9-1 rule creates a structural problem: platforms need infinite content but only 1-3% of users create anything. Algorithms fill that gap with slop because engagement matters more than quality. The behavioral science point about effort and meaning landed—when everything is effortless to access, nothing feels valuable. I’ve been noticing this with technical content too, not just social media.
- How I estimate work — “Only the known work can be accurately estimated, but unknown work takes 90% of the time” explains why estimation always feels broken. The reframe that estimates are political negotiation tools, not technical predictions, matches every project I’ve seen. Managers arrive with timelines, engineers figure out what fits. Treating it as a prediction problem sets everyone up for disappointment.
- Scaling PostgreSQL to power 800 million ChatGPT users | OpenAI — The challenges they describe—connection pooling, read replica lag, vacuum tuning, lock contention—are exactly what you hit at high throughput. Nothing novel but it’s validating to see that even at ChatGPT scale, you’re fighting the same PostgreSQL battles. Sometimes the answer to “how do they do it?” is just “they do the same things, but more carefully.” (A quick replica-lag check is at the end of this post.)
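A few quick sketches for the more mechanical items above. First, the email-triage split, assuming sentence-transformers for the embeddings and scikit-learn for the per-user classifier; the labels, data, and model choice below are mine, not the article's:

```python
# Toy version of the split: an embedding model does "understand the email",
# a tiny per-user classifier does "what does this user care about".
# Assumption: sentence-transformers + scikit-learn stand in for whatever the
# article actually used.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Hypothetical history of what one user did with past emails.
emails = [
    "Your January invoice is attached",
    "Quick question about the design doc",
    "50% off everything this weekend only",
    "Standup moved to 10am tomorrow",
]
actions = ["archive", "respond", "archive", "respond"]

# Expensive part, runs once per email: turn text into a fixed-size vector.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
X = embedder.encode(emails)

# Cheap part, retrainable per user in milliseconds: plain logistic regression.
clf = LogisticRegression(max_iter=1000).fit(X, actions)

print(clf.predict(embedder.encode(["Can you review my PR today?"])))
```

The cost structure is the point: the embedding call is the slow bit and runs once per email, while the per-user model retrains cheaply whenever someone's behaviour shifts.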
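For the FSST piece, a deliberately dumbed-down version of "build a symbol table from a sample." Real FSST learns up to 255 variable-length symbols through an iterative greedy pass and decompresses with branch-free table lookups; this toy fixes the symbol length, skips decompression, and only exists to show the shape of the idea:

```python
# NOT real FSST -- just the shape of it: learn a small symbol table from sample
# strings, then replace frequent substrings with one-byte codes (ASCII-only toy).
from collections import Counter

def build_symbol_table(sample, max_symbols=16, sym_len=4):
    """Pick the most frequent fixed-length substrings seen in the sample."""
    counts = Counter()
    for s in sample:
        for i in range(len(s) - sym_len + 1):
            counts[s[i:i + sym_len]] += 1
    return [sym for sym, _ in counts.most_common(max_symbols)]

def compress(s, table):
    """Greedy left-to-right: emit a code byte on a table hit, escape byte + literal otherwise."""
    out, i = bytearray(), 0
    while i < len(s):
        for code, sym in enumerate(table):
            if s.startswith(sym, i):
                out.append(code)
                i += len(sym)
                break
        else:
            out += bytes([255]) + s[i].encode()  # 255 = escape marker
            i += 1
    return bytes(out)

urls = ["http://example.com/page", "http://example.org/page", "http://example.net/home"]
table = build_symbol_table(urls)
encoded = compress(urls[0], table)
print(len(urls[0]), "chars ->", len(encoded), "bytes")
```

Decompression is just a table lookup per code byte, which is where the real thing gets its speed, and keeping the symbol table small enough to stay cache-resident is exactly the L1-versus-RAM point above.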
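The git one is easy to make concrete. This is how git actually stores a loose blob (SHA-1 over a header of "blob", the byte length, and a NUL, plus the content, zlib-compressed, written under .git/objects/); the function names and usage lines are mine:

```python
# How git stores a blob: SHA-1 over b"blob <size>\x00" + content, zlib-compress,
# write to .git/objects/<first two hex chars>/<rest>. Equivalent to
# `git hash-object -w` for blobs; trees, commits, and index parsing are the
# fiddly parts the article's author ran into.
import hashlib
import os
import zlib

def hash_object(data: bytes, repo=".git") -> str:
    obj = b"blob " + str(len(data)).encode() + b"\x00" + data
    sha = hashlib.sha1(obj).hexdigest()
    path = os.path.join(repo, "objects", sha[:2], sha[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(zlib.compress(obj))
    return sha

def cat_file(sha: str, repo=".git") -> bytes:
    path = os.path.join(repo, "objects", sha[:2], sha[2:])
    with open(path, "rb") as f:
        obj = zlib.decompress(f.read())
    return obj.split(b"\x00", 1)[1]  # drop the "blob <size>" header

# sha = hash_object(b"hello world\n")  # same id `git hash-object -w` would print
# print(cat_file(sha))
```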
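And the F1 state chain, written out as a toy just to make the compatibility argument concrete. The state names come from the paper; the code structure is purely illustrative:

```python
# Toy rendering of F1's intermediate states for adding an index (not their code).
# Servers may be one state apart at any time, and adjacent states are defined so
# that any mix of two neighbours keeps the index consistent with the data.
from enum import Enum

class IndexState(Enum):
    ABSENT = 0       # no server knows about the index
    DELETE_ONLY = 1  # index entries may be removed (deletes, old values on update) but never added
    WRITE_ONLY = 2   # all writes maintain index entries, reads still ignore the index
    PUBLIC = 3       # fully usable; the backfill of existing rows happens before this step

def removes_entries(state): return state.value >= IndexState.DELETE_ONLY.value
def adds_entries(state):    return state.value >= IndexState.WRITE_ONLY.value
def readable(state):        return state is IndexState.PUBLIC

# Why ABSENT -> PUBLIC in one hop is unsafe: a server still on ABSENT inserts rows
# without index entries, and a PUBLIC server then serves reads that trust the index.
# Stepping one state at a time keeps every adjacent pair of states compatible.
for state in IndexState:
    print(f"{state.name:<12} removes={removes_entries(state)} "
          f"adds={adds_entries(state)} readable={readable(state)}")
```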
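Finally, nothing deep for the PostgreSQL post, but since read-replica lag came up: the standard way to eyeball it on a streaming replica, assuming psycopg2 and a read-only DSN (at their scale this is obviously dashboards and alerting, not a script):

```python
# Seconds since the replica last replayed a transaction. Caveats: this is NULL on
# a primary, and it over-reports "lag" when the primary is simply idle.
import psycopg2

def replica_lag_seconds(dsn: str):
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(
            "SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())"
        )
        value = cur.fetchone()[0]
        return None if value is None else float(value)

# print(replica_lag_seconds("host=replica1 dbname=app user=readonly"))  # hypothetical DSN
```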