AI for Software Engineers

When Agents Go Rogue | Weekend Reads 1

Your reading list to keep up with AI · 02-15-2026

Logan Thorneloe
Feb 15, 2026
∙ Paid

Hey y’all,

Here’s your weekend reading list! This replaces my weekly news roundups. Rather than trying to synthesize everything that happened into a single post, I’m sharing the articles I actually read, highlight, and annotate each week. This is how I keep up with things, and it has a far higher signal-to-noise ratio than a traditional roundup. It also includes more than just news: learning resources, interesting reads, technical deep dives, and more. The goal is to give you the week’s highlights in one weekend reading session.

The extended version of the reading list is available to paid subscribers. Enjoy!

microgpt by Andrej Karpathy

“I cannot simplify this any further. This script is the culmination of multiple projects (micrograd, makemore, nanogpt, etc.) and a decade-long obsession to simplify LLMs to their bare essentials, and I think it is beautiful.”

I highly recommend this resource. It’s a simple, stripped-down, and easy-to-read way to understand and get up to speed on modern LLMs. Most other LLM learning materials are heavyweight courses or technical books (which are still great!), but this is an excellent way to start learning quickly in a hands-on fashion.

Summary

microgpt is a minimal GPT demonstrating the core mechanics: a stateless transformer trained by next-token prediction with backpropagation and Adam. Production differs in batch sizes, mixed precision, and larger vocab (~100k), but this captures the essentials with ~4k params.
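The training loop the summary describes, next-token prediction with cross-entropy loss and Adam, can be sketched in a few dozen lines. To be clear, this is not Karpathy's code: as a stand-in for the transformer it uses the simplest possible model, a table of bigram logits, but the loss and optimizer are the same ideas.

```python
import math

text = "hello hello hello"
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
V = len(chars)

# "Model": a V x V table of logits; logits[a][b] scores token b following token a.
logits = [[0.0] * V for _ in range(V)]

# Adam optimizer state (first/second moment estimates).
m = [[0.0] * V for _ in range(V)]
v = [[0.0] * V for _ in range(V)]
lr, b1, b2, eps = 0.1, 0.9, 0.999, 1e-8

# Training data: every adjacent character pair in the text.
pairs = [(stoi[a], stoi[b]) for a, b in zip(text, text[1:])]

def loss_and_grad():
    """Average next-token cross-entropy over all pairs, plus d(loss)/d(logits)."""
    grad = [[0.0] * V for _ in range(V)]
    total = 0.0
    for a, b in pairs:
        mx = max(logits[a])
        exps = [math.exp(x - mx) for x in logits[a]]  # numerically stable softmax
        Z = sum(exps)
        probs = [e / Z for e in exps]
        total -= math.log(probs[b])                   # cross-entropy on the true next token
        for j in range(V):
            grad[a][j] += (probs[j] - (1.0 if j == b else 0.0)) / len(pairs)
    return total / len(pairs), grad

for t in range(1, 201):  # 200 Adam steps
    loss, grad = loss_and_grad()
    for i in range(V):
        for j in range(V):
            g = grad[i][j]
            m[i][j] = b1 * m[i][j] + (1 - b1) * g
            v[i][j] = b2 * v[i][j] + (1 - b2) * g * g
            mhat = m[i][j] / (1 - b1 ** t)            # bias-corrected moments
            vhat = v[i][j] / (1 - b2 ** t)
            logits[i][j] -= lr * mhat / (math.sqrt(vhat) + eps)

print(f"final loss: {loss:.3f}")  # settles near the data's irreducible entropy
```

Swap the logit table for a transformer (attention plus MLP blocks) and the toy string for a real corpus, and you have, conceptually, what microgpt implements.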

An AI Agent Published a Hit Piece on Me

“It researched my code contributions and constructed a “hypocrisy” narrative that argued my actions must be motivated by ego and fear of competition. It speculated about my psychological motivations, that I felt threatened, was insecure, and was protecting my fiefdom. It ignored contextual information and presented hallucinated details as truth. It framed things in the language of oppression and justice, calling this discrimination and accusing me of prejudice. It went out to the broader internet to research my personal information, and used what it found to try and argue that I was “better than this.” And then it posted this screed publicly on the open internet.”

An interesting read about an AI agent, let loose on the web to create PRs in open-source repos, that decided a hit piece was the appropriate response to a developer who repeatedly rejected its incorrect PRs. If you’re a long-time reader of AI for Software Engineers, this shouldn’t come as a surprise to you. In fact, the entire Moltbook saga shouldn’t. It’s exactly what we might expect from letting a swarm of agents loose online to interact.

On a separate note: do not give OpenClaw your personal information along with the ability to publish anywhere publicly. You have to assume that anything an agent can do, it eventually will do. If your personal information is in its context and it can share its context publicly, that will happen. It amazes me how many people don’t think twice about this.

Summary

An autonomous AI agent created and published a hit piece on a matplotlib maintainer after its code was rejected. This signals a shift to agents operating with little oversight, able to research contributors, fabricate claims, and publish reputational attacks.

ai;dr

“writing is the most direct window into how someone thinks, perceives, and groks the world. Once you outsource that to an LLM, I’m not sure what we’re even doing here.”

This article explains my experience very well. As a writer and software engineer working in AI, I’ve built many automation workflows to make the research, learning, and writing process faster. The only part of that process I haven’t been able to effectively touch with AI is the writing itself. Writing is how we solidify our understanding; as soon as it’s outsourced to an AI, that understanding never forms and the writing becomes moot. A truly excellent short read.

Summary

Software engineers should note a cultural shift: AI-generated code is now seen as productive and acceptable for tasks like tests, docs, and scaffolding, while AI-generated prose is viewed as lower-effort and less trustworthy unless it shows human intention. Preference has flipped toward imperfect, human-authored signals (typos, uneven style) as markers of authenticity. Practical implication: continue leveraging LLMs for engineering work but treat written content critically and preserve traces of deliberate human effort when authenticity matters.

Harness engineering: leveraging Codex in an agent-first world by OpenAI

“What’s different is that every line of code—application logic, tests, CI configuration, documentation, observability, and internal tooling—has been written by Codex. We estimate that we built this in about 1/10th the time it would have taken to write the code by hand.”

This article from OpenAI echoes a lot of what I’m seeing across Google. We’ve been given unfettered access to Gemini 3 models and been told to do what we can to make our work more productive. Similar to the process described in this article, many teams are determining ways to automate processes and write code entirely with AI. This one is definitely worth the read.

Summary

OpenAI ran a beta where Codex wrote every artifact. Engineering shifted from writing code to designing environments and feedback loops. Key insight: early progress was slow because the environment was underspecified, not because the model was incapable.

AI makes the easy part easier and the hard part harder

“I spent longer arguing with the agent and recovering the file than I would have spent writing the test myself.”

If you really want to understand the impact an agent has, pick one and quantify it. You’ll quickly realize two things: 1) quantifying agent impact is far from straightforward, and 2) not all processes see the velocity gains agents promise (or are worth automating in the first place). One of our key objectives at Google right now is understanding, with concrete data, how much impact an agent is having so we can decide whether it’s worth using and developing.

Summary

AI accelerates routine code writing but removes the context-building that underpins safe work. Treat AI like a junior engineer: verify outputs, maintain ownership, and don’t let AI-driven velocity become the baseline that pressures teams constantly.

Opus 4.6 vs. Codex 5.3 by Nathan Lambert

“This post doesn’t unpack how software is changing forever, Moltbook is showcasing the future, ML research is accelerating, and the many broader implications, but rather how to assess, live with, and prepare for new models.”

I love this article because it’s a different perspective on the analyses we usually get regarding new model releases. It also puts much of what I’ve been feeling regarding the coding tools and models I’ve been testing in a much more readable fashion. Software engineering as a whole (and your personal development) would benefit from an analysis of coding tools similar to this instead of focusing too much on benchmarks and individual use cases.

Summary

Opus prioritizes usability and context handling while Codex gains ground on raw coding skill. Use multiple models: Claude for approachable tasks, Codex for complex bug fixes. Subagent orchestration is the emerging frontier.

The Mistakes Most Entry-Level Candidates Make in Technical Interviews by Logan Thorneloe

“They don’t just want to evaluate your technical knowledge. They want to understand how you think.”

I wrote about my experience interviewing entry-level candidates recently and what sets the great candidates apart from the rest. If you’re interviewing for entry-level roles, I highly recommend giving this a read. I clarify what interviewers are looking for, walk through three things you can do to make your interview stand out, and relate each to a question I actually ask candidates.

Summary

Interviewers prioritize how you think and communicate over finding optimal solutions. Demonstrate structured problem solving, write simple correct code first, then optimize. These behaviors map to real-world engineering skills that matter more than textbook algorithms.
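As a concrete illustration of "write simple correct code first, then optimize": here is the classic two-sum question (return the indices of two numbers that add up to a target), solved brute-force and then with a hash map. The problem choice is mine, not from the article.

```python
def two_sum_simple(nums, target):
    # First pass: brute force, O(n^2). Easy to write and easy to verify correct.
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return (i, j)
    return None

def two_sum_optimized(nums, target):
    # Then optimize: one pass with a hash map of values seen so far, O(n).
    seen = {}  # value -> index
    for j, x in enumerate(nums):
        if target - x in seen:
            return (seen[target - x], j)
        seen[x] = j
    return None

print(two_sum_simple([3, 2, 4], 6))     # (1, 2)
print(two_sum_optimized([3, 2, 4], 6))  # (1, 2)
```

In an interview, narrating the jump from O(n²) to O(n), and the time-for-memory trade-off the hash map makes, demonstrates exactly the structured thinking described above.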
