The Oldest New Release You’ll See This Year
NetHack is older than the World Wide Web. It predates Windows, the commercial internet, and most of the people currently writing AI benchmarks. And on May 2, 2026, it shipped version 5.0.0 — a major release that the development team has been building toward for years. A game that refuses to die just got its most significant update in decades.
That tension is worth sitting with for a moment. We live in a world where AI agents are being trained on codebases, where LLMs are evaluated on their ability to play games, and where “agentic reasoning” is the phrase du jour in every product announcement. And the game that researchers keep reaching for when they want to test whether an AI can actually think — not just pattern-match — just got a serious architectural overhaul.
What Actually Changed
NetHack 5.0.0 is a direct descendant of NetHack 3.6, which tells you something about how the DevTeam thinks about versioning. This isn’t a cosmetic refresh. The release notes point to a few changes that matter more than they might appear on the surface:
- The source code is now compliant with the C99 standard, bringing a codebase that dates back to the 1980s into alignment with a standard that, yes, is itself over two decades old — but represents a real modernization of the underlying architecture.
- New accessibility features are included, with the team noting that the website is beginning a longer journey toward broader accessibility functionality.
- The jump from 3.6 to 5.0.0 signals that this is not a minor patch. The DevTeam skipped an entire major version number, which in a project this conservative is a loud statement.
C99 compliance might sound like housekeeping to anyone outside systems programming, but for a project like NetHack it’s foundational. It means the code can be compiled more reliably across modern toolchains, that contributors working with contemporary development environments have fewer friction points, and that the game’s internals are easier to reason about — which matters enormously if you’re building an agent that needs to interact with it programmatically.
Why AI Practitioners Should Pay Attention
Here’s my angle, and it’s specific to what we cover at clawgo.net: NetHack is one of the most demanding environments ever used to evaluate AI agent behavior. The game’s state space is enormous. It requires long-horizon planning, inventory management, risk assessment under uncertainty, and the ability to recover from catastrophic failure. Researchers at Facebook AI (now Meta AI) built the NetHack Learning Environment precisely because the game is so hard that it exposes the limits of current reinforcement learning approaches in ways that simpler benchmarks don’t.
A cleaner, more accessible codebase means that environment gets easier to build on. If you’re maintaining an agent framework that uses NetHack as a testbed, C99 compliance reduces the surface area of weird compiler behavior. If you’re a researcher who wants to extend the game’s mechanics to create new evaluation scenarios, a modernized architecture gives you more solid footing to work from.
The accessibility improvements are also worth tracking from an agent design perspective. Features that make a game more navigable for human players with different needs often translate directly into cleaner interfaces for programmatic agents. Structured output, better state representation, reduced reliance on visual-only cues — these are the kinds of changes that make a game more useful as an agent environment, not just more playable for humans.
The Quiet Persistence of Hard Problems
What I find genuinely interesting about this release is what it says about the relationship between old software and new AI research. The tools that end up being most useful for testing agent intelligence aren’t always the ones built for that purpose. NetHack wasn’t designed as an AI benchmark. It was designed to be brutally difficult and endlessly replayable for human players. That’s exactly why it works so well as one.
The DevTeam has been maintaining this project across decades, through multiple shifts in how people think about software, games, and computing. They didn’t chase trends. They kept the game hard, kept the rules consistent, and kept shipping. Now, with 5.0.0, they’ve done the architectural work to make sure the project can survive another generation of contributors and use cases — including ones they almost certainly didn’t anticipate.
For anyone building or evaluating AI agents, that kind of long-term thinking is a model worth studying. The dungeon is still there. It’s still trying to kill you. And now it’s a little easier to build the agent that might finally survive it.