<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
    <channel>
      <title>In Rust We Trust 🦀 - factorio</title>
      <link>https://nikolaishelekhov.com</link>
      <description>Nikolai is a Rust software engineer building high-performance systems, backend and blockchain infrastructure.</description>
      <generator>Zola</generator>
      <language>en</language>
      <atom:link href="https://nikolaishelekhov.com/tags/factorio/rss.xml" rel="self" type="application/rss+xml"/>
      <lastBuildDate>Sun, 15 Feb 2026 00:00:00 +0000</lastBuildDate>
      <item>
          <title>Building an AI Coach That Sees Your Factorio Factory</title>
          <pubDate>Sun, 15 Feb 2026 00:00:00 +0000</pubDate>
          <author>Nikolai Shelekhov</author>
          <link>https://nikolaishelekhov.com/blog/building-factorio-sensei/</link>
          <guid>https://nikolaishelekhov.com/blog/building-factorio-sensei/</guid>
          <description xml:base="https://nikolaishelekhov.com/blog/building-factorio-sensei/">&lt;p&gt;&lt;img src=&quot;&#x2F;img&#x2F;factorio-sensei&#x2F;factorio-sensei.jpg&quot; alt=&quot;Factorio Sensei&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;I got tired of alt-tabbing to ask Claude about my Factorio factory.&lt;&#x2F;p&gt;
&lt;p&gt;Every time I wanted advice, I had to describe my setup manually. “I have 4 furnaces, 2 steam engines, researching automation…” — and half the time I’d forget something important or get the numbers wrong.&lt;&#x2F;p&gt;
&lt;p&gt;So I built a tool that lets Claude see the game directly.&lt;&#x2F;p&gt;</description>
      </item>
      <item>
          <title>Factorio as an AI Benchmark</title>
          <pubDate>Tue, 10 Feb 2026 00:00:00 +0000</pubDate>
          <author>Nikolai Shelekhov</author>
          <link>https://nikolaishelekhov.com/blog/factorio-as-ai-benchmark/</link>
          <guid>https://nikolaishelekhov.com/blog/factorio-as-ai-benchmark/</guid>
          <description xml:base="https://nikolaishelekhov.com/blog/factorio-as-ai-benchmark/">&lt;p&gt;The best AI models today can barely automate early-game smelting in Factorio.&lt;&#x2F;p&gt;
&lt;p&gt;Meanwhile, experienced human players build megabases processing millions of items per minute — with perfectly ratioed production lines, optimized train networks, and nuclear power grids running at scale.&lt;&#x2F;p&gt;
&lt;p&gt;That gap is interesting.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;why-factorio-is-hard-for-ai&quot;&gt;Why Factorio Is Hard for AI&lt;&#x2F;h2&gt;
&lt;p&gt;Factorio is not just a game. It is a real-time systems optimization problem.&lt;&#x2F;p&gt;
&lt;p&gt;It requires:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Long-horizon planning&lt;&#x2F;li&gt;
&lt;li&gt;Resource dependency tracking&lt;&#x2F;li&gt;
&lt;li&gt;Spatial reasoning&lt;&#x2F;li&gt;
&lt;li&gt;Throughput optimization&lt;&#x2F;li&gt;
&lt;li&gt;Incremental refactoring of live systems&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;The game punishes short-term thinking.&lt;br &#x2F;&gt;
Every early design decision propagates downstream.&lt;&#x2F;p&gt;
&lt;p&gt;This makes it fundamentally different from most existing AI benchmarks, which are:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Static&lt;&#x2F;li&gt;
&lt;li&gt;Text-only&lt;&#x2F;li&gt;
&lt;li&gt;Short-horizon&lt;&#x2F;li&gt;
&lt;li&gt;Stateless between tasks&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;Factorio is dynamic, persistent, and adversarial to naive planning.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;the-factorio-learning-environment&quot;&gt;The Factorio Learning Environment&lt;&#x2F;h2&gt;
&lt;p&gt;There is already an open-source research effort exploring this space: the Factorio Learning Environment, published at NeurIPS 2025.&lt;&#x2F;p&gt;
&lt;p&gt;It exposes a live Factorio server to LLM agents.&lt;br &#x2F;&gt;
Agents write Python code to interact with the environment and attempt to build automated factories.&lt;&#x2F;p&gt;
&lt;p&gt;The results so far highlight how difficult the problem really is.&lt;&#x2F;p&gt;
&lt;p&gt;Even strong language models struggle with:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Maintaining state over long sessions&lt;&#x2F;li&gt;
&lt;li&gt;Correctly sequencing multi-step production chains&lt;&#x2F;li&gt;
&lt;li&gt;Recovering from partial failures&lt;&#x2F;li&gt;
&lt;li&gt;Designing scalable layouts&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;This is not a prompt engineering problem.&lt;br &#x2F;&gt;
It is a systems reasoning problem.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;why-this-is-an-interesting-benchmark&quot;&gt;Why This Is an Interesting Benchmark&lt;&#x2F;h2&gt;
&lt;p&gt;Factorio introduces properties that resemble real-world infrastructure engineering:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Graph-based dependency trees&lt;&#x2F;li&gt;
&lt;li&gt;Constrained resource allocation&lt;&#x2F;li&gt;
&lt;li&gt;Throughput bottlenecks&lt;&#x2F;li&gt;
&lt;li&gt;Distributed logistics (trains, belts, bots)&lt;&#x2F;li&gt;
&lt;li&gt;Continuous optimization under growth&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;It is closer to distributed systems design than to puzzle solving.&lt;&#x2F;p&gt;
&lt;p&gt;That makes it a compelling, unsaturated benchmark for autonomous agents.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;a-rust-based-approach&quot;&gt;A Rust-Based Approach&lt;&#x2F;h2&gt;
&lt;p&gt;I’m currently experimenting with a Rust rewrite of the agent layer using Rig, a library for building LLM-powered applications.&lt;&#x2F;p&gt;
&lt;p&gt;The direction is deliberate, and it rests on three ideas.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;1-typed-tools&quot;&gt;1. Typed Tools&lt;&#x2F;h3&gt;
&lt;p&gt;Every game action becomes a strongly typed tool:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Place entity&lt;&#x2F;li&gt;
&lt;li&gt;Connect belts&lt;&#x2F;li&gt;
&lt;li&gt;Query inventory&lt;&#x2F;li&gt;
&lt;li&gt;Inspect recipes&lt;&#x2F;li&gt;
&lt;li&gt;Read map state&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;The domain is highly structured.&lt;br &#x2F;&gt;
Rust’s type system allows encoding that structure directly into the interface.&lt;&#x2F;p&gt;
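&lt;p&gt;As a rough illustration of what a typed tool could look like, the sketch below models two of the actions above as a Rust enum and renders each as a console command string. All names here (&lt;code&gt;Entity&lt;&#x2F;code&gt;, &lt;code&gt;Item&lt;&#x2F;code&gt;, &lt;code&gt;Tool&lt;&#x2F;code&gt;, &lt;code&gt;to_command&lt;&#x2F;code&gt;) are hypothetical and are not Rig’s API, and the Lua inside the strings is an assumption that has not been checked against a running server.&lt;&#x2F;p&gt;

```rust
// Hypothetical typed tool layer: every name here is illustrative.

/// Things that can be placed on the map.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Entity {
    StoneFurnace,
    TransportBelt,
}

/// Things that can sit in an inventory.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Item {
    IronPlate,
    CopperPlate,
}

impl Entity {
    /// Internal prototype name, as the game spells it.
    pub fn name(self) -> String {
        match self {
            Entity::StoneFurnace => String::from("stone-furnace"),
            Entity::TransportBelt => String::from("transport-belt"),
        }
    }
}

impl Item {
    /// Internal prototype name, as the game spells it.
    pub fn name(self) -> String {
        match self {
            Item::IronPlate => String::from("iron-plate"),
            Item::CopperPlate => String::from("copper-plate"),
        }
    }
}

/// One variant per game action the agent may take.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tool {
    PlaceEntity { entity: Entity, x: i32, y: i32 },
    CountItem { item: Item },
}

/// Render a tool call as a console command (assumed Lua, unverified).
pub fn to_command(tool: Tool) -> String {
    match tool {
        Tool::PlaceEntity { entity, x, y } => format!(
            "/silent-command game.surfaces[1].create_entity{{name=\"{}\", position={{{}, {}}}, force=\"player\"}}",
            entity.name(), x, y
        ),
        Tool::CountItem { item } => format!(
            "/silent-command rcon.print(game.players[1].get_item_count(\"{}\"))",
            item.name()
        ),
    }
}
```

&lt;p&gt;With this shape, asking to place an iron plate or a misspelled furnace fails at the type boundary rather than inside the game, and the compiler forces &lt;code&gt;to_command&lt;&#x2F;code&gt; to handle every variant.&lt;&#x2F;p&gt;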
&lt;h3 id=&quot;2-multi-turn-agent-loops-over-rcon&quot;&gt;2. Multi-Turn Agent Loops over RCON&lt;&#x2F;h3&gt;
&lt;p&gt;Instead of single-shot execution, the agent operates in iterative loops over RCON, the remote console protocol that a Factorio server exposes:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Observe world state&lt;&#x2F;li&gt;
&lt;li&gt;Plan next action&lt;&#x2F;li&gt;
&lt;li&gt;Execute via RCON&lt;&#x2F;li&gt;
&lt;li&gt;Re-evaluate&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;This creates a feedback-driven control system rather than a stateless command generator.&lt;&#x2F;p&gt;
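&lt;p&gt;The four steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions: &lt;code&gt;World&lt;&#x2F;code&gt;, &lt;code&gt;Action&lt;&#x2F;code&gt;, &lt;code&gt;plan&lt;&#x2F;code&gt;, &lt;code&gt;execute&lt;&#x2F;code&gt;, and &lt;code&gt;run&lt;&#x2F;code&gt; are hypothetical names, the planner is a hand-written rule rather than a model call, and &lt;code&gt;execute&lt;&#x2F;code&gt; only updates an in-memory state where a real agent would make an RCON round-trip.&lt;&#x2F;p&gt;

```rust
// Hypothetical sketch: `execute` stands in for an RCON round-trip and only
// mutates an in-memory world, so the loop can run without a game server.

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct World {
    pub furnaces: u32,
    pub iron_plates: u32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    PlaceFurnace,
    Smelt,
    Done,
}

/// Plan: choose the next action from the latest observation only.
pub fn plan(world: World, target_plates: u32) -> Action {
    if world.iron_plates >= target_plates {
        Action::Done
    } else if world.furnaces == 0 {
        Action::PlaceFurnace
    } else {
        Action::Smelt
    }
}

/// Execute: apply the action and return the new observation.
pub fn execute(world: World, action: Action) -> World {
    match action {
        Action::PlaceFurnace => World { furnaces: world.furnaces + 1, ..world },
        // Each furnace contributes one plate per smelting step.
        Action::Smelt => World { iron_plates: world.iron_plates + world.furnaces, ..world },
        Action::Done => world,
    }
}

/// Observe, plan, execute, re-evaluate, until done or out of turns.
/// Returns the final world and the number of actions executed.
pub fn run(start: World, target_plates: u32, max_turns: u32) -> (World, u32) {
    let mut world = start;
    let mut turns = 0;
    while max_turns > turns {
        let action = plan(world, target_plates);
        if action == Action::Done {
            break;
        }
        world = execute(world, action);
        turns += 1;
    }
    (world, turns)
}
```

&lt;p&gt;Because every turn plans from the most recent observation, a failed or partial action shows up in the next &lt;code&gt;World&lt;&#x2F;code&gt; value and the plan adjusts, which is the feedback property the loop is meant to provide.&lt;&#x2F;p&gt;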
&lt;h3 id=&quot;3-rag-over-the-recipe-graph&quot;&gt;3. RAG over the Recipe Graph&lt;&#x2F;h3&gt;
&lt;p&gt;Factorio’s crafting tree is a dependency graph.&lt;&#x2F;p&gt;
&lt;p&gt;Using retrieval over:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;The recipe tree&lt;&#x2F;li&gt;
&lt;li&gt;Wiki documentation&lt;&#x2F;li&gt;
&lt;li&gt;Item production chains&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;allows grounding decisions in structured domain knowledge instead of relying purely on model memory.&lt;&#x2F;p&gt;
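&lt;p&gt;To make the dependency-graph idea concrete, the sketch below walks a tiny, hard-coded slice of the recipe graph down to raw ore. It is a plain lookup, not a retrieval system: &lt;code&gt;Item&lt;&#x2F;code&gt;, &lt;code&gt;RawCost&lt;&#x2F;code&gt;, and &lt;code&gt;raw_cost&lt;&#x2F;code&gt; are hypothetical names, and the recipe ratios are the ones I assume for the unmodded base game.&lt;&#x2F;p&gt;

```rust
// Hypothetical sketch: a hard-coded slice of the recipe graph.
// A real agent would load recipes from game data instead.

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Item {
    IronOre,
    CopperOre,
    IronPlate,
    CopperPlate,
    CopperCable,
    ElectronicCircuit,
}

/// Raw ore needed to produce some quantity of an item.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RawCost {
    pub iron_ore: u32,
    pub copper_ore: u32,
}

/// Add two costs field by field.
fn sum(a: RawCost, b: RawCost) -> RawCost {
    RawCost {
        iron_ore: a.iron_ore + b.iron_ore,
        copper_ore: a.copper_ore + b.copper_ore,
    }
}

/// Walk the dependency graph from `item` down to ore.
pub fn raw_cost(item: Item, count: u32) -> RawCost {
    match item {
        Item::IronOre => RawCost { iron_ore: count, copper_ore: 0 },
        Item::CopperOre => RawCost { iron_ore: 0, copper_ore: count },
        // Smelting: one ore per plate.
        Item::IronPlate => raw_cost(Item::IronOre, count),
        Item::CopperPlate => raw_cost(Item::CopperOre, count),
        // One copper plate yields two cables, so round the craft count up.
        Item::CopperCable => raw_cost(Item::CopperPlate, (count + 1) / 2),
        // One circuit needs one iron plate and three copper cables.
        Item::ElectronicCircuit => sum(
            raw_cost(Item::IronPlate, count),
            raw_cost(Item::CopperCable, 3 * count),
        ),
    }
}
```

&lt;p&gt;Under these assumptions, two electronic circuits resolve to two iron ore and three copper ore. That is the kind of grounded fact a retrieval step could hand to the model instead of leaving it to recall.&lt;&#x2F;p&gt;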
&lt;h2 id=&quot;why-rust-fits&quot;&gt;Why Rust Fits&lt;&#x2F;h2&gt;
&lt;p&gt;Factorio is deterministic and rule-based.&lt;&#x2F;p&gt;
&lt;p&gt;The action space is structured.&lt;br &#x2F;&gt;
The state transitions are explicit.&lt;br &#x2F;&gt;
The constraints are mechanical.&lt;&#x2F;p&gt;
&lt;p&gt;Rust feels like a natural fit for:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Modeling state transitions&lt;&#x2F;li&gt;
&lt;li&gt;Enforcing invariants&lt;&#x2F;li&gt;
&lt;li&gt;Building typed agent tooling&lt;&#x2F;li&gt;
&lt;li&gt;Keeping orchestration predictable&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;When the domain itself is a graph of dependencies, types become leverage.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;the-gap&quot;&gt;The Gap&lt;&#x2F;h2&gt;
&lt;p&gt;Humans build megabases.&lt;&#x2F;p&gt;
&lt;p&gt;AI struggles to build a stable smelting line.&lt;&#x2F;p&gt;
&lt;p&gt;That gap is not just amusing — it’s informative.&lt;&#x2F;p&gt;
&lt;p&gt;It exposes the limits of current reasoning systems when faced with:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Long-horizon planning&lt;&#x2F;li&gt;
&lt;li&gt;Structural optimization&lt;&#x2F;li&gt;
&lt;li&gt;Persistent world interaction&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;Factorio may quietly become one of the most revealing AI benchmarks available.&lt;&#x2F;p&gt;
&lt;p&gt;The factory must grow — for both humans and AI.&lt;&#x2F;p&gt;
</description>
      </item>
    </channel>
</rss>
