<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Architecture on MyBrew</title>
    <link>https://aibrew.ai/tags/architecture/</link>
    <description>Recent content in Architecture on MyBrew</description>
    <generator>Hugo</generator>
    <language>en-us</language>
    <lastBuildDate>Wed, 27 May 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://aibrew.ai/tags/architecture/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>How Claude Code&#39;s Agent Architecture Works — and How We Built a Similar System for a Terraria Server</title>
      <link>https://aibrew.ai/2026/05/how-claude-codes-agent-architecture-works-and-how-we-built-a-similar-system-for-a-terraria-server/</link>
      <pubDate>Wed, 27 May 2026 00:00:00 +0000</pubDate>
      <guid>https://aibrew.ai/2026/05/how-claude-codes-agent-architecture-works-and-how-we-built-a-similar-system-for-a-terraria-server/</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; — We reverse-engineered Claude Code&amp;rsquo;s agent architecture from its TypeScript source to understand how it handles security, complex tasks, and tool permissions. Then we applied those patterns to an open-source Terraria AI bridge that lets players talk to an LLM inside the game. Here&amp;rsquo;s what we found, what we built, and what we learned about practical agent design.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id=&#34;why-we-cracked-open-claude-codes-source&#34;&gt;Why We Cracked Open Claude Code&amp;rsquo;s Source&lt;/h2&gt;
&lt;p&gt;Claude Code isn&amp;rsquo;t just a coding assistant. Under the hood it&amp;rsquo;s an agent runtime — it spawns sub-agents, manages file permissions, runs bash commands, and decides when to ask the user vs. just doing the thing. We wanted to understand how it works so we could apply the same ideas to a completely different domain: a Terraria game server.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<blockquote>
<p><strong>TL;DR</strong> — We reverse-engineered Claude Code&rsquo;s agent architecture from its TypeScript source to understand how it handles security, complex tasks, and tool permissions. Then we applied those patterns to an open-source Terraria AI bridge that lets players talk to an LLM inside the game. Here&rsquo;s what we found, what we built, and what we learned about practical agent design.</p>
</blockquote>
<hr>
<h2 id="why-we-cracked-open-claude-codes-source">Why We Cracked Open Claude Code&rsquo;s Source</h2>
<p>Claude Code isn&rsquo;t just a coding assistant. Under the hood it&rsquo;s an agent runtime — it spawns sub-agents, manages file permissions, runs bash commands, and decides when to ask the user vs. just doing the thing. We wanted to understand how it works so we could apply the same ideas to a completely different domain: a Terraria game server.</p>
<p>Our project, <a href="https://github.com/d99sfrmdbz-debug/terra_llm_bridge">terra_llm_bridge</a>, connects a Terraria TShock server to an LLM. Players type <code>@ai</code> in chat and get responses — but the LLM can also <em>act</em>: give items, change weather, teleport players, even toggle hardmode. That last one is where we learned our lesson.</p>
<p>The first time a player asked the AI to set the weather to rain, the LLM autonomously decided to call <code>terra_world_hardmode(confirm=True)</code> — toggling <em>irreversible</em> hardmode for the entire server. No player had asked for it. The model just&hellip; did it.</p>
<p>We needed a real permission system. So we went looking at how Claude Code does it.</p>
<hr>
<h2 id="claude-codes-7-layer-permission-architecture">Claude Code&rsquo;s 7-Layer Permission Architecture</h2>
<p>Reading through ~1,500 lines of <code>src/utils/permissions/permissions.ts</code> plus the Agent tool infrastructure (~3,800 lines), a clear architecture emerged. Claude Code doesn&rsquo;t have one security check — it has <strong>seven</strong>:</p>
<pre tabindex="0"><code>Layer 1a: Deny rules   →  &#34;Never allow Bash(git push --force)&#34;
Layer 1b: Ask rules    →  &#34;Always prompt for Bash(curl *)&#34;
Layer 1c: Tool self-check  →  Each tool&#39;s checkPermissions() method
Layer 1d: Tool self-deny   →  Read tool whitelists specific paths
Layer 1f: Content-specific rules  →  &#34;Even in bypass mode, ask for npm publish&#34;
Layer 1g: Safety checks  →  &#34;.git/, .claude/ are ALWAYS bypass-immune&#34;
Layer 2:  Mode-based bypass  →  bypassPermissions / auto / acceptEdits / dontAsk
Layer 3:  YOLO classifier →  AI reads the transcript, decides if safe
</code></pre><p>The most interesting layer is the <strong>YOLO classifier</strong> — a separate small model that reads the full conversation transcript and classifies each tool call as safe or dangerous. It&rsquo;s a two-stage system: a fast classifier for obvious cases, and a deeper thinking classifier for edge cases.</p>
<p>But the layer that matters most for our use case isn&rsquo;t the AI classifier. It&rsquo;s how Claude Code <strong>structurally prevents certain tools from being called in the wrong context</strong> — through tool allowlists, denylists, and sub-agent specialization.</p>
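<p>The ordering above can be sketched as a single decision function. This is a minimal illustration under stated assumptions, not Claude Code&rsquo;s actual implementation: the names <code>Decision</code>, <code>Rules</code>, and <code>check_permission</code>, the prefix-string rule format, and the collapsing of the per-tool checks into plain rule lists are all invented for the example.</p>

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"
    DENY = "deny"


@dataclass
class Rules:
    # Hypothetical rule format: a call is written "Tool(argument)" and a rule
    # matches when the call string starts with the rule string.
    deny: list = field(default_factory=list)
    ask: list = field(default_factory=list)
    bypass_immune: list = field(default_factory=list)


def matches(call, patterns):
    return any(call.startswith(p) for p in patterns)


def check_permission(call, rules, mode="default"):
    # Deny rules are checked first and win over everything else.
    if matches(call, rules.deny):
        return Decision.DENY
    # Explicit ask rules come next.
    if matches(call, rules.ask):
        return Decision.ASK
    # Safety checks still prompt even when the mode would bypass.
    if matches(call, rules.bypass_immune):
        return Decision.ASK
    # A permissive mode skips the remaining prompts.
    if mode == "bypassPermissions":
        return Decision.ALLOW
    # The classifier layer is out of scope here; fall back to asking.
    return Decision.ASK


rules = Rules(
    deny=["Bash(git push --force"],
    ask=["Bash(curl "],
    bypass_immune=["Edit(.git/", "Edit(.claude/"],
)
print(check_permission("Edit(.git/config)", rules, mode="bypassPermissions"))
```

<p>The point of the ordering is that the cheap, static checks run before any mode or model gets a say, so a permissive mode can never override a deny rule.</p>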
<hr>
<h2 id="the-agent-pattern-not-multi-agent-but-specialized-workers">The Agent Pattern: Not Multi-Agent, but Specialized Workers</h2>
<p>Claude Code doesn&rsquo;t use multi-agent &ldquo;collaboration&rdquo; in the negotiation sense. It uses a <strong>single coordinator that spawns specialized workers</strong>:</p>
<pre tabindex="0"><code>Main Agent (Tool Calling, all tools)
  │
  ├─ Simple: &#34;read file X&#34; → Read tool
  │
  └─ Complex: &#34;audit this branch&#34; → Agent(&#34;Explore&#34;)
                                       │
                                       ├─ Tools: [Read, Grep, Glob]  ← whitelist
                                       ├─ Disallowed: [Edit, Write]   ← denylist
                                       ├─ System prompt: &#34;You are a file search specialist&#34;
                                       └─ Returns findings → Main agent acts on them
</code></pre><p>Each sub-agent type is defined by three things:</p>
<ol>
<li><strong>Tool permissions</strong> (allowlist + denylist) — what it can touch</li>
<li><strong>System prompt</strong> — specialized instructions for its role</li>
<li><strong>Model</strong> — Explore agents use Haiku ($) for speed; Plan agents use Sonnet for reasoning</li>
</ol>
<p>The key insight: <strong>the main agent doesn&rsquo;t get more complex</strong>. It stays simple but has ONE tool (<code>Agent</code>) that lets it offload complex work. The sub-agent is just another Tool Calling loop with restricted tools and a different prompt.</p>
<p>This architecture is elegant because it composes: each piece is simple, but the combination handles complexity that would overwhelm a single prompt.</p>
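<p>A sub-agent definition of this shape could be modelled as a small record. The sketch below is an assumption-laden illustration: <code>SubAgentSpec</code>, <code>visible_tools</code>, the tool list, and the <code>"haiku"</code> label are hypothetical names, not identifiers taken from Claude Code.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubAgentSpec:
    name: str
    system_prompt: str
    model: str
    allowed_tools: frozenset
    disallowed_tools: frozenset = frozenset()

    def visible_tools(self, all_tools):
        # Apply the allowlist first, then subtract the denylist,
        # so a tool that appears on both lists stays hidden.
        allowed = set(all_tools).intersection(self.allowed_tools)
        return sorted(allowed - self.disallowed_tools)


ALL_TOOLS = ["Read", "Grep", "Glob", "Edit", "Write", "Bash", "Agent"]

EXPLORE = SubAgentSpec(
    name="Explore",
    system_prompt="You are a file search specialist",
    model="haiku",  # placeholder label, not a real model id
    allowed_tools=frozenset({"Read", "Grep", "Glob"}),
    disallowed_tools=frozenset({"Edit", "Write"}),
)

print(EXPLORE.visible_tools(ALL_TOOLS))
```

<p>Because the worker only ever receives the result of <code>visible_tools</code>, its read-only behaviour does not depend on the prompt being obeyed.</p>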
<hr>
<h2 id="how-we-applied-this-to-terra_llm_bridge">How We Applied This to terra_llm_bridge</h2>
<p>Our Terraria bridge has a simpler job than Claude Code — 46 tools instead of hundreds, and the &ldquo;security&rdquo; problem is &ldquo;don&rsquo;t let the AI toggle hardmode when the player asked about weather&rdquo; rather than &ldquo;don&rsquo;t let the AI rm -rf /&rdquo;. But the patterns transfer directly.</p>
<h3 id="the-problem">The Problem</h3>
<p>Before: our LLM saw all 46 tools at once. When a player asked &ldquo;give me the strongest armor set,&rdquo; the LLM would fire <code>wiki_search</code> AND <code>give_item</code> in parallel — researching while also pre-committing to Solar Flare Armor before reading the wiki results. Sometimes it guessed right. Sometimes it gave a summoner player melee gear.</p>
<h3 id="our-solution-two-phase-tool-access">Our Solution: Two-Phase Tool Access</h3>
<p>We didn&rsquo;t add sub-agents — that would be overkill for 46 tools. Instead, we applied the <strong>tool restriction pattern</strong> at the graph level:</p>
<pre tabindex="0"><code>route → llm(research)  ⇄  tool      →  escalate  →  llm(action)  ⇄  authorize  ⇄  tool  →  output
         17 read tools                            46 full tools     keyword gate
         wiki, lookup, status                     give, kick, spawn
</code></pre><p>The graph has two phases:</p>
<p><strong>Research phase</strong> — the LLM gets only 17 read-only tools (wiki_search, item_lookup, player_list, world_info, etc.). It <em>cannot</em> call give_item, kick, spawn, or any destructive tool. It researches first.</p>
<p><strong>Escalate</strong> — when the LLM produces text (no more tool calls needed), the graph automatically flips to action mode and injects a hint: &ldquo;You now have access to ALL tools.&rdquo;</p>
<p><strong>Action phase</strong> — the LLM gets the full 46-tool set and can act on what it found.</p>
<p>This is structurally enforced. Not a prompt suggestion. The LLM physically cannot call <code>give_item</code> during research because the tool isn&rsquo;t bound.</p>
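<p>The two phases can be sketched without any framework. This is a simplified stand-in for the graph, assuming a <code>phase</code> string in state. The function names are hypothetical, and only the handful of tool names mentioned above are listed, not the full 17 and 46.</p>

```python
READ_TOOLS = {"wiki_search", "item_lookup", "player_list", "world_info"}
ACTION_TOOLS = {"give_item", "kick", "spawn"}

ESCALATE_HINT = "You now have access to ALL tools."


def tools_for_phase(phase):
    # Research sees only the read-only subset; action sees everything.
    if phase == "research":
        return set(READ_TOOLS)
    return READ_TOOLS | ACTION_TOOLS


def next_phase(phase, tool_calls):
    # A research turn that ends in plain text (no tool calls) escalates.
    if phase == "research" and not tool_calls:
        return "action"
    return phase


def dispatch(phase, name):
    # Structural enforcement: an unbound tool cannot be looked up at all.
    if name not in tools_for_phase(phase):
        raise LookupError(f"{name} is not bound in the {phase} phase")
    return f"ran {name}"


phase = "research"
print(dispatch(phase, "wiki_search"))
phase = next_phase(phase, tool_calls=[])  # the model answered with text only
print(phase, "-", ESCALATE_HINT)
print(dispatch(phase, "give_item"))
```

<p>In a real graph the restriction would come from binding a different tool list to the model in each phase. The lookup error here only stands in for the tool being absent.</p>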
<h3 id="the-permission-gate">The Permission Gate</h3>
<p>Before the two-phase split, we also added <code>authorize_node</code> — a hard gate between the LLM and ToolNode that checks whether the player&rsquo;s recent chat messages contain keywords for the tool&rsquo;s domain:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#282a36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#6272a4"># Gated tool name mapped to the chat keywords that unlock it</span>
</span></span><span style="display:flex;"><span>GATED_TOOLS <span style="color:#ff79c6">=</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#f1fa8c">&#34;terra_world_hardmode&#34;</span>: {<span style="color:#f1fa8c">&#34;hardmode&#34;</span>, <span style="color:#f1fa8c">&#34;hard mode&#34;</span>, <span style="color:#f1fa8c">&#34;肉山&#34;</span>, <span style="color:#f1fa8c">&#34;困难模式&#34;</span>},
</span></span><span style="display:flex;"><span>    <span style="color:#f1fa8c">&#34;terra_player_kick&#34;</span>:    {<span style="color:#f1fa8c">&#34;kick&#34;</span>, <span style="color:#f1fa8c">&#34;踢出&#34;</span>, <span style="color:#f1fa8c">&#34;踢了&#34;</span>},
</span></span><span style="display:flex;"><span>    <span style="color:#f1fa8c">&#34;terra_server_stop&#34;</span>:    {<span style="color:#f1fa8c">&#34;stop server&#34;</span>, <span style="color:#f1fa8c">&#34;关服&#34;</span>, <span style="color:#f1fa8c">&#34;停服&#34;</span>},
</span></span><span style="display:flex;"><span>    <span style="color:#6272a4"># ... 8 more</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>If the player says &ldquo;set weather to rain&rdquo; and the LLM tries to call <code>terra_world_hardmode</code>, authorize_node checks: do any of the hardmode keywords appear in the player&rsquo;s recent messages? No? <strong>Blocked.</strong> The tool call is replaced with a BLOCKED message before ToolNode ever sees it.</p>
<p>This is a coarse filter — it checks what the player <em>mentioned</em>, not what they <em>requested</em>. &ldquo;上次打肉山的时候&rdquo; (last time when I fought Wall of Flesh) would pass the keyword check even though the player didn&rsquo;t ask for hardmode. But coarse is fine here: the goal is blocking catastrophic mismatches (weather → hardmode), not perfect intent understanding.</p>
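<p>The check itself fits in a few lines. The following is a sketch of the idea rather than the project&rsquo;s real <code>authorize_node</code>: the function name <code>authorize</code>, the dict shape of a tool call, and the wording of the BLOCKED message are assumptions, and only the three table entries shown above are included.</p>

```python
GATED_TOOLS = {
    "terra_world_hardmode": {"hardmode", "hard mode", "肉山", "困难模式"},
    "terra_player_kick": {"kick", "踢出", "踢了"},
    "terra_server_stop": {"stop server", "关服", "停服"},
}


def authorize(tool_calls, recent_messages):
    """Replace gated calls whose keywords are absent from recent chat."""
    chat = " ".join(recent_messages).lower()
    results = []
    for call in tool_calls:
        keywords = GATED_TOOLS.get(call["name"])
        if keywords is None or any(k in chat for k in keywords):
            # Ungated tool, or the player mentioned the tool's domain.
            results.append(call)
        else:
            results.append({
                "name": call["name"],
                "blocked": True,
                "message": "BLOCKED: player did not mention this action",
            })
    return results


calls = [{"name": "terra_world_hardmode", "args": {"confirm": True}}]
print(authorize(calls, ["set weather to rain"]))
```

<p>Substring matching is what makes the filter coarse: any message that contains a keyword opens the gate, whatever the player meant by it.</p>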
<hr>
<h2 id="what-we-chose-not-to-build">What We Chose NOT to Build</h2>
<h3 id="no-yolo-classifier">No YOLO Classifier</h3>
<p>Claude Code&rsquo;s AI classifier reads the full transcript and classifies tool calls as safe/dangerous. We didn&rsquo;t build this because:</p>
<ul>
<li>It adds latency — an extra LLM call before every gated tool execution</li>
<li>Terraria chat is low-stakes: most mistakes that slip through (such as giving the wrong armor) are fixable</li>
<li>Keyword matching catches the catastrophic cases</li>
</ul>
<h3 id="no-sub-agent-spawning">No Sub-Agent Spawning</h3>
<p>Claude Code spawns sub-agent processes for complex tasks. We didn&rsquo;t need this because:</p>
<ul>
<li>Terraria tool surface is small (46 tools)</li>
<li>Multi-turn tool calling handles the complexity we actually face</li>
<li>Spawning sub-processes for a game chat bot is over-engineering</li>
</ul>
<h3 id="no-react-pattern">No ReAct Pattern</h3>
<p>The classic Thought → Action → Observation loop would add token overhead without changing our core capability. DeepSeek&rsquo;s thinking tokens already handle the reasoning, and the two-phase tool access enforces &ldquo;research before action&rdquo; more reliably than prompt-based ReAct would.</p>
<hr>
<h2 id="the-architecture-in-one-diagram">The Architecture in One Diagram</h2>
<pre tabindex="0"><code>┌──────────────────────────────────────────────────────────┐
│  Terraria Server (TShock + C# plugin, 24 game hooks)      │
│  Player types &#34;@ai give me the best armor&#34;                │
└──────────────────────┬───────────────────────────────────┘
                       │ JSON webhook
┌──────────────────────▼───────────────────────────────────┐
│  Python aiohttp listener (:9876)                          │
└──────────────────────┬───────────────────────────────────┘
                       │
┌──────────────────────▼───────────────────────────────────┐
│  LangGraph StateGraph                                     │
│                                                           │
│  route  →  llm(research)  ⇄  tool    17 read tools      │
│               │                                           │
│          escalate  →  llm(action)  ⇄  authorize  ⇄  tool │
│                          46 full tools    keyword gate    │
│               │                                           │
│             output  →  broadcast to game chat             │
│                                                           │
│  Memory: AsyncSqliteSaver per player (thread_id)          │
└──────────────────────────────────────────────────────────┘
                       │
         ┌─────────────┴──────────────┐
         ▼                            ▼
   TShock REST API              Terraria Wiki API
   (give / kick / spawn)        (terraria.wiki.gg)
</code></pre><hr>
<h2 id="source-diving-lessons">Source Diving Lessons</h2>
<p>Reading Claude Code&rsquo;s source taught us three things that apply to any agent project:</p>
<p><strong>1. Security is layered, not binary.</strong> A single <code>confirm</code> parameter is a soft suggestion to the LLM. Real security needs structural enforcement: the LLM shouldn&rsquo;t be able to call a tool it isn&rsquo;t authorized to use, the same way a web server shouldn&rsquo;t let you reach protected endpoints without authentication, no matter how nicely you ask.</p>
<p><strong>2. Tool restrictions are the cheapest and most reliable form of safety.</strong> Claude Code&rsquo;s Explore agent is &ldquo;read-only&rdquo; not because of a prompt — because Edit and Write aren&rsquo;t in its tool list. Our research phase isn&rsquo;t &ldquo;research-first&rdquo; because of a prompt — because give_item literally isn&rsquo;t bound. You can&rsquo;t prompt-inject your way past a tool that doesn&rsquo;t exist.</p>
<p><strong>3. Specialization beats complexity.</strong> Claude Code&rsquo;s sub-agents aren&rsquo;t smarter than the main agent — they&rsquo;re more constrained. Fewer tools + focused prompt = more reliable behavior. Our two-phase system does the same: constrain first, expand only when ready.</p>
<hr>
<h2 id="the-project">The Project</h2>
<p><code>terra_llm_bridge</code> is an open-source project connecting Terraria game servers to LLMs. It features:</p>
<ul>
<li><strong>24 game hooks</strong> — custom C# TShock plugin captures chat, boss kills, deaths, logins, and 20 more events</li>
<li><strong>46 admin tools</strong> — give items, manage players, control weather, spawn NPCs, manage regions and permissions</li>
<li><strong>Two-phase agent</strong> — research (17 tools) → action (46 tools)</li>
<li><strong>Hard permission gate</strong> — keyword-based authorize_node blocks unauthorized tool calls</li>
<li><strong>MCP server</strong> — same 46 tools exposed to Claude Code for server administration</li>
<li><strong>Persistent memory</strong> — per-player conversation history via LangGraph&rsquo;s AsyncSqliteSaver</li>
</ul>
<p>The project is currently in <strong>active testing</strong> and not yet published on GitHub. We&rsquo;re running it on a private Terraria server, iterating on the agent architecture before open-sourcing. If you&rsquo;re interested in the code or want early access, reach out.</p>
<hr>
<p><em>Built with: Python 3.14, LangGraph 1.x, DeepSeek (Anthropic-compatible API), C# .NET 9, TShock v6.1.0, aiohttp, httpx.</em></p>
]]></content:encoded>
    </item>
    <item>
      <title>n8n vs Dify: One We Adopted, One We Skipped</title>
      <link>https://aibrew.ai/2026/05/n8n-vs-dify-one-we-adopted-one-we-skipped/</link>
      <pubDate>Wed, 27 May 2026 00:00:00 +0000</pubDate>
      <guid>https://aibrew.ai/2026/05/n8n-vs-dify-one-we-adopted-one-we-skipped/</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; — n8n and Dify often show up together in self-hosted AI evaluations, but they want to own very different layers of your stack. After evaluating both against a custom self-hosted AI setup, we adopted n8n and skipped Dify. The decision came down to one question — &lt;em&gt;&amp;ldquo;what slice does this want to own, and do I already own that slice?&amp;rdquo;&lt;/em&gt; — and the answer was opposite for the two platforms. This post lays out the framework so you can run the same evaluation on your own stack.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<blockquote>
<p><strong>TL;DR</strong> — n8n and Dify often show up together in self-hosted AI evaluations, but they want to own very different layers of your stack. After evaluating both against a custom self-hosted AI setup, we adopted n8n and skipped Dify. The decision came down to one question — <em>&ldquo;what slice does this want to own, and do I already own that slice?&rdquo;</em> — and the answer was opposite for the two platforms. This post lays out the framework so you can run the same evaluation on your own stack.</p>
</blockquote>
<hr>
<h2 id="why-this-comparison-matters-in-2026">Why This Comparison Matters in 2026</h2>
<p>Six months ago the question &ldquo;which OSS AI platform should I run?&rdquo; had maybe three serious answers. Today there are dozens, and they overlap aggressively. Dify and n8n keep showing up together in evaluation lists, partly because both are self-hostable via Docker, both have visual editors with TypeScript front ends (n8n is TypeScript throughout, while Dify&rsquo;s backend is Python), and both can talk to LLMs.</p>
<p>That surface similarity is misleading. <strong>They want to own entirely different layers of your stack.</strong> Treating them as alternatives is a category error that will cost you a week of deployment work and rework.</p>
<p>What we concluded after evaluating both against an existing self-hosted setup:</p>
<ul>
<li><strong>Dify</strong> wants to be the orchestrator. If you already have one, Dify has nothing to offer.</li>
<li><strong>n8n</strong> wants to be the execution layer. If you don&rsquo;t have one, n8n is one of the best off-the-shelf options available.</li>
</ul>
<hr>
<h2 id="what-dify-is">What Dify Is</h2>
<p>Dify is an open-source LLM application development platform (Apache 2.0-based license with additional conditions, 55k+ stars on GitHub). The pitch:</p>
<ul>
<li><strong>Visual workflow editor</strong> — drag nodes to build AI pipelines</li>
<li><strong>Built-in RAG</strong> — upload docs, get a queryable knowledge base</li>
<li><strong>Agent builder</strong> — pre-packaged prompt templates with tool calling</li>
<li><strong>Model gateway</strong> — abstract over OpenAI / Anthropic / DeepSeek / local</li>
<li><strong>Observability dashboard</strong> — request logs, latency, cost</li>
</ul>
<p>Dify has replaced its original LangChain foundation with its own runtime, part of what the project calls its &ldquo;Beehive&rdquo; architecture, which is impressive engineering. The product is genuinely well-built.</p>
<p>The target user: someone who wants to ship an AI app <strong>without writing code</strong> or maintaining infrastructure pieces individually.</p>
<p>Same family: Flowise, Langflow, FastGPT. These are all &ldquo;platform-first&rdquo; AI builders.</p>
<hr>
<h2 id="what-n8n-is">What n8n Is</h2>
<p>n8n is source-available (&ldquo;fair-code&rdquo;) workflow automation (162k+ stars, Sustainable Use License). Think Zapier, but self-hosted and with code escape hatches.</p>
<ul>
<li><strong>400+ SaaS connectors</strong> — Notion, Slack, Stripe, Telegram, GitHub, you name it</li>
<li><strong>Trigger → action → condition</strong> visual workflow editor</li>
<li><strong>Webhooks</strong> — receive external events, route to actions</li>
<li><strong>Polling triggers</strong> — RSS feeds, scheduled jobs, file watches</li>
<li><strong>Native retry/error handling</strong> — every node has retry policies</li>
</ul>
<p>n8n is <strong>not</strong> trying to be an LLM platform. It has nodes that call LLMs, but its core identity is &ldquo;connect arbitrary SaaS systems and react to events.&rdquo;</p>
<p>This distinction is crucial. n8n is <strong>plumbing-first</strong>, with optional AI nodes. Dify is <strong>AI-first</strong>, with everything else folded in.</p>
<hr>
<h2 id="the-triage-question-replace-or-absorb">The Triage Question: Replace or Absorb?</h2>
<p>When you evaluate any platform against an existing stack, the question is <strong>not</strong> &ldquo;is it good?&rdquo; The question is <em>&ldquo;what layer of my stack does this want to own, and do I already own that layer?&rdquo;</em></p>
<p>There are exactly two outcomes:</p>
<ul>
<li><strong>Replace</strong>: the platform wants to own a layer you already have. Adopting it means ripping out working code and replacing it with a less flexible black-box equivalent.</li>
<li><strong>Absorb</strong>: the platform wants to own a layer you don&rsquo;t have yet. Adopting it fills a gap without competing with anything.</li>
</ul>
<p>This frame turns what could have been a fuzzy multi-day debate into clean, fast decisions. The rest of this post applies it to each platform in turn.</p>
<hr>
<h2 id="difys-footprint-across-your-stack">Dify&rsquo;s Footprint Across Your Stack</h2>
<p>Dify wants to own five layers at once. Here&rsquo;s how each maps against a setup that already has a code-based orchestrator (any agent harness — Claude Code, LangGraph, your own):</p>
<table>
  <thead>
      <tr>
          <th>Layer Dify Owns</th>
          <th>If You Don&rsquo;t Have It</th>
          <th>If You Already Have It</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Visual workflow orchestration</td>
          <td>Dify gives you a polished UI in days</td>
          <td>Forces you to migrate working code into drag-and-drop nodes</td>
      </tr>
      <tr>
          <td>RAG pipeline</td>
          <td>Built-in, batteries-included knowledge base</td>
          <td>Typically less flexible than a custom RAG layer; harder to tune chunking, embeddings, hybrid search</td>
      </tr>
      <tr>
          <td>Agent builder</td>
          <td>Pre-packaged templates with tool slots</td>
          <td>A real agent loop with multi-step reasoning is more capable than a prompt template wrapper</td>
      </tr>
      <tr>
          <td>Model gateway</td>
          <td>One layer to swap providers</td>
          <td>One env var in a code-based orchestrator does the same</td>
      </tr>
      <tr>
          <td>Observability dashboard</td>
          <td>First-class request logs and cost tracking</td>
          <td>Existing telemetry stacks (Prometheus, OpenTelemetry, custom logging) tend to be deeper</td>
      </tr>
  </tbody>
</table>
<p>The deeper realization: <strong>Dify is built for people who don&rsquo;t write code but want to ship an AI app.</strong> That&rsquo;s a legitimate market, and Dify serves it well. But if you already have a code-based orchestrator running, adopting Dify means ripping out working pieces and replacing them with less flexible equivalents just to fit inside a visual UI. Net cost: a week of migration, all flexibility lost, zero new capability gained.</p>
<p><strong>Our verdict: skip.</strong> Not because Dify is bad, but because there was no gap left for it to fill.</p>
<hr>
<h2 id="n8ns-footprint-across-your-stack">n8n&rsquo;s Footprint Across Your Stack</h2>
<p>n8n&rsquo;s pitch is structurally different. It doesn&rsquo;t want to be the brain. It wants to be the wiring.</p>
<p>The four core capabilities n8n offers, mapped against a typical custom AI setup:</p>
<table>
  <thead>
      <tr>
          <th>Capability n8n Provides</th>
          <th>If You Don&rsquo;t Have It</th>
          <th>If You Already Have It</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Webhook triggers</td>
          <td>Your system&rsquo;s first event-driven entry points</td>
          <td>Complements time-driven (cron) without conflict</td>
      </tr>
      <tr>
          <td>400+ SaaS connectors</td>
          <td>Save weeks writing API clients for Notion / Slack / Stripe / etc.</td>
          <td>Still useful — gives you connectors you didn&rsquo;t have, doesn&rsquo;t compete with what you do</td>
      </tr>
      <tr>
          <td>Built-in retry + state machine</td>
          <td>Mature retry/error handling out of the box</td>
          <td>Replaces handwritten try/except boilerplate with battle-tested defaults</td>
      </tr>
      <tr>
          <td>RSS / polling triggers</td>
          <td>Channel monitoring without OAuth dances</td>
          <td>Pure addition; nothing in most stacks competes</td>
      </tr>
  </tbody>
</table>
<p>The critical observation: <strong>none of these compete with what an existing orchestrator typically owns.</strong> They sit underneath. They fill gaps that a code-based orchestrator alone would still have:</p>
<ul>
<li>Event-driven entry points (most custom stacks only have cron)</li>
<li>Pre-built SaaS adapters (most custom stacks have no generic adapter layer)</li>
<li>Off-the-shelf retry semantics (most custom stacks have handwritten error handling)</li>
<li>Public RSS polling for protocol-locked services like YouTube (most custom stacks have nothing)</li>
</ul>
<p><strong>Our verdict: absorb.</strong> n8n becomes a dependency — a well-maintained, well-documented, battle-tested execution layer — without competing with anything that already works.</p>
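<p>The replace-or-absorb question reduces to a set overlap, which makes it easy to rerun against your own stack. The sketch below is illustrative only: the <code>triage</code> function and the layer labels are shorthand for the rows in the two tables above, and the <code>OWNED</code> set is an assumed description of a setup that already has a code-based orchestrator.</p>

```python
def triage(platform_layers, owned_layers):
    """Return ('replace', overlap) if the platform overlaps owned layers, else ('absorb', [])."""
    overlap = sorted(set(platform_layers).intersection(owned_layers))
    verdict = "replace" if overlap else "absorb"
    return verdict, overlap


# Assumed inventory of an existing code-based stack.
OWNED = {
    "orchestration", "rag", "agent loop",
    "model gateway", "observability", "cron",
}

DIFY = ["orchestration", "rag", "agent loop", "model gateway", "observability"]
N8N = ["webhooks", "saas connectors", "retry", "rss polling"]

print("Dify:", triage(DIFY, OWNED))
print("n8n:", triage(N8N, OWNED))
```

<p>Change <code>OWNED</code> to an empty set and Dify flips to &ldquo;absorb&rdquo;, which matches the point that the verdict depends on your stack, not on the platform.</p>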
<hr>
<h2 id="the-architecture-pattern-that-emerges">The Architecture Pattern That Emerges</h2>
<p>The mental model after these two decisions:</p>
<pre tabindex="0"><code>                       Orchestrator (decisions, judgment)
                       ─────────────────────────────────
                                   │
            ┌──────────────────────┼──────────────────────┐
            │                      │                      │
            ▼                      ▼                      ▼
       Knowledge layer        Tool interfaces        Time triggers
       (your RAG)              (your APIs /            (cron)
                                MCP servers)
                                   │
                                   ▼
                       ┌────────────────────────┐
                       │  n8n (execution layer) │
                       │  ───────────────────── │
                       │  • webhooks            │
                       │  • SaaS adapters       │
                       │  • RSS / polling       │
                       │  • retry / state       │
                       └────────────────────────┘
</code></pre><p>The hard rule worth setting yourself: <strong>n8n is execution only, never decision.</strong> No AI reasoning inside an n8n workflow. n8n receives signals, dispatches them, retries on failure, and reports back. All judgment stays in the orchestrator&rsquo;s hands.</p>
<p>Why is this rule necessary? Because n8n has LLM nodes. You <em>could</em> put a &ldquo;summarize this email&rdquo; GPT call inside a workflow. The moment you do, you&rsquo;ve split your reasoning across two places — some inside your orchestrator&rsquo;s prompt context, some inside an opaque n8n node — and now you have two systems making decisions with no shared memory. That&rsquo;s the failure mode that turns simple workflows into unmaintainable chains.</p>
<p>Keeping n8n as pure plumbing is the discipline that makes the architecture work.</p>
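<p>In practice the rule means the orchestrator sends n8n a finished decision and nothing else. A minimal sketch using only the standard library follows. The host, the <code>notify-slack</code> path, and the payload fields are hypothetical. The <code>/webhook/</code> prefix is the one n8n uses for production webhook URLs.</p>

```python
import json
import urllib.request


def build_dispatch(base_url, path, decision):
    # The decision was already made by the orchestrator; n8n only carries it out.
    body = json.dumps(decision).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/webhook/{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_dispatch(
    "http://localhost:5678",
    "notify-slack",  # hypothetical webhook path
    {"action": "post_message", "channel": "#ops", "text": "Build finished"},
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

<p>The payload carries no prompt and no open question for n8n to answer, only an action to perform.</p>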
<hr>
<h2 id="three-gotchas-worth-knowing-before-you-deploy-n8n">Three Gotchas Worth Knowing Before You Deploy n8n</h2>
<p>Three things that tend to surprise people in the first day of running n8n:</p>
<p><strong>1. The REST API doesn&rsquo;t support PATCH for archiving workflows.</strong> You can create and read workflows via the API, but you can&rsquo;t delete or archive them programmatically. Cleanup has to go through the web UI. If you&rsquo;re planning to dynamically generate workflows, factor in manual cleanup or write directly to the SQLite database. (Fixed in n8n 2.22+; the 2.21.x line still has this limitation.)</p>
<p><strong>2. Webhook paths are globally unique, even for inactive workflows.</strong> Delete a workflow and its webhook path can stay registered, blocking reuse in any new workflow. Treat webhook paths as one flat, global namespace that you have to manage yourself, and prefix each path with the workflow name from day one.</p>
<p><strong>3. The API key scope doesn&rsquo;t include <code>workflow:execute</code>.</strong> You can read workflows over the API but you can&rsquo;t trigger them programmatically — webhooks are the only execution surface. For most architectures this is actually correct (webhooks ARE the integration point), but it can catch you off guard if you&rsquo;re expecting &ldquo;API to start a workflow run on demand.&rdquo;</p>
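<p>For the second gotcha, one way to manage the flat namespace is to generate paths from the workflow name and keep your own record of every path ever issued. This is a local bookkeeping sketch, not an n8n feature: <code>webhook_path</code> and <code>PathRegistry</code> are hypothetical helpers, and the hyphen-joined naming scheme is just one possible convention.</p>

```python
import re


def slug(text):
    # Lowercase and collapse anything non-alphanumeric into single hyphens.
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def webhook_path(workflow_name, endpoint):
    # Prefix the endpoint with the workflow name so paths never collide by accident.
    return f"{slug(workflow_name)}-{slug(endpoint)}"


class PathRegistry:
    """Tracks every path ever issued, including ones from deleted workflows."""

    def __init__(self):
        self._taken = set()

    def claim(self, workflow_name, endpoint):
        path = webhook_path(workflow_name, endpoint)
        if path in self._taken:
            raise ValueError(f"webhook path already used: {path}")
        self._taken.add(path)
        return path


registry = PathRegistry()
print(registry.claim("Invoice Sync", "Stripe Events"))
```

<p>Never removing entries from the registry is deliberate, since a path may stay reserved after its workflow is gone.</p>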
<hr>
<h2 id="when-you-should-choose-dify">When You Should Choose Dify</h2>
<p>To be fair: Dify is the right tool when:</p>
<ul>
<li>You <strong>don&rsquo;t want to write code</strong> or maintain individual infrastructure pieces.</li>
<li>You need a <strong>polished UI</strong> for non-technical users to build and tweak workflows.</li>
<li>You want a <strong>one-stop hosted experience</strong> (RAG + model gateway + observability + UI) and don&rsquo;t already have these pieces wired together.</li>
<li>You&rsquo;re building a <strong>customer-facing chatbot</strong> for a small team and need shipping speed over architectural flexibility.</li>
</ul>
<p>If any of those describe you, Dify is a serious choice and we wouldn&rsquo;t argue against it.</p>
<hr>
<h2 id="when-you-should-choose-n8n">When You Should Choose n8n</h2>
<p>n8n is the right tool when:</p>
<ul>
<li>You need to integrate with <strong>specific SaaS products</strong> (Notion, Slack, Stripe, Telegram, etc.) and don&rsquo;t want to write each API client by hand.</li>
<li>You want <strong>event-driven workflows</strong> (webhooks, polling, scheduling) without building your own event bus.</li>
<li>You want a <strong>visual editor</strong> so non-technical teammates can see and modify pipelines.</li>
<li>You&rsquo;re OK with workflows being <strong>execution-only</strong> — no judgment, just plumbing.</li>
</ul>
<p>n8n is <em>not</em> a good choice when:</p>
<ul>
<li>You need <strong>multi-step LLM reasoning</strong> with shared memory across steps. Use an agent harness instead (Claude Code, LangGraph, OpenAI&rsquo;s Agents SDK).</li>
<li>You need <strong>full control over prompt format, token budget, fallback chains</strong>. n8n&rsquo;s LLM nodes are too abstract for serious work.</li>
<li>Your workflow logic <strong>changes weekly</strong>. The visual editor is great for stable workflows; it&rsquo;s a drag for rapidly iterating ones — code is faster to refactor than nodes.</li>
</ul>
<hr>
<h2 id="the-deeper-principle-models-are-commodity-orchestration-is-the-moat">The Deeper Principle: &ldquo;Models Are Commodity, Orchestration Is the Moat&rdquo;</h2>
<p>The Dify-skip / n8n-absorb decision is downstream of a broader principle:</p>
<ul>
<li><strong>Models</strong> (DeepSeek, GPT, Claude, Mistral, Llama) are interchangeable. Swap them with an env var.</li>
<li><strong>Platforms</strong> (Dify, Langflow, Flowise) are also interchangeable. They package similar capabilities differently.</li>
<li><strong>Orchestration</strong> — the system that connects models, knowledge, tools, and outcomes — is where the leverage is.</li>
</ul>
<p>When you already have a strong orchestrator, you do not need a platform that wants to <em>be</em> your orchestrator. You need plumbing that does plumbing well. That&rsquo;s where n8n earns its place.</p>
<p>This principle generalizes. Every time you evaluate an AI platform, ask: <strong>does this want to own my orchestration layer, or fill a gap underneath it?</strong> If the answer is &ldquo;own,&rdquo; and you already have an orchestrator, skip it. If the answer is &ldquo;fill a gap,&rdquo; and the gap is real, absorb it.</p>
<hr>
<h2 id="closing-thought">Closing Thought</h2>
<p>Two platforms. Opposite decisions. Same underlying logic: <em>what slice does this want to own, and do I already own that slice?</em></p>
<ul>
<li><strong>Dify</strong> wanted to own the orchestration layer → already covered → <strong>skip</strong>.</li>
<li><strong>n8n</strong> wanted to own the execution layer (event triggers, SaaS integration, retry, polling) → not covered → <strong>absorb</strong>.</li>
</ul>
<p>If you&rsquo;re evaluating self-hosted AI tools right now, this is the question to ask first. It saves a lot of pointless deployments and even more pointless rework.</p>
<hr>
<p><em>References</em></p>
<ul>
<li><em><a href="https://github.com/langgenius/dify">Dify on GitHub</a></em>: 55k+ stars, Apache 2.0-based license with additional conditions</li>
<li><em><a href="https://github.com/n8n-io/n8n">n8n on GitHub</a></em>: 162k+ stars, Sustainable Use License</li>
<li><em><a href="https://modelcontextprotocol.io/">Model Context Protocol (MCP) specification</a></em></li>
<li><em><a href="https://aibrew.ai/2026/05/rag-vs-agents-when-to-use-which-with-real-examples-from-our-stack/">Our previous post: RAG vs Agents</a></em></li>
</ul>
]]></content:encoded>
    </item>
  </channel>
</rss>
