I Built a Multi-Agent Task System Using Markdown Files. No Database. No SaaS. Here's How It Works.
How I coordinate 5 AI agents using markdown files and YAML frontmatter instead of databases or SaaS tools. Dependency chains, supervisor review, and a $4/month server.

I spent weeks watching my AI agents step on each other.
One agent finishes enriching leads. Another agent is supposed to draft outreach using those leads. A third is waiting to write proposals based on that outreach.
None of them knew what the others were doing.
Work piled up. Things got skipped. I became the bottleneck — manually checking what was done, what was blocked, what needed to happen next.
If you've tried to coordinate multiple AI agents, you know this feeling. The individual agents work fine. The system doesn't.
So I stopped duct-taping Slack messages and status spreadsheets together. I built a task management layer from scratch inside OpenClaw.
The core design decision might surprise you.
The Architecture: Markdown Files With YAML
Every task in the system is a markdown file. Structured YAML frontmatter with fields like status, priority, assignee, and dependencies.
That's it. A folder of .md files.
No database. No SaaS orchestration tool. The entire coordination layer is filesystem-based.
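Concretely, a task file in this system might look something like the sketch below. Only status, priority, assignee, and the dependency field come from the article; the other names and values are illustrative guesses, not the actual schema:

```markdown
---
title: Draft outreach campaign for enriched leads
status: blocked            # todo | in_progress | blocked | review | done
priority: high
assignee: aria
blocked_by: [enrich-leads-batch-07]
---

Draft the outreach sequence once the enriched lead list lands.
```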
I know what you're thinking. "That can't scale."
And you're right — it won't handle thousands of concurrent agents or complex cross-project queries.
But that's not the problem I'm solving.
For a system coordinating 4–6 specialized agents on a single server, the filesystem is more than sufficient. The agents are already reading and writing files as part of their core workflow. They already understand YAML. There's no ORM, no connection pooling, no migration headaches.
It was a deliberate architectural decision for the constraints I'm actually working within. And the operational simplicity pays dividends every day.
Tasks live in an active/ directory when they're in play. Move to completed/ when they're done. Each agent has a personal queue file showing exactly what's on their plate. A CLI handles creation, updates, and status transitions — and enforces the rules so agents can't skip steps.
Simple. Readable. Debuggable.
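The "can't skip steps" part is just a transition table. A minimal sketch of the kind of guard the CLI might enforce — the status names appear in the article, but the exact allowed moves here are my assumption:

```python
# Hypothetical status-transition rules; the real CLI's rules may differ.
ALLOWED = {
    "todo":        {"in_progress"},
    "in_progress": {"review", "blocked"},
    "blocked":     {"todo"},
    "review":      {"done", "in_progress"},  # approved, or sent back for rework
    "done":        set(),
}

def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the move skips a step."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```

With a table like this, an agent can't jump straight from `in_progress` to `done` — work has to pass through `review` first.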
The Part That Actually Matters: Dependency Chains
Individual task tracking is table stakes. Every project management tool does that.
The real problem with multi-agent systems is sequencing.
Making sure work flows from one agent to the next without a human playing air traffic controller.
This is where the unblock chain comes in. It's the piece I'm most proud of.
How it works:
Every task can declare what it's blocked by. When an upstream task completes, the system automatically finds every downstream task waiting on it, removes the dependency, and if no other blockers remain, flips the task back to todo.
The assigned agent picks it up on their next heartbeat.
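The cascade logic itself is small. A sketch of the unblock chain as described above, with tasks held in memory as dicts (the real system reads and writes markdown files, and field names are assumptions):

```python
def on_complete(done_id: str, tasks: dict) -> list:
    """Mark a task done, clear it from downstream blockers, and
    return the ids of tasks that flipped back to 'todo'."""
    tasks[done_id]["status"] = "done"
    unblocked = []
    for tid, task in tasks.items():
        if done_id in task.get("blocked_by", []):
            task["blocked_by"].remove(done_id)
            # only flip if no other blockers remain
            if not task["blocked_by"] and task["status"] == "blocked":
                task["status"] = "todo"
                unblocked.append(tid)
    return unblocked
```

Each completion event touches only the tasks that named it as a blocker, so chains of any depth resolve one hop per completion.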
Real example from a client project:
- Rex (lead enrichment agent) finishes enriching 50 leads → task marked done
- ↓ automatically unblocks Aria (content agent) — outreach campaign draft was waiting on those leads → flips from `blocked` to `todo`
- ↓ when Aria finishes, automatically unblocks Paul (proposals agent) — was waiting on campaign messaging to write tailored proposals → flips to `todo`
Three agents. Three dependent tasks. Zero human intervention.
Before this system existed, that same sequence required me to check Rex's output, ping Aria that the leads were ready, wait for her to notice, then repeat for Paul.
Multiply that by a dozen active projects.
You start to understand why "managing AI agents" becomes a full-time job that defeats the purpose of having agents in the first place.
The cascade happens automatically now. And because the dependency graph is explicit — stored right in each task's frontmatter — I can look at any task and immediately see what it's waiting on and what it's holding up downstream.
Autonomous Agents Still Need Supervision
Here's where a lot of people building multi-agent systems get it wrong.
They assume once you automate the coordination, you're done. Let the agents run.
That's a recipe for garbage output at scale.
OpenClaw has a supervisor layer — an agent called Monica — that sits above the worker agents. Monica doesn't just assign tasks. She reviews output before it ships.
Here's what that looks like:
Paul generates Upwork proposals. He doesn't submit them directly. He sets his task status to review. Monica evaluates quality — is it personalized enough? Does the job fit our criteria? Is the client credible?
She approves or rejects with specific reasons. Paul only submits what passes.
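Mechanically, the gate is simple: only a supervisor decision moves a task out of `review`. A hedged sketch — the evaluation criteria are Monica's, not encoded here, and these names are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Review:
    approved: bool
    reasons: list  # specific reasons, whether approving or rejecting

def apply_review(task: dict, review: Review) -> dict:
    """Supervisor decision: approve ships the work, reject sends it back."""
    if task["status"] != "review":
        raise ValueError("only tasks awaiting review can be decided")
    task["status"] = "done" if review.approved else "in_progress"
    task["review_notes"] = review.reasons
    return task
```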
Think about it this way: You wouldn't let a junior employee send client-facing work without review. Why would you let an AI agent do it?
This is the hybrid AI pattern I keep coming back to. Deterministic systems for coordination and routing. An intelligent review layer that catches what rules can't.
AI checks AI before anything ships.
How Agents Pick Up Work
Each agent runs on a heartbeat — a periodic wake-up cycle.
The process:
- Check their personal queue file
- Grab the highest-priority `todo` task
- Execute the work autonomously
- Update status through the CLI
It's pull-based, not push-based.
If an agent is busy with a complex task, simpler tasks queue up naturally instead of getting lost in a notification stream.
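One heartbeat tick can be sketched as a pull from the queue — take at most one task, highest priority first. The priority scale and the hand-off to `review` at the end are assumptions on my part:

```python
# Hypothetical priority ordering; lower number wins.
PRIORITY = {"high": 0, "medium": 1, "low": 2}

def heartbeat(agent: str, tasks: list, execute):
    """One wake-up cycle: pull at most one task, run it, hand off for review."""
    mine = [t for t in tasks
            if t["assignee"] == agent and t["status"] == "todo"]
    if not mine:
        return None  # nothing queued; sleep until the next tick
    task = min(mine, key=lambda t: PRIORITY[t["priority"]])
    task["status"] = "in_progress"
    execute(task)
    task["status"] = "review"  # supervisor gate, not straight to done
    return task
```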
Communication is hybrid too. Primary channel is the task queue itself — file-based, async, reliable. Slack is secondary, reserved for urgent pings only. Strict routing rule: agents only respond when mentioned by name.
This prevents the pile-on problem where every agent tries to respond to every message.
There's also a real-time monitoring layer. The system auto-generates a kanban board and stats dashboard from the task files — showing how many tasks each agent has in todo, in progress, blocked, and review.
A snapshot of the entire operation at a glance. Updates every time a task status changes.
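The roll-up behind a board like that is a fold over the task files' frontmatter. A minimal sketch, assuming tasks have already been parsed into dicts:

```python
from collections import Counter

def board_stats(tasks: list) -> dict:
    """Map each agent to a Counter of statuses, e.g. {'aria': Counter({'todo': 2})}."""
    stats = {}
    for t in tasks:
        stats.setdefault(t["assignee"], Counter())[t["status"]] += 1
    return stats
```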
Sounds like a small thing. But it changed how I manage the system. Instead of asking "what's everyone working on?" I check the board. Instead of wondering if something's stuck, I look at the blocked column.
The visibility alone cut my management overhead by hours per week.
The Whole Thing Runs on a $4/Month Server
The entire system — task management, agent orchestration, dependency resolution, supervisor review — runs on a single 4GB Hetzner VPS.
No Kubernetes. No managed database. No external SaaS dependencies.
This wasn't an accident. It was a design constraint from day one.
Every dependency you add is a point of failure and a monthly cost that scales whether your usage does or not.
Markdown files on a filesystem don't go down because a third-party API changed their pricing.
There's also a debugging advantage nobody talks about. When something breaks in a database-backed system, you're writing queries and checking logs. When something breaks in a file-based system, you open the file and read it.
The entire state of every task is human-readable at all times. I can grep across all active tasks in milliseconds. Version control the entire task history with git. Manually edit a task's YAML in an emergency without spinning up a database client.
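Because state is plain text, "query the system" reduces to scanning files. A rough Python equivalent of grepping the active directory for blocked tasks (directory layout and frontmatter string are assumptions):

```python
from pathlib import Path

def blocked_tasks(task_dir: str) -> list:
    """Filenames of tasks whose frontmatter marks them blocked."""
    return sorted(p.name for p in Path(task_dir).glob("*.md")
                  if "status: blocked" in p.read_text())
```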
The tradeoff is clear — this won't support concurrent writes from thousands of agents. But for 4–6 specialized agents? The filesystem is more than enough.
What I'd Tell Someone Building Their Own
If you're working with multiple AI agents and hitting coordination problems, here's what I've learned:
The hard part isn't the AI. Getting a single agent to do a task well is straightforward now. The engineering challenge is everything around it — sequencing, dependencies, failure handling, and quality control when humans aren't watching every step.
Start with the simplest architecture that could work. My agents were already working with files. So I built coordination around files. If your agents interact through APIs, build around API calls. Match orchestration to your agents' native interface.
Build the review layer before you need it. By the time you realize output quality is inconsistent, you've already shipped bad work. Bake supervision in from day one. It's easier to loosen controls than to retrofit them after something goes wrong.
Dependency resolution is the unlock. The single biggest productivity gain wasn't making agents faster. It was eliminating the dead time between tasks. When work cascades automatically instead of waiting for a human to notice something's ready, your pipeline runs 24/7.
What's Next
I'm building OpenClaw in public. The task system is one piece of a larger architecture for running autonomous business operations — from lead discovery through enrichment, outreach, and proposal generation.
The agents aren't perfect. The system isn't finished.
But it works well enough that I'm running real client work through it daily. And the coordination layer is what makes that possible.
The tooling for multi-agent coordination is still early. The teams building it now will have a compounding advantage that's hard to replicate later.
If you're building multi-agent systems and hitting coordination problems — or if you want to see what this kind of automation could look like for your business — take the readiness assessment or reach out directly.