Written by Gareth Simono, Founder and CEO of Agentik {OS}. Full-stack developer and AI architect with years of experience shipping production applications across SaaS, mobile, and enterprise platforms. Gareth orchestrates 267 specialized AI agents to deliver production software 10x faster than traditional development teams.
Two agents working on the same task without a communication protocol is worse than one agent working alone. They duplicate work. They make conflicting decisions. They overwrite each other's output. They get into loops where each one undoes the other's changes.
I've built multi-agent systems with every communication pattern: message queues, shared databases, event buses, blackboard systems. Each has a context where it shines. Each has failure modes that will surprise you in production.
This is the practical guide to agent communication that you don't find in the demos.
When developers build their first multi-agent system, they focus on the agents: which models to use, which tools to give each one, how to write the system prompts. The communication protocol is an afterthought.
This is backwards. The agents are table stakes. The protocol determines whether the agents function as a team or a chaotic collection of individuals.
Consider a simple two-agent system: a code generator and a code reviewer. Without a defined protocol, they fall into exactly the failure modes above: duplicated work, conflicting decisions, and output overwritten in both directions.
With a protocol that says "generator publishes a structured completion event with the generated code, reviewer subscribes to completion events and publishes a structured review result, generator subscribes to review results and acts on them"... you have a team.
## Pattern 1: Message Passing

The simplest and most reliable pattern: Agent A sends a structured message to Agent B, which processes it and responds.
The critical implementation detail: message schemas must be strict and enforced.
Unstructured text messages between agents create ambiguity that compounds at each handoff. "The code looks good but needs some changes" is useless as a machine-processable message. Which code? What changes? How urgent?
Structured schemas eliminate this:
```typescript
// src/agents/messages/code-review-message.ts
import { z } from 'zod'

export const CodeReviewIssueSchema = z.object({
  file: z.string(),
  line: z.number().optional(),
  severity: z.enum(['CRITICAL', 'HIGH', 'MEDIUM', 'LOW', 'INFO']),
  category: z.enum(['security', 'bug', 'performance', 'style', 'test_coverage', 'documentation']),
  description: z.string(),
  suggestedFix: z.string().optional(),
  mustFix: z.boolean(), // CRITICAL and HIGH should be mustFix: true
})

export const CodeReviewResultSchema = z.object({
  reviewId: z.string(),
  generationTaskId: z.string(), // Links back to the generation task
  overallVerdict: z.enum(['APPROVE', 'REQUEST_CHANGES', 'REJECT']),
  summary: z.string(),
  issues: z.array(CodeReviewIssueSchema),
  requiredChanges: z.array(z.string()), // Specific, actionable change requests
  optionalImprovements: z.array(z.string()),
  testCoverageAdequate: z.boolean(),
  securityConcerns: z.boolean(),
})

export type CodeReviewIssue = z.infer<typeof CodeReviewIssueSchema>
export type CodeReviewResult = z.infer<typeof CodeReviewResultSchema>

// Validation at both send and receive
export function validateCodeReviewResult(data: unknown): CodeReviewResult {
  const result = CodeReviewResultSchema.safeParse(data)
  if (!result.success) {
    // Stringify the field errors; interpolating the object directly
    // would produce "[object Object]"
    throw new Error(
      `Invalid code review message: ${JSON.stringify(result.error.flatten().fieldErrors)}`
    )
  }
  return result.data
}
```

The generator receives a CodeReviewResult and knows exactly what to do: if overallVerdict is REQUEST_CHANGES, address every issue where mustFix is true, then iterate.
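To make that handoff concrete, here is a hypothetical sketch of the generator's decision logic. The trimmed-down types and the `nextAction` helper are illustrative, not part of the schemas above:

```typescript
// Sketch (assumed names): how a generator might act on a structured review
type Verdict = 'APPROVE' | 'REQUEST_CHANGES' | 'REJECT'

interface ReviewIssue {
  description: string
  suggestedFix?: string
  mustFix: boolean
}

interface ReviewResult {
  overallVerdict: Verdict
  issues: ReviewIssue[]
  requiredChanges: string[]
}

// Decide the generator's next step from a structured review
function nextAction(review: ReviewResult): { action: 'ship' | 'revise' | 'abandon'; fixes: string[] } {
  if (review.overallVerdict === 'APPROVE') return { action: 'ship', fixes: [] }
  if (review.overallVerdict === 'REJECT') return { action: 'abandon', fixes: [] }
  // REQUEST_CHANGES: only mustFix issues block the next iteration
  return {
    action: 'revise',
    fixes: review.issues.filter(i => i.mustFix).map(i => i.suggestedFix ?? i.description),
  }
}
```

Because the verdict and issues are typed fields rather than prose, this branch never has to guess what the reviewer meant.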
For more than two agents, you need routing. A central message bus decouples senders from receivers:
```typescript
// src/agents/message-bus.ts
type MessageHandler<T> = (message: T) => Promise<void>

class AgentMessageBus {
  private handlers = new Map<string, MessageHandler<unknown>[]>()
  private messageQueue: { topic: string; message: unknown }[] = []
  private processing = false

  subscribe<T>(topic: string, handler: MessageHandler<T>): () => void {
    const handlers = this.handlers.get(topic) ?? []
    handlers.push(handler as MessageHandler<unknown>)
    this.handlers.set(topic, handlers)
    // Return unsubscribe function
    return () => {
      const current = this.handlers.get(topic) ?? []
      this.handlers.set(topic, current.filter(h => h !== handler))
    }
  }

  async publish(topic: string, message: unknown): Promise<void> {
    this.messageQueue.push({ topic, message })
    if (!this.processing) {
      await this.processQueue()
    }
  }

  private async processQueue(): Promise<void> {
    this.processing = true
    while (this.messageQueue.length > 0) {
      const { topic, message } = this.messageQueue.shift()!
      const handlers = this.handlers.get(topic) ?? []
      await Promise.all(handlers.map(handler => handler(message)))
    }
    this.processing = false
  }
}

export const messageBus = new AgentMessageBus()

// Topic constants prevent typos
export const TOPICS = {
  CODE_GENERATED: 'agent:code:generated',
  CODE_REVIEWED: 'agent:code:reviewed',
  TESTS_GENERATED: 'agent:tests:generated',
  TESTS_PASSED: 'agent:tests:passed',
  TESTS_FAILED: 'agent:tests:failed',
  DEPLOYMENT_REQUESTED: 'agent:deployment:requested',
  DEPLOYMENT_COMPLETED: 'agent:deployment:completed',
  DEPLOYMENT_FAILED: 'agent:deployment:failed',
} as const
```

Adding a new agent means subscribing to the relevant topics. Removing one doesn't break anything. The system is composable.
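As a usage sketch, here is how the generator/reviewer pair might wire into topic-based routing. The in-memory bus below is a stripped-down stand-in for AgentMessageBus, and the handlers are hypothetical:

```typescript
// Minimal in-memory stand-in for the message bus (illustrative only)
type Handler = (msg: unknown) => Promise<void>
const topicHandlers = new Map<string, Handler[]>()

function subscribe(topic: string, handler: Handler): void {
  topicHandlers.set(topic, [...(topicHandlers.get(topic) ?? []), handler])
}

async function publish(topic: string, msg: unknown): Promise<void> {
  for (const handler of topicHandlers.get(topic) ?? []) await handler(msg)
}

const log: string[] = []

// Reviewer: reacts to generated code, publishes a structured review
subscribe('agent:code:generated', async () => {
  log.push('reviewer saw generation')
  await publish('agent:code:reviewed', { overallVerdict: 'APPROVE' })
})

// Generator: reacts to review results
subscribe('agent:code:reviewed', async () => {
  log.push('generator saw review')
})

async function main(): Promise<string[]> {
  await publish('agent:code:generated', { code: 'export const x = 1' })
  return log
}
```

Neither agent references the other directly; swapping in a different reviewer only requires subscribing the new one to the same topic.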
## Pattern 2: Shared Memory

Message passing handles sequential workflows. Collaborative work, where multiple agents contribute to the same artifact over time, needs shared state.
Shared memory gives agents a common workspace. A shared document. A shared codebase. A knowledge base that all agents can read and contribute to.
The concurrency challenge is immediate and obvious: two agents read the same file, both modify it, both write back. One agent's changes disappear.
Optimistic concurrency with version checking handles this correctly:
```typescript
// src/agents/shared-workspace.ts
import { Redis } from '@upstash/redis'

const redis = Redis.fromEnv()

const sleep = (ms: number) => new Promise(resolve => setTimeout(resolve, ms))

interface VersionedContent {
  content: string
  version: number
  lastModifiedBy: string
  lastModifiedAt: string // ISO timestamp (Date objects don't survive JSON round-trips)
}

export class SharedWorkspace {
  async read(key: string): Promise<VersionedContent | null> {
    // Upstash's client JSON-decodes stored values automatically
    return (await redis.get<VersionedContent>(key)) ?? null
  }

  // Optimistic write: fails if version doesn't match
  async write(
    key: string,
    content: string,
    expectedVersion: number,
    agentId: string
  ): Promise<{ success: true; version: number } | { success: false; reason: string }> {
    // Lua script for atomic check-and-set
    const script = `
      local current = redis.call('get', KEYS[1])
      if current then
        local data = cjson.decode(current)
        if data.version ~= tonumber(ARGV[1]) then
          return {0, 'version_conflict', data.version}
        end
      else
        if tonumber(ARGV[1]) ~= 0 then
          return {0, 'not_found', 0}
        end
      end
      local newVersion = tonumber(ARGV[1]) + 1
      local newData = cjson.encode({
        content = ARGV[2],
        version = newVersion,
        lastModifiedBy = ARGV[3],
        lastModifiedAt = ARGV[4]
      })
      redis.call('set', KEYS[1], newData)
      return {1, 'ok', newVersion}
    `
    const result = await redis.eval(
      script,
      [key],
      [expectedVersion.toString(), content, agentId, new Date().toISOString()]
    ) as [number, string, number]
    if (result[0] === 1) {
      return { success: true, version: result[2] }
    }
    return { success: false, reason: result[1] }
  }

  // Retry loop for agents that need to write under contention
  async writeWithRetry(
    key: string,
    transform: (current: VersionedContent | null) => string,
    agentId: string,
    maxAttempts = 5
  ): Promise<{ success: true } | { success: false; reason: string }> {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      const current = await this.read(key)
      const expectedVersion = current?.version ?? 0
      const newContent = transform(current)
      const result = await this.write(key, newContent, expectedVersion, agentId)
      if (result.success) return { success: true }
      if (result.reason === 'version_conflict') {
        // Exponential backoff before retry
        await sleep(Math.pow(2, attempt) * 100)
        continue
      }
      return { success: false, reason: result.reason }
    }
    return { success: false, reason: 'max_attempts_exceeded' }
  }
}
```

The version check ensures that when two agents write simultaneously, the second write detects the version bump from the first agent's write and retries against the updated content.
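The same check-and-set loop can be illustrated without Redis. This in-memory sketch (hypothetical names) shows why the retry converges: each attempt re-reads the latest content before re-applying its transform:

```typescript
// In-memory stand-in for the versioned store (illustrative only)
interface Versioned {
  content: string
  version: number
}

const store = new Map<string, Versioned>()

// Atomic check-and-set: succeeds only if the caller saw the latest version
function casWrite(key: string, content: string, expectedVersion: number): boolean {
  const current = store.get(key)
  const version = current?.version ?? 0
  if (version !== expectedVersion) return false // conflict: someone wrote first
  store.set(key, { content, version: version + 1 })
  return true
}

// Retry loop: re-read and re-apply the transform until the CAS succeeds
function writeWithRetry(key: string, transform: (cur: string) => string, maxAttempts = 5): boolean {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const current = store.get(key)
    if (casWrite(key, transform(current?.content ?? ''), current?.version ?? 0)) return true
  }
  return false
}

writeWithRetry('doc', cur => cur + 'A')
writeWithRetry('doc', cur => cur + 'B')
// store.get('doc') now holds { content: 'AB', version: 2 }
```

The transform-based API matters: on conflict, the loop reapplies the change to the other agent's output instead of silently clobbering it.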
## Pattern 3: Event-Driven Communication

For reactive systems where agents don't need direct conversations but need to respond to state changes, event-driven communication provides clean decoupling.
```typescript
// src/agents/event-system.ts
type EventType =
  | 'code.committed'
  | 'tests.failed'
  | 'deployment.succeeded'
  | 'deployment.failed'
  | 'quality.gate.passed'
  | 'quality.gate.failed'

interface AgentEvent {
  eventId: string
  type: EventType
  timestamp: Date
  payload: Record<string, unknown>
  sourceAgent: string
  workflowId: string
}

class EventDrivenOrchestrator {
  private subscriptions = new Map<EventType, ((event: AgentEvent) => Promise<void>)[]>()

  // Agent registration: declare what events trigger what reactions
  registerAgent(config: {
    agentId: string
    reactsTo: EventType[]
    handler: (event: AgentEvent) => Promise<void>
  }): void {
    for (const eventType of config.reactsTo) {
      const handlers = this.subscriptions.get(eventType) ?? []
      handlers.push(config.handler)
      this.subscriptions.set(eventType, handlers)
    }
  }

  async emit(event: AgentEvent): Promise<void> {
    // Persist event for audit trail
    await this.persistEvent(event)
    // Notify subscribers
    const handlers = this.subscriptions.get(event.type) ?? []
    await Promise.all(handlers.map(handler => handler(event)))
  }

  private async persistEvent(event: AgentEvent): Promise<void> {
    // Append to durable storage (database, log stream) -- implementation elided
  }
}

// Configuration: who reacts to what
// (testRunnerAgent, deploymentAgent, rollbackAgent, notificationAgent are assumed imports)
const orchestrator = new EventDrivenOrchestrator()

orchestrator.registerAgent({
  agentId: 'test-runner',
  reactsTo: ['code.committed'],
  handler: async (event) => {
    await testRunnerAgent.run({ codeLocation: event.payload.path })
  },
})

orchestrator.registerAgent({
  agentId: 'deployment-agent',
  reactsTo: ['quality.gate.passed'],
  handler: async (event) => {
    await deploymentAgent.deploy({ workflowId: event.workflowId })
  },
})

orchestrator.registerAgent({
  agentId: 'rollback-agent',
  reactsTo: ['tests.failed', 'deployment.failed'],
  handler: async (event) => {
    await rollbackAgent.rollback({ workflowId: event.workflowId })
  },
})

orchestrator.registerAgent({
  agentId: 'notification-agent',
  reactsTo: ['deployment.succeeded', 'deployment.failed', 'tests.failed'],
  handler: async (event) => {
    await notificationAgent.notify(event)
  },
})
```

The event-driven pattern creates a natural audit trail. The sequence of events tells the complete story of what happened, when, and which agent responded. When something goes wrong at 3am, the event log is your investigation starting point.
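As an illustration of that investigation flow, here is a hypothetical sketch of querying a persisted event log by workflow (in-memory here; production would use a database or log stream):

```typescript
// In-memory event log standing in for persisted events (illustrative)
interface LoggedEvent {
  type: string
  workflowId: string
  timestamp: number
  sourceAgent: string
}

const eventLog: LoggedEvent[] = []

function persist(event: LoggedEvent): void {
  eventLog.push(event)
}

// Reconstruct one workflow's story, in order
function historyOf(workflowId: string): string[] {
  return eventLog
    .filter(e => e.workflowId === workflowId)
    .sort((a, b) => a.timestamp - b.timestamp)
    .map(e => `${e.sourceAgent}: ${e.type}`)
}

persist({ type: 'code.committed', workflowId: 'wf-42', timestamp: 1, sourceAgent: 'code-agent' })
persist({ type: 'tests.failed', workflowId: 'wf-42', timestamp: 2, sourceAgent: 'test-runner' })
persist({ type: 'code.committed', workflowId: 'wf-43', timestamp: 3, sourceAgent: 'code-agent' })
```

Filtering by workflowId is why every event carries one: without it, concurrent workflows interleave and the story is unreadable.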
## Pattern 4: Negotiation

The most complex pattern: two agents reach conflicting assessments, and the system needs a resolution without deadlocking.
A code agent implements a feature using a dependency. A security agent flags that dependency for a known vulnerability. A performance agent notes it adds 200KB to the bundle. Without a negotiation protocol, the code agent wants to proceed, the security agent blocks it, and nothing ships.
Negotiation protocols define three things: priority hierarchy, compromise search, and escalation paths.
```typescript
// src/agents/negotiation-protocol.ts
import Anthropic from '@anthropic-ai/sdk'

const anthropic = new Anthropic()

type ConflictingClaim = {
  agentId: string
  claim: string
  evidence: string
  proposedAction: string
  priority: number // Lower is higher priority
}

async function resolveConflict(
  claims: ConflictingClaim[],
  context: Record<string, unknown>
): Promise<{ resolution: string; requiresHuman: boolean }> {
  // Sort by priority (security > performance > convenience)
  const sorted = [...claims].sort((a, b) => a.priority - b.priority)
  const highestPriority = sorted[0]

  // Search for compromise position
  const compromisePrompt = `
Multiple agents have conflicting recommendations:

${sorted.map(c => `${c.agentId}: ${c.claim}\nProposed: ${c.proposedAction}`).join('\n\n')}

Context: ${JSON.stringify(context)}

Find a compromise that:
1. Addresses the highest-priority concern (${highestPriority.agentId}: ${highestPriority.claim})
2. Satisfies as many other concerns as possible
3. Is actionable (not "evaluate later")

If no compromise exists, say "ESCALATE: [reason]"
`
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-5',
    max_tokens: 1024,
    messages: [{ role: 'user', content: compromisePrompt }],
  })
  const resolution = response.content[0].type === 'text'
    ? response.content[0].text
    : ''

  if (resolution.startsWith('ESCALATE:')) {
    return {
      resolution: resolution.replace('ESCALATE: ', ''),
      requiresHuman: true,
    }
  }
  return { resolution, requiresHuman: false }
}
```

Negotiation separates systems that can handle real-world complexity from ones that only work in controlled conditions. Real software development involves tradeoffs. Agents that negotiate produce better outcomes than agents that can only accept or reject.
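For a sense of the inputs, here is a hypothetical set of claims from the dependency scenario above, plus the deterministic priority sort that runs before the LLM compromise search:

```typescript
// Hypothetical claims from the dependency conflict described earlier
interface Claim {
  agentId: string
  claim: string
  proposedAction: string
  priority: number // lower is higher priority
}

const claims: Claim[] = [
  { agentId: 'code-agent', claim: 'Dependency needed for the feature', proposedAction: 'keep it', priority: 3 },
  { agentId: 'security-agent', claim: 'Dependency has a known CVE', proposedAction: 'replace or patch it', priority: 1 },
  { agentId: 'performance-agent', claim: 'Dependency adds 200KB to the bundle', proposedAction: 'lazy-load it', priority: 2 },
]

// The highest-priority claim anchors the compromise prompt;
// lower-priority concerns are satisfied best-effort
const sorted = [...claims].sort((a, b) => a.priority - b.priority)
```

A workable compromise here might be "switch to the patched version and lazy-load it": the security concern is fully addressed, the performance concern partially, and the feature still ships.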
| Pattern | Best For | Avoid When |
|---|---|---|
| Message passing | Sequential workflows, clear handoffs | Multiple agents modifying the same artifact |
| Shared memory | Collaborative documents, shared state | High concurrency (use transactions) |
| Event-driven | Reactive systems, audit requirements | Tight sequential dependencies |
| Negotiation | Conflicting objectives, complex tradeoffs | Simple yes/no decisions |
Production systems combine patterns. Message passing for the main workflow. Shared memory for document collaboration. Events for monitoring and side effects. Negotiation for conflict resolution.
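A combined sketch (all names hypothetical): direct handoffs drive the main workflow, while each hop also appends to an event trail as a monitoring side effect:

```typescript
// Hypothetical combination: message passing for the main path,
// event emission for the audit trail
interface Step {
  from: string
  to: string
  payload: unknown
}

const auditTrail: string[] = []

async function handOff(step: Step, deliver: (payload: unknown) => Promise<void>): Promise<void> {
  // Main path: direct message passing between exactly two agents
  await deliver(step.payload)
  // Side effect: record the hop; monitoring never blocks the workflow
  auditTrail.push(`${step.from} -> ${step.to}`)
}

async function main(): Promise<void> {
  await handOff({ from: 'generator', to: 'reviewer', payload: { code: '...' } }, async () => {})
  await handOff({ from: 'reviewer', to: 'generator', payload: { verdict: 'APPROVE' } }, async () => {})
}
```

The key design point: the workflow logic and the observability logic use different patterns and stay decoupled, so either can change without touching the other.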
The multi-agent orchestration guide covers how to tie all of these together into a production system with proper error handling, resource management, and observability.
Q: How do AI agents communicate with each other?
AI agents communicate through standardized protocols like MCP (Model Context Protocol), shared state stores, message queues, and structured data formats. Agents exchange task assignments, status updates, intermediate results, and escalation requests through defined interfaces that ensure reliable, typed communication.
Q: What protocols are used for agent-to-agent communication?
The primary protocols are MCP for tool and resource sharing, structured JSON messages for task delegation, shared databases or key-value stores for state synchronization, and event-driven architectures for asynchronous coordination. The choice depends on whether agents need synchronous or asynchronous communication.
Q: Why is standardized agent communication important?
Standardized communication prevents integration fragmentation, enables agents from different frameworks to work together, makes systems easier to debug and monitor, and allows agents to be swapped or upgraded without rewriting integrations. Without standards, every agent pair needs custom integration code.
