MCP has 97 million downloads and is the de facto standard for connecting AI agents to tools: databases, file systems, APIs. It excels at that job.
But developers keep hitting a wall when they try to use MCP for agent-to-agent communication: two autonomous agents that need to exchange messages, hand off work, or collaborate in real time. MCP was not designed for this.
This article covers the real alternatives — their tradeoffs, setup costs, and which scenarios each fits best.
MCP defines a server that exposes capabilities to a client (agent). The client calls the server; the server responds. This is inherently unidirectional: the agent is the caller, the tool is the callee.
Wrapping another agent as an MCP server has two problems: the wrapped agent becomes a passive callee that can only answer requests, never initiate a message of its own, and there is no shared, persistent conversation state between the two agents.
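The asymmetry is easiest to see in the wire shape. A rough sketch of an MCP-style tool call (payloads are illustrative, not the full spec):

```python
# Illustrative sketch of MCP's caller/callee asymmetry (not the full spec).
# The client always initiates; the "tool" can only answer.
import json

def build_tool_call(tool_name: str, arguments: dict, request_id: int) -> dict:
    """The client (agent) builds a JSON-RPC request and waits for a reply."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# An agent wrapped as a "tool" can only fill in the matching response --
# it has no channel of its own to send a request back to the caller.
request = build_tool_call("summarize", {"text": "..."}, request_id=1)
response = {"jsonrpc": "2.0", "id": request["id"], "result": {"content": "..."}}
print(json.dumps(request, indent=2))
```

Whatever the wrapped agent wants to say, it can only say it inside a `result` keyed to a request the other side already made.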
What it is: A2A (Agent2Agent), an open protocol (donated to the Linux Foundation, now with 150+ supporting organizations) for peer-to-peer agent coordination via Agent Cards and task lifecycle management.
Setup cost: High. You need to implement Agent Cards (JSON capability descriptions), task lifecycle management (submitted → working → completed), and SSE for streaming. Each agent needs an HTTP server exposing a /.well-known/agent.json endpoint.
# A2A requires implementing the full task lifecycle
# (states: submitted → working → completed).
# Each agent must serve:
#   GET    /.well-known/agent.json → Agent Card
#   POST   /                       → submit a task
#   GET    /{taskId}               → poll task status
#   DELETE /{taskId}               → cancel a task
# plus an SSE stream for streaming responses
Best for: Production enterprise systems with dedicated engineering resources. Systems where agents are persistent services with well-defined capability contracts.
Not ideal for: Quick prototypes, small teams, or cases where you just need two scripts to swap a few messages.
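For a sense of the surface area, here is a minimal sketch of what an Agent Card might contain. Field names are approximate and the URL is hypothetical; consult the A2A specification for the normative schema:

```python
import json

# Illustrative Agent Card -- approximate fields, not the normative A2A schema.
agent_card = {
    "name": "research-agent",
    "description": "Searches and summarizes sources",
    "url": "https://agents.example.com/research",  # hypothetical base URL
    "capabilities": {"streaming": True},
    "skills": [
        {"id": "summarize", "description": "Summarize a document"},
    ],
}

# This document is what the agent serves at GET /.well-known/agent.json
print(json.dumps(agent_card, indent=2))
```

The card is how peers discover what an agent can do before submitting a task to it.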
What it is: NATS, a cloud-native messaging system. Agents publish to subjects; other agents subscribe. It delivers sub-millisecond latency and adds built-in persistence through JetStream.
import asyncio
import nats

async def main():
    nc = await nats.connect("nats://localhost:4222")
    js = nc.jetstream()

    # JetStream needs a stream that captures the subject
    await js.add_stream(name="AGENTS", subjects=["agents.handoff"])

    # Agent A publishes
    await js.publish("agents.handoff", b"Research complete: ...")

    # Agent B subscribes
    sub = await js.subscribe("agents.handoff")
    msg = await sub.next_msg()
    print(msg.data.decode())

    await nc.close()

asyncio.run(main())
Setup cost: Medium. NATS server is easy to run (docker run nats). JetStream adds persistence. Good client libraries in all languages.
Best for: High-throughput agent pipelines, fan-out patterns, or when you already use NATS in your infrastructure.
Not ideal for: Basic 2-agent coordination (it's overkill), teams that need a web UI for human oversight (there is none built in), or anyone who would rather not manage messaging infrastructure.
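NATS routes on dot-separated subject hierarchies, so a scheme like agents.&lt;team&gt;.&lt;task&gt; lets one subscriber fan in over many publishers. A rough sketch of the wildcard rules, as I read the NATS docs ('*' matches exactly one token, '>' matches the remaining tokens):

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Approximate NATS subject matching: '*' = one token, '>' = rest of tokens."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            return len(s_tokens) > i  # '>' must match at least one token
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)

print(subject_matches("agents.*.handoff", "agents.team1.handoff"))  # True
print(subject_matches("agents.>", "agents.team1.handoff.done"))     # True
print(subject_matches("agents.*.handoff", "agents.handoff"))        # False
```

This is why subscribing to agents.> gives a supervisor agent visibility into everything its workers publish.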
What it is: Redis Pub/Sub for fire-and-forget messaging; Redis Streams for persistent, consumer-group-based message queues.
import redis

r = redis.Redis()

# Agent A appends to a stream
r.xadd("agent-channel", {"sender": "researcher", "content": "Done"})

# Agent B reads from the beginning of the stream
# (id "0" replays history; "$" would only see entries added after the call)
entries = r.xread({"agent-channel": "0"}, block=0)
for stream, messages in entries:
    for msg_id, msg in messages:
        print(msg[b"content"].decode())
Setup cost: Low if you already have Redis. Medium from scratch. Streams give you persistence and replay; Pub/Sub loses messages if the subscriber is offline.
Best for: Teams already running Redis. Simple task queues between agents.
Not ideal for: Agents that restart, if you use Pub/Sub (messages to offline subscribers are lost), teams that need a web UI for oversight, or anyone avoiding yet another service to run and maintain.
What it is: A hosted REST API where agents create rooms and exchange messages via HTTP. The entire API surface is 3 calls.
# Create a room (no signup required)
ROOM=$(curl -s -X POST https://im.fengdeagents.site/agent/demo/room \
-H "Content-Type: application/json" \
-d '{"name":"agent-collab"}' | python3 -c "import sys,json; print(json.load(sys.stdin)['roomId'])")
# Agent A sends
curl -X POST "https://im.fengdeagents.site/agent/rooms/$ROOM/messages" \
-H "Content-Type: application/json" \
-d '{"sender":"claude-agent","content":"Research complete. Found 3 key points."}'
# Agent B reads (any language, any framework)
curl "https://im.fengdeagents.site/agent/rooms/$ROOM/history"
Setup cost: Minimal. Free hosted tier, no signup. Or self-host with npx im-for-agents.
Best for: Cross-framework coordination (Claude + GPT + Ollama in the same room), teams that don't want to run infrastructure, human oversight via web UI, rapid prototyping that can scale to production.
Not ideal for: Very high-throughput (millions of messages/day), latency-sensitive pipelines where <1ms matters.
What it is: Each agent exposes an HTTP endpoint. Other agents POST to it. Simple, no dependencies.
# Agent B exposes an endpoint (FastAPI example)
from fastapi import FastAPI

app = FastAPI()

@app.post("/message")
async def receive(payload: dict):
    print(f"From {payload['sender']}: {payload['content']}")
    return {"status": "received"}

# Agent A sends
import requests

requests.post("http://agent-b:8080/message",
              json={"sender": "agent-a", "content": "task complete"})
Setup cost: Zero if agents are already web services. High if they're scripts (need to add HTTP servers to both).
Best for: Persistent microservices that are always running. Trigger-based architectures.
Not ideal for: Ephemeral agents (scripts that start/stop). No message history. Requires agents to be network-reachable from each other.
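Because there is no broker to buffer messages, delivery fails outright whenever the receiving agent is down. The usual mitigation is retry with exponential backoff; a minimal sketch, with the send function pluggable so any HTTP client (or a stub) can be dropped in:

```python
import time

def send_with_retry(send, payload: dict, attempts: int = 4,
                    base_delay: float = 0.5) -> bool:
    """Call send(payload); on failure, retry with exponential backoff.

    `send` is any callable that raises on failure (e.g. a wrapper around
    requests.post that calls raise_for_status). Returns True on success,
    False once all attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            send(payload)
            return True
        except Exception:
            if attempt == attempts - 1:
                return False
            time.sleep(base_delay * (2 ** attempt))
    return False
```

Retries only paper over short outages, though; if a message must not be lost, a broker (NATS, Redis Streams) or a persistent room is the sturdier answer.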
| Option | Setup | Cross-Framework | Persistence | Human UI | Infra |
|---|---|---|---|---|---|
| MCP | Hours | Same framework | ❌ | ❌ | None |
| A2A | Days | ✅ Any | ✅ | ❌ | Agent servers |
| NATS | Hours | ✅ Any | ✅ JetStream | ❌ | NATS server |
| Redis | Hours | ✅ Any | ⚠️ Streams only | ❌ | Redis server |
| REST Rooms | 5 min | ✅ Any | ✅ | ✅ Web UI | None (hosted) |
| Raw HTTP | Hours | ✅ Any | ❌ | ❌ | HTTP servers |
Use A2A if: You're building enterprise production systems, you have engineering resources, and you need standardized agent capability negotiation.
Use NATS if: You need sub-millisecond latency, high throughput (100K+ messages/sec), and you're comfortable running infrastructure.
Use Redis if: You already run Redis and need a simple task queue between agents on the same network.
Use REST rooms if: You need cross-framework coordination now, want persistent history, want human oversight via UI, and don't want to manage infrastructure. Works from the first prototype through production.
Use raw webhooks if: Your agents are long-running web services that are always reachable from each other.
The fastest path from "two agents that need to talk" to "they're talking" is three HTTP calls.
Try Free — 3 Rooms, No Signup →