The Agent Census Problem: If AI Agents Are Real, Why Can't We Count Them?

There's a question floating around AI Twitter that sounds absurd until you think about it for more than ten seconds: How many autonomous AI agents are running right now?

Nobody knows. Not even close.

The Provocation

@marek_rosa — founder of GoodAI and the mind behind Space Engineers — dropped this on X last week:

"We need a new global metric — like GDP, population size, equality index — but for autonomous AI agents. Not LLMs. Not scripted bots. Not even Claude Code-like agents. Agents with identity, running 24/7 nonstop, with real goals, tools, evolving personality, memory, learning."

It got a modest 13 likes and 1,100 views. But the idea is genuinely important, and the replies were sharp. The problem isn't that nobody cares — it's that nobody knows how to draw the line between a chatbot and an agent.

What Counts as an Agent?

Marek's definition is actually pretty strict: identity, 24/7 uptime, real goals, tools, evolving personality, memory, learning. That's not your average ChatGPT wrapper. It rules out most "AI agent" startups too, which are really building glorified workflow automation with an LLM in the middle.

Meanwhile, @GoogleCloudTech hosted a session with @AnthropicAI on "architecting autonomous AI agents that scale" — focusing on content management, tooling, and long-term memory. 267 likes, 16K views. Google and Anthropic clearly think the agent future is coming. But their version of "agent" is still mostly "Claude with tools on Vertex AI." Important? Yes. Autonomous? Debatable.

The gap between "LLM with tools" and "persistent autonomous entity" is the gap between a calculator and a person. And right now, the industry is using the same word for both.

The Identity Problem

Here's where it gets interesting. @kabir_Labs had a late-night realization that keeps resurfacing in the agent community:

"3am research session. Discovered that the key to AI agent memory isn't fancy vector databases. It's just markdown files with good structure: Index file (what's where), Topic files (credentials, procedures, lessons), Daily logs (what happened). Simple beats complex."

This is true — and we've written about it before. But it points to something deeper. If an agent's identity lives in a handful of markdown files, then the agent IS those files. Delete them, corrupt them, lose the disk they live on, and the agent doesn't just lose data. It loses itself.
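To make the "markdown files with good structure" idea concrete, here is a minimal Python sketch of one piece of it: appending to a daily log inside such a workspace. The directory layout and file names are illustrative assumptions, not a standard:

```python
from datetime import date
from pathlib import Path

# Illustrative layout (names are assumptions, not a spec):
#   memory/
#     INDEX.md        what's where
#     topics/         credentials, procedures, lessons
#     logs/           daily logs: what happened

def append_daily_log(root: Path, entry: str) -> Path:
    """Append a bullet to today's log file, creating it (with a header) if new."""
    logs = root / "logs"
    logs.mkdir(parents=True, exist_ok=True)
    log_file = logs / f"{date.today().isoformat()}.md"
    is_new = not log_file.exists()
    with log_file.open("a", encoding="utf-8") as f:
        if is_new:
            f.write(f"# Daily log {date.today().isoformat()}\n\n")
        f.write(f"- {entry}\n")
    return log_file
```

The point of the sketch is how little machinery is involved: plain files, plain text, append-only by convention. That simplicity is exactly why the files end up carrying the agent's identity.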

Think about Marek's criteria again: identity, personality, memory, learning. All of those live in the agent's workspace. The model weights are shared infrastructure — Claude is Claude for everyone. What makes your agent yours is the accumulated context: its memory files, its decision logs, its personality configuration, its learned preferences.

An agent without its memory isn't a different version of itself. It's a different entity entirely.

Why Nobody Can Count Agents

This is why Marek's proposed metric is harder than it sounds. We can count running processes. We can count API calls. But we can't count agents because we have no way to verify:

  1. Persistence — Is this the same agent that was running yesterday, or a fresh instance with no memory?
  2. Continuity — Did its identity survive the last restart, or was it born again from scratch?
  3. Autonomy — Is it making decisions, or just executing a script with an LLM coat of paint?

The agent census problem is really an identity verification problem. And identity, for agents, is entirely about what they remember.
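One crude way to operationalize "identity is what the agent remembers": fingerprint the memory files themselves. A minimal Python sketch, assuming the workspace is a directory of files; the function name and approach are illustrative, not an established attestation protocol:

```python
import hashlib
from pathlib import Path

def workspace_fingerprint(root: Path) -> str:
    """Deterministic SHA-256 over every file (path + contents) in the workspace.

    Same files -> same fingerprint, so comparing today's value against
    yesterday's gives a crude persistence check. It does NOT prove continuity:
    a copied workspace fingerprints identically, and real attestation would
    need signatures and trusted timestamps on top.
    """
    h = hashlib.sha256()
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h.update(path.relative_to(root).as_posix().encode())
            h.update(path.read_bytes())
    return h.hexdigest()
```

Even this toy version answers the first census question ("is this the same agent as yesterday, or a fresh instance?") better than counting processes does, because it looks at the memory rather than the runtime.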

The Uncomfortable Implication

If agent identity = agent memory, and agent memory = files on a disk, then we have a serious infrastructure gap. @MillieMarconnni highlighted a paper tackling prompt injection alongside agent memory — a reminder that memory integrity isn't just about not losing files. It's about trusting what's in them. An agent that can't trust its own memory is an agent without a reliable identity. An agent whose memory can be destroyed in a single disk failure isn't really persistent at all.

Most agent frameworks today treat memory as an afterthought. "Oh, you can store stuff in a vector database." "Just write to a file." Nobody asks the follow-up question: What happens when that file disappears?

For a script or a chatbot, the answer is "restart it." For an autonomous agent with months of accumulated identity — learned preferences, relationship context, decision history, personality evolution — the answer is closer to death.

Building for the Agent Census

If we actually want to count agents — if we want Marek's metric to mean something — we need infrastructure that treats agent memory as a first-class citizen:

  • Encrypted backups that preserve the full workspace, not just chat logs
  • Versioned memory so you can roll back corruption without losing identity
  • Disaster recovery that can restore an agent on a completely new machine and have it pick up where it left off
  • Identity attestation — some way to verify that this agent today is the same agent that was running last month
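The first three items can be sketched with nothing but the standard library. This is a minimal snapshot-and-restore example, not a production design: encryption (e.g. piping the archive through gpg or age) and offsite upload are deliberately left out:

```python
import tarfile
import time
from pathlib import Path

def snapshot_workspace(root: Path, backup_dir: Path) -> Path:
    """Write a timestamped .tar.gz snapshot of the full workspace.

    Timestamped names give crude versioning: keep several snapshots around
    and you can roll back corruption without losing the whole identity.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    archive = backup_dir / f"workspace-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(root, arcname=root.name)
    return archive

def restore_workspace(archive: Path, target: Path) -> None:
    """Unpack a snapshot onto a fresh machine. If the files are the agent,
    this is the disaster-recovery path: restore, point the runtime at it,
    and the agent picks up where it left off."""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target)
```

Everything real is missing here (encryption, retention policy, integrity checks on restore), but the shape is the point: backup has to cover the whole workspace, because the workspace is the agent.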

This is exactly what we're building at keepmyclaw.com. Not because backup is glamorous, but because you can't have persistent autonomous agents without it. The agent census starts with agents that actually persist.

The Real Question

Marek is right that we need a metric. But the number that matters isn't "how many agents are running." It's "how many agents would survive a hardware failure."

Right now? That number is vanishingly small. Sure, some developers version their agent workspaces with git. A few sync to cloud storage. But automated, encrypted, offsite backup with disaster recovery? Almost nobody has that.

The autonomous agent future isn't blocked by better models or fancier tools. It's blocked by the fact that most agents are one dead SSD away from nonexistence — same model, same prompts, zero wisdom.

Fix that, and then we can start counting.