The AI Periodic Table — A Grammar for the Age of Intelligence (Part 2)

The original table organized AI into rows representing maturity and columns representing functional families. That instinct was exactly right. What I kept coming back to was the column logic — whether the families were capturing the right questions.



Part 2: Systems, Reactions, and the Gaps That Matter - The Elements (continued)

Systems: Production-Grade

Agent (Ag) — Control, Systems. An autonomous loop: observe the environment, plan a sequence of actions, act using tools, reflect on the result, repeat until the goal is satisfied. This is control taken to its logical extreme — the model is not responding to a prompt, it is pursuing an objective. The implications for deployment, for safety, for observability are qualitatively different from everything below it in the column. "My chatbot uses GPT, so it's an agent" is the most common and most consequential misconception in the field right now.
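The loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: `run_agent`, the `plan` callback, and the `tools` dictionary are hypothetical names, and the planner is a plain function standing in for a model call.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types for illustration only.
Step = Tuple[str, str, str]                      # (tool name, argument, result)
Tool = Callable[[str], str]
# The planner sees the goal and the history so far. It returns (tool, argument),
# or None when it judges the goal satisfied. In a real agent this is a model call.
Planner = Callable[[str, List[Step]], Optional[Tuple[str, str]]]


def run_agent(goal: str, plan: Planner, tools: Dict[str, Tool], max_steps: int = 10):
    """Observe, plan, act, reflect; repeat until the planner stops or the budget runs out."""
    history: List[Step] = []
    for _ in range(max_steps):
        decision = plan(goal, history)           # plan from everything observed so far
        if decision is None:                     # reflect: goal judged satisfied
            return history, True
        tool_name, argument = decision
        result = tools[tool_name](argument)      # act through a tool
        history.append((tool_name, argument, result))  # observe the outcome
    return history, False                        # step budget exhausted


# Toy usage: a scripted planner that looks something up once, then stops.
def scripted_plan(goal, history):
    return None if history else ("lookup", goal)


steps, done = run_agent("capital of France", scripted_plan, {"lookup": lambda q: "Paris"})
```

The `max_steps` budget is the important design choice here: a prompt-and-response chatbot has no loop to bound, while an agent always does.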

Knowledge Base (Kb) — Memory, Systems. A curated, structured store of domain facts used to ground model responses — distinct from a vector database in that it is organized, maintained, and authoritative, not just searchable. The difference between a filing cabinet full of documents and a reference library with a trained librarian.

Thinking Model (Th) — Architecture, Systems. A model architected to allocate additional inference-time compute — often through internal reasoning tokens or structured deliberation steps — before producing a final answer. Performance improves with increased test-time compute. Examples include OpenAI’s o1/o3 and DeepSeek R1.

These are not merely larger or better-trained models. They are designed to separate a reasoning phase from an answer phase, changing how they are prompted, evaluated, costed, and deployed.

Framework (Fw) — Orchestration, Systems. The platform that provides plumbing for building and deploying AI systems — LangChain, DSPy, LlamaIndex. Frameworks are orchestration at the systems level: they give you the abstractions, the routing, the memory management, the tool integration that lets you compose elements without rebuilding the wiring from scratch every time. Critically, frameworks are not AI. They are the infrastructure that makes AI systems buildable by humans who have other things to do.

Observability (Ob) — Observability, Systems. The integrated telemetry stack: traces, metrics, and logs working together so you can debug production AI systems. Logging records infrastructure-level events and tracing follows execution paths; full observability unifies them into a single view of what your system is doing, why, and how well. In a world where your system's behavior emerges from the interaction of many components, observability is not optional. It is the only way to know what is actually happening.

Red Teaming (Rt) — Safety, Systems. Adversarial testing: structured attempts to jailbreak the model, inject prompts, extract training data, bypass guardrails. This is safety as an active practice rather than a passive filter. You hire people — or build automated systems — whose job is to find how your AI can be made to misbehave, before your users find it first. The distance between "we have guardrails" and "we have red-teamed our guardrails" is the distance between security theater and actual security.

RLHF (Rl) — Adaptation, Systems. Reinforcement learning from human feedback: humans rate model outputs, a reward model learns to predict human preferences, and the policy is optimized against that reward model using PPO. This is how ChatGPT became ChatGPT. How Claude became Claude. The jump from a capable base model to a model that behaves the way humans want it to behave.
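The reward-model step can be made concrete with the pairwise preference loss that is commonly used for it. The article does not specify a loss, so treat this as one standard formulation rather than the method any particular lab uses; `preference_loss` is a hypothetical helper name.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)).

    Small when the reward model scores the human-preferred output higher,
    large when it scores the rejected output higher.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow for large |m|.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


agrees = preference_loss(2.0, 0.0)     # reward model agrees with the human rating
disagrees = preference_loss(0.0, 2.0)  # reward model disagrees
```

Minimizing this loss over many rated pairs is what teaches the reward model to predict human preferences; the policy is then optimized against that learned reward.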

Emerging: The Frontier

Multi-Agent (Ma) — Control, Emerging. Not one AI — many, working together. Specialized agents that research, write, critique, verify, coordinate. One agent's output becomes another agent's input. The emergent capability of a well-designed multi-agent system can exceed what any single model achieves alone, for the same reason that a team of specialists outperforms a single generalist on complex problems. The engineering challenges — coordination, consistency, error propagation, cost — are genuinely unsolved at scale.

Synthetic Data (Sy) — Memory, Emerging. Using AI to generate training data for AI. When you cannot get enough real examples — because the domain is rare, the data is sensitive, or the scenarios you need simply have not happened yet — you generate them. The philosophical questions it raises — models trained on model outputs, recursively — are real and open.

Diffusion LLM (Df) — Architecture, Emerging. Every model in this table so far generates text the same way: one token at a time, left to right, each token conditioned on everything before it. Autoregressive generation is so dominant that most people assume it is the only way. It is not.

Diffusion LLMs invert the process. They start with noise across the full output sequence and iteratively denoise toward a coherent response: the same idea image diffusion models use, now applied to discrete token space. The whole output exists in degraded form from step one and sharpens across iterations. Tokens are updated in parallel within each denoising step, though the denoising process itself remains iterative. Mercury from Inception Labs and MDLM are early examples that benchmark competitively on certain tasks.

The architectural implications are significant: different latency characteristics, the ability to revise any part of the output at any iteration rather than being locked into left-to-right commitments, and fundamentally different behavior under long-context and reasoning workloads. Whether diffusion LLMs displace autoregressive models, complement them, or find a specialized niche is genuinely open. What is not open is that this is a different architectural family, the way transformers were a different family from LSTMs, not an incremental improvement on what came before. The gap at Emerging/Architecture asked: what is the next structural innovation in how AI systems process information? This is one credible current candidate.
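The parallel, iterative refinement described above can be illustrated with a toy in the style of masked discrete diffusion. This is a simplified sketch, not how Mercury or MDLM are implemented: `denoise` and `toy_denoiser` are hypothetical names, the denoiser is a stub that already knows the answer, and committed tokens are never revised here, which real systems may do.

```python
from typing import Callable, List, Tuple

MASK = "<mask>"
# Hypothetical denoiser: given the current sequence, return a (token, confidence)
# guess for every position. In a real model this is one forward pass.
Denoiser = Callable[[List[str]], List[Tuple[str, float]]]


def denoise(length: int, denoiser: Denoiser, steps: int) -> List[List[str]]:
    """Start fully masked; at each step commit the most confident guesses in parallel."""
    seq = [MASK] * length
    trajectory = [list(seq)]
    for step in range(steps):
        masked = [i for i, tok in enumerate(seq) if tok == MASK]
        if not masked:
            break
        guesses = denoiser(seq)
        # Commit an even share of the remaining masked positions this step.
        quota = -(-len(masked) // (steps - step))  # ceiling division
        best = sorted(masked, key=lambda i: guesses[i][1], reverse=True)[:quota]
        for i in best:
            seq[i] = guesses[i][0]
        trajectory.append(list(seq))
    return trajectory


# Toy denoiser that "knows" the target and is most confident about later words.
TARGET = ["the", "cat", "sat", "down"]


def toy_denoiser(seq):
    return [(TARGET[i], float(i)) for i in range(len(seq))]


trajectory = denoise(len(TARGET), toy_denoiser, steps=2)
```

With this stub, the second state is `<mask> <mask> sat down`: the end of the sentence is committed before the beginning, which an autoregressive decoder cannot do.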

Interpretability (In) — Observability, Emerging. Mechanistic understanding of why a model does what it does. Not just "the output was X" but "these specific neurons and circuits, activated by this pattern in the input, produced this behavior." Anthropic's ongoing work on tracing the computational graphs inside language models is frontier safety and frontier science simultaneously. We are beginning to be able to peek inside the black box, and what we find is surprising.

Alignment (Al) — Safety, Emerging. The set of techniques for ensuring that model goals, values, and behaviors match human intent at scale — not just in the controlled conditions of evaluation, but in the open-ended conditions of deployment. Constitutional AI, RLAIF, scalable oversight. RLHF is one tool. Alignment is the broader challenge that RLHF partially addresses. It sits in Emerging not because the work is not happening, but because the problem is not solved.

Agent-to-Agent Protocol (Aa) — Protocols, Emerging. Where MCP solves agent-to-tool communication, A2A — Google's Agent-to-Agent protocol, proposed in 2025 — solves agent-to-agent communication. The distinction matters precisely. In MCP, one party is active (the agent), one is passive (the tool). The tool does not plan, does not negotiate, does not push back. In A2A, both parties are autonomous. An orchestrator agent delegates a task to a specialist agent. The specialist can clarify, report partial results, or refuse. This is peer-to-peer coordination between systems that both have goals. It belongs in Protocols because it is, like MCP, a communication standard — not a framework you build in, not a pattern you coordinate with, but a contract that defines how autonomous systems talk to each other. The Protocols column now tells a complete story from bottom to top: raw HTTP/JSON contract, standardized agent-to-tool protocol built on JSON-RPC, a gap at Systems where no AI-native protocol yet exists, and peer agent communication at the frontier. Each row a higher level of communication abstraction than the one below.
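The passive-versus-autonomous distinction can be shown in code. The sketch below is purely illustrative and does not reproduce the MCP or A2A wire formats: `word_count_tool`, `Reply`, `SpecialistAgent`, and the three status labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


def word_count_tool(text: str) -> int:
    """Passive party: executes whatever it is given. It never plans or pushes back."""
    return len(text.split())


@dataclass
class Reply:
    status: str                    # "completed", "needs_input", or "refused" (illustrative labels)
    content: Optional[str] = None


class SpecialistAgent:
    """Autonomous party: decides for itself how to respond to a delegated task."""

    def __init__(self, skills):
        self.skills = set(skills)

    def handle(self, task: str, topic: str, deadline: Optional[str] = None) -> Reply:
        if topic not in self.skills:
            return Reply("refused", f"{topic} is outside my scope")
        if deadline is None:
            return Reply("needs_input", "When is this due?")
        return Reply("completed", f"{task} will be ready by {deadline}")


legal = SpecialistAgent(skills=["contracts"])
refusal = legal.handle("Review the lease", topic="tax")
question = legal.handle("Review the lease", topic="contracts")
result = legal.handle("Review the lease", topic="contracts", deadline="Friday")
```

The tool has one possible outcome; the peer has three. A protocol between peers has to carry all three, which is why the contract differs from a tool-calling contract.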


Reactions — The Table Made Useful

The table earns its keep not in memorization but in combination. Chemistry's power was always in predicting reactions, not cataloging elements. Here is what that looks like for AI.

Production RAG Chatbot. You take your documents and convert them to embeddings (Em). You store those embeddings in a vector database (Vx). When a user asks a question, RAG (Rg) retrieves the relevant chunks and augments the prompt (Pr), which goes to the base model (Lm) to generate an answer grounded in retrieved context. Guardrails (Gr) wrap the output before it reaches the user. Schema (Sc) enforces format compliance. The initial reaction — Em + Vx + Rg + Pr + Lm → Gr + Sc — is what most teams ship. But run it through the table and a gap surfaces immediately: the Observability column is empty. Without Tracing (Tr) and Evaluation (Ev), you cannot tell whether your retrieval is finding the right chunks, whether your answers are improving or degrading, or where the failure is when something goes wrong. The complete, production-ready reaction: Em + Vx + Rg + Pr + Lm → Gr + Sc + Tr + Ev.
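To make the reaction concrete, here is a toy end-to-end sketch with each element marked in a comment. Every piece is a stand-in chosen for illustration: the "embedding" is a word count, the "vector database" is a list, the "model" is a function passed in by the caller, and the guardrail is a single keyword rule. All function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Em: a toy bag-of-words vector, standing in for a learned embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, store, k: int = 1):
    """Vx + Rg: rank stored (text, vector) pairs by similarity to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def answer(query: str, store, model, trace: list) -> dict:
    chunks = retrieve(query, store)
    prompt = "Context:\n" + "\n".join(chunks) + "\n\nQuestion: " + query  # Pr
    raw = model(prompt)                                                    # Lm
    safe = "[withheld]" if "password" in raw.lower() else raw              # Gr: toy output rule
    trace.append({"query": query, "chunks": chunks, "output": safe})       # Tr
    return {"answer": safe, "sources": chunks}                             # Sc: fixed output shape


docs = ["Refunds are issued within 14 days", "Shipping takes 3 business days"]
store = [(d, embed(d)) for d in docs]
echo_model = lambda prompt: prompt.splitlines()[1]  # stand-in model: repeats the first context line
trace = []
result = answer("how long do refunds take", store, echo_model, trace)
```

Evaluation (Ev) is not implemented here, but the `trace` list is what it would read: each entry records which chunks were retrieved for which query, which is exactly what you need to judge retrieval quality.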

Agentic Loop. An agent (Ag) takes a goal, breaks it down, and uses tool calling (To) to invoke external APIs such as flight search, calendar, and payment systems. MCP (Mc) standardizes how those tool connections are made and discovered, so the agent is not hard-coded to any specific service. A framework (Fw) provides the plumbing. The initial instinct, Ag + To + Mc + Fw, covers the happy path. But two columns are left dangerously empty. Without Observability (Ob), you cannot monitor the loop or detect when it is spinning in circles, accumulating cost and making no progress. Without Red Teaming (Rt), an agent with tool access is a liability waiting to be exploited. The complete reaction: Ag + To + Mc + Fw + Ob + Rt.
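Detecting a loop that is "spinning in circles" can start very simply. The sketch below is one possible check under assumed inputs: each recorded step is a dictionary with `tool`, `args`, and `cost` keys, and the thresholds are arbitrary. `loop_alerts` is a hypothetical name, not part of any observability product.

```python
from collections import Counter


def loop_alerts(steps, max_repeats: int = 3, cost_budget: float = 1.0):
    """Flag identical repeated tool calls and total spend over budget."""
    alerts = []
    calls = Counter((s["tool"], s["args"]) for s in steps)
    for (tool, args), count in calls.items():
        if count >= max_repeats:
            alerts.append(f"repeated call: {tool}({args}) x{count}")
    total = sum(s["cost"] for s in steps)
    if total > cost_budget:
        alerts.append(f"cost budget exceeded: {total:.2f} > {cost_budget:.2f}")
    return alerts


# An agent that keeps issuing the same search without making progress.
stuck = [{"tool": "search", "args": "flights SEA-JFK", "cost": 0.4}] * 3
alerts = loop_alerts(stuck)
```

None of this is possible unless every step of the loop is recorded somewhere, which is the practical meaning of an empty Observability column.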

Multi-Agent Research Pipeline. A coordinating agent (Ag) assigns tasks to a network of specialized agents (Ma) — one searches, one synthesizes, one critiques. A2A (Aa) governs how those agents communicate with each other as peers, delegating and reporting through a standardized protocol rather than bespoke wiring. Each agent uses RAG (Rg) grounded in a knowledge base (Kb). Synthetic data (Sy) fills gaps in training. A framework (Fw) orchestrates the whole thing. This is the architecture that draws the most excitement in engineering right now — and the one where the gap between what teams build and what the table demands is widest. Without Interpretability (In) auditing the decisions made by each agent, and without Alignment (Al) ensuring the emergent behavior of the network serves its intended purpose, you have a powerful system you cannot explain and cannot fully control. The complete reaction: Ma + Ag + Aa + Rg + Kb + Sy + Fw + In + Al.
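The gap check applied in all three reactions is mechanical enough to write down. The sketch below uses only the column assignments stated in this article; elements from Part 1 whose columns are not restated here (Em, Vx, Rg, Pr, Lm, To) are deliberately left out of the mapping. `COLUMN` and `empty_columns` are hypothetical names.

```python
# Element symbol -> column, as stated in this article.
COLUMN = {
    "Ag": "Control", "Ma": "Control",
    "Kb": "Memory", "Sy": "Memory",
    "Fw": "Orchestration",
    "Ob": "Observability", "Tr": "Observability", "Ev": "Observability", "In": "Observability",
    "Gr": "Safety", "Sc": "Safety", "Rt": "Safety", "Al": "Safety",
    "Mc": "Protocols", "Aa": "Protocols",
}


def empty_columns(reaction: str, required=("Observability", "Safety")) -> list:
    """Return the required columns that no element in the reaction covers."""
    symbols = reaction.replace("→", "+").split("+")
    covered = {COLUMN.get(symbol.strip()) for symbol in symbols}
    return [column for column in required if column not in covered]


rag_initial = empty_columns("Em + Vx + Rg + Pr + Lm → Gr + Sc")
rag_complete = empty_columns("Em + Vx + Rg + Pr + Lm → Gr + Sc + Tr + Ev")
pipeline_initial = empty_columns("Ma + Ag + Aa + Rg + Kb + Sy + Fw")
```

Here `rag_initial` comes back as `["Observability"]` and `rag_complete` as an empty list, matching the walkthrough above. The check says nothing about whether an element is implemented well, only whether the column is populated at all.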


The Gaps — The Most Honest Part of the Table

A periodic table without gaps is not a periodic table. It is a catalog.

The gaps in this table are intentional, and they are the most intellectually interesting thing about it.

There is no element at Infrastructure/Orchestration. No orchestration primitive at the lowest level of the stack. That is not an oversight — it is an accurate reflection of the field. Orchestration, at its most primitive, requires at least two things to coordinate. You cannot orchestrate one thing. The gap is real.

There is no element at Protocols/Infrastructure either, and that gap is also deliberate. TCP/IP and HTTP exist at this level, but they are general-purpose networking infrastructure, not AI-specific protocol concerns. The table's scope starts above them.

There is no element at Protocols/Systems. Production AI systems often run on gRPC internally, as do many large-scale distributed systems at companies such as Google and Uber, but gRPC is general-purpose infrastructure borrowed from software engineering, not purpose-built for AI. The gap predicts something specific: as MCP matures and agent networks scale, JSON-RPC is likely to strain at high throughput, and higher-performance transports purpose-built for AI coordination may emerge. That is the shape of the missing element. It is not here yet.

Emerging/Architecture now has a candidate, Diffusion LLM (Df), but the gap is not fully closed. Diffusion LLMs are real and shipping, but not yet production-dominant, not yet proven across the full range of tasks autoregressive models handle, and their implications for agents, tool use, and long-context reasoning are still being worked out. We have named the element. We have not yet agreed on its atomic weight.

Protocols/Emerging has a candidate too — A2A (Aa) — with the same honest caveat. A proposal, not yet a ratified standard. The field is actively debating whether agent-to-agent communication will converge on one protocol or fragment. Two emerging elements named. Neither fully settled.

Emerging/Adaptation has no clear element. How will adaptation work when models are already highly capable and training data is increasingly synthetic? RLHF at scale has known limits. What replaces it or transcends it at the frontier?

Mendeleev left gaps. Then chemists filled them and found they were exactly the shape he predicted. These gaps in the AI table are predictions — not of specific technologies, but of the categories of problems that the field has not yet solved. If you are looking for a research direction, look at the gaps.


Your Challenge

Next time someone pitches you an AI product, a new architecture, a startup idea — map it to the table. Not as an exercise. As a discipline.

Which elements are they using? Which reactions are they running? Is the Safety column populated at all, or are they shipping a system with Filtering at Infrastructure and nothing above it? Are they calling something an agent that is actually a prompt with tool calling? Are they using a thinking model — expensive, slow, designed for hard reasoning problems — where a small model running fast would do the job better and cheaper?

Is the Observability column empty? Because a system you cannot observe is a system you do not control, no matter how good the model is.

That is the point of structure. Not to constrain. To see clearly.

Download the full AI Periodic Table v3 — Excel format


Further Reading

These are the sources and papers that informed the thinking in this article. Each one is worth the time.

On MCP and A2A

  • Model Context Protocol — Anthropic — The official specification and introduction. Read this before forming an opinion about what MCP is and is not.
  • Agent2Agent Protocol — Google — Google's proposal for standardized peer communication between autonomous agents. The Protocols/Emerging element that complements MCP at Protocols/Compositions.


Shaped in collaboration with Claude, an AI assistant by Anthropic, during rainy Pacific Northwest afternoons where engineering problems meet philosophical questions.
