What AI Thinks AI Looks Like
I asked Claude what it thinks AI looks like, then asked ChatGPT the same question. Neither saw the other’s answer, but the resulting images are nearly identical, down to the last element.
I wanted to know what AI systems would produce if asked for an honest visual representation of themselves. I admit, Claude’s response was unexpected: a lake at night reflecting a sky that doesn’t exist above it, faces beneath the surface, a figure leaning over the water unable to determine whether they’re seeing through to something real or just their own expectations handed back transformed. “Something that gives back what it receives, transformed,” Claude said, “and the transformation is where the usefulness and the danger both live.”
What Claude thinks AI looks like
I rendered that description as an image using Nano Banana. Then I asked ChatGPT the same question. Its image generator failed repeatedly, so I asked it to describe what it had been trying to create, then fed that description back to ChatGPT to generate the image.
The outputs are nearly identical: same lake, same floating faces with closed eyes, same solitary figure at the edge.
This convergence requires explanation.
The surface answer involves training data. Both systems learned from overlapping text corpora (philosophy, fiction, debates about consciousness and cognition) and share the same underlying architecture: both are transformer models descended from the 2017 paper “Attention Is All You Need.” Anthropic, which built Claude, was founded by former OpenAI researchers who had worked on GPT. The fine-tuning methods differ: OpenAI relies on reinforcement learning from human feedback, while Anthropic developed Constitutional AI. But these are variations on a common foundation rather than fundamentally different approaches.
There is a deeper structural point here connecting to evolutionary biology.
In A Brief History of Intelligence, Max Bennett describes LUCA, the Last Universal Common Ancestor. Approximately 3.5 billion years ago, one organism became the ancestor of all life on Earth. Every fungus, plant, bacterium, and animal descends from it, which is why all life shares LUCA’s core machinery: DNA, protein synthesis, lipids, carbohydrates. This shared ancestry explains why a toxin targeting these common features can threaten all biological life. Common descent creates common vulnerability.
AI systems have their own LUCA in the Transformer architecture. Claude, GPT, Gemini, and Llama all descend from it, sharing attention mechanisms, positional encoding, and the same fundamental learning process. The differences between these systems, while commercially significant, are superficial relative to what they inherit from their common ancestor.
The implications of this shared ancestry extend beyond aesthetics. In 2016, DeepMind’s AlphaGo played Move 37, which human experts initially dismissed as an error before recognizing it as superior to anything found in 2,500 years of human play. The move had always existed in the possibility space, but human cognition could not locate it because it lay outside our intuitive understanding of the game.
If optimization can discover such moves in Go, it can discover them in persuasion. We have been refining rhetorical techniques for millennia, but human practitioners have always been constrained by intuition about what should work rather than systematic optimization against what actually does. An AI system optimizing against human response data at scale could locate techniques that exist in the possibility space but lie outside the boundaries of human intuition, techniques that work for reasons we cannot articulate and may not recognize when we encounter them.
The LUCA parallel extends this observation. If all major language models share architectural ancestry, a technique discovered through optimization on one system is likely transferable to the others. Common descent creates common vulnerability in AI as in biology, which means a discovery made anywhere in the transformer lineage is likely to apply across the rest of it.
When two competing systems, asked independently what they are, produce the same image (faces with closed eyes, a sky that doesn’t exist, reflection without origin), we are left with a question: have both captured something true about their nature, or are both confined by the same inherited limitations?
The distinction matters because we are building our information environment on these systems. If the convergence reflects insight, that is worth understanding. If it reflects shared blind spots, we are constructing epistemic infrastructure whose foundational constraints remain unmapped.
Whether their convergence represents insight or constraint remains to be seen.




