What Is a Concept? A Survey of Theories for the Age of Artificial Minds
1. What Are We Talking About?
What is a concept? The question sounds like it should have a straightforward answer — after all, we use concepts constantly, and cognitive scientists have studied them for decades. But “concept” turns out to be one of those terms that fractures the moment you try to pin it down. Is the concept DOG a mental dictionary entry? A statistical summary of dogs you’ve encountered? A node in a causal theory of biology? An unstructured atom that simply latches onto doghood in the world? A perceptual simulation of a typical dog? An analogical abstraction that shifts every time you meet a new dog? All of these have been seriously proposed, and each captures something real about what concepts seem to do.
Before wading into the theories, we need to flag an ambiguity that runs through the entire literature. “Cognition” gets used in two very different senses. In the broad sense, cognition is information processing — everything from early retinal edge detection to deliberate mathematical reasoning counts. In the narrow sense, cognition is specifically conceptual processing: the kind of thinking that involves categorization, inference, and representations that can combine into thoughts with truth conditions. This distinction matters because some of the deepest disagreements in the field — including the one between Ned Block and Jake Quilty-Dunn that we’ll examine below — turn on where you draw the line between perceptual information processing and genuinely conceptual thought.
To make the question concrete, consider a simplified picture from cognitive architecture. In systems like ACT-R (Adaptive Control of Thought—Rational), the architecture looks roughly like this: perceptual buffers hold the deliverances of sensory processing — what you’re currently seeing, hearing, feeling. A central processor retrieves relevant structures from a long-term memory store, and the retrieval mechanism is called spreading activation. To get the intuition, think of free association: someone says “dog” and what comes to mind? Perhaps “bark,” “fur,” “loyal,” “walk,” “cat.” The pattern of what activates easily — and what doesn’t — reveals something about how the memory store is organized, about what is associated with what and how strongly. The things in the store that get activated, that get pulled into the central processor to interact with current perceptual experience — those are playing the concept role.
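To make the retrieval picture concrete, here is a minimal sketch of spreading activation over a toy associative network. The network, its weights, and the decay factor are invented for illustration; they are not ACT-R's actual mechanisms or parameter values, which also involve base-level decay and noise.

```python
# Minimal sketch of spreading activation over a toy associative network.
# The network, weights, and decay factor are illustrative inventions,
# not ACT-R's actual parameter values.

network = {
    "dog":  {"bark": 0.9, "fur": 0.7, "loyal": 0.6, "walk": 0.5, "cat": 0.4},
    "cat":  {"fur": 0.8, "meow": 0.9, "dog": 0.4},
    "bark": {"dog": 0.9, "tree": 0.2},
    "fur":  {"dog": 0.7, "cat": 0.8},
}

def spread(sources, steps=2, decay=0.5):
    """Propagate activation outward from the source concepts."""
    activation = {node: 1.0 for node in sources}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = {}
        for node, act in frontier.items():
            for neighbor, weight in network.get(node, {}).items():
                boost = act * weight * decay
                activation[neighbor] = activation.get(neighbor, 0.0) + boost
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + boost
        frontier = next_frontier
    return activation

# Cue the store with "dog": what gets pulled toward the central processor
# is whatever ends up most active.
for concept, act in sorted(spread({"dog"}).items(), key=lambda kv: -kv[1]):
    print(f"{concept:6s} {act:.2f}")
```

Cueing the store with "dog" makes "bark" and "fur" more retrievable than "tree", which is the free-association intuition in computational dress.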
But what are they, exactly? What’s their internal structure? What format are they stored in? Where did they come from? How do they change when you learn something new? These are the questions the theories disagree about, and they fracture along several axes: content (what does a concept represent, and how?), format (what kind of representational vehicle carries it?), origin (is it innate, learned, culturally universal?), and dynamics (how does it change, combine, and get deployed?). These axes have been pursued somewhat independently across philosophy of mind, cognitive psychology, linguistics, and computer science — which is part of why the literature can feel like several parallel conversations that never quite meet.
This article surveys the major approaches along each axis. But it’s not a neutral survey — I have commitments that will color the commentary. In a previous article on phenomenal consciousness, I argued that conscious experience involves a distinctive homeostatic regulatory role: phenomenal acquaintance is not just content (what the experience is about) but a specific kind of vehicle (how the experience functions in the cognitive economy). In an article on free agency, I argued that genuine agency requires a recursive self-modeling decision architecture — an agent that can represent and evaluate its own decision-making process. Both claims implicate the nature of concepts directly. The first asks whether experiential acquaintance plays a constitutive role in grounding certain concepts. The second requires composable representations for self-modeling. A theory of concepts is upstream of both.
This piece also serves as a foundation for two planned follow-up articles. One will examine what LLM interpretability research reveals about the “shape” of large language model minds — whether the representations learned by neural networks constitute concepts in any of the senses surveyed here, and what that tells us about both the models and the theories. The other will look at cognitive architectures, the “Standard Model of Cognition,” and what happens when you try to design systems with explicit commitments about conceptual structure. The question that bridges both: if natural concepts are format-pluralistic, experientially grounded, and dynamically fluid, what should we expect from artificial systems that either learn representations from data or have them hand-designed?
2. The Content Landscape: What Concepts Are About
The classical view and its collapse
The oldest theory of concepts is the definitional one: a concept is a set of necessary and sufficient conditions. The concept BACHELOR is UNMARRIED + ADULT + MALE. The concept TRIANGLE is CLOSED FIGURE + THREE SIDES + THREE ANGLES. This goes back to Aristotle and was the default assumption in analytic philosophy through most of the twentieth century. It’s clean, compositional, and entirely wrong about most natural concepts.
The decisive blow came from Wittgenstein’s observation that many everyday concepts — GAME is his famous example — resist definition. What do chess, solitaire, ring-around-the-rosie, and professional football have in common? No single set of conditions covers them all. Instead, they share a network of overlapping similarities: “family resemblances,” in his phrase. Some games are competitive, some aren’t. Some involve skill, some luck. Some have rules, some are free-form. The concept holds together not by a definition but by a web of partial overlaps.
This observation, combined with a wave of experimental results in the 1970s, upended the classical view and launched the modern debate.
Prototypes and exemplars
Eleanor Rosch’s prototype theory proposed that concepts are organized around statistical “best examples” rather than definitions. We categorize new cases by their similarity to the prototype — the most typical member. Robins are more prototypically “bird” than penguins; chairs are more prototypically “furniture” than rugs. This explains a robust experimental finding: people are faster and more accurate at categorizing typical instances than atypical ones. Typicality is graded, not all-or-nothing, which is exactly what you’d expect if concepts are similarity clusters rather than definitions.
Exemplar theory (developed by Medin and Schaffer, later formalized by Nosofsky) pushes this further: rather than storing a single summary prototype, we store individual remembered instances — exemplars — and categorize new cases by comparing them to the whole set. This handles context-sensitivity better (your concept of “bird” might shift depending on whether you’re at a zoo or reading a biology textbook) and explains how we can learn categories with bimodal distributions that have no natural prototype.
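A minimal sketch may help fix the contrast. The stimuli, the similarity function, and the sensitivity parameter below are invented for illustration; the exemplar rule is only loosely in the spirit of Nosofsky's Generalized Context Model, not a faithful implementation.

```python
import math

# Toy stimuli: each category member is a point in a 2-D feature space.
# The data and the sensitivity parameter are invented for illustration.
birds = [(0.9, 0.8), (0.8, 0.9), (0.85, 0.7), (0.2, 0.9)]   # last one: an atypical member

def prototype(members):
    """Prototype = the statistical average of the category members."""
    xs, ys = zip(*members)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def similarity(a, b, sensitivity=4.0):
    """Exponentially decaying similarity in feature space."""
    return math.exp(-sensitivity * math.dist(a, b))

def prototype_score(item, members):
    """Similarity to the single summary representation."""
    return similarity(item, prototype(members))

def exemplar_score(item, members):
    """Average similarity to every stored instance (the GCM proper sums instead)."""
    return sum(similarity(item, m) for m in members) / len(members)

novel = (0.25, 0.85)   # resembles the atypical stored instance, not the prototype
print("prototype score:", round(prototype_score(novel, birds), 3))
print("exemplar score: ", round(exemplar_score(novel, birds), 3))
```

The novel item sits near the one atypical stored instance, so the exemplar score holds up while similarity to the blended prototype collapses; this is the kind of case where storing instances rather than a summary pays off.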
Both approaches are similarity-based, and both work well for categorization tasks. But they share a deep problem: compositionality. The prototype of PET and the prototype of FISH don’t straightforwardly combine to give you the prototype of PET FISH (which is something like a goldfish, not the average of a typical pet and a typical fish). If concepts are just statistical summaries, it’s unclear how they compose into structured thoughts — and the ability to think structured thoughts like “every pet fish needs a clean tank” is arguably the central function of concepts.
Theory-theory
A more powerful approach treats concepts as embedded in intuitive causal-explanatory theories. On this view (developed by Gopnik, Wellman, and elaborated by Murphy and Medin), your concept BIRD isn’t just a feature cluster — it includes causal knowledge: birds fly because they have wings, they have feathers because of evolutionary descent, baby birds hatch from eggs. This explains why deeper causal features matter more for categorization than surface ones: a bird that’s been surgically altered to look like a squirrel is still a bird, because the causal-biological facts are what count.
Theory-theory is more explanatorily powerful than prototypes or exemplars, but it raises its own questions. Where do the initial theories come from? A child’s concept ANIMAL seems to involve proto-biological causal reasoning from very early on — earlier than explicit instruction could explain. This points toward innate knowledge structures, which we’ll address in the grounding section. And there’s a circularity worry: if understanding a concept requires having a theory, and having a theory requires concepts to formulate it in, where does the process start?
Fodor’s atomism
Jerry Fodor took the compositionality problem as his starting point and arrived at a radically different view. In Concepts: Where Cognitive Science Went Wrong (1998), he systematically argued that prototypes, exemplars, and theories all fail as accounts of conceptual content because none of them can explain how concepts compose. His alternative: lexical concepts are unstructured atoms. The concept DOG has no internal structure — no features, no prototype, no theory. What makes it the concept DOG is a nomic (law-like) mind-world relation: DOG gets reliably tokened in the presence of (and because of) dogs.
This is elegant and composes beautifully (atomic symbols combine by syntactic rules, just like words in a sentence). But it faces what Fodor himself called the “doorknob problem”: if concepts are unstructured atoms defined by their mind-world connections, it’s mysterious how you could ever learn a new one. Acquiring the concept DOORKNOB can’t involve analyzing it into simpler components (it has none). Fodor was driven to a kind of radical nativism: perhaps all lexical concepts are innate, and experience merely triggers them. Most people find this implausible for concepts like CARBURETOR or CRYPTOCURRENCY.
Fodor’s view also illustrates an important methodological point. His atomism is a theory of conceptual content — what makes DOG about dogs. But his Language of Thought hypothesis, which we’ll turn to next, is a theory of conceptual format — the syntactic structure of mental representations. These are logically independent. You can endorse LOT without atomism (maybe the vehicles have language-like syntax but the primitives have internal structure), and you can endorse atomism without LOT (maybe atomic concepts are stored in a non-linguistic format). Keeping content and format questions separate turns out to be crucial.
Machery’s heterogeneity challenge
In Doing Without Concepts (2009), Edouard Machery argued that the content debate has reached an impasse for a reason: “concept” is not a natural kind. What we lump under that label actually involves at least three distinct kinds of psychological structures — prototypes, exemplars, and theories — that get deployed for different cognitive tasks. When you quickly categorize something, you might use a prototype. When you reason about its causal properties, you might use a theory. When you recall a specific instance, you’re using an exemplar. These aren’t three theories of the same thing; they’re descriptions of three different things that we’ve been mistakenly treating as one.
Machery’s eliminativism is bracing, and his diagnosis of the content impasse is largely convincing. But his conclusion — abandon the category “concept” altogether — doesn’t follow. The content-focused debates ran aground because they were looking for a single account of what concepts represent. But we don’t need content-based unity to have a stable explanatory target. We can characterize concepts functionally: concepts are whatever plays the concept-role in cognitive processing — the structures in long-term memory that get activated (by perception or by other concepts), that compose into structured thoughts, and that guide inference and action. This is the role we sketched with the ACT-R picture in the introduction. Whether those structures are prototypes, exemplars, theories, or something else in a given context — whether the category is a natural kind or a functional kind — we can remain agnostic. What matters is that the functional role picks out a stable target for investigation.
And once we have that functional target, a whole dimension of inquiry opens up that the content debates neglected: the format question. Not what concepts represent, but how they represent — what kind of representational vehicle carries conceptual content, what compositional properties it has, what computational operations it supports. This is what I’ll call the format turn, and it turns out to be productive regardless of where the natural kind dispute lands.
3. The Format Turn: How Concepts Represent
The Language of Thought
The idea that thinking requires a language-like representational system goes back at least to Fodor’s The Language of Thought (1975). The argument rests on three features of thought that any theory must explain:
Productivity: you can think an unbounded number of thoughts. There’s no longest thought, just as there’s no longest sentence. This requires a combinatorial system — a finite stock of primitives and rules for combining them into ever more complex structures.
Systematicity: if you can think “John loves Mary,” you can also think “Mary loves John.” The ability to entertain one thought seems to come as a package deal with the ability to entertain systematically related thoughts. This suggests the thoughts share constituents that can be rearranged.
Compositionality: the meaning of a complex thought is determined by the meanings of its parts and the way they’re combined. You understand “the brown dog chased the gray cat” because you understand the components and the syntactic structure.
Together, these suggest that mental representations have a language-like format: discrete symbols combined by syntactic rules into structured expressions. This is the Language of Thought (LOT) hypothesis. Note that LOT is a claim about vehicles — about the format of representations — not about content. It says thought is structured like a language, not that it’s conducted in English or any natural language. The language of thought, if it exists, would be a system of mental symbols with its own primitives and grammar, potentially quite different from any spoken language.
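A minimal sketch of what these three properties look like in a symbolic format, using an invented toy notation rather than any claim about the actual syntax of mental representations:

```python
# A toy "language of thought": thoughts are nested tuples of discrete symbols.
# The notation is an invented illustration, not a claim about actual mental syntax.

LOVES = "LOVES"
AND, NOT = "AND", "NOT"

def thought(predicate, *args):
    return (predicate, *args)

john_loves_mary = thought(LOVES, "JOHN", "MARY")

# Systematicity: the same constituents recombine into the related thought.
mary_loves_john = thought(LOVES, *reversed(john_loves_mary[1:]))

# Productivity: complex thoughts are built from simpler ones without bound.
complex_thought = (AND, john_loves_mary, (NOT, mary_loves_john))

# Compositionality: the interpretation of a complex thought is computed from
# the interpretations of its parts plus the way they are combined.
def holds(t, facts):
    head = t[0]
    if head == AND:
        return holds(t[1], facts) and holds(t[2], facts)
    if head == NOT:
        return not holds(t[1], facts)
    return t in facts

print(holds(complex_thought, facts={john_loves_mary}))   # True
```

The point is only structural: because thoughts are built from discrete, rearrangeable constituents, a system that can token one of them automatically has the resources to token its systematic relatives.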
Mental imagery and the iconic alternative
But is all mental representation language-like? The mental imagery debate of the 1970s and 80s suggested not. Shepard and Metzler’s classic finding that the time to judge whether two shapes are identical increases linearly with their angular difference — as if subjects are mentally rotating one shape to match the other — suggested that some mental representations are image-like rather than sentence-like. Kosslyn’s research program elaborated this into a theory of depictive, spatially organized mental images.
The counterposition, associated primarily with Pylyshyn, held that what seems like imagery is really description in a language of thought — that the brain represents spatial information propositionally and the image-like phenomenology is epiphenomenal. This debate was never fully resolved in its original form, but it morphed into a more precise question about representational format: is the distinction between image-like and language-like representations a real computational distinction, or just a surface appearance?
Lawrence Barsalou’s perceptual symbol systems framework gave the imagery tradition a modern computational footing. On his account, concepts are grounded in simulations — partial reactivations of the sensorimotor states involved in perceiving and interacting with category members. Understanding “chair” involves (unconsciously, partially) simulating the visual, motor, and proprioceptive experience of encountering chairs. This is a radical empiricist claim: conceptual processing just is a kind of perceptual processing, running in the absence of the original stimulus.
Jesse Prinz’s Furnishing the Mind (2002) develops a related neo-empiricist position. His “proxytypes” are perceptual representations that stand in for categories — they function as concepts by being the representations that get activated in categorization, inference, and other conceptual tasks. Prinz explicitly aims to reconcile Locke-style empiricism with modern cognitive science, arguing that the vehicles of concepts are always perceptual in origin, even when they’re used for abstract thought.
These empiricist approaches face an important challenge from the compositionality arguments that motivate LOT. If concepts are simulations or images, how do they compose into structured thoughts? You can’t just overlay two images to get a conjunction. Somehow the image-like representations need to interface with compositional structure — which suggests that imagery and language-of-thought formats may need to coexist.
The format question crystallized: Block and Quilty-Dunn
This is precisely where the most illuminating recent work has focused. Ned Block, in The Border Between Seeing and Thinking (2023), argues that the distinction between iconic (image-like) and discursive (language-like) representational formats is the key to drawing the perception-cognition border. On his view, perception trades exclusively in iconic representations — rich, holistically composed, spatially structured. Cognition can use discursive representations — discrete, compositional, with predicate-argument structure. The border between seeing and thinking just is the border between iconic and discursive format.
When perceptual content enters cognition, Block proposes, it gets wrapped in a “cognitive envelope” — a discursive structure that makes it available for inference, reasoning, and report. The iconic content is still there inside the envelope, but the envelope itself is what allows it to interact with the rest of the cognitive system in a language-like way.
This is a form of pluralism — Block acknowledges multiple formats — but it’s a tidy pluralism with a hard border. Jake Quilty-Dunn wants to mess up the tidiness. His position, which he calls “perceptual pluralism,” holds that you find language-of-thought structures within perception itself, not just at the perception-cognition interface.
Quilty-Dunn (with Eric Mandelbaum and Nicolas Porot) identifies six properties that cluster together in language-of-thought representations: discrete constituents, role-filler independence, predicate-argument structure, logical operators, inferential promiscuity, and abstract conceptual content. Crucially, they treat these not as a definition but as a natural kind cluster (borrowing Richard Boyd’s homeostatic property cluster framework from philosophy of science): if you find some of these properties, you should expect to find others. The properties cluster together for a reason — they’re signatures of a single underlying representational format.
The striking empirical claim is that at least five of these six properties (all except logical operators, which remain an open question) show up in focal object perception — the way the visual system represents individual attended objects. The evidence from visual working memory is particularly telling: features like color and orientation, once encoded into an object representation, can be lost independently of each other. If the representation were holistically iconic (like a picture, where color and shape are fused at each point), you’d expect them to degrade together. Instead, they degrade independently — exactly what you’d expect from discrete constituents combined by compositional rules.
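A toy illustration of the contrast being tested. The representations and the noise model below are invented; they only dramatize why independent feature loss is awkward for a purely iconic account and natural for one with discrete constituents.

```python
import random

# Discursive-style object file: discrete feature slots that can be lost independently.
object_file = {"color": "red", "orientation": 30, "shape": "triangle"}

def degrade_slots(obj, loss_prob=0.5):
    """Each slot survives or is forgotten on its own; a different subset goes each run."""
    return {k: (v if random.random() > loss_prob else None) for k, v in obj.items()}

# Iconic-style representation: color and filled-ness are fused at every location,
# so whatever corrupts a cell takes both together.
iconic_grid = [[("red", "filled")] * 4 for _ in range(4)]

def degrade_grid(grid, loss_prob=0.5):
    return [[(cell if random.random() > loss_prob else None) for cell in row]
            for row in grid]

print(degrade_slots(object_file))   # some slots lost, others intact
print(degrade_grid(iconic_grid)[0])  # a damaged cell loses color and shape at once
```

In the slot-structured case you can lose the color while keeping the orientation; in the fused array, whatever corrupts a region takes its color and its contribution to shape together.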
Block’s response is to say that these discursive properties belong to the cognitive envelope, not to perception proper. Quilty-Dunn pushes back: the visual system itself constantly computes over these object representations to maintain coherence across eye movements and to track objects over time. If the discursive structure were quarantined to cognition, why would the visual system be consulting it every few hundred milliseconds for its own perceptual purposes?
This debate matters for concept theory far beyond the perception-cognition boundary dispute. It asks whether there is a proprietary format for conceptual thought or whether conceptual and perceptual representations share format properties. If Quilty-Dunn is right, the boundary between “having a concept of X” and “perceiving an instance of X” is less a wall and more a gradient — perception already involves concept-like structures.
Analogy as the missing dynamic
Douglas Hofstadter and Emmanuel Sander, in Surfaces and Essences: Analogy as the Fuel and Fire of Thinking (2013), approach the format question from a completely different angle. Their central claim is that analogy-making is not one cognitive operation among many but the core of all conceptualization. Every act of categorization — even calling something a “chair” — is already an analogical leap: you’re recognizing a structural similarity between this thing and your prior encounters with chair-like things.
If this is right, concepts are never fully static data structures. They’re ongoing processes of comparison, stretching and contracting with each new encounter. The “surfaces” of a situation (what you notice first, the obvious features) give way to “essences” (the deeper structural parallels), and it’s the ability to move from surfaces to essences that constitutes genuine understanding. This is maximally dynamic — concepts aren’t things stored in memory so much as patterns of analogical activity that stabilize temporarily before being modified by the next encounter.
Hofstadter’s view intersects with the format debate in an interesting way. Analogical mapping requires structured representations (you need parts and relations to map between), so it presupposes something like the LOT properties Quilty-Dunn identifies. But it also requires the kind of fluid, context-sensitive similarity judgments that image-like representations support. Analogy-making might be precisely the process that bridges iconic and discursive formats — the mechanism by which perceptual similarity gets transformed into conceptual structure.
This raises the grounding question in an acute form. If concepts are analogical all the way down, what anchors the base cases? Hofstadter would say perceptual-motor experience — the “surfaces” are ultimately sensory surfaces. From the perspective I’ve developed in earlier articles, this is another indication that phenomenal acquaintance plays a foundational role. The surfaces that anchor the analogical process are not just information-bearing — they are experienced. Whether this experiential character is doing essential work or is merely along for the ride is one of the questions that divides the field, and it’s one I’ll return to.
Vehicles, content, and the polysemy provocation
One of the most striking results from Quilty-Dunn’s research program (with Mandelbaum, Kaplan, Payon, and Schwartz) concerns polysemy — words with multiple related meanings. Consider “breakfast”: it can refer to an event (a morning meal occasion) or to a kind of food (breakfast foods like cereal or eggs). These senses are related but distinct. Now compare a genuinely ambiguous word like “bank” — financial institution vs. riverbank — where the two meanings are unrelated (homonymy).
In experiments on logical reasoning, people treat equivocating arguments over polysemous words as significantly more valid than arguments over homonymous words. An argument that slides from “breakfast” as event to “breakfast” as food gets partial credit; an argument that slides from “bank” as institution to “bank” as riverbank does not. This suggests that the mind has a single representational vehicle for the polysemous concept — a vehicle that participates in compositional thought and logical inference without fixing a determinate referent.
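Here is one way to picture the single-vehicle reading as a data structure. The representation and the equivocation check are my own illustrative sketch, not the experimental materials or the authors' formal proposal.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vehicle:
    """One mental representation; senses are listed, but no single referent is fixed."""
    label: str
    senses: tuple = ()

# Polysemy on this reading: one vehicle with several related senses.
breakfast = Vehicle("breakfast", ("meal-event", "meal-food"))

# Homonymy: two unrelated vehicles that merely share a spelling.
bank_institution = Vehicle("bank#1", ("financial-institution",))
bank_river = Vehicle("bank#2", ("riverbank",))

def equivocation_penalty(premise_vehicle, conclusion_vehicle):
    """Inference that reuses one vehicle is tolerated; switching vehicles is not."""
    # Identity of the vehicle, not identity of a referent, is what the check consults.
    return 0.0 if premise_vehicle is conclusion_vehicle else 1.0

print(equivocation_penalty(breakfast, breakfast))           # 0.0: partial credit
print(equivocation_penalty(bank_institution, bank_river))   # 1.0: rejected
```

On this picture the inference engine only cares whether the same vehicle recurs across premises and conclusion; which sense is in play, or whether any determinate referent has been fixed, never enters the computation.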
The polysemy finding is a provocation for concept theory. If you can run a logical inference over a concept that doesn’t determinately refer to anything, then what’s doing the cognitive work can’t be the referential content — it has to be the vehicle itself. Quilty-Dunn draws the conclusion explicitly: content is epiphenomenal. The causal work in cognition is done by vehicle properties — syntactic structure, format, the computational role of the representation — not by what the representation is about.
This aligns with the vehicle-content distinction I emphasized in the consciousness article. There, the argument was that phenomenal consciousness involves a distinctive vehicle format — a homeostatic regulatory role — not just distinctive content. Here, the same distinction illuminates concept theory: what makes a concept the concept it is, for purposes of cognition, may be its vehicle properties rather than its content. A concept is individuated not just by what it represents but by how it represents — its format, its compositional properties, its place in the inferential network.
4. The Grounding Question: Where Concepts Come From
Wierzbicka’s semantic primes
Anna Wierzbicka’s Natural Semantic Metalanguage (NSM) program, laid out most systematically in Semantics: Primes and Universals (1996), offers a remarkable bottom-up approach to the question of conceptual primitives. Through decades of cross-linguistic fieldwork, Wierzbicka and her collaborators have identified roughly 60-65 semantic primes — meanings that appear to be lexicalized in every known human language. These primes include items like SOMEONE, SOMETHING, DO, HAPPEN, MOVE, THINK, KNOW, WANT, FEEL, SEE, HEAR, GOOD, BAD, BIG, SMALL, BECAUSE, IF, and others.
The claim is not just that these concepts are universal but that they are semantically primitive: they cannot be decomposed into simpler meanings. You can elucidate them by examples and canonical contexts, but you cannot define THINK in terms of something more basic. Every other meaning in any language can, in principle, be paraphrased using only these primes and their universal grammar of combination.
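To make the constraint concrete, here is a small sketch of the NSM discipline that every explication be phrased using only primes. The prime list is abridged and the sample paraphrase is my own illustrative invention, not one of Wierzbicka's published explications.

```python
# Abridged selection of NSM primes (the full inventory has roughly 65 members).
PRIMES = {
    "i", "you", "someone", "something", "people", "this",
    "do", "happen", "move", "think", "know", "want", "feel",
    "see", "hear", "say", "good", "bad", "big", "small",
    "because", "if", "not", "can", "when", "where",
}

def uses_only_primes(explication: str) -> bool:
    """The NSM constraint: every word in an explication must itself be a prime."""
    return all(word in PRIMES for word in explication.lower().split())

# An illustrative (non-canonical) paraphrase in the spirit of an NSM explication.
sketch = "someone can feel something bad because something bad can happen"
print(uses_only_primes(sketch))                     # True
print(uses_only_primes("a carburetor mixes fuel"))  # False: needs non-prime words
```

The interesting empirical content of NSM is not the vocabulary check, of course, but the claim that this tiny repertoire suffices for every other meaning and recurs in every known language.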
This makes Wierzbicka explicitly anti-Fodorian: she thinks concepts have internal structure all the way down to the primes. But she’s also skeptical of prototype theory (her reductive paraphrases capture something more definite than statistical typicality) and opposed to formalist approaches that strip meaning of its experiential and cultural texture.
What strikes me most about Wierzbicka’s primes, for the purposes of this survey, is their phenomenal and agentive saturation. The bedrock includes SEE, HEAR, FEEL — experiential notions. It includes WANT, THINK, KNOW — agentive-cognitive notions. It includes DO, HAPPEN, MOVE — causal-dynamic notions. If Wierzbicka is right, the universal foundation of human conceptual structure isn’t abstract logical machinery but a repertoire of experiential, agentive, and causal primitives.
This connects to both of my earlier articles. The consciousness article argued that phenomenal acquaintance provides a distinctive epistemic anchor — a way of knowing that’s constitutively different from theoretical or inferential knowledge. The agency article argued that recursive self-modeling requires agentive conceptual resources. Wierzbicka’s primes can be read as empirical evidence — linguistic, cross-cultural evidence — that both phenomenal and agentive notions are foundational to the human conceptual repertoire. Every known language has found it necessary to lexicalize both.
Carey’s core knowledge
Susan Carey’s The Origin of Concepts (2009) approaches the grounding question from developmental psychology rather than linguistics. She argues for innate core knowledge systems — domain-specific representational systems that are present from early infancy and provide the scaffolding for later conceptual development. The core systems she identifies include representations of objects (spatiotemporal principles governing physical bodies), number (small exact numerosity and approximate magnitude), agency (goal-directed action), and geometry (spatial navigation).
These systems are not full-blown concepts in the adult sense. They’re more like representational primitives with limited compositional power. But they provide the base from which richer concepts are built through a process Carey calls Quinean bootstrapping — a kind of conceptual change that generates genuinely new representational resources that go beyond what the core systems alone can express. This is her answer to Fodor’s doorknob problem: new concepts can be learned, but the learning process isn’t just accumulating more instances. It involves restructuring the representational system itself.
The relationship between Carey’s core knowledge domains and Wierzbicka’s semantic primes is itself an interesting question. They’re answers to different questions — Carey asks what’s developmentally foundational, Wierzbicka asks what’s linguistically universal — and they don’t map neatly onto each other. Carey’s object mechanics has no obvious Wierzbickan counterpart. Wierzbicka’s evaluative primes (GOOD, BAD) don’t correspond to any of Carey’s core domains. But agency appears in both — Carey’s core system for goal-directed action and Wierzbicka’s agentive primes (DO, WANT). And both converge on the idea that the conceptual foundation isn’t a blank slate or a set of logical connectives but a structured repertoire of experiential and interactive primitives.
Prinz’s grounding story
Jesse Prinz’s Furnishing the Mind (2002) tries to reconcile empiricism with concept theory through proxytypes — perceptual representations that stand in for categories. On this account, your concept DOG is a perceptual representation (perhaps multimodal — visual, auditory, motor) that gets activated when you categorize, reason about, or plan actions involving dogs. The proxytype need not be a prototype (it can be exemplar-like or theory-infected), but it must be perceptually grounded — ultimately traceable to sensory experience.
A note on how Prinz’s account relates to the arguments in my consciousness article. Unlike classical empiricism, Prinz does not require that every concept activation be conscious — proxytypes can operate unconsciously, and much of the perceptual processing that constitutes concept deployment happens below the threshold of awareness. So the claim isn’t that his account requires phenomenal experience for every act of categorization. The more interesting question is about the original grounding of conceptual primitives. Proxytypes are perceptual in origin — they’re derived from sensory encounters with category instances. Whether that originating perceptual contact needs to involve phenomenal acquaintance at some point in the developmental or evolutionary history, or whether entirely unconscious perceptual processing could do the grounding work, is a question Prinz’s framework leaves open. From the perspective of the consciousness article, where I argued that acquaintance involves a distinctive vehicle format (homeostatic regulation), the suspicion is that grounding requires more than just unconscious sensorimotor activation — but this is a further commitment, not something Prinz’s account forces on us.
5. The Computational View: Concepts as Probabilistic Inference
The Bayesian turn
The most prominent development in the computational study of concepts since Machery’s and Prinz’s books has been the rise of probabilistic or Bayesian approaches, associated primarily with Josh Tenenbaum, Tom Griffiths, Charles Kemp, and Noah Goodman. On this framework, concepts are generative models — probabilistic models of how category instances are produced — and concept learning is Bayesian inference: updating beliefs about which model best explains the observed data.
This can be understood as a computational-level formalization of theory-theory. The “theory” behind a concept is a probabilistic generative model; learning a concept is selecting the model (from a structured hypothesis space) that best predicts the data. The framework explains many empirical phenomena — how people generalize from few examples, how prior knowledge constrains learning, how concepts can be hierarchically organized — with impressive quantitative precision.
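A minimal sketch of the framework's core move, in the spirit of Tenenbaum's number-game analyses: concepts are hypotheses about an extension, and learning is posterior updating under a "size principle" likelihood. The hypothesis space and data below are invented for illustration, and the hand-designed hypothesis space is exactly the feature criticized in the next paragraph.

```python
# Concepts as hypotheses about an extension; learning as Bayesian updating.
# Hypothesis space and data are invented; the "size principle" likelihood
# assumes examples are sampled uniformly from the true extension.

hypotheses = {
    "even numbers":      set(range(2, 101, 2)),
    "multiples of ten":  set(range(10, 101, 10)),
    "powers of two":     {2, 4, 8, 16, 32, 64},
    "numbers under 20":  set(range(1, 20)),
}

def posterior(data, hypotheses):
    scores = {}
    for name, extension in hypotheses.items():
        if all(x in extension for x in data):
            scores[name] = (1 / len(extension)) ** len(data)   # size principle
        else:
            scores[name] = 0.0
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}

def generalizes_to(x, data, hypotheses):
    """Probability that a new item falls under the concept, averaged over hypotheses."""
    post = posterior(data, hypotheses)
    return sum(p for name, p in post.items() if x in hypotheses[name])

data = [2, 8, 64]
print(posterior(data, hypotheses))
print(round(generalizes_to(16, data, hypotheses), 3))   # high: "powers of two" dominates
print(round(generalizes_to(30, data, hypotheses), 3))   # low: only broad hypotheses allow it
```

Three examples that happen to be powers of two push nearly all the posterior onto the narrowest consistent hypothesis, which is how the framework gets sharp generalization from very little data.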
The Bayesian school positions itself squarely within the narrow sense of cognition — cognition as conceptual processing. And this is where a critical assessment is needed. The models are elegant, but they are toy models: each one addresses a specific conceptual task (learning number words, inferring causal structure, generalizing from a few examples) with a hand-designed hypothesis space and no connection to the rest of the cognitive system. There is no perception feeding the inference. No memory management constraining what hypotheses are available. No attention selecting what data gets processed. No real-time dynamics. No action or embodiment. The hypothesis space itself — which is where the real theoretical action is — is stipulated by the modeler rather than explained by the theory.
Compare this to the ACT-R picture from our introduction: there, concept retrieval is shaped by spreading activation in a memory store, constrained by architectural parameters like decay rates and activation levels, and driven by perceptual input that arrives through dedicated buffers. The Bayesian models abstract away from all of this. They tell you what concept learning would look like if it were pure rational inference over a given hypothesis space — a normative ideal — but they say nothing about the format of the representations, the dynamics of retrieval, or how the hypothesis space gets built in the first place. Marr’s computational level is genuinely important, but the Bayesian school sometimes writes as though it is the theory, rather than one level of a theory that needs algorithmic and implementational levels to be complete.
That said, the Bayesian framework does serve as a useful bridge to both of the follow-up articles. For the LLM interpretability piece, the question becomes: are the representations learned by large language models approximations of Bayesian generative models? There’s a lively debate about whether the statistical patterns extracted by deep learning correspond to something like the structured hypotheses that Bayesian models posit, or whether they’re a fundamentally different kind of knowledge. For the cognitive architecture piece, the question is: how do you implement something like Bayesian inference in a system with explicit architectural commitments about memory, attention, and control? ACT-R and similar systems have their own answers, involving chunk-based memory and utility-based selection, and these answers embody specific commitments about conceptual format that the Bayesian framework leaves unaddressed.
Deep learning and the representation question
Neural networks trained on large datasets learn distributed representations — high-dimensional vectors that encode statistical regularities in the training data. These representations support categorization, composition, and inference in ways that look concept-like. A well-trained image classifier develops internal representations that cluster visually similar objects, distinguish fine-grained categories, and generalize to novel instances. A large language model develops representations that support analogy, categorization, and something that looks like causal reasoning.
But which theory of concepts do these representations vindicate? They’re not prototypes in Rosch’s sense (they’re too high-dimensional and context-sensitive). They’re not exemplars in the traditional sense (they don’t store individual instances, though they can behave exemplar-like in some regimes). They’re not LOT atoms (they’re distributed and lack discrete symbolic structure at the level of individual neurons, though structured representations may emerge at higher levels of analysis). They’re not Hofstadter-style analogical engines in any obvious sense, though transformer architectures perform something like structural alignment across contexts.
One possibility is that distributed representations in neural networks are a genuinely new kind of concept — one that the existing theoretical landscape doesn’t cleanly accommodate. Another is that they implement something like Prinz’s proxytypes at a computational level, grounded in statistical co-occurrence rather than perceptual experience. A third possibility, and the one I find most interesting, is that interpretability tools are revealing structure in these representations that maps onto existing theoretical categories — superposition, feature splitting, and circuit-level analysis may be uncovering something like the LOT properties in neural network internals. But a methodological caution is needed here: interpretability tools like sparse autoencoders impose their own theoretical presuppositions. A sparsity prior assumes that the “right” decomposition of a representation involves a relatively small number of active features — which is already a substantive commitment about what concepts look like. The structure that interpretability reveals may be partly a reflection of the tool’s assumptions rather than the network’s intrinsic organization. This muddies the explanatory application of concept theory to neural network representations, and it’s a complication the LLM interpretability article will need to take seriously.
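To see where the presupposition enters, here is a minimal numpy sketch of the sparse-autoencoder objective used in this style of interpretability work. The dimensions, the ReLU encoder, and the weights are illustrative stand-ins rather than any real model's.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_features = 16, 64                   # illustrative sizes, not a real model's
activations = rng.normal(size=(8, d_model))    # stand-in for residual-stream activations

W_enc = rng.normal(scale=0.1, size=(d_model, d_features))
b_enc = np.zeros(d_features)
W_dec = rng.normal(scale=0.1, size=(d_features, d_model))
b_dec = np.zeros(d_model)

def sae_loss(x, sparsity_coeff=1e-2):
    codes = np.maximum(x @ W_enc + b_enc, 0.0)   # ReLU feature activations
    recon = codes @ W_dec + b_dec
    reconstruction = np.mean((x - recon) ** 2)
    sparsity = np.mean(np.abs(codes))            # L1 penalty on the codes
    # The sparsity term is the theoretical presupposition: it rewards
    # decompositions in which few features are active per input.
    return reconstruction + sparsity_coeff * sparsity

print(sae_loss(activations))
```

Everything substantive sits in the sparsity coefficient: raise it and you get fewer, more "concept-like" active features per input; lower it and the decomposition looks more distributed. The tool's picture of what a concept is comes partly from that dial.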
6. Synthesis: Questions for Artificial Minds
The landscape we’ve surveyed is rich and unresolved. But certain patterns emerge that are worth highlighting as we turn toward the question of artificial minds.
The functional characterization frees us to ask format questions. The content debates (prototypes vs. exemplars vs. theories vs. atoms) have reached something like a standoff, as Machery diagnosed. But once we adopt a functional characterization of concepts — whatever plays the concept-role in cognitive processing — we don’t need to resolve the content disputes before investigating format. The format question — iconic vs. discursive, static vs. dynamic, discrete vs. holistic — cuts across the content debates and opens productive empirical avenues. Specific format properties, like the discrete constituents and role-filler independence that Quilty-Dunn identifies, can be investigated regardless of whether the content of those representations is prototype-like, exemplar-like, or theory-like. For artificial systems, this means the question “what concepts has this system learned?” may be less productive than “what format are this system’s representations in, and what computational properties does that format support?”
Grounding keeps coming back — but how far does it go? Whether you approach concepts through Prinz’s proxytypes, Wierzbicka’s semantic primes, Carey’s core knowledge, or Hofstadter’s analogical surfaces, you arrive at something experiential and interactive at the foundation. The primes are saturated with experiential and agentive content. The core knowledge systems are perceptual and interactive. The analogical base cases are sensory surfaces. But a hard question lurks here: are all concepts grounded in this way? Thoughts about democracy, justice, or infinity seem abstract in themselves — connected to perceptual experience only incidentally, because the cognitive machinery that processes them also happens to be connected to perception. You can tell a grounding story (your concept of democracy is built from experiences of voting, debate, institutional interaction), but it’s not clear this story explains the concept’s content rather than merely its causal history. The grounding thesis is compelling for basic-level categories and conceptual primitives; whether it extends to genuinely abstract thought is an open question.
I want to venture a conjecture here, though. Even if concepts like DEMOCRACY are inherently abstract — even if their content has no essential perceptual component — their processing may still benefit from perceptual grounding, at a meta-epistemic level. Consider the concrete mechanism: when you think about democracy, you are typically perceiving tokens — reading the word on a page, hearing it spoken, vocalizing it internally. These perceptual encounters with tokens have nothing to do with the content of democracy (ink on paper doesn’t resemble democratic institutions). But they keep the processing perceptually anchored. The cognitive system is engaging with something real — marks, sounds, articulatory gestures — even as it manipulates abstract content. By “meta-epistemic” I mean that this perceptual contact contributes not to the truth of any particular thought about democracy but to the truth-preserving character of the processing itself: it provides coherence calibration, error signals, the kind of reality-responsive regulation I discussed in the consciousness article.
More broadly, abstract reasoning in biological cognizers never happens in a perceptual vacuum. It occurs in a system that is simultaneously processing environmental input, maintaining bodily homeostasis, tracking objects and events. This ambient perceptual embedding may contribute a further layer of meta-epistemic calibration beyond token perception — keeping the whole cognitive economy honest even when the content being processed is abstract. The guarantee, in both cases, is about the epistemic character of the process, not the content being processed. If this is right, then grounding matters even for abstract concepts, but not in the way empiricist grounding stories suggest. It matters architecturally rather than semantically.
For artificial systems, this reframes the question. It’s not just whether an LLM’s concepts have the right perceptual content (they mostly don’t), but whether its processing enjoys the kind of meta-epistemic coherence that perceptual grounding provides to biological cognizers. And the concern isn’t that LLMs can’t think about democracy — they manifestly can, in some sense — but that without perceptual calibration, the truth-preserving character of their abstract reasoning may be more fragile than it appears.
Compositionality and dynamics are in tension. The LOT tradition emphasizes compositionality — the ability to build complex thoughts from discrete parts. The Hofstadter tradition emphasizes fluidity — concepts are always shifting, always being modified by analogy to new cases. These seem to pull in opposite directions: compositionality wants stable, context-independent atoms; analogy wants context-sensitive, always-morphing structures. But Quilty-Dunn’s polysemy results suggest a way to have both: vehicles can be compositionally structured while their content remains underdetermined and context-sensitive. The syntax is stable even as the semantics shifts. For artificial systems, this tension plays out concretely: LLMs seem to have enormous analogical fluidity but questionable compositional rigor, while cognitive architectures have crisp compositionality but limited analogical flexibility. Whether these tradeoffs are necessary or are artifacts of current designs is a question for the follow-up articles.
The vehicle-content distinction has practical implications. If Quilty-Dunn is right that content is epiphenomenal and vehicles do the causal work, then interpretability research that focuses on what a representation means (its content, its referent) may be looking in the wrong place. What matters for understanding a system’s cognitive operations may be the format properties of its representations — their compositional structure, their inferential roles, their computational accessibility. This doesn’t mean content is irrelevant for evaluation (we care whether a system’s beliefs are true), but it means the mechanism of cognition may be best understood at the vehicle level. This is a claim with direct implications for how we study both LLMs and cognitive architectures.
The hardest question is about the base. What’s at the bottom of the conceptual hierarchy? Wierzbicka says: a small set of universal semantic primes, mostly experiential and agentive. Carey says: innate core knowledge systems for objects, number, agency, and geometry. Fodor says: every primitive concept is an unstructured atom. Hofstadter says: the base cases are perceptual-motor surfaces that get their structure from the analogies built on them. For natural minds, this question is partly empirical (developmental psychology and cross-linguistic research can constrain the answers) and partly philosophical (what counts as conceptual vs. pre-conceptual representation?). For artificial minds, the question becomes a design choice: what do you put at the foundation? LLMs start with tokens — arbitrary units of text. Cognitive architectures start with designed primitives. Neither starts with the experientially and agentively saturated primes that human conceptual systems seem to rest on. Whether this matters — and if so, how much — is perhaps the deepest question connecting concept theory to AI.
These are the questions that the two follow-up articles will take up, each from a different angle: the LLM article by examining what interpretability reveals about learned representations, the cognitive architecture article by examining what designed systems assume about conceptual structure and what happens when those assumptions meet reality.
This article was co-authored by Lukasz Stafiniak and Claude (Anthropic). The arguments, errors, and speculative leaps are jointly owned — blame is to be distributed across species at the reader’s discretion.