Building Minds: Developmental Psychology and the Architecture of Learning

From Piaget’s Constructivism Through Core Knowledge to Computational Models of Cognitive Growth

March 2026


1. Introduction: The Child as Architect

How does a mind get built? Not the brain — that question belongs to neurobiology — but the mind: the structured repertoire of concepts, expectations, causal theories, and reasoning capacities that transforms a newborn’s buzzing confusion into an adult’s coherent understanding of the physical and social world. This is the central question of developmental cognitive psychology, and it turns out to be the same question that drives cognitive architecture research and, increasingly, AI.

In a previous article on cognitive architectures, we surveyed the computational frameworks — ACT-R, Soar, Sigma, and the Common Model of Cognition — that attempt to specify the fixed machinery of adult thought. In an earlier article on concept theories, we examined what concepts are: their content, format, grounding, and dynamics. This article completes the triangle by asking how concepts and cognitive capacities develop in the first place. The developmental question is upstream of both: before there can be an architecture running skills and knowledge, something has to build the representations that the architecture operates over. Before there can be a theory of concepts, something has to explain how the first concepts arise.

The developmental story also turns out to be deeply relevant to artificial intelligence — not just as a source of inspiration, but as a constraint. If human cognitive development reveals something about the necessary conditions for building structured, general-purpose intelligence, then understanding it matters for evaluating what current AI systems achieve and what they lack. We will pay particular attention to Gary Drescher’s Made-Up Minds (1991), an ambitious AI project that took Piagetian constructivism seriously as a design philosophy and implemented a “schema mechanism” that attempted to build concepts from sensorimotor primitives. Drescher’s system is both a vindication and a stress test of constructivist ideas — showing what you can get from minimal innate structure, and where you run into walls that suggest something more is needed.

2. Piaget’s Framework: The Constructivist Vision

2.1 The Core Idea

Jean Piaget’s theory of cognitive development, developed across dozens of books from the 1920s through the 1970s, rests on a single powerful idea: children are not passive recipients of knowledge but active constructors of their understanding of the world. The mind is not furnished by experience (the empiricist picture) or pre-stocked by evolution (the nativist picture) but built, piece by piece, through the child’s own activity — manipulating objects, testing expectations, resolving contradictions.

The mechanism of construction is a cycle of two complementary processes. Assimilation is the interpretation of new experience in terms of existing cognitive structures (which Piaget called schemes). A toddler who has learned to grasp a rattle assimilates a new toy by applying the same grasping scheme to it. Accommodation is the modification of existing schemes when they fail to handle new input. When the toddler tries to grasp water and fails, the grasping scheme must be revised — or supplemented by a new scheme for scooping, pouring, or cupping. The interplay between assimilation and accommodation drives development forward: the child constantly tries to impose existing structures on new experience, and constantly revises those structures when they prove inadequate.
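
To make the cycle concrete, here is a minimal Python sketch (the class and function names are our own illustration; Piaget offered no formalism). Assimilation applies an existing scheme whose pattern fits the input; accommodation, crudely modeled here as spawning a new scheme, restructures the repertoire when nothing fits:

```python
# Toy sketch of the assimilation/accommodation cycle.
# All names (Scheme, applies, encounter) are illustrative, not Piaget's.

class Scheme:
    """A sensorimotor unit: a feature pattern plus the action it triggers."""
    def __init__(self, name, pattern):
        self.name = name
        self.pattern = pattern           # features the scheme expects

    def applies(self, features):
        return self.pattern <= features  # pattern is a subset of the input

def encounter(schemes, features):
    """Assimilate if an existing scheme fits; otherwise accommodate."""
    for s in schemes:
        if s.applies(features):
            return f"assimilated by '{s.name}'"
    # Accommodation (crudely modeled): build a new scheme specialized
    # to the unhandled input, extending the repertoire.
    new = Scheme(f"scheme-{len(schemes)}", frozenset(features))
    schemes.append(new)
    return f"accommodated: created '{new.name}' for {sorted(features)}"

schemes = [Scheme("grasp", frozenset({"solid", "graspable"}))]
print(encounter(schemes, {"solid", "graspable", "red"}))  # rattle: assimilated
print(encounter(schemes, {"liquid", "flowing"}))          # water: accommodated
```

A faithful model would also revise existing schemes rather than only adding new ones; the point here is only the control flow: try existing structures first, restructure on failure.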

Piaget called the process of balancing assimilation and accommodation equilibration: a cognitive harmony that is constantly disrupted by encounters with novelty and re-established at a higher level of organization. This is not a passive equilibrium but a dynamic process, closer to the homeostatic regulation we discussed in the consciousness article than to a static balance point. The child’s cognitive system is always in motion, always adjusting, always seeking a more comprehensive and internally consistent understanding.

2.2 The Stage Theory

Piaget organized cognitive development into four broad stages, each characterized by a qualitatively different kind of thinking:

The sensorimotor stage (birth to approximately two years) is the period in which intelligence is entirely practical — a matter of coordinating perceptions and actions without internal representation. The infant learns that actions have consequences, that objects have properties, and — crucially — that objects continue to exist when out of sight (object permanence). Piaget believed this last achievement came late in the sensorimotor period, around 8–12 months, and depended on the infant’s ability to coordinate sequences of actions (reaching behind an occluder, pulling away a cloth). This timing claim would become one of the most contested elements of his theory.

The preoperational stage (roughly 2–7 years) is marked by the emergence of symbolic representation — language, pretend play, drawing — but also by systematic errors that reveal the limits of the child’s reasoning. The famous conservation tasks illustrate this beautifully. Pour water from a short, wide glass into a tall, thin one, and a preoperational child will insist there is now “more” water, because the level is higher. The child cannot yet coordinate the two relevant dimensions (height and width) into a single judgment. Piaget also documented egocentrism at this stage: the difficulty of taking another person’s perspective, as revealed by the “three mountains” task.

The concrete operational stage (roughly 7–11 years) sees the mastery of conservation, classification, and serial ordering, but only for concrete, perceptible objects and situations. The child can reason logically about things that can be manipulated and observed but struggles with purely hypothetical or abstract problems.

The formal operational stage (roughly 11 years onward) brings the capacity for abstract, hypothetical, and systematic reasoning — the ability to think about possibilities rather than just actualities, to design experiments that isolate variables, to reason about propositions rather than just objects.

2.3 What Piaget Got Right

Piaget’s stage theory has been extensively criticized, and we will get to the criticisms. But what he got right remains foundational. The general trajectory from sensorimotor to symbolic to abstract reasoning is real — not as rigid or stage-like as Piaget proposed, but the direction of travel is not in dispute. The constructivist insight is profound: children are active builders of understanding, and their errors are often systematic and informative — not random failures but windows into coherent (if incomplete) cognitive structures. The emphasis on action as the foundation of knowledge has aged well; as we noted in the concepts article, Wierzbicka’s universal semantic primes are saturated with action and experience — DO, MOVE, SEE, FEEL — which suggests that the conceptual foundation really is rooted in sensorimotor engagement. And the idea of equilibration — that cognitive development is driven by the detection and resolution of contradictions — anticipates modern work on prediction error and Bayesian surprise as drivers of learning. Piaget lacked the formal tools to make this precise, but the intuition was sound.

3. Challenging Piaget: What Babies Know

3.1 The Violation-of-Expectation Revolution

The most dramatic revision of Piaget’s framework came from a methodological innovation: the violation-of-expectation (VOE) paradigm, pioneered by Renée Baillargeon in the mid-1980s. Piaget had assessed object permanence by watching whether infants would search for hidden objects. When infants younger than 8–9 months failed to search, he concluded they lacked the concept. But Baillargeon realized that search requires a complex coordination of motor planning, means-ends reasoning, and inhibitory control that might be beyond young infants for reasons having nothing to do with their understanding of objects. Her alternative: show infants events that either respect or violate physical expectations, and measure how long they look. Infants look longer at surprising events — so if they look longer at a screen rotating through the space where a hidden box should be, that suggests they expected the box to still be there.

In Baillargeon’s 1987 study, infants as young as 3½ months looked significantly longer at impossible events, suggesting they understood that hidden objects continue to exist and that solid objects cannot pass through each other. Subsequent work pushed the age to 2½ months for some occlusion events. Piaget had placed object permanence at 8–12 months; VOE studies were finding it at 3–4 months.

The VOE results did not go uncontested — longer looking at impossible events could reflect lower-level perceptual processes rather than genuine conceptual expectations. But the accumulated weight of evidence from multiple labs, paradigms, and physical principles (solidity, continuity, support, containment) has made it very difficult to maintain Piaget’s original timeline. Something about objects is understood very early — the question is what that something is, and where it comes from.

The deeper significance of the VOE findings emerged when Stahl and Feigenson (2015) showed that expectation violation isn’t just a laboratory curiosity — it’s an active learning strategy. Eleven-month-old infants who witnessed an object violate their expectations (passing through a wall, hovering in midair) didn’t just look longer. They selectively explored that object over others, and — crucially — they tested it in ways targeted at the specific violation: banging the object that had appeared to pass through a solid surface, dropping the object that had appeared to hover. Infants who saw the same object behave normally showed no such targeted exploration. This is curiosity with structure. The infant isn’t just startled by surprise; it is using prediction failure as a signal for where learning is most needed, and directing action accordingly. Piaget’s equilibration — the idea that cognitive disequilibrium drives development — gets its sharpest empirical vindication here, but in infants far younger than Piaget would have expected to show it.
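
The logic of the finding can be captured in a few lines. The sketch below is our illustration, not Stahl and Feigenson’s analysis code: surprise (prediction error) selects which object to explore, and the violated principle selects how to test it:

```python
# Toy sketch of surprise-targeted exploration (our illustration of the
# logic, not Stahl and Feigenson's analysis).

def choose_exploration(observations):
    """Surprise selects WHICH object to explore; the violated principle
    selects HOW to test it."""
    tests = {"solidity": "bang it against the table", "support": "drop it"}
    target = max(observations, key=lambda o: o["surprise"])
    if target["surprise"] == 0.0:
        return "no violation observed: explore at random"
    return f"explore the {target['object']}: {tests[target['violated']]}"

observations = [
    # surprise: 0.0 = fully expected, 1.0 = impossible under current model
    {"object": "ball", "surprise": 0.0, "violated": None},
    {"object": "car",  "surprise": 0.9, "violated": "solidity"},
]
print(choose_exploration(observations))
# -> explore the car: bang it against the table
```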

3.2 Core Knowledge: Spelke’s Proposal

Elizabeth Spelke synthesized the VOE findings into the core knowledge hypothesis (Spelke 2022, précis 2023): human cognition is founded on six domain-specific systems that emerge early in infancy, have deep evolutionary roots, and continue to function throughout life. These include systems for representing objects (bounded, cohesive, spatiotemporally continuous), actions (goal-directed), number (approximate magnitudes obeying Weber’s law), geometry (distances and angles for navigation), social partners (individuals and their dispositions), and possibly social groups (membership and norms).

These systems are automatic, unconscious, modular (each operates over its own domain and cannot freely exchange information with others), attention-dependent, and abstractly representational — the object system represents objects as individuals that persist through occlusion, not just as patterns of visual stimulation. This is a deeply nativist proposal: core knowledge systems are part of the innate endowment, present in non-human primates and other vertebrates. Experience doesn’t create them but activates and refines them. The question then becomes: if the building blocks are innate, what role does development play?

3.3 The Composition Problem

Spelke’s answer to this question is one of the most interesting aspects of her framework. Core knowledge systems, being modular, cannot freely share information. The object system doesn’t know about numbers; the number system doesn’t know about agents. Yet adult human cognition seamlessly integrates information across these domains — we can judge that exactly three cups are hidden behind the screen, or that an agent is pointing toward the food behind the larger of two barriers. How does integration happen?

Spelke’s proposal is that natural language provides the compositional mechanism that allows core knowledge systems to exchange information. The syntax and semantics of language create structured representations that combine elements from different core systems — “three agents behind the wall” integrates number, agency, objects, and geometry in a single proposition. This is why, on Spelke’s view, only humans achieve the productive, flexible cognition that goes beyond core knowledge: only humans acquire natural language with compositional semantics.

This proposal has been criticized from multiple directions. Revencu and Csibra (2023) argue that language alone cannot explain conceptual productivity — that something more is needed, perhaps a dedicated system for communication or shared intentionality. Others have noted that pre-linguistic infants show some cross-domain integration that Spelke’s strict modularity should prevent. Susan Carey (2024), commenting on Spelke’s book, has emphasized the role of language but also the need for additional mechanisms of conceptual change.

Perhaps the most damaging challenge, however, comes from comparative cognition. If language is required for cross-domain integration, then non-linguistic animals should be trapped within their core systems — capable within each domain but unable to combine information across them. Corvids flatly contradict this prediction. Western scrub-jays cache food in hundreds of locations and selectively re-cache items when they have been observed by a competitor — integrating object representation (what was cached), spatial memory (where), temporal reasoning (when, since cached food degrades), and social cognition (who was watching, and what they could see from their vantage point). New Caledonian crows manufacture hooked stick tools through multi-step sequences that require coordinating physical-causal knowledge with prospective planning. Corvids achieve this cross-domain integration with brains organized very differently from mammalian cortex, without anything resembling compositional syntax. Whatever mechanism enables it, it isn’t language.

This suggests that language is one powerful mechanism for cross-domain integration — perhaps the most powerful one, and the one that enables the recursive, open-ended compositionality that is distinctively human — but not the only mechanism. There may be pre-linguistic or non-linguistic routes to inter-module communication that Spelke’s framework underestimates. The debate is very much alive, and it touches directly on the constructivist question: is development a matter of combining pre-existing representations (Spelke’s view) or constructing qualitatively new ones (the constructivist view)?

4. Carey’s Synthesis: Constructivism Vindicated (With Nativist Building Blocks)

4.1 Quinean Bootstrapping

Susan Carey’s The Origin of Concepts (2009) is the most ambitious attempt to synthesize the nativist insights of the core knowledge tradition with the constructivist insights of the Piagetian tradition. Carey accepts that infants begin life with rich innate representational systems — she agrees with Spelke about core knowledge. But she also insists, contra Fodor’s radical nativism, that genuinely new concepts can be learned — concepts whose expressive power exceeds anything that could be built by logical combination from innate primitives.

The mechanism she proposes is Quinean bootstrapping — named after the philosopher W.V.O. Quine’s metaphor of rebuilding a ship while at sea. The key idea is that learners can create placeholder symbols — symbols whose initial meaning is exhausted by their relations to other symbols and to observable regularities — and gradually fill those symbols with conceptual content through a process that is neither pure logical construction nor pure empirical generalization.

Her paradigm case is the acquisition of natural number concepts. Children learn the counting sequence (“one, two, three, four, five…”) before they understand what any of the words mean numerically. The words initially function as placeholders — their “meaning” is just their position in the list and their role in the counting routine. Over about two years (roughly ages 2–4), children slowly figure out what the words mean: first “one” (distinguished from “more than one”), then “two,” then “three,” and then — in a dramatic insight — they grasp the counting principle: that the last word in a count tells you how many things there are. At this point, they have constructed a genuinely new representational resource: exact cardinal number concepts. These concepts cannot be derived by logical combination from the core number system (which represents only approximate magnitudes) or the parallel individuation system (which handles only sets of 1–3). Something new has been created.
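
The shape of this trajectory can be sketched in code. The following is our illustration only: Carey’s account is not an algorithm, and the sketch compresses roughly two years of learning into a single flag flip. But it shows the structural point: placeholder symbols whose only initial content is list position, filled in one at a time, until a generalization over the filled-in cases yields meanings for words never taught directly.

```python
# Toy sketch of Quinean bootstrapping for number words (our illustration).

COUNT_LIST = ["one", "two", "three", "four", "five"]  # placeholder symbols

class NumberLearner:
    def __init__(self):
        self.known = {}        # words whose exact meanings have been filled in
        self.has_cp = False    # cardinal principle not yet grasped

    def learn_word(self, word, exact_meaning):
        """Subset-knower phase: meanings acquired one word at a time."""
        self.known[word] = exact_meaning

    def induce_cardinal_principle(self):
        """The bootstrapping insight: once 'one'..'three' align with list
        position, generalize -- the last word of a count names the set size."""
        if all(self.known.get(w) == i + 1 for i, w in enumerate(COUNT_LIST[:3])):
            self.has_cp = True

    def how_many(self, word):
        if self.has_cp:
            return COUNT_LIST.index(word) + 1   # meaning = position in list
        return self.known.get(word)             # else: only memorized words

kid = NumberLearner()
for w, n in [("one", 1), ("two", 2), ("three", 3)]:
    kid.learn_word(w, n)
kid.induce_cardinal_principle()
print(kid.how_many("five"))   # -> 5, though "five" was never taught directly
```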

4.2 The Circularity Challenge

Carey’s account has been challenged on the grounds that it is circular: if learning new concepts requires hypothesis testing, and hypothesis testing requires already having the concepts to formulate hypotheses about, then bootstrapping presupposes what it’s supposed to explain. This is essentially Fodor’s argument against concept learning, now directed at Carey’s specific proposal.

The most sophisticated response comes from Beck (2017), who argues that the circularity challenge can be met by recognizing computational constraints — both internal (innate learning biases, attentional mechanisms) and external (linguistic structure, cultural scaffolding) — that guide the interpretation of placeholder symbols without presupposing the content those symbols will eventually acquire. The placeholders don’t need to already mean what they will come to mean; they just need to be embedded in computational processes that progressively constrain their interpretation.

This connects to a deeper point about format versus content that we emphasized in the concepts article. The placeholder symbols initially have vehicle properties (their syntactic position, their inferential relations to other symbols) without having determinate content. It is the vehicle properties that do the computational work during bootstrapping — the gradual accumulation of inferential roles that eventually constitutes having a new concept. If Quilty-Dunn is right that content is epiphenomenal and vehicles do the causal work, then bootstrapping becomes less mysterious: you don’t need to explain how a new meaning magically appears; you need to explain how a new computational role gets built up from interactions among existing computational roles. And that is something we know how to model.

4.3 Executive Function and Conceptual Change

A more recent development in Carey’s research program, conducted with Deborah Zaitchik since 2012, has investigated the role of executive function in conceptual change. Executive function — the set of cognitive control capacities including working memory updating, inhibitory control, and cognitive flexibility — turns out to be a critical factor in whether children can undergo the kind of conceptual restructuring that bootstrapping requires.

This makes architectural sense. In the cognitive architecture framework from our previous article, the serial bottleneck in procedural memory’s interaction with working memory is what enables deliberate, goal-directed thought. Working memory capacity — which grows from roughly one chunk at age 2 to four chunks in adulthood — constrains how many representational elements can be simultaneously held and compared. Inhibitory control — the ability to suppress a prepotent response — is needed when the new conceptual system conflicts with the old one (as when a child must suppress the perceptually compelling judgment that the taller glass has “more water”). Cognitive flexibility — the ability to shift between different representational frameworks — is needed to compare the old and new ways of thinking about a domain.

The neo-Piagetian theorists, particularly Pascual-Leone and Case, had already argued that growth in working memory capacity is a (or the) primary driver of stage transitions. Pascual-Leone’s Theory of Constructive Operators proposed that mental-attentional power (M-power: the number of schemes that can be simultaneously activated) increases from 1 unit at age 2 to 7 units at age 15, with each increment enabling a new level of cognitive complexity. Case refined this by arguing that what increases is not raw capacity but the efficiency with which schemes can be processed, freeing up working memory resources that were previously consumed by lower-level operations.

The comparison to the cognitive architecture tradition is illuminating if we ask: what is a Pascual-Leonean scheme? It is a structured representational unit — it can encode a sensorimotor pattern, a relational structure, an executive control operation — but what matters for capacity purposes is that each scheme that must be simultaneously active to solve a task occupies one unit of M-capacity. This makes a scheme functionally equivalent to a chunk in the Common Model of Cognition: the unit that occupies one working memory slot. A concept, on this picture, is not a single scheme but a coordinated assembly of schemes — to deploy the concept of conservation, for instance, the child must simultaneously activate schemes for the two perceptual dimensions (height, width), the scheme for their compensatory relation, and an executive scheme for inhibiting the prepotent single-dimension response. That is four schemes — precisely the kind of demand that a preoperational child with two or three units of M-capacity cannot meet.
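
The arithmetic is simple enough to state as code. The sketch below is our idealization: a linear M-capacity schedule interpolating the endpoints Pascual-Leone proposed (1 unit at age 2, 7 at age 15), checked against the four-scheme demand of conservation described above:

```python
# Toy M-capacity model (our linear idealization of Pascual-Leone's
# endpoints: 1 unit at age 2 rising to 7 units at age 15).

def m_capacity(age_years):
    return min(7, max(1, 1 + (age_years - 2) // 2))

CONSERVATION_DEMAND = [        # the four schemes named in the text
    "attend-to-height", "attend-to-width",
    "compensation-relation", "inhibit-height-response",
]

def can_coordinate(age_years, schemes):
    """Solvable only if every required scheme fits in M-capacity at once."""
    return m_capacity(age_years) >= len(schemes)

for age in (4, 6, 8, 15):
    print(age, m_capacity(age), can_coordinate(age, CONSERVATION_DEMAND))
# 4 -> 2 units (fails), 6 -> 3 (fails), 8 -> 4 (succeeds), 15 -> 7
```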

Each pointer from the cognitive architectures article maps onto one scheme or consolidated chunk. Our conjecture there was that the adult ceiling is approximately four pointers, motivated by the theoretical requirements of recursive Merge; empirically, the measured adult ceiling ranges from four (Cowan, Halford) to seven (Pascual-Leone), depending on the task and how chunking is controlled for. Case’s efficiency mechanism explains why effective capacity grows even between maturational steps: practiced operations get compiled into single chunks, reducing M-demand and freeing up pointers for novel elements. The child doesn’t get more pointers (beyond what maturation provides); the child gets more done per pointer.

5. The Neo-Piagetians: Architecture Meets Development

5.1 Stages as Working Memory Growth

The neo-Piagetian tradition represents a genuine bridge between developmental psychology and cognitive architecture. Where Piaget described stages in terms of the kind of thinking children could do (sensorimotor, preoperational, concrete operational, formal operational), the neo-Piagetians redescribed the same stages in terms of the computational resources available at each age — primarily working memory capacity and processing speed.

Pascual-Leone, Case, Halford, and Fischer all proposed variants of this redescription. Despite differences in detail, they converge on a core claim: Piagetian stage transitions correspond to increases in the number of relational elements that the child can simultaneously represent and coordinate. A sensorimotor infant can coordinate one or two elements (an action and its result). A preoperational child can coordinate two or three (two dimensions of a conservation task, though not yet the compensatory relation between them). A concrete-operational child can coordinate four to five, enabling the compensation reasoning that conservation requires. A formal-operational adolescent can coordinate six or seven, enabling reasoning about relations between relations: the abstract, hypothetical thought that Piaget described.

Halford’s relational complexity theory makes this particularly precise: a unary relation (a property of a single element) requires one working memory slot; a binary relation (a relation between two elements) requires two; a ternary relation requires three; and a quaternary relation requires four. Halford argues that four is the maximum relational complexity that adults can process without decomposing the relation into simpler parts — which maps neatly onto the approximately four-item working memory capacity limit that Cowan (2001) documented. Children develop through the relational complexity levels as their working memory capacity increases.
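
Halford’s bookkeeping is direct enough to express as a capacity check. The sketch below is our rendering; his own strategies for exceeding the limit (conceptual chunking and segmentation into sub-relations) are reduced here to a single boolean:

```python
# Sketch of Halford's relational-complexity check (our rendering): each
# argument slot of a relation costs one working-memory slot.

ADULT_CAPACITY = 4   # Cowan's (2001) ~4-chunk estimate

def processable(arity, capacity=ADULT_CAPACITY):
    """Can a relation of this arity be processed in a single step,
    or must it first be decomposed into simpler relations?"""
    return arity <= capacity

examples = [
    (1, "tall(x)"),                       # unary: a property
    (2, "taller(x, y)"),                  # binary
    (3, "gave(giver, gift, recipient)"),  # ternary
    (4, "proportion(a, b, c, d)"),        # quaternary: the adult ceiling
    (5, "five interacting variables"),
]
for arity, expr in examples:
    verdict = "single step" if processable(arity) else "must decompose"
    print(f"{expr}: {verdict}")
```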

5.2 The Maturation Question

The neo-Piagetians split on a crucial question: what causes working memory to increase with age? The constructivist-learning camp (including Case in his later work) argues that the primary cause is experience-driven increases in processing efficiency — as the child practices cognitive operations, they become faster and more automatic, freeing up resources for additional elements. This is essentially the chunking mechanism that Soar and the Standard Model describe: practiced sequences get compiled into single operations, reducing the working memory load they require.

The maturational-attention camp (including Pascual-Leone) argues that something more is needed: a biologically timed increase in the capacity of the mental-attentional resource itself — a growth in the “hardware” rather than just optimization of the “software.” This is a genuinely different claim, because it implies that no amount of practice could get a 3-year-old to reason at a 7-year-old’s level. The maturation of prefrontal cortex — which undergoes dramatic development throughout childhood and adolescence, with synaptic pruning, myelination, and increasing connectivity — provides the neurobiological substrate for this claim.

The truth almost certainly involves both. Cortical maturation sets an upper bound on available cognitive resources at each age, while experience-driven efficiency gains determine how much of that capacity is effectively available for novel reasoning. This is analogous to the interaction between hardware and software in computing: a faster processor enables more complex programs, but optimized code can sometimes accomplish on slower hardware what unoptimized code can only manage on faster hardware.

6. Drescher’s Schema Mechanism: Constructivism Meets AI

6.1 The Radical Constructivist Wager

Gary Drescher’s Made-Up Minds (1991) is one of the most intellectually ambitious projects in the history of AI — and one of the least well known. Drescher set out to build a computational system that would learn the basic concepts of the physical world from sensorimotor experience, starting from almost nothing innate, in a process modeled explicitly on Piaget’s account of sensorimotor development. The system — the “schema mechanism” — was designed to discover regularities in its interaction with a simulated environment and to build increasingly abstract representations of objects, actions, and causal relationships.

The constructivist wager at the heart of the project is bold: almost all of the structure we find in the world, even what Kant claimed must be known a priori (such as the binding of experience into persisting objects with stable properties), can be learned from input by a relatively simple mechanism. Drescher doesn’t claim that the learning mechanism itself is simple in the sense of being trivial; he claims it is simple in the sense of requiring minimal domain-specific innate knowledge. The mechanism is domain-general. The structure comes from the world.

6.2 How the Schema Mechanism Works

The schema mechanism operates over a vocabulary of items and schemas. Primitive items are binary sensory states (On/Off); synthetic items are learned beliefs that can be On, Off, or Unknown. Schemas are triples of (precondition, action, postcondition) — essentially condition-action rules that predict what will happen when an action is taken in a given context.
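
In code, the vocabulary is compact. The following is our minimal rendering of these definitions (field names are ours, and Drescher’s implementation tracked considerably more statistics per schema):

```python
# Minimal rendering of Drescher's items and schemas (simplified).
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class Item:
    name: str
    synthetic: bool = False
    # Primitive items are On/Off (True/False); synthetic items may also
    # be Unknown, represented here as None.
    state: Optional[bool] = None

@dataclass
class Schema:
    context: Dict[str, bool]   # precondition: item name -> required state
    action: str
    result: Dict[str, bool]    # postcondition: predicted state changes
    activations: int = 0
    successes: int = 0

    @property
    def reliability(self) -> float:
        """Fraction of activations whose prediction came true."""
        return self.successes / self.activations if self.activations else 0.0

grasp = Schema(context={"hand-near-ball": True}, action="close-hand",
               result={"holding-ball": True})
```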

Learning proceeds through marginal attribution: the system tracks, for each action, which state-transitions occur more often than their baseline rate. When a transition is reliably associated with an action, a new schema is created. The system then looks for context conditions that make the schema more reliable — features of the situation that must hold for the action-result connection to work. This is essentially a form of causal discovery: the system is finding the conditions under which actions reliably produce effects.
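
A toy version of marginal attribution (our reconstruction of the idea; Drescher’s actual statistics are more elaborate) tracks how often each transition follows each action relative to its baseline rate, and proposes a schema when the ratio is high:

```python
# Toy marginal attribution: spin off a schema when a transition is
# markedly more frequent just after an action than at baseline.
from collections import Counter

class MarginalAttribution:
    def __init__(self, threshold=2.0):
        self.after_action = Counter()   # (action, transition) -> count
        self.action_count = Counter()
        self.baseline = Counter()       # transition -> count overall
        self.steps = 0
        self.threshold = threshold      # relevance ratio to propose a schema

    def observe(self, action, transitions):
        self.steps += 1
        if action is not None:
            self.action_count[action] += 1
        for t in transitions:           # e.g. ("light", False, True)
            self.baseline[t] += 1
            if action is not None:
                self.after_action[(action, t)] += 1

    def candidate_schemas(self):
        out = []
        for (action, t), n in self.after_action.items():
            rate_with = n / self.action_count[action]
            rate_base = self.baseline[t] / self.steps
            if rate_base and rate_with / rate_base >= self.threshold:
                out.append((action, t, rate_with / rate_base))
        return out

ma = MarginalAttribution()
for _ in range(20):
    ma.observe("press-button", [("light", False, True)])
for _ in range(80):
    ma.observe(None, [])            # the light never changes on its own here
print(ma.candidate_schemas())       # press-button reliably flips the light
```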

What makes the schema mechanism more than a simple stimulus-response learner is the synthetic item mechanism. When the system encounters a schema that is locally consistent (it works for a while, then stops working, then works again) but not globally reliable, it creates a synthetic item — an inferred hidden variable that explains the inconsistency. The paradigmatic case is object permanence: the schema “if I probe here, I see the ball” works when the ball is present but not when it has been moved. The system creates a synthetic item — call it “ball-is-here” — that, when added to the schema’s context, makes it reliable. The synthetic item is then given verification conditions: ways to test whether it is currently On or Off (checking whether the probe succeeds, checking whether another reliable schema predicts the ball’s presence).
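
The trigger condition can be sketched simply. This is our simplification of the synthetic-item move: a schema whose outcomes form long runs of success and failure (locally consistent) while its overall rate is middling (globally unreliable) is evidence for a hidden binary variable gating it:

```python
# Sketch of the synthetic-item trigger (our simplification of Drescher).

def needs_synthetic_item(outcomes, window=5, lo=0.2, hi=0.8):
    """outcomes: chronological list of True/False schema results.
    Locally consistent: each window is all-success or all-failure.
    Globally unreliable: the overall success rate is middling."""
    overall = sum(outcomes) / len(outcomes)
    if not (lo < overall < hi):
        return False                 # schema is simply reliable or hopeless
    for i in range(0, len(outcomes) - window + 1, window):
        w = outcomes[i:i + window]
        if 0 < sum(w) < window:      # mixed within a window: mere noise
            return False
    return True                      # clean runs: posit a hidden state

history = [True]*5 + [False]*5 + [True]*5  # ball present, removed, replaced
if needs_synthetic_item(history):
    print("create synthetic item 'ball-is-here' to gate this schema")
```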

This is a computational implementation of Piaget’s claim that the infant constructs the concept of a persisting object. The synthetic item “ball-is-here” is not a perceptual primitive — it cannot be directly observed. It is an inferred theoretical entity, created by the system to explain regularities and inconsistencies in its experience. As Drescher puts it, a synthetic item “reifies the set of circumstances under which a piece of one’s computational machinery behaves a certain way.” The concept of an object is, in this framework, a hypothesis about the world that the system invents to make its own predictions more reliable.

6.3 Counterfactuals and Non-Naive Induction

One of Drescher’s deepest insights is the central role of counterfactuals in learning. The system doesn’t just track what happens; it tracks what would have happened under different conditions. This is essential for causal reasoning: knowing that pressing a button turns on a light is useful only if you also understand that not pressing the button would leave the light off. The schema mechanism implements counterfactual reasoning through a careful tracking of “unexplained” transitions — state changes that are not already accounted for by existing reliable schemas — and uses these to discover new causal regularities.

Drescher is explicit that naive induction — mere correlation tracking — is insufficient for genuine concept learning. Section 8.6 of Made-Up Minds is titled “Why Non-Naive Induction Must Be Built In,” and it argues that the system requires built-in mechanisms for dealing with counterfactuals, for preferring schemas with inverse effects (turning an item On right after it was turned Off, or vice versa, as a heuristic for finding controllable variables), and for the marginal-attribution approach that focuses on unexplained rather than total correlations. These are not domain-specific innate knowledge structures; they are domain-general learning biases. But they are biases nonetheless, and they represent a concession to the nativist: some structure must be built in for learning to get off the ground.
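
One of these biases, the focus on unexplained transitions, can be sketched as a filter (our reconstruction, not Drescher’s code): transitions already predicted by a reliable schema for the current action are removed before induction, so learning concentrates on what existing knowledge fails to account for:

```python
# Sketch of 'unexplained transition' filtering (our reconstruction).

def unexplained(transitions, reliable_schemas, last_action):
    """Keep only transitions no existing schema already accounts for."""
    explained = {
        s["result"] for s in reliable_schemas
        if s["action"] == last_action
    }
    return [t for t in transitions if t not in explained]

schemas = [{"action": "press", "result": ("light", True)}]
events = [("light", True), ("fan", True)]
print(unexplained(events, schemas, "press"))   # [('fan', True)] -- novel
```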

This is where Drescher meets David Deutsch, despite their seemingly opposed starting points. Drescher is a constructivist; Deutsch is a critic of empiricism. But Drescher’s constructivism requires that the learning algorithm have what Deutsch calls universal reach — the capacity to discover any regularity in any domain, given sufficient experience. And Drescher accepts that this capacity cannot itself be learned; it must be part of the system’s innate architecture. Even the most radical constructivism, it turns out, requires a nativist kernel.

6.4 What the Schema Mechanism Achieves — and Where It Fails

The schema mechanism succeeds in rediscovering several Piagetian sensorimotor achievements from first principles: object permanence (through synthetic items), means-ends reasoning (through composite actions built from schema chains), and a primitive form of imitation (through the mechanism that detects when a goal state becomes satisfied externally). Remarkably, the system also reproduces some Piagetian errors at the right developmental stage — for instance, the failure to integrate tactile and visual object representations, which corresponds to the infant’s early failure to search for objects that have been seen but not touched.

But the system also has significant limitations. It never develops a notion of object identity across sensory modalities — the synthetic items for tactile-object-present and visual-object-present remain separate, just as they do for young Piagetian infants. The “subactivation” mechanism — an unimplemented proposal for simulating actions mentally rather than performing them physically — was designed to enable planning and prospection, but Drescher could not get it working. Variable-matching for generalizations (reasoning about “any object” rather than specific sensory configurations) was deliberately excluded to maintain the constructivist commitment, but Drescher acknowledged this might be a crippling limitation. And the system has no mechanism for hierarchical abstraction — for building concepts at multiple levels of generality.

These limitations are instructive. They suggest that constructivism, taken to its logical extreme, hits a ceiling. The schema mechanism can build up from sensorimotor primitives to something like object permanence and simple causal reasoning, but it cannot reach the level of abstract, variable-binding, hierarchically structured thought that characterizes human cognition from toddlerhood onward. Something more is needed — and the debate between Spelke, Carey, and the neo-Piagetians is essentially about what that something is. Is it richer innate structure that was never absent, just waiting to be activated (Spelke)? Or does representational expressiveness genuinely grow — and if so, at which level? Carey’s bootstrapping posits growth at the high level: new conceptual primitives, new compositional structures, qualitative discontinuities in what can be thought. The neo-Piagetian tradition posits growth at the low level: the format’s capacity to simultaneously bind multiple relational roles increases with M-power maturation and chunking efficiency — not new vocabulary, but more variables that the grammar can handle at once. These are not mutually exclusive, and both may be needed to explain how a schema mechanism that tops out at object permanence could eventually support the recursive, hierarchically structured thought of a human adult.

7. Complementary Learning Systems and the Pace of Development

7.1 Why Learning Has Two Speeds

One feature of cognitive development that is obvious to any parent but curiously undertheorized is its pace. Some things are learned fast — a toddler can learn a new word from a single exposure (Susan Carey’s own discovery of “fast mapping”). Other things are learned agonizingly slowly — conservation concepts resist instruction for years, and number concepts take two years to emerge even with constant exposure to counting.

The complementary learning systems (CLS) framework, which we discussed in the cognitive architectures article, provides a principled explanation for this asymmetry. The hippocampal system learns rapidly from single episodes, storing sparse, pattern-separated traces that can be replayed to neocortex for gradual consolidation. The neocortical system learns slowly, through interleaved exposure, extracting the statistical regularities that constitute general knowledge. The two systems solve the stability-plasticity dilemma that plagues any learning system: learn too fast and you overwrite old knowledge (catastrophic interference); learn too slowly and you can’t capture new experiences.

The 2016 update by Kumaran, Hassabis, and McClelland adds a crucial refinement: neocortical learning can be rapid when new information is consistent with known structure — when it fits into existing schemas. This explains a puzzling asymmetry in development. Children can learn schema-consistent facts almost instantly (a new word for a familiar category, a new instance of a known pattern) but require prolonged experience to learn schema-inconsistent information (a new conceptual framework that contradicts an existing one). Conservation is hard not because it is logically complex but because it contradicts a deeply entrenched schema: “more height means more stuff.”
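
A toy model makes the two speeds concrete. This is our sketch in the CLS spirit, with dictionaries standing in for what are, in the actual models, neural networks:

```python
# Toy dual-speed learner in the CLS spirit (our sketch, not the 1995/2016
# models themselves).
import random

class ComplementaryLearner:
    def __init__(self, slow_lr=0.05):
        self.episodes = []        # "hippocampus": verbatim one-shot store
        self.knowledge = {}       # "neocortex": feature -> graded strength
        self.slow_lr = slow_lr

    def experience(self, episode):
        self.episodes.append(episode)          # fast: a single exposure

    def consolidate(self, replays=200):
        """Interleaved replay: resample stored episodes and nudge cortical
        strengths a little each time -- slow extraction of regularities."""
        for _ in range(replays):
            ep = random.choice(self.episodes)
            for feature in ep:
                old = self.knowledge.get(feature, 0.0)
                self.knowledge[feature] = old + self.slow_lr * (1.0 - old)

    def schema_consistent(self, feature, threshold=0.5):
        """Kumaran et al.'s refinement: items that fit entrenched structure
        can be integrated quickly; everything else must be built up slowly."""
        return self.knowledge.get(feature, 0.0) >= threshold

learner = ComplementaryLearner()
for _ in range(30):
    learner.experience({"tall-glass", "more-height-means-more"})
learner.consolidate()
print(learner.schema_consistent("more-height-means-more"))  # True: entrenched
print(learner.schema_consistent("conservation"))            # False: slow road
```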

7.2 Connecting CLS to Carey’s Bootstrapping

The CLS framework illuminates why Carey’s Quinean bootstrapping is slow. Bootstrapping creates new representational resources that are discontinuous with existing ones — the new concept of exact number cannot be derived by gradual generalization from the approximate magnitude system. This is precisely the kind of schema-inconsistent learning that CLS theory predicts will be difficult: the new representational system doesn’t fit into existing neocortical structure and must be built up through extensive hippocampal-to-neocortical consolidation, driven by repeated experience with counting, comparing, and number games.

The CLS framework also explains why sleep matters so much for developmental learning. Hippocampal replay during sleep consolidates episodic memories into neocortical representations — and children, who have far more to consolidate, sleep far more than adults. The well-documented relationship between sleep and cognitive development may reflect the brain’s need for extended consolidation periods during which new representational resources are gradually integrated with existing knowledge.

And it connects to the schema mechanism’s learning dynamics. Drescher’s system learns incrementally, creating new schemas as a side effect of performance — the same principle that the Common Model of Cognition codifies. But the schema mechanism has only one learning speed. Real brains have two: fast hippocampal encoding and slow neocortical consolidation. A more biologically faithful version of the schema mechanism would need to implement this dual-speed architecture, with rapid one-shot formation of new schemas complemented by slow, interleaved consolidation that extracts the statistical structure across episodes. This might help explain how the system could move beyond the sensorimotor level — the transition to symbolic, abstract thought may require the kind of progressive neocortical restructuring that only slow, schema-level learning can provide.

8. Vygotsky’s Social Dimension (And Why Architecture Needs It)

8.1 The Zone of Proximal Development

No account of cognitive development is complete without acknowledging Lev Vygotsky’s contribution: the insight that cognitive development is fundamentally social. Where Piaget’s child is a solitary explorer, constructing knowledge through individual interaction with the physical world, Vygotsky’s child is a participant in cultural practices, guided by more knowledgeable partners (parents, teachers, older children) who scaffold activities that the child could not perform alone.

Vygotsky’s concept of the zone of proximal development (ZPD) — the gap between what a child can do independently and what they can do with guidance — captures something that neither nativism nor individual constructivism accounts for: the extent to which cognitive development depends on the right kind of social support at the right time. A child who cannot solve a conservation task alone might succeed when an adult draws attention to the relevant dimensions, asks leading questions, or models the correct reasoning.

8.2 Social Scaffolding and Schema Formation

The social dimension matters for the cognitive architecture picture because it changes what the learning system has to do on its own. Drescher’s schema mechanism must discover all regularities from scratch, through its own sensorimotor exploration. A human child, by contrast, has parents who structure the environment to make regularities more salient, label objects to draw attention to their persistence, demonstrate actions to reveal their causal structure, and provide linguistic scaffolding that encodes conceptual distinctions the child hasn’t yet discovered.

This is what Carey’s notion of external computational constraints captures: the cultural environment provides structure that guides the bootstrapping process without having to be innately represented. The counting words, for instance, are a cultural invention — an external representational system that provides the placeholder symbols needed for bootstrapping exact number concepts. The child doesn’t need to invent the counting system; they need to discover what it means, which is a different (and easier) problem. Similarly, labels for objects (“look, the ball!”) provide external symbols that can anchor the internal synthetic items that represent object persistence — making the object-permanence problem easier for a socially embedded learner than for Drescher’s isolated schema mechanism.

The implication for AI is significant. Current language models are trained on the accumulated outputs of human cultural scaffolding — they inherit the concepts, distinctions, and structural regularities that millions of human authors have externalized in text. This gives them access to representational resources that no individual learner could construct from scratch, which may explain some of their surprising capabilities. But it also means they are parasitic on a developmental process — human cognitive development within a cultural context — that they do not replicate.

8.3 Fast Mapping and the Tomasello Bridge

One of the most striking facts about human cognitive development is that children can learn a new word from a single exposure — a capacity Susan Carey herself discovered in 1978, calling it fast mapping. A three-year-old hears “Give me the chromium one, not the red one” and immediately infers that “chromium” refers to the olive-colored tray. One exposure, one mapping, retained over weeks.

This is not just impressive memory. It reveals that something in the child’s cognitive architecture is ready to bind a meaning to an arbitrary symbol on a single encounter. The binding requires solving a hard disambiguation problem: the word could refer to the object’s color, shape, material, location, or any number of other properties. Markman’s constraints (whole-object assumption, mutual exclusivity) help narrow the hypothesis space, but they are not enough. The child also needs to determine what the speaker intends to refer to — which requires tracking the speaker’s gaze, gesture, and communicative purpose.

This is where Michael Tomasello’s work becomes essential. In Becoming Human (2019), Tomasello argues that the capacity for shared intentionality — specifically, joint attention, which emerges around 9–12 months — is the distinctively human cognitive adaptation that makes cultural learning possible. Before joint attention, the infant cannot determine what the speaker means by a new word, because they cannot track what the speaker is attending to. After joint attention, fast mapping becomes possible: the infant uses the speaker’s eyes, pointing, and tone to triangulate reference, then binds the new symbol to the inferred referent in a single step.

This has architectural implications that connect to everything we have discussed. Fast mapping requires, at minimum: a working memory slot free to hold the new symbol, a pointer to the candidate referent (sustained by joint attention), an inhibitory mechanism to suppress alternative interpretations (Markman’s constraints), and a binding operation that links symbol to referent. That is a Merge-like operation over working memory pointers — taking what two pointers reference (an arbitrary phonological form and an attended object or property) and combining them into a structured unit that can then be stored in long-term memory. Joint attention provides the social mechanism that selects which referent to bind; the architecture provides the computational mechanism that performs the binding.
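
These minimal requirements can be sketched directly. The following is our composite of mutual exclusivity plus joint attention; all function and variable names are illustrative, not drawn from any of the cited work:

```python
# Sketch of fast mapping as a one-shot binding over working-memory pointers.

def fast_map(word, visible_objects, lexicon, attended):
    """Bind a novel word to a referent in a single step."""
    if word in lexicon:
        return lexicon[word]                      # nothing new to learn
    # Mutual exclusivity: rule out objects that already have names.
    unnamed = [o for o in visible_objects if o not in lexicon.values()]
    # Joint attention: prefer whatever the speaker is attending to.
    candidates = [o for o in unnamed if o == attended] or unnamed
    if len(candidates) == 1:
        lexicon[word] = candidates[0]             # the binding operation
        return candidates[0]
    return None                                   # still ambiguous: no bind

lexicon = {"red": "red-tray"}
referent = fast_map("chromium", ["red-tray", "olive-tray"], lexicon,
                    attended="olive-tray")
print(referent, lexicon)   # olive-tray bound after a single exposure
```

Note how the pieces divide: mutual exclusivity and joint attention narrow the hypothesis space (the social and constraint-based selection), while the one-line assignment is the architectural binding itself.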

Tomasello’s deeper claim is that shared intentionality is not just a prerequisite for word learning but the foundation of cumulative cultural evolution — the capacity for each generation to build on the cognitive achievements of previous generations rather than starting from scratch. This is what Drescher’s schema mechanism most conspicuously lacks, and what makes the difference between a system that tops out at object permanence and a species that builds cathedrals and writes physics textbooks. The individual constructivist, learning alone, hits Drescher’s ceiling. The culturally embedded constructivist, learning through shared intentionality, inherits a ratchet.

9. Implications for Artificial Minds

9.1 What Drescher’s System Teaches

The schema mechanism, read through the lens of the subsequent three decades of developmental and cognitive science, is both more right and more limited than it appeared in 1991. It is more right because the core constructivist insight — that concepts like object permanence can be understood as inferred theoretical entities created to explain regularities in experience — has been increasingly supported by computational models and empirical findings. The Bayesian turn in developmental psychology (Gopnik, Tenenbaum, Griffiths) essentially formalizes Drescher’s approach: concepts are generative models of the world, learned by inference from data.

But the schema mechanism is also more limited than constructivism alone would suggest, for reasons that illuminate what current AI systems need. First, synthetic items — the mechanism’s way of inventing hidden variables — are flat. Each synthetic item reifies the validity conditions of a single schema; it cannot bind to other synthetic items in structured, variable-bearing compositions. There is no way to represent “the same object that was here is now there” as opposed to “object-here is Off and object-there is On” — the system tracks individual state changes but cannot express identity across contexts. This is precisely the working memory gap: the Common Model of Cognition provides a central hub where multiple representations can be simultaneously held, compared, and composed through the Merge-like pointer operations we discussed in the cognitive architectures article. Without that hub, the schema mechanism cannot do what Carey’s bootstrapping requires — simultaneously activating placeholder symbols and existing representations and exploring structural mappings between them. Second, the system has no social channel — no way to receive external scaffolding from a more knowledgeable partner. Third, it has no dual-speed learning system — no hippocampal-like fast encoding complemented by neocortical-like slow consolidation.

9.2 The Developmental Desideratum

The deepest question from developmental psychology for AI is how much of the structure of thought falls out of text corpora alone — or, more precisely, out of the statistical structure of human communicative output — versus how much requires developmental architectural scaffolding to reproduce what evolution designed.

One way to frame this: the accumulated text of human civilization is itself a product of minds that developed through sensorimotor exploration, core knowledge activation, bootstrapping, social scaffolding, and dual-speed consolidation. All of that developmental structure is implicit in the text — in the patterns of argument, the causal vocabulary, the conceptual distinctions, the systematic compositionality of natural language. A system that learns from text inherits this structure secondhand, without having to reconstruct the developmental process that produced it. The question is whether secondhand inheritance is sufficient. It is possible — genuinely possible, not just as a debating point — that there is enough structure in text to guide learning dynamics toward the right representational organization, even with a minimal learning bias. The developmental path through sensorimotor stages, through conservation errors and number-word bootstrapping, through slow hippocampal-to-neocortical consolidation, may be one way to arrive at structured, compositional, causally coherent representations — but not the only way. Text might be a sufficiently rich signal that a powerful learner can extract the target structure directly, the way a student can sometimes learn physics from a textbook without performing the experiments.

On the other hand, the developmental tradition suggests reasons for caution. Drescher’s schema mechanism shows that even learning object permanence from scratch requires non-trivial architectural commitments — counterfactual reasoning, marginal attribution, synthetic items. The neo-Piagetian account shows that relational complexity is gated by working memory capacity, which matures on a biological schedule. Carey’s bootstrapping shows that some conceptual transitions require the creation of genuinely new representational primitives, not just recombination of existing ones. These are all claims about the process of learning, not just its endpoint — and they raise the possibility that skipping the process leaves the resulting representations subtly different in ways that matter under pressure. Whether this is so is an empirical question that AI systems are now, in effect, testing at scale.

10. Conclusion: The Ship Rebuilt at Sea

Quine’s metaphor of rebuilding a ship at sea — which Carey borrowed for her bootstrapping account — captures the central puzzle of cognitive development beautifully. You cannot take the ship out of the water to rebuild it. You cannot halt cognitive processing to install new concepts. Whatever changes occur must be made while the system is running, using the materials already on hand, without ever losing the ability to navigate.

What developmental psychology has revealed, across nearly a century of research from Piaget through Spelke and Carey, is that this reconstruction is both more constrained and more creative than anyone initially guessed. More constrained, because the starting materials include innate core knowledge systems with specific domains and limitations, and because biological maturation sets the pace of working memory growth. More creative, because bootstrapping mechanisms can generate genuinely novel representational resources — concepts that couldn’t even be formulated in the old vocabulary — and because cultural scaffolding provides external structure that individual learners couldn’t construct alone.

Drescher’s schema mechanism showed that a computationally minimal constructivism can rediscover basic Piagetian achievements from sensorimotor primitives, but it also showed where pure constructivism hits its ceiling. The neo-Piagetians showed that architectural constraints — especially working memory capacity — provide the framework within which constructive processes operate. The core knowledge theorists showed that the starting state is richer than empiricists assumed. And Carey showed that the developmental process involves episodes of genuine conceptual discontinuity — not just accumulation of more knowledge within a fixed framework, but reconstruction of the framework itself.

For cognitive architectures and AI, the implication is that the architecture and what runs on it cannot be understood independently. The cognitive architecture itself — working memory capacity, the serial bottleneck, the modular organization — is what biological maturation builds, not what learning or social construction produces. But the knowledge and skills that the architecture operates over are shaped by the developmental process: by the interaction of innate core knowledge, environmental input, social scaffolding, and multiple learning mechanisms over years. Building systems that replicate the adult repertoire without replicating the developmental process that filled the architecture may be possible — but the history of developmental psychology suggests we should at least understand what we’re skipping.

Allen Newell’s question — How can the human mind occur in the physical universe? — is also a developmental question. The human mind doesn’t just occur; it develops. Understanding how it develops is part of understanding how it’s possible at all.


Sources:

  • Piaget, J. (1954). The Construction of Reality in the Child. Basic Books.
  • Piaget, J. (1952). The Origins of Intelligence in Children. International Universities Press.
  • Carey, S. (2009). The Origin of Concepts. Oxford University Press.
  • Carey, S., Zaitchik, D. & Bascandziev, I. (2015). Theories of development: In dialogue with Jean Piaget. Developmental Review 38, 36–54.
  • Spelke, E.S. (2022). What Babies Know: Core Knowledge and Composition, Volume 1. Oxford University Press.
  • Spelke, E.S. & Kinzler, K.D. (2007). Core knowledge. Developmental Science 10(1), 89–96.
  • Drescher, G.L. (1991). Made-Up Minds: A Constructivist Approach to Artificial Intelligence. MIT Press.
  • Baillargeon, R. (1987). Object permanence in 3½- and 4½-month-old infants. Developmental Psychology 23(5), 655–664.
  • Stahl, A.E. & Feigenson, L. (2015). Observing the unexpected enhances infants’ learning and exploration. Science 348(6230), 91–94.
  • McClelland, J.L., McNaughton, B.L. & O’Reilly, R.C. (1995). Why there are complementary learning systems in the hippocampus and neocortex. Psychological Review 102(3), 419–457.
  • Pascual-Leone, J. (1970). A mathematical model for the transition rule in Piaget’s developmental stages. Acta Psychologica 32, 301–345.
  • Gopnik, A. & Wellman, H.M. (2012). Reconstructing constructivism: Causal models, Bayesian learning mechanisms, and the theory theory. Psychological Bulletin 138(6), 1085–1108.
  • Tomasello, M. (2019). Becoming Human: A Theory of Ontogeny. Harvard University Press.

This article was co-authored by Lukasz Stafiniak and Claude (Anthropic). The arguments, errors, and speculative leaps are jointly owned — developmental stage of responsibility: indeterminate.