A Shape Is Not Its Index: OCANNL’s Shapes Beside Star’s Algebraic Shapes

A Shape Is Not Its Index

Here is a padded matrix, the kind every convolution library has to lay out by hand:

1 2 3        -1 -1 -1 -1 -1
4 5 6   ↦     0  1  2  3  0
              0  4  5  6  0
             +1 +1 +1 +1 +1

Two systems describe this without index arithmetic, and they were built without knowledge of each other. Jakub Bachurski, Alan Mycroft, and Dominic Orchard’s Structuring Arrays with Algebraic Shapes (ARRAY ’25) types the padded array’s shape as a pair of concatenation shapes — rows split Top/Centre/Bottom, columns split Left/Centre/Right — and the index arithmetic becomes pattern matching over variant tags. OCANNL, across the four posts this series comprises, reads the same padded array as a coproduct in two axes — a block matrix — where each region is one summand of a disjoint union and the index arithmetic becomes injection offsets. A variant and a coproduct are the same thing: a sum type. Bachurski’s calculus, which they call Star, reaches it from the typing side; OCANNL reaches it from the loop-nest side. They meet in the middle, at the same structure, and the meeting is worth a post.

This is the fifth in the series, and the first to look outward. The four predecessors built OCANNL’s shape system from scratch — broadcasting as an order, contraction as an emergent property of an index map, convolution as an affine-indexed contraction, concatenation as a coproduct factor. Star arrives at much of the same algebra by a different route and with a different goal, and setting the two side by side is the fastest way to say what OCANNL is by saying what it is not. The thesis is in the title, and it is the one structural fact both systems turn on: a shape is not its index. They are distinct objects in a fixed correspondence, and once you grant that, you get to choose what to do with the correspondence. Star types it. OCANNL computes along it — and that single choice is the source of every divergence below.

I met Jakub at a couple of FUN-OCaml workshops and an OCaml Workshop; we have not worked together, so this is a reading from the outside, offered in the spirit of a shared program rather than a critique. The paper is excellent. The disagreements below are the productive kind — the kind where two designs that agree on the deep structure have made opposite bets at the surface, and the bets illuminate each other.

What Star is

A short, fair summary, in case you have not read it (you should). Star is a small functional array calculus — pointful, in the tradition of \tilde{F} and Single Assignment C — whose contribution is a type system for array shapes built from structural records and variants. Three ideas carry it.

First, algebraic shapes. A shape is not a tuple of integers but a tree of two constructors: a product shape {|ℓ : σ, …|}, indexed by a record of per-axis indices, models named multidimensional arrays; a concatenation shape ⟦T : σ, …⟧, indexed by a variant, models padding and concatenation. Records give a product on shapes; variants give a sum. The names are the point — a product shape is exactly a named-tensor axis set, and the paper is candid that this generalizes the named-tensor and xarray line of work.

Second, structural subtyping forming a distributive lattice. Star’s shapes and indices subtype — width and depth, records one way, variants the other — and the order is a distributive lattice with meets and joins. This is the algebraic-subtyping discipline of Dolan and Mycroft’s MLsub, with the polarity restriction lifted by Parreaux and Chau, and Star’s bet is that this discipline buys it principal ML-style type inference (conjectured, not yet proved). Crucially, broadcasting is the meet in this lattice: aligning two shapes finds their greatest lower bound, so a column vector and a row vector broadcast to the matrix that is their meet — the outer product.

Third, and most subtly, shapes and indices are distinct but isomorphic. Star insists you can no longer pretend an array’s shape and its index have the same type, the way you can when both are tuples of integers. A product shape is indexed by a record; a concatenation shape, built by a sequence of components, is indexed by one tagged variant. The relationship is carried by a metafunction \iota that maps each shape type \sigma to its index type \iota(\sigma), and \iota is an isomorphism in the type lattice. Arrays, in this reading, are containers in the sense of Abbott, Altenkirch and Ghani: a shape together with a position-type that depends on it, \Sigma(s : \mathrm{Shape}).\,\mathrm{Position}(s) \to \alpha. Star’s Position is exactly \iota.

The paper proves type safety in the simply-typed case — preservation and progress — and sketches the patterns it captures: tensors as products, padding as concatenations, broadcasting as the meet, axis indexing as record update and restriction. It leaves shape inference, stencils, recursive shapes, and the full polymorphic metatheory as future work. It is, in the best sense, a design paper: it stakes out a new region between untyped array programming and dependently typed array programming, and argues the region is habitable.

That third idea — a shape is not its index, but the two are isomorphic — is where Star and OCANNL are closest, and it is where the whole comparison should start.

Three convergences

Broadcasting is the meet

Both systems independently identify broadcasting with a greatest-lower-bound. Star defines its broadcast operator as the meet \sigma' \wedge \sigma'' in the shape lattice; the first post in this series defines broadcasting as an order and derives the rest of the machinery from getting that order right. Two systems, one sentence apart.

And the orientations agree — which is worth stating carefully, because the natural first guess is that they would be dual. Star’s product shapes subtype like records: more dimensions is a subtype, so the bigger shape sits lower, and the meet — the greatest lower bound — grows the shape by taking the union of dimensions. Column vector meets row vector and you get the matrix, sitting below both. OCANNL orients its order the same way: the claim-free unit 1_\emptyset and the empty row are at the top, larger and more specific shapes sit below (a concrete shape refines the claim-free unit it broadcasts from), and a pointwise operation constrains its result below its operands, R_{\text{result}} \sqsubseteq R_{\text{operand}}, so the broadcast result is the meet — the greatest lower bound — exactly as in Star. The same outer product, the same combination of axes, the same descent to the meet. What distinguishes the two orders is not direction but distributivity — Star’s is distributive, OCANNL’s is not — which the design-space paragraph below takes up. Nothing of substance hangs on the direction.

Concatenation is a variant is a coproduct

This is the strongest convergence, and the one I would lead with in any conversation with Jakub. Star types concatenation as a shape indexed by a variant: ⟦Front : σ, Centre : σ, Back : σ⟧ is indexed by a value tagged Front, Centre, or Back, and composing or decomposing a concatenation becomes tagging and pattern matching rather than range arithmetic. The fourth post reads concatenation as a coproduct of integer ranges — a disjoint union R_a \sqcup R_b with injection offsets \iota_a(i) = i, \iota_b(j) = A + j, and an at-most-one-summand cover check that is the coproduct’s universal property restated as a code-generation constraint.

A variant is a coproduct is a sum type. Star and OCANNL arrived at the same mathematical object from typing and from indexing respectively, which is about as much independent confirmation as a design choice can hope for. And the convergence is not superficial: Star’s padding example and OCANNL’s block-matrix example are the same construction. Star writes the padded matrix as a product of two concatenation shapes, {|row : ⟦Top, Centre, Bottom⟧, col : ⟦Left, Centre, Right⟧|}; the fourth post writes a block matrix as a coproduct in two axes, the active region picked out by which row-component and which column-component are live. The 3×3 grid of regions in Star’s Section 5.1 is the disjoint-cover tiling of the fourth post. Star types that the regions exist and have the right index type; OCANNL types it too and then emits the loop nest with the offsets and the no-overlap check. Same object, one layer deeper.

There is even a shared subtlety about the unit of the construction. The fourth post had to extend OCANNL’s size domain down to zero — the empty range, the unit of the coproduct, A \sqcup \emptyset \cong A — separate from the broadcast unit 1_\emptyset, the unit of the product. Star’s empty concatenation shape ⟦⟧ is an empty array, the same zero, and its empty product shape {||} is the single-element array, the same one. Two units for two operations, in both systems, because a sum and a product have different identities.

A shape is not its index

Now the title. Star’s \iota maps a shape type to its index type and is an isomorphism; it is the formal content of “a shape is not its index, but the two correspond exactly.” OCANNL makes the same separation, but operationally rather than in the types. The second post is built on it: a dimension carries a projection identifier distinct from its size, and an elaboration \llbracket\cdot\rrbracket maps a solved dimension to its projection term. Sizes are one reading of the constraint set; indices are another; and the second post’s load-bearing discipline — global sizes, local identities — exists precisely because a shape and its index must not be identified. Two axes forced to the same size by some reduction elsewhere in the program are the same size but not the same index, and fusing their loops because they happen to be equally long is a bug. Star says this in the type theory: shapes and indices are isomorphic, not equal. OCANNL says it in the solver: sizes are global, identities are local. It is the same fact, and \llbracket\cdot\rrbracket is OCANNL’s \iota — the canonical bridge from shape-structure to index-structure, realized as a program transformation instead of a type-level metafunction.

The payoff of stating it this way is that Star’s own future work turns out to describe what OCANNL has already built. Star’s Section 6.1 conjectures shape inference as a type-directed program transformation: given an array comprehension with the shape omitted, recover it as the meet of the operands’ shapes — the very |u| \sqcap |w| that the first post computes by from-usage inference. Star sketches it; OCANNL implements it, and adds the closing policy (leaves commit downward, interiors upward) that Star does not have because Star is not in the business of committing free variables to runtime-allocatable shapes. Star’s Section 6.4, stencils, is left open with the remark that a stencil index “degenerates to setting i : int” — which is exactly the gap the third post closes with the affine index Affine(Σ cᵢ·sᵢ + o). The convergence on the deep structure is what makes these correspondences legible: OCANNL is, in part, a realization and extension of the future-work section of a paper it had never read.

A Rosetta stone

The convergence is easiest to feel on the shared examples. Star’s worked patterns are OCANNL’s worked patterns, and writing them side by side — the shape as Star would type it, the operation as OCANNL would spec it — shows the agreement on structure and the divergence on what each system then does with it.

Pattern Star (shape type) OCANNL (einsum spec)
Matrix multiply [{|i:#, j:#|}][{|j:#, k:#|}] a +* b “m n; n p => m p”, or wordless a * b (compose)
Outer product meet {|row:#|} ∧ {|col:#|} u +* w “i; j => i, j”, shape inferred
Padding {|row:⟦Top,Centre,Bottom⟧, col:⟦Left,Centre,Right⟧|} x ++ “h, w => 1^h^1, 1^w^1” (zero-pad two axes)
GNN broadcast named axes, expand_dims avoided a +* b “i, f; j, f => i, j, f”, no expand_dims
Graph split/merge variant-indexed concatenation (a, b) ++^ “x; y => x^y”, injection offsets

Read across a row and the systems agree on the object. Read down the columns and the difference shows: Star’s column is a type, written by the programmer and checked; OCANNL’s column is a spec, from which sizes and a loop nest are inferred. Star types the padding; OCANNL types it and emits the offset arithmetic and the disjoint-cover check. The table is the whole comparison in miniature — same structure, two jobs.

Three divergences

The agreements are about structure. The disagreements are about goal, and they are sharper. Star optimizes for static safety guarantees expressible within ML-tradition type inference. OCANNL optimizes for shape inference and code generation. Where those goals pull apart, the systems pull apart — most visibly on numeric sizes and on boilerplate.

Sizes live in the solver, not in the types

Star’s sized shape # is the shape type of a flat axis; the actual size #n is a runtime value, bounds-checked at runtime, never in the type. This is deliberate and principled: the ML type tradition — Hindley–Milner, MLsub, algebraic subtyping — has no arithmetic at the type level, so a system that wants ML-style inference cannot, without leaving the tradition, write a type that says “this axis is stride · output long.” Star therefore declines to type sizes at all, and is candid about the cost: its Appendix A works through why genuine NumPy size-broadcasting — where a size-1 axis promotes to size n — is hard to type, mixing conjunctions and disjunctions of size constraints, and the paper sidesteps it by broadcasting on named-axis presence instead of on size-1 promotion. Where two NumPy arrays broadcast by one having a size-1 axis, Star broadcasts by one having a named axis the other lacks.

There is a principled way for Star to recover sizes without abandoning its philosophy, and it is worth naming because Star does not: refinement types. A liquid-types or size-dependent layer — the road taken by Futhark’s size-dependent types and by Colaço, Pauget and Pouzet’s polynomial sizes, both of which Star cites as the dependent extreme it is avoiding — would let an axis carry a refinement { n : int | n = stride · m } while keeping inference decidable through an SMT backend. That is the type-system-internal answer to the size question, and Star, staying squarely within type systems, could go there.

OCANNL takes a third road, and the contrast is the cleanest single statement of what OCANNL is. Arithmetic on sizes lives in OCANNL not in the types and not in refinements, but in inference — in the constraint solver. The convolution size relation input = stride · (output − 1) + span is solved, not typed; the element-count relation of a vectorized operation, output_elems = coeff · input_elems, is a row constraint the solver discharges; size-1 broadcasting — the case Star calls hard — is modeled head-on by making 1_\emptyset the claim-free top of a per-size order, so that an unbroadcastable pair of distinct sizes has no common refinement: their meet is the error bottom \bot, and the failure is reported as that meet. OCANNL types the conjunction-of-disjunctions nightmare of Appendix A by not typing it: it solves it. The three positions form a clean spectrum — sizes untyped and checked at runtime (Star), sizes refined and discharged by SMT (the road Star could take), sizes inferred and discharged by a bespoke lattice solver (OCANNL) — and a workshop paper that lays the spectrum out explicitly does both systems a service.

Boilerplate elimination is the goal, not a side effect

The first post opens with the claim that the machinery exists so that you do not write shapes — catching mismatches is a side benefit, not the point. This is the axis on which Star and OCANNL diverge most quietly and most completely, because Star does not measure itself by it at all. Star is a typing discipline: it asks the programmer to write more structure — named record fields, variant tags — in exchange for safety, and shape inference is explicitly future work. OCANNL is an inference discipline: it asks the programmer to write less, and treats every reshape, unsqueeze, and expand_dims the user does not have to write as the actual deliverable.

The divergence is concrete in Star’s own Graph Neural Network example. To add a (source, target) array to a (batch, source, target, feature) tensor, Star notes that NumPy forces an expand_dims to introduce a size-1 axis, and that naming the axes lets you avoid tracking its position. True — but Star still asks you to name the axes and write the comprehension. OCANNL writes the broadcast and infers the rest: an edge tensor from node features is a +* b "i, f; j, f => i, j, f", where i rides only on the left operand and j only on the right, so each broadcasts over the other’s node axis while the feature axis f co-iterates — the size-1 insertion is never written. Make the batch row a variable, ..g.. | i, f; ..g.. | j, f => ..g.. | i, j, f, and the same operator runs on one graph or a batch of them with no reshape, because each of a shape’s three rows carries its own .... (This is the gnn_broadcast test in the suite.) Row variables to avoid reshaping is not a feature Star is missing; it is a goal Star does not have — the two systems optimize different objective functions, and a fair comparison says which.

The design-space paragraph: why non-distributive, and why no schemes

One paragraph locates OCANNL against Star precisely, and it is the paragraph I would put in the related-work section. OCANNL’s dimension order is a bounded but non-distributive lattice: the claim-free unit 1_\emptyset is the top, the concrete sizes form an antichain of atoms in the middle, and an error bottom \bot completes it. Two distinct concrete sizes join to the top (the no-claim unit — a deliberate broadcast, not an error) and meet at the bottom (the broadcast error). With three or more size atoms strung between the one top and the one bottom, the order contains the diamond M_3 and is therefore not distributive. Star, by contrast, uses a distributive lattice, because distributivity is the precondition for MLsub-style principal inference. This is the crux, and it is not an oversight on either side: OCANNL cannot borrow that inference machinery, and the reason is the very thing that makes its order track sizes. The atoms are concrete sizes; an antichain of concrete sizes between shared bounds is non-distributive by construction. Star keeps distributivity precisely by not typing sizes — its # is one shape type for all sizes, so there is no antichain to begin with. One decision, sizes in the solver, read as a lattice fact: it buys size-tracking and costs distributivity, and the two are the same coin. And OCANNL does not need what distributivity would buy: MLsub uses it to generalize types into schemes at definition sites, and OCANNL gets polymorphism a different way — monomorphic existential variables, solved once, with rank-polymorphism living in the order and call-site polymorphism delegated to the host language, where each application of an operator (an ordinary OCaml function) re-emits the constraints with fresh variables. A poor man’s type scheme, built by function application rather than by a generalization rule. Star generalizes; OCANNL re-evaluates. Both reach polymorphism; only one needs a distributive lattice to do it. And the re-evaluation could be promoted to genuine generalization without disturbing that verdict. The %op extension already holds the operator’s body, so it could extract the operator’s residual shape constraints, skip the later stages — closing above all — and quantify the leftover free variables into a scheme, instantiated with fresh variables per call: MLsub’s generalize-at-definition, done by hand. The reason this needs no distributive lattice is that generalization and principal inference are separable — distributivity buys the latter, but quantifying residual variables is sound without it; what OCANNL would forgo is any claim that the resulting scheme is principal, which is exactly the open question its staged solver already lives with. So the scheme route is available, and declined on cost rather than impossibility: not before v1, and not before shape-inference time shows signs of becoming prohibitive, since today re-emitting an operator’s constraints per call is cheap beside the compilation that follows.

What OCANNL should borrow

Generosity with a purpose: Star has exactly the apparatus OCANNL’s blog-shaped account lacks, and the workshop paper should take it.

Star has a core calculus — a grammar of expressions, an operational semantics, a type system, and type-safety theorems with proofs — compressed into a handful of figures. OCANNL’s four posts describe algorithms (constraint solving, union-find, the closing policy) beautifully, but there is no formal object-language whose programs you can write down and whose semantics you can state. Star is the model for what that object should look like and how compactly it can be presented.

Star states principal inference as its goal and ties it to algebraic subtyping, where it remains a conjecture. OCANNL’s situation is the same in form and more interesting in detail. The core single-stage solver should be straightforwardly principal — the rules are monotonic, so the least-committed satisfying substitution is reached and is unique up to renaming — and the interest is in what survives once the stages wrap it: the affine index’s solve-then-evaluate strata, the padding fixpoint, concatenation’s coupling, and above all closing. Closing is deliberately not principal: it commits the variables the principal substitution leaves free, in a direction the constraints do not force. Here is a conjecture that sharpens what that costs. If every variable were closed upward, the result would be the most general shapes the constraints allow — the biggest types, which in this order are the tensors smallest in size — each unforced axis sent to the no-claim top 1_\emptyset. Uniform-upward closing is thus the closing nearest the principal, still-free substitution; the actual policy departs from it only on leaves, which it closes downward to a concrete allocatable shape, trading that generality for a size a buffer can be cut to. So the honest result is layered — principal core, generality-preserving on interiors, deliberately non-principal on leaves, with core projection inference not merely principal but canonical (an equivalence has no direction to choose) and the multi-stage question left open. That layered story, stated as the theorems below make it, is sharper than a single conjecture, and one Star has no need to tell, because Star does not close, stage, or compile.

OCANNL’s genuine debt to type-system research is narrower than Star’s and worth stating exactly, since it is not the container tradition. What OCANNL took was the emphasis on inference and the idea of combining row variables with subtyping — indeed the first post opens by retiring the name “row polymorphism plus nominal subtyping,” a name that records precisely that lineage even as it corrects it. The container and Naperian-functor vocabulary is a further borrow available to the paper, not one OCANNL already leans on. Star grounds its arrays in containers and Naperian functors — recognized PL vocabulary for indexable structures — and OCANNL’s arrays are containers, a shape together with a position map that projection inference computes; saying so in those words would cost nothing and signal fluency to exactly the audience the paper is for. It is an adoption to make deliberately, not a description of the present design.

And Star’s worked examples are OCANNL’s worked examples: padding, the outer product, the GNN broadcast, the graph construction. The Rosetta-stone table above is the seed of it; in the paper that table wants a worked pair or two spelled out in full — the same operation in Star’s variant-typed surface and OCANNL’s einsum spec, with OCANNL’s inferred sizes and loop nest shown alongside Star’s checked type — so the divergence speaks for itself, since Star types the padding and OCANNL types it and compiles it.

A template for the workshop paper

What follows is a skeleton — the spine, a core-calculus sketch in the compact style Star models, and the two theorem statements the paper would have to discharge. It is brainstorm material, not prose to lift; the paper is yours to write by hand. But it pins the obligations.

The spine. One elaboration, two readings, one bridge. A program elaborates to a single constraint set \Phi over dimension and row terms. The shape reading solves \Phi as an order, in the broadcast lattice (claim-free top 1_\emptyset, error bottom \bot), producing sizes. The index reading re-solves \Phi as an equivalence-and-coupling, producing a loop nest. The bridge is the elaboration \llbracket\cdot\rrbracket, which is OCANNL’s \iota: the canonical map from solved shape-structure to index-structure. Star types \iota; OCANNL computes along it. Everything in the four posts is an instance of this architecture, and the paper’s job is to make the architecture a theorem.

The terms (reusing the series’ notation, so the paper dovetails with the posts):

\begin{aligned} \text{dim term} \quad t &::= \alpha \;\mid\; n_b \;\mid\; \mathsf{Affine}\{\,\text{stride},\text{over},\text{conv},\text{offset}\,\} \;\mid\; \mathsf{Concat}[\,t,\ldots\,] \\ \text{row term} \quad R &::= \rho \;\mid\; l\cdot\diamond\cdot r \;\mid\; l\cdot\langle\rho\rangle\cdot r \\ \text{constraint} \quad \phi &::= t = t \;\mid\; t \sqsubseteq t \;\mid\; R = R \;\mid\; R \sqsubseteq R \\ \text{index form} \quad \pi &::= \mathsf{Fix}(c) \;\mid\; \mathsf{Iter}(s) \;\mid\; \mathsf{Affine}(\textstyle\sum_i c_i s_i + o) \;\mid\; \mathsf{Concat}(s,\ldots) \end{aligned}

with \alpha a dimension variable, n_b a size n carrying basis b (and 1_\emptyset the claim-free top), \rho a row variable, \diamond the broadcast marker, and the flanks l,r sequences of dim terms.

The order is the broadcast order of the first post, with two units. Per dimension, d \sqsubseteq 1_\emptyset for every d, \mathrm{join}(d, d') = 1_\emptyset for distinct atoms d \neq d', and \mathrm{meet}(d, d') = \bot (the broadcast error) for distinct atoms — so the dimension order is a bounded lattice with top 1_\emptyset (the product unit), an antichain of size atoms in the middle, and an error bottom \bot; it is non-distributive (M_3) whenever three or more sizes meet. Concatenation adds a second unit, 0 (the coproduct unit, the empty summand), which is neither the top nor the error bottom but off the broadcast order entirely — it is the bottom of the additive order the coproduct induces — admitted only for discardable_vars. Rows lift the dimension order structurally through the marker; rank-broadening inserts 1_\emptyset at the marker; rank-polymorphism is the order relating rows of different ranks.

The elaboration \llbracket\cdot\rrbracket = \iota. Per-operation, after sizes are solved: freshen every projection identifier (locality), re-derive the operation’s constraints, and read each as a projection equation. Schematically, \llbracket d_1 = d_2 \rrbracket = \mathsf{Eq}(\llbracket d_1\rrbracket, \llbracket d_2\rrbracket) unions two axes into one loop (co-iteration); a broadcast inequality with a size-one operand axis, \llbracket d_{\text{result}} \sqsubseteq (d_{\text{operand}}{=}1)\rrbracket = \mathsf{Eq}(\llbracket d_{\text{operand}}\rrbracket, \mathsf{Fix}(0)), severs and pins; a Concat becomes a connected component traversed sequentially. Equality unions, broadcasting severs, concatenation couples. A reduction is emergent: a product axis the output index map omits.

Theorem 1 (Core principality). For a constraint set \Phi, the core single-stage solver computes a substitution \sigma_\star with \sigma_\star \models \Phi such that for every model \sigma \models \Phi there is a u with \sigma = u \circ \sigma_\star; the free variables of \sigma_\star are exactly those \Phi does not force. Conjectured by Star for its setting; here it should follow from monotonicity — variables solved at most once and substituted out, bounds only tightening — together with the join-collapse of incompatible caps to the top 1_\emptyset. This is the clean half.

Remark (Closing is a policy, not a principal step). The principal \sigma_\star leaves leaves and interiors free; runtime allocation needs them determinate, so a closing phase commits each — a leaf downward to the most specific shape its bounds permit, an interior upward to the unit (the claim-free top). The direction is justified (a leaf has no more-permissive source to be pressured from; an unforced interior should make no claim) and the result is sound and total, but it is not forced by \Phi, so closing is deliberately non-principal. The paper’s obligation is to state closing as a separate, characterized layer rather than fold it into Theorem 1 — and to be precise that the two-phase interleave (solve, close leaves, solve again, close interiors) preserves soundness at each step.

Theorem 2 (Core projection canonicity). Core projection inference is forced and unique up to renaming of the fresh iterator symbols: it succeeds or reports a conflict, and never resolves an ambiguity. Stronger than principality — there is no closing direction to choose, because an equivalence has no up or down — and the asymmetry with shape inference (which must choose at closing) is itself a result worth stating.

Open problem (Multi-stage). Characterize what Theorems 1 and 2 become once the stages enter: the affine index’s solve-then-evaluate stratification, the padding fixpoint frozen at finalization, and concatenation’s connected-component coupling. The expected shape of the answer: determinacy and the no-guessing property survive, but flatness does not — a check between two derived (affine or coupled) terms can fire only after the stratum that evaluates its operands, so a conflict can surface later than where the constraint was written. Stating exactly which guarantees are preserved across the seams, and where principality could break, is the paper’s central technical contribution and the place it most clearly goes past Star, which has no stages to stratify.

Theorem 3 (Soundness of projection inference w.r.t. shapes). Let the shapes of an operation be solved and closed. Then the inferred index maps address only within those shapes: every \pi_T(p) for p in the product space is a valid multi-index into T at its solved shape; co-iteration is finer than size-equality; and the coproduct cover is disjoint (at most one summand active per position). This promotes the first post’s informal argument — indices derived from the same facts as the sizes cannot contradict the sizes — to a theorem. It is the compilation-side analogue of Star’s preservation and progress: where Star shows a well-typed term does not get stuck on a structural index, OCANNL shows a solved operation does not generate an out-of-bounds loop nest. The locality discipline (global sizes, local identities) is a premise here, not a convenience — it is what makes “co-iteration is finer than size-equality” true rather than merely hoped for.

The positioning is the design-space paragraph above: non-distributive lattice versus distributive lattice, sizes-in-the-solver versus sizes-untyped versus sizes-refined, polymorphism-by-re-evaluation versus polymorphism-by-generalization. And the limitations are honest: OCANNL’s rows are flat sequences, not Star’s nested algebraic shapes, which buys positional einsum matching and clean strides at the cost of the pytree-style nesting that Star’s recursive-types future work points toward; and OCANNL has not taken the refinement-types road, so its size arithmetic is a bespoke solver, not a reusable type-level discipline.

Where this leaves the series

Four posts found a familiar operation hiding inside a humbler one and reused the core rather than extending it. This post found a familiar system sitting beside OCANNL and reused the comparison rather than the code. Star and OCANNL are two points on one map. They agree on the deep structure — broadcasting is the meet, concatenation is a variant is a coproduct, a shape is not its index — because that structure is not anyone’s invention but the shape of the problem. They diverge on the bet: Star types the correspondence between shape and index and stops there, winning static safety inside the ML tradition at the cost of leaving sizes to runtime and boilerplate to the programmer; OCANNL computes along the correspondence, winning shape inference and a loop nest at the cost of a bespoke solver and a metatheory still to be written. The first bet is a type system. The second is the thing you reach for when the type system’s job is only half the problem — when, having decided a shape is not its index, you want a machine that fills in both.

The workshop paper is where the second bet gets its theorems. Jakub’s paper is where to send the reader who wants to see the first bet made well.

OCANNL is open source, at github.com/ahrefs/ocannl.