A Shape Is Not Its Index: OCANNL’s Shapes Beside Star’s Algebraic Shapes
A Shape Is Not Its Index
Here is a padded matrix, the kind every convolution library has to lay out by hand:
1 2 3 -1 -1 -1 -1 -1
4 5 6 ↦ 0 1 2 3 0
0 4 5 6 0
+1 +1 +1 +1 +1
Two systems describe this without index arithmetic, and they were
built without knowledge of each other. Jakub Bachurski, Alan Mycroft,
and Dominic Orchard’s Structuring Arrays with Algebraic Shapes
(ARRAY ’25) types the padded array’s shape as a pair of
concatenation shapes — rows split
Top/Centre/Bottom, columns split
Left/Centre/Right — and the index
arithmetic becomes pattern matching over variant tags. OCANNL, across the
four posts this
series
comprises,
reads the same padded array as a coproduct in two axes
— a block matrix — where each region is one summand of a disjoint union
and the index arithmetic becomes injection offsets. A variant and a
coproduct are the same thing: a sum type. Bachurski’s calculus, which
they call Star, reaches it from the typing side; OCANNL reaches it from
the loop-nest side. They meet in the middle, at the same structure, and
the meeting is worth a post.
This is the fifth in the series, and the first to look outward. The four predecessors built OCANNL’s shape system from scratch — broadcasting as an order, contraction as an emergent property of an index map, convolution as an affine-indexed contraction, concatenation as a coproduct factor. Star arrives at much of the same algebra by a different route and with a different goal, and setting the two side by side is the fastest way to say what OCANNL is by saying what it is not. The thesis is in the title, and it is the one structural fact both systems turn on: a shape is not its index. They are distinct objects in a fixed correspondence, and once you grant that, you get to choose what to do with the correspondence. Star types it. OCANNL computes along it — and that single choice is the source of every divergence below.
I met Jakub at a couple of FUN-OCaml workshops and an OCaml Workshop; we have not worked together, so this is a reading from the outside, offered in the spirit of a shared program rather than a critique. The paper is excellent. The disagreements below are the productive kind — the kind where two designs that agree on the deep structure have made opposite bets at the surface, and the bets illuminate each other.
What Star is
A short, fair summary, in case you have not read it (you should). Star is a small functional array calculus — pointful, in the tradition of \tilde{F} and Single Assignment C — whose contribution is a type system for array shapes built from structural records and variants. Three ideas carry it.
First, algebraic shapes. A shape is not a tuple of
integers but a tree of two constructors: a product shape
{|ℓ : σ, …|}, indexed by a record of per-axis indices,
models named multidimensional arrays; a concatenation shape
⟦T : σ, …⟧, indexed by a variant, models padding
and concatenation. Records give a product on shapes; variants give a
sum. The names are the point — a product shape is exactly a named-tensor
axis set, and the paper is candid that this generalizes the named-tensor
and xarray line of work.
Second, structural subtyping forming a distributive lattice. Star’s shapes and indices subtype — width and depth, records one way, variants the other — and the order is a distributive lattice with meets and joins. This is the algebraic-subtyping discipline of Dolan and Mycroft’s MLsub, with the polarity restriction lifted by Parreaux and Chau, and Star’s bet is that this discipline buys it principal ML-style type inference (conjectured, not yet proved). Crucially, broadcasting is the meet in this lattice: aligning two shapes finds their greatest lower bound, so a column vector and a row vector broadcast to the matrix that is their meet — the outer product.
Third, and most subtly, shapes and indices are distinct but isomorphic. Star insists you can no longer pretend an array’s shape and its index have the same type, the way you can when both are tuples of integers. A product shape is indexed by a record; a concatenation shape, built by a sequence of components, is indexed by one tagged variant. The relationship is carried by a metafunction \iota that maps each shape type \sigma to its index type \iota(\sigma), and \iota is an isomorphism in the type lattice. Arrays, in this reading, are containers in the sense of Abbott, Altenkirch and Ghani: a shape together with a position-type that depends on it, \Sigma(s : \mathrm{Shape}).\,\mathrm{Position}(s) \to \alpha. Star’s Position is exactly \iota.
The paper proves type safety in the simply-typed case — preservation and progress — and sketches the patterns it captures: tensors as products, padding as concatenations, broadcasting as the meet, axis indexing as record update and restriction. It leaves shape inference, stencils, recursive shapes, and the full polymorphic metatheory as future work. It is, in the best sense, a design paper: it stakes out a new region between untyped array programming and dependently typed array programming, and argues the region is habitable.
That third idea — a shape is not its index, but the two are isomorphic — is where Star and OCANNL are closest, and it is where the whole comparison should start.
Three convergences
Broadcasting is the meet
Both systems independently identify broadcasting with a greatest-lower-bound. Star defines its broadcast operator as the meet \sigma' \wedge \sigma'' in the shape lattice; the first post in this series defines broadcasting as an order and derives the rest of the machinery from getting that order right. Two systems, one sentence apart.
And the orientations agree — which is worth stating carefully, because the natural first guess is that they would be dual. Star’s product shapes subtype like records: more dimensions is a subtype, so the bigger shape sits lower, and the meet — the greatest lower bound — grows the shape by taking the union of dimensions. Column vector meets row vector and you get the matrix, sitting below both. OCANNL orients its order the same way: the claim-free unit 1_\emptyset and the empty row are at the top, larger and more specific shapes sit below (a concrete shape refines the claim-free unit it broadcasts from), and a pointwise operation constrains its result below its operands, R_{\text{result}} \sqsubseteq R_{\text{operand}}, so the broadcast result is the meet — the greatest lower bound — exactly as in Star. The same outer product, the same combination of axes, the same descent to the meet. What distinguishes the two orders is not direction but distributivity — Star’s is distributive, OCANNL’s is not — which the design-space paragraph below takes up. Nothing of substance hangs on the direction.
Concatenation is a variant is a coproduct
This is the strongest convergence, and the one I would lead with in
any conversation with Jakub. Star types concatenation as a shape indexed
by a variant: ⟦Front : σ, Centre : σ, Back : σ⟧ is
indexed by a value tagged Front, Centre, or
Back, and composing or decomposing a concatenation becomes
tagging and pattern matching rather than range arithmetic. The fourth
post reads concatenation as a coproduct of integer ranges —
a disjoint union R_a \sqcup R_b with
injection offsets \iota_a(i) = i, \iota_b(j) = A + j, and an
at-most-one-summand cover check that is the coproduct’s universal
property restated as a code-generation constraint.
A variant is a coproduct is a sum type. Star and OCANNL arrived at
the same mathematical object from typing and from indexing
respectively, which is about as much independent confirmation as a
design choice can hope for. And the convergence is not superficial:
Star’s padding example and OCANNL’s block-matrix example are the
same construction. Star writes the padded matrix as a product
of two concatenation shapes,
{|row : ⟦Top, Centre, Bottom⟧, col : ⟦Left, Centre, Right⟧|};
the fourth post writes a block matrix as a coproduct in two axes, the
active region picked out by which row-component and which
column-component are live. The 3×3 grid of regions in Star’s Section 5.1
is the disjoint-cover tiling of the fourth post. Star types
that the regions exist and have the right index type; OCANNL types it
too and then emits the loop nest with the offsets and the no-overlap
check. Same object, one layer deeper.
There is even a shared subtlety about the unit of the
construction. The fourth post had to extend OCANNL’s size domain down to
zero — the empty range, the unit of the coproduct, A \sqcup \emptyset \cong A — separate from
the broadcast unit 1_\emptyset, the
unit of the product. Star’s empty concatenation shape ⟦⟧ is
an empty array, the same zero, and its empty product shape
{||} is the single-element array, the same one. Two units
for two operations, in both systems, because a sum and a product have
different identities.
A shape is not its index
Now the title. Star’s \iota maps a shape type to its index type and is an isomorphism; it is the formal content of “a shape is not its index, but the two correspond exactly.” OCANNL makes the same separation, but operationally rather than in the types. The second post is built on it: a dimension carries a projection identifier distinct from its size, and an elaboration \llbracket\cdot\rrbracket maps a solved dimension to its projection term. Sizes are one reading of the constraint set; indices are another; and the second post’s load-bearing discipline — global sizes, local identities — exists precisely because a shape and its index must not be identified. Two axes forced to the same size by some reduction elsewhere in the program are the same size but not the same index, and fusing their loops because they happen to be equally long is a bug. Star says this in the type theory: shapes and indices are isomorphic, not equal. OCANNL says it in the solver: sizes are global, identities are local. It is the same fact, and \llbracket\cdot\rrbracket is OCANNL’s \iota — the canonical bridge from shape-structure to index-structure, realized as a program transformation instead of a type-level metafunction.
The payoff of stating it this way is that Star’s own future work
turns out to describe what OCANNL has already built. Star’s Section 6.1
conjectures shape inference as a type-directed program
transformation: given an array comprehension with the shape
omitted, recover it as the meet of the operands’ shapes — the very |u| \sqcap |w| that the first post computes
by from-usage inference. Star sketches it; OCANNL implements it, and
adds the closing policy (leaves commit downward, interiors upward) that
Star does not have because Star is not in the business of
committing free variables to runtime-allocatable shapes. Star’s
Section 6.4, stencils, is left open with the remark that a stencil index
“degenerates to setting i : int” —
which is exactly the gap the third
post closes with the affine index Affine(Σ cᵢ·sᵢ + o).
The convergence on the deep structure is what makes these
correspondences legible: OCANNL is, in part, a realization and extension
of the future-work section of a paper it had never read.
A Rosetta stone
The convergence is easiest to feel on the shared examples. Star’s worked patterns are OCANNL’s worked patterns, and writing them side by side — the shape as Star would type it, the operation as OCANNL would spec it — shows the agreement on structure and the divergence on what each system then does with it.
| Pattern | Star (shape type) | OCANNL (einsum spec) |
|---|---|---|
| Matrix multiply |
[{|i:#, j:#|}] ⊗ [{|j:#, k:#|}]
|
a +* b “m n; n p => m p”, or wordless a * b
(compose)
|
| Outer product |
meet {|row:#|} ∧ {|col:#|}
|
u +* w “i; j => i, j”, shape inferred
|
| Padding |
{|row:⟦Top,Centre,Bottom⟧, col:⟦Left,Centre,Right⟧|}
|
x ++ “h, w => 1^h^1, 1^w^1” (zero-pad two axes)
|
| GNN broadcast |
named axes, expand_dims avoided
|
a +* b “i, f; j, f => i, j, f”, no
expand_dims
|
| Graph split/merge | variant-indexed concatenation |
(a, b) ++^ “x; y => x^y”, injection offsets
|
Read across a row and the systems agree on the object. Read down the columns and the difference shows: Star’s column is a type, written by the programmer and checked; OCANNL’s column is a spec, from which sizes and a loop nest are inferred. Star types the padding; OCANNL types it and emits the offset arithmetic and the disjoint-cover check. The table is the whole comparison in miniature — same structure, two jobs.
Three divergences
The agreements are about structure. The disagreements are about goal, and they are sharper. Star optimizes for static safety guarantees expressible within ML-tradition type inference. OCANNL optimizes for shape inference and code generation. Where those goals pull apart, the systems pull apart — most visibly on numeric sizes and on boilerplate.
Sizes live in the solver, not in the types
Star’s sized shape # is the shape type of a
flat axis; the actual size #n is a runtime value,
bounds-checked at runtime, never in the type. This is deliberate and
principled: the ML type tradition — Hindley–Milner, MLsub, algebraic
subtyping — has no arithmetic at the type level, so a system that wants
ML-style inference cannot, without leaving the tradition, write a type
that says “this axis is stride · output long.” Star
therefore declines to type sizes at all, and is candid about the cost:
its Appendix A works through why genuine NumPy size-broadcasting — where
a size-1 axis promotes to size n — is
hard to type, mixing conjunctions and disjunctions of size constraints,
and the paper sidesteps it by broadcasting on named-axis
presence instead of on size-1 promotion. Where two NumPy arrays
broadcast by one having a size-1 axis, Star broadcasts by one having a
named axis the other lacks.
There is a principled way for Star to recover sizes without
abandoning its philosophy, and it is worth naming because Star does not:
refinement types. A liquid-types or size-dependent
layer — the road taken by Futhark’s size-dependent types and by Colaço,
Pauget and Pouzet’s polynomial sizes, both of which Star cites as the
dependent extreme it is avoiding — would let an axis carry a refinement
{ n : int | n = stride · m } while keeping inference
decidable through an SMT backend. That is the type-system-internal
answer to the size question, and Star, staying squarely within type
systems, could go there.
OCANNL takes a third road, and the contrast is the cleanest single
statement of what OCANNL is. Arithmetic on sizes lives in OCANNL not in
the types and not in refinements, but in
inference — in the constraint solver. The convolution size
relation input = stride · (output − 1) + span is solved,
not typed; the element-count relation of a vectorized operation,
output_elems = coeff · input_elems, is a row constraint the
solver discharges; size-1 broadcasting — the case Star calls hard — is
modeled head-on by making 1_\emptyset
the claim-free top of a per-size order, so that an
unbroadcastable pair of distinct sizes has no common refinement: their
meet is the error bottom \bot, and the
failure is reported as that meet. OCANNL types the
conjunction-of-disjunctions nightmare of Appendix A by not typing it: it
solves it. The three positions form a clean spectrum — sizes untyped and
checked at runtime (Star), sizes refined and discharged by SMT (the road
Star could take), sizes inferred and discharged by a bespoke lattice
solver (OCANNL) — and a workshop paper that lays the spectrum out
explicitly does both systems a service.
Boilerplate elimination is the goal, not a side effect
The first post opens with the claim that the machinery exists so
that you do not write shapes — catching mismatches is a side
benefit, not the point. This is the axis on which Star and OCANNL
diverge most quietly and most completely, because Star does not measure
itself by it at all. Star is a typing discipline: it asks the
programmer to write more structure — named record fields,
variant tags — in exchange for safety, and shape inference is explicitly
future work. OCANNL is an inference discipline: it asks the
programmer to write less, and treats every reshape, unsqueeze,
and expand_dims the user does not have to write as the
actual deliverable.
The divergence is concrete in Star’s own Graph Neural Network
example. To add a (source, target) array to a
(batch, source, target, feature) tensor, Star notes that
NumPy forces an expand_dims to introduce a size-1 axis, and
that naming the axes lets you avoid tracking its position. True — but
Star still asks you to name the axes and write the
comprehension. OCANNL writes the broadcast and infers the rest: an edge
tensor from node features is
a +* b "i, f; j, f => i, j, f", where i
rides only on the left operand and j only on the right, so
each broadcasts over the other’s node axis while the feature axis
f co-iterates — the size-1 insertion is never written. Make
the batch row a variable,
..g.. | i, f; ..g.. | j, f => ..g.. | i, j, f, and the
same operator runs on one graph or a batch of them with no reshape,
because each of a shape’s three rows carries its own ....
(This is the gnn_broadcast test in the suite.) Row
variables to avoid reshaping is not a feature Star is missing; it is a
goal Star does not have — the two systems optimize different objective
functions, and a fair comparison says which.
The design-space paragraph: why non-distributive, and why no schemes
One paragraph locates OCANNL against Star precisely, and it is the
paragraph I would put in the related-work section. OCANNL’s dimension
order is a bounded but non-distributive lattice: the claim-free
unit 1_\emptyset is the top, the
concrete sizes form an antichain of atoms in the middle, and an error
bottom \bot completes it. Two distinct
concrete sizes join to the top (the no-claim unit — a
deliberate broadcast, not an error) and meet at the bottom (the
broadcast error). With three or more size atoms strung between the one
top and the one bottom, the order contains the diamond M_3 and is therefore not
distributive. Star, by contrast, uses a distributive
lattice, because distributivity is the precondition for MLsub-style
principal inference. This is the crux, and it is not an oversight on
either side: OCANNL cannot borrow that inference machinery, and the
reason is the very thing that makes its order track sizes. The
atoms are concrete sizes; an antichain of concrete sizes between shared
bounds is non-distributive by construction. Star keeps distributivity
precisely by not typing sizes — its # is one shape
type for all sizes, so there is no antichain to begin with. One
decision, sizes in the solver, read as a lattice fact: it buys
size-tracking and costs distributivity, and the two are the same coin.
And OCANNL does not need what distributivity would buy: MLsub uses it to
generalize types into schemes at definition sites, and OCANNL gets
polymorphism a different way — monomorphic existential variables, solved
once, with rank-polymorphism living in the order and call-site
polymorphism delegated to the host language, where each application of
an operator (an ordinary OCaml function) re-emits the constraints with
fresh variables. A poor man’s type scheme, built by function application
rather than by a generalization rule. Star generalizes; OCANNL
re-evaluates. Both reach polymorphism; only one needs a distributive
lattice to do it. And the re-evaluation could be promoted to genuine
generalization without disturbing that verdict. The %op
extension already holds the operator’s body, so it could extract the
operator’s residual shape constraints, skip the later stages —
closing above all — and quantify the leftover free variables into a
scheme, instantiated with fresh variables per call: MLsub’s
generalize-at-definition, done by hand. The reason this needs no
distributive lattice is that generalization and principal
inference are separable — distributivity buys the latter, but
quantifying residual variables is sound without it; what OCANNL would
forgo is any claim that the resulting scheme is principal,
which is exactly the open question its staged solver already lives with.
So the scheme route is available, and declined on cost rather than
impossibility: not before v1, and not before shape-inference time shows
signs of becoming prohibitive, since today re-emitting an operator’s
constraints per call is cheap beside the compilation that follows.
What OCANNL should borrow
Generosity with a purpose: Star has exactly the apparatus OCANNL’s blog-shaped account lacks, and the workshop paper should take it.
Star has a core calculus — a grammar of expressions, an operational semantics, a type system, and type-safety theorems with proofs — compressed into a handful of figures. OCANNL’s four posts describe algorithms (constraint solving, union-find, the closing policy) beautifully, but there is no formal object-language whose programs you can write down and whose semantics you can state. Star is the model for what that object should look like and how compactly it can be presented.
Star states principal inference as its goal and ties it to algebraic subtyping, where it remains a conjecture. OCANNL’s situation is the same in form and more interesting in detail. The core single-stage solver should be straightforwardly principal — the rules are monotonic, so the least-committed satisfying substitution is reached and is unique up to renaming — and the interest is in what survives once the stages wrap it: the affine index’s solve-then-evaluate strata, the padding fixpoint, concatenation’s coupling, and above all closing. Closing is deliberately not principal: it commits the variables the principal substitution leaves free, in a direction the constraints do not force. Here is a conjecture that sharpens what that costs. If every variable were closed upward, the result would be the most general shapes the constraints allow — the biggest types, which in this order are the tensors smallest in size — each unforced axis sent to the no-claim top 1_\emptyset. Uniform-upward closing is thus the closing nearest the principal, still-free substitution; the actual policy departs from it only on leaves, which it closes downward to a concrete allocatable shape, trading that generality for a size a buffer can be cut to. So the honest result is layered — principal core, generality-preserving on interiors, deliberately non-principal on leaves, with core projection inference not merely principal but canonical (an equivalence has no direction to choose) and the multi-stage question left open. That layered story, stated as the theorems below make it, is sharper than a single conjecture, and one Star has no need to tell, because Star does not close, stage, or compile.
OCANNL’s genuine debt to type-system research is narrower than Star’s and worth stating exactly, since it is not the container tradition. What OCANNL took was the emphasis on inference and the idea of combining row variables with subtyping — indeed the first post opens by retiring the name “row polymorphism plus nominal subtyping,” a name that records precisely that lineage even as it corrects it. The container and Naperian-functor vocabulary is a further borrow available to the paper, not one OCANNL already leans on. Star grounds its arrays in containers and Naperian functors — recognized PL vocabulary for indexable structures — and OCANNL’s arrays are containers, a shape together with a position map that projection inference computes; saying so in those words would cost nothing and signal fluency to exactly the audience the paper is for. It is an adoption to make deliberately, not a description of the present design.
And Star’s worked examples are OCANNL’s worked examples: padding, the outer product, the GNN broadcast, the graph construction. The Rosetta-stone table above is the seed of it; in the paper that table wants a worked pair or two spelled out in full — the same operation in Star’s variant-typed surface and OCANNL’s einsum spec, with OCANNL’s inferred sizes and loop nest shown alongside Star’s checked type — so the divergence speaks for itself, since Star types the padding and OCANNL types it and compiles it.
A template for the workshop paper
What follows is a skeleton — the spine, a core-calculus sketch in the compact style Star models, and the two theorem statements the paper would have to discharge. It is brainstorm material, not prose to lift; the paper is yours to write by hand. But it pins the obligations.
The spine. One elaboration, two readings, one bridge. A program elaborates to a single constraint set \Phi over dimension and row terms. The shape reading solves \Phi as an order, in the broadcast lattice (claim-free top 1_\emptyset, error bottom \bot), producing sizes. The index reading re-solves \Phi as an equivalence-and-coupling, producing a loop nest. The bridge is the elaboration \llbracket\cdot\rrbracket, which is OCANNL’s \iota: the canonical map from solved shape-structure to index-structure. Star types \iota; OCANNL computes along it. Everything in the four posts is an instance of this architecture, and the paper’s job is to make the architecture a theorem.
The terms (reusing the series’ notation, so the paper dovetails with the posts):
\begin{aligned} \text{dim term} \quad t &::= \alpha \;\mid\; n_b \;\mid\; \mathsf{Affine}\{\,\text{stride},\text{over},\text{conv},\text{offset}\,\} \;\mid\; \mathsf{Concat}[\,t,\ldots\,] \\ \text{row term} \quad R &::= \rho \;\mid\; l\cdot\diamond\cdot r \;\mid\; l\cdot\langle\rho\rangle\cdot r \\ \text{constraint} \quad \phi &::= t = t \;\mid\; t \sqsubseteq t \;\mid\; R = R \;\mid\; R \sqsubseteq R \\ \text{index form} \quad \pi &::= \mathsf{Fix}(c) \;\mid\; \mathsf{Iter}(s) \;\mid\; \mathsf{Affine}(\textstyle\sum_i c_i s_i + o) \;\mid\; \mathsf{Concat}(s,\ldots) \end{aligned}
with \alpha a dimension variable, n_b a size n carrying basis b (and 1_\emptyset the claim-free top), \rho a row variable, \diamond the broadcast marker, and the flanks l,r sequences of dim terms.
The order is the broadcast order of the first post,
with two units. Per dimension, d \sqsubseteq
1_\emptyset for every d, \mathrm{join}(d, d') = 1_\emptyset for
distinct atoms d \neq d', and \mathrm{meet}(d, d') = \bot (the
broadcast error) for distinct atoms — so the dimension order is a
bounded lattice with top 1_\emptyset
(the product unit), an antichain of size atoms in the middle, and an
error bottom \bot; it is
non-distributive (M_3) whenever three
or more sizes meet. Concatenation adds a second unit, 0 (the coproduct unit, the empty summand),
which is neither the top nor the error bottom but off the broadcast
order entirely — it is the bottom of the additive order the
coproduct induces — admitted only for discardable_vars.
Rows lift the dimension order structurally through the marker;
rank-broadening inserts 1_\emptyset at
the marker; rank-polymorphism is the order relating rows of different
ranks.
The elaboration \llbracket\cdot\rrbracket = \iota.
Per-operation, after sizes are solved: freshen every projection
identifier (locality), re-derive the operation’s constraints, and read
each as a projection equation. Schematically, \llbracket d_1 = d_2 \rrbracket =
\mathsf{Eq}(\llbracket d_1\rrbracket, \llbracket d_2\rrbracket)
unions two axes into one loop (co-iteration); a broadcast inequality
with a size-one operand axis, \llbracket
d_{\text{result}} \sqsubseteq (d_{\text{operand}}{=}1)\rrbracket =
\mathsf{Eq}(\llbracket d_{\text{operand}}\rrbracket,
\mathsf{Fix}(0)), severs and pins; a Concat becomes
a connected component traversed sequentially. Equality unions,
broadcasting severs, concatenation couples. A reduction is
emergent: a product axis the output index map omits.
Theorem 1 (Core principality). For a constraint set \Phi, the core single-stage solver computes a substitution \sigma_\star with \sigma_\star \models \Phi such that for every model \sigma \models \Phi there is a u with \sigma = u \circ \sigma_\star; the free variables of \sigma_\star are exactly those \Phi does not force. Conjectured by Star for its setting; here it should follow from monotonicity — variables solved at most once and substituted out, bounds only tightening — together with the join-collapse of incompatible caps to the top 1_\emptyset. This is the clean half.
Remark (Closing is a policy, not a principal step). The principal \sigma_\star leaves leaves and interiors free; runtime allocation needs them determinate, so a closing phase commits each — a leaf downward to the most specific shape its bounds permit, an interior upward to the unit (the claim-free top). The direction is justified (a leaf has no more-permissive source to be pressured from; an unforced interior should make no claim) and the result is sound and total, but it is not forced by \Phi, so closing is deliberately non-principal. The paper’s obligation is to state closing as a separate, characterized layer rather than fold it into Theorem 1 — and to be precise that the two-phase interleave (solve, close leaves, solve again, close interiors) preserves soundness at each step.
Theorem 2 (Core projection canonicity). Core projection inference is forced and unique up to renaming of the fresh iterator symbols: it succeeds or reports a conflict, and never resolves an ambiguity. Stronger than principality — there is no closing direction to choose, because an equivalence has no up or down — and the asymmetry with shape inference (which must choose at closing) is itself a result worth stating.
Open problem (Multi-stage). Characterize what Theorems 1 and 2 become once the stages enter: the affine index’s solve-then-evaluate stratification, the padding fixpoint frozen at finalization, and concatenation’s connected-component coupling. The expected shape of the answer: determinacy and the no-guessing property survive, but flatness does not — a check between two derived (affine or coupled) terms can fire only after the stratum that evaluates its operands, so a conflict can surface later than where the constraint was written. Stating exactly which guarantees are preserved across the seams, and where principality could break, is the paper’s central technical contribution and the place it most clearly goes past Star, which has no stages to stratify.
Theorem 3 (Soundness of projection inference w.r.t. shapes). Let the shapes of an operation be solved and closed. Then the inferred index maps address only within those shapes: every \pi_T(p) for p in the product space is a valid multi-index into T at its solved shape; co-iteration is finer than size-equality; and the coproduct cover is disjoint (at most one summand active per position). This promotes the first post’s informal argument — indices derived from the same facts as the sizes cannot contradict the sizes — to a theorem. It is the compilation-side analogue of Star’s preservation and progress: where Star shows a well-typed term does not get stuck on a structural index, OCANNL shows a solved operation does not generate an out-of-bounds loop nest. The locality discipline (global sizes, local identities) is a premise here, not a convenience — it is what makes “co-iteration is finer than size-equality” true rather than merely hoped for.
The positioning is the design-space paragraph above: non-distributive lattice versus distributive lattice, sizes-in-the-solver versus sizes-untyped versus sizes-refined, polymorphism-by-re-evaluation versus polymorphism-by-generalization. And the limitations are honest: OCANNL’s rows are flat sequences, not Star’s nested algebraic shapes, which buys positional einsum matching and clean strides at the cost of the pytree-style nesting that Star’s recursive-types future work points toward; and OCANNL has not taken the refinement-types road, so its size arithmetic is a bespoke solver, not a reusable type-level discipline.
Where this leaves the series
Four posts found a familiar operation hiding inside a humbler one and reused the core rather than extending it. This post found a familiar system sitting beside OCANNL and reused the comparison rather than the code. Star and OCANNL are two points on one map. They agree on the deep structure — broadcasting is the meet, concatenation is a variant is a coproduct, a shape is not its index — because that structure is not anyone’s invention but the shape of the problem. They diverge on the bet: Star types the correspondence between shape and index and stops there, winning static safety inside the ML tradition at the cost of leaving sizes to runtime and boilerplate to the programmer; OCANNL computes along the correspondence, winning shape inference and a loop nest at the cost of a bespoke solver and a metatheory still to be written. The first bet is a type system. The second is the thing you reach for when the type system’s job is only half the problem — when, having decided a shape is not its index, you want a machine that fills in both.
The workshop paper is where the second bet gets its theorems. Jakub’s paper is where to send the reader who wants to see the first bet made well.
OCANNL is open source, at github.com/ahrefs/ocannl.