Prompts: Shape Inference After the Boilerplate: A Literature Survey from OCANNL
The prompts below are the human-side contribution to “Shape Inference After the Boilerplate: A Literature Survey from OCANNL” — the questions, framing, constraints, and course-corrections that steered the writing. They are reproduced here to delineate which part of the work is Łukasz Stafiniak’s.
Can you do literature search for related work on shape inference?
This is very rich. I attach a comparison of OCANNL with another related paper: Jakub Bachurski, Alan Mycroft, and Dominic Orchard’s Structuring Arrays with Algebraic Shapes (ARRAY ’25). Can you draft a substantial article, as a separate Markdown file, for our blog with this literature survey from OCANNL’s perspective? I’m thinking of a paper in the 6000 to 8000 words range.
Quick question: the two Gradual Types papers are from different groups. Do they cover similar ground or is it worth going deeper on both of them?
I attach an extraction of the Refined Remora paper, the sources of the Gradual Tensor Shape Checking paper, and two sections of the Tensor Comprehensions paper. Can you expand the article with deeper dives on these sources?
A few issues or remarks.
- Use LaTeX in cases like
1_emptyset. I would remove this paragraph “One small but telling Refined Remora result is that refinement erasure does not behave as refinement-type intuition might suggest.” It’s obvious that underconstrained shape information will fail to generate the expected program in many cases. But maybe we can mention that OCANNL distinguishes “weight parameter tensors” from other tensors, and fails with “you forgot to specify a hidden dimension” instead of guessing the smallest shape for parameters. - “In OCANNL, many shape facts live in a tensor-specific inference pass generated by operation specs.” Someone might think OCANNL introduces shape facts entirely ad-hoc, but OCANNL has constraint derivation that’s separate from constraint solving, and solving only introduces facts underspecified by the declarative semantics of the constraints or forced by pragmatic considerations: the closing policy, heuristics for solving or failing diophantine problems etc.
- “OCANNL’s broadcasting is also a meet in the broadcast order.” This is not a deep fact, just a choice to align with the type-theoretic perspective. Originally, the order was defined in the other direction in OCANNL, with a domain-theoretic perspective that 1_emptyset is the bottom because it has the least information.
- “Two distinct concrete sizes join to the claim-free top” for dimensions i.e. individual axes, the algebra for rows (axis sequences with broadcasting) is richer.
- “OCANNL falls back toward residual bounds, staged solving, and closing policies.” This is imprecise; the three terms name overlapping ideas. The sentence that follows it is good though.
- “It is also a limitation: OCANNL cannot claim to be a whole-program tensor shape checker for OCaml-Torch or PyTorch-style programs.” Not sure what you’re trying to tell here. There are two downsides – performance and how early errors are caught. All shape and projection inference happens at call sites, at actually traversed OCaml code paths, so detecting tensor logic errors in libraries relies on a test suite exercising shape inference using the library’s tensor operations.
- One thing that strikes me looking at the TC’s examples is that
OCANNL also has a numerical precision inference system, allowing the
notation to be even more succinct – no need to say
floatin functions because the actual precision will resolve at the function’s call site. - OCANNL commits to a valid or padded interpretation for convolutions
via its syntax extension: either directly in the specifications using a
<vs.=marker next to+, or implicitly by requiring ause_paddingboolean binding in scope of the specification. - I attach the starting scaffolding of my planned workshop article – take a look at its three examples showcasing notational brevity.