What is Sūtrakṛt?

A substrate for texts that mean more than one thing at once.

सूत्रकृत्
sūtrakṛt · the one who weaves the threads back together

i

Some texts are made to mean several things at once.

A song lyric is not a sentence. A scripture is not a manual. A legal opinion is not a FAQ entry. The texts that human beings have kept around for centuries — sūtras, sacred poems, common-law judgments, songs by Cohen and Dylan, Hamlet — work because they hold several legitimate readings at the same time, in tension, bounded by the formal craft of the text itself.

The Bhagavad-Gītā is the canonical example. For more than a thousand years, six major schools of Hindu commentary have read the same 700 verses and produced six different coherent philosophies: Śaṅkara's non-dual reading, Rāmānuja's qualified-non-dual reading, Madhva's dualist reading, Vallabha's pure-non-dual reading, Śrīdhara's philological-devotional reading, Madhusūdana's synthesis of non-dual realization with devotion. None of them is wrong. None of them is the consensus. The bounded disagreement is what kept the text alive.

ii

Modern AI tools, by default, flatten that.

The standard way AI reads a long-lived text today is: turn every passage into a numerical vector, find the closest vectors to your question, summarize what they say. That move works fine on a user manual. It silently destroys the texture of a text that was made to mean several things. The resulting summary sounds fluent, sounds confident, sounds like a good answer — and the bounded disagreement that gave the text its longevity is gone. The answer reads correctly. It just is not the kind of thing the text actually is.

Tomato soup survives blending. Bolognese does not.

iii

Sūtrakṛt refuses the flattening.

Sūtrakṛt is the underlying engine that powers this site. For every verse of the Gītā, it produces a structured object that holds all the readings the tradition has attested, alongside the Sanskrit text itself, the cross-references the verses make to each other, and a complete audit trail back to the source bhāṣya passages each reading was anchored to.

The reader picks the lens they want. Or picks none and reads all six side-by-side. The substrate stays modal — six schools, six readings, none privileged. The surface stays modeless — you do not have to choose a reading to start.

iv

What you'll find on a verse page.

If you came here to read
The Sanskrit mūla in Devanāgarī + IAST as the central column, the six schools' readings as the right-margin apparatus rail (color-coded by school), and two so-what questions a working modern reader might bring to the verse. The first screen, in 30 seconds.
If you came here to study
Each school's full English rendering, the bhāṣya divergence note, the everyday-application of that school's reading, the witness pointers back to the specific bhāṣya passages each rendering was anchored to.
If you came here to read scholarly source
Word-by-word with lemma + grammar + English meaning + each school's actual gloss-snippet in Sanskrit; intertextual panel ranked by the Sūtrakṛt substrate; anuvṛtti theme-chains; substrate version, fitted weights, corpus provenance. Everything is one click; nothing is forced.
If you came here to cite
Every verse page has a one-click copy with citation at the top. The full panel copies as Markdown with the citation block built in (substrate version, parser provenance, license, accessed-date, canonical URL). Per-school copy this reading on each card.
v

Briefly: how the substrate works.

For each verse, the substrate combines a multilingual sentence-embedding model (mE5-base) with five Sanskrit-aware symbolic features — theme-graph co-membership, vocative pattern correspondence, verbatim citation overlap, lemmatized lexical overlap, and Devanāgarī stem-prefix family. The composite scoring function was fit by grid search on Ramsukhdas's marked cross-references in his 1980 Sādhak Sañjīvanī and then frozen. Cross-validated on Śaṅkara's bhāṣya, the frozen weights produce 71.6% recall@4 — matching the fit-corpus performance within 0.1pp, which is the cross-school generalization claim the substrate rests on.

The word-by-word layer uses ByT5-Sanskrit-multitask (Nehrdich, Hellwig & Keutzer, EMNLP 2024) for lemma identification and grammatical analysis. The English meaning under each lemma comes from a curated 2,135-lemma Sanskrit-English gloss dictionary with prefix/suffix etymology where the lemma is decomposable. Per-school sense-snippets are extracted from each commentator's actual bhāṣya text in the school's own Sanskrit register (translation would be an additional collapse the substrate declines).

Code is open at github.com/ekras-doloop/sutrakrit-gita under MIT (substrate library) and CC-BY 4.0 (per-verse rendered objects). Reproducible byte-for-byte on a laptop with 8 GB RAM.

vi

Where this is going.

The Bhagavad-Gītā is the first text Sūtrakṛt has been built around because it is the most institutionally-coupled, longest-lived, most-instrumented testbed of bounded polysemy available — the equivalent of ImageNet for the substrate-rendered-edition idea. The architecture is designed to extend.

The next planned editions are the Yoga-Sūtras (with Vyāsa's bhāṣya + Vācaspati Miśra's Tattva-Vaiśāradī + Vijñānabhikṣu's Yoga-Vārttika commentary chain) and the principal Upaniṣads with their bhāṣya panels. The schema also generalizes — with per-domain feature engineering — to halakhic responsa, Quranic tafsīr, common-law precedent, critical-edition philology, and other corpora where bounded interpretive disagreement has been institutionally preserved.


figureBounded Polysemy — the architecture of meaningwhy one reading is too few, infinite readings are too many, and a bounded substrate is the discipline in between
i.The problem — pattern collapse
advaitaviśiṣṭadvaitaśuddhabhaktiadv-bhaktiflatteningfluent centroid— one confident readingfive witnesses lost

Modern systems answer what does this mean? by averaging across a corpus of disagreement. The output reads fluent. The texture — what each tradition actually said — is gone.

“Tomato soup survives blending. Bolognese does not.”

ii.The solution — six-layer per-verse object
6audit trailversion, weights, parser, source5convergent validationcross-school R@K on the panel4doctrinal projectionssix bhāṣya readings, tagged3intertextual panelfeature-decomposed cross-refs2word-by-word + lemmagrammar, sense, theme membership1mūla — root textDevanāgarī + IAST, daṇḍa-split

Every Bhagavad-Gītā verse is rendered against the same six-layer schema. Each layer carries witnesses back to a named source — no interpretation appears without the bhāṣya passage that licenses it.

The schema is the contribution; the BG edition is the existence proof.

iii.Case study — BG 18.63 & the ports
यथेच्छसि तथा कुरुyathecchasi tathā kuru“act as you wish” — BG 18.63advaita
right action coincides with the dissolution of confusion
dvaita
align your distinct will with the will of the divine
viśiṣṭādvaita
choose the path — karma, jñāna, or bhakti — your eligibility fits
bhakti
ponder the text fully; once moha lifts, the choice becomes clear

The same schema ports — with per-domain feature engineering — to halakhic responsa, Quranic tafsīr, common-law precedent, critical-edition philology, and to two AI applications already drafted as papers:

  • Sūtrakṛt for Code — per-function projections (security / performance / functional / OOP) with tools-as-witnesses (compilers, fuzzers, static analyzers).
  • Sūtrakṛt for Context-Engineering — modal-tagged context items (authority / retrieval / query) so prompt injection becomes a structurally-blockable boundary violation rather than a prose-following accident.