Prethink: When Code Quality Stops Being a Dashboard and Becomes Agent Context

Enterprise codebases carry a paradox. Quality metrics exist — cyclomatic complexity, coupling, cohesion, test coverage — but they live in dashboards that developers glance at during onboarding and never open again. Meanwhile, the AI coding agents now writing and reviewing code operate inside the repository itself, blind to everything that lives behind a web login.

Moderne's Prethink, released in March 2026, attempts to close that gap. Rather than asking agents to query an external service, it materialises code quality metrics as CSV and Markdown files directly inside the repository, in a .moderne/context/ directory. The agent reads them the same way it reads any other file.

The problem Prethink actually solves

Most AI coding assistants suffer from a context deficit. They can read the code, the tests, and the README, but they cannot see the structural health of the codebase. A function might look reasonable in isolation whilst being the most coupled method in a 50,000-line codebase. A class might present a clean API surface whilst being a God Class with a Weighted Method Count of 339.

Traditional quality tools — SonarQube, CodeClimate, Checkstyle, PMD — were built for human review. They produce web dashboards, CI gates, and PDF reports. None of these formats are useful to an agent that works in the editor. The agent cannot click through a SonarQube alert. It can, however, parse a CSV file sitting next to the code it is about to modify.

Prethink's contribution is not a new metric. It is the decision to materialise established metrics in a format the agent already understands.

What gets materialised

The .moderne/context/ directory typically contains five files:

File	Contents	Granularity
`method-quality-metrics.csv`	Cyclomatic, Cognitive, Nesting, ABC, Halstead	Per method
`class-quality-metrics.csv`	WMC, LCOM4, TCC, CBO, Maintainability Index	Per class
`package-quality-metrics.csv`	Afferent coupling, Efferent coupling, Instability, Abstractness, Cycles	Per package
`code-smells.csv`	God Class, Feature Envy, Data Class detections	Per class
`test-gaps.csv`	Risk-scored untested methods	Per method
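To make the "agent reads a file" idea concrete, here is a minimal sketch of how a tool or agent harness might filter the method-level CSV down to the class being edited. The column names (`class`, `method`, `cyclomatic`, and so on) and the sample rows are assumptions for illustration; the article does not specify the actual headers Prethink writes.

```python
import csv
import io

# Hypothetical sample of method-quality-metrics.csv. The real column names
# are not given in the article, so these headers are assumptions.
SAMPLE_CSV = """\
class,method,cyclomatic,cognitive,nesting
com.example.OrderService,placeOrder,14,22,4
com.example.OrderService,cancelOrder,3,2,1
com.example.Invoice,total,7,9,2
"""


def hotspots(csv_text, class_name, min_cyclomatic=10):
    """Return methods of one class whose cyclomatic complexity meets a threshold."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        row["method"]
        for row in rows
        if row["class"] == class_name and int(row["cyclomatic"]) >= min_cyclomatic
    ]


print(hotspots(SAMPLE_CSV, "com.example.OrderService"))
```

Filtering to the rows that matter for the current edit also keeps the amount of context small, which helps when the agent cannot afford to load the whole file.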

The metrics themselves are not inventions. Cyclomatic complexity dates to McCabe in 1976. Halstead's volume and difficulty measures appeared in 1977. The ABC metric — Assignments, Branches, Conditions combined as the square root of the sum of their squares — was published by Fitzpatrick in 1997. Cognitive complexity is newer, proposed by Campbell in 2017, and attempts to capture how difficult a method is for a human to read rather than merely how many paths it contains.
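As a small worked illustration of the ABC magnitude described above, the sketch below computes the square root of the sum of squares for hand-counted values. How assignments, branches, and conditions are counted for a given language is left out; the function name is hypothetical.

```python
import math


def abc_magnitude(assignments, branches, conditions):
    """ABC magnitude: square root of the sum of the squared counts."""
    return math.sqrt(assignments**2 + branches**2 + conditions**2)


# A method with 2 assignments, 3 branches (calls), and 6 conditions.
print(abc_magnitude(2, 3, 6))
```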

What is genuinely useful is the composite debt score:

debt = 0.4 * normalised(cyclomatic)
     + 0.3 * normalised(cognitive)
     + 0.2 * normalised(nesting)
     + 0.1 * normalised(halsteadBugs)

This weighting is opinionated. A team with different priorities can adjust the weights, and one that cares more about coupling than raw complexity could extend the formula with a coupling term, since the recipe is open source.
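The formula above can be sketched in a few lines. The weights come from the article, but the normalisation does not: this sketch assumes a clamped min-max scaling against hypothetical ceilings (`CEILINGS`), which may differ from what the recipe actually does.

```python
# Weights as given in the article.
WEIGHTS = {"cyclomatic": 0.4, "cognitive": 0.3, "nesting": 0.2, "halstead_bugs": 0.1}

# Hypothetical upper bounds used for scaling; not taken from Prethink.
CEILINGS = {"cyclomatic": 50, "cognitive": 100, "nesting": 10, "halstead_bugs": 5.0}


def normalise(value, ceiling):
    """Scale a raw metric to [0, 1], clamping anything above the ceiling."""
    if ceiling <= 0:
        return 0.0
    return min(1.0, max(0.0, value / ceiling))


def debt(metrics):
    """Weighted sum of normalised metrics; 0 is healthy, 1 is worst."""
    return sum(
        weight * normalise(metrics[name], CEILINGS[name])
        for name, weight in WEIGHTS.items()
    )


example = {"cyclomatic": 25, "cognitive": 50, "nesting": 5, "halstead_bugs": 2.5}
print(round(debt(example), 3))
```

Keeping the weights in a plain dictionary makes the opinionated part of the score easy to find and change.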

The class-level metrics include LCOM4 (Hitz and Montazeri, 1995), which counts the connected components of a class's method-attribute graph. A value of N above 1 suggests the class could be split into N independent classes. TCC (Tight Class Cohesion, Bieman and Kang, 1995) measures the proportion of method pairs that share attribute access, on a scale from 0.0 to 1.0.
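Both cohesion metrics can be illustrated on a toy class. This is a simplified sketch, not Prethink's implementation: it links methods only through shared attributes (full LCOM4 also links methods that call one another), and the input shape, a mapping from method name to the set of attributes it touches, is an assumption.

```python
from itertools import combinations


def lcom4(method_attrs):
    """Count connected components among methods linked by shared attributes."""
    parent = {m: m for m in method_attrs}

    def find(m):
        # Follow parent links to the component root, compressing the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in combinations(method_attrs, 2):
        if method_attrs[a] & method_attrs[b]:
            parent[find(a)] = find(b)

    return len({find(m) for m in method_attrs})


def tcc(method_attrs):
    """Fraction of method pairs that share at least one attribute."""
    pairs = list(combinations(method_attrs, 2))
    if not pairs:
        return 0.0  # undefined for fewer than two methods; this sketch returns 0.0
    connected = sum(1 for a, b in pairs if method_attrs[a] & method_attrs[b])
    return connected / len(pairs)


# A class that mixes account logic with rendering logic.
example = {
    "deposit": {"balance"},
    "withdraw": {"balance"},
    "render_header": {"title"},
    "render_footer": {"title"},
}
print(lcom4(example), round(tcc(example), 3))
```

Here the two groups of methods never touch the same attribute, so the class falls into two components, which is the signal that it may be doing two jobs.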

At the package level, Prethink applies Robert Martin's metrics from Agile Software Development: Afferent Coupling (Ca), Efferent Coupling (Ce), Instability (I = Ce / (Ca + Ce)), and Abstractness (A). Packages are plotted against the Main Sequence, the line where A + I = 1. Packages near (0, 0) sit in the Zone of Pain: concrete and stable, so painful to change. Packages near (1, 1) sit in the Zone of Uselessness: abstract and unstable, with little or nothing depending on them.
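A short sketch shows how these package metrics combine. Instability and the distance from the Main Sequence, D = |A + I - 1|, follow Martin's definitions; the `zone` helper and its 0.5 threshold are hypothetical choices made for illustration only.

```python
def instability(ca, ce):
    """I = Ce / (Ca + Ce); a package with no couplings is treated as stable."""
    total = ca + ce
    return ce / total if total else 0.0


def distance_from_main_sequence(abstractness, inst):
    """D = |A + I - 1|; 0 means the package sits on the Main Sequence."""
    return abs(abstractness + inst - 1)


def zone(abstractness, inst, threshold=0.5):
    """Classify a package; the threshold is an illustrative assumption."""
    offset = abstractness + inst - 1
    if offset <= -threshold:
        return "zone of pain"
    if offset >= threshold:
        return "zone of uselessness"
    return "near main sequence"


# A concrete package (A = 0.0) that eight others depend on and that depends on two.
i = instability(ca=8, ce=2)
print(i, zone(0.0, i))
```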

The foundation: Lossless Semantic Trees

The engine behind Prethink is OpenRewrite's Lossless Semantic Tree (LST). Unlike a conventional Abstract Syntax Tree, an LST preserves formatting: whitespace, comments, indentation. This means the same tree used to compute metrics can also drive refactoring without destroying the author's original formatting.

OpenRewrite recipes traverse the LST to collect method invocations, type declarations, field accesses, and package dependencies. Because the LST is type-attributed, Prethink can distinguish between a call to String.length() and a call to a local length() method — something text-based linters cannot do reliably.

Two recipes ship with Prethink:

UpdatePrethinkContextStarter — uses an LLM to annotate each method with a natural-language description and assess complexity hotspots. The LLM output is cached by SHA-256 hash of the method body, so re-running on unchanged code does not incur fresh cost.
UpdatePrethinkContextNoAiStarter — skips the LLM step, relying entirely on deterministic metrics.
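The hash-based caching described for the AI variant can be sketched as follows. This is an illustration of the general technique, not Prethink's code: the `AnnotationCache` class, the in-memory dictionary, and the `annotate` callback standing in for the LLM call are all hypothetical.

```python
import hashlib


class AnnotationCache:
    """Cache annotations keyed by the SHA-256 hash of the method body."""

    def __init__(self, annotate):
        self._annotate = annotate  # stand-in for the expensive LLM call
        self._store = {}
        self.misses = 0  # number of times the expensive call was made

    def get(self, method_body):
        key = hashlib.sha256(method_body.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._annotate(method_body)
        return self._store[key]


# A fake annotator so the sketch runs without any external service.
cache = AnnotationCache(lambda body: f"method of {len(body)} characters")

cache.get("int f() { return 1; }")
cache.get("int f() { return 1; }")  # unchanged body: served from the cache
cache.get("int f() { return 2; }")  # changed body: triggers a new call
print(cache.misses)
```

Because the key is derived from the method body alone, any edit to the body, including whitespace, would produce a new hash in this sketch; whether Prethink normalises the body before hashing is not stated in the article.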

This is deterministic static analysis with an optional AI layer. It is not RAG, not embeddings, not an MCP server, not prompt engineering applied to the codebase. The metrics are computed by recipes, written to files, and read back by agents as context.

Honest trade-offs

Prethink is not free. The AI variant calls an LLM for every method annotation. A 50,000-method codebase will incur real cost on the first run, though subsequent runs only process changed methods thanks to the hash-based cache.

Recency is another concern. The .moderne/context/ files reflect the state of the codebase at the time the recipe ran. On a long-lived feature branch, they drift from reality. A team using trunk-based development will find them accurate; a team that works on month-long branches will find them stale.

Agent capability matters too. A small model, in the 1.5B to 7B parameter range, may lack the context window to load a full CSV alongside the code it is editing, or the reasoning capacity to act on it. A larger reasoning model will make better use of the data. Prethink's value is proportional to the agent's ability to consume structured context.

Finally, the metrics themselves carry decades of academic baggage. Cyclomatic complexity treats every branch equally, whether it guards against null or implements core business logic. LCOM4 cannot detect meaningful cohesion in classes that use dependency injection. These metrics are useful signals, not ground truth.

Where this fits

Prethink occupies a specific niche: deterministic, compiler-accurate analysis materialised as agent-readable files. SonarQube excels at security hotspots and coverage gates. CodeScene excels at behavioural analysis — who changed what, where knowledge is concentrated. CodeClimate offers maintainability ratings for CI integration.

None of those tools place their output inside the repository where an agent works. Prethink does. Its integration with CLAUDE.md and .cursorrules means that an agent opening a codebase is immediately aware of its structural health without leaving the editor.

For teams already running OpenRewrite recipes — and many enterprises are, for migration work — adding Prethink is a configuration change, not a new tool to procure. For teams without OpenRewrite, it is a heavier commitment: a build-time recipe, a source repository large enough to justify the metrics, and agents capable of consuming them.

The broader pattern is worth noting. As coding agents take on larger responsibilities, the boundary between tooling output and source code is blurring. Dashboards are built for humans. Files are built for both.
