Engineering · April 17, 2026 · siddiquefaisal126

Grounded AI isn’t a feature. It’s a contract.

The only acceptable number of hallucinations in legal research is zero. Here's how we architected QanoonX to earn that number.

A made-up case citation in a petition isn’t a clever demo; it’s a malpractice risk. So when we started QanoonX, we treated grounding as a hard constraint, not a nice-to-have.

What grounding actually means

Every factual claim QanoonX returns must be traceable to a source the user can open and read. No paraphrased precedents, no “based on similar cases”, no confident-sounding summaries of rulings that don’t exist.

In practice, this comes down to three architectural choices we made early and stuck with:

1. Retrieval first, generation second

The model never answers from parametric memory. Every query triggers a hybrid retrieval step (dense vector + BM25 keyword) over our own indexed corpus. The LLM only sees what retrieval surfaced.
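To make the hybrid step concrete, here is a minimal sketch of one common way to merge a dense ranking and a BM25 ranking: reciprocal rank fusion. This is an illustration rather than a description of QanoonX's actual fusion logic, which this post does not specify. The function name, the document IDs, and the constant k=60 are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) in every list it appears in,
    and the scores are summed. k=60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the two retrievers (IDs are placeholders).
dense_hits = ["doc_a", "doc_b", "doc_c"]  # vector search
bm25_hits = ["doc_b", "doc_d", "doc_a"]   # keyword search

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

Documents that both retrievers rank highly rise to the top, and only the fused list would be passed to the LLM as context.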

2. Citations are structured, not quoted

We return citations as first-class objects — PLD volume, year, court, page — not as text the model is “encouraged” to produce. That removes a whole class of plausible-sounding-but-wrong citations.
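A rough sketch of what "first-class" can look like in code: the model only refers to retrieved documents by ID, and the citation itself is built from corpus metadata. The field names, the `resolve_citations` helper, and the sample values below are hypothetical and not QanoonX's real schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """A citation assembled from corpus metadata, never from model text."""
    reporter: str  # e.g. "PLD"
    year: int
    court: str     # e.g. "SC"
    page: int

    def render(self) -> str:
        return f"{self.reporter} {self.year} {self.court} {self.page}"


def resolve_citations(cited_ids, retrieved):
    """Map the IDs the model cited to Citation objects from retrieval.

    Raises ValueError for any ID that was not in the retrieved set, so a
    citation the model invented cannot reach the user.
    """
    unknown = [cid for cid in cited_ids if cid not in retrieved]
    if unknown:
        raise ValueError(f"cited documents were not retrieved: {unknown}")
    return [retrieved[cid] for cid in cited_ids]


# Placeholder metadata for illustration only; not a real ruling.
retrieved = {"d1": Citation(reporter="PLD", year=2000, court="SC", page=1)}

citations = resolve_citations(["d1"], retrieved)
print(citations[0].render())
```

Because the rendered string comes from stored fields, the model has no opportunity to mistype a year or page number.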

3. Evals that fail the build

Our eval suite has ~800 human-graded legal queries. If citation accuracy drops below 95% on a PR, the build fails. We’ve caught real regressions this way.
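The gate itself can be very small. The sketch below assumes each graded query is a dict with a boolean `citation_correct` field; that result format and the function names are assumptions for illustration, while the 95% threshold comes from the rule above.

```python
THRESHOLD = 0.95  # builds fail below 95% citation accuracy


def citation_accuracy(results):
    """Fraction of graded queries whose citations were judged correct."""
    if not results:
        raise ValueError("no eval results to score")
    correct = sum(1 for r in results if r["citation_correct"])
    return correct / len(results)


def gate(results, threshold=THRESHOLD):
    """Return a process exit code: 0 passes the build, 1 fails it."""
    accuracy = citation_accuracy(results)
    print(f"citation accuracy: {accuracy:.1%} (threshold {threshold:.0%})")
    return 0 if accuracy >= threshold else 1


# Hypothetical graded results: 19 correct, 1 incorrect.
sample = [{"citation_correct": True}] * 19 + [{"citation_correct": False}]
exit_code = gate(sample)
```

In a CI job, a script would load the real graded results and end with `raise SystemExit(gate(results))`, so a non-zero exit code fails the PR.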

The hard part

Most of our engineering time isn’t on the LLM — it’s on the retrieval corpus, the chunking strategy, and the evals. The LLM is off-the-shelf. The contract is ours.

Every grounded-AI product we build for clients starts from this blueprint.

