The only acceptable number of hallucinations in legal research is zero. A made-up case citation in a petition isn’t a clever demo — it’s a malpractice risk. So when we started QanoonX, we treated grounding as a hard constraint, not a nice-to-have.
What grounding actually means
Every factual claim QanoonX returns must be traceable to a source the user can open and read. No paraphrased precedents, no “based on similar cases”, no confident-sounding summaries of rulings that don’t exist.
In practice, this comes down to three architectural choices we made early and stuck with:
1. Retrieval first, generation second
The model never answers from parametric memory. Every query triggers a hybrid retrieval step (dense vector + BM25 keyword) over our own indexed corpus. The LLM only sees what retrieval surfaced.
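To make the retrieval-first flow concrete, here is a minimal sketch of merging a dense ranking with a BM25 ranking and passing only the merged passages onward. The fusion method (reciprocal rank fusion), the constant k=60, the document ids, and the function names are all assumptions for illustration; this post does not specify how the two retrievers are combined.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=3):
    """Fuse several ranked lists of document ids into one ranking.

    Assumption: RRF is used here only as a common, simple fusion method.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:top_n]]


def build_context(doc_ids, corpus):
    """Return only the retrieved passages; nothing else is shown to the LLM."""
    return [corpus[doc_id] for doc_id in doc_ids]


# Hypothetical corpus and retriever outputs for a single query.
corpus = {
    "doc_a": "passage A",
    "doc_b": "passage B",
    "doc_c": "passage C",
    "doc_d": "passage D",
}
dense_hits = ["doc_a", "doc_b", "doc_c"]  # placeholder dense-vector ranking
bm25_hits = ["doc_b", "doc_d", "doc_a"]   # placeholder BM25 ranking

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
context = build_context(fused, corpus)
```

Documents that both retrievers rank highly rise to the top, and the prompt is assembled from `context` alone, so the model has nothing to draw on except what retrieval surfaced.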
2. Citations are structured, not quoted
We return citations as first-class objects — PLD volume, year, court, page — not as text the model is “encouraged” to produce. That removes a whole class of plausible-sounding-but-wrong citations.
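A rough sketch of what a first-class citation could look like follows. The field names, the rendering order, and the `resolve_citations` helper are assumptions made for illustration, and the sample values are placeholders, not a reference to a real ruling. The idea is that the model selects ids from the retrieved set, and the citation itself is built from index metadata rather than generated text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    # Hypothetical field names modelled on "PLD volume, year, court, page".
    reporter: str
    year: int
    court: str
    page: int

    def render(self) -> str:
        # Assumed display order; adjust to the reporter's own convention.
        return f"{self.reporter} {self.year} {self.court} {self.page}"


def resolve_citations(cited_ids, retrieved):
    """Map ids chosen by the model to Citation objects from the index.

    Any id that was not in the retrieved set is rejected outright,
    so a citation that retrieval never produced cannot be returned.
    """
    unknown = [cid for cid in cited_ids if cid not in retrieved]
    if unknown:
        raise KeyError(f"citation ids not in retrieved set: {unknown}")
    return [retrieved[cid] for cid in cited_ids]


# Placeholder metadata, not a real case.
retrieved = {"doc_1": Citation(reporter="PLD", year=2000, court="SC", page=1)}
resolved = resolve_citations(["doc_1"], retrieved)
```

Because the rendered string comes from `Citation.render()` and never from model output, a wrong citation can only result from wrong index metadata, which is a data problem that can be audited.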
3. Evals that fail the build
Our eval suite has ~800 human-graded legal queries. If citation accuracy drops below 95% on a PR, the build fails. We’ve caught real regressions this way.
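The build gate itself can be very small. The sketch below assumes each graded query reduces to a single pass/fail on citation accuracy; the function names and the sample grades are hypothetical, and only the 95% threshold comes from the text above.

```python
THRESHOLD = 0.95  # builds fail when citation accuracy drops below this


def citation_accuracy(grades):
    """Fraction of graded queries whose citations were judged correct."""
    if not grades:
        raise ValueError("no graded queries")
    return sum(grades) / len(grades)


def gate(grades, threshold=THRESHOLD):
    """Return a process exit code: 0 passes the build, 1 fails it."""
    return 0 if citation_accuracy(grades) >= threshold else 1


# Hypothetical run: 770 of 800 queries graded correct (96.25%).
passing_grades = [True] * 770 + [False] * 30
exit_code = gate(passing_grades)
# In CI, this value would be passed to sys.exit() so a failing run stops the PR.
```

Keeping the gate as a plain exit code means any CI system can enforce it without knowing anything about the eval suite.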
The hard part
Most of our engineering time goes into the retrieval corpus, the chunking strategy, and the evals, not the LLM. The LLM is off-the-shelf. The grounding contract around it is ours.
Every grounded-AI product we build for clients starts from this blueprint.