Cost Accountability

The Problem

Without cost visibility, engineers cannot make quality-cost tradeoffs. Without per-stage cost tracking, engineers cannot identify which stage is responsible for cost explosions. Common anti-pattern: increasing top-k to improve retrieval quality without realising it drives quadratic LLM context processing costs. No mainstream RAG library surfaces this until after the bill arrives.

The rag-axis Approach

Every pipeline result carries a CostReport. This is not a logging feature. It is part of the core return type. CostReport contains:

tokens_consumed per stage (retrieval, reranking, context assembly, generation)
latency_ms per stage
estimated_cost_gbp (configurable currency)
total_tokens
cache_hit (bool — was this query served from cache)

Why Per-Stage

Aggregate cost per query is insufficient. Per-stage cost tells you:

Whether reranking a large candidate set is worth the latency
Whether your chunking strategy is producing oversized chunks
Whether semantic caching would pay for itself at your query volume

Non-Optional

CostReport is non-optional because optional cost tracking is cost tracking that does not exist in production.

Getting Started

Concepts

Architecture

Packages

Contributing

Cost accountability

Cost Accountability

The Problem

The rag-axis Approach

Why Per-Stage

Non-Optional

Getting Started

Concepts

Architecture

Packages

Contributing

​Cost Accountability

​The Problem

​The rag-axis Approach

​Why Per-Stage

​Non-Optional

Cost Accountability

The Problem

The rag-axis Approach

Why Per-Stage

Non-Optional