Interactive charts built from live CSI data. Updated with every evaluation run.
Where your dollar goes
Each rectangle represents one AI model, sized proportionally to its CSI score. The visual makes the efficiency gap visceral—no chart literacy required. The largest rectangles deliver the most useful intelligence per dollar. The smallest slivers are the premium models where you’re paying for brand, safety infrastructure, or frontier capability that hasn’t yet been commoditized. When the cheapest model’s rectangle dwarfs the most expensive, you’re looking at a market in the early stages of radical deflation.
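As a rough illustration of "sized proportionally", the sketch below converts a set of CSI scores into each rectangle's share of the total chart area. The model names and scores are hypothetical placeholders, not values from the CSI database, and the function name `treemap_shares` is an assumption made for this example.

```python
def treemap_shares(csi_scores):
    """Return each model's fraction of the total chart area."""
    total = sum(csi_scores.values())
    if total <= 0:
        raise ValueError("CSI scores must sum to a positive number")
    # Area share is simply the model's score divided by the sum of all scores.
    return {name: score / total for name, score in csi_scores.items()}


# Hypothetical scores for illustration only.
example_scores = {"model-a": 60.0, "model-b": 30.0, "model-c": 10.0}
shares = treemap_shares(example_scores)
```

With these placeholder scores, `model-a` would occupy six times the area of `model-c`.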
CSI distribution
How are frontier models distributed across the efficiency spectrum? These histograms show the clustering pattern — most models bunch in the middle, with a few extreme outliers at both ends. The gold line marks the median.
CSI score distribution
Cost per task distribution
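A minimal sketch of how such a histogram and its median marker could be computed. The scores and the bin width of 20 are hypothetical; the binning the live charts actually use is not stated here.

```python
import statistics
from collections import Counter


def histogram(values, bin_width):
    """Count values per bin; each key is the bin index (value // bin_width)."""
    counts = Counter(int(v // bin_width) for v in values)
    return dict(sorted(counts.items()))


# Hypothetical CSI scores: most cluster in the middle, with outliers at both ends.
scores = [12, 35, 41, 44, 47, 52, 55, 58, 90]

bins = histogram(scores, bin_width=20)      # bin 2 covers scores 40-59
median_score = statistics.median(scores)    # where the median line would sit
```

The median is used rather than the mean because a few extreme outliers would pull the mean away from where most models sit.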
What does one task cost?
We priced three real-world tasks across all 16 frontier models using published API pricing. The same task that costs a fraction of a penny on one model costs over a dollar on another.
Summarize NVIDIA’s 10-K: 80,000 tokens in, 2,000 out
Build a DCF model: 1,500 tokens in, 8,000 out
Summarize a 20-page legal contract: 12,000 tokens in, 1,500 out
Pricing as of the most recent snapshot in the database.
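The per-task arithmetic can be sketched as follows, assuming the usual API convention of separate input and output prices quoted per million tokens. The token counts come from the three tasks above. The two models and their prices are hypothetical, chosen only to show how the same task can land at very different costs.

```python
# Hypothetical prices in USD per million tokens; not taken from the CSI database.
PRICES = {
    "budget-model": {"input": 0.10, "output": 0.40},
    "premium-model": {"input": 15.00, "output": 75.00},
}

# (tokens in, tokens out) for the three tasks listed above.
TASKS = {
    "Summarize NVIDIA's 10-K": (80_000, 2_000),
    "Build a DCF model": (1_500, 8_000),
    "Summarize a 20-page legal contract": (12_000, 1_500),
}


def task_cost(tokens_in, tokens_out, price):
    """Cost in USD of one task, given per-million-token input/output prices."""
    return (tokens_in * price["input"] + tokens_out * price["output"]) / 1_000_000


if __name__ == "__main__":
    for task, (tokens_in, tokens_out) in TASKS.items():
        for model, price in PRICES.items():
            print(f"{task} on {model}: ${task_cost(tokens_in, tokens_out, price):.4f}")
```

Under these placeholder prices, the 10-K summary costs under a cent on the budget model and more than a dollar on the premium one. Note that input-heavy and output-heavy tasks respond differently to a model's input/output price ratio.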
The efficiency frontier
The efficiency frontier maps every model by what it costs per task (x-axis) against how well it performs (y-axis). Bubble size represents overall CSI—the composite efficiency score. Models in the top-left corner deliver the highest capability at the lowest cost. The striking pattern: most frontier models score within a narrow capability band (0.90–1.00), but their costs span three orders of magnitude. The market isn’t differentiated by quality—it’s differentiated by price. This is the central insight of the Capability-Seconds Index: intelligence is becoming a commodity, and cost is the primary axis of competition.
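One way to make "top-left corner" precise is a Pareto frontier: the set of models for which no other model is both cheaper (or equal in cost) and more capable. The sketch below assumes that interpretation. The chart itself plots every model, and the names and numbers here are hypothetical.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    cost_per_task: float  # USD, x-axis
    capability: float     # performance score, y-axis


def efficiency_frontier(models):
    """Return models not dominated on (lower cost, higher capability)."""
    frontier = []
    best_capability = float("-inf")
    # Walk from cheapest to most expensive; on cost ties, try the stronger model first.
    for m in sorted(models, key=lambda m: (m.cost_per_task, -m.capability)):
        # Keep a model only if it beats every cheaper-or-equal model on capability.
        if m.capability > best_capability:
            frontier.append(m)
            best_capability = m.capability
    return frontier


# Hypothetical data: similar capability, costs spanning three orders of magnitude.
models = [
    Model("model-a", 0.001, 0.90),
    Model("model-b", 0.01, 0.95),
    Model("model-c", 0.1, 0.93),
    Model("model-d", 1.0, 0.99),
]
frontier_names = [m.name for m in efficiency_frontier(models)]
```

Here `model-c` drops off the frontier because `model-b` is both cheaper and more capable.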
CSI over time
The aggregate Capability-Seconds Index, tracked daily. With only a few days of data, the line is still sparse; as measurements accumulate, this chart will reveal the deflation trajectory of AI inference economics. Once seven or more data points exist, the published index switches from raw daily values to a 7-day rolling average.
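The switch from raw values to a rolling average can be sketched as below. This assumes a trailing window and a simple arithmetic mean; the text does not specify either, and the function name `published_value` is hypothetical.

```python
def published_value(daily_values, window=7):
    """Latest published index value.

    Returns the raw latest value while fewer than `window` points exist,
    otherwise the mean of the most recent `window` daily values.
    """
    if not daily_values:
        raise ValueError("at least one daily value is required")
    if len(daily_values) < window:
        return daily_values[-1]
    recent = daily_values[-window:]
    return sum(recent) / window


# Hypothetical daily index values for illustration.
early = published_value([1.0, 2.0, 3.0])                       # raw latest value
at_seven = published_value([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])  # mean of all seven
```

A trailing window smooths day-to-day noise without using future data, at the cost of lagging real changes by a few days.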