Rating System

How Caliber calculates ratings using gates, scores, and weighted criteria.

Rating Algorithm

Caliber evaluates market definitions in two stages, gates first and then scores, carried out in five steps:

1. Execute all gate criteria. These are pass/fail checks, and every one must pass.
2. If any gate fails, the final score is 0.
3. Otherwise, execute all score criteria and calculate weighted scores.
4. Sum the weighted scores to get the final score (0-100).
5. Map the final score to a rating band (AAA to C).
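The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Caliber's actual implementation: the `rate` function, the `Gate` and `ScoreCriterion` types, and the dictionary shape of `market` are all hypothetical names chosen for the example. Band mapping (step 5) is left out of this sketch.

```python
from typing import Callable, List, Tuple

# Hypothetical types: a gate returns pass/fail, a score criterion returns 0-100.
Gate = Callable[[dict], bool]
ScoreCriterion = Tuple[Callable[[dict], float], float]  # (scoring function, weight)


def rate(market: dict, gates: List[Gate], criteria: List[ScoreCriterion]) -> float:
    """Return the final score (0-100) for a market definition."""
    # Stage 1: run every gate (no short-circuit), then require all to pass.
    gate_results = [gate(market) for gate in gates]
    if not all(gate_results):
        return 0.0  # any failed gate forces the final score to 0

    # Stage 2: weight each criterion's 0-100 score and sum the results.
    return sum(fn(market) * weight for fn, weight in criteria)


# Usage with made-up criteria: one gate, two equally weighted scores.
market = {"sources": ["https://example.com/a"]}
gates = [lambda m: len(m["sources"]) > 0]
criteria = [(lambda m: 80.0, 0.5), (lambda m: 60.0, 0.5)]
print(rate(market, gates, criteria))  # 70.0
```

Collecting all gate results before checking them mirrors step 1, which says all gates are executed; an implementation that stops at the first failure would produce the same score.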

Criterion Types

Gate Criteria

Pass/fail checks that must all pass for the rating to proceed. If any gate fails, the final score is automatically 0.

>Source Reachability
>Source Blocklisted
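As a rough illustration of what a gate might look like, the sketch below checks source URLs against a blocklist. The function name `source_blocklisted_gate`, the `BLOCKLIST` contents, and the exact-hostname matching rule are assumptions made for this example; the document does not specify how Caliber matches blocklisted sources.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; the real list and matching rules are not given here.
BLOCKLIST = {"blocked.example"}


def source_blocklisted_gate(source_urls, blocklist=BLOCKLIST):
    """Return True (gate passes) if no source hostname is on the blocklist."""
    for url in source_urls:
        # urlparse lowercases the hostname; fall back to "" if it is missing.
        host = urlparse(url).hostname or ""
        if host in blocklist:
            return False  # one blocklisted source fails the gate
    return True


print(source_blocklisted_gate(["https://ok.example/report"]))        # True
print(source_blocklisted_gate(["https://blocked.example/article"]))  # False
```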

Score Criteria

Weighted criteria that contribute to the final score. Each returns a score from 0 to 100, which is then multiplied by its weight.

>Source Count (20%)
>Source Agreement (20%)
>Source Relevancy (20%)
>And more...

Evaluation Methods

Criteria are evaluated using one of two methods:

Static

Rule-based evaluation using deterministic logic. Fast and consistent. Used for objective checks like URL reachability and source counts.

LLM

Semantic analysis using a language model. Used for subjective assessments like prompt clarity and source relevancy.

Weighting

Score criteria have weights that determine their contribution to the final score. All weights sum to 1.0 (100%).

final_score = Σ (criterion_score × criterion_weight)

Example:
  source_count:         40 × 0.20 =  8.0
  source_agreement:    100 × 0.20 = 20.0
  source_history:       60 × 0.10 =  6.0
  prompt_subjectivity:  75 × 0.10 =  7.5
  temporal_soundness:   85 × 0.10 =  8.5
  source_relevancy:     90 × 0.20 = 18.0
  ──────────────────────────────────────
  final_score:                      68.0
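The worked example can be reproduced with a short script. The scores and weights are copied from the example above; the `final_score` function and the dictionary layout are illustrative choices, not Caliber's API. Note that the six weights listed in the example add up to 0.90, so this sketch deliberately does not assert that the weights sum to 1.0.

```python
# (score, weight) pairs copied from the worked example.
example = {
    "source_count": (40, 0.20),
    "source_agreement": (100, 0.20),
    "source_history": (60, 0.10),
    "prompt_subjectivity": (75, 0.10),
    "temporal_soundness": (85, 0.10),
    "source_relevancy": (90, 0.20),
}


def final_score(criteria):
    """Weighted sum: each criterion's 0-100 score times its weight."""
    return sum(score * weight for score, weight in criteria.values())


total = final_score(example)
print(round(total, 1))  # 68.0
```

Rounding to one decimal place keeps floating-point noise out of the printed result.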

Rating Bands

The final score (0-100) is mapped to a letter rating:

Rating	Score Range	Definition
AAA	90-100	Exceptional definition quality; highly reliable and unambiguous
AA	80-89	Very strong; minor weaknesses
A	70-79	Strong; some limitations
BBB	60-69	Adequate; moderate ambiguity or risk
BB	50-59	Speculative; notable weaknesses
B	35-49	Weak; high risk of poor resolution
CCC	20-34	Very weak; high likelihood of problematic resolution
CC	10-19	Highly unreliable; severe structural issues
C	0-9	Structurally broken or guaranteed to fail resolution
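The table can be expressed as a lookup over lower bounds. This is a minimal sketch: the `rating_band` function and `BANDS` list are hypothetical names. The table gives integer ranges only, so the handling of fractional scores is an assumption here: a score such as 89.5 is placed in the lower band (AA) because it has not reached the next band's lower bound.

```python
# (lower bound, rating) pairs from the table, highest band first.
BANDS = [
    (90, "AAA"), (80, "AA"), (70, "A"),
    (60, "BBB"), (50, "BB"), (35, "B"),
    (20, "CCC"), (10, "CC"), (0, "C"),
]


def rating_band(score: float) -> str:
    """Map a final score in [0, 100] to its letter rating."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    # Return the first band whose lower bound the score meets.
    for lower, label in BANDS:
        if score >= lower:
            return label
    raise AssertionError("unreachable: the lowest band starts at 0")


print(rating_band(68.0))  # BBB
print(rating_band(0))     # C
```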