Generating Markets with AI
How to use an AI agent to generate prediction market definitions that score well on Caliber, including the prompt template we use internally.
Overview
AI agents can generate high-quality market definitions when given the right instructions. The key is providing a system prompt that encodes Caliber's rating criteria, plus enough context about the event or topic you want to create a market for.
Caliber provides a ready-to-use prompt template on the Market Generator page. This guide explains the principles behind it and how to get the best results.
The System Prompt
The core of market generation is the system prompt that instructs the agent. This is the same prompt used by Caliber's own internal generation tooling:
You generate prediction market definitions based on context provided by the user (e.g. a news story, event details, or a topic). Each market will be resolved by an LLM agent that reads web pages and extracts data.

Key rules:
- **Extraction prompts**: Ask the agent to extract a specific fact. NEVER use yes/no questions.
  Good: "Who won the 2024 NBA Finals?" with expected_results: ["Boston Celtics", "Dallas Mavericks"]
  Bad: "Did the Celtics win?" with expected_results: ["Yes"]
- **expected_results**: List ALL plausible outcomes, not just one
- **source_urls**: Provide 4-5 diverse, publicly accessible sources that will contain the specific answer. No paywalls, no wikipedia.
- **min_agreement**: Set to majority of sources (e.g. 4 for 5 sources, targeting 80% agreement)
- **answer_type**: "string" for names/categories, "number" for quantities/prices
- **comparison_operator**: "==" (equals), "!=" (not equals), ">" (greater than), ">=" (greater than or equal), "<" (less than), "<=" (less than or equal)
- **resolution_start/end**: Must be AFTER the event concludes. Allow time for sources to publish results.
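Several of the rules in this prompt are mechanical enough to check locally before a market is submitted for rating. The following is a minimal sketch of such a check, not Caliber's own validation. The `MarketDraft` shape, the `extraction_prompt` field name, the `checkDraft` and `minAgreementFor` helpers, the `eventEnd` parameter, and the ISO 8601 timestamp format are all assumptions made for illustration; only the remaining field names come from the prompt. The `example.com` URLs are placeholders.

```typescript
// Hypothetical draft shape. Field names other than `extraction_prompt`
// come from the system prompt; the real Caliber schema may differ.
interface MarketDraft {
  extraction_prompt: string; // assumed field name
  expected_results: string[];
  source_urls: string[];
  min_agreement: number;
  answer_type: "string" | "number";
  comparison_operator: "==" | "!=" | ">" | ">=" | "<" | "<=";
  resolution_start: string; // assumed ISO 8601 timestamp
  resolution_end: string; // assumed ISO 8601 timestamp
}

// Smallest number of agreeing sources that reaches the 80% target
// (integer arithmetic avoids floating-point rounding surprises).
function minAgreementFor(sourceCount: number): number {
  return Math.ceil((sourceCount * 4) / 5);
}

// Returns a list of rule violations. An empty list only means the draft
// passes these local checks; it says nothing about the eventual rating.
function checkDraft(draft: MarketDraft, eventEnd: Date): string[] {
  const problems: string[] = [];
  const start = new Date(draft.resolution_start).getTime();
  const end = new Date(draft.resolution_end).getTime();

  if (draft.expected_results.some((r) => /^(yes|no)$/i.test(r.trim()))) {
    problems.push("expected_results contains a yes/no answer");
  }
  if (draft.source_urls.length < 4) {
    problems.push("fewer than 4 source_urls");
  }
  if (draft.source_urls.some((u) => /wikipedia\.org/i.test(u))) {
    problems.push("source_urls includes wikipedia");
  }
  if (draft.min_agreement < minAgreementFor(draft.source_urls.length)) {
    problems.push("min_agreement is below 80% of sources");
  }
  if (!(start > eventEnd.getTime())) {
    problems.push("resolution_start is not after the event concludes");
  }
  if (!(end > start)) {
    problems.push("resolution_end is not after resolution_start");
  }
  return problems;
}

// Usage: a draft built from the prompt's own "Good" example.
const sampleDraft: MarketDraft = {
  extraction_prompt: "Who won the 2024 NBA Finals?",
  expected_results: ["Boston Celtics", "Dallas Mavericks"],
  source_urls: [
    "https://example.com/source-1", // placeholder URLs
    "https://example.com/source-2",
    "https://example.com/source-3",
    "https://example.com/source-4",
    "https://example.com/source-5",
  ],
  min_agreement: 4,
  answer_type: "string",
  comparison_operator: "==",
  resolution_start: "2024-06-25T00:00:00Z",
  resolution_end: "2024-07-02T00:00:00Z",
};
const sampleProblems = checkDraft(sampleDraft, new Date("2024-06-18T00:00:00Z"));
```

A check like this catches only the rules that can be verified from the draft alone. Source relevancy and reachability still need the rating step.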
Providing Context
The agent needs context to generate markets from. The richer the context, the better the output. As the system prompt notes, useful context includes a news story, event details, or a topic.
Example Workflow
A typical generation workflow looks like:
1. Gather context (e.g. copy a news article about a recent sports result)
2. Paste the prompt template into your AI agent with your context filled in
3. Submit the generated market JSON to Caliber via the rating form or API
4. Review the rating breakdown. If the score is low, feed the criteria feedback back to the agent and ask it to improve the weak areas
5. Repeat until the market reaches your target rating band
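The steps above can be sketched as a bounded loop. This is an illustrative outline under stated assumptions, not a Caliber API. The `generate` and `rate` callbacks are hypothetical adapters you would write around your own agent and the rating form or API. The `RatingResult` shape, with a numeric `score` and a list of `feedback` strings, is assumed. The numeric `targetScore` stands in for the target rating band.

```typescript
// Assumed rating shape; Caliber's actual response format may differ.
interface RatingResult {
  score: number;
  feedback: string[];
}

// Hypothetical adapters supplied by the caller.
interface RefinementDeps<M> {
  // Produces a market from the context plus any feedback from the last round.
  generate: (context: string, feedback: string[]) => Promise<M>;
  // Submits the market for rating and returns the result.
  rate: (market: M) => Promise<RatingResult>;
}

// Generate, rate, and regenerate with feedback until the score reaches
// the target or the round limit is hit, whichever comes first.
async function refineUntilTarget<M>(
  context: string,
  targetScore: number,
  maxRounds: number,
  deps: RefinementDeps<M>,
): Promise<{ market: M; rating: RatingResult; rounds: number }> {
  if (maxRounds < 1) {
    throw new Error("maxRounds must be at least 1");
  }
  let market = await deps.generate(context, []);
  let rating = await deps.rate(market);
  let rounds = 1;

  while (rating.score < targetScore && rounds < maxRounds) {
    // Feed the criteria feedback back to the agent for the next attempt.
    market = await deps.generate(context, rating.feedback);
    rating = await deps.rate(market);
    rounds += 1;
  }
  return { market, rating, rounds };
}
```

The round limit is a deliberate choice: if a market never reaches the target, the loop returns the last attempt and its rating rather than retrying indefinitely.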
Iterative Refinement
Markets rarely score perfectly on the first attempt. The most effective approach is an iterative loop: generate, rate, refine. When refining, focus on the weakest criteria:
- source_count: Add more diverse sources (aim for 5+)
- source_relevancy: Replace generic sources with ones that directly contain the answer
- source_agreement: Increase min_agreement to 80%+ of total sources
- prompt_subjectivity: Make the prompt more specific and unambiguous
- temporal_soundness: Ensure resolution dates are after the event concludes
- source_reachability: Replace broken URLs with working ones
- source_blocklisted: Remove blocklisted domains (.gov, wikipedia, paywalled sites)
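One way to apply these hints automatically is to pick the lowest-scoring criteria from a rating breakdown and turn them into a follow-up message for the agent. The sketch below is an assumption-laden illustration: the criterion names and hint text come from the list above, but the `scores` input (a map from criterion name to a number where lower means weaker) and the `buildRefinementMessage` helper are hypothetical, and Caliber's actual breakdown format may differ.

```typescript
// Criterion names and hints copied from the list above.
const REFINEMENT_HINTS: Record<string, string> = {
  source_count: "Add more diverse sources (aim for 5+)",
  source_relevancy: "Replace generic sources with ones that directly contain the answer",
  source_agreement: "Increase min_agreement to 80%+ of total sources",
  prompt_subjectivity: "Make the prompt more specific and unambiguous",
  temporal_soundness: "Ensure resolution dates are after the event concludes",
  source_reachability: "Replace broken URLs with working ones",
  source_blocklisted: "Remove blocklisted domains (.gov, wikipedia, paywalled sites)",
};

// Builds a follow-up message covering the `limit` weakest known criteria.
// Assumes lower scores are weaker; criteria without a hint are ignored.
function buildRefinementMessage(
  scores: Record<string, number>,
  limit: number = 3,
): string {
  const weakest = Object.keys(scores)
    .filter((criterion) =>
      Object.prototype.hasOwnProperty.call(REFINEMENT_HINTS, criterion),
    )
    .sort((a, b) => scores[a] - scores[b])
    .slice(0, limit);

  const lines = weakest.map(
    (criterion) => "- " + criterion + ": " + REFINEMENT_HINTS[criterion],
  );
  return ["Improve the market on these criteria:"].concat(lines).join("\n");
}
```

Limiting the message to a few criteria keeps the agent focused on the weakest areas instead of rewriting parts of the market that already score well.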
Programmatic Generation
For automated pipelines, you can use the system prompt and JSON schema from the @prophecy/schema/market-generation package directly.
import {
  MARKET_GENERATION_SYSTEM_PROMPT,
  MARKET_GENERATION_SCHEMA,
} from "@prophecy/schema/market-generation"

// `llm` stands in for your LLM client and `context` for the source text
// (e.g. a news article); this snippet does not define either.
const response = await llm.generate({
  system: MARKET_GENERATION_SYSTEM_PROMPT,
  prompt: `Generate markets about: ${context}`,
  responseSchema: MARKET_GENERATION_SCHEMA,
})

For more on what each criterion measures and how to optimize for it, see the Rating Criteria reference. To copy the full prompt template, visit the Market Generator page.