Bull vs Bear Market Analyst
Build an adversarial debate agent where Bull and Bear personas argue opposing investment theses, with a Judge scoring arguments and a Synthesizer providing a balanced, risk-aware recommendation.
Build an adversarial multi-agent system where two AI personas debate investment decisions, forcing consideration of both bullish and bearish perspectives before reaching a balanced conclusion.
| Difficulty | Advanced |
|---|---|
| Time | 3-4 days |
| Code | ~800 lines |
| Pattern | Adversarial Debate with Round Cycling |
TL;DR
Build a debate system using a LangGraph round-based state machine (three rounds of argumentation), parallel agent execution (Bull and Bear run simultaneously), structured argument models (claims, evidence, rebuttals), and judge scoring (logic, evidence, risk assessment). Forcing both sides of any investment thesis to be argued counters confirmation bias.
Financial Disclaimer
This system is for educational purposes only. It does not constitute financial advice, investment recommendations, or solicitation to buy or sell securities. Always consult qualified financial advisors before making investment decisions.
What You'll Build
An adversarial market analysis agent that:
- Parses investment claims - Extracts the core thesis and key assumptions
- Generates Bull case - Arguments for why the investment will succeed
- Generates Bear case - Arguments for why the investment will fail
- Runs critique rounds - Each side attacks the other's weakest points
- Runs rebuttal rounds - Each side defends and concedes where appropriate
- Judges objectively - Scores arguments on logic, evidence, and risk assessment
- Synthesizes balanced view - Provides nuanced recommendation with risk factors
Why Adversarial Debate?
┌─────────────────────────────────────────────────────────────────────┐
│ THE PROBLEM WITH SINGLE-PERSPECTIVE ANALYSIS │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Traditional LLM Query: │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ User: "Should I invest in NVIDIA?" │ │
│ │ │ │
│ │ LLM: "NVIDIA is a great investment because..." │ │
│ │ (Anchors on first perspective, confirmation bias) │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ Adversarial Debate: │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ User: "Should I invest in NVIDIA?" │ │
│ │ │ │
│ │ Bull: "Yes, because AI demand, data center growth..." │ │
│ │ Bear: "No, because valuation, competition, cyclicality..." │ │
│ │ Bull: "But the moat from CUDA ecosystem..." │ │
│ │ Bear: "But AMD and custom chips are catching up..." │ │
│ │ │ │
│ │ Synthesizer: "Balanced view considering both perspectives" │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ Result: More robust analysis that considers risks AND opportunities │
│ │
└─────────────────────────────────────────────────────────────────────┘

| Single Perspective | Adversarial Debate |
|---|---|
| Anchors on first idea | Forces consideration of opposites |
| Confirmation bias | Steelmans both sides |
| Misses risks | Explicitly surfaces risks |
| Overconfident | Calibrated uncertainty |
Architecture
┌─────────────────────────────────────────────────────────────────────┐
│ ADVERSARIAL DEBATE ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Input: "Is Tesla a good investment at current prices?" │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ CLAIM PARSER │ Extract thesis, timeframe, assumptions │
│ └────────┬─────────┘ │
│ │ │
│ ▼ │
│ ╔══════════════════════════════════════════════════════════════╗ │
│ ║ ROUND 1: Opening Arguments ║ │
│ ║ ┌─────────────┐ ┌─────────────┐ ║ │
│ ║ │ BULL │ (parallel) │ BEAR │ ║ │
│ ║ │ 3 reasons │ │ 3 reasons │ ║ │
│ ║ │ to BUY │ │ to AVOID │ ║ │
│ ║ └─────────────┘ └─────────────┘ ║ │
│ ╚══════════════════════════════════════════════════════════════╝ │
│ │ │
│ ▼ │
│ ╔══════════════════════════════════════════════════════════════╗ │
│ ║ ROUND 2: Critiques ║ │
│ ║ ┌─────────────┐ ┌─────────────┐ ║ │
│ ║ │ BULL │ (parallel) │ BEAR │ ║ │
│ ║ │ attacks │ │ attacks │ ║ │
│ ║ │ Bear's │ │ Bull's │ ║ │
│ ║ │ arguments │ │ arguments │ ║ │
│ ║ └─────────────┘ └─────────────┘ ║ │
│ ╚══════════════════════════════════════════════════════════════╝ │
│ │ │
│ ▼ │
│ ╔══════════════════════════════════════════════════════════════╗ │
│ ║ ROUND 3: Rebuttals & Concessions ║ │
│ ║ ┌─────────────┐ ┌─────────────┐ ║ │
│ ║ │ BULL │ (parallel) │ BEAR │ ║ │
│ ║ │ defends │ │ defends │ ║ │
│ ║ │ & concedes │ │ & concedes │ ║ │
│ ║ └─────────────┘ └─────────────┘ ║ │
│ ╚══════════════════════════════════════════════════════════════╝ │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ JUDGE │ Scores: Logic, Evidence, Risk Assessment │
│ └────────┬─────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ SYNTHESIZER │ Balanced recommendation with confidence │
│ └──────────────────┘ │
│ │
│ Output: Nuanced analysis acknowledging both bull and bear cases │
│ │
└─────────────────────────────────────────────────────────────────────┘

Project Structure
debate-market-analyst/
├── src/
│ ├── __init__.py
│ ├── config.py
│ ├── models/
│ │ ├── __init__.py
│ │ ├── arguments.py # Argument, Critique, Rebuttal models
│ │ ├── scoring.py # JudgeScore, Verdict models
│ │ └── state.py # DebateState for LangGraph
│ ├── agents/
│ │ ├── __init__.py
│ │ ├── parser.py # Claim parser agent
│ │ ├── bull.py # Bull (bullish) agent
│ │ ├── bear.py # Bear (bearish) agent
│ │ ├── judge.py # Judge agent
│ │ └── synthesizer.py # Synthesizer agent
│ ├── workflow/
│ │ ├── __init__.py
│ │ └── debate.py # LangGraph debate workflow
│ └── api/
│ ├── __init__.py
│ └── main.py # FastAPI endpoints
├── tests/
├── docker-compose.yml
└── requirements.txt

Tech Stack
| Technology | Purpose |
|---|---|
| LangGraph | Round-based state machine with parallel branches |
| OpenAI GPT-4o | Different personas (Bull, Bear, Judge, Synthesizer) |
| Pydantic | Structured argument and scoring models |
| FastAPI | API to submit claims and retrieve debates |
Implementation
Configuration
# src/config.py
from pydantic_settings import BaseSettings
class Settings(BaseSettings):
# LLM Settings
openai_api_key: str
openai_model: str = "gpt-4o"
temperature_debaters: float = 0.7 # Creative for arguments
temperature_judge: float = 0.2 # Consistent for scoring
# Debate Settings
num_arguments_per_side: int = 3
num_critique_points: int = 2
num_rounds: int = 3 # Opening, Critique, Rebuttal
# Scoring Weights
weight_logic: float = 0.35
weight_evidence: float = 0.40
weight_risk_assessment: float = 0.25
class Config:
env_file = ".env"
settings = Settings()

Understanding the Configuration:
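The three scoring weights sum to 1.0 and are meant to combine the judge's per-criterion 0-10 scores into a single overall score. A minimal sketch of that weighted combination (the helper function is illustrative; in this system the judge applies the weights through its prompt rather than in code):

```python
# Illustrative weighted combination of the judge's per-criterion scores.
# The weights mirror src/config.py; the helper itself is an assumption,
# since the judge applies the rubric via its prompt.
WEIGHT_LOGIC = 0.35
WEIGHT_EVIDENCE = 0.40
WEIGHT_RISK = 0.25

def weighted_overall(logic: float, evidence: float, risk: float) -> float:
    """Combine 0-10 criterion scores into one 0-10 overall score."""
    assert abs(WEIGHT_LOGIC + WEIGHT_EVIDENCE + WEIGHT_RISK - 1.0) < 1e-9
    return WEIGHT_LOGIC * logic + WEIGHT_EVIDENCE * evidence + WEIGHT_RISK * risk

# Evidence dominates: a well-evidenced argument outscores a purely clever one
print(weighted_overall(8.0, 6.0, 7.0))  # ≈ 6.95
```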
┌─────────────────────────────────────────────────────────────────────┐
│ WHY DIFFERENT TEMPERATURES? │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Debaters (Bull & Bear): temperature = 0.7 │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ • Need creative, diverse arguments │ │
│ │ • Should explore different angles │ │
│ │ • Higher temperature = more varied perspectives │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ Judge: temperature = 0.2 │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ • Needs consistent, reproducible scoring │ │
│ │ • Should apply criteria objectively │ │
│ │ • Lower temperature = less randomness in evaluation │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ Scoring weights prioritize EVIDENCE (0.40) because: │
│ • Financial claims should be backed by data │
│ • Speculation without evidence is dangerous │
│ • Logic alone can lead to plausible but wrong conclusions │
│ │
└─────────────────────────────────────────────────────────────────────┘

Argument Models
# src/models/arguments.py
from pydantic import BaseModel, Field
from typing import List, Optional
from enum import Enum
from datetime import datetime
class Stance(str, Enum):
BULL = "bull"
BEAR = "bear"
class InvestmentClaim(BaseModel):
"""Parsed investment claim/thesis."""
original_query: str
asset: str = Field(description="Stock, crypto, commodity, etc.")
thesis: str = Field(description="Core investment thesis to debate")
timeframe: str = Field(description="Investment horizon: short/medium/long term")
assumptions: List[str] = Field(
default_factory=list,
description="Key assumptions embedded in the claim"
)
risk_tolerance: str = Field(
default="moderate",
description="Implied risk tolerance: conservative/moderate/aggressive"
)
class Argument(BaseModel):
"""A single argument for or against the investment."""
stance: Stance
claim: str = Field(description="The main assertion")
reasoning: str = Field(description="Why this claim is true")
evidence: List[str] = Field(
description="Supporting data, facts, or precedents"
)
confidence: float = Field(
ge=0.0, le=1.0,
description="How confident in this argument (0-1)"
)
class ArgumentSet(BaseModel):
"""Collection of arguments from one side."""
stance: Stance
arguments: List[Argument]
round_number: int = 1
class Critique(BaseModel):
"""A critique of an opponent's argument."""
target_argument_index: int = Field(
description="Which opponent argument this critiques (0-indexed)"
)
weakness_identified: str = Field(
description="The flaw or weakness in the argument"
)
counter_evidence: List[str] = Field(
description="Evidence that undermines the argument"
)
severity: str = Field(
description="How damaging: minor, moderate, severe"
)
class CritiqueSet(BaseModel):
"""Collection of critiques from one side."""
stance: Stance
critiques: List[Critique]
round_number: int = 2
class Rebuttal(BaseModel):
"""Defense against a critique, possibly with concessions."""
critique_index: int = Field(
description="Which critique this rebuts (0-indexed)"
)
defense: str = Field(
description="Why the critique is wrong or overstated"
)
concession: Optional[str] = Field(
None,
description="What the side concedes is valid about the critique"
)
revised_confidence: float = Field(
ge=0.0, le=1.0,
description="Updated confidence after considering critique"
)
class RebuttalSet(BaseModel):
"""Collection of rebuttals from one side."""
stance: Stance
rebuttals: List[Rebuttal]
    round_number: int = 3

Scoring Models
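Both the argument models above and the scoring models below lean on pydantic's `ge`/`le` field constraints, so an out-of-range confidence or score fails loudly at construction time instead of silently skewing the debate. A self-contained demonstration of the pattern (standalone toy model, not part of the project files):

```python
from pydantic import BaseModel, Field, ValidationError

class ToyScore(BaseModel):
    # Same ge/le pattern as Argument.confidence and ArgumentScore.logic_score
    confidence: float = Field(ge=0.0, le=1.0)

print(ToyScore(confidence=0.8).confidence)  # accepted

try:
    ToyScore(confidence=1.5)  # out of range
except ValidationError:
    print("rejected: confidence must be <= 1.0")
```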
# src/models/scoring.py
from pydantic import BaseModel, Field
from typing import List, Optional
from .arguments import Stance
class ArgumentScore(BaseModel):
"""Score for a single argument."""
argument_index: int
logic_score: float = Field(ge=0.0, le=10.0, description="Soundness of reasoning")
evidence_score: float = Field(ge=0.0, le=10.0, description="Quality of evidence")
risk_score: float = Field(ge=0.0, le=10.0, description="Risk assessment quality")
overall_score: float = Field(ge=0.0, le=10.0)
comments: str
class SideScore(BaseModel):
"""Aggregate score for one side of the debate."""
stance: Stance
argument_scores: List[ArgumentScore]
critique_effectiveness: float = Field(
ge=0.0, le=10.0,
description="How well they attacked opponent's arguments"
)
rebuttal_effectiveness: float = Field(
ge=0.0, le=10.0,
description="How well they defended their arguments"
)
total_score: float = Field(ge=0.0, le=100.0)
class JudgeVerdict(BaseModel):
"""Complete verdict from the judge."""
bull_score: SideScore
bear_score: SideScore
winner: Stance = Field(description="Which side made stronger arguments")
margin: str = Field(description="How decisive: narrow, moderate, decisive")
key_differentiators: List[str] = Field(
description="What made the difference"
)
unresolved_questions: List[str] = Field(
description="Important questions neither side addressed well"
)
class Synthesis(BaseModel):
"""Final synthesized recommendation."""
    recommendation: str = Field(
        description="STRONG BUY, BUY, HOLD, SELL, STRONG SELL, or AVOID"
    )
confidence: float = Field(
ge=0.0, le=1.0,
description="Confidence in recommendation"
)
bull_case_summary: str = Field(
description="Strongest points from bull side"
)
bear_case_summary: str = Field(
description="Strongest points from bear side"
)
key_risks: List[str] = Field(
description="Most important risks to monitor"
)
key_catalysts: List[str] = Field(
description="Events that could change the thesis"
)
position_sizing_hint: str = Field(
description="Suggested position size based on conviction"
)
review_triggers: List[str] = Field(
description="When to revisit this analysis"
    )

Debate State
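One detail in the state definition below deserves a note: `reasoning_trace` is annotated with `operator.add` as its LangGraph reducer, so each node's returned list is concatenated onto the trace rather than replacing it, which is what lets the parallel Bull and Bear nodes both append safely. The reducer itself is plain list concatenation (toy values; the order in which parallel branches are merged is up to the scheduler):

```python
import operator

# LangGraph merges a node's return value into state as:
#   new_value = reducer(current_value, returned_value)
trace = ["Parsed claim: TSLA - thesis"]  # toy existing trace
trace = operator.add(trace, ["Bull presented 3 arguments"])
trace = operator.add(trace, ["Bear presented 3 arguments"])

print(trace)  # all three entries survive; nothing is overwritten
```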
# src/models/state.py
from typing import TypedDict, List, Optional, Annotated
from enum import Enum
import operator
from .arguments import (
InvestmentClaim, ArgumentSet, CritiqueSet, RebuttalSet
)
from .scoring import JudgeVerdict, Synthesis
class DebatePhase(str, Enum):
PARSING = "parsing"
OPENING = "opening"
CRITIQUE = "critique"
REBUTTAL = "rebuttal"
JUDGING = "judging"
SYNTHESIS = "synthesis"
COMPLETE = "complete"
class DebateState(TypedDict):
"""State for the adversarial debate workflow."""
# Input
raw_query: str
# Parsed claim
claim: Optional[InvestmentClaim]
# Current phase
phase: DebatePhase
current_round: int
# Arguments (accumulated across rounds)
bull_arguments: Optional[ArgumentSet]
bear_arguments: Optional[ArgumentSet]
# Critiques
bull_critiques: Optional[CritiqueSet]
bear_critiques: Optional[CritiqueSet]
# Rebuttals
bull_rebuttals: Optional[RebuttalSet]
bear_rebuttals: Optional[RebuttalSet]
# Judgment
verdict: Optional[JudgeVerdict]
# Final output
synthesis: Optional[Synthesis]
# Audit trail
    reasoning_trace: Annotated[List[str], operator.add]

State Progression Through Debate:
┌─────────────────────────────────────────────────────────────────────┐
│ DEBATE STATE PROGRESSION │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Phase: PARSING │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ claim: InvestmentClaim(asset="TSLA", thesis="...", ...) │ │
│ │ bull_arguments: None │ │
│ │ bear_arguments: None │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Phase: OPENING (Round 1) │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ bull_arguments: ArgumentSet(arguments=[Arg1, Arg2, Arg3]) │ │
│ │ bear_arguments: ArgumentSet(arguments=[Arg1, Arg2, Arg3]) │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Phase: CRITIQUE (Round 2) │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ bull_critiques: CritiqueSet(critiques=[Crit1, Crit2]) │ │
│ │ bear_critiques: CritiqueSet(critiques=[Crit1, Crit2]) │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Phase: REBUTTAL (Round 3) │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ bull_rebuttals: RebuttalSet(rebuttals=[Reb1, Reb2]) │ │
│ │ bear_rebuttals: RebuttalSet(rebuttals=[Reb1, Reb2]) │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Phase: JUDGING │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ verdict: JudgeVerdict(winner="bear", margin="moderate") │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Phase: SYNTHESIS │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ synthesis: Synthesis(recommendation="HOLD", confidence=0.6) │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘

Claim Parser Agent
# src/agents/parser.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from ..models.arguments import InvestmentClaim
from ..config import settings
class ClaimParser:
"""Parses raw investment queries into structured claims."""
def __init__(self):
self.llm = ChatOpenAI(
model=settings.openai_model,
api_key=settings.openai_api_key,
temperature=0.1 # Low for consistent parsing
).with_structured_output(InvestmentClaim)
self.prompt = ChatPromptTemplate.from_messages([
("system", """You are a financial analyst parsing investment queries.
Extract the following from the user's question:
1. The specific asset (stock ticker, crypto, commodity, etc.)
2. The core investment thesis being proposed
3. The implied timeframe (short: <1yr, medium: 1-3yr, long: >3yr)
4. Key assumptions embedded in the question
5. Implied risk tolerance
If any information is ambiguous, make reasonable assumptions and note them.
Examples:
- "Should I buy NVDA?" → asset: NVDA, thesis: NVDA is a good buy at current prices
- "Is Bitcoin going to 100k?" → asset: BTC, thesis: Bitcoin will reach $100,000
- "Tech stocks for retirement" → asset: Tech sector, thesis: Tech stocks are good for long-term retirement investing, timeframe: long"""),
("human", "{query}")
])
async def parse(self, query: str) -> InvestmentClaim:
"""Parse a raw query into a structured investment claim."""
chain = self.prompt | self.llm
result = await chain.ainvoke({"query": query})
result.original_query = query
        return result

Bull Agent
# src/agents/bull.py
from typing import List, Optional
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from ..models.arguments import (
InvestmentClaim, Argument, ArgumentSet, Stance,
Critique, CritiqueSet, Rebuttal, RebuttalSet
)
from ..config import settings
class BullAgent:
"""The bullish debater - argues FOR the investment."""
def __init__(self):
self.llm = ChatOpenAI(
model=settings.openai_model,
api_key=settings.openai_api_key,
temperature=settings.temperature_debaters
)
self.stance = Stance.BULL
async def generate_arguments(
self,
claim: InvestmentClaim
) -> ArgumentSet:
"""Generate opening arguments for the bull case."""
llm = self.llm.with_structured_output(ArgumentSet)
prompt = ChatPromptTemplate.from_messages([
("system", """You are a BULLISH investment analyst. Your job is to make
the strongest possible case FOR investing in the asset.
Generate {num_args} compelling arguments for why this is a GOOD investment.
For each argument:
1. Make a clear, specific claim
2. Explain the reasoning
3. Provide concrete evidence (data, trends, comparisons)
4. Assign a confidence level (0-1)
Think like a growth investor: focus on upside potential, competitive advantages,
market opportunities, and positive catalysts.
IMPORTANT: Steelman your position. Make the STRONGEST possible bull case,
even if you personally might be skeptical."""),
("human", """Asset: {asset}
Thesis: {thesis}
Timeframe: {timeframe}
Assumptions: {assumptions}
Generate your strongest bull case arguments.""")
])
chain = prompt | llm
result = await chain.ainvoke({
"num_args": settings.num_arguments_per_side,
"asset": claim.asset,
"thesis": claim.thesis,
"timeframe": claim.timeframe,
"assumptions": ", ".join(claim.assumptions) if claim.assumptions else "None specified"
})
result.stance = self.stance
result.round_number = 1
return result
async def generate_critiques(
self,
claim: InvestmentClaim,
opponent_arguments: ArgumentSet
) -> CritiqueSet:
"""Critique the bear's arguments."""
llm = self.llm.with_structured_output(CritiqueSet)
prompt = ChatPromptTemplate.from_messages([
("system", """You are a BULLISH analyst critiquing BEARISH arguments.
Your opponent made these arguments against investing in {asset}:
{opponent_args}
Generate {num_critiques} critiques attacking their weakest points.
For each critique:
1. Identify which argument you're attacking (by index, 0-based)
2. Explain the weakness or flaw in their reasoning
3. Provide counter-evidence
4. Rate the severity (minor, moderate, severe)
Focus on:
- Flawed assumptions
- Cherry-picked data
- Outdated information
- Logical fallacies
- Missing context"""),
("human", "Critique the bear's arguments.")
])
# Format opponent arguments
opponent_text = "\n".join([
f"[{i}] {arg.claim}\n Reasoning: {arg.reasoning}\n Evidence: {', '.join(arg.evidence)}"
for i, arg in enumerate(opponent_arguments.arguments)
])
chain = prompt | llm
result = await chain.ainvoke({
"asset": claim.asset,
"opponent_args": opponent_text,
"num_critiques": settings.num_critique_points
})
result.stance = self.stance
result.round_number = 2
return result
async def generate_rebuttals(
self,
claim: InvestmentClaim,
my_arguments: ArgumentSet,
opponent_critiques: CritiqueSet
) -> RebuttalSet:
"""Defend against bear's critiques, conceding where appropriate."""
llm = self.llm.with_structured_output(RebuttalSet)
prompt = ChatPromptTemplate.from_messages([
("system", """You are a BULLISH analyst defending your arguments.
Your original arguments:
{my_args}
The bear critiqued your arguments:
{critiques}
Generate rebuttals for each critique.
For each rebuttal:
1. Identify which critique you're responding to (by index)
2. Defend your position where the critique is unfair or wrong
3. CONCEDE points that are valid - intellectual honesty strengthens your credibility
4. Provide an updated confidence level
IMPORTANT: Good debaters concede valid points. If a critique is legitimate,
acknowledge it and explain why your overall thesis still holds despite this weakness."""),
("human", "Defend your bull case and concede where appropriate.")
])
my_args_text = "\n".join([
f"[{i}] {arg.claim}\n Evidence: {', '.join(arg.evidence)}"
for i, arg in enumerate(my_arguments.arguments)
])
critiques_text = "\n".join([
f"[{i}] Attacking argument {c.target_argument_index}: {c.weakness_identified}\n Counter-evidence: {', '.join(c.counter_evidence)}"
for i, c in enumerate(opponent_critiques.critiques)
])
chain = prompt | llm
result = await chain.ainvoke({
"my_args": my_args_text,
"critiques": critiques_text
})
result.stance = self.stance
result.round_number = 3
        return result

Bear Agent
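The `BearAgent` below mirrors `BullAgent` almost line for line; only the stance and prompt wording differ. A future refactor could factor the shared flow into one debater class parameterized by persona. A hypothetical sketch (names like `DebaterPersona` are illustrative and not part of the project structure above):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DebaterPersona:
    """The few things that actually differ between Bull and Bear."""
    stance: str
    goal: str
    critique_focus: str

BULL = DebaterPersona(
    stance="BULLISH",
    goal="make the strongest possible case FOR investing in the asset",
    critique_focus="flawed assumptions, cherry-picked data, missing context",
)
BEAR = DebaterPersona(
    stance="BEARISH",
    goal="make the strongest possible case AGAINST investing in the asset",
    critique_focus="optimistic assumptions, ignored risks, valuation concerns",
)

def opening_system_prompt(p: DebaterPersona) -> str:
    # One template instead of two near-identical copies
    return f"You are a {p.stance} investment analyst. Your job is to {p.goal}."

print(opening_system_prompt(BEAR))
```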
# src/agents/bear.py
from typing import List, Optional
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from ..models.arguments import (
InvestmentClaim, Argument, ArgumentSet, Stance,
Critique, CritiqueSet, Rebuttal, RebuttalSet
)
from ..config import settings
class BearAgent:
"""The bearish debater - argues AGAINST the investment."""
def __init__(self):
self.llm = ChatOpenAI(
model=settings.openai_model,
api_key=settings.openai_api_key,
temperature=settings.temperature_debaters
)
self.stance = Stance.BEAR
async def generate_arguments(
self,
claim: InvestmentClaim
) -> ArgumentSet:
"""Generate opening arguments for the bear case."""
llm = self.llm.with_structured_output(ArgumentSet)
prompt = ChatPromptTemplate.from_messages([
("system", """You are a BEARISH investment analyst. Your job is to make
the strongest possible case AGAINST investing in the asset.
Generate {num_args} compelling arguments for why this is a BAD investment.
For each argument:
1. Make a clear, specific claim
2. Explain the reasoning
3. Provide concrete evidence (data, trends, comparisons)
4. Assign a confidence level (0-1)
Think like a skeptical analyst: focus on risks, overvaluation, competition,
market headwinds, and negative catalysts.
IMPORTANT: Steelman your position. Make the STRONGEST possible bear case,
even if you personally might be bullish."""),
("human", """Asset: {asset}
Thesis: {thesis}
Timeframe: {timeframe}
Assumptions: {assumptions}
Generate your strongest bear case arguments.""")
])
chain = prompt | llm
result = await chain.ainvoke({
"num_args": settings.num_arguments_per_side,
"asset": claim.asset,
"thesis": claim.thesis,
"timeframe": claim.timeframe,
"assumptions": ", ".join(claim.assumptions) if claim.assumptions else "None specified"
})
result.stance = self.stance
result.round_number = 1
return result
async def generate_critiques(
self,
claim: InvestmentClaim,
opponent_arguments: ArgumentSet
) -> CritiqueSet:
"""Critique the bull's arguments."""
llm = self.llm.with_structured_output(CritiqueSet)
prompt = ChatPromptTemplate.from_messages([
("system", """You are a BEARISH analyst critiquing BULLISH arguments.
Your opponent made these arguments for investing in {asset}:
{opponent_args}
Generate {num_critiques} critiques attacking their weakest points.
For each critique:
1. Identify which argument you're attacking (by index, 0-based)
2. Explain the weakness or flaw in their reasoning
3. Provide counter-evidence
4. Rate the severity (minor, moderate, severe)
Focus on:
- Overly optimistic assumptions
- Ignored risks
- Survivorship bias
- Valuation concerns
- Competitive threats"""),
("human", "Critique the bull's arguments.")
])
opponent_text = "\n".join([
f"[{i}] {arg.claim}\n Reasoning: {arg.reasoning}\n Evidence: {', '.join(arg.evidence)}"
for i, arg in enumerate(opponent_arguments.arguments)
])
chain = prompt | llm
result = await chain.ainvoke({
"asset": claim.asset,
"opponent_args": opponent_text,
"num_critiques": settings.num_critique_points
})
result.stance = self.stance
result.round_number = 2
return result
async def generate_rebuttals(
self,
claim: InvestmentClaim,
my_arguments: ArgumentSet,
opponent_critiques: CritiqueSet
) -> RebuttalSet:
"""Defend against bull's critiques, conceding where appropriate."""
llm = self.llm.with_structured_output(RebuttalSet)
prompt = ChatPromptTemplate.from_messages([
("system", """You are a BEARISH analyst defending your arguments.
Your original arguments:
{my_args}
The bull critiqued your arguments:
{critiques}
Generate rebuttals for each critique.
For each rebuttal:
1. Identify which critique you're responding to (by index)
2. Defend your position where the critique is unfair or wrong
3. CONCEDE points that are valid - intellectual honesty strengthens your credibility
4. Provide an updated confidence level
IMPORTANT: Good debaters concede valid points. If a critique is legitimate,
acknowledge it and explain why your overall thesis still holds despite this weakness."""),
("human", "Defend your bear case and concede where appropriate.")
])
my_args_text = "\n".join([
f"[{i}] {arg.claim}\n Evidence: {', '.join(arg.evidence)}"
for i, arg in enumerate(my_arguments.arguments)
])
critiques_text = "\n".join([
f"[{i}] Attacking argument {c.target_argument_index}: {c.weakness_identified}\n Counter-evidence: {', '.join(c.counter_evidence)}"
for i, c in enumerate(opponent_critiques.critiques)
])
chain = prompt | llm
result = await chain.ainvoke({
"my_args": my_args_text,
"critiques": critiques_text
})
result.stance = self.stance
result.round_number = 3
        return result

Judge Agent
# src/agents/judge.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from ..models.arguments import ArgumentSet, CritiqueSet, RebuttalSet, Stance
from ..models.scoring import JudgeVerdict, SideScore, ArgumentScore
from ..config import settings
class JudgeAgent:
"""Impartial judge that scores the debate."""
def __init__(self):
self.llm = ChatOpenAI(
model=settings.openai_model,
api_key=settings.openai_api_key,
temperature=settings.temperature_judge # Low for consistency
).with_structured_output(JudgeVerdict)
self.prompt = ChatPromptTemplate.from_messages([
("system", """You are an IMPARTIAL investment debate judge.
You must evaluate both sides fairly based on:
1. LOGIC (weight: {weight_logic})
- Soundness of reasoning
- Internal consistency
- Avoidance of logical fallacies
2. EVIDENCE (weight: {weight_evidence})
- Quality and relevance of data
- Recency of information
- Credibility of sources
3. RISK ASSESSMENT (weight: {weight_risk})
- Acknowledgment of uncertainties
- Consideration of downside scenarios
- Calibrated confidence levels
Score each argument 0-10 on each criterion.
Score critique and rebuttal effectiveness 0-10.
IMPORTANT: Be objective. The side with BETTER ARGUMENTS wins,
regardless of your personal views on the investment."""),
("human", """DEBATE TRANSCRIPT:
=== BULL CASE ===
Opening Arguments:
{bull_arguments}
Critiques of Bear:
{bull_critiques}
Rebuttals:
{bull_rebuttals}
=== BEAR CASE ===
Opening Arguments:
{bear_arguments}
Critiques of Bull:
{bear_critiques}
Rebuttals:
{bear_rebuttals}
Score this debate and determine a winner.""")
])
async def judge(
self,
bull_arguments: ArgumentSet,
bear_arguments: ArgumentSet,
bull_critiques: CritiqueSet,
bear_critiques: CritiqueSet,
bull_rebuttals: RebuttalSet,
bear_rebuttals: RebuttalSet
) -> JudgeVerdict:
"""Judge the debate and produce a verdict."""
def format_arguments(args: ArgumentSet) -> str:
return "\n".join([
f"[{i}] {a.claim}\n Reasoning: {a.reasoning}\n Evidence: {', '.join(a.evidence)}\n Confidence: {a.confidence}"
for i, a in enumerate(args.arguments)
])
def format_critiques(crits: CritiqueSet) -> str:
return "\n".join([
f"[{i}] Attacking #{c.target_argument_index}: {c.weakness_identified}\n Severity: {c.severity}"
for i, c in enumerate(crits.critiques)
])
def format_rebuttals(rebs: RebuttalSet) -> str:
return "\n".join([
f"[{i}] Defense: {r.defense}\n Concession: {r.concession or 'None'}\n Revised confidence: {r.revised_confidence}"
for i, r in enumerate(rebs.rebuttals)
])
chain = self.prompt | self.llm
result = await chain.ainvoke({
"weight_logic": settings.weight_logic,
"weight_evidence": settings.weight_evidence,
"weight_risk": settings.weight_risk_assessment,
"bull_arguments": format_arguments(bull_arguments),
"bull_critiques": format_critiques(bull_critiques),
"bull_rebuttals": format_rebuttals(bull_rebuttals),
"bear_arguments": format_arguments(bear_arguments),
"bear_critiques": format_critiques(bear_critiques),
"bear_rebuttals": format_rebuttals(bear_rebuttals)
})
        return result

Synthesizer Agent
# src/agents/synthesizer.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from ..models.arguments import InvestmentClaim, ArgumentSet, RebuttalSet
from ..models.scoring import JudgeVerdict, Synthesis
from ..config import settings
class SynthesizerAgent:
"""Produces balanced final recommendation from debate."""
def __init__(self):
self.llm = ChatOpenAI(
model=settings.openai_model,
api_key=settings.openai_api_key,
temperature=0.3
).with_structured_output(Synthesis)
self.prompt = ChatPromptTemplate.from_messages([
("system", """You are a senior investment advisor synthesizing a debate.
Both bull and bear cases have been argued. The judge has scored them.
Your job is to provide a BALANCED, NUANCED recommendation.
DO NOT simply side with the winner. Instead:
1. Acknowledge the strongest points from BOTH sides
2. Identify key risks that must be monitored
3. Suggest appropriate position sizing based on uncertainty
4. Provide clear triggers for revisiting the analysis
Recommendation options:
- STRONG BUY: High conviction, consider overweight position
- BUY: Positive outlook, standard position size
- HOLD: Balanced risks/rewards, maintain current position
- SELL: Negative outlook, reduce position
- STRONG SELL: High conviction negative, consider exiting entirely
- AVOID: For those not currently holding, stay away
IMPORTANT: Your recommendation should reflect the DEBATE OUTCOME,
not just one side. A narrow bull victory with valid bear concerns
might still be a HOLD, not a BUY."""),
("human", """Investment Claim: {asset} - {thesis}
Timeframe: {timeframe}
DEBATE OUTCOME:
Winner: {winner} (margin: {margin})
Key Bull Points:
{bull_summary}
Key Bear Points:
{bear_summary}
Unresolved Questions:
{unresolved}
Synthesize a final recommendation.""")
])
async def synthesize(
self,
claim: InvestmentClaim,
bull_arguments: ArgumentSet,
bear_arguments: ArgumentSet,
verdict: JudgeVerdict
) -> Synthesis:
"""Produce final synthesized recommendation."""
bull_summary = "\n".join([
f"- {a.claim} (confidence: {a.confidence})"
for a in bull_arguments.arguments
])
bear_summary = "\n".join([
f"- {a.claim} (confidence: {a.confidence})"
for a in bear_arguments.arguments
])
chain = self.prompt | self.llm
result = await chain.ainvoke({
"asset": claim.asset,
"thesis": claim.thesis,
"timeframe": claim.timeframe,
"winner": verdict.winner.value,
"margin": verdict.margin,
"bull_summary": bull_summary,
"bear_summary": bear_summary,
"unresolved": "\n".join(verdict.unresolved_questions)
})
        return result

LangGraph Debate Workflow
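In the graph wiring below, the bull and bear nodes of each round run as parallel branches. Independent of LangGraph's scheduler, the per-round pattern is a concurrent fan-out/fan-in, which plain `asyncio` expresses like this (stub coroutines stand in for the real LLM-backed agents):

```python
import asyncio

async def bull_round(claim: str) -> str:
    await asyncio.sleep(0.01)  # stands in for bull.generate_arguments(...)
    return f"bull case for {claim}"

async def bear_round(claim: str) -> str:
    await asyncio.sleep(0.01)  # stands in for bear.generate_arguments(...)
    return f"bear case for {claim}"

async def run_round(claim: str) -> tuple[str, str]:
    # Fan out both sides concurrently; fan in when both complete
    bull_out, bear_out = await asyncio.gather(bull_round(claim), bear_round(claim))
    return bull_out, bear_out

bull_out, bear_out = asyncio.run(run_round("TSLA at current prices"))
print(bull_out)
```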
# src/workflow/debate.py
from typing import Literal
from langgraph.graph import StateGraph, END
from ..models.state import DebateState, DebatePhase
from ..models.arguments import InvestmentClaim
from ..agents.parser import ClaimParser
from ..agents.bull import BullAgent
from ..agents.bear import BearAgent
from ..agents.judge import JudgeAgent
from ..agents.synthesizer import SynthesizerAgent
# Initialize agents
parser = ClaimParser()
bull = BullAgent()
bear = BearAgent()
judge = JudgeAgent()
synthesizer = SynthesizerAgent()
async def parse_claim_node(state: DebateState) -> DebateState:
"""Parse the raw query into a structured claim."""
claim = await parser.parse(state["raw_query"])
    # Return only the changed keys; reasoning_trace uses operator.add,
    # so spreading **state here would re-append the existing trace.
    return {
        "claim": claim,
        "phase": DebatePhase.OPENING,
        "current_round": 1,
        "reasoning_trace": [f"Parsed claim: {claim.asset} - {claim.thesis}"]
    }
async def bull_opening_node(state: DebateState) -> DebateState:
"""Bull generates opening arguments."""
arguments = await bull.generate_arguments(state["claim"])
return {
"bull_arguments": arguments,
"reasoning_trace": [f"Bull presented {len(arguments.arguments)} arguments"]
}
async def bear_opening_node(state: DebateState) -> DebateState:
"""Bear generates opening arguments."""
arguments = await bear.generate_arguments(state["claim"])
return {
"bear_arguments": arguments,
"reasoning_trace": [f"Bear presented {len(arguments.arguments)} arguments"]
}
async def advance_to_critique_node(state: DebateState) -> DebateState:
"""Advance to critique phase."""
    # Partial update only; LangGraph merges it into the shared state
    return {
        "phase": DebatePhase.CRITIQUE,
        "current_round": 2,
        "reasoning_trace": ["Advancing to critique round"]
    }
async def bull_critique_node(state: DebateState) -> DebateState:
"""Bull critiques bear's arguments."""
critiques = await bull.generate_critiques(
state["claim"],
state["bear_arguments"]
)
return {
"bull_critiques": critiques,
"reasoning_trace": [f"Bull critiqued {len(critiques.critiques)} bear arguments"]
}
async def bear_critique_node(state: DebateState) -> DebateState:
"""Bear critiques bull's arguments."""
critiques = await bear.generate_critiques(
state["claim"],
state["bull_arguments"]
)
return {
"bear_critiques": critiques,
"reasoning_trace": [f"Bear critiqued {len(critiques.critiques)} bull arguments"]
}
async def advance_to_rebuttal_node(state: DebateState) -> DebateState:
"""Advance to rebuttal phase."""
    return {
        "phase": DebatePhase.REBUTTAL,
        "current_round": 3,
        "reasoning_trace": ["Advancing to rebuttal round"]
    }
async def bull_rebuttal_node(state: DebateState) -> DebateState:
"""Bull rebuts bear's critiques."""
rebuttals = await bull.generate_rebuttals(
state["claim"],
state["bull_arguments"],
state["bear_critiques"]
)
return {
"bull_rebuttals": rebuttals,
"reasoning_trace": [f"Bull defended with {len(rebuttals.rebuttals)} rebuttals"]
}
async def bear_rebuttal_node(state: DebateState) -> DebateState:
"""Bear rebuts bull's critiques."""
rebuttals = await bear.generate_rebuttals(
state["claim"],
state["bear_arguments"],
state["bull_critiques"]
)
return {
"bear_rebuttals": rebuttals,
"reasoning_trace": [f"Bear defended with {len(rebuttals.rebuttals)} rebuttals"]
}
async def judge_node(state: DebateState) -> DebateState:
"""Judge evaluates the debate."""
verdict = await judge.judge(
state["bull_arguments"],
state["bear_arguments"],
state["bull_critiques"],
state["bear_critiques"],
state["bull_rebuttals"],
state["bear_rebuttals"]
)
    return {
        "verdict": verdict,
        "phase": DebatePhase.SYNTHESIS,
        "reasoning_trace": [f"Judge verdict: {verdict.winner.value} wins by {verdict.margin}"]
    }
async def synthesize_node(state: DebateState) -> DebateState:
"""Synthesize final recommendation."""
synthesis = await synthesizer.synthesize(
state["claim"],
state["bull_arguments"],
state["bear_arguments"],
state["verdict"]
)
    return {
        "synthesis": synthesis,
        "phase": DebatePhase.COMPLETE,
        "reasoning_trace": [f"Final recommendation: {synthesis.recommendation}"]
    }
def create_debate_workflow():
    """Create and compile the adversarial debate workflow."""
workflow = StateGraph(DebateState)
# Add nodes
workflow.add_node("parse_claim", parse_claim_node)
workflow.add_node("bull_opening", bull_opening_node)
workflow.add_node("bear_opening", bear_opening_node)
workflow.add_node("advance_critique", advance_to_critique_node)
workflow.add_node("bull_critique", bull_critique_node)
workflow.add_node("bear_critique", bear_critique_node)
workflow.add_node("advance_rebuttal", advance_to_rebuttal_node)
workflow.add_node("bull_rebuttal", bull_rebuttal_node)
workflow.add_node("bear_rebuttal", bear_rebuttal_node)
workflow.add_node("judge", judge_node)
workflow.add_node("synthesize", synthesize_node)
# Set entry point
workflow.set_entry_point("parse_claim")
# Round 1: Opening (parallel)
workflow.add_edge("parse_claim", "bull_opening")
workflow.add_edge("parse_claim", "bear_opening")
# Both openings must complete before critique
workflow.add_edge("bull_opening", "advance_critique")
workflow.add_edge("bear_opening", "advance_critique")
# Round 2: Critique (parallel, but needs opponent's arguments)
workflow.add_edge("advance_critique", "bull_critique")
workflow.add_edge("advance_critique", "bear_critique")
# Both critiques must complete before rebuttal
workflow.add_edge("bull_critique", "advance_rebuttal")
workflow.add_edge("bear_critique", "advance_rebuttal")
# Round 3: Rebuttal (parallel, needs opponent's critiques)
workflow.add_edge("advance_rebuttal", "bull_rebuttal")
workflow.add_edge("advance_rebuttal", "bear_rebuttal")
# Both rebuttals must complete before judging
workflow.add_edge("bull_rebuttal", "judge")
workflow.add_edge("bear_rebuttal", "judge")
# Judging and synthesis
workflow.add_edge("judge", "synthesize")
workflow.add_edge("synthesize", END)
return workflow.compile()
# Create the debate agent
debate_agent = create_debate_workflow()
LangGraph Workflow Visualization:
┌─────────────────────────────────────────────────────────────────────┐
│ ADVERSARIAL DEBATE STATE MACHINE │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Entry │
│ │ │
│ ▼ │
│ ┌──────────────┐ │
│ │ PARSE_CLAIM │ │
│ └──────┬───────┘ │
│ │ │
│ ├──────────────────────┐ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ BULL_OPENING │ │ BEAR_OPENING │ Round 1 (parallel) │
│ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │
│ └──────────┬───────────┘ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ ADVANCE_CRITIQUE │ │
│ └────────┬─────────┘ │
│ │ │
│ ┌────────┴────────┐ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ BULL_CRITIQUE│ │ BEAR_CRITIQUE│ Round 2 (parallel) │
│ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │
│ └────────┬────────┘ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ ADVANCE_REBUTTAL │ │
│ └────────┬─────────┘ │
│ │ │
│ ┌────────┴────────┐ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ BULL_REBUTTAL│ │ BEAR_REBUTTAL│ Round 3 (parallel) │
│ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │
│ └────────┬────────┘ │
│ ▼ │
│ ┌──────────────┐ │
│ │ JUDGE │ │
│ └──────┬───────┘ │
│ │ │
│ ▼ │
│ ┌──────────────┐ │
│ │ SYNTHESIZE │ │
│ └──────┬───────┘ │
│ │ │
│ ▼ │
│ END │
│ │
└─────────────────────────────────────────────────────────────────────┘
| LangGraph Feature | How It's Used |
|---|---|
| Parallel edges | Bull and Bear run simultaneously each round |
| Synchronization nodes | advance_critique and advance_rebuttal wait for both sides |
| State accumulation | reasoning_trace uses operator.add to log all steps |
| Structured state | TypedDict ensures type safety across nodes |
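The state-accumulation row relies on a reducer annotation in the state schema. A minimal sketch of what `DebateState` might look like (field names are inferred from the workflow nodes; the argument/verdict types are stand-ins, not the project's actual models):

```python
# Sketch of the DebateState schema assumed by the workflow above.
# Annotated[list, operator.add] tells LangGraph to CONCATENATE updates
# to reasoning_trace instead of overwriting them, which is what lets
# the parallel Bull and Bear nodes both append log entries safely.
import operator
from typing import Annotated, Any, Optional, TypedDict

class DebateState(TypedDict, total=False):
    raw_query: str
    claim: Optional[Any]            # InvestmentClaim in the real project
    phase: str                      # DebatePhase value
    current_round: int
    bull_arguments: Optional[Any]   # ArgumentSet
    bear_arguments: Optional[Any]
    bull_critiques: Optional[Any]
    bear_critiques: Optional[Any]
    bull_rebuttals: Optional[Any]
    bear_rebuttals: Optional[Any]
    verdict: Optional[Any]          # JudgeVerdict
    synthesis: Optional[Any]        # Synthesis
    # Reducer field: parallel node returns are merged by list concatenation
    reasoning_trace: Annotated[list, operator.add]
```

This is also why the nodes return partial updates rather than the whole state: each returned `reasoning_trace` list is appended to the running trace by the reducer.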
FastAPI Application
# src/api/main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List
from ..workflow.debate import debate_agent, DebateState
from ..models.state import DebatePhase
from ..models.scoring import Synthesis
app = FastAPI(
title="Bull vs Bear Market Analyst",
description="Adversarial debate system for investment analysis",
version="1.0.0"
)
class DebateRequest(BaseModel):
query: str
class DebateResponse(BaseModel):
asset: str
thesis: str
recommendation: str
confidence: float
bull_case_summary: str
bear_case_summary: str
winner: str
margin: str
key_risks: List[str]
key_catalysts: List[str]
position_sizing_hint: str
review_triggers: List[str]
reasoning_trace: List[str]
@app.post("/debate", response_model=DebateResponse)
async def run_debate(request: DebateRequest):
"""Run an adversarial investment debate."""
initial_state: DebateState = {
"raw_query": request.query,
"claim": None,
"phase": DebatePhase.PARSING,
"current_round": 0,
"bull_arguments": None,
"bear_arguments": None,
"bull_critiques": None,
"bear_critiques": None,
"bull_rebuttals": None,
"bear_rebuttals": None,
"verdict": None,
"synthesis": None,
"reasoning_trace": []
}
try:
result = await debate_agent.ainvoke(initial_state)
if not result.get("synthesis"):
raise HTTPException(
status_code=500,
detail="Debate did not produce a synthesis"
)
synthesis = result["synthesis"]
verdict = result["verdict"]
claim = result["claim"]
return DebateResponse(
asset=claim.asset,
thesis=claim.thesis,
recommendation=synthesis.recommendation,
confidence=synthesis.confidence,
bull_case_summary=synthesis.bull_case_summary,
bear_case_summary=synthesis.bear_case_summary,
winner=verdict.winner.value,
margin=verdict.margin,
key_risks=synthesis.key_risks,
key_catalysts=synthesis.key_catalysts,
position_sizing_hint=synthesis.position_sizing_hint,
review_triggers=synthesis.review_triggers,
reasoning_trace=result.get("reasoning_trace", [])
)
    except HTTPException:
        # Re-raise deliberate HTTP errors instead of re-wrapping them as 500s
        raise
    except Exception as e:
        raise HTTPException(
            status_code=500,
            detail=f"Debate failed: {str(e)}"
        ) from e
@app.get("/health")
async def health():
    return {"status": "healthy", "service": "bull-bear-debate"}
Deployment
Docker Configuration
# docker-compose.yml
version: '3.8'
services:
  debate-api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY src/ ./src/
EXPOSE 8000
CMD ["uvicorn", "src.api.main:app", "--host", "0.0.0.0", "--port", "8000"]
Requirements
# requirements.txt
langgraph>=0.2.0
langchain>=0.3.0
langchain-openai>=0.2.0
openai>=1.40.0
fastapi>=0.115.0
uvicorn>=0.30.0
pydantic>=2.9.0
pydantic-settings>=2.5.0
Example Usage
curl -X POST http://localhost:8000/debate \
-H "Content-Type: application/json" \
  -d '{"query": "Should I invest in NVIDIA at current prices for a 2-year hold?"}'
Example Response:
{
"asset": "NVDA",
"thesis": "NVIDIA is a good investment at current prices for a 2-year horizon",
"recommendation": "BUY",
"confidence": 0.65,
"bull_case_summary": "AI infrastructure demand, CUDA moat, data center growth",
"bear_case_summary": "High valuation, competition from AMD/custom chips, cyclical risk",
"winner": "bull",
"margin": "narrow",
"key_risks": [
"Valuation compression if AI hype fades",
"Competition from AMD MI300 and custom silicon",
"Semiconductor cycle downturn"
],
"key_catalysts": [
"Enterprise AI adoption acceleration",
"New product launches (Blackwell)",
"Hyperscaler capex guidance"
],
"position_sizing_hint": "Standard position, not overweight given valuation",
"review_triggers": [
"P/E exceeds 50x",
"AMD gains >20% GPU market share",
"AI infrastructure spending slows"
],
"reasoning_trace": [
"Parsed claim: NVDA - NVIDIA is a good investment...",
"Bull presented 3 arguments",
"Bear presented 3 arguments",
"Advancing to critique round",
"Bull critiqued 2 bear arguments",
"Bear critiqued 2 bull arguments",
"Advancing to rebuttal round",
"Bull defended with 2 rebuttals",
"Bear defended with 2 rebuttals",
"Judge verdict: bull wins by narrow",
"Final recommendation: BUY"
]
}
Key Learnings
- Adversarial structure forces thoroughness - Requiring arguments on both sides naturally surfaces risks that a single-perspective analysis would miss.
- Concessions build credibility - The rebuttal round's requirement to concede valid points produces more honest, calibrated analysis than pure advocacy.
- Parallel execution improves quality - Running Bull and Bear simultaneously, with the same information, prevents either side from anchoring on the other's framing.
- Synthesis != winner - Because the synthesizer weighs both sides even when one "wins", its recommendations are more nuanced than simply siding with the victor.
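Outside LangGraph, the parallel-execution point reduces to plain `asyncio.gather`: both sides are launched from the same inputs, so neither coroutine can see the other's output. A minimal sketch with stand-in agent functions (the names and returned strings are illustrative only):

```python
# Illustrative sketch of the "run both sides simultaneously" pattern.
# The two agent coroutines here are placeholders for real LLM calls.
import asyncio

async def bull_case(claim: str) -> list[str]:
    return [f"{claim}: demand is accelerating"]   # placeholder argument

async def bear_case(claim: str) -> list[str]:
    return [f"{claim}: valuation is stretched"]   # placeholder argument

async def opening_round(claim: str) -> tuple[list[str], list[str]]:
    # Both coroutines start from the same claim; neither awaits the
    # other, so neither side can anchor on the opponent's framing.
    bull, bear = await asyncio.gather(bull_case(claim), bear_case(claim))
    return bull, bear
```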
Key Concepts Recap
| Concept | What It Is | Why It Matters |
|---|---|---|
| Adversarial Debate | Opposing agents argue both sides | Eliminates confirmation bias |
| Steelmanning | Making the STRONGEST version of each argument | Ensures fair comparison |
| Round-based State | 3 rounds: Opening, Critique, Rebuttal | Structured argumentation |
| Parallel Execution | Bull and Bear run simultaneously | Prevents anchoring bias |
| Concession Mechanism | Sides must acknowledge valid critiques | Calibrated confidence |
| Weighted Scoring | Logic, Evidence, Risk weights | Prioritizes evidence over rhetoric |
| Synthesis ≠ Winner | Final view considers both sides | Nuanced recommendations |
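The weighted-scoring row can be sketched as below. The 0.4/0.4/0.2 weights, the dimension names, and the margin thresholds are illustrative assumptions, not the judge's actual rubric:

```python
# Hypothetical sketch of judge-style weighted scoring. Weights,
# dimensions, and thresholds are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class SideScores:
    logic: float      # internal consistency of the argument chain (0-10)
    evidence: float   # quality and specificity of cited support (0-10)
    risk: float       # honesty about downside scenarios (0-10)

# Evidence is weighted as heavily as logic so rhetoric alone cannot win
WEIGHTS = {"logic": 0.4, "evidence": 0.4, "risk": 0.2}

def composite(s: SideScores) -> float:
    return (WEIGHTS["logic"] * s.logic
            + WEIGHTS["evidence"] * s.evidence
            + WEIGHTS["risk"] * s.risk)

def margin_label(bull: SideScores, bear: SideScores) -> str:
    """Map the score gap to a verdict margin like the 'narrow' seen in
    the example response (cutoffs here are made up)."""
    gap = abs(composite(bull) - composite(bear))
    if gap < 1.0:
        return "narrow"
    if gap < 3.0:
        return "clear"
    return "decisive"
```

A close debate (say 7.2 vs 7.0 composites) lands in the "narrow" band, which is exactly the case where the synthesizer should temper the winner's recommendation.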
Next Steps
After completing this project, continue with:
- Differential Diagnosis Debate - Apply this pattern to medical diagnosis
- Drug Interaction Arbitrator - Pharmacy domain specialization
- Tumor Board Simulator - Multi-expert debate with 4+ specialists