paulgauthier

Paul Gauthier analyzes AI-model performance, benchmarking results, and the second-order implications for cloud and semiconductor incumbents. Coverage emphasizes cost/performance trade-offs, competitive positioning, and how benchmark datapoints translate (or fail to translate) into investable signals.

Trust score: 0 / 100
Track record: 0 / 100
Thesis calls: 6
Evaluated calls: 6
Average return: -1.23%
Win rate: 83%

Past bets that played out

Notable threads examine a new state-of-the-art result on the aider polyglot coding benchmark (an “R1+Sonnet” combo claiming 64% vs “o1” at 62%, at materially lower inference cost) and a Gemini 2.5 Pro (GOOGL) leaderboard entry that now reports benchmark costs (~$6 for the aider run). The analyses are cautious: these are model-level performance/cost datapoints with limited direct linkage to public-company revenue or pricing, but they inform views on AI compute demand and competitive inference-cost pressure.

AMD | right | backtest | DEMOTE

Tweet claims a new state-of-the-art result on the aider polyglot coding benchmark: a combo “R1+Sonnet” scores 64% vs “o1” at 62%, with “14x less cost” than o1. This is an AI-model performance/cost datapoint, but it’s not directly tied to any public company product, revenue, or pricing; tradable implications are therefore limited and mostly second-order (AI compute demand, competitive positioning of model providers, and inference-cost pressure).

Mentioned: Jun 17, 2026, 11:30 PM EDT
Conviction: 24 / 100
Return: -37.66%
Source: Paul Gauthier @paulgauthier Jan 24, 2025 R1+Sonnet set a new SOTA on the aider polyglot benchmark, at 14X less cost c...

AMZN | right | backtest | PROMOTE

Tweet claims a new state-of-the-art result on the aider polyglot coding benchmark: a combo “R1+Sonnet” scores 64% vs “o1” at 62%, with “14x less cost” than o1. This is an AI-model performance/cost datapoint, but it’s not directly tied to any public company product, revenue, or pricing; tradable implications are therefore limited and mostly second-order (AI compute demand, competitive positioning of model providers, and inference-cost pressure).

Mentioned: Jun 17, 2026, 11:30 PM EDT
Conviction: 40 / 100
Return: +15.68%
Source: Paul Gauthier @paulgauthier Jan 24, 2025 R1+Sonnet set a new SOTA on the aider polyglot benchmark, at 14X less cost c...

GOOGL | right | backtest | PROMOTE

Tweet claims a new state-of-the-art result on the aider polyglot coding benchmark: a combo “R1+Sonnet” scores 64% vs “o1” at 62%, with “14x less cost” than o1. This is an AI-model performance/cost datapoint, but it’s not directly tied to any public company product, revenue, or pricing; tradable implications are therefore limited and mostly second-order (AI compute demand, competitive positioning of model providers, and inference-cost pressure).

Mentioned: Jun 17, 2026, 11:30 PM EDT
Conviction: 38 / 100
Return: +14.93%
Source: Paul Gauthier @paulgauthier Jan 24, 2025 R1+Sonnet set a new SOTA on the aider polyglot benchmark, at 14X less cost c...

What this channel is watching now

Primary focus tickers: GOOGL (most-mentioned), MSFT, AMZN, NVDA, and AMD. Research centers on model-level benchmarking claims, inference pricing disclosures, and the second-order effects for cloud providers and chipmakers rather than firm-level financials.

Latest videos and market context

No recent video content; analysis is published as short-form posts on X (@paulgauthier) highlighting benchmark updates and cost/performance observations.

Paul Gauthier @paulgauthier Apr 12, 2025 Gemini 2.5 Pro's leaderboard entry has been updated with costs, now that it ...

A tweet notes Gemini 2.5 Pro’s leaderboard entry now includes benchmark costs because it’s available via paid API, and claims it cost ~$6 to run the aider polyglot coding benchmark—cheaper than most top-10 entries except DeepSeek. This is mildly supportive of Google’s AI price/performance competitiveness, but it’s a narrow, third-party benchmark datapoint and not a financial metric.

Paul Gauthier @paulgauthier Jan 24, 2025 R1+Sonnet set a new SOTA on the aider polyglot benchmark, at 14X less cost c...

Tweet claims a new state-of-the-art result on the aider polyglot coding benchmark: a combo “R1+Sonnet” scores 64% vs “o1” at 62%, with “14x less cost” than o1. This is an AI-model performance/cost datapoint, but it’s not directly tied to any public company product, revenue, or pricing; tradable implications are therefore limited and mostly second-order (AI compute demand, competitive positioning of model providers, and inference-cost pressure).

Proof-backed call history

Publishes concise commentary and benchmarking takeaways on AI models and inference economics. Recent posts flagged leaderboard cost disclosures for Gemini 2.5 Pro and a reported SOTA on the aider polyglot coding benchmark by an R1+Sonnet combo.

GOOGL | right | backtest | PROMOTE

A tweet notes Gemini 2.5 Pro’s leaderboard entry now includes benchmark costs because it’s available via paid API, and claims it cost ~$6 to run the aider polyglot coding benchmark—cheaper than most top-10 entries except DeepSeek. This is mildly supportive of Google’s AI price/performance competitiveness, but it’s a narrow, third-party benchmark datapoint and not a financial metric.

Mentioned: Jun 17, 2026, 11:30 PM EDT
Conviction: 42 / 100
Return: +7.56%
Source: Paul Gauthier @paulgauthier Apr 12, 2025 Gemini 2.5 Pro's leaderboard entry has been updated with costs, now that it ...

AMD | right | backtest | DEMOTE

Tweet claims a new state-of-the-art result on the aider polyglot coding benchmark: a combo “R1+Sonnet” scores 64% vs “o1” at 62%, with “14x less cost” than o1. This is an AI-model performance/cost datapoint, but it’s not directly tied to any public company product, revenue, or pricing; tradable implications are therefore limited and mostly second-order (AI compute demand, competitive positioning of model providers, and inference-cost pressure).

Mentioned: Jun 17, 2026, 11:30 PM EDT
Conviction: 24 / 100
Return: -37.66%
Source: Paul Gauthier @paulgauthier Jan 24, 2025 R1+Sonnet set a new SOTA on the aider polyglot benchmark, at 14X less cost c...

NVDA | right | backtest | DEMOTE

Tweet claims a new state-of-the-art result on the aider polyglot coding benchmark: a combo “R1+Sonnet” scores 64% vs “o1” at 62%, with “14x less cost” than o1. This is an AI-model performance/cost datapoint, but it’s not directly tied to any public company product, revenue, or pricing; tradable implications are therefore limited and mostly second-order (AI compute demand, competitive positioning of model providers, and inference-cost pressure).

Mentioned: Jun 17, 2026, 11:30 PM EDT
Conviction: 27 / 100
Return: -5.35%
Source: Paul Gauthier @paulgauthier Jan 24, 2025 R1+Sonnet set a new SOTA on the aider polyglot benchmark, at 14X less cost c...

GOOGL | right | backtest | PROMOTE

Tweet claims a new state-of-the-art result on the aider polyglot coding benchmark: a combo “R1+Sonnet” scores 64% vs “o1” at 62%, with “14x less cost” than o1. This is an AI-model performance/cost datapoint, but it’s not directly tied to any public company product, revenue, or pricing; tradable implications are therefore limited and mostly second-order (AI compute demand, competitive positioning of model providers, and inference-cost pressure).

Mentioned: Jun 17, 2026, 11:30 PM EDT
Conviction: 38 / 100
Return: +14.93%
Source: Paul Gauthier @paulgauthier Jan 24, 2025 R1+Sonnet set a new SOTA on the aider polyglot benchmark, at 14X less cost c...

AMZN | right | backtest | PROMOTE

Tweet claims a new state-of-the-art result on the aider polyglot coding benchmark: a combo “R1+Sonnet” scores 64% vs “o1” at 62%, with “14x less cost” than o1. This is an AI-model performance/cost datapoint, but it’s not directly tied to any public company product, revenue, or pricing; tradable implications are therefore limited and mostly second-order (AI compute demand, competitive positioning of model providers, and inference-cost pressure).

Mentioned: Jun 17, 2026, 11:30 PM EDT
Conviction: 40 / 100
Return: +15.68%
Source: Paul Gauthier @paulgauthier Jan 24, 2025 R1+Sonnet set a new SOTA on the aider polyglot benchmark, at 14X less cost c...

MSFT | wrong | backtest | DEMOTE

Tweet claims a new state-of-the-art result on the aider polyglot coding benchmark: a combo “R1+Sonnet” scores 64% vs “o1” at 62%, with “14x less cost” than o1. This is an AI-model performance/cost datapoint, but it’s not directly tied to any public company product, revenue, or pricing; tradable implications are therefore limited and mostly second-order (AI compute demand, competitive positioning of model providers, and inference-cost pressure).

Mentioned: Jun 17, 2026, 11:30 PM EDT
Conviction: 43 / 100
Return: -2.54%
Source: Paul Gauthier @paulgauthier Jan 24, 2025 R1+Sonnet set a new SOTA on the aider polyglot benchmark, at 14X less cost c...
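
The headline win rate and average return can be cross-checked against the six calls listed in this history. The sketch below is only an illustration, since the page does not state its methodology. It assumes that the win rate is the share of calls labeled "right" and that the average return is the unweighted mean of the listed returns, with no sign adjustment for DEMOTE calls. The function names `win_rate` and `average_return` are hypothetical.

```python
# Calls as listed in the call history: (ticker, call, outcome label, return in %).
calls = [
    ("GOOGL", "PROMOTE", "right", 7.56),
    ("AMD", "DEMOTE", "right", -37.66),
    ("NVDA", "DEMOTE", "right", -5.35),
    ("GOOGL", "PROMOTE", "right", 14.93),
    ("AMZN", "PROMOTE", "right", 15.68),
    ("MSFT", "DEMOTE", "wrong", -2.54),
]


def win_rate(calls):
    """Assumed definition: fraction of calls whose outcome label is 'right'."""
    wins = sum(1 for _, _, outcome, _ in calls if outcome == "right")
    return wins / len(calls)


def average_return(calls):
    """Assumed definition: unweighted mean of the listed returns, in percent."""
    return sum(ret for _, _, _, ret in calls) / len(calls)


print(f"Win rate: {win_rate(calls):.0%}")                 # Win rate: 83%
print(f"Average return: {average_return(calls):+.2f}%")   # Average return: -1.23%
```

Under these assumptions the listed calls reproduce both headline figures. The average is negative despite the high win rate because the raw AMD return (-37.66%) enters the mean as a loss, even though that DEMOTE call is labeled "right".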

About this channel

I translate technical benchmark results into practical implications for investors: how model cost/performance datapoints might affect cloud compute demand, competitive positioning among model providers, and pressure on inference pricing. I avoid overstating single-benchmark results and emphasize their limited direct tradable linkage.

Subscribers: n/a
Videos: n/a
Win rate: 83%
Average return: -1.23%

@paulgauthier

Follow @paulgauthier for timely, skeptical takes on AI benchmark claims and what they imply for GOOGL, MSFT, AMZN, NVDA, and AMD.