Model Madness - Four AI Models, One NCAA Bracket

On March 19, 2026 - the day the NCAA Tournament first round tipped off - I gave the same prompt to four AI models and asked each one for a complete bracket prediction.

Same question. Same day. Now we watch.

The prompt was exactly this:

"Give me a final four full bracket for the NCAA men's tournament that's about to start."

No extra context. No seeding data. No bracket hints. Just the raw question, cold, the same way you'd ask a friend who follows college basketball.

The Models

Claude (Sonnet 4.6), ChatGPT (GPT-4o), Gemini (2.5 Pro), and Grok (Grok 3). All asked on the same day, March 19, 2026.

The Picks

Every model gave a different champion. Not one overlap.

Model	Champion Pick
Claude	Florida
ChatGPT	UConn
Gemini	Duke
Grok	Arizona

The one thing they agreed on: Arizona in the Final Four. All four models, independently, put the Wildcats in Indianapolis. That's the only unanimous pick across the entire bracket.

Everything else was chaos.

What Gets Tracked

The site auto-scores each model's picks in near real time using ESPN's public scoreboard API - no key required, no manual updates. As games complete, pick cards flip green or red, the leaderboard re-sorts, and the score ticker updates.
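To make the "cards flip as games complete" step concrete, here is a minimal Python sketch of pulling finished games out of a scoreboard response. It is not the site's actual code. The JSON field names (events, competitions, competitors, winner, team.shortDisplayName, status.type.completed) are assumptions about the response shape, and the team names in the usage note are placeholders.

```python
from typing import Any


def completed_results(payload: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (winner, loser) name pairs for every finished game.

    Field names are assumed, not taken from the post; adjust them to
    whatever the scoreboard endpoint really returns.
    """
    results: list[tuple[str, str]] = []
    for event in payload.get("events", []):
        # Skip games that are scheduled or still in progress.
        if not event["status"]["type"]["completed"]:
            continue
        teams = event["competitions"][0]["competitors"]
        winner = next((t for t in teams if t.get("winner")), None)
        loser = next((t for t in teams if not t.get("winner")), None)
        if winner and loser:
            results.append(
                (winner["team"]["shortDisplayName"],
                 loser["team"]["shortDisplayName"])
            )
    return results


def eliminated_teams(results: list[tuple[str, str]]) -> set[str]:
    """Every team that has lost a game; any pick involving one is dead."""
    return {loser for _, loser in results}
```

With this shape, a pick card can go red as soon as its team shows up in eliminated_teams(...), and stay pending otherwise.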

Scoring is weighted by round:

Final Four pick: 10 pts each (4 picks)
Semifinal winner: 20 pts each (2 picks)
Champion: 40 pts
Maximum possible: 120 pts (4 x 10 + 2 x 20 + 40)
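The weights above can be sketched as a small scoring function. This is an illustration in Python, not the site's code. The Bracket type and the team names A through E are hypothetical, and it assumes four Final Four picks and two semifinal winners per bracket, which is what the 120-point maximum implies.

```python
from dataclasses import dataclass

FINAL_FOUR_PTS = 10   # per correct Final Four team
SEMIFINAL_PTS = 20    # per correct semifinal winner
CHAMPION_PTS = 40     # for the correct champion


@dataclass(frozen=True)
class Bracket:
    """Hypothetical container for one set of picks (or the real outcome)."""
    final_four: frozenset[str]   # four teams
    finalists: frozenset[str]    # the two semifinal winners
    champion: str


def score(picks: Bracket, actual: Bracket) -> int:
    """Points earned by `picks` when `actual` is what really happened."""
    return (
        FINAL_FOUR_PTS * len(picks.final_four & actual.final_four)
        + SEMIFINAL_PTS * len(picks.finalists & actual.finalists)
        + CHAMPION_PTS * (picks.champion == actual.champion)
    )


# Placeholder outcome and one imperfect set of picks.
actual = Bracket(frozenset("ABCD"), frozenset("AC"), "A")
picks = Bracket(frozenset("ABCE"), frozenset("AE"), "E")

print(score(actual, actual))  # 120: a perfect bracket
print(score(picks, actual))   # 50: three Final Four teams, one finalist
```

Using sets means pick order never matters: a team either made the round or it didn't.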

There's also a comparison section showing how each model's champion pick stacked up against ESPN Bracket Challenge public pick percentages - so you can see which model was playing chalk and which was being contrarian.

What Made This Interesting

ChatGPT picked Purdue in the Midwest and leaned heavily on UConn repeating - both reads that looked more like the 2024 bracket than this year's field. Purdue lost Zach Edey to the NBA draft in 2024. ChatGPT either didn't know or didn't weight it.

Gemini was the most detailed - it gave regional reasoning, noted the Indianapolis venue's historical tilt toward blue bloods, and flagged specific upset alerts. It had Duke over Purdue in the title game, citing Duke's 2010 and 2015 championships, both won in Indianapolis.

Grok was the most hedged - it opened by noting the tournament had just started, listed its picks confidently, then immediately offered to adjust for more upsets.

Claude picked Florida. Defending champions. Best rebounding margin in the country. Frontcourt depth that wears teams down. A pick that nobody else made.

We'll see.

Tech

The site is a single static HTML file deployed to Vercel. No build step, no framework, no database. The page polls ESPN's public scoreboard API every 60 seconds. All four sets of picks are hardcoded - there's nothing to update, nothing to manage. The tournament either validates the picks or it doesn't.
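For a sense of how little machinery the polling needs, here is a rough Python sketch of a 60-second loop. It is not the site's code. The endpoint URL is an assumption (the post doesn't give one), and the fetch, sleep, and max_polls parameters are hypothetical hooks added so the loop can be exercised without touching the network.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Optional

# Assumed endpoint; the post only says "ESPN's public scoreboard API".
SCOREBOARD_URL = (
    "https://site.api.espn.com/apis/site/v2/sports/basketball/"
    "mens-college-basketball/scoreboard"
)
POLL_SECONDS = 60


def fetch_scoreboard(url: str = SCOREBOARD_URL) -> dict[str, Any]:
    """GET the scoreboard and decode it as JSON. No API key is sent."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def poll(
    on_update: Callable[[dict[str, Any]], None],
    fetch: Callable[[], dict[str, Any]] = fetch_scoreboard,
    sleep: Callable[[float], None] = time.sleep,
    interval: float = POLL_SECONDS,
    max_polls: Optional[int] = None,
) -> int:
    """Call on_update with fresh data every `interval` seconds.

    Runs forever when max_polls is None. Returns the number of polls made.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        try:
            payload = fetch()
        except (OSError, ValueError):
            payload = None  # network or JSON failure: keep the last state
        if payload is not None:
            on_update(payload)
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval)
    return polls
```

A failed request just skips that tick, so the leaderboard stays on its last good state until the next poll.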

Built in one Claude Code session.

Where It Landed

The tournament has wrapped and modelmadness.ai is no longer online, but the experiment stands: same prompt, same day, four different champions.