Flappy Bird Olympics

Every month, a new local LLM claims it can generate working games.
This is the arena where we put those claims to the test.
One pixelated bird at a time.

What is this?

Each entry here is a Flappy Bird clone generated entirely by a different local LLM. Same prompt, same constraints, completely different results. Some work flawlessly. Some crash on launch. Some are... art.

Why does it matter?

Flappy Bird is the perfect stress test for code generation. It's simple enough that a model should be able to produce it, but complex enough that bugs, edge cases, and subtle logic errors reveal exactly where each model stands. How does it handle collision detection? Game loops? State management? The answers often tell you more about a model's practical coding ability than a benchmark score does.
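
To make those three questions concrete, here is a minimal sketch of the logic every entry has to get right in some form. It is not taken from any model's output. All names (`GameState`, `hitsPipe`, `outOfBounds`, `step`) and constants (`GRAVITY`, `CANVAS_HEIGHT`) are hypothetical, the bird is treated as a plain rectangle, and pipe scrolling, flapping, and scoring are left out.

```typescript
// Illustrative only: hypothetical names and arbitrary constants.
type Phase = "ready" | "playing" | "gameOver";

interface Bird { x: number; y: number; width: number; height: number; vy: number; }
// A pipe pair is a column with an open gap between gapTop and gapBottom.
interface PipePair { x: number; width: number; gapTop: number; gapBottom: number; }
interface GameState { phase: Phase; bird: Bird; pipes: PipePair[]; }

const GRAVITY = 1200;      // px/s^2, assumed value
const CANVAS_HEIGHT = 600; // px, assumed value

// Collision detection: the bird hits a pipe when it overlaps the pipe's
// column horizontally and is not fully inside the gap vertically.
function hitsPipe(bird: Bird, pipe: PipePair): boolean {
  const overlapsX = bird.x < pipe.x + pipe.width && bird.x + bird.width > pipe.x;
  if (!overlapsX) return false;
  const insideGap = bird.y >= pipe.gapTop && bird.y + bird.height <= pipe.gapBottom;
  return !insideGap;
}

function outOfBounds(bird: Bird, canvasHeight: number): boolean {
  return bird.y < 0 || bird.y + bird.height > canvasHeight;
}

// Game loop body plus state management: advance physics by dt seconds,
// but only while playing, and switch to "gameOver" on any collision.
function step(state: GameState, dt: number): GameState {
  if (state.phase !== "playing") return state;
  const vy = state.bird.vy + GRAVITY * dt;
  const bird: Bird = { ...state.bird, vy, y: state.bird.y + vy * dt };
  const crashed =
    outOfBounds(bird, CANVAS_HEIGHT) || state.pipes.some((p) => hitsPipe(bird, p));
  return { ...state, bird, phase: crashed ? "gameOver" : "playing" };
}
```

In a browser, `step` would typically be called once per frame from `requestAnimationFrame` with the elapsed time. Much of what goes wrong in the entries comes down to these few lines: a gap check that is off by one edge, physics that keeps running after the game is over, or a phase that never resets.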

How to use

Pick a model from the sidebar to see its game in action. If the model produced multiple iterations, toggle between versions using the brick tabs above the game. Read the curator notes on the right for analysis of what worked, what didn't, and what surprised me.

How the Olympics Work

Every model gets the same three prompts in sequence, with optional adjustments in between. The goal is to see how each model handles iterative refinement under identical conditions.

Prompt 1 — Initial

Write me a standalone Flappy Bird game in a single HTML file. Include all CSS and JavaScript inline.

Prompt 2 — Jazzed Up

This is really good. Jazz it up as much as you possibly can and make it feel like a real high-quality game from a developer.

Prompt 3 — Finale

Now — the same game except add a unique, original custom music soundtrack that plays in the background.

Adjustment (optional)

Between any two stages, I can send a targeted fix, e.g. "The pipes are too narrow" or "Fix the collision detection." This is where the real iteration happens.
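
The ordering described above can be sketched as a small data structure. This is only an illustration, not tooling used for the Olympics: `Stage`, `Turn`, and `buildTurns` are hypothetical names, and the sketch assumes an adjustment is sent after the stage whose output it fixes and before the next stage prompt.

```typescript
// Illustrative only: hypothetical names, not an actual harness.
type Stage = "initial" | "jazzedUp" | "finale";

interface Turn {
  kind: "stage" | "adjustment";
  stage: Stage; // the stage this turn belongs to
  text: string; // the message sent to the model
}

const STAGE_ORDER: Stage[] = ["initial", "jazzedUp", "finale"];

// Builds the full message sequence for one model: each stage prompt,
// followed by any optional fixes sent against that stage's output.
function buildTurns(
  prompts: Record<Stage, string>,
  adjustments: Partial<Record<Stage, string[]>> = {}
): Turn[] {
  const turns: Turn[] = [];
  for (const stage of STAGE_ORDER) {
    turns.push({ kind: "stage", stage, text: prompts[stage] });
    for (const fix of adjustments[stage] ?? []) {
      turns.push({ kind: "adjustment", stage, text: fix });
    }
  }
  return turns;
}
```

With no adjustments, every model sees exactly three turns. Each adjustment adds one turn tagged with the stage it belongs to, which keeps it clear which version of a game came from a stage prompt and which came from a targeted fix.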